Dormando,

There are roughly 58 million entries in file_to_replicate, out of a total of 71 million files. It seems the replication worker is, for some reason, not deleting completed rows (though the code path exists). Note that only 570K entries in file_to_replicate have failcount > 0, and only 9 entries have nexttry = ENDOFTIME.

mysql> select count(*) from file_to_replicate;
+----------+
| count(*) |
+----------+
| 58395828 |
+----------+
1 row in set (2 min 6.26 sec)

Best,
Brian

-----Original Message-----
From: dormando [mailto:[EMAIL PROTECTED]
Sent: Monday, April 28, 2008 12:50 AM
To: Brian Lynch
Cc: [email protected]
Subject: Re: Replication Oddities

> Would it be possible to purge portions of the file_to_replicate
> table? I'm currently pulling out known good replications to identify
> bogus entries.

You should sample rows out of file_to_replicate and see whether the nexttry is set to 2147483647, and whether all of the paths are invalid.

I've never outright removed rows from file_to_replicate, _unless_ I have verified that the fid is gone, i.e.:

- Has no matching 'file' entry.
- Has no matching 'file_on' rows (odd bug, haven't fixed yet).
- Has file row, file_on row(s), but all paths are dead. 404's.

If at least one of those conditions is met, the fid can be removed from file_to_replicate, and you might want to see why they disappeared to begin with. Otherwise you do not remove the row.

If the nexttry is off in the future but not equal to ENDOFTIME (2147483647), you can try UPDATE'ing those rows to UNIX_TIMESTAMP() and see if they get chewed through. If not, you should find out exactly what's going on. Odds are one of the three conditions listed above has happened. If not, you should definitely make a best effort to figure out what it was.

Yeah, this should be way more automatic. We'll get to it someday, and also accept patches ;)

-Dormando
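The removal rule and the nexttry advice in the reply above can be sketched as a small triage function. This is only an illustration in Python, not MogileFS code: the function names, arguments, and return labels are hypothetical, and the caller is assumed to have already looked up whether the fid has a 'file' row, how many 'file_on' rows it has, and how many of its paths still respond.

```python
# Hypothetical sketch of the triage rules described in the thread.
# Only the ENDOFTIME value comes from the thread; every name below
# is an assumption made for illustration.

ENDOFTIME = 2147483647  # nexttry value quoted in the thread


def can_remove(has_file_row, file_on_count, live_path_count):
    """Return True if at least one of the three removal conditions holds."""
    if not has_file_row:
        return True   # no matching 'file' entry
    if file_on_count == 0:
        return True   # no matching 'file_on' rows
    # 'file' row and 'file_on' row(s) exist, but every path is dead
    return live_path_count == 0


def triage(nexttry, now, has_file_row, file_on_count, live_path_count):
    """Classify one file_to_replicate row as remove / reset_nexttry / investigate."""
    if can_remove(has_file_row, file_on_count, live_path_count):
        return "remove"
    if nexttry != ENDOFTIME and nexttry > now:
        # Scheduled in the future: set nexttry to the current time
        # and watch whether the row gets processed.
        return "reset_nexttry"
    # Never remove otherwise; find out what is going on.
    return "investigate"


if __name__ == "__main__":
    now = 1209340200  # an arbitrary Unix timestamp used as "now"
    print(triage(now + 3600, now, True, 2, 1))
    print(triage(ENDOFTIME, now, True, 2, 0))
```

A row is only ever labelled "remove" when one of the three conditions is verified; everything else is either retried or left for investigation, which matches the "otherwise you do not remove the row" rule.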
