I think Andreas's concept of treating these mirrors as a database is a good one. Replaying a logical log from a checkpoint is better than a simple rsync for large numbers of files.
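To make the log replay idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `LogEntry`, `Master` and `Mirror` names, the in-memory log, and the "put"/"delete" operations are hypothetical and not taken from any existing mirror tool. It also leaves out checkpoints and log pruning. The point is only that a mirror remembers the last sequence number it applied and asks for the entries after it, instead of walking the whole tree the way rsync does.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    seq: int                          # monotonically increasing, starts at 1
    op: str                           # "put" or "delete" (hypothetical ops)
    path: str
    content: Optional[bytes] = None   # only set for "put"


class Master:
    """Hypothetical master site that records every change in a log."""

    def __init__(self) -> None:
        self.log: List[LogEntry] = []

    def put(self, path: str, content: bytes) -> None:
        self.log.append(LogEntry(len(self.log) + 1, "put", path, content))

    def delete(self, path: str) -> None:
        self.log.append(LogEntry(len(self.log) + 1, "delete", path))

    def entries_since(self, last_seq: int) -> List[LogEntry]:
        # Sequence numbers start at 1, so the entry with seq == last_seq + 1
        # lives at list index last_seq.
        return self.log[last_seq:]


class Mirror:
    """Hypothetical mirror that replays only the log entries it is missing."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.last_seq = 0

    def sync(self, master: Master) -> int:
        entries = master.entries_since(self.last_seq)
        for entry in entries:
            if entry.op == "put":
                self.files[entry.path] = entry.content or b""
            elif entry.op == "delete":
                self.files.pop(entry.path, None)
            self.last_seq = entry.seq
        return len(entries)   # number of entries replayed
```

The cost of a sync here depends on how many changes happened since the last one, not on how many files the mirror holds.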
The replication problem for databases is well understood, and open-source code for it is available from at least PostgreSQL. Grab the current log and any logs you're missing since your last update, and off you go.

Another approach, which is a non-starter practically speaking but which I will mention anyway: use ZFS. Make one filesystem for each mirrored project (CPAN, freshmeat, etc.). Daily, or at some other regular interval, make a ZFS snapshot. Purge old ones after some reasonable time, such as 2 days. Mirror sites request a ZFS incremental stream, giving the name of their last received snapshot and that of the current one.

While ZFS is available for Solaris 10, OpenSolaris and, I believe, FreeBSD (the Mac OS X port halted, IIRC), it isn't widely available enough for major mirrors to use.
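For what it's worth, the ZFS scheme boils down to three commands: `zfs snapshot`, `zfs send -i` and `zfs destroy`. The sketch below only builds the argument lists for them in Python and does not run anything. The helper names, the dataset name `tank/cpan` and the snapshot names are hypothetical, and retention is expressed as a count of snapshots to keep rather than an age in days, which is a simplification on my part.

```python
from typing import List


def snapshot_cmd(filesystem: str, name: str) -> List[str]:
    # Take a snapshot of one mirrored project's filesystem.
    return ["zfs", "snapshot", f"{filesystem}@{name}"]


def incremental_send_cmd(filesystem: str, last_received: str,
                         current: str) -> List[str]:
    # Stream only the changes between the mirror's last snapshot
    # and the current one.
    return ["zfs", "send", "-i",
            f"{filesystem}@{last_received}",
            f"{filesystem}@{current}"]


def purge_cmds(filesystem: str, snapshots: List[str],
               keep: int) -> List[List[str]]:
    # `snapshots` is assumed to be sorted oldest first; keep the newest
    # `keep` of them and destroy the rest.
    old = snapshots[:-keep] if keep > 0 else list(snapshots)
    return [["zfs", "destroy", f"{filesystem}@{s}"] for s in old]


if __name__ == "__main__":
    fs = "tank/cpan"  # hypothetical dataset name
    print(snapshot_cmd(fs, "day3"))
    print(incremental_send_cmd(fs, "day2", "day3"))
    print(purge_cmds(fs, ["day1", "day2", "day3"], keep=2))
```

On the mirror side the stream would be piped into `zfs receive`. A mirror whose last received snapshot has already been purged could not use an incremental stream and would need a full send instead.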