Since approximately mid-August, the load on Central has been growing at an exponential rate. You may have noticed slowdowns or dropped connections recently as a side effect. We first had issues with Apache HTTPD load rising above the capacity of the machine. We switched over to Nginx (http://blogs.sonatype.com/people/brian/2008/10/29/nginx-is-centrals-new-friend/), which resolved the load problem, but then the 100 Mbps connection began to saturate regularly. Every hour on the hour, around the clock, the connection would max out for about 20 minutes and then return to about 50% utilization. We spent many days working with Contegix to diagnose the problem, but no single source stood out immediately.

Yesterday we finally discovered that nearly all the traffic, both the hourly spikes and the 50% background traffic, is being caused by downloads of nexus-index.zip. After investigating the various tools that use this data, we have concluded that Artifactory has a critical bug (apparently since June: http://issues.jfrog.org/jira/browse/RTFACT-390) that is causing every 1.3 instance to repeatedly download the 27 MB zip file. We found many cases of a single IP downloading the index more than 1000 times a day!

In the config it is set as follows:

    <!-- The cron definition to control the activation of the m2eclipse indexer. -->
    <indexer>
        <!-- By Default index every 5 hours -->
        <cronExp>0 0 /5 * * ?</cronExp>
    </indexer>

(this is Quartz syntax, which is "s m h...")

This by itself wouldn't be a huge issue except that Artifactory ignores the index.properties file, which contains the last update timestamp, AND doesn't first issue a HEAD request to check the timestamp. This means that every Artifactory 1.3 instance is grabbing this 27 MB file at least every 5 hours. (We can't explain why certain IPs are doing it 1000+ times a day; perhaps the config was modified there or some other scheduling issue is present.) For reference, the index on Central is only updated once a week, on Sundays.

To protect the Maven community from ongoing troubles, we have had to take the extraordinary step of blocking all downloads of the index file by Artifactory until this is resolved. Since we did this, traffic has fallen to 10% of what it has been in the recent past.

If you are using Artifactory, please adjust this cron definition to run only weekly and save yourself and us tons of wasted bandwidth and money. Note that other tools such as the Nexus Maven Repository Manager (http://nexus.sonatype.org), M2e and Q4e use the Nexus Indexer API, are immune to this problem, and are not blocked from downloading the index.
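To put the figures above in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the numbers from this post (a 27 MB index, a 100 Mbps link, a 5-hour schedule, 1000+ downloads a day from some IPs). It assumes decimal units (1 MB = 10^6 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead, so treat the results as estimates only.

```python
INDEX_SIZE_MB = 27   # size of nexus-index.zip, from the post
LINK_MBPS = 100      # capacity of Central's connection, from the post


def daily_mb(downloads_per_day: float) -> float:
    """Megabytes transferred per day by one client at the given download rate."""
    return INDEX_SIZE_MB * downloads_per_day


every_5_hours = daily_mb(24 / 5)   # default schedule: about 130 MB/day per instance
weekly = daily_mb(1 / 7)           # weekly schedule: about 4 MB/day per instance
heavy_client = daily_mb(1000)      # a single IP at 1000 downloads/day: 27,000 MB/day

# Total the link can carry in a day if fully saturated (megabits -> megabytes).
link_capacity_mb_per_day = LINK_MBPS / 8 * 86_400

# How many such heavy clients would fill the link on their own.
clients_to_saturate = link_capacity_mb_per_day / heavy_client
```

Under these assumptions, a weekly schedule cuts the per-instance transfer by a factor of more than 30, and a few dozen clients at 1000 downloads a day would be enough to fill the link by themselves.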
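For anyone writing a client that fetches the index, the check described above (look at the timestamp before downloading) might be sketched roughly as follows in Python. This is not how the Nexus Indexer API or Artifactory is implemented; it is a minimal illustration that relies on the standard HTTP Last-Modified header. The function names and the URL in the usage note are hypothetical, and the local timestamp is assumed to be whatever the client recorded at its last successful download.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional
import urllib.request


def remote_last_modified(url: str, timeout: float = 30.0) -> Optional[datetime]:
    """Issue a HEAD request and return the Last-Modified time, if the server sends one."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        header = response.headers.get("Last-Modified")
    if not header:
        return None
    parsed = parsedate_to_datetime(header)
    if parsed.tzinfo is None:
        # Treat a zone-less timestamp as UTC so comparisons stay well-defined.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def needs_download(remote: Optional[datetime], local: Optional[datetime]) -> bool:
    """Return True only when there is no local copy or the remote copy is newer."""
    if local is None:
        return True   # nothing cached yet
    if remote is None:
        return True   # no timestamp from the server; this sketch falls back to downloading
    return remote > local
```

A client would call something like `needs_download(remote_last_modified("https://repo.example.org/.index/nexus-index.zip"), last_download_time)` (placeholder URL) and skip the 27 MB transfer whenever it returns False. With an index that changes once a week, almost every scheduled check would then cost a single small HEAD request.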

A new version of Nexus and the Nexus Indexer API will be published soon, along with M2e, that will leverage incremental indexes to significantly reduce the download requirements and allow near-real-time index updates.

Brian Fox
Apache Maven PMC
http://blogs.sonatype.org/people/brian
