Our Archiva 1.3.5 instance has started becoming unresponsive almost every
night. We have a cron job which does a wget every 5 minutes on the main
Archiva page to ensure that it's available. This cron job sends an alert if
the wget fails and Archiva is then bounced. The bounce does make Archiva
available again until the next night.
We suspect that the repository scans may be the cause, but we aren't sure. We
have the repository scans (two of them, internal and snapshots) set to run
nightly at 7pm. They run sequentially, so the last one finishes at roughly
11:30pm. Then we have the database scan set to run at 3am.
Software builds that occur later in the night seem to hang when trying to
upload a pom. Here's an example schedule of events:
1:00am - build starts
1:01am - build tries to upload a pom to Archiva and appears to hang waiting for
the upload to finish
1:45am - the wget sends an alarm and Archiva is bounced. At this point, the
build resumes but fails since the upload failed.
Since the wget doesn't alarm until 1:45am, this leads us to believe that
Archiva may be degrading in performance over time - it must be "alive" enough
for the wget check to succeed, but "dead" enough that uploads can't complete.
Here is the exact wget command our cron monitor is running:
wget --tries=10 -q -O - http://our.archiva.url:port/archiva
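
Because a page fetch can succeed while an upload hangs, one option would be to put a hard time limit around each check and to probe an upload as well as the page. The following is only a minimal sketch, assuming coreutils `timeout` is available on the monitoring host. The `probe` helper, the repository path, the credentials and the `probe.pom` file are hypothetical placeholders, not something we run today.

```shell
#!/bin/sh
# Sketch of a probe wrapper: run a command under a hard time limit and
# print one timestamped status line, so a hang is reported as FAIL
# instead of blocking the monitor.
# Assumption: coreutils `timeout` is installed.
probe() {
    label=$1
    limit=$2
    shift 2
    # `timeout` exits non-zero if the command fails or exceeds the limit.
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        status=OK
    else
        status=FAIL
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') $label $status"
}

# Hypothetical usage (URL path, repository id, credentials and file are
# placeholders and would need to match the real setup):
# probe page   60 wget -q -O - http://our.archiva.url:port/archiva
# probe upload 60 curl -f -s -u user:password -T probe.pom \
#     http://our.archiva.url:port/archiva/repository/snapshots/probe/probe/1.0-SNAPSHOT/probe.pom
```

Run from cron, the two status lines per interval would show whether uploads stop working before the page check does, and roughly when.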
There is nothing in archiva.log to indicate any problem. In the example above,
the log is "business as usual" until roughly 1am, and then there are no other
messages until Archiva gets bounced, at which point we see the normal logging
messages showing that Archiva is starting up.
The problem occurred every day last week, but on Sunday and Monday everything
was fine. The problem resumed on Tuesday night, but exactly 1 hour later than
it usually occurs.
A few weeks ago we deleted the index and database to let Archiva create them
fresh, and this eliminated a large number of errors which we had been getting
on a daily basis. So the index and database are probably in better shape now
than they'd been in the past.
We've looked to see if something external to Archiva, such as system backups,
might be tying up resources and affecting Archiva performance. We've not found
anything that seems like a suspect.
Our primary theory at this point is that the repository scans are somehow
adversely affecting Archiva and at some point 1-2 hours after they complete,
Archiva becomes unresponsive. We are setting up some test builds to run over
the holiday weekend which will hopefully give us more data to prove or disprove
this (we are running a build every hour so we can better pinpoint exactly when
uploads stop working).
Any ideas?
As a workaround, based on the assumption that the scans are causing the
problem, we have considered changing the repository scan from daily to weekly.
Would this affect any aspect of Archiva other than the ability to search for
recently uploaded artifacts?
Thanks,
David