On 23 July 2014 23:28, Stallard,David <[email protected]> wrote:
> Brett, we did do some tweaking to the cron schedules for both snapshots
> and internal yesterday, that's probably what initiated the scan. And I'm
> guessing a directory scan of snapshots is sitting in the queue waiting for
> the internal scan to finish. We will probably bounce Archiva to stop
> these scans and clear the queues. Is there any harmful side effect to
> bouncing during a scan? I think we've done it before without impact. As
> an enhancement, an admin button to abort an in-progress scan would be
> useful.
>
I don't see any weird side effect.
Good idea regarding the button to abort. Can you create a jira issue for that?

Thanks
Olivier

> Thanks,
> David
>
>
> On 7/23/14, 12:59 AM, "Brett Porter" <[email protected]> wrote:
>
>>From a quick look at the code, it looks like that scan will happen
>>whenever the configuration for the repository is changed. Is that what
>>happened for you?
>>
>>Not sure if that was intentional or not.
>>
>>- Brett
>>
>>On 23 Jul 2014, at 7:13 am, Stallard,David <[email protected]> wrote:
>>
>>> We have roughly 1.6 terabytes of data in our largest Archiva instance
>>>and it grows rapidly. Because of this amount of data, and/or perhaps
>>>because of limitations of our current hardware (which we are working to
>>>improve), doing a full directory scan degrades performance of Archiva as
>>>a whole and it can take quite a long time to complete...48 hours or more.
>>>
>>> Because of that, we don't do directory scans unless we feel it's
>>>necessary to fix some unusual situation. The index scans are usually
>>>sufficient.
>>>
>>> Today, a directory scan of the internal repository mysteriously started
>>>up. Although the System Status page doesn't say what type of scan is
>>>running, I believe it's a directory scan because the Files Processed
>>>number is equal to the New Files number. This has bogged down the
>>>system as expected and we're getting complaints from users about uploads
>>>and downloads taking a long time.
>>>
>>> Looking in the log to try and find how this scan was started, I found
>>>the following line:
>>>
>>> 2014-07-22 11:09:26,770 [pool-5-thread-1] INFO
>>>org.apache.archiva.scheduler.repository.ArchivaRepositoryScanningTaskExecutor
>>>[] - Executing task from queue with job name: RepositoryTask
>>>[repositoryId=internal, resourceFile=null, scanAll=true,
>>>updateRelatedArtifacts=false]
>>>
>>> This seems to indicate that either the scheduler kicked it off, or at
>>>some point in the past a directory scan was added to the queue and it is
>>>just now being processed. I don't know if the latter is even possible
>>>or not...I thought that the stuff in the queue was individual artifacts
>>>that had been marked by scans for later processing.
>>>
>>> Our Cron Expression for the internal repository is the following, which
>>>should not have kicked off a scan at the time shown above. However,
>>>even if it did, I believe that the Cron Expression usually kicks off
>>>index scans rather than directory scans?
>>>
>>> 0 0 19 * * ?
>>>
>>> So, two questions:
>>>
>>>
>>> 1. Any idea why this directory scan might have been started?
>>> 2. Is there any way to stop a scan after it has started? I'm
>>>assuming a bounce of Archiva would stop it, but an option that didn't
>>>incur downtime would be preferable.
>>>
>>> Thanks,
>>> David
>>
>
>

-- 
Olivier Lamy
Ecetera: http://ecetera.com.au
http://twitter.com/olamy | http://linkedin.com/in/olamy
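
For anyone triaging a similar surprise scan, the check David describes in the quoted message (does the log timestamp line up with the cron schedule, and was scanAll set?) can be sketched in Python. This is only a minimal sketch, not Archiva code. The function names parse_repository_task and matches_daily_cron are hypothetical, the regular expression assumes the log layout quoted in this thread, and the cron check assumes the expression uses the six-field seconds/minutes/hours/day-of-month/month/day-of-week layout and only handles a fixed daily time.

```python
import re
from datetime import datetime

# The log line quoted in this thread, rejoined onto one line.
LOG_LINE = (
    "2014-07-22 11:09:26,770 [pool-5-thread-1] INFO "
    "org.apache.archiva.scheduler.repository.ArchivaRepositoryScanningTaskExecutor [] - "
    "Executing task from queue with job name: RepositoryTask "
    "[repositoryId=internal, resourceFile=null, scanAll=true, "
    "updateRelatedArtifacts=false]"
)


def parse_repository_task(line):
    """Return (timestamp, fields) for a RepositoryTask log line, or None.

    Hypothetical helper: assumes the 'key=value, key=value' layout seen above.
    """
    match = re.match(
        r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*RepositoryTask \[(.*)\]\s*$",
        line,
    )
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    fields = dict(pair.split("=", 1) for pair in match.group(2).split(", "))
    return timestamp, fields


def matches_daily_cron(timestamp, cron):
    """Check a timestamp against a cron of the form 'S M H * * ?'.

    Assumption: six fields, fixed daily time only. Anything else is rejected.
    """
    second, minute, hour, day_of_month, month, day_of_week = cron.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "?"):
        raise ValueError("only simple daily expressions are handled in this sketch")
    scheduled = (int(hour), int(minute), int(second))
    return (timestamp.hour, timestamp.minute, timestamp.second) == scheduled


timestamp, fields = parse_repository_task(LOG_LINE)
# Was this task flagged as a scan-all task?
print(fields["scanAll"])
# Did it start at the time the daily cron expression would fire?
print(matches_daily_cron(timestamp, "0 0 19 * * ?"))
```

If the second check comes back false, as it does for the line above, the task was started by something other than that schedule, which fits Brett's suggestion that a repository configuration change queued it.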
