After studying the code and Steven Denny's analysis in JIRA, I think he is right.

"Note that the queue is created and then immediately reaped, and after
totalSize is incremented, that queue does not appear in the list, even
though it supposedly has the item added to it. The upshot is that the url
is never fetched (as the queue has gone) so totalSize never = 0, and
eventually the abort will happen. In short I'd say this is a sync issue,
but I'm not sure where the best place to lock would be."

It seems there is a race condition between addFetchItem and getFetchItem
if addFetchItem is not synchronized. If getFetchItem is called before
addFetchItem has finished, the queue is reaped, and addFetchItem later
increments the counter even though the queue is gone.

reinhard schwab wrote:
> sorry, i have overseen a method with the same name in FetchItemQueues.
> line number 394 in my code version after expanding the import statements.
> will test it.
>
> reinhard schwab wrote:
>
>> i have had now the opportunity to test again fetching.
>> it has looked good so far until now.
>> again the same behaviour.
>>
>> i have added a synchronized modifier to one method before and rebuild.
>>
>> public synchronized void addFetchItem(FetchItem it) {
>>   if (it == null)
>>     return;
>>   queue.add(it);
>> }
>>
>> line number is different here, but it should be this method, as it is
>> the only one with this name.
>> i will try to debug and analyze the code.
>>
>> kevin chen wrote:
>>
>>> It worked for me.
>>>
>>> On Tue, 2010-01-26 at 09:31 +0000, Julien Nioche wrote:
>>>
>>>> See https://issues.apache.org/jira/browse/NUTCH-719. A solution has
>>>> been proposed but I am not sure that it really fixes the problem.
>>>> J.
>>>>
>>>> 2010/1/26 reinhard schwab <reinhard.sch...@aon.at>:
>>>>
>>>>> sometimes i watch
>>>>>
>>>>> -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
>>>>> -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
>>>>> Aborting with 10 hung threads.
>>>>>
>>>>> if i connect with jconsole, all fetcher threads are sleeping.
>>>>> something wrong with fetchQueues totalSize?
>>>>>
>>>>> before it has logged only 1 queue despite it claims to have 2.
>>>>>
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - -activeThreads=10,
>>>>> spinWaiting=9, fetchQueues.totalSize=2
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - * queue:
>>>>> http://80.120.141.100
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - maxThreads = 1
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - inProgress = 0
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - crawlDelay = 1000
>>>>> 2010-01-26 03:07:57,422 INFO fetcher.Fetcher - minCrawlDelay = 0
>>>>> 2010-01-26 03:07:57,423 INFO fetcher.Fetcher - nextFetchTime =
>>>>> 1264471678263
>>>>> 2010-01-26 03:07:57,423 INFO fetcher.Fetcher - now =
>>>>> 1264471677423
>>>>> 2010-01-26 03:07:57,423 INFO fetcher.Fetcher - 0.
>>>>> http://www.wirtshauskultur.at/default.asp?id=9574&tt=WHK_R17
>>>>>
>>>>> regards
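
PS: to make the suspected race between addFetchItem and getFetchItem easier to discuss, here is a minimal sketch of the locking idea. It is not the Nutch Fetcher code. The class FetchQueuesSketch, the use of plain String urls instead of FetchItem, and the helpers getTotalSize and getQueueCount are all made up for illustration. The assumption is that creating the queue, adding the item and incrementing the counter must happen under the same lock that getFetchItem holds while it reaps empty queues.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Queue;

// Hypothetical, simplified model of per-host fetch queues (NOT the Nutch code).
class FetchQueuesSketch {
    private final Map<String, Queue<String>> queues = new HashMap<>();
    private int totalSize = 0;

    // Queue creation, the add and the counter increment form one atomic step,
    // so getFetchItem cannot reap the freshly created queue in between.
    public synchronized void addFetchItem(String queueId, String url) {
        if (url == null) {
            return;
        }
        Queue<String> queue = queues.get(queueId);
        if (queue == null) {
            queue = new ArrayDeque<>();
            queues.put(queueId, queue);
        }
        queue.add(url);
        totalSize++;
    }

    // Reaps empty queues and returns the next url, or null if nothing is queued.
    // Holds the same lock as addFetchItem.
    public synchronized String getFetchItem() {
        Iterator<Queue<String>> it = queues.values().iterator();
        while (it.hasNext()) {
            Queue<String> queue = it.next();
            if (queue.isEmpty()) {
                it.remove(); // reap the empty queue
                continue;
            }
            totalSize--;
            return queue.poll();
        }
        return null;
    }

    public synchronized int getTotalSize() {
        return totalSize;
    }

    public synchronized int getQueueCount() {
        return queues.size();
    }

    // One producer and one consumer hammer the queues concurrently.
    public static void main(String[] args) throws InterruptedException {
        final FetchQueuesSketch fetchQueues = new FetchQueuesSketch();
        final int n = 10000;
        final int[] fetched = {0}; // written only by the consumer thread

        Thread producer = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                fetchQueues.addFetchItem("host" + (i % 7), "http://host" + (i % 7) + "/" + i);
            }
        });
        Thread consumer = new Thread(() -> {
            while (fetched[0] < n) {
                if (fetchQueues.getFetchItem() != null) {
                    fetched[0]++;
                } else {
                    Thread.yield();
                }
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        System.out.println("fetched=" + fetched[0] + ", totalSize=" + fetchQueues.getTotalSize());
    }
}
```

With both methods synchronized on the same object, every counted item stays reachable until it is fetched, so the run should end with totalSize back at 0. If the two methods locked different objects (or addFetchItem took no lock at all), the reaping in getFetchItem could interleave with the add, which is the situation Steven describes.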