Yes, I found two corrupted segments, but not with Luke, which did not help on this one. Even the faulty segments opened without any problem. I logged the HitDetails to find out which segments were causing the error.
An improvement could be to catch the exception and log the segment id so that it is found quickly. Thx anyway.

2011/3/7 Andrzej Bialecki <[email protected]>

> On 3/7/11 10:27 AM, MilleBii wrote:
>
>> Randomly I now seem to get this error in production where it was working
>> fine for more than a year....
>>
>> java.lang.NullPointerException
>>
>>> at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:248)
>>> at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:63)
>>> at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:53)
>>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>> at java.lang.Thread.run(Thread.java:636)
>>>
>>
>> + for some queries, the first hit pages are fine and suddently it stops and
>> I get a blank page, for some I get it on the query
>> + I checked the query with Luke. Looked fine
>> + the preceding bean call in search.jsp (bean.search(query, start +
>> hitsToRetrieve, hitsPerSite, "site", sort, reverse); did not generate any
>> exception as far as I can judge.
>>
>> What can be the cause of that ? how to debug that one ?
>>
>> I'm using Nutch1.0.
>>
>
> One of your segments may be corrupt - usually this means it's either not
> fetched, or not parsed, or truly corrupt (or missing). The expected list of
> valid segments is the list of segment names that was used to produce the
> index - segment names are recorded in Lucene indexes. You could open all
> indexes (e.g. with Luke) and see what are the top terms in the "segment"
> field.
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>

--
-MilleBii-
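
For illustration, here is a minimal sketch of the catch-and-log idea mentioned above. It is not a patch against Nutch: `SegmentAwareSummarizer` and `SummarySource` are hypothetical names, and `SummarySource` merely stands in for whatever ends up calling `FetchedSegments.getSummary`. It assumes the segment name of each hit is available before the summary is requested (I read it from the HitDetails), and it counts failures per segment so a corrupt segment shows up in the log right away.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper: never lets a summary failure escape, and records the segment. */
final class SegmentAwareSummarizer {
    private static final Logger LOG =
        Logger.getLogger(SegmentAwareSummarizer.class.getName());

    /** Hypothetical stand-in for the real summary lookup (not a Nutch interface). */
    interface SummarySource {
        String summarize(String segment, String url) throws Exception;
    }

    private final SummarySource source;
    // Summaries are fetched from a thread pool, so use a concurrent map.
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();

    SegmentAwareSummarizer(SummarySource source) {
        this.source = source;
    }

    /** Returns the summary, or an empty string if the lookup failed for this hit. */
    String summaryOrEmpty(String segment, String url) {
        try {
            String summary = source.summarize(segment, url);
            return summary == null ? "" : summary;
        } catch (Exception e) { // also covers NullPointerException
            failures.merge(segment, 1, Integer::sum);
            LOG.log(Level.WARNING,
                "Summary failed for segment " + segment + " (url " + url + ")", e);
            return "";
        }
    }

    /** Sorted snapshot of the failure count per segment name. */
    Map<String, Integer> failuresBySegment() {
        return new TreeMap<>(failures);
    }

    public static void main(String[] args) {
        // Made-up segment names; the second one simulates a corrupt segment.
        SegmentAwareSummarizer summarizer = new SegmentAwareSummarizer((segment, url) -> {
            if ("20110301120000".equals(segment)) {
                throw new NullPointerException("simulated corrupt segment");
            }
            return "summary of " + url;
        });
        summarizer.summaryOrEmpty("20110228090000", "http://example.com/a");
        summarizer.summaryOrEmpty("20110301120000", "http://example.com/b");
        System.out.println(summarizer.failuresBySegment());
    }
}
```

With something like this the search page would show an empty summary for the affected hits instead of going blank, and the log would name the segment to re-fetch, re-parse or remove.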

