Re: [Nutch-general] 0.9 ClassCastException: org.apache.hadoop.io.Text

Lauren Massa Lochridge Mon, 23 Apr 2007 19:43:28 -0700


Ken,

Thanks very much - you were right. I'd never before made the mistake of copying the newly created (0.9) /crawl over the existing one, which added the new segments to the existing 0.8.1 segments rather than deleting all of the old 0.8.1 data and replacing /crawl entirely. Your response prompted me to look at that again, and sure enough, that's what it was!

Thanks.
Lauren Massa-Lochridge
eXlr8, Inc.
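
For anyone hitting the same thing: the point is to replace the crawl directory rather than copy into it, since a plain copy leaves the old segments in place next to the new ones. Below is a minimal sketch of "delete, then copy" using only the JDK. The class and method names (ReplaceCrawlDir, replaceCrawl and so on) and any paths passed in are hypothetical, it assumes the crawl lives on the local file system, and it is not part of Nutch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch: replaces a crawl directory
// wholesale so that no segments from an older version are left behind.
class ReplaceCrawlDir {

    // Deletes dir and everything under it; does nothing if dir is absent.
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder())
                                   .collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies the tree rooted at source to target, creating directories as needed.
    static void copyRecursively(Path source, Path target) {
        try (Stream<Path> walk = Files.walk(source)) {
            // Files.walk visits a directory before its contents.
            List<Path> paths = walk.collect(Collectors.toList());
            for (Path p : paths) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes the live crawl first, then copies the new crawl into its place.
    static void replaceCrawl(Path newCrawl, Path liveCrawl) {
        deleteRecursively(liveCrawl);
        copyRecursively(newCrawl, liveCrawl);
    }
}
```

Calling copyRecursively alone on an existing directory reproduces the mistake described above: the old segment directories survive alongside the new ones.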


Ken Krugler wrote:

Any opinions about a problem we have with 0.9 are appreciated.
The problem is that hits are found via command line NutchBean invocation (in this small test case, 333 hits); however, the result set is zero hits due to the exception. Luke also accesses these same indexes just fine.
Got the Hadoop patch that was referred to in the archives, because the description seemed applicable; however, it appears to be the same version of hadoop-core: 12.2.2 that came with nutch 0.9. Is that patch already integrated into the most recent 0.9 nutch release, or is it otherwise not applicable? Can someone tell me what the problem is, given the exception in the log below?
This looks similar to a problem I had when I was trying to use an older crawl (one generated by a version of Nutch in between 0.8.1 and 0.9) with the 0.9 distribution.
E.g. if the page content was saved using an older version of Nutch, then when the summarizer tries to load the content, you can run into this exception.
-- Ken
Thanks.
Lauren Lochridge
eXlr8, Inc.

   $ bin/nutch org.apache.nutch.searcher.NutchBean news
   Total hits: 333
   Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: org.apache.hadoop.io.Text
           at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:204)
           at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:344)
           at org.apache.nutch.searcher.NutchBean.main(NutchBean.java:395)
   Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text
           at org.apache.hadoop.io.UTF8.compareTo(UTF8.java:123)
           at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:107)
           at org.apache.hadoop.io.MapFile$Reader.binarySearch(MapFile.java:369)
           at org.apache.hadoop.io.MapFile$Reader.seek(MapFile.java:338)
           at org.apache.hadoop.io.MapFile$Reader.get(MapFile.java:392)
           at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:86)
           at org.apache.nutch.searcher.FetchedSegments$Segment.getEntry(FetchedSegments.java:95)
           at org.apache.nutch.searcher.FetchedSegments$Segment.getParseText(FetchedSegments.java:86)
           at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:159)
           at org.apache.nutch.searcher.FetchedSegments$SummaryThread.run(FetchedSegments.java:177)
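
The "Caused by" frames suggest the mechanism: the failing call is UTF8.compareTo, so the key read from the segment's map file appears to have been a UTF8, while the key handed in for the lookup was a Text, which the cast inside compareTo rejects. The following is a minimal, self-contained sketch of that failure mode. OldKey, NewKey, binarySearch and lookup are hypothetical stand-ins written for illustration only; they are not the Hadoop classes, and the real UTF8 and MapFile code is assumed here, not reproduced.

```java
// Illustration only: stand-in types, NOT org.apache.hadoop.io.UTF8 / Text.
class KeyClassMismatchDemo {

    // Stand-in for the older key type stored in an old segment.
    static class OldKey implements Comparable<Object> {
        final String value;

        OldKey(String value) {
            this.value = value;
        }

        @Override
        public int compareTo(Object other) {
            // Unchecked downcast: throws ClassCastException for any other type.
            return value.compareTo(((OldKey) other).value);
        }
    }

    // Stand-in for the newer key type used by the newer reader.
    static class NewKey implements Comparable<Object> {
        final String value;

        NewKey(String value) {
            this.value = value;
        }

        @Override
        public int compareTo(Object other) {
            return value.compareTo(((NewKey) other).value);
        }
    }

    // Binary search over sorted index keys, in the spirit of a map file lookup.
    // Returns the position of the key, or -1 if it is not present.
    static int binarySearch(Comparable<Object>[] index, Object key) {
        int low = 0;
        int high = index.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = index[mid].compareTo(key); // the stored key does the comparing
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    // Describes the outcome of a lookup instead of letting the exception escape.
    static String lookup(Comparable<Object>[] index, Object key) {
        try {
            return "index " + binarySearch(index, key);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // An "old segment": every stored key is an OldKey.
        Comparable<Object>[] oldIndex = new OldKey[] {
            new OldKey("http://a/"), new OldKey("http://b/"), new OldKey("http://c/")
        };
        // Same key type on both sides: the lookup succeeds.
        System.out.println(lookup(oldIndex, new OldKey("http://b/")));
        // Newer key type against old data: the cast in compareTo fails.
        System.out.println(lookup(oldIndex, new NewKey("http://b/")));
    }
}
```

In this sketch the index itself is readable, which is consistent with the hit count being reported; the failure only shows up once a key of the newer type is compared against keys stored by the older version.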
