[ https://issues.apache.org/jira/browse/NUTCH-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478683 ]
Nathan ter Bogt commented on NUTCH-422:
---------------------------------------

Has anyone got the binary version of this module to work? I get to the indexing part and get the following error:

Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
        at org.apache.nutch.indexer.Indexer.index(Indexer.java:296)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:121)

And this is what I get in my hadoop log:

2007-03-07 15:26:33,272 INFO  indexer.Indexer - Optimizing index.
2007-03-07 15:26:33,275 WARN  mapred.LocalJobRunner - job_qq3l2z
java.lang.NoClassDefFoundError: org/jdom/JDOMException
        at org.apache.nutch.indexer.extra.ExtraIndexingFilter.filter(ExtraIndexingFilter.java:68)
        at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:72)
        at org.apache.nutch.indexer.Indexer.reduce(Indexer.java:235)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:247)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:112)

Any help would be greatly appreciated.

Lastly, I'm all for the query-extra plugin also.

> index-extra plugin creates additional fields in the index, based on configurable logic
> --------------------------------------------------------------------------------------
>
>                 Key: NUTCH-422
>                 URL: https://issues.apache.org/jira/browse/NUTCH-422
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>    Affects Versions: 0.8.1
>         Environment: All environments
>            Reporter: Alan Tanaman
>         Assigned To: Sami Siren
>         Attachments: index-extra-v1.0-bin-java1.5.zip, index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
>
> A. Introduction
> The index-extra plugin allows you to configure additional fields that you
> wish to be added to the index, based on one of the following sources:
>   - The parsed text
>   - Meta data fields
>   - Previously created document-to-be-indexed fields
>   - Plain constant string
>   - Java expression combining one or more of the above, and resolving to a string
> A regex can also be applied to any of the above, allowing fields to be
> created based on patterns extracted from the source.
>
> B. Installation
> 1) Binaries only: Copy the 'index-extra' folder within index-extra-v1.0-bin-java1.5.zip to NUTCHDIR/build
>                   Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure
>                   Enable the plugin by updating the nutch-site.xml file
> 2) Source code:   Always refer to the Nutch wiki for detailed instructions on building Nutch. In short:
>                   Copy the 'index-extra' folder within index-extra-v1.0-source.zip to NUTCHDIR/src/plugin
>                   Update the build.xml in NUTCHDIR/src/plugin to include plugin
>                   Update the NUTCHDIR/default.properties file to include plugin
>                   run ant to build
>                   Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure
>                   Enable the plugin by updating the nutch-site.xml file
>
> C. Known Issues
> 1) For this plugin to work correctly on any document field, it is
> necessary to run the other index filters first, so that all basic
> document fields are generated first. To do this, configure the
> indexingfilter.order property. (Please see patch NUTCH-421 to enable
> indexingfilter.order property. If this patch is not applied, the plugin
> will still work, but will not be able to use document fields created by
> other index filter plugins.)
> 2) At this stage, field boost can not be used as Nutch scoring overrides
> the field boost with its own document-level boost calculation. This
> occurs at the end of org.apache.nutch.indexer.Indexer's reduce method.

-- 
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers