[ https://issues.apache.org/jira/browse/NUTCH-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sami Siren reassigned NUTCH-422:
--------------------------------

    Assignee: Sami Siren

> index-extra plugin creates additional fields in the index, based on configurable logic
> --------------------------------------------------------------------------------------
>
>                 Key: NUTCH-422
>                 URL: https://issues.apache.org/jira/browse/NUTCH-422
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>    Affects Versions: 0.8.1
>         Environment: All environments
>            Reporter: Alan Tanaman
>         Assigned To: Sami Siren
>         Attachments: index-extra-v1.0-bin-java1.5.zip, index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
>
> A. Introduction
>
> The index-extra plugin lets you configure additional fields to be added to
> the index, based on one of the following sources:
>   - The parsed text
>   - Metadata fields
>   - Previously created document-to-be-indexed fields
>   - A plain constant string
>   - A Java expression combining one or more of the above and resolving to a string
> A regex can also be applied to any of the above, allowing fields to be
> created based on patterns extracted from the source.
>
> B. Installation
>
> 1) Binaries only:
>      Copy the 'index-extra' folder within index-extra-v1.0-bin-java1.5.zip to NUTCHDIR/build
>      Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure it
>      Enable the plugin by updating the nutch-site.xml file
> 2) Source code: Always refer to the Nutch wiki for detailed instructions on
>    building Nutch. In short:
>      Copy the 'index-extra' folder within index-extra-v1.0-source.zip to NUTCHDIR/src/plugin
>      Update the build.xml in NUTCHDIR/src/plugin to include the plugin
>      Update the NUTCHDIR/default.properties file to include the plugin
>      Run ant to build
>      Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure it
>      Enable the plugin by updating the nutch-site.xml file
>
> C. Known Issues
>
> 1) For this plugin to work correctly on any document field, the other index
>    filters must run first, so that all basic document fields have already
>    been generated. To do this, configure the indexingfilter.order property.
>    (Please see patch NUTCH-421 to enable the indexingfilter.order property.
>    If this patch is not applied, the plugin will still work, but will not be
>    able to use document fields created by other index filter plugins.)
> 2) At this stage, field boost cannot be used, because Nutch scoring overrides
>    the field boost with its own document-level boost calculation. This
>    occurs at the end of org.apache.nutch.indexer.Indexer's reduce method.
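
The Readme extract says a regex can be applied to a source (parsed text, a metadata field, a constant, or an expression) to derive a new index field. The following is only a minimal sketch of that idea using the standard java.util.regex API. The class name ExtraFieldSketch, the extract helper, the field names and the sample text are all hypothetical; the plugin's actual classes and the format of index-extra-conf.xml are not shown in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not the index-extra plugin's actual classes or configuration.
public class ExtraFieldSketch {

    /**
     * Applies a regex to a source string and returns the first capture group
     * of the first match, the whole match if the pattern has no groups,
     * or null if nothing matches.
     */
    static String extract(String source, String regex) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (!m.find()) {
            return null;
        }
        return m.groupCount() >= 1 ? m.group(1) : m.group();
    }

    public static void main(String[] args) {
        // Stand-ins for two of the sources listed in the Readme:
        // the parsed text and a metadata field.
        String parsedText =
            "Quarterly report. Reference: DOC-2006-0417. Prepared by the finance team.";
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("author", "Finance Team");

        // Fields that would be added to the document-to-be-indexed.
        Map<String, String> extraFields = new LinkedHashMap<>();
        // Regex applied to the parsed text.
        extraFields.put("docref", extract(parsedText, "Reference: (DOC-\\d{4}-\\d{4})"));
        // Plain constant string.
        extraFields.put("source", "intranet");
        // Expression combining a constant with a metadata field.
        extraFields.put("byline", "By " + metadata.get("author"));

        for (Map.Entry<String, String> e : extraFields.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

In a real indexing filter the resulting values would be added to the document being indexed rather than printed. As Known Issue 1 notes, reading fields created by other index filters would also depend on the filter order.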