[ https://issues.apache.org/jira/browse/NUTCH-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren reassigned NUTCH-422:
--------------------------------

    Assignee: Sami Siren

> index-extra plugin creates additional fields in the index, based on 
> configurable logic
> --------------------------------------------------------------------------------------
>
>                 Key: NUTCH-422
>                 URL: https://issues.apache.org/jira/browse/NUTCH-422
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>    Affects Versions: 0.8.1
>         Environment: All environments
>            Reporter: Alan Tanaman
>         Assigned To: Sami Siren
>         Attachments: index-extra-v1.0-bin-java1.5.zip, 
> index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
> A.  Introduction
>     The index-extra plugin lets you configure additional fields to be added 
> to the index, each based on one of the following sources:
>       - The parsed text
>       - Metadata fields
>       - Previously created document-to-be-indexed fields
>       - A plain constant string
>       - A Java expression that combines one or more of the above and 
> resolves to a string
>     A regex can also be applied to any of the above, allowing fields to be 
> created based on patterns extracted from the source.
> B.  Installation
>     1)  Binaries only:
>           - Copy the 'index-extra' folder within 
> index-extra-v1.0-bin-java1.5.zip to NUTCHDIR/build.
>           - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and 
> configure it.
>           - Enable the plugin by updating the nutch-site.xml file.
>     2)  Source code:  (Always refer to the Nutch wiki for detailed 
> instructions on building Nutch.  In short:)
>           - Copy the 'index-extra' folder within 
> index-extra-v1.0-source.zip to NUTCHDIR/src/plugin.
>           - Update the build.xml in NUTCHDIR/src/plugin to include the 
> plugin.
>           - Update the NUTCHDIR/default.properties file to include the 
> plugin.
>           - Run ant to build.
>           - Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and 
> configure it.
>           - Enable the plugin by updating the nutch-site.xml file.
> C.  Known Issues
>     1)  For this plugin to work correctly on any document field, the other 
> index filters must run first, so that all basic document fields have 
> already been generated.  To do this, configure the indexingfilter.order 
> property.  (Please see patch NUTCH-421 to enable the indexingfilter.order 
> property.  If this patch is not applied, the plugin will still work, but 
> will not be able to use document fields created by other index filter 
> plugins.)
>     2)  At this stage, field boost cannot be used, because Nutch scoring 
> overrides the field boost with its own document-level boost calculation.  
> This occurs at the end of org.apache.nutch.indexer.Indexer's reduce method.
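
The Readme extract above describes deriving a field by applying a regex to a source, and notes that filter order matters when the source is a field created by another filter. A minimal sketch of both ideas follows, using only the Java standard library. It is not the plugin's code and does not use the Nutch IndexingFilter API: the class ExtraFieldSketch, its methods, the "url" and "host" field names, and the example regex are all hypothetical, and a plain Map stands in for the document being indexed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: NOT the index-extra plugin and NOT a Nutch API.
// A "document" is modelled as a Map of field name to value, and a "filter" as a
// Consumer that may add fields to that map.
class ExtraFieldSketch {

    /** Returns the first capture group of regex found in source, or null if none. */
    static String extract(String source, String regex) {
        if (source == null) {
            return null; // the source field has not been created (yet)
        }
        Matcher m = Pattern.compile(regex).matcher(source);
        return m.find() ? m.group(1) : null;
    }

    /** Filter that adds targetField, derived from sourceField via regex. */
    static Consumer<Map<String, String>> extraField(
            String targetField, String sourceField, String regex) {
        return doc -> {
            String value = extract(doc.get(sourceField), regex);
            if (value != null) {
                doc.put(targetField, value);
            }
        };
    }

    /** Stand-in for a basic filter that creates the field the extra filter reads. */
    static Consumer<Map<String, String>> basicUrlField(String url) {
        return doc -> doc.put("url", url);
    }

    /** Applies the filters, in list order, to a fresh empty document. */
    static Map<String, String> run(List<Consumer<Map<String, String>>> filters) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (Consumer<Map<String, String>> filter : filters) {
            filter.accept(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Consumer<Map<String, String>> basic =
                basicUrlField("http://example.org/docs/a.html");
        Consumer<Map<String, String>> extra =
                extraField("host", "url", "^https?://([^/]+)");

        // Extra filter last: the "url" field exists, so "host" can be derived.
        System.out.println(run(Arrays.asList(basic, extra)));
        // Extra filter first: "url" is not there yet, so no "host" is added.
        System.out.println(run(Arrays.asList(extra, basic)));
    }
}
```

In this sketch the second run produces a document without the derived field, which mirrors the behaviour described in Known Issue 1: without a guaranteed ordering, the filter still runs but cannot see fields created by other filters.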

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
