[
https://issues.apache.org/jira/browse/NUTCH-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116743#comment-13116743
]
Markus Jelsma commented on NUTCH-1005:
--------------------------------------
I agree with Julien: it's the most flexible solution, although it may be a bit
more work to set up for simple extraction. But since this feature is missing
right now and heading text is very important for relevance ranking, I feel we
should add this in 1.4 anyway.
If not, we should mark this for no version and set it as Won't Fix.
> Index headings plugin
> ---------------------
>
> Key: NUTCH-1005
> URL: https://issues.apache.org/jira/browse/NUTCH-1005
> Project: Nutch
> Issue Type: New Feature
> Components: indexer, parser
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Minor
> Fix For: 1.4
>
> Attachments: HeadingsIndexingFilter.java, HeadingsParseFilter.java,
> NUTCH-1005-1.4-2.patch, NUTCH-1005-1.4-3.patch
>
>
> Very simple plugin for extracting and indexing a comma-separated list of
> headings via the headings configuration directive.