[ https://issues.apache.org/jira/browse/NUTCH-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney reassigned NUTCH-2414:
-------------------------------------------

    Assignee: Lewis John McGibbney

> Allow LanguageIndexingFilter to actually filter documents by language.
> ----------------------------------------------------------------------
>
>                 Key: NUTCH-2414
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2414
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin
>    Affects Versions: 1.13
>            Reporter: Yossi Tamari
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 1.14
>
>
> It is often useful to only index pages in select languages (e.g. only those
> languages that we intend to search in). At first glance it seems that this is
> done by LanguageIndexingFilter, but currently all the filter does is add the
> language as a field to the index.
> We can add a configuration property to LanguageIndexingFilter that will allow
> it to only index languages specified in this property.
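
The proposal in the description (a configuration property listing the languages to index) could be sketched roughly as follows. This is a standalone illustration and not the Nutch implementation: the class name `LanguageAllowList`, the comma-separated value format, and the rule that an empty value means "index everything" are all assumptions made for the example. In an actual Nutch indexing filter, the decision would presumably be applied by returning `null` from `filter(...)` for documents that should be discarded.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Standalone sketch, not Nutch code. The class name and the
// comma-separated property format are hypothetical.
final class LanguageAllowList {

  // Lower-cased language codes; an empty set means "no restriction".
  private final Set<String> allowed;

  private LanguageAllowList(Set<String> allowed) {
    this.allowed = allowed;
  }

  /** Parses a value such as "en, de"; null or blank means no restriction. */
  static LanguageAllowList parse(String propertyValue) {
    if (propertyValue == null || propertyValue.trim().isEmpty()) {
      return new LanguageAllowList(Collections.emptySet());
    }
    Set<String> codes = Arrays.stream(propertyValue.split(","))
        .map(code -> code.trim().toLowerCase(Locale.ROOT))
        .filter(code -> !code.isEmpty())
        .collect(Collectors.toSet());
    return new LanguageAllowList(codes);
  }

  /** Returns true if a document with this detected language should be indexed. */
  boolean allows(String detectedLanguage) {
    if (allowed.isEmpty()) {
      return true; // nothing configured: keep the current index-everything behaviour
    }
    if (detectedLanguage == null) {
      return false; // a restriction is configured but no language was detected
    }
    return allowed.contains(detectedLanguage.trim().toLowerCase(Locale.ROOT));
  }
}
```

With a configured value of `"en, de"`, `LanguageAllowList.parse("en, de").allows("fr")` evaluates to `false`, while an unset or empty property leaves every document eligible for indexing, so existing setups would behave as before.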