[jira] [Updated] (NUTCH-2703) parse-tika: Boilerpipe should not run for non-(X)HTML pages

2019-04-11 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2703: Priority: Minor (was: Critical) > parse-tika: Boilerpipe should not run for non-(X)HTML pages

[jira] [Updated] (NUTCH-2703) parse-tika: Boilerpipe should not run for non-(X)HTML pages

2019-03-20 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2703: Attachment: NUTCH-2703.patch > parse-tika: Boilerpipe should not run for non-(X)HTML pages

[jira] [Updated] (NUTCH-2703) parse-tika: Boilerpipe should not run for non-(X)HTML pages

2019-03-18 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2703: Summary: parse-tika: Boilerpipe should not run for non-(X)HTML pages (was: Boilerpipe should

[jira] [Updated] (NUTCH-2703) parse-tika: Boilerpipe should not run for non-(X)HTML pages

2019-03-18 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-2703: Component/s: plugin > parse-tika: Boilerpipe should not run for non-(X)HTML pages