Avoid parsing unnecessary links and get a more relevant outlink list
Key: NUTCH-488
URL: https://issues.apache.org/jira/browse/NUTCH-488
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emmanuel Joke updated NUTCH-488:
Attachment: DOMContentUtils.patch
Avoid parsing unnecessary links and get a more relevant outlink
[
https://issues.apache.org/jira/browse/NUTCH-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emmanuel Joke updated NUTCH-489:
Attachment: SuffixURLFilter.java.patch
suffix-urlfilter.txt.patch
URLFilter-suffix
[
https://issues.apache.org/jira/browse/NUTCH-489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12497770
]
Doğacan Güney commented on NUTCH-489:
This is obviously useful but:
* Your patches both in this issue and in
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-490:
Attachment: HtmlParser.java.diff
Patch for HtmlParser.
Extension point with filters for
[
https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marcin Okraszewski updated NUTCH-490:
Attachment: nutch-extensionpoins_plugin.xml.diff
Patch for plugin.xml in
[
https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12497851
]
Vadim Bauer commented on NUTCH-427:
There is an error in the plugin.xml file:
the plugin id should be protocol-smb
[
https://issues.apache.org/jira/browse/NUTCH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498041
]
Doug Cook commented on NUTCH-25:
Thanks! I'll take a look at your proposed patch... (that was fast! ask and ye
shall
[
https://issues.apache.org/jira/browse/NUTCH-489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emmanuel Joke updated NUTCH-489:
Attachment: SuffixURLFilter_v2.java.patch
My mistake...
I've added a new patch which is supposed