See https://builds.apache.org/job/Nutch-trunk/2617/
[
https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981964#comment-13981964
]
Navid Shekoufa commented on NUTCH-1714:
---
Thanks for this very suitable patch! I've
Hi Diaa,
Why doesn't Nutch assume that web links that have www. at the beginning
use the http protocol?
It would not be a big problem to do so. The url normalizer provides scopes
(inject, fetch, etc.): you only have to point the property
urlnormalizer.regex.file.inject to a special
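
As a rough illustration of the kind of rewrite such an inject-scope rule might perform, here is a minimal standalone Java sketch. It does not use Nutch's normalizer classes; the class name WwwSchemeSketch, the method assumeHttp, and the pattern itself are assumptions made for this example only.

```java
import java.util.regex.Pattern;

public class WwwSchemeSketch {
    // Hypothetical rule: match a URL that starts with "www." and carries no scheme.
    private static final Pattern LEADING_WWW =
            Pattern.compile("^(www\\.)", Pattern.CASE_INSENSITIVE);

    // Prepend "http://" when the URL begins with "www."; leave other URLs unchanged.
    static String assumeHttp(String url) {
        String trimmed = url.trim();
        return LEADING_WWW.matcher(trimmed).replaceFirst("http://$1");
    }

    public static void main(String[] args) {
        System.out.println(assumeHttp("www.example.com/page"));
        System.out.println(assumeHttp("https://www.example.com/"));
    }
}
```

URLs that already carry a scheme do not start with "www." and therefore pass through untouched, so a rule like this would only affect seed entries written without a protocol.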
[
https://issues.apache.org/jira/browse/NUTCH-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-566.
---
Resolution: Fixed
Was fixed by NUTCH-797 with version 1.4 (2.x will be patched soon), the
[
https://issues.apache.org/jira/browse/NUTCH-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-952:
--
Attachment: test_nutch_952.html
Was fixed by NUTCH-797 for v 1.4 (2.x will follow soon).
[
https://issues.apache.org/jira/browse/NUTCH-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-952.
---
Resolution: Fixed
fix outlink which started with '?' in html parser
[
https://issues.apache.org/jira/browse/NUTCH-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-566:
--
Fix Version/s: (was: 1.9)
Sun's URL class has bug in creation of relative query URLs
[
https://issues.apache.org/jira/browse/NUTCH-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-952:
--
Fix Version/s: (was: 1.9)
fix outlink which started with '?' in html parser
[
https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982116#comment-13982116
]
Sebastian Nagel commented on NUTCH-797:
---
Hi [~jnioche], is there anything left
[
https://issues.apache.org/jira/browse/NUTCH-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1764.
Resolution: Fixed
Fix Version/s: (was: 1.8)
+1
Thanks, [~diaa_abdallah]!
Iain Lopata created NUTCH-1765:
--
Summary: SolrClean to remove redirected URLs from Solr
Key: NUTCH-1765
URL: https://issues.apache.org/jira/browse/NUTCH-1765
Project: Nutch
Issue Type:
[
https://issues.apache.org/jira/browse/NUTCH-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13982128#comment-13982128
]
Hudson commented on NUTCH-1764:
---
SUCCESS: Integrated in Nutch-trunk #2618 (See