[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176172#comment-14176172
]
Sebastian Nagel commented on NUTCH-1483:
But URI.toString(), UrlUtil.toASCII(Strin
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176160#comment-14176160
]
Sebastian Nagel commented on NUTCH-1483:
The reason for (2) is best explained with
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176111#comment-14176111
]
Sebastian Nagel edited comment on NUTCH-1483 at 10/18/14 9:56 PM:
--
[
https://issues.apache.org/jira/browse/NUTCH-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1878:
---
Attachment: NUTCH-1878-v1.patch
Patch which adds additional negative context/look-behind to th
Sebastian Nagel created NUTCH-1878:
--
Summary: urlnormalizer-regex to keep third slash in file:///path/index.html
Key: NUTCH-1878
URL: https://issues.apache.org/jira/browse/NUTCH-1878
Project: Nutch
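To illustrate the problem named in the NUTCH-1878 summary, here is a minimal sketch of how a slash-collapsing rule can eat the third slash of a file: URL, and how an extra negative look-behind avoids that. Both patterns are assumptions written for this illustration; they are not copied from regex-normalize.xml or from NUTCH-1878-v1.patch, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: NOT the actual Nutch rule or the attached patch.
public class SlashCollapseSketch {

    // Assumed "naive" rule: collapse runs of two or more slashes,
    // unless the run starts directly after a colon (protects "http://").
    static final Pattern NAIVE = Pattern.compile("(?<!:)/{2,}");

    // Assumed "guarded" rule: additionally refuse to start a match
    // directly after ":/", so the run in "file:///" is left alone.
    static final Pattern GUARDED = Pattern.compile("(?<!:)(?<!:/)/{2,}");

    static String collapseNaive(String url) {
        return NAIVE.matcher(url).replaceAll("/");
    }

    static String collapseGuarded(String url) {
        return GUARDED.matcher(url).replaceAll("/");
    }

    public static void main(String[] args) {
        String fileUrl = "file:///path/index.html";
        String httpUrl = "http://example.com//a///b";

        // The naive rule matches the 2nd and 3rd slash of "file:///".
        System.out.println(collapseNaive(fileUrl));
        // The guarded rule leaves the file: URL unchanged.
        System.out.println(collapseGuarded(fileUrl));
        // Both rules still collapse duplicate slashes inside the path.
        System.out.println(collapseNaive(httpUrl));
        System.out.println(collapseGuarded(httpUrl));
    }
}
```

With the naive pattern the first slash after the colon is protected, but the match can still begin at the second slash, which is why a second look-behind is needed in this sketch.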
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1483:
---
Attachment: (was: NUTCH-1483.patch)
> Can't crawl filesystem with protocol-file plugin
> -
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1483:
---
Fix Version/s: 2.3
> Can't crawl filesystem with protocol-file plugin
> --
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176111#comment-14176111
]
Sebastian Nagel commented on NUTCH-1483:
You'll get it working if
(1) the above me
[
https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488558#comment-13488558
]
Sebastian Nagel edited comment on NUTCH-1483 at 10/18/14 7:24 PM:
--