[
https://issues.apache.org/jira/browse/NUTCH-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma closed NUTCH-824.
---
Bulk close of resolved issues for 1.3.
Crawling - File Error 404 when fetching file with an hexadecimal
[
https://issues.apache.org/jira/browse/NUTCH-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche resolved NUTCH-824.
-
Resolution: Fixed
Assignee: Julien Nioche (was: Markus Jelsma)
Have reactivated the tests
[
https://issues.apache.org/jira/browse/NUTCH-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925308#action_12925308
]
Markus Jelsma commented on NUTCH-824:
-
You're correct, no patch has been submitted and
[
https://issues.apache.org/jira/browse/NUTCH-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-824:
Affects Version/s: 2.0
1.3
1.2
Fix Version/s:
[
https://issues.apache.org/jira/browse/NUTCH-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michela Becchi resolved NUTCH-824.
--
Fix Version/s: 1.0.0
Resolution: Fixed
Hi,
I fixed (or, at least, circumvented) this by
Hi,
I circumvented this problem by modifying the
org.apache.nutch.protocol.file.FileResponse class belonging to the
protocol-file plugin.
In particular, at line 120, I added
String path = "".equals(url.getPath()) ? "/" : url.getPath();
+String decoded_path = path;
+try {
+
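The patch quoted above is cut off in this archive view, so its exact contents are not reproduced here. As a rough, self-contained sketch of the general idea (percent-decoding the path returned by url.getPath() before it is used as a file name), something like the following could be tried. The class name DecodePathSketch, the helper decodePath, and the sample file URL are assumptions made for illustration; they are not taken from Nutch or from the submitted patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLDecoder;

public class DecodePathSketch {

    // Hypothetical helper (not part of Nutch): percent-decodes a URL path
    // and falls back to the raw path when decoding is not possible.
    public static String decodePath(String path) {
        try {
            return URLDecoder.decode(path, "UTF-8");
        } catch (UnsupportedEncodingException | IllegalArgumentException e) {
            // Malformed escape such as "%zz": keep the path unchanged.
            return path;
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        // Illustrative URL only; %28 and %29 encode "(" and ")".
        URL url = new URL("file:/tmp/report%282010%29.txt");

        // Same expression as in the quoted FileResponse line.
        String path = "".equals(url.getPath()) ? "/" : url.getPath();

        // java.net.URL does not decode escapes, so the raw path still
        // contains %28 and %29 and would not match the file on disk.
        System.out.println(path);

        // The decoded form is what a java.io.File lookup would need.
        System.out.println(decodePath(path));
    }
}
```

One caveat with this sketch: URLDecoder follows form-encoding rules and also turns "+" into a space, so file names containing a literal "+" would need different handling (for example, going through java.net.URI instead).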
Crawling - File Error 404 when fetching file with an hexadecimal character in
the file name.
Key: NUTCH-824
URL: https://issues.apache.org/jira/browse/NUTCH-824
Hi Julien,
Thanks a lot.
I tried the same test you indicated (bin/nutch plugin protocol-file
org.apache.nutch.protocol.file ...) and got again an Error 404. Of course,
I don't get this error if, when issuing the command, I replace the
hexadecimal representation (e.g., %28 with "(").
I opened an