[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242491#comment-16242491
]
Yossi Tamari commented on NUTCH-2456:
Updated the title as you suggested.
It seems to me that in most
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yossi Tamari updated NUTCH-2456:
Summary: Allow to index pages/URLs not contained in CrawlDb (was:
Redirected documents are not
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242450#comment-16242450
]
Sebastian Nagel commented on NUTCH-2456:
Thanks, I'll have a look at the PR.
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242304#comment-16242304
]
Yossi Tamari commented on NUTCH-2456:
BTW, I submitted a PR that tries to be a minimal fix for this
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242295#comment-16242295
]
Yossi Tamari commented on NUTCH-2456:
db.update.additions.allowed is set to false, which I guess
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242247#comment-16242247
]
Sebastian Nagel edited comment on NUTCH-2456 at 11/7/17 4:02 PM:
For
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242247#comment-16242247
]
Sebastian Nagel edited comment on NUTCH-2456 at 11/7/17 3:56 PM:
For
[
https://issues.apache.org/jira/browse/NUTCH-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242247#comment-16242247
]
Sebastian Nagel commented on NUTCH-2456:
For every item in a redirect chain URL -> target_1 ->
[
https://issues.apache.org/jira/browse/NUTCH-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241941#comment-16241941
]
Jurian Broertjes commented on NUTCH-2431:
Will have a look at your feedback in the coming week
[
https://issues.apache.org/jira/browse/NUTCH-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241893#comment-16241893
]
Sebastian Nagel edited comment on NUTCH-2451 at 11/7/17 12:01 PM:
Ok,
[
https://issues.apache.org/jira/browse/NUTCH-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241893#comment-16241893
]
Sebastian Nagel commented on NUTCH-2451:
Ok, after a look at the code (Ftp.java): it's during
[
https://issues.apache.org/jira/browse/NUTCH-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241866#comment-16241866
]
Hiran Chaudhuri edited comment on NUTCH-2451 at 11/7/17 11:44 AM:
Let's
[
https://issues.apache.org/jira/browse/NUTCH-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241866#comment-16241866
]
Hiran Chaudhuri edited comment on NUTCH-2451 at 11/7/17 11:43 AM:
Let's
[
https://issues.apache.org/jira/browse/NUTCH-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241866#comment-16241866
]
Hiran Chaudhuri commented on NUTCH-2451:
Let's assume no suitable URLStreamHandler is registered.