[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2007-06-27 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508445 ] Doğacan Güney commented on NUTCH-289: It seems this issue has kind of died down, but this would be a great

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-11-16 Thread Uros Gruber (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12450315 ] Uros Gruber commented on NUTCH-289: One question. Why does the IP need to be in CrawlDatum and not in metadata?

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-05-31 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12413996 ] Andrzej Bialecki commented on NUTCH-289: Re: lookup in ParseOutputFormat: I respectfully disagree. Consider the scenario when you run Fetcher in non-parsing mode.

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-05-31 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12414114 ] Doug Cutting commented on NUTCH-289: It should be possible to partition by IP and limit fetchlists by IP. Resolving only in the fetcher is too late to implement these

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-05-30 Thread Matt Kangas (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12413939 ] Matt Kangas commented on NUTCH-289: +1 to saving the IP address in CrawlDatum, wherever the value comes from (Fetcher or otherwise).

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-05-30 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12413940 ] Stefan Groschupf commented on NUTCH-289: +1 Andrzej, I agree that looking up the IP in ParseOutputFormat would be best, as Doug suggested. The biggest problem Nutch has

[jira] Commented: (NUTCH-289) CrawlDatum should store IP address

2006-05-27 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-289?page=comments#action_12413604 ] Andrzej Bialecki commented on NUTCH-289: I'm not sure how to address round-robin DNS with your approach ... Also, I think the best place to resolve and record the