Status of language plugin

2006-05-19 Thread Teruhiko Kurosaka
Hello Jérôme, Because of other issues at work, I was away from Nutch. Now I'm back, and I see you are making progress according to your notes in JIRA. Is there an API doc or design doc that I can read to understand where you are? Is the language plugin architecture already in the main trunk?

[jira] Commented: (NUTCH-272) Max. pages to crawl/fetch per site (emergency limit)

2006-05-19 Thread Matt Kangas (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-272?page=comments#action_12412601 ] Matt Kangas commented on NUTCH-272: --- I've been thinking about this after hitting several sites that explode into 1.5 M URLs (or more). I could sleep easier at night if I

[jira] Commented: (NUTCH-272) Max. pages to crawl/fetch per site (emergency limit)

2006-05-19 Thread Matt Kangas (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-272?page=comments#action_12412614 ] Matt Kangas commented on NUTCH-272: --- btw, I'd love to be proven wrong, because if the generate.max.per.host parameter works as a hard URL cap per site, I could be sleeping

[jira] Commented: (NUTCH-272) Max. pages to crawl/fetch per site (emergency limit)

2006-05-19 Thread Stefan Neufeind (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-272?page=comments#action_12412620 ] Stefan Neufeind commented on NUTCH-272: --- Oh, I just discovered this new parameter was added in 0.8-dev :-) But to my understanding of the description in