Re: Fetcher2 Reduce Phase Question

2008-04-11 Thread Andrzej Bialecki
Sandeep Tata wrote: Hi Folks, I was just wondering what computation really happens in the reduce phase for Fetcher2 ? If Fetcher was running in the parsing mode, then in the reduce phase Outlinks are separated from Parse output and stored in crawl_parse, and other data in parse_text and

Re: Fetcher2's delay between successive requests

2007-04-24 Thread Doğacan Güney
I have discovered another bug in Fetcher2. Plugin lib-http checks Protocol.CHECK_{BLOCKING,ROBOTS} (which resolve to strings protocol.plugin.check.{blocking,robots}) to see if it should handle blocking or not. But fetcher2 sets http.plugin.check.{blocking,robots} (notice the protocol/http
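
The key mismatch described in this message can be illustrated with a small stand-alone sketch. It is not Nutch code: `java.util.Properties` stands in for the real configuration object, and the class name, the `readFlag` helper, and the default value of `true` are all assumptions made for illustration. Only the two key strings come from the message.

```java
import java.util.Properties;

public class ConfigKeyMismatchSketch {
    // Key that the reader (lib-http, per the message) looks up.
    public static final String CHECK_BLOCKING = "protocol.plugin.check.blocking";
    // Key that the writer (Fetcher2, per the message) sets instead.
    public static final String HTTP_CHECK_BLOCKING = "http.plugin.check.blocking";

    // Hypothetical helper: read a boolean flag, falling back to a default
    // when the key is absent.
    public static boolean readFlag(Properties conf, String key, boolean defaultValue) {
        return Boolean.parseBoolean(conf.getProperty(key, Boolean.toString(defaultValue)));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // The writer tries to switch blocking checks off, but under the "http." prefix.
        conf.setProperty(HTTP_CHECK_BLOCKING, "false");

        // The reader asks for the "protocol." key, finds nothing, and uses
        // its default (assumed to be true here), so the setting is ignored.
        System.out.println(readFlag(conf, CHECK_BLOCKING, true));
    }
}
```

Because the two sides disagree on the prefix, the value written by one is never seen by the other, and the reader silently falls back to whatever its default is.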

Re: Fetcher2's delay between successive requests

2007-04-24 Thread Andrzej Bialecki
Doğacan Güney wrote: Hi all, I have been working on Fetcher2 code lately and I came across this particular code (in FetchItemQueue.getFetchItem) that I didn't quite understand: public FetchItem getFetchItem() { ... long last = endTime.get() + (maxThreads > 1 ? crawlDelay : minCrawlDelay);

Re: Fetcher2's delay between successive requests

2007-04-24 Thread Andrzej Bialecki
Doğacan Güney wrote: I have discovered another bug in Fetcher2. Plugin lib-http checks Protocol.CHECK_{BLOCKING,ROBOTS} (which resolve to strings protocol.plugin.check.{blocking,robots}) to see if it should handle blocking or not. But fetcher2 sets http.plugin.check.{blocking,robots} (notice

Re: Fetcher2's delay between successive requests

2007-04-24 Thread Andrzej Bialecki
Doğacan Güney wrote: I don't get it. The code seems to do exactly the opposite of what you are saying. If maxThreads == 1 then maxThreads > 1 is false, thus the expression evaluates to minCrawlDelay, not crawlDelay. Shouldn't the expression be (maxThreads > 1 ? minCrawlDelay : crawlDelay)? Yep,
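
The point argued in this exchange can be checked with a short sketch that evaluates both forms of the ternary side by side. It is not taken from Fetcher2: the class name, the method names, and the delay values are hypothetical, and only the two expressions themselves come from the thread.

```java
public class CrawlDelaySketch {
    // The expression as quoted from FetchItemQueue.getFetchItem:
    // (maxThreads > 1 ? crawlDelay : minCrawlDelay)
    public static long delayAsQuoted(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? crawlDelay : minCrawlDelay;
    }

    // The swapped form proposed in the thread:
    // (maxThreads > 1 ? minCrawlDelay : crawlDelay)
    public static long delaySwapped(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? minCrawlDelay : crawlDelay;
    }

    public static void main(String[] args) {
        long crawlDelay = 5000L;  // hypothetical value, in milliseconds
        long minCrawlDelay = 0L;  // hypothetical value, in milliseconds

        // With a single thread, maxThreads > 1 is false, so the quoted form
        // picks minCrawlDelay and the swapped form picks crawlDelay.
        System.out.println(delayAsQuoted(1, crawlDelay, minCrawlDelay)); // prints 0
        System.out.println(delaySwapped(1, crawlDelay, minCrawlDelay));  // prints 5000
    }
}
```

With one thread per queue, the quoted form selects minCrawlDelay, which is the behaviour the poster describes as the opposite of the stated intent.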

Re: Fetcher2's delay between successive requests

2007-04-24 Thread Doğacan Güney
On 4/24/07, Andrzej Bialecki [EMAIL PROTECTED] wrote: Doğacan Güney wrote: I don't get it. The code seems to do exactly the opposite of what you are saying. If maxThreads == 1 then maxThreads > 1 is false, thus the expression evaluates to minCrawlDelay, not crawlDelay. Shouldn't the expression

Re: Fetcher2

2007-01-25 Thread kauu
please give us the url,thx On 1/25/07, chee wu [EMAIL PROTECTED] wrote: Just appended the portion for .81 to NUTCH-339 - Original Message - From: Armel T. Nene [EMAIL PROTECTED] To: nutch-dev@lucene.apache.org Sent: Thursday, January 25, 2007 8:06 AM Subject: RE: Fetcher2 Chee

RE: Fetcher2

2007-01-25 Thread Armel T. Nene
:[EMAIL PROTECTED] Sent: 25 January 2007 09:31 To: nutch-dev@lucene.apache.org Subject: Re: Fetcher2 please give us the url,thx On 1/25/07, chee wu [EMAIL PROTECTED] wrote: Just appended the portion for .81 to NUTCH-339 - Original Message - From: Armel T. Nene [EMAIL PROTECTED

RE: Fetcher2

2007-01-24 Thread Armel T. Nene
] Sent: 24 January 2007 03:59 To: nutch-dev@lucene.apache.org Subject: Re: Fetcher2 Thanks! I successfully ported Fetcher2 to Nutch.81, it's pretty easy... I can share the code, if anyone wants to use it. - Original Message - From: Andrzej Bialecki [EMAIL PROTECTED] To: nutch-dev

Re: Fetcher2

2007-01-24 Thread chee wu
Just appended the portion for .81 to NUTCH-339 - Original Message - From: Armel T. Nene [EMAIL PROTECTED] To: nutch-dev@lucene.apache.org Sent: Thursday, January 25, 2007 8:06 AM Subject: RE: Fetcher2 Chee, Can you make the code available through Jira. Thanks, Armel

Re: Fetcher2

2007-01-23 Thread chee wu
Thanks! I successfully ported Fetcher2 to Nutch.81, it's pretty easy... I can share the code, if anyone wants to use it. - Original Message - From: Andrzej Bialecki [EMAIL PROTECTED] To: nutch-dev@lucene.apache.org Sent: Tuesday, January 23, 2007 12:09 AM Subject: Re: Fetcher2 chee wu

Re: Fetcher2

2007-01-22 Thread chee wu
Fetcher2 should be a great help for me, but it seems it can't integrate with Nutch81. Any advice on how to use it based on .81? - Original Message - From: Andrzej Bialecki [EMAIL PROTECTED] To: nutch-dev@lucene.apache.org Sent: Thursday, January 18, 2007 5:18 AM Subject: Fetcher2 Hi all,

Re: Fetcher2

2007-01-22 Thread Andrzej Bialecki
chee wu wrote: Fetcher2 should be a great help for me, but it seems it can't integrate with Nutch81. Any advice on how to use it based on .81? You would have to port it to Nutch 0.8.1 - e.g. change all Text occurrences to UTF8, and most likely make other changes too ... -- Best regards, Andrzej