Sandeep Tata wrote:
Hi Folks,
I was just wondering what computation really happens in the reduce
phase for Fetcher2?
If Fetcher was running in the parsing mode, then in the reduce phase
Outlinks are separated from Parse output and stored in crawl_parse, and
other data in parse_text and
I have discovered another bug in Fetcher2. Plugin lib-http checks
Protocol.CHECK_{BLOCKING,ROBOTS} (which resolve to the strings
protocol.plugin.check.{blocking,robots}) to see if it should handle
blocking or not.
But Fetcher2 sets http.plugin.check.{blocking,robots} (notice the
protocol/http
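A minimal sketch of why such a key mismatch goes unnoticed: a value written under one name is never seen by a reader that looks up the other name, so the reader silently falls back to its default. Only the two key strings come from the message above. The class name, the use of java.util.Properties in place of the real configuration object, and the default value of true are assumptions made for illustration.

```java
import java.util.Properties;

class PluginCheckKeyDemo {
    // Key that lib-http reads, per the message above (Protocol.CHECK_BLOCKING).
    static final String READ_KEY = "protocol.plugin.check.blocking";
    // Key that Fetcher2 sets, per the message above.
    static final String WRITTEN_KEY = "http.plugin.check.blocking";

    // Hypothetical stand-in for the plugin's lookup: returns the default
    // when READ_KEY is absent from the configuration.
    static boolean checkBlocking(Properties conf, boolean defaultValue) {
        String raw = conf.getProperty(READ_KEY, Boolean.toString(defaultValue));
        return Boolean.parseBoolean(raw);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();

        // Writing under the wrong key has no effect on the reader.
        conf.setProperty(WRITTEN_KEY, "false");
        System.out.println(checkBlocking(conf, true)); // prints true

        // Writing under the key the reader uses takes effect.
        conf.setProperty(READ_KEY, "false");
        System.out.println(checkBlocking(conf, true)); // prints false
    }
}
```

Defining the key once as a shared constant and using it on both the writing and the reading side would avoid this kind of silent mismatch.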
Doğacan Güney wrote:
Hi all,
I have been working on Fetcher2 code lately and I came across this
particular code (in FetchItemQueue.getFetchItem) that I didn't quite
understand:
public FetchItem getFetchItem() {
...
long last = endTime.get() + (maxThreads > 1 ? crawlDelay : minCrawlDelay);
Doğacan Güney wrote:
I have discovered another bug in Fetcher2. Plugin lib-http checks
Protocol.CHECK_{BLOCKING,ROBOTS} (which resolve to the strings
protocol.plugin.check.{blocking,robots}) to see if it should handle
blocking or not.
But Fetcher2 sets http.plugin.check.{blocking,robots} (notice
Doğacan Güney wrote:
I don't get it. The code seems to do exactly the opposite of what you
are saying. If maxThreads == 1, then maxThreads > 1 is false, so the
expression evaluates to minCrawlDelay, not crawlDelay. Shouldn't the
expression be (maxThreads > 1 ? minCrawlDelay : crawlDelay)?
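To make the point concrete, here is a minimal sketch that evaluates both forms of the expression. It assumes the comparison in the quoted code is maxThreads > 1. The class name, the method names, and the delay values are hypothetical and chosen only for illustration; only the two ternary expressions are taken from the thread.

```java
class CrawlDelayDemo {
    // Expression as quoted from FetchItemQueue in the thread.
    static long originalDelay(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? crawlDelay : minCrawlDelay;
    }

    // Expression proposed in the thread, with the two branches swapped.
    static long proposedDelay(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? minCrawlDelay : crawlDelay;
    }

    public static void main(String[] args) {
        long crawlDelay = 5000L;   // hypothetical value, in milliseconds
        long minCrawlDelay = 0L;   // hypothetical value, in milliseconds

        // Compare both expressions for one thread and for two threads.
        for (int maxThreads = 1; maxThreads <= 2; maxThreads++) {
            System.out.println("maxThreads=" + maxThreads
                + " original=" + originalDelay(maxThreads, crawlDelay, minCrawlDelay)
                + " proposed=" + proposedDelay(maxThreads, crawlDelay, minCrawlDelay));
        }
    }
}
```

With a single thread, the quoted expression selects minCrawlDelay and the proposed one selects crawlDelay, which is the discrepancy raised above.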
Yep,
On 4/24/07, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Doğacan Güney wrote:
I don't get it. The code seems to do exactly the opposite of what you
are saying. If maxThreads == 1, then maxThreads > 1 is false, so the
expression evaluates to minCrawlDelay not crawlDelay. Shouldn't the
expression
Please give us the URL, thx.
On 1/25/07, chee wu [EMAIL PROTECTED] wrote:
Just appended the portion for .81 to NUTCH-339
- Original Message -
From: Armel T. Nene [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Thursday, January 25, 2007 8:06 AM
Subject: RE: Fetcher2
Chee
:[EMAIL PROTECTED]
Sent: 25 January 2007 09:31
To: nutch-dev@lucene.apache.org
Subject: Re: Fetcher2
Please give us the URL, thx.
On 1/25/07, chee wu [EMAIL PROTECTED] wrote:
Just appended the portion for .81 to NUTCH-339
- Original Message -
From: Armel T. Nene [EMAIL PROTECTED]
Sent: 24 January 2007 03:59
To: nutch-dev@lucene.apache.org
Subject: Re: Fetcher2
Thanks! I successfully ported Fetcher2 to Nutch 0.8.1; it's pretty easy... I
can share the code, if anyone wants to use it.
- Original Message -
From: Andrzej Bialecki [EMAIL PROTECTED]
To: nutch-dev
Just appended the portion for .81 to NUTCH-339
- Original Message -
From: Armel T. Nene [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Thursday, January 25, 2007 8:06 AM
Subject: RE: Fetcher2
Chee,
Can you make the code available through Jira?
Thanks,
Armel
Thanks! I successfully ported Fetcher2 to Nutch 0.8.1; it's pretty easy... I can
share the code, if anyone wants to use it.
- Original Message -
From: Andrzej Bialecki [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Tuesday, January 23, 2007 12:09 AM
Subject: Re: Fetcher2
chee wu
Fetcher2 should be a great help for me, but it seems it can't integrate with Nutch 0.8.1.
Any advice on how to use it with 0.8.1?
- Original Message -
From: Andrzej Bialecki [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Thursday, January 18, 2007 5:18 AM
Subject: Fetcher2
Hi all,
chee wu wrote:
Fetcher2 should be a great help for me, but it seems it can't integrate with Nutch 0.8.1.
Any advice on how to use it with 0.8.1?
You would have to port it to Nutch 0.8.1 - e.g. change all Text
occurrences to UTF8, and most likely make other changes too ...
--
Best regards,
Andrzej