date:20061122

Re: [jira] Commented: (NUTCH-395) Increase fetching speed

2006-11-22 Thread AJ Chen

I checked out the code from trunk after Sami committed the change. I started a new crawl db and ran several crawl cycles sequentially on one Linux server. See below for the real numbers from my test. The performance is still poor because the crawler still spends too much time in reduce and

Re: [jira] Commented: (NUTCH-395) Increase fetching speed

2006-11-22 Thread Sami Siren

What kind of hardware are you running on? Your pages-per-second rate seems very low to me. How big was your crawldb when you started, and how big was it at the end? What kind of filters and normalizers are you using? -- Sami Siren AJ Chen wrote: I checked out the code from trunk after Sami

Question on adaptive re-fetch plugin

2006-11-22 Thread Scott Green

Hi, NUTCH-61 (http://issues.apache.org/jira/browse/NUTCH-61) is about the adaptive re-fetch plugin, and Jerome Charron commented: "Why not make FetchSchedule a new ExtensionPoint, with DefaultFetchSchedule and AdaptiveFetchSchedule as fetch schedule plugins?" I am for it. Maintaining

Re: What's the status of Nutch-GUI?

2006-11-22 Thread Zaheed Haque

Scott: Would you be kind enough to upload your Nutch-GUI patch that works with the current trunk? I would like to give it a try. Regards On 11/22/06, scott green [EMAIL PROTECTED] wrote: On 11/22/06, Sami Siren [EMAIL PROTECTED] wrote: scott green wrote: Hi I am now port Stefan to my