Hi all,
I am currently working with Nutch 0.7.1.
I want to start using mapred. Any idea where I can find the latest
version?
BTW, I looked at the path:
http://svn.apache.org/repos/asf/lucene/nutch/branches/
but the only directory that exists there is branch-0.7/
Thanks,
Raffi
mapred is now trunk...
On 19.12.2005 at 18:46, Rafi Iz wrote:
Hi all,
I am currently working with Nutch 0.7.1.
I want to start using mapred. Any idea where I can find the
latest version?
BTW, I looked at the path:
http://svn.apache.org/repos/asf/lucene/nutch/branches/
but the only
Stefan Groschupf wrote:
Anyway, today we noticed that when fetching with http-client, the sum of
errors and fetched pages is much less than the size defined when
generating the segment.
Changing to protocol-http solves the problem.
Has anyone else noticed this behavior?
I haven't, but this
OK I will do that tomorrow!
However, in case it is known to be buggy, maybe we should not set it up as
the default HTTP protocol plugin, as it is today.
Newbies checking out Nutch will use the version that does not fetch
all pages, since most people start with the standard configuration.
On 19.12.2005
The same problem on FreeBSD 6.0 + jdk1.4.2
I think it was also reported some time ago by Rod Taylor.
Switch to protocol-http.
SG Hi there,
SG is there someone out there that can confirm a problem we discovered?
SG We were wondering why not all pages of a generated segment were
SG fetched.
Stefan Groschupf wrote:
OK I will do that tomorrow!
However, in case it is known to be buggy, maybe we should not set it up as
the default HTTP protocol plugin, as it is today.
Newbies checking out Nutch will use the version that does not fetch
all pages, since most people start with the standard
+1 - especially for the amount of support Stefan gives to Nutch users.
P.
Andrzej Bialecki wrote:
Hi,
During the past year and more Stefan participated actively in the
development, and contributed many high-quality patches. He's been
spending considerable effort on addressing many issues in JIRA,
Thanks for the fast response,
Do you know where I can find a compressed version?
Thanks,
Rafi
From: Stefan Groschupf [EMAIL PROTECTED]
Reply-To: nutch-dev@lucene.apache.org
To: nutch-dev@lucene.apache.org
Subject: Re: Latest version of Mapred
Date: Mon, 19 Dec 2005 19:00:29 +0100
mapred is
Thanks for the fast response,
Do you know where I can find a compressed version?
Here are the nightly builds:
http://cvs.apache.org/dist/lucene/nutch/nightly/
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/
I tried moving Tomcat to a separate machine and bingo.
The performance went up by 30%. Right now I only have two machines
with 900K URLs each that act as Nutch servers and one machine that hosts the
Tomcat.
At this time I don't suspect any more that Tomcat is synchronously
By the way, is there an easy way to split the index I already
have?
I would hate to recrawl all of the 1.9MM URLs again and waste
bandwidth.
Well, I do not know of any tool, shipped with Nutch or otherwise,
that does this; maybe there is one.
But to write a java class that creates two
Check the following command:
FetchListTool (-local | -ndfs namenode:port) db segment_dir
[-refetchonly] [-topN N] [-cutoff cutoffscore] [-numFetchers numFetchers]
[-adddays numDays]
This command calls a function called emitMultipleLists, which emits
several fetchlists, so that you can fetch
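
[Editor's sketch] To illustrate the general idea of emitting several fetchlists from one set of URLs, here is a minimal, self-contained Java example. It is not the FetchListTool code and does not use any Nutch API: the class FetchListSplitter, its split method, and the round-robin distribution are all hypothetical, chosen only to show how one list can be divided into a fixed number of smaller lists that separate fetchers could then work through. How emitMultipleLists actually assigns URLs to lists is not described in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Nutch.
public class FetchListSplitter {

    /**
     * Distributes the given URLs round-robin over numLists lists.
     * The first URL goes to list 0, the second to list 1, and so on,
     * wrapping around after the last list.
     */
    public static List<List<String>> split(List<String> urls, int numLists) {
        if (numLists < 1) {
            throw new IllegalArgumentException("numLists must be >= 1");
        }
        List<List<String>> lists = new ArrayList<List<String>>();
        for (int i = 0; i < numLists; i++) {
            lists.add(new ArrayList<String>());
        }
        for (int i = 0; i < urls.size(); i++) {
            lists.get(i % numLists).add(urls.get(i));
        }
        return lists;
    }

    public static void main(String[] args) {
        // Made-up example URLs.
        List<String> urls = Arrays.asList(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.com/c",
                "http://example.com/d",
                "http://example.com/e");
        List<List<String>> fetchLists = split(urls, 2);
        for (int i = 0; i < fetchLists.size(); i++) {
            System.out.println("fetchlist " + i + ": " + fetchLists.get(i));
        }
    }
}
```

With two lists, the five example URLs end up as three in the first list and two in the second. A real crawler would more likely group URLs by host so that one fetcher handles each site; that refinement is left out here to keep the sketch short.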
I have the book so I'll check what I can do with the API.
Thanks Stefan,
Ledio
-----Original Message-----
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: Monday, December 19, 2005 3:38 PM
To: nutch-dev@lucene.apache.org
Subject: Re: [Nutch-dev] distributed search
By the way, is there