Re: Problems with MapRed-

2006-02-04 Thread Rafit Izhak_Ratzin
the problem. Anyway, as you mention, removing the jr is not enough; I still have the same problem. Thanks again, Rafit From: Mike Smith [EMAIL PROTECTED] Reply-To: nutch-user@lucene.apache.org To: nutch-user@lucene.apache.org Subject: Re: Problems with MapRed- Date: Wed, 1 Feb 2006 17:52:17

Re: Problems with MapRed-

2006-02-01 Thread Andrzej Bialecki
Mike Smith wrote: I finally found out why this problem happens: there must be a problem with the JS parser. Because I used this: <name>plugin.includes</name> <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value> instead of the default one which has JS in

Re: Problems with MapRed-

2006-02-01 Thread Mike Smith
Hi Andrzej, I repeated the crawl with the JS parser plugged in and the problem happened again, but with the JS parser removed everything goes smoothly. I am using a single machine and everything is running locally, but using ndfs. Have you tried that URL to see if you can crawl it to depth 2? in the tasktracker

Re: Problems with MapRed-

2006-01-31 Thread Mike Smith
@lucene.apache.org Subject: Re: Problems with MapRed- Date: Sun, 29 Jan 2006 16:42:15 -0800 This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3 and it seems like the namenode has only 2 in its list so it logs

Re: Problems with MapRed-

2006-01-30 Thread Mike Smith
. Thanks, Rafit From: Ken Krugler [EMAIL PROTECTED] Reply-To: nutch-user@lucene.apache.org To: nutch-user@lucene.apache.org Subject: Re: Problems with MapRed- Date: Sun, 29 Jan 2006 16:42:15 -0800 This looks like the namenode has lost connection to one of the datanodes. The default

Re: Problems with MapRed-

2006-01-29 Thread Stefan Groschupf
Sounds like your tasktracker wasn't able to connect to your jobtracker any more. Are you sure the jobtracker is still running and the tasktracker can still reach the jobtracker box under the same hostname? On 28.01.2006 at 21:21, Rafit Izhak_Ratzin wrote: Hi, I ran the mapreduce starting with 10

Re: Problems with MapRed-

2006-01-29 Thread Mike Smith
I have the same problem, and it is killing me. I have tried all sorts of configurations and tricks. I have 3 machines; all three are datanodes and 1 is the jobtracker. It successfully fetches 300,000 pages, but when I try to fetch more than that by injecting a larger number of pages at the first

Re: Problems with MapRed-

2006-01-29 Thread Mike Smith
I forgot to mention the namenode log file gives me thousands of these: 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 Thanks, Mike On 1/29/06,

Re: Problems with MapRed-

2006-01-29 Thread Stefan Groschupf
On 30.01.2006 at 00:50, Mike Smith wrote: I have the same problem, and it is killing me. I have tried all sorts of configurations and tricks. I have 3 machines; all three are datanodes and 1 is the jobtracker. It 3 tasktrackers, 1 jobtracker, 3 datanodes and 1 namenode, right?

Re: Problems with MapRed-

2006-01-29 Thread Ken Krugler
Hi Mike, I forgot to mention the namenode log file gives me thousands of these: 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 From our

Re: Problems with MapRed-

2006-01-29 Thread Dominik Friedrich
This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3, and it seems the namenode has only 2 in its list, so it logs this warning. As Stefan suggested, you should check the disk space on your machines. If I recall correctly

Re: Problems with MapRed-

2006-01-29 Thread Ken Krugler
This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3, and it seems the namenode has only 2 in its list, so it logs this warning. As Stefan suggested, you should check the disk space on your machines. If I recall correctly