Re: Problems with MapRed-

2006-02-04 Thread Rafit Izhak_Ratzin
make the problem. Anyway, as you mention, removing the jr is not enough; I still have the same problem. Thanks again, Rafit From: Mike Smith <[EMAIL PROTECTED]> Reply-To: nutch-user@lucene.apache.org To: nutch-user@lucene.apache.org Subject: Re: Problems with MapRed- Date: Wed, 1 Feb 20

Re: Problems with MapRed-

2006-02-01 Thread Mike Smith
Hi Andrzej, I repeated the crawl with the JS parser plugged in and the problem happened again, but with the JS parser removed everything goes smoothly. I am using a single machine and everything is running locally, but using ndfs. Have you tried that URL to see if you can crawl it to depth 2? in the tasktracker l

Re: Problems with MapRed-

2006-02-01 Thread Andrzej Bialecki
Mike Smith wrote: I finally found out why this problem happens: there must be a problem with the JS parser. I used this: plugin.includes protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url) instead of the default one, which has JS in it, and I could fetch h

Re: Problems with MapRed-

2006-01-31 Thread Mike Smith
I have a group of 5000 that > > > when I > > > Run them have this problem, I am continuing with this problem till > > > I'll find > > > the smallest group that has this problem and let you know about the > > > seed. > > > > > > T

Re: Problems with MapRed-

2006-01-31 Thread Mike Smith
that has this problem and let you know about the > > seed. > > > > Thanks, > > Rafit > > > > > > > > >From: Ken Krugler < [EMAIL PROTECTED]> > > >Reply-To: nutch-user@lucene.apache.org > > >To: nutch-user@lucene.apache.org >

Re: Problems with MapRed-

2006-01-30 Thread Mike Smith
the smallest group that has this problem and let you know about the seed. > > Thanks, > Rafit > > > > >From: Ken Krugler <[EMAIL PROTECTED]> > >Reply-To: nutch-user@lucene.apache.org > >To: nutch-user@lucene.apache.org > >Subject: Re: Problems with Ma

Re: Problems with MapRed-

2006-01-30 Thread Rafit Izhak_Ratzin
ED]> Reply-To: nutch-user@lucene.apache.org To: nutch-user@lucene.apache.org Subject: Re: Problems with MapRed- Date: Sun, 29 Jan 2006 16:42:15 -0800 This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3 and it seems like the nam

Re: Problems with MapRed-

2006-01-29 Thread Ken Krugler
This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3 and it seems like the namenode has only 2 in its list so it logs this warning. As Stefan suggested, you should check the diskspace on your machines. If I recall correctly da

Re: Problems with MapRed-

2006-01-29 Thread Dominik Friedrich
This looks like the namenode has lost connection to one of the datanodes. The default number of replications in ndfs is 3 and it seems like the namenode has only 2 in its list so it logs this warning. As Stefan suggested, you should check the diskspace on your machines. If I recall correctly da

Re: Problems with MapRed-

2006-01-29 Thread Ken Krugler
Hi Mike, I forgot to mention the namenode log file gives me thousands of these: 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 From our experien

Re: Problems with MapRed-

2006-01-29 Thread Stefan Groschupf
Maybe the HDDs are full? Try: bin/nutch ndfs -report Nutch generates some temporary data during processing. On 30.01.2006 at 00:54, Mike Smith wrote: I forgot to mention the namenode log file gives me thousands of these: 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets

Re: Problems with MapRed-

2006-01-29 Thread Stefan Groschupf
On 30.01.2006 at 00:50, Mike Smith wrote: I do have the same problem and this problem is killing me. I have tried all sorts of configurations and tricks. I have 3 machines, all three are datanodes and 1 is jobtracker. So that is 3 tasktrackers, 1 jobtracker, 3 datanodes and 1 namenode, right? successfu

Re: Problems with MapRed-

2006-01-29 Thread Mike Smith
I forgot to mention the namenode log file gives me thousands of these: 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 060129 13 Zero targets found, forbidden1.size=2allowSameHostTargets=false forbidden2.size()=0 Thanks, Mike On 1/29/06, Mik

Re: Problems with MapRed-

2006-01-29 Thread Mike Smith
I do have the same problem and this problem is killing me. I have tried all sorts of configurations and tricks. I have 3 machines, all three are datanodes and 1 is jobtracker. It successfully fetches 300,000 pages, but when I try to fetch more than that by injecting a larger number of pages at the first cy

Re: Problems with MapRed-

2006-01-29 Thread Stefan Groschupf
Sounds like your tasktracker wasn't able to connect to your jobtracker anymore. Are you sure the jobtracker is still running and the tasktracker can still access the jobtracker box under the same hostname? On 28.01.2006 at 21:21, Rafit Izhak_Ratzin wrote: Hi, I ran the mapreduce starting with 10 U

Problems with MapRed-

2006-01-28 Thread Rafit Izhak_Ratzin
Hi, I ran the mapreduce starting with 10 URLs into the sixth cycle, where it fetched 400K pages and everything was fine. 060127 001055 TOTAL urls: 1877326 060127 001055 avg score:1.099 060127 001055 max score:1666.305 060127 001055 min score:1.0 060127 001055 retry