Re: fetcher.threads.per.host bug in 0.7.1?
Is there a bug in 0.7.1 that causes the fetcher.threads.per.host setting to be ignored? Why do you think it's getting ignored? Is it because of the "Exceeded http.max.delays" errors below?

These show up when the fetcher.threads.per.host limit causes a thread to delay and then loop, because another thread is already accessing a page from the same host. When a thread has looped more than http.max.delays times, it triggers that error. So it's actually a sign that fetcher.threads.per.host is being used, not ignored.

It looks like you're going after a bunch of pages from the same domain (fas.org), which means you're going to get a bunch of these errors even with just three threads.

-- Ken

[snip]

<property>
  <name>fetcher.threads.per.host</name>
  <value>1</value>
  <description>This number is the maximum number of threads that should be allowed to access a host at one time.</description>
</property>

Fetch Log:

060109 202235 fetching http://www.fas.org/irp/news/1998/06/prs_rel21.html
060109 202250 fetch of http://www.fas.org/irp/news/1998/04/t04141998_t0414asd-3.html failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202250 fetch of http://www.fas.org/asmp/campaigns/smallarms/sawgconf.PDF failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202250 fetching http://www.fas.org/irp/commission/testhaas.htm
060109 202250 fetching http://www.fas.org/asmp/profiles/bahrain.htm
060109 202250 fetching http://www.fas.org/irp/cia/product/dci_speech_03082001.html
060109 202306 fetching http://www.fas.org/irp/news/1998/06/980609-drug10.htm
060109 202321 fetch of http://www.fas.org/irp/commission/testhaas.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202321 fetch of http://www.fas.org/asmp/profiles/bahrain.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202321 fetching http://www.fas.org/irp/news/1998/04/980422-terror2.htm
060109 202321 fetching http://www.fas.org/irp//congress/2004_cr/index.html
060109 202321 fetching http://www.fas.org/irp//congress/2001_rpt/index.html
060109 202338 fetching http://www.fas.org/irp/budget/fy98_navy/0601152n.htm
060109 202354 fetching http://www.fas.org/irp/dia/product/cent21strat.htm
060109 202408 fetch of http://www.fas.org/irp/news/1998/04/980422-terror2.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202408 fetch of http://www.fas.org/irp//congress/2004_cr/index.html failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202408 fetching http://www.fas.org/faspir/2001/v54n2/qna.htm
060109 202408 fetching http://www.fas.org/graphics/predator/index.htm
060109 202409 fetching http://www.fas.org/irp/doddir/dod/5200-1r/chapter_6.htm
060109 202425 fetching http://www.fas.org/irp//congress/1995_hr/140.htm

--
Ken Krugler
Krugle, Inc.
+1 530-470-9200
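[Editor's note: if the goal is to keep crawling a single host politely without hitting these RetryLater failures, one option is to tune the politeness settings in nutch-site.xml. A minimal sketch follows; both property names appear earlier in this thread, but the values are purely illustrative, not recommendations.]

```xml
<!-- Illustrative tuning for a crawl dominated by one host (values are examples only). -->
<property>
  <name>http.max.delays</name>
  <value>100</value>
  <description>How many times a fetcher thread may delay and loop on a busy
  host before the fetch fails with "Exceeded http.max.delays: retry later."</description>
</property>
<property>
  <name>fetcher.server.delay</name>
  <value>5.0</value>
  <description>Seconds the fetcher waits between successive requests to the
  same server; lowering it shortens each delay loop iteration.</description>
</property>
```

Raising http.max.delays lets threads wait out the per-host lock longer instead of failing the fetch.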
RE: http status 500?
Okay, I did that and restarted tomcat from my crawl.test directory... I still get an error when searching. Do I need to rerun the crawl?

andy

-----Original Message-----
From: Jerry Russell [mailto:[EMAIL PROTECTED]
Sent: Monday, January 09, 2006 5:59 PM
To: nutch-user@lucene.apache.org
Subject: Re: http status 500?

In the file {tomcatroot}\webapps\nutch-0.7\WEB-INF\classes\nutch-site.xml, add the following inside the configuration element:

<property>
  <name>searcher.dir</name>
  <value>/full/path/to/the/directory/containing/segments</value>
</property>

Jerry
[EMAIL PROTECTED]
http://circuitscout.com

Andy Morris wrote:
> What config file, the nutch-daemon.sh or just the nutch file? Or in the tomcat folder?
> Andy
>
> -----Original Message-----
> From: Jerry Russell [mailto:[EMAIL PROTECTED]
> Sent: Monday, January 09, 2006 5:22 PM
> To: nutch-user@lucene.apache.org
> Subject: Re: http status 500?
>
> Hi Andy,
>
> Have you added the path to the segments directory to your configuration file? If not, you need to do that, or start tomcat while that is your current directory. Is there any stack trace, or error in the catalina.out?
>
> Jerry
> [EMAIL PROTECTED]
> http://circuitscout.com
>
> Andy Morris wrote:
>> Okay, I think I got nutch working and tomcat runs. I did a crawl and it got some data, I think. When I go to the search page and do a search I get an http status 500 page:
>>
>> javax.servlet.ServletException: Not implemented
>>
>> root cause
>>
>> java.lang.Error: Not implemented
>>
>> Any ideas? Do I need to build tomcat from scratch? This is on a fedora core 2 box with tomcat 4.1.27-13 from rpm.
>>
>> andy
Re: http status 500?
In the file {tomcatroot}\webapps\nutch-0.7\WEB-INF\classes\nutch-site.xml, add the following inside the configuration element:

<property>
  <name>searcher.dir</name>
  <value>/full/path/to/the/directory/containing/segments</value>
</property>

Jerry
[EMAIL PROTECTED]
http://circuitscout.com

Andy Morris wrote:
> What config file, the nutch-daemon.sh or just the nutch file? Or in the tomcat folder?
> Andy
>
> -----Original Message-----
> From: Jerry Russell [mailto:[EMAIL PROTECTED]
> Sent: Monday, January 09, 2006 5:22 PM
> To: nutch-user@lucene.apache.org
> Subject: Re: http status 500?
>
> Hi Andy,
>
> Have you added the path to the segments directory to your configuration file? If not, you need to do that, or start tomcat while that is your current directory. Is there any stack trace, or error in the catalina.out?
>
> Jerry
> [EMAIL PROTECTED]
> http://circuitscout.com
>
> Andy Morris wrote:
>> Okay, I think I got nutch working and tomcat runs. I did a crawl and it got some data, I think. When I go to the search page and do a search I get an http status 500 page:
>>
>> javax.servlet.ServletException: Not implemented
>>
>> root cause
>>
>> java.lang.Error: Not implemented
>>
>> Any ideas? Do I need to build tomcat from scratch? This is on a fedora core 2 box with tomcat 4.1.27-13 from rpm.
>>
>> andy
RE: http status 500?
What config file, the nutch-daemon.sh or just the nutch file? Or in the tomcat folder?

Andy

-----Original Message-----
From: Jerry Russell [mailto:[EMAIL PROTECTED]
Sent: Monday, January 09, 2006 5:22 PM
To: nutch-user@lucene.apache.org
Subject: Re: http status 500?

Hi Andy,

Have you added the path to the segments directory to your configuration file? If not, you need to do that, or start tomcat while that is your current directory. Is there any stack trace, or error in the catalina.out?

Jerry
[EMAIL PROTECTED]
http://circuitscout.com

Andy Morris wrote:
> Okay, I think I got nutch working and tomcat runs. I did a crawl and it got some data, I think. When I go to the search page and do a search I get an http status 500 page:
>
> javax.servlet.ServletException: Not implemented
>
> root cause
>
> java.lang.Error: Not implemented
>
> Any ideas? Do I need to build tomcat from scratch? This is on a fedora core 2 box with tomcat 4.1.27-13 from rpm.
>
> andy
Re: http status 500?
Hi Andy,

Have you added the path to the segments directory to your configuration file? If not, you need to do that, or start tomcat while that is your current directory. Is there any stack trace, or error in the catalina.out?

Jerry
[EMAIL PROTECTED]
http://circuitscout.com

Andy Morris wrote:
> Okay, I think I got nutch working and tomcat runs. I did a crawl and it got some data, I think. When I go to the search page and do a search I get an http status 500 page:
>
> javax.servlet.ServletException: Not implemented
>
> root cause
>
> java.lang.Error: Not implemented
>
> Any ideas? Do I need to build tomcat from scratch? This is on a fedora core 2 box with tomcat 4.1.27-13 from rpm.
>
> andy
http status 500?
Okay, I think I got nutch working and tomcat runs. I did a crawl and it got some data, I think. When I go to the search page and do a search I get an http status 500 page:

javax.servlet.ServletException: Not implemented

root cause

java.lang.Error: Not implemented

Any ideas? Do I need to build tomcat from scratch? This is on a fedora core 2 box with tomcat 4.1.27-13 from rpm.

andy
No cluster results
"No cluster results" is displayed next to the search results. Is this because I turned clustering on after running the fetch and the indexing?

nutch-site.xml:

<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|msword|pdf)|index-basic|query-(basic|site|url)|clustering-carrot2</value>
  <description>Regular expression naming plugin directory names to include. Any plugin not matching this expression is excluded. In any case you need at least include the nutch-extensionpoints plugin. By default Nutch includes crawling just HTML and plain text via HTTP, and basic indexing and search plugins.</description>
</property>
Re: Search result is an empty site
Never mind, solved it. For tomcat 5, run:

export JAVA_OPTS="-Xmx128m -Xms128m"

Håvard W. Kongsgård wrote:
> No, I use 0.7.1. I have tested nutch/tomcat with 20 000 docs so I know it works. Searching using site, like "china site:www.fas.org", also works.
>
> Dominik Friedrich wrote:
>> If you use the mapred version from svn trunk you might have run into the same problem as I have. In the mapred version the searcher.dir property in nutch-default.xml is set to "crawl" and not "." anymore. If you use this version you have either to put the index and the segments dirs into a folder called crawl and start tomcat from above that folder, or change that value in the nutch-site.xml in webapps/ROOT/WEB-INF/classes of your tomcat nutch deployment.
>>
>> regards
>> Dominik
>>
>> Håvard W. Kongsgård wrote:
>>> Hi, I am running a nutch server with a db containing 20 docs. When I start tomcat and search for something the browser displays an empty site. Is this a memory problem, how do I fix it? System: 2,6 | Memory 1 GB | SUSE 9.2
Re: Multi CPU support
Teruhiko Kurosaka wrote:
> Can I use MapReduce to run Nutch on a multi CPU system?

Yes.

> I want to run the index job on two (or four) CPUs on a single system. I'm not trying to distribute the job over multiple systems. If MapReduce is the way to go, do I just specify config parameters like these:
>
> mapred.tasktracker.tasks.maximum=2
> mapred.job.tracker=localhost:9001
> mapred.reduce.tasks=2 (or 1?)
>
> and bin/start-all.sh ?

That should work. You'd probably want to set the default number of map tasks to be a multiple of the number of CPUs, and the number of reduce tasks to be exactly the number of CPUs. Don't use start-all.sh, but rather just:

bin/nutch-daemon.sh start tasktracker
bin/nutch-daemon.sh start jobtracker

> Must I use NDFS for MapReduce?

No.

Doug
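[Editor's note: Doug's advice, expressed as a hedged nutch-site.xml sketch for a single two-CPU box. The first three property names come from the question above; mapred.map.tasks is an assumption based on the same naming scheme, and all values are illustrative.]

```xml
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value> <!-- JobTracker on the local machine -->
</property>
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>2</value> <!-- one concurrent task per CPU -->
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>4</value> <!-- a multiple of the number of CPUs -->
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value> <!-- exactly the number of CPUs -->
</property>
```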
Multi CPU support
Can I use MapReduce to run Nutch on a multi CPU system? I want to run the index job on two (or four) CPUs on a single system. I'm not trying to distribute the job over multiple systems. If MapReduce is the way to go, do I just specify config parameters like these:

mapred.tasktracker.tasks.maximum=2
mapred.job.tracker=localhost:9001
mapred.reduce.tasks=2 (or 1?)

and bin/start-all.sh ?

Must I use NDFS for MapReduce? Do I need to do anything else to make sure that the two processes run on different CPUs? Is this the only way to take advantage of a multi CPU system?

-kuro
Creating Multiple Nutch Beans for Searching
Will performance be drastically reduced if I create a NutchBean for each and every user? Can someone shed light on this issue?
Re: Is any one able to successfully run Distributed Crawl?
Pushpesh Kr. Rajwanshi wrote:
> Just wanted to confirm: this distributed crawl you did, was it using nutch version 0.7.1 or some other version? And was that a successful distributed crawl using map reduce, or some workaround for distributed crawl?

No, this is 0.8-dev. This was done in early December using the version of Nutch then in the mapred branch. This version has since been merged into the trunk and will eventually be released as 0.8. I believe everything in my previous message is still relevant to the current trunk.

Doug
Re: Full Range of Results Not Showing
Neal Whitley wrote:
> I have Nutch 0.7 up and running, however when I search there are a number of times where Nutch finds more total matching pages than it returns on a search. Example: On a search Nutch finds 81 matching pages but only returns 46 in a result set.
>
> Hits *46-46* (out of about 81 total matching pages):

Nutch, like Yahoo! and Google, only shows two hits from a site. Are there "more from site" links with some hits? There should be. Is there a "show all hits" link at the bottom of the last page? There should be.

Doug
Fetching only the pages in an urlfile
Hi,

How, or where, can I specify to the fetcher to only fetch content/pages of the urls in the specified urlfile? I.e., I want to avoid unnecessary fetching of extra content. I want to avoid the following (dumplinks output using readdb):

"from http://www.eurekalert.org/pub_releases/2005-12/uopm-mut120805.php to http://www.upmc.edu/"

(I don't want to fetch the "http://www.upmc.edu/" url, but only "http://www.eurekalert.org/pub_releases/2005-12/uopm-mut120805.php")

Thanks in advance for any help!

Vish
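[Editor's note: one common approach, sketched here rather than taken from a reply in this thread, is to tighten the URL filter so only the seeded hosts pass, and to fetch at depth 1 so discovered links are never scheduled. The file name and patterns below are assumptions based on Nutch's stock regex URL filter; the eurekalert.org host is taken from the question above.]

```
# Hypothetical crawl-urlfilter.txt / regex-urlfilter.txt entries.
# '+' accepts a matching URL, '-' rejects; the final '-.' rejects everything else.
+^http://www\.eurekalert\.org/
-.
```

With this in place, links pointing off the accepted host (such as http://www.upmc.edu/) are filtered out before they are ever fetched.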
Re: Appropriate MapReduce Hardware
Chris Schneider wrote:
> 2) The TaskTracker nodes should probably also be DataNodes in such a relatively small system. No significant data is saved on the TaskTracker machine, except in its role as a DataNode.

It is actually optimal for TaskTrackers and DataNodes to both be run on all slave boxes. That way map tasks can be assigned to nodes where their input data is local, and reduce tasks can write the first copy of their output locally, reducing network i/o. (These optimizations are not in the current code, but will be soon.)

> 3) The NameNode box probably wants to keep large indexes of blocks in memory, but I wouldn't expect these to exceed the same 2GB metric we're using for the TaskTrackers. Likewise, I wouldn't expect the CPU speed to be a major constraint (mostly network bound). Finally, I can't imagine why the NameNode would need tons of disk space.
>
> 4) I would imagine that the JobTracker would have even less need for big RAM and a fast CPU, let alone hard drive space. I'd probably start with this running on the same box as the NameNode.

I typically run the NameNode and JobTracker on the same box, the master. Ideally this box might be configured differently (e.g., a RAID for higher disk reliability) but practically speaking it's fine and simpler to have it configured the same as the others. I usually run a cron entry on the NameNode box which periodically copies NDFS name data to another drive or machine with rsync, since this is a single point of failure.

> 7) Since the local network will probably be the gating performance parameter, we'll need a 1GB network.

Yes. I've benchmarked 30 & 180 node NDFS systems with 100MB networking, and the network does appear to be the bottleneck.

Doug
Re: Help on language
Would you tell me where I can get a help document on how to use NGramProfile to train the language identifier, and how to detect the language? Marathi is a language used in India. It uses the Devanagari script, and space is used as the separator. Will it be OK if I use StopAnalyzer instead of NutchDocumentAnalyzer with my custom stopwords? Where do I have to make changes in the Nutch code?
RE: Help on language
Could you tell me where Marathi is used and what script (a set of letters) is used to write it? Does Marathi use spaces to separate words? If so, I don't see much problem from the architectural point of view. You just write the analyzer plugin (not very easy for some languages, but do-able). But if it doesn't use spaces, like Japanese (also Korean and Chinese?), then you'd have a problem. Currently, the Query expression analysis assumes that words are separated by spaces for non-CJK (Chinese, Japanese and Korean) characters, and that a single CJK character forms a word, an invalid assumption. The analysis part of the Query expression is not made pluggable yet. (I'm trying to come up with some proposal.)

Oh, by the way, you'd need a dev version of Nutch to use the pluggable language analyzer. The stable version has the generic analyzer hard-coded.

-kuro

> -----Original Message-----
> From: Sameer Tamsekar [mailto:[EMAIL PROTECTED]
> Sent: 2006-1-08 2:40
> To: nutch-user@lucene.apache.org
> Subject: Help on language
>
> Hello,
>
> I am working on building a custom analyzer and language detector for the native language "Marathi". Does anybody have an idea how to extend nutch for using this language?
>
> Regards,
> Sameer
fetcher.threads.per.host bug in 0.7.1?
Is there a bug in 0.7.1 that causes the fetcher.threads.per.host setting to be ignored?

Nutch-site.xml:

<property>
  <name>fetcher.server.delay</name>
  <value>15.0</value>
  <description>The number of seconds the fetcher will delay between successive requests to the same server.</description>
</property>

<property>
  <name>fetcher.threads.fetch</name>
  <value>3</value>
  <description>The number of FetcherThreads the fetcher should use. This also determines the maximum number of requests that are made at once (each FetcherThread handles one connection).</description>
</property>

<property>
  <name>fetcher.threads.per.host</name>
  <value>1</value>
  <description>This number is the maximum number of threads that should be allowed to access a host at one time.</description>
</property>

Fetch Log:

060109 202235 fetching http://www.fas.org/irp/news/1998/06/prs_rel21.html
060109 202250 fetch of http://www.fas.org/irp/news/1998/04/t04141998_t0414asd-3.html failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202250 fetch of http://www.fas.org/asmp/campaigns/smallarms/sawgconf.PDF failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202250 fetching http://www.fas.org/irp/commission/testhaas.htm
060109 202250 fetching http://www.fas.org/asmp/profiles/bahrain.htm
060109 202250 fetching http://www.fas.org/irp/cia/product/dci_speech_03082001.html
060109 202306 fetching http://www.fas.org/irp/news/1998/06/980609-drug10.htm
060109 202321 fetch of http://www.fas.org/irp/commission/testhaas.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202321 fetch of http://www.fas.org/asmp/profiles/bahrain.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202321 fetching http://www.fas.org/irp/news/1998/04/980422-terror2.htm
060109 202321 fetching http://www.fas.org/irp//congress/2004_cr/index.html
060109 202321 fetching http://www.fas.org/irp//congress/2001_rpt/index.html
060109 202338 fetching http://www.fas.org/irp/budget/fy98_navy/0601152n.htm
060109 202354 fetching http://www.fas.org/irp/dia/product/cent21strat.htm
060109 202408 fetch of http://www.fas.org/irp/news/1998/04/980422-terror2.htm failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202408 fetch of http://www.fas.org/irp//congress/2004_cr/index.html failed with: java.lang.Exception: org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
060109 202408 fetching http://www.fas.org/faspir/2001/v54n2/qna.htm
060109 202408 fetching http://www.fas.org/graphics/predator/index.htm
060109 202409 fetching http://www.fas.org/irp/doddir/dod/5200-1r/chapter_6.htm
060109 202425 fetching http://www.fas.org/irp//congress/1995_hr/140.htm
Full Range of Results Not Showing
I have Nutch 0.7 up and running, however when I search there are a number of times where Nutch finds more total matching pages than it returns on a search. Example: On a search Nutch finds 81 matching pages but only returns 46 in a result set.

Hits 46-46 (out of about 81 total matching pages):

Why is it doing this? Or, what do I need to correct?

Thanks,
Neal
Fedora core 2 install
Okay, I think I have nutch set up properly. I have java and tomcat installed. I can run a crawl and it processes the urls in the urls file. When I go to the search site and do a search I get an error, http status 500:

description: The server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException: org.apache.nutch.clustering.OnlineClustererFactory

root cause

java.lang.NoClassDefFoundError: org.apache.nutch.clustering.OnlineClustererFactory

Is there something I am missing? I started tomcat from my crawl.test directory. Was that correct? Fedora core2 does not have the catalina.sh file to start tomcat; I installed tomcat from yum. I can get to the web site and the search site.

Andy
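[Editor's note: a NoClassDefFoundError for OnlineClustererFactory suggests the search webapp is trying to load the clustering extension point but the clustering plugin classes are not available to it. One possible fix, sketched under assumptions (this is not a reply from the thread, and the plugin list shown is illustrative), is to enable clustering-carrot2 in the plugin.includes of the nutch-site.xml deployed under the webapp's WEB-INF/classes, and make sure the clustering plugin was included when the war was built:]

```xml
<!-- Illustrative: include clustering-carrot2 so the OnlineClusterer
     extension point can be resolved at search time. -->
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|clustering-carrot2</value>
</property>
```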
Nutch freezing - deflateBytes
Our nutch installation (version 0.7, running on Mandrake linux) continues to freeze sporadically during fetching. Our developer has it pinned down to the deflateBytes library: "it looped in the native method called deflateBytes for very long time. Some times, it took several hours." That's all we've got so far. Has anyone run into a problem with this library before, or know of a quick way around the issue? Thanks.
Re: Search result is an empty site
No, I use 0.7.1. I have tested nutch/tomcat with 20 000 docs so I know it works. Searching using site, like "china site:www.fas.org", also works.

Dominik Friedrich wrote:
> If you use the mapred version from svn trunk you might have run into the same problem as I have. In the mapred version the searcher.dir property in nutch-default.xml is set to "crawl" and not "." anymore. If you use this version you have either to put the index and the segments dirs into a folder called crawl and start tomcat from above that folder, or change that value in the nutch-site.xml in webapps/ROOT/WEB-INF/classes of your tomcat nutch deployment.
>
> regards
> Dominik
>
> Håvard W. Kongsgård wrote:
>> Hi, I am running a nutch server with a db containing 20 docs. When I start tomcat and search for something the browser displays an empty site. Is this a memory problem, how do I fix it? System: 2,6 | Memory 1 GB | SUSE 9.2
Re: Search result is an empty site
If you use the mapred version from svn trunk you might have run into the same problem as I have. In the mapred version the searcher.dir property in nutch-default.xml is set to "crawl" and not "." anymore. If you use this version you have either to put the index and the segments dirs into a folder called crawl and start tomcat from above that folder, or change that value in the nutch-site.xml in webapps/ROOT/WEB-INF/classes of your tomcat nutch deployment.

regards
Dominik

Håvard W. Kongsgård wrote:
> Hi, I am running a nutch server with a db containing 20 docs. When I start tomcat and search for something the browser displays an empty site. Is this a memory problem, how do I fix it? System: 2,6 | Memory 1 GB | SUSE 9.2
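[Editor's note: Dominik's second option, as a hedged nutch-site.xml sketch. The property name comes from his message; the path is a placeholder you would replace with the directory that actually contains your index and segments dirs.]

```xml
<!-- In webapps/ROOT/WEB-INF/classes/nutch-site.xml of the tomcat deployment. -->
<property>
  <name>searcher.dir</name>
  <value>/path/to/your/crawl</value>
</property>
```

With this override in place, tomcat no longer needs to be started from a particular working directory for the searcher to find the crawl data.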
Search result is an empty site
Hi, I am running a nutch server with a db containing 20 docs. When I start tomcat and search for something the browser displays an empty site. Is this a memory problem, how do I fix it? System: 2,6 | Memory 1 GB | SUSE 9.2
Re: Help on language
> I am working on building custom analyzer

To build a custom analyzer, take a look at the analysis-de and analysis-fr plugins (they use some lucene analyzers). A specific analyzer is used depending on the language guessed by the language identifier.

> and language detector for native language ("Marathi"), does anybody have idea how to extend nutch for using this language.

Use the org.apache.nutch.analysis.lang.NGramProfile command to generate a profile of ngrams for Marathi from a textual corpus. Usage for creating a new profile is:

NGramProfile -create profilename filename encoding

Regards
Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/
Re: Help needed please !Please Ignore
On Mon, 2006-01-09 at 02:06 +0200, Gal Nitzan wrote:
> Hi,
>
> I see only one fetcher task but I have three tasktrackers.
>
> What am I missing?
>
> Thanks,
>
> G.