Problems Installing

2006-04-02 Thread Paul Stewart
Hi there... I am trying to get Nutch running. Have done a trial indexing run successfully etc... Now I'm running into issues that may be more Tomcat related than Nutch: HTTP Status 500 - type Exception

RE: Problems Installing

2006-04-02 Thread Paul Stewart
to root.war and dump that into webapps under tomcat? 3. did it install ok (can you see the exploded pages under webapps root? Just checking, this is how I fixed the same issue under windows. r/d -Original Message- From: Paul Stewart [mailto:[EMAIL PROTECTED] Sent: Sunday, April 02, 2006 11

RE: Tomcat Problem

2006-04-02 Thread Paul Stewart
: Sunday, April 02, 2006 11:51 PM To: nutch-user@lucene.apache.org Subject: RE: Tomcat Problem Hey, Check the classpath and ur JSP file. Regards Kamesh -Original Message- From: Paul Stewart [mailto:[EMAIL PROTECTED] Sent: Monday, April 03, 2006 4:25 AM To: nutch-user@lucene.apache.org Subject

RE: Tomcat Problem

2006-04-03 Thread Paul Stewart
-Original Message- From: Paul Stewart [mailto:[EMAIL PROTECTED] Sent: Monday, April 03, 2006 4:25 AM To: nutch-user@lucene.apache.org Subject: Tomcat Problem Sorry if this is slightly off-topic but I'm just trying to get Nutch running for testing... I *think* this is Tomcat related: HTTP

Nutch 500 Error

2006-04-05 Thread Paul Stewart
Hi there... I was having a number of problems with my install, mainly because I'm not used to Tomcat and/or Nutch etc... Anyways, I am running Fedora 4 and was told that the packages are a bad idea to use, so uninstalled all of my Java/Tomcat RPMs and installed new binaries today from the source

RE: Nutch 500 Error

2006-04-06 Thread Paul Stewart
Thanks.. Tried that ... Same error HTTP Status 500 - type Exception report message description The server encountered an internal error () that prevented it from fulfilling this request. exception

RE: Nutch 500 Error

2006-04-11 Thread Paul Stewart
? All the best, Paul -Original Message- From: sudhendra seshachala [mailto:[EMAIL PROTECTED] Sent: Thursday, April 06, 2006 12:02 PM To: nutch-user@lucene.apache.org Subject: RE: Nutch 500 Error It should be java -version I think. Paul Stewart [EMAIL PROTECTED] wrote: Thanks for the reply

RE: Nutch 500 Error

2006-04-12 Thread Paul Stewart
- segments - index - indexes point searcher.dir to home/nutch/crawl. Hope this helps. Thanks Sudhi Paul Stewart [EMAIL PROTECTED] wrote: Thanks I was doing the java command wrong... Back to my original problem - I re-ran through the entire tutorial to ensure I was doing it right

Hardware Planning

2007-11-28 Thread Paul Stewart
Hi folks... I have read the archives and am looking for input specific to my estimated requirements: Want to index about 100 million public webpages. Space and bandwidth are not a problem - coming up with the right hardware and keeping the cost down is my goal. I would estimate only 1-2 searches

RE: Hardware Planning

2007-11-29 Thread Paul Stewart
9:53 PM To: nutch-user@lucene.apache.org Subject: Re: Hardware Planning Have you considered EC2 + S3? Also Rightscale has some interesting solutions, which I am currently evaluating. On Nov 28, 2007 9:38 PM, Paul Stewart [EMAIL PROTECTED] wrote: Hi folks... I have read the archives

New Installation - Problems - Error 500

2008-01-29 Thread Paul Stewart
Hi folks... Just installing a new server for Nutch - testing at this point... Ran a crawl with no problems but can't do a search without getting an Error 500. CentOS 5.1, Tomcat 5.5.20, Java SDK 1.5.0_14 The last time I installed Nutch I ran into a similar issue and it had to do with a config

RE: New Installation - Problems - Error 500

2008-01-29 Thread Paul Stewart
-Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED] Sent: Tuesday, January 29, 2008 11:30 AM To: nutch-user@lucene.apache.org Subject: Re: New Installation - Problems - Error 500 Paul Stewart wrote: java.lang.NoClassDefFoundError: org.apache.hadoop.util.ReflectionUtils

RE: New Installation - Problems - Error 500

2008-01-29 Thread Paul Stewart
5:38 PM, Paul Stewart [EMAIL PROTECTED] wrote: Thanks.. my apologies as new to Java (to complicate matters). When I check in the tomcat.conf file I can't find a place to specify. When I do a search, there are multiple versions installed: /usr/bin/java /usr/share/java /usr/include/c++/4.1.1

RE: New Installation - Problems - Error 500

2008-01-29 Thread Paul Stewart
: New Installation - Problems - Error 500 Hi, On Jan 29, 2008 7:14 PM, Paul Stewart [EMAIL PROTECTED] wrote: Thanks for the reply... Java -version shows this: java version 1.4.2 gij (GNU libgcj) version 4.1.2 20070626 (Red Hat 4.1.2-14) I just had a closer look at your stacktrace and your

RE: New Installation - Problems - Error 500

2008-01-30 Thread Paul Stewart
That's wonderful - what a great list! You guys respond very quickly... Now I gotta get back to reading the docs as I'm sure most of what I just asked is already in there...;) Best! Paul -Original Message- From: John Mendenhall [mailto:[EMAIL PROTECTED] Sent: Tuesday, January 29, 2008

Stats?

2008-01-31 Thread Paul Stewart
Hi folks... Is there a way to retrieve stats from Nutch - meaning how many webpages are indexed, to be indexed etc?? When I was working with AspSeek and Mnogosearch in the past I could run a command to see stats. Thanks again, Paul

Limiting Crawl Time

2008-02-05 Thread Paul Stewart
Hi folks... What is the best way to, say, limit crawling to perhaps 3-4 hours per day? Is there a way to do this? Right now, I have a crawl depth of 6 and maximum per site of 100. I thought this would limit things pretty low but during some test crawls, my last crawl took 2.5 days to complete:

RE: Limiting Crawl Time

2008-02-06 Thread Paul Stewart
-Original Message- From: Susam Pal [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 05, 2008 10:36 PM To: nutch-user@lucene.apache.org Subject: Re: Limiting Crawl Time Did you try specifying a topN value? -depth 3 -topN 1000 should be close to what you want. On 2/6/08, Paul Stewart [EMAIL PROTECTED