Appropriate steps for mapred

2005-12-19 Thread Michael Taggart
I have followed the tutorial at media-style.com and actually have a mapred installation of nutch working. Thanks Stefan :) My question now is what the correct steps are to continuously fetch and index. I have read some people talking about mergesegs and updatedb; however, Stefan's tutorial doesn't list these

Re: MapRed searching

2005-12-16 Thread Michael Taggart
/classes/nutch-default.xml the parameter searcher.dir just /user/nutchuser. That's it. HTH Stefan On 16.12.2005 at 02:55, Michael Taggart wrote: I got mapred to complete a full index cycle. I would now like to search the index I created, except I can't find out how to do

Re: MapRed searching

2005-12-16 Thread Michael Taggart
Sorry Stefan, I am so used to typing usr that I wrote my email incorrectly. Here is exactly what is in my nutch-site.xml: <property> <name>searcher.dir</name> <value>/user/root</value> <description>Path to root of index directories. This directory is searched (in order) for either the file
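A quick way to confirm which searcher.dir value a config file actually carries is to parse it rather than read it by eye. The following is only a minimal sketch: the sample XML, its `configuration` root element, and the helper name `read_property` are assumptions made for illustration and are not taken from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the property quoted above.
# The root element name is an assumption; the lookup does not depend on it.
SAMPLE = """<configuration>
  <property>
    <name>searcher.dir</name>
    <value>/user/root</value>
    <description>Path to root of index directories.</description>
  </property>
</configuration>"""


def read_property(xml_text, wanted):
    """Return the <value> of the first <property> whose <name> equals `wanted`.

    Returns None when no such property exists.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == wanted:
            return (prop.findtext("value") or "").strip()
    return None


if __name__ == "__main__":
    print(read_property(SAMPLE, "searcher.dir"))
```

To check a real file, read its contents into a string and pass that string in place of `SAMPLE`.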

Re: MapRed searching

2005-12-16 Thread Michael Taggart
uncompress your nutch-XXX.war file into a folder called ROOT.war with unzip and change this in ROOT.war/WEB-INF/classes also. Then you can simply copy this folder into TOMCAT/webapps; that's it. On 16.12.2005 at 20:09, Michael Taggart wrote: Sorry Stefan, I am so used to typing usr that I

Re: MapRed searching

2005-12-16 Thread Michael Taggart
Should I specify that urls.txt file as /user/root/urls/urls.txt so it pulls it off the ndfs? On Fri, 2005-12-16 at 21:39 +0100, Stefan Groschupf wrote: I would like to crawl a list of domains, but I would like crawling limited to just those domains. When I first played around with nutch in

Re: MapRed searching

2005-12-16 Thread Michael Taggart
I'm also guessing that it's important for all tasktrackers to have the appropriate configuration set in their conf/nutch-site.xml, or can I just do it on the namenode? On Fri, 2005-12-16 at 12:57 -0800, Michael Taggart wrote: Should I specify that urls.txt file as /user/root/urls/urls.txt so

Re: java.net.ConnectException: Connection refused

2005-12-15 Thread Michael Taggart
Marko, Thanks for the reply. Copying that folder to my nutch installation worked! No errors here. Can't wait to unleash the power of this program. Thanks Again, Mike On Thu, 2005-12-15 at 09:42 +0100, Marko Bauhardt wrote: Hi Mike, Exception in thread main java.lang.NullPointerException

java.net.ConnectException: Connection refused

2005-12-14 Thread Michael Taggart
I've followed the steps in the media-style wiki for setting up a map reduce system. I am only having one strange error when I attempt to start the tasktrackers. Here is my output: [EMAIL PROTECTED] nutch]# bin/nutch-daemon.sh start tasktracker starting tasktracker, logging to

Re: java.net.ConnectException: Connection refused

2005-12-14 Thread Michael Taggart
, although everything seems to work fine after that, even though it gives that error, so it's probably not a problem. - Matt Zytaruk Michael Taggart wrote: I've followed the steps in the media-style wiki for setting up a map reduce system. I am only having one strange error when I

Re: java.net.ConnectException: Connection refused

2005-12-14 Thread Michael Taggart
boxA.localnetwork since the name from the outside would be something like: boxA.companyDomain.com So double check that the name boxA uses to identify itself to other boxes (host.conf) is also set up in the DNS the other boxes use. HTH Stefan On 14.12.2005 at 22:49, Michael
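The check Stefan describes can be approximated with a short script run on each box. This is a sketch under assumptions, not part of nutch: the function names are hypothetical, and it only uses Python's standard `socket` module to show what name the machine reports for itself and whether that name resolves from where the script runs.

```python
import socket


def resolve_name(name):
    """Return the IPv4 address `name` resolves to, or None if lookup fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised for unresolvable names.
        return None


def self_name_report():
    """Collect the names this machine uses for itself and what they resolve to."""
    hostname = socket.gethostname()
    fqdn = socket.getfqdn()
    return {
        "hostname": hostname,
        "fqdn": fqdn,
        "hostname_ip": resolve_name(hostname),
        "fqdn_ip": resolve_name(fqdn),
    }


if __name__ == "__main__":
    for key, value in self_name_report().items():
        print(f"{key}: {value}")
```

Running it on boxA shows the name boxA reports; calling `resolve_name` with that name on the other boxes shows whether their resolver knows it. A `None` result there would be consistent with the connection errors discussed in this thread, though it does not prove that DNS is the cause.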

Re: java.net.ConnectException: Connection refused

2005-12-14 Thread Michael Taggart
check that the name boxA uses to identify itself to other boxes (host.conf) is also set up in the DNS the other boxes use. HTH Stefan On 14.12.2005 at 22:49, Michael Taggart wrote: I've followed the steps in the media-style wiki for setting up a map reduce system. I am

Re: java.net.ConnectException: Connection refused

2005-12-14 Thread Michael Taggart
is wrong with the jobtracker. Can I have the namenode also be the jobtracker, or would this cause a conflict? Mike On Wed, 2005-12-14 at 16:36 -0800, Michael Taggart wrote: Ok, I think I have boiled the problem down. Turns out the jobtracker was actually never running on my BoxA. When I start

Calling All Nutch Experts

2005-12-12 Thread Michael Taggart
I have downloaded, installed, and successfully played around with nutch and have to say I am quite impressed with the power of this program. Basically, I would like to hire a nutch expert to help me lay out a plan on how to use nutch for the following scenario. We have about 1000 domains that we

Re: Calling All Nutch Experts

2005-12-12 Thread Michael Taggart
and legwork myself. Just need a good guru to mentor and point me in the right direction. On Mon, 2005-12-12 at 15:00 -0800, sub paul wrote: I would jump on $500 offer but I have someone paying me $250 already, for doing twice the work. On 12/12/05, Michael Taggart [EMAIL PROTECTED] wrote: I have

Re: Calling All Nutch Experts

2005-12-12 Thread Michael Taggart
Thanks Stefan, I see you actually wrote that wiki article. I am going to do my best to figure it out. I'll let the group know if I have any problems. Mike On Tue, 2005-12-13 at 02:53 +0100, Stefan Groschupf wrote: Nevertheless, the mailing lists are amongst the best I've ever seen -

Nutch Tomcat JasperException

2005-12-09 Thread Michael Taggart
Hi all, I am a total newb with nutch and tomcat, but I have followed the steps outlined in http://lucene.apache.org/nutch/tutorial.html#Getting+Started and was able to get the nutch page to show up when I go to mydomain:8080. However, my problem is when I run a search. Here is the following output

Nutch Tomcat5 org.apache.jasper.JasperException

2005-12-09 Thread Michael Taggart
Hi all, I am a total newb with nutch and tomcat, but I have followed the steps outlined in http://lucene.apache.org/nutch/tutorial.html#Getting+Started and was able to get the nutch page to show up when I go to mydomain:8080. However, my problem is when I run a search. Here is the following output