Jp Mutch wrote:
>
> My questions are regarding crawling and testing/searching:
> Due to my local requirements, initially I just need to run all of nutch
> on a single machine in its local filesystem, without really needing
> Hadoop or DFS [I don't mind if they are running "under the hood"].
> Later on if the initial study is successful, I will of course
> switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
>

Hadoop is run no matter what. It's no big deal unless there is a Hadoop bug; several have come along, but they have been fixed. Hadoop needs a tmp directory to execute jobs in the distributed fashion; I usually point mine to C:\tmp. Hadoop will also create some directories related to its filesystem. The main directories you will work with are your crawl directory and its subfolders: crawldb, linkdb, indexes, and segments.

> (Q1) What tutorial do I need to follow to get Nutch 9.12
> to crawl and index on a single machine?
> (a) The Nutch 0.8 tutorial
> http://lucene.apache.org/nutch/tutorial8.html ?
> OR
> (c) The new Hadoop tutorial
> http://wiki.apache.org/nutch/NutchHadoopTutorial ?
>

The 0.8 tutorial would work. There are some additional notes on Windows on the wiki.

> (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
> machine with Cygwin+Tomcat 5.5?
>

Yes.

> Appreciate any help.
> Thanks a lot!
>
> -jp
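If it helps after a first crawl, here is a minimal Python sketch for checking that the crawl directory contains the subfolders mentioned above (crawldb, linkdb, indexes, and segments). The function name `missing_crawl_subdirs` and the default directory name `crawl` are only illustrative assumptions, not part of Nutch, and a given Nutch version may create additional entries.

```python
from pathlib import Path

# Subfolder names as listed in the reply above (an assumption about the
# layout; a given Nutch version may create more entries than these).
EXPECTED_SUBDIRS = ("crawldb", "linkdb", "indexes", "segments")


def missing_crawl_subdirs(crawl_dir):
    """Return the expected subfolders that are absent from crawl_dir."""
    root = Path(crawl_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # "crawl" is a hypothetical default; pass your own crawl directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "crawl"
    missing = missing_crawl_subdirs(target)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all expected subfolders are present")
```

Saved under a name of your choosing and run with the crawl directory as its only argument, it prints whichever of the four subfolders are not there yet.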
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
