I wonder if the name "crawl" implies that the command is the standard command, and all you would need? After all, if I were to sit down with a "crawler", it seems very logical that "crawl" would be how you run it! I like the simplicity of crawl from a "getting started" approach. I agree, though, that I used it as a shortcut... I didn't want to learn all the lower-level concepts, I just wanted to crawl a couple of URLs and toss them into Solr. "crawl" and the example code did great!

Maybe instead of having "crawl" be a core part of running Nutch, instead it's "run-example-crawl.sh" and in the Wiki it's caveated that you should then look inside it and learn all the various steps.

Eric

On Aug 23, 2011, at 6:50 AM, Markus Jelsma wrote:

> What kind of shell script did you have in mind? The wiki already provides some
> useful scripts. The tutorials on Nutch also show commands that can be used in
> custom scripts.
>
> Is an immediate crawl-with-one-command a desired feature? Provided as Java
> code or shell script?
>
> On Tuesday 23 August 2011 10:12:57 Julien Nioche wrote:
>> +1 let's replace it with a shell script instead.
>>
>> On 22 August 2011 21:56, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>> Hi,
>>>
>>> The crawl command seems to add a lot of confusion. It hides the entire crawl
>>> cycle logic from new users, leading to questions, lack of understanding
>>> of basic Nutch concepts, unsupported switches of the jobs it executes,
>>> more problems etc. I am quite an opponent of the crawl command and would
>>> also not recommend it to anyone including new users. A running Nutch almost
>>> always requires some scripting here and there, cron jobs, locks etc.
>>>
>>> I propose (most likely a challenging statement) to deprecate the crawl
>>> command in 1.4.
>>>
>>> Users, developers, please comment.
>>>
>>> Thanks
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server