Re: The crawl command, keep or get rid of

2011-08-23 Thread Julien Nioche
+1 let's replace it with a shell script instead. On 22 August 2011 21:56, Markus Jelsma markus.jel...@openindex.io wrote: Hi, The crawl command seems to add a lot of confusion. It hides the entire crawl cycle logic from new users, leading to questions, lack of understanding of basic Nutch

Re: The crawl command, keep or get rid of

2011-08-23 Thread Markus Jelsma
What kind of shell script did you have in mind? The wiki already provides some useful scripts. The tutorials on Nutch also show commands that can be used in custom scripts. Is an immediate crawl-with-one-command a desired feature? Provided as Java code or shell script? On Tuesday 23 August

Re: The crawl command, keep or get rid of

2011-08-23 Thread Julien Nioche
> What kind of shell script did you have in mind? The wiki already provides some useful scripts. The tutorials on Nutch also show commands that can be used in custom scripts.

That's exactly my point. There are various scripts in the wiki, based on different versions of Nutch and of variable

Re: The crawl command, keep or get rid of

2011-08-23 Thread Markus Jelsma
You're right: https://issues.apache.org/jira/browse/NUTCH-1087 On Tuesday 23 August 2011 13:24:27 Julien Nioche wrote: What kind of shell script did you have in mind? The wiki already provides some useful scripts. The tutorials on Nutch also show commands that can be used in custom

Re: The crawl command, keep or get rid of

2011-08-23 Thread Radim Kolar
I agree. Nuke the crawl command.

Re: The crawl command, keep or get rid of

2011-08-23 Thread Eric Pugh
I wonder if the name crawl implies that the command is a sort of standard command, and all you would need? After all, if I were to sit down with a crawler, it seems very logical that crawl would be how you run it! I like the simplicity of crawl from a getting started approach. I agree though

The crawl command, keep or get rid of

2011-08-22 Thread Markus Jelsma
Hi, The crawl command seems to add a lot of confusion. It hides the entire crawl cycle logic from new users, leading to questions, lack of understanding of basic Nutch concepts, unsupported switches of the jobs it executes, more problems etc. I am quite an opponent of the crawl command and