So once the crawl command (which wraps the iterative crawl cycles until the
depth is reached) has finished, is there a command-line option to trigger a
recrawl so that Nutch keeps running as a daemon, or is a shell script the
way to go?

Regards | Vikas

On Fri, May 11, 2012 at 8:26 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> If you would like I could add you to the moderators group and you can
> word it how you wish.
>
> Please sign up to Jira, give me your Jira username on this page, and I
> will happily add you to the group.
>
> On the other hand, if you don't wish to do this, then please reply
> here with your suggestion and I'll make sure something gets changed to
> accommodate your suggestions.
>
> Thanks
>
> On Fri, May 11, 2012 at 2:52 PM, Matthias Paul <[email protected]>
> wrote:
> > I was confused by this tutorial:
> > http://wiki.apache.org/nutch/NutchTutorial
> > Reading this page one might get to the conclusion that the crawl tool
> > can't do iterative crawling, because under "3.2 Using Individual
> > Commands for Whole-Web Crawling" there's the sentence "This also
> > permits ... incremental crawling", as if the crawl command described
> > before (3.1 Using the Crawl Command) couldn't do that.
> >
> > Could someone perhaps improve this part of the tutorial?
> >
> > Matthias
> >
> > On Thu, May 10, 2012 at 8:39 PM, Markus Jelsma
> > <[email protected]> wrote:
> >>
> >> By default each crawl is iterative. The crawl command is nothing more
> >> than a wrapper around the individual crawl cycle commands. The depth
> >> parameter simply sets how many times a single crawl cycle is executed.
> >> This is, if I am not mistaken, also true for older releases, certainly
> >> 1.2 and above.
> >>
> >>
> >> On Thu, 10 May 2012 19:31:27 +0100, Lewis John Mcgibbney
> >> <[email protected]> wrote:
> >>>
> >>> For the record, there is a patch pending review for Nutchgora which
> >>> will sort part of this for you as well.
> >>>
> >>> https://issues.apache.org/jira/browse/NUTCH-1301
> >>>
> >>> Susam Pal also contributed a patch for Nutchgora regarding incremental
> >>> indexing, but I can't find it just now, sorry.
> >>>
> >>> Lewis
> >>>
> >>>
> >>> On Thu, May 10, 2012 at 5:18 PM, Matthias Paul
> >>> <[email protected]> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> can the crawl command also be used for iterative crawls?
> >>>> In older Nutch versions this was not possible, but in 1.5 it seems to
> >>>> work?
> >>>>
> >>>> Thanks
> >>>> Matthias
> >>
> >>
> >> --
> >> Markus Jelsma - CTO - Openindex
>
>
>
> --
> Lewis
>
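
For readers of this thread, the point that the crawl command only wraps the
individual crawl cycle commands, with depth being the number of cycles, can
be sketched roughly as below. This is an illustration, not Nutch's actual
implementation. The function name, the `run` and `newest_segment` callables,
and the default paths are hypothetical. The subcommands (inject, generate,
fetch, parse, updatedb) are the ones listed in the Nutch 1.x tutorial, and
the link inversion and indexing steps are left out for brevity.

```python
from typing import Callable, List

Command = List[str]


def crawl(
    depth: int,
    run: Callable[[Command], None],
    newest_segment: Callable[[], str],
    crawldb: str = "crawl/crawldb",    # assumed path
    segments: str = "crawl/segments",  # assumed path
    seeds: str = "urls",               # assumed seed directory
) -> None:
    """Sketch of the crawl wrapper: inject once, then repeat one cycle `depth` times."""
    # Seed the crawl database with the start URLs.
    run(["bin/nutch", "inject", crawldb, seeds])
    for _ in range(depth):
        # One crawl cycle: generate a fetch list, fetch it, parse it,
        # then fold the results back into the crawl database.
        run(["bin/nutch", "generate", crawldb, segments])
        segment = newest_segment()  # the segment that generate just created
        run(["bin/nutch", "fetch", segment])
        run(["bin/nutch", "parse", segment])
        run(["bin/nutch", "updatedb", crawldb, segment])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    crawl(
        depth=2,
        run=lambda cmd: print(" ".join(cmd)),
        newest_segment=lambda: "crawl/segments/SEGMENT",  # placeholder name
    )
```

In real use, `run` could be a thin wrapper around `subprocess.run(cmd,
check=True)` and `newest_segment` could pick the most recent directory under
the segments path. Both are assumptions, not something stated in the thread.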
