Thanks for the response. Which property in nutch-default.xml sets this
default value of topN?

On 9/6/07, Rikard Lindner <[EMAIL PROTECTED]> wrote:
> There is a default value in nutch-default.xml
>
> /Rikard
>
> 2007/9/6, Smith Norton <[EMAIL PROTECTED]>:
> >
> > In the bin/nutch generate command, if I omit the '-topN' argument, what
> > is the behavior?
> >
> > Does it generate all possible URLs or does it assume a default topN value?
> >
> > I tried omitting the topN value in my crawl script and found that my
> > crawl runs much faster. Earlier I had a -topN 2000 argument and
> > it took 4-5 days to finish a crawl of depth 5.
> >
> > Now, without the topN argument, it finished a crawl of depth 5 in 6
> > hours. Can anyone explain what's going on?
> >
>
