Thanks for the response Lewis.

I'll give Nutch 2.3.1 a spin later tonight.

I didn't have success with batchId. I thought I could overwrite it in the
DB with 123, and then ./fetch 123 would fetch all URLs marked with 123.
I seem to be missing where the generate command stores its segments.

For now I'm happy looking through the code for the first time.
I think I'll try building a generator or fetch job which can
prioritize/boost domains. I'm no Java wiz, but it'll be a good exercise
regardless of whether it works.

Thanks,
Lex

On Tue, Jan 12, 2016 at 4:13 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi Lex,
>
> On Mon, Jan 11, 2016 at 2:16 PM, <[email protected]>
> wrote:
>
> >
> > I'm using Nutch 2.3.
> >
>
> Please note we are on the very cusp of releasing Apache Nutch 2.3.1 which
> has a number of bug fixes and improvements. There is a VOTE out right now
> for it. If you have time please consider taking it for a spin and providing
> us with feedback. Thanks.
>
>
> >
> > After thinking about it more I see batchId. And after running ./generate
> > -topN x I see a batch id generated. I wonder if it's safe to overwrite the
> > batchId to 123 and then run ./fetch 123?
> >
> >
> When you say overwrite the batch id, do you mean passing the -batchId 123
> argument to GeneratorJob? Yes, I think that this is OK. Bearing in mind that
> the batchId is autogenerated anyway, I am not sure that this would matter
> much. All that it would do is enable you to remember that you previously
> generated a batch with ID 123 :)
> Thanks
>
