Hi,

I had already come up with changes similar to the ones in this patch. My only 
suggestion for the patch's code is to move the check for whether the URL exists 
in the datastore below


if (!additionsAllowed) {
  return;
}


and to close the datastore once it is no longer needed.
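
To make the ordering concrete, here is a rough, self-contained sketch of what I mean. It is not the actual Nutch or Gora code: UpdateSketch, FakeStore, exists() and addIfNew() are made-up names for illustration, and the map only stands in for the real datastore. The point is that the lookup happens only after the additionsAllowed early return, and that the store is closed when done.

```java
import java.util.HashMap;
import java.util.Map;

public class UpdateSketch {

    /** Hypothetical stand-in for the datastore; NOT the Gora DataStore API. */
    static class FakeStore implements AutoCloseable {
        final Map<String, String> rows = new HashMap<>();
        int lookups = 0;       // counts how often the store was queried
        boolean closed = false;

        boolean exists(String url) {
            lookups++;
            return rows.containsKey(url);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true only if a new row was written for the given url. */
    static boolean addIfNew(FakeStore store, String url, boolean additionsAllowed) {
        if (!additionsAllowed) {
            return false;
        }
        // The existence check sits below the early return, so no lookup
        // is made when additions are switched off.
        if (store.exists(url)) {
            return false; // keep the existing row and its metadata untouched
        }
        store.rows.put(url, "new");
        return true;
    }

    public static void main(String[] args) {
        // try-with-resources closes the store when the work is finished
        try (FakeStore store = new FakeStore()) {
            store.rows.put("http://example.com/", "existing metadata");
            addIfNew(store, "http://example.com/", true);      // skipped, already stored
            addIfNew(store, "http://example.com/page", true);  // added as a new row
            System.out.println(store.rows);
        }
    }
}
```

With this ordering an existing row is never overwritten by a "new" page, and the extra lookup costs nothing when additions are disabled.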


Thanks.
Alex.
-----Original Message-----
From: Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
To: user <user@nutch.apache.org>
Sent: Tue, Jun 24, 2014 9:07 am
Subject: Re: updatedb deletes all metadata except _csh_


Hi Alex,

I am really sorry for not making the connection here.

On Tue, Jun 24, 2014 at 12:31 AM, <user-digest-h...@nutch.apache.org> wrote:

>
> So far, this looks like a bug in updatedb when filtering with batchId.
>
> I could only find one solution: to check whether new pages are in the
> datastore and, if they are, skip them.
> Otherwise, updatedb with the -all option will also work.
>

https://issues.apache.org/jira/browse/NUTCH-1679

If you can run with this patch, then please post your results here.

 
