Hi Adnane,
Yes, we were getting your mail. We were just too busy to respond, so thank
you for your patience.
OK, so this sounds like a bug IMHO. No metadata should be deleted; at
most, existing entries should be updated.
Can you please log an issue at the Nutch Jira instance describing your
Nutch 2.X search stack, along with a complete log if possible and any
queries that would help us better understand the issue at hand?
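If it helps when writing up the report, a small script along these lines
could show exactly which metadata keys disappear between the two fetch
rounds. This is only a sketch: it assumes you have already exported one
page's metadata before and after the second fetch as plain dictionaries.
The function name, the values and the key my_custom_key are hypothetical;
only _csh_ and _rs_ come from your report.

```python
def dropped_metadata_keys(before, after):
    """Return the sorted metadata keys present before the fetch but missing after it."""
    return sorted(set(before) - set(after))


# Hypothetical snapshots of one page's metadata (values are made up).
before_fetch = {"_csh_": "0.5", "_rs_": "120", "my_custom_key": "value"}
after_fetch = {"_csh_": "0.5", "_rs_": "95"}

# Keys that existed before the second fetch and are gone afterwards.
print(dropped_metadata_keys(before_fetch, after_fetch))
```

Attaching that list of dropped keys for a couple of pages would make the
issue much easier to reproduce.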
Thanks in advance.
lewis

On Wed, Feb 17, 2016 at 10:34 AM, <[email protected]> wrote:

> From: Adnane Benjelloun <[email protected]>
> To: <[email protected]>
> Cc:
> Date: Tue, 16 Feb 2016 22:03:53 -0500
> Subject: fetch deletes all metadata except _csh_ and _rs_
> Hello,
>
> This problem happens the second time I crawl a page.
>
> bin/nutch inject urls/
> bin/nutch generate -topN 1000
> bin/nutch fetch -all
> bin/nutch parse -force -all
> bin/nutch updatedb -all
>
> Second time:
>
> bin/nutch generate -topN 1000 --> batchid changes for all existing pages
> bin/nutch fetch -all --> *** metadata is deleted for all pages already
> crawled ***
> bin/nutch parse -force -all
> bin/nutch updatedb -all
>
> I'm using MongoDB.
>
> Any help, please? I'm not sure if it's a Nutch bug or my
> misunderstanding of Nutch.
>
> Best regards,
>
> Adnane
