Hi Folks,

It looks like this can be reproduced with HBase, and the bug is actually up in the Nutch 2.X layer.

https://issues.apache.org/jira/browse/NUTCH-2222

This issue will track the bug... sigh!

Lewis

On Tue, Feb 23, 2016 at 6:31 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Folks,
> From people using Nutch 2.3.1 (released around a month or so ago) it looks
> like there is a bug in the gora-mongodb module where data within the
> metadata fields of the Nutch WebPage object are being overwritten.
> I've asked people to log bug reports if they come across it again. For
> reference here are some threads relating to it
>
> http://www.mail-archive.com/user%40nutch.apache.org/msg14318.html
> http://www.mail-archive.com/user%40nutch.apache.org/msg14313.html
> http://www.mail-archive.com/user%40nutch.apache.org/msg14295.html
>
> Lewis
>
> --
> *Lewis*
>

--
*Lewis*

