Hi Kaveh,
On Wed, Apr 3, 2013 at 1:30 PM, <[email protected]> wrote:
> Hi
>
> so I am not sure if binoy is talking about this but here it is:
>
> the original exception comes from
> src/java/org/apache/nutch/indexer/IndexUtil.java line 66
>
> public NutchDocument index(String key, WebPage page) {
> NutchDocument doc = new NutchDocument();
> doc.add("id", key);
> doc.add("digest", StringUtil.toHexString(page.getSignature().array()));
> ==>> doc.add("batchId", page.getBatchId().toString());
>
> page.getBatchId() returns null for every URL. My guess is that updatedb
> removes the batchId from the rows in the webpage table: generate and fetch
> work fine with batchId, but after updatedb (which, by the way, does not
> accept batchId as one of its parameters, which means that it goes over
> the entire webpage table every time you run it, but that is a different
> issue) solrindex can't find the batchIds.
>
I've reopened NUTCH-1532 and attached a trivial patch which should now
protect against the NPE people have been getting.
Can you please check it out and get back to us?
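
The general shape of a null guard for this kind of NPE can be sketched as
follows. This is an illustration only, not the patch attached to NUTCH-1532:
the Page class and the Map-based document are hypothetical stand-ins for
WebPage and NutchDocument, the sample key and batch ID values are made up,
and it assumes getBatchId() returns some CharSequence that may be null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BatchIdGuardSketch {

    // Hypothetical stand-in for WebPage; only the batch ID getter matters here.
    static class Page {
        private final CharSequence batchId;

        Page(CharSequence batchId) {
            this.batchId = batchId;
        }

        CharSequence getBatchId() {
            return batchId;
        }
    }

    // Hypothetical stand-in for IndexUtil.index(): builds a flat
    // field-name -> value map instead of a NutchDocument.
    static Map<String, String> index(String key, Page page) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", key);

        // Guard: calling toString() on a null batch ID is what throws the NPE,
        // so only add the field when a batch ID is actually present.
        CharSequence batchId = page.getBatchId();
        if (batchId != null) {
            doc.put("batchId", batchId.toString());
        }
        return doc;
    }

    public static void main(String[] args) {
        // A row whose batch ID is missing: indexed without the field.
        System.out.println(index("com.example:http/", new Page(null)));
        // A row that still carries a (made-up) batch ID: field is included.
        System.out.println(index("com.example:http/", new Page("example-batch-1")));
    }
}
```

Skipping the field keeps the document indexable when the batch ID is
missing; whether that or some default value is the right behaviour is a
separate question from why updatedb leaves the batch ID empty.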
Thank you, Kaveh.