The current code that puts updb_mrk in dbUpdateReducer is as follows:

    Utf8 mark = Mark.PARSE_MARK.removeMarkIfExist(page);
    if (mark != null) {
      Mark.UPDATEDB_MARK.putMark(page, mark);
    }

The mark is always null, regardless of whether a PARSE_MARK is present.

This function calls:

  public Utf8 removeFromMarkers(Utf8 key) {
    if (markers == null) { return null; }
    getStateManager().setDirty(this, 20);
    return markers.remove(key);
  }

It seems to me that getStateManager().setDirty(this, 20); removes the marker,
and that is why the last line returns null.

I tried to follow getStateManager().setDirty(this, 20) through the class
hierarchy, but did not find anything useful.

I have fixed the issue by replacing the above lines with:

    Utf8 parse_mark = Mark.PARSE_MARK.checkMark(page);
    if (parse_mark != null) {
      Mark.UPDATEDB_MARK.putMark(page, parse_mark);
      Mark.PARSE_MARK.removeMark(page);
    }
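
To make the ordering in the replacement above easier to see, here is a minimal standalone sketch of the same check, put, then remove sequence. It is not Nutch or Gora code: the class MarkSketch, the method moveMark, and the keys "parse" and "updatedb" are made up for illustration, and a plain HashMap stands in for the page's marker map. It only shows the pattern and says nothing about why removeMarkIfExist returns null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a page's marker map; not the Nutch/Gora classes.
public class MarkSketch {
    // Illustrative keys only; the real marker names may differ.
    static final String PARSE_KEY = "parse";
    static final String UPDATEDB_KEY = "updatedb";

    // Read the source mark first, copy it to the target key,
    // and only then remove the source mark.
    static void moveMark(Map<String, String> markers, String from, String to) {
        String mark = markers.get(from);
        if (mark != null) {
            markers.put(to, mark);
            markers.remove(from);
        }
    }

    public static void main(String[] args) {
        Map<String, String> markers = new HashMap<>();
        markers.put(PARSE_KEY, "batch-1");
        moveMark(markers, PARSE_KEY, UPDATEDB_KEY);
        System.out.println(markers);
    }
}
```

Because the value is read before anything is removed, the copy does not depend on what the remove call returns.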

Thanks.
Alex.



-----Original Message-----

From: Ferdy Galema <[email protected]>
To: user <[email protected]>
Sent: Thu, Aug 2, 2012 12:16 am
Subject: Re: Nutch 2 solrindex


Hi,

Do you want to open a Jira and attach the patch over there? Or just explain
what causes the problem. I'm curious as to what this might be.

Thanks,
Ferdy.

On Wed, Aug 1, 2012 at 9:27 PM, <[email protected]> wrote:

> This is directly related to the thread I opened yesterday. I think
> this is a bug, since updatedb fails to put the update mark.
> I have fixed it by modifying the code. I have a patch, but I am not sure
> if I can send it as an attachment.
>
> Alex.
>
>
>
> -----Original Message-----
> From: Bai Shen <[email protected]>
> To: user <[email protected]>
> Sent: Wed, Aug 1, 2012 10:37 am
> Subject: Nutch 2 solrindex
>
>
> I'm trying to crawl using Nutch 2.  However, I can't seem to get it to
> index to solr without adding -reindex to the command.  And at that point it
> indexes everything I've crawled.  I've tried both -all and the batch id,
> but neither one results in anything being indexed to solr.
>
> Any suggestions of what to look at?
>
> Thanks.
>
>
>

 
