Thanks. I will see if I can reproduce and patch this (in case you do not
create a Jira issue).

On Thu, Aug 2, 2012 at 7:54 PM, <[email protected]> wrote:

> The current code that puts updb_mrk in dbUpdateReducer is as follows:
>
>     Utf8 mark = Mark.PARSE_MARK.removeMarkIfExist(page);
>     if (mark != null) {
>       Mark.UPDATEDB_MARK.putMark(page, mark);
>     }
> The mark is always null, regardless of whether a PARSE_MARK is present.
>
> This function calls
>
>   public Utf8 removeFromMarkers(Utf8 key) {
>     if (markers == null) { return null; }
>     getStateManager().setDirty(this, 20);
>     return markers.remove(key);
>   }
>
> It seems to me that getStateManager().setDirty(this, 20) removes the
> marker, and that is why the last line returns null.
>
> I tried to follow getStateManager().setDirty(this, 20) through the class
> hierarchy, but did not find anything useful.
>
> I have fixed the issue by replacing the above lines with:
>
>     Utf8 parse_mark = Mark.PARSE_MARK.checkMark(page);
>     if (parse_mark != null) {
>       Mark.UPDATEDB_MARK.putMark(page, parse_mark);
>       Mark.PARSE_MARK.removeMark(page);
>     }
>
> Thanks.
> Alex.
>
>
>
> -----Original Message-----
>
> From: Ferdy Galema <[email protected]>
> To: user <[email protected]>
> Sent: Thu, Aug 2, 2012 12:16 am
> Subject: Re: Nutch 2 solrindex
>
>
> Hi,
>
> Do you want to open a Jira issue and attach the patch there? Or just
> explain what causes the problem? I'm curious what this might be.
>
> Thanks,
> Ferdy.
>
> On Wed, Aug 1, 2012 at 9:27 PM, <[email protected]> wrote:
>
> > This is directly related to the thread I opened yesterday. I think
> > this is a bug, since updatedb fails to put the update mark.
> > I have fixed it by modifying the code. I have a patch, but I am not sure
> > if I can send it as an attachment.
> >
> > Alex.
> >
> >
> >
> > -----Original Message-----
> > From: Bai Shen <[email protected]>
> > To: user <[email protected]>
> > Sent: Wed, Aug 1, 2012 10:37 am
> > Subject: Nutch 2 solrindex
> >
> >
> > I'm trying to crawl using Nutch 2.  However, I can't seem to get it to
> > index to solr without adding -reindex to the command.  And at that point
> > it indexes everything I've crawled.  I've tried both -all and the batch id,
> > but neither one results in anything being indexed to solr.
> >
> > Any suggestions of what to look at?
> >
> > Thanks.
> >
> >
> >
>
>
>
