Hi,
I found a bug in Nutch 2.0 that prevents Mark.UPDATEDB_MARK from ever
being set to the batch id.
It is in the reduce function of org.apache.nutch.crawl.DbUpdateReducer:
    Mark.GENERATE_MARK.removeMarkIfExist(page);
    Mark.FETCH_MARK.removeMarkIfExist(page);
    Utf8 mark = Mark.PARSE_MARK.removeMarkIfExist(page);
    if (mark != null) {
      Mark.UPDATEDB_MARK.putMark(page, mark);
    }
This clears the generate, fetch and parse batch ids and is supposed to
set the updated batch id, but Mark.UPDATEDB_MARK.putMark(page, mark) is
never executed, because mark is always null.
In Gora 0.2, the remove function of StatefulHashMap, which is called
when a marker is removed from a WebPage, always returns null.
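To make the effect easier to see, here is a small standalone sketch in
plain Java. It does not use Nutch or Gora at all: NullReturningRemoveMap
is a hypothetical stand-in for a map whose remove() always returns null,
and the "parse" and "updatedb" keys are made-up marker names. The second
method shows one possible workaround (read the value with get() before
removing it); I have not checked whether that is the right fix for Nutch
itself.

```java
import java.util.HashMap;
import java.util.Map;

class MarkerDemo {

    /** Hypothetical stand-in: removes the entry but always returns null. */
    static class NullReturningRemoveMap<K, V> extends HashMap<K, V> {
        @Override
        public V remove(Object key) {
            super.remove(key);
            return null; // the old value is lost
        }
    }

    /** Same pattern as the reducer: relies on remove() returning the old value. */
    static boolean moveMark(Map<String, String> markers, String from, String to) {
        String mark = markers.remove(from);
        if (mark != null) {
            markers.put(to, mark);
            return true;
        }
        return false;
    }

    /** Possible workaround: fetch the value first, then remove it. */
    static boolean moveMarkWithGet(Map<String, String> markers, String from, String to) {
        String mark = markers.get(from);
        markers.remove(from);
        if (mark != null) {
            markers.put(to, mark);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> a = new NullReturningRemoveMap<>();
        a.put("parse", "batch-1");
        System.out.println(moveMark(a, "parse", "updatedb"));        // false
        System.out.println(a.containsKey("updatedb"));               // false

        Map<String, String> b = new NullReturningRemoveMap<>();
        b.put("parse", "batch-1");
        System.out.println(moveMarkWithGet(b, "parse", "updatedb")); // true
        System.out.println(b.get("updatedb"));                       // batch-1
    }
}
```

With the first method the parse mark is removed but the updated mark is
never written, which matches what I see in DbUpdateReducer.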
Thanks.