Hi everybody,

I implemented a custom indexer plugin for Nutch 2.x to store data in MongoDB.
After analyzing the documents in MongoDB, I found that the indexer plugin was
executed several times for the same page.

Why does Nutch need to recrawl the same page several times?
I was looking at the code of the reducer for the update phase
https://github.com/apache/nutch/blob/release-2.3/src/java/org/apache/nutch/crawl/DbUpdateReducer.java
and I don't understand the purpose of clearing the marks, especially
the generate mark.

Best Regards,
Dzmitry
