Hi everybody, I implemented a custom indexer plugin for Nutch 2.x to store data in MongoDB. After analyzing the documents in MongoDB, I found that the indexer plugin was executed several times for the same page.
Why does Nutch need to recrawl the same page several times? I was looking at the code of the reducer for the update phase (https://github.com/apache/nutch/blob/release-2.3/src/java/org/apache/nutch/crawl/DbUpdateReducer.java) and I didn't understand the purpose of clearing the marks, especially the generate mark. Best Regards, Dzmitry

