Hi all, I want to clean up the HTML of the cached page. Where is the cache stored so that I can read it, and which extension point can I use? If I do this before indexing, will it be too expensive?
Thank you for any answer.
--
View this message in context: http://www.nabble.com/Write-back-to-the-segment--tp16077713p16077713.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.