I am also looking for a way to exclude certain content within an HTML page while it is being parsed. I am trying to do this from within the Parse Filter, but I am not sure how. Did you figure anything out? Does anyone else know how this would work?
Thanks

