Would a better option be to use this property instead?

indexer.max.content.length = 32765
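
For reference, a minimal sketch of how that setting might look in nutch-site.xml. The value is the one from this thread. The description text is my own paraphrase and not copied from nutch-default.xml, so check it against your Nutch version. As far as I know, this limit is applied at indexing time, after parsing, and counts characters, whereas http.content.limit cuts the raw download in bytes at fetch time.

```xml
<!-- Sketch only: goes inside the <configuration> element of nutch-site.xml -->
<property>
  <name>indexer.max.content.length</name>
  <value>32765</value>
  <!-- Paraphrased description (assumption): truncates the parsed content
       field at indexing time rather than the raw fetched document. -->
  <description>Maximum number of characters of content to index.</description>
</property>
```

If this behaves as described, any tag filtering done by the parser would already have run before the truncation happens.
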
----- Original Message -----
From: "KRIS MUSSHORN" <[email protected]>
To: [email protected]
Sent: Friday, September 30, 2016 9:25:17 AM
Subject: control order of operations

I've got nutch-site.xml set to http.content.limit = 32765 (1 short of solr max). I also have parser.html.whitelist set to ignore a bunch of irrelevant tags. Can I set nutch so that whitelist applies before truncation?

Kris

