Thanks, Markus.
----- Original Message -----

From: "Markus Jelsma" <[email protected]> 
To: [email protected] 
Sent: Tuesday, October 4, 2016 7:01:57 AM 
Subject: RE: control order of operations 

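Hello - this is not Solr's maximum for a field at all. It is Lucene's maximum
for a single indexed term (32766 bytes), which a field of type string runs into
because its whole value is indexed as one term. Just don't use string when
indexing.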
Markus 

-----Original message----- 
> From:KRIS MUSSHORN <[email protected]> 
> Sent: Friday 30th September 2016 17:54 
> To: [email protected] 
> Subject: Re: control order of operations 
> 
> Would a better option be to use this property?
> 
> indexer.max.content.length = 32765 
> 
> ----- Original Message ----- 
> 
> From: "KRIS MUSSHORN" <[email protected]> 
> To: [email protected] 
> Sent: Friday, September 30, 2016 9:25:17 AM 
> Subject: control order of operations 
> 
> In nutch-site.xml I have http.content.limit = 32765 (1 short of the Solr
> max).
>
> I also have parser.html.whitelist set to ignore a bunch of irrelevant tags.
>
> Can I configure Nutch so that the whitelist is applied before truncation?
> 
> Kris 
> 
> 
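
For readers of this thread, a minimal sketch of how the two limits mentioned above might sit together in nutch-site.xml. It assumes that http.content.limit is applied to the raw download at fetch time (in bytes, before parsing) and that indexer.max.content.length is applied to the parsed text at indexing time (in characters); the -1 value for "no fetch-time truncation" is likewise an assumption to check against the nutch-default.xml of the Nutch version in use.

```xml
<!-- nutch-site.xml (sketch only; verify each property against nutch-default.xml) -->
<configuration>
  <!-- Fetch-time limit on the raw download, in bytes.
       Assumption: a negative value disables truncation here, so the
       parser sees the whole page before any text is cut. -->
  <property>
    <name>http.content.limit</name>
    <value>-1</value>
  </property>

  <!-- Index-time limit on the parsed text, in characters.
       32765 is the value quoted in this thread. -->
  <property>
    <name>indexer.max.content.length</name>
    <value>32765</value>
  </property>
</configuration>
```

Under those assumptions, tag filtering in the parser would run before the index-time cut. Note that 32765 characters can still exceed 32766 bytes once encoded as UTF-8 if the text contains non-ASCII characters, so a character limit alone does not guarantee that a string field stays under the byte limit.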
