I think I am hitting the known bug NUTCH-2186:
https://issues.apache.org/jira/browse/NUTCH-2186
I need to keep the raw HTML in my Solr index (or somewhere else) so that an
external tool can access and parse it. So I added the addBinaryContent and
base64 options to my indexing command. On the very first segment, I get many
failures with the message "String length must be a multiple of four." The same
happens if I omit the base64 argument.
Is there a workaround or fix for this issue? I am using Nutch 1.12 and Solr 
5.4.1.
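
For context, the message reads like a Base64 decoding complaint: padded Base64 text always has a length that is a multiple of four, so a value that is truncated or not encoded at all would be rejected. The snippet below is only a sketch of that constraint in plain Python. It is not Nutch or Solr code, and the helper name is_decodable_base64 and the sample HTML are made up for illustration.

```python
import base64
import binascii


def is_decodable_base64(text: str) -> bool:
    """Return True if text decodes as strictly padded standard Base64."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        # Raised for bad padding, bad length, or non-alphabet characters.
        return False


# Hypothetical raw page content, standing in for what the indexer would send.
raw_html = b"<html><body>hello</body></html>"
encoded = base64.b64encode(raw_html).decode("ascii")

print(len(encoded) % 4)                         # length of the encoded form, modulo four
print(is_decodable_base64(encoded))             # properly encoded value
print(is_decodable_base64(encoded[:-1]))        # truncated value
print(is_decodable_base64(raw_html.decode()))   # raw HTML sent without encoding
```

If the field on the Solr side expects Base64 and receives something like the last two cases, an error of this kind would be the expected result, though I have not confirmed that this is what happens here.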
