And just out of curiosity -- what would the answer be if I used the Tika
server? Does it have a limit by default?
Thanks,
Oliver
On 14.10.2014 15:47, Nick Burch wrote:
On Tue, 14 Oct 2014, imyuka wrote:
I suppose I'm calling Tika with parse + content handler; the following
code is one example I found on the internet:
ContentHandler handler = new BodyContentHandler();
If you look at the JavaDocs for BodyContentHandler:
https://tika.apache.org/1.6/api/org/apache/tika/sax/BodyContentHandler.html
You'll see that its constructor takes an optional parameter that
specifies the maximum number of characters it'll accept. Pass in a
higher number, or -1 to disable the check.
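For illustration: with Tika on the classpath, the change would be
new BodyContentHandler(-1) in place of new BodyContentHandler(). The
sketch below is not Tika's code. It is a simplified stand-in that uses
only the JDK's SAX classes, and the class name LimitedTextHandler and
its exact behaviour at the limit are assumptions made for the example.
It shows roughly how a write limit on a content handler works and why a
negative value switches it off.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a write-limited content handler.
// It mimics the idea behind BodyContentHandler(int writeLimit),
// but it is NOT Tika's implementation.
class LimitedTextHandler extends DefaultHandler {
    // Maximum number of characters to keep; a negative value means "no limit".
    private final int writeLimit;
    private final StringBuilder text = new StringBuilder();

    LimitedTextHandler(int writeLimit) {
        this.writeLimit = writeLimit;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (writeLimit < 0 || text.length() + length <= writeLimit) {
            // Unlimited, or still within the limit: keep everything.
            text.append(ch, start, length);
        } else {
            // Keep only what still fits, then signal that the limit was hit.
            text.append(ch, start, writeLimit - text.length());
            throw new SAXException(
                "write limit of " + writeLimit + " characters reached");
        }
    }

    @Override
    public String toString() {
        return text.toString();
    }
}
```

A handler built with new LimitedTextHandler(5) keeps the first five
characters and then throws, while new LimitedTextHandler(-1) accepts
any amount of text.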
Nick