Stop text extraction when the maxFieldLength limit is reached
-------------------------------------------------------------
Key: JCR-2506
URL: https://issues.apache.org/jira/browse/JCR-2506
Project: Jackrabbit Content Repository
Issue Type: Improvement
Components: indexing, jackrabbit-core
Reporter: Jukka Zitting
Assignee: Jukka Zitting
Priority: Minor
When indexing large documents, text extraction often takes a long time and
uses a lot of memory, even though only the first maxFieldLength (10000 by
default) tokens are indexed. I'd like to add a maxExtractLength parameter that
sets the maximum number of characters to extract from a binary. Its default
value could be something like ten times the maxFieldLength setting.
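
A minimal sketch of how such a limit could work, assuming the extracted text
is consumed through a java.io.Reader. The LimitedReader class and its readAll
helper are hypothetical names used for illustration only; they are not
existing Jackrabbit code. The wrapper reports end of stream once the given
number of characters has been read, so the consumer never pulls more text
than the limit.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper: signals end of stream after maxChars characters. */
class LimitedReader extends FilterReader {

    /** Number of characters that may still be returned to the caller. */
    private long remaining;

    LimitedReader(Reader in, long maxChars) {
        super(in);
        this.remaining = maxChars;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) {
            return -1; // limit reached: pretend the stream has ended
        }
        int c = super.read();
        if (c != -1) {
            remaining--;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (remaining <= 0) {
            return -1; // limit reached: pretend the stream has ended
        }
        // Never ask the underlying reader for more than the remaining budget.
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) {
            remaining -= n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        // Skipped characters count against the budget as well.
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    /** Illustration only: drains a reader into a String. */
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With the default proposed above, the limit would be 10 * maxFieldLength, that
is 100000 characters for the default maxFieldLength of 10000. Note that
cutting off the reader only bounds how much text is consumed; whether it also
saves time and memory depends on the extractor producing its output lazily.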