[ 
https://issues.apache.org/jira/browse/TIKA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304283#comment-17304283
 ] 

Tim Allison commented on TIKA-3325:
-----------------------------------

Is there any reason one would want to differentiate between a writeLimit per 
(embedded) document and an overall writeLimit?

If there are no objections, I'll modify RecursiveParserWrapper to read the 
writeLimit off the handler and then enforce it across the combined characters 
written to the main document _and_ the embedded documents, as is the behavior 
for {{/tika}}.
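
To make the proposal concrete, here is a minimal sketch of a single limit shared 
by the main document and its embedded documents. It is not the actual 
RecursiveParserWrapper code: the class {{SharedWriteLimit}} and its methods are 
hypothetical names used only for illustration, and the sketch counts Java chars 
rather than bytes.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Tika): one character budget shared by the
 * main document and every embedded document.
 */
final class SharedWriteLimit {
    private final int limit;   // total chars allowed; a negative value means unlimited
    private int written = 0;   // chars charged against the budget so far

    SharedWriteLimit(int limit) {
        this.limit = limit;
    }

    /** Returns the prefix of text that still fits and charges it against the budget. */
    String take(String text) {
        if (limit < 0) {
            written += text.length();
            return text;
        }
        int remaining = Math.max(0, limit - written);
        String kept = text.length() <= remaining ? text : text.substring(0, remaining);
        written += kept.length();
        return kept;
    }

    /** True once the combined output has reached the limit. */
    boolean isExhausted() {
        return limit >= 0 && written >= limit;
    }

    /** Applies one shared budget to the container text, then to each embedded text in order. */
    static List<String> applyTo(List<String> documents, int limit) {
        SharedWriteLimit budget = new SharedWriteLimit(limit);
        List<String> out = new ArrayList<>();
        for (String doc : documents) {
            out.add(budget.take(doc));
        }
        return out;
    }
}
```

With a limit of 8 and texts of 6, 4, and 2 characters, the container text is kept 
whole, the first embedded document is cut to 2 characters, and the second comes 
back empty. A per-document limit would instead keep all three texts intact.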

> Add header to limit extracted content
> -------------------------------------
>
>                 Key: TIKA-3325
>                 URL: https://issues.apache.org/jira/browse/TIKA-3325
>             Project: Tika
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 1.24.1
>            Reporter: Julien Massiera
>            Priority: Major
>
> In Tika server, it would be very useful to support a new header in requests 
> to the /rmeta endpoint that sets a byte limit on the content extracted from 
> documents.
> For example, the header "writeLimit: 1000000" would guarantee that the 
> "X-TIKA:content" metadata object does not exceed 1000000 bytes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
