[ 
https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332759#comment-17332759
 ] 

Tim Allison commented on TIKA-3372:
-----------------------------------

While I'm fixing this, a question for [~julienFL] and fellow devs.  With the 
fix, the parse stops as soon as the content length hits the write limit.  This 
means that users will lose metadata about embedded objects: if the parser hits 
the write limit on attachment 3, it will not process attachments 4-n at all.  
Is this what we want, or do we just want the parser to stop writing content 
but still gather the metadata (including embedded file type)?
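
To make the two options concrete, here is a minimal sketch of the difference. 
It is not Tika code: WriteLimitSketch, Attachment, Result, stopParse and 
stopWriting are all hypothetical names made up for illustration, and the 
"parse" is just copying strings against a shared character budget.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these types are Tika classes.
final class WriteLimitSketch {

    /** Stand-in for one embedded object: its detected type and its text. */
    static final class Attachment {
        final String contentType;
        final String text;

        Attachment(String contentType, String text) {
            this.contentType = contentType;
            this.text = text;
        }
    }

    /** What ends up in the output for one object: type metadata plus kept text. */
    static final class Result {
        final String contentType;
        final String keptText;

        Result(String contentType, String keptText) {
            this.contentType = contentType;
            this.keptText = keptText;
        }
    }

    /** Option 1: stop the whole parse once the shared budget is used up. */
    static List<Result> stopParse(List<Attachment> attachments, int writeLimit) {
        List<Result> out = new ArrayList<>();
        int remaining = writeLimit;
        for (Attachment a : attachments) {
            int n = Math.min(remaining, a.text.length());
            out.add(new Result(a.contentType, a.text.substring(0, n)));
            remaining -= n;
            if (remaining == 0) {
                break; // later attachments are never visited, so their metadata is lost
            }
        }
        return out;
    }

    /** Option 2: stop writing content at the limit, but still record every object. */
    static List<Result> stopWriting(List<Attachment> attachments, int writeLimit) {
        List<Result> out = new ArrayList<>();
        int remaining = writeLimit;
        for (Attachment a : attachments) {
            int n = Math.min(remaining, a.text.length());
            // Once the budget is exhausted n is 0: empty content, type still kept.
            out.add(new Result(a.contentType, a.text.substring(0, n)));
            remaining -= n;
        }
        return out;
    }

    /** Five made-up attachments of four characters each. */
    static List<Attachment> sample() {
        return Arrays.asList(
                new Attachment("text/plain", "aaaa"),
                new Attachment("application/pdf", "bbbb"),
                new Attachment("image/png", "cccc"),
                new Attachment("text/html", "dddd"),
                new Attachment("text/csv", "eeee"));
    }

    public static void main(String[] args) {
        // With a limit of 10 characters the budget runs out inside attachment 3.
        System.out.println(stopParse(sample(), 10).size());   // prints 3
        System.out.println(stopWriting(sample(), 10).size()); // prints 5
    }
}
```

In this toy setup both variants keep the same 10 characters of content; the 
only difference is whether attachments 4 and 5 still show up with their file 
type.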

> Fix writelimit in recursiveparserhandler
> ----------------------------------------
>
>                 Key: TIKA-3372
>                 URL: https://issues.apache.org/jira/browse/TIKA-3372
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> On the dev list, [~julienFL] noted surprising behavior with the new write 
> limit in the /rmeta handler.  I wasn't able to replicate it, but there is 
> clearly a bug in how the write limiting is working.  The upshot is that we're 
> still effectively applying the write limit per object, not across the full 
> container doc and its embedded objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
