[ 
https://issues.apache.org/jira/browse/TIKA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507662#comment-17507662
 ] 

Julien Massiera commented on TIKA-3695:
---------------------------------------

[~tallison] regarding your question, "If a parser tries to write too much 
metadata, do we throw a WriteLimitException and stop parsing, or do we keep 
parsing but add a "metadata truncation" flag to the metadata object? I'd be 
inclined to the latter.": I would also go for the second option, i.e. keep 
parsing but add a "metadata truncation" flag.
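
To make the second option concrete, here is a minimal sketch of what "keep 
parsing but flag the truncation" could look like. It is not Tika code: it works 
on a plain Map rather than Tika's Metadata class, and the class name, the method 
names and the flag key are all made up for illustration. It assumes the three 
limits from the issue description (name bytes, value bytes, total bytes), 
measured in UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in Tika. */
class LimitingMetadataSketch {
    // Hypothetical flag key, chosen for this example only.
    static final String TRUNCATED_KEY = "X-Sketch:metadata-truncated";

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Cuts s to at most maxBytes UTF-8 bytes without splitting a code point. */
    static String truncateUtf8(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > maxBytes) {
                break;
            }
            bytes += cpBytes;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }

    /** Returns a limited copy of the input; never throws when a limit is hit. */
    static Map<String, List<String>> limit(Map<String, List<String>> in,
            int maxNameBytes, int maxValueBytes, int maxTotalBytes) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        boolean truncated = false;
        int total = 0;
        for (Map.Entry<String, List<String>> e : in.entrySet()) {
            String name = e.getKey();
            int nameBytes = utf8Length(name);
            if (nameBytes > maxNameBytes) {
                truncated = true; // drop entries whose name is too long
                continue;
            }
            for (String value : e.getValue()) {
                String kept = truncateUtf8(value, maxValueBytes);
                if (kept.length() < value.length()) {
                    truncated = true; // value was cut to the per-value limit
                }
                // The name is counted once, when its first value is stored.
                int cost = utf8Length(kept) + (out.containsKey(name) ? 0 : nameBytes);
                if (total + cost > maxTotalBytes) {
                    truncated = true; // skip the remaining values of this name
                    break;
                }
                total += cost;
                out.computeIfAbsent(name, k -> new ArrayList<>()).add(kept);
            }
        }
        if (truncated) {
            // The flag is the only signal; processing carries on as usual.
            out.put(TRUNCATED_KEY, List.of("true"));
        }
        return out;
    }
}
```

With limits of 10 bytes per name, 5 bytes per value and 100 bytes in total, a 
"title" value of "hello world" would be stored as "hello", an entry with a 
21-byte name would be dropped, and the flag entry would be added so that 
downstream consumers can tell the metadata is incomplete.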

On the technical side, I am unfortunately not able to comment on your proposals, 
as I have not dug into the code enough to understand the logic behind the 
metadata  

> LimitingMetadataFilter
> ----------------------
>
>                 Key: TIKA-3695
>                 URL: https://issues.apache.org/jira/browse/TIKA-3695
>             Project: Tika
>          Issue Type: New Feature
>          Components: metadata
>    Affects Versions: 1.28.1, 2.3.0
>            Reporter: Julien Massiera
>            Priority: Major
>
> Some files contain abnormally large metadata (several MB, whether in the 
> metadata values, the metadata names, or the total amount of metadata), 
> which can be a problem for memory consumption.
> It would be great to develop a new LimitingMetadataFilter so that we can 
> filter out metadata according to different byte limits (on metadata 
> names, metadata values, and the total amount of metadata).
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
