[ 
https://issues.apache.org/jira/browse/TIKA-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240795#comment-17240795
 ] 

Kenneth William Krugler commented on TIKA-3239:
-----------------------------------------------

Hi [~harirehm] - this is the expected behavior. The parser has no way to 
report back to the caller that data was dropped because the limit was hit, 
so it throws an exception instead.
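
To illustrate the reasoning, here is a minimal, self-contained sketch. It is 
not Tika's actual PSDParser code: the class LimitedBlockReader, the 
BlockTooLargeException type, and the 4-byte length prefix are all 
hypothetical stand-ins. The point is only that a method returning a plain 
byte[] cannot signal "this was truncated", so an oversized block is rejected 
with an exception that the caller can catch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class LimitedBlockReader {
    // Hypothetical cap, chosen to mirror the "data length must be < 1000000" message.
    static final int MAX_DATA_LENGTH = 1_000_000;

    /** Hypothetical stand-in for org.apache.tika.exception.TikaException. */
    static class BlockTooLargeException extends Exception {
        BlockTooLargeException(String message) {
            super(message);
        }
    }

    /**
     * Reads a block made of a 4-byte big-endian length followed by that many bytes.
     * The return type has no room for a "truncated" flag, so an oversized block
     * is rejected outright instead of being silently cut short.
     */
    static byte[] readBlock(DataInputStream in) throws IOException, BlockTooLargeException {
        int length = in.readInt();
        if (length < 0 || length >= MAX_DATA_LENGTH) {
            throw new BlockTooLargeException(
                    "data length must be < " + MAX_DATA_LENGTH + ": " + length);
        }
        byte[] data = new byte[length];
        in.readFully(data);
        return data;
    }

    /** Caller-side handling: catch the exception and decide what to do with the file. */
    static String describe(byte[] input) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(input))) {
            return "read " + readBlock(in).length + " bytes";
        } catch (BlockTooLargeException e) {
            return "rejected: " + e.getMessage();
        } catch (IOException e) {
            return "I/O error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Declared length 3, followed by three payload bytes.
        byte[] small = {0, 0, 0, 3, 'a', 'b', 'c'};
        // Declared length 0x006C76B4 = 7108276, the value from the reported trace.
        byte[] huge = {0x00, 0x6C, 0x76, (byte) 0xB4};
        System.out.println(describe(small)); // read 3 bytes
        System.out.println(describe(huge));  // rejected: data length must be < 1000000: 7108276
    }
}
```

In the same way, code calling Tika can wrap the parse call in a try/catch for 
TikaException and decide per file whether to skip it or log it.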

As a side note, please ask questions like this on the mailing list. It's a 
lighter-weight way of handling them, and others can benefit from the 
exchange. See 
https://tika.apache.org/mail-lists.html#:~:text=The%20user%20mailing%20list%20at,in%20contributing%20to%20Tika%20development.

> TikaException: data length must be < 1000000
> --------------------------------------------
>
>                 Key: TIKA-3239
>                 URL: https://issues.apache.org/jira/browse/TIKA-3239
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.24.1
>            Reporter: HARI RAM
>            Priority: Major
>
> Tika exception is thrown when trying to parse PSD files using the latest tika 
> version (1.24.1). 
>  
>  
> {code:java}
> org.apache.tika.exception.TikaException: data length must be < 1000000: 7108276
>       at org.apache.tika.parser.image.PSDParser$ResourceBlock.<init>(PSDParser.java:233)
>       at org.apache.tika.parser.image.PSDParser$ResourceBlock.<init>(PSDParser.java:167)
>       at org.apache.tika.parser.image.PSDParser.parse(PSDParser.java:135)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>       at org.apache.tika.Tika.parseToString(Tika.java:527)
>       at org.apache.tika.Tika.parseToString(Tika.java:602)
> {code}
>  
> Is this limit configurable? Shouldn't that be parsing up to the limit and 
> return the parsed data?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
