[
https://issues.apache.org/jira/browse/TIKA-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246025#comment-17246025
]
Hudson commented on TIKA-3243:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #73 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/73/])
TIKA-3243 -- bump max record length and enable manual configuration of max
record length (tallison:
[https://github.com/apache/tika/commit/e8fa990e201f9ffdef2da3b0e2237a0b93ed9353])
* (add)
tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-image-module/src/test/resources/org/apache/tika/parser/image/tika-config-TIKA-3243.xml
* (edit)
tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-image-module/src/main/java/org/apache/tika/parser/image/PSDParser.java
* (edit)
tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-image-module/src/test/java/org/apache/tika/parser/image/PSDParserTest.java
> PSDParser MAX_DATA_LENGTH_BYTES check causes TikaException
> ----------------------------------------------------------
>
> Key: TIKA-3243
> URL: https://issues.apache.org/jira/browse/TIKA-3243
> Project: Tika
> Issue Type: Bug
> Reporter: Shunfei Chen
> Assignee: Tim Allison
> Priority: Major
>
> We are using the Tika library's AutoDetectParser to extract metadata from a
> variety of files. Since upgrading to Tika 1.24.1 in the past month, we have
> been seeing TikaExceptions (stack trace below).
>
> {code:java}
> Caused by: org.apache.tika.exception.TikaException: data length must be < 1000000: 17777730
> at org.apache.tika.parser.image.PSDParser$ResourceBlock.<init>(PSDParser.java:233)
> at org.apache.tika.parser.image.PSDParser$ResourceBlock.<init>(PSDParser.java:167)
> at org.apache.tika.parser.image.PSDParser.parse(PSDParser.java:135)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159)
> {code}
> However, I think the PSD file we are parsing is valid: I can view it and
> open it with Photoshop. After some digging, I believe the change was
> introduced as part of https://issues.apache.org/jira/browse/TIKA-3050 in
> this commit (line 232):
> https://github.com/apache/tika/commit/ab8a9ed830ec710a32e4ffdf4989aea3aaea92ef
>
> The biggest size we have seen so far in the files our users uploaded is
> 161,548,458, which is far above the 1,000,000 limit in PSDParser.
> Please let me know if you need any additional information.
>
> Thanks
> Shunfei.
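
For illustration only, the kind of length guard that yields the exception
message in the trace above, combined with a configurable limit as the commit
message describes, might look like the sketch below. The names
RecordLengthGuard, setMaxLengthBytes and checkLength are hypothetical and are
not Tika's API, and IOException stands in for TikaException so the example is
self-contained. The real change is in the PSDParser.java commit linked above.

```java
import java.io.IOException;

// Hypothetical stand-in for a parser's record-length guard; not Tika's actual code.
final class RecordLengthGuard {
    // Default cap, mirroring the 1,000,000 limit quoted in the stack trace.
    static final long DEFAULT_MAX_LENGTH_BYTES = 1_000_000L;

    private long maxLengthBytes = DEFAULT_MAX_LENGTH_BYTES;

    // Lets a caller raise (or lower) the cap instead of relying on a hard-coded constant.
    void setMaxLengthBytes(long maxLengthBytes) {
        if (maxLengthBytes <= 0) {
            throw new IllegalArgumentException("max length must be positive: " + maxLengthBytes);
        }
        this.maxLengthBytes = maxLengthBytes;
    }

    long getMaxLengthBytes() {
        return maxLengthBytes;
    }

    // Rejects a declared length before any buffer of that size is allocated.
    // IOException is used here in place of TikaException (an assumption for this sketch).
    void checkLength(long declaredLength) throws IOException {
        if (declaredLength < 0 || declaredLength >= maxLengthBytes) {
            throw new IOException(
                    "data length must be < " + maxLengthBytes + ": " + declaredLength);
        }
    }
}
```

With the default cap, checkLength(17777730) fails with the same message text
as in the report; after setMaxLengthBytes(200000000), the 161,548,458 value
mentioned by the reporter passes the check.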
--
This message was sent by Atlassian Jira
(v8.3.4#803005)