[ https://issues.apache.org/jira/browse/TIKA-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999705#comment-13999705 ]
ASF GitHub Bot commented on TIKA-1169:
--------------------------------------

GitHub user mkr opened a pull request:

    https://github.com/apache/tika/pull/8

    TIKA-1169: Adding other Mach-O magic bytes for jnilib files.

    Adding remaining Mach-o binary signatures to fix TIKA-1169

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mkr/tika tika-1169-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/8.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8

----
commit 8b033abfa19a3cb8939ac67bbabcd4d53e89ec42
Author: Matthias Krueger <m...@mkr.io>
Date:   2014-05-16T08:58:40Z

    TIKA-1169: Adding other Mach-O magic bytes for jnilib files.

    MH_MAGIC = 0xfeedface
    MH_CIGAM = 0xcefaedfe
    MH_MAGIC_64 = 0xfeedfacf
    MH_CIGAM_64 = 0xcffaedfe

    See https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html

----

> Fails to parse jnilib file
> --------------------------
>
>                 Key: TIKA-1169
>                 URL: https://issues.apache.org/jira/browse/TIKA-1169
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>         Environment: Windows 7 x64, Java 1.6.45
>            Reporter: brat
>            Priority: Critical
>             Fix For: 1.5
>
>         Attachments: libwrapper.jnilib
>
>
> Hi,
> I'm trying to parse a folder with jnilib file inside, but Tika 1.4 throws
> exception :
> java.io.IOException:
>     at org.apache.tika.parser.ParsingReader.read(ParsingReader.java:260)
>     at java.io.Reader.read(Unknown Source)
>     at ca.cloudscraper.core.impl.Engine.process(Engine.java:63)
>     at ca.cloudscraper.core.impl.Engine.process(Engine.java:34)
>     at ca.cloudscraper.core.impl.Engine.process(Engine.java:34)
>     at ca.cloudscraper.core.impl.Engine.process(Engine.java:34)
>     at ca.cloudscraper.core.impl.Engine.execute(Engine.java:117)
>     at ca.cloudscraper.core.tests.LuceneServiceImplTest.test5(LuceneServiceImplTest.java:140)
>     at ca.cloudscraper.core.tests.LuceneServiceImplTest.main(LuceneServiceImplTest.java:176)
> Caused by: org.apache.tika.exception.TikaException: Failed to parse a Java class
>     at org.apache.tika.parser.asm.XHTMLClassVisitor.parse(XHTMLClassVisitor.java:66)
>     at org.apache.tika.parser.asm.ClassParser.parse(ClassParser.java:51)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:221)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>     at org.objectweb.asm.ClassReader.readClass(ClassReader.java:2157)
>     at org.objectweb.asm.ClassReader.accept(ClassReader.java:542)
>     at org.objectweb.asm.ClassReader.accept(ClassReader.java:506)
>     at org.apache.tika.parser.asm.XHTMLClassVisitor.parse(XHTMLClassVisitor.java:61)
>     ... 6 more
> Seems like Tika tries to parse this file as Java class file, but that
> obviously doesn't work.
> I've tried to create custom-mimetypes.xml file like this :
> <?xml version="1.0" encoding="UTF-8"?>
> <mime-info>
>   <mime-type type="application/octet-stream">
>     <_comment>Mac OSX jnilib</_comment>
>     <glob pattern="*.jnilib"/>
>   </mime-type>
> </mime-info>
> and after I repack tika-app-1.4.jar with this file in org.apache.tika.mime
> folder, the problem still exists.
> Jnilib file is actually from the ActiveMQ 5.8.0 binary found in
> bin/macosx/libwrapper.jnilib



--
This message was sent by Atlassian JIRA
(v6.2#6252)
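
For illustration only: the four magic values listed in the commit message can be checked against the first four bytes of a file roughly as sketched below. This is not the code from the pull request and not Tika's detection API; the class name `MachOMagic` and the method `hasMachOMagic` are hypothetical, and the sketch assumes the header bytes are read as a big-endian 32-bit value so that the MH_CIGAM variants match byte-swapped headers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Tika: compares a file header
// against the four Mach-O magic values named in the commit message.
final class MachOMagic {
    static final int MH_MAGIC    = 0xfeedface; // 32-bit, same byte order as reader
    static final int MH_CIGAM    = 0xcefaedfe; // 32-bit, byte-swapped
    static final int MH_MAGIC_64 = 0xfeedfacf; // 64-bit, same byte order as reader
    static final int MH_CIGAM_64 = 0xcffaedfe; // 64-bit, byte-swapped

    // Returns true if the first four bytes equal one of the four magics.
    static boolean hasMachOMagic(byte[] header) {
        if (header == null || header.length < 4) {
            return false; // too short to contain a magic number
        }
        // Assumption: interpret the leading bytes as one big-endian int.
        int magic = ByteBuffer.wrap(header, 0, 4)
                              .order(ByteOrder.BIG_ENDIAN)
                              .getInt();
        return magic == MH_MAGIC || magic == MH_CIGAM
            || magic == MH_MAGIC_64 || magic == MH_CIGAM_64;
    }

    public static void main(String[] args) {
        // Leading bytes cf fa ed fe correspond to MH_CIGAM_64.
        byte[] sample = {(byte) 0xcf, (byte) 0xfa, (byte) 0xed, (byte) 0xfe};
        System.out.println(hasMachOMagic(sample));
    }
}
```

A check like this only inspects the leading bytes; it says nothing about whether the rest of the file is a well-formed Mach-O binary.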