[ 
https://issues.apache.org/jira/browse/TIKA-570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Douglas updated TIKA-570:
----------------------------------

    Attachment: TIKA-570.patch

I am attaching a patch that encodes the "BM" prefix, the color planes 
signature, and the possible bit count values in tika-mimetypes.xml. Because 
the rule still requires the "BM" magic, it should not conflict with the OS/2 
variants, which use different magic values such as "BA" and "CI".
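
For readers who want to see the idea outside of tika-mimetypes.xml, here is a 
rough Java sketch of the same three checks. It is not the patch itself. The 
class name, the method names, the byte offsets (26 for color planes, 28 for 
bit count, assuming a 14-byte file header followed by a Windows 
BITMAPINFOHEADER), and the accepted bit counts (1, 4, 8, 16, 24, 32) are all 
assumptions made for illustration and may differ from what the patch encodes.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a hand-written version of the stricter BMP check.
// All names and offsets here are hypothetical, not taken from the patch.
final class BmpMagicSketch {
    // Assumed layout: 14-byte file header, then a BITMAPINFOHEADER.
    private static final int PLANES_OFFSET = 26;
    private static final int BIT_COUNT_OFFSET = 28;

    // Reads an unsigned 16-bit little-endian value.
    static int readUInt16LE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    // Returns true only if all three signatures match.
    static boolean looksLikeBmp(byte[] head) {
        if (head.length < BIT_COUNT_OFFSET + 2) {
            return false; // not enough bytes to inspect the header fields
        }
        if (head[0] != 'B' || head[1] != 'M') {
            return false; // OS/2 variants such as "BA" or "CI" stop here
        }
        if (readUInt16LE(head, PLANES_OFFSET) != 1) {
            return false; // color planes field is expected to be 1
        }
        switch (readUInt16LE(head, BIT_COUNT_OFFSET)) {
            case 1: case 4: case 8: case 16: case 24: case 32:
                return true; // assumed set of plausible bit counts
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // Plain text that merely begins with "BM".
        byte[] text = "BM is how this ordinary text file happens to begin."
                .getBytes(StandardCharsets.US_ASCII);
        // Minimal synthetic header: "BM", planes = 1, bit count = 24.
        byte[] bmp = new byte[30];
        bmp[0] = 'B';
        bmp[1] = 'M';
        bmp[PLANES_OFFSET] = 1;
        bmp[BIT_COUNT_OFFSET] = 24;
        System.out.println("text: " + looksLikeBmp(text));
        System.out.println("bmp:  " + looksLikeBmp(bmp));
    }
}
```

The point of the extra fields is that two ASCII letters alone are a weak 
signature. Requiring the planes and bit count fields to hold plausible values 
as well makes it much less likely that ordinary text starting with "BM" is 
mistaken for an image.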

This patch adds the original text file to the test document set and confirms in 
the unit test that it is not detected as a bitmap.

> If this is a BMP, my name is horatio alger
> ------------------------------------------
>
>                 Key: TIKA-570
>                 URL: https://issues.apache.org/jira/browse/TIKA-570
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Benson Margulies
>         Attachments: C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, 
> C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, TIKA-570.patch
>
>
> I am attaching a file which Tika is identifying as a bmp. It contains 
> ordinary text.
>  
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.image.imagepar...@20a19811
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
>       at com.basistech.jug.FileHarvester.process(FileHarvester.java:204)
>       at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:165)
>       at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:179)
>       at com.basistech.jug.FileHarvester.harvest(FileHarvester.java:135)
>       at com.basistech.jug.FileHarvester.run(FileHarvester.java:247)
>       at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.RuntimeException: New BMP version not implemented yet.
>       at com.sun.imageio.plugins.bmp.BMPImageReader.readHeader(BMPImageReader.java:462)
>       at com.sun.imageio.plugins.bmp.BMPImageReader.getWidth(BMPImageReader.java:174)
>       at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:75)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>       ... 8 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
