[ 
https://issues.apache.org/jira/browse/TIKA-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970579#action_12970579
 ] 

Nick Burch commented on TIKA-570:
---------------------------------

Reading http://en.wikipedia.org/wiki/BMP_file_format, I'm not sure what else 
we can rely on finding, but I'm tempted to say we should also require either 
"00 00" or "00 00 00" within the first few KB. A text file shouldn't contain 
that many nulls, but most bitmaps will.
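
A minimal sketch of that extra check, assuming the current detection relies 
only on a leading "BM" signature. The class name BmpNullCheck, the method 
looksLikeBmp and the 4096-byte window are hypothetical, chosen for 
illustration; none of this is existing Tika code.

```java
final class BmpNullCheck {
    // Hypothetical stand-in for "the first few KB"; no exact figure was proposed.
    static final int SCAN_LIMIT = 4096;

    /**
     * Returns true only if the data starts with "BM" and contains a run of
     * at least minRun consecutive 0x00 bytes within the first SCAN_LIMIT bytes.
     * Use minRun = 2 for the "00 00" variant and minRun = 3 for "00 00 00".
     */
    static boolean looksLikeBmp(byte[] data, int minRun) {
        if (data.length < 2 || data[0] != 'B' || data[1] != 'M') {
            return false; // no BMP signature at offset 0
        }
        int limit = Math.min(data.length, SCAN_LIMIT);
        int run = 0;
        for (int i = 2; i < limit; i++) {
            // Extend the current run of nulls, or reset it on any other byte.
            run = (data[i] == 0) ? run + 1 : 0;
            if (run >= minRun) {
                return true;
            }
        }
        return false;
    }
}
```

With this, ordinary text that merely begins with "BM" is rejected because it 
has no null run, while a typical bitmap header, whose 32-bit little-endian 
fields are mostly zero-padded, still passes.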

> If this is a BMP, my name is horatio alger
> ------------------------------------------
>
>                 Key: TIKA-570
>                 URL: https://issues.apache.org/jira/browse/TIKA-570
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Benson Margulies
>         Attachments: C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt, 
> C80A5295-EFC7-44DD-9A39-B882D1EC6F38.txt
>
>
> I am attaching a file which Tika is identifying as a bmp. It contains 
> ordinary text.
>  
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.image.imagepar...@20a19811
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
>       at com.basistech.jug.FileHarvester.process(FileHarvester.java:204)
>       at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:165)
>       at com.basistech.jug.FileHarvester.harvestDir(FileHarvester.java:179)
>       at com.basistech.jug.FileHarvester.harvest(FileHarvester.java:135)
>       at com.basistech.jug.FileHarvester.run(FileHarvester.java:247)
>       at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.RuntimeException: New BMP version not implemented yet.
>       at com.sun.imageio.plugins.bmp.BMPImageReader.readHeader(BMPImageReader.java:462)
>       at com.sun.imageio.plugins.bmp.BMPImageReader.getWidth(BMPImageReader.java:174)
>       at org.apache.tika.parser.image.ImageParser.parse(ImageParser.java:75)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>       ... 8 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
