[ https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085128#comment-16085128 ]

Luis Filipe Nassif commented on TIKA-2428:
------------------------------------------

The issue seems to be at the POI level. Threads are stuck at:
{code}
java.lang.Thread.State: RUNNABLE
        at java.io.FileInputStream.skip(Native Method)
        at java.io.BufferedInputStream.skip(Unknown Source)
        - locked <0x0000000717f30ac0> (a java.io.BufferedInputStream)
        at org.apache.tika.io.ProxyInputStream.skip(ProxyInputStream.java:117)
        at org.apache.tika.io.TikaInputStream.skip(TikaInputStream.java:655)
        at java.io.FilterInputStream.skip(Unknown Source)
        at org.apache.poi.util.IOUtils.skipFully(IOUtils.java:364)
        at org.apache.poi.hemf.record.UnimplementedHemfRecord.init(UnimplementedHemfRecord.java:43)
        at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator._next(HemfExtractor.java:101)
        at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator.next(HemfExtractor.java:77)
        at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator.next(HemfExtractor.java:60)
        at org.apache.tika.parser.microsoft.EMFParser.parse(EMFParser.java:82)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at dpf.sp.gpinf.indexer.parsers.IndexerDefaultParser.parse(IndexerDefaultParser.java:150)
        at dpf.sp.gpinf.indexer.io.ParsingReader$ParsingTask.run(ParsingReader.java:263)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

{code}
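
For illustration only, here is a minimal sketch of a skip loop that cannot spin without making progress. It is not POI's actual IOUtils.skipFully; the class name BoundedSkip and its behaviour are hypothetical. It also assumes one possible failure mode (skip() repeatedly returning 0, or a corrupted record length), which the trace above does not confirm.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI or Tika.
final class BoundedSkip {
    private BoundedSkip() {}

    /**
     * Skips exactly toSkip bytes or throws. Every loop iteration either
     * consumes at least one byte or fails, so the loop always terminates.
     */
    static void skipFully(InputStream in, long toSkip) throws IOException {
        if (toSkip < 0) {
            // A negative length can only come from a corrupted record header.
            throw new IllegalArgumentException("negative skip length: " + toSkip);
        }
        long remaining = toSkip;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
                continue;
            }
            // skip() made no progress: read one byte to tell a stalled
            // stream apart from end of stream.
            if (in.read() == -1) {
                throw new EOFException(
                        "stream ended with " + remaining + " bytes left to skip");
            }
            remaining--;
        }
    }
}
```

A caller such as a record iterator could catch the EOFException and stop iterating instead of hanging. Note that java.io.FileInputStream.skip may report bytes skipped beyond the end of the file, so a check of the record length against the known stream size may still be needed on top of this.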

> EMFParser loops forever with corrupted files
> --------------------------------------------
>
>                 Key: TIKA-2428
>                 URL: https://issues.apache.org/jira/browse/TIKA-2428
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.15, 1.16
>            Reporter: Luis Filipe Nassif
>         Attachments: Carved-1285676.emf, Carved-1296288.emf, Carved-912866.emf
>
>
> EMFParser hangs with the attached corrupted EMF files.
> Sorry [~talli...@apache.org]! I only now have time to test against our
> forensic test corpus...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
