[ https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085128#comment-16085128 ]
Luis Filipe Nassif commented on TIKA-2428:
------------------------------------------

Seems like the issue is at POI level. Threads are stuck at:
{code}
java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.skip(Native Method)
	at java.io.BufferedInputStream.skip(Unknown Source)
	- locked <0x0000000717f30ac0> (a java.io.BufferedInputStream)
	at org.apache.tika.io.ProxyInputStream.skip(ProxyInputStream.java:117)
	at org.apache.tika.io.TikaInputStream.skip(TikaInputStream.java:655)
	at java.io.FilterInputStream.skip(Unknown Source)
	at org.apache.poi.util.IOUtils.skipFully(IOUtils.java:364)
	at org.apache.poi.hemf.record.UnimplementedHemfRecord.init(UnimplementedHemfRecord.java:43)
	at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator._next(HemfExtractor.java:101)
	at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator.next(HemfExtractor.java:77)
	at org.apache.poi.hemf.extractor.HemfExtractor$HemfRecordIterator.next(HemfExtractor.java:60)
	at org.apache.tika.parser.microsoft.EMFParser.parse(EMFParser.java:82)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	at dpf.sp.gpinf.indexer.parsers.IndexerDefaultParser.parse(IndexerDefaultParser.java:150)
	at dpf.sp.gpinf.indexer.io.ParsingReader$ParsingTask.run(ParsingReader.java:263)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{code}

> EMFParser loops forever with corrupted files
> --------------------------------------------
>
>                 Key: TIKA-2428
>                 URL: https://issues.apache.org/jira/browse/TIKA-2428
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 1.15, 1.16
>            Reporter: Luis Filipe Nassif
>         Attachments: Carved-1285676.emf, Carved-1296288.emf, Carved-912866.emf
>
>
> EMFParser hangs with the attached corrupted EMF files.
> Sorry [~talli...@apache.org]! Just now having time to test against our
> forensic test corpus...

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
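
For illustration only: the trace above ends in a skip loop driven by a record length read from the file. A minimal sketch of a defensive skip follows, assuming the hang comes from an implausible length field in the corrupted EMF records (the root cause is not confirmed in this thread). The class `BoundedSkip`, the method `skipOrFail`, and the `maxRecordLength` cap are hypothetical names, not part of POI or Tika, and this is not the fix applied upstream.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI or Tika.
class BoundedSkip {

    /**
     * Skips exactly toSkip bytes or throws.
     * maxRecordLength is an assumed sanity cap chosen by the caller.
     */
    static void skipOrFail(InputStream in, long toSkip, long maxRecordLength)
            throws IOException {
        // Reject lengths that cannot be a real record before touching the stream.
        if (toSkip < 0 || toSkip > maxRecordLength) {
            throw new IOException("Implausible record length: " + toSkip);
        }
        long remaining = toSkip;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream.
                throw new EOFException(remaining + " bytes left to skip at end of stream");
            } else {
                // read() consumed one byte, so the loop still makes progress.
                remaining--;
            }
        }
        // Note: FileInputStream.skip() is documented as able to skip past the
        // end of the file, so on such a stream only the length cap above guards
        // against huge bogus lengths; the EOF check may never trigger.
    }
}
```

A caller iterating over records would pass the declared record length as `toSkip` and treat either exception as a sign that the file is truncated or corrupted, stopping the iteration instead of continuing to read.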