[
https://issues.apache.org/jira/browse/TIKA-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801937#comment-13801937
]
Marius Dumitru Florea edited comment on TIKA-1179 at 10/22/13 3:44 PM:
-----------------------------------------------------------------------
I confirm 1.5-SNAPSHOT fixes the problem. Do you know when the 1.5 release is
planned?
was (Author: mflorea):
I confirm 1.5-SNAPSHOT fixes the problem. Do yo
> A corrupt mp3 file can cause an infinite loop in Mp3Parser
> ----------------------------------------------------------
>
> Key: TIKA-1179
> URL: https://issues.apache.org/jira/browse/TIKA-1179
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.4
> Reporter: Marius Dumitru Florea
> Assignee: Ray Gauss II
> Fix For: 1.5
>
> Attachments: corrupt.mp3
>
>
> I have a thread that indexes (among other things) files using Apache Solr.
> This thread hangs (still running but making no progress) when trying to extract
> metadata from the mp3 file attached to this issue. Here are a couple of
> thread dumps taken at various moments:
> {noformat}
> "XWiki Solr index thread" daemon prio=10 tid=0x0000000003b72800 nid=0x64b5 runnable [0x00007f46f4617000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.commons.io.input.AutoCloseInputStream.close(AutoCloseInputStream.java:63)
> at org.apache.commons.io.input.AutoCloseInputStream.afterRead(AutoCloseInputStream.java:77)
> at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:99)
> at java.io.BufferedInputStream.fill(Unknown Source)
> at java.io.BufferedInputStream.read1(Unknown Source)
> at java.io.BufferedInputStream.read(Unknown Source)
> - locked <0x00000000cb7094e8> (a java.io.BufferedInputStream)
> at org.apache.tika.io.ProxyInputStream.read(ProxyInputStream.java:99)
> at java.io.FilterInputStream.read(Unknown Source)
> at org.apache.tika.io.TailStream.read(TailStream.java:117)
> at org.apache.tika.io.TailStream.skip(TailStream.java:140)
> at org.apache.tika.parser.mp3.MpegStream.skipStream(MpegStream.java:283)
> at org.apache.tika.parser.mp3.MpegStream.skipFrame(MpegStream.java:160)
> at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:193)
> at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:71)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at org.apache.tika.Tika.parseToString(Tika.java:380)
> ...
> {noformat}
> {noformat}
> "XWiki Solr index thread" daemon prio=10 tid=0x0000000003b72800 nid=0x64b5 runnable [0x00007f46f4618000]
> java.lang.Thread.State: RUNNABLE
> at org.apache.tika.io.TailStream.skip(TailStream.java:133)
> at org.apache.tika.parser.mp3.MpegStream.skipStream(MpegStream.java:283)
> at org.apache.tika.parser.mp3.MpegStream.skipFrame(MpegStream.java:160)
> at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:193)
> at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:71)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at org.apache.tika.Tika.parseToString(Tika.java:380)
> ...
> {noformat}
> {noformat}
> "XWiki Solr index thread" daemon prio=10 tid=0x0000000003b72800 nid=0x64b5 runnable [0x00007f46f4617000]
> java.lang.Thread.State: RUNNABLE
> at java.io.BufferedInputStream.read1(Unknown Source)
> at java.io.BufferedInputStream.read(Unknown Source)
> - locked <0x00000000cb1be170> (a java.io.BufferedInputStream)
> at org.apache.tika.io.ProxyInputStream.read(ProxyInputStream.java:99)
> at java.io.FilterInputStream.read(Unknown Source)
> at org.apache.tika.io.TailStream.read(TailStream.java:117)
> at org.apache.tika.io.TailStream.skip(TailStream.java:140)
> at org.apache.tika.parser.mp3.MpegStream.skipStream(MpegStream.java:283)
> at org.apache.tika.parser.mp3.MpegStream.skipFrame(MpegStream.java:160)
> at org.apache.tika.parser.mp3.Mp3Parser.getAllTagHandlers(Mp3Parser.java:193)
> at org.apache.tika.parser.mp3.Mp3Parser.parse(Mp3Parser.java:71)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at org.apache.tika.Tika.parseToString(Tika.java:380)
> ...
> {noformat}
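All three dumps show the thread inside TailStream.skip, reached from MpegStream.skipStream, which suggests a skip loop that keeps running without making progress. The sketch below is not the Tika code and not the fix that went into 1.5. It only illustrates, under that assumption, how a read-based skip can be written so that it terminates on a truncated stream. The class name BoundedSkip and the method skipByReading are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedSkip {
    /**
     * Skips up to count bytes by reading them and returns how many bytes
     * were actually consumed. The loop ends as soon as read() stops
     * delivering data, so a truncated file cannot make it spin.
     * (Hypothetical helper, not part of Tika.)
     */
    static long skipByReading(InputStream in, long count) throws IOException {
        byte[] buffer = new byte[4096];
        long skipped = 0;
        while (skipped < count) {
            int want = (int) Math.min(buffer.length, count - skipped);
            int n = in.read(buffer, 0, want);
            if (n <= 0) {
                break; // end of stream or no progress: give up instead of looping
            }
            skipped += n;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        // A 10-byte stream standing in for a corrupt file whose frame
        // header claims far more data than is actually present.
        InputStream truncated = new ByteArrayInputStream(new byte[10]);
        System.out.println(skipByReading(truncated, 1000)); // prints 10
    }
}
```

A caller can compare the return value with the requested count to detect that the frame was cut short and stop parsing.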
> This makes our Solr indexer very fragile: the hung thread cannot index any
> other files, which leads to incomplete search results.
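Until a fixed release is available, an indexer could protect itself by running each extraction with a time limit. The following is a minimal sketch of that idea using only java.util.concurrent. The class GuardedExtraction, the method extractOrFallback, and the 200 ms timeout are assumptions made for illustration, and the stuck task is a stand-in rather than a real Tika call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class GuardedExtraction {
    // Daemon worker threads, so a stuck extraction cannot keep the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "guarded-extraction");
        t.setDaemon(true);
        return t;
    });

    /**
     * Runs the task and returns its result, or the fallback if the task
     * fails or does not finish within timeoutMillis.
     */
    static String extractOrFallback(Callable<String> task, long timeoutMillis, String fallback) {
        Future<String> future = WORKERS.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. A loop that never checks the interrupt
            // flag keeps running, but the caller is no longer blocked by it.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a parser that never returns (no Tika call is made here).
        Callable<String> stuck = () -> {
            while (true) {
                Thread.sleep(10);
            }
        };
        System.out.println(extractOrFallback(stuck, 200, "<no text extracted>"));
        System.out.println(extractOrFallback(() -> "some extracted text", 200, "<no text extracted>"));
    }
}
```

In an indexer, the task would wrap the real extraction call, for example the Tika.parseToString call seen in the thread dumps. Note that a worker stuck in a loop that ignores interruption still consumes a CPU core, so this only keeps the indexing thread moving; it does not replace the parser fix.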