[
https://issues.apache.org/jira/browse/TIKA-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933086#comment-13933086
]
Fabian Lange commented on TIKA-1113:
------------------------------------
Hi,
this is partly Tika's fault as well.
The upstream detector has no support for video yet (todo) and as such returns
"application/ogg".
Next up is the Tika mime detector, which detects "audio/ogg".
As this is more specialised, it wins and causes the upstream audio detector
to kick in. As the file is a video, it fails.
I tried to fix the magic file for audio/ogg but failed to do so (overriding
the default mimetypes via custom etc. didn't work).
I found that there is a mime magic list for the ogg types at
https://bugs.freedesktop.org/show_bug.cgi?id=1002 that should be used.
Could you fix the mime magic detector? A 400 KB test file for video is here:
http://techslides.com/demos/sample-videos/small.ogv
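
To illustrate the kind of magic check meant above, here is a minimal sketch, not Tika's actual detector. It assumes the first Ogg page is enough to decide: the page starts with "OggS", the segment count sits at offset 26, and the first packet follows the segment table. A Theora identification header starts with 0x80 "theora", a Vorbis one with 0x01 "vorbis". The class name OggSniffer and the returned type strings are hypothetical and only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Tika or of the upstream Ogg parser library.
final class OggSniffer {
    // Every Ogg page starts with the capture pattern "OggS".
    private static final byte[] CAPTURE = ascii("OggS");
    // Fixed part of the page header; the segment count is the byte at offset 26.
    private static final int FIXED_HEADER = 27;

    /** Returns a media type guess, or null if the bytes do not start an Ogg page. */
    static String sniff(byte[] head) {
        if (head.length < FIXED_HEADER || !regionEquals(head, 0, CAPTURE)) {
            return null;
        }
        int segmentCount = head[26] & 0xFF;
        // The first packet starts right after the segment table.
        int payload = FIXED_HEADER + segmentCount;
        if (hasCodecId(head, payload, 0x80, "theora")) {
            return "video/ogg";
        }
        if (hasCodecId(head, payload, 0x01, "vorbis")) {
            return "audio/ogg";
        }
        // Unknown codec: fall back to the generic container type.
        return "application/ogg";
    }

    // True if data[offset] is the packet type byte and the codec name follows it.
    private static boolean hasCodecId(byte[] data, int offset, int packetType, String name) {
        if (offset >= data.length || (data[offset] & 0xFF) != packetType) {
            return false;
        }
        return regionEquals(data, offset + 1, ascii(name));
    }

    private static boolean regionEquals(byte[] data, int offset, byte[] expected) {
        if (data.length - offset < expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

Note that a multiplexed file has one beginning-of-stream page per logical stream, and this sketch only inspects the first one, so a file whose first page is the Vorbis stream would still be reported as audio. A real fix would have to look at all initial pages.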
> Parsing for OGV file results in java.lang.ClassCastException
> ------------------------------------------------------------
>
> Key: TIKA-1113
> URL: https://issues.apache.org/jira/browse/TIKA-1113
> Project: Tika
> Issue Type: Bug
> Components: metadata, parser
> Affects Versions: 1.3
> Environment: OS X 10.8.3
> JDK 1.6.0_45 64-bit
> Reporter: Alexander Chow
>
> When parsing any OGV file (e.g.,
> [gizmo.ogv|http://www.808.dk/pics/video/gizmo.ogv]), the log outputs
> something like the following:
> {code}
> Warning - invalid checksum on page 2 of stream 3f1 (1009)
> Warning - invalid checksum on page 3 of stream 3f1 (1009)
> Warning - invalid checksum on page 4 of stream 3f1 (1009)
> Warning - invalid checksum on page 5 of stream 3f1 (1009)
> Warning - invalid checksum on page 6 of stream 3f1 (1009)
> Warning - invalid checksum on page 7 of stream 3f1 (1009)
> Warning - invalid checksum on page 22 of stream 3f1 (1009)
> Warning - invalid checksum on page 33 of stream 3f1 (1009)
> Warning - invalid checksum on page 34 of stream 3f1 (1009)
> Warning - invalid checksum on page 35 of stream 3f1 (1009)
> Warning - invalid checksum on page 36 of stream 3f1 (1009)
> Warning - invalid checksum on page 37 of stream 3f1 (1009)
> Warning - invalid checksum on page 38 of stream 3f1 (1009)
> Warning - invalid checksum on page 52 of stream 3f1 (1009)
> Warning - invalid checksum on page 65 of stream 3f1 (1009)
> Warning - invalid checksum on page 69 of stream 3f1 (1009)
> Warning - invalid checksum on page 75 of stream 3f1 (1009)
> Warning - invalid checksum on page 76 of stream 3f1 (1009)
> Warning - invalid checksum on page 77 of stream 3f1 (1009)
> Warning - invalid checksum on page 78 of stream 3f1 (1009)
> Warning - invalid checksum on page 79 of stream 3f1 (1009)
> Warning - invalid checksum on page 80 of stream 3f1 (1009)
> Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.gagravarr.tika.VorbisParser@7c29e357
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at com.test.OGVTest.main(OGVTest.java:31)
> Caused by: java.lang.ClassCastException: org.gagravarr.vorbis.VorbisAudioData cannot be cast to org.gagravarr.vorbis.VorbisInfo
>     at org.gagravarr.vorbis.VorbisFile.<init>(VorbisFile.java:78)
>     at org.gagravarr.vorbis.VorbisFile.<init>(VorbisFile.java:55)
>     at org.gagravarr.tika.VorbisParser.parse(VorbisParser.java:58)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     ... 3 more
> {code}
> The test code is the following:
> {code:title=OGVTest.java}
> void parse(String fileName) throws Exception {
>     InputStream inputStream = new FileInputStream(fileName);
>
>     Metadata metadata = new Metadata();
>
>     Parser parser = new AutoDetectParser();
>
>     ParseContext parserContext = new ParseContext();
>     parserContext.set(Parser.class, parser);
>     ContentHandler contentHandler = new WriteOutContentHandler(
>             new DummyWriter());
>     parser.parse(inputStream, contentHandler, metadata,
>             parserContext);
>
>     System.out.println(metadata);
> }
> {code}