[ 
https://issues.apache.org/jira/browse/TIKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295333#comment-17295333
 ] 

Nick Burch commented on TIKA-3310:
----------------------------------

FYI, there are a few unrelated changes in the pull request, including InputStream 
changes and a lot of whitespace changes, which make it harder to review than 
it should be.

In terms of the logic, I'd prefer two passes. I'm not sure it would make 
any difference in practice, but it is probably the safest option without a full 
review of the spec and a large set of sample files. Keep the current major brand 
check first. Only if the major brand matches none of the known values, check the 
compatible brands for a match.
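
A minimal sketch of the two-pass idea, to make the ordering concrete. This is not the MP4Parser code: the class name BrandTypeResolver, the method resolve, and the small brand table are hypothetical and only illustrative, and the real brand-to-type mapping in Tika may differ.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not Tika's MP4Parser.
final class BrandTypeResolver {

    // Illustrative brand-to-type table (assumption). The table used by the
    // real parser may contain different or additional brands.
    private static final Map<String, String> BRAND_TYPES = Map.of(
            "mp41", "video/mp4",
            "mp42", "video/mp4",
            "M4A ", "audio/mp4",
            "qt  ", "video/quicktime");

    static final String GENERIC_TYPE = "application/mp4";

    // Null-safe lookup: immutable maps throw on a null key.
    private static String lookup(String brand) {
        return brand == null ? null : BRAND_TYPES.get(brand);
    }

    static String resolve(String majorBrand, List<String> compatibleBrands) {
        // Pass 1: the major brand always wins if it is a known brand.
        String type = lookup(majorBrand);
        if (type != null) {
            return type;
        }
        // Pass 2: only reached when the major brand is unknown.
        // The first compatible brand with a known type is used.
        for (String brand : compatibleBrands) {
            type = lookup(brand);
            if (type != null) {
                return type;
            }
        }
        // Neither pass matched: keep the generic fallback.
        return GENERIC_TYPE;
    }
}
```

With this table, a file whose major brand is isom and whose compatible brands include mp41 resolves to video/mp4, while a file with a known major brand never has its compatible brands consulted.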

> MP4 video detected as application/mp4
> -------------------------------------
>
>                 Key: TIKA-3310
>                 URL: https://issues.apache.org/jira/browse/TIKA-3310
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Peter Kronenberg
>            Priority: Major
>         Attachments: sample-movie.mp4
>
>
> The attached file is an MP4 video.  When running _new Tika().detect()_ it 
> returns _video/quicktime_.   But when actually running it through the 
> MP4Parser, it returns a very generic _application/mp4_.
>  
> Looking at the code, the generic type arises because the 
> _majorBrand_ of my file is _isom_, which doesn’t match any of the expected 
> values, so it defaults to _application/mp4_. Now, I know absolutely nothing 
> about MP4 encoding. But looking further, I see there’s a list of 
> _compatibleBrands_, which in my case includes _mp41_, which would match the 
> expected type of _video/mp4_.
> I coded this up so that if the major brand does not match one of the expected 
> values, it checks whether any of the compatible brands match, and uses the 
> first one it finds.
> Is this a proper solution?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
