[ 
https://issues.apache.org/jira/browse/TIKA-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400934#comment-17400934
 ] 

Nick Burch commented on TIKA-3528:
----------------------------------

The ASF specification from Microsoft lists the following well-known 
stream types:
 * ASF_Audio_Media - F8699E40-5B4D-11CF-A8FD-00805F5C442B
 * ASF_Video_Media - BC19EFC0-5B4D-11CF-A8FD-00805F5C442B
 * ASF_Command_Media - 59DACFC0-59E6-11D0-A3AC-00A0C90348F6
 * ASF_JFIF_Media - B61BE100-5B4E-11CF-A8FD-00805F5C442B
 * ASF_Degradable_JPEG_Media - 35907DE0-E415-11CF-A917-00805F5C442B
 * ASF_File_Transfer_Media - 91BD222C-F21C-497A-8B6D-5AA86BFC0185
 * ASF_Binary_Media - 3AFB65E2-47EF-40F2-AC2C-70A90D71D343

The stream type is the third field in a Stream Properties Object, after the 
ASF_Stream_Properties_Object GUID of B7DC0791-A9B7-11CF-8EE6-00C00C205365 and 
a 64-bit object size, so we would need a masked magic match (and a long one at 
that!). There's some endian "fun" as well in building a compound two-GUID match 
with a gap in the middle. Any volunteers? :)
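
As a rough illustration of the layout described above (not Tika's magic 
syntax, just a sketch), the following Python builds the 40-byte value and mask 
for one stream type and scans a buffer for it. It assumes GUIDs are stored with 
their first three fields little-endian, which is what Python's 
uuid.UUID.bytes_le produces. The helper names, and the choice to scan the whole 
buffer rather than a fixed offset range, are assumptions made for the example.

```python
import uuid

# GUIDs copied from the list in this comment.
STREAM_PROPERTIES_OBJECT = "B7DC0791-A9B7-11CF-8EE6-00C00C205365"
STREAM_TYPES = {
    "ASF_Audio_Media": "F8699E40-5B4D-11CF-A8FD-00805F5C442B",
    "ASF_Video_Media": "BC19EFC0-5B4D-11CF-A8FD-00805F5C442B",
}


def guid_to_asf_bytes(guid):
    """Return the 16 bytes of a GUID with its first three fields
    byte-swapped to little-endian (assumed on-disk form)."""
    return uuid.UUID(guid).bytes_le


def build_pattern(stream_type_guid):
    """Build (value, mask) covering: object GUID, 64-bit size, stream type GUID.
    The 8 size bytes are masked out because they vary from file to file."""
    value = (guid_to_asf_bytes(STREAM_PROPERTIES_OBJECT)
             + b"\x00" * 8
             + guid_to_asf_bytes(stream_type_guid))
    mask = b"\xff" * 16 + b"\x00" * 8 + b"\xff" * 16
    return value, mask


def masked_find(data, value, mask):
    """Return the first offset where data matches value under mask, or -1."""
    n = len(value)
    for off in range(len(data) - n + 1):
        window = data[off:off + n]
        if all((b & m) == (v & m) for b, v, m in zip(window, value, mask)):
            return off
    return -1


def stream_types_present(data):
    """List the known stream type names whose pattern occurs in data."""
    return [name for name, guid in STREAM_TYPES.items()
            if masked_find(data, *build_pattern(guid)) >= 0]
```

A file that carries both an audio and a video stream would report both names, 
so a detector built on this would still need to prefer video over audio when 
both are present.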

We would also need to find / define mime types for the extra types - 
[https://docs.microsoft.com/en-us/windows/win32/wmp/file-name-extensions] 
doesn't seem to have recommended types for many of these.

> WMV file detected as WMA (audio/x-ms-wma)
> -----------------------------------------
>
>                 Key: TIKA-3528
>                 URL: https://issues.apache.org/jira/browse/TIKA-3528
>             Project: Tika
>          Issue Type: Bug
>          Components: mime
>            Reporter: Nitish Gupta
>            Priority: Major
>
> Attached file is detected as "audio/x-ms-wma" instead of "video/x-ms-asf".
> Link : 
> [https://drive.google.com/file/d/1yB1_RcMxINHSs2s5AQHG4QrEdGWzJwy6/view?usp=sharing]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
