[ https://issues.apache.org/jira/browse/TIKA-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400934#comment-17400934 ]
Nick Burch commented on TIKA-3528:
----------------------------------
Microsoft's specification document lists the following well-known stream
types:
* ASF_Audio_Media - F8699E40-5B4D-11CF-A8FD-00805F5C442B
* ASF_Video_Media - BC19EFC0-5B4D-11CF-A8FD-00805F5C442B
* ASF_Command_Media - 59DACFC0-59E6-11D0-A3AC-00A0C90348F6
* ASF_JFIF_Media - B61BE100-5B4E-11CF-A8FD-00805F5C442B
* ASF_Degradable_JPEG_Media - 35907DE0-E415-11CF-A917-00805F5C442B
* ASF_File_Transfer_Media - 91BD222C-F21C-497A-8B6D-5AA86BFC0185
* ASF_Binary_Media - 3AFB65E2-47EF-40F2-AC2C-70A90D71D343
The stream type is the third field in a Stream Properties Object, after the
ASF_Stream_Properties_Object GUID of B7DC0791-A9B7-11CF-8EE6-00C00C205365 and
a 64-bit size, so detection would need a masked magic match (and a long one at
that!). There's also some endian "fun" in building a compound two-GUID match
with a gap in the middle. Any volunteers? :)
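
To make the layout concrete, here is a rough Python sketch of the compound
match. It is not Tika's mime-magic syntax and not a proposed patch; the helper
names (guid_on_disk, compound_pattern, matches_at) are made up for
illustration. It assumes the usual on-disk GUID encoding, where the first three
fields are stored little-endian and the last eight bytes are stored as written,
which is what Python's uuid.UUID.bytes_le produces.

```python
import uuid

# GUIDs quoted from the comment above.
STREAM_PROPERTIES_OBJECT = "B7DC0791-A9B7-11CF-8EE6-00C00C205365"
ASF_AUDIO_MEDIA = "F8699E40-5B4D-11CF-A8FD-00805F5C442B"
ASF_VIDEO_MEDIA = "BC19EFC0-5B4D-11CF-A8FD-00805F5C442B"


def guid_on_disk(text):
    """Return the 16 bytes of a GUID as stored in the file.

    Assumption: the first three fields are little-endian and the
    remaining eight bytes are kept in the order they are written.
    """
    return uuid.UUID(text).bytes_le


def compound_pattern(stream_type):
    """Build (value, mask) covering: object GUID, 64-bit size, stream type GUID.

    The eight size bytes are masked out (mask 0x00) because the size
    varies from file to file.
    """
    value = (guid_on_disk(STREAM_PROPERTIES_OBJECT)
             + bytes(8)
             + guid_on_disk(stream_type))
    mask = b"\xff" * 16 + b"\x00" * 8 + b"\xff" * 16
    return value, mask


def matches_at(data, offset, value, mask):
    """Return True if data matches value under mask, starting at offset."""
    window = data[offset:offset + len(value)]
    if len(window) < len(value):
        return False
    return all((b & m) == (v & m) for b, v, m in zip(window, value, mask))
```

The resulting pattern is 40 bytes long (16 + 8 + 16). Where the Stream
Properties Object sits inside a given file is not fixed here, so a real
detector would still have to scan a range of offsets rather than test a single
position.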
We would also need to find or define MIME types for the extra stream types.
[https://docs.microsoft.com/en-us/windows/win32/wmp/file-name-extensions]
doesn't seem to recommend types for many of these.
> WMV file detected as WMA (audio/x-ms-wma)
> -----------------------------------------
>
> Key: TIKA-3528
> URL: https://issues.apache.org/jira/browse/TIKA-3528
> Project: Tika
> Issue Type: Bug
> Components: mime
> Reporter: Nitish Gupta
> Priority: Major
>
> Attached file is detected as "audio/x-ms-wma" instead of "video/x-ms-asf".
> Link :
> [https://drive.google.com/file/d/1yB1_RcMxINHSs2s5AQHG4QrEdGWzJwy6/view?usp=sharing]
>