On Thu, 27 Mar 2014, Konstantin Gribov wrote:
Some containers (like Matroska/MKV) tag audio and subtitle streams with
a language tag and a comment. From the mplayer console output:

[lavf] stream 0: video (h264), -vid 0
[lavf] stream 1: audio (aac), -aid 0, -alang rus, Rus BaibaKo.tv
[lavf] stream 2: audio (ac3), -aid 1, -alang eng, Eng

Ogg + CMML would give something similar

I don't know of any established semantics for video streams, but the first
one is usually the default for playback.

How should a Tika parser handle such a file though? Include the primary audio metadata with the video stream as the primary object, and report embedded items for the other audio streams? Report all as embedded items? Report the primary video stream as the main thing, and give all other video + audio as embedded items? Something else?
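
To make the three options concrete, here is a rough sketch using the stream listing from the mplayer output above. It is plain Python, not Tika code: the Stream and Report classes and all function names are hypothetical, invented only to show what each option would put in the main object and what it would report as embedded items.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Stream:
    """One stream in a container (hypothetical model, not a Tika class)."""
    index: int
    kind: str                      # "video" or "audio"
    codec: str
    language: Optional[str] = None
    comment: Optional[str] = None


@dataclass
class Report:
    """What a parser would emit: main-object metadata plus embedded items."""
    primary: dict
    embedded: list = field(default_factory=list)


def describe(stream):
    """Flatten a stream into a metadata dict, skipping unset fields."""
    meta = {"kind": stream.kind, "codec": stream.codec}
    if stream.language:
        meta["language"] = stream.language
    if stream.comment:
        meta["comment"] = stream.comment
    return meta


def primary_video_with_first_audio(streams):
    """Option 1: first video plus first audio form the main object."""
    videos = [s for s in streams if s.kind == "video"]
    audios = [s for s in streams if s.kind == "audio"]
    primary = {}
    if videos:
        primary["video"] = describe(videos[0])
    if audios:
        primary["audio"] = describe(audios[0])
    return Report(primary, [describe(s) for s in videos[1:] + audios[1:]])


def all_embedded(streams):
    """Option 2: the container carries no stream metadata of its own."""
    return Report({}, [describe(s) for s in streams])


def primary_video_only(streams):
    """Option 3: first video is the main object, everything else embedded."""
    videos = [s for s in streams if s.kind == "video"]
    primary = {"video": describe(videos[0])} if videos else {}
    rest = [s for s in streams if not videos or s is not videos[0]]
    return Report(primary, [describe(s) for s in rest])


# The three streams from the mplayer listing above.
STREAMS = [
    Stream(0, "video", "h264"),
    Stream(1, "audio", "aac", "rus", "Rus BaibaKo.tv"),
    Stream(2, "audio", "ac3", "eng", "Eng"),
]
```

For this file, option 1 leaves only the English ac3 track as an embedded item, option 2 reports all three streams as embedded, and option 3 reports both audio tracks as embedded.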

Nick
