On Thu, 27 Mar 2014, Konstantin Gribov wrote:
Some containers (like Matroska/MKV) tag audio and subtitle streams with
a language tag and a comment. From the mplayer console output:
[lavf] stream 0: video (h264), -vid 0
[lavf] stream 1: audio (aac), -aid 0, -alang rus, Rus BaibaKo.tv
[lavf] stream 2: audio (ac3), -aid 1, -alang eng, Eng
Ogg + CMML would give something similar.
I don't know of any established semantics for video streams, but the
first is usually the default for playback.
How should a Tika parser handle such a file though? Include the primary
audio metadata with the video stream as the primary object, and report
embedded items for the other audio streams? Report all as embedded items?
Report the primary video stream as the main thing, and give all other
video + audio as embedded items? Something else?
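
To make the first option concrete, here is a rough sketch, in Python rather
than Tika's Java, that parses the mplayer lines quoted above and splits the
streams into "primary" (first video plus first audio) and "embedded"
(everything else). The names Stream, parse_streams and split_primary are
hypothetical and are not part of Tika or mplayer; the regular expression
only assumes the line layout shown in the sample output.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed layout, taken from the sample output only:
# "[lavf] stream N: kind (codec), -Xid N, -Xlang LANG, free-form comment"
STREAM_RE = re.compile(
    r"\[lavf\] stream (?P<index>\d+): (?P<kind>\w+) \((?P<codec>[^)]+)\)"
    r"(?:, -[a-z]id \d+)?"
    r"(?:, -[a-z]lang (?P<lang>\w+))?"
    r"(?:, (?P<comment>.*))?$"
)


@dataclass
class Stream:  # hypothetical holder for one stream's metadata
    index: int
    kind: str
    codec: str
    lang: Optional[str] = None
    comment: Optional[str] = None


def parse_streams(output: str) -> List[Stream]:
    """Turn mplayer-style '[lavf] stream ...' lines into Stream objects."""
    streams = []
    for line in output.splitlines():
        m = STREAM_RE.match(line.strip())
        if m:
            streams.append(Stream(int(m["index"]), m["kind"], m["codec"],
                                  m["lang"], m["comment"]))
    return streams


def split_primary(streams: List[Stream]) -> Tuple[List[Stream], List[Stream]]:
    """First video and first audio stream are primary; the rest are embedded."""
    primary = []
    for kind in ("video", "audio"):
        first = next((s for s in streams if s.kind == kind), None)
        if first is not None:
            primary.append(first)
    embedded = [s for s in streams if s not in primary]
    return primary, embedded


SAMPLE = """\
[lavf] stream 0: video (h264), -vid 0
[lavf] stream 1: audio (aac), -aid 0, -alang rus, Rus BaibaKo.tv
[lavf] stream 2: audio (ac3), -aid 1, -alang eng, Eng
"""

primary, embedded = split_primary(parse_streams(SAMPLE))
print("primary: ", [(s.kind, s.codec, s.lang) for s in primary])
print("embedded:", [(s.kind, s.codec, s.lang) for s in embedded])
```

With the sample above, the h264 video and the Russian aac track would be
reported on the main object and the English ac3 track as an embedded item.
The other options only change what split_primary puts on each side.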
Nick