On Apr 8, 2011, at 8:54 AM, Ian Hickson wrote:
>> *) Discoverability is indeed an issue, but this can be fixed by defining
>> a common track API for signalling and enabling/disabling tracks:
>>
>> {{{
>> interface Track {
>> readonly attribute DOMString kind;
>> readonly attribute DOMString label;
>> readonly attribute DOMString language;
>>
>> const unsigned short OFF = 0;
>> const unsigned short HIDDEN = 1;
>> const unsigned short SHOWING = 2;
>> attribute unsigned short mode;
>> };
>>
>> interface HTMLMediaElement : HTMLElement {
>> [...]
>> readonly attribute Track[] tracks;
>> };
>> }}}
>
> There's a big difference between text tracks, audio tracks, and video
> tracks. While it makes sense, for instance, to have text tracks enabled
> but not showing, it makes no sense to do that with audio tracks.

Audio and video tracks require far more data than text tracks, which is why
enabling them without showing them is less attractive. If bandwidth weren't an
issue, it would be great to allow this: it would permit instant switching
between multiple audio dubs or camera angles.

In terms of the data model, I don't believe there are major differences between
audio, text and video tracks. They all exist at the same level: one down from
the main presentation layer. Toggling versus layering can be an option for all
three kinds of tracks.
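
To make the toggling versus layering distinction concrete, here is a rough
sketch in TypeScript. It mirrors the Track interface proposed above; the
helper names (toggleTo, layer, presented) are hypothetical and only serve as
illustration, they are not part of any proposal or shipped API.

```typescript
// Hypothetical model of the proposed Track interface; illustration only.
type TrackKind = "audio" | "video" | "text";

const TrackMode = { OFF: 0, HIDDEN: 1, SHOWING: 2 } as const;
type TrackMode = (typeof TrackMode)[keyof typeof TrackMode];

interface Track {
  readonly kind: TrackKind;
  readonly label: string;
  readonly language: string;
  mode: TrackMode;
}

// Toggling: present exactly one track of a kind. Tracks that were showing
// drop to HIDDEN (still buffered), so switching back needs no new download.
function toggleTo(tracks: Track[], kind: TrackKind, label: string): void {
  for (const t of tracks) {
    if (t.kind !== kind) continue;
    if (t.label === label) t.mode = TrackMode.SHOWING;
    else if (t.mode === TrackMode.SHOWING) t.mode = TrackMode.HIDDEN;
  }
}

// Layering: present the named tracks without touching the other ones.
function layer(tracks: Track[], kind: TrackKind, labels: string[]): void {
  for (const t of tracks) {
    if (t.kind === kind && labels.includes(t.label)) t.mode = TrackMode.SHOWING;
  }
}

// Labels of the tracks of one kind that are currently presented.
function presented(tracks: Track[], kind: TrackKind): string[] {
  return tracks
    .filter((t) => t.kind === kind && t.mode === TrackMode.SHOWING)
    .map((t) => t.label);
}

// Usage: two audio dubs, switch from one to the other.
const demo: Track[] = [
  { kind: "audio", label: "English", language: "en", mode: TrackMode.SHOWING },
  { kind: "audio", label: "Dutch dub", language: "nl", mode: TrackMode.HIDDEN },
];
toggleTo(demo, "audio", "Dutch dub");
console.log(presented(demo, "audio"));
```

The same two functions work unchanged for kind "video" or "text", which is
the point: only the cost of keeping a track HIDDEN differs per kind.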

For example, multiple video tracks can be mixed together in one media element's
display. Think of PiP, a perspective side-by-side view (Stevenote style) or a 3D
grid (group chat, like Skype). Perhaps this should be supported instead of
relying on multiple video elements, manual positioning and APIs to knit
things together. One would lose some flexibility, but gain API simplicity
(it's still one "video") and ease of implementation for HTML developers.
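
As a rough idea of what a media element could do internally for the flat grid
case, here is a small TypeScript sketch. The function name gridLayout and the
Rect shape are made up for this mail; it assumes equally sized cells and
ignores aspect ratio, perspective and 3D entirely.

```typescript
// Hypothetical helper: one rectangle per showing video track, arranged in a
// near-square grid inside the element's display area. Illustration only.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

function gridLayout(count: number, width: number, height: number): Rect[] {
  if (count <= 0) return [];
  // Smallest near-square grid that holds all tracks.
  const cols = Math.ceil(Math.sqrt(count));
  const rows = Math.ceil(count / cols);
  const cellWidth = width / cols;
  const cellHeight = height / rows;
  const rects: Rect[] = [];
  for (let i = 0; i < count; i++) {
    rects.push({
      x: (i % cols) * cellWidth,
      y: Math.floor(i / cols) * cellHeight,
      width: cellWidth,
      height: cellHeight,
    });
  }
  return rects;
}

// Usage: four video tracks in a 640x480 element.
console.log(gridLayout(4, 640, 480));
```

With multiple video elements, the page author has to write and maintain this
kind of positioning logic; with layered tracks it would live in the browser.
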
- Jeroen