Re: [whatwg] Timed tracks: feedback compendium

Philip Jägenstedt Mon, 13 Sep 2010 00:55:38 -0700

On Sat, 11 Sep 2010 01:27:48 +0200, Silvia Pfeiffer <[email protected]> wrote:

On Fri, Sep 10, 2010 at 11:00 PM, Philip Jägenstedt <[email protected]> wrote:
On Thu, 09 Sep 2010 15:08:43 +0200, Silvia Pfeiffer
<[email protected]> wrote:

 On Wed, Sep 8, 2010 at 9:19 AM, Ian Hickson <[email protected]> wrote:
On Fri, 23 Jul 2010, Philip Jägenstedt wrote:
> If we must have both kind=subtitles and kind=captions, then I'd suggest
> making the default subtitles, as that is without a doubt the most common
> kind of timed text. Making captions the default only means that most
> timed text will be mislabeled as being appropriate for the HoH when it
> is not.
Ok, I've changed the default. However, I'm not fighting this battle if it
comes up again, and will just change it back if people don't defend having
this as the default. (And then change it back again if the browsers pick
"subtitles" in their implementations after all, of course.)
Note that captions aren't just for users that are hard-of-hearing. Most of
the time when I use timed tracks, I want captions, because the reason I
have them enabled is that I have the sound muted.
Hmm, you both have good points. Maybe we should choose something as the
default that is not visible on screen, such as "descriptions"? That would
avoid the issue and make it explicit for people who provide captions or
subtitles that they have to make a choice.
If we want people to make an explicit choice, we should make kind a
required attribute and make browsers ignore <track>s without it. (I think
subtitles is a good default though.)
I think you misunderstood - my explanation probably wasn't very good. I'm
looking at it from the authoring POV.
What I meant was: if I author a text track that is supposed to be visible on
screen as the video plays back and if we choose either @kind=subtitle or
@kind=caption as the default, then I don't have to really think through
about what I authored as it will be displayed on screen. This invites people
to not distinguish between whether they authored subtitles or captions,
which is a bad thing, because a deaf user may then get tracks with the wrong
label and expectations. If, however, we choose as a default something that
is not visible on screen, e.g. @kind=description or @kind=metadata, then the
author who wants their text track to be visible on screen has to give it a
label, i.e. make an explicit choice between @kind=subtitle and
@kind=caption. I believe this will lead to more correctly labeled content. I
am therefore strongly against default labeling with either subtitle or
caption. We could make @kind a required attribute instead as you are saying.

OK, I think we mostly agree. Any default will sometimes be wrong, so to not have to choose between subtitles and captions, I'd still really prefer if specific HoH-tags like <sound> can be shown or hidden depending on user preference. I think that would lead to more content actually being written for HoH users, as it doesn't require maintaining 2 different files.
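
To make the single-file idea concrete, here is a minimal Python sketch of how a renderer might show or hide HoH-specific spans according to a user preference. The <sound> tag name is only the example used above and is not defined anywhere; render_cue_text and show_hoh are hypothetical names made up for this illustration.

```python
import re

# "<sound>" is the illustrative tag name from this thread, not a defined tag.
SOUND_SPAN = re.compile(r"<sound>(.*?)</sound>", re.DOTALL)


def render_cue_text(cue_text: str, show_hoh: bool) -> str:
    """Return cue text with <sound> spans unwrapped (shown) or dropped (hidden)."""
    if show_hoh:
        # Keep the description, drop only the markup around it.
        rendered = SOUND_SPAN.sub(lambda m: m.group(1), cue_text)
    else:
        # Drop the whole span, description included.
        rendered = SOUND_SPAN.sub("", cue_text)
    # Tidy whitespace and empty lines left behind by a removed span.
    lines = [" ".join(line.split()) for line in rendered.splitlines()]
    return "\n".join(line for line in lines if line)
```

With this approach the same file serves both audiences: the user preference decides at render time whether a span such as "<sound>[door slams]</sound>" is displayed.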

 On Sun, 25 Jul 2010, Silvia Pfeiffer wrote:
>
> I think if we have a mixed set of .srt files out there, some of which
> are old-style srt files (with line numbers, without WebSRT markup) and
> some are WebSRT files with all the bells and whistles and with
> additional external CSS files, we create such a mess for that existing
> ecosystem that we won't find much love.
I'm not sure our goal is to find love here, but in general I would agree
that it would be better to have one format than two. I don't see why we
wouldn't just have one format here though. The idea of WebSRT is to be
sufficiently backwards-compatible that that is possible.
With "finding love" I referred to your expressed goals:
 - Keep implementation costs for standalone players low.
 - Use existing technologies where appropriate.
 - Try as much as possible to have things Just Work.
With WebSRT, we will have one label for two different types of files: the
old-style SRT files and the new WebSRT files. Just putting a single label on
them doesn't mean it is one format, in particular when most old files will
not be conformant to the new label and
Apart from the encoding, what else about old SRT files wouldn't
be conformant?
<font> and <u>

Oh, right. It would still render though, and I assume one could style <u> to actually be underlined if one really wanted to.

Does it matter that they aren't conformant if they work
anyway?
The ones on the wrong charset won't work, at least not without us
introducing specific handling for it - which is incidentally specific
handling that non-Web applications won't get, so they are still left out in
the rain. Think of a new standalone application that was developed just for
WebSRT and only deals with UTF-8. It will not deal well with those legacy
files.

Requiring UTF-8 and not requiring UTF-8 both have their downsides. I think that handling charset as an attribute on <track> isn't very difficult, but if there are SRT-incompatible changes for other reasons (e.g. a header) then I think we should go back to always requiring UTF-8.
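
As a rough illustration of why a declared charset is cheap to handle, here is a Python sketch of the decoding step. The function name decode_track and the rule that a UTF-8 BOM overrides the declared charset are assumptions made for this example; nothing here is specified behaviour.

```python
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def decode_track(data: bytes, charset: Optional[str] = None) -> str:
    """Decode subtitle bytes, defaulting to UTF-8 when no charset is declared."""
    if data.startswith(UTF8_BOM):
        # Assumption for this sketch: a UTF-8 BOM wins over the declared charset.
        return data[len(UTF8_BOM):].decode("utf-8", errors="replace")
    # Undecodable bytes become U+FFFD instead of raising, so a legacy file
    # served without a charset still renders, just with visible damage.
    return data.decode(charset or "utf-8", errors="replace")
```

A legacy Windows-1252 file decodes correctly with decode_track(data, "windows-1252"), while the same bytes without a declared charset come out with replacement characters. That is the situation a UTF-8-only standalone player would be in.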

 many new files will not play in the software created for the old spec.
As long as we don't add a header, the files will play in most existing
software. Apart from parsers that assume that SRT is plain text (and thus
would be unsuitable for much existing SRT content), what kind of breakage
have you found with WebSRT-specific syntax in existing software?
I think we need to add a header - and possibly other things in the future.
Will we forever have the SRT restrictions hold back the introduction of new
features into WebSRT?

Yes, if we extend SRT we can't break compatibility. However, it seems that all the extensibility needed already exists, as arbitrary tag names are handled by the parser.

 None is allowed today, but it would be relatively straight-forward to
introduce metadata before the cues (or even in between the cues). For
example, we could add defaults:

 DEFAULTS
 L:-1 T:50% A:middle

 00:00:20,000 --> 00:00:24,400
 Altocumulus clouds occur between six thousand

 00:00:24,600 --> 00:00:27,800
 and twenty thousand feet above ground level.

We could add metadata (here using a different syntax that is similarly
backwards-compatible with what the spec parser does today):

 @charset --> win-1252

 @language --> en-US

 00:00:20,000 --> 00:00:24,400
 Altocumulus clouds occur between six thousand

 00:00:24,600 --> 00:00:27,800
 and twenty thousand feet above ground level.

When I read the following:

"A WebSRT file body consists of an optional U+FEFF BYTE ORDER MARK (BOM)
character, followed by zero or more WebSRT line terminators
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-line-terminator>,
followed by zero or more WebSRT cues
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue>
separated from each other by two or more WebSRT line terminators
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-line-terminator>,
followed by zero or more WebSRT line terminators
<http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-line-terminator>."

then that doesn't imply for me that we can add anything in front of the
WebSRT cues without breaking the spec, or that we can define cues that are
not time ranges around the "-->" sign.


The parsing algorithm simply skips over things it doesn't recognize,
that's why adding basically any new syntax in between cues wouldn't break
existing WebSRT parsers.

Legacy SRT parsers are not required to do so and even if they are actually
implemented to deal with this situation, it's a dangerous assumption. We may
as well write into the syntax description of WebSRT that any line that
doesn't match the syntax description has to be ignored, which then has the
effect that every single file in the world is a valid WebSRT file.

You're right, we shouldn't go around adding stuff before or between the cues just because the WebSRT parser allows it unless we also make sure that most legacy SRT parsers will handle it.


(Making anything valid makes the syntax useless, though.)

Allowing anything as part of the syntax is a bit
dangerous though, as most unrecognized stuff between cues is likely to be
broken cues. Validators should warn about it, not treat it as a comment.
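
The difference between "skip it" and "warn about it" can be sketched in Python as follows. This is a simplified, block-based illustration, not the spec's line-based parsing algorithm; parse_lenient, TIMING and the returned tuple layout are all made up for this example.

```python
import re
from typing import List, Tuple

# Timing line in the SRT style used in this thread: HH:MM:SS,mmm --> HH:MM:SS,mmm
TIMING = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)

Cue = Tuple[int, int, str]  # (start in ms, end in ms, cue text)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_lenient(text: str) -> Tuple[List[Cue], List[str]]:
    """Split text into blank-line-separated blocks; keep cues, collect the rest."""
    cues: List[Cue] = []
    skipped: List[str] = []
    for block in re.split(r"(?:\r\n|\r|\n){2,}", text.strip()):
        lines = block.splitlines()
        if not lines:
            continue
        # Old-style SRT files put a numeric counter line before the timing line.
        if lines[0].strip().isdigit():
            lines = lines[1:]
        match = TIMING.match(lines[0].strip()) if lines else None
        if match is None:
            # A playback parser drops this silently; a validator would warn.
            skipped.append(block)
            continue
        g = match.groups()
        cues.append((_to_ms(*g[:4]), _to_ms(*g[4:]), "\n".join(lines[1:])))
    return cues, skipped
```

Fed the "@charset --> win-1252" example from earlier in this message, such a parser would return the two real cues and report the two metadata blocks as skipped. That skipped list is what a validator could turn into warnings.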


I wasn't aware of the effect of the standardised parsing algorithm for
WebSRT allowing "broken cues" to be dealt with. This will effectively mean
that a parser will be required to parse all files that it is given from
beginning to end and discard all non-conformant lines - even if that file
may be a 100GB large movie file. In this case, I would really recommend that
we put a magic identifier at the beginning of Web SRT files so we can be
sure that the intention of the file was to be a WebSRT file. Let's have the
string "WebSRT" at the beginning of the files.

That's a good point. I don't suppose it's a huge problem in practice that errors can't be detected until EOF, but it's certainly not a desirable feature. To maintain some sanity, we probably ought to either require the correct MIME type or require the correct magic bytes. From the <video> MIME type debacle, I think I slightly prefer magic bytes to be checked by the parser.

I've also argued for the inclusion of metadata, so I'm beginning to warm up to the idea of adding a header beginning with "WebSRT" or some such. If we do this, no existing SRT content can be reused, but we can still try to make it possible for WebSRT files to be reusable in desktop applications, by keeping the syntax highly compatible so that the same parser can be used for both without a mode switch.
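
For illustration, a magic-bytes check of this kind could look like the Python sketch below. The exact header string, what may follow it, and the name looks_like_websrt are all assumptions here, since no header syntax has been agreed on.

```python
UTF8_BOM = b"\xef\xbb\xbf"
MAGIC = b"WebSRT"  # header string proposed in this thread; not specified anywhere


def looks_like_websrt(head: bytes) -> bool:
    """Check only the first bytes of a file for an optional BOM plus the magic string."""
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    if not head.startswith(MAGIC):
        return False
    # Assumption: the magic string must be followed by whitespace, a line
    # terminator, or end of input, so that "WebSRTfoo" is rejected.
    following = head[len(MAGIC):len(MAGIC) + 1]
    return following in (b"", b" ", b"\t", b"\r", b"\n")
```

The point of the sketch is that the decision needs only the first ten or so bytes, so a parser can reject a mislabeled 100GB file immediately instead of scanning it to EOF.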


--
Philip Jägenstedt
Core Developer
Opera Software
