On Tue, 19 Oct 2010 22:35:50 +0200, Silvia Pfeiffer
<[email protected]> wrote:
On Tue, Sep 14, 2010 at 7:49 PM, Philip Jägenstedt <[email protected]>
wrote:
On Tue, 14 Sep 2010 10:30:03 +0200, Simon Pieters <[email protected]>
wrote:
On Tue, 14 Sep 2010 10:11:16 +0200, Philip Jägenstedt
<[email protected]>
wrote:
The point of a header is that browsers can identify WebSRT files and not
keep parsing through a 100GB movie file,
I don't think we should break SRT compat for this. I don't think this is a
problem at all. We already have this situation elsewhere, e.g. what if you
do <link rel=stylesheet href=movie.webm>?
If it really turns out to be a problem you could just apply the hardware
limitations clause and abort parsing if you haven't found any cues after
parsing X bytes or whatever.
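A minimal sketch of that "abort after X bytes" idea, in Python. The limit value, the function name and the timing-line pattern are all assumptions made for illustration; the thread only says "X bytes or whatever" and does not define what counts as having found a cue.

```python
import re
from typing import Iterable

# Hypothetical limit; the thread leaves "X" unspecified.
MAX_BYTES_WITHOUT_CUE = 64 * 1024

# Assumed SRT-style timing line, e.g. "00:00:01,000 --> 00:00:04,000".
TIMING_LINE = re.compile(
    rb"\d{2,}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2,}:\d{2}:\d{2}[,.]\d{3}"
)


def looks_like_subtitles(chunks: Iterable[bytes],
                         limit: int = MAX_BYTES_WITHOUT_CUE) -> bool:
    """Return True once a timing line is seen; give up after `limit` bytes."""
    seen = 0
    tail = b""
    for chunk in chunks:
        # Prepend the end of the previous chunk so a timing line that is
        # split across two chunks is still matched.
        data = tail + chunk
        if TIMING_LINE.search(data):
            return True
        seen += len(chunk)
        if seen >= limit:
            return False  # abort: no cue within the byte budget
        tail = data[-64:]
    return False
```

Fed the network chunks of a movie file, this stops after roughly `limit` bytes instead of reading the whole resource.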
In any case, the spec currently requires text/srt (or other supported
subtitle format MIME type) for <track>, so a movie file would be rejected
based on the MIME type per spec (see step 4 in
#sourcing-out-of-band-timed-tracks).
Well, I was hoping to sidestep the issue of MIME types and file extensions
by always ignoring them. Last I checked Apache doesn't have a default
mapping for .srt, so everyone using <track> would have to add it
themselves.
About metadata, I noticed that there's a voice called <credit>...
I think that's only for the credits at the start or end of a movie.
Anyway: I'm trying to summarize the changes to WebSRT that were discussed
so far. I think we have the following:
* add a header to identify the kind of websrt file & the language
* add a means to add metadata as name-value pairs
e.g.
WebSRT
language: en-US
author: Frank
date: 2010-09-20
kind: subtitle
copyright: WGBH, 2010
license: CC-BY-SA, http://creativecommons.org/licenses/by-sa/3.0/
What should happen when the language in <track srclang> doesn't match the
language in the file itself? Also, why is kind needed in the file?
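To make the proposed header concrete, here is a rough Python sketch of how the magic line and the name-value pairs from the example could be read. It assumes the header ends at the first blank line and that names are case-insensitive; neither rule was settled in this thread, and `parse_header` is a hypothetical name.

```python
from typing import Dict, Tuple


def parse_header(text: str) -> Tuple[Dict[str, str], str]:
    """Split a file into (metadata, remaining cue text)."""
    lines = text.split("\n")
    if lines[0].strip() != "WebSRT":
        raise ValueError("missing WebSRT magic line")

    metadata: Dict[str, str] = {}
    i = 1
    # Assumption: metadata runs until the first blank line.
    while i < len(lines) and lines[i].strip():
        # Split on the first colon only, so values may contain URLs.
        name, sep, value = lines[i].partition(":")
        if sep:
            metadata[name.strip().lower()] = value.strip()
        i += 1

    body = "\n".join(lines[i + 1:])
    return metadata, body
```

A caller could then compare `metadata.get("language")` with the srclang attribute of <track>; which of the two should win on a mismatch is exactly the open question above.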
* add a means to add comments
e.g.
// Lines starting with // are comments
So far the web has two comment syntaxes: <!-- SGML style --> and
/* CSS style */, so if we need comments I think we should pick one of these.
And some changes on <track>:
* make @kind a required attribute
Why was this?
* add @type for mime type identification as we allow more than just
WebSRT as external formats, e.g. TTML
Having more than one format seems to complicate rendering. The WebSRT
rendering rules try to avoid overlap between cues from different tracks,
but I don't see how that could work between different formats, unless all
formats have basically the same model. It certainly wouldn't work with a
fixed-layout format like TTML. In other words, can't this wait until some
implementor has shown concrete interest in implementing more than one
format?
Anyway, I agree that at least a magic header like "WebSRT" is needed
because of the horrors of legacy SRT parsing. Breaking SRT compat means
that we can go back to requiring UTF-8 as the encoding. However, UTF-8
does complicate the magic header a bit due to the possibility of a BOM
[1]. While it would be nice to forbid the use of a BOM, I expect we'd then
see lots of frustration from authors whose editors automatically insert
it...
[1] http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
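One way the BOM complication could be handled is to skip an optional UTF-8 BOM before looking for the magic string. The following is only a sketch: the function name is made up, and requiring whitespace or end of input right after "WebSRT" is an assumption, not something agreed on here.

```python
import codecs

MAGIC = b"WebSRT"


def has_websrt_magic(data: bytes) -> bool:
    """Check for the magic header, tolerating a leading UTF-8 BOM."""
    # codecs.BOM_UTF8 is the three bytes EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    if not data.startswith(MAGIC):
        return False
    # Assumption: the magic must be followed by whitespace or end of input,
    # so that e.g. "WebSRTfoo" is not accepted.
    following = data[len(MAGIC):len(MAGIC) + 1]
    return following in (b"", b"\n", b"\r", b" ", b"\t")
```

With this approach authors whose editors insert a BOM are not penalized, at the cost of three extra bytes to check before the magic.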
--
Philip Jägenstedt
Core Developer
Opera Software