On Fri, 16 Apr 2010 15:49:38 +0900, Silvia Pfeiffer <[email protected]> wrote:
On Fri, Apr 16, 2010 at 3:32 PM, Anne van Kesteren <[email protected]> wrote:
A spec would also need to be written if we go for this new
TTML-minus-certain-features-and-using-CSS-rather-than-XSL-FO format. That would probably be worse since we would be forking an existing format in an incompatible way.

No forking - just specifying a mapping of the things that are
supportable. And yes: that needs to be written too.

Sounds like a fork to me. E.g. if we don't want a new parser for <color> values (and we really don't) and use the CSS parser instead, things would be different.


Also, if we are introducing HTML markup inside SRT time cues, then it
would make sense to turn the complete SRT file into markup, not just
the part inside the time cue. Further, SRT has no way to specify which
language it is written in, and it lacks other such general mechanisms
that already exist for HTML.

What general mechanisms are needed exactly? Why is language needed? Isn't that already specified by the embedder?

I guess the problem is more with char sets.
For HTML pages and other Web content, there is typically information
inside the resource that tells you what character set the document is
written in. E.g. HTML pages have
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.
Such functionality is not available for SRT, so a browser cannot tell
which charset to use to render the content.

It would simply always be UTF-8, much like text/cache-manifest and text/event-stream.
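
A minimal sketch of that always-UTF-8 rule, in Python. The helper name `decode_srt`, the choice to turn malformed bytes into U+FFFD, and the stripping of a leading BOM are assumptions for illustration; nothing in this thread mandates them.

```python
def decode_srt(raw: bytes) -> str:
    """Decode subtitle bytes as UTF-8, ignoring any externally declared charset."""
    # Assumption: malformed sequences become U+FFFD instead of raising an error.
    text = raw.decode("utf-8", errors="replace")
    # Assumption: a leading byte order mark is tolerated and dropped.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text


sample = "1\n00:00:15,000 --> 00:00:17,951\nAt the left we can see...\n".encode("utf-8")
print(decode_srt(sample))
```

With a fixed encoding the decoder needs no in-band charset declaration and no hint from the embedder, which is the same approach text/cache-manifest and text/event-stream take.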


And yes, we have made an adjustment in the Media Associations spec for
<track> to contain a hint about the MIME type and charset of the
external document. But that is only a poor fix for SRT's problem.
It should be available inside the file so that any application can use
the SRT file without requiring additional information.

I guess.


The extended SRT file will barely have anything in common with the
original ones. There is more HTML markup to learn than SRT markup. And
having HTML markup encapsulated in a non-html file is just weird.
Also, the sequential numbering of the captions is honestly not very
useful.

Yeah, maybe you're right.


(3) TTML file: (no hyperlinks, no images - just for comparison)

---
<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="en_us" xmlns="http://www.w3.org/ns/ttml">
  <head>
    <styling>
      <style xml:id="left-align"
        tts:fontFamily="proportionalSansSerif"
        tts:textAlign="left"
      />
      <style xml:id="right-align"
        tts:fontFamily="monospaceSerif"
        tts:textAlign="right"
      />
      <style xml:id="speaker"
        tts:fontFamily="monospaceSerif"
        tts:textAlign="left"
        tts:fontWeight="bold"
      />
    </styling>
    <layout>
      <region xml:id="subtitleArea"
        tts:extent="560px 62px"
        tts:padding="5px 3px"
      />
    </layout>
  </head>
  <body region="subtitleArea">
    <div>
      <p style="left-align" begin="0.15s" end="0.17s 951ms">
        <div style="speaker">Proog:</div>
        <div tts:color="green">At the <span
tts:fontStyle="italic">left</span> we can see...</div>
      </p>
      <p style="right-align" begin="0.18s 166ms" end="0.20s 83ms">
        <div tts:color="green">At the right we can see the...</div>
      </p>
    </div>
  </body>
</tt>
---

That this sample file has namespace errors and is therefore not well-formed is part of the reason I think TTML is a very bad idea. (Besides giving a new meaning to a bunch of HTML-like elements.)
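
To make the namespace problem concrete: the sample uses `tts:` attributes but never declares the `tts` prefix, so a namespace-aware parser rejects it. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `is_well_formed` and the cut-down documents are illustrative only, and the `xmlns:tts` URI in the second document is assumed to be the TTML styling namespace.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the sample: tts:textAlign is used, but "tts" is never bound.
UNDECLARED = (
    '<tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml">'
    '<head><styling>'
    '<style xml:id="left-align" tts:textAlign="left"/>'
    '</styling></head>'
    '</tt>'
)

# Same document with the prefix declared (URI is an assumption, see above).
DECLARED = UNDECLARED.replace(
    'xmlns="http://www.w3.org/ns/ttml"',
    'xmlns="http://www.w3.org/ns/ttml" '
    'xmlns:tts="http://www.w3.org/ns/ttml#styling"',
)


def is_well_formed(doc: str) -> bool:
    """Return True if the namespace-aware parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(UNDECLARED), is_well_formed(DECLARED))
```

The first document fails because of the unbound prefix; adding the declaration is enough for the parser to accept it.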


(4) possibly new xml/html-ish file:

[...]

I think (4) is preferable over (2) for the more consistent markup and
actual xml parsability.

I don't buy the XML parser argument: (a) an XML parser is not much simpler, because of the internal subset, and (b) it comes with namespaces. But I can see how a new format might be somewhat better-looking than something based on SRT.
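
As an illustration of the internal-subset point: any conforming XML parser has to process entity declarations in the DOCTYPE, which is part of why "just use an XML parser" is not a small requirement. A minimal sketch with Python's standard `xml.etree.ElementTree` follows; the document and the entity name `who` are made up for the example.

```python
import xml.etree.ElementTree as ET

# The internal subset (the part in [...]) declares an entity that the
# parser must expand while reading the document body.
DOC = '<!DOCTYPE tt [<!ENTITY who "Proog">]><tt><p>&who;: hello</p></tt>'

root = ET.fromstring(DOC)
print(root.find("p").text)
```

The text of the `p` element only comes out right because the parser read and applied the declaration in the internal subset.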


--
Anne van Kesteren
http://annevankesteren.nl/
