Re: [whatwg] Timed tracks for <video>

2010-07-27 Thread Sam Dutton
This has huge potential -- congratulations on such a clear and simple
spec. 

I particularly like the idea of the metadata kind attribute and the
ability in WebSRT to include arbitrary metadata.
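
For example, a cue in a kind=metadata track could carry an arbitrary
payload such as JSON -- a sketch only, based on my reading of the
draft, with invented timings and payload:

00:00:10.000 --> 00:00:20.000
{"title": "Chapter 2", "image": "/images/chapter2.png"}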

>> The addCueRange() API has been removed and replaced with a feature
based on the subtitle mechanism. <<
I'm not sure what this means -- are you referring to timed track cues?

Couple of minor queries:
* 'time track' is referred to a couple of times in the spec -- it's not
clear why this is used instead of 'timed track'
* 'the WebSRT file must WebSRT file using cue text' -- I guess this
should be 'the WebSRT file must be a WebSRT file using cue text'

Also -- is trackgroup out of the spec? 

Sam Dutton

http://www.bbc.co.uk/



Re: [whatwg] Introduction of media accessibility features

2010-04-23 Thread Sam Dutton
Some thoughts about the Media Multitrack and TextAssociations specs --
and also about http://wiki.whatwg.org/wiki/Timed_tracks...

The specs are great news in terms of accessibility and should open up
possibilities for search and temporal addressing. There may be cases
where it makes sense to encode metadata with a media resource, but the
ability to use timed data in textual format, synchronised but separate
from the media it relates to, has to be a good thing.

The specs and the Timed Tracks wiki also made me think about the
increasing *granularity* of media and digitised (meta)data. 

For example: film archives used to have no more than a one-sentence
description for an entire can of film, illegibly written in a book or on
index cards or (more recently) with a few lines in a Word document. Now,
digitised footage will often be provided alongside timed, digitised
metadata: high quality, structured, detailed, shot-level, frame accurate
data about content, location, personnel, dialogue, rights, ratings and
more. Accelerating digitisation is at the heart of this
'granularisation', obviously, but a variety of technologies contribute:
linked data and semantic markup, temporal URLs, image recognition (show
me frames in this video with a car in them), M3U / HTTP streaming, and
so on -- even the new iPhone seekToTime method.

So, in addition to what's on offer in the specs, I'm wondering if it
might be possible to have time-aligned *data*, with custom roles.  

For example, imagine a video with a synchronised 'chapter' carousel
below it (like the R&DTV demo at
http://www.bbc.co.uk/blogs/rad/2009/08/html5.html). The video element
would have a track with 'chapter' as its role attribute, and the
location of the chapter data file as its src. The data file would
consist of an array of 'chapter' objects, each representing some timed
data. Every object in the track source would require start and/or end
values, and a content value with arbitrary properties:

{
    start: 10.00,
    end: 20.00,
    content: {
        title: "Chapter 2",
        description: "Some blah relating to chapter 2",
        image: "/images/chapter2.png"
    }
},
{
    start: 20.00,
    end: 30.00,
    content: {
        title: "Chapter 3",
        description: "Chapter 3 blah",
        image: "/images/chapter3.png"
    }
},
...

In this example, selecting the chapter track for the video would cause
the video element to emit segment entry/exit events -- a bit like the
Cue Ranges idea. Each event would correspond to an object in the
chapter data source.

I'm not sure of the best way to implement the Event object for a 'data
track', but maybe it would include:
- a type property, as for other Event objects, which would evaluate to
'chapter' in this case
- a content property evaluating to the content object defined in the
data
- a property indicating entry or exit (this seems a bit woolly...)
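
A rough sketch of how a page might consume such an event -- every name
here (the event name, the entry flag, the element IDs) is invented for
illustration, not proposed API:

var video = document.getElementById('video');

video.addEventListener('chapter', function (event) {
    if (event.entry) {
        // event.content is the object defined in the chapter data file
        document.getElementById('chapterTitle').textContent =
            event.content.title;
        document.getElementById('chapterImage').src = event.content.image;
    } else {
        // playback has left the segment: clear or dim the carousel item
    }
}, false);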

To make use of data tracks, content authors would need to build layouts
with elements that could listen for events and display content
appropriately -- and data tracks could also refer to content areas
provided by the browser, e.g. for captions. Conversely, multiple content
providers could provide different data tracks for the same media.

This approach would also make it possible to publish multiple data
tracks, separately searchable and displayable. For example, a footage
archive could provide a track each for sound effects, dialogue, and
location. (This example makes me think -- maybe it should be possible to
select multiple tracks?)

I can imagine various other scenarios:
- a journalist builds a slideshow of images synchronised with audio
playback
- an educational publisher builds multiple annotation tracks for video
clips, providing different sets of content for different school years
- a news provider builds an archive search function, enabling users to
navigate via search to individual segments of footage and view
synchronised shot descriptions and metadata
- a broadcaster publishes multiple tracks of content for a sporting
event, including technical detail, follow-a-competitor, and a comedy
track
- an architect videos a building site, adding timed annotations like
YouTube Annotations.

Of course, most of what I've described can already be achieved
reasonably well with a bit of JavaScript hacking, but all these examples
belong to such a common class of use case that I think it might be
better to have some kind of native implementation, rather than a variety
of JavaScript alternatives reliant on the timeupdate event. 

Sam Dutton

http://www.bbc.co.uk/

Re: [whatwg] Codecs for <video> and <audio>

2009-08-09 Thread Sam Dutton
As an aside to Chris McCormick's comments, I wonder if it might also be 
useful/possible/appropriate (or not) to provide access to media data in the way 
that the ActionScript computeSpectrum function does: 

http://livedocs.adobe.com/flash/9.0/ActionScriptLangRefV3/flash/media/SoundMixer.html#computeSpectrum%28%29

Sample visualization using Canvas with computeSpectrum: 
http://www2.nihilogic.dk/labs/canvas_music_visualization/
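
computeSpectrum fills a ByteArray with 512 normalised floating-point
values -- 256 for the left channel, 256 for the right -- as either raw
waveform data or, with FFTMode set, frequency data. A hypothetical
equivalent on an HTML media element might look something like this (the
computeSpectrum method below is invented; nothing like it exists in the
specs):

var audio = document.getElementById('player');
var canvas = document.getElementById('viz');
var ctx = canvas.getContext('2d');

function draw() {
    // Invented method: returns normalised values for the playing audio
    var spectrum = audio.computeSpectrum(/* FFTMode */ true);
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    for (var i = 0; i < spectrum.length; i++) {
        var h = spectrum[i] * canvas.height;
        ctx.fillRect(i, canvas.height - h, 1, h);
    }
}

setInterval(draw, 50); // poll while playing, as the canvas demo above does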

Sam Dutton

--

Message: 1
Date: Sun, 9 Aug 2009 11:16:01 +1000
From: Silvia Pfeiffer 
Subject: Re: [whatwg] Codecs for <video> and <audio>
To: Chris McCormick 
Cc: whatwg@lists.whatwg.org

On Sun, Aug 9, 2009 at 3:15 AM, Chris McCormick wrote:
> On Wed, Jul 08, 2009 at 09:24:42AM -0700, Charles Pritchard wrote:
>> There are two use cases that I think are important: a codec
>> implementation (let's use Vorbis), and an accessibility
>> implementation, working with a <canvas> element.
>
> Here are a few more use-cases that many people would consider just as
> important:
>
> * Browser based music software and synthesis toys.
> * New types of 'algorithmic' music like that pioneered by Brian Eno.
> * Browser based games which want to use procedural audio instead of
> pre-rendered sound effects.
>
> I'd like to reiterate the previously expressed sentiment that only
> implementing pre-rendered audio playback is like having a browser that
> only supports static images loaded from the server instead of
> animations and <canvas> tags.
>
> What is really needed is a DSP vector processor which runs outside of ECMA
> script, but with a good API so that the ECMAscripts can talk to it directly.
> Examples of reference software, mostly open source, which do this type
> of thing follow:
>
> * Csound
> * Supercollider
> * Pure Data
> * Nyquist
> * Chuck
> * Steinberg VSTs
>
> I am going to use the terms "signal vector", "audio buffer", and "array"
> interchangeably below.
>
> Four major types of synthesis would be useful, but they are pretty much
> isomorphic, so any one of them could be implemented as a base-line:
>
> * Wavetable (implement vector write/read/lookup operators)
> * FM & AM (implement vector + and * operators)
> * Subtractive (implement unit delay from which you can build filters)
> * Frequency domain (implement FFT and back again)
>
> Of these, I feel that wavetable synthesis should be the first type of
> synthesis to be implemented, since most of the code for manipulating
> audio buffers is
> already going to be in the browsers and exposing those buffers shouldn't be
> hugely difficult. Basically what this would take is ensuring some things about
> the audio tag:
>
> * Supports playback of arbitrarily small buffers.
> * Seamlessly loops those small buffers.
> * Allows read/write access to those buffers from ECMAscript.
>
> Given the above, the other types of synthesis are possible, albeit slowly. For
> example, FM & AM synthesis are possible by adding/multiplying vectors of
> sine data together into a currently looping audio buffer. Subtractive
> synthesis is possible by adding delayed versions of the data in the buffer
> to itself. Frequency domain synthesis is possible by analysing the data in
> the buffer with FFT (and reverse FFT) and writing back new data. I see this
> API as working as previously posted, by Charles Pritchard, but with the
> following extra possibility:
>
> 
> buffer = document.getElementById("mybuffer");
> // here myfunc is a function which will change
> // the audio buffer each time the buffer loops
> buffer.loopCallback = myfunc;
> buffer.loop = true;
> buffer.play();
>
> Of course, the ECMA script is probably going to be too slow in the short
> term, so moving forward it would be great if there was a library/API which
> can do the following vector operations in the background at a speed faster
> than doing them directly, element by element inside ECMAscript (a bit like
> Python's Numeric module). All inputs and outputs are signal vectors/audio
> tag buffers:
>
> * + - add two signal vectors (2 input, 1 output)
> * * - multiply two signal vectors (2 input, 1 output)
> * z - delay a signal vector with customisable sample length (2 input, 1 output)
> * read - do a table lookup (1 input, 1 output)
> * write - do a table write (2 input, 1 output)
> * copy - memcpy a signal vector (1 input, 1 output)
> * fft - do a fast fourier transform (1 input, 2 output)
> * rfft - do a reverse fast fourier transform (2 inputs, 1 output)
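
To make the looping-buffer idea above concrete, here is a sketch of
wavetable synthesis against the quoted proposal -- loop and loopCallback
come from the proposal itself, while the samples array is an invented
stand-in for the proposed read/write buffer access; none of this is a
real interface:

var buffer = document.getElementById("mybuffer");
var phase = 0;

function fillSine() {
    // buffer.samples stands in for the proposed read/write buffer access
    for (var i = 0; i < buffer.samples.length; i++) {
        buffer.samples[i] = Math.sin(phase);
        phase += 2 * Math.PI * 440 / 44100; // 440 Hz at a 44.1 kHz rate
    }
}

buffer.loopCallback = fillSine; // rewrite the buffer each time it loops
buffer.loop = true;
fillSine();
buffer.play();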

Re: [whatwg] A Quick Introduction to HTML 5

2009-06-29 Thread Sam Dutton
What's the audience for this section?  (Apologies if this has been
covered elsewhere.)

If the intended readers are new to HTML, as implied, then technical
words and concepts shouldn't be introduced without explanation.

For example:

   >> The tree formed by an HTML document in this way is turned into a
DOM tree when parsed. <<

DOM has not been explained at this point and won't mean anything to a
novice. (Neither will 'tree' or 'parsed', I guess.)

   >> This DOM tree can then be manipulated from scripts. <<

Might be good to explain what 'scripts' means. (Would it hurt to say
something like 'scripting languages such as JavaScript'? Still a bit
vague.)

   >> Since DOM trees are the "live" representation of an HTML document
<<

What does 'live' mean in this context? 

   >> ... instead of the serialisation described above. Each element in
the DOM tree is represented by an object, and thus objects have APIs ...
<<

Again, novices won't understand 'serialisation', 'represented by an
object' or 'APIs'.

Sam Dutton

http://www.bbc.co.uk/



Re: [whatwg] HTML 5 video tag questions

2009-06-15 Thread Sam Dutton
>> Maybe to make this more clear section 4.8.7.1 should add a sentence 
>> somewhere like:

>> Authors may provide multiple source elements to provide different codecs for 
>> different user agents. 
 
Could multiple source elements also be used to provide different bit-rate 
sources (or even alternative versions, e.g. different languages) as well as 
different codecs, something like the HTTP streaming playlist idea?
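
For instance (a sketch only -- the file names are invented, the type
values are the spec's own examples), markup along these lines:

<video controls>
  <!-- same codec at different bit rates (hypothetical files) -->
  <source src="clip-high.ogv" type='video/ogg; codecs="theora, vorbis"'>
  <source src="clip-low.ogv" type='video/ogg; codecs="theora, vorbis"'>
  <!-- a different codec as a fallback -->
  <source src="clip.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
</video>

As I understand it, the resource selection algorithm simply takes the
first source the user agent can play, so bit-rate or language switching
would need something beyond plain source elements.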
 
Sam Dutton

http://www.bbc.co.uk/

[whatwg] Cue range implementation?

2009-06-15 Thread Sam Dutton
Have addCueRange and removeCueRanges been implemented in any browser yet? I've 
looked at nightly builds of Firefox and Safari (on Windows, at least) but they 
don't seem to be there.

Has there been any further discussion since the thread last year (Re: [whatwg] 
re-thinking "cue ranges") about whether an event or callback model would be 
better? 

I can imagine cue ranges being extremely useful for handling all kinds of timed 
changes to content: not just annotations or subtitling. We've been working with 
JavaScript/JSON to implement timed changes to CSS and HTML, relative to a 'time 
parent' such as a video, as well as 'custom events' such as chapter changes. 
Cue ranges would make the implementation of this kind of timed presentation 
much more efficient and straightforward.  
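
A minimal sketch of this kind of timeupdate-driven approach (simplified,
with invented handler names):

var video = document.getElementById('video');
var ranges = [
    { start: 10, end: 20, onEnter: showChapter2, onExit: hideChapter2 }
    // ... more ranges, typically loaded from JSON
];

function showChapter2() { /* update the carousel, captions, etc. */ }
function hideChapter2() { /* revert the presentation */ }

video.addEventListener('timeupdate', function () {
    var t = video.currentTime;
    for (var i = 0; i < ranges.length; i++) {
        var r = ranges[i];
        var inside = (t >= r.start && t < r.end);
        if (inside && !r.active) { r.active = true; r.onEnter(); }
        else if (!inside && r.active) { r.active = false; r.onExit(); }
    }
}, false);

Native cue ranges would remove the need to poll like this on every
timeupdate.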

Sam Dutton


http://www.bbc.co.uk/

Re: [whatwg] When closing the browser

2009-06-05 Thread Sam Dutton
I'm joining the thread late, so apologies if I'm missing the point, but
for me it would be very useful to have better session management
facilities -- like those available in Qt, for example, or at least
something analogous to the Cocoa Touch applicationWillTerminate: method.
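
The closest hooks available now are the unload/beforeunload events,
which fall well short of real session management -- a minimal sketch:

var saveSessionState = function () {
    // persist whatever the app needs; the details are app-specific
};
window.addEventListener('beforeunload', saveSessionState, false);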


Sam Dutton 

http://www.bbc.co.uk/