Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-17 Thread Dr. Markus Walther
Hi,

 I see no reason why they should not be applicable to data URIs when it
 is obvious that the data URI is a media file. This has not yet been
 discussed, but would be an obvious use case.

OK. That would be welcome - although there could be syntactic problems
as to where to place fragment parameters.

 BTW: Did the start and end attribute implementations that you refer to
 cover the data scheme, too?

Yes, on Apple's Safari at the time. I had a working prototype which no
longer works.

 Or if you really wanted to do it in javascript, you'd only need to
 reload the resource:
 Of course we want to do this dynamically in JavaScript - IMHO it would
 be the norm, not the exception, to select fragments based on user input.
 Precomputed fragments are of limited use. I don't quite understand why
 the dynamic case is so often underrepresented in these discussions...
 
 http://open.bbc.co.uk/rad/demos/html5/rdtv/episode2/index.html
 This example from the BBC shows how to dynamically jump to fragments
 based on user input by setting the currentTime of the video. I don't
 see a difference between using the currentTime and using start and
 end. Precision is influenced more strongly by the temporal
 resolution of the decoding pipeline rather than the polling resolution
 for currentTime.

I agree regarding 'start'. W.r.t. 'end', the difference is quite simple
IMHO: when treated as a declarative description of where audio has to
stop, it is up to the UA to implement it correctly (there could be some
friendly browser competition as to who is most accurate...). I see no
practical reason why this could not be done sample-accurately for media
types such as 16-bit PCM WAVE that support sample-accurate work.

On the other hand, 'currentTime' has no such semantics. So you would be
looking at a bewildering array of factors influencing temporal
resolution, including JS and machine speed, and you could not make any
guarantees to your customer. In speech applications, sub-millisecond
resolution can determine whether you hear an audible artifact or not. If
your task were, say, to remove such artifacts from the signal, a
'currentTime'-based solution would in all likelihood not give
reproducible results.
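
For illustration, the kind of polling-based 'end' emulation under
discussion would look roughly like this (a rough sketch; the element
lookup, variable names and timer interval are mine, nothing here is from
any spec):

// Hypothetical sketch: emulating 'end' by polling currentTime.
// How late the pause() fires depends on the timer granularity, the JS
// engine, machine load and the media pipeline - none of it guaranteed.
var audio = document.getElementsByTagName('audio')[0];
var endTime = 0.72636; // seconds

audio.play();
var poller = setInterval(function () {
  if (audio.currentTime >= endTime) {
    audio.pause();
    clearInterval(poller);
  }
}, 10); // 10 ms polling - already far coarser than one sample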

 I doubt the previous implementations of start and
 end gave you a 3-sample-accurate resolution even for wav files.

'end' was quite inaccurate in Safari, but this may be due to a
buffering issue - to this day, Safari's audio latency when doing
something like onmouseover="audio.play()" is much higher than Mozilla
Firefox's.

So, to my mind it seems a solvable UA implementation issue, not a
problem with semantics.

Kind regards,
-- Markus
_
SVOX AG, Baslerstr. 30, CH-8048 Zürich, Switzerland
Dr. Markus Walther, Software Engineer Speech Technology
Tel.: +41 43 544 06 36
Fax: +41 43 544 06 01
Mail: walt...@svox.com





Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-17 Thread Dr. Markus Walther


Max Romantschuk wrote:
 I'll chime in here, having done extensive work with audio and video
 codecs. With current codec implementations getting sample- or
 frame-accurate resolution is largely a pipe dream. (Outside of the realm
 of platforms dedicated to content production and playback.) Especially
 for video there can be several seconds between keyframes, frame-accurate
 jumps requiring complex buffering tricks.

Quick feedback: I completely agree it is illusory to guarantee
sample-accuracy across codecs, and never meant to imply such a
requirement.

The much weaker goal I would propose is to support at least one simple
lossless audio format in this regard (I am not qualified to comment on
the video case). Simple means 'simple to generate, simple to decode',
and PCM WAVE meets these requirements, so it would be an obvious candidate.

For that candidate at least I think one could give sample-accurate
implementations of subinterval selection - tons of audio applications
demonstrate this is possible.

-- Markus
_
SVOX AG, Baslerstr. 30, CH-8048 Zürich, Switzerland
Dr. Markus Walther, Software Engineer Speech Technology
Tel.: +41 43 544 06 36
Fax: +41 43 544 06 01
Mail: walt...@svox.com





Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-14 Thread Dr. Markus Walther
Hi,

 The .start/.end properties were dropped in favor of media fragments,
 which the Media Fragments Working Group is producing a spec for.

Who decided this? Has this decision been made public on this list?

 It will
 be something like http://www.example.com/movie.mov#t=12.33,21.16

var audioObject = new Audio();
audioObject.src
='data:audio/x-wav;base64,UklGRiIAAABXQVZFZm10IBABAAEAIlYAAESsAAACABAAZGF0Yf7///8A';
// play entire audio
audioObject.play();
// play (0.54328,0.72636) media fragment
?
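
(For concreteness, the best guess I can make is something like the
following - a sketch only, since whether a UA would honour a temporal
fragment appended to a data URI is exactly the open question here:)

// Hypothetical: append a media fragment to the existing data URI.
audioObject.src = audioObject.src + '#t=0.54328,0.72636';
audioObject.load();
audioObject.play();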

 
 See http://www.w3.org/2008/01/media-fragments-wg.html and
 http://www.w3.org/2008/WebVideo/Fragments/wiki/Syntax#Examples

Did you look at these yourself? I couldn't find anything that
approaches a spec of comparable quality to the WHATWG's on these pages.

Is there any provision for the dynamic case, where you want to change
the media fragment after it has been loaded, with zero server
interaction, and working for data URIs as well?

 Actually, out of curiosity: could gapless concatenation of several
 audio objects be added as well, e.g.

 audioObject1.append(audioObject2)

 or even

 audioObject.join([audioObject1,audioObject2,...,audioObjectN])
 
 There has been much discussion about audio canvas APIs and I trust
 this could fit into that scope.

As the 'inventor' of the term, I am of course familiar with the
discussion - here I was merely adding an item to the wishlist.

 View source at
 http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#video
 and search for v2 and you'll find some of these ideas.

Could these be lifted from hidden HTML comments to something with better
visibility somehow?

-- Markus


Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-14 Thread Dr. Markus Walther
Silvia,

 2009/8/13 Dr. Markus Walther walt...@svox.com:
 please note that with cue ranges removed, the last HTML 5 method to
 perform audio subinterval selection is gone.
 
 Not quite. You can always use the video.currentTime property in a
 javascript to directly jump to a time offset in a video. And in your
 javascript you can check this property until it arrives at your
 determined end time. So, there is a way to do this even now.

How can a polling approach that somehow monitors currentTime meet any
halfway-decent accuracy requirement, e.g. being accurate to 1-3 samples
at a 22050 Hz sampling frequency? I doubt your approach could fulfill this.
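
(For concreteness: 3 samples at 22050 Hz is 3/22050 s, roughly 0.14 ms.
A check of currentTime driven from a 10 ms timer, or from 'timeupdate'
events - which, if I read the spec right, may fire as seldom as every
250 ms - is two to three orders of magnitude coarser than that, before
any JS scheduling jitter is even counted.)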

To my mind, the current turn of events suggests simply allowing the
start/end attributes back into the WHATWG spec, eased by the fact that
browser implementations of them already existed.

-- Markus


Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-14 Thread Dr. Markus Walther


Silvia Pfeiffer wrote:
 2009/8/14 Dr. Markus Walther walt...@svox.com:
 Hi,

 The .start/.end properties were dropped in favor of media fragments,
 which the Media Fragments Working Group is producing a spec for.
 Who decided this? Has this decision been made public on this list?

 It will
 be something like http://www.example.com/movie.mov#t=12.33,21.16
 var audioObject = new Audio();
 audioObject.src
 ='data:audio/x-wav;base64,UklGRiIAAABXQVZFZm10IBABAAEAIlYAAESsAAACABAAZGF0Yf7///8A';
 // play entire audio
 audioObject.play();
 // play (0.54328,0.72636) media fragment
 ?
 
 Not in this way. In fact, the new way will be much much simpler and
 does not require javascript. 

With the code snippet given I was pointing out that it is not obvious
(to me at least) how the proposed media fragment solution covers data
URIs. If it is not meant to cover them, it is limited in a way that the
solution it seeks to replace is not.

 Or if you really wanted to do it in javascript, you'd only need to
 reload the resource:

Of course we want to do this dynamically in JavaScript - IMHO it would
be the norm, not the exception, to select fragments based on user input.
Precomputed fragments are of limited use. I don't quite understand why
the dynamic case is so often underrepresented in these discussions...

-- Markus


Re: [whatwg] Remove addCueRange/removeCueRanges

2009-08-13 Thread Dr. Markus Walther
Hi,

please note that with cue ranges removed, the last HTML 5 method to
perform audio subinterval selection is gone.

AFAIK, when dropping support for the 'start' and 'end' attributes it was
noted on this list that cue ranges would provide a replacement for
dynamically selecting, say, a 3-second range from a 1-hour audio source.

So, if cue ranges will indeed be dropped, could browser vendors and
standards people consider putting 'start' and 'end' back in, just like
Safari had it for a while (albeit buggy)?

Actually, out of curiosity: could gapless concatenation of several
audio objects be added as well, e.g.

audioObject1.append(audioObject2)

or even

audioObject.join([audioObject1,audioObject2,...,audioObjectN])
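
(Absent such an API, the obvious workaround - chaining playback via
'ended' events, sketched below with a function name of my own invention -
is anything but gapless, since the gap between clips is whatever the
event and decode latency happens to be:)

// Workaround sketch, not a proposal: play Audio objects back to back.
function playInSequence(clips) {
  var i = 0;
  function playNext() {
    if (i >= clips.length) return;
    var clip = clips[i++];
    clip.addEventListener('ended', playNext, false);
    clip.play();
  }
  playNext();
}
// e.g. playInSequence([audioObject1, audioObject2]);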

Just my 2c.

--Markus

Philip Jägenstedt wrote:
 Hi,
 
 We would like to request that addCueRange/removeCueRanges be dropped
 from the spec before going into Last Call. We are not satisfied with it
 and want to see it replaced with a solution that includes (scriptless)
 timed text (a.k.a captions/subtitles). I don't think that this will be
 finished in time for Last Call however, because we need implementor
 experience to write a good spec. However, we have no intention of
 implementing both cue ranges and its replacement, so it is better if the
 spec doesn't provide any solution for now.
 
 I have been briefly in contact with other browser vendors and while I
 cannot speak for them here, I hope those that agree will chime in if
 necessary.
 


Re: [whatwg] Codecs for audio and video

2009-06-30 Thread Dr. Markus Walther


Ian Hickson wrote:
 On Tue, 30 Jun 2009, Matthew Gregan wrote:
 Is there any reason why PCM in a Wave container has been removed from 
 HTML 5 as a baseline for audio?
 
 Having removed everything else in these sections, I figured there wasn't 
 that much value in requiring PCM-in-Wave support. However, I will continue 
 to work with browser vendors directly and try to get a common codec at 
 least for audio, even if that is just PCM-in-Wave.

Please, please do so - I was shocked to read that PCM-in-Wave as the
minimal 'consensus' container for audio is under threat of removal, too.

Frankly, I don't understand why audio was drawn into this. Is there any
patent issue with PCM-in-Wave? If not, then IMHO the decision should be
orthogonal to video.

-- Markus


Re: [whatwg] Codecs for audio and video

2009-06-30 Thread Dr. Markus Walther
Gregory Maxwell wrote:
 PCM in wav is useless for many applications: you're not going to do
 streaming music with it, for example.

 It would work fine for sound effects...

The world in which web browsers live is quite a bit bigger than the
internet and ordinary consumer use combined...

Browser-based intranet applications for companies working with
professional audio or speech are but one example. Please see my earlier
contributions to this list for more details.

 but it still is more code to
 support, a lot more code in some cases depending on how the
 application is layered even though PCM wav itself is pretty simple.
 And what exactly does PCM wav mean?  float samples? 24 bit integers?
 16bit? 8bit? ulaw? big-endian? 2 channel? 8 channel? Is a correct
 duration header mandatory?

To give one specific point in this matrix: 16-bit integer samples,
little-endian, 1 channel, correct duration header not mandatory.
This is what matters in practice for what we do. I can't speak for others.
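
(To illustrate how little is involved for exactly that point, here is a
rough sketch of building such a file as a data URI in JavaScript - byte
layout from the usual RIFF/WAVE description, helper names mine, not
tested against any particular decoder:)

// Sketch: 44-byte header for 16-bit little-endian mono PCM, plus the
// raw samples, packaged as a data URI. 'samples' is assumed to be an
// array of 16-bit signed integers.
function u32le(n) {
  return String.fromCharCode(n & 0xff, (n >> 8) & 0xff,
                             (n >> 16) & 0xff, (n >> 24) & 0xff);
}
function u16le(n) { return String.fromCharCode(n & 0xff, (n >> 8) & 0xff); }

function wavDataUri(samples, rate) {
  var data = '';
  for (var i = 0; i < samples.length; i++) data += u16le(samples[i] & 0xffff);
  var header =
    'RIFF' + u32le(36 + data.length) + 'WAVE' +
    'fmt ' + u32le(16) + u16le(1) /* PCM */ + u16le(1) /* mono */ +
    u32le(rate) + u32le(rate * 2) /* byte rate */ +
    u16le(2) /* block align */ + u16le(16) /* bits per sample */ +
    'data' + u32le(data.length);
  return 'data:audio/x-wav;base64,' + btoa(header + data);
}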

 It would be misleading to name a 'partial baseline'. If the document
 can't manage to make a complete workable recommendation, why make one at
 all?

I disagree. Why insist on perfection here? In my view, the whole of HTML
5 as discussed here is about reasonable compromises that can be
supported now or pretty soon. As the browsers which already support PCM
wav (e.g. Safari, Firefox) show, it isn't impossible to get this right.

Regards,
-- Markus


Re: [whatwg] video tag : loop for ever

2008-10-30 Thread Dr. Markus Walther


Silvia Pfeiffer wrote:
 I believe your use case of creating an audio editor through using the
 audio tag is a bit far-fetched. I don't think it lends itself to
 that kind of functionality.

Your belief is fine with me - you haven't seen the prototype running on
Safari ;-)

 You would not use the img tag to
 implement a picture editor either.

This is a non-compelling analogy that I already discussed on the list -
IMHO it's simply a matter of taste whether to proliferate HTML elements
or extend the API of an existing element a bit. For what it's worth,
people _could_ have extended img instead of going for canvas ...

 As for start/end attributes - I still believe that a javascript API
 towards changing start and end times for playback is much more
 appropriate than changing attributes and expecting the media framework
 to react to the changed attribute values. If the main usecase for you
 is dynamic and not static, then you should have an interface that has
 direct access to the video controls (i.e. directly run a function)
 rather than going through an attribute indirection (i.e. change state,
 which needs to trigger a function). Note that this does not imply a
 roundtrip to the server.

My use cases are neutral to the finer points you raise here - I would
simply do a pause() before setting start/end to new values and calling
play() again. If necessary, there could be restrictions in the spec on
resetting those values during playback, or no UA guarantees regarding
audible glitches. Either way is fine, just not dropping start/end
altogether.

--Markus



[whatwg] Web-based dynamic audio apps - WAS: Re: video tag : loop for ever

2008-10-17 Thread Dr. Markus Walther

Eric Carlson wrote:

 Imagine e.g. an audio editor in a browser and the task 'play this
 selection of the oscillogram'...

 Why should such use cases be left to the Flash 10 crowd
 (http://www.adobe.com/devnet/flash/articles/dynamic_sound_generation.html)?


 I for one want to see them become possible with open web standards!

   I am anxious to see audio-related web apps appear too, I just don't
 think that including 'start' and 'end' attributes will make them
 significantly easier to write.

I did a tiny prototype of the above use case - audio editor in a browser
- and it would have been significantly easier to write, had Apple's
Safari not had a bad and still-unfixed bug in its implementation of 'end'
... (http://bugs.webkit.org/show_bug.cgi?id=19305)

 
 In addition, cutting down on the number of HTTP transfers is generally
 advocated as a performance booster, so the ability to play sections of a
 larger media file using only client-side means might be of independent
 interest.

   The 'start' and 'end' attributes, as currently defined in the spec,
 only limit the portion of a file that is played - not the portion of a
 file that is downloaded.

I know that, but for me that's not the issue at all.

The issue is _latency_. How long from a user action to audible playback
- that's what's relevant to any end user.

You can't do responsive audio manipulation in the browser without fast,
low-latency client-side computation. All the server-side proposals miss
this crucial point.

For another use case, consider web-based tools for DJs, for mixing and
combining audio clips. There are a lot of clips on the web, but if
manipulating them is not real-time enough, people won't care.

For another use case, consider web-based games with dynamic audio, etc.

Robert O'Callahan wrote:
 On Fri, Oct 17, 2008 at 5:24 AM, Dr. Markus Walther [EMAIL PROTECTED] wrote:

 Imagine e.g. an audio editor in a browser and the task 'play this
 selection of the oscillogram'...

 Why should such use cases be left to the Flash 10 crowd

(http://www.adobe.com/devnet/flash/articles/dynamic_sound_generation.html)?


 If people go in that direction they won't be using cue ranges etc,
 they'll be using dynamic audio generation, which deserves its own API.

And I proposed the beginnings of such an API in several postings on this
list under the topic 'audio canvas', but it seemingly met with little
interest. Now Flash 10 has some of the things I proposed... maybe that's
a louder voice?

 OK, in principle you could use audio with data:audio/wav, but that
 would be crazy. Then again, this is the Web so of course people will do
 that.

I did exactly that in my tiny audio-editor prototype for
proof-of-concept purposes - I guess I must be crazy :-) Actually it was
partly a workaround for browser bugginess, see above.

Give me an API with at least

float getSample(long samplePosition)

putSample(long samplePosition, float sampleValue)

play(long samplePositionStart, unsigned long numSamples),

and sanity will be restored ;-)
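
(To make the intent concrete - this is a purely hypothetical API, the
method names above exist in no UA, and 'audio', the sample positions and
values below are made up - an artifact-repair task would then reduce to
something like:)

// Hypothetical usage of the proposed getSample/putSample/play API.
// Smooth over a click between samples 12000 and 12003 by linear
// interpolation, then audition ~0.1 s around the repair (at 22050 Hz).
var before = audio.getSample(11999);
var after  = audio.getSample(12004);
for (var i = 12000; i <= 12003; i++) {
  var frac = (i - 11999) / 5;
  audio.putSample(i, before + frac * (after - before));
}
audio.play(11000, 2205);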

The current speed race w.r.t. the fastest JavaScript on the planet will
then take care of the rest.

Silvia Pfeiffer wrote:

 Linking to a specific time point or section in a media file is not
 something that needs to be solved by HTML. It is in fact a URI issue
 and is being developed by the W3C Media Fragments working group.

 If you use a URI such as http://example.com/mediafile.ogv#time=12-30
 in the src attribute of the video element, you will not even have to
 worry about start and end attributes for the video element.

Unless (a) media fragments can be set dynamically for an already
downloaded media file _without triggering re-download_, (b) the time
specification can be accurate to the individual sample in the case of
audio, (c) the W3C finishes this quickly enough and (d) browsers take the
W3C recommendation seriously, it is not an alternative for my use cases.

It's all about dynamic audio and the future. By the time the spec hits
the market, static media will no longer be the only thing on the web.

Jonas Sicking wrote:

 The problem with relying on cues is that audio plays a lot faster than
 we can guarantee that cue-callbacks will happen. So if you for example
 create an audio file with a lot of sound effects back to back it is
 possible that a fraction of a second of the next sound will play before
 the cue-callback is able to stop it.

If I understand this correctly, cue callback delay would potentially
make it impossible to have precise audio intervals, needed for the above
use cases.

But _then_ replacing 'start' and 'end' with cue ranges and 'currentTime'
is NOT possible, because they are no longer guaranteed to be equivalent
in terms of precision.

It seems the arguments are converging more towards keeping 'start' and
'end' in the spec.

-- Markus


Re: [whatwg] video tag : loop for ever

2008-10-16 Thread Dr. Markus Walther


Eric Carlson wrote:
 
 On Oct 15, 2008, at 8:31 PM, Chris Double wrote:
 
 On Thu, Oct 16, 2008 at 4:07 PM, Eric Carlson [EMAIL PROTECTED]
 wrote:
 However I also think
 that playing just a segment of a media file will not be a common
 use-case, so I
 don't think we need start and end either.

 How would you emulate end via JavaScript in a reasonably accurate
 manner?

 
   With a cue point.
 
 If I have a WAV audio file and I want to start and stop
 between specific points? For example a transcript of the audio may
 provide the ability to play a particular section of the transcript.

   If you use a script-based controller instead of the one provided by
 the UA, you can easily limit playback to whatever portion of the file
 you want:
 
 SetTime: function(time) { this.elem.currentTime =
 (time < this._minTime) ? this._minTime :
 (time > this._maxTime ? this._maxTime : time); }

IMHO, using 'currentTime' and cue ranges is - while technically possible
- a more cumbersome and roundabout way to delimit a single audio
interval than just using 'start' and 'end' attributes.

I advocate keeping the simple way to do it, with 'start' and 'end', in
the spec.

Also, since you just showed how it can be implemented using cue ranges
and currentTime, having a second, simpler interface (for the case of a
single interval) should be cheap in terms of implementation cost, if you
plan to implement the other one anyway.

   I agree that it is more work to implement a custom controller, but it
 seems a reasonable requirement given that this is likely to be a
 relatively infrequent usage pattern.

How do you know this will be infrequent?

   Or do you think that people will frequently want to limit playback to
 a section of a media file?

Yes, I think so - if people include those folks working with
professional audio/speech/music production. More specifically, the
innovative ones among them, who would like to see audio-related web
apps appear.

Imagine e.g. an audio editor in a browser and the task 'play this
selection of the oscillogram'...

Why should such use cases be left to the Flash 10 crowd
(http://www.adobe.com/devnet/flash/articles/dynamic_sound_generation.html)?

I for one want to see them become possible with open web standards!

In addition, cutting down on the number of HTTP transfers is generally
advocated as a performance booster, so the ability to play sections of a
larger media file using only client-side means might be of independent
interest.

-- Markus


Re: [whatwg] Audio canvas?

2008-07-24 Thread Dr. Markus Walther


I think an interesting approach for an audio canvas would be to allow 
you to both manipulate audio data directly (through a 
getSampleData/putSampleData type interface), but also build up an audio 
filter graph, both with some predefined filters/generators and with the 
ability to do filters in javascript.  Would make for some interesting 
possibilities, esp. if it's able to take audio as input.


I entirely agree. In my own proposal so far I only mentioned simple 
time-domain ops (cut/add silence/fade) as filters - what would be the 
filters/generators on your wishlist?


-- Markus


Re: [whatwg] Audio canvas?

2008-07-17 Thread Dr. Markus Walther



ddailey wrote:
I recall a little app called soundEdit (I think) that ran on the Mac 
back in the mid-1980s. I think it was shareware (at least it was 
ubiquitous).


The editing primitives were fairly cleanly defined and had a reasonable 
metaphoric correspondence to the familiar drawing actions.


There was a thing where you could grab a few seconds of sound and copy 
it and paste it; you could drag and drop; you could invert (by just 
subtracting each of the tones from a ceiling); you could reverse (by 
inverting the time axis). You could even go in with your mouse and drag 
formants around. It was pretty cool.


It would not be a major task for someone to standardize such an 
interface and I believe any patents would have expired by now.


No need to go to particular _applications_ for inspiration when 
libraries developed with some generality in mind (e.g. 
http://www.speech.kth.se/snack/man/snack2.2/tcl-man.html) can already 
serve that purpose. A carefully chosen subset of Snack might be a good 
start.



David
- Original Message - From: Dave Singer [EMAIL PROTECTED]
To: whatwg@lists.whatwg.org
Sent: Wednesday, July 16, 2008 2:25 PM
Subject: Re: [whatwg] Audio canvas?



At 20:18  +0200 16/07/08, Dr. Markus Walther wrote:


get/setSample(samplePoint t, sampleValue v, channel c).

For the sketched use case - in-browser audio editor -, functions on 
sample regions from {cut/add silence/amplify/fade} would be nice and 
were mentioned as an extended possibility, but that is optional.


I don't understand the reference to MIDI, because my use case has no 
connection to musical notes, it's about arbitrary audio data on which 
MIDI has nothing to say.


get/set sample are 'drawing primitives' that are the equivalent of 
get/setting a single pixel in images.  Yes, you can draw anything a 
pixel at a time, but it's mighty tedious.  You might want to lay down 
a tone, or some noise, or shape the sound with an envelope, or do a 
whole host of other operations at a higher level than 
sample-by-sample, just as canvas supports drawing lines, shapes, and 
so on.  That's all I meant by the reference to MIDI.


I see. However, to repeat what I said previously:

audio =/= music.

The direction you're hinting at would truly justify inventing a new 
element, since it sounds like it's specialized to synthesized music. But 
that's a pretty narrow subset of what audio encompasses.


Regarding the tediousness of doing things one sample at a time I agree, 
but maybe it's not as bad as it sounds. It depends on how fast 
JavaScript gets, and Squirrelfish is a very promising step (since the 
developers acknowledge they learnt the lessons from Lua, the next 
acceleration step could be to copy ideas from luajit, the extremely fast 
Lua-to-machine-code JIT compiler). If it gets fast enough, client-side 
libraries could do amazing stuff using sample-at-a-time primitives.


Still, as I suggest above, a few higher-level methods could be useful.


-- Markus


[whatwg] Audio canvas?

2008-07-16 Thread Dr. Markus Walther

I have noted an asymmetry between canvas and audio:

canvas supports loading of ready-made images _and_ pixel manipulation 
(get/putImageData).


audio supports loading of ready-made audio but _not_ sample manipulation.

With browser JavaScript getting faster all the time (Squirrelfish...), 
audio manipulation in the browser is within reach, if supported by rich 
enough built-in objects.


Minimally, sample-accurate methods would be needed to
- get/set a sample value v at sample point t on channel c from audio
- play a region from sample point t1 to sample point t2

(Currently, everything is specified using absolute time, so rounding 
errors might prevent sample-accurate work).


More powerful methods might cut/add silence/amplify/fade portions of 
audio in a sample-accurate way.


It would be OK if this support were somewhat restricted, e.g. only for 
certain uncompressed audio formats such as PCM WAVE.


Question: What do people think about making audio more like canvas 
as sketched above?


-- Markus



Re: [whatwg] Audio canvas?

2008-07-16 Thread Dr. Markus Walther


 My understanding of HTMLMediaElement is that the currentTime, volume
 and playbackRate properties can be modified live.

 So in a way Audio is already like Canvas: the developer modifies things
 on the go. There are no automated animations/transitions like in SVG,
 for instance.

 Doing a cross fade in Audio is done exactly the same way as in Canvas.

That's not what I described, however. Canvas allows access to the most 
primitive element with which an image is composed, the pixel. Audio does 
not allow access to the sample, which is the equivalent of a pixel in the 
sound domain. That's a severe limitation. Using tricks with data URIs 
and a known simple audio format such as PCM WAVE is no real substitute, 
because JavaScript strings are immutable.
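
(Concretely - a sketch with invented helper names and an assumed
'audioElem', 'samplePosition' and 'newValue' - the data-URI trick means
that touching a single sample forces you to rebuild and reload the
entire resource:)

// Sketch of the data-URI workaround being criticised: one sample change
// means decode, copy, re-encode and reload the whole file.
var wav = atob(audioElem.src.split(',')[1]); // whole file as a string
var offset = 44 + 2 * samplePosition;        // 44-byte header, 16-bit mono
wav = wav.substring(0, offset) +
      String.fromCharCode(newValue & 0xff, (newValue >> 8) & 0xff) +
      wav.substring(offset + 2);
audioElem.src = 'data:audio/x-wav;base64,' + btoa(wav);
audioElem.load();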


It is unclear to me why content is still often seen as static by default 
- if desktop apps are moved to the browser, images and sound will 
increasingly be generated and modified on-the-fly, client-side.


 And if you're thinking special effects ( e.g.: delay, chorus, flanger,
 pass band, ... ) remember that with Canvas, advanced effects require
 trickery and to composite multiple Canvas elements.

I have use cases in mind like an in-browser audio editor for music or 
speech applications (think 'Cooledit/Audacity in a browser'), where 
doing everything server-side would be prohibitive due to the amount of 
network traffic.


--Markus


Re: [whatwg] Audio canvas?

2008-07-16 Thread Dr. Markus Walther

Thanks for all the feedback so far!

Dave Singer wrote:


As others have pointed out, I think you're asking for a new element, 
where you can 'draw' audio as well as pre-load it, just like canvas 
where you can load pictures and also draw them.  This is not the audio 
element, any more than canvas is the img element.


Not sure I agree. Your line of reasoning in general leads to a 
proliferation of elements, whereas my proposal to extend audio makes 
that same element more powerful. I guess it's more a matter of 
aesthetics which approach is better.


It's an interesting idea, but you'd have to answer 'what are your 
drawing primitives', and so on.  More, when creating visual content, you 
are drawing on spatial axes, whereas in audio you are creating or 
modifying samples, which lie themselves on a temporal axis.


I agree and I think I pointed that out already in my initial posting.

I'm guessing that something like MIDI would be drawing primitives, but 
overall this idea would seem to need a lot of working out...


Again, in that initial posting I was quite specific about an initial set 
of 'drawing' primitives - audio-manipulation primitives - minimally:


get/setSample(samplePoint t, sampleValue v, channel c).

For the sketched use case - in-browser audio editor -, functions on 
sample regions from {cut/add silence/amplify/fade} would be nice and 
were mentioned as an extended possibility, but that is optional.


I don't understand the reference to MIDI, because my use case has no 
connection to musical notes; it's about arbitrary audio data, about which 
MIDI has nothing to say.


-- Markus