Hi Guys,

Here are some implementation ideas I conjured from thin air a while ago 
(perhaps that's all they contain), I hope someone finds them interesting. 
And in case it's not patented yet (in case someone would WANT to patent 
it, go figure?!?), now there is prior art ;-).

Recently there was a brief discussion on LAD about there NEVER being 
enough computing power to fill all of our audio needs. That brought some 
ideas to the surface that I was thinking about a while ago.
They are probably not new at all (in fact I know they aren't), but I 
haven't seen any mention of them here on LAD, or on any other audio forum, 
unless I'm lacking terminology or something...

We (the people composing/recording music on a computer) strive to do 
everything in realtime when handling audio, and it's the 'right thing to 
do'.
Computers do get more powerful all the time, increasing what we can do 
in realtime, but the general idea is that it will never be enough. Today 
computers are powerful enough that it would be an option to cheat a 
little. Basically what I'm talking about is prerendering sequences of 
audio that can be used as realtime submixes.

The folks making 3D movies with software have done this for a long 
time. Since there is close to nothing in the (high quality) 3D world 
that can be rendered in realtime, they are faced with similar issues all 
the time. If we look at post-processing, they have developed tools to 
handle this in a way that is actually rather close to realtime 
non-destructive editing.
Applications like Shake and Maya have post-processing views where you 
can apply effects to single frames or an entire sequence of film to 
alter its look. The relation to adding effects to audio tracks is 
almost 1-to-1. The entire effect sequence is, however, not rendered in 
realtime but rather updated in the background when changes are 
introduced, like adjusting parameters for filters etc...

This can for instance look like this...

<image1>--<filter>--\
                     \
                  <merge>---<another filter>---
                     /
<image2>--<blur>----/
 

... like a tree structure.

An example: if we change a parameter in the <blur> filter, then <blur>, 
<merge> and <another filter> will all be updated due to their 
dependencies on each other.
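
Just to make the dependency idea concrete, here is a minimal sketch of how 
such a node graph could propagate "needs re-rendering" downstream. It's my 
own illustration in C++, not taken from any of the tools above, and all the 
names are made up:

#include <iostream>
#include <string>
#include <vector>

struct Node {
    std::string name;
    std::vector<Node*> readers;   // nodes that consume this node's output
    bool dirty = false;

    void mark_dirty() {
        if (dirty) return;           // already marked, stop the recursion
        dirty = true;
        std::cout << name << " needs re-rendering\n";
        for (Node* r : readers)      // propagate downstream
            r->mark_dirty();
    }
};

int main() {
    Node image1{"<image1>"}, filter{"<filter>"}, blur{"<blur>"},
         image2{"<image2>"}, merge{"<merge>"}, another{"<another filter>"};

    image1.readers = {&filter};
    filter.readers = {&merge};
    image2.readers = {&blur};
    blur.readers   = {&merge};
    merge.readers  = {&another};

    // The example from above: touch a parameter on <blur> ...
    blur.mark_dirty();  // ... and <blur>, <merge>, <another filter> follow
}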

If we map an audio example onto the above picture, it could look like this:

<audio track1>---<super delay>--\
                                 \
                              <mixer>---<super reverb>--
                                 /
<audio track2>--<super chorus>--/

Two audio tracks being mixed together with some effects added at the 
same time, nothing fancy.
Let's pretend that we have performance issues here; it could be anything 
from an effect that drains the CPU, to there being so many tracks that the 
hard disk becomes a limiting factor, to the mixers and submixers not being 
able to provide data in realtime. This would mean we are in trouble if we 
wish to render in realtime. A standard fix would probably be to do a submix 
of some parts and use this new mix as a new <audio track>. Not very 
elegant, and to some extent bad due to its destructive nature: the 
original sound can no longer be used in the project.

What if this structure had the ability to cache changes in the 
background? Like this:

<audio track1>*---<super delay>*--\
                                   \
                                <mixer>*---<super reverb>*--
                                   /
<audio track2>*--<super chorus>*--/

The same as above, but I added an * everywhere; each one represents a 
cache point where the data could be cached in the background to make up 
for any node(s) that aren't able to perform in realtime.
With this approach we could effectively 'pretend' that the track is 
rendered in realtime. We would (of course) not get any audio before the 
track was rendered the first time, and we would not get any audio while 
parameters are being changed earlier in the pipeline. But as long as 
there are no changes, the audio track would be played back as if the 
effect(s) were rendered in realtime. And of course the data is changed 
nondestructively, which is a major plus.

An example: if the <super chorus> wasn't able to perform in realtime, it 
would be cached in the background, as would all the following nodes, 
through their dependency on the <super chorus>. So... when we adjust a 
parameter on the <super chorus>, it will trigger a rewrite of all the 
dependent nodes' caches as well as its own.
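
To illustrate what a cache point could look like, here is a rough sketch I 
made up (plain C++, everything hypothetical): during playback we only ever 
read from the cache; a parameter change upstream invalidates it, and a 
low-priority background thread is supposed to refill it while the realtime 
side hands out silence:

#include <algorithm>
#include <cstddef>
#include <vector>

using Block = std::vector<float>;   // a run of samples

struct CachePoint {
    Block cache;                    // pre-rendered output for the whole region
    bool valid = false;

    // Called from the dirty-propagation when anything upstream changes.
    void invalidate() { valid = false; }

    // Called from the realtime thread: never render here, just copy.
    void read(std::size_t offset, Block& out) const {
        if (!valid) {                                 // cache being rebuilt:
            std::fill(out.begin(), out.end(), 0.0f);  // hand out silence
            return;
        }
        for (std::size_t i = 0; i < out.size(); ++i)
            out[i] = (offset + i < cache.size()) ? cache[offset + i] : 0.0f;
    }

    // Called from a low-priority background thread, in non-realtime.
    void store(Block rendered) {
        cache = std::move(rendered);
        valid = true;
    }
};

int main() {
    CachePoint chorus;                 // the <super chorus>* cache point
    chorus.store(Block(44100, 0.5f));  // background render done: 1 s of audio

    Block out(64);
    chorus.read(0, out);               // realtime read: comes from the cache

    chorus.invalidate();               // parameter changed upstream
    chorus.read(0, out);               // silence until the re-render lands
}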

If we talk about extending this approach to software synthesizers, the 
idea is even more appealing.
Note: to be able to render synth output, they MUST be able to operate 
like a VST Instrument, producing output in non-realtime. I hope this is, 
or can be made, the case for softsynths on Linux?
Csound is an excellent example of an application that more often than 
not cannot provide output in realtime.
Imagine a midi+audio application where the midi part tries to drive a 
Csound softsynth, which would send its output to the audio mixer (I'm 
just imagining here, not possible at the moment I would think).
Csound would probably choke pretty much all the time. If the midi+audio 
application was aware of the performance problem (it could be told by the 
user if there is no good software solution), then it could drive Csound 
in the background (non-realtime) and cache the audio output for later 
mixing in the audio part of the program. This way, the chain from the 
midi tracks, containing the actual notes, to the output audio data can 
remain unbroken even though Csound is unable to deliver in realtime.
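
As a sketch of how the midi+audio application could drive Csound in the 
background, something as dumb as shelling out would already do. The file 
names below are placeholders I made up; I'm only assuming the usual 
"csound -o <output file> <csd file>" invocation:

#include <cstdlib>
#include <iostream>

int main() {
    // Non-realtime render: let Csound take as long as it needs.
    // "-o" picks the output file; song.csd and cache.wav are placeholders.
    int rc = std::system("csound -o /tmp/cache.wav song.csd");
    if (rc != 0) {
        std::cerr << "background render failed, keep using the old cache\n";
        return 1;
    }
    // From here on the mixer can treat /tmp/cache.wav as an ordinary
    // <audio track>, streamed from disk in realtime like any other.
    std::cout << "cache ready, mixer may stream /tmp/cache.wav\n";
}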

Some catches.
An immediate reason not to use this approach is that it eats storage 
space, so a fast and big hard drive is a must. Today I wouldn't think 
that would be much of a problem, RAID'n'all. Caching on a RAM disk would 
also be a very good idea, to improve speed.

You probably also need a pretty good multitasking environment to do these 
background renderings, which is probably a good reason why something 
similar (to my admittedly limited knowledge) has never been implemented 
on win9x (other Windows versions are practically unused for audio work 
as far as I know) or macOS.
Related is also the need for lots of MIPS/FLOPS/GIPS/GOPS, but today's 
CPUs seem to me to be up to it, especially if there are several.
Since latency is not an issue, the rendering could, in theory, even be 
performed by a renderfarm on the side! :-)

Implementation issues
Adding a cache point at every node in the mixing tree is probably not 
trivial. Neither is the logic behind calculating what to update and when 
to update it. But I sure would like it if something similar was 
implemented 'real soon now' (tm).

Improvements to the concept would be to allow for different modes in 
parts of the mixing tree. E.g., some parts are always rendered in 
realtime, some are always non-realtime, and some are dynamically 
switched between the two when the need arises.
Another nice feature would be to allow starting playback even though a 
full background render hasn't been performed. If we are, for instance, 
only playing back a small part of the full song, a region, then this is 
all that needs to be cached; or, even more refined, we start playback as 
long as the background rendering is estimated to finish 'in time', as in 
the sketch below! ...
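
The "estimated to finish in time" check is just arithmetic; here's a 
back-of-the-envelope sketch (my own, with made-up numbers):

#include <iostream>

// Playback never overtakes the background render if, with c seconds already
// cached, a region of L seconds left to play and the renderer producing r
// seconds of audio per second of wall clock, c + r*t >= t for all t in
// [0, L]; the worst case is t = L, which gives c >= (1 - r) * L.
bool can_start_playback(double region_seconds,   // L: length of the region
                        double cached_seconds,   // c: already in the cache
                        double render_speed) {   // r: seconds rendered per second
    if (render_speed >= 1.0) return true;        // renderer outruns playback
    return cached_seconds >= (1.0 - render_speed) * region_seconds;
}

int main() {
    // A 60 s region with 20 s cached and the renderer at 0.7x realtime:
    // we need 0.3 * 60 = 18 s in the cache, so playback can start right away.
    std::cout << std::boolalpha << can_start_playback(60.0, 20.0, 0.7) << "\n";
}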

Ouch! I think I'm done...

Brain dump completed...

/Robert
 
Ps. I originally sent this message to this list almost a week ago, but it 
got stuck in our mail relay. The mail relay totally broke down when 
someone decided to use it to spam some commercial; the queue contained 
160000 messages when they discovered it, and the stupid relay has a 
nominal throughput of less than one message a second... oh well... 
If the message appears again, disregard it.

Ds.

