Re: [Sursound] Optimised Decoder matrix (Ambdec)

Aaron Heller Tue, 23 Apr 2013 15:15:30 -0700

Clearly there's no magic here -- this scheme isn't going to let you play
third-order recordings on a cube or something as irregular as The Morning
Line.  As Fons has pointed out on the Linux audio list, decoder design for
irregular arrays is still a bit of a black art.  Even if there were a
definition of optimal we could all agree upon, I wouldn't claim that AllRAD
produces optimal decoders.  An interesting example to study is the
trirectangle, as we do know how to make optimal decoders for that, so any
deviation shows the compromises in the AllRAD approach.  I can post some
performance graphs if anyone would like to see them.

It does offer a deterministic solution for some speaker configurations that
are difficult to handle with conventional techniques, specifically
partial-coverage rigs, like a dome, which seem to be a fairly common
configuration, and it produces usable decoders quickly.

The AllRAD decoder we did for the 24-speaker dome at Bing Concert Hall
needed no manual tuning and the improvement over the previous (hand-tuned!)
decoder I'd designed was immediately obvious to everyone. Bing is a fairly
reverberant hall, so perhaps some artifacts were masked, but we also did
some careful full-sphere and upper-hemisphere decoder comparisons in
CCRMA's listening room, which is pretty dead, with quite favorable results.

One nice feature is that it gives you an easy-to-understand way to control
the reproduction of sources that are from directions where you don't have
adequate coverage. This is essentially the same problem as doing something
reasonable to play material with overhead sources, say an airplane flyover,
on a horizontal array.

Frankly, I was skeptical too, as it takes just a bit of math to see that
VBAP interpolation is not correct for Ambisonics. However, once you start
designing decoders for irregular arrays, you have to trade off uniform
energy (loudness) for correct angular direction. Numerical optimization
schemes let you make those trade-offs explicitly by adjusting the weights
in the objective function, but my experience is that it gets unwieldy once
you have more than 100 or so variables; for reference, a 3rd-order
decoder for a 24-speaker array has 384 variables (16 channels x 24
speakers) per frequency band. One
approach we mention in our LAC2012 paper is to optimize each order
successively, freezing lower orders as you go, except for an overall gain.
We found that this tends to create decoders that have a lower average rE
than optimizing all orders simultaneously, but it makes the process quicker
and more tractable.
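
As a rough illustration of the variable counts above (a sketch, not code from
the toolbox): assuming a full-sphere decoder with one gain per speaker and
channel, the totals and the per-order stage sizes work out as follows. The
function names are made up for this example, and the single overall-gain
variable for the frozen lower orders is ignored.

```python
def channels_up_to(order):
    """Full-sphere Ambisonic channel count through `order`: (order + 1)^2."""
    return (order + 1) ** 2


def channels_in(order):
    """Channels contributed by a single order l: 2l + 1."""
    return 2 * order + 1


def decoder_variables(order, num_speakers):
    """One gain per (speaker, channel) pair, per frequency band."""
    return channels_up_to(order) * num_speakers


def variables_per_stage(max_order, num_speakers):
    """New variables at each stage when orders are optimized one at a time.

    Assumption: the extra overall-gain variable for the frozen lower
    orders is left out of the count.
    """
    return [channels_in(l) * num_speakers for l in range(max_order + 1)]


print(decoder_variables(3, 24))    # 384
print(variables_per_stage(3, 24))  # [24, 72, 120, 168]
```

So the last stage of a successive third-order design for 24 speakers still has
168 free variables, compared with 384 when everything is optimized at once.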

In contrast, AllRAD starts with uniform energy distribution and tries to
preserve that by using a much larger number of virtual loudspeakers than
real ones, so that the signals for each triangle of speakers are derived
from 10 to 20 virtual speakers.
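
For readers who have not worked with VBAP, the panning step applied to each
virtual speaker can be sketched like this. It is only the textbook
per-triangle gain calculation, assuming unit direction vectors and
constant-energy normalization; it is not the toolbox code, and the function
names are invented for the example.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def vbap_triangle_gains(l1, l2, l3, p):
    """Gains for a source direction p inside the speaker triangle (l1, l2, l3).

    Solves g1*l1 + g2*l2 + g3*l3 = p by Cramer's rule, then scales the
    gains so that g1^2 + g2^2 + g3^2 = 1.
    Returns None if p falls outside the triangle (a gain is negative) or
    the three speaker directions do not span 3-D space.
    """
    det = dot(l1, cross(l2, l3))
    if abs(det) < 1e-12:
        return None
    g = [dot(p, cross(l2, l3)) / det,
         dot(p, cross(l3, l1)) / det,
         dot(p, cross(l1, l2)) / det]
    if min(g) < -1e-9:
        return None
    g = [max(x, 0.0) for x in g]          # remove tiny negative round-off
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return None
    return tuple(x / norm for x in g)


# A virtual speaker at the centre of a triangle on the x, y and z axes.
s = 1.0 / math.sqrt(3.0)
print(vbap_triangle_gains((1, 0, 0), (0, 1, 0), (0, 0, 1), (s, s, s)))
```

Each virtual speaker lands in exactly one triangle, so the feed for a real
speaker is a weighted sum of the virtual-speaker signals panned into the
triangles it belongs to.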

I did take a detailed look at rE in third-order AllRAD decoders. Over
regions with adequate loudspeaker coverage, maximum angular errors tend to
be a bit larger than those of the decoders produced by my optimizer when
angular accuracy is favored (~5-7 vs 1 degree), but I get good-sounding decoders
for large arrays in a few seconds vs a few hours. I'd like to think that
these decoders would be a good starting point for the optimizer, but
haven't experimented with that yet.
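
For anyone who wants to reproduce this kind of check, here is a minimal sketch
of the energy vector rE and its angular error, using the usual definition
rE = sum(g_i^2 u_i) / sum(g_i^2). It assumes you already have the per-speaker
gains g_i that a decoder produces for a test source, plus unit vectors u_i
pointing at the speakers; the function names are invented for the example.

```python
import math


def energy_vector(gains, speaker_dirs):
    """rE = sum(g_i^2 * u_i) / sum(g_i^2), with u_i unit vectors to speakers."""
    energy = sum(g * g for g in gains)
    if energy == 0.0:
        raise ValueError("all gains are zero")
    return tuple(
        sum(g * g * u[k] for g, u in zip(gains, speaker_dirs)) / energy
        for k in range(3)
    )


def magnitude(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def angular_error_deg(r_e, source_dir):
    """Angle in degrees between rE and the intended unit source direction."""
    mag = magnitude(r_e)
    if mag == 0.0:
        raise ValueError("rE has zero length; direction undefined")
    cos_angle = sum(a * b for a, b in zip(r_e, source_dir)) / mag
    # Clamp to [-1, 1] to guard acos against floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Two equally driven speakers on the x and y axes, a silent one on z.
speakers = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
r_e = energy_vector([1.0, 1.0, 0.0], speakers)
print(r_e, magnitude(r_e))
print(angular_error_deg(r_e, (1.0, 0.0, 0.0)))
```

Sweeping the test source over a grid of directions and recording the magnitude
and angular error at each point gives the kind of data discussed above.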

I also spent some time trying to understand why variations in magnitude and
direction of rE are not symmetric despite symmetric speaker arrays, and
finally realized that it is because the triangulation of the convex hull is
not necessarily symmetric.

One aspect of the manual tuning that Dave and Fons allude to is what to do
when there are large gaps in coverage of the speaker array, such as the
bottom of a dome, or a missing speaker from the 5th-order 12-speaker ring
Fons mentions. The conventional advice is not to use arrays like that for
ambisonic playback, but often that is not an option, as placement of
loudspeakers is typically constrained by architecture, rigging, aesthetics,
time, and budget, especially in temporary arrays, such as those CCRMA installs
at Bing or in their courtyard.

The solution Zotter proposes is to insert imaginary loudspeakers into the
array to fill large gaps. There are a couple of things one can do with the
signals intended for those imaginary speakers: decorrelate and mix into the
other speakers or simply discard. My toolbox currently does the latter so
it can produce presets for AmbDec, but it would not be difficult to
implement the former with a different decoding engine.
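
The "simply discard" option amounts to dropping the matrix rows that would
feed the imaginary speakers. A minimal sketch, assuming the decoder is stored
as one row of channel gains per speaker; the function name and the numbers are
placeholders for illustration, not output from the toolbox:

```python
def drop_imaginary_rows(decoder, is_imaginary):
    """Keep only the rows of the decoder matrix that feed real loudspeakers.

    decoder      : list of rows, one row of channel gains per speaker
    is_imaginary : list of bools, True where the speaker is imaginary
    """
    if len(decoder) != len(is_imaginary):
        raise ValueError("need one flag per decoder row")
    return [row for row, imag in zip(decoder, is_imaginary) if not imag]


# Placeholder numbers, not a real decoder: 3 speakers x 4 first-order
# channels, with the last row standing in for an imaginary speaker
# placed below a dome.
decoder = [
    [0.5, 0.3, 0.0, 0.2],
    [0.5, -0.3, 0.0, 0.2],
    [0.5, 0.0, 0.0, -0.4],
]
real_only = drop_imaginary_rows(decoder, [False, False, True])
print(len(real_only))  # 2
```

The result is an ordinary static matrix, which is why this variant can be
written out as an AmbDec preset; the decorrelate-and-mix variant needs
processing that a plain matrix cannot express.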

When used with a dome, the effect is that as a signal is panned down in
elevation, it sticks at the equator and then fades out. Adjusting the
placement of the imaginary speaker controls how quickly it fades: placing it
close to the center gives quick fades, farther down slower ones.

Finally, lots of people seem to have thought about or tried hybrid
ambi/vbap schemes (even me -- the other day I stumbled across a
long-forgotten email to Eric Benjamin where I described such a scheme). Zotter
and colleagues at IEM get credit for publishing a succinct description of
what they did and the results of their analysis.

If you don't want to download and install a couple thousand lines of MATLAB
code, send me some speaker coordinates in a CSV file and I'll send you
some decoders to try out. My only request is that they be for real arrays
you have access to (vs. pathological examples), so you can listen and
report back on what you hear.

Aaron Heller (hel...@ai.sri.com)
Menlo Park, CA US
_______________________________________________
Sursound mailing list
Sursound@music.vt.edu
https://mail.music.vt.edu/mailman/listinfo/sursound
