On 2013-11-05, Fons Adriaensen wrote:

I have often wondered if there was a way to confirm matrix encoding by using a scope to display phase differences, but have found that not to be the case.

It should be possible to do that, but the process will be a bit more complicated.

This has also been talked about before.

Calculate a series of FFTs on overlapping windows, over the entire length of the recording, and look for components that have equal magnitude in L and R.
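
A rough sketch of that procedure, not anyone's actual implementation: the pure-Python code below takes Hann-windowed FFTs on overlapping frames, keeps only the bins whose L and R magnitudes agree within a tolerance, and returns the inter-channel phase difference at those bins. The function names, the frame size and hop, the 1 dB tolerance, the relative energy floor and the sign convention (phase of L relative to R) are all assumptions made for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def equal_magnitude_phases(left, right, size=256, hop=128, tol_db=1.0, floor=1e-3):
    """Phase differences (degrees, L relative to R) of all bins whose
    L and R magnitudes match within tol_db. All defaults are illustrative."""
    # Periodic Hann window, applied to every frame before the FFT.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]
    phases = []
    for start in range(0, len(left) - size + 1, hop):
        L = fft([left[start + i] * window[i] for i in range(size)])
        R = fft([right[start + i] * window[i] for i in range(size)])
        peak = max(abs(v) for v in L[1:size // 2])
        for k in range(1, size // 2):
            ml, mr = abs(L[k]), abs(R[k])
            # Skip bins that are too weak to carry a meaningful phase.
            if ml <= floor * peak or mr <= floor * peak:
                continue
            if abs(20 * math.log10(ml / mr)) <= tol_db:
                phases.append(math.degrees(cmath.phase(L[k] * R[k].conjugate())))
    return phases
```

Fed a test tone that is identical in both channels apart from a known phase offset, the returned values should cluster at that offset; on real material one would look at how the values are distributed over the whole recording rather than at any single frame.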

On my part, I'd rather do a continuous (perhaps FFT-enabled, perhaps frequency segregated) wideband Hilbert transform, and in one way or another try to bin+aggregate the phase relationships between the two carrier channels, so as to build up a histogram of where the phase relationships landed over time. I believe, over time, that would be the easiest and most adaptable way of gathering the basic data, to then be subjected to some prima facie Bayesian reasoning and/or algorithms derived from machine learning from-sample.
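
The Hilbert-transform-plus-histogram idea could be sketched roughly as follows. This is only one possible reading of it, under stated assumptions: the Hilbert transformer is a short Hann-windowed FIR approximation (so it is inaccurate near DC and Nyquist), the histogram is weighted by the cross-product magnitude, and the names `hilbert_taps`, `analytic` and `phase_histogram`, the tap count and the 10 degree bin width are all made up for illustration.

```python
import cmath
import math


def hilbert_taps(half=31):
    """Hann-windowed FIR approximation of the ideal Hilbert kernel,
    2 / (pi * n) for odd n and 0 for even n; length is 2 * half + 1."""
    taps = []
    for n in range(-half, half + 1):
        ideal = 2.0 / (math.pi * n) if n % 2 else 0.0
        win = 0.5 + 0.5 * math.cos(math.pi * n / (half + 1))
        taps.append(ideal * win)
    return taps


def analytic(x, taps):
    """Approximate analytic signal x + j * H{x}, dropping the filter edges."""
    half = len(taps) // 2
    out = []
    for i in range(half, len(x) - half):
        # Centred convolution, so the real part needs no extra delay.
        q = sum(taps[j] * x[i + half - j] for j in range(len(taps)))
        out.append(complex(x[i], q))
    return out


def phase_histogram(left, right, bins=36, taps=None):
    """Histogram of the L-minus-R phase over [-180, 180) degrees,
    each sample weighted by |zL * conj(zR)|."""
    taps = taps or hilbert_taps()
    zl, zr = analytic(left, taps), analytic(right, taps)
    hist = [0.0] * bins
    for a, b in zip(zl, zr):
        cross = a * b.conjugate()
        idx = int((cmath.phase(cross) + math.pi) / (2 * math.pi) * bins) % bins
        hist[idx] += abs(cross)
    return hist
```

With 36 bins, bin `k` covers the range from `-180 + 10 * k` up to `-170 + 10 * k` degrees. Running the same accumulation per frequency band, or over sliding time spans, would be a natural extension, but is not shown here.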

I suspect support vector machines derived from Vapnik-Chervonenkis-complexity-limited low order Chebyshev polynomials might be useful in the latter, too: their inverse problem is the most well behaved I know with regard to all kinds of noise and nonlinearity, in these kinds of problems, and at the same time, if you can somehow marry them to an efficient preprocessing and prebinning stage, maybe you could even do realtime, adaptive recognition of the original analog encoding.

But obviously, since I've never actually coded anything like this out or seen the intermediate statistics, this must remain just a hint to someone more willing than me to "actually talk the talk".

If they were panned to center using a normal stereo pan pot they should be exactly in phase.

That, and the fact that in analogue material you can't really rely upon the stuff even staying within the same encoding regime, is why I above mentioned time spans. And why I believe this sort of blind decoding, if tried, should be able to do mixed decoding and/or seamless transitions from the start. Because, once you put in even a statistically optimal recognizer, it'll be just half sure, half of the time, and will be telling you lots of unexpected things in the middle of that half-and-half.

Thus, the worst problem might not be to detect what you want or do not want to see at all, and then just decode what is there. The real problem might be to deal with continually and variously detecting something in between, and how to decode that, then, without sounding hideous/silly pretty much all of the time.

If the signal is UHJ encoded AMB there will be a phase difference of around 35 degrees.

In fact, under some rather mundane assumptions, at least two channel UHJ can even be automatically and reliably detected to the level of resolving L from R. The same isn't true for e.g. Dolby MR, where the left-right difference doesn't exist at a level deeper than simple 180 degree phase reversal.

If you find that consistently on all center front panned components that would be strong evidence of UHJ encoding.

So, exactly that. At the same time, the accumulation over the Scheiber sphere of phase-amplitude points, combined with a couple of rather minimalistic priors on what sound fields ought to look like in real life, ought to be able to statistically differentiate between pretty much all of the extant, untagged, analogue systems. MP/SQ/QS/UHJ/RM, all that typical stuff I at least know about, and with a couple of tricks over time, probably even the various extant versions of each of those systems. In fact, using algorithms derived from the audio encoding lit of the past ten years or so of mobile phone codec algorithms, you should even be able to efficiently (i.e. in real time) compensate for static delay differences between the channels, slowly varying ones, quite a lot of background noise, and even certain kinds of speaker-microphone-like rapidly varying angular distortion. In some regards even Doppler, even if it's particularly nasty, as a non-shift-invariant phenomenon.
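
For the simplest of those cases, a static inter-channel delay, a plain cross-correlation peak search gives the flavour. Note that this is a much cruder stand-in than the codec-derived algorithms alluded to above, and that the function name `estimate_delay`, the lag range and the sign convention are assumptions for illustration. It also assumes both channels have the same length, and on matrix-encoded material the deliberate inter-channel phase shifts will blur the correlation peak.

```python
def estimate_delay(left, right, max_lag=32):
    """Return the lag d (in samples) that maximises sum(left[i] * right[i + d]).
    d > 0 means the right channel lags the left one."""
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only sum over indices where both left[i] and right[i + lag] exist.
        lo, hi = max(0, -lag), min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(lo, hi))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Once the lag is known, shifting one channel by that many samples before any phase analysis removes the linear-phase tilt the delay would otherwise add to every phase measurement.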

So, exactly that, and even more.

Looking at the complete signal it's probably impossible to decide if any phase difference is significant or not, you need the 'logic decoding' first.

The way I see it myself, as an amateur (and perhaps soon freelance professional) audio DSP guy, as well as an amateur economist, is that you shouldn't look at the instantaneous phase differentials as such. Instead, you should recognize that there is a whole spectrum of different timespans between none at all and hours, relevant to the solution, all of them interworking the whole time.

Granted, I might be making this a bit more difficult from the start than it has to be. But still, think about a typical audio feed from your current Western television set. It does have commercial breaks in it, no, which might contain totally fucked up audio wrt the audio you had in the movie they just interrupted. No? So in principle, do you not have to switch fluently and rather often between encodings, not only in kind, but in time as well? I think you have to be able to do that.

Also, you have to be able to do that even in mixture, because in most work I've heard of, "the stuff" was *not* mixed in any single system. As I recall, even such iconic things as the Star Wars soundtrack (almost) always contained ad hoc elements which didn't utilize the underlying Dolby encoding, but were placed as such onto the raw channels. If I remember right, you'd have to be able to decode even time-variant mixtures, from the start, in order to be useful in the real world. And, you in fact *can* do that, at least in theory, but what I'm then saying is that the methods you'd use to do that can't stay within the typical single-rate LTI framework; they'll have to be multistage and multirate, often nonlinear ones.

(And actually I might have to ask the group a couple of salient questions here, in a short time, because it just might be I have to utilize some ambisonic methods just the way I've described, in short order. ;) )
--
Sampo Syreeni, aka decoy - [email protected], http://decoy.iki.fi/front
+358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
_______________________________________________
Sursound mailing list
[email protected]
https://mail.music.vt.edu/mailman/listinfo/sursound
