On 2011-12-05, Eric Carmichel wrote:

[...] noteworthy responses, particularly regarding my comment that most binaural recordings that I’ve listened to don’t give a sense of “open space.” Naturally, we all have a unique HRTF, and recordings or IRs made with an acoustical test fixture (e.g. KEMAR) probably won’t match our own HRTF.

Then it's an open research question to you scientific types, where the discrepancy actually lies. Isn't it? Like, most of the research in here goes with personalized HRTFs (which don't seem to work either), head tracking (which does seem to work, but not perfectly), the obvious theoretical stuff which nobody has even tried (e.g. auditory parallax while headtracking directions as well), and whatnot which we haven't yet thought up (that's your job). :)

Recordings made with KEMAR (Knowles Electronic Manikin for Acoustic Research) have the microphones deeply seated in this fixture. Such recordings will have a “naturally occurring” resonant peak around 3 kHz because of the KEMAR’s pseudo ear canal (which, for KEMAR, is just a straight tube, with or without Zwislocki couplers).

In the "we don't know yet" department, nobody's ever proven KEMAR-like modelling is correct. I mean, it relies on a one-dimensional approximation of the auditory canal, which might not hold in full at the highest frequencies. Especially if you happen to believe ultrasonics have something to give us, like many audiophiles do, especially in spatial reproduction. What if the auditory canal itself stops being one-dimensional at HF and becomes a 2D waveguide in cross-section? It isn't as though transverse vibrational modes couldn't be transmitted through the tympanic membrane, or as though they couldn't then, theoretically, differentially affect the cochlear microcilia, as a minor, transversely induced mode of excitation...
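
As a rough back-of-the-envelope check on where the one-dimensional picture could start to fail, here is a minimal sketch that treats the canal as a uniform, rigid-walled tube closed at the eardrum. The canal length and diameter are illustrative assumptions, not figures from this thread, and a real canal is neither uniform nor rigid.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, air at roughly 20 degrees C


def quarter_wave_resonance(length_m, c=SPEED_OF_SOUND):
    """First resonance of a uniform tube open at one end, closed at the other."""
    return c / (4.0 * length_m)


def plane_wave_cutoff(diameter_m, c=SPEED_OF_SOUND):
    """Cut-on frequency of the first transverse mode in a rigid circular duct."""
    first_zero_of_j1_prime = 1.8412  # smallest positive root of J1'(x) = 0
    radius = diameter_m / 2.0
    return first_zero_of_j1_prime * c / (2.0 * math.pi * radius)


# Assumed, illustrative canal dimensions (hypothetical values).
canal_length = 0.025     # m
canal_diameter = 0.0075  # m

f_res = quarter_wave_resonance(canal_length)
f_cut = plane_wave_cutoff(canal_diameter)
print(f"quarter-wave resonance: {f_res:.0f} Hz")
print(f"first transverse mode cuts on near: {f_cut:.0f} Hz")
```

Under these assumptions the resonance lands a little above 3 kHz, and the first transverse mode only cuts on at roughly 27 kHz, so the idealised tube stays one-dimensional across the conventional audio band and the question only arises for ultrasonics.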

There I'd like to bring up something we already know about natural excitation of the microcilia: we already know that neither those neural cells, nor those which follow them in the auditory nerve or the lower auditory nuclei, will depolarize at a rate exceeding some 1 kHz or so. Nor do they have much S/N ratio an sich; nor are they fully uncorrelated.

Thus we already know the afferent auditory pathway cannot *possibly* carry all of the auditory information we already know it must, if we think about it simplistically. Thus, the first two things *I'd* really like to know about the peripheral auditory system are: 1) how precisely is temporal coding used in excess of the cochlear place coding which we already know, and 2) how does the feedback modulation via the efferent auditory pathway participate in that coding, and the place coding, in whole?

The first part, temporal coding, meaning the precise instant of neural firing after the resonant but also place-wise stochastically (fully so?) rectifying cochlear membrane, *must* somehow carry information which is particularly relevant to dichotic hearing. That is almost self-evident when you compare the resonance characteristics of the cochlea in vitro against the dichotic listening experiments of old. No amount of envelope detection could get you to such fine angular resolution in dichotic/binaural hearing, ever, unless it was mostly being derived from the exact, continuous time onset of the neural pulses. Possibly all over the board. (Which then also makes ultrasonics relevant, at least in theory.)
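
To put a number on the timing precision involved, here is a small sketch using Woodworth's rigid-sphere approximation of the interaural time difference. The head radius, and the choice of 1 degree as a representative just-noticeable change in azimuth for a frontal source, are assumptions for illustration only.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Interaural time difference for a distant source, rigid spherical head."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))


itd_one_degree = woodworth_itd(1.0)
firing_period = 1.0 / 1000.0  # s, period of the ~1 kHz firing ceiling above

print(f"ITD at 1 degree azimuth: {itd_one_degree * 1e6:.1f} microseconds")
print(f"firing period / ITD: {firing_period / itd_one_degree:.0f}")
```

With these assumed values the time difference is on the order of ten microseconds, about two orders of magnitude shorter than the period of a 1 kHz firing rate, which is the gap the paragraph above is pointing at.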

Now that I've read some basics of cochlear implant tech, I don't see how such considerations are taken into account. Thus, Eric, since you seem to be worried about the effects of real life background noise on CIs, maybe you could go the extra mile by trying out a CI analysis algorithm which hybridises your typical Shannonesque noise band vocoder with a selective application of pure, rectified, time-domain information, straight from the sampler? Perhaps a sampler with vastly more bandwidth than you commonly use for CI purposes?
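
One way such a hybrid might be prototyped is sketched below. This is only a toy under stated assumptions, not an actual CI processing strategy: the band centres, the filter Q, the 50 Hz envelope cutoff and the `fine_limit` rule for choosing which bands keep their waveform are all hypothetical, as are the function names.

```python
import math
import random


def bandpass(x, fs, f0, q=4.0):
    """RBJ-cookbook biquad band-pass (0 dB peak gain), direct form I."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0  # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = b0 * s + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1, y2, y1 = x1, s, y1, out
        y.append(out)
    return y


def envelope(x, fs, cutoff=50.0):
    """Full-wave rectification followed by a one-pole low-pass."""
    k = 1.0 - math.exp(-2.0 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for s in x:
        state += k * (abs(s) - state)
        env.append(state)
    return env


def hybrid_vocoder(x, fs, centers, fine_limit=500.0, seed=0):
    """Return one signal per analysis band.

    Bands centred below fine_limit keep the half-wave rectified waveform
    (timing preserved); the rest get envelope-modulated band noise.
    """
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in x]
    channels = []
    for f0 in centers:
        band = bandpass(x, fs, f0)
        if f0 < fine_limit:
            channels.append([max(s, 0.0) for s in band])
        else:
            carrier = bandpass(noise, fs, f0)
            env = envelope(band, fs)
            channels.append([e * c for e, c in zip(env, carrier)])
    return channels


# Test signal: a 200 Hz tone plus a weaker 3 kHz tone, 100 ms long.
fs = 16000
t = [n / fs for n in range(fs // 10)]
x = [math.sin(2 * math.pi * 200 * u) + 0.5 * math.sin(2 * math.pi * 3000 * u)
     for u in t]
channels = hybrid_vocoder(x, fs, centers=[200.0, 800.0, 3000.0])
print([round(max(ch), 3) for ch in channels])
```

Here only the 200 Hz band keeps its rectified waveform, so the zero crossings of that band survive, while the higher bands carry envelope information alone, as in a plain noise-band vocoder.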

The style of headphones we use may destroy the ear canal’s natural resonant peak, particularly if the headphones are of the insert type.

Absolutely. Been there, done that. That's why I always go with open, gently grabbing designs. And even they kill the outer ear and upper body response which is so crucial to a real HRTF. (The "R", "related", is because it's not just about the head, but always about the upper torso as well.

And to wit, I've never seen anybody study what *really* happens to the response when you twist your head while keeping your torso/body still. We can do that, we do do that, yet no head tracker that I've seen takes any notice of the fact. Once again, it could be a valid research area.)

Otherwise, we may have to use a peaking filter to re-create an open-ear type of response.

Of course they try to as best they can. But can they really emulate all of the relevant degrees of freedom which this sort of thing might require? Do we even know which the relevant degrees of freedom really are, as of now? I don't think so.
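
For reference, the single peaking filter mentioned above can be sketched as a standard biquad using the widely used RBJ "Audio EQ Cookbook" formulas. The centre frequency follows the 3 kHz figure quoted earlier; the 10 dB gain, the Q of 2 and the 48 kHz sample rate are assumptions picked for illustration, not measured open-ear values.

```python
import cmath
import math


def peaking_biquad(fs, f0, gain_db, q):
    """RBJ-cookbook peaking EQ; returns normalised (b, a) coefficients."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / a_lin
    b = [(1.0 + alpha * a_lin) / a0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha * a_lin) / a0]
    a = [1.0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha / a_lin) / a0]
    return b, a


def gain_db_at(b, a, fs, f):
    """Magnitude response in dB at frequency f."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


fs = 48000
b, a = peaking_biquad(fs, f0=3000.0, gain_db=10.0, q=2.0)
for f in (100.0, 1000.0, 3000.0, 10000.0):
    print(f"{f:7.0f} Hz: {gain_db_at(b, a, fs, f):+.2f} dB")
```

A filter like this has exactly three degrees of freedom (centre frequency, gain and Q), which is the limitation the questions above are getting at.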

I sort of think this is why Gerzon kept himself to speaker rigs, by the way. His thinking shows definite signs of understanding binaural stuff as well (e.g. in the final part of his General Metatheory, http://decoy.iki.fi/dsound/ambisonic/motherlode/data/6827.pdf ), but that sort of stuff is just too fickle to be put down as the kind of general audio framework which ambisonics tries to be. In free space, general work is much easier to do.

I’d estimate that the earcup volume of circumaural headphones is around 6 cm³. But because headphones include active drivers, computing the combined resonance of the ear canal with the earcup’s volume may not be so simple: [...]

All of that can be dealt with if we know the impedance transfer function over all of the auditory band of both an ear canal and a headphone. Do we know either of those? Nope, and there's then yet another research subject. (Both of them can actually be fully and not-too-difficultly measured if you know your basic power electronics theory, plus instrument your headphones right. Hint: we usually instrument them for either voltage or amperage, but not both. If you do both, sample them back at full rate, and model your headphone/speaker's dynamic behavior via an optimum Kalman filter on the way, you can usually measure the backreactance over the whole frequency band, given a nice enough amplifier.)
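
The core of the "instrument for both voltage and current" idea can be sketched with a much simpler estimator than the Kalman filter mentioned above: take the complex ratio of the two sampled signals at one test frequency. The load below is a synthetic series R-L stand-in with made-up values, not a measured headphone, and the function names are hypothetical.

```python
import cmath
import math


def phasor(samples, fs, f):
    """Complex amplitude of the component at frequency f (single-bin DFT)."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * f * k / fs)
              for k, s in enumerate(samples))
    return 2.0 * acc / n


def impedance(voltage, current, fs, f):
    """Complex impedance at f from simultaneously sampled voltage and current."""
    return phasor(voltage, fs, f) / phasor(current, fs, f)


# Synthetic "measurement": 10 mA sine through an assumed series R-L load.
fs, f, n = 48000, 1000.0, 4800  # exactly 100 cycles of the test tone
R, L = 32.0, 0.5e-3             # ohms, henries (illustrative values only)
w = 2.0 * math.pi * f
current = [0.01 * math.sin(w * k / fs) for k in range(n)]
voltage = [0.01 * (R * math.sin(w * k / fs) + w * L * math.cos(w * k / fs))
           for k in range(n)]

z = impedance(voltage, current, fs, f)
print(f"Z at {f:.0f} Hz: {z.real:.3f} + j{z.imag:.3f} ohm")
```

Sweeping `f` over the audio band would give the electrical impedance curve. Getting from there to the acoustic load on the driver needs a model of the transducer, which this sketch does not attempt.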

[...] I have listened to Hector’s recording using AKG K240 studio phones (semi-open). (Thanks to Hector for making his recording available.) [...]

Thus for general use, we'd like to see psychoacoustically motivated regularization tactics for binaural listening, which make the worst problems go away? Like front-back reversal and the like? Even if it leads to something pretty bland for the average listener. No?

For those who have their own ears recorded and their own earplugs already cast, for them we'd want to give something better and more particularized. Which is what ambisonic does as well, for everybody with normal hearing, when done right. Also, I believe it can do that at zero delay while headtracking at the same time. (See my post a few years ago about how you can exchange convolution in the ambisonic domain as an operation with the rotation operation.)
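
One reading of that exchange property, sketched here without reference to the earlier post itself, is that rotating the B-format signals and then applying fixed per-channel filters gives the same ear signal as leaving the signals alone and applying the transposed rotation to the filters. The signals and filter taps below are arbitrary made-up numbers, and only horizontal first order (W, X, Y) is shown.

```python
import math


def convolve(a, b):
    """Plain full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def yaw_matrix(angle_rad):
    """Rotation about the vertical axis for horizontal first order (W, X, Y)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def mix(matrix, channels):
    """Apply a mixing matrix to a list of equal-length channels."""
    length = len(channels[0])
    return [[sum(m * ch[n] for m, ch in zip(row, channels))
             for n in range(length)] for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def decode_one_ear(signals, filters):
    """Sum of each channel convolved with its own filter."""
    parts = [convolve(s, h) for s, h in zip(signals, filters)]
    return [sum(v) for v in zip(*parts)]


# Arbitrary illustrative data: 3 channels of signal, 3 short FIR filters.
signals = [[1.0, 0.5, -0.25, 0.0, 0.75, -1.0],
           [0.2, -0.4, 0.6, -0.8, 1.0, 0.1],
           [-0.3, 0.9, 0.0, 0.4, -0.7, 0.5]]
filters = [[0.5, 0.25, 0.1], [0.3, -0.2, 0.05], [-0.1, 0.4, 0.2]]

rot = yaw_matrix(math.radians(30.0))
rotate_signals = decode_one_ear(mix(rot, signals), filters)
rotate_filters = decode_one_ear(signals, mix(transpose(rot), filters))
print(max(abs(p - q) for p, q in zip(rotate_signals, rotate_filters)))
```

The equality is just linearity, so it holds for any rotation matrix and any filter set. What it buys in practice is that head tracking becomes a small matrix update rather than a change of filters, although the sketch says nothing about the latency of the convolution itself.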

Again, many thanks to all for sharing thoughts, recordings, references, and wisdom.

Always, and I hope I'm not boring you: It's already well-known on-list that I tend to write these kinds of huge, meandering essays, sometimes for no reason at all even. :)
--
Sampo Syreeni, aka decoy - [email protected], http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
_______________________________________________
Sursound mailing list
[email protected]
https://mail.music.vt.edu/mailman/listinfo/sursound
