Re: [Sursound] HRTFs, recordings, headphones, and more

Sampo Syreeni Mon, 05 Dec 2011 16:37:17 -0800

On 2011-12-05, Eric Carmichel wrote:

[...] noteworthy responses, particularly regarding my comment that most binaural recordings that I’ve listened to don’t give a sense of “open space.” Naturally, we all have a unique HRTF, and recordings or IRs made with an acoustical test fixture (e.g. KEMAR) probably won’t match our own HRTF.

Then it's an open research question to you scientific types, where the discrepancy actually lies. Isn't it? Like, most of the research in here goes with personalized HRTFs (which don't seem to work either), headtracking (which does seem to work, but not perfectly), the obvious theoretical stuff which nobody has even tried (e.g. auditory parallax while headtracking directions as well), and whatnot which we haven't yet thought up (that's your job). :)

Recordings made with KEMAR (Knowles Electronic Manikin for Acoustic Research) have the microphones deeply seated in this fixture. Such recordings will have a “naturally occurring” resonant peak around 3 kHz because of the KEMAR’s pseudo ear canal (which, for KEMAR, is just a straight tube, with or without Zwislocki couplers).
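
As a rough sanity check on the 3 kHz figure, a straight tube that is open at one end and closed at the other has its first resonance at a quarter wavelength, f = c / (4 L). The sketch below is only that textbook idealization. The canal lengths and the function name are illustrative assumptions, not KEMAR specifications.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)

def quarter_wave_resonance_hz(length_m, c=SPEED_OF_SOUND):
    """First resonance of an ideal tube open at one end, closed at the other."""
    return c / (4.0 * length_m)

# Hypothetical canal lengths in millimetres, chosen only for illustration.
for length_mm in (23, 25, 28):
    f = quarter_wave_resonance_hz(length_mm / 1000.0)
    print(f"{length_mm} mm tube -> {f:.0f} Hz")
```

Tubes of a few centimetres land in the low-kHz range, which is consistent with a peak "around 3 kHz"; end corrections and the eardrum's impedance are ignored here.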

In the "we don't know yet" department, nobody's ever proven KEMAR-like modelling is correct. I mean, it relies on a one-dimensional approximation of the auditory canal, which might not hold in full at the highest frequencies. Especially if you happen to believe ultrasonics have something to give us, like many audiophiles do, especially in spatial reproduction. What if the auditory canal itself stops being one-dimensional at HF and becomes a 2D waveguide in cross-section? It isn't as though transverse vibrational modes couldn't be transmitted through the tympanic membrane, or as though they couldn't then theoretically, differentially affect the cochlear microcilia, as a minor transverse induced mode of excitation...
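
To put a number on where a one-dimensional model could start to fail: in an ideal rigid-walled circular duct of radius a, only plane waves propagate below the cutoff of the first transverse mode, f_c = 1.8412 c / (2 pi a), where 1.8412 is the first zero of the derivative of the Bessel function J1. The sketch below assumes exactly that idealized duct. Real ear canals are neither rigid, straight, nor circular, and the radii are illustrative guesses.

```python
import math

SPEED_OF_SOUND = 343.0       # m/s, assumed
FIRST_ZERO_OF_J1_PRIME = 1.8412

def first_transverse_cutoff_hz(radius_m, c=SPEED_OF_SOUND):
    """Cutoff of the first non-planar mode in a rigid circular duct."""
    return FIRST_ZERO_OF_J1_PRIME * c / (2.0 * math.pi * radius_m)

# Hypothetical canal radii in millimetres.
for radius_mm in (3.0, 3.5, 4.0):
    f_c = first_transverse_cutoff_hz(radius_mm / 1000.0)
    print(f"radius {radius_mm} mm -> cutoff about {f_c / 1000.0:.1f} kHz")
```

Under these assumptions the cutoff sits in the mid-20s to low-30s of kHz, so the plane-wave picture would cover the conventional audio band and give out only in ultrasonic territory.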

There I'd like to bring up something we already know about natural excitation of the microcilia: neither those neural cells, nor those which follow them in the auditory nerve or the lower auditory nuclei, will depolarize at a rate exceeding some 1 kHz or so. Nor do they have much S/N ratio an sich; nor are they fully uncorrelated.

Thus we already know the afferent auditory pathway cannot *possibly* carry all of the auditory information we already know it must, if we think about it simplistically. Thus, the first two things *I'd* really like to know about the peripheral auditory system are: 1) how precisely is temporal coding used in excess of the cochlear place coding which we already know, and 2) how does the feedback modulation via the efferent auditory pathway participate in that coding, and the place coding, in whole?

The first part, temporal coding, as the precise instant of neural firing after the resonant but also place-wise stochastically (fully so?) rectifying cochlear membrane, *must* somehow carry information which is particularly relevant to dichotic hearing. That is almost self-evident when you compare the resonance characteristics of the cochlea in vitro against the dichotic listening experiments of old. No amount of envelope detection could get you to such fine angular resolution in dichotic/binaural hearing, ever, unless it was mostly being derived from the exact, continuous-time onset of the neural pulses. Possibly all over the board. (Which then also makes supersonics relevant, at least in theory.)
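
The scale of the timing involved can be sketched with Woodworth's classical spherical-head approximation, ITD = (a / c) (theta + sin theta), for a distant source at azimuth theta in the horizontal plane. The head radius and the choice of a one-degree offset (roughly the order of the frontal minimum audible angle reported in classical localization experiments) are assumptions for illustration only.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
HEAD_RADIUS_M = 0.0875   # typical textbook value, assumed

def woodworth_itd_s(azimuth_rad, a=HEAD_RADIUS_M, c=SPEED_OF_SOUND):
    """Interaural time difference for a far source, 0 <= azimuth <= pi/2."""
    return (a / c) * (azimuth_rad + math.sin(azimuth_rad))

itd = woodworth_itd_s(math.radians(1.0))
firing_period = 1.0 / 1000.0  # period of a 1 kHz firing rate, in seconds

print(f"ITD for 1 degree: {itd * 1e6:.1f} microseconds")
print(f"1 kHz firing period is {firing_period / itd:.0f} times longer")
```

A one-degree shift corresponds to an interaural delay of under ten microseconds, about two orders of magnitude shorter than the period of a 1 kHz firing rate, which is the gap the paragraph above is pointing at.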

Now that I've read some basics of cochlear implant tech, I don't see how such considerations are taken into account. Thus, Eric, since you seem to be worried about the effects of real life background noise on CIs, maybe you could go double the mile by trying out a CI analysis algorithm which hybridises your typical Shannonesque noise band vocoder with a selective application of pure, rectified, time-domain information, straight from the sampler? Perhaps a sampler with vastly more bandwidth than you commonly use for CI purposes?
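
One way to read that suggestion, purely as a toy sketch of the analysis side and not as any existing CI strategy: split the input into bands, half-wave rectify each band, then smooth only the higher bands into envelopes (the vocoder-like part) while passing the rectified waveform of the lower bands through unsmoothed so its fine timing survives. Every name, the 1500 Hz crossover, the filter Q and the 200 Hz envelope cutoff below are hypothetical choices.

```python
import math

def bandpass(x, fs, f0, q=4.0):
    """Band-pass biquad (Audio EQ Cookbook form, 0 dB peak gain)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = (b0 * s + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, s
        y2, y1 = y1, out
        y.append(out)
    return y

def halfwave(x):
    """Half-wave rectification: keep positive excursions only."""
    return [max(s, 0.0) for s in x]

def smooth(x, fs, cutoff_hz):
    """One-pole low-pass used as a crude envelope follower."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        y.append(state)
    return y

def hybrid_channels(x, fs, centres_hz, fine_below_hz=1500.0, env_cutoff_hz=200.0):
    """Per-band drive signals: raw rectified timing below the crossover,
    smoothed envelope above it."""
    channels = []
    for f0 in centres_hz:
        rectified = halfwave(bandpass(x, fs, f0))
        if f0 < fine_below_hz:
            channels.append(rectified)
        else:
            channels.append(smooth(rectified, fs, env_cutoff_hz))
    return channels

fs = 16000
tone = [math.sin(2.0 * math.pi * 500.0 * k / fs) for k in range(1600)]
low, high = hybrid_channels(tone, fs, [500.0, 4000.0])
print(max(low), max(high))
```

Whether an implant's electrodes and the remaining nerve population could make use of the unsmoothed channels is exactly the open question; the sketch only shows that producing them is cheap.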

The style of headphones we use may destroy the ear canal’s natural resonant peak, particularly if the headphones are of the insert type.

Absolutely. Been there, done that. That's why I always go with open, gently grabbing designs. And even they kill the outer ear and upper body response which is so crucial to a real HRTF. (The "R", "related", is because it's not just about the head, but always about the upper torso as well.

And to wit, I've never seen anybody study what *really* happens to the response when you twist your head while keeping your torso/body intact. We can do that, we do do that, yet no head tracker that I've seen takes any notice of the fact. Once again, it could be a valid research area.)

Otherwise, we may have to use a peaking filter to re-create an open-ear type of response.

Of course they try to as best they can. But can they really emulate all of the relevant degrees of freedom which this sort of thing might require? Do we even know which the relevant degrees of freedom really are, as of now? I don't think so.

I sort of think this is why Gerzon kept himself to speaker rigs, by the way. His thinking shows definite signs of understanding binaural stuff as well (e.g. in the final part of his General Metatheory, http://decoy.iki.fi/dsound/ambisonic/motherlode/data/6827.pdf ), but that sort of stuff is just too fickle to be put down as a general audio framework, which ambisonic tries to be. In free space general work is much easier to do.

I’d estimate that the earcup volume of circumaural headphones is around 6 cm³. But because headphones include active drivers, computing the combined resonance of the ear canal with the earcup’s volume may not be so simple: [...]

All of that can be dealt with if we know the impedance transfer function over all of the auditory band of both an ear canal and a headphone. Do we know either of those? Nope, and there's then yet another research subject. (Both of them can actually be fully and not-too-difficultly measured if you know your basic power electronic theory, plus instrument your headphones right. Hint: we usually instrument them for either voltage or amperage, but not both. If you do both, sample them back at full rate, and model your headphone/speaker's dynamic behavior via an optimum Kalman filter on the way, you can usually measure the backreactance over the whole frequency band, given a nice enough amplifier.)
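
The core of the "instrument both voltage and current" hint can be sketched without the Kalman filter: at a single test frequency, the ratio of the voltage phasor to the current phasor is the complex impedance seen by the amplifier. The code below swaps the Kalman model for a plain single-frequency DFT ratio, and tests it on a simulated series resistor-inductor load. The 32 ohm and 0.5 mH values and all function names are hypothetical.

```python
import cmath
import math

def phasor(samples, fs, freq):
    """Complex amplitude at `freq`, by correlation with a complex exponential.
    Exact when the record holds a whole number of cycles."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, s in enumerate(samples))
    return 2.0 * acc / n

def impedance(voltage, current, fs, freq):
    """Complex impedance Z = V / I at one frequency."""
    return phasor(voltage, fs, freq) / phasor(current, fs, freq)

def simulate_rl_load(fs, freq, n, r_ohm, l_henry):
    """Simulated terminal voltage and current of a series R-L load (test data)."""
    z = complex(r_ohm, 2.0 * math.pi * freq * l_henry)
    w = 2.0 * math.pi * freq / fs
    current = [math.cos(w * k) for k in range(n)]
    voltage = [abs(z) * math.cos(w * k + cmath.phase(z)) for k in range(n)]
    return voltage, current

fs, freq = 48000, 1000.0
v, i = simulate_rl_load(fs, freq, 4800, r_ohm=32.0, l_henry=0.5e-3)
z = impedance(v, i, fs, freq)
print(f"Z at {freq:.0f} Hz: {z.real:.2f} + j{z.imag:.2f} ohm")
```

Sweeping `freq` across the band gives the impedance curve; on a real headphone, departures from the driver's own electrical impedance are where the acoustic load, including the ear canal, shows up.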

[...] I have listened to Hector’s recording using AKG K240 studio phones (semi-open). (Thanks to Hector for making his recording available.) [...]

Thus for general use, we'd like to see psychoacoustically motivated regularization tactics for binaural listening, which make the worst problems go away? Like front-back reversal and the like? Even if it leads to something pretty bland for the average listener. No?

For those who have their own ears recorded and their own earplugs already cast, for them we'd want to give something better and more particularized. Which is what ambisonic does as well, for everybody with normal hearing, when done right. Also, I believe it can do that at zero delay while headtracking at the same time. (See my post a few years ago about how you can exchange convolution in the ambisonic domain as an operation with the rotation operation.)
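
The zero-delay part is easy to illustrate, though the sketch below is not a reconstruction of the earlier post: rotating a first-order sound field about the vertical axis is a memoryless per-sample mix of the X and Y channels, so it can sit in front of a fixed set of filters and follow the head tracker without adding latency of its own. Horizontal-only encoding, the W gain of 1/sqrt(2), and the function names are assumptions.

```python
import math

def encode_first_order(sample, azimuth_rad):
    """Horizontal-only first-order encoding into (W, X, Y)."""
    return (sample / math.sqrt(2.0),
            sample * math.cos(azimuth_rad),
            sample * math.sin(azimuth_rad))

def rotate_yaw(wxy, angle_rad):
    """Rotate the sound field about the vertical axis by angle_rad.
    W is omnidirectional and is left untouched."""
    w, x, y = wxy
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (w, c * x - s * y, s * x + c * y)

# A source at 30 degrees to the left; the listener turns the head 30 degrees
# towards it, so the field is counter-rotated by -30 degrees.
field = encode_first_order(1.0, math.radians(30.0))
w, x, y = rotate_yaw(field, math.radians(-30.0))
print(f"W={w:.3f} X={x:.3f} Y={y:.3f}")
```

After the counter-rotation the source sits straight ahead (X near 1, Y near 0), and whatever fixed per-channel filtering follows never has to change with head position.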

Again, many thanks to all for sharing thoughts, recordings, references, and wisdom.

Always, and I hope I'm not boring you: It's already well-known on-list that I tend to write these kinds of huge, meandering essays, sometimes for no reason at all even. :)

--
Sampo Syreeni, aka decoy - [email protected], http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2