Re: [MP3 ENCODER] psymodel ?

Mark Taylor Fri, 09 Jun 2000 10:46:16 -0700
> 
> I am not an expert in psychoacoustics, but I am wondering about the way
> LAME (and ISO dist10) calculates the psychoacoustic noise threshold.
> 
> It is calculated like this:
> 
> E(i) : energy of bark i
> T(i) : threshold of bark i
> R(i) : ratio of bark i
> E'(i) : energy of band(scalefactor band) i
> T'(i): threshold of band(scalefactor band) i
> R'(i): ratio of band(scalefactor band) i
> S    : spread function to calculate the masking effect
> ATH(i): ATH threshold of band(scalefactor band) i
> 
> step1
> [T(0) T(1) ... ] = S [E(0) E(1) .... ]
> 
> step2
> R(i) = T(i) / E(i)
> 
> step3
> R'(i) = sum_{some area} R(i)
> E'(i) = sum_{some area} E(i)
> 
> step4
> T'(i) = max(R'(i) * E'(i), ATH(i))
> 
> 
> step 1&2 are done in psymodel.c, and 3&4 are in quantize.c(calc_xmin).
> 
> Question:
> why are there "step2" and "step4" ?
> why doesn't it calculate T'(i) directly from T(i) ?
> 
> I hope we can remove these steps and get a simpler and faster psymodel.c

This was changed slightly (a few months ago), and it now looks more like
this:


 step1
 [T(0) T(1) ... ] = S [E(0) E(1) .... ]
 
 
 step2     j=partition band (0..63), i = scalefactor bands (0..22)
 R'(i) = sum_{some area} R(j)
 E'(i) = sum_{some area} E(j)
 
 step3   (done in quantize.c)
 T'(i) = max( (R'(i) / E'(i)) * E''(i), ATH(i))


Where 
E'' = energy in band i as measured by the MDCT.
E' = energy measured by the FFT.
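
The three steps above can be sketched roughly as follows. This is only a
minimal illustration, not the code in psymodel.c or quantize.c: the function
names, the spreading matrix and the partition-to-scalefactor-band ranges are
all hypothetical. It also assumes that R(j) in step2 is the spread threshold
T(j) from step1 (R' is described further down as masking in units of energy),
and that a band with zero FFT energy simply falls back to the ATH.

```python
def spread(energy, spreading):
    """Step 1: T = S * E over the partition bands (S is a square matrix)."""
    n = len(energy)
    return [sum(spreading[j][k] * energy[k] for k in range(n)) for j in range(n)]


def band_sums(values, band_ranges):
    """Step 2: sum partition-band values j over each scalefactor band i.

    band_ranges is a hypothetical list of (first, last + 1) partition indices.
    """
    return [sum(values[lo:hi]) for lo, hi in band_ranges]


def mdct_thresholds(r_fft, e_fft, e_mdct, ath):
    """Step 3: T'(i) = max((R'(i) / E'(i)) * E''(i), ATH(i))."""
    out = []
    for r, e, e2, a in zip(r_fft, e_fft, e_mdct, ath):
        # Assumption: with no FFT energy in the band, only the ATH applies.
        ratio = r / e if e > 0.0 else 0.0
        out.append(max(ratio * e2, a))
    return out


# Toy data: 4 partition bands mapped onto 2 scalefactor bands.
energy = [64.0, 32.0, 16.0, 8.0]
spreading = [[0.125 if j == k else 0.0 for k in range(4)] for j in range(4)]
ranges = [(0, 2), (2, 4)]

thr = spread(energy, spreading)       # T(j)
r_sfb = band_sums(thr, ranges)        # R'(i)
e_sfb = band_sums(energy, ranges)     # E'(i)
t_sfb = mdct_thresholds(r_sfb, e_sfb, [192.0, 48.0], [1.0, 10.0])
```

In the toy data the second band's scaled masking is below its ATH, so the ATH
is what ends up as the threshold there.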

The windows used by the MDCT are optimized for computing
frequency information as accurately as possible, at the
expense of the energy estimate.  The FFT is supposed to give
a more accurate energy estimate.

The only reason E' and E'' appear in the formula is normalization,
since the MDCT and FFT use different units.

R'          = masking in units of energy as computed by FFT.  
R'/E'       = percentage of the signal which is masked.

R''         = masking, in units of energy as computed by the MDCT

We need R'' to measure the quantization error.  The formula in step3
comes from:

R''/E'' = R'/E'
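
Written out, the rearrangement is just one line; the numbers in the example
are made up purely for illustration:

```latex
\frac{R''(i)}{E''(i)} = \frac{R'(i)}{E'(i)}
\;\Longrightarrow\;
R''(i) = \frac{R'(i)}{E'(i)}\,E''(i),
\qquad
T'(i) = \max\bigl(R''(i),\ \mathrm{ATH}(i)\bigr)

% Example with illustrative values R' = 12, E' = 96, E'' = 192:
% R'' = (12 / 96) \cdot 192 = 24
```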

Menno also posted a different formula (below), but I don't fully understand
it, since the maskings are given in partition bands, which are
an average over many frequencies, and thus there is no phase
information?





Hi,

Tmdct(i) can be calculated directly from Tdft(i) with this formula:

Tmdct(i) = (2/M) * Tdft(i) * (cos( 2*Pi*n0*(i + 0.5)/N - /_S(i) ))^2

where:
M = number of samples in frequency domain
N = number of samples in time domain
n0 = (M+1)/2
S(i) = the DFT, where /_S(i) is the phase
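
A direct transcription of the formula above for a single line i might look
like this. It is only a sketch: the function name and argument names are
hypothetical, the phase is assumed to be in radians, and the values of M and N
in the example are placeholders, not taken from any encoder.

```python
import math


def mdct_threshold_from_dft(t_dft, phase, i, m, n):
    """Tmdct(i) = (2/M) * Tdft(i) * cos(2*pi*n0*(i + 0.5)/N - phase(i))^2."""
    n0 = (m + 1) / 2
    angle = 2.0 * math.pi * n0 * (i + 0.5) / n - phase
    return (2.0 / m) * t_dft * math.cos(angle) ** 2


# Illustrative values only.
M, N = 576, 1152
t_mdct = mdct_threshold_from_dft(t_dft=10.0, phase=0.3, i=3, m=M, n=N)
```

Since the cos^2 factor lies between 0 and 1, the result for a given line can
be anywhere from 0 to (2/M) * Tdft(i), depending on the DFT phase.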

If you want the full explanation, please let me know.

Bye, Menno















 
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )