> Good idea, but I'm afraid that doing only this would still miss:
>
> * very short tones (less than 3 granules long)
> * tones whose frequency changes rapidly (sweeps)
>
> But yes, doing forward and backward prediction is a good idea.
>

Well, there are several alternative methods. Check out the method
implemented in PEAQ (ITU-R BS.1387) and Frank Baumgarte's 'non-linear' model.
Instead of computing tonality, these models perform exponential addition of
the individual maskers, so the final effect is very similar to tonality
estimation. Why? Because a tone is built from a few individual "peaks", while
noise contains many spectral lines of similar level.
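
To make the idea concrete, here is a minimal Python sketch of power-law
("exponential") masker addition. It is only an illustration under my own
assumptions: the function name add_maskers and the value ALPHA = 0.5 are
hypothetical, not taken from PEAQ or from Baumgarte's model.

```python
def add_maskers(intensities, alpha):
    """Power-law addition of masker intensities.

    alpha = 1.0 is plain intensity summation; alpha < 1.0 makes many
    comparable maskers add up to more than their linear sum, while a
    single masker passes through unchanged.
    """
    if not intensities:
        return 0.0
    return sum(i ** alpha for i in intensities) ** (1.0 / alpha)


ALPHA = 0.5  # illustrative value only, NOT a tuned one

tone = [1.0]        # one dominant spectral line
noise = [0.1] * 10  # ten equal lines carrying the same total energy

print(add_maskers(tone, ALPHA))   # single masker: unchanged
print(add_maskers(noise, ALPHA))  # many maskers: more than the linear sum
```

With the same total energy, the noise-like input ends up with a higher
combined masking power than the single peak, which is the tonality-like
behaviour described above, obtained without computing a tonality index.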

It is very important that whoever implements this approach tunes the
"alpha" factor of the smearing, so that pure noise and a pure tone give masking
powers that agree with Zwicker's data. I found that the "alpha" factor
depends on the window size and on the median Bark value of the partition band.

I have tried this approach in the AAC encoder, but the problem with this model
is its speed: it requires lots of 'pow()' calculations in the
spreading-function convolution, and therefore it is not really
useful in real-time conditions. According to Baumgarte it gives
much better masking estimation, but FhG encoders do not use it.
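
To show where the cost comes from, here is a rough Python sketch of a
spreading convolution with power-law addition. Everything in it is an
assumption made for illustration: the two-slope spread_db function (25 dB/Bark
below the masker, 10 dB/Bark above it), the names, and the counter. It is not
the code from any real encoder.

```python
def spread_db(dz):
    """Hypothetical spreading function: attenuation in dB at Bark
    distance dz = target - masker (steep below, shallow above)."""
    return -10.0 * dz if dz >= 0 else 25.0 * dz


def spread_nonlinear(energies, barks, alpha):
    """Spread partition energies with power-law addition.

    Returns the spread energies and the number of signal-dependent
    power operations that had to be evaluated.
    """
    n = len(energies)
    pow_calls = 0
    out = []
    for b in range(n):
        acc = 0.0
        for j in range(n):
            # The gain depends only on the band layout, so it could be tabulated.
            gain = 10.0 ** (spread_db(barks[b] - barks[j]) / 10.0)
            acc += (energies[j] * gain) ** alpha  # depends on the signal
            pow_calls += 1
        out.append(acc ** (1.0 / alpha))
        pow_calls += 1
    return out, pow_calls


spread, calls = spread_nonlinear([1.0, 0.0, 0.0], [0.0, 1.0, 2.0], 0.5)
print(spread, calls)
```

In this sketch n partitions cost n * (n + 1) power operations per frame, and
they cannot be precomputed because they depend on the signal. With alpha = 1
the inner power disappears and the loop is an ordinary linear convolution.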

Also, check out Anibal Ferreira's PhD thesis, where he describes intra-frame
tonality estimation based on the MDCT spectrum. I haven't tried it because it
is full of hard-core math, but one of these days I might :)

Best Regards,
Ivan Dimkovic


_______________________________________________
mp3encoder mailing list
[EMAIL PROTECTED]
http://minnie.tuhs.org/mailman/listinfo/mp3encoder
