> Good idea, but I'm afraid that doing only this it would still miss:
>
> *very short tones (less than 3 granules long)
> *tones rapidly changing of freqs (sweeps)
>
> But yes, doing forward and backward prediction is a good idea.

Well, there are several alternative methods. Check out the method implemented in PEAQ (ITU-R BS.1387) and Frank Baumgarte's 'non-linear' model. Instead of computing tonality explicitly, these models perform exponential addition of the individual maskers, so the final effect is very similar to tonality estimation. Why? Because a tone is built from a few individual "peaks", while noise contains many spectral lines of similar level.

It is very important that whoever implements this approach tunes the "alpha" factor of the smearing so that pure noise and a pure tone give masking powers that agree with Zwicker's data. I found that the "alpha" factor depends on the window size and on the median Bark value of the partition band.

I have tried this approach in the AAC encoder, but the problem with this model is its speed: it requires lots of pow() calculations in the spreading-function convolution, so it is not really useful in real-time conditions. According to Baumgarte, though, it gives much better masking estimation. FhG encoders, however, do not use it.

Also, check out Anibal Ferreira's PhD thesis, where he describes intra-frame tonality estimation based on the MDCT spectrum. I haven't tried this because it was full of hard-core math, but one of these days I might try it :)

Best Regards,
Ivan Dimkovic
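
To make the "exponential addition" idea concrete, here is a minimal Python sketch. It is not taken from PEAQ or from Baumgarte's model. It assumes the addition has the power-law form (sum of x^alpha)^(1/alpha), and the function names, the triangular spreading shape and its slopes are all hypothetical, chosen only for illustration. A real implementation would tune alpha against Zwicker's data as described above.

```python
def triangular_spread(n_bands, lower_db=27.0, upper_db=10.0):
    """Build a toy spreading matrix; spread[j][k] is the linear gain with
    which a masker in band j leaks into band k.
    The dB-per-band slopes are illustrative assumptions, not tuned values."""
    spread = []
    for j in range(n_bands):
        row = []
        for k in range(n_bands):
            # Steeper slope towards lower bands, shallower towards higher ones.
            slope = lower_db if k < j else upper_db
            row.append(10.0 ** (-slope * abs(k - j) / 10.0))
        spread.append(row)
    return spread


def add_maskers(band_energy, spread, alpha):
    """Combine the spread maskers in every band with a power law:
    (sum_j (E_j * S_jk) ** alpha) ** (1 / alpha).
    alpha = 1.0 is plain intensity addition; alpha < 1.0 makes many
    similar contributions (noise) add up to more than their linear sum,
    while a single peak (tone) is left unchanged."""
    n_bands = len(band_energy)
    masking = []
    for k in range(n_bands):
        acc = 0.0
        for j in range(n_bands):
            # This inner power is the expensive pow() per masker per band.
            acc += (band_energy[j] * spread[j][k]) ** alpha
        masking.append(acc ** (1.0 / alpha))
    return masking
```

With a single peak as input (for example `[0, 0, 1, 0, 0]`), the value at the peak is the same for any alpha, while a flat input (`[1, 1, 1, 1, 1]`) gets more masking as alpha drops below 1. That is the tone/noise difference that falls out of the addition without a separate tonality measure. The nested loop also shows where the cost comes from: one power evaluation per masker per band, every frame.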
