> Ivan Dimkovic wrote:
> >
> > These tables (ISO Psychoacoustic Model) are built by taking values from
> > optimized encoders used in MPEG listening tests.
>
> Optimised ? Errr, really ? From what I've been testing, the absolute
> thresholds seem to be incredibly poor performers here..

Yes - I forgot to mention that they always take the 'worst case'
scenario. The people who monitor MPEG reference software development are
employees of companies making commercial MPEG encoding software, so
obviously they don't want you to be able to make a good encoder without
months (years!) of hard work :) For example, the psychoacoustic model in one
state-of-the-art encoder took hundreds of man-hours to tune.

Check out the LAME ATH formula approximation function, it is very good (with
several presets)...
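
For orientation, here is a minimal sketch of the widely published Terhardt
approximation of the absolute threshold of hearing. Note that this is not
LAME's code: LAME uses a tuned variant with its own coefficients and
presets, which are not reproduced here. The name ath_db and the 10 Hz clamp
are assumptions made for this illustration only.

```python
import math

def ath_db(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL (Terhardt fit)."""
    # Clamp very low frequencies: the f**-0.8 term blows up at 0 Hz.
    # The 10 Hz floor is an arbitrary choice for this sketch.
    f = max(freq_hz, 10.0) / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

if __name__ == "__main__":
    for hz in (100, 1000, 3300, 10000, 16000):
        print(f"{hz:6d} Hz  {ath_db(hz):7.2f} dB SPL")
```

The curve dips below 0 dB SPL around 3-4 kHz, where the ear is most
sensitive, and rises steeply towards both ends of the spectrum.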

> In my escapades with wavelet compression at the moment, I seem to require
> quite substantial boosts of the resulting SNR to attain what can be called
> quality results.. Routinely I'm adding another 12dB over the returned SNR
> of the ISO PA Layer-2 model.

Hmm - adding 12 dB of SNR is not a good idea, at least not for
medium-bitrate codecs. A better approach is fine-tuning the psychoacoustic
model and the analysis-by-synthesis bit allocation algorithms. "Dumb"
addition of bit-rate, simply by increasing the margin to the perceptual
threshold of audibility, is not always a good idea - what if your codec
must code at some fixed bit-rate, say 128 kbps, and a local bitrate
increase is impossible?
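
To put a rough number on that, here is a small sketch of the arithmetic. It
assumes the usual rule of thumb of about 6.02 dB of SNR per quantizer bit
(uniform quantizer), and the 44.1 kHz stereo figures are only an example.
The function names are made up for this illustration; a real codec would
not spend the boost on every single sample, so treat the result as a worst
case.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB of SNR per bit

def extra_bits_per_sample(snr_boost_db):
    """Extra quantizer bits needed per sample for a given SNR boost."""
    return snr_boost_db / DB_PER_BIT

def extra_kbps(snr_boost_db, sample_rate_hz=44100, channels=2):
    """Worst-case extra bitrate if every coded sample gets the boost."""
    bits = extra_bits_per_sample(snr_boost_db) * sample_rate_hz * channels
    return bits / 1000.0

if __name__ == "__main__":
    print(extra_bits_per_sample(12.0))  # close to 2 bits per sample
    print(extra_kbps(12.0))             # worst-case extra kbps
```

Under these assumptions a flat 12 dB boost costs about 2 bits per sample,
which for 44.1 kHz stereo is roughly 176 kbps on its own, more than the
whole 128 kbps budget.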

> I just wanted to try some ATH experiments before deciding whether or not
> to call it a day with the ISO Models, and start on something better from
> scratch..
> I think the later, though it's all top fun anyway :)
>

Hmm.. that question has bugged many people (including the LAME developers,
myself and several other high-quality encoder developers). It turns out that
writing your own model and fine-tuning the coding parameters is always a
better idea than using the ISO reference-software "surrogate" models. Of
course, a good implementation of Psychoacoustic Model II has its uses, but
it also has many flaws (bad tonality detection, a fixed spreading function,
a fixed set of parameters independent of target bitrate, etc.)

> > I think the table is in energy domain (i.e. energy = pow(10.0,
> > DecibelValue/10.0))
>
> Ah, many thanks, I get it now :) Any pointers to the best places to start
> reading on the subject of the current state of play with PA models ?
>
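
On the energy-domain point quoted above, here is a small sketch of the
conversion in both directions. It assumes the table entries are plain dB
values of power, hence the divisor 10 rather than 20. The function names
are made up for this illustration.

```python
import math

def db_to_energy(db_value):
    """Convert a dB value to the linear energy (power) domain."""
    return 10.0 ** (db_value / 10.0)

def energy_to_db(energy):
    """Convert a positive linear energy value back to dB."""
    return 10.0 * math.log10(energy)

if __name__ == "__main__":
    for db in (-10.0, 0.0, 3.0, 10.0, 20.0):
        e = db_to_energy(db)
        print(f"{db:6.1f} dB -> energy {e:10.4f} -> {energy_to_db(e):6.1f} dB")
```

Keeping thresholds in the energy domain makes it easy to add masking
contributions directly; convert back to dB only when comparing against SNR
figures.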

Let's see... First, the origin of Psychoacoustic Model II is AT&T, but
don't expect to get much information from its developer (he will reply
that their psychoacoustic model is a "trade secret" - you can actually buy
it, if you have $$$ to throw ;). So it would be a good idea to start by
reading AES papers on psychoacoustics (www.aes.org), the books by Zwicker,
the new work from Frank Baumgarte, etc. In general, it is possible to find
everything you need to write a good psychoacoustic model - but not in
one book or paper :) Search the LAME mailing list archive: I think a couple
of days ago someone requested a list of recommended reading on
psychoacoustics, so I won't repeat myself here.

Don't forget that a perceptual coder is a set of TOOLS, and each tool must
be tuned to work perfectly with the other coding tools - this is what makes
the difference between a bad, average, very good and perfect encoder
implementation!

Regards,
-- Ivan





_______________________________________________
mp3encoder mailing list
[EMAIL PROTECTED]
http://minnie.tuhs.org/mailman/listinfo/mp3encoder
