>
> > 1. go to transform sizes 1024 and 128
> >
> MP3 uses 576 and 192. If 576 is too short for tonal music and 192 too long for
> percussion, then this is right. But a 1:8 ratio can create other problems.
> Note that MD uses 128, 256, 512 and 1024 sample blocks.
> Useful block sizes range from 1 ms to 35 ms.
>
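For reference, here is a quick conversion of the block sizes mentioned
above into milliseconds. The 44.1 kHz sample rate is my assumption (the
thread does not state one), and block_ms is just an illustrative helper:

```python
SAMPLE_RATE_HZ = 44100  # assumed sample rate; not stated in the thread


def block_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration in milliseconds of a block of `samples` samples."""
    return 1000.0 * samples / sample_rate_hz


# Block sizes quoted in the thread (MP3, MD and the proposed 1024/128).
for size in (128, 192, 256, 512, 576, 1024):
    print(f"{size:5d} samples = {block_ms(size):6.2f} ms")
```

Under that assumption every size listed falls inside the 1 ms to 35 ms
range quoted above.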
I guess it is a trade-off between simplicity and flexibility. The
1024/128 windows come from AAC, but I have no idea if they are
optimal. Adding more window sizes increases complexity, since every
different window size requires a different window function and a
different set of Huffman tables and partitioning schemes. Also, every
transition between two windows of different sizes is lossy. The MDCT is
only lossless for overlapping windows of the same size. So it is good to
minimize transitions. Another thing to keep in mind is that short
windows are not bad for tonal music; they are just less efficient.
Instead of having many different window sizes, I would just
use more bits for the short windows to make sure they sound as good
as longer windows.
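To make the same-size case concrete, here is a minimal sketch of an
MDCT/IMDCT round trip with 50% overlapping blocks of one size. It uses
the textbook MDCT definition and a sine window, not the filterbank of
any particular encoder, and the transform size and test signal are made
up. It only shows that equal-size overlapping windows reconstruct the
input; it does not try to model a transition between sizes.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    half = len(block) // 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
            for n in range(2 * half))
        for k in range(half)
    ]


def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N time-aliased samples."""
    half = len(coeffs)
    return [
        2.0 / half * window[n]
        * sum(coeffs[k]
              * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
              for k in range(half))
        for n in range(2 * half)
    ]


def overlap_add_roundtrip(signal, half):
    """MDCT + IMDCT each block of 2*half samples (hop = half), overlap-add."""
    window = sine_window(2 * half)
    output = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * half + 1, half):
        block = signal[start:start + 2 * half]
        for n, value in enumerate(imdct(mdct(block, window), window)):
            output[start + n] += value
    return output


HALF = 8  # hypothetical tiny transform size, just for the demo
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(6 * HALF)]
rebuilt = overlap_add_roundtrip(signal, HALF)

# Only samples covered by two overlapping blocks cancel their aliasing,
# so the first and last HALF samples are excluded from the comparison.
interior = range(HALF, len(signal) - HALF)
max_error = max(abs(signal[n] - rebuilt[n]) for n in interior)
print(max_error)
```

The interior error should be at the level of floating-point rounding,
because the aliasing from each block is cancelled by its neighbour.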
>
>
> 5.
> Spectral prefiltering to get nearly constant ATH in every CB.
>
If I understood your original posts on this topic, the point of this
is to keep large-amplitude signals in the lower CBs from affecting
lower-amplitude signals in the higher CBs (the so-called filter leakage).
But I don't think this is a problem, since the current filterbanks have
pretty good frequency resolution. The prefilter, unless you go to a
much larger (and more expensive) window, will have just as much leakage
as the current filterbanks.
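A rough way to see how leakage depends on window length: window a
single strong tone, then measure how much of it shows up in a band well
away from the tone. This is only a sketch with a plain windowed DFT and
a sine window, not the actual MP3 or AAC filterbank, and the tone
frequency, probe band and window lengths are arbitrary choices of mine.

```python
import cmath
import math


def sine_window(length):
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def leakage_db(window_len, tone_freq=0.05, band=(0.10, 0.12), probes=50):
    """Worst leakage of a windowed complex tone into `band`, in dB
    relative to the response at the tone itself.
    Frequencies are in cycles per sample."""
    window = sine_window(window_len)
    x = [window[n] * cmath.exp(2j * math.pi * tone_freq * n)
         for n in range(window_len)]

    def magnitude(freq):
        # Windowed DFT evaluated at one arbitrary frequency.
        return abs(sum(x[n] * cmath.exp(-2j * math.pi * freq * n)
                       for n in range(window_len)))

    peak = magnitude(tone_freq)
    step = (band[1] - band[0]) / (probes - 1)
    worst = max(magnitude(band[0] + i * step) for i in range(probes))
    return 20.0 * math.log10(worst / peak)


short_leak = leakage_db(256)
long_leak = leakage_db(2048)
print(short_leak, long_leak)
```

The longer window should show clearly less leakage into the distant
band, which is why a prefilter built on a similar-length window would
not be expected to do better than the existing filterbank.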
>
> > MPEG on the other hand spends a lot of effort (maybe even too much
> > effort?) on "noise shaping".
> >
> This I don't understand.
>
> What makes up MPEG noise shaping? I haven't found any useful documentation,
> so I think of:
>
Noise shaping is the act of allocating bits among the
critical bands. You have to decide which bands are
important and deserve lots of bits/resolution, and
which bands can be quantized with very few bits.
These decisions are based on continuously computing
the quantization noise and comparing it to the
psychoacoustic masking threshold in each CB. (The effects you were
describing are what the psychoacoustic model attempts to capture.)
I believe noise shaping is the main difference between different MP3
encoders. I'm sure MPEG did not document any good noise shaping
algorithms on purpose :-) There are a few simple things in the
literature, but I've never found any documentation of a noise shaping
algorithm used in an actual commercial encoder.
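To illustrate the idea only, here is a toy greedy allocator. It is not
the algorithm of any real encoder. It assumes each extra bit lowers the
quantization noise in a band by about 6.02 dB, and the band levels and
masking thresholds are made-up numbers; allocate_bits is a hypothetical
helper.

```python
DB_PER_BIT = 6.02  # assumed noise reduction per extra bit of resolution


def allocate_bits(signal_db, mask_db, total_bits):
    """Greedy allocation: repeatedly give one bit to the band whose
    quantization noise exceeds its masking threshold by the most.
    Stops early once the noise is masked in every band."""
    bits = [0] * len(signal_db)
    for _ in range(total_bits):
        # Noise-to-mask ratio per band under the 6.02 dB/bit model.
        nmr = [s - DB_PER_BIT * b - m
               for s, b, m in zip(signal_db, bits, mask_db)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0.0:
            break  # noise is below the mask everywhere
        bits[worst] += 1
    return bits


# Made-up levels for three critical bands (dB).
signal_db = [60.0, 40.0, 20.0]
mask_db = [30.0, 35.0, 25.0]
print(allocate_bits(signal_db, mask_db, total_bits=10))  # [5, 1, 0]
```

The third band gets no bits at all because its signal is already below
the masking threshold, and the allocator stops at 6 of the 10 available
bits once every band's noise is masked.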
Mark
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )