Hi Raph

Thanks for the test case and observations!  It sounds like a good
case that really isolates the background noise variation, which,
now that you mention it, I think is the cause of problems in several
of the cases I've been working with.

There is a related phenomenon we have observed with joint stereo:
frequent switching between regular stereo and mid/side stereo will
create noticeable artifacts.  It looks like there are probably many other
things that would also benefit from a more consistent frame-to-frame
treatment.  I believe this is what the AAC encoders refer to as
"temporal noise shaping" - trying to keep the noise profile slowly
varying in time may be more important than aggressive noise
minimization for each frame.
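
A minimal sketch of the "consistent treatment" idea applied to the
stereo mode decision. It assumes a hypothetical per-frame preference
list ("LR" or "MS") coming out of whatever analysis the encoder already
does; the function name and the hold parameter are made up for
illustration, and this is not a description of how lame actually decides.

```python
def smooth_mode_decisions(raw_decisions, hold=3):
    """Apply hysteresis to a two-valued per-frame mode preference.

    raw_decisions: list of "LR" / "MS" strings, one per frame (hypothetical input).
    hold: number of consecutive frames that must ask for the other mode
          before the encoder actually switches.
    """
    if not raw_decisions:
        return []
    current = raw_decisions[0]   # start in whatever the first frame wants
    pending = 0                  # consecutive frames asking for the other mode
    out = []
    for want in raw_decisions:
        if want == current:
            pending = 0          # request withdrawn, stay where we are
        else:
            pending += 1
            if pending >= hold:  # request has been stable long enough
                current = want
                pending = 0
        out.append(current)
    return out
```

With hold=3, a single "MS" request between "LR" frames is ignored, and
the mode only flips once three frames in a row ask for it.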

Mark



> Date: Fri, 3 Dec 1999 20:30:53 -0800
> From: Raph Levien <[EMAIL PROTECTED]>
> 
> Greetings lamers,
> 
>    First, major kudos for the work that's been done, especially in the
> space between 3.13 and 3.50. Lame is truly head and shoulders above the
> other free encoders, and is starting to give Fraunhofer a run for
> their money. It is exciting to think that with continuing refinement
> it may surpass the FhG coder.
> 
>    I've been doing a fair amount of listening tests while archiving my
> CD collection, and, while I'm not a "golden ear" by any standards,
> what I've come up with may be of interest.
> 
>    One of my standard test tracks now is the beginning of track 6 of
> the "Ma Vie En Rose" soundtrack. I've put a clip of this up at:
> 
>    http://www.cs.berkeley.edu/~raph/mp3/
> 
>    This track has a few interesting features. The most relevant for
> lame is the harp glissando at the end of the clip. At 128kbps, if you
> listen carefully to the background noise, you hear it modulate in
> amplitude.
> 
>    To my ears, it sounds like the problem lame has here is consistency
> from frame to frame. This is something that FhG excels at, even at low
> bit rates. My guess is that they have something in there that
> explicitly manages frame-to-frame consistency.
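
One way to picture "explicitly managing frame-to-frame consistency" is
a limit on how fast the per-frame noise level is allowed to move. This
is only a sketch under stated assumptions: the per-frame noise figures
(in dB), the function name, and the step size are all hypothetical, and
nothing here is known to match what FhG or lame do internally.

```python
def limit_noise_step(noise_db, max_step_db=1.5):
    """Clamp each frame's noise level to within max_step_db of the previous output.

    noise_db: per-frame noise levels in dB chosen independently for each frame
              (hypothetical input).
    Returns a new list in which consecutive values differ by at most max_step_db.
    """
    out = []
    for level in noise_db:
        if out:
            prev = out[-1]
            # Keep the level inside [prev - max_step_db, prev + max_step_db].
            level = min(max(level, prev - max_step_db), prev + max_step_db)
        out.append(level)
    return out
```

A real encoder would still have to reconcile the smoothed level with
its masking thresholds and bit budget; the sketch ignores both.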
> 
>    If you compare coders at low bitrates (~64kbps), you tend to hear
> both the background noise variations and a "warbling" effect in tonal
> passages. Both sound to me like frame-to-frame variation, but what's
> interesting is that the degradation pattern is _not_ the same among
> the different codecs. In particular, blade is a lot worse than lame
> for the warbling effects (I hear them quite clearly at 128kbps), but
> amplitude variation in the background noise is actually better. This
> suggests to me that there's an aspect of the lame psychoacoustic model
> which is over-optimizing for something other than background noise
> consistency.
> 
>    To me, lame's VBR doesn't particularly help with this artifact, by
> the way.
> 
>    I've also listened to the ftb_samp example. I agree it's an
> excellent test, as the degradation at 128kbps is quite noticeable.
> Again, to my ears it sounds like a lot of the problem is
> frame-to-frame amplitude variation. The sounds are a lot more complex,
> though, so it's a little harder for me to pick out what's going on.
> Incidentally, if you want to hear a joke, listen to this track at
> 128kbps with the FhG 2.72 encoder; it does a terrible job (much worse
> than lame 3.50).
> 
>    I suspect that what makes this track particularly difficult is
> their use of chorusing effects. Intuitively, this should make sense -
> chorusing basically takes narrow frequency peaks in the source stream
> and adds new ones close by in the output stream. And this is, in fact,
> exactly what the MP3 psychoacoustic theory says you can't hear well
> :). A chorusing unit will also introduce subtle periodic variations. A
> lot of what I've heard from the not-even-lame coders sounds like
> beating between the frame rate and the periodicity of the chorus unit.
> 
>    I haven't done any serious work with audio compression, but I have
> with image compression, and there are (I think) some interesting
> analogies. JPEG, like MP3, is based on breaking the source signal into
> blocks (576/192 samples for MP3, 8x8 pixel blocks for JPEG), doing a
> DCT, quantizing, and Huffman coding. They are, I think, almost
> cousins. There are differences, of course; JPEG doesn't do the subband
> thing, and its DCT is 2D rather than 1D. However, even the "stereo" is
> in some way analogous: JPEG encodes a three-channel signal (RGB) by
> splitting into a "mid" (Y, intensity) and "side" (Cr and Cb, red and
> blue chromaticities) signal, and encoding each separately.
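
The analogy can be made concrete by putting the two transforms side by
side. The sketch below assumes the sqrt(2)-normalized mid/side form
used for MP3 joint stereo and the full-range JFIF RGB to YCbCr
conversion; the function names are illustrative only.

```python
import math

def mid_side(left, right):
    """Mid/side transform of one stereo sample pair (sqrt(2)-normalized)."""
    mid = (left + right) / math.sqrt(2)
    side = (left - right) / math.sqrt(2)
    return mid, side

def rgb_to_ycbcr(r, g, b):
    """JFIF full-range RGB -> YCbCr for 8-bit component values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

In both cases the "difference" channels collapse when the inputs agree:
identical left and right samples give a side value of zero, and a gray
pixel gives Cb and Cr at their neutral value of 128.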
> 
>    Perhaps not surprisingly, then, even the artifacts are analogous.
> It's well known that JPEG performs very poorly on detailed edges near
> a smooth (or even white, in the case of most documents) background.
> Remind you of pre-echo? Also, at high compression ratios, you can
> easily see the seams at the edge of each 8x8 block.
> 
>    Smarter JPEG encoders (like the "optimize" mode in the IJG coder,
> which is free software and almost certainly the best coder out there)
> explicitly do things to reduce the block-to-block seam artifacts.
> Perhaps both camps have something to learn from each other.
> 
>    Finally, on the subject of the patent-free coder, you guys probably
> know that Huffman coding is not the most information-theoretically
> efficient. Arithmetic coding probably holds that title, but is
> unfortunately patented by IBM. However, zip-style compression is
> capable of squeezing out some of the difference. I've noticed that
> gzip reduces the size of .mp3's by 2 - 2.5%. It might make sense to
> use zip _instead_ of the Huffman coding. You might also look into some
> of the things that the png project has done to increase the
> effectiveness of zip compression in the lossless domain.
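
The gzip figure is easy to check on any file. A rough sketch using
Python's standard gzip module follows; the function name and the file
path in the commented usage are placeholders, and the 2 - 2.5% number
quoted above is Raph's own measurement, not something this code
reproduces by itself.

```python
import gzip

def gzip_saving(data: bytes, level: int = 9) -> float:
    """Return the fraction of bytes saved by gzip (negative if the data grew)."""
    if not data:
        raise ValueError("empty input")
    compressed = gzip.compress(data, compresslevel=level)
    return 1.0 - len(compressed) / len(data)

# Usage (placeholder path):
# with open("track.mp3", "rb") as f:
#     print(f"{gzip_saving(f.read()):.1%}")
```

Highly redundant input gives a saving close to 1, while data that is
already well entropy-coded gives a saving near zero or slightly below.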
> 
>    Again, kudos for the fabulous work. Lame still has a long way to go
> at low bitrates, but at 160kbps is definitely good enough for my
> archive.
> 
> Raph
> 
> P.S. A bit of friendly spam: I warmly invite all the developers of lame
> to sign up for accounts at http://www.advogato.org/
> --
> MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
> 