> From: "Mathew Hendry" <[EMAIL PROTECTED]>
> Date: Mon, 27 Mar 2000 19:08:42 +0100
>
> > From: "Paul Hartman" <[EMAIL PROTECTED]>
> >
> > Somebody pointed out to me that somewhere in the process of encoding a WAV
> > to MP3, then decoding back to WAV some extra frames are added at the end
> > (appears to be silence?). Re-encoding and decoding this same thing steadily
> > makes it grow each time. It seems not necessarily to be any particular
> > encoder or decoder, or any specific bitrate. Is there an explanation for
> > this, and is there any way to prevent or correct this (so that an MP3
> > decoded back to WAV can be as close as possible to the original)?
>
> The explanation is twofold:
>
> 1) Encoding and decoding delays - of the order of 1000 samples; encoder- and
> decoder-specific
> 2) Padding to frame boundaries - on average adds 1152/2 samples
>
> The delays add silence to the beginning of the decoded signal; the padding
> adds silence to the end.
>
> If you know the combined (encoder + decoder) delay and the length of the
> original signal, you can compensate for both effects with a little editing.
>
> -- Mat.
>
Here's a more technical explanation. This is part of a FAQ that is
yet to be posted on the web site. I post this to this news group
about once a month. I've never gotten any comments, so either it is
crystal clear, or unintelligible :-)
Mark
1. Why does LAME add silence to the beginning and end of each song?
This is because of several factors:
DECODER DELAY AT START OF FILE:
All *decoders* I have tested introduce a delay of 528 samples. That
is, after decoding an mp3 file, the output will have 528 samples of
0's prepended to the front. This is because the standard
MDCT/filterbank routines used by the ISO have a 528 sample delay. It
would be possible to write an MDCT/filterbank routine with a 0 sample
delay (see the description of Takehiro's MDCT/filterbank routine used in
LAME encoding below), but I don't know that anyone has done this.
Furthermore, because of the overlapped nature of MDCT frames, the
first half of the first granule (1 granule=576 samples) doesn't have a
previous frame to overlap with, resulting in attenuation of the first
N samples. The value of N depends on the window type. For
"STOP_TYPE" and "SHORT_TYPE", N=96, while for
"START_TYPE" and "NORMAL_TYPE", N=288. The first frame produced by
LAME 3.56 and up will always be of STOP_TYPE or SHORT_TYPE.
ENCODER DELAY AT START OF FILE:
ISO based encoders (BladeEnc, 8hz-mp3, etc) use an MDCT/filterbank
routine similar to the one used in decoding, and thus also introduce
their own 528 sample delay. A .wav file encoded & decoded will have a
1056 sample delay (1056 samples will be prepended to the beginning).
The FhG encoder (at highest quality) introduces a 1160 sample delay,
for a total encoding/decoding delay of 1688 samples. I haven't tested
Xing.
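The totals above are plain sums of the encoder and decoder delays, and once
the total is known the leading silence can be cut off again. Here is a
minimal sketch in Python, assuming the decoded audio is held as a simple
sequence of samples for one channel. The names total_delay and trim_decoded
are made up for illustration and are not part of LAME or any decoder.

```python
DECODER_DELAY = 528  # samples, as measured for the decoders mentioned above

# Encoder delays quoted in this FAQ, in samples.
ENCODER_DELAY = {
    "iso": 528,   # ISO based encoders such as BladeEnc, 8hz-mp3
    "fhg": 1160,  # FhG at highest quality
}


def total_delay(encoder_delay, decoder_delay=DECODER_DELAY):
    """Combined encode + decode delay in samples."""
    return encoder_delay + decoder_delay


def trim_decoded(samples, delay, original_length):
    """Drop `delay` leading samples, then keep `original_length` samples.

    Hypothetical helper: assumes `samples` is one channel of decoded PCM
    and that both the delay and the original length are known.
    """
    return samples[delay:delay + original_length]


print(total_delay(ENCODER_DELAY["iso"]))  # 1056
print(total_delay(ENCODER_DELAY["fhg"]))  # 1688
```

Cutting to the original length also removes whatever padding was added at
the end, so the same helper covers both effects.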
Starting with LAME 3.55, we have a new MDCT/filterbank routine written
by Takehiro Tominaga with a 48 sample delay. With even more rewriting,
this could be reduced to 0. And there is no reason an inverse routine
could not be used in a decoder. However, there are a few problems
with using such a short delay:
1.) The first 96 samples of the first frame are attenuated by the MDCT
window. If the encoder delay is greater than 96, this window will
have no effect since the first 96 samples are all padding. With a
48 sample encoder delay, the first 48 samples of real data will be
improperly attenuated (about .001 seconds worth of data at 44.1kHz).
2.) In LAME, psycho-acoustics for the first 576 sample granule are not correct.
This could be fixed, but at the expense of adding more buffering
and code complexity.
If points 1. or 2. do not bother you, you can decrease
the encoder delay by setting ENCDELAY in encoder.h. The default
right now is 800.
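To see how much real audio a given ENCDELAY would expose to the window
attenuation from point 1, here is a small sketch in Python. It assumes the
first frame is STOP_TYPE or SHORT_TYPE (so 96 samples are attenuated) and a
44.1kHz sample rate. The function name is invented for this example and
does not exist in LAME.

```python
WINDOW_ATTENUATION = 96  # samples attenuated for STOP_TYPE / SHORT_TYPE
SAMPLE_RATE = 44100      # Hz, assumed for the time conversion


def attenuated_input_samples(encoder_delay, window=WINDOW_ATTENUATION):
    """Number of real input samples that land in the attenuated region.

    Samples covered by the encoder delay are padding, so only the part of
    the window that extends past the delay touches real audio.
    """
    return max(0, window - encoder_delay)


for delay in (48, 96, 800):
    n = attenuated_input_samples(delay)
    print(delay, n, round(n / SAMPLE_RATE * 1000, 2), "ms")
```

With a delay of 48 this gives 48 samples, and with the default of 800 it
gives 0, which matches the description above.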
PADDING AT THE END OF A FILE:
Extra padding at the end of a file can be caused by a couple of things:
1. Because the MDCTs are overlapped, it looks something like this:
<--576 MDCT coefficients--><--576 MDCT coefficients--><--576 MDCT coefficients-->
             <-- 576 samples PCM output --><-- 576 samples PCM output -->
So no matter where you truncate your MP3 file, the last 288 samples of
the final granule will not be decoded. So LAME appends 288 samples of
padding to the input file to guarantee all input samples will be
decoded.
2. If the number of samples is not an exact multiple of 1152,
then the last frame of data is padded with 0's so that it has 1152 samples.
Before LAME 3.56, we just added a few extra frames to make sure all
internal buffers would be flushed. In LAME 3.56, we tried to pad
with the exact minimum number of samples needed.
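The two sources of end padding can be put into numbers with a rough sketch
in Python. This is a simplified model and not LAME's actual code: it
assumes the encoder sees the encoder delay, then the input, then 288 extra
samples, and rounds the total up to whole 1152 sample frames. The function
name end_padding is hypothetical, and the default of 800 is the ENCDELAY
value mentioned above.

```python
FRAME = 1152       # samples per frame (two 576 sample granules)
OVERLAP_PAD = 288  # extra samples so the final granule is fully decoded


def end_padding(num_samples, encoder_delay=800):
    """Return (frames, padding) under the simplified model described above.

    `padding` is the number of zero samples that follow the real input,
    including the 288 overlap samples.
    """
    needed = encoder_delay + num_samples + OVERLAP_PAD
    frames = -(-needed // FRAME)  # ceiling division
    padding = frames * FRAME - encoder_delay - num_samples
    return frames, padding


# One second of audio at 44.1kHz with the default delay of 800.
print(end_padding(44100))  # (40, 1180)
```

In this model the padding is never less than 288 samples and never reaches
288 + 1152.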
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )