Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-10-02 Thread Mark Taylor

Hi Frank,

 
> 1. FFT filters are strictly speaking no filters (they are not an LTI system),
>    so they have some nasty properties, which are more or less audible. The
>    audibility depends on the steepness of the filter.  So high passes should
>    never be made by FFT filters. Never ever.
>
>    Filter flanks modulate the signal, a property LTI systems NEVER have.
>    Maybe low pass filters are also a bad idea. For high pass filters I'm
>    absolutely sure.
 

Yes, I agree but I've never heard of anyone using an FFT for low pass
filtering.  The problem is piecing together the results from the
different windows.  LAME does use an FFT filterbank for the psycho
acoustics, where it needs the energies for each frequency.  A windowed
FFT gives excellent power spectrum estimation, which is why its use is
suggested by MPEG for the psycho acoustics in all their codecs.
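
A minimal sketch of the per-frequency energy estimate described above, assuming a Hann window and a naive O(n^2) DFT (both are my choices for illustration; the thread does not say which window LAME uses, and this is not LAME's code):

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def power_spectrum(block):
    """Energy in DFT bins 0..n/2 of a Hann-windowed block (naive DFT)."""
    n = len(block)
    windowed = [x * w for x, w in zip(block, hann(n))]
    energies = []
    for k in range(n // 2 + 1):
        s = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        energies.append(abs(s) ** 2)
    return energies

# A sine sitting exactly on bin 8 of a 64-sample block.
n = 64
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
energies = power_spectrum(tone)
peak_bin = max(range(len(energies)), key=energies.__getitem__)
```

With this window the energy of an on-bin tone lands in the peak bin and its two neighbours, and the bins further away stay at rounding-error level.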

The MDCT is different, it was only invented/discovered in the
1980s to solve the "edge effects" problem.  It is a set of overlapped,
modified cosine transformations.  The windows have to be specially
constructed to make the transform (forward then backward) lossless.
It is not possible to get a lossless transform out of any windowed FFT,
nor if the windows overlap less than 50% (for a proof of this, see
Malvar, "Signal Processing with Lapped Transforms").  I was going
to say Malvar knows what he is doing, except on his web site:
http://www.research.microsoft.com/~malvar/
he has the patently absurd comment that WMA at 64 kbps beats
MP3 at 128 kbps :-)
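
The lossless forward/backward property with 50% overlap can be checked with a small sketch. It assumes the textbook MDCT definition, a sine window (which satisfies w(n)^2 + w(n+N)^2 = 1) and a 2/N scaling on the inverse; none of these details are given in the thread, and the function names are made up for the example:

```python
import math

def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(block):
    """2N windowed samples -> N coefficients."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    """N coefficients -> 2N time-aliased samples (before windowing)."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for t in range(2 * n)]

def round_trip(signal, n):
    """Window, MDCT, IMDCT, window again, overlap-add with hop n."""
    w = sine_window(2 * n)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        block = [signal[start + t] * w[t] for t in range(2 * n)]
        rec = imdct(mdct(block))
        for t in range(2 * n):
            out[start + t] += rec[t] * w[t]
    return out

n = 8
signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(6 * n)]
restored = round_trip(signal, n)
# Only samples covered by two overlapping windows are reconstructed;
# the first and last n samples see a single window and stay aliased.
max_error = max(abs(a - b) for a, b in zip(signal[n:-n], restored[n:-n]))
```

Each block of 2N samples produces only N coefficients, so a single block is not invertible; the aliasing cancels only when neighbouring blocks of the same size are added.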


The "polyphase filterbank" is similar to the MDCT, but not lossless,
and has since been replaced by the MDCT (for example, in AAC and
Vorbis).  Here's what one of the MPEG papers says about the polyphase
filterbank (maybe this means something to you): "a QMF filter of order
511, with rejection of side lobes better than 96db."
They call the MDCT a QMF filter with perfect reconstruction.

Using your notation (below), LTI filters are what I've seen called FIR
(for when b(i)=0) and IIR (when b(i)!=0).  So now I finally understand
what you are talking about :-) I doubt the MDCT can be characterized
as an LTI filter.
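
A small sketch of the FIR/IIR distinction in Frank's notation, y(n) = sum a(i) x(n-i) - sum b(i) y(n-i) with b(0) = 1. The function name `arma_filter` is made up for this example:

```python
def arma_filter(a, b, x):
    """y(n) = sum_{i=0..A} a[i] x(n-i) - sum_{i=1..B} b[i] y(n-i), b[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = sum(a[i] * x[n - i] for i in range(len(a)) if n - i >= 0)
        acc -= sum(b[i] * y[n - i] for i in range(1, len(b)) if n - i >= 0)
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 9

# IIR: first-order integrator y(n) = x(n) + y(n-1), i.e. a = [1], b = [1, -1].
integrator = arma_filter([1.0], [1.0, -1.0], impulse)

# FIR: all b(i) = 0 for i >= 1. Five taps can only imitate the integrator
# for five samples, then the impulse response ends.
truncated_ma = arma_filter([1.0] * 5, [1.0], impulse)
```

The integrator's impulse response stays at 1 forever, while the five-tap version drops to 0 after five samples, which is the sense in which the simplest recursive filter needs an infinitely long non-recursive one.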

Mark



 
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )



Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-27 Thread Frank Klemm

Hi Mark,

::  A couple of comments/questions:
::   
::   ::  Also, every transition from two different size windows is lossy.  The
::   ::  MDCT is only lossless for overlapping windows of the same size. 
::   ::
::   Is this a problem of bad designed (asymmetric) window functions or a
::   problem of the MDCT (different from DCT).
::   
::  
::  It is a problem with all lapped transforms (like the MDCT).  
::  You need at least a 50% overlap to get a lossless transform.  
::  In AAC, a 1024 frame followed by a 128 frame, the 128 frame
::  will use a window of size 256, so it only extends 64 samples into
::  the 1024 frame.
::
I've tested a (slow, stupid) FT-based system with randomly switching
window sizes and got only rounding errors. The signal is divided into
arbitrary blocks. Every FFT block spans two of these blocks and uses two
(often different) cos² functions for cross fading. The blocks are cosine
transformed.
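
A sketch of why such cross-fades add up cleanly, under my reading of the scheme: window k fades in over block k with sin² and fades out over block k+1 with cos², so inside every block the two active ramps have the same length. The block sizes and names are made up for illustration. The sketch only checks the window sums; it does not address critical sampling (the number of coefficients per sample), which is what the MDCT overlap constraint is about:

```python
import math
import random

def fade_in(length):
    """sin^2 ramp from 0 to 1 over `length` samples."""
    return [math.sin(0.5 * math.pi * (i + 0.5) / length) ** 2
            for i in range(length)]

def window_sum(block_sizes):
    """Sum of all cross-fade windows at every sample position."""
    starts = [0]
    for size in block_sizes:
        starts.append(starts[-1] + size)
    total = [0.0] * starts[-1]
    for k in range(len(block_sizes) - 1):
        rise = fade_in(block_sizes[k])
        fall = [1.0 - v for v in fade_in(block_sizes[k + 1])]  # cos^2 ramp
        for i, v in enumerate(rise):
            total[starts[k] + i] += v
        for i, v in enumerate(fall):
            total[starts[k + 1] + i] += v
    return total, starts

random.seed(1)
sizes = [random.choice([128, 256, 512, 1024]) for _ in range(8)]
total, starts = window_sum(sizes)
# All blocks except the first and the last are covered by two windows.
interior = total[starts[1]:starts[-2]]
worst = max(abs(v - 1.0) for v in interior)
```

Because sin² + cos² = 1 within each block, the windows sum to one for any sequence of block sizes, which is consistent with seeing only rounding errors.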

When I have time, I will test this again (I'm not so familiar with the DCT;
I'm only familiar with LTI, FT and zT, and LT is also a little bit more
difficult).



::   4. The prefilter has an extremely short size of 4 or 5 taps, which is
::  far below 128, 192, 576, or 1024.
::   
::   
::  If this is true, then the 1024 FFT should have plenty of frequency
::  resolution and the prefilter can be easily implemented via the FFT
::  coefficients.  So no need for a new filter?
::

1. FFT filters are strictly speaking no filters (they are not an LTI system),
   so they have some nasty properties, which are more or less audible. The
   audibility depends on the steepness of the filter.  So high passes should
   never be made by FFT filters. Never ever.

   Filter flanks modulate the signal, a property LTI systems NEVER have.
   Maybe low pass filters are also a bad idea. For high pass filters I'm
   absolutely sure.

2. FFT filters approximate non-recursive filters (often called FIR filters,
   which is not correct), but actually they are a mixture of a frequency
   dependent modulator and a filter. Non-recursive filters are only a very
   special class of filters. All LTI filtering is done by:

      y(n) := \sum_{i=0}^{A} a(i) x(n-i) - \sum_{i=1}^{B} b(i) y(n-i)

   Every filter can be characterized by the a(0...A) and b(1...B). For
   non-recursive filters B=0 and A>=0 (also called moving average filters);
   for absolute phase filters A=0 and B>0 (also called autoregressive
   filters). Filters with B>0 and A>0 mix both base vectors of filters
   and are also called autoregressive moving average filters.

   You can divide every (LTI) filter into two filters, an MA and an AR filter:

      v(n) := \sum_{i=0}^{A} a(i) x(n-i)

      y(n) := v(n) - \sum_{i=1}^{B} b(i) y(n-i)

   Now you can set b(0) := 1:

      v(n) := \sum_{i=0}^{A} a(i) x(n-i)

      v(n) := \sum_{i=0}^{B} b(i) y(n-i)

   This gives (x, y, z complex; \Omega is a capital omega, \omega = \Omega/fs,
   j is sqrt(-1)):

      v(\omega)/x(\omega) = \sum_{i=0}^{A} a(i) e^{-j \Omega i/fs}

      v(\omega)/y(\omega) = \sum_{i=0}^{B} b(i) e^{-j \Omega i/fs}

   Substituting e^{-j \Omega/fs} = z (so z is a delay of one sample) gives

      v/x = \sum_{i=0}^{A} a(i) z^i

      v/y = \sum_{i=0}^{B} b(i) z^i

   and

      y/x = \frac{\sum_{i=0}^{A} a(i) z^i}{\sum_{i=0}^{B} b(i) z^i}

   So you can see: MA = P_A(z), AR = 1/Q_B(z) and ARMA = P_A(z)/Q_B(z).

   Example: The easiest AR filter, an integrator (1st order), can only
   be programmed by an infinitely long MA filter.


   Are polyphase filters LTI systems? FFT filters aren't.
   And they are comparable with the subset of MA filters.
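
The claim in point 1 that FFT filters are not LTI can be checked numerically. The sketch below assumes the crudest possible FFT filter (non-overlapping blocks, bins above a cutoff set to zero); the block size, cutoff and function names are illustrative assumptions, and a real implementation with overlap would behave differently in detail. An ordinary 3-tap non-recursive filter is included for contrast:

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def block_fft_lowpass(signal, block=16, keep=3):
    """Zero every DFT bin above `keep` in each non-overlapping block."""
    out = []
    for start in range(0, len(signal), block):
        spec = dft(signal[start:start + block])
        n = len(spec)
        for k in range(n):
            if min(k, n - k) > keep:  # distance to DC, mirrored bins included
                spec[k] = 0
        out.extend(idft(spec))
    return out

def fir(signal, taps):
    """Plain non-recursive filter: y(n) = sum taps[i] * x(n-i)."""
    return [sum(taps[i] * signal[n - i] for i in range(len(taps)) if n - i >= 0)
            for n in range(len(signal))]

def shift_mismatch(system, signal):
    """Largest difference between 'delay, then filter' and 'filter, then delay'."""
    delayed_then_filtered = system([0.0] + signal[:-1])
    filtered_then_delayed = [0.0] + system(signal)[:-1]
    return max(abs(u - v)
               for u, v in zip(delayed_then_filtered, filtered_then_delayed))

impulse = [0.0] * 64
impulse[20] = 1.0
fft_mismatch = shift_mismatch(block_fft_lowpass, impulse)
fir_mismatch = shift_mismatch(lambda s: fir(s, [0.25, 0.5, 0.25]), impulse)
```

For a time-invariant system the mismatch is zero. The block FFT version fails because its response wraps around inside each block, so it depends on where the input sits relative to the block grid.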

-- 
Mit freundlichen Grüßen
Frank Klemm
 
eMail | [EMAIL PROTECTED]   home: [EMAIL PROTECTED]
phone | +49 (3641) 64-2721home: +49 (3641) 390545
sMail | R.-Breitscheid-Str. 43, 07747 Jena, Germany




Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-25 Thread mythos

Frank Klemm wrote:
 
> On Sat, Sep 23, 2000 at 10:24:58AM -0600, Mark Taylor wrote:
> >
> > There is one thing I would like to do, but the work in LAME
> > seems to never end :-)   A variant on MP3 which
> > uses everything from LAME, but with the following changes:
> >
>
> I would not call it MP3. A distinct name (MP4 or MPEG-%f Layer IV)
> would be much better, so as not to confuse millions of people.

Well, I have actually been thinking about this for some time. How about
MP3x (mp3 eXtension or eXtended), something that would make it clear
that it's a variant of mp3?

thanks again




Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-25 Thread Mark Taylor

Hi Frank,

A couple of comments/questions:

> ::  Also, every transition from two different size windows is lossy.  The
> ::  MDCT is only lossless for overlapping windows of the same size.
> ::
> Is this a problem of badly designed (asymmetric) window functions or a
> problem of the MDCT (as opposed to the DCT)?
 

It is a problem with all lapped transforms (like the MDCT).  
You need at least a 50% overlap to get a lossless transform.  
In AAC, a 1024 frame followed by a 128 frame, the 128 frame
will use a window of size 256, so it only extends 64 samples into
the 1024 frame.


 
> 4. The prefilter has an extremely short size of 4 or 5 taps, which is
>    far below 128, 192, 576, or 1024.


If this is true, then the 1024 FFT should have plenty of frequency
resolution and the prefilter can be easily implemented via the FFT
coefficients.  So no need for a new filter?


Mark





   



Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-24 Thread Frank Klemm

::  
::
::1. go to transform sizes 1024 and 128
::   
::   MP3 uses 576 and 192. When 576 is too low for tonal music and 192 too long for
::   percussions, then this is right. But a 1:8 ratio can create other problems.
::   Note that MD uses 128, 256, 512 and 1024 sample blocks.
::   Useful are block sizes from 1 ms ... 35 ms.
::   
::  
::  I guess it is a trade off between simplicity and flexability.  The
::  1024/128 windows come from AAC, but I have no idea if they are
::  optimal.  Adding more window sizes increases complexity, since every
::  different since window requires a different window function and a
::  different set of huffman tables and partitioning schemes.
::
increases complexity:
Often complexity is decreased by such trades. You replace
hard decisions by much softer decisions.

different window:
If the window size ratio is only 1:2, I would use composed cos^2
windows. You need FFTs of 128*(2,3,4,5,6,8,9,10,12,16) samples.

huffman tables:
Yes, difficult.

partitioning schemes:
A problem of MP3-like systems with uniform time slices.


::  Also, every transition from two different size windows is lossy.  The
::  MDCT is only lossless for overlapping windows of the same size. 
::
Is this a problem of badly designed (asymmetric) window functions or a
problem of the MDCT (as opposed to the DCT)?


::  So it is good to minimize transitions.  Another thing to keep in mind is
::  that short windows are not bad for tonal music - they just are not as
::  efficient. 
::
You need more bits, or in CBR modes, quality decreases. But it is true,
short blocks only need a small amount of additional bits, if used instead
of long blocks. Vice versa you need a lot more bits.


::   5.
::   Spectral prefiltering to get nearly constant ATH in every CB.
::   
::  If I understood your original posts on this topic, the point of this is
::  to keep large amplitude signals in the lower CB effecting lower
::  amplitude signals in the higher CBs (the so called filter leakage). But
::  I dont think this is a problem since the current filter banks have
::  pretty good frequency resolution.  The prefilter, unless you go to a
::  much larger (and more expensive) window will have just as much leakage
::  as the current filterbanks.
::  
This is true for all CBs except the first and the last.

1. DC in the signal increases the quantization steps of the first two or
   three bands (not CBs), but is not a masker at all. The same is true for
   AC in the range from 16...70 Hz. Also note that there are HQ loudspeakers
   out there cutting all frequencies below the loudspeaker's power frequency
   response.  They have a nice flat frequency response down to x Hz, and
   below this frequency the response falls with 48 dB/oct. x can be
   something in the range from 80 Hz (some of the actively controlled BO,
   a very tiny box with 2x4" bass speakers, sounds good, but is unable to
   produce any low frequencies) down to 25 Hz (Canton Digital 1, digitally
   controlled and equalized). These loudspeakers also don't create any
   distortion if there are high level low frequency signals (a vented tube
   box creates a lot of it, and it is also possible to kill a 150 W
   loudspeaker with 1...5 Watt by pulling out the diaphragm).

   So signal below 80 Hz should not affect masking, right?
   This is not a MP3/MP4/AAC related topic, it's a psycho related topic.

2. Quantization steps of all bands within a CB are identical. This doesn't
   matter for sfb 2...19, but it does for sfb 0, 1, 20 and 21, especially
   for 0 (0...120 Hz) and 21 (16...22/24 kHz).

   But the ATH within these sfbs differs by 20 dB. So especially sfb21 is
   still difficult to code even if there is an additional sfb21 scaler. The
   only possibility you have is to cut high frequencies (for instance at
   18 kHz).

   To change this, the psycho model and the coding must be changed
   (= no standard MP3).

3. Prefiltering eases the programming of psycho. You are separating
   static ATH and dynamic masking and you can handle them separately.
   The human ear also does that. ATH is a property of the sound conduction
   from the free field to the cochlea, masking is an effect within
   the cochlea.

4. The prefilter has an extremely short size of 4 or 5 taps, which is
   far below 128, 192, 576, or 1024.

   

::  noise shaping is the act of allocating bits among the critical bands. 
::  You have to decide which bands are important and deserve lots of
::  bits/resolution, and which bands can be quantized with very few bits.
::
Aha. Bit-balancing between the CBs.
Not bit-balancing within one CB.

::  These decisions are based on continously computing the quatization noise
::  and comparing it to the psycho acoustic maskings in each CB.  (The
::  effects you were describing are attempted to be modeled by the psycho
::  acoustics).
::
I don't understand how psycho acoustics can model the effect I described.
IMHO it's a quantization problem.

The human ear can't distinguish (white) 

Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-24 Thread mythos

Gabriel Bouvigne wrote:
 
> > how about altering some of the mp3 specs themselves and creating a lame
> > specific mp3 variant?
> > are there any legal reasons not to do this? would the quality gain be
> > worth the effort?
> The problem is that no player would then be able to play the files.
> MP3 is an international standard, and I think that we must follow it.
> For creating a new standard, there is another project called Vorbis.
>
> Regards,
>
> --
> Gabriel Bouvigne - France
> [EMAIL PROTECTED]
> mobile phone: [EMAIL PROTECTED]
> icq: 12138873
>
> MP3' Tech: www.mp3-tech.org

Doesn't lame already use a variant of the mpg123lib for its internal
decoding?
Would it be possible to use that as the basis for a plugin for other
players?



Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-23 Thread Gabriel Bouvigne



> how about altering some of the mp3 specs themselves and creating a lame
> specific mp3 variant? are there any legal reasons not to do this? would
> the quality gain be worth the effort?

The problem is that no player would then be able to play the files. MP3 is
an international standard, and I think that we must follow it.
For creating a new standard, there is another project called Vorbis.


Regards,

--
Gabriel Bouvigne - France
[EMAIL PROTECTED]
mobile phone: [EMAIL PROTECTED]
icq: 12138873
MP3' Tech: www.mp3-tech.org


Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-23 Thread Mark Taylor


 
> > how about altering some of the mp3 specs themselves and creating a lame
> > specific mp3 variant?
> > are there any legal reasons not to do this? would the quality gain be
> > worth the effort?
>
> The problem is that no player would then be able to play the files. MP3 is
> an international standard, and I think that we must follow it.
> For creating a new standard, there is another project called Vorbis.
 
 

There is one thing I would like to do, but the work in LAME
seems to never end :-)   A variant on MP3 which
uses everything from LAME, but with the following changes:

1. go to transform sizes 1024 and 128
2. replace polyphase filterbank + MDCT with one large MDCT
3. allow mid/side stereo to be turned on/off for each critical band.

This would be a new standard, but it could only be an improvement over
the current LAME/MP3.

Vorbis is a very different codec, so it might be good to have
both a VQ codec (Vorbis) and a traditional scalefactor/critical band
codec.

Vorbis uses vector quantization which combines entropy coding and
quantization in one step.  The quantization error is controlled by the
choice of codebook (fixed in advance?) and to a lesser extent by the
choice of the "floor" function which is derived from the psycho
acoustics.  But Vorbis is unusual in that, IIRC, the encoder does not make
these choices based on the quantization error for the frame being
encoded.

MPEG on the other hand spends a lot of effort (maybe even too much
effort?) on "noise shaping".  Via scalefactors, it adjusts the
quantization accuracy by looking at the quantization error and
comparing it to the psycho acoustic masking.  Then the quantized
coefficients are huffman coded in a separate, lossless step.
LAME's VBR modes take this approach to the extreme :-)

Vorbis is also VBR, but bitrate is not chosen to achieve
a quantization error < psycho acoustic masking,
so Vorbis VBR is more similar to LAME's ABR mode.

Mark




Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-23 Thread Frank Klemm

On Sat, Sep 23, 2000 at 10:24:58AM -0600, Mark Taylor wrote:
 
> There is one thing I would like to do, but the work in LAME
> seems to never end :-)   A variant on MP3 which
> uses everything from LAME, but with the following changes:


I would not call it MP3. A distinct name (MP4 or MPEG-%f Layer IV)
would be much better, so as not to confuse millions of people.

 
> 1. go to transform sizes 1024 and 128

MP3 uses 576 and 192. If 576 is too short for tonal music and 192 too long for
percussion, then this is right. But a 1:8 ratio can create other problems.
Note that MD uses 128, 256, 512 and 1024 sample blocks.
Useful block sizes range from 1 ms to 35 ms.
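
For orientation, the block sizes mentioned here can be converted to durations. The sketch assumes a 44.1 kHz sample rate, which the thread does not state:

```python
SAMPLE_RATE_HZ = 44100  # assumption: CD sample rate

def block_duration_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of a block of `samples` samples in milliseconds."""
    return 1000.0 * samples / sample_rate_hz

for size in (128, 192, 256, 512, 576, 1024):
    print(f"{size:5d} samples = {block_duration_ms(size):6.2f} ms")
```

At this rate all of the listed sizes, from 128 samples (about 2.9 ms) to 1024 samples (about 23.2 ms), fall inside the 1 ms to 35 ms range.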

> 2. replace polyphase filterbank + MDCT with one large MDCT

Okay.


> 3. allow mid/side stereo to be turned on/off for each critical band.

I would suggest another model. A frame can contain:

1) Spectral coding of Channel 1 (1000 bit)
2) Global Vectorizer  (1 or 9 bit)
3) 21 Critical Band Vectorizer (3 bit, 24 bit, 45 bit, ... 150 bit)
4) Spectral Coding of Channel 2 (1000 bit)

The needs are:
                         1    2    3    4
Mono:                    M    0
Unbalanced Mono:         M'   x
lowest quality stereo:   M'   x
Intensity stereo:        M'   x
                         M'   x    x
Classic Joint Stereo:    M    0         S
                         L   128        R
Classic Stereo:          L   128        R
Enhanced Joint Stereo:   M'   x         S'
                         M'   x    x    S'

 / L \                   / CH1 \
(     ) = CBV(f) * GV * (       )
 \ R /                   \ CH2 /

Mono: 
Classic Mono

Unbalanced Mono:  
Often Mono Signals are not well balanced, so a classic
matrix coding gives a lot of Sideband signal.
This mode avoids this.

Lowest quality stereo:
Interviews with 2 speakers. With 600 bps you can code
the direction of the active speaker. Not very good, but
better than mono.

Intensity stereo:
Direction coding, but now independent for every CB.
So bass drum, voice and percussion can have different
directions.

Classic Joint Stereo:
(L+R)/sqrt(2) and (L-R)/sqrt(2)  or  L and R  are coded.

Classic Stereo:
L and R are coded.

Enhanced Joint Stereo:
a) calculates r and a of every CB
b) if r is near +/-1.0 and all a are nearly equal, use Global Vectorizer
   to extract information
c) depending on the r's use 0...7 bit to code the remaining a
   as Critical Band Vectorizer
d) matrix the signal depending on the Global Vectorizer +
   Critical Band Vectorizer, you got CH1(f) and CH2(f).
e) Code CH1
f) Code CH2
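
A sketch of the matrixing idea behind the modes above. It is only one possible reading: the thread does not define the Global Vectorizer, so the single-angle `matrix()` below is a hypothetical stand-in, chosen so that an angle of pi/4 gives classic mid/side. The test signal is an "unbalanced mono" source with R = 0.5 * L:

```python
import math

def matrix(a, b, angle):
    """Orthonormal 2x2 matrixing of two channels.

    angle = pi/4 gives classic mid/side: (a+b)/sqrt(2), (a-b)/sqrt(2).
    The matrix is its own inverse, so the same call also decodes.
    """
    c, s = math.cos(angle), math.sin(angle)
    return ([c * x + s * y for x, y in zip(a, b)],
            [s * x - c * y for x, y in zip(a, b)])

def energy(channel):
    return sum(v * v for v in channel)

mono = [math.sin(0.1 * t) for t in range(200)]
left, right = mono, [0.5 * v for v in mono]   # unbalanced mono: R = 0.5 * L

mid, side = matrix(left, right, math.pi / 4)  # classic M/S
fitted = math.atan2(0.5, 1.0)                 # angle matched to the balance
ch1, ch2 = matrix(left, right, fitted)

back_left, back_right = matrix(ch1, ch2, fitted)
round_trip_error = max(abs(u - v)
                       for u, v in zip(left + right, back_left + back_right))
```

With classic M/S the side channel of this signal still carries energy, while with the fitted angle the second channel is empty up to rounding, which is the saving the "Unbalanced Mono" mode is after.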
   

4.
Better sfb21 handling.


5.
Spectral prefiltering to get nearly constant ATH in every CB.


6.
Using Bit 15..12 =  as an additional bitrate (384 kbps).
In this case the maximum granule size must also be increased.

 
> This would be a new standard, but it could only be an improvement over
> the current LAME/MP3.

A question: When improving MP3, why not use AAC? A new standard is nice
for programmers, but bad for users.


> MPEG on the other hand spends a lot of effort (maybe even too much
> effort?) on "noise shaping".

This I don't understand.

What does MPEG noise shaping do? I didn't find any useful documentation,
so I think of:

  * MP3 can code tonal sounds very well; there are some
    spectral peaks to code and a lot of masked signal

  * For noisy signals MP3 must code nearly every spectral component,
    resulting in nearly no savings

  * the human ear can't distinguish noises with nearly the same
    spectral power density, so it is sufficient to code
    a noise signal with a similar but different SPD.

-- 
Frank Klemm




Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-23 Thread Mark Taylor


  
> > 1. go to transform sizes 1024 and 128
>
> MP3 uses 576 and 192. When 576 is too low for tonal music and 192 too long for
> percussions, then this is right. But a 1:8 ratio can create other problems.
> Note that MD uses 128, 256, 512 and 1024 sample blocks.
> Useful are block sizes from 1 ms ... 35 ms.


I guess it is a trade off between simplicity and flexibility.  The
1024/128 windows come from AAC, but I have no idea if they are
optimal.  Adding more window sizes increases complexity, since every
different size window requires a different window function and a
different set of huffman tables and partitioning schemes.  Also, every
transition from two different size windows is lossy.  The MDCT is only
lossless for overlapping windows of the same size.  So it is good to
minimize transitions.  Another thing to keep in mind is that short
windows are not bad for tonal music - they just are not as efficient.
Instead of having many different window sizes, I would just make sure
to use more bits for the short windows to make sure they sound as good
as longer windows.


 
 
> 5.
> Spectral prefiltering to get nearly constant ATH in every CB.

If I understood your original posts on this topic, the point of this
is to keep large amplitude signals in the lower CBs from affecting lower
amplitude signals in the higher CBs (the so called filter leakage).
But I don't think this is a problem since the current filter banks have
pretty good frequency resolution.  The prefilter, unless you go to a
much larger (and more expensive) window, will have just as much leakage
as the current filterbanks.




 
> > MPEG on the other hand spends a lot of effort (maybe even too much
> > effort?) on "noise shaping".
>
> This I don't understand.
>
> What makes MPEG noise shaping? I don't found any useful documentation,
> so I think of:


Noise shaping is the act of allocating bits among the
critical bands.  You have to decide which bands are
important and deserve lots of bits/resolution, and
which bands can be quantized with very few bits.
These decisions are based on continuously computing
the quantization noise and comparing it to the psycho
acoustic masking in each CB.   (The effects you were
describing are what the psycho acoustics attempt to model.)
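
A toy sketch of that loop, to make the idea concrete. It is not LAME's algorithm: the masking thresholds are made-up numbers, the step size is simply halved for every band whose noise is too loud, and the bit cost of finer steps is ignored entirely:

```python
def quantize(values, step):
    """Uniform quantizer: nearest integer multiple of `step`."""
    return [round(v / step) for v in values]

def noise_energy(values, step):
    """Energy of the quantization error of one band."""
    return sum((v - q * step) ** 2
               for v, q in zip(values, quantize(values, step)))

def shape_noise(bands, masks, start_step=1.0, max_iter=32):
    """Halve the step of every band whose noise exceeds its mask.

    bands: one list of spectral coefficients per critical band
    masks: allowed noise energy per band (from some psychoacoustic model)
    Returns the step size chosen for each band.
    """
    steps = [start_step] * len(bands)
    for _ in range(max_iter):
        too_noisy = [i for i, (band, mask) in enumerate(zip(bands, masks))
                     if noise_energy(band, steps[i]) > mask]
        if not too_noisy:
            break
        for i in too_noisy:
            steps[i] /= 2.0
    return steps

bands = [[0.9, -0.4, 0.3], [0.05, 0.02, -0.04], [0.6, 0.55, -0.7]]
masks = [0.05, 0.01, 0.0005]  # made-up thresholds
steps = shape_noise(bands, masks)
```

The quiet second band keeps the coarse step because its noise is already below the mask, while the third band, with a very low threshold, ends up with the finest step and would therefore cost the most bits.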

I believe noise shaping is the main difference between different MP3
encoders.  I'm sure MPEG did not document any good noise shaping
algorithms on purpose :-)  There are a few simple things in the
literature, but I've never found any documentation of a noise shaping
algorithm used in an actual commercial encoder.


Mark




Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-22 Thread Robert Hegemann

Youri Pepplinkhuizen wrote on Wed, 20 Sep 2000:
> Hi,
>
> I got some suggestions for LAME. They're not too complicated (probably are
> to implement, though), so here goes:
>
> - LAME VBR doesn't encode the LSB (Least Significant Bit) correctly (as
> described on http://privatewww.essex.ac.uk/~djmrob/mp3decoders/lsb.html) -
> what exactly is causing this and how could it be fixed?


didn't he test the DECODING of LAME ???


> - A good suggestion, IMHO: freeformat VBR encoding up to 640 kbs - this
> feature could be implemented through the -B switch (with additional values
> of 384/448/512/640), which would default to 320, but could be set higher to
> give a freeformat VBR. This would have the advantage that some parts which
> are too difficult to compress even with 320 kbs could use even higher
> bitrates which would improve the quality of those frames. I'm sure not many
> people would mind a freeformat VBR MP3 if it would mean a nice improvement
> in quality without having the file bloated like with freeformat CBR.


Sorry, this is impossible, because it is forbidden to mix VBR with freeformat!
If one uses freeformat he has to use CBR.


> - An engine which would analyze the source data and find out which mode
> for -X would be best to use for compression (I have to admit though, that I
> am not too familiar with the -Xx settings - if someone could please explain
> these modes (or point me to a document which explains them), I'd be very
> grateful)


you can dig them up in the mail archive, most were described earlier this year.


> That's all for now. LAME is the best MP3 encoder out there - better than FhG
> even - and I am certain it will improve much more, considering the great
> people that are working on it right now. Keep up the good work!
>
> Thanks!
>
> -Youri


Ciao Robert





Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-22 Thread mythos

Mark Taylor wrote:
> > - An engine which would analyze the source data and find out which mode
> > for -X would be best to use for compression (I have to admit though, that I
> > am not too familiar with the -Xx settings - if someone could please explain
> > these modes (or point me to a document which explains them), I'd be very
> > grateful)
>
> Unfortunately, I think the only way to tell which is better is with
> listening tests.  The different -X specify different ways to measure
> "quality" of a quantization, and thus determines how LAME picks the
> best quantization among the many possibilities.  The analysis program
> you describe would also have to have some way to measure "quality".
>
> It is tempting to just use the encoded - original RMS difference
> as the quality measure.  This probably works at very high bitrates,
> but at 128kbs, it ignores the psycho acoustic masking which
> MP3 is based on, and is not the right way to measure quality.
>
> I claim that this problem, and better psycho acoustics,
> are the two big unanswered questions in audio compression!

Could such a testing program be used to determine the best archival
quality/size ratio?



Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-22 Thread mythos

how about altering some of the mp3 specs themselves and creating a lame
specific mp3 variant?
are there any legal reasons not to do this? would the quality gain be
worth the effort?



Re: [MP3 ENCODER] Some suggestions for LAME - please review

2000-09-22 Thread mythos

Sorry about the repeat messages, but when I wrote them I was very tired
and my brain was at half-power. One particular alteration I would like
to see is the ability to use bitrates larger than 320 kbps with vbr
and abr (abr 320).
I think I got everything this time, thanx for your time



[MP3 ENCODER] Some suggestions for LAME - please review

2000-09-21 Thread Youri Pepplinkhuizen

Hi,

I got some suggestions for LAME. They're not too complicated (probably are
to implement, though), so here goes:

- LAME VBR doesn't encode the LSB (Least Significant Bit) correctly (as
described on http://privatewww.essex.ac.uk/~djmrob/mp3decoders/lsb.html) -
what exactly is causing this and how could it be fixed?
- A good suggestion, IMHO: freeformat VBR encoding up to 640 kbs - this
feature could be implemented through the -B switch (with additional values
of 384/448/512/640), which would default to 320, but could be set higher to
give a freeformat VBR. This would have the advantage that some parts which
are too difficult to compress even with 320 kbs could use even higher
bitrates which would improve the quality of those frames. I'm sure not many
people would mind a freeformat VBR MP3 if it would mean a nice improvement
in quality without having the file bloated like with freeformat CBR.
- An engine which would analyze the source data and find out which mode
for -X would be best to use for compression (I have to admit though, that I
am not too familiar with the -Xx settings - if someone could please explain
these modes (or point me to a document which explains them), I'd be very
grateful)

That's all for now. LAME is the best MP3 encoder out there - better than FhG
even - and I am certain it will improve much more, considering the great
people that are working on it right now. Keep up the good work!

Thanks!

-Youri
