Re[2]: [MP3 ENCODER] the -mx mode - different philosophy

2000-08-22 Thread Roel VdB

Hello Gabriel,

Tuesday, August 22, 2000, 12:43:07 PM, you wrote:

GB First, please note that it has been a long time since I really looked
GB inside the Lame code, so I'll perhaps tell a few wrong statements. (btw,
GB please could anyone explain to me when to use the word "tell" and when "say"?)

I hope you don't expect English advice from me :)).

 If I understand correctly, the "-mj" is evaluating if a frame
 qualifies for M/S coding beforehand, and if so, it will then be coded
 in M/S, independent of the outcome.

GB There is also another parameter: trying to minimise the toggling between s
GB and ms

If this really is necessary, this condition could be left in (even the
current M/S criterion), but because "-mx" will get its results from
experiment, I'd like it to cast off as many predictions as possible.
Just let it "compute" and take out the best one.

If, of course, this kind of excessive toggling is a decoder problem,
it'd need to be a criterion to be met in the encoder. If not, just let
the encoder encode, and pick out the frames with the lowest noise...
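
A minimal sketch of that "encode both, keep the best" selection, assuming
each trial encoding of a frame can report how much noise it introduced and
how many bytes it took. The names `Candidate` and `pick_best` and all the
numbers are hypothetical illustrations, not anything taken from the Lame
source:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    mode: str     # "S" or "MS"
    noise: float  # hypothetical measure of introduced noise, lower is better
    size: int     # frame size in bytes


def pick_best(candidates):
    """Return the candidate with the least noise; ties go to the smaller frame."""
    return min(candidates, key=lambda c: (c.noise, c.size))


# Two frames, each encoded both ways (made-up figures).
frames = [
    [Candidate("S", 1.0, 626), Candidate("MS", 0.5, 626)],
    [Candidate("S", 0.8, 626), Candidate("MS", 2.5, 626)],
]
chosen = [pick_best(pair).mode for pair in frames]
print(chosen)
```

For vbr the same selection could compare size first and noise second; which
ordering is right is an open question here, not something the sketch settles.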

 I've heard my fair share of examples where lame opts for M/S, but
 afterwards this turns out to be a bad choice, giving an M/S frame that
 sounds much worse than S would have, or, in vbr, more bytes being used
 on the M/S frame than on the S frame.

GB Does this really happen in vbr? Could you please try using Mp3x and see if
GB the same frame could sometimes use more space in ms than in s?

I have more than an educated guess when it comes to this.

btw: could someone update the stats display at the end of encoding?
I'd like a counter of how many M/S and S frames there are at each bitrate.
Much easier and faster than using Mp3x.
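
A minimal sketch of such a counter, assuming the encoder can hand over one
(bitrate, mode) pair per frame. `frame_stats` and the sample data are
hypothetical; this is not Lame's actual stats code or output format:

```python
from collections import Counter


def frame_stats(frames):
    """Count frames per (bitrate, mode) and format one summary line for each.

    frames: iterable of (bitrate_kbps, mode) pairs, mode being "MS" or "S".
    """
    counts = Counter(frames)
    total = sum(counts.values())
    lines = []
    # Highest bitrate first, then mode name, so MS and S at one bitrate sit together.
    for (bitrate, mode), n in sorted(counts.items(),
                                     key=lambda kv: (-kv[0][0], kv[0][1])):
        lines.append(f"{mode:<3} {bitrate:>4} {n:>6} ({100.0 * n / total:.1f}%)")
    return counts, lines


# Made-up frame list: 3 MS and 2 S frames at 320 kbps, 5 S frames at 256 kbps.
sample = [(320, "MS")] * 3 + [(320, "S")] * 2 + [(256, "S")] * 5
counts, lines = frame_stats(sample)
for line in lines:
    print(line)
```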

GB It seems so strange... If it's true, I think that there is a mistake
GB somewhere in the ms bit allocation

Why is it so strange?  Is it feasible that a reasonably simple formula to
determine if a frame is fit for M/S is able to _exactly_ predict how
it comes out after encoding?  It can never do so 100% accurately.

to make my point, let me quote Mark Taylor himself: (about JS)
 This works much better than the algorithm suggested in the ISO MP3
 spec. But you still run into trouble:  what if 90% of the bands can
 handle mid/side encoding, and 10% cant?  LAME has to make a decision
 in these cases, and it is possible it can make the wrong decision.

This was proven quite clearly to me by the Velvet example:

- Sounds _fine_ in 192S (-ms)
- Totally flunks in 192JS (-mj)

So, even if the JS sample only has 35% M/S frames, this is still
obviously too many, because the M/S frames are there where S ones would
clearly be a better choice.

With this in the back of my head, I looked at what vbr (-V1 -q1) did on the same
sample:
Joint Stereo  320   113 (24.8%)
Stereo        320    99 (21.7%)

I know there are more possible causes (bit-reservoir conditions etc.) for
this behaviour, but they would be very unlikely. (Because of the
bit reservoir, one time a JS frame is bigger, another time the S would be,
cancelling out each other's effects in the long run.)

So let's interpret this in the most simple way:

* we know M/S makes mistakes on this sample (192 cbr)
* #JS 320 frames > #S 320 frames

Educated guess: Lame opts for M/S for the same reason it did in the cbr
case, but after encoding it ends up with a very big amount of
introduced noise, hence a high frame size, and maybe even maxes out @320.
All this while it would have been better off the whole time with a 256S
frame or so ...

 problem: once the criterion is met, and a frame is tagged as
 "M/S"-material, it will always be an M/S frame, even if S would have
 been better.
GB Not always: I think that if we got something like s-s-s-ms-s-s it will be
GB converted before bitstream formatting to s-s-s-s-s-s in order to reduce the
GB toggling artefact.

In general.  Or practically: much too often ;)
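
The s-s-s-ms-s-s case Gabriel describes can be sketched as follows. This is
only an illustration of the idea as stated above, under the assumption that
an "isolated" frame means one whose two neighbours agree with each other but
not with it; `smooth_isolated` is a hypothetical name, and whether Lame does
exactly this is not confirmed here:

```python
def smooth_isolated(modes):
    """Flip any frame whose mode differs from both of its agreeing neighbours.

    Decisions are taken from the original sequence, so one pass cannot
    cascade; the first and last frames are left untouched.
    """
    out = list(modes)
    for i in range(1, len(modes) - 1):
        if modes[i - 1] == modes[i + 1] != modes[i]:
            out[i] = modes[i - 1]
    return out


smoothed = smooth_isolated(["s", "s", "s", "ms", "s", "s"])
print("-".join(smoothed))
```

A run of two or more ms frames is left alone by this rule, so it only
removes single-frame toggles.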

 Big advantage of this prediction method is the speed.

GB I don't know if it's still the case, but in the past both ms and s data were
GB computed as the mode of a frame could be changed according to the next one.

that would be nice

 Since you never have 100% accurate prediction, this is one of the most
 prominent causes of poor quality in -mj mode. (See my post
 referring to 192JS of the Velvet track.)

GB This Velvet track must have some (perhaps not yet known) other difficulties,
GB as the results are quite catastrophic for every encoder, including mp2 ones.

Lame 192S sounds fine, as does -V1 -q1, but I think the latter is
unnecessarily big...

 What I'm suggesting: a "-mx" mode (or whatever letter)

GB This is, to my mind, the goal of -mj, so any change should be made into -mj

I disagree. Initially I was also thinking this, but then when I
discussed all these alleged improvements, I found a healthy amount of
reservation about this idea because of the big implications.  The
suggestion was: the M/S prediction needs tweaking for this problem wav,
rather than changing the whole system.

And, in retrospect, I understand this.  The current -mj mode 

Re: Re[2]: [MP3 ENCODER] the -mx mode - different philosophy

2000-08-22 Thread Gabriel Bouvigne

I'm not equipped for listening tests here (only an AWE64), but is the Velvet
problem the thing I'm hearing in the right channel? (or am I thinking I'm
hearing something?)

If this is the case, it seems to me that it's reduced in -m f, but the
stereo image is also changed by this switch.
If it's gone in forced mode, it could be a toggling problem. Roel, could you
check this?

Regards,

--

Gabriel Bouvigne - France
[EMAIL PROTECTED]
icq: 12138873

MP3' Tech: www.mp3-tech.org


--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )