The observations Takehiro made are really interesting,
but I think he came to the wrong conclusion. The problem
is not with the ATH masking threshold, but with the
distortion calculated in calc_noise. We would have to
detect the peaks he describes and calculate the
resulting distortion differently from the usual way.
I would suggest the following:
  if in a scalefactor band the maximum distortion
  of a single frequency line is larger than .25 of the
  summed distortion of that band
    then the distortion of that band is the maximum
    distortion within that band
    else the distortion is the average distortion of
    that band (just like it is now)
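
The rule above could be sketched in C roughly as follows. This is a
standalone illustration, not the patch itself: band_distortion and its
parameters are hypothetical names, and err[] is assumed to hold the
per-line quantization error of one scalefactor band. The comparison is
written the way the patch writes it, so a peak exactly equal to .25 of
the sum also selects the maximum.

```c
#include <stddef.h>

/* Hypothetical helper, not part of lame3.55.
 * err[0..bw-1]: assumed per-line quantization error of one band.
 * Returns the distortion figure for that band. */
double band_distortion(const double *err, size_t bw)
{
    double sum = 0.0;        /* summed squared error of the band   */
    double max_noise = 0.0;  /* largest squared error of one line  */
    size_t l;

    if (bw == 0)
        return 0.0;          /* empty band: nothing to measure */

    for (l = 0; l < bw; l++) {
        double e2 = err[l] * err[l];
        sum += e2;
        if (e2 > max_noise)
            max_noise = e2;
    }

    /* No single line dominates: keep the usual average.
     * Otherwise report the peak so it is not averaged away. */
    return max_noise < 0.25 * sum ? sum / (double)bw : max_noise;
}
```

For a band of 8 lines with an error of 1 on every line, the sum is 8
and the maximum is 1, so the average (1) is used. If instead one line
has an error of 3 and the rest are 0, the maximum (9) is the whole sum
and is used directly, where the average would have reported only 9/8.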
A patch against lame3.55 follows. As far as I have tested
it so far, this patch solves the problems with
"Glockenspiel" gspi35_1.wav.
Robert
Lame A Mpeg-audio Experience
diff -c lame3.55/quantize.c lame3.55.w4/quantize.c
*** lame3.55/quantize.c Thu Nov 11 18:24:05 1999
--- lame3.55.w4/quantize.c Wed Nov 17 22:43:36 1999
***************
*** 1374,1379 ****
--- 1374,1381 ----
step = pow( 2.0, (cod_info->quantizerStepSize) * 0.25 );
for ( sfb = 0; sfb < cod_info->sfb_lmax; sfb++ )
{
+ FLOAT8 max_sfb_noise = 0;
+
start = scalefac_band_long[ sfb ];
end = scalefac_band_long[ sfb+1 ];
bw = end - start;
***************
*** 1382,1390 ****
{
FLOAT8 temp;
temp = fabs(xr[l]) - pow43[ix[l]] * step;
! sum += temp * temp;
}
! xfsf[0][sfb] = sum / bw;
/* max -30db noise below threshold */
noise = 10*log10(Max(.001,xfsf[0][sfb] / l3_xmin->l[gr][ch][sfb]));
distort[0][sfb] = noise;
--- 1384,1394 ----
{
FLOAT8 temp;
temp = fabs(xr[l]) - pow43[ix[l]] * step;
! temp*= temp;
! sum += temp;
! max_sfb_noise = Max(max_sfb_noise,temp);
}
! xfsf[0][sfb] = max_sfb_noise < 0.25*sum ? sum/bw : max_sfb_noise;
/* max -30db noise below threshold */
noise = 10*log10(Max(.001,xfsf[0][sfb] / l3_xmin->l[gr][ch][sfb]));
distort[0][sfb] = noise;
***************
*** 1408,1413 ****
--- 1412,1419 ----
for ( sfb = cod_info->sfb_smax; sfb < 12; sfb++ )
{
+ FLOAT8 max_sfb_noise = 0;
+
start = scalefac_band_short[ sfb ];
end = scalefac_band_short[ sfb+1 ];
bw = end - start;
***************
*** 1416,1424 ****
{
FLOAT8 temp;
temp = fabs((*xr_s)[l][i]) - pow43[(*ix_s)[l][i]] * step;
! sum += temp * temp;
}
! xfsf[i+1][sfb] = sum / bw;
/* max -30db noise below threshold */
noise = 10*log10(Max(.001,xfsf[i+1][sfb] / l3_xmin->s[gr][ch][sfb][i] ));
distort[i+1][sfb] = noise;
--- 1422,1432 ----
{
FLOAT8 temp;
temp = fabs((*xr_s)[l][i]) - pow43[(*ix_s)[l][i]] * step;
! temp*= temp;
! sum += temp;
! max_sfb_noise = Max(max_sfb_noise,temp);
}
! xfsf[i+1][sfb] = max_sfb_noise < 0.25*sum ? sum/bw : max_sfb_noise;
/* max -30db noise below threshold */
noise = 10*log10(Max(.001,xfsf[i+1][sfb] / l3_xmin->s[gr][ch][sfb][i] ));
distort[i+1][sfb] = noise;
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )