Re: [MP3 ENCODER] Correlation mid/side
| From: Mark Stephens [EMAIL PROTECTED]
|
| How can you normalize without first scanning the entire file for the loudest
| entry?
|
| mark stephens

You can use a compressor, but that is not normalizing: it will reduce the dynamics. In that case you do not need to scan the whole file.

Regards
Jaroslav Lukesh
--
note: (Bill) Gates to Hell!
--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
Re: [MP3 ENCODER] Correlation mid/side
> trimming and normalization can be done without first scanning the whole file.

You'll have to tell me how! For the beginning of the stream, OK. But at the end, maybe there are 2 s of pure digital silence and then something else...

And for normalization I don't see how at all, since you need the min/max. Maybe with a circular buffer it would work...
Re: [MP3 ENCODER] Correlation mid/side
:: How can you normalize without first scanning the entire file for the loudest
:: entry?
::
MP3 can be adjusted by multiples of 1.5 dB without any quality loss. So encode first while tracking the largest amplitude, and adjust the MP3 in a second pass. You only have to increase one byte per frame.

--
Best regards
Frank Klemm

eMail | [EMAIL PROTECTED]                 home: [EMAIL PROTECTED]
phone | +49 (3641) 64-2721                home: +49 (3641) 390545
sMail | R.-Breitscheid-Str. 43, 07747 Jena, Germany
Re: [MP3 ENCODER] Correlation mid/side
:: trimming and normalization can be done without first scanning the whole
:: file.
::
:: you'll have to tell me how !
:: for the beginning of the stream OK. But at the end, maybe there is a 2s pure
:: digital silence and then another thing...
::
2 seconds of digital silence (350 KByte) can be stored in one variable (4 bytes). If, contrary to all expectations, this was only a gap, a corresponding number of silent MP3 frames can be emitted.

The interface must be changed a little, because a big pile of data must be emitted after such a silent pause. Instead of:

    char mp3buffer [MAGICK_CONSTANT];

    mp3_encode ( global_context, left_pcm_data, right_pcm_data, number_of_samples,
                 mp3buffer, sizeof(mp3buffer) );        // argument orgies

the structure must be:

    typedef struct {
        char*   ptr;
        size_t  len;
    } buffer_t;     // avoid a lot of single variables loafing around,
                    // group them by functionality

    buffer_t  mp3buffer;

    mp3buffer.len = 0;                          // can be enlarged on demand
    mp3buffer.ptr = malloc ( mp3buffer.len );

    // pass the address so that the encoder can enlarge the buffer
    mp3_encode ( global_context, pcmbuffer, &mp3buffer );   // avoid argument orgies

:: And for normalization I don't see how at all, since you need the
:: min/max. Maybe with a circular buffer it would work...
::
MP3 is adjustable in multiples of 1.5 dB. This can be done after converting the whole PCM file. If the largest value was 19100, calculate (ld = logarithm to base 2):

    floor ( 4. * ld (32767/19100) ) = floor ( 4. * 0.77867 ) = floor ( 3.1147 ) = 3

Enlarge the scale factors by 3.

--
Best regards
Frank Klemm
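The "silence in one variable" idea can be sketched as follows. This is only an illustration under assumptions: 16-bit stereo input delivered in blocks, "silence" meaning exact digital zero in both channels, and hypothetical names (`trim_state_t`, `trim_result_t`, `trim_feed`) that do not exist in lame.

```c
#include <stddef.h>

typedef struct {
    unsigned long pending;  /* silent sample pairs seen but not yet emitted */
    int started;            /* non-zero once audible data has been seen */
} trim_state_t;

typedef struct {
    unsigned long flush;    /* silent pairs to emit before this block's data */
    size_t begin, end;      /* encode pairs [begin, end) of this block */
} trim_result_t;

/* Feed one block of n stereo pairs. Silence is only counted; it is handed
   back through 'flush' once later audible data proves it was just a gap. */
static trim_result_t trim_feed(trim_state_t *st, const short *left,
                               const short *right, size_t n)
{
    trim_result_t res = { 0, 0, 0 };
    size_t first = 0, last = n;

    while (first < n && left[first] == 0 && right[first] == 0)
        first++;
    if (first == n) {               /* whole block is silent: just count it */
        st->pending += n;
        return res;
    }
    while (left[last - 1] == 0 && right[last - 1] == 0)
        last--;                     /* terminates: pair 'first' is non-zero */

    /* Silence before the first audible sample of the stream is dropped;
       later runs were only gaps and must be emitted after all. */
    res.flush = st->started ? st->pending + first : 0;
    res.begin = first;
    res.end = last;
    st->pending = n - last;         /* trailing zeros are withheld again */
    st->started = 1;
    return res;
}
```

Whatever is still in `pending` when the input ends is the trailing silence, and it is simply never emitted. Silence inside a block, between `begin` and `end`, is encoded normally in this sketch.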
Re: [MP3 ENCODER] Correlation mid/side
Well, a pre-processor is what I'm programming. But you can't integrate it with lame, since it can't work in real time or in a pipe (for DC adjustment, trimming, normalisation).

----- Original Message -----
From: "Frank Klemm" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, September 08, 2000 1:20 AM
Subject: Re: [MP3 ENCODER] Correlation mid/side

:: | | Anyway I think that the very low frequencies are used in music like
:: | | drumbass with very good sound systems. The infra bass is something I
:: | | really like in clubs ;)
:: | | Since I want to encode files in good quality (maybe playable in a club)
:: | | I'd prefer to keep this and just remove the DC offset... I think it can be
:: |
:: | then remove all under 5Hz, these freqs you do not need, or it is created
:: | in another way (by rhythm) from transients.
::
:: And why not 1Hz ? I'm sure you can make good mechanical effects at this
:: frequency ;) Or maybe 0.1Hz ? Well I only need/want to remove the 0Hz. I
:: prefer to remain consistent with the original on low frequencies (higher
:: ones are another question).
::
I use a Legendre transformation instead of a Fourier transformation to remove this stuff. It removes such stuff much better than any frequency-domain filtering while best preserving the original signal.

Maybe such functionality should be programmed in a lame preprocessor (called lame++):

  * Legendre-based filtering
  * Fourier-based filtering
  * centering of the signal
  * detecting the best lame mode (-mm, -mj, -mf, -ms)
  *

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
| From: Steve Lhomme [EMAIL PROTECTED]
|
| Well, a pre-processor is what I'm programming. But you can't integrate it
| with lame since it can't work in real-time/pipe (for DC adjust, trimming,
| normalisation).

You should add some filter processing, for example notch filters for 50+100 Hz and 60+120 Hz, a 2-4 band fully parametric EQ (gain, Q, base frequency), and first- and second-order high-pass and low-pass filters with variable frequency.

For DC adjustment I STILL strongly recommend a high-pass filter; it is what professionals use. Calculating the DC offset over the whole track is a non-professional solution, although mathematically correct. After all, not everything that is mathematically right is right in real life.

Regards
Jaroslav Lukesh
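For illustration, a first-order DC-blocking high-pass of the kind recommended here might look like the sketch below. It is one common textbook form, not necessarily the filter meant in the message; the names `dc_block_t`, `dc_block_init` and `dc_block_step` are hypothetical, and the pole formula is an approximation that only holds for cutoff frequencies far below the sample rate.

```c
/* One-pole, one-zero DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1]. */
typedef struct {
    double r;        /* pole radius, slightly below 1 */
    double x1, y1;   /* previous input and previous output sample */
} dc_block_t;

static void dc_block_init(dc_block_t *f, double cutoff_hz, double sample_rate)
{
    const double pi = 3.14159265358979323846;

    /* Approximate pole position for cutoff_hz << sample_rate. */
    f->r = 1.0 - 2.0 * pi * cutoff_hz / sample_rate;
    f->x1 = 0.0;
    f->y1 = 0.0;
}

static double dc_block_step(dc_block_t *f, double x)
{
    double y = x - f->x1 + f->r * f->y1;

    f->x1 = x;
    f->y1 = y;
    return y;
}
```

Because the filter starts from a zeroed state, a constant offset at the start of a track is not removed instantly but decays over many samples (on the order of the sample rate divided by the cutoff frequency).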
Re: [MP3 ENCODER] Correlation mid/side
:: | From: Steve Lhomme [EMAIL PROTECTED]
:: |
:: | Well, a pre-processor is what I'm programming. But you can't integrate it
:: | with lame since it can't work in real-time/pipe (for DC adjust, trimming,
:: | normalisation).
::
:: You should add some filter processing, for example notch filters for
:: 50+100Hz and 60+120Hz, 2-4band fully parametric EQ (gain, Q, base freq),
:: high-pass and low-pass variable freq. first and second order filtering.
::
1st: before adding such functionality, lame should define an easy-to-use interface for such plug-ins, so that the programmer of a plug-in does not need to know anything about lame except a 2-page plug-in interface description. No changes to lame are then necessary.

2nd: notch filters are a bad solution for removing hum. Use subtractive PLL hum filters; they are more effective and gentler on the original signal.

:: For DC adjust I STILL strongly recommend high-pass filter, it is used by
:: professionals.
::
For professionals, high-pass filters are usable; for ready-made CD tracks they are completely unusable. You need a little bit of the past of the track.

:: DC offset calculating over track is non-professional
:: solution, but mathematically right.
::
What is "mathematically right"? Mathematics is always right. Mathematical transformations have certain properties; some are nice and others are disturbing for a given purpose. And a lot of pairs of properties are mutually exclusive.

:: Finally, all that is mathematically right, should not be right in life.
::
To find a mathematical solution, fill out this sheet:

    aim (note: not a known solution, really the aim):
    additional conditions:

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
:: | From: Steve Lhomme [EMAIL PROTECTED]
:: |
:: | Well, a pre-processor is what I'm programming. But you can't integrate it
:: | with lame since it can't work in real-time/pipe (for DC adjust, trimming,
:: | normalisation).
::
:: You should add some filter processing, for example notch filters for
:: 50+100Hz and 60+120Hz, 2-4band fully parametric EQ (gain, Q, base freq),
:: high-pass and low-pass variable freq. first and second order filtering.
::
:: For DC adjust I STILL strongly recommend high-pass filter, it is used by
:: professionals. DC offset calculating over track is non-professional
:: solution, but mathematically right. Finally, all that is mathematically
:: right, should not be right in life.
::
:: Well I know that. But in this case it is ;) And since I wanted to make something
:: that could normalize, then had the idea of trimming, and then had the one of
:: DC adjust. I think I'll go this way, since I have to scan the whole file for
:: the other processings...
::
Trimming and normalization can be done without first scanning the whole file.

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
| You should apply a 16 Hz lowpass filter for DC removal. Note that lowest
| organ note has 16.3Hz.
|
| Did you hear tones under 16Hz? Did you have speakerboxes that you will give
| these low frequencies? I want to made sub-woofer with 16-30Hz range for my
| home stereo, but no lower.
|
| Well. I think you're talking about a highpass filter ;)

Sure, sorry for the mistake.

| Anyway I think that the very low frequencies are used in music like
| drumbass with very good sound systems. The infra bass is something I really
| like in clubs ;)
| Since I want to encode files in good quality (maybe playable in a club) I'd
| prefer to keep this and just remove the DC offset... I think it can be

Then remove everything under 5 Hz. You do not need those frequencies, or they are created in another way (by rhythm) from transients.

Regards
Jaroslav Lukesh
Re: [MP3 ENCODER] Correlation mid/side
| fu        box volume
| 10 Hz     6250 l
|
| --
| Best regards
| Frank Klemm

I will use a double-bandpass, double-chamber box (6th order, "BOSE" type), a nice piece of furniture :-) of approx. 300 liters (270 l bass reflex at 20 Hz + 30 l bass reflex at 56 Hz), with a single speaker with Qts 1.17, resonance 45 Hz, Vas 50 l. This gives a much better response over 16-100 Hz (+/- 6 dB) than speakers with a Qts of about 0.3, which are not usable for this purpose.

Regards
Jaroslav Lukesh
Re: [MP3 ENCODER] Correlation mid/side
:: | | Anyway I think that the very low frequencies are used in music like
:: | | drumbass with very good sound systems. The infra bass is something I
:: | | really like in clubs ;)
:: | | Since I want to encode files in good quality (maybe playable in a club)
:: | | I'd prefer to keep this and just remove the DC offset... I think it can be
:: |
:: | then remove all under 5Hz, these freqs you do not need, or it is created
:: | in another way (by rhythm) from transients.
::
:: And why not 1Hz ? I'm sure you can make good mechanical effects at this
:: frequency ;) Or maybe 0.1Hz ? Well I only need/want to remove the 0Hz... I
:: prefer to remain consistent with the original on low frequencies (higher
:: ones are another question).
::
I use a Legendre transformation instead of a Fourier transformation to remove this stuff. It removes such stuff much better than any frequency-domain filtering while best preserving the original signal.

Maybe such functionality should be programmed in a lame preprocessor (called lame++):

  * Legendre-based filtering
  * Fourier-based filtering
  * centering of the signal
  * detecting the best lame mode (-mm, -mj, -mf, -ms)
  *

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
| From: Steve Lhomme [EMAIL PROTECTED]
|
| | Howdy Robert,
| |
| | Alex, if you remember Frank's post about DC offsets, there he attached
| | a little C program to calculate AC/DC offsets as well as a correlation
| | between left and right channels. (was around 00/08/05)
| |
| | I'm not sure I read those - DC offsets aren't particularly relevant to my
| | current efforts (real-time coding).
|
| You want to compute (remove ?) the DC in real-time ? Well the DC is just the
| mean of the whole signal, but in real-time you should have a mean on a
| portion of the signal... I think a good solution would be to have a fixed
| length circular buffer with all the samples coming in. You'd just need to
| make the mean of all these samples to get the DC offset. I think the size
| should be of 10ms or so. Does anyone know the ear latency for that ? I think
| for speaking it's a few milliseconds (the latency of the speech, not the
| ear).

You should apply a 16 Hz lowpass filter for DC removal. Note that the lowest organ note is at 16.3 Hz.

Have you ever heard tones under 16 Hz? Do you have speaker boxes that can reproduce such low frequencies? I want to build a sub-woofer with a 16-30 Hz range for my home stereo, but no lower.

Regards
Jaroslav Lukesh
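The circular-buffer mean proposed in the quoted text could be sketched like this. The 441-sample window (10 ms at an assumed 44.1 kHz sample rate) and the names `dc_mean_t` and `dc_mean_push` are illustrative assumptions only.

```c
#include <stddef.h>

#define DC_WINDOW 441   /* assumed: 10 ms at 44.1 kHz */

typedef struct {
    short  buf[DC_WINDOW];  /* the most recent samples */
    size_t pos;             /* next slot to overwrite */
    size_t count;           /* valid samples, at most DC_WINDOW */
    long   sum;             /* running sum of the samples in buf */
} dc_mean_t;                /* must be zero-initialised before use */

/* Push one sample and return the mean over the current window.
   The running sum avoids re-adding the whole buffer per sample. */
static double dc_mean_push(dc_mean_t *m, short x)
{
    if (m->count == DC_WINDOW)
        m->sum -= m->buf[m->pos];   /* drop the oldest sample */
    else
        m->count++;
    m->buf[m->pos] = x;
    m->sum += x;
    m->pos = (m->pos + 1) % DC_WINDOW;
    return (double)m->sum / (double)m->count;
}
```

Subtracting this mean from each sample acts as a crude high-pass filter. A 10 ms window spans a full period only for tones of 100 Hz and above, so lower bass content would be affected as well as the DC offset.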
Re: [MP3 ENCODER] Correlation mid/side
new option:

    --nice:     Changes priority depending on system load
                (uses clock, gettimeofday and sleep: nice is not portable
                 and useless due to its bad design)

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
> You should apply a 16 Hz lowpass filter for DC removal. Note that lowest
> organ note has 16.3Hz.
>
> Did you hear tones under 16Hz? Did you have speakerboxes that you will give
> these low frequencies? I want to made sub-woofer with 16-30Hz range for my
> home stereo, but no lower.

Well. I think you're talking about a highpass filter ;)

Anyway, I think that the very low frequencies are used in music like drum'n'bass on very good sound systems. The infra bass is something I really like in clubs ;)

Since I want to encode files in good quality (maybe playable in a club), I'd prefer to keep this and just remove the DC offset... I think it can be computed over a few milliseconds for most files (except more 'mathematical' signals), and I think the ear wouldn't notice this variation, and the speaker dynamics would be enhanced.
Re: [MP3 ENCODER] Correlation mid/side
::
:: You should apply a 16 Hz lowpass filter for DC removal. Note that lowest
:: organ note has 16.3Hz.
::
That note is produced using residuals (no 16.3 Hz tone, but 32.6 Hz and 48.9 Hz).

:: Did you hear tones under 16Hz?
::
It is difficult to speak of "hearing" in the range of 10...25 Hz (we have very-low-frequency systems to test the mechanical stability of microscope systems; vibrations are a big problem). You also "hear" 10 Hz: it modulates human speech, so that your speech sounds like a computer voice.

:: Did you have speakerboxes that you will give
:: these low frequencies?
::
Yes. The question is what you hear first, the distortion or the primary tone.

:: I want to made sub-woofer with 16-30Hz range for my
:: home stereo, but no lower.
::
Note:

    fu        box volume
    50 Hz       10 l
    40 Hz       24 l
    30 Hz       77 l
    20 Hz      390 l
    10 Hz     6250 l

--
Best regards
Frank Klemm
RE: [MP3 ENCODER] Correlation mid/side
Howdy Robert,

> Alex, if you remember Frank's post about DC offsets, there he attached
> a little C program to calculate AC/DC offsets as well as a correlation
> between left and right channels. (was around 00/08/05)

I'm not sure I read those - DC offsets aren't particularly relevant to my current efforts (real-time coding).

> I plugged into LAME that correlation test for a single frame. If you
> define RH_VALIDATE_MS at compile time this code gets active.
>
> Actually the decision whether to use L/R or M/S coding is based on
> masking relations. But sometimes LAME switches to M/S coding where
> L/R coding would be using fewer bits. The extra code you can enable
> with above defined will try to get a rough estimation on the correlation
> of left and right channels and will consider the perceptual entropy
> to check if it would be better not to switch to M/S coding.

Ah. So LAME uses (or can use) multiple criteria for switching. I'm in the process of adding mid/side to my encoder now, and have just put in a first pass based on what the ISO spec 'specifies' (OK - suggests? hints at?) in Appendix G. Their switching criterion seems particularly random and strange: it is based on a comparison of the sum and difference of the squared energies of the two channels:

    sum(rl[i]^2 - rr[i]^2) < 0.8 * sum(rl[i]^2 + rr[i]^2)

summing over 0 <= i < 512, where rl and rr are supposedly the energies (so that they are squaring the energy?!?) of the FFT line spectra. (I suspect that they meant to have an absolute value around the difference term.) I don't think this works very well, and I'm not sure why they thought it would work (i.e. what its theoretical basis was), so I'm thinking of substituting a correlator. Do you know why one would correlate on the differential instead of the signal?

Also, does anyone know the basis for the ISO switching criterion? Do they really mean squared energy (magnitude to the fourth power)? They give no hints as to how to reconcile the mid/side samples with the right/left psychoacoustics in the loop section of the encoder. Computing psychoacoustics for the sum and difference signals makes no sense to me, as one is never going to listen to them and thus the psychoacoustic threshold figures are irrelevant, but the alternative of trying to simultaneously allocate bits/noise for both channels seems overly complicated/possibly impossible (oxymoron strikes!). I'm compromising right now by just calculating the distortion thresholds using the L/R SMR and the M/S signal and bandwidths (and the loop distortions with the M/S signal), but this seems like a pretty silly way to do things, and it doesn't sound very good. I'm trying not to plagiarize LAME (I still use the ISO/ATT psych model, for one), but I gather that LAME does some sort of mid/side psychoacoustic processing?

Thanks for any feedback/assistance,
Alex
Re: [MP3 ENCODER] Correlation mid/side
::
:: Also, does anyone know the basis for the ISO switching criterion? Do they
:: really mean square energy (quadrupled magnitude)? They give no hints as to
:: how to reconcile the mid/side samples with the right/left psychoacoustics in
:: the loop section of the encoder. Computing psychoacoustics for the sum and
:: difference signals makes no sense to me, as one is never going to listen to
:: them and thus the psychoacoustic threshold figures are irrelevant, but the
:: alternative of trying to simultaneously allocate bit/noise for both channels
:: seems overly complicated/possibly impossible (oxymoron strikes!). I'm
:: compromising right now by just calculating the distortion thresholds using
:: the L/R SMR and the M/S signal and bandwidths (and the loop distortions with
:: the M/S signal), but this seems like a pretty silly way to do things, and it
:: doesn't sound very good. I'm trying not to plagiarize LAME (I still use the
:: ISO/ATT psych model, for one), but I gather that LAME does some sort of
:: mid/side psychoacoustic processing?
::
1st step:

  - Calculate the diffuse-field-corrected ear-drum SPL°):

        L' = a(w) L + b(w) R,
        R' = a(w) R + b(w) L

    w is a small omega; a(w), b(w) are complex

    Note: different for loudspeakers and headphones
          (where a \approx 1 and b \approx 0)
    Note: search for HRTF to get usable approaches for a and b.
    Note: for very good results you must take into consideration that
          a and b change depending on the temporal direct/indirect
          sound ratio

  - ignore in-brain crosstalk (ca. -60 dB @ 1 kHz, much lower than acoustic crosstalk)
  - calculate the threshold for the left and the right ear

2nd step:

  if the frame can be coded in LR mode
    - code L using L and threshold(L)
    - code R using R and threshold(R)

  if the frame can be coded in MS mode
    - code M using M and 0.6...0.8 * min(threshold(L),threshold(R))
    - code S using S, the decoded coded M, threshold(L) and threshold(R)

3rd step:

  select MS/LR depending on:
    - the br(MS)/br(LR) ratio
    - the past of the audio file
    - the bit pool fill level
    - (the future of the audio file, using a mechanism I mailed before)
    - user options?

--
Best regards
Frank Klemm

°) SPL = sound pressure level
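Parts of the 2nd and 3rd step can be sketched in a few lines. Everything here is an assumption made for illustration: the 1/sqrt(2) scaling of the M/S transform, the function names, and in particular `choose_ms`, which uses only the first of the listed criteria (the bit demand ratio) with a caller-supplied limit. The HRTF weighting of the 1st step and the coding of S against the decoded M are not covered.

```c
#include <stddef.h>

/* Mid/side transform with an assumed 1/sqrt(2) scaling, so that the
   total energy of the two channels is preserved. */
static void lr_to_ms(const double *l, const double *r,
                     double *m, double *s, size_t n)
{
    const double k = 0.70710678118654752440;    /* 1/sqrt(2) */

    for (size_t i = 0; i < n; i++) {
        m[i] = (l[i] + r[i]) * k;
        s[i] = (l[i] - r[i]) * k;
    }
}

/* Threshold for coding M as given in the 2nd step:
   factor * min(threshold(L), threshold(R)), factor in 0.6...0.8. */
static double mid_threshold(double thr_l, double thr_r, double factor)
{
    double lowest = thr_l < thr_r ? thr_l : thr_r;

    return factor * lowest;
}

/* Hypothetical 3rd-step rule using the bit demand ratio only:
   pick M/S when it needs less than 'limit' times the L/R bits. */
static int choose_ms(double bits_ms, double bits_lr, double limit)
{
    return bits_ms < limit * bits_lr;
}
```

A fuller decision would also weigh the history of previous frames, the bit reservoir and user options, as the list in the 3rd step says.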