Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> > the fictitious previous granule that doesn't really exist, except
> > that the output of the decoder for granule 2 is going to be combined with
> > the data in the buffer which would have held the decoded output of
> > granule 1. I will assume the decoder initializes this buffer with
> > zeros.
> >
> > Let's ignore quantization. The lapped MDCT followed
> > by the IMDCT is lossless. That means that the IMDCT output
> > from granule 2 when added to the IMDCT output from granule 1
> > is identical to the input.
>
> Ok. But, unfortunately, we _can't_ ignore quantization, because we have a
> _terribly_ bad granule-pair to encode: granule 2 contains sound which is
> not just fading in, granule 1 is zeroed, so our encoding will be decoded
> quite badly if there is _any_ quantization. And in the 1st block you output,
> you don't have any bit reservoir to save you, either, I'm afraid.

Good point. So this does not get us any closer to smoothly joining
mp3s. But it does allow us to set the encoder delay in LAME as low
as 96.

The problem you mention affects all encodings, no matter what the
delay/padding, since at some point the music will have to start, so we
always have one granule of data following a granule of all 0's. But I
guess with a large encoder delay, there is time to build up the
bit reservoir.

> > In our case, the decoder just sets the granule 1 IMDCT output to all
> > 0's because it never actually computes this and is just initializing a
> > buffer. But the output of granule 2 IMDCT is computed correctly.
> >
> > output = granule_1_output + granule_2_output
> >
> > granule_1_output: encoder uses all 0's, which is incorrect since
> >                   the MDCT (if it was performed) would have seen
> >                   some of the data in granule 2.
> >
> > granule_2_output: correct
> >
> > Therefore, the output will be correct *except* where it uses data from
> > granule 1, but this can affect at most the first 96 samples.
>
> I don't understand this. I would think it would affect 0 samples?

The first 96 samples of the output from granule 2 will be combined
with the last 96 samples of granule 1. But these last 96 samples are
all zeros, instead of what they should have been if granule 1 had
been computed properly.

Mark
-- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> > It's not the fade that is the problem, it's the mdct aliasing.
>
> I disagree. There may be some problems created by the *filterbank*,
> but the only attenuation caused by the MDCT will be the first 96
> samples. Here's an example:
> [...SNIP...]
> granule 2 is the first granule that is encoded. granule 1 is

I assume you mean 1+2 is the first encoded? 1 granule = 1152 samples.
Overlapping and stuff.

> the fictitious previous granule that doesn't really exist, except
> that the output of the decoder for granule 2 is going to be combined with
> the data in the buffer which would have held the decoded output of
> granule 1. I will assume the decoder initializes this buffer with
> zeros.
>
> Let's ignore quantization. The lapped MDCT followed
> by the IMDCT is lossless. That means that the IMDCT output
> from granule 2 when added to the IMDCT output from granule 1
> is identical to the input.

Ok. But, unfortunately, we _can't_ ignore quantization, because we have a
_terribly_ bad granule-pair to encode: granule 2 contains sound which is
not just fading in, granule 1 is zeroed, so our encoding will be decoded
quite badly if there is _any_ quantization. And in the 1st block you output,
you don't have any bit reservoir to save you, either, I'm afraid.

> In our case, the decoder just sets the granule 1 IMDCT output to all
> 0's because it never actually computes this and is just initializing a
> buffer. But the output of granule 2 IMDCT is computed correctly.
>
> output = granule_1_output + granule_2_output
>
> granule_1_output: encoder uses all 0's, which is incorrect since
>                   the MDCT (if it was performed) would have seen
>                   some of the data in granule 2.
>
> granule_2_output: correct
>
> Therefore, the output will be correct *except* where it uses data from
> granule 1, but this can affect at most the first 96 samples.

I don't understand this. I would think it would affect 0 samples?

> However, the polyphase filterbank is another story, and this
> is something I hadn't thought about when I claimed only
> the first 96 samples will be attenuated:
>
> The data (with first 96 samples corrupt) is then sent to the
> inverse polyphase filterbank. I don't know much about how this
> albatross works, but I think it has an effective window length of
> 512. So the bad data in the first 96 samples can corrupt
> samples up to 96+512.

You can eliminate the poly bank delay as well, I think.

> Any idea how bad this corruption is?

The polyphase bank uses quite steep filters. Basically, the poly bank
is a 512 point fft, then a window (almost rectangular), fft back, dct.

> btw, you can't check it in lame because it looks like the
> filterbank (although the delay is only 48) is not "primed"
> correctly, so the first 286 samples will be ignored.
> I will fix this and post more about it next...
>
> Mark
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> > Hello Segher,
> >
> > Friday, May 19, 2000, 10:48:54 AM, you wrote:
> >
> > >> After reading that extensive explanation that you posted a few weeks
> > >> ago, I'm under the impression that 96 0-samples won't change a thing
> > SB> Actually, 576 samples.
> >
> > 50% of granule of "SHORT_TYPE" or "STOP_TYPE" is 96 samples.
>
> 50% of short type is 192 samples
> 50% of stop type is 576 samples
>
> > >> because these are the cause of the 0->1 factor up to 50% of the first
> > >> frame. Apologies for assuming the fix would be easy.
> > SB> Sorry, it's impossible even.
> >
> > It's hard to believe there is no way to (*)remove that "fade-in" or
> > "fade-out" effect by:
> > - feeding the encoder something for the 0th frame, so that the 1st 50%
> >   of the 1st frame does not "fade"
> > - feeding the encoder a modified first frame that compensates for the
> >   50% fade
> > - changing the code so that frame 1 and last don't use 50% overlap.
>
> It's not the fade that is the problem, it's the mdct aliasing.

I disagree. There may be some problems created by the *filterbank*,
but the only attenuation caused by the MDCT will be the first 96
samples. Here's an example:

              granule 1              granule 2
             576 samples            576 samples
         | 192 | 192 | 192 |    | 192 | 192 | 192 |
data:                           <    real data    >
short block   <->  <->  <->
end block     <-->
output:                         | 192 | 192 | 192 |

granule 2 is the first granule that is encoded. granule 1 is
the fictitious previous granule that doesn't really exist, except
that the output of the decoder for granule 2 is going to be combined with
the data in the buffer which would have held the decoded output of
granule 1. I will assume the decoder initializes this buffer with
zeros.

Let's ignore quantization. The lapped MDCT followed
by the IMDCT is lossless. That means that the IMDCT output
from granule 2 when added to the IMDCT output from granule 1
is identical to the input.

In our case, the decoder just sets the granule 1 IMDCT output to all
0's because it never actually computes this and is just initializing a
buffer. But the output of granule 2 IMDCT is computed correctly.

output = granule_1_output + granule_2_output

granule_1_output: encoder uses all 0's, which is incorrect since
                  the MDCT (if it was performed) would have seen
                  some of the data in granule 2.

granule_2_output: correct

Therefore, the output will be correct *except* where it uses data from
granule 1, but this can affect at most the first 96 samples.

However, the polyphase filterbank is another story, and this
is something I hadn't thought about when I claimed only
the first 96 samples will be attenuated:

The data (with first 96 samples corrupt) is then sent to the
inverse polyphase filterbank. I don't know much about how this
albatross works, but I think it has an effective window length of
512. So the bad data in the first 96 samples can corrupt
samples up to 96+512.

Any idea how bad this corruption is?

btw, you can't check it in lame because it looks like the
filterbank (although the delay is only 48) is not "primed"
correctly, so the first 286 samples will be ignored.
I will fix this and post more about it next...

Mark
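A rough way to see the overlap argument above in numbers. This is only a
sketch: a plain MDCT/IMDCT round trip in Python with a single sine window
and a tiny block size (N=8 rather than 576). It is not the MP3 hybrid
filterbank, it has no block types and no quantization, and all function
names are made up for illustration. The overlap buffer starts out as zeros,
the same assumption made above for granule 1.

```python
import math

def sine_window(N):
    # Satisfies w[n]^2 + w[n+N]^2 = 1, which overlap-add needs.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, w):
    # 2N windowed input samples -> N coefficients.
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(w[n] * block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coefs, w):
    # N coefficients -> 2N windowed, time-aliased output samples.
    N = len(coefs)
    n0 = 0.5 + N / 2
    return [w[n] * (2.0 / N) *
            sum(coefs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def encode_decode(signal, N):
    w = sine_window(N)
    out = [0.0] * len(signal)  # overlap buffer initialized with zeros
    for start in range(0, len(signal) - 2 * N + 1, N):
        y = imdct(mdct(signal[start:start + 2 * N], w), w)
        for n in range(2 * N):
            out[start + n] += y[n]  # overlap-add with the previous block
    return out

N = 8
signal = [math.sin(0.3 * i) + 0.5 for i in range(4 * N)]
decoded = encode_decode(signal, N)
err = [abs(a - b) for a, b in zip(signal, decoded)]

head_error = max(err[:N])         # no previous block to overlap with
middle_error = max(err[N:3 * N])  # both overlapping blocks present
print("head error:  ", head_error)
print("middle error:", middle_error)
```

The samples that have both overlapping blocks come back to within rounding
error, while the first N samples, whose partner block was never computed,
do not. How many samples that is in a real mp3 depends on the window type,
as discussed above.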
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
| "really without gaps" is not always true, because of the inter-frame
| dependencies (bit-reservoir). The only way to do it truly gapless is to
| join these mp3s before decoding, but no present player can do it run-time. I
| can hear small pops in winamp even if using continuous output, because each
| mp3 is decoded separately.

You can use the --nores option in LAME to disable the bit reservoir :-)
You lose a little quality, but you can safely cut MP3s.

Regards
Jaroslav Lukesh
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> > When you want to burn audio cd from this, just join these mp3's
> > together (there's no tool for this yet, you have to cut the possible TAGs
> > from the files with some TAG program and then join with copy file1 + file2
>
> musiCutter 0.5 will have this join function (will be released in a few
> weeks).

Or any half competent unix user can do this with head and cat.

Scott Manley (aka Szyzyg)
Armagh Observatory, Armagh, Northern Ireland, BT61 9DG.
http://star.arm.ac.uk/~spm/welcome.html
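For anyone without head and cat at hand, the same join can be sketched in
Python. This is only an illustration and the function names are made up. It
assumes the only tag present is an ID3v1 tag, which is the last 128 bytes of
the file and starts with the ASCII bytes "TAG". It does not look for tags at
the start of a file or for a Xing/Info header frame, and it does nothing
about the bit reservoir problem mentioned in this thread.

```python
def strip_id3v1(data: bytes) -> bytes:
    """Return data without a trailing 128-byte ID3v1 tag, if one is present."""
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        return data[:-128]
    return data

def join_mp3(chunks):
    """Concatenate the raw bytes of several mp3 files, dropping ID3v1 tags."""
    return b"".join(strip_id3v1(chunk) for chunk in chunks)

def join_mp3_files(paths, out_path):
    """File-based wrapper around join_mp3 (paths are chosen by the caller)."""
    chunks = []
    for path in paths:
        with open(path, "rb") as f:
            chunks.append(f.read())
    with open(out_path, "wb") as f:
        f.write(join_mp3(chunks))
```

Usage would be something like join_mp3_files(["track1.mp3", "track2.mp3"],
"joined.mp3"), where the file names are just placeholders.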
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> Couldn't it be so that user can (with some command) determine if he wants to
> remove the samples or not (or specify the number of samples he wants to
> remove)? When you decode mp3's made by other encoder (or older version of
> Lame perhaps), the delay is different...
> Caster

It can't be solved in the encoder. What you _could_ do is:

- Encode the tracks with some overlap.
- In the decoder, start decoding the 2nd while still playing the 1st.
- Throw away the last frames of the 1st track && the first frames of
  the 2nd track.
- Try to find the closest match between the 1st end && the 2nd start
  (something like cdparanoia/cdda2wav does; only not perfect matching,
  but least squares should do nicely, I think).
- Lastly, glue those two streams together (fade 'em in/out so the click
  will be annihilated).

Solving it in the encoder is impossible; mpeg audio works at frame level,
what you need is sample level.

Ciao, HTH,

Segher
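The matching and gluing steps described above could look roughly like this
in Python. It is a sketch under assumptions, not code from any decoder: `a`
and `b` are taken to be the already decoded sample lists of the two tracks
with the unreliable frames thrown away, the tracks are assumed to share some
audio because they were encoded with overlap, and the names find_lag, splice
and window are hypothetical. The search is a brute-force least-squares
comparison, and the glue is a linear crossfade over the matched region.

```python
import math

def find_lag(a, b, window):
    """Position in `a` where the first `window` samples of `b` fit best
    (smallest sum of squared differences)."""
    ref = b[:window]
    best_lag, best_cost = 0, float("inf")
    for lag in range(len(a) - window + 1):
        cost = sum((a[lag + i] - ref[i]) ** 2 for i in range(window))
        if cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag

def splice(a, b, window):
    """Glue `b` onto `a` at the best match, crossfading over `window` samples."""
    lag = find_lag(a, b, window)
    out = list(a[:lag])
    for i in range(window):
        t = (i + 0.5) / window  # fade weight: near 0 at the start, near 1 at the end
        out.append((1 - t) * a[lag + i] + t * b[i])
    out.extend(b[window:])
    return out

# Toy check: two "tracks" cut from one signal with 20 samples of overlap.
source = [math.sin(0.37 * i) + 0.01 * i for i in range(100)]
track1, track2 = source[:60], source[40:]
joined = splice(track1, track2, 16)
print(find_lag(track1, track2, 16), len(joined))
```

With real decoded audio the two overlapping regions will not be bit
identical, so the minimum cost will not be zero; that is where the crossfade
earns its keep.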
RE: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
Caster wrote:
> the program musiCutter from http://macik.homepage.com to split it into mp3
> files (it can import the cuesheet so you don't have to manually find starts
> and ends of mp3s) Then you can play it with winamp and some continuous output
> plugin and it plays it really without gaps because musicutter splits between
> mp3 frames.

"really without gaps" is not always true, because of the inter-frame
dependencies (bit-reservoir). The only way to do it truly gapless is to
join these mp3s before decoding, but no present player can do it at run-time.
I can hear small pops in winamp even if using continuous output, because each
mp3 is decoded separately.

> When you want to burn audio cd from this, just join these mp3's
> together (there's no tool for this yet, you have to cut the possible TAGs
> from the files with some TAG program and then join with copy file1 + file2

musiCutter 0.5 will have this join function (will be released in a few
weeks).

> etc.) Then decode to wav in EAC with the same offset as used for encoding
> (so the start of the wav will be without delay and end won't be cut, there
> can be added very short silence but it doesn't matter on the last track) and
> burn the wav with cuesheet you have. That's how I do it and it's very good
> (i don't know better way now).

Yes, this way works reliably.

Slavo
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
----- Original Message -----
From: "Mark Taylor" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, May 18, 2000 8:26 PM
Subject: Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]

> As a first pass, I just modified "lame --decode" to remove
> exactly 1104 samples (572 sample delay from LAME encoder,
> 528 from LAME/mpglib decoder). But other decoders have
> different delays. (ISO based: 528. FhG: 1160 +/- a few
> samples depending on quality setting).

Couldn't it be so that user can (with some command) determine if he wants to
remove the samples or not (or specify the number of samples he wants to
remove)? When you decode mp3's made by other encoder (or older version of
Lame perhaps), the delay is different...

Caster
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
to vdbj: your idea about the info tag for cutting start/end - i just got the
same idea yesterday :)) my opinion is: include the info somehow into the mp3
(maybe extra TAG at the end of mp3 if it cannot be added somewhere in start)
and the included info would be - number of samples to cut from start, and
number of samples to cut from the end (so these numbers won't exceed some
byte-limit if the file is too big). The decoder would decode the file as
normal, but when passing it to output (soundcard, file) it would cut these
samples. lame would use it on --decode and it could be implemented in some
winamp input codec (for example mpg123...).

However, i have one solution for now:

- Set the compression in EAC and determine the encoding offset, then rip
  the whole CD in EAC with "copy image and create cuesheet". This will
  create one big mp3 and a cue sheet.
- Then use the program musiCutter from http://macik.homepage.com to split
  it into mp3 files (it can import the cuesheet so you don't have to
  manually find starts and ends of mp3s).
- Then you can play it with winamp and some continuous output plugin and
  it plays it really without gaps because musicutter splits between mp3
  frames.
- When you want to burn audio cd from this, just join these mp3's together
  (there's no tool for this yet, you have to cut the possible TAGs from
  the files with some TAG program and then join with copy file1 + file2
  etc.)
- Then decode to wav in EAC with the same offset as used for encoding (so
  the start of the wav will be without delay and end won't be cut, there
  can be added very short silence but it doesn't matter on the last track)
  and burn the wav with cuesheet you have.

That's how I do it and it's very good (i don't know better way now).

Caster

----- Original Message -----
From: "Gabriel Bouvigne" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, May 18, 2000 4:27 PM
Subject: Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]

> vdbj wrote:
>
> > Hello Gabriel,
> >
> > Thursday, May 18, 2000, 12:38:53 PM, you wrote:
> >
> > GB> Perhaps we could add an advanced option for adjusting the encoder delay?
> > GB> I know that if reduced to the max, it will lower the quality of the
> > GB> first frame, but in some cases it could be a better choice.
> >
> > Wouldn't that be trying to repudiate the inherent nature of mp3?
> > Isn't it quite impossible with those MDCT thingies to represent impulses? It
> > might work, but at a quality cost. Then also: what about the last
> > padded frame? To me it seems more practical to keep the mp3 stream as
> > it is, but just provide the decoder with exact info on where to begin,
> > and where to end.
>
> The point is that even with an added tag, we can't ensure the delay will
> be reduced in every mp3 decoder, but when reducing the encoder delay, the
> final delay after decoding could be reduced in every mp3 player.
> It's right that it will lower the quality of the first frame, but it's
> very short. I don't remember the exact size of the quality reduction, but
> Mark mentioned it once on this mailing list.
>
> Regards
>
> --
> Gabriel Bouvigne - France
> www.mp3-tech.org
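The decoder side of the "samples to cut" idea in Caster's message above is
small. A minimal sketch in Python, assuming the two counts have already been
read from wherever such a tag would live (no such tag format is defined in
this thread, so cut_start and cut_end are just hypothetical parameters):

```python
def trim_decoded(samples, cut_start, cut_end):
    """Drop cut_start samples from the front and cut_end from the back
    of a decoded sample list, just before it goes to the output."""
    if cut_start < 0 or cut_end < 0:
        raise ValueError("cut counts must be non-negative")
    end = len(samples) - cut_end
    # max() keeps the result empty (not wrapped around) if the cuts overlap.
    return samples[cut_start:max(end, cut_start)]

decoded = list(range(10))
print(trim_decoded(decoded, 2, 3))
```

Computing the end index explicitly avoids the usual slicing trap where a cut
of 0 samples from the end would be written as samples[cut_start:-0] and
return nothing.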
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
> What I suggest to compensate for this 'mp3 lapse':
>
> 1- (preferable): first encode the stream, then insert precise
> start- and end-point into Info header (like extension to Xing VBR
> one). Then a tool acquainted with this extended header would be able
> to do a very accurate "--decode" (hint :)), and a later concatenation
> would be within margin of perfection.
>
> I took the liberty to quickly (don't know C) browse through the lame
> source, and I saw a larger frame size was taken compared to Xing for
> info header, so LAME string could be included. Maybe some (extra) room
> could be utilized to store those extra few start- and stop bits?

This would work, in the sense that a fully mp3-lapse-aware decoder
could then be made so that

% lame input.wav - | lame --decode - output.wav

would have sizeof(input.wav)==sizeof(output.wav).

As a first pass, I just modified "lame --decode" to remove
exactly 1104 samples (572 sample delay from LAME encoder,
528 from LAME/mpglib decoder). But other decoders have
different delays. (ISO based: 528. FhG: 1160 +/- a few
samples depending on quality setting).

There are still a couple of problems: the first and last 96 samples
will be attenuated by the MDCT window (multiplied by a function which
goes from 0 up to 1) so the volume will go to 0 at the start and
end. (= clicks if you concatenate the .wav files together).

There are other problems for perfectly seamless concatenation, caused
by the fact that mp3 frames overlap by 50%. So to encode frame N, you
need 50% of the data from frame N+1 (and to encode frame N+1 you need
the last 50% of the data from frame N).

One thing that would make these problems easier to solve would be to
write a 0 delay encoder and decoder. When Takehiro rewrote the
filterbank/MDCT in LAME, he reduced the delay from 528 to 48, and I
think this could be reduced to 0? Then put the same technology into
mpglib. Problem is, this is a lot of technical coding, for a very
limited application. I've suggested it several times, and no one has
ever volunteered :-)

And, here's something I post every few weeks or so:

1. Why does LAME add silence to the beginning and end of each song?
2. Why can't MP3 files be seamlessly spliced together?
3. What is the size of a MPEG1/2 frame?

==

1. Why does LAME add silence to the beginning and end of each song?

This is because of several factors:

DECODER DELAY AT START OF FILE:

All *decoders* I have tested introduce a delay of 528 samples. That
is, after decoding an mp3 file, the output will have 528 samples of
0's prepended to the front. This is because the standard
MDCT/filterbank routines used by the ISO have a 528 sample delay. It
would be possible to write a MDCT/filterbank routine with a 0 sample
delay (see description of Takehiro's MDCT/filterbank routine used in
LAME encoding below) but I don't know that anyone has done this.
Furthermore, because of the overlapped nature of MDCT frames, the
first half of the first granule (1 granule=576 samples) doesn't have a
previous frame to overlap with, resulting in attenuation of the first
N samples. The value of N depends on the window type. For "STOP_TYPE"
and "SHORT_TYPE", N=96, while for "START_TYPE" and "NORMAL_TYPE",
N=288. The first frame produced by LAME 3.56 and up will always be of
STOP_TYPE or SHORT_TYPE.

ENCODER DELAY AT START OF FILE:

ISO based encoders (BladeEnc, 8hz-mp3, etc) use a MDCT/filterbank
routine similar to the one used in decoding, and thus also introduce
their own 528 sample delay. A .wav file encoded & decoded will have a
1056 sample delay (1056 samples will be prepended to the beginning).

The FhG encoder (at highest quality) introduces a 1160 sample delay,
for a total encoding/decoding delay of 1688 samples. I haven't tested
Xing.

Starting with LAME 3.55, we have a new MDCT/filterbank routine written
by Takehiro Tominaga with a 48 sample delay. With even more rewriting,
this could be reduced to 0. And there is no reason an inverse routine
could not be used in a decoder.

However, there are a few problems with using such a short delay:

1.) The psycho-acoustics for the first mp3 frame cannot be processed
    until the encoder gets the second frame of input data. Thus
    lame_encode() buffers the first frame and does not encode it until
    given a second frame of input data.

2.) The first 96 samples of the first frame are attenuated by the MDCT
    window. If the encoder delay is greater than 96, this window will
    have no effect since the first 96 samples are all padding. With a
    48 sample encoder delay, the first 48 samples will be improperly
    attenuated. (.001 seconds worth of data at 44.1kHz).

3.) In LAME, psycho-acoustics for the first 576 granule are not
    correct. This could be fixed, but at the expense of adding more
    buffering and code complexity. I
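The delay totals quoted in the FAQ text above are just sums, and the ".001
seconds" figure is 48 samples at 44.1 kHz. A small Python sketch of that
arithmetic, using only the numbers given in this message; the names are
made up for illustration, and the 528 sample decoder delay is the one
reported above for the decoders tested.

```python
SAMPLE_RATE = 44100        # Hz, the rate used in the example above
ISO_DECODER_DELAY = 528    # samples, as reported for the tested decoders

# Encoder delays in samples, as quoted above.
ENCODER_DELAY = {
    "ISO based": 528,
    "FhG (highest quality)": 1160,
}

def round_trip_delay(encoder_delay, decoder_delay=ISO_DECODER_DELAY):
    """Total samples of delay after encoding and then decoding."""
    return encoder_delay + decoder_delay

def samples_to_seconds(samples, rate=SAMPLE_RATE):
    return samples / rate

for name, delay in ENCODER_DELAY.items():
    total = round_trip_delay(delay)
    print(f"{name}: {total} samples, {samples_to_seconds(total) * 1000:.1f} ms")

print(f"48 samples: {samples_to_seconds(48):.4f} s")
```

This gives the 1056 and 1688 sample totals from the text, and shows that a
decoder trimming a fixed count can only be right for one encoder/decoder
pair.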
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
vdbj wrote:

> Hello Gabriel,
>
> Thursday, May 18, 2000, 12:38:53 PM, you wrote:
>
> GB> Perhaps we could add an advanced option for adjusting the encoder delay?
> GB> I know that if reduced to the max, it will lower the quality of the
> GB> first frame, but in some cases it could be a better choice.
>
> Wouldn't that be trying to repudiate the inherent nature of mp3?
> Isn't it quite impossible with those MDCT thingies to represent impulses? It
> might work, but at a quality cost. Then also: what about the last
> padded frame? To me it seems more practical to keep the mp3 stream as
> it is, but just provide the decoder with exact info on where to begin,
> and where to end.

The point is that even with an added tag, we can't ensure the delay will
be reduced in every mp3 decoder, but when reducing the encoder delay, the
final delay after decoding could be reduced in every mp3 player.
It's right that it will lower the quality of the first frame, but it's
very short. I don't remember the exact size of the quality reduction, but
Mark mentioned it once on this mailing list.

Regards

--
Gabriel Bouvigne - France
www.mp3-tech.org
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
v> I took the liberty to quickly (don't know C) browse through the lame
v> source, and I saw a larger frame size was taken compared to Xing for
v> info header, so LAME string could be included. Maybe some (extra) room
v> could be utilized to store those extra few start- and stop bits?

or maybe best to also include "# of frames to follow until last one"
there, because an ending frame would desire very much from the
decoding side.

--
Best regards,
vdbj    mailto:[EMAIL PROTECTED]
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
Hello vdbj,

Thursday, May 18, 2000, 12:05:47 PM, you wrote:

v> Hello,
v> I'm no ingenieer and also no engineer (let alone english)

v> I took the liberty to quickly (don't know C) browse through the lame
v> source, and I saw a larger frame size was taken compared to Xing for
v> info header, so LAME string could be included. Maybe some (extra) room
v> could be utilized to store those extra few start- and stop bits?

could also be handy to include some sort of "stop-frame" to designate
end of the stream, so that ID3 and all sort of tags come after this
one, and you don't end up analysing the tag :)

--
Best regards,
vdbj    mailto:[EMAIL PROTECTED]
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
When I wanted to encode a live mix to put on myplay.com I basically
recorded it and encoded it in one big mp3 file and then split it into
smaller files. There are frame errors at the start of each track, but
the way that myplay reassembles things on public streams meant that
these were concatenated into a seamless mix...

mp3 doesn't handle mix albums well though...

> Real-Life Problem: I don't succeed to back up live/mix albums with any
> encoder. I rip separate wavs and encode-decode them separately. After
> manually removing all Info headers and Tags on them damn mp3's, and
> "--decode"ing with same tool as encoding, result still lacks.
> practically: LAME -V1 -mj -b128 -h -t gives me ±8 ms at the end of
> track N and ±20 ms at the start of track N+1, resulting in a 30 ms
> "silence" when concatenated.
>
> Technical problem: From what I mean to understand about mp3, the
> last frame always has some padding due to a restricted set of discrete
> frame lengths.
> Why there is a silence at the start of the first frame I don't know,
> but I invented some pre-echo reserve or something to make it
> understandable for myself. (perception eh ;))

Scott Manley (aka Szyzyg)
Armagh Observatory, Armagh, Northern Ireland, BT61 9DG.
http://star.arm.ac.uk/~spm/welcome.html
Re: [MP3 ENCODER] bringing down one (more) mp3 restriction [LAME case projection]
vdbj wrote:

(removed long mail)

Perhaps we could add an advanced option for adjusting the encoder delay?
I know that if reduced to the max, it will lower the quality of the
first frame, but in some cases it could be a better choice.

Regards,

--
Gabriel Bouvigne - France
www.mp3-tech.org