Re: [MP3 ENCODER] Interesting high quality settings and possible bug
Gargos Chode wrote about the -k switch:

:: Hmm... that is an interesting idea. I hadn't thought of this at all.
:: Does this actually take bits away from encoding the audio frame? If
:: so, what is the real use of this switch? I had thought it would help
:: to prevent the MP3 from being corrupted when it is transferred over
:: and over again. Not that this happens often, but I thought: why not?
:: If the switch really isn't that useful and it takes bits away from
:: the audio, I will stop using it. So far I haven't noticed any
:: degradation in sound in normal listening tests, although I haven't
:: looked into the matter further. I will do some testing and see
:: whether encoding without this switch has any impact.

It takes around 612 bits/s. That is less than 0.5% at 128 kbps, and even
less at higher bit rates. It only becomes relevant for bit rates in the
range 8...24 kbps.

--
Frank Klemm

--
MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
[MP3 ENCODER] VBR brutal test file, bug in ATH?
I've started testing LAME with synthetic input to check some of its
properties. One test file is a dual-tone sweep: a slowly rising sweep
from 16 Hz to 14 kHz, and a second one rising from that frequency up to
18 kHz.

  CBR (-b160):          adds some clicks at the end of every fast sweep
  VBR (-V4):            sounds much worse. All kinds of errors at a very
                        high level. There may also be coding errors (it
                        sounds like a lot of bit errors AND masking errors).
  VBR (-V0):            sounds equal to CBR
  VBR (-V4 --noath):    sounds equal to CBR
  VBR (-V4 -b112):      sounds equal to CBR
  VBR (--athlower 80):  sounds equal to CBR

So I suspected an ATH problem. But reducing the input level by 24 dB
neither aggravates nor reduces the problem. Tests at -48 dB are not
possible because of the limited input precision: quantization noise
becomes significant and makes listening tests impossible.

--
Frank Klemm
Re: [MP3 ENCODER] LAME file name changes from 3.86 to 3.87?
:: That name was changed because one make system (MSDOS?) interpreted
:: the '-' in quantize-pvt.c as a compiler option.

MSDOS can't store a name like "quantize-pvt.c"; you get at most
"quantize.c" or "quanti~1.c".

--
Best regards
Frank Klemm

eMail | [EMAIL PROTECTED]       home: [EMAIL PROTECTED]
phone | +49 (3641) 64-2721      home: +49 (3641) 390545
sMail | R.-Breitscheid-Str. 43, 07747 Jena, Germany
Re: [MP3 ENCODER] mpglib related routines (Re: modularization)
:: 3. Creating a dozen or so odd types, all synonymous with 'int'
::    is a great idea. But I saw a few 'int's left. I think
::    we need to replace these with more new types.

When you were a physicist, you were a cgs advocate and an SI enemy, right?

Background for those not involved: cgs (centimeter, gram, second) sets
several elementary constants to 1 (c=1, µ=1, k=1, kb=1) and uses only
(floating point) numbers for all physical quantities. You can write
everything shorter and faster, and often you have no idea what you are
talking about.

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Winamp/100hz bug: SOLVED!
:: ISO spec says the maximum should be 8191. But as part of huffman
:: decoding, you sometimes add 15 to the result, yielding values as large
:: as 8206. Right now, LAME (and the ISO dist10 code) will make use of
:: the full range: values up to 8206.
::
:: The question is, should LAME be modified to limit this range to 8191.

How likely is it that music triggers this error? If it is very unlikely
in music, I would fix it in LAME. Rationale:

  * Unlikely in music means that music quality is very unlikely to
    improve by using values from 1FFF to 200E.
  * Unlikely in music also means that it is not (urgently) necessary to
    fix this bug in WinAmp in order to play older LAME-encoded files
    correctly.

Would it decrease performance to make this configurable on the command
line?

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] mpglib related routines (Re: modularization)
:: 2. I dont think declaring the sample rate as 'long double' is
::    enough. We need to add a portable, object oriented arbitrary
::    precision floating point library to LAME. The sample rate should
::    be a default of 128 digits, but the user can increase this with
::    suitable options.

Mathematician:
    Arbitrary precision is not enough.
Physicist:
    Planck time is around 10^-42 s, the age of the universe is around
    10^+18 s, so 60 decimal digits are good enough.
Non-C programmer:
    Current CPUs support long double (64 or 80 bit) in hardware, so why
    aren't we using it?
Assembler programmer:
    How large are the floating point registers? Is there any trick to
    store two numbers to increase speed? No?
Video technician:
    Films are at most around 4 hours long, humans can detect time
    differences down to 20 ms, so we need something around 6 decimal
    digits.
Marketroid:
    Reduce precision as long as no one complains.
Advanced marketroid:
    Reduce precision as long as no one sues you.
Professional marketroid:
    Reduce precision as long as income increases.
C programmer:
    Why are we not using an 'int'? If we need more precision, we can
    store ms or mHz in ints. Also µs, ms, cs and s/Hz are possible (see
    POSIX API).
C++ programmer:
    We need a class ...
Ada programmer:
    Don't we have a generic type to store time intervals?
COBOL programmer:
    Sample frequencies are between 0 and 9 Hz, right?
FORTRAN programmer:
    It isn't an index, so I use a REAL.
Rivest:
    Can we assume that all sample frequencies can be exactly described
    by a/b, with a and b being two finite integers with gcd (a,b) := 1 ...
Haskell programmer:
    Do we need numbers? You can describe everything by sets.
    or: There is really no computation until you want to remove the WAV
    files from the hard disk.
Perl:
    What are types?

:: 4. LAME should be converted to C++ or Java. C just is not up to the
::    task of distinguishing between 30 different types of integers.

Current C (C89 + POSIX) has something around 18 integer types.
C99 + POSIX has something around 90 integer types. Maybe the ISO
committee is stupid, maybe they have recognized some problems of the C
language.

C89 + POSIX
~~~~~~~~~~~
    char signed char unsigned char short unsigned short long
    unsigned long int unsigned int size_t ssize_t off_t ptrdiff_t
    wchar_t ptr_t clock_t time_t

C99 + POSIX
~~~~~~~~~~~
    char signed char unsigned char short unsigned short long
    unsigned long int unsigned int size_t ssize_t off_t off64_t
    fpos_t fpos64_t ptrdiff_t wchar_t wint_t ptr_t clock_t time_t
    int8_t int16_t int32_t int64_t int128_t
    uint8_t uint16_t uint32_t uint64_t uint128_t
    int_least8_t int_least16_t int_least32_t int_least64_t int_least128_t
    uint_least8_t uint_least16_t uint_least32_t uint_least64_t uint_least128_t
    int_fast8_t int_fast16_t int_fast32_t int_fast64_t int_fast128_t
    uint_fast8_t uint_fast16_t uint_fast32_t uint_fast64_t uint_fast128_t
    bool ...

Also note: float_t and double_t

And there are also a lot of Unix types out there (currently with fixed
sizes):

    pid_t, dev_t, ino_t, fsid_t, gid_t, uid_t, mode_t, umode_t, nlink_t, ...

--
Frank Klemm
[MP3 ENCODER] wrong prototype
The fft routines are wrongly prototyped:

    void fft_long  ( lame_internal_flags* gfc, FLOAT x_real     [BLKSIZE  ], int, sample_t ** );
    void fft_short ( lame_internal_flags* gfc, FLOAT x_real [3] [BLKSIZE_s], int, sample_t ** );
    void init_fft  ( lame_internal_flags* gfc );

Right is:

    void fft_long  ( lame_internal_flags* gfc, FLOAT x_real     [BLKSIZE  ], int, sample_t [] [2] );
    void fft_short ( lame_internal_flags* gfc, FLOAT x_real [3] [BLKSIZE_s], int, sample_t [] [2] );
    void init_fft  ( lame_internal_flags* gfc );

or

    void fft_long  ( lame_internal_flags* gfc, FLOAT x_real     [BLKSIZE  ], int, sample_t* [2] );
    void fft_short ( lame_internal_flags* gfc, FLOAT x_real [3] [BLKSIZE_s], int, sample_t* [2] );
    void init_fft  ( lame_internal_flags* gfc );

which are different from the above. If gfc is not modified, then

    void fft_long  ( const lame_internal_flags* gfc, FLOAT x_real     [BLKSIZE  ], int, sample_t ** );
    void fft_short ( const lame_internal_flags* gfc, FLOAT x_real [3] [BLKSIZE_s], int, sample_t ** );
    void init_fft  (       lame_internal_flags* gfc );

is also right and fine.

One last remark for today: there are still strange tabs within the
source code.
________________________________________________________________________

Examples of how to store 2D arrays in C:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define TYP   char
    #define XRES  128
    #define YRES  128

    /*** 2D array ***/

    /* A called function */

    void foo1 ( TYP (*Array) [XRES], size_t len )
    {
        size_t   i;
        size_t   j;
        int      k;
        clock_t  c = clock ();

        for ( k = 0; k < 1000; k++ )
            for ( i = 0; i < XRES; i++ )
                for ( j = 1; j < len-1; j++ )
                    Array [j] [i] = Array [j-1] [i] + Array [j+1] [i];

        c = clock() - c;
        printf ("%.3f µs\n", 1.e6/CLOCKS_PER_SEC/XRES/YRES*c);
    }

    /* Ways to create and destroy such an array */

    void bar1 ( void )
    {
        TYP   A [YRES] [XRES];
        TYP (*B) [XRES] = malloc ( YRES*XRES*sizeof(TYP) );

        foo1 (A, YRES);
        foo1 (B, YRES);

        free (B);
    }

    /*** Pointer array ***/

    /* A called function */

    void foo2 ( TYP** Array, size_t len )
    {
        size_t   i;
        size_t   j;
        int      k;
        clock_t  c = clock ();

        for ( k = 0; k < 1000; k++ )
            for ( i = 0; i < XRES; i++ )
                for ( j = 1; j < len-1; j++ )
                    Array [j] [i] = Array [j-1] [i] + Array [j+1] [i];

        c = clock() - c;
        printf ("%.3f µs\n", 1.e6/CLOCKS_PER_SEC/XRES/YRES*c);
    }

    /* Ways to create and destroy such an array */

    void bar2 ( void )
    {
        size_t  i;
        size_t  j;
        TYP     tmp [YRES] [XRES];
        TYP*    p;
        TYP*    A [YRES];
        TYP**   B;
        TYP*    C [YRES];
        TYP*    D [YRES];
        TYP**   E;
        TYP**   F;
        TYP**   G;

        /* A: pointers to the rows of a real 2D array */
        for ( i = 0; i < YRES; i++ )
            A [i] = tmp [i];

        /* B: the same pointer table, seen as TYP** */
        B = A;

        /* C: static pointer table, one malloc'ed data block */
        p = malloc ( YRES*XRES*sizeof(TYP) );
        for ( i = 0; i < YRES; i++, p += XRES )
            C [i] = p;

        /* D: static pointer table, one malloc per row */
        for ( i = 0; i < YRES; i++ )
            D [i] = malloc ( XRES*sizeof(TYP) );

        /* E: malloc'ed pointer table, one malloc per row */
        E = malloc ( YRES*sizeof(TYP*) );
        for ( i = 0; i < YRES; i++ )
            E [i] = malloc ( XRES*sizeof(TYP) );

        /* F: malloc'ed pointer table, one malloc'ed data block */
        F = malloc ( YRES*sizeof(TYP*) );
        p = malloc ( YRES*XRES*sizeof(TYP) );
        for ( i = 0; i < YRES; i++, p += XRES )
            F [i] = p;

        /* G: pointer table and data in one single malloc'ed block */
        j = YRES * sizeof(TYP*) / sizeof(TYP) + 1;
        p = malloc ( j*sizeof(TYP) + YRES*XRES*sizeof(TYP) );
        G = (TYP**) p;
        p += j;
        for ( i = 0; i < YRES; i++, p += XRES )
            G [i] = p;

        foo2 (A, YRES);
        foo2 (B, YRES);
        foo2 (C, YRES);
        foo2 (D, YRES);
        foo2 (E, YRES);
        foo2 (F, YRES);
        foo2 (G, YRES);

        free (*C);
        for ( i = 0; i < YRES; i++ )
            free ( D[i] );
        for ( i = 0; i < YRES; i++ )
            free ( E[i] );
        free (E);
        free (*F);
        free (F);
        free (G);
    }

    int main (void)
    {
        bar1 ();
        bar2 ();
        return 0;
    }

    /*
     *  For advanced readers:
     *    - the same with 3 dimensions
     *    - the same with N dimensions
     *    - the same with all 'const' variants
     *    - the same with all 'const' variants with 3 dimensions
     *    - the same with all 'const' variants with N dimensions
     */
Re: [MP3 ENCODER] lame 3.87 encode-decode roundtrip
Some remarks about the ATH formula. Its results differ a lot from my
experiments. The difference is 10 dB and more at >= 12 kHz. For very
young people the difference may be larger.

Examples:

    Frequency   Formula   Experiment   Difference
      [kHz]       [dB]       [dB]         [dB]
        1         3.36       2.72          0.6
        2        -0.25      -1.40          1.1
        3        -4.56      -4.64          0.1
        4        -3.38      -5.50          2.1
        5         0.48      -2.27          2.8
        6         2.08       3.89         -1.9
        7         3.16       8.13         -5.0
        8         4.78      11.27         -6.5
        9         7.18      13.23         -6.0
       10        10.57      14.39         -3.8
       11        15.17      12.85          2.3
       12        21.23      11.49         10
       13        29.02      13.24         16
       14        38.85      18.13         21
       15        51.04      23.57         28
       16        65.93      35.63         30
       17        83.89      51.24         32
       18       105.33      60.13         45

Binaural, Sennheiser HD 560, diffuse field.

--
Frank Klemm
Re: [MP3 ENCODER] Parameter setting functions...
On Wed, Oct 04, 2000 at 09:42:51PM +0100, Sigbjørn Skjæret wrote:

:: Just thought I'd say my thoughts on the different parameter setting
:: function proposals we've had so far...
::
:: Individual functions for each parameter:
::
:: Pros:
::   - None. ;)
:: Cons:
::   - Litters the API with "thousands" of functions.
::   - If the parameter's type changes, the API has to change.

This never happens. First, you can (mostly) prevent it by not being
miserly with bits (Mark! It is easier to pass a boolean through a long
double argument than a long double through a boolean argument).
Secondly, I prefer only to add functions. Old functions slowly die
anyway once you stop testing them.

::   - If a new parameter is introduced, the API has to change.

This is a generic property of an API: the API changes if the API
changes. You can't avoid this by reducing the API entry points to one
function (otherwise the Linux API would not have changed since Linux
0.01: it is still the sys_call() function, which is an INT 0x80).

Don't mix up the two items "number of API entry points" and "complexity
of an API". Math libs don't become easier by:

    math_call ( MATH_SIN_TYPE  , MATH_ONE_ARGUMENT , MATH_DOUBLE_ARGUMENT, (double)0.707 );
    math_call ( MATH_SQRT_TYPE , MATH_ONE_ARGUMENT , MATH_DOUBLE_ARGUMENT, (double)1.732 );
    math_call ( MATH_POWER_TYPE, MATH_TWO_ARGUMENTS, MATH_DOUBLE_ARGUMENT, (double)2.616, (double)1.616 );

Real programmers also merge all the other functions (lame_ioctl,
math_call, sys_call, X11_ioctl, ...) and use only one function that
does everything:

    void do_stuff ( int do_, ... /* stuff */ );

Then you don't need a linker and can do the linking with a hex editor ;-)

Also possible are:

    DoIt ();
    PerformDataFunction ();
    HandleStuff ();
    do_args_method ();
    snafucate ();

    do_args_method ( LIB_SELECT_LAME | LIB_LAME_SETUP_FUNC | LIB_LAME_SETUP_SFREQ | LIB_SET_DBL_TYPE,
                     (double)44100 );

    do_args_method ( LIB_SELECT_POSIX | LIB_POSIX_IO_FUNC | LIB_POSIX_IO_WRITE |
                     (LIB_SET_INT_TYPE<<10) | (LIB_SET_VOIDPTR_TYPE<<5) | (LIB_SET_SIZET_TYPE<<0),
                     (int)fd, (void*) buff, (size_t) length );

    do_args_method ( LIB_SELECT_CLIB | LIB_CLIB_BASE_FUNC | LIB_CLIB_BASE_EXIT | LIB_SET_INT_TYPE,
                     (int)0 );

:: Giving a parameter structure:
::
:: Pros:
::   - Hmmm, none. ;)
:: Cons:
::   - You have to be very careful not to disturb the order of the
::     parameters.
::   - You end up with a bunch of duplicates if you have to change
::     parameters.
::   - Different compilers can cause different alignments.

  - You can't add a translation layer between API and lib.

:: Giving tag-pairs on stack to one function which parses them:
::
:: Pros:
::   - API never has to change.
:: Cons:
::   - Littering with different tags for each type.
::   - It's possible to pass the wrong type.
::
:: Giving tag-pairs on stack to 3 functions (one for each type):

Why 3 functions? You only need an int ;-)

:: Pros:
::   - API never has to change.
::   - One tag for any type.
:: Cons:
::   - It's still possible to pass the wrong type, but it's much clearer
::     since the function itself states which to pass.

Giving tag-pairs on stack to one function which parses them:

    const char* lame_ioctl ( enum TagItem, const char* TagValue );

returns NULL or the parameter value actually set.

Note: Both methods (thousands of functions and thousands of tags) are
equivalent:

  * use one function ( lame_ioctl() ) and thousands of constants to tell
    this function which functionality is actually requested
  * use thousands of functions ( lame_x () ) to execute a functionality

The difference is that the second possibility is more type safe, while
the first only looks as if you never need to change the API. That is
only partially true: old binaries still link, but you then get a
runtime error instead. This is often called error obscuring.

--
Frank "C programmers hate readable programs" Klemm
Re: [MP3 ENCODER] mpglib related routines (Re: modularization)
On Sun, Oct 01, 2000 at 09:10:19PM +0200, Robert Hegemann wrote:

:: So for backward compatibility we should make a wrapper library with
:: the old interface (as much as possible) and mark this as old and
:: outdated, to give clients the possibility for smooth migration to
:: the new API.
::
::     old client            new clients
::         |                      |
::         v                      |
::     wrapper lib                |
::         |                      |
::         +-------------+--------+--------+
::         |             |                 |
::         v             v                 v
::     lame-enc-lib   lame-dec-lib   lame-hdr-lib
::
:: lame-enc-lib:
::   - lame's encoding engine
::   - maybe with Xing's VBR header stuff if it must be
:: lame-dec-lib
::   - lame's wrapper to the mpg123 library
:: lame-hdr-lib
::   - wave header
::   - Xing header
::   - ID3 stuff
:: wrapper lib
::   - the old libmp3lame and interface

What about the proposal to design and discuss a well-designed LAME API
now, without implementation? AFAICS this takes at least 3 months if you
want a durable and neat API, and Mark doesn't want to change the API
until summer 2001. Maybe we should use this time to design all the
structures needed and the lame.h file.

This interface should be designed in such a way that

  * no arbitrary constants are in it
  * Layer I, Layer II and AAC can also be added to this interface
  * multiple channels are also possible
  * the input feed can be int16, int32, float or double
  * huge arrays are supported (open file, memmap, run encoder once,
    save result)
  * the 'Gap' problem is solved
  * the compatibility problem with the LAME 3.xx interface is solved

I propose a file 'NewAPI.h' in which all of this is written down
(prototypes and rationale). This is better than only discussing and
later forgetting the results.

    /*
     *  All bitstream data is stored in a bitbuffer_t structure. So
     *    - you only have to pass one argument
     *    - functions have the chance to increase the buffer if necessary
     *    - you don't need the return value to return the size of the MP3 data,
     *      so you can return other things (error codes or other stuff)
     *    - correctness of the bitbuffer parameter can be checked by the C compiler
     *    - all functions have the same structure
     */

    typedef struct {
        void*         data;     // data buffer, allocated via malloc ()
        const size_t  size;     // maximal size of the data buffer
        size_t        len;      // valid octets stored in this structure
    } bitbuffer_t;

    /*
     *  The lame API only sees very little of the internal structure of
     *  _IO_LAME. Only if compiled with BUILD_LAME_LIB are all elements
     *  visible (and writable).
     */

    typedef void (*fnptr) (...);

    typedef struct {
    #ifdef BUILD_LAME_LIB
        const fnptr*  virtual_function_table;
        int128        struct_ID;
        int           init_state;
        ...
    #else  /* BUILD_LAME_LIB */
        const fnptr*  virtual_function_table;
        const int128  struct_ID;
        const int     init_state;
        const char    __internal_data__ [32000];
    #endif /* BUILD_LAME_LIB */
    } _IO_LAME;

    typedef int  errorcode_t;

    bitbuffer_t  open_bitbuffer  ( size_t size = 16384 );
    int          close_bitbuffer ( bitbuffer_t* buffer );
    int          set_FILE_binary ( FILE* fp );

    LAME*        lame_open  ( void );
    int          lame_close ( LAME* lp );

    /* It would be nice if these functions returned the exact input frequency.
       But this is only possible if the output frequency is available */
    long double  lame_set_input_samplefreq    ( LAME* lp, long double freq = 44100.l );
    long double  lame_set_output_samplefreq   ( LAME* lp, long double freq = 44100.l );
    int          lame_set_number_of_channels  ( LAME* lp, int front = 2, int rear = 0, int dual = 0, int docu = 0 );
    int          lame_set_min_bitrate         ( LAME* lp, int minbitrate = 112000 );
    int          lame_set_max_bitrate         ( LAME* lp, int maxbitrate = 320000 );
    int          lame_set_minmax_bitrate      ( LAME* lp, int minbitrate = 112000, int maxbitrate = 320000 );
    int          lame_set_cbr_bitrate         ( LAME* lp, int minbitrate = 128000 );
    real         lame_set_cwlimit             ( LAME* lp, real freq = 8871.68 );
    const char*  lame_version                 ( LAME* lp );
    real         lame_set_mask_to_noise_ratio ( LAME* lp, real ratio = 0. );
    lame_mode_t  lame_set_bitrate_mode        ( LAME* lp, lame_mode_t mode = lame_cbr );
    real         lame_set_fullscale_level     ( LAME* lp, real fullscale = 32768. );
    int          lame_set_coding_quality      ( LAME* lp, int quality = 50 );   // 0 worst, 50 default, 100 best
    int64        lame_read_coded_pcmsamples   ( LAME* lp );

    errorcode_t  lame_encode_buffer_short ( LAME* lp, bitbuffer_t* buffer,
                                            size_t pcmdata_len, const short* channel1, ... );
    errorcode_t  lame_encode_buffer_interleaved_short ( LAME* lp, bitbuffer_t* buffer,
Re: [MP3 ENCODER] resampling
:: I think this should be a separate utility outside of lame? Most people
:: encode from CDs, which usually are already correctly filtered for stuff
:: below 20 Hz.
::
:: For pop music this is (mostly) true. I will test several CDs. Next week.
:: The psycho part I would nevertheless filter with an 8th order Chebychev
:: high pass @80 Hz. Remember actively controlled speakers by B&O. But also
:: normal vented tubes have a relatively sharp cut-off at low frequencies,
:: so don't rely on masking.
::
:: The windowed FFT used for the psycho acoustics surely has better
:: frequency response than an 8th order Chebychev filter,
:: so there will be negligible leakage of this 80 Hz tone into
:: other frequencies. If you think this is a problem, I would
:: try some experiments by removing (from the psycho acoustics)
:: the coefficients up to 80 Hz.

1. Chebychev filters (being LTI systems) have no leakage at all.

2. Chebychev highpass filters (like all IIR filters) have a worse
   frequency resolution at high frequencies (-> oo) and an extremely
   good one at low frequencies (-> 0).

3. You can program an FFT-like filterbank with simple 2nd order lowpass
   filters. But this filterbank has problems detecting high
   frequencies if high-level low-frequency signals are present.

Calculate it yourself:

    #include <stdio.h>
    #include <math.h>

    #define PASS_BAND_RIPPLE  0.1      // dB
    #define ORDER             8
    #define STOP_BAND         84.22    // Hz, note damping is not
                                       // -3 dB, but -PASS_BAND_RIPPLE dB

    double chebychev_polynom ( double x, int n )
    {
        // NOTE: These are really polynomials, but the trigonometric
        //       representation is much easier to remember
        if ( x <= 1 )
            return cos  ( n*acos  (x) );
        else
            return cosh ( n*acosh (x) );   // would be the same formula for complex numbers
    }

    // Table step width in Hz, growing with frequency
    int step ( int i )
    {
        if ( i <   200 ) return    1;
        if ( i <   500 ) return    2;
        if ( i <  1000 ) return   10;
        if ( i <  2000 ) return   50;
        if ( i <  5000 ) return  200;
        if ( i < 10000 ) return 1000;   // limit garbled in the archive, 10000 assumed
        if ( i < 25000 ) return 5000;
        return i;
    }

    int main (void)
    {
        int     i;
        double  poly;
        double  damp;
        double  ripple = pow ( 10., 0.1*PASS_BAND_RIPPLE ) - 1.;

        // upper limit garbled in the archive, 20000 Hz assumed
        for ( i = 1; i <= 20000; i += step (i) ) {
            poly = chebychev_polynom ( (double) STOP_BAND / i, ORDER );
            damp = 10. * log10 ( 1. + ripple * poly * poly )
                 + ( ORDER & 1 ? 0. : -PASS_BAND_RIPPLE );
            printf ( "%5d Hz%9.2f dB\n", i, -damp );
        }
        return 0;
    }

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] resampling
:: :: Question: When fs_in/fs_out is not representable by a:b with little a and b,
:: :: what do you like best:
:: ::
:: :: [_] Lame rounds fs_in in a way, so fs_in/fs_out is representable by little a:b
:: :: [_] Lame have a function to resample exactly any ratio
:: :: [_] both, selectable
::
:: LAME is going to only work with integer samplerates. MP3 output must
:: always be an integer samplerate. If you want to encode
:: from non-integer samplerate source, resample first.

The input is a 57:10.00 WAV file with 34301.0 Hz sample frequency
(470 MByte). What should the output be, IYHO?

a) Resample by 9:7 and get a 57:10.10 MP3 file at 44.1 kHz with 30 ppm
   pitch error

   + faster (factor of 2)
   o little pitch error
   - timing error
   + excellent side slope rejection is no problem
   + you need no fractional fs, because the precision is restricted by
     the maximum possible a:b ratio (for instance a,b <= 441)
   + often rounds rounded values to the correct values (see below)

b) Resample by 44100:34301 and get a 57:10.00 MP3 file at 44.1 kHz with
   exact pitch

   - slower (nearly doubles the computation time)
   + no pitch error
   + no timing error
   - for good side slope rejection (90 dB) you need something between
     128 and 256 tables
   - this routine does exactly what you request:
     fs_in = 3, fs_out = 5: This method really resamples by
     5:3 = 1.500015...:1, not 3:2.

c) Both should be possible.

I only need the input: a), b) or c).

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] MMX question
:: Mark, I got that already, but I have no idea how to check
:: for the presence of a MMX CPU. If someone knows how to do
:: that, fine! Maybe there is some special x86 command that
:: allows to decide whether it is a MMX CPU or not.
:: As a quick fix we could implement that command line switch
:: for LAME.
::
:: Can't you just execute an MMX instruction and catch an exception if it
:: crashes? i don't know much about C so i don't know how error handling is
:: done.

Oh, oh no! Do you want to start a contest for writing the dirtiest
possible program?

    main(_){float x,sin();for(_=0;__*_8;_+=_) ...

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] resampling
:: As a value of 200 for BLACKSIZE showed an improvement in resampling, why
:: does is still got a value as low as 25?

The low pass, high pass and resampling code should be replaced by
artefact-free program code. All three are currently done by code that
is not an LTI system, which results in unnecessary distortions. You can
see this in the spectral view of CoolEdit Pro.

LTI stands for Linear Time Invariant. Non-LTI systems generate
additional frequencies instead of only emphasizing and de-emphasizing
existing ones.

--
Best regards
Frank Klemm
[MP3 ENCODER] Second question: Compatiblity
Currently programs (must) access the lame_global_flags structure
directly to set up LAME. To get as much compatibility as necessary, it
is important to know *when* programs (must) access this structure.

  [_] Frontends only access this structure for setup, before any PCM
      sample is coded to MP3. In this case compatibility can be
      achieved via a simple trick.

  [_] Frontends also access this structure during the coding process.
      This makes things much harder.

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] resampling
:: :: As a value of 200 for BLACKSIZE showed an improvement in resampling, why
:: :: does is still got a value as low as 25?
:: ::
:: :: Regards,
:: ::
:: :: --
:: :: Gabriel Bouvigne - France
::
:: Hi Gabriel,
::
:: Increasing BLACKSIZE only improves the sharpness of the lowpass
:: cutoff. For resampling, I dont think we need an extremely sharp
:: cutoff, and maybe a more gradual cutoff even sounds better?
::
:: The problem David discovered (on that mp3.com posting) (aliasing onto
:: lower frequencies) was related to BPC - the number of precomputed
:: convolution functions. LAME was precomputing only 16, but this is now
:: bumped up to 160 in lame 3.87.
::
:: I hope to add something soon which has it precompute the exact amount
:: needed. Does anyone have code which computes the lcd (largest
:: common denominator) of two ints?

Frequencies are real numbers, not integers. And I don't like code like:

    if ( Frequency == 44055 || Frequency == 44056 ) Frequency = 2863636/65.L;   // NTSC PCM
    if ( Frequency == 31468 || Frequency == 31469 ) Frequency = 2863636/91.L;   // NTSC Hi8 Digital
    if ( Frequency == 22254 || Frequency == 22255 ) Frequency =  244800/11.L;   // MAC HQ
    if ( Frequency == 11127 || Frequency == 11128 ) Frequency =  122400/11.L;   // MAC LQ
    if ( Frequency ==  8012 || Frequency ==  8013 ) Frequency =  312500/39.L;   // NeXT/Telco
    if ( Frequency ==  5512 || Frequency ==  5513 ) Frequency =   44100/ 8.L;   // 1/8 CD

The table index search is done by:

    #include <assert.h>
    #include <math.h>

    /* Finds the ratio x2:x1 (x1 <= MAX_TABLES) that best approximates
       f2:f1 and returns its relative error. A larger x1 is only
       accepted if it reduces the error by more than 10%. */
    double Factorize ( const long double f1, const long double f2, int* x1, int* x2 )
    {
        unsigned     i;
        long         ltmp;
        long double  ftmp;
        double       minerror = 1.;
        double       abserror = 1.;
        double       error;

        assert ( f1 > 0. );
        assert ( f2 > 0. );
        assert ( x1 != NULL );
        assert ( x2 != NULL );

        for ( i = 1; i <= MAX_TABLES; i++ ) {
            ftmp  = f2 * i / f1;
            ltmp  = (long) ( ftmp + 0.5 );
            error = fabs ( ltmp/ftmp - 1. );
            if ( error < minerror ) {
                *x1      = i;
                *x2      = (int)ltmp;
                minerror = error * 0.9;
                abserror = ltmp/ftmp - 1.;
            }
        }
        return abserror;
    }

:: I think the number of windows needed
:: is given by: out_samplerate/(lcd(in_samplerate,out_samplerate))
::
:: (44.1khz -> 32khz requires 320 windows, but 48khz -> 32khz only
:: requires 4!)

48 kHz -> 32 kHz requires 2 sets of coefficients. These coefficients
are not the windows. The window function is the same for all sets of
coefficients. You need setup code like:

    err = Factorize ( NewFrequency, Frequency, &iNew, &iOld );
    if ( err != 0. )
        fprintf ( stderr, "(%.6Lf => %.6Lf Hz, Ratio %d:%d, Error %+.3f ppm) ",
                  Frequency, NewFrequency, iNew, iOld, 1.e6*err );
    else
        fprintf ( stderr, "(%.6Lf => %.6Lf Hz, Ratio %d:%d) ",
                  Frequency, NewFrequency, iNew, iOld );

    ...

    for ( i = 0; i < iNew; i++ ) {
        k = imuldiv (i, iOld, iNew) - WINDOW_SIZE;
        (iNew < iOld ? Calc_Coeffs_Down : Calc_Coeffs_Up) ( Coeff [i], iNew, iOld, i, k );
    }

and transforming code like:

    for ( ...; i < ...; i++ ) {
        k = imuldiv (i, iOld, iNew) - WINDOW_SIZE;
        ScalarWindow ( Dest.fData [i], Coeff [i%iNew], sData + k );
    }

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] TagItem issues...
:: Sigbjørn Skjæret wrote on Tue, 26 Sep 2000:
:: :: Why do we need float at this point ??
::
:: Because several of the parsed arguments are floats?
::
:: ie. frequencies could be passed in Hertz as ints,
:: was just something to think about now while we
:: change the API anyway

What about non-integer sampling frequencies? For stand-alone audio,
rounding is no problem. But for audio/video and unbuffered streaming it
may produce really unnecessary problems. For instance, AIFF stores the
sampling frequency as an 80-bit IEEE-754 long double. Avoid problems
instead of patching up their results.

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Free format
:: another thing that does not work:
::
::   lame -b640 --freeformat -g fatboy.wav fatboy.mp3
::   fatal error. MAXFRAMESIZE not large enough.
::
:: in mpg123.h MAXFRAMESIZE is defined as 1792, but a 32 kHz 640 kbps MP3
:: consists of 2880 bytes per frame (2090 at 44.1 kHz, 1920 at 48 kHz).
:: Changing this seems to trigger another BUG somewhere:
:: LAME stops with: Only 8 and 16 bit input files supported
::
::   1792*8 = 14336 bits,
::   1920*8 = 15360 bits,
::   2090*8 = 16720 bits, more than 32767/2, reason for BUG?
::   2880*8 = 23040 bits, more than 32767/2, reason for BUG?
::
:: I haven't looked deeper at this, I had no time yet :-(.

Test this with --noshort as well. --decoder crashes at higher data
rates and with --noshort.

--
Best regards
Frank Klemm
Re: [MP3 ENCODER] Some suggestions for LAME - please review
Hi Mark,

:: A couple of comments/questions:
::
:: :: Also, every transition from two different size windows is lossy. The
:: :: MDCT is only lossless for overlapping windows of the same size.
:: ::
:: Is this a problem of badly designed (asymmetric) window functions or a
:: problem of the MDCT (different from DCT).
::
:: :: It is a problem with all lapped transforms (like the MDCT).
:: You need at least a 50% overlap to get a lossless transform.
:: In AAC, a 1024 frame followed by a 128 frame, the 128 frame
:: will use a window of size 256, so it only extends 64 samples into
:: the 1024 frame.

I've tested a (slow, stupid) FT-based system with randomly switching window sizes and got only rounding errors. The signal is divided into arbitrary blocks. Every FFT block spans two of these blocks and uses two (often different) cos² functions for cross-fading. The blocks are cosine transformed. When I have time, I will test this again (I'm not so familiar with the DCT; I'm only familiar with LTI, FT and zT, and LT is also a little bit more difficult).

:: 4. The prefilter has an extremely short size of 4 or 5 taps, which is
:: far below 128, 192, 576, or 1024.
::
:: If this is true, then the 1024 FFT should have plenty of frequency
:: resolution and the prefilter can be easily implemented via the FFT
:: coefficients. So no need for a new filter?

1. FFT filters are, strictly speaking, not filters (they are not LTI systems), so they have some nasty properties which are more or less audible. The audibility depends on the steepness of the filter. So high-pass filters should never be built as FFT filters. Never ever. The filter slopes modulate the signal, a property LTI systems NEVER have. Low-pass filters may also be a bad idea. For high-pass filters I'm absolutely sure.

2. FFT filters approximate non-recursive filters (often called FIR filters, which is not correct), but actually they are a mixture of a frequency-dependent modulator and a filter.
Non-recursive filters are only a very special class of filters. All LTI filtering is done by:

$$y(n) := \sum_{i=0}^{A} a(i)\,x(n-i) \;-\; \sum_{i=1}^{B} b(i)\,y(n-i)$$

Every filter can be characterized by the a(0...A) and b(1...B). For non-recursive filters B=0 and A>0 (also called moving average, MA, filters); for purely recursive filters A=0 and B>0 (also called autoregressive, AR, filters). Filters with B>0 and A>0 mix both basic types of filters and are also called autoregressive moving average (ARMA) filters.

You can divide every (LTI) filter into two filters, an MA and an AR filter:

$$v(n) := \sum_{i=0}^{A} a(i)\,x(n-i)$$

$$y(n) := v(n) - \sum_{i=1}^{B} b(i)\,y(n-i)$$

Now you can set b(0) := 1:

$$v(n) = \sum_{i=0}^{A} a(i)\,x(n-i)$$

$$v(n) = \sum_{i=0}^{B} b(i)\,y(n-i)$$

This gives (x, y, z complex, $\omega$ is the angular frequency, $f_s$ the sampling frequency, j is sqrt(-1)):

$$\frac{v(\omega)}{x(\omega)} = \sum_{i=0}^{A} a(i)\,e^{-j\omega i/f_s}$$

$$\frac{v(\omega)}{y(\omega)} = \sum_{i=0}^{B} b(i)\,e^{-j\omega i/f_s}$$

Substituting $z = e^{-j\omega/f_s}$ (z is the unit delay here, i.e. $z^{-1}$ in the usual z-transform convention) gives

$$\frac{v}{x} = \sum_{i=0}^{A} a(i)\,z^{i}$$

$$\frac{v}{y} = \sum_{i=0}^{B} b(i)\,z^{i}$$

and

$$\frac{y}{x} = \frac{\sum_{i=0}^{A} a(i)\,z^{i}}{\sum_{i=0}^{B} b(i)\,z^{i}}$$

So you can see: MA = P_A(z), AR = 1/Q_B(z) and ARMA = P_A(z)/Q_B(z).

Example: The simplest AR filter, an integrator (1st order), can only be reproduced by an infinitely long MA filter.

Are polyphase filters LTI systems? FFT filters aren't. And they are comparable with the subset of MA filters.

-- 
Best regards,
Frank Klemm
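The difference equation above translates almost literally into C. The following is a minimal sketch, not LAME code: the name `arma_filter` is invented, samples before the start of the input are assumed to be zero, and `x` and `y` must not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Direct evaluation of
 *   y(n) = sum_{i=0..A} a[i]*x(n-i)  -  sum_{i=1..B} b[i]*y(n-i)
 * b[0] is not used (it is taken as 1).  Samples before n = 0 count as zero. */
void arma_filter(const double *a, size_t A,
                 const double *b, size_t B,
                 const double *x, double *y, size_t n)
{
    assert(a != NULL && x != NULL && y != NULL);
    assert(B == 0 || b != NULL);

    for (size_t k = 0; k < n; k++) {
        double acc = 0.0;
        for (size_t i = 0; i <= A && i <= k; i++)   /* MA part */
            acc += a[i] * x[k - i];
        for (size_t i = 1; i <= B && i <= k; i++)   /* AR part */
            acc -= b[i] * y[k - i];
        y[k] = acc;
    }
}
```

With a = {1} (A = 0) and b = {1, -1} (B = 1) this is the first-order integrator from the example: a single unit impulse at the input yields an output of 1 at every following sample, i.e. an impulse response that never ends, which no finite MA filter can reproduce.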
[MP3 ENCODER] Please Vote: Interface types
To avoid endless debates about the interface without clarifying the basic concept, I introduce all possible methods I know (for C) and explain all pros and cons I know. I hope this is the best I can do.

+---------------------------------------+
| [_] I do not vote                     |
| [_] pointer to a public structure     |
| [_] private struct                    |
| [_] pointer to a private struct       |
| [_] lame handle                       |
+---------------------------------------+

a) pointer to a public structure

Program has to allocate a struct on the stack, on the heap or statically (or lame_open has to do this) and has to release it at the end. The layout of this struct is well known, every element can be read and written. All lame functions get a pointer to this struct as first argument. The interface is peeking and poking this structure.

  + source code compatible with the current version (you only need to
    recompile, if you have all the sources)
  - problems while extending, reordering or changing the structure, and
    while removing, changing or promoting elements of this structure
    (this can be avoided with more or less effort up to a certain degree,
    but this will also hinder development, and finally it goes wrong)
  - problems when the compiler changes the binary structure representation
  - problems when the structure must be extended
  - problems when changing libmp3lame.so due to the points mentioned above
  - there are some ambiguous situations
    (example: VBR_q=0: quality=standard or quality=0)
  - error handling is more difficult (who is guilty?)

Note: Changing an interface does not become easier in the future.
Note: If necessary, a wrapper can be written to support the old interface.

b) private struct

Program has to allocate a struct on the stack, on the heap or statically. The layout of this struct is not known, so access is possible, but only with obviously very dirty methods. All lame functions get a pointer to this struct as first argument. The interface consists of well defined function calls.
  - problems when the struct must be enlarged
  o for the rest see c)

c) pointer to a private struct

Program calls lame_open and gets back a pointer to a structure. The layout of this struct is not known, so access is possible, but only with obviously very dirty methods. All lame functions get this struct pointer as first argument. The interface consists of well defined function calls.

  - not source code compatible with the current version
  + no problems changing the internal structure. You can do nearly everything
  + no problems changing the shared lib without recompiling all programs
  + no compiler representation problems
  + structure can be extended without any problem
  + well defined interface with the possibility of error handling

Note: UNIX/C buffered FILE I/O works in this way:
      fopen/fread/fwrite/fputs/fclose/setvbuf/fseek/...

d) lame handle

Program calls lame_open and gets back an integer value. This value is an index into an internal lame table. Access to this internal table is also only possible via obviously very dirty methods. All lame functions get this integer value as first argument. The interface consists of well defined function calls.

  o first see c)
  + better anonymization of the structure
  + does not need the trick of mapping two different structures in API and Lame
  - a little bit less error checking is possible
    (first parameter is an int, here you can pass a lot of trash: 44100/1050)

Note: POSIX unbuffered I/O works in this way:
      open/read/write/close/lseek/dup/dup2
      Good operating systems also hide the internal structure by memory
      protection (kernel/user space).

Hope this helps.

-- 
Best regards,
Frank Klemm
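Option c), the pointer to a private struct, can be sketched in a few lines. All names below (`enc_t`, `enc_open`, ...) are hypothetical and chosen so that they cannot be mistaken for the real LAME interface; the upper half is what a public header would contain, the lower half what only the library source would see.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- public header part (hypothetical names) ---- */
typedef struct enc enc_t;               /* incomplete type: layout is hidden */

enc_t  *enc_open(void);
int     enc_set_lowpass(enc_t *e, double hz);
double  enc_report_lowpass(const enc_t *e);
int     enc_close(enc_t *e);

/* ---- library-internal part ---- */
struct enc {
    double lowpass_hz;
    /* fields can be added or reordered here without breaking callers */
};

enc_t *enc_open(void)
{
    enc_t *e = calloc(1, sizeof *e);
    if (e != NULL)
        e->lowpass_hz = 16000.0;        /* arbitrary default for this sketch */
    return e;
}

int enc_set_lowpass(enc_t *e, double hz)
{
    if (e == NULL || hz <= 0.0)
        return -1;                      /* every setter can report errors */
    e->lowpass_hz = hz;
    return 0;
}

double enc_report_lowpass(const enc_t *e)
{
    assert(e != NULL);
    return e->lowpass_hz;
}

int enc_close(enc_t *e)
{
    if (e == NULL)
        return -1;
    free(e);
    return 0;
}
```

A caller only ever writes `enc_t *e = enc_open(); enc_set_lowpass(e, 16050.0); enc_close(e);` and never sees `struct enc`, which is what makes the "+" points listed under c) possible.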
Re: [MP3 ENCODER] TagItem issues...
:: /* Usage :
:: lame_set_parameters(gfp, LAME_SAMP_FREQ__INT, 44100 ,
::                          LAME_NR_CHANNELS__INT , 2 ,
::                          LAME_LOW_PASS_VALUE__FLOAT , 16.050 ,
::                          LAME_LOW_PASS_WIDTH__FLOAT , 0.75,
::                          LAME_COMMENT__STRING , "This is cool" ,
::                          LAME_END_MARKER);
::
:: LAME_END_MARKER informs the lame_set_parameters to stop reading values,
:: as AFAIK there is not other way to determine whether all the args has
:: been read
::
:: the __FLOAT and __INT attachments are there only to help illustrate my
:: idea, but could be helpful in the actual implementation too
::
:: */

One function, but is this simple? No error checking, a lot of really strange constants, mixing Hz and kHz. Do you like bugs? Do you love bugs? Would you buy a car with only one joystick for all operations (including wheel and oil change)? Only one joystick, that means very simple operation! Welcome to the graveyard.

    typedef struct {
        void*   ptr;    /* start of the buffer                          */
        size_t  size;   /* allocated size, may be increased by the lib  */
        size_t  len;    /* number of valid bytes                        */
    } bytebuffer_t;

    bytebuffer_t  bb;
    LAME*         gfp;

    gfp = lame_open ();
    lame_set_samplefreq    ( gfp, 44100.0 );
    lame_set_channels      ( gfp, 2 );
    lame_set_lowpass       ( gfp, 16050.0 );
    lame_set_lowpass_width ( gfp, 750.0 );
    lame_set_lowpass_type  ( gfp, lame_filter_chebychev, 8, 0.5 );
    lame_set_comment       ( gfp, "This is not error prone" );
    lame_add_comment       ( gfp, "Developers will not hate lame developers" );
    lame_add_comment       ( gfp, "Lame developers do not need a body guard" );

    alloc_bytebuffer ( &bb, 16384 );

    error = lame_encode_buffer ( gfp, &bb, 88200, channel_left, channel_right );
    if (error) lame_perror ( "The following error occurred", error );
    fwrite ( bb.ptr, 1, bb.len, fp );

    error = lame_encode_interleaved_buffer ( gfp, &bb, 88200, interleaved_buffer );
    if (error) lame_perror ( "The following error occurred", error );
    fwrite ( bb.ptr, 1, bb.len, fp );

    error = lame_encode_finish ( gfp, &bb );
    if (error) lame_perror ( "The following error occurred", error );
    fwrite ( bb.ptr, 1, bb.len, fp );

    if ( bb.size > 16384 )
        printf ( "BitBuffer was increased by a subroutine, instead of crashing!\n" );

    free_bytebuffer ( &bb );
    error = lame_close ( gfp );
    if (error) lame_perror ( "Shit, the last operation and an error", error );

-- 
Best regards,
Frank Klemm
Re: [MP3 ENCODER] realtime encoding specs ?
:: Ok, My goal is to setup an fm tuner card and record some late night radio.
:: The signal and production quality of the broadcast are top notch, so I'd
:: like to preserve as much as possible. Also, I'd ideally like to encode
:: realtime.. I'm given to understand that lower bitrates encode faster ?
:: What would be the processor power required for realtime encoding at:
::   lame -V 1 -b 128 -h -m j -q2 ?
::   lame with a CBR of 160 ?
::
:: Note: my guess at a VBR encoding spec is sort of a guess at what might
:: produce a file with an average of 150 kbps, which I think would be
:: perfect.. if this is overkill for a high quality fm signal, what would
:: be a more reasonable alternative ?

Works on a K6-2-315 with Lame 3.87 alpha 3. Average CPU load is around 90%.

--
#! /bin/bash

time=3600
output=/home/pfk/1.mp3
lame=/usr/local/bin/lame
lameopt='-b 112 -V1 -q5 -mj --lowpass 15.0'
fs=32000

( nice -n -20 wavrec -t $time -s $fs -b 16 -S /proc/self/fd/1 ) | \
( nice -n -20 buffer -p 0 -s 36k -m 32760k ) | \
( nice -n -20 $lame $lameopt /proc/self/fd/0 /proc/self/fd/1 ) | \
( nice -n -20 buffer -p 34 -s 256k -m 768k ) > $output
---

For a faster machine I would change 'lameopt' to one of the following:

lameopt='-b112 -V1 -q1 -mj --lowpass 15.0'
lameopt='-b128 -V0 -q1 -mj --lowpass 15.0'
lameopt='-b128 -V1 -q1 -mj --lowpass 15.0'
lameopt='-b160 -V0 -q1 -mj --lowpass 15.0'

If you are recording for CDs, change fs to 44100.

Maybe someone can add this to lame (rlame.sh = recorder lame)?

-- 
Best regards,
Frank Klemm
Re: [MP3 ENCODER] TagItem issues...
_voice );
encoding
lame_close ( handle );

Handles have advantages and disadvantages:

  - you need an internal pointer array, which limits the maximum number
    of handles:

        lame_global_flags *Table [20];

    lame_open() searches for a free slot and allocates memory.
    A forgotten lame_close() causes problems that are difficult to find.
  - no type safety. Nothing prevents things like:

        lame_set_highpass ( 44100/1000, 100.0 );

  + you can't access the structure itself.
  - this can also be achieved via a simple trick: you use different
    structure definitions for the interface and for the lib.

interface:

    typedef enum {
        valid_setup  = 1,   // ready for setup
        valid_encode = 2    // ready for encoding
                            // otherwise invalid
    } valid_t;

    #ifndef LAME_LIB

    typedef struct {
        valid_t        valid;
    } LAME;

    #else

    typedef struct {
        valid_t        valid;
        long double    input_sample_freq;
        long double    output_sample_freq;
        unsigned       quality;          // unsigned is better to ease boundary checks
        bool           crc;
        unsigned long  cbr_bitrate;      // bps, not Gbps or things like that
        unsigned long  vbr_min_bitrate;
        unsigned long  vbr_max_bitrate;
        ...
    } LAME;

    #endif

    LAME*   lame_open           ( void );
    int     lame_set_crc        ( LAME*, enable_t );
    int     lame_set_genuine    ( LAME*, bool );
    int     lame_set_lowpass    ( LAME*, double );
    int     lame_close          ( LAME* );
    double  lame_report_lowpass ( const LAME* );   // I hate set and get

-- 
Best regards,
Frank Klemm
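The handle variant described at the top of this message can be sketched as follows. The names (`h_open`, `h_set_highpass`, `h_close`) are invented for illustration and are not LAME functions; for brevity the table holds the structs directly instead of pointers to allocated memory, which is a simplification of the `Table [20]` pointer array mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 20              /* table size taken from the message above */

typedef struct {
    int    in_use;
    double highpass_hz;
} slot_t;

static slot_t table[MAX_HANDLES];   /* the internal table callers never see */

/* Map a handle to its slot; NULL for anything that is not a live handle. */
static slot_t *lookup(int h)
{
    if (h < 0 || h >= MAX_HANDLES || !table[h].in_use)
        return NULL;
    return &table[h];
}

/* Search a free slot; returns a handle >= 0, or -1 if all slots are taken. */
int h_open(void)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (!table[h].in_use) {
            table[h].in_use      = 1;
            table[h].highpass_hz = 0.0;
            return h;
        }
    }
    return -1;
}

int h_set_highpass(int h, double hz)
{
    slot_t *s = lookup(h);
    if (s == NULL)
        return -1;                  /* only out-of-range or closed handles are caught */
    s->highpass_hz = hz;
    return 0;
}

int h_close(int h)
{
    slot_t *s = lookup(h);
    if (s == NULL)
        return -1;
    s->in_use = 0;
    assert(lookup(h) == NULL);      /* the slot is free again */
    return 0;
}
```

The weak spot named in the message is visible here: `h_set_highpass ( 44100/1000, 100.0 )` is rejected only because 44 happens to be outside the table. Any small integer that matches an open slot would be accepted silently.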
Re: [MP3 ENCODER] Some suggestions for LAME - please review
white) noise from another (white) noise with the same autocorrelation function but a completely different temporal MDCT spectrum. So it is a waste of bandwidth to try to store exactly this (white) noise signal. It should therefore be possible to increase the quantization steps and to use a 1D error diffusion in the frequency domain to synthesize, after decoding, a different noise that is colored in (nearly) the same way.

-- 
Best regards,
Frank Klemm
[MP3 ENCODER] Simple ABX testing program
Currently I'm writing a little (Unix) program for making ABX double blind tests.

Current functionality:

  * Able to load WAV/AIFF and MP1/2/3 files via lame
  * Files must be Stereo, 44.1 kHz, 16 bit, up to 3 minutes long
  * No DC canceling to reduce fading noise between A, B and X
  * No sample adjustment to detect sample shifts between A and B
  * Listener can switch between A (listen to A), B (listen to B),
    X (listen to the unknown) and M (switches between A or B and X
    forever during replay)
  * Block repeat: listener can select a block which is repeated (instead
    of the whole piece of music).
  * Voting control and evaluation.

Keys:

  a/A   listen to A
  b/B   listen to B
  x/X   listen to the unknown channel
  m/M   listen to A or B and X
  Q     exit program
  ^A    vote for X=A
  ^B    vote for X=B
  F1    block begin is start of file
  F2    move block begin 0.1 s <=
  F3    block begin = current time
  F4    move block begin 0.1 s =>
  F5    block end is end of file
  F6    move block end 0.1 s <=
  F7    block end = current time
  F8    move block end 0.1 s =>
  r/R   Review
  f/F   Forward/Cue

-- 
Frank Klemm
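The message does not say how the votes are evaluated. A common choice, assumed here, is the one-sided binomial test: how likely is it to get at least the observed number of correct votes by pure guessing? The function name `abx_p_value` is made up for this sketch.

```c
#include <assert.h>

/* Probability of getting at least `correct` out of `trials` ABX votes right
 * by guessing (chance 1/2 per vote).  Assumed evaluation method, not taken
 * from the program described above. */
double abx_p_value(unsigned trials, unsigned correct)
{
    assert(correct <= trials);

    double binom = 1.0;     /* C(trials, 0) */
    double tail  = 0.0;     /* sum of C(trials, k) for k >= correct */
    double total = 0.0;     /* sum over all k, ends up as 2^trials */

    for (unsigned k = 0; k <= trials; k++) {
        if (k >= correct)
            tail += binom;
        total += binom;
        /* C(trials, k+1) = C(trials, k) * (trials - k) / (k + 1) */
        binom = binom * (double)(trials - k) / (double)(k + 1);
    }
    return tail / total;
}
```

A listener who gets 20 of 20 right has a guessing probability of 1/2^20, below one in a million; 12 of 16 gives about 3.8 %.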
Re: [MP3 ENCODER] various questions (was: Some suggestions for LAME - please review)
On Sun, Sep 24, 2000 at 10:57:39AM +0200, Gabriel Bouvigne wrote:

I've got another question about window sizes: are the short ones really essential in VBR? Would it be possible to only use long ones, and then allocating a lot more bits in the case of transients? After all, Xing uses only long ones, and does a not so bad job for transients for an encoder using only long ones. (note: I'm not saying that Xing is a reference in term of quality)

Tested with a synthesized signal:

  --noshort -b128:               awful
  --noshort -b320:               bad
  --noshort -b550 --freeformat:  Decoder SIGSEG
  -b320:                         good, but distinguishable from the original
                                 without any effort (20/20)
  -b550 --freeformat:            okay

Note: All-purpose lossless compression utilities gave a better compression ratio:

            input uses round-to-         input uses
            nearest-integer quantization HQ quantization
  gzip      190 kbps                     74 kbps
  bzip      154 kbps                     68 kbps

Very short attacks seem to be a nightmare for MP3. The signal is:

  * white noise
  * attack time: 0.5 ms
  * release time: 25 ms
  * pause time to fill the bit pool: 474.5 ms
  * both channels are uncorrelated
  * all attacks are different and also sound a little bit different

Note: The percussion attacks in "Money for nothing" are a little bit similar to these attacks:

  * white noise from 1...18 kHz (+/- 3 dB)
  * attack time: ca. 0.5 ms
  * release time: ca. 20...30 ms
  * but: no silence between the attacks

How to capture Win95 screen shots? What utility would be the best?

-- 
Frank Klemm
Re: [MP3 ENCODER] various questions (was: Some suggestions for LAME - please review)
On Sun, Sep 24, 2000 at 10:57:39AM +0200, Gabriel Bouvigne wrote:

Minidisc also uses mixed windows. Perhaps mixed windows would help in our case. I've got another question about window sizes: are the short ones really essential in VBR? Would it be possible to only use long ones, and then allocating a lot more bits in the case of transients?

A long window has a duration of up to 36 ms (32 kHz). So the worst case pre-echoes are:

  * - 5 dB for dt = 12 ms (32 kHz) or dt =  9 ms (44.1 kHz)
  * -12 dB for dt = 18 ms (32 kHz) or dt = 13 ms (44.1 kHz)
  * -24 dB for dt = 24 ms (32 kHz) or dt = 17 ms (44.1 kHz)

Because this is much flatter than the human pre-masking, you really need a huge number of additional bits. Often even 320 kbps sounds worse. For post-masking I found values around 1 dB/6 ms for 1...5 kHz. What's the value for pre-masking?

5. Spectral prefiltering to get nearly constant ATH in every CB.

Why can we read in the literature that humans got 25 CB but mp3 uses only 22? I think it is because the low frequency CBs are larger than the ones in the literature.

You have several problems:

  * MP3 only uses CB widths which are a multiple of 4 lines, perhaps to
    make use of the Intel SIMD instructions ;-)
  * So all CBs have sizes of multiples of 111 Hz/153 Hz/167 Hz, which
    can't be mapped to the CBs often found in the literature.
  * The exact width of a CB is a little bit arbitrary; you can find values
    from 40 Hz...120 Hz for low frequencies. It depends on the exact
    definition of the term "CB". A lot of people say that 100 Hz for low
    frequencies is much too large.
  * Note that Zwicker also splits the 3 low frequency CBs into several
    subbands to compensate for the frequency dependence of the ATH (see
    Table 1 in DIN 45631). See also ISO 532: "Methode de calcul du niveau
    d'isononie"

Another question: I have some C++ programs generating test signals. The programs are around 1...2 KByte in size and generate WAV files in the range from 1...10 MByte. Some of them are really nasty for MP3.
Should we collect such programs? -- Frank Klemm -- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
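The 36 ms window length and the 111 Hz/153 Hz/167 Hz granularity quoted in the message above can be reproduced with a little arithmetic. The sketch assumes that a long block spans 1152 input samples and that a granule has 576 frequency lines; the function names are invented.

```c
#include <assert.h>

#define LONG_WINDOW_SAMPLES 1152    /* assumed span of a long block in samples */
#define MDCT_LINES           576    /* frequency lines per granule */

/* Duration of a long window in milliseconds. */
double long_window_ms(double fs_hz)
{
    assert(fs_hz > 0.0);
    return 1000.0 * LONG_WINDOW_SAMPLES / fs_hz;
}

/* Width in Hz of four adjacent frequency lines, the step in which the
 * band widths can grow according to the message above. */
double band_granularity_hz(double fs_hz)
{
    assert(fs_hz > 0.0);
    return 4.0 * (fs_hz / 2.0) / MDCT_LINES;
}
```

`long_window_ms(32000.0)` gives 36 ms, and `band_granularity_hz` gives about 111.1, 153.1 and 166.7 Hz for 32, 44.1 and 48 kHz.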
Re: [MP3 ENCODER] various questions (was: Some suggestions for LAME- please review)
:: So the highest subbands don't have any scalefactor? I know that
:: Brandenburg said that there is no proof that 16kHz really contribute to
:: the hearing of the music, and then it could be intentional, but could
:: it be a "bug" or mistake in the mp3 specs?

40 Hz...16 kHz (+0.2 dB, -0.3 dB)°) seems not to be enough to pass AB tests. 25 Hz...18 kHz seems to be sufficient, and 20 Hz...20 kHz is recommended.

These are values for a monotonically decreasing frequency response. Using a slight emphasis from fu-1.5 kHz to fu significantly reduces the bandwidth needed. The easiest way to do this is a static frequency response like:

  12.5 kHz    0.0 dB
  13   kHz   -0.2 dB
  13.5 kHz    0.0 dB
  14   kHz   +0.2 dB
  14.5 kHz   +0.4 dB
  15   kHz   +0.6 dB
  15.5 kHz   +0.8 dB
  16   kHz   +1.0 dB
  16.5 kHz   -oo  dB

The frequency response should be calculated so that the cochlea excitation by white noise is not changed. This should be possible for fu = 14 kHz. Better methods calculate this pre-emphasis dynamically from the actual signal.

I've tested the first method with fs = 29.4 kHz and got a signal nearly indistinguishable from the original, whereas classical low-pass filtering with fu = 0.45*fs = 13.2 kHz results in "poor" quality.

To my mind 16 kHz is enough for music. Using some emphasis tricks makes this statement more secure.

Does someone have a piece of music with a triangle? For my experiments I still need some very tonal high frequency samples.

:: After all, I think that in 48kHz encoding some freq higher than 16kHz got a
:: scalefactor, so it could theoretically be possible to affect a scalefactor.

No. The scaleband assignments are different for 32/44.1/48 kHz, so you get 16 kHz for all fs.

              Long Blocks      Short Blocks
  32   kHz:   ...15.25 kHz     ...14.92 kHz
  44.1 kHz:   ...15.96 kHz     ...15.50 kHz
  48   kHz:   ...15.96 kHz     ...15.62 kHz

-- 
Best regards,
Frank Klemm

PS: °) minimum requirements for the frequency response of studio equipment.
Re: [MP3 ENCODER] post-encoding mp3 amplification
On Fri, Sep 22, 2000 at 11:22:43AM -0600, Mark Taylor wrote:

Is it theoretically possible to amplify the sound in a mp3 file without reencoding it? What would be the quality loss of this operation?

Yes: modifying the global_gain field in each channel of each granule in a Layer III frame has the effect of amplifying or diminishing the decoded signal in increments of 1.5 dB. There is apparently no quality loss associated with this change as long as the signal is not amplified so much as to cause clipping.

This utility will do exactly that: http://www.chat.ru/~lrsp/English/index.html

I can't download any file. The download starts at 5 KB/s and slowly decreases to 50...100 B/s. Time-out after 50...100 KB.

-- 
Frank Klemm
Re: [MP3 ENCODER] post-encoding mp3 amplification
:: Is it theoretically possible to amplify the sound in a mp3 file without
:: reencoding it? What would be the quality loss of this operation?
::
:: Yes: modifying the global_gain field in each channel of each granule in a
:: Layer III frame has the effect of amplifying or diminishing the decoded
:: signal in increments of 1.5 dB.
::
:: There is apparently no quality loss associated with this change as long
:: as the signal is not amplified so much as to cause clipping.

Clipping is not a quality loss problem of the MP3 either. It is only a problem of a decoder that decodes an MP3 to a fixed-point PCM file. These decoders have problems with clipping at high levels and with quantization noise at low levels. MP3 can handle a dynamic range of up to 400 dB.

-- 
Best regards,
Frank Klemm
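A small sketch of the arithmetic behind the 1.5 dB steps: in Layer III the decoded amplitude scales with 2^(global_gain/4), so one step is 20*log10(2)/4, about 1.505 dB, and global_gain is an 8-bit field (0...255). The function names below are invented; the sketch only computes new field values and does not touch a bitstream.

```c
#include <assert.h>

#define DB_PER_STEP 1.505149978     /* 20*log10(2)/4: one global_gain step in dB */

/* Nearest whole number of global_gain steps for a wanted gain change in dB. */
int gain_steps_for_db(double db)
{
    double steps = db / DB_PER_STEP;
    return (int)(steps >= 0.0 ? steps + 0.5 : steps - 0.5);
}

/* New value for one 8-bit global_gain field, or -1 if the result would
 * leave the range 0...255 and the change cannot be applied losslessly. */
int adjust_global_gain(int global_gain, int steps)
{
    assert(global_gain >= 0 && global_gain <= 255);
    int g = global_gain + steps;
    return (g < 0 || g > 255) ? -1 : g;
}
```

A request for +6 dB maps to 4 steps (exactly a factor of 2 in amplitude), and -3 dB maps to -2 steps. The 255 available steps span roughly 384 dB, which is of the same order as the 400 dB figure in the message.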
Re: [MP3 ENCODER] Live-XMMS 1.0.0 and Lame 3.87
:: Yes, XMMS is playing it in real-time. The live-xmms is a plugin that
:: should be getting the decoded output in real time (as it's played).
::
:: Without the plugin enabled, the decoder (in XMMS) only takes very little
:: CPU (doesn't even show up in top most of the time).

top displays a lot of nonsense for RT tasks. You can write RT programs that use 50% of the CPU power, and top still says the program generates a load below 1%.

-- 
Best regards,
Frank Klemm
Re: [MP3 ENCODER] Some suggestions for LAME - please review
On Sat, Sep 23, 2000 at 10:24:58AM -0600, Mark Taylor wrote:

There is one thing I would like to do, but the work in LAME seems to never end :-) A variant on MP3 which uses everything from LAME, but with the following changes:

I would not call it MP3. A distinct name (MP4 or MPEG-%f Layer IV) would be much better, so as not to confuse millions of people.

1. go to transform sizes 1024 and 128

MP3 uses 576 and 192. If 576 is too short for tonal music and 192 too long for percussion, then this is right. But a 1:8 ratio can create other problems. Note that MD uses 128, 256, 512 and 1024 sample blocks. Block sizes from 1 ms ... 35 ms are useful.

2. replace polyphase filterbank + MDCT with one large MDCT

Okay.

3. allow mid/side stereo to be turned on/off for each critical band.

I would suggest another model. A frame can contain:

  1) Spectral coding of Channel 1 (1000 bit)
  2) Global Vectorizer (1 or 9 bit)
  3) 21 Critical Band Vectorizer (3 bit, 24 bit, 45 bit, ... 150 bit)
  4) Spectral coding of Channel 2 (1000 bit)

The needs are:

                             1     2     3     4
  Mono:                      M     0
  Unbalanced Mono:           M'    x
  Lowest quality stereo:     M'    x
  Intensity stereo:          M'          x
                             M'    x     x
  Classic Joint Stereo:      M     0           S
                             L     128         R
  Classic Stereo:            L     128         R
  Enhanced Joint Stereo:     M'    x           S'
                             M'    x     x     S'

$$\begin{pmatrix} L \\ R \end{pmatrix} = \mathrm{CBV}(f) \cdot \mathrm{GV} \cdot \begin{pmatrix} \mathrm{CH1} \\ \mathrm{CH2} \end{pmatrix}$$

Mono:
    Classic Mono

Unbalanced Mono:
    Mono signals are often not well balanced, so classic matrix coding gives a lot of side band signal. This mode avoids this.

Lowest quality stereo:
    Interviews with 2 speakers. With 600 bps you can code the direction of the active speaker. Not very good, but better than mono.

Intensity stereo:
    Direction coding, but now independent for every CB. So bass drum, voice and percussion can have different directions.

Classic Joint Stereo:
    (L+R)/sqrt(2) and (L-R)/sqrt(2), or L and R, are coded.

Classic Stereo:
    L and R are coded.
Enhanced Joint Stereo:
    a) calculate r and a of every CB
    b) if r is near +/-1.0 and all a are nearly equal, use the Global Vectorizer to extract the information
    c) depending on the r's, use 0...7 bit to code the remaining a as Critical Band Vectorizer
    d) matrix the signal depending on the Global Vectorizer + Critical Band Vectorizer; you get CH1(f) and CH2(f)
    e) code CH1
    f) code CH2

4. Better sfb21 handling.

5. Spectral prefiltering to get nearly constant ATH in every CB.

6. Using Bit 15..12 = as an additional bitrate (384 kbps). In this case the maximum granule size must also be increased.

This would be a new standard, but it could only be an improvement over the current LAME/MP3.

A question: When improving MP3, why not use AAC? A new standard is nice for programmers, but bad for users.

MPEG on the other hand spends a lot of effort (maybe even too much effort?) on "noise shaping".

This I don't understand. What does MPEG noise shaping do? I haven't found any useful documentation, so I think of:

  * MP3 can code tonal sounds very well: there are some spectral peaks to
    code and a lot of masked signal
  * For noisy signals MP3 must code nearly every spectral line, resulting
    in nearly no savings
  * The human ear can't distinguish noises with nearly the same spectral
    power density, so it is sufficient to code a noise signal with a
    similar but different SPD.

-- 
Frank Klemm
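The "Classic Joint Stereo" matrix mentioned in the message, (L+R)/sqrt(2) and (L-R)/sqrt(2), is orthogonal and symmetric, so the same operation converts L/R to M/S and back. A minimal sketch with invented names:

```c
#include <assert.h>

#define INV_SQRT2 0.70710678118654752440    /* 1/sqrt(2) */

typedef struct { double a, b; } pair_t;

/* (L, R) -> (M, S) with M = (L+R)/sqrt(2), S = (L-R)/sqrt(2).
 * Applying it to (M, S) gives (L, R) back, up to rounding. */
pair_t ms_matrix(double x, double y)
{
    pair_t out;
    out.a = (x + y) * INV_SQRT2;
    out.b = (x - y) * INV_SQRT2;
    return out;
}

static int close_to(double x, double y)
{
    double d = x - y;
    return d < 1e-12 && d > -1e-12;
}

/* Round-trip check: matrixing twice returns the input. */
void ms_selfcheck(void)
{
    pair_t ms = ms_matrix(0.25, -0.75);
    pair_t lr = ms_matrix(ms.a, ms.b);
    assert(close_to(lr.a, 0.25) && close_to(lr.b, -0.75));
}
```

For a perfectly balanced mono signal (L equal to R) the side channel is exactly zero. As soon as the two channels differ in level, S is no longer zero, which is the situation the "Unbalanced Mono" mode in the message is meant to address.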
Re: [MP3 ENCODER] latest sfb question
:: :: The whole sfb21 thingy is a kludge, we should extend psymodel.c
:: :: to calculate maskings for that band too.
::
:: the ATH is so large in that band, I would be afraid that
:: any computed maskings would always be ATH, and thus
:: not worth computing?
::
:: Wouldn't it be possible to use the ATH value as masking (as now), but use
:: this value in the masking computing? It should prevent to lower a lot the
:: overall masking for sfb21 when using something like --athlower 100. After
:: all in the others sfb when using --athlower 100 the whole masking is not
:: reduced, only the ATH. I'm afraid that now, using --athlower 100 would
:: result in an incredible amount of bits in the latest sfb. I'll test it and
:: post the results.

The main problem of masking at low SPLs is that it is weaker than at higher SPLs. At 30 dB(A) the low frequency and the high frequency slopes are nearly identical. At low levels the masking is higher relative to the masking signal, but it falls off much faster towards high frequencies.

[ASCII diagram: masking curves for maskers at 90 dB(A), 55 dB(A) and 40 dB(A)]

-- 
Best regards,
Frank Klemm
Re: [MP3 ENCODER] castings
:: 1) one type is long double, the other will be casted to long double
:: 2) one type is double, the other will be casted to double
:: 3) one type is float, the other will be casted to float

Fully wrong.
The rest I haven't checked.

-- 
Best regards,
Frank Klemm
Re: [MP3 ENCODER] castings
:: Albert you are right, but this shows that it is necessary to be
:: resolved, not casted.

Compile programs with gnatmake, not with gcc ;-)

-- 
Best regards,
Frank Klemm

PS: Ada programs are compiled with gnatmake. Make functionality is part of Ada itself.
Re: [MP3 ENCODER] latest sfb question
Emacs Settings
~~~~~~~~~~~~~~

So, you can either get rid of GNU emacs, or change it to use saner values. To do the latter, you can stick the following in your .emacs file:

    (defun linux-c-mode ()
      "C mode with adjusted defaults for use with the Linux kernel."
      (interactive)
      (c-mode)
      (c-set-style "K&R")
      (setq c-basic-offset 4))

This will define the M-x linux-c-mode command. When hacking on a module, if you put the string -*- linux-c -*- somewhere on the first two lines, this mode will be automatically invoked. Also, you may want to add

    (setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
                                auto-mode-alist))

Jed Settings
~~~~~~~~~~~~

Put a "/* -*- mode: C; mode: fold -*- */" at the beginning of the file.

    %---------------------------------------------------------------
    % C-mode variables:
    %---------------------------------------------------------------
    C_INDENT           = 4;  % amount of space to indent within block.
    C_BRACE            = 4;  % amount of space to indent brace
    C_BRA_NEWLINE      = 0;  % If non-zero, insert a newline first before inserting
                             % a '{'. Many C programmers like this to be 0. A zero
                             % value will force '{' to be on same line as insertion.
                             % The jed source code uses 1 for this variable.
                             % Note that in C mode, the keys '{' and '}' are bound
                             % to the commands 'brace_bra_cmd' and 'brace_ket_cmd'
                             % respectively.
    C_Colon_Offset     = 0;  % Controls the indentation of case statements.
    C_CONTINUED_OFFSET = 4;  % This variable controls the indentation of statements
                             % that are continued onto the next line.
Re: [MP3 ENCODER] castings
:: Hi Frank,
::
:: :: :: 1) one type is long double, the other will be casted to long double
:: :: :: 2) one type is double, the other will be casted to double
:: :: :: 3) one type is float, the other will be casted to float
:: :: Fully wrong.
:: :: The rest I haven't checked.
::
:: Does it mean that you've tested the above?
:: If yes - how did you test this to be wrong? what is the compiler,
:: optimization options?

First

  a) char, signed char, short      => int
  b) unsigned char, unsigned short => unsigned int
  c) float                         => double

Second

  d) still different types?

              ldouble  double   ulong    long     uint     int
     ldouble  ldouble  ldouble  ldouble  ldouble  ldouble  ldouble
     double   ldouble  double   double   double   double   double
     ulong    ldouble  double   ulong    ulong    ulong    ulong
     long     ldouble  double   ulong    long     ulong    long
     uint     ldouble  double   ulong    ulong    uint     uint
     int      ldouble  double   ulong    long     uint     int

float op float gives a double.

Note that a lot of compilers modify rules c) and d) and use instead:

  c') float, double => long double

  d') still different types?

              ldouble  double   ulong    long     uint     int
     ldouble  ldouble  ldouble  ldouble  ldouble  ldouble  ldouble
     double   ldouble  ldouble  ldouble  ldouble  ldouble  ldouble
     ulong    ldouble  ldouble  ulong    ulong    ulong    ulong
     long     ldouble  ldouble  ulong    long     ulong    long
     uint     ldouble  ldouble  ulong    ulong    uint     uint
     int      ldouble  ldouble  ulong    long     uint     int

On machines computing with 80-bit IEEE floats, rounding costs huge amounts of CPU time. But ANSI C forces this silly rounding by standard.

From the point of view of code optimization, C was a great milestone in the 70's; in the 2000's it is a millstone round the compiler builder's neck.
Problems: * nearly no infrastructure * problems with 64 bit (problems with pointer = array[index] transformation) * typical C structures are not very good for super scalar machines * pointer aliases avoid a lot of deep optimization on super scalar machines * low level standardization (nearly no demands on the hardware, so also 13 bit CPUs with 1's complement and really strange arithmetic (a+b-b!=a on overrun) can be supported). -- Mit freundlichen Grüßen Frank Klemm eMail | [EMAIL PROTECTED] home: [EMAIL PROTECTED] phone | +49 (3641) 64-2721home: +49 (3641) 390545 sMail | R.-Breitscheid-Str. 43, 07747 Jena, Germany -- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
Re: [MP3 ENCODER] castings
:: Albert Faber wrote on Sun, 17 Sep 2000:
:: Robert,
:: So if i have the following piece of code
::
::    int my_signed = -1;
::    unsigned int my_unsigned = 10;
::
::    if (my_signed > my_unsigned)
::        printf("my_signed > my_unsigned\n");
::    else
::        printf("my_signed is <= my_unsigned\n");
::
:: It should print: "my_signed > my_unsigned\n" according to your implicit
:: casting rules, which can cause horrible unexpected run-time problems also.
::
:: yes, it does!

There is a package out there called LCLint. It finds a lot of such standard C bugs. It is free.

Problems:
  * No C++
  * Too sloppy
  * You must use this tool from the very beginning, otherwise you get several hundred to several thousand warnings per file. A lot of them are typical C pitfalls waiting to become active ...

Example:

    $ cat hello.c
    #include <stdio.h>

    int main ( /*@unused@*/ int argc, /*@unused@*/ char** argv )
    {
        int           my_signed   =  -1;
        unsigned int  my_unsigned = +10;

        if ( my_signed > my_unsigned )
            printf ( "my_signed > my_unsigned\n" );
        else
            printf ( "my_signed <= my_unsigned\n" );
        return 0;
    }
    $ lint hello.c
    LCLint 2.4b --- 18 Apr 98

    hello.c: (in function main)
    hello.c:8:10: Operands of > have incompatible types (int, unsigned int):
                     my_signed > my_unsigned
      To ignore signs in type comparisons use +ignoresigns

    Finished LCLint checking --- 1 code error found
    $ _

The "/*@unused@*/" is there to suppress the two extra messages in:

    $ lint hello.c
    LCLint 2.4b --- 18 Apr 98

    hello.c: (in function main)
    hello.c:8:10: Operands of > have incompatible types (int, unsigned int):
                     my_signed > my_unsigned
      To ignore signs in type comparisons use +ignoresigns
    hello.c:3:16: Parameter argc not used
      A function parameter is not used in the body of the function. If the
      argument is needed for type compatibility or future plans, use /*@unused@*/
      in the argument declaration. (-paramuse will suppress message)
    hello.c:3:29: Parameter argv not used

    Finished LCLint checking --- 3 code errors found

-- Frank Klemm
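The pitfall reported by LCLint above can be reproduced in isolation. The following is only a minimal sketch; the function names `naive_greater` and `safe_greater` are made up for illustration and are not part of LAME or LCLint. It assumes the usual case where `int` and `unsigned int` have the same width, so that the `int` operand is converted to `unsigned int` before the comparison.

```c
/* Naive comparison: the int operand is implicitly converted to
   unsigned int, so -1 becomes UINT_MAX before it is compared. */
int naive_greater(int s, unsigned int u)
{
    return s > u;
}

/* Sign-aware comparison: a negative value is never greater than an
   unsigned value; for non-negative values the cast loses nothing. */
int safe_greater(int s, unsigned int u)
{
    return s >= 0 && (unsigned int)s > u;
}
```

With `s = -1` and `u = 10`, the naive version reports "greater" while the sign-aware version does not, which is exactly the surprise the lint warning is meant to catch.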
Re: [MP3 ENCODER] latest sfb question
:: Frank, that's not what Gaby is talking about.
:: But if you are talking about the spreading function, there
:: are more parameters than loudness:
::   - frequency
::   - tonality
::   - temporal effects
::   - difference tones reducing masking

You've forgotten one:
  - frequency response of the audio equipment and the listening room, especially interference and reverberation.

See the remarks in the AAC listening test.

-- Frank Klemm
Re: [MP3 ENCODER] castings
On Sun, Sep 17, 2000 at 11:44:01PM +0200, Albert Faber wrote:

Let's take:

    float  x = 1.5;
    long   y = 1234567890;
    double z = x * y;
    printf ("%30.12f\n", z);

 1) one type is long double, the other will be casted to long double
    Not fulfilled.
 2) one type is double, the other will be casted to double
    Not fulfilled.
 3) one type is float, the other will be casted to float
    Fulfilled: Convert (long)1234567890 to (float)1234567936.
 4) one type is char, short int, enum or bitfield, the other will be converted to int, if int can represent all values of the original type; if not, the other type will be converted to unsigned int
    Not fulfilled.
 5) one type is unsigned long, the other will be unsigned long
    Not fulfilled.
 6) one type is long int and the other is unsigned int, then the unsigned int type will be converted to long int, if long int can represent all values of unsigned int. If this is not the case, then both are converted to unsigned long int
    Not fulfilled.
 7) one type is long, the other will be converted to long
    Not fulfilled.
 8) one type is unsigned, the other will get unsigned
    Not fulfilled.
 9) is none of the above cases true, then both are of type int
    Not fulfilled.

So the code should be equivalent to:

    double z = (float)1.5 * (float)1234567936;
    printf ("%30.12f\n", z);

which does not just look like nonsense. gcc also does a lot of very strange things:

    #include <stdio.h>

    float x       = 1.5;
    long  y       = 1234567890;
    float y_float = 1234567890.0;
    float a       = 1.e32;

    int main (void)
    {
        double z = x * y;

        printf ( "%30.12f\n", z );
        printf ( "%30.12f\n", (float)x * (float)y );    // these two lines
        printf ( "%30.12f\n", (float)x * y_float );     // give a different result on gcc
        printf ( "%30.12f\n", (double)x * (double)y );
        printf ( "a*a is %g, sizeof(a*a) is %u\n", a*a, (unsigned)sizeof(a*a) );  // ;-)
        return 0;
    }

Maybe Ritchie made an arrangement with the FORTRAN compiler builders to save their jobs for the next 50 years ;-)

-- Frank Klemm

Note: The background is that a float can't store all values of a signed long or an unsigned long. The range in which a float represents every integer exactly is only -16777216...+16777216. For the same reason 64 bit ints should be auto casted to long double, not to double.
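The rounding mentioned in the note can be checked with a few lines of C. This is a sketch under the assumption that `float` is an IEEE 754 single (24 bit significand) and `double` an IEEE 754 double; the helper names are invented for illustration. The `volatile` store is there to force the value down to real float precision even on compilers that keep intermediates in wider registers.

```c
/* Store a long in a float and read it back, to see what survives. */
long through_float(long v)
{
    volatile float f = (float)v;   /* volatile: really round to float */
    return (long)f;
}

/* The same product computed in double, which holds both operands exactly. */
double product_in_double(float x, long y)
{
    return (double)x * (double)y;
}
```

`through_float(1234567890L)` yields 1234567936, the value quoted in the message, because floats near 2^30 are spaced 128 apart. `product_in_double(1.5f, 1234567890L)` gives 1851851835 exactly.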
Re: [MP3 ENCODER] castings (OT)
On Mon, Sep 18, 2000 at 08:17:46PM +0200, Robert Hegemann wrote:

    a) char, signed char, short       => int
    b) unsigned char, unsigned short  => unsigned int
    c) float                          => double

So your Compiler/target CPU has only an affinity for some elementary types. This is very special and different on other Compilers/CPUs.

It's taken from: Kernighan/Ritchie: Programming in C. It's not a compiler, but a book. The German translation is *different* from the original. Some additional remarks are taken from the ANSI Standard (1990).

Second:

    d) still different types?

                 ldouble  double   ulong    long     uint     int
        ldouble  ldouble  ldouble  ldouble  ldouble  ldouble  ldouble
        double   ldouble  double   double   double   double   double
        ulong    ldouble  double   ulong    ulong    ulong    ulong
        long     ldouble  double   ulong    long     ulong    long
        uint     ldouble  double   ulong    ulong    uint     uint
        int      ldouble  double   ulong    long     uint     int

float op float gives a double.

Silly, you are saying above that it is fully wrong and then telling the same at the bottom.

If it were the same, it should give the same result. But it gave different results. Also, tables are much nicer. Table lookup is faster and more error proof than text parsing ;-)

-- Frank Klemm
Re: [MP3 ENCODER] Time-stretching (off topic)
On Thu, Sep 14, 2000 at 09:48:04PM +0200, Markus Fick wrote:

In the case you want to change the sampling frequency WITHOUT changing the duration of the sound (or changing the sound duration without following the ratio of original to replay sampling rate) the following link is a good point for information (incl. source code): http://www.dspdimension.com/html/pscalestft.html

Note that pitch shifting without artefacts can only be done by integer factors. The more commonly used factors around 0.8...1.0 generate amplitude modulation of your input signal, inharmonic distortion and also pre-echo problems, which can be reduced by a variable analysis window size.

-- Frank Klemm
[MP3 ENCODER] MP3 DECODER: Current frame channel type
Can someone add a feature to the decoder? It should display the current stereo coding (MS vs. LR). Best would be to use code like this:

    fprintf ( stderr, " %s " , frametype ? "M " : " S" );

So you have a blinking "light".

Another proposal: The width of the resampling filter could be modified by the -q settings.

    -q9   linear filtering
    -q7    5 point FIR filter, Hamming
    -q5    9 point FIR filter, Hann
    -q3   19 point FIR filter, BM
    -q0   79 point FIR filter, BM

Is there a special reason why oversampling is done by 4 point interpolation instead of the sinc interpolator?

-- Frank Klemm
Re: [MP3 ENCODER] min bitrate and bit reservoir (was: MS switching)
:: Frank, you are missing a third option:
::
:: PROJECT_MAINTAINER_DOES_NOT_CARE_AND DOES_NOT_WANT_THIS_CODE_IN_LAME
::
:: The display is updated every 50 frames. It is simple
:: and works well enough.
::
:: I want this type of code kept to an absolute minimum since this kind
:: of stuff really belongs in the front end. Most GUIs which use LAME
:: already do their own status display. I dont want to see even more
:: timing code in lame.c.

Then move it to a separate function and a separate file. Then you don't pollute the coder functionality with things like that. And functions become shorter. Programmers may like large functions, but maintainers do not.

-- Frank Klemm
Re: [MP3 ENCODER] min bitrate and bit reservoir (was: MS switching)
:: :: This also prevents lame from update frequencies in the range of 50...60 Hz
:: :: (resulting in additional hum!) on Athlon 1000 systems coding mono, low
:: :: quality MPEG-2 files. Sounds worse.
::
:: Are you saying (I hope say is correct) that on fast systems, Lame display
:: changes the result of the mp3? If this is the case, I think that we MUST
:: find why.

Not the result of the MP3, but when listening to FM radio you can hear what the computer is doing, especially with computers whose case is open. The video card in particular seems to generate lots of noise when blitting images.

:: :: Otherwise note, that the old behave updates display every 4.5 minutes
:: :: (386/40 with Cyrix copro, 8 years old) or 8.5 hours (386SX/16 without
:: :: copro, 10 years old).
::
:: off topic: is there anyone on this planet encoding mp3 with a 386? The
:: slowest thing I personally used for mp3 encoding was dx4-100, and it was
:: really awful.

Yes, me. I measured the time to generate one frame. These are two slow firewall computers. The 386SX/16 has 8 MB of RAM, the 386DX/40 16 MB. Lame works on them.

:: :: Another question:
:: :: What is ETA? A basque terror organization? I don't found it in any printed
:: :: dictionary, also not in the big webster I bought in England. But you find
:: :: nearly all C keywords in it ;-)
::
:: There are 2 indications of the remaining time in Lame: one for the remaining
:: time for the Lame process, and the other for the overall system remaining
:: time. They could be different if you're using a lot of simultaneous tasks
:: on your computer. But I don't remember which one is which one, and I'm not
:: sure if it works on every OS.

The CPU time display shows CPU time, i.e. time measured in real CPU clocks. The REAL time display shows real-world time, which is what normal humans are really interested in. The latter is the remaining real-world time until the program is finished.

-- Frank Klemm
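The difference between the two "remain" figures can be sketched in a few lines. This is an assumption-laden illustration, not LAME's code: the message does not say how the estimate is computed, so the sketch simply extrapolates linearly from the frames already encoded, and `estimate_remaining` is an invented name.

```c
/* Linear extrapolation of the time still needed (an assumption,
   not taken from LAME).
   elapsed_s    : seconds spent so far (CPU time or wall-clock time)
   frames_done  : frames already encoded
   frames_total : frames expected in total */
double estimate_remaining(double elapsed_s,
                          unsigned long frames_done,
                          unsigned long frames_total)
{
    if (frames_done == 0 || frames_done >= frames_total)
        return 0.0;   /* nothing measured yet, or already finished */
    return elapsed_s * (double)(frames_total - frames_done)
                     / (double)frames_done;
}
```

Feeding the function elapsed CPU time (for example from `clock()` divided by `CLOCKS_PER_SEC`) gives the CPU estimate; feeding it elapsed wall-clock time (for example from `time()`) gives the REAL estimate. The two diverge as soon as other tasks compete for the processor.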
Re: [MP3 ENCODER] Percentages (informational)
:: Frank Klemm wrote:
::
:: :: The following outputs have the following meanings:
:: ::
:: ::    [   ]    p = 0.00%, never used
:: ::    [%..]    0.00% <  p <  0.01%
:: ::    [%.0]    0.01% <  p <  0.05%
:: ::    [%.1]    0.05% <= p <  0.15%
:: ::    [%.2]    0.15% <= p <  0.25%
:: ::    ...
:: ::    [%.9]    0.85% <= p <  0.95%
:: ::    [ 1%]    0.95% <= p <  1.5%
:: ::    [ 2%]    1.5%  <= p <  2.5%
:: ::    [ 3%]    2.5%  <= p <  3.5%
:: ::    ...
:: ::    [99%]    98.5% <= p <  99.5%
:: ::    [100%]   99.5% <= p <  +oo
::
:: Why not just allocate more characters, like [0.13%] ?

That's also possible.

-- Frank Klemm
Re: Re[2]: [MP3 ENCODER] MS switching
This point is debatable. I am in the clan of the people using -B 256, and here is why: I chose to keep strict ISO compatibility (--strictly-enforce-iso), because if in the future I use a hardware player, I would be worried to have some of my mp3 becoming unplayable. The ISO standard specifies a strict limit for the data size (btw this limit was chosen too low in the standard), so 320k frames can't use the bit reservoir, and if there were some available bits in the reservoir when a 320k frame is used, the reservoir is definitely wasted. As I don't want to have too much unused space in my files, I chose to use -B 256, even if I might be sacrificing a little quality.

The 4 examples I brought were POSSIBLE information you have after coding a PCM file. These are possible scenarios after you have coded files. You've tried to do your very best and you see after coding that it wasn't.

Real world examples and exercises:

After coding with -V0 you got:

    Frame          |  CPU time/estim | REAL time/estim | play/CPU | remain
      8453/8453 (100%)|    6:54/6:54   |    6:55/6:55    |  0.5325x |  0:00
     32 [ 2%]
     40 [   ]
     48 [   ]
     56 [   ]
     64 [ 0%]*
     80 [ 0%]*
     96 [ 0%]*
    112 [ 1%]*
    128 [ 1%]***
    160 [ 7%]
    192 [32%]**
    224 [29%]**
    256 [15%]*
    320 [12%]***
    average: 219.0 kbps

Calculate the file increase for the options '-b160', '-b192' and '-b192 -F'. Is this tolerable? What do you expect about coding quality?

    Frame          |  CPU time/estim | REAL time/estim | play/CPU | remain
      8453/8453 (100%)|    7:44/7:45   |    7:45/7:45    |  0.4749x |  0:00
     32 [ 2%]
     40 [   ]
     48 [   ]
     56 [ 0%]*
     64 [ 0%]*
     80 [ 0%]*
     96 [ 1%]***
    112 [ 9%]***
    128 [18%]**
    160 [30%]**
    192 [22%]
    224 [10%]**
    256 [ 4%]***
    320 [ 3%]**
    average: 169.0 kbps

Calculate the file increase for the options '-b112', '-b128' and '-b128 -F'. Is this tolerable? What do you expect about coding quality?

In the past, when Lame was using strict iso, because of this reason the default for -V1 was -B 256. The Xing vbr encoder also limits frames to 256k in order to not waste bits. (I don't know about the FhG vbr).

FhG forces 48 kHz on 320 kbps. Maybe to bring the bits/granule ratio to the right place.

Also I extended the brhist function to display percentages between 0.05 and 1 more accurately. [%.5] can be read as 5 %. (%. = parts per thousand) or as .5 % = 0.5 %

-- Frank Klemm
Re: [MP3 ENCODER] --voice
On Mon, Sep 11, 2000 at 10:01:58PM -, Eric Howgate wrote:

Could someone satisfy my curiosity about this option (described as 'experimental' in the docs for ver 3.85, but I see that it is available in RazorLame)? Does it have fixed default parameters like the --preset voice option, and if so what are they? What is the thinking behind this option - audio books perhaps?

--voice does exactly the same as --preset voice. You get good voice quality at about 56 kbps. These are the best settings I found for voice at about 56 kbps. I took a poem and now I can recite the poem without any problem ;-)

The same can be done with: phone, voice, radio, tape, cd and studio.

-- Frank Klemm
Re: Re[4]: [MP3 ENCODER] MS switching
On Mon, Sep 11, 2000 at 07:12:24PM +0200, Robert Hegemann wrote:

ISO says maximum buffer size is 7680 bit:

    kHz       kbps     bpf     wasted
    ----------------------------------
     8         56      8064     384
    11.025     80°)    8359     679
    12         80°)    7680       0
    16        112      8064     384
    22.05     160      8359     679
    24        160      7680       0
    32        224      8064     384
    44.1      320      8359     679
    48        320      7680       0
    ----------------------------------
    kHz    kilo Hertz
    kbps   kilo bits per second
    bpf    bits per frame

That's why FhG switches to 48 kHz on 320 kbps. But:
  * A lot of players and soundcards have problems with 48 kHz
  * Upsampling of lame is worse (downsampling is good, but not very good)

-- Frank Klemm
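The bpf and wasted columns above follow from simple arithmetic, sketched here. Two things are assumptions read off the table rather than stated in the message: every row is computed with 1152 samples per frame (even the low sampling rates), and "wasted" is whatever exceeds the 7680 bit limit. The function names are invented for illustration.

```c
/* Bits in one frame, using 1152 samples per frame as the table does. */
long bits_per_frame(long bitrate_bps, long sample_rate_hz)
{
    return 1152L * bitrate_bps / sample_rate_hz;   /* integer division truncates */
}

/* Bits beyond the 7680 bit limit quoted in the message. */
long bits_wasted(long bitrate_bps, long sample_rate_hz)
{
    long excess = bits_per_frame(bitrate_bps, sample_rate_hz) - 7680L;
    return excess > 0 ? excess : 0;
}
```

For 320 kbps this gives 8359 bits per frame at 44.1 kHz (679 over the limit) and exactly 7680 at 48 kHz, which matches the last two rows.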
[MP3 ENCODER] RPC support
RPC support, SMP support and distributed computing would be a very nice thing for LAME and would be an outstanding feature. Are there any plans or interest to support this?

Not this year, but starting with these things in spring next year, not earlier. The core of lame is barely affected or not affected at all, only the outer layers of lame.

-- Frank Klemm
Re: [MP3 ENCODER] min bitrate and bit reservoir (was: MS switching)
There are two possibilities for the display update frequency:

1st) #define I_HAVE_NEVER_SEEN_LAME_ON_A_486_100_OR_A_ATHLON_1000
     and the display updates every 50 frames (MPEG-1) or 100 frames (MPEG-2). (Why this difference?)

2nd) Do not define I_HAVE_NEVER_SEEN_LAME_ON_A_486_100_OR_A_ATHLON_1000.
     You can also define MY_PREFERED_UPDATE_STEP with values like 1, 2, 5, 10, 20, 50, 100 (default is 1). The update interval is 2 seconds (if not modified by -disptime xxx) plus the time needed for the display (so no lockup is possible). In addition, frame number steps can be rounded to multiples of MY_PREFERED_UPDATE_STEP. But the display will not be updated more often than every 2 seconds (if not modified by -disptime xxx).

The standard value of 2 seconds is also debatable. It's a float, so personal preferences can be satisfied very accurately ;-)

This also prevents lame from update frequencies in the range of 50...60 Hz (resulting in additional hum!) on Athlon 1000 systems coding mono, low quality MPEG-2 files. Sounds worse.

Another remark. A display update after *every* frame reduces Lame speed by 0.8% for HQ settings (-q1 --studio -mj -v) and by 4% for lowest quality (-q9 --phone -mm -v) (K6-300). So I see no problem with performance drops even on the slowest system (on a 386 the display is updated after every frame with HQ settings, resulting in an additional speed impact). Otherwise note that the old behaviour updates the display every 4.5 minutes (386/40 with Cyrix copro, 8 years old) or 8.5 hours (386SX/16 without copro, 10 years old).

Another question: What is ETA? A Basque terror organization? I didn't find it in any printed dictionary, also not in the big Webster I bought in England. But you find nearly all C keywords in it ;-)

    eta: the 7th letter of the Greek alphabet (Η or η)
    Eta: Basque terror organization in Spain

Is this the so notorious tech speak of computers infected with sillycon (SP?) brains?

BTW: Silicon is one of the most enjoyable words when English reports are translated to German by incompetent translators. They translate silicon to the German word Silikon (in German the letter 'c' is mostly replaced by 'k'). But Silikon has the meaning of silicone, which makes people smile due to its usage in plastic surgery. The other one is translating American billions (10^9) and trillions (10^12) to Billionen (10^12) and Trillionen (10^18). You get a fantastic gross national product for the U.S.A. with values in the range of 10^10 US $ per citizen. But this is fully off-topic.
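The second update policy described in this message (never redraw more often than a fixed interval, and optionally only on "round" frame numbers) could look roughly like the sketch below. It is a hypothetical illustration: `should_update` and its parameters are invented names, and the actual LAME code is not shown in the message.

```c
/* Decide whether the progress display should be redrawn (illustrative).
   now_s      : current time in seconds
   last_s     : in/out, time of the previous redraw
   interval_s : minimum spacing between redraws (2.0 in the message)
   frame      : current frame number
   step       : preferred frame-number step; 1 disables the rounding */
int should_update(double now_s, double *last_s, double interval_s,
                  unsigned long frame, unsigned long step)
{
    if (now_s - *last_s < interval_s)
        return 0;                  /* too early: never faster than interval_s */
    if (step > 1 && frame % step != 0)
        return 0;                  /* wait for a multiple of the step */
    *last_s = now_s;               /* remember when we last redrew */
    return 1;
}
```

Because the decision depends on elapsed time rather than on a fixed frame count, a 386 and an Athlon both end up with roughly the same redraw rate.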
Re: [MP3 ENCODER] min bitrate and bit reservoir (was: MS switching)
On Tue, Sep 12, 2000 at 12:25:46PM +0200, Gabriel Bouvigne wrote:

In both cases I think (I didn't really calculate) the size increase should be something around 5%.

Wrong. For the slight increase ca. 0.1%, for the harder one ca. 1%.

Yes, it's tolerable, but I personally don't think that this would increase the overall quality by 5%, so I personally don't use any minimum bitrate.

There is something wrong in the low level range. I have two tools:

    mp3_loud:  You can increase the loudness of an MP3 by multiples of 1.5 dB
    resample:  You can generate HQ low level WAV files (side effect).

Example:

    $ resample input.wav 44100.0 input-60dB.wav 0.001
    $ lame input-60dB.wav output-60dB.mp3
    $ mp3_loud 32 output-60dB.mp3 output-60dB+48dB.mp3

Listen to output-60dB+48dB.mp3. CBR files are okay, VBR files are not. Both programs (pre alpha hacks) are 23 Kbyte large.

(but for classical music I use --noath). But I don't get your point about what you're trying to explain with those examples.

That I haven't tested. What do these options do? Aha! Another test suite. I need a computer farm for all the test suites. And a bundle of children for the hearing tests ;-)

Should we enable --noath for -V1 and -V0 by default?

FhG forces 48 kHz on 320 kbps. Maybe to bring the bits/granule ratio to the right place.

But in VBR mixed sampling rates are forbidden.

Also in CBR. But FhG doesn't mix sampling rates; it forces 48 kHz on 320 kbps. I didn't understand this until the phrase "4096 bit/granule" came up.

    $ mp3encdemo31 -v -qual 9 -br 256000 -if Maire.wav -of Maire_FhG_256.mp3
    * MPEG Layer-3 Encoder V3.1 Demo (build Sep 23 1998)
    * (C) 1998 by Fraunhofer IIS-A

    This program is protected by copyright law and international treaties.
    Any reproduction or distribution of this program, or any portion
    of it, may result in severe civil and criminal penalties, and will be
    prosecuted to the maximum extent possible under law.
    For further info, please visit http://www.iis.fhg.de/audio/

    this program is limited to encoding 30 seconds of audio data.
    in:  44100 Hz, 2 channel(s), 16 bit/sample
    out: 44100 Hz, 2 channel(s), 256000 bit/s
    full huffman search ON
    88.1 seconds running time
    encoding finished.

    $ mp3encdemo31 -v -qual 9 -br 320000 -if Maire.wav -of Maire_FhG_320.mp3
    * MPEG Layer-3 Encoder V3.1 Demo (build Sep 23 1998)
    * (C) 1998 by Fraunhofer IIS-A

    This program is protected by copyright law and international treaties.
    Any reproduction or distribution of this program, or any portion
    of it, may result in severe civil and criminal penalties, and will be
    prosecuted to the maximum extent possible under law.
    For further info, please visit http://www.iis.fhg.de/audio/

    this program is limited to encoding 30 seconds of audio data.
    in:  44100 Hz, 2 channel(s), 16 bit/sample
    out: 48000 Hz, 2 channel(s), 320000 bit/s
    full huffman search ON
    73.8 seconds running time
    encoding finished.
    $ _

This feature is enabled for -qual=7...9 and without any quality selector.

-- Frank Klemm
[MP3 ENCODER] Percentages (informational)
The following outputs have the following meanings:

    [   ]    p = 0.00%, never used
    [%..]    0.00% <  p <  0.01%
    [%.0]    0.01% <  p <  0.05%
    [%.1]    0.05% <= p <  0.15%
    [%.2]    0.15% <= p <  0.25%
    ...
    [%.9]    0.85% <= p <  0.95%
    [ 1%]    0.95% <= p <  1.5%
    [ 2%]    1.5%  <= p <  2.5%
    [ 3%]    2.5%  <= p <  3.5%
    ...
    [99%]    98.5% <= p <  99.5%
    [100%]   99.5% <= p <  +oo

May be too difficult for end users, but nice for development.

Another question would be, what is best:
  * Showing the percentage relative to the number of already coded frames
  * Showing the percentage relative to the number of expected frames

Both have advantages and disadvantages.

-- Frank Klemm
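The mapping above can be written as a short formatting routine. This is a sketch, not the actual brhist code: `format_percent` is an invented name, the empty field is assumed to be three spaces wide, and values falling exactly on a range boundary are assigned to the upper range, which the list does not fully specify.

```c
#include <stdio.h>

/* Format a percentage p into the compact bracket notation listed above.
   out must have room for at least 7 characters ("[100%]" plus NUL). */
void format_percent(double p, char *out, size_t n)
{
    if (p <= 0.0)
        snprintf(out, n, "[   ]");                           /* never used */
    else if (p < 0.01)
        snprintf(out, n, "[%%..]");                          /* tiny, but used */
    else if (p < 0.95)
        snprintf(out, n, "[%%.%d]", (int)(p * 10.0 + 0.5));  /* tenths of a percent */
    else if (p < 99.5)
        snprintf(out, n, "[%2d%%]", (int)(p + 0.5));         /* whole percent */
    else
        snprintf(out, n, "[100%%]");
}
```

For example, a bitrate used in 0.13% of all frames is shown as `[%.1]`, and one used in 31.7% of all frames as `[32%]`.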
Re: [MP3 ENCODER] min bitrate and bit reservoir (was: MS switching)
:: ETA - Estimated time of arrival.

Why does this time change continuously? Is the estimation so bad? I think it's not the estimated time of arrival, but the remaining time until arrival. RTUA?

-- Frank Klemm
Re: [MP3 ENCODER] Re: MP3 Format
Presets: There are three blocks which can be selected via conditional compiling.

    Block 2: use low fs
    Block 3: do not use sfb21 if possible.

Differences are:

                           Block 2       Block 3
    phon+ (0...4 kHz)      fs=8 kHz      fs=11 kHz
                           700 KB        610 KB
    voice (0...12 kHz)     fs=24 kHz     fs=32 kHz
                           1.8 MB        2.0 MB

So I suggest using block 1, which uses the fs with the lowest bitrate.

-- Frank Klemm
[MP3 ENCODER] VBR / CBR / ABR
                                                      CBR      VBR      ABR
    Can use different frame sizes                     no       yes      yes
    File size depends also on complexity of source    no       yes      no
    Can adapt bit demand by
      * different frame sizes                         no       yes      yes
      * use of bits from the pool                     yes      ?        no
    Size of the bit pool (bit)
      * MPEG-1:                                       4088     ?        ?
      * MPEG-2:                                       2040     ?        ?
    Frame size for 320 kbps
      / 32 kHz (bit)                                  11520    11520    11520
      / 44.1 kHz (bit)                                8360     8360     8360
      / 48 kHz (bit)                                  7680     7680     7680
    Effective frame size regarding the usage
    of the bit pool (bit)
      * ISO:                                          61440    ?        ?
      * common usage:                                 131072   ?        ?
    Size of the original PCM data (2*16 bit)          36864    36864    36864

What would be done in VBR/ABR if the needed frame size is between two allowed sizes? Emitting frames with a wobbling effective frame size, or also using a (small) bit pool and a more constant effective frame size?

-- Frank Klemm
[MP3 ENCODER] MS switching
Currently there are the following options to control channel frame coding:

    -ms   only LR frames allowed
    -mf   only MS frames allowed
    -mj   both are allowed

But there are no controls to tune the switching more finely. So, for instance, a switch could be added to set a penalty bitrate for the MS coding scheme:

    -mS 10    use MS coding if it saves 10 kbps
    -mS 20    use MS coding if it saves 20 kbps
    -mS 200   always use the LR coding scheme

The threshold can also be adjusted according to the average bitrate:

            -q1             -q2             -q5             -q7
     96     uses -mS 3      uses -mS 3      uses -mS 3      uses -mS 3
    112     uses -mS 6      uses -mS 6      uses -mS 6      uses -mS 6
    128     uses -mS 10     uses -mS 10     uses -mS 10     uses -mS 10
    144     uses -mS 15     uses -mS 15     uses -mS 15     uses -mS 200
    160     uses -mS 20     uses -mS 20     uses -mS 200    uses -mS 200
    192     uses -mS 30     uses -mS 200    uses -mS 200    uses -mS 200
    224     uses -mS 40     uses -mS 200    uses -mS 200    uses -mS 200
    256     uses -mS 50     uses -mS 200    uses -mS 200    uses -mS 200
    320     uses -mS 200    uses -mS 200    uses -mS 200    uses -mS 200

So the usage of MS frames becomes more and more unlikely for high bitrates. This is better than a sharp break.

-- Frank Klemm
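The proposed penalty could be applied as in the following sketch. Everything here is hypothetical: the -mS switch is only a proposal in this message, the function names are invented, and the lookup uses the -q1 column of the table above, with bitrates between two rows rounded up to the next row as a further assumption.

```c
#include <stddef.h>

/* Penalty in kbps that MS coding must save before it is chosen
   (values from the -q1 column of the proposal). */
int ms_penalty_q1(int bitrate_kbps)
{
    static const int table[][2] = {
        { 96,  3}, {112,  6}, {128, 10}, {144, 15}, {160, 20},
        {192, 30}, {224, 40}, {256, 50}, {320, 200}
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (bitrate_kbps <= table[i][0])
            return table[i][1];
    return 200;   /* beyond the table: effectively always LR */
}

/* Choose MS only if it saves at least penalty_kbps compared with LR. */
int prefer_ms(int lr_cost_kbps, int ms_cost_kbps, int penalty_kbps)
{
    return lr_cost_kbps - ms_cost_kbps >= penalty_kbps;
}
```

At 128 kbps a frame that needs 115 kbps in MS instead of 128 kbps in LR saves 13 kbps and would be coded MS; a saving of only 8 kbps would leave it LR.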
[MP3 ENCODER] Extending report
Is it possible to extend the report at the end?

    128.0 kbps      frames: 10000 total,  571 short,  8917 Mid/Side

    "%5.1f kbps\tframes: %5lu total, %4lu short, %5lu Mid/Side\n"

-- Frank Klemm
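A sketch of how the proposed format string could be used follows. The wrapper `format_report` and its parameter names are invented for illustration; only the format string itself comes from the message.

```c
#include <stdio.h>

/* Write the proposed end-of-run report line into out (at most n bytes).
   Returns what snprintf returns: the length the full line would have. */
int format_report(char *out, size_t n, double kbps,
                  unsigned long total, unsigned long n_short,
                  unsigned long n_ms)
{
    return snprintf(out, n,
        "%5.1f kbps\tframes: %5lu total, %4lu short, %5lu Mid/Side\n",
        kbps, total, n_short, n_ms);
}
```

The fixed field widths keep the columns aligned as long as the counters stay below 100000 frames (10000 short blocks); beyond that the fields simply grow.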
Re: [MP3 ENCODER] very lowand very high-quality settings
:: :: I thought the "-V0" versions are better. The coder has the ability to
:: :: increase bit rate on demand. There are only problems with the so called
:: :: silence detection which generates switching artefacts, so I added the -F
:: :: option.
::
:: Using -V0 the ATH is lowered (by 16 dB I think), so there are a lot less
:: analog silences than in cbr mode.
:: But could you explain why analog silence (and then switching to low bitrate
:: frames) could lead to switching artefacts? And did you ever encounter such
:: an artefact caused by analog silence processing?

A) Take an EBU test CD for low voltage converter linearity (for instance used by "Stiftung Warentest" to test CD players). It contains some music attenuated by 60 dB. Or take some WAV files with fade ins and fade outs and attenuate them by 60 dB (Dire Straits: Brothers in Arms, Private Investigations).

B) Code these files with lame:

    lame -V0 -b160 -q1      input-60.wav output.0.mp3
    lame -V4 -b160 -q1      input-60.wav output.4.mp3
    lame     -b160 -q1      input-60.wav output.c.mp3
    lame     -b160 -q1 -mm  input-60.wav output.m.mp3

C) Amplify input.wav, output.0.mp3, output.4.mp3, output.c.mp3 and output.m.mp3 by 54 dB.

D) Listen to these 5 amplified sound files.

    output.+54.wav:     noisy, no artefacts
    output.c.+54.mp3:   noisy, no artefacts
    output.m.+54.mp3:   much more noisy, no artefacts
    output.4.+54.mp3:   heaviest MP3 artefacts, combined with a lot of random muting
                        bitrate: 0:00-0:04   32 kbps (muted)
                                 0:02-6:36  160 kbps (with some mutes)
                                 6:36-7:00   32 kbps (muted)
    output.0.+54.mp3:   heavy MP3 artefacts, combined with random muting
                        bitrate: 0:00-0:01   32 kbps (muted)
                                 0:02-6:36  160 kbps
                                 6:57-7:00   32 kbps (muted)

Okay, this is not normal operation. But it's a test to show weaknesses of the coder. Tools are available to HQ-attenuate WAV files and to amplify MP3 files.

:: :: For very low audio levels I would prefer an adaptive low pass filter
:: :: and adaptive data rate. May be this will be added if sample_t becomes
:: :: float.
:: ::
:: ::   Level (1...5 kHz)   Lowpass (kHz)   Channel       Datarate
:: ::                                       separation
:: ::     -30 dB            24 kHz          1.0           min (b, 320)
:: ::     -40 dB            20 kHz          1.0           min (b, 256)
:: ::     -50 dB            18 kHz          1.0           min (b, 192)
:: ::     -60 dB            16 kHz          1.0           min (b, 128)
:: ::     -70 dB            12 kHz          0.99          min (b, 112)
:: ::     -80 dB             7 kHz          0.9           min (b, 96)
:: ::     -90 dB             4 kHz          0.5           min (b, 48)
:: ::    -100 dB             3 kHz          0.1           32
:: ::    -110 dB             2 kHz          0.01          32
::
:: Wouldn't this lead to some artefacts due to the change of available
:: frequencies?

It will sound a little bit muffled. But recordings without lots of distortion in this level region will have a lot of shaping noise in this range, so the encoder is confused by this noise.

:: And if it's not the case, why only lowpassing and not bandpassing?

Bandpassing is better, indeed. If available.

-- Frank Klemm
Re: [MP3 ENCODER] include-defines
:: I'm not too fond of your new naming-scheme for defines to check if a include
:: has already been included (__VERSION_H__ f.ex.)...
::
:: Generally this scheme is only for internal compiler defines and system
:: includes, and normal projects should not use leading and trailing underscores
:: in defines (esp. not two).
::
:: The old scheme is the more publicly accepted method (VERSION_H_INCLUDED).

These tokens should be similar. A file has the name:

    path/name.extension

Currently in use:

    1   __name_extension__
    2   name_extension
    3   name_extension_INCLUDED
    4   name_extension_INCLUDE
    5   name_DOT_extension
    6   somethingelse_H

Select one and apply it to ALL files. 3) seems to be a good choice. Anyone against this?

-- Frank Klemm
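For reference, scheme 3 applied to a header called version.h would look like the sketch below. The macro inside the guard is a placeholder invented for this example, not a real LAME definition. The C standard reserves identifiers that begin with two underscores (or an underscore followed by an uppercase letter) for the implementation, which is the technical reason behind the objection to scheme 1.

```c
/* version.h (contents are placeholders for illustration) */
#ifndef VERSION_H_INCLUDED
#define VERSION_H_INCLUDED

#define EXAMPLE_VERSION_MAJOR 3   /* hypothetical declaration */

#endif /* VERSION_H_INCLUDED */
```

Including the file twice is harmless: the second time, VERSION_H_INCLUDED is already defined and the preprocessor skips the body.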
Re: [MP3 ENCODER] LAME style guide, rule #1
On Sun, Sep 10, 2000 at 05:23:32PM -0600, Mark Taylor wrote:

LAME compiles on all modern OS's, and under dozens of C compilers. This fact alone means that LAME is effectively compliant C code. No additional "compliance" work is needed or wanted. If you disagree, start a discussion in mp3encoder and try to convince other developers.

I've tested several DSP boards (32 bit; lame has several 16 bit flaws, so 16 bit will never work) and the code doesn't work on any of them. You can compile the code without any problem, the program also runs, but the output has nothing to do with MP3.

Okay, C is a "guarantee nothing, compile everything" language, so it's very difficult to write something which is "compliant C code". Yesterday I compiled "The Old Man and the Sea", got only 3 warnings, but the program does not run. Strange ;-)

-- Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
:: How can you normalize without first scanning the entire file for the loudest
:: entry?

MP3 can be adjusted by multiples of 1.5 dB without any quality loss. So first encode and track the biggest amplitude, then adjust the MP3 in a second pass. You only have to increase one byte per frame.

-- Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
:: :: trimming and normalization can be done without first scanning the whole
:: :: file.
::
:: you'll have to tell me how!
:: For the beginning of the stream OK. But at the end, maybe there is 2 s of pure
:: digital silence and then another thing...

2 seconds of digital silence (350 KByte) can be stored in one variable (4 bytes). If, contrary to all expectations, this was only a gap, a corresponding number of silent MP3 frames can be emitted.

The interface must be changed a little, because a big pile of data must be emitted after such a silence pause. Instead of:

    char  mp3buffer [MAGICK_CONSTANT];

    mp3_encode ( global_context, left_pcm_data, right_pcm_data, number_of_samples,
                 mp3buffer, sizeof(mp3buffer) );      // argument orgies

the structure must be:

    typedef struct {
        char*   ptr;
        size_t  len;
    } buffer_t;         // avoid a lot of single variables lying around,
                        // group them by functionality

    buffer_t  mp3buffer;

    mp3buffer.len = 0;                          // can be enlarged on demand
    mp3buffer.ptr = malloc ( mp3buffer.len );

    mp3_encode ( global_context, pcmbuffer, mp3buffer );   // avoid argument orgies

:: And for normalization I don't see how at all, since you need to know the
:: min/max. Maybe with a circular buffer it would work...

MP3 is adjustable in multiples of 1.5 dB. This can be done after converting the whole PCM file. If the largest value was 19100, calculate:

    floor ( 4. * ld (32767/19100) ) = floor ( 4. * 0.77867 ) = floor ( 3.1147 ) = 3

Enlarge the scale factors by 3.

-- Frank Klemm
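The step calculation above can be sketched in C as follows. The function name `gain_steps` is hypothetical; the loop is meant to be equivalent to floor(4 * ld(32767/peak)) and multiplies by 2^(1/4) (about 1.5 dB) per step, so that no math library is required. How the resulting step count is written into the MP3 frames is not shown here.

```c
/* Number of lossless 1.5 dB gain steps (factor 2^(1/4) each) that fit between
 * the largest absolute sample value found and 16-bit full scale (32767).
 * Intended to match floor(4 * log2(32767 / peak)). */
int gain_steps(int peak)
{
    const double step = 1.189207115002721;   /* 2^(1/4), about 1.505 dB */
    double level = (double)peak;
    int steps = 0;

    if (peak <= 0)
        return 0;                /* silent input: nothing sensible to scale */

    while (level * step <= 32767.0) {
        level *= step;
        steps++;
    }
    return steps;
}
```

For the peak value 19100 from the mail this gives 3, the same result as the worked calculation.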
Re: [MP3 ENCODER] Correlation mid/side
:: | Sender: Steve Lhomme [EMAIL PROTECTED]
:: |
:: | Well, a pre-processor is what I'm programming. But you can't integrate it
:: | with lame since it can't work in real-time/pipe (for DC adjust, trimming,
:: | normalisation).
::
:: You should add some filter processing, for example notch filters for
:: 50+100 Hz and 60+120 Hz, a 2-4 band fully parametric EQ (gain, Q, base freq),
:: high-pass and low-pass variable-frequency first and second order filtering.

1st: before adding such functionality, lame should define an easy-to-use interface for such plug-ins, so that the programmer of a plug-in does not need to know anything about lame except a 2-page plug-in interface description. No changes to lame are necessary.

2nd: notch filters are a bad solution for removing hum. Use subtractive PLL hum filters; they are more effective and more gentle to the original signal.

:: For DC adjust I STILL strongly recommend a high-pass filter, it is used by
:: professionals.

For professionals, high-pass filters are usable; for ready-made CD tracks they are completely unusable. You need a little bit of the past of the track.

:: DC offset calculation over the track is a non-professional
:: solution, but mathematically right.

What is "mathematically right"? Mathematics is always right. Mathematical transformations have certain properties; some are nice and others are disturbing for a given purpose. And a lot of pairs of properties are mutually exclusive.

:: Finally, all that is mathematically right, should not be right in life.

To find a mathematical solution, fill out the sheet:

    aim (note: not a known solution, really the aim):
    additional conditions:

-- Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
:: :: | Sender: Steve Lhomme [EMAIL PROTECTED]
:: :: |
:: :: | Well, a pre-processor is what I'm programming. But you can't integrate it
:: :: | with lame since it can't work in real-time/pipe (for DC adjust, trimming,
:: :: | normalisation).
:: ::
:: :: You should add some filter processing, for example notch filters for
:: :: 50+100 Hz and 60+120 Hz, a 2-4 band fully parametric EQ (gain, Q, base freq),
:: :: high-pass and low-pass variable-frequency first and second order filtering.
:: ::
:: :: For DC adjust I STILL strongly recommend a high-pass filter, it is used by
:: :: professionals. DC offset calculation over the track is a non-professional
:: :: solution, but mathematically right. Finally, all that is mathematically
:: :: right, should not be right in life.
::
:: Well I know that. But in this case it is ;) And since I wanted to make something
:: that could normalize, then had the idea of trimming, and then had the one of
:: DC adjust. I think I'll go this way, since I have to scan the whole file for
:: the other processings...

Trimming and normalization can be done without first scanning the whole file.

-- Frank Klemm
Re: [MP3 ENCODER] Lame re-sampling bug?
On Fri, Sep 08, 2000 at 06:48:49PM +0200, Steve Lhomme wrote:
> Hum... And if 19 is a magic value, why didn't you use the following?
>
>     BLACKSIZE = 200
>     filter_l  = (BLACKSIZE - 19)
>
> | David: can you run the same test with a stencil 10x bigger?
> | To do this, change:
> |
> |     BLACKSIZE = 200    change in util.h
> |     filter_l  = 191    change in util.c

200 - 19 == 191 ???

-- Frank Klemm
-- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
Re: [MP3 ENCODER] Correlation mid/side
:: | | Anyway I think that the very low frequencies are used in music like
:: | | drum'n'bass with very good sound systems. The infra bass is something I
:: | | really like in clubs ;)
:: | | Since I want to encode files in good quality (maybe playable in a club)
:: | | I'd prefer to keep this and just remove the DC offset... I think it can be
:: |
:: | then remove everything under 5 Hz; you do not need these frequencies, or they
:: | are created in another way (by rhythm) from transients.
::
:: And why not 1 Hz? I'm sure you can make good mechanical effects at this
:: frequency ;) Or maybe 0.1 Hz? Well I only need/want to remove the 0 Hz... I
:: prefer to remain consistent with the original on low frequencies (higher
:: ones are another question).

I use a Legendre transformation instead of a Fourier transformation to remove this stuff. It removes such stuff much better than any frequency domain filtering, while best preserving the original signal.

Maybe such functionality should be programmed in a lame preprocessor (called lame++):

  * Legendre based filtering
  * Fourier based filtering
  * centering of the signal
  * detecting the best lame mode (-mm, -mj, -mf, -ms)
  *

-- Frank Klemm
Re: [MP3 ENCODER] problems with LAME CVS
:: // pfk :: // For 44.1 kHz :: // 1... 96 kbps: Mono better than ugly stereo °) :: -- // 97...159 kbps: Joint Stereo :: -- // 160...192 kbps: Force Joint Stereo bandwidth not enough for LR stereo, :but reducing switching artefacts :: // 193...kbps: Stereo enough bandwidth for LR stereo :: // :: // °) mostly prevent by automatic downsampling :: :: :: Why using forced joint by default? I'm personnaly againt it, as it leads to :: spacialization artifacts. :: bitrates are calculated for fs=44.1 kHz, scale for other fs 080 096 128 160 192 | | | | | Lame 3.87 pre-klemm --mj---ms my proposal ---mm---mjmfms--- differences ... 96 kbps: stereo makes no sense (for 44.1 kHz), this case is mostly prevented by automatic downscaling (currently it only occures at 8 and 16 kbps, limit is 17.4 kbps). Note: * this automatic can be disabled via -mj or -mf or -ms * this automatic only occures if the MP3 encoder can't lower the PCM data rate by a further decreasing the fs * or by forcing high sampling frequencies and low bit rates without forcing a special stereo mode 97...159 kbps: joint stereo needs the lowest data rate may be causing switching artefacts. 160...191 kbps: Most music is coded by 95% of MS frames, the resting 5% are not saving so much space. I've not checked the code, but switching LR - MS seems to result in additional bits. I've found only one piece of music so -mj saves more than 0.1 percent over -mj. Often -mj files are larger. 192... kbps: -ms as -mf also prevents switching artefacts. If the psycho accoustic model is correct and the noise shaping is done correctly, it depends on the correlation coefficent which is the best, r talks about the degree of the saving (r = +/-1: max, r=0 none) a high low data rate channel +0.5...+2 L+R L-R +2... oo LR oo...-2 LR -2...-0.5 L-R L+R -0.5...+0.5 RL "a" may be should not be calculated by the total signal, but by the signal splitted into several subbands. So -mf is the best for "a" = +/-0.5...2, -ms for the rest. 
Lame pre-klemm forces -ms for >= 160 kbps, so the question is: what are the problems with -mf, if -mf and -ms are nearly the same?

1st problem: weakness of the MS psychoacoustic model (?)
2nd problem: optimum quantization is much more difficult in MS than in LR, so for
             an r=0 signal the quantization noise of the MS signal is higher than
             in the LR model (for the best process both are the same on average)
3rd problem: music or parts of music with an "a" outside +/-0.5...2

Solutions:

1st problem: I have a lot of ideas, but
  * I have to work, lame is only a hobby
  * days on earth only have 24 h (Venus would be nice)
  * my English is bad
  * the literature I have is written in German and is printed on paper
  * explaining difficult problems via email in a foreign language is very
    time-consuming (30 min email = 2 min talk). Mostly I take a pencil and a
    stack of paper to explain things, and this totally fails for email.
  * I can't read a lot of the lame code without having the tendency to press Metar

2nd problem: Maybe not so difficult, but CPU time consuming. A remainder of
             0.5 dB doesn't play such a big role.

3rd problem: This is exactly the same problem -ms has with the complement, and
             that happens much, much more often. So if this is a problem, -mj is
             the answer and not -ms or -mf. But 160 kbps should have enough
             reservoir to cover this problem. For -ms I doubt that this is the
             case. See ^^^

Also note that "a" tends to slip from big positive values directly to the opposite negative values. See also the next e-mail.

-- Frank Klemm
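The proposed bit rate ranges can be written down as a small lookup. This is only a sketch: the enum and the function name `default_stereo_mode` are hypothetical, the limits are the ones from the "pfk" comment quoted in the mail (96, 159 and 192 kbps, valid for fs = 44.1 kHz), and the scaling for other sampling frequencies mentioned in the mail is left out.

```c
/* Stereo modes, named after the lame switches used in the mail. */
typedef enum {
    MODE_MONO,          /* -mm */
    MODE_JOINT,         /* -mj */
    MODE_FORCED_JOINT,  /* -mf */
    MODE_STEREO         /* -ms */
} stereo_mode_t;

/* Proposed default stereo mode for a bit rate in kbps at fs = 44.1 kHz.
 * Limits taken from the quoted comment block; other fs are not handled. */
stereo_mode_t default_stereo_mode(int kbps)
{
    if (kbps <= 96)
        return MODE_MONO;           /*   1 ...  96 kbps */
    if (kbps <= 159)
        return MODE_JOINT;          /*  97 ... 159 kbps */
    if (kbps <= 192)
        return MODE_FORCED_JOINT;   /* 160 ... 192 kbps */
    return MODE_STEREO;             /* 193 ...     kbps */
}
```

An explicit -mj, -mf or -ms on the command line would bypass such a default, as the mail notes for the automatic mono case.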
Re: [MP3 ENCODER] problems with LAME CVS
of (*m) );
        }
        printf ("\n");
        report ( name, header[11], k0, 0 );
        report ( name, header[11], k1, 1 );
        break;
    case 2:
        /* stereo: run both analyses on every block read */
        while ( ( samples = read (fd, s, sizeof(s)) ) > 0 ) {
            analyze_stereo  ( s, samples / sizeof (*s) );
            analyze_dstereo ( s, samples / sizeof (*s) );
        }
        printf ("\n");
        report ( name, header[11], k0, 0 );
        report ( name, header[11], k1, 1 );
        break;
    default:
        fprintf ( stderr, "%u Channels not supported: %s\n", header[11], name );
        break;
    }
}

int main ( int argc, char** argv )
{
    char*  name;
    int    fd;

    report_init ();

    if (argc < 2)
        readfile ( "stdin", 0 );        /* no arguments: read from stdin */
    else
        while ( (name = *++argv) != NULL ) {
            if ( (fd = open ( name, O_RDONLY )) >= 0 ) {
                readfile ( name, fd );
                close ( fd );
            } else {
                fprintf ( stderr, "Can't open: %s\n", name );
            }
        }

    return 0;
}

-- Frank Klemm
Re: [MP3 ENCODER] Lame re-sampling bug?
:: :: As you once remarked - yet another option nobody knows the ideal value
:: :: for ;-)
::
:: Upsampling uses a different algorithm - has anyone looked at that?

Yes. It sounds bad, especially on CDs with some distortions above 15 kHz (dithering noise, 15.75 kHz or 16 kHz tones).

-- Frank Klemm
Re: [MP3 ENCODER] problems with LAME CVS
7% 100.000% 1 channel 1.000 technik/Patrick_Piecha/R0e045a.wav 23.824% 23.824% 100.000% 1 channel 0.034%0.034% 1.000 technik/instruments/Wave.wav 4.561%4.561% 100.000% 1 channel 1.000 technik/instruments/Wave.wav 64.065% 64.065% 100.000% 1 channel 0.033%0.033% 1.000 technik/instruments/bass.wav 5.189%5.189% 100.000% 1 channel 1.000 technik/instruments/bass.wav 90.992% 90.992% 100.000% 1 channel -0.117% -0.117% 1.000 technik/instruments/ct-icey1.wav 18.562% 18.562% 100.000% 1 channel 1.000 technik/instruments/ct-icey1.wav 16.489% 16.489% 100.000% 1 channel 0.006%0.006% 1.000 technik/instruments/hicymbal.wav 14.036% 14.036% 100.000% 1 channel 1.000 technik/instruments/hicymbal.wav 16.197% 16.197% 100.000% 1 channel 0.001%0.001% 1.000 technik/instruments/hihat2.wav 22.158% 22.158% 100.000% 1 channel 1.000 technik/instruments/hihat2.wav 19.619% 19.619% 100.000% 1 channel 0.369%0.369% 1.000 technik/instruments/shaker.wav 25.223% 25.223% 100.000% 1 channel 1.000 technik/instruments/shaker.wav 43.839% 43.839% 100.000% 1 channel 0.064%0.064% 1.000 technik/instruments/type.wav 55.481% 55.481% 100.000% 1 channel 1.000 technik/instruments/type.wav 11.552% 10.816% 70.847% MS-Stereo 0.000% -0.000% 0.936 technik/loudspeaker/ir.wav 5.995%5.835% 32.765% MS-Stereo 0.973 technik/loudspeaker/ir.wav 9.717%7.809% 54.401% MS-Stereo -0.000% -0.000% 0.804 technik/loudspeaker/ir_l_ff.wav 6.179%4.861% 11.550% Stereo0.787 technik/loudspeaker/ir_l_ff.wav 31.498% 26.802% 85.468% MS-Stereo -0.000%0.000% 0.851 technik/loudspeaker/ir_l_ffl.wav 2.966%2.412% 88.489% MS-Stereo 0.813 technik/loudspeaker/ir_l_ffl.wav 4.172%1.388% 24.541% Stereo 0.000% -0.000% 0.333 technik/loudspeaker/ir_l_nf.wav 2.941%0.763% 33.144% MS-Stereo 0.259 technik/loudspeaker/ir_l_nf.wav 7.330%6.025% 71.489% MS-Stereo -0.000% -0.000% 0.822 technik/loudspeaker/ir_r_ff.wav 5.082%3.755% 37.084% MS-Stereo 0.739 technik/loudspeaker/ir_r_ff.wav 28.540% 26.679% 83.192% MS-Stereo 0.000%0.000% 0.935 technik/loudspeaker/ir_r_ffl.wav 
2.754%2.540% 83.531% MS-Stereo 0.922 technik/loudspeaker/ir_r_ffl.wav 3.747%1.088% 25.103% Stereo 0.000%0.000% 0.290 technik/loudspeaker/ir_r_nf.wav 3.134%0.896% 35.534% MS-Stereo 0.286 technik/loudspeaker/ir_r_nf.wav 5.910%5.248% 25.726% Stereo 0.000%0.000% 0.888 technik/loudspeaker/jr_b3.wav 6.824%5.965% 22.686% Stereo0.874 technik/loudspeaker/jr_b3.wav 4.028%5.156% 19.210% Stereo 0.000%0.000% 1.280 technik/loudspeaker/jr_l3.wav 4.391%6.019% 13.717% Stereo1.371 technik/loudspeaker/jr_l3.wav 5.741%7.783% 38.160% MS-Stereo -0.000% -0.000% 1.356 technik/loudspeaker/jr_r1.wav 5.832%7.629% 29.787% Stereo1.308 technik/loudspeaker/jr_r1.wav 3.586%5.396% 36.868% MS-Stereo -0.000%0.000% 1.505 technik/loudspeaker/jr_r2.wav 3.782%6.262% 33.153% MS-Stereo 1.656 technik/loudspeaker/jr_r2.wav 4.998%3.281% 37.027% MS-Stereo -0.000%0.000% 0.657 technik/loudspeaker/jr_r3.wav 5.856%3.511% 34.702% MS-Stereo 0.600 technik/loudspeaker/jr_r3.wav 52.965% 55.628% 92.795% ? 4.472%0.673% 1.050 technik/tapedeck/aiwa_928.wav 16.590% 19.768% -9.369% Stereo1.192 technik/tapedeck/aiwa_928.wav 140.965% 159.477% 99.802% Mono 3.232%4.181% 1.131 technik/tapedeck/aiwa_929.wav 199.454% 225.647% 99.803% Mono 1.131 technik/tapedeck/aiwa_929.wav 38.973% 43.374% 87.996% MS-Stereo 0.002%0.009% 1.113 technik/Fraunhofer_Beispiele/extended-PCM/BlackBird.wav 6.572%6.202% 71.891% MS-Stereo 0.944 technik/Fraunhofer_Beispiele/extended-PCM/BlackBird.wav -- Mit freundlichen Grüßen Frank Klemm eMail | [EMAIL PROTECTED] home: [EMAIL PROTECTED] phone | +49 (3641) 64-2721home: +49 (3641) 390545 sMail | R.-Breitscheid-Str. 43, 07747 Jena, Germany -- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
Re: [MP3 ENCODER] Correlation mid/side
New option:

  --nice: changes the priority depending on the system load
          (uses clock, gettimeofday and sleep; nice is not portable and useless due to its bad design)

-- Frank Klemm
Re: [MP3 ENCODER] Correlation mid/side
:: :: You should apply a 16 Hz highpass filter for DC removal. Note that the lowest
:: :: organ note has 16.3 Hz.
:: using residuals (no 16.3 Hz tone, but 32.6 Hz and 48.9 Hz).
:: Did you hear tones under 16 Hz?

It is difficult to speak of "hearing" in the range from 10...25 Hz (we have lowest-frequency systems to test the mechanical stability of microscope systems; vibrations are a big problem). You also "hear" 10 Hz. It modulates human speech, so your speech sounds like a computer voice.

:: Do you have speaker boxes that will give you
:: these low frequencies?

Yes. The question is what you hear first, the distortion or the primary tone.

:: I want to make a sub-woofer with a 16-30 Hz range for my
:: home stereo, but no lower.

Note:

    fu       box volume
    50 Hz        10 l
    40 Hz        24 l
    30 Hz        77 l
    20 Hz       390 l
    10 Hz      6250 l

-- Frank Klemm
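The volumes in the table are consistent with a box volume that grows with the fourth power of 1/fu, scaled from the 50 Hz / 10 l entry. The sketch below reproduces the table on that assumption; the function name `box_volume_litres` is hypothetical, and the fu^-4 law is only an observation about these five numbers, not a loudspeaker design formula taken from the mail.

```c
/* Box volume in litres for a lower cutoff frequency fu in Hz, scaled from the
 * 50 Hz / 10 l table entry with an assumed fu^-4 law. */
double box_volume_litres(double fu_hz)
{
    double r = 50.0 / fu_hz;        /* frequency ratio relative to 50 Hz */
    return 10.0 * r * r * r * r;    /* 10 l * (50 Hz / fu)^4 */
}
```

Halving the cutoff frequency multiplies the volume by 16, which is why the step from 20 Hz to 10 Hz goes from 390 l to 6250 l.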
Re: [MP3 ENCODER] lame source C++ compatible?
:: From: "David Balazic" [EMAIL PROTECTED]
::
:: :: C++ is a superset ( more or less ) of C , so it should handle C code.
::
:: In terms of features that's broadly true for C89, but differences in the
:: type system, operator precedence and the set of reserved/key words are
:: enough to make porting a chore, usually an unnecessary one. Then you have
:: C99 with a pile of features that C++ lacks, such as variable-length arrays.

So it is wise to use the subset of C89/C95/C99 and C++ that is identical in all of these languages.

-- Frank Klemm
Re: [MP3 ENCODER] ABR 320?
:: would it be possible to encode an mp3 at ABR 320 using frames larger
:: than 320 kbps?
:: e.g. 440 kbps, 560 kbps

560 kbps results in distortions. 551 kbps is the highest rate without heavy artefacts.

-- Frank Klemm
Re: [MP3 ENCODER] normalization
On Tue, Sep 05, 2000 at 02:21:45AM +0200, Francois du Toit wrote:
> I want to implement a normalizing routine in one of my programs, can
> anybody recommend one? It would be for 16 bit CD audio.
>
> Would simply multiplying by a constant factor and rounding be good enough
> or would the rounding errors cause some problems

Rounding causes problems, especially on high quality/low noise audio.

The simplest way is to add 0.75 LSB triangle noise before rounding.

-- Frank Klemm
-- MP3 ENCODER mailing list ( http://geek.rcc.se/mp3encoder/ )
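A minimal sketch of that advice: scale a 16-bit sample, add triangular noise, round and clip. Two things here are assumptions, not statements from the mail: "0.75 LSB" is read as the peak amplitude of the noise, and the noise is generated as the difference of two uniform random values from a small built-in generator. The names `uniform01` and `scale_with_dither` are hypothetical.

```c
#include <stdint.h>

/* Small linear congruential generator so the sketch needs no library RNG.
 * Returns a uniform value in [0, 1). */
static double uniform01(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (double)(*state >> 8) / 16777216.0;    /* top 24 bits / 2^24 */
}

/* Multiply one 16-bit sample by gain, add triangular noise, round, clip. */
int16_t scale_with_dither(int16_t sample, double gain, uint32_t *rng)
{
    double u1 = uniform01(rng);
    double u2 = uniform01(rng);
    /* Difference of two uniforms: triangular distribution on (-1, 1),
       scaled to an assumed peak of 0.75 LSB. */
    double noise = 0.75 * (u1 - u2);
    double x = sample * gain + noise;
    long   y = (x >= 0.0) ? (long)(x + 0.5) : -(long)(-x + 0.5);

    if (y >  32767) y =  32767;     /* clip to the 16-bit range */
    if (y < -32768) y = -32768;
    return (int16_t)y;
}
```

The caller keeps one `uint32_t` state per stream and passes it to every call, so consecutive samples get different noise values.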
Re: [MP3 ENCODER] mpglib compiles once again...
:: Right, just committed a bunch of (quick and ugly) fixes to mpglib so it should
:: once again compile without spewing errors .. it's probably still completely
:: broken though, I hadn't time to test.
::
:: Someone should take some time going over mpglib and clean it up a bit (Frank,
:: since you broke it, why don't you fix it?)

I had to give a lecture on Tuesday from 11:00 to 13:00, so I had no time in the last 36 hours. Layer I now seems to work. Is there a simple Layer I coder out there to test Layer I decoding? I have never seen a Layer I file.

:: Also possibly fixed a mono-decoding bug in layer1.c?!

Fixed. First I tried to print out the Layer I/II/III file format (a download of StarOffice was necessary). Guessing is really not my strength.

-- Frank Klemm
Re: [MP3 ENCODER] VBR, distortion, and CBR
:: :: :: This is a little bit problematic, because the number of distorted
:: :: :: bands does not tell you the weight of distortions you will get.
:: :: :: What do you think sounds uglier:
:: :: ::     20 distorted bands, each 0.1 dB
:: :: :: or
:: :: ::     1 distorted band by 2 dB
:: :: :: ???
:: ::
:: :: I would say that 20 distorted bands at 0.1 dB is preferable, and assume
:: :: that this is the choice of lame's algorithms. Is this assumption wrong?
::
:: :: Yes and no.
:: :: That depends on the distortion of the undistorted bands. Undistorted
:: :: bands also have a distortion, it is negative.
::
:: You are saying: "has a distortion" != "is distorted"?

    distorted bands:        threshold < noise
    undistorted bands:      0 <= noise <= threshold

So calculate:

    distortion[dB] = 20 dB * lg (noise/threshold)

    distorted bands:        distortion >  0 dB
    undistorted bands:      distortion <= 0 dB
    zero distortion bands:  distortion = -oo dB   (very unlikely that this happens)

Noise and threshold are voltages (not powers, otherwise multiply by 10 dB).

The distortion = max ( x1, ..., xn ) model implies an infinitely sharp transition of the distortion perception. But reality is far from this model.

:: First you need a table of the probability to hear distortions:
::
::   distortion   probability   error   noise voltage ratio
::     [d=dB]        [p=%]       [i]
::     -10           50.0        0.         0.316 : 1
::      -3           50.8        0.0004     0.708 : 1
::      -2.5         52.0        0.0025     0.750 : 1
::      -2           53.2        0.0064     0.794 : 1
::      -1.5         55.6        0.020      0.841 : 1
::      -1           57.9        0.040      0.891 : 1
::      -0.5         63.3        0.116      0.944 : 1
::      -0.25        66.3        0.176      0.972 : 1
::       0           70.2        0.281      1.000 : 1
::
:: isn't 0 dB distortion == no distortion ?
:: How can it be heard ?

No: distortion == threshold. Decibel (dB) is a logarithmic unit. Zero distortion means -oo dB (oo stands for infinite), a tenth of the voltage -20 dB, half the voltage -6 dB, equal voltage +/-0 dB, double the voltage +6 dB. Do you know Graham Bell?

::      +0.1         72.2        0.336      1.011 : 1
::      +0.2         74.2        0.422      1.023 : 1
::      +0.3         76.1        0.504      1.035 : 1
::      +0.4         78.0        0.593      1.047 : 1
::      +0.5         80.0        0.706      1.059 : 1
::      +1           87.1        1.28       1.122 : 1
::      +1.5         93.7        2.34       1.189 : 1
::      +2           97.2        3.65       1.259 : 1
::      +2.5         99.0        5.43       1.334 : 1
::      +3           99.8        8.4        1.412 : 1
::      +3.5         99.9        9.5        1.496 : 1

Search also for "masking_lower" in "lame.c".

-- Frank Klemm
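To make the "no sharp transition" point concrete, the quoted probability table can be turned into a lookup. This is only a sketch: the distortion and probability values are copied from the table, while the linear interpolation between rows, the clamping outside -10...+3.5 dB and the name `audible_percent` are assumptions added for illustration.

```c
#include <stddef.h>

/* Distortion [dB] -> probability [%] of hearing it, copied from the table. */
static const struct { double db, percent; } audibility[] = {
    {-10.0, 50.0}, {-3.0, 50.8}, {-2.5, 52.0}, {-2.0, 53.2}, {-1.5, 55.6},
    {-1.0, 57.9},  {-0.5, 63.3}, {-0.25, 66.3}, {0.0, 70.2}, {0.1, 72.2},
    {0.2, 74.2},   {0.3, 76.1},  {0.4, 78.0},  {0.5, 80.0},  {1.0, 87.1},
    {1.5, 93.7},   {2.0, 97.2},  {2.5, 99.0},  {3.0, 99.8},  {3.5, 99.9}
};

/* Probability in percent for a distortion in dB. Values between table rows
 * are linearly interpolated (assumption); values outside are clamped. */
double audible_percent(double db)
{
    const size_t n = sizeof audibility / sizeof audibility[0];
    size_t i;

    if (db <= audibility[0].db)
        return audibility[0].percent;
    if (db >= audibility[n - 1].db)
        return audibility[n - 1].percent;

    for (i = 1; i < n; i++) {
        if (db <= audibility[i].db) {
            double span = audibility[i].db - audibility[i - 1].db;
            double t    = (db - audibility[i - 1].db) / span;
            return audibility[i - 1].percent
                 + t * (audibility[i].percent - audibility[i - 1].percent);
        }
    }
    return audibility[n - 1].percent;   /* not reached */
}
```

For the question in the thread, a band at +0.1 dB maps to 72.2 % and a band at +2 dB to 97.2 %; how the probabilities of several bands combine is not stated in the mail and is not modelled here.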
[MP3 ENCODER] Multi Pass MP3 Encoder
What about the idea of allowing the user to code the file in two passes, for the very best quality at the expense of doubling the CPU time?

1st pass with "--hintfile":

    WAV  ---|      |--- MP3 (first pass quality)
            | LAME |
            |      |--- HINT

2nd pass with "--usehintfile":

    WAV  ---|      |--- MP3 (first pass quality)
            | LAME |
    HINT ---|      |--- HINT (unused)

In the hint file, for instance, one byte per frame is stored:

    Bit 7:4   Should MS or LR coding be used?
               0   no info
               1   very strong buy for LR stereo    br(LR) = 0.545 br(MS)
              ...
               8   no difference                    br(LR) == br(MS)
              ...
              15   very strong buy for MS stereo    br(LR) = 1.834 br(MS)

    Bit 3:0   Bitrate demand for the best coding
               0   no info
               1   low bit rate demand               20%
              ...
               5   normal bit rate demand           100%
              ...
              15   very high bit rate demand        300%

MS/LR switching can be optimized in a second pass. The bit reservoir can also be balanced better. You can empty the bit reservoir for an attack if some silent milliseconds of music follow:

    5 5 5 8 10 3 2 4 4 3 4
            ^
            empty bit pool here for best Q

But maybe the attack is followed by a much worse attack:

    5 5 5 8 10 14 3 2 4 4 3 4
            ^  ^
            |  empty the bit pool here
            and not here

The same effect can be achieved by increasing the latency of lame, so that lame can see more of the music's future.

Proposal

The application feeds samples to lame via lame_encode_buffer(). lame collects the samples in an overlapping circular buffer°) with a size of, for instance, 8192 samples. Every function (gpsycho, lame encoder, MS/LR detection, ...) gets the same big data block, but each one is interested in a different area of the data:

    |--|  MS/LR
    ^^^   gpsycho
    ^^    coder
    ^     coder (out of bit scenario)

°) circular overlapping buffer

   Two successive buffers are filled with the same data.

    | 1  2  3  4  5  6  7  8|| 1  2  3  4  5  6  7  8|
    | 9 10  3  4  5  6  7  8|| 9 10  3  4  5  6  7  8|
    | 9 10 11 12 13  6  7  8|| 9 10 11 12 13  6  7  8|

   Advantage:    No wrapping
   Disadvantage: Needs some 10^-2 MByte of RAM

-- Frank Klemm
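The circular overlapping buffer from the footnote can be sketched as follows. All names (`mirror_ring_t`, `ring_push`, `ring_window`) are hypothetical, and RING_SIZE is set to 8 only so that the contents can be compared with the diagram; the mail suggests something like 8192 samples.

```c
#include <stddef.h>

#define RING_SIZE 8     /* 8 matches the diagram; the mail suggests e.g. 8192 */

typedef struct {
    short  data[2 * RING_SIZE];   /* two successive copies of the same ring */
    size_t pos;                   /* next write position, 0 .. RING_SIZE-1  */
} mirror_ring_t;

/* Store one sample in both halves, overwriting the oldest sample. */
void ring_push(mirror_ring_t *r, short sample)
{
    r->data[r->pos]             = sample;
    r->data[r->pos + RING_SIZE] = sample;
    r->pos = (r->pos + 1) % RING_SIZE;
}

/* Pointer to RING_SIZE consecutive samples, oldest first. Thanks to the
 * mirrored copy the block never wraps, so callers can index it linearly.
 * Before RING_SIZE samples have been pushed, the oldest entries still hold
 * whatever the buffer was initialised with. */
const short *ring_window(const mirror_ring_t *r)
{
    return &r->data[r->pos];
}
```

After pushing the values 1 to 10 into a zero-initialised ring, both halves hold 9 10 3 4 5 6 7 8 as in the second row of the diagram, and `ring_window` returns the contiguous run 3 4 5 6 7 8 9 10. Each consumer (gpsycho, coder, MS/LR detection) could then take its own sub-range of that run.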
Re: [MP3 ENCODER] the -mx mode - different philosophy
:: Frank Klemm wrote:
:: ::
:: :: :: -md is not documented.
:: :: ::
:: :: :: What is it? Did anyone make a dual channel mode?
:: ::
:: :: It's like -ms, but no bitrate balance is possible between the channels
:: :: (maybe also a disadvantage for the main purpose, dual language, from the
:: :: quality aspect), and it is also a hint for the decoder to decode only one
:: :: (selectable) channel.
:: ::
:: :: The main advantage is the lower CPU usage.
::
:: Just to clarify: -md is "dual channel"?

Yes:

Bit 7,6 of the MPEG frame header:

    00   LR Stereo frame        (Stereo, Left + Right Channel, bit balancing possible)
    01   MS Stereo frame        (Stereo, Mid and Side Signal)
    10   Dual Channel frame     (Mono, for instance dual language English/German)
    11   Single Channel frame   (Mono)

Is it right that Layer II MS frames are unable to NOT use intensity stereo for f > fs/4 (for CD: 11 kHz)?

-- Frank Klemm
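Reading the two mode bits could look like the sketch below. It assumes that "Bit 7,6" refers to the fourth byte of the 4-byte frame header, which is where the MPEG audio header stores the channel mode; the enum and the function name `channel_mode` are hypothetical. The labels follow the mail; the MPEG specification calls value 01 "joint stereo".

```c
/* Channel modes as listed in the mail. */
typedef enum {
    CH_LR_STEREO      = 0,    /* 00 */
    CH_MS_STEREO      = 1,    /* 01, "joint stereo" in the MPEG specification */
    CH_DUAL_CHANNEL   = 2,    /* 10 */
    CH_SINGLE_CHANNEL = 3     /* 11 */
} channel_mode_t;

/* Extract the channel mode from a 4-byte MPEG audio frame header.
 * Assumes the mode sits in bits 7 and 6 of the fourth header byte. */
channel_mode_t channel_mode(const unsigned char header[4])
{
    return (channel_mode_t)((header[3] >> 6) & 0x03);
}
```

A header ending in the byte 0x64, for example, has the top bits 01 and is therefore reported as CH_MS_STEREO.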
Re: [MP3 ENCODER] Correlation mid/side
:: Also, does anyone know the basis for the ISO switching criterion? Do they
:: really mean square energy (quadrupled magnitude)? They give no hints as to
:: how to reconcile the mid/side samples with the right/left psychoacoustics in
:: the loop section of the encoder. Computing psychoacoustics for the sum and
:: difference signals makes no sense to me, as one is never going to listen to
:: them and thus the psychoacoustic threshold figures are irrelevant, but the
:: alternative of trying to simultaneously allocate bit/noise for both channels
:: seems overly complicated/possibly impossible (oxymoron strikes!). I'm
:: compromising right now by just calculating the distortion thresholds using
:: the L/R SMR and the M/S signal and bandwidths (and the loop distortions with
:: the M/S signal), but this seems like a pretty silly way to do things, and it
:: doesn't sound very good. I'm trying not to plagiarize LAME (I still use the
:: ISO/ATT psych model, for one), but I gather that LAME does some sort of
:: mid/side psychoacoustic processing?

1st step:

  - Calculate the diffuse-field corrected ear-drum SPL°):

        L' = a(w) L + b(w) R,    R' = a(w) R + b(w) L

    w is a small omega; a(w), b(w) are complex

    Note: different for loudspeakers and headphones (where a \approx 1 and b \approx 0)
    Note: search for HRTF to get usable approaches for a and b.
    Note: for very good results you must take into consideration that a and b
          change depending on the temporal direct/indirect sound ratio

  - ignore in-brain crosstalk (ca. -60 dB @ 1 kHz, much lower than acoustic crosstalk)
  - calculate the threshold for the left and the right ear

2nd step:

  if the frame can be coded in LR mode
  - code L using L and threshold(L)
  - code R using R and threshold(R)

  if the frame can be coded in MS mode
  - code M using M and 0.6...0.8 * min(threshold(L),threshold(R))
  - code S using S, the decoded coded M, threshold(L) and threshold(R)

3rd step:

  select MS/LR depending on:
  - the br(MS)/br(LR) ratio
  - the past of the audio file
  - the bit pool fill level
  - (the future of the audio file, using one of the mechanisms I mailed before)
  - user options?

°) SPL = sound pressure level

-- Frank Klemm
Re: [MP3 ENCODER] normalization
:: Even working with floats ???
::
:: What is the LSB in this case ?

The LSB is 1 for ints. See "MPEG-2 AAC Stereo Verification Tests" and the item "Glockenspiel was too high". Maybe this is more comprehensible.

:: | On Tue, Sep 05, 2000 at 02:21:45AM +0200, Francois du Toit wrote:
:: | I want to implement a normalizing routine in one of my programs, can
:: | anybody recommend one? It would be for 16 bit CD audio.
:: |
:: | Would simply multiplying by a constant factor and rounding be good enough
:: | or would the rounding errors cause some problems
:: |
:: | Rounding causes problems, especially on high quality/low noise audio.
:: |
:: | The simplest way is to add 0.75 LSB triangle noise before rounding.

-- Frank Klemm
[MP3 ENCODER] Coding history report for each (non trivial) function
Maybe every (non-trivial) function should get a history header?

    /*
     *  Function:   double sinc (IN double x);
     *
     *  Purpose:    calculates sin (pi x) / (pi x)
     *
     *  Input:
     *
     *  History:
     *      Person_1  2000-06-02: first version
     *      Person_2  2000-06-12: zero cross error bug removed
     */

-- Frank Klemm
[MP3 ENCODER] MGDIFF
Maybe the following file can be added. I use it to quickly show the differences between two branches. Quick and dirty, can still be optimized. I named it MGDIFF and it needs mgdiff.

    #! /bin/bash

    # Compare every .c/.h file (up to four directory levels deep) with its
    # counterpart in the lame_old tree and open mgdiff on each file that differs.
    for i in {*,*/*,*/*/*,*/*/*/*}.{c,h}; do
        file1="`pwd`/$i"
        file2=`echo "$file1" | sed s/lame/lame_old/`
        if diff -bwB --brief "$file1" "$file2"; then
            echo "equal $i"
        else
            echo -e "\033[7mDifferent: $i\033[0m"
            mgdiff -title "mgdiff $i" -geometry 1536x800 -args "-bw" "$file1" "$file2"
            sleep 2
        fi
    done

-- Frank Klemm
Re: [MP3 ENCODER] Frank's coherence [off topic?]
:: I plugged that correlation test for a single frame into LAME.
:: If you define RH_VALIDATE_MS at compile time this code becomes active.
:: Actually the decision whether to use L/R or M/S coding is based
:: on masking relations. But sometimes LAME switches to M/S coding
:: where L/R coding would use fewer bits. The extra code you
:: can enable with the above define will try to get a rough estimate
:: of the correlation of the left and right channels and will consider
:: the perceptual entropy to check if it would be better not to switch
:: to M/S coding.

I've added a remark on this piece of code.

-- Frank Klemm
[MP3 ENCODER] Layer III decoding has been broken between Monday and now
Decoding of Layer 3 doesn't work anymore. The last checkout/commit/checkout in the night from Sunday to Monday worked; now it is broken.
[MP3 ENCODER] New file
I need a new file for mlame/mlame.bat: mlame_corr.c or something like that. mlame should (on request) switch automatically between -ms, -mj and -mm.
Re: [MP3 ENCODER] Please stop breaking LAME... ;)
:: Hi Everyone,
::
:: I haven't been keeping up with things for the last week, because
:: my wife and I just had our first baby :-) (baby's requisite website:
:: www.wildpuppy.com/baby)
::
:: Anyway, now LAME CVS fails all my test cases. This is normal since
:: small changes in just the order of operations will show up in these
:: tests. I normally then track down exactly what is responsible and
:: make sure I understand what is going on. But in this case it looks like
:: this will not be possible: There are massive differences in every
:: single file. Most of them may be long overdue, but for the most part
:: they are related to coding style and/or cosmetic.
::

What CODING style?

Sorry, but it is very laborious just to read (let alone understand) the source code. Currently my first action in the editor is the auto-format feature of xjed. It only takes 1 or 2 seconds, but afterwards the code can't be checked in anymore. Transferring the changes from the local copy to the 'commit'-able version is tedious, especially line by line.

It begins with the simplest things like tabulators '\t' with different sizes (3, 4 and 8 are used in lame), continues with circular references in header files (try to change the type of internal_flags from 'void*' to 'lame_internal_flags*') and ends with dangerous variable names like gfc->stereo and incorrect code due to provoked misunderstandings. The code uses some features that are still tolerated by the ANSI X3.159-1989 standard but strictly forbidden in C99 and C++. A lot of these things I have not seen for 3 or 4 years.

Maybe feature work should be stopped for 2 or 3 days so that only these things get solved.

The simplest way to get a coding style is to use /usr/src/linux/Documentation/CodingStyle. Maybe there are better coding styles (note that coding styles are also a question of personal taste), but this is the coding style of one of the most successful projects, and it is much better than any chaos.

:: It is not appropriate for new developers to make so many changes so
:: quickly without consulting the rest of us.
::

Waiting for more than 2 or 3 days makes it nearly impossible to check in changes. That is the result of using SourceSafe at work.
Re: [MP3 ENCODER] Please stop breaking LAME... ;)
::
:: Currently my aim is, that the program is compilable with:
::     gcc 2.95.2
::     g++ 2.95.2
::
:: I run gcc 2.95.(3) and SAS/C (the latter is usually much more helpful on
:: warnings, and still much more forgiving on "errors")...
::

Can you add g++ for testing? Or also some other C++ compiler? Or use gcc on Monday, Wednesday and Friday and g++ on the remaining days?

::
:: Also it would be nice if you cast types that don't match the functions
:: types into the correct one
::
:: Casting is a bad thing. Correct the reason for the casting, not the warning
:: (or the error in C++) itself. See Ada Rationale. It is Ada related, but the
:: reasons are the same for every procedural language like C, Pascal, Ada, ...
::
:: Ofcos, as I said, the best thing is to use the correct one to begin with, but
:: if that isn't possible, atleast cast the value into the correct one.
::

Write an (inline) function to convert from type A to type B. Wild type casting in C is one of the most dangerous things. C can't cast from type A to type B. There is only the possibility to cast every shit to type B, and this is dangerous. So you write

    int     ifreq;
    double  dfreq;

    ifreq = (int) ( dfreq );

and later dfreq changes its type to 'struct bla*'. C converts this without batting an eyelid.

Who has corrected the segmentation fault which occurred only with gcc 2.95, but not with g++ 2.95? fread() faulted because of inconspicuous-looking parameters in get_audio.c.
[MP3 ENCODER] Please test byte order handling
I've added support for 8, 16, 24 and 32 bit PCM data. Both big and little endian are supported. But maybe the behaviour has changed. If it has, correct the line

    int big = ( gfp->input_format != sf_wave ) ^ ( gfp->swapbytes == TRUE );

in get_audio.c. If big is 0, the data is treated as little endian, otherwise as big endian.
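To make the meaning of the big flag concrete, here is a minimal sketch of decoding one sample from raw bytes in either byte order. The function decode_sample is hypothetical and is not the code in get_audio.c. It also assumes signed two's complement samples for every width, although 8 bit WAV data is usually stored unsigned.

```c
#include <stddef.h>

/* Hypothetical helper: decode one signed PCM sample of `bytes` bytes (1..4).
   big != 0 means the most significant byte comes first. */
long decode_sample(const unsigned char *p, size_t bytes, int big)
{
    unsigned long u = 0;
    unsigned long sign, mask;
    size_t        i;

    for (i = 0; i < bytes; i++) {
        size_t idx = big ? i : bytes - 1 - i;   /* walk from MSB to LSB */
        u = (u << 8) | p[idx];
    }

    sign = 1UL << (8 * bytes - 1);      /* sign bit of the sample */
    mask = (sign << 1) - 1UL;           /* the low 8*bytes bits set */
    if (u & sign)
        return -(long) (~u & mask) - 1; /* two's complement negative value */
    return (long) u;
}
```

For the two bytes 0x34 0x12, the call with big = 0 gives 0x1234 and the call with big = 1 gives 0x3412.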
[MP3 ENCODER] Some of the test signals
     9.293%    9.231%   47.185%   Giora Feidman/The Dance of Joy -- [12] The Wedding Waltz.wav
     0.867%    0.779%   33.951%   Giora Feidman/The Dance of Joy -- [12] The Wedding Waltz.wav
    26.829%   27.573%   45.900%   Giora Feidman/The Dance of Joy -- [13] Shpiel Zhe Mir a Liedele.wav
     4.769%    4.094%   35.219%   Giora Feidman/The Dance of Joy -- [13] Shpiel Zhe Mir a Liedele.wav
    27.031%   25.049%   56.670%   Giora Feidman/The Dance of Joy -- [14] L'Chaim.wav
     6.040%    5.671%   42.539%   Giora Feidman/The Dance of Joy -- [14] L'Chaim.wav
    11.513%   10.835%   55.324%   Giora Feidman/The Dance of Joy -- [15] Song for Two.wav
     1.436%    1.300%   32.576%   Giora Feidman/The Dance of Joy -- [15] Song for Two.wav
    40.172%   35.276%   39.601%   Giora Feidman/The Dance of Joy -- [16] The Freilach Dance.wav
     7.909%    6.883%   32.210%   Giora Feidman/The Dance of Joy -- [16] The Freilach Dance.wav
    34.639%   34.106%   31.471%   Giora Feidman/The Dance of Joy -- [17] Rue du Bac (Encore).wav
     5.712%    5.372%   33.006%   Giora Feidman/The Dance of Joy -- [17] Rue du Bac (Encore).wav

or:

    97.208%  102.976%   67.960%   Máire Brennan/track01.cdda.wav
    13.924%   14.760%   70.173%   Máire Brennan/track01.cdda.wav
    72.494%   74.756%   78.354%   Máire Brennan/track02.cdda.wav
     6.646%    7.175%   77.827%   Máire Brennan/track02.cdda.wav
    96.411%   94.620%   71.151%   Máire Brennan/track03.cdda.wav
    12.567%   13.465%   65.035%   Máire Brennan/track03.cdda.wav
    69.573%   72.951%   53.852%   Máire Brennan/track04.cdda.wav
     7.117%    7.342%   60.341%   Máire Brennan/track04.cdda.wav
    95.564%   94.909%   73.071%   Máire Brennan/track05.cdda.wav
    12.542%   15.006%   67.547%   Máire Brennan/track05.cdda.wav
    77.162%   76.282%   56.997%   Máire Brennan/track06.cdda.wav
     9.009%    8.984%   59.754%   Máire Brennan/track06.cdda.wav
    91.141%   95.097%   75.882%   Máire Brennan/track07.cdda.wav
    10.008%   12.816%   75.723%   Máire Brennan/track07.cdda.wav
    70.234%   69.401%   37.077%   Máire Brennan/track08.cdda.wav
     5.285%    4.792%   64.201%   Máire Brennan/track08.cdda.wav
    77.033%   79.583%   74.154%   Máire Brennan/track09.cdda.wav
     7.023%    6.975%   74.601%   Máire Brennan/track09.cdda.wav
    80.308%   72.977%   66.552%   Máire Brennan/track10.cdda.wav
     9.940%    9.481%   64.353%   Máire Brennan/track10.cdda.wav
    31.907%   31.418%   77.333%   Sting/Nada como el Sol -- [1] Mariposa Libre.wav
     6.732%    6.020%   78.771%   Sting/Nada como el Sol -- [1] Mariposa Libre.wav
    33.899%   35.975%   57.517%   Sting/Nada como el Sol -- [2] Fragil (Portuguese).wav
     5.155%    5.363%   58.397%   Sting/Nada como el Sol -- [2] Fragil (Portuguese).wav
    53.352%   54.498%   81.746%   Sting/Nada como el Sol -- [3] Si estamos Juntos.wav
    20.470%   21.473%   82.523%   Sting/Nada como el Sol -- [3] Si estamos Juntos.wav
    33.011%   34.478%   74.927%   Sting/Nada como el Sol -- [4] Ellas Danzan Solas (Cueca Sola).wav
     6.487%    6.866%   62.332%   Sting/Nada como el Sol -- [4] Ellas Danzan Solas (Cueca Sola).wav
    32.281%   35.191%   57.279%   Sting/Nada como el Sol -- [5] Fragilidad.wav
     5.396%    6.870%   56.256%   Sting/Nada como el Sol -- [5] Fragilidad.wav
Re: [MP3 ENCODER] the -mx mode - different philosophy
::
:: I think, if there are really cases where MS frames would take a lot more
:: bits than ST frames, then we can tune our switching criterion and see
:: for example how the energy ratios (already calculated in GPSYCHO) look
:: like. Maybe another constraint for energy ratios would solve the
:: switching problems without the need to quantize all four channels.
::

I've tested several files with '-mj' and '-mf' (using -V 0...2). Most coded files have the same size within 0.1%. So you can prevent switching artefacts by using -mf instead of -mj.

Bit demands of the modes (tested with 20 CD albums):

    -mm     100%
    -mj     -mm * 1.62 ...1.88
    -mf     -mj * 0.999...1.001
    -ms     -mm * 1.985...2.001

-md is not documented.
Re: [MP3 ENCODER] Please stop breaking LAME... ;)
::
:: I run gcc 2.95.(3) and SAS/C (the latter is usually much more helpful on
:: warnings, and still much more forgiving on "errors")...
::
:: Can you add g++ for testing?
:: Or also some other C++ compiler?
:: Or use gcc at monday/wednesday and friday and g++ at the rest?
::
:: I have no incentive to use C++ at all, infact I'm not even interested...
::

You do not have to use C++, but the C code should not use special C constructs that are allowed in C89 (the ANSI C standard from 1989) but strictly forbidden in ANSI C++. A lot of them are also forbidden in C99 (the ANSI C standard from October 1999). You can't compile LAME with a C99 compiler. Good C code should be compilable with (nearly) all C (89/95/99) and C++ compilers.

:: I'm not talking about "wild" typecasting, I'm saying that strict, controlled
:: typecasting can be good, instead of leaving it up to the compiler to make the
:: choices (which might not always be the correct one)...
::
:: C can't cast from type A to type B. There is only the possibility to
:: cast every shit to type B. And this is dangerous.
:: So you write a
::     int ifreq;
::     double dfreq;
::     ifreq = (int) ( dfreq );
:: and now dfreq changes the type to 'struct bla*'.
:: C converts this without batting an eyelid.
::
:: What the hell are you on about?!
::
:: Oh, w8, you mean if some smartass changes dfreq to a struct without checking
:: the code, or atleast changing the name so these kinds of conflicts can occur?
::
:: I'd say he'd be downright stupid. ;)
::

Some of these bugs are part of international standards. The best known is one in the Unix X11 system. It was found 15 years after it was programmed, while porting X11 to Ada. Nobody needs to be a smartass to make such errors. If you write C programs with only 1 serious error per 1500 LOC, you are a top programmer.
Re: [MP3 ENCODER] lame source C++ compatible?
:: From: "Segher Boessenkool" [EMAIL PROTECTED]
::
:: How to compile lame on a system where only a C++ compiler is available
:: (the C compiler costs extra money)? Currently lame generates nearly
:: uncountable errors with a C++ compiler.
::
:: Put
::
::     extern "C" {
::     }
::
:: around everything? (Around a whole file is ok, I believe).
::
:: That only affects the linkage of functions (e.g. disables name mangling).
::
:: There's no easy way to do it in general. I've done it a few times for some
:: fairly large pieces of code, and it's no fun at all. Then there's the
:: maintenance...
::
:: Is the C compiler expensive? :)
::

So, now the source is usable with g++. Layer 1 and 2 decoding is disabled: too many deadly sins per line, such as using structs and typedef'd structs with the same name, and other horror shows. Don't be disturbed by the remaining thousands of warnings. C++ simply has more problems compiling any random collection of ASCII characters ;-)
Re: [MP3 ENCODER] Please stop breaking LAME... ;)
:: Please check that your code compiles and works ok before checking it in to
:: the CVS ...
::

That doesn't fully solve the problem. Compilers and runtime libraries differ, so the check can only be done for one (or two) compilers. Another problem is interfering code changes during the evaluation phase (currently 1 hour on my computer).

Currently my aim is that the program is compilable with:

    gcc 2.95.2
    g++ 2.95.2

It is generally a good idea if a C program is (also) compilable with a C++ compiler. I've patched a lot of code, reached that aim at 5:30 a.m., and now I get an executable. Several hundred warnings remain. I have disabled layer1 and layer2 decoding. There are some ugly things in it that I must correct separately:

    struct bla { int i; char c; };
    typedef struct bla  bla [32] [64];

    struct bla  a1;
    bla         a2;

    a2 [12] [7] . c = 12;
    a1 . c          = 7;

Really nasty hacking stuff. C++ strictly forbids it. In the future I suggest testing the code with C *and* C++. That also increases the chance of getting portable ANSI C code.

:: I've repeatedly had to go fix compile errors in the CVS now... :P
::

Me too.

:: Also it would be nice if you cast types that don't match the functions types
:: into the correct one
::

Casting is a bad thing. Correct the reason for the casting, not the warning (or the error in C++) itself. See the Ada Rationale. It is Ada related, but the reasons are the same for every procedural language like C, Pascal, Ada, ...

Another idea is to use separate types for separate items. Changing the type of an item then becomes easy. The idea of C is that everything is an int, but that isn't a very nice idea. You mix too many things which are fully incompatible. A sample frequency is a long for one person, an int for the next, an unsigned int for a third, the fourth takes a float to save storage, and the last one takes a long double to reduce the danger of rounding. Casts are the wrong way to hide this mess.

    typedef long double  samplefreq_t;

    samplefreq_t  in_sample_freq = 44100;

No mixtures of int, long, unsigned, float and double.

:: (or use the correct one to begin with) .. it's both the
:: correct and nice way to do it...
::
:: Also don't set ambigous types like "unsigned", use the proper full name,
:: "unsigned int".
::

Right. Some of them I have fixed, some of them are mine ;-(

:: And a problem I can't figure out right now is why LAME won't encode at all
:: anymore, it just creates an empty mp3 (that means it atleast opens it), and
:: then quits saying "Error writing mp3, disk full?" .. what have you done? ;)
::

Fixed. My bug. I had added a check of the return value of the final fwrite(). There are runtime libs out there which report an error for a request to write 1 item of 0 bytes.
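The fwrite() pitfall from the last paragraph can be avoided with a small wrapper. This is a minimal sketch, not the fix that went into LAME, and the name write_block is hypothetical. It relies on the fact that fwrite() returns zero when size or nmemb is zero, which cannot be told apart from a failed write of one item.

```c
#include <stdio.h>

/* Hypothetical wrapper: returns 0 on success, -1 on a real write error. */
int write_block(const void *buf, size_t len, FILE *fp)
{
    if (len == 0)
        return 0;       /* nothing to write, so do not call fwrite() at all */

    /* Write len items of 1 byte; a short count means an error. */
    return fwrite(buf, 1, len, fp) == len ? 0 : -1;
}
```

Counting bytes as items (size 1, nmemb len) also avoids asking for "1 item of 0 bytes" in the first place.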
Re: [MP3 ENCODER] mlame usage without naming .mp3.mp3
:: How do you get mlame to batch encode MP3-MP3 without adding a ".mp3" to
:: every file; resulting in a file with ".mp3.mp3" as extentions?
::
:: Brent
::

Fixed.
[MP3 ENCODER] Heavy problems with lame source file design
I have heavy problems replacing the definition of the element internal_flags in the structure lame_global_flags. internal_flags is declared in lame.h with:

    /* more internal variables, which will not exist after lame_encode_finish() */
    void*  internal_flags;

which is not very nice (it's the PEEK and POKE of C).

    /* more internal variables, which will not exist after lame_encode_finish() */
    lame_internal_flags*  internal_flags;

is the real type of internal_flags. This structure is defined in util.h and contains a structure plotting_data. plotting_data is defined in gtkanal.h, and gtkanal.h needs the type lame_global_flags defined in lame.h.

Summary:

    lame_global_flags    defined in lame.h     needs  lame_internal_flags  defined in util.h
    lame_internal_flags  defined in util.h     needs  plotting_data        defined in gtkanal.h
    plotting_data        defined in gtkanal.h  needs  lame_global_flags    defined in lame.h

Who is responsible for that? Who obscured this by using a 'void*'?

-- 
Frank "I hate Gordian knots" Klemm
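One common way to cut such a header cycle is a forward declaration, because a pointer member only needs an incomplete type. The following single-file sketch marks which header each part would live in. All struct members except internal_flags are hypothetical, and this is not how the LAME headers are actually laid out.

```c
/* --- lame.h: an incomplete type is enough for a pointer member --- */
struct lame_internal_flags;
typedef struct lame_internal_flags lame_internal_flags;

typedef struct {
    int                  num_channels;      /* hypothetical member */
    lame_internal_flags *internal_flags;    /* typed pointer, no void* */
} lame_global_flags;

/* --- gtkanal.h: needs lame_global_flags only --- */
typedef struct {
    const lame_global_flags *gfp;           /* hypothetical member */
    int                      frame_number;  /* hypothetical member */
} plotting_data;

/* --- util.h: completes the type declared in lame.h --- */
struct lame_internal_flags {
    plotting_data pinfo;                    /* hypothetical member */
    int           channels_out;             /* hypothetical member */
};
```

lame.h then no longer includes util.h, so the chain lame.h, util.h, gtkanal.h, lame.h is broken at its first link while the pointer stays fully typed.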
[MP3 ENCODER] Encoding 44.1 kHz as 48 kHz does not sound very good
Why is quadratic interpolation used for upsampling instead of a sinc interpolator?

-- 
Frank Klemm
Re: [MP3 ENCODER] lame/CODING_STYLE
:: Frank Klemm wrote on Fri, 01 Sep 2000:
:: lame/CODING_STYLE, version 0.001 ;-)
:: -
:: This is the first try of a Coding Style:
::
:: notes on some points
::
:: * Don't use tabulators (the character with the value '\t') in source code,
::   especially these with a width of unequal 8. Lame sources are using
::   different sizes for tabulators.
::
:: I don't like tabulators too
::
:: * Functions should be not longer than 50 lines of code.
::
:: this is debateable

Yes. 24 lines.

:: * Document functions.
::
:: programmers don't document, you know ;-)
:: OK, your point is clear and I'll try to add more comments
:: (hopefully useful remarks..)
::
:: * Don't use single 'short' variables to save storage.
::   Short variables are especially on Pentium Class Computer much slower than
::   int's. DEC alpha also hates short variables.
::
::   Example: float bla [1024];
::            short i;
::            for ( i = 0; i < 1024; i++ )
::                bla [i] = i;
::
:: I'm not so sure about shorts. Documents on the Intel compiler suggest
:: for example to put the index variables of nested loops in a struct
:: to improve cache performance, this way they would be in the same cache line.
::

Maybe this plays a role, but only if you have more than 8 loops. And if you have more than 8 loops, the performance of the outer loops becomes very unimportant.

In 32 bit mode the Pentium supports 8 bit and 32 bit operands directly. 16 bit operands need the operand size prefix, which is slow. The DEC Alpha cannot handle 16 bit ints at all; the compiler generates a lot of bit shifting and masking code.
[MP3 ENCODER] lame source C++ compatible?
How can lame be compiled on a system where only a C++ compiler is available (the C compiler costs extra money)? Currently lame generates a nearly uncountable number of errors with a C++ compiler.