On Fri, Oct 14, 2005 at 09:00:37PM +0200, [EMAIL PROTECTED] wrote:
> First of all, thank you all very much for your comments, the picture is much
> clearer now.
>
> I just don't fully understand the floating-point precision part. If numbers
> from binary 0.1 to 1.0 are represented using 24 bits (sign bit + mantissa, I
> think the implicit 1 does not count), and the numbers from binary 0.01 to
> 0.1 also have 24 bits of precision, and so on for 0.001..0.01, etc. wouldn't
> that mean we have a higher resolution?
> We are using 7 of 8 exponent bits too. (Just wasting the cases where the
> exponent is larger than 0, or has some special meaning.)
> That would give you 31 bits, minus a couple useless and redundant cases.
> (when the exponent is -128, and denormalizing occurs)
>
> Then I also fail to see why it's bad for overflows to occur in fixed point.
> Those signals (above 0dB FS) would clip on the hardware anyway, and are
> expected to do so, since they were either badly recorded or amplified.
>
> Greetings, Dimitri

Obviously, too strong a signal will clip. Clipping, however, requires every calculation to check for overflow. More likely, such checks will be too slow, so overflow will result in the value "wrapping around", that is, maxint + 1 => minint. This sounds really, really bad.

Less obviously, too weak a signal will also distort. If the signal is in the range [-6, 5] then there are only 12 discrete values a sample may have, and your signal will sound like it's being played through your PC beeper with a pencil stuck through the cone. Also, any DC offset will indirectly make fewer values available, resulting in either clipping or lack of precision.

These problems don't seem so bad if only the output is considered, but in any audio processing pipeline there are many more signal paths that must be considered. Imagine all the different signal paths in a filter.

It is possible to avoid these problems using fixed point arithmetic where the "point" is fixed at a different place depending on the expected signal level. However, it's not much fun, and I very much doubt it is significantly faster.

By using floating point, you guarantee that the biggest step between two consecutive values is somewhere between 1/2^23 and 1/2^24 of the current signal level (provided you don't run up against the limits of floating point, which is hard to do). Of course the absolute steps between consecutive values get smaller as values approach 0; this is what makes floating-point desirable.

In other words, you do have more than 2^24 values available to you, but the worst case difference between consecutive values is never much worse than if you had exactly 2^24 values.
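
If you want to see both effects for yourself, here is a rough Python sketch. It is only an illustration: it assumes 16-bit two's-complement samples for the fixed-point case and IEEE 754 single precision for the floating-point case, and the helper names wrap_int16 and float32_relative_step are made up for this example.

```python
import struct

def wrap_int16(x):
    """Simulate unchecked overflow of a 16-bit two's-complement sample."""
    return (x + 0x8000) % 0x10000 - 0x8000

def float32_relative_step(x):
    """Gap to the next larger float32, divided by the value itself.

    Assumes x is a positive, normal (not denormal) number.
    """
    # Round x to float32 and read back its raw 32-bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    cur = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Adding 1 to the bit pattern gives the next representable float32.
    nxt = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return (nxt - cur) / cur

MAXINT = 32767
print(wrap_int16(MAXINT + 1))  # -32768: full scale flips to the opposite rail

# The relative step stays between 2^-24 and 2^-23 whatever the level.
for level in (1.0, 0.75, 0.001, 1e-6):
    rel = float32_relative_step(level)
    print(level, rel, 2.0**-24 < rel <= 2.0**-23)
```

The first print shows the wrap-around; the loop shows that a quiet signal in floating point gets the same relative resolution as a loud one, which a fixed-point sample near zero does not.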
