On Jan 9, 2021, at 11:20, Dan Stowell wrote:
> This is a very interesting discussion, thanks.

Definitely interesting discussion. Although there might be some overlap, we're generally talking about two completely different sets of processing: subjective versus objective. I'm not sure how much examples of one type will help with problems of the other type.

Summation (mixing) and multiplication (gain) are purely objective operations. There is only one correct answer for a given setup. Approximation is noise or distortion.

Subjective processes are inherently noisy. At the very least, to the degree that the "signal to noise" ratio could be improved, it would still be a subjective determination as to which result is better or worse than another.

> Since Dario picks up on Andy's comment about 1-dimensional data and neural
> nets, I must comment: Andy's assertion is a bit out-of-date, and there are
> lots of NN methods that are designed/applied for one-dimensional scalar
> signals. A famous one for audio is deepmind's "WaveNet"; it's just one
> example of the more general idea of "time-domain CNN" or one-dimensional CNN.
> You can also apply recurrent NNs (RNNs) or even the very fashionable
> Transformer networks, all of these are reasonable things to apply to 1D
> signals such as audio.

WaveNet would fall into what I'm calling subjective processes. To synthesize human speech out of nowhere - by definition - means that you have no reference against which to measure signal to noise. It's still very interesting, of course, but in a different class of problems to solve.

> I'm no expert on sigma-delta at all, but my hunch about Dario's idea is that
> it could well be a great idea. Xue comments that sigma-delta systems are
> tuned to minimise SNR across the whole audio spectrum. One thing that might
> be possible with a NN method might be to move beyond that, e.g. optimising
> for (a proxy of) perceptual quality rather than for raw SNR.
> Also, Brian's comment about dithering strongly implies to me that, despite
> other commenters treating SDM as a strictly deductive system, there may be a
> benefit of some "inference" that a NN can provide, to produce the best
> approximation of an output given uncertain (e.g. dithered) input data.

I'm not fully versed in 1-bit processing, either, but I've definitely seen evidence of noise shaping - dither that's shaped using perceptual spectrum weighting. Not all sigma-delta systems attempt to achieve a flat noise spectrum across the entire audio band.

I suppose I may be focused too intently on addition and multiplication, where approximation is neither necessary nor acceptable.

One of the reasons that this discussion involving delta-sigma audio is interesting to me is that every computing system that I am aware of uses multi-bit math building blocks at the hardware level. To my knowledge, there are literally no circuits that calculate sum, product, or any other linear process using only single-bit operand storage and processing. This is closely related to my interest in power-efficient processing. If there's no way to directly perform basic mathematical operations on a 1-bit stream in hardware, then we're left with very inefficient abstract software approximations that can be too expensive for battery-powered devices, just to use one example.

Multi-bit arithmetic units are universally available, proven to be 100% accurate (with few exceptions), and well known. By contrast, direct single-bit arithmetic seems to be impossible in hardware (without converting to multi-bit), inherently lossy, and doomed to be an approximation at best.

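To illustrate why mixing appears to force a detour through multi-bit values, here is a minimal software sketch. It assumes a textbook first-order sigma-delta modulator, not any particular converter design, and the function names `sigma_delta_1bit` and `mix_1bit` are hypothetical. Averaging two +1/-1 streams sample by sample gives three possible levels (-1, 0, +1), which no longer fits in one bit, so the result has to be re-modulated through a multi-bit accumulator.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator (textbook form, assumed here).

    Takes values in [-1, 1] and returns a list of +1/-1 bits whose
    running average tracks the input.
    """
    accumulator = 0.0  # multi-bit state: running quantization error
    bits = []
    for x in samples:
        accumulator += x
        bit = 1 if accumulator >= 0 else -1  # 1-bit quantizer
        accumulator -= bit                   # feed the decision back
        bits.append(bit)
    return bits


def mix_1bit(a_bits, b_bits):
    """Mix two 1-bit streams with equal gain of 0.5 each.

    The sample-wise average takes values in {-1, 0, +1}, so it is
    re-modulated to get back to a 1-bit stream.
    """
    three_level = [(a + b) / 2 for a, b in zip(a_bits, b_bits)]
    return sigma_delta_1bit(three_level)


if __name__ == "__main__":
    a = sigma_delta_1bit([0.5] * 1000)
    b = sigma_delta_1bit([-0.25] * 1000)
    mixed = mix_1bit(a, b)
    print(sum(mixed) / len(mixed))  # should be close to 0.125
```

Even in this simplest case the accumulator needs more than two states, and the re-modulation adds its own quantization noise on top of the noise already present in the two inputs.
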
Not to distract from the original topic, but if anyone reading is aware of hardware implementations of direct single-bit mathematical operations (addition, subtraction, multiplication, division) that do not rely on the established convert-to-multi-bit-and-process-using-hardware-multi-bit-arithmetic-units-before-converting-back-to-single-bit approach, then please share.

This may be a philosophically loaded question, since any algorithm with more than two states would need to have a multi-bit state storage register. Thus, it may be mathematically impossible to operate on a 1-bit input stream using 1-bit arithmetic units to produce a 1-bit output stream (unless you're willing to limit the operations to +0, -0, x1, /1).

Brian Willoughby
