Re: [music-dsp] about entropy encoding
I had an idea on how to improve the entropy estimate, especially for some types of signals. Here are the results: http://morpheus.spectralhead.com/entropy/estimate2/

-P

--
dupswapdrop -- the music-dsp mailing list and website: subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp
Re: [music-dsp] about entropy encoding
On 17/07/2015, Ethan Duni ethan.d...@gmail.com wrote:
"Said another way: if that's the only signal I want to transmit, then I'll just build those parameters into the receiver and not bother transmitting anything. The entropy will be zero. The receiver will simply output the desired waveform without any help from the other end."

I argue that's not a transmitter but a signal generator. In that case you do not transmit anything; you just have a signal generator that constantly outputs a square wave with constant parameters. Once you build an _actual_ transmitter, one that actually transmits something, the entropy of a message is bound to be nonzero, since you need to transmit at least a single bit to transmit anything. You cannot transmit nothing... that's not a transmitter.

[Sidenote: assuming your signal generator is a digital one, it will contain bits that represent 1) its code and 2) its parameters, both of which have nonzero entropy. The "just build those parameters into the receiver" step is exactly where that entropy comes from. Consider algorithmic entropy[1] (also called Kolmogorov complexity): any program that has an output of nonzero length is bound to have nonzero size, hence nonzero entropy. Therefore, your constant signal generator will have nonzero (algorithmic) entropy. If you want to transmit your constant signal generator program in a message, then the length of that message is bound to be nonzero.]

"It's only if there's some larger class of signals containing this one that I want to handle, that the question of entropy comes up."

A larger class of signals means, precisely, at least two signals, since 2 > 1. Once you have at least two signals that you might want to transmit (which is virtually all cases when you have an actual transmitter and not a constant signal generator), then each message will always have nonzero entropy. Hence, when you have an actual transmitter, the entropy of a message is bound to be nonzero in all cases.
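The sidenote above can be illustrated with a minimal sketch (my own hypothetical example, using Python's function objects as a stand-in for "the program"): a hard-coded generator transmits nothing, yet its parameters and code still occupy nonzero space inside the program object itself.

```python
def square(n, amplitude=100, period=8, duty=0.75):
    """A 'constant signal generator': one fixed square wave, all
    parameters hard-coded as default values."""
    return amplitude if (n % period) < duty * period else -amplitude

# The parameters did not vanish by being "built into the receiver":
# they live inside the program object, occupying nonzero space, i.e.
# the generator has nonzero (algorithmic) entropy.
hidden_parameters = square.__defaults__       # (100, 8, 0.75)
program_size = len(square.__code__.co_code)   # nonzero bytecode length
print(hidden_parameters, program_size)
```

Serializing and sending this function (its bytecode plus its defaults) would therefore always produce a message of nonzero length.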
[And if you only have a single signal with 100% probability, then your constant signal generator will have nonzero algorithmic entropy, meaning the length of the smallest program that produces that output is nonzero. You cannot produce something from nothing: by hard-coding the parameters into your program, you merely hid that entropy inside the program's algorithm or data. If you want to transmit the program or its parameters, the length of the message will be nonzero.

Quote from [1]: "In Shannon's theory, entropy is a property of a statistical ensemble [...]. Kolmogorov and Chaitin showed independently that entropy is in fact an *intrinsic* property of an object, which can be computed (in theory at least) without reference to anything outside that object."

The field that deals with this is called algorithmic information theory, developed around 1965, and these "crackpots" believe that entropy (information content) is not necessarily a function of a probability distribution. This theory is half a century old; I'm not saying anything new.]

-P

Reference:
[1] http://www.maths.tcd.ie/pub/Maths/Courseware/AlgorithmicEntropy/Entropy.pdf
Re: [music-dsp] about entropy encoding
On 18/07/2015, Peter S peter.schoffhau...@gmail.com wrote:
"[Sidenote: assuming your signal generator is a digital one, it will contain bits that represent 1) its code and 2) its parameters, both of which have nonzero entropy. The 'just build those parameters into the receiver' step is exactly where that entropy comes from."

Example: if the 100%-probability square wave that you want to generate is characterized by the parameter set amplitude=100, phase=30, frequency=1000, duty cycle=75%, represented by the binary sequence '1 1100100 0 101000 1001011', then "just build those parameters into the receiver" means that you put those bits (or some other representation of them) into your program. Without your program containing these numbers (in whatever form or encoding), how will it know that it needs to output a square wave with an amplitude of 100, phase of 30, frequency of 1000 and duty cycle of 75%? If any of those parameters is missing, it won't know, so it won't output that exact square wave. Therefore, your program *must* contain these numbers. So all you did is put that entropy into your program, and pretend that there's no entropy. It's still there, except it is now in your program, not in a message. You merely turned it into algorithmic entropy: you cannot make something out of nothing.

-P
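For a concrete count of those bits, here is a minimal sketch using a plain unsigned binary encoding (a simplification; the sequence quoted in the email evidently uses a different field layout):

```python
# Encode the four square-wave parameters as plain unsigned binary numbers
# and count how many bits the program/receiver must hold.
params = {"amplitude": 100, "phase": 30, "frequency": 1000, "duty_pct": 75}
bits = {name: format(value, "b") for name, value in params.items()}
total_bits = sum(len(b) for b in bits.values())

print(bits)        # amplitude -> '1100100', etc.
print(total_bits)  # nonzero: these bits must live *somewhere*
```

Whatever encoding is chosen, the total is nonzero, which is the point: the parameters occupy storage either in a message or in the program.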
Re: [music-dsp] about entropy encoding
Arbitrary waveform entropy estimate #3
http://morpheus.spectralhead.com/entropy/estimate3/

Results:
- For white noise, gives ~1.
- For simple periodic waveforms, gives ~0.1.
- For a mixture of two simple nonharmonic periodic waveforms, gives 0.5.
- For periodic noise of increasing length, gives an increasing estimate.

At this point I'm convinced that it is possible to create entropy estimators that work well for arbitrary simple periodic waveforms. For nonharmonic mixtures of periodic waveforms they are not as good, but they still distinguish such mixtures from noise.

-P
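The linked page does not show the estimator itself, but the general behaviour described above (near 1 for white noise, much lower for simple periodic waveforms) can be reproduced with a crude stand-in: the ratio of compressed size to raw size under a general-purpose compressor. This is only an illustrative sketch, not the estimator from the link.

```python
import random
import zlib

def compressibility(samples):
    """Crude redundancy proxy: compressed size / raw size.
    Near 1 for incompressible (noise-like) data, near 0 for
    highly redundant (periodic) data."""
    raw = bytes(int(s) & 0xFF for s in samples)
    return len(zlib.compress(raw, 9)) / len(raw)

random.seed(1)
noise  = [random.randrange(256) for _ in range(4096)]
square = [255 if (n % 64) < 32 else 0 for n in range(4096)]

print(compressibility(noise))   # close to 1
print(compressibility(square))  # far below 1
```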
Re: [music-dsp] about entropy encoding
Consider what happens when you have an actual transmitter/receiver pair, and not just a constant signal generator. There are two possibilities:

1) They use a pre-agreed codebook.
2) The transmitter first sends the codebook to be used to decode the messages.

If neither of these is true, then the receiver will have absolutely no clue what to output. It will just receive messages '0', '001', '101', '1100', '1110', '10', ... If the receiver doesn't have a codebook (either pre-agreed or received) to look up what each of these codes means, then the receiver will think: "What to do? I have no idea what these mean, so I don't know what to output."

In case 1), when they have a pre-agreed codebook, you simply turned its entropy into algorithmic entropy (you hid that entropy inside your program's data), and the receiver can only reconstruct a certain set of hard-coded parameters (a certain set of hard-coded waveforms).

In case 2), the transmitter first builds a codebook (for example, based on the probabilities of the messages it is going to send, using Huffman coding), and transmits it to the receiver. For example, let's say that there are going to be only 2 different square waves:

A) Amplitude=100, phase=30, frequency=1000, duty cycle=75%, represented by the binary sequence '1 1100100 0 101000 1001011'.
B) Amplitude=50, phase=0, frequency=333, duty cycle=50%, represented by the binary sequence '1 110010 0 101001101 110010'.

The transmitter decides to send square wave A as bit '0', and square wave B as bit '1'. Now, in order for them to be able to communicate, the transmitter *must* first send this mapping (codebook) to the receiver for the receiver to be able to understand and decode the messages! So the *very* first message will be this: "Hey receiver! When I send you '0', that means 'square wave, amplitude=100, phase=30, frequency=1000, duty cycle=75%'. And when I send you '1', that means 'square wave, amplitude=50, phase=0, frequency=333, duty cycle=50%'. End of codebook."
Expressed as binary, it may look something like this (encoded as either a prefix code, a fixed-length code, or some other encoding):

0 1 1100100 0 101000 1001011
1 1 110010 0 101001101 110010

The encoding of the codebook may differ, but it is *certain* to contain all the parameters for all the possible square waves that will be sent, and this _must_ be the very first message, otherwise the receiver won't know what the received codes mean! Note that there's no way to compress the codebook further using another custom encoding, because for that other encoding, another codebook would have to be sent first. Hence, the codebook must be sent once 'as-is', or using some prefix encoding that doesn't rely on a custom code mapping.

So, before sending *any* encoded messages, unless the codebook is pre-agreed, the transmitter *must* first send *all* sets of parameters *once*, for the receiver to be able to understand the messages. Regardless of _whatever_ probability distribution they have. If the transmitter wants to send some *very* complex waveform, even if that waveform will be encoded as a short, single '0' bit later on, it must first send the waveform *once* for the receiver to be able to reconstruct it! Otherwise, the receiver will not know what to do when receiving the message '0'!

So, assuming the messages are sent by encoding via a codebook, then however short a code something may be encoded with, unless it is pre-agreed (and thus hard-coded as algorithmic entropy in the program), it *must* be sent at least *once* for the receiver to be able to reconstruct it. Without that, the receiver is unable to decode the messages. So, if the codebook has a length of 100 bits (using whatever encoding), and there are going to be 100 messages, then the *actual* entropy (length) of each message is increased by 1 bit, since on average there is 1 extra bit in the codebook for each message that is sent, and that extra bit is absolutely necessary for decoding the messages.
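The codebook handshake above can be sketched as a toy transmitter/receiver pair (a hypothetical illustration, not a real wire protocol; the parameter values are the ones from the email):

```python
# Two square waves, keyed by the short codes the transmitter will use.
codebook = {
    "0": {"amplitude": 100, "phase": 30, "frequency": 1000, "duty_pct": 75},
    "1": {"amplitude": 50,  "phase": 0,  "frequency": 333,  "duty_pct": 50},
}

def transmit():
    # The *very first* message is the codebook itself, sent once.
    yield ("CODEBOOK", codebook)
    # Subsequent messages are just one code symbol each.
    for code in ["0", "1", "1", "0"]:
        yield ("DATA", code)

received_codebook = None
decoded = []
for kind, payload in transmit():
    if kind == "CODEBOOK":
        received_codebook = payload
    else:
        # Without received_codebook this lookup would be impossible:
        # the receiver would have no idea what '0' or '1' means.
        decoded.append(received_codebook[payload])

print([d["frequency"] for d in decoded])
```

If the `("CODEBOOK", ...)` message is removed, the first `DATA` lookup fails, which is exactly the point of the argument: the short codes are only meaningful once the full parameter sets have been sent once.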
Said otherwise: if the average length of a message is H bits, you send N messages, and the codebook has a length of L bits, then the *actual* effective entropy H' of each message will be

H' = H + L/N

bits, not H bits, because the transmitter must also send the codebook first, increasing the overall average bits per message (hence, the actual effective entropy) by L/N bits. Since the length L of the codebook is nonzero unless it's pre-agreed, L/N will be nonzero unless the number of messages to be sent is infinite (so, in practice, always). Hence, the encoded parameter set in the codebook (from whatever probability distribution) will *always* increase the average entropy per message by a *nonzero* amount, its exact value determined only by the length of the codebook (the size of the parameter set) and the number of messages. (And that is just in addition to the entropy of the messages coming from the probability distribution, which is already nonzero once you have at least two possible messages of nonzero probability.) So we have
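A minimal numerical sketch of the H' = H + L/N accounting, with hypothetical numbers (two equiprobable waveforms, a 100-bit codebook):

```python
H = 1.0    # bits per message: two equiprobable square waves
L = 100.0  # one-time cost of transmitting the codebook, in bits

# Effective bits per message once the codebook cost is amortized.
for N in (10, 100, 1000):
    H_eff = H + L / N
    print(N, H_eff)
```

The overhead L/N shrinks as N grows but never reaches zero for any finite number of messages, matching the claim above.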
Re: [music-dsp] about entropy encoding
It follows from the above that Shannon's entropy model is a simplified, idealized model of information, which pretends that algorithms have zero length and thus no entropy, and that you can magically share a codebook telepathically without actually transmitting it. If you dismiss all these details and just pretend they do not exist, then in that highly simplistic, idealized model you may actually have an entropy of zero in a *single*, special corner case. Even in Shannon's highly simplified model, in virtually all other cases, entropy is nonzero.

Later models of information, like that of Kolmogorov, base the amount of information not on statistical probability but rather on the minimal length of the algorithm needed to encode that information; these turn out to be nearly equivalent definitions.

Quote from [1]: "Firstly, we have to point out the relation between Kolmogorov complexity and Shannon's entropy. Briefly, classic information theory says a random variable X distributed according to P(X=x) has entropy or complexity H(X) = -SUM P(X=x)*log P(X=x), where the interpretation is that H(X) bits are on the average sufficient to describe an outcome x. Computational or algorithmic complexity says that an object x has complexity C(x) = the minimum length of a binary program for x. It is very interesting to remark that these two notions turn out to be much the same. Thus, the intended interpretation of complexity C(x) as a measure of the information content of an individual object x is supported by a tight quantitative relationship to Shannon's probabilistic notion. In particular, the entropy H = -SUM P(x)*log P(x) of the distribution P is asymptotically equal to the expected complexity SUM P(x)*C(x)."

According to [2]: "Algorithmic entropy is closely related to statistically defined entropy, the statistical entropy of an ensemble being, for any concisely describable ensemble, very nearly equal to the ensemble average of the algorithmic entropy of its members."
According to [3]: "Nevertheless, once allowance is made for the units used, the expectation value of the algorithmic entropy for a set of strings belonging to an ensemble is virtually the same as the Shannon entropy, or the Boltzmann or Gibbs entropies derived for a set of states."

So according to these sources, the algorithmic entropy asymptotically equals, very nearly equals, or is virtually the same as the probabilistic entropy. Therefore, any signal that has nonzero length needs a nonzero-length program to encode it, and therefore has nonzero entropy.

[1] On Shannon, Fisher, and Algorithmic Entropy in Cognitive Systems
http://uni-obuda.hu/conferences/saci04/Crisan.pdf
[2] 2003, Niels Henrik Gregersen, From Complexity to Life
https://en.wiktionary.org/wiki/algorithmic_entropy
[3] 2009, Sean Devine, The Insights of Algorithmic Entropy
http://www.mdpi.com/1099-4300/11/1/85/pdf
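The quoted near-equality of Shannon entropy and expected code length can be checked for a tiny hand-picked case (my own sketch, not taken from the cited papers): for a dyadic distribution, the Shannon entropy exactly equals the expected length of an optimal prefix (Huffman) code.

```python
import math

# Three messages with dyadic probabilities, and a matching prefix code.
probs = {"A": 0.5, "B": 0.25, "C": 0.25}
code  = {"A": "0", "B": "10", "C": "11"}

# Shannon entropy: H = -SUM P(x) * log2 P(x)
H = -sum(p * math.log2(p) for p in probs.values())

# Expected code length under the same distribution.
expected_len = sum(probs[s] * len(code[s]) for s in probs)

print(H, expected_len)  # both equal 1.5 bits
```

For non-dyadic distributions the two differ by less than one bit per symbol, which is the sense in which the quoted sources say the notions are "much the same".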
Re: [music-dsp] about entropy encoding
On 7/18/15 10:10 PM, Peter S wrote:
"It follows from the above, that Shannon's entropy model is a simplified, idealized model of information, that pretends that algorithms have zero length and thus no entropy, and you can magically share a codebook telepathically without actually transmitting it."

Shannon makes no issue of that. if you need the codebook at the other end (because you don't already have it), you need to send it. it's not so big, and i have seen that in some compressed files, and i am sure .zip files (at least the old ones, that were mostly Huffman coded but also had some run-length encoding in the model) have the codebook in the file preamble.

even so, Shannon information theory is sorta static. it does not deal with the kind of redundancy of a repeated symbol or string, just with the probability of occurrence (which is measured from the relative frequency of occurrence) of messages. run-length coding isn't in there. maybe there is something in Shannon information theory about information measure with conditional probability (that might obviate what LPC can do to compress data).

--

r b-j r...@audioimagination.com

"Imagination is more important than knowledge."
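The run-length redundancy mentioned above, which a static per-symbol probability model does not capture, can be illustrated with a minimal sketch (illustrative only):

```python
def run_length_encode(symbols):
    """Collapse runs of repeated symbols into (symbol, count) pairs --
    exploiting order/repetition redundancy that a memoryless
    per-symbol probability model cannot see."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]

print(run_length_encode("aaaabbbcca"))
```

A memoryless model sees only the symbol frequencies of "aaaabbbcca"; the run-length view additionally exploits the fact that equal symbols arrive in contiguous runs, which is closer to what conditional-probability (Markov) models or LPC-style prediction capture.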