On 18/07/2015, robert bristow-johnson <r...@audioimagination.com> wrote:
>
> listen, one thing i have to remind myself here (because if i don't, i'm
> gonna get embarrassed) is to not underestimate either the level of
> scholarship nor the level of practical experience doing things related
> to music (or at least audio) and DSP.
Failure to understand the difference between the concepts of 'entropy'
and 'entropy rate' indicates something lacking in the level of
scholarship. Failure to understand that log2(p) != 0 when p != 1 may
also indicate something lacking (or some conceptual misunderstanding).
The notion of 'entropy' is already well defined in the literature, so
it is not my task to define it; it is better to point to the
references in the literature.

> listen, i understand LPC.

Great! I never said, thought, or implied that you don't. I merely
noticed and expressed that a bitflip counter is equivalent to an error
estimator for a subset of linear predictors (namely, binary delta
coding), without implying anything about your knowledge or
understanding. (And for that matter, I did not address that particular
message to you personally; I just thought this particular topic could
be interesting for *some* readers.)

> i know how to do this stuff. i know how to teach this stuff and in a
> previous lifetime have done it.

Great. When did I say or imply that you don't? LPC is actually a
fairly basic concept.

> so be careful, lest you lose people's interest, to restrain any
> patronizing tone. we're not as dumb as you think.

When did I say you're dumb? I merely pointed out to Ethan Duni that
"entropy" != "entropy rate". Failure to understand this will result in
failure to understand that, except in some rare corner cases,
virtually all signals have entropy. That doesn't imply that he is dumb
(I don't think he is dumb); it just implies that he doesn't understand
the difference between entropy and entropy rate. Hence, I pointed to a
standard piece of literature that defines it, specifically because he
asked me to define it.

If you are eager to be patronizing and to point out the supposedly
missing areas of my knowledge, then I'll also be eager to point out
where your knowledge may be lacking. That's a fair deal, isn't it?
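The bitflip-counter equivalence mentioned above can be sketched in a few
lines of Python. This is only an illustration of the claim, not anyone's
actual implementation; the function names and the test signal are my own
(hypothetical) choices.

```python
# Illustrative sketch: for a 1-bit signal, counting bit flips equals
# counting the nonzero residuals of a first-order delta predictor
# (predict each sample as the previous sample).

def bitflip_count(bits):
    """Count positions where the signal changes value."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def delta_residual_weight(bits):
    """Nonzero residuals of the predictor x_hat[n] = x[n-1]."""
    # In GF(2), subtraction is XOR, so the residual is a XOR b.
    residual = [a ^ b for a, b in zip(bits, bits[1:])]
    return sum(residual)

signal = [0, 0, 1, 1, 1, 0, 1, 0]
assert bitflip_count(signal) == delta_residual_weight(signal) == 4
```

The XOR line is the whole point: binary delta coding's prediction error
is 1 exactly where the signal flips, so the two counts must agree.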
After you, Robert, told me that I am a crackpot by giving me a link to
the "Crackpot index" (I still remember - I don't forget), I think I
was not even harsh (and in light of that, you sound hypocritical). Now
let's get back to the topic.

> so here's my question(s) ... please read carefully because i'm the guy
>
> i can sorta see a generic measure of entropy doing that, but it assumes
> messages that are adjacent bits. you can add to that messages that are
> grouped together in any imagined fashion, not just adjacent bits.

Exactly. That would give another correlation measure.

> there are virtually uncountable ways of grouping together the bits of a
> binary file to measure entropy. i simply cannot believe you did it
> every-which-way.

Exactly. I did not do it in every possible way, though that would be
interesting to test (in a given window). For a finite number of bits,
the number of combinations is finite, though for high N it is
certainly impractical to compute. I would expect that the more
correlations you analyze, the more accurate a measure you get.

> so, because i am too lazy to look around at previous posts, what is the
> *specific* way or ways you grouped together the bits into messages so
> that you could count their frequency of occurrence and get a measure of
> mean bits of information per message?

First, let me point out that there is no universally good answer to
this question; in general, it depends on the message and the source
material (and the length of the analysis window, the amount of input,
etc.). In my tests, I got fairly good results on *certain* test data
by using a sliding window, essentially what you described: grouping
adjacent bits and building a histogram. If I understand right, the
frequency of m1m2m3...mN gives the relative probability (transition
probability) of mN occurring after m1m2m3...mN-1. I got this idea from
Shannon's paper, where he does the same for letters to analyze the
transition probabilities of English text.
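The sliding-window scheme described above can be sketched as follows.
This is a minimal illustration under my own assumptions (window length
`n` and the test signals are arbitrary choices), not the actual code
discussed in the thread.

```python
# Sketch: slide an n-bit window over a bit sequence, histogram the
# n-bit patterns, and compute the empirical entropy in bits per group.
from collections import Counter
from math import log2

def windowed_entropy(bits, n):
    """Empirical entropy of overlapping n-bit groups (bits per group)."""
    groups = [tuple(bits[i:i + n]) for i in range(len(bits) - n + 1)]
    counts = Counter(groups)
    total = len(groups)
    return -sum((c / total) * log2(c / total) for c in counts.values())

constant = [0] * 64
alternating = [i % 2 for i in range(64)]

# A constant signal yields a single pattern, hence zero entropy.
assert windowed_entropy(constant, 3) == 0.0
# A strictly alternating signal yields only two equally likely 3-bit
# patterns, (0,1,0) and (1,0,1), hence exactly 1 bit per group.
assert windowed_entropy(alternating, 3) == 1.0
```

Note that this measures entropy per n-bit group under an i.i.d. model of
the groups; because the groups overlap, it reflects the transition
structure the text describes, but it is not the same thing as the
entropy rate of the source.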
If the bits/symbols are grouped in a different fashion, then the
occurrence of a pattern would indicate the relative probabilities of
the bits/symbols being in that particular position relative to each
other (I haven't tested this). In the tests that I'm currently doing,
I find that what worked well for simple 1-bit signals doesn't work
well for multi-bit (audio) signals, indicating that a different
measure is needed. As said, there are a zillion ways of analyzing
possible correlations, so I cannot give you a universally good
measure. In fact, that's what I'd like to find out myself as well. So
this simple measure has very limited application, and I find that it
doesn't work well for more complex signals. I have some further ideas,
but without testing them, I can't tell how well they would work.

> that question you certainly did not answer to my satisfaction.

I am afraid to disappoint you, but I cannot give you One Universal
Truth. In the tests that gave good results for *some* test material, I
simply grouped adjacent bits. But that measure gave bad results on
other material, so it is certainly not universal. What would work
better for general audio signals, I am currently still trying to
figure out (and I believe every model will have some weaknesses).

-P

--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp