On 18/07/2015, robert bristow-johnson <r...@audioimagination.com> wrote:
>
> listen, one thing i have to remind myself here (because if i don't, i'm
> gonna get embarrassed) is to not underestimate either the level of
> scholarship nor the level of practical experience doing things related
> to music (or at least audio) and DSP.
Failure to understand the difference between the concepts of 'entropy'
and 'entropy rate' indicates something lacking in the level of
scholarship. Failure to understand that log2(p) != 0 when p != 1 may
also indicate something lacking (or some conceptual misunderstanding).
The notion of 'entropy' is already well defined in the literature, so
it is not my task to define it; it is better to point to the
references in the literature.

> listen, i understand LPC.

Great! I never said, thought, or implied that you don't. I merely
noticed and expressed that a bitflip counter is equivalent to an error
estimator for a subset of linear predictors (namely, binary delta
coding), without implying anything about your knowledge or
understanding. (And for that matter, I did not address that particular
message to you personally; I just thought this particular topic could
be interesting for *some* readers.)

> i know how to do this stuff. i know how to teach this stuff and in a
> previous lifetime have done it.

Great. When did I say or imply that you don't? LPC is actually a
fairly basic concept.

> so be careful, lest you lose people's interest, to restrain any
> patronizing tone. we're not as dumb as you think.

When did I say you're dumb? I merely pointed out to Ethan Duni that
"entropy" != "entropy rate". Failure to understand this will result in
failure to understand that, except in some rare corner cases,
virtually all signals have entropy. That doesn't imply that he is dumb
(I don't think he is dumb); it just implies that he doesn't understand
the difference between entropy and entropy rate. Hence, I pointed to a
standard piece of literature that defines it, specifically because he
asked me to define it.

If you are eager to be patronizing and to point out the supposedly
missing areas of my knowledge, then I'll also be eager to point out
where your knowledge may be lacking. That's a fair deal, isn't it?
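The bitflip-counter equivalence mentioned above can be sketched in a few
lines of Python. This is only an illustration of the claim, not anyone's
actual implementation; the function names and the test signal are my own
(hypothetical) choices.

```python
# Illustrative sketch: for a 1-bit signal, counting bit flips equals
# counting the nonzero residuals of a first-order delta predictor
# (predict each sample as the previous sample).

def bitflip_count(bits):
    """Count positions where the signal changes value."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def delta_residual_weight(bits):
    """Nonzero residuals of the predictor x_hat[n] = x[n-1]."""
    # In GF(2), subtraction is XOR, so the residual is a XOR b.
    residual = [a ^ b for a, b in zip(bits, bits[1:])]
    return sum(residual)

signal = [0, 0, 1, 1, 1, 0, 1, 0]
assert bitflip_count(signal) == delta_residual_weight(signal) == 4
```

The XOR line is the whole point: binary delta coding's prediction error
is 1 exactly where the signal flips, so the two counts must agree.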
After you, Robert, told me that I am a crackpot by giving me a link to
the "Crackpot index" (I still remember - I don't forget), I think I
was not even harsh (and in light of that, you sound hypocritical). Now
let's get back to the topic.

> so here's my question(s) ... please read carefully because i'm the guy
>
> i can sorta see a generic measure of entropy doing that, but it assumes
> messages that are adjacent bits. you can add to that messages that are
> grouped together in any imagined fashion, not just adjacent bits.

Exactly. That would give another correlation measure.

> there are virtually uncountable ways of grouping together the bits of a
> binary file to measure entropy. i simply cannot believe you did it
> every-which-way.

Exactly. I did not do it in every possible way, though that would be
interesting to test (in a given window). For a finite number of bits,
the number of combinations is finite, though for high N it is
certainly impractical to compute. I would expect that the more
correlations you analyze, the more accurate a measure you get.

> so, because i am too lazy to look around at previous posts, what is the
> *specific* way or ways you grouped together the bits into messages so
> that you could count their frequency of occurrence and get a measure of
> mean bits of information per message?

First, let me point out that there is no universally good answer to
this question; in general, it depends on the message and the source
material (and the length of the analysis window, the amount of input,
etc.). In my tests, I got fairly good results on *certain* test data
by using a sliding window, essentially what you described: grouping
adjacent bits and building a histogram. If I understand right, the
frequency of m1m2m3...mN gives the relative probability (transition
probability) of mN occurring after m1m2m3...mN-1. I got this idea from
Shannon's paper, where he does the same for letters to analyze the
transition probabilities of English text.
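The sliding-window scheme described above can be sketched as follows.
This is a minimal illustration under my own assumptions (window length
`n` and the test signals are arbitrary choices), not the actual code
discussed in the thread.

```python
# Sketch: slide an n-bit window over a bit sequence, histogram the
# n-bit patterns, and compute the empirical entropy in bits per group.
from collections import Counter
from math import log2

def windowed_entropy(bits, n):
    """Empirical entropy of overlapping n-bit groups (bits per group)."""
    groups = [tuple(bits[i:i + n]) for i in range(len(bits) - n + 1)]
    counts = Counter(groups)
    total = len(groups)
    return -sum((c / total) * log2(c / total) for c in counts.values())

constant = [0] * 64
alternating = [i % 2 for i in range(64)]

# A constant signal yields a single pattern, hence zero entropy.
assert windowed_entropy(constant, 3) == 0.0
# A strictly alternating signal yields only two equally likely 3-bit
# patterns, (0,1,0) and (1,0,1), hence exactly 1 bit per group.
assert windowed_entropy(alternating, 3) == 1.0
```

Note that this measures entropy per n-bit group under an i.i.d. model of
the groups; because the groups overlap, it reflects the transition
structure the text describes, but it is not the same thing as the
entropy rate of the source.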
If the bits/symbols are grouped in a different fashion, then the
occurrence of a pattern would indicate the relative probabilities of
the bits/symbols being in that particular position relative to each
other (I haven't tested this). In the tests that I'm currently doing,
I find that what worked well for simple 1-bit signals doesn't work
well for multi-bit (audio) signals, indicating that a different
measure is needed. As said, there are a zillion ways of analyzing
possible correlations, so I cannot give you a universally good
measure. In fact, that's what I'd like to find out myself as well. So
this simple measure has very limited application, and I find that it
doesn't work well for more complex signals. I have some further ideas,
but without testing them, I can't tell how well they would work.

> that question you certainly did not answer to my satisfaction.

I am afraid to disappoint you, but I cannot give you One Universal
Truth. In the tests that gave good results for *some* test material, I
simply grouped adjacent bits. But that measure gave bad results on
other material, so it is certainly not universal. What would work
better for general audio signals, I am currently still trying to
figure out (and I believe every model will have some weaknesses).

-P

--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp