Re: [music-dsp] about entropy encoding

2015-07-16 Thread Theo Verelst
Nonono, you don't get it, but I suppose only academics should try to 
do a proper universal theory application attempt, I won't respond to 
this anymore. I do suggest that if you'd take your own impulses and 
encode them with your own algorithms you would find less interesting and 
far less poetic information decay than you seem to suggest. I mean the 
diminishing values of your own elasticity coefficients are worrying and 
one sided.


It's like, if you'd make an MPEG or an alternative lossless audio 
encoder, you want to draw conclusions about what the encoding efficiency 
or the particular basis vectors and their weight factors are actually 
going to mean. That's not really how most people see the relatively 
simple statistical method you so meticulously (for me boringly) 
outlined, just like your own quote from the famous computer theoretician 
von Neumann doesn't mean what you think it means in the proper context 
(because there were theoretical issues related to determinism, ability 
to store the whole universe in bits stored in itself, and the difference 
between a computer program running with non-infinite speed on a computer 
and the differential equations that rule physics).


Think about the true givens in any theoretical statistics 
game/theory/program you want to work on. As soon as there are givens in 
statistical interpretations, formulas, programs, etc., there are 
different formulas at play (P(A∪B) = P(A) + P(B) - P(A∩B) kind of thing, for 
those fortunate enough to have received education at the required level; 
that also holds for continuous and for multi-dimensional probability 
distributions, and then there's the chain rule as well).
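
Here is a tiny numeric instance of that identity (Python, illustrative only, with the two events assumed independent so that P(A and B) = P(A)P(B)):

# inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
p_a, p_b = 0.5, 0.5
p_ab = p_a * p_b          # independence assumed for this example
print(p_a + p_b - p_ab)   # 0.75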


For a bit of the historic idea (don't feel obliged to click links, it's 
just some basics):


https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/intro-to-channel-capacity-information-theory


T.V.
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp 
links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 15/07/2015, Ethan Duni ethan.d...@gmail.com wrote:
 Right, this is an artifact of the approximation you're doing. The model
 doesn't explicitly understand periodicity, but instead only looks for
 transitions, so the more transitions per second (higher frequency) the more
 it has to do.

Yes. So for a periodic waveform, the estimation error equals the
frequency, since the true value should be zero. For the maximal frequency
(pattern 01010101...), it has maximal error.
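
A crude transition-counting figure of this kind can be sketched in a few lines of Python (illustrative only, not the estimator actually used in the thread): it is zero only for a constant signal and maximal for the 01010101... pattern.

def flips_per_bit(bits):
    # fraction of adjacent sample pairs that differ
    return sum(a != b for a, b in zip(bits, bits[1:])) / max(len(bits) - 1, 1)

print(flips_per_bit("0000000000"))  # 0.0, constant signal
print(flips_per_bit("0011001100"))  # ~0.44, lower-frequency square wave
print(flips_per_bit("0101010101"))  # 1.0, maximal-frequency pattern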

 The ideal estimator/transmission system for a periodic signal would figure
 out that the signal is periodic and what the period/duty cycle and
 amplitude are, and then simply transmit these 4 (finite) pieces of data.
 Then the receiver can use those to generate an infinitely long signal.

This model will be baffled as soon as you send something into it that
is not harmonic. So it is only ideal in the very simple case of a
single, periodic, harmonic waveform, which is just a small subset of
arbitrary signals.

 Your model
 will never get to a zero entropy rate because it doesn't understand the
 concept of periodicity, and so has to keep sending more data forever

Strictly speaking, that's not true. It will (correctly) give an
entropy rate of zero in the corner case of constant signal (f=0), as
that has no transitions. (For all other signals, it will always give
nonzero entropy rate, and so have an estimation error.)

 You only need the analysis window length to be greater than the period.
 Then you can do a search over possible shifts of the analysis window
 compared to the current frame, and you will find an exact match.

Only if the cycle length is an integer; otherwise it won't be 100% exact...

 You can use fractional delay filters for this.

And fractional delay filters will introduce some error to the signal.
For example, linear interpolation attenuates high frequencies, and allpass
interpolation introduces ringing near Nyquist (every interpolator exhibits
one or the other, to varying degrees). So it is almost 100%
certain that the original and the fractionally shifted signal will not
be bit-by-bit identical, whatever interpolator you're using, unless
you use a very high amount of oversampling with sinc interpolation
with a very long kernel, requiring impractically large amounts of CPU
power (and unless you do the oversampling and interpolation using a
higher-precision accumulator, you'll still have numerical errors
causing some bits to differ).
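
A small numpy sketch of that claim (illustrative only, with an arbitrarily chosen test sine): compare a half-sample delay done with linear interpolation against the exact half-sample-delayed sine; the residual is nonzero and grows with frequency.

import numpy as np

def max_error(freq, n=4096, d=0.5):
    t = np.arange(n)
    x = np.sin(2 * np.pi * freq * t)
    y_lin = (1 - d) * x[1:] + d * x[:-1]            # linear-interpolation fractional delay
    y_ref = np.sin(2 * np.pi * freq * (t[1:] - d))  # exact fractional delay of the sine
    return np.max(np.abs(y_lin - y_ref))

for freq in (0.01, 0.1, 0.2, 0.4):   # in cycles per sample
    print(freq, max_error(freq))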

Ideally, I consider a match to be exact only if every bit
is equal, whatever the representation (floating point or fixed). So
if you factor into your entropy estimate how much self-similarity there is
between the periods, then if the match is not 100% but just, say, 99.9%
due to quantization, interpolation and other numerical errors, then
your entropy measure will not be 0 but rather 0.01 (or something like
that). Two periods are 100% bit-by-bit identical only in the case of
an integer period length. And if two periods are not 100% identical, then
how do you know for _sure_ that it's supposed to be the exact same
period, and not a slightly different period carrying some new
information? Quantization, interpolation and other numerical errors
will add a slight uncertainty to your entropy estimate; in practice,
things are very rarely exact. Which I consider one of the reasons
why a practical entropy estimator will likely never give zero for a
periodic signal.

 What you need very long windows for is estimating the entropy rate of
 non-periodic signals that still have significant dependencies across
 samples.

Or, estimating the entropy rate of mixtures of several periodic
signals with non-harmonically related frequencies, which, being fully
deterministic, have an actual entropy rate of zero. If your model is
to find a single repeating period, then - unless your analysis window
is long enough - it won't find the common period, which is the least
common multiple of the periods of the individual components. You
can trivially see that in this case the model can successfully
utilize a longer window to identify longer cycles.
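
For instance (assuming, purely for illustration, integer periods of 7 and 13 samples), the mixture only repeats after the least common multiple of the periods, so any shorter window never contains a full common period:

from math import lcm

print(lcm(7, 13))   # 91 samples: the common period of the two components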

 You can think of a general entropy rate estimator as some (possibly quite
 sophisticated, nonlinear) signal predictor, where the resulting prediction
 error is then assumed to be Gaussian white noise. You then get your entropy
 estimate simply by estimating the power in the prediction error. If the
 signal in question is well modeled by the estimator, the assumption of a
 Gaussian white noise prediction error will be accurate and the resulting
 entropy rate estimate will be quite good. If not, the prediction error will
 have some different statistics (including dependencies across time) and
 treating it as iid Gaussian noise will introduce error into the entropy
 estimate. Specifically, the estimate will be too high, since iid Gaussian
 noise is the max entropy signal (for a given rms power). This
 overestimation corresponds directly to the failure of the signal predictor
 to handle the dependencies in 

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 16/07/2015, Peter S peter.schoffhau...@gmail.com wrote:

 The above is a histogram based entropy estimator. Another method of
 estimating entropy is to build a predictor that tries to predict the
 signal from the preceding samples. When compressing waveforms, audio
 codecs typically do that.

And on some level, a statistical histogram estimator is essentially a
predictor that tries to predict the next symbol based on a statistical
measurement, so it says: "So far the most frequent symbol was X, so I
predict that symbol X has the highest probability of being the next
symbol."

Wave compression codecs often use some form of linear predictor,
trying to predict the next sample from the preceding ones, and encode
only the residue, or error, between the prediction and the actual
sample. So when the prediction is good, what they actually encode is
like a small-amplitude, roughly Gaussian-distributed noise - the error
between the prediction and the original signal.
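
A toy first-order version of that idea, sketched in Python (illustrative only, not any particular codec's predictor): predict each sample from the previous one and keep only the residual, which for a smooth signal has a much smaller range than the signal itself.

import numpy as np

def encode_residual(x):
    pred = np.concatenate(([0], x[:-1]))   # predict x[n] from x[n-1]
    return x - pred                        # residual handed to the entropy coder

def decode_residual(res):
    return np.cumsum(res)                  # invert the first-order prediction

x = np.round(1000 * np.sin(2 * np.pi * 0.01 * np.arange(200))).astype(int)
res = encode_residual(x)
assert np.array_equal(decode_residual(res), x)   # lossless round trip
print(np.abs(x).max(), np.abs(res).max())        # residual range is much smaller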

-P


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 16/07/2015, Theo Verelst theo...@theover.org wrote:
 Nonono, you don't get it, but I suppose only academics should try to
 do a proper universal theory application attempt, I won't respond to
 this anymore. I do suggest that if you'd take your own impulses and
 encode them with your own algorithms you would find less interesting and
 far less poetic information decay than you seem to suggest. I mean the
 diminishing values of your own elasticity coefficients are worrying and
 one sided.

"Poetic information decay", "elasticity coefficient"... how are these
terms relevant here at all? I thought the term "poetic" is only used
in literature classes, not in statistics classes. Throwing around
irrelevant terms doesn't make it sound more academic.

I think you still don't understand that I was not trying to do a
proper universal theory or information-theory poetry; rather, I just
did some statistical tests on some test data. I merely said "this
particular formula, when applied to this data set, gives these
results", without any generalization or attempt at a universal theory.

 there are
 different formulas at play (P(A∪B) = P(A) + P(B) - P(A∩B) kind of thing, for
 those fortunate enough to have received education at the required level,

Congratulations, you can recall a formula from your statistics class.

-P


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 16/07/2015, Peter S peter.schoffhau...@gmail.com wrote:
 Quantization, interpolation and other numerical errors
 will add a slight uncertainty to your entropy estimate; in practice,
 things are very rarely exact.

Another way of looking at this - for any real-world signal, there is
typically some noise in the signal. When it is not dithered, there's
quantization error, resulting in quantization noise. When it is
dithered, there's noise from the dithering. When you interpolate it,
there is error from the interpolation, and possibly noise from the
numerical errors. When the signal is sampled from analog, then there's
a noise floor from the A/D converter, the microphone, or whatever
equipment was used to produce it.

So whatever the real-world sampled periodic signal is, regardless of
how it was derived, unless the cycle length is an integer, periods will
not be bit-by-bit identical, but rather have some small noise and
error with a certain distribution. So regardless of method, the
measured entropy will never reach zero (unless it's a special corner
case), because these various forms of noise will always add some
slight level of unpredictability to the signal. Dither noise and the
noise floor of an A/D converter or a microphone are fully
uncorrelated, so those will always add some unpredictability and thus
(perceived) entropy to the signal, even when it's fully periodic.
Quantization noise is not trivial to model either. Hence, unless it is
a constant signal, measured entropy will typically never be zero, as
there is almost always some slight noise in a sampled signal.
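
A short numpy sketch of that point (illustrative only, with arbitrary parameters): a perfectly periodic sine, once TPDF-dithered and quantized to 16 bits, no longer has bit-identical periods.

import numpy as np

rng = np.random.default_rng(0)
period = 100
t = np.arange(10 * period)
x = np.sin(2 * np.pi * t / period)                       # exactly periodic
dither = rng.random(t.size) + rng.random(t.size) - 1.0   # TPDF dither, +/- 1 LSB
q = np.round(30000 * x + dither).astype(np.int16)        # dithered 16-bit quantization
print(np.array_equal(q[:period], q[period:2 * period]))  # False: periods now differ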

-P


Re: [music-dsp] Sampling theorem extension

2015-07-16 Thread Charles Z Henry
On Mon, Jul 13, 2015 at 8:39 AM, Charles Z Henry czhe...@gmail.com wrote:
 On Mon, Jul 13, 2015 at 3:28 AM, Vadim Zavalishin
 vadim.zavalis...@native-instruments.de wrote:
 On 10-Jul-15 19:50, Charles Z Henry wrote:

 The more general conjecture for the math heads :
 If u is the solution of a differential equation with forcing function g
 and y = conv(u, v)
 Then, y is the solution of the same differential equation with forcing
 function
 h=conv(g,v)

 I haven't got a solid proof for that yet, but it looks pretty easy.


 How about the equation

 u''=-w*u+g

 where v is sinc and w is above the sampling frequency?

 I think you meant sqrt(w) is above the sampling freq.

 This is also a good one.  I also saw some scenarios like this, that
 just might take a little time.  The math should come out that if
 sqrt(w) > f_s, u=0 and if sqrt(w) < f_s, u is a sine.  I'll work on it
 a bit over the week and see if I can make the calculus work :)

--From here on, let's replace w with w^2, to make things simpler in discussion--

My first observations on the general problem:
For numerical integration, we need to re-write as a system of
1st-order ODE's in the first place:

d/dt u' + w^2*u = g
d/dt u - u' = 0

Integrate the 1st equation to get u', and integrate the 2nd equation to get u.

I think the general problem (arbitrary order linear ODE's and forcing
functions) is better cast as a system of 1st order ODE's.  There will
be an inductive step, so that increasing the order of the ODE (aka
size of the system of 1st order ODE's) does not present a novel
problem.
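
A minimal numerical sketch of that rewrite (Python, with an arbitrarily chosen forcing function and step size, not a derivation): integrate the pair of first-order equations with a simple semi-implicit Euler step.

import numpy as np

def integrate(g, w, dt, n):
    u, up = 0.0, 0.0                # u and u'
    out = np.empty(n)
    for i in range(n):
        up += dt * (g(i * dt) - w * w * u)   # d/dt u' = g - w^2*u
        u += dt * up                         # d/dt u  = u'
        out[i] = u
    return out

y = integrate(lambda t: np.sin(2 * np.pi * 5.0 * t), w=2 * np.pi * 50.0, dt=1e-4, n=20000)
print(y[:5], y.max())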


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 16/07/2015, Ethan Duni ethan.d...@gmail.com wrote:
 if a
 signal can be described by a finite set of parameters (amplitude, phase and
 frequency, say) then it immediately follows that it has zero entropy rate.

Let's imagine you want to transmit a square wave with amplitude=100,
phase=30, frequency=1000, and duty cycle=75%.

Question: what is the 'entropy' of this square wave?

Let's assume that the transmitter indicates whether the signal is a
periodic wave or not. Let's assume you send a single bit to indicate a
periodic waveform - you send 1 if periodic, 0 if not. For simplicity,
let's now assume that a periodic wave is always a square wave.

What do you need to send to transmit this square wave fully?

1) You send '1' to indicate that the signal is a periodic square wave.
2) You send '100' to transmit the amplitude, or written as binary, 1100100.
3) You send '30' to transmit phase, or as binary, 11110.
4) You send '1000' to transmit frequency, or as binary, 1111101000.
5) You send '75' to transmit duty cycle, or as binary, 1001011.

So, to transmit your square wave, you will need to somehow transmit
the following sequence of bits:

1 1100100 11110 1111101000 1001011

This is the binary sequence that uniquely identifies your square wave.

1) If you do not send '1' to indicate square wave, your receiver won't
recognize it.
2) If you do not send amplitude, the receiver doesn't know what
amplitude to use.
3) If you do not transmit phase, the phase will be off by 30 (whatever unit).
4) If you do not send frequency, the receiver will not know what to do.
5) If you do not transmit duty cycle, it will be off by 25%.

Conclusion: each one of these pieces of information is *absolutely*
necessary to reconstruct the square wave with amplitude=100, phase=30,
frequency=1000 and duty cycle=75%. If *any* of these data are missing,
the reconstruction will not be exact.

Since entropy means the minimum amount of information (usually
measured in bits) that needs to be sent to uniquely identify something,
the entropy of this square wave equals the entropy of the bit sequence
"1 1100100 11110 1111101000 1001011", which is a nonzero value in all cases.

How much it is *exactly* depends on the encoding you use to transmit
these numbers. The only thing that is certain is that it will have a
length that is nonzero; therefore, a square wave has _nonzero_
entropy, for all possible sets of parameters.
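
Here is a toy Python sketch of that bookkeeping (illustrative only, using the same variable-length binary fields as the example above; a real code would also need the fields to be self-delimiting): pack the parameters into a bit string and look at its length.

params = {"square": 1, "amplitude": 100, "phase": 30, "frequency": 1000, "duty": 75}
bits = "".join(format(v, "b") for v in params.values())
print(bits)        # the 30 bits from the example above, without separators
print(len(bits))   # 30 -- nonzero, whatever the exact encoding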

-P


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
This model will be baffled as soon as you send something into it that
is not harmonic. So it is only ideal in the very simple case of a
single, periodic, harmonic waveform, which is just a small subset of
arbitrary signals.

I'm not suggesting using a parametric signal model as an estimator. I'm
saying that an ideal estimator would be smart enough to figure out when
it's dealing with a parametrizable signal, and exploit that. It would also
be smart enough to realize when it's dealing with a non-parametrizable
signal, and do the appropriate thing in those cases.

Quantization, interpolation and other numerical errors
will add a slight uncertainty to your entropy estimate; in practice,
things are very rarely exact. Which I consider one of the reasons
why a practical entropy estimator will likely never give zero for a
periodic signal.

Getting to *exactly* zero is kind of nit-picky. An estimation error on the
order of, say, 10^-5 bits/second is as good as zero, since it's saying you
only have to send 1 bit every few hours. Given that you are unlikely to be
encoding any signals that last that long, the difference between that and
zero is kind of academic. This is just a matter of numerical error that can
be reduced arbitrarily by throwing computational power at the problem -
it's not a fundamental issue with the estimation approach itself.

The reason the non-zero estimates you're getting from your estimator are a
problem is that they are artifacts of the estimation strategy - not the
numerical precision - and so they do not reduce with more data, they are
correlated with signal properties like frequency and duty cycle, etc. These
are signs of flaws in the basic estimation approach, not in the numerical
implementation thereof.

E


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 16/07/2015, Ethan Duni ethan.d...@gmail.com wrote:
 But, it seems that it does *not* approach zero. If you fed an arbitrarily
 long periodic waveform into this estimator, you won't see the estimate
 approaching zero as you increase the length.

False. The better estimators give an estimate that approaches zero.

% set pattern [randbits [randnum 20]]; puts "pattern=$pattern"; for {set i 1} {$i
<= 10} {incr i} {put "L=$i, "; measure [repeat $pattern 1] $i}
pattern=1000110011011010
L=1, Estimated entropy per bit: 1.00
L=2, Estimated entropy per bit: 1.023263
L=3, Estimated entropy per bit: 0.843542
L=4, Estimated entropy per bit: 0.615876
L=5, Estimated entropy per bit: 0.337507
L=6, Estimated entropy per bit: 0.17
L=7, Estimated entropy per bit: 0.071429
L=8, Estimated entropy per bit: 0.031250
L=9, Estimated entropy per bit: 0.013889
L=10, Estimated entropy per bit: 0.006250

It seems that the series approaches zero with increasing length.
I can repeat it an arbitrary number of times with an arbitrary repeated waveform:
http://morpheus.spectralhead.com/entropy/random-pattern-tests.txt

Longer patterns (up to cycle length 100):
http://morpheus.spectralhead.com/entropy/random-pattern-tests-100.txt

For comparison, same tests repeated on white noise:
http://morpheus.spectralhead.com/entropy/white-noise-tests.txt

The numbers do not lie.
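
For what it's worth, the qualitative behaviour is easy to reproduce with a generic sliding-window estimator. The Python sketch below (a reconstruction of the general idea, not the Tcl "measure" proc whose output is shown above) estimates the entropy of the next bit given the previous L-1 bits from window counts; it likewise falls toward zero for a repeated pattern while staying near 1 bit for coin-flip noise.

import math, random
from collections import Counter

def cond_entropy_per_bit(bits, L):
    # entropy of the next bit given the previous L-1 bits, from empirical counts
    ctx, win = Counter(), Counter()
    for i in range(len(bits) - L + 1):
        w = bits[i:i + L]
        ctx[w[:-1]] += 1
        win[w] += 1
    n = sum(win.values())
    return -sum(c / n * math.log2(c / ctx[w[:-1]]) for w, c in win.items())

pattern = "1000110011011010" * 1000
noise = "".join(random.choice("01") for _ in range(16000))
for L in range(1, 11):
    print(L, round(cond_entropy_per_bit(pattern, L), 6),
          round(cond_entropy_per_bit(noise, L), 6))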

 Also you are only able to deal
 with 1-bit waveforms, I don't see how you can make any claims about general
 periodic waveforms with this.

I have an organ called "brain" that can extrapolate from data, and
make predictions.

Random periodic shapes give somewhat higher entropy estimate than pure
waves like square, so there's somewhat more entropy in it.

 No, all periodic signals have exactly zero entropy rate.

Entropy != entropy rate. The shape of the waveform itself has some entropy.

 The correct statement is that your estimator does an even worse job on
 complicated periodic waveforms, than it does on ones like square waves.
 This is because it's counting transitions, and not actually looking for
 periodicity.

The bitflip estimator, yes. The pattern matching estimator, no.

 Again, periodic waveforms have exactly zero entropy rate.

Again, entropy != entropy rate. I wrote "entropy", not "entropy rate".
The shape of the waveform itself contains some information, doesn't
it? To be able to transmit a waveform of arbitrary shape, you need to
transmit the shape somehow at least *once*, don't you think? Otherwise
you cannot reconstruct it... Hence, any periodic waveform has nonzero
_entropy_. The more complicated the waveform shape is, the more
entropy it has.

 If there is no randomness, then there is no entropy.

Entirely false. The entropy of the waveform shape is nonzero.
Do not confuse entropy with entropy rate.

Even a constant signal has entropy. Unless it's zero, you need to
transmit the constant value at least *once* to be able to reconstruct
it. Otherwise, how do you know what to reconstruct, if you transmit
*nothing*?

 This is the reason that parameter estimation is coming up - if a
 signal can be described by a finite set of parameters (amplitude, phase and
 frequency, say) then it immediately follows that it has zero entropy rate.

But not zero _entropy_. The parameters (amplitude, phase, frequency,
waveform shape) themselves have some entropy - you *do* need to
transmit those parameters to be able to reconstruct the signal, and
they definitely do have _nonzero_ entropy. If you do not transmit anything
(= zero entropy), you cannot reconstruct the waveform. Hence, any
waveform has nonzero entropy, and entropy != entropy rate. If you
did not receive anything at the receiver (zero entropy), what would you
reconstruct?

 Your estimator is itself a crude predictor:

Which estimator? I made at least 4 different estimators so far, some
more crude, some more sophisticated.

 But since it is such a crude predictor it can't account for
 obvious, deterministic behavior such as periodic square waves.

The better estimators can.

 Depends on the audio codec. High performance ones actually don't do that,
 for a couple of reasons. One is that general audio signals don't really
 have a reliable time structure to exploit, so you're actually better off
 exploiting psychoacoustics.

Are you speaking about lossy, or lossless codecs?
Lossless codecs like FLAC and WavPack certainly do that.

 Another is that introducing such
 dependencies means that any packet loss/corruption results in error
 propagation.

And how is error propagation from packet corruption in any way relevant
to entropy? That is an entirely unrelated property of codecs. (And
besides, they usually work on frames and include a CRC32 or MD5 hash
for the frame, so if a frame is corrupted, you can trivially
retransmit that frame.)

 Signal predictors are a significant part of speech codecs
 (where the fact that you're dealing with human speech signals gives you
 some reliable assumptions to exploit) and in low performance audio codecs
 

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
This algorithm gives an entropy rate estimate approaching zero for any
periodic waveform, regardless of the shape (assuming the analysis
window is large enough).

But, it seems that it does *not* approach zero. If you fed an arbitrarily
long periodic waveform into this estimator, you won't see the estimate
approaching zero as you increase the length. Also you are only able to deal
with 1-bit waveforms, I don't see how you can make any claims about general
periodic waveforms with this.

Random periodic shapes give somewhat higher entropy estimate than pure
waves like square, so there's somewhat more entropy in it.

No, all periodic signals have exactly zero entropy rate.

The correct statement is that your estimator does an even worse job on
complicated periodic waveforms, than it does on ones like square waves.
This is because it's counting transitions, and not actually looking for
periodicity.

Periodic waveforms have a
lot of self-similarity, so they have low entropy. Constant signal is
fully self-similar, so it has zero entropy.

Again, periodic waveforms have exactly zero entropy rate. This is true of
all deterministic signals. If there is no randomness, then there is no
entropy. This is the reason that parameter estimation is coming up - if a
signal can be described by a finite set of parameters (amplitude, phase and
frequency, say) then it immediately follows that it has zero entropy rate.
The fact that your estimator doesn't do parameter estimation means that it's
not going to capture those cases correctly, and instead will give non-zero
entropy rate estimates. Maybe that's an acceptable trade-off for the domain
you want to address, but it's not something that you can simply hand-wave
away.

Another method of estimating entropy is to build a predictor that tries to
predict the
signal from the preceding samples.

That's not so much another method as it is a basic requirement of
estimating the entropy rate. You have to account, in some way, for the
dependencies on previous samples, in order to get an idea of how much
surprise is coming in each sample. Your estimator is itself a crude
predictor: it predicts that the signal is constant, and then any deviation
from that is counted as a surprise and so increments the entropy
estimate. But since it is such a crude predictor it can't account for
obvious, deterministic behavior such as periodic square waves.

When compressing waveforms, audio codecs typically do that.

Depends on the audio codec. High performance ones actually don't do that,
for a couple of reasons. One is that general audio signals don't really
have a reliable time structure to exploit, so you're actually better off
exploiting psychoacoustics. You get a certain amount of decorrelation from the
MDCT or whatever, but there isn't really a notion of a general signal
predictor or a prediction error. Another is that introducing such
dependencies means that any packet loss/corruption results in error
propagation. Signal predictors are a significant part of speech codecs
(where the fact that you're dealing with human speech signals gives you
some reliable assumptions to exploit) and in low performance audio codecs
(ADPCM) where the overall system is simple enough that introducing some
adaptive prediction doesn't cause too many troubles.

E

On Wed, Jul 15, 2015 at 8:39 PM, Peter S peter.schoffhau...@gmail.com
wrote:

 On 16/07/2015, robert bristow-johnson r...@audioimagination.com wrote:
 
  i've only been following this coarsely, but did you post, either in code
  or pseudo-code the entropy estimator algorithm?  i'd be curious.

 It's essentially based on the ideas outlined on 2015-07-14 19:52 CET,
 see higher-order approximations. It's a sliding-window histogram
 estimator.

  what if the quantization was fully dithered (TPDF dither of 2 LSBs)?
  what does your estimator do with that?

 So far it's a rudimentary proof-of-concept estimator that currently
 works only on 1-bit samples. So if you dither that, it will think it
 is all noise :) In general, adding any unpredictable noise source to a
 signal will increase its measured entropy.

  how does you estimator know how to measure the necessary
  information content of different simple deterministic waveforms?  once
  you establish that the waveform is a triangle wave, again it doesn't
  take more than three number to fully describe it.

 My estimator basically uses pattern matching. It has no notion of
 'triangle wave', or any periodic wave for that matter, it just sees
 patterns. So it sees a 'corner' of a square wave, and remembers that
 pattern. Next time that corner comes in the next period, it says "Oh!
 This corner again! I've seen this already." So it produces a match. At
 the end of the analysis, it says "So, it was 523 matches for this very
 same corner, which is high probability, thus low entropy." And repeats
 the same for every pattern encountered, summing the estimated
 entropies. Works for 1-bit signals, higher bit-depths I am still
 

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 17/07/2015, Ethan Duni ethan.d...@gmail.com wrote:
 What are these better estimators? It seems that you have several estimators
 in mind but I can't keep track of what they all are,
 I urge you to slow down, collect your thoughts, and
 spend a bit more time editing your posts for clarity (and length).

I urge you to pay more attention and read more carefully.
I do not want to repeat myself several times.
(Others will think it's repetitive and boring.)

[And fuck the spend more time part, I already spent 30+ hours editing.]

 And what is entropy per bit? Entropy is measured in bits, in the first
 place. Did you mean entropy per symbol or something?

Are you implying that a bit is not a symbol?
A bit *is* a symbol. So of course, I meant that.

 Entropy is measured in bits, in the first place.

According to ISO/IEC 80000-13, entropy is measured in shannons:
https://en.wikipedia.org/wiki/Shannon_%28unit%29

For historical reasons, "bit" is often used synonymously with "shannon".

 Maybe you could try using this brain to interact in a good-faith way.

Faith belongs to church.

 The entropy of a signal - as opposed to entropy rate - is not a
 well-defined quantity, generally speaking.

Its exact value is not well-defined, yet it is *certain* to be non-zero.
(Unless you have only one particular signal, with 100% probability.)

 The standard quantity of interest in the signal context is entropy rate

Another standard quantity of interest in the signal context is entropy.

https://en.wikipedia.org/wiki/Entropy_%28information_theory%29

Quote:
Entropy is a measure of unpredictability of information content.

 If you want to talk about signal entropy, distinct from the entropy rate,
 then you need to do some additional work to specify what you mean by that.

Let me give you an example.

You think that a constant signal has no randomness, thus no entropy (zero bits).
Let's do a little thought experiment:

I have a constant signal that I want to transmit to you over some
noiseless discrete channel. Since you think a constant signal has zero
entropy, I send you _nothing_ (precisely zero bits).

Now try to reconstruct my constant signal from the nothing that you
received from me! Can you?

.
.
.

There's a very high chance you can't. Let me give you a hint. My
constant signal is 16-bit signed PCM, and its first sample is
sampled from uniform-distribution noise.

What is the 'entropy' of my constant signal?

Answer: since the first sample is sampled from uniform-distribution
noise, the probability of you successfully guessing my constant signal
is 1/65536. Hence, it has an entropy of log2(65536) = 16 bits. In
other words, I need to send you all 16 bits of the first sample
for you to be able to reconstruct my constant signal with 100%
certainty. Without receiving those 16 bits, you cannot reconstruct my
constant signal with 100% certainty. That's the measure of its
uncertainty, or unpredictability.

So you (falsely) thought a constant signal has zero randomness and
thus zero entropy, yet it turns out that when I sampled that constant
signal from the output of 16-bit uniform distribution white noise,
then my constant signal will have 16 bits of entropy. And if I want to
transmit it to you, then I need to send you a minimum of 16 bits for
you to be able to reconstruct it, despite the fact that it's a constant signal.

It may have an asymptotic 'entropy rate' of zero, yet that doesn't
mean that the total entropy is zero. So the 'entropy rate' doesn't
tell you the entropy of the signal. The total entropy (uncertainty,
unpredictability, randomness) in this particular constant signal is 16
bits. Hence, its entropy is nonzero, and in this case, 16 bits. Hence,
if I want to send it to you in a message, I need to send a minimum of
16 bits to send this constant signal to you. The 'entropy rate'
doesn't tell you the full picture.

 Switching back and forth between talking about narrow parametric classes of
 signals (like rectangular waves) and general signals is going to lead to
 confusion, and it is on you to keep track of that and write clearly.

When I say "square wave", I mean a square wave.
When I say "arbitrary signal", I mean an arbitrary signal.
What's confusing about that? What makes you unable to follow?

 Moreover, the (marginal) entropy of (classes of) signals is generally not
 an interesting quantity in the first place [...]
 Hence the emphasis on entropy rate,
 which will distinguish between signals that require a lot of bandwidth to
 transmit and those that don't.

Entropy also tells you how much bandwidth you require to transmit a
signal. It's _precisely_ what it tells you. It doesn't tell you this per
symbol, but rather tells you the total number of bits you minimally
need to transmit the signal fully. I don't understand why you just
focus on entropy rate (= asymptotic entropy per symbol), and forget
about the total entropy. Both measures tell you the same kind of
information, in a slightly different context.

But not zero 

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 17/07/2015, robert bristow-johnson r...@audioimagination.com wrote:

 in your model, is one sample (from the DSP semantic) the same as a
 message (from the Information Theory semantic)?

A message can be anything - it can be a sample, a bit, a combination
of samples or bits, a set of parameters representing a square wave,
whatever.

 is your measure the same as

  SUM{ prob(message) * -log2(prob(message)) }
 all messages

 ?

Yes. In this particular example, a message means a set of
parameters that describe a parametric square wave. To distinguish
between two or more messages, at least 1 bit is needed.
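
Plugging numbers into that sum (a quick Python illustration only): with two possible messages the value depends on their probabilities, peaking at 1 bit when they are equally likely, and dropping to zero when only one message is possible.

import math

def entropy(probs):
    # SUM over all messages { prob(message) * -log2(prob(message)) }
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit
print(entropy([0.9, 0.1]))   # about 0.47 bits
print(entropy([1.0]))        # 0.0 bits: a single certain message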


Re: [music-dsp] about entropy encoding

2015-07-16 Thread robert bristow-johnson

On 7/17/15 12:08 AM, Peter S wrote:

On 17/07/2015, Peter S peter.schoffhau...@gmail.com wrote:

Think of it as this - if your receiver can distinguish only two
different sets of parameters, then you need to send at least *one* bit
to distinguish between them - '0' meaning square wave A, and '1'
meaning square wave B. Without sending at least a *single* bit, your
receiver cannot distinguish between square waves A and B.

It also follows that when your receiver can distinguish between
precisely two sets of parameters of nonzero probability, then in
practice, the entropy of a square wave (= parameter set) will *always*
be 1 bit. Your transmitter will always need to send *exactly* one bit
to distinguish between square wave A and square wave B,
regardless of whatever probabilities they have.

So "you need to specify a distribution [...] in order to talk about
the entropy" is false - if I know that the parameter set has a size
of two (meaning two different square waves), then in practice, the
entropy will *always* be 1 bit - your transmitter will send either 0
or 1, meaning square wave A or B, for _all_ possible combination of
probabilities. In this case, the probabilities are entirely irrelevant
- you always send exactly 1 bit.

Hence, if the parameter set has a size of at least two, then you must
always send _at least_ one bit, hence, nonzero entropy, regardless
of probability distribution. Entropy is zero _only_ if the parameter
set has a size of 1 with probability p=1.



in your model, is one sample (from the DSP semantic) the same as a 
message (from the Information Theory semantic)?


is your measure the same as

SUM{ prob(message) * -log2(prob(message)) }
   all messages

?


--

r b-j  r...@audioimagination.com

Imagination is more important than knowledge.





Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
On 17/07/2015, Peter S peter.schoffhau...@gmail.com wrote:

 Think of it as this - if your receiver can distinguish only two
 different sets of parameters, then you need to send at least *one* bit
 to distinguish between them - '0' meaning square wave A, and '1'
 meaning square wave B. Without sending at least a *single* bit, your
 receiver cannot distinguish between square waves A and B.

It also follows that when your receiver can distinguish between
precisely two sets of parameters of nonzero probability, then in
practice, the entropy of a square wave (= parameter set) will *always*
be 1 bit. Your transmitter will always need to send *exactly* one bit
to distinguish between square wave A and square wave B,
regardless of whatever probabilities they have.

So "you need to specify a distribution [...] in order to talk about
the entropy" is false - if I know that the parameter set has a size
of two (meaning two different square waves), then in practice, the
entropy will *always* be 1 bit - your transmitter will send either 0
or 1, meaning square wave A or B, for _all_ possible combination of
probabilities. In this case, the probabilities are entirely irrelevant
- you always send exactly 1 bit.

Hence, if the parameter set has a size of at least two, then you must
always send _at least_ one bit, hence, nonzero entropy, regardless
of probability distribution. Entropy is zero _only_ if the parameter
set has a size of 1 with probability p=1.

-P


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Peter S
If you assume the entropy _rate_ to be the average entropy per bits
(although there are other possible definitions), then - if the total
entropy of a waveform is finite - as the number of observations
approaches infinity, the entropy rate will approach zero, since
finite/infinite = 0.

This does *not* mean that the _entropy_ of the signal is zero. All
signals have *some* amount of entropy, some finite, some infinite.
Even a constant signal has nonzero entropy, as you need to transmit
the constant value at least *once* to be able to reconstruct it.
(Well, maybe an infinite constant zero signal has zero entropy.)

So do *not* confuse 'entropy' and 'entropy rate' - they're different
measures. Zero entropy _rate_ does *not* mean that the total entropy
is zero. It just means the total entropy is finite, so if you make
infinite observations, then the *average* entropy will converge to
zero.

When estimating 'entropy rate' (average entropy per bit), the
result will never actually reach zero in a finite amount of time. Also,
the amount of total entropy in the signal will affect how fast the
measured entropy rate converges to zero, even if the signal has only a
finite amount of entropy and thus an asymptotic entropy rate of zero.
In estimates, this causes the measured entropy rate of complex
periodic waveforms to converge more slowly, because the complex shape of
the waveform has more entropy.
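
A trivial numeric illustration of that point (Python, illustrative only): a fixed 16 bits of total entropy, averaged over ever more observed samples, gives a per-sample rate that tends to zero while the total never does.

total_entropy_bits = 16
for n in (16, 1000, 1000000):
    print(n, total_entropy_bits / n)   # 1.0, 0.016, 1.6e-05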

-P


Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
False. The better estimators give an estimate that approaches zero.

% set pattern [randbits [randnum 20]]; puts "pattern=$pattern"; for {set i
1} {$i
<= 10} {incr i} {put "L=$i, "; measure [repeat $pattern 1] $i}
pattern=1000110011011010
L=1, Estimated entropy per bit: 1.00
L=2, Estimated entropy per bit: 1.023263
L=3, Estimated entropy per bit: 0.843542
L=4, Estimated entropy per bit: 0.615876
L=5, Estimated entropy per bit: 0.337507
L=6, Estimated entropy per bit: 0.17
L=7, Estimated entropy per bit: 0.071429
L=8, Estimated entropy per bit: 0.031250
L=9, Estimated entropy per bit: 0.013889
L=10, Estimated entropy per bit: 0.006250

What are these better estimators? It seems that you have several estimators
in mind but I can't keep track of what they all are, apart from the
bit-flip counter one. I urge you to slow down, collect your thoughts, and
spend a bit more time editing your posts for clarity (and length).

 And what is entropy per bit? Entropy is measured in bits, in the first
place. Did you mean entropy per symbol or something?

I have an organ called "brain" that can extrapolate from data, and
make predictions.

Maybe you could try using this brain to interact in a good-faith way.

Entropy != entropy rate. The shape of the waveform itself has some entropy.
Again, entropy != entropy rate. I wrote "entropy", not "entropy rate".

The entropy of a signal - as opposed to entropy rate - is not a
well-defined quantity, generally speaking. The standard quantity of
interest in the signal context is entropy rate, and so - in the interest of
good-faith engagement - I have assumed that is what you were referring to.
If you want to talk about signal entropy, distinct from the entropy rate,
then you need to do some additional work to specify what you mean by that.
I.e., you need to set up a distribution over the space of possible signals.
Switching back and forth between talking about narrow parametric classes of
signals (like rectangular waves) and general signals is going to lead to
confusion, and it is on you to keep track of that and write clearly.

Moreover, the (marginal) entropy of (classes of) signals is generally not
an interesting quantity in the first place - hence the emphasis on entropy
rate. Signals of practical interest for telecommunications are all going to
have infinite entropy, generally speaking, because they are infinitely
long and keep producing new surprises. Hence the emphasis on entropy rate,
which will distinguish between signals that require a lot of bandwidth to
transmit and those that don't.

But not zero _entropy_. The parameters (amplitude, phase, frequency,
waveform shape) themselves have some entropy - you *do* need to
transmit those parameters to be able to reconstruct the signal.

Again, that all depends on assumptions on the distribution of the
parameters. You haven't specified anything like that, so these assertions
are not even wrong. They're simply begging the question of what is your
signal space and what is the distribution on it.

I thought you have a problem with getting a nonzero estimate, as you
said that an estimator that gives nonzero result for a periodic
waveform is a poor estimator.

Again, the important distinction is between estimators that don't hit the
correct answer even theoretically (this indicates that the underlying
algorithm is inadequate) and the inevitable imperfections of an actual
numerical implementation (which can be made as small as one likes by
throwing more computational resources at them). These are quite different
issues - the latter is a matter of trading off resource constraints for
performance, the former is a fundamental algorithmic limitation that no
amount of implementation resources can ever address.

Let's imagine you want to transmit a square wave with amplitude=100,
phase=30, frequency=1000, and duty cycle=75%.
Question: what is the 'entropy' of this square wave?

What is the distribution over the space of those parameters? The entropy is
a function of that distribution, so without specifying it there's no
answer.

Said another way: if that's the only signal I want to transmit, then I'll
just build those parameters into the receiver and not bother transmitting
anything. The entropy will be zero. The receiver will simply output the
desired waveform without any help from the other end. It's only if there's
some larger class of signals containing this one that I want to handle,
that the question of entropy comes up.

a square wave has _nonzero_
entropy, for all possible sets of parameters.

Again, this is begging the question. A square wave is not a distribution,
and so doesn't have entropy. You need to specify what is the possible set
of parameters, and then specify a distribution over that set, in order to
talk about the entropy.

If you assume the entropy _rate_ to be the average entropy per bits

What is per bits? You mean per symbol or something?

The definition of entropy rate is the limit of the conditional entropy