ANN: TEXTCOMBINE-SP

Mok-Kong Shen Mon, 10 Jul 2017 05:12:27 -0700

An estimate of the entropy of English text is 1.34 bits per letter [1]. This
implies that, if the letters are coded into 5 bits, one needs to appropriately
combine 4 text files in order to obtain bit sequences of full entropy, since
4*1.34 = 5.36 > 5. The method used in our software is to sum (mod 32) the coded
values of the corresponding letters of the text files, with a-z mapped to 0-25
and each value treated as 5 bits.
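The combination step described above can be sketched in a few lines of
Python. This is only an illustration and not the code of TEXTCOMBINE-SP
itself: the function names are hypothetical, and the choices to ignore case,
to skip all characters other than a-z, and to stop at the end of the shortest
text are assumptions made for the example.

```python
def letters_to_values(text):
    """Map a-z to 0-25; case is ignored and other characters are skipped
    (both are assumptions of this sketch)."""
    return [ord(c) - ord("a") for c in text.lower() if "a" <= c <= "z"]


def combine_texts(texts):
    """Sum the values of corresponding letters modulo 32.

    zip() stops at the shortest letter stream, so any surplus letters
    of the longer texts are dropped (an assumption of this sketch).
    """
    streams = [letters_to_values(t) for t in texts]
    return [sum(column) % 32 for column in zip(*streams)]


def to_bits(values):
    """Write each combined value (0-31) as a group of 5 bits."""
    return "".join(format(v, "05b") for v in values)
```

For example, combine_texts(["ab", "cd", "ef", "zz"]) gives [31, 2], because
0+2+4+25 = 31 and 1+3+5+25 = 34 = 2 (mod 32); to_bits([31, 2]) then gives
"1111100010".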


There are plenty of other schemes for obtaining high quality pseudo-random
sequences in practice, e.g. AES in counter mode. However, our scheme seems to
be much simpler both in the underlying logic (understandability) and in
implementation, and is thus a viable alternative that one could use or need
under certain circumstances.

The software is available at mok-kong-shen.de

M. K. Shen
------------------------------------------------------------------------

[1] T. M. Cover, R. C. King, A Convergent Gambling Estimate of the Entropy of
English, IEEE Trans. Inf. Theory, vol. 24, 1978, pp. 413-421.

--
https://mail.python.org/mailman/listinfo/python-announce-list

       Support the Python Software Foundation:
       http://www.python.org/psf/donations/
