On Jul 13, 2018, at 2:22 PM, David Mason <dma...@ryerson.ca> wrote:
> 
>     Acgq75VpCWjdsJaa5abe9JeX3I (don't worry, this isn't a real password to 
> anything)
> 
> …I fed this through an online entropy calculator and got 4.29 bits of Shannon 
> entropy

That calculator is giving you bits *per character*.

You can see this several ways:

1. Double the message and the bits per character (bpc) doesn’t change, because 
neither the source alphabet nor the relative frequency of each character 
changes.

2. Add a dollar sign to the message, and bpc goes up a bit.  (This conflicts 
with your report that adding a special character didn’t change it, but it did 
for me.)

3. Turn on the calculator’s case folding option and the bpc value goes down a 
bit.
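
If you’d rather check these without the web page, here is a minimal Python 
sketch.  It assumes the calculator uses the standard Shannon formula over the 
observed character frequencies, which I have not confirmed from its source; 
`bits_per_char` is just a name picked for illustration.

```python
from collections import Counter
from math import log2

def bits_per_char(text):
    """Empirical Shannon entropy of text, in bits per character."""
    n = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over the characters actually present.
    return sum(c / n * log2(n / c) for c in counts.values())

msg = "Acgq75VpCWjdsJaa5abe9JeX3I"
print(f"{bits_per_char(msg):.2f}")           # 4.29, the figure reported above
print(f"{bits_per_char(msg * 2):.2f}")       # 4.29, doubling changes nothing
print(f"{bits_per_char(msg + '$'):.2f}")     # 4.36, one new symbol
print(f"{bits_per_char(msg.lower()):.2f}")   # 3.98, case folded
```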

One key realization you should get from this calculator is that ASCII text is 
not 7 or 8 bits of entropy per character.  It simply is not, because not all 
characters in the source text are equally likely.  Many code points may never 
be used in a given corpus.

Another realization is that a random blob of hex noise should asymptotically 
approach 4 bpc, since each character is 4 bits of data, and the data are 
supposed to be evenly distributed across the code space.

Here’s some noise from grc.com/pass:

    C79683189EFBEBEC30A4C1A6D733F0242FB48E2582F3B2E7581D85E91E0A2FA5

The initial value is 3.91, and pasting in more noise from the same page (fresh 
values each time, since repeating the same string changes nothing) moves the 
value towards 4, suggesting it’s got pretty good entropy.

Now paste in an equivalent number of ‘a’ characters, and you get 0 bits of 
entropy.  That is the exact result, not a rounding artifact: when only one 
symbol ever appears, its probability is 1, there is no uncertainty about the 
next character, and each one carries log2(1) = 0 bits.
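
Both experiments are easy to reproduce offline.  This Python sketch assumes 
the same frequency-based formula as the calculator (my assumption, not 
something I have checked in its source) and uses `secrets.token_hex` as a 
stand-in for the grc.com generator; the helper name is made up.

```python
import secrets
from collections import Counter
from math import log2

def bits_per_char(text):
    """Empirical Shannon entropy of text, in bits per character."""
    n = len(text)
    return sum(c / n * log2(n / c) for c in Counter(text).values())

sample = "C79683189EFBEBEC30A4C1A6D733F0242FB48E2582F3B2E7581D85E91E0A2FA5"
print(f"{bits_per_char(sample):.2f}")        # 3.91 for the 64 digits above

# Stand-in for more noise from the same kind of source (an assumption):
# 100,000 random bytes rendered as 200,000 hex digits.
big = secrets.token_hex(100_000)
print(f"{bits_per_char(big):.4f}")           # very close to 4

print(bits_per_char("a" * 64))               # 0.0: one symbol, no uncertainty
```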

> So I tried:
>     dd if=/dev/random bs=100 count=1|od -c
> and the result only gave 5.00 bits

That’s plausible.  With a much larger sample, the result should approach 7, 8, 
16, or 21, depending on your local character set size.  (Respectively: pure 
ASCII, ISO 8859 or similar, UCS-2 and full Unicode.)
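
The sample size matters a lot here.  A 100-byte sample can contain at most 100 
distinct values, so a frequency-based estimate cannot exceed log2(100), about 
6.64, however good the source is.  A hedged Python sketch, treating each raw 
byte as one symbol (my simplification: the text that `od -c` prints also 
contains spaces, escapes, and an offset column, which this ignores) and using 
`os.urandom` in place of reading /dev/random directly:

```python
import os
from collections import Counter
from math import log2

def bits_per_symbol(data):
    """Empirical Shannon entropy of a byte string, in bits per byte."""
    n = len(data)
    return sum(c / n * log2(n / c) for c in Counter(data).values())

small = os.urandom(100)        # same size as the dd command above
large = os.urandom(1 << 20)    # 1 MiB

print(f"{bits_per_symbol(small):.2f}")   # cannot exceed log2(100), about 6.64
print(f"{bits_per_symbol(large):.4f}")   # close to 8
```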

Now see if you can guess the asymptotic ideal for this slightly different 
command:

    $ dd if=/dev/random bs=100 count=1 | od


Spoiler below.




















………..












About 3, because the output is restricted to octal digits, and an octal digit 
carries at most 3 bits.
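
For a rough empirical check, this Python sketch mimics what I understand to be 
od’s default format (two-byte words printed as six octal digits each) rather 
than running od itself, and counts only the digits; leaving out the offset 
column and whitespace is my simplification.  It lands slightly under 3, since 
the leading digit of each 16-bit word can only be 0 or 1.

```python
import os
import struct
from collections import Counter
from math import log2

def bits_per_char(text):
    """Empirical Shannon entropy of text, in bits per character."""
    n = len(text)
    return sum(c / n * log2(n / c) for c in Counter(text).values())

data = os.urandom(200_000)
# Unpack as native-endian unsigned 16-bit words, then format each one
# zero-padded to six octal digits, as od's default output is assumed to do.
words = struct.unpack(f"={len(data) // 2}H", data)
digits = "".join(f"{w:06o}" for w in words)

print(f"{bits_per_char(digits):.3f}")    # a little under 3
```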