On Jul 13, 2018, at 2:22 PM, David Mason <dma...@ryerson.ca> wrote:
>
> Acgq75VpCWjdsJaa5abe9JeX3I (don't worry, this isn't a real password to
> anything)
>
> …I fed this through an online entropy calculator and got 4.29 bits of Shannon
> entropy

That calculator is giving you bits *per character*. You can see this several ways:

1. Double the message and the bits per character doesn’t change, because neither the source alphabet nor the relative frequency of each character changes.

2. Add a dollar sign to the message, and bpc goes up a bit. (This conflicts with your report that adding a special character didn’t change it, but it did for me.)

3. Turn on the calculator’s case folding option and the bpc value goes down a bit.

One key realization you should get from this calculator is that ASCII text does not carry 7 or 8 bits of entropy per character. It simply does not, because not all characters in the source text are equally likely. Many code points may never be used in a given corpus.

Another realization is that a random blob of hex noise should asymptotically approach 4 bpc, since each character is 4 bits of data, and the data are supposed to be evenly distributed across the code space. Here’s some noise from grc.com/pass:

C79683189EFBEBEC30A4C1A6D733F0242FB48E2582F3B2E7581D85E91E0A2FA5

The initial value is 3.91, and pasting it in a bunch of times does increase the value towards 4, suggesting it’s got pretty good entropy.

Now paste in an equivalent number of ‘a’ characters, and you get 0 bits of entropy. That is not a rounding artifact: when a single symbol occurs with probability 1, each character contributes -log2(1) = 0 bits, so the per-character entropy is exactly zero no matter how long the message is.

> So I tried:
> dd if=/dev/random bs=100 count=1|od -c
> and the result only gave 5.00 bits

That’s plausible. With a much larger sample, the result should approach 7, 8, 16, or 21, depending on your local character set size. (Respectively: pure ASCII, ISO 8859 or similar, UCS-2 and full Unicode.)

Now see if you can guess the asymptotic ideal for this slightly different command:

    $ dd if=/dev/random bs=100 count=1 | od

Spoiler below.

………..

3, because the output is restricted to octal digits, thus 3 bpc.
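
If you want to check these numbers without the web page, here is a minimal Python sketch of what I assume the calculator is doing: count how often each character occurs, then sum p * log2(1/p) over the distinct characters. I haven’t seen the calculator’s source, so the function name bits_per_char and the details are my own guesses, not its actual implementation.

```python
import math
from collections import Counter


def bits_per_char(text):
    """Empirical Shannon entropy of text, in bits per character.

    Hypothetical helper: it assumes the estimate is based purely on
    per-character frequencies, with no case folding or other options.
    """
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    # Each distinct character with count c has probability p = c / n and
    # contributes p * log2(1 / p) = (c / n) * log2(n / c) bits.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    password = "Acgq75VpCWjdsJaa5abe9JeX3I"
    print(f"{bits_per_char(password):.2f}")       # the sample from this thread
    print(f"{bits_per_char(password * 2):.2f}")   # doubling changes nothing
    print(f"{bits_per_char('a' * 64):.2f}")       # a single repeated symbol
```

On the sample password this comes out at about 4.29, the same figure the calculator reported, and doubling the message leaves it unchanged. Feeding it a string that uses all 16 hex digits equally often gives 4, and one that uses the 8 octal digits equally often gives 3.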