I've heard that modern "secret" societies have texts written in English, with only a few recurring special symbols that have to be taught. Could the character thorn (þ), for example, be used for 'the', saving two characters? Could single glyphs for some bigrams, like Qu, be used for compression? What about substituting phonetic 'mispelings' that are still comprehensible?
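To get a rough feel for what such substitutions might buy, here is a minimal Python 3 sketch. Everything in it is my own assumption rather than part of the script quoted below: the rule list, the private-use code points standing in for a custom "qu" glyph, and the simplification that every glyph (including the new ones) occupies the same 4-pixel cell.

```python
import re

# Hypothetical substitution rules, applied in order.  U+00FE/U+00DE are
# thorn; U+E000/U+E001 are private-use placeholders for a made-up
# "qu"/"Qu" ligature glyph.
RULES = [
    (r"\bthe\b", "\u00fe"),
    (r"\bThe\b", "\u00de"),
    (r"qu", "\ue000"),
    (r"Qu", "\ue001"),
]

def substitute(text, rules=RULES):
    """Apply each (pattern, replacement) rule in turn and return the new text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

def savings(text, rules=RULES, cell_width=4):
    """Return (glyphs_before, glyphs_after, pixels_saved).

    Assumes a fixed cell_width for every glyph, which ignores the
    proportional widths discussed in the quoted message.
    """
    shortened = substitute(text, rules)
    before, after = len(text), len(shortened)
    return before, after, (before - after) * cell_width

if __name__ == "__main__":
    print(savings("The queen said that the quick fox quit the race."))
```

Running the same function over the whole input file would give a first estimate of how the saving compares with the 11% from the proportional font.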
On Mon, Apr 23, 2012 at 4:42 PM, Kragen Javier Sitaker <kra...@canonical.org> wrote:
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
> """Compute proportional-font print size of a text.
>
> The laser printer at my new workplace is 600dpi in both directions.
> It prints on A4 or similar paper: 216×279mm.
>
> Simple multiplication yields a capacity of 33.6 megabits per page, or
> about 4 megabytes.
>
> At 600dpi, a 4×6 pixel character cell like the one I use in
> <http://canonical.org/~kragen/sw/dofonts-1k.html> gives you an 80×66
> page of 13.5 mm × 16.8 mm. (Janne Kujala designed the font.) If you
> can successfully control every pixel, the result should be clearly
> readable with a magnifying glass. (If we consider 5-point text as the
> lower limit of comfortable readability, and these 6-pixel-tall
> characters are 1/100 inch, you need 7× magnification to make the text
> comfortably readable.)
>
> Further calculation suggests that the A4 page will contain an array
> fitting 16 such reduced pages horizontally and 16.6 of them
> vertically; practically this is probably 15 horizontally and 16
> vertically, or 240 pages, 480 pages on the two sides of the paper, or
> 31680 80-column lines. This is on the order of a megabyte and a half
> of text, assuming an average of about 50 bytes per line. You could
> print the King James Bible on four sheets of paper.
>
> In this form, the Rosetta Disk's 13000-page archive would require some
> 260 sheets of paper, the size of an average hardcover book. Their
> metal disk is probably more durable than the paper, but the paper
> version can be printed for about US$20-100, rather than the several
> thousand dollars for the disk.
>
> But perhaps we can do better! That's only about a third of the total
> data capacity of the page. Can a proportional font let us use less
> than 4 pixels horizontally, on average?
>
> After inspection, I think that the 4×6 pixel font would need the
> following widths for its 96 character glyphs, modified slightly, and
> ensuring one pixel of space on the right of each one:
> """
>
> proportions_1 = map(int, """
> 2 2 5 4 4 4 4 2 3 3 4 4 3 3 2 4
> 4 3 4 4 4 4 4 4 4 4 2 3 4 3 4 4
> 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
> 4 4 4 4 4 4 4 4 4 4 4 3 4 3 4 3
> 3 4 4 4 4 4 4 4 4 2 3 4 4 4 4 4
> 4 4 4 4 4 4 4 4 4 4 4 4 2 4 5 0
> """.split())
>
> """
> So we can definitely use less than 4 pixels on average. We could
> probably even give an extra pixel or two of width to M, N, W, m, w,
> and maybe #, Z, and z, and get an improvement in readability:
> """
>
> proportions_2 = map(int, """
> 2 2 6 4 4 4 4 2 3 3 4 4 3 3 2 4
> 4 3 4 4 4 4 4 4 4 4 2 3 4 3 4 4
> 4 4 4 4 4 4 4 4 4 4 4 4 4 6 5 4
> 4 4 4 4 4 4 4 6 4 4 6 3 4 3 4 3
> 3 4 4 4 4 4 4 4 4 2 3 4 4 6 4 4
> 4 4 4 4 4 4 4 6 4 4 5 4 2 4 5 0
> """.split())
>
> """
> If you're really shooting for density instead of readability, you
> could make up some other glyph to represent paragraph breaks or line
> breaks or whatever, probably only 3 pixels wide.
> """
>
> def total_pixel_width(inputfile, proportions, newline_width):
>     assert len(proportions) == 96
>     widths = [0] * 32 + proportions + [0] * 128
>     widths[ord('\n')] = newline_width
>
>     total_width = 0
>     for line in inputfile:
>         for char in line:
>             total_width += widths[ord(char)]
>     return total_width
>
> def main():
>     import sys, codecs
>     sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>     kjv = lambda: open('bible-pg10.txt')
>     print u"Fixed-width 4×6 with a 4-pixel newline:"
>     print total_pixel_width(kjv(), [4] * 96, 4)
>     print "With some characters narrower:"
>     print total_pixel_width(kjv(), proportions_1, 3)
>     print "With some narrower and some wider:"
>     print total_pixel_width(kjv(), proportions_2, 3)
>
> if __name__ == '__main__':
>     main()
>
> """
> Output:
> Fixed-width 4×6 with a 4-pixel newline:
> 17407376
> With some characters narrower:
> 15187418
> With some narrower and some wider:
> 15486129
>
> So yes, the proportional font saves 11% of the space, or 13% in the
> version where none of the glyphs are wider. You’d need 15.5 million
> pixels horizontally to represent the PG KJV this way with the wider
> variant, or 92 916 774 total pixels, 92.9 megabits or 11 614 597
> bytes’ worth. So you could print the entire KJV on three sheets of
> paper, using a regular laser printer. And with some human effort you
> can do substantially better; about 0.3 million of those 15.5 million
> are used just in representing newlines, and most of those are
> linebreaks to keep the lines from getting too wide, not for separating
> paragraphs.
>
> """
> --
> To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol
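
For anyone who wants to recheck the arithmetic in the quoted message, here is a small Python 3 sketch (the quoted script itself is Python 2). The function names are mine and purely illustrative; the inputs are the figures stated above (216×279 mm, 600 dpi, 80×66 pages of 4×6 cells) and the three totals from the quoted output. Rounding the page size to whole pixels is my assumption.

```python
MM_PER_INCH = 25.4

def page_pixels(width_mm=216, height_mm=279, dpi=600):
    """Pixel dimensions of one side of the sheet, rounded to whole pixels."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))

def mini_pages(page_px, cols=80, rows=66, cell=(4, 6)):
    """How many whole reduced pages fit across and down one side."""
    width_px, height_px = page_px
    return width_px // (cols * cell[0]), height_px // (rows * cell[1])

def percent_saved(fixed_total, proportional_total):
    """Percentage of horizontal pixels saved relative to the fixed-width total."""
    return 100.0 * (1 - proportional_total / fixed_total)

if __name__ == "__main__":
    px = page_pixels()
    across, down = mini_pages(px)
    print("pixels per side:", px, "=", px[0] * px[1] / 1e6, "megabits")
    print("reduced pages per side:", across, "x", down, "=", across * down)
    print("80-column lines per sheet:", across * down * 2 * 66)
    # Totals copied from the output in the quoted message.
    print("narrower only: %.1f%% saved" % percent_saved(17407376, 15187418))
    print("narrower and wider: %.1f%% saved" % percent_saved(17407376, 15486129))
```

This reproduces the 15 × 16 grid, the 31680 lines per sheet, and the 13% and 11% savings quoted above.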