----- Forwarded message from Kragen Javier Sitaker <[email protected]> -----
From: Kragen Javier Sitaker <[email protected]>
To: Joe Blaylock <[email protected]>
Subject: Re: reducing charset size for compressibility with case-shift characters (in Python)

On Sat, Apr 16, 2011 at 10:50:34AM -0700, Joe Blaylock wrote:
> On Sat, 2011-04-16 at 03:37 -0400, Kragen Javier Sitaker wrote:
> > lowercase = 'abcdefghijklmnopqrstuvwxyz'
> > numbers = '0123456789'
> >
> >         else:
> >             yield current_state[lowercase.index(char)]
> >     elif char == DC3:
> >         current_state = numbers
>
> Couldn't you achieve a modest increase in compressibility at the expense of
> calculation time by representing all numerical sequences as base-26 encoded
> strings?

Quite possibly. In the Project Gutenberg Bible, that would make a
substantial fraction of the numbers one digit instead of two, or two
digits instead of three.

> You'd have to run a buffer large enough for any numeric runs you
> process, but the transformation itself is easy. You couldn't do that
> nice direct-indexing thing any more though. Well, not without
> creating more abstraction.

Indeed.

May I forward this to kragen-discuss?

Kragen

----- End forwarded message -----

-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss
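
For reference, here is a minimal sketch of the base-26 idea from the exchange above. It is not code from the original thread. The names to_base26 and from_base26 are hypothetical, and it assumes a whole numeric run has already been buffered and has no significant leading zeros.

```python
import string

# Assumption: 'a'..'z' stand for the base-26 digits 0..25, reusing the
# lowercase alphabet from the quoted code.
LOWERCASE = string.ascii_lowercase


def to_base26(digits):
    """Encode a buffered run of decimal digits as lowercase base-26 letters.

    Leading zeros are not preserved: '007' encodes the same as '7'.
    """
    n = int(digits)
    if n == 0:
        return LOWERCASE[0]
    out = []
    while n:
        n, remainder = divmod(n, 26)
        out.append(LOWERCASE[remainder])
    # Digits were produced least-significant first, so reverse them.
    return ''.join(reversed(out))


def from_base26(letters):
    """Decode lowercase base-26 letters back to a decimal digit string."""
    n = 0
    for ch in letters:
        n = n * 26 + LOWERCASE.index(ch)
    return str(n)
```

Under this scheme the numbers 10 through 25 take one symbol instead of two, and 100 through 675 take two instead of three, which is the saving described in the reply. The cost is the one Joe points out: the decoder has to collect the whole run before converting it, so the per-character direct-indexing lookup no longer applies.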
