----- Forwarded message from Joe Blaylock <j...@jrbl.org> -----

Subject: Re: reducing charset size for compressibility with case-shift characters (in Python)
From: Joe Blaylock <j...@jrbl.org>
To: Kragen Javier Sitaker <kra...@canonical.org>

On Sat, 2011-04-16 at 03:37 -0400, Kragen Javier Sitaker wrote:
> lowercase = 'abcdefghijklmnopqrstuvwxyz'
> numbers = '0123456789'
>
>         else:
>             yield current_state[lowercase.index(char)]
>     elif char == DC3:
>         current_state = numbers

Couldn't you achieve a modest increase in compressibility at the expense
of calculation time by representing all numerical sequences as base-26
encoded strings? You'd have to run a buffer large enough for any numeric
runs you process, but the transformation itself is easy.

You couldn't do that nice direct-indexing thing any more though. Well,
not without creating more abstraction.

Joe

----- End forwarded message -----

-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss
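
A minimal sketch of the base-26 idea suggested in the message above, not code from the thread. The function names are hypothetical, and the '1' sentinel used to preserve leading zeros is an assumption added here. Each run of digits is buffered as a whole string, converted to an integer, and written out with the same 26 lowercase letters as the quoted `lowercase` alphabet.

```python
import string

# Same 26-letter alphabet as `lowercase` in the quoted code.
LOWERCASE = string.ascii_lowercase


def int_to_base26(n):
    """Render a non-negative integer as base-26 digits drawn from a-z (a = 0)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, remainder = divmod(n, 26)
        digits.append(LOWERCASE[remainder])
        if n == 0:
            break
    return "".join(reversed(digits))


def base26_to_int(letters):
    """Inverse of int_to_base26."""
    n = 0
    for char in letters:
        n = n * 26 + LOWERCASE.index(char)
    return n


def encode_digit_run(run):
    """Encode a fully buffered run of decimal digits as base-26 letters.

    Assumption of this sketch: a '1' sentinel is prefixed so that leading
    zeros (as in "007") survive the round trip through an integer.
    """
    return int_to_base26(int("1" + run))


def decode_digit_run(letters):
    """Recover the original digit run by dropping the '1' sentinel."""
    return str(base26_to_int(letters))[1:]
```

The whole run has to be in hand before `encode_digit_run` can be called, which is the buffering cost mentioned above, and the output letters no longer correspond one-to-one with input characters, so the direct `current_state[lowercase.index(char)]` lookup does not carry over. A decoder would also need some way to tell where an encoded run ends, for example the existing shift characters; that part is left out here.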