On Thu, Mar 12, 2026 at 10:49 AM Brian K. White <[email protected]> wrote:
That's curious that you arrived at +143 for that file. > > I just added a brute force scanner which tries all possible values 0-255 > using xor and using rot, and for the same file I get +122 (rot122) Oh, I’m sure your algorithm is correct. Mine was just a hasty measurement of the sort of savings that were achievable. I believe there should be a better algorithm than brute force as it is reminiscent of other, trickier problems <https://dl.acm.org/doi/epdf/10.1145/358234.381162> which had neat solutions. Sadly, the problem is so easily tractable by simply enumerating all 256 possibilities, we have little reason to discover it. My code calculates 256 bins, sums[x], containing the count of the number of bytes which have the value x. It calculates the total size by adding the in the input file size to the count of the bytes which are less than 35, except 9 and 32. def getsize(sums): r'''Calculate the bang-code filesize''' total = sum([sums[x] for x in sums]) \ + sum([sums[x] for x in range(35) if x!=9 and x!=32]) return total Possibly I’m biting myself with premature optimization because I don’t recount the bins for each rotation. Instead, I rotate them like so: def rotate(sums, k): r'''Given the table of sums for various bytes, rotate it by k''' return { x:sums[(x+k)%256] for x in range(256) } I am now also including 127 by default like you are. It's so useful it's > basically user-hostile not to, and it doesn't cost hardly anything. > Ah, that’s likely the error in my calculation. I forgot to add in sums[127]! I haven’t implemented the search in co2do yet, but I did add a static rotation of +136 with reasonable results. Oh yeah I switched to a rolling xor checksum too. > It only needs ints in basic and I think actually catches more errors. > Still kind of swiss cheese compared to real crc algorithms but those are > expensive and this is cheap. > Need a cheap CRC algorithm? Coincidentally, I made a 34-byte 8080 machine language <https://github.com/hackerb9/crc16-8080/> program which calculates CRC16/xmodem. Are you seeing many transmission errors? Also I'm calling it !yenc. Because it's not yenc. And yet essentially is > so it's descriptive of it's properties both ways. > I don’t know yenc yet, so the name !yenc is a bit confusing, but I do like that it could be read as “why NOT encode?” —b9
