On Thu, Mar 12, 2026 at 10:49 AM Brian K. White <[email protected]>
wrote:

That's curious that you arrived at +143 for that file.
>
> I just added a brute force scanner which tries all possible values 0-255
> using xor and using rot, and for the same file I get +122  (rot122)

Oh, I’m sure your algorithm is correct. Mine was just a hasty measurement
of the sort of savings that were achievable. I believe there should be a
better algorithm than brute force as it is reminiscent of other, trickier
problems <https://dl.acm.org/doi/epdf/10.1145/358234.381162> which had neat
solutions. Sadly, the problem is so easily tractable by simply enumerating
all 256 possibilities, we have little reason to discover it.

My code calculates 256 bins, sums[x], containing the count of the number of
bytes which have the value x. It calculates the total size by adding the in
the input file size to the count of the bytes which are less than 35,
except 9 and 32.

def getsize(sums):
    r'''Calculate the bang-code filesize'''
    total = sum([sums[x] for x in sums]) \
        + sum([sums[x] for x in range(35) if x!=9 and x!=32])
    return total

Possibly I’m biting myself with premature optimization because I don’t
recount the bins for each rotation. Instead, I rotate them like so:

def rotate(sums, k):
    r'''Given the table of sums for various bytes, rotate it by k'''
    return { x:sums[(x+k)%256] for x in range(256) }

I am now also including 127 by default like you are. It's so useful it's
> basically user-hostile not to, and it doesn't cost hardly anything.
>
Ah, that’s likely the error in my calculation. I forgot to add in sums[127]!

I haven’t implemented the search in co2do yet, but I did add a static
rotation of +136 with reasonable results.

Oh yeah I switched to a rolling xor checksum too.
> It only needs ints in basic and I think actually catches more errors.
> Still kind of swiss cheese compared to real crc algorithms but those are
> expensive and this is cheap.
>
Need a cheap CRC algorithm? Coincidentally, I made a 34-byte 8080 machine
language <https://github.com/hackerb9/crc16-8080/> program which calculates
CRC16/xmodem. Are you seeing many transmission errors?

Also I'm calling it !yenc. Because it's not yenc. And yet essentially is
> so it's descriptive of it's properties both ways.
>
I don’t know yenc yet, so the name !yenc is a bit confusing, but I do like
that it could be read as “why NOT encode?”

—b9

Reply via email to