No transmission errors. I just don't like something making a promise that
it doesn't actually keep. If it's going to have a checksum at all, the
checksum should actually prove what it purports to prove.

Or rather, if a stronger check can be had for basically no cost in size or
speed, why not have the stronger check instead of a weak one? Just adding
+1 per byte to the simple sum already closes a lot of holes by making nulls
count. Even the largest possible file made of all 255s still only needs
singles to hold the sum, which is at least better than the default doubles.
And it adds practically nothing to the loader code. As far as I can see,
all old loaders should have done that all along.
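
To make the null-byte point concrete, here is a minimal Python sketch. It is
not the actual BASIC loader code: the function names are made up for
illustration, and it assumes "sum+" simply means adding each byte's value
plus 1 to a running total.

```python
def simple_sum(data: bytes) -> int:
    """Plain byte total, the way most old loaders computed it."""
    return sum(data)


def sum_plus(data: bytes) -> int:
    """Assumed 'sum+' variant: every byte contributes its value plus 1."""
    return sum(b + 1 for b in data)


good = bytes([1, 0, 0, 2])
bad = bytes([1, 0, 2])  # same data with one null byte dropped

# The plain sum cannot tell the two apart, because a null adds nothing.
print(simple_sum(good) == simple_sum(bad))

# With +1 per byte, the missing null changes the total.
print(sum_plus(good) == sum_plus(bad))
```

The same applies to a run of nulls of any length: the plain sum ignores it
entirely, while the +1 version effectively counts the bytes as well.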

Even ts-dos has a couple of long strings of repeats that would be
essentially ignored even by the usual simple total sum that almost all old
loaders used.

The xor without a similar +1 is maybe even worse because there are more
input patterns that result in not really proving anything. But then again
the xor makes it harder for a later error to correct an earlier error, and
the bad input patterns are artificial.
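
A couple of those artificial patterns, sketched in Python. This assumes the
plainest possible xor (each byte xored into one accumulator); the function
name is hypothetical and the real loader's xor may be arranged differently.

```python
def simple_xor(data: bytes) -> int:
    """Assumed plain xor checksum: xor every byte into one accumulator."""
    acc = 0
    for b in data:
        acc ^= b
    return acc


payload = bytes([0x4D, 0x32, 0x10])

# Nulls are invisible to xor, just as they are to a plain sum.
with_nulls = payload + bytes([0, 0, 0])

# Any byte repeated an even number of times cancels itself out.
with_pair = payload + bytes([0x41, 0x41])

print(simple_xor(payload) == simple_xor(with_nulls))
print(simple_xor(payload) == simple_xor(with_pair))
```

Both comparisons come out equal, so neither kind of extra data is detected
by the plain xor.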

So now I have the simple xor as the default, for the speed, and because no
one is downloading these from bbs's over phone lines with modems that
predate mnp5 error correction any more, let alone manually typing them in.
(Look at the original TINY loader: 40 column lines! The error message says
"typo"!)

But you can also choose "sum+" (sum with +1 per byte), "xor+", or "mod+".

They each have some small pros and cons, but really anything is fine, which
is why the simple xor is the default.

mod+ uses mod32512.
The largest value mod32512 outputs is 32511, and 32511 + a 255 byte + 1 is
32767, which is still an INT. That means the sum never exceeds int, yet
still rolls over the fewest possible times, giving the fewest possible
repeats. And the large odd mod value doesn't cost more cpu time than mod256,
at least not in basic. The ints might be why this is actually faster than
the simple sum+1, which needs singles.
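
The int-range arithmetic can be checked with a short Python sketch. It
assumes the per-byte update is acc = (acc + byte + 1) mod 32512, which is my
reading of the description above rather than a copy of the loader code, and
the names are made up for illustration.

```python
MOD = 32512
INT_MAX = 32767  # largest signed 16-bit integer, i.e. a BASIC INT


def mod_plus(data: bytes) -> tuple[int, int]:
    """Return (checksum, largest intermediate value seen before the mod)."""
    acc = 0
    peak = 0
    for b in data:
        raw = acc + b + 1      # worst case: 32511 + 255 + 1
        peak = max(peak, raw)
        acc = raw % MOD
    return acc, peak


# Worst possible single step still fits in an INT.
print((MOD - 1) + 255 + 1 == INT_MAX)

# Even a long run of 255s never pushes the intermediate value past INT_MAX.
checksum, peak = mod_plus(bytes([255]) * 12000)
print(peak <= INT_MAX)
```

Any larger modulus would allow an intermediate value above 32767, so under
this assumption 32512 is the biggest modulus that stays within int.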

xor+ needs singles because the wrong input can definitely make it go
outside of int. No normal binary ever does, but 12k of all 127s results in
over 1.5m. It's faster than either mod or sum, and 2 singles vs 2 ints is
only a few bytes in raw space, but I think it's still not as bulletproof as
the mod or sum.

Probably the mod is the strongest, even stronger than the straight sum+1
because it's more position-aware (harder for later errors to correct
earlier errors), and yet it's actually faster and only needs ints while the
sum needs singles. If you want to generate a robust loader to be distributed
and archived, maybe use the mod option. If you want to generate a loader to
use once immediately and throw away, use the default simple xor.

bkw

On Fri, Mar 20, 2026, 3:19 AM B 9 <[email protected]> wrote:

> On Thu, Mar 12, 2026 at 10:49 AM Brian K. White <[email protected]>
> wrote:
>
> That's curious that you arrived at +143 for that file.
>>
>> I just added a brute force scanner which tries all possible values 0-255
>> using xor and using rot, and for the same file I get +122  (rot122)
>
> Oh, I’m sure your algorithm is correct. Mine was just a hasty measurement
> of the sort of savings that were achievable. I believe there should be a
> better algorithm than brute force as it is reminiscent of other, trickier
> problems <https://dl.acm.org/doi/epdf/10.1145/358234.381162> which had
> neat solutions. Sadly, the problem is so easily tractable by simply
> enumerating all 256 possibilities, we have little reason to discover it.
>
> My code calculates 256 bins, sums[x], containing the count of the number
> of bytes which have the value x. It calculates the total size by adding
> the input file size to the count of the bytes which are less than
> 35, except 9 and 32.
>
> def getsize(sums):
>     r'''Calculate the bang-code filesize'''
>     total = sum([sums[x] for x in sums]) \
>         + sum([sums[x] for x in range(35) if x!=9 and x!=32])
>     return total
>
> Possibly I’m biting myself with premature optimization because I don’t
> recount the bins for each rotation. Instead, I rotate them like so:
>
> def rotate(sums, k):
>     r'''Given the table of sums for various bytes, rotate it by k'''
>     return { x:sums[(x+k)%256] for x in range(256) }
>
> I am now also including 127 by default like you are. It's so useful it's
>> basically user-hostile not to, and it doesn't cost hardly anything.
>>
> Ah, that’s likely the error in my calculation. I forgot to add in
> sums[127]!
>
> I haven’t implemented the search in co2do yet, but I did add a static
> rotation of +136 with reasonable results.
>
> Oh yeah I switched to a rolling xor checksum too.
>> It only needs ints in basic and I think actually catches more errors.
>> Still kind of swiss cheese compared to real crc algorithms but those are
>> expensive and this is cheap.
>>
> Need a cheap CRC algorithm? Coincidentally, I made a 34-byte 8080 machine
> language <https://github.com/hackerb9/crc16-8080/> program which
> calculates CRC16/xmodem. Are you seeing many transmission errors?
>
> Also I'm calling it !yenc. Because it's not yenc. And yet essentially is
>> so it's descriptive of its properties both ways.
>>
> I don’t know yenc yet, so the name !yenc is a bit confusing, but I do like
> that it could be read as “why NOT encode?”
>
> —b9
>
