On 3/10/26 22:31, B 9 wrote:
Nifty! I had been thinking about how to encode NULLs in a single byte
since they are so much more common, but hadn’t known about the yEnc
offset. Doing some histogram analysis on the most easily available batch
<https://github.com/hackerb9/co2do/tree/histogram/histogram/kurtdekker/
co> of .CO files, it does not look like 42 is an optimal offset. I like
the idea of a bespoke offset! Here’s a sample program which can
determine that for you, for a single .CO file or a whole directory of
them: histco.py <https://github.com/hackerb9/co2do/blob/histogram/
histogram/histco.py>:
$ ./histogram/histco.py testfiles/ALTERN.CO
Unrotated: 4758 bytes. Can save 956 bytes (20.09%)
Rotation +143 => 3802 bytes.
Rotation of +42 would save 871 bytes (18.31%)

$ cd histogram/kurtdekker/co/
$ ../../histco.py
Unrotated: 111237 bytes. Can save 18353 bytes (16.50%)
Rotation +136 => 92884 bytes.
Rotation of +42 would save 6696 bytes (6.02%)
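(For anyone following along, the histogram idea can be sketched roughly like this. The unsafe-byte set below is a guess for illustration only — the thread mentions NUL and 127 at least — and the real histco.py may compute things differently:)

```python
# Sketch of a histogram-based search for the best rotation offset.
# UNSAFE is a hypothetical escape set, NOT taken from histco.py.
from collections import Counter

UNSAFE = {0x00, 0x0A, 0x0D, 0x1A, 0x7F}  # assumed, for illustration

def best_rotation(data: bytes) -> int:
    hist = Counter(data)
    def cost(n: int) -> int:
        # how many input bytes land on an unsafe value after rotating by +n
        return sum(hist[(u - n) % 256] for u in UNSAFE)
    return min(range(256), key=cost)
```

Counting from the histogram makes each candidate offset cost O(|UNSAFE|) lookups instead of a full pass over the file, which is the whole appeal over re-scanning the binary 256 times.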
Super cool.
It's curious that you arrived at +143 for that file.
I just added a brute force scanner which tries all possible values 0-255
using xor and using rot, and for the same file I get +122 (rot122)
$ time co2ba ALTERN.CO call |wc -c
4401
real 0m0.097s
user 0m0.070s
sys 0m0.031s
$ time XA=best co2ba ALTERN.CO call |wc -c
trying all possible XA values...
XA=+122
4354
real 0m5.571s
user 0m5.541s
sys 0m0.032s
For the XA variable I'm using a convention that ^val means xor by val,
and +val means rotate by val
Apparently it makes a difference because the best possible rotate did
slightly better than the best possible xor, when all 256 possible values
were tried. So it's not the case that for every xor result there is an
equivalent rot result at some other offset.
But I wonder if my scanner logic is bad because we both should have
gotten the same value. I presume we both did the same rotate:
(byte+n)%256
I'm not doing anything efficient, it's super brute force.
I start with t=LEN, s=LEN*2, and then walk the binary and t++ every
time I hit any of the unsafe values using xorN. Then at the end, if t<s
then s=t. t is the total bytes when using xorN; s is the smallest t
seen so far. Repeat for rotN. Repeat both for N=0-255.
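(A minimal Python sketch of that scan, with a placeholder unsafe set standing in for co2ba's actual escape list:)

```python
# Brute-force scan over every xor and rotate value, per the description
# above: t = LEN plus one extra byte for every unsafe value hit, and
# s tracks the smallest t seen so far.
UNSAFE = {0x00, 0x0A, 0x0D, 0x7F}  # stand-in set, not co2ba's real one

def scan(data: bytes):
    s = len(data) * 2                 # worst case: every byte escaped
    best = ('+', 0)
    for n in range(256):
        for op in ('^', '+'):
            enc = (b ^ n if op == '^' else (b + n) % 256 for b in data)
            t = len(data) + sum(1 for b in enc if b in UNSAFE)
            if t < s:
                s, best = t, (op, n)
    return best, s
```

That is 512 full passes over the file, which matches the several-seconds runtime shown above.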
I am now also including 127 by default like you are. It's so useful
it's basically user-hostile not to, and it hardly costs anything.
Especially after this gain from the rotate. So there should be no
difference from that.
With some debug echoes added:
$ XA=best ../co2ba.sh ALTERN.CO call |wc -c
trying all possible XA values...
^0 -> 4752
+0 -> 4752
^1 -> 4718
+1 -> 5552
^2 -> 4723
+2 -> 5514
^3 -> 4782
+3 -> 5531
^4 -> 4697
+4 -> 5515
...
^120 -> 3903
+120 -> 3815
^121 -> 3892
+121 -> 3814
^122 -> 3890
+122 -> 3806
^123 -> 3886
+123 -> 3807
...
+140 -> 3877
^141 -> 3839
+141 -> 3856
^142 -> 3837
+142 -> 3883
^143 -> 3834
+143 -> 3880
^144 -> 3846
+144 -> 3887
^145 -> 3840
+145 -> 3892
...
^248 -> 4704
+248 -> 3942
^249 -> 4697
+249 -> 3939
^250 -> 4705
+250 -> 3946
^251 -> 4702
+251 -> 3936
^252 -> 4640
+252 -> 3911
^253 -> 4702
+253 -> 3941
^254 -> 4699
+254 -> 3968
XA=+122 -> 3806 bytes
4354
And there is in fact 3806 bytes of payload after I manually remove the
basic and linebreaks, so at least I'm counting right. The loop that
generates the payload is a totally separate thing later, and the two
things agree on the total at least. And the file works, passes checksum
and runs etc.
Oh yeah I switched to a rolling xor checksum too.
It only needs ints in basic and I think actually catches more errors.
Still kind of swiss cheese compared to real crc algorithms but those are
expensive and this is cheap.
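(Encoder-side, a rolling xor checksum is just one byte of running state. A sketch of the idea — my code, not the actual co2ba implementation:)

```python
# Rolling xor checksum sketch.  The running value never leaves 0-255,
# so the BASIC loader can track it in an ordinary 16-bit integer,
# e.g. something like C=C XOR P after each POKE, assuming the dialect
# has an XOR operator.
def xor_checksum(data: bytes) -> int:
    c = 0
    for b in data:
        c ^= b
    return c
```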
Another idea I did before that was just add the total byte count to the
sum, and have the loader add an extra +1 per poke. That way even if the
data was all 0's, as long as the +1's were actually added one at a time
along the way the sum would break if any bytes were lost, or added for
that matter.
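(That scheme amounts to comparing sum-plus-count on the sending side against an accumulation of byte+1 per poke on the loader side. A toy illustration, with hypothetical helper names:)

```python
# The transmitted check value is byte sum plus byte count; the loader
# adds (b + 1) per poke, so a lost byte breaks the match even when
# the data is all zeros.
def sender_check(data: bytes) -> int:
    return sum(data) + len(data)

def loader_check(data: bytes) -> int:
    s = 0
    for b in data:
        s += b + 1        # the extra +1 per poke
    return s
```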
That worked, but I think the xor is even cheaper and still improves on
the simple sum. I think the main weakness is with strings of repeating
bytes (any value, not just 0): the way xoring the same 2 values exactly
reverses itself means that you can drop one byte and catch it, but if
you drop 2 of the same byte in a row, you wouldn't know it. Those 2
bytes would have had no effect on the sum when they were both present.
Maybe I should go back to adding the total length to the final
comparison after all. That should catch even more & freakier errors and
is still essentially free. I can even still keep the ints. The max
possible file size (29.6k) plus the max possible checksum (255) is
still way short of max int (32k).
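(A quick demonstration of both the blind spot and the length fix — a sketch, not the shipped checksum code:)

```python
# Dropping two adjacent copies of the same byte leaves a pure xor
# checksum unchanged; adding the length to the comparison catches it.
def xsum(data: bytes) -> int:
    c = 0
    for b in data:
        c ^= b
    return c

good = bytes([5, 7, 7, 9])
bad = bytes([5, 9])                    # the two identical 7s lost in transit

assert xsum(good) == xsum(bad)         # xor alone misses the error
assert xsum(good) + len(good) != xsum(bad) + len(bad)  # length catches it
```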
Also I'm calling it !yenc. Because it's not yenc. And yet essentially
is, so it's descriptive of its properties both ways.
I started on rle but it's kind of a transporter accident still. It sorta
mostly lives...
--
bkw
—b9
P.S. A possible lesson for us on rolling our own encoding: Searching for
sample .CO files to test with my histogram program, I found a ZIP file
of Kurt Dekker’s games on Bitchin 100 <https://bitchin100.com/m100-oss/
archive.html>. Kurt actually released those in his own DEC format
<https://github.com/hackerb9/co2do/blob/histogram/histogram/kurtdekker/
util/FTU.TXT>. I downloaded the link labeled “everything in one BIG
zip”, but it did not include any .CO files, so I rolled my own dec2co.sh
<https://github.com/hackerb9/co2do/blob/histogram/histogram/kurtdekker/
dec2co.sh> program. Later I found that bitchin100 /did/ have
the .CO files, merely misfiled, and I was rather surprised to see that three
of them did not exactly match mine. It seems there’s a bug in the tool
Kurt released (FTU.BAS <https://github.com/hackerb9/co2do/blob/
histogram/histogram/kurtdekker/util/FTU.BAS>) which occasionally causes
it to emit bytes beyond the length specified in the .CO file header.
On Mon, Mar 9, 2026 at 3:24 PM Brian K. White <[email protected]> wrote:
On 3/9/26 14:03, Brian K. White wrote:
> Wow, I haven't tested this enough to push it up to github yet (I
> haven't even tried loading the result on a 100 yet to make sure it
> actually decodes correctly) but I think I just reduced the output .DO
> size from 5305 to 4378 just by applying a static offset to all bytes
> before encoding.
>
> Almost a whole 1K out of 5 just from that!

Ran ok. The decode time stayed the same, but the transfer time went
down and the ram used went down of course.
--
bkw