It works all on one line after all!
It doesn't make the file any shorter; it's actually a couple of bytes larger
because "ELSE:" is longer than "\r2".
But it did get about 1 second faster.
0READT:CLEAR2,T:DEFINTI,O,C,V,L:DEFSNGA,K,S,T,X,E:DEFSTRB,M,D,N:READT,L,X,K,N,O,M:E=T+L-1:A=T:S=0:C=0:CLS:PRINT"Installing "N
1PRINT@20,CINT((L-(E-A))/L*100)"%":READD:FORI=1TOLEN(D):B=MID$(D,I,1):IFB=MTHENC=O:NEXT:ELSEV=ASC(B)-C:POKEA,V:C=0:A=A+1:S=S+V:NEXT:IFA<=ETHEN1
2PRINT:IFS<>KTHENPRINT"Bad Checksum":END
3CALLX
4DATA59346,3614,59346,454932,"ALTERN.CO",64,"!"
On 3/4/26 07:11, Brian K. White wrote:
I've gotten co2ba.sh about as good as I think I'm going to get it for
now.
It generates a larger block of loader code, which also runs slower,
but the generated file is somewhat smaller, and that ends up making the
total job take about the same time with a 3.6k sample .CO input file.
The sample file used for these tests and comparisons is
ALTERN.CO manually reconstituted from
https://github.com/LivingM100SIG/Living_M100SIG/blob/main/M100SIG/Lib-07-UTILITIES/ALTERN.100
My previous loader code looks like this:
(extra blank lines to help reading after email wraps the long lines):
-----------
0CLEAR0,59346:A=59346:S=0:N$="ALTERN.CO":CLS:?"Installing "N$" ..";
1D$="":READD$:FORI=1TOLEN(D$)STEP2:B=(ASC(MID$(D$,I,1))-97)*16+ASC(MID$(D$,I+1,1))-97:POKEA,B:A=A+1:S=S+B:NEXT:?".";:IFA<62960THEN1
2IFS<>454932THEN?"Bad Checksum":END
3CALL59346
4DATAmndmpfmndbeccklppfolcbim...
-----------
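For anyone who wants to check the old encoding off the machine, here is a minimal Python sketch of what line 1 of that loader does: each byte is stored as two letters in the range a..p, high nibble first. The function names are made up for illustration, and this only models the BASIC arithmetic; it is not the generator script itself.

```python
def decode_old(data: str) -> bytes:
    """Model of B=(ASC(c1)-97)*16+ASC(c2)-97 from the old loader's line 1."""
    out = bytearray()
    for i in range(0, len(data), 2):
        hi = ord(data[i]) - 97        # first letter: high nibble, 'a' = 0
        lo = ord(data[i + 1]) - 97    # second letter: low nibble
        out.append(hi * 16 + lo)
    return bytes(out)


def encode_old(payload: bytes) -> str:
    """Inverse mapping (hypothetical helper): two letters a..p per byte."""
    return "".join(chr(97 + (b >> 4)) + chr(97 + (b & 15)) for b in payload)


# First 18 characters of the old DATA line quoted above.
sample = "mndmpfmndbeccklppf"
print(decode_old(sample).hex())   # cd3cf5cd31422abff5
```

Two characters per payload byte is why the old output is a little more than twice the size of the .CO file.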
With the new scheme I'm down to this
-----------
0READT:CLEAR2,T:DEFINTI,O,C,V,L:DEFSNGA,K,S,T,X,E:DEFSTRB,M,D,N:READT,L,X,K,N,O,M:E=T+L-1:A=T:S=0:C=0:CLS:PRINT"Installing "N
1PRINT@20,CINT((L-(E-A))/L*100)"%":READD:FORI=1TOLEN(D):B=MID$(D,I,1):IFB=MTHENC=O:NEXT
2V=ASC(B)-C:POKEA,V:C=0:A=A+1:S=S+V:NEXT:IFA<=ETHEN1
3PRINT:IFS<>KTHENPRINT"Bad Checksum":END
4PRINT"Done. Please type: NEW":SAVEMN,T,E,X
5DATA59346,3614,59346,454932,"ALTERN.CO",64,"!"
6DATA"Í<õÍ1B*¿õë!a!DÍõ|µÊêç...
-----------
Part of the size difference comes from apples-to-oranges differences that
make it not a direct comparison. The two could be more similar than
this if I wanted. Previously I just had the generator write the .CO
header values directly into the code instead of using a header DATA
line, while in the new one I'm doing it all from a DATA line, because
I like that the loader code is then self-contained & portable. You
could copy the loader block and stick it on top of some other payload
and it would work.
And another part is I made a real percent-done display on the new one
because it doesn't cost any run time, just a few more bytes of file
size. It only runs once per data line and outside of the inner loop.
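The percent expression in the new loader, CINT((L-(E-A))/L*100), is easy to sanity-check away from the machine. A small Python sketch follows; the function name is made up, and round() is only a stand-in for CINT, since whether Model 100 CINT rounds or truncates is not verified here. The sample values are chosen so that it makes no difference.

```python
def percent_done(a: int, top: int, length: int) -> int:
    """Model of CINT((L-(E-A))/L*100), where E=T+L-1 and A is the next address."""
    end = top + length - 1                      # E=T+L-1
    return round((length - (end - a)) / length * 100)


TOP, LENGTH = 59346, 3614                       # T and L from the header DATA line
print(percent_done(TOP, TOP, LENGTH))           # 0   (nothing poked yet)
print(percent_done(TOP + 1807, TOP, LENGTH))    # 50  (halfway through)
print(percent_done(TOP + LENGTH, TOP, LENGTH))  # 100 (A has passed E)
```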
The DEFINT/DEFSNG/DEFSTR declarations make line 0 longer, but they also
make the loader run several seconds faster.
I actually have an even slightly shorter version just by using the
range syntax for the DEF*
DEFINTA-E:DEFSNGF-K:DEFSTRL-O
vs
DEFINTI,O,C,V,L:DEFSNGA,K,S,T,X,E:DEFSTRB,M,D,N
but it makes the code just about unreadable since the letters lose all
meaning.
The notable points:
No GOTO in the inner loop, just NEXT.
That saved a line and also made it so that the generator script doesn't
have any forward references, so it can just increment line numbers
without having to hard-code something like a GOTO3 on line 1.
Instead of
O=64 C=0 ... IFB=MTHENC=1 ... V=ASC(B)-(O*C)
(on every byte set a decode flag to 0 or 1, then multiply the encoding
offset by the on/off flag to enable/disable the offset)
Just
O=64 C=0 ... IFB=MTHENC=O ... V=ASC(B)-C
(instead of setting the encode flag to 0 or 1, just set it to 0 or the
actual offset value, then just subtract it directly without the
multiplication step. Always subtract; sometimes it's 0, sometimes it's 64.)
As far as I can tell, 0, 1, and 64 are all the same int and the same
work to process as long as the variables are declared to the same type.
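The two variants can be compared side by side in a short Python sketch. This only models the BASIC logic: the function names are made up, and MARK and OFFSET take the values from the header DATA line ("!" and 64). Each character stands for one byte, so ord() plays the role of ASC().

```python
MARK = "!"      # M in the loader's header DATA line
OFFSET = 64     # O in the loader's header DATA line


def decode_with_flag(data: str) -> bytes:
    """Earlier variant: C is a 0/1 flag, V=ASC(B)-(O*C)."""
    out = bytearray()
    c = 0
    for b in data:
        if b == MARK:
            c = 1                 # next character is offset-encoded
            continue
        out.append(ord(b) - OFFSET * c)
        c = 0
    return bytes(out)


def decode_with_offset(data: str) -> bytes:
    """Current variant: C holds 0 or the offset itself, V=ASC(B)-C."""
    out = bytearray()
    c = 0
    for b in data:
        if b == MARK:
            c = OFFSET            # store the offset, no multiply later
            continue
        out.append(ord(b) - c)    # always subtract; c is 0 or OFFSET
        c = 0
    return bytes(out)


sample = "A!@B!M"                 # A, escaped 0x00, B, escaped 0x0D
print(decode_with_offset(sample).hex())   # 4100420d
```

Both functions return the same bytes for any input; the second just drops one multiplication per byte.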
Already mentioned: all variables come from data, so the loader code can be canned and reused, etc.
If the top address is the very first data value, you can read it, use
it to clear, and then just read it again to still have it after the
clear without wasting much space or cpu, and without needing the
generator script to write the value twice in duplicate assignments
before & after the clear.
Already mentioned the fancy percent-done progress.
The generator script has config options so you can change the behavior
at run-time by env variables.
So you can change the starting line number, the line number increment,
the length of the data lines, the encoding mark character, the
encoding offset value.
I have the generator script now counting all the bytes in the output
line when building data lines and deciding when to start a new line,
so now every line fills to the specified max length as much as
possible even though the size of the data varies because of the varied
encoding.
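A rough Python sketch of that line-filling idea, under stated assumptions: the set of escaped bytes (control codes, the double quote, DEL, and the mark itself) is a guess and not taken from co2ba.sh, and the names and the overhead parameter (bytes used by the line number and the DATA keyword) are hypothetical. Each payload byte becomes a one- or two-character unit, and a unit is never split across lines, since the loader as written expects the escaped character right after the mark.

```python
MARK = "!"
OFFSET = 64
# Assumption: which bytes need escaping; the real script may differ.
ESCAPED = set(range(32)) | {34, 127, ord(MARK)}


def encode_units(payload: bytes) -> list[str]:
    """One unit per payload byte: the raw character, or MARK + shifted character."""
    units = []
    for b in payload:
        if b in ESCAPED:
            units.append(MARK + chr(b + OFFSET))
        else:
            units.append(chr(b))
    return units


def pack_lines(units: list[str], max_len: int, overhead: int = 0) -> list[str]:
    """Fill each line as full as possible without exceeding max_len bytes."""
    lines, cur = [], ""
    for u in units:
        if cur and overhead + len(cur) + len(u) > max_len:
            lines.append(cur)     # unit would not fit: start a new line
            cur = ""
        cur += u
    if cur:
        lines.append(cur)
    return lines


units = encode_units(bytes([0x41, 0x00, 0x22, 0x42, 0x0D, 0x43]))
print(pack_lines(units, max_len=4))   # ['A!@', '!bB', '!MC']
```

Counting the encoded length rather than the payload length is what lets every line fill up even though some bytes take two characters.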
All in all, the new way generates a smaller file, but the loader code
runs slower, and it ends up taking almost exactly the same total time
to load. The smaller file size is a win though, and the total time is
actually *slightly* in favor of the new way.
The new scheme is conceptually simple, but it takes 2 lines of code and
includes an IF branch, whereas in the old way the entire loop is on a single
line and the same math ops happen for every byte, with no branching.
I read that one optimization is to move initialization/setup code to
the end instead of the top, and use goto or gosub to jump to it and
back, and have your tight loop as close to the top as possible.
Something about BASIC searching from the start of the file repeatedly?
Well I tried that and it made no difference in my run times. I tried
both goto and gosub.
For now I kept the old script in the repo as co2ba_old.sh since its
output is probably still useful, being all pure low-ASCII printable text.
Converting the same input:
ALTERN.CO 3620 bytes

old:
  7749 bytes
  xfer time: 1:05
  load time: 2:04
  total: 3:09

new:
  5334 bytes
  xfer time: 0:45
  load time: 2:23
  total: 3:07
https://github.com/bkw777/dl2/blob/master/co2ba.md
https://github.com/bkw777/dl2/blob/master/co2ba.sh
Anyway, thanks again for the idea Steve!
--
bkw