On Wed, Mar 1, 2023 at 6:39 AM Brian K. White wrote:

> I also wrote a renumberer and packer in bash (actually in awk too before
> that, still in the repo in the "attic")
>
> https://github.com/bkw777/BA_stuff
>


Okay, that’s freaking awesome. I can imagine doing it in AWK, but making
that work in Bash is impressive.


> I was trying [...]
>
> ONAGOSUB310,311,312,,,,316,317,318,319
>
> and reseq didn't handle that right. I don't know what other renumberers do
> because I decided at least for what I was working on it would be more
> convenient to renumber on the host than on the 100.
>


Tricky! I don’t know about other renumberers, but a bit ago I wrote
something similar: a program to remove all unnecessary comments from a
program. Part of it was a Bash script, jumpdestinations
<https://github.com/hackerb9/M100LE/blob/main/adjunct/jumpdestinations>,
which identifies the lines which are referenced by other lines and thus
shouldn’t be removed even if they only contain a REM statement. Until
today, I used this regular expression:

egrep -io '(GO\s*TO|GOSUB|THEN|ELSE|RESTORE|RESUME|RETURN|RUN)(\s*,?\s*[0-9]+)+'

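As you can see, I had missed exactly the same subtlety as reseq: the
pattern expects digits after every comma, so it stops matching at the first
empty slot and never sees 316 through 319.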
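
One way to tolerate the empty slots is to let the list after the keyword be any run of digits, commas, and whitespace, and pull the numbers out afterwards. The following is a minimal sketch in Python (chosen only because it is easy to test), not the updated jumpdestinations script; the names `old_targets` and `jump_targets` are hypothetical, and like the original pattern it makes no attempt to skip string literals or REM text.

```python
import re

KEYWORDS = r"GO\s*TO|GOSUB|THEN|ELSE|RESTORE|RESUME|RETURN|RUN"

# The pattern quoted above: every comma must be followed by digits.
OLD = re.compile(rf"({KEYWORDS})(\s*,?\s*[0-9]+)+", re.IGNORECASE)

# Looser pattern (an assumption of this sketch): after the keyword, accept any
# run of digits, commas and whitespace, so empty slots do not end the match.
NEW = re.compile(rf"(?:{KEYWORDS})([\s,0-9]*)", re.IGNORECASE)


def old_targets(line):
    """Line numbers found by the original pattern."""
    return [int(n) for m in OLD.finditer(line)
            for n in re.findall(r"[0-9]+", m.group(0))]


def jump_targets(line):
    """Line numbers found when empty list slots are allowed."""
    return [int(n) for m in NEW.finditer(line)
            for n in re.findall(r"[0-9]+", m.group(1))]


if __name__ == "__main__":
    sample = "10 ONAGOSUB310,311,312,,,,316,317,318,319"
    print(old_targets(sample))
    print(jump_targets(sample))
```

On the ONAGOSUB line above, the old pattern stops after 312 while the looser one also returns 316 through 319.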


> The packer might be slightly wrong in one aspect. It converts all prints
> to ?s and rems to 's, which does make the ascii file smaller, but I have
> since read somewhere that one or both of those may actually result in
> slightly more ram usage on the 100? I haven't performed a test to find out.
> So the packer could maybe use an option for that.
>


I analyzed the BA format
<http://fileformats.archiveteam.org/wiki/Tandy_200_BASIC_tokenized_file>:
? and PRINT are tokenized identically, so that choice makes no difference
once the program is loaded. REM and ' are not: ' is shorter in the ASCII
file but takes more bytes than REM once tokenized. So if you are looking to
save space in the ASCII file, the packer was correct to use ', but if
you want to save space in the BA file, the packer should use REM.

I am not sure there’s a good use case for minimizing the ASCII file these
days. Back when 7-bit transfers via TELCOM or LOAD “COM:98N1” were the
norm, maybe it was worth it, but optimizing for it now seems like a mistake,
given that the ASCII file is only needed briefly while the BA file will
remain on the computer as long as the program is used.

> I might try to add line-unwrapping to the packer next. There are some
> unwrapping that would be too involved to try to do, like keeping track of
> broken but recombinable prints and other literals.
>


Yeah, a proper BASIC parser would be pretty much necessary. But, once you
had that, you might be able to do fancier things, like unwrapping
subroutines that are only reachable from one location. For example, lines
31000–31020 could be merged into line 23000 in TSWEEP.DO:

23000 GOSUB 24000:GOSUB 31000:KEY(8) STOP:PRINT@3*YO%+35,"Quit?";:PRINT@4*YO%+35,"(y/n)";:GOSUB20000:KEY(8) ON:IF YN%=1 THEN CLS:MENU ELSE GOSUB 17200 : GOSUB 17300 : RETURN
⋮
30090 PRINT@7*YO%+PO%, CHR$(237)+STRING$(38,232)+CHR$(238);:KEY(8) ON:RETURN
31000 FOR TY = 0 TO 7 'subroutine: Clear Right Pane
31010 PRINT @ TY*YO%+32,STRING$(8,32);
31020 NEXT : RETURN


> But it should be no problem to at least do some combining where if the end
> of one line is not something like a THEN branch, and the next line number
> is not a jump target anywhere else in the file, and the total new combined
> line would be under either the 127 or 254 threshold (whichever you want)
> it's ok to combine.
>
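
The combining rule described in the quote can be sketched roughly as follows. This is a hedged Python illustration, not the packer's actual code: the function names `parse` and `pack`, the keyword list in `BLOCKS_MERGE`, and the choice to count the line number and space toward the length limit are all assumptions. The keyword check is deliberately conservative, so it also refuses lines where IF or REM merely appear inside a string.

```python
import re

# If the earlier line contains any of these, appended statements would become
# part of a conditional branch or be swallowed by a comment. (Assumed list.)
BLOCKS_MERGE = re.compile(r"IF|THEN|ELSE|REM|'", re.IGNORECASE)


def parse(line):
    """Split '100 PRINT X' into (100, 'PRINT X')."""
    m = re.match(r"\s*(\d+)\s*(.*)", line)
    return int(m.group(1)), m.group(2).rstrip()


def pack(lines, targets, limit=254):
    """Join each line onto the previous one when it is safe to do so.

    lines   -- program lines in ascending order, as strings
    targets -- set of line numbers referenced by GOTO, GOSUB, etc.
    limit   -- maximum length of a combined line (127 or 254)
    """
    out = []
    for line in lines:
        num, body = parse(line)
        if out:
            prev_num, prev_body = out[-1]
            merged = f"{prev_body}:{body}"
            if (num not in targets
                    and not BLOCKS_MERGE.search(prev_body)
                    and len(f"{prev_num} {merged}") <= limit):
                out[-1] = (prev_num, merged)
                continue
        out.append((num, body))
    return [f"{n} {b}" for n, b in out]


if __name__ == "__main__":
    program = ["10 A=1", "20 B=2", "30 IF A=B THEN 50",
               "40 PRINT A", "50 PRINT B", "60 END"]
    for packed in pack(program, targets={50}):
        print(packed)
```

In the small example, lines 10 through 30 combine, line 40 stays separate because the line before it ends in a THEN branch, and line 50 stays separate because it is a jump target.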

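Nice. Every line you can remove from the source code saves five bytes of
per-line overhead in the .BA file (two for the link pointer, two for the
line number, and one for the terminating NUL).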

I'm not familiar with the 127 character limit. Is that for the NEC PCs?
—b9

P.S. What’s the character limit in the .BA format? A quick test of

10???????????????????????????????????????????????????

shows that I can have a line in memory that is longer than EDIT can
handle. Perhaps it would be a useful addition to my tokenizer to be able to
pack lines at the token level.
