On Wed, Mar 1, 2023 at 6:39 AM Brian K. White wrote:
> I also wrote a renumberer and packer in bash (actually in awk too before
> that, still in the repo in the "attic")
>
> https://github.com/bkw777/BA_stuff

Okay, that’s freaking awesome. I can imagine doing it in AWK, but making that work in Bash is impressive.

> I was trying [...]
>
> ONAGOSUB310,311,312,,,,316,317,318,319
>
> and reseq didn't handle that right. I don't know what other renumberers do
> because I decided at least for what I was working on it would be more
> convenient to renumber on the host than on the 100.

Tricky! I don’t know about other renumberers, but a bit ago I wrote something similar: a program to remove all unnecessary comments from a program. Part of it was a Bash script, jumpdestinations <https://github.com/hackerb9/M100LE/blob/main/adjunct/jumpdestinations>, which identifies the lines which are referenced by other lines and thus shouldn’t be removed even if they only contain a REM statement. Until today, I used this regular expression:

    egrep -io '(GO\s*TO|GOSUB|THEN|ELSE|RESTORE|RESUME|RETURN|RUN)(\s*,?\s*[0-9]+)+'

As you can see, I had missed exactly the same subtlety as reseq.

> The packer might be slightly wrong in one aspect. It converts all prints
> to ?s and rems to 's, which does make the ascii file smaller, but I have
> since read somewhere that one or both of those may actually result in
> slightly more ram usage on the 100? I haven't performed a test to find out.
> So the packer could maybe use an option for that.

I analyzed the BA format <http://fileformats.archiveteam.org/wiki/Tandy_200_BASIC_tokenized_file> and both ? and PRINT are encoded the same way and make no difference once the program is loaded. However, REM and ' are not. If you are looking to save space in the ASCII file, then the packer was correct to use ', but if you want to save space in the BA file, the packer should have used REM. I am not sure there’s a good use case for minimizing the ASCII file these days. Back when 7-bit transfers via TELCOM or LOAD "COM:98N1" were the norm, maybe it was worth it, but it seems like a mistake given that the ASCII file is only needed briefly while the BA file will remain on the computer as long as the program is used.

> I might try to add line-unwrapping to the packer next. There are some
> unwrapping that would be too involved to try to do, like keeping track of
> broken but recombinable prints and other literals.

Yeah, a proper BASIC parser would be pretty much necessary. But, once you had that, you might be able to do fancier things, like unwrapping subroutines that are only reachable from one location. For example, lines 31000–31020 could be merged into line 23000 in TSWEEP.DO:

    23000 GOSUB 24000:GOSUB 31000:KEY(8) STOP: PRINT@3*YO%+35,"Quit?";:PRINT@4*YO%+35,"(y/n)";:GOSUB20000:KEY(8) ON:IF YN%=1 THEN CLS:MENU ELSE GOSUB 17200 : GOSUB 17300 : RETURN
    ⋮
    30090 PRINT@7*YO%+PO%, CHR$(237)+STRING$(38,232)+CHR$(238);:KEY(8) ON:RETURN
    31000 FOR TY = 0 TO 7 'subroutine: Clear Right Pane
    31010 PRINT @ TY*YO%+32,STRING$(8,32);
    31020 NEXT : RETURN

> But it should be no problem to at least do some combining where if the end
> of one line is not something like a THEN branch, and the next line number
> is not a jump target anywhere else in the file, and the total new combined
> line would be under either the 127 or 254 threshold (whichever you want)
> it's ok to combine.

Nice. Every line you can remove from the source code will save five bytes from the .BA file. I'm not familiar with the 127 character limit. Is that for the NEC PCs?

--b9

P.S. What’s the character limit in the .BA format? A quick test of 10??????????????????????????????????????????????????? shows that I can have a program in memory which is larger than EDIT can handle. Perhaps it would be a useful addition to my tokenizer to be able to pack lines at the token level.
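
For the ON...GOSUB case with empty slots, here is a rough sketch of how the pattern could be loosened so that a run of commas no longer ends the match. This is not the actual jumpdestinations script: the function name jump_targets and the `[[:space:],]*` separator are assumptions for illustration, and POSIX character classes stand in for `\s` so it does not depend on GNU grep.

```shell
# jump_targets: read a BASIC listing on stdin and print every line number
# that follows a jump keyword, one per line, sorted and de-duplicated.
# (Hypothetical helper for illustration; not part of jumpdestinations.)
jump_targets() {
    # The separator "[[:space:],]*" accepts any mix of blanks and commas,
    # so the empty slots in "310,311,312,,,,316" do not stop the match.
    grep -Eio '(GO[[:space:]]*TO|GOSUB|THEN|ELSE|RESTORE|RESUME|RETURN|RUN)([[:space:],]*[0-9]+)+' |
        grep -Eo '[0-9]+' |   # keep only the numbers from each match
        sort -n -u
}
```

Feeding it the problem line, `printf '%s\n' '10 ONAGOSUB310,311,312,,,,316,317,318,319' | jump_targets`, should list 310, 311, 312, 316, 317, 318 and 319. With the single optional comma (`,?`) the match would stop after 312. Like the original, it will also pick up keywords that happen to sit inside strings or comments, which only errs on the side of keeping a line.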

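
The combining rule quoted above (previous line has no THEN branch, next line is not a jump target, combined length under the threshold) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the BA_stuff packer: merge_lines is a made-up name, the jump targets are assumed to be in a file with one line number per line, and "something like a THEN branch" is approximated very conservatively by refusing to append to any line containing IF, REM, DATA or an apostrophe.

```shell
# merge_lines LIMIT TARGETS_FILE < listing
# Joins each BASIC line onto the previous one with ":" when that looks safe.
# (Hypothetical sketch; the safety test is a crude assumption, not a parser.)
merge_lines() {
    awk -v limit="$1" -v targets="$2" '
        # Load the jump targets, one line number per line.
        BEGIN { while ((getline t < targets) > 0) target[t + 0] = 1 }
        NF == 0 { next }                      # ignore blank lines
        {
            num = $1 + 0
            body = $0
            sub(/^[0-9]+ */, "", body)        # statement text without its line number
            # Appending after IF, REM, DATA or a quote-style comment would change meaning.
            unsafe = (toupper(held) ~ /IF|REM|DATA/ || index(held, "\047") > 0)
            if (held != "" && !(num in target) && !unsafe &&
                length(held) + 1 + length(body) <= limit) {
                held = held ":" body          # combine and drop the line number
            } else {
                if (held != "") print held
                held = $0
            }
        }
        END { if (held != "") print held }
    '
}
```

For example, with a targets file containing only 30, the four lines `10 CLS`, `20 PRINT "A"`, `30 PRINT "B"`, `40 END` run through `merge_lines 254 targets.txt` should come out as `10 CLS:PRINT "A"` and `30 PRINT "B":END`. The limit here counts characters of the ASCII line including its line number, which is an assumption about what the 127 or 254 threshold measures.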