I used Renum1 from the Club100 library. I inspected the tokenized .BA file in a hex editor, and as far as I can tell, line numbers in references aren’t compressed in any way. In my original program, most of my line numbers were between 1000 and 30000, so each reference to them took 4-5 bytes.
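
As a rough check of that arithmetic, here is a small Python sketch. It assumes each jump target is stored as plain ASCII digits, one byte per digit, which is what the hex dump suggested. The names `reference_bytes` and `renumber_map` and the sample numbers are made up for illustration and are not taken from Renum1.

```python
def reference_bytes(targets):
    """Bytes used by jump targets, assuming one ASCII byte per digit."""
    return sum(len(str(t)) for t in targets)


def renumber_map(line_numbers, start=1, step=1):
    """Map each existing line number to a new one, preserving order."""
    return {old: start + i * step for i, old in enumerate(sorted(line_numbers))}


# Hypothetical program: four lines and four references to them.
lines = [1000, 1010, 2000, 30000]
refs = [2000, 30000, 30000, 1000]

mapping = renumber_map(lines)
before = reference_bytes(refs)
after = reference_bytes(mapping[r] for r in refs)
print(before, after, before - after)
```

With these sample numbers the references shrink from 18 bytes to 4. A real program has far more references, so the savings add up quickly.
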
Now most of my line-number references are 1-3 bytes after renumbering from 1.

I also use Packer.BA from Club100. It removes comments and combines lines that aren’t referenced by GOTO, GOSUB, etc.

Best,
George

On Tue, Feb 28, 2023 at 9:49 PM B 9 <[email protected]> wrote:

> On Tue, Feb 28, 2023 at 4:55 PM [email protected] <[email protected]>
> wrote:
>
>> Thanks all!
>>
>> At some point I’ll look into adding Tokenization directly into Github.
>
> Awesome. It looks like compiling and running a C program may be trivial in
> the yaml file:
>
> - uses: actions/checkout@v3
>
> - run: |
>     make
>     ./tokenize FOO.DO
>
> By the way, you may be able to use a Python lexer, such as ply
> <https://www.dabeaz.com/ply/ply.html>, to create a Python program from my
> flex source code. However, I suspect that will be more work than it's
> worth.
>
>> I also used a line renumberer which brought down the .BA file to 76% of
>> the previous version.
>
> Wow. What renumberer did you use? And why did renumbering reduce the file
> size?
>
> By the way, a tokenizer should be able to reduce the file size
> dramatically by simply omitting the string after REM statements. Having it
> remove vestigial lines completely would be slightly trickier and probably
> require a second pass as it'd have to make sure the line was not a target
> of GOTO (or any of the other varied ways of referring to line numbers).
>
> —b9
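
The second pass described above could start by collecting every line number that is a jump target, then dropping only the comment lines that are not in that set. Below is a minimal Python sketch of that idea, working on the plain-text listing rather than the tokenized file. The keyword list in `JUMP` is an assumption and should be checked against the BASIC dialect; the helper names are hypothetical, and this is not how Packer.BA or any existing tokenizer is known to work. It ignores edge cases such as REM appearing inside DATA statements.

```python
import re

# Keywords that may be followed by one or more line numbers.
# Assumed list; verify it against the BASIC dialect in use.
JUMP = re.compile(
    r"(?:GOTO|GOSUB|THEN|ELSE|RESTORE|RESUME|RUN)\s*(\d+(?:\s*,\s*\d+)*)",
    re.IGNORECASE,
)
LINE = re.compile(r"^\s*(\d+)\s*(.*)$")
COMMENT = re.compile(r"REM|'", re.IGNORECASE)


def strip_strings(code):
    """Blank out string literals so text like "GOTO 999" is not scanned."""
    return re.sub(r'"[^"]*"?', '""', code)


def jump_targets(program):
    """First pass: collect every line number named after a jump keyword."""
    targets = set()
    for raw in program.splitlines():
        m = LINE.match(raw)
        if not m:
            continue
        body = strip_strings(m.group(2))
        # Ignore anything after a REM or apostrophe comment.
        body = COMMENT.split(body, maxsplit=1)[0]
        for ref in JUMP.finditer(body):
            targets.update(int(n) for n in re.findall(r"\d+", ref.group(1)))
    return targets


def removable_lines(program):
    """Second pass: comment-only lines that nothing jumps to."""
    targets = jump_targets(program)
    removable = []
    for raw in program.splitlines():
        m = LINE.match(raw)
        if m and COMMENT.match(m.group(2)) and int(m.group(1)) not in targets:
            removable.append(int(m.group(1)))
    return removable


demo = """10 REM DEMO
20 GOSUB 100
30 IF A=1 THEN 50 ELSE 60
40 REM UNUSED
50 REM TARGET
60 ON X GOTO 20,30
70 PRINT "GOTO 999"
80 END
100 RETURN"""

print(sorted(jump_targets(demo)))
print(removable_lines(demo))
```

For the demo listing, lines 10 and 40 come back as removable. Line 50 is also a comment, but it is a THEN target, so it has to stay (or the reference to it would need to be redirected to the next line).
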
