> On Jul 18, 2018, at 1:21 PM, Paul Berger via cctalk <cctalk@classiccmp.org> 
> wrote:
> 
> I would think that any interpreted BASIC would do this or for that matter any 
> interpreted language except maybe for APL which is pretty much written with 
> tokens anyway.  One other exception I can think of is Perl, which is stored 
> as source text.  Saving in tokenized form was good for two reasons: it saved 
> storage space, both in memory and on mass storage and when you loaded the 
> program it was ready to go.
> 
> Paul...
>> 
>>>>> I think it was called a "decompiler" though.  Seemed like magic at the 
>>>>> time.
>>>>> 
>>>>> Googling reveals "You may be remembering the BASIC PLUS
>>>>> decompiler under RSTS.  RSTS BASIC PLUS was interpreted from "push-pop" 
>>>>> code.
>>>>> The symbol table was available in the compiled file, and the 
>>>>> correspondence
>>>>> between push-pop operations and BASIC PLUS source was very close, so you
>>>>> could get back very reasonable code."
>>>>> 
>>>>> And our previous discussion of it a decade ago:
>>>>> 
>>>>> https://marc.info/?l=classiccmp&m=121804804023540&w=2

I would not say "written with tokens".

BASIC-PLUS essentially used a stack machine code, easy to generate and pretty 
efficient.  It wasn't designed to be reversible, but since the symbol table was 
saved as well (had to be, to allow for incremental editing and interactive 
debugging) you could reverse it pretty easily.  This sort of thing has a long 
history.  UCSD Pascal used something similar, which it called "P-code".  The 
TUTOR language of the U of Illinois PLATO system did as well, except for 
expressions which were compiled into actual machine code.  That sort of mixed 
encoding was used a decade earlier in the first ALGOL compiler, by Dijkstra and 
Zonneveld, 1961, for the EL-X1.  And yes, it's still done a lot; I believe 
Python is a good example.
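
To make the "stack code plus symbol table" point concrete, here is a minimal 
sketch in Python.  The opcode names, the tuple layout and the three-entry 
symbol table are all invented for illustration; this is not the real 
BASIC-PLUS encoding, just the general idea of why keeping the symbol table 
makes the code easy to turn back into source.

```python
import operator

# Hypothetical push-pop code: opcode names and layout are assumptions made
# for this sketch, not the actual BASIC-PLUS format.
BINOPS = {
    "ADD": ("+", 1, operator.add),
    "SUB": ("-", 1, operator.sub),
    "MUL": ("*", 2, operator.mul),
    "DIV": ("/", 2, operator.truediv),
}

SYMBOLS = ["A", "B", "C"]  # symbol table saved alongside the code

# Stack code for: LET A = B + C * 2
CODE = [
    ("PUSHVAR", 1),    # push value of B
    ("PUSHVAR", 2),    # push value of C
    ("PUSHCONST", 2),  # push literal 2
    ("MUL", None),
    ("ADD", None),
    ("POPVAR", 0),     # pop result into A
]

def run(code, values):
    """Interpret the stack code; values[i] holds the variable SYMBOLS[i]."""
    stack = []
    for op, arg in code:
        if op == "PUSHVAR":
            stack.append(values[arg])
        elif op == "PUSHCONST":
            stack.append(arg)
        elif op in BINOPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINOPS[op][2](left, right))
        elif op == "POPVAR":
            values[arg] = stack.pop()
    return values

def decompile(code, symbols):
    """Rebuild source text by 'executing' the code on strings, not values."""
    stack = []  # entries are (text, precedence)
    lines = []
    for op, arg in code:
        if op == "PUSHVAR":
            stack.append((symbols[arg], 3))
        elif op == "PUSHCONST":
            stack.append((str(arg), 3))
        elif op in BINOPS:
            sym, prec, _ = BINOPS[op]
            rtext, rprec = stack.pop()
            ltext, lprec = stack.pop()
            if lprec < prec:       # parenthesize only where needed
                ltext = f"({ltext})"
            if rprec <= prec:
                rtext = f"({rtext})"
            stack.append((f"{ltext} {sym} {rtext}", prec))
        elif op == "POPVAR":
            text, _ = stack.pop()
            lines.append(f"LET {symbols[arg]} = {text}")
    return lines

print(decompile(CODE, SYMBOLS))
print(run(CODE, [0, 3, 4]))
```

Without the symbol table the decompiler could only produce slot numbers; with 
it, the output reads like the original statement.  Comments and the original 
spacing are gone either way, since they never made it into the stack code.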

A somewhat different approach is found in RT-11 BASIC, a somewhat simpler 
language than BASIC-PLUS and an unrelated implementation.  That one does 
convert the text into tokens; it doesn't generate a stack language 
transformation as B+ did.  And the token encoding is explicitly designed to be 
reversible: when you use the LIST command the token stream is converted back to 
source text.  That means, for example, that comments are included in the token 
stream (unlike B+).
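
A reversible token encoding of that general kind can be sketched in a few 
lines of Python.  The keyword list, the token values (0x80 and up) and the 
function names are assumptions chosen for the example, not the actual RT-11 
BASIC format, and the input is assumed to be plain ASCII.

```python
# Hypothetical token table: the keywords and byte values are assumptions
# for this sketch, not the real RT-11 BASIC encoding.
KEYWORDS = ["PRINT", "LET", "GOTO", "IF", "THEN", "REM"]
TOKEN_OF = {kw: 0x80 + i for i, kw in enumerate(KEYWORDS)}
KEYWORD_OF = {tok: kw for kw, tok in TOKEN_OF.items()}

def tokenize(line: str) -> bytes:
    """Replace keywords with one-byte tokens; keep everything else as text."""
    out = bytearray()
    i = 0
    in_string = False
    while i < len(line):
        kw = None
        if not in_string:
            kw = next((k for k in KEYWORDS if line.startswith(k, i)), None)
        if kw is not None:
            out.append(TOKEN_OF[kw])
            i += len(kw)
            if kw == "REM":
                # The comment text is stored verbatim after the REM token.
                out += line[i:].encode("ascii")
                break
        else:
            if line[i] == '"':
                in_string = not in_string  # no tokenizing inside strings
            out.append(ord(line[i]))
            i += 1
    return bytes(out)

def list_line(tokens: bytes) -> str:
    """LIST: expand each token byte back into its keyword."""
    return "".join(KEYWORD_OF.get(b, chr(b)) for b in tokens)

src = '10 IF A>1 THEN PRINT "IF" REM loop check'
stored = tokenize(src)
print(len(src), len(stored))
print(list_line(stored) == src)
```

The stored form is shorter than the source, and because the comment text 
travels with the tokens, listing gives back exactly what was typed.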

        paul
