Hi Alex,

> To explain this, I'd like to refer to "doc/structures" in each model.
> "doc/structures" is the foundation of each implementation, like a set of

That seems like a long-term assignment;-)

> The primary data types of picoLisp2 are:
>
>    xxxxxxxxxxxxxxxxxxxxxxxxxxxxx010 Number
>    xxxxxxxxxxxxxxxxxxxxxxxxxxxxx100 Symbol
>    xxxxxxxxxxxxxxxxxxxxxxxxxxxxx000 Cell

A cell consists of two pointers, so isn't it 64 bits on 32-bit picoLisp2
then?

> For miniPicoLisp it is (the number of 'x's is reduced):
>
>    num   xxxxxx10
>    sym   xxxxx100
>    cell  xxxxx000

Same for miniPicoLisp: isn't a cell 128 bits big when compiled on a
64-bit platform? What does "raw data" mean in doc/structures there? I
am puzzled by "bin". Why is the size of things documented as 8 bits for
miniPicoLisp?

> That is, a number is still indicated by an AND with 2, or an atom with
> 6. Note, however, that you cannot directly check for a symbol here,
> because a number may also have bit three on. To determine if a given
> datum is a symbol, it must first be asserted that it is not a number.

Can't you just say something like:

   num  if (X & 3) == 2
   sym  if (X & 7) == 4
   cell if (X & 7) == 0

> This design gives an additional bit for the number's value, at the
> expense of a possibly more time-consuming check.

I don't really follow the reason; the check can be pretty simple.

> Finally, for the encoding of symbol names, rather convoluted structures
> are used which are too involved to describe in this mail. Perhaps an
> intensive study of "doc/structures" and the sources could give some
> insight. In picoLisp2, symbol names are simply combined big and short
> numbers.

Yes, symbol names and strings are a big mystery to me so far:-)

> Now, for picoLisp3, which is guaranteed to have a word size of 64 bits:
>
>    cnt   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxS010
>    big   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxS100
>    sym   xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx1000
>    cell  xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0000
>
> We have an additional tag bit, and use it to differentiate between short
> numbers and bignums.
> The 'S' bit in each type is the sign bit. So a
> number can be identified by ANDing with 6, a symbol with 8, and so on.

No, you can't determine a symbol by ANDing with 8, because that would
clash with the sign bit of cnt and big? I am really puzzled now;-) Is it
some trick with "symbol names are simply combined big and short numbers"?

I would say:

   cnt  if X & 2
   big  if X & 4
   sym  if (X & 0xf) == 8
   cell if (X & 0xf) == 0

> So, to bring it to the point: The above implementation was chosen simply
> to save space. To my experience, optimizing space consumptions is far
> more important than short sighted code optimizations or structural
> design decisions (e.g. for a compiler). If the code is slow, you might
> have a system that is half as fast as the other. So what? But if you run
> out of space because you'd need twice as much, performance will go down
> dramatically because of cache misses, swapping and trashing.

Yes.

> Even if you decide to live with (62 bit) short numbers only, still the
> limitation of current C compilers does not allow to directly implement
> the mul/div operation with an intermediate double-word result (as used
> in '*/'). So you'd have to resort to half-word twiddling or assembly
> language here too.

This should not be a problem.

> The implementation of 6-and-a-half bits for ASCII characters in
> miniPicoLisp does not allow for UTF-8 support or external symbol
> encodings.

This is serious. However, couldn't miniPicoLisp be more clever and use
as many bits as possible (like picoLisp2 and 3 do) instead of being
limited to 8 bits?

Other people on the mailing list discussed other languages like Python
etc. in the thread about asynchronous I/O. I think that the frameworks
they used do that quite easily because the underlying interpreter does
not use the C stack and so is built to support first-class continuations
like Scheme does. Have you thought about using the picoLisp heap instead
of the traditional C stack?

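To make the mask-and-compare checks proposed earlier in this mail concrete, here is a minimal C sketch. It assumes only the tag layouts quoted from doc/structures (miniPicoLisp: num ..10, sym .100, cell .000; picoLisp3: cnt S010, big S100, sym 1000, cell 0000). The function names are made up for illustration and are not taken from the picoLisp sources, and the real implementations may well test the tags differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; only the bit layouts come from the quoted text. */

/* miniPicoLisp: num xxxxxx10, sym xxxxx100, cell xxxxx000 */
static inline int mini_is_num(uintptr_t x)  { return (x & 3) == 2; }
static inline int mini_is_sym(uintptr_t x)  { return (x & 7) == 4; }
static inline int mini_is_cell(uintptr_t x) { return (x & 7) == 0; }

/* picoLisp3: cnt ...S010, big ...S100, sym ...1000, cell ...0000.
 * Assumes bits 1 and 2 are never both set, as in the quoted layout. */
static inline int p3_is_cnt(uint64_t x)  { return (x & 2) != 0; }
static inline int p3_is_big(uint64_t x)  { return (x & 4) != 0; }
static inline int p3_is_sym(uint64_t x)  { return (x & 0xf) == 8; }
static inline int p3_is_cell(uint64_t x) { return (x & 0xf) == 0; }

/* Small self-check on hand-built tag patterns. */
static inline void tag_selftest(void) {
    /* miniPicoLisp number with "bit three on" (...110) is still not a symbol */
    assert(mini_is_num(0x06) && !mini_is_sym(0x06) && !mini_is_cell(0x06));
    assert(mini_is_sym(0x0c) && !mini_is_num(0x0c));
    assert(mini_is_cell(0x08));

    /* picoLisp3 cnt and big with the sign bit set (S = 1) */
    assert(p3_is_cnt(0x1a) && !p3_is_big(0x1a) && !p3_is_sym(0x1a));
    assert(p3_is_big(0x1c) && !p3_is_cnt(0x1c) && !p3_is_sym(0x1c));
    assert(p3_is_sym(0x18) && !p3_is_cnt(0x18) && !p3_is_big(0x18));
    assert(p3_is_cell(0x10));
}
```

Each predicate is one AND plus one compare, so under these assumptions a symbol can be told apart from a negative cnt or big without first ruling out the number case.
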
Thanks for explanations,

Tomas
