Re: picolisp reader dot handling inconsistency

Tomas Hlavaty Sun, 04 Oct 2009 06:35:13 -0700

Hi Alex,

> So for now I would tend to stay with Tomas' proposal, in handling the
> dot as a meta-character only when not part of an atom (i.e. surrounded
> by white space or other meta-characters like '(' and ')'). The
> advantage is that then we can use '.' as part of symbol names, which
> is quite nice sometimes.


I am still trying to understand and come up with the simplest and most
"regular" way to do the dot notation.  I think PicoLisp and also Common
Lisp have some unnecessary restrictions in some pathological cases and
these can be avoided using a simple and regular algorithm.

In PicoLisp, there are some cases that behave quite arbitrarily.

: '(. 1 2)
-> (\. 1 2)

Why is the dot treated as a symbol when at the head of the list?

: '(. . 1)
-> (\. . 1)

: '(. . 1 . 2)
(\. . 1) -- Bad dotted pair
? -> 2
? 

Is the error case "Bad dotted pair" necessary?

: '(. . .)
-> (\. . \.)

Why is the last dot treated as a symbol and not as a circular list?

: '(. . . .)
(\. . \.) -- Bad dotted pair
? Bad input ')' (41)
? : 

What is (should be) this?

!? (.)
 -- Undefined
? 

What is (should be) this?

: (quote (. . . .))
(. . .) -- Bad dotted pair
? Bad input ')' (41)
? Bad input ')' (41)
? : 

What is (should be) this?

And so on...

I think that the dot (standing on its own while reading a list) can
mean much the same as setting the cdr of the last cell while building a
list.  I think we had some discussion a while ago about 'chain' not
being able to set the cdr of the last cell.  I think the reader could
behave pretty much like 'make' in that sense.

I have a minimal prototype reader in Java that currently supports
lists and symbols only (no strings or numbers), but the algorithm for
reading lists (the 'readL' method below) is very regular and requires
only single-character look-ahead:

    Any symbol() {
        Character C = xchar();
        if(charIn(C, "() \t\n")) err(C, "Symbol expected");
        StringBuffer L = new StringBuffer();
        L.append(C);
        while((null != (C = peek())) && !charIn(C, "() \t\n"))
            L.append(xchar());
        String M = L.toString();
        return intern(M);
    }
    Any read1(boolean Top) {
        skip();
        Any Z = null;
        Character X = peek();
        if(null != X) {
            switch(X) {
            case '(': xchar(); Z = readL(); break;
            case ')': xchar(); if(Top) err("Reader overflow"); Z = Rp; break;
            default: Z = symbol();
            }
        }
        return Z;
    }
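    Any readL() {
        Any A = mkCons(NIL, NIL);   // dummy head cell; the result is A.cdr()
        Any Z = A;                  // cell whose cdr is set next
        Any X;
        boolean D = false;          // true when the previous item was a dot
        while(null != (X = read1(false)) && Rp != X) {
            if(Dot != X) {
                // after a dot, X replaces the cdr; otherwise append a new cell
                Z.con(D ? X : mkCons(X, NIL));
                if(Z.cdr().isCons()) Z = Z.cdr();
            }
            D = Dot == X;
        }
        if(D) Z.con(A.cdr());       // trailing dot: point the cdr back at the head
        return A.cdr();
    }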

Rp represents right parenthesis and Dot is an interned (unique) "."
symbol.

The nice thing about 'readL' is that many of the pathological cases
behave "predictably" and there are no arbitrary (error) cases.

: (quote . (abc(def)ghi))
-> (abc (def) ghi)
: (quote . .)
-> (quote .)
: (quote . (quote .))
-> (quote .)
: (quote (. . . .))
-> (NIL)
: (quote . (1 2 3 . 4))
-> (1 2 3 . 4)
: (quote . (1 2 3 . 4 . 5))
-> (1 2 3 . 5)
: (quote . (1 2 3 . 4 . 5 . 6 .))
-> (1 2 3 .)

Does such behaviour make more sense to others too?

Thank you,

Tomas