I wound up stumbling on an interesting little problem that I thought I would put out in front of the group.

In the discussion of Scheme interpreters, one thing that normally gets omitted from the discussion is parsing and lexing the actual Scheme input.

"But that's easy!" I hear you cry. "It's just a tree!"--"It's just recursive!" I hear you bleat. Maybe I'm just stupid, but I haven't found it that easy.

Let's start from a simple problem statement:

"Given a stream of ASCII characters, produce the pairs or atoms that stream of characters represents in Scheme."

We don't even really have to make a distinction between numbers, strings, etc. yet. Any delimited string of characters is an atom, delimiters are "(", ")", whitespace ...

and DOT "." .

Huh?  What's *that*?

That, my friends, is called the fly in the ointment.

So, how does this work?  Some examples:

1 becomes 1 -- easy
() becomes "nil" -- make it whatever you want

(1) becomes [1, nil] -- proper lists terminate with nil

(1 2) becomes [1, [2, nil]] -- okay

(1 2 3) becomes [1, [2, [3, nil]]]

(1 (2)) becomes [1, [ [2, nil], nil]] -- getting tricky

(1 (2) (3)) becomes [1, [ [2, nil], [ [3, nil], nil]]] -- tougher

(1 ()) becomes [1, [nil, nil]]

(() 1) becomes [nil, [1, nil]]

Getting tougher but not too bad ... now lets add DOT.

(1 . 2) becomes [1, 2] -- not a proper list but okay

((1) . 2) becomes [ [1, nil], 2]

(1 . (2)) becomes [1, [2, nil]]

(1 . (2 . (3 . ()))) becomes [1, [2, [3, nil]]] -- look familiar?

((1 . 2) . (3 . 4)) becomes [[1, 2], [3, 4]]

And, finally, some mixtures:

(1 2 . 3) becomes [1, [2, 3]]

(1 2 . ()) becomes [1, [2, nil]]

(() 1 2 . 3) becomes [nil, [1, [2, 3]]]


You may come upon a recursive solution. This has a couple of subtleties. A naive recursive solution will bomb if you give it too complex an expression.

The second question: can you make it a *tail-recursive* solution that is only limited by memory? Making it properly tail-recursive is one problem. Making your language of choice actually *do* tail recursion is a second problem if you aren't actually implementing this in a tail-recursive language.

Finally, is there an *iterative* solution?

The answer to them all is: yes. However, *finding* said solutions requires some thought.

I originally started this problem in Python. It wasn't as straightforward as I expected. The solutions aren't long, they are just subtle.

I have Python code for the naive recursive solution and am working on code for the iterative. I can point you at code that does the tail-recursive one.

It would be interesting to see how others solved it.

-a

--
KPLUG-LPSG@kernel-panic.org
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg

Reply via email to