Re: (symbols) inside a function

Alexander Burger Wed, 01 Feb 2012 12:04:58 -0800

Hi Imran,

> As you know, I'm just trying out picolisp in my spare time (hence my slow
> responses on this list, sorry about that). To put some context to these


That's perfect. No hurry :)

I also want to spend more time concentrating on project work.


> questions, I thought that using separate namespaces to simulate a key/value
> dictionary/hash-table would be a nice idea to play with, to understand
> picolisp better.

While this is theoretically possible, I would strongly advise against it.

Namespaces are a huge overhead, if you just want to create some data
lookup structure. Calling 'symbols' creates a copy of the current
namespace, so this means that for a (relatively small) namespace of 1000
symbols, a tree of 1000 nodes (or roughly 1500 cells) is copied. At 16
bytes per cell on the 64-bit version, that's 24000 bytes of memory.
Searching for a few symbols in that big tree is also unnecessarily slow.


> (alists are nice and easy, but if you update it frequently cons'ing changes to
> the beginning, it gets slower & slower to work with the less frequently
> changing

A simple linear list search is for up to 100 or so items probably faster
than a lookup in the above large tree.

> key/values at the end of the alist)

You might consider using symbol properties then. Properties are
implemented using a least-recently-used scheme. When a property is found
with 'get' or related functions, it is moved to the front of the list.
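
As a minimal sketch of that idea (the symbol name 'Dict' and the keys
are made up for illustration), the property list of a single symbol can
serve as a small dictionary:

   (put 'Dict 'k1 123)     # store 123 under the key 'k1', -> 123
   (put 'Dict 'k2 "abc")   # -> "abc"
   (get 'Dict 'k1)         # -> 123
   (get 'Dict 'k3)         # -> NIL, no such property

Note that a stored value of NIL cannot be told apart from a missing key
this way.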

But for large data sets, 'idx' (as José Romero pointed out) is probably
better. In fact, namespaces internally have the same structure as 'idx'
trees.
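
A rough sketch of such a key/value lookup, assuming the entries are
stored as (key . value) pairs in a variable which I call 'Tree' here
(the name and the data are only for illustration):

   (off Tree)                     # start with an empty tree
   (idx 'Tree '(banana . 5) T)    # insert three pairs
   (idx 'Tree '(apple . 3) T)
   (idx 'Tree '(cherry . 7) T)
   (idx 'Tree)                    # -> ((apple . 3) (banana . 5) (cherry . 7))
   (lup Tree 'banana)             # -> (banana . 5)
   (cdr (lup Tree 'banana))       # -> 5

'lup' searches by the CAR of the stored pairs, so the value does not
need to be known for the lookup.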


> Anyway, as the symbol-table/namespace itself is just a big hash-table, then
> all we would need are:
> + a way to create a new namespace
> + check if a variable exists in that namespace
> + get or set a variable in that namespace

Yes, but with a really big overhead (see above).


> (BTW, I hope this doesn't come across as "dogshedding" - or "arguing for
> argument's sake". I'm genuinely interested to see where I can use picolisp in my
> current stack - it's 1/10th the memory footprint of racket!)

OK.


> >> (mk Dc)
> >
> > creates a new namespace 'Dc', and immediately sets the current namespace
> > back to 'pico'. That's all right.
> 
> It works, but it doesn't feel right. There's additional, unnecessary work being
> done here (switching the *current* namespace to the new one, then switching it
> back to the old namespace).

Compared to the amount of work of creating the namespace (copying
thousands of tree cells), the setting of the current namespace is
ridiculously small. It involves not more than switching a pointer.

Creating the namespace and immediately setting it as the current one is
just convenient. The intention of namespaces is to build independent
modules, and in such cases you will want to do

   (symbols 'newspace 'pico)
   (de a () ...)
   (de b () ...)

If you put such a code sequence into a single file, you have an isolated
module, because 'load' restores the current namespace when it is done,
effectively making 'newspace' available in 'pico'.


> I think that the lisp forms provided to cover the namespace functionality you
> added to the vm ('symbols', and the ~ read macro) don't go far enough. They
> cover one use case for namespaces, but not all potential uses.

To my feeling, they fit together perfectly. At least for the intended
purpose.


> Is the symbol 'pico itself just an idx tree, or some derivative of that? I

Yes. To be precise: It is a cons-pair of two idx trees, one for short
names (7 or less characters), and one for long names.

But: The internal matching algorithm is not compatible with 'idx'. It
doesn't use comparison functions (=, < etc.) on the symbol names
(strings) as 'idx' does, but compares complete chunks of 64-bit words.
This is the reason for the distinction between short and long names.
Therefore, 'idx' can't really be used with namespaces.


> didn't fully grok the explanation of idx trees from the ref - is there a
> Rosetta example (or other snippet) which you can point me to to explain them?

Yes, for example

   http://rosettacode.org/wiki/Associative_arrays/Iteration
   http://rosettacode.org/wiki/Anagrams
   http://rosettacode.org/wiki/Anagrams/Deranged_anagrams
   http://rosettacode.org/wiki/Hamming_numbers
   http://rosettacode.org/wiki/Huffman_coding
   http://rosettacode.org/wiki/Inverted_index
   http://rosettacode.org/wiki/LZW_compression
   http://rosettacode.org/wiki/Priority_queue
   http://rosettacode.org/wiki/Set
   http://rosettacode.org/wiki/Sokoban


> Is there some other way of accessing the contents of other namespaces, other
> than using the ~ read macro?

Not really, AFAIK at the moment.


> > is _read_ while 'pico' is the active namespace. A symbol 'k1' didn't
> > exist at the moment the above 'mk' was called, so an entry for that name
> > will be created in 'pico',
> 
> Interesting. So the reader automatically interns any "free symbol" (for want
> of a better term) that it finds, before evaluating the expr?

Yep. This is a very central principle of Lisp. Expressions are always
read completely before being evaluated. If the reader comes upon an atom
which is not a number, it returns it as a symbol. Along that way, it
either finds an existing one, or it creates it automatically.

Namespaces are just a simple extension of that principle.

The reader creates everything necessary to build the s-expression, be it
internal, external, transient or anonymous symbols, small or big
numbers, and cell structures.


> Is this just a way of avoiding "variable doesn't exist" exceptions in any
> situation?

I would not put it this way.

Variables are very different from symbols. A symbol is a structured
piece of data in memory, which may happen to be used as a variable in a
given context. But variables may also be list cells, e.g.: (set (cddr
Lst) 7), and symbols may be used for other purposes than variables.
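
To spell that example out (the list contents are arbitrary, chosen only
for illustration):

   (setq Lst (list 1 2 3 4))
   (set (cddr Lst) 7)    # the third cell of the list is the "variable"
   Lst                   # -> (1 2 7 4)

Here 'set' stores into the CAR of the cell returned by 'cddr', no
symbol is involved in the assignment itself.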

So "doesn't exist" won't happen for symbols (as it won't happen for
numbers or list cells).

A variable will always exist, but it may happen to be unbound. In that
case some Lisps issue a "variable not bound" error when 'eval' tries to
access the value (though the symbol _does_ exist, it just doesn't have a
value).

In PicoLisp, there are no "unbound" symbols. Symbols are always bound to
an initial value upon creation: For internal symbols this is NIL, and
for transient symbols it is that symbol itself. This is just for
convenience.
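
A small sketch of these initial values, assuming neither name has been
used before in the session (both names are made up for illustration):

   neverUsedBefore        # fresh internal symbol, -> NIL
   (== '"xyz" "xyz")      # a fresh transient symbol is its own value, -> T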


> But ...
> 
> If a function doesn't evaluate the argument, any non-existent variable names
> mentioned in the call to that function are still interned, before the function
> gets its (unevaluated) arg list.
> 
> Doesn't that seem, well, wrong?

I think this is very clean and consistent. The reader doesn't (and
shouldn't) know what the expression it is building _means_. It just
returns an s-expression. Whether the symbols in that expression are
functions, variables, property keys, or whatever, depends on the
context(s) in which the s-expression is used later.

So a symbol _always_ exists if it is seen by the reader. When such a
symbol happens to be interpreted as a variable later, it depends on the
value it is bound to at that moment. Not more and not less.


> (de dummy X (println "doing nothing!"))
> 
> (de k-vars ()
>   (filter '((Sym)
>             (= "k" (car (chop Sym))) )
>     (all)))
> 
> pico λ: (k-vars)
> -> (key kill keep keep> keep!> keep1> keep?> k-vars)
> pico λ: (dummy k2)
> "doing nothing!"
> -> "doing nothing!"
> pico λ: (k-vars )
> -> (k2 key kill keep keep> keep!> keep1> keep?> k-vars)
> ####
> 
> Honestly, seems strange to me.

Hmm, to me not at all ;-)

An internal symbol with the name "k2" didn't exist (in the current
namespace). When the reader reads the expression (dummy k2), it creates
such a symbol. It doesn't know about the meaning of 'dummy', it might
well be that 'dummy' is a function which assigns a value to the symbol,
isn't it?

Consider (de k2 ...) or (setq k2 ...), then you _do_ want 'k2' to be
created, and then initialized, right?

So the important point is perhaps the clear separation between 'read'
and 'eval'.


> > and the expression '(setq Dc ...)' is
> > returned, and then evaluated. It is irrelevant what 'dset' does with its
> > arguments during that evaluation, the symbol 'k1' exists in 'pico'.
> 
> I still don't understand why the original example was failing though. I
> understand that pico~k1 is interned (as NIL) before (dset) runs. But, once

I think we should not talk about 'k1' being "interned as NIL". Interning
a symbol, and talking about its value, are completely different things.

As I said, it is just a convenience that new internal symbols are
initially bound to NIL. You can easily create, say, a transient symbol,
assign it a value, and then intern it:

   : (setq "abc" 123)
   -> 123

   : "abc"           
   -> 123

   : (intern '"abc") 
   -> abc

   : abc            
   -> 123

So the above 'pico~k1' _returns_ a symbol with the name "k1" in the
namespace stored in the value of the symbol 'pico', regardless of the
value 'k1' has at that moment.

The function 'symbols' is concerned only with namespaces, i.e. the
associations of symbols and their names. It has nothing to do with the
values or properties of those symbols.


> inside dset, we've switched namespaces, and then:
> 
>       (eval (list 'setq Key Value))
> 
> where Key = k1, and Value = <whatever I passed in>. So why isn't
> New-namespace~k1 now created?

Because the reader created the symbol.


> > So (pack "foo" "bar") and (pack 'foo 'bar) both return "foobar", and
> > (intern (pack "foo" "bar")) is just fine.
> 
> Ok. But there doesn't seem to be a way (that I can find) to dynamically
> combine a namespace name, a '~', and a variable name

Right. '~' is a read macro, it has this meaning only while reading an
expression.

Cheers,
- Alex