Hi Randall, hi all,

The system we are currently developing at SmApper Technologies runs
distributed over a number of blades, holding picoLisp databases with a
total of (currently) up to 800 million objects. The picoLisp processes
communicate via TCP, propagating data during import, and it became
necessary for those databases to mutually query their contents.

For that, picoLisp now offers three simple extensions. As Randall keeps
urging me to post an explanation and examples, I decided to write this
mail.


The extensions are a Pilog rule for remote queries, and a global
variable and a new function in the interpreter core to transparently
handle foreign database objects.

The 'remote' rule (in "lib/pilog.l") was developed first. It takes an
arbitrary Pilog expression, arranges everything so that this expression
is evaluated in parallel on remote machines, and collects the results
locally.

Typically, the results of queries contain not only primitives like
numbers or strings, but most notably references to other external
symbols on remote machines. To make it possible to handle these symbols
locally (access their attributes, display in the GUI etc.), a dedicated
mechanism was built into the core.

This mechanism must avoid conflicts with symbols in the local database,
as well as with symbols residing on other remote machines. It is
implemented by maintaining a database file offset for each consulted
database. The "normal" picoLisp database resides in a well-defined
number of files (determined by the list in the second argument to
'pool'). For each external symbol (e.g. {3-7}) the interpreter knows how
to access its contents (here the 7th object in the 3rd database file).
An error occurs if the file number is larger than the number of files in
the database.

A new global variable '*Ext' can be used now to define additional number
spaces for dynamic database extensions. It should hold a list of cons
pairs, where the CAR of each pair defines an offset, and the CDR a
function to produce the contents of a symbol having that offset. The
individual offsets should be chosen with care, so that they don't
overlap.
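
To make "don't overlap" a bit more concrete, here is a minimal sketch of
a check. The function 'offsetsOk' is a hypothetical helper, not part of
picoLisp. It assumes that a database with N files at offset O occupies
the file numbers O+1 through O+N, and that the local database counts as
offset 0.

```picolisp
# Hypothetical helper, not part of picoLisp.
# 'Lst' is a list of (offset . files) pairs, sorted by offset.
# Returns T if no database reaches into the number space of the next one.
(de offsetsOk (Lst)
   (let Ok T
      (for (L Lst (cdr L) (cdr L))  # Walk all adjacent pairs
         (when (> (+ (caar L) (cdar L)) (caadr L))
            (off Ok) ) )
      Ok ) )

# Local DB with 4 files, two remote DBs with 3 files each
(println (offsetsOk '((0 . 4) (20 . 3) (40 . 3))))
# A local DB with 25 files would collide with offset 20
(println (offsetsOk '((0 . 25) (20 . 3))))
```

The point is only that each offset must be larger than the highest file
number used by the database before it.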

An auxiliary function 'ext' is provided, that takes such an offset and
causes certain I/O functions to add that offset to the file numbers of
all external symbols during input, and to subtract it during output.

With that mechanism, both local and external databases see only symbols
they can handle in their own space.



Let's look at a simple but complete example. We will use the family demo
as described in the "Database Programming" chapter of "doc/tut.html".


### Set up a DB server ###

Please edit "doc/family.l", and extend the 'go' function so that it
looks like

   (de go ()
      (rollback)
      (task (setq P (port 4000))  # Set up the object server in the background
         (when (setq Sock (accept @))
            (unless (fork)  # Child process
               (task P)
               (close P)
               (in Sock
                  (while (rd)
                     (out Sock (pr (eval @))) ) )
               (bye) )
            (close Sock) ) )
      (server 8080 "@person") )

The background task will listen on port 4000, and handle any requests.
For this simple demo, we avoid issues like access permissions and
authentication.

Then start it on a separate console as

   $ ./p dbg.l doc/family.l -main -go

It could be used now with a browser just like before. In addition, it
listens on port 4000 for remote queries.
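
To check that the background task answers, something like the following
could be typed into any other picoLisp process on the same machine. It
is only a sketch of the protocol used by the 'go' function above: an
expression is sent with 'pr', the server evaluates it, and the result is
read back with 'rd'. The expression '(+ 1 2)' is an arbitrary choice for
illustration.

```picolisp
# Assumes the family demo from above is running on port 4000
(when (setq Sock (connect "localhost" 4000))
   (out Sock (pr '(+ 1 2)))   # Send an expression in binary format
   (println (in Sock (rd)))   # Print whatever the server evaluated
   (close Sock) )
```

Because the server evaluates whatever it receives, this also shows why
a real setup would need access permissions and authentication.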


### Test client ###

Start a picoLisp process on another console. It could be another
application, or just a plain picoLisp process without a local database:

   $ ./p dbg.l

Then set '*Ext':

   : (setq *Ext  # Define extension functions
      (mapcar
         '((V)
            (let (@Ext (car V)  @Host (cddr V)  @Port (cadr V) Sock)
               (cons @Ext
                  (curry (@Ext @Host @Port Sock) (Obj)
                     (when (or Sock (setq Sock (connect @Host @Port)))
                        (out Sock (ext @Ext (pr (cons 'qsym Obj))))
                        (prog1 (in Sock (ext @Ext (rd)))
                           (unless @
                              (close Sock)
                              (off Sock) ) ) ) ) ) ) )
         (quote
            (20 4000 . "localhost") ) ) )

Normally, the 'quote' expression in the end would specify several
offsets, ports and hosts. For our purpose, a single process on
"localhost", listening on port 4000 (i.e. our family demo), will
suffice. If we look at the first (and only) entry

   : (pretty (car *Ext))
   (20
      (Obj)
      (job '((Sock))
         (when (or Sock (setq Sock (connect "localhost" 4000)))
            (out Sock (ext 20 (pr (cons 'qsym Obj))))
            (prog1 (in Sock (ext 20 (rd)))
               (unless @ (close Sock) (off Sock)) ) ) ) )

we see an offset of 20 (this would leave plenty of space for a local
database), and a function taking a single 'Obj' argument. This function
will be called internally by the picoLisp interpreter whenever an
external symbol is accessed that doesn't belong to the local database.
It connects to the given host and port (or re-uses an already open
connection in the local variable 'Sock'), uses the 'ext' function to
translate output (with 'pr') and input (with 'rd'), and closes the
connection if end of file is encountered.
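
The request that is sent has the form '(qsym . Obj)'. As far as I can
tell from its use here, 'qsym' returns the value of a symbol together
with its property list, which is what the caller needs to fill in the
symbol locally. A small sketch with an ordinary internal symbol (the
symbol 'A' and the property 'x' are made up for illustration):

```picolisp
(setq A 7)             # Value of the symbol
(put 'A 'x 3)          # A single property
(println (qsym . A))   # Value in the CAR, properties in the CDR
```

On the server, the same call is evaluated for the external symbol whose
file number has already been translated by 'ext'.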


Finally, just for convenience, define a list of database resources,
matching the offset(s) in '*Ext':

   : (setq *Rsrc
      (quote
         (20  connect "localhost" 4000) ) )

The CARs again are the offsets, and the CDRs are executable expressions
so that the 'remote' Pilog rule knows where and how to connect to remote
servers.
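
For a setup with several servers, both lists simply get more entries. A
sketch, where the host names "blade1" to "blade3" and the offsets 40 and
60 are assumptions chosen for illustration:

```picolisp
# Hypothetical host names, replace with real ones
(setq *Rsrc
   (quote
      (20  connect "blade1" 4000)
      (40  connect "blade2" 4000)
      (60  connect "blade3" 4000) ) )

# The matching data list for the 'mapcar' that builds '*Ext'
(quote
   (20 4000 . "blade1")
   (40 4000 . "blade2")
   (60 4000 . "blade3") )
```

The offsets in the two lists must agree, so that a symbol returned by
'remote' is later fetched from the same server.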


Now we are ready to start remote queries and symbol accesses in the test
client. The simplest example along the lines of "doc/tut.html" would
be a query to get all "Edward"s

   : (? (db nm +Person "Edward" @P))
   -> NIL

Naturally, it won't work, because the test client has no local database!

However, if we use 'remote', it works:

   : (? @Rsrc *Rsrc (remote (@P . @Rsrc) (db nm +Person "Edward" @P)))
    @Rsrc=((20 connect "localhost" 4000)) @P={K-;}
    @Rsrc=((20 connect "localhost" 4000)) @P={K-1B}
    @Rsrc=((20 connect "localhost" 4000)) @P={K-R}
    @Rsrc=((20 connect "localhost" 4000)) @P={K-1K}
    @Rsrc=((20 connect "localhost" 4000)) @P={K-a}.

The original query, executed natively on the server, would return the
symbols {2-;}, {2-1B}, {2-R} and so on (as opposed to {K-;}, {K-1B},
...). We can see that the offset of 20 transformed the database file
number '2' to 'K' (2 + 20 = 22, which is written as 'K').

Another query:

   : (? @Rsrc *Rsrc
      (remote (@P (20 connect "localhost" 4000))
         (db nm +Person "Edward" @P)
         (val "Queen" @P mate job) ) )
    @Rsrc=((20 connect "localhost" 4000)) @P={K-1B}
   -> NIL

We see that any query that would be legal on a local database can be
evaluated remotely, simply by extending the query with

   (remote (@Var . @Resources) ..)

'@Var' is the variable which 'remote' is to return from the remote
machine.


The returned external symbols can be used locally, just like normal
external symbols:

   : (show '{K-1B})
      {K-1B} (+Man)
      kids ({K-1C} {K-1D} {K-1E} {K-1F} {K-1G} {K-1H} {K-1I} {K-g} {K-a})
      nm "Albert Edward"
      job "Prince"
      mate {K-f}
      fin 680370
      dat 664554
   -> {K-1B}

   : (show '{K-1B} 'kids 1 'nm)
   "Beatrice Mary Victoria" "Beatrice Mary Victoria"
   -> "Beatrice Mary Victoria"

   : (show '{K-1B} 'mate 'nm)
   "Victoria I" "Victoria I"
   -> "Victoria I"

   : (show '{K-1B} 'mate 'kids)
   -> ({K-a} {K-g} {K-1I} {K-1H} {K-1G} {K-1F} {K-1E} {K-1D} {K-1C})


A '+QueryChart' in a local GUI can have 'remote' applied to any Pilog
expression (typically with 'select'), and will work just like with a
local database.


Is this explanation somehow understandable?

I'm looking forward to comments and questions :-)

Cheers,
- Alex