mm-ADT breakthrough regarding writes and partial data loading

Marko Rodriguez Wed, 29 May 2019 16:00:16 -0700

Hi,

*** This email is primarily for Kuppitz (and Stephen might appreciate the 
general idea — especially the last part). ***


We have 3 types of containers in mm-ADT.

        1. map        -> [:]
        2. list       -> [ ]
        3. ?sequence? -> .*

This #3 thing doesn’t have a name, but it’s what is iterated by a reference: 
it’s “the referents.” However, we have been using this “thing” extensively, as it 
is how we model the following database structures:

Graph: [db][values,V]      => &vertex*
RDBMS: [db][values,people] => &person*
RDF:   [db][triples]       => &statement*

I’ve been banging my head against the wall all day trying to figure out how to 
write to these #3 things! That is, how will we do:

Graph: add/update/delete vertex
RDBMS: add/update/delete row
RDF:   add/delete statement
...

While I was gardening this afternoon, it struck me! 

        &person* is exactly what it says it is — a reference to zero or more 
person objects.

This isn’t the table! No, it’s a cursor to the head of a 
sequence/stream/iterator. It’s not a “container”; it’s transient! A result set.
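
To make the cursor-versus-container distinction concrete, here is a rough 
Python sketch (not mm-ADT; people_table and people_cursor are hypothetical 
names, with a plain list standing in for the stored table). The cursor is 
drained by one pass, while the table can be read again:

```python
# Hypothetical stand-in for the stored people-table (a real container).
people_table = [{"name": "marko", "age": 29}, {"name": "josh", "age": 32}]

def people_cursor(table):
    """Yield rows one at a time, like a result set over the table."""
    for row in table:
        yield row

cursor = people_cursor(people_table)
first_pass = [row["name"] for row in cursor]   # drains the cursor
second_pass = [row["name"] for row in cursor]  # nothing left to read

# The table itself can still be read again; the cursor cannot.
table_names = [row["name"] for row in people_table]
```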

So then what is the people-table? It’s a list! 

Here is the full bytecode to create an mm-ADT RDBMS.

[define,row,map,[(@string:@object)*]]
[define,table,list,[@row*]]
[define,db,map,[(@string:@table)*]]

1. a row extends map with the constraint that all keys must be strings.
2. a table extends list with the constraint that all elements must be rows.
3. a db extends map with the constraint that all keys are strings (table name) 
with values that are tables.
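
As a rough illustration only, the three constraints can be read as nested 
type checks. The sketch below is Python, not mm-ADT, and assumes map is 
approximated by dict and list by list; is_row, is_table and is_db are 
hypothetical helper names.

```python
def is_row(obj):
    """A row is a map whose keys are all strings."""
    return isinstance(obj, dict) and all(isinstance(k, str) for k in obj)

def is_table(obj):
    """A table is a list whose elements are all rows."""
    return isinstance(obj, list) and all(is_row(r) for r in obj)

def is_db(obj):
    """A db is a map from string table names to tables."""
    return isinstance(obj, dict) and all(
        isinstance(name, str) and is_table(table) for name, table in obj.items()
    )

example_db = {"people": [{"name": "marko", "age": 29}]}
```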

So that is the meta-model. What about a particular instance of an RDBMS? In 
bytecode, here is how you define the people-table and the person-object. 

[define,person,row,[name:@string,age:@int]]
[define,people,table,[@person*]]

Thus, people-table is a list of zero or more person-rows. Now let’s CREATE TABLE 
and put me in it: 

[db][insert,[create,people,[[[create,person,[name:marko,age:29]]]]]]
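
For readers who think better in a general-purpose language, here is a loose 
Python analogue of the person/people definitions and the insert above. It 
assumes a person has exactly the keys name and age (the email does not say 
whether extra keys are allowed), and is_person, is_people and create_people 
are hypothetical names.

```python
def is_person(obj):
    """A person is a row with a string name and an int age (assumed exact keys)."""
    return (
        isinstance(obj, dict)
        and set(obj) == {"name", "age"}
        and isinstance(obj["name"], str)
        and isinstance(obj["age"], int)
    )

def is_people(obj):
    """The people-table is a list of zero or more person-rows."""
    return isinstance(obj, list) and all(is_person(p) for p in obj)

def create_people(db, rows):
    """Rough analogue of CREATE TABLE people plus the initial rows."""
    if not is_people(rows):
        raise TypeError("every element of people must be a person-row")
    db["people"] = list(rows)
    return db

db = create_people({}, [{"name": "marko", "age": 29}])
```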

##############################################

Okay. So here is where the fun begins… What is the output signature of the 
following bytecode?

[db][value,people]

It’s:

@people

"Whoa! Chill out there cowboy?! So are you saying that when you access the 
people-table, you get back the entire list representation of the table? That 
could be an insane amount of data!?"

To that I say: "Yes, you are right, the above bytecode will do that. It’s a 
very dangerous piece of bytecode." However, the bytecode below does something 
different…and this is what is going to open a vast new territory for the mm-ADT 
spec:

[db][&value,people] => &people

[&value] is a “get by reference” whereas [value] is a “get by value”. Why is this 
cool? Well, according to the mm-ADT specification, operations on a reference 
MUST be semantically equivalent to operations on the object itself. Thus, we 
can now read/write/delete the table like any old list!

[db][&value,people][insert,[create,person,[name:josh,age:32]]]

Tada! Appending to the reference pushes the append to the database. The only 
data transferred is the &people reference fetched (cheap) and the 
person-object pushed (basically INSERT INTO people VALUES ('josh', 32)).
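
A minimal Python sketch of the by-value versus by-reference idea, under 
stated assumptions: Storage is a hypothetical in-memory stand-in for the 
database that counts rows shipped to the client, TableRef plays the role of 
&people, and value/ref_value loosely mirror [value] and [&value]. None of 
these names come from the mm-ADT spec.

```python
class Storage:
    """Hypothetical storage system that counts rows shipped to the client."""
    def __init__(self):
        self.tables = {"people": [{"name": "marko", "age": 29}]}
        self.rows_sent = 0

    def fetch(self, table):
        rows = [dict(r) for r in self.tables[table]]
        self.rows_sent += len(rows)
        return rows

    def append(self, table, row):
        self.tables[table].append(dict(row))

class TableRef:
    """Stands in for &people: holds a handle to the table, not its rows."""
    def __init__(self, storage, table):
        self.storage = storage
        self.table = table

    def insert(self, row):
        # The write is pushed to storage; no existing rows are fetched.
        self.storage.append(self.table, row)
        return self

def value(storage, table):
    """Get by value: ships the whole table to the client."""
    return storage.fetch(table)

def ref_value(storage, table):
    """Get by reference: ships only a handle."""
    return TableRef(storage, table)

storage = Storage()
ref_value(storage, "people").insert({"name": "josh", "age": 32})
rows_sent_after_ref_insert = storage.rows_sent  # still 0
people = value(storage, "people")               # now both rows are shipped
```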

#### @stephen: this is where you will get excited. ####

[db][&value,people][has,name,eq,marko]
        => person[name:marko,age:29]

As expected. I got the person-row with name:marko. All this ‘by value’.

Now, let me formally define this new [&value] instruction:

opcode   : &value
arguments: @object,@pattern? 

The optional pattern says: “what aspects of the referents do you want by 
value?” In other words, what do you want to know for certain in the referent 
pattern (it all connects!).

[db][&value,people,@person[age:@int]][has,name,eq,marko]
        => person[name:&string,age:29]

And there you have it. A half-populated object. With the @person[age:@int] 
pattern, we are saying: I only want the age-property of the person object by 
value. Everything else, by reference. Thus, if we actually want the name, well, 
we have the reference to go and get it (which is a database call), but if our 
bytecode optimizer is good and we never use the name, well, we only grab the 
data we need! 
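
The half-populated object can be sketched in Python roughly as follows. This 
is only an illustration: Storage, FieldRef and load are hypothetical names, 
and the by_value set plays the role of the @person[age:@int] pattern. Fields 
named in by_value are copied locally; every other field stays a reference 
that costs one storage call to dereference.

```python
class Storage:
    """Hypothetical storage system that counts dereference calls."""
    def __init__(self):
        self.rows = {0: {"name": "marko", "age": 29}}
        self.calls = 0

class FieldRef:
    """Reference to a single field; dereferencing costs one storage call."""
    def __init__(self, storage, row_id, field):
        self.storage = storage
        self.row_id = row_id
        self.field = field

    def get(self):
        self.storage.calls += 1
        return self.storage.rows[self.row_id][self.field]

def load(storage, row_id, by_value):
    """Populate the fields in by_value; leave the rest as references."""
    row = storage.rows[row_id]
    return {
        field: (row[field] if field in by_value
                else FieldRef(storage, row_id, field))
        for field in row
    }

storage = Storage()
person = load(storage, 0, by_value={"age"})
age = person["age"]            # already local, no storage call
calls_before = storage.calls   # 0
name = person["name"].get()    # one storage call, only if we need the name
```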

You may be thinking, but don’t we need the name for the next instruction: 
[has,NAME,eq,marko]?!  To that I say: “You poor fool. Realize that we have a 
&people and if there is an index on the people table, well, 
[has,name,@cop,@string] is an instruction pattern and thus, we are in the 
reference graph! And if we don’t have an index on the people-table, then YES, 
we will need to pull the name by value because we will be filtering in the 
processor, not the storage system.” It’s all so self-consistent I can barely 
contain my bowels.
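
The index-versus-no-index argument can be sketched in Python like this. It 
is an assumption-laden toy, not how any mm-ADT provider works: PeopleStore 
is a hypothetical store, and names_pulled counts how many name values have 
to cross into the processor for filtering.

```python
class PeopleStore:
    """Hypothetical store; names_pulled counts names sent to the processor."""
    def __init__(self, rows, indexed):
        self.rows = rows
        self.names_pulled = 0
        self.index = None
        if indexed:
            self.index = {}
            for row in rows:
                self.index.setdefault(row["name"], []).append(row)

    def has_name(self, name):
        if self.index is not None:
            # Index present: the filter is answered inside the storage system.
            return list(self.index.get(name, []))
        # No index: every name is pulled by value and filtered in the processor.
        self.names_pulled += len(self.rows)
        return [row for row in self.rows if row["name"] == name]

rows = [{"name": "marko", "age": 29}, {"name": "josh", "age": 32}]
with_index = PeopleStore(rows, indexed=True)
without_index = PeopleStore(rows, indexed=False)
indexed_hits = with_index.has_name("marko")
scanned_hits = without_index.has_name("marko")
```

Both paths return the same rows; only the amount of data pulled by value 
differs.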

Three problems solved with [&value]

        1. ?sequences? are no longer these weird anonymous data structures. 
They are cursors in the classic database sense.
        2. The same instructions we use for manipulating containers can 
manipulate ‘remote’ containers. (referenced containers)
        3. We can selectively populate only portions of an object with all the 
remaining portions maintaining references.

It’s almost too perfect.

Take care,
Marko.

http://rredux.com
