Re: The Bytecode Pattern-Matching Model

Marko Rodriguez Fri, 17 May 2019 06:58:43 -0700

Hi,

Thanks for your question.


I suppose that a “limit bandwidth”-optimization could be based on the provider 
looking at all the instructions in the submitted bytecode and then using that 
information to constrain which bytecode patterns it exposes. A simple 
ProviderStrategy would be the means of doing that.
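
A rough sketch of that idea, under heavy assumptions: Instr and PatternLimitingStrategy are hypothetical names made up for illustration, not existing TinkerPop classes, and matching is done on the operator name only. The provider knows every pattern it could expose and keeps just those whose operator shows up in the submitted bytecode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for a bytecode instruction or pattern: an operator plus arguments.
final class Instr {
    final String op;
    final List<Object> args;

    private Instr(String op, List<Object> args) {
        this.op = op;
        this.args = args;
    }

    static Instr of(String op, Object... args) {
        return new Instr(op, Arrays.asList(args));
    }
}

// Hypothetical strategy: expose only the patterns that the submitted bytecode could use.
final class PatternLimitingStrategy {
    private final List<Instr> allPatterns; // everything the provider is able to expose

    PatternLimitingStrategy(List<Instr> allPatterns) {
        this.allPatterns = List.copyOf(allPatterns);
    }

    List<Instr> exposedPatterns(List<Instr> submittedBytecode) {
        // Collect the operators that appear anywhere in the submitted bytecode.
        Set<String> ops = submittedBytecode.stream()
                .map(i -> i.op)
                .collect(Collectors.toSet());
        // Keep a pattern only if its operator is used by the query (a coarse filter).
        return allPatterns.stream()
                .filter(p -> ops.contains(p.op))
                .collect(Collectors.toList());
    }
}
```

A real strategy would probably compare arguments too (e.g. the key of a has-pattern), but the shape would be similar.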

Perhaps showing you what I think the Tuple API should look like would help. 
This API would represent the primary way in which the TP VM interacts with the 
structure/provider. Thus, this is for all cookies in the cookie jar!

############################################################

public interface Tuple<A> extends Iterator<Tuple<A>> {

  public boolean hasKey(Object key);
  public boolean hasValue(Object value);
  public <B> Tuple<B> get(Object key);
  public A value();
  public long count();
  public boolean hasNext();
  public Tuple<A> next();

  public boolean match(Instruction instruction);
  public Tuple apply(Instruction instruction);
  
}

############################################################

Structure neo4j = Neo4jStructureFactory.open(config1);
Tuple<Map<String,String>> db = neo4j.root();
  => { type:graph | [V] }#1

//////

Let a = 

{ type:vertex, name:marko, age:29 | [inE] [outE] }#1

a.count()                       => 1
a.value()                       => Map.of('type','vertex','name','marko','age',29)
a.get('type')                   => { 'vertex' }#1
a.get('name')                   => { 'marko' }#1
a.hasKey('blah')                => false
a.match(Instruction.of('outE')) => true
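
To make the notation concrete, here is a minimal sketch of a fully known tuple like a. Everything in it is an assumption for illustration: KnownTuple is a hypothetical class, instructions are reduced to their operator name, and only the #1 case is covered.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified tuple for the "#1, everything is known" case.
final class KnownTuple {
    private final Object value;         // a Map of key/values, or a primitive
    private final Set<String> patterns; // operators of the exposed bytecode patterns

    KnownTuple(Object value, Set<String> patterns) {
        this.value = value;
        this.patterns = Set.copyOf(patterns);
    }

    long count() {
        return 1L; // exactly one object, hence #1
    }

    Object value() {
        return value;
    }

    boolean hasKey(Object key) {
        return value instanceof Map && ((Map<?, ?>) value).containsKey(key);
    }

    // get() wraps the value in another tuple, e.g. { 'marko' }#1, with no patterns.
    KnownTuple get(Object key) {
        if (!hasKey(key)) {
            throw new IllegalArgumentException("no such key: " + key);
        }
        return new KnownTuple(((Map<?, ?>) value).get(key), Set.of());
    }

    // Simplification: an instruction matches if its operator is an exposed pattern.
    boolean match(String op) {
        return patterns.contains(op);
    }
}
```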

//////

b = a.apply(Instruction.of('outE'))

{ type:edge, label:?string | [outV] [inV] }#?

b.count()                      => -1
b.hasKey('weight')             => null            // not false because all we know
                                                  // is type:edge & label:?string about #? of things.
b.hasKey('type')               => true
b.hasKey('label')              => true
b.get('label')                 => { ?string }#?   // ?string is something like
                                                  // Unknown.of(Type.string())
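
The three possible answers of hasKey() (true, false, "don't know") could be sketched as below. This is only an assumption about how one might model it: PartialTuple is a hypothetical class, and Optional.empty() stands in for the null shown above.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical tuple that may only know some of the keys shared by its objects.
final class PartialTuple {
    private final Map<String, Object> known; // key/values known to hold for every object
    private final boolean complete;          // true if 'known' is the full key set

    PartialTuple(Map<String, Object> known, boolean complete) {
        this.known = Map.copyOf(known);
        this.complete = complete;
    }

    // Optional.of(true)  -> every object has the key
    // Optional.of(false) -> no object has the key
    // Optional.empty()   -> unknown without iterating
    Optional<Boolean> hasKey(String key) {
        if (known.containsKey(key)) {
            return Optional.of(true);
        }
        return complete ? Optional.of(false) : Optional.empty();
    }
}
```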

//////

c = b.apply(Instruction.of('inV'))

{ type:vertex }#?

c.count()      => -1
c.value()      => Map.of('type','vertex')
c.hasNext()    => true
c.next()       => { type:vertex, name:stephen, age:17 | [inE] [outE] }
c.hasNext()    => true
c.next()       => { type:vertex, name:kuppitz | [inE] [outE] }
c.hasNext()    => false
c.count()      => 0
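
The count()/hasNext()/next() behavior of c could be sketched like this. It is a simplification under assumptions: LazyTuple is a hypothetical class that wraps a plain java.util.Iterator, and -1 means "unknown" as in the example.

```java
import java.util.Iterator;

// Hypothetical tuple over an unknown number of objects, backed by a one-shot iterator.
final class LazyTuple<T> implements Iterator<T> {
    private final Iterator<T> source;

    LazyTuple(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next(); // the object is consumed; the tuple no longer references it
    }

    // -1 while objects may remain (the size is unknown), 0 once the iterator is drained.
    long count() {
        return source.hasNext() ? -1L : 0L;
    }
}
```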

//////

d = { type:vertex, name:kuppitz | [inE] [outE] }

e = d.get('name')

{ kuppitz }#1

e.count()     => 1
e.value()     => 'kuppitz'

//////

Let f = 

{ type:edge | [outV] [inV] [has,label,eq,?0] }#10

f.count()                                            => 10
f.get('type')                                        => { 'edge' }#10
f.match(Instruction.of('has','label',P.eq,'knows'))  => true

//////

g = f.apply(Instruction.of('has','label',P.eq,'knows'))

{ type:edge, label:knows | [outV] [inV] }#1

g.count()      => 1
g.hasNext()    => true
g.next()       => { type:edge, label:knows | [outV] [inV] }#1  // its iteration is itself!
g.hasNext()    => false                                        // g lost the reference
g.count()      => 0
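
How an instruction such as [has,label,eq,knows] might be matched against a pattern such as [has,label,eq,?0] can be sketched as follows. PatternMatcher is a hypothetical helper, instructions and patterns are modeled as plain lists, and any string starting with '?' is assumed to be a placeholder.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical matcher: a pattern element such as "?0" matches any value and binds it.
final class PatternMatcher {

    // Returns the placeholder bindings on a match, or Optional.empty() on a mismatch.
    static Optional<Map<String, Object>> match(List<Object> pattern, List<Object> instruction) {
        if (pattern.size() != instruction.size()) {
            return Optional.empty();
        }
        Map<String, Object> bindings = new HashMap<>();
        for (int i = 0; i < pattern.size(); i++) {
            Object expected = pattern.get(i);
            Object actual = instruction.get(i);
            if (expected instanceof String && ((String) expected).startsWith("?")) {
                // A placeholder used twice must bind to the same value both times.
                Object previous = bindings.putIfAbsent((String) expected, actual);
                if (previous != null && !previous.equals(actual)) {
                    return Optional.empty();
                }
            } else if (!expected.equals(actual)) {
                return Optional.empty();
            }
        }
        return Optional.of(bindings);
    }
}
```

On a match, the bindings (here ?0 = knows) are what the provider would use to pick the right index lookup when the instruction is applied.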

//////

Cool? Questions?

Thanks,
Marko.

http://rredux.com




> On May 17, 2019, at 6:57 AM, Stephen Mallette <[email protected]> wrote:
> 
> This is a nicely refined representation of this concept. I think I've
> followed this abstractly since you first started discussing it, but I've
> struggled with the implementation of it and how it would best work (which
> is probably the reason I keep thinking that I'm not following the
> abstraction hehe). You nicely wrote this from the perspective of the
> individual providers which I think connected me more to the more concrete
> aspect of things, which leads me to this question:  Does the provider send
> the instructions by looking at the query or do they just provide all the
> possible instructions and TP figures it out? (I feel like I've kinda read
> it both ways at different times).
> 
> On Fri, May 17, 2019 at 8:12 AM Marko Rodriguez <[email protected]> wrote:
> 
>> Hello,
>> 
>> This email is primarily for Kuppitz and Josh. Kuppitz offered me his
>> attention yesterday. I explained to him an idea I’ve been working on this
>> week. I’ve been frustrated lately because abstract ideas are so hard to
>> express over email and IM. Fortunately, Kuppitz was patient with me. Then he
>> got it. Then he innovated on it. I was elated.
>> 
>>        https://twitter.com/twarko/status/1129117666910674944
>> 
>> Josh was interested in what this was all about. I had to leave for
>> hockey, but I gave him a quick breakdown. He sorta got the vibe, but wanted
>> to know more…..
>> 
>> ########################################
>> 
>> There is only one type of “tuple.”
>> 
>> { }#?
>> 
>> The notation says: there are objects, but I don’t know how many of them
>> there are…..if you want to know more, iterate.
>> 
>> ########################################
>> 
>> Let us begin…………..
>> 
>> 
>> ——————TP4 WITH PROVIDER A——————
>> 
>> g.
>> 
>> { [V] }#1
>> 
>> There is one object. Thus, what you see is all that I know about this
>> object. In particular, what I know is that it can be mapped via the
>> bytecode instruction [V].
>> 
>> Let us apply [V].
>> 
>> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>> 
>> There are some number of objects. If you want to know what they are,
>> iterate. However, I am aware of a feature that they all share. I do know
>> for a fact (by the way I was designed by my creator ProviderA) that every
>> one of the objects has a name-key to some string value. Also, two has()
>> bytecode patterns are available.
>> 
>> Let us apply [hasKey,name].
>> 
>> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>> 
>> The instruction didn't match any of the available bytecode patterns. Thus,
>> the instruction has to be evaluated. Did you need to iterate and filter out
>> those that don’t have a name-key? No. As I told you, I know that every one
>> of the objects has a name-key.
>> 
>> Let us apply [has,id,eq,1].
>> 
>> { name:marko, age:29 | [inE] [outE] }#1
>> 
>> There is one thing. It has primitive key/value data —  a name and an age.
>> 
>> Let us apply [values,name].
>> 
>> { marko }#1
>> 
>> That bytecode instruction didn't match any of the available bytecode
>> patterns. The instruction was evaluated and there is one thing: the string
>> “marko.”
>> 
>> We did:
>> 
>> g.V().hasKey('name').hasId(1).values('name')
>> 
>> The query you provided used an index on id. How do we know that? You
>> didn’t have to iterate all the objects and filter on id. I was able to jump
>> from all vertices to the one with id=1.
>> 
>> ——————TP4 WITH PROVIDER B——————
>> 
>> { type:person, name:?string, age:?int | [has,name,eq,?0] }#10
>> 
>> There are 10 objects. Some providers can’t determine how many objects
>> there are without full iteration. But, by the way I was designed, I know. I
>> also know that all the objects have a type:person key/value. I also know
>> they all have a name-key and an age-key with known value types.
>> 
>> What am I?
>> 
>> CREATE TABLE people (
>>  name varchar(100),
>>  age int
>> );
>> CREATE INDEX people_name_idx ON people (name);
>> 
>> ——————TP4 WITH PROVIDER C——————
>> 
>> g.V().has('name','marko').has('age',gt(20)).id()
>> 
>> This is easy. My creator, ProviderC, provides multi-key indices. And when
>> the database instance was created, a (name,age)-index was created. Also,
>> because you only want the id of those vertices named marko whose age is
>> greater than 20, I don’t have to manifest the vertices, I can simply get
>> the id out of the index. This is what I provided for each instruction of
>> your query...
>> 
>> 1. { type:graph | [V] }#1
>> 2. { type:vertex | [has,name,eq,?0] [has,age,?0,?1] [id] }#?
>> 3. { type:vertex, label:person, name:marko | [has,age,?0,?1] [id] }#?
>> 4. { type:vertex, label:person, name:marko, age:gt(20) | [id] }#?
>> 5. { type:int }#?
>> 
>> Unlike ProviderA, all the objects in me have a type-key. It is just
>> something I like to do. Call it my quirk. Thus, on line #2, I know that
>> there are some number of vertex objects. And do you see my multi-property
>> index there? On line #3, I know for a fact that every one of those objects
>> has a name:marko entry. Finally, by line #5, I don’t know how many
>> id-objects there are, but I do know they are all integers. If you want to
>> know what they are, iterate.
>> 
>> Below are the possible “bytecode pattern”-paths that are available off of
>> the graph object. At any point through this pattern, you could iterate.
>> 
>>                        [V]
>>                       / | \
>>                      / [id]\
>>                     /       \
>>      [has,name,eq,?0]        [has,age,?0,?1]
>>         /         \             /          \
>>        /           \           /            \
>> [has,age,?0,?1]    [id]    [has,name,eq,?0]  [id]
>>       |                          |
>>      [id]                       [id]
>> 
>> 
>> *** In case the diagram above looks weird in your mail client:
>> https://gist.github.com/okram/f7f20a3c33aa7caca7c28e85fd16be3f
>> 
>> ——————TP4 WITH PROVIDER D——————
>> 
>> I support “vertex-centric indices.” For certain queries, I don’t have to
>> manifest/iterate the incident edges of a vertex to check their key/value
>> pairs. In particular, I have indexed all the incident knows-edges by their
>> weight property. Wanna know who marko knows well? Do this query:
>> 
>> …outE('knows').has('weight',gt(0.85)).inV()
>> 
>> { label:person, name:marko, age:29 | [outE] [inE] }#1
>> // [outE]
>> { weight:?float | [has,label,eq,?1] [inV] }#20
>> // [has,label,eq,knows]
>> { label:knows, weight:?float | [has,weight,?0,?1] [inV] }#15
>> // [has,weight,gt,0.85]
>> { label:knows, weight:gt(0.85) | [inV] }#15
>> // [inV]
>> { label:person }#15
>> 
>> See. I didn’t create a single edge! I do know there are 20 outgoing edges
>> from marko, but I didn’t manifest them. I then was able to jump to the
>> adjacent vertices. If you want to know about those, you can iterate….
>> 
>> …label()
>> 
>> { person }#15
>> 
>> Haha. I don’t have to iterate to solve that. I know that all 15 adjacent
>> vertices are labeled as ‘person’. I was able to go from v[1] to 15 person
>> strings without manifesting any intermediate edges or vertices! I’m pretty
>> freakin’ sweet. How do I know that you ask? I’m an in-memory graph database
>> and my vertex-centric indices are just Java sets. It’s cheap for me to
>> provide counts, so I do. Most other providers can’t do that. But I can.
>> 
>> ——————TP4 WITH PROVIDER E——————
>> 
>> 
>> …out('knows').values('name')
>>     ==compiles to==>
>> [outE][has,label,eq,knows][inV][values,name]
>> 
>> 
>> { name:marko, age:29 | [outE] [inE] }#1
>> // [outE]
>> { [has,label,eq,?1] [inV] }#20
>> // [has,label,eq,knows]
>> { label:knows | [inV] }#15
>> // [inV]
>> { label:person | [values,name] }#15
>> // [values,name]
>> { type:string }#15
>> 
>> Did you see that? I didn’t manifest any incident edges nor adjacent
>> vertices and I was able to give you the name of all the people that marko
>> knows! Can you guess what features I have?
>> 
>>        * Incident edges are indexed by label.
>>        * Certain properties of a vertex can be denormalized (stored
>> locally) to their adjacent neighbors.
>> 
>> Thanks for reading,
>> Marko.
>> 
>> http://rredux.com
