This is a nicely refined representation of the concept. I think I've followed it in the abstract since you first started discussing it, but I've struggled with how it would be implemented and how it would best work (which is probably why I keep thinking that I'm not following the abstraction, hehe). You wrote this from the perspective of the individual providers, which connected me to the more concrete side of things and leads me to this question: does the provider send the instructions by looking at the query, or does it just provide all the possible instructions and TP figures it out? (I feel like I've kinda read it both ways at different times.)
On Fri, May 17, 2019 at 8:12 AM Marko Rodriguez <[email protected]> wrote:
> Hello,
>
> This email is primarily for Kuppitz and Josh. Kuppitz offered me his
> attention yesterday. I explained to him an idea I’ve been working on this
> week. I’ve been frustrated lately because it is so hard to express
> abstract ideas over email and IM. Fortunately, Kuppitz was patient with me.
> Then he got it. Then he innovated on it. I was elated.
>
> https://twitter.com/twarko/status/1129117666910674944
>
> Josh was interested in what this was all about. I had to leave for
> hockey, but I gave him a fast breakdown. He sorta got the vibe, but wanted
> to know more…..
>
> ########################################
>
> There is only one type of “tuple.”
>
> { }#?
>
> The notation says: there are objects, but I don’t know how many of them
> there are…..if you want to know more, iterate.
>
> ########################################
>
> Let us begin…………..
>
>
> ——————TP4 WITH PROVIDER A——————
>
> g.
>
> { [V] }#1
>
> There is one object. Thus, what you see is all that I know about this
> object. In particular, what I know is that it can be mapped via the
> bytecode instruction [V].
>
> Let us apply [V].
>
> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>
> There are some number of objects. If you want to know what they are,
> iterate. However, I am aware of a feature that they all share. I do know
> for a fact (by the way I was designed by my creator ProviderA) that every
> one of the objects has a name-key to some string value. Also, two has()
> bytecode patterns are available.
>
> Let us apply [hasKey,name].
>
> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>
> The instruction didn't match any of the available bytecode patterns. Thus,
> the instruction has to be evaluated. Did you need to iterate and filter out
> those that don’t have a name-key? No.
> As I told you, I know that every one
> of the objects has a name-key.
>
> Let us apply [has,id,eq,1].
>
> { name:marko, age:29 | [inE] [outE] }#1
>
> There is one thing. It has primitive key/value data: a name and an age.
>
> Let us apply [values,name].
>
> { marko }#1
>
> That bytecode instruction didn't match any of the available bytecode
> patterns. The instruction was evaluated and there is one thing: the string
> “marko.”
>
> We did:
>
> g.V().hasKey('name').hasId(1).values('name')
>
> The query you provided used an index on id. How do we know that? You
> didn’t have to iterate all the objects and filter on id. I was able to jump
> from all vertices to the one with id=1.
>
> ——————TP4 WITH PROVIDER B——————
>
> { type:person, name:?string, age:?int | [has,name,eq,?0] }#10
>
> There are 10 objects. Some providers can’t determine how many objects
> there are without full iteration. But, by the way I was designed, I know. I
> also know that all the objects have a type:person key/value. I also know
> they all have a name-key and an age-key with known value types.
>
> What am I?
>
> CREATE TABLE people (
>   name varchar(100),
>   age int
> );
> CREATE INDEX people_name_idx ON people (name);
>
> ——————TP4 WITH PROVIDER C——————
>
> g.V().has('name','marko').has('age',gt(20)).id()
>
> This is easy. My creator, ProviderC, provides multi-key indices. And when
> the database instance was created, a (name,age)-index was created. Also,
> because you only want the id of those vertices named marko whose age is
> greater than 20, I don’t have to manifest the vertices; I can simply get
> the id out of the index. This is what I provided for each instruction of
> your query...
>
> 1. { type:graph | [V] }#1
> 2. { type:vertex | [has,name,eq,?0] [has,age,?0,?1] [id] }#?
> 3. { type:vertex, label:person, name:marko | [has,age,?0,?1] [id] }#?
> 4. { type:vertex, label:person, name:marko, age:gt(20) | [id] }#?
> 5. { type:int }#?
>
> Unlike ProviderA, all the objects in me have a type-key.
> It is just
> something I like to do. Call it my quirk. Thus, on line #2, I know that
> there are some number of vertex objects. And do you see my multi-property
> index there? On line #3, I know for a fact that every one of those objects
> has a name:marko entry. Finally, by line #5, I don’t know how many
> id-objects there are, but I do know they are all integers. If you want to
> know what they are, iterate.
>
> Below are the possible "bytecode pattern"-paths that are available off of
> the graph object. At any point through this pattern, you could iterate.
>
>                          [V]
>                       /   |   \
>                    /     [id]    \
>                 /                   \
>      [has,name,eq,?0]           [has,age,?0,?1]
>            /   \                     /   \
>          /       \                 /       \
> [has,age,?0,?1]   [id]   [has,name,eq,?0]   [id]
>        |                        |
>       [id]                     [id]
>
>
> *** In case the diagram above looks weird in your mail client:
> https://gist.github.com/okram/f7f20a3c33aa7caca7c28e85fd16be3f
>
> ——————TP4 WITH PROVIDER D——————
>
> I support "vertex-centric indices." For certain queries, I don’t have to
> manifest/iterate the incident edges of a vertex to check their key/value
> pairs. In particular, I have indexed all the incident knows-edges by their
> weight property. Wanna know who marko knows well? Do this query:
>
> …outE('knows').has('weight',gt(0.85)).inV()
>
> { label:person, name:marko, age:29 | [outE] [inE] }#1
>   // [outE]
> { weight:?float | [has,label,eq,?0] [inV] }#20
>   // [has,label,eq,knows]
> { label:knows, weight:?float | [has,weight,?0,?1] [inV] }#15
>   // [has,weight,gt,0.85]
> { label:knows, weight:gt(0.85) | [inV] }#15
>   // [inV]
> { label:person }#15
>
> See. I didn’t create a single edge! I do know there are 20 outgoing edges
> from marko, but I didn’t manifest them. I then was able to jump to the
> adjacent vertices. If you want to know about those, you can iterate….
>
> …label()
>
> { person }#15
>
> Haha. I don’t have to iterate to solve that. I know that all 15 adjacent
> vertices are labeled as ‘person’.
> I was able to go from v[1] to 15 person
> strings without manifesting any intermediate edges or vertices! I’m pretty
> freakin’ sweet. How do I know that, you ask? I’m an in-memory graph database
> and my vertex-centric indices are just Java sets. It's cheap for me to
> provide counts, so I do. Most other providers can’t do that. But I can.
>
> ——————TP4 WITH PROVIDER E——————
>
>
> …out('knows').values('name')
>   ==compiles to==>
> [outE][has,label,eq,knows][inV][values,name]
>
>
> { name:marko, age:29 | [outE] [inE] }#1
>   // [outE]
> { [has,label,eq,?0] [inV] }#20
>   // [has,label,eq,knows]
> { label:knows | [inV] }#15
>   // [inV]
> { label:person | [values,name] }#15
>   // [values,name]
> { type:string }#15
>
> Did you see that? I didn’t manifest any incident edges nor adjacent
> vertices and I was able to give you the name of all the people that marko
> knows! Can you guess what features I have?
>
> * Incident edges are indexed by label.
> * Certain properties of a vertex can be denormalized (stored
>   locally) to their adjacent neighbors.
>
> Thanks for reading,
> Marko.
>
> http://rredux.com
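
For what it's worth, here is a minimal Python sketch of one possible reading of the ProviderA walk-through quoted above: each tuple advertises its bytecode patterns, an incoming instruction is matched against them, and anything that doesn't match falls back to plain evaluation. Everything in it (TupleSet, matches, evaluate, run, the VERTICES toy data) is a made-up name for illustration and not a TinkerPop API, and the [has,age,?0,?1] pattern is left out for brevity. It is also only the "provider offers all its patterns and TP matches against them" interpretation, which the thread has not confirmed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy data standing in for ProviderA's storage.
VERTICES = {1: {"name": "marko", "age": 29}, 2: {"name": "vadas", "age": 27}}


def matches(pattern, instruction):
    """Return variable bindings if the instruction fits the pattern, else None."""
    if len(pattern) != len(instruction):
        return None
    bindings = {}
    for p, i in zip(pattern, instruction):
        if isinstance(p, str) and p.startswith("?"):
            bindings[p] = i          # ?0, ?1, ... bind to whatever is there
        elif p != i:
            return None              # a fixed token differs: no match
    return bindings


@dataclass
class TupleSet:
    known: dict            # facts shared by every object, e.g. {"name": "?string"}
    patterns: dict         # bytecode pattern -> provider shortcut function
    count: Optional[int]   # None plays the role of "#?"

    def apply(self, instruction, evaluate):
        """Use a provider shortcut when a pattern matches, else evaluate."""
        for pattern, jump in self.patterns.items():
            bindings = matches(pattern, instruction)
            if bindings is not None:  # {} is a valid match (e.g. [V])
                return jump(bindings), "pattern"
        return evaluate(self, instruction), "evaluated"


def vertex_by_id(bindings):
    # Assumed id index: jump straight to one vertex, no iteration.
    vertex = VERTICES.get(bindings["?0"])
    if vertex is None:
        return TupleSet(known={}, patterns={}, count=0)
    return TupleSet(known=dict(vertex), patterns={}, count=1)


def all_vertices(_bindings):
    return TupleSet(known={"name": "?string"},
                    patterns={("has", "id", "eq", "?0"): vertex_by_id},
                    count=None)


def evaluate(ts, instruction):
    """Fallback for instructions no pattern covers (only two are sketched)."""
    op = instruction[0]
    if op == "hasKey" and instruction[1] in ts.known:
        return ts                    # every object has the key: nothing to filter
    if op == "values" and instruction[1] in ts.known:
        return TupleSet(known={"value": ts.known[instruction[1]]},
                        patterns={}, count=ts.count)
    raise NotImplementedError(f"would need full iteration for {instruction}")


def run(start, instructions):
    current, trace = start, []
    for instruction in instructions:
        current, how = current.apply(instruction, evaluate)
        trace.append(how)
    return current, trace


graph = TupleSet(known={}, patterns={("V",): all_vertices}, count=1)

# g.V().hasKey('name').hasId(1).values('name')
result, trace = run(graph, [("V",), ("hasKey", "name"),
                            ("has", "id", "eq", 1), ("values", "name")])
```

In this reading, trace records which instructions were answered by an advertised pattern and which were evaluated, which is the distinction the ProviderA section draws between [has,id,eq,1] and [hasKey,name].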
