Hi,
Kuppitz makes fun of me for my constant use of the word “tuple” for anything
that has to do with TP4 structure/.
Perhaps this is the API:
https://gist.github.com/okram/84912722a2c00f26f07f1c4825eacd50
My response to Stephen below is still worth reading, as it’s more detailed, and
the link above assumes you understand it.
What I like about the updated API:
  Are you only talking RDBMS?
    TMap. “relations”
  Are you only talking GraphDB?
    TMap, TPrimitive. “vertices” and “edges” and their property values.
    @Josh: want to build a type system over graphdb --> “vertex+edge” = “relations”.
  Are you only talking DocumentDB?
    TMap, TList, TPrimitive. “objects” containing “objects”, “lists”, “primitives”
  Are you only talking Wide-Column?
    TMap. “relations”
…
I’ll stop for now. I don’t want to overload y’all. And it’s the freakin’
weekend… oh wait, every day is the weekend for me.
Peace in the Far East (LA),
Marko.
http://rredux.com
> On May 17, 2019, at 7:58 AM, Marko Rodriguez <[email protected]> wrote:
>
> Hi,
>
> Thanks for your question.
>
> I suppose that a “limit bandwidth”-optimization could be based on the
> provider looking at all the instructions in the submitted bytecode and
> then using that information to constrain what bytecode patterns it exposes. A
> simple ProviderStrategy would be the means of doing that.
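
One way to picture that strategy, as a minimal sketch: the provider collects the operator names used in the submitted bytecode and keeps only the patterns that could ever match them. Instructions are modeled here as plain string lists, and PatternLimiter and relevant() are hypothetical names, not part of any TP4 API.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper: trims a provider's pattern list down to what a query can use.
final class PatternLimiter {

    // Each instruction or pattern is a list such as ["has","name","eq","?0"];
    // element 0 is the operator name.
    static List<List<String>> relevant(List<List<String>> allPatterns,
                                       List<List<String>> bytecode) {
        // Operators that actually occur in the submitted bytecode.
        Set<String> ops = bytecode.stream()
                .map(instruction -> instruction.get(0))
                .collect(Collectors.toSet());
        // Expose only the patterns whose operator occurs in the query.
        return allPatterns.stream()
                .filter(pattern -> ops.contains(pattern.get(0)))
                .collect(Collectors.toList());
    }
}
```

With patterns [V], [has,name,eq,?0], [id], [outE] and a query that never uses outE, the [outE] pattern would simply not be advertised.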
>
> Perhaps showing you what I think the Tuple API should look like would help.
> This API would represent the primary way in which the TP VM interacts with
> the structure/ provider. Thus, this is for all cookies in the cookie jar!
>
> ############################################################
>
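> import java.util.Iterator;
>
> // Instruction is the TP4 bytecode instruction type (not shown here).
> public interface Tuple<A> extends Iterator<Tuple<A>> {
>
>   // The examples below show hasKey() answering "unknown" (null),
>   // so it returns Boolean; a primitive boolean cannot be null.
>   Boolean hasKey(Object key);
>   boolean hasValue(Object value);
>   <B> Tuple<B> get(Object key);
>   A value();
>
>   // Number of objects represented, or -1 if unknown without iterating.
>   long count();
>
>   // Inherited from Iterator; repeated for clarity.
>   @Override boolean hasNext();
>   @Override Tuple<A> next();
>
>   // Does this tuple expose a bytecode pattern matching the instruction?
>   boolean match(Instruction instruction);
>   // Map this tuple through the instruction.
>   Tuple<?> apply(Instruction instruction);
>
> }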
>
> ############################################################
>
> Structure neo4j = Neo4jStructureFactory.open(config1);
> Tuple<Map<String,String>> db = neo4j.root();
> => { type:graph | [V] }#1
>
> //////
>
> Let a =
>
> { type:vertex, name:marko, age:29 | [inE] [outE] }#1
>
> a.count() => 1
> a.value() =>
> Map.of('type','vertex','name','marko','age',29)
> a.get('type') => { 'vertex' }#1
> a.get('name') => { 'marko' }#1
> a.hasKey('blah') => false
> a.match(Instruction.of('outE')) => true
>
> //////
>
> b = a.apply(Instruction.of('outE'))
>
> { type:edge, label:?string | [outV] [inV] }#?
>
> b.count() => -1
> b.hasKey('weight') => null // not false because all we
> know is type:edge & label:?string about #? of things.
> b.hasKey('type') => true
> b.hasKey('label') => true
> b.get('label') => { ?string }#? // ?string is something
> like Unknown.of(Type.string())
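
The three possible answers to hasKey() can be sketched with an explicit enum instead of null. This is only an illustration: Answer, PartialSchema, and the complete flag are hypothetical names, assuming a tuple knows some keys for certain and may or may not know that this list is exhaustive.

```java
import java.util.Set;

// Hypothetical three-valued answer: yes, no, or "iterate if you want to know".
enum Answer { TRUE, FALSE, UNKNOWN }

final class PartialSchema {
    private final Set<String> knownKeys; // keys every object is known to have
    private final boolean complete;      // true when knownKeys is the full key set

    PartialSchema(Set<String> knownKeys, boolean complete) {
        this.knownKeys = knownKeys;
        this.complete = complete;
    }

    Answer hasKey(String key) {
        if (knownKeys.contains(key)) return Answer.TRUE;
        // A missing key is a definite "no" only when nothing is hidden.
        return complete ? Answer.FALSE : Answer.UNKNOWN;
    }
}
```

A fully known tuple such as a above would be constructed with complete set to true, so hasKey("blah") is a definite FALSE; the #? tuple b would use false, so hasKey("weight") stays UNKNOWN.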
>
> //////
>
> c = b.apply(Instruction.of('inV'))
>
> { type:vertex }#?
>
> c.count() => -1
> c.value() => Map.of('type','vertex')
> c.hasNext() => true
> c.next() => { type:vertex, name:stephen, age:17 | [inE] [outE] }
> c.hasNext() => true
> c.next() => { type:vertex, name:kuppitz | [inE] [outE] }
> c.hasNext() => false
> c.count() => 0
>
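
The count() behaviour in the example above (-1 while unknown, 0 once drained) could be sketched as a thin wrapper around an ordinary iterator. CountingTuples is a hypothetical class for illustration, not the TP4 implementation.

```java
import java.util.Iterator;

// Hypothetical wrapper: tracks how many objects remain, if that is known.
final class CountingTuples<T> implements Iterator<T> {
    private final Iterator<T> source;
    private long known; // remaining objects, or -1 for "unknown until iterated"

    CountingTuples(Iterator<T> source, long known) {
        this.source = source;
        this.known = known;
    }

    @Override
    public boolean hasNext() {
        boolean more = source.hasNext();
        if (!more) known = 0; // once drained, the count is certain
        return more;
    }

    @Override
    public T next() {
        T item = source.next();
        if (known > 0) known--; // an unknown count (-1) stays unknown
        return item;
    }

    long count() {
        return known;
    }
}
```

The same wrapper covers the case where the count is known up front: constructed with a count of 1, it reports 0 after a single next().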
> //////
>
> d = { type:vertex, name:kuppitz | [inE] [outE] }
>
> e = d.get('name')
>
> { kuppitz }#1
>
> e.count() => 1
> e.value() => 'kuppitz'
>
> //////
>
> Let f =
>
> { type:edge | [outV] [inV] [has,label,eq,?0] }#10
>
> f.count() => 10
> f.get('type') => { 'edge' }#10
> f.match(Instruction.of('has','label',P.eq,'knows')) => true
>
> //////
>
> g = f.apply(Instruction.of('has','label',P.eq,'knows'))
>
> { type:edge, label:knows | [outV] [inV] }#1
>
> g.count() => 1
> g.hasNext() => true
> g.next() => { type:edge, label:knows | [outV] [inV] }#1 // its
> iteration is itself!
> g.hasNext() => false // g lost the
> reference
> g.count() => 0
>
> //////
>
> Cool? Questions?
>
> Thanks,
> Marko.
>
> http://rredux.com
>
>
>
>
>> On May 17, 2019, at 6:57 AM, Stephen Mallette <[email protected]> wrote:
>>
>> This is a nicely refined representation of this concept. I think I've
>> followed this abstractly since you first started discussing it, but I've
>> struggled with the implementation of it and how it would best work (which
>> is probably the reason I keep thinking that I'm not following the
>> abstraction hehe). You nicely wrote this from the perspective of the
>> individual providers which I think connected me more to the more concrete
>> aspect of things, which leads me to this question: Does the provider send
>> the instructions by looking at the query or do they just provide all the
>> possible instructions and TP figures it out? (i feel like i've kinda read
>> it both ways at different times).
>>
>> On Fri, May 17, 2019 at 8:12 AM Marko Rodriguez <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> This email is primarily for Kuppitz and Josh. Kuppitz offered me his
>>> attention yesterday. I explained to him an idea I’ve been working on this
>>> week. I’ve been frustrated lately because it is so hard to express
>>> abstract ideas over email and IM. Fortunately, Kuppitz was patient with me. Then he
>>> got it. Then he innovated on it. I was elated.
>>>
>>> https://twitter.com/twarko/status/1129117666910674944
>>>
>>> Josh was interested in what this was all about. I had to leave for
>>> hockey, but I gave him a quick breakdown. He sorta got the vibe, but wanted
>>> to know more…..
>>>
>>> ########################################
>>>
>>> There is only one type of “tuple.”
>>>
>>> { }#?
>>>
>>> The notation says: there are objects, but I don’t know how many of them
>>> there are…..if you want to know more, iterate.
>>>
>>> ########################################
>>>
>>> Let us begin…………..
>>>
>>>
>>> ——————TP4 WITH PROVIDER A——————
>>>
>>> g.
>>>
>>> { [V] }#1
>>>
>>> There is one object. Thus, what you see is all that I know about this
>>> object. In particular, what I know is that it can be mapped via the
>>> bytecode instruction [V].
>>>
>>> Let us apply [V].
>>>
>>> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>>>
>>> There are some number of objects. If you want to know what they are,
>>> iterate. However, I am aware of a feature that they all share. I do know
>>> for a fact (by the way I was designed by my creator ProviderA) that every
>>> one of the objects has a name-key to some string value. Also, two has()
>>> bytecode patterns are available.
>>>
>>> Let us apply [hasKey,name].
>>>
>>> { name:?string | [has,age,?0,?1] [has,id,eq,?0] }#?
>>>
>>> The instruction didn't match any of the available bytecode patterns. Thus,
>>> the instruction has to be evaluated. Did you need to iterate and filter out
>>> those that don’t have a name-key? No. As I told you, I know that every one
>>> of the objects has a name-key.
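
Deciding whether an instruction "matches an available bytecode pattern" can be sketched as a position-by-position comparison in which ?0, ?1, … accept anything. This is a minimal illustration under that assumption; BytecodePattern is a hypothetical name, and the sketch does not check that a repeated placeholder binds to the same value each time.

```java
import java.util.List;

// Hypothetical matcher for patterns such as [has,age,?0,?1].
final class BytecodePattern {

    static boolean matches(List<String> pattern, List<?> instruction) {
        // Operator plus arguments must line up one to one.
        if (pattern.size() != instruction.size()) return false;
        for (int i = 0; i < pattern.size(); i++) {
            String expected = pattern.get(i);
            // A leading '?' marks a placeholder that accepts any argument.
            if (expected.startsWith("?")) continue;
            // Everything else must be equal (compared as text).
            if (!expected.equals(String.valueOf(instruction.get(i)))) return false;
        }
        return true;
    }
}
```

Under this reading, [has,id,eq,1] matches [has,id,eq,?0], while [hasKey,name] matches neither of Provider A's two has() patterns and falls through to ordinary evaluation.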
>>>
>>> Let us apply [has,id,eq,1].
>>>
>>> { name:marko, age:29 | [inE] [outE] }#1
>>>
>>> There is one thing. It has primitive key/value data — a name and an age.
>>>
>>> Let us apply [values,name].
>>>
>>> { marko }#1
>>>
>>> That bytecode instruction didn't match any of the available bytecode
>>> patterns. The instruction was evaluated and there is one thing: the string
>>> “marko.”
>>>
>>> We did:
>>>
>>> g.V().hasKey('name').hasId(1).values('name')
>>>
>>> The query you provided used an index on id. How do we know that? You
>>> didn’t have to iterate all the objects and filter on id. I was able to jump
>>> from all vertices to the one with id=1.
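
The difference between jumping via an id index and iterating can be sketched with a hash map. This is an assumption-laden toy, not how any real provider stores vertices; IdIndexSketch and its touched counter are hypothetical and exist only to show how many vertices each path has to inspect.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical provider internals: vertices keyed by id.
final class IdIndexSketch {
    private final Map<Integer, Map<String, Object>> byId = new HashMap<>();
    private int touched = 0; // vertices inspected so far

    void addVertex(int id, Map<String, Object> properties) {
        byId.put(id, properties);
    }

    int touched() {
        return touched;
    }

    // Backs the [has,id,eq,?0] pattern: one hash lookup, no iteration.
    Optional<Map<String, Object>> hasIdIndexed(int id) {
        Map<String, Object> vertex = byId.get(id);
        if (vertex != null) touched++;
        return Optional.ofNullable(vertex);
    }

    // What would happen without the pattern: inspect vertices one by one.
    Optional<Map<String, Object>> hasIdScan(int id) {
        for (Map.Entry<Integer, Map<String, Object>> entry : byId.entrySet()) {
            touched++;
            if (entry.getKey() == id) return Optional.of(entry.getValue());
        }
        return Optional.empty();
    }
}
```

The indexed path inspects exactly one vertex no matter how many are stored, which is the "jump" described above.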
>>>
>>> ——————TP4 WITH PROVIDER B——————
>>>
>>> { type:person, name:?string, age:?int | [has,name,eq,?0] }#10
>>>
>>> There are 10 objects. Some providers can’t determine how many objects
>>> there are without full iteration. But, by the way I was designed, I know. I
>>> also know that all the objects have a type:person key/value. I also know
>>> they all have a name-key and an age-key with known value types.
>>>
>>> What am I?
>>>
>>> CREATE TABLE people (
>>>   name varchar(100),
>>>   age int
>>> );
>>> CREATE INDEX people_name_idx ON people (name);
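
How a relational provider might derive that tuple description from its catalog can be sketched as simple string building: every column becomes a key with a known value type, every indexed column becomes a has-eq pattern, and the row count is known. TableTuple and describe() are hypothetical names, and the type names passed in are whatever the caller chooses.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: renders table metadata in the tuple notation of this thread.
final class TableTuple {

    static String describe(String type, Map<String, String> columns,
                           List<String> indexedColumns, long rows) {
        StringBuilder sb = new StringBuilder("{ type:" + type);
        // Every row has every column, so each key is known along with its value type.
        for (Map.Entry<String, String> column : columns.entrySet()) {
            sb.append(", ").append(column.getKey()).append(":?").append(column.getValue());
        }
        sb.append(" |");
        // Each single-column index can answer an equality filter directly.
        for (String column : indexedColumns) {
            sb.append(" [has,").append(column).append(",eq,?0]");
        }
        // The row count is known without iterating.
        return sb.append(" }#").append(rows).toString();
    }
}
```

Pass an insertion-ordered map (for example a LinkedHashMap) so the columns come out in table order.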
>>>
>>> ——————TP4 WITH PROVIDER C——————
>>>
>>> g.V().has('name','marko').has('age',gt(20)).id()
>>>
>>> This is easy. My creator, ProviderC, provides multi-key indices. And when
>>> the database instance was created, a (name,age)-index was created. Also,
>>> because you only want the id of those vertices named marko whose age is
>>> greater than 20, I don’t have to manifest the vertices, I can simply get
>>> the id out of the index. This is what I provided for each instruction of
>>> your query...
>>>
>>> 1. { type:graph | [V] }#1
>>> 2. { type:vertex | [has,name,eq,?0] [has,age,?0,?1] [id] }#?
>>> 3. { type:vertex, label:person, name:marko | [has,age,?0,?1] [id] }#?
>>> 4. { type:vertex, label:person, name:marko, age:gt(20) | [id] }#?
>>> 5. { type:int }#?
>>>
>>> Unlike ProviderA, all the objects in me have a type-key. It is just
>>> something I like to do. Call it my quirk. Thus, on line #2, I know that
>>> there are some number of vertex objects. And do you see my multi-property
>>> index there? On line #3, I know for a fact that every one of those objects
>>> has a name:marko entry. Finally, by line #5, I don’t know how many
>>> id-objects there are, but I do know they are all integers. If you want to
>>> know what they are, iterate.
>>>
>>> Below are the possible “bytecode pattern”-paths that are available off of
>>> the graph object. At any point through this pattern, you could iterate.
>>>
>>> [V]
>>> / | \
>>> / [id]\
>>> / \
>>> [has,name,eq,?0] [has,age,?0,?1]
>>> / \ / \
>>> / \ / \
>>> [has,age,?0,?1] [id] [has,name,eq,?0] [id]
>>> | |
>>> [id] [id]
>>>
>>>
>>> *** In case the diagram above looks weird in your mail client:
>>> https://gist.github.com/okram/f7f20a3c33aa7caca7c28e85fd16be3f
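
The diagram can be read as a tree that a query walks one instruction at a time; wherever the walk stops, the remaining instructions are evaluated the ordinary way. Here is a minimal sketch of that walk, assuming instructions are plain string lists. PatternTree, then(), absorbed(), and providerC() are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree of bytecode patterns; each node lists what may follow it.
final class PatternTree {
    final List<String> pattern; // e.g. ["has","name","eq","?0"]
    final List<PatternTree> next = new ArrayList<>();

    PatternTree(String... pattern) {
        this.pattern = List.of(pattern);
    }

    PatternTree then(PatternTree... children) {
        next.addAll(List.of(children));
        return this;
    }

    boolean matches(List<String> instruction) {
        if (pattern.size() != instruction.size()) return false;
        for (int i = 0; i < pattern.size(); i++) {
            String p = pattern.get(i);
            // "?n" placeholders accept any argument.
            if (!p.startsWith("?") && !p.equals(instruction.get(i))) return false;
        }
        return true;
    }

    // Number of leading instructions that follow a path through the tree.
    static int absorbed(List<PatternTree> roots, List<List<String>> query) {
        List<PatternTree> options = roots;
        int n = 0;
        for (List<String> instruction : query) {
            PatternTree hit = null;
            for (PatternTree option : options) {
                if (option.matches(instruction)) { hit = option; break; }
            }
            if (hit == null) break; // the rest is evaluated the ordinary way
            options = hit.next;
            n++;
        }
        return n;
    }

    // The tree from the diagram above.
    static List<PatternTree> providerC() {
        return List.of(new PatternTree("V").then(
                new PatternTree("has", "name", "eq", "?0").then(
                        new PatternTree("has", "age", "?0", "?1").then(new PatternTree("id")),
                        new PatternTree("id")),
                new PatternTree("id"),
                new PatternTree("has", "age", "?0", "?1").then(
                        new PatternTree("has", "name", "eq", "?0").then(new PatternTree("id")),
                        new PatternTree("id"))));
    }
}
```

For the query above, all four instructions ([V], [has,name,eq,marko], [has,age,gt,20], [id]) follow a path, so nothing is left for the VM to filter.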
>>>
>>> ——————TP4 WITH PROVIDER D——————
>>>
>>> I support “vertex-centric indices.” For certain queries, I don’t have to
>>> manifest/iterate the incident edges of a vertex to check their key/value
>>> pairs. In particular, I have indexed all the incident knows-edges by their
>>> weight property. Wanna know who marko knows well? Do this query:
>>>
>>> …outE('knows').has('weight',gt(0.85)).inV()
>>>
>>> { label:person, name:marko, age:29 | [outE] [inE] }#1
>>> // [outE]
>>> { weight:?float | [has,label,eq,?1] [inV] }#20
>>> // [has,label,eq,knows]
>>> { label:knows, weight:?float | [has,weight,?0,?1] [inV] }#15
>>> // [has,weight,gt,0.85]
>>> { label:knows, weight:gt(0.85) | [inV] }#15
>>> // [inV]
>>> { label:person }#15
>>>
>>> See. I didn’t create a single edge! I do know there are 20 outgoing edges
>>> from marko, but I didn’t manifest them. I then was able to jump to the
>>> adjacent vertices. If you want to know about those, you can iterate….
>>>
>>> …label()
>>>
>>> { person }#15
>>>
>>> Haha. I don’t have to iterate to solve that. I know that all 15 adjacent
>>> vertices are labeled as ‘person’. I was able to go from v[1] to 15 person
>>> strings without manifesting any intermediate edges or vertices! I’m pretty
>>> freakin’ sweet. How do I know that, you ask? I’m an in-memory graph database
>>> and my vertex-centric indices are just Java sets. It’s cheap for me to
>>> provide counts, so I do. Most other providers can’t do that. But I can.
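
A vertex-centric index of this kind could be sketched with sorted maps: per edge label, weights map to the ids of the adjacent vertices, so a range query on weight lands directly on the neighbours without creating edge objects. This is a toy under those assumptions; VertexCentricIndex and its methods are hypothetical and use a sorted map rather than the plain sets mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical per-vertex index: label -> (weight -> adjacent vertex ids).
final class VertexCentricIndex {
    private final Map<String, NavigableMap<Double, List<Integer>>> byLabel = new HashMap<>();

    void addEdge(String label, double weight, int inVertexId) {
        byLabel.computeIfAbsent(label, k -> new TreeMap<>())
               .computeIfAbsent(weight, k -> new ArrayList<>())
               .add(inVertexId);
    }

    // outE(label).has('weight', gt(min)).inV() answered from the index alone.
    List<Integer> inVGreaterThan(String label, double min) {
        NavigableMap<Double, List<Integer>> weights =
                byLabel.getOrDefault(label, new TreeMap<Double, List<Integer>>());
        List<Integer> result = new ArrayList<>();
        // tailMap(min, false) is the strictly-greater-than range of the sorted map.
        for (List<Integer> ids : weights.tailMap(min, false).values()) {
            result.addAll(ids);
        }
        return result;
    }

    // Cheap count, since the answer is just a list size.
    int countGreaterThan(String label, double min) {
        return inVGreaterThan(label, min).size();
    }
}
```

Because the result is a plain collection, reporting a count alongside it costs almost nothing, which is why an in-memory provider can afford to expose #15 instead of #?.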
>>>
>>> ——————TP4 WITH PROVIDER E——————
>>>
>>>
>>> …out('knows').values('name')
>>> ==compiles to==>
>>> [outE][has,label,eq,knows][inV][values,name]
>>>
>>>
>>> { name:marko, age:29 | [outE] [inE] }#1
>>> // [outE]
>>> { [has,label,eq,?1] [inV] }#20
>>> // [has,label,eq,knows]
>>> { label:knows | [inV] }#15
>>> // [inV]
>>> { label:person | [values,name] }#15
>>> // [values,name]
>>> { type:string }#15
>>>
>>> Did you see that? I didn’t manifest any incident edges nor adjacent
>>> vertices and I was able to give you the name of all the people that marko
>>> knows! Can you guess what features I have?
>>>
>>> * Incident edges are indexed by label.
>>> * Certain properties of a vertex can be denormalized (stored
>>> locally) to their adjacent neighbors.
>>>
>>> Thanks for reading,
>>> Marko.
>>>
>>> http://rredux.com
>