Re: N-Tuples, Pointers, Data Model Interfaces, and Bytecode Instructions

Joshua Shinavier Sat, 11 May 2019 08:11:12 -0700

Oops, looked back at my email and noticed some nonsense. This:

knows_weight_gt_85(v, e) = knows(v, e) ∧ weight(e, w) ∧ w>0.85



should be more like this:

knows_weight_gt_85(_, v, e) = knows(e, v, _) ∧ weight(_, e, w) ∧ w>0.85


because we need the identity of edges (and by extension, properties). So we
treat edge relations like triples, where the first element of the triple is
the identity of the edge. The _ symbol represents "don't care" variables.
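
As a rough sketch of the corrected definition, assuming each relation is
simply a set of triples whose first element is the identity of the edge or
property. The sample identifiers (e7, p1, ...) and values are made up for
illustration, and the (identity, out, in) argument order is only one reading
of the formula above, not anything fixed in TP4:

```python
# Each relation is a set of triples: (identity, out, in).
# All identifiers and values below are made-up sample data.
knows = {
    ("e7", "v1", "v2"),
    ("e8", "v1", "v4"),
}
weight = {
    ("p1", "e7", 0.5),
    ("p2", "e8", 1.0),
}


def knows_weight_gt_85(knows, weight, threshold=0.85):
    """Return (v, e) pairs: v has an outgoing knows-edge e whose weight exceeds threshold."""
    return {
        (v, e)
        for (e, v, _in) in knows        # knows(e, v, _)
        for (_id, e2, w) in weight      # weight(_, e, w)
        if e2 == e and w > threshold    # join on edge identity, then filter
    }


strong = knows_weight_gt_85(knows, weight)
print(strong)
```

The join is only possible because the edge identity appears as a value in
both relations; the "don't care" positions are bound and then ignored.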

Josh


On Sat, May 11, 2019 at 8:01 AM Joshua Shinavier <[email protected]> wrote:

> Hi Marko,
>
> Responses inline.
>
>
> On Tue, May 7, 2019 at 6:26 AM Marko Rodriguez <[email protected]>
> wrote:
>
>> Whoa.
>>
>> Check out this trippy trick.
>>
>> First, here is how you define a pointer to a map-tuple.
>>
>>         *{k1?v1, k2?v2, …, kn?vn}
>>                 * says “this is a pointer to a map" { }
>>                 ? is some comparator like =, >, <, !=, contains(), etc.
>>
>
> OK.
>
>
>
>
>> Assume the vertex map tuple v[1]:
>>
>> {#id:1, #label:person, name:marko, age:29}
>>
>> Now, we can add the following fields:
>>
>> 1. #outE:*{#outV=*{#id=1}}  // references all tuples that have an outV
>> field that is a pointer to the v[1] vertex tuple.
>>
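
One way to picture a pointer defined by a pattern is sketched below. It is
only an illustration under stated assumptions: the Pointer class, its
matches() and resolve() methods, and the sample store are hypothetical and
are not taken from the gist or from TP4. A pattern entry is a
(key, comparator, value) triple, and a value may itself be a pointer:

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Hypothetical pointer: a pattern of (key, comparator, value) entries."""
    pattern: tuple

    def matches(self, tup):
        # A tuple matches when every pattern entry holds for the tuple's field.
        return all(key in tup and op(tup[key], value)
                   for (key, op, value) in self.pattern)

    def resolve(self, store):
        # Dereference by linearly scanning the store for matching tuples.
        return [t for t in store if self.matches(t)]


to_v1 = Pointer((("#id", operator.eq, 1),))   # *{#id=1}
to_v2 = Pointer((("#id", operator.eq, 2),))   # *{#id=2}

# Made-up sample tuples; edges hold a pointer to their out-vertex.
store = [
    {"#id": 1, "#label": "person", "name": "marko", "age": 29},
    {"#id": 7, "#label": "knows", "#outV": to_v1, "weight": 0.5},
    {"#id": 8, "#label": "knows", "#outV": to_v1, "weight": 1.0},
    {"#id": 9, "#label": "created", "#outV": to_v1, "weight": 0.4},
    {"#id": 10, "#label": "knows", "#outV": to_v2, "weight": 0.9},
]

# *{#outV=*{#id=1}}
out_e = Pointer((("#outV", operator.eq, to_v1),))
# *{#outV=*{#id=1}, #label=knows, weight>0.85}
strong_knows = Pointer((
    ("#outV", operator.eq, to_v1),
    ("#label", operator.eq, "knows"),
    ("weight", operator.gt, 0.85),
))

out_ids = [t["#id"] for t in out_e.resolve(store)]
strong_ids = [t["#id"] for t in strong_knows.resolve(store)]
print(out_ids, strong_ids)
```

Here resolve() is a plain scan; a provider would presumably back each kind
of pointer with whatever structure it actually maintains.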
>
> Yes, I agree. You won't be surprised to hear me say that this boils down
> to good old select() and project(), e.g.
>
> g.V(1).select("*", "out")
>
>
> Here, I am using "*" as shorthand for "any matching relation", i.e.
> anything with an "out" (your "outV") field.
>
>
>
>> 2. #outE.knows:*{#outV=*{#id=1},#label=knows} // references all outgoing
>> knows-edges.
>>
>
> g.V(1).select("knows", "out")
>
>
>
>> 3. #outE.knows.weight_gt_85:*{#outV=*{#id=1},#label=knows,weight>0.85} //
>> references all strong outgoing knows-edges
>>
>
> g.V(1).select("knows", "out").as("e").select("weight",
> "out").project("in").is(P.gt(0.85)).back("e")
>
>
> I like how you are giving a name to a new relation built up of other
> relations. In relational calculus, this looks something like:
>
> knows_weight_gt_85(v, e) = knows(v, e) ∧ weight(e, w) ∧ w>0.85
>
>
> And I do think we should make defining new relations as straightforward as
> possible in TP4. Compositionality is life.
>
>
>
>> By using different types of pointers, a graph database provider can make
>> explicit their internal structure. Assume all three fields above are in the
>> v[1] vertex tuple. This means that:
>>
>>         1. all of v[1]’s outgoing edges are grouped together. <— linear scan
>>
>
> By convention, a relation could be indexed left to right, so
> knows_weight_gt_85(v, e) would express exactly that.
>
>
>
>>         2. all of v[1]’s outgoing knows-edges are grouped together. <—
>> indexed by label
>>
>
> Same.
>
>
>
>>         3. all of v[1]’s strong outgoing knows-edges are grouped together
>> <— indexed by label and weight
>>
>
> Yep.
>
>
>
>> Thus, a graph database provider can describe the way in which it
>> internally organizes adjacent edges — i.e. vertex-centric indices!
>
>
> This looks like convergence.
>
>
>
>> This means then that TP4 can do vertex-centric index optimizations
>> automatically for providers!
>>
>
> Ex-actly.
>
>
>
>>         1. values(“#outE”).hasLabel(‘knows’).has(‘weight’,gt(0.85)) //
>> grab all edges, then filter on label, then filter on weight.
>>         2. values(“#outE.knows”).has(‘weight’,gt(0.85)) // grab all
>> knows-edges, then filter on weight.
>>         3. values(“#outE.knows.weight_gt_85”) // grab all strong
>> knows-edges.
>>
>> *** Realize that Gremlin outE() will just compile to bytecode
>> values(“#outE”).
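
A small sketch of why the three instruction sequences above are
interchangeable, assuming the provider exposes all three fields on the v[1]
tuple. The edge data and the helper predicates are made up for illustration;
only the field names come from the message above:

```python
# Made-up sample edges leaving v[1].
edges = [
    {"#id": 7, "#label": "knows", "weight": 0.5},
    {"#id": 8, "#label": "knows", "weight": 1.0},
    {"#id": 9, "#label": "created", "weight": 0.4},
]


def is_knows(e):
    return e["#label"] == "knows"


def is_strong(e):
    return e["weight"] > 0.85


# The three fields a provider might expose, from least to most specific.
v1 = {
    "#id": 1,
    "#outE": edges,
    "#outE.knows": [e for e in edges if is_knows(e)],
    "#outE.knows.weight_gt_85": [e for e in edges if is_knows(e) and is_strong(e)],
}

# 1. grab all edges, then filter on label, then filter on weight.
plan1 = [e for e in v1["#outE"] if is_knows(e) and is_strong(e)]
# 2. grab all knows-edges, then filter on weight.
plan2 = [e for e in v1["#outE.knows"] if is_strong(e)]
# 3. grab all strong knows-edges directly.
plan3 = v1["#outE.knows.weight_gt_85"]

print([e["#id"] for e in plan3])
```

All three plans return the same edges; they differ only in how much
filtering is left to do after the field lookup, which is what would let a
compiler pick the most specific field that is available.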
>>
>> Freakin’ crazy! … Josh was interested in using the n-tuple structure to
>> describe indices. I was against it. I believe I still am. However, this is
>> pretty neat. As Josh was saying though, with a rich enough n-tuple
>> description of the underlying database, there should be no reason for
>> providers to have to write custom strategies and instructions ?!?!?!?!?
>> crazy!?
>>
>
> I think we might not mean the same thing by "indices", so maybe we just
> don't get hung up on that term, but we are on the same page w.r.t. what you
> wrote in this email. What's more, these indices... ok, maybe we do need to
> call them indices... can be relations of more than two variables. See the
> geospatial index example from my previous email. A vertex-centric index is
> to an edge what a generic index is to a hyperedge.
>
>
> Josh
>
>
>
>>
>> Marko.
>>
>> http://rredux.com
>>
>>
>>
>>
>> > On May 7, 2019, at 4:44 AM, Marko Rodriguez <[email protected]> wrote:
>> >
>> > Hey Josh,
>> >
>> >> I think of your Pointer<T> as a reference to an entity. It does not
>> >> contain the entity it refers to, but it contains the primary key of
>> >> that entity.
>> >
>> > Exactly! I was just thinking that last night. Tuples don’t need a
>> > separate ID system. No -- pointers reference the primary key of a tuple!
>> > Better yet perhaps, they can reference one-to-many. For instance:
>> >
>> > { id:1, label:person, name:marko, age:29, outE:*(outV=id) }
>> >
>> > Thus, a pointer is defined by a pattern match. Haven’t thought through
>> > the consequences, but … :)
>> >
>> >> Here, I have invented an Entity class to indicate that the pointer
>> >> resolves to a vertex (an entity without a tuple, or rather with a
>> >> 0-tuple -- the unit element).
>> >
>> > Ah — the 0-tuple. Neat thought.
>> >
>> > I look forward to your slides from the Knowledge Graph Conference. If I
>> > wasn’t such a reclusive hermit, I would have loved to have joined you
>> > there.
>> >
>> > Take care,
>> > Marko.
>> >
>> > http://rredux.com
>> >
>> >
>> >> On Mon, May 6, 2019 at 9:38 PM Marko Rodriguez <[email protected]> wrote:
>> >>
>> >>> Hey Josh,
>> >>>
>> >>>> I am feeling the tuples... as long as they can be typed, e.g.
>> >>>>
>> >>>>    <V> myTuple.get(Integer) -- int-indexed tuples
>> >>>>    <V> myTuple.get(String) -- string-indexed tuples
>> >>>> In most programming languages, "tuples" are not lists, though they
>> >>>> are typed by a list of element types. E.g. in Haskell you might have
>> >>>> a tuple with the type
>> >>>>    (Double, Double, Bool)
>> >>>
>> >>>
>> >>> Yes, we have Pair<A,B>, Triple<A,B,C>, Quadruple<A,B,C,D>, etc.
>> >>> However, for base Tuple<A> of unknown length, the best I can do in
>> >>> Java is <A>. :|
>> >>> You can see my stubs in the gist:
>> >>>        https://gist.github.com/okram/25d50724da89452853a3f4fa894bcbe8
>> >>>        (LINES #21-42)
>> >>>
>> >>>> If this is in line with your proposal, then we agree that tuples
>> >>>> should be the atomic unit of data in TP4.
>> >>>
>> >>> Yep. Vertices, Edges, Rows, Documents, etc. are all just tuples.
>> >>> However, I suspect that we will disagree on some of my tweaks. Thus,
>> >>> I’d really like to get your feedback on:
>> >>>
>> >>>        1. pointers (tuple entries referencing tuples).
>> >>>        2. sequences (multi-value tuple entries).
>> >>>        3. # hidden map keys :|
>> >>>                - sorta ghetto.
>> >>>
>> >>> Also, I’m still not happy with db().has().has().as(‘x’).db().where()…
>> >>> it’s an intense syntax and it’s hard to strategize.
>> >>>
>> >>> I really want to nail down this “universal model” (tuple structure and
>> >>> tuple-oriented instructions) as then I can get back on the codebase
>> >>> and start to flesh this stuff out with confidence.
>> >>>
>> >>> See ya,
>> >>> Marko.
>> >>>
>> >>> http://rredux.com
>> >>>
>> >>>
>> >>>>
>> >>>> Josh
>> >>>>
>> >>>>
>> >>>> On Mon, May 6, 2019 at 5:34 PM Marko Rodriguez <[email protected]> wrote:
>> >>>> Hi,
>> >>>>
>> >>>> I spent this afternoon playing with n-tuples, pointers, data model
>> >>>> interfaces, and bytecode instructions.
>> >>>>
>> >>>>
>> >>>> https://gist.github.com/okram/25d50724da89452853a3f4fa894bcbe8
>> >>>>
>> >>>> *** Kuppitz: They are tuples :). A Map<K,V> extends Tuple<Pair<K,V>>.
>> >>>> Tada!
>> >>>>
>> >>>> What I like about this is that it combines the best of both worlds
>> >>>> (Josh+Marko).
>> >>>>        * just flat tuples of arbitrary length.
>> >>>>                * pattern matching for arbitrary joins. (k1=k2 AND
>> >>>>                  k3=k4 …)
>> >>>>                * pointer chasing for direct links. (edges, foreign
>> >>>>                  keys, document _id references, URI resolutions, …)
>> >>>>        * sequences are a special type of tuple used for multi-valued
>> >>>>          entries.
>> >>>>        * has()/values()/etc. work on all tuple types! (maps, lists,
>> >>>>          tuples, vertices, edges, rows, statements, documents, etc.)
>> >>>>
>> >>>> Thoughts?,
>> >>>> Marko.
>> >>>>
>> >>>> http://rredux.com
>> >
>>
>>
