Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-05-11 Thread Joshua Shinavier
OK, beginning at the beginning.


On Mon, May 6, 2019 at 3:58 AM Marko Rodriguez  wrote:

> Hey Josh,
>
>
> > One more thing is needed: disjoint unions. I described these in my email
> on
> > algebraic property graphs. They are the "plus" operator to complement the
> > "times" operator in our type algebra. A disjoint union type is just like
> a
> > tuple type, but instead of having values for field a AND field b AND
> field
> > c, an instance of a union type has a value for field a XOR field b XOR
> > field c. Let me know if you are not completely sold on union types, and I
> > will provide additional motivation.
>
> Huh. That is an interesting concept. Can you please provide examples?
>

Yes. If you think back to your elementary school algebra, you will recall
four basic arithmetic operations: addition, multiplication, subtraction,
and division. Simple stuff, but let's make things even simpler by throwing
out inverses. So we have: addition and multiplication. You also need unit
elements 0 and 1 with the usual properties. This structure is called a
semiring, and with it you can build up a rich type system and reason about
equations of types. Multiplication represents the concatenation of tuples
-- a × b × c is a type that has a AND b AND c -- whereas addition
represents a choice -- a + b + c is a type that has a XOR b XOR c.

Examples of multiplication are edges (e.g. a knows edge type is the product
of Person and Person; the out-vertex is a person, and the in-vertex is a
person) and properties (e.g. age is a product of Person and the primitive
integer type). For example, you could express the knows type as Person
× Person or as prod{out=Person, in=Person} if you want to give names to the
components of tuples (fields).

Examples of addition are in- or out-types which are a disjunction of other
types. For example, in the TinkerPop classic graph, the name property can
attach to either a Person or a Project, so the type is (Person +
Project) × string, or prod{out=sum{person=Person, project=Project},
in=string} if you want field names.

Just as the teacher made you do at the blackboard, you can distribute
multiplication over a sum, so

(Person + Project) × string = (Person × string) + (Project × string)


 In other words, a name property which can attach either to a person or
project is equivalent to two distinct properties, maybe call them personName
and projectName, which each attach to only one type of vertex.
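
If it helps to see it in code, here is a minimal Java sketch (Java 21
records and sealed interfaces; every class name below is made up for
illustration, none of them are TP4 types) of the product and sum types
above:

// Illustration only: products as records, sums as sealed interfaces.
import java.util.List;

public final class TypeAlgebraSketch {

    record Person(String name) {}
    record Project(String name) {}

    // Sum type: Person + Project. An instance is a Person XOR a Project.
    sealed interface PersonOrProject permits PersonCase, ProjectCase {}
    record PersonCase(Person person) implements PersonOrProject {}
    record ProjectCase(Project project) implements PersonOrProject {}

    // Product type: knows = prod{out=Person, in=Person}.
    record Knows(Person out, Person in) {}

    // Product with a sum component: name = prod{out=Person+Project, in=string}.
    record NameProperty(PersonOrProject out, String in) {}

    public static void main(String[] args) {
        NameProperty n1 = new NameProperty(new PersonCase(new Person("marko")), "marko");
        NameProperty n2 = new NameProperty(new ProjectCase(new Project("lop")), "lop");

        // Distributivity in practice: switching on the sum tag splits the one
        // property type into a "personName" case and a "projectName" case.
        for (NameProperty n : List.of(n1, n2)) {
            String attachedTo = switch (n.out()) {
                case PersonCase p  -> "personName on " + p.person().name();
                case ProjectCase q -> "projectName on " + q.project().name();
            };
            System.out.println(attachedTo + " = " + n.in());
        }
    }
}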

Other fun things you can build with unions include lists, trees, and other
recursive data structures. How do you formalize a "list of people" as a
type? Well, you can think of it in this way:

ListOfPeople = () + (Person) + (Person × Person) + (Person × Person ×
Person) + ...


In other words, a list of people can be either the additive unit (0-tuple),
a single person, a pair of people, a triplet of people... an n-tuple of
people for any n >= 0. You could also write:

ListOfPeople = () + (Person × ListOfPeople)


Products let you concatenate types and tuples to build larger types and
tuples; sums enable choices and pattern matching.
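
And the recursive definition above, sketched the same way (again, purely
illustrative names):

// Illustration only: ListOfPeople = () + (Person × ListOfPeople).
public final class ListOfPeopleSketch {

    record Person(String name) {}

    // A sum of two cases: the 0-tuple (Empty) XOR a product (Person × ListOfPeople).
    sealed interface ListOfPeople permits Empty, Cons {}
    record Empty() implements ListOfPeople {}
    record Cons(Person head, ListOfPeople tail) implements ListOfPeople {}

    static int size(ListOfPeople list) {
        return switch (list) {
            case Empty e -> 0;
            case Cons c  -> 1 + size(c.tail());
        };
    }

    public static void main(String[] args) {
        ListOfPeople pair = new Cons(new Person("vadas"),
                            new Cons(new Person("josh"), new Empty()));
        System.out.println(size(pair)); // 2
    }
}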



> One thing I want to stress. The “universal bytecode” is just standard
> [op,arg*]* bytecode save that data access is via the “universal model's"
> db() instruction. Thus, AND/OR/pattern matching/etc. is all available.
> Likewise union(), repeat(), coalesce(), choose(), etc. are all available.
>
> db().and(as('a').values('knows').as('b'),
>  or(as('a').has('name','marko'),
> as('a').values('created').count().is(gt(1))),
>  as('b').values('created').as('c')).
>  path('c')
>

No disagreement. This is essentially functional pattern matching as
motivated above, though it includes a condition we wouldn't include in the
type system itself: the "created" count.



> As you can see, and()/or() pattern matching is possible and can be nested.
>   *** SIDENOTE: In TP3, such nested and()/or() pattern matching is
> expressed using match() where the root grouping is assumed to be and()’d
> together.
>

Yep.



>   *** SIDENOTE: In TP4, I want to get rid of an explicit match() bytecode
> instruction and replace it with and()/or() instructions with prefix/suffix
> as()s.
>

Hmm. I think the match() syntax is useful, even if you can build match()
expressions out of and() and or(). Or maybe we just point users to
OpenCypher if they want conjunctive query patterns. Jeremy Hanna and I
chatted about this at the conference earlier this week... it is really just
a matter of providing the best syntactic sugar. You CAN do everything that
match() or OpenCypher can do in Gremlin, but this is not to say you always
SHOULD.



>[...]

> Or other tuples, or tagged values. E.g. any edge projects to two vertices,
> > which are (trivial) tuples as opposed to primitive values.
>
> Good point. I started to do some modeling and I’ve been getting some good
> mileage from a new “pointer” primitive. Assume 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-05-06 Thread Marko Rodriguez
Hey Josh,


> One more thing is needed: disjoint unions. I described these in my email on
> algebraic property graphs. They are the "plus" operator to complement the
> "times" operator in our type algebra. A disjoint union type is just like a
> tuple type, but instead of having values for field a AND field b AND field
> c, an instance of a union type has a value for field a XOR field b XOR
> field c. Let me know if you are not completely sold on union types, and I
> will provide additional motivation.

Huh. That is an interesting concept. Can you please provide examples?

>> The instructions:
>>1. relations can be “queried” for matching tuples.
>> 
> 
> Yes.

One thing I want to stress. The “universal bytecode” is just standard 
[op,arg*]* bytecode save that data access is via the “universal model's" db() 
instruction. Thus, AND/OR/pattern matching/etc. is all available. Likewise 
union(), repeat(), coalesce(), choose(), etc. are all available.

db().and(as('a').values('knows').as('b'),
 or(as('a').has('name','marko'),
as('a').values('created').count().is(gt(1))),
 as('b').values('created').as('c')).
 path('c')

As you can see, and()/or() pattern matching is possible and can be nested.
  *** SIDENOTE: In TP3, such nested and()/or() pattern matching is expressed 
using match() where the root grouping is assumed to be and()’d together.
  *** SIDENOTE: In TP4, I want to get rid of an explicit match() bytecode 
instruction and replace it with and()/or() instructions with prefix/suffix 
as()s.
  *** SIDENOTE: In TP4, in general, any nested bytecode that starts with as(x) 
is path(x) and any bytecode that ends with as(y) is where(eq(path(y))).

> 
>>2. tuple values can be projected out to yield primitives.
>> 
> 
> Or other tuples, or tagged values. E.g. any edge projects to two vertices,
> which are (trivial) tuples as opposed to primitive values.

Good point. I started to do some modeling and I’ve been getting some good 
mileage from a new “pointer” primitive. Assume every N-Tuple has a unique ID 
(outside the data model's id space). If so, the TinkerPop toy graph as N-Tuples 
is:

[0][id:1,name:marko,age:29,created:*1,knows:*2]
[1][0:*3]
[2][0:*4,1:*5]
[3][id:3,name:lop,lang:java]
[4][id:2,name:vadas,age:27]
[5][id:4,name:josh,age:32,created:*…]

I know you are thinking that vertices don’t have “outE” projections so this 
isn’t in line with your thinking. However, check this out. If we assume that 
pointers are automatically dereferenced on access, then:

db().has('name','marko').values('knows').values('name') => vadas, josh

Pointers are useful when a tuple has another tuple as a value. Instead of 
nesting, you use a “blank node.” DocumentDBs (with nested lists/maps) would use 
this extensively.
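
To make the pointer idea concrete, here is a rough Java sketch of what I
mean (illustration only, nothing in the codebase): tuple values are either
primitives or pointers to other tuple ids, and a pointer is swapped for its
tuple when you read it.

import java.util.LinkedHashMap;
import java.util.Map;

// Rough illustration of N-Tuples with a "pointer" primitive; pointers dereference on access.
public final class TupleStoreSketch {

    // A pointer is just the id of another tuple (outside the data model's id space).
    record Pointer(int tupleId) {}

    static final Map<Integer, Map<String, Object>> tuples = new LinkedHashMap<>();

    static Object get(int tupleId, String key) {
        Object value = tuples.get(tupleId).get(key);
        // Auto-dereference: a Pointer value resolves to the tuple it points at.
        return (value instanceof Pointer p) ? tuples.get(p.tupleId()) : value;
    }

    public static void main(String[] args) {
        tuples.put(0, Map.of("id", 1, "name", "marko", "knows", new Pointer(2)));
        tuples.put(2, Map.of("0", new Pointer(4), "1", new Pointer(5)));
        tuples.put(4, Map.of("id", 2, "name", "vadas"));
        tuples.put(5, Map.of("id", 4, "name", "josh"));

        // Roughly db().has('name','marko').values('knows'): follow the pointer to tuple [2].
        System.out.println(get(0, "knows"));   // e.g. {0=Pointer[tupleId=4], 1=Pointer[tupleId=5]}
    }
}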

> Grumble... db() is just an alias for select()... grumble…

select() and project() are existing instructions in TP3 (TP4?).

SELECT
db() will iterate all N-Tuples
has() will filter out those N-Tuples with respective key/values.
and()/or() are used for nested pattern matching.

PROJECT
values() will project out the n-tuple values.
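
In plain Java streams, the split looks something like this (illustration
only; just a filter for SELECT and a map for PROJECT over a flat tuple
stream):

import java.util.List;
import java.util.Map;

// Illustration only: db()/has() as a filter over tuples, values() as a projection.
public final class SelectProjectSketch {
    public static void main(String[] args) {
        List<Map<String, Object>> db = List.of(
                Map.<String, Object>of("name", "marko", "age", 29),
                Map.<String, Object>of("name", "vadas", "age", 27));

        // SELECT: db().has("name","marko")   PROJECT: .values("age")
        db.stream()
          .filter(t -> "marko".equals(t.get("name")))
          .map(t -> t.get("age"))
          .forEach(System.out::println);   // 29
    }
}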

> Here, we are kind of mixing fields with property keys. Yes,
> db().has('name', 'marko') can be used to search for elements of any type...
> if that type agrees with the out-type of the "name" relation. In my
> TinkerPop Classic example, the out type of "name" is (Person OR Project),
> so your query will get you people or projects.

Like indices, I don’t think we should introduce types. But this is up for 
further discussion...

> Which is to say that we define the out-type of "name" to be the disjoint
> union of all element types. The type becomes trivial. However, we can also
> be more selective if we want to, restricting "name" only to a small subset
> of types.

Hm… I’m listening. I’m running into problems in my modeling when trying to 
generically fit things into relational tables. Maybe typing is necessary :(.


> Good idea. TP4 can provide several "flavors" of interfaces, each of which
> is idiomatic for each major class of database provider. Meeting the
> providers halfway will make integration that much easier.

Yes. With respect to graphdb providers, they want to think in terms of 
Vertex/Edges/etc. We want to put the bytecode in their language so:

1. It is easier for them to write custom strategies.
2. inV() can operate on their Vertex object without them having to 
implement inV().
*** Basically just like TP3 is now. GraphDB providers implement 
Graph/Vertex/Edge and everything works! However, they will then want to write 
custom instructions/strategies to use their database’s optimizations, such as 
vertex-centric indices for outE('knows').has('stars',gt(3)).inV().


> I think we will see steps like V() and R() in Gremlin, but do not need them
> in bytecode. Again, db() is just select(), V() is just select(), etc. The
> model-specific interfaces 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-05-03 Thread Joshua Shinavier
Hi Marko,

Thanks for the detailed emails. Responses inline.


On Thu, May 2, 2019 at 6:40 AM Marko Rodriguez  wrote:

> [...]
> Thus, there exists a data model that can describe these database
> structures in a database agnostic manner.
> - not in terms of tables, vertices, JSON, column families, …
>

100% with you on this.



> While we call this a “universal model” it is NOT more “general”
> (theoretically powerful) than any other database structure.
>

I agree. We should be trying harder to find equivalences, as opposed to
introducing a "bigger, better, brand-new shiny" data model.



> Reasons for creating a “universal model”.
>
> 1. To have a reduced set of objects for the TP4 VM to consider.
> - edges are just vertices with one incoming and outgoing
> “edge.”
>

Kinda. Let's say edges are elements with two fields. Vertices are elements
with no fields.



> - a column family is just a “map” of rows which are just
> “maps.”
>

Kinda. Let's say a table / column family is a data type with a number of
fields. Equivalently, it is a relation with a number of columns. You
brought up a good point in your previous email w.r.t. "person" vs.
"people", but that's why mappings are needed. A trivial schema mapping
gives you an element type "person" from a relation/table "people" and vice
versa. The table and the type are equivalent.



> - tables are just groupings of schema-equivalent rows.
>

Agreed. The "universal model" just makes an element out of each row.



> 2. To have a limited set of instructions in the TP4 bytecode
> specification.
> - outE/inE/outV/inV are just following direct “links”
> between objects.
>

inV and outV, yes, because they are fields of an edge element. outE and inE
are different, because they are not fields of the vertex. However, they are
functions. You can put them in the same namespace as inV and outV if you
want to; just keep in mind that in terms of relational algebra, they are a
fundamentally different operation.



> - has(), values(), keys(), valueMap(), etc. need not just
> apply to vertices and edges.
>

Agreed.



> 3. To have a simple serialization format.
> - we do not want to ship around
> rows/vertices/edges/documents/columns/etc.
> - we want to make it easy for other languages to integrate
> with the TP4 VM.
> - we want to make it easy to create TP4 VMs in other
> languages.
>

What is easier than a table? Any finite graph in this model is just a
collection of tables which can be shipped around as CSVs, among other
formats.



> 4. To have a theoretical understanding of the relationship between
> the various data structures.
> - “this is just a that” is useful to limit the
> complexities of our codebase and explain to the public how different
> databases relate.
>

Yes.



> [...]
> The objects:
> 1. primitives: floats, doubles, Strings, ints, etc.
>

Yes.



> 2. tuples: key’d collections of primitives. (instances)
> 3. relations: groupings of tuples with ?equivalent? schemas.
> (types)
>

These are the same thing. A tuple is a row, is an element. A relation is a
set of elements/tuples/rows of the same type.

One more thing is needed: disjoint unions. I described these in my email on
algebraic property graphs. They are the "plus" operator to complement the
"times" operator in our type algebra. A disjoint union type is just like a
tuple type, but instead of having values for field a AND field b AND field
c, an instance of a union type has a value for field a XOR field b XOR
field c. Let me know if you are not completely sold on union types, and I
will provide additional motivation.



> The instructions:
> 1. relations can be “queried” for matching tuples.
>

Yes.



> 2. tuple values can be projected out to yield primitives.
>

Or other tuples, or tagged values. E.g. any edge projects to two vertices,
which are (trivial) tuples as opposed to primitive values.


Let’s do a “traversal” from marko to the people he knows.
>
> // g.V().has('name','marko').outE('knows').inV().values('name')
>
> db('person').has('name','marko').as('x').
> db('knows').has('#outV', path('x').by('#id')).as('y').
> db('person').has('#id', path('y').by('#inV')).
>   values('name')
>

I still don't think we need the "db" step, but I think that syntax works --
you are distinguishing between fields and higher-order things like
properties by using hash characters for the field names.



> While the above is a single stream of processing, I will state what each
> line above has at that point in the stream.
> - [#label:person,name:marko,age:29]
>

Keeping in mind that "name" and "age" are property keys as opposed to
fields, yes.



> - [#label:knows,#outV:1,#inV:2,weight:0.5], ...
> - [#label:person,name:vadas,age:27], ...
> - vadas, ...
>

OK.



> 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-05-02 Thread Marko Rodriguez
Hey Josh (others),

I was thinking of our recent divergence in thought. I thought it would be smart 
for me to summarize where we are and to do my best to describe your model so as 
to better understand your perspective and to help you better understand how 
your model will ultimately execute on the TP4 VM.


# WHY A UNIVERSAL MODEL? #
###

Every database data model can be losslessly embedded in every other database 
data model.
- e.g. you can embed a property graph structure in a relational 
structure.
- e.g. you can embed a document structure in a property graph structure.
- e.g. you can embed a wide-column structure in a document structure.
- …
- e.g. you can embed a property graph structure in a Hadoop sequence 
file or Spark RDD.

Thus, there exists a data model that can describe these database structures in 
a database agnostic manner.
- not in terms of tables, vertices, JSON, column families, …

While we call this a “universal model” it is NOT more “general” (theoretically 
powerful) than any other database structure.

Reasons for creating a “universal model”.

1. To have a reduced set of objects for the TP4 VM to consider.
- edges are just vertices with one incoming and outgoing “edge.”
- a column family is just a “map” of rows which are just “maps.”
- tables are just groupings of schema-equivalent rows.
- …
2. To have a limited set of instructions in the TP4 bytecode 
specification.
- outE/inE/outV/inV are just following direct “links” between 
objects.
- has(), values(), keys(), valueMap(), etc. need not just apply 
to vertices and edges.
- …
3. To have a simple serialization format.
- we do not want to ship around 
rows/vertices/edges/documents/columns/etc.
- we want to make it easy for other languages to integrate with 
the TP4 VM.
- we want to make it easy to create TP4 VMs in other languages.
- ...
4. To have a theoretical understanding of the relationship between the 
various data structures.
- “this is just a that” is useful to limit the complexities of 
our codebase and explain to the public how different databases relate.

Without further ado...


# THE UNIVERSAL MODEL #


*** This is as I understand it. I will let Josh decide whether I captured his 
ideas correctly. ***
*** All subsequent x().y().z() expressions are BYTECODE, not GREMLIN (just 
using an easier syntax than [op,arg*]*). ***

The objects:
1. primitives: floats, doubles, Strings, ints, etc.
2. tuples: key’d collections of primitives. (instances)
3. relations: groupings of tuples with ?equivalent? schemas. (types)

The instructions:
1. relations can be “queried” for matching tuples.
2. tuple values can be projected out to yield primitives.

Let’s do a “traversal” from marko to the people he knows.

// g.V().has('name','marko').outE('knows').inV().values('name')

db('person').has('name','marko').as('x').
db('knows').has('#outV', path('x').by('#id')).as('y').
db('person').has('#id', path('y').by('#inV')).
  values('name')

While the above is a single stream of processing, I will state what each line 
above has at that point in the stream.
- [#label:person,name:marko,age:29]
- [#label:knows,#outV:1,#inV:2,weight:0.5], ...
- [#label:person,name:vadas,age:27], ...
- vadas, ...
Database strategies can be smart enough to realize that only the #id or #inV or #outV 
of the previous object is required and thus limit what is actually accessed 
and flow’d through the processing engine.
- [#id:1]
- [#id:0,#inV:2] …
- [#id:2,name:vadas] …
- vadas, ...
*** More on such compiler optimizations (called strategies) later ***
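
A toy sketch of that pruning idea (not an actual TP4 strategy, just an
illustration): given the keys the downstream instructions will touch, the
provider only materializes those keys.

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Toy illustration of field pruning ("only #id/#inV/#outV are needed downstream").
public final class PruningSketch {

    static Map<String, Object> prune(Map<String, Object> tuple, Set<String> needed) {
        return tuple.entrySet().stream()
                .filter(e -> needed.contains(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, Object> knowsEdge =
                Map.of("#label", "knows", "#outV", 1, "#inV", 2, "weight", 0.5);

        // Downstream only needs #id/#inV, so the strategy flows a slimmer tuple.
        System.out.println(prune(knowsEdge, Set.of("#id", "#inV")));   // {#inV=2}
    }
}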

POSITIVE NOTES:

1. All relations are ‘siblings’ accessed via db().
- There is no concept of nesting data. A very flat structure.
2. All subsequent has()/where()/is()/etc.-filter steps after db() 
define the pattern match query.
- It is completely up to the database to determine how to 
retrieve matching tuples.
- For example: using indices, pointer chasing, linear scans w/ 
filter, etc.
3. All subsequent map()/flatmap()/etc. steps are projections of data in 
the tuple.
- The database returns key’d tuples composed of primitives.
- Primitive data can be accessed and further processed. 
(projections)
4. The bytecode describes a computation that is irrespective of the 
underlying database’s encoding of that structure.
- Amazon Neptune, MySQL, Cassandra, Spark, Hadoop, Ignite, etc. 
can be fed the same bytecode and will yield the same 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-30 Thread Marko Rodriguez
Hello,

> First, the "root". While we do need context for traversals, I don't think
> there should be a distinct kind of root for each kind of structure. Once
> again, select(), or operations derived from select() will work just fine.

So given your example below, “root” would be db in this case. 
db is the reference to the structure as a whole.
Within db, substructures exist. 
Logically, this makes sense.
For instance, a relational database’s references don’t leak outside the RDBMS 
into other areas of your computer’s memory.
And there is always one entry point into every structure — the connection. And 
what does that connection point to:
vertices, keyspaces, databases, document collections, etc. 
In other words, “roots.” (Even the JVM has a “root” — it’s called the heap.)

> Want the "person" table? db.select("person"). Want a sequence of vertices
> with the label "person"? db.select("person"). What we are saying in either
> case is "give me the 'person' relation. Don't project any specific fields;
> just give me all the data". A relational DB and a property graph DB will
> have different ways of supplying the relation, but in either case, it can
> hide behind the same interface (TRelation?).

In your lexicon, for both RDBMS and graph:
db.select('person') is saying, select the people table (which is 
composed of a sequence of “person” rows)
db.select('person') is saying, select the person vertices (which is 
composed of a sequence of “person” vertices)
…right off the bat you have the syntax-problem of people vs. person. Tables are 
typically named the plural of the rows. That
doesn’t exist in graph databases as there is just one vertex set (i.e. one 
“table”).

In my lexicon (TP instructions)
db().values('people') is saying, flatten out the person rows of the 
people table.
V().has(label,'person') is saying, flatten out the vertex objects of 
the graph’s vertices and filter out non-person vertices.

Well, that is stupid, why not have the same syntax for both structures?
Because they are different. There are no “person” relations in the classic 
property graph (Neo4j 1.0). There are only vertex relations with a label=person 
entry.
In a relational database there are “person” relations and these are bundled 
into disjoint tables (i.e. relation sets — and schema constrained).

The point I’m making is that instead of trying to fit all these data structures 
into a strict type system that ultimately looks like a bunch of disjoint 
relational sets, let’s mimic the vendor-specified semantics. Let’s take these 
systems at face value and not try to “mathematize” them. If they are 
inconsistent and ugly, fine. If we map them into another system that is 
mathematical and beautiful, great. However, every data structure, from Neo4j’s 
representation for OLTP traversals to that “same” data being OLAP processed as 
Spark RDDs or Hadoop SequenceFiles, will have its ‘oh shits’ (impedance 
mismatches), and that is okay, as this is the reality we are trying to model!

Graph and RDBMs have two different data models (their unique worldview):

RDBMS:   Databases->Tables->Rows->Primitives
GraphDB: Vertices->Edges->Vertices->Edges->Vertices-> ...

Here is a person->knows->person “traversal” in TP4 bytecode over an RDBMS (#key 
are “symbols” (constants)):

db().values(“people”).as(“x”).
db().values(“knows”).as(“y”).
  where(“x”,eq(“y”)).by(#id).by(#outV).
db().values(“people”).as(“z”).
  where(“y”,eq(“z”)).by(#inV).by(#id)
   
Pretty freakin’ disgusting, eh? Here is a person->knows->person “traversal” in 
TP4 bytecode over a property graph:

V().has(#label,”person”).values(#outE).has(#label,”knows”).values(#inV)

So we have two completely different bytecode representations for the same 
computational result. Why?
Because we have two completely different data models!

One is a set of disjoint typed-relations (i.e. RDBMS).
One is a set of nested loosely-typed-relations (i.e. property graphs).

Why not make them the same? Because they are not the same and that is exactly 
what I believe we should be capturing.

Just looking at the two computations above you see that a relational database 
is doing “joins” while a graph database is doing “traversals”.
We have to use path-data to compute a join. We have to use memory! (and we do). 
We don’t have to use path-data to compute a traversal.
We don’t have to use memory! (and we don’t!). That is the fundamental nature of 
the respective computations that are taking place.
That is what gives each system their particular style of computing.
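
To make that concrete, here is a rough sketch (illustration only, not TP
bytecode) of the two styles over the same data: the relational version
joins two tuple sets by matching #outV against #id, while the graph version
just chases the references the vertex already holds.

import java.util.List;
import java.util.Map;

// Illustration of join-vs-traversal over the same data; not TP4 code.
public final class JoinVsTraversalSketch {

    public static void main(String[] args) {
        // Relational style: disjoint typed relations, joined on values.
        List<Map<String, Object>> people = List.of(
                Map.<String, Object>of("#id", 1, "name", "marko"),
                Map.<String, Object>of("#id", 2, "name", "vadas"));
        List<Map<String, Object>> knows = List.of(
                Map.<String, Object>of("#outV", 1, "#inV", 2));

        people.stream()
              .filter(p -> "marko".equals(p.get("name")))
              .flatMap(p -> knows.stream()                               // join: match #outV against #id
                      .filter(k -> k.get("#outV").equals(p.get("#id"))))
              .flatMap(k -> people.stream()
                      .filter(p2 -> p2.get("#id").equals(k.get("#inV"))))
              .forEach(p2 -> System.out.println(p2.get("name")));        // vadas

        // Graph style: the vertex holds direct references; no join, no path memory.
        record Vertex(String name, List<Vertex> knows) {}
        Vertex vadas = new Vertex("vadas", List.of());
        Vertex marko = new Vertex("marko", List.of(vadas));
        marko.knows().forEach(v -> System.out.println(v.name()));        // vadas
    }
}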

NEXT: There is nothing that says you can’t map between the two. Let’s go 
property graph to RDBMS.
- we could make a person table, a software table, a knows table, a 
created table.
- that only works if the property graph is schema-based.
- we could make a single vertex table with another 3 column properties 
table (vertexId,key,value)
- we could…
Which 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-30 Thread Joshua Shinavier
Hi Marko,

I like it. But I still have some constructive criticism. I think a little
more simplicity in the right places will make things like index support,
query optimization, and integration with SEDMs (someone else's data model)
that much easier in the future.

First, the "root". While we do need context for traversals, I don't think
there should be a distinct kind of root for each kind of structure. Once
again, select(), or operations derived from select() will work just fine.
Want the "person" table? db.select("person"). Want a sequence of vertices
with the label "person"? db.select("person"). What we are saying in either
case is "give me the 'person' relation. Don't project any specific fields;
just give me all the data". A relational DB and a property graph DB will
have different ways of supplying the relation, but in either case, it can
hide behind the same interface (TRelation?).

But wait, you say, what if, under the hood, you have a TTable in one
case, and a TSequence in the other? They are so different! That's why the
Dataflow Model is so great; to an extent, you can think of the two as
interchangeable. I think we would get a lot of mileage out of treating them
as interchangeable within TP4.

So instead of a data-model-specific "root", I argue for a universal root
together with a set of relations and what we might call "indexes". An
index is an arrow from a type to a relation which says "give me a
column/value pair, and I will give you all matching tuples from this
relation". The result is another relation. Where data sources differentiate
themselves is by having different relations and indexes.

For example, if the underlying data structure is nothing but a stream of
Trip tuples, you will have a single relation "Trip", and no indexes. Sorry;
you just have to wait for tuples to go by, and filter on them. So if you
say d.select("Trip", "driver") -- where d is a traversal that gets you to a
User -- the machine knows that it can't use "driver" to look up a specific
set of trips; it has to use a filter over all future "Trip" tuples. If, on
the other hand, we have a relational database, we have the option of
indexing on "driver". In this case, d.select("Trip", "driver") may take you
to a specific table like "Trip_by_driver" which has "driver" as a primary
key. The machine recognizes that this index exists, and uses it to answer
the query more efficiently. The alternative is to do a full scan over any
table which contains the "Trip" relation. Since TinkerPop3, we have been
without a vendor-neutral API for indexes, but this is where such an API
would really start to shine. Consider Neo4j's single property indexes,
JanusGraph's composite indexes, and even RDF triple indices (spo, ops,
etc.) as in AllegroGraph in addition to primary keys in relational
databases.
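
To sketch what such a vendor-neutral index API could look like (everything
below is hypothetical -- TRelation, TIndex, and TStructure as written here
do not exist anywhere): an index is a function from a column/value pair to
a sub-relation, a provider advertises which indexes it has, and the machine
falls back to a filtered scan when there is none.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an index is an arrow from a column/value pair to a sub-relation.
public final class IndexSketch {

    interface TRelation {
        Iterable<Map<String, Object>> tuples();                    // worst case: scan/stream everything
    }

    interface TIndex {
        String column();                                           // e.g. "driver"
        TRelation lookup(Object value);                            // matching tuples as a new relation
    }

    interface TStructure {
        TRelation relation(String name);                           // e.g. "Trip"
        Optional<TIndex> index(String relation, String column);    // empty => no index, filter instead
    }

    // Use the index if the provider advertises one, else fall back to a filtered scan.
    static Iterable<Map<String, Object>> select(TStructure db, String rel, String column, Object value) {
        return db.index(rel, column)
                .map(i -> i.lookup(value).tuples())
                .orElseGet(() -> filterScan(db.relation(rel), column, value));
    }

    static Iterable<Map<String, Object>> filterScan(TRelation r, String column, Object value) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> t : r.tuples()) {
            if (value.equals(t.get(column))) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> trips = List.of(
                Map.<String, Object>of("driver", "josh", "city", "santa fe"),
                Map.<String, Object>of("driver", "marko", "city", "san jose"));
        TRelation tripRelation = () -> trips;

        // A provider that advertises an index on Trip.driver (think Trip_by_driver).
        TStructure provider = new TStructure() {
            public TRelation relation(String name) { return tripRelation; }
            public Optional<TIndex> index(String relation, String column) {
                if (!"driver".equals(column)) return Optional.empty();
                return Optional.of(new TIndex() {
                    public String column() { return "driver"; }
                    public TRelation lookup(Object value) {
                        List<Map<String, Object>> hits = new ArrayList<>();
                        for (Map<String, Object> t : trips) {
                            if (value.equals(t.get("driver"))) hits.add(t);
                        }
                        return () -> hits;
                    }
                });
            }
        };

        select(provider, "Trip", "driver", "josh").forEach(System.out::println);
    }
}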

TTuple -- cool. +1

"Enums" -- I agree that enums are necessary, but we need even more: tagged
unions. They are part of the
system of algebraic data types which I described on Friday. An enum is a
special case of a tagged union in which there is no value, just a type tag.
May I suggest something like TValue, which contains a value (possibly
trivial) together with a type tag. This enables ORs and pattern matching.
For example, suppose "created" edges are allowed to point to either
"Project" or "Document" vertices. The in-type of "created" is
union{project:Project, document:Document}. Now the in value of a specific
edge can be TValue("project", [some project vertex]) or TValue("document",
[some document vertex]) and you have the freedom to switch on the type tag
if you want to, e.g. the next step in the traversal can give you the "name"
of the project or the "title" of the document as appropriate.
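
A minimal sketch of what such a TValue could look like (the Java below is
only an illustration of the idea, not a proposed class):

// Illustration only of the tagged-value idea: a type tag plus a (possibly trivial) value.
public final class TValueSketch {

    record TValue(String tag, Object value) {}

    public static void main(String[] args) {
        // The in-type of "created" is union{project:Project, document:Document}.
        TValue in1 = new TValue("project", "some project vertex");
        TValue in2 = new TValue("document", "some document vertex");

        for (TValue in : new TValue[]{in1, in2}) {
            // Switching on the tag picks the appropriate downstream projection.
            String key = switch (in.tag()) {
                case "project"  -> "name";
                case "document" -> "title";
                default         -> throw new IllegalStateException("unknown tag: " + in.tag());
            };
            System.out.println(in.tag() + " projects to property key: " + key);
        }
    }
}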

Multi-properties -- agreed; has() is good enough.

Meta-properties -- again, this is where I think we should have a
lower-level select() operation. Then has() builds on that operation.
Whereas select() matches on fields of a relation, has() matches on property
values and other higher-order things. If you want properties of properties,
don't use has(); use select()/from(). Most of the time, you will just want
to use has().

Agreed that every *entity* should have an id(), and also a label() (though
it should always be possible to infer label() from the context). I would
suggest TEntity (or TElement), which has id(), label(), and value(), where
value() provides the raw value (usually a TTuple) of the entity.
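
Sketched as an interface, that suggestion might look roughly like this
(illustrative only):

// Sketch of the suggested contract; TEntity and its type parameters are illustrative only.
interface TEntity<I, V> {
    I id();
    String label();   // should be inferable from context, but always available
    V value();        // the raw value of the entity, usually a TTuple
}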

Josh



On Mon, Apr 29, 2019 at 10:35 AM Marko Rodriguez 
wrote:

> Hello Josh,
>
> > A has("age",29), for example, operates at a different level of
> abstraction than a
> > has("city","Santa Fe") if "city" is a column in an "addresses" table.
>
> So hasXXX() operators work on TTuples. Thus:
>
> g.V().hasLabel(‘person’).has(‘age’,29)
> g.V().hasLabel(‘address’).has(‘city’,’Santa Fe’)
>
> ..both work as a person-vertex 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-29 Thread Marko Rodriguez
Hey,

Check this out:


Machine machine = LocalMachine.open();
TraversalSource jdbc =
Gremlin.traversal(machine).
withProcessor(PipesProcessor.class).
withStructure(JDBCStructure.class, 
Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test"));
  
System.out.println(jdbc.db().values("people").as("x").
db().values("addresses").as("y").has("name", 
__.path("x").by("name")).
  path("x", "y").toList());
System.out.println("\n\n");
System.out.println(jdbc.db().values("people").as("x").
db().values("addresses").as("y").has("name", 
__.path("x").by("name")).
  path("x", "y").explain().toList());


[[{NAME=marko, AGE=29}, {CITY=santa fe, NAME=marko}], [{NAME=josh, AGE=32}, 
{CITY=san jose, NAME=josh}]]


[Original   [db, values(people)@x, db, 
values(addresses)@y, hasKeyValue(name,[path(x,[value(name)])]), path(x,y,|)]
JDBCStrategy[db(), values(people)@x, db(), values(addresses)@y, 
hasKeyValue(name,[path(x,[value(name)])]), path(x,y,|)]
JDBCQueryStrategy   [jdbc:sql(conn9: url=jdbc:h2:/tmp/test 
user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)]
PipesStrategy   [jdbc:sql(conn9: url=jdbc:h2:/tmp/test 
user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)]
CoefficientStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test 
user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)]
CoefficientVerificationStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test 
user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)]
---
Compilation [FlatMapInitial]
Execution Plan [PipesProcessor] [InitialStep[FlatMapInitial]]]





I basically look for a db.values.db.values.has-pattern in the bytecode and if I 
find it, I try and roll it into a single provider-specific instruction that 
does a SELECT query.

Here is JDBCQueryStrategy (it’s ghetto and error-prone, but I just wanted to get 
the basic concept working):

https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/strategy/JDBCQueryStrategy.java
 

Here is SqlFlatMapStep (hyper-ghetto… but whateva’):

https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/function/flatmap/SqlFlatMap.java
 


Na na!,
Marko.

http://rredux.com 




> On Apr 29, 2019, at 11:50 AM, Marko Rodriguez  wrote:
> 
> Hello Kuppitz,
> 
>> I don't think it's a good idea to keep this mindset for TP4; NULLs are too
>> important in RDBMS. I don't know, maybe you can convince SQL people that
>> dropping a value is the same as setting its value to NULL. It would work
>> for you and me and everybody else who's familiar with Gremlin, but SQL
>> people really love their NULLs….
> 
> Hmm……. I don’t like nulls. Perhaps with time a clever solution will emerge. 
> 
> 
>> I'd prefer to just have special accessors for these. E.g. g.V().meta("id").
>> At least valueMaps would then only have String-keys.
>> I see the issue with that (naming collisions), but it's still better than
>> the enums in my opinion (which became a pain when started to implement
>> GLVs).
> 
> So, TSymbols are not Java enums. They are simply a “primitive”-type that will 
> have a serialization like:
> 
>   symbol[id]
> 
> Meaning, that people can make up Symbols all day long without having to 
> update serializers. How I see them working is that they are Strings prefixed 
> with #.
> 
> g.V().outE() <=>   g.V().values(“#outE”)
> g.V().id()   <=>   g.V().value(“#id”)
> g.V().hasLabel(“person") <=>   g.V().has(“#label”,”person”)
> 
> Now that I type this out, perhaps we don’t even have a TSymbol-class. 
> Instead, any String that starts with # is considered a symbol. Now watch this:
> 
> g.V().label()  <=>   g.V().value(“#label”)
> g.V().labels() <=>   g.V().values(“#label”)
> 
> In this way, we can support Neo4j multi-labels as a Neo4jVertex’s #label-Key 
> references a TSequence.
> 
> g.V(1).label() => TSequence
> g.V(1).labels() => String, String, String, …
> 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-29 Thread Marko Rodriguez
Hello Kuppitz,

> I don't think it's a good idea to keep this mindset for TP4; NULLs are too
> important in RDBMS. I don't know, maybe you can convince SQL people that
> dropping a value is the same as setting its value to NULL. It would work
> for you and me and everybody else who's familiar with Gremlin, but SQL
> people really love their NULLs….

Hmm……. I don’t like nulls. Perhaps with time a clever solution will emerge. 

> I'd prefer to just have special accessors for these. E.g. g.V().meta("id").
> At least valueMaps would then only have String-keys.
> I see the issue with that (naming collisions), but it's still better than
> the enums in my opinion (which became a pain when we started to implement
> GLVs).

So, TSymbols are not Java enums. They are simply a “primitive”-type that will 
have a serialization like:

symbol[id]

Meaning that people can make up Symbols all day long without having to update 
serializers. How I see them working is that they are Strings prefixed with #.

g.V().outE() <=>   g.V().values("#outE")
g.V().id()   <=>   g.V().value("#id")
g.V().hasLabel("person") <=>   g.V().has("#label","person")

Now that I type this out, perhaps we don’t even have a TSymbol-class. Instead, 
any String that starts with # is considered a symbol. Now watch this:

g.V().label()  <=>   g.V().value("#label")
g.V().labels() <=>   g.V().values("#label")

In this way, we can support Neo4j multi-labels as a Neo4jVertex’s #label-Key 
references a TSequence.

g.V(1).label() => TSequence
g.V(1).labels() => String, String, String, …
g.V(1).label().add("programmer")
g.V(1).label().drop("person")

So we could do “meta()”, but then you need respective “hasXXX”-meta() methods. 
I think #symbol is easiest…?
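
A tiny sketch of the “any String starting with # is a symbol” convention
(illustrative helper only):

import java.util.Map;

// Illustration of #-prefixed symbol keys living beside plain property keys.
public final class SymbolSketch {

    static boolean isSymbol(String key) {
        return key.startsWith("#");
    }

    public static void main(String[] args) {
        Map<String, Object> vertex = Map.of("#id", 1, "#label", "person", "name", "marko");

        // g.V(1).label() <=> value("#label"); g.V(1).values("name") <=> plain key access.
        System.out.println(vertex.get("#label")); // person
        System.out.println(isSymbol("#label"));   // true
        System.out.println(isSymbol("name"));     // false
    }
}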

> Also, what I'm wondering about now: Have you thought about Stored
> Procedures and Views in RDBMS? Views can be treated as tables, easy, but
> what about stored procedures? SPs can be found in many more DBMS, would be
> bad to not support them (or hack something ugly together later in the
> development process).

I’m not super versed in RDBMS technology. Can you please explain to me how to 
create a StoredProcedure and the range of outputs a StoredProcedure produces? 
From there, I can try and “Bytecode-ize” it.

Thanks Kuppitz,
Marko.

http://rredux.com 




> On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez  >
> wrote:
> 
>> Hi,
>> 
>> *** This email is primarily for Josh (and Kuppitz). However, if others are
>> interested… ***
>> 
>> So I did a lot of thinking this weekend about structure/ and this morning,
>> I prototyped both graph/ and rdbms/.
>> 
>> This is the way I’m currently thinking of things:
>> 
>>1. There are 4 base types in structure/.
>>- Primitive: string, long, float, int, … (will constrain
>> these at some point).
>>- TTuple: key/value map.
>>- TSequence: an iterable of v objects.
>>- TSymbol: like Ruby, I think we need “enum-like” symbols
>> (e.g., #id, #label).
>> 
>>2. Every structure has a “root.”
>>- for graph its TGraph implements TSequence
>>- for rdbms its a TDatabase implements
>> TTuple
>> 
>>3. Roots implement Structure and thus, are what is generated by
>> StructureFactory.mint().
>>- defined using withStructure().
>>- For graph, its accessible via V().
>>- For rdbms, its accessible via db().
>> 
>>4. There is a list of core instructions for dealing with these
>> base objects.
>>- value(K key): gets the TTuple value for the provided key.
>>- values(K key): gets an iterator of the value for the
>> provided key.
>>- entries(): gets an iterator of T2Tuple objects for the
>> incoming TTuple.
>>- hasXXX(A,B): various has()-based filters for looking
>> into a TTuple and a TSequence
>>- db()/V()/etc.: jump to the “root” of the withStructure()
>> structure.
>>- drop()/add(): behave as one would expect and thus.
>> 
>> 
>> 
>> For RDBMS, we have three interfaces in rdbms/.
>> (machine/machine-core/structure/rdbms)
>> 
>>1. TDatabase implements TTuple // the root
>> structure that indexes the tables.
>>2. TTable implements TSequence> // a table is a sequence
>> of rows
>>3. TRow implements TTuple> // a row has string column
>> names
>> 
>> I then created a new project at machine/structure/jdbc). The classes in
>> here implement the above rdbms/ interfaces/
>> 
>> Here is an RDBMS session:
>> 
>> final Machine machine = LocalMachine.open();
>> final TraversalSource jdbc =
>>Gremlin.traversal(machine).
>>withProcessor(PipesProcessor.class).
>>withStructure(JDBCStructure.class,
>> Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test"));
>> 
>> 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-29 Thread Marko Rodriguez
Hello Josh,

> A has("age",29), for example, operates at a different level of abstraction 
> than a
> has("city","Santa Fe") if "city" is a column in an "addresses" table.

So hasXXX() operators work on TTuples. Thus:

g.V().hasLabel('person').has('age',29)
g.V().hasLabel('address').has('city','Santa Fe')

…both work, as a person-vertex and an address-vertex are TTuples. If these were 
tables, then:

jdbc.db().values('people').has('age',29)
jdbc.db().values('addresses').has('city','Santa Fe')

…also works as both people and addresses are TTables which extend 
TTuple.

In summary, if it’s a TTuple, then hasXXX() is good to go.

// IGNORE UNTIL AFTER READING NEXT SECTION //
*** SIDENOTE: A TTable (which is a TSequence) could have Symbol-based metadata. 
Thus TTable.value(#label) -> “people.” If so, then
jdbc.db().hasLabel("people").has("age",29)

> At least, they
> are different if the data model allows for multi-properties,
> meta-properties, and hyper-edges. A property is something that can either
> be there, attached to an element, or not be there. There may also be more
> than one such property, and it may have other properties attached to it. A
> column of a table, on the other hand, is always there (even if its value is
> allowed to be null), always has a single value, and cannot have further
> properties attached.

1. Multi-properties.

Multi-properties work because if name references a TSequence, then it’s the 
sequence that you analyze with has(). This is another reason why TSequence is 
important. It’s a reference to a “stream” so there isn’t another layer of 
tuple-nesting.

// assume v[1] has name={marko,mrodriguez,markor}
g.V(1).value('name') => TSequence
g.V(1).values('name') => marko, mrodriguez, markor
g.V(1).has('name','marko') => v[1]

2. Meta-properties

// assume v[1] has name=[value:marko,creator:josh,timestamp:12303] // i.e. a 
tuple value
g.V(1).value('name') => TTuple // doh!
g.V(1).value('name').value('value') => marko
g.V(1).value('name').value('creator') => josh

So things get screwy — however, it only gets screwy when you mix your 
“metadata” key/values with your “data” key/values. This is why I think TSymbols 
are important. Imagine the following meta-property tuple for v[1]:

[#value:marko,creator:josh,timestamp:12303]

If you do g.V(1).value('name'), we could look to the value indexed by the 
symbol #value, thus => “marko”.
If you do g.V(1).values('name'), you would get back a TSequence with a single 
TTuple being the meta-property.
If you do g.V(1).values('name').value(), we could get the value indexed by the 
symbol #value.
If you do g.V(1).values('name').value('creator'), it will return the primitive 
string “josh”.
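
A rough sketch of that resolution rule (illustration only): value(key)
looks through a tuple-valued property to its #value entry, while reading
'creator' goes straight at the metadata.

import java.util.Map;

// Illustration of resolving value(key) through a meta-property tuple via #value.
public final class MetaPropertySketch {

    // If the stored value is itself a tuple, answer its #value entry; otherwise the value itself.
    static Object value(Map<String, Object> element, String key) {
        Object v = element.get(key);
        return (v instanceof Map<?, ?> tuple) ? tuple.get("#value") : v;
    }

    public static void main(String[] args) {
        Map<String, Object> nameMeta =
                Map.of("#value", "marko", "creator", "josh", "timestamp", 12303);
        Map<String, Object> v1 = Map.of("name", nameMeta, "age", 29);

        System.out.println(value(v1, "name"));                            // marko
        System.out.println(((Map<?, ?>) v1.get("name")).get("creator"));  // josh
        System.out.println(value(v1, "age"));                             // 29
    }
}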

I believe that the following symbols should be recommended for use across all 
data structures.
#id, #label, #key, #value
…where id(), label(), key(), value() are tuple.get(Symbol). Other symbols for 
use with propertygraph/ include:
#outE, #inV, #inE, #outV, #bothE, #bothV

> In order to simplify user queries, you can let has() and values() do double
> duty, but I still feel that there are lower-level operations at play, at a
> logical level even if not at a bytecode level. However, expressing a
> traversal in terms of its lowest-level relational operations may also be
> useful for query optimization.

One thing that I’m doing, that perhaps you haven’t caught onto yet, is that I’m 
not modeling everything in terms of “tables.” Each data structure is trying to 
stay as pure to its conceptual model as possible. Thus, there are no “joins” in 
property graphs as outE() references a TSequence, where TEdge is an 
interface that extends TTuple. You can just walk without doing any type of 
INNER JOIN. Now, if you model a property graph in a relational database, you 
will have to strategize the bytecode accordingly! Just a heads up in case you 
haven’t noticed that.

Thanks for your input,
Marko.

http://rredux.com 



> 
> Josh
> 
> 
> 
> On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez  >
> wrote:
> 
>> Hi,
>> 
>> *** This email is primarily for Josh (and Kuppitz). However, if others are
>> interested… ***
>> 
>> So I did a lot of thinking this weekend about structure/ and this morning,
>> I prototyped both graph/ and rdbms/.
>> 
>> This is the way I’m currently thinking of things:
>> 
>>1. There are 4 base types in structure/.
>>- Primitive: string, long, float, int, … (will constrain
>> these at some point).
>>- TTuple: key/value map.
>>- TSequence: an iterable of v objects.
>>- TSymbol: like Ruby, I think we need “enum-like” symbols
>> (e.g., #id, #label).
>> 
>>2. Every structure has a “root.”
>>- for graph its TGraph implements TSequence
>>- for rdbms its a TDatabase implements
>> TTuple
>> 
>>3. Roots implement Structure and thus, are what is generated by
>> 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-29 Thread Daniel Kuppitz
>
> we don’t support ‘null' in TP


I don't think it's a good idea to keep this mindset for TP4; NULLs are too
important in RDBMS. I don't know, maybe you can convince SQL people that
dropping a value is the same as setting its value to NULL. It would work
for you and me and everybody else who's familiar with Gremlin, but SQL
people really love their NULLs

TSymbol: like Ruby, I think we need “enum-like” symbols (e.g., #id, #label).


I'd prefer to just have special accessors for these. E.g. g.V().meta("id").
At least valueMaps would then only have String-keys.
I see the issue with that (naming collisions), but it's still better than
the enums in my opinion (which became a pain when we started to implement
GLVs).

Also, what I'm wondering about now: Have you thought about Stored
Procedures and Views in RDBMS? Views can be treated as tables, easy, but
what about stored procedures? SPs can be found in many other DBMSs; it would
be bad not to support them (or hack something ugly together later in the
development process).

Cheers,
Daniel


On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez 
wrote:

> Hi,
>
> *** This email is primarily for Josh (and Kuppitz). However, if others are
> interested… ***
>
> So I did a lot of thinking this weekend about structure/ and this morning,
> I prototyped both graph/ and rdbms/.
>
> This is the way I’m currently thinking of things:
>
> 1. There are 4 base types in structure/.
> - Primitive: string, long, float, int, … (will constrain
> these at some point).
> - TTuple: key/value map.
> - TSequence: an iterable of v objects.
> - TSymbol: like Ruby, I think we need “enum-like” symbols
> (e.g., #id, #label).
>
> 2. Every structure has a “root.”
> - for graph its TGraph implements TSequence
> - for rdbms its a TDatabase implements
> TTuple
>
> 3. Roots implement Structure and thus, are what is generated by
> StructureFactory.mint().
> - defined using withStructure().
> - For graph, its accessible via V().
> - For rdbms, its accessible via db().
>
> 4. There is a list of core instructions for dealing with these
> base objects.
> - value(K key): gets the TTuple value for the provided key.
> - values(K key): gets an iterator of the value for the
> provided key.
> - entries(): gets an iterator of T2Tuple objects for the
> incoming TTuple.
> - hasXXX(A,B): various has()-based filters for looking
> into a TTuple and a TSequence
> - db()/V()/etc.: jump to the “root” of the withStructure()
> structure.
> - drop()/add(): behave as one would expect and thus.
>
> 
>
> For RDBMS, we have three interfaces in rdbms/.
> (machine/machine-core/structure/rdbms)
>
> 1. TDatabase implements TTuple // the root
> structure that indexes the tables.
> 2. TTable implements TSequence> // a table is a sequence
> of rows
> 3. TRow implements TTuple> // a row has string column
> names
>
> I then created a new project at machine/structure/jdbc). The classes in
> here implement the above rdbms/ interfaces/
>
> Here is an RDBMS session:
>
> final Machine machine = LocalMachine.open();
> final TraversalSource jdbc =
> Gremlin.traversal(machine).
> withProcessor(PipesProcessor.class).
> withStructure(JDBCStructure.class,
> Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test"));
>
> System.out.println(jdbc.db().toList());
> System.out.println(jdbc.db().entries().toList());
> System.out.println(jdbc.db().value("people").toList());
> System.out.println(jdbc.db().values("people").toList());
> System.out.println(jdbc.db().values("people").value("name").toList());
> System.out.println(jdbc.db().values("people").entries().toList());
>
> This yields:
>
> []
> [PEOPLE:]
> []
> [, ]
> [marko, josh]
> [NAME:marko, AGE:29, NAME:josh, AGE:32]
>
> The bytecode of the last query is:
>
> [db(), values(people),
> entries]
>
> JDBCDatabase implements TDatabase, Structure.
> *** JDBCDatabase is the root structure and is referenced by db()
> *** (CRUCIAL POINT)
>
> Assume another table called ADDRESSES with two columns: name and city.
>
>
> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).value(“city”)
>
> The above is equivalent to:
>
> SELECT city FROM people,addresses WHERE people.name=addresses.name
>
> If you want to do an inner join (a product), you do this:
>
>
> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).as(“y”).path(“x”,”y")
>
> The above is equivalent to:
>
> SELECT * FROM addresses INNER JOIN people ON people.name=addresses.name
>
> NOTES:
> 1. Instead of select(), we simply jump to the root via db() (or
> V() for graph).
> 2. Instead of 

Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)

2019-04-29 Thread Joshua Shinavier
Hi Marko,

I will respond in more detail tomorrow (I'm a late-night-thinking,
early-morning-writing kind of guy) but yes I think this is cool, so long as
we are not overloading the steps with different levels of abstraction.
A has("age",
29), for example, operates at a different level of abstraction than a
has("city",
"Santa Fe") if "city" is a column in an "addresses" table. At least, they
are different if the data model allows for multi-properties,
meta-properties, and hyper-edges. A property is something that can either
be there, attached to an element, or not be there. There may also be more
than one such property, and it may have other properties attached to it. A
column of a table, on the other hand, is always there (even if its value is
allowed to be null), always has a single value, and cannot have further
properties attached. The same goes for values().

In order to simplify user queries, you can let has() and values() do double
duty, but I still feel that there are lower-level operations at play, at a
logical level even if not at a bytecode level. However, expressing a
traversal in terms of its lowest-level relational operations may also be
useful for query optimization.

Josh



On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez 
wrote:

> Hi,
>
> *** This email is primarily for Josh (and Kuppitz). However, if others are
> interested… ***
>
> So I did a lot of thinking this weekend about structure/ and this morning,
> I prototyped both graph/ and rdbms/.
>
> This is the way I’m currently thinking of things:
>
> 1. There are 4 base types in structure/.
> - Primitive: string, long, float, int, … (will constrain
> these at some point).
> - TTuple: key/value map.
> - TSequence: an iterable of v objects.
> - TSymbol: like Ruby, I think we need “enum-like” symbols
> (e.g., #id, #label).
>
> 2. Every structure has a “root.”
> - for graph its TGraph implements TSequence
> - for rdbms its a TDatabase implements
> TTuple
>
> 3. Roots implement Structure and thus, are what is generated by
> StructureFactory.mint().
> - defined using withStructure().
> - For graph, its accessible via V().
> - For rdbms, its accessible via db().
>
> 4. There is a list of core instructions for dealing with these
> base objects.
> - value(K key): gets the TTuple value for the provided key.
> - values(K key): gets an iterator of the value for the
> provided key.
> - entries(): gets an iterator of T2Tuple objects for the
> incoming TTuple.
> - hasXXX(A,B): various has()-based filters for looking
> into a TTuple and a TSequence
> - db()/V()/etc.: jump to the “root” of the withStructure()
> structure.
> - drop()/add(): behave as one would expect and thus.
>
> 
>
> For RDBMS, we have three interfaces in rdbms/.
> (machine/machine-core/structure/rdbms)
>
> 1. TDatabase implements TTuple // the root
> structure that indexes the tables.
> 2. TTable implements TSequence> // a table is a sequence
> of rows
> 3. TRow implements TTuple> // a row has string column
> names
>
> I then created a new project at machine/structure/jdbc). The classes in
> here implement the above rdbms/ interfaces/
>
> Here is an RDBMS session:
>
> final Machine machine = LocalMachine.open();
> final TraversalSource jdbc =
> Gremlin.traversal(machine).
> withProcessor(PipesProcessor.class).
> withStructure(JDBCStructure.class,
> Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test"));
>
> System.out.println(jdbc.db().toList());
> System.out.println(jdbc.db().entries().toList());
> System.out.println(jdbc.db().value("people").toList());
> System.out.println(jdbc.db().values("people").toList());
> System.out.println(jdbc.db().values("people").value("name").toList());
> System.out.println(jdbc.db().values("people").entries().toList());
>
> This yields:
>
> []
> [PEOPLE:]
> []
> [, ]
> [marko, josh]
> [NAME:marko, AGE:29, NAME:josh, AGE:32]
>
> The bytecode of the last query is:
>
> [db(), values(people),
> entries]
>
> JDBCDatabase implements TDatabase, Structure.
> *** JDBCDatabase is the root structure and is referenced by db()
> *** (CRUCIAL POINT)
>
> Assume another table called ADDRESSES with two columns: name and city.
>
>
> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).value(“city”)
>
> The above is equivalent to:
>
> SELECT city FROM people,addresses WHERE people.name=addresses.name
>
> If you want to do an inner join (a product), you do this:
>
>
> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).as(“y”).path(“x”,”y")
>
> The above is equivalent to:
>
> SELECT * FROM addresses INNER JOIN people ON