Hey, Check this out:
############################ Machine machine = LocalMachine.open(); TraversalSource jdbc = Gremlin.traversal(machine). withProcessor(PipesProcessor.class). withStructure(JDBCStructure.class, Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test")); System.out.println(jdbc.db().values("people").as("x”). db().values("addresses").as("y").has("name", __.path("x").by("name")). path("x", "y").toList()); System.out.println(“\n\n”) System.out.println(jdbc.db().values("people").as("x”). db().values("addresses").as("y").has("name", __.path("x").by("name")). path("x", "y").explain().toList()); ############################ [[{NAME=marko, AGE=29}, {CITY=santa fe, NAME=marko}], [{NAME=josh, AGE=32}, {CITY=san jose, NAME=josh}]] [Original [db, values(people)@x, db, values(addresses)@y, hasKeyValue(name,[path(x,[value(name)])]), path(x,y,|)] JDBCStrategy [db(<database#conn9: url=jdbc:h2:/tmp/test user=>), values(people)@x, db(<database#conn10: url=jdbc:h2:/tmp/test user=>), values(addresses)@y, hasKeyValue(name,[path(x,[value(name)])]), path(x,y,|)] JDBCQueryStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)] PipesStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)] CoefficientStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)] CoefficientVerificationStrategy [jdbc:sql(conn9: url=jdbc:h2:/tmp/test user=,x,y,SELECT x.*, y.* FROM people AS x, addresses AS y WHERE x.name=y.name)] ------------------------------- Compilation [FlatMapInitial] Execution Plan [PipesProcessor] [InitialStep[FlatMapInitial]]] I basically look for a db.values.db.values.has-pattern in the bytecode and if I find it, I try and roll it into a single provider-specific instruction that does a SELECT query. Here is JDBCQueryStrategy (its ghetto and error prone, but I just wanted to get the basic concept working): https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/strategy/JDBCQueryStrategy.java <https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/strategy/JDBCQueryStrategy.java> Here is SqlFlatMapStep (hyper-ghetto… but whateva’): https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/function/flatmap/SqlFlatMap.java <https://github.com/apache/tinkerpop/blob/7142dc16d8fc81ad8bd4090096b42e5b9b1744f4/java/machine/structure/jdbc/src/main/java/org/apache/tinkerpop/machine/structure/jdbc/function/flatmap/SqlFlatMap.java> Na na!, Marko. http://rredux.com <http://rredux.com/> > On Apr 29, 2019, at 11:50 AM, Marko Rodriguez <okramma...@gmail.com> wrote: > > Hello Kuppitz, > >> I don't think it's a good idea to keep this mindset for TP4; NULLs are too >> important in RDBMS. I don't know, maybe you can convince SQL people that >> dropping a value is the same as setting its value to NULL. It would work >> for you and me and everybody else who's familiar with Gremlin, but SQL >> people really love their NULLs…. > > Hmm……. I don’t like nulls. Perhaps with time a clever solution will emerge. > ???? > >> I'd prefer to just have special accessors for these. E.g. g.V().meta("id"). >> At least valueMaps would then only have String-keys. >> I see the issue with that (naming collisions), but it's still better than >> the enums in my opinion (which became a pain when started to implement >> GLVs). > > So, TSymbols are not Java enums. They are simply a “primitive”-type that will > have a serialization like: > > symbol[id] > > Meaning, that people can make up Symbols all day long without having to > update serializers. How I see them working is that they are Strings prefixed > with #. > > g.V().outE() <=> g.V().values(“#outE”) > g.V().id() <=> g.V().value(“#id”) > g.V().hasLabel(“person") <=> g.V().has(“#label”,”person”) > > Now that I type this out, perhaps we don’t even have a TSymbol-class. > Instead, any String that starts with # is considered a symbol. Now watch this: > > g.V().label() <=> g.V().value(“#label”) > g.V().labels() <=> g.V().values(“#label”) > > In this way, we can support Neo4j multi-labels as a Neo4jVertex’s #label-Key > references a TSequence<String>. > > g.V(1).label() => TSequence<String> > g.V(1).labels() => String, String, String, … > g.V(1).label().add(“programmer”) > g.V(1).label().drop(“person”) > > So we could do “meta()”, but then you need respective “hasXXX”-meta() > methods. I think #symbol is easiest .. ? > >> Also, what I'm wondering about now: Have you thought about Stored >> Procedures and Views in RDBMS? Views can be treated as tables, easy, but >> what about stored procedures? SPs can be found in many more DBMS, would be >> bad to not support them (or hack something ugly together later in the >> development process). > > I’m not super versed in RDBMS technology. Can you please explain to me how to > create a StoreProcedure and the range of outputs a StoredProcedure produces? > From there, I can try and “Bytecode-ize” it. > > Thanks Kuppitz, > Marko. > > http://rredux.com <http://rredux.com/> > > > > >> On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez <okramma...@gmail.com >> <mailto:okramma...@gmail.com>> >> wrote: >> >>> Hi, >>> >>> *** This email is primarily for Josh (and Kuppitz). However, if others are >>> interested… *** >>> >>> So I did a lot of thinking this weekend about structure/ and this morning, >>> I prototyped both graph/ and rdbms/. >>> >>> This is the way I’m currently thinking of things: >>> >>> 1. There are 4 base types in structure/. >>> - Primitive: string, long, float, int, … (will constrain >>> these at some point). >>> - TTuple<K,V>: key/value map. >>> - TSequence<V>: an iterable of v objects. >>> - TSymbol: like Ruby, I think we need “enum-like” symbols >>> (e.g., #id, #label). >>> >>> 2. Every structure has a “root.” >>> - for graph its TGraph implements TSequence<TVertex> >>> - for rdbms its a TDatabase implements >>> TTuple<String,TTable> >>> >>> 3. Roots implement Structure and thus, are what is generated by >>> StructureFactory.mint(). >>> - defined using withStructure(). >>> - For graph, its accessible via V(). >>> - For rdbms, its accessible via db(). >>> >>> 4. There is a list of core instructions for dealing with these >>> base objects. >>> - value(K key): gets the TTuple value for the provided key. >>> - values(K key): gets an iterator of the value for the >>> provided key. >>> - entries(): gets an iterator of T2Tuple objects for the >>> incoming TTuple. >>> - hasXXX(A,B): various has()-based filters for looking >>> into a TTuple and a TSequence >>> - db()/V()/etc.: jump to the “root” of the withStructure() >>> structure. >>> - drop()/add(): behave as one would expect and thus. >>> >>> ———— >>> >>> For RDBMS, we have three interfaces in rdbms/. >>> (machine/machine-core/structure/rdbms) >>> >>> 1. TDatabase implements TTuple<String,TTable> // the root >>> structure that indexes the tables. >>> 2. TTable implements TSequence<TRow<?>> // a table is a sequence >>> of rows >>> 3. TRow<V> implements TTuple<String,V>> // a row has string column >>> names >>> >>> I then created a new project at machine/structure/jdbc). The classes in >>> here implement the above rdbms/ interfaces/ >>> >>> Here is an RDBMS session: >>> >>> final Machine machine = LocalMachine.open(); >>> final TraversalSource jdbc = >>> Gremlin.traversal(machine). >>> withProcessor(PipesProcessor.class). >>> withStructure(JDBCStructure.class, >>> Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test")); >>> >>> System.out.println(jdbc.db().toList()); >>> System.out.println(jdbc.db().entries().toList()); >>> System.out.println(jdbc.db().value("people").toList()); >>> System.out.println(jdbc.db().values("people").toList()); >>> System.out.println(jdbc.db().values("people").value("name").toList()); >>> System.out.println(jdbc.db().values("people").entries().toList()); >>> >>> This yields: >>> >>> [<database#conn1: url=jdbc:h2:/tmp/test user=>] >>> [PEOPLE:<table#PEOPLE>] >>> [<table#people>] >>> [<row#PEOPLE:1>, <row#PEOPLE:2>] >>> [marko, josh] >>> [NAME:marko, AGE:29, NAME:josh, AGE:32] >>> >>> The bytecode of the last query is: >>> >>> [db(<database#conn1: url=jdbc:h2:/tmp/test user=>), values(people), >>> entries] >>> >>> JDBCDatabase implements TDatabase, Structure. >>> *** JDBCDatabase is the root structure and is referenced by db() >>> *** (CRUCIAL POINT) >>> >>> Assume another table called ADDRESSES with two columns: name and city. >>> >>> >>> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).value(“city”) >>> >>> The above is equivalent to: >>> >>> SELECT city FROM people,addresses WHERE people.name=addresses.name >>> >>> If you want to do an inner join (a product), you do this: >>> >>> >>> jdbc.db().values(“people”).as(“x”).db().values(“addresses”).has(“name”,eq(path(“x”).by(“name”))).as(“y”).path(“x”,”y") >>> >>> The above is equivalent to: >>> >>> SELECT * FROM addresses INNER JOIN people ON people.name=addresses.name >>> >>> NOTES: >>> 1. Instead of select(), we simply jump to the root via db() (or >>> V() for graph). >>> 2. Instead of project(), we simply use value() or values(). >>> 3. Instead of select() being overloaded with by() join syntax, we >>> use has() and path(). >>> - like TP3 we will be smart about dropping path() data >>> once its no longer referenced. >>> 4. We can also do LEFT and RIGHT JOINs (haven’t thought through >>> FULL OUTER JOIN yet). >>> - however, we don’t support ‘null' in TP so I don’t know >>> if we want to support these null-producing joins. ? >>> >>> LEFT JOIN: >>> * If an address doesn’t exist for the person, emit a “null”-filled >>> path. >>> >>> jdbc.db().values(“people”).as(“x”). >>> db().values(“addresses”).as(“y”). >>> choose(has(“name”,eq(path(“x”).by(“name”))), >>> identity(), >>> path(“y”).by(null).as(“y”)). >>> path(“x”,”y") >>> >>> SELECT * FROM addresses LEFT JOIN people ON people.name=addresses.name >>> >>> RIGHT JOIN: >>> >>> jdbc.db().values(“people”).as(“x”). >>> db().values(“addresses”).as(“y”). >>> choose(has(“name”,eq(path(“x”).by(“name”))), >>> identity(), >>> path(“x”).by(null).as(“x”)). >>> path(“x”,”y") >>> >>> >>> SUMMARY: >>> >>> There are no “low level” instructions. Everything is based on the standard >>> instructions that we know and love. Finally, if not apparent, the above >>> bytecode chunks would ultimately get strategized into a single SQL query >>> (breadth-first) instead of one-off queries (depth-first) to improve >>> performance. >>> >>> Neat?, >>> Marko. >>> >>> http://rredux.com <http://rredux.com/> <http://rredux.com/ >>> <http://rredux.com/>> >