Hi Mark,

Exactly. I never saw Gremlin-Py until now and just noticed it on the Apache TinkerPop homepage. That is good stuff. Moreover, as you say, there is a distinction between:
  1. Writing Gremlin in a host language.
  2. Communicating to a GremlinServer-compliant server in a host language.

The (1) is about query syntax and the (2) is about protocol stuff. Lots of the libraries either confound the two or just do (2), with (1) simply being a Groovy String (cheesy). I would like to see a lot more of (1) in the community libraries as I think this is one of the big selling points of Gremlin -- write in your native language.

BTW, I added Gremlin-Py to the description in the "host language embedding" section here:
  http://www.planettinkerpop.org/#gremlin (2 scrolls down).

Thanks for your thoughts,
Marko.

http://markorodriguez.com

On Apr 14, 2016, at 7:06 AM, Mark Henderson <emehr...@gmail.com> wrote:

> I've written "native object to Gremlin" libs in both PHP and Python and it
> isn't too bad/not too far from Groovy. The biggest issues were around indices
> [..] (when it had that format) and closures "{x -> ...}", but otherwise both
> langs allowed for easy query building.
>
> It basically looked like this in PHP:
>
> $g = Gremlin();
> $g->V()->has('"name"','mark');
> echo (string)$g; //g.V().has("name",SOME_BOUND_VAR_1)
>
> Works pretty much the same with the Python lib that I've been building
> (https://github.com/emehrkay/gremlinpy).
>
> If we wanted to actually execute the query on every step, that wouldn't be
> too difficult to implement with Gremlinpy. Gremlinpy is a simple linked list,
> it looks at g.V().has('"name"', 'mark') as three token objects with a shared
> pool of bound parameters. It creates the string query and parameters
> dictionary when you cast the list to a string. The only change needed would
> be to bind in a library like Gremlinclient
> (https://github.com/davebshow/gremlinclient), build the query with every
> step, and send it to the server.
>
> res = g.V() # sends request
> res2 = g.V().has('"name"', 'mark') # second request
> ...
>
> The remaining difficulty would be deciding what gets bound.
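To make the "token objects with a shared pool of bound parameters" idea concrete, here is a minimal Python sketch. It is not Gremlinpy's actual code or API: the names TokenQuery and Raw and the BOUND_n parameter names are made up for illustration, and the rule "everything not wrapped in Raw gets bound" is an assumption of this sketch.

```python
class Raw:
    """Marks text that is emitted verbatim instead of being bound (assumption of this sketch)."""

    def __init__(self, text):
        self.text = text


class TokenQuery:
    """Hypothetical builder: a list of step tokens sharing one pool of bound parameters."""

    def __init__(self, source="g"):
        self.source = source
        self.tokens = []     # one (step_name, rendered_args) tuple per step
        self.bindings = {}   # shared pool: generated parameter name -> value

    def _render(self, arg):
        if isinstance(arg, Raw):
            return arg.text
        # Anything else is replaced by a generated parameter name and bound.
        name = "BOUND_{}".format(len(self.bindings) + 1)
        self.bindings[name] = arg
        return name

    def step(self, name, *args):
        self.tokens.append((name, [self._render(a) for a in args]))
        return self

    def V(self, *args):
        return self.step("V", *args)

    def has(self, *args):
        return self.step("has", *args)

    def __str__(self):
        # The script is only assembled when the token list is cast to a string.
        parts = [self.source]
        for name, args in self.tokens:
            parts.append("{}({})".format(name, ",".join(args)))
        return ".".join(parts)


q = TokenQuery().V().has(Raw('"name"'), "mark")
print(str(q))      # g.V().has("name",BOUND_1)
print(q.bindings)  # {'BOUND_1': 'mark'}
```

The script string and the bindings dictionary would then be handed to whatever driver talks to the server; that call is left out here on purpose.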
> Maybe you can pass in a key-value pair for what you want bound:
>
> res = g.V().has('"name"',{'NAME':'mark'}) # g.V().has("name",NAME)
>
>
> On Tuesday, April 12, 2016 at 10:54:08 PM UTC-4, Dmill wrote:
> Yes, a lot of the points you bring up are valid.
>
> One of the main problems with stringifying everything is that it does not
> allow for some of the stuff I mentioned in my PS, namely "smart
> merges". This query building behavior that makes use of scopes is
> unfortunately the standard for frameworks in the industry.
> This is mostly due to the SQL heritage and its declarative nature; ordering
> of "steps" doesn't matter so it allows for easy "after the fact" client side
> filtering. It's not uncommon to have a base query that gets altered by some
> filtering data. In some cases it's a simple has() that needs to be injected
> somewhere, in other cases it's a repeat() that needs to be completely altered.
> Use cases can get a little complicated here but in its simplest form imagine
> having to add/remove entries to/from a match(). Of course that scenario works
> well with a toString approach but for other steps, not so well. Our
> experience has been that the builder needs to be aware of the step's
> signatures to resolve merges.
>
> So sure this is another problem entirely, in the end users can't really do
> this with string queries either. But for widespread adoption it would be best
> if the query builder could handle these scenarios.
>
> Also to bounce off of some of your comments:
>
> > $id -> "~id"
>
> > $label -> "~label"
>
> > g.V().out("%%x")
>
> > $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> All of the above are absolutely possible. But it's a lot to keep in mind for
> users that are already trying to figure out how Gremlin works. Now they also
> need to translate gremlin-groovy into gremlin-php.
> One of the advantages of going the hard route and keeping track of all step
> signatures instead of a toString approach is that you can significantly
> reduce the above cases. The builder can resolve quite a few of these
> automatically and when conflicts arise it can do its best to resolve them and
> throw/log a warning telling the user how to make the query explicit.
>
> > For your Date example, you would have to have a special "toString()" for PHP
> > dates to Java dates (or whichever backend ScriptEngine is being used).
>
> There are no PHP Dates [insert desperate crying emoji here]. PHP sucks with
> typing. It's got its good points but this kind of stuff is not one of them.
> Basically PHP Dates come in various forms, from Integer timestamps to String
> and only the user really knows what he wants. We can provide this
> functionality like you did with long() but it's another thing to keep in mind.
>
> One point we haven't gone over is lambdas. We can't really toString
> these. I guess this is where customStep() or script() come into play.
>
> To wrap it up, a toString query builder is absolutely an option and could
> cover a lot of the API. In fact in PHP we could magically make any API method
> available, $g->something("~label", "lolo") would stringify to
> g.something(label, "lolo") regardless of whether or not the step exists. But
> this involves quite a few language specific alterations and doesn't provide
> much (if any) functional benefit.
> It would be so much easier for people to just write a gremlin-groovy string
> as it's well documented and doesn't need any extra knowledge.
> If on the other hand the query builder has features like those mentioned in
> the PS or earlier in this post, it's well worth the effort. I believe most
> people who build their own query builders do so to support some form of extra
> feature they wouldn't have by using gremlin-groovy string queries.
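The "magically make any API method available" idea mentioned above has a direct counterpart in Python's __getattr__ hook. The following is only a sketch under assumptions: the class name MagicTraversal is made up, and treating strings that start with "~" as bare tokens is this sketch's reading of the $g->something("~label", "lolo") example, not an existing library's behavior.

```python
class MagicTraversal:
    """Hypothetical builder: every unknown method name becomes a Gremlin step."""

    def __init__(self, source="g"):
        self._script = source

    @staticmethod
    def _render(arg):
        if isinstance(arg, str):
            if arg.startswith("~"):
                # "~label"-style markers are emitted as bare tokens (assumption).
                return arg[1:]
            escaped = arg.replace("\\", "\\\\").replace('"', '\\"')
            return '"' + escaped + '"'
        return str(arg)

    def __getattr__(self, step_name):
        # Python calls __getattr__ only when normal lookup fails, so real
        # attributes such as _script and _render never reach this point.
        if step_name.startswith("_"):
            raise AttributeError(step_name)

        def step(*args):
            rendered = ", ".join(self._render(a) for a in args)
            self._script += ".{}({})".format(step_name, rendered)
            return self

        return step

    def __str__(self):
        return self._script


print(MagicTraversal().something("~label", "lolo"))  # g.something(label, "lolo")
```

As the message says, nothing here checks that the step exists or that the arguments match a signature; the string is built regardless.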
> But such a query builder enters the realm of non-trivial (although not
> unachievable). A first step in helping people make these builders would be to
> provide an easily parseable list of signatures for the most desirable
> classes. Maybe something along the lines of a yaml file.
>
> Anyways I'm just thinking out loud at this point.
>
>
> On Tue, Apr 12, 2016 at 9:42 PM, Marko Rodriguez <okram...@gmail.com> wrote:
> Hi Dylan,
>
> Your email is excellent. Thank you for breaking things down for me. Here are
> some responses.
>
>> 1. Method overloading :
>>
>> abstract class Query {
>>     public function has(PropertyKey $key); //1
>>     public function has(PropertyKey $key, Object $value); //2
>>     public function has(Label $label, String $value); //3
>>     public function has(VertexId $id, Long $value); //4
>>     public function has(VertexId $id, Int $value); //5
>>     public function has(VertexId $id, Predicate $p); //6
>> }
>>
>> The above is illegal in languages like PHP (or javascript?). Instead we're
>> stuck with :
>>
>> abstract class Query {
>>     abstract public function has(array $args);
>> }
>>
>> We're then left to figure out what is what in the array and sort out how we
>> need to stringify the output.
>
> I was thinking, why would you need to introspect into the array? Just
> toString() each element in the array with a comma (,) in between. For
> instance:
>
> * has("age",32) ==> has(["age",32]) ==> has("age",32) // all String array
>   elements need " " wrappers.
> * has("age") ==> has(["age"]) ==> has("age")
> * has("person","name","marko") ==> has(["person","name","marko"]) ==>
>   has("person","name","marko")
>
> Thus, Gremlin-PHP would have one has()-method and that method just iterates
> the arguments and toString()'s things accordingly with comma delimiters.
>
>> If the user does $g->V()->has("label", "user") do we add quotes to the first
>> argument or is it a label/id? What about the second argument, is it a
>> predicate? etc. This gets complex very quickly.
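The "toString() each element with a comma in between" approach described above can be sketched in a few lines of Python. The function names render_arg and render_step and the Long wrapper are hypothetical; Long stands in for the Number::long(36) idea that appears elsewhere in this thread and emits the Groovy-style 36l literal.

```python
class Long:
    """Hypothetical wrapper marking an integer to be emitted as a Groovy long literal."""

    def __init__(self, value):
        self.value = int(value)


def render_arg(arg):
    # Strings get " " wrappers (with embedded quotes and backslashes escaped).
    if isinstance(arg, str):
        escaped = arg.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    # Wrapped longs get the "l" suffix used in the thread's example.
    if isinstance(arg, Long):
        return "{}l".format(arg.value)
    # Everything else is stringified as-is.
    return str(arg)


def render_step(name, args):
    # One generic routine serves every arity: join the rendered arguments with commas.
    return "{}({})".format(name, ",".join(render_arg(a) for a in args))


print(render_step("has", ["age", 32]))                   # has("age",32)
print(render_step("has", ["person", "name", "marko"]))   # has("person","name","marko")
print(render_step("has", ["id", Long(36)]))              # has("id",36l)
```

Booleans, dates and predicates are not handled here; each would need its own branch in render_arg, which is where the per-type decisions discussed in this thread would end up.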
>
> The universal rule --- if it's a String, add quotes. If it's not, don't.
>
> $id -> "~id"
> $label -> "~label"
>
> $g->V()->has($label,"user")
>
>> And what if I had $g->V()->has("id", 36) . PHP only supports Int so one of
>> the two signatures (4 or 5) needs to give as we have a major conflict. This
>> example is fictional for has() but I've run into this on a couple of other
>> methods, just can't remember which.
>
> Yea, that sucks. Well, you could do this:
>
> $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> This would, of course, bind you to Gremlin-Groovy as the ultimate
> ScriptEngine.
>
>> Another example would be g.V().has(id, neq(m)) . We could imagine the
>> following PHP equivalent $g->V()->has(new Id(), Predicate::neq("m")) where
>> Id() is a class that helps us recognize this type, and neq() a static method
>> of Predicate. However "m" has to be passed as string and we have no clue
>> what m is... is this a string or a binding or a server side variable? More
>> on this in point 2.
>
> Well, this is the same problem in Gremlin-Java. where() is ALWAYS bindings
> and has() is ALWAYS objects. Thus:
>
> $g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m"))
> // again, strings always get " "-wrappers.
>
>> To close things off here there's also the case of signatures like
>> out(String... edgeLabels) that need their own logic.
>
> Again, just toString() each object in the array and insert commas between.
>
> $g->V()->out(["created","knows"]) ==> g.V().out("created","knows")
>
>
>> Conclusion: There's a lot of manual work that needs to go into separating
>> the logic between signatures and handling special cases. Part of this can be
>> automated if your language supports magic getters and setters by parsing the
>> javadocs for example. But not only is that an if, the rest will still be
>> manual. This step is maintenance heavy.
>
> I see the biggest pains being:
>
> 1. Having to implement each method.
> 2. Having to have helper classes for P, T, Order, Column, etc.
>
> This is simply a matter of fat fingering stuff in and not anything
> implementation-wise that is problematic.
>
>> 2. Conflicts
>>
>> Because we're manipulating strings it's really hard to tell a few items
>> apart (binding vs server variable vs string; there's a reason why I separate
>> binding and variable).
>>
>> For instance in the example above of gremlin : g.V().has(id, neq(m)) vs PHP:
>> $g->V()->has(new Id(), Predicate::neq("m")) we don't know what to make of m.
>> Is this a binding or a string or even a variable that was previously set in
>> the session? There is no clean way of working around this.
>>
>> Firstly because bindings tend to be handled on a different layer than the
>> query builder.
>> Secondly because methods that will help in avoiding the conflicts will also
>> lose typing data.
>> For example : $g->V()->has(new Id(), Predicate::neq(Query::variable("m")))
>> could generate the proper query by outputting m without quotes but we don't
>> know what type m is so in some cases it might be tricky to select the proper
>> signature.
>>
>> Conclusion: there are a number of ways around this point. We use prefixes
>> B_m or V_m and a hack to ignore signatures altogether when in this scenario.
>> It's not that these aren't solvable, they just aren't trivial.
>
> Hm. Yea, I'm not too smart about server variables. Out of my butt you could
> create a "crazy String" for those and then do replaceAll-style updates.
>
> g.V().out("%%x")
>
> replaceAll("%%x",x)
>
> ?
>
>
>> 3. API
>>
>> Why we would need traversal, graph, vertex and edge APIs is quite self
>> explanatory for everyday work with Gremlin. I'm just going to explain why we
>> would also require some Java classes as well.
>>
>> Because JSON is lossy by nature we often have to cast variables to certain
>> types. For example by submitting this kind of script :
>> g.V(1).property("date", new Date(B_m)); with B_m = timestamp.
>> This is just another case that is difficult to cover.
>>
>> This adds onto the other points in making a gremlin language variant
>> non-trivial.
>>
>> All of the above can be worked around by using an injection method that just
>> appends a string to the query : $g->customStep("V().has(id, neq(m))") but
>> that's beside the point.
>
>
> Ah. Classy. Note that in ?3.2.1? we might support script()-step.
>
> g.V().script("out().map{ it.name }")
>
> ...to enable lambdas in remote'd traversals (Server or OLAP).
>
> For your Date example, you would have to have a special "toString()" for PHP
> dates to Java dates (or whichever backend ScriptEngine is being used).
>
> $g->V()->property("date", $phpDate)
>
> Your Array-string-ifier would not just call toString() blindly on the objects
> of the array arguments, but would do stuff like:
>
> if(object instanceof String)
>   return "\"" + object.toString() + "\"";
> else if(object instanceof Date)
>   return "new Date(…)";
> else
>   return object.toString();
>
>
>> Final Conclusion: It's not a trivial task. Of course the examples above are
>> very verbose and achieving something closer to gremlin in style is possible
>> but there are always going to be "gotchas" users will need to keep in mind.
>> A while back in TP2 I released a php library for this (the one we currently
>> use in our projects). I decided to remove it as it was too much maintenance
>> to get it to work across user cases so I decided to concentrate on our own
>> one (some choices made in 2. wouldn't have worked for other cases).
>> I'm convinced there's got to be a way of reconciling everything and getting
>> this to work flawlessly but it's going to require a lot of thought/work.
>>
>>
>> PS: I mentioned some other points like managing multiple versions of gremlin
>> (for two lines of releases) which is a real headache.
>> For performance it may be good to allow the builder to handle multiple
>> lines, which comes with its load of complications as well.
>> And then there's the ability to "block" queries and either inject them into
>> each other or merge them together which simplifies unit testing and extends
>> functionality :
>>
>> $query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
>> // Some logic here accesses new information and realizes the query needs altering
>> $query->getFlag("flagname")->out("hates", true); // true for merge
>> $query->toString(); // g.V().out('likes', 'hates').has('age', 20)
>>
>> But this point alone could warrant its own email as it is relatively
>> complex. Though TP3 has simplified some cases thanks to union() and some
>> other steps.
>>
>> Our builder supports all of the above so if you have any questions feel free
>> to ask me.
>>
>> Phew that was long. I'll add this to the ticket in a bit.
>
>
> Yes, maintenance seems the biggest pain. Every new method to Gremlin-Java
> requires updates to Gremlin-PHP ---- perhaps there is a programmatic way to
> introspect the Java source file (or JavaDoc) and generate the code
> automagically?
>
> public GraphTraversal out(final String... edgeLabels)
>   ==auto-write==>
> out(Array... edgeLabels) {
>   $string -> $string + ".out(" + StringHelper::toString(edgeLabels) + ")";
> }
>
>
> If you could do that, then the only code you actually have to write/maintain
> (besides the introspector above) is StringHelper which does all the fancy
> String conversion of arguments.
>
> Thanks Dylan for your time,
> Marko.
>
> http://markorodriguez.com
>
>>
>> On Tue, Apr 12, 2016 at 4:37 PM, Marko Rodriguez <okram...@gmail.com> wrote:
>> Hello everyone,
>>
>> Please see the section entitled "Host Language Embedding" here:
>> http://www.planettinkerpop.org/#gremlin (3 sections down)
>>
>> When I was writing up this section, I noticed that most of the language
>> drivers that are advertised on our homepage
>> (http://tinkerpop.incubator.apache.org/#graph-libraries) know how to talk to
>> Gremlin Server via web sockets, REST, etc., but rely on the user to create a
>> String of their graph traversal and submit it. For instance, here is a
>> snippet from the Gremlin-PHP documentation:
>>
>> $db = new Connection([
>>     'host' => 'localhost',
>>     'graph' => 'graph',
>>     'username' => 'pomme',
>>     'password' => 'hardToCrack'
>> ]);
>> //you can set $db->timeout = 0.5; if you wish
>> $db->open();
>> $db->send('g.V(2)');
>> //do something with result
>> $db->close();
>>
>> $db->send(String) is great, but it would be better if the user didn't have
>> to leave PHP.
>>
>> Please see this ticket:
>> https://issues.apache.org/jira/browse/TINKERPOP-1232
>>
>> I think for non-JVM languages, it would be nice if these drivers (PHP,
>> JavaScript, Python, etc.) didn't require the user to explicitly create
>> Gremlin-XXX Strings, but instead either used JINI or model-3 in the ticket
>> above. Let's look at model-3 as I think it's the easiest and most general.
>>
>> For instance, they would have a class in their native language that would
>> mirror the GraphTraversal API. *** I don't know any other languages well
>> enough, so I'm just going to do this in Groovy :), hopefully you get the
>> generalized point. ***
>>
>> public class Test {
>>
>>   String s;
>>
>>   public Test(final String source) {
>>     s = source;
>>   }
>>
>>   public Test() {
>>     s = "__"; // anonymous traversal, so nesting renders as __.outE(...)
>>   }
>>
>>   public Test V() {
>>     s = s + ".V()";
>>     return this;
>>   }
>>
>>   public Test outE(final String label) {
>>     s = s + ".outE(\"${label}\")";
>>     return this;
>>   }
>>
>>   public Test repeat(final Test test) {
>>     s = s + ".repeat(${test.toString()})";
>>     return this;
>>   }
>>
>>   public String toString() {
>>     return s;
>>   }
>> }
>>
>> Then, via fluency (function composition) and nesting, you could generate a
>> Gremlin-Groovy (or whichever ScriptEngine language) traversal String in the
>> backend.
>>
>> gremlin> g = new Test("g");
>> ==>g
>> gremlin> g.V().outE("knows")
>> ==>g.V().outE("knows")
>> gremlin>
>> gremlin> g = new Test("g");
>> ==>g
>> gremlin> g.V().repeat(new Test().outE("knows"))
>> ==>g.V().repeat(__.outE("knows"))
>> gremlin>
>>
>> From there, that String is then submitted as you normally do with your
>> driver. For instance, with Gremlin-PHP, via $db->send(String).
>>
>> Of course, if your driver is already on a JVM language, there is no reason
>> to do this (e.g. Gremlin-Scala), but if you are not on the JVM, this gives
>> the user host language embedding and a more natural "look and feel."
>> Moreover, if your language doesn't use "dot notation," you would use the
>> natural idioms of your language.
>>
>> $g->V->outE("knows")
>>
>> If anyone is interested in updating their non-JVM language driver to use
>> this model, I would like to write a blog post about it. Or perhaps, a
>> tutorial for language designers.
>>
>> Thoughts?,
>> Marko.
>>
>> http://markorodriguez.com
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Gremlin-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to gremlin-user...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/gremlin-users/47A92EFF-CB36-41EA-B252-6823A42F4D7B%40gmail.com.
>> For more options, visit https://groups.google.com/d/optout.
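
For readers outside the JVM, the string-building class from the first message of this thread (the one written in Groovy) can be sketched in Python roughly as follows. This is a sketch under assumptions, not a real driver: the class name Traversal is made up, only three steps are covered, and nested traversals are started from "__" (TinkerPop's anonymous traversal class) so that the inner traversal renders as __.outE("knows").

```python
class Traversal:
    """Hypothetical Python port of the string-building traversal class."""

    def __init__(self, source="__"):
        # "__" starts an anonymous traversal, used when nesting inside a step.
        self.script = source

    def V(self):
        self.script += ".V()"
        return self

    def outE(self, label):
        self.script += '.outE("{}")'.format(label)
        return self

    def repeat(self, inner):
        # Nesting: the inner traversal is rendered via its own __str__.
        self.script += ".repeat({})".format(inner)
        return self

    def __str__(self):
        return self.script


g = Traversal("g")
print(g.V().outE("knows"))
# g.V().outE("knows")

g = Traversal("g")
print(g.V().repeat(Traversal().outE("knows")))
# g.V().repeat(__.outE("knows"))
```

The resulting string is what would be handed to the driver's existing send call, for instance $db->send(String) in the Gremlin-PHP snippet quoted above.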