I think writing "Gremlin/Groovy" in a host language is pretty awesome as long as it isn't too far off from writing actual Gremlin. I can revive my PHP project if it would be helpful to the community. A JavaScript version would probably get the most attention from developers today, but JS objects, even with ES6, aren't flexible enough (Proxies might help) to avoid writing a full one-to-one API equivalent of Gremlin, let alone mimicking Groovy. A Ruby version seems doable by implementing `method_missing`.
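To make the catch-all idea concrete, here is a rough sketch in Python rather than Ruby, with `__getattr__` standing in for `method_missing`. The `Traversal` class and its `_render` helper are made up for illustration. This is not how Gremlinpy is implemented, and it simply quotes strings instead of binding them as parameters.

```python
import json


class Traversal:
    """Hypothetical sketch: builds a Gremlin-Groovy string from attribute calls."""

    def __init__(self, source="g"):
        self._script = source

    def __getattr__(self, step):
        # Only called for attributes that don't exist, so any step name works,
        # much like Ruby's method_missing.
        if step.startswith("_"):
            raise AttributeError(step)

        def call(*args):
            rendered = ", ".join(self._render(a) for a in args)
            self._script += ".{}({})".format(step, rendered)
            return self  # return the builder so calls can be chained

        return call

    @staticmethod
    def _render(arg):
        # Strings get double quotes; everything else goes through str().
        # That is fine for ints and nested Traversal objects, but booleans,
        # dates, etc. would need their own cases.
        if isinstance(arg, str):
            return json.dumps(arg)
        return str(arg)

    def __str__(self):
        return self._script


g = Traversal("g")
query = str(g.V().has("name", "mark").out("knows"))
print(query)  # g.V().has("name", "mark").out("knows")
```

Note that the builder mutates itself on every step, so each query should start from a fresh `Traversal("g")`. No per-step methods are written, which also means a misspelled step name is happily turned into a string and only fails on the server.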
Thanks for adding Gremlinpy to the new site (I need to clean up the code a bit *shame*).

On Thursday, April 14, 2016 at 9:34:40 AM UTC-4, Marko A. Rodriguez wrote:
>
> Hi Mark,
>
> Exactly. I never saw Gremlin-Py until now and just noticed it on the Apache TinkerPop homepage. That is good stuff. Moreover, as you say, there is a distinction between:
>
> 1. Writing Gremlin in a host language.
> 2. Communicating to a GremlinServer-compliant server in a host language.
>
> The (1) is about query syntax and the (2) is about protocol stuffs.
>
> Lots of the libraries either confound the two or just do (2) with (1) simply being a Groovy String (cheesy).
>
> I would like to see a lot more of (1) in the community libraries as I think this is one of the big selling points of Gremlin -- write in your native language.
>
> BTW, I added Gremlin-Py to the description in the "host language embedding" section here: http://www.planettinkerpop.org/#gremlin (2 scrolls down).
>
> Thanks for your thoughts,
> Marko.
>
> http://markorodriguez.com
>
> On Apr 14, 2016, at 7:06 AM, Mark Henderson <emeh...@gmail.com> wrote:
>
> I've written "native object to Gremlin" libs in both PHP and Python and it isn't too bad/not too far from Groovy. The biggest issues were around indices [..] (when it had that format) and closures "{x -> ...}", but otherwise both langs allowed for easy query building.
>
> It basically looked like this in PHP:
>
>     $g = Gremlin();
>     $g->V()->has('"name"', 'mark');
>     echo (string)$g; // g.V().has("name",SOME_BOUND_VAR_1)
>
> Works pretty much the same with the Python lib that I've been building (https://github.com/emehrkay/gremlinpy).
>
> If we wanted to actually execute the query on every step, that wouldn't be too difficult to implement with Gremlinpy. Gremlinpy is a simple linked list; it looks at g.V().has('"name"', 'mark') as three token objects with a shared pool of bound parameters.
> It creates the string query and parameters dictionary when you cast the list to a string. The only change needed would be to bind in a library like Gremlinclient (https://github.com/davebshow/gremlinclient), build the query with every step, and send it to the server.
>
>     res = g.V() # sends request
>     res2 = g.V().has('"name"', 'mark') # second request
>     ...
>
> The remaining difficulty would be deciding what gets bound. Maybe you can pass in a key/value pair for what you want bound:
>
>     res = g.V().has('"name"', {'NAME': 'mark'}) # g.V().has("name",NAME)
>
>
> On Tuesday, April 12, 2016 at 10:54:08 PM UTC-4, Dmill wrote:
>
> Yes, a lot of the points you bring up are valid.
>
> One of the main problems with stringifying everything is that it does not allow for some of the stuff I mentioned in my PS, namely "smart merges". This query-building behavior that makes use of scopes is unfortunately the standard for frameworks in the industry.
> This is mostly due to the SQL heritage and its declarative nature; ordering of "steps" doesn't matter, so it allows for easy "after the fact" client-side filtering. It's not uncommon to have a base query that gets altered by some filtering data. In some cases it's a simple has() that needs to be injected somewhere; in other cases it's a repeat() that needs to be completely altered.
> Use cases can get a little complicated here, but in its simplest form imagine having to add/remove entries to/from a match(). Of course that scenario works well with a toString approach, but for other steps, not so well. Our experience has been that the builder needs to be aware of the steps' signatures to resolve merges.
>
> So sure, this is another problem entirely; in the end users can't really do this with string queries either. But for widespread adoption it would be best if the query builder could handle these scenarios.
>
> Also, to bounce off some of your comments:
>
> > $id -> "~id"
> > $label -> "~label"
> > g.V().out("%%x")
> > $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> All of the above are absolutely possible. But it's a lot to keep in mind for users that are already trying to figure out how Gremlin works. Now they also need to translate gremlin-groovy into gremlin-php.
> One of the advantages of going the hard route and keeping track of all step signatures instead of a toString approach is that you can significantly reduce the above cases. The builder can resolve quite a few of these automatically, and when conflicts arise it can do its best to resolve them and throw/log a warning telling the user how to make the query explicit.
>
> > For your Date example, you would have to have a special "toString()" for PHP dates to Java dates (or whichever backend ScriptEngine is being used).
>
> There are no PHP Dates [insert desperate crying emoji here]. PHP sucks with typing. It's got its good points but this kind of stuff is not one of them. Basically PHP Dates come in various forms, from Integer timestamps to String, and only the user really knows what he wants. We can provide this functionality like you did with long(), but it's another thing to keep in mind.
>
> One point we haven't gone over is lambdas. We can't really toString these. I guess this is where customStep() or script() come into play.
>
> To wrap it up, a toString query builder is absolutely an option and could cover a lot of the API. In fact in PHP we could magically make any API method available: $g->something("~label", "lolo") would stringify to g.something(label, "lolo") regardless of whether or not the step exists. But this involves quite a few language-specific alterations and doesn't provide much (if any) functional benefit.
> It would be so much easier for people to just write a gremlin-groovy string, as it's well documented and doesn't need any extra knowledge.
> If on the other hand the query builder has features like those mentioned in the PS or earlier in this post, it's well worth the effort. I believe most people who build their own query builders do so to support some form of extra feature they wouldn't have by using gremlin-groovy string queries. But such a query builder enters the realm of non-trivial (although not unachievable). A first step in helping people make these builders would be to provide an easily parseable list of signatures for the most desirable classes. Maybe something along the lines of a YAML file.
>
> Anyways, I'm just thinking out loud at this point.
>
>
> On Tue, Apr 12, 2016 at 9:42 PM, Marko Rodriguez <okram...@gmail.com> wrote:
>
> Hi Dylan,
>
> Your email is excellent. Thank you for breaking things down for me. Here are some responses.
>
> *1. Method overloading:*
>
>     abstract class Query {
>         public function has(PropertyKey $key); //1
>         public function has(PropertyKey $key, Object $value); //2
>         public function has(Label $label, String $value); //3
>         public function has(VertexId $id, Long $value); //4
>         public function has(VertexId $id, Int $value); //5
>         public function has(VertexId $id, Predicate $p); //6
>     }
>
> The above is illegal in languages like PHP (or JavaScript?). Instead we're stuck with:
>
>     abstract class Query {
>         public function has(Array $args);
>     }
>
> We're then left to figure out what is what in the array and sort out how we need to stringify the output.
>
>
> I was thinking, why would you need to introspect into the array? Just toString() each element in the array with a comma (,) in between. For instance:
>
> * has("age",32) ==> has(["age",32]) ==> has("age",32) // all String array elements need " " wrappers.
> * has("age") ==> has(["age"]) ==> has("age")
> * has("person","name","marko") ==> has(["person","name","marko"]) ==> has("person","name","marko")
>
> Thus, Gremlin-PHP would have one has() method, and that method just iterates over the arguments and toString()s each one accordingly, with comma delimiters.
>
> If the user does $g->V()->has("label", "user") do we add quotes to the first argument or is it a label/id? What about the second argument, is it a predicate? etc. This gets complex very quickly.
>
>
> The universal rule: if it's a String, add quotes. If it's not, don't.
>
>     $id -> "~id"
>     $label -> "~label"
>
>     $g->V()->has($label,"user")
>
> And what if I had $g->V()->has("id", 36)? PHP only supports Int, so one of the two signatures (4 or 5) needs to give as we have a major conflict. This example is fictional for has() but I've run into this on a couple of other methods, just can't remember which.
>
>
> Yea, that sucks. Well, you could do this:
>
>     $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)
>
> This would, of course, bind you to Gremlin-Groovy as the ultimate ScriptEngine.
>
> Another example would be g.V().has(id, neq(m)). We could imagine the following PHP equivalent: $g->V()->has(new Id(), Predicate::neq("m")), where Id() is a class that helps us recognize this type, and neq() a static method of Predicate. However "m" has to be passed as a string and we have no clue what m is... is this a string or a binding or a server-side variable? More on this in point *2.*
>
>
> Well, this is the same problem in Gremlin-Java. where() is ALWAYS bindings and has() is ALWAYS objects. Thus:
>
>     $g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m")) // again, strings always get " " wrappers.
>
> To close things off here, there's also the case of signatures like out(String... edgeLabels) that need their own logic.
>
>
> Again, just toString() each object in the array and insert commas between.
>
>     $g->V()->out(["created","knows"]) ==> g.V().out("created","knows")
>
>
> *Conclusion*: There's a lot of manual work that needs to go into separating the logic between signatures and handling special cases. Part of this can be automated if your language supports magic getters and setters, by parsing the javadocs for example. But not only is that an if, the rest will still be manual. This step is maintenance heavy.
>
>
> I see the biggest pains being:
>
> 1. Having to implement each method.
> 2. Having to have helper classes for P, T, Order, Column, etc.
>
> This is simply a matter of fat fingering stuff in and not anything implementation-wise that is problematic.
>
> *2. Conflicts*
>
> Because we're manipulating strings it's really hard to tell a few items apart (binding vs server variable vs string; there's a reason why I separate binding and variable).
>
> For instance, in the example above of *gremlin:* g.V().has(id, neq(m)) vs *PHP:* $g->V()->has(new Id(), Predicate::neq("m")), we don't know what to make of m. Is this a binding or a string or even a variable that was previously set in the session? There is no clean way of working around this.
>
> Firstly because bindings tend to be handled on a different layer than the query builder.
> Secondly because methods that will help in avoiding the conflicts will also lose typing data.
> For example: $g->V()->has(new Id(), Predicate::neq(Query::variable("m"))) could generate the proper query by outputting m without quotes, but we don't know what type m is, so in some cases it might be tricky to select the proper signature.
>
> *Conclusion*: there are a number of ways around this point. We use prefixes B_m or V_m and a hack to ignore signatures altogether when in this scenario. It's not that these aren't solvable, they just aren't trivial.
>
>
> Hm. Yea, I'm not too smart about server variables.
> Out of my butt you could create a "crazy String" for those and then do replaceAll-style updates.
>
>     g.V().out("%%x")
>
>     replaceAll("%%x",x)
>
> ?
>
>
> *3. API*
>
> Why we would need traversal, graph, vertex and edge APIs is quite self-explanatory for everyday work with Gremlin. I'm just going to explain why we would also require some Java classes as well.
>
> Because JSON is lossy by nature we often have to cast variables to certain types. For example by submitting these kinds of scripts: g.V(1).property("date", new Date(B_m)); with B_m = timestamp. This is just another case that is difficult to cover.
>
> This adds onto the other points in making a gremlin language variant non-trivial.
>
> All of the above can be worked around by using an injection method that just appends a string to the query: $g->customStep("V().has(id, neq(m))") but that's beside the point.
>
>
> Ah. Classy. Note that in 3.2.1 (?) we might support script()-step.
>
>     g.V().script("out().map{ it.name }")
>
> …to enable lambdas in remote'd traversals (Server or OLAP).
>
> For your Date example, you would have to have a special "toString()" for PHP dates to Java dates (or whichever backend ScriptEngine is being used).
>
>     $g->V()->property("data", phpDate)
>
> Your Array-string-ifier would not just call toString() blindly on the objects of the array arguments, but would do stuff like:
>
>     if(object instanceof String)
>         return "\"" + object.toString() + "\"";
>     else if(object instanceof Date)
>         return "new Date(…)";
>     else
>         return object.toString();
>
>
> *Final Conclusion:* It's not a trivial task. Of course the examples above are very verbose and achieving something closer to gremlin in style is possible, but there are always going to be "gotchas" users will need to keep in mind. A while back in TP2 I released a PHP library for this (the one we currently use in our projects).
> I decided to remove it as it was too much maintenance to get it to work across use cases, so I decided to concentrate on our own one (some choices made in *2.* wouldn't have worked for other cases).
> I'm convinced there's got to be a way of reconciling everything and getting this to work flawlessly, but it's going to require a lot of thought/work.
>
>
> PS: I mentioned some other points like managing multiple versions of gremlin (for two lines of releases), which is a real headache.
> For performance it may be good to allow the builder to handle multiple lines, which comes with its load of complications as well.
> And then there's the ability to "block" queries and either inject them into each other or merge them together, which simplifies unit testing and extends functionality:
>
>     $query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
>     // Some logic here accesses new information and realizes the query needs altering
>     $query->getFlag("flagname")->out("hates", true); // true for merge
>     $query->toString(); // g.V().out('likes', 'hates').has('age', 20)
>
> But this point alone could warrant its own email as it is relatively complex. Though TP3 has simplified some cases thanks to union() and some other steps.
>
> Our builder supports all of the above so if you have any questions feel free to ask me.
>
> Phew that was long. I'll add this to the ticket in a bit.
>
>
>
> Yes, maintenance seems the biggest pain. Every new method to Gremlin-Java requires updates to Gremlin-PHP -- perhaps there is a programmatic way to introspect the Java source file (or JavaDoc) and generate the code automagically?
>
>     public GraphTraversal out(final String… edgeLabels)
>     ==auto-write==>
>     out(Array… edgeLabels) {
>         $string -> $string + ".out(" + StringHelper::toString(edgeLabels) + ")";
>     }
>
>
> If you could do that, then the only code you actually have to write/maintain (besides the introspector above) is StringHelper, which does all the fancy String conversion of arguments.
>
> Thanks Dylan for your time,
> Marko.
>
> http://markorodriguez.com
>
>
> On Tue, Apr 12, 2016 at 4:37 PM, Marko Rodriguez <okram...@gmail.com> wrote:
>
> Hello everyone,
>
> Please see the section entitled "Host Language Embedding" here: http://www.planettinkerpop.org/#gremlin (3 sections down)
>
> When I was writing up this section, I noticed that most of the language drivers that are advertised on our homepage (http://tinkerpop.incubator.apache.org/#graph-libraries) know how to talk to Gremlin Server via web sockets, REST, etc., but rely on the user to create a String of their graph traversal and submit it. For instance, here is a snippet from the Gremlin-PHP documentation:
>
>     $db = new Connection([
>         'host' => 'localhost',
>         'graph' => 'graph',
>         'username' => 'pomme',
>         'password' => 'hardToCrack'
>     ]);
>     //you can set $db->timeout = 0.5; if you wish
>     $db->open();
>     $db->send('g.V(2)');
>     //do something with result
>     $db->close();
>
>
> $db->send(String) is great, but it would be better if the user didn't have to leave PHP.
>
> Please see this ticket: https://issues.apache.org/jira/browse/TINKERPOP-1232
>
> I think for non-JVM languages, it would be nice if these drivers (PHP, JavaScript, Python, etc.) didn't require the user to explicitly create Gremlin-XXX Strings, but instead either used JINI or model-3 in the ticket above. Let's look at model-3 as I think it's the easiest and most general.
>
> For instance, they would have a class in their native language that would mirror the GraphTraversal API.
> *** I don't know any other languages well enough, so I'm just going to do this in Groovy :), hopefully you get the generalized point. ***
>
>     public class Test {
>
>         String s;
>
>         public Test(final String source) {
>             s = source;
>         }
>
>         public Test() {
>             s = "__"; // anonymous traversal, so nested steps render as __.outE(...)
>         }
>
>         public Test V() {
>             s = s + ".V()";
>             return this;
>         }
>
>         public Test outE(final String label) {
>             s = s + ".outE(\"${label}\")";
>             return this;
>         }
>
>         public Test repeat(final Test test) {
>             s = s + ".repeat(${test.toString()})";
>             return this;
>         }
>
>         public String toString() {
>             return s;
>         }
>     }
>
>
> Then, via fluency (function composition) and nesting, you could generate a Gremlin-Groovy (or whichever ScriptEngine language) traversal String in the backend.
>
>     gremlin> g = new Test("g");
>     ==>g
>     gremlin> g.V().outE("knows")
>     ==>g.V().outE("knows")
>     gremlin>
>     gremlin> g = new Test("g");
>     ==>g
>     gremlin> g.V().repeat(new Test().outE("knows"))
>     ==>g.V().repeat(__.outE("knows"))
>     gremlin>
>
>
> From there, that String is then submitted as you normally do with your driver. For instance, with Gremlin-PHP, via $db->send(String).
>
> Of course, if your driver is already on a JVM language, there is no reason to do this (e.g. Gremlin-Scala), but if you are not on the JVM, this gives the user host language embedding and a more natural "look and feel." Moreover, if your language doesn't use "dot notation," you would use the natural idioms of your language.
>
>     $g->V->outE("knows")
>
>
> If anyone is interested in updating their non-JVM language driver to use this model, I would like to write a blog post about it. Or perhaps, a tutorial for language designers.
>
> Thoughts?,
> Marko.
>
> http://markorodriguez.com
>
>
> --
> You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.