There is something totally mesmerizing about it... I think all the awesome TinkerPop/Gremlin logos that have been created by now really show that having a talented artist/illustrator attached to an open source project like TinkerPop from the start (a) was a stroke of genius and (b) can really enrich the user and community experience, engaging a part of the brain that usually doesn't get involved in dev work (at least for me).
planettinkerpop.com is also a great example of this; man, that site looks good. Is this repo going to be public so that others could hijack that tutorial stub?

________________________________
From: Mark Henderson <emehr...@gmail.com>
Sent: Thursday, April 14, 2016 1:24 PM
To: Gremlin-users
Cc: dev@tinkerpop.incubator.apache.org
Subject: Re: [TinkerPop] Creating a host language embedded Gremlin language variant.

Ha that logo is hilariously awesome! I will add it to the repo later tonight.

I will let you know once I have everything set up regarding the repo.

Thanks,

On Thursday, April 14, 2016 at 12:10:04 PM UTC-4, Marko A. Rodriguez wrote:

Hi Mark,

A logo would be awesome! Thanks.

Please see attached.

I'd love to help with the tutorial. I think it will not only help the Gremlin community, but the library will get a lot better as a result. Just let me know where to start and what you'd like to see.

We have a way of easily creating/publishing tutorials in TinkerPop3.

http://tinkerpop.apache.org/docs/3.1.1-incubating/tutorials/

I don't know how to do it, but Stephen does. How about you do this:

1. You fork Apache TinkerPop tp31/.
2. You give Stephen and me rights to your forked repository.
3. Stephen will create the tutorial stub (this will help me learn when I see his commit).
   - @stephen: call it gremlin-language-variants
4. You and I then go to town on creating the tutorial.

Please read over the ticket and comment as appropriate so we jive and are on the same page going into this:

https://issues.apache.org/jira/browse/TINKERPOP-1232

Thank you Mark,
Mark…………….o.
http://markorodriguez.com

On Thursday, April 14, 2016 at 10:47:14 AM UTC-4, Marko A. Rodriguez wrote:

Hi Mark,

I think that any host language embedding should use its native idioms while, at the same time, staying as true as possible to Gremlin-Java (not Gremlin-Groovy -- though they are nearly identical). I would argue that Gremlin-Java is the "true representation" of the language. So what do I mean by native idioms?

    in_V vs inV   // if camel case isn't a thing in the native language
    $g vs. g      // of course if that's how variables are referenced
    …huh, can't think of anything else :).

But I hope you get the point. I notice in Gremlin-Py you do g.v(2) vs g.V(2). Why is that?

*** Would you be interested in working on a tutorial (with me?) about the 3 ways to create a Gremlin language variant? Given your expertise in Python and the existence of Gremlin-Py, I think we can both (1) make a good tutorial to teach others down the line and (2) spruce up Gremlin-Py's documentation and appearance (e.g. you need a Gremlin logo! -- Gremlin with a Snake around his neck? -- want me to make you one?). ***

Please see: https://issues.apache.org/jira/browse/TINKERPOP-1232

Thanks Mark,
Marko.
http://markorodriguez.com

On Apr 14, 2016, at 8:28 AM, Mark Henderson <emeh...@gmail.com> wrote:

I think writing "Gremlin/Groovy" in a host language is pretty awesome as long as it isn't too far off from writing actual Gremlin.

I can revive my PHP project if it would be helpful to the community. A JavaScript version would probably be the one that would get the most attention from developers today, but JS, even with es6, doesn't have the flexibility (maybe with Proxies) with its objects where you wouldn't have to write a full-on 1-to-1 api equivalent of Gremlin (let alone mimicking Groovy). It seems like a Ruby version would be doable by implementing `method_missing`.

Thanks for adding Gremlinpy to the new site (I need to clean up the code a bit *shame*)

On Thursday, April 14, 2016 at 9:34:40 AM UTC-4, Marko A. Rodriguez wrote:

Hi Mark,

Exactly.
I never saw Gremlin-Py until now and just noticed it on the Apache TinkerPop homepage. That is good stuff. Moreover, as you say, there is a distinction between:

1. Writing Gremlin in a host language.
2. Communicating to a GremlinServer-compliant server in a host language.

The (1) is about query syntax and the (2) is about protocol stuff. Lots of the libraries either confound the two or just do (2) with (1) simply being a Groovy String (cheesy). I would like to see a lot more (1) from the community libraries as I think this is one of the big selling points of Gremlin -- write in your native language.

BTW, I added Gremlin-Py to the description in the "host language embedding" section here: http://www.planettinkerpop.org/#gremlin (2 scrolls down).

Thanks for your thoughts,
Marko.
http://markorodriguez.com

On Apr 14, 2016, at 7:06 AM, Mark Henderson <emeh...@gmail.com> wrote:

I've written "native object to Gremlin" libs in both PHP and Python and it isn't too bad/not too far from Groovy. The biggest issues were around indices [..] (when it had that format) and closures "{x -> ...}", but otherwise both langs allowed for easy query building. It basically looked like this in PHP:

    $g = Gremlin();
    $g->V()->has('"name"', 'mark');
    echo (string)$g; // g.V().has("name",SOME_BOUND_VAR_1)

Works pretty much the same with the Python lib that I've been building (https://github.com/emehrkay/gremlinpy).

If we wanted to actually execute the query on every step, that wouldn't be too difficult to implement with Gremlinpy. Gremlinpy is a simple linked list; it looks at g.V().has('"name"', 'mark') as three token objects with a shared pool of bound parameters. It creates the string query and parameters dictionary when you cast the list to a string. The only change needed would be to bind in a library like Gremlinclient (https://github.com/davebshow/gremlinclient), build the query with every step, and send it to the server:
    res = g.V() # sends request
    res2 = g.V().has('"name"', 'mark') # second request
    ...

The remaining difficulty would be deciding what gets bound. Maybe you can pass in a key-value pair for what you want bound:

    res = g.V().has('"name"', {'NAME': 'mark'}) # g.V().has("name",NAME)

On Tuesday, April 12, 2016 at 10:54:08 PM UTC-4, Dmill wrote:

Yes, a lot of the points you bring up are valid.

One of the main problems with stringifying everything is that it does not allow for some of the stuff I mentioned in my PS, namely "smart merges". This query building behavior that makes use of scopes is unfortunately the standard for frameworks in the industry. This is mostly due to the SQL heritage and its declarative nature; ordering of "steps" doesn't matter so it allows for easy "after the fact" client-side filtering.

It's not uncommon to have a base query that gets altered by some filtering data. In some cases it's a simple has() that needs to be injected somewhere, in other cases it's a repeat() that needs to be completely altered. Use cases can get a little complicated here, but in its simplest form imagine having to add/remove entries to/from a match(). Of course that scenario works well with a toString approach, but for other steps, not so well. Our experience has been that the builder needs to be aware of the steps' signatures to resolve merges.

So sure, this is another problem entirely; in the end users can't really do this with string queries either. But for widespread adoption it would be best if the query builder could handle these scenarios.

Also to bounce off of some of your comments:

> $id -> "~id"
> $label -> "~label"
> g.V().out("%%x")
> $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)

All of the above are absolutely possible. But it's a lot to keep in mind for users that are already trying to figure out how Gremlin works. Now they also need to translate gremlin-groovy into gremlin-php.
One of the advantages of going the hard route and keeping track of all step signatures instead of a toString approach is that you can significantly reduce the above cases. The builder can resolve quite a few of these automatically, and when conflicts arise it can do its best to resolve them and throw/log a warning telling the user how to make the query explicit.

>For your Date example, you would have to have a special "toString()" for PHP
>dates to Java dates (or whichever backend ScriptEngine is being used).

There are no PHP Dates [insert desperate crying emoji here]. PHP sucks with typing. It's got its good points but this kind of stuff is not one of them. Basically PHP Dates come in various forms, from Integer timestamps to String, and only the user really knows what he wants. We can provide this functionality like you did with long() but it's another thing to keep in mind.

One point we haven't gone over is lambdas. We can't really toString these. I guess this is where customStep() or script() come into play.

To wrap it up, a toString query builder is absolutely an option and could cover a lot of the API. In fact in PHP we could magically make any API method available: $g->something("~label", "lolo") would stringify to g.something(label, "lolo") regardless of whether or not the step exists. But this involves quite a few language-specific alterations and doesn't provide much (if any) functional benefit. It would be so much easier for people to just write a gremlin-groovy string as it's well documented and doesn't need any extra knowledge.

If on the other hand the query builder has features like those mentioned in the PS or earlier in this post, it's well worth the effort. I believe most people who build their own query builders do so to support some form of extra feature they wouldn't have by using gremlin-groovy string queries. But such a query builder enters the realm of non-trivial (although not unachievable).
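As a rough illustration of the "magic method" toString builder described above, here is a minimal sketch in Python (using `__getattr__` where PHP would use `__call`). The `Traversal` class is hypothetical and is not taken from Gremlinpy or any other library. It assumes that a leading "~" marks a token such as label or id, and that every other string gets quotes.

```python
class Traversal:
    """Hypothetical catch-all builder: any unknown attribute becomes a Gremlin step."""

    def __init__(self, source="g", steps=()):
        self._source = source
        self._steps = tuple(steps)

    def __getattr__(self, name):
        # Only called for attributes that do not exist, i.e. every step name.
        if name.startswith("_"):
            raise AttributeError(name)

        def step(*args):
            rendered = ", ".join(self._render(a) for a in args)
            # Return a new object so a base traversal can be reused safely.
            return Traversal(self._source, self._steps + (f"{name}({rendered})",))

        return step

    @staticmethod
    def _render(arg):
        if isinstance(arg, str):
            # Assumption: "~label" / "~id" style strings are tokens, emitted bare.
            if arg.startswith("~"):
                return arg[1:]
            return '"' + arg.replace('"', '\\"') + '"'
        return str(arg)

    def __str__(self):
        return ".".join((self._source,) + self._steps)


g = Traversal()
print(g.V().has("name", "mark"))      # g.V().has("name", "mark")
print(g.something("~label", "lolo"))  # g.something(label, "lolo")
```

Nothing here is checked against real step signatures, so `g.something(...)` renders even though no such step exists. That is exactly the trade-off under discussion: cheap to write, but it offers little beyond a hand-written string.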
A first step in helping people make these builders would be to provide an easily parseable list of signatures for the most desirable classes. Maybe something along the lines of a yaml file.

Anyways I'm just thinking out loud at this point.

On Tue, Apr 12, 2016 at 9:42 PM, Marko Rodriguez <okram...@gmail.com> wrote:

Hi Dylan,

Your email is excellent. Thank you for breaking things down for me. Here are some responses.

1. Method overloading :

    abstract class Query {
        public function has(PropertyKey $key); //1
        public function has(PropertyKey $key, Object $value); //2
        public function has(Label $label, String $value); //3
        public function has(VertexId $id, Long $value); //4
        public function has(VertexId $id, Int $value); //5
        public function has(VertexId $id, Predicate $p); //6
    }

The above is illegal in languages like PHP (or javascript?). Instead we're stuck with :

    abstract class Query {
        public function has(Array $args);
    }

We're then left to figure out what is what in the array and sort out how we need to stringify the output.

I was thinking, why would you need to introspect into the array? Just toString() each element in the array with a comma (,) in between. For instance:

  * has("age",32) ==> has(["age",32]) ==> has("age",32) // all String array elements need " " wrappers.
  * has("age") ==> has(["age"]) ==> has("age")
  * has("person","name","marko") ==> has(["person","name","marko"]) ==> has("person","name","marko")

Thus, Gremlin-PHP would have one has()-method, and that method just iterates over the arguments and toString()'s things accordingly with comma delimiters.

If the user does $g->V()->has("label", "user") do we add quotes to the first argument or is it a label/id? What about the second argument, is it a predicate? etc. This gets complex very quickly.

The universal rule --- if it's a String add quotes. If it's not, don't.

    $id -> "~id"
    $label -> "~label"
    $g->V()->has($label,"user")

And what if I had $g->V()->has("id", 36).
PHP only supports Int, so one of the two signatures (4 or 5) needs to give as we have a major conflict. This example is fictional for has() but I've run into this on a couple of other methods, just can't remember which.

Yea, that sucks. Well, you could do this:

    $g->V()->has($id,Number::long(36)) ==> g.V().has("id",36l)

This would, of course, bind you to Gremlin-Groovy as the ultimate ScriptEngine.

Another example would be g.V().has(id, neq(m)). We could imagine the following PHP equivalent $g->V()->has(new Id(), Predicate::neq("m")) where Id() is a class that helps us recognize this type, and neq() a static method of Predicate. However "m" has to be passed as a string and we have no clue what m is... is this a string or a binding or a server-side variable? More on this in point 2.

Well, this is the same problem in Gremlin-Java. where() is ALWAYS bindings and has() is ALWAYS objects. Thus:

    $g->V()->where("a",Predicate::neq("m")) ==> g.V().where("a",neq("m")) // again strings always get " "-wrappers.

To close things off here there's also the case of signatures like out(String... edgeLabels) that need their own logic.

Again, just toString() each object in the array and insert commas between.

    $g->V()->out(["created","knows"]) ==> g.V().out("created","knows")

Conclusion: There's a lot of manual work that needs to go into separating the logic between signatures and handling special cases. Part of this can be automated if your language supports magic getters and setters, by parsing the javadocs for example. But not only is that an if, the rest will still be manual. This step is maintenance heavy.

I see the biggest pains being:

1. Having to implement each method.
2. Having to have helper classes for P, T, Order, Column, etc.

This is simply a matter of fat fingering stuff in and not anything implementation-wise that is problematic.

2. Conflicts

Because we're manipulating strings it's really hard to tell a few items apart (binding vs server variable vs string; there's a reason why I separate binding and variable). For instance in the example above of gremlin: g.V().has(id, neq(m)) vs PHP: $g->V()->has(new Id(), Predicate::neq("m")) we don't know what to make of m. Is this a binding or a string or even a variable that was previously set in the session?

There is no clean way of working around this. Firstly because bindings tend to be handled on a different layer than the query builder. Secondly because methods that will help in avoiding the conflicts will also lose typing data. For example:

    $g->V()->has(new Id(), Predicate::neq(Query::variable("m")))

could generate the proper query by outputting m without quotes, but we don't know what type m is so in some cases it might be tricky to select the proper signature.

Conclusion: there are a number of ways around this point. We use prefixes B_m or V_m and a hack to ignore signatures altogether when in this scenario. It's not that these aren't solvable, they just aren't trivial.

Hm. Yea, I'm not too smart about server variables. Out of my butt you could create a "crazy String" for those and then do replaceAll-style updates.

    g.V().out("%%x")
    replaceAll("%%x",x) ?

3. API

Why we would need traversal, graph, vertex and edge APIs is quite self-explanatory for everyday work with Gremlin. I'm just going to explain why we would also require some Java classes. Because JSON is lossy by nature we often have to cast variables to certain types. For example by submitting this kind of script:

    g.V(1).property("date", new Date(B_m)); with B_m = timestamp.

This is just another case that is difficult to cover. This adds onto the other points in making a gremlin language variant non-trivial.

All of the above can be worked around by using an injection method that just appends a string to the query:

    $g->customStep("V().has(id, neq(m))")

but that's beside the point.
Ah. Classy. Note that in ?3.2.1? we might support script()-step.

    g.V().script("out().map{ it.name }")

…to enable lambdas in remote'd traversals (Server or OLAP).

For your Date example, you would have to have a special "toString()" for PHP dates to Java dates (or whichever backend ScriptEngine is being used).

    $g->V()->property("data", phpDate)

Your Array-string-ifier would not just call toString() blindly on the objects of the array arguments, but would do stuff like:

    if(object instanceof String)
      return "\"" + object.toString() + "\"";
    else if(object instanceof Date)
      return "new Date(…)";
    else
      return object.toString();

Final Conclusion: It's not a trivial task. Of course the examples above are very verbose and achieving something closer to gremlin in style is possible, but there are always going to be "gotchas" users will need to keep in mind.

A while back in TP2 I released a php library for this (the one we currently use in our projects). I decided to remove it as it was too much maintenance to get it to work across use cases, so I decided to concentrate on our own one (some choices made in 2. wouldn't have worked for other cases).

I'm convinced there's got to be a way of reconciling everything and getting this to work flawlessly but it's going to require a lot of thought/work.

PS: I mentioned some other points like managing multiple versions of gremlin (for two lines of releases) which is a real headache. For performance it may be good to allow the builder to handle multiple lines, which comes with its load of complications as well.
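The type-aware Array-string-ifier outlined earlier in this message (quote strings, special-case dates, toString the rest) might look like the following in Python. This is only a sketch under stated assumptions: `Long`, `stringify` and `step` are hypothetical names, the target is assumed to be a Gremlin-Groovy ScriptEngine, and dates are assumed to be timezone-aware `datetime` objects rendered as epoch milliseconds.

```python
from datetime import datetime, timezone


class Long:
    """Hypothetical wrapper marking an integer that must be sent as a Java long."""

    def __init__(self, value):
        self.value = int(value)


def stringify(arg):
    """Render one argument as Gremlin-Groovy source text."""
    if isinstance(arg, str):
        # The universal rule: if it is a String, add quotes (and escape them).
        return '"' + arg.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(arg, Long):
        return f"{arg.value}L"  # Groovy accepts an l or L suffix for longs
    if isinstance(arg, bool):
        return "true" if arg else "false"
    if isinstance(arg, datetime):
        # Assumes a timezone-aware value; naive datetimes are read as local time.
        return f"new Date({round(arg.timestamp() * 1000)}L)"
    if isinstance(arg, (list, tuple)):
        # Varargs such as out(String... edgeLabels): join with commas.
        return ", ".join(stringify(a) for a in arg)
    return str(arg)


def step(name, *args):
    """Render a single step call from its name and native arguments."""
    return f"{name}({', '.join(stringify(a) for a in args)})"


print(step("has", "age", 32))             # has("age", 32)
print(step("out", ["created", "knows"]))  # out("created", "knows")
print(step("has", "id", Long(36)))        # has("id", 36L)
```

The order of the `isinstance` checks matters: anything not matched explicitly falls through to `str()`, which is where host-language quirks (such as Python's `True`) would otherwise leak into the generated query.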
And then there's the ability to "block" queries and either inject them into each other or merge them together, which simplifies unit testing and extends functionality:

    $query = $g->V()->out("likes")->flag("flagname")->has("age", 20);
    // Some logic here accesses new information and realizes the query needs altering
    $query->getFlag("flagname")->out("hates", true); // true for merge
    $query->toString(); // g.V().out('likes', 'hates').has('age', 20)

But this point alone could warrant its own email as it is relatively complex. Though TP3 has simplified some cases thanks to union() and some other steps.

Our builder supports all of the above so if you have any questions feel free to ask me. Phew that was long. I'll add this to the ticket in a bit.

Yes, maintenance seems the biggest pain. Every new method added to Gremlin-Java requires updates to Gremlin-PHP ---- perhaps there is a programmatic way to introspect the Java source file (or JavaDoc) and generate the code automagically?

    public GraphTraversal out(final String… edgeLabels)
      ==auto-write==>
    out(Array… edgeLabels) {
      $string -> $string + ".out(" + StringHelper::toString(edgeLabels) + ")";
    }

If you could do that, then the only code you actually have to write/maintain (besides the introspector above) is StringHelper, which does all the fancy String conversion of arguments.

Thanks Dylan for your time,
Marko.
http://markorodriguez.com

On Tue, Apr 12, 2016 at 4:37 PM, Marko Rodriguez <okram...@gmail.com> wrote:

Hello everyone,

Please see the section entitled "Host Language Embedding" here: http://www.planettinkerpop.org/#gremlin (3 sections down)

When I was writing up this section, I noticed that most of the language drivers that are advertised on our homepage (http://tinkerpop.incubator.apache.org/#graph-libraries) know how to talk to Gremlin Server via web sockets, REST, etc., but rely on the user to create a String of their graph traversal and submit it.
For instance, here is a snippet from the Gremlin-PHP documentation:

    $db = new Connection([
        'host' => 'localhost',
        'graph' => 'graph',
        'username' => 'pomme',
        'password' => 'hardToCrack'
    ]);
    //you can set $db->timeout = 0.5; if you wish
    $db->open();
    $db->send('g.V(2)');
    //do something with result
    $db->close();

$db->send(String) is great, but it would be better if the user didn't have to leave PHP. Please see this ticket: https://issues.apache.org/jira/browse/TINKERPOP-1232

I think for non-JVM languages, it would be nice if these drivers (PHP, JavaScript, Python, etc.) didn't require the user to explicitly create Gremlin-XXX Strings, but instead either used JINI or model-3 in the ticket above. Let's look at model-3 as I think it's the easiest and most general.

For instance, they would have a class in their native language that would mirror the GraphTraversal API.

*** I don't know any other languages well enough, so I'm just going to do this in Groovy :), hopefully you get the generalized point. ***

public class Test { String ...