I think this is a compelling argument. However, it has one major flaw: Gremlin is currently not aware of the schema or any statistics of the underlying graph database. For "simple" optimizations that's not too bad: the underlying graph database can simply replace the respective step in the traversal with an optimized step. That's what Titan does with TitanGraphStep and TitanVertexStep. Those are also interesting because there is quite a bit of logic you need to put in there to understand what you can reorder and pull into a step.
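To make the "replace the step with an optimized step" idea concrete, here is a minimal sketch in Python. It is a toy model, not TinkerPop's or Titan's actual API: every class and function name below (Step, GraphStep, HasStep, OutStep, ProviderGraphStep, fold_has_into_graph_step) is hypothetical, and the only rule it applies is to fold the has() filters that directly follow the start step into a provider-specific start step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Step:
    """Base class for a step in a toy traversal (hypothetical, not TinkerPop's Step)."""


@dataclass
class GraphStep(Step):
    """Generic start step that scans all vertices, in the spirit of g.V()."""


@dataclass
class HasStep(Step):
    """Property filter, in the spirit of has('name', 'marko')."""
    key: str
    value: object


@dataclass
class OutStep(Step):
    """Move to adjacent vertices over an edge label, in the spirit of out('knows')."""
    label: str


@dataclass
class ProviderGraphStep(Step):
    """Provider-specific start step (hypothetical). It carries the folded-in
    conditions so the backend could answer them from an index, not a scan."""
    conditions: List[Tuple[str, object]] = field(default_factory=list)


def fold_has_into_graph_step(steps: List[Step]) -> List[Step]:
    """Return a new step list in which the leading GraphStep and the HasSteps
    directly after it are replaced by a single ProviderGraphStep."""
    if not steps or not isinstance(steps[0], GraphStep):
        return list(steps)  # nothing to optimize
    folded = ProviderGraphStep()
    i = 1
    # Only filters adjacent to the start step are safe to pull in here;
    # a has() after out() applies to different vertices and must stay put.
    while i < len(steps) and isinstance(steps[i], HasStep):
        folded.conditions.append((steps[i].key, steps[i].value))
        i += 1
    return [folded] + list(steps[i:])


traversal = [
    GraphStep(),
    HasStep("name", "marko"),
    HasStep("age", 29),
    OutStep("knows"),
    HasStep("age", 32),
]
optimized = fold_has_into_graph_step(traversal)
```

Even in this toy version, the rewrite has to know which filters it may absorb (the has() after out() is left alone), which is the kind of logic I mean above.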
However, it gets pretty complicated when you look at something like MatchStep, which is crucial for most of the arguments that make Gremlin a general "traversal machine". Both SPARQL->Gremlin and SQL->Gremlin rely heavily on MatchStep. Now, looking more closely at MatchStep, there seems to be no way for Titan to feed its knowledge of the schema, statistics, indexes, or anything else into the algorithm that executes MatchStep. So, for Titan to get a "good" implementation of MatchStep, Titan will need to effectively reimplement it. And, arguably, that's a big part of a query language (i.e. the entire declarative piece of Gremlin).

So, the question then becomes: Does the argument of Gremlin being a universal traversal machine only hold for the imperative parts, or can it be extended to the declarative aspects as well?

On Sat, Oct 31, 2015 at 9:06 AM Marko Rodriguez <[email protected]> wrote:

> Hi,
>
> Yesterday I was HipChatting with Alex Popescu (cc:d) about "there is no
> need for a standard query language" as there is no need for a "standard
> programming language." He said something to the effect of "that is a strong
> argument, however there will then be discussions of virtual machine
> execution vs. native execution."
>
> Last night I was thinking -- "hmmm, that will be a bad argument to make."
> Why?
>
> Gremlin shouldn't be touted as a "virtual machine" but as a "traversal
> machine" (an execution engine). When Gremlin talks to an underlying graph
> system it's talking to TinkerPop ("Blueprints") and then to the native API
> of the graph system. For systems that have TinkerPop as their native API
> (Titan/Bitsy/etc.) Gremlin is not a "virtual machine." For systems that
> don't (OrientDB/Neo4j/etc.), the cost for the indirection from going from
> TinkerPop API to the graph system's native API is trivial as it's typically
> just object wrapping on the short-lived object heap (we will amortize this
> cost later -- watch). Next, all graph systems maintain an "execution
> engine" for their respective query language. That is, OrientSQL, Cypher,
> SPARQL ultimately talk to their graph system's API: OrientDB Java API,
> Neo4j Java API, and Sesame or Jena, respectively. Gremlin does the same
> thing, it just talks to TinkerPop ("Blueprints") first, which then talks to
> those APIs. What makes Gremlin neat is that the execution engine and the
> language are not strongly coupled as it's very easy for any graph language
> to compile to the Gremlin machine. So there is no relative cost in the
> language->machine translation, the cost (though minor -- wait for it) is in
> the machine->API translation. However, given the conceptual simplicity (and
> engineering) of the Gremlin machine, those costs are quickly subsumed. With
> MatchStep's runtime optimizer, traverser bulking, LazyBarriers, and (most
> importantly) provider specific compiler strategies (see Titan's beautiful
> use of these), Gremlin can be faster than the provider's "native query"
> language. In fact, some internal benchmarking I've done has shown that
> Gremlin is indeed equal or faster than the native language of the graph
> system where sometimes those speed differences are 5x to the life of
> the universe. Thus, the cost of TinkerPopAPI->NativeAPI is so trivial at
> that point, it's not worth even considering discussing the "cost of
> virtualization." I suspect (though this is complete speculation at
> this point) that X-Language->GremlinMachine->Y-System could be faster than
> X-Language->Y-System given Gremlin's current (and future) compiler/engine
> design and evolution.
>
> Thus, Gremlin shouldn't be seen as a "virtual machine," but as a
> "traversal machine" that anyone can connect to their graph system. It
> supports any graph language that compiles to it. It is an efficient/simple
> OLTP/OLAP execution engine pre-written for you.
>
> Thanks,
> Marko.
>
> http://markorodriguez.com
>
> On Oct 31, 2015, at 12:37 AM, pieter <[email protected]> wrote:
>
> Yeah, Cypher/Sparql/OrientQL whatever does not compete with Gremlin.
> Gremlin enables all of them.
>
> Cheers
> Pieter
>
> On 31/10/2015 00:26, Marko Rodriguez wrote:
>
> Hello,
>
> While these ideas are not new to people on TinkerPop3, I had a nice
> revelation that I expressed in the following tweet series.
>
> https://twitter.com/twarko/status/660215611117535232
>
> Why is there no "standard programming language?" Different programming
> languages are good at different things.
> What makes more languages emerge and grow? A virtual machine abstraction.
> The JVM is the breeding ground for programming languages.
> Java projects can have many programming languages in them. No worries.
>
> There should be no "standard graph language?" Different graph
> languages are good at different things.
> What makes more graph languages emerge and grow? A traversal machine
> abstraction.
> The Gremlin traversal machine can be the breeding ground for traversal
> languages.
> TinkerPop projects can have many graph languages in them. No worries.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>
> --
> You received this message because you are subscribed to the Google
> Groups "Gremlin-users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [email protected]
> <mailto:[email protected]>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/gremlin-users/C31C88C0-DE7B-4383-94B8-8E8EEAA82A69%40gmail.com.
> For more options, visit https://groups.google.com/d/optout.
