Hello all - Yes, Cypher could run on the Gremlin Traversal Machine (GTM), and in some ways it already does.
The GTM is like the JVM for graphs -- see this paper by Marko Rodriguez...

- "The Gremlin Graph Traversal Machine and Language"
  http://arxiv.org/pdf/1508.03843v1.pdf

And for a high-level overview of the GTM, see this blog post by Marko:

- "The Benefits of the Gremlin Graph Traversal Machine"
  http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine

You can already run SPARQL on Gremlin:
https://github.com/dkuppitz/sparql-gremlin

And this week Ted Wilmes released a SQL-Gremlin compiler:
https://groups.google.com/d/topic/gremlin-users/npncDyVQJSU/discussion

There is interest in GraphQL and Datalog on Gremlin so you'd get all of this
for free via the Gremlin FlinkGraphComputer.

- James

On Wed, Dec 16, 2015 at 12:54 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Hey,
>
> I think I might have confused you, so let me try to explain :)
>
> First, Gremlin is a language similar to Cypher, but it is also a traversal
> machine, which also supports distributed traversals. For distributed
> traversals, Gremlin uses a "graph computer", which runs the Gremlin
> traversals using the BSP model. Essentially, vertices receive traversers as
> messages and execute the traverser's step as the update function (for more
> info see section 5 in [1]).
>
> Thus, Tinkerpop has a GiraphGraphComputer to run on top of Giraph, a
> SparkGraphComputer to run on top of Spark, etc.
>
> The Tinkerpop community has offered to work on a FlinkGraphComputer, which,
> similarly to the existing graph computers, will use one of the Flink/Gelly
> iteration abstractions.
>
> Now, there are 2 questions for the Flink community:
> (1): do we think this is interesting/useful and something we can help them
> with?
> (2): do we think it makes sense to "host" the FlinkGraphComputer on the
> Flink codebase?
>
>
> Neo4j/Cypher on Flink is a separate discussion in my opinion. As far as I
> understand, Cypher could run on Gremlin, but there is no compiler for it
> yet.
> I have been discussing with people from Neo4j and we have jointly
> written a description for a thesis project regarding OpenCypher on Flink.
> The idea is to collaboratively supervise/help the student(s). Of course, if
> anyone else is interested in this (not necessarily a student) we can always
> use more help, so just let me know!
>
> Thanks,
> -Vasia.
>
> [1]: http://arxiv.org/pdf/1508.03843v1.pdf
>
>
> On 16 December 2015 at 19:21, Stephan Ewen <se...@apache.org> wrote:
>
> > I am not very familiar with Gremlin, but I remember a brainstorming session
> > with Martin Neumann on porting Cypher (the neo4j query language) to Flink.
> > We looked at Cypher queries for filtering and traversing the graph.
> >
> > It looked like it would work well. We remember we could even model
> > recursive conditions on traversals pretty well with delta iterations.
> >
> > If Gremlin's use cases are anything like Cypher, I could ping Martin and
> > see if we can collect again some of those ideas.
> >
> > Stephan
> >
> >
> > On Tue, Dec 15, 2015 at 5:35 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
> >
> > > Hi Dr. Fabian,
> > >
> > > thanks a lot for your answer!
> > >
> > >
> > > On 15 December 2015 at 15:42, Fabian Hueske <fhue...@gmail.com> wrote:
> > >
> > > > Hi Vasia,
> > > >
> > > > I agree, Gremlin definitely looks like an interesting API for Flink.
> > > > I'm not sure how it relates to Gelly. I guess Gelly would (initially) be
> > > > more tightly integrated with the DataSet API whereas Gremlin would be a
> > > > connector for other languages. Any ideas on this?
> > > >
> > >
> > > The idea is to provide a FlinkGraphComputer which will use Gelly's
> > > iterations to compile the Gremlin query language to Flink.
> > > In my previous email, I linked to our discussion over at the Tinkerpop
> > > mailing list, where you can find more details on this.
> > > By adding the FlinkGraphComputer, we basically get any graph query
> > > language that compiles to the Gremlin VM for free.
> > >
> > > >
> > > > Another question would be whether the connector should go into Flink or
> > > > Tinkerpop. For example, the Spark, Giraph, and Neo4J connectors are all
> > > > included in Tinkerpop.
> > > > This should be discussed with the Tinkerpop community.
> > > >
> > >
> > > I'm copying from the Tinkerpop mailing list thread (link for full thread in
> > > my previous email):
> > >
> > > *In the past, TinkerPop used to be a "dumping ground" for all
> > > implementations, but we decided for TinkerPop3 that we would only have
> > > "reference implementations" so users can play, system providers can learn,
> > > and ultimately, system providers would provide TinkerPop support in their
> > > distribution. As such, we would like to have FlinkGraphComputer distributed
> > > with Flink. If that sounds like something your project would be comfortable
> > > with, I think we can provide a JIRA/PR for FlinkGraphComputer (as well as
> > > any necessary documentation). We can start with a JIRA ticket to get things
> > > going. Thoughts?*
> > >
> > > This is why I brought the conversation over here, so I hear the opinions
> > > of the Flink community on this :)
> > >
> > > > Best, Fabian
> > > >
> > >
> > > -Vasia.
> > >
> > > >
> > > > 2015-12-14 18:33 GMT+01:00 Vasiliki Kalavri <vasilikikala...@gmail.com>:
> > > >
> > > > > Ping squirrels! Any thoughts/opinions on this?
> > > > >
> > > > > On 9 December 2015 at 20:40, Vasiliki Kalavri <vasilikikala...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello squirrels,
> > > > > >
> > > > > > I have been discussing with the Apache Tinkerpop [1] community regarding
> > > > > > an integration with Flink/Gelly.
> > > > > > You can read our discussion in [2].
> > > > > >
> > > > > > Tinkerpop has a graph traversal machine called Gremlin, which supports
> > > > > > many high-level graph processing languages and runs on top of different
> > > > > > systems (e.g. Giraph, Spark, Graph DBs). You can read more in this great
> > > > > > blog post [3].
> > > > > >
> > > > > > The idea is to provide a FlinkGraphComputer implementation, which will add
> > > > > > Gremlin support to Flink.
> > > > > >
> > > > > > I believe Tinkerpop is a great project and I would love to see an
> > > > > > integration with Gelly.
> > > > > > Before we move forward, I would like your input!
> > > > > > To me, it seems that this addition would nicely fit in flink-contrib,
> > > > > > where we also have connectors to other projects.
> > > > > > If you agree, I will go ahead and open a JIRA about it.
> > > > > >
> > > > > > Thank you!
> > > > > > -Vasia.
> > > > > >
> > > > > > [1]: https://tinkerpop.incubator.apache.org/
> > > > > > [2]: https://mail-archives.apache.org/mod_mbox/incubator-tinkerpop-dev/201511.mbox/%3ccanva_a390l7g169r8sn+ej1-yfkbudlnd4td6atwnp0uza-...@mail.gmail.com%3E
> > > > > > [3]: http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
> > > > > >
> > > > > > On 25 November 2015 at 16:54, Vasiliki Kalavri <vasilikikala...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi James,
> > > > > >>
> > > > > >> I've just subscribed to the Tinkerpop dev mailing list. Could you please
> > > > > >> send a reply to the thread, so then I can reply to it?
> > > > > >> I'm not sure how I can reply to the thread otherwise...
> > > > > >> I also saw that there is a grafos.ml project thread. I could also
> > > > > >> provide some input there :)
> > > > > >>
> > > > > >> Thanks!
> > > > > >> -Vasia.
> > > > > >>
> > > > > >> On 25 November 2015 at 15:09, James Thornton <james.thorn...@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hi Vasia -
> > > > > >>>
> > > > > >>> Yes, a FlinkGraphComputer should be a straight-forward first step. Also, on
> > > > > >>> the Apache Tinkerpop dev mailing list, Marko thought it might be cool if
> > > > > >>> there was a "Graph API" similar to the "Table API" -- hooking in Gremlin to
> > > > > >>> Flink's fluent API would give Flink users a full graph query language.
> > > > > >>>
> > > > > >>> Stephen Mallette is a TinkerPop core contributor, and he has already
> > > > > >>> started working on a FlinkGraphComputer. There is a Flink/Tinkerpop thread
> > > > > >>> on the TinkerPop dev list -- it would be great to have you part of the
> > > > > >>> conversation there too as we work on the integration:
> > > > > >>>
> > > > > >>> http://mail-archives.apache.org/mod_mbox/incubator-tinkerpop-dev/
> > > > > >>>
> > > > > >>> Thanks, Vasia.
> > > > > >>>
> > > > > >>> - James
> > > > > >>>
> > > > > >>>
> > > > > >>> On Mon, Nov 23, 2015 at 10:28 AM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
> > > > > >>>
> > > > > >>> > Hi James,
> > > > > >>> >
> > > > > >>> > thank you for your e-mail and your interest in Flink :)
> > > > > >>> >
> > > > > >>> > I've recently taken a _quick_ look into Apache TinkerPop and I think it'd
> > > > > >>> > be very interesting to integrate with Flink/Gelly.
> > > > > >>> > Are you thinking about something like a Flink GraphComputer, similar to
> > > > > >>> > Giraph and Spark GraphComputer's?
> > > > > >>> > I believe such an integration should be straight-forward to implement.
> > > > > >>> > You can start by looking into Flink iteration operators [1] and Gelly
> > > > > >>> > iteration abstractions [2].
> > > > > >>> >
> > > > > >>> > Regarding Apache Geode, I'm not familiar with the project, but I'll try to
> > > > > >>> > take a look in the following days!
> > > > > >>> >
> > > > > >>> > Cheers,
> > > > > >>> > -Vasia.
> > > > > >>> >
> > > > > >>> >
> > > > > >>> > [1]: https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#iteration-operators
> > > > > >>> > [2]: https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#iterative-graph-processing
> > > > > >>> >
> > > > > >>> >
> > > > > >>> > On 20 November 2015 at 08:32, James Thornton <james.thorn...@gmail.com>
> > > > > >>> > wrote:
> > > > > >>> >
> > > > > >>> > > Hi -
> > > > > >>> > >
> > > > > >>> > > This is James Thornton (espeed) from the Apache Tinkerpop project
> > > > > >>> > > (http://tinkerpop.incubator.apache.org/).
> > > > > >>> > >
> > > > > >>> > > The Flink iterators should pair well with Gremlin's Graph Traversal Machine
> > > > > >>> > > (http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine)
> > > > > >>> > > -- it would be good to coordinate on creating an integration.
> > > > > >>> > >
> > > > > >>> > > Also, Apache Geode made a splash today on HN
> > > > > >>> > > (https://news.ycombinator.com/item?id=10596859) -- connecting Geode and
> > > > > >>> > > Flink would be killer.
> > > > > >>> > > Here's the Geode/Spark connector for reference:
> > > > > >>> > >
> > > > > >>> > > https://github.com/apache/incubator-geode/tree/develop/gemfire-spark-connector
> > > > > >>> > >
> > > > > >>> > > - James
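
For readers new to the "graph computer" idea described in this thread (vertices receive traversers as messages and execute the traverser's step as the update function), here is a rough, self-contained sketch of that BSP loop in plain Python. It is only an illustration under simplifying assumptions: it does not use the TinkerPop, Flink, or Gelly APIs, and the names GRAPH, out_step and run_bsp are hypothetical. Each superstep applies one traversal step at every vertex that currently holds traversers and sends the resulting traversers on as messages.

```python
from collections import defaultdict

# Hypothetical toy graph (an assumption for illustration only):
# adjacency lists keyed by vertex id.
GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


def out_step(vertex, graph):
    """One traversal step: a traverser at `vertex` moves to each out-neighbour."""
    return graph[vertex]


def run_bsp(graph, start, steps):
    """Execute `steps` one per superstep, BSP style.

    `inbox` maps a vertex to the number of traversers it received as
    messages in the previous superstep. In each superstep every such
    vertex runs the current step as its update function and emits the
    resulting traversers as messages to the target vertices.
    """
    inbox = {start: 1}  # a single traverser starts at `start`
    for step in steps:
        outbox = defaultdict(int)
        for vertex, count in inbox.items():
            for target in step(vertex, graph):
                outbox[target] += count  # merge traversers that meet at a vertex
        inbox = dict(outbox)  # barrier: messages become the next superstep's input
    return inbox


# Two "out" hops starting from vertex "a".
result = run_bsp(GRAPH, "a", [out_step, out_step])
print(result)  # {'d': 2}
```

Two traversers arrive at "d" because there are two distinct two-hop paths from "a" (via "b" and via "c"). A real graph computer would partition the vertices across workers and exchange the messages over the network; this sketch keeps everything in one process.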