Vasia Kalavri: Gelly: Large-scale graph analysis with Apache Flink
https://youtu.be/-tFzG2dzJXw

On Nov 30, 2015 12:49 PM, "Marko Rodriguez" <[email protected]> wrote:

> Hi Vasia (everyone),
>
> Does Flink have a graph query language? If not, then with a
> FlinkGraphComputer implementation, Flink could ship with Gremlin support.
>
> If you have the time, please read the following blog post as it will help
> explain our approach and how Flink could benefit from it:
>
> http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
>
> In short, if Flink provides a FlinkGraphComputer implementation, then the
> Gremlin virtual machine will work over Flink and any language that compiles
> to the Gremlin virtual machine will thus work over Flink.
>
> If you would like to see a demo of TinkerPop with, for example Spark or
> Giraph, I'd be more than happy to do a Google Hangout session with you (< 1
> hour) so you can better understand the breadth of the work we are doing and
> how it can benefit your efforts.
>
> Thanks Vasia,
> Marko.
>
> http://markorodriguez.com
>
> On Nov 27, 2015, at 5:27 AM, Stephen Mallette <[email protected]> wrote:
>
> > Hi Vasia, I had started tinkering on it in my spare time in a separate
> > repo. There really isn't much to collaborate on at this point. I was
> > mostly trying to understand the parallels between Flink and Spark so that
> > I could understand how a FlinkGraphComputer could be implemented given
> > what I'd seen of the Spark implementation Marko did. I had expected to
> > contribute the work to Flink (rather than keep it here on the TinkerPop
> > side). Anyway, not much else to offer - Marko can probably get you running
> > much faster than I can, as that area is where he holds the most expertise.
> > You should probably keep an eye out for his comments.
> >
> > On Wed, Nov 25, 2015 at 11:38 AM, Vasiliki Kalavri <[email protected]> wrote:
> >
> >> Hi James and TinkerPop community,
> >>
> >> thanks a lot for starting this discussion!
> >> I am Vasia, Apache Flink PMC and core Gelly developer. Nice to meet you ;)
> >>
> >> I'm only starting to get familiar with the TinkerPop project, but it
> >> seems that it can play well with Flink.
> >> As you already noticed, a FlinkGraphComputer should be straightforward to
> >> implement. Gelly has a vertex-centric API that is similar to the
> >> scatter-gather model [1] and a gather-sum-apply API [2] that is closer to
> >> the PowerGraph model. These are built on top of Flink's delta iteration
> >> operators [3], which are more generic and could also be used directly for
> >> the FlinkGraphComputer if the existing Gelly abstractions don't fit.
> >>
> >> Regarding the difference between stream and batch in Flink: Flink is a
> >> streaming dataflow engine, on top of which you can run both streaming and
> >> batch jobs. A batch job is simply seen by Flink as a job operating on a
> >> finite stream. Accordingly, Flink has a stream API and a batch API. Gelly
> >> is currently built on top of the batch API, i.e. the DataSet API.
> >>
> >> James mentioned in the Flink mailing list that someone has already
> >> started working on a FlinkGraphComputer. Is there a JIRA for this? Let me
> >> know if you have questions or you think I can help in some way!
> >>
> >> Cheers,
> >> -Vasia.
> >>
> >> [1]: https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#vertex-centric-iterations
> >> [2]: https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#gather-sum-apply-iterations
> >> [3]: https://ci.apache.org/projects/flink/flink-docs-master/apis/iterations.html#delta-iterate-operator
> >>
> >> On 25 November 2015 at 17:05, James Thornton <[email protected]> wrote:
> >>
> >>> Hi Vasia -
> >>>
> >>> Welcome to TinkerPop (linking you into the Flink thread as requested)...
> >>>
> >>> - James
> >>>
> >>> On Mon, Nov 23, 2015 at 10:01 AM, Marko Rodriguez <[email protected]> wrote:
> >>>
> >>>> Hi James,
> >>>>
> >>>> Thank you for always having an ear to the tech pulse. If it wasn't for
> >>>> you, I would still be excited about XMPP and would be programming in
> >>>> Tcl/Tk.
> >>>>
> >>>> Given my 20-minute review of their docs ... It would be cool if, like
> >>>> the "Table API," they also had a "Graph API" that was just TinkerPop
> >>>> Graph/Vertex/Edge. That could be super intrusive, so as a simple step --
> >>>> they already have a "vertex-centric" API and thus, having a
> >>>> FlinkGraphComputer implementation seems "easy." Then from there, Gremlin
> >>>> should just work. I don't really understand the difference between
> >>>> stream and batch unless they are talking about the difference between
> >>>> "Storm" and "MapReduce"? Would be cool to see how TinkerPop fits into
> >>>> the stream-scene.
> >>>>
> >>>> Next, their fluent API is similar to Spark's and I would argue that
> >>>> Gremlin's API is much nicer than just low-level primitives like map(),
> >>>> flatMap(), etc. Thus, they could really benefit from having a full graph
> >>>> query language already available for their users. (As a side note, it's
> >>>> really nice to see more and more systems use functional/fluent APIs as
> >>>> this really trains the next generation to think like this, which is
> >>>> important as Gremlin is purely this! Hopefully the SQL model of querying
> >>>> starts to look odd to people in comparison.)
> >>>>
> >>>> I just sent out this tweet:
> >>>> https://twitter.com/apachetinkerpop/status/668820458599530497
> >>>>
> >>>> If they seem positive, I can detail in JIRA what would be required for
> >>>> them to have TinkerPop-support.
> >>>>
> >>>> Thanks again James,
> >>>> Marko.
> >>>>
> >>>> http://markorodriguez.com
> >>>>
> >>>> On Nov 19, 2015, at 12:19 PM, James Thornton <[email protected]> wrote:
> >>>>
> >>>>> Hi -
> >>>>>
> >>>>> Apache Flink has a graph API named Gelly...
> >>>>>
> >>>>> https://flink.apache.org/news/2015/08/24/introducing-flink-gelly.html
> >>>>>
> >>>>> ...and Flink's "dedicated support for iterative operations" should
> >>>>> pair well with Gremlin:
> >>>>>
> >>>>> https://flink.apache.org/features.html
> >>>>>
> >>>>> Has anyone dug into this yet?
> >>>>>
> >>>>> - James
> >>>>>
> >>>>> --
> >>>>> James Thornton, http://electricspeed.com
> >>>
> >>> --
> >>> James Thornton, http://electricspeed.com
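
For readers following the thread: Vasia's message describes Gelly's vertex-centric API as similar to the scatter-gather model and built on delta iterations, where only the part of the graph that changed is reprocessed in the next round. The sketch below illustrates that idea in plain Python. It is not Gelly's, Flink's, or TinkerPop's API. The function name `scatter_gather_sssp`, the choice of single-source shortest paths as the example, and the in-memory dict/set representation are all assumptions made for illustration. It also assumes non-negative edge weights and that every edge endpoint appears in `vertices`.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

# (source, target, weight); an assumed edge representation for this sketch.
Edge = Tuple[Hashable, Hashable, float]


def scatter_gather_sssp(
    vertices: Iterable[Hashable], edges: Iterable[Edge], source: Hashable
) -> Dict[Hashable, float]:
    """Single-source shortest paths written in a scatter-gather style.

    Hypothetical illustration only: 'dist' plays the role of the solution
    set and 'active' the role of the workset in a delta-style iteration.
    """
    out_edges: Dict[Hashable, List[Tuple[Hashable, float]]] = {v: [] for v in vertices}
    for src, dst, weight in edges:
        out_edges[src].append((dst, weight))

    dist: Dict[Hashable, float] = {v: float("inf") for v in out_edges}
    dist[source] = 0.0
    active: Set[Hashable] = {source}  # vertices whose value changed last superstep

    while active:
        # Scatter: every active vertex sends a candidate distance to each neighbor.
        inbox: Dict[Hashable, List[float]] = {}
        for v in active:
            for dst, weight in out_edges[v]:
                inbox.setdefault(dst, []).append(dist[v] + weight)

        # Gather: each receiver keeps the minimum message; only vertices
        # that actually improved stay active for the next superstep.
        active = set()
        for v, messages in inbox.items():
            best = min(messages)
            if best < dist[v]:
                dist[v] = best
                active.add(v)

    return dist


if __name__ == "__main__":
    example_edges = [(1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0)]
    print(scatter_gather_sssp([1, 2, 3, 4, 5], example_edges, source=1))
```

The loop ends when no vertex improves in a superstep, which is the property that makes delta-style iteration attractive for graph algorithms: later supersteps usually touch only a small part of the graph. How a FlinkGraphComputer would actually map TinkerPop vertex programs onto Flink's operators is not settled in this thread, so the sketch makes no claim about that.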
