Vasia Kalavri: Gelly: Large-scale graph analysis with Apache Flink

https://youtu.be/-tFzG2dzJXw
On Nov 30, 2015 12:49 PM, "Marko Rodriguez" <[email protected]> wrote:

> Hi Vasia (everyone),
>
> Does Flink have a graph query language? If not, then with a
> FlinkGraphComputer implementation, Flink could ship with Gremlin support.
>
> If you have the time, please read the following blog post as it will help
> explain our approach and how Flink could benefit from it:
>
> http://www.datastax.com/dev/blog/the-benefits-of-the-gremlin-graph-traversal-machine
>
> In short, if Flink provides a FlinkGraphComputer implementation, then the
> Gremlin virtual machine will work over Flink and any language that compiles
> to the Gremlin virtual machine will thus work over Flink.
>
> If you would like to see a demo of TinkerPop with, for example, Spark or
> Giraph, I'd be more than happy to do a Google Hangout session with you (< 1
> hour) so you can better understand the breadth of the work we are doing and
> how it can benefit your efforts.
>
> Thanks Vasia,
> Marko.
>
> http://markorodriguez.com
>
> On Nov 27, 2015, at 5:27 AM, Stephen Mallette <[email protected]>
> wrote:
>
> > Hi Vasia, I had started tinkering on it in my spare time in a separate
> > repo. There really isn't much to collaborate on at this point. I was
> > mostly trying to understand the parallels between Flink and Spark so that I
> > could understand how a FlinkGraphComputer could be implemented given what
> > I'd seen of the Spark implementation Marko did. I had expected to
> > contribute the work to Flink (rather than keep it here on the TinkerPop
> > side). Anyway, not much else to offer - Marko can probably get you running
> > much faster than I can, as that area is where he holds the most expertise.
> > You should probably keep an eye out for his comments.
> >
> >
> >
> > On Wed, Nov 25, 2015 at 11:38 AM, Vasiliki Kalavri <[email protected]>
> > wrote:
> >
> >> Hi James and TinkerPop community,
> >>
> >> thanks a lot for starting this discussion!
> >> I am Vasia, Apache Flink PMC member and core Gelly developer. Nice to meet
> >> you ;)
> >>
> >> I'm only starting to get familiar with the TinkerPop project, but it seems
> >> that it can play well with Flink.
> >> As you already noticed, a FlinkGraphComputer should be straightforward to
> >> implement. Gelly has a vertex-centric API that is similar to the
> >> scatter-gather model [1] and a gather-sum-apply API [2] that is closer to
> >> the PowerGraph model. These are built on top of Flink's delta iteration
> >> operators [3], which are more generic and could also be used directly for
> >> the FlinkGraphComputer, if the existing Gelly abstractions won't work.
> >>
> >> Regarding the difference between stream and batch in Flink: Flink is a
> >> streaming dataflow engine, on top of which you can run both streaming and
> >> batch jobs. Flink simply treats a batch job as a job operating on a
> >> finite stream. Accordingly, Flink has a streaming API and a batch API.
> >> Gelly is currently built on top of the batch API, i.e. the DataSet API.
> >>
> >> James mentioned in the Flink mailing list that someone has already started
> >> working on a FlinkGraphComputer. Is there a JIRA for this? Let me know if
> >> you have questions or you think I can help in some way!
> >>
> >> Cheers,
> >> -Vasia.
> >>
> >> [1]:
> >> https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#vertex-centric-iterations
> >> [2]:
> >> https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#gather-sum-apply-iterations
> >> [3]:
> >> https://ci.apache.org/projects/flink/flink-docs-master/apis/iterations.html#delta-iterate-operator
> >>
> >> On 25 November 2015 at 17:05, James Thornton <[email protected]>
> >> wrote:
> >>
> >>> Hi Vasia -
> >>>
> >>> Welcome to TinkerPop (linking you into the Flink thread as requested)...
> >>>
> >>> - James
> >>>
> >>> On Mon, Nov 23, 2015 at 10:01 AM, Marko Rodriguez <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi James,
> >>>>
> >>>> Thank you for always having an ear to the tech pulse. If it wasn't for
> >>>> you, I would still be excited about XMPP and would be programming in
> >>>> Tcl/Tk.
> >>>>
> >>>> Given my 20 minute review of their docs... It would be cool if, like the
> >>>> "Table API," they also had a "Graph API" that was just TinkerPop
> >>>> Graph/Vertex/Edge. That could be super intrusive, so as a simple step --
> >>>> they already have a "vertex-centric" API and thus, having a
> >>>> FlinkGraphComputer implementation seems "easy." Then from there, Gremlin
> >>>> should just work. I don't really understand the difference between stream
> >>>> and batch unless they are talking about the difference between "Storm"
> >>>> and "MapReduce"? Would be cool to see how TinkerPop fits into the
> >>>> stream-scene.
> >>>>
> >>>> Next, their fluent API is similar to Spark's and I would argue that
> >>>> Gremlin's API is much nicer than just low-level primitives like map(),
> >>>> flatMap(), etc. Thus, they could really benefit from having a full graph
> >>>> query language already available for their users. (As a side note, it's
> >>>> really nice to see more and more systems use functional/fluent APIs, as
> >>>> this really trains the next generation to think like this, which is
> >>>> important as Gremlin is purely this! Hopefully the SQL model of querying
> >>>> starts to look odd to people in comparison.)
> >>>>
> >>>> I just sent out this tweet:
> >>>>        https://twitter.com/apachetinkerpop/status/668820458599530497
> >>>>
> >>>> If they seem positive, I can detail in JIRA what would be required for
> >>>> them to have TinkerPop-support.
> >>>>
> >>>> Thanks again James,
> >>>> Marko.
> >>>>
> >>>> http://markorodriguez.com
> >>>>
> >>>> On Nov 19, 2015, at 12:19 PM, James Thornton <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi -
> >>>>>
> >>>>> Apache Flink has a graph API named Gelly...
> >>>>>
> >>>>>
> >>>>> https://flink.apache.org/news/2015/08/24/introducing-flink-gelly.html
> >>>>>
> >>>>> ...and Flink's "dedicated support for iterative operations" should pair
> >>>>> well with Gremlin:
> >>>>>
> >>>>> https://flink.apache.org/features.html
> >>>>>
> >>>>> Has anyone dug into this yet?
> >>>>>
> >>>>> - James
> >>>>>
> >>>>>
> >>>>> --
> >>>>> James Thornton, http://electricspeed.com
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> James Thornton, http://electricspeed.com
> >>>
> >>
>
>
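
For anyone following the thread who has not used a vertex-centric API, the scatter-gather model Vasia refers to can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses neither Gelly nor TinkerPop, and the function name `scatter_gather_sssp`, its parameters, and the toy graph are all hypothetical. It computes single-source shortest paths, and only vertices whose value changed in the previous superstep send messages, which is loosely analogous to the workset of a delta iteration.

```python
import math


def scatter_gather_sssp(vertices, edges, source, max_supersteps=20):
    """Single-source shortest paths in a scatter-gather style.

    Hypothetical helper for illustration only; not part of Gelly or TinkerPop.
    `edges` is a list of (src, dst, weight) tuples for a directed graph.
    """
    distance = {v: math.inf for v in vertices}
    distance[source] = 0.0
    active = {source}  # vertices whose value changed in the last superstep

    for _ in range(max_supersteps):
        if not active:
            break  # converged: nobody has anything new to send

        # Scatter: every active vertex sends a candidate distance
        # along each of its outgoing edges.
        inbox = {}
        for src, dst, weight in edges:
            if src in active:
                inbox.setdefault(dst, []).append(distance[src] + weight)

        # Gather: every vertex that received messages keeps the minimum
        # and stays active only if its value actually improved.
        active = set()
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < distance[vertex]:
                distance[vertex] = best
                active.add(vertex)

    return distance


# Toy graph: vertex "e" is unreachable from "a".
edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
result = scatter_gather_sssp(["a", "b", "c", "d", "e"], edges, "a")
print(result)
```

A FlinkGraphComputer would presumably have to map TinkerPop's vertex programs onto a loop of this shape, but how that mapping would look in practice is not settled anywhere in this thread.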
