GraphFrames is just a graph analytics/query engine, not a graph engine, which
GraphX used to be.

And I'm sorry to say, in fact it doesn’t fit most scenarios at all.

Enzo, I don’t think there is any roadmap for graph libraries in Spark for
now.

*Andy*


On Tue, Mar 14, 2017 at 7:28 AM, Tim Hunter <timhun...@databricks.com>
wrote:

> Hello Enzo,
>
> since this question is also relevant to Spark, I will answer it here. The
> goal of GraphFrames is to provide graph capabilities along with excellent
> integration to the rest of the Spark ecosystem (using modern APIs such as
> DataFrames). As you seem to be well aware, a large number of graph
> algorithms can be implemented in terms of a small subset of graph
> primitives. These graph primitives can be translated to Spark operations,
> but we feel that some important low-level optimizations should be added to
> the Catalyst engine in order to realize the true potential of GraphFrames.
> You can find a flavor of this work in this presentation by Ankur Dave [1].
> This is still an area of collaboration with the Spark core team, and we
> would like to merge GraphFrames in Spark 2.x eventually.
>
> Where does it leave us for the time being? GraphFrames is actively
> supported, and we implemented a highly scalable version of GraphFrames in
> November. As you mentioned, there are a number of distributed Graph
> frameworks out there, but to my knowledge they are not as easy to integrate
> with Spark. The current approach has been to reach parity with GraphX first
> and then add new algorithms based on popular demand. Along these lines,
> GraphBLAS could be added on top of it if someone is willing to step up.
>
> Tim
>
> [1] https://spark-summit.org/east-2016/events/graphframes-graph-queries-in-spark-sql/
>
> On Mon, Mar 13, 2017 at 2:58 PM, Nicholas Chammas <
> nicholas.cham...@gmail.com> wrote:
>
>> Since GraphFrames is not part of the Spark project, your
>> GraphFrames-specific questions are probably better directed at the
>> GraphFrames issue tracker:
>>
>> https://github.com/graphframes/graphframes/issues
>>
>> As far as I know, GraphFrames is an active project, though not as active
>> as Spark of course. There will be lulls in development since the people
>> driving that project forward also have major commitments to other projects.
>> This is natural.
>>
>> If you post on GitHub I would wager someone there (maybe Joseph or Tim
>> <https://github.com/graphframes/graphframes/graphs/contributors>?)
>> should be able to answer your questions about GraphFrames.
>>
>>
>>    1. The page you linked refers to a *plan* to move GraphFrames to the
>>    standard Spark release cycle. Is this *plan* publicly available /
>>    visible?
>>
>> I didn’t see any such reference to a plan in the page I linked you to.
>> Rather, the page says
>> <http://graphframes.github.io/#what-are-graphframes>:
>>
>> The current plan is to keep GraphFrames separate from core Apache Spark
>> for the time being.
>>
>> Nick
>>
>>
>> On Mon, Mar 13, 2017 at 5:46 PM enzo <e...@smartinsightsfromdata.com>
>> wrote:
>>
>>> Nick
>>>
>>> Thanks for the quick answer :)
>>>
>>> Sadly, the comment in the page doesn’t answer my questions. More
>>> specifically:
>>>
>>> 1. GraphFrames’ last activity on GitHub was 2 months ago.  The last release
>>> was on 12 Nov 2016.  Until recently, 2 months was close to a Spark release
>>> cycle.  Why has there been no major development since mid-November?
>>>
>>> 2. The page you linked refers to a *plan* to move GraphFrames to the
>>> standard Spark release cycle.  Is this *plan* publicly available / visible?
>>>
>>> 3. I couldn’t find any statement of intent to preserve one or the other
>>> API, or to merge them: in other words, there seems to be no
>>> overarching plan for a cohesive & comprehensive graph API (I apologise in
>>> advance if I’m wrong).
>>>
>>> 4. I was initially impressed by the GraphFrames syntax, in places similar
>>> to Neo4j Cypher (now open source), but later I understood it was an
>>> incomplete, lightweight experiment (with no intention to move to full
>>> compatibility, perhaps for good reasons).  To me it sort of gave the wrong
>>> message.
>>>
>>> 5. In the meantime the world of graphs is changing. The GraphBLAS forum
>>> seems to be gaining some traction: a library based on GraphBLAS has been
>>> made available on Accumulo (Graphulo).  Assuming that Spark is NOT going to
>>> adopt similar lines, nor follow DataStax with TinkerPop and Gremlin,
>>> again, what is the new, cohesive & comprehensive API that Spark is going
>>> to deliver?
>>>
>>>
>>> Sadly, the API uncertainty may push developers towards more stable
>>> APIs / platforms & roadmaps.
>>>
>>>
>>>
>>> Thanks Enzo
>>>
>>> On 13 Mar 2017, at 22:09, Nicholas Chammas <nicholas.cham...@gmail.com>
>>> wrote:
>>>
>>> Your question is answered here under "Will GraphFrames be part of Apache
>>> Spark?", no?
>>>
>>> http://graphframes.github.io/#what-are-graphframes
>>>
>>> Nick
>>>
>>> On Mon, Mar 13, 2017 at 4:56 PM enzo <e...@smartinsightsfromdata.com>
>>> wrote:
>>>
>>> Please see this email trail: no answer so far on the user@spark
>>> board.  Trying the developer board for better luck.
>>>
>>> The question:
>>>
>>> I am a bit confused by the current roadmap for graph and graph analytics
>>> in Apache Spark.
>>>
>>> I understand that we have had for some time two libraries (the following
>>> is my understanding - please amend as appropriate!):
>>>
>>> . GraphX, part of the Spark project.  This library is based on RDDs and
>>> is only accessible via Scala.  It doesn’t look like this library has been
>>> enhanced recently.
>>> . GraphFrames, independent (at the moment?) library for Spark.  This
>>> library is based on Spark DataFrames and accessible by Scala & Python. Last
>>> commit on GitHub was 2 months ago.
>>>
>>> GraphFrames came about with the promise of being integrated into
>>> Apache Spark at some point.
>>>
>>> I can see other projects coming up with interesting libraries and ideas
>>> (e.g. Graphulo on Accumulo, a new project with the goal of implementing
>>> the GraphBlas building blocks for graph algorithms on top of Accumulo).
>>>
>>> Where is Apache Spark going?
>>>
>>> Where are graph libraries in the roadmap?
>>>
>>>
>>>
>>> Thanks for any clarity brought to this matter.
>>>
>>> Thanks Enzo
>>>
>>> Begin forwarded message:
>>>
>>> *From: *"Md. Rezaul Karim" <rezaul.ka...@insight-centre.org>
>>> *Subject: **Re: Question on Spark's graph libraries*
>>> *Date: *10 March 2017 at 13:13:15 CET
>>> *To: *Robin East <robin.e...@xense.co.uk>
>>> *Cc: *enzo <e...@smartinsightsfromdata.com>, spark users <
>>> u...@spark.apache.org>
>>>
>>> +1
>>>
>>> Regards,
>>> _________________________________
>>> *Md. Rezaul Karim*, BSc, MSc
>>> PhD Researcher, INSIGHT Centre for Data Analytics
>>> National University of Ireland, Galway
>>> IDA Business Park, Dangan, Galway, Ireland
>>> Web: http://www.reza-analytics.eu/index.html
>>> <http://139.59.184.114/index.html>
>>>
>>> On 10 March 2017 at 12:10, Robin East <robin.e...@xense.co.uk> wrote:
>>>
>>> I would love to know the answer to that too.
>>> ------------------------------------------------------------
>>> -------------------
>>> Robin East
>>> *Spark GraphX in Action* Michael Malak and Robin East
>>> Manning Publications Co.
>>> http://www.manning.com/books/spark-graphx-in-action
>>>
>>>
>>>
>>>
>>>
>>> On 9 Mar 2017, at 17:42, enzo <e...@smartinsightsfromdata.com> wrote:
>>>
>>> I am a bit confused by the current roadmap for graph and graph analytics
>>> in Apache Spark.
>>>
>>> I understand that we have had for some time two libraries (the following
>>> is my understanding - please amend as appropriate!):
>>>
>>> . GraphX, part of the Spark project.  This library is based on RDDs and
>>> is only accessible via Scala.  It doesn’t look like this library has been
>>> enhanced recently.
>>> . GraphFrames, independent (at the moment?) library for Spark.  This
>>> library is based on Spark DataFrames and accessible by Scala & Python. Last
>>> commit on GitHub was 2 months ago.
>>>
>>> GraphFrames came about with the promise of being integrated into
>>> Apache Spark at some point.
>>>
>>> I can see other projects coming up with interesting libraries and ideas
>>> (e.g. Graphulo on Accumulo, a new project with the goal of implementing
>>> the GraphBlas building blocks for graph algorithms on top of Accumulo).
>>>
>>> Where is Apache Spark going?
>>>
>>> Where are graph libraries in the roadmap?
>>>
>>>
>>>
>>> Thanks for any clarity brought to this matter.
>>>
>>> Enzo
>>>
>>>
>>>
>>>
>>>
>>>
>
