So removing GraphX from Spark would not prevent GraphFrames from
continuing; they could pick up the GraphX source and incorporate it into
their project.

Twitter: https://twitter.com/holdenkarau
Fight Health Insurance: https://www.fighthealthinsurance.com/
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
Pronouns: she/her


On Sat, Oct 5, 2024 at 5:22 PM Russell Jurney <russell.jur...@gmail.com>
wrote:

> A lot of people like me use GraphFrames for its connected components
> implementation and its motif matching feature. I am willing to work on it
> to keep it alive. They did a 0.8.3 release not too long ago. Please keep
> GraphX alive.
>
> On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> I added the user list, as they may have a vested interest here and
>> hopefully can contribute.
>>
>> A few suggestions:
>>
>>
>>    1. Data-Driven Decision Making: Return to the core metrics: analyze
>>    usage trends, performance benchmarks, and the actual impact on businesses
>>    that rely on GraphX. Objectivity can be restored by letting data speak
>>    louder than opinions.
>>    2. Broaden the Discussion: Engage more stakeholders from diverse
>>    backgrounds (especially Spark users) to bring in new perspectives and
>>    counterbalance the more vocal but potentially narrow interests of core
>>    maintainers or open-source contributors.
>>    3. Define Clear Criteria for Decision Making: Agree on a set of
>>    objective criteria by which the project’s future will be judged. These
>>    could include market demand, contribution levels, maintenance costs,
>>    alternative solutions, and alignment with the overall Spark ecosystem
>>    goals. Some have already been covered.
>>    4. Timely Conclusion of Discussions: Set a timeline for making a
>>    decision. Long, open-ended discussions tend to lose focus. Putting
>>    deadlines forces participants to focus on key issues and prevents endless
>>    debates.
>>    5. Borrowing from commercial settings, it is often necessary for a
>>    strong leadership team to step in and make the final decision after
>>    considering the input. When the objectivity of discussions starts to wane,
>>    leadership needs to cut through the circular discussions and steer towards
>>    action based on business and technical realities.
>>
>>
>> HTH
>>
>> Mich Talebzadeh,
>>
>> Architect | Data Engineer | Data Science | Financial Crime
>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> London, United Kingdom
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed. It is essential to note
>> that, as with any advice, "one test result is worth one thousand
>> expert opinions" (Wernher von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>>
>>
>> On Sat, 5 Oct 2024 at 06:26, Ángel <angel.alvarez.pas...@gmail.com>
>> wrote:
>>
>>> I completely agree with everyone here. I don’t think the issue is
>>> deprecating it; to me, the problem lies in not providing a new and better
>>> solution for handling graphs in Spark. In the past, I used GraphX via
>>> GraphFrames for record linkage, and I found it both useful and effective.
>>> Is there any discussion about a potential replacement?
>>>
>>> I’d be willing to help maintain GraphX, though I don’t have previous
>>> experience with maintaining open-source projects. All I can promise is good
>>> intentions, willingness to learn and lots of energy and passion. Is that
>>> enough?
>>>
>>> Btw, what's your take on this?
>>>
>>>
>>>    - GraphX will be deprecated in favor of a new graphing component,
>>>      SparkGraph, based on Cypher
>>>      <https://neo4j.com/developer/cypher-query-language/>, a much richer
>>>      graph language than previously offered by GraphX.
>>>
>>>
>>>
>>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>>
>>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra (<markhams...@gmail.com>)
>>> wrote:
>>>
>>>> As I wrote to Holden privately, I might well change my vote to be in
>>>> favor of a deprecation label combined with some effective means of
>>>> communicating that this doesn't mean the end for GraphX if interested
>>>> contributors come forward to rescue it. I don't like either the idea
>>>> of keeping unmaintained code and public APIs around (especially if
>>>> there are problems with them) or the idea of removing Spark
>>>> functionality just because no one has contributed to it for a while. A
>>>> naked deprecation label feels somewhat drastic and pre-emptive to me.
>>>> I don't expect that GraphX will be the last part of Spark to run the
>>>> risk of death through neglect, and I think we need an effective means
>>>> of encouraging resuscitation that a deprecation label on its own does
>>>> not provide. On the other hand, if no one really is willing to come to
>>>> the aid of GraphX or other neglected functionality given adequate
>>>> warning of possible removal, I'm not then opposed to the usual
>>>> deprecation and removal process.
>>>>
>>>>
>>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>> >
>>>> > This is a reasonable discussion, but maybe the more practical point
>>>> > is: are you sure you want to block this unilaterally? This effectively
>>>> > makes a decision that GraphX cannot be removed for a long while. I'd
>>>> > understand it more if we had an active maintainer and/or active user
>>>> > proposing to veto, but my understanding is this is just a proposal to block
>>>> > this on behalf of some users, someone else who might do some work and
>>>> > hasn't to date for some reason. Add to that the fact that the 'pro'
>>>> > arguments all seem to be arguments for working on GraphFrames, and I find
>>>> > this somewhat drastic.
>>>> >
>>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <markhams...@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> "You can't say nothing is removable until there are no users."
>>>> >>
>>>> >> That is not what I am saying. Rather, I am countering what others seem
>>>> >> to be suggesting: There are no users and no interest, therefore we can
>>>> >> and should deprecate.
>>>> >>
>>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>> >> >
>>>> >> > I could flip this argument around. More strongly, not being
>>>> >> > deprecated means "won't be removed" and likewise implies support and
>>>> >> > development. I don't think either of the latter have been true for years.
>>>> >> > What suggests this will change? A todo list is not going to do anything,
>>>> >> > IMHO.
>>>> >> >
>>>> >> > I'm also concerned about the cost of that, which I have observed.
>>>> >> > GraphX PRs are almost certainly not going to be reviewed because of its
>>>> >> > state. Deprecation both communicates that reality, and leaves an option
>>>> >> > open, whereas not deprecating forecloses that option for a while.
>>>> >> >
>>>> >> > I don't think the question is "does anyone use it?", because anyone
>>>> >> > can continue to use it -- in Spark 3.x for sure, and in 4.x if not removed.
>>>> >> > You can't say nothing is removable until there are no users.
>>>> >> >
>>>> >> > Also, why would GraphFrames not be the logical home of this going
>>>> >> > forward anyway? which I think is the subtext.
>>>> >> >
>>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <markhams...@gmail.com>
>>>> >> > wrote:
>>>> >> >>
>>>> >> >> I'm -1(*) because, while it technically means "might be removed in the
>>>> >> >> future", I think developers and users are more prone to interpret
>>>> >> >> something being marked as deprecated as "very likely will be removed
>>>> >> >> in the future, so don't depend on this or waste your time contributing
>>>> >> >> to its further development." I don't think the latter is what we want
>>>> >> >> just because something hasn't been updated meaningfully in a while.
>>>> >> >> There have been How To articles for GraphX and GraphFrames posted in
>>>> >> >> the not too distant past, and the Google Search trend shows a pretty
>>>> >> >> steady level of interest, not a decline to zero, so I don't think that
>>>> >> >> it is accurate to declare that there is no use or interest in GraphX.
>>>> >> >>
>>>> >> >> Unless retaining GraphX is imposing significant costs on continuing
>>>> >> >> Spark development, I can't support deprecating GraphX. I can support
>>>> >> >> encouraging GraphX and GraphFrames development through something like
>>>> >> >> a To Do list or document of "What we'd like to see in the way of
>>>> >> >> further development of Spark's graph processing capabilities" -- i.e.,
>>>> >> >> things that encourage and support new contributions to address any
>>>> >> >> shortcomings in Spark's graph processing, not things that discourage
>>>> >> >> contributions and use in the way that I believe simply declaring
>>>> >> >> GraphX to be deprecated would.
>>>> >> >>
>>>> >> >>
>>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau
>>>> >> >> <holden.ka...@gmail.com> wrote:
>>>> >> >> >
>>>> >> >> > Since we're getting close to cutting a 4.0 branch, I'd like to
>>>> >> >> > float the idea of officially deprecating GraphX. What that would mean (to
>>>> >> >> > me) is we would update the docs to indicate that GraphX is deprecated and
>>>> >> >> > its APIs may be removed at any time in the future.
>>>> >> >> >
>>>> >> >> > Alternatively, we could mark it as "unmaintained and in search
>>>> >> >> > of maintainers" with a note that if no maintainers are found, we may remove
>>>> >> >> > it in a future minor version.
>>>> >> >> >
>>>> >> >> > Looking at the GraphX source, I don't see any meaningful
>>>> >> >> > active development going back over three years*. There is even a thread on
>>>> >> >> > user@ from 2017 asking if GraphX is maintained anymore, with no
>>>> >> >> > response from the developers.
>>>> >> >> >
>>>> >> >> > Now I'm open to the idea that GraphX is stable and "works as
>>>> >> >> > is" and simply doesn't require modifications, but given the user thread I'm
>>>> >> >> > a little concerned here about bringing this API with us into Spark 4 if we
>>>> >> >> > don't have anyone signed up to maintain it.
>>>> >> >> >
>>>> >> >> > * Excluding globally applied changes
>>>> >> >> > --
>>>> >> >> > Twitter: https://twitter.com/holdenkarau
>>>> >> >> > Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>> >> >> > Books (Learning Spark, High Performance Spark, etc.):
>>>> >> >> > https://amzn.to/2MaRAG9
>>>> >> >> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> >> >> > Pronouns: she/her
>>>> >> >>
>>>> >> >>
>>>> >> >> ---------------------------------------------------------------------
>>>> >> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>> >> >>
>>>>
>>>>
>>>>
