Thanks, I'm working on SPARK-42856 but the tests fail due to formatting
issues - confusing as I ran scalafmt. Working on it...

Russ

On Sun, Nov 17, 2024 at 7:05 PM Xiao Li <lix...@databricks.com> wrote:

> Hi, Russell,
>
>
> After reviewing the JIRAs, it seems that only SPARK-42856 is directly
> relevant to GraphX. While the other three JIRAs mention GraphX in their
> descriptions, they appear to be more related to the build or the REPL
> rather than GraphX itself.
>
> Thanks,
>
> Xiao
>
>
>
>
>
>
> On Nov 16, 2024 at 5:39:27 PM, Russell Jurney <russell.jur...@gmail.com>
> wrote:
>
>> Scratch that, there appear to be... 4 unfixed bugs for GraphX
>> outstanding? :)
>> https://issues.apache.org/jira/browse/SPARK-42856?jql=project%20%3D%20SPARK%20AND%20issuetype%20%3D%20Bug%20AND%20status%20%3D%20Open%20AND%20text%20~%20%22graphx%22
>>
>> On Sat, Nov 16, 2024 at 5:23 PM Russell Jurney <russell.jur...@gmail.com>
>> wrote:
>>
>>> I'm looking at Spark's JIRA on a search for GraphX and I thought I would
>>> ask rather than just slog through it: anyone got some low hanging fruit
>>> bugs they can suggest I fix?
>>>
>>> Thanks,
>>> Russell
>>>
>>> On Thu, Nov 14, 2024 at 11:49 AM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> + 1
>>>>
>>>> Mich Talebzadeh,
>>>>
>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>> London, United Kingdom
>>>>
>>>>
>>>>    view my Linkedin profile
>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>>
>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>
>>>>
>>>>
>>>> *Disclaimer:* The information provided is correct to the best of my
>>>> knowledge but of course cannot be guaranteed . It is essential to note
>>>> that, as with any advice, quote "one test result is worth one-thousand
>>>> expert opinions (Werner
>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>
>>>>
>>>> On Thu, 14 Nov 2024 at 18:52, Russell Jurney <russell.jur...@gmail.com>
>>>> wrote:
>>>>
>>>>> Okay, first I’m going to fix a bug or two, then I’ll get started on an SPIP.
>>>>>
>>>>> Russ
>>>>>
>>>>> On Wed, Nov 13, 2024 at 1:56 PM Mich Talebzadeh <
>>>>> mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hm. Since it sounds like a plan, Russell, why don't you go ahead and
>>>>>> create a SPIP for it? Then this discussion takes a formal approach and is
>>>>>> documented. Otherwise we are just flogging a dead horse, so to speak.
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> Mich Talebzadeh,
>>>>>>
>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>>>>> College London
>>>>>> <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>> London, United Kingdom
>>>>>>
>>>>>>
>>>>>>    view my Linkedin profile
>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>
>>>>>>
>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Disclaimer:* The information provided is correct to the best of my
>>>>>> knowledge but of course cannot be guaranteed . It is essential to note
>>>>>> that, as with any advice, quote "one test result is worth one-thousand
>>>>>> expert opinions (Werner
>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>
>>>>>>
>>>>>> On Wed, 13 Nov 2024 at 20:10, Russell Jurney <
>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>
>>>>>>> It might be, but graph processing is a desirable, very useful
>>>>>>> feature of Spark. GraphX doesn't see more popularity because it never
>>>>>>> got a DataFrame interface. If someone is willing to add one and
>>>>>>> maintain it, that seems best of all.
>>>>>>>
>>>>>>> Russ
>>>>>>>
>>>>>>> On Wed, Nov 13, 2024 at 7:12 AM Ángel <
>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Seems to me.... it would be easier to move GraphX to graphframes
>>>>>>>> than the opposite.
>>>>>>>>
>>>>>>>> On Tue, Oct 8, 2024 at 21:52, Reynold Xin
>>>>>>>> (<r...@databricks.com.invalid>) wrote:
>>>>>>>>
>>>>>>>>> We can also consider the following: move GraphFrame into Spark,
>>>>>>>>> and make GraphX an internal impl detail of GraphFrame. Then we can
>>>>>>>>> over time change the implementation, simplify it (not sure if it is
>>>>>>>>> possible, but somebody can look into it)....
>>>>>>>>>
>>>>>>>>> On Mon, Oct 7, 2024 at 7:04 PM Russell Jurney <
>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Took a look at recent activity. Spark 3.5 support
>>>>>>>>>> <https://github.com/graphframes/graphframes/commit/e54f249605dde60787f9b41b88ed7d5872b7dfab>
>>>>>>>>>> was added a year ago. I'm sure we'll add Spark 4 support as soon as
>>>>>>>>>> it is out.
>>>>>>>>>>
>>>>>>>>>> There is a new issue to organize a GraphFrames Hackathon
>>>>>>>>>> <https://github.com/graphframes/graphframes/issues/460>. Please
>>>>>>>>>> sign up to help!
>>>>>>>>>> https://github.com/graphframes/graphframes/issues/460
>>>>>>>>>>
>>>>>>>>>> I seriously need GraphX and GraphFrames to make it... I have no
>>>>>>>>>> other way of doing property graph motif matching on large graphs.
>>>>>>>>>> It's kind of important to me.
>>>>>>>>>>
>>>>>>>>>> Some slides on my work with GraphFrames:
>>>>>>>>>>
>>>>>>>>>> [five slide images: image.png]
>>>>>>>>>>
>>>>>>>>>> Russell
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 7, 2024 at 6:06 PM Holden Karau <
>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> That’s awesome!
>>>>>>>>>>>
>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>>>>>>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:42 PM Russell Jurney <
>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I’ll organize a hackathon. A friend wants to finish the
>>>>>>>>>>>> implementation of Lucian modularity for GraphFrames. I’ll fix some
>>>>>>>>>>>> GraphX bugs at it.
>>>>>>>>>>>>
>>>>>>>>>>>> I did just blog all about the motif matching in GraphFrames:
>>>>>>>>>>>>
>>>>>>>>>>>> https://blog.graphlet.ai/financial-crime-and-corruption-network-motifs-4cf2e8e10eb5
>>>>>>>>>>>>
>>>>>>>>>>>> Russ
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:38 PM Holden Karau <
>>>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> So this discuss thread and the vote thread to deprecate (to
>>>>>>>>>>>>> leave the option of removing it during 4.X) are probably the
>>>>>>>>>>>>> highest profile it’s been in years.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the past, for parts of Spark I’ve cared about, I’ve organized
>>>>>>>>>>>>> virtual meetings to coordinate work. If you’re connected with
>>>>>>>>>>>>> some of the Spark+Graph community, reaching out to find others and
>>>>>>>>>>>>> organizing a meeting could be a way to raise the profile a bit?
>>>>>>>>>>>>> Maybe organize a virtual hackathon (I’m meaning to try this for
>>>>>>>>>>>>> some other things, so happy to share what I learn from doing that)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>>>>>>>>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:02 PM Russell Jurney <
>>>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I’ll look for a bug to fix. If GraphX is outside of Spark,
>>>>>>>>>>>>>> Spark would tend to break GraphFrames and it will be burdensome
>>>>>>>>>>>>>> on an external project to keep up. Graph computing on Spark is
>>>>>>>>>>>>>> important to a lot of people. Is there a way to raise visibility
>>>>>>>>>>>>>> here?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 4:24 PM Holden Karau <
>>>>>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are no specific tickets associated with the lack of
>>>>>>>>>>>>>>> maintenance, as the component has not been maintained for a
>>>>>>>>>>>>>>> sufficiently long time. If you’re interested in taking it on,
>>>>>>>>>>>>>>> that’s wonderful; fixing some bugs could be a great place to
>>>>>>>>>>>>>>> start and figure out if it’s something you want to do long term.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would recommend making a first bug fix in an actively
>>>>>>>>>>>>>>> maintained area of Spark to get to know some reviewers, since
>>>>>>>>>>>>>>> there is not anyone tracking the GraphX PRs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As a note, I don’t think GraphX is required for GraphFrames
>>>>>>>>>>>>>>> long term, so another option would be to talk to the
>>>>>>>>>>>>>>> GraphFrames folks and move the GraphX code over to it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ideally we’d have someone willing to act as a mentor or
>>>>>>>>>>>>>>> guide, but so far we have no volunteers (especially no one
>>>>>>>>>>>>>>> familiar with the GraphX code).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>>> Fight Health Insurance:
>>>>>>>>>>>>>>> https://www.fighthealthinsurance.com/
>>>>>>>>>>>>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>>>>>>>>>> YouTube Live Streams:
>>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 3:25 PM Russell Jurney <
>>>>>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I volunteer to maintain GraphX to keep GraphFrames a viable
>>>>>>>>>>>>>>>> project. I don’t have a clear view on whether it works with
>>>>>>>>>>>>>>>> Spark 4 or if it needs updates. I don’t have Spark commits, but
>>>>>>>>>>>>>>>> I’m a committer on Apache DataFu and mentored the Spark feature
>>>>>>>>>>>>>>>> for it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can someone tell me what is involved? Point me at a ticket?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Russell
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 12:11 AM Erik Eklund <
>>>>>>>>>>>>>>>> eekl...@definitivehc.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>> We rely on GraphX for an important component of our
>>>>>>>>>>>>>>>>> product. And we really want it to stay a typed interface.
>>>>>>>>>>>>>>>>> Please keep GraphX.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Erik
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *From: *Holden Karau <holden.ka...@gmail.com>
>>>>>>>>>>>>>>>>> *Date: *Sunday, October 6, 2024 at 06:22
>>>>>>>>>>>>>>>>> *To: *Ángel <angel.alvarez.pas...@gmail.com>
>>>>>>>>>>>>>>>>> *Cc: *Russell Jurney <russell.jur...@gmail.com>, Mich
>>>>>>>>>>>>>>>>> Talebzadeh <mich.talebza...@gmail.com>, Spark dev list <
>>>>>>>>>>>>>>>>> dev@spark.apache.org>, user @spark <u...@spark.apache.org>
>>>>>>>>>>>>>>>>> *Subject: *Re: [DISCUSS] Deprecate GraphX OR Find new
>>>>>>>>>>>>>>>>> maintainers interested in GraphX OR leave it as is?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So are there companies using it? And are they willing to
>>>>>>>>>>>>>>>>> contribute to maintaining it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fight Health Insurance:
>>>>>>>>>>>>>>>>> https://www.fighthealthinsurance.com/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>>>>>> https://amzn.to/2MaRAG9
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> YouTube Live Streams:
>>>>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 9:17 PM Ángel <
>>>>>>>>>>>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That would definitely affect companies using GraphX, but
>>>>>>>>>>>>>>>>> at least they’d have the choice to migrate their code.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think that’s probably the way to go.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Oct 6, 2024 at 6:09, Holden Karau (<
>>>>>>>>>>>>>>>>> holden.ka...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So removing GraphX from Spark would not prevent
>>>>>>>>>>>>>>>>> GraphFrames from continuing; they could pick up the GraphX
>>>>>>>>>>>>>>>>> source and incorporate it into their project.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fight Health Insurance:
>>>>>>>>>>>>>>>>> https://www.fighthealthinsurance.com/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>>>>>> https://amzn.to/2MaRAG9
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> YouTube Live Streams:
>>>>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 5:22 PM Russell Jurney <
>>>>>>>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A lot of people like me use GraphFrames for its connected
>>>>>>>>>>>>>>>>> components implementation and its motif matching feature. I am
>>>>>>>>>>>>>>>>> willing to work on it to keep it alive. They did a 0.8.3
>>>>>>>>>>>>>>>>> release not too long ago. Please keep GraphX alive.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <
>>>>>>>>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I added the user list as they may have a vested
>>>>>>>>>>>>>>>>> interest here and hopefully can contribute.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A few suggestions:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    1. Data-Driven Decision Making: Return to the core metrics
>>>>>>>>>>>>>>>>>    and analyze usage trends, performance benchmarks, and the
>>>>>>>>>>>>>>>>>    actual impact on businesses that rely on GraphX. Objectivity
>>>>>>>>>>>>>>>>>    can be restored by letting data speak louder than opinions,
>>>>>>>>>>>>>>>>>    so to speak.
>>>>>>>>>>>>>>>>>    2. Broaden the Discussion: Engage more stakeholders from
>>>>>>>>>>>>>>>>>    diverse backgrounds (especially Spark users) to bring in new
>>>>>>>>>>>>>>>>>    perspectives and counterbalance the more vocal but
>>>>>>>>>>>>>>>>>    potentially narrow interests of core maintainers or
>>>>>>>>>>>>>>>>>    open-source contributors.
>>>>>>>>>>>>>>>>>    3. Define Clear Criteria for Decision Making: Agree on a set
>>>>>>>>>>>>>>>>>    of objective criteria by which the project’s future will be
>>>>>>>>>>>>>>>>>    judged. These could include market demand, contribution
>>>>>>>>>>>>>>>>>    levels, maintenance costs, alternative solutions, and
>>>>>>>>>>>>>>>>>    alignment with the overall Spark ecosystem goals. Some have
>>>>>>>>>>>>>>>>>    already been covered.
>>>>>>>>>>>>>>>>>    4. Timely Conclusion of Discussions: Set a timeline for
>>>>>>>>>>>>>>>>>    making a decision. Long, open-ended discussions tend to lose
>>>>>>>>>>>>>>>>>    focus. Putting deadlines forces participants to focus on key
>>>>>>>>>>>>>>>>>    issues and prevents endless debates.
>>>>>>>>>>>>>>>>>    5. Borrowing from commercial settings, it is often necessary
>>>>>>>>>>>>>>>>>    for a strong leadership team to step in and make the final
>>>>>>>>>>>>>>>>>    decision after considering the input. When the objectivity
>>>>>>>>>>>>>>>>>    of discussions starts to wane, leadership needs to cut
>>>>>>>>>>>>>>>>>    through the round discussions and steer towards action based
>>>>>>>>>>>>>>>>>    on business and technical realities.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> HTH
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Mich Talebzadeh,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> 
>>>>>>>>>>>>>>>>> Imperial
>>>>>>>>>>>>>>>>> College London
>>>>>>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> London, United Kingdom
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *Disclaimer:* The information provided is correct to the
>>>>>>>>>>>>>>>>> best of my knowledge but of course cannot be guaranteed . It 
>>>>>>>>>>>>>>>>> is essential
>>>>>>>>>>>>>>>>> to note that, as with any advice, quote "one test result is
>>>>>>>>>>>>>>>>> worth one-thousand expert opinions (Werner
>>>>>>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>>>>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, 5 Oct 2024 at 06:26, Ángel <
>>>>>>>>>>>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I completely agree with everyone here. I don’t think the
>>>>>>>>>>>>>>>>> issue is deprecating it; to me, the problem lies in not
>>>>>>>>>>>>>>>>> providing a new and better solution for handling graphs in
>>>>>>>>>>>>>>>>> Spark. In the past, I used GraphX via GraphFrames for record
>>>>>>>>>>>>>>>>> linkage, and I found it both useful and effective. Is there
>>>>>>>>>>>>>>>>> any discussion about a potential replacement?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I’d be willing to help maintain GraphX, though I don’t
>>>>>>>>>>>>>>>>> have previous experience with maintaining open-source
>>>>>>>>>>>>>>>>> projects. All I can promise is good intentions, willingness to
>>>>>>>>>>>>>>>>> learn and lots of energy and passion. Is that enough?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Btw, what's your take on this?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> · *GraphX* will be deprecated in favor of a new
>>>>>>>>>>>>>>>>> graphing component, SparkGraph, based on Cypher
>>>>>>>>>>>>>>>>> <https://neo4j.com/developer/cypher-query-language/>,
>>>>>>>>>>>>>>>>> a much richer graph language than previously offered by GraphX.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra (<
>>>>>>>>>>>>>>>>> markhams...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As I wrote to Holden privately, I might well change my vote to be in
>>>>>>>>>>>>>>>>> favor of a deprecation label combined with some effective means of
>>>>>>>>>>>>>>>>> communicating that this doesn't mean the end for GraphX if interested
>>>>>>>>>>>>>>>>> contributors come forward to rescue it. I don't like either the idea
>>>>>>>>>>>>>>>>> of keeping unmaintained code and public APIs around (especially if
>>>>>>>>>>>>>>>>> there are problems with them) or the idea of removing Spark
>>>>>>>>>>>>>>>>> functionality just because no one has contributed to it for a while. A
>>>>>>>>>>>>>>>>> naked deprecation label feels somewhat drastic and pre-emptive to me.
>>>>>>>>>>>>>>>>> I don't expect that GraphX will be the last part of Spark to run the
>>>>>>>>>>>>>>>>> risk of death through neglect, and I think we need an effective means
>>>>>>>>>>>>>>>>> of encouraging resuscitation that a deprecation label on its own does
>>>>>>>>>>>>>>>>> not provide. On the other hand, if no one really is willing to come to
>>>>>>>>>>>>>>>>> the aid of GraphX or other neglected functionality given adequate
>>>>>>>>>>>>>>>>> warning of possible removal, I'm not then opposed to the usual
>>>>>>>>>>>>>>>>> deprecation and removal process.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > This is a reasonable discussion, but maybe the more practical point
>>>>>>>>>>>>>>>>> > is: are you sure you want to block this unilaterally? This effectively
>>>>>>>>>>>>>>>>> > makes a decision that GraphX cannot be removed for a long while. I'd
>>>>>>>>>>>>>>>>> > understand it more if we had an active maintainer and/or active user
>>>>>>>>>>>>>>>>> > proposing to veto, but my understanding is this is just a proposal to
>>>>>>>>>>>>>>>>> > block this on behalf of some users, someone else who might do some work
>>>>>>>>>>>>>>>>> > and hasn't to date for some reason. Add to that the fact that the 'pro'
>>>>>>>>>>>>>>>>> > arguments all seem to be arguments for working on GraphFrames, and I
>>>>>>>>>>>>>>>>> > find this somewhat drastic.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <
>>>>>>>>>>>>>>>>> markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> "You can't say nothing is removable until there are no users."
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> That is not what I am saying. Rather, I am countering what others seem
>>>>>>>>>>>>>>>>> >> to be suggesting: There are no users and no interest, therefore we can
>>>>>>>>>>>>>>>>> >> and should deprecate.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <
>>>>>>>>>>>>>>>>> sro...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I could flip this argument around. More strongly, not being
>>>>>>>>>>>>>>>>> >> > deprecated means "won't be removed" and likewise implies support and
>>>>>>>>>>>>>>>>> >> > development. I don't think either of the latter have been true for
>>>>>>>>>>>>>>>>> >> > years. What suggests this will change? A todo list is not going to do
>>>>>>>>>>>>>>>>> >> > anything, IMHO.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I'm also concerned about the cost of that, which I have observed.
>>>>>>>>>>>>>>>>> >> > GraphX PRs are almost certainly not going to be reviewed because of
>>>>>>>>>>>>>>>>> >> > its state. Deprecation both communicates that reality, and leaves an
>>>>>>>>>>>>>>>>> >> > option open, whereas not deprecating forecloses that option for a
>>>>>>>>>>>>>>>>> >> > while.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I don't think the question is, does anyone use it? because anyone can
>>>>>>>>>>>>>>>>> >> > continue to use it -- in Spark 3.x for sure, and in 4.x if not removed.
>>>>>>>>>>>>>>>>> >> > You can't say nothing is removable until there are no users.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > Also, why would GraphFrames not be the logical home of this going
>>>>>>>>>>>>>>>>> >> > forward anyway? which I think is the subtext.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <
>>>>>>>>>>>>>>>>> markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> I'm -1(*) because, while it technically means "might be removed in the
>>>>>>>>>>>>>>>>> >> >> future", I think developers and users are more prone to interpret
>>>>>>>>>>>>>>>>> >> >> something being marked as deprecated as "very likely will be removed
>>>>>>>>>>>>>>>>> >> >> in the future, so don't depend on this or waste your time contributing
>>>>>>>>>>>>>>>>> >> >> to its further development." I don't think the latter is what we want
>>>>>>>>>>>>>>>>> >> >> just because something hasn't been updated meaningfully in a while.
>>>>>>>>>>>>>>>>> >> >> There have been How To articles for GraphX and Graph Frames posted in
>>>>>>>>>>>>>>>>> >> >> the not too distant past, and the Google Search trend shows a pretty
>>>>>>>>>>>>>>>>> >> >> steady level of interest, not a decline to zero, so I don't think that
>>>>>>>>>>>>>>>>> >> >> it is accurate to declare that there is no use or interest in GraphX.
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> Unless retaining GraphX is imposing significant costs on continuing
>>>>>>>>>>>>>>>>> >> >> Spark development, I can't support deprecating GraphX. I can support
>>>>>>>>>>>>>>>>> >> >> encouraging GraphX and Graph Frames development through something like
>>>>>>>>>>>>>>>>> >> >> a To Do list or document of "What we'd like to see in the way of
>>>>>>>>>>>>>>>>> >> >> further development of Spark's graph processing capabilities" -- i.e.,
>>>>>>>>>>>>>>>>> >> >> things that encourage and support new contributions to address any
>>>>>>>>>>>>>>>>> >> >> shortcomings in Spark's graph processing, not things that discourage
>>>>>>>>>>>>>>>>> >> >> contributions and use in the way that I believe simply declaring
>>>>>>>>>>>>>>>>> >> >> GraphX to be deprecated would.
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau <
>>>>>>>>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Since we're getting close to cutting a 4.0 branch, I'd like to float
>>>>>>>>>>>>>>>>> >> >> > the idea of officially deprecating GraphX. What that would mean (to
>>>>>>>>>>>>>>>>> >> >> > me) is we would update the docs to indicate that GraphX is deprecated
>>>>>>>>>>>>>>>>> >> >> > and its APIs may be removed at any time in the future.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Alternatively, we could mark it as "unmaintained and in search of
>>>>>>>>>>>>>>>>> >> >> > maintainers" with a note that if no maintainers are found, we may
>>>>>>>>>>>>>>>>> >> >> > remove it in a future minor version.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Looking at the GraphX source, I don't see any meaningful active
>>>>>>>>>>>>>>>>> >> >> > development going back over three years*. There is even a thread on
>>>>>>>>>>>>>>>>> >> >> > user@ from 2017 asking if GraphX is maintained anymore, with no
>>>>>>>>>>>>>>>>> >> >> > response from the developers.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Now I'm open to the idea that GraphX is stable and "works as is" and
>>>>>>>>>>>>>>>>> >> >> > simply doesn't require modifications, but given the user thread I'm a
>>>>>>>>>>>>>>>>> >> >> > little concerned here about bringing this API with us into Spark 4 if
>>>>>>>>>>>>>>>>> >> >> > we don't have anyone signed up to maintain it.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > * Excluding globally applied changes
>>>>>>>>>>>>>>>>> >> >> > --
>>>>>>>>>>>>>>>>> >> >> > Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>>>>> >> >> > Fight Health Insurance:
>>>>>>>>>>>>>>>>> https://www.fighthealthinsurance.com/
>>>>>>>>>>>>>>>>> >> >> > Books (Learning Spark, High Performance Spark,
>>>>>>>>>>>>>>>>> etc.): https://amzn.to/2MaRAG9
>>>>>>>>>>>>>>>>> >> >> > YouTube Live Streams:
>>>>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>>>> >> >> > Pronouns: she/her
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>>> >> >> To unsubscribe e-mail:
>>>>>>>>>>>>>>>>> dev-unsubscr...@spark.apache.org
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
