Scratch that, there appear to be... 4 unfixed bugs for GraphX outstanding?
:)
https://issues.apache.org/jira/browse/SPARK-42856?jql=project%20%3D%20SPARK%20AND%20issuetype%20%3D%20Bug%20AND%20status%20%3D%20Open%20AND%20text%20~%20%22graphx%22

On Sat, Nov 16, 2024 at 5:23 PM Russell Jurney <russell.jur...@gmail.com>
wrote:

> I'm looking at Spark's JIRA on a search for GraphX and I thought I would
> ask rather than just slog through it: anyone got some low hanging fruit
> bugs they can suggest I fix?
>
> Thanks,
> Russell
>
> On Thu, Nov 14, 2024 at 11:49 AM Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> + 1
>>
>> Mich Talebzadeh,
>>
>> Architect | Data Engineer | Data Science | Financial Crime
>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> London, United Kingdom
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed . It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Werner
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>
>>
>> On Thu, 14 Nov 2024 at 18:52, Russell Jurney <russell.jur...@gmail.com>
>> wrote:
>>
>>> Okay, first I’m going to fix a bug or two, then I’ll get started on an SPIP.
>>>
>>> Russ
>>>
>>> On Wed, Nov 13, 2024 at 1:56 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Hm. Since it sounds like a plan, Russell, why don't you go ahead and
>>>> create an SPIP for it? Then this discussion takes a formal approach and
>>>> is documented. Otherwise we are just flogging a dead horse, so to speak.
>>>>
>>>> HTH
>>>>
>>>> On Wed, 13 Nov 2024 at 20:10, Russell Jurney <russell.jur...@gmail.com>
>>>> wrote:
>>>>
>>>>> It might be, but graph processing is a desirable, very useful feature
>>>>> of Spark. GraphX doesn't see more popularity because it never got a
>>>>> DataFrame interface. If someone is willing to add one and maintain it,
>>>>> that seems best of all.
>>>>>
>>>>> Russ
>>>>>
>>>>> On Wed, Nov 13, 2024 at 7:12 AM Ángel <angel.alvarez.pas...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Seems to me... it would be easier to move GraphX to GraphFrames than
>>>>>> the opposite.
>>>>>>
>>>>>> On Tue, Oct 8, 2024 at 21:52, Reynold Xin
>>>>>> (<r...@databricks.com.invalid>) wrote:
>>>>>>
>>>>>>> We can also consider the following: move GraphFrame into Spark, and
>>>>>>> make GraphX an internal impl detail of GraphFrame. Then we can over time
>>>>>>> change the implementation, simplify it (not sure if it is possible, but
>>>>>>> somebody can look into it)....
>>>>>>>
>>>>>>> On Mon, Oct 7, 2024 at 7:04 PM Russell Jurney <
>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Took a look at recent activity. Spark 3.5 support
>>>>>>>> <https://github.com/graphframes/graphframes/commit/e54f249605dde60787f9b41b88ed7d5872b7dfab>
>>>>>>>> was added a year ago. I'm sure we'll add Spark 4 support as soon as
>>>>>>>> it is out.
>>>>>>>>
>>>>>>>> There is a new issue to organize a GraphFrames Hackathon
>>>>>>>> <https://github.com/graphframes/graphframes/issues/460>. Please
>>>>>>>> sign up to help!
>>>>>>>> https://github.com/graphframes/graphframes/issues/460
>>>>>>>>
>>>>>>>> I seriously need GraphX and GraphFrames to make it... I have no
>>>>>>>> other way of doing property graph motif matching on large graphs.
>>>>>>>> It's kind of important to me.
>>>>>>>>
>>>>>>>> Some slides on my work with GraphFrames:
>>>>>>>>
>>>>>>>> [images: five slides, not preserved]
>>>>>>>>
>>>>>>>> Russell
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Oct 7, 2024 at 6:06 PM Holden Karau <holden.ka...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> That’s awesome!
>>>>>>>>>
>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/
>>>>>>>>> <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>> https://amzn.to/2MaRAG9  <https://amzn.to/2MaRAG9>
>>>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>> Pronouns: she/her
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Oct 7, 2024 at 5:42 PM Russell Jurney <
>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I’ll organize a hackathon. A friend wants to finish the
>>>>>>>>>> implementation of Lucian modularity for GraphFrames. I’ll fix some
>>>>>>>>>> GraphX bugs at it.
>>>>>>>>>>
>>>>>>>>>> I did just blog all about the motif matching in GraphFrames:
>>>>>>>>>>
>>>>>>>>>> https://blog.graphlet.ai/financial-crime-and-corruption-network-motifs-4cf2e8e10eb5
>>>>>>>>>>
>>>>>>>>>> Russ
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 7, 2024 at 5:38 PM Holden Karau <
>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> So this discuss thread and the vote thread on deprecating it (to
>>>>>>>>>>> leave open the option of removing it during 4.x) are probably the
>>>>>>>>>>> highest profile it’s had in years.
>>>>>>>>>>>
>>>>>>>>>>> In the past, for parts of Spark I’ve cared about, I’ve organized
>>>>>>>>>>> virtual meetings to coordinate work. If you’re connected with some
>>>>>>>>>>> of the Spark+Graph community, reaching out to find others and
>>>>>>>>>>> organizing a meeting could be a way to raise the profile a bit?
>>>>>>>>>>> Maybe organize a virtual hackathon (I’m meaning to try this for
>>>>>>>>>>> some other things, so happy to share what I learn from doing that)?
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:02 PM Russell Jurney <
>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I’ll look for a bug to fix. If GraphX is outside of Spark,
>>>>>>>>>>>> Spark would tend to break GraphFrames and it will be burdensome
>>>>>>>>>>>> on an external project to keep up. Graph computing on Spark is
>>>>>>>>>>>> important to a lot of people. Is there a way to raise visibility
>>>>>>>>>>>> here?
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 7, 2024 at 4:24 PM Holden Karau <
>>>>>>>>>>>> holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> There are no specific tickets associated with the lack of
>>>>>>>>>>>>> maintenance, as the component has not been maintained for a
>>>>>>>>>>>>> sufficiently long time. If you’re interested in taking it on,
>>>>>>>>>>>>> that’s wonderful. Fixing some bugs could be a great place to
>>>>>>>>>>>>> start and figure out if it’s something you want to do long term.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would recommend making a first bug fix in an actively
>>>>>>>>>>>>> maintained area of Spark to get to know some reviewers, since
>>>>>>>>>>>>> there is not anyone tracking the GraphX PRs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As a note, I don’t think GraphX is required for GraphFrames
>>>>>>>>>>>>> long term, so another option would be to talk to the GraphFrames
>>>>>>>>>>>>> folks and move the GraphX code over to it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ideally we’d have someone willing to act as a mentor or guide,
>>>>>>>>>>>>> but so far we have no volunteers (especially no one familiar
>>>>>>>>>>>>> with the GraphX code).
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 3:25 PM Russell Jurney <
>>>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I volunteer to maintain GraphX to keep GraphFrames a viable
>>>>>>>>>>>>>> project. I don’t have a clear view on whether it works with
>>>>>>>>>>>>>> Spark 4 or if it needs updates. I don’t have Spark commits, but
>>>>>>>>>>>>>> I’m a committer on Apache DataFu and mentored the Spark feature
>>>>>>>>>>>>>> for it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can someone tell me what is involved? Point me at a ticket?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Russell
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 12:11 AM Erik Eklund <
>>>>>>>>>>>>>> eekl...@definitivehc.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>> We rely on GraphX for an important component of our product.
>>>>>>>>>>>>>>> And we really want it to stay a typed interface. Please keep
>>>>>>>>>>>>>>> GraphX.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Erik
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *From: *Holden Karau <holden.ka...@gmail.com>
>>>>>>>>>>>>>>> *Date: *Sunday, October 6, 2024 at 06:22
>>>>>>>>>>>>>>> *To: *Ángel <angel.alvarez.pas...@gmail.com>
>>>>>>>>>>>>>>> *Cc: *Russell Jurney <russell.jur...@gmail.com>, Mich
>>>>>>>>>>>>>>> Talebzadeh <mich.talebza...@gmail.com>, Spark dev list <
>>>>>>>>>>>>>>> dev@spark.apache.org>, user @spark <u...@spark.apache.org>
>>>>>>>>>>>>>>> *Subject: *Re: [DISCUSS] Deprecate GraphX OR Find new
>>>>>>>>>>>>>>> maintainers interested in GraphX OR leave it as is?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So are there companies using it? And are they willing to
>>>>>>>>>>>>>>> contribute to maintaining it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 9:17 PM Ángel <
>>>>>>>>>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That would definitely affect companies using GraphX, but at
>>>>>>>>>>>>>>> least they’d have the choice to migrate their code.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that’s probably the way to go.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Oct 6, 2024 at 6:09, Holden Karau (<
>>>>>>>>>>>>>>> holden.ka...@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So removing GraphX from Spark would not prevent GraphFrames
>>>>>>>>>>>>>>> from continuing; they could pick up the GraphX source and
>>>>>>>>>>>>>>> incorporate it into their project.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 5:22 PM Russell Jurney <
>>>>>>>>>>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A lot of people like me use GraphFrames for its connected
>>>>>>>>>>>>>>> components implementation and its motif matching feature. I am
>>>>>>>>>>>>>>> willing to work on it to keep it alive. They did a 0.8.3 release
>>>>>>>>>>>>>>> not too long ago. Please keep GraphX alive.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <
>>>>>>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I added the user list, as they may have a vested interest here
>>>>>>>>>>>>>>> and hopefully can contribute.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A few suggestions:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>    1. Data-Driven Decision Making: Return to the core metrics.
>>>>>>>>>>>>>>>    Analyze usage trends, performance benchmarks, and the actual
>>>>>>>>>>>>>>>    impact on businesses that rely on GraphX. Objectivity can be
>>>>>>>>>>>>>>>    restored by letting data speak louder than opinions, so to
>>>>>>>>>>>>>>>    speak.
>>>>>>>>>>>>>>>    2. Broaden the Discussion: Engage more stakeholders from
>>>>>>>>>>>>>>>    diverse backgrounds (especially Spark users) to bring in new
>>>>>>>>>>>>>>>    perspectives and counterbalance the more vocal but potentially
>>>>>>>>>>>>>>>    narrow interests of core maintainers or open-source
>>>>>>>>>>>>>>>    contributors.
>>>>>>>>>>>>>>>    3. Define Clear Criteria for Decision Making: Agree on a set
>>>>>>>>>>>>>>>    of objective criteria by which the project’s future will be
>>>>>>>>>>>>>>>    judged. These could include market demand, contribution
>>>>>>>>>>>>>>>    levels, maintenance costs, alternative solutions, and
>>>>>>>>>>>>>>>    alignment with the overall Spark ecosystem goals. Some have
>>>>>>>>>>>>>>>    already been covered.
>>>>>>>>>>>>>>>    4. Timely Conclusion of Discussions: Set a timeline for
>>>>>>>>>>>>>>>    making a decision. Long, open-ended discussions tend to lose
>>>>>>>>>>>>>>>    focus. Putting deadlines in place forces participants to focus
>>>>>>>>>>>>>>>    on key issues and prevents endless debates.
>>>>>>>>>>>>>>>    5. Borrowing from commercial settings, it is often necessary
>>>>>>>>>>>>>>>    for a strong leadership team to step in and make the final
>>>>>>>>>>>>>>>    decision after considering the input. When the objectivity of
>>>>>>>>>>>>>>>    discussions starts to wane, leadership needs to cut through
>>>>>>>>>>>>>>>    the circular discussions and steer towards action based on
>>>>>>>>>>>>>>>    business and technical realities.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> HTH
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, 5 Oct 2024 at 06:26, Ángel <
>>>>>>>>>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I completely agree with everyone here. I don’t think the
>>>>>>>>>>>>>>> issue is deprecating it; to me, the problem lies in not
>>>>>>>>>>>>>>> providing a new and better solution for handling graphs in
>>>>>>>>>>>>>>> Spark. In the past, I used GraphX via GraphFrames for record
>>>>>>>>>>>>>>> linkage, and I found it both useful and effective. Is there any
>>>>>>>>>>>>>>> discussion about a potential replacement?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I’d be willing to help maintain GraphX, though I don’t have
>>>>>>>>>>>>>>> previous experience with maintaining open-source projects. All
>>>>>>>>>>>>>>> I can promise is good intentions, willingness to learn and lots
>>>>>>>>>>>>>>> of energy and passion. Is that enough?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Btw, what's your take on this?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> · *GraphX* will be deprecated in favor of a new graphing
>>>>>>>>>>>>>>> component, SparkGraph, based on Cypher
>>>>>>>>>>>>>>> <https://neo4j.com/developer/cypher-query-language/>, a much
>>>>>>>>>>>>>>> richer graph language than previously offered by GraphX.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra (<
>>>>>>>>>>>>>>> markhams...@gmail.com>) wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As I wrote to Holden privately, I might well change my vote to
>>>>>>>>>>>>>>> be in favor of a deprecation label combined with some effective
>>>>>>>>>>>>>>> means of communicating that this doesn't mean the end for GraphX
>>>>>>>>>>>>>>> if interested contributors come forward to rescue it. I don't
>>>>>>>>>>>>>>> like either the idea of keeping unmaintained code and public
>>>>>>>>>>>>>>> APIs around (especially if there are problems with them) or the
>>>>>>>>>>>>>>> idea of removing Spark functionality just because no one has
>>>>>>>>>>>>>>> contributed to it for a while. A naked deprecation label feels
>>>>>>>>>>>>>>> somewhat drastic and pre-emptive to me. I don't expect that
>>>>>>>>>>>>>>> GraphX will be the last part of Spark to run the risk of death
>>>>>>>>>>>>>>> through neglect, and I think we need an effective means of
>>>>>>>>>>>>>>> encouraging resuscitation that a deprecation label on its own
>>>>>>>>>>>>>>> does not provide. On the other hand, if no one really is willing
>>>>>>>>>>>>>>> to come to the aid of GraphX or other neglected functionality
>>>>>>>>>>>>>>> given adequate warning of possible removal, I'm not then opposed
>>>>>>>>>>>>>>> to the usual deprecation and removal process.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > This is a reasonable discussion, but maybe the more practical
>>>>>>>>>>>>>>> > point is: are you sure you want to block this unilaterally?
>>>>>>>>>>>>>>> > This effectively makes a decision that GraphX cannot be
>>>>>>>>>>>>>>> > removed for a long while. I'd understand it more if we had an
>>>>>>>>>>>>>>> > active maintainer and/or active user proposing to veto, but my
>>>>>>>>>>>>>>> > understanding is this is just a proposal to block this on
>>>>>>>>>>>>>>> > behalf of some users, someone else who might do some work and
>>>>>>>>>>>>>>> > hasn't to date for some reason. Add to that the fact that the
>>>>>>>>>>>>>>> > 'pro' arguments all seem to be arguments for working on
>>>>>>>>>>>>>>> > GraphFrames, and I find this somewhat drastic.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <
>>>>>>>>>>>>>>> > markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> "You can't say nothing is removable until there are no users."
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> That is not what I am saying. Rather, I am countering what
>>>>>>>>>>>>>>> >> others seem to be suggesting: There are no users and no
>>>>>>>>>>>>>>> >> interest, therefore we can and should deprecate.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <
>>>>>>>>>>>>>>> >> sro...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > I could flip this argument around. More strongly, not being
>>>>>>>>>>>>>>> >> > deprecated means "won't be removed" and likewise implies
>>>>>>>>>>>>>>> >> > support and development. I don't think either of the latter
>>>>>>>>>>>>>>> >> > have been true for years. What suggests this will change? A
>>>>>>>>>>>>>>> >> > todo list is not going to do anything, IMHO.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > I'm also concerned about the cost of that, which I have
>>>>>>>>>>>>>>> >> > observed. GraphX PRs are almost certainly not going to be
>>>>>>>>>>>>>>> >> > reviewed because of its state. Deprecation both communicates
>>>>>>>>>>>>>>> >> > that reality, and leaves an option open, whereas not
>>>>>>>>>>>>>>> >> > deprecating forecloses that option for a while.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > I don't think the question is, does anyone use it? because
>>>>>>>>>>>>>>> >> > anyone can continue to use it -- in Spark 3.x for sure, and
>>>>>>>>>>>>>>> >> > in 4.x if not removed.
>>>>>>>>>>>>>>> >> > You can't say nothing is removable until there are no users.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Also, why would GraphFrames not be the logical home of this
>>>>>>>>>>>>>>> >> > going forward anyway? which I think is the subtext.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <
>>>>>>>>>>>>>>> >> > markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> I'm -1(*) because, while it technically means "might be
>>>>>>>>>>>>>>> >> >> removed in the future", I think developers and users are
>>>>>>>>>>>>>>> >> >> more prone to interpret something being marked as
>>>>>>>>>>>>>>> >> >> deprecated as "very likely will be removed in the future,
>>>>>>>>>>>>>>> >> >> so don't depend on this or waste your time contributing to
>>>>>>>>>>>>>>> >> >> its further development." I don't think the latter is what
>>>>>>>>>>>>>>> >> >> we want just because something hasn't been updated
>>>>>>>>>>>>>>> >> >> meaningfully in a while. There have been How To articles
>>>>>>>>>>>>>>> >> >> for GraphX and GraphFrames posted in the not too distant
>>>>>>>>>>>>>>> >> >> past, and the Google Search trend shows a pretty steady
>>>>>>>>>>>>>>> >> >> level of interest, not a decline to zero, so I don't think
>>>>>>>>>>>>>>> >> >> that it is accurate to declare that there is no use or
>>>>>>>>>>>>>>> >> >> interest in GraphX.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Unless retaining GraphX is imposing significant costs on
>>>>>>>>>>>>>>> >> >> continuing Spark development, I can't support deprecating
>>>>>>>>>>>>>>> >> >> GraphX. I can support encouraging GraphX and GraphFrames
>>>>>>>>>>>>>>> >> >> development through something like a To Do list or
>>>>>>>>>>>>>>> >> >> document of "What we'd like to see in the way of further
>>>>>>>>>>>>>>> >> >> development of Spark's graph processing capabilities" --
>>>>>>>>>>>>>>> >> >> i.e., things that encourage and support new contributions
>>>>>>>>>>>>>>> >> >> to address any shortcomings in Spark's graph processing,
>>>>>>>>>>>>>>> >> >> not things that discourage contributions and use in the
>>>>>>>>>>>>>>> >> >> way that I believe simply declaring GraphX to be
>>>>>>>>>>>>>>> >> >> deprecated would.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau <
>>>>>>>>>>>>>>> >> >> holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > Since we're getting close to cutting a 4.0 branch, I'd
>>>>>>>>>>>>>>> >> >> > like to float the idea of officially deprecating GraphX.
>>>>>>>>>>>>>>> >> >> > What that would mean (to me) is we would update the docs
>>>>>>>>>>>>>>> >> >> > to indicate that GraphX is deprecated and its APIs may
>>>>>>>>>>>>>>> >> >> > be removed at any time in the future.
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > Alternatively, we could mark it as "unmaintained and in
>>>>>>>>>>>>>>> >> >> > search of maintainers" with a note that if no
>>>>>>>>>>>>>>> >> >> > maintainers are found, we may remove it in a future
>>>>>>>>>>>>>>> >> >> > minor version.
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > Looking at the GraphX source, I don't see any meaningful
>>>>>>>>>>>>>>> >> >> > active development going back over three years*. There
>>>>>>>>>>>>>>> >> >> > is even a thread on user@ from 2017 asking if GraphX is
>>>>>>>>>>>>>>> >> >> > maintained anymore, with no response from the
>>>>>>>>>>>>>>> >> >> > developers.
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > Now I'm open to the idea that GraphX is stable and
>>>>>>>>>>>>>>> >> >> > "works as is" and simply doesn't require modifications,
>>>>>>>>>>>>>>> >> >> > but given the user thread I'm a little concerned here
>>>>>>>>>>>>>>> >> >> > about bringing this API with us into Spark 4 if we don't
>>>>>>>>>>>>>>> >> >> > have anyone signed up to maintain it.
>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>> >> >> > * Excluding globally applied changes
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>> >> >> To unsubscribe e-mail:
>>>>>>>>>>>>>>> dev-unsubscr...@spark.apache.org
>>>>>>>>>>>>>>> >> >>
