Thanks. I'm working on SPARK-42856, but the tests fail due to formatting issues, which is confusing because I ran scalafmt. Working on it...
Russ

On Sun, Nov 17, 2024 at 7:05 PM Xiao Li <lix...@databricks.com> wrote:

> Hi, Russell,
>
> After reviewing the JIRAs, it seems that only SPARK-42856 is directly relevant to GraphX. While the other three JIRAs mention GraphX in their descriptions, they appear to be more related to the build or the REPL rather than GraphX itself.
>
> Thanks,
>
> Xiao
>
> On Nov 16, 2024 at 5:39:27 PM, Russell Jurney <russell.jur...@gmail.com> wrote:
>
>> Scratch that, there appear to be... 4 unfixed bugs for GraphX outstanding? :)
>> https://issues.apache.org/jira/browse/SPARK-42856?jql=project%20%3D%20SPARK%20AND%20issuetype%20%3D%20Bug%20AND%20status%20%3D%20Open%20AND%20text%20~%20%22graphx%22
>>
>> On Sat, Nov 16, 2024 at 5:23 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>
>>> I'm looking at Spark's JIRA on a search for GraphX and I thought I would ask rather than just slog through it: anyone got some low hanging fruit bugs they can suggest I fix?
>>>
>>> Thanks,
>>> Russell
>>>
>>> On Thu, Nov 14, 2024 at 11:49 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>
>>>> + 1
>>>>
>>>> Mich Talebzadeh,
>>>>
>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>> London, United Kingdom
>>>>
>>>> view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>> https://en.everybodywiki.com/Mich_Talebzadeh
>>>>
>>>> *Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, quote "one test result is worth one-thousand expert opinions (Werner <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>
>>>> On Thu, 14 Nov 2024 at 18:52, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>
>>>>> Okay, first I’m going to fix a bug or two, then I’ll get started on an SPIP.
>>>>>
>>>>> Russ
>>>>>
>>>>> On Wed, Nov 13, 2024 at 1:56 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>
>>>>>> Hm. Since it sounds like a plan, Russell, why don't you go ahead and create a SPIP for it, so that this discussion takes a formal approach and is documented? Otherwise we are just flogging a dead horse, so to speak.
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> Mich Talebzadeh
>>>>>>
>>>>>> On Wed, 13 Nov 2024 at 20:10, Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>
>>>>>>> It might be, but graph processing is a desirable, very useful feature of Spark. GraphX doesn't see more popularity because it never got a DataFrame interface. If someone is willing to add one and maintain it, that seems best of all.
>>>>>>>
>>>>>>> Russ
>>>>>>>
>>>>>>> On Wed, Nov 13, 2024 at 7:12 AM Ángel <angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Seems to me....
>>>>>>>> it would be easier to move GraphX to GraphFrames than the opposite.
>>>>>>>>
>>>>>>>> On Tue, Oct 8, 2024 at 21:52, Reynold Xin (<r...@databricks.com.invalid>) wrote:
>>>>>>>>
>>>>>>>>> We can also consider the following: move GraphFrame into Spark, and make GraphX an internal impl detail of GraphFrame. Then we can over time change the implementation, simplify it (not sure if it is possible, but somebody can look into it)....
>>>>>>>>>
>>>>>>>>> On Mon, Oct 7, 2024 at 7:04 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Took a look at recent activity. Spark 3.5 support <https://github.com/graphframes/graphframes/commit/e54f249605dde60787f9b41b88ed7d5872b7dfab> was added a year ago. I'm sure we'll add Spark 4 support as soon as it is out.
>>>>>>>>>>
>>>>>>>>>> There is a new issue to organize a GraphFrames Hackathon <https://github.com/graphframes/graphframes/issues/460>. Please sign up to help!
>>>>>>>>>>
>>>>>>>>>> I seriously need GraphX and GraphFrames to make it... I have no other way of doing property graph motif matching on large graphs. It's kind of important to me.
>>>>>>>>>>
>>>>>>>>>> Some slides on my work with GraphFrames:
>>>>>>>>>>
>>>>>>>>>> [5 images: image.png]
>>>>>>>>>>
>>>>>>>>>> Russell
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 7, 2024 at 6:06 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> That’s awesome!
>>>>>>>>>>>
>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>>>>>>> Pronouns: she/her
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:42 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I’ll organize a hackathon. A friend wants to finish the implementation of Lucian modularity for GraphFrames. I’ll fix some GraphX bugs at it.
>>>>>>>>>>>>
>>>>>>>>>>>> I did just blog all about the motif matching in GraphFrames:
>>>>>>>>>>>>
>>>>>>>>>>>> https://blog.graphlet.ai/financial-crime-and-corruption-network-motifs-4cf2e8e10eb5
>>>>>>>>>>>>
>>>>>>>>>>>> Russ
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:38 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> So this discuss thread and the vote thread to deprecate it, which leaves the option of removing it during 4.X, are probably the highest profile it’s been in years.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the past, for parts of Spark I’ve cared about, I’ve organized virtual meetings to coordinate work. If you’re connected with some of the Spark+Graph community, reaching out to find others and organizing a meeting could be a way to raise the profile a bit? Maybe organize a virtual hackathon (I’m meaning to try this for some other things, so happy to share what I learn from doing that)?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 5:02 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I’ll look for a bug to fix. If GraphX is outside of Spark, Spark would tend to break GraphFrames and it will be burdensome on an external project to keep up. Graph computing on Spark is important to a lot of people. Is there a way to raise visibility here?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 4:24 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are no specific tickets associated with the lack of maintenance or this, as the component has not been maintained for a sufficiently long time. If you’re interested in taking it on, that’s wonderful; starting with fixing some bugs could be a great place to start and figure out if it’s something you want to do long term.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would recommend making a first bug fix in an actively maintained area of Spark to get to know some reviewers, since there is not anyone tracking the GraphX PRs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As a note, I don’t think GraphX is required for GraphFrames long term, so another option would be to talk to the GraphFrames folks and move the GraphX code over to it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ideally we’d have someone willing to act as a mentor or guide, but so far we have no volunteers (especially no one familiar with the GraphX code).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 3:25 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I volunteer to maintain GraphX to keep GraphFrames a viable project. I don’t have a clear view on whether it works with Spark 4 or if it needs updates. I don’t have Spark commits but I’m a committer on Apache DataFu and mentored the Spark feature for it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can someone tell me what is involved? Point me at a ticket?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Russell
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Oct 7, 2024 at 12:11 AM Erik Eklund <eekl...@definitivehc.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>> We rely on GraphX for an important component of our product. And we really want it to stay a typed interface.
>>>>>>>>>>>>>>>>> Please keep GraphX.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Erik
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *From: *Holden Karau <holden.ka...@gmail.com>
>>>>>>>>>>>>>>>>> *Date: *Sunday, October 6, 2024 at 06:22
>>>>>>>>>>>>>>>>> *To: *Ángel <angel.alvarez.pas...@gmail.com>
>>>>>>>>>>>>>>>>> *Cc: *Russell Jurney <russell.jur...@gmail.com>, Mich Talebzadeh <mich.talebza...@gmail.com>, Spark dev list <dev@spark.apache.org>, user @spark <u...@spark.apache.org>
>>>>>>>>>>>>>>>>> *Subject: *Re: [DISCUSS] Deprecate GraphX OR Find new maintainers interested in GraphX OR leave it as is?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So are there companies using it? And are they willing to contribute to maintaining it?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 9:17 PM Ángel <
>>>>>>>>>>>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That would definitely affect companies using GraphX, but at least they’d have the choice to migrate their code.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think that’s probably the way to go.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Oct 6, 2024 at 6:09, Holden Karau (<holden.ka...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So removing GraphX from Spark would not prevent GraphFrames from continuing; they could pick up the GraphX source and incorporate it into their project.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 5:22 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A lot of people like me use GraphFrames for its connected
>>>>>>>>>>>>>>>>> components implementation and its motif matching feature. I am willing to work on it to keep it alive. They did a 0.8.3 release not too long ago. Please keep GraphX alive.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I added the user list as they may have a vested interest here and hopefully can contribute.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A few suggestions:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Data-Driven Decision Making: Return to the core metrics and analyze usage trends, performance benchmarks, and the actual impact on businesses that rely on GraphX. Objectivity can be restored by letting data speak louder than opinions, so to speak.
>>>>>>>>>>>>>>>>> 2. Broaden the Discussion: Engage more stakeholders from diverse backgrounds (especially Spark users) to bring in new perspectives and counterbalance the more vocal but potentially narrow interests of core maintainers or open-source contributors.
>>>>>>>>>>>>>>>>> 3. Define Clear Criteria for Decision Making: Agree on a set of objective criteria by which the project’s future will be judged. These could include market demand, contribution levels, maintenance costs, alternative solutions, and alignment with the overall Spark ecosystem goals. Some have already been covered.
>>>>>>>>>>>>>>>>> 4. Timely Conclusion of Discussions: Set a timeline for making a decision. Long, open-ended discussions tend to lose focus.
>>>>>>>>>>>>>>>>>    Putting deadlines forces participants to focus on key issues and prevents endless debates.
>>>>>>>>>>>>>>>>> 5. Borrowing from commercial settings, it is often necessary for a strong leadership team to step in and make the final decision after considering the input. When the objectivity of discussions starts to wane, leadership needs to cut through the round discussions and steer towards action based on business and technical realities.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> HTH
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Mich Talebzadeh,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>>>>>>>>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>>>>>>>>>>>>> London, United Kingdom
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> *Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, quote "one test result is worth one-thousand expert opinions (Werner <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, 5 Oct 2024 at 06:26, Ángel <angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I completely agree with everyone here. I don’t think the issue is deprecating it; to me, the problem lies in not providing a new and better solution for handling graphs in Spark. In the past, I used GraphX via GraphFrames for record linkage, and I found it both useful and effective. Is there any discussion about a potential replacement?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I’d be willing to help maintain GraphX, though I don’t have previous experience with maintaining open-source projects.
>>>>>>>>>>>>>>>>> All I can promise is good intentions, willingness to learn and lots of energy and passion. Is that enough?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Btw, what's your take on this?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> · *GraphX* will be deprecated in favor of a new graphing component, SparkGraph, based on Cypher <https://neo4j.com/developer/cypher-query-language/>, a much richer graph language than previously offered by GraphX.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra (<markhams...@gmail.com>) wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As I wrote to Holden privately, I might well change my vote to be in favor of a deprecation label combined with some effective means of communicating that this doesn't mean the end for GraphX if interested contributors come forward to rescue it. I don't like either the idea of keeping unmaintained code and public APIs around (especially if there are problems with them) or the idea of removing Spark functionality just because no one has contributed to it for a while.
>>>>>>>>>>>>>>>>> A naked deprecation label feels somewhat drastic and pre-emptive to me. I don't expect that GraphX will be the last part of Spark to run the risk of death through neglect, and I think we need an effective means of encouraging resuscitation that a deprecation label on its own does not provide. On the other hand, if no one really is willing to come to the aid of GraphX or other neglected functionality given adequate warning of possible removal, I'm not then opposed to the usual deprecation and removal process.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > This is a reasonable discussion, but maybe the more practical point is: are you sure you want to block this unilaterally? This effectively makes a decision that GraphX cannot be removed for a long while. I'd understand it more if we had an active maintainer and/or active user proposing to veto, but my understanding is this is just a proposal to block this on behalf of some users, someone else who might do some work and hasn't to date for some reason. Add to that the fact that the 'pro' arguments all seem to be arguments for working on GraphFrames, and I find this somewhat drastic.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> "You can't say nothing is removable until there are no users."
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> That is not what I am saying. Rather, I am countering what others seem to be suggesting: There are no users and no interest, therefore we can and should deprecate.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I could flip this argument around. More strongly, not being deprecated means "won't be removed" and likewise implies support and development. I don't think either of the latter have been true for years. What suggests this will change? A todo list is not going to do anything, IMHO.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I'm also concerned about the cost of that, which I have observed. GraphX PRs are almost certainly not going to be reviewed because of its state. Deprecation both communicates that reality, and leaves an option open, whereas not deprecating forecloses that option for a while.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > I don't think the question is, does anyone use it? because anyone can continue to use it -- in Spark 3.x for sure, and in 4.x if not removed.
>>>>>>>>>>>>>>>>> >> > You can't say nothing is removable until there are no users.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > Also, why would GraphFrames not be the logical home of this going forward anyway? which I think is the subtext.
>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> I'm -1(*) because, while it technically means "might be removed in the future", I think developers and users are more prone to interpret something being marked as deprecated as "very likely will be removed in the future, so don't depend on this or waste your time contributing to its further development." I don't think the latter is what we want just because something hasn't been updated meaningfully in a while. There have been How To articles for GraphX and Graph Frames posted in the not too distant past, and the Google Search trend shows a pretty steady level of interest, not a decline to zero, so I don't think that it is accurate to declare that there is no use or interest in GraphX.
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> Unless retaining GraphX is imposing significant costs on continuing Spark development, I can't support deprecating GraphX.
>>>>>>>>>>>>>>>>> >> >> I can support encouraging GraphX and Graph Frames development through something like a To Do list or document of "What we'd like to see in the way of further development of Spark's graph processing capabilities" -- i.e., things that encourage and support new contributions to address any shortcomings in Spark's graph processing, not things that discourage contributions and use in the way that I believe simply declaring GraphX to be deprecated would.
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Since we're getting close to cutting a 4.0 branch, I'd like to float the idea of officially deprecating GraphX. What that would mean (to me) is we would update the docs to indicate that GraphX is deprecated and its APIs may be removed at any time in the future.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Alternatively, we could mark it as "unmaintained and in search of maintainers" with a note that if no maintainers are found, we may remove it in a future minor version.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Looking at the GraphX source, I don't see any meaningful active development going back over three years*. There is even a thread on user@ from 2017 asking if GraphX is maintained anymore, with no response from the developers.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > Now I'm open to the idea that GraphX is stable and "works as is" and simply doesn't require modifications but given the user thread I'm a little concerned here about bringing this API with us into Spark 4 if we don't have anyone signed up to maintain it.
>>>>>>>>>>>>>>>>> >> >> >
>>>>>>>>>>>>>>>>> >> >> > * Excluding globally applied changes
>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org