Okay, first I’m going to fix a bug or two, then I’ll get started on an SPIP.
Russ

On Wed, Nov 13, 2024 at 1:56 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hm. Since it sounds like a plan, Russell, why don't you go ahead and create a SPIP for it? Then this discussion takes a formal approach and is documented. Otherwise we are just flogging a dead horse, so to speak.
>
> HTH
>
> Mich Talebzadeh,
> Architect | Data Engineer | Data Science | Financial Crime
> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College London <https://en.wikipedia.org/wiki/Imperial_College_London>
> London, United Kingdom
>
> view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, quote "one test result is worth one-thousand expert opinions (Wernher <https://en.wikipedia.org/wiki/Wernher_von_Braun> Von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>
> On Wed, 13 Nov 2024 at 20:10, Russell Jurney <russell.jur...@gmail.com> wrote:
>
>> It might be, but graph processing is a desirable, very useful feature of Spark. GraphX doesn't see more popularity because it never got a DataFrame interface. If someone is willing to add one and maintain it, that seems best of all.
>>
>> Russ
>>
>> On Wed, Nov 13, 2024 at 7:12 AM Ángel <angel.alvarez.pas...@gmail.com> wrote:
>>
>>> Seems to me.... it would be easier to move GraphX to graphframes than the opposite.
>>>
>>> On Tue, Oct 8, 2024 at 21:52, Reynold Xin (<r...@databricks.com.invalid>) wrote:
>>>
>>>> We can also consider the following: move GraphFrame into Spark, and make GraphX an internal impl detail of GraphFrame. Then we can over time change the implementation, simplify it (not sure if it is possible, but somebody can look into it)....
>>>>
>>>> On Mon, Oct 7, 2024 at 7:04 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>
>>>>> Took a look at recent activity. Spark 3.5 support <https://github.com/graphframes/graphframes/commit/e54f249605dde60787f9b41b88ed7d5872b7dfab> was added a year ago. I'm sure we'll add Spark 4 support as soon as it is out.
>>>>>
>>>>> There is a new issue to organize a GraphFrames Hackathon <https://github.com/graphframes/graphframes/issues/460>. Please sign up to help!
>>>>>
>>>>> I seriously need GraphX and GraphFrames to make it... I have no other way of doing property graph motif matching on large graphs. It's kind of important to me.
>>>>>
>>>>> Some slides on my work with GraphFrames:
>>>>>
>>>>> [images: five slides]
>>>>>
>>>>> Russell
>>>>>
>>>>> On Mon, Oct 7, 2024 at 6:06 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>
>>>>>> That’s awesome!
>>>>>>
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> Fight Health Insurance: https://www.fighthealthinsurance.com/ <https://www.fighthealthinsurance.com/?q=hk_email>
>>>>>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>>> Pronouns: she/her
>>>>>>
>>>>>> On Mon, Oct 7, 2024 at 5:42 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>
>>>>>>> I’ll organize a hackathon. A friend wants to finish the implementation of Lucian modularity for GraphFrames. I’ll fix some GraphX bugs at it.
>>>>>>>
>>>>>>> I did just blog all about the motif matching in GraphFrames:
>>>>>>>
>>>>>>> https://blog.graphlet.ai/financial-crime-and-corruption-network-motifs-4cf2e8e10eb5
>>>>>>>
>>>>>>> Russ
>>>>>>>
>>>>>>> On Mon, Oct 7, 2024 at 5:38 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>
>>>>>>>> So this discuss thread and the vote thread to deprecate, which would leave the option of removing it during 4.X, are probably the highest profile it’s been in years.
>>>>>>>>
>>>>>>>> In the past, for parts of Spark I’ve cared about, I’ve organized virtual meetings to coordinate work. If you’re connected with some of the Spark+Graph community, reaching out to find others and organizing a meeting could be a way to raise the profile a bit? Maybe organize a virtual hackathon (I’m meaning to try this for some other things, so happy to share what I learn from doing that)?
>>>>>>>>
>>>>>>>> On Mon, Oct 7, 2024 at 5:02 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I’ll look for a bug to fix. If GraphX is outside of Spark, Spark would tend to break GraphFrames and it will be burdensome on an external project to keep up. Graph computing on Spark is important to a lot of people; is there a way to raise visibility here?
>>>>>>>>>
>>>>>>>>> On Mon, Oct 7, 2024 at 4:24 PM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> There are no specific tickets associated with the lack of maintenance or this, as the component has not been maintained for a sufficiently long time. If you’re interested in taking it on, that’s wonderful; starting with fixing some bugs could be a great way to figure out if it’s something you want to do long term.
>>>>>>>>>>
>>>>>>>>>> I would recommend making a first bug fix in an actively maintained area of Spark to get to know some reviewers, since there is not anyone tracking the GraphX PRs.
>>>>>>>>>>
>>>>>>>>>> As a note, I don’t think GraphX is required for GraphFrames long term, so another option would be to talk to the GraphFrames folks and move the GraphX code over to it.
>>>>>>>>>>
>>>>>>>>>> Ideally we’d have someone willing to act as a mentor or guide, but so far we have no volunteers (especially no one familiar with the GraphX code).
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 7, 2024 at 3:25 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I volunteer to maintain GraphX to keep GraphFrames a viable project. I don’t have a clear view on whether it works with Spark 4 or if it needs updates?
>>>>>>>>>>> I don’t have Spark commits but I’m a committer on Apache DataFu and mentored the Spark feature for it.
>>>>>>>>>>>
>>>>>>>>>>> Can someone tell me what is involved? Point me at a ticket?
>>>>>>>>>>>
>>>>>>>>>>> Russell
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 7, 2024 at 12:11 AM Erik Eklund <eekl...@definitivehc.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>> We rely on GraphX for an important component of our product. And we really want it to stay a typed interface. Please keep GraphX.
>>>>>>>>>>>>
>>>>>>>>>>>> Erik
>>>>>>>>>>>>
>>>>>>>>>>>> *From: *Holden Karau <holden.ka...@gmail.com>
>>>>>>>>>>>> *Date: *Sunday, October 6, 2024 at 06:22
>>>>>>>>>>>> *To: *Ángel <angel.alvarez.pas...@gmail.com>
>>>>>>>>>>>> *Cc: *Russell Jurney <russell.jur...@gmail.com>, Mich Talebzadeh <mich.talebza...@gmail.com>, Spark dev list <dev@spark.apache.org>, user @spark <u...@spark.apache.org>
>>>>>>>>>>>> *Subject: *Re: [DISCUSS] Deprecate GraphX OR Find new maintainers interested in GraphX OR leave it as is?
>>>>>>>>>>>>
>>>>>>>>>>>> So are there companies using it? And are they willing to contribute to maintaining it?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Oct 5, 2024 at 9:17 PM Ángel <angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> That would definitely affect companies using GraphX, but at least they’d have the choice to migrate their code.
>>>>>>>>>>>>
>>>>>>>>>>>> I think that’s probably the way to go.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Oct 6, 2024 at 6:09, Holden Karau (<holden.ka...@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> So removing GraphX from Spark would not prevent GraphFrames from continuing; they could pick up the GraphX source and incorporate it into their project.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Oct 5, 2024 at 5:22 PM Russell Jurney <russell.jur...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> A lot of people like me use GraphFrames for its connected components implementation and its motif matching feature. I am willing to work on it to keep it alive. They did a 0.8.3 release not too long ago. Please keep GraphX alive.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I added the user list as they may have a vested interest here and hopefully can contribute.
>>>>>>>>>>>>
>>>>>>>>>>>> A few suggestions:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Data-Driven Decision Making: Return to the core metrics and analyze usage trends, performance benchmarks, and the actual impact on businesses that rely on GraphX.
>>>>>>>>>>>>    Objectivity can be restored by letting data speak louder than opinions, so to speak.
>>>>>>>>>>>> 2. Broaden the Discussion: Engage more stakeholders from diverse backgrounds (especially Spark users) to bring in new perspectives and counterbalance the more vocal but potentially narrow interests of core maintainers or open-source contributors.
>>>>>>>>>>>> 3. Define Clear Criteria for Decision Making: Agree on a set of objective criteria by which the project’s future will be judged. These could include market demand, contribution levels, maintenance costs, alternative solutions, and alignment with the overall Spark ecosystem goals. Some have already been covered.
>>>>>>>>>>>> 4. Timely Conclusion of Discussions: Set a timeline for making a decision. Long, open-ended discussions tend to lose focus. Putting deadlines forces participants to focus on key issues and prevents endless debates.
>>>>>>>>>>>> 5. Borrowing from commercial settings, it is often necessary for a strong leadership team to step in and make the final decision after considering the input. When the objectivity of discussions starts to wane, leadership needs to cut through the round discussions and steer towards action based on business and technical realities.
>>>>>>>>>>>>
>>>>>>>>>>>> HTH
>>>>>>>>>>>>
>>>>>>>>>>>> Mich Talebzadeh
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, 5 Oct 2024 at 06:26, Ángel <angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I completely agree with everyone here. I don’t think the issue is deprecating it; to me, the problem lies in not providing a new and better solution for handling graphs in Spark.
>>>>>>>>>>>> In the past, I used GraphX via GraphFrames for record linkage, and I found it both useful and effective. Is there any discussion about a potential replacement?
>>>>>>>>>>>>
>>>>>>>>>>>> I’d be willing to help maintain GraphX, though I don’t have previous experience with maintaining open-source projects. All I can promise is good intentions, willingness to learn and lots of energy and passion. Is that enough?
>>>>>>>>>>>>
>>>>>>>>>>>> Btw, what's your take on this?
>>>>>>>>>>>>
>>>>>>>>>>>> · *GraphX* will be deprecated in favor of a new graphing component, SparkGraph, based on Cypher <https://neo4j.com/developer/cypher-query-language/>, a much richer graph language than previously offered by GraphX.
>>>>>>>>>>>>
>>>>>>>>>>>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra (<markhams...@gmail.com>) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> As I wrote to Holden privately, I might well change my vote to be in favor of a deprecation label combined with some effective means of communicating that this doesn't mean the end for GraphX if interested contributors come forward to rescue it.
>>>>>>>>>>>> I don't like either the idea of keeping unmaintained code and public APIs around (especially if there are problems with them) or the idea of removing Spark functionality just because no one has contributed to it for a while. A naked deprecation label feels somewhat drastic and pre-emptive to me. I don't expect that GraphX will be the last part of Spark to run the risk of death through neglect, and I think we need an effective means of encouraging resuscitation that a deprecation label on its own does not provide. On the other hand, if no one really is willing to come to the aid of GraphX or other neglected functionality given adequate warning of possible removal, I'm not then opposed to the usual deprecation and removal process.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > This is a reasonable discussion, but maybe the more practical point is: are you sure you want to block this unilaterally? This effectively makes a decision that GraphX cannot be removed for a long while. I'd understand it more if we had an active maintainer and/or active user proposing to veto, but my understanding is this is just a proposal to block this on behalf of some users, someone else who might do some work and hasn't to date for some reason. Add to that the fact that the 'pro' arguments all seem to be arguments for working on GraphFrames, and I find this somewhat drastic.
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> "You can't say nothing is removable until there are no users."
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> That is not what I am saying. Rather, I am countering what others seem to be suggesting: There are no users and no interest, therefore we can and should deprecate.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <sro...@gmail.com> wrote:
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > I could flip this argument around. More strongly, not being deprecated means "won't be removed" and likewise implies support and development. I don't think either of the latter have been true for years. What suggests this will change? A todo list is not going to do anything, IMHO.
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > I'm also concerned about the cost of that, which I have observed. GraphX PRs are almost certainly not going to be reviewed because of its state. Deprecation both communicates that reality, and leaves an option open, whereas not deprecating forecloses that option for a while.
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > I don't think the question is, does anyone use it? because anyone can continue to use it -- in Spark 3.x for sure, and in 4.x if not removed.
>>>>>>>>>>>> >> > You can't say nothing is removable until there are no users.
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > Also, why would GraphFrames not be the logical home of this going forward anyway? which I think is the subtext.
>>>>>>>>>>>> >> >
>>>>>>>>>>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> I'm -1(*) because, while it technically means "might be removed in the future", I think developers and users are more prone to interpret something being marked as deprecated as "very likely will be removed in the future, so don't depend on this or waste your time contributing to its further development." I don't think the latter is what we want just because something hasn't been updated meaningfully in a while. There have been How To articles for GraphX and Graph Frames posted in the not too distant past, and the Google Search trend shows a pretty steady level of interest, not a decline to zero, so I don't think that it is accurate to declare that there is no use or interest in GraphX.
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> Unless retaining GraphX is imposing significant costs on continuing Spark development, I can't support deprecating GraphX.
>>>>>>>>>>>> >> >> I can support encouraging GraphX and Graph Frames development through something like a To Do list or document of "What we'd like to see in the way of further development of Spark's graph processing capabilities" -- i.e., things that encourage and support new contributions to address any shortcomings in Spark's graph processing, not things that discourage contributions and use in the way that I believe simply declaring GraphX to be deprecated would.
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>>>>>>>>>>> >> >> >
>>>>>>>>>>>> >> >> > Since we're getting close to cutting a 4.0 branch I'd like to float the idea of officially deprecating GraphX. What that would mean (to me) is we would update the docs to indicate that GraphX is deprecated and its APIs may be removed at any time in the future.
>>>>>>>>>>>> >> >> >
>>>>>>>>>>>> >> >> > Alternatively, we could mark it as "unmaintained and in search of maintainers" with a note that if no maintainers are found, we may remove it in a future minor version.
>>>>>>>>>>>> >> >> >
>>>>>>>>>>>> >> >> > Looking at the GraphX source, I don't see any meaningful active development going back over three years*. There is even a thread on user@ from 2017 asking if GraphX is maintained anymore, with no response from the developers.
>>>>>>>>>>>> >> >> >
>>>>>>>>>>>> >> >> > Now I'm open to the idea that GraphX is stable and "works as is" and simply doesn't require modifications but given the user thread I'm a little concerned here about bringing this API with us into Spark 4 if we don't have anyone signed up to maintain it.
>>>>>>>>>>>> >> >> >
>>>>>>>>>>>> >> >> > * Excluding globally applied changes
>>>>>>>>>>>> >> >>
>>>>>>>>>>>> >> >> ---------------------------------------------------------------------
>>>>>>>>>>>> >> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org