In fact, I showed it not just to you but to another colleague of yours too. But there has been absolutely no comment on it from then till now.
On Thu, May 28, 2026 at 11:19 AM Asif Shahid <[email protected]> wrote:
> also take a look at this jira
> https://issues.apache.org/jira/browse/SPARK-47320
> for this also an alternate PR was opened.
> This problem is so deep in code, that I even showed you that in the
> existing test itself, if the join condition's operands are swapped, test
> fails.. It's completely broken, the self joins.
> I had proposed a consistent fix, which solves the issue completely and
> logically, but again an alternate PR was filed..
> What issue was there in my PR, which I created...?
> Regards
> Asif
>
> On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]>
> wrote:
>
>>
>>
>> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]> wrote:
>>
>>>
>>> As for the fix, itself, is not indicative of any thing as its a one
>>>> liner, test has uncanny resemblance
>>>
>>>
>>> Asif, what exactly is the "uncanny resemblance" between those test cases
>>> in https://github.com/apache/spark/pull/49154/changes vs
>>> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
>>> that obviously they are comparing canonicalized forms.
>>> Again, sorry for not noticing your PR, but I don't feel my fix has
>>> anything to do with yours.
>>>
>> Ok. I respect your opinion. Each one is entitled to its own view
>>
>>>
>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>> weeks. I filed a PR. The bug was fixed via a different PR, taken a
>>>> different route.
>>>
>>>
>>> Do you see anything in common between
>>> https://github.com/apache/spark/pull/50029/changes and
>>> https://github.com/apache/spark/pull/50757/changes ?
>>> Because I do see. That someone else had a much better idea:
>>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
>>> https://github.com/apache/spark/pull/50230 and it was implemented for
>>> the benefit of Spark.
>>> IMO, that's the normal way of dealing with issues in an open-source
>>> project. Ideas come and go and hopefully the one best wins.
>>>
>> The checksum approach has its expense. That can come later, because
>> a priori it's possible to detect whether the expression is returning value
>> from an indeterministic expression.
>> You opened an alternate PR, which I have described in the PR discussion
>> that to fix the round robin issue which you were dealing with, you are
>> trying to impose an order in in-deterministic expression evaluation, which
>> itself is against the basic premise that if data is in-determinate, there
>> cannot be order in it.
>> What issue did you see in the logic, that an alternate PR was
>> opened...which impacted all the stages (including the ancestors?) and I
>> already discussed internally why the idea you had in mind would not work. I
>> specifically asked, why don't we discuss via the PR filed...
>>
>>
>>
>>>
>>> Peter
>>>
>>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]>
>>> wrote:
>>>
>>>> Hi Nicholas,
>>>> You wanted some examples, right:
>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>> weeks. I filed a PR. The bug was fixed via a different PR, taken a
>>>> different route.
>>>> Did any one who created new PR and route, showed up any unaddressable
>>>> logical issue?
>>>> The same goes for all the PRs (which in case I have closed)
>>>> Regards
>>>> Asif
>>>>
>>>>
>>>>
>>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>>>> [email protected]> wrote:
>>>>
>>>>> I think repeatedly calling the contributors on this list a “cartel” is
>>>>> not conducive to a calm and amicable resolution.
>>>>>
>>>>> You may have some history built up that led you to use that word, but
>>>>> to the rest of us it comes out of nowhere; you in fact opened this thread
>>>>> with that attack.
>>>>> If you keep making your case in this manner, you will
>>>>> just turn everyone against you.
>>>>>
>>>>> If there is a history of what you feel is others stealing your work,
>>>>> please link to a few examples so we can see what you are seeing. If you
>>>>> can’t do that, then just focus on this current example. And try to refrain
>>>>> from calling people names unless your goal is just to have a fight, as
>>>>> opposed to resolving the problematic behavior so you can continue to
>>>>> contribute.
>>>>>
>>>>> I am not a committer and don’t have any special role in this
>>>>> community. I am speaking just as an observer and regular contributor to the
>>>>> project.
>>>>>
>>>>> > I have experienced this before, as recent as couple of months back (
>>>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>
>>>>> For others following along, I took a look at this ticket and the
>>>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261> /
>>>>> #53100 <https://github.com/apache/spark/pull/53100>
>>>>>
>>>>> It looks like Asif is upset that he submitted a fix for the same issue
>>>>> a week or so prior to the fix that eventually got merged. But the fixes are
>>>>> different, and the one that got merged is a lot shorter, though they are
>>>>> both simple. The PR that got merged was submitted by someone who appears to
>>>>> be employed by Databricks; perhaps this is part of the “cartel”
>>>>> accusation.
>>>>> The two PRs were reviewed by different committers, however, and the one
>>>>> that got merged was merged in by someone who does _not_ work for
>>>>> Databricks.
>>>>>
>>>>> I don’t see anything here other than the normal dynamic of a large and
>>>>> busy open source project. Committer attention is limited; things fall
>>>>> through the cracks; different contributors may occasionally work on the
>>>>> same thing without knowing about each other.
>>>>> A minor help to this specific
>>>>> problem would be to have some way of automatically linking issues that
>>>>> appear to be about the same thing.
>>>>>
>>>>> Nick
>>>>>
>>>>>
>>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Peter,
>>>>> Pls see inline for comments/replies
>>>>>
>>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey Asif,
>>>>>>
>>>>>> Are you referring to
>>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>>> definitely solving the same issue but I can assure you I wouldn't take any
>>>>>> code from your PR without consulting with you first.
>>>>>>
>>>>> Yes indeed, Peter, I am referring to those.
>>>>> As for the fix, itself, is not indicative of any thing as its a one
>>>>> liner, test has uncanny resemblance.
>>>>>
>>>>>
>>>>>> As far as I remember, I opened SPARK-56694 /
>>>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>>>> minor bug during the implementation of
>>>>>> https://github.com/apache/spark/pull/55298.
>>>>>>
>>>>>
>>>>>
>>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>>
>>>>>
>>>>> The below I am addressing to the whole cartel:
>>>>> I have experienced this before, as recent as couple of months back (
>>>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>>> I have experienced, my personal effort (going into weeks) to debug,
>>>>> reproduce issue reliably, being hijacked by members, without even
>>>>> discussing the fix proposed, (by opening new PRs).
>>>>> (If interested, I can
>>>>> provide details of the PRs / issues I am talking about)
>>>>> I have seen a perfectly valid PR being nixed, by following comment
>>>>> which essentially said
>>>>> "my code of making the cache lookup more effective, would result in
>>>>> greater chances of stale cache being picked, which already Spark suffers
>>>>> from."
>>>>> Now the PR was related to collapsing the projects in analysis phase,
>>>>> and side effect was cache pick up being more sensitive.
>>>>> So this is such a frivolous reason to nix the PR, because "staleness"
>>>>> is an underlying existing issue which had nothing to do with my PR. And it's
>>>>> more amusing, that if a DB is giving even one wrong result in millions,
>>>>> that makes all the results a suspect in any case. It does not matter at
>>>>> what frequency this occurs. To me the real reason was code complexity (&
>>>>> more likely the loss of control of the code to the outsider).
>>>>>
>>>>> The reason I call this open source community as cartel, is because, I
>>>>> have seen the way it works pretty closely and have experienced it in the
>>>>> email exchanges which happen on this group.
>>>>> For the same PR, same issue, if advertently or inadvertently, other
>>>>> person (especially a member) gets his changes pushed, by the virtue of his
>>>>> standing/position and the "for profit" company the person works, how would
>>>>> you give the credit to the original person who discovered the issue first /
>>>>> provided the fix?
>>>>> Why are issues filed by some immediately worked upon by members (some
>>>>> of whom claim to be working full time on Spark)? Is it because certain
>>>>> companies / groups (for profit companies, mind you) exert undue
>>>>> control, or the petty newbie has to be in the good books of members (with
>>>>> the hope that at some point they will also reach that position of power?)
>>>>>
>>>>> Given the AI advent and such occurrences, how will you give due
>>>>> credit to the original creators and how do you plan to prevent some member
>>>>> from taking up idea of any old open PR (which for reasons of complexity and
>>>>> non-technical reasons), polishing it up and pushing it as their own?
>>>>>
>>>>> I am also curious, am I the only one who is troubled by all this, or
>>>>> there are others who have experienced it?
>>>>>
>>>>> Regards
>>>>> Asif
>>>>>
>>>>>
>>>>>> If you have further improvements please feel free to open a PR.
>>>>>>
>>>>>> Best,
>>>>>> Peter
>>>>>>
>>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I had filed a bug
>>>>>>> https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>>
>>>>>>> I had also opened a PR for the same.
>>>>>>>
>>>>>>> Now I see that the ticket I filed is still open, but the issue has
>>>>>>> been fixed using a new ticket
>>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>>
>>>>>>> and on top of that the bug test and of course the fix (which in any
>>>>>>> case would be same) has been taken from my PR for
>>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>>
>>>>>>> To me this is clear unethical conduct of cartel member, unless I am
>>>>>>> missing some valid reason.
>>>>>>>
>>>>>>> And the irony is that the fix is still incomplete, as I just found
>>>>>>> and filed a new ticket
>>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>>
>>>>>>> I know that at least some cartel members are insecure and think of
>>>>>>> OSS as their fiefdom, but this sort of behaviour, I never expected.
>>>>>>> Regards
>>>>>>> Asif
>>>>>>>
>>>>>>
>>>>>
