On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]> wrote:
>> As for the fix, itself, is not indicative of any thing as its a one liner,
>> test has uncanny resemblance
>
>
> Asif, what exactly is the "uncanny resemblance" between those test cases
> in https://github.com/apache/spark/pull/49154/changes vs
> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
> that obviously they are comparing canonicalized forms.
> Again, sorry for not noticing your PR, but I don't feel my fix has
> anything to do with yours.
>

OK. I respect your opinion. Everyone is entitled to their own view.

>
>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>> To discover this bug and reproduce it reliably, I spent nearly 2- 3
>> weeks. I filed a PR. The bug was fixed via a different PR , taken a
>> different route.
>
>
> Do you see anything in common between
> https://github.com/apache/spark/pull/50029/changes and
> https://github.com/apache/spark/pull/50757/changes ?
> Because I do see. That someone else had a much better idea:
> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
> https://github.com/apache/spark/pull/50230 and it was implemented for the
> benefit of Spark.
> IMO, that's the normal way of dealing with issues in an open-source
> project. Ideas come and go and hopefully the one best wins.
>

The checksum approach has its expense. That can come later, because a priori it's possible to detect whether the expression is returning a value from an indeterministic expression. You opened an alternate PR; as I described in the PR discussion, to fix the round-robin issue you were dealing with, you are trying to impose an order on indeterministic expression evaluation, which itself is against the basic premise that if data is indeterminate, there cannot be order in it. What issue did you see in the logic, that an alternate PR was opened... which impacted all the stages (including the ancestors)? I had already discussed internally why the idea you had in mind would not work.
I specifically asked, why dont we discuss via the PR filed...

>
> Peter
>
> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]> wrote:
>
>> Hi Nicholas,
>> You wanted some examples , right:
>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>> To discover this bug and reproduce it reliably, I spent nearly 2- 3
>> weeks. I filed a PR. The bug was fixed via a different PR , taken a
>> different route.
>> Did any one who created new PR and route, showed up any unaddressable
>> logical issue?
>> The same goes for all the PRs ( which in case I have closed)
>> Regards
>> Asif
>>
>>
>>
>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>> [email protected]> wrote:
>>
>>> I think repeatedly calling the contributors on this list a “cartel” is
>>> not conducive to a calm and amicable resolution.
>>>
>>> You may have some history built up that led you to use that word, but to
>>> the rest of us it comes out of nowhere; you in fact opened this thread with
>>> that attack. If you keep making your case in this manner, you will just
>>> turn everyone against you.
>>>
>>> If there is a history of what you feel is others stealing your work,
>>> please link to a few examples so we can see what you are seeing. If you
>>> can’t do that, then just focus on this current example. And try to refrain
>>> from calling people names unless your goal is just to have a fight, as
>>> opposed to resolving the problematic behavior so you can continue to
>>> contribute.
>>>
>>> I am not a committer and don’t have any special role in this community.
>>> I am speaking just as an observer and regular contributor to the project.
>>>
>>> > I have experienced this before, as recent as couple of months back (
>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>
>>> For others following along, I took a look at this ticket and the
>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261> /
>>> #53100 <https://github.com/apache/spark/pull/53100>
>>>
>>> It looks like Asif is upset that he submitted a fix for the same issue a
>>> week or so prior to the fix that eventually got merged. But the fixes are
>>> different, and the one that got merged is a lot shorter, though they are
>>> both simple. The PR that got merged was submitted by someone who appears to
>>> be employed by Databricks; perhaps this is part of the “cartel” accusation.
>>> The two PRs were reviewed by different committers, however, and the one
>>> that got merged was merged in by someone who does _not_ work for Databricks.
>>>
>>> I don’t see anything here other than the normal dynamic of a large and
>>> busy open source project. Committer attention is limited; things fall
>>> through the cracks; different contributors may occasionally work on the
>>> same thing without knowing about each other. A minor help to this specific
>>> problem would be to have some way of automatically linking issues that
>>> appear to be about the same thing.
>>>
>>> Nick
>>>
>>>
>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]> wrote:
>>>
>>> Hi Peter,
>>> Pls see inline for comments/ replies
>>>
>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]> wrote:
>>>
>>>> Hey Asif,
>>>>
>>>> Are you referring to https://github.com/apache/spark/pull/49154/changes
>>>> vs. https://github.com/apache/spark/pull/55644/changes? Those are
>>>> definitely solving the same issue but I can assure you I wouldn't take any
>>>> code from your PR without consulting with you first.
>>>>
>>> Yes Indeed Peter, I am referring to those.
>>> As for the fix, itself, is not indicative of any thing as its a one
>>> liner, test has uncanny resemblance.
>>>
>>>
>>>> As far as I remember, I opened SPARK-56694 /
>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>> minor bug during the implementation of
>>>> https://github.com/apache/spark/pull/55298.
>>>>
>>>
>>>
>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>
>>>
>>> The below I am addressing to the whole cartel.:
>>> I have experienced this before, as recent as couple of months back (
>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>> I have experienced, my personal effort ( going into weeks) to debug,
>>> reproduce issue reliably , being hijacked by members, without even
>>> discussing the fix proposed, ( by opening new PRs). ( If interested, I can
>>> provide details of the PRs / issues I am talking about)
>>> I have seen a perfectly valid PR being nixed , by following comment
>>> which essentially said
>>> " my code of making the cache lookup more effective , would result in
>>> greater chances of stale cache being picked, which already spark suffers
>>> from."
>>> Now the PR was related to collapsing the projects in analysis phase, and
>>> side effect was cache pick up being more sensitive.
>>> So this is such a frivolous reason to nix the PR , because "staleness"
>>> is an underlying existing issue which had nothing to do with my PR. And its
>>> more amusing , that if a DB is giving even one wrong result in millions,
>>> that makes all the results a suspect in any case. It does not matter at
>>> what frequency this occurs. To me the real reason was code complexity ( &
>>> more likely the loss of control of the code to the outsider).
>>>
>>> The reason I call this open source community as cartel, is because, I
>>> have seen the way it works pretty closely and have experienced it in the
>>> email exchanges which happen on this group.
>>> For the same PR , same issue, if advertently or inadvertently , other
>>> person ( especially a member) gets his changes pushed, by the virtue of his
>>> standing/position and the "for profit" company the person works, how would
>>> you give the credit to the original person who discovered the issue first /
>>> provided the fix?
>>> Why are issues filed by some immediately worked upon by members ( some
>>> of whom claim to be working full time on spark) ? Is it because certain
>>> companies / groups ( for profit companies, mind you ) exert undue
>>> control, or the petty newbee has to be in the good books of members ( with
>>> the hope that at some point they will also reach that position of power ?)
>>>
>>> Given the AI advent and such occurrences, how will you give due credit
>>> to the original creators and how do you plan to prevent some member for
>>> taking up idea of any old open PR ( which for reasons of complexity and non
>>> technical reasons) , polishing it up and pushing it as their own?
>>>
>>> I am also curious , am I the only one who is troubled by all this, or
>>> there are others who have experienced it?
>>>
>>> Regards
>>> Asif
>>>
>>>
>>>> If you have further improvements please feel free to open a PR.
>>>>
>>>> Best,
>>>> Peter
>>>>
>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I had filed a bug
>>>>> https://issues.apache.org/jira/browse/SPARK-45866
>>>>>
>>>>> I had also opened a PR for the same.
>>>>>
>>>>> Now I see that the ticket I filed is still open, but the issue has
>>>>> been fixed using a new ticket
>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>
>>>>> and on top of that the bug test and ofcourse the fix ( which in any
>>>>> case would be same) has been taken from my PR for
>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>
>>>>> To me this is clear unethical conduct of cartel member, unless I am
>>>>> missing some valid reason.
>>>>>
>>>>> And the irony is that the fix is still incomplete, as I just found and
>>>>> filed a new ticket
>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>
>>>>> I know that atleast some cartel members are insecure and think of OSS
>>>>> as their fiefdom, but this sort of behaviour , I never expected.
>>>>> Regards
>>>>> Asif
>>>>>
>>>>
>>>
