Also, I must admit that I did not know OSS works by opening alternate PRs. In the places where I have worked most of my life, we work on the opened PR with the original author and try to bridge the gap.
On Thu, May 28, 2026, 11:25 AM Asif Shahid <[email protected]> wrote:
> In fact, I showed it not just to you but other colleague of yours too. But
> there has been absolutely no comment or anything on that from then, till
> now.
>
> On Thu, May 28, 2026 at 11:19 AM Asif Shahid <[email protected]>
> wrote:
>
>> also take a look at this jira
>> https://issues.apache.org/jira/browse/SPARK-47320
>> for this also an alternate PR was opened.
>> This problem is so deep in code, that I even showed you that in the
>> existing test itself, if the join condition's operands are swapped, test
>> fails. It's completely broken, the self joins.
>> I had proposed a consistent fix, which solves the issue completely and
>> logically, but again an alternate PR was filed.
>> What issue was there in my PR, which I created...?
>> Regards
>> Asif
>>
>> On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]>
>> wrote:
>>
>>> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]>
>>> wrote:
>>>
>>>>> As for the fix, itself, is not indicative of anything as it's a one
>>>>> liner, test has uncanny resemblance
>>>>
>>>> Asif, what exactly is the "uncanny resemblance" between those test
>>>> cases in https://github.com/apache/spark/pull/49154/changes vs
>>>> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
>>>> that obviously they are comparing canonicalized forms.
>>>> Again, sorry for not noticing your PR, but I don't feel my fix has
>>>> anything to do with yours.
>>>>
>>> Ok. I respect your opinion. Each one is entitled to its own view
>>>
>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>>> weeks. I filed a PR. The bug was fixed via a different PR, taken a
>>>>> different route.
>>>>
>>>> Do you see anything in common between
>>>> https://github.com/apache/spark/pull/50029/changes and
>>>> https://github.com/apache/spark/pull/50757/changes ?
>>>> Because I do see. That someone else had a much better idea:
>>>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
>>>> https://github.com/apache/spark/pull/50230 and it was implemented for
>>>> the benefit of Spark.
>>>> IMO, that's the normal way of dealing with issues in an open-source
>>>> project. Ideas come and go and hopefully the best one wins.
>>>>
>>> The checksum approach has its expense. That can come later, because
>>> a priori it's possible to detect whether the expression is returning value
>>> from an indeterministic expression.
>>> You opened an alternate PR, which I have described in the PR discussion
>>> that to fix the round robin issue which you were dealing with, you are
>>> trying to impose an order in in-deterministic expression evaluation, which
>>> itself is against the basic premise that if data is in-determinate, there
>>> cannot be order in it.
>>> What issue did you see in the logic, that an alternate PR was
>>> opened...which impacted all the stages (including the ancestors?) and I
>>> already discussed internally why the idea you had in mind would not work. I
>>> specifically asked, why don't we discuss via the PR filed...
>>>
>>>>
>>>> Peter
>>>>
>>>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Nicholas,
>>>>> You wanted some examples, right:
>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>>> weeks. I filed a PR. The bug was fixed via a different PR, taken a
>>>>> different route.
>>>>> Did anyone who created new PR and route, showed up any unaddressable
>>>>> logical issue?
>>>>> The same goes for all the PRs (which in case I have closed)
>>>>> Regards
>>>>> Asif
>>>>>
>>>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <[email protected]> wrote:
>>>>>
>>>>>> I think repeatedly calling the contributors on this list a “cartel”
>>>>>> is not conducive to a calm and amicable resolution.
>>>>>>
>>>>>> You may have some history built up that led you to use that word, but
>>>>>> to the rest of us it comes out of nowhere; you in fact opened this thread
>>>>>> with that attack. If you keep making your case in this manner, you will
>>>>>> just turn everyone against you.
>>>>>>
>>>>>> If there is a history of what you feel is others stealing your work,
>>>>>> please link to a few examples so we can see what you are seeing. If you
>>>>>> can’t do that, then just focus on this current example. And try to
>>>>>> refrain from calling people names unless your goal is just to have a
>>>>>> fight, as opposed to resolving the problematic behavior so you can
>>>>>> continue to contribute.
>>>>>>
>>>>>> I am not a committer and don’t have any special role in this
>>>>>> community. I am speaking just as an observer and regular contributor to
>>>>>> the project.
>>>>>>
>>>>>> > I have experienced this before, as recent as couple of months back
>>>>>> > ( https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>>
>>>>>> For others following along, I took a look at this ticket and the
>>>>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261>
>>>>>> / #53100 <https://github.com/apache/spark/pull/53100>
>>>>>>
>>>>>> It looks like Asif is upset that he submitted a fix for the same
>>>>>> issue a week or so prior to the fix that eventually got merged. But the
>>>>>> fixes are different, and the one that got merged is a lot shorter, though
>>>>>> they are both simple. The PR that got merged was submitted by someone who
>>>>>> appears to be employed by Databricks; perhaps this is part of the
>>>>>> “cartel” accusation. The two PRs were reviewed by different committers,
>>>>>> however, and the one that got merged was merged in by someone who does
>>>>>> _not_ work for Databricks.
>>>>>>
>>>>>> I don’t see anything here other than the normal dynamic of a large
>>>>>> and busy open source project. Committer attention is limited; things fall
>>>>>> through the cracks; different contributors may occasionally work on the
>>>>>> same thing without knowing about each other. A minor help to this
>>>>>> specific problem would be to have some way of automatically linking
>>>>>> issues that appear to be about the same thing.
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Peter,
>>>>>> Pls see inline for comments/replies
>>>>>>
>>>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey Asif,
>>>>>>>
>>>>>>> Are you referring to
>>>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>>>> definitely solving the same issue but I can assure you I wouldn't take
>>>>>>> any code from your PR without consulting with you first.
>>>>>>>
>>>>>> Yes Indeed Peter, I am referring to those.
>>>>>> As for the fix, itself, is not indicative of anything as it's a one
>>>>>> liner, test has uncanny resemblance.
>>>>>>
>>>>>>> As far as I remember, I opened SPARK-56694 /
>>>>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>>>>> minor bug during the implementation of
>>>>>>> https://github.com/apache/spark/pull/55298.
>>>>>>>
>>>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>>>
>>>>>> The below I am addressing to the whole cartel:
>>>>>> I have experienced this before, as recent as couple of months back
>>>>>> ( https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>> I have experienced, my personal effort (going into weeks) to debug,
>>>>>> reproduce issue reliably, being hijacked by members, without even
>>>>>> discussing the fix proposed (by opening new PRs). (If interested, I
>>>>>> can provide details of the PRs / issues I am talking about)
>>>>>> I have seen a perfectly valid PR being nixed, by following comment
>>>>>> which essentially said
>>>>>> "my code of making the cache lookup more effective, would result
>>>>>> in greater chances of stale cache being picked, which already Spark
>>>>>> suffers from."
>>>>>> Now the PR was related to collapsing the projects in analysis phase,
>>>>>> and side effect was cache pick up being more sensitive.
>>>>>> So this is such a frivolous reason to nix the PR, because
>>>>>> "staleness" is an underlying existing issue which had nothing to do with
>>>>>> my PR. And it's more amusing, that if a DB is giving even one wrong
>>>>>> result in millions, that makes all the results a suspect in any case.
>>>>>> It does not matter at what frequency this occurs. To me the real reason
>>>>>> was code complexity (& more likely the loss of control of the code to
>>>>>> the outsider).
>>>>>>
>>>>>> The reason I call this open source community as cartel, is because, I
>>>>>> have seen the way it works pretty closely and have experienced it in the
>>>>>> email exchanges which happen on this group.
>>>>>> For the same PR, same issue, if advertently or inadvertently,
>>>>>> other person (especially a member) gets his changes pushed, by the
>>>>>> virtue of his standing/position and the "for profit" company the person
>>>>>> works, how would you give the credit to the original person who
>>>>>> discovered the issue first / provided the fix?
>>>>>> Why are issues filed by some immediately worked upon by members
>>>>>> (some of whom claim to be working full time on Spark)? Is it because
>>>>>> certain companies / groups (for profit companies, mind you) exert
>>>>>> undue control, or the petty newbie has to be in the good books of
>>>>>> members (with the hope that at some point they will also reach that
>>>>>> position of power?)
>>>>>>
>>>>>> Given the AI advent and such occurrences, how will you give due
>>>>>> credit to the original creators and how do you plan to prevent some
>>>>>> member from taking up idea of any old open PR (which for reasons of
>>>>>> complexity and non technical reasons), polishing it up and pushing it
>>>>>> as their own?
>>>>>>
>>>>>> I am also curious, am I the only one who is troubled by all this, or
>>>>>> there are others who have experienced it?
>>>>>>
>>>>>> Regards
>>>>>> Asif
>>>>>>
>>>>>>> If you have further improvements please feel free to open a PR.
>>>>>>>
>>>>>>> Best,
>>>>>>> Peter
>>>>>>>
>>>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I had filed a bug
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>>>
>>>>>>>> I had also opened a PR for the same.
>>>>>>>>
>>>>>>>> Now I see that the ticket I filed is still open, but the issue has
>>>>>>>> been fixed using a new ticket
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>>>
>>>>>>>> and on top of that the bug test and of course the fix (which in any
>>>>>>>> case would be same) has been taken from my PR for
>>>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>>>
>>>>>>>> To me this is clear unethical conduct of cartel member, unless I am
>>>>>>>> missing some valid reason.
>>>>>>>>
>>>>>>>> And the irony is that the fix is still incomplete, as I just found
>>>>>>>> and filed a new ticket
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>>>
>>>>>>>> I know that at least some cartel members are insecure and think of
>>>>>>>> OSS as their fiefdom, but this sort of behaviour, I never expected.
>>>>>>>> Regards
>>>>>>>> Asif
