Also take a look at this JIRA:
https://issues.apache.org/jira/browse/SPARK-47320
An alternate PR was opened for this one as well.
This problem is so deep in the code that I even showed you that, in the
existing test itself, if the join condition's operands are swapped, the test
fails. Self joins are completely broken.
I had proposed a consistent fix, which solves the issue completely and
logically, but again an alternate PR was filed.
What issue was there in the PR I created?
Regards
Asif

On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]> wrote:

>
>
> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]> wrote:
>
>>
>> The fix itself is not indicative of anything, as it's a
>>> one-liner, but the test has an uncanny resemblance
>>
>>
>> Asif, what exactly is the "uncanny resemblance" between those test cases
>> in https://github.com/apache/spark/pull/49154/changes vs
>> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
>> that obviously they are comparing canonicalized forms.
>> Again, sorry for not noticing your PR, but I don't feel my fix has
>> anything to do with yours.
>>
> OK. I respect your opinion. Everyone is entitled to their own view.
>
>>
>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>> weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>> different route.
>>
>>
>> Do you see anything in common between
>> https://github.com/apache/spark/pull/50029/changes and
>> https://github.com/apache/spark/pull/50757/changes ?
>> Because I do. Someone else had a much better idea:
>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
>> https://github.com/apache/spark/pull/50230 and it was implemented for
>> the benefit of Spark.
>> IMO, that's the normal way of dealing with issues in an open-source
>> project. Ideas come and go and hopefully the best one wins.
>>
> The checksum approach has its cost. It can come later, because it is
> possible to detect a priori whether the expression is returning a value
> from a non-deterministic expression.
> You opened an alternate PR. As I described in the PR discussion, to fix
> the round-robin issue you were dealing with, you are trying to impose an
> order on non-deterministic expression evaluation, which itself goes
> against the basic premise that if data is indeterminate, there cannot be
> order in it.
> What issue did you see in the logic that led to an alternate PR being
> opened, one which impacted all the stages (including the ancestors)? I had
> already discussed internally why the idea you had in mind would not work. I
> specifically asked: why don't we discuss via the PR that was filed?
>
>
>
>>
>> Peter
>>
>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]>
>> wrote:
>>
>>> Hi Nicholas,
>>> You wanted some examples, right?
>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>> weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>> different route.
>>> Did anyone who created the new PR and route point out any unaddressable
>>> logical issue?
>>> The same goes for all the PRs (which, in any case, I have closed).
>>> Regards
>>> Asif
>>>
>>>
>>>
>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>>> [email protected]> wrote:
>>>
>>>> I think repeatedly calling the contributors on this list a “cartel” is
>>>> not conducive to a calm and amicable resolution.
>>>>
>>>> You may have some history built up that led you to use that word, but
>>>> to the rest of us it comes out of nowhere; you in fact opened this thread
>>>> with that attack. If you keep making your case in this manner, you will
>>>> just turn everyone against you.
>>>>
>>>> If there is a history of what you feel is others stealing your work,
>>>> please link to a few examples so we can see what you are seeing. If you
>>>> can’t do that, then just focus on this current example. And try to refrain
>>>> from calling people names unless your goal is just to have a fight, as
>>>> opposed to resolving the problematic behavior so you can continue to
>>>> contribute.
>>>>
>>>> I am not a committer and don’t have any special role in this community.
>>>> I am speaking just as an observer and regular contributor to the project.
>>>>
>>>> > I have experienced this before, as recently as a couple of months back (
>>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>>
>>>> For others following along, I took a look at this ticket and the
>>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261> /
>>>> #53100 <https://github.com/apache/spark/pull/53100>
>>>>
>>>> It looks like Asif is upset that he submitted a fix for the same issue
>>>> a week or so prior to the fix that eventually got merged. But the fixes are
>>>> different, and the one that got merged is a lot shorter, though they are
>>>> both simple. The PR that got merged was submitted by someone who appears to
>>>> be employed by Databricks; perhaps this is part of the “cartel” accusation.
>>>> The two PRs were reviewed by different committers, however, and the one
>>>> that got merged was merged in by someone who does _not_ work for 
>>>> Databricks.
>>>>
>>>> I don’t see anything here other than the normal dynamic of a large and
>>>> busy open source project. Committer attention is limited; things fall
>>>> through the cracks; different contributors may occasionally work on the
>>>> same thing without knowing about each other. A minor help to this specific
>>>> problem would be to have some way of automatically linking issues that
>>>> appear to be about the same thing.
>>>>
>>>> Nick
>>>>
>>>>
>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]>
>>>> wrote:
>>>>
>>>> Hi Peter,
>>>> Please see inline for comments/replies.
>>>>
>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]>
>>>> wrote:
>>>>
>>>>> Hey Asif,
>>>>>
>>>>> Are you referring to
>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>> definitely solving the same issue but I can assure you I wouldn't take any
>>>>> code from your PR without consulting with you first.
>>>>>
>>>> Yes indeed, Peter, I am referring to those.
>>>> The fix itself is not indicative of anything, as it's a one-liner,
>>>> but the test has an uncanny resemblance.
>>>>
>>>>
>>>>> As far as I remember, I opened SPARK-56694 /
>>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>>> minor bug during the implementation of
>>>>> https://github.com/apache/spark/pull/55298.
>>>>>
>>>>
>>>>
>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>
>>>>
>>>> The below I am addressing to the whole cartel:
>>>> I have experienced this before, as recently as a couple of months back (
>>>> https://issues.apache.org/jira/browse/SPARK-54386).
>>>> I have experienced my personal effort (running into weeks) to debug and
>>>> reliably reproduce an issue being hijacked by members who opened new PRs
>>>> without even discussing the proposed fix. (If interested, I can provide
>>>> details of the PRs / issues I am talking about.)
>>>> I have seen a perfectly valid PR being nixed by a comment which
>>>> essentially said:
>>>> "my code, by making the cache lookup more effective, would result in
>>>> greater chances of a stale cache being picked, which Spark already suffers
>>>> from."
>>>> Now, the PR was related to collapsing the projects in the analysis phase,
>>>> and the side effect was cache pickup being more sensitive.
>>>> So this is a frivolous reason to nix the PR, because "staleness" is an
>>>> underlying existing issue which had nothing to do with my PR. And it is
>>>> all the more amusing because if a DB gives even one wrong result in
>>>> millions, that makes all the results suspect in any case. It does not
>>>> matter at what frequency this occurs. To me, the real reason was code
>>>> complexity (and, more likely, the loss of control of the code to an
>>>> outsider).
>>>>
>>>> The reason I call this open source community a cartel is that I have
>>>> seen the way it works pretty closely and have experienced it in the
>>>> email exchanges which happen on this group.
>>>> For the same PR and the same issue, if, advertently or inadvertently,
>>>> another person (especially a member) gets his changes pushed by virtue of
>>>> his standing/position and the "for profit" company he works for, how
>>>> would you give credit to the original person who discovered the issue
>>>> first / provided the fix?
>>>> Why are issues filed by some immediately worked upon by members (some
>>>> of whom claim to be working full time on Spark)? Is it because certain
>>>> companies / groups (for-profit companies, mind you) exert undue
>>>> control, or because the petty newbie has to be in the good books of
>>>> members (with the hope that at some point they will also reach that
>>>> position of power)?
>>>>
>>>> Given the advent of AI and such occurrences, how will you give due credit
>>>> to the original creators, and how do you plan to prevent some member from
>>>> taking up the idea of any old open PR (which is open for reasons of
>>>> complexity and non-technical reasons), polishing it up and pushing it as
>>>> their own?
>>>>
>>>> I am also curious: am I the only one who is troubled by all this, or
>>>> are there others who have experienced it?
>>>>
>>>> Regards
>>>> Asif
>>>>
>>>>
>>>>> If you have further improvements please feel free to open a PR.
>>>>>
>>>>> Best,
>>>>> Peter
>>>>>
>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I had filed a bug
>>>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>
>>>>>> I had also opened a PR for the same.
>>>>>>
>>>>>> Now I see that the ticket I filed is still open, but the issue has
>>>>>> been fixed using a new ticket:
>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>
>>>>>> and on top of that, the bug test and of course the fix (which in any
>>>>>> case would be the same) have been taken from my PR:
>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>
>>>>>> To me this is clearly unethical conduct by a cartel member, unless I am
>>>>>> missing some valid reason.
>>>>>>
>>>>>> And the irony is that the fix is still incomplete, as I just found
>>>>>> and filed a new ticket:
>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>
>>>>>> I know that at least some cartel members are insecure and think of OSS
>>>>>> as their fiefdom, but I never expected this sort of behaviour.
>>>>>> Regards
>>>>>> Asif
>>>>>>
>>>>>
>>>>
