In fact, I showed it not just to you but to other colleagues of yours too. But
there has been absolutely no comment on it from then until now.

On Thu, May 28, 2026 at 11:19 AM Asif Shahid <[email protected]> wrote:

> Also take a look at this JIRA:
> https://issues.apache.org/jira/browse/SPARK-47320
> For this, too, an alternate PR was opened.
> This problem is so deep in the code that I even showed you that, in the
> existing test itself, if the join condition's operands are swapped, the test
> fails. Self joins are completely broken.
> I had proposed a consistent fix, which solves the issue completely and
> logically, but again an alternate PR was filed.
> What issue was there in the PR I created?
> Regards
> Asif
>
> On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]>
> wrote:
>
>>
>>
>> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]> wrote:
>>
>>>
>>> As for the fix itself, it is not indicative of anything, as it's a
>>>> one-liner; the test has an uncanny resemblance
>>>
>>>
>>> Asif, what exactly is the "uncanny resemblance" between those test cases
>>> in https://github.com/apache/spark/pull/49154/changes vs
>>> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
>>> that obviously they are comparing canonicalized forms.
>>> Again, sorry for not noticing your PR, but I don't feel my fix has
>>> anything to do with yours.
>>>
>> OK. I respect your opinion. Everyone is entitled to their own view.
>>
>>>
>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>> weeks. I filed a PR. The bug was fixed via a different PR, taking a
>>>> different route.
>>>
>>>
>>> Do you see anything in common between
>>> https://github.com/apache/spark/pull/50029/changes and
>>> https://github.com/apache/spark/pull/50757/changes ?
>>> Because I do. Someone else had a much better idea:
>>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
>>> https://github.com/apache/spark/pull/50230 and it was implemented for
>>> the benefit of Spark.
>>> IMO, that's the normal way of dealing with issues in an open-source
>>> project. Ideas come and go and hopefully the best one wins.
>>>
>> The checksum approach has its expense. That can come later, because
>> it is possible to detect a priori whether the expression is returning a
>> value from a non-deterministic expression.
>> You opened an alternate PR. As I described in the PR discussion, to fix
>> the round-robin issue you were dealing with, you are trying to impose an
>> order on non-deterministic expression evaluation, which itself goes against
>> the basic premise that if data is indeterminate, there cannot be an order
>> in it.
>> What issue did you see in the logic, such that an alternate PR was
>> opened, one which impacted all the stages (including the ancestors?)? I had
>> already discussed internally why the idea you had in mind would not work. I
>> specifically asked why we don't discuss it via the PR already filed.
>>
>>
>>
>>>
>>> Peter
>>>
>>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]>
>>> wrote:
>>>
>>>> Hi Nicholas,
>>>> You wanted some examples, right?
>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>> weeks. I filed a PR. The bug was fixed via a different PR, taking a
>>>> different route.
>>>> Did anyone who created the new PR and route point out any unaddressable
>>>> logical issue?
>>>> The same goes for all the PRs (which, in any case, I have closed).
>>>> Regards
>>>> Asif
>>>>
>>>>
>>>>
>>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>>>> [email protected]> wrote:
>>>>
>>>>> I think repeatedly calling the contributors on this list a “cartel” is
>>>>> not conducive to a calm and amicable resolution.
>>>>>
>>>>> You may have some history built up that led you to use that word, but
>>>>> to the rest of us it comes out of nowhere; you in fact opened this thread
>>>>> with that attack. If you keep making your case in this manner, you will
>>>>> just turn everyone against you.
>>>>>
>>>>> If there is a history of what you feel is others stealing your work,
>>>>> please link to a few examples so we can see what you are seeing. If you
>>>>> can’t do that, then just focus on this current example. And try to refrain
>>>>> from calling people names unless your goal is just to have a fight, as
>>>>> opposed to resolving the problematic behavior so you can continue to
>>>>> contribute.
>>>>>
>>>>> I am not a committer and don’t have any special role in this
>>>>> community. I am speaking just as an observer and regular contributor to
>>>>> the project.
>>>>>
>>>>> > I have experienced this before, as recently as a couple of months back (
>>>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>
>>>>> For others following along, I took a look at this ticket and the
>>>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261> /
>>>>> #53100 <https://github.com/apache/spark/pull/53100>
>>>>>
>>>>> It looks like Asif is upset that he submitted a fix for the same issue
>>>>> a week or so prior to the fix that eventually got merged. But the fixes
>>>>> are different, and the one that got merged is a lot shorter, though they
>>>>> are both simple. The PR that got merged was submitted by someone who
>>>>> appears to be employed by Databricks; perhaps this is part of the “cartel”
>>>>> accusation. The two PRs were reviewed by different committers, however,
>>>>> and the one that got merged was merged in by someone who does _not_ work
>>>>> for Databricks.
>>>>>
>>>>> I don’t see anything here other than the normal dynamic of a large and
>>>>> busy open source project. Committer attention is limited; things fall
>>>>> through the cracks; different contributors may occasionally work on the
>>>>> same thing without knowing about each other. A minor help to this specific
>>>>> problem would be to have some way of automatically linking issues that
>>>>> appear to be about the same thing.
>>>>>
>>>>> Nick
>>>>>
>>>>>
>>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi Peter,
>>>>> Please see inline for comments/replies.
>>>>>
>>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hey Asif,
>>>>>>
>>>>>> Are you referring to
>>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>>> definitely solving the same issue but I can assure you I wouldn't take
>>>>>> any code from your PR without consulting with you first.
>>>>>>
>>>>> Yes indeed, Peter, I am referring to those.
>>>>> As for the fix itself, it is not indicative of anything, as it's a
>>>>> one-liner; the test has an uncanny resemblance.
>>>>>
>>>>>
>>>>>> As far as I remember, I opened SPARK-56694 /
>>>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>>>> minor bug during the implementation of
>>>>>> https://github.com/apache/spark/pull/55298.
>>>>>>
>>>>>
>>>>>
>>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>>
>>>>>
>>>>> The below I am addressing to the whole cartel:
>>>>> I have experienced this before, as recently as a couple of months back (
>>>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>>> I have experienced my personal effort (going into weeks) to debug and
>>>>> reliably reproduce an issue being hijacked by members opening new PRs,
>>>>> without even discussing the proposed fix. (If interested, I can
>>>>> provide details of the PRs / issues I am talking about.)
>>>>> I have seen a perfectly valid PR being nixed by the following comment,
>>>>> which essentially said:
>>>>> "my code, by making the cache lookup more effective, would result in
>>>>> greater chances of a stale cache being picked, which Spark already
>>>>> suffers from."
>>>>> Now, the PR was related to collapsing the projects in the analysis phase,
>>>>> and a side effect was cache pickup being more sensitive.
>>>>> So this is a frivolous reason to nix the PR, because "staleness"
>>>>> is an underlying existing issue which had nothing to do with my PR. And
>>>>> it is more amusing given that if a DB gives even one wrong result in
>>>>> millions, that makes all the results suspect in any case. It does not
>>>>> matter at what frequency this occurs. To me the real reason was code
>>>>> complexity (and, more likely, the loss of control of the code to an
>>>>> outsider).
>>>>> The reason I call this open source community a cartel is that I
>>>>> have seen the way it works pretty closely and have experienced it in the
>>>>> email exchanges which happen on this group.
>>>>> For the same PR and the same issue, if advertently or inadvertently
>>>>> another person (especially a member) gets his changes pushed by virtue of
>>>>> his standing/position and the "for profit" company the person works for,
>>>>> how would you give credit to the original person who discovered the issue
>>>>> first / provided the fix?
>>>>> Why are issues filed by some immediately worked upon by members (some
>>>>> of whom claim to be working full time on Spark)? Is it because certain
>>>>> companies / groups (for-profit companies, mind you) exert undue
>>>>> control, or because the petty newbie has to be in the good books of
>>>>> members (with the hope that at some point they will also reach that
>>>>> position of power)?
>>>>>
>>>>> Given the advent of AI and such occurrences, how will you give due
>>>>> credit to the original creators, and how do you plan to prevent some
>>>>> member from taking up the idea of any old open PR (open for reasons of
>>>>> complexity and non-technical reasons), polishing it up and pushing it as
>>>>> their own?
>>>>>
>>>>> I am also curious: am I the only one who is troubled by all this, or
>>>>> are there others who have experienced it?
>>>>>
>>>>> Regards
>>>>> Asif
>>>>>
>>>>>
>>>>>> If you have further improvements please feel free to open a PR.
>>>>>>
>>>>>> Best,
>>>>>> Peter
>>>>>>
>>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I had filed a bug
>>>>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>>
>>>>>>> I had also opened a PR for the same.
>>>>>>>
>>>>>>> Now I see that the ticket I filed is still open, but the issue has
>>>>>>> been fixed using a new ticket
>>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>>
>>>>>>> and on top of that, the bug test and of course the fix (which in any
>>>>>>> case would be the same) have been taken from my PR:
>>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>>
>>>>>>> To me this is clearly unethical conduct by a cartel member, unless I
>>>>>>> am missing some valid reason.
>>>>>>>
>>>>>>> And the irony is that the fix is still incomplete, as I just found
>>>>>>> and filed a new ticket
>>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>>
>>>>>>> I know that at least some cartel members are insecure and think of
>>>>>>> OSS as their fiefdom, but I never expected this sort of behaviour.
>>>>>>> Regards
>>>>>>> Asif
>>>>>>>
>>>>>>
>>>>>
