On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]> wrote:

>
> The fix itself is not indicative of anything, as it's a one-liner, but
>> the test has an uncanny resemblance
>
>
> Asif, what exactly is the "uncanny resemblance" between those test cases
> in https://github.com/apache/spark/pull/49154/changes vs
> https://github.com/apache/spark/pull/55644/changes ? Besides the fact
> that obviously they are comparing canonicalized forms.
> Again, sorry for not noticing your PR, but I don't feel my fix has
> anything to do with yours.
>
OK, I respect your opinion. Everyone is entitled to their own view.

>
> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>> weeks. I filed a PR. The bug was fixed via a different PR that took a
>> different route.
>
>
> Do you see anything in common between
> https://github.com/apache/spark/pull/50029/changes and
> https://github.com/apache/spark/pull/50757/changes ?
> Because I do. Someone else had a much better idea:
> https://github.com/apache/spark/pull/50757#issuecomment-2844972082 /
> https://github.com/apache/spark/pull/50230 and it was implemented for the
> benefit of Spark.
> IMO, that's the normal way of dealing with issues in an open-source
> project. Ideas come and go, and hopefully the best one wins.
>
The checksum approach has its cost. It can come later, because it is
possible to detect a priori whether an expression returns a value from a
nondeterministic expression.
You opened an alternate PR. As I described in the PR discussion, to fix the
round-robin issue you were dealing with, you were trying to impose an order
on nondeterministic expression evaluation, which goes against the basic
premise that if data is indeterminate, there cannot be an order in it.
What issue did you see in the logic that led to an alternate PR being
opened, one that impacted all the stages (including the ancestors)? I had
already discussed internally why the idea you had in mind would not work. I
specifically asked why we don't discuss it via the PR already filed.



>
> Peter
>
> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]> wrote:
>
>> Hi Nicholas,
>> You wanted some examples, right?
>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>> weeks. I filed a PR. The bug was fixed via a different PR that took a
>> different route.
>> Did anyone who created the new PR and route point out any unaddressable
>> logical issue?
>> The same goes for all the PRs (which in any case I have closed).
>> Regards
>> Asif
>>
>>
>>
>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>> [email protected]> wrote:
>>
>>> I think repeatedly calling the contributors on this list a “cartel” is
>>> not conducive to a calm and amicable resolution.
>>>
>>> You may have some history built up that led you to use that word, but to
>>> the rest of us it comes out of nowhere; you in fact opened this thread with
>>> that attack. If you keep making your case in this manner, you will just
>>> turn everyone against you.
>>>
>>> If there is a history of what you feel is others stealing your work,
>>> please link to a few examples so we can see what you are seeing. If you
>>> can’t do that, then just focus on this current example. And try to refrain
>>> from calling people names unless your goal is just to have a fight, as
>>> opposed to resolving the problematic behavior so you can continue to
>>> contribute.
>>>
>>> I am not a committer and don’t have any special role in this community.
>>> I am speaking just as an observer and regular contributor to the project.
>>>
>>> > I have experienced this before, as recent as couple of months back (
>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>>
>>> For others following along, I took a look at this ticket and the
>>> associated PRs: #53261 <https://github.com/apache/spark/pull/53261> /
>>> #53100 <https://github.com/apache/spark/pull/53100>
>>>
>>> It looks like Asif is upset that he submitted a fix for the same issue a
>>> week or so prior to the fix that eventually got merged. But the fixes are
>>> different, and the one that got merged is a lot shorter, though they are
>>> both simple. The PR that got merged was submitted by someone who appears to
>>> be employed by Databricks; perhaps this is part of the “cartel” accusation.
>>> The two PRs were reviewed by different committers, however, and the one
>>> that got merged was merged in by someone who does _not_ work for Databricks.
>>>
>>> I don’t see anything here other than the normal dynamic of a large and
>>> busy open source project. Committer attention is limited; things fall
>>> through the cracks; different contributors may occasionally work on the
>>> same thing without knowing about each other. A minor help to this specific
>>> problem would be to have some way of automatically linking issues that
>>> appear to be about the same thing.
>>>
>>> Nick
>>>
>>>
>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]> wrote:
>>>
>>> Hi Peter,
>>> Please see inline for comments/replies.
>>>
>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]> wrote:
>>>
>>>> Hey Asif,
>>>>
>>>> Are you referring to https://github.com/apache/spark/pull/49154/changes
>>>> vs. https://github.com/apache/spark/pull/55644/changes? Those are
>>>> definitely solving the same issue but I can assure you I wouldn't take any
>>>> code from your PR without consulting with you first.
>>>>
>>> Yes indeed, Peter, I am referring to those.
>>> The fix itself is not indicative of anything, as it's a one-liner,
>>> but the test has an uncanny resemblance.
>>>
>>>
>>>> As far as I remember, I opened SPARK-56694 /
>>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>>> minor bug during the implementation of
>>>> https://github.com/apache/spark/pull/55298.
>>>>
>>>
>>>
>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>
>>>
>>> The following I am addressing to the whole cartel:
>>> I have experienced this before, as recently as a couple of months ago (
>>> https://issues.apache.org/jira/browse/SPARK-54386)
>>> I have experienced my personal effort (running into weeks) to debug and
>>> reliably reproduce an issue being hijacked by members who opened new PRs
>>> without even discussing the fix I proposed. (If anyone is interested, I
>>> can provide details of the PRs / issues I am talking about.)
>>> I have seen a perfectly valid PR being nixed by a comment which
>>> essentially said:
>>> "my code, by making the cache lookup more effective, would result in a
>>> greater chance of a stale cache being picked, which Spark already suffers
>>> from."
>>> Now, the PR was related to collapsing projects in the analysis phase, and
>>> a side effect was the cache pickup being more sensitive.
>>> This is a frivolous reason to nix the PR, because "staleness" is an
>>> existing underlying issue that had nothing to do with my PR. It is all
>>> the more amusing because if a DB gives even one wrong result in millions,
>>> that makes all its results suspect in any case; the frequency does not
>>> matter. To me, the real reason was code complexity (and, more likely,
>>> the loss of control of the code to an outsider).
>>>
>>> The reason I call this open source community a cartel is that I have
>>> seen the way it works pretty closely and have experienced it in the
>>> email exchanges that happen on this group.
>>> For the same PR, same issue, if advertently or inadvertently another
>>> person (especially a member) gets his changes pushed by virtue of his
>>> standing/position and the "for profit" company he works for, how would
>>> you give credit to the original person who discovered the issue first /
>>> provided the fix?
>>> Why are issues filed by some immediately worked on by members (some of
>>> whom claim to be working full time on Spark)? Is it because certain
>>> companies / groups (for-profit companies, mind you) exert undue control,
>>> or because the petty newbie has to be in the good books of members (in
>>> the hope that at some point they will also reach that position of power)?
>>>
>>> Given the advent of AI and such occurrences, how will you give due
>>> credit to the original creators, and how do you plan to prevent some
>>> member from taking up the idea of any old open PR (open for reasons of
>>> complexity or non-technical reasons), polishing it up and pushing it as
>>> their own?
>>>
>>> I am also curious: am I the only one who is troubled by all this, or
>>> are there others who have experienced it?
>>>
>>> Regards
>>> Asif
>>>
>>>
>>>> If you have further improvements please feel free to open a PR.
>>>>
>>>> Best,
>>>> Peter
>>>>
>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I had filed a bug
>>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>>
>>>>> I had also opened a PR for the same.
>>>>>
>>>>> Now I see that the ticket I  filed is still open, but the issue has
>>>>> been fixed using a new ticket
>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>
>>>>> and on top of that, the bug test and of course the fix (which in any
>>>>> case would be the same) have been taken from my PR:
>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>
>>>>> To me this is clearly unethical conduct by a cartel member, unless I
>>>>> am missing some valid reason.
>>>>>
>>>>> And the irony is that the fix is still incomplete, as I just found and
>>>>> filed a new ticket
>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>
>>>>> I know that at least some cartel members are insecure and think of OSS
>>>>> as their fiefdom, but I never expected this sort of behaviour.
>>>>> Regards
>>>>> Asif
>>>>>
>>>>
>>>
