The last line should read:

That is how exclusivity and good control are to be maintained.

On Fri, May 29, 2026, 8:29 AM Asif Shahid <[email protected]> wrote:

> To the open source C,
> As is apparent to me, and I believe tacitly admitted by the group in
> general and heard explicitly in person:
> Any relatively complex PR which involves deeper thinking (be it a
> functional or a performance issue) should be the business of a member.
> If it's a performance issue, no way.
> If it's a functional issue which is becoming an embarrassment to ignore,
> somehow ensure that the push happens under a member's PR.
>
> That is how exclusivity and good is to be maintained.
>
>
> On Fri, May 29, 2026, 8:05 AM Asif Shahid <[email protected]> wrote:
>
>> Based on the data I have and discussed, it's my view that the PRs opened
>> by you were reactive, happening only after I had opened the initial ticket
>> and PRs.
>> You are talking about simplifying the issue
>> https://github.com/apache/spark/pull/50757#discussion_r2069390537.
>> I am willing to discuss it here, or over a meeting with other members of
>> your open source group, as to how it simplifies.
>>
>> In fact, I had repeatedly asked why we were discussing, in an internal
>> company channel, a PR which I had filed in public open source. In
>> that discussion (the last one, before I was made redundant by the company), I
>> had given a detailed explanation of why making each plan node emit
>> indeterministic is a bad idea. (I would ask you to make that last Slack
>> discussion public, but I am sure that would be an issue, as your company
>> policy might prohibit it.)
>>
>> I understood much earlier why you and your colleague never wanted
>> technical discussions of my public PRs on the PRs themselves.
>>
>>
>>
>> The same holds for the other alternate PRs, including the issue of "self
>> joins".
>> I am willing to discuss with your group members the problem my PR
>> solves and what your alternative PR does not.
>>
>>
>> I am not sure if this is a generic approach of the "members", to ensure
>> that the final check-in happens under their authorship.
>>
>> On Fri, May 29, 2026, 1:58 AM Peter Toth <[email protected]> wrote:
>>
>>> Hi Viquar,
>>>
>>> To resolve the immediate discrepancy, I ask that we formally link
>>>> SPARK-45866 / PR #49154 within SPARK-56694 as "previously proposed by," and
>>>> add a JIRA comment explicitly crediting Asif as the original co-discoverer
>>>> of both the regression and the baseline fix. This standard attribution
>>>> costs us nothing but preserves the integrity of our commit history.
>>>
>>>
>>> SPARK-56694 is a duplicate of SPARK-45658 (not SPARK-45866), but I agree
>>> it's a fair point to link the tickets and mention Asif's previous work. Let
>>> me add a comment to both the ticket and the PR.
>>>
>>> Conversely, SPARK-56694 bypassed the queue and was merged within eight
>>>> hours
>>>
>>>
>>> I don't know, is there a queue? As for my work process, when I have some
>>> time for upstream reviews, I don't follow any queue. I just pick PRs that I
>>> find interesting or that relate to my experience with Spark. And despite
>>> its size, https://github.com/apache/spark/pull/55644/changes is
>>> technically just a one-liner, fairly trivial fix so review within 8 hours
>>> isn't extraordinary.
>>>
>>> Hi Asif,
>>>
>>> you opened an alternate PR, which...
>>>>
>>> What issue did u see in the logic, that an alternate PR was opened...
>>>
>>>
>>> I think the reason for my simplification approach was discussed both
>>> offline and online in this thread:
>>> https://github.com/apache/spark/pull/50757#discussion_r2069390537
>>>
>>
>>
>>
>>
>>> Best,
>>> Peter
>>>
>>> On Thu, May 28, 2026 at 10:29 PM vaquar khan <[email protected]>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I have thoroughly reviewed the technical artifacts surrounding the
>>>> recent Catalyst optimizer canonicalization discussions to help guide this
>>>> toward a constructive resolution.
>>>>
>>>> We must address a tangible breakdown in our review pipeline.
>>>> SPARK-45866 and its corresponding PR #49154 correctly identified this
>>>> complex Catalyst regression in late 2023, yet the ticket remained
>>>> unaddressed. *Conversely, SPARK-56694 bypassed the queue and was
>>>> merged within eight hours without referencing the prior art*. Peter
>>>> has transparently acknowledged the oversight in searching for existing
>>>> tickets, but we still need to close the loop.
>>>>
>>>> To resolve the immediate discrepancy, *I ask that we formally link
>>>> SPARK-45866 / PR #49154 within SPARK-56694 as "previously proposed by," and
>>>> add a JIRA comment explicitly crediting Asif as the original co-discoverer
>>>> of both the regression and the baseline fix. This standard attribution
>>>> costs us nothing but preserves the integrity of our commit history.*
>>>>
>>>> Stepping back, this incident highlights a critical systemic risk to our
>>>> contributor ecosystem. The stark asymmetry in review velocity, where an
>>>> external contributor's highly complex PR sits stagnant for months/years
>>>> while an identical internal PR is merged in hours, creates visible friction.
>>>> Even if entirely unintentional due to organizational overload, this pattern
>>>> discourages the high-level engineering talent required to sustain the
>>>> project's momentum.
>>>>
>>>> To maintain Spark’s technical leadership, we must actively cultivate a
>>>> culture where contributions are prioritized strictly by their architectural
>>>> merit, regardless of authorship. Furthermore, we must normalize the habit
>>>> of proactively acknowledging independent work when parallel discoveries
>>>> surface. Small, intentional shifts in our governance and review cadence
>>>> will yield massive dividends in community trust and long-term innovation.
>>>>
>>>> Best regards,
>>>> Viquar Khan
>>>> https://www.linkedin.com/in/vaquar-khan-b695577/?skipRedirect=true
>>>>
>>>>
>>>> On Thu, 28 May 2026 at 13:42, Asif Shahid <[email protected]>
>>>> wrote:
>>>>
>>>>> Also, I must admit that I did not know OSS works by opening alternate
>>>>> PRs.
>>>>> In the places where I have worked most of my life, we work on the
>>>>> opened PR with the original author and try to bridge the gap.
>>>>>
>>>>> On Thu, May 28, 2026, 11:25 AM Asif Shahid <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> In fact, I showed it not just to you but to another colleague of yours
>>>>>> too. But there has been absolutely no comment or anything on that from
>>>>>> then until now.
>>>>>>
>>>>>> On Thu, May 28, 2026 at 11:19 AM Asif Shahid <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Also take a look at this JIRA:
>>>>>>> https://issues.apache.org/jira/browse/SPARK-47320
>>>>>>> For this, too, an alternate PR was opened.
>>>>>>> This problem is so deep in the code that I even showed you that, in the
>>>>>>> existing test itself, if the join condition's operands are swapped, the
>>>>>>> test fails. Self joins are completely broken.
>>>>>>> I had proposed a consistent fix, which solves the issue completely
>>>>>>> and logically, but again an alternate PR was filed.
>>>>>>> What issue was there in the PR which I created?
>>>>>>> Regards
>>>>>>> Asif
>>>>>>>
>>>>>>> On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> As for the fix, itself, is not indicative of any thing as its a
>>>>>>>>>> one liner, test has uncanny resemblance
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Asif, what exactly is the "uncanny resemblance" between those test
>>>>>>>>> cases in https://github.com/apache/spark/pull/49154/changes vs
>>>>>>>>> https://github.com/apache/spark/pull/55644/changes ? Besides the
>>>>>>>>> fact that obviously they are comparing canonicalized forms.
>>>>>>>>> Again, sorry for not noticing your PR, but I don't feel my fix has
>>>>>>>>> anything to do with yours.
>>>>>>>>>
>>>>>>>> OK. I respect your opinion. Everyone is entitled to their own view.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>>>>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>>>>>>>> weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>>>>>>>>> different route.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Do you see anything in common between
>>>>>>>>> https://github.com/apache/spark/pull/50029/changes and
>>>>>>>>> https://github.com/apache/spark/pull/50757/changes ?
>>>>>>>>> Because I do see. That someone else had a much better idea:
>>>>>>>>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082
>>>>>>>>> / https://github.com/apache/spark/pull/50230 and it was
>>>>>>>>> implemented for the benefit of Spark.
>>>>>>>>> IMO, that's the normal way of dealing with issues in an
>>>>>>>>> open-source project. Ideas come and go, and hopefully the best one
>>>>>>>>> wins.
>>>>>>>>>
>>>>>>>> The checksum approach has its expense. That can come later,
>>>>>>>> because it is possible to detect a priori whether the expression is
>>>>>>>> returning a value from an indeterministic expression.
>>>>>>>> You opened an alternate PR. As I have described in the PR
>>>>>>>> discussion, to fix the round-robin issue which you were dealing with,
>>>>>>>> you are trying to impose an order on indeterministic expression
>>>>>>>> evaluation, which itself is against the basic premise that if data is
>>>>>>>> indeterminate, there cannot be order in it.
>>>>>>>> What issue did you see in the logic that an alternate PR was
>>>>>>>> opened, one which impacted all the stages (including the ancestors)? I
>>>>>>>> had already discussed internally why the idea you had in mind would not
>>>>>>>> work. I specifically asked: why don't we discuss via the PR filed?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Nicholas,
>>>>>>>>>> You wanted some examples, right?
>>>>>>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>>>>>>> To discover this bug and reproduce it reliably, I spent nearly 2-3
>>>>>>>>>> weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>>>>>>>>> different route.
>>>>>>>>>> Did anyone who created the new PR and route point out any
>>>>>>>>>> unaddressable logical issue?
>>>>>>>>>> The same goes for all the PRs ( which in case I have closed)
>>>>>>>>>> Regards
>>>>>>>>>> Asif
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think repeatedly calling the contributors on this list a
>>>>>>>>>>> “cartel” is not conducive to a calm and amicable resolution.
>>>>>>>>>>>
>>>>>>>>>>> You may have some history built up that led you to use that
>>>>>>>>>>> word, but to the rest of us it comes out of nowhere; you in fact opened
>>>>>>>>>>> this thread with that attack. If you keep making your case in this manner,
>>>>>>>>>>> you will just turn everyone against you.
>>>>>>>>>>>
>>>>>>>>>>> If there is a history of what you feel is others stealing your
>>>>>>>>>>> work, please link to a few examples so we can see what you are seeing. If
>>>>>>>>>>> you can’t do that, then just focus on this current example. And try to
>>>>>>>>>>> refrain from calling people names unless your goal is just to have a fight,
>>>>>>>>>>> as opposed to resolving the problematic behavior so you can continue to
>>>>>>>>>>> contribute.
>>>>>>>>>>>
>>>>>>>>>>> I am not a committer and don’t have any special role in this
>>>>>>>>>>> community. I am speaking just as an observer and regular contributor to the
>>>>>>>>>>> project.
>>>>>>>>>>>
>>>>>>>>>>> > I have experienced this before, as recent as couple of months
>>>>>>>>>>> back ( https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>>>>>>>
>>>>>>>>>>> For others following along, I took a look at this ticket and the
>>>>>>>>>>> associated PRs: #53261
>>>>>>>>>>> <https://github.com/apache/spark/pull/53261> / #53100
>>>>>>>>>>> <https://github.com/apache/spark/pull/53100>
>>>>>>>>>>>
>>>>>>>>>>> It looks like Asif is upset that he submitted a fix for the same
>>>>>>>>>>> issue a week or so prior to the fix that eventually got merged. But the
>>>>>>>>>>> fixes are different, and the one that got merged is a lot shorter, though
>>>>>>>>>>> they are both simple. The PR that got merged was submitted by someone who
>>>>>>>>>>> appears to be employed by Databricks; perhaps this is part of the “cartel”
>>>>>>>>>>> accusation. The two PRs were reviewed by different committers, however, and
>>>>>>>>>>> the one that got merged was merged in by someone who does _not_ work for
>>>>>>>>>>> Databricks.
>>>>>>>>>>>
>>>>>>>>>>> I don’t see anything here other than the normal dynamic of a
>>>>>>>>>>> large and busy open source project. Committer attention is limited; things
>>>>>>>>>>> fall through the cracks; different contributors may occasionally work on
>>>>>>>>>>> the same thing without knowing about each other. A minor help to this
>>>>>>>>>>> specific problem would be to have some way of automatically linking issues
>>>>>>>>>>> that appear to be about the same thing.
>>>>>>>>>>>
>>>>>>>>>>> Nick
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Peter,
>>>>>>>>>>> Pls see inline for comments/ replies
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey Asif,
>>>>>>>>>>>>
>>>>>>>>>>>> Are you referring to
>>>>>>>>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>>>>>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>>>>>>>>> definitely solving the same issue but I can assure you I wouldn't 
>>>>>>>>>>>> take any
>>>>>>>>>>>> code from your PR without consulting with you first.
>>>>>>>>>>>>
>>>>>>>>>>> Yes indeed, Peter, I am referring to those.
>>>>>>>>>>> As for the fix itself, it is not indicative of anything, as it's a
>>>>>>>>>>> one-liner, but the test has an uncanny resemblance.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> As far as I remember, I opened SPARK-56694 /
>>>>>>>>>>>> https://github.com/apache/spark/pull/55644 because I ran into
>>>>>>>>>>>> that minor bug during the implementation of
>>>>>>>>>>>> https://github.com/apache/spark/pull/55298.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The below I am addressing to the whole cartel:
>>>>>>>>>>> I have experienced this before, as recently as a couple of months
>>>>>>>>>>> back ( https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>>>>>>> I have experienced my personal effort (going into weeks) to
>>>>>>>>>>> debug and reproduce an issue reliably being hijacked by members, without
>>>>>>>>>>> even a discussion of the fix proposed (by opening new PRs). (If interested,
>>>>>>>>>>> I can provide details of the PRs / issues I am talking about.)
>>>>>>>>>>> I have seen a perfectly valid PR being nixed by the following
>>>>>>>>>>> comment, which essentially said:
>>>>>>>>>>> "my code for making the cache lookup more effective would
>>>>>>>>>>> result in greater chances of a stale cache being picked, which Spark
>>>>>>>>>>> already suffers from."
>>>>>>>>>>> Now, the PR was related to collapsing the projects in the analysis
>>>>>>>>>>> phase, and a side effect was cache pickup being more sensitive.
>>>>>>>>>>> So this is such a frivolous reason to nix the PR, because
>>>>>>>>>>> "staleness" is an underlying existing issue which had nothing to do with my
>>>>>>>>>>> PR. And it's more amusing that if a DB is giving even one wrong result in
>>>>>>>>>>> millions, that makes all the results suspect in any case. It does not
>>>>>>>>>>> matter at what frequency this occurs. To me the real reason was code
>>>>>>>>>>> complexity (& more likely the loss of control of the code to the
>>>>>>>>>>> outsider).
>>>>>>>>>>>
>>>>>>>>>>> The reason I call this open source community a cartel is
>>>>>>>>>>> because I have seen the way it works pretty closely and have experienced
>>>>>>>>>>> it in the email exchanges which happen on this group.
>>>>>>>>>>> For the same PR, same issue, if advertently or inadvertently
>>>>>>>>>>> another person (especially a member) gets his changes pushed by virtue
>>>>>>>>>>> of his standing/position and the "for profit" company the person works for,
>>>>>>>>>>> how would you give the credit to the original person who discovered the
>>>>>>>>>>> issue first / provided the fix?
>>>>>>>>>>> Why are issues filed by some immediately worked upon by members
>>>>>>>>>>> (some of whom claim to be working full time on Spark)? Is it because
>>>>>>>>>>> certain companies / groups (for-profit companies, mind you) exert undue
>>>>>>>>>>> control, or because the petty newbie has to be in the good books of members
>>>>>>>>>>> (with the hope that at some point they will also reach that position of
>>>>>>>>>>> power)?
>>>>>>>>>>>
>>>>>>>>>>> Given the advent of AI and such occurrences, how will you give due
>>>>>>>>>>> credit to the original creators, and how do you plan to prevent some member
>>>>>>>>>>> from taking up the idea of any old open PR (which stayed open for reasons
>>>>>>>>>>> of complexity and non-technical reasons), polishing it up and pushing it as
>>>>>>>>>>> their own?
>>>>>>>>>>>
>>>>>>>>>>> I am also curious: am I the only one who is troubled by all
>>>>>>>>>>> this, or are there others who have experienced it?
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Asif
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> If you have further improvements please feel free to open a PR.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Peter
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> I had filed a bug
>>>>>>>>>>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had also opened a PR for the same.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now I see that the ticket I filed is still open, but the
>>>>>>>>>>>>> issue has been fixed using a new ticket
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>>>>>>>>
>>>>>>>>>>>>> and on top of that the bug test and of course the fix (which
>>>>>>>>>>>>> in any case would be the same) have been taken from my PR
>>>>>>>>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>>>>>>>>
>>>>>>>>>>>>> To me this is clearly unethical conduct by a cartel member, unless
>>>>>>>>>>>>> I am missing some valid reason.
>>>>>>>>>>>>>
>>>>>>>>>>>>> And the irony is that the fix is still incomplete, as I just
>>>>>>>>>>>>> found and filed a new ticket
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>>>>>>>>
>>>>>>>>>>>>> I know that at least some cartel members are insecure and think
>>>>>>>>>>>>> of OSS as their fiefdom, but this sort of behaviour I never expected.
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Asif
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
