This control, exclusivity, the requirement to build credibility by
starting small (like fixing formatting, stats, logs), to leave complex
issues to other "bright" minds, the informal hierarchy: maybe this is
how open source works.
Whatever it is, it does not sound like "community" to me.
This is a club ("cartel" is offensive to some or most), with the usual struggle
for power, control and politics.

On Fri, May 29, 2026, 8:31 AM Asif Shahid <[email protected]> wrote:

> The last line is to be read as
>
>
> That is how exclusivity and good control are to be maintained.
>
> On Fri, May 29, 2026, 8:29 AM Asif Shahid <[email protected]> wrote:
>
>> To the open source C,
>> As is apparent to me, and I believe tacitly admitted by the group in
>> general and heard explicitly in person:
>> Any relatively complex PR which involves deeper thinking (be it a
>> functional or performance issue) should be the business of a member.
>> If it's a performance issue, no way.
>> If it's a functional issue which is becoming an embarrassment to ignore,
>> somehow ensure that the push happens under a member's PR.
>>
>> That is how exclusivity and good is to be maintained.
>>
>>
>> On Fri, May 29, 2026, 8:05 AM Asif Shahid <[email protected]> wrote:
>>
>>> Based on the data I have and discussed, it's my view that the PRs opened
>>> by you were reactive, happening only after I had opened the initial ticket
>>> and PRs.
>>> You are talking about simplifying the issue
>>> https://github.com/apache/spark/pull/50757#discussion_r2069390537.
>>> I am willing to discuss here, or over a meeting with other members of
>>> your open source group, how it simplifies anything.
>>>
>>> In fact, I had repeatedly asked why we were discussing, in an internal
>>> company channel, a PR which I had filed in public open source. In
>>> that discussion (the last one, before I was made redundant by the company), I
>>> had given a detailed explanation of why making each plan node emit
>>> indeterministic is a bad idea. (I would ask you to make that last Slack
>>> discussion public, but I am sure that would be an issue, as your company
>>> policy might prohibit it.)
>>>
>>> I understood much earlier why you and your colleague never wanted
>>> technical discussions of my public PRs on the PRs themselves.
>>>
>>>
>>>
>>> The same holds for the other alternate PRs, including the issue of "self
>>> joins".
>>> I am willing to discuss with your group members the problem it
>>> solves and what your alternative PR does not.
>>>
>>>
>>> I am not sure if this is the general approach of the "members": to ensure
>>> that the final check-in happens under their authorship.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, May 29, 2026, 1:58 AM Peter Toth <[email protected]> wrote:
>>>
>>>> Hi Viquar,
>>>>
>>>>> To resolve the immediate discrepancy, I ask that we formally link
>>>>> SPARK-45866 / PR #49154 within SPARK-56694 as "previously proposed by," and
>>>>> add a JIRA comment explicitly crediting Asif as the original co-discoverer
>>>>> of both the regression and the baseline fix. This standard attribution
>>>>> costs us nothing but preserves the integrity of our commit history.
>>>>
>>>>
>>>> SPARK-56694 is a duplicate of SPARK-45658 (not SPARK-45866), but I
>>>> agree it's a fair point to link the tickets and mention Asif's previous
>>>> work. Let me add a comment to both the ticket and the PR.
>>>>
>>>>> Conversely, SPARK-56694 bypassed the queue and was merged within eight
>>>>> hours
>>>>
>>>>
>>>> I don't know, is there a queue? As for my work process, when I have
>>>> some time for upstream reviews, I don't follow any queue. I just pick PRs
>>>> that I find interesting or that relate to my experience with Spark. And
>>>> despite its size, https://github.com/apache/spark/pull/55644/changes
>>>> is technically just a one-liner, fairly trivial fix so review within 8
>>>> hours isn't extraordinary.
>>>>
>>>> Hi Asif,
>>>>
>>>>> You opened an alternate PR, which...
>>>>>
>>>>> What issue did you see in the logic, that an alternate PR was opened...
>>>>
>>>>
>>>> I think the reason for my simplification approach was discussed both
>>>> offline and online in this thread:
>>>> https://github.com/apache/spark/pull/50757#discussion_r2069390537
>>>>
>>>
>>>
>>>
>>>
>>>> Best,
>>>> Peter
>>>>
>>>> On Thu, May 28, 2026 at 10:29 PM vaquar khan <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have thoroughly reviewed the technical artifacts surrounding the
>>>>> recent Catalyst optimizer canonicalization discussions to help guide this
>>>>> toward a constructive resolution.
>>>>>
>>>>> We must address a tangible breakdown in our review pipeline.
>>>>> SPARK-45866 and its corresponding PR #49154 correctly identified this
>>>>> complex Catalyst regression in late 2023, yet the ticket remained
>>>>> unaddressed. *Conversely, SPARK-56694 bypassed the queue and was
>>>>> merged within eight hours without referencing the prior art*. Peter
>>>>> has transparently acknowledged the oversight in searching for existing
>>>>> tickets, but we still need to close the loop.
>>>>>
>>>>> To resolve the immediate discrepancy, *I ask that we formally link
>>>>> SPARK-45866 / PR #49154 within SPARK-56694 as "previously proposed by," and
>>>>> add a JIRA comment explicitly crediting Asif as the original co-discoverer
>>>>> of both the regression and the baseline fix. This standard attribution
>>>>> costs us nothing but preserves the integrity of our commit history.*
>>>>>
>>>>> Stepping back, this incident highlights a critical systemic risk to
>>>>> our contributor ecosystem. The stark asymmetry in review velocity, where an
>>>>> external contributor's highly complex PR sits stagnant for months or years
>>>>> while an identical internal PR is merged in hours, creates visible friction.
>>>>> Even if entirely unintentional due to organizational overload, this pattern
>>>>> discourages the high-level engineering talent required to sustain the
>>>>> project's momentum.
>>>>>
>>>>> To maintain Spark’s technical leadership, we must actively cultivate a
>>>>> culture where contributions are prioritized strictly by their architectural
>>>>> merit, regardless of authorship. Furthermore, we must normalize the habit
>>>>> of proactively acknowledging independent work when parallel discoveries
>>>>> surface. Small, intentional shifts in our governance and review cadence
>>>>> will yield massive dividends in community trust and long-term innovation.
>>>>>
>>>>> Best regards,
>>>>> Viquar Khan
>>>>> https://www.linkedin.com/in/vaquar-khan-b695577/?skipRedirect=true
>>>>>
>>>>>
>>>>> On Thu, 28 May 2026 at 13:42, Asif Shahid <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Also, I must admit that I did not know OSS works by opening alternate
>>>>>> PRs.
>>>>>> In the places where I have worked most of my life, we work on the
>>>>>> opened PR with the original author and try to bridge the gap.
>>>>>>
>>>>>> On Thu, May 28, 2026, 11:25 AM Asif Shahid <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> In fact, I showed it not just to you but to another colleague of yours
>>>>>>> too. But there has been absolutely no comment or anything on that from
>>>>>>> then until now.
>>>>>>>
>>>>>>> On Thu, May 28, 2026 at 11:19 AM Asif Shahid <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Also take a look at this JIRA:
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-47320
>>>>>>>> For this, too, an alternate PR was opened.
>>>>>>>> This problem is so deep in the code that I even showed you that, in the
>>>>>>>> existing test itself, if the join condition's operands are swapped, the test
>>>>>>>> fails. Self joins are completely broken.
>>>>>>>> I had proposed a consistent fix, which solves the issue completely
>>>>>>>> and logically, but again an alternate PR was filed.
>>>>>>>> What issue was there in the PR which I created?
>>>>>>>> Regards
>>>>>>>> Asif
>>>>>>>>
>>>>>>>> On Thu, May 28, 2026 at 11:14 AM Asif Shahid <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, May 28, 2026 at 10:56 AM Peter Toth <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> As for the fix itself, it is not indicative of anything, as it's a
>>>>>>>>>>> one-liner; the test has an uncanny resemblance
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Asif, what exactly is the "uncanny resemblance" between those
>>>>>>>>>> test cases in https://github.com/apache/spark/pull/49154/changes
>>>>>>>>>> vs https://github.com/apache/spark/pull/55644/changes ? Besides
>>>>>>>>>> the fact that obviously they are comparing canonicalized forms.
>>>>>>>>>> Again, sorry for not noticing your PR, but I don't feel my fix
>>>>>>>>>> has anything to do with yours.
>>>>>>>>>>
>>>>>>>>> OK. I respect your opinion. Everyone is entitled to their own view.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>>>>>>>> To discover this bug and reproduce it reliably, I spent nearly
>>>>>>>>>>> 2-3 weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>>>>>>>>>> different route.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Do you see anything in common between
>>>>>>>>>> https://github.com/apache/spark/pull/50029/changes and
>>>>>>>>>> https://github.com/apache/spark/pull/50757/changes ?
>>>>>>>>>> Because I do see. That someone else had a much better idea:
>>>>>>>>>> https://github.com/apache/spark/pull/50757#issuecomment-2844972082
>>>>>>>>>> / https://github.com/apache/spark/pull/50230 and it was
>>>>>>>>>> implemented for the benefit of Spark.
>>>>>>>>>> IMO, that's the normal way of dealing with issues in an
>>>>>>>>>> open-source project. Ideas come and go and hopefully the best one wins.
>>>>>>>>>>
>>>>>>>>> The checksum approach has its expense. That can come later,
>>>>>>>>> because a priori it's possible to detect whether the expression is
>>>>>>>>> returning a value from an indeterministic expression.
>>>>>>>>> You opened an alternate PR. As I described in the PR
>>>>>>>>> discussion, to fix the round-robin issue which you were dealing with,
>>>>>>>>> you are trying to impose an order on indeterministic expression
>>>>>>>>> evaluation, which itself goes against the basic premise that if data is
>>>>>>>>> indeterminate, there cannot be order in it.
>>>>>>>>> What issue did you see in the logic, that an alternate PR was
>>>>>>>>> opened, one which impacted all the stages (including the ancestors)? I had
>>>>>>>>> already discussed internally why the idea you had in mind would not work. I
>>>>>>>>> specifically asked: why don't we discuss via the PR filed...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>> On Thu, May 28, 2026 at 6:38 PM Asif Shahid <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Nicholas,
>>>>>>>>>>> You wanted some examples, right?
>>>>>>>>>>> 1) Look at bug https://issues.apache.org/jira/browse/SPARK-51016
>>>>>>>>>>> To discover this bug and reproduce it reliably, I spent nearly
>>>>>>>>>>> 2-3 weeks. I filed a PR. The bug was fixed via a different PR, which took a
>>>>>>>>>>> different route.
>>>>>>>>>>> Did anyone who created the new PR and route point out any
>>>>>>>>>>> unaddressable logical issue?
>>>>>>>>>>> The same goes for all the PRs (which, in any case, I have closed).
>>>>>>>>>>> Regards
>>>>>>>>>>> Asif
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 28, 2026 at 9:06 AM Nicholas Chammas <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think repeatedly calling the contributors on this list a
>>>>>>>>>>>> “cartel” is not conducive to a calm and amicable resolution.
>>>>>>>>>>>>
>>>>>>>>>>>> You may have some history built up that led you to use that
>>>>>>>>>>>> word, but to the rest of us it comes out of nowhere; you in fact opened
>>>>>>>>>>>> this thread with that attack. If you keep making your case in this manner,
>>>>>>>>>>>> you will just turn everyone against you.
>>>>>>>>>>>>
>>>>>>>>>>>> If there is a history of what you feel is others stealing your
>>>>>>>>>>>> work, please link to a few examples so we can see what you are seeing. If
>>>>>>>>>>>> you can’t do that, then just focus on this current example. And try to
>>>>>>>>>>>> refrain from calling people names unless your goal is just to have a fight,
>>>>>>>>>>>> as opposed to resolving the problematic behavior so you can continue to
>>>>>>>>>>>> contribute.
>>>>>>>>>>>>
>>>>>>>>>>>> I am not a committer and don’t have any special role in this
>>>>>>>>>>>> community. I am speaking just as an observer and regular contributor to the
>>>>>>>>>>>> project.
>>>>>>>>>>>>
>>>>>>>>>>>> > I have experienced this before, as recent as couple of months
>>>>>>>>>>>> back ( https://issues.apache.org/jira/browse/SPARK-54386)
>>>>>>>>>>>>
>>>>>>>>>>>> For others following along, I took a look at this ticket and
>>>>>>>>>>>> the associated PRs: #53261
>>>>>>>>>>>> <https://github.com/apache/spark/pull/53261> / #53100
>>>>>>>>>>>> <https://github.com/apache/spark/pull/53100>
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like Asif is upset that he submitted a fix for the
>>>>>>>>>>>> same issue a week or so prior to the fix that eventually got merged. But
>>>>>>>>>>>> the fixes are different, and the one that got merged is a lot shorter,
>>>>>>>>>>>> though they are both simple. The PR that got merged was submitted by
>>>>>>>>>>>> someone who appears to be employed by Databricks; perhaps this is part of
>>>>>>>>>>>> the “cartel” accusation. The two PRs were reviewed by different committers,
>>>>>>>>>>>> however, and the one that got merged was merged in by someone who does
>>>>>>>>>>>> _not_ work for Databricks.
>>>>>>>>>>>>
>>>>>>>>>>>> I don’t see anything here other than the normal dynamic of a
>>>>>>>>>>>> large and busy open source project. Committer attention is limited; things
>>>>>>>>>>>> fall through the cracks; different contributors may occasionally work on
>>>>>>>>>>>> the same thing without knowing about each other. A minor help to this
>>>>>>>>>>>> specific problem would be to have some way of automatically linking issues
>>>>>>>>>>>> that appear to be about the same thing.
>>>>>>>>>>>>
>>>>>>>>>>>> Nick
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On May 28, 2026, at 11:33 AM, Asif Shahid <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Peter,
>>>>>>>>>>>> Please see inline for comments / replies.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hey Asif,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Are you referring to
>>>>>>>>>>>>> https://github.com/apache/spark/pull/49154/changes vs.
>>>>>>>>>>>>> https://github.com/apache/spark/pull/55644/changes? Those are
>>>>>>>>>>>>> definitely solving the same issue but I can assure you I wouldn't take any
>>>>>>>>>>>>> code from your PR without consulting with you first.
>>>>>>>>>>>>>
>>>>>>>>>>>> Yes indeed, Peter, I am referring to those.
>>>>>>>>>>>> As for the fix itself, it is not indicative of anything, as it's a
>>>>>>>>>>>> one-liner; the test has an uncanny resemblance.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> As far as I remember, I opened SPARK-56694 /
>>>>>>>>>>>>> https://github.com/apache/spark/pull/55644 because I ran into
>>>>>>>>>>>>> that minor bug during the implementation of
>>>>>>>>>>>>> https://github.com/apache/spark/pull/55298.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The below I am addressing to the whole cartel:
>>>>>>>>>>>> I have experienced this before, as recently as a couple of months
>>>>>>>>>>>> back (https://issues.apache.org/jira/browse/SPARK-54386).
>>>>>>>>>>>> I have experienced my personal effort (going into weeks) to
>>>>>>>>>>>> debug and reliably reproduce an issue being hijacked by members who
>>>>>>>>>>>> opened new PRs without even discussing the fix proposed. (If interested, I can
>>>>>>>>>>>> provide details of the PRs / issues I am talking about.)
>>>>>>>>>>>> I have seen a perfectly valid PR being nixed by the following
>>>>>>>>>>>> comment, which essentially said:
>>>>>>>>>>>> "my code making the cache lookup more effective would
>>>>>>>>>>>> result in greater chances of a stale cache being picked, which Spark
>>>>>>>>>>>> already suffers from."
>>>>>>>>>>>> Now, the PR was related to collapsing the projects in the analysis
>>>>>>>>>>>> phase, and a side effect was cache pickup being more sensitive.
>>>>>>>>>>>> So this is such a frivolous reason to nix the PR, because
>>>>>>>>>>>> "staleness" is an underlying existing issue which had nothing to do with my
>>>>>>>>>>>> PR. And it's more amusing that if a DB is giving even one wrong result in
>>>>>>>>>>>> millions, that makes all the results suspect in any case. It does not
>>>>>>>>>>>> matter at what frequency this occurs. To me the real reason was code
>>>>>>>>>>>> complexity (and, more likely, the loss of control of the code to the
>>>>>>>>>>>> outsider).
>>>>>>>>>>>>
>>>>>>>>>>>> The reason I call this open source community a cartel is
>>>>>>>>>>>> that I have seen the way it works pretty closely and have experienced
>>>>>>>>>>>> it in the email exchanges which happen on this group.
>>>>>>>>>>>> For the same PR, same issue, if advertently or inadvertently
>>>>>>>>>>>> another person (especially a member) gets his changes pushed, by
>>>>>>>>>>>> virtue of his standing/position and the "for profit" company the person
>>>>>>>>>>>> works for, how would you give credit to the original person who discovered
>>>>>>>>>>>> the issue first / provided the fix?
>>>>>>>>>>>> Why are issues filed by some immediately worked upon by members
>>>>>>>>>>>> (some of whom claim to be working full time on Spark)? Is it because
>>>>>>>>>>>> certain companies / groups (for-profit companies, mind you) exert undue
>>>>>>>>>>>> control, or because the petty newbie has to be in the good books of members (with
>>>>>>>>>>>> the hope that at some point they will also reach that position of power)?
>>>>>>>>>>>>
>>>>>>>>>>>> Given the advent of AI and such occurrences, how will you give
>>>>>>>>>>>> due credit to the original creators, and how do you plan to prevent some
>>>>>>>>>>>> member from taking up the idea of any old open PR (open for reasons of
>>>>>>>>>>>> complexity and non-technical reasons), polishing it up and pushing it as
>>>>>>>>>>>> their own?
>>>>>>>>>>>>
>>>>>>>>>>>> I am also curious: am I the only one who is troubled by all
>>>>>>>>>>>> this, or are there others who have experienced it?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Asif
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> If you have further improvements please feel free to open a PR.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Peter
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> I had filed a bug
>>>>>>>>>>>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had also opened a PR for the same.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now I see that the ticket I filed is still open, but the
>>>>>>>>>>>>>> issue has been fixed using a new ticket
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and on top of that the bug test and, of course, the fix (which
>>>>>>>>>>>>>> in any case would be the same) have been taken from my PR for
>>>>>>>>>>>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To me this is clearly unethical conduct by a cartel member,
>>>>>>>>>>>>>> unless I am missing some valid reason.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And the irony is that the fix is still incomplete, as I just
>>>>>>>>>>>>>> found and filed a new ticket
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I know that at least some cartel members are insecure and
>>>>>>>>>>>>>> think of OSS as their fiefdom, but this sort of behaviour I never
>>>>>>>>>>>>>> expected.
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Asif
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
