Thank you, Tian. Your points are fair and valid.

On Thu, May 28, 2026 at 9:14 AM Tian Gao via dev <[email protected]>
wrote:

> Hi Asif,
>
> First of all, I understand your frustration. Even though cases like this
> happen to many open source contributors from time to time, it's still super
> discouraging and annoying.
>
> I don't believe Peter, or any other committer, intentionally "stole" your
> idea. It's a very common working process for committers to run into an
> issue, find the cause and fix it. Unfortunately, due to various reasons (in
> my case, JIRA search is very difficult to use), not all committers perform
> a full search for existing tickets.
>
> I took a look at your tickets and PRs, and I think the core issue is that
> many of your PRs were not reviewed, so no one realized the problem had been
> found and a fix proposed. The bug remained in the code base, and later
> someone else found the same issue and fixed it.
>
> It's an unfortunate but common problem in any open source community that
> authors with higher reputations get more attention. The fundamental reason
> is we don't have enough eyes. With the amount of AI slop increasing these
> days, it's even harder to properly review every single PR and determine
> whether it is valid. Committers tend to treat PRs from other committers
> (or frequent contributors) more seriously, resulting in faster reviews.
>
> I'm not saying this situation will magically get much better
> overnight, but I do have a few suggestions that might help you.
>
> First, make sure your PR is ready for review. It's not a criticism of any
> of your PRs, but general guidance for anyone interested in contributing to
> Spark (or any other open source project). Make sure your PRs are well
> explained, properly tested and pass all tests.
>
> Then, tag some committers. Not everyone, but committers who understand
> your code and own those components. This might take some time initially,
> but you can start by examining commit histories. This will put your PR on
> their radar.
>
> Finally, address committers' comments quickly and tag them again if they
> don't respond. It's an open source project; everyone needs to take some
> initiative and build their reputation. No committer is obligated to review
> every single PR.
>
> I do hope things will get better for you eventually. An open source
> project is a community and it needs effort from everyone who cares about
> it :)
>
> Tian
>
> On Thu, May 28, 2026 at 8:33 AM Asif Shahid <[email protected]> wrote:
>
>> Hi Peter,
>> Please see my comments/replies inline.
>>
>> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]> wrote:
>>
>>> Hey Asif,
>>>
>>> Are you referring to https://github.com/apache/spark/pull/49154/changes
>>> vs. https://github.com/apache/spark/pull/55644/changes? Those are
>>> definitely solving the same issue but I can assure you I wouldn't take any
>>> code from your PR without consulting with you first.
>>>
>> Yes indeed, Peter, I am referring to those.
>> The fix itself is not indicative of anything, as it's a one-liner, but the
>> test bears an uncanny resemblance.
>>
>>
>>> As far as I remember, I opened SPARK-56694 /
>>> https://github.com/apache/spark/pull/55644 because I ran into that
>>> minor bug during the implementation of
>>> https://github.com/apache/spark/pull/55298.
>>>
>>
>>
>>> Sorry, I didn't check whether a ticket or PR already existed.
>>>
>>
>> The below I am addressing to the whole cartel:
>> I have experienced this before, as recently as a couple of months back (
>> https://issues.apache.org/jira/browse/SPARK-54386).
>> I have seen my personal effort (going into weeks) to debug and reliably
>> reproduce an issue being hijacked by members who opened new PRs without
>> even discussing the proposed fix. (If interested, I can provide details of
>> the PRs/issues I am talking about.)
>> I have seen a perfectly valid PR being nixed by a comment which
>> essentially said:
>> "my code making the cache lookup more effective would result in a greater
>> chance of a stale cache being picked, which Spark already suffers from."
>> Now, the PR was related to collapsing the projects in the analysis phase,
>> and a side effect was cache pickup being more sensitive.
>> So this is a frivolous reason to nix the PR, because "staleness" is an
>> existing underlying issue which had nothing to do with my PR. And it's
>> more amusing because if a DB gives even one wrong result in millions,
>> that makes all the results suspect in any case. It does not matter at
>> what frequency this occurs. To me, the real reason was code complexity (and
>> more likely the loss of control of the code to an outsider).
>>
>> The reason I call this open source community a cartel is that I have
>> seen the way it works pretty closely and have experienced it in the email
>> exchanges which happen on this group.
>> For the same PR and the same issue, if, advertently or inadvertently,
>> another person (especially a member) gets his changes pushed by virtue of
>> his standing/position and the "for profit" company he works for, how would
>> you give credit to the original person who discovered the issue first /
>> provided the fix?
>> Why are issues filed by some immediately worked on by members (some of
>> whom claim to be working full time on Spark)? Is it because certain
>> companies/groups (for-profit companies, mind you) exert undue control, or
>> because the petty newbie has to be in the good books of members (with the
>> hope that at some point they will also reach that position of power)?
>>
>> Given the advent of AI and such occurrences, how will you give due credit
>> to the original creators, and how do you plan to prevent a member from
>> taking up the idea of any old open PR (one left open for reasons of
>> complexity or for non-technical reasons), polishing it up and pushing it
>> as their own?
>>
>> I am also curious: am I the only one who is troubled by all this, or
>> are there others who have experienced it?
>>
>> Regards
>> Asif
>>
>>
>>> If you have further improvements please feel free to open a PR.
>>>
>>> Best,
>>> Peter
>>>
>>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>> I had filed a bug
>>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>>>
>>>> I had also opened a PR for the same.
>>>>
>>>> Now I see that the ticket I filed is still open, but the issue has
>>>> been fixed using a new ticket
>>>> https://issues.apache.org/jira/browse/SPARK-56694
>>>>
>>>> and on top of that, the bug test and of course the fix (which in any
>>>> case would be the same) have been taken from my PR:
>>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>>>
>>>> To me this is clearly unethical conduct by a cartel member, unless I am
>>>> missing some valid reason.
>>>>
>>>> And the irony is that the fix is still incomplete, as I just found and
>>>> filed a new ticket
>>>> https://issues.apache.org/jira/browse/SPARK-57126
>>>>
>>>> I know that at least some cartel members are insecure and think of OSS
>>>> as their fiefdom, but I never expected this sort of behaviour.
>>>> Regards
>>>> Asif
>>>>
>>>
