Sure!

On Thu, May 21, 2026 at 12:25 PM Sourabh Majhee <[email protected]> wrote:

> Hi Yu Yu,
>
> My proposal was not selected this cycle, but I still want to build the
> skills directory — the work is needed regardless of GSoC. I will start with
> dubbo-overview and dubbo-rpc and contribute them directly to the
> apache/dubbo repository on the 3.3 branch.
>
> Would you be open to reviewing the first skill file once it is ready?
>
> Regards, Sourabh Majhee
>
> On Thu, Apr 9, 2026 at 9:49 PM Sourabh Majhee <[email protected]>
> wrote:
>
>> Hi Yu Yu, and all,
>> Resharing my response here for broader visibility on the dev list.
>>
>> Thank you for reading my proposal and raising this — it is a genuinely
>> important challenge and you are right to push back on it.
>>
>> You are correct that the primary use case for AI coding tools is writing
>> and debugging code, not asking architecture questions. So the real test for
>> a Dubbo skill file is not "does the AI answer 'what is dubbo-rpc'
>> correctly" — it is "does loading the skill make the AI generate better,
>> more correct Dubbo code."
>>
>> A concrete example of what this means in practice: when a developer asks
>> Claude Code or Copilot to "write a Dubbo 3 provider with application-level
>> service discovery and Nacos registry", the generated code today typically
>> uses the wrong registration model (interface-level, from Dubbo 2 patterns)
>> and misconfigures the MetadataCenter. If the dubbo-registry skill is
>> loaded, the AI has precise context about application-level discovery, the
>> URL structure, and the three-center architecture — and the generated code
>> should be correct.
>>
>> So my revised validation approach would be:
>>
>>    1. Define 20–30 coding tasks per skill — things like "write a custom
>>    LoadBalance SPI implementation", "configure FailoverCluster with 3
>>    retries", "set up Nacos as both registry and config center". These are
>>    real tasks, not questions.
>>    2. Run each task with Claude Code and GitHub Copilot with and without
>>    the skill loaded.
>>    3. Measure correctness of the generated code by running it against
>>    dubbo-samples test cases — pass/fail, not subjective scoring.
>>
>> This gives an objective, code-execution-based benchmark rather than a Q&A
>> accuracy metric.
>>
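A minimal sketch of how the with/without-skill comparison in steps 1 to 3 could be tallied, assuming each run has already been reduced to a pass/fail outcome. The `Run` record, the tool labels, and both function names are hypothetical. Nothing here invokes Claude Code, Copilot, or dubbo-samples; it only aggregates recorded outcomes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark run (hypothetical record layout)."""
    task: str         # e.g. "custom LoadBalance SPI implementation"
    tool: str         # e.g. "claude-code" or "copilot" (labels are assumptions)
    with_skill: bool  # True if the skill file was loaded for this run
    passed: bool      # True if the generated code passed its test cases


def pass_rates(runs):
    """Return {(tool, with_skill): fraction of runs that passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for run in runs:
        key = (run.tool, run.with_skill)
        totals[key] += 1
        passes[key] += int(run.passed)
    return {key: passes[key] / totals[key] for key in totals}


def skill_uplift(runs):
    """Return {tool: pass rate with the skill minus pass rate without it}."""
    rates = pass_rates(runs)
    tools = {tool for tool, _ in rates}
    # Only report tools that were measured in both conditions.
    return {
        tool: rates[(tool, True)] - rates[(tool, False)]
        for tool in tools
        if (tool, True) in rates and (tool, False) in rates
    }
```

A positive uplift for a tool would mean more tasks passed with the skill loaded than without. Running every task in both conditions keeps the two pass rates directly comparable.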
>> I would welcome your view on whether this approach addresses your
>> concern, or whether you see other failure modes I should account for.
>>
>> Regards, Sourabh Majhee [email protected]
>>
>> On Tue, Apr 7, 2026 at 12:51 PM Sourabh Majhee <
>> [email protected]> wrote:
>>
>>> Hi Yu Yu,
>>>
>>> Thank you for reading my proposal and raising this — it is a genuinely
>>> important challenge and you are right to push back on it.
>>>
>>> You are correct that the primary use case for AI coding tools is writing
>>> and debugging code, not asking architecture questions. So the real test for
>>> a Dubbo skill file is not "does the AI answer 'what is dubbo-rpc'
>>> correctly" — it is "does loading the skill make the AI generate better,
>>> more correct Dubbo code."
>>>
>>> A concrete example of what this means in practice: when a developer asks
>>> Claude Code or Copilot to "write a Dubbo 3 provider with application-level
>>> service discovery and Nacos registry", the generated code today typically
>>> uses the wrong registration model (interface-level, from Dubbo 2 patterns)
>>> and misconfigures the MetadataCenter. If the dubbo-registry skill is
>>> loaded, the AI has precise context about application-level discovery, the
>>> URL structure, and the three-center architecture — and the generated code
>>> should be correct.
>>>
>>> So my revised validation approach would be:
>>>
>>>    1. Define 20–30 coding tasks per skill — things like "write a custom
>>>    LoadBalance SPI implementation", "configure FailoverCluster with 3
>>>    retries", "set up Nacos as both registry and config center". These are
>>>    real tasks, not questions.
>>>    2. Run each task with Claude Code and GitHub Copilot with and
>>>    without the skill loaded.
>>>    3. Measure correctness of the generated code by running it against
>>>    dubbo-samples test cases — pass/fail, not subjective scoring.
>>>
>>> This gives an objective, code-execution-based benchmark rather than a
>>> Q&A accuracy metric.
>>>
>>> I would welcome your view on whether this approach addresses your
>>> concern, or whether you see other failure modes I should account for.
>>>
>>> Regards, Sourabh Majhee [email protected]
>>>
>>> On Tue, Apr 7, 2026 at 9:12 AM Rain Yu <[email protected]> wrote:
>>>
>>>> I have read your proposal, and I would like to know how you ensure that
>>>> the Skills you generate are effective and truly helpful to users. In your
>>>> article, you mentioned using coding tools to ask questions, but in reality
>>>> we often use coding tools for coding rather than for asking questions directly.
>>>>
>>>> On Sun, Apr 5, 2026 at 1:10 PM Sourabh Majhee <[email protected]> wrote:
>>>>
>>>>> Hi Yu Yu,
>>>>>
>>>>> I have introduced myself on [email protected] and submitted a GSoC 2026
>>>>> proposal for the "Convert Dubbo Capabilities into AI Skills" project. I am
>>>>> writing to you directly to make sure you are aware of my proposal, since you
>>>>> are listed as a mentor.
>>>>>
>>>>> I have kept the broader discussion on the mailing list as I understand that
>>>>> is preferred. If there is anything specific you would like me to address
>>>>> before April 30 (a first contribution, a clarification in the proposal, or
>>>>> a design question about the skill taxonomy), please let me know and I will
>>>>> follow up there.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Sourabh Majhee
>>>>> [email protected]
>>>>> github.com/Sourabh-Majhee
>>>>>
>>>>
