Sure!

Sourabh Majhee <[email protected]> wrote on Thu, May 21, 2026 at 12:25:
> Hi Yu Yu,
>
> My proposal was not selected this cycle, but I still want to build the
> skills directory; the work is needed regardless of GSoC. I will start with
> dubbo-overview and dubbo-rpc and contribute them directly to the
> apache/dubbo repository on the 3.3 branch.
>
> Would you be open to reviewing the first skill file once it is ready?
>
> Regards,
> Sourabh Majhee
>
> On Thu, Apr 9, 2026 at 9:49 PM Sourabh Majhee <[email protected]> wrote:
>
>> Hi Yu Yu, and all,
>>
>> Resharing my response here for broader visibility on the dev list.
>>
>> Thank you for reading my proposal and raising this; it is a genuinely
>> important challenge and you are right to push back on it.
>>
>> You are correct that the primary use case for AI coding tools is writing
>> and debugging code, not asking architecture questions. So the real test for
>> a Dubbo skill file is not "does the AI answer 'what is dubbo-rpc'
>> correctly"; it is "does loading the skill make the AI generate better,
>> more correct Dubbo code."
>>
>> A concrete example of what this means in practice: when a developer asks
>> Claude Code or Copilot to "write a Dubbo 3 provider with application-level
>> service discovery and Nacos registry", the generated code today typically
>> uses the wrong registration model (interface-level, from Dubbo 2 patterns)
>> and misconfigures the MetadataCenter. If the dubbo-registry skill is
>> loaded, the AI has precise context about application-level discovery, the
>> URL structure, and the three-center architecture, and the generated code
>> should be correct.
>>
>> So my revised validation approach would be:
>>
>> 1. Define 20–30 coding tasks per skill: things like "write a custom
>>    LoadBalance SPI implementation", "configure FailoverCluster with 3
>>    retries", "set up Nacos as both registry and config center". These are
>>    real tasks, not questions.
>> 2. Run each task with Claude Code and GitHub Copilot with and without
>>    the skill loaded.
>> 3. Measure correctness of the generated code by running it against
>>    dubbo-samples test cases: pass/fail, not subjective scoring.
>>
>> This gives an objective, code-execution-based benchmark rather than a Q&A
>> accuracy metric.
>>
>> I would welcome your view on whether this approach addresses your
>> concern, or whether you see other failure modes I should account for.
>>
>> Regards,
>> Sourabh Majhee
>> [email protected]
>>
>> On Tue, Apr 7, 2026 at 12:51 PM Sourabh Majhee <[email protected]> wrote:
>>
>>> Hi Yu Yu,
>>>
>>> Thank you for reading my proposal and raising this; it is a genuinely
>>> important challenge and you are right to push back on it.
>>>
>>> You are correct that the primary use case for AI coding tools is writing
>>> and debugging code, not asking architecture questions. So the real test for
>>> a Dubbo skill file is not "does the AI answer 'what is dubbo-rpc'
>>> correctly"; it is "does loading the skill make the AI generate better,
>>> more correct Dubbo code."
>>>
>>> A concrete example of what this means in practice: when a developer asks
>>> Claude Code or Copilot to "write a Dubbo 3 provider with application-level
>>> service discovery and Nacos registry", the generated code today typically
>>> uses the wrong registration model (interface-level, from Dubbo 2 patterns)
>>> and misconfigures the MetadataCenter. If the dubbo-registry skill is
>>> loaded, the AI has precise context about application-level discovery, the
>>> URL structure, and the three-center architecture, and the generated code
>>> should be correct.
>>>
>>> So my revised validation approach would be:
>>>
>>> 1. Define 20–30 coding tasks per skill: things like "write a custom
>>>    LoadBalance SPI implementation", "configure FailoverCluster with 3
>>>    retries", "set up Nacos as both registry and config center". These are
>>>    real tasks, not questions.
>>> 2. Run each task with Claude Code and GitHub Copilot with and
>>>    without the skill loaded.
>>> 3. Measure correctness of the generated code by running it against
>>>    dubbo-samples test cases: pass/fail, not subjective scoring.
>>>
>>> This gives an objective, code-execution-based benchmark rather than a
>>> Q&A accuracy metric.
>>>
>>> I would welcome your view on whether this approach addresses your
>>> concern, or whether you see other failure modes I should account for.
>>>
>>> Regards,
>>> Sourabh Majhee
>>> [email protected]
>>>
>>> On Tue, Apr 7, 2026 at 9:12 AM Rain Yu <[email protected]> wrote:
>>>
>>>> I have read your proposal, and I would like to know how you ensure that
>>>> the Skills you generate are effective and truly helpful to users. In your
>>>> article, you mentioned using coding tools to ask questions, but in reality,
>>>> we often use coding tools for coding rather than directly asking questions.
>>>>
>>>> Sourabh Majhee <[email protected]> wrote on Sun, Apr 5, 2026 at 13:10:
>>>>
>>>>> Hi Yu Yu,
>>>>>
>>>>> I have introduced myself on [email protected] and submitted a GSoC 2026
>>>>> proposal for the "Convert Dubbo Capabilities into AI Skills" project.
>>>>> I am writing to you directly to make sure you are aware of my proposal,
>>>>> since you are listed as a mentor.
>>>>>
>>>>> I have kept the broader discussion on the mailing list as I understand
>>>>> that is preferred. If there is anything specific you would like me to
>>>>> address before April 30 (a first contribution, a clarification in the
>>>>> proposal, or a design question about the skill taxonomy), please let me
>>>>> know and I will follow up there.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Sourabh Majhee
>>>>> [email protected]
>>>>> github.com/Sourabh-Majhee
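
For reference, the three-step validation approach described in the thread (define tasks, run each with and without the skill, score pass/fail) could be sketched roughly as below. This is only an illustrative outline in Python, not the proposal's actual harness: the `Task` type, the `pass_rates` function, the skill names attached to each task, and the `stub_runner` are all hypothetical. A real runner would have to invoke the coding tool and execute the dubbo-samples test cases, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    skill: str   # skill under test (names here are assumptions)
    prompt: str  # the coding task handed to the AI tool


# Hypothetical runner signature: (task, skill_loaded) -> True if the
# generated code passed its test cases, False otherwise.
Runner = Callable[[Task, bool], bool]


def pass_rates(tasks: List[Task], run: Runner) -> Dict[str, float]:
    """Run every task with and without the skill and compare pass rates."""
    if not tasks:
        raise ValueError("at least one task is required")
    passed_with = sum(run(t, True) for t in tasks)
    passed_without = sum(run(t, False) for t in tasks)
    n = len(tasks)
    return {
        "with_skill": passed_with / n,
        "without_skill": passed_without / n,
        "delta": (passed_with - passed_without) / n,
    }


# Example tasks taken from the thread; the skill assignment is a guess.
TASKS = [
    Task("dubbo-cluster", "write a custom LoadBalance SPI implementation"),
    Task("dubbo-cluster", "configure FailoverCluster with 3 retries"),
    Task("dubbo-registry", "set up Nacos as both registry and config center"),
    Task("dubbo-registry", "write a Dubbo 3 provider with application-level "
                           "service discovery and Nacos registry"),
]


def stub_runner(task: Task, skill_loaded: bool) -> bool:
    """Placeholder outcome: made-up results, not real measurements."""
    if skill_loaded:
        return True
    # Pretend only the first task passes without the skill.
    return task is TASKS[0]


if __name__ == "__main__":
    print(pass_rates(TASKS, stub_runner))
```

Reporting the delta per skill, rather than a single aggregate number, would keep the comparison tied to the individual skill files being reviewed.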
