dennishuo opened a new pull request, #4519:
URL: https://github.com/apache/polaris/pull/4519

   More detailed proposal doc here: 
https://docs.google.com/document/d/1RE5mGcrMLbmi8sglkHuJKxORVNiuiZ69da1weqwpGjE/edit?tab=t.0
   
   Adds .agents/skills/polaris-extensibility-eval/, an A/B harness for 
measuring the agentic-development impact of repo changes (AGENTS.md edits, 
extension-surface refactors, etc.).
   
   The skill spawns fresh, context-free coding-agent subprocesses (claude-cli / 
codex / cursor) in scrubbed-env worktrees pinned to BEFORE and AFTER refs, runs 
concrete tasks, captures verifier verdicts and per-cell cost, wall-clock time, 
and token usage, and reports A/B deltas across (task × arm × model × cli) cells.
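
   As a rough illustration of the cell grid and delta reporting described above, 
here is a minimal Python sketch. The names `Cell`, `enumerate_cells`, and 
`ab_deltas`, as well as the metric keys, are hypothetical and are not taken from 
the skill itself; the actual harness may organize this quite differently.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Cell:
    """One (task, arm, model, cli) combination. Hypothetical structure."""
    task: str
    arm: str    # "before" or "after", i.e. which pinned ref the worktree uses
    model: str
    cli: str


def enumerate_cells(tasks, models, clis, arms=("before", "after")):
    """Build the full cross product of task x arm x model x cli."""
    return [Cell(t, a, m, c) for t, a, m, c in product(tasks, arms, models, clis)]


def ab_deltas(results):
    """Compute after-minus-before deltas per (task, model, cli).

    `results` maps Cell -> {metric_name: float}. Cells that lack a matching
    "before" counterpart are skipped; only metrics present in both arms
    are compared.
    """
    deltas = {}
    for cell, after_metrics in results.items():
        if cell.arm != "after":
            continue
        before_metrics = results.get(Cell(cell.task, "before", cell.model, cell.cli))
        if before_metrics is None:
            continue
        deltas[(cell.task, cell.model, cell.cli)] = {
            name: after_metrics[name] - before_metrics[name]
            for name in after_metrics
            if name in before_metrics
        }
    return deltas
```

   Keying the deltas by (task, model, cli) keeps the two arms paired, so a 
missing BEFORE or AFTER run drops out of the report instead of skewing it.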
   
   <!--
   📝 Describe what changes you're proposing, especially breaking or user-facing 
changes. 
   📖 See https://github.com/apache/polaris/blob/main/CONTRIBUTING.md for more.
   -->
   
   ## Checklist
   - [ ] 🛡️ Don't disclose security issues! (contact [email protected])
   - [ ] 🔗 Clearly explained why the changes are needed, or linked related 
issues: Fixes #
   - [ ] 🧪 Added/updated tests with good coverage, or manually tested (and 
explained how)
   - [ ] 💡 Added comments for complex logic
   - [ ] 🧾 Updated `CHANGELOG.md` (if needed)
   - [ ] 📚 Updated documentation in `site/content/in-dev/unreleased` (if needed)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]