dennishuo opened a new pull request, #4519: URL: https://github.com/apache/polaris/pull/4519
More detailed proposal doc here: https://docs.google.com/document/d/1RE5mGcrMLbmi8sglkHuJKxORVNiuiZ69da1weqwpGjE/edit?tab=t.0

Adds `.agents/skills/polaris-extensibility-eval/`, an A/B harness for measuring the agentic-development impact of repo changes (AGENTS.md edits, extension-surface refactors, etc.).

The skill spawns fresh, context-free coding-agent subprocesses (claude-cli / codex / cursor) in scrubbed-env worktrees pinned to BEFORE and AFTER refs, runs concrete tasks, captures verifier verdicts and per-cell cost/wall/token usage, and reports A/B deltas across (task × arm × model × cli) cells.

## Checklist
- [ ] Don't disclose security issues! (contact [email protected])
- [ ] Clearly explained why the changes are needed, or linked related issues: Fixes #
- [ ] Added/updated tests with good coverage, or manually tested (and explained how)
- [ ] Added comments for complex logic
- [ ] Updated `CHANGELOG.md` (if needed)
- [ ] Updated documentation in `site/content/in-dev/unreleased` (if needed)
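
To make the reporting shape in the summary concrete, here is a minimal Python sketch of enumerating (task × arm × model × cli) cells and computing AFTER-minus-BEFORE deltas. It is not taken from the PR: `CellResult`, `enumerate_cells`, `ab_deltas`, the field names, the task and model names, and all numbers are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean


# Hypothetical record for one (task, arm, model, cli) cell.
# Field names are illustrative assumptions, not the skill's actual schema.
@dataclass(frozen=True)
class CellResult:
    task: str
    arm: str        # "BEFORE" or "AFTER"
    model: str
    cli: str
    passed: bool    # verifier verdict
    cost_usd: float
    wall_s: float
    tokens: int


def enumerate_cells(tasks, arms, models, clis):
    """Return every (task, arm, model, cli) combination to run."""
    return list(product(tasks, arms, models, clis))


def ab_deltas(results):
    """Return AFTER minus BEFORE for pass rate and mean cost, wall time, tokens.

    Assumes both arms have at least one result; mean() raises otherwise.
    """
    def summarize(arm):
        rows = [r for r in results if r.arm == arm]
        return {
            "pass_rate": mean(1.0 if r.passed else 0.0 for r in rows),
            "cost_usd": mean(r.cost_usd for r in rows),
            "wall_s": mean(r.wall_s for r in rows),
            "tokens": mean(r.tokens for r in rows),
        }

    before, after = summarize("BEFORE"), summarize("AFTER")
    return {key: after[key] - before[key] for key in before}


# Made-up example: one task, two arms, one model, two CLIs.
cells = enumerate_cells(
    ["add-endpoint"], ["BEFORE", "AFTER"], ["model-a"], ["claude-cli", "codex"]
)
results = [
    CellResult("add-endpoint", "BEFORE", "model-a", "claude-cli", False, 0.50, 120.0, 10000),
    CellResult("add-endpoint", "BEFORE", "model-a", "codex", True, 0.50, 100.0, 8000),
    CellResult("add-endpoint", "AFTER", "model-a", "claude-cli", True, 0.25, 90.0, 6000),
    CellResult("add-endpoint", "AFTER", "model-a", "codex", True, 0.25, 70.0, 4000),
]
deltas = ab_deltas(results)
```

With this made-up data, a positive `pass_rate` delta and negative cost, wall-time, and token deltas would indicate that the AFTER ref made the task easier for the agents. The actual harness may aggregate per cell rather than per arm.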
