RussellSpitzer opened a new pull request, #15529: URL: https://github.com/apache/iceberg/pull/15529
## Summary

Adds an `AGENTS.md` file at the repository root following the [AGENTS.md open standard](https://agents.md/), a convention adopted by 60,000+ repositories for providing AI coding agents with project-specific context. This file is automatically discovered by tools like Cursor, Copilot, Claude Code, Codex, and others.

The conventions were **synthesized from 58,381 review comments across 4,309 merged PRs** spanning the full history of the Apache Iceberg project. Comments were collected via the GitHub GraphQL API from every merged PR with 3+ reviews, filtered to PMC members and committers.

### Creation Process

1. **Data collection**: Fetched all review comments from merged PRs using `gh api graphql` with pagination across the full PR history.
2. **Topic clustering**: Classified comments into ~17 topic buckets (API design, naming, testing, performance, serialization, REST/spec, error handling, configuration, code style, module boundaries, etc.) using keyword-based pattern matching.
3. **Sampling**: Within each bucket, selected the most substantive comments (>80 chars, diverse file paths, de-duplicated).
4. **Rule synthesis**: Extracted concrete, actionable conventions from each topic cluster, focusing on patterns that appeared repeatedly and consistently across different reviewers and time periods.
5. **Depersonalization**: All rules are generic, with no attribution to specific reviewers or behavioral profiles.

The raw review comment dataset is available as a public gist for reproducibility: https://gist.github.com/RussellSpitzer/8dddd1915d0c9fb9e027ab5fd5331c87

### What's Covered

- **Architecture**: Module boundaries, high-sensitivity areas
- **Design patterns**: Refinement, CloseableIterable, null-over-Optional, builder, boolean-threw, Tasks, immutable metadata, etc.
- **Coding conventions**: API design, naming, code style, placement, serialization, error handling, performance, configuration, testing, REST/OpenAPI

### What's NOT included

- No reviewer behavioral profiles or personal attributions
- No PR-specific data or review workflow instructions
- No build/CI setup (already documented in CONTRIBUTING.md)

## Test plan

- [ ] Verify file renders correctly on GitHub
- [ ] Verify conventions are accurate against existing codebase patterns
- [ ] Solicit feedback from PMC members on coverage and accuracy

Made with [Cursor](https://cursor.com)
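
To give a concrete feel for steps 2 and 3 of the creation process (keyword-based topic clustering, then sampling long, de-duplicated comments from diverse file paths), here is a minimal Python sketch. It is not the script used for this PR: the bucket names shown, the keyword patterns, the `classify` and `sample` helpers, the `per_bucket` limit, and the use of comment length as a proxy for "most substantive" are all assumptions made for illustration. Only the 80-character threshold comes from the description above.

```python
import re
from collections import defaultdict

# Hypothetical buckets and keyword patterns; the PR does not list the real ones.
TOPIC_PATTERNS = {
    "naming": re.compile(r"\b(rename|naming|name)\w*", re.IGNORECASE),
    "testing": re.compile(r"\b(test|assert)\w*", re.IGNORECASE),
    "error handling": re.compile(r"\b(exception|throw|precondition)\w*", re.IGNORECASE),
}

MIN_LENGTH = 80  # step 3 keeps comments longer than 80 characters


def classify(body):
    """Return every topic whose keyword pattern matches the comment body."""
    return [topic for topic, pattern in TOPIC_PATTERNS.items() if pattern.search(body)]


def sample(comments, per_bucket=2):
    """Pick up to per_bucket comments per topic: long, de-duplicated, one per file path."""
    buckets = defaultdict(list)
    seen_bodies = set()
    seen_paths = defaultdict(set)
    # Longest first, as an assumed stand-in for "most substantive".
    for comment in sorted(comments, key=lambda c: len(c["body"]), reverse=True):
        body = comment["body"].strip()
        normalized = " ".join(body.lower().split())
        if len(body) <= MIN_LENGTH or normalized in seen_bodies:
            continue  # too short, or a duplicate of a comment already seen
        seen_bodies.add(normalized)
        for topic in classify(body):
            if len(buckets[topic]) >= per_bucket:
                continue  # bucket is full
            if comment["path"] in seen_paths[topic]:
                continue  # keep file paths diverse within a bucket
            buckets[topic].append(comment)
            seen_paths[topic].add(comment["path"])
    return dict(buckets)
```

Each comment is assumed to be a dict with `path` and `body` keys, roughly the shape a review comment takes once extracted from the GraphQL response. A comment can land in more than one bucket, which seems reasonable for review feedback that touches, say, both testing and error handling.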
