RussellSpitzer opened a new pull request, #15529:
URL: https://github.com/apache/iceberg/pull/15529

   ## Summary
   
   Adds an `AGENTS.md` file at the repository root following the [AGENTS.md 
open standard](https://agents.md/) — a convention adopted by 60,000+ 
repositories for providing AI coding agents with project-specific context. This 
file is automatically discovered by tools like Cursor, Copilot, Claude Code, 
Codex, and others.
   
   The conventions were **synthesized from 58,381 review comments across 4,309 
merged PRs** spanning the full history of the Apache Iceberg project. Comments 
were collected via the GitHub GraphQL API from every merged PR with 3+ reviews, 
filtered to PMC members and committers.
   
   ### Creation Process
   
   1. **Data collection**: Fetched all review comments from merged PRs using 
`gh api graphql` with pagination across the full PR history.
   2. **Topic clustering**: Classified comments into ~17 topic buckets (API 
design, naming, testing, performance, serialization, REST/spec, error handling, 
configuration, code style, module boundaries, etc.) using keyword-based pattern 
matching.
   3. **Sampling**: Within each bucket, selected the most substantive comments 
(>80 chars, diverse file paths, de-duplicated).
   4. **Rule synthesis**: Extracted concrete, actionable conventions from each 
topic cluster, focusing on patterns that appeared repeatedly and consistently 
across different reviewers and time periods.
   5. **Depersonalization**: All rules are generic — no attribution to specific 
reviewers or behavioral profiles.
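
As a rough illustration of steps 2 and 3, keyword bucketing and sampling might look like the sketch below. This is not the script used for this PR: the topic patterns, the `classify` and `sample` helpers, the `per_bucket` limit, and the comment dict shape (`body`, `path`) are all assumptions made for the example. Only the >80 character threshold, the de-duplication, and the file-path diversity come from the description above.

```python
import re
from collections import defaultdict

# Hypothetical keyword patterns; the real buckets (~17) and keywords are not listed in the PR.
TOPIC_PATTERNS = {
    "naming": re.compile(r"\b(rename|naming|name)\b", re.I),
    "testing": re.compile(r"\b(test|assert|junit)\w*", re.I),
    "error handling": re.compile(r"\b(exception|throw|error)\w*", re.I),
}
MIN_CHARS = 80  # "substantive" threshold from step 3 (>80 chars)


def classify(body):
    """Return every topic whose keyword pattern matches the comment body."""
    return [topic for topic, pattern in TOPIC_PATTERNS.items() if pattern.search(body)]


def sample(comments, per_bucket=2):
    """Bucket comments by topic, then keep long, de-duplicated comments,
    taking at most one comment per file path within each bucket."""
    buckets = defaultdict(list)
    for comment in comments:
        for topic in classify(comment["body"]):
            buckets[topic].append(comment)

    picked = {}
    for topic, items in buckets.items():
        seen_bodies, seen_paths, keep = set(), set(), []
        # Longest comments first, as a crude proxy for "most substantive".
        for comment in sorted(items, key=lambda c: len(c["body"]), reverse=True):
            normalized = " ".join(comment["body"].lower().split())
            if len(comment["body"]) <= MIN_CHARS:
                continue  # too short to be substantive
            if normalized in seen_bodies or comment["path"] in seen_paths:
                continue  # duplicate text, or file path already represented
            seen_bodies.add(normalized)
            seen_paths.add(comment["path"])
            keep.append(comment)
            if len(keep) == per_bucket:
                break
        picked[topic] = keep
    return picked
```

A comment can land in more than one bucket here, which seems reasonable for review feedback that touches, say, both testing and error handling.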
   
   The raw review comment dataset is available as a public gist for 
reproducibility: 
https://gist.github.com/RussellSpitzer/8dddd1915d0c9fb9e027ab5fd5331c87
   
   ### What's Covered
   
   - **Architecture**: Module boundaries, high-sensitivity areas
   - **Design patterns**: Refinement, CloseableIterable, null-over-Optional, 
builder, boolean-threw, Tasks, immutable metadata, etc.
   - **Coding conventions**: API design, naming, code style, placement, 
serialization, error handling, performance, configuration, testing, REST/OpenAPI
   
   ### What's NOT Included
   
   - No reviewer behavioral profiles or personal attributions
   - No PR-specific data or review workflow instructions
   - No build/CI setup (already documented in CONTRIBUTING.md)
   
   ## Test plan
   
   - [ ] Verify file renders correctly on GitHub
   - [ ] Verify conventions are accurate against existing codebase patterns
   - [ ] Solicit feedback from PMC members on coverage and accuracy
   
   
   Made with [Cursor](https://cursor.com)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

