Hi,

The current prompt is a good start, but I think some of the assessment rules need to be made more explicit. For example, it asks the model to compare 3m/6m/12m windows and describe whether activity is growing, stable, or declining, but it does not define what those terms mean. Is a 10% change meaningful? 25%? Should a drop in absolute numbers matter if the number of contributors is still healthy? Similarly, it asks the model to surface health concerns, but does not define the judgment rules for when mentor sign-off rate, reviewer diversity, bus factor, mailing-list activity, or release cadence become concerns.
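As a rough sketch of what an explicit definition of "growing", "stable", and "declining" could look like: the 25% threshold, the minimum baseline, and every name below (`TrendRule`, `classify_trend`) are placeholders for illustration, not agreed policy or existing code. Inputs are assumed to be already normalised to a comparable unit, such as PRs per month.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrendRule:
    # Placeholder values for illustration only; the real thresholds
    # would need to be agreed on and documented.
    rel_threshold: float = 0.25  # relative change smaller than this is "stable"
    min_baseline: float = 10.0   # below this, the baseline is too small to judge


def classify_trend(recent: float, baseline: float,
                   rule: TrendRule = TrendRule()) -> str:
    """Label the change between two comparable per-month rates."""
    if baseline < rule.min_baseline:
        # Small absolute numbers make percentages misleading.
        return "insufficient data"
    change = (recent - baseline) / baseline
    if change >= rule.rel_threshold:
        return "growing"
    if change <= -rule.rel_threshold:
        return "declining"
    return "stable"


if __name__ == "__main__":
    # Hypothetical monthly PR rates: last 3 months vs. last 12 months.
    print(classify_trend(recent=70, baseline=100))
    print(classify_trend(recent=90, baseline=100))
```

With a rule like this written down, a 10% drop and a 30% drop get different, predictable labels, and a reviewer can challenge the threshold itself rather than the model's wording.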
I wonder if we should separate deterministic assessment from narrative synthesis. Trends such as PR count, commit activity, reviewer diversity, mentor sign-off rate, release cadence, and mailing-list participation can probably be computed directly by the tools or a small rule layer and included in the draft as factual signals. The LLM could then be used where it adds more value: summarising less structured mailing-list discussions, identifying significant proposals or unresolved issues, spotting communication concerns, explaining mixed signals, and turning the computed signals into readable report language. That would make the draft easier to review: the factual signals and thresholds would be explicit and auditable, while the LLM would focus on synthesis rather than silently inventing the assessment policy.

Vladimir
