Hi,

The current prompt is a good start, but I think some of the assessment
rules need to be made more explicit. For example, it asks the model to
compare 3m/6m/12m windows and describe whether activity is growing, stable,
or declining, but it does not define what those terms mean. Is a 10% change
meaningful? 25%? Should a drop in absolute numbers matter if the number of
contributors is still healthy? Similarly, it asks the model to surface
health concerns, but does not define the judgment rules for when mentor
sign-off rate, reviewer diversity, bus factor, mailing-list activity, or
release cadence become concerns.
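
As a rough illustration of what an explicit definition could look like,
here is a minimal Python sketch. The 25% band, the minimum-baseline floor,
and the function names are all placeholders for discussion, not proposed
values. It compares the monthly rate over the last 3 months with the
monthly rate over the last 12 months, and refuses to label a trend when
the baseline is too small for a percentage to mean much.

```python
# Hypothetical threshold: a relative change within +/- this band is "stable".
STABLE_BAND = 0.25
# Hypothetical floor: below this monthly baseline, percentages are too noisy.
MIN_BASELINE_PER_MONTH = 2.0


def monthly_rate(count: int, months: int) -> float:
    """Normalise a window total to a per-month rate so windows are comparable."""
    return count / months


def classify_trend(count_3m: int, count_12m: int) -> str:
    """Label activity by comparing the 3-month rate with the 12-month rate.

    Note that the 12-month window includes the most recent 3 months.
    """
    recent = monthly_rate(count_3m, 3)
    baseline = monthly_rate(count_12m, 12)
    if baseline < MIN_BASELINE_PER_MONTH:
        return "insufficient data"
    change = (recent - baseline) / baseline
    if change > STABLE_BAND:
        return "growing"
    if change < -STABLE_BAND:
        return "declining"
    return "stable"


# Example: 45 PRs in the last 3 months versus 120 in the last 12 months.
print(classify_trend(45, 120))
```

Whatever the actual numbers end up being, writing them down like this
makes the "is 10% meaningful?" question something reviewers can argue
about directly.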

I wonder if we should separate deterministic assessment from narrative
synthesis. Trends in metrics such as PR count, commit activity, reviewer diversity,
mentor sign-off rate, release cadence, and mailing-list participation can
probably be computed directly by the tools or a small rule layer and
included in the draft as factual signals. The LLM could then be used where
it adds more value: summarising less structured mailing-list discussions,
identifying significant proposals or unresolved issues, spotting
communication concerns, explaining mixed signals, and turning the computed
signals into readable report language.
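
The rule layer could be very small. The sketch below is only an
assumption about its shape: the metric names, thresholds, and the
Signal/evaluate/render_facts names are hypothetical and do not come from
the existing tools. Each rule produces a record that carries the value,
the threshold it was checked against, and the verdict, so the draft can
show all three before the LLM writes any narrative.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Signal:
    metric: str
    value: float
    threshold: float
    concern: bool


# Hypothetical rules: metric name -> (threshold, direction that is a concern).
RULES = {
    "mentor_signoff_rate": (0.5, "below"),
    "distinct_reviewers": (3, "below"),
    "bus_factor": (2, "below"),
    "months_since_release": (12, "above"),
}


def evaluate(metrics: Dict[str, float]) -> List[Signal]:
    """Apply every rule whose metric is present; skip metrics with no data."""
    signals = []
    for name, (threshold, direction) in RULES.items():
        if name not in metrics:
            continue
        value = metrics[name]
        if direction == "below":
            concern = value < threshold
        else:
            concern = value > threshold
        signals.append(Signal(name, value, threshold, concern))
    return signals


def render_facts(signals: List[Signal]) -> str:
    """Format signals as plain lines for the factual part of the draft."""
    lines = []
    for s in signals:
        status = "CONCERN" if s.concern else "ok"
        lines.append(f"{s.metric}: {s.value} (threshold {s.threshold}) [{status}]")
    return "\n".join(lines)


print(render_facts(evaluate({"mentor_signoff_rate": 0.4, "bus_factor": 3})))
```

The rendered lines would go into the draft verbatim, and the same list
could be passed to the LLM as input so that it explains the signals
rather than deciding them.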

That would make the draft easier to review: the factual signals and
thresholds would be explicit and auditable, while the LLM would focus on
synthesis rather than silently inventing the assessment policy.

Vladimir