[
https://issues.apache.org/jira/browse/KAFKA-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730436#comment-17730436
]
Steven Booke commented on KAFKA-14995:
--
[~vvcephei] Hello John, this will be my first time contributing and I would
like to assign myself to this ticket but I am unable to do so. Could you assign
it to me please?
> Automate asf.yaml collaborators refresh
> ---
>
> Key: KAFKA-14995
> URL: https://issues.apache.org/jira/browse/KAFKA-14995
> Project: Kafka
> Issue Type: Improvement
> Reporter: John Roesler
> Priority: Minor
> Labels: newbie
>
> We have added a policy for using the .asf.yaml GitHub Collaborators list:
> [https://github.com/apache/kafka-site/pull/510]
> The policy states that we set this list to be the top 20 commit authors who
> are not Kafka committers. Unfortunately, it's not trivial to compute this
> list.
> Here is the process I followed to generate the list the first time (note that
> I generated this list on 2023-04-28, so the lookback is one year):
> 1. List authors by commit volume in the last year:
> {code:java}
> $ git shortlog --email --numbered --summary --since=2022-04-28 | vim -
> {code}
> 2. Manually filter out the authors who are committers, based on
> [https://kafka.apache.org/committers]
> 3. Truncate the list to 20 authors.
> 4. For each author:
> 4a. Find a commit in the `git log` output that they authored:
> {code:java}
> commit 440bed2391338dc10fe4d36ab17dc104b61b85e8
> Author: hudeqi <1217150...@qq.com>
> Date: Fri May 12 14:03:17 2023 +0800
> ...{code}
> 4b. Look up that commit in Github:
> [https://github.com/apache/kafka/commit/440bed2391338dc10fe4d36ab17dc104b61b85e8]
> 4c. Copy their Github username into .asf.yaml under both the PR whitelist and
> the Collaborators lists.
> 5. Send a PR to update .asf.yaml: [https://github.com/apache/kafka/pull/13713]
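
Steps 1 to 3 above could be scripted roughly as follows. This is only a minimal sketch, not the script the ticket asks for: the names `Author`, `parse_shortlog`, `top_non_committers`, and `shortlog_since` are hypothetical, and it assumes the committers' Git author emails are already available as a list, which is exactly the first complication described below.

```python
import re
import subprocess
from typing import Iterable, List, NamedTuple


class Author(NamedTuple):
    commits: int
    name: str
    email: str


# A line of `git shortlog --email --numbered --summary` output looks like
# "    42\tJane Doe <jane@example.com>".
_LINE = re.compile(r"^\s*(\d+)\s+(.*?)\s*<([^>]*)>\s*$")


def parse_shortlog(text: str) -> List[Author]:
    """Parse shortlog summary output into Author records (emails lowercased)."""
    authors = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            count, name, email = match.groups()
            authors.append(Author(int(count), name, email.lower()))
    return authors


def top_non_committers(
    authors: Iterable[Author],
    committer_emails: Iterable[str],
    limit: int = 20,
) -> List[Author]:
    """Drop authors whose email belongs to a committer, keep the top `limit`."""
    excluded = {email.lower() for email in committer_emails}
    eligible = [a for a in authors if a.email not in excluded]
    eligible.sort(key=lambda a: a.commits, reverse=True)
    return eligible[:limit]


def shortlog_since(since: str, repo: str = ".") -> str:
    """Run the shortlog command from step 1 and return its output."""
    # HEAD is passed explicitly because `git shortlog` reads the log from
    # stdin when no revision is given and stdin is not a terminal.
    result = subprocess.run(
        ["git", "shortlog", "--email", "--numbered", "--summary",
         f"--since={since}", "HEAD"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Under these assumptions, `top_non_committers(parse_shortlog(shortlog_since("2022-04-28")), committer_emails)` would yield the candidate list. Matching on email alone will miss committers who commit under several addresses, so the result would still need a manual check.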
>
> This is pretty time-consuming and very scriptable. Two complications:
> * To do the filtering, we need to map from Git log "Author" to documented
> Kafka "Committer" that we can use to perform the filter. Suggestion: just
> update the structure of the "Committers" page to include their Git "Author"
> name and email
> ([https://github.com/apache/kafka-site/blob/asf-site/committers.html])
> * To generate the YAML lists, we need to map from Git log "Author" to Github
> username. There's presumably some way to do this in the Github REST API (the
> mapping is based on the email, IIUC), or we could just update the
> Committers page to also document each committer's GitHub username.
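
For the second complication, one possible approach mirrors steps 4a to 4c: ask the GitHub REST API for a commit the person authored and read the linked account from the response. The sketch below assumes the "Get a commit" endpoint (`GET /repos/{owner}/{repo}/commits/{ref}`), whose top-level `author` object carries a `login` when GitHub can match the commit email to an account and is null otherwise. The function names are hypothetical, and the YAML key names `collaborators` and `github_whitelist` are assumptions about the .asf.yaml layout that should be checked against the real file.

```python
import json
import urllib.request
from typing import Iterable, Optional

API = "https://api.github.com/repos/{repo}/commits/{sha}"


def commit_url(sha: str, repo: str = "apache/kafka") -> str:
    """Build the REST URL for a single commit."""
    return API.format(repo=repo, sha=sha)


def login_from_commit(payload: dict) -> Optional[str]:
    """Return the GitHub login linked to the commit author, if any."""
    author = payload.get("author")  # null when the email maps to no account
    return author.get("login") if author else None


def fetch_login(sha: str, repo: str = "apache/kafka",
                token: Optional[str] = None) -> Optional[str]:
    """Fetch one commit and extract its author's login (network call)."""
    request = urllib.request.Request(
        commit_url(sha, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:  # optional; unauthenticated requests have a lower rate limit
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return login_from_commit(json.load(response))


def yaml_list(key: str, logins: Iterable[str], indent: int = 2) -> str:
    """Render one YAML list, e.g. for the collaborators section."""
    pad = " " * indent
    lines = [f"{pad}{key}:"]
    lines += [f"{pad}  - {login}" for login in logins]
    return "\n".join(lines)
```

A script could call `fetch_login` once per shortlisted author, skip anyone for whom it returns `None`, and then emit the same logins twice, for example with `yaml_list("collaborators", logins)` and `yaml_list("github_whitelist", logins)`.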
>
> Ideally, we would write this script (to be stored in the Apache Kafka repo)
> and create a Github Action to run it every three months.
>