Thanks. I think we can do this, but let's make sure the error messages
from the scripts give sufficient information on what to do when they
fail. Right now, I don't think they really describe things well from
the point of view of someone who doesn't know about them yet. We might
also want to put something longer on the wiki and link to it.
Otherwise, you will spend more time explaining to people how to fix
this than you currently spend updating AUTHORS yourself.
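
As a rough illustration of the kind of message I have in mind, here is
a minimal Python sketch. The function name describe_duplicates, its
input format, and the wording of the message are all hypothetical; this
is not what the scripts in bin/ currently do. It only covers the case of
one email address being used with several names.

```python
from collections import defaultdict


def describe_duplicates(commit_authors):
    """Return actionable messages for emails used with several names.

    commit_authors is an iterable of (name, email) pairs, for example
    parsed from `git log --format="%aN <%aE>"` (parsing not shown).
    This is a hypothetical helper, not part of the SymPy scripts.
    """
    names_by_email = defaultdict(set)
    for name, email in commit_authors:
        # Compare emails case-insensitively so "A@x.org" and "a@x.org" match.
        names_by_email[email.lower()].add(name)

    messages = []
    for email, names in sorted(names_by_email.items()):
        if len(names) > 1:
            messages.append(
                f"The email <{email}> appears in commits under several "
                f"names: {', '.join(sorted(names))}.\n"
                "To fix this, either add a line to .mailmap of the form\n"
                f"    Preferred Name <{email}>\n"
                "or set user.name with `git config` and redo the commits,\n"
                "then rerun bin/mailmap_update.py."
            )
    return messages
```

The point is that the message names the offending address, lists the
conflicting names, and says which file to edit and which script to
rerun, so a first-time contributor does not need to ask.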

Also, for anyone wondering why we do this in the first place, there
are two reasons. First, AUTHORS credits everyone who has contributed
and is shipped with every release (so it's present even when the git
history isn't). Second, by keeping .mailmap up to date, we can always
get an accurate count of how many contributors we have. Without
.mailmap, counting would be very difficult, because the same person
would appear more than once in the git history. This information is
useful when computing various statistics. For example, consider the
question, "how many people contributed to SymPy in the last year?"
This question is easy to answer with 'git shortlog -ns --since="1 year
ago" | wc -l' (the answer as of right now is 152), but only because
duplicates are properly filtered.
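
To make the duplicate problem concrete, here is a toy Python sketch.
The commit data and the dict-based mapping are made up for
illustration, and the dict is a simplification of what git actually
does with .mailmap.

```python
def count_contributors(commit_authors, mailmap=None):
    """Count distinct (name, email) identities after applying a mapping.

    mailmap is a plain dict from a (name, email) pair as recorded in a
    commit to the canonical (name, email) pair. This is a simplified
    stand-in for git's .mailmap handling, used only for illustration.
    """
    mailmap = mailmap or {}
    canonical = {mailmap.get(author, author) for author in commit_authors}
    return len(canonical)


# Hypothetical history: John committed both locally and through a web UI
# that recorded a noreply address.
commits = [
    ("John Doe", "john@example.com"),
    ("John Doe", "john@users.noreply.example.com"),
    ("Jane Roe", "jane@example.com"),
    ("John Doe", "john@example.com"),
]
mailmap = {
    ("John Doe", "john@users.noreply.example.com"):
        ("John Doe", "john@example.com"),
}

print(count_contributors(commits))           # 3: John is counted twice
print(count_contributors(commits, mailmap))  # 2: duplicates are merged
```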

Aaron Meurer

On Sat, Aug 21, 2021 at 9:58 AM Oscar Benjamin
<[email protected]> wrote:
>
> Hi all,
>
> I just merged a PR to update the AUTHORS file:
> https://github.com/sympy/sympy/pull/21881
>
> This is something that I've needed to do before each release and is
> quite time consuming so in that PR I also added the scripts that check
> the AUTHORS file to the CI tests. Now any PR that does not have up to
> date author information will fail in CI. In particular this means that
> any new contributor will need to add their name to the AUTHORS file
> before their first PR can be merged.
>
> The name and email address in the AUTHORS file need to match the
> name/email in the git commits (as set by git config). Otherwise
> there will need to be a line in the .mailmap file associating the
> name/email in the AUTHORS file with the name/email in the commits. The
> most common reason this is needed is that someone makes commits through
> the GitHub web UI, which always records a noreply email address.
>
> This will be problematic for new contributors but it means that the
> author information will always be up to date and will be more accurate
> since anyone contributing will be required to specify the name and
> email address up front.
>
> The simple instruction for any new contributor is that you should
> first set your name and email address in your git config:
>
>     $ git config --global user.name "John Doe"
>     $ git config --global user.email [email protected]
>
> See e.g.:
> https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup
>
> If git was correctly configured for all of the commits, then the sympy
> git repo has two scripts that can be used to check and update the
> author information. The first is
>
>     $ bin/mailmap_update.py
>
> This extracts all name/email combinations from all commits in the repo
> and checks that every email address is mapped to a unique name. If
> there are two commits with the same email addresses but different
> names then a mailmap line is needed to specify which name is the one
> that should be used in the authors file. Likewise if the same name is
> used with multiple email addresses then a mailmap line is needed to
> specify which email address should go with the name. (I'm not sure how
> to handle genuinely distinct people who actually have the same
> name...).
>
> If there are no errors reported by mailmap_update.py then the second
> script can be used to update the AUTHORS file:
>
>     $ bin/authors_update.py
>
> This also extracts all name/email combinations from commits and runs
> them through the mapping in .mailmap. If any are not listed in the
> AUTHORS file, the script will add them and print a message saying
> what it has done. You can use git diff to see the changes. These
> changes to .mailmap and AUTHORS need to be committed and pushed.
>
> I expect some new contributors will find this difficult especially
> since they might not discover that their git config has the wrong
> name/email until after they have already made the commits and pushed
> them. It will reduce the chances that someone accidentally uses the
> wrong git config though (once your commits are merged to master there
> is no way to remove them even if you do not want that name/email to be
> used publicly). The simple fix is just to add some lines to .mailmap
> but many contributors might prefer to actually fix their config and
> redo the commits. In general I would rather have correct name/email
> information recorded in the commits than have a .mailmap entry that
> disambiguates them.
>
> Lastly whenever we have changes to CI like this there is a risk that a
> PR that has already passed CI will be merged resulting in CI failure
> on the master branch that then prevents any PRs from being merged
> until it gets fixed. CI can be rerun for a PR by closing and
> reopening it.
>
> --
> Oscar
>
> --
> You received this message because you are subscribed to the Google Groups 
> "sympy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sympy/CAHVvXxThP7%2Bb2WQbExN670tgM8AE7hiS3PisJcNSHnUf-H9y-g%40mail.gmail.com.

