klesh commented on issue #1680: URL: https://github.com/apache/incubator-devlake/issues/1680#issuecomment-1153588720
This is summary of our discussion:  ## glossary 1. table `users` to be renamed to `accounts` to avoid confusion, it had caused so many problems during discussion. 2. keep `people` as it is to represent a natural person, thus, a `people` could have multiple `accounts`, we agreed that `users` is ambiguous and should not be used anywhere in our database to avoid confusion. 3. table `person_accounts` exists, because `accounts` might be deleted by `subtask` from plugins, and `person_accounts` can only be entered by our user, in order to keep those data safe, we all agreed that they should be stored to a separated table even though they are actually One-To-Many relationship ## user cases ### aggregate `commits` by `person` **Proposal 1.1** ``` SELECT pa.person_id, count(*) as total_commits FROM commits c LEFT JOIN person_accounts pa ON (pa.account_id = c.author_email) GROUP BY pa.person_id ``` Sometimes, `commits.author_email` might be `[email protected]` due to privacy reasons, we all agreed that `gitextractor` should detect this kind of email (maybe configurable), and generate some sort FAKE author_email to make `commits` would always be distinguishable. **Proposal 1.2** However, I propose another approach: to use `commits.author_id` instead. ``` SELECT pa.person_id, count(*) as total_commits FROM commits c LEFT JOIN person_accounts pa ON (pa.account_id = c.author_id) GROUP BY pa.person_id ``` Rationale: 1. We have `author_id` `committer_id` in our `commits` table. 2. We all agreed that email is unreliable, and we should use `id` (either incremental or origin-key etc, we will discuss it later) ### how should a User setup this user mapping User download a `*.csv` , edit this file, upload, done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
