Hi list,

after the recent threads on accountability I had a long talk with Thiago yesterday, brainstorming the issue. We came up with a strategy that will allow us to have accountability while still embracing the advantages of distributed SCMs. I'd like to write up our findings and present them here for archival and discussion purposes.

The issue:
In team efforts, having a known-to-be-correct way of linking contributions to persons is essential for various reasons. It keeps people honest and it makes communication easier if you spot a bug in some commit.

Usecases:
First, the simple usecase. Developer Andy updates kdelibs to the latest version, writes a patch which he commits locally and then pushes it to the kde-server.

A bit more complex: Björn writes a fix for a bug on his 4.1 branch checkout. He then merges this same fix to the current 'trunk' branch so it will also appear in 4.2 and later. He pushes his commits (one for 4.1 and one for trunk) to the kde-server.

Using distributed development: Carlos runs a server for kde-Chile. Carlos is the only one who has a kde account, and he has various volunteers who write translations and send him patches, which he then proofreads and eventually sends to the kde-server.

Solution/proposal:
We assume there is one kde-server that all kde people commit to in order to make that commit end up in a kde release. We assume that the people committing will connect to this server using some form of authentication, an ssh-key for example.

In Git (or any DSCM) you 'push' the changes you have made locally upstream to the server. This may involve more than one commit. When Andy (usecase 1) makes one or more commits locally and then pushes those, the server knows for sure it's Andy sending these commits due to the authentication. The official kde-server will not accept pushes from anonymous people, after all ;)

The (git) repository Andy pushes to is the official kdelibs repository and naturally it will have a named branch that is the official 'master' branch. Björn's changes will be visible on the master branch after pushing.

All the above is mostly introduction, the exciting part happens below ;)

The server that accepts the patches from Andy can easily find out which commits he just pushed.
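
To make "the server can easily find out which commits he just pushed" a bit more concrete, here is a minimal sketch in Python. It relies on the fact that git's server-side receive hooks get one line per updated ref on stdin, in the form "<old-sha> <new-sha> <ref-name>". The names RefUpdate, parse_hook_input and rev_list_command are made up for this illustration, and handing a brand-new branch's full history to rev-list is a simplification, not a worked-out design.

```python
from typing import List, NamedTuple

# git reports an all-zero id when a ref did not exist before (or was deleted).
ZERO_ID = "0" * 40


class RefUpdate(NamedTuple):
    """One line of receive-hook input (hypothetical helper type)."""
    old: str
    new: str
    ref: str


def parse_hook_input(text: str) -> List[RefUpdate]:
    """Parse '<old-sha> <new-sha> <ref-name>' lines into RefUpdate records."""
    updates = []
    for line in text.splitlines():
        if not line.strip():
            continue
        old, new, ref = line.split(" ", 2)
        updates.append(RefUpdate(old, new, ref))
    return updates


def rev_list_command(update: RefUpdate) -> List[str]:
    """Build the git command that lists the commits added by this push.

    Assumption: for a newly created branch we simply list everything
    reachable from the new tip; a real hook would likely narrow this down.
    Ref deletions (new == ZERO_ID) add no commits, so they are rejected here.
    """
    if update.new == ZERO_ID:
        raise ValueError("ref was deleted, nothing was pushed")
    if update.old == ZERO_ID:
        return ["git", "rev-list", update.new]
    return ["git", "rev-list", f"{update.old}..{update.new}"]
```

A hook script would read sys.stdin, call parse_hook_input on it, and run each resulting command with subprocess to obtain the sha1 ids of the pushed commits.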

What the server then does is take the latest commit made by Andy on the master branch and merge it into a so-called loggingOfMaster branch. This means the content of the master and the loggingOfMaster branches will be identical. The merge action creates a commit. This commit lives on the logging branch and is created automatically by the server. The commit message for that merge will contain:
* The sha1 ids of each of the commits that were just pushed.
* The identifying name of the authenticated user that pushed these commits.
* Miscellaneous data like the time and maybe the IP-address.

The effect of this action is that we now register the actual pusher of commits. So even if I made a commit in Ossi's name, if I push it, it will still be traceable to me.

Keeping the log branch *inside* the official repository means that everyone who does a checkout of KDE can pass an argument to get all this identifying information. So we don't have to ask sysadmin to read some log or something ;)

For our second usecase, where Björn commits to two branches, we apply the same concept. Björn made commits to a second branch (the 4.1 branch), and for that second branch a separate log branch exists. The only difference from Andy's usecase is that a logging commit is made for each of the two branches Björn pushes to.

For my third usecase, Carlos collects patches from various people he knows. The only one actually pushing to the kde-server is Carlos, however. So in our system, when Carlos pushes 10 commits from 10 different people, all of them will be marked as belonging to Carlos. In principle this is fine. It is up to Carlos to take responsibility for the patches he pushes. But he might want to make clear that a commit is actually not his, especially if he didn't manage to actually read the whole thing.
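
Coming back to the logging merge for a moment, here is a rough sketch in Python of how the server could put together the message of the automatic merge commit. The field names (Commit:, Pushed-by:, Pushed-at:, Pushed-from:) and the helper names are invented for this example; nothing about the format is decided. The merge itself is shown with --no-ff so that git always creates a real merge commit instead of fast-forwarding, and it assumes a non-bare checkout that has the logging branch checked out.

```python
from datetime import datetime
from typing import List, Optional


def build_log_message(pusher: str, shas: List[str], when: datetime,
                      ip: Optional[str] = None) -> str:
    """Compose the commit message for the automatic logging merge.

    The layout is a hypothetical format: one 'Commit:' line per pushed
    sha1, followed by who pushed, when, and optionally from where.
    """
    lines = [f"Log push by {pusher}", ""]
    lines += [f"Commit: {sha}" for sha in shas]
    lines.append(f"Pushed-by: {pusher}")
    lines.append(f"Pushed-at: {when.isoformat()}")
    if ip is not None:
        lines.append(f"Pushed-from: {ip}")
    return "\n".join(lines) + "\n"


def merge_command(message: str, source_branch: str = "master") -> List[str]:
    """Build the git command that merges source_branch into the current
    (logging) branch, forcing a merge commit that carries the message."""
    return ["git", "merge", "--no-ff", "-m", message, source_branch]
```

Since the message is plain text with one field per line, anyone with a checkout can read it from the history of the logging branch with ordinary git log, without needing access to server logs.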

For this usecase I want to introduce a completely separate accountability concept that is similar to, but different from, the git concept of signing off.

When Diana makes several translations, she does this on a git checkout. She then commits her changes and emails or pushes them to Carlos. Making a commit in Git is done in several steps, and the last step is to create a commit message. Diana will use a way of committing[1] that takes the commit message, the tree (including her changes) and her email address, and gpg-signs that information. This signature will be added to the log message, probably as a one-liner of ascii data.

The effect of signing this is that we can now send that patch around and merge it into any tree without losing the information about who really created that patch initially.

Ok, back to our Carlos usecase: Diana created the patch, she signed it and then pushes it to Carlos. Carlos checks the signature using some custom script to make sure the commit really is from Diana, and he merges it into his tree. At a later point he pushes all the changes he has, including the one from Diana, to the kde-server. The commit is registered by the kde-server as being made by Carlos, but Carlos can point to the signature to indicate that, really, Diana was the one who made this commit.

Looking forward to Akademy where we can explain/discuss this further ;)

Cheers!

ps. any mistakes are mine, but I'm sure that Thiago will reply if he spots anything I missed.

[1] We can create a script for all our users or we can try to push this into git-commit itself. Both solutions should be pursued. Basically the script will sign what we now see using
    git cat-file -p HEAD | grep -v committer

-- 
Thomas Zander
_______________________________________________
Kde-scm-interest mailing list
[email protected]
https://mail.kde.org/mailman/listinfo/kde-scm-interest
