Hi all, I wanted to bring up how contributions are made to the project, specifically with regard to the committer and author fields in the commit metadata. The process for contributing is described here: https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez.
There seem to be two options for contributing: submitting a patch and submitting a pull request.

If a patch is submitted, it is harder to preserve authorship in the commit metadata, since the contributor may not have a GitHub account. For this case in general, I can't think of anything better than the current approach, which as I understand it consists of naming the author in the commit message. But most commits are going to come from people we know, so it shouldn't be too hard for a committer to find out the contributor's GitHub id and email.

The second option is to open a pull request. In this case I think we should ideally preserve authorship, since we have all the necessary information to do so. One possible way of doing this, consistent with the commit history and with the previous option, is to cherry-pick and squash the commits from the PR and then commit to master.

Jaume.
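
P.S. In case it helps the discussion, here is a rough sketch of what the pull request option could look like in terms of git commands. It is only an illustration, not the project's documented procedure. I am assuming `git merge --squash` as one way to get the "cherry-pick and squash" effect, GitHub's `pull/<N>/head` ref for fetching the PR, and a remote called `origin`. The helper name `squash_commands`, the PR number, the author details and the commit message are all made-up placeholders. The helper only builds the commands, it does not run them.

```python
import shlex


def squash_commands(pr_number, author_name, author_email, message,
                    remote="origin", target="master"):
    """Build the git commands that squash a PR into a single commit.

    The contributor goes into the author field via --author; the
    committer field is filled in by git from whoever runs the commit.
    All names and defaults here are illustrative assumptions.
    """
    local_branch = f"pr-{pr_number}"
    author = f"{author_name} <{author_email}>"
    return [
        # Start from the branch the change should land on.
        ["git", "checkout", target],
        # Fetch the PR head into a local branch (GitHub-specific ref).
        ["git", "fetch", remote, f"pull/{pr_number}/head:{local_branch}"],
        # Stage the combined changes of the PR without committing.
        ["git", "merge", "--squash", local_branch],
        # Create one commit, recording the contributor as the author.
        ["git", "commit", f"--author={author}", "-m", message],
    ]


if __name__ == "__main__":
    # Hypothetical values, only to show the shape of the commands.
    for cmd in squash_commands(123, "Jane Doe", "jane@example.org",
                               "TEZ-0000. Example change"):
        print(shlex.join(cmd))
```

The same `--author` flag would also cover the patch case, once a committer has found out the contributor's name and email.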
