v01d commented on issue #1954: URL: https://github.com/apache/incubator-nuttx/issues/1954#issuecomment-707708672
hi @yy-gu, great that you and Peter can take on this task. Regarding the workflow, I have a few comments:

* The checks you mention are already in place in the script I wrote. They are mostly: check the git author, check the header author, and check any attributions in the git commit message. The checks are done by name and alternatively by email, using a set of remappings (to account for alternative emails of the same author).
* The "without ambiguity" part will be difficult, since the script may fail to identify some authors (mostly in the attributions, since this is really a heuristic). Header authors should all be detected, but this is regex based, so if I didn't consider a particular case it may also miss an author.
* I wonder if assuming that Apache licensed files are fine is really safe: if we didn't go through this process before, it is possible someone may have missed an attribution in a git message, for example. So maybe (in a later step) we should distinguish between "apache licensed" and "cleared". We could first clear non-Apache files to get most of the work done and then clear (validate) the remaining Apache files.

Finally, note that besides authors there may be companies involved, from which you will need SGAs in addition to the authors' ICLAs. The script also tries to identify this, but there may be border cases.

In conclusion, I think that the approach could be:

1. Stage 1
   1. look for non-Apache licensed files where the script gives the "can be apachized" result
   2. manually verify these by going through the script output (to verify no authors or companies were missed)
2. Stage 2
   1. look for non-Apache licensed files where the script does not give the "can be apachized" result
   2. try to contact authors/companies to request ICLAs
3. Stage 3 (optional?)
   1. look for Apache licensed files (not resulting from stages 1 and 2) which have not been cleared
   2. run the script and validate the "can be apachized" result

To perform the conversion itself, I added a script to change the header which you could use.
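To make the staging concrete, here is a rough Python sketch of how files could be sorted into the three stages. This is not the actual script: `FileStatus`, `Stage`, `triage` and the example paths are hypothetical names made up for illustration, and the sketch assumes the script's "can be apachized" result is already available as a boolean per file.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Hypothetical labels for the stages proposed above."""
    STAGE_1 = "non-Apache, can be apachized: verify script output manually"
    STAGE_2 = "non-Apache, cannot be apachized yet: contact authors/companies"
    STAGE_3 = "Apache licensed but not cleared: run script and validate"
    CLEARED = "already cleared: nothing to do"


@dataclass
class FileStatus:
    """Assumed per-file summary; the field names are illustrative only."""
    path: str
    apache_licensed: bool    # header already carries the Apache license
    can_be_apachized: bool   # result reported by the analysis script
    cleared: bool = False    # file has already been validated


def triage(status: FileStatus) -> Stage:
    """Map one file to the stage in which it would be handled."""
    if status.cleared:
        return Stage.CLEARED
    if status.apache_licensed:
        # "apache licensed" is not the same as "cleared", so validate later.
        return Stage.STAGE_3
    if status.can_be_apachized:
        return Stage.STAGE_1
    return Stage.STAGE_2


if __name__ == "__main__":
    # Hypothetical example files, not real paths from the repository.
    examples = [
        FileStatus("example/a.c", apache_licensed=False, can_be_apachized=True),
        FileStatus("example/b.c", apache_licensed=False, can_be_apachized=False),
        FileStatus("example/c.c", apache_licensed=True, can_be_apachized=True),
    ]
    for status in examples:
        print(f"{status.path}: {triage(status).name}")
```

Keeping "cleared" as a separate flag from "apache licensed" reflects the distinction suggested in the third comment above.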
However, the following issue is not yet addressed: the script does not work for files which were part of submodules in the past. This means boards, arch and maybe others which I don't remember. The problem is that the script tries to access the file content at a commit from the submodule era, which is not possible in the current repo. Greg made the submodules available, and these could be used to retrieve contents from that part of their history.

Finally, I would always make these header changes in separate commits and use the output of the script as part of the commit message. This way, the changes are traceable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org