I found some gaps in scanCode (the tool we use for checking repository conformance for things like headers, whitespace-related formatting, etc.) and in how its exclusions are implemented. I posit there's a desire to allow scanCode to process existing .gitignore files, and moreover to treat the exclusion section of the scanCode config in the same way as gitignore files (wildcards don't currently work and file-based matching is too loose).
This page provides a detailed description of how gitignore rules work [1], and I found a Python library that appears to implement matching a directory tree against .gitignore [2]. I've incorporated the enhancements into scanCode on my fork here https://github.com/apache/incubator-openwhisk-utilities/compare/master...rabbah:gitignore?expand=1, which closes an issue I opened some time ago [3]. The implementation would require an added dependency on the matching library (pip install pathspec). I can look into compiling scanCode into a self-contained binary, which would mean we should also create a release for scanCode itself. Thoughts?

-r

[1] https://git-scm.com/docs/gitignore
[2] https://github.com/cpburnz/python-path-specification
[3] https://github.com/apache/incubator-openwhisk-utilities/issues/39
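
P.S. For anyone who hasn't used the library in [2], here is a rough sketch of what gitignore-style matching with pathspec looks like. This is not the code on my fork: the patterns, the file names, and the filter_excluded helper are made up for illustration, and I'm assuming the "gitwildmatch" pattern style that pathspec provides for gitignore semantics.

```python
import pathspec  # third-party dependency: pip install pathspec

# Hypothetical exclusion patterns, written exactly as they would appear
# in a .gitignore file (or, as proposed, in the scanCode exclusion section).
EXCLUSION_PATTERNS = """
# comments and blank lines are skipped by the matcher
*.log
build/
!keep.log
"""


def filter_excluded(paths, pattern_text=EXCLUSION_PATTERNS):
    """Return the paths that are NOT excluded by the gitignore-style patterns.

    Illustrative helper only; it is not part of scanCode.
    """
    spec = pathspec.PathSpec.from_lines("gitwildmatch", pattern_text.splitlines())
    return [p for p in paths if not spec.match_file(p)]


if __name__ == "__main__":
    candidates = ["src/main.py", "debug.log", "keep.log", "build/out.txt"]
    for path in filter_excluded(candidates):
        print("would scan:", path)
```

The point is that wildcards ("*.log"), directory patterns ("build/"), and negation ("!keep.log") all follow the rules in [1] without us having to reimplement them.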
