andygrove opened a new pull request, #4685: URL: https://github.com/apache/datafusion-comet/pull/4685
## Which issue does this PR close?

Closes #4671.

## Rationale for this change

The `apache-rat-plugin` is bound to the Maven `verify` phase, so it runs on every `install`, including the six `./mvnw ... -DskipTests install` runs in `dev/release/build-release-comet.sh` (`-DskipTests` skips tests, not RAT). RAT scans the root module's directory tree, and the exclude list did not cover several untracked generated/scratch directories and files that accumulate during a release (Python virtualenv, Docker build workdir, extracted release tarballs, the rat report files, and the downloaded rat jar). Once these are populated, each RAT pass walks a very large number of files, so the build appears to hang when it is actually busy scanning.

Additionally, the bash rat check (`dev/release/run-rat.sh` + `rat_exclude_files.txt`) flagged seven files that the Maven RAT config already skips, so the two checks were inconsistent.

## What changes are included in this PR?

- Add excludes to the `apache-rat-plugin` configuration in `pom.xml` for the release scratch paths: `dev/release/venv/**`, `dev/release/comet-rm/workdir/**`, `dev/dist/**`, `dev/release/rat.txt`, `dev/release/filtered_rat.txt`, and `dev/release/*.jar`.
- Reconcile the bash rat exclude list (`dev/release/rat_exclude_files.txt`) with the Maven excludes by adding the seven files the bash check flagged but Maven already skips: the four `.claude/skills/*/SKILL.md` docs, `.github/workflows/README.md`, and the `native/jni-bridge/testdata/` backtrace/stacktrace fixtures (these moved from `native/testdata`, whose stale paths remain listed).

## How are these changes tested?

Verified that the updated `rat_exclude_files.txt` globs match all seven previously flagged files using the same `fnmatch` logic as `dev/release/check-rat-report.py`. The `pom.xml` excludes are RAT configuration only and are exercised by the existing release build process.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
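
The `fnmatch` verification mentioned under "How are these changes tested?" could look roughly like the sketch below. This is not the code from `dev/release/check-rat-report.py`; the function names, the glob list, and the sample file names are hypothetical and only modeled on the paths named in the PR description. It assumes each exclude entry is a plain `fnmatch` glob matched against a repository-relative path.

```python
import fnmatch

# Hypothetical exclude globs, modeled on the paths named in the PR description.
# The real list lives in dev/release/rat_exclude_files.txt.
EXCLUDE_GLOBS = [
    ".claude/skills/*/SKILL.md",
    ".github/workflows/README.md",
    "native/jni-bridge/testdata/*",
    "dev/release/venv/*",
    "dev/release/*.jar",
]

# Hypothetical stand-ins for previously flagged files (names are illustrative).
FLAGGED = [
    ".claude/skills/example-skill/SKILL.md",
    ".github/workflows/README.md",
    "native/jni-bridge/testdata/example_backtrace.txt",
]


def is_excluded(path, globs=EXCLUDE_GLOBS):
    """Return True if the path matches at least one exclude glob."""
    return any(fnmatch.fnmatch(path, glob) for glob in globs)


def unmatched(paths, globs=EXCLUDE_GLOBS):
    """Return the paths that no exclude glob covers (still flagged)."""
    return [p for p in paths if not is_excluded(p, globs)]


if __name__ == "__main__":
    leftover = unmatched(FLAGGED)
    if leftover:
        print("Not covered by any exclude glob:", leftover)
    else:
        print("All flagged files are covered by the exclude globs")
```

One difference to keep in mind when comparing the two lists: in Python's `fnmatch`, `*` also matches `/`, so `dev/release/venv/*` covers nested files, whereas the Ant-style patterns used in `pom.xml` need `**` to span directories.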
