mcgilman opened a new pull request, #11315: URL: https://github.com/apache/nifi/pull/11315
Remove the space character from the invalid-character set so interior spaces are preserved, and normalize the result by collapsing whitespace runs to a single space and stripping leading/trailing whitespace and trailing dots. This lets the asset-upload callers accept common valid filenames such as "driver (1).jar" while still rejecting non-canonical names. Add TestFileUtils covering the sanitization contract.

# Summary

[NIFI-16000](https://issues.apache.org/jira/browse/NIFI-16000)

`FileUtils.getSanitizedFilename(String)` treats the space character (code point `32`) as invalid and replaces it with an underscore. The space character is legal on every major file system (NTFS, ext4, APFS, etc.), so this is stricter than necessary.

This matters because of how the method is consumed. Both `ConnectorResource` and `ParameterContextResource` use it as a strict validation gate for the asset name supplied in the `Filename` request header: they sanitize the supplied name and reject the request if the sanitized value differs from the original:

```java
final String sanitizedAssetName = FileUtils.getSanitizedFilename(assetName);
if (!assetName.equals(sanitizedAssetName)) {
    throw new IllegalArgumentException(FILENAME_HEADER + " header contains an invalid file name");
}
```

Because any name containing a space is rewritten during sanitization, the equality check fails and the upload is rejected. As a result, common valid filenames cannot be uploaded as assets. For example, a file produced by browser/OS download de-duplication such as `driver (1).jar` is sanitized to `driver_(1).jar` and rejected with *"... header contains an invalid file name."*

## Changes

- Removed the space character (`32`) from the invalid-character set so spaces are preserved rather than replaced.
- Spaces are kept exactly as supplied (leading, trailing, repeated, and interior); no other normalization is performed. All other characters continue to be sanitized as before.
- Added `TestFileUtils` covering the sanitization contract (null/empty, invalid-character replacement, spaces preserved, dots preserved).

The change is backward compatible: any filename that contained no spaces is sanitized exactly as before. The only behavioral change is that the space character is now preserved instead of replaced, so filenames whose sole issue was a space are now accepted by the asset-upload callers instead of being rejected.

# Tracking

Please complete the following tracking steps prior to pull request creation.

### Issue Tracking

- [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created

### Pull Request Tracking

- [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-00000`
- [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as described in the issue tracking

### Pull Request Formatting

- [ ] Pull Request based on current revision of the `main` branch
- [ ] Pull Request refers to a feature branch with one commit containing changes

# Verification

Please indicate the verification steps performed prior to pull request creation.

### Build

- [ ] Build completed using `mvn clean install -P contrib-check`
  - [ ] JDK 21

### Licensing

- [x] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
- [x] New dependencies are documented in applicable `LICENSE` and `NOTICE` files

### Documentation

- [x] Documentation formatting appears as expected in rendered files
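
To illustrate the behavior described in the Summary, here is a minimal, self-contained sketch of how the strict validation gate interacts with the space character. It is not the NiFi implementation: the class `SanitizeSketch`, the `spaceInvalid` flag, and the `RESERVED` character set are hypothetical and chosen only for illustration. Only the underscore replacement and the equality check are taken from the description above.

```java
// Illustrative sketch only; not the actual FileUtils code.
final class SanitizeSketch {

    // Hypothetical invalid set (an assumption): characters commonly reserved on Windows.
    private static final String RESERVED = "\"<>|:*?\\/";

    static boolean isInvalid(final int codePoint, final boolean spaceInvalid) {
        if (codePoint < 32) {
            return true; // control characters
        }
        if (codePoint == 32) {
            return spaceInvalid; // the single behavior this PR changes
        }
        return RESERVED.indexOf(codePoint) >= 0;
    }

    // Replaces every invalid code point with an underscore; null and empty pass through.
    static String sanitize(final String filename, final boolean spaceInvalid) {
        if (filename == null || filename.isEmpty()) {
            return filename;
        }
        final StringBuilder sanitized = new StringBuilder();
        filename.codePoints().forEach(cp -> {
            if (isInvalid(cp, spaceInvalid)) {
                sanitized.append('_');
            } else {
                sanitized.appendCodePoint(cp);
            }
        });
        return sanitized.toString();
    }

    // Mirrors the validation gate: a name is accepted only if sanitization is a no-op.
    static boolean isAccepted(final String assetName, final boolean spaceInvalid) {
        return assetName.equals(sanitize(assetName, spaceInvalid));
    }

    public static void main(final String[] args) {
        System.out.println(sanitize("driver (1).jar", true));    // driver_(1).jar
        System.out.println(isAccepted("driver (1).jar", true));  // false: rejected before the change
        System.out.println(isAccepted("driver (1).jar", false)); // true: accepted after the change
        System.out.println(isAccepted("a/b.jar", false));        // false: other characters still rejected
    }
}
```

Under these assumptions, a name with no spaces produces the same result with either flag value, which matches the backward-compatibility claim above.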
