Copilot commented on code in PR #4479:
URL: https://github.com/apache/polaris/pull/4479#discussion_r3281862244
##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/validation/EntityNameValidator.java:
##########
@@ -27,25 +27,45 @@
*
* <ul>
* <li>is not null or empty;
- * <li>does not contain a forward slash ({@code /});
+ * <li>is not {@code .} or {@code ..};
+ * <li>does not contain ISO control characters (U+0000–U+001F or U+007F–U+009F);
+ * <li>does not contain any of: {@code / \ : * ? " < > | # + `};
* <li>does not start or end with whitespace.
* </ul>
*/
public final class EntityNameValidator {
private EntityNameValidator() {}
+ /**
+ * Characters forbidden in entity names beyond control characters and leading/trailing whitespace.
+ * Covers characters rejected or strongly discouraged by S3, GCS, Azure, Windows filesystem
+ * semantics, URL encoding, and shell/template/SQL quoting.
+ */
+ private static final String FORBIDDEN_CHARS = "/\\:*?\"<>|#+`";
+
/** Validates a single entity name (table, view, namespace level, ...). */
public static void validateName(String name) {
if (name == null || name.isEmpty()) {
throw new IllegalArgumentException("Entity name must not be empty");
}
- if (name.indexOf('/') >= 0) {
- throw new IllegalArgumentException("Entity name must not contain '/': " + name);
+ if (name.equals(".") || name.equals("..")) {
+ throw new IllegalArgumentException("Entity name must not be '.' or '..'");
}
- if (!name.equals(name.strip())) {
- throw new IllegalArgumentException(
- "Entity name must not have leading or trailing whitespace: " + name);
+ for (int i = 0; i < name.length(); i++) {
+ char c = name.charAt(i);
+ if (Character.isISOControl(c)) {
+ throw new IllegalArgumentException(
+ String.format(
+ "Entity name must not contain control characters (U+%04X): %s", (int) c, name));
Review Comment:
The control-character error message includes the raw `name` value, which may
still contain unprintable characters (including newlines). Since
`IllegalArgumentException` messages are logged and returned in JSON error
responses (via `IcebergExceptionMapper`), this can lead to log injection /
hard-to-read logs. Consider omitting the raw name for this case, or
escaping/sanitizing it (e.g., replace ISO control chars with visible `\\uXXXX`
sequences) before including it in the exception message.
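
A minimal sketch of the escaping option suggested above. The class name `ControlCharEscaper` and the method `escapeControlChars` are hypothetical and not part of the PR; the sketch only assumes that the same `Character.isISOControl` test used by the validator decides which characters get escaped.

```java
// Hypothetical helper, not part of the PR: makes control characters visible
// before a name is embedded in an exception message or a log line.
final class ControlCharEscaper {
  private ControlCharEscaper() {}

  /** Replaces each ISO control character with a visible backslash-u hex escape. */
  static String escapeControlChars(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (Character.isISOControl(c)) {
        // Same U+%04X-style hex formatting the validator already uses.
        sb.append(String.format("\\u%04X", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }
}
```

With a helper like this, the message could pass `escapeControlChars(name)` instead of `name`, so an embedded newline cannot start a fake log line.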
##########
CHANGELOG.md:
##########
@@ -34,7 +34,12 @@ request adding CHANGELOG notes for breaking (!) changes and possibly other secti
### Breaking changes
- The ExternalCatalogFactory interface has been renamed to FederatedCatalogFactory. Its createCatalog() and createGenericCatalog() method signatures have been extended to include a `catalogProperties` parameter of type `Map<String, String>` for passing through proxy and timeout settings to federated catalog HTTP clients.
- The `ConnectionCredentials.of()` method now throws an exception when more than one expiration timestamp property is present in the credentials map. Only a single expiration timestamp is allowed per credentials bundle.
-- Entity names (namespaces, tables, views, generic tables) submitted to the REST layer are now rejected with HTTP 400 if they are empty, contain a `/`, or have leading/trailing whitespace. Clients that were previously able to create such entities must rename them before upgrading.
+- The REST layer now enforces stricter validation for entity names (including namespaces, tables, views, and generic tables). Requests containing invalid names will be rejected with an HTTP 400 error. Existing clients should verify and rename entities before upgrading if their names fall into the following forbidden categories:
+ - Empty strings
+ - Names consisting solely of `.` or `..`
+ - Names containing control (invisible) characters
+ - Names with leading or trailing whitespace
+ - Names containing any of these characters: `/\:*?"<>|#`
Review Comment:
The changelog’s forbidden-character list doesn’t match the validator’s
actual behavior/documentation: `EntityNameValidator` currently forbids `+` and
`` ` `` in addition to the listed characters. To avoid surprising breaking
changes, update this entry (and/or adjust `FORBIDDEN_CHARS`) so the release
notes accurately reflect what will be rejected.
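
One way to keep the two lists from drifting is a small check that compares a documented character list against `FORBIDDEN_CHARS`. The sketch below is only illustrative: the class `ForbiddenCharsDoc` and the method `undocumented` are hypothetical, and the constant is copied from the diff above rather than read from the real validator.

```java
// Hypothetical check, not part of the PR: reports forbidden characters that a
// documented list (for example, the CHANGELOG entry) fails to mention.
final class ForbiddenCharsDoc {
  private ForbiddenCharsDoc() {}

  // Copied from the diff above; the real constant is private to EntityNameValidator.
  static final String FORBIDDEN_CHARS = "/\\:*?\"<>|#+`";

  /** Returns the forbidden characters that do not appear in {@code documented}. */
  static String undocumented(String documented) {
    StringBuilder missing = new StringBuilder();
    for (int i = 0; i < FORBIDDEN_CHARS.length(); i++) {
      char c = FORBIDDEN_CHARS.charAt(i);
      if (documented.indexOf(c) < 0) {
        missing.append(c);
      }
    }
    return missing.toString();
  }
}
```

Called with the changelog's current list, `/\:*?"<>|#`, this returns the two characters the entry omits, `+` and `` ` ``.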