Copilot commented on code in PR #880:
URL: https://github.com/apache/ranger/pull/880#discussion_r2963339582
##########
agents-common/src/main/java/org/apache/ranger/plugin/client/BaseClient.java:
##########
@@ -184,6 +184,114 @@ private void init() {
}
}
+ protected void validateSqlIdentifier(String identifier, String
identifierType) throws HadoopException {
+ if (identifier == null) {
+ return;
+ }
+ if (identifier.contains("..") || identifier.contains("//") ||
identifier.contains("\\")) {
+ String msgDesc = "Invalid " + identifierType + ": [" + identifier
+ "]. Path traversal patterns are not allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
DEFAULT_ERROR_MESSAGE, null, null);
+ LOG.error(msgDesc);
+ throw hdpException;
+ }
+ if (!identifier.matches("^[a-zA-Z0-9_*%\\$\\{\\}\\=\\/\\.\\-]+$")) {
Review Comment:
The allowed-character regex in `validateSqlIdentifier()` does not include
`_`, but the error message says underscores are allowed and many legitimate
identifiers (and tests in this PR) use underscores. Add `_` to the character
class (or otherwise allow it) so common names like `test_db` don’t get rejected.
```suggestion
if (!identifier.matches("^[a-zA-Z0-9_*%\\$\\{\\}\\=\\/\\.\\-_]+$")) {
```
##########
agents-common/src/main/java/org/apache/ranger/plugin/client/BaseClient.java:
##########
@@ -184,6 +184,114 @@ private void init() {
}
}
+ protected void validateSqlIdentifier(String identifier, String
identifierType) throws HadoopException {
+ if (identifier == null) {
Review Comment:
`validateSqlIdentifier()` currently treats an empty string as invalid
because it only returns early for `null`. Several lookup call sites previously
treated empty input as “no filter” (e.g., they only applied LIKE when
`!isEmpty()`), so this change can cause regressions. Consider returning early
for `identifier.isEmpty()` (and possibly `identifier.isBlank()`) to preserve
prior behavior while still validating non-empty input.
```suggestion
if (identifier == null || identifier.isEmpty()) {
```
##########
agents-common/src/main/java/org/apache/ranger/plugin/client/BaseClient.java:
##########
@@ -184,6 +184,114 @@ private void init() {
}
}
+ protected void validateSqlIdentifier(String identifier, String
identifierType) throws HadoopException {
+ if (identifier == null) {
+ return;
+ }
+ if (identifier.contains("..") || identifier.contains("//") ||
identifier.contains("\\")) {
+ String msgDesc = "Invalid " + identifierType + ": [" + identifier
+ "]. Path traversal patterns are not allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
DEFAULT_ERROR_MESSAGE, null, null);
+ LOG.error(msgDesc);
+ throw hdpException;
+ }
+ if (!identifier.matches("^[a-zA-Z0-9_*%\\$\\{\\}\\=\\/\\.\\-]+$")) {
+ String msgDesc = "Invalid " + identifierType + ": [" + identifier
+ "]. Only alphanumeric characters, underscores, asterisks, curly braces,
percent signs are allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
DEFAULT_ERROR_MESSAGE, null, null);
+ LOG.error(msgDesc);
+ throw hdpException;
Review Comment:
The `validateSqlIdentifier()` error text doesn’t match what the validator
actually allows: the regex permits characters like `.`, `-`, `/`, `$`, and `=`
but the message only mentions underscores/asterisks/curly braces/percent. Align
the message with the real allowed set (or tighten the regex) so callers get
accurate guidance.
##########
plugin-schema-registry/src/main/java/org/apache/ranger/services/schema/registry/client/AutocompletionAgent.java:
##########
@@ -118,15 +120,59 @@ public String toString() {
}
List<String> expandSchemaMetadataNameRegex(List<String> schemaGroupList,
String lookupSchemaMetadataName) {
+ validatePattern(lookupSchemaMetadataName, "schema metadata pattern");
+ String safePattern = convertWildcardToRegex(lookupSchemaMetadataName);
List<String> res = new ArrayList<>();
Collection<String> schemas = client.getSchemaNames(schemaGroupList);
schemas.forEach(sName -> {
- if (sName.matches(lookupSchemaMetadataName)) {
+ if (sName.matches(safePattern)) {
res.add(sName);
}
});
return res;
}
+
+ private void validatePattern(String pattern, String patternType) {
+ if (pattern == null || pattern.isEmpty()) {
+ return;
+ }
+ if (!pattern.matches("^[a-zA-Z0-9_*?\\[\\]\\-\\$%\\{\\}\\=\\/]+$")) {
+ String msgDesc = "Invalid " + patternType + ": [" + pattern + "].
Only alphanumeric characters, underscores, hyphens, and simple wildcards (*, ?,
[], {}, -) are allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
ERROR_MSG, null, null);
+ LOG.error(msgDesc);
Review Comment:
`validatePattern()` allows regex-significant characters like `{`/`}` and
`[`/`]`, but `convertWildcardToRegex()` does not escape them. This means inputs
such as `test{100}` or an unbalanced `[` can turn into real regex
quantifiers/character-classes (or even throw `PatternSyntaxException`) and
undermines the goal of preventing complex/unsafe regex. Either disallow
`{}`/`[]` here, or escape them when building `safePattern` (and consider
catching `PatternSyntaxException` from `String.matches`).
##########
agents-common/src/main/java/org/apache/ranger/plugin/client/BaseClient.java:
##########
@@ -184,6 +184,114 @@ private void init() {
}
}
+ protected void validateSqlIdentifier(String identifier, String
identifierType) throws HadoopException {
+ if (identifier == null) {
+ return;
+ }
+ if (identifier.contains("..") || identifier.contains("//") ||
identifier.contains("\\")) {
+ String msgDesc = "Invalid " + identifierType + ": [" + identifier
+ "]. Path traversal patterns are not allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
DEFAULT_ERROR_MESSAGE, null, null);
+ LOG.error(msgDesc);
+ throw hdpException;
+ }
+ if (!identifier.matches("^[a-zA-Z0-9_*%\\$\\{\\}\\=\\/\\.\\-]+$")) {
+ String msgDesc = "Invalid " + identifierType + ": [" + identifier
+ "]. Only alphanumeric characters, underscores, asterisks, curly braces,
percent signs are allowed.";
+ HadoopException hdpException = new HadoopException(msgDesc);
+ hdpException.generateResponseDataMap(false, msgDesc, msgDesc +
DEFAULT_ERROR_MESSAGE, null, null);
+ LOG.error(msgDesc);
+ throw hdpException;
+ }
+ }
+
+ protected String convertToSqlPattern(String pattern) {
+ if (pattern == null || pattern.isEmpty() || pattern.equals("*")) {
+ return "%";
+ }
+ return pattern.replace("*", "%");
+ }
+
+ protected boolean matchesSqlPattern(String value, String pattern) {
+ if (pattern == null || pattern.equals("%")) {
+ return true;
+ }
+ String regex = pattern.replace("%", ".*").replace("_", ".");
Review Comment:
`matchesSqlPattern()` builds a Java regex directly from the SQL pattern by
replacing `%`/`_` but it doesn’t escape other regex metacharacters (e.g., `.`,
`{}`, `-`, `$`). This can cause incorrect matches (e.g., `.` matching any char)
and can also throw `PatternSyntaxException` for inputs like `a{`/`a}` that pass
`validateSqlIdentifier()`. Escape regex metacharacters first and then translate
`%`/`_`, or use `Pattern.quote()` with a controlled unquoting of wildcards.
```suggestion
StringBuilder regexBuilder = new StringBuilder(pattern.length() * 2);
for (int i = 0; i < pattern.length(); i++) {
char c = pattern.charAt(i);
switch (c) {
case '%':
// SQL LIKE wildcard: zero or more characters
regexBuilder.append(".*");
break;
case '_':
// SQL LIKE wildcard: exactly one character
regexBuilder.append('.');
break;
case '.':
case '^':
case '$':
case '+':
case '?':
case '{':
case '}':
case '[':
case ']':
case '(':
case ')':
case '|':
case '\\':
// Escape regex metacharacters so they are treated
literally
regexBuilder.append('\\').append(c);
break;
default:
regexBuilder.append(c);
break;
}
}
String regex = regexBuilder.toString();
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]