Jackie-Jiang commented on code in PR #16276:
URL: https://github.com/apache/pinot/pull/16276#discussion_r2211855111
##########
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/RegexpLikePredicate.java:
##########
@@ -44,13 +57,42 @@ public String getValue() {
return _value;
}
+ public String getMatchParameter() {
+ return _matchParameter;
+ }
+
public Pattern getPattern() {
if (_pattern == null) {
- _pattern = PatternFactory.compile(_value);
+ _pattern = buildPattern(_value, _matchParameter);
}
return _pattern;
}
+ private Pattern buildPattern(String pattern, String matchParameter) {
+ // Validate that all characters in matchParameter are supported
+ for (char c : matchParameter.toCharArray()) {
Review Comment:
Let's make this consistent with the scalar function, which accepts a
single-character match parameter in either upper or lower case.
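A rough sketch of the kind of parsing I have in mind, shown only to illustrate the shape. It assumes the supported flags are `i` (case-insensitive) and `c` (case-sensitive) and that `null` means "not provided"; the class and method names here are hypothetical and should be aligned with whatever the scalar function actually does:

```java
// Hypothetical helper, not existing Pinot code: parses a single-character
// match parameter, accepting both upper and lower case.
final class MatchParameterSketch {
  private MatchParameterSketch() {
  }

  static boolean isCaseInsensitive(String matchParameter) {
    // Assumption: a missing parameter means the default, case-sensitive match.
    if (matchParameter == null) {
      return false;
    }
    if (matchParameter.length() != 1) {
      throw new IllegalArgumentException(
          "Match parameter must be a single character, got: " + matchParameter);
    }
    // Normalize case so that 'I' and 'i' (and 'C' and 'c') are treated alike.
    switch (Character.toLowerCase(matchParameter.charAt(0))) {
      case 'i':
        return true;
      case 'c':
        return false;
      default:
        throw new IllegalArgumentException("Unsupported match parameter: " + matchParameter);
    }
  }
}
```

With this shape, `"i"` and `"I"` both enable case-insensitive matching, while multi-character or unknown values fail fast with an `IllegalArgumentException`.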
##########
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java:
##########
@@ -106,6 +107,19 @@ public BaseFilterOperator getLeafFilterOperator(QueryContext queryContext, Predi
}
return new ScanBasedFilterOperator(queryContext, predicateEvaluator, dataSource, numDocs);
} else if (predicateType == Predicate.Type.REGEXP_LIKE) {
+ // Check if case-insensitive flag is present
+ RegexpLikePredicate regexpLikePredicate = (RegexpLikePredicate) predicateEvaluator.getPredicate();
+ boolean isCaseInsensitive = regexpLikePredicate.getMatchParameter() != null
Review Comment:
Parse the match parameter in `RegexpLikePredicate` instead, and keep the
parsing consistent with the scalar function's logic.
##########
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/RegexpLikePredicate.java:
##########
@@ -24,15 +24,28 @@
import org.apache.pinot.common.utils.regex.PatternFactory;
/**
- * Predicate for REGEXP_LIKE.
+ * Predicate for REGEXP_LIKE with optional match parameters
*/
public class RegexpLikePredicate extends BasePredicate {
private final String _value;
+ private final String _matchParameter;
private Pattern _pattern = null;
public RegexpLikePredicate(ExpressionContext lhs, String value) {
super(lhs);
_value = value;
+ _matchParameter = "c";
+ _pattern = PatternFactory.compile(value);
Review Comment:
We shouldn't build the pattern when creating the predicate, because the
pattern is not needed when an FST is available.
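For illustration, a minimal sketch of keeping the compilation lazy. It uses `java.util.regex.Pattern` directly rather than `PatternFactory`, and the class name and the `isPatternCompiled()` accessor are hypothetical, added only to make the behavior visible:

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the predicate: the constructor only stores its
// inputs, and the pattern is compiled on the first getPattern() call.
final class LazyRegexpPredicateSketch {
  private final String _value;
  private final boolean _caseInsensitive;
  private Pattern _pattern = null;

  LazyRegexpPredicateSketch(String value, boolean caseInsensitive) {
    _value = value;
    _caseInsensitive = caseInsensitive;
    // Intentionally no compilation here: callers that never ask for the
    // pattern never pay for building it.
  }

  // Illustration-only accessor showing whether compilation has happened yet.
  boolean isPatternCompiled() {
    return _pattern != null;
  }

  Pattern getPattern() {
    if (_pattern == null) {
      int flags = _caseInsensitive ? Pattern.CASE_INSENSITIVE : 0;
      _pattern = Pattern.compile(_value, flags);
    }
    return _pattern;
  }
}
```

A code path that never calls `getPattern()` leaves `_pattern` as `null` for the lifetime of the predicate.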
##########
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/RegexpLikePredicate.java:
##########
@@ -44,13 +57,42 @@ public String getValue() {
return _value;
}
+ public String getMatchParameter() {
Review Comment:
We can change this to `boolean isCaseInsensitive()`
##########
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/RegexpLikePredicate.java:
##########
@@ -24,15 +24,28 @@
import org.apache.pinot.common.utils.regex.PatternFactory;
/**
- * Predicate for REGEXP_LIKE.
+ * Predicate for REGEXP_LIKE with optional match parameters
*/
public class RegexpLikePredicate extends BasePredicate {
private final String _value;
+ private final String _matchParameter;
private Pattern _pattern = null;
public RegexpLikePredicate(ExpressionContext lhs, String value) {
super(lhs);
_value = value;
+ _matchParameter = "c";
Review Comment:
Let's use `null` as the default when the match parameter is not provided.
That way we can distinguish an explicit `'c'` from a missing parameter.
##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/regexp/RegexpLikeConstFunctions.java:
##########
@@ -34,17 +35,53 @@ public class RegexpLikeConstFunctions {
@ScalarFunction
public boolean regexpLike(String inputStr, String regexPatternStr) {
if (_matcher == null) {
- _matcher = PatternFactory.compile(regexPatternStr).matcher("");
+ Pattern p = PatternFactory.compile(regexPatternStr);
+ _matcher = p.matcher("");
Review Comment:
^^
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]