gortiz commented on code in PR #8893:
URL: https://github.com/apache/pinot/pull/8893#discussion_r913448898
##########
pinot-common/src/main/java/org/apache/pinot/common/utils/RegexpPatternConverterUtils.java:
##########
@@ -33,7 +33,49 @@ private RegexpPatternConverterUtils() {
* Converts a LIKE pattern into REGEXP_LIKE pattern.
*/
public static String likeToRegexpLike(String likePattern) {
- return "^" + escapeMetaCharacters(likePattern).replace('_', '.').replace("%", ".*") + "$";
+ int start = 0;
+ int end = likePattern.length();
+ String prefix = "^";
+ String suffix = "$";
+ switch (likePattern.length()) {
+ case 0:
+ return "^$";
+ case 1:
+ if (likePattern.charAt(0) == '%') {
+ return "^.*$";
+ }
+ break;
+ default:
+ if (likePattern.charAt(0) == '%') {
Review Comment:
> do we plan to optimize something similar to
I don't think this is the place to do that, because we don't want to optimize
only `LIKE '%%%%%%%%%%%%%zz'`; we also want to optimize `REGEXP_LIKE(col,
'((((((.*)*)*)*)*)*)*zz')`.
I mean:
1. we transform LIKE expressions into REGEXP_LIKE
2. we let users write their own REGEXP_LIKE expressions
3. we know some regexes in REGEXP_LIKE are dangerous
We should not focus on making 1. safe; we should focus on making 3. safe.
Otherwise an attacker may not be able to use LIKE to mount an attack, but they
could still use REGEXP_LIKE.
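
To illustrate the point, here is a minimal sketch of a guard that sits at the regex evaluation level, so it covers both the LIKE path and user-written REGEXP_LIKE patterns. Everything in it is an assumption made for illustration: `RegexGuardSketch`, `DeadlineCharSequence`, `guardedFind` and the time budget are hypothetical names, `naiveLikeToRegexpLike` is a simplified stand-in for the old conversion (without escaping), and none of this is how Pinot evaluates regexes today.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not Pinot code: one guard shared by LIKE and REGEXP_LIKE.
final class RegexGuardSketch {

  /** Wraps the input so that a long-running match is aborted once the deadline passes. */
  static final class DeadlineCharSequence implements CharSequence {
    private final CharSequence _inner;
    private final long _deadlineNanos;

    DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
      _inner = inner;
      _deadlineNanos = deadlineNanos;
    }

    @Override
    public char charAt(int index) {
      // The regex engine reads the input through charAt, so heavy backtracking
      // keeps coming back here and eventually trips the deadline check.
      if (System.nanoTime() - _deadlineNanos > 0) {
        throw new IllegalStateException("regex evaluation exceeded its time budget");
      }
      return _inner.charAt(index);
    }

    @Override
    public int length() {
      return _inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
      return new DeadlineCharSequence(_inner.subSequence(start, end), _deadlineNanos);
    }

    @Override
    public String toString() {
      return _inner.toString();
    }
  }

  /** Simplified stand-in for the old LIKE conversion (meta-character escaping omitted). */
  static String naiveLikeToRegexpLike(String likePattern) {
    return "^" + likePattern.replace('_', '.').replace("%", ".*") + "$";
  }

  /** Single entry point for every regex, regardless of where the pattern came from. */
  static boolean guardedFind(String regex, String value, long budgetMillis) {
    long deadlineNanos = System.nanoTime() + budgetMillis * 1_000_000L;
    return Pattern.compile(regex).matcher(new DeadlineCharSequence(value, deadlineNanos)).find();
  }

  public static void main(String[] args) {
    // Pattern produced from a LIKE expression.
    String fromLike = naiveLikeToRegexpLike("%%%zz");
    // Pattern a user could write directly in REGEXP_LIKE.
    String userWritten = "(((.*)*)*)*zz";

    System.out.println(fromLike + " -> " + guardedFind(fromLike, "abczz", 1000));
    System.out.println(userWritten + " -> " + guardedFind(userWritten, "abczz", 1000));
  }
}
```

Both patterns go through the same `guardedFind` call, so hardening that one place protects against a dangerous pattern no matter whether it was generated from LIKE or typed by the user. A deadline is only one possible guard; rewriting or rejecting known-bad patterns before compiling them would be another option at the same level.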
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]