[
https://issues.apache.org/jira/browse/CALCITE-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825002#comment-17825002
]
EveyWu commented on CALCITE-6278:
----------------------------------
1. Sure. In my opinion, 'unescaping' in this case means handling backslashes
in character literals.
2. Indeed, if a query takes its input from table fields instead of literal
strings, the results can be inconsistent, because the data may already have
been escaped when it was inserted.
A possible solution is to fall back to unescaping when the first match fails,
which may widen the range of inputs that `rlike` matches.
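{code:java}
boolean find;
try {
  // First attempt: use the pattern exactly as given.
  find = cache.getUnchecked(new Key(0, pattern)).matcher(s).find();
} catch (Exception patternException) {
  // An invalid pattern is treated as "no match" so the fallback can run.
  find = false;
}
if (!find) {
  // Fallback: unescape both the input and the pattern, then retry.
  s = StringEscapeUtils.unescapeJava(s);
  pattern = StringEscapeUtils.unescapeJava(pattern);
  find = cache.getUnchecked(new Key(0, pattern)).matcher(s).find();
}
return find;
{code}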
3. For the `rlike` function, I currently can't think of a better way to keep
the behavior of table field queries and literal string queries consistent
with both Spark and Hive at the same time. I will roll back the unescape
handling.
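To make the fallback idea in point 2 concrete, here is a minimal, self-contained sketch. It is an illustration under assumptions, not the code in the pull request: the class name `RlikeFallbackSketch` and the helpers `strictFind`, `tryFind` and `unescapeBackslashes` are hypothetical, the Guava cache is left out, and `unescapeBackslashes` is a simplified stand-in for `StringEscapeUtils.unescapeJava` that only collapses a doubled backslash into a single one. Unlike the snippet above, the retry also treats an invalid pattern as "no match".

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical sketch of the "match, then unescape and retry" fallback. */
final class RlikeFallbackSketch {
  private RlikeFallbackSketch() {}

  /** Matches with the pattern exactly as given; no unescaping. */
  static boolean strictFind(String s, String pattern) {
    return Pattern.compile(pattern).matcher(s).find();
  }

  /** Like strictFind, but an invalid pattern counts as "no match". */
  static boolean tryFind(String s, String pattern) {
    try {
      return strictFind(s, pattern);
    } catch (PatternSyntaxException e) {
      return false;
    }
  }

  /**
   * Simplified stand-in (assumption) for StringEscapeUtils.unescapeJava:
   * collapses each doubled backslash into a single backslash and leaves
   * everything else untouched.
   */
  static String unescapeBackslashes(String s) {
    return s.replace("\\\\", "\\");
  }

  /** First try the raw pattern; if nothing matches, unescape both and retry. */
  static boolean rlike(String s, String pattern) {
    if (tryFind(s, pattern)) {
      return true;
    }
    return tryFind(unescapeBackslashes(s), unescapeBackslashes(pattern));
  }

  public static void main(String[] args) {
    // The Java literal "\\\\d" is the three characters \\d, i.e. the text a
    // parser that does not unescape would pass through for the SQL literal '\\d'.
    String raw = "\\\\d";
    System.out.println(strictFind("abc123", raw)); // regex: literal backslash + 'd'
    System.out.println(rlike("abc123", raw));      // retry uses \d, a digit class
    System.out.println(rlike("abc", raw));         // no digit, so still no match
  }
}
```

With the raw pattern `\\d`, the strict match looks for a literal backslash followed by `d` and finds nothing in `abc123`; the retry uses `\d` and matches the digit. This is the widening mentioned in point 2: inputs that the raw pattern rejects can be accepted after unescaping.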
> Add REGEXP, REGEXP_LIKE function (enabled in Spark library)
> ------------------------------------------------------------
>
> Key: CALCITE-6278
> URL: https://issues.apache.org/jira/browse/CALCITE-6278
> Project: Calcite
> Issue Type: Improvement
> Reporter: EveyWu
> Priority: Minor
> Labels: pull-request-available
> Attachments: image-2024-03-07-09-32-27-002.png,
> image-2024-03-09-11-13-49-064.png, image-2024-03-09-11-37-27-816.png,
> image-2024-03-09-11-38-08-797.png
>
>
> Add Spark functions that have been implemented but have different
> OperandTypes/Returns.
> Add Function
> [REGEXP|https://spark.apache.org/docs/latest/api/sql/index.html#regexp],
> [REGEXP_LIKE|https://spark.apache.org/docs/latest/api/sql/index.html#regexp_like]
> # Since this function has the same implementation as the Spark
> [RLIKE|https://spark.apache.org/docs/latest/api/sql/index.html#rlike]
> function, the implementation can be directly reused.
> # -Since Spark 2.0, string literals (including regex patterns) are unescaped
> in SQL parser, also fix this bug in calcite.-
>
>