Johnnathanalmeida commented on code in PR #13395:
URL: https://github.com/apache/arrow/pull/13395#discussion_r920061528
##########
cpp/src/gandiva/regex_functions_holder.cc:
##########
@@ -76,10 +57,114 @@ const FunctionNode LikeHolder::TryOptimize(const FunctionNode& node) {
return node;
}
-Status LikeHolder::Make(const FunctionNode& node, std::shared_ptr<LikeHolder>* holder) {
- ARROW_RETURN_IF(node.children().size() != 2 && node.children().size() != 3,
- Status::Invalid("'like' function requires two or three parameters"));
+const FunctionNode SQLLikeHolder::TryOptimize(const FunctionNode& node) {
+ if (node.descriptor()->name() == "ilike") {
+ // Optimizations don't work for case-insensitive matching
+ return node;
+ }
+
+ std::string pcre_pattern;
+ auto pattern_result = GetPattern(node);
+ if (!pattern_result.ok()) {
+ return node;
+ } else {
+ pcre_pattern = pattern_result.ValueOrDie();
+ }
+
+ auto literal_type = node.children().at(1)->return_type();
+ auto pcre_node =
+ std::make_shared<LiteralNode>(literal_type, LiteralHolder(pcre_pattern), false);
+ auto new_node = FunctionNode("regexp_matches", {node.children().at(0), pcre_node},
+ node.return_type());
+
+ auto optimized_node = RegexpMatchesHolder::TryOptimize(new_node);
+
+ if (optimized_node.descriptor()->name() != "regexp_matches") {
Review Comment:
@projjal, as I understand the code, there is only one optimizer, and it
lives in the "RegexpMatchesHolder::TryOptimize" method.
The "SQLLikeHolder::TryOptimize" method therefore builds a temporary
"regexp_matches" node solely so that it can call
"RegexpMatchesHolder::TryOptimize". If that call cannot optimize the pattern
(none of starts_with_regex_, ends_with_regex_, is_substr_regex_ matches), the
temporary node is discarded and the original node that arrived as the
parameter is returned instead.
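To make the fallback concrete, here is a minimal, self-contained sketch of the pattern. It is not the Gandiva code: `Node`, `LikeToRegex`, `TryOptimizeRegexp` and `TryOptimizeLike` are hypothetical stand-ins for `FunctionNode`, `GetPattern`, `RegexpMatchesHolder::TryOptimize` and `SQLLikeHolder::TryOptimize`, and the pattern handling is deliberately simplified (only `%` wildcards and alphanumeric literals).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for FunctionNode: function name, input column,
// and the literal pattern argument.
struct Node {
  std::string name;
  std::string column;
  std::string pattern;
};

// True when s is a non-empty run of alphanumeric characters.
bool IsLiteral(const std::string& s) {
  return !s.empty() && std::all_of(s.begin(), s.end(), [](unsigned char c) {
    return std::isalnum(c) != 0;
  });
}

// Simplified LIKE -> regex conversion (assumption: only '%' is supported).
// Returns false for anything else, mirroring a failed GetPattern call.
bool LikeToRegex(const std::string& like, std::string* out) {
  out->clear();
  for (unsigned char c : like) {
    if (c == '%') {
      *out += ".*";
    } else if (std::isalnum(c)) {
      *out += static_cast<char>(c);
    } else {
      return false;
    }
  }
  return true;
}

// The single optimizer: rewrites a "regexp_matches" node into a cheaper
// function when the regex has a simple shape, else returns it unchanged.
Node TryOptimizeRegexp(const Node& node) {
  const std::string& p = node.pattern;
  const bool lead = p.size() >= 2 && p.compare(0, 2, ".*") == 0;
  const bool trail = p.size() >= 2 && p.compare(p.size() - 2, 2, ".*") == 0;
  if (lead && trail && p.size() > 4) {
    const std::string mid = p.substr(2, p.size() - 4);
    if (IsLiteral(mid)) return {"is_substr", node.column, mid};
  } else if (trail && !lead) {
    const std::string prefix = p.substr(0, p.size() - 2);
    if (IsLiteral(prefix)) return {"starts_with", node.column, prefix};
  } else if (lead && !trail) {
    const std::string suffix = p.substr(2);
    if (IsLiteral(suffix)) return {"ends_with", node.column, suffix};
  }
  return node;
}

// The LIKE side: build a temporary regexp_matches node, delegate, and
// fall back to the original node when nothing was optimized.
Node TryOptimizeLike(const Node& node) {
  if (node.name == "ilike") return node;  // case-insensitive: skip
  std::string regex;
  if (!LikeToRegex(node.pattern, &regex)) return node;
  const Node temp{"regexp_matches", node.column, regex};
  const Node optimized = TryOptimizeRegexp(temp);
  if (optimized.name != "regexp_matches") return optimized;
  return node;  // discard the temporary node, keep the original
}
```

In this sketch, `TryOptimizeLike({"like", "c", "abc%"})` comes back as a `starts_with` node, while `TryOptimizeLike({"like", "c", "a%b"})` comes back as the original `like` node, because the temporary `regexp_matches` node was left untouched by the optimizer.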