Johnnathanalmeida commented on code in PR #13395:
URL: https://github.com/apache/arrow/pull/13395#discussion_r920061528


##########
cpp/src/gandiva/regex_functions_holder.cc:
##########
@@ -76,10 +57,114 @@ const FunctionNode LikeHolder::TryOptimize(const FunctionNode& node) {
   return node;
 }
 
-Status LikeHolder::Make(const FunctionNode& node, std::shared_ptr<LikeHolder>* holder) {
-  ARROW_RETURN_IF(node.children().size() != 2 && node.children().size() != 3,
-                  Status::Invalid("'like' function requires two or three parameters"));
+const FunctionNode SQLLikeHolder::TryOptimize(const FunctionNode& node) {
+  if (node.descriptor()->name() == "ilike") {
+    // Optimizations don't work for case-insensitive matching
+    return node;
+  }
+
+  std::string pcre_pattern;
+  auto pattern_result = GetPattern(node);
+  if (!pattern_result.ok()) {
+    return node;
+  } else {
+    pcre_pattern = pattern_result.ValueOrDie();
+  }
+
+  auto literal_type = node.children().at(1)->return_type();
+  auto pcre_node =
+      std::make_shared<LiteralNode>(literal_type, LiteralHolder(pcre_pattern), false);
+  auto new_node = FunctionNode("regexp_matches", {node.children().at(0), pcre_node},
+                               node.return_type());
+
+  auto optimized_node = RegexpMatchesHolder::TryOptimize(new_node);
+
+  if (optimized_node.descriptor()->name() != "regexp_matches") {

Review Comment:
   @projjal, as I understand the code, there is only one optimizer, and it lives in "RegexpMatchesHolder::TryOptimize".
   
   "SQLLikeHolder::TryOptimize" therefore builds a temporary "regexp_matches" node purely so that it can call "RegexpMatchesHolder::TryOptimize" on it. If that call could not apply any optimization (starts_with_regex_, ends_with_regex_, is_substr_regex_), the result is still a "regexp_matches" node, so the temporary node is discarded and the original node that arrived as the parameter is used instead.
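
   To make the control flow concrete, here is a minimal, self-contained sketch of the pattern as I read it. It is not the Gandiva code: `Node`, `SqlLikeToRegex`, `TryOptimizeRegexp` and `TryOptimizeLike` are hypothetical stand-ins, the SQL-to-regex conversion only handles `%` and `_`, and the only optimization modelled is the "starts with" case.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a function node: just a function name and a pattern.
struct Node {
  std::string name;
  std::string pattern;
};

// Hypothetical, simplified conversion: '%' becomes ".*" and '_' becomes ".".
// Other characters are copied as-is (no escaping), which the real code would need.
std::string SqlLikeToRegex(const std::string& like) {
  std::string regex;
  for (char c : like) {
    if (c == '%') {
      regex += ".*";
    } else if (c == '_') {
      regex += ".";
    } else {
      regex += c;
    }
  }
  return regex;
}

// Stand-in for the single optimizer: rewrites "<literal>.*" into a cheaper
// "starts_with" node, and returns the input unchanged otherwise.
Node TryOptimizeRegexp(const Node& node) {
  const std::string suffix = ".*";
  const std::string& p = node.pattern;
  if (p.size() > suffix.size() &&
      p.compare(p.size() - suffix.size(), suffix.size(), suffix) == 0) {
    std::string prefix = p.substr(0, p.size() - suffix.size());
    // Only optimize when the prefix contains no regex metacharacters.
    if (prefix.find_first_of(".*+?[](){}|^$\\") == std::string::npos) {
      return Node{"starts_with", prefix};
    }
  }
  return node;
}

// Stand-in for the "like" optimizer: wrap the pattern in a temporary
// "regexp_matches" node, delegate, and fall back to the original node
// when the delegate did not change the function name.
Node TryOptimizeLike(const Node& node) {
  Node temp{"regexp_matches", SqlLikeToRegex(node.pattern)};
  Node optimized = TryOptimizeRegexp(temp);
  if (optimized.name != "regexp_matches") {
    return optimized;  // an optimization applied, use the rewritten node
  }
  return node;  // nothing applied, keep the original "like" node
}
```

   With these stand-ins, a pattern such as `abc%` comes back as a `starts_with` node, while `a_c` comes back as the original `like` node rather than the temporary `regexp_matches` one.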



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
