js8544 commented on code in PR #33806:
URL: https://github.com/apache/arrow/pull/33806#discussion_r1083265094


##########
cpp/src/arrow/compute/kernels/scalar_string_ascii.cc:
##########
@@ -1505,6 +1508,13 @@ struct MatchLike {
     static const RE2 kLikePatternIsStartsWith(R"(([^%_]*[^\\%_])?%+)", 
kRE2Options);
     // A LIKE pattern matching this regex can be translated into a suffix 
search.
     static const RE2 kLikePatternIsEndsWith(R"(%+([^%_]*))", kRE2Options);
+    static bool global_checked = false;
+    if (ARROW_PREDICT_FALSE(!global_checked)) {
+      RETURN_NOT_OK(RegexStatus(kLikePatternIsSubstringMatch));
+      RETURN_NOT_OK(RegexStatus(kLikePatternIsStartsWith));
+      RETURN_NOT_OK(RegexStatus(kLikePatternIsEndsWith));

Review Comment:
   I think it does though. `global_checked` is set to true only if all three 
patterns are ok.



##########
cpp/src/arrow/compute/kernels/scalar_string_ascii.cc:
##########
@@ -1505,6 +1508,13 @@ struct MatchLike {
     static const RE2 kLikePatternIsStartsWith(R"(([^%_]*[^\\%_])?%+)", 
kRE2Options);
     // A LIKE pattern matching this regex can be translated into a suffix 
search.
     static const RE2 kLikePatternIsEndsWith(R"(%+([^%_]*))", kRE2Options);
+    static bool global_checked = false;

Review Comment:
   I also feel like a static boolean is enough for our case. As it avoids false 
positives, i.e., erroneous regexes are always checked.



##########
cpp/src/arrow/compute/kernels/scalar_string_ascii.cc:
##########
@@ -1505,6 +1508,13 @@ struct MatchLike {
     static const RE2 kLikePatternIsStartsWith(R"(([^%_]*[^\\%_])?%+)", 
kRE2Options);
     // A LIKE pattern matching this regex can be translated into a suffix 
search.
     static const RE2 kLikePatternIsEndsWith(R"(%+([^%_]*))", kRE2Options);
+    static bool global_checked = false;

Review Comment:
   I thought about using `std::call_once` but the only solution I can come up 
with uses exceptions:
   ```cpp
       static std::once_flag global_checked;
       try {
         std::call_once(global_checked, [] {
           auto check_global = [] {
             RETURN_NOT_OK(RegexStatus(kLikePatternIsSubstringMatch));
             RETURN_NOT_OK(RegexStatus(kLikePatternIsStartsWith));
             RETURN_NOT_OK(RegexStatus(kLikePatternIsEndsWith));
           };
           auto status = check_global();
           if (!status.ok()) { // throw an exception so that once_flag is not 
set
             throw std::invalid_argument(status.ToString());
           }
         });
       } catch (const std::invalid_argument& e) {
         return Status::Invalid(e.what());
       }
   ```
   I couldn't find a more elegant solution. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to