[ 
https://issues.apache.org/jira/browse/ARROW-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433951#comment-16433951
 ] 

ASF GitHub Bot commented on ARROW-2224:
---------------------------------------

pitrou commented on a change in pull request #1880: ARROW-2224: [C++] Remove boost-regex dependency
URL: https://github.com/apache/arrow/pull/1880#discussion_r180765500
 
 

 ##########
 File path: cpp/src/arrow/util/decimal.cc
 ##########
 @@ -253,117 +251,131 @@ static void StringToInteger(const std::string& str, Decimal128* out) {
   }
 }
 
-static const boost::regex DECIMAL_REGEX(
-    // sign of the number
-    "(?<SIGN>[-+]?)"
-
-    // digits around the decimal point
-    "(((?<LEFT_DIGITS>\\d+)\\.(?<FIRST_RIGHT_DIGITS>\\d*)|\\.(?<SECOND_RIGHT_DIGITS>\\d+)"
-    ")"
+namespace {
 
-    // optional exponent
-    "([eE](?<FIRST_EXP_VALUE>[-+]?\\d+))?"
+struct DecimalComponents {
+  std::string sign;
+  std::string whole_digits;
+  std::string fractional_digits;
+  std::string exponent_sign;
+  std::string exponent_digits;
+};
 
-    // otherwise
-    "|"
+inline bool IsSign(char c) { return (c == '-' || c == '+'); }
 
-    // we're just an integer
-    "(?<INTEGER>\\d+)"
+inline bool IsDot(char c) { return c == '.'; }
 
-    // or an integer with an exponent
-    "(?:[eE](?<SECOND_EXP_VALUE>[-+]?\\d+))?)");
+inline bool IsDigit(char c) { return (c >= '0' && c <= '9'); }
 
-static inline bool is_zero_character(char c) { return c == '0'; }
+inline bool StartsExponent(char c) { return (c == 'e' || c == 'E'); }
 
-Status Decimal128::FromString(const std::string& s, Decimal128* out, int32_t* precision,
-                              int32_t* scale) {
-  if (s.empty()) {
-    return Status::Invalid("Empty string cannot be converted to decimal");
+inline size_t ParseDigitsRun(const char* s, size_t start, size_t size, std::string* out) {
+  size_t pos;
+  for (pos = start; pos < size; ++pos) {
+    if (!IsDigit(s[pos])) {
+      break;
+    }
   }
+  *out = std::string(s + start, pos - start);
+  return pos;
+}
 
-  // case of all zeros
-  if (std::all_of(s.cbegin(), s.cend(), is_zero_character)) {
-    if (precision != nullptr) {
-      *precision = 0;
-    }
+bool ParseDecimalComponents(const char* s, size_t size, DecimalComponents* out) {
 
 Review comment:
   I mean that if the parse function takes a `string::const_iterator`, it may
not accept any other kind of iterator. Otherwise we would need to make it a
template function, piling on more layers of abstraction without any concrete
advantage.
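
   To illustrate the trade-off, here is a minimal sketch, not the code in this
pull request. `IsDigit` and `ParseDigitsRun` mirror the helpers in the diff
above (with an added precondition assert); the two `LeadingDigits` overloads
and the `SkipDigits` template are hypothetical names introduced only for this
example. The pointer-plus-size function serves any contiguous character
storage as a plain function, while the iterator version has to be a template
to accept more than one iterator type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

inline bool IsDigit(char c) { return c >= '0' && c <= '9'; }

// Pointer + size interface, as in the diff: copies the run of digits that
// starts at `start` into *out and returns the index one past that run.
inline std::size_t ParseDigitsRun(const char* s, std::size_t start, std::size_t size,
                                  std::string* out) {
  assert(start <= size);  // precondition added for this sketch
  std::size_t pos;
  for (pos = start; pos < size; ++pos) {
    if (!IsDigit(s[pos])) {
      break;
    }
  }
  *out = std::string(s + start, pos - start);
  return pos;
}

// Hypothetical callers: different containers, same non-template parser.
inline std::string LeadingDigits(const std::string& s) {
  std::string out;
  ParseDigitsRun(s.data(), 0, s.size(), &out);
  return out;
}

inline std::string LeadingDigits(const std::vector<char>& buf) {
  std::string out;
  ParseDigitsRun(buf.data(), 0, buf.size(), &out);
  return out;
}

// Iterator-based alternative (hypothetical): it must be a template to accept
// both std::string::const_iterator and, say, std::vector<char>::const_iterator.
template <typename Iterator>
Iterator SkipDigits(Iterator first, Iterator last) {
  while (first != last && IsDigit(*first)) {
    ++first;
  }
  return first;
}
```

   With the pointer-based signature, callers holding a `std::string`, a
`std::vector<char>`, or a raw buffer all reach the same compiled function
through `data()` and `size()`.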

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] Replace boost regex usage with libre2
> -------------------------------------------
>
>                 Key: ARROW-2224
>                 URL: https://issues.apache.org/jira/browse/ARROW-2224
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>            Priority: Major
>              Labels: pull-request-available
>
> We're using {{boost::regex}} to parse decimal strings for {{decimal128}} 
> types. We should use {{libre2}} instead.



