[ https://issues.apache.org/jira/browse/ARROW-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433659#comment-16433659 ]

ASF GitHub Bot commented on ARROW-2224:
---------------------------------------

pitrou commented on a change in pull request #1880: [WIP] ARROW-2224: [C++] Remove boost-regex dependency
URL: https://github.com/apache/arrow/pull/1880#discussion_r180698163
 
 

 ##########
 File path: cpp/src/arrow/util/decimal.cc
 ##########
 @@ -253,117 +251,131 @@ static void StringToInteger(const std::string& str, Decimal128* out) {
   }
 }
 
-static const boost::regex DECIMAL_REGEX(
-    // sign of the number
-    "(?<SIGN>[-+]?)"
-
-    // digits around the decimal point
-    "(((?<LEFT_DIGITS>\\d+)\\.(?<FIRST_RIGHT_DIGITS>\\d*)|\\.(?<SECOND_RIGHT_DIGITS>\\d+)"
-    ")"
+namespace {
 
-    // optional exponent
-    "([eE](?<FIRST_EXP_VALUE>[-+]?\\d+))?"
+struct DecimalComponents {
+  std::string sign;
+  std::string whole_digits;
+  std::string fractional_digits;
+  std::string exponent_sign;
+  std::string exponent_digits;
+};
 
-    // otherwise
-    "|"
+inline bool IsSign(char c) { return (c == '-' || c == '+'); }
 
-    // we're just an integer
-    "(?<INTEGER>\\d+)"
+inline bool IsDot(char c) { return c == '.'; }
 
-    // or an integer with an exponent
-    "(?:[eE](?<SECOND_EXP_VALUE>[-+]?\\d+))?)");
+inline bool IsDigit(char c) { return (c >= '0' && c <= '9'); }
 
-static inline bool is_zero_character(char c) { return c == '0'; }
+inline bool StartsExponent(char c) { return (c == 'e' || c == 'E'); }
 
-Status Decimal128::FromString(const std::string& s, Decimal128* out, int32_t* precision,
-                              int32_t* scale) {
-  if (s.empty()) {
-    return Status::Invalid("Empty string cannot be converted to decimal");
+inline size_t ParseDigitsRun(const char* s, size_t start, size_t size, std::string* out) {
+  size_t pos;
+  for (pos = start; pos < size; ++pos) {
+    if (!IsDigit(s[pos])) {
+      break;
+    }
   }
+  *out = std::string(s + start, pos - start);
+  return pos;
+}
 
-  // case of all zeros
-  if (std::all_of(s.cbegin(), s.cend(), is_zero_character)) {
-    if (precision != nullptr) {
-      *precision = 0;
-    }
+bool ParseDecimalComponents(const char* s, size_t size, DecimalComponents* out) {
 
 Review comment:
   I've decided against this. Not only would it be slightly less readable and 
more annoying to type, but the day we have a string view or span, the current 
solution adapts seamlessly. The "abstracted" variant OTOH is less abstracted 
from that point of view.
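
To illustrate the point about string views, here is a minimal sketch of how a (pointer, size) interface can serve both std::string and a view type without changing the parser. ParseDigitsRun and IsDigit mirror the helpers in the diff above; DigitsFromString and DigitsFromView are hypothetical call sites added only for illustration, and std::string_view (C++17) is an assumption standing in for whatever view or span type is eventually available.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>  // C++17; assumed here as a stand-in for a future view type

inline bool IsDigit(char c) { return c >= '0' && c <= '9'; }

// Same shape as the helper in the diff: scans digits in s[start, size),
// copies the run into *out, and returns the index of the first non-digit.
inline std::size_t ParseDigitsRun(const char* s, std::size_t start, std::size_t size,
                                  std::string* out) {
  std::size_t pos;
  for (pos = start; pos < size; ++pos) {
    if (!IsDigit(s[pos])) {
      break;
    }
  }
  *out = std::string(s + start, pos - start);
  return pos;
}

// Hypothetical call site: parsing from an owning std::string.
inline std::size_t DigitsFromString(const std::string& str, std::string* out) {
  return ParseDigitsRun(str.data(), 0, str.size(), out);
}

// Hypothetical call site: parsing from a non-owning view.
// The parser itself is untouched; only data() and size() are needed.
inline std::size_t DigitsFromView(std::string_view view, std::string* out) {
  return ParseDigitsRun(view.data(), 0, view.size(), out);
}
```

Both wrappers forward data() and size() unchanged, so under these assumptions supporting a view type needs no change to the parsing code itself.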

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [C++] Replace boost regex usage with libre2
> -------------------------------------------
>
>                 Key: ARROW-2224
>                 URL: https://issues.apache.org/jira/browse/ARROW-2224
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>            Priority: Major
>              Labels: pull-request-available
>
> We're using {{boost::regex}} to parse decimal strings for {{decimal128}} 
> types. We should use {{libre2}} instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
