Re: [PR] feat: support scientific notation in `parse_float_as_decimal ` [arrow-datafusion]

via GitHub Fri, 15 Dec 2023 03:30:58 -0800


alamb commented on code in PR #8494:
URL: https://github.com/apache/arrow-datafusion/pull/8494#discussion_r1427871571



##########
datafusion/sql/src/expr/value.rs:
##########
@@ -405,39 +407,48 @@ const fn try_decode_hex_char(c: u8) -> Option<u8> {
 }
 
 /// Parse Decimal128 from a string
-///
-/// TODO: support parsing from scientific notation
 fn parse_decimal_128(unsigned_number: &str, negative: bool) -> Result<Expr> {
-    // remove leading zeroes
-    let trimmed = unsigned_number.trim_start_matches('0');
-    // parse precision and scale, remove decimal point if exists
-    let (precision, scale, replaced_str) = if trimmed == "." {
-        // special cases for numbers such as “0.”, “000.”, and so on.
-        (1, 0, Cow::Borrowed("0"))
-    } else if let Some(i) = trimmed.find('.') {
+    let big_decimal = BigDecimal::from_str(unsigned_number).map_err(|e| {
+        DataFusionError::from(ParserError(format!(
+            "Cannot parse {unsigned_number} into BigDecimal when building 
decimal: {e}"
+        )))
+    })?;
+
+    let replaced_str = big_decimal.to_string().replace('.', "");

Review Comment:
   Once you have parsed the data into a BigDecimal, I think you could just 
convert to a `i128` directly which would be more performant and I think 
potentially more correct
   
   Perhaps you could use 
   
https://docs.rs/bigdecimal/latest/bigdecimal/struct.BigDecimal.html#method.fractional_digit_count
 to get the scale 
   
   And then compute the precision and the `i128` from 
   
https://docs.rs/bigdecimal/latest/bigdecimal/struct.BigDecimal.html#method.into_bigint_and_exponent
   



##########
datafusion/sql/tests/sql_integration.rs:
##########
@@ -43,8 +43,8 @@ fn parse_decimals() {
     let test_data = [
         ("1", "Int64(1)"),
         ("001", "Int64(1)"),
-        ("0.1", "Decimal128(Some(1),1,1)"),
-        ("0.01", "Decimal128(Some(1),2,2)"),
+        ("0.1", "Decimal128(Some(1),2,1)"),
+        ("0.01", "Decimal128(Some(1),3,2)"),

Review Comment:
   🤔 I don't understand why they were wrong
   
   I think it is possible to fit `0.1` into a Decimal(1,1):
   
   ```
   ❯ select arrow_cast(0.1, 'Decimal128(1,1)') as d;
   +-----+
   | d   |
   +-----+
   | 0.1 |
   +-----+
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] feat: support scientific notation in `parse_float_as_decimal ` [arrow-datafusion]

Reply via email to