[GitHub] [arrow] jorisvandenbossche commented on pull request #9948: ARROW-12150: [Python] Correctly infer type of mixed-precision Decimals

GitBox Mon, 12 Apr 2021 07:25:19 -0700


jorisvandenbossche commented on pull request #9948:
URL: https://github.com/apache/arrow/pull/9948#issuecomment-817857305



   @pitrou thanks a lot for that input.
   
   To be explicit: do you think this logic should be used for *all* inference 
code paths? 
   
   I ask because our inference of precision/scale from a string
   (`Decimal128::FromString`) currently behaves differently from inference
   from a Python decimal (`InferDecimalPrecisionAndScale`) when it comes to
   trailing zeros. For a value like `Decimal('123E+2')`, the first gives a
   (precision, scale) of (5, 0), while the second gives (3, -2).
   This difference causes a problem: we first infer the type from the Python
   decimal in a first pass, but the actual conversion in the second pass goes
   through the string representation of the decimals. The two passes arrive
   at conflicting types, even for non-mixed decimals. For example, with a
   single decimal (on master):
   
   ```
   In [1]: from decimal import Decimal
      ...: import pyarrow as pa
   
   In [2]: pa.array([Decimal('123E+2')])
   ...
   ArrowInvalid: Decimal type with precision 5 does not fit into precision inferred from first array element: 3
   ../src/arrow/python/python_to_arrow.cc:169  internal::DecimalFromPyObject(obj, *type, &value)
   ../src/arrow/python/python_to_arrow.cc:486  PyValue::Convert(this->primitive_type_, this->options_, value)
   ../src/arrow/python/iterators.h:69  func(value, static_cast<int64_t>(i), &keep_going)
   ../src/arrow/python/python_to_arrow.cc:1055  converter->Extend(seq, size)
   
   In [3]: pa.infer_type([Decimal('123E+2')])
   Out[3]: Decimal128Type(decimal128(3, -2))
   ```
   
   I think the logic you propose matches the current logic for inferring from
   a Python decimal, but not the logic for inferring from a string.
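
To make the two readings of `Decimal('123E+2')` concrete, here is a minimal sketch using only the standard-library `decimal` module. The two helper functions are hypothetical names made up for illustration; they are not Arrow's implementation, they only mimic the two outcomes described above. The sketch is simplified: it ignores special values (NaN, Infinity) and the case where the scale exceeds the number of digits.

```python
from decimal import Decimal


def precision_scale_keep_exponent(value):
    """Hypothetical helper: read (precision, scale) straight from the
    digit tuple and exponent, so a positive exponent gives a negative scale."""
    _, digits, exponent = value.as_tuple()
    return len(digits), -exponent


def precision_scale_expand_zeros(value):
    """Hypothetical helper: write a positive exponent out as trailing
    zeros first, so the scale never goes below zero."""
    _, digits, exponent = value.as_tuple()
    if exponent > 0:
        return len(digits) + exponent, 0
    return len(digits), -exponent


d = Decimal("123E+2")
print(precision_scale_keep_exponent(d))  # (3, -2)
print(precision_scale_expand_zeros(d))   # (5, 0)
```

The two results differ only when the exponent is positive; for a value like `Decimal('1.23')` both helpers return (3, 2), which is why the conflict only shows up with trailing zeros.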


