jorisvandenbossche commented on pull request #9948:
URL: https://github.com/apache/arrow/pull/9948#issuecomment-831791116
For now, I pushed a minimal fix to ensure we don't raise for a different
precision if we rescale the decimal. That at least fixes the existing error for
eg `pa.array([Decimal('123E+2')])`.
But still running into some some more corner cases regarding the
precision/scale inference from a Python Decimal.
Using released version of pyarrow, we have an inconsistency in how to deal
with positive exponents:
```python
>>> pa.infer_type([decimal.Decimal('123e4')])
Decimal128Type(decimal128(7, 0))
>>> pa.infer_type([decimal.Decimal('123e2')])
Decimal128Type(decimal128(3, -2))
```
Those two cases should be interpreted consistently, and following
https://github.com/apache/arrow/pull/9948#issuecomment-817835991 they should
give `(3, -4)` and `(3, -2)`. The reason for this difference is this
`num_digits <= abs_exponent` check (so "123e4" with the exponent of 4 being
larger than number of digits of 3 is handled differently):
https://github.com/apache/arrow/blob/b3e43987c47b2f01b204a2d954f882f7161616ef/cpp/src/arrow/python/decimal.cc#L79-L84
This check is to ensure to add precision in case of leading zeros as in
`decimal.Decimal('0.0123')`:
```python
>>> decimal.Decimal('0.0123').as_tuple()
DecimalTuple(sign=0, digits=(1, 2, 3), exponent=-4)
>>> pa.infer_type([decimal.Decimal('0.0123')])
Decimal128Type(decimal128(4, 4))
```
So we could do this `num_additional_zeros` edit only when the exponent is
negative, to ensure that `Decimal('123e4')` and `Decimal('123e2')` (both with a
positive exponent) are treated consistently.
However, changing that would change the tested behaviour for:
```python
>>> pa.infer_type([decimal.Decimal('0.01E5')])
Decimal128Type(decimal128(4, 0))
```
to change to `decimal128(1, -3)`. Are we fine with changing this tested
behaviour for `decimal.Decimal('0.01E5')]` ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]