If we want to provide useful arithmetic and conversions, then full-blown decimal64 (and perhaps decimal32) is warranted.

If we want to easily expose and roundtrip PostgreSQL's fixed-scale money type with full binary precision, then I agree a canonical extension type is the way.

And we can of course do both.

Sidenote: I haven't seen many proposals for canonical extension types so far, which is a bit surprising. The barrier for standardizing a canonical extension type is much lower than for a new Arrow data type.

Regards

Antoine.


Le 09/11/2023 à 18:35, David Li a écrit :
cuDF has decimal32/decimal64 [1].

Would a canonical extension type [2] be appropriate here? I think that's come 
up as a solution before.

[1]: https://docs.rapids.ai/api/cudf/stable/user_guide/data-types/
[2]: https://arrow.apache.org/docs/format/CanonicalExtensions.html

On Thu, Nov 9, 2023, at 11:56, Antoine Pitrou wrote:
Or they could trivially use an int64 column for that, since the scale is
fixed anyway, and you're probably not going to multiply money values
together.
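The fixed-scale int64 idea can be sketched in plain Python (the scale of 4 and the helper names are illustrative assumptions; `decimal` is used only for parsing and printing):

```python
from decimal import Decimal

SCALE = 10_000  # fixed scale of 4 fractional digits, assumed for this sketch

def to_fixed(s: str) -> int:
    """Parse a decimal string into a scaled integer (int64-range value)."""
    return int(Decimal(s) * SCALE)

def fixed_to_str(v: int) -> str:
    """Render a scaled integer back as a decimal string."""
    return str(Decimal(v) / SCALE)

# Addition and subtraction stay exact at the fixed scale; only multiplying
# two money values together would require a rescale, which is rarely a
# meaningful operation for currency.
a = to_fixed("111111111111111.1111")
b = to_fixed("0.0001")
total = fixed_to_str(a + b)
```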


Le 09/11/2023 à 17:54, Curt Hagenlocher a écrit :
If Arrow had a decimal64 type, someone could choose to use that for a
PostgreSQL money column knowing that there are edge cases where they may
get an undesired result.

On Thu, Nov 9, 2023 at 8:42 AM Antoine Pitrou <anto...@python.org> wrote:


Le 09/11/2023 à 17:23, Curt Hagenlocher a écrit :
Or more succinctly, "111,111,111,111,111.1111" will fit into a decimal64;
would you prevent it from being stored in one so that you can describe the
column as "decimal(18, 4)"?

That's what we do for other decimal types; see PyArrow below:
```
>>> pa.array([111_111_111_111_111_1111]).cast(pa.decimal128(18, 0))
Traceback (most recent call last):
  [...]
ArrowInvalid: Precision is not great enough for the result. It should be
at least 19
```


