If we want to provide useful arithmetic and conversions, then full-blown
decimal64 (and perhaps decimal32) is warranted.
If we want to easily expose and round-trip PostgreSQL's fixed-scale money
type with full binary precision, then I agree a canonical extension type
is the way to go.
And we can of course do both.
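For illustration, here is a minimal sketch of what such an extension type
could look like in PyArrow. The `arrow.money` name, the currency/scale
parameters, and the int64 minor-units storage are assumptions for the
sketch, not an existing canonical type:

```
import pyarrow as pa

# Hypothetical fixed-scale money type: int64 storage holding minor
# units (e.g. cents), parametrized by currency code and scale.
class MoneyType(pa.ExtensionType):
    def __init__(self, currency="USD", scale=2):
        self.currency = currency
        self.scale = scale
        super().__init__(pa.int64(), "arrow.money")

    def __arrow_ext_serialize__(self):
        # Persist the parameters so the type round-trips through IPC.
        return f"{self.currency}:{self.scale}".encode()

    @classmethod
    def __arrow_ext_deserialize__(cls, storage_type, serialized):
        currency, scale = serialized.decode().split(":")
        return cls(currency, int(scale))

pa.register_extension_type(MoneyType())

# 12345 minor units at scale 2 represents 123.45
arr = pa.ExtensionArray.from_storage(MoneyType(),
                                     pa.array([12345], pa.int64()))
```

Consumers that don't know the extension still see the plain int64
storage, which is part of why the standardization barrier is lower.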
Sidenote: I haven't seen many proposals for canonical extension types
until now, which is a bit surprising. The barrier for standardizing a
canonical extension type is much lower than for a new Arrow data type.
Regards
Antoine.
On 09/11/2023 at 18:35, David Li wrote:
cuDF has decimal32/decimal64 [1].
Would a canonical extension type [2] be appropriate here? I think that's come
up as a solution before.
[1]: https://docs.rapids.ai/api/cudf/stable/user_guide/data-types/
[2]: https://arrow.apache.org/docs/format/CanonicalExtensions.html
On Thu, Nov 9, 2023, at 11:56, Antoine Pitrou wrote:
Or they could trivially use an int64 column for that, since the scale is
fixed anyway, and you're probably not going to multiply money values
together.
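A minimal sketch of that representation, assuming the conventional two
fractional digits (the actual fractional precision of PostgreSQL's money
follows the lc_monetary locale):

```
from decimal import Decimal
import pyarrow as pa

# Money as int64 minor units with an implied fixed scale of 2:
# 1999 represents 19.99.
cents = pa.array([1999, -500, 1_000_000], type=pa.int64())

# Sums and comparisons stay exact on the integers; rescale only at
# the edges, e.g. for display.
values = [Decimal(c) / 100 for c in cents.to_pylist()]
# [Decimal('19.99'), Decimal('-5'), Decimal('10000')]
```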
On 09/11/2023 at 17:54, Curt Hagenlocher wrote:
If Arrow had a decimal64 type, someone could choose to use that for a
PostgreSQL money column knowing that there are edge cases where they may
get an undesired result.
On Thu, Nov 9, 2023 at 8:42 AM Antoine Pitrou <anto...@python.org> wrote:
On 09/11/2023 at 17:23, Curt Hagenlocher wrote:
Or more succinctly: "111,111,111,111,111.1111" will fit into a decimal64;
would you prevent it from being stored in one so that you can describe
the column as "decimal(18, 4)"?
That's what we do for other decimal types; see PyArrow below:
```
>>> import pyarrow as pa
>>> pa.array([111_111_111_111_111_1111]).cast(pa.decimal128(18, 0))
Traceback (most recent call last):
[...]
ArrowInvalid: Precision is not great enough for the result. It should be at least 19
```
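For comparison, a sketch of the same cast with the precision widened to
19 digits, which the error message suggests should succeed (output
abridged):

```
>>> pa.array([111_111_111_111_111_1111]).cast(pa.decimal128(19, 0))
<pyarrow.lib.Decimal128Array object at ...>
[
  1111111111111111111
]
```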