Joris Van den Bossche created PARQUET-1869:
----------------------------------------------

             Summary: [C++] Large decimal values don't roundtrip correctly
                 Key: PARQUET-1869
                 URL: https://issues.apache.org/jira/browse/PARQUET-1869
             Project: Parquet
          Issue Type: Test
            Reporter: Joris Van den Bossche


Reproducer with Python:

{code}
import decimal
import pyarrow as pa
import pyarrow.parquet as pq

arr = pa.array([decimal.Decimal('9223372036854775808'),
                decimal.Decimal('1.111')])
print(arr)

pq.write_table(pa.table({'a': arr}), "test_decimal.parquet") 
result = pq.read_table("test_decimal.parquet")
print(result.column('a'))
{code}

gives

{code}
# before writing
<pyarrow.lib.Decimal128Array object at 0x7fd07d79a468>
[
  9223372036854775808.000,
  1.111
]
# after reading
<pyarrow.lib.ChunkedArray object at 0x7fd0711e9f98>
[
  [
    -221360928884514619.392,
    1.111
  ]
]
{code}

I tried reading the file with a different Parquet implementation (the
fastparquet Python package), and it gives the same values on read, so the
issue is more likely on the write side.
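As a sanity check on the numbers above, here is a small pure-Python sketch (not taken from the Arrow or Parquet code). It shows that the value read back is exactly what you get if the unscaled integer is cut down to 9 bytes and reinterpreted as a signed two's-complement number. The 9-byte width and the helper `truncate_signed` are assumptions chosen only because they reproduce the observed output; this does not establish where in the write or read path the truncation actually happens.

```python
from decimal import Decimal

scale = 3  # number of fractional digits, as in the printed array

# Unscaled integer behind 9223372036854775808.000
unscaled = 9223372036854775808 * 10**scale


def truncate_signed(value, n_bytes):
    """Keep the low n_bytes of value and reinterpret them as signed.

    Hypothetical helper for illustration only.
    """
    bits = 8 * n_bytes
    low = value & ((1 << bits) - 1)
    if low >= 1 << (bits - 1):
        low -= 1 << bits
    return low


# Assumption: the stored value is narrowed to 9 bytes somewhere.
truncated = truncate_signed(unscaled, 9)
roundtripped = Decimal(truncated).scaleb(-scale)
print(roundtripped)
```

With a 16-byte width the same helper returns the original unscaled value unchanged, so the mismatch in this sketch appears only when the width is too small for the value.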



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
