[
https://issues.apache.org/jira/browse/ARROW-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Phillip Cloud updated ARROW-1716:
---------------------------------
Description:
Surprisingly, the Java and C++ integration tests still pass after ARROW-1588.
This hides a bug, because we're writing decimal values as hex-encoded bytes.
C++ and Java only check that the bytes are the same, but because C++
interprets everything as little-endian after ARROW-1588 and Java is
big-endian, the numbers these bytes represent differ between the two
systems.
I propose that instead of encoding DecimalArray/DecimalVector values as
hex-encoded bytes, we store the integer as a string when writing Arrow
DecimalArray/DecimalVector data to JSON. This will allow us to check that the
bytes have the same meaning in both systems.
This requires a change to how Arrow writes JSON.
cc [~icexelloss] [~wesmckinn] [~jnadeau]
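
To make the failure mode concrete, here is a minimal sketch in Python. It is not Arrow's JSON writer; the helper names `hex_encode` and `hex_decode` and the 16-byte width are assumptions made for illustration only. It shows that an identical hex string decodes to different integers depending on the assumed byte order, while a string integer has no byte order to disagree about.

```python
WIDTH = 16  # assumed byte width of a decimal value, for illustration only


def hex_encode(value: int, byteorder: str) -> str:
    """Hypothetical helper: hex-encode the two's-complement bytes of value."""
    return value.to_bytes(WIDTH, byteorder=byteorder, signed=True).hex().upper()


def hex_decode(text: str, byteorder: str) -> int:
    """Hypothetical helper: interpret hex-encoded bytes as a signed integer."""
    return int.from_bytes(bytes.fromhex(text), byteorder=byteorder, signed=True)


unscaled = 12345  # e.g. the decimal 123.45 stored with scale 2

# A little-endian writer emits these bytes as hex.
written = hex_encode(unscaled, "little")

# A byte-for-byte comparison of the hex text passes on both sides,
# yet each side reads a different number out of it.
as_little = hex_decode(written, "little")
as_big = hex_decode(written, "big")
print(as_little == unscaled, as_big == unscaled)

# Proposed alternative: write the integer itself as a string.
as_string = str(unscaled)
print(int(as_string) == unscaled)
```

With the string form, a comparison of the JSON values is a comparison of the numbers themselves, so a byte-order disagreement between implementations can no longer go unnoticed.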
was:
Surprisingly, the Java and C++ integration tests still pass after ARROW-1588.
This hides a bug, because we're writing decimal values as hex-encoded bytes.
C++ and Java only check that the bytes are the same, but because C++
interprets everything as little-endian after ARROW-1588 and Java is
big-endian, the numbers these bytes represent differ between the two
systems.
I propose that instead of encoding DecimalArray/DecimalVector values as
hex-encoded bytes, we store the integer as a string when writing Arrow data to
JSON. This will allow us to check that the bytes have the same meaning in
both systems.
This requires a change to how Arrow writes JSON.
cc [~icexelloss] [~wesmckinn] [~jnadeau]
> [Format/JSON] Use string integer value for Decimals in JSON
> -----------------------------------------------------------
>
> Key: ARROW-1716
> URL: https://issues.apache.org/jira/browse/ARROW-1716
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Java - Vectors
> Affects Versions: 0.7.1
> Reporter: Phillip Cloud
> Assignee: Phillip Cloud
> Fix For: 0.8.0