Re: [QUESTION][Parquet][Decimal] Why not implement the INT32/INT64 to store Decimal logical type in parquet file

Micah Kornfield Fri, 06 Jan 2023 09:38:59 -0800

>
> Hi Kun,
> The document of arrow c++ about  Reading and writing Parquet files
> <https://arrow.apache.org/docs/cpp/parquet.html#logical-types> requires
> `(2) On the write side, a FIXED_LENGTH_BYTE_ARRAY is always emitted.`


I don't think this is a requirement; it simply documents the current
behavior.

> Why we not follow the definition of parquet
> <https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal>
> for writing the parquet file?


I think this was probably a matter of effort.  Since FLBA is more generic
(it can hold a decimal of any precision), there wasn't a need to write to
other types.  Contributing an option to write lower-precision decimals as
integers would be useful.
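
For reference, here is a rough sketch of the mapping the parquet-format spec
describes (INT32 for precision up to 9, INT64 for precision up to 18, a
byte array otherwise).  This is not the Arrow C++ or arrow-rs
implementation; the function names are hypothetical and only illustrate
the rule.

```python
from decimal import Decimal

# Limits from the Parquet DECIMAL logical type definition:
# INT32 can hold 1 <= precision <= 9, INT64 can hold 1 <= precision <= 18.
MAX_INT32_PRECISION = 9
MAX_INT64_PRECISION = 18


def physical_type_for_decimal(precision: int) -> str:
    """Hypothetical helper: narrowest physical type for a decimal precision."""
    if precision < 1:
        raise ValueError("precision must be a positive integer")
    if precision <= MAX_INT32_PRECISION:
        return "INT32"
    if precision <= MAX_INT64_PRECISION:
        return "INT64"
    return "FIXED_LEN_BYTE_ARRAY"


def flba_byte_width(precision: int) -> int:
    """Smallest n such that a signed n-byte integer can hold 10**precision - 1."""
    n = 1
    while 2 ** (8 * n - 1) - 1 < 10 ** precision - 1:
        n += 1
    return n


def unscaled_value(value: Decimal, scale: int) -> int:
    """Integer that would be stored, i.e. value * 10**scale, computed exactly."""
    if not value.is_finite():
        raise ValueError("NaN and infinity have no unscaled representation")
    sign, digits, exponent = value.as_tuple()
    shift = exponent + scale
    if shift < 0:
        raise ValueError("value has more fractional digits than scale allows")
    magnitude = int("".join(map(str, digits))) * 10 ** shift
    return -magnitude if sign else magnitude
```

With this rule a decimal(9, 2) column fits in INT32, and the value 123.45
would be stored as the unscaled integer 12345.  The same column written as
FLBA needs 4 bytes per value, so the integer path mostly helps readers and
encodings rather than raw size.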

Thanks,
Micah


On Wed, Jan 4, 2023 at 12:58 AM Kun Liu <[email protected]> wrote:

> Hi all,
>    In the PR https://github.com/apache/arrow-rs/pull/3431, I want to write
> decimal data with lower precision to INT32/INT64 in the parquet file.
>
>    The document of arrow c++ about  Reading and writing Parquet files
> <https://arrow.apache.org/docs/cpp/parquet.html#logical-types> requires
> `(2) On the write side, a FIXED_LENGTH_BYTE_ARRAY is always emitted.`
>
>    But in the definition of parquet format
> <https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal>
> for logical type of decimal, the decimal type can be represented by INT32/INT64
> for the lower precision.
>
>    Why we not follow the definition of parquet
> <https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal>
> for writing the parquet file?
>
> Thanks
> Kun
>
