[
https://issues.apache.org/jira/browse/PARQUET-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243106#comment-17243106
]
ASF GitHub Bot commented on PARQUET-1928:
-----------------------------------------
gszadovszky commented on pull request #831:
URL: https://github.com/apache/parquet-mr/pull/831#issuecomment-737864597
The Parquet community was against adding INT96 support so as not to encourage
our clients to use it. That said, I understand the requirement to support data
that has already been written with this type. (Also, since parquet-avro has
never supported INT96, this change is required for developing any new
functionality that depends on the deprecated INT96.)
Anyway, I am fine with this change, but I do not really like that it works by
default. What do you think about keeping the original behavior by default and
introducing a configuration flag to switch it on? (See `writeParquetUUID` as an
example.) This way we still do not encourage clients to use INT96, but they
have the option to do so if necessary.
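
As a rough illustration of the opt-in behavior proposed here, the sketch below keeps the original failure as the default and only maps INT96 to a 12-byte fixed when a flag is set. It is plain Java, not parquet-avro code: the class `Int96Policy`, the key `example.readInt96AsFixed`, and the use of a `Map` as the configuration holder are all hypothetical names chosen for the example.

```java
import java.util.Map;

// Hypothetical sketch of an opt-in flag; none of these names come from parquet-avro.
final class Int96Policy {
    // Assumed configuration key, made up for this example.
    static final String READ_INT96_AS_FIXED = "example.readInt96AsFixed";
    // Off by default, so the original behavior (rejecting INT96) is preserved.
    static final boolean READ_INT96_AS_FIXED_DEFAULT = false;

    private Int96Policy() {}

    // Returns the flag value, falling back to the default when the key is absent.
    static boolean readInt96AsFixed(Map<String, String> conf) {
        String value = conf.get(READ_INT96_AS_FIXED);
        return value == null ? READ_INT96_AS_FIXED_DEFAULT : Boolean.parseBoolean(value);
    }

    // Describes the Avro type to use for an INT96 column, or fails if the flag is off.
    static String avroTypeForInt96(Map<String, String> conf) {
        if (!readInt96AsFixed(conf)) {
            throw new IllegalArgumentException(
                "INT96 is deprecated; set " + READ_INT96_AS_FIXED
                    + "=true to read it as a 12-byte fixed");
        }
        return "fixed(12)";
    }
}
```

With this shape, existing users see no change in behavior, and only callers who explicitly set the flag get the INT96 mapping.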
----------------------------------------------------------------
> Interpret Parquet INT96 type as FIXED[12] AVRO Schema
> -----------------------------------------------------
>
> Key: PARQUET-1928
> URL: https://issues.apache.org/jira/browse/PARQUET-1928
> Project: Parquet
> Issue Type: Bug
> Components: parquet-avro
> Affects Versions: 1.11.0
> Reporter: Anant Damle
> Priority: Minor
> Labels: patch
> Fix For: 1.12.0
>
>
> Reading Parquet files in Apache Beam with ParquetIO goes through
> `AvroParquetReader`, which throws `IllegalArgumentException("INT96 not
> implemented and is deprecated")` when it encounters an INT96 column.
> Customers have large datasets that cannot be reprocessed to convert them
> into a supported type. An easier approach would be to expose the value as a
> 12-byte array, which the developer can then interpret in any way they want.
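
One way a developer might interpret such a 12-byte value is sketched below. It assumes the INT96 timestamp layout commonly written by Impala, Hive, and Spark: the first 8 bytes hold the nanoseconds within the day and the last 4 bytes hold the Julian day number, both little-endian. The class name `Int96Timestamps` is hypothetical, and data written with a different convention would need a different decoder.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper; assumes the nanos-of-day + Julian-day INT96 layout.
final class Int96Timestamps {
    // Julian day number of 1970-01-01, the Unix epoch.
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    private static final long SECONDS_PER_DAY = 86_400L;

    private Int96Timestamps() {}

    // Decodes a 12-byte INT96 value into an Instant (no time zone adjustment applied).
    static Instant toInstant(byte[] int96) {
        if (int96 == null || int96.length != 12) {
            throw new IllegalArgumentException("expected exactly 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // bytes 0..7: nanoseconds since midnight
        int julianDay = buf.getInt();    // bytes 8..11: Julian day number
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY;
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay);
    }
}
```

The byte array passed in would be the content of the Avro fixed value read from the file; how that value is obtained depends on the reader in use.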