[
https://issues.apache.org/jira/browse/PARQUET-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414249#comment-17414249
]
Joshua Howard commented on PARQUET-2088:
----------------------------------------
Sorry about the typo in the description. I've updated it. The feature being
toggled is for sequential reads (PARQUET-246).
If it helps, I'm coming from the Trino community, so I could follow the example
and write "Trino version 361 (build xyz)", but that would require us to
replicate the logic found in the link, disabling vectorized reads for any Trino
version that relied on a parquet-mr older than 1.8.0. Is this what Impala does?
If so, it seems brittle.
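To make the concern concrete, here is a minimal sketch of the kind of check a
reader would need. It assumes created_by looks like
"<application> version <version> (build <hash>)". The names parse_created_by,
requires_sequential_reads and FIRST_FIXED are hypothetical, and this is not
parquet-mr's actual implementation. Treating unknown writers conservatively is
a choice made for this sketch only.

```python
import re

# Assumed layout: "<application> version <version> (build <hash>)".
CREATED_BY = re.compile(
    r"^(?P<app>.+?)\s+version\s+(?P<version>[^\s(]+)"
    r"(?:\s*\(build\s+(?P<build>[^)]*)\))?"
)

# Hypothetical table: first release of each writer that includes the
# PARQUET-246 fix. Every application embedding parquet-mr would need its
# own entry, which is the brittle part.
FIRST_FIXED = {"parquet-mr": (1, 8, 0)}


def parse_created_by(created_by):
    """Return (application, version, build), or None if the string does not match."""
    m = CREATED_BY.match(created_by)
    if m is None:
        return None
    return m.group("app"), m.group("version"), m.group("build")


def numeric_version(version):
    """Turn "1.8.0-SNAPSHOT" into (1, 8, 0), dropping any suffix after "-"."""
    return tuple(int(p) for p in re.findall(r"\d+", version.split("-")[0]))


def requires_sequential_reads(created_by):
    """Decide from created_by alone whether to fall back to sequential reads."""
    parsed = parse_created_by(created_by)
    if parsed is None:
        return True  # unparseable: be conservative (sketch assumption)
    app, version, _build = parsed
    fixed = FIRST_FIXED.get(app)
    if fixed is None:
        # The application version says nothing about the library it embedded.
        return True
    return numeric_version(version) < fixed
```

With only the application version in created_by, a file written by
"Trino version 361 (build xyz)" cannot be classified unless FIRST_FIXED gains a
Trino entry that someone has to maintain.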
I think it makes sense to have a parquet-mr version and a separate application
version, but I might be misunderstanding how other applications use the
parquet-mr library.
> Different created_by field values for application and library
> -------------------------------------------------------------
>
> Key: PARQUET-2088
> URL: https://issues.apache.org/jira/browse/PARQUET-2088
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Affects Versions: format-2.9.0
> Reporter: Joshua Howard
> Priority: Minor
>
> There seems to be a discrepancy in the Parquet format created_by field
> regarding how it should be filled out. The parquet-mr library uses this value
> to enable/disable features based on the parquet-mr version
> [here|https://github.com/apache/parquet-mr/blob/5f403501e9de05b6aa48f028191b4e78bb97fb12/parquet-column/src/main/java/org/apache/parquet/CorruptDeltaByteArrays.java#L64-L68].
> Meanwhile, users are encouraged to make use of the application version
> [here|https://www.javadoc.io/doc/org.apache.parquet/parquet-format/latest/org/apache/parquet/format/FileMetaData.html].
> It seems like there are multiple fields needed for an application and
> library version.