[ https://issues.apache.org/jira/browse/SPARK-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136236#comment-15136236 ]
Sean Owen commented on SPARK-13127:
-----------------------------------
[~JustinPihony] I suspect this is a good idea, but whenever someone suggests a
dependency upgrade, the questions are of course: Are there incompatible changes?
Is it compatible with other dependencies? Does it work with all transitive
dependencies?
Would you mind opening a PR with the change, which will entail running the
dependency update scripts to check and declare the changed transitive
dependencies, and then also reviewing the release notes to identify any breaking
changes we should know about? For 2.0.0 we can tolerate most incompatibilities,
but it is good to know about them.
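
As a rough illustration of what "check and declare the changed transitive
dependencies" amounts to, here is a minimal Python sketch that diffs two
dependency manifests. The manifest format (one `name-version.jar` entry per
line), the function names, and every jar name and version in the sample data
are assumptions made up for illustration; they are not the output of Spark's
actual dependency scripts.

```python
import re


def parse_manifest(text):
    """Map artifact name -> version for lines shaped like 'name-1.2.3.jar'."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumed naming scheme: the version starts at the first '-<digit>'.
        match = re.match(r"^(.+?)-(\d[^/]*)\.jar$", line)
        if match:
            deps[match.group(1)] = match.group(2)
    return deps


def diff_manifests(old_text, new_text):
    """Return (added, removed, changed) between two manifests."""
    old, new = parse_manifest(old_text), parse_manifest(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(
        (name, old[name], new[name])
        for name in set(old) & set(new)
        if old[name] != new[name]
    )
    return added, removed, changed


# Hypothetical manifests before and after an upgrade.
OLD = """
parquet-column-1.7.0.jar
parquet-hadoop-1.7.0.jar
some-codec-1.1.2.jar
"""
NEW = """
parquet-column-1.9.0.jar
parquet-hadoop-1.9.0.jar
some-codec-1.1.2.jar
example-lib-0.1.jar
"""

added, removed, changed = diff_manifests(OLD, NEW)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Anything that shows up as added, removed, or changed is a candidate to look up
in the release notes before the PR is merged.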
> Upgrade Parquet to 1.9 (Fixes parquet sorting)
> ----------------------------------------------
>
> Key: SPARK-13127
> URL: https://issues.apache.org/jira/browse/SPARK-13127
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Justin Pihony
> Priority: Minor
>
> Currently, when you write a sorted DataFrame to Parquet and then read the
> data back, it is not sorted by default. [This is due to a bug in
> Parquet|https://issues.apache.org/jira/browse/PARQUET-241] that was fixed in
> 1.9.
> As a workaround, read the files back in using a file glob (filepath/*).
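
The description above does not spell out why the ordering is lost. A minimal
Python sketch of one plausible mechanism follows, under the assumption (not
confirmed here) that sorted output is split into several part files and that
the order in which those files are read back decides whether the global order
survives. It uses plain text files rather than Parquet, and the file naming
and helper names are hypothetical.

```python
import glob
import os
import tempfile


def write_parts(directory, sorted_values, rows_per_part):
    """Write an already-sorted list into numbered part files, one value per line."""
    starts = range(0, len(sorted_values), rows_per_part)
    for index, start in enumerate(starts):
        path = os.path.join(directory, "part-%05d.txt" % index)
        with open(path, "w") as handle:
            for value in sorted_values[start:start + rows_per_part]:
                handle.write("%d\n" % value)


def read_parts(paths):
    """Concatenate the values of the part files in the order the paths are given."""
    values = []
    for path in paths:
        with open(path) as handle:
            values.extend(int(line) for line in handle)
    return values


def demo():
    with tempfile.TemporaryDirectory() as directory:
        write_parts(directory, list(range(10)), rows_per_part=3)
        # Expand a glob and sort the matches so the read order is deterministic.
        paths = sorted(glob.glob(os.path.join(directory, "*")))
        in_order = read_parts(paths)
        # Reading the same files in a different order breaks the global sort.
        out_of_order = read_parts(list(reversed(paths)))
    return in_order, out_of_order


in_order, out_of_order = demo()
print(in_order)
print(out_of_order)
```

Each part file is still sorted internally in both cases; only the order in
which the parts are concatenated differs.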