On Mon, Jul 18, 2022 at 2:35 AM Antoine Pitrou <anto...@python.org> wrote:
>
>
> On 18/07/2022 at 03:54, Wes McKinney wrote:
> > This patch caused Parquet files written with 2.0.0 to be unreadable in
> > 3.0.0 onward
> >
> > https://github.com/apache/arrow/commit/ef0feb2c9c959681d8a105cbadc1ae6580789e69
> >
> > This was reported on June 14 on dev@ and I git-bisected to the root cause:
> >
> > https://lists.apache.org/thread/wtbqozdhj2hwn6f0sps2j70lr07grk06
> >
> > From the look of https://issues.apache.org/jira/browse/ARROW-10353,
> > we're going to have to use the Parquet footer metadata to insert a
> > backward compatibility affordance so that files written prior to 3.0.0
> > are still readable.
>
> At that time, you wrote """Note that DataPageV2 is not recommended for
> production use""" :-)
>
That's true! =) But the main reason is that implementation completeness and testing for DataPageV2 are generally weaker across Parquet implementations, so I would expect to see more cross-implementation compatibility issues as a result of less use and testing. But if we can be internally backwards compatible without great hardship, I think we should try.
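
For illustration only, here is a rough sketch of what a footer-metadata check of the kind mentioned in the quoted message might look like. It is not the actual fix. The helper name, the assumed shape of the `created_by` string ("parquet-cpp-arrow version X.Y.Z"), and the choice to treat unknown writers as compliant are all assumptions of mine; only the 3.0.0 cutoff comes from the thread.

```python
import re

# Assumed shape of the footer's created_by string, e.g.
# "parquet-cpp-arrow version 2.0.0". This format is an assumption.
_CREATED_BY = re.compile(
    r"^(?P<app>\S+) version (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
)

# From the thread: files written prior to 3.0.0 need the compatibility path.
FIXED_VERSION = (3, 0, 0)

# Assumed application name used by the Arrow C++ Parquet writer.
ARROW_WRITER = "parquet-cpp-arrow"


def needs_legacy_data_page_v2_path(created_by: str) -> bool:
    """Hypothetical helper: decide from the footer's created_by string
    whether a file was written by an Arrow release older than 3.0.0."""
    match = _CREATED_BY.match(created_by or "")
    if match is None:
        # Unparseable or missing writer info: assume a compliant writer.
        return False
    if match.group("app") != ARROW_WRITER:
        # Other implementations are not affected by this particular patch.
        return False
    version = tuple(int(match.group(k)) for k in ("major", "minor", "patch"))
    return version < FIXED_VERSION
```

With pyarrow, the string to pass in should be available as `pyarrow.parquet.ParquetFile(path).metadata.created_by`. A reader could branch on the result only when it encounters DataPageV2 pages, so files from other writers are left untouched.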