On Mon, Jul 18, 2022 at 2:35 AM Antoine Pitrou <anto...@python.org> wrote:
>
>
> Le 18/07/2022 à 03:54, Wes McKinney a écrit :
> > This patch caused Parquet files written with 2.0.0 to be unreadable in
> > 3.0.0 onward
> >
> > https://github.com/apache/arrow/commit/ef0feb2c9c959681d8a105cbadc1ae6580789e69
> >
> > This was reported on June 14 on dev@ and I git-bisected to the root cause:
> >
> > https://lists.apache.org/thread/wtbqozdhj2hwn6f0sps2j70lr07grk06
> >
> > From the look of https://issues.apache.org/jira/browse/ARROW-10353,
> > we're going to have to use the Parquet footer metadata to insert a
> > backward compatibility affordance so that files written prior to 3.0.0
> > are still readable.
>
> At that time, you wrote """Note that DataPageV2 is not recommended for
> production use""" :-)
>

That's true! =) But the main reason for that is that implementation
completeness and testing for DataPageV2 are generally weaker across
Parquet implementations, so I would expect to see more
cross-implementation compatibility issues as a result of less use and
testing. But if we can be internally backwards compatible without great
hardship, I think we should try.
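
To make the footer-metadata idea concrete, here is a minimal sketch of
how a reader could gate a compatibility path on the writer version
recorded in the footer. It is only an illustration: the function names
are hypothetical, the "parquet-cpp-arrow version X.Y.Z" form of the
created_by string and the 3.0.0 cutoff are assumptions taken from this
thread, and the real fix would live in the C++ reader rather than in
Python.

```python
import re

# Assumed layout of the footer's created_by string,
# e.g. "parquet-cpp-arrow version 2.0.0".
_CREATED_BY_RE = re.compile(r"^(\S+) version (\d+)\.(\d+)\.(\d+)")


def parse_created_by(created_by):
    """Return (application, (major, minor, patch)), or None if unparseable."""
    match = _CREATED_BY_RE.match(created_by or "")
    if match is None:
        return None
    app = match.group(1)
    version = tuple(int(part) for part in match.group(2, 3, 4))
    return app, version


def needs_legacy_v2_handling(created_by):
    """Hypothetical check: was the file written by Arrow C++ before 3.0.0?"""
    parsed = parse_created_by(created_by)
    if parsed is None:
        # Unknown writer: do not guess, use the current behaviour.
        return False
    app, version = parsed
    return app == "parquet-cpp-arrow" and version < (3, 0, 0)
```

A reader would call needs_legacy_v2_handling() once per file, after
parsing the footer, and only take the old DataPageV2 code path when it
returns True. Files from other writers, or with a created_by string that
does not parse, fall through to the current behaviour.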
