As discussed on the mailing list [1], in order to demarcate the
pre-1.0.0 and post-1.0.0 worlds, and to allow the
forward-compatibility-protection changes we are making to actually
work (i.e. so that libraries can recognize that they have received
data with a feature that they do not support), I have proposed to
increment the MetadataVersion from V4 to V5. Additionally, if the
union validity bitmap changes are accepted, the MetadataVersion could
be used to control whether unions are permitted to be serialized or
not (with V4, used by v0.8.0 to v0.17.1, unions would not be
permitted).

Since there have been no backward incompatible changes to the Arrow
format since 0.8.0, this would be no different, and (aside from the
union issue) libraries supporting V5 are expected to accept BOTH V4
and V5 so that backward compatibility is not broken, and any
serialized data from prior versions of the Arrow libraries (0.8.0
onward) will continue to be readable.
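
To illustrate the reader-side rule above, here is a minimal sketch in
Python. It is not taken from any Arrow implementation: the names
READABLE_VERSIONS, UnsupportedMetadataVersion and check_readable are
hypothetical, and the integer values assume the usual Flatbuffers
numbering of the enum in Schema.fbs (V1 = 0 through V5 = 4).

```python
from enum import IntEnum


class MetadataVersion(IntEnum):
    # Assumed to mirror the enum order in Schema.fbs (V1 = 0 ... V5 = 4).
    V1 = 0
    V2 = 1
    V3 = 2
    V4 = 3
    V5 = 4


# Hypothetical policy: a V5-aware reader accepts both V4 and V5.
READABLE_VERSIONS = frozenset({MetadataVersion.V4, MetadataVersion.V5})


class UnsupportedMetadataVersion(ValueError):
    """Raised when a message carries a version this reader cannot handle."""


def check_readable(raw_version: int) -> MetadataVersion:
    """Return the parsed version, or raise if the reader must reject it."""
    try:
        version = MetadataVersion(raw_version)
    except ValueError:
        # A value beyond V5 means the data came from a newer format
        # revision that this reader does not know about.
        raise UnsupportedMetadataVersion(
            f"unknown MetadataVersion value {raw_version}"
        ) from None
    if version not in READABLE_VERSIONS:
        raise UnsupportedMetadataVersion(
            f"{version.name} is older than the supported range"
        )
    return version
```

The point of the explicit rejection branch is the forward-compatibility
protection mentioned earlier: a library fails loudly on a version it
does not recognize instead of misreading the payload.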

Implementations are recommended, but not required, to provide an
optional "V4 compatibility mode" for forward compatibility
(serializing data from >= 1.0.0 that needs to be readable by older
libraries, e.g. Spark deployments stuck on an older Java-Arrow
version). In this compatibility mode, non-forward-compatible features
added in 1.0.0 and beyond would not be permitted.
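
As a rough sketch of what such a compatibility mode could look like on
the writer side (again in Python, with hypothetical names throughout:
WriteOptions, V5_ONLY_FEATURES and check_writable are illustrative
only), the writer checks the requested MetadataVersion before
serializing. Treating unions as a V5-only feature is an assumption
here; it applies only if the union validity bitmap changes are
accepted.

```python
from dataclasses import dataclass
from enum import IntEnum


class MetadataVersion(IntEnum):
    # Only the two versions relevant to this sketch.
    V4 = 3
    V5 = 4


@dataclass(frozen=True)
class WriteOptions:
    # Hypothetical option: V5 by default, V4 for compatibility mode.
    metadata_version: MetadataVersion = MetadataVersion.V5


# Assumption: features that older (V4-only) readers cannot handle.
V5_ONLY_FEATURES = frozenset({"union"})


def check_writable(features_used, options: WriteOptions) -> MetadataVersion:
    """Refuse to write non-forward-compatible features in V4 mode."""
    if options.metadata_version is MetadataVersion.V4:
        blocked = sorted(set(features_used) & V5_ONLY_FEATURES)
        if blocked:
            raise ValueError(
                "not permitted in V4 compatibility mode: " + ", ".join(blocked)
            )
    return options.metadata_version
```

A caller targeting an older deployment would pass
WriteOptions(metadata_version=MetadataVersion.V4) and get an error up
front, rather than producing data the older reader cannot interpret.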

A PR with the changes to Schema.fbs (possibly subject to some
clarifying changes to the comments) is at [2].

Once the PR is merged, implementations will need to be updated and
tested as appropriate, at a minimum to validate that backward
compatibility is preserved (i.e. V4 IPC payloads are still readable;
we have some in apache/arrow-testing and can add more as needed).

The vote will be open for at least 72 hours.

[ ] +1 Accept addition of MetadataVersion::V5 along with its general
implications above
[ ] +0
[ ] -1 Do not accept because...

[2]: https://github.com/apache/arrow/pull/7566
