Ok, I created ORC-229 (https://issues.apache.org/jira/browse/ORC-229) so that we'll have a new OrcFile.Version of UNSTABLE-PRE-2.0. If you look at the associated pull request, you can see that the comments in the code make it pretty clear that users should stay away. I also added a logged warning when the writer uses that version.
My intention is that we can iterate on the UNSTABLE-PRE-2.0 format without cross-version compatibility. It will only be used for developer testing. As part of the ORC 2.0 release, we can delete that version and move to a new 2.0 version.

Thoughts?

.. Owen

On Tue, Aug 8, 2017 at 12:13 AM, Gopal Vijayaraghavan <[email protected]> wrote:

> Hi,
>
> > > Let me make sure I have the backwards compatibility straight. If a user
> > > switches to ORC 2.0, he could choose to continue writing in older formats
> > > so that his old tools could read it
> >
> > Yes, exactly.
>
> To chime in on Owen's point, the development process has a slight wrinkle
> in it, which we avoided in the 0.11 -> 0.12 migration due to ORC being
> embedded in Hive.
>
> The feature addition is two-fold - the new features are available only
> when a user flips the writer versions.
>
> There is no feature flag for reader versions, so the readers have to keep
> up to date with the writer changes (or just fail for the "blackholed" ones,
> with good errors).
>
> Due to the split between projects, I expect to see a two-step development
> cycle, to clean up the integration pathways before the ABI is frozen in 2.0.
>
> The entire process can be gated on the writer version - during the
> development process, there will be an experimental version (1.5?) and a
> stable version.
>
> I have no interest in ever supporting an actual 1.5 version data setup in
> ORC, but for the sake of integration testing the 1.5->2.0 writer versions
> are extremely useful stepping stones towards a multi-project dependency
> like ORC.
>
> Once the integrations are all complete and the format can be frozen, ORC
> 2.0 releases can still disable the default writer version from being
> upgraded for another stable release.
>
> After the ecosystem has had all its upgrades, the default version gets
> flipped to 2.0, while the ability to write 0.12 files will still remain as
> an option, while all intermediate reader versions will get dropped.
>
> That's a bit more complicated than being part of Hive and sync'ing
> releases, but I think this gives ORC the flexibility to accept
> contributions from a wide community, supporting multi-project release
> timelines, without leaving the implementation full of reader
> implementations for many writer versions.
>
> Cheers,
> Gopal
