Ok, just to close out this thread.  It looks like there's general
consensus on the approach outlined above, with the addition of the
iceberg-data module.

After looking more closely, iceberg-data and iceberg-common don't really
have much content, but they are used heavily by other areas of the
codebase.  I think the long-term plan is to fold them into the
iceberg-core module, so they should be treated similarly.

I'll put together a PR to update the contributing section to include
information about semantic versioning.
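
As a rough sketch of where the policy stands after this thread, the
module-to-guarantee mapping could be written down as a simple lookup. The
class, enum, and method names below are hypothetical and are not part of the
Iceberg codebase; "iceberg-spark" only stands in for any module not listed
explicitly.

```java
import java.util.Map;

// Hypothetical illustration of the policy discussed in this thread.
// None of these names exist in Iceberg.
public class DeprecationPolicy {
  enum Level {
    MAJOR_REQUIRED,      // deprecations required across a major version
    MINOR_REQUIRED,      // deprecations required across a minor version
    MINOR_DISCRETIONARY  // deprecations preferred, reviewer discretion
  }

  // Modules named in the proposal, plus iceberg-data as agreed in this thread.
  private static final Map<String, Level> POLICY = Map.of(
      "iceberg-api", Level.MAJOR_REQUIRED,
      "iceberg-common", Level.MINOR_REQUIRED,
      "iceberg-core", Level.MINOR_REQUIRED,
      "iceberg-data", Level.MINOR_REQUIRED,
      "iceberg-parquet", Level.MINOR_REQUIRED,
      "iceberg-orc", Level.MINOR_REQUIRED);

  // Any module not referenced above falls back to the discretionary level.
  static Level levelFor(String module) {
    return POLICY.getOrDefault(module, Level.MINOR_DISCRETIONARY);
  }

  public static void main(String[] args) {
    System.out.println(levelFor("iceberg-data"));   // MINOR_REQUIRED
    System.out.println(levelFor("iceberg-spark"));  // MINOR_DISCRETIONARY
  }
}
```

The default branch is the point of the sketch: only the listed modules carry
a required deprecation cycle, and everything else is discretionary.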

-Dan

On Fri, Sep 30, 2022 at 10:16 AM Daniel Weeks <daniel.c.we...@gmail.com>
wrote:

> Yufei, great questions.  I'll take a look at iceberg-data and see how
> that's being used/changing, but it might be a good idea to include it as
> well.
>
> I do think there are areas like you mentioned in other modules where we do
> want to try to maintain backwards compatibility, and the goal is always to
> make reasonable attempts to go through a deprecation cycle.
>
> The annotation-based approach raised some concerns because it's hard to
> maintain and the annotations fall into disrepair over time.  I'm open to
> considering this as a possible extension of the proposed policy if we see
> issues, but there would be a fair amount of work to go through the codebase
> and start marking classes.  I feel we're trying to strike a balance between
> low up-front effort and increased clarity around the guarantees.
>
> Interested if others feel strongly about it.
> -Dan
>
> On Thu, Sep 29, 2022 at 3:07 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> +1 for the proposed approach. Here is a question and a minor suggestion.
>>
>> Question: Do we consider the iceberg-data module to be covered by the
>> minor version compatibility guarantee?
>>
>> Suggestion: I get that we don’t want modules like spark/flink to carry
>> minor or major version compatibility guarantees. It’s simple, and works
>> well for both users and developers over time. What if we want a
>> compatibility guarantee on certain classes from these modules, e.g., class
>> SparkActions? Is there a way to do that? My suggestion is that we can still
>> leverage annotations. They don’t have to follow the Spark/Hadoop style,
>> e.g., @experimental or @stable. Instead, the annotation can just tell
>> whether this is an API or not. Considering we have two levels of
>> compatibility guarantees, we may have two annotations like
>> @MinorVersionCompatibleAPI and @MajorVersionCompatibleAPI.
>>
>> Best,
>>
>> Yufei
>>
>> `This is not a contribution`
>>
>>
>> On Thu, Sep 29, 2022 at 1:06 PM Ryan Blue <b...@tabular.io> wrote:
>>
>>> +1 to the approach outlined here. Thanks, Dan!
>>>
>>> On Wed, Sep 28, 2022 at 4:18 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>
>>>> Hey Iceberg Community,
>>>>
>>>> I wanted to raise a discussion thread with respect to how to handle
>>>> semantic versioning and deprecations so that we can document the
>>>> expectations for changes to the baseline going forward.
>>>>
>>>> The goal is to clarify/formalize for users, contributors, and reviewers
>>>> the expectations around changes to Iceberg public facing interfaces and
>>>> what that means in the context of the 1.0 release and future releases.
>>>>
>>>> Prior to 1.0, there has been an informal policy that all public facing
>>>> APIs require deprecation and backwards compatibility for at least one minor
>>>> release.  Earlier discussion around the 1.0 release included stronger major
>>>> version guarantees for the iceberg-api module and Revapi was introduced
>>>> with the intent to enforce additional major version guarantees.  Those
>>>> interfaces are the primary set that Iceberg users interface with and minor
>>>> version changes could be disruptive.
>>>>
>>>> However, this still leaves a large portion of the codebase without a
>>>> clear designation of what is "publicly facing" and what is considered
>>>> "internal".  During the community sync today, a few different approaches
>>>> were discussed but it sounded like there was some support for the proposed
>>>> policy below.  The expectation is that some of the central modules would
>>>> require a deprecation cycle to ensure stronger guarantees, while other
>>>> modules that are less likely to have direct dependencies would still err
>>>> toward deprecations.  For those modules, if more significant structural
>>>> changes are necessary or solutions are difficult to incorporate without
>>>> breaking changes, it would be up to the discretion of
>>>> reviewers/committers to allow breaking changes.
>>>>
>>>> I've also included some other ideas that are more strict or more
>>>> flexible as alternatives.
>>>>
>>>> Please share support, objections, or alternatives that you think
>>>> clarify the approach going forward.
>>>>
>>>> *Proposed Policy:*
>>>>
>>>> *Major Version Deprecations Required*
>>>> iceberg-api module
>>>>
>>>> *Minor Version Deprecations Required*
>>>> iceberg-common
>>>> iceberg-core
>>>> iceberg-parquet
>>>> iceberg-orc
>>>>
>>>> *Minor Version Deprecations Discretionary*
>>>> (all modules not referenced above)
>>>>
>>>> *Alternatives:*
>>>>
>>>>    1. Formalize current policy
>>>>       - Minor version guarantees only
>>>>       - No major version guarantees
>>>>    2. Strict policy
>>>>       - Major version guarantees for iceberg-api module
>>>>       - Minor version guarantees for all other modules
>>>>       - Strongest guarantees, least flexible
>>>>    3. Discretionary Policy
>>>>       - Major version for iceberg-api
>>>>       - Discretionary for other modules
>>>>       - Most flexible, weaker guarantees
>>>>    4. Annotation driven policy
>>>>       - Experimental/Stable/etc. annotations denote policy at a
>>>>       class/interface-level
>>>>       - Most granular, difficult to implement/maintain
>>>>
>>>> -Dan
>>>>
>>>>
>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>
