I agree with the general sentiment, that we should remove FlinkSource, but
only deprecate FlinkSink.

I understand Steven's approach to remove the deprecated classes only from
the latest Flink version, but keep them for the old versions, but I have
some concerns with this.

My main concern is that if we remove all the code related to the old
FlinkSource from the latent Flink version (production code and test cleanup
too), then the different Flink versions will have very  different code
bases. This will make the backports to older Flink versions painful until
the full deprecation.

I see the following possibilities:
1. Bite the bullet, and accept the backport complexity increase
2. Remove only the main class, and remove other classes only at full
deprecation
3. After surveying the userbase deprecate the code for every versions

Since the FlinkSource is deprecated for a long time, I would aim for
conserving development resources and chose option 3, unless there are
objections from the userbase.

On Wed, Mar 12, 2025, 18:45 Rodrigo Meneses <rmene...@gmail.com> wrote:

> Once we deprecate FlinkSink, we should also upgrade IcebergSink from
> `Experimental` to `PublicEvolving`
>
>
> On Wed, Mar 12, 2025 at 10:01 AM Steven Wu <stevenz...@gmail.com> wrote:
>
>> Max, thanks for confirming the Flink 2.0 behavior. With the latest
>> information, here are my suggestions
>>
>> 1) I am fine with deleting the old `FlinkSource` in the upcoming flink
>> 2.0 module (but keeping them in flink 1.x modules). As discussed in a
>> previous thread, the new FLIP-27 source has been baked by users for 2
>> years. We already switched SQL to the new FLIP-27 source in the Iceberg 1.7
>> release. So far so good.
>>
>> 2) for FlinkSink, I would prefer to keep it around with flink 2.0 module.
>> Here are my reasons.
>> - new sink is still relatively young. Give it more time to bake and
>> mature.
>> - Flink sink (for streaming ingestion) is probably a lot more widely used
>> than Flink source (for batch or streaming read). Hence, it is valuable to
>> be more conservative in the deprecation plan here.
>> - we can stop adding new features to the old FlinkSink and only
>> contribute to the new IcebergSink. that should reduce the maintenance
>> burden.
>> - we can mark the FinkSink as deprecated for all flink versions once the
>> parity is achieved.
>>
>>
>> On Wed, Mar 12, 2025 at 7:09 AM Maximilian Michels <m...@apache.org>
>> wrote:
>>
>>> Hi JB,
>>>
>>> V3 is another topic, but it would be good to start looking into that.
>>>
>>> Hey Steven,
>>>
>>> Flink 2.0 actually did not remove the old interfaces, which was not
>>> what I expected when I started this conversation. Turns out, old
>>> interfaces have been moved to ".legacy" packages. I'm in the process
>>> of verifying they will continue to work as expected.
>>>
>>> It looks like we will still have the option to keep around the old
>>> source / sink implementations. However, I think it still makes sense
>>> to remove FlinkSource and FlinkSink in the Flink 2.0 code path. This
>>> will simplify the code base and avoid confusion for users in the long
>>> run. I don't see real value in bringing over legacy sources / sinks to
>>> a new Flink major release.
>>>
>>> -Max
>>>
>>> On Tue, Mar 11, 2025 at 10:46 PM Steven Wu <stevenz...@gmail.com> wrote:
>>> >
>>> > I assume Flink 2.0 will remove the old source and sink interfaces.
>>> >
>>> > With the above assumption (please correct me if I am wrong), here is
>>> what I would suggest for the next Iceberg release (1.9 or whichever version
>>> that lands the Flink 2.0 support).
>>> > - Flink 2.0 module - remove the old source and sink impls. we have no
>>> choice here.
>>> > - Flink 1.x modules - update the deprecation note for old
>>> `FlinkSource` that will be removed in Flink 2.0 support. add a similar
>>> deprecation node for old `FlinkSink`.
>>> >
>>> > We shouldn't remove the old `FlinkSource` and `FlinkSink` for the
>>> Flink 1.x modules. Allow users to continue to use them with Flink 1.x. Note
>>> that Iceberg supports 3 active Flink versions (currently 1.18, 1.19, 1.20).
>>> So old versions will be naturally retired over time. As Flink progresses
>>> with 2.0, 2.1, 2.2 (etc.), we may want to keep the support for the last
>>> Flink 1.x version a little longer based on the community feedback.
>>> >
>>> > Thanks,
>>> > Steven
>>> >
>>> > On Tue, Mar 11, 2025 at 8:48 AM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>> >>
>>> >> Hi Max
>>> >>
>>> >> Yes, it makes sense. Maybe I can help on that as I'm working on Spec
>>> >> V3 support in Flink (I did the default value, and I'm checking the new
>>> >> types now).
>>> >>
>>> >> Please let me know :)
>>> >>
>>> >> Thanks,
>>> >> Regards
>>> >> JB
>>> >>
>>> >> On Tue, Mar 11, 2025 at 11:23 AM Maximilian Michels <m...@apache.org>
>>> wrote:
>>> >> >
>>> >> > Hi JB,
>>> >> >
>>> >> > Yes, Flink 2.0 support ideally should land in Iceberg 1.9.
>>> >> >
>>> >> > Hi Rodrigo,
>>> >> >
>>> >> > +1 on closing the gap between the two sink implementations:
>>> FlinkSink
>>> >> > and IcebergSink. Thanks for helping to close that gap!
>>> >> >
>>> >> > Hi Steven,
>>> >> >
>>> >> > Good point on the "Remove by Iceberg 2.0" claim stated in
>>> FlinkSource.
>>> >> > We can keep it until Iceberg 2.0 in the Flink 1.X paths, but I would
>>> >> > propose to remove FlinkSource earlier than Iceberg 2.0 for the Flink
>>> >> > 2.X path. The reason is that it is not used by default for many
>>> >> > Iceberg versions. I think it is unlikely that users jumping on Flink
>>> >> > 2.0 will want to use it.
>>> >> >
>>> >> > As for removing FlinkSink, you're right that IcebergSink is
>>> relatively
>>> >> > new as a replacement. If we can keep it in the process of porting
>>> the
>>> >> > code, we may keep it around longer, but I suggest deprecating it and
>>> >> > scheduling it for removal in the Iceberg release following 1.9.
>>> >> >
>>> >> > To summarize, I'm proposing the following:
>>> >> >
>>> >> > Iceberg 1.9
>>> >> > - Remove FlinkSource in Flink 2.0 code path
>>> >> > - Deprecate FlinkSink for all supported Flink versions
>>> >> >
>>> >> > Iceberg > 1.9
>>> >> > - Remove FlinkSource for all supported Flink versions
>>> >> > - Remove FlinkSink for all supported Flink versions
>>> >> >
>>> >> > Does that sound right?
>>> >> >
>>> >> > Cheers,
>>> >> > Max
>>> >> >
>>> >> > On Thu, Mar 6, 2025 at 9:10 PM Steven Wu <stevenz...@gmail.com>
>>> wrote:
>>> >> > >
>>> >> > > > if Flink 2.0 release is going to release the old source and
>>> sink interfaces, t
>>> >> > >
>>> >> > > typo above: "release the old" -> "delete the old"
>>> >> > >
>>> >> > > On Thu, Mar 6, 2025 at 12:08 PM Steven Wu <stevenz...@gmail.com>
>>> wrote:
>>> >> > >>
>>> >> > >> There was a previous thread for the Flink source. The Javadoc
>>> deprecation note for the old `FlinkSource` currently says it will be
>>> removed in Iceberg 2.0 release
>>> >> > >> https://lists.apache.org/thread/27kcvo3p86pysk9wrggq4vphzo03sv3l
>>> >> > >>
>>> >> > >> Now regarding deprecating and removing the old `FlinkSink` in
>>> favor of the new `IcebergSink`, the new `IcebergSink` was added 6 months
>>> ago and released in Iceberg 1.7 (current release is 1.8). Not sure how many
>>> users have got a chance to use it. Ideally, I would like to have the new
>>> `IcebergSink` bake longer with more users running it in the production
>>> environments. There is also the parity problems that Rod mentioned.
>>> >> > >>
>>> >> > >> However, if Flink 2.0 release is going to release the old source
>>> and sink interfaces, then we will have no choice and remove the old source
>>> and sink implementations earlier than we originally planned/preferred.
>>> >> > >>
>>> >> > >>
>>> >> > >> On Thu, Mar 6, 2025 at 8:40 AM Rodrigo Meneses <
>>> rmene...@gmail.com> wrote:
>>> >> > >>>
>>> >> > >>> Hi Max, +1 on that too. IcebergSink has been hanging around for
>>> a while now. We want to make sure we have the same features, though:
>>> >> > >>>
>>> >> > >>> I’ve got https://github.com/apache/iceberg/pull/12071 that
>>> adds the Range Distribution Mode to IcebergSink. It still needs a few
>>> recent bug fixes and features backported, but it should be ready soon.
>>> >> > >>>
>>> >> > >>>
>>> >> > >>>
>>> >> > >>> On Thu, Mar 6, 2025 at 8:18 AM Jean-Baptiste Onofré <
>>> j...@nanthrax.net> wrote:
>>> >> > >>>>
>>> >> > >>>> Hi Max,
>>> >> > >>>>
>>> >> > >>>> I guess you are proposing to remove FlinkSink and the
>>> corresponding
>>> >> > >>>> FlinkSource as well.
>>> >> > >>>>
>>> >> > >>>> It makes to me as, I saw in the code, that both FlinkSink and
>>> >> > >>>> FlinkSource are deprecated for a while.
>>> >> > >>>>
>>> >> > >>>> So, +1 to remove it.
>>> >> > >>>>
>>> >> > >>>> Are you planning this for 1.9.0 ?
>>> >> > >>>>
>>> >> > >>>> Regards
>>> >> > >>>> JB
>>> >> > >>>>
>>> >> > >>>> On Thu, Mar 6, 2025 at 4:21 PM Maximilian Michels <
>>> m...@apache.org> wrote:
>>> >> > >>>> >
>>> >> > >>>> > Hi,
>>> >> > >>>> >
>>> >> > >>>> > Today there are two Flink write connectors in Iceberg:
>>> >> > >>>> >
>>> >> > >>>> > 1. FlinkSink (original sink, based on Flink legacy
>>> interfaces)
>>> >> > >>>> > 2. IcebergSink (newer version, based on modern Flink API)
>>> >> > >>>> >
>>> >> > >>>> > In terms of features, (1) is a subset of (2).
>>> >> > >>>> >
>>> >> > >>>> > I'm in the process of adding support for Flink 2.0. The
>>> interfaces
>>> >> > >>>> > used for (1) have been deprecated for several Flink versions
>>> and are
>>> >> > >>>> > removed / discouraged in Flink 2.0.
>>> >> > >>>> >
>>> >> > >>>> > Therefore, I would like to propose to remove FlinkSink for
>>> the Flink
>>> >> > >>>> > 2.0 Iceberg module. We have already deprecated FlinkSink for
>>> a while.
>>> >> > >>>> >
>>> >> > >>>> > Any objections?
>>> >> > >>>> >
>>> >> > >>>> > -Max
>>>
>>

Reply via email to