Hi Dong Lin,

I see where you are coming from. I agree it would be good to have a FLIP to 
design the public API for flink-avro, so we will limit this vote thread to the 
externalization.

Regards,
Hong




From: Dong Lin <lindon...@gmail.com>
Date: Tuesday, 31 January 2023 at 14:26
To: "dev@flink.apache.org" <dev@flink.apache.org>
Cc: "Teoh, Hong" <lian...@amazon.co.uk>, Chesnay Schepler <ches...@apache.org>
Subject: RE: [EXTERNAL][VOTE] Externalize AWS formats to flink-connector-aws


CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.


Hi all,

It seems that we are on the same page that having flink-connector-aws use 
non-public API from apache/flink is not a blocking issue here. And I agree that 
it would be nice to make necessary APIs public so that 1) projects outside 
Flink can also develop similar connectors and 2) flink-connector-aws can likely 
have less issues dealing with changes in apache/flink.

I just want to note that, if we want to additionally mark a few classes in 
flink-avro as @PublicEvolving, since this is a major public API change, it is 
probably necessary to create a FLIP to specify these API changes. Also, we 
might need to open a new voting thread since the votes made earlier in this 
thread do not cover those public API change.

Thanks,
Dong


On Tue, Jan 31, 2023 at 4:02 PM Danny Cranmer 
<dannycran...@apache.org<mailto:dannycran...@apache.org>> wrote:
Hello all,

@Dong Lin while I agree with your reasoning here, our ultimate aim is to
have stable public APIs. If we need to use @Internal apis for the Apache
connectors/formats/etc, then it is likely that users building custom
integrations do too. Therefore I am aligned with @Chesnay that we should
not depend on @Internal classes. But the same issue applies in the main
Flink codebase and externalizing code does not necessarily need to be a
blocker, since it is an existing problem. We can consider this a best
practice.

> It seems that there are roughly 32 public classes in flink-avro that are
not marked with @PublicEvolving. Some of them are explicitly marked
as @Internal. Are you suggesting to mark all these classes as
@PublicEvolving?

I interpreted @Teoh, Hong <lian...@amazon.co.uk<mailto:lian...@amazon.co.uk>>'s 
message as we have
identified *three* classes that are currently used by the formats that are
insufficiently scoped that we should promote to @PublicEvolving.

Thanks for the deep dive @Teoh, Hong 
<lian...@amazon.co.uk<mailto:lian...@amazon.co.uk>>.

> As such, I propose that we:
- Mark the inherently public classes in flink-avro as @PublicEvolving
- Proceed with externalizing flink-avro-glue-schema-registry

+1 to this proposal

Thanks,
Danny

On Tue, Jan 31, 2023 at 12:32 AM Dong Lin 
<lindon...@gmail.com<mailto:lindon...@gmail.com>> wrote:

> Hi Hong,
>
> Thanks for the detailed reply.
>
> It seems that there are roughly 32 public classes in flink-avro that are
> not marked with @PublicEvolving. Some of them are explicitly marked
> as @Internal. Are you suggesting to mark all these classes as
> @PublicEvolving?
>
> I agree it would be nice to mark more classes as @PublicEvolving so that
> projects outside apache/flink* can take advantage of these APIs. On the
> other hand, it is probably useful to make sure that we only expose the
> minimal set of APIs for our target use cases and that all public APIs are
> well-documented.
>
> Typically public APIs are exposed via interface rather than concrete class.
> So it is not clear to me whether we can just mark all those classes
> as @PublicEvolving and still achieve the high-level goal described above.
> Maybe we should follow the typical FLIP process by listing all APIs that
> need to be exposed as public in a FLIP so that other developers will help
> double check those APIs and vote for it, if we decide to add more public
> APIs in Flink.
>
> On the other hand, I still think it is OK for flink-connector-aws (or any
> projects managed by Flink community) to use Internal APIs, for the reasons
> described in my reply to Chesnay's email. We can still try to make more
> APIs public. I just don't think it has to block the effort to
> externalize AWS formats.
>
> What do you think?
>
> Regards,
> Dong
>
>
>
> On Tue, Jan 31, 2023 at 2:18 AM Teoh, Hong 
> <lian...@amazon.co.uk<mailto:lian...@amazon.co.uk>> wrote:
>
> > Hi @Chesnay @Dong Lin,
> >
> > Thanks for flagging up the concerns.
> >
> > I agree that we should make clear which APIs are public in flink-avro so
> > that we can make the connector code more independent. IMO that moves in
> the
> > right direction for Flink in general!
> >
> >
> > 1. Public classes/interfaces in flink-avro
> > - I took a look at the classes in flink-avro that are used in
> > flink-avro-glue-schema-registry, and it turns out those classes are
> already
> > exposed as part of the public interface in
> flink-avro-glue-schema-registry
> > and flink-avro-confluent-registry. (They are already inherently
> > @PublicEvolving (RegistryAvroDeserializationSchema, SchemaCoder,
> > RegistryAvroSerializationSchema), so I recommend that we should just mark
> > them as so for clarity.
> > - The only class that isn't inherently exposed as @PublicEvolving is
> > MutableByteArrayInputStream, but I took a look, and I think we can remove
> > this dependency from flink-avro-glue-schema-registry
> > - Also, while looking, I found that there are a couple of classes that
> are
> > recommended for users in our doc [1] but are not marked as
> @PublicEvolving
> > . We can probably also fix this by adding the @PublicEvolving annotation
> >
> > 2. Our public API depends on Apache Avro classes
> > - I agree, ideally we don't have this dependency, but since we already
> > inherently expose it to users, we should maintain this for backwards
> > compatibility.
> > - Also, some of the @PublicEvolving classes in flink-avro already have
> > dependencies on Avro classes [2]
> >
> > As such, I propose that we:
> > - Mark the inherently public classes in flink-avro as @PublicEvolving
> > - Proceed with externalizing flink-avro-glue-schema-registry
> >
> >
> > Let me know what you think!
> >
> > Regards,
> > Hong
> >
> > [1] AvroInputFormat
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/avro/
> > [2] AvroFormatOptions
> >
> https://github.com/apache/flink/blob/f75cf38e28143bb36acbc2ee6a5ea9c8852916dd/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroFormatOptions.java
> >
> >
> > On 30/01/2023, 12:53, "Chesnay Schepler" 
> > <ches...@apache.org<mailto:ches...@apache.org>> wrote:
> >
> >     CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender and
> > know the content is safe.
> >
> >
> >
> >     On 30/01/2023 13:24, Dong Lin wrote:
> >     > it should be OK for
> >     > apache/flink-connector-aws to use non-public API, similar to how
> > classes
> >     > inside Flink can use APIs marked with @Internal.
> >     >
> >     > We can mark related classes with @Internal as appropriate, without
> >     > requiring any new FLIP.
> >
> >     It's not though, because these APIs don't offer any compatibility
> >     guarantees.
> >
> >     We want to arrive at a state where connectors are fully independent
> of
> >     Flink releases. Using non-public APIs runs counter to that.
> >
> >
> >
>

Reply via email to