Hello all, @Dong Lin while I agree with your reasoning here, our ultimate aim is to have stable public APIs. If we need to use @Internal apis for the Apache connectors/formats/etc, then it is likely that users building custom integrations do too. Therefore I am aligned with @Chesnay that we should not depend on @Internal classes. But the same issue applies in the main Flink codebase and externalizing code does not necessarily need to be a blocker, since it is an existing problem. We can consider this a best practice.
> It seems that there are roughly 32 public classes in flink-avro that are not marked with @PublicEvolving. Some of them are explicitly marked as @Internal. Are you suggesting to mark all these classes as @PublicEvolving? I interpreted @Teoh, Hong <lian...@amazon.co.uk>'s message as we have identified *three* classes that are currently used by the formats that are insufficiently scoped that we should promote to @PublicEvolving. Thanks for the deep dive @Teoh, Hong <lian...@amazon.co.uk>. > As such, I propose that we: - Mark the inherently public classes in flink-avro as @PublicEvolving - Proceed with externalizing flink-avro-glue-schema-registry +1 to this proposal Thanks, Danny On Tue, Jan 31, 2023 at 12:32 AM Dong Lin <lindon...@gmail.com> wrote: > Hi Hong, > > Thanks for the detailed reply. > > It seems that there are roughly 32 public classes in flink-avro that are > not marked with @PublicEvolving. Some of them are explicitly marked > as @Internal. Are you suggesting to mark all these classes as > @PublicEvolving? > > I agree it would be nice to mark more classes as @PublicEvolving so that > projects outside apache/flink* can take advantage of these APIs. On the > other hand, it is probably useful to make sure that we only expose the > minimal set of APIs for our target use cases and that all public APIs are > well-documented. > > Typically public APIs are exposed via interface rather than concrete class. > So it is not clear to me whether we can just mark all those classes > as @PublicEvolving and still achieve the high-level goal described above. > Maybe we should follow the typical FLIP process by listing all APIs that > need to be exposed as public in a FLIP so that other developers will help > double check those APIs and vote for it, if we decide to add more public > APIs in Flink. > > On the other hand, I still think it is OK for flink-connector-aws (or any > projects managed by Flink community) to use Internal APIs, for the reasons > described in my reply to Chesnay's email. We can still try to make more > APIs public. I just don't think it has to block the effort to > externalize AWS formats. > > What do you think? > > Regards, > Dong > > > > On Tue, Jan 31, 2023 at 2:18 AM Teoh, Hong <lian...@amazon.co.uk> wrote: > > > Hi @Chesnay @Dong Lin, > > > > Thanks for flagging up the concerns. > > > > I agree that we should make clear which APIs are public in flink-avro so > > that we can make the connector code more independent. IMO that moves in > the > > right direction for Flink in general! > > > > > > 1. Public classes/interfaces in flink-avro > > - I took a look at the classes in flink-avro that are used in > > flink-avro-glue-schema-registry, and it turns out those classes are > already > > exposed as part of the public interface in > flink-avro-glue-schema-registry > > and flink-avro-confluent-registry. (They are already inherently > > @PublicEvolving (RegistryAvroDeserializationSchema, SchemaCoder, > > RegistryAvroSerializationSchema), so I recommend that we should just mark > > them as so for clarity. > > - The only class that isn't inherently exposed as @PublicEvolving is > > MutableByteArrayInputStream, but I took a look, and I think we can remove > > this dependency from flink-avro-glue-schema-registry > > - Also, while looking, I found that there are a couple of classes that > are > > recommended for users in our doc [1] but are not marked as > @PublicEvolving > > . We can probably also fix this by adding the @PublicEvolving annotation > > > > 2. Our public API depends on Apache Avro classes > > - I agree, ideally we don't have this dependency, but since we already > > inherently expose it to users, we should maintain this for backwards > > compatibility. > > - Also, some of the @PublicEvolving classes in flink-avro already have > > dependencies on Avro classes [2] > > > > As such, I propose that we: > > - Mark the inherently public classes in flink-avro as @PublicEvolving > > - Proceed with externalizing flink-avro-glue-schema-registry > > > > > > Let me know what you think! > > > > Regards, > > Hong > > > > [1] AvroInputFormat > > > https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/avro/ > > [2] AvroFormatOptions > > > https://github.com/apache/flink/blob/f75cf38e28143bb36acbc2ee6a5ea9c8852916dd/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroFormatOptions.java > > > > > > On 30/01/2023, 12:53, "Chesnay Schepler" <ches...@apache.org> wrote: > > > > CAUTION: This email originated from outside of the organization. Do > > not click links or open attachments unless you can confirm the sender and > > know the content is safe. > > > > > > > > On 30/01/2023 13:24, Dong Lin wrote: > > > it should be OK for > > > apache/flink-connector-aws to use non-public API, similar to how > > classes > > > inside Flink can use APIs marked with @Internal. > > > > > > We can mark related classes with @Internal as appropriate, without > > > requiring any new FLIP. > > > > It's not though, because these APIs don't offer any compatibility > > guarantees. > > > > We want to arrive at a state where connectors are fully independent > of > > Flink releases. Using non-public APIs runs counter to that. > > > > > > >