Hello all,

@Dong Lin while I agree with your reasoning here, our ultimate aim is to
have stable public APIs. If we need to use @Internal apis for the Apache
connectors/formats/etc, then it is likely that users building custom
integrations do too. Therefore I am aligned with @Chesnay that we should
not depend on @Internal classes. But the same issue applies in the main
Flink codebase and externalizing code does not necessarily need to be a
blocker, since it is an existing problem. We can consider this a best
practice.

> It seems that there are roughly 32 public classes in flink-avro that are
not marked with @PublicEvolving. Some of them are explicitly marked
as @Internal. Are you suggesting to mark all these classes as
@PublicEvolving?

I interpreted @Teoh, Hong <lian...@amazon.co.uk>'s message as we have
identified *three* classes that are currently used by the formats that are
insufficiently scoped that we should promote to @PublicEvolving.

Thanks for the deep dive @Teoh, Hong <lian...@amazon.co.uk>.

> As such, I propose that we:
- Mark the inherently public classes in flink-avro as @PublicEvolving
- Proceed with externalizing flink-avro-glue-schema-registry

+1 to this proposal

Thanks,
Danny

On Tue, Jan 31, 2023 at 12:32 AM Dong Lin <lindon...@gmail.com> wrote:

> Hi Hong,
>
> Thanks for the detailed reply.
>
> It seems that there are roughly 32 public classes in flink-avro that are
> not marked with @PublicEvolving. Some of them are explicitly marked
> as @Internal. Are you suggesting to mark all these classes as
> @PublicEvolving?
>
> I agree it would be nice to mark more classes as @PublicEvolving so that
> projects outside apache/flink* can take advantage of these APIs. On the
> other hand, it is probably useful to make sure that we only expose the
> minimal set of APIs for our target use cases and that all public APIs are
> well-documented.
>
> Typically public APIs are exposed via interface rather than concrete class.
> So it is not clear to me whether we can just mark all those classes
> as @PublicEvolving and still achieve the high-level goal described above.
> Maybe we should follow the typical FLIP process by listing all APIs that
> need to be exposed as public in a FLIP so that other developers will help
> double check those APIs and vote for it, if we decide to add more public
> APIs in Flink.
>
> On the other hand, I still think it is OK for flink-connector-aws (or any
> projects managed by Flink community) to use Internal APIs, for the reasons
> described in my reply to Chesnay's email. We can still try to make more
> APIs public. I just don't think it has to block the effort to
> externalize AWS formats.
>
> What do you think?
>
> Regards,
> Dong
>
>
>
> On Tue, Jan 31, 2023 at 2:18 AM Teoh, Hong <lian...@amazon.co.uk> wrote:
>
> > Hi @Chesnay @Dong Lin,
> >
> > Thanks for flagging up the concerns.
> >
> > I agree that we should make clear which APIs are public in flink-avro so
> > that we can make the connector code more independent. IMO that moves in
> the
> > right direction for Flink in general!
> >
> >
> > 1. Public classes/interfaces in flink-avro
> > - I took a look at the classes in flink-avro that are used in
> > flink-avro-glue-schema-registry, and it turns out those classes are
> already
> > exposed as part of the public interface in
> flink-avro-glue-schema-registry
> > and flink-avro-confluent-registry. (They are already inherently
> > @PublicEvolving (RegistryAvroDeserializationSchema, SchemaCoder,
> > RegistryAvroSerializationSchema), so I recommend that we should just mark
> > them as so for clarity.
> > - The only class that isn't inherently exposed as @PublicEvolving is
> > MutableByteArrayInputStream, but I took a look, and I think we can remove
> > this dependency from flink-avro-glue-schema-registry
> > - Also, while looking, I found that there are a couple of classes that
> are
> > recommended for users in our doc [1] but are not marked as
> @PublicEvolving
> > . We can probably also fix this by adding the @PublicEvolving annotation
> >
> > 2. Our public API depends on Apache Avro classes
> > - I agree, ideally we don't have this dependency, but since we already
> > inherently expose it to users, we should maintain this for backwards
> > compatibility.
> > - Also, some of the @PublicEvolving classes in flink-avro already have
> > dependencies on Avro classes [2]
> >
> > As such, I propose that we:
> > - Mark the inherently public classes in flink-avro as @PublicEvolving
> > - Proceed with externalizing flink-avro-glue-schema-registry
> >
> >
> > Let me know what you think!
> >
> > Regards,
> > Hong
> >
> > [1] AvroInputFormat
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/avro/
> > [2] AvroFormatOptions
> >
> https://github.com/apache/flink/blob/f75cf38e28143bb36acbc2ee6a5ea9c8852916dd/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroFormatOptions.java
> >
> >
> > On 30/01/2023, 12:53, "Chesnay Schepler" <ches...@apache.org> wrote:
> >
> >     CAUTION: This email originated from outside of the organization. Do
> > not click links or open attachments unless you can confirm the sender and
> > know the content is safe.
> >
> >
> >
> >     On 30/01/2023 13:24, Dong Lin wrote:
> >     > it should be OK for
> >     > apache/flink-connector-aws to use non-public API, similar to how
> > classes
> >     > inside Flink can use APIs marked with @Internal.
> >     >
> >     > We can mark related classes with @Internal as appropriate, without
> >     > requiring any new FLIP.
> >
> >     It's not though, because these APIs don't offer any compatibility
> >     guarantees.
> >
> >     We want to arrive at a state where connectors are fully independent
> of
> >     Flink releases. Using non-public APIs runs counter to that.
> >
> >
> >
>

Reply via email to