Hi Danny,

I'm also leaning slightly towards the single AWS connector repo direction.
Bumps in the underlying AWS SDK would bump all of the connectors in any
case. And if a change occurs that is isolated to a single connector, then
those that do not use that connector can just skip the release.

Cheers,
Thomas

On Mon, Oct 24, 2022 at 3:01 PM Teoh, Hong <lian...@amazon.co.uk.invalid> wrote:

> I like the single repo with single version idea.
>
> Pros:
> - Better discoverability for connectors for AWS services means a better
>   experience for Flink users
> - Natural placement of AWS-related utils (Credentials, SDK Retry strategy)
>
> Caveats:
> - As you mentioned, it is not desirable if we have to evolve the major
>   version of the connector just for a change in a single connector (e.g.
>   DynamoDB). However, I think it is reasonable to only evolve the major
>   version of the AWS connector repo when there are Flink Source/Sink API
>   upgrades or AWS SDK major upgrades (probably quite rare). Any new
>   features for individual connectors can be collapsed into minor releases.
> - An additional callout here is that we should be careful adopting any AWS
>   connectors that don't use the AWS SDK directly (e.g. how the Kinesis
>   connector used KPL for a long time). In my opinion, any new connectors
>   like that would be better placed in their own repositories, otherwise we
>   will have a complex mesh of dependencies to manage.
>
> Regards,
> Hong
>
> On 21/10/2022, 16:59, "Danny Cranmer" <dannycran...@apache.org> wrote:
>
> Thanks Chesnay for the suggestion, I will investigate this option.
>
> Related to the single repo idea, I have considered it in the past. Are you
> proposing we also use a single version between all connectors? If we have
> a single version then it makes sense to combine them in a single repo, if
> they are separate versions, then splitting them makes sense.
> This was discussed last year more generally [1] and the consensus was "we
> ultimately propose to have a single repository per connector".
>
> Combining all AWS connectors into a single repo with a single version is
> in line with how the AWS SDK works, therefore AWS users are familiar with
> this approach. However it is frustrating that we would have to release all
> connectors to fix a bug or add a feature in one of them. Example: a user
> is using Kinesis Data Streams only (the most popular and mature
> connector), and we evolve the version from 1.x to 2.y (or 1.x to 1.y) for
> a DynamoDB change.
>
> I am torn and will think some more, but it would be great to hear other
> people's opinions.
>
> [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
>
> Thanks,
> Danny
>
> On Fri, Oct 21, 2022 at 3:11 PM Jing Ge <j...@ververica.com> wrote:
>
> > I agree with Jark. It would be easier for the further development and
> > maintenance, if all AWS-related connectors and the base module are in
> > the same repo. It might make sense to upgrade the
> > flink-connector-dynamodb to flink-connector-aws and move the other
> > modules including the flink-connector-aws-base into it. The AWS SDK
> > could be managed in flink-connector-aws-base. Any future common
> > connector features could also be developed in the base module.
> >
> > Best regards,
> > Jing
> >
> > On Fri, Oct 21, 2022 at 1:26 PM Jark Wu <imj...@gmail.com> wrote:
> >
> >> How about creating a new repository flink-connector-aws and merging
> >> dynamodb, kinesis firehose into it?
> >> This can reduce the maintenance for complex dependencies and make the
> >> release easy.
> >> I think the maintainers of AWS-related connectors are the same people.
> >>
> >> Best,
> >> Jark
> >>
> >> > On Oct 21, 2022, at 17:41, Chesnay Schepler <ches...@apache.org> wrote:
> >> >
> >> > I would not go with 2); I think it'd just be messy.
> >> >
> >> > Here's another option:
> >> >
> >> > Create another repository (aws-connector-base) (following the
> >> > externalization model), add it as a sub-module to the downstream
> >> > repositories, and make it part of the release process of said
> >> > connector.
> >> >
> >> > I.e., we never create a release for aws-connector-base, but release it
> >> > as part of the connector.
> >> > The main benefit here is that we'd always be able to make changes to
> >> > the aws-base code without delaying connector releases.
> >> > I would assume that any added overhead due to _technically_ releasing
> >> > the aws code multiple times is negligible.
> >> >
> >> > On 20/10/2022 22:38, Danny Cranmer wrote:
> >> >> Hello all,
> >> >>
> >> >> Currently we have 2 AWS Flink connectors in the main Flink codebase
> >> >> (Kinesis Data Streams and Kinesis Data Firehose) and one new
> >> >> externalized connector in progress (DynamoDB). Currently all three of
> >> >> these use common AWS utilities from the flink-connector-aws-base
> >> >> module. Common code includes client builders, property keys,
> >> >> validation, utils etc.
> >> >>
> >> >> Once we externalize the connectors, leaving flink-connector-aws-base
> >> >> in the main Flink repository will restrict our ability to evolve the
> >> >> connectors quickly. For example, as part of the DynamoDB connector
> >> >> build we are considering adding a general retry strategy config that
> >> >> can be leveraged by all connectors. We would need to block on Flink
> >> >> 1.17 for this.
> >> >>
> >> >> In the past we have tried to keep the AWS SDK version consistent
> >> >> across connectors, with the externalization this is more likely to
> >> >> diverge.
> >> >>
> >> >> Option 1: I propose we create a new repository, flink-connector-aws,
> >> >> which we can move the flink-connector-aws-base module to and create a
> >> >> new flink-connector-aws-parent to manage SDK versions.
> >> >> Each of the externalized AWS connectors will depend on this new
> >> >> module and parent. Downside is an additional module to release per
> >> >> Flink version, however I will volunteer to manage this.
> >> >>
> >> >> Option 2: We can move the flink-connector-aws-base module and create
> >> >> flink-connector-parent within the flink-connector-shared-utils repo [2]
> >> >>
> >> >> Option 3: We do nothing.
> >> >>
> >> >> For option 1+2 we will follow the general externalized connector
> >> >> versioning strategy and rules.
> >> >>
> >> >> I am inclined towards option 1, and appreciate feedback from the
> >> >> community.
> >> >>
> >> >> [1] https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-aws-base
> >> >> [2] https://github.com/apache/flink-connector-shared-utils
> >> >>
> >> >> Thanks,
> >> >> Danny