Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-18 Thread Alexey Romanenko
Hi David,

Thank you for your feedback; it sounds totally reasonable to me. I agree that 
before deprecating any of the AWS V1 connectors we have to make sure that the 
V2 version can completely replace the previous one.

> On 17 Sep 2020, at 19:19, David Hollands  wrote:
> 
> Hi Alexey –
>  
> As relatively new users of Beam, we recently selected v1 over v2 because we 
> didn’t think v2 currently (as of 2.24.0-snapshot) had feature parity, 
> especially given the lack of a v2-based S3FileSystem and KinesisIO.Write.
>  
> Ideally we would have selected v2. 
>  
> On a related note, and not really Beam’s problem, but if I remember rightly, 
> we also had a bit of trouble creating some LocalStack testcontainers based 
> integration tests with v2…
>  
> Cheers, David
>  
> David Hollands
> Audience Platform – Audience Data Engineering
> david.holla...@bbc.co.uk
> BC5 C5, BBC Broadcast Centre, London, W12 7TQ
>  
> From: Alexey Romanenko
> Reply to: "user@beam.apache.org"
> Date: Tuesday, 15 September 2020 at 17:06
> To: "user@beam.apache.org" 
> Subject: Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors
>  
> I just want to cross-post this on users@ to find out which version of the 
> AWS SDK connectors is most used in user applications, and whether there are 
> any strong objections to switching mostly to AWS SDK v2.
>  
> Thank you for any feedback in advance.
> 
> 
> On 11 Sep 2020, at 19:13, Alexey Romanenko wrote:
>  
> Hello,
>  
> In Beam, there are two versions of the AWS IO connectors for the Java SDK, 
> based on AWS SDK v1 [1] and v2 [2]. For now, they are pretty equal in terms 
> of functionality, but since AWS SDK v2 is more modern (it's a major rewrite 
> of the version 1.x code base, it’s built on top of Java 8+ and adds more 
> features [3]), it would be more logical to use only V2. Also, it’s not 
> reasonable to support two versions of similar connectors: it’s a big pain 
> for us, and having one would make it clearer for users which package of AWS 
> connectors to use.
>  
> Accordingly, I’d propose to deprecate all Java AWS IO V1 connectors (+ 
> KinesisIO, which is in a different package for now) starting from Beam 2.25 
> and then add new features only to the V2 connectors. Bug fixes should be 
> applied to the V2 connectors first, and to the V1 connectors only if 
> necessary.
>  
> What are the community thoughts on this? Any pros and cons that I'm missing?
>  
>  
> [1] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services
> [2] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services2
> [3] https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/welcome.html


Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-17 Thread David Hollands
Hi Alexey –

As relatively new users of Beam, we recently selected v1 over v2 because we 
didn’t think v2 currently (as of 2.24.0-snapshot) had feature parity, 
especially given the lack of a v2-based S3FileSystem and KinesisIO.Write.

Ideally we would have selected v2.

On a related note, and not really Beam’s problem, but if I remember rightly, we 
also had a bit of trouble creating some LocalStack testcontainers based 
integration tests with v2…

Cheers, David

David Hollands
Audience Platform – Audience Data Engineering
david.holla...@bbc.co.uk
BC5 C5, BBC Broadcast Centre, London, W12 7TQ

From: Alexey Romanenko 
Reply to: "user@beam.apache.org" 
Date: Tuesday, 15 September 2020 at 17:06
To: "user@beam.apache.org" 
Subject: Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

I just want to cross-post this on users@ to find out which version of the AWS 
SDK connectors is most used in user applications, and whether there are any 
strong objections to switching mostly to AWS SDK v2.

Thank you for any feedback in advance.


On 11 Sep 2020, at 19:13, Alexey Romanenko wrote:

Hello,

In Beam, there are two versions of the AWS IO connectors for the Java SDK, 
based on AWS SDK v1 [1] and v2 [2]. For now, they are pretty equal in terms of 
functionality, but since AWS SDK v2 is more modern (it's a major rewrite of the 
version 1.x code base, it’s built on top of Java 8+ and adds more features 
[3]), it would be more logical to use only V2. Also, it’s not reasonable to 
support two versions of similar connectors: it’s a big pain for us, and having 
one would make it clearer for users which package of AWS connectors to use.

Accordingly, I’d propose to deprecate all Java AWS IO V1 connectors (+ 
KinesisIO, which is in a different package for now) starting from Beam 2.25 and 
then add new features only to the V2 connectors. Bug fixes should be applied to 
the V2 connectors first, and to the V1 connectors only if necessary.

What are the community thoughts on this? Any pros and cons that I'm missing?


[1] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services
[2] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services2
[3] https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/welcome.html





Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-16 Thread Alexey Romanenko
Thanks Ismael and Robert for your thoughts on this.

I’m going to check the parity and see if we need to make any changes on this.

PS: Sorry, I screwed up the subject of this thread (“v2” instead of “v1”); it 
definitely should be “Deprecation of AWS SDK v1 IO connectors” (thanks to 
Ismael for pointing this out).

> On 16 Sep 2020, at 02:19, Robert Bradshaw  wrote:
> 
> Thanks for clarifying the state of things. +1 to deprecating once we have 
> parity. If the v2 ones are better, perhaps a blog post would be a good way to 
> advertise (and document) their existence and advantages too. 
> 
> On Tue, Sep 15, 2020 at 2:15 PM Ismaël Mejía wrote:
> The reason most people are using the AWSv1 IOs is probably that they have
> been in Beam since 2017, whereas the AWSv2 ones were only added in the last
> year.
> 
> Alexey mentions that maintaining both versions is becoming painful, and I
> would like to expand on that. We now have duplicated work for new features:
> for example, someone contributing a small improvement does it in one of the
> two versions and we try to encourage them to do it in both. This causes
> general confusion, and lots of extra work goes into keeping the two aligned.
> For more complex IOs like Kinesis this might prove harder in the future.
> 
> Notice that the migration path is incremental because users can have both
> Amazon SDKs in the same classpath without conflicts. And Alexey's proposal is
> about deprecating the AWSv1 IOs to reduce the maintenance burden, not removing
> them from the codebase. This could help raise awareness of the AWSv2 IOs so
> that users migrate, and reduce the extra overhead for contributors and
> maintainers.
> 
> One minor comment on the proposal: if we proceed with this plan, we should
> deprecate a v1 IO ONLY when we have full feature parity in the v2 version.
> I think we don't have a replacement for the AWSv1 S3 IO, so that one should
> not be deprecated.
> 
> On Tue, Sep 15, 2020 at 6:07 PM Robert Bradshaw wrote:
> >
> > The 10x-100x ratio looks like an answer right there about (non-)suitability 
> > for deprecation. The new question would be *why* people are using the v1 
> > APIs. Is it because it was the original, or that it's been around longer, 
> > or it has more features?
> >



Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-15 Thread Robert Bradshaw
Thanks for clarifying the state of things. +1 to deprecating once we have
parity. If the v2 ones are better, perhaps a blog post would be a good way
to advertise (and document) their existence and advantages too.

On Tue, Sep 15, 2020 at 2:15 PM Ismaël Mejía  wrote:

> The reason most people are using the AWSv1 IOs is probably that they have
> been in Beam since 2017, whereas the AWSv2 ones were only added in the last
> year.
>
> Alexey mentions that maintaining both versions is becoming painful, and I
> would like to expand on that. We now have duplicated work for new features:
> for example, someone contributing a small improvement does it in one of the
> two versions and we try to encourage them to do it in both. This causes
> general confusion, and lots of extra work goes into keeping the two aligned.
> For more complex IOs like Kinesis this might prove harder in the future.
>
> Notice that the migration path is incremental because users can have both
> Amazon SDKs in the same classpath without conflicts. And Alexey's proposal is
> about deprecating the AWSv1 IOs to reduce the maintenance burden, not removing
> them from the codebase. This could help raise awareness of the AWSv2 IOs so
> that users migrate, and reduce the extra overhead for contributors and
> maintainers.
>
> One minor comment on the proposal: if we proceed with this plan, we should
> deprecate a v1 IO ONLY when we have full feature parity in the v2 version.
> I think we don't have a replacement for the AWSv1 S3 IO, so that one should
> not be deprecated.
>
> On Tue, Sep 15, 2020 at 6:07 PM Robert Bradshaw wrote:
> >
> > The 10x-100x ratio looks like an answer right there about (non-)suitability
> > for deprecation. The new question would be *why* people are using the v1
> > APIs. Is it because it was the original, or that it's been around longer,
> > or it has more features?
> >
>


Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-15 Thread Ismaël Mejía
The reason most people are using the AWSv1 IOs is probably that they have been
in Beam since 2017, whereas the AWSv2 ones were only added in the last year.

Alexey mentions that maintaining both versions is becoming painful, and I would
like to expand on that. We now have duplicated work for new features: for
example, someone contributing a small improvement does it in one of the two
versions and we try to encourage them to do it in both. This causes general
confusion, and lots of extra work goes into keeping the two aligned. For more
complex IOs like Kinesis this might prove harder in the future.

Notice that the migration path is incremental because users can have both Amazon
SDKs in the same classpath without conflicts. And Alexey's proposal is about
deprecating the AWSv1 IOs to reduce the maintenance burden, not removing them
from the codebase. This could help raise awareness of the AWSv2 IOs so that
users migrate, and reduce the extra overhead for contributors and maintainers.

One minor comment on the proposal: if we proceed with this plan, we should
deprecate a v1 IO ONLY when we have full feature parity in the v2 version.
I think we don't have a replacement for the AWSv1 S3 IO, so that one should not
be deprecated.

On Tue, Sep 15, 2020 at 6:07 PM Robert Bradshaw  wrote:
>
> The 10x-100x ratio looks like an answer right there about (non-)suitability 
> for deprecation. The new question would be *why* people are using the v1 
> APIs. Is it because it was the original, or that it's been around longer, or 
> it has more features?
>


Re: [DISCUSS] Deprecation of AWS SDK v2 IO connectors

2020-09-15 Thread Alexey Romanenko
I just want to cross-post this on users@ to find out which version of the AWS 
SDK connectors is most used in user applications, and whether there are any 
strong objections to switching mostly to AWS SDK v2.

Thank you for any feedback in advance.

> On 11 Sep 2020, at 19:13, Alexey Romanenko  wrote:
> 
> Hello,
> 
> In Beam, there are two versions of the AWS IO connectors for the Java SDK, 
> based on AWS SDK v1 [1] and v2 [2]. For now, they are pretty equal in terms 
> of functionality, but since AWS SDK v2 is more modern (it's a major rewrite 
> of the version 1.x code base, it’s built on top of Java 8+ and adds more 
> features [3]), it would be more logical to use only V2. Also, it’s not 
> reasonable to support two versions of similar connectors: it’s a big pain 
> for us, and having one would make it clearer for users which package of AWS 
> connectors to use.
> 
> Accordingly, I’d propose to deprecate all Java AWS IO V1 connectors (+ 
> KinesisIO, which is in a different package for now) starting from Beam 2.25 
> and then add new features only to the V2 connectors. Bug fixes should be 
> applied to the V2 connectors first, and to the V1 connectors only if 
> necessary.
> 
> What are the community thoughts on this? Any pros and cons that I'm missing?
> 
> 
> [1] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services
> [2] https://github.com/apache/beam/tree/master/sdks/java/io/amazon-web-services2
> [3] https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/welcome.html