jackye1995 commented on pull request #1573:
URL: https://github.com/apache/iceberg/pull/1573#issuecomment-708513468


   > > Has there been any consideration to using the AWS Java SDK v2? I know 
that @jacques-n mentioned they found the java v2 Async s3 SDK buggy, but the 
linked code from their project is using the AWS Java SDK v2 (all of the imports 
start with `software.amazon`).
   > > To me it seems like it would be smarter to start on the newer client 
version than have to do an upgrade later. My understanding is that the Java SDK 
V2 is much more performant for most things, as it's the one seeing most of the 
work. I don't doubt @jacques-n's performance / bug issues with the 
java sdk v2 async s3 client, but I would ask when that was. I've personally 
noticed that when new clients and new services are brought out by amazon, 
they're not always production ready from the start. But many times I've found 
that things we performance tested 6 months prior were much more performant / 
resilient later on.
   > 
   > I would say that I've also had issues when exploring the v2 sdk, but more 
in terms of completeness of the implementation. For example, they don't have a 
transfer manager (not that we're using it here), but if we decide to go that 
route, we would need to go back to v1. Also, at this point most other systems 
(Spark, S3A, Presto, etc.) are still on v1 as well. If there are documented 
performance gains or other features in v2, I'd be happy to upgrade, but it seems like 
the community hasn't really moved in that direction yet.
   
   The SDK v2 is designed to coexist with v1, because some older packages 
such as S3AFileSystem might never upgrade. That is why the two SDKs use 
completely different package namespaces (`com.amazonaws` for v1, 
`software.amazon.awssdk` for v2), so both can sit on the same classpath without 
any dependency conflicts to resolve.
   
   All the new features related to the client itself will only be developed in 
v2, so it is always recommended to use the v2 client when possible for new 
projects. There is a 
[blog](https://aws.amazon.com/blogs/developer/tag/aws-sdk-java-v2/) that is 
dedicated to new features added to v2.
   
   For performance, v2 includes optimizations for users in the AWS Lambda 
environment, described in [this 
doc](https://docs.aws.amazon.com/sdk-for-java/v2/developer-guide/client-configuration-starttime.html).
 No performance benchmark has been done for HTTP calls, but since v2 supports 
HTTP/2, it should be faster when the service enables HTTP/2 traffic.
   
   From a feature perspective, yes, the transfer manager is not there, but for 
Iceberg the most important feature should be multipart upload, which is 
available, so I see many more benefits in using v2 than v1.
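   
   To make the multipart point concrete, here is a minimal sketch of the part 
planning that any multipart upload needs, independent of SDK version. The 
`MultipartPlanner` and `Part` names are hypothetical and not part of Iceberg or 
either SDK. The sketch assumes the documented S3 limits: at most 10,000 parts 
and a minimum part size of 5 MiB for every part except the last. With v2, each 
planned range would then be sent through `S3Client.uploadPart`, between 
`createMultipartUpload` and `completeMultipartUpload`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits an object into part ranges for a multipart upload.
public class MultipartPlanner {
    // Assumed S3 limits: 5 MiB minimum part size (except the last part), 10,000 parts.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024;
    static final int MAX_PARTS = 10_000;

    // One contiguous byte range of the object; part numbers start at 1.
    static final class Part {
        final int partNumber;
        final long offset;
        final long length;

        Part(int partNumber, long offset, long length) {
            this.partNumber = partNumber;
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "Part{number=" + partNumber + ", offset=" + offset + ", length=" + length + "}";
        }
    }

    static List<Part> plan(long objectSize, long partSize) {
        if (objectSize <= 0) {
            throw new IllegalArgumentException("objectSize must be positive");
        }
        if (partSize < MIN_PART_SIZE) {
            throw new IllegalArgumentException("partSize is below the 5 MiB minimum");
        }
        // Ceiling division gives the number of parts needed.
        long partCount = (objectSize + partSize - 1) / partSize;
        if (partCount > MAX_PARTS) {
            throw new IllegalArgumentException("too many parts; use a larger partSize");
        }
        List<Part> parts = new ArrayList<>();
        long offset = 0;
        for (int number = 1; offset < objectSize; number++) {
            // Only the last part may be shorter than partSize.
            long length = Math.min(partSize, objectSize - offset);
            parts.add(new Part(number, offset, length));
            offset += length;
        }
        return parts;
    }

    public static void main(String[] args) {
        // A 20 MiB object with 8 MiB parts.
        for (Part part : plan(20L * 1024 * 1024, 8L * 1024 * 1024)) {
            System.out.println(part);
        }
    }
}
```

For a 20 MiB object and 8 MiB parts this yields three parts, the last one 
4 MiB long.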




