At the moment my use case is a local indexing task ("index_hadoop", but without a
remote Hadoop cluster) with an ioConfig of type "hadoop" and an input path pointing to
an "s3a://bucket/..." location, all running on AWS VMs. The feature I want to
use is role-based access to S3, which allows reading data from storage using
credentials taken from the environment (the EC2 VM). This makes the s3a implementation
call out to the AWS SDK, and a version incompatibility arises, manifesting as an
exception very similar to the one mentioned here, i.e.:
```
Caused by: java.lang.NoSuchMethodError: com.amazonaws.AmazonWebServiceRequest.copyPrivateRequestParameters()Ljava/util/Map;
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3506) ~[?:?]
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031) ~[?:?]
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297) ~[?:?]
```
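For context, a minimal sketch of the kind of ioConfig I mean (the bucket and path
are placeholders, not my real ones); the point is that no fs.s3a.access.key /
fs.s3a.secret.key properties are set, so s3a falls back to the EC2
instance-profile credentials via the AWS SDK:
```json
{
  "ioConfig": {
    "type": "hadoop",
    "inputSpec": {
      "type": "static",
      "paths": "s3a://bucket/path/to/input/*.gz"
    }
  }
}
```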
I thought that using a newer hadoop-aws.jar would help, but the oldest version
newer than 2.7, that is 2.9, didn't help by itself; it just produced other errors
due to incompatibilities between different jars. In hindsight, maybe the
culprit is not hadoop-aws.jar, but simply the fact that hadoop-dependencies/2.7
uses too old an aws-java-sdk, and converging it on the same version as the one placed
in extensions/druid-hdfs-storage/ would be enough (though that will likely require
bringing several other jars in sync between those two directories).
I could try setting ``mapreduce.job.classloader = true`` (see the sketch below)
and/or just upgrading the AWS SDK.
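For reference, a minimal sketch of how that flag could be passed through, assuming
the standard jobProperties map of the Hadoop ingestion spec's tuningConfig (the
fragment is illustrative, not taken from an actual working spec):
```json
{
  "tuningConfig": {
    "type": "hadoop",
    "jobProperties": {
      "mapreduce.job.classloader": "true"
    }
  }
}
```
This should isolate the job's classpath (including its aws-java-sdk) from Druid's
own, which is why it might sidestep the NoSuchMethodError above.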