jean-baptiste-exorde opened a new issue, #14284:
URL: https://github.com/apache/druid/issues/14284
## Affected Version
Druid 25.0.0 (the stack trace below shows `druid-server-25.0.0.jar`).
## Description
When S3 with SSE-C encryption is used as deep storage, the MiddleManager
nodes write segments successfully, but the Historical nodes fail to download
the segments from S3.
Segment files are visible on S3 and can be downloaded manually. However, when
a Historical node retrieves a segment, it first checks whether the segment file
exists by sending a HEAD request to S3, and that request fails with a 400 error.
I hit the issue with an S3-compatible provider as well as with AWS S3.
Below is an example of the error logs:
```
2023-05-11T21:31:38,373 INFO [SimpleDataSegmentChangeHandler-0]
org.apache.druid.storage.s3.S3DataSegmentPuller - Pulling index at
path[CloudObjectLocation{bucket='xxx', path='xxx'}] to
outDir[/var/druid/segment-cache/xxx]
2023-05-11T21:31:38,374 DEBUG [SimpleDataSegmentChangeHandler-0]
com.amazonaws.request - Sending Request: HEAD https://xxx xxx Headers:
(amz-sdk-invocation-id: xxx, Content-Type: application/octet-stream,
User-Agent: aws-sdk-java/1.12.317 Linux/5.4.0-124-generic
OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 java/11.0.16
vendor/Debian cfg/retry-mode/legacy, )
2023-05-11T21:31:38,375 DEBUG [SimpleDataSegmentChangeHandler-0]
com.amazonaws.auth.AWS4Signer - AWS4 Canonical Request: '"HEAD
xxx
amz-sdk-invocation-id:xxx
amz-sdk-request:ttl=20230511T213228Z;attempt=1;max=4
amz-sdk-retry:0/0/500
content-type:application/octet-stream
host:xxx
user-agent:aws-sdk-java/1.12.317 Linux/5.4.0-124-generic
OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 java/11.0.16
vendor/Debian cfg/retry-mode/legacy
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20230511T213138Z
amz-sdk-invocation-id;amz-sdk-request;amz-sdk-retry;content-type;host;user-agent;x-amz-content-sha256;x-amz-date
UNSIGNED-PAYLOAD"
2023-05-11T21:31:38,375 DEBUG [SimpleDataSegmentChangeHandler-0]
com.amazonaws.auth.AWS4Signer - AWS4 String to Sign: '"AWS4-HMAC-SHA256
20230511T213138Z
20230511/gra/s3/aws4_request
xxx"
...
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> HEAD xxx HTTP/1.1
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> Host: xxx
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> amz-sdk-invocation-id: xxx
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> amz-sdk-request:
ttl=20230511T213228Z;attempt=1;max=4
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> amz-sdk-retry: 0/0/500
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> Authorization: AWS4-HMAC-SHA256
Credential=xxx
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> Content-Type:
application/octet-stream
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> User-Agent: aws-sdk-java/1.12.317
Linux/5.4.0-124-generic OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1
java/11.0.16 vendor/Debian cfg/retry-mode/legacy
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> x-amz-content-sha256:
UNSIGNED-PAYLOAD
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> X-Amz-Date: 20230511T213138Z
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.headers - http-outgoing-0 >> Connection: Keep-Alive
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "HEAD xxx HTTP/1.1[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "Host: xxx[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-invocation-id: xxx[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-request:
ttl=20230511T213228Z;attempt=1;max=4[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-retry: 0/0/500[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "Authorization: AWS4-HMAC-SHA256
Credential=xxx[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "Content-Type:
application/octet-stream[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "User-Agent: aws-sdk-java/1.12.317
Linux/5.4.0-124-generic OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1
java/11.0.16 vendor/Debian cfg/retry-mode/legacy[\r][\n]"
2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "x-amz-content-sha256:
UNSIGNED-PAYLOAD[\r][\n]"
2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "X-Amz-Date: 20230511T213138Z[\r][\n]"
2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 >> "[\r][\n]"
2023-05-11T21:31:38,401 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "HTTP/1.1 400 Bad Request[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "Content-Type:
application/xml[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "x-amz-id-2: xxx[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "x-amz-request-id: xxx[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "X-Trans-Id: xxx[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "X-Openstack-Request-Id: xxx[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "Date: Thu, 11 May 2023 21:31:38
GMT[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "Connection: keep-alive[\r][\n]"
2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0]
org.apache.http.wire - http-outgoing-0 << "[\r][\n]"
...
2023-05-11T21:31:38,404 DEBUG [SimpleDataSegmentChangeHandler-0]
com.amazonaws.request - Received error response:
com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon
S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: xxx; S3 Extended
Request ID: xxx; Proxy: null), S3 Extended Request ID: xxx
2023-05-11T21:31:38,405 ERROR [SimpleDataSegmentChangeHandler-0]
org.apache.druid.segment.loading.SegmentLocalCacheManager - Failed to load
segment in current location [/var/druid/segment-cache], try next location if
any: {class=org.apache.druid.segment.loading.SegmentLocalCacheManager,
exceptionType=class org.apache.druid.segment.loading.SegmentLoadingException,
exceptionMessage=S3 fail! Key[CloudObjectLocation{bucket='xxx', path='xxx'}],
location=/var/druid/segment-cache}
org.apache.druid.segment.loading.SegmentLoadingException: S3 fail!
Key[CloudObjectLocation{bucket='xxx', path='xxx'}]
at
org.apache.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:292)
~[?:?]
at
org.apache.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:80)
~[?:?]
at
org.apache.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:61) ~[?:?]
at
org.apache.druid.segment.loading.SegmentLocalCacheManager.loadInLocation(SegmentLocalCacheManager.java:343)
~[druid-server-25.0.0.jar:25.0.0]
...
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request
(Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID:
xxx; S3 Extended Request ID: xxx; Proxy: null)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
~[aws-java-sdk-core-1.12.317.jar:?]
...
```
### Steps to reproduce the issue
To reproduce the issue, we can deploy a simple cluster following the
[quickstart](https://druid.apache.org/docs/latest/tutorials/index.html).
1. Create an AWS S3 bucket and get credentials.
2. Download Druid:
```bash
wget https://dlcdn.apache.org/druid/25.0.0/apache-druid-25.0.0-bin.tar.gz
tar -xzf apache-druid-25.0.0-bin.tar.gz
```
3. Edit the file
`apache-druid-25.0.0/conf/druid/auto/_common/common.runtime.properties` to
set up S3 deep storage:
```diff
#...
-druid.extensions.loadList=["druid-hdfs-storage",
"druid-kafka-indexing-service", "druid-datasketches", "druid-multi-stage-query"]
+druid.extensions.loadList=["druid-s3-extensions"]
#...
-druid.storage.type=local
-druid.storage.storageDirectory=var/druid/segments
+druid.storage.type=s3
+druid.storage.bucket=<bucket-name>
+druid.storage.baseKey=druid/segments
+druid.s3.accessKey=<access-key>
+druid.s3.secretKey=<secret-key>
+druid.storage.sse.type=custom
+druid.storage.sse.custom.base64EncodedKey=<encryption-key>
+druid.s3.endpoint.url=<endpoint>
+druid.storage.disableAcl=true
# ...
```
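For context on step 3: `druid.storage.sse.custom.base64EncodedKey` must be a
base64-encoded 256-bit key. A minimal way to generate one using only the JDK
(the class name is mine, not part of Druid):

```java
import java.security.SecureRandom;
import java.util.Base64;

// Generate a random 256-bit key and print it base64-encoded, suitable as the
// value of druid.storage.sse.custom.base64EncodedKey.
public class GenerateSseKey
{
  public static void main(String[] args)
  {
    byte[] key = new byte[32]; // SSE-C requires a 256-bit (32-byte) key
    new SecureRandom().nextBytes(key);
    System.out.println(Base64.getEncoder().encodeToString(key));
  }
}
```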
4. Edit the file `apache-druid-25.0.0/conf/druid/auto/_common/log4j2.xml` to
enable debug logs:
```diff
# ...
<Loggers>
- <Root level="info">
+ <Root level="debug">
<AppenderRef ref="FileAppender"/>
</Root>
# ...
```
5. Start Apache Druid:
```bash
cd apache-druid-25.0.0
AWS_REGION=<REGION> ./bin/start-druid
```
6. Load data following the
[quickstart](https://druid.apache.org/docs/latest/tutorials/tutorial-batch.html).
Result:
- The ingestion task runs successfully and the segment is visible on S3.
- From the CLI or the [web
console](http://localhost:8888/unified-console.html#datasources), we can check
the state of the published segment: it is unavailable.
- The logs of the Historical node contain the error mentioned above.
## Possible cause
In the logs, we can see a HEAD request sent to fetch the object's metadata.
This request does not contain the encryption key. According to the [AWS
documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html#specifying-s3-c-encryption),
the key must be provided when requesting the metadata of an SSE-C encrypted
object. This is likely the cause of the 400 error.
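For reference, SSE-C requires three headers on every request that touches the
object, including HEAD. A small standalone sketch (JDK only, with a throwaway
all-zero key) showing how they are derived from the base64-encoded key:

```java
import java.security.MessageDigest;
import java.util.Base64;

// Sketch of the three headers that SSE-C requires on every request touching
// the object, including the HEAD (GetObjectMetadata) request. The key below
// is a throwaway example; in Druid it would come from
// druid.storage.sse.custom.base64EncodedKey.
public class SseCHeadersDemo
{
  public static void main(String[] args) throws Exception
  {
    String base64Key = Base64.getEncoder().encodeToString(new byte[32]); // example key

    // MD5 of the *raw* key bytes, base64-encoded: S3 uses it as an integrity check.
    byte[] rawKey = Base64.getDecoder().decode(base64Key);
    String keyMd5 = Base64.getEncoder()
        .encodeToString(MessageDigest.getInstance("MD5").digest(rawKey));

    System.out.println("x-amz-server-side-encryption-customer-algorithm: AES256");
    System.out.println("x-amz-server-side-encryption-customer-key: " + base64Key);
    System.out.println("x-amz-server-side-encryption-customer-key-MD5: " + keyMd5);
  }
}
```

None of these headers appear in the HEAD request in the logs above, which
matches the 400 response.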
The stack trace shows that the error occurs when the node checks whether the
file exists:
```
at
org.apache.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:292)
```
Here is the chain of calls from that stack frame down to the error:
-
[S3DataSegmentPuller.isObjectInBucket()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentPuller.java#L284)
-
[S3Utils.isObjectInBucketIgnoringPermission()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3Utils.java#L110)
-
[ServerSideEncryptingAmazonS3.doesObjectExist()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/ServerSideEncryptingAmazonS3.java#L77)
-
[AmazonS3Client.doesObjectExist()](https://github.com/aws/aws-sdk-java/blob/96a0c2f8f11842c7e58498dbf294173fe2356221/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1423)
of the AWS S3 SDK.
-
[AmazonS3Client.getObjectMetadata()](https://github.com/aws/aws-sdk-java/blob/96a0c2f8f11842c7e58498dbf294173fe2356221/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1345)
of the AWS S3 SDK.
The last method sends the HEAD request without the SSE parameters, which
raises the error.
In the class
[ServerSideEncryptingAmazonS3](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/ServerSideEncryptingAmazonS3.java),
we can see that most AWS S3 SDK calls are wrapped to add the SSE parameters,
but the method `doesObjectExist` is not. So I think the AWS S3 SDK is called
without any information about the SSE parameters and queries the S3 API as if
the objects were not encrypted.
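Looking at the AWS SDK v1 API, `GetObjectMetadataRequest` accepts an SSE-C key,
so a possible fix would be to wrap `doesObjectExist` the same way the other
calls are wrapped. A hypothetical sketch, not the actual Druid code (the fields
`amazonS3` and `sseCustomerKey` are illustrative):

```java
public boolean doesObjectExist(String bucket, String objectName)
{
  try {
    // Attach the configured SSE-C key so the HEAD request carries the
    // x-amz-server-side-encryption-customer-* headers.
    final GetObjectMetadataRequest request =
        new GetObjectMetadataRequest(bucket, objectName)
            .withSSECustomerKey(sseCustomerKey);
    amazonS3.getObjectMetadata(request);
    return true;
  }
  catch (AmazonS3Exception e) {
    if (e.getStatusCode() == 404) {
      return false; // object genuinely absent
    }
    throw e; // anything else (including 400) is a real error
  }
}
```

This mirrors what the SDK's own `AmazonS3Client.doesObjectExist()` does
internally, except the SSE-C key is attached before the HEAD request is signed
and sent.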
Can you tell me whether you see the same issue, and whether this is a bug or a
configuration problem?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]