jean-baptiste-exorde opened a new issue, #14284:
URL: https://github.com/apache/druid/issues/14284

   ## Affected Version
   
   Druid 25.0.0.
   
   ## Description
   
   When S3 with SSE-C encryption is used as deep storage, the MiddleManager nodes write segments successfully, but the Historical nodes fail to download the segments from S3.
   
   Segment files are visible on S3 and can be downloaded manually. But when a Historical node retrieves a file, it first checks whether the segment file exists by sending a HEAD request to S3, and that request fails with a 400 error.
   
   I had the issue with an S3-compatible provider as well as with AWS S3.
   
   Below is an example of the error logs:
   
   ```
   2023-05-11T21:31:38,373 INFO [SimpleDataSegmentChangeHandler-0] 
org.apache.druid.storage.s3.S3DataSegmentPuller - Pulling index at 
path[CloudObjectLocation{bucket='xxx', path='xxx'}] to 
outDir[/var/druid/segment-cache/xxx]
   2023-05-11T21:31:38,374 DEBUG [SimpleDataSegmentChangeHandler-0] 
com.amazonaws.request - Sending Request: HEAD https://xxx xxx Headers: 
(amz-sdk-invocation-id: xxx, Content-Type: application/octet-stream, 
User-Agent: aws-sdk-java/1.12.317 Linux/5.4.0-124-generic 
OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 java/11.0.16 
vendor/Debian cfg/retry-mode/legacy, )
   2023-05-11T21:31:38,375 DEBUG [SimpleDataSegmentChangeHandler-0] 
com.amazonaws.auth.AWS4Signer - AWS4 Canonical Request: '"HEAD
   xxx
   
   amz-sdk-invocation-id:xxx
   amz-sdk-request:ttl=20230511T213228Z;attempt=1;max=4
   amz-sdk-retry:0/0/500
   content-type:application/octet-stream
   host:xxx
   user-agent:aws-sdk-java/1.12.317 Linux/5.4.0-124-generic 
OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 java/11.0.16 
vendor/Debian cfg/retry-mode/legacy
   x-amz-content-sha256:UNSIGNED-PAYLOAD
   x-amz-date:20230511T213138Z
   
   
amz-sdk-invocation-id;amz-sdk-request;amz-sdk-retry;content-type;host;user-agent;x-amz-content-sha256;x-amz-date
   UNSIGNED-PAYLOAD"
   2023-05-11T21:31:38,375 DEBUG [SimpleDataSegmentChangeHandler-0] 
com.amazonaws.auth.AWS4Signer - AWS4 String to Sign: '"AWS4-HMAC-SHA256
   20230511T213138Z
   20230511/gra/s3/aws4_request
   xxx"
   ...
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> HEAD xxx HTTP/1.1
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> Host: xxx
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> amz-sdk-invocation-id: xxx
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> amz-sdk-request: 
ttl=20230511T213228Z;attempt=1;max=4
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> amz-sdk-retry: 0/0/500
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> Authorization: AWS4-HMAC-SHA256 
Credential=xxx
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> Content-Type: 
application/octet-stream
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> User-Agent: aws-sdk-java/1.12.317 
Linux/5.4.0-124-generic OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 
java/11.0.16 vendor/Debian cfg/retry-mode/legacy
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> x-amz-content-sha256: 
UNSIGNED-PAYLOAD
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> X-Amz-Date: 20230511T213138Z
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.headers - http-outgoing-0 >> Connection: Keep-Alive
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "HEAD xxx HTTP/1.1[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "Host: xxx[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-invocation-id: xxx[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-request: 
ttl=20230511T213228Z;attempt=1;max=4[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "amz-sdk-retry: 0/0/500[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "Authorization: AWS4-HMAC-SHA256 
Credential=xxx[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "Content-Type: 
application/octet-stream[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "User-Agent: aws-sdk-java/1.12.317 
Linux/5.4.0-124-generic OpenJDK_64-Bit_Server_VM/11.0.16+8-post-Debian-1deb11u1 
java/11.0.16 vendor/Debian cfg/retry-mode/legacy[\r][\n]"
   2023-05-11T21:31:38,378 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "x-amz-content-sha256: 
UNSIGNED-PAYLOAD[\r][\n]"
   2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "X-Amz-Date: 20230511T213138Z[\r][\n]"
   2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"
   2023-05-11T21:31:38,379 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 >> "[\r][\n]"
   2023-05-11T21:31:38,401 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "HTTP/1.1 400 Bad Request[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "Content-Type: 
application/xml[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "x-amz-id-2: xxx[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "x-amz-request-id: xxx[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "X-Trans-Id: xxx[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "X-Openstack-Request-Id: xxx[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "Date: Thu, 11 May 2023 21:31:38 
GMT[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "Connection: keep-alive[\r][\n]"
   2023-05-11T21:31:38,402 DEBUG [SimpleDataSegmentChangeHandler-0] 
org.apache.http.wire - http-outgoing-0 << "[\r][\n]"
   ...
   2023-05-11T21:31:38,404 DEBUG [SimpleDataSegmentChangeHandler-0] 
com.amazonaws.request - Received error response: 
com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon 
S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: xxx; S3 Extended 
Request ID: xxx; Proxy: null), S3 Extended Request ID: xxx
   2023-05-11T21:31:38,405 ERROR [SimpleDataSegmentChangeHandler-0] 
org.apache.druid.segment.loading.SegmentLocalCacheManager - Failed to load 
segment in current location [/var/druid/segment-cache], try next location if 
any: {class=org.apache.druid.segment.loading.SegmentLocalCacheManager, 
exceptionType=class org.apache.druid.segment.loading.SegmentLoadingException, 
exceptionMessage=S3 fail! Key[CloudObjectLocation{bucket='xxx', path='xxx'}], 
location=/var/druid/segment-cache}
   org.apache.druid.segment.loading.SegmentLoadingException: S3 fail! 
Key[CloudObjectLocation{bucket='xxx', path='xxx'}]
           at 
org.apache.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:292)
 ~[?:?]
           at 
org.apache.druid.storage.s3.S3DataSegmentPuller.getSegmentFiles(S3DataSegmentPuller.java:80)
 ~[?:?]
           at 
org.apache.druid.storage.s3.S3LoadSpec.loadSegment(S3LoadSpec.java:61) ~[?:?]
           at 
org.apache.druid.segment.loading.SegmentLocalCacheManager.loadInLocation(SegmentLocalCacheManager.java:343)
 ~[druid-server-25.0.0.jar:25.0.0]
           ...
   Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request 
(Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 
xxx; S3 Extended Request ID: xxx; Proxy: null)
           at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
 ~[aws-java-sdk-core-1.12.317.jar:?]
           ...
   ```
   
   ### Steps to reproduce the issue
   
   To reproduce the issue, we can deploy a simple cluster following the 
[quickstart](https://druid.apache.org/docs/latest/tutorials/index.html).
   
   1. Create an AWS S3 bucket and get credentials.
   2. Download Druid:
   
      ```bash
      wget https://dlcdn.apache.org/druid/25.0.0/apache-druid-25.0.0-bin.tar.gz
      tar -xzf apache-druid-25.0.0-bin.tar.gz
      ```
   
   3. Edit file `apache-druid-25.0.0/conf/druid/auto/_common/common.runtime.properties` to set up S3 deep storage:
   
      ```diff
      #...
   
      -druid.extensions.loadList=["druid-hdfs-storage", 
"druid-kafka-indexing-service", "druid-datasketches", "druid-multi-stage-query"]
      +druid.extensions.loadList=["druid-s3-extensions"]
   
      #...
   
      -druid.storage.type=local
      -druid.storage.storageDirectory=var/druid/segments
      +druid.storage.type=s3
      +druid.storage.bucket=<bucket-name>
      +druid.storage.baseKey=druid/segments
      +druid.s3.accessKey=<access-key>
      +druid.s3.secretKey=<secret-key>
      +druid.storage.sse.type=custom
      +druid.storage.sse.custom.base64EncodedKey=<encryption-key>
      +druid.s3.endpoint.url=<endpoint>
      +druid.storage.disableAcl=true
   
      # ...
      ```
   
   4. Edit file `apache-druid-25.0.0/conf/druid/auto/_common/log4j2.xml` to 
enable debug logs:
   
      ```diff
      # ...
      <Loggers>
      -   <Root level="info">
      +   <Root level="debug">
             <AppenderRef ref="FileAppender"/>
          </Root>
      # ...
      ```
   
   5. Start Apache Druid:
   
      ```bash
      cd apache-druid-25.0.0
      AWS_REGION=<REGION> ./bin/start-druid
      ```
   
   6. Load data following the 
[quickstart](https://druid.apache.org/docs/latest/tutorials/tutorial-batch.html).
   
   Result:
   
   - The ingestion task runs successfully and the segment is visible on S3.
   - From the CLI or the [web 
console](http://localhost:8888/unified-console.html#datasources), we can check 
the state of the published segment: it is unavailable.
   - The Historical node's logs contain the error mentioned above.
   
   ## Possible cause
   
   In the logs, we can see a HEAD request to get the object's metadata. This request doesn't contain the SSE-C encryption key headers. As per the [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html#specifying-s3-c-encryption), providing them is mandatory when requesting the metadata of an SSE-C-encrypted object. This might be the cause of the 400 error.
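   For reference, SSE-C requires three extra headers on every request that reads the object, HEAD included. Below is a minimal sketch (plain Java, no AWS SDK) of how these headers are derived from the base64 key configured in `druid.storage.sse.custom.base64EncodedKey`; the header names come from the AWS documentation linked above:

   ```java
   import java.security.MessageDigest;
   import java.security.NoSuchAlgorithmException;
   import java.util.Base64;
   import java.util.LinkedHashMap;
   import java.util.Map;

   public class SseCHeaders
   {
     // Builds the three headers that SSE-C mandates on GET/HEAD requests.
     public static Map<String, String> sseCustomerHeaders(String base64Key)
     {
       try {
         byte[] rawKey = Base64.getDecoder().decode(base64Key);
         // The key-MD5 header is the base64-encoded MD5 digest of the raw key bytes.
         String keyMd5 = Base64.getEncoder()
             .encodeToString(MessageDigest.getInstance("MD5").digest(rawKey));
         Map<String, String> headers = new LinkedHashMap<>();
         headers.put("x-amz-server-side-encryption-customer-algorithm", "AES256");
         headers.put("x-amz-server-side-encryption-customer-key", base64Key);
         headers.put("x-amz-server-side-encryption-customer-key-MD5", keyMd5);
         return headers;
       }
       catch (NoSuchAlgorithmException e) {
         throw new RuntimeException(e);
       }
     }
   }
   ```

   None of these headers appear in the HEAD request captured in the debug logs above, which matches the documented 400 behavior.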
   
   The stack trace confirms that the error occurs when the node checks whether the file exists:
   
   ```
   at 
org.apache.druid.storage.s3.S3DataSegmentPuller.isObjectInBucket(S3DataSegmentPuller.java:292)
   ```
   
   Here are the calls made from this stack trace to the error:
   
   - 
[S3DataSegmentPuller.isObjectInBucket()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentPuller.java#L284)
     - 
[S3Utils.isObjectInBucketIgnoringPermission()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3Utils.java#L110)
       - 
[ServerSideEncryptingAmazonS3.doesObjectExist()](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/ServerSideEncryptingAmazonS3.java#L77)
         - 
[AmazonS3Client.doesObjectExist()](https://github.com/aws/aws-sdk-java/blob/96a0c2f8f11842c7e58498dbf294173fe2356221/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1423)
 of the AWS S3 SDK.
           - 
[AmazonS3Client.getObjectMetadata()](https://github.com/aws/aws-sdk-java/blob/96a0c2f8f11842c7e58498dbf294173fe2356221/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1345)
 of the AWS S3 SDK.
   
   The last method sends the HEAD request without the SSE-C parameters, which raises the error.
   
   In the class [ServerSideEncryptingAmazonS3](https://github.com/apache/druid/blob/25.0.0/extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/ServerSideEncryptingAmazonS3.java), we can see that most AWS S3 SDK calls are wrapped to add the SSE parameters, but the method `doesObjectExist` is not wrapped. So I think the AWS S3 SDK is called without any information about the SSE parameters, and it queries the S3 API as if the objects were not encrypted.
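   A possible fix would be for the wrapper to answer the existence check itself through its already-decorated metadata path, instead of delegating to the SDK's bare `doesObjectExist`. The sketch below illustrates the pattern in plain Java; the interfaces and names (`RawS3Client`, `decorateHead`, etc.) are hypothetical stand-ins, not Druid's or the SDK's actual APIs:

   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical stand-in for the low-level S3 client: performs a HEAD
   // request with the given headers and throws if S3 rejects it.
   interface RawS3Client
   {
     Map<String, String> headObject(String bucket, String key, Map<String, String> headers);
   }

   // Hypothetical stand-in for Druid's SSE decoration: adds the SSE-C
   // headers (algorithm, key, key-MD5) to an outgoing request.
   interface ServerSideEncryption
   {
     Map<String, String> decorateHead(Map<String, String> headers);
   }

   class SseAwareS3
   {
     private final RawS3Client client;
     private final ServerSideEncryption sse;

     SseAwareS3(RawS3Client client, ServerSideEncryption sse)
     {
       this.client = client;
       this.sse = sse;
     }

     // Instead of forwarding to the SDK's doesObjectExist (which sends a bare
     // HEAD), route the existence check through the SSE-decorated call.
     boolean doesObjectExist(String bucket, String key)
     {
       try {
         client.headObject(bucket, key, sse.decorateHead(new HashMap<>()));
         return true;
       }
       catch (RuntimeException notFound) {
         return false;
       }
     }
   }
   ```

   In the real code this would presumably mean implementing `doesObjectExist` inside `ServerSideEncryptingAmazonS3` on top of its decorated `getObjectMetadata` call, rather than forwarding to `AmazonS3Client.doesObjectExist`.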
   
   Can you tell me if you have the same issue, and whether this is a bug or a configuration issue?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

