endzyme opened a new issue, #475:
URL: https://github.com/apache/solr-operator/issues/475
### Summary
I tried backing up to S3 from EKS using IAM role assumption via web identity
tokens, and the backup fails. With static AWS IAM credentials that carry the
same policy, the backup works. A `WARN` logged before each attempt may indicate
why web identity token role assumption is not functioning.
* * *
### Details
I am having trouble with the S3 backup and restore configuration of the Solr
Operator. I am running the Solr 8.11 image. The pods on our EKS cluster use the
appropriate Kubernetes Service Account, and that Service Account is annotated
with the IAM Role ARN.
A warning appears whenever I attempt a backup: the shard leader emits the
message below before every attempt:
```
WARN (OverseerThreadFactory-34-thread-2-processing-n:dev-8-blue-solrcloud-2.solr:80_solr) [c:test ] s.a.a.a.c.i.WebIdentityCredentialsUtils To use web identity tokens, the 'sts' service module must be on the class path.
```
When I configure the SolrCloud resource with static IAM credentials I can
perform the backup, but with the assumed role via web identity token I am
receiving a 403 from S3 (see error message below).
```
2022-09-15 15:17:21.941 WARN (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] s.a.a.a.c.i.WebIdentityCredentialsUtils To use web identity tokens, the 'sts' service module must be on the class path.
2022-09-15 15:17:24.447 ERROR (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] o.a.s.s.S3StorageClient An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null] [message=null]
2022-09-15 15:17:24.449 ERROR (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test operation: backup failed => org.apache.solr.s3.S3Exception: An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null] [message=null]
at
org.apache.solr.s3.S3StorageClient.handleAmazonException(S3StorageClient.java:598)
org.apache.solr.s3.S3Exception: An AmazonServiceException was thrown!
[serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null]
[message=null]
at
org.apache.solr.s3.S3StorageClient.handleAmazonException(S3StorageClient.java:598)
~[?:?]
at
org.apache.solr.s3.S3StorageClient.pathExists(S3StorageClient.java:314) ~[?:?]
at
org.apache.solr.s3.S3BackupRepository.exists(S3BackupRepository.java:200) ~[?:?]
at
org.apache.solr.cloud.api.collections.BackupCmd.createAndValidateBackupPath(BackupCmd.java:154)
~[?:?]
at
org.apache.solr.cloud.api.collections.BackupCmd.call(BackupCmd.java:94) ~[?:?]
at
org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:271)
~[?:?]
at
org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:524)
~[?:?]
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source) ~[?:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null
(Service: S3, Status Code: 403, Request ID: SNIP, Extended Request ID: SNIP)
at
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
~[?:?]
at
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:106)
~[?:?]
at
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:84)
~[?:?]
at
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:42)
~[?:?]
at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:95)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$6(BaseClientHandler.java:232)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:80)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
~[?:?]
at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
~[?:?]
at
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
~[?:?]
at
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
~[?:?]
at
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:167)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:82)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:175)
~[?:?]
at
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76)
~[?:?]
at
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
~[?:?]
at
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:56)
~[?:?]
at
software.amazon.awssdk.services.s3.DefaultS3Client.headObject(DefaultS3Client.java:5080)
~[?:?]
at
software.amazon.awssdk.services.s3.S3Client.headObject(S3Client.java:9886)
~[?:?]
at
org.apache.solr.s3.S3StorageClient.pathExists(S3StorageClient.java:309) ~[?:?]
... 9 more
```
Below are the things I've tried and observed:
* Tested the IAM Role itself for access to the target bucket.
* Confirmed that the mutating webhook is in fact modifying the SolrCloud
pods with the appropriate env vars and projected service account token volume
mounts.
* Confirmed that I can use those tokens to assume the role and reach the S3
bucket.
* Tested with "static" IAM credentials configured through
`solrcloud.spec.backupRepositories.s3.credentials.credentialsFileSecret` (as
shown by `kubectl explain`). The IAM user has the same policy as the IAM role,
and backups work with this configuration.
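As an illustration of the webhook check in the list above, here is a minimal Python sketch that could be run inside a Solr pod. It assumes the webhook injects `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`; the function name `web_identity_problems` is hypothetical and not part of Solr or the operator.

```python
import os
from typing import List, Mapping

# Assumption: these are the variables injected by the EKS pod identity webhook.
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def web_identity_problems(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_file:
        if not os.path.isfile(token_file):
            problems.append(f"token file {token_file} does not exist")
        elif os.path.getsize(token_file) == 0:
            problems.append(f"token file {token_file} is empty")
    return problems


if __name__ == "__main__":
    # Print each problem on its own line; prints nothing when all checks pass.
    for problem in web_identity_problems(os.environ):
        print(problem)
```

An empty result only shows that the environment is in place. It does not show that the Solr process can actually exchange the token for role credentials.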
The warning `the 'sts' service module must be on the class path` makes me
think that something else needs to be loaded into the Solr modules before this
will work. I have looked through the documentation, and everything seems to
indicate that using IAM Roles through Service Accounts on EKS is a supported
use case. The documentation also appears to indicate that I do not need to
specify anything extra under modules or plugins in the SolrCloud K8s resource,
because they are loaded automatically when an S3 backup repository is
configured.
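One way to test the class path suspicion is to look for a jar that contains the AWS SDK STS classes. The sketch below is only a rough check under assumptions: it treats any jar with entries under `software/amazon/awssdk/services/sts/` as providing the module, and `/opt/solr` is a hypothetical install root that may differ in a given image. The function name `jars_with_sts` is made up for this example.

```python
import os
import zipfile
from typing import List

# Assumption: the AWS SDK v2 STS module ships its classes under this package path.
STS_PACKAGE_PREFIX = "software/amazon/awssdk/services/sts/"


def jars_with_sts(root: str) -> List[str]:
    """Return sorted paths of .jar files under root that contain STS classes."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with zipfile.ZipFile(path) as jar:
                    if any(n.startswith(STS_PACKAGE_PREFIX) for n in jar.namelist()):
                        matches.append(path)
            except zipfile.BadZipFile:
                continue  # skip files that are not valid archives
    return sorted(matches)


if __name__ == "__main__":
    # "/opt/solr" is a hypothetical install root; adjust for the image in use.
    found = jars_with_sts("/opt/solr")
    print("\n".join(found) if found else "no jar containing STS classes found")
```

Finding a jar on disk does not prove that Solr loads it for the S3 repository, so a match narrows the search rather than settling it.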
Any help would be appreciated!