[ 
https://issues.apache.org/jira/browse/CASSSIDECAR-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18082334#comment-18082334
 ] 

Jon Haddad commented on CASSSIDECAR-415:
----------------------------------------

[CircleCI is 
green.|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra-sidecar/279]
[GitHub Actions CI has a 6.0-related 
failure|https://github.com/apache/cassandra-sidecar/actions/runs/26130702765], 
the same issue I identified earlier.

I've rebased and am running CircleCI again just in case. I'll merge when it's 
done.

Thanks Yifan!

> Support IAM instance profile credentials for S3 restore job downloads
> ---------------------------------------------------------------------
>
>                 Key: CASSSIDECAR-415
>                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-415
>             Project: Sidecar for Apache Cassandra
>          Issue Type: Improvement
>          Components: Bulk Analytics
>            Reporter: Jon Haddad
>            Assignee: Jon Haddad
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The restore job feature downloads SSTables from S3 using static AWS 
> credentials that the caller must supply via POST 
> /api/v1/\{keyspace}/\{table}/restore-jobs. The request body must include a 
> secrets object (RestoreJobSecrets) containing separate read and write 
> StorageCredentials, each requiring accessKeyId, secretAccessKey, 
> sessionToken, and region — all enforced as non-null in 
> StorageCredentials.java (lines 52–55) and RestoreJobSecrets.java (lines 
> 41–42).
> On job creation, RestoreJobDatabaseAccessor.create() (line 90) serializes the 
> secrets to JSON via Jackson and writes them as a raw blob to the blob_secrets 
> column of the restore_jobs table, defined in RestoreJobsSchema.java (line 
> 91). There is no encryption applied at the column, table, or application 
> level — the credentials are stored as plaintext JSON bytes. This leaks the 
> credentials to anyone with access to this table.
> Because multiple sidecar nodes process different slices of the same restore 
> job in parallel, each node reads the job back from Cassandra — including the 
> secrets — via
> RestoreJobDatabaseAccessor.find() (line 191), which deserializes them from 
> row.getBytes("blob_secrets") in {{RestoreJob.java}} (line 94). Each node then 
> passes the job to StorageClientPool.storageClient() (line 86), which extracts 
> the region from {{restoreJob.secrets.readCredentials().region()}} (line 88) 
> and calls StorageClient.authenticate(). Inside 
> {{StorageClient.Credentials.init()}} (lines 341–344), the credentials are 
> unconditionally converted to AwsSessionCredentials and wrapped in a 
> StaticCredentialsProvider, which is then injected into each S3 request via 
> overrideConfiguration(b -> b.credentialsProvider(...)) in both objectExists() 
> (line 145) and rangeGetObject() (line 237).
> This design contradicts AWS best practices. AWS explicitly recommends using 
> IAM roles over static credentials wherever possible. IAM roles — via EC2 
> instance profiles, ECS task roles, or EKS IRSA — eliminate the need to 
> create, distribute, store, rotate, or revoke long-lived credentials. The 
> current design forces users running in AWS to work against this guidance: 
> even if their nodes already have IAM-granted S3 access, they must still 
> obtain and manage static credentials to satisfy the mandatory 
> {{Objects.requireNonNull(secrets, ...)}} check in 
> {{CreateRestoreJobRequestPayload.java}} (line 101).
> Passing static credentials over the request and storing them in Cassandra 
> creates risk that IAM roles entirely avoid. 
> RestoreJobDatabaseAccessor.create() (line 90) writes the secrets as a plain 
> JSON blob into blob_secrets. The restore_jobs table schema 
> (RestoreJobsSchema.java line 91) has no encryption configuration — no 
> column-level encryption, no transparent data encryption, no application-level 
> crypto. The credentials sit as plaintext, replicated across every Cassandra 
> node holding that partition, and included in any Cassandra backups taken 
> during the job's lifetime.
> Credentials are also visible in logs on failure. StorageClient logs 
> credentials.readCredentials on S3 request failures in both 
> logCredentialOnRequestFailure() (line 298) and the failure mapper in
> rangeGetObject() (line 256). Although StorageCredentials.toString() redacts 
> the secret key and session token (line 94 of StorageCredentials.java), the 
> access key ID is logged in plaintext. This gives an adversary a known 
> identifier to search for and potentially pair with a leaked secret.
> *Proposed Solution*
> Make secrets optional throughout the restore job pipeline. When secrets are 
> absent, StorageClient should fall back to DefaultCredentialsProvider, which 
> implements the standard AWS credential chain: environment variables → system 
> properties → IAM instance profile → ECS task role → etc. This aligns the 
> sidecar with AWS best practices and allows operators running in
> AWS to use the credential model AWS recommends.
> StorageCredentials, RestoreJobSecrets, and CreateRestoreJobRequestPayload 
> need to permit null/absent credentials. The region must still be provided — 
> either inside the secrets object or as
> a new top-level field on the request — since it is required by 
> StorageClientPool.storageClient() (line 88) to construct the regional S3 
> endpoint.
> StorageClient.Credentials.init() (lines 339–345) should branch: use 
> StaticCredentialsProvider with AwsSessionCredentials when credentials are 
> present, use
> DefaultCredentialsProvider.create() when they are not. The 
> RestoreJobFatalException thrown when secrets are null (lines 331–334) should 
> be removed.
> RestoreJobDatabaseAccessor.create() (line 90) should skip writing 
> blob_secrets when secrets are null. RestoreJob.from() (line 94 of 
> RestoreJob.java) already handles a null blob_secrets
> column gracefully.
> API backward compatibility: Fully backward-compatible. Callers that currently 
> pass credentials continue to work unchanged.
> *Acceptance Criteria*
>  * secrets is optional in POST /api/v1/\{keyspace}/\{table}/restore-jobs; 
> existing clients with credentials continue to work unchanged
>  * When secrets is absent, StorageClient uses DefaultCredentialsProvider
>  * region is still required whether or not secrets are provided
>  * When using IAM mode, nothing is written to the blob_secrets column
>  * Integration test covering a restore job completing successfully without 
> explicit credentials
>  * Unit tests for StorageClient.Credentials covering both the static and IAM 
> credential paths
> *Key Files to Modify*
>  * client-common/.../common/data/StorageCredentials.java — make credential 
> fields optional
>  * client-common/.../common/data/RestoreJobSecrets.java — allow null 
> read/write credentials
>  * client-common/.../common/request/data/CreateRestoreJobRequestPayload.java 
> — remove null check on secrets; handle region when secrets are absent
>  * server/.../restore/StorageClient.java — branch on null credentials in 
> Credentials.init(); use DefaultCredentialsProvider as fallback; remove fatal 
> exception on null secrets
>  * server/.../restore/StorageClientPool.java — handle null secrets when 
> extracting region in storageClient()
>  * server/.../db/RestoreJob.java — handle null secrets throughout
>  * server/.../db/RestoreJobDatabaseAccessor.java — skip blob_secrets write 
> when secrets are null
>  * server/.../handlers/restore/CreateRestoreJobHandler.java — relax secrets 
> validation
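
The branch proposed for StorageClient.Credentials.init() in the description above, together with the rule that region stays mandatory, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Sidecar implementation: RestoreCredentialsSketch, Secrets, Provider, StaticProvider, DefaultChainProvider, selectProvider, and requireRegion are all hypothetical stand-ins so the example runs without the AWS SDK. In the real code the two branches would presumably return the SDK's StaticCredentialsProvider (wrapping AwsSessionCredentials) and DefaultCredentialsProvider.create().

```java
import java.util.Objects;

public class RestoreCredentialsSketch {

    /** Hypothetical stand-in for an AWS credentials provider. */
    public interface Provider {
        String describe();
    }

    /** Hypothetical stand-in for the read credentials carried by a restore job. */
    public static final class Secrets {
        final String accessKeyId;
        final String secretAccessKey;
        final String sessionToken;

        public Secrets(String accessKeyId, String secretAccessKey, String sessionToken) {
            // When secrets ARE supplied, every field is still required.
            this.accessKeyId = Objects.requireNonNull(accessKeyId, "accessKeyId");
            this.secretAccessKey = Objects.requireNonNull(secretAccessKey, "secretAccessKey");
            this.sessionToken = Objects.requireNonNull(sessionToken, "sessionToken");
        }
    }

    /** Stand-in for a provider built from caller-supplied static credentials. */
    public static final class StaticProvider implements Provider {
        final Secrets secrets;

        StaticProvider(Secrets secrets) {
            this.secrets = secrets;
        }

        @Override
        public String describe() {
            // Deliberately omits the access key ID so nothing sensitive is printed.
            return "static";
        }
    }

    /** Stand-in for the default credential chain (env vars, instance profile, ...). */
    public static final class DefaultChainProvider implements Provider {
        @Override
        public String describe() {
            return "default-chain";
        }
    }

    /** Absent secrets fall back to the default chain instead of failing the job. */
    public static Provider selectProvider(Secrets secrets) {
        return secrets == null ? new DefaultChainProvider() : new StaticProvider(secrets);
    }

    /** Region remains mandatory whether or not secrets are provided. */
    public static String requireRegion(String region) {
        if (region == null || region.trim().isEmpty()) {
            throw new IllegalArgumentException("region is required even when secrets are absent");
        }
        return region;
    }

    public static void main(String[] args) {
        // IAM mode: no secrets in the request, region supplied separately.
        System.out.println(requireRegion("us-west-2") + " " + selectProvider(null).describe());
        // Existing mode: caller supplies static session credentials.
        Secrets secrets = new Secrets("example-key-id", "example-secret", "example-token");
        System.out.println(requireRegion("us-west-2") + " " + selectProvider(secrets).describe());
    }
}
```

Keeping the selection in a single method means callers such as objectExists() and rangeGetObject() would not need to know which mode is active. Whether region arrives inside the secrets object or as a new top-level request field is left open here, as it is in the description.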



