kevinjqliu commented on code in PR #742:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/742#discussion_r3390715668


##########
src/azure/client.rs:
##########
@@ -58,6 +59,19 @@ static MS_CONTENT_ENCODING: HeaderName = 
HeaderName::from_static("x-ms-blob-cont
 static MS_CONTENT_LANGUAGE: HeaderName = 
HeaderName::from_static("x-ms-blob-content-language");
 
 static TAGS_HEADER: HeaderName = HeaderName::from_static("x-ms-tags");
+static ENCRYPTION_KEY_HEADER: HeaderName = 
HeaderName::from_static("x-ms-encryption-key");
+static ENCRYPTION_KEY_SHA256_HEADER: HeaderName =
+    HeaderName::from_static("x-ms-encryption-key-sha256");
+static ENCRYPTION_ALGORITHM_HEADER: HeaderName =
+    HeaderName::from_static("x-ms-encryption-algorithm");
+static SOURCE_ENCRYPTION_KEY_HEADER: HeaderName =
+    HeaderName::from_static("x-ms-source-encryption-key");
+static SOURCE_ENCRYPTION_KEY_SHA256_HEADER: HeaderName =
+    HeaderName::from_static("x-ms-source-encryption-key-sha256");
+static SOURCE_ENCRYPTION_ALGORITHM_HEADER: HeaderName =
+    HeaderName::from_static("x-ms-source-encryption-algorithm");
+// Put Blob From URL added source CPK headers in 2026-02-06.
+const PUT_BLOB_FROM_URL_SOURCE_CPK_VERSION: &str = "2026-02-06";

Review Comment:
   nit: took me a while to figure out the "source" headers and "2026-02-06"
   
   could we add a comment on why this is necessary? Its used for copy_request 
and 2026-02-06 is the first supported version. Its newer and than the base 
version and thus the requires the override
    
   https://learn.microsoft.com/en-us/rest/api/storageservices/version-2026-02-06



##########
src/azure/client.rs:
##########
@@ -181,6 +196,21 @@ impl AzureConfig {
         }
         url
     }
+
+    /// Whether a request built with this config must be treated as sensitive.
+    ///
+    /// This is true when the credential is carried in the URL (SAS tokens) or
+    /// when customer-provided encryption keys are configured, since CPK 
requests
+    /// carry secret key material. Combining both here means callers only need 
to
+    /// remember a single source of truth for request sensitivity.
+    fn is_sensitive(&self, credential: &Option<Arc<AzureCredential>>) -> bool {

Review Comment:
   nit: i noticed a few places still calculate `is_sensitive` using previous 
method. can we update all for uniformity? 
   In
   - get_user_delegation_key
   - list_request



##########
src/azure/client.rs:
##########
@@ -190,6 +220,97 @@ impl AzureConfig {
     }
 }
 
+/// Encryption headers for Azure requests.
+/// Azure only supports AES256 encryption with customer-provided keys.
+#[derive(Default, Clone)]
+pub(crate) struct AzureEncryptionHeaders {
+    pub encryption_key: Option<String>,
+    pub encryption_key_sha256: Option<String>,
+}
+
+impl std::fmt::Debug for AzureEncryptionHeaders {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("AzureEncryptionHeaders")
+            .field("key_configured", &self.encryption_key.is_some())
+            .finish()
+    }
+}
+
+impl AzureEncryptionHeaders {
+    pub(crate) fn try_new(encryption_key: Option<String>) -> Result<Self> {
+        let Some(encryption_key) = encryption_key else {
+            return Ok(Self::default());
+        };
+
+        let decoded_key = BASE64_STANDARD
+            .decode(encryption_key.as_bytes())
+            .map_err(|source| crate::Error::Generic {
+                store: STORE,
+                source: Box::new(source),
+            })?;
+
+        // As above encryption keys must be 256-bit AES keys,
+        // which means the base64-encoded value must decode to 32 bytes.
+        if decoded_key.len() != 32 {
+            return Err(crate::Error::Generic {
+                store: STORE,
+                source: format!(
+                    "Azure customer-provided encryption key must decode to 32 
bytes, got {}",
+                    decoded_key.len()
+                )
+                .into(),
+            });
+        }
+
+        let encryption_key_sha256 =
+            BASE64_STANDARD.encode(digest::digest(&digest::SHA256, 
&decoded_key));
+
+        Ok(Self {
+            encryption_key: Some(encryption_key),
+            encryption_key_sha256: Some(encryption_key_sha256),
+        })
+    }
+
+    fn is_enabled(&self) -> bool {
+        self.encryption_key.is_some()
+    }
+}
+
+/// Request-builder extension for customer-provided encryption keys (CPK).
+///
+/// Mirrors [`CredentialExt`] so that CPK headers are applied as part of the
+/// fluent request-construction chain, keeping the secret key material close to
+/// the rest of the request building rather than as a detached side-call.
+pub(crate) trait EncryptionHeadersExt {
+    /// Apply the customer-provided encryption headers for the request target.
+    fn with_azure_encryption_headers(self, headers: &AzureEncryptionHeaders) 
-> Self;
+
+    /// Apply the customer-provided encryption headers for a copy *source*.

Review Comment:
   ```suggestion
       /// Apply the customer-provided encryption headers for a copy *source*.
       /// 
<https://learn.microsoft.com/en-us/rest/api/storageservices/put-block-from-url?tabs=microsoft-entra-id#request-headers-source-customer-provided-encryption-keys>
   ```



##########
src/azure/client.rs:
##########


Review Comment:
   ```suggestion
           let sensitive = self.config.is_sensitive(&credential);
           let batch_response = self
   ```



##########
src/azure/client.rs:
##########


Review Comment:
   ```suggestion
               .retryable(&self.config.retry_config)
               .sensitive(sensitive)
               .send()
   ```
   propagate `sensitive` on retry



##########
src/azure/client.rs:
##########
@@ -190,6 +197,57 @@ impl AzureConfig {
     }
 }
 
+#[derive(Default, Clone, Debug)]
+pub(crate) struct AzureEncryptionHeaders {
+    pub encryption_key: Option<String>,
+    pub encryption_key_sha256: Option<String>,
+}
+
+impl AzureEncryptionHeaders {
+    pub(crate) fn try_new(encryption_key: Option<String>) -> Result<Self> {
+        let Some(encryption_key) = encryption_key else {
+            return Ok(Self::default());
+        };
+
+        let decoded_key = BASE64_STANDARD
+            .decode(encryption_key.as_bytes())
+            .map_err(|source| crate::Error::Generic {
+                store: STORE,
+                source: Box::new(source),
+            })?;
+
+        if decoded_key.len() != 32 {
+            return Err(crate::Error::Generic {
+                store: STORE,
+                source: format!(
+                    "Azure customer-provided encryption key must decode to 32 
bytes, got {}",
+                    decoded_key.len()
+                )
+                .into(),
+            });
+        }
+
+        let encryption_key_sha256 =
+            BASE64_STANDARD.encode(digest::digest(&digest::SHA256, 
&decoded_key));
+
+        Ok(Self {
+            encryption_key: Some(encryption_key),
+            encryption_key_sha256: Some(encryption_key_sha256),
+        })
+    }
+
+    fn apply(&self, builder: HttpRequestBuilder) -> HttpRequestBuilder {
+        match (&self.encryption_key, &self.encryption_key_sha256) {
+            (Some(encryption_key), Some(encryption_key_sha256)) => builder
+                .header(&ENCRYPTION_KEY_HEADER, encryption_key)
+                .header(&ENCRYPTION_KEY_SHA256_HEADER, encryption_key_sha256)
+                // Azure only supports this algorithm header
+                .header(&ENCRYPTION_ALGORITHM_HEADER, "AES256"),

Review Comment:
   nit: i still see the string here :P 



##########
src/azure/client.rs:
##########
@@ -190,6 +220,97 @@ impl AzureConfig {
     }
 }
 
+/// Encryption headers for Azure requests.
+/// Azure only supports AES256 encryption with customer-provided keys.
+#[derive(Default, Clone)]
+pub(crate) struct AzureEncryptionHeaders {
+    pub encryption_key: Option<String>,
+    pub encryption_key_sha256: Option<String>,
+}
+
+impl std::fmt::Debug for AzureEncryptionHeaders {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("AzureEncryptionHeaders")
+            .field("key_configured", &self.encryption_key.is_some())
+            .finish()
+    }
+}
+
+impl AzureEncryptionHeaders {
+    pub(crate) fn try_new(encryption_key: Option<String>) -> Result<Self> {
+        let Some(encryption_key) = encryption_key else {
+            return Ok(Self::default());
+        };
+
+        let decoded_key = BASE64_STANDARD
+            .decode(encryption_key.as_bytes())
+            .map_err(|source| crate::Error::Generic {
+                store: STORE,
+                source: Box::new(source),
+            })?;
+
+        // As above encryption keys must be 256-bit AES keys,
+        // which means the base64-encoded value must decode to 32 bytes.
+        if decoded_key.len() != 32 {
+            return Err(crate::Error::Generic {
+                store: STORE,
+                source: format!(
+                    "Azure customer-provided encryption key must decode to 32 
bytes, got {}",
+                    decoded_key.len()
+                )
+                .into(),
+            });
+        }
+
+        let encryption_key_sha256 =
+            BASE64_STANDARD.encode(digest::digest(&digest::SHA256, 
&decoded_key));
+
+        Ok(Self {
+            encryption_key: Some(encryption_key),
+            encryption_key_sha256: Some(encryption_key_sha256),
+        })
+    }
+
+    fn is_enabled(&self) -> bool {
+        self.encryption_key.is_some()
+    }
+}
+
+/// Request-builder extension for customer-provided encryption keys (CPK).
+///
+/// Mirrors [`CredentialExt`] so that CPK headers are applied as part of the
+/// fluent request-construction chain, keeping the secret key material close to
+/// the rest of the request building rather than as a detached side-call.

Review Comment:
   ```suggestion
   ```
   nit: i feel like we dont need this explanation 



##########
src/azure/client.rs:
##########
@@ -829,13 +958,11 @@ impl AzureClient {
     pub(crate) async fn get_blob_tagging(&self, path: &Path) -> 
Result<HttpResponse> {
         let credential = self.get_credential().await?;
         let url = self.config.path_url(path);
-        let sensitive = credential
-            .as_deref()
-            .map(|c| c.sensitive_request())
-            .unwrap_or_default();
+        let sensitive = self.config.is_sensitive(&credential);
         let response = self
             .client
             .get(url.as_str())
+            .with_azure_encryption_headers(&self.config.encryption_headers)

Review Comment:
   ```suggestion
   ```
   
   get blob tags also does not require CPK, so lets remove setting header
   
   
https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys#blob-storage-operations-supporting-customer-provided-keys



##########
src/azure/client.rs:
##########
@@ -190,6 +220,97 @@ impl AzureConfig {
     }
 }
 
+/// Encryption headers for Azure requests.
+/// Azure only supports AES256 encryption with customer-provided keys.
+#[derive(Default, Clone)]
+pub(crate) struct AzureEncryptionHeaders {
+    pub encryption_key: Option<String>,
+    pub encryption_key_sha256: Option<String>,
+}
+
+impl std::fmt::Debug for AzureEncryptionHeaders {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("AzureEncryptionHeaders")
+            .field("key_configured", &self.encryption_key.is_some())
+            .finish()
+    }
+}
+
+impl AzureEncryptionHeaders {
+    pub(crate) fn try_new(encryption_key: Option<String>) -> Result<Self> {
+        let Some(encryption_key) = encryption_key else {
+            return Ok(Self::default());
+        };
+
+        let decoded_key = BASE64_STANDARD
+            .decode(encryption_key.as_bytes())
+            .map_err(|source| crate::Error::Generic {
+                store: STORE,
+                source: Box::new(source),
+            })?;
+
+        // As above encryption keys must be 256-bit AES keys,
+        // which means the base64-encoded value must decode to 32 bytes.
+        if decoded_key.len() != 32 {
+            return Err(crate::Error::Generic {
+                store: STORE,
+                source: format!(
+                    "Azure customer-provided encryption key must decode to 32 
bytes, got {}",
+                    decoded_key.len()
+                )
+                .into(),
+            });
+        }
+
+        let encryption_key_sha256 =
+            BASE64_STANDARD.encode(digest::digest(&digest::SHA256, 
&decoded_key));
+
+        Ok(Self {
+            encryption_key: Some(encryption_key),
+            encryption_key_sha256: Some(encryption_key_sha256),
+        })
+    }
+
+    fn is_enabled(&self) -> bool {
+        self.encryption_key.is_some()
+    }
+}
+
+/// Request-builder extension for customer-provided encryption keys (CPK).
+///
+/// Mirrors [`CredentialExt`] so that CPK headers are applied as part of the
+/// fluent request-construction chain, keeping the secret key material close to
+/// the rest of the request building rather than as a detached side-call.
+pub(crate) trait EncryptionHeadersExt {
+    /// Apply the customer-provided encryption headers for the request target.

Review Comment:
   ```suggestion
       /// Apply the customer-provided encryption headers for the request 
target.
       /// 
<https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys>
   ```



##########
src/azure/credential.rs:
##########
@@ -278,6 +282,12 @@ pub(crate) trait CredentialExt {
         credential: &Option<impl Deref<Target = AzureCredential>>,
         account: &str,
     ) -> Self;
+
+    /// Override the Azure Blob service version used for a single request.
+    ///
+    /// [`with_azure_authorization`](Self::with_azure_authorization) preserves 
an
+    /// explicit version header instead of replacing it with the backend 
default.
+    fn with_azure_version(self, version: &'static str) -> Self;

Review Comment:
   nit: agent suggested pulling this out into its own trait since its not tied 
to authorization
   
   ```
   /// Override the Azure Blob service version used for a single request.
   ///
   /// [`with_azure_authorization`](CredentialExt::with_azure_authorization) 
preserves
   /// an explicit version header instead of replacing it with the backend 
default.
   pub(crate) trait RequestVersionExt {
       fn with_azure_version(self, version: &'static str) -> Self;
   }
   ```
   
   ```
   impl RequestVersionExt for HttpRequestBuilder {
       fn with_azure_version(self, version: &'static str) -> Self {
           self.header(&VERSION, version)
       }
   }
   ```



##########
src/azure/builder.rs:
##########
@@ -898,6 +916,12 @@ impl MicrosoftAzureBuilder {
         self
     }
 
+    /// Set the base64-encoded 256-bit customer-provided encryption key.

Review Comment:
   ```suggestion
       /// Set the customer-provided encryption key (CPK) used to encrypt blob 
content.
       ///
       /// `key` must be a base64-encoded 256-bit AES key (the decoded value 
must be
       /// exactly 32 bytes). The same key must be supplied on every subsequent 
read,
       /// write, or copy of any blob created with it; if the key is lost or 
omitted
       /// the data is unrecoverable. CPK material is sent to Azure on every 
request,
       /// so the configured endpoint must use HTTPS.
       ///
       /// Only a subset of Blob storage operations support CPK
       /// (see the [Azure documentation][cpk-ops]). When CPK is enabled, `copy`
       /// switches from the asynchronous `Copy Blob` API to `Put Blob From 
URL`,
       /// which is synchronous and limits the source blob to 5,000 MiB.
       ///
       /// [cpk-ops]: 
https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys#blob-storage-operations-supporting-customer-provided-keys
   ```



##########
src/azure/client.rs:
##########
@@ -181,6 +196,21 @@ impl AzureConfig {
         }
         url
     }
+
+    /// Whether a request built with this config must be treated as sensitive.
+    ///
+    /// This is true when the credential is carried in the URL (SAS tokens) or
+    /// when customer-provided encryption keys are configured, since CPK 
requests
+    /// carry secret key material. Combining both here means callers only need 
to
+    /// remember a single source of truth for request sensitivity.

Review Comment:
   ```suggestion
       /// The retry layer's `sensitive` flag suppresses the request URL from
       /// error messages (see [`RetryableRequestBuilder::sensitive`]). For SAS
       /// credentials this is load-bearing because the token is carried as URL
       /// query parameters.
       ///
       /// CPK material lives in request *headers* (`x-ms-encryption-key` etc.),
       /// not in the URL, so today's URL-only redaction does not actively hide
       /// it. The flag is still set for CPK requests so that any future
       /// expansion of the redaction surface (headers, response bodies) covers
       /// CPK without further changes here, and so that operators have a single
       /// "this request touches secret material" signal for both auth modes.
       ///
       /// [`RetryableRequestBuilder::sensitive`]: 
crate::client::retry::RetryableRequestBuilder
   ```



##########
src/azure/client.rs:
##########
@@ -190,6 +220,97 @@ impl AzureConfig {
     }
 }
 
+/// Encryption headers for Azure requests.
+/// Azure only supports AES256 encryption with customer-provided keys.
+#[derive(Default, Clone)]
+pub(crate) struct AzureEncryptionHeaders {
+    pub encryption_key: Option<String>,
+    pub encryption_key_sha256: Option<String>,
+}
+
+impl std::fmt::Debug for AzureEncryptionHeaders {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("AzureEncryptionHeaders")
+            .field("key_configured", &self.encryption_key.is_some())
+            .finish()
+    }
+}
+
+impl AzureEncryptionHeaders {
+    pub(crate) fn try_new(encryption_key: Option<String>) -> Result<Self> {
+        let Some(encryption_key) = encryption_key else {
+            return Ok(Self::default());
+        };
+
+        let decoded_key = BASE64_STANDARD
+            .decode(encryption_key.as_bytes())
+            .map_err(|source| crate::Error::Generic {
+                store: STORE,
+                source: Box::new(source),
+            })?;
+
+        // As above encryption keys must be 256-bit AES keys,
+        // which means the base64-encoded value must decode to 32 bytes.
+        if decoded_key.len() != 32 {
+            return Err(crate::Error::Generic {
+                store: STORE,
+                source: format!(
+                    "Azure customer-provided encryption key must decode to 32 
bytes, got {}",
+                    decoded_key.len()
+                )
+                .into(),
+            });
+        }
+
+        let encryption_key_sha256 =
+            BASE64_STANDARD.encode(digest::digest(&digest::SHA256, 
&decoded_key));
+
+        Ok(Self {
+            encryption_key: Some(encryption_key),
+            encryption_key_sha256: Some(encryption_key_sha256),
+        })
+    }
+
+    fn is_enabled(&self) -> bool {
+        self.encryption_key.is_some()
+    }
+}
+
+/// Request-builder extension for customer-provided encryption keys (CPK).
+///
+/// Mirrors [`CredentialExt`] so that CPK headers are applied as part of the
+/// fluent request-construction chain, keeping the secret key material close to
+/// the rest of the request building rather than as a detached side-call.
+pub(crate) trait EncryptionHeadersExt {
+    /// Apply the customer-provided encryption headers for the request target.
+    fn with_azure_encryption_headers(self, headers: &AzureEncryptionHeaders) 
-> Self;
+
+    /// Apply the customer-provided encryption headers for a copy *source*.

Review Comment:
   Could we add a short comment explaining the difference between 
`with_azure_encryption_headers` and `with_azure_source_encryption_headers`?
   
   For copy with CPK, the `x-ms-encryption-*` headers apply to the destination 
blob, while `x-ms-source-encryption-*` applies to decrypting the source blob. 
Since both use the same configured key here, it is easy to miss why both sets 
are required.



##########
src/azure/client.rs:
##########
@@ -705,7 +825,11 @@ impl AzureClient {
         Ok(results)
     }
 
-    /// Make an Azure Copy request 
<https://docs.microsoft.com/en-us/rest/api/storageservices/copy-blob>
+    /// Make an Azure copy request 
<https://docs.microsoft.com/en-us/rest/api/storageservices/copy-blob>.
+    ///
+    /// When customer-provided keys are enabled, this request uses the Put Blob
+    /// From URL header shape and must opt into the service version that added
+    /// source CPK headers.

Review Comment:
   ```suggestion
       /// Make an Azure copy request 
<https://docs.microsoft.com/en-us/rest/api/storageservices/copy-blob>.
       ///
       /// The classic `Copy Blob` API does not accept CPK headers, so when
       /// customer-provided keys are enabled this falls back to
       /// [Put Blob From URL][put-blob-from-url] and opts into the service 
version
       /// that added source CPK headers. That changes the semantics: the 
operation
       /// is synchronous, the source must be a block blob no larger than 5,000 
MiB,
       /// and uncommitted blocks / the source block list are not preserved.
       ///
       /// [put-blob-from-url]: 
https://learn.microsoft.com/en-us/rest/api/storageservices/put-blob-from-url
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to