github-actions[bot] commented on code in PR #61102:
URL: https://github.com/apache/doris/pull/61102#discussion_r2894065484


##########
be/src/util/s3_util.cpp:
##########
@@ -327,14 +356,25 @@ std::shared_ptr<io::ObjStorageClient> 
S3ClientFactory::_create_azure_client(
     
options.Retry.StatusCodes.insert(Azure::Core::Http::HttpStatusCode::TooManyRequests);
     options.Retry.MaxRetries = config::max_s3_client_retry;
     
options.PerRetryPolicies.emplace_back(std::make_unique<AzureRetryRecordPolicy>());
+    if (_ca_cert_file_path.empty()) {
+        _ca_cert_file_path = 
get_valid_ca_cert_path(doris::split(config::ca_cert_file_paths, ";"));

Review Comment:
   Note (pre-existing issue, not introduced by this PR): This read-then-write 
of `_ca_cert_file_path` is executed **outside `_lock`** (as 
`_create_azure_client` is called after the cache-miss lock is released in 
`create()`). If two threads concurrently create Azure clients and both miss the 
cache, they will race on `_ca_cert_file_path`. This is technically UB under the 
C++ memory model, though practically benign since both would write the same 
value. This pre-existing pattern also exists in `_create_s3_client` and 
`_get_aws_credentials_provider_v1/v2`. Consider moving the lazy-init of 
`_ca_cert_file_path` into the constructor (which already does 
`_ca_cert_file_path = get_valid_ca_cert_path(...)` at line 240) and removing 
the lazy fallback, or protecting it with `std::call_once`.



##########
be/src/io/fs/azure_obj_storage_client.cpp:
##########
@@ -93,31 +102,55 @@ namespace doris::io {
 // > Each batch request supports a maximum of 256 subrequests.
 constexpr size_t BlobBatchMaxOperations = 256;
 
+bool is_azure_tls_ca_error_message(std::string_view message) {
+    std::string lower = to_lower_ascii(message);
+    return lower.find("ssl ca cert") != std::string::npos ||
+           lower.find("peer failed verification") != std::string::npos ||

Review Comment:
   Nit: The patterns `"ssl ca cert"` and `"problem with the ssl ca cert"` 
overlap — any message matching the latter will also match the former. Consider 
removing the redundant `"ssl ca cert"` check, or keeping it as a broader 
catch-all and documenting the intent.
   
   Also, these patterns are curl-specific error strings. A brief comment 
listing the curl error codes that produce these messages (e.g., 
`CURLE_SSL_CACERT`, `CURLE_PEER_FAILED_VERIFICATION`) would help future 
maintainers understand the provenance and know when to update this list.



##########
be/src/io/fs/azure_obj_storage_client.cpp:
##########
@@ -54,6 +56,13 @@ std::string wrap_object_storage_path_msg(const 
doris::io::ObjectStoragePathOptio
                        opts.path.native());
 }
 
+std::string to_lower_ascii(std::string_view input) {
+    std::string lowered(input);

Review Comment:
   Minor: `to_lower_ascii` is defined in an anonymous namespace but called from 
`is_azure_tls_ca_error_message` in the `doris::io` namespace. This works fine 
(anonymous namespace has internal linkage accessible within the TU), but since 
you already added `<cctype>` — consider using a lambda inside 
`is_azure_tls_ca_error_message` directly instead of a separate helper function, 
to keep the scope tighter. Alternatively, the function is fine as-is if you 
anticipate reuse.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to