github-actions[bot] commented on code in PR #60921: URL: https://github.com/apache/doris/pull/60921#discussion_r2876970755
##########
fe/fe-core/src/main/java/org/apache/doris/common/util/InternalHttpsUtils.java:
##########
@@ -0,0 +1,105 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.common.util;
+
+import org.apache.doris.common.Config;
+
+import org.apache.http.conn.ssl.NoopHostnameVerifier;
+import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.impl.client.HttpClients;
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+
+import java.io.InputStream;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.security.KeyStore;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.TrustManagerFactory;
+
+/**
+ * SSL-aware HTTP clients for internal FE communication using MySQL SSL truststore.
+ *
+ * Security Model:
+ * - Validates certificates against configured CA truststore (mysql_ssl_default_ca_certificate)
+ * - Hostname verification is DISABLED to support IP-based FE communication
+ * - This is safe for internal cluster communication because:
+ *   1. All endpoints enforce checkFromValidFe() - only registered FE nodes can connect
+ *   2. FE cluster is assumed to be on trusted network
+ *   3. Traffic is encrypted and authenticated via certificate validation
+ *
+ * This approach is similar to other distributed systems (Kafka, Elasticsearch, Cassandra)
+ * where inter-node SSL communication disables hostname verification for operational flexibility.
+ */
+public class InternalHttpsUtils {
+    private static final Logger LOG = LogManager.getLogger(InternalHttpsUtils.class);
+
+    public static CloseableHttpClient createValidatedHttpClient() {
+        try {
+            KeyStore trustStore = KeyStore.getInstance(Config.ssl_trust_store_type);
+            try (InputStream stream = Files.newInputStream(
+                    Paths.get(Config.mysql_ssl_default_ca_certificate))) {
+                trustStore.load(stream, Config.mysql_ssl_default_ca_certificate_password.toCharArray());
+            }
+
+            TrustManagerFactory tmf = TrustManagerFactory.getInstance(
+                    TrustManagerFactory.getDefaultAlgorithm());
+            tmf.init(trustStore);
+
+            SSLContext sslContext = SSLContext.getInstance("TLS");
+            sslContext.init(null, tmf.getTrustManagers(), null);
+
+            SSLConnectionSocketFactory sslFactory = new SSLConnectionSocketFactory(
+                    sslContext,
+                    NoopHostnameVerifier.INSTANCE);
+
+            return HttpClients.custom()
+                    .setSSLSocketFactory(sslFactory)
+                    .build();
+        } catch (Exception e) {
+            LOG.error("Failed to create SSL-aware HTTP client using truststore: {}",
+                    Config.mysql_ssl_default_ca_certificate, e);
+            throw new RuntimeException("Failed to create SSL-aware HTTP client", e);
+        }
+    }
+
+    public static void installTrustManagerForUrlConnection() {
+        try {
+            KeyStore trustStore = KeyStore.getInstance(Config.ssl_trust_store_type);
+            try (InputStream stream = Files.newInputStream(
+                    Paths.get(Config.mysql_ssl_default_ca_certificate))) {
+                trustStore.load(stream, Config.mysql_ssl_default_ca_certificate_password.toCharArray());
+            }
+
+            TrustManagerFactory tmf = TrustManagerFactory.getInstance(
+                    TrustManagerFactory.getDefaultAlgorithm());
+            tmf.init(trustStore);
+
+            SSLContext sslContext = SSLContext.getInstance("TLS");
+            sslContext.init(null, tmf.getTrustManagers(), null);
+
+            javax.net.ssl.HttpsURLConnection.setDefaultSSLSocketFactory(sslContext.getSocketFactory());
+            javax.net.ssl.HttpsURLConnection.setDefaultHostnameVerifier((hostname, session) -> true);

Review Comment:
**Critical: JVM-wide global state mutation**

`setDefaultSSLSocketFactory()` and `setDefaultHostnameVerifier()` modify **JVM-wide defaults** that affect ALL `HttpsURLConnection` instances in the entire process, not just the FE-to-FE connection being created. This is problematic because:

1. **Thread safety**: Multiple threads calling this concurrently could race. There's no synchronization.
2. **Global hostname verification bypass**: `(hostname, session) -> true` disables hostname verification for ALL future `HttpsURLConnection` instances, including any external catalog connections or other HTTPS clients using `HttpsURLConnection`.
3. **Redundant re-initialization**: This is called on every `getConnectionWithNodeIdent()` invocation, re-reading the truststore from disk and re-initializing the `SSLContext` each time.

Recommendation: Initialize the `SSLContext` and install the defaults **once** at startup (e.g., via a static initializer or an `init()` method called during FE bootstrap). Alternatively, set the SSL factory on the individual `HttpsURLConnection` instance rather than the global default:

```java
HttpsURLConnection httpsConn = (HttpsURLConnection) url.openConnection();
httpsConn.setSSLSocketFactory(sslContext.getSocketFactory());
httpsConn.setHostnameVerifier(...);
```

This avoids polluting the JVM-wide defaults.
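
For illustration only, here is a minimal sketch of the per-connection alternative combined with a cached `SSLContext`. The class `PerConnectionSsl` and its methods `load`, `getOrLoad`, and `open` are hypothetical names, not part of this PR or of Doris. The truststore path, type, and password are passed as plain parameters here as an assumption; in the PR they would presumably come from `Config`. The permissive hostname verifier mirrors the PR's choice and is scoped to the single connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/** Hypothetical helper (not in the PR): per-connection TLS setup with a cached SSLContext. */
public final class PerConnectionSsl {
    // Cached after the first successful load; volatile for safe publication across threads.
    private static volatile SSLContext cached;

    private PerConnectionSsl() {
    }

    /** Builds an SSLContext that trusts only the certificates in the given truststore file. */
    public static SSLContext load(String path, String type, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore trustStore = KeyStore.getInstance(type);
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            trustStore.load(in, password);
        }
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx;
    }

    /** Returns the cached context, reading the truststore from disk at most once. */
    public static SSLContext getOrLoad(String path, String type, char[] password)
            throws IOException, GeneralSecurityException {
        SSLContext ctx = cached;
        if (ctx == null) {
            synchronized (PerConnectionSsl.class) {
                ctx = cached;
                if (ctx == null) {
                    ctx = load(path, type, password);
                    cached = ctx;
                }
            }
        }
        return ctx;
    }

    /** Opens a connection and applies TLS settings to it alone; JVM-wide defaults are not touched. */
    public static HttpURLConnection open(URL url, SSLContext ctx) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        if (conn instanceof HttpsURLConnection) {
            HttpsURLConnection https = (HttpsURLConnection) conn;
            https.setSSLSocketFactory(ctx.getSocketFactory());
            // Same permissive verifier as in the PR, but limited to this one connection.
            https.setHostnameVerifier((hostname, session) -> true);
        }
        return conn;
    }
}
```

With a helper along these lines, `getConnectionWithNodeIdent()` could call something like `PerConnectionSsl.open(url, PerConnectionSsl.getOrLoad(...))` instead of installing global defaults, so other `HttpsURLConnection` users in the process would keep standard hostname verification.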
##########
fe/fe-core/src/main/java/org/apache/doris/master/MetaHelper.java:
##########
@@ -39,7 +39,6 @@
 import java.io.InputStreamReader;
 import java.io.OutputStream;
 import java.net.HttpURLConnection;
-import java.util.Map;
 
 public class MetaHelper {
     public static final Logger LOG = LogManager.getLogger(MetaHelper.class);

Review Comment:
**Removed useful debug logging**

The `LOG.info("meta helper, url: {}, timeout{}, headers: {}", url, timeout, headers)` was removed. While the response wrapping in try-catch is a good improvement, the debug logging for URL, timeout, and headers is valuable for troubleshooting connectivity issues (especially with this HTTPS change). Consider keeping it, perhaps at `LOG.debug` level.

##########
fe/fe-core/src/main/java/org/apache/doris/httpv2/rest/manager/HttpUtils.java:
##########
@@ -130,7 +131,14 @@ public static CloseableHttpClient getHttpClient() {
     }
     private static String executeRequest(HttpRequestBase request) throws IOException {
-        CloseableHttpClient client = getHttpClient();
+        CloseableHttpClient client;

Review Comment:
**Bug: Resource leak, `CloseableHttpClient` is never closed**

The `CloseableHttpClient` created here (both the HTTPS and plain HTTP paths) is never closed. The `ResponseHandler` variant of `execute()` closes the HTTP *response*, but NOT the *client* itself. This leaks the client's internal `PoolingHttpClientConnectionManager` and associated socket file descriptors.

Other callers in the codebase do this correctly (e.g., `MysqlLoadManager.java` uses `try-with-resources`).

Fix:

```java
private static String executeRequest(HttpRequestBase request) throws IOException {
    try (CloseableHttpClient client = request.getURI().getScheme().equalsIgnoreCase("https") && Config.enable_https
            ? InternalHttpsUtils.createValidatedHttpClient()
            : HttpClientBuilder.create().build()) {
        return client.execute(request, httpResponse -> EntityUtils.toString(httpResponse.getEntity()));
    }
}
```

Note: this resource leak also existed before this PR (the old code called `getHttpClient()` without closing it), but since this PR rewrites the method, it's a good opportunity to fix it.

##########
fe/fe-core/src/main/java/org/apache/doris/common/util/InternalHttpsUtils.java:
##########
@@ -0,0 +1,105 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.common.util;
+
+import org.apache.doris.common.Config;
+
+import org.apache.http.conn.ssl.NoopHostnameVerifier;
+import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
+import org.apache.http.impl.client.CloseableHttpClient;
+import org.apache.http.impl.client.HttpClients;
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+
+import java.io.InputStream;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.security.KeyStore;
+import javax.net.ssl.SSLContext;
+import javax.net.ssl.TrustManagerFactory;
+
+/**
+ * SSL-aware HTTP clients for internal FE communication using MySQL SSL truststore.
+ *
+ * Security Model:
+ * - Validates certificates against configured CA truststore (mysql_ssl_default_ca_certificate)
+ * - Hostname verification is DISABLED to support IP-based FE communication
+ * - This is safe for internal cluster communication because:
+ *   1. All endpoints enforce checkFromValidFe() - only registered FE nodes can connect
+ *   2. FE cluster is assumed to be on trusted network
+ *   3. Traffic is encrypted and authenticated via certificate validation
+ *
+ * This approach is similar to other distributed systems (Kafka, Elasticsearch, Cassandra)
+ * where inter-node SSL communication disables hostname verification for operational flexibility.
+ */
+public class InternalHttpsUtils {
+    private static final Logger LOG = LogManager.getLogger(InternalHttpsUtils.class);
+
+    public static CloseableHttpClient createValidatedHttpClient() {
+        try {

Review Comment:
**Performance: Truststore re-read on every call**

Both `createValidatedHttpClient()` and `installTrustManagerForUrlConnection()` re-read the truststore file from disk and re-initialize the `SSLContext` on every invocation. For internal FE-to-FE calls (checkpoint sync, journal ID queries, etc.), this could be called frequently.

Consider caching the `SSLContext` in a static field, initialized lazily or at startup. The truststore file doesn't change at runtime.

##########
fe/fe-core/src/main/java/org/apache/doris/common/util/HttpURLUtil.java:
##########
@@ -34,9 +35,12 @@ public static HttpURLConnection getConnectionWithNodeIdent(String request) throw
         try {
             SecurityChecker.getInstance().startSSRFChecking(request);
             URL url = new URL(request);
+
+            if (url.getProtocol().equalsIgnoreCase("https") && Config.enable_https) {
+                InternalHttpsUtils.installTrustManagerForUrlConnection();
+            }
+
             HttpURLConnection conn = (HttpURLConnection) url.openConnection();
-            // Must use Env.getServingEnv() instead of getCurrentEnv(),
-            // because here we need to obtain selfNode through the official service catalog.
             HostInfo selfNode = Env.getServingEnv().getSelfNode();
             conn.setRequestProperty(Env.CLIENT_NODE_HOST_KEY, selfNode.getHost());
             conn.setRequestProperty(Env.CLIENT_NODE_PORT_KEY, selfNode.getPort() + "");

Review Comment:
**Deleted useful comments**

The comments `// Must use Env.getServingEnv() instead of getCurrentEnv(), // because here we need to obtain selfNode through the official service catalog.` were removed here and in `getNodeIdentHeaders()`. These documented a non-obvious architectural requirement. Please preserve them: they explain *why* `getServingEnv()` is used instead of the more common `getCurrentEnv()`.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
