neils-dev commented on a change in pull request #2945:
URL: https://github.com/apache/ozone/pull/2945#discussion_r827488331



##########
File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/GrpcOmTransport.java
##########
@@ -93,6 +104,22 @@ public void start() {
             .usePlaintext()
             .maxInboundMessageSize(maxSize);
 
+    if (secConfig.isGrpcTlsEnabled()) {
+      try {
+        SslContextBuilder sslContextBuilder = GrpcSslContexts.forClient();
+        if (caCerts != null) {
+          sslContextBuilder.trustManager(caCerts);
+        } else {
+          LOG.error("x509Certicates empty");

Review comment:
       Errors on s3g client creation appear the same to the user I'm afraid.  
The s3 gateway uses jax context and dependency injections that fail if the s3 
gateway fails on initialization.  That is what happens when the s3g TLS channel 
is, for any given reason, unable to be created when _ozone.security.enabled_ 
and _hdds.grpc.tls.enabled_ are set.  The error in s3g client creation is 
propagated up through the jax injection failure which results in retries and 
the 500 http erro response.
   
    > would it not be better to fail the client request here itself instead of 
when it reaches the max retries?
   
   In this case all errors that occur during the TLS / Grpc channel creation 
appear the same to the client with a 500 http response and max retires.  So if 
we fail it another way, it will still propagate back the same.  We do provide 
additional valuable information on the error through the log.  Here we have the 
additional information that we log indicating that the `"x509Certificates 
empty"`.

##########
File path: 
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/OzoneClientCache.java
##########
@@ -75,8 +99,66 @@ public static OzoneClient 
getOzoneClientInstance(OzoneConfiguration
     return instance.client;
   }
 
+  public static void closeClient() throws IOException {
+    if (instance != null) {
+      instance.client.close();
+      instance = null;
+    }
+  }
+
+  private void setCertificate(String omServiceID,
+                              OzoneConfiguration conf)
+      throws IOException {
+
+    // create local copy of config incase exception occurs
+    // with certificate OmRequest
+    OzoneConfiguration config = new OzoneConfiguration(conf);
+    OzoneClient certClient;
+
+    if (secConfig.isGrpcTlsEnabled()) {
+      // set OmTransport to hadoop rpc to securely,
+      // get certificates with service list request
+      config.set(OZONE_OM_TRANSPORT_CLASS,
+          OZONE_OM_TRANSPORT_CLASS_DEFAULT);
+
+      if (omServiceID == null) {
+        certClient = OzoneClientFactory.getRpcClient(config);
+      } else {
+        // As in HA case, we need to pass om service ID.
+        certClient = OzoneClientFactory.getRpcClient(omServiceID,
+            config);
+      }
+      try {
+        ServiceInfoEx serviceInfoEx = certClient
+            .getObjectStore()
+            .getClientProxy()
+            .getOzoneManagerClient()
+            .getServiceInfo();
+
+        if (OzoneSecurityUtil.isSecurityEnabled(conf)) {
+          String caCertPem = null;
+          List<String> caCertPems = null;
+          caCertPem = serviceInfoEx.getCaCertificate();
+          caCertPems = serviceInfoEx.getCaCertPemList();
+          if (caCertPems == null || caCertPems.isEmpty()) {
+            caCertPems = Collections.singletonList(caCertPem);

Review comment:
       Thanks @hanishakoneru for following up.  Yes, I agree - looking at the 
code for the certificates, it may not be set in neither the `CaCertPemList` nor 
the `CaCertificate`.  I've added an additional check for null CaCertificate and 
throws if the not null precondition is not met in the `OzoneClientCache` prior 
to setCaCerts for the transport.
   
   As a result we handle this with the client creation jax injection error 500 
http response (max retries) with an exception trace indicating the precondition 
failure of not null check on the certificate:  
   _java.lang.NullPointerException
   at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:880)
   
,org.apache.hadoop.ozone.s3.OzoneClientCache.setCertificate(OzoneClientCache.java:146)_
   
   Do we need to do anything in addition?
   
   The same conditional check of `caCertPems` for null or empty then setting 
the `caCertPerms` with the `caCertPer` exists in the `RpcClient` constructor 
code.  I can followup on this with new jira to patch the `RpcClient` if you 
think it is needed(?). 

##########
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/GrpcOzoneManagerServer.java
##########
@@ -46,24 +58,42 @@ public GrpcOzoneManagerServer(OzoneConfiguration config,
                                 OzoneManagerProtocolServerSideTranslatorPB
                                     omTranslator,
                                 OzoneDelegationTokenSecretManager
-                                    delegationTokenMgr) {
+                                    delegationTokenMgr,
+                                CertificateClient caClient) {
     this.port = config.getObject(
         GrpcOzoneManagerServerConfig.class).
         getPort();
     init(omTranslator,
         delegationTokenMgr,
-        config);
+        config,
+        caClient);
   }
 
   public void init(OzoneManagerProtocolServerSideTranslatorPB omTranslator,
                    OzoneDelegationTokenSecretManager delegationTokenMgr,
-                   OzoneConfiguration omServerConfig) {
+                   OzoneConfiguration omServerConfig,
+                   CertificateClient caClient) {
     NettyServerBuilder nettyServerBuilder = NettyServerBuilder.forPort(port)
         .maxInboundMessageSize(OzoneConsts.OZONE_SCM_CHUNK_MAX_SIZE)
         .addService(new OzoneManagerServiceGrpc(omTranslator,
             delegationTokenMgr,
             omServerConfig));
 
+    SecurityConfig secConf = new SecurityConfig(omServerConfig);
+    if (secConf.isGrpcTlsEnabled()) {

Review comment:
       Yes, logging a separate error for misconfigured ozone.security while 
_grpc.tls.enabled_ is `true` for the s3g Grpc TLS channel is much better. 
   
   I have modified the code for this case on both the s3g Grpc client and the 
s3g Grpc server.  For the client, `GrpcOmTransport`, should `grpc.tls.enabled` 
be set when _ozone.security_ = `false`, we log the error "...security not 
enabled when TLS specified.".  Also, we don't set the sslContextBuilder for tls 
and default to plaintext channel.  Similarly, for the server, 
`GrpcOzoneManagerServer`, should `grpc.tls.enabled` be set when 
_ozone.security_ = `false`, we log the error ("... not enabled when TLS..., 
using plaintext.") and default to a plaintext channel.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to