Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-22 Thread via GitHub


yuqi1129 merged PR #3852:
URL: https://github.com/apache/gravitino/pull/3852


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-22 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1686059381


##
core/src/main/java/org/apache/gravitino/utils/PrincipalUtils.java:
##
@@ -44,13 +44,19 @@ public static <T> T doAs(Principal principal, PrivilegedExceptionAction<T> action)
     }
   }
 
+  // This method can't be used in nested `Subject#doAs` block.
   public static Principal getCurrentPrincipal() {
     java.security.AccessControlContext context = java.security.AccessController.getContext();
     Subject subject = Subject.getSubject(context);
-    if (subject == null || subject.getPrincipals(UserPrincipal.class).isEmpty()) {

Review Comment:
   I've modified it to match my coding habits; if you think it's not okay, I will revert it.
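The `// This method can't be used in nested Subject#doAs block` note in the hunk above can be demonstrated with a small, self-contained JDK sketch (the names below are illustrative; `NamedPrincipal` stands in for Gravitino's `UserPrincipal`). A nested `Subject.doAs` replaces the Subject bound to the access-control context rather than merging it, so the outer principal becomes invisible inside the inner block:

```java
import java.security.AccessController;
import java.security.Principal;
import java.security.PrivilegedAction;
import javax.security.auth.Subject;

public class NestedDoAsDemo {

  // Minimal Principal standing in for Gravitino's UserPrincipal (illustrative only).
  static final class NamedPrincipal implements Principal {
    private final String name;

    NamedPrincipal(String name) {
      this.name = name;
    }

    @Override
    public String getName() {
      return name;
    }
  }

  static Subject subjectFor(String user) {
    Subject subject = new Subject();
    subject.getPrincipals().add(new NamedPrincipal(user));
    return subject;
  }

  // Same shape as PrincipalUtils#getCurrentPrincipal: read the Subject bound to
  // the current AccessControlContext. (These APIs are deprecated on newer JDKs.)
  static String currentUser() {
    Subject subject = Subject.getSubject(AccessController.getContext());
    if (subject == null || subject.getPrincipals().isEmpty()) {
      return "<anonymous>";
    }
    return subject.getPrincipals().iterator().next().getName();
  }

  public static void main(String[] args) {
    String seen =
        Subject.doAs(
            subjectFor("outer"),
            (PrivilegedAction<String>)
                () ->
                    // The inner doAs binds a context carrying only the inner
                    // Subject; the outer principal is no longer visible here.
                    Subject.doAs(
                        subjectFor("inner"),
                        (PrivilegedAction<String>) NestedDoAsDemo::currentUser));
    System.out.println(seen); // "inner" -- the outer Subject was replaced, not merged
  }
}
```

This is why the review adds the warning comment: calling `getCurrentPrincipal()` inside a nested `doAs` reports the innermost user only.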






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-21 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1685926668


##
core/src/main/java/org/apache/gravitino/utils/PrincipalUtils.java:
##
@@ -44,13 +44,19 @@ public static <T> T doAs(Principal principal, PrivilegedExceptionAction<T> action)
     }
   }
 
+  // This method can't be used in nested `Subject#doAs` block.
   public static Principal getCurrentPrincipal() {
     java.security.AccessControlContext context = java.security.AccessController.getContext();
     Subject subject = Subject.getSubject(context);
-    if (subject == null || subject.getPrincipals(UserPrincipal.class).isEmpty()) {

Review Comment:
   Why do you change this?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-19 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2238562685

   gently ping @jerqi 





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-18 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1682369406


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/BaseHadoopCatalogOperations.java:
##
@@ -0,0 +1,625 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.hadoop;
+
+import static org.apache.gravitino.connector.BaseCatalog.CATALOG_BYPASS_PREFIX;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Maps;
+import java.io.IOException;
+import java.time.Instant;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.stream.Collectors;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.gravitino.Catalog;
+import org.apache.gravitino.Entity;
+import org.apache.gravitino.EntityStore;
+import org.apache.gravitino.GravitinoEnv;
+import org.apache.gravitino.NameIdentifier;
+import org.apache.gravitino.Namespace;
+import org.apache.gravitino.Schema;
+import org.apache.gravitino.SchemaChange;
+import org.apache.gravitino.StringIdentifier;
+import org.apache.gravitino.connector.CatalogInfo;
+import org.apache.gravitino.connector.HasPropertyMetadata;
+import org.apache.gravitino.exceptions.AlreadyExistsException;
+import org.apache.gravitino.exceptions.FilesetAlreadyExistsException;
+import org.apache.gravitino.exceptions.NoSuchCatalogException;
+import org.apache.gravitino.exceptions.NoSuchEntityException;
+import org.apache.gravitino.exceptions.NoSuchFilesetException;
+import org.apache.gravitino.exceptions.NoSuchSchemaException;
+import org.apache.gravitino.exceptions.NonEmptySchemaException;
+import org.apache.gravitino.exceptions.SchemaAlreadyExistsException;
+import org.apache.gravitino.file.Fileset;
+import org.apache.gravitino.file.FilesetChange;
+import org.apache.gravitino.meta.AuditInfo;
+import org.apache.gravitino.meta.FilesetEntity;
+import org.apache.gravitino.meta.SchemaEntity;
+import org.apache.gravitino.utils.PrincipalUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+class BaseHadoopCatalogOperations {

Review Comment:
   Change to `HadoopCatalogOperationsImpl`






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-18 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1682338799


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/BaseHadoopCatalogOperations.java:
##
(quoted file context identical to the earlier message in this thread)

Review Comment:
   > Base usually means inheritance relationship. Could you give a better name?
   
   Let me check.
   
   > Alterxxx should pass api user, too.
   
   Alterxxx doesn't need UGI to handle FileSystem, so we do not need to pass users to it.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-18 Thread via GitHub


qqqttt123 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1682324396


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/BaseHadoopCatalogOperations.java:
##
(quoted file context identical to the earlier message in this thread)

Review Comment:
   Base usually means inheritance relationship. Could you give a better name?
   Alterxxx should pass api user, too.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-17 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1682286678


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
@@ -0,0 +1,377 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.catalog.hadoop;
+
+import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.io.Closeable;
+import java.io.File;
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.UndeclaredThrowableException;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.Collections;
+import java.util.List;
+import java.util.Locale;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.gravitino.Entity;
+import org.apache.gravitino.EntityStore;
+import org.apache.gravitino.NameIdentifier;
+import org.apache.gravitino.Schema;
+import org.apache.gravitino.catalog.hadoop.authentication.AuthenticationConfig;
+import org.apache.gravitino.catalog.hadoop.authentication.kerberos.KerberosClient;
+import org.apache.gravitino.catalog.hadoop.authentication.kerberos.KerberosConfig;
+import org.apache.gravitino.connector.CatalogInfo;
+import org.apache.gravitino.connector.HasPropertyMetadata;
+import org.apache.gravitino.exceptions.FilesetAlreadyExistsException;
+import org.apache.gravitino.exceptions.NoSuchCatalogException;
+import org.apache.gravitino.exceptions.NoSuchEntityException;
+import org.apache.gravitino.exceptions.NoSuchFilesetException;
+import org.apache.gravitino.exceptions.NoSuchSchemaException;
+import org.apache.gravitino.exceptions.NonEmptySchemaException;
+import org.apache.gravitino.exceptions.SchemaAlreadyExistsException;
+import org.apache.gravitino.file.Fileset;
+import org.apache.gravitino.file.FilesetChange;
+import org.apache.gravitino.meta.FilesetEntity;
+import org.apache.gravitino.meta.SchemaEntity;
+import org.apache.gravitino.utils.PrincipalUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * SecureHadoopCatalogOperations is a secure version of HadoopCatalogOperations that can manage
+ * Schema and fileset level of user authentication.
+ */
+public class SecureHadoopCatalogOperations extends HadoopCatalogOperations {
+
+  public static final Logger LOG = LoggerFactory.getLogger(SecureHadoopCatalogOperations.class);
+
+  private final List<Closeable> closeables = Lists.newArrayList();
+
+  private final Map<NameIdentifier, UserInfo> userInfoMap = Maps.newConcurrentMap();
+
+  public static final String GRAVITINO_KEYTAB_FORMAT = "keytabs/gravitino-%s";
+
+  private String kerberosRealm;
+
+  public SecureHadoopCatalogOperations() {
+    super();
+  }
+
+  public SecureHadoopCatalogOperations(EntityStore store) {
+    super(store);
+  }
+
+  public String getKerberosRealm() {
+    return kerberosRealm;
+  }
+
+  static class UserInfo {
+    UserGroupInformation loginUser;
+    boolean enableUserImpersonation;
+    String keytabPath;
+    String realm;
+
+    static UserInfo of(
+        UserGroupInformation loginUser,
+        boolean enableUserImpersonation,
+        String keytabPath,
+        String kerberosRealm) {
+      UserInfo userInfo = new UserInfo();
+      userInfo.loginUser = loginUser;
+      userInfo.enableUserImpersonation = enableUserImpersonation;
+      userInfo.keytabPath = keytabPath;
+      userInfo.realm = kerberosRealm;
+      return userInfo;
+    }
+  }
+
+  // We have overridden the createFileset, dropFileset, createSchema, dropSchema method to reset
+  // the current user based on the name identifier.
+
+  @Override
+  public Fileset createFileset(
+      NameIdentifier ident,
+      String comment,
+      Fileset.Type type,
+      String storageLocation,
+      Map<String, String> properties)
+      throws NoSuchSchemaException, FilesetAlreadyExistsException {
+

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-17 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1680950653


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
(quoted file context identical to the earlier message in this thread)

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-17 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1680942619


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
(quoted file context identical to the earlier message in this thread)

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679254196


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =
+          initKerberos(
+              conf,
+              hadoopConf,
+              NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+              true);
+    } else if (config.isSimpleAuth()) {
+      UserGroupInformation u =
+          UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());

Review Comment:
   UserGroupInformation.getCurrentUser.
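For context, the refactor under review replaces inline string comparison of the auth type with predicate methods on the config object. The sketch below mirrors that control flow using plain JDK types only; the `authentication.type` key, the `"simple"` default, and the string return values are simplifications of Gravitino's `AuthenticationConfig`, and no Hadoop classes are used:

```java
import java.util.Map;

public class AuthDispatchDemo {

  // Simplified stand-in for Gravitino's AuthenticationConfig; the
  // "authentication.type" key and "simple" default are assumptions here.
  static final class AuthenticationConfig {
    private final String authType;

    AuthenticationConfig(Map<String, String> conf) {
      this.authType = conf.getOrDefault("authentication.type", "simple");
    }

    boolean isKerberosAuth() {
      return "kerberos".equalsIgnoreCase(authType);
    }

    boolean isSimpleAuth() {
      return "simple".equalsIgnoreCase(authType);
    }
  }

  // Mirrors the branch structure of the new initAuthentication: one predicate
  // per auth type instead of string comparison at the call site.
  static String initAuthentication(Map<String, String> conf) {
    AuthenticationConfig config = new AuthenticationConfig(conf);
    if (config.isKerberosAuth()) {
      // The real code logs in via KerberosClient here and records the realm.
      return "kerberos";
    } else if (config.isSimpleAuth()) {
      // The real code builds a UGI here; the review suggests
      // UserGroupInformation.getCurrentUser() over createRemoteUser().
      return "simple";
    }
    throw new IllegalArgumentException("Unsupported auth type");
  }

  public static void main(String[] args) {
    System.out.println(initAuthentication(Map.of("authentication.type", "kerberos"))); // kerberos
    System.out.println(initAuthentication(Map.of()));                                  // simple
  }
}
```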






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679251780


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =

Review Comment:
   My fault.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679251345


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopProxyPlugin.java:
##
@@ -82,7 +82,7 @@ public Object doAs(
 
   @Override
   public void bindCatalogOperation(CatalogOperations ops) {
-    this.ops = ((HadoopCatalogOperations) ops);
+    this.ops = ((SecureHadoopCatalogOperations) ops);

Review Comment:
   `HadoopProxyPlugin` is not used anymore, so it does not matter here.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679247529


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -109,6 +118,34 @@ public String getKerberosRealm() {
     return kerberosRealm;
   }
 
+  public EntityStore getStore() {
+    return store;
+  }
+
+  public Map getUserInfoMap() {

Review Comment:
   Let me see if we can move `initKerberos` to `SecureHadoopCatalogOperations`.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679246137


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =
+          initKerberos(
+              conf,
+              hadoopConf,
+              NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+              true);
+    } else if (config.isSimpleAuth()) {
+      UserGroupInformation u =
+          UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());

Review Comment:
   So in simple mode, how can I create a UGI?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679245962


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopProxyPlugin.java:
##
@@ -82,7 +82,7 @@ public Object doAs(
 
   @Override
   public void bindCatalogOperation(CatalogOperations ops) {
-    this.ops = ((HadoopCatalogOperations) ops);
+    this.ops = ((SecureHadoopCatalogOperations) ops);

Review Comment:
   Should the plugin be null?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679244937


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =
Review Comment:
   `this.kerberosRealm` will only be initialized once, so what's your main point?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679244910


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -109,6 +118,34 @@ public String getKerberosRealm() {
     return kerberosRealm;
   }
 
+  public EntityStore getStore() {
+    return store;
+  }
+
+  public Map getUserInfoMap() {

Review Comment:
   Should we maintain these in the SecureHadoopCatalogOperations?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679240903


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =
+          initKerberos(
+              conf,
+              hadoopConf,
+              NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+              true);
+    } else if (config.isSimpleAuth()) {
+      UserGroupInformation u =
+          UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());

Review Comment:
   Why do we use `RemoteUser`?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679240407


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -140,20 +177,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
     AuthenticationConfig config = new AuthenticationConfig(conf);
-    String authType = config.getAuthType();
 
-    if (StringUtils.equalsIgnoreCase(authType, AuthenticationMethod.KERBEROS.name())) {
-      hadoopConf.set(
-          HADOOP_SECURITY_AUTHENTICATION,
-          AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-      UserGroupInformation.setConfiguration(hadoopConf);
-      try {
-        KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-        File keytabFile = kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-        this.kerberosRealm = kerberosClient.login(keytabFile.getAbsolutePath());
-      } catch (IOException e) {
-        throw new RuntimeException("Failed to login with Kerberos", e);
-      }
+    if (config.isKerberosAuth()) {
+      this.kerberosRealm =

Review Comment:
   Why do we need this? Is it a stateful variable? Will it involve a thread-safety issue?
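To illustrate the concern about a stateful field, here is a minimal, self-contained sketch (not Gravitino code; all names are hypothetical stand-ins) of guarding a field like `kerberosRealm` so concurrent callers trigger the expensive login at most once, using double-checked locking over a `volatile` field:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RealmHolder {
  private final AtomicInteger logins = new AtomicInteger();
  private volatile String kerberosRealm;

  // Stands in for the expensive KerberosClient#login call.
  private String login() {
    logins.incrementAndGet();
    return "EXAMPLE.COM";
  }

  // Double-checked locking: the realm is computed once; later readers
  // see the published value and skip the synchronized block entirely.
  String getOrInitRealm() {
    String realm = kerberosRealm;
    if (realm == null) {
      synchronized (this) {
        if (kerberosRealm == null) {
          kerberosRealm = login();
        }
        realm = kerberosRealm;
      }
    }
    return realm;
  }

  int loginCount() {
    return logins.get();
  }

  public static void main(String[] args) throws InterruptedException {
    RealmHolder holder = new RealmHolder();
    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < 16; i++) {
      threads.add(new Thread(holder::getOrInitRealm));
    }
    for (Thread t : threads) t.start();
    for (Thread t : threads) t.join();
    // prints "EXAMPLE.COM logins=1"
    System.out.println(holder.getOrInitRealm() + " logins=" + holder.loginCount());
  }
}
```

If `initialize` is guaranteed to run once per catalog instance, the `volatile`/lock machinery is unnecessary; the sketch only shows what would be needed if multiple threads could race on the assignment.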






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679224887


##
core/src/main/java/org/apache/gravitino/utils/PrincipalUtils.java:
##
@@ -47,10 +47,19 @@ public static <T> T doAs(Principal principal, PrivilegedExceptionAction<T> action)
   public static Principal getCurrentPrincipal() {
     java.security.AccessControlContext context = java.security.AccessController.getContext();
     Subject subject = Subject.getSubject(context);
-    if (subject == null || subject.getPrincipals(UserPrincipal.class).isEmpty()) {
+    if (subject == null) {
       return new UserPrincipal(AuthConstants.ANONYMOUS_USER);
     }
-    return subject.getPrincipals(UserPrincipal.class).iterator().next();
+
+    if (!subject.getPrincipals(UserPrincipal.class).isEmpty()) {
+      return subject.getPrincipals(UserPrincipal.class).iterator().next();
+    }
+
+    if (!subject.getPrincipals().isEmpty()) {
+      return new UserPrincipal(subject.getPrincipals().iterator().next().getName());

Review Comment:
   I think we'd better add a description that `PrincipalUtils` can't be used in nested `doAs` code blocks without this change.
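A minimal, self-contained sketch of the fallback under discussion. `UserPrincipal` and the "anonymous" constant mirror Gravitino's names; `OtherPrincipal` is a hypothetical stand-in for a non-`UserPrincipal` principal (e.g. one installed by a nested `doAs` Subject):

```java
import java.security.Principal;
import javax.security.auth.Subject;

class PrincipalFallbackSketch {
  static final String ANONYMOUS_USER = "anonymous";

  // Mirrors the patched getCurrentPrincipal(): prefer a UserPrincipal,
  // then fall back to any other principal on the Subject, then to anonymous.
  static Principal getCurrentPrincipal(Subject subject) {
    if (subject == null) {
      return new UserPrincipal(ANONYMOUS_USER);
    }
    if (!subject.getPrincipals(UserPrincipal.class).isEmpty()) {
      return subject.getPrincipals(UserPrincipal.class).iterator().next();
    }
    // Fallback: adopt the name of whatever principal the Subject carries,
    // so a Subject without a UserPrincipal is not treated as anonymous.
    if (!subject.getPrincipals().isEmpty()) {
      return new UserPrincipal(subject.getPrincipals().iterator().next().getName());
    }
    return new UserPrincipal(ANONYMOUS_USER);
  }

  public static void main(String[] args) {
    Subject inner = new Subject();
    inner.getPrincipals().add(new OtherPrincipal("real-caller"));
    // Without the fallback this would resolve to "anonymous".
    System.out.println(getCurrentPrincipal(inner).getName()); // prints "real-caller"
  }
}

class UserPrincipal implements Principal {
  private final String name;
  UserPrincipal(String name) { this.name = name; }
  @Override public String getName() { return name; }
}

class OtherPrincipal implements Principal {
  private final String name;
  OtherPrincipal(String name) { this.name = name; }
  @Override public String getName() { return name; }
}
```

The sketch takes the `Subject` as a parameter instead of calling `Subject.getSubject(AccessController.getContext())`, purely to keep it self-contained; the resolution order is the same.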






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679221853


##
core/src/main/java/org/apache/gravitino/utils/PrincipalUtils.java:
##
@@ -47,10 +47,19 @@ public static <T> T doAs(Principal principal, PrivilegedExceptionAction<T> action)
   public static Principal getCurrentPrincipal() {
     java.security.AccessControlContext context = java.security.AccessController.getContext();
     Subject subject = Subject.getSubject(context);
-    if (subject == null || subject.getPrincipals(UserPrincipal.class).isEmpty()) {
+    if (subject == null) {
       return new UserPrincipal(AuthConstants.ANONYMOUS_USER);
     }
-    return subject.getPrincipals(UserPrincipal.class).iterator().next();
+
+    if (!subject.getPrincipals(UserPrincipal.class).isEmpty()) {
+      return subject.getPrincipals(UserPrincipal.class).iterator().next();
+    }
+
+    if (!subject.getPrincipals().isEmpty()) {
+      return new UserPrincipal(subject.getPrincipals().iterator().next().getName());

Review Comment:
   This logic isn't right. Why do we need this?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679157472


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.catalog.hadoop;
+
+import java.io.File;
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.UndeclaredThrowableException;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.gravitino.Entity;
+import org.apache.gravitino.EntityStore;
+import org.apache.gravitino.NameIdentifier;
+import org.apache.gravitino.Namespace;
+import org.apache.gravitino.Schema;
+import org.apache.gravitino.SchemaChange;
+import org.apache.gravitino.catalog.hadoop.HadoopCatalogOperations.UserInfo;
+import org.apache.gravitino.catalog.hadoop.authentication.kerberos.KerberosConfig;
+import org.apache.gravitino.connector.CatalogInfo;
+import org.apache.gravitino.connector.CatalogOperations;
+import org.apache.gravitino.connector.HasPropertyMetadata;
+import org.apache.gravitino.connector.SupportsSchemas;
+import org.apache.gravitino.exceptions.FilesetAlreadyExistsException;
+import org.apache.gravitino.exceptions.NoSuchCatalogException;
+import org.apache.gravitino.exceptions.NoSuchEntityException;
+import org.apache.gravitino.exceptions.NoSuchFilesetException;
+import org.apache.gravitino.exceptions.NoSuchSchemaException;
+import org.apache.gravitino.exceptions.NonEmptySchemaException;
+import org.apache.gravitino.exceptions.SchemaAlreadyExistsException;
+import org.apache.gravitino.file.Fileset;
+import org.apache.gravitino.file.FilesetCatalog;
+import org.apache.gravitino.file.FilesetChange;
+import org.apache.gravitino.meta.FilesetEntity;
+import org.apache.gravitino.meta.SchemaEntity;
+import org.apache.gravitino.utils.PrincipalUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * SecureHadoopCatalogOperations is a secure version of HadoopCatalogOperations that can manage
+ * Schema and fileset level of user authentication.
+ */
+public class SecureHadoopCatalogOperations
+implements CatalogOperations, SupportsSchemas, FilesetCatalog {
+
+  public static final Logger LOG = LoggerFactory.getLogger(SecureHadoopCatalogOperations.class);
+
+  private final HadoopCatalogOperations hadoopCatalogOperations;
+
+  public SecureHadoopCatalogOperations() {
+this.hadoopCatalogOperations = new HadoopCatalogOperations();
+  }
+
+  public SecureHadoopCatalogOperations(EntityStore store) {
+this.hadoopCatalogOperations = new HadoopCatalogOperations(store);
+  }
+
+  public HadoopCatalogOperations getHadoopCatalogOperations() {
+return hadoopCatalogOperations;
+  }
+
+  public String getKerberosRealm() {
+return hadoopCatalogOperations.getKerberosRealm();
+  }
+
+  public void setProxyPlugin(HadoopProxyPlugin plugin) {
+hadoopCatalogOperations.setProxyPlugin(plugin);
+  }
+
+  // We have overridden the createFileset, dropFileset, createSchema, dropSchema method to reset
+  // the current user based on the name identifier.
+
+  @Override
+  public Fileset createFileset(
+  NameIdentifier ident,
+  String comment,
+  Fileset.Type type,
+  String storageLocation,
+      Map<String, String> properties)
+  throws NoSuchSchemaException, FilesetAlreadyExistsException {
+UserGroupInformation currentUser = getUGIByIdent(properties, ident);
+return doAs(
+currentUser,
+() ->
+hadoopCatalogOperations.createFileset(
+ident, comment, type, storageLocation, properties),
+ident);
+  }
+
+  @Override
+  public boolean dropFileset(NameIdentifier ident) {
+FilesetEntity filesetEntity;
+try {
+  filesetEntity =
+  hadoopCatalogOperations
+  .getStore()
+  .get(ident, Entity.EntityType.FILESET, FilesetEntity.c

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679101137


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.catalog.hadoop;
+
+import java.io.File;
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.UndeclaredThrowableException;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.gravitino.Entity;
+import org.apache.gravitino.EntityStore;
+import org.apache.gravitino.NameIdentifier;
+import org.apache.gravitino.Namespace;
+import org.apache.gravitino.Schema;
+import org.apache.gravitino.SchemaChange;
+import org.apache.gravitino.catalog.hadoop.HadoopCatalogOperations.UserInfo;
+import org.apache.gravitino.catalog.hadoop.authentication.kerberos.KerberosConfig;
+import org.apache.gravitino.connector.CatalogInfo;
+import org.apache.gravitino.connector.CatalogOperations;
+import org.apache.gravitino.connector.HasPropertyMetadata;
+import org.apache.gravitino.connector.SupportsSchemas;
+import org.apache.gravitino.exceptions.FilesetAlreadyExistsException;
+import org.apache.gravitino.exceptions.NoSuchCatalogException;
+import org.apache.gravitino.exceptions.NoSuchEntityException;
+import org.apache.gravitino.exceptions.NoSuchFilesetException;
+import org.apache.gravitino.exceptions.NoSuchSchemaException;
+import org.apache.gravitino.exceptions.NonEmptySchemaException;
+import org.apache.gravitino.exceptions.SchemaAlreadyExistsException;
+import org.apache.gravitino.file.Fileset;
+import org.apache.gravitino.file.FilesetCatalog;
+import org.apache.gravitino.file.FilesetChange;
+import org.apache.gravitino.meta.FilesetEntity;
+import org.apache.gravitino.meta.SchemaEntity;
+import org.apache.gravitino.utils.PrincipalUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * SecureHadoopCatalogOperations is a secure version of HadoopCatalogOperations that can manage
+ * Schema and fileset level of user authentication.
+ */
+public class SecureHadoopCatalogOperations
+implements CatalogOperations, SupportsSchemas, FilesetCatalog {
+
+  public static final Logger LOG = LoggerFactory.getLogger(SecureHadoopCatalogOperations.class);
+
+  private final HadoopCatalogOperations hadoopCatalogOperations;
+
+  public SecureHadoopCatalogOperations() {
+this.hadoopCatalogOperations = new HadoopCatalogOperations();
+  }
+
+  public SecureHadoopCatalogOperations(EntityStore store) {
+this.hadoopCatalogOperations = new HadoopCatalogOperations(store);
+  }
+
+  public HadoopCatalogOperations getHadoopCatalogOperations() {
+return hadoopCatalogOperations;
+  }
+
+  public String getKerberosRealm() {
+return hadoopCatalogOperations.getKerberosRealm();
+  }
+
+  public void setProxyPlugin(HadoopProxyPlugin plugin) {
+hadoopCatalogOperations.setProxyPlugin(plugin);
+  }
+
+  // We have overridden the createFileset, dropFileset, createSchema, dropSchema method to reset
+  // the current user based on the name identifier.
+
+  @Override
+  public Fileset createFileset(
+  NameIdentifier ident,
+  String comment,
+  Fileset.Type type,
+  String storageLocation,
+      Map<String, String> properties)
+  throws NoSuchSchemaException, FilesetAlreadyExistsException {
+UserGroupInformation currentUser = getUGIByIdent(properties, ident);
+return doAs(
+currentUser,
+() ->
+hadoopCatalogOperations.createFileset(
+ident, comment, type, storageLocation, properties),
+ident);
+  }
+
+  @Override
+  public boolean dropFileset(NameIdentifier ident) {
+FilesetEntity filesetEntity;
+try {
+  filesetEntity =
+  hadoopCatalogOperations
+  .getStore()
+  .get(ident, Entity.EntityType.FILESET, FilesetEntit

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1679089279


##
catalogs/catalog-hadoop/src/main/java/org/apache/gravitino/catalog/hadoop/SecureHadoopCatalogOperations.java:
##
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.gravitino.catalog.hadoop;
+
+import java.io.File;
+import java.io.IOException;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.UndeclaredThrowableException;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.Collections;
+import java.util.Map;
+import java.util.Optional;
+import org.apache.gravitino.Entity;
+import org.apache.gravitino.EntityStore;
+import org.apache.gravitino.NameIdentifier;
+import org.apache.gravitino.Namespace;
+import org.apache.gravitino.Schema;
+import org.apache.gravitino.SchemaChange;
+import org.apache.gravitino.catalog.hadoop.HadoopCatalogOperations.UserInfo;
+import org.apache.gravitino.catalog.hadoop.authentication.kerberos.KerberosConfig;
+import org.apache.gravitino.connector.CatalogInfo;
+import org.apache.gravitino.connector.CatalogOperations;
+import org.apache.gravitino.connector.HasPropertyMetadata;
+import org.apache.gravitino.connector.SupportsSchemas;
+import org.apache.gravitino.exceptions.FilesetAlreadyExistsException;
+import org.apache.gravitino.exceptions.NoSuchCatalogException;
+import org.apache.gravitino.exceptions.NoSuchEntityException;
+import org.apache.gravitino.exceptions.NoSuchFilesetException;
+import org.apache.gravitino.exceptions.NoSuchSchemaException;
+import org.apache.gravitino.exceptions.NonEmptySchemaException;
+import org.apache.gravitino.exceptions.SchemaAlreadyExistsException;
+import org.apache.gravitino.file.Fileset;
+import org.apache.gravitino.file.FilesetCatalog;
+import org.apache.gravitino.file.FilesetChange;
+import org.apache.gravitino.meta.FilesetEntity;
+import org.apache.gravitino.meta.SchemaEntity;
+import org.apache.gravitino.utils.PrincipalUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * SecureHadoopCatalogOperations is a secure version of HadoopCatalogOperations that can manage
+ * Schema and fileset level of user authentication.
+ */
+public class SecureHadoopCatalogOperations
+implements CatalogOperations, SupportsSchemas, FilesetCatalog {
+
+  public static final Logger LOG = LoggerFactory.getLogger(SecureHadoopCatalogOperations.class);
+
+  private final HadoopCatalogOperations hadoopCatalogOperations;
+
+  public SecureHadoopCatalogOperations() {
+this.hadoopCatalogOperations = new HadoopCatalogOperations();
+  }
+
+  public SecureHadoopCatalogOperations(EntityStore store) {
+this.hadoopCatalogOperations = new HadoopCatalogOperations(store);
+  }
+
+  public HadoopCatalogOperations getHadoopCatalogOperations() {
+return hadoopCatalogOperations;
+  }
+
+  public String getKerberosRealm() {
+return hadoopCatalogOperations.getKerberosRealm();
+  }
+
+  public void setProxyPlugin(HadoopProxyPlugin plugin) {
+hadoopCatalogOperations.setProxyPlugin(plugin);
+  }
+
+  // We have overridden the createFileset, dropFileset, createSchema, dropSchema method to reset
+  // the current user based on the name identifier.
+
+  @Override
+  public Fileset createFileset(
+  NameIdentifier ident,
+  String comment,
+  Fileset.Type type,
+  String storageLocation,
+      Map<String, String> properties)
+  throws NoSuchSchemaException, FilesetAlreadyExistsException {
+UserGroupInformation currentUser = getUGIByIdent(properties, ident);
+return doAs(
+currentUser,
+() ->
+hadoopCatalogOperations.createFileset(
+ident, comment, type, storageLocation, properties),
+ident);
+  }
+
+  @Override
+  public boolean dropFileset(NameIdentifier ident) {
+FilesetEntity filesetEntity;
+try {
+  filesetEntity =
+  hadoopCatalogOperations
+  .getStore()
+  .get(ident, Entity.EntityType.FILESET, FilesetEntity.c

Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-16 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2230469031

   @qqqttt123 
   I have made the following modifications:
   - Disable `HadoopProxyPlugin` for the Hadoop catalog.
   - Disable credential refresh for schema/fileset, as we will log in to Kerberos for every operation.





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-11 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-704271

   @jerqi 
   
   I hesitate to remove the catalog proxy and use a new proxy to unify catalog, schema and fileset authentication, as there seems to be some customized logic in some operations, such as:

   ```
   try {
     SchemaEntity schemaEntity =
         hadoopCatalogOperations
             .getStore()
             .get(ident, Entity.EntityType.SCHEMA, SchemaEntity.class);
     Map<String, String> properties =
         Optional.ofNullable(schemaEntity.properties()).orElse(Collections.emptyMap());

     // We need the logic above to obtain the properties; it's not appropriate
     // to move it to a proxy class.

     // Reset the current user based on the name identifier.
     UserGroupInformation user = getUGIByIdent(properties, ident);

     boolean r = doAs(user, () -> hadoopCatalogOperations.dropSchema(ident, cascade), ident);
     cleanUserInfo(ident);
     return r;
   } catch (IOException ioe) {
     throw new RuntimeException("Failed to delete schema " + ident, ioe);
   }
   ```
   
   





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2221975854

   > > There are still some issues I mentioned earlier that need to be solved.
   > > 
   > > 1. Multiple threads will trigger the issue if you don't use a lock. Authentication uses some static variables.
   > > 2. How do you handle the credential refresh issues?
   > > 3. Could you extract a wrapper for every operation?
   > 
   > > How do you handle the credential refresh issues?
   > 
   > Different Kerberos users use different credential files, so it's not 
clear to me why you mentioned potential refresh issues.
   
   I have checked login and refresh login; the core data structures 
`HadoopLoginContext` and `Subject` do not use any static values. The remaining 
problem: I can't confirm whether the method `LoginContext#login` is thread 
safe, as that depends on the implementer of the authentication mechanism. 
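
   The atomicity concern discussed here can be sketched with plain Java 
standing in for Hadoop's static login state (hypothetical names; this is not 
UGI's or `LoginContext`'s real API):

   ```java
   public class LoginRaceSketch {
       // Shared static login state, standing in for what UGI/LoginContext keep.
       private static String loginUser;

       // Unsafe shape: another thread's login can land between this thread's
       // login and its getLoginUser, so it reads the wrong user.
       static String loginAndGetUnsafe(String principal) throws InterruptedException {
           loginUser = principal;      // stand-in for LoginContext#login
           Thread.sleep(1);            // widen the race window
           return loginUser;           // stand-in for UGI.getLoginUser()
       }

       // Making login + getLoginUser one atomic step removes the race.
       static synchronized String loginAndGetSafe(String principal) {
           loginUser = principal;
           return loginUser;
       }
   }
   ```

   This mirrors the `synchronized` guard the PR puts around `initKerberos`.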





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2221944693

   > There are still some issues I mentioned earlier that need to be solved.
   > 
   > 1. Multiple threads will trigger the issue if you don't use a lock. Authentication uses some static variables.
   > 2. How do you handle the credential refresh issues?
   > 3. Could you extract a wrapper for every operation?
   
   > How do you handle the credential refresh issues?
   
   Different Kerberos users use different credential files, so it's not clear 
to me why you mentioned potential refresh issues.
   





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1672287397


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -96,6 +99,14 @@ public class HadoopCatalogOperations implements 
CatalogOperations, SupportsSchem
 
   private CatalogInfo catalogInfo;
 
+  private final List<Closeable> closeables = Lists.newArrayList();
+
+  private final Map<NameIdentifier, UserInfo> userInfoMap = Maps.newConcurrentMap();
+
+  public static final String GRAVITINO_KEYTAB_FORMAT = "keytabs/gravitino-%s";
+
+  private final ThreadLocal currentUser = ThreadLocal.withInitial(() -> null);

Review Comment:
   I used a thread-local variable to keep the API user; otherwise we wouldn't 
be able to get the API user inside the `doAs` block.
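
   A minimal JDK-only sketch of why the thread-local is needed (illustrative 
names, not Gravitino's actual fields): the `doAs` call switches the effective 
subject, but a `ThreadLocal` set before the call still carries the original 
API user into the action.

   ```java
   import java.security.PrivilegedAction;
   import javax.security.auth.Subject;

   public class ThreadLocalUserDemo {
       // Hypothetical holder for the API user, mirroring the PR's ThreadLocal idea.
       private static final ThreadLocal<String> CURRENT_API_USER =
               ThreadLocal.withInitial(() -> null);

       public static String currentApiUser() {
           return CURRENT_API_USER.get();
       }

       public static <T> T runAs(String apiUser, PrivilegedAction<T> action) {
           CURRENT_API_USER.set(apiUser);
           try {
               // Subject.doAs switches the JAAS subject, but the ThreadLocal
               // still carries the original API user on this thread.
               return Subject.doAs(new Subject(), action);
           } finally {
               CURRENT_API_USER.remove();
           }
       }
   }
   ```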






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


jerqi commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1672117933


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -96,6 +99,14 @@ public class HadoopCatalogOperations implements 
CatalogOperations, SupportsSchem
 
   private CatalogInfo catalogInfo;
 
+  private final List<Closeable> closeables = Lists.newArrayList();
+
+  private final Map<NameIdentifier, UserInfo> userInfoMap = Maps.newConcurrentMap();
+
+  public static final String GRAVITINO_KEYTAB_FORMAT = "keytabs/gravitino-%s";
+
+  private final ThreadLocal currentUser = ThreadLocal.withInitial(() -> null);

Review Comment:
   Could you avoid using a thread-local variable here? We can call `doAs` in 
the wrapper like the other code does.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


jerqi commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2220177674

   There are still some issues I mentioned earlier that need to be solved.
   1. Multiple threads will trigger the issue if you don't use a lock. Authentication uses some static variables. 
   2. How do you handle the credential refresh issues?
   3. Could you extract a wrapper for every operation?
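
   The "wrapper for every operation" idea in point 3 can be shaped roughly 
like this (a hypothetical sketch; the real code wraps `hadoopCatalogOperations` 
calls in `doAs` and the helper names here are illustrative):

   ```java
   import java.util.function.Function;

   public final class OperationWrapper {
       /** Hypothetical per-operation wrapper: resolve the user for the entity,
        *  run the operation as that user, and always clean up afterwards. */
       public static <T> T callAs(
               String ident,
               Function<String, String> userResolver,   // stand-in for getUGIByIdent
               Function<String, T> operation) {         // stand-in for a doAs action
           String user = userResolver.apply(ident);
           try {
               // In the real catalog this would be ugi.doAs(() -> ...).
               return operation.apply(user);
           } finally {
               cleanUserInfo(ident);                    // stand-in cleanup hook
           }
       }

       private static void cleanUserInfo(String ident) {
           // no-op in this sketch; the PR removes cached user info here
       }
   }
   ```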





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-10 Thread via GitHub


yuqi1129 commented on PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#issuecomment-2219799074

   @qqqttt123 
   Please help review it again, thanks.





Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-09 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1669997190


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -370,66 +420,121 @@ public NameIdentifier[] listSchemas(Namespace namespace) 
throws NoSuchCatalogExc
 }
   }
 
-  @Override
-  public Schema createSchema(NameIdentifier ident, String comment, Map<String, String> properties)
-  throws NoSuchCatalogException, SchemaAlreadyExistsException {
-try {
-  if (store.exists(ident, Entity.EntityType.SCHEMA)) {
-throw new SchemaAlreadyExistsException("Schema %s already exists", 
ident);
-  }
-} catch (IOException ioe) {
-  throw new RuntimeException("Failed to check if schema " + ident + " 
exists", ioe);
-}
+  /**
+   * Get the UserGroupInformation based on the NameIdentifier and properties.
+   *
+   * Note: As UserGroupInformation is a static class, to avoid the thread 
safety issue, we need
+   * to use synchronized to ensure the thread safety: Make login and 
getLoginUser atomic.
+   */
+  private synchronized String initKerberos(
+  String keytabPath,
+  Map<String, String> properties,
+  Configuration configuration,
+  NameIdentifier ident) {
+// Init schema level kerberos authentication.
+KerberosConfig kerberosConfig = new KerberosConfig(properties);
+if (kerberosConfig.isKerberosAuth()) {
+  configuration.set(
+  HADOOP_SECURITY_AUTHENTICATION,
+  AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
+  UserGroupInformation.setConfiguration(configuration);

Review Comment:
   done






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-08 Thread via GitHub


qqqttt123 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1668287818


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -125,20 +168,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
 AuthenticationConfig config = new AuthenticationConfig(conf);
-String authType = config.getAuthType();
 
-if (StringUtils.equalsIgnoreCase(authType, 
AuthenticationMethod.KERBEROS.name())) {
-  hadoopConf.set(
-  HADOOP_SECURITY_AUTHENTICATION,
-  AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-  UserGroupInformation.setConfiguration(hadoopConf);
-  try {
-KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-File keytabFile = 
kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-this.kerberosRealm = 
kerberosClient.login(keytabFile.getAbsolutePath());
-  } catch (IOException e) {
-throw new RuntimeException("Failed to login with Kerberos", e);
-  }
+if (config.isKerberosAuth()) {
+  this.kerberosRealm =
+  initKerberos(
+  conf, hadoopConf, NameIdentifier.of(catalogInfo.namespace(), 
catalogInfo.name()));
+} else if (config.isSimpleAuth()) {
+  // TODO: change the user 'datastrato' to 'anonymous' and uncomment the 
following code;
+  //  uncomment the following code after the user 'datastrato' is removed 
from the codebase.
+  //  for more, please see 
https://github.com/datastrato/gravitino/issues/4013
+  // UserGroupInformation u =
+  //
UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());
+  // userInfoMap.put(
+  //NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+  //UserInfo.of(u, false, null, null));

Review Comment:
   It's not reasonable solution for me.






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-07 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1668067258


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -125,20 +168,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
 AuthenticationConfig config = new AuthenticationConfig(conf);
-String authType = config.getAuthType();
 
-if (StringUtils.equalsIgnoreCase(authType, 
AuthenticationMethod.KERBEROS.name())) {
-  hadoopConf.set(
-  HADOOP_SECURITY_AUTHENTICATION,
-  AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-  UserGroupInformation.setConfiguration(hadoopConf);
-  try {
-KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-File keytabFile = 
kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-this.kerberosRealm = 
kerberosClient.login(keytabFile.getAbsolutePath());
-  } catch (IOException e) {
-throw new RuntimeException("Failed to login with Kerberos", e);
-  }
+if (config.isKerberosAuth()) {
+  this.kerberosRealm =
+  initKerberos(
+  conf, hadoopConf, NameIdentifier.of(catalogInfo.namespace(), 
catalogInfo.name()));
+} else if (config.isSimpleAuth()) {
+  // TODO: change the user 'datastrato' to 'anonymous' and uncomment the 
following code;
+  //  uncomment the following code after the user 'datastrato' is removed 
from the codebase.
+  //  for more, please see 
https://github.com/datastrato/gravitino/issues/4013
+  // UserGroupInformation u =
+  //
UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());
+  // userInfoMap.put(
+  //NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+  //UserInfo.of(u, false, null, null));

Review Comment:
   We have reached an agreement that sub-entities use the parent's 
authentication if no authentication is set. When the catalog authentication is 
simple, a schema or fileset will use the default user `anonymous` to access 
HDFS, so the open questions are:
   - Whether we need to keep these changes.
   - Whether to change the default user name in the Docker image for Hive (NOT 
advised by @xunliu).
   
   
   Before #4013 or something similar is resolved, I think these changes should 
stay commented out. 
   
   
   > Is the prerequisite pull request merged?
   
   Not really; @xunliu advised against it. What about changing the default 
user name `anonymous` to `datastrato`?
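
   The inheritance rule agreed on here — a sub-entity falls back to its 
parent's authentication when it sets none, and finally to a default user — can 
be sketched as a plain lookup (the class, the dotted idents, and the 
`anonymous` default are illustrative, not Gravitino's implementation):

   ```java
   import java.util.HashMap;
   import java.util.Map;

   public class AuthFallbackSketch {
       // entity ident -> configured auth user; absent means "inherit".
       private final Map<String, String> authUsers = new HashMap<>();

       void setAuthUser(String ident, String user) {
           authUsers.put(ident, user);
       }

       /** Resolve the effective user: fileset -> schema -> catalog -> default. */
       String effectiveUser(String ident) {
           String current = ident;
           while (current != null) {
               String user = authUsers.get(current);
               if (user != null) {
                   return user;
               }
               current = parentOf(current);
           }
           return "anonymous"; // default user when nothing is configured
       }

       private static String parentOf(String ident) {
           int dot = ident.lastIndexOf('.');
           return dot < 0 ? null : ident.substring(0, dot);
       }
   }
   ```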
   






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-07 Thread via GitHub


qqqttt123 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1668056937


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -125,20 +168,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
 AuthenticationConfig config = new AuthenticationConfig(conf);
-String authType = config.getAuthType();
 
-if (StringUtils.equalsIgnoreCase(authType, 
AuthenticationMethod.KERBEROS.name())) {
-  hadoopConf.set(
-  HADOOP_SECURITY_AUTHENTICATION,
-  AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-  UserGroupInformation.setConfiguration(hadoopConf);
-  try {
-KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-File keytabFile = 
kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-this.kerberosRealm = 
kerberosClient.login(keytabFile.getAbsolutePath());
-  } catch (IOException e) {
-throw new RuntimeException("Failed to login with Kerberos", e);
-  }
+if (config.isKerberosAuth()) {
+  this.kerberosRealm =
+  initKerberos(
+  conf, hadoopConf, NameIdentifier.of(catalogInfo.namespace(), 
catalogInfo.name()));
+} else if (config.isSimpleAuth()) {
+  // TODO: change the user 'datastrato' to 'anonymous' and uncomment the 
following code;
+  //  uncomment the following code after the user 'datastrato' is removed 
from the codebase.
+  //  for more, please see 
https://github.com/datastrato/gravitino/issues/4013
+  // UserGroupInformation u =
+  //
UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());
+  // userInfoMap.put(
+  //NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+  //UserInfo.of(u, false, null, null));

Review Comment:
   Is the prerequisite pull request merged?






Re: [PR] [#3462] improvement(hadoop-catalog): Support different level(catalog, schema,fileset) of authentication for hadoop catalog. [gravitino]

2024-07-04 Thread via GitHub


yuqi1129 commented on code in PR #3852:
URL: https://github.com/apache/gravitino/pull/3852#discussion_r1665251107


##
catalogs/catalog-hadoop/src/main/java/com/datastrato/gravitino/catalog/hadoop/HadoopCatalogOperations.java:
##
@@ -125,20 +168,20 @@ public void initialize(
 
   private void initAuthentication(Map<String, String> conf, Configuration hadoopConf) {
 AuthenticationConfig config = new AuthenticationConfig(conf);
-String authType = config.getAuthType();
 
-if (StringUtils.equalsIgnoreCase(authType, 
AuthenticationMethod.KERBEROS.name())) {
-  hadoopConf.set(
-  HADOOP_SECURITY_AUTHENTICATION,
-  AuthenticationMethod.KERBEROS.name().toLowerCase(Locale.ROOT));
-  UserGroupInformation.setConfiguration(hadoopConf);
-  try {
-KerberosClient kerberosClient = new KerberosClient(conf, hadoopConf);
-File keytabFile = 
kerberosClient.saveKeyTabFileFromUri(catalogInfo.id());
-this.kerberosRealm = 
kerberosClient.login(keytabFile.getAbsolutePath());
-  } catch (IOException e) {
-throw new RuntimeException("Failed to login with Kerberos", e);
-  }
+if (config.isKerberosAuth()) {
+  this.kerberosRealm =
+  initKerberos(
+  conf, hadoopConf, NameIdentifier.of(catalogInfo.namespace(), 
catalogInfo.name()));
+} else if (config.isSimpleAuth()) {
+  // TODO: change the user 'datastrato' to 'anonymous' and uncomment the 
following code;
+  //  uncomment the following code after the user 'datastrato' is removed 
from the codebase.
+  //  for more, please see 
https://github.com/datastrato/gravitino/issues/4013
+  // UserGroupInformation u =
+  //
UserGroupInformation.createRemoteUser(PrincipalUtils.getCurrentUserName());
+  // userInfoMap.put(
+  //NameIdentifier.of(catalogInfo.namespace(), catalogInfo.name()),
+  //UserInfo.of(u, false, null, null));

Review Comment:
   @qqqttt123 
   Please take a look if you have some free time. 


