xiaozhch5 commented on a change in pull request #3771:
URL: https://github.com/apache/hudi/pull/3771#discussion_r817901694



##########
File path: 
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java
##########
@@ -77,6 +77,10 @@ public HiveSyncTool(HiveSyncConfig cfg, HiveConf configuration, FileSystem fs) {
     super(configuration.getAllProperties(), fs);
 
     try {
+      if (cfg.useKerberos) {
+        configuration.set("hive.metastore.sasl.enabled", "true");
+        configuration.set("hive.metastore.kerberos.principal", cfg.kerberosPrincipal);

Review comment:
   Hello, I tested the PR and could not sync Hudi to Hive 3 with it, but I got the sync working with the following parameters and configuration changes:
   ```java
   if (cfg.enableKerberos) {
     // Point the JVM at krb5.conf before any Kerberos call is made.
     System.setProperty("java.security.krb5.conf", cfg.krb5Conf);

     // Switch Hadoop security to Kerberos and log in from the keytab.
     Configuration conf = new Configuration();
     conf.set("hadoop.security.authentication", "kerberos");
     conf.set("kerberos.principal", cfg.principal);
     UserGroupInformation.setConfiguration(conf);
     UserGroupInformation.loginUserFromKeytab(cfg.keytabName, cfg.keytabFile);

     // Enable SASL on the metastore client and pass the metastore principal and keytab.
     configuration.set(HiveConf.ConfVars.METASTORE_USE_THRIFT_SASL.varname, "true");
     configuration.set(HiveConf.ConfVars.METASTORE_KERBEROS_PRINCIPAL.varname, cfg.principal);
     configuration.set(HiveConf.ConfVars.METASTORE_KERBEROS_KEYTAB_FILE.varname, cfg.keytabFile);
   }
   ```
   The following is an explanation of each parameter:
   
   * cfg.krb5Conf: Location of krb5.conf; defaults to /etc/krb5.conf.
   * cfg.principal: Hive Metastore principal, such as hive/[email protected]
   * cfg.keytabFile: Path to the Hive Metastore keytab on the hosts the tasks are assigned to.
   * cfg.keytabName: Principal stored in the keytab above, such as hive/host144
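
To make the relation between cfg.principal and cfg.keytabName concrete, here is a rough sketch that splits a Kerberos principal of the form primary/instance@REALM into its parts. `PrincipalName` is a hypothetical helper, not part of Hudi or Hadoop, and the realm `EXAMPLE.COM` is only a placeholder. It assumes, based on the example values above, that the keytab name is the principal without its realm.

```java
/** Hypothetical helper (not a Hudi or Hadoop class): splits primary/instance@REALM. */
final class PrincipalName {
    final String primary;   // service or user part, e.g. "hive"
    final String instance;  // host part, e.g. "host144"; empty when absent
    final String realm;     // Kerberos realm; empty when absent

    private PrincipalName(String primary, String instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    static PrincipalName parse(String principal) {
        if (principal == null || principal.isEmpty()) {
            throw new IllegalArgumentException("principal must not be empty");
        }
        String name = principal;
        String realm = "";
        // Everything after the last '@' is the realm.
        int at = principal.lastIndexOf('@');
        if (at >= 0) {
            name = principal.substring(0, at);
            realm = principal.substring(at + 1);
        }
        String primary = name;
        String instance = "";
        // Everything after the first '/' is the instance.
        int slash = name.indexOf('/');
        if (slash >= 0) {
            primary = name.substring(0, slash);
            instance = name.substring(slash + 1);
        }
        return new PrincipalName(primary, instance, realm);
    }

    /** Returns primary/instance without the realm (assumed shape of the keytab name). */
    String withoutRealm() {
        return instance.isEmpty() ? primary : primary + "/" + instance;
    }
}
```

With a placeholder realm, `PrincipalName.parse("hive/host144@EXAMPLE.COM").withoutRealm()` yields `hive/host144`, which matches the shape of the keytab name in the example.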
   
   Before starting the Flink cluster, I distribute the Hive Metastore keytab to the same location on every node of the cluster, for example /home/keydir/hive/hive.service.keytab.
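
Since the sync only works when the keytab exists at the same path on every node, a fail-fast check before the Kerberos login might help. This is only a sketch: `KeytabCheck` and `requireReadable` are hypothetical names, not something in this PR, and the check uses plain `java.nio.file` calls.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-flight check (not part of Hudi): verifies a file is present and readable. */
final class KeytabCheck {
    private KeytabCheck() {
    }

    /** Throws IllegalStateException if the path is not a readable regular file on this host. */
    static void requireReadable(String description, String path) {
        Path p = Paths.get(path);
        if (!Files.isRegularFile(p) || !Files.isReadable(p)) {
            throw new IllegalStateException(description + " is missing or not readable: " + path);
        }
    }
}
```

It could be called once for cfg.keytabFile and once for cfg.krb5Conf before logging in, so a node without the keytab reports the missing path directly.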
   
   I then start a Flink cluster in YARN session mode using the Hive Metastore keytab and submit the following SQL statements to the cluster.
   
   ```sql
   CREATE TABLE sourceT (
     uuid varchar(20),
     name varchar(10),
     age int,
     ts timestamp(3),
     `partition` varchar(20)
   ) WITH (
     'connector' = 'datagen',
     'rows-per-second' = '1'
   );
   
   create table t2(
     uuid varchar(20),
     name varchar(10),
     age int,
     ts timestamp(3),
     `partition` varchar(20)
   )
   with (
     'connector' = 'hudi',
     'path' = 'hdfs://host146:8020/tmp/t2',
     'table.type' = 'MERGE_ON_READ',
     'write.bucket_assign.tasks' = '2',
     'write.tasks' = '2',
     'hive_sync.enable' = 'true',
     'hive_sync.mode' = 'hms',
     'hive_sync.metastore.uris' = 'thrift://host145:9083',
     'hive_sync.db' = 'default',
     'hive_sync.table' = 't2',
     'hive_sync.kerberos.enable' = 'true',
     'hive_sync.kerberos.krb5.conf' = '/etc/krb5.conf', 
     'hive_sync.kerberos.principal' = 'hive/[email protected]',
     'hive_sync.kerberos.keytab.file' = '/home/keydir/hive/hive.service.keytab',
     'hive_sync.kerberos.keytab.name' = 'hive/host144'
   );
   
   insert into t2 select * from sourceT;
   ```
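
For illustration, the `hive_sync.kerberos.*` options above could be read into the cfg fields roughly like this. The option keys and the /etc/krb5.conf default come from this comment. The class `KerberosSyncOptions`, the `fromOptions` method, the "disabled unless set" default, and the validation rules are my assumptions, not the PR's actual code.

```java
import java.util.Map;

/** Hypothetical holder for the hive_sync.kerberos.* options (not the PR's actual class). */
final class KerberosSyncOptions {
    final boolean enabled;
    final String krb5Conf;
    final String principal;
    final String keytabFile;
    final String keytabName;

    private KerberosSyncOptions(boolean enabled, String krb5Conf, String principal,
                                String keytabFile, String keytabName) {
        this.enabled = enabled;
        this.krb5Conf = krb5Conf;
        this.principal = principal;
        this.keytabFile = keytabFile;
        this.keytabName = keytabName;
    }

    static KerberosSyncOptions fromOptions(Map<String, String> options) {
        // Assumption: Kerberos stays off unless explicitly enabled.
        boolean enabled = Boolean.parseBoolean(
            options.getOrDefault("hive_sync.kerberos.enable", "false"));
        String krb5Conf = options.getOrDefault("hive_sync.kerberos.krb5.conf", "/etc/krb5.conf");
        if (!enabled) {
            // Nothing else is required when Kerberos is disabled.
            return new KerberosSyncOptions(false, krb5Conf,
                options.getOrDefault("hive_sync.kerberos.principal", ""),
                options.getOrDefault("hive_sync.kerberos.keytab.file", ""),
                options.getOrDefault("hive_sync.kerberos.keytab.name", ""));
        }
        return new KerberosSyncOptions(true, krb5Conf,
            required(options, "hive_sync.kerberos.principal"),
            required(options, "hive_sync.kerberos.keytab.file"),
            required(options, "hive_sync.kerberos.keytab.name"));
    }

    /** Returns the trimmed value, or throws if the option is missing or blank. */
    private static String required(Map<String, String> options, String key) {
        String value = options.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required option: " + key);
        }
        return value.trim();
    }
}
```

Failing on a missing principal or keytab at this point surfaces a configuration mistake when the job is submitted, rather than at the first metastore call.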
   
   The result looks like this:
   
   
![image](https://user-images.githubusercontent.com/46479816/156407399-3c7ede7c-b5d5-4252-bd7b-98025d2d865f.png)
   
   
![image](https://user-images.githubusercontent.com/46479816/156408301-eecf95fa-8e1e-4e0b-994e-866cc09d3ec3.png)
   
   



