si1verwind17 opened a new issue, #6526:
URL: https://github.com/apache/hudi/issues/6526

   **Describe the problem you faced**
   
   I'm trying to write a Spark DataFrame to GCS and sync the resulting Hudi table with a standalone Hive metastore, but the sync fails.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   Run the code below with the configs that follow it.
   ```
   from pyspark import SparkContext
   from pyspark.sql import SparkSession

   # spark_config and hudi_options are the configs listed below
   spark = SparkSession(SparkContext(conf=spark_config))
   df = (
       spark.read.format("jdbc")
       .option("driver", "com.mysql.cj.jdbc.Driver")
       .option("url", "jdbc:mysql://<db_host>/<db_name>")
       .option("dbtable", table)
       .option("user", "username")
       .option("password", "db_password")
       .load()
   )

   df.write.format("hudi").options(**hudi_options).mode("append").save("gs://<bucket>/<table_name>")
   ```
   ```
   from pyspark import SparkConf

   spark_config = (
       SparkConf()
       .setAppName("test-sync-metastore")
       .setMaster("spark://<ip>:7077")
       .set("spark.driver.host", "<ip>")
       .set("spark.driver.cores", "1")
       .set("spark.driver.memory", "1000M")
       .set("spark.executor.memory", "1000M")
       .set("spark.executor.cores", "1")
       .set("spark.executor.instances", "1")
       .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
       .set("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
       .set("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
       .set("spark.jars.packages", "org.apache.hudi:hudi-spark3.3-bundle_2.12:0.12.0")
       .set("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
       .set("spark.hadoop.fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
       .set("spark.hive.metastore.warehouse.dir", "gs://<gcs bucket>/dataset")
       .set("spark.sql.warehouse.dir", "gs://<gcs bucket>/dataset")
   )

   hudi_options = {
       "hoodie.table.name": "table",
       "hoodie.datasource.write.recordkey.field": "id",
       "hoodie.datasource.write.table.name": "table",
       "hoodie.datasource.write.operation": "upsert",
       "hoodie.datasource.write.precombine.field": "ts",
       "hoodie.datasource.write.table.type": "MERGE_ON_READ",
       "hoodie.datasource.hive_sync.enable": "true",
       "hoodie.datasource.hive_sync.database": "default",
       "hoodie.datasource.hive_sync.table": "table",
       "hoodie.datasource.hive_sync.username": "username",
       "hoodie.datasource.hive_sync.password": "password",
       "hoodie.datasource.hive_sync.mode": "hms",
       "hoodie.datasource.hive_sync.metastore.uris": "thrift://hive-metastore:9083",
   }
   ```
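
One thing that can be ruled out early is plain network reachability of the metastore from the driver host. The following is a minimal sketch, assuming the `thrift://hive-metastore:9083` URI from the options above. `parse_metastore_uri` and `metastore_reachable` are hypothetical helper names, not part of Hudi or Hive, and a successful TCP connect only shows that the port is open, not that the Thrift service behind it is healthy.

```python
import socket
from urllib.parse import urlparse


def parse_metastore_uri(uri):
    """Split a thrift://host:port URI into (host, port). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "thrift" or not parsed.hostname or parsed.port is None:
        raise ValueError(f"not a thrift://host:port URI: {uri!r}")
    return parsed.hostname, parsed.port


def metastore_reachable(uri, timeout=3.0):
    """Return True if a TCP connection to the metastore port succeeds."""
    host, port = parse_metastore_uri(uri)
    try:
        # Open and immediately close a plain TCP connection; no Thrift call is made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `metastore_reachable("thrift://hive-metastore:9083")` on the machine that runs the Spark driver would then separate a network or DNS problem from a configuration problem inside the sync itself.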
   
   
   **Expected behavior**
   
   The Hudi table should sync with the remote Hive metastore.
   
   **Environment Description**
   
   * Hudi version : 0.12.0
   * Spark version : 3.3.0
   * Hive version : 2.3.9
   * Hadoop version : -
   * Storage (HDFS/S3/GCS..) : GCS
   * Running on Docker? (yes/no) : No
   
   **Stacktrace**
   
   ```
   An error occurred while calling o83.save.
   : org.apache.hudi.exception.HoodieException: Could not sync using the meta sync class org.apache.hudi.hive.HiveSyncTool
        at org.apache.hudi.sync.common.util.oxy.$Proxy55.verifySchema(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:595)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:588)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:655)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
        ... 81 more
   ```
   
   
   
   

