matthiasdg commented on issue #3868:
URL: https://github.com/apache/hudi/issues/3868#issuecomment-1069380960
@nsivabalan @codope finally had the time to have a look.
I have the same problem in the spark-shell with both gist examples. We don't
have a Hive server, only a metastore, so I used
```
option("hoodie.datasource.hive_sync.mode", "hms").
option("hoodie.datasource.hive_sync.jdbcurl", "thrift://localhost:9083").
```
(I port-forwarded our metastore running on k8s and used an Azure path to write
to.)
I don't get any errors on write/sync, regardless of the partition
extractor.
(For the slash-encoded day partition, I still had to replace the
`hoodie.datasource.hive_sync.partition_fields` value from the gist with a
single field.)
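For context, here is a minimal sketch of the full write I'm running in the spark-shell for the multi-part case. The table name, record key field, and target path are placeholders, not our actual values:

```scala
// Sketch of the MOR write + HMS sync as run in spark-shell.
// "uuid", the table name, and the abfss:// path are placeholders.
df.write.format("hudi").
  option("hoodie.table.name", "hudi_mor_ts").
  option("hoodie.datasource.write.table.type", "MERGE_ON_READ").
  option("hoodie.datasource.write.recordkey.field", "uuid").
  option("hoodie.datasource.write.partitionpath.field", "year,month,day").
  option("hoodie.datasource.hive_sync.enable", "true").
  option("hoodie.datasource.hive_sync.mode", "hms").
  option("hoodie.datasource.hive_sync.jdbcurl", "thrift://localhost:9083").
  option("hoodie.datasource.hive_sync.partition_fields", "year,month,day").
  option("hoodie.datasource.hive_sync.partition_extractor_class",
    "org.apache.hudi.hive.MultiPartKeysValueExtractor").
  mode("append").
  save("abfss://container@account.dfs.core.windows.net/path")
```

For the slash-encoded variant I swap in `SlashEncodedDayPartitionValueExtractor` and the single partition field as noted above.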
If I then do something like `spark.sql("select * from hudi_mor_ts_ro").show`, I
get in both cases the error
```
22/03/16 18:09:30 ERROR Executor: Exception in task 0.0 in stage 40.0 (TID 61)
java.io.IOException: Required column is missing in data file. Col: [year]
java.io.IOException: Required column is missing in data file. Col: [year]
```
I described earlier (`Col: [year]` with partition fields
`year,month,day` and the MultiPartKeysValue extractor, or `Col: [date]` with a
single partition field `date` and the SlashEncoded extractor).
Let me know if I can try something else (maybe run the metastore locally and
write to local storage to see if that makes a difference)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]