matthiasdg commented on issue #3868:
URL: https://github.com/apache/hudi/issues/3868#issuecomment-1069380960
@nsivabalan @codope finally had the time to have a look.
I have the same problem in the spark-shell with both gist examples. We don't
have a Hive server, only a metastore, so I used
```
option("hoodie.datasource.hive_sync.mode", "hms").
option("hoodie.datasource.hive_sync.jdbcurl", "thrift://localhost:9083").
```
(I port-forwarded our metastore running on k8s and used an Azure path to write
to.)
I don't get any errors on write/sync, regardless of the partition
extractor.
(For the slash-encoded day partition, I still had to replace the
`hoodie.datasource.hive_sync.partition_fields` value from the gist with a
single field.)
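For context, here is a minimal sketch of the full write I'm running in the spark-shell for the multi-part case. The table name, record key field, and target path are placeholders, not our actual values:

```scala
// Sketch of the MOR write + HMS sync as run in spark-shell.
// "uuid", the table name, and the abfss:// path are placeholders.
df.write.format("hudi").
  option("hoodie.table.name", "hudi_mor_ts").
  option("hoodie.datasource.write.table.type", "MERGE_ON_READ").
  option("hoodie.datasource.write.recordkey.field", "uuid").
  option("hoodie.datasource.write.partitionpath.field", "year,month,day").
  option("hoodie.datasource.hive_sync.enable", "true").
  option("hoodie.datasource.hive_sync.mode", "hms").
  option("hoodie.datasource.hive_sync.jdbcurl", "thrift://localhost:9083").
  option("hoodie.datasource.hive_sync.partition_fields", "year,month,day").
  option("hoodie.datasource.hive_sync.partition_extractor_class",
    "org.apache.hudi.hive.MultiPartKeysValueExtractor").
  mode("append").
  save("abfss://container@account.dfs.core.windows.net/path")
```

For the slash-encoded variant I swap in `SlashEncodedDayPartitionValueExtractor` and the single partition field as noted above.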
If I then do something like `spark.sql("select * from hudi_mor_ts_ro").show`, I
get in both cases the error
```
22/03/16 18:09:30 ERROR Executor: Exception in task 0.0 in stage 40.0 (TID 61)
java.io.IOException: Required column is missing in data file. Col: [year]
java.io.IOException: Required column is missing in data file. Col: [year]
```
I described earlier (`Col: [year]` with partition fields
`year,month,day` and the MultiPartKeysValue extractor, or `Col: [date]` with a
single partition field `date` and the SlashEncoded extractor).
Let me know if I can try something else (maybe run the metastore locally and
write to local storage to see if that makes a difference)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]