[
https://issues.apache.org/jira/browse/HUDI-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2839:
-----------------------------
Sprint: 2022/05/16, 2022/05/17 (was: 2022/05/16)
> Align configs across Spark datasource, write client, etc
> --------------------------------------------------------
>
> Key: HUDI-2839
> URL: https://issues.apache.org/jira/browse/HUDI-2839
> Project: Apache Hudi
> Issue Type: Improvement
> Components: configs, spark
> Reporter: Ethan Guo
> Assignee: Sagar Sumit
> Priority: Critical
> Fix For: 0.12.0
>
>
> This came up while discussing HUDI-2818. For the same logic, such as
> keygenerator, compaction, clustering, etc., the Spark datasource and the
> write client use different configs, and these configs may conflict. Such
> conflicts can cause unexpected behavior on the write path.
>
> Raymond: I encountered this NPE when trying to run 0.10 over a 0.8 table:
> https://issues.apache.org/jira/browse/HUDI-2818.
> To align configs, do you think we should automatically set
> {{hoodie.table.keygenerator.class}} when the user sets
> {{hoodie.datasource.write.keygenerator.class}}, and vice versa?
> Siva: I guess this is what happens in the regular write path
> (HoodieSparkSqlWriter), i.e. the user sets only
> {{hoodie.datasource.write.keygenerator.class}}, but internally we set
> {{hoodie.table.keygenerator.class}} from the datasource write config.
> Vinoth: {{HoodieConfig}} has an alternatives/fallback mechanism, which is
> something to consider, but overall we should fix these.
> Ethan: When working on compaction/clustering, I also saw different configs
> for the same logic between the Spark datasource and the write client. Maybe
> we can take a pass over all configs later and make them consistent.
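
The alignment idea discussed above (mirror one key onto the other when only one is set, and surface a conflict when both are set differently) can be sketched as follows. This is a minimal illustration in plain Python, not Hudi's actual implementation: the helper `align_configs` and the `ALIASED_KEYS` table are hypothetical names introduced here, and only the two config keys come from the discussion.

```python
# Hypothetical sketch of config alignment; not taken from the Hudi codebase.
DATASOURCE_KEY = "hoodie.datasource.write.keygenerator.class"
TABLE_KEY = "hoodie.table.keygenerator.class"

# Pairs of keys that are expected to carry the same value (assumed list).
ALIASED_KEYS = [(DATASOURCE_KEY, TABLE_KEY)]


def align_configs(props):
    """Return a copy of props in which each aliased pair holds one value.

    If only one key of a pair is set, its value is mirrored to the other.
    If both are set to different values, raise instead of picking one.
    """
    aligned = dict(props)
    for first, second in ALIASED_KEYS:
        a = aligned.get(first)
        b = aligned.get(second)
        if a is not None and b is not None and a != b:
            raise ValueError(
                f"Conflicting values: {first}={a!r} vs {second}={b!r}"
            )
        value = a if a is not None else b
        if value is not None:
            aligned[first] = value
            aligned[second] = value
    return aligned


if __name__ == "__main__":
    # Only the datasource key is set; the table key is filled in from it.
    user_props = {DATASOURCE_KEY: "org.apache.hudi.keygen.SimpleKeyGenerator"}
    print(align_configs(user_props)[TABLE_KEY])
```

Raising on a mismatch, rather than silently preferring one key, is a design choice of this sketch; a fallback-style lookup that prefers one key over the other would be an equally plausible reading of the discussion.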
--
This message was sent by Atlassian Jira
(v8.20.7#820007)