matthiasdg commented on code in PR #7068:
URL: https://github.com/apache/hudi/pull/7068#discussion_r1016797549
##########
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java:
##########
@@ -108,45 +108,45 @@ public static class HiveSyncConfigParams {
+ "instead of org.apache.hudi package. Use this when you are in
the process of migrating from "
+ "com.uber.hoodie to org.apache.hudi. Stop using this after you
migrated the table definition to "
+ "org.apache.hudi input format.")
- public Boolean usePreApacheInputFormat;
+ public boolean usePreApacheInputFormat;
@Deprecated
- @Parameter(names = {"--use-jdbc"}, description = "Hive jdbc connect url")
- public Boolean useJdbc;
+ @Parameter(names = {"--use-jdbc"}, description = "Hive jdbc connect url",
arity = 1)
+ public Boolean useJdbc = true;
@Parameter(names = {"--metastore-uris"}, description = "Hive metastore
uris")
public String metastoreUris;
@Parameter(names = {"--sync-mode"}, description = "Mode to choose for Hive
ops. Valid values are hms,glue,jdbc and hiveql")
public String syncMode;
- @Parameter(names = {"--auto-create-database"}, description = "Auto create
hive database")
- public Boolean autoCreateDatabase;
+ @Parameter(names = {"--auto-create-database"}, description = "Auto create
hive database", arity = 1)
+ public boolean autoCreateDatabase = true;
@Parameter(names = {"--ignore-exceptions"}, description = "Ignore hive
exceptions")
- public Boolean ignoreExceptions;
+ public boolean ignoreExceptions;
Review Comment:
@xicm @xushiyan I am a bit late to the party, but just wanted to point out
https://github.com/cbeust/jcommander/issues/378
This means you can still override parameters that default to true unlike
what you seem to believe. Worse, if they're set to true by default, adding them
in the cli tool will toggle them to false. Or adding them multiple times will
also change the effect (easy to verify now that a test was added).
Therefore, the behavior in hudi has already changed since
https://github.com/apache/hudi/pull/4175; before that the Boolean JCommander
parameters had default values, such as `public Boolean
syncAsSparkDataSourceTable = true;`, which meant adding `--spark-datasource`
actually set this to false. Now on the other hand, since the default value
comes from `HIVE_SYNC_AS_DATA_SOURCE_TABLE` and not through JCommander, adding
`--spark-datasource` would also set this to true. So it is already quite
confusing. With no defaults set in JCommander like currently is the case, it
may make more sense (at least it's consistent), but then to disable
spark-datasource, I think I'll actually have to add `--spark-datasource
--spark-datasource`; if I don't add it, the default
`HIVE_SYNC_AS_DATA_SOURCE_TABLE` value (=true) is used, if I add it once ->
JCommander also sets it to true, twice -> JCommander flips it to false.
Based on this I'm personally in favor of enforcing `arity=1` for all
booleans, makes it clear what will happen and will throw an error if someone
does not specify an arity, so then it's at least clear that the behavior
changed (which now isn't).
But this probably deserves a separate issue?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]