matthiasdg commented on code in PR #7068:
URL: https://github.com/apache/hudi/pull/7068#discussion_r1016797549


##########
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java:
##########
@@ -108,45 +108,45 @@ public static class HiveSyncConfigParams {
             + "instead of org.apache.hudi package. Use this when you are in 
the process of migrating from "
             + "com.uber.hoodie to org.apache.hudi. Stop using this after you 
migrated the table definition to "
             + "org.apache.hudi input format.")
-    public Boolean usePreApacheInputFormat;
+    public boolean usePreApacheInputFormat;
     @Deprecated
-    @Parameter(names = {"--use-jdbc"}, description = "Hive jdbc connect url")
-    public Boolean useJdbc;
+    @Parameter(names = {"--use-jdbc"}, description = "Hive jdbc connect url", 
arity = 1)
+    public Boolean useJdbc = true;
     @Parameter(names = {"--metastore-uris"}, description = "Hive metastore 
uris")
     public String metastoreUris;
     @Parameter(names = {"--sync-mode"}, description = "Mode to choose for Hive 
ops. Valid values are hms,glue,jdbc and hiveql")
     public String syncMode;
-    @Parameter(names = {"--auto-create-database"}, description = "Auto create 
hive database")
-    public Boolean autoCreateDatabase;
+    @Parameter(names = {"--auto-create-database"}, description = "Auto create 
hive database", arity = 1)
+    public boolean autoCreateDatabase = true;
     @Parameter(names = {"--ignore-exceptions"}, description = "Ignore hive 
exceptions")
-    public Boolean ignoreExceptions;
+    public boolean ignoreExceptions;

Review Comment:
   @xicm  @xushiyan I am a bit late to the party, but just wanted to point out 
https://github.com/cbeust/jcommander/issues/378
   This means you can still override parameters that default to true unlike 
what you seem to believe. Worse, if they're set to true by default, adding them 
in the cli tool will toggle them to false. Or adding them multiple times will 
also change the effect (easy to verify now that a test was added).
   
   Therefore, the behavior in hudi has already changed since 
https://github.com/apache/hudi/pull/4175; before that the Boolean JCommander 
parameters had default values, such as `public Boolean 
syncAsSparkDataSourceTable = true;`, which meant adding `--spark-datasource` 
actually set this to false. Now on the other hand, since the default value 
comes from `HIVE_SYNC_AS_DATA_SOURCE_TABLE` and not through JCommander, adding 
`--spark-datasource` would also set this to true. So it is already quite 
confusing. With no defaults set in JCommander like currently is the case, it 
may make more sense (at least it's consistent), but then to disable 
spark-datasource, I think I'll actually have to add `--spark-datasource 
--spark-datasource`; if I don't add it, the default 
`HIVE_SYNC_AS_DATA_SOURCE_TABLE` value (=true) is used, if I add it once -> 
JCommander also sets it to true, twice -> JCommander flips it to false.
   
   Based on this I'm personally in favor of enforcing `arity=1` for all 
booleans, makes it clear what will happen and will throw an error if someone 
does not specify an arity, so then it's at least clear that the behavior 
changed (which now isn't).
   But this probably deserves a separate issue?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to