[GitHub] [iceberg] kbendick commented on a change in pull request #1515: Flink: Support sink when disable flink checkpoint disable

GitBox Sat, 26 Sep 2020 21:40:57 -0700


kbendick commented on a change in pull request #1515:
URL: https://github.com/apache/iceberg/pull/1515#discussion_r495527775




##########
File path: flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -153,6 +158,11 @@ public Builder hadoopConf(Configuration newHadoopConf) {
       return this;
     }
 
+    public Builder flinkConf(org.apache.flink.configuration.Configuration 
config) {
+      this.flinkConf = config != null ? config : new 
org.apache.flink.configuration.Configuration();
+      return this;
+    }

Review comment:
       cc @JingsongLi @openinx 

##########
File path: flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -153,6 +158,11 @@ public Builder hadoopConf(Configuration newHadoopConf) {
       return this;
     }
 
+    public Builder flinkConf(org.apache.flink.configuration.Configuration 
config) {
+      this.flinkConf = config != null ? config : new 
org.apache.flink.configuration.Configuration();
+      return this;
+    }

Review comment:
       By allowing user's to pass in `null` flink configuration objects by 
silently instantiating a new one, are we making it harder for developers to 
tell if their configuration they've set up elsewhere is being respected? For 
example, if at the beginning of my program I used `ParamaterTool` somehow to 
get my configuration - e.g. reading from a file via 
`ParameterTool.fromPropertiesFile(file)`, or possibly from CLI args via 
`ParameterTool.fromArgs(args)`, or one of the several ways to parse 
configuration properties and parameters view `ParameterToool`, would this 
instantiation of a new config pick up all of these changes?
   
   If not, I think we should disallow null here - it really doesn't make sense 
to me to call this method with null. In my opinion, calling a builder with null 
for the configuration option is typically a bug. And it seems like a potential 
cause of difficult to track bugs for people who are less aware of the 
complexities of Flink's configuration object and the various ways it can be 
derived (via `set` in the SQL cli, via `ParameterTool`, via `flink-conf.yaml`, 
via table catalog properties, etc).
   
   However, it's possible I'm mistaken and that by this point in generation of 
the job graph that new Configuration will somehow have hooked into everything 
that's needed to get the data however the user intended. And I will admit it's 
possible that I am mixing up `Configuration` with several related, but 
separate, entities. If so, please let me know and I do apologize.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] kbendick commented on a change in pull request #1515: Flink: Support sink when disable flink checkpoint disable

Reply via email to