aromanenko-dev commented on a change in pull request #13543:
URL: https://github.com/apache/beam/pull/13543#discussion_r548480861



##########
File path: 
sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java
##########
@@ -475,6 +479,18 @@
       return 
withValueTranslation(function).toBuilder().setValueCoder(coder).build();
     }
 
+    /**
+     * Determines if key-value clone should be skipped or not (default is 
'false'). Hadoop formats
+     * typically work with Writable data structures which are mutable. 
Therefore, this IO will clone
+     * read key-values if they are not in the list of well known immutable 
types. However, in case
+     * user does use key/value translation functions, resulting key-values 
might already be
+     * immutable. In such case, additional copy is unnecessary overhead and 
can be avoided by
+     * setting skip to 'true'.
+     */
+    public Read<K, V> withSkipKeyValueClone(boolean value) {

Review comment:
       Please, make it configurable separately for keys and values like we do 
with other configuration options. 

##########
File path: 
sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java
##########
@@ -475,6 +479,18 @@
       return 
withValueTranslation(function).toBuilder().setValueCoder(coder).build();
     }
 
+    /**
+     * Determines if key-value clone should be skipped or not (default is 
'false'). Hadoop formats

Review comment:
       Please, update `HadoopFormatIO`'s Javadoc as well.

##########
File path: 
sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java
##########
@@ -488,13 +504,16 @@
       if (valueCoder == null) {
         valueCoder = getDefaultCoder(getValueTypeDescriptor(), coderRegistry);
       }
+      boolean skipKeyValueClone = getSkipKeyValueClone() == null ? false : 
getSkipKeyValueClone();

Review comment:
       Please, set default values in `read()` with a builder.

##########
File path: 
sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java
##########
@@ -475,6 +479,18 @@
       return 
withValueTranslation(function).toBuilder().setValueCoder(coder).build();
     }
 
+    /**
+     * Determines if key-value clone should be skipped or not (default is 
'false'). Hadoop formats
+     * typically work with Writable data structures which are mutable. 
Therefore, this IO will clone
+     * read key-values if they are not in the list of well known immutable 
types. However, in case
+     * user does use key/value translation functions, resulting key-values 
might already be

Review comment:
       Should it be used only with `withKeyTranslation()` or 
`withValueTranslation()`, or it can be used independently?  




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to