satishkotha commented on a change in pull request #2263:
URL: https://github.com/apache/hudi/pull/2263#discussion_r532818631
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/SparkLazyInsertIterable.java
##########
@@ -34,14 +35,18 @@
public class SparkLazyInsertIterable<T extends HoodieRecordPayload> extends
HoodieLazyInsertIterable<T> {
+ private boolean addMetadataFields;
Review comment:
With regular bulk insert, the RDD contains only user-specified columns. With
bulk-insert-based clustering, the RDD also contains the hoodie internal
columns. So I am adding this flag to make the code work for both cases. Let me
know if you want me to reorganize this differently.
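To illustrate the flag's two paths, here is a minimal standalone sketch (class and method names are hypothetical, not Hudi's actual API; only the five `_hoodie_*` metadata column names are real). When the flag is set, the write schema is built by prepending the metadata columns to the user columns; when it is not set (the clustering path), the incoming schema is assumed to already carry them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of what an addMetadataFields flag controls.
public class MetadataFieldsSketch {

  // Hudi's standard metadata columns (real column names).
  static final List<String> META_COLUMNS = Arrays.asList(
      "_hoodie_commit_time", "_hoodie_commit_seqno", "_hoodie_record_key",
      "_hoodie_partition_path", "_hoodie_file_name");

  // Build the final write schema from the user columns,
  // optionally adding the metadata columns.
  static List<String> writeColumns(List<String> userColumns, boolean addMetadataFields) {
    if (!addMetadataFields) {
      // Clustering path: the input already contains the hoodie internal columns.
      return userColumns;
    }
    // Regular bulk insert path: prepend metadata columns to the user schema.
    List<String> out = new ArrayList<>(META_COLUMNS);
    out.addAll(userColumns);
    return out;
  }

  public static void main(String[] args) {
    List<String> user = Arrays.asList("uuid", "ts", "rider");
    // Regular bulk insert: 5 metadata columns + 3 user columns.
    System.out.println(writeColumns(user, true));
    // Clustering: schema passes through unchanged.
    System.out.println(writeColumns(user, false));
  }
}
```

The same idea applies here: the single boolean lets one code path serve both callers instead of duplicating the iterable for the clustering case.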
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/BulkInsertMapFunction.java
##########
@@ -41,20 +41,22 @@
private HoodieWriteConfig config;
private HoodieTable hoodieTable;
private List<String> fileIDPrefixes;
+ private boolean addMetadataFields;
public BulkInsertMapFunction(String instantTime, boolean areRecordsSorted,
HoodieWriteConfig config, HoodieTable hoodieTable,
- List<String> fileIDPrefixes) {
+ List<String> fileIDPrefixes, boolean addMetadataFields) {
Review comment:
With regular bulk insert, the RDD contains only user-specified columns. With
bulk-insert-based clustering, the RDD also contains the hoodie internal
columns. So I am adding this flag to make the code work for both cases. Let me
know if you want me to reorganize this differently.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]