satishkotha commented on a change in pull request #2263:
URL: https://github.com/apache/hudi/pull/2263#discussion_r532818631
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/SparkLazyInsertIterable.java
##########
@@ -34,14 +35,18 @@
public class SparkLazyInsertIterable<T extends HoodieRecordPayload> extends
HoodieLazyInsertIterable<T> {
+ private boolean addMetadataFields;
Review comment:
With regular bulk insert, the RDD contains only user-specified columns. With
bulk-insert-based clustering, the RDD also contains the hoodie internal
columns. So I am adding this flag to make the code work for both cases. Let me
know if you want me to reorganize this differently.
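To illustrate the flag's two paths, here is a minimal standalone sketch (class and method names are hypothetical, not Hudi's actual API; only the five `_hoodie_*` metadata column names are real). When the flag is set, the write schema is built by prepending the metadata columns to the user columns; when it is not set (the clustering path), the incoming schema is assumed to already carry them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of what an addMetadataFields flag controls.
public class MetadataFieldsSketch {

  // Hudi's standard metadata columns (real column names).
  static final List<String> META_COLUMNS = Arrays.asList(
      "_hoodie_commit_time", "_hoodie_commit_seqno", "_hoodie_record_key",
      "_hoodie_partition_path", "_hoodie_file_name");

  // Build the final write schema from the user columns,
  // optionally adding the metadata columns.
  static List<String> writeColumns(List<String> userColumns, boolean addMetadataFields) {
    if (!addMetadataFields) {
      // Clustering path: the input already contains the hoodie internal columns.
      return userColumns;
    }
    // Regular bulk insert path: prepend metadata columns to the user schema.
    List<String> out = new ArrayList<>(META_COLUMNS);
    out.addAll(userColumns);
    return out;
  }

  public static void main(String[] args) {
    List<String> user = Arrays.asList("uuid", "ts", "rider");
    // Regular bulk insert: 5 metadata columns + 3 user columns.
    System.out.println(writeColumns(user, true));
    // Clustering: schema passes through unchanged.
    System.out.println(writeColumns(user, false));
  }
}
```

The same idea applies here: the single boolean lets one code path serve both callers instead of duplicating the iterable for the clustering case.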
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/BulkInsertMapFunction.java
##########
@@ -41,20 +41,22 @@
private HoodieWriteConfig config;
private HoodieTable hoodieTable;
private List<String> fileIDPrefixes;
+ private boolean addMetadataFields;
public BulkInsertMapFunction(String instantTime, boolean areRecordsSorted,
HoodieWriteConfig config, HoodieTable hoodieTable,
- List<String> fileIDPrefixes) {
+ List<String> fileIDPrefixes, boolean addMetadataFields) {
Review comment:
With regular bulk insert, the RDD contains only user-specified columns. With
bulk-insert-based clustering, the RDD also contains the hoodie internal
columns. So I am adding this flag to make the code work for both cases. Let me
know if you want me to reorganize this differently.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]