nsivabalan commented on a change in pull request #3315:
URL: https://github.com/apache/hudi/pull/3315#discussion_r675342299



##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java
##########
@@ -80,6 +80,10 @@
   private final HoodieTableMetaClient hoodieTableMetaClient;
   // Merge strategy to use when combining records from log
   private final String payloadClassFQN;
+  // simple recordKey field
+  private Option<String> simpleRecordKeyFieldOpt = Option.empty();

Review comment:
       On the query side, I am naming all variables with the assumption that if 
populateMetaFields is set to false, then key gen will be simpleKeyGen as we 
would failed the operation during write itself. 
   Where as on the write path, we will serialize all key gen properties. and 
hence recordKey property and partition path property is not prefixed with 
"simple". 
   

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieFileSliceReader.java
##########
@@ -36,11 +38,14 @@
   private Iterator<HoodieRecord<? extends HoodieRecordPayload>> 
recordsIterator;
 
   public static <R extends IndexedRecord, T extends HoodieRecordPayload> 
HoodieFileSliceReader getFileSliceReader(
-      HoodieFileReader<R> baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass) throws IOException {
+      HoodieFileReader<R> baseFileReader, HoodieMergedLogRecordScanner 
scanner, Schema schema, String payloadClass,
+      Option<String> simpleRecordKeyFieldOpt, Option<String> 
simplePartitionPathFieldOpt) throws IOException {

Review comment:
       Initially I had 3 args, populateMetaFields, simpleRecordKey and 
simplePartitionPath. Based on feedback for COW patch, have changed it this way. 
   But just for this class, I feel we could go with 3 args and then in line 46 
call either method depending on populateMetaFields. Mainly bcoz, caller of this 
method actually adds Option and sends to this method, but then in line 48, we 
again remove the Option and pass the argument. So, why not keep it simple. I am 
fine either ways, let me know your thoughts

##########
File path: 
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -128,8 +129,11 @@ object HoodieSparkSqlWriter {
           .setPayloadClassName(hoodieConfig.getString(PAYLOAD_CLASS_OPT_KEY))
           
.setPreCombineField(hoodieConfig.getStringOrDefault(PRECOMBINE_FIELD_OPT_KEY, 
null))
           .setPartitionColumns(partitionColumns)
-          
.setPopulateMetaFields(parameters.getOrElse(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(),
 HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()).toBoolean)
+          .setPopulateMetaFields(populateMetaFields)
+          .setRecordKeyFields(hoodieConfig.getString(RECORDKEY_FIELD_OPT_KEY))
+          
.setPartitionColumns(hoodieConfig.getString(PARTITIONPATH_FIELD_OPT_KEY))

Review comment:
       So, we are adding record keys, partition path and key gen class prop for 
all tables at spark datasource layer. 

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -324,6 +324,13 @@ public void validateTableProperties(Properties properties, 
WriteOperationType op
         && Boolean.parseBoolean((String) 
properties.getOrDefault(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key(), 
HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.defaultValue()))) {
       throw new 
HoodieException(HoodieTableConfig.HOODIE_POPULATE_META_FIELDS.key() + " already 
disabled for the table. Can't be re-enabled back");
     }
+
+    // meta fields can be disabled only with SimpleKeyGenerator
+    if (!getTableConfig().populateMetaFields()
+        && 
!properties.getProperty(HoodieTableConfig.HOODIE_TABLE_KEY_GENERATOR_CLASS.key()).equals("org.apache.hudi.keygen.SimpleKeyGenerator"))
 {

Review comment:
       SimpleKeyGenerator class is not reachable from here and so have hard 
coded. Let me know if there is some other way to not hard code the class 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to