prashantwason commented on a change in pull request #4449:
URL: https://github.com/apache/hudi/pull/4449#discussion_r790041194



##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java
##########
@@ -162,6 +158,20 @@ protected void createRecordsFromContentBytes() throws IOException {
     return records;
   }
 
+  /**
+   * Serialize the record to byte buffer.
+   *
+   * @param record         - Record to serialize
+   * @param schemaKeyField - Key field in the schema
+   * @return Serialized byte buffer for the record
+   */
+  private byte[] serializeRecord(final IndexedRecord record, final Option<Field> schemaKeyField) {
+    if (schemaKeyField.isPresent()) {
+      record.put(schemaKeyField.get().pos(), "");

Review comment:
       I still see "" being used.

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -77,6 +81,8 @@ public HoodieHFileWriter(String instantTime, Path file, HoodieHFileConfig hfileC
     this.file = HoodieWrapperFileSystem.convertToHoodiePath(file, conf);
     this.fs = (HoodieWrapperFileSystem) this.file.getFileSystem(conf);
     this.hfileConfig = hfileConfig;
+    this.schema = schema;
+    this.schemaRecordKeyField = Option.ofNullable(schema.getField(hfileConfig.getKeyFieldName()));

Review comment:
       Can we simply use HoodieHFileReader.KEY_FIELD_NAME here instead of 
plugging it in through hfileConfig? 
   
   The hfileConfig value is hardcoded to HoodieHFileReader.KEY_FIELD_NAME 
anyway, so this does not provide any benefit. Furthermore, the PR that 
actually fixes the interfaces to provide the key-name config to readers and 
writers will give us greater flexibility in deciding how to pass the configs 
and refactor.

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -122,7 +128,13 @@ public boolean canWrite() {
 
   @Override
   public void writeAvro(String recordKey, IndexedRecord object) throws IOException {
-    byte[] value = HoodieAvroUtils.avroToBytes((GenericRecord)object);
+    byte[] value = HoodieAvroUtils.avroToBytes((GenericRecord) object);

Review comment:
       Here is another way to implement this without the perf issue:
   byte[] value;
   if (schemaRecordKeyField.isPresent()) {
     int keyFieldPos = this.schemaRecordKeyField.get().pos();
     Object origKey = object.get(keyFieldPos);
     // Blank the key in place, serialize, then restore the original key.
     object.put(keyFieldPos, StringUtils.EMPTY_STRING);
     value = HoodieAvroUtils.avroToBytes((GenericRecord) object);
     object.put(keyFieldPos, origKey);
   } else {
     value = HoodieAvroUtils.avroToBytes((GenericRecord) object);
   }
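
   For illustration only, the same blank-then-restore pattern can be sketched in a self-contained form. One variation from the snippet above: the restore sits in a `finally` block, so the caller's record gets its key back even if serialization throws. `KeyBlankingSketch`, `toBytes`, and the `Object[]` record are hypothetical stand-ins for the Hudi and Avro types (they are not part of this PR), so this shows only the control flow, not the actual writer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class KeyBlankingSketch {

  // Hypothetical stand-in for HoodieAvroUtils.avroToBytes:
  // joins the record's fields with '|' and encodes them as UTF-8.
  static byte[] toBytes(Object[] record) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < record.length; i++) {
      if (i > 0) {
        sb.append('|');
      }
      sb.append(record[i]);
    }
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  // Serializes the record with its key field blanked, without copying the record.
  // keyFieldPos plays the role of schemaRecordKeyField.get().pos() (an assumption).
  static byte[] serializeWithoutKey(Object[] record, Optional<Integer> keyFieldPos) {
    if (!keyFieldPos.isPresent()) {
      return toBytes(record);
    }
    int pos = keyFieldPos.get();
    Object origKey = record[pos];
    record[pos] = "";
    try {
      return toBytes(record);
    } finally {
      // Always restore the original key so the caller's record is unchanged.
      record[pos] = origKey;
    }
  }

  public static void main(String[] args) {
    Object[] record = {"key-001", "payload"};
    byte[] value = serializeWithoutKey(record, Optional.of(0));
    System.out.println(new String(value, StandardCharsets.UTF_8)); // prints "|payload"
    System.out.println(record[0]); // prints "key-001"
  }
}
```

   Mutating and restoring in place avoids allocating a copy of the record per write, at the cost of the record being briefly inconsistent, which matters only if another thread reads it concurrently.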



