prashantwason commented on a change in pull request #4449:
URL: https://github.com/apache/hudi/pull/4449#discussion_r790041194
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java
##########
@@ -162,6 +158,20 @@ protected void createRecordsFromContentBytes() throws
IOException {
return records;
}
+ /**
+ * Serialize the record to byte buffer.
+ *
+ * @param record - Record to serialize
+ * @param schemaKeyField - Key field in the schema
+ * @return Serialized byte buffer for the record
+ */
+ private byte[] serializeRecord(final IndexedRecord record, final
Option<Field> schemaKeyField) {
+ if (schemaKeyField.isPresent()) {
+ record.put(schemaKeyField.get().pos(), "");
Review comment:
I still see "" being used.
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -77,6 +81,8 @@ public HoodieHFileWriter(String instantTime, Path file,
HoodieHFileConfig hfileC
this.file = HoodieWrapperFileSystem.convertToHoodiePath(file, conf);
this.fs = (HoodieWrapperFileSystem) this.file.getFileSystem(conf);
this.hfileConfig = hfileConfig;
+ this.schema = schema;
+ this.schemaRecordKeyField =
Option.ofNullable(schema.getField(hfileConfig.getKeyFieldName()));
Review comment:
Can we simply use HoodieHFileReader.KEY_FIELD_NAME here instead of
plugging it in through hfileConfig?
The hfileConfig value is hardcoded to HoodieHFileReader.KEY_FIELD_NAME
anyway, so this does not provide any benefit.
Furthermore, the PR that actually fixes the interfaces to pass the
key-name config to readers and writers will give us greater flexibility in
deciding how to pass the configs and refactor.
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -122,7 +128,13 @@ public boolean canWrite() {
@Override
public void writeAvro(String recordKey, IndexedRecord object) throws
IOException {
- byte[] value = HoodieAvroUtils.avroToBytes((GenericRecord)object);
+ byte[] value = HoodieAvroUtils.avroToBytes((GenericRecord) object);
Review comment:
Here is another way to implement this without the perf issue. It blanks
the key field in place, serializes, then restores the original key:
byte[] value;
if (schemaRecordKeyField.isPresent()) {
  int keyFieldPos = this.schemaRecordKeyField.get().pos();
  Object origKey = object.get(keyFieldPos);
  object.put(keyFieldPos, StringUtils.EMPTY_STRING);
  value = HoodieAvroUtils.avroToBytes((GenericRecord) object);
  object.put(keyFieldPos, origKey);
} else {
  value = HoodieAvroUtils.avroToBytes((GenericRecord) object);
}
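The swap-and-restore idea can be tried in isolation with the minimal sketch below. It is only an illustration under stated assumptions: the class KeyBlankingSketch, the Object[] record, and the toBytes helper are hypothetical stand-ins and are not Hudi or Avro APIs. The try/finally is an addition that is not part of the suggestion above; it is there so the key is restored even if serialization throws.

```java
import java.nio.charset.StandardCharsets;

public class KeyBlankingSketch {

    // Hypothetical stand-in for HoodieAvroUtils.avroToBytes:
    // joins the field values with '|' and encodes them as UTF-8.
    static byte[] toBytes(Object[] record) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < record.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append(record[i]);
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Serializes the record with the key field blanked out, then restores
    // the key so the caller's record is left unchanged. A negative
    // keyFieldPos models "schema has no key field" (an assumption of this sketch).
    static byte[] serializeWithoutKey(Object[] record, int keyFieldPos) {
        if (keyFieldPos < 0) {
            return toBytes(record);
        }
        Object origKey = record[keyFieldPos];
        record[keyFieldPos] = "";
        try {
            return toBytes(record);
        } finally {
            // Restore the key even if serialization fails.
            record[keyFieldPos] = origKey;
        }
    }

    public static void main(String[] args) {
        Object[] record = {"key-001", "payload", 42};
        byte[] value = serializeWithoutKey(record, 0);
        System.out.println(new String(value, StandardCharsets.UTF_8)); // prints "|payload|42"
        System.out.println(record[0]);                                 // prints "key-001"
    }
}
```

No copy of the record is made, so the extra cost per write is two field assignments. The trade-off is that the record is briefly mutated, which matters if another thread can read it during serialization.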
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.