yihua commented on code in PR #12866:
URL: https://github.com/apache/hudi/pull/12866#discussion_r1978110682


##########
hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/HFileUtils.java:
##########
@@ -207,25 +194,21 @@ public byte[] serializeRecordsToLogBlock(HoodieStorage storage,
       sortedRecordsMap.put(recordKey, recordBytesList);
     }
 
-    HFile.Writer writer = HFile.getWriterFactory(conf, cacheConfig)
-        .withOutputStream(ostream).withFileContext(context).create();
-
-    // Write the records
+    HFileContext context = HFileContext.builder().build();
+    HFileWriter writer = new HFileWriterImpl(context, ostream);
     sortedRecordsMap.forEach((recordKey, recordBytesList) -> {
       for (byte[] recordBytes : recordBytesList) {
         try {
-          KeyValue kv = new KeyValue(recordKey.getBytes(), null, null, recordBytes);
-          writer.append(kv);
+          writer.append(recordKey.getBytes(StandardCharsets.UTF_8), recordBytes);

Review Comment:
   Let's remove all Hadoop class usage from this class and move `HFileUtils` to `hudi-common`.



##########
hudi-hadoop-common/src/main/java/org/apache/hudi/io/hadoop/HoodieAvroFileWriterFactory.java:
##########
@@ -98,8 +98,9 @@ protected HoodieFileWriter newHFileFileWriter(
       String instantTime, StoragePath path, HoodieConfig config, Schema schema,
       TaskContextSupplier taskContextSupplier) throws IOException {
     BloomFilter filter = createBloomFilter(config);
-    HoodieHFileConfig hfileConfig = new HoodieHFileConfig(storage.getConf().unwrapAs(Configuration.class),
-        Compression.Algorithm.valueOf(
+    HoodieHFileConfig hfileConfig = new HoodieHFileConfig(
+        storage.getConf().unwrapAs(Configuration.class),

Review Comment:
   Let's pass the `StorageConfiguration` instance in directly here instead of unwrapping it to `Configuration`, to avoid Hadoop class usage.
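
   A rough sketch of the idea, not the actual Hudi API: `StorageConf`, `MapStorageConf`, `HFileConfig`, `newHFileConfig`, and the property keys below are all hypothetical stand-ins invented for illustration. The point is only that the config holder keeps the engine-neutral wrapper and reads plain properties from it, so nothing in the call chain needs a Hadoop type.

```java
import java.util.HashMap;
import java.util.Map;

public class HFileConfigSketch {

  /** Hypothetical stand-in for an engine-neutral storage configuration wrapper. */
  interface StorageConf {
    String getString(String key, String defaultValue);
  }

  /** Hypothetical map-backed implementation, used only to make the sketch runnable. */
  static final class MapStorageConf implements StorageConf {
    private final Map<String, String> props = new HashMap<>();

    MapStorageConf set(String key, String value) {
      props.put(key, value);
      return this;
    }

    @Override
    public String getString(String key, String defaultValue) {
      return props.getOrDefault(key, defaultValue);
    }
  }

  /** Hypothetical config holder that stores the wrapper, not a Hadoop Configuration. */
  static final class HFileConfig {
    private final StorageConf storageConf;
    private final String compression;
    private final int blockSize;

    HFileConfig(StorageConf storageConf, String compression, int blockSize) {
      this.storageConf = storageConf;
      this.compression = compression;
      this.blockSize = blockSize;
    }

    StorageConf storageConf() {
      return storageConf;
    }

    String compression() {
      return compression;
    }

    int blockSize() {
      return blockSize;
    }
  }

  /** Builds the config from the wrapper directly; the keys are made up for this sketch. */
  static HFileConfig newHFileConfig(StorageConf conf) {
    String compression = conf.getString("hfile.compression", "GZ");
    int blockSize = Integer.parseInt(conf.getString("hfile.block.size", "65536"));
    return new HFileConfig(conf, compression, blockSize);
  }

  public static void main(String[] args) {
    StorageConf conf = new MapStorageConf().set("hfile.compression", "SNAPPY");
    HFileConfig config = newHFileConfig(conf);
    System.out.println(config.compression() + " " + config.blockSize());
  }
}
```

   With this shape, the factory would hand over `storage.getConf()` as-is, and any code that still needs a Hadoop `Configuration` could unwrap it at the last possible moment inside the Hadoop-specific module.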


