linliu-code commented on code in PR #12866:
URL: https://github.com/apache/hudi/pull/12866#discussion_r2105582631
##########
hudi-hadoop-common/src/main/java/org/apache/hudi/io/hadoop/HoodieAvroHFileWriter.java:
##########
@@ -101,25 +104,25 @@ public HoodieAvroHFileWriter(String instantTime, StoragePath file, HoodieHFileCo
this.taskContextSupplier = taskContextSupplier;
this.populateMetaFields = populateMetaFields;
- HFileContext context = new HFileContextBuilder().withBlockSize(hfileConfig.getBlockSize())
- .withCompression(hfileConfig.getCompressionAlgorithm())
- .withCellComparator(hfileConfig.getHFileComparator())
+ HFileContext context = new HFileContext.Builder()
+ .blockSize(hfileConfig.getBlockSize())
+ .compressionCodec(hfileConfig.getCompressionCodec())
.build();
conf.set(CacheConfig.PREFETCH_BLOCKS_ON_OPEN_KEY, String.valueOf(hfileConfig.shouldPrefetchBlocksOnOpen()));
conf.set(HColumnDescriptor.CACHE_DATA_IN_L1, String.valueOf(hfileConfig.shouldCacheDataInL1()));
conf.set(DROP_BEHIND_CACHE_COMPACTION_KEY, String.valueOf(hfileConfig.shouldDropBehindCacheCompaction()));
- CacheConfig cacheConfig = new CacheConfig(conf);
- this.writer = HFile.getWriterFactory(conf, cacheConfig)
- .withPath(fs, this.file)
- .withFileContext(context)
- .create();
-
- writer.appendFileInfo(getUTF8Bytes(HoodieAvroHFileReaderImplBase.SCHEMA_KEY),
- getUTF8Bytes(schema.toString()));
+
+ StorageConfiguration<Configuration> storageConf = new HadoopStorageConfiguration(conf);
+ StoragePath filePath = new StoragePath(this.file.toUri());
Review Comment:
`HoodieWrapperFileSystem` is used here to control file size, since it
caches write stats. To remove all Hadoop-related dependencies from this class,
we have to drop the dependency on `HoodieWrapperFileSystem`, which means these
write stats would need to be cached some other way. I don't have a clear
solution for now.
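
One possible direction, only as a rough sketch: track the bytes written in a
plain `java.io` stream wrapper, so the size check no longer goes through the
wrapper file system. `SizeTrackingOutputStream`, `getBytesWritten`, and
`canWrite` below are hypothetical names for illustration, not existing Hudi
classes or methods, and this assumes the writer can be handed an arbitrary
`OutputStream`.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not part of Hudi): counts bytes as they pass
// through to the wrapped stream, with no Hadoop types involved.
class SizeTrackingOutputStream extends FilterOutputStream {
  private final AtomicLong bytesWritten = new AtomicLong();

  SizeTrackingOutputStream(OutputStream out) {
    super(out);
  }

  @Override
  public void write(int b) throws IOException {
    out.write(b);
    bytesWritten.incrementAndGet();
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // Delegate the whole range directly so bytes are not counted twice
    // via the single-byte write(int) path.
    out.write(b, off, len);
    bytesWritten.addAndGet(len);
  }

  // Total number of bytes written through this stream so far.
  long getBytesWritten() {
    return bytesWritten.get();
  }

  // True while the tracked size is still below the given limit.
  boolean canWrite(long maxFileSize) {
    return bytesWritten.get() < maxFileSize;
  }
}
```

This only counts bytes handed to the stream, so it would not reflect
compression or buffering that happens further down; whether that is accurate
enough for file sizing here is an open question.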