yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593562711
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java:
##########
@@ -459,7 +457,7 @@ public void testHugeLogFileWrite() throws IOException,
URISyntaxException, Inter
header.put(HoodieLogBlock.HeaderMetadataType.SCHEMA,
getSimpleSchema().toString());
byte[] dataBlockContentBytes = getDataBlock(DEFAULT_DATA_BLOCK_TYPE,
records, header).getContentBytes();
HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLoc = new
HoodieLogBlock.HoodieLogBlockContentLocation(
- HadoopFSUtils.getStorageConf(new Configuration()), null, 0,
dataBlockContentBytes.length, 0);
+ HoodieStorageUtils.getNewStorageConf(), null, 0,
dataBlockContentBytes.length, 0);
Review Comment:
use `getDefaultStorageConf`.
##########
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##########
@@ -43,16 +47,66 @@
import java.util.List;
import java.util.stream.Collectors;
+import static org.apache.hudi.common.util.ValidationUtils.checkArgument;
import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToHadoopPath;
import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePath;
import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePathInfo;
+import static org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs;
/**
* Implementation of {@link HoodieStorage} using Hadoop's {@link FileSystem}
*/
public class HoodieHadoopStorage extends HoodieStorage {
private final FileSystem fs;
+ public HoodieHadoopStorage(HoodieStorage storage) {
Review Comment:
I feel like there are too many constructors here. Ideally, only a couple
should suffice. To load the implementation through config and reflection, we
cannot support many constructors, ideally only one or two
.
##########
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##########
@@ -213,6 +267,16 @@ public Configuration unwrapConf() {
return fs.getConf();
}
+ @Override
+ public StorageConfiguration<?> buildInlineConf(StorageConfiguration<?>
storageConf) {
+ return HadoopInLineFSUtils.buildInlineConf(storageConf);
+ }
+
+ @Override
+ public StoragePath getInlineFilePath(StoragePath outerPath, String
origScheme, long inLineStartOffset, long inLineLength) {
+ return HadoopInLineFSUtils.getInlineFilePath(outerPath, origScheme,
inLineStartOffset, inLineLength);
+ }
Review Comment:
I think this is not the logic that `HoodieStorage` implementation should
care about. Inline Path is invariant compared to different `HoodieStorage`
implementations.
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestHoodieTableConfig.java:
##########
@@ -63,7 +61,7 @@ public class TestHoodieTableConfig extends
HoodieCommonTestHarness {
@BeforeEach
public void setUp() throws Exception {
initPath();
- storage = HoodieStorageUtils.getStorage(basePath,
HadoopFSUtils.getStorageConf(new Configuration()));
+ storage = HoodieStorageUtils.getStorage(basePath);
Review Comment:
Let's create a util method in test utils only. Production code should not
use empty `Configuration`.
##########
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##########
@@ -34,22 +32,35 @@ public static HoodieStorage
getStorage(StorageConfiguration<?> conf) {
}
public static HoodieStorage getStorage(FileSystem fs) {
- return new HoodieHadoopStorage(fs);
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {FileSystem.class}, fs);
+ }
+
+ public static HoodieStorage getStorage(String basePath) {
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {String.class}, basePath);
}
public static HoodieStorage getStorage(String basePath,
StorageConfiguration<?> conf) {
- return getStorage(HadoopFSUtils.getFs(basePath, conf));
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {String.class, StorageConfiguration.class}, basePath, conf);
+ }
+
+ public static HoodieStorage getStorage(String basePath, Configuration conf) {
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {String.class, Configuration.class}, basePath, conf);
}
public static HoodieStorage getStorage(StoragePath path,
StorageConfiguration<?> conf) {
- return getStorage(HadoopFSUtils.getFs(path,
conf.unwrapAs(Configuration.class)));
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {StoragePath.class, StorageConfiguration.class}, path, conf);
}
public static HoodieStorage getRawStorage(HoodieStorage storage) {
- FileSystem fs = (FileSystem) storage.getFileSystem();
- if (fs instanceof HoodieWrapperFileSystem) {
- return getStorage(((HoodieWrapperFileSystem) fs).getFileSystem());
- }
- return storage;
+ return (HoodieStorage)
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
new Class<?>[] {HoodieStorage.class}, storage);
+ }
+
+ public static StorageConfiguration<?> getNewStorageConf() {
Review Comment:
This should not be exposed.
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestTableSchemaResolver.java:
##########
@@ -100,7 +98,7 @@ public void testReadSchemaFromLogFile() throws IOException,
URISyntaxException,
assertEquals(
new AvroSchemaConverter().convert(expectedSchema),
TableSchemaResolver.readSchemaFromLogFile(HoodieStorageUtils.getStorage(
- logFilePath, HadoopFSUtils.getStorageConf(new Configuration())),
logFilePath));
+ logFilePath, HoodieStorageUtils.getNewStorageConf()),
logFilePath));
Review Comment:
same here, use `getDefaultStorageConf()`
##########
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HadoopStorageConfiguration.java:
##########
@@ -38,7 +38,11 @@ public class HadoopStorageConfiguration extends
StorageConfiguration<Configurati
private transient Configuration configuration;
public HadoopStorageConfiguration() {
- this(new Configuration());
+ this(true);
+ }
+
+ public HadoopStorageConfiguration(Boolean loadDefaults) {
Review Comment:
Is this need for production code? `HadoopStorageConfiguration(Configuration
configuration)` should suffice.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]