vinothchandar commented on a change in pull request #4114:
URL: https://github.com/apache/hudi/pull/4114#discussion_r757607131
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -727,7 +729,7 @@ protected void bootstrapCommit(List<DirectoryInfo>
partitionInfoList, String cre
HoodieData<HoodieRecord> fileListRecords =
engineContext.parallelize(partitionInfoList,
partitionInfoList.size()).map(partitionInfo -> {
// Record which saves files within a partition
return HoodieMetadataPayload.createPartitionFilesRecord(
- partitionInfo.getRelativePath(),
Option.of(partitionInfo.getFileNameToSizeMap()), Option.empty());
+ partitionInfo.getRelativePath().isEmpty() ? NON_PARTITIONED_NAME :
partitionInfo.getRelativePath(),
Option.of(partitionInfo.getFileNameToSizeMap()), Option.empty());
Review comment:
is the identation all ok here? needs folding?
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/HoodieSparkTable.java
##########
@@ -49,8 +48,7 @@
public abstract class HoodieSparkTable<T extends HoodieRecordPayload>
extends HoodieTable<T, JavaRDD<HoodieRecord<T>>, JavaRDD<HoodieKey>,
JavaRDD<WriteStatus>> {
- private boolean isMetadataAvailabilityUpdated = false;
- private boolean isMetadataTableAvailable;
+ private volatile boolean isMetadataTableBasePathExists = false;
Review comment:
+1
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/HoodieSparkTable.java
##########
@@ -112,25 +110,22 @@ protected HoodieIndex getIndex(HoodieWriteConfig config,
HoodieEngineContext con
* @return instance of {@link HoodieTableMetadataWriter}
*/
@Override
- public <T extends SpecificRecordBase> Option<HoodieTableMetadataWriter>
getMetadataWriter(Option<T> actionMetadata) {
- synchronized (this) {
- if (!isMetadataAvailabilityUpdated) {
- // This code assumes that if metadata availability is updated once it
will not change.
- // Please revisit this logic if that's not the case. This is done to
avoid repeated calls to fs.exists().
- try {
- isMetadataTableAvailable = config.isMetadataTableEnabled()
- && metaClient.getFs().exists(new
Path(HoodieTableMetadata.getMetadataTableBasePath(metaClient.getBasePath())));
- } catch (IOException e) {
- throw new HoodieMetadataException("Checking existence of metadata
table failed", e);
+ public <T extends SpecificRecordBase> Option<HoodieTableMetadataWriter>
getMetadataWriter(String inFlightInstantTimestamp,
+
Option<T> actionMetadata) {
+ if (config.isMetadataTableEnabled()) {
+ final HoodieTableMetadataWriter metadataWriter =
SparkHoodieBackedTableMetadataWriter.create(
Review comment:
if we always call this, the check below is only valid for cases where it
actually bootstrapped?
lets add a line of comment there. its a bit obtuse to infer
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java
##########
@@ -733,10 +733,11 @@ public HoodieEngineContext getContext() {
/**
* Get Table metadata writer.
*
+ * @param inFlightInstantTimestamp - InFlight instant timestamp for which
metadata writer is needed
Review comment:
`triggeringInstantTime - the instant that is triggering this metadata
write`
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java
##########
@@ -733,10 +733,11 @@ public HoodieEngineContext getContext() {
/**
* Get Table metadata writer.
*
+ * @param inFlightInstantTimestamp - InFlight instant timestamp for which
metadata writer is needed
Review comment:
rename `inFlightInstantTimestamp` to `triggeringInstantTime` here and
metadata writer classes (not timestamp, its mean as logical time, our tests use
001, 002 and everything still works. )
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/HoodieSparkTable.java
##########
@@ -112,25 +110,22 @@ protected HoodieIndex getIndex(HoodieWriteConfig config,
HoodieEngineContext con
* @return instance of {@link HoodieTableMetadataWriter}
*/
@Override
- public <T extends SpecificRecordBase> Option<HoodieTableMetadataWriter>
getMetadataWriter(Option<T> actionMetadata) {
- synchronized (this) {
- if (!isMetadataAvailabilityUpdated) {
- // This code assumes that if metadata availability is updated once it
will not change.
- // Please revisit this logic if that's not the case. This is done to
avoid repeated calls to fs.exists().
- try {
- isMetadataTableAvailable = config.isMetadataTableEnabled()
- && metaClient.getFs().exists(new
Path(HoodieTableMetadata.getMetadataTableBasePath(metaClient.getBasePath())));
- } catch (IOException e) {
- throw new HoodieMetadataException("Checking existence of metadata
table failed", e);
+ public <T extends SpecificRecordBase> Option<HoodieTableMetadataWriter>
getMetadataWriter(String inFlightInstantTimestamp,
Review comment:
rename to `triggeringInstantTime`
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/HoodieTable.java
##########
@@ -733,10 +733,11 @@ public HoodieEngineContext getContext() {
/**
* Get Table metadata writer.
*
+ * @param inFlightInstantTimestamp - InFlight instant timestamp for which
metadata writer is needed
Review comment:
Also when we add ability to index async, this needs to be a
`Option<String>` since there will be no inflight/triggering instant then. We
can do it here or defer to me. Just calling out
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -727,7 +729,7 @@ protected void bootstrapCommit(List<DirectoryInfo>
partitionInfoList, String cre
HoodieData<HoodieRecord> fileListRecords =
engineContext.parallelize(partitionInfoList,
partitionInfoList.size()).map(partitionInfo -> {
// Record which saves files within a partition
return HoodieMetadataPayload.createPartitionFilesRecord(
- partitionInfo.getRelativePath(),
Option.of(partitionInfo.getFileNameToSizeMap()), Option.empty());
+ partitionInfo.getRelativePath().isEmpty() ? NON_PARTITIONED_NAME :
partitionInfo.getRelativePath(),
Option.of(partitionInfo.getFileNameToSizeMap()), Option.empty());
Review comment:
push the ternary check into `partitionInfo.getRelativePath()` , avoid
code dupe?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]