nsivabalan commented on code in PR #12888:
URL: https://github.com/apache/hudi/pull/12888#discussion_r1976269159
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java:
##########
@@ -632,6 +632,7 @@ static boolean validateConfigVersion(ConfigProperty<?> configProperty, HoodieTab
     // validate that the table version is greater than or equal to the config version
     HoodieTableVersion firstVersion = HoodieTableVersion.fromReleaseVersion(configProperty.getSinceVersion().get());
     boolean valid = tableVersion.greaterThan(firstVersion) || tableVersion.equals(firstVersion);
+    valid = valid || configProperty.key().equals(KEY_GENERATOR_CLASS_NAME.key()) || configProperty.key().equals(KEY_GENERATOR_TYPE.key());
Review Comment:
shouldn't this be &&?
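To make the question concrete, here is a small sketch of what the two operators would do at this call site. The key names and helper are illustrative stand-ins, not the actual Hudi constants:

```java
public class KeyGenConfigCheckSketch {
  // Illustrative stand-in for the KEY_GENERATOR_CLASS_NAME / KEY_GENERATOR_TYPE key checks.
  static boolean isKeyGenConfig(String key) {
    return key.equals("hoodie.table.keygenerator.class")
        || key.equals("hoodie.table.keygenerator.type");
  }

  // As written in the PR: key generator configs pass validation even when
  // the table version check failed.
  static boolean withOr(boolean versionOk, String key) {
    return versionOk || isKeyGenConfig(key);
  }

  // With &&: a config is valid only when the version check passed AND it is
  // a key generator config, which would reject every other config.
  static boolean withAnd(boolean versionOk, String key) {
    return versionOk && isKeyGenConfig(key);
  }
}
```

So the question boils down to whether key generator configs should bypass the version check unconditionally.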
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java:
##########
@@ -109,9 +109,6 @@ public Option<HoodieCompactionPlan> execute() {
         .getWriteTimeline().filterCompletedAndCompactionInstants().getInstantsAsStream()
         .filter(instant -> compareTimestamps(instant.requestedTime(), GREATER_THAN_OR_EQUALS, instantTime))
         .collect(Collectors.toList());
-    ValidationUtils.checkArgument(conflictingInstants.isEmpty(),
Review Comment:
Why are we removing this check?
And even if the removal is required, is it possible to enable the check only for table version 8 or above?
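One way to address the second question, sketched with stand-in types (the real code would use `HoodieTableVersion` and `ValidationUtils.checkArgument`): keep the check but gate it on the table version instead of deleting it.

```java
import java.util.List;

public class GatedConflictCheckSketch {
  // Minimal stand-in for HoodieTableVersion ordering.
  enum TableVersion { SIX, SEVEN, EIGHT }

  // Hypothetical: run the conflicting-instant check only for table version 8+,
  // rather than removing it for all versions.
  static void checkNoConflictingInstants(TableVersion version, List<String> conflicting) {
    if (version.compareTo(TableVersion.EIGHT) >= 0 && !conflicting.isEmpty()) {
      throw new IllegalArgumentException(
          "Following instants have timestamps >= compaction instant: " + conflicting);
    }
  }
}
```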
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HoodieIncrSource.java:
##########
@@ -380,8 +380,7 @@ private Pair<Option<Dataset<Row>>, Checkpoint> fetchNextBatchBasedOnRequestedTim
         .option(START_COMMIT().key(), queryInfo.getStartInstant())
         .option(END_COMMIT().key(), queryInfo.getEndInstant())
         .option(INCREMENTAL_FALLBACK_TO_FULL_TABLE_SCAN().key(),
-            props.getString(INCREMENTAL_FALLBACK_TO_FULL_TABLE_SCAN().key(), INCREMENTAL_FALLBACK_TO_FULL_TABLE_SCAN().defaultValue()))
+            props.getString(INCREMENTAL_FALLBACK_TO_FULL_TABLE_SCAN().key(), "false"))
Review Comment:
Why are we flipping the default value even for table version 8?
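If the intent is to change the fallback behavior only for older tables, a version-aware default along these lines (names hypothetical) would avoid flipping it for table version 8:

```java
public class FallbackDefaultSketch {
  // Hypothetical: keep the config's shipped default for table version 8+,
  // and force "false" only for older table versions.
  static String fallbackToFullScanDefault(int tableVersion, String shippedDefault) {
    return tableVersion >= 8 ? shippedDefault : "false";
  }
}
```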
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java:
##########
@@ -479,13 +472,6 @@ && compareTimestamps(logBlock.getLogBlockHeader().get(INSTANT_TIME), GREATER_THA
         continue;
       }
       if (logBlock.getBlockType() != COMMAND_BLOCK) {
-        if (this.tableVersion.lesserThan(HoodieTableVersion.EIGHT)) {
-          if (!getOrCreateCompletedInstantsTimeline().containsOrBeforeTimelineStarts(instantTime)
Review Comment:
same here.
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java:
##########
@@ -281,13 +281,6 @@ private void scanInternalV1(Option<KeySpec> keySpecOpt) {
       final String instantTime = logBlock.getLogBlockHeader().get(INSTANT_TIME);
       totalLogBlocks.incrementAndGet();
       if (logBlock.isDataOrDeleteBlock()) {
-        if (this.tableVersion.lesserThan(HoodieTableVersion.EIGHT)) {
-          if (!getOrCreateCompletedInstantsTimeline().containsOrBeforeTimelineStarts(instantTime)
-              || getOrCreateInflightInstantsTimeline().containsInstant(instantTime)) {
Review Comment:
We can't just remove this constraint.
We needed it specifically for table version 6, while preparing RLI records.
But a regular MOR table reader can't expose uncommitted data.
So we might have to add an additional argument to the log record reader and set it to true only when preparing RLI records. Otherwise it should be false, in which case the check-and-continue logic should still apply for table version 6.
Can we also add docs here explaining why we are allowing this?
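The suggestion could look roughly like the sketch below. `allowInflightInstants` is a hypothetical name for the extra log record reader argument, set to true only on the RLI preparation path:

```java
public class LogBlockFilterSketch {
  // Hypothetical flag-gated version of the removed check. For table v6,
  // regular MOR readers skip blocks from uncommitted instants; the RLI
  // preparation path opts out via allowInflightInstants.
  static boolean shouldSkipBlock(boolean allowInflightInstants,
                                 boolean instantCompleted,
                                 boolean instantInflight) {
    if (allowInflightInstants) {
      return false; // RLI record preparation may read uncommitted log blocks
    }
    // Mirror of the removed condition: skip when the instant is not in the
    // completed timeline, or is still inflight.
    return !instantCompleted || instantInflight;
  }
}
```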
##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java:
##########
@@ -537,9 +538,13 @@ public static String getTableBasePath(InputSplit split, JobConf jobConf) throws
   /**
    * `schema.on.read` and skip merge not implemented
    */
-  public static boolean shouldUseFilegroupReader(final JobConf jobConf, final InputSplit split) {
-    return jobConf.getBoolean(HoodieReaderConfig.FILE_GROUP_READER_ENABLED.key(), HoodieReaderConfig.FILE_GROUP_READER_ENABLED.defaultValue())
-        && !jobConf.getBoolean(HoodieCommonConfig.SCHEMA_EVOLUTION_ENABLE.key(), HoodieCommonConfig.SCHEMA_EVOLUTION_ENABLE.defaultValue())
-        && !(split instanceof BootstrapBaseFileSplit);
+  public static boolean shouldUseFilegroupReader(final JobConf jobConf, final InputSplit split) throws IOException {
+    if (!(split instanceof FileSplit)) {
Review Comment:
Why is this fix required? Can you help me understand?
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieReaderConfig.java:
##########
@@ -89,4 +95,18 @@ public class HoodieReaderConfig extends HoodieConfig {
       "hoodie.write.record.merge.custom.implementation.classes";
   public static final String RECORD_MERGE_IMPL_CLASSES_DEPRECATED_WRITE_CONFIG_KEY = "hoodie.datasource.write.record.merger.impls";
+
+  public static boolean isFileGroupReaderEnabled(HoodieTableVersion tableVersion, HoodieConfig config) {
Review Comment:
do we have UTs for this?
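A UT could cover both the table-version and config dimensions. The sketch below reimplements one plausible reading of the method (file group reader on iff the flag is set and the table version is 8+); the real semantics in `HoodieReaderConfig` may differ, so treat this as a shape for the test, not its expected values:

```java
public class IsFileGroupReaderEnabledSketch {
  // Hypothetical reimplementation for illustration only; verify against the
  // real HoodieReaderConfig.isFileGroupReaderEnabled before copying assertions.
  static boolean isFileGroupReaderEnabled(int tableVersion, boolean configFlag) {
    return configFlag && tableVersion >= 8;
  }
}
```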
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]