nsivabalan commented on code in PR #12310:
URL: https://github.com/apache/hudi/pull/12310#discussion_r1855701538
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -1155,28 +1146,15 @@ public static HoodieData<HoodieRecord> convertMetadataToColumnStatsRecords(Hoodi
}
try {
- Option<Schema> writerSchema =
- Option.ofNullable(commitMetadata.getMetadata(HoodieCommitMetadata.SCHEMA_KEY))
- .flatMap(writerSchemaStr ->
- isNullOrEmpty(writerSchemaStr)
- ? Option.empty()
- : Option.of(new Schema.Parser().parse(writerSchemaStr)));
-
- HoodieTableConfig tableConfig = dataMetaClient.getTableConfig();
-
- // NOTE: Writer schema added to commit metadata will not contain Hudi's metadata fields
- Option<Schema> tableSchema = writerSchema.map(schema ->
- tableConfig.populateMetaFields() ? addMetadataFields(schema) : schema);
-
- List<String> columnsToIndex = getColumnsToIndex(isColumnStatsIndexEnabled, targetColumnsForColumnStatsIndex,
- Lazy.eagerly(tableSchema));
+ List<String> columnsToIndex = getColumnsToIndex(dataMetaClient.getTableConfig(), metadataConfig,
Review Comment:
Not every operation writes a schema into the commit metadata; delete
partition, for example, might not. Can you fall back to the table schema from
the table schema resolver when the current commit metadata has no schema in
its extra metadata?
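A minimal sketch of the fallback being asked for, written against plain Java
types so it runs without Hudi on the classpath. The class name, the
`resolveSchema` helper, and the `Supplier` standing in for the table schema
resolver are all hypothetical; the schema is kept as a raw string rather than
a parsed Avro `Schema`. It only illustrates the ordering (commit metadata
first, resolver second) and is not the PR's actual code.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical illustration; none of these names come from the Hudi code base.
class SchemaFallbackSketch {

  /**
   * Prefers the schema string stored in the commit's extra metadata. If it is
   * null or empty (as may happen for operations such as delete partition),
   * the lazily evaluated table-level fallback is consulted instead.
   */
  static Optional<String> resolveSchema(String schemaFromCommit,
                                        Supplier<Optional<String>> tableSchemaFallback) {
    if (schemaFromCommit != null && !schemaFromCommit.isEmpty()) {
      return Optional.of(schemaFromCommit);
    }
    // The fallback is only invoked when the commit carries no schema.
    return tableSchemaFallback.get();
  }

  public static void main(String[] args) {
    // Stand-in for a lookup through the table schema resolver (assumption).
    Supplier<Optional<String>> fromTable = () -> Optional.of("table-schema");

    System.out.println(resolveSchema("commit-schema", fromTable).get());
    System.out.println(resolveSchema(null, fromTable).get());
    System.out.println(resolveSchema("", Optional::empty).isPresent());
  }
}
```

Passing the fallback as a `Supplier` keeps the potentially expensive table
schema lookup off the common path where the commit already carries a schema.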
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -1185,25 +1163,51 @@ public static HoodieData<HoodieRecord> convertMetadataToColumnStatsRecords(Hoodi
}
}
- /**
- * Get the list of columns for the table for column stats indexing
- */
- private static List<String> getColumnsToIndex(boolean isColumnStatsIndexEnabled,
- List<String> targetColumnsForColumnStatsIndex,
- Lazy<Option<Schema>> lazyWriterSchemaOpt) {
- checkState(isColumnStatsIndexEnabled);
+ public static final String[] META_COLS_TO_ALWAYS_INDEX = {COMMIT_TIME_METADATA_FIELD, RECORD_KEY_METADATA_FIELD, PARTITION_PATH_METADATA_FIELD};
+ public static final Set<String> META_COL_SET_TO_INDEX = new HashSet<>(Arrays.asList(META_COLS_TO_ALWAYS_INDEX));
- if (!targetColumnsForColumnStatsIndex.isEmpty()) {
- return targetColumnsForColumnStatsIndex;
+ public static List<String> getColumnsToIndex(HoodieTableConfig tableConfig,
Review Comment:
Do we have a unit test (UT) for this method?