danny0405 commented on code in PR #5201:
URL: https://github.com/apache/hudi/pull/5201#discussion_r849042314
##########
hudi-common/src/main/java/org/apache/hudi/common/table/TableSchemaResolver.java:
##########
@@ -159,23 +167,67 @@ public Schema getTableAvroSchema() throws Exception {
* @throws Exception
*/
public Schema getTableAvroSchema(boolean includeMetadataFields) throws
Exception {
+ Schema schema;
Option<Schema> schemaFromCommitMetadata =
getTableSchemaFromCommitMetadata(includeMetadataFields);
if (schemaFromCommitMetadata.isPresent()) {
- return schemaFromCommitMetadata.get();
- }
- Option<Schema> schemaFromTableConfig =
metaClient.getTableConfig().getTableCreateSchema();
- if (schemaFromTableConfig.isPresent()) {
- if (includeMetadataFields) {
- return HoodieAvroUtils.addMetadataFields(schemaFromTableConfig.get(),
hasOperationField);
+ schema = schemaFromCommitMetadata.get();
+ } else {
+ Option<Schema> schemaFromTableConfig =
metaClient.getTableConfig().getTableCreateSchema();
+ if (schemaFromTableConfig.isPresent()) {
+ if (includeMetadataFields) {
+ schema =
HoodieAvroUtils.addMetadataFields(schemaFromTableConfig.get(),
hasOperationField);
+ } else {
+ schema = schemaFromTableConfig.get();
+ }
} else {
- return schemaFromTableConfig.get();
+ if (includeMetadataFields) {
+ schema = getTableAvroSchemaFromDataFile();
+ } else {
+ schema =
HoodieAvroUtils.removeMetadataFields(getTableAvroSchemaFromDataFile());
+ }
}
}
- if (includeMetadataFields) {
- return getTableAvroSchemaFromDataFile();
- } else {
- return
HoodieAvroUtils.removeMetadataFields(getTableAvroSchemaFromDataFile());
+
+ Option<String[]> partitionFieldsOpt =
metaClient.getTableConfig().getPartitionFields();
+ if (metaClient.getTableConfig().isDropPartitionColumns()) {
+ schema = recreateSchemaWhenDropPartitionColumns(partitionFieldsOpt,
schema);
+ }
Review Comment:
I'm confused by `recreateSchemaWhenDropPartitionColumns`. Do you mean that
when `drop partitions` is set to true, we still need to set up the
metadata field `_hoodie_partition_path` but ignore it in the record payload?
How much do we gain in exchange for this added complexity?
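
For reference, this is how I read the fallback order in the hunk above, as a simplified sketch and not the real implementation. A "schema" here is just a list of field names, the class and method names are hypothetical, `hasOperationField` is ignored, and the `recreateSchemaWhenDropPartitionColumns` step is left out because its body is not part of this hunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for the resolution order in getTableAvroSchema(boolean).
// A "schema" is modeled as an ordered list of field names, not an Avro Schema.
final class SchemaFallbackSketch {
  // Hudi's standard metadata columns (the optional operation field is omitted).
  static final List<String> META_FIELDS = List.of(
      "_hoodie_commit_time", "_hoodie_commit_seqno", "_hoodie_record_key",
      "_hoodie_partition_path", "_hoodie_file_name");

  // Prepend the metadata columns to a user schema.
  static List<String> addMetadataFields(List<String> schema) {
    List<String> out = new ArrayList<>(META_FIELDS);
    out.addAll(schema);
    return out;
  }

  // Strip the metadata columns from a schema read out of a data file.
  static List<String> removeMetadataFields(List<String> schema) {
    List<String> out = new ArrayList<>(schema);
    out.removeAll(META_FIELDS);
    return out;
  }

  static List<String> resolve(Optional<List<String>> fromCommitMetadata,
                              Optional<List<String>> fromTableConfig,
                              Supplier<List<String>> fromDataFile,
                              boolean includeMetadataFields) {
    // 1. Commit metadata wins; it is assumed to already honor includeMetadataFields.
    if (fromCommitMetadata.isPresent()) {
      return fromCommitMetadata.get();
    }
    // 2. Otherwise use the table-create schema, which carries no metadata columns.
    if (fromTableConfig.isPresent()) {
      return includeMetadataFields
          ? addMetadataFields(fromTableConfig.get())
          : fromTableConfig.get();
    }
    // 3. Last resort: the data file schema, which already has metadata columns.
    List<String> fileSchema = fromDataFile.get();
    return includeMetadataFields ? fileSchema : removeMetadataFields(fileSchema);
  }
}
```

In this reading the three sources differ only in whether metadata columns have to be added or stripped, so the new partition-column step is the only place where the returned schema can diverge from what was written.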