bvaradar opened a new issue, #14267: URL: https://github.com/apache/hudi/issues/14267
Description

This issue migrates column statistics schema handling from Avro Schema to HoodieSchema for in-memory processing while maintaining Avro serialization for on-disk storage.

Packages Migrated

- org.apache.hudi.stats

What to Change

- Migrate ValueType.inferType() to use HoodieSchema for type inference
- Update ValueMetadata schema-based operations to use HoodieSchema internally
- Add column statistics utilities to HoodieSchemaUtils
- Convert schema parameters from Avro Schema to HoodieSchema in public APIs

What to Avoid

- Do not change on-disk serialization format for column statistics (must remain Avro)
- Do not modify existing column statistics storage in Parquet metadata
- Do not alter column statistics serialization in timeline metadata files

Files to Modify

- ValueType.java (~100 lines changed)
- ValueMetadata.java (~80 lines changed)
- HoodieSchemaUtils.java (+40 lines for stats utilities)
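
The intended split (in-memory schema for inference, unchanged Avro representation on disk) can be sketched roughly as follows. This is only an illustration under stated assumptions: every type here (`MemSchema`, `MemType`, `DiskType`, `StatType`) is a hypothetical stand-in written for this sketch, not the real `HoodieSchema`, `ValueType`, or Avro `Schema` API, and the type mapping is invented for the example.

```java
import java.util.Objects;

// Hypothetical stand-ins only: none of these types are Hudi or Avro classes.
final class StatsSchemaSketch {

    /** Stand-in for the Avro-side type tag that is written to disk. */
    enum DiskType { INT, LONG, FLOAT, DOUBLE, STRING, BOOLEAN }

    /** Stand-in for the in-memory schema type used by stats code. */
    enum MemType { INT, LONG, FLOAT, DOUBLE, STRING, BOOLEAN }

    /** Stand-in for an in-memory schema wrapper (plays the role of HoodieSchema). */
    static final class MemSchema {
        private final MemType type;

        private MemSchema(MemType type) {
            this.type = Objects.requireNonNull(type, "type");
        }

        static MemSchema of(MemType type) {
            return new MemSchema(type);
        }

        /** Read boundary: build the in-memory schema from the on-disk tag. */
        static MemSchema fromDisk(DiskType diskType) {
            return new MemSchema(MemType.valueOf(diskType.name()));
        }

        /** Write boundary: recover the on-disk tag without altering it. */
        DiskType toDisk() {
            return DiskType.valueOf(type.name());
        }

        MemType type() {
            return type;
        }
    }

    /** Stand-in for the stats value type that inference produces. */
    enum StatType { NUMERIC_INTEGRAL, NUMERIC_FLOATING, TEXT, BOOL }

    /** Type inference accepts the in-memory schema, never the on-disk one. */
    static StatType inferType(MemSchema schema) {
        switch (schema.type()) {
            case INT:
            case LONG:
                return StatType.NUMERIC_INTEGRAL;
            case FLOAT:
            case DOUBLE:
                return StatType.NUMERIC_FLOATING;
            case STRING:
                return StatType.TEXT;
            case BOOLEAN:
                return StatType.BOOL;
            default:
                throw new IllegalArgumentException("Unsupported type: " + schema.type());
        }
    }

    private StatsSchemaSketch() { }
}
```

In this sketch a caller would obtain a `MemSchema` via `MemSchema.fromDisk(...)` when reading, pass it to `inferType(...)`, and call `toDisk()` only when writing, so the on-disk tag round-trips unchanged.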
