bvaradar opened a new issue, #14267:
URL: https://github.com/apache/hudi/issues/14267

   Description
   This issue migrates column statistics schema handling from Avro Schema to 
HoodieSchema for in-memory processing, while keeping Avro serialization for 
on-disk storage.
   
   Packages Migrated
   org.apache.hudi.stats
   
   What to Change
   Migrate ValueType.inferType() to use HoodieSchema for type inference
   Update ValueMetadata schema-based operations to use HoodieSchema internally
   Add column statistics utilities to HoodieSchemaUtils
   Convert schema parameters from Avro Schema to HoodieSchema in public APIs
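
   A rough sketch of the intended split is shown below: type inference runs
against an in-memory schema wrapper, and the Avro type is derived only when
the statistics are written out. Every name here (ColumnStatsSketch,
SchemaKind, StatsValueType, FieldSchema, avroPrimitiveName) is a hypothetical
stand-in invented for illustration. None of them is the real HoodieSchema,
ValueType, or ValueMetadata API, and the set of supported types is an
assumption.

```java
import java.util.Objects;

// Illustrative only: these are NOT Hudi's real classes or signatures.
final class ColumnStatsSketch {
    private ColumnStatsSketch() {}

    /** Hypothetical type kinds exposed by an in-memory schema wrapper. */
    enum SchemaKind { BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING, BYTES, DATE, TIMESTAMP_MICROS, RECORD }

    /** Hypothetical tag describing how a column's min/max values are typed. */
    enum StatsValueType { BOOLEAN, INT, LONG, FLOAT, DOUBLE, STRING, BYTES, DATE, TIMESTAMP_MICROS, UNSUPPORTED }

    /** Hypothetical stand-in for a field's in-memory schema: a kind plus nullability. */
    static final class FieldSchema {
        final SchemaKind kind;
        final boolean nullable;

        FieldSchema(SchemaKind kind, boolean nullable) {
            this.kind = Objects.requireNonNull(kind, "kind");
            this.nullable = nullable;
        }
    }

    /** In-memory step: infer the stats value type without touching Avro. */
    static StatsValueType inferType(FieldSchema schema) {
        Objects.requireNonNull(schema, "schema");
        switch (schema.kind) {
            case BOOLEAN: return StatsValueType.BOOLEAN;
            case INT: return StatsValueType.INT;
            case LONG: return StatsValueType.LONG;
            case FLOAT: return StatsValueType.FLOAT;
            case DOUBLE: return StatsValueType.DOUBLE;
            case STRING: return StatsValueType.STRING;
            case BYTES: return StatsValueType.BYTES;
            case DATE: return StatsValueType.DATE;
            case TIMESTAMP_MICROS: return StatsValueType.TIMESTAMP_MICROS;
            default: return StatsValueType.UNSUPPORTED; // e.g. nested records
        }
    }

    /**
     * Serialization boundary: name of the Avro primitive that backs each value type.
     * In Avro, the date logical type annotates int and timestamp-micros annotates long.
     */
    static String avroPrimitiveName(StatsValueType type) {
        switch (type) {
            case BOOLEAN: return "boolean";
            case INT:
            case DATE: return "int";
            case LONG:
            case TIMESTAMP_MICROS: return "long";
            case FLOAT: return "float";
            case DOUBLE: return "double";
            case STRING: return "string";
            case BYTES: return "bytes";
            default: throw new IllegalArgumentException("No column statistics for " + type);
        }
    }
}
```

   Keeping the Avro mapping in a single boundary method is one way to let
callers work with the in-memory schema while the written format stays
unchanged. The actual migration may structure this differently.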
   
   What to Avoid
   Do not change the on-disk serialization format for column statistics (it 
must remain Avro)
   Do not modify existing column statistics storage in Parquet metadata
   Do not alter column statistics serialization in timeline metadata files
   
   Files to Modify
   ValueType.java (~100 lines changed)
   ValueMetadata.java (~80 lines changed)
   HoodieSchemaUtils.java (+40 lines for stats utilities)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
