manojpec commented on a change in pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#discussion_r786379145
##########
File path: hudi-common/src/main/avro/HoodieMetadata.avsc
##########
@@ -30,27 +30,118 @@
"doc": "Type of the metadata record",
"type": "int"
},
- { "name": "filesystemMetadata",
+ {
"doc": "Contains information about partitions and files within the dataset",
- "type": ["null", {
- "type": "map",
- "values": {
+ "name": "filesystemMetadata",
+ "type": [
+ "null",
+ {
+ "type": "map",
+ "values": {
+ "type": "record",
+ "name": "HoodieMetadataFileInfo",
+ "fields": [
+ {
+ "name": "size",
+ "type": "long",
+ "doc": "Size of the file"
+ },
+ {
+ "name": "isDeleted",
+ "type": "boolean",
+ "doc": "True if this file has been deleted"
+ }
+ ]
+ }
+ }
+ ]
+ },
+ {
+ "doc": "Metadata Index of bloom filters for all data files in the user table",
+ "name": "BloomFilterMetadata",
+ "type": [
+ "null",
+ {
+ "doc": "Data file bloom filter details",
+ "name": "HoodieMetadataBloomFilter",
"type": "record",
- "name": "HoodieMetadataFileInfo",
"fields": [
{
- "name": "size",
- "type": "long",
- "doc": "Size of the file"
+ "doc": "Bloom filter type code",
+ "name": "type",
+ "type": "string"
+ },
+ {
+ "doc": "Instant timestamp when this metadata was created/updated",
+ "name": "timestamp",
+ "type": "string"
+ },
+ {
+ "doc": "Bloom filter binary byte array",
+ "name": "bloomFilter",
+ "type": "bytes"
},
{
+ "doc": "Bloom filter entry valid/deleted flag",
"name": "isDeleted",
- "type": "boolean",
- "doc": "True if this file has been deleted"
+ "type": "boolean"
+ },
+ {
+ "doc": "Reserved bytes for future use",
+ "name": "reserved",
+ "type": "bytes"
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "doc": "Metadata Index of column ranges for all data files in the user table",
+ "name": "ColumnStatsMetadata",
+ "type": [
+ "null",
+ {
+ "doc": "Data file column ranges details",
+ "name": "HoodieColumnStats",
+ "type": "record",
+ "fields": [
+ {
+ "doc": "Minimum value in the range. Based on user data table schema, we can convert this to appropriate type",
+ "name": "minValue",
+ "type": [
+ "null",
+ "string"
+ ]
+ },
+ {
+ "doc": "Maximum value in the range. Based on user data table schema, we can convert it to appropriate type",
+ "name": "maxValue",
+ "type": [
+ "null",
+ "string"
+ ]
+ },
+ {
+ "doc": "Maximum value in the range. Based on user data table schema, we can convert it to appropriate type",
Review comment:
The file footer/metadata does not expose any further range details,
so I have added all the fields that are available.
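
For context on the "convert this to appropriate type" note in the doc strings above: minValue and maxValue are nullable strings in this schema, so a reader would have to parse them according to the column's type before comparing. The following is only a minimal sketch of that idea. The class, enum, and method names are hypothetical and are not part of Hudi or of this PR, and only three column types are covered.

```java
// Hypothetical sketch; none of these names exist in Hudi.
public class ColumnRangeSketch {

    /** Illustrative subset of column types; a real table schema has more. */
    enum ColumnType { LONG, DOUBLE, STRING }

    /**
     * Returns true if key may fall inside [minValue, maxValue].
     * A null bound means the range is unknown, so the file cannot be skipped.
     */
    static boolean mayContain(String minValue, String maxValue, String key, ColumnType type) {
        if (minValue == null || maxValue == null) {
            return true;
        }
        switch (type) {
            case LONG: {
                long k = Long.parseLong(key);
                return Long.parseLong(minValue) <= k && k <= Long.parseLong(maxValue);
            }
            case DOUBLE: {
                double k = Double.parseDouble(key);
                return Double.parseDouble(minValue) <= k && k <= Double.parseDouble(maxValue);
            }
            default:
                // Plain lexicographic comparison for string columns.
                return minValue.compareTo(key) <= 0 && key.compareTo(maxValue) <= 0;
        }
    }

    public static void main(String[] args) {
        // Parsed as longs, 10 lies in [9, 100]; compared as raw strings it does not,
        // because "10" sorts before "9".
        System.out.println(mayContain("9", "100", "10", ColumnType.LONG));   // true
        System.out.println(mayContain("9", "100", "10", ColumnType.STRING)); // false
        System.out.println(mayContain(null, "100", "10", ColumnType.LONG));  // true
    }
}
```

The second call shows why the stored strings should not be compared directly for numeric columns.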