[ https://issues.apache.org/jira/browse/HUDI-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinoth Chandar updated HUDI-6242:
---------------------------------
Description:
This EPIC tracks changes to the Hudi storage format.
A format change is anything that changes any bits related to:
- *Timeline*: active or archived timeline contents, file names.
- *Base Files*: file format versions, changes to data types, file
footers, file names.
- *Log Files*: Block structure, content, names.
- *Metadata Table*: (should we call this index table instead?) partition
names, number of file groups, key/value schema and metadata to MDT row
mappings.
- *Table properties*: What's written to hoodie.properties.
The following functionality should be supported:
- Ability to mix different types of base files within a single table or even a
single file group (e.g., images, JSON, vectors, ...).
- Ability to support general-purpose multi-table transactions, especially
between data and metadata tables.
- Support for lockless/non-blocking transactions, where writers don't block
each other even in the face of conflicts.
- Support for encoding watermarks/event-time fields as first-class citizens,
for handling late-arriving data.
- Pluggable implementations for write handles, the timeline, and the metadata
table.
> Format changes for Hudi 1.X release line
> ----------------------------------------
>
> Key: HUDI-6242
> URL: https://issues.apache.org/jira/browse/HUDI-6242
> Project: Apache Hudi
> Issue Type: Epic
> Components: core
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
> Labels: hudi-umbrellas
> Fix For: 1.0.0