This is an automated email from the ASF dual-hosted git repository.
xushiyan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 07ff8e48c0ae docs: update roadmap items (#14358)
07ff8e48c0ae is described below
commit 07ff8e48c0ae3cd16fbd3f5a7bb512910fb3ad63
Author: Shiyan Xu <[email protected]>
AuthorDate: Tue Nov 25 20:00:30 2025 -0600
docs: update roadmap items (#14358)
---
...ve-into-hudis-indexing-subsystem-part-1-of-2.md | 4 +--
...ve-into-hudis-indexing-subsystem-part-2-of-2.md | 6 ++--
website/docs/hudi_stack.md | 7 +++-
website/src/pages/ecosystem.md | 4 +--
website/src/pages/roadmap.md | 40 ++++++++-------------
.../assets/images/hudi_stack/pluggable_tf.png | Bin 0 -> 77190 bytes
website/versioned_docs/version-1.1.0/hudi_stack.md | 7 +++-
7 files changed, 34 insertions(+), 34 deletions(-)
diff --git a/website/blog/2025-10-29-deep-dive-into-hudis-indexing-subsystem-part-1-of-2.md b/website/blog/2025-10-29-deep-dive-into-hudis-indexing-subsystem-part-1-of-2.md
index e1cd9a03d495..9377a32609c3 100644
--- a/website/blog/2025-10-29-deep-dive-into-hudis-indexing-subsystem-part-1-of-2.md
+++ b/website/blog/2025-10-29-deep-dive-into-hudis-indexing-subsystem-part-1-of-2.md
@@ -13,7 +13,7 @@ tags:
For decades, databases have relied on indexes—specialized data structures—to
dramatically improve read and write performance by quickly locating specific
records. Apache Hudi extends this fundamental principle to the data lakehouse
with a unique and powerful approach. Every Hudi table contains a self-managed
metadata table that functions as an indexing subsystem, enabling efficient data
skipping and fast record lookups across a wide range of read and write
scenarios.
-This two-part series dives into Hudi’s indexing subsystem. Part 1 explains the
internal layout and data-skipping capabilities. Part 2 covers advanced
features—record, secondary, and expression indexes—and asynchronous index
maintenance. By the end, you’ll know how to leverage Hudi’s multimodal index to
build more efficient lakehouse tables.
+This two-part series dives into Hudi’s indexing subsystem. Part 1 explains the
internal layout and data-skipping capabilities. [part
2](https://hudi.apache.org/blog/2025/11/12/deep-dive-into-hudis-indexing-subsystem-part-2-of-2/)
covers advanced features—record, secondary, and expression indexes—and
asynchronous index maintenance. By the end, you’ll know how to leverage Hudi’s
multimodal index to build more efficient lakehouse tables.
## The Metadata Table
@@ -210,4 +210,4 @@ Hudi’s metadata table is itself a Hudi Merge‑on‑Read (MOR) table that acts
Index maintenance happens transactionally alongside data writes, keeping index
entries consistent with the data table. Periodic compaction merges log files
into read‑optimized HFile base files to keep point lookups fast and
predictable. On the read path, Hudi composes multiple indexes to minimize I/O:
the files index enumerates candidates, partition stats prune irrelevant
partitions, and column stats prune non‑matching files. In effect, the engine
scans only the minimum set of files requ [...]
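The read-path composition described above (files index enumerates candidates, partition stats prune partitions, column stats prune files) can be sketched in a few lines. This is an illustrative simulation of the pruning logic, not Hudi's actual implementation; the `FileStats` structure and `prune_files` helper are invented for clarity:

```python
from dataclasses import dataclass

@dataclass
class FileStats:
    # Hypothetical per-file min/max statistics, as a column stats index would track.
    path: str
    partition: str
    min_val: int
    max_val: int

def prune_files(files, partition_stats, query_min, query_max):
    """Keep only files whose [min, max] range can overlap the query range."""
    candidates = []
    for f in files:                              # files index: enumerate candidates
        p_min, p_max = partition_stats[f.partition]
        if query_max < p_min or query_min > p_max:
            continue                             # partition stats: prune whole partitions
        if query_max < f.min_val or query_min > f.max_val:
            continue                             # column stats: prune non-matching files
        candidates.append(f.path)
    return candidates

files = [
    FileStats("p1/a.parquet", "p1", 0, 10),
    FileStats("p1/b.parquet", "p1", 50, 90),
    FileStats("p2/c.parquet", "p2", 100, 200),
]
partition_stats = {"p1": (0, 90), "p2": (100, 200)}
print(prune_files(files, partition_stats, 60, 60))  # ['p1/b.parquet']
```

An equality predicate like `col = 60` thus touches one file instead of three; the real indexes apply the same range-overlap test using metadata table entries.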
-In practice, the defaults are a strong starting point. Keep the metadata table
enabled and explicitly list only the columns you frequently filter on via
`hoodie.metadata.index.column.stats.column.list` to control metadata overhead.
In Part 2, we’ll go deeper into accelerating equality‑matching and
expression‑based predicates using the record, secondary, and expression
indexes, and discuss how asynchronous index maintenance keeps writers unblocked
while indexes build in the background.
+In practice, the defaults are a strong starting point. Keep the metadata table
enabled and explicitly list only the columns you frequently filter on via
`hoodie.metadata.index.column.stats.column.list` to control metadata overhead.
In [part
2](https://hudi.apache.org/blog/2025/11/12/deep-dive-into-hudis-indexing-subsystem-part-2-of-2/),
we’ll go deeper into accelerating equality‑matching and expression‑based
predicates using the record, secondary, and expression indexes, and discuss how
[...]
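The advice above about listing only frequently filtered columns can be sketched as writer options. This is a minimal illustration: the config keys follow Hudi's documented naming but should be verified against the config reference for your Hudi version, and the column names are placeholders.

```python
def column_stats_options(columns):
    """Build writer options that limit column stats indexing to chosen columns.

    Assumes Hudi's documented config keys; column names are placeholders.
    """
    return {
        # Metadata table is on by default in recent releases; kept explicit here.
        "hoodie.metadata.enable": "true",
        "hoodie.metadata.index.column.stats.enable": "true",
        # Restrict stats tracking to columns you actually filter on,
        # to control metadata overhead.
        "hoodie.metadata.index.column.stats.column.list": ",".join(columns),
    }

opts = column_stats_options(["event_ts", "user_id"])
print(opts["hoodie.metadata.index.column.stats.column.list"])  # event_ts,user_id
```

These options would typically be passed to the Spark or Flink writer alongside the usual table configs.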
diff --git a/website/blog/2025-11-12-deep-dive-into-hudis-indexing-subsystem-part-2-of-2.md b/website/blog/2025-11-12-deep-dive-into-hudis-indexing-subsystem-part-2-of-2.md
index b5dc843a0f3b..83170d4cfa25 100644
--- a/website/blog/2025-11-12-deep-dive-into-hudis-indexing-subsystem-part-2-of-2.md
+++ b/website/blog/2025-11-12-deep-dive-into-hudis-indexing-subsystem-part-2-of-2.md
@@ -11,9 +11,9 @@ tags:
- data skipping
---
-In [Part
1](https://hudi.apache.org/blog/2025/10/29/deep-dive-into-hudis-indexing-subsystem-part-1-of-2/),
we explored how Hudi's metadata table functions as a self-managed, multimodal
indexing subsystem. We covered its internal architecture—a partitioned Hudi
Merge-on-Read (MOR) table using HFile format for efficient key lookups—and how
the files, column stats, and partition stats indexes work together to implement
powerful data skipping. These indexes dramatically reduce I/O by pruning [...]
+In [part
1](https://hudi.apache.org/blog/2025/10/29/deep-dive-into-hudis-indexing-subsystem-part-1-of-2/),
we explored how Hudi's metadata table functions as a self-managed, multimodal
indexing subsystem. We covered its internal architecture—a partitioned Hudi
Merge-on-Read (MOR) table using HFile format for efficient key lookups—and how
the files, column stats, and partition stats indexes work together to implement
powerful data skipping. These indexes dramatically reduce I/O by pruning [...]
-Now in Part 2, we'll dive into more specialized indexes that handle different
query patterns. We'll look at the record and secondary indexes, which provide
exact file locations for equality-matching predicates rather than just skipping
irrelevant files. We'll explore expression indexes that optimize queries with
inline transformations like `from_unixtime()` or `substring()`. Finally, we'll
cover async indexing, which lets you build resource-intensive indexes in the
background without blo [...]
+Now in part 2, we'll dive into more specialized indexes that handle different
query patterns. We'll look at the record and secondary indexes, which provide
exact file locations for equality-matching predicates rather than just skipping
irrelevant files. We'll explore expression indexes that optimize queries with
inline transformations like `from_unixtime()` or `substring()`. Finally, we'll
cover async indexing, which lets you build resource-intensive indexes in the
background without blo [...]
## Equality Matching with Record and Secondary Indexes
@@ -128,7 +128,7 @@ To manage this concurrency, a lock provider must be configured for both the inde
## Summary
-Throughout this two-part series, we've explored how Hudi's indexing subsystem
brings database-grade performance to the data lakehouse. In Part 1, we examined
the metadata table's architecture and how files, column stats, and partition
stats indexes work together to skip irrelevant data. In Part 2, we covered
specialized indexes—record, secondary, and expression indexes—that provide
exact file locations for equality matching and handle transformed predicates.
We also looked at async index [...]
+Throughout this two-part series, we've explored how Hudi's indexing subsystem
brings database-grade performance to the data lakehouse. In [part
1](https://hudi.apache.org/blog/2025/10/29/deep-dive-into-hudis-indexing-subsystem-part-1-of-2/),
we examined the metadata table's architecture and how files, column stats, and
partition stats indexes work together to skip irrelevant data. In part 2, we
covered specialized indexes—record, secondary, and expression indexes—that
provide exact file [...]
Here's a quick guide for choosing the right indexes for your workload:
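The lock-provider requirement mentioned in the hunk header above can be sketched as a set of writer options. This is a hedged example: the keys mirror Hudi's documented lock configs, but the ZooKeeper endpoint, lock key, and base path are placeholders to adapt to your deployment, and exact keys should be checked against your Hudi version.

```python
def async_index_lock_options(zk_url="localhost:2181", table="my_table"):
    """Illustrative lock configs so an async indexer and the data writer
    can coordinate. Endpoint, lock key, and base path are placeholders."""
    return {
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        "hoodie.write.lock.provider":
            "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
        "hoodie.write.lock.zookeeper.url": zk_url,
        "hoodie.write.lock.zookeeper.port": "2181",
        "hoodie.write.lock.zookeeper.lock_key": table,
        "hoodie.write.lock.zookeeper.base_path": "/hudi/locks",
    }

# Both the regular writer and the async indexing job would carry these options.
print(async_index_lock_options()["hoodie.write.concurrency.mode"])
```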
diff --git a/website/docs/hudi_stack.md b/website/docs/hudi_stack.md
index 189c9840b727..67e2bdbb1b8f 100644
--- a/website/docs/hudi_stack.md
+++ b/website/docs/hudi_stack.md
@@ -68,7 +68,12 @@ all Base Files is required. Read more about the various table types in Hudi [tab
## Pluggable Table format
-Starting with Hudi 1.1, Hudi introduces a pluggable table format framework
that extends Hudi's powerful storage engine capabilities beyond its native
format to other table formats like Apache Iceberg and Delta Lake. This
framework decouples Hudi's core capabilities—transaction management, indexing,
concurrency control, and table services—from the specific storage format used
for data files. Hudi provides native format support (configured via
`hoodie.table.format=native` by default), whil [...]
+Starting with Hudi 1.1, Hudi introduces a pluggable table format framework
that extends Hudi's powerful storage engine capabilities beyond its native
format to other table formats like Apache Iceberg and Delta Lake. This
framework decouples Hudi's core capabilities—transaction management, indexing,
concurrency control, and table services—from the specific storage format used
for data files.
+
+
+<p align = "center">Pluggable Table Format</p>
+
+Hudi provides native format support (configured via
`hoodie.table.format=native` by default), while [Apache XTable
(incubating)](https://xtable.apache.org/) supplies pluggable format adapters
for formats like Iceberg and Delta Lake. The framework enables organizations to
choose the right format for each use case while maintaining a unified
operational experience and leveraging Hudi's sophisticated storage engine
across all formats. For example, you can write high-frequency updates to a H
[...]
## Storage Engine
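The per-table format choice described in the hunk above (`hoodie.table.format=native` by default, other formats via pluggable adapters) can be sketched as a tiny config helper. The set of accepted non-native values depends on which adapters are installed (e.g. via Apache XTable), so treat the list below as an assumption, not the definitive set:

```python
# Assumed accepted values; "native" is the documented default, the others
# depend on installed format adapters.
KNOWN_FORMATS = {"native", "iceberg", "delta"}

def table_format_option(fmt="native"):
    """Return the table-format option for a hypothetical table-create helper."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown table format: {fmt}")
    return {"hoodie.table.format": fmt}

print(table_format_option())  # {'hoodie.table.format': 'native'}
```

This mirrors the decoupling the doc describes: the same storage-engine configs apply regardless of which value this one key takes.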
diff --git a/website/src/pages/ecosystem.md b/website/src/pages/ecosystem.md
index d5391f3e4263..8c0cdb46ec57 100644
--- a/website/src/pages/ecosystem.md
+++ b/website/src/pages/ecosystem.md
@@ -1,8 +1,8 @@
---
-title: Ecosystem
+title: Integrations
---
-# Ecosystem Support
+# Integrations
While Apache Hudi works seamlessly with various application frameworks, SQL
query engines, and data warehouses, some systems might only offer read
capabilities.
In such cases, you can leverage another tool like Apache Spark or Apache Flink
to write data to Hudi tables and then use the read-compatible system for
querying.
diff --git a/website/src/pages/roadmap.md b/website/src/pages/roadmap.md
index e5bd113dac97..f678c3785137 100644
--- a/website/src/pages/roadmap.md
+++ b/website/src/pages/roadmap.md
@@ -24,18 +24,15 @@ down by areas on our [stack](/docs/hudi_stack).
| Feature | Target Release | Tracking |
|------------------------------------------------------|----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Introduce `.abort` state in the timeline | 1.1.0 | [HUDI-8189](https://issues.apache.org/jira/browse/HUDI-8189) |
-| Schema tracking in metadata table | 1.1.0 | [HUDI-6778](https://issues.apache.org/jira/browse/HUDI-6778) |
-| Variant type support on Spark 4 | 1.1.0 | [HUDI-9046](https://issues.apache.org/jira/browse/HUDI-9046) |
-| Non-blocking updates during clustering | 1.1.0 | [HUDI-1045](https://issues.apache.org/jira/browse/HUDI-1045) |
-| Track schema in metadata table | 1.1.0 | [HUDI-6778](https://issues.apache.org/jira/browse/HUDI-6778) |
-| Enable partial updates for CDC workload payload | 1.1.0 | [HUDI-7229](https://issues.apache.org/jira/browse/HUDI-7229) |
-| NBCC for MDT writes | 1.1.0 | [HUDI-8480](https://issues.apache.org/jira/browse/HUDI-8480) |
-| Index abstraction for writer and reader | 1.1.0 | [HUDI-9176](https://issues.apache.org/jira/browse/HUDI-9176) |
-| Vector search index | 1.1.0 | [HUDI-9047](https://issues.apache.org/jira/browse/HUDI-9047) |
-| Bitmap index | 1.1.0 | [HUDI-9048](https://issues.apache.org/jira/browse/HUDI-9048) |
-| Native HFile Writer and removal of HBase dependency | 1.1.0 | [HUDI-8222](https://issues.apache.org/jira/browse/HUDI-8222) |
-| Pluggable Table Formats in Hudi | 1.1.0 | [RFC-93, HUDI-9332](https://github.com/apache/hudi/blob/master/rfc/rfc-93/rfc-93.md) |
+| Introduce `.abort` state in the timeline | 1.2.0 | [HUDI-8189](https://issues.apache.org/jira/browse/HUDI-8189) |
+| Variant type support on Spark 4 | 1.2.0 | [HUDI-9046](https://issues.apache.org/jira/browse/HUDI-9046) |
+| Non-blocking updates during clustering | 1.2.0 | [HUDI-1045](https://issues.apache.org/jira/browse/HUDI-1045) |
+| Enable partial updates for CDC workload payload | 1.2.0 | [HUDI-7229](https://issues.apache.org/jira/browse/HUDI-7229) |
+| Schema tracking in metadata table | 1.2.0 | [HUDI-6778](https://issues.apache.org/jira/browse/HUDI-6778) |
+| NBCC for MDT writes | 1.2.0 | [HUDI-8480](https://issues.apache.org/jira/browse/HUDI-8480) |
+| Index abstraction for writer and reader | 1.2.0 | [HUDI-9176](https://issues.apache.org/jira/browse/HUDI-9176) |
+| Vector search index | 1.2.0 | [HUDI-9047](https://issues.apache.org/jira/browse/HUDI-9047) |
+| Bitmap index | 1.2.0 | [HUDI-9048](https://issues.apache.org/jira/browse/HUDI-9048) |
| New abstraction for schema, expressions, and filters | 1.2.0 | [RFC-88](https://github.com/apache/hudi/pull/12795) |
| Streaming CDC/Incremental read improvement | 1.2.0 | [HUDI-2749](https://issues.apache.org/jira/browse/HUDI-2749) |
| Supervised table service planning and execution | 1.2.0 | [RFC-43](https://github.com/apache/hudi/pull/4309), [HUDI-4147](https://issues.apache.org/jira/browse/HUDI-4147) |
@@ -50,8 +47,7 @@ down by areas on our [stack](/docs/hudi_stack).
| Feature | Target Release | Tracking |
|---------------------------------------------------------|----------------|----------------------------------------------------------------------------------------------------------------------------|
-| Deprecate Payload and support CDC with built-in merge mode | 1.1.0 | [HUDI-8401](https://issues.apache.org/jira/browse/HUDI-8401) |
-| New Hudi Table Format APIs for Query Integrations | 1.1.0 | [RFC-64](https://github.com/apache/hudi/pull/7080), [HUDI-4141](https://issues.apache.org/jira/browse/HUDI-4141) |
+| New Hudi Table Format APIs for Query Integrations | 1.2.0 | [RFC-64](https://github.com/apache/hudi/pull/7080), [HUDI-4141](https://issues.apache.org/jira/browse/HUDI-4141) |
| Snapshot view management | 1.2.0 | [RFC-61](https://github.com/apache/hudi/pull/6576), [HUDI-4677](https://issues.apache.org/jira/browse/HUDI-4677) |
| Support of verification with multiple event_time fields | 1.2.0 | [RFC-59](https://github.com/apache/hudi/pull/6382), [HUDI-4569](https://issues.apache.org/jira/browse/HUDI-4569) |
@@ -60,14 +56,9 @@ down by areas on our [stack](/docs/hudi_stack).
| Feature | Target Release | Tracking |
|---------------------------------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Improve metadata table write DAG on Spark | 1.1.0 | [HUDI-8462](https://issues.apache.org/jira/browse/HUDI-8462) |
-| Optimize performance with engine-native records on Flink | 1.1.0 | [HUDI-8799](https://issues.apache.org/jira/browse/HUDI-8799) |
-| File group reader integration on Flink | 1.1.0 | [HUDI-6788](https://issues.apache.org/jira/browse/HUDI-6788) |
-| File group reader integration with MDT read path | 1.1.0 | [HUDI-8720](https://issues.apache.org/jira/browse/HUDI-8720) |
-| Default Java 17 support | 1.1.0 | [HUDI-6506](https://issues.apache.org/jira/browse/HUDI-6506) |
-| Spark 4 Support | 1.1.0 | [HUDI-7915](https://issues.apache.org/jira/browse/HUDI-7915) |
-| Spark datasource V2 read | 1.1.0 | [HUDI-4449](https://issues.apache.org/jira/browse/HUDI-4449) |
-| Simplification of engine integration and module organization | 1.1.0 | [HUDI-9502](https://issues.apache.org/jira/browse/HUDI-9502) |
+| Default Java 17 support | 1.2.0 | [HUDI-6506](https://issues.apache.org/jira/browse/HUDI-6506) |
+| Spark datasource V2 read | 1.2.0 | [HUDI-4449](https://issues.apache.org/jira/browse/HUDI-4449) |
+| Simplification of engine integration and module organization | 1.2.0 | [HUDI-9502](https://issues.apache.org/jira/browse/HUDI-9502) |
| End-to-end DataFrame write path on Spark | 1.2.0 | [HUDI-9019](https://issues.apache.org/jira/browse/HUDI-9019), [HUDI-4857](https://issues.apache.org/jira/browse/HUDI-4857) |
| Support Hudi 1.0 release in Presto Hudi Connector | Presto Release / Q2 | [HUDI-3210](https://issues.apache.org/jira/browse/HUDI-3210) |
| Support of new indexes in Presto Hudi Connector | Presto Release / Q3 | [HUDI-4394](https://issues.apache.org/jira/browse/HUDI-4394), [HUDI-4552](https://issues.apache.org/jira/browse/HUDI-4552) |
@@ -78,7 +69,7 @@ down by areas on our [stack](/docs/hudi_stack).
| Feature | Target Release | Tracking |
|---------------------------------------------------------------------------------------------------|----------------|----------------------------------------------------------------------------------------------------------------------------------------|
-| Syncing as non-partitoned tables in catalogs | 1.1.0 | [HUDI-9503](https://issues.apache.org/jira/browse/HUDI-9503) |
+| Syncing as non-partitoned tables in catalogs | 1.2.0 | [HUDI-9503](https://issues.apache.org/jira/browse/HUDI-9503) |
| Hudi Reverse streamer | 1.2.0 | [RFC-70](https://github.com/apache/hudi/pull/9040) |
| Diagnostic Reporter | 1.2.0 | [RFC-62](https://github.com/apache/hudi/pull/6600) |
| Mutable, Transactional caching for Hudi Tables (could be accelerated based on community feedback) | 2.0.0 | [Strawman design](https://docs.google.com/presentation/d/1QBgLw11TM2Qf1KUESofGrQDb63EuggNCpPaxc82Kldo/edit#slide=id.gf7e0551254_0_5), [HUDI-6489](https://issues.apache.org/jira/browse/HUDI-6489) |
@@ -88,5 +79,4 @@ down by areas on our [stack](/docs/hudi_stack).
## Developer Experience
| Feature | Target Release | Tracking |
|---------------------------------------------------------|----------------|------------------------------------------|
-| Support code coverage report and improve test coverage | 1.1.0 | [HUDI-9015](https://issues.apache.org/jira/browse/HUDI-9015) |
-| Clean up tech debt and deprecate unused code | 1.1.0 | [HUDI-9054](https://issues.apache.org/jira/browse/HUDI-9054) |
+| Clean up tech debt and deprecate unused code | 1.2.0 | [HUDI-9054](https://issues.apache.org/jira/browse/HUDI-9054) |
diff --git a/website/static/assets/images/hudi_stack/pluggable_tf.png b/website/static/assets/images/hudi_stack/pluggable_tf.png
new file mode 100644
index 000000000000..58ddeb8cc1fc
Binary files /dev/null and b/website/static/assets/images/hudi_stack/pluggable_tf.png differ
diff --git a/website/versioned_docs/version-1.1.0/hudi_stack.md b/website/versioned_docs/version-1.1.0/hudi_stack.md
index 189c9840b727..67e2bdbb1b8f 100644
--- a/website/versioned_docs/version-1.1.0/hudi_stack.md
+++ b/website/versioned_docs/version-1.1.0/hudi_stack.md
@@ -68,7 +68,12 @@ all Base Files is required. Read more about the various table types in Hudi [tab
## Pluggable Table format
-Starting with Hudi 1.1, Hudi introduces a pluggable table format framework
that extends Hudi's powerful storage engine capabilities beyond its native
format to other table formats like Apache Iceberg and Delta Lake. This
framework decouples Hudi's core capabilities—transaction management, indexing,
concurrency control, and table services—from the specific storage format used
for data files. Hudi provides native format support (configured via
`hoodie.table.format=native` by default), whil [...]
+Starting with Hudi 1.1, Hudi introduces a pluggable table format framework
that extends Hudi's powerful storage engine capabilities beyond its native
format to other table formats like Apache Iceberg and Delta Lake. This
framework decouples Hudi's core capabilities—transaction management, indexing,
concurrency control, and table services—from the specific storage format used
for data files.
+
+
+<p align = "center">Pluggable Table Format</p>
+
+Hudi provides native format support (configured via
`hoodie.table.format=native` by default), while [Apache XTable
(incubating)](https://xtable.apache.org/) supplies pluggable format adapters
for formats like Iceberg and Delta Lake. The framework enables organizations to
choose the right format for each use case while maintaining a unified
operational experience and leveraging Hudi's sophisticated storage engine
across all formats. For example, you can write high-frequency updates to a H
[...]
## Storage Engine