Re: [D] 1.2 Release Planning [hudi]

2026-01-12 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [x] Flink CDC Hudi sink connector: https://github.com/apache/hudi/issues/17649
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI/SI support for Flink reading/writing: 
https://github.com/apache/hudi/issues/17647; 
https://github.com/apache/hudi/pull/17610/
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: 
https://github.com/apache/hudi/issues/14310; 
https://github.com/apache/hudi/pull/17827



GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741


This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]



Re: [D] 1.2 Release Planning [hudi]

2026-01-12 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [x] Flink CDC Hudi sink connector: https://github.com/apache/hudi/issues/17649
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/issues/17647; 
https://github.com/apache/hudi/pull/17610/
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: 
https://github.com/apache/hudi/issues/14310; 
https://github.com/apache/hudi/pull/17827



GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2026-01-12 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [x] Flink CDC Hudi sink connector: https://github.com/apache/hudi/issues/17649
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/issues/17647; 
https://github.com/apache/hudi/pull/17610/
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-24 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [x] Flink CDC Hudi sink connector: https://github.com/apache/hudi/issues/17649
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/issues/17647
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-19 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Flink CDC Hudi sink connector: https://github.com/apache/hudi/issues/17649
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/issues/17647
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-18 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Flink CDC Hudi sink connector: 
https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink: https://github.com/apache/hudi/issues/17639
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-18 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Flink CDC Hudi sink connector: 
https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [x] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-18 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Flink CDC Hudi sink connector: 
https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [ ] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-10 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] End-to-end row writing including Hudi streamer
- [ ] Spark 4.1 support (once Spark release is available)
- [ ] Deprecation and clean-up
- [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-12-10 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] End-to-end row writing including Hudi streamer (?)
- [ ] Spark 4.1 support (once Spark release is available)
- [ ] Deprecation and clean-up
- [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-12-10 Thread via GitHub


GitHub user yihua added a comment to the discussion: 1.2 Release Planning

Spark 4.1.0-preview4 is out 
(https://spark.apache.org/news/spark-4-1-0-preview4-released.html).  We will 
likely need to add Spark 4.1 support for Hudi release 1.2.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15223444





Re: [D] 1.2 Release Planning [hudi]

2025-12-09 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Flink CDC Hudi sink connector: 
https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [ ] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-09 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/hudi/issues/14252
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27 https://github.com/apache/hudi/issues/17038
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [ ] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-09 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308
- [ ] Flink 2.1 support https://github.com/apache/hudi/issues/17533


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user danny0405 deleted a comment on the discussion: 1.2 Release Planning

- [ ] Dynamic bucket scaling for bucket index: (still planning)
Postpone this to 1.3, because I do not have a clear idea for this yet, but 
it's a real pain point in production.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15205507





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user danny0405 added a comment to the discussion: 1.2 Release Planning

- [ ] Dynamic bucket scaling for bucket index: (still planning)
Postpone this to 1.3, because I do not have a clear idea for this yet, but 
it's a real pain point in production.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15205507





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@shangxinli Besides RLI/upsert support, what do you see as the biggest needs?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15204506





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink reading/writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

> - [ ] Dynamic bucket scaling for bucket index: (still planning)

Pushed to after 1.2.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15204499





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink (@danny0405 to write issue on gaps/...)
- [ ] RLI support for Flink writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar reopened a discussion: 1.2 Release Planning

Starting this discussion to plan out the 1.2 release. I propose moving to a 
faster release cycle this time, where we pick 1-2 key features per track and 
release when they are ready to go out. 

Will start separate comments on each area and mention contributors I know, who 
are working on those. 




GitHub link: https://github.com/apache/hudi/discussions/14307





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

I think we can close this discussion thread, with the agreed-upon scope across 
each area.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15201501





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar closed a discussion: 1.2 Release Planning

Starting this discussion to plan out the 1.2 release. I propose moving to a 
faster release cycle this time, where we pick 1-2 key features per track and 
release when they are ready to go out. 

Will start separate comments on each area and mention contributors I know, who 
are working on those. 




GitHub link: https://github.com/apache/hudi/discussions/14307





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@HuangZhenQiu we would like to get the 1.2 Flink items finalized. Can you 
please chime in here?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15201498





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] End-to-end row writing including Hudi streamer (?)
- [ ] Deprecation and clean-up
- [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

I agree. We can punt datasource v2 for 1.2.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15201481





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

I agree. We can punt datasource v2.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15201481





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] DataSource V2 support (?)
- [ ] End-to-end row writing including Hudi streamer (?)
- [ ] Deprecation and clean-up
  - [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user geserdugarov edited a comment on the discussion: 1.2 Release 
Planning

@vinothchandar , @yihua , about DSv2, I propose not to include this task for 
1.2 release scope. I would say that milestone at 1.2 release would be POC 
implementation (draft PR) with benchmark results, and full design of 
corresponding RFC with writer path considering V1 and V2. And then, proper 
review could possibly take a lot of time, for instance, review of [Spark 4 
support](https://github.com/apache/hudi/pull/12772) took about 7 months (from 
Feb to Sep). So, the optimistic scenario is to prepare production ready 
functionality for 1.3 release. In negative scenarios, it will take more time 
depending on RFC design feedback.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15195415





Re: [D] 1.2 Release Planning [hudi]

2025-12-08 Thread via GitHub


GitHub user geserdugarov added a comment to the discussion: 1.2 Release Planning

@vinothchandar , @yihua , about DSv2, I propose not to include this task for 
1.2 release scope. I would say that milestone at 1.2 release would be POC 
implementation (draft PR) with benchmark results, and full design of 
corresponding RFC with writer path considering V1 and V2. And then, proper 
review could possibly take a lot of time, for instance, review of [Spark 4 
support](https://github.com/apache/hudi/pull/12772) took about 7 months (from 
Feb to Sep). So, in the optimistic scenario, we prepare production-ready 
functionality for the 1.3 release. In negative scenarios, it will take more 
time, depending on RFC design feedback.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15195415





Re: [D] 1.2 Release Planning [hudi]

2025-12-06 Thread via GitHub


GitHub user danny0405 added a comment to the discussion: 1.2 Release Planning

@vinothchandar it is already supported. I reached out to Uber/Peter, and he 
mentioned something related to the notification between clustering schedule 
and execution. That is beyond the scope of Flink, though, and needs further 
clarification.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15178918





Re: [D] 1.2 Release Planning [hudi]

2025-12-05 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@danny0405 what about async clustering? Can we write up an issue that describes 
the current state and gaps?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15177543





Re: [D] 1.2 Release Planning [hudi]

2025-12-05 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: https://github.com/apache/flink-cdc/pull/4164
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: 
https://github.com/apache/hudi/discussions/17452
- [ ] New Flink source with FLIP 27: RFC-design -> 
https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-04 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: 
[PR](https://github.com/apache/flink-cdc/pull/4164)
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: 
[design](https://github.com/apache/hudi/discussions/17452)
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-03 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

nvm: just saw this. 
https://github.com/apache/hudi/discussions/13955#discussioncomment-15059391 
Will go over that. 

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15152343





Re: [D] 1.2 Release Planning [hudi]

2025-12-03 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

Do we know the scope? But yes, aligning with how the type is defined in RFC-99 
would be ideal. 

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15152339





Re: [D] 1.2 Release Planning [hudi]

2025-12-03 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

We can give this 2 days, till EOD Friday PST, and freeze the 1.2 scope as 
described here so far. 



GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15152332





Re: [D] 1.2 Release Planning [hudi]

2025-12-03 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: 
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: 
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-03 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

Datasource v2: if someone can write a concise explanation of what it buys us 
and how we retain the writer path for upsert as-is, that would be great. 
Otherwise, let's revisit later?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15152309





Re: [D] 1.2 Release Planning [hudi]

2025-12-02 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-1781237)
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-2551077)
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [x] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-02 Thread via GitHub


GitHub user yihua added a comment to the discussion: 1.2 Release Planning

Given that there are still users on Spark 3.3, for facilitating 1.x adoption 
and better upgrade experience, let's keep Spark 3.3 support in release 1.2 and 
deprecate Spark 3.3 support in release 1.3.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15140228





Re: [D] 1.2 Release Planning [hudi]

2025-12-01 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-1781237)
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-2551077)
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support
- [ ] RLI index support for read

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-12-01 Thread via GitHub


GitHub user cshuo added a comment to the discussion: 1.2 Release Planning

Besides RLI support for Flink writing, do we also need scan optimization based 
on RLI?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15130585





Re: [D] 1.2 Release Planning [hudi]

2025-11-27 Thread via GitHub


GitHub user suryaprasanna edited a comment on the discussion: 1.2 Release 
Planning

@yihua If maintaining compatibility with older Spark versions is blocking or 
hindering progress on datasource V2 APIs, we can proceed with deprecating Spark 
3.3.
CC @prashantwason

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15093303





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user suryaprasanna added a comment to the discussion: 1.2 Release 
Planning

@yihua If maintaining compatibility with older Spark versions blocks progress 
on datasource V2 APIs, we can proceed with deprecating Spark 3.3.
CC @prashantwason

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15093303





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Storage Engine

- [ ] Performance certification of SI/RLI reads/writes. 
- [ ] Schema abstraction – don’t use Avro directly. 
- [ ] `HoodieStorage` enhancement on reader and writer paths

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027734





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user yihua added a comment to the discussion: 1.2 Release Planning

There are `HoodieStorage` enhancements that @mansipp and @CTTY would like to 
contribute.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15092464





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] DataSource V2 support (?)
- [ ] End-to-end row writing including Hudi streamer (?)
- [ ] Deprecation and clean-up
  - [ ] Deprecate Spark 3.3 support
  - [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user yihua added a comment to the discussion: 1.2 Release Planning

@suryaprasanna Maintaining the Hudi integration for more than 3 or 4 Spark 
versions adds development overhead: we need to test new features against all 
of these versions and account for version-specific APIs, since Hudi is deeply 
integrated with Spark internals. Some logic is specific to Spark 3.3 and can 
be removed once Spark 3.3 is deprecated, simplifying the code base. The latest 
patch release of Spark 3.3 is Spark 3.3.4, which was released almost 2 years 
ago. Spark 4.1.0 is coming soon 
(https://spark.apache.org/news/spark-4-1-0-preview4-released.html). So IMO we 
should gradually remove support for old Spark versions as new Spark releases 
arrive, and focus on newer versions (e.g., Spark 3.5.x and the latest Spark 
4.x, which are actively maintained by the Spark community) to save our 
development cycles. Wdyt?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15091345





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user yihua added a comment to the discussion: 1.2 Release Planning

@geserdugarov Yes, DataSource V2 support takes more than 1 release cycle based 
on the current timeline.  If we can scope out the work to be done for 
DataSource V2 support, especially for the performance, we should get some 
ground work done in this story.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15091289





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user suryaprasanna edited a comment on the discussion: 1.2 Release 
Planning

My bad, I missed it. Then can we skip deprecating Spark 3.3 in the Hudi 1.2 
release and directly remove 3.3 and 3.4 as part of the 2.x release?
Is there a specific reason we want to remove this support, or is it simply part 
of a standard deprecation process because it is outdated?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15089552





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user suryaprasanna added a comment to the discussion: 1.2 Release 
Planning

My bad, I missed it. Then can we skip deprecating Spark 3.3 in the Hudi 1.2 
release and directly remove 3.3 and 3.4 as part of the 2.x release?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15089552





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user geserdugarov edited a comment on the discussion: 1.2 Release 
Planning

> @yihua Can we keep Spark 3.3 version and deprecate the older spark versions 
> like 2.4, 3.1, 3.2? As part of Hudi 2.0 we can deprecate all versions older 
> than Spark 3.5.

But we already removed support of Spark 2.4 
https://github.com/apache/hudi/pull/11788 and 3.0-3.2 
https://github.com/apache/hudi/pull/11692 .

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15083710





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user geserdugarov added a comment to the discussion: 1.2 Release Planning

> @yihua Can we keep Spark 3.3 version and deprecate the older spark versions 
> like 2.4, 3.1, 3.2? As part of Hudi 2.0 we can deprecate all versions older 
> than Spark 3.5.

But we already removed support of Spark 2.4, 3.0-3.2:
https://github.com/apache/hudi/pull/11788
https://github.com/apache/hudi/pull/11692

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15083710





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-1781237)
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-2551077)
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)
- [ ] Flink 2.1 support

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-26 Thread via GitHub


GitHub user cshuo added a comment to the discussion: 1.2 Release Planning

Maybe we should also include Flink 2.1 support, which brings some useful new 
features, e.g., variant type and delta join.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15083573





Re: [D] 1.2 Release Planning [hudi]

2025-11-25 Thread via GitHub


GitHub user suryaprasanna added a comment to the discussion: 1.2 Release 
Planning

@yihua Can we keep Spark 3.3 version and deprecate the older spark versions 
like 2.4, 3.1, 3.2? As part of Hudi 2.0 we can deprecate all versions older 
than Spark 3.5.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15083464





Re: [D] 1.2 Release Planning [hudi]

2025-11-25 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-1781237)
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: [internal 
doc](https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-2551077)
- [ ] New Flink source with FLIP 27: 
[RFC-design](https://github.com/apache/hudi/pull/13381)
- [ ] LSM file layout for streaming: 
[issue](https://github.com/apache/hudi/issues/14310)
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
[issue](https://github.com/apache/hudi/issues/14308)

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-25 Thread via GitHub


GitHub user geserdugarov edited a comment on the discussion: 1.2 Release 
Planning

I suppose that DataSource V2 support would take more than 1 release cycle.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15082195





Re: [D] 1.2 Release Planning [hudi]

2025-11-25 Thread via GitHub


GitHub user geserdugarov added a comment to the discussion: 1.2 Release Planning

I suppose that DataSource V2 support could take more than 1 release cycle.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15082195





Re: [D] 1.2 Release Planning [hudi]

2025-11-25 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))
- [ ] DataSource V2 support (?)
- [ ] Deprecation and clean-up
  - [ ] Deprecate Spark 3.3 support
  - [ ] Remove glob support, unused relations

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-11-24 Thread via GitHub


GitHub user danny0405 added a comment to the discussion: 1.2 Release Planning

Yes, it's useful. I can help with the code review of these two PRs.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15070383





Re: [D] 1.2 Release Planning [hudi]

2025-11-24 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution: (internal) 
https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-1781237
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing: (internal) 
https://app.clickup.com/18029943/v/dc/h67bq-7684/h67bq-2551077
- [ ] New Flink source with FLIP 27: https://github.com/apache/hudi/pull/13381
- [ ] LSM file layout for streaming: https://github.com/apache/hudi/issues/14310
- [ ] Dynamic bucket scaling for bucket index: (still planning)
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`: 
https://github.com/apache/hudi/issues/14308

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-24 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

Yes, absolutely @parisni.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15069186





Re: [D] 1.2 Release Planning [hudi]

2025-11-22 Thread via GitHub


GitHub user parisni added a comment to the discussion: 1.2 Release Planning

May contributors suggest items? If so, let me propose full support for table 
documentation (doc, flat and nested comments), both for reading from Hudi and 
for writing into Hudi, from Spark, Trino, or any HMS/Glue/S3 Hudi-compliant 
reader:
- https://github.com/apache/hudi/pull/14234
- https://github.com/apache/hudi/pull/14235

Comment support was committed back in 2022 
(https://github.com/apache/hudi/pull/4960), but the feature has never worked 
due to the lack of support for extracting comments from the Hudi Avro schema.

Nested comment support allows Spark IO to propagate comments transparently 
through the transformation lineage, which is AFAIK unique among ACID formats: 
Delta and Iceberg do not support nested comments, only first-level ones.

If this is of interest, I can also work on backporting this to 0.15, since it 
can be considered a bug fix.


GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15046396





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution??
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing.
- [ ] New Flink source with FLIP 27 ?
- [ ] LSM file layout for streaming
- [ ] Dynamic bucket scaling for bucket index
- [ ] Virtual column read support, e.g., the `_hoodie_commit_time`

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user danny0405 added a comment to the discussion: 1.2 Release Planning

@voonhous yes.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15034017





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution??
- [ ] Async clustering for Flink 
- [ ] RLI support for Flink writing.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user the-other-tim-brown added a comment to the discussion: 1.2 Release 
Planning

Do we want to include Variant support or is that expanding the scope too far?

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15029967





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user voonhous added a comment to the discussion: 1.2 Release Planning

CMIIW, automatic schema evolution refers to Hudi support as a sink for 
Flink-CDC? @cshuo @danny0405 

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15029327





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite
- [ ] Explore removing caching on upsert path 
([JIRA](https://issues.apache.org/jira/browse/HUDI-860))

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Storage Engine

- [ ] Performance certification of SI/RLI reads/writes. 
- [ ] Schema abstraction – don’t use Avro directly. 

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027734





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Storage Engine

- [ ] Performance certification of SI/RLI reads/writes. 
- [ ] Schema abstraction – don’t use Avro directly. 
- [ ] 

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027734





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a discussion: 1.2 Release Planning

Starting this discussion to plan out the 1.2 release. I propose moving to a 
faster release cycle this time, where we pick 1-2 key features per track and 
release when they are ready to go out. 

Will start separate comments on each area and mention contributors I know, who 
are working on those. 




GitHub link: https://github.com/apache/hudi/discussions/14307





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@cshuo @danny0405 to update items along the theme of making Flink more 
performant and reliable, and bringing it to full parity with the Spark engine 
support.

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027855





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@yihua to update

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027827





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Flink

- [ ] Automatic Schema Evolution??

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Spark 

- [ ] MIT rewrite

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

@rahil-c @the-other-tim-brown @voonhous to confirm/add/remove

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027834





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Storage Format

- [ ] Type system (RFC-99) (to have support for) discussion 
[thread](https://github.com/apache/hudi/discussions/14253)
- [ ] Blob type & Vector types for RFC-100
- [ ] New file format support for storing unstructured content (RFC-100): 
Lance file format integration 
([board](https://github.com/apache/hudi/discussions/14128))
- [ ] Vector Index (ANN) (review Surya's 
[PR](https://github.com/apache/hudi/blob/a71a6f67d5416e13453104d6d0e2bcdadf241945/rfc/rfc-103/rfc-103.md))
- [ ] Spark TVF for Vector Search ([RFC 
102](https://github.com/apache/hudi/pull/14218)), with end-to-end integration 
with a RAG application to query tables using text

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027732





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar edited a comment on the discussion: 1.2 Release 
Planning

# Storage Engine

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027734





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

# Flink

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027741





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

Storage Engine

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027734





Re: [D] 1.2 Release Planning [hudi]

2025-11-20 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

# Storage Format



GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15027732





Re: [D] 1.2 Release Planning [hudi]

2025-11-19 Thread via GitHub


GitHub user vinothchandar added a comment to the discussion: 1.2 Release 
Planning

# Spark 



GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020269





Re: [D] 1.2 Release Planning [hudi]

2025-11-19 Thread via GitHub


GitHub user rahil-c added a comment to the discussion: 1.2 Release Planning

I can write items

GitHub link: 
https://github.com/apache/hudi/discussions/14307#discussioncomment-15020272

