[ 
https://issues.apache.org/jira/browse/FLINK-34973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-34973:
--------------------------------
    Description: 
This is a sub-FLIP of the disaggregated state management effort and its related 
work. Please read FLIP-423 first for the full context.

FLIP-424 introduces asynchronous state APIs with callbacks, allowing state 
access to be executed in threads separate from the task thread, which makes 
better use of I/O bandwidth and improves throughput. This FLIP proposes an 
execution framework for asynchronous state APIs. The execution code path for 
the new API is completely independent of the original one, and many runtime 
components are redesigned. We intend to delve into the challenges associated 
with asynchronous execution and provide an in-depth design analysis for each 
module. Furthermore, we will analyze the performance of the new framework 
relative to the current implementation and examine how it measures up against 
other potential alternatives.
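
To make the idea of callback-based state access concrete, here is a minimal 
sketch in plain Java. It is not the FLIP-424 API or the FLIP-425 framework: 
the class AsyncStateSketch, the method countWords, and the two executors 
(ioPool standing in for state I/O threads, taskThread standing in for the task 
thread) are all hypothetical names chosen for illustration. The sketch also 
assumes that records with the same key must be processed in arrival order, and 
enforces this by chaining each record after the previous one for that key.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Illustrative only: none of these names are Flink APIs. */
public class AsyncStateSketch {

    /** Counts words; state reads and writes run off the "task thread". */
    static Map<String, Long> countWords(List<String> words) {
        // Hypothetical stand-ins for state I/O threads and the task thread.
        ExecutorService ioPool = Executors.newFixedThreadPool(4);
        ExecutorService taskThread = Executors.newSingleThreadExecutor();
        Map<String, Long> store = new ConcurrentHashMap<>();
        // Tail of the in-flight chain per key (assumed per-key ordering).
        Map<String, CompletableFuture<Void>> lastPerKey = new HashMap<>();
        try {
            for (String word : words) {
                CompletableFuture<Void> previous = lastPerKey.getOrDefault(
                        word, CompletableFuture.completedFuture(null));
                CompletableFuture<Void> done = previous
                        // Asynchronous state read on the I/O pool.
                        .thenApplyAsync(ignored -> store.getOrDefault(word, 0L), ioPool)
                        // User callback runs on the single task thread.
                        .thenApplyAsync(count -> count + 1, taskThread)
                        // Asynchronous state write on the I/O pool.
                        .thenAcceptAsync(updated -> store.put(word, updated), ioPool);
                lastPerKey.put(word, done);
            }
            // Wait until every per-key chain has finished.
            CompletableFuture
                    .allOf(lastPerKey.values().toArray(new CompletableFuture[0]))
                    .join();
            return new TreeMap<>(store);
        } finally {
            ioPool.shutdown();
            taskThread.shutdown();
        }
    }

    public static void main(String[] args) {
        // Prints {a=2, b=1}
        System.out.println(countWords(List.of("a", "b", "a")));
    }
}
```

The loop never blocks on a state read, so records for different keys can be 
in flight at the same time, while the per-key chain keeps read-modify-write 
sequences for one key from interleaving.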

  was:
The past decade has witnessed dramatic shifts in Flink's deployment modes and 
workload patterns, along with hardware improvements. We've moved from the 
map-reduce era, where workers were nodes with tightly coupled computation and 
storage, to a cloud-native world where containerized deployments on Kubernetes 
have become standard. To enable Flink's cloud-native future, we introduce 
Disaggregated State Storage and Management, which uses DFS as primary storage 
in Flink 2.0, as promised in the Flink 2.0 Roadmap.

Detailed design and story: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046855

Also sub-FLIPs:
- Asynchronous State APIs 
([FLIP-424|https://cwiki.apache.org/confluence/x/SYp3EQ]): Introduce new APIs 
for asynchronous state access. 
- Asynchronous Execution Model 
([FLIP-425|https://cwiki.apache.org/confluence/x/S4p3EQ]): Implement a 
non-blocking execution model leveraging the asynchronous APIs introduced in 
FLIP-424. 
- Grouping Remote State Access 
([FLIP-426|https://cwiki.apache.org/confluence/x/TYp3EQ]): Enable retrieval of 
remote state data in batches to avoid unnecessary round-trip costs for remote 
access. 
- Disaggregated State Store 
([FLIP-427|https://cwiki.apache.org/confluence/x/T4p3EQ]): Introduce the 
initial version of the ForSt disaggregated state store.
- Fault Tolerance/Rescale Integration 
([FLIP-428|https://cwiki.apache.org/confluence/x/UYp3EQ]): Integrate 
checkpointing mechanisms with the disaggregated state store for fault tolerance 
and fast rescaling.


> FLIP-425: Asynchronous Execution Model
> --------------------------------------
>
>                 Key: FLINK-34973
>                 URL: https://issues.apache.org/jira/browse/FLINK-34973
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Zakelly Lan
>            Priority: Major
>             Fix For: 2.0.0
>
>
> This is a sub-FLIP of the disaggregated state management effort and its 
> related work. Please read FLIP-423 first for the full context.
> FLIP-424 introduces asynchronous state APIs with callbacks, allowing state 
> access to be executed in threads separate from the task thread, which makes 
> better use of I/O bandwidth and improves throughput. This FLIP proposes an 
> execution framework for asynchronous state APIs. The execution code path for 
> the new API is completely independent of the original one, and many 
> runtime components are redesigned. We intend to delve into the challenges 
> associated with asynchronous execution and provide an in-depth design 
> analysis for each module. Furthermore, we will analyze the performance of 
> the new framework relative to the current implementation and examine how 
> it measures up against other potential alternatives.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
