[ 
https://issues.apache.org/jira/browse/FLINK-39629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079276#comment-18079276
 ] 

featzhang commented on FLINK-39629:
-----------------------------------

I would like to work on this sub-task under the FLINK-39625 umbrella. Could a 
committer please assign it to me (Jira username: featzhang)? Thanks!

> Add Async I/O operator that invokes GPU sidecar from DataStream API
> -------------------------------------------------------------------
>
>                 Key: FLINK-39629
>                 URL: https://issues.apache.org/jira/browse/FLINK-39629
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / DataStream
>            Reporter: featzhang
>            Priority: Major
>              Labels: gpu, model-inference
>
> h2. Background
> With a functional sidecar that batches inference requests, user programs
> written against the DataStream API need a convenient, correctly ordered,
> back-pressure-friendly way to invoke the sidecar. Flink's existing Async
> I/O subsystem ({{AsyncDataStream}}, {{AsyncFunction}},
> {{RichAsyncFunction}}) already provides the required machinery for
> ordered / unordered result emission, timeouts, and capacity limits.
> This sub-task adds a first-party async function that delegates record-by-
> record inference to the sidecar.
> h2. Scope of this sub-task
> * Add {{GpuSidecarAsyncFunction<IN, OUT>}} in a new package under
>  {{flink-streaming-java}} or a dedicated {{flink-gpu-client}} module.
>  Constructor parameters:
> ** Sidecar endpoint (UDS or TCP).
> ** Request timeout.
> ** Maximum inflight requests (capacity).
> ** A pluggable {{TensorCodec<IN, OUT>}} that converts records to / from
>  the sidecar's tensor wire format.
> * Manage one RPC channel per parallel subtask; reconnect with exponential
>  backoff.
> * Honour Flink checkpointing: inflight requests are drained on checkpoint
>  barriers when ordered emission is requested.
> * Surface the sidecar's structured error codes as
>  {{AsyncFunction}}-level failures so the user program can apply Flink's
>  standard restart strategies.
> h2. Out of scope
> * Table / SQL integration (tracked in a later sub-task).
> * Scheduling operators onto nodes with a live sidecar (tracked in a later
>  sub-task).
> * Model-hot-reload UX on the DataStream side.
> h2. Acceptance criteria
> * End-to-end test: a job with a source, a
>  {{GpuSidecarAsyncFunction}}, and a sink runs against the mock sidecar
>  on CI and produces deterministic output under ordered emission.
> * Timeout and capacity settings behave as documented.
> * Clean shutdown on job cancellation; no connection leak.
> h2. Affected modules
> * {{flink-streaming-java}} (or new {{flink-gpu-client}})
> * {{flink-tests}} (integration test against mock sidecar)
> h2. Links
> Parent: see umbrella issue linked to this sub-task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to