Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
fredia merged PR #24678: URL: https://github.com/apache/flink/pull/24678 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
Zakelly commented on code in PR #24678: URL: https://github.com/apache/flink/pull/24678#discussion_r1574232324 ## flink-datastream/src/main/java/org/apache/flink/datastream/impl/operators/ProcessOperator.java: ## @@ -46,6 +46,12 @@ public ProcessOperator(OneInputStreamProcessFunction userFunction) { chainingStrategy = ChainingStrategy.ALWAYS; } +@Override +public boolean isAsyncStateProcessingEnabled() { +// For normal operator (without keyed context) the async state processing is unused. Review Comment: I'd prefer make Datastream V2 directly use the new state v2. We will provide fallback of sync state access in new state framework. So no `toAsync()` method and no `AsyncKeyedStream` should be introduced. This would be more clear to users. Of course, for Datastream V1, we won't touch the original state v1 code path, thus we should introduce new stream if we want to integrate state v2 into DS V1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
Zakelly commented on code in PR #24678: URL: https://github.com/apache/flink/pull/24678#discussion_r1574232324 ## flink-datastream/src/main/java/org/apache/flink/datastream/impl/operators/ProcessOperator.java: ## @@ -46,6 +46,12 @@ public ProcessOperator(OneInputStreamProcessFunction userFunction) { chainingStrategy = ChainingStrategy.ALWAYS; } +@Override +public boolean isAsyncStateProcessingEnabled() { +// For normal operator (without keyed context) the async state processing is unused. Review Comment: I'd prefer make Datastream V2 directly use the new state v2. We will provide fallback of sync state access in new state framework. So no `toAsync()` method and `AsyncKeyedStream` should be introduced. This would be more clear to users. Of course, for Datastream V1, we won't touch the original state v1 code path, thus we should introduce new stream if we want to integrate state v2 into DS V1. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
fredia commented on code in PR #24678: URL: https://github.com/apache/flink/pull/24678#discussion_r1574075445 ## flink-datastream/src/main/java/org/apache/flink/datastream/impl/operators/ProcessOperator.java: ## @@ -46,6 +46,12 @@ public ProcessOperator(OneInputStreamProcessFunction userFunction) { chainingStrategy = ChainingStrategy.ALWAYS; } +@Override +public boolean isAsyncStateProcessingEnabled() { +// For normal operator (without keyed context) the async state processing is unused. Review Comment: I was wondering if it would be better to just inherit async on `KeyedProcessOperator`/`KeyedTwoInputNonBroadcastProcessOperator`/`KeyedTwoInputBroadcastProcessOperator`/`KeyedTwoOutputProcessOperator`? BTW, should we offer an `toAsync()` method in `KeyedPartitionStream`? which allows users to set the execution modes of different operators in a fine-grained manner. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
Zakelly commented on code in PR #24678: URL: https://github.com/apache/flink/pull/24678#discussion_r1574052485 ## flink-datastream/src/main/java/org/apache/flink/datastream/impl/operators/ProcessOperator.java: ## @@ -23,15 +23,15 @@ import org.apache.flink.datastream.impl.common.TimestampCollector; import org.apache.flink.datastream.impl.context.DefaultNonPartitionedContext; import org.apache.flink.datastream.impl.context.DefaultRuntimeContext; -import org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator; import org.apache.flink.streaming.api.operators.BoundedOneInput; import org.apache.flink.streaming.api.operators.ChainingStrategy; import org.apache.flink.streaming.api.operators.OneInputStreamOperator; +import org.apache.flink.streaming.runtime.operators.asyncprocessing.AbstractAsyncStateUdfStreamOperator; import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; /** Operator for {@link OneInputStreamProcessFunction}. */ public class ProcessOperator -extends AbstractUdfStreamOperator> +extends AbstractAsyncStateUdfStreamOperator> Review Comment: Thanks for the reminder! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
reswqa commented on code in PR #24678: URL: https://github.com/apache/flink/pull/24678#discussion_r1572117471 ## flink-datastream/src/main/java/org/apache/flink/datastream/impl/operators/ProcessOperator.java: ## @@ -23,15 +23,15 @@ import org.apache.flink.datastream.impl.common.TimestampCollector; import org.apache.flink.datastream.impl.context.DefaultNonPartitionedContext; import org.apache.flink.datastream.impl.context.DefaultRuntimeContext; -import org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator; import org.apache.flink.streaming.api.operators.BoundedOneInput; import org.apache.flink.streaming.api.operators.ChainingStrategy; import org.apache.flink.streaming.api.operators.OneInputStreamOperator; +import org.apache.flink.streaming.runtime.operators.asyncprocessing.AbstractAsyncStateUdfStreamOperator; import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; /** Operator for {@link OneInputStreamProcessFunction}. */ public class ProcessOperator -extends AbstractUdfStreamOperator> +extends AbstractAsyncStateUdfStreamOperator> Review Comment: I think we should also do this for `TwoInputBroadcastProcessOperator`/`TwoInputNonBroadcastProcessOperator` and `TwoOutputProcessOperator` 樂 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
flinkbot commented on PR #24678: URL: https://github.com/apache/flink/pull/24678#issuecomment-2063745225 ## CI report: * 503f01052e16a806da7589a8dc0ed645ff00d282 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [FLINK-35156][Runtime] Make operators of DataStream V2 integrate with async state processing framework [flink]
Zakelly opened a new pull request, #24678: URL: https://github.com/apache/flink/pull/24678 ## What is the purpose of the change This is a simple PR that wire the new introduced operators of DataStream V2 with the `AbstractAsyncStateStreamOperator`. ## Brief change log - Introduce `AbstractAsyncStateUdfStreamOperator` that is nearly identical with `AbstractUdfStreamOperator`, but extends from `AbstractAsyncStateStreamOperator` - Replace base class of `ProcessOperator` (v2) from `AbstractUdfStreamOperator` to `AbstractAsyncStateUdfStreamOperator`. ## Verifying this change This change is a trivial rework without any test coverage. More will be added when the whole state processing works. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org