Thank you for the quick reply, Igal.

Our use case is the following: A stream of data from Kafka is fed into Flink 
where data transformations take place. After that we send that transformed data 
to an inference engine to score the relevance of each record. (Rough 
simplification.)

Doing that using HTTP endpoints is possible, and it is the solution we have in 
place today. However, each request to that endpoint incurs the cost of 
establishing the connection, etc., which increases the latency of the system.

We do process data in batches to mitigate the latency, but that is not the same 
as having a bi-directional stream, which would be possible with gRPC. 
Furthermore, we already use gRPC in other parts of our system.

We also want to be able to scale those endpoints up and down, as demand for the 
service fluctuates depending on the hour and day. Combining StateFun and 
Kubernetes would allow for that elasticity while keeping the state of the 
execution. That matters because an inference is not always just one endpoint, 
but often a collection of them where the output of one becomes the input of the 
next, culminating in the predicted score(s).

We are evaluating StateFun because Flink is already part of the infrastructure. 
That said, gRPC is also part of our requirements, hence the motivation for the 
question.

I’d love to hear more about plans to implement support for gRPC and perhaps 
become an early adopter.

I hope this helps with understanding the use case. Happy to talk further and 
answer more questions.

Best,

Dalmo



From: Igal Shilman <[email protected]>
Date: Saturday, September 19, 2020 at 01:41
To: Dalmo Cirne <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: Support for gRPC in Flink StateFun 2.x

Hi,

Your observation is correct. Currently, the only way to invoke a remote function 
is through an HTTP POST request to a service that exposes a StateFun endpoint.

The endpoint must implement the client side of the “RequestReply” protocol as 
defined by StateFun (basically, an invocation contains the state and message, 
and a response contains a description of the side effects).

While gRPC can easily be added as a replacement for the transport layer, the 
client side (the remote function) would still have to implement the 
RequestReply protocol.

To truly utilize gRPC, we would want to introduce a new type of protocol that 
can exploit the low-latency bi-directional streams to and from the function.

While for the latter it is a bit difficult to commit to a specific date, the 
former can easily be implemented in the next StateFun release.

Would you be able to share with us a little bit more about your original 
motivation for asking this question? :-)
This would help us as we gather more and more use cases.

For example: target language, environment, how gRPC services are being 
discovered.

Thanks,
Igal

On Thursday, September 17, 2020, Dalmo Cirne <[email protected]> wrote:
Hi,

At the latest Flink Forward, in April 2020, it was mentioned that support for 
gRPC, in addition to HTTP, was in the works and would be implemented in the 
future.

Looking into the flink-statefun <https://github.com/apache/flink-statefun> 
repository on GitHub, one can see that there is already some work done with 
gRPC, but parity with its HTTP counterpart is not there yet.

Is there a roadmap or an estimate of when gRPC will be implemented in StateFun?

Thank you,

Dalmo








