Re: Flink Statefun Python Batch

2021-04-22 Thread Igal Shilman
Hi Tim, I've created a tiny PoC; let me know if this helps. I can't guarantee, though, that this is how we'll eventually approach this, but it should be somewhere along these lines. https://github.com/igalshilman/flink-statefun/tree/tim Thanks, Igal. On Thu, Apr 22, 2021 at 6:53 AM Timothy Bess w

Re: Flink Statefun Python Batch

2021-04-21 Thread Timothy Bess
Hi Igal and Konstantin, Wow! I appreciate the offer of creating a branch to test with, but for now we were able to get it working by tuning a few configs and moving other blocking IO out of statefun, so no rush there. That said, if you do add that, I'd definitely switch over. That's great! I'll tr

Re: Flink Statefun Python Batch

2021-04-21 Thread Konstantin Knauf
Hi Igal, Hi Timothy, this sounds very interesting. Both state introspection and OpenTracing support have been requested by multiple users before, so this is certainly something we are willing to invest in. Timothy, would you have time for a 30-minute call in the next few days to understand your use case

Re: Flink Statefun Python Batch

2021-04-21 Thread Igal Shilman
Hi Tim, Yes, I think that this feature can be implemented relatively fast. If this blocks you at the moment, I can prepare a branch for you to experiment with in the coming days. Regarding OpenTracing integration, I think the community can benefit a lot from this, and definitely contrib

Re: Flink Statefun Python Batch

2021-04-20 Thread Timothy Bess
Hi Igal, Yes! That's exactly what I was thinking. The batching will naturally happen as the model applies backpressure. We're using pandas and it's pretty costly to create a dataframe and everything to process a single event. Internally the SDK has access to the batch and is calling my function, w

Re: Flink Statefun Python Batch

2021-04-20 Thread Igal Shilman
Hi Tim! Indeed, the StateFun SDK / StateFun runtime has an internal concept of batching that kicks in in the presence of a slow/congested remote function. Keep in mind that under normal circumstances batching does not happen (effectively a batch of size 1 will be sent). [1] This batch is not curren
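
Editor's note: the behaviour described in the message above (a batch of size 1 under normal circumstances, larger batches only when the remote function is slow) can be illustrated with a toy model. This is only a sketch under assumptions: the class and method names (`BackpressureBatcher`, `on_message`, `on_response`) are hypothetical and are not part of the StateFun SDK or runtime, and the actual implementation may differ.

```python
from collections import deque
from typing import Callable, Deque, List


class BackpressureBatcher:
    """Toy model (hypothetical, not StateFun code).

    At most one request is in flight at a time. Messages that arrive
    while a request is in flight are queued and sent together as the
    next batch once the response comes back.
    """

    def __init__(self, send: Callable[[List[int]], None]) -> None:
        self._send = send                      # callback that "sends" a batch
        self._pending: Deque[int] = deque()    # messages waiting for the next batch
        self._in_flight = False                # True while awaiting a response

    def on_message(self, msg: int) -> None:
        if self._in_flight:
            # The function is busy: accumulate instead of sending.
            self._pending.append(msg)
        else:
            # The function is idle: send immediately as a batch of size 1.
            self._in_flight = True
            self._send([msg])

    def on_response(self) -> None:
        if self._pending:
            # Flush everything that piled up as a single batch.
            batch = list(self._pending)
            self._pending.clear()
            self._send(batch)
        else:
            self._in_flight = False


if __name__ == "__main__":
    sent: List[List[int]] = []
    batcher = BackpressureBatcher(sent.append)
    batcher.on_message(1)   # idle function: sent alone
    batcher.on_response()
    batcher.on_message(2)   # sent alone, request now in flight
    batcher.on_message(3)   # queued behind the in-flight request
    batcher.on_message(4)   # queued behind the in-flight request
    batcher.on_response()   # queued messages leave as one batch
    print(sent)
```

In this model a fast function only ever sees single-message batches, which matches the remark that batching does not happen under normal circumstances; batches grow only when responses lag behind arrivals.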

Flink Statefun Python Batch

2021-04-16 Thread Timothy Bess
Hi everyone, Is there a good way to access the batch of leads that Statefun sends to the Python SDK rather than processing events one by one? We're trying to run our data scientist's machine learning model through the SDK, but the code is very slow when we do single events and we don't get many of