Hey everyone,

I'm one of the maintainers of the Ray project (https://ray.io) and have been 
investigating an Apache Beam runner for Ray. 
Our team has seen a lot of interest from users in this, and we are happy to 
invest significant time. 

I've implemented a proof-of-concept Ray-backed SDK worker[0]. It uses the 
existing Python Fn API Runner and replaces the multi-threading/multi-processing 
SDK workers with distributed Ray workers. This works well (it executes workloads 
on a cluster), but I'm wondering whether this approach can actually scale to 
large multi-node clusters. 

Further, I have a couple of questions about the general worker scheduling 
model, specifically whether we should continue with a polling-based worker 
model or whether we can replace it with a push-based model that uses 
Ray-native data processing APIs to schedule the workload.
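To make the distinction concrete, here's a toy stdlib-only sketch (not actual 
Beam or Ray code; the function names and the squaring "work" are made up for 
illustration) contrasting the two models: in the pull-based model workers poll 
a shared queue for bundles, while in the push-based model a central scheduler 
assigns bundles to workers up front.

```python
import queue
import threading


def pull_based(tasks, num_workers=2):
    """Polling model: workers repeatedly pull work from a shared queue."""
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                t = work.get_nowait()  # poll for the next bundle
            except queue.Empty:
                return  # no more work; worker exits
            r = t * t  # stand-in for executing a bundle
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)


def push_based(tasks, num_workers=2):
    """Push model: a scheduler assigns each bundle to a worker up front."""
    assignments = {w: [] for w in range(num_workers)}
    for i, t in enumerate(tasks):
        assignments[i % num_workers].append(t)  # round-robin placement
    results = []
    for ts in assignments.values():
        results.extend(t * t for t in ts)  # each worker runs its bundles
    return sorted(results)


print(pull_based([1, 2, 3, 4]))  # [1, 4, 9, 16]
print(push_based([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The trade-off I'm curious about is that the push model gives the scheduler 
global placement knowledge (which Ray's own scheduling could exploit), while 
the pull model is what the current Fn API worker handlers effectively do.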

At this point it would be great to get more context from someone who is more 
familiar with the Beam abstractions and codebase, specifically the Python 
implementation. I'd love to get on a call to chat more. 

Thanks and best wishes, 
Kai 

[0] 
https://github.com/krfricke/beam/tree/ray-runner-scratch/sdks/python/apache_beam/runners/ray
