thinkharderdev commented on code in PR #728:
URL: https://github.com/apache/arrow-ballista/pull/728#discussion_r1153374082


##########
ballista/scheduler/src/cluster/kv.rs:
##########
@@ -57,6 +56,8 @@ pub struct KeyValueState<
 > {
     /// Underlying `KeyValueStore`
     store: S,
+    /// ExecutorHeartbeat cache, executor_id -> ExecutorHeartbeat
+    executor_heartbeats: Arc<DashMap<String, ExecutorHeartbeat>>,

Review Comment:
   We have :). The problem is mostly the cost of plan serialization. We 
frequently see execution plans with 50k+ files; such plans are very large and 
take a lot of CPU and memory to serialize.
   
   Aside from that, zero-downtime deployments are much easier with multiple 
active schedulers. That is solvable with leader election, but for our use 
case it was preferable to just run multiple active schedulers and solve both 
the deployment and scalability issues.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

