> On 18 Jan 2017, at 21:50, kant kodali <kanth...@gmail.com> wrote:
> 
> Anyone has any experience using spark in the banking industry? I have couple 
> of questions.

> 2. How can I make spark cluster highly available across multi datacenter? Any 
> pointers?


That's not, AFAIK, been a design goal. The communications and scheduling for 
Spark assume that (a) there's low latency between the executors and the driver, 
and (b) data is close enough to every executor that you don't have to get 
placement exactly right: if work can't be scheduled near the data, running it 
on other nodes is better than not running it at all. Oh, and the failure modes 
are those of a single cluster: node and rack failures, not a single long-haul 
connection that could cut the entire cluster in half. If that happens, all work 
running on the half without the driver is lost, so you don't get the failure 
resilience you wanted: if the DC holding the driver actually failed, the other 
DC would not be able to take over.
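That "run it somewhere else rather than not at all" behaviour is visible in Spark's locality-wait settings: the scheduler waits a short interval for a data-local slot before falling back to a less-local node. A sketch of the relevant configuration (these are the standard property names; the 3s values shown are Spark's defaults, tune to taste):

```
# How long to wait for a data-local slot before degrading locality.
spark.locality.wait          3s
# Per-level overrides: process-local -> node-local -> rack-local -> any.
spark.locality.wait.process  3s
spark.locality.wait.node     3s
spark.locality.wait.rack     3s
```

Note there is no "DC-local" level in that hierarchy, which is another hint that cross-datacenter placement was never part of the model.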

Similarly, cluster filesystems tend to assume they live in a single DC, with 
its failure modes; life is more complex across two sites. I do know HDFS 
doesn't handle it, though there are things that do.

I would try to come up with a strategy for having separate applications running 
on the different DCs, with a story for data replication and reconciliation.
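The reconciliation half of that story is the hard part. As a toy illustration of one common approach, here is a last-write-wins merge of per-key records exported from two sites (a minimal sketch, assuming each DC hands you a key -> (timestamp, value) map; real reconciliation also needs clock-skew handling, tombstones for deletes, and conflict auditing):

```python
def reconcile(dc_a, dc_b):
    """Last-write-wins merge of per-key records from two datacenters.

    Each argument maps key -> (timestamp, value). For keys present in
    both, the record with the larger timestamp wins.
    """
    merged = dict(dc_a)
    for key, (ts, value) in dc_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

# Hypothetical example data: acct1 was updated later in DC B.
dc_a = {"acct1": (1, "v1"), "acct2": (5, "v2")}
dc_b = {"acct1": (3, "v1b"), "acct3": (2, "v3")}
print(reconcile(dc_a, dc_b))
# {'acct1': (3, 'v1b'), 'acct2': (5, 'v2'), 'acct3': (2, 'v3')}
```

Last-write-wins silently drops the losing update, which may or may not be acceptable for banking data; that's exactly the kind of policy decision the "story" has to settle.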

Even there, though, there'll inevitably be an SPOF. How do you find it? You 
wait: it will find you.

-steve

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
