Yup, all you need to do is provide a valid host address that will route to a Master. So you could, for instance, use stable host names in the master URL, such as spark://spark-master1.my.com:port1,spark-master2.my.com:port2,spark-master3.my.com:port3
and then just change the DNS entries to keep them up to date with the current masters. Or use static IPs, HAProxy, etc.

On Sat, Feb 22, 2014 at 7:11 PM, Matan Shukry <matanshu...@gmail.com> wrote:

> Is there any way to make this url more dynamic, so a case such as you
> described where I would need to add new node wouldn't require
> recompilation? For example, by using a dns record or haproxy or some other
> software?
> On Feb 23, 2014 3:51 AM, "Aaron Davidson" <ilike...@gmail.com> wrote:
>
>> The current way of solving this problem is to list all three masters as
>> your master url; e.g.,:
>> spark://host1:port1,host2:port2,host3:port3
>>
>> This will try all three in parallel and use whichever one is currently
>> the master. This should work as long as you don't have to introduce a new
>> node as a backup master (due to one of the others failing permanently) --
>> in that case, you'd have to update the master URL to include the new node
>> in case it is elected leader for all *newly created* clients/workers.
>> Old clients are ambivalent to the coming and goings of masters, as any new
>> master will reconnect to all old clients and workers.
>>
>> On Sat, Feb 22, 2014 at 4:12 PM, Matan Shukry <matanshu...@gmail.com> wrote:
>>
>>> Lately I started messing around with hadoop and spark.
>>>
>>> I noticed spark can leverage zookeeper in order to create
>>> multiple "secondaries" masters.
>>>
>>> I was wondering however, how one may implement the client
>>> in such situation?
>>>
>>> that is, what should the spark master URL be for a spark client
>>> application?
>>>
>>> Let's say for example, I have 10 nodes, and 3 of them (1/3/5) are
>>> masters.
>>> I don't want to put either one of the masters url, since they may be
>>> brought down.
>>>
>>> so, which master URL do I use? or rather, how do I use one url
>>> which will change when a new master is chosen?
>>>
>>> Note:
>>> I know I can simply have a list of masters, use try/catch to see which
>>> one fails, and try other ones - I was hoping for something "better", in
>>> performance context, and more dynamic as well.
>>>
>>> Yours, Jones.
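
For what it's worth, one way to avoid recompiling when the set of masters changes is to keep the list outside the application and build the comma-separated master URL at startup. The following is only a rough sketch under a few assumptions: the helper names (build_master_url, masters_from_env) and the SPARK_MASTERS environment variable are made up for illustration and are not part of Spark, and the host names and port 7077 in the usage note are just example values.

```python
import os


def build_master_url(masters):
    """Join (host, port) pairs into one multi-master URL of the form
    spark://host1:port1,host2:port2,..."""
    if not masters:
        raise ValueError("at least one master is required")
    return "spark://" + ",".join(f"{host}:{port}" for host, port in masters)


def masters_from_env(var="SPARK_MASTERS", default=""):
    """Parse 'host1:port1,host2:port2' from an environment variable.

    The variable name is a hypothetical convention for this sketch,
    not something Spark itself reads.
    """
    raw = os.environ.get(var, default)
    pairs = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and stray whitespace
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs
```

With SPARK_MASTERS set to something like "spark-master1.my.com:7077,spark-master2.my.com:7077", build_master_url(masters_from_env()) gives a URL in the same comma-separated form as above, which the application can then hand to Spark as its master URL. Adding a replacement master then only means changing the environment (or the DNS entries behind those names), not the code.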