I got things to work, sort of, using the zk:// URL type. I am now using the 0.12.X branch from the GitHub mirror. When I try to bring up the masters, multiple machines often decide to be the master. Similarly, when I bring up slaves, they rarely detect the masters (maybe 5-10% of the time).
I triaged the issue and determined that the correct zk url to use is this:

zk://myserver1.com:2181/mesos,myserver2.com:2181/mesos,myserver3.com:2181/mesos

Note that you must specify the same hierarchy path for each server. If you don't do this, things will work, but unreliably.

On Tue, Apr 16, 2013 at 4:50 PM, Benjamin Mahler <[email protected]> wrote:

> I believe it needs to be prefixed with "zk://" rather than zoo.
>
> The relevant code is in detector.cpp:
>
>   } else if (master.find("zk://") == 0) {
>     Try<zookeeper::URL> url = zookeeper::URL::parse(master);
>     if (url.isError()) {
>       return Error(url.error());
>     }
>     if (url.get().path == "/") {
>       return Error(
>           "Expecting a (chroot) path for ZooKeeper ('/' is not supported)");
>     }
>     return new ZooKeeperMasterDetector(url.get(), pid, contend, quiet);
>   }
>
> On Tue, Apr 16, 2013 at 1:01 PM, David Greenberg <[email protected]> wrote:
>
> > Hi Vinod,
> > That's correct. I tried starting the masters with --zk instead of --url. I
> > am running mesos from the git mirror at commit 3fa8389. Should I try
> > updating to head, or is there a particular more stable version I should
> > use?
> >
> > [email protected]:~/mesos/bin$ ./mesos-master.sh --zk=zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos
> > I0416 19:59:45.205003 48438 main.cpp:116] Build: 2013-04-08 19:16:35 by dgrnbrg
> > I0416 19:59:45.205140 48438 main.cpp:117] Starting Mesos master
> > I0416 19:59:45.205313 48466 master.cpp:309] Master started on 172.21.97.196:5050
> > I0416 19:59:45.205397 48466 master.cpp:324] Master ID: 201304161959-3294696876-5050-48438
> > W0416 19:59:45.205567 48484 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > F0416 19:59:45.205613 48438 main.cpp:129] CHECK_SOME(detector) failed: Failed to create a master detector: Cannot parse '@0.0.0.0:0'
> > *** Check failure stack trace: ***
> >     @     0x7f230ef49f1d  google::LogMessage::Fail()
> >     @     0x7f230ef4e5cf  google::LogMessage::SendToLog()
> >     @     0x7f230ef4db07  google::LogMessage::Flush()
> >     @     0x7f230ef4f25d  google::LogMessageFatal::~LogMessageFatal()
> >     @           0x41c079  main
> >     @     0x7f230cf74abd  (unknown)
> >     @           0x418979  (unknown)
> > Aborted
> >
> > On Tue, Apr 16, 2013 at 2:38 PM, Vinod Kone <[email protected]> wrote:
> >
> > > Hi David,
> > >
> > > I'm assuming the myserver[1-2-3].com above are your zk servers?
> > >
> > > Also, masters take "--zk" instead of "--url" for zookeeper address. "--url"
> > > might have been our old flag, which is deprecated (which version of mesos
> > > are you running?).
> > >
> > > For slaves, "--master" should be the same set of zk servers that you
> > > started your masters with.
> > >
> > > So, "--master="zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos"
> > >
> > > Let me know if that works. If not, please paste the master and slave logs.
> > >
> > > On Tue, Apr 16, 2013 at 10:58 AM, David Greenberg <[email protected]> wrote:
> > >
> > > > I am trying to use the automatic master failover feature of zookeeper, but
> > > > I'm seeing several issues:
> > > >
> > > > When I launch multiple masters with
> > > > ./mesos-master.sh --url=zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos ,
> > > > all 3 servers elect themselves as master and I don't see anything in the logs
> > > > about zookeeper.
> > > >
> > > > Similarly, when I launch slaves, they require a --master setting, which, if
> > > > I provide the zoo:// URL, causes them to fault (and I don't see why I
> > > > should provide a hostname, given that a host could be down.
> > > >
> > > > I assume that I'm making some silly mistake in how I'm launching these
> > > > processes.
> > > >
> > > > Thanks,
> > > > David
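
For anyone else who hits the zoo:// vs. zk:// confusion, here is a rough standalone sketch of the two checks in the detector.cpp snippet Benjamin quoted: the string must start with "zk://", and it must carry a path other than "/". The names checkMasterUrl, MasterUrlCheck, and exampleUsage are made up for illustration and are not part of Mesos. Treating a missing path the same as "/" is my assumption, and the sketch does not validate the server list the way zookeeper::URL::parse presumably does.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for this sketch; not a Mesos type.
enum class MasterUrlCheck { Ok, NotZk, MissingPath };

// Hypothetical helper mirroring the two checks in the quoted snippet.
MasterUrlCheck checkMasterUrl(const std::string& master) {
  const std::string scheme = "zk://";

  // Same test as the quoted code: the string must begin with "zk://".
  if (master.find(scheme) != 0) {
    return MasterUrlCheck::NotZk;
  }

  // Everything after the scheme is "servers[/path]".
  const std::string rest = master.substr(scheme.size());
  const std::string::size_type slash = rest.find('/');

  // Assumption: no slash at all is treated like a bare "/" (no chroot path).
  if (slash == std::string::npos || slash + 1 == rest.size()) {
    return MasterUrlCheck::MissingPath;
  }

  return MasterUrlCheck::Ok;
}

// Usage against the URLs from this thread.
void exampleUsage() {
  // The zoo:// form fails the prefix check.
  assert(checkMasterUrl(
             "zoo://myserver1.com:2181,myserver2.com:2181,"
             "myserver3.com:2181/mesos") == MasterUrlCheck::NotZk);
  // The zk:// form with a /mesos path passes both checks.
  assert(checkMasterUrl(
             "zk://myserver1.com:2181,myserver2.com:2181,"
             "myserver3.com:2181/mesos") == MasterUrlCheck::Ok);
}
```

Note that this only looks at the first "/" after the scheme, so it cannot tell the per-server-path form from the single-trailing-path form; it is just meant to show why the zoo:// prefix never reaches the ZooKeeper branch.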
