Great!! There are several spots that need fixing:

src/master/main.cpp
src/mesos/main.cpp
src/slave/main.cpp
src/java/src/org/apache/mesos/MesosSchedulerDriver.java (just the javadoc
needs fixing)
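
For whoever updates those help strings, here is a rough standalone sketch of
the two checks in the detector.cpp snippet quoted further down: the "zk://"
prefix test and the rejection of a bare "/" path. Note that checkZkUrl is a
made-up helper for illustration, not a Mesos function, and the way it pulls
out the path is an assumption, much simpler than whatever
zookeeper::URL::parse really does.

```cpp
#include <string>

// Hypothetical helper (not part of Mesos): returns an error message if the
// given --zk / --master value would be rejected, or an empty string if it
// passes both checks.
std::string checkZkUrl(const std::string& master)
{
  const std::string prefix = "zk://";

  // Same test as the quoted snippet: the value must start with "zk://".
  if (master.find(prefix) != 0) {
    return "Expecting a 'zk://' prefix";
  }

  // Assumption: everything from the first '/' after the scheme is the
  // path, and a missing path is treated the same as "/".
  const std::string rest = master.substr(prefix.size());
  const std::string::size_type slash = rest.find('/');
  const std::string path =
    slash == std::string::npos ? "/" : rest.substr(slash);

  if (path == "/") {
    return "Expecting a (chroot) path for ZooKeeper ('/' is not supported)";
  }

  return "";
}
```

With this sketch, a "zoo://..." value fails on the prefix check, a "zk://"
value with no path fails on the chroot check, and
"zk://myserver1.com:2181,myserver2.com:2181/mesos" returns an empty string.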

Also, yes, that would be a bug. Can you provide more information (logs,
etc.)?

On Thu, Apr 18, 2013 at 2:33 PM, David Greenberg <[email protected]> wrote:

> I will be happy to! I'm just finishing up the process with my employer to
> be able to start submitting patches (I have them all ready and waiting).
>
> By the way, I have discovered a bug, I think (unless it's already been
> found): after master failover, new frameworks I launch don't get resource
> offers.
>
>
> On Wed, Apr 17, 2013 at 4:47 PM, Vinod Kone <[email protected]> wrote:
>
> > Great to hear you were able to debug this David. Sounds like we should
> > either fix our help message or make the code work with the format that the
> > 'help' claims. I would think the former is easiest. Would you mind sending
> > us a patch?
> >
> >
> > On Wed, Apr 17, 2013 at 12:53 PM, David Greenberg <[email protected]> wrote:
> >
> > > I got things to work, sort of, using the zk:// url type. I am now using the
> > > 0.12.X branch from the Github mirror. When I try to bring up the masters,
> > > often multiple machines decide to be the master. Similarly, when I try to
> > > bring up slaves, they rarely detect the masters (maybe 5-10% of the time).
> > >
> > > I triaged the issue and determined that the correct zk url to use is this:
> > >
> > > zk://myserver1.com:2181/mesos,myserver2.com:2181/mesos,myserver3.com:2181/mesos
> > >
> > > Note that you must specify the same hierarchy path for each server. If you
> > > don't do this, things will work, but unreliably.
> > >
> > >
> > > On Tue, Apr 16, 2013 at 4:50 PM, Benjamin Mahler <[email protected]> wrote:
> > >
> > > > I believe it needs to be prefixed with "zk://" rather than zoo.
> > > >
> > > > The relevant code is in detector.cpp:
> > > >
> > > >   } else if (master.find("zk://") == 0) {
> > > >     Try<zookeeper::URL> url = zookeeper::URL::parse(master);
> > > >     if (url.isError()) {
> > > >       return Error(url.error());
> > > >     }
> > > >     if (url.get().path == "/") {
> > > >       return Error(
> > > >           "Expecting a (chroot) path for ZooKeeper ('/' is not supported)");
> > > >     }
> > > >     return new ZooKeeperMasterDetector(url.get(), pid, contend, quiet);
> > > >   }
> > > >
> > > >
> > > > On Tue, Apr 16, 2013 at 1:01 PM, David Greenberg <[email protected]> wrote:
> > > >
> > > > > Hi Vinod,
> > > > > That's correct. I tried starting the masters with --zk instead of --url. I
> > > > > am running mesos from the git mirror at commit 3fa8389. Should I try
> > > > > updating to head, or is there a particular more stable version I should
> > > > > use?
> > > > >
> > > > > [email protected]:~/mesos/bin$ ./mesos-master.sh --zk=zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos
> > > > > I0416 19:59:45.205003 48438 main.cpp:116] Build: 2013-04-08 19:16:35 by dgrnbrg
> > > > > I0416 19:59:45.205140 48438 main.cpp:117] Starting Mesos master
> > > > > I0416 19:59:45.205313 48466 master.cpp:309] Master started on 172.21.97.196:5050
> > > > > I0416 19:59:45.205397 48466 master.cpp:324] Master ID: 201304161959-3294696876-5050-48438
> > > > > W0416 19:59:45.205567 48484 master.cpp:81] No whitelist given. Advertising offers for all slaves
> > > > > F0416 19:59:45.205613 48438 main.cpp:129] CHECK_SOME(detector) failed: Failed to create a master detector: Cannot parse '@0.0.0.0:0'
> > > > > *** Check failure stack trace: ***
> > > > >     @     0x7f230ef49f1d  google::LogMessage::Fail()
> > > > >     @     0x7f230ef4e5cf  google::LogMessage::SendToLog()
> > > > >     @     0x7f230ef4db07  google::LogMessage::Flush()
> > > > >     @     0x7f230ef4f25d  google::LogMessageFatal::~LogMessageFatal()
> > > > >     @           0x41c079  main
> > > > >     @     0x7f230cf74abd  (unknown)
> > > > >     @           0x418979  (unknown)
> > > > > Aborted
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Apr 16, 2013 at 2:38 PM, Vinod Kone <[email protected]> wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > I'm assuming the myserver[1-2-3].com above are your zk servers?
> > > > > >
> > > > > > Also, masters take "--zk" instead of "--url" for zookeeper address. "--url"
> > > > > > might have been our old flag, which is deprecated (which version of mesos
> > > > > > are you running?).
> > > > > >
> > > > > > For slaves, "--master" should be the same set of zk servers that you
> > > > > > started your masters with.
> > > > > >
> > > > > > So, --master="zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos"
> > > > > >
> > > > > > Let me know if that works. If not, please paste the master and slave logs.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 16, 2013 at 10:58 AM, David Greenberg <[email protected]> wrote:
> > > > > >
> > > > > > > I am trying to use the automatic master failover feature of zookeeper, but
> > > > > > > I'm seeing several issues:
> > > > > > >
> > > > > > > When I launch multiple masters with
> > > > > > > ./mesos-master.sh --url=zoo://myserver1.com:2181,myserver2.com:2181,myserver3.com:2181/mesos,
> > > > > > > all 3 servers elect themselves as master and I don't see anything in the
> > > > > > > logs about zookeeper.
> > > > > > >
> > > > > > > Similarly, when I launch slaves, they require a --master setting, which, if
> > > > > > > I provide the zoo:// URL, causes them to fault (and I don't see why I
> > > > > > > should provide a hostname, given that a host could be down).
> > > > > > >
> > > > > > > I assume that I'm making some silly mistake in how I'm launching these
> > > > > > > processes.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > David
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
