With mini mesos you might have the same problem, for example libprocess might bind to the loopback device. The other tricky thing is which of the individual docker containers can reach each other, and by what host name. I think my setup is quite similar: I opted not to use mini mesos and built something comparable myself. In that setup I create a docker network for my containers so that they can see each other by name. The driver for my framework also runs in a container, so I don't need the native library on my system. This also means libprocess should have no problems with the communication, as everything uses IPs from the same network.
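
To find out what a given container can actually reach, and by which name, a small probe run from inside that container can help. The following is only a minimal sketch: the function name `can_reach` is made up for illustration, and the host and port you pass in depend entirely on your own setup.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if `host` resolves and accepts a TCP connection on `port`."""
    try:
        # create_connection resolves the name first, so a container that
        # cannot see the other container by name fails here as well.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers failed name resolution, refused connections and timeouts.
        return False
```

For example, `can_reach("mesos-master", 5050)` run from the scheduler container would check the master, assuming the master container is named `mesos-master` (a hypothetical name) and listens on the default port 5050. Note that this only tests the direction from your container to the master; the master also has to be able to connect back to the address libprocess announces.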

You might also want to check the master log. In my case I saw a framework registration immediately followed by a disconnect. I believe this indicates that there is a communication issue.
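
If the master log is long, a small script can flag that pattern for you. This is a rough sketch only: the two regular expressions are placeholders I made up, not the real Mesos log format, so adjust them to whatever your master actually prints. The function name `quick_disconnects` and the `window` parameter are also just illustrative.

```python
import re

# Hypothetical patterns: adjust them to whatever your master actually logs.
REGISTERED = re.compile(r"registered framework (\S+)", re.IGNORECASE)
DISCONNECTED = re.compile(r"framework (\S+) disconnected", re.IGNORECASE)


def quick_disconnects(lines, window=5):
    """Return ids of frameworks that disconnect within `window` lines of registering."""
    registered_at = {}  # framework id -> index of the line where it registered
    flagged = []
    for index, line in enumerate(lines):
        match = REGISTERED.search(line)
        if match:
            registered_at[match.group(1)] = index
            continue
        match = DISCONNECTED.search(line)
        if match:
            framework_id = match.group(1)
            start = registered_at.get(framework_id)
            # Flag only disconnects that closely follow a registration.
            if start is not None and index - start <= window:
                flagged.append(framework_id)
    return flagged
```

You would feed it the lines of the master log, for example `quick_disconnects(open(path))`, where `path` is wherever your master writes its log.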

On 27.09.2016 15:09, Gmail wrote:
Thanks Hendrik

I have solved that particular problem when running a different framework in 
docker. It was a bit of a challenge to get the right incantation of environment 
variables and ports defined, but it is working reliably now.

I mainly hit this issue when running my integration tests, where I also run the 
mesos master and agent in docker using mini mesos.

On 27 Sep 2016, at 22:51, Hendrik Haddorp <[email protected]> wrote:

Hi,

this sounds quite like a problem I had hit a few days ago. If you are using the 
mesos native library you need to make sure that the LIBPROCESS environment 
variables are set correctly. Otherwise the Mesos master cannot communicate 
back to your process, especially if you are not running on the same node as the 
master. Things get slightly more tricky if your scheduler is running in a 
docker container.
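
A quick sanity check before starting the driver could look roughly like this. It is a minimal sketch, not part of any Mesos API: the function name `libprocess_problems` is made up, and it only looks at `LIBPROCESS_IP` and `LIBPROCESS_PORT`. In a docker setup `LIBPROCESS_ADVERTISE_IP` and `LIBPROCESS_ADVERTISE_PORT` may matter as well, which this sketch does not cover.

```python
import ipaddress
import os


def libprocess_problems(env=None):
    """Return a list of likely problems with the LIBPROCESS_* settings."""
    env = os.environ if env is None else env
    problems = []

    ip = env.get("LIBPROCESS_IP")
    if not ip:
        problems.append("LIBPROCESS_IP is not set")
    else:
        try:
            # A loopback address cannot be reached by a master on another node.
            if ipaddress.ip_address(ip).is_loopback:
                problems.append("LIBPROCESS_IP is a loopback address")
        except ValueError:
            problems.append("LIBPROCESS_IP is not a valid IP address")

    port = env.get("LIBPROCESS_PORT")
    # The port is optional here, but if present it has to be numeric.
    if port is not None and not port.isdigit():
        problems.append("LIBPROCESS_PORT is not a number")

    return problems
```

Calling `libprocess_problems()` without arguments inspects the current process environment; an empty list means none of these particular checks failed, not that the setup is guaranteed to work.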

regards,
Hendrik

On 27.09.2016 14:34, Eli Jordan wrote:
Yes, it appears in the mesos ui, and stays there. I log all messages from the 
mesos master, including resource offers and disconnected. I don't receive 
offers or disconnected.

I know I need to accept or decline the offers, the problem is that I never 
receive the resource offer, but the master thinks I have.

This only happens sometimes; at other times the framework starts just fine and 
can launch tasks, which is what led me to think it might be a timing issue.

Thanks
Eli

On 27 Sep. 2016, at 22:25, Olivier Sallou <[email protected]> wrote:

On 09/27/2016 02:08 PM, Gmail wrote:
> Hi
>
> I am implementing a mesos framework, and have hit a strange issue that I can't
> make sense of. Intermittently, my framework will receive the registered
> message, and is shown as registered in the mesos ui.
>
> I never see any resource offer messages being processed by the framework,
> however, the mesos master indicates that it has offered resources to the
> framework (on the frameworks page in the ui). In this case, I only have one
> slave, and all the resources are apparently being consumed by the framework, so
> no tasks can be launched.

Does your framework appear in the mesos UI in the list of frameworks (and
remain in the list)?

Maybe your framework is registered then disconnected.

> Anyone have an idea what the problem might be?
>
> One thought I had, is that the MesosSchedulerDriver isn't expecting the
> scheduler implementation to process messages asynchronously, but I couldn't
> find any documentation indicating one way or the other. In my case, I'm using
> akka actors, and all the scheduler implementation does is dispatch a message.

Do you log when you receive offers? When you receive an offer you must
accept or decline it.

Olivier

> Is this a possibility?
>
> Thanks
> Eli
--
Olivier Sallou
IRISA / University of Rennes 1
Campus de Beaulieu, 35000 RENNES - FRANCE
Tel: 02.99.84.71.95

gpg key id: 4096R/326D8438  (keyring.debian.org)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438
