Re: Problems with scheduling tasks in mesos and spark

2016-04-13 Thread Shuai Lin
Have you tried setting "spark.cores.max" in your SparkConf? Check
http://spark.apache.org/docs/1.6.1/running-on-mesos.html :

> You can cap the maximum number of cores using conf.set("spark.cores.max",
> "10") (for example).


On Thu, Apr 14, 2016 at 12:53 AM, Andreas Tsarida <
andreas.tsar...@teralytics.ch> wrote:

>
> Hello,
>
> I’m trying to figure out a solution for dynamic resource allocation in
> Mesos within the same framework (Spark).
>
> Scenario:
> 1 - run a Spark job in coarse-grained mode
> 2 - run a second job in coarse-grained mode
>
> The second job will not start until the first job finishes, which is not
> something that I would want. The problem is minor when the running job
> doesn't take too long, but when it does, nobody else can work on the
> cluster.
>
> The best scenario would be for Mesos to revoke resources from the first
> job and try to allocate resources to the second job.
>
> Is there anybody else who has solved this issue in another way?
>
> Thanks
>


Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread Justin Ryan
I’m +1 on removing the default.

While /opt/mesos may seem reasonable on the surface given many common uses of 
/opt, putting data there doesn’t really comply with FHS.  Arguments could be 
made for /var/mesos (which I’m using) or /srv/mesos, but I think no default is 
fine.

I noticed early on that it was a little odd to default to /tmp, but felt as if 
I was following someone-or-other’s lead.  It’s now clear that’s not the case. :)
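
For what it's worth, a minimal sketch of setting it explicitly (the zk URL is a
placeholder; the /etc/mesos-slave mechanism assumes Mesosphere-style packaging):

echo /var/mesos > /etc/mesos-slave/work_dir

or directly on the command line:

mesos-slave --master=zk://zk1:2181/mesos --work_dir=/var/mesos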

From: tommy xiao
Reply-To: "user@mesos.apache.org"
Date: Tuesday, April 12, 2016 at 11:14 PM
To: "user@mesos.apache.org"
Cc: dev
Subject: Re: [Proposal] Remove the default value for agent work_dir

how about /opt/mesos/
+1

2016-04-13 12:44 GMT+08:00 Avinash Sridharan:
+1

On Tue, Apr 12, 2016 at 9:31 PM, Jie Yu wrote:
+1

On Tue, Apr 12, 2016 at 9:29 PM, James Peach wrote:

>
> > On Apr 12, 2016, at 3:58 PM, Greg Mann wrote:
> >
> > Hey folks!
> > A number of situations have arisen in which the default value of the
> Mesos agent `--work_dir` flag (/tmp/mesos) has caused problems on systems
> in which the automatic cleanup of '/tmp' deletes agent metadata. To resolve
> this, we would like to eliminate the default value of the agent
> `--work_dir` flag. You can find the relevant JIRA here.
> >
> > We considered simply changing the default value to a more appropriate
> location, but decided against this because the expected filesystem
> structure varies from platform to platform, and because it isn't guaranteed
> that the Mesos agent would have access to the default path on a particular
> platform.
> >
> > Eliminating the default `--work_dir` value means that the agent would
> exit immediately if the flag is not provided, whereas currently it launches
> successfully in this case. This will break existing infrastructure which
> relies on launching the Mesos agent without specifying the work directory.
> I believe this is an acceptable change because '/tmp/mesos' is not a
> suitable location for the agent work directory except for short-term local
> testing, and any production scenario that is currently using this location
> should be altered immediately.
>
> +1 from me too. Defaulting to /tmp just helps people shoot themselves in
> the foot.
>
> J



--
Avinash Sridharan, Mesosphere
+1 (323) 702 5245



--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com




Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Stefano Bianchi
I found it sorry :-)

2016-04-14 0:23 GMT+02:00 Stefano Bianchi :

> Thanks Jeff.
> Sorry i am not sufficiently familiar with mesos mailing list, where i can
> find the jira issue you are talking about?
>
> Thanks again.
>
> 2016-04-13 23:23 GMT+02:00 Jeff Schroeder :
>
>> Stefano, you might also follow the jira issue MESOS-3548, which is for
>> mesos to support federation amongst multiple clusters natively.
>>
>>
>> On Wednesday, April 13, 2016, Stefano Bianchi 
>> wrote:
>>
>>> Ah ok.
>>> No problem.
>>> See you and best regards!!!
>>> On 13 Apr 2016 at 21:09, "June Taylor" wrote:
>>>
 Stefano,

 We are not currently using CoreOS - it just seemed that it had a
 feature you're looking for. We are new to Mesos as well. I apologize that I
 cannot be more helpful.


 Thanks,
 June Taylor
 System Administrator, Minnesota Population Center
 University of Minnesota

 On Wed, Apr 13, 2016 at 2:07 PM, Stefano Bianchi 
 wrote:

> June Taylor, is there another channel on which i can contact you?
> Since i am a simple student doing his master thesis on mesos/calico, i
> need all the support possible :P
> I already have of course from my supervisor and advisor, but sometimes
> is better have discussions with more skilled people.
>
> 2016-04-13 17:33 GMT+02:00 June Taylor :
>
>> Stefano,
>>
>> That's the feature, yes.
>>
>> Unfortunately I don't know the answer to your questions further.
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi <
>> jazzist...@gmail.com> wrote:
>>
>>> Thanks to your reply June.
>>>
>>> You mean this feature?
>>> https://coreos.com/etcd/docs/latest/clustering.html
>>> Because for sure i know that calico exploit etcd as datastore, but i
>>> need to know if, and if yes how, i can manage etcd as DNS.
>>>
>>> 2016-04-13 15:04 GMT+02:00 June Taylor :
>>>
 Stefano,

 We are exploring CoreOS, and I believe it has the feature you're
 looking for.


 Thanks,
 June Taylor
 System Administrator, Minnesota Population Center
 University of Minnesota

 On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi <
 jazzist...@gmail.com> wrote:

> Hi all
>
> i have to set up two mesos clusters.
> On each cluster i should integrate Project calico in order to
> distribute tasks among the agents. But these tasks should be sent 
> also from
> a slave of one cluster to the slave of the other cluster.
> I know that when i start calico on each slaves, it registers the
> hosts to the ETCD_AUTHORITY so calico use etcd.
> In order to have interconnection among 2 mesos clusters, i should
> have the same etcd datastore for both my Mesos/Calico clusters.
> Someone knows how to reach this condition?
>
> Thanks in advance
>
>

>>>
>>
>

>>
>> --
>> Text by Jeff, typos by iPhone
>>
>
>


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Stefano Bianchi
Thanks Jeff.
Sorry i am not sufficiently familiar with mesos mailing list, where i can
find the jira issue you are talking about?

Thanks again.

2016-04-13 23:23 GMT+02:00 Jeff Schroeder :

> Stefano, you might also follow the jira issue MESOS-3548, which is for
> mesos to support federation amongst multiple clusters natively.
>
>
> On Wednesday, April 13, 2016, Stefano Bianchi 
> wrote:
>
>> Ah ok.
>> No problem.
>> See you and best regards!!!
>> On 13 Apr 2016 at 21:09, "June Taylor" wrote:
>>
>>> Stefano,
>>>
>>> We are not currently using CoreOS - it just seemed that it had a feature
>>> you're looking for. We are new to Mesos as well. I apologize that I cannot
>>> be more helpful.
>>>
>>>
>>> Thanks,
>>> June Taylor
>>> System Administrator, Minnesota Population Center
>>> University of Minnesota
>>>
>>> On Wed, Apr 13, 2016 at 2:07 PM, Stefano Bianchi 
>>> wrote:
>>>
 June Taylor, is there another channel on which i can contact you?
 Since i am a simple student doing his master thesis on mesos/calico, i
 need all the support possible :P
 I already have of course from my supervisor and advisor, but sometimes
 is better have discussions with more skilled people.

 2016-04-13 17:33 GMT+02:00 June Taylor :

> Stefano,
>
> That's the feature, yes.
>
> Unfortunately I don't know the answer to your questions further.
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi <
> jazzist...@gmail.com> wrote:
>
>> Thanks to your reply June.
>>
>> You mean this feature?
>> https://coreos.com/etcd/docs/latest/clustering.html
>> Because for sure i know that calico exploit etcd as datastore, but i
>> need to know if, and if yes how, i can manage etcd as DNS.
>>
>> 2016-04-13 15:04 GMT+02:00 June Taylor :
>>
>>> Stefano,
>>>
>>> We are exploring CoreOS, and I believe it has the feature you're
>>> looking for.
>>>
>>>
>>> Thanks,
>>> June Taylor
>>> System Administrator, Minnesota Population Center
>>> University of Minnesota
>>>
>>> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi <
>>> jazzist...@gmail.com> wrote:
>>>
 Hi all

 i have to set up two mesos clusters.
 On each cluster i should integrate Project calico in order to
 distribute tasks among the agents. But these tasks should be sent also 
 from
 a slave of one cluster to the slave of the other cluster.
 I know that when i start calico on each slaves, it registers the
 hosts to the ETCD_AUTHORITY so calico use etcd.
 In order to have interconnection among 2 mesos clusters, i should
 have the same etcd datastore for both my Mesos/Calico clusters.
 Someone knows how to reach this condition?

 Thanks in advance


>>>
>>
>

>>>
>
> --
> Text by Jeff, typos by iPhone
>


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Jeff Schroeder
Stefano, you might also follow the jira issue MESOS-3548, which is for
mesos to support federation amongst multiple clusters natively.
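
In the meantime, the usual way to get what Stefano describes below (one shared
datastore for both Calico deployments) is simply to point every agent at the same
etcd endpoint. A rough sketch, assuming the 2016-era calicoctl that reads
ETCD_AUTHORITY and a shared etcd reachable at etcd.shared.example.com:2379 (both
names are placeholders):

export ETCD_AUTHORITY=etcd.shared.example.com:2379   # same value on every agent in both clusters
sudo -E calicoctl node                               # registers this host in the shared datastore

Whether one etcd stretched across two clusters is acceptable operationally is a
separate question.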

On Wednesday, April 13, 2016, Stefano Bianchi  wrote:

> Ah ok.
> No problem.
> See you and best regards!!!
> On 13 Apr 2016 at 21:09, "June Taylor" wrote:
>
>> Stefano,
>>
>> We are not currently using CoreOS - it just seemed that it had a feature
>> you're looking for. We are new to Mesos as well. I apologize that I cannot
>> be more helpful.
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Wed, Apr 13, 2016 at 2:07 PM, Stefano Bianchi > > wrote:
>>
>>> June Taylor, is there another channel on which i can contact you?
>>> Since i am a simple student doing his master thesis on mesos/calico, i
>>> need all the support possible :P
>>> I already have of course from my supervisor and advisor, but sometimes
>>> is better have discussions with more skilled people.
>>>
>>> 2016-04-13 17:33 GMT+02:00 June Taylor >> >:
>>>
 Stefano,

 That's the feature, yes.

 Unfortunately I don't know the answer to your questions further.


 Thanks,
 June Taylor
 System Administrator, Minnesota Population Center
 University of Minnesota

 On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi > wrote:

> Thanks to your reply June.
>
> You mean this feature?
> https://coreos.com/etcd/docs/latest/clustering.html
> Because for sure i know that calico exploit etcd as datastore, but i
> need to know if, and if yes how, i can manage etcd as DNS.
>
> 2016-04-13 15:04 GMT+02:00 June Taylor  >:
>
>> Stefano,
>>
>> We are exploring CoreOS, and I believe it has the feature you're
>> looking for.
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi <
>> jazzist...@gmail.com
>> > wrote:
>>
>>> Hi all
>>>
>>> i have to set up two mesos clusters.
>>> On each cluster i should integrate Project calico in order to
>>> distribute tasks among the agents. But these tasks should be sent also 
>>> from
>>> a slave of one cluster to the slave of the other cluster.
>>> I know that when i start calico on each slaves, it registers the
>>> hosts to the ETCD_AUTHORITY so calico use etcd.
>>> In order to have interconnection among 2 mesos clusters, i should
>>> have the same etcd datastore for both my Mesos/Calico clusters.
>>> Someone knows how to reach this condition?
>>>
>>> Thanks in advance
>>>
>>>
>>
>

>>>
>>

-- 
Text by Jeff, typos by iPhone


Re: Mesos Master and Slave on same server?

2016-04-13 Thread Adam Bordelon
See also my answer to
http://stackoverflow.com/questions/26597521/can-mesos-master-and-slave-nodes-be-deployed-on-the-same-machines


On Wed, Apr 13, 2016 at 12:45 PM, Stefano Bianchi 
wrote:

> I have set up an HA cluster:
> 3 Mesos masters with one leader and a quorum of 2, and
> 3 Mesos slaves.
> I can easily start tasks from Marathon.
> The common way is to put the slaves on other machines.
> However, running the mesos-slave service on the master as well is
> not a problem.
>
> 2016-04-13 19:24 GMT+02:00 Paul Bell :
>
>> Hi June,
>>
>> In addition to doing what Pradeep suggests, I also now & then run a
>> single node "cluster" that houses mesos-master, mesos-slave, and Marathon.
>>
>> Works fine.
>>
>> Cordially,
>>
>> Paul
>>
>> On Wed, Apr 13, 2016 at 12:36 PM, Pradeep Chhetri <
>> pradeep.chhetr...@gmail.com> wrote:
>>
>>> I would suggest you to run mesos-master and zookeeper and marathon on
>>> same set of hosts (maybe call them as coordinator nodes) and use completely
>>> different set of nodes for mesos slaves. This way you can do the
>>> maintenance of such hosts in a very planned fashion.
>>>
>>> On Wed, Apr 13, 2016 at 4:22 PM, Stefano Bianchi 
>>> wrote:
>>>
 For sure it is possible.
 Simply Mesos-master will the the resources offered by the machine on
 which is running mesos-slave also, transparently.

 2016-04-13 16:34 GMT+02:00 June Taylor :

> All of our node servers are identical hardware. Is it reasonable for
> me to install the Mesos-Master and Mesos-Slave on the same physical
> hardware?
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>


>>>
>>>
>>> --
>>> Regards,
>>> Pradeep Chhetri
>>>
>>
>>
>


Re: Custom IPTables rules

2016-04-13 Thread Rad Gruchalski
Alfredo,  

I have no examples of locking that down on hand, but I can imagine that it should
be feasible to lock it down.










Best regards,

Radek Gruchalski

ra...@gruchalski.com
de.linkedin.com/in/radgruchalski/

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.



On Wednesday, 13 April 2016 at 22:14, Alfredo Carneiro wrote:

> Unfortunately, I am facing some problems: even with my INPUT rules allowing
> just some subnetworks, Docker is accepting connections from everywhere.
>  
> On Wed, Apr 13, 2016 at 5:06 PM, Rad Gruchalski  (mailto:ra...@gruchalski.com)> wrote:
> > I actually found the complete thing you need. Here we go:  
> >  
> > *nat
> > …
> >  
> > :DOCKER - [0:0]
> > -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
> > -A OUTPUT ! -d 127.0.0.0/8 (http://127.0.0.0/8) -m addrtype --dst-type 
> > LOCAL -j DOCKER
> > -A POSTROUTING -s 172.17.0.0/16 (http://172.17.0.0/16) ! -o docker0 -j 
> > MASQUERADE
> > # This is where the docker NAT rules go
> >  
> >  
> > # NAT chains
> >  
> > COMMIT
> >  
> > *filter
> > …
> > :DOCKER - [0:0]
> >  
> > …
> >  
> > -A FORWARD -o docker0 -j DOCKER
> > -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> > -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> > -A FORWARD -i docker0 -o docker0 -j ACCEPT
> >  
> >  
> > This gives you everything you need. Thanks to Avinash for pointing this 
> > out.  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> > Best regards,

> > Radek Gruchalski
> > 
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 
> > (mailto:ra...@gruchalski.com)
> > de.linkedin.com/in/radgruchalski/ (http://de.linkedin.com/in/radgruchalski/)
> >  
> > Confidentiality:
> > This communication is intended for the above-named person and may be 
> > confidential and/or legally privileged.
> > If it has come to you in error you must take no action based on it, nor 
> > must you copy or show it to anyone; please delete/destroy and inform the 
> > sender immediately.
> >  
> >  
> >  
> > On Wednesday, 13 April 2016 at 21:59, Alfredo Carneiro wrote:
> >  
> > > Oh man! Really thanks! It worked!
> > >  
> > > On Wed, Apr 13, 2016 at 4:57 PM, Rad Gruchalski  > > (mailto:ra...@gruchalski.com)> wrote:
> > > > Have you tried restarting docker daemon afterwards?
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > > Best regards,

> > > > Radek Gruchalski
> > > > 
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 
> > > > (mailto:ra...@gruchalski.com)
> > > > de.linkedin.com/in/radgruchalski/ 
> > > > (http://de.linkedin.com/in/radgruchalski/)
> > > >  
> > > > Confidentiality:
> > > > This communication is intended for the above-named person and may be 
> > > > confidential and/or legally privileged.
> > > > If it has come to you in error you must take no action based on it, nor 
> > > > must you copy or show it to anyone; please delete/destroy and inform 
> > > > the sender immediately.
> > > >  
> > > >  
> > > >  
> > > > On Wednesday, 13 April 2016 at 21:53, Alfredo Carneiro wrote:
> > > >  
> > > > > Hey Rad,
> > > > >  
> > > > > Thanks for your answer! I have added theses lines and now looks very 
> > > > > similar before.
> > > > >  
> > > > > iptables -N DOCKER
> > > > > iptables -A FORWARD -o docker0 -j DOCKER
> > > > > iptables -A FORWARD -o docker0 -m conntrack --ctstate 
> > > > > RELATED,ESTABLISHED -j ACCEPT
> > > > > iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> > > > > iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT
> > > > >  
> > > > >  
> > > > > However, I am still getting errors.
> > > > >  
> > > > > docker: Error response from daemon: failed to create endpoint 
> > > > > cranky_kilby on network bridge: iptables failed: iptables --wait -t 
> > > > > nat -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 
> > > > > 172.17.0.2:8080 (http://172.17.0.2:8080) ! -i docker0: iptables: No 
> > > > > chain/target/match by that name.
> > > > >  (exit status 1).
> > > > >  
> > > > >  
> > > > > This is my iptables -L output:
> > > > >  
> > > > > Chain FORWARD (policy DROP)
> > > > > target prot opt source   destination  
> > > > > DOCKER all  --  anywhere anywhere 
> > > > > ACCEPT all  --  anywhere anywhere ctstate 
> > > > > RELATED,ESTABLISHED
> > > > > ACCEPT all  --  anywhere anywhere 
> > > > > ACCEPT all  --  anywhere anywhere 
> > > > >  
> > > > > Chain OUTPUT (policy ACCEPT)
> > 

Re: Custom IPTables rules

2016-04-13 Thread Alfredo Carneiro
Unfortunately, I am facing some problems: even with my INPUT rules
allowing just some subnetworks, Docker is accepting connections from
everywhere.

On Wed, Apr 13, 2016 at 5:06 PM, Rad Gruchalski 
wrote:

> I actually found the complete thing you need. Here we go:
>
> *nat
> …
>
> :DOCKER - [0:0]
> -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
> -A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
> -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
> # This is where the docker NAT rules go
>
> # NAT chains
>
> COMMIT
>
> *filter
> …
> :DOCKER - [0:0]
>
> …
>
> -A FORWARD -o docker0 -j DOCKER
> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
>
> This gives you everything you need. Thanks to Avinash for pointing this
> out.
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:59, Alfredo Carneiro wrote:
>
> Oh man! Really thanks! It worked!
>
> On Wed, Apr 13, 2016 at 4:57 PM, Rad Gruchalski 
> wrote:
>
> Have you tried restarting docker daemon afterwards?
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:53, Alfredo Carneiro wrote:
>
> Hey Rad,
>
> Thanks for your answer! I have added theses lines and now looks very
> similar before.
>
> *iptables -N DOCKER*
> *iptables -A FORWARD -o docker0 -j DOCKER*
> *iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED
> -j ACCEPT*
> *iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT*
> *iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT*
>
> However, I am still getting errors.
>
> *docker: Error response from daemon: failed to create endpoint
> cranky_kilby on network bridge: iptables failed: iptables --wait -t nat -A
> DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.17.0.2:8080
>  ! -i docker0: iptables: No chain/target/match by
> that name.*
> * (exit status 1).*
>
> This is my iptables -L output:
>
> *Chain FORWARD (policy DROP)*
> *target prot opt source   destination *
> *DOCKER all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere ctstate
> RELATED,ESTABLISHED*
> *ACCEPT all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain OUTPUT (policy ACCEPT)*
> *target prot opt source   destination *
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain DOCKER (1 references)*
> *target prot opt source   destination*
>
> I hid the INPUT chain because is very big!
>
> Best Regards,
>
> On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski 
> wrote:
>
> Hi Alfredo,
>
> The only thing you need is:
>
> -A FORWARD -o docker0 -j DOCKER
> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:
>
> Hello guys,
>
> I don't know if that is the right place to ask. So, since we use public
> cloud, we are trying to hardening our servers allowing traffic just from
> our subnetworks. However, when I tried to implement some iptables rules I
> got problems with Docker, which couldn't find its chain anymore.
>
> Then, I am wondering if anyone has ever implemented any iptables rule in
> this scenario.
>
> I've seen this[1] "tip", however, I think that it is not apply to this
> case, because it is very "static".
>
> [1] - https://fralef.me/docker-and-iptables.html
>
> Best Regards,
>
> --
> 

Re: Custom IPTables rules

2016-04-13 Thread Rad Gruchalski
I actually found the complete thing you need. Here we go:  

*nat
…

:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
# This is where the docker NAT rules go


# NAT chains

COMMIT

*filter
…
:DOCKER - [0:0]

…

-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT


This gives you everything you need. Thanks to Avinash for pointing this out.  











Best regards,

Radek Gruchalski

ra...@gruchalski.com
de.linkedin.com/in/radgruchalski/

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.



On Wednesday, 13 April 2016 at 21:59, Alfredo Carneiro wrote:

> Oh man! Really thanks! It worked!
>  
> On Wed, Apr 13, 2016 at 4:57 PM, Rad Gruchalski  (mailto:ra...@gruchalski.com)> wrote:
> > Have you tried restarting docker daemon afterwards?
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> > Best regards,

> > Radek Gruchalski
> > 
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 
> > (mailto:ra...@gruchalski.com)
> > de.linkedin.com/in/radgruchalski/ (http://de.linkedin.com/in/radgruchalski/)
> >  
> > Confidentiality:
> > This communication is intended for the above-named person and may be 
> > confidential and/or legally privileged.
> > If it has come to you in error you must take no action based on it, nor 
> > must you copy or show it to anyone; please delete/destroy and inform the 
> > sender immediately.
> >  
> >  
> >  
> > On Wednesday, 13 April 2016 at 21:53, Alfredo Carneiro wrote:
> >  
> > > Hey Rad,
> > >  
> > > Thanks for your answer! I have added theses lines and now looks very 
> > > similar before.
> > >  
> > > iptables -N DOCKER
> > > iptables -A FORWARD -o docker0 -j DOCKER
> > > iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED 
> > > -j ACCEPT
> > > iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> > > iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT
> > >  
> > >  
> > > However, I am still getting errors.
> > >  
> > > docker: Error response from daemon: failed to create endpoint 
> > > cranky_kilby on network bridge: iptables failed: iptables --wait -t nat 
> > > -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 
> > > 172.17.0.2:8080 (http://172.17.0.2:8080) ! -i docker0: iptables: No 
> > > chain/target/match by that name.
> > >  (exit status 1).
> > >  
> > >  
> > > This is my iptables -L output:
> > >  
> > > Chain FORWARD (policy DROP)
> > > target prot opt source   destination  
> > > DOCKER all  --  anywhere anywhere 
> > > ACCEPT all  --  anywhere anywhere ctstate 
> > > RELATED,ESTABLISHED
> > > ACCEPT all  --  anywhere anywhere 
> > > ACCEPT all  --  anywhere anywhere 
> > >  
> > > Chain OUTPUT (policy ACCEPT)
> > > target prot opt source   destination  
> > > ACCEPT all  --  anywhere anywhere 
> > >  
> > > Chain DOCKER (1 references)
> > > target prot opt source   destination
> > >  
> > >  
> > > I hid the INPUT chain because is very big!
> > >  
> > > Best Regards,
> > >  
> > > On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski  > > (mailto:ra...@gruchalski.com)> wrote:
> > > > Hi Alfredo,  
> > > >  
> > > > The only thing you need is:
> > > >  
> > > > -A FORWARD -o docker0 -j DOCKER
> > > > -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j 
> > > > ACCEPT
> > > > -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> > > > -A FORWARD -i docker0 -o docker0 -j ACCEPT
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > >  
> > > > Best regards,

> > > > Radek Gruchalski
> > > > 
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 
> > > > (mailto:ra...@gruchalski.com)
> > > > de.linkedin.com/in/radgruchalski/ 
> > > > (http://de.linkedin.com/in/radgruchalski/)
> > > >  
> > > > Confidentiality:
> > > > This communication is intended for the above-named person and may be 
> > > > confidential and/or legally privileged.
> > > > If it has come to you in error you must take no action based on it, nor 
> > > > must you copy or show it to anyone; please delete/destroy and inform 
> > > > the sender immediately.
> > > >  
> > > >  
> > > >  
> > > > On Wednesday, 13 April 2016 at 

Re: Custom IPTables rules

2016-04-13 Thread Alfredo Carneiro
Oh man! Really thanks! It worked!

On Wed, Apr 13, 2016 at 4:57 PM, Rad Gruchalski 
wrote:

> Have you tried restarting docker daemon afterwards?
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:53, Alfredo Carneiro wrote:
>
> Hey Rad,
>
> Thanks for your answer! I have added theses lines and now looks very
> similar before.
>
> *iptables -N DOCKER*
> *iptables -A FORWARD -o docker0 -j DOCKER*
> *iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED
> -j ACCEPT*
> *iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT*
> *iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT*
>
> However, I am still getting errors.
>
> *docker: Error response from daemon: failed to create endpoint
> cranky_kilby on network bridge: iptables failed: iptables --wait -t nat -A
> DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.17.0.2:8080
>  ! -i docker0: iptables: No chain/target/match by
> that name.*
> * (exit status 1).*
>
> This is my iptables -L output:
>
> *Chain FORWARD (policy DROP)*
> *target prot opt source   destination *
> *DOCKER all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere ctstate
> RELATED,ESTABLISHED*
> *ACCEPT all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain OUTPUT (policy ACCEPT)*
> *target prot opt source   destination *
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain DOCKER (1 references)*
> *target prot opt source   destination*
>
> I hid the INPUT chain because is very big!
>
> Best Regards,
>
> On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski 
> wrote:
>
> Hi Alfredo,
>
> The only thing you need is:
>
> -A FORWARD -o docker0 -j DOCKER
> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:
>
> Hello guys,
>
> I don't know if that is the right place to ask. So, since we use public
> cloud, we are trying to hardening our servers allowing traffic just from
> our subnetworks. However, when I tried to implement some iptables rules I
> got problems with Docker, which couldn't find its chain anymore.
>
> Then, I am wondering if anyone has ever implemented any iptables rule in
> this scenario.
>
> I've seen this[1] "tip", however, I think that it is not apply to this
> case, because it is very "static".
>
> [1] - https://fralef.me/docker-and-iptables.html
>
> Best Regards,
>
> --
> Alfredo Miranda
>
>
>
>
>
> --
> Alfredo Miranda
>
>
>


-- 
Alfredo Miranda


Re: Custom IPTables rules

2016-04-13 Thread Avinash Sridharan
You need a DOCKER chain in the NAT table as well. The output you are
showing is from the default (filter) table.

Try "iptables -t nat -L" to list all rules and chains in the NAT table. You
can add the DOCKER chain to the NAT table with
"iptables -t nat -N DOCKER".

As Rad suggested, restarting the docker daemon would allow Docker to
recreate all the iptables chains and rules it needs. That might be a cleaner
approach than trying to insert the rules on your own.
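
A rough sketch of both options (the systemctl call assumes a systemd-managed
Docker; adjust for upstart/sysvinit):

iptables -t nat -L DOCKER -n   # check whether the DOCKER chain exists in the nat table
iptables -t nat -N DOCKER      # create it manually if it is missing

or simply:

systemctl restart docker       # let the daemon recreate its own chains and rules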

On Wed, Apr 13, 2016 at 12:53 PM, Alfredo Carneiro <
alfr...@simbioseventures.com> wrote:

> Hey Rad,
>
> Thanks for your answer! I have added theses lines and now looks very
> similar before.
>
> *iptables -N DOCKER*
> *iptables -A FORWARD -o docker0 -j DOCKER*
> *iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED
> -j ACCEPT*
> *iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT*
> *iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT*
>
> However, I am still getting errors.
>
> *docker: Error response from daemon: failed to create endpoint
> cranky_kilby on network bridge: iptables failed: iptables --wait -t nat -A
> DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.17.0.2:8080
>  ! -i docker0: iptables: No chain/target/match by
> that name.*
> * (exit status 1).*
>
> This is my iptables -L output:
>
> *Chain FORWARD (policy DROP)*
> *target prot opt source   destination *
> *DOCKER all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere ctstate
> RELATED,ESTABLISHED*
> *ACCEPT all  --  anywhere anywhere*
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain OUTPUT (policy ACCEPT)*
> *target prot opt source   destination *
> *ACCEPT all  --  anywhere anywhere*
>
> *Chain DOCKER (1 references)*
> *target prot opt source   destination*
>
> I hid the INPUT chain because is very big!
>
> Best Regards,
>
> On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski 
> wrote:
>
>> Hi Alfredo,
>>
>> The only thing you need is:
>>
>> -A FORWARD -o docker0 -j DOCKER
>> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
>> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
>> -A FORWARD -i docker0 -o docker0 -j ACCEPT
>>
>> Best regards,
>> Radek Gruchalski
>> ra...@gruchalski.com 
>> de.linkedin.com/in/radgruchalski/
>>
>>
>> *Confidentiality:*This communication is intended for the above-named
>> person and may be confidential and/or legally privileged.
>> If it has come to you in error you must take no action based on it, nor
>> must you copy or show it to anyone; please delete/destroy and inform the
>> sender immediately.
>>
>> On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:
>>
>> Hello guys,
>>
>> I don't know if that is the right place to ask. So, since we use public
>> cloud, we are trying to hardening our servers allowing traffic just from
>> our subnetworks. However, when I tried to implement some iptables rules I
>> got problems with Docker, which couldn't find its chain anymore.
>>
>> Then, I am wondering if anyone has ever implemented any iptables rule in
>> this scenario.
>>
>> I've seen this[1] "tip", however, I think that it is not apply to this
>> case, because it is very "static".
>>
>> [1] - https://fralef.me/docker-and-iptables.html
>>
>> Best Regards,
>>
>> --
>> Alfredo Miranda
>>
>>
>>
>
>
> --
> Alfredo Miranda
>



-- 
Avinash Sridharan, Mesosphere
+1 (323) 702 5245


Re: Custom IPTables rules

2016-04-13 Thread Rad Gruchalski
Have you tried restarting docker daemon afterwards?










Best regards,

Radek Gruchalski

ra...@gruchalski.com
de.linkedin.com/in/radgruchalski/

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.



On Wednesday, 13 April 2016 at 21:53, Alfredo Carneiro wrote:

> Hey Rad,
>  
> Thanks for your answer! I have added theses lines and now looks very similar 
> before.
>  
> iptables -N DOCKER
> iptables -A FORWARD -o docker0 -j DOCKER
> iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j 
> ACCEPT
> iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT
>  
>  
> However, I am still getting errors.
>  
> docker: Error response from daemon: failed to create endpoint cranky_kilby on 
> network bridge: iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 
> 0/0 --dport 8080 -j DNAT --to-destination 172.17.0.2:8080 
> (http://172.17.0.2:8080) ! -i docker0: iptables: No chain/target/match by 
> that name.
>  (exit status 1).
>  
>  
> This is my iptables -L output:
>  
> Chain FORWARD (policy DROP)
> target prot opt source   destination  
> DOCKER all  --  anywhere anywhere 
> ACCEPT all  --  anywhere anywhere ctstate 
> RELATED,ESTABLISHED
> ACCEPT all  --  anywhere anywhere 
> ACCEPT all  --  anywhere anywhere 
>  
> Chain OUTPUT (policy ACCEPT)
> target prot opt source   destination  
> ACCEPT all  --  anywhere anywhere 
>  
> Chain DOCKER (1 references)
> target prot opt source   destination
>  
>  
> I hid the INPUT chain because is very big!
>  
> Best Regards,
>  
> On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski  (mailto:ra...@gruchalski.com)> wrote:
> > Hi Alfredo,  
> >  
> > The only thing you need is:
> >  
> > -A FORWARD -o docker0 -j DOCKER
> > -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> > -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> > -A FORWARD -i docker0 -o docker0 -j ACCEPT
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> >  
> > Best regards,

> > Radek Gruchalski
> > 
ra...@gruchalski.com (mailto:ra...@gruchalski.com)
 
> > (mailto:ra...@gruchalski.com)
> > de.linkedin.com/in/radgruchalski/ (http://de.linkedin.com/in/radgruchalski/)
> >  
> > Confidentiality:
> > This communication is intended for the above-named person and may be 
> > confidential and/or legally privileged.
> > If it has come to you in error you must take no action based on it, nor 
> > must you copy or show it to anyone; please delete/destroy and inform the 
> > sender immediately.
> >  
> >  
> >  
> > On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:
> >  
> > > Hello guys,
> > >  
> > > I don't know if that is the right place to ask. So, since we use public 
> > > cloud, we are trying to hardening our servers allowing traffic just from 
> > > our subnetworks. However, when I tried to implement some iptables rules I 
> > > got problems with Docker, which couldn't find its chain anymore.
> > >  
> > > Then, I am wondering if anyone has ever implemented any iptables rule in 
> > > this scenario.
> > >  
> > > I've seen this[1] "tip", however, I think that it is not apply to this 
> > > case, because it is very "static".
> > >  
> > > [1] - https://fralef.me/docker-and-iptables.html
> > >  
> > > Best Regards,
> > >  
> > > --  
> > > Alfredo Miranda  
> >  
>  
>  
>  
> --  
> Alfredo Miranda  



Re: Custom IPTables rules

2016-04-13 Thread Alfredo Carneiro
Hey Rad,

Thanks for your answer! I have added these lines and now it looks very
similar to before.

*iptables -N DOCKER*
*iptables -A FORWARD -o docker0 -j DOCKER*
*iptables -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED
-j ACCEPT*
*iptables -A FORWARD -i docker0 ! -o docker0 -j ACCEPT*
*iptables -A FORWARD -i docker0 -o docker0 -j ACCEPT*

However, I am still getting errors.

*docker: Error response from daemon: failed to create endpoint cranky_kilby
on network bridge: iptables failed: iptables --wait -t nat -A DOCKER -p tcp
-d 0/0 --dport 8080 -j DNAT --to-destination 172.17.0.2:8080
 ! -i docker0: iptables: No chain/target/match by
that name.*
* (exit status 1).*

This is my iptables -L output:

*Chain FORWARD (policy DROP)*
*target prot opt source   destination *
*DOCKER all  --  anywhere anywhere*
*ACCEPT all  --  anywhere anywhere ctstate
RELATED,ESTABLISHED*
*ACCEPT all  --  anywhere anywhere*
*ACCEPT all  --  anywhere anywhere*

*Chain OUTPUT (policy ACCEPT)*
*target prot opt source   destination *
*ACCEPT all  --  anywhere anywhere*

*Chain DOCKER (1 references)*
*target prot opt source   destination*

I hid the INPUT chain because it is very big!

Best Regards,

On Wed, Apr 13, 2016 at 4:29 PM, Rad Gruchalski 
wrote:

> Hi Alfredo,
>
> The only thing you need is:
>
> -A FORWARD -o docker0 -j DOCKER
> -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
> -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com 
> de.linkedin.com/in/radgruchalski/
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:
>
> Hello guys,
>
> I don't know if that is the right place to ask. So, since we use public
> cloud, we are trying to hardening our servers allowing traffic just from
> our subnetworks. However, when I tried to implement some iptables rules I
> got problems with Docker, which couldn't find its chain anymore.
>
> Then, I am wondering if anyone has ever implemented any iptables rule in
> this scenario.
>
> I've seen this[1] "tip", however, I think that it is not apply to this
> case, because it is very "static".
>
> [1] - https://fralef.me/docker-and-iptables.html
>
> Best Regards,
>
> --
> Alfredo Miranda
>
>
>


-- 
Alfredo Miranda


Re: Mesos Master and Slave on same server?

2016-04-13 Thread Stefano Bianchi
I have set up an HA cluster:
3 Mesos masters with one leader and a quorum of 2, and
3 Mesos slaves.
I can easily start tasks from Marathon.
The common way is to put the slaves on other machines.
However, running the mesos-slave service on the master as well is not
a problem.

2016-04-13 19:24 GMT+02:00 Paul Bell :

> Hi June,
>
> In addition to doing what Pradeep suggests, I also now & then run a single
> node "cluster" that houses mesos-master, mesos-slave, and Marathon.
>
> Works fine.
>
> Cordially,
>
> Paul
>
> On Wed, Apr 13, 2016 at 12:36 PM, Pradeep Chhetri <
> pradeep.chhetr...@gmail.com> wrote:
>
>> I would suggest you to run mesos-master and zookeeper and marathon on
>> same set of hosts (maybe call them as coordinator nodes) and use completely
>> different set of nodes for mesos slaves. This way you can do the
>> maintenance of such hosts in a very planned fashion.
>>
>> On Wed, Apr 13, 2016 at 4:22 PM, Stefano Bianchi 
>> wrote:
>>
>>> For sure it is possible.
>>> The Mesos master will simply see the resources offered by the machine on
>>> which mesos-slave is also running, transparently.
>>>
>>> 2016-04-13 16:34 GMT+02:00 June Taylor :
>>>
 All of our node servers are identical hardware. Is it reasonable for me
 to install the Mesos-Master and Mesos-Slave on the same physical hardware?

 Thanks,
 June Taylor
 System Administrator, Minnesota Population Center
 University of Minnesota

>>>
>>>
>>
>>
>> --
>> Regards,
>> Pradeep Chhetri
>>
>
>


Re: Disappearing tasks

2016-04-13 Thread Justin Ryan
Hiya, coming back to this thread after having to focus on some other things 
(and facing some issues I brought up in another thread).

I reconfigured this cluster with work_dir as /var/mesos and am logging output 
from ‘mesos ps’ from the python mesos.cli package in a loop to try and catch 
the next occurrence.
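
Concretely, something like this (assumes mesos.cli is installed and pointed at the
master; the log path is arbitrary):

while true; do
  date
  mesos ps        # from the python mesos.cli package
  sleep 60
done >> /var/log/mesos-ps-watch.log 2>&1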

Still, what seems most interesting to me is that the count of “Running” 
remembers the lost processes.  Even now, as I’ve launched 3 new instances of 
flume from marathon, the running count is 6.  Killed count shows recently 
killed tasks, but was at 0 earlier when I had 3 processes running which mesos 
had lost.


From: Greg Mann
Reply-To: "user@mesos.apache.org"
Date: Wednesday, April 6, 2016 at 4:24 PM
To: user
Subject: Re: Disappearing tasks

Hi Justin,
I'm sorry that you've been having difficulty with your cluster. Do you have 
access to master/agent logs around the time that these tasks went missing from 
the Mesos UI? It would be great to have a look at those if possible.

I would still recommend against setting the agent work_dir to '/tmp/mesos' for 
a long-running cluster scenario - this location is really only suitable for 
local, short-term testing purposes. We currently have a patch in flight to 
update our docs to clarify this point. Even though the work_dir appeared to be 
intact when you checked it, it's possible that some of the agent's checkpoint 
data had been deleted. Could you try changing the work_dir for your agents to 
see if that helps?

Cheers,
Greg


On Wed, Apr 6, 2016 at 11:27 AM, Justin Ryan wrote:
Thanks Rik – Interesting theory, I considered that it might have some 
connection to the removal of sandbox files.

Sooo this morning I had all of my kafka brokers disappear again, and checked 
this on a node that is definitely still running kafka.  All of /tmp/mesos, 
including what appear to be the sandbox and logs of the running process, are 
still there, and the “running” count this time is actually higher than I’d 
expect.  I had 9 kafka brokers and 3 flume processes running, and the running 
count currently says 15.

From: Rik
Reply-To: "user@mesos.apache.org"
Date: Tuesday, April 5, 2016 at 3:19 PM
To: "user@mesos.apache.org"
Subject: Re: Disappearing tasks

FWIW, the only time I've seen this happen here is when someone accidentally 
clears the work dir (default=/tmp/mesos), which I personally would advise to 
put somewhere else where rogue people or processes are less likely to throw 
things away accidentally. Could it be that? Although... tasks were 'lost' at 
that point, so it differs slightly (same general outcome, not entirely the same 
symptoms).

On Tue, Apr 5, 2016 at 11:35 PM, Justin Ryan wrote:
An interesting fact I left out, the count of “Running” tasks remains intact, 
while absolutely no history remains in the dashboard.



From: Justin Ryan
Reply-To: "user@mesos.apache.org"
Date: Tuesday, April 5, 2016 at 12:29 PM
To: "user@mesos.apache.org"
Subject: Disappearing tasks

Hiya folks!

I’ve spent the past few weeks prototyping a new data cluster with Mesos, Kafka, 
and Flume delivering data to HDFS which we plan to interact with via Spark.  In 
the prototype environment, I had a fairly high volume of test data flowing for 
some weeks with little to no major issues except for learning about tuning 
Kafka and Flume.

I’m launching kafka with the 
github.com/mesos/kafka project, and flume is run 
via marathon.

Yesterday morning, I came in and my flume jobs had disappeared from the task 
list in Mesos, though I found the actual processes still running when I 
searched the cluster ’ps’ output.  Later in the day, I had the same happen to 
my kafka brokers.  In some cases, the only way I’ve found to recover from this 
is to shut everything down and clear the zookeeper data, which would be fairly 
drastic if it happened in production, and particularly if we had many tasks / 
frameworks that were fine, but one or two disappeared.

I’d appreciate any help sorting through this, I’m using latest Mesos and CDH5 
installed via community Chef cookbooks.





Re: Custom IPTables rules

2016-04-13 Thread Rad Gruchalski
Hi Alfredo,  

The only thing you need is:

-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT











Best regards,

Radek Gruchalski

ra...@gruchalski.com
de.linkedin.com/in/radgruchalski/

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.



On Wednesday, 13 April 2016 at 21:27, Alfredo Carneiro wrote:

> Hello guys,
>  
> I don't know if that is the right place to ask. So, since we use public 
> cloud, we are trying to hardening our servers allowing traffic just from our 
> subnetworks. However, when I tried to implement some iptables rules I got 
> problems with Docker, which couldn't find its chain anymore.
>  
> Then, I am wondering if anyone has ever implemented any iptables rule in this 
> scenario.
>  
> I've seen this[1] "tip", however, I think that it is not apply to this case, 
> because it is very "static".
>  
> [1] - https://fralef.me/docker-and-iptables.html
>  
> Best Regards,
>  
> --  
> Alfredo Miranda  



Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Stefano Bianchi
Ah ok.
No problem.
See you and best regards!!!
On 13 Apr 2016 at 21:09, "June Taylor" wrote:

> Stefano,
>
> We are not currently using CoreOS - it just seemed that it had a feature
> you're looking for. We are new to Mesos as well. I apologize that I cannot
> be more helpful.
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Wed, Apr 13, 2016 at 2:07 PM, Stefano Bianchi 
> wrote:
>
>> June Taylor, is there another channel on which i can contact you?
>> Since i am a simple student doing his master thesis on mesos/calico, i
>> need all the support possible :P
>> I already have of course from my supervisor and advisor, but sometimes is
>> better have discussions with more skilled people.
>>
>> 2016-04-13 17:33 GMT+02:00 June Taylor :
>>
>>> Stefano,
>>>
>>> That's the feature, yes.
>>>
>>> Unfortunately I don't know the answer to your questions further.
>>>
>>>
>>> Thanks,
>>> June Taylor
>>> System Administrator, Minnesota Population Center
>>> University of Minnesota
>>>
>>> On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi 
>>> wrote:
>>>
 Thanks to your reply June.

 You mean this feature?
 https://coreos.com/etcd/docs/latest/clustering.html
 Because for sure i know that calico exploit etcd as datastore, but i
 need to know if, and if yes how, i can manage etcd as DNS.

 2016-04-13 15:04 GMT+02:00 June Taylor :

> Stefano,
>
> We are exploring CoreOS, and I believe it has the feature you're
> looking for.
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi  > wrote:
>
>> Hi all
>>
>> i have to set up two mesos clusters.
>> On each cluster i should integrate Project calico in order to
>> distribute tasks among the agents. But these tasks should be sent also 
>> from
>> a slave of one cluster to the slave of the other cluster.
>> I know that when i start calico on each slaves, it registers the
>> hosts to the ETCD_AUTHORITY so calico use etcd.
>> In order to have interconnection among 2 mesos clusters, i should
>> have the same etcd datastore for both my Mesos/Calico clusters.
>> Someone knows how to reach this condition?
>>
>> Thanks in advance
>>
>>
>

>>>
>>
>


Custom IPTables rules

2016-04-13 Thread Alfredo Carneiro
Hello guys,

I don't know if this is the right place to ask. Since we use a public
cloud, we are trying to harden our servers by allowing traffic just from
our subnetworks. However, when I tried to implement some iptables rules, I
got problems with Docker, which couldn't find its chain anymore.

Then, I am wondering if anyone has ever implemented any iptables rule in
this scenario.

I've seen this[1] "tip"; however, I think it does not apply to this
case, because it is very "static".

[1] - https://fralef.me/docker-and-iptables.html
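
To make the intent concrete, the kind of INPUT policy I mean is roughly the
following (10.0.0.0/16 stands in for our subnetworks; it is only an illustration):

iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
iptables -A INPUT -s 10.0.0.0/16 -j ACCEPT
iptables -P INPUT DROP    # everything not matched above is dropped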

Best Regards,

-- 
Alfredo Miranda


[RESULT][VOTE] Release Apache Mesos 0.28.1 (rc2)

2016-04-13 Thread Jie Yu
Hi all,

The vote for Mesos 0.28.1 (rc2) has passed with the
following votes.

+1 (Binding)
--
Michael Park
Vinod Kone
Kapil Arya

+1 (Non-binding)
--
Greg Mann

There were no 0 or -1 votes.

Please find the release at:
https://dist.apache.org/repos/dist/release/mesos/0.28.1

It is recommended to use a mirror to download the release:
http://www.apache.org/dyn/closer.cgi

The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.1

The mesos-0.28.1.jar has been released to:
https://repository.apache.org

The website (http://mesos.apache.org) will be updated shortly to reflect
this release.

Thanks,
- Jie


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread June Taylor
Stefano,

We are not currently using CoreOS - it just seemed that it had a feature
you're looking for. We are new to Mesos as well. I apologize that I cannot
be more helpful.


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Wed, Apr 13, 2016 at 2:07 PM, Stefano Bianchi 
wrote:

> June Taylor, is there another channel on which i can contact you?
> Since i am a simple student doing his master thesis on mesos/calico, i
> need all the support possible :P
> I already have of course from my supervisor and advisor, but sometimes is
> better have discussions with more skilled people.
>
> 2016-04-13 17:33 GMT+02:00 June Taylor :
>
>> Stefano,
>>
>> That's the feature, yes.
>>
>> Unfortunately I don't know the answer to your questions further.
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi 
>> wrote:
>>
>>> Thanks to your reply June.
>>>
>>> You mean this feature?
>>> https://coreos.com/etcd/docs/latest/clustering.html
>>> Because for sure i know that calico exploit etcd as datastore, but i
>>> need to know if, and if yes how, i can manage etcd as DNS.
>>>
>>> 2016-04-13 15:04 GMT+02:00 June Taylor :
>>>
 Stefano,

 We are exploring CoreOS, and I believe it has the feature you're
 looking for.


 Thanks,
 June Taylor
 System Administrator, Minnesota Population Center
 University of Minnesota

 On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi 
 wrote:

> Hi all
>
> i have to set up two mesos clusters.
> On each cluster i should integrate Project calico in order to
> distribute tasks among the agents. But these tasks should be sent also 
> from
> a slave of one cluster to the slave of the other cluster.
> I know that when i start calico on each slaves, it registers the hosts
> to the ETCD_AUTHORITY so calico use etcd.
> In order to have interconnection among 2 mesos clusters, i should have the
> same etcd datastore for both my Mesos/Calico clusters.
> Someone knows how to reach this condition?
>
> Thanks in advance
>
>

>>>
>>
>


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Stefano Bianchi
June Taylor, is there another channel on which I can contact you?
Since I am just a student doing his master's thesis on Mesos/Calico, I need
all the support possible :P
I already have support from my supervisor and advisor, of course, but sometimes
it is better to have discussions with more skilled people.

2016-04-13 17:33 GMT+02:00 June Taylor :

> Stefano,
>
> That's the feature, yes.
>
> Unfortunately I don't know the answer to your questions further.
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi 
> wrote:
>
>> Thanks to your reply June.
>>
>> You mean this feature?
>> https://coreos.com/etcd/docs/latest/clustering.html
>> Because for sure i know that calico exploit etcd as datastore, but i need
>> to know if, and if yes how, i can manage etcd as DNS.
>>
>> 2016-04-13 15:04 GMT+02:00 June Taylor :
>>
>>> Stefano,
>>>
>>> We are exploring CoreOS, and I believe it has the feature you're looking
>>> for.
>>>
>>>
>>> Thanks,
>>> June Taylor
>>> System Administrator, Minnesota Population Center
>>> University of Minnesota
>>>
>>> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi 
>>> wrote:
>>>
 Hi all

 i have to set up two mesos clusters.
 On each cluster i should integrate Project calico in order to
 distribute tasks among the agents. But these tasks should be sent also from
 a slave of one cluster to the slave of the other cluster.
 I know that when i start calico on each slaves, it registers the hosts
 to the ETCD_AUTHORITY so calico use etcd.
 In order to have interconnection among 2 mesos clusters, i should have the
 same etcd datastore for both my Mesos/Calico clusters.
 Someone knows how to reach this condition?

 Thanks in advance


>>>
>>
>


Re: Problems with scheduling tasks in mesos and spark

2016-04-13 Thread Hans van den Bogert
Hi, 

This is a hard problem to solve at the moment if your requirement is that you really need 
Spark to operate in coarse-grained mode.
I assume this is a problem because you are trying to run two Spark applications 
(as opposed to two jobs in one application).

The obvious “solution” would be to run both applications in 
fine-grained mode. 
You could also try submitting both jobs through the same spark context 
with its job scheduler set to FAIR (the default is FIFO). However, I 
don’t have enough context information to know whether this latter option would be 
applicable for you.
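
As a rough sketch (property names per the Spark 1.6 docs; the master URL and
app name are only illustrative):

# option 1: run each application in fine-grained mode
./bin/spark-submit --master mesos://zk://zk-host:2181/mesos \
  --conf spark.mesos.coarse=false your-app.jar

# option 2: share one SparkContext and set its job scheduler to FAIR
./bin/spark-submit --master mesos://zk://zk-host:2181/mesos \
  --conf spark.scheduler.mode=FAIR your-app.jar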

If you need more help, please provide some context of what you’re trying to 
achieve.

Regards,

Hans

> On Apr 13, 2016, at 6:53 PM, Andreas Tsarida  
> wrote:
> 
> 
> Hello,
> 
> I’m trying to figure out a solution for dynamic resource allocation in mesos 
> within the same framework ( spark ).
> 
> Scenario :
> 1 - run spark a job in coarse mode
> 2 - run second job in coarse mode
> 
> Second job will not start unless first job finishes which is not something 
> that I would want. The problem is small when the job running doesn’t take too 
> long but when it does nobody can work on the cluster.
> 
> Best scenario would be to have mesos revoke resources from the first job and 
> try to allocate resources to the second job.
> 
> If there anybody else who solved this issue in another way ?
> 
> Thanks



Re: Mesos Master and Slave on same server?

2016-04-13 Thread Paul Bell
Hi June,

In addition to doing what Pradeep suggests, I also now & then run a single
node "cluster" that houses mesos-master, mesos-slave, and Marathon.

Works fine.

Cordially,

Paul

On Wed, Apr 13, 2016 at 12:36 PM, Pradeep Chhetri <
pradeep.chhetr...@gmail.com> wrote:

> I would suggest you to run mesos-master and zookeeper and marathon on same
> set of hosts (maybe call them as coordinator nodes) and use completely
> different set of nodes for mesos slaves. This way you can do the
> maintenance of such hosts in a very planned fashion.
>
> On Wed, Apr 13, 2016 at 4:22 PM, Stefano Bianchi 
> wrote:
>
>> For sure it is possible.
>> Simply Mesos-master will the the resources offered by the machine on
>> which is running mesos-slave also, transparently.
>>
>> 2016-04-13 16:34 GMT+02:00 June Taylor :
>>
>>> All of our node servers are identical hardware. Is it reasonable for me
>>> to install the Mesos-Master and Mesos-Slave on the same physical hardware?
>>>
>>> Thanks,
>>> June Taylor
>>> System Administrator, Minnesota Population Center
>>> University of Minnesota
>>>
>>
>>
>
>
> --
> Regards,
> Pradeep Chhetri
>


Re: Mesos-master url in HA

2016-04-13 Thread Steven Schlansker
I personally believe that this is not a sufficient workaround -- what
if the master is failing over, and your autoscaler happens to redirect
to a master which just lost leadership?

This solution is inherently racy and leads to the end user writing extra code
to work around it, and even then can still result in extremely difficult
to diagnose bugs.

I'd filed an issue on this a while ago (0.20 days):
https://issues.apache.org/jira/browse/MESOS-1865

but unfortunately it is still not resolved.

> On Apr 13, 2016, at 12:44 AM, Alexander Rojas  wrote:
> 
> Hi guillermo,
> 
> The master has the `/redirect` endpoint which should point you to the current 
> leader.
> 
>> On 13 Apr 2016, at 08:20, Guillermo Rodriguez  wrote:
>> 
>> Hi,
>> 
>> I have 3 mesos master setup for HA. One has the lead.
>> 
>> http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
>> http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
>> http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
>> 
>> I have an URL mesos-master.mydomain.com pointing to the leader and that 
>> works fine because it returns the slave list which I need for my autoscaler. 
>> But I'm afraid if the master fails the URL will no longer be valid. So I 
>> added the three IPs to the router (AWS Route53)  so it would round robin, 
>> but of course this will return an empty list sometimes because it hits a 
>> follower which returns empty.
>> 
>> So my question is, is it possible to redirect the call from the followers to 
>> the leader master?
>> 
>> Thanks.
>> 
> 





Problems with scheduling tasks in mesos and spark

2016-04-13 Thread Andreas Tsarida

Hello,

I’m trying to figure out a solution for dynamic resource allocation in mesos 
within the same framework ( spark ).

Scenario :
1 - run a spark job in coarse mode
2 - run a second job in coarse mode

The second job will not start unless the first job finishes, which is not something that 
I would want. The problem is small when the running job doesn’t take too long, 
but when it does, nobody can work on the cluster.

Best scenario would be to have mesos revoke resources from the first job and 
try to allocate resources to the second job.

Is there anybody else who has solved this issue in another way?

Thanks




Re: Mesos Task History

2016-04-13 Thread June Taylor
We have a single master at the moment. Does the task history get cleared
when the mesos-master restarts?


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Wed, Apr 13, 2016 at 11:33 AM, Pradeep Chhetri <
pradeep.chhetr...@gmail.com> wrote:

> Yes, they get cleaned up whenever the mesos master leader failover happens.
>
> On Wed, Apr 13, 2016 at 3:32 PM, June Taylor  wrote:
>
>> I am noticing that recently our Completed Tasks and Terminated Frameworks
>> lists are empty. Where are these stored, and do they get automatically
>> cleared out at some interval?
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>
>
>
> --
> Regards,
> Pradeep Chhetri
>


Re: Mesos Master and Slave on same server?

2016-04-13 Thread Pradeep Chhetri
I would suggest running mesos-master, zookeeper, and marathon on the same
set of hosts (maybe call them coordinator nodes) and using a completely
different set of nodes for the mesos slaves. This way you can do
maintenance on those hosts in a very planned fashion.

On Wed, Apr 13, 2016 at 4:22 PM, Stefano Bianchi 
wrote:

> For sure it is possible.
> Simply Mesos-master will the the resources offered by the machine on which
> is running mesos-slave also, transparently.
>
> 2016-04-13 16:34 GMT+02:00 June Taylor :
>
>> All of our node servers are identical hardware. Is it reasonable for me
>> to install the Mesos-Master and Mesos-Slave on the same physical hardware?
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>
>


-- 
Regards,
Pradeep Chhetri


Re: Mesos Task History

2016-04-13 Thread Pradeep Chhetri
Yes, they get cleaned up whenever the mesos master leader failover happens.

On Wed, Apr 13, 2016 at 3:32 PM, June Taylor  wrote:

> I am noticing that recently our Completed Tasks and Terminated Frameworks
> lists are empty. Where are these stored, and do they get automatically
> cleared out at some interval?
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>



-- 
Regards,
Pradeep Chhetri


Re: Pyspark Cluster Mode

2016-04-13 Thread Pradeep Chhetri
In cluster mode, you first need to run the *MesosClusterDispatcher* application
on marathon (read more about that here:
http://spark.apache.org/docs/latest/running-on-mesos.html#cluster-mode).

In both client and cluster mode, you need to pass the --master flag when
submitting the job; the only difference is that in cluster mode you specify
the URL of the dispatcher
(mesos://:) while in client mode you
specify the URL of the mesos-master
(mesos://:)
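
For example (host names and ports here are placeholders, not your actual
values):

# client mode: point spark-submit at the mesos master
./bin/spark-submit --master mesos://mesos-master.example.com:5050 my_job.py

# cluster mode: point spark-submit at the running MesosClusterDispatcher
./bin/spark-submit --deploy-mode cluster \
  --master mesos://dispatcher.example.com:7077 my_job.py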

On Wed, Apr 13, 2016 at 3:24 PM, June Taylor  wrote:

> I'm interested in what the "best practice" is for running pyspark jobs
> against a mesos cluster.
>
> Right now, we're simply passing the --master mesos://host:5050 flag, which
> appears to register a framework properly.
>
> However, I was told this isn't "cluster mode" - and I'm a bit confused.
> What is the recommended method of doing this?
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>



-- 
Regards,
Pradeep Chhetri


Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread Klaus Ma
+1


Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource Technology, STG, IBM GCG
+86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me

On Wed, Apr 13, 2016 at 11:14 PM, Paul  wrote:

> +1
>
> On Apr 13, 2016, at 11:01 AM, Ken Sipe  wrote:
>
> +1
>
> On Apr 12, 2016, at 5:58 PM, Greg Mann  wrote:
>
> Hey folks!
> A number of situations have arisen in which the default value of the Mesos
> agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in
> which the automatic cleanup of '/tmp' deletes agent metadata. To resolve
> this, we would like to eliminate the default value of the agent
> `--work_dir` flag. You can find the relevant JIRA here
> .
>
> We considered simply changing the default value to a more appropriate
> location, but decided against this because the expected filesystem
> structure varies from platform to platform, and because it isn't guaranteed
> that the Mesos agent would have access to the default path on a particular
> platform.
>
> Eliminating the default `--work_dir` value means that the agent would exit
> immediately if the flag is not provided, whereas currently it launches
> successfully in this case. This will break existing infrastructure which
> relies on launching the Mesos agent without specifying the work directory.
> I believe this is an acceptable change because '/tmp/mesos' is not a
> suitable location for the agent work directory except for short-term local
> testing, and any production scenario that is currently using this location
> should be altered immediately.
>
> If you have any thoughts/opinions/concerns regarding this change, please
> let us know!
>
> Cheers,
> Greg
>
>
>


pyspark exiting cleanly, but executor remains on slave as an orphaned task

2016-04-13 Thread June Taylor
We are running pyspark against our cluster in coarse-grained mode by
specifying the --master mesos://host:5050 flag, which properly creates one
task on each node.

However, if the driver is shut down, it appears that these executors become
orphaned_tasks, still consuming resources on the slave, but no longer being
represented in the master's understanding of available resources.

Examining the stdout/stderr shows it exited:

Registered executor on node4
Starting task 0
sh -c 'cd spark-1*;  ./bin/spark-class
org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://
CoarseGrainedScheduler@128.101.163.200:41563 --executor-id
aa1337b6-43b0-4236-b445-c8ccbfb60506-S2/0 --hostname node4 --cores 31
--app-id aa1337b6-43b0-4236-b445-c8ccbfb60506-0097'
Forked command at 117620
Command exited with status 1 (pid: 117620)

But, these executors are remaining on all the slaves.

What can we do to clear them out? Stopping mesos-slave and removing the
full work-dir is successful, but also destroys our other tasks.
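
For reference, the only cleanup that has worked so far looks roughly like this
(a sketch; the path is whatever --work_dir points at on the slave):

service mesos-slave stop
rm -rf /tmp/mesos    # assumed default work_dir; substitute your own --work_dir
service mesos-slave start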

Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota


Re: Mesos Masters Leader Keeps Fluctuating

2016-04-13 Thread haosdent
It sounds like an issue in 0.28. I created a ticket,
https://issues.apache.org/jira/browse/MESOS-5207, from this to continue
investigating. @suruchi, if you could attach the logs of the mesos masters and your
zookeeper configuration, it would be very helpful for the investigation.

On Wed, Apr 13, 2016 at 8:24 PM, Stefano Bianchi 
wrote:

> Thanks for your reply @haosdent.
> I destroyed my VM and re build mesos 0.28 with just one master, and now is
> working.
> i will try to add another master but for the moment, since on openstack i
> don't have much resources i need to use that VM as a slave.
> However in the previous configuration the switch between two masters was
> ok, just when the master was leading after, more or less 30 seconds, there
> was that Failed to connect message.
>
> 2016-04-13 13:08 GMT+02:00 haosdent :
>
>> Hi, @Stefano Could you show conf/zoo.cfg? And how many zookeper nodes you
>> haved? And "but after a while again Failed to connec"​, how long the
>> interval here? Is it always "few seconds"?
>>
>
>


-- 
Best Regards,
Haosdent Huang


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread June Taylor
Stefano,

That's the feature, yes.

Unfortunately, I don't know the answers to your further questions.


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Wed, Apr 13, 2016 at 10:26 AM, Stefano Bianchi 
wrote:

> Thanks to your reply June.
>
> You mean this feature?
> https://coreos.com/etcd/docs/latest/clustering.html
> Because for sure i know that calico exploit etcd as datastore, but i need
> to know if, and if yes how, i can manage etcd as DNS.
>
> 2016-04-13 15:04 GMT+02:00 June Taylor :
>
>> Stefano,
>>
>> We are exploring CoreOS, and I believe it has the feature you're looking
>> for.
>>
>>
>> Thanks,
>> June Taylor
>> System Administrator, Minnesota Population Center
>> University of Minnesota
>>
>> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi 
>> wrote:
>>
>>> Hi all
>>>
>>> i have to set up two mesos clusters.
>>> On each cluster i should integrate Project calico in order to distribute
>>> tasks among the agents. But these tasks should be sent also from a slave of
>>> one cluster to the slave of the other cluster.
>>> I know that when i start calico on each slaves, it registers the hosts
>>> to the ETCD_AUTHORITY so calico use etcd.
>>> In order to have interconnection among 2 mesos clusters, i should have the
>>> same etcd datastore for both my Mesos/Calico clusters.
>>> Someone knows how to reach this condition?
>>>
>>> Thanks in advance
>>>
>>>
>>
>


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread Stefano Bianchi
Thanks for your reply, June.

Do you mean this feature?
https://coreos.com/etcd/docs/latest/clustering.html
I know for sure that calico uses etcd as its datastore, but I need
to know if, and if so how, I can manage etcd as DNS.

2016-04-13 15:04 GMT+02:00 June Taylor :

> Stefano,
>
> We are exploring CoreOS, and I believe it has the feature you're looking
> for.
>
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi 
> wrote:
>
>> Hi all
>>
>> i have to set up two mesos clusters.
>> On each cluster i should integrate Project calico in order to distribute
>> tasks among the agents. But these tasks should be sent also from a slave of
>> one cluster to the slave of the other cluster.
>> I know that when i start calico on each slaves, it registers the hosts to
>> the ETCD_AUTHORITY so calico use etcd.
>> In order to have interconnection among 2 mesos clusters, i should have the
>> same etcd datastore for both my Mesos/Calico clusters.
>> Someone knows how to reach this condition?
>>
>> Thanks in advance
>>
>>
>


Re: Mesos Master and Slave on same server?

2016-04-13 Thread Stefano Bianchi
It is certainly possible.
The Mesos-master will simply see the resources offered by the machine that
is also running mesos-slave, transparently.

2016-04-13 16:34 GMT+02:00 June Taylor :

> All of our node servers are identical hardware. Is it reasonable for me to
> install the Mesos-Master and Mesos-Slave on the same physical hardware?
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>


Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread Ken Sipe
+1
> On Apr 12, 2016, at 5:58 PM, Greg Mann  wrote:
> 
> Hey folks!
> A number of situations have arisen in which the default value of the Mesos 
> agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in which 
> the automatic cleanup of '/tmp' deletes agent metadata. To resolve this, we 
> would like to eliminate the default value of the agent `--work_dir` flag. You 
> can find the relevant JIRA here 
> .
> 
> We considered simply changing the default value to a more appropriate 
> location, but decided against this because the expected filesystem structure 
> varies from platform to platform, and because it isn't guaranteed that the 
> Mesos agent would have access to the default path on a particular platform.
> 
> Eliminating the default `--work_dir` value means that the agent would exit 
> immediately if the flag is not provided, whereas currently it launches 
> successfully in this case. This will break existing infrastructure which 
> relies on launching the Mesos agent without specifying the work directory. I 
> believe this is an acceptable change because '/tmp/mesos' is not a suitable 
> location for the agent work directory except for short-term local testing, 
> and any production scenario that is currently using this location should be 
> altered immediately.
> 
> If you have any thoughts/opinions/concerns regarding this change, please let 
> us know!
> 
> Cheers,
> Greg



Mesos Master and Slave on same server?

2016-04-13 Thread June Taylor
All of our node servers are identical hardware. Is it reasonable for me to
install the Mesos-Master and Mesos-Slave on the same physical hardware?

Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota


Mesos Task History

2016-04-13 Thread June Taylor
I am noticing that recently our Completed Tasks and Terminated Frameworks
lists are empty. Where are these stored, and do they get automatically
cleared out at some interval?

Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota


Multiple Marathon Frameworks on a single cluster

2016-04-13 Thread June Taylor
We have a cluster with 6 nodes. I have registered 3 of them with the
default * role, and the other three with a "production" role.

It appears that Marathon can only consume a single role, and therefore it
appears I need to stand up two Marathon instances.

First question: is that correct?

Second: If correct, what is the recommended method of running two
frameworks? I have installed Mesos and Marathon using the Mesosphere
packages. Should I simply copy the /etc/init/marathon.conf file and make
two - one for the default role and one for production?
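
Something like this is what I have in mind (a sketch only; flag names as I
understand them from the Marathon docs, ports and zk paths chosen arbitrarily):

# instance 1: default role
marathon --master zk://zk-host:2181/mesos --zk zk://zk-host:2181/marathon \
  --http_port 8080

# instance 2: production role
marathon --master zk://zk-host:2181/mesos --zk zk://zk-host:2181/marathon-production \
  --http_port 8081 --mesos_role production

Each instance would need its own --zk state path and HTTP port so they don't
collide.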

How are you segregating your resources with Marathon?

Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota


Re: marathon issue in running a docker container.

2016-04-13 Thread nirmalendu swain
Thanks, Haosdent, for identifying the issue. If I remove the port from the
configuration, then the instances spin up correctly. Thanks a ton for the timely
help.

Best Regards,
Nirmal

On Tuesday, 12 April 2016 11:01 PM, haosdent  wrote:
 

 >Server running at: http://0.0.0.0:7683
And according to your log, your service is running on 7683 while your 
configuration uses 8080 in portMappings.
On Wed, Apr 13, 2016 at 1:25 AM, haosdent  wrote:

>If I do telnet or curl, it does not show me any response.
Looks weird. Could you check whether the task's status is running, or some
other status, in the mesos web UI or the marathon web UI?
And is it possible for you to use `docker ps` to find the container and use 
`docker exec` to enter the container and check whether the service is running?
On Tue, Apr 12, 2016 at 7:13 PM, nirmalendu swain  
wrote:

I might be wrong here. But I am using marathon-lb package of DCOS which already 
has haproxy.
 

On Tuesday, 12 April 2016 2:14 PM, Rad Gruchalski  
wrote:
 

Do you have anything like haproxy for port mappings installed on your Mesos 
cluster? When using BRIDGE networking, your process inside the container, say 
SSH, is running on its standard port 22. Marathon allocates a random port in the 
default range of 31000 to 32000. However, it is your task to map the 
:31xxx to :22.
The simplest way is to use this:
https://github.com/mesosphere/marathon/blob/master/examples/haproxy-marathon-bridge
The haproxy-marathon-bridge needs to run as a cron job on every agent. Because 
it runs as a cron job every minute, your ports become accessible up to 
one minute after the task goes into the RUNNING state.
There are obviously more advanced ways of getting this done - 
haproxy-marathon-bridge is the simplest one.

Best regards,

Radek Gruchalski

ra...@gruchalski.com

de.linkedin.com/in/radgruchalski/

On Tuesday, 12 April 2016 at 10:19, nirmalendu swain wrote:
From the stderr log, nothing can be figured out. From the stdout log, it says the
server is running at host:port. But if I do telnet or curl, it does not show me any
response. Output of the stderr log:
I0412 08:16:12.842341  9909 exec.cpp:134] Version: 0.27.1
I0412 08:16:12.844701  9934 exec.cpp:208] Executor registered on slave 87849fd2-fda9-4d6a-870f-de101a5bdc59-S3
js-bson: Failed to load c++ bson extension, using pure JS version

On Tuesday, 12 April 2016 1:39 PM, Abhishek Amralkar 
 wrote:
 

Anything in the sandbox logs about why the tasks are getting killed? `stderr` and 
`stdout`?

On 12-Apr-2016, at 1:35 PM, haosdent  wrote:
>Its frequently changing the deployment status to Staged
Do you find any related log in mesos when marathon launches the task?
On Tue, Apr 12, 2016 at 3:55 PM, nirmalendu swain  
wrote:

Changing the network type to HOST does not work. It frequently changes the 
deployment status to Staged and then to no task.

On Tuesday, 12 April 2016 11:49 AM, haosdent  wrote:


How about change the network type from BRIDGE to HOST?
On Tue, Apr 12, 2016 at 2:13 PM, nirmalendu swain  
wrote:

Hi Mesos users,
I am running mesos marathon using dcos for spinning up AWS instances. I have
successfully built mongodb as a docker container, but when I try to deploy my
dockerized app, it doesn't deploy. My app is dependent upon mongo, which is
passed as an environment variable in the json file run by the dcos command. If
I do a telnet/curl, it doesn't respond at the desired host:port. From the mesos
logs, it does not seem to throw any error/exception.
Here is a copy-paste of my backend-app.json file which fails to deploy:

{
    "id": "/todo-with-backend",
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "tldr/todo-backend",
            "network": "BRIDGE",
            "portMappings": [
                {
                    "containerPort": 8080,
                    "hostPort": 0,
                    "protocol": "tcp"
                }
            ]
        }
    },
    "env": {
        "MONGO_URL": "10.0.2.252:5530"
    },
    "healthChecks": [{
        "protocol": "HTTP",
        "portIndex": 0
    }],
    "labels": {
        "HAPROXY_GROUP": "external",
        ""
    },
    "cpus": 0.25,
    "mem": 256.0
}

I have gone inside the host and checked that the env value is reflected
correctly. Please help me out in analyzing the issue.

Regards,
Nirmal





-- 
Best Regards,
Haosdent Huang





-- 
Best Regards,
Haosdent Huang




  
 

   



-- 
Best Regards,
Haosdent Huang



-- 
Best Regards,
Haosdent Huang

  

Re: Mesos-master url in HA

2016-04-13 Thread Guillermo Rodriguez
OK, got my solution: the autoscaler now uses the mesos-dns record for 
leader.mesos instead of the Route53 record.
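
In case it helps anyone else, the call now looks roughly like this
(illustrative):

curl http://leader.mesos:5050/master/slaves

leader.mesos is the record mesos-dns keeps pointed at whichever master
currently holds the lead.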
  
 Thanks!
  
  


 From: "Alexander Rojas" 
Sent: Wednesday, April 13, 2016 5:45 PM
To: user@mesos.apache.org, gu...@spritekin.com
Subject: Re: Mesos-master url in HA   
 Hi guillermo,
  
 The master has the `/redirect` endpoint which should point you to the 
current leader.
 On 13 Apr 2016, at 08:20, Guillermo Rodriguez  
wrote:
Hi,
  
 I have 3 mesos master setup for HA. One has the lead.
  
  http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
 http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
  
 I have an URL mesos-master.mydomain.com pointing to the leader and that 
works fine because it returns the slave list which I need for my 
autoscaler. But I'm afraid if the master fails the URL will no longer be 
valid. So I added the three IPs to the router (AWS Route53)  so it would 
round robin, but of course this will return an empty list sometimes because 
it hits a follower which returns empty.
  
 So my question is, is it possible to redirect the call from the followers 
to the leader master?
  
 Thanks.
  

 



Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread June Taylor
+1

As a new user this was one of the first things we changed, to direct Mesos
to use a volume with more disk space. Please print a useful error message
to the logs, though.
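
For us that boils down to a one-line change in the agent flags (the path is
simply our own choice, and the master URL here is illustrative):

mesos-slave --master=zk://zk-host:2181/mesos --work_dir=/var/mesos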


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 5:58 PM, Greg Mann  wrote:

> Hey folks!
> A number of situations have arisen in which the default value of the Mesos
> agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in
> which the automatic cleanup of '/tmp' deletes agent metadata. To resolve
> this, we would like to eliminate the default value of the agent
> `--work_dir` flag. You can find the relevant JIRA here
> .
>
> We considered simply changing the default value to a more appropriate
> location, but decided against this because the expected filesystem
> structure varies from platform to platform, and because it isn't guaranteed
> that the Mesos agent would have access to the default path on a particular
> platform.
>
> Eliminating the default `--work_dir` value means that the agent would exit
> immediately if the flag is not provided, whereas currently it launches
> successfully in this case. This will break existing infrastructure which
> relies on launching the Mesos agent without specifying the work directory.
> I believe this is an acceptable change because '/tmp/mesos' is not a
> suitable location for the agent work directory except for short-term local
> testing, and any production scenario that is currently using this location
> should be altered immediately.
>
> If you have any thoughts/opinions/concerns regarding this change, please
> let us know!
>
> Cheers,
> Greg
>


Re: Mesos interconnection among clusters project calico

2016-04-13 Thread June Taylor
Stefano,

We are exploring CoreOS, and I believe it has the feature you're looking
for.


Thanks,
June Taylor
System Administrator, Minnesota Population Center
University of Minnesota

On Tue, Apr 12, 2016 at 5:35 PM, Stefano Bianchi 
wrote:

> Hi all
>
> i have to set up two mesos clusters.
> On each cluster i should integrate Project calico in order to distribute
> tasks among the agents. But these tasks should be sent also from a slave of
> one cluster to the slave of the other cluster.
> I know that when i start calico on each slaves, it registers the hosts to
> the ETCD_AUTHORITY so calico use etcd.
> In order to have interconnection among 2 mesos clusters, i should have the
> same etcd datastore for both my Mesos/Calico clusters.
> Someone knows how to reach this condition?
>
> Thanks in advance
>
>


Re: Mesos Masters Leader Keeps Fluctuating

2016-04-13 Thread Stefano Bianchi
Thanks for your reply, @haosdent.
I destroyed my VM and rebuilt mesos 0.28 with just one master, and now it is
working.
I will try to add another master, but for the moment, since on openstack I
don't have many resources, I need to use that VM as a slave.
However, in the previous configuration the switch between the two masters was
OK; it was just that after a master had been leading for, more or less, 30 seconds,
that "Failed to connect" message appeared.

2016-04-13 13:08 GMT+02:00 haosdent :

> Hi, @Stefano Could you show conf/zoo.cfg? And how many zookeper nodes you
> haved? And "but after a while again Failed to connec"​, how long the
> interval here? Is it always "few seconds"?
>


Re: Mesos Masters Leader Keeps Fluctuating

2016-04-13 Thread haosdent
Hi, @Stefano. Could you show conf/zoo.cfg? And how many zookeeper nodes do you
have? Regarding "but after a while again Failed to connect", how long is the
interval here? Is it always a "few seconds"?


Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread Dick Davies
Oh please yes!

On 13 April 2016 at 08:00, Sam  wrote:
> +1
>
> Sent from my iPhone
>
> On Apr 13, 2016, at 12:44 PM, Avinash Sridharan 
> wrote:
>
> +1
>
> On Tue, Apr 12, 2016 at 9:31 PM, Jie Yu  wrote:
>>
>> +1
>>
>> On Tue, Apr 12, 2016 at 9:29 PM, James Peach  wrote:
>>
>> >
>> > > On Apr 12, 2016, at 3:58 PM, Greg Mann  wrote:
>> > >
>> > > Hey folks!
>> > > A number of situations have arisen in which the default value of the
>> > Mesos agent `--work_dir` flag (/tmp/mesos) has caused problems on
>> > systems
>> > in which the automatic cleanup of '/tmp' deletes agent metadata. To
>> > resolve
>> > this, we would like to eliminate the default value of the agent
>> > `--work_dir` flag. You can find the relevant JIRA here.
>> > >
>> > > We considered simply changing the default value to a more appropriate
>> > location, but decided against this because the expected filesystem
>> > structure varies from platform to platform, and because it isn't
>> > guaranteed
>> > that the Mesos agent would have access to the default path on a
>> > particular
>> > platform.
>> > >
>> > > Eliminating the default `--work_dir` value means that the agent would
>> > exit immediately if the flag is not provided, whereas currently it
>> > launches
>> > successfully in this case. This will break existing infrastructure which
>> > relies on launching the Mesos agent without specifying the work
>> > directory.
>> > I believe this is an acceptable change because '/tmp/mesos' is not a
>> > suitable location for the agent work directory except for short-term
>> > local
>> > testing, and any production scenario that is currently using this
>> > location
>> > should be altered immediately.
>> >
>> > +1 from me too. Defaulting to /tmp just helps people shoot themselves in
>> > the foot.
>> >
>> > J
>
>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245


Re: Mesos-master url in HA

2016-04-13 Thread Alexander Rojas
Hi guillermo,

The master has the `/redirect` endpoint which should point you to the current 
leader.
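
For example, something along these lines (an untested sketch, using one of the
follower IPs from your message):

curl -sI http://172.31.35.91:5050/master/redirect

should answer with a temporary redirect whose Location header points at the
current leading master.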

> On 13 Apr 2016, at 08:20, Guillermo Rodriguez  wrote:
> 
> Hi,
>  
> I have 3 mesos master setup for HA. One has the lead.
>  
> http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
> http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
> http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
>  
> I have an URL mesos-master.mydomain.com pointing to the leader and that works 
> fine because it returns the slave list which I need for my autoscaler. But 
> I'm afraid if the master fails the URL will no longer be valid. So I added 
> the three IPs to the router (AWS Route53)  so it would round robin, but of 
> course this will return an empty list sometimes because it hits a follower 
> which returns empty.
>  
> So my question is, is it possible to redirect the call from the followers to 
> the leader master?
>  
> Thanks.
>  



Re: Slaves not getting registered

2016-04-13 Thread Dick Davies
The masters are losing their zookeeper connection too, which is
forcing an election:

I0412 11:01:48.887229  3677 group.cpp:460] Lost connection to
ZooKeeper, attempting to reconnect ...

I0412 11:01:48.919545  3675 group.cpp:519] ZooKeeper session expired

I0412 11:01:48.919848  3680 detector.cpp:154] Detected a new leader: None

I0412 11:01:48.919922  3680 master.cpp:1710] The newly elected leader is None

You need to tune your zookeeper cluster, I'd guess; there's something
not right there.
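
A starting point might be to double-check the basics in conf/zoo.cfg on each
zookeeper node (the values below are only the common defaults, and the server
IPs are the ones from the zookeeper status check quoted below):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=30.30.30.52:2888:3888
server.2=30.30.30.53:2888:3888
server.3=30.30.30.54:2888:3888

and make sure each node's myid file matches its server.N line.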

On 13 April 2016 at 06:09,   wrote:
> Hi,
>
>
>
> I configured the zookeeper file in slave machine by adding the master
> details and now the salve is getting registered.
>
>
>
> But I don’t why, the three masters keep fluctuating among themselves to be
> the leader when I try accessing the master IP in the GUI.
>
>
>
> Thank you.
>
>
>
>
>
> From: haosdent [mailto:haosd...@gmail.com]
> Sent: 13 April 2016 09:25
> To: user 
> Cc: Kumari, Suruchi 
>
>
> Subject: Re: Slaves not getting registered
>
>
>
>>I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
>> group
>
>
>
> According to this, master 1 should connect to zk successfully.
>
>
>
>>root@slave1:/var/log/mesos# tail -f
>> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
>
>>I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
>
>>I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
>> allowed age: 5.848917453828577days
>
>>W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect to
>> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> How about check whether you could connect to zk on slave1 or not?
>
>
>
> On Wed, Apr 13, 2016 at 11:49 AM, 
> wrote:
>
> I checked the zookeeper status by running the command:
>
>
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
>
> Mode: follower
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
>
> Mode: leader
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
>
> Mode: follower
>
>
>
> And it seems like it’s working fine. Is there another way to check the
> health status?
>
>
>
>
>
> From: Abhishek Amralkar [mailto:abhishek.amral...@talentica.com]
> Sent: 13 April 2016 09:10
>
>
> To: user@mesos.apache.org
> Subject: Re: Slaves not getting registered
>
>
>
> Have you checked if your ZooKeeper cluster is healthy? accessible from Mesos
> Masters?
>
>
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
>
>
> It seems Mesos masters are not able to communicate to Zookeeper.
>
>
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, aishwarya.adyanth...@accenture.com wrote:
>
>
>
> Hi,
>
>
>
> I have been following the document from the digitalocean (mesos-doc-link)
> where I have set 3 masters and one slave. Below are the log details:
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
>
> I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response
> from a replica in VOTING status
>
> I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to
> RECOVERING
>
> I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 3.154399ms
>
> I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to
> RECOVERING
>
> I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from position
> 1 to 2
>
> I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to
> VOTING
>
> I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 2.540703ms
>
> I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to
> VOTING
>
> I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
> group
>
> I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated
>
>
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
>
> Log file created at: 2016/04/12 11:01:49
>
> Running on machine: master1
>
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
>
> W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials provided,
> authentication requests will be refused
>
>
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
>
> tail: cannot open
> ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ 

Re: Mesos Masters Leader Keeps Fluctuating

2016-04-13 Thread haosdent
Is the network between the 3 Mesos masters and zookeeper stable? Do you see
lost packets when you ping the zookeeper servers from every Mesos master?

And if possible, could you post the related Mesos masters logs as well?

On Wed, Apr 13, 2016 at 2:26 PM,  wrote:

> Hi,
>
>
>
> I have set the quorum value as 2 as I have configured 3 master machines in
> my environment.
>
>
>
> But I don’t know why my leader master keeps fluctuating.
>
>
>
> Thank you,
>
>
>
>



-- 
Best Regards,
Haosdent Huang


Re: Hybrid application deployments (container/VM/bare metal) in Mesos

2016-04-13 Thread Guangya Liu
If you do not want to provision VMs or PMs on demand, then mesos plus
marathon can help.

There is also a JIRA about supporting Qemu/KVM in mesos:
https://issues.apache.org/jira/browse/MESOS-2717

On Wed, Apr 13, 2016 at 2:17 PM, tommy xiao  wrote:

> mesos + marathon natively support your heterogeneous app. are you some
> concerns?
>
> 2016-04-13 13:57 GMT+08:00 Xiaoning Ding :
>
>> Hello,
>>
>> I'm wondering if someone here can help point me some document links about
>> hybrid application deployment in Mesos. The basic idea is that we have some
>> applications in mixed flavors (container, VM, bare metal) and we want to
>> run them on a single cluster.
>>
>> Let me explain by an example. Say I have an application which consists of
>> three different component services:
>>
>>1. A web front-end which has been containerized. It can be deployed
>>as a Docker image.
>>2. An application server which hasn’t been containerized. It can only
>>run on a VM or a physical machine.
>>3. An Oracle database. Since I don’t want to lose any performance, I
>>want to run it on a physical machine directly.
>>
>> Now I want to run all these three component services on a single cluster
>> and have a unified management on those heterogeneous resources. What's the
>> suggested solution from Mesos side? Any known limitations or good practices?
>>
>>
>> Thanks,
>>
>> Xiaoning
>>
>>
>>
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>


Mesos-master url in HA

2016-04-13 Thread Guillermo Rodriguez
Hi,
  
 I have 3 mesos master setup for HA. One has the lead.
  
  http://172.31.35.90:5050/master/slaves   <<< leader, returns slave list
 http://172.31.35.91:5050/master/slaves   <<< follower, returns empty list
http://172.31.35.92:5050/master/slaves   <<< follower, returns empty list
  
 I have an URL mesos-master.mydomain.com pointing to the leader and that 
works fine because it returns the slave list which I need for my 
autoscaler. But I'm afraid if the master fails the URL will no longer be 
valid. So I added the three IPs to the router (AWS Route53)  so it would 
round robin, but of course this will return an empty list sometimes because 
it hits a follower which returns empty.
  
 So my question is, is it possible to redirect the call from the followers 
to the leader master?
  
 Thanks.
  



Re: Hybrid application deployments (container/VM/bare metal) in Mesos

2016-04-13 Thread tommy xiao
mesos + marathon natively support your heterogeneous app. Do you have some
concerns?

2016-04-13 13:57 GMT+08:00 Xiaoning Ding :

> Hello,
>
> I'm wondering if someone here can help point me some document links about
> hybrid application deployment in Mesos. The basic idea is that we have some
> applications in mixed flavors (container, VM, bare metal) and we want to
> run them on a single cluster.
>
> Let me explain by an example. Say I have an application which consists of
> three different component services:
>
>1. A web front-end which has been containerized. It can be deployed as
>a Docker image.
>2. An application server which hasn’t been containerized. It can only
>run on a VM or a physical machine.
>3. An Oracle database. Since I don’t want to lose any performance, I
>want to run it on a physical machine directly.
>
> Now I want to run all these three component services on a single cluster
> and have a unified management on those heterogeneous resources. What's the
> suggested solution from Mesos side? Any known limitations or good practices?
>
>
> Thanks,
>
> Xiaoning
>
>
>



-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com


Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread tommy xiao
how about /opt/mesos/
+1

2016-04-13 12:44 GMT+08:00 Avinash Sridharan :

> +1
>
> On Tue, Apr 12, 2016 at 9:31 PM, Jie Yu  wrote:
>
>> +1
>>
>> On Tue, Apr 12, 2016 at 9:29 PM, James Peach  wrote:
>>
>> >
>> > > On Apr 12, 2016, at 3:58 PM, Greg Mann  wrote:
>> > >
>> > > Hey folks!
>> > > A number of situations have arisen in which the default value of the
>> > Mesos agent `--work_dir` flag (/tmp/mesos) has caused problems on
>> systems
>> > in which the automatic cleanup of '/tmp' deletes agent metadata. To
>> resolve
>> > this, we would like to eliminate the default value of the agent
>> > `--work_dir` flag. You can find the relevant JIRA here.
>> > >
>> > > We considered simply changing the default value to a more appropriate
>> > location, but decided against this because the expected filesystem
>> > structure varies from platform to platform, and because it isn't
>> guaranteed
>> > that the Mesos agent would have access to the default path on a
>> particular
>> > platform.
>> > >
>> > > Eliminating the default `--work_dir` value means that the agent would
>> > exit immediately if the flag is not provided, whereas currently it
>> launches
>> > successfully in this case. This will break existing infrastructure which
>> > relies on launching the Mesos agent without specifying the work
>> directory.
>> > I believe this is an acceptable change because '/tmp/mesos' is not a
>> > suitable location for the agent work directory except for short-term
>> local
>> > testing, and any production scenario that is currently using this
>> location
>> > should be altered immediately.
>> >
>> > +1 from me too. Defaulting to /tmp just helps people shoot themselves in
>> > the foot.
>> >
>> > J
>>
>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
>



-- 
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com


Re: Slaves not getting registered

2016-04-13 Thread haosdent
>the three masters keep fluctuating among themselves to be the leader.

Is the network between the 3 Mesos masters and zookeeper stable? Do you see
lost packets when you ping the zookeeper servers from every Mesos master?

On Wed, Apr 13, 2016 at 1:15 PM, Abhishek Amralkar <
abhishek.amral...@talentica.com> wrote:

> Not sure, but try to change the quorum and check.
>
>
>
> On 13-Apr-2016, at 10:39 AM, aishwarya.adyanth...@accenture.com wrote:
>
> Hi,
>
> I configured the zookeeper file in slave machine by adding the master
> details and now the salve is getting registered.
>
> But I don’t why, the three masters keep fluctuating among themselves to be
> the leader when I try accessing the master IP in the GUI.
>
> Thank you.
>
>
> *From:* haosdent [mailto:haosd...@gmail.com ]
> *Sent:* 13 April 2016 09:25
> *To:* user 
> *Cc:* Kumari, Suruchi 
> *Subject:* Re: Slaves not getting registered
>
> >I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the
> Paxos group
>
> According to this, master 1 should connect to zk successfully.
>
> >root@slave1:/var/log/mesos# tail -f
> mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
> >I0413 03:12:54.532676  1711 group.cpp:519] ZooKeeper session expired
> >I0413 03:12:58.757953  1715 slave.cpp:4304] Current disk usage 6.44%. Max
> allowed age: 5.848917453828577days
> >W0413 03:13:04.539577  1715 group.cpp:503] Timed out waiting to connect
> to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> How about check whether you could connect to zk on slave1 or not?
>
> On Wed, Apr 13, 2016 at 11:49 AM, 
> wrote:
>
> I checked the zookeeper status by running the command:
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
> Mode: follower
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
> Mode: leader
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
> Mode: follower
>
> And it seems like it’s working fine. Is there another way to check the
> health status?
>
>
> *From:* Abhishek Amralkar [mailto:abhishek.amral...@talentica.com]
> *Sent:* 13 April 2016 09:10
>
> *To:* user@mesos.apache.org
> *Subject:* Re: Slaves not getting registered
>
> Have you checked if your ZooKeeper cluster is healthy? accessible from
> Mesos Masters?
>
> W0413 03:12:24.512336  1715 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:34.519641  1710 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:44.521181  1713 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:54.532501  1711 group.cpp:503] Timed out waiting to connect to
> ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> It seems Mesos masters are not able to communicate to Zookeeper.
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, aishwarya.adyanth...@accenture.com wrote:
>
> Hi,
>
> I have been following the document from the digitalocean (mesos-doc-link
> )
> where I have set 3 masters and one slave. Below are the log details:
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
> I0412 11:01:50.579818  3736 recover.cpp:193] Received a recover response
> from a replica in VOTING status
> I0412 11:01:50.579903  3736 recover.cpp:564] Updating replica status to
> RECOVERING
> I0412 11:01:50.583102  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 3.154399ms
> I0412 11:01:50.583137  3736 replica.cpp:320] Persisted replica status to
> RECOVERING
> I0412 11:01:50.583176  3736 recover.cpp:543] Starting catch-up from
> position 1 to 2
> I0412 11:01:50.583732  3736 recover.cpp:564] Updating replica status to
> VOTING
> I0412 11:01:50.586318  3736 leveldb.cpp:304] Persisting metadata (8 bytes)
> to leveldb took 2.540703ms
> I0412 11:01:50.586484  3736 replica.cpp:320] Persisted replica status to
> VOTING
> I0412 11:01:50.586612  3732 recover.cpp:578] Successfully joined the Paxos
> group
> I0412 11:01:50.586745  3731 recover.cpp:462] Recover process terminated
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
> Log file created at: 2016/04/12 11:01:49
> Running on machine: master1
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> W0412 11:01:49.024226  3712 authenticator.cpp:511] No credentials
> provided, authentication requests will be refused
>
> root@master1:/var/log/mesos# tail -f
> mesos-master.master1.invalid-user.log.INFO.20160412-11014
> tail: cannot open
> ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No
> such file or directory
> root@master1:/var/log/mesos# tail -f
>