Re: Mesos Roles | Min or Max ?

2018-05-21 Thread Ken Sipe
Hey Trevor!

Quota in Mesos is used for both… it is a guarantee of resources for the role, and it also 
defines the max the role will be offered.
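
For reference, quota is set by POSTing a JSON request to the master's /quota endpoint; a rough sketch (role name, values, and master address are placeholders, and the exact payload shape can vary a bit between Mesos versions):

  # quota.json -- guarantee (and, per the above, effectively cap) 8 CPUs / 16 GB for a role
  cat > quota.json <<'EOF'
  {
    "role": "analytics",
    "guarantee": [
      { "name": "cpus", "type": "SCALAR", "scalar": { "value": 8 } },
      { "name": "mem",  "type": "SCALAR", "scalar": { "value": 16384 } }
    ]
  }
  EOF

  # POST it to the leading master (default port 5050)
  curl -i -X POST -d @quota.json http://<leading-master>:5050/quota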

Ken 

> On May 21, 2018, at 4:34 PM, Trevor Powell  wrote:
> 
> Hello everyone,
>  
> Reading up on http://mesos.apache.org/documentation/latest/roles/ 
>  and 
> http://mesos.apache.org/documentation/latest/quota/ 
> 
> Its not clear to me if there is a way to control how much a role is allowed 
> to use a cluster. The MAX.  I think roles and quotas are more for minimum 
> guarantees of resources??
>  
> —
> 
> Trevor Alexander Powell
> Product Owner, Release+Platform Engineering
> 7575 Gateway Blvd. Newark, CA 94560
> M: +1.650.325.7467 
>  
> https://github.com/tpowell-rms  
> https://www.linkedin.com/in/trevorapowell 
>  
> http://www.rms.com 


Re: High availability feature

2018-04-20 Thread Ken Sipe
Marathon provides its own HA through multiple instances of Marathon:  
https://mesosphere.github.io/marathon/docs/high-availability.html 
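
Operationally that just means running the same Marathon configuration on several nodes pointed at the same ZooKeeper path; a rough sketch with placeholder hosts, however Marathon is launched on your nodes (HA is on by default in recent versions):

  # Run this on each Marathon node; the instances elect a leader via the shared --zk state,
  # and non-leaders proxy API calls to the current leader.
  marathon \
    --master zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --zk     zk://zk1:2181,zk2:2181,zk3:2181/marathon \
    --hostname $(hostname -f)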


Ken 

> On Apr 20, 2018, at 9:05 AM, Mahmood Naderan  wrote:
> 
> Hi,
> I want to know if the high availability feature of mesos is related to
> high availability of Marathon? In other word, mesos supports multiple
> master nodes with an election policy and so on. Does that provide high
> availability feature for marathon? Or marathon has its own mechanism?
> 
> 
> Regards,
> Mahmood



Re: Mesos on OS X

2018-03-21 Thread Ken Sipe
I don’t have long-running experience, but I would expect it to work fine… the 
thing to be aware of is that under OS X there are no cgroup constraints… you 
may also want to review the __APPLE__ differences in the code base:  
https://github.com/apache/mesos/search?utf8=%E2%9C%93&q=__APPLE__ 
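
In practice an agent on OS X just uses the posix isolators, which track usage but don't enforce limits; a hedged sketch of a local agent launch (master address and paths are placeholders):

  # cpu/mem limits are advisory only here -- no cgroup enforcement on Darwin
  mesos-agent \
    --master=zk://zk1:2181/mesos \
    --work_dir=/var/lib/mesos \
    --isolation=posix/cpu,posix/mem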


Ken 

> On Mar 21, 2018, at 1:25 PM, Sunil Shah  wrote:
> 
> Hey all,
> 
> We're contemplating setting up a small OS X Mesos cluster for running iOS 
> tests. I know Mesos technically builds on Macs, but has anyone ever had 
> experience with a long running cluster on OS X? Is it possible? Recommended? 
> Not recommended?
> 
> Thanks,
> 
> Sunil



Re: what's the pronunciation of "MESOS"?

2016-08-09 Thread Ken Sipe
Apparently it depends on if you are British or not :)   
http://dictionary.cambridge.org/us/pronunciation/english/the-mesosphere 


Apparently the absence of “phere” changes everything:  
https://www.howtopronounce.com/mesos/ 

And for those looking for which percentile they are in:  
http://www.basilmarket.com/How-do-You-pronounce-mesos-Thread-b5eAL-1 



> On Aug 9, 2016, at 12:30 PM, Charles Allen  
> wrote:
> 
> My wife thought I was crazy sitting here mumbling "mAY-sohs" "MEH-sohs" 
> "Mee-sohs"
> 
> On Mon, Aug 8, 2016 at 9:22 PM Yu Wei  > wrote:
> Thanks Joe.
> 
> It's really interesting.
> 
> Jared, (韦煜)
> Software developer
> Interested in open source software, big data, Linux
> 
> 
> From: Joseph Jacks >
> Sent: Tuesday, August 9, 2016 10:53 AM
> To: user@mesos.apache.org 
> Subject: Re: what's the pronunciation of "MESOS"?
>  
> "MAY-zoss" is most common and correct. 
> 
> "MEH-zoss" is second most common and also correct, I think. 
> 
> "MEE-zoss" is third most common, but incorrect.
> 
> JJ.
> 
> On Aug 8, 2016, at 10:48 PM, Yu Wei  > wrote:
> 
>> 
>> Thx,
>> 
>> Jared, (韦煜)
>> Software developer
>> Interested in open source software, big data, Linux



Re: Mesos on hybrid AWS - Best practices?

2016-06-30 Thread Ken Sipe
I would suggest a cluster on AWS and a cluster on-prem, then tooling on top 
to manage between the 2.
It is unlikely that a failure of a task on-prem should have a scheduled 
replacement on AWS, or vice versa.  It is likely that you will end up creating 
constraints to statically partition the clusters anyway, IMO. 
Two clusters eliminate most of your proposed questions.
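
If you do end up with a single stretched cluster anyway, the static partitioning usually comes from agent attributes plus framework constraints; a rough sketch (attribute names and values are made up for illustration):

  # Tag each agent with its location at startup
  mesos-slave --master=zk://zk1:2181/mesos --attributes="datacenter:onprem"
  mesos-slave --master=zk://zk1:2181/mesos --attributes="datacenter:aws"

  # A Marathon app can then pin itself to one side with a constraint such as:
  #   "constraints": [["datacenter", "CLUSTER", "onprem"]]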

ken

> On Jun 30, 2016, at 10:57 AM, Florian Pfeiffer  wrote:
> 
> Hi,
> 
> the last 2 years I managed a mesos cluster with bare-metal on-premise. Now at 
> my new company, the situation is a little bit different, and I'm wondering if 
> there are some kind of best practices:
> The company is in the middle of a transition from on-premise to AWS. The old 
> stuff is still running in the DC, the newer micro services are running within 
> autoscales groups on AWS and other AWS services like DynamoDB, Kinesis and 
> Lambda are also on the rise. 
> 
> So in my naive view of the world (where no problems occur. never!) I'm 
> thinking that it would be great to span a hybrid mesos cluster over AWS to 
> leverage the still available resources in the DC which gets more and more 
> underutilized over the time. 
> 
> Now my naive world view slowly crumbles, and I realize that I'm missing the 
> experience with AWS. Questions that are already popping up (beside all those 
> Questions, where I currently don't know that I will have them...) are:
> * Is Virtual Private Gateway to my VPC enough, or do I need to aim for a 
> Direct Connect?
> * Put everything into one Account, or use a Multi-Account strategy? (Mainly 
> to prevent things running amok and drag stuff down while running into an 
> account wide shared limit?)
> * Will e.g. DynamoDb be "fast" enough if it's accessed from the Datacenter.
> 
> I'll appreciate any feedback or lessons learned about that topic :)
> 
> Thanks,
> Florian
> 



Re: Marathon scaling application

2016-05-12 Thread Ken Sipe
ocess.cpp:1958] Failed to shutdown
>> socket with fd 10: Transport endpoint is not connected
>> 
>> E0511 05:39:43.651479  1351 slave.cpp:3252] Failed to update resources
>> for container 53bb3453-31b2-4cf7-a9e1-5f700510eeb4 of executor
>> 'nginx.38f28ab0-169b-11e6-9f8a-fa163ecc33f1' running task
>> nginx.38f28ab0-169b-11e6-9f8a-fa163ecc33f1 on status update for
>> terminal task, destroying container: Failed to 'docker -H
>> unix:///var/run/docker.sock  inspect
>> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.53bb3453-31b2-4cf7-a9e1-5f700510eeb4':
>> exit status = exited with status 1 stderr = Cannot connect to the
>> Docker daemon. Is the docker daemon running on this host?
>> 
>> E0511 05:39:43.651845  1351 slave.cpp:3252] Failed to update resources
>> for container ec4e97ad-2365-4c29-9ed7-64cd9261c666 of executor
>> 'nginx.38f48682-169b-11e6-9f8a-fa163ecc33f1' running task
>> nginx.38f48682-169b-11e6-9f8a-fa163ecc33f1 on status update for
>> terminal task, destroying container: Failed to 'docker -H
>> unix:///var/run/docker.sock  inspect
>> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.ec4e97ad-2365-4c29-9ed7-64cd9261c666':
>> exit status = exited with status 1 stderr = Cannot connect to the
>> Docker daemon. Is the docker daemon running on this host?
>> 
>> E0511 05:39:43.651983  1351 slave.cpp:3252] Failed to update resources
>> for container 116be528-b81f-4e4c-b2a4-11bb10707031 of executor
>> 'nginx.413b3558-169b-11e6-9f8a-fa163ecc33f1' running task
>> nginx.413b3558-169b-11e6-9f8a-fa163ecc33f1 on status update for
>> terminal task, destroying container: Failed to 'docker -H
>> unix:///var/run/docker.sock  inspect
>> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.116be528-b81f-4e4c-b2a4-11bb10707031':
>> exit status = exited with status 1 stderr = Cannot connect to the
>> Docker daemon. Is the docker daemon running on this host?
>> 
>> E0511 05:39:43.652032  1351 slave.cpp:3252] Failed to update resources
>> for container f77a5a14-4eb0-4801-a520-6fd2b298a3e3 of executor
>> 'nginx.47bcdb9a-169b-11e6-9f8a-fa163ecc33f1' running task
>> nginx.47bcdb9a-169b-11e6-9f8a-fa163ecc33f1 on status update for
>> terminal task, destroying container: Failed to 'docker -H
>> unix:///var/run/docker.sock  inspect
>> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.f77a5a14-4eb0-4801-a520-6fd2b298a3e3':
>> exit status = exited with status 1 stderr = Cannot connect to the
>> Docker daemon. Is the docker daemon running on this host?
>> 
>> E0511 05:39:43.652118  1351 slave.cpp:3252] Failed to update resources
>> for container 537100e5-99de-4b59-903a-127dae29839e of executor
>> 'nginx.47bf4c9b-169b-11e6-9f8a-fa163ecc33f1' running task
>> nginx.47bf4c9b-169b-11e6-9f8a-fa163ecc33f1 on status update for
>> terminal task, destroying container: Failed to 'docker -H
>> unix:///var/run/docker.sock  inspect
>> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.537100e5-99de-4b59-903a-127dae29839e':
>> exit status = exited with status 1 stderr = Cannot connect to the
>> Docker daemon. Is the docker daemon running on this host?
>> 
>> E0511 05:39:43.679261  1352 process.cpp:1958] Failed to shutdown
>> socket with fd 18: Transport endpoint is not connected
>> 
>> E0511 05:39:43.780983  1352 process.cpp:1958] Failed to shutdown
>> socket with fd 14: Transport endpoint is not connected
>> 
>> *From:*Ken Sipe [mailto:kens...@gmail.com <mailto:kens...@gmail.com>]
>> *Sent:* 11 May 2016 13:50
>> *To:* user@mesos.apache.org <mailto:user@mesos.apache.org>
>> *Subject:* Re: Marathon scaling application
>> 
>> It is hard to say with the information provided.   I would check the
>> slave log the failure node.  I suspect the failure is recorded there.
>> 
>> otherwise more information is necessary:
>> 
>> 1. the marathon job (did you launch with a json file? that would be
>> helpful)
>> 
>> 2. the slave logs
>> 
>> it could also be useful to understand:
>> 
>> 1. the version of mesos and marathon
>> 
>> 2. what OS is on the nodes
>> 
>> ken
>> 
>>On May 11, 2016, at 3:10 AM, suruchi.kum...@accenture.com
>><mailto:suruchi.kum...@accenture.com 
>> <mailto:suruchi.kum...@accenture.com>> wrote:
>> 
>>I have problem scaling the applications through Marathon.
>> 
>>I have a setup of two slave nodes.The first slave node having CPU=1
>>and RAM=2GB and the Second node having CPU=4 and RAM=8GB.
>> 
>>It is able to scale maximum 5 instances on the first node but  when
>> 

Re: Marathon scaling application

2016-05-11 Thread Ken Sipe
te for terminal 
> task, destroying container: Failed to 'docker -H unix:///var/run/docker.sock 
>  inspect 
> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.116be528-b81f-4e4c-b2a4-11bb10707031':
>  exit status = exited with status 1 stderr = Cannot connect to the Docker 
> daemon. Is the docker daemon running on this host?
> E0511 05:39:43.652032  1351 slave.cpp:3252] Failed to update resources for 
> container f77a5a14-4eb0-4801-a520-6fd2b298a3e3 of executor 
> 'nginx.47bcdb9a-169b-11e6-9f8a-fa163ecc33f1' running task 
> nginx.47bcdb9a-169b-11e6-9f8a-fa163ecc33f1 on status update for terminal 
> task, destroying container: Failed to 'docker -H unix:///var/run/docker.sock 
>  inspect 
> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.f77a5a14-4eb0-4801-a520-6fd2b298a3e3':
>  exit status = exited with status 1 stderr = Cannot connect to the Docker 
> daemon. Is the docker daemon running on this host?
> E0511 05:39:43.652118  1351 slave.cpp:3252] Failed to update resources for 
> container 537100e5-99de-4b59-903a-127dae29839e of executor 
> 'nginx.47bf4c9b-169b-11e6-9f8a-fa163ecc33f1' running task 
> nginx.47bf4c9b-169b-11e6-9f8a-fa163ecc33f1 on status update for terminal 
> task, destroying container: Failed to 'docker -H unix:///var/run/docker.sock 
>  inspect 
> mesos-f986e4ba-91ba-4624-b685-4c004407c6db-S1.537100e5-99de-4b59-903a-127dae29839e':
>  exit status = exited with status 1 stderr = Cannot connect to the Docker 
> daemon. Is the docker daemon running on this host?
> E0511 05:39:43.679261  1352 process.cpp:1958] Failed to shutdown socket with 
> fd 18: Transport endpoint is not connected
> E0511 05:39:43.780983  1352 process.cpp:1958] Failed to shutdown socket with 
> fd 14: Transport endpoint is not connected
>  
>  
>  
>  
> From: Ken Sipe [mailto:kens...@gmail.com <mailto:kens...@gmail.com>] 
> Sent: 11 May 2016 13:50
> To: user@mesos.apache.org <mailto:user@mesos.apache.org>
> Subject: Re: Marathon scaling application
>  
> It is hard to say with the information provided.   I would check the slave 
> log the failure node.  I suspect the failure is recorded there.
>  
> otherwise more information is necessary:
> 1. the marathon job (did you launch with a json file? that would be helpful)
> 2. the slave logs
>  
> it could also be useful to understand:
> 1. the version of mesos and marathon
> 2. what OS is on the nodes
>  
> ken
>  
> On May 11, 2016, at 3:10 AM, suruchi.kum...@accenture.com 
> <mailto:suruchi.kum...@accenture.com> wrote:
>  
> I have problem scaling the applications through Marathon.
>  
> I have a setup of two slave nodes.The first slave node having CPU=1 and 
> RAM=2GB and the Second node having CPU=4 and RAM=8GB.
>  
> It is able to scale maximum 5 instances on the first node but  when I tried 
> scaling it further the host gets changed to the second slave node.And the 
> task fails to start and error in the debug section of the Marathon UI shows 
> "Abnormal executor termination".
>  
> I would like to know why is it not getting scheduled on the other slave 
> node???
>  
> Can you please help me with this issue.
>  
> Thanks
>  
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy. 
> __
> 
> www.accenture.com <http://www.accenture.com/>


Re: Marathon scaling application

2016-05-11 Thread Ken Sipe
It is hard to say with the information provided.   I would check the slave log 
on the failure node.  I suspect the failure is recorded there.

otherwise more information is necessary:
1. the marathon job (did you launch with a json file? that would be helpful)
2. the slave logs

it could also be useful to understand:
1. the version of mesos and marathon
2. what OS is on the nodes
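
For the first item, a minimal Marathon app definition of the kind that is useful to share looks roughly like this (placeholder values, assuming the Docker containerizer):

  cat > nginx.json <<'EOF'
  {
    "id": "/nginx",
    "cpus": 0.1,
    "mem": 128,
    "instances": 5,
    "container": {
      "type": "DOCKER",
      "docker": { "image": "nginx", "network": "BRIDGE" }
    }
  }
  EOF

  # Submit it to Marathon (placeholder host/port)
  curl -X POST -H "Content-Type: application/json" \
    http://<marathon-host>:8080/v2/apps -d @nginx.json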

ken

> On May 11, 2016, at 3:10 AM, suruchi.kum...@accenture.com wrote:
> 
> I have problem scaling the applications through Marathon.
>  
> I have a setup of two slave nodes.The first slave node having CPU=1 and 
> RAM=2GB and the Second node having CPU=4 and RAM=8GB.
>  
> It is able to scale maximum 5 instances on the first node but  when I tried 
> scaling it further the host gets changed to the second slave node.And the 
> task fails to start and error in the debug section of the Marathon UI shows 
> "Abnormal executor termination".
>  
> I would like to know why is it not getting scheduled on the other slave 
> node???
>  
> Can you please help me with this issue.
>  
> Thanks
> 
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy. 
> __
> 
> www.accenture.com 


Re: Enable s3a for fetcher

2016-05-11 Thread Ken Sipe
Jamie,

I’m in Europe this week… so the timing of my responses is out of sync / 
delayed.   There are 2 issues to work with here.  The first is having a 
pluggable Mesos fetcher… it sounds like that is scheduled for 0.30.   The other is 
what is available on DC/OS.  Could you move that discussion to the DC/OS mailing 
list?  I will definitely work with you on getting this resolved.

ken
> On May 10, 2016, at 3:45 PM, Briant, James  
> wrote:
> 
> Ok. Thanks Joseph. I will figure out how to get a more recent hadoop onto my 
> dcos agents then.
> 
> Jamie
> 
> From: Joseph Wu >
> Reply-To: "user@mesos.apache.org " 
> >
> Date: Tuesday, May 10, 2016 at 1:40 PM
> To: user >
> Subject: Re: Enable s3a for fetcher
> 
> I can't speak to what DCOS does or will do (you can ask on the associated 
> mailing list: us...@dcos.io ).
> 
> We will be maintaining existing functionality for the fetcher, which means 
> supporting the schemes:
> * file
> * http, https, ftp, ftps
> * hdfs, hftp, s3, s3n  <--  These rely on hadoop.
> 
> And we will retain the --hadoop_home agent flag, which you can use to specify 
> the hadoop binary.
> 
> Other schemes might work right now, if you hack around with your node setup.  
> But there's no guarantee that your hack will work between Mesos versions.  In 
> future, we will associate a fetcher plugin for each scheme.  And you will be 
> able to load custom fetcher plugins for additional schemes.
> TLDR: no "nerfing" and less hackiness :)
> 
> On Tue, May 10, 2016 at 12:58 PM, Briant, James 
> > wrote:
>> This is the mesos latest documentation:
>> 
>> If the requested URI is based on some other protocol, then the fetcher tries 
>> to utilise a local Hadoop client and hence supports any protocol supported 
>> by the Hadoop client, e.g., HDFS, S3. See the slave configuration 
>> documentation  
>> for how to configure the slave with a path to the Hadoop client. [emphasis 
>> added]
>> 
>> What you are saying is that dcos simply wont install hadoop on agents?
>> 
>> Next question then: will you be nerfing fetcher.cpp, or will I be able to 
>> install hadoop on the agents myself, such that mesos will recognize s3a?
>> 
>> 
>> From: Joseph Wu >
>> Reply-To: "user@mesos.apache.org " 
>> >
>> Date: Tuesday, May 10, 2016 at 12:20 PM
>> To: user >
>> 
>> Subject: Re: Enable s3a for fetcher
>> 
>> Mesos does not explicitly support HDFS and S3.  Rather, Mesos will assume 
>> you have a hadoop binary and use it (blindly) for certain types of URIs.  If 
>> the hadoop binary is not present, the mesos-fetcher will fail to fetch your 
>> HDFS or S3 URIs.
>> 
>> Mesos does not ship/package hadoop, so these URIs are not expected to work 
>> out of the box (for plain Mesos distributions).  In all cases, the operator 
>> must preconfigure hadoop on each node (similar to how Docker in Mesos works).
>> 
>> Here's the epic tracking the modularization of the mesos-fetcher (I estimate 
>> it'll be done by 0.30):
>> https://issues.apache.org/jira/browse/MESOS-3918 
>> 
>> 
>> ^ Once done, it should be easier to plug in more fetchers, such as one for 
>> your use-case.
>> 
>> On Tue, May 10, 2016 at 11:21 AM, Briant, James 
>> > wrote:
>>> I’m happy to have default IAM role on the box that can read-only fetch from 
>>> my s3 bucket. s3a gets the credentials from AWS instance metadata. It works.
>>> 
>>> If hadoop is gone, does that mean that hfds: URIs don’t work either?
>>> 
>>> Are you saying dcos and mesos are diverging? Mesos explicitly supports hdfs 
>>> and s3.
>>> 
>>> In the absence of S3, how do you propose I make large binaries available to 
>>> my cluster, and only to my cluster, on AWS?
>>> 
>>> Jamie
>>> 
>>> From: Cody Maloney >
>>> Reply-To: "user@mesos.apache.org " 
>>> >
>>> Date: Tuesday, May 10, 2016 at 10:58 AM
>>> To: "user@mesos.apache.org " 
>>> >
>>> Subject: Re: Enable s3a for fetcher
>>> 
>>> The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary 
>>> has been entirely removed from DC/OS 1.8 already. There have been various 
>>> proposals to make it so the mesos fetcher is much more pluggable / 
>>> 

Re: Enable s3a for fetcher

2016-05-11 Thread Ken Sipe
To Joseph’s point… the hdfs and s3 challenges are DC/OS issues, not Mesos issues. 
We do however need Mesos to support custom protocols for the fetcher.   At our 
current pace of releases that sounds not too far away.
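
Until that lands, the working pattern is the documented one: point each agent at a Hadoop client and use a Hadoop-understood scheme in the artifact URI. A rough sketch (paths and bucket names are placeholders):

  # Tell the agent where the Hadoop client lives so mesos-fetcher can shell out to it
  mesos-slave --master=zk://zk1:2181/mesos --hadoop_home=/opt/hadoop

  # A Marathon app (for example) can then reference an artifact such as:
  #   "uris": ["s3n://my-bucket/my-artifact.tar.gz"]
  # provided the Hadoop client on the agent has S3 credentials configured.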

ken
> On May 10, 2016, at 2:20 PM, Joseph Wu  wrote:
> 
> Mesos does not explicitly support HDFS and S3.  Rather, Mesos will assume you 
> have a hadoop binary and use it (blindly) for certain types of URIs.  If the 
> hadoop binary is not present, the mesos-fetcher will fail to fetch your HDFS 
> or S3 URIs.
> 
> Mesos does not ship/package hadoop, so these URIs are not expected to work 
> out of the box (for plain Mesos distributions).  In all cases, the operator 
> must preconfigure hadoop on each node (similar to how Docker in Mesos works).
> 
> Here's the epic tracking the modularization of the mesos-fetcher (I estimate 
> it'll be done by 0.30):
> https://issues.apache.org/jira/browse/MESOS-3918 
> 
> 
> ^ Once done, it should be easier to plug in more fetchers, such as one for 
> your use-case.
> 
> On Tue, May 10, 2016 at 11:21 AM, Briant, James 
> > wrote:
> I’m happy to have default IAM role on the box that can read-only fetch from 
> my s3 bucket. s3a gets the credentials from AWS instance metadata. It works.
> 
> If hadoop is gone, does that mean that hfds: URIs don’t work either?
> 
> Are you saying dcos and mesos are diverging? Mesos explicitly supports hdfs 
> and s3.
> 
> In the absence of S3, how do you propose I make large binaries available to 
> my cluster, and only to my cluster, on AWS?
> 
> Jamie
> 
> From: Cody Maloney >
> Reply-To: "user@mesos.apache.org " 
> >
> Date: Tuesday, May 10, 2016 at 10:58 AM
> To: "user@mesos.apache.org " 
> >
> Subject: Re: Enable s3a for fetcher
> 
> The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary 
> has been entirely removed from DC/OS 1.8 already. There have been various 
> proposals to make it so the mesos fetcher is much more pluggable / extensible 
> (https://issues.apache.org/jira/browse/MESOS-2731 
>  for instance). 
> 
> Generally speaking people want a lot of different sorts of fetching, and 
> there are all sorts of questions of how to properly get auth to the various 
> chunks (if you're using s3a:// presumably you need to get credentials there 
> somehow. Otherwise you could just use http://). Need to design / build that 
> into Mesos and DC/OS to be able to use this stuff.
> 
> Cody
> 
> On Tue, May 10, 2016 at 9:55 AM Briant, James  > wrote:
> I want to use s3a: urls in fetcher. I’m using dcos 1.7 which has hadoop 2.5 
> on its agents. This version has the necessary hadoop-aws and aws-sdk:
> 
> hadoop--afadb46fe64d0ee7ce23dbe769e44bfb0767a8b9]$ ls 
> usr/share/hadoop/tools/lib/ | grep aws
> aws-java-sdk-1.7.4.jar
> hadoop-aws-2.5.0-cdh5.3.3.jar
> 
> What config/scripts do I need to hack to get these guys on the classpath so 
> that "hadoop fs -copyToLocal” works?
> 
> Thanks,
> Jamie
> 



Re: Enable s3a for fetcher

2016-05-11 Thread Ken Sipe
Jamie,

The general philosophy is that services should depend very little on the base 
image (some would say not at all).   There has been an HDFS client on the base 
image which we have leveraged while we work on higher priorities; it was 
always our intent to remove it.  Another example (and another enabler of this 
working) is the Java JRE on the base image.  It would be a bad idea to get 
addicted to it :)   

That said, it has always been our intention to support different protocols 
(such as retrieving artifacts from HDFS, which other services such as Chronos 
could leverage).  It makes sense that we support s3 retrieval as well.   It 
does mean that we need a pluggable way to hook in solutions for protocols other 
than http.   We have had some discussion around it and have a design idea in 
place.   At this point it is an issue of priority and timing.
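
On the classpath question specifically, the usual trick is to expose the optional tools jars through HADOOP_CLASSPATH; a hedged sketch (the install path is hypothetical, and depending on the Hadoop version you may also need fs.s3a.* settings in core-site.xml):

  export HADOOP_HOME=/path/to/hadoop     # wherever the agent's hadoop lives
  export HADOOP_CLASSPATH="$HADOOP_HOME/share/hadoop/tools/lib/*:$HADOOP_CLASSPATH"

  # sanity check that the s3a scheme is now resolvable
  hadoop fs -copyToLocal s3a://my-bucket/my-artifact.tar.gz /tmp/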

ken
> On May 10, 2016, at 1:21 PM, Briant, James  
> wrote:
> 
> I’m happy to have default IAM role on the box that can read-only fetch from 
> my s3 bucket. s3a gets the credentials from AWS instance metadata. It works.
> 
> If hadoop is gone, does that mean that hfds: URIs don’t work either?
> 
> Are you saying dcos and mesos are diverging? Mesos explicitly supports hdfs 
> and s3.
> 
> In the absence of S3, how do you propose I make large binaries available to 
> my cluster, and only to my cluster, on AWS?
> 
> Jamie
> 
> From: Cody Maloney >
> Reply-To: "user@mesos.apache.org " 
> >
> Date: Tuesday, May 10, 2016 at 10:58 AM
> To: "user@mesos.apache.org " 
> >
> Subject: Re: Enable s3a for fetcher
> 
> The s3 fetcher stuff inside of DC/OS is not supported. The `hadoop` binary 
> has been entirely removed from DC/OS 1.8 already. There have been various 
> proposals to make it so the mesos fetcher is much more pluggable / extensible 
> (https://issues.apache.org/jira/browse/MESOS-2731 
>  for instance). 
> 
> Generally speaking people want a lot of different sorts of fetching, and 
> there are all sorts of questions of how to properly get auth to the various 
> chunks (if you're using s3a:// presumably you need to get credentials there 
> somehow. Otherwise you could just use http://). Need to design / build that 
> into Mesos and DC/OS to be able to use this stuff.
> 
> Cody
> 
> On Tue, May 10, 2016 at 9:55 AM Briant, James  > wrote:
>> I want to use s3a: urls in fetcher. I’m using dcos 1.7 which has hadoop 2.5 
>> on its agents. This version has the necessary hadoop-aws and aws-sdk:
>> 
>> hadoop--afadb46fe64d0ee7ce23dbe769e44bfb0767a8b9]$ ls 
>> usr/share/hadoop/tools/lib/ | grep aws
>> aws-java-sdk-1.7.4.jar
>> hadoop-aws-2.5.0-cdh5.3.3.jar
>> 
>> What config/scripts do I need to hack to get these guys on the classpath so 
>> that "hadoop fs -copyToLocal” works?
>> 
>> Thanks,
>> Jamie



Re: Framework taking default resources even though a role is specified

2016-04-15 Thread Ken Sipe
The framework with role “production” will receive production resources and * 
resources
All other frameworks (assuming no role) will only receive * resources
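
If the intent is for those tasks to land only on the production nodes, the framework has to decline (or simply not use) the * offers it receives; on the agent side, resources are placed into the production role with a static reservation along these lines (made-up sizes):

  # On each "production" agent, reserve all of its capacity for the production role
  mesos-slave \
    --master=zk://zk1:2181/mesos \
    --resources="cpus(production):8;mem(production):32768;disk(production):102400;ports(production):[31000-32000]"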

ken

> On Apr 15, 2016, at 11:38 AM, June Taylor  wrote:
> 
> We have a small cluster with 3 nodes in the * resource role default, and 3 
> nodes in a "production" resource role.
> 
> Starting up a framework which requests "production" properly executes on the 
> expected nodes, however, today we noticed that this job also started up 
> executors under the * resource role as well.
> 
> We expect these tasks to only go on nodes with the "production" resource 
> role. Can you advise further?
> 
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota



Re: [Proposal] Remove the default value for agent work_dir

2016-04-13 Thread Ken Sipe
+1
> On Apr 12, 2016, at 5:58 PM, Greg Mann  wrote:
> 
> Hey folks!
> A number of situations have arisen in which the default value of the Mesos 
> agent `--work_dir` flag (/tmp/mesos) has caused problems on systems in which 
> the automatic cleanup of '/tmp' deletes agent metadata. To resolve this, we 
> would like to eliminate the default value of the agent `--work_dir` flag. You 
> can find the relevant JIRA here 
> .
> 
> We considered simply changing the default value to a more appropriate 
> location, but decided against this because the expected filesystem structure 
> varies from platform to platform, and because it isn't guaranteed that the 
> Mesos agent would have access to the default path on a particular platform.
> 
> Eliminating the default `--work_dir` value means that the agent would exit 
> immediately if the flag is not provided, whereas currently it launches 
> successfully in this case. This will break existing infrastructure which 
> relies on launching the Mesos agent without specifying the work directory. I 
> believe this is an acceptable change because '/tmp/mesos' is not a suitable 
> location for the agent work directory except for short-term local testing, 
> and any production scenario that is currently using this location should be 
> altered immediately.
> 
> If you have any thoughts/opinions/concerns regarding this change, please let 
> us know!
> 
> Cheers,
> Greg



Re: Why are FrameworkToExecutorMessage and ExecutorToFrameworkMessage transmitted along different paths?

2016-01-05 Thread Ken Sipe
Regarding reliability:  obviously TCP provides a point-to-point guarantee...  
however there is no application-level (ISO/OSI model) guarantee that the message 
was received or processed.   A loss of the slave or executor at the wrong time 
would result in no processing of the message, without the sender's awareness.   
There is a delivery guarantee for status updates but not for framework messages.   
It is the application-level equivalent of UDP.

ken
> On Jan 5, 2016, at 9:34 AM, sujz  wrote:
> 
> Hi, all: 
>  
> I am using mesos-0.22.0, I noticed that FrameworkToExecutorMessage is sent 
> along path: 
> Scheduler->Master->Slave->Executor, 
> while ExecutorToFrameworkMessage is sent along path: 
> Executor->Slave->Scheduler, 
>  
> So is there some reason or benefit for bypassing master while transmitting 
> ExecutorToFrameworkMessage? 
>  
> One more question, FrameworkToExecutorMessage and ExecutorToFrameworkMessage 
> are instantiated in function SendFrameworkMessage, declaration of 
> SendFrameworkMessage in include/mesos/scheduler.hpp and 
> include/mesos/executor.hpp: 
>   // Sends a message from the framework to one of its executors. These 
>   // messages are best effort; do not expect a framework message to be 
>   // retransmitted in any reliable fashion. 
>   virtual Status sendFrameworkMessage( 
>   const ExecutorID& executorId, 
>   const SlaveID& slaveId, 
>   const std::string& data) = 0; 
>  
> I guess that protobuf message are transmitted with TCP, so does this comment 
> mean I have to guarantee reliability by myself even with TCP? What's special 
> for these  
> two messages compared with other protobuf messages, If no, do we have to 
> guarantee reliability all by ourselves?
>  
> Thank you very much and best regards ! 
> 
> 
>  



Re: Java detector for mesos masters and leader

2015-07-08 Thread Ken Sipe
awesome sharing of code!

I’ll add that if you are using Mesos-DNS, the DNS name master.mesos will 
resolve to the masters and leader.mesos will resolve to the leader.
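
A quick way to check those records, assuming the node's resolver already points at Mesos-DNS:

  dig +short master.mesos    # all registered masters
  dig +short leader.mesos    # just the current leading master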

If you are looking to resolve the Marathon leader, you would have to use the code 
below against ZooKeeper at the moment.

- ken

 On Jul 8, 2015, at 9:42 AM, Nikolaos Ballas neXus 
 nikolaos.bal...@nexusgroup.com wrote:
 
 Don…the bellow code will return leader node for mesos  marathon framework.
 
 import …
 public class SomeClass {
     CuratorFramework client;

     public void init() {
         client = CuratorFrameworkFactory.newClient(connectionString,
                 new ExponentialBackoffRetry(1000, 3));
     }

     public String getMasterNodeIP() {
         try {
             if (client != null) {
                 client.start();
                 LeaderSelectorListener listener = new LeaderSelectorListenerAdapter() {
                     public void takeLeadership(CuratorFramework client) throws Exception {
                     }
                 };

                 LeaderSelector selector = new LeaderSelector(client, path, listener);
                 selector.autoRequeue();
                 selector.start();

                 Participant participant = selector.getLeader();
                 String id = participant.getId().substring(
                         participant.getId().indexOf("@") + 1,
                         participant.getId().indexOf("*"));
                 masterNode.add(id);
             }
         } catch (Exception e) {
             logger.error("Failed find out master node", e.getCause());
         }
     }
 }
 Nikolaos Ballas  |  Software Development Manager 
 
 Technology Nexus S.a.r.l.
 2-4 Rue Eugene Rupert
 2453 Luxembourg
 Delivery address: 2-3 Rue Eugene Rupert,Vertigo Polaris Building
 Tel: + 3522619113580
 cont...@nexusgroup.com mailto:contact...@nexusgroup.com | nexusgroup.com 
 http://www.nexusgroup.com/ 
 LinkedIn.com http://www.linkedin.com/company/nexus-technology | Twitter 
 http://www.twitter.com/technologynexus | Facebook.com 
 https://www.facebook.com/pages/Technology-Nexus/133756470003189
 
 
 
 On 08 Jul 2015, at 16:27, Donald Laidlaw donlaid...@me.com 
 mailto:donlaid...@me.com wrote:
 
 @Nikolaos Ballas neXus
 I can see no way to instantiate the Curator LeaderSelector without actually 
 becoming a participant in leader election. If I do instantiate that class, 
 it does not accept a null value for the LeaderSelectorListener and so 
 anything instantiating LeaderSelector must also become a participant.
 
 Even then, that class provides no way to listen for leadership change. The 
 only listening it does is to discover when it itself becomes the leader. I 
 suppose it would be possible to participate in the leadership election, but 
 immediately relinquish leadership causing a real mesos master to become the 
 leader, but that seems a little too invasive to do.
 
 The only solution I can see is to monitor the children of the mesos leader 
 node, and parse through the contents of the ones whose name begins with 
 “info” as per @Marco Massenzio.
 
 Best regards,
 -Don
 
 On Jul 7, 2015, at 12:16 PM, Donald Laidlaw donlaid...@me.com 
 mailto:donlaid...@me.com wrote:
 
 Thank you all.
 
 I will use the Curator recipe, since I already use Curator for a bunch of 
 other things. 
 
 If curator can find the leader and the participants that is good enough. 
 Otherwise I will parse the protocol buffer contents, and provide a way to 
 parse the future son contents when that happens.
 
 I’ll reply again with the results of using the Curator recipe to get the 
 leader and participants.
 
 Best regards,
 -Don
 
 On Jul 7, 2015, at 11:04 AM, Dick Davies d...@hellooperator.net 
 mailto:d...@hellooperator.net wrote:
 
 The active master has a flag set in  /metrics/snapshot  :
 master/elected which is 1 for the active
 master and 0 otherwise, so it's easy enough to only load the metrics
 from the active master.
 
 (I use the collectd plugin and push data rather than poll, but the
 same principle should apply).
 
 On 7 July 2015 at 14:02, Donald Laidlaw donlaid...@me.com 
 mailto:donlaid...@me.com wrote:
 Has anyone ever developed Java code to detect the mesos masters and 
 leader, given a zookeeper connection?
 
 The reason I ask is because I would like to monitor mesos to report 
 various metrics reported by the master. This requires detecting and 
 tracking the leading master to query its /metrics/snapshot REST endpoint.
 
 Thanks,
 -Don
 
 
 



Re: Broken link report

2015-06-17 Thread Ken Sipe
thanks!   https://github.com/apache/mesos/pull/46

 On Jun 17, 2015, at 5:22 AM, Brian Candler b.cand...@pobox.com wrote:
 
 At http://mesos.apache.org/documentation/latest/mesos-frameworks/
 the Torque link points to
 http://mesos.apache.org/documentation/latest/running-torque-or-mpi-on-mesos/
 but that gives a Not Found error.
 
 The nearest I could find through Google was
 https://svn.apache.org/repos/asf/mesos/trunk/docs/Running-torque-or-mpi-on-mesos.md
 
 Regards,
 
 Brian.
 



Re: mesos-dns

2015-06-04 Thread Ken Sipe
just put a blog post up on running mesos-dns with docker

https://mesosphere.com/blog/2015/06/02/get-mesos-dns-up-and-running-in-under-5-minutes-using-docker/

and upstart is fine.
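
For the upstart route, the job is tiny; a rough sketch (binary and config paths are placeholders for wherever mesos-dns is installed):

  cat > /etc/init/mesos-dns.conf <<'EOF'
  description "Mesos-DNS"
  start on runlevel [2345]
  respawn
  exec /usr/local/bin/mesos-dns -config=/etc/mesos-dns/config.json
  EOF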

ken
 On Jun 4, 2015, at 4:41 PM, Svante Karlsson svante.karls...@csi.se wrote:
 
 I've just started playing with mesos so excuse me if this is a stupid 
 question.
 
 I'm configuring mesos-dns for my very small cluster (initial size) 3 + 3 
 nodes.
 
 I'm planning  to use 3 zookeeper nodes combined with mesos masters, chronos, 
 and marathon and mesos-dns.
 
 this will be using 3+ nodes as mesos slaves combined with native kafka (not 
 using mesos) using the same zookeepers.
 
 I want to use the slave nodes both for kafka and for the service on to on 
 kafka that performs the actual work. (a massive MQTT broker) 
 
 I looked at the mesos-dns ansible script and realized that is is intended to 
 run as maraton jobs.
 
 However I'm inclined to run them on each master node using upstart since I 
 want to use haproxy to find a first level proxy that should live in the 
 mesos cluster (I think I want fixed addresses to dns for that to work).
 
  Is there any drawback of that approach (instead of marathon)?
 
 best regards
 svante
 
 
 
 



Re: Mesos metrics in Datadog

2015-05-12 Thread Ken Sipe
Mesosphere makes significant use of Datadog to monitor our Mesos clusters. 
I'll see what details I can throw together as a quick response, and I'll follow 
that up with a blog post with details (probably by next week). 
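
Most of the interesting numbers come from the leading master's metrics endpoint, which the Datadog agent (or any custom check) can poll; a quick look by hand (placeholder host):

  # master/elected tells you whether this node is the current leader;
  # master/tasks_* and master/slaves_* are good starting graphs
  curl -s http://<mesos-master>:5050/metrics/snapshot | python -m json.tool | head -40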

Ken

 On May 12, 2015, at 3:21 AM, Antonio Cardenes 
 antoniocarde...@notonthehighstreet.com wrote:
 
 Hello, I was wondering if anyone has used Datadog to successfully monitor 
 Mesos clusters and if so, what kind of metrics/graphs have they find useful 
 for this task.
 
 Thanks,
 Antonio
 -- 
 Antonio Cardenes | Senior DevOps | notonthehighstreet.com
 
 
 
 
 phone  02086147100
 email   antoniocarde...@notonthehighstreet.com
 website www.notonthehighstreet.com
 address NOTHS House, 63 Kew Road , Richmond, Surrey, TW9 2NQ
 
 This email and any files transmitted with it, including replies and forwarded 
 copies subsequently transmitted from Notonthehighstreet Enterprises Limited 
 is confidential and solely for the use of the intended recipient. Any 
 opinions expressed in this email are those of the individual and not 
 necessarily of Notonthehighstreet Enterprises Limited. If you are not the  
 intended recipient, be advised that you have received this email in error and 
 that any use is strictly prohibited.
 
 The Site is operated by Notonthehighstreet Enterprises Limited (we). We are 
 registered in England and Wales under company number 5591382 and with our 
 registered office address at NOTHS House, 63 Kew Road , Richmond, Surrey, TW9 
 2NQ. Our VAT number is 872468392.


Re: Zookeeper integration for Mesos-DNS

2015-03-23 Thread Ken Sipe
Aaron,

It depends on what you mean; however, Mesos-DNS works outside the cluster IMO. 
It is a bridge for things in the cluster (services launched by Mesos)... but at 
that point it is DNS.  Any client, in or out of the cluster, that can query DNS 
can leverage the service. 

Sent from my iPhone

 On Mar 23, 2015, at 4:25 AM, Aaron Carey aca...@ilm.com wrote:
 
 Hey,
 
 I don't suppose there is anything like Mesos-DNS but for services/users 
 outside the mesos cluster? So having a service which updates a DNS provider 
 with task port/ips running inside the cluster so that external users are able 
 to find those services? Am I correct in thinking Mesos-DNS only works inside 
 the cluster?
 
 Currently we're using consul for this, but I'd be interested if there was 
 some sort of magical plug and play solution?
 
 Thanks,
 Aaron
 
 From: Christos Kozyrakis [kozyr...@gmail.com]
 Sent: 21 March 2015 00:18
 To: user@mesos.apache.org
 Subject: Zookeeper integration for Mesos-DNS
 
 Hi everybody, 
 
 we have updated Mesos-DNS to integrate directly with Zookeeper. Instead of 
 providing Mesos-DNS with a list of masters, you point it to the Zookeeper 
 instances. Meson-DNS will watch Zookeeper to detect the current leading 
 master. So, while the list of Zookeeper instances is configured in a static 
 manner, Mesos masters can be added or removed freely without restarting 
 Mesos-DNS. 
 
 The integration with Zookeeper forced to switch from -v and -vv as the flags 
 to control verbosity to -v=0 (default), -v=1 (verbose), and -v=2 (very 
 verbose). 
 
 To reduce complications because of dependencies to other packages, we have 
 also started using godep. 
 
 Please take a look at the branch 
 https://github.com/mesosphere/mesos-dns/tree/zk
 and provide us with any feedback on the code or the documentation. 
 
 Thanks
 
 -- 
 Christos


Re: Zookeeper integration for Mesos-DNS

2015-03-23 Thread Ken Sipe
roger that
 On Mar 23, 2015, at 9:22 AM, Aaron Carey aca...@ilm.com wrote:
 
 Thanks Ken,
 
 So basically we just need to add mesos-dns to our /etc/resolv.conf on every 
 machine and hey presto auto-service discovery (using DNS)? (Here I mean 
 service discovery to be: hey where is rabbitmq? DNS says: 172.20.121.292:8393 
 or whatever)
 
 Aaron
 
 From: Ken Sipe [kens...@gmail.com]
 Sent: 23 March 2015 14:29
 To: user@mesos.apache.org
 Subject: Re: Zookeeper integration for Mesos-DNS
 
 Aaron,
 
 Mesos-DNS is a DNS name server + a monitor of mesos-masters.  It listens to 
 the mesos-master.  If a service is launched by mesos then mesos-dns conjures 
 a service name (app_id + framework_id +.mesos) and associates it to the IP 
 and PORT of the service.  Since Mesos-DNS is a name service, it needs to be 
 in your list of name services for service discovery.  From a service 
 discovery stand point there is no need to be in the cluster and there is no 
 need to have a dependency on Mesos.   
 
 Mesos-DNS is not a proxy.  It doesn’t provide any special services to clients 
 or services inside the cluster.   more detail below.
  
 On Mar 23, 2015, at 7:52 AM, Aaron Carey aca...@ilm.com 
 mailto:aca...@ilm.com wrote:
 
 As I understood it, it provides a service for containers within the cluster 
 to automatically find each other as it handles their dns calls?
 
 The way this is stated this doesn’t seem true.Mesos-DNS is a DNS name 
 server.From a service discovery stand point, It doesn’t do anything 
 different than a standard DNS naming server.
 
 
 However clients outside the cluster will not use the mesos-dns service by 
 default, so won't have knowledge of anything running inside the cluster?
 
 This is all dependent on how /etc/resolv.conf is setup.  If mesos-dns is in 
 the list… then this is not true.
 
 
 Is there an easy way to set this up to (for example) add records to AWS 
 Route 53 when services get started in the cluster, so other clients can see 
 them?
 
 This is outside of Mesos-DNS
 
 Good Luck!!
 
 Thanks!
 Aaron
 
 From: Ken Sipe [kens...@gmail.com mailto:kens...@gmail.com]
 Sent: 23 March 2015 13:31
 To: user@mesos.apache.org mailto:user@mesos.apache.org
 Subject: Re: Zookeeper integration for Mesos-DNS
 
 Aaron,
 
 It depends on what you mean however, Mesos-DNS works outside the cluster 
 IMO. It is a bridge for things in the cluster (services launched by 
 mesos)... But at that point it is DNS.  Any client in or out of the cluster 
 that can query DNS that leverage the service. 
 
 Sent from my iPhone
 
 On Mar 23, 2015, at 4:25 AM, Aaron Carey aca...@ilm.com 
 mailto:aca...@ilm.com wrote:
 
 Hey,
 
 I don't suppose there is anything like Mesos-DNS but for services/users 
 outside the mesos cluster? So having a service which updates a DNS provider 
 with task port/ips running inside the cluster so that external users are 
 able to find those services? Am I correct in thinking Mesos-DNS only works 
 inside the cluster?
 
 Currently we're using consul for this, but I'd be interested if there was 
 some sort of magical plug and play solution?
 
 Thanks,
 Aaron
 
 From: Christos Kozyrakis [kozyr...@gmail.com mailto:kozyr...@gmail.com]
 Sent: 21 March 2015 00:18
 To: user@mesos.apache.org mailto:user@mesos.apache.org
 Subject: Zookeeper integration for Mesos-DNS
 
 Hi everybody, 
 
 we have updated Mesos-DNS to integrate directly with Zookeeper. Instead of 
 providing Mesos-DNS with a list of masters, you point it to the Zookeeper 
 instances. Meson-DNS will watch Zookeeper to detect the current leading 
 master. So, while the list of Zookeeper instances is configured in a static 
 manner, Mesos masters can be added or removed freely without restarting 
 Mesos-DNS. 
 
 The integration with Zookeeper forced to switch from -v and -vv as the 
 flags to control verbosity to -v=0 (default), -v=1 (verbose), and -v=2 
 (very verbose). 
 
 To reduce complications because of dependencies to other packages, we have 
 also started using godep. 
 
 Please take a look at the branch 
 https://github.com/mesosphere/mesos-dns/tree/zk 
 https://github.com/mesosphere/mesos-dns/tree/zk
 and provide us with any feedback on the code or the documentation. 
 
 Thanks
 
 -- 
 Christos



Re: Zookeeper integration for Mesos-DNS

2015-03-23 Thread Ken Sipe
Aaron,

Mesos-DNS is a DNS name server + a monitor of mesos-masters.  It listens to the 
mesos-master.  If a service is launched by mesos then mesos-dns conjures a 
service name (app_id + framework_id +.mesos) and associates it to the IP and 
PORT of the service.  Since Mesos-DNS is a name service, it needs to be in your 
list of name services for service discovery.  From a service discovery stand 
point there is no need to be in the cluster and there is no need to have a 
dependency on Mesos.   

Mesos-DNS is not a proxy.  It doesn’t provide any special services to clients 
or services inside the cluster.   more detail below.
 
 On Mar 23, 2015, at 7:52 AM, Aaron Carey aca...@ilm.com wrote:
 
 As I understood it, it provides a service for containers within the cluster 
 to automatically find each other as it handles their dns calls?

The way this is stated, it doesn’t seem true.    Mesos-DNS is a DNS name 
server.    From a service-discovery standpoint, it doesn’t do anything 
different from a standard DNS name server.

 
 However clients outside the cluster will not use the mesos-dns service by 
 default, so won't have knowledge of anything running inside the cluster?

This is all dependent on how /etc/resolv.conf is setup.  If mesos-dns is in the 
list… then this is not true.

 
 Is there an easy way to set this up to (for example) add records to AWS Route 
 53 when services get started in the cluster, so other clients can see them?

This is outside of Mesos-DNS

Good Luck!!
 
 Thanks!
 Aaron
 
 From: Ken Sipe [kens...@gmail.com mailto:kens...@gmail.com]
 Sent: 23 March 2015 13:31
 To: user@mesos.apache.org mailto:user@mesos.apache.org
 Subject: Re: Zookeeper integration for Mesos-DNS
 
 Aaron,
 
 It depends on what you mean however, Mesos-DNS works outside the cluster IMO. 
 It is a bridge for things in the cluster (services launched by mesos)... But 
 at that point it is DNS.  Any client in or out of the cluster that can query 
 DNS that leverage the service. 
 
 Sent from my iPhone
 
 On Mar 23, 2015, at 4:25 AM, Aaron Carey aca...@ilm.com 
 mailto:aca...@ilm.com wrote:
 
 Hey,
 
 I don't suppose there is anything like Mesos-DNS but for services/users 
 outside the mesos cluster? So having a service which updates a DNS provider 
 with task port/ips running inside the cluster so that external users are 
 able to find those services? Am I correct in thinking Mesos-DNS only works 
 inside the cluster?
 
 Currently we're using consul for this, but I'd be interested if there was 
 some sort of magical plug and play solution?
 
 Thanks,
 Aaron
 
 From: Christos Kozyrakis [kozyr...@gmail.com mailto:kozyr...@gmail.com]
 Sent: 21 March 2015 00:18
 To: user@mesos.apache.org mailto:user@mesos.apache.org
 Subject: Zookeeper integration for Mesos-DNS
 
 Hi everybody, 
 
 we have updated Mesos-DNS to integrate directly with Zookeeper. Instead of 
 providing Mesos-DNS with a list of masters, you point it to the Zookeeper 
 instances. Meson-DNS will watch Zookeeper to detect the current leading 
 master. So, while the list of Zookeeper instances is configured in a static 
 manner, Mesos masters can be added or removed freely without restarting 
 Mesos-DNS. 
 
 The integration with Zookeeper forced to switch from -v and -vv as the flags 
 to control verbosity to -v=0 (default), -v=1 (verbose), and -v=2 (very 
 verbose). 
 
 To reduce complications because of dependencies to other packages, we have 
 also started using godep. 
 
 Please take a look at the branch 
 https://github.com/mesosphere/mesos-dns/tree/zk 
 https://github.com/mesosphere/mesos-dns/tree/zk
 and provide us with any feedback on the code or the documentation. 
 
 Thanks
 
 -- 
 Christos



Re: Mesos-DNS

2015-02-24 Thread Ken Sipe
Anirudha,

Did you follow: http://mesosphere.github.io/mesos-dns/docs/ ?

the build should work according to the build instructions.

ken
 On Feb 24, 2015, at 11:31 AM, Anirudha Jadhav aniru...@nyu.edu wrote:
 
 Whats the plan for mesos DNS? The dns lib is not even released. 
 
 even the build fails with syntax errors.
 
 is there a particular way to get this working?
 
 -- 
 Anirudha 
 --
 sudo go build -o mesos-dns
 
 # github.com/miekg/dns http://github.com/miekg/dns
 /usr/lib/go/src/pkg/github.com/miekg/dns/msg.go:1936 
 http://github.com/miekg/dns/msg.go:1936: syntax error: unexpected :, 
 expecting ]
 
 /usr/lib/go/src/pkg/github.com/miekg/dns/msg.go:1945 
 http://github.com/miekg/dns/msg.go:1945: syntax error: unexpected :, 
 expecting ]
 
 /usr/lib/go/src/pkg/github.com/miekg/dns/msg.go:1954 
 http://github.com/miekg/dns/msg.go:1954: syntax error: unexpected :, 
 expecting ]
 



Re: Mesos Master / Slave communications issues

2015-02-24 Thread Ken Sipe
It appears your configuration is off… as you suspected, the master 
registration should NOT be 127.0.0.1 or 127.0.1.1.    For each master, if you 
configure the IP in a file named ip under `/etc/mesos-master` you should be 
good (after restarting the master).

my configurations under /etc/mesos-master looks like this:
/etc/mesos-master/
├── cluster
├── hostname
├── ip
├── quorum
├── registry
└── work_dir

These are just plain text files: ip has the internal IP of the master, hostname 
has the FQDN of the master, cluster is the name of the cluster, etc.
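
Concretely, something like this on each master, assuming the stock packaging that turns each file under /etc/mesos-master into a --flag (addresses and names below are placeholders):

  echo "10.0.1.11"           > /etc/mesos-master/ip
  echo "master1.example.com" > /etc/mesos-master/hostname
  echo "my-cluster"          > /etc/mesos-master/cluster

  # restart so the master re-registers in ZooKeeper with the right address
  sudo service mesos-master restart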

good luck!
ken

 On Feb 24, 2015, at 4:06 PM, Kenneth Su su.ke...@gmail.com wrote:
 
 Hi Devin,
 
 I am new to Mesos as well, and I just configured it had the same problem like 
 yours.
 
 For your reference, what my fix was use the actually master IP instead, then 
 slave will pick it up and connected. I really wonder if 127.0.0.1, then Slave 
 will use it to connect itself and that is why never get to master one.
 
 Hope it helps!
 
 Kenneth
 
 On Tue, Feb 24, 2015 at 2:50 PM, Devin Carlen devin.car...@gmail.com 
 mailto:devin.car...@gmail.com wrote:
 Hello all,
 
 I’m new to Mesos but have recently started trying to stand up a cluster using 
 BOSH.  There is a BOSH release for it at 
 https://github.com/cf-platform-eng/mesos-boshrelease 
 https://github.com/cf-platform-eng/mesos-boshrelease that is under active 
 development.
 
 I was able to successfully deploy the cluster, however the slaves are not 
 communicating with the master.  Upon investigation I found that the leader 
 election is happening properly with ZooKeeper.  For this test I only have 1 
 Mesos master, 3 Mesos slaves, and 1 ZooKeeper instance for this test.  All 
 are running on their own VMs.  The single master gets elected upon startup:
 
 I0224 21:20:40.716702 12024 contender.cpp:243] New candidate (id='0') has 
 entered the contest for leadership
 I0224 21:20:40.717182 12024 detector.cpp:134] Detected a new leader: (id='0')
 I0224 21:20:40.717718 12030 group.cpp:629] Trying to get 
 '/mesos/info_00' in ZooKeeper
 I0224 21:20:40.79 12030 detector.cpp:351] A new leading master 
 (UPID=master@127.0.0.1 mailto:UPID=master@127.0.0.1:80) is detected
 I0224 21:20:40.722367 12030 master.cpp:734] The newly elected leader is 
 master@127.0.0.1 mailto:master@127.0.0.1:80
 I0224 21:20:40.722394 12030 master.cpp:742] Elected as the leading master!
 
 I thought it odd that the IP listed here is 127.0.0.1.  I have not specified 
 localhost anywhere and I explicitly specify —ip=0.0.0.0 in my mesos-master 
 command.
 
 The slave sees the election happen, but then appears to connect to 
 127.0.0.1:80 http://127.0.0.1/:
 
 I0224 21:24:18.892083 17316 detector.cpp:134] Detected a new leader: (id='0')
 I0224 21:24:18.892290 17316 group.cpp:629] Trying to get 
 '/mesos/info_00' in ZooKeeper
 I0224 21:24:18.894039 17316 detector.cpp:351] A new leading master 
 (UPID=master@127.0.0.1 mailto:UPID=master@127.0.0.1:80) is detected
 I0224 21:24:18.894130 17316 slave.cpp:500] New master detected at 
 master@127.0.0.1 mailto:master@127.0.0.1:80
 I0224 21:24:18.894383 17316 slave.cpp:525] Detecting new master
 I0224 21:24:18.894443 17316 status_update_manager.cpp:162] New master 
 detected at master@127.0.0.1 mailto:master@127.0.0.1:80
 I0224 21:24:18.894630 17320 slave.cpp:1957] master@127.0.0.1 
 mailto:master@127.0.0.1:80 exited
 W0224 21:24:18.894665 17320 slave.cpp:1960] Master disconnected! Waiting for 
 a new master to be elected
 
 At this point the slave never successfully connects.  Just to verify, I also 
 checked what ZooKeeper was reporting:
 
 $ /zkCli.sh get /mesos/info_00
 
 201502242120-16777343-80-12000��Pmaster@127.0.0.1:80 http://127.0.0.1/
 cZxid = 0x20
 ctime = Tue Feb 24 21:20:40 UTC 
 http://airmail.calendar/2015-02-24%2013:20:40%20PST 2015
 mZxid = 0x20
 mtime = Tue Feb 24 21:20:40 UTC 
 http://airmail.calendar/2015-02-24%2013:20:40%20PST 2015
 pZxid = 0x20
 cversion = 0
 dataVersion = 0
 aclVersion = 0
 ephemeralOwner = 0x14bbd711b6e0012
 dataLength = 60
 numChildren = 0
 
 So somehow the IP 127.0.0.1 is written instead of the correct IP.  Any 
 thoughts on how I can fix this?
 
 Best,
 
 Devin
 



Re: Transferring Chronos project to the community

2015-01-13 Thread Ken Sipe
YAY!

 On Jan 13, 2015, at 4:56 PM, Brenden Matthews brenden.matth...@airbnb.com 
 wrote:
 
 Hello Mesos users,
 
 I'm pleased to announce that Airbnb has decided to transfer the Chronos 
 project to community ownership. It will now live under the umbrella of the 
 top-level Apache Mesos project, as a GitHub hosted project at 
 https://github.com/mesos/chronos https://github.com/mesos/chronos.
 
 Airbnb continues to use and maintain the Chronos project, which drives our 
 offline processing and ETL systems. Going forward, we hope to foster a 
 healthy relationship with the community and keep the project active.