Re: Who is the user in Mesos Authorization ACL definition?

2015-03-09 Thread Sivaram Kannan
Hi,

I apologize for bombarding the list with so many emails on the same issue.
I modified the acl.json as shown below.

1. I was able to launch the framework with authentication as users devel1
and devel2.
2. I was able to launch a task as user devel1
3. I get TASK_LOST when I try to launch task with the framework registered
as devel2.
4. In the same config, if I change the run_tasks users to devel, the
task fails with the error described in the previous email. As far as I
understand, an error in the run_tasks users does not give TASK_LOST but
TASK_FAILED, whereas a mismatch in principals between register_frameworks
and run_tasks gives TASK_LOST.

Does the above make sense? Please correct me if I am wrong.


{
  "permissive": false,
  "register_frameworks": [
    {
      "principals": { "values": [ "devel1", "devel2" ] },
      "roles": { "values": [ "apps", "dev-ops" ] }
    },
    {
      "principals": { "type": "NONE" },
      "roles": { "values": [ "apps", "dev-ops" ] }
    }
  ],
  "run_tasks": [
    {
      "principals": { "values": [ "devel1" ] },
      "users": { "values": [ "root" ] }
    },
    {
      "principals": { "values": [ "marathon" ] },
      "users": { "type": "NONE" }
    }
  ]
}

Thanks,
./Siva.

On Mon, Mar 9, 2015 at 3:57 PM, Sivaram Kannan sivara...@gmail.com wrote:


 Hi Vinod,

 The users in the run_tasks definition below: do they refer to unix users on
 the machine where the framework runs, or to unix users on the mesos-slave
 machine? I think the fact that I run all the software (mesos-master,
 mesos-slave, marathon) as docker containers is significant and may be the
 reason for the failure below.

 "run_tasks": [
   {
     "principals": { "values": [ "marathon" ] },
     "users": { "values": [ "devel" ] }
   },
   {
     "principals": { "values": [ "marathon" ] },
     "users": { "type": "NONE" }
   }
 ]

 When I start Marathon, I pass the flag --mesos_user=devel, and when
 bringing up mesos-slave I pass --switch_user=true (which I think is the
 default anyway). When I try to launch a task, this is what I get:

 Marathon Log:

 0.10; rv:38.0) Gecko/20100101 Firefox/38.0
 (mesosphere.chaos.http.ChaosRequestLog:15)
 [2015-03-06 06:04:04,057] INFO Received status update for task
 busybox.9777a963-c3c6-11e4-a31a-56847afe9799: TASK_FAILED (Abnormal
 executor termination) (mesosphere.marathon.MarathonScheduler:165)
 [2015-03-06 06:04:04,063] INFO Task launch delay for [/busybox] is now
 [43] seconds (mesosphere.util.RateLimiter:34)
 [2015-03-06 06:04:04,068] INFO Task
 busybox.9777a963-c3c6-11e4-a31a-56847afe9799 expunged and removed from
 TaskTracker (mesosphere.marathon.tasks.TaskTracker:101)
 [2015-03-06 06:04:04,068] INFO Sending event notification.
 (mesosphere.marathon.MarathonScheduler:274)

 Mesos-Slave Log:

 Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473 13
 slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
 executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
 '20150306-054714-24707342-5050-1-' failed to start: Failed to create
 container: Failed to chown: Failed to get user information for 'devel':
 Success
 Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.900068 13
 slave.cpp:2882] Termination of executor
 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
 '20150306-054714-24707342-5050-1-' failed: Unknown container:
 9835da8c-a844-4d53-a7f7-4a5e6e808a9b
 Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.905900 13
 slave.cpp:3134] Failed to unmonitor container for executor
 busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799 of framework
 20150306-054714-24707342-5050-1-: Not monitored

 Could the failure be related to me running the mesos-slave as container
 here?

 Thanks,
 ./Siva.

 On Mon, Mar 9, 2015 at 10:51 AM, Sivaram Kannan sivara...@gmail.com
 wrote:


 Hi Vinod,

 Thanks, I got it. I guess I did not understand the relationship between
 the principals defined in authentication and those in authorization. I
 re-read the description of the authentication and credentials flags; it is
 not clear from them that the principals defined in authorization must match
 the ones in the credentials to work correctly. If I can, I will make the
 documentation clearer and submit a PR.

 Thanks,
 ./Siva.

 On Mon, Mar 9, 2015 at 2:18 AM, Vinod Kone vinodk...@apache.org wrote:

 The principal used for authenticating the framework is the same
 principal used to authorize the framework too. So you need to use
 'marathon' in your credentials too. In other words, when you start the
 framework the Credential.principal should be the same as
 FrameworkInfo.principal (Mesos master will validate this).

 On Sun, Mar 8, 2015 at 10:48 AM, 

mesos on coreos

2015-03-09 Thread Gurvinder Singh
Hi,

I am wondering if anybody in the community has looked into running, or is
already running, mesos on top of coreos. I would be interested to hear about
your experiences in the following areas:

- User management on the coreos cluster and in containers running with Mesos
- Are you using fleet to run mesos, or do you run it as a service in
cloud-config without using fleet at all?
- Networking among hosts: flannel or something else?
- Any other interesting insights you found with such a setup

Thanks,
Gurvinder


Re: Who is the user in Mesos Authorization ACL definition?

2015-03-09 Thread Michael Park
On 9 March 2015 at 07:36, Sivaram Kannan sivara...@gmail.com wrote:


 Hi,

 I apologize for bombarding the list with so many emails on the same issue.
 I modified the acl.json as shown below.

 1. I was able to launch the framework with authentication as users devel1
 and devel2.


Just so that our terminologies match here, you were able to *register*
frameworks *authenticated* as *principals* *devel1* and *devel2*.

Your ACL specifies *devel1* and *devel2* can register under *apps* and
*dev-ops* roles, so as long as the frameworks registered under those roles,
the success here makes sense.


 2. I was able to launch a task as user devel1


I agree with Vinod that maybe the point of confusion is regarding *principal*
vs *user* for *run_tasks*. To reiterate, *principal* is essentially a
username for Mesos to authenticate the framework, and *user* is the unix
user under which the task will run. Your ACL specifies that *principal*
*devel1* can launch tasks as *user* *root*. So you shouldn't be able to
launch a task as user *devel1*, but rather launch a task as user *root* with
the framework registered as principal=*devel1*. If this is not the case,
something's wrong.


 3. I get TASK_LOST when I try to launch task with the framework registered
 as devel2.


This is correct. Vinod already covered this point. In short, framework
registered as *devel2* is not permitted to run anything based on your ACL
since none of the specified cases match and permissive is set to false.


 4. In the same config, if I change the run_tasks users to devel, the
 task fails with the error described in the previous email. As far as I
 understand, an error in the run_tasks users does not give TASK_LOST but
 TASK_FAILED.


I'm not sure what you mean by "an error in run_tasks users". The error you
get in this case is because you don't have a *devel* user available in the
environment where you're launching the task. The relevant line in the error
message that illustrates this is:

Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473 13
 slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
 executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
 '20150306-054714-24707342-5050-1-' failed to start: Failed to create
 container: Failed to chown: *Failed to get user information for 'devel'*:
 Success


This is indeed a *TASK_FAILED*, since authorization succeeded, but the task
failed to launch.
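
A quick way to confirm this on the slave is to check whether the name resolves to an account in the environment where the slave runs (inside the mesos-slave container, if it is containerized). The sketch below uses Python's standard pwd module; the helper name unix_user_exists is made up for illustration and is not part of Mesos.

```python
import pwd


def unix_user_exists(name: str) -> bool:
    """Return True if `name` resolves to an account where this code runs."""
    try:
        pwd.getpwnam(name)  # raises KeyError when no such account exists
        return True
    except KeyError:
        return False


if __name__ == "__main__":
    # 'devel' is the user named in the slave log above.
    print("devel exists:", unix_user_exists("devel"))
```

If this reports False inside the slave's environment, the chown step has no account to look up, which is consistent with the "Failed to get user information" error.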


 But a mismatch in principals between register_frameworks and run_tasks
 gives a TASK_LOST.


It's not clear exactly what you mean by a *mismatch* in principals between
register_frameworks and run_tasks. If you mean that all principals under
register_frameworks should have a matching portion in run_tasks, that's not
quite correct. For example, suppose you modified the ACL to be:

{
  "permissive": false,
  "register_frameworks": [
    {
      "principals": { "values": [ "devel1", "devel2" ] },
      "roles": { "values": [ "apps", "dev-ops" ] }
    },
    {
      "principals": { "type": "NONE" },
      "roles": { "values": [ "apps", "dev-ops" ] }
    }
  ],
  "run_tasks": [
    {
      "principals": { "values": [ "devel1", *"devel2"* ] },
      "users": { "values": [ "root" ] }
    },
    {
      "principals": { "values": [ "marathon" ] },
      "users": { "type": "NONE" }
    }
  ]
}


If we attempt to launch a task as user mpark with the framework
registered as devel2 (or devel1), we'll continue to get the
*TASK_LOST* message: *root* is the only user those principals may run
tasks as, so the request fails at the *authorization* phase.
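
To make the run_tasks reasoning concrete, here is a minimal sketch of first-match ACL evaluation. It is a simplified model of the behaviour described in this thread, not the Mesos source: the names Entity, RunTaskRule and authorize_run_task are hypothetical, and treating a NONE entry as an explicit deny (rather than as "no match") is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """ACL entity: an explicit value list, or the special type ANY / NONE."""
    values: List[str] = field(default_factory=list)
    type: str = "SOME"  # "SOME" (use values), "ANY", or "NONE"

    def matches(self, name: str) -> bool:
        # ANY and NONE apply to every name; SOME applies only to listed names.
        return self.type in ("ANY", "NONE") or name in self.values

    def allows(self, name: str) -> bool:
        # Assumption: NONE is an explicit deny; ANY allows everything.
        if self.type == "NONE":
            return False
        return self.type == "ANY" or name in self.values


@dataclass
class RunTaskRule:
    principals: Entity
    users: Entity


def authorize_run_task(rules, permissive, principal, user) -> bool:
    """The first rule that matches decides; otherwise fall back to `permissive`."""
    for rule in rules:
        if rule.principals.matches(principal) and rule.users.matches(user):
            return rule.principals.allows(principal) and rule.users.allows(user)
    return permissive


# The run_tasks section of the modified ACL above.
rules = [
    RunTaskRule(Entity(["devel1", "devel2"]), Entity(["root"])),
    RunTaskRule(Entity(["marathon"]), Entity(type="NONE")),
]
```

Under this model, authorize_run_task(rules, False, "devel1", "root") is allowed, while principal devel2 with user mpark matches no rule and is rejected because permissive is false.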


 Does the above make sense? Please correct me if I am wrong.


I hope my explanation above made sense!



 {
   "permissive": false,
   "register_frameworks": [
     {
       "principals": { "values": [ "devel1", "devel2" ] },
       "roles": { "values": [ "apps", "dev-ops" ] }
     },
     {
       "principals": { "type": "NONE" },
       "roles": { "values": [ "apps", "dev-ops" ] }
     }
   ],
   "run_tasks": [
     {
       "principals": { "values": [ "devel1" ] },
       "users": { "values": [ "root" ] }
     },
     {
       "principals": { "values": [ "marathon" ] },
       "users": { "type": "NONE" }
     }
   ]
 }

 Thanks,
 ./Siva.


Thanks,

MPark.



Re: Who is the user in Mesos Authorization ACL definition?

2015-03-09 Thread Sivaram Kannan
Hi Vinod,

The users in the run_tasks definition below: do they refer to unix users on
the machine where the framework runs, or to unix users on the mesos-slave
machine? I think the fact that I run all the software (mesos-master,
mesos-slave, marathon) as docker containers is significant and may be the
reason for the failure below.

"run_tasks": [
  {
    "principals": { "values": [ "marathon" ] },
    "users": { "values": [ "devel" ] }
  },
  {
    "principals": { "values": [ "marathon" ] },
    "users": { "type": "NONE" }
  }
]

When I start Marathon, I pass the flag --mesos_user=devel, and when
bringing up mesos-slave I pass --switch_user=true (which I think is the
default anyway). When I try to launch a task, this is what I get:

Marathon Log:

0.10; rv:38.0) Gecko/20100101 Firefox/38.0
(mesosphere.chaos.http.ChaosRequestLog:15)
[2015-03-06 06:04:04,057] INFO Received status update for task
busybox.9777a963-c3c6-11e4-a31a-56847afe9799: TASK_FAILED (Abnormal
executor termination) (mesosphere.marathon.MarathonScheduler:165)
[2015-03-06 06:04:04,063] INFO Task launch delay for [/busybox] is now [43]
seconds (mesosphere.util.RateLimiter:34)
[2015-03-06 06:04:04,068] INFO Task
busybox.9777a963-c3c6-11e4-a31a-56847afe9799 expunged and removed from
TaskTracker (mesosphere.marathon.tasks.TaskTracker:101)
[2015-03-06 06:04:04,068] INFO Sending event notification.
(mesosphere.marathon.MarathonScheduler:274)

Mesos-Slave Log:

Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473 13
slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
'20150306-054714-24707342-5050-1-' failed to start: Failed to create
container: Failed to chown: Failed to get user information for 'devel':
Success
Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.900068 13
slave.cpp:2882] Termination of executor
'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
'20150306-054714-24707342-5050-1-' failed: Unknown container:
9835da8c-a844-4d53-a7f7-4a5e6e808a9b
Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.905900 13
slave.cpp:3134] Failed to unmonitor container for executor
busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799 of framework
20150306-054714-24707342-5050-1-: Not monitored

Could the failure be related to me running the mesos-slave as container
here?

Thanks,
./Siva.

On Mon, Mar 9, 2015 at 10:51 AM, Sivaram Kannan sivara...@gmail.com wrote:


 Hi Vinod,

 Thanks, I got it. I guess I did not understand the relationship between
 the principals defined in authentication and those in authorization. I
 re-read the description of the authentication and credentials flags; it is
 not clear from them that the principals defined in authorization must match
 the ones in the credentials to work correctly. If I can, I will make the
 documentation clearer and submit a PR.

 Thanks,
 ./Siva.

 On Mon, Mar 9, 2015 at 2:18 AM, Vinod Kone vinodk...@apache.org wrote:

 The principal used for authenticating the framework is the same principal
 used to authorize the framework too. So you need to use 'marathon' in your
 credentials too. In other words, when you start the framework the
 Credential.principal should be the same as FrameworkInfo.principal (Mesos
 master will validate this).

 On Sun, Mar 8, 2015 at 10:48 AM, Sivaram Kannan sivara...@gmail.com
 wrote:

 I0308 17:41:14.876610 6 master.cpp:1342] Authorizing framework
 principal 'user1' to receive offers for role 'apps'


 As you can see from this line, the master is trying to authorize
 principal 'user1' and not 'marathon'.
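
 The relationship between the credential, the framework and the ACL can be written down as a small pre-flight check. This is only an illustrative sketch: check_principals is a hypothetical helper, not a Mesos API, and it assumes you already know the three values from your own configuration.

```python
from typing import Iterable, List


def check_principals(
    credential_principal: str,
    framework_principal: str,
    acl_principals: Iterable[str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    if credential_principal != framework_principal:
        problems.append(
            f"credential principal {credential_principal!r} differs from "
            f"framework principal {framework_principal!r}"
        )
    if credential_principal not in set(acl_principals):
        problems.append(
            f"principal {credential_principal!r} is not listed in the ACL"
        )
    return problems


if __name__ == "__main__":
    # Mirrors the log line above: authenticated as 'user1', ACL lists 'marathon'.
    for problem in check_principals("user1", "marathon", ["marathon"]):
        print(problem)
```

 With the credential changed to 'marathon', the same call returns an empty list.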




 --
 ever tried. ever failed. no matter.
 try again. fail again. fail better.
 -- Samuel Beckett




-- 
ever tried. ever failed. no matter.
try again. fail again. fail better.
-- Samuel Beckett