Thanks a lot to both of you, Vinod and Mark, for the patient explanation. You
are right, I got confused by the terminology. I got what I wanted out of
the ACLs in my environment.

Thanks again and I really appreciate it.

./Siva.



On Tue, Mar 10, 2015 at 4:30 AM, Michael Park <[email protected]> wrote:

> On 9 March 2015 at 07:36, Sivaram Kannan <[email protected]> wrote:
>
>>
>> Hi,
>>
>> I apologize for bombarding you with so many emails on the same issue. So, I
>> modified the acl.json as below.
>>
>> 1. I was able to launch the framework with authentication as users devel1
>> and devel2.
>>
>
> Just so that our terminologies match here, you were able to *register*
> frameworks *authenticated* as *principals* *devel1* and *devel2*.
>
> Your ACL specifies *devel1* and *devel2* can register under *apps* and
> *dev-ops* roles, so as long as the frameworks registered under those
> roles, the success here makes sense.
>
>
>> 2. I was able to launch a task as user devel1
>>
>
> I agree with Vinod that maybe the point of confusion is regarding
> *principal* vs *user* for *run_tasks*. To reiterate, *principal* is
> essentially a username for Mesos to authenticate the framework, and *user*
> is the unix user under which the task will run. Your ACL specifies that
> *principal* *devel1* can launch tasks as *user* *root*. So you shouldn't be
> able to launch a task as user *devel1*, but rather launch a task as user
> *root* with the framework registered as principal=*devel1*. If this is not
> the case, something's wrong.
>
>
>> 3. I get TASK_LOST when I try to launch task with the framework
>> registered as devel2.
>>
>
> This is correct. Vinod already covered this point. In short, a framework
> registered as *devel2* is not permitted to run anything based on your ACL,
> since none of the specified cases match and "permissive" is set to false.
>
>
>> 4. In the same config, if I change the run_tasks => users to devel, the
>> task fails with the error described in the previous email. As far as I
>> understand, an error in run_tasks users does not give TASK_LOST, but
>> TASK_FAILED.
>>
>
> I'm not sure what you mean by "an error in run_tasks users". The error you
> get in this case is because you don't have a *devel* user available in
> the environment where you're launching the task. The relevant line in the
> error message that illustrates this is:
>
>> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473    13
>> slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
>> executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
>> '20150306-054714-24707342-5050-1-0000' failed to start: Failed to create
>> container: Failed to chown: *Failed to get user information for 'devel'*:
>> Success
>
>
> This is indeed a *TASK_FAILED*, since authorization succeeded, but the
> task failed to launch.
>
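
One quick way to confirm this is to ask the system user database directly on the host (or inside the container) where mesos-slave runs. The snippet below is a minimal sketch and not part of Mesos; the helper name unix_user_exists is made up for illustration, and it assumes a Unix system where Python's standard pwd module is available.

```python
import pwd


def unix_user_exists(name: str) -> bool:
    """Return True if `name` resolves to an account in the local user database."""
    try:
        pwd.getpwnam(name)  # raises KeyError when no such entry exists
    except KeyError:
        return False
    return True


if __name__ == "__main__":
    # 'devel' is the user named in the slave log above. Run this in the same
    # environment as mesos-slave, since that is where the lookup happens.
    for user in ("root", "devel"):
        print(user, unix_user_exists(user))
```

If this reports False for devel where the slave runs, the "Failed to get user information for 'devel'" error is expected, no matter which users exist on the machine where the framework runs.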
>
>> But a mismatch in principals between register_frameworks and run_tasks
>> gives a TASK_LOST.
>>
>
> It's not clear exactly what you mean by "a *mismatch* in principals
> between register_frameworks and run_tasks". If you mean that all principals
> under register_frameworks should have a matching entry in run_tasks,
> that's not quite correct. For example, suppose you modified the ACL to be:
>
>> {
>>     "permissive": false,
>>     "register_frameworks": [
>>         {
>>             "principals": { "values": [ "devel1", "devel2" ] },
>>             "roles": { "values": [ "apps", "dev-ops" ] }
>>         },
>>         {
>>             "principals": { "type": "NONE" },
>>             "roles": { "values": [ "apps", "dev-ops" ] }
>>         }
>>     ],
>>     "run_tasks": [
>>         {
>>             "principals": { "values": [ "devel1", *"devel2"* ] },
>>             "users": { "values": [ "root" ] }
>>         },
>>         {
>>             "principals": { "values": [ "marathon" ] },
>>             "users": { "type": "NONE" }
>>         }
>>     ]
>> }
>
>
> If we attempt to launch a task as user "mpark" with the framework
> registered as "devel2" (or "devel1"), we'll continue to get the
> *TASK_LOST* message because it fails at the *authorization* phase.
>
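
To make the matching concrete, here is a rough Python model of the run_tasks evaluation described above: rules are tried in order, the first rule whose principals and users both apply decides the outcome, and "permissive" decides when no rule applies. This is a simplified sketch based on the behaviour discussed in this thread, not Mesos code. The function names are made up for illustration, and the handling of "ANY"/"NONE" and the default for "permissive" are assumptions.

```python
from typing import Any, Dict

# The modified ACL from this thread (only the run_tasks part matters here).
ACLS: Dict[str, Any] = {
    "permissive": False,
    "run_tasks": [
        {
            "principals": {"values": ["devel1", "devel2"]},
            "users": {"values": ["root"]},
        },
        {
            "principals": {"values": ["marathon"]},
            "users": {"type": "NONE"},
        },
    ],
}


def entity_matches(entity: Dict[str, Any], value: str) -> bool:
    """True if this ACL entity applies to the requested value.

    Assumption: "ANY" and "NONE" apply to every concrete value.
    """
    if "type" in entity:
        return entity["type"] in ("ANY", "NONE")
    return value in entity.get("values", [])


def entity_allows(entity: Dict[str, Any], value: str) -> bool:
    """True if this ACL entity grants the requested value.

    Assumption: "ANY" grants everything and "NONE" grants nothing.
    """
    if "type" in entity:
        return entity["type"] == "ANY"
    return value in entity.get("values", [])


def can_run_task(acls: Dict[str, Any], principal: str, user: str) -> bool:
    """First applicable rule decides; otherwise fall back to 'permissive'."""
    for rule in acls.get("run_tasks", []):
        if entity_matches(rule["principals"], principal) and entity_matches(
            rule["users"], user
        ):
            return entity_allows(rule["principals"], principal) and entity_allows(
                rule["users"], user
            )
    # No rule applied: "permissive" decides (assumed to default to True).
    return acls.get("permissive", True)


if __name__ == "__main__":
    for principal, user in [
        ("devel1", "root"),
        ("devel2", "root"),
        ("devel2", "mpark"),
        ("marathon", "root"),
    ]:
        print(principal, user, can_run_task(ACLS, principal, user))
```

Under these assumptions, devel1 and devel2 may run tasks as root, while devel2 as mpark is rejected because no rule applies and "permissive" is false, and marathon is rejected by its explicit NONE rule. A rejection at this stage is the authorization failure that shows up as TASK_LOST.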
>
>> Does the above make sense? Please correct me if I am wrong.
>>
>
> I hope my explanation above made sense!
>
>
>>
>> {
>>     "permissive": false,
>>     "register_frameworks": [
>>         {
>>             "principals": { "values": [ "devel1", "devel2" ] },
>>             "roles": { "values": [ "apps", "dev-ops" ] }
>>         },
>>         {
>>             "principals": { "type": "NONE" },
>>             "roles": { "values": [ "apps", "dev-ops" ] }
>>         }
>>     ],
>>     "run_tasks": [
>>         {
>>             "principals": { "values": [ "devel1" ] },
>>             "users": { "values": [ "root" ] }
>>         },
>>         {
>>             "principals": { "values": [ "marathon" ] },
>>             "users": { "type": "NONE" } } ]
>> }
>>
>> Thanks,
>> ./Siva.
>>
>
> Thanks,
>
> MPark.
>
>
>> On Mon, Mar 9, 2015 at 3:57 PM, Sivaram Kannan <[email protected]>
>> wrote:
>>
>>>
>>> Hi Vinod,
>>>
>>> The users in the run_tasks definition below: do they refer to unix users on
>>> the machine where the framework is run, or to unix users on the mesos-slave
>>> machine? I think the fact that I run all the software (mesos-master,
>>> mesos-slave, marathon) as docker containers is significant and may be the
>>> reason for the failure below.
>>>
>>> "run_tasks": [
>>>         {
>>>             "principals": {
>>>                 "values": [
>>>                     "marathon"
>>>                 ]
>>>             },
>>>             "users": {
>>>                 "values": [
>>>                     "devel"
>>>                 ]
>>>             }
>>>         },
>>>         {
>>>             "principals": {
>>>                 "values": [
>>>                     "marathon"
>>>                 ]
>>>             },
>>>             "users": {
>>>                 "type": "NONE"
>>>             }
>>>         }
>>>     ]
>>>
>>> When I start Marathon, I pass the flag --mesos_user=devel, and when I
>>> bring up mesos-slave I pass the flag --switch_user=true (which I think is
>>> the default anyway). When I try to launch a task, this is what I get:
>>>
>>> Marathon Log:
>>>
>>> 0.10; rv:38.0) Gecko/20100101 Firefox/38.0"
>>> (mesosphere.chaos.http.ChaosRequestLog:15)
>>> [2015-03-06 06:04:04,057] INFO Received status update for task
>>> busybox.9777a963-c3c6-11e4-a31a-56847afe9799: TASK_FAILED (Abnormal
>>> executor termination) (mesosphere.marathon.MarathonScheduler:165)
>>> [2015-03-06 06:04:04,063] INFO Task launch delay for [/busybox] is now
>>> [43] seconds (mesosphere.util.RateLimiter:34)
>>> [2015-03-06 06:04:04,068] INFO Task
>>> busybox.9777a963-c3c6-11e4-a31a-56847afe9799 expunged and removed from
>>> TaskTracker (mesosphere.marathon.tasks.TaskTracker:101)
>>> [2015-03-06 06:04:04,068] INFO Sending event notification.
>>> (mesosphere.marathon.MarathonScheduler:274)
>>>
>>> Mesos-Slave Log:
>>>
>>> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473    13
>>> slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
>>> executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
>>> '20150306-054714-24707342-5050-1-0000' failed to start: Failed to create
>>> container: Failed to chown: Failed to get user information for 'devel':
>>> Success
>>> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.900068    13
>>> slave.cpp:2882] Termination of executor
>>> 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
>>> '20150306-054714-24707342-5050-1-0000' failed: Unknown container:
>>> 9835da8c-a844-4d53-a7f7-4a5e6e808a9b
>>> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.905900    13
>>> slave.cpp:3134] Failed to unmonitor container for executor
>>> busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799 of framework
>>> 20150306-054714-24707342-5050-1-0000: Not monitored
>>>
>>> Could the failure be related to me running the mesos-slave as a container
>>> here?
>>>
>>> Thanks,
>>> ./Siva.
>>>
>>> On Mon, Mar 9, 2015 at 10:51 AM, Sivaram Kannan <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Hi Vinod,
>>>>
>>>> Thanks, I got it. I guess I did not understand the relationship between
>>>> the principals defined in authentication and those in authorization. I
>>>> re-read the authentication and credentials flags, and it is not clear from
>>>> them that the principals defined in authorization must match them to work
>>>> correctly. If I can, I will update the documentation to make this clearer
>>>> and submit a PR.
>>>>
>>>> Thanks,
>>>> ./Siva.
>>>>
>>>> On Mon, Mar 9, 2015 at 2:18 AM, Vinod Kone <[email protected]>
>>>> wrote:
>>>>
>>>>> The principal used for authenticating the framework is the same
>>>>> principal used to authorize the framework too. So you need to use
>>>>> 'marathon' in your credentials too. In other words, when you start the
>>>>> framework the Credential.principal should be the same as
>>>>> FrameworkInfo.principal (Mesos master will validate this).
>>>>>
>>>>> On Sun, Mar 8, 2015 at 10:48 AM, Sivaram Kannan <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I0308 17:41:14.876610     6 master.cpp:1342] Authorizing framework
>>>>>> principal 'user1' to receive offers for role 'apps'
>>>>>>
>>>>>
>>>>> As you can see from this line, the master is trying to authorize
>>>>> principal 'user1' and not 'marathon'.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ever tried. ever failed. no matter.
>>>> try again. fail again. fail better.
>>>>         -- Samuel Beckett
>>>>
>>>
>>
>
>


