Hi,
I apologize for bombarding the list with so many emails on the same issue.
I modified acl.json as shown below, and this is what I observe:
1. I was able to launch the framework with authentication as users devel1
and devel2.
2. I was able to launch a task as user devel1
3. I get TASK_LOST when I try to launch a task with the framework
registered as devel2.
4. In the same config, if I change run_tasks => users to devel, the task
fails with the error described in the previous email. As far as I
understand, a bad value in run_tasks users results in TASK_FAILED rather
than TASK_LOST, whereas a mismatch in principals between
register_frameworks and run_tasks results in TASK_LOST.
Does the above make sense? Please correct me if I am wrong.

{
  "permissive": false,
  "register_frameworks": [
    {
      "principals": { "values": [ "devel1", "devel2" ] },
      "roles": { "values": [ "apps", "dev-ops" ] }
    },
    {
      "principals": { "type": "NONE" },
      "roles": { "values": [ "apps", "dev-ops" ] }
    }
  ],
  "run_tasks": [
    {
      "principals": { "values": [ "devel1" ] },
      "users": { "values": [ "root" ] }
    },
    {
      "principals": { "values": [ "marathon" ] },
      "users": { "type": "NONE" }
    }
  ]
}
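
To check my reading of the config above, here is a minimal Python sketch of how I understand first-match ACL evaluation. It is a simplified model, not the Mesos implementation. The function names (covers, permits, authorize) are hypothetical, and it assumes three things: the first ACL whose principals and object both match decides the request, a NONE entry matches everything but permits nothing, and "permissive": false denies any request that no ACL matches.

```python
# Simplified model of first-match ACL evaluation. This encodes my
# assumptions about the semantics; it is not code taken from Mesos.

ACLS = {
    "permissive": False,
    "register_frameworks": [
        {"principals": {"values": ["devel1", "devel2"]},
         "roles": {"values": ["apps", "dev-ops"]}},
        {"principals": {"type": "NONE"},
         "roles": {"values": ["apps", "dev-ops"]}},
    ],
    "run_tasks": [
        {"principals": {"values": ["devel1"]},
         "users": {"values": ["root"]}},
        {"principals": {"values": ["marathon"]},
         "users": {"type": "NONE"}},
    ],
}


def covers(entity, value):
    """True if this ACL entity applies to `value` (used to select an ACL)."""
    if entity.get("type") in ("ANY", "NONE"):
        # Assumption: ANY and NONE both apply to every value.
        return True
    return value in entity.get("values", [])


def permits(entity, value):
    """True if this ACL entity grants `value` (used once an ACL is selected)."""
    kind = entity.get("type")
    if kind == "NONE":
        return False
    if kind == "ANY":
        return True
    return value in entity.get("values", [])


def authorize(acls, action, object_key, principal, obj):
    """Return the decision of the first matching ACL, else `permissive`."""
    for acl in acls.get(action, []):
        if covers(acl["principals"], principal) and covers(acl[object_key], obj):
            return (permits(acl["principals"], principal)
                    and permits(acl[object_key], obj))
    # Assumption: requests matched by no ACL fall back to `permissive`.
    return acls.get("permissive", True)


if __name__ == "__main__":
    for principal in ("devel1", "devel2", "marathon"):
        registered = authorize(ACLS, "register_frameworks", "roles",
                               principal, "apps")
        can_run = authorize(ACLS, "run_tasks", "users", principal, "root")
        print(principal, "register:", registered, "run as root:", can_run)
```

Under those assumptions devel2 can register but has no matching run_tasks entry, so its launch is denied, which would be consistent with the TASK_LOST in item 3. The TASK_FAILED in item 4 does not show up in this model at all: going by the slave log in my previous email, it comes from the slave failing to look up the user 'devel', not from an authorization decision.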
Thanks,
./Siva.
On Mon, Mar 9, 2015 at 3:57 PM, Sivaram Kannan <[email protected]> wrote:
>
> Hi Vinod,
>
> Do the users in the run_tasks definition below refer to Unix users on
> the machine where the framework runs, or to Unix users on the mesos-slave
> machine? I run all the software (mesos-master, mesos-slave, marathon) as
> Docker containers, which I suspect is significant and may be the reason
> for the failure below.
>
> "run_tasks": [
> {
> "principals": {
> "values": [
> "marathon"
> ]
> },
> "users": {
> "values": [
> "devel"
> ]
> }
> },
> {
> "principals": {
> "values": [
> "marathon"
> ]
> },
> "users": {
> "type": "NONE"
> }
> }
> ]
>
> I start Marathon with the flag --mesos_user=devel and bring up
> mesos-slave with the flag --switch_user=true (which I think is the
> default anyway). When I try to launch a task, this is what I get:
>
> Marathon Log:
>
> 0.10; rv:38.0) Gecko/20100101 Firefox/38.0"
> (mesosphere.chaos.http.ChaosRequestLog:15)
> [2015-03-06 06:04:04,057] INFO Received status update for task
> busybox.9777a963-c3c6-11e4-a31a-56847afe9799: TASK_FAILED (Abnormal
> executor termination) (mesosphere.marathon.MarathonScheduler:165)
> [2015-03-06 06:04:04,063] INFO Task launch delay for [/busybox] is now
> [43] seconds (mesosphere.util.RateLimiter:34)
> [2015-03-06 06:04:04,068] INFO Task
> busybox.9777a963-c3c6-11e4-a31a-56847afe9799 expunged and removed from
> TaskTracker (mesosphere.marathon.tasks.TaskTracker:101)
> [2015-03-06 06:04:04,068] INFO Sending event notification.
> (mesosphere.marathon.MarathonScheduler:274)
>
> Mesos-Slave Log:
>
> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.898473 13
> slave.cpp:2787] Container '9835da8c-a844-4d53-a7f7-4a5e6e808a9b' for
> executor 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
> '20150306-054714-24707342-5050-1-0000' failed to start: Failed to create
> container: Failed to chown: Failed to get user information for 'devel':
> Success
> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.900068 13
> slave.cpp:2882] Termination of executor
> 'busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799' of framework
> '20150306-054714-24707342-5050-1-0000' failed: Unknown container:
> 9835da8c-a844-4d53-a7f7-4a5e6e808a9b
> Mar 06 06:06:03 node-0800279564ad sh[27684]: E0306 06:06:03.905900 13
> slave.cpp:3134] Failed to unmonitor container for executor
> busybox.d4ef22c6-c3c6-11e4-a31a-56847afe9799 of framework
> 20150306-054714-24707342-5050-1-0000: Not monitored
>
> Could the failure be related to my running mesos-slave as a container
> here?
>
> Thanks,
> ./Siva.
>
> On Mon, Mar 9, 2015 at 10:51 AM, Sivaram Kannan <[email protected]>
> wrote:
>
>>
>> Hi Vinod,
>>
>> Thanks, I got it. I guess I did not understand the relationship between
>> the principals defined in authentication and those in authorization. I
>> re-read the descriptions of the authentication and credentials flags, and
>> it is not clear from them that the principals defined in authorization
>> must match for things to work correctly. If I can, I will make the
>> documentation clearer and submit a PR.
>>
>> Thanks,
>> ./Siva.
>>
>> On Mon, Mar 9, 2015 at 2:18 AM, Vinod Kone <[email protected]> wrote:
>>
>>> The principal used for authenticating the framework is the same
>>> principal used to authorize the framework too. So you need to use
>>> 'marathon' in your credentials too. In other words, when you start the
>>> framework the Credential.principal should be the same as
>>> FrameworkInfo.principal (Mesos master will validate this).
>>>
>>> On Sun, Mar 8, 2015 at 10:48 AM, Sivaram Kannan <[email protected]>
>>> wrote:
>>>
>>>> I0308 17:41:14.876610 6 master.cpp:1342] Authorizing framework
>>>> principal 'user1' to receive offers for role 'apps'
>>>>
>>>
>>> As you can see from this line, the master is trying to authorize
>>> principal 'user1' and not 'marathon'.
>>>
>>
>>
>>
>> --
>> ever tried. ever failed. no matter.
>> try again. fail again. fail better.
>> -- Samuel Beckett
>>
>