Re: Authentication module

2016-12-04 Thread Vinod Kone
Authentication is enabled for Mesos APIs used by schedulers (to talk to
master), operators (to talk to master/agent) and agents (to talk to
master). Executor to agent communication is not currently authenticated.

This might shed some light:
https://github.com/apache/mesos/blob/master/docs/authentication.md
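
For readers who want to try framework authentication, here is a minimal sketch that generates a credentials file. It assumes the JSON layout described in the authentication document linked above (a top-level `credentials` list of `principal`/`secret` pairs). The function name `build_credentials` and the example principal and secret are made up for illustration. How the file is passed to the master (for example via a `--credentials` flag) should be checked against that document for your Mesos version.

```python
import json


def build_credentials(pairs):
    """Return JSON text for a credentials file from (principal, secret) pairs.

    Assumption: the expected layout is {"credentials": [{"principal", "secret"}]}.
    """
    entries = [{"principal": principal, "secret": secret}
               for principal, secret in pairs]
    return json.dumps({"credentials": entries}, indent=2)


# Hypothetical principal and secret, for illustration only.
print(build_credentials([("example-framework", "example-secret")]))
```

Generating the file from code rather than by hand makes it harder to end up with malformed JSON, which is otherwise only noticed when the file is loaded.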

On Fri, Dec 2, 2016 at 11:48 AM, Alexander Gallego wrote:

>
> For the authentication module:
> http://mesos.apache.org/documentation/latest/modules/
> does it mean Kerberos, LDAP, etc. for tasks, for framework registration,
> or for machine registration?
>
> Are there any more docs on this?
>
>
>


Re: Failure reason documentation

2016-12-04 Thread Erik Weathers
Thanks haosdent!



Re: Failure reason documentation

2016-12-04 Thread haosdent
Oh, sorry for misunderstanding the question. As far as I know, there is
no documentation for that. We should add comments to the values of the
Reason enum. I created a ticket at
https://issues.apache.org/jira/browse/MESOS-6686 to track it.
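
Until those comments exist, one rough way to see which values are still undocumented is to scan the enum for entries with no comment line directly above them. The sketch below is only an illustration: `SAMPLE_PROTO` is a hand-written stand-in rather than a copy of mesos.proto, its numeric values and comments are placeholders, and `undocumented_reasons` is a hypothetical helper name.

```python
import re

# Hand-written stand-in for a fragment of mesos.proto. The values and
# comments are placeholders, not the contents of the real file.
SAMPLE_PROTO = """
enum Reason {
  // Placeholder comment: the command executor exited abnormally.
  REASON_COMMAND_EXECUTOR_FAILED = 0;
  REASON_CONTAINER_LAUNCH_FAILED = 21;
  // Placeholder comment: the container exceeded its memory limit.
  REASON_CONTAINER_LIMITATION_MEMORY = 8;
}
"""


def undocumented_reasons(proto_text):
    """Return REASON_* names that have no comment on the preceding line."""
    missing = []
    previous = ""
    for line in proto_text.splitlines():
        stripped = line.strip()
        match = re.match(r"(REASON_\w+)\s*=\s*\d+;", stripped)
        if match and not previous.startswith("//"):
            missing.append(match.group(1))
        if stripped:
            # Remember the last non-blank line to check for a comment.
            previous = stripped
    return missing


print(undocumented_reasons(SAMPLE_PROTO))
```

The same function could be pointed at the text of the real file to produce a checklist for the ticket.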

On Mon, Dec 5, 2016 at 2:27 AM, Erik Weathers  wrote:

> I think he's looking for documentation about what precisely each reason
> *means*. A la how there are comments beside the TaskState list in
> mesos.proto.
>
> - Erik
>
> On Sun, Dec 4, 2016 at 10:07 AM haosdent  wrote:
>
> Hi @Wil You could find them here
> https://github.com/apache/mesos/blob/1.1.0/include/mesos/mesos.proto#L1577-L1609
>
> On Sat, Dec 3, 2016 at 6:09 AM, Wil Yegelwel  wrote:
>
> No I'm referring to the values of the enum Reason.
>
> On Fri, Dec 2, 2016, 4:52 PM Tomek Janiszewski  wrote:
>
> Hi
>
> Are you referring to task state? If yes then take a look at comments in
> proto
> https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L1552
> http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.TaskState.html
>
> Best
>
> Tomek
>
> On Fri, Dec 2, 2016, 21:31, Wil Yegelwel wrote:
>
> Hey mesos users!
>
> I can't seem to find any documentation about the various reasons mesos
> includes when a job fails. Is there a place that describes what the reasons
> mean?
>
> Thanks,
> Wil
>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>
>


-- 
Best Regards,
Haosdent Huang


Re: Failure reason documentation

2016-12-04 Thread Erik Weathers
I think he's looking for documentation about what precisely each reason
*means*. A la how there are comments beside the TaskState list in
mesos.proto.

- Erik

On Sun, Dec 4, 2016 at 10:07 AM haosdent  wrote:

Hi @Wil You could find them here
https://github.com/apache/mesos/blob/1.1.0/include/mesos/mesos.proto#L1577-L1609

On Sat, Dec 3, 2016 at 6:09 AM, Wil Yegelwel  wrote:

No I'm referring to the values of the enum Reason.

On Fri, Dec 2, 2016, 4:52 PM Tomek Janiszewski  wrote:

Hi

Are you referring to task state? If yes then take a look at comments in
proto
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L1552
http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.TaskState.html

Best

Tomek

On Fri, Dec 2, 2016, 21:31, Wil Yegelwel wrote:

Hey mesos users!

I can't seem to find any documentation about the various reasons mesos
includes when a job fails. Is there a place that describes what the reasons
mean?

Thanks,
Wil




-- 
Best Regards,
Haosdent Huang


Re: Failure reason documentation

2016-12-04 Thread haosdent
Hi @Wil, you can find them here:
https://github.com/apache/mesos/blob/1.1.0/include/mesos/mesos.proto#L1577-L1609

On Sat, Dec 3, 2016 at 6:09 AM, Wil Yegelwel  wrote:

> No I'm referring to the values of the enum Reason.
>
> On Fri, Dec 2, 2016, 4:52 PM Tomek Janiszewski  wrote:
>
>> Hi
>>
>> Are you referring to task state? If yes then take a look at comments in
>> proto
>> https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L1552
>> http://mesos.apache.org/api/latest/java/org/apache/mesos/Protos.TaskState.html
>>
>> Best
>>
>> Tomek
>>
>> On Fri, Dec 2, 2016, 21:31, Wil Yegelwel wrote:
>>
>> Hey mesos users!
>>
>> I can't seem to find any documentation about the various reasons mesos
>> includes when a job fails. Is there a place that describes what the reasons
>> mean?
>>
>> Thanks,
>> Wil
>>
>>


-- 
Best Regards,
Haosdent Huang


Re: MESOS-6233 Allow agents to re-register post a host reboot

2016-12-04 Thread haosdent
> we can have the agent remove `rm -f /meta/slaves/latest`
> automatically upon recovery failure but only after the host has rebooted.

This sounds dangerous. When the difference in AgentInfo is caused by an
operator's typo, I think the operator would prefer to correct it and try
to start the agent again, rather than have the state removed automatically.

But if we decide to do that, please make sure to announce this behavior
change to the mailing lists in a separate email. Thank you!
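
For reference, the behavior under discussion could look roughly like the sketch below: drop the `latest` symlink only when recovery failed and the host has rebooted since the state was checkpointed. Everything here is an assumption for illustration, not the agent's actual code: the function name `reset_after_reboot`, the idea of comparing a saved boot ID file named `boot_id` against the current one, and the temporary directory standing in for the meta directory.

```python
import os
import tempfile


def reset_after_reboot(meta_dir, current_boot_id, recovery_failed):
    """Remove the 'latest' symlink only if recovery failed after a reboot.

    Returns True when the symlink was removed. All paths are hypothetical.
    """
    latest = os.path.join(meta_dir, "slaves", "latest")
    try:
        with open(os.path.join(meta_dir, "boot_id")) as f:
            saved_boot_id = f.read().strip()
    except FileNotFoundError:
        # No saved boot ID: stay conservative and keep the state.
        return False
    rebooted = saved_boot_id != current_boot_id
    if recovery_failed and rebooted and os.path.islink(latest):
        os.remove(latest)
        return True
    return False


def demo():
    """Build a throwaway directory layout and try both cases."""
    with tempfile.TemporaryDirectory() as meta_dir:
        slaves = os.path.join(meta_dir, "slaves")
        os.makedirs(os.path.join(slaves, "agent-1"))
        os.symlink(os.path.join(slaves, "agent-1"),
                   os.path.join(slaves, "latest"))
        with open(os.path.join(meta_dir, "boot_id"), "w") as f:
            f.write("boot-A")
        same_boot = reset_after_reboot(meta_dir, "boot-A", recovery_failed=True)
        new_boot = reset_after_reboot(meta_dir, "boot-B", recovery_failed=True)
        return same_boot, new_boot


print(demo())
```

Note that this check cannot tell a hardware change from a typo in the agent's flags after a reboot, which is the case the concern above is about.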

On Wed, Nov 30, 2016 at 6:24 AM, tommy xiao  wrote:

> Agree with James's opinions.
>
> 2016-11-30 0:48 GMT+08:00 James Peach :
>
> >
> > > On Nov 28, 2016, at 6:09 PM, Yan Xu  wrote:
> > >
> > > So one thing that was brought up during offline conversations was that
> > > if the host reboot is associated with hardware change (e.g., a new
> > > memory stick):
> > >
> > >   • Currently: the agent would skip the recovery (and the chance of
> > >     running into incompatible agent info) and register as a new agent.
> > >   • With the change: the agent could run into incompatible agent
> > >     info due to resource change and flap indefinitely until the operator
> > >     intervenes.
> > >
> > > To mitigate this and maintain the current behavior, we can have the
> > > agent remove `rm -f /meta/slaves/latest` automatically upon
> > > recovery failure but only after the host has rebooted. This way the agent
> > > can restart as a new agent without operator intervention.
> > >
> > > Any thoughts?
> >
> > I still think you need a mechanism for the master/agent to tell you
> > whether it will honor the restart policy. Without this, you have to lock
> > the framework to a Mesos version.
> >
> > An empty RestartPolicy is also problematic since it precludes using
> > RestartPolicy in pods. If you later want to restart a task inside a pod
> > but not across agent restarts you would have no way to express that.
> >
> > J
>
>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>



-- 
Best Regards,
Haosdent Huang