Re: Re: Re: Status update: task 1 is in state TASK_ERROR

2018-03-16 Thread Benjamin Mahler
What kind of tasks are you trying to run?

If you want to run commands or containers, you can just use the built-in
DEFAULT executor:
https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L713-L725

If you need a custom executor because your tasks are not commands or
containers, then you can implement your own custom executor:
https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L727-L730

In the latter case, you will have to implement your own executor or use an
existing third party executor. If implementing your own, you need to speak
the v1 protocol to the agent. We maintain a listing of known executor API
libraries here:
http://mesos.apache.org/documentation/latest/api-client-libraries/#executor-api
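
As a rough illustration of what "speaking the v1 protocol" starts with, here is a minimal sketch of the first message a custom executor sends: a SUBSCRIBE call. It assumes the agent exports MESOS_FRAMEWORK_ID, MESOS_EXECUTOR_ID and MESOS_AGENT_ENDPOINT into the executor's environment and accepts the call as JSON on /api/v1/executor, as the Executor HTTP API documentation describes. The object and method names (ExecutorSubscribe, subscribeBody) are made up for this example, and the HTTP POST itself is left out. An executor that never sends this call is one plausible cause of "Executor did not register within 1mins".

```scala
// Sketch only: builds the JSON body of a v1 executor SUBSCRIBE call.
// Assumption: the IDs contain no characters that need JSON escaping.
object ExecutorSubscribe {
  /** Returns the SUBSCRIBE body, or None if a required variable is missing. */
  def subscribeBody(env: Map[String, String]): Option[String] =
    for {
      frameworkId <- env.get("MESOS_FRAMEWORK_ID")
      executorId  <- env.get("MESOS_EXECUTOR_ID")
    } yield {
      s"""{"type":"SUBSCRIBE",""" +
        s""""framework_id":{"value":"$frameworkId"},""" +
        s""""executor_id":{"value":"$executorId"},""" +
        """"subscribe":{"unacknowledged_tasks":[],"unacknowledged_updates":[]}}"""
    }

  def main(args: Array[String]): Unit = {
    // A real executor would POST this body to
    // http://$MESOS_AGENT_ENDPOINT/api/v1/executor and keep the response
    // stream open to receive events from the agent.
    println(subscribeBody(sys.env).getOrElse("not launched by a Mesos agent"))
  }
}
```

One of the listed executor API libraries would normally handle this handshake, so hand-rolling it is only worthwhile for understanding the protocol.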

On Thu, Mar 15, 2018 at 2:32 AM, 罗 辉  wrote:

> Hi guys:
>
> For more info, my framework app’s log and master/agent logs are attached.
>
> My app fails as shown at the end of the log:
>
> The message of current task is :Executor did not register within 1mins
>
> Status update: task 1 is in state TASK_FAILED
>
> Aborting because task 1 is in unexpected state TASK_FAILED with reason
> 'REASON_EXECUTOR_REGISTRATION_TIMEOUT' from source 'SOURCE_AGENT' with
> message 'Executor did not register within 1mins'
>
>
>
> My thoughts on this failure:
>
> 1. I guess there should be a V1 executor class with a register method
> that registers the executor with the agent?
>
> 2. I studied the V0 executor implementation and tried to implement a V1
> executor, which is supposed to extend the executor interface and
> implement its abstract methods, including register, reregister, etc.
> However, I didn't find a V1 executor interface in the Java API. Does that
> mean I am heading in the wrong direction?
>
>
>
> In short, any ideas about the REASON_EXECUTOR_REGISTRATION_TIMEOUT
> failure?
>
>
>
> San
>
>
>
> *From:* 罗 辉
> *Sent:* March 14, 2018 15:29
> *To:* user
> *Subject:* Re: Re: Status update: task 1 is in state TASK_ERROR
>
>
>
> Thanks Benjamin,
>
> I tried to understand the missing reservation metadata and looked up the
> relevant docs about resource reservation, but I didn't find much
> documentation about it.
>
> I solved this problem by adding a method like the one below to my scheduler:
>
>   def launchTask(offer: Offer, task: TaskInfo): Call = {
>     // Accept the offer with a single LAUNCH operation for the given task.
>     Call.newBuilder()
>       .setFrameworkId(frameworkId)
>       .setType(Call.Type.ACCEPT)
>       .setAccept(
>         Call.Accept.newBuilder()
>           .addOfferIds(offer.getId)
>           .addOperations(
>             Offer.Operation.newBuilder()
>               .setType(Offer.Operation.Type.LAUNCH)
>               .setLaunch(
>                 Offer.Operation.Launch.newBuilder()
>                   .addTaskInfos(task))))
>       .build()
>   }
>
>
>
> After that I hit another problem: my task always stays in staging and
> terminates after 1 minute due to a timeout. I think a scheduler app
> involves many small steps, including callbacks, such as connecting,
> registering, getting the offer list, accepting an offer, etc. Is there a
> detailed programming guide for V1 framework development?
>
>
>
> Thank you.
>
>
>
>
>
> San
>
>
> --
>
> *From:* Benjamin Mahler
> *Sent:* March 10, 2018 9:00:55
> *To:* user
> *Subject:* Re: Re: Status update: task 1 is in state TASK_ERROR
>
>
>
> The message clarifies it, the task+executor have some unreserved
> resources:
>
> cpus(allocated: controller):6; mem(allocated: controller):8000
>
>
>
> But the resources offered were reserved:
>
> cpus(allocated: controller)(reservations: [(STATIC,controller)]):6;
> mem(allocated: controller)(reservations: [(STATIC,controller)]):8000; +
> disk + ports
>
>
>
> The scheduler needs to provide resources that are contained in the offer,
> in this case it needs to include the missing reservation metadata.
>
>
>
> On Thu, Mar 8, 2018 at 6:57 PM, 罗 辉  wrote:
>
> Yes, I modified my code as below:
>
>   def acknowledgeTaskMessage(taskStatus: TaskStatus): String = {
>     taskStatus.getMessage
>   }
>
>   def update(mesos: Mesos, status: TaskStatus) = {
>     val message = acknowledgeTaskMessage(status)
>     println("The message of current task is :" + message)
>     println("Status update: task " + status.getTaskId().getValue() +
>       " is in state " + status.getState().getValueDescriptor().getName())
>
>
> ..
>
>
>
> And I got the log below, starting at line 231 of the attached file:
>
> 231 Received an UPDATE event
> 232 The message of current task is :Total resources cpus(allocated:
> controller):6; mem(allocated: c

Re: Re: Status update: task 1 is in state TASK_ERROR

2018-03-14 Thread 罗 辉
Thanks Benjamin,

I tried to understand the missing reservation metadata and looked up the 
relevant docs about resource reservation, but I didn't find much documentation 
about it.

I solved this problem by adding a method like the one below to my scheduler:

  def launchTask(offer: Offer, task: TaskInfo): Call = {
    // Accept the offer with a single LAUNCH operation for the given task.
    Call.newBuilder()
      .setFrameworkId(frameworkId)
      .setType(Call.Type.ACCEPT)
      .setAccept(
        Call.Accept.newBuilder()
          .addOfferIds(offer.getId)
          .addOperations(
            Offer.Operation.newBuilder()
              .setType(Offer.Operation.Type.LAUNCH)
              .setLaunch(
                Offer.Operation.Launch.newBuilder()
                  .addTaskInfos(task))))
      .build()
  }

After that I hit another problem: my task always stays in staging and 
terminates after 1 minute due to a timeout. I think a scheduler app involves 
many small steps, including callbacks, such as connecting, registering, getting 
the offer list, accepting an offer, etc. Is there a detailed programming guide 
for V1 framework development?

Thank you.



San



From: Benjamin Mahler
Sent: March 10, 2018 9:00:55
To: user
Subject: Re: Re: Status update: task 1 is in state TASK_ERROR

The message clarifies it, the task+executor have some unreserved resources:
cpus(allocated: controller):6; mem(allocated: controller):8000

But the resources offered were reserved:
cpus(allocated: controller)(reservations: [(STATIC,controller)]):6; 
mem(allocated: controller)(reservations: [(STATIC,controller)]):8000; + disk + 
ports

The scheduler needs to provide resources that are contained in the offer, in 
this case it needs to include the missing reservation metadata.

On Thu, Mar 8, 2018 at 6:57 PM, 罗 辉 <luo...@zetyun.com> wrote:

Yes, I modified my code as below:

  def acknowledgeTaskMessage(taskStatus: TaskStatus): String = {
    taskStatus.getMessage
  }

  def update(mesos: Mesos, status: TaskStatus) = {
    val message = acknowledgeTaskMessage(status)
    println("The message of current task is :" + message)
    println("Status update: task " + status.getTaskId().getValue() +
      " is in state " + status.getState().getValueDescriptor().getName())

..

And I got the log below, starting at line 231 of the attached file:
231 Received an UPDATE event
232 The message of current task is :Total resources cpus(allocated: 
controller):6; mem(allocated: controller):8000 required by task and its 
executor is more than available cpus(allocated: controller)(reservations: 
[(STATIC,controller)]):6; mem(allocated: controller)(reservations: 
[(STATIC,controller)]):8000; disk(allocated: controller)(reservations: 
[(STATIC,controller)]):550264; ports(allocated: controller):[31000-32000]
233 Status update: task 1 is in state TASK_ERROR



罗辉

Infrastructure


From: Benjamin Mahler <bmah...@apache.org>
Sent: March 9, 2018 9:24:37
To: user
Subject: Re: Status update: task 1 is in state TASK_ERROR

Can you log the message provided in the TaskStatus?

https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L2424




On Wed, Mar 7, 2018 at 11:23 PM, 罗 辉 <luo...@zetyun.com> wrote:

Hi guys:

I have a Mesos test app, very similar to

https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/v1/scheduler/V1Mesos.java




just to run a simple task, "free -m". The app cannot run the task successfully 
and always logs:

Received an UPDATE event
Status update: task 1 is in state TASK_ERROR


I checked the logs, but there are no errors in mesos-master.ERROR or 
mesos-agent.ERROR; only mesos-master.INFO shows:

W0307 17:55:28.180716 29438 validation.cpp:1298] Executor 'default' for task 
'1' uses less CPUs (None) than the minimum required (0.01). Please update your 
executor, as this will be mandatory in future releases.
W0307 17:55:28.180766 29438 validation.cpp:1310] Executor 'default' for task 
'1' uses less memory (None) than the minimum required (32MB). Please update 
your executor, as this will be mandatory in future releases.
  Following this log, I didn't find a way to set the executor's resources, or a 
similar code example.

  Why does my little app always fail? Thanks for any ideas.



San




Re: Re: Status update: task 1 is in state TASK_ERROR

2018-03-09 Thread Benjamin Mahler
The message clarifies it, the task+executor have some unreserved resources:
cpus(allocated: controller):6; mem(allocated: controller):8000

But the resources offered were reserved:
cpus(allocated: controller)(reservations: [(STATIC,controller)]):6;
mem(allocated: controller)(reservations: [(STATIC,controller)]):8000; +
disk + ports

The scheduler needs to provide resources that are contained in the offer,
in this case it needs to include the missing reservation metadata.
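
To make the mismatch concrete, here is a small self-contained sketch. It does not use the real Mesos protobuf classes: Resource, Reservation, fits and take are simplified stand-ins invented for this example, and the real validation in the master is more involved. The idea it illustrates is only that a request is satisfied by an offered resource when the reservation metadata matches as well as the name and amount, so the safest way to build task resources is to derive them from the offer.

```scala
// Simplified model of resources; NOT the real Mesos protobuf classes.
final case class Reservation(kind: String, role: String)
final case class Resource(name: String, amount: Double,
                          reservations: List[Reservation] = Nil)

object ReservationMatch {
  /** A request fits only if some offered resource has the same name and the
    * same reservation metadata, and offers at least the requested amount. */
  def fits(requested: Resource, offered: Seq[Resource]): Boolean =
    offered.exists(o =>
      o.name == requested.name &&
      o.reservations == requested.reservations &&
      o.amount >= requested.amount)

  /** Carve `amount` out of an offered resource, keeping its metadata. */
  def take(offered: Seq[Resource], name: String, amount: Double): Option[Resource] =
    offered.find(o => o.name == name && o.amount >= amount)
      .map(_.copy(amount = amount))

  def main(args: Array[String]): Unit = {
    val static = List(Reservation("STATIC", "controller"))
    val offer  = Seq(Resource("cpus", 6, static), Resource("mem", 8000, static))

    // Unreserved request, as in the failing task: prints false.
    println(fits(Resource("cpus", 6), offer))
    // Request derived from the offer keeps the reservation: prints true.
    println(take(offer, "cpus", 6).exists(fits(_, offer)))
  }
}
```

In a real scheduler the equivalent of take would copy the Resource messages from the offer and adjust only the amount, rather than building new ones from scratch.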

On Thu, Mar 8, 2018 at 6:57 PM, 罗 辉  wrote:

> Yes, I modified my code as below:
>
>   def acknowledgeTaskMessage(taskStatus: TaskStatus): String = {
>     taskStatus.getMessage
>   }
>
>   def update(mesos: Mesos, status: TaskStatus) = {
>     val message = acknowledgeTaskMessage(status)
>     println("The message of current task is :" + message)
>     println("Status update: task " + status.getTaskId().getValue() +
>       " is in state " + status.getState().getValueDescriptor().getName())
>
> ..
>
> And I got the log below, starting at line 231 of the attached file:
> 231 Received an UPDATE event
> 232 The message of current task is :Total resources cpus(allocated:
> controller):6; mem(allocated: controller):8000 required by task and its
> executor is more than available cpus(allocated: controller)(reservations:
> [(STATIC,controller)]):6; mem(allocated: controller)(reservations:
> [(STATIC,controller)]):8000; disk(allocated: controller)(reservations:
> [(STATIC,controller)]):550264; ports(allocated:
> controller):[31000-32000]
> 233 Status update: task 1 is in state TASK_ERROR
>
>
>
> 罗辉
>
> Infrastructure
> --------------
> *From:* Benjamin Mahler
> *Sent:* March 9, 2018 9:24:37
> *To:* user
> *Subject:* Re: Status update: task 1 is in state TASK_ERROR
>
> Can you log the message provided in the TaskStatus?
>
> https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L2424
>
> On Wed, Mar 7, 2018 at 11:23 PM, 罗 辉  wrote:
>
> Hi guys:
>
> I have a Mesos test app, very similar to
>
> https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/v1/scheduler/V1Mesos.java
>
> just to run a simple task, "free -m". The app cannot run the task
> successfully and always logs:
>
> Received an UPDATE event
> Status update: task 1 is in state TASK_ERROR
>
>
> I checked the logs, but there are no errors in mesos-master.ERROR or
> mesos-agent.ERROR; only mesos-master.INFO shows:
>
> W0307 17:55:28.180716 29438 validation.cpp:1298] Executor 'default' for
> task '1' uses less CPUs (None) than the minimum required (0.01). Please
> update your executor, as this will be mandatory in future releases.
> W0307 17:55:28.180766 29438 validation.cpp:1310] Executor 'default' for
> task '1' uses less memory (None) than the minimum required (32MB). Please
> update your executor, as this will be mandatory in future releases.
>   Following this log, I didn't find a way to set the executor's
> resources, or a similar code example.
>
>   Why does my little app always fail? Thanks for any ideas.
>
>
> San
>
>
>


Re: Status update: task 1 is in state TASK_ERROR

2018-03-08 Thread Benjamin Mahler
Can you log the message provided in the TaskStatus?

https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L2424

On Wed, Mar 7, 2018 at 11:23 PM, 罗 辉  wrote:

> Hi guys:
>
> I have a Mesos test app, very similar to
>
> https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/v1/scheduler/V1Mesos.java
>
> just to run a simple task, "free -m". The app cannot run the task
> successfully and always logs:
>
> Received an UPDATE event
> Status update: task 1 is in state TASK_ERROR
>
>
> I checked the logs, but there are no errors in mesos-master.ERROR or
> mesos-agent.ERROR; only mesos-master.INFO shows:
>
> W0307 17:55:28.180716 29438 validation.cpp:1298] Executor 'default' for
> task '1' uses less CPUs (None) than the minimum required (0.01). Please
> update your executor, as this will be mandatory in future releases.
> W0307 17:55:28.180766 29438 validation.cpp:1310] Executor 'default' for
> task '1' uses less memory (None) than the minimum required (32MB). Please
> update your executor, as this will be mandatory in future releases.
>   Following this log, I didn't find a way to set the executor's
> resources, or a similar code example.
>
>   Why does my little app always fail? Thanks for any ideas.
>
>
> San
>