[jira] [Commented] (MESOS-5439) registerExecutor problem

2016-05-31 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308814#comment-15308814
 ] 

Gilbert Song commented on MESOS-5439:
-------------------------------------

Hi [~wnghksrla001], are you saying that only the interval between 'Forked 
child with pid' and 'Got registration for executor' is slow, or that all of 
the agent's logging is slow? If it is the former, the delay may be related to 
the executor.

Normally this step should be pretty quick. You can test it by launching some 
similar tasks using mesos-execute with the command executor.
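
To check whether only that interval is slow, the gap can be measured directly from the agent log. The following is a minimal sketch, not part of Mesos: the function names `parse_glog_line` and `registration_delay` are hypothetical, and because glog timestamps omit the year, the `year` parameter is an assumption supplied by the caller.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+\] (.*)$")


def parse_glog_line(line, year=2016):
    """Return (timestamp, message) for a glog line, or None if it does not match.

    The year is an assumption: glog does not record it.
    """
    match = GLOG_RE.match(line)
    if match is None:
        return None
    mmdd, clock, message = match.groups()
    stamp = datetime.strptime(
        "%d%s %s" % (year, mmdd, clock), "%Y%m%d %H:%M:%S.%f")
    return stamp, message


def registration_delay(lines, year=2016):
    """Seconds between 'Forked child with pid' and the next
    'Got registration for executor' message, or None if not found."""
    forked = None
    for line in lines:
        parsed = parse_glog_line(line, year)
        if parsed is None:
            continue
        stamp, message = parsed
        if message.startswith("Forked child with pid"):
            forked = stamp
        elif (message.startswith("Got registration for executor")
              and forked is not None):
            return (stamp - forked).total_seconds()
    return None


# The two relevant lines from the log quoted in this issue.
sample = [
    "I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid "
    "'1837' for container '1c830c9a-4120-4ef0-af80-49a52d307539'",
    "I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for "
    "executor 'default' of framework "
    "3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from "
    "executor(1)@192.168.0.8:56508",
]
delay = registration_delay(sample)
print("%.3f s" % delay)  # prints 5.006 s
```

Running the same function over a log from a mesos-execute run would show whether the delay is specific to the custom executor.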

> registerExecutor problem
> ------------------------
>
> Key: MESOS-5439
> URL: https://issues.apache.org/jira/browse/MESOS-5439
> Project: Mesos
> Issue Type: Bug
> Components: c++ api, slave
> Affects Versions: 0.27.0
> Reporter: kimjoohwan
>
> Currently, we are using Mesos 0.27.0. The master runs on an Intel(R) 
> Core(TM) i5-3470 CPU @ 3.20GHz with 4GB of RAM. The slave (Banana PI) has a 
> dual-core Cortex-A7 CPU and 1GB of RAM.
> Using the Mesos API, we have developed and run a Python-based framework.
> However, we found that too much time (about 5 seconds) passes between the 
> messages 'Forked child with pid' and 'Got registration for executor' in the 
> slave log.
> If you know how to deal with this problem, please let us know.
> I0523 17:38:16.264289  1787 slave.cpp:5208] Launching executor default of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 with resources  in work directory '/tmp/mesos/slaves/3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2/frameworks/3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010/executors/default/runs/1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:16.290601  1789 containerizer.cpp:616] Starting container '1c830c9a-4120-4ef0-af80-49a52d307539' for executor 'default' of framework '3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010'
> I0523 17:38:16.293285  1787 slave.cpp:1626] Queuing task '0' for executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010
> I0523 17:38:16.297369  1787 slave.cpp:4233] Current disk usage 2.14%. Max allowed age: 6.150293798159722days
> I0523 17:38:16.504043  1789 launcher.cpp:132] Forked child with pid '1837' for container '1c830c9a-4120-4ef0-af80-49a52d307539'
> I0523 17:38:21.510535  1785 slave.cpp:2573] Got registration for executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from executor(1)@192.168.0.8:56508
> I0523 17:38:21.554608  1785 slave.cpp:1791] Sending queued task '0' to executor 'default' of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 at executor(1)@192.168.0.8:56508
> I0523 17:38:21.594511  1789 slave.cpp:2932] Handling status update TASK_RUNNING (UUID: cd04ec2a-0e68-460a-ad2e-e4f504f3b032) for task 0 of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from executor(1)@192.168.0.8:56508
> I0523 17:38:21.600050  1789 slave.cpp:2932] Handling status update TASK_FINISHED (UUID: 46e110c8-4078-4f98-ae30-30b3a1376034) for task 0 of framework 3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010 from executor(1)@192.168.0.8:56508



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5439) registerExecutor problem

2016-05-29 Thread kimjoohwan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305929#comment-15305929
 ] 

kimjoohwan commented on MESOS-5439:
-----------------------------------

Hello Joseph, thank you for your comments.

1. How many tasks are you launching at once? (i.e. from a single offer) And how 
many over a given time?
I am using this framework:

#!/usr/bin/env python

import os
import sys
import time
import datetime

import mesos.interface
from mesos.interface import mesos_pb2
import mesos.native

TOTAL_TASKS = 32

TASK_CPUS = 1
TASK_MEM = 350

class TestScheduler(mesos.interface.Scheduler):
    def __init__(self, implicitAcknowledgements, executor):
        self.implicitAcknowledgements = implicitAcknowledgements
        self.executor = executor
        self.taskData = {}
        self.tasksLaunched = 0
        self.tasksFinished = 0
        self.messagesSent = 0
        self.messagesReceived = 0
        self.result = " "
        self.data = " "
        self.tasks = []
        self.start = " "
        self.end = " "
        self.finish = " "
        self.time1 = {}
        self.time2 = {}
        self.time3 = {}
        self.time4 = {}
        self.time5 = {}
        self.time6 = {}
        self.time7 = {}
        self.time8 = {}
        self.time0 = {}
        self.count = 0
        self.count2 = 0

    def work1(self, offer):
        tid = self.tasksLaunched
        self.tasksLaunched += 1
        print "Launching egrep_task %d using offer %s " \
              % (tid, offer.hostname)
        task = mesos_pb2.TaskInfo()
        task.task_id.value = str(tid)
        task.slave_id.value = offer.slave_id.value
        task.name = "task %d" % tid
        # Use the executor passed to __init__ rather than relying on a
        # module-level name. Each task gets its own executor ID.
        self.executor.executor_id.value = str(tid)
        self.executor.command.value = os.path.abspath("./work1-executor")
        task.executor.MergeFrom(self.executor)

        cpus = task.resources.add()
        cpus.name = "cpus"
        cpus.type = mesos_pb2.Value.SCALAR
        cpus.scalar.value = TASK_CPUS

        mem = task.resources.add()
        mem.name = "mem"
        mem.type = mesos_pb2.Value.SCALAR
        mem.scalar.value = TASK_MEM

        return task

    def work2(self, offer):
        tid = self.tasksLaunched
        self.tasksLaunched += 1

        print "Launching wc_task %d using offer %s" \
              % (tid, offer.hostname)
        task = mesos_pb2.TaskInfo()
        task.task_id.value = str(tid)
        task.slave_id.value = offer.slave_id.value
        task.name = "task %d" % tid
        self.executor.executor_id.value = str(tid)
        self.executor.command.value = os.path.abspath("./work2-executor")
        task.executor.MergeFrom(self.executor)

        cpus = task.resources.add()
        cpus.name = "cpus"
        cpus.type = mesos_pb2.Value.SCALAR
        cpus.scalar.value = TASK_CPUS

        mem = task.resources.add()
        mem.name = "mem"
        mem.type = mesos_pb2.Value.SCALAR
        mem.scalar.value = TASK_MEM

        print "work2"

        return task

    def work3(self, offer):
        tid = self.tasksLaunched
        self.tasksLaunched += 1
        print "Launching egrep_task %d using offer %s" \
              % (tid, offer.hostname)
        task = mesos_pb2.TaskInfo()
        task.task_id.value = str(tid)
        task.slave_id.value = offer.slave_id.value
        task.name = "task %d" % tid
        self.executor.executor_id.value = str(tid)
        self.executor.command.value = os.path.abspath("./work3-executor")
        task.executor.MergeFrom(self.executor)

        cpus = task.resources.add()
        cpus.name = "cpus"
        cpus.type = mesos_pb2.Value.SCALAR
        cpus.scalar.value = TASK_CPUS

        mem = task.resources.add()
        mem.name = "mem"
        mem.type = mesos_pb2.Value.SCALAR
        mem.scalar.value = TASK_MEM

        return task

    def work4(self, offer):
        tid = self.tasksLaunched
        self.tasksLaunched += 1

        print "Launching wc_task %d using offer %s" \
              % (tid, offer.hostname)
        task = mesos_pb2.TaskInfo()
        task.task_id.value = str(tid)
        task.slave_id.value = offer.slave_id.value
        task.name = "task %d" % tid
        self.executor.executor_id.value = str(tid)
        self.executor.command.value = os.path.abspath("./work4-executor")
        task.executor.MergeFrom(self.executor)

        cpus = task.resources.add()
        cpus.name = "cpus"
        cpus.type = mesos_pb2.Value.SCALAR
        cpus.scalar.value = TASK_CPUS

        mem = task.resources.add()
        mem.name = "mem"
        mem.type = mesos_pb2.Value.SCALAR
        mem.scalar.value = TASK_MEM

        print "work2"

        return task

    def registered(self, driver, frameworkId, masterInfo):
        print "Registered with framework ID %s" % frameworkId.value
        self.start = datetime.datetime.now()

def 

[jira] [Commented] (MESOS-5439) registerExecutor problem

2016-05-23 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296759#comment-15296759
 ] 

Joseph Wu commented on MESOS-5439:
----------------------------------

A couple questions:
* How many tasks are you launching at once?  (i.e. from a single offer)  And 
how many over a given time?
* Are you using the default command executor?  Or are you launching a custom 
executor?
* What flags are you using to launch the agent?
* What do the executor's stdout/stderr files (in the sandbox) say?  There 
should be glog logs in there too.
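
For the last question, the executor's stdout and stderr files sit in the sandbox, which is the work directory printed in the agent's "Launching executor" log line. The sketch below is only an illustration: the helper names `sandbox_dir`, `tail`, and `executor_output` are hypothetical, and the directory layout is inferred from the path shown in the agent log for this issue rather than taken from any Mesos API.

```python
import os


def sandbox_dir(work_dir, agent_id, framework_id, executor_id, container_id):
    """Build the sandbox path using the layout seen in the agent log:
    <work_dir>/slaves/<agent>/frameworks/<framework>/executors/<executor>/runs/<container>
    """
    return os.path.join(
        work_dir, "slaves", agent_id, "frameworks", framework_id,
        "executors", executor_id, "runs", container_id)


def tail(path, max_lines=20):
    """Return the last max_lines lines of a file, or [] if it does not exist."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return handle.read().splitlines()[-max_lines:]


def executor_output(sandbox, max_lines=20):
    """Collect the tail of the executor's stdout and stderr files."""
    return dict(
        (name, tail(os.path.join(sandbox, name), max_lines))
        for name in ("stdout", "stderr"))


# IDs copied from the agent log quoted in this issue.
sandbox = sandbox_dir(
    "/tmp/mesos",
    "3fb86eea-96c4-4b07-aaa2-caf071275bdf-S2",
    "3fb86eea-96c4-4b07-aaa2-caf071275bdf-0010",
    "default",
    "1c830c9a-4120-4ef0-af80-49a52d307539")
for name, lines in sorted(executor_output(sandbox).items()):
    print("%s: %d line(s)" % (name, len(lines)))
```

Timestamps in those files, compared against the agent log, would show how long the executor takes to start before it registers.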
