How to update mesos slave configuration

2015-07-17 Thread Dvorkin-Contractor, Eugene (CORP)
Hi,
I have a small cluster consisting of 3 masters (masters + ZooKeeper) and 3 slave 
nodes.
The cluster is on AWS infrastructure. Originally I had 8GB of storage, which was 
correctly advertised in the Mesos UI.
Now I have added more storage, as an additional drive, and want to tell Mesos to 
use this storage (60GB), mapped to another mount point.
How do I reconfigure my slaves now?

Also, I have created 2 new slaves, but they do not show up in the Mesos UI.
I created the following files:
  /etc/mesos-master/hostname
  /etc/marathon/conf/hostname
and restarted the slave process, but the changes were not reflected in the Mesos UI.
I can confirm that the slave is running OK.
How do I force Mesos to update the list of slave nodes?
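
For reference, a minimal sketch of one way to do this with the Mesosphere packages used 
elsewhere in this digest, where each file under /etc/mesos-slave/ supplies one 
mesos-slave command-line flag. The mount point, disk size (in MB) and the default 
work_dir of /tmp/mesos are assumptions for illustration:

  # Advertise the extra 60GB explicitly; resources not listed are still auto-detected.
  echo "disk:61440" | sudo tee /etc/mesos-slave/resources
  echo "$(hostname -f)" | sudo tee /etc/mesos-slave/hostname

  # A changed SlaveInfo (new resources/hostname) may make recovery of the old
  # checkpointed state fail, so wipe the slave metadata to force a fresh registration.
  sudo service mesos-slave stop
  sudo rm -f /tmp/mesos/meta/slaves/latest
  sudo service mesos-slave start

Alternatively, mounting the new volume at the slave's work_dir lets the default disk 
detection pick it up.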

Thanks

mesos-execute --master=$MASTER --name=cluster-test --command="sleep 5"
I0717 15:13:56.438535  2410 sched.cpp:157] Version: 0.22.1
I0717 15:13:56.440099  2418 sched.cpp:254] New master detected at 
master@172.31.50.60:5050
I0717 15:13:56.440201  2418 sched.cpp:264] No credentials provided. Attempting 
to register without authentication
I0717 15:13:56.443297  2414 sched.cpp:448] Framework registered with 
20150706-201327-1009917868-5050-16293-0019
Framework registered with 20150706-201327-1009917868-5050-16293-0019
task cluster-test submitted to slave 20150706-201327-1009917868-5050-16293-S48
Received status update TASK_RUNNING for task cluster-test
Received status update TASK_FINISHED for task cluster-test
I0717 15:14:01.514215  2415 sched.cpp:1589] Asked to stop the driver
I0717 15:14:01.514241  2415 sched.cpp:831] Stopping framework 
'20150706-201327-1009917868-5050-16293-0019'



Re: Marathon can no longer deploy any apps after a failover

2015-07-17 Thread Maciej Strzelecki
Thanks for the guidelines! I'll try these paths out and join the Marathon 
mailing list (I was oblivious that there was one ;))


Maciej Strzelecki
Operations Engineer
Tel: +49 30 6098381-50
Fax: +49 851-213728-88
E-mail: mstrzele...@crealytics.de
www.crealytics.com
blog.crealytics.com

crealytics GmbH - Semantic PPC Advertising Technology

Brunngasse 1 - 94032 Passau - Germany
Oranienstraße 185 - 10999 Berlin - Germany

Managing directors: Andreas Reiffen, Christof König, Dr. Markus Kurch
Register court: Amtsgericht Passau, HRB 7466
Geschäftsführer: Andreas Reiffen, Christof König, Daniel Trost
Reg.-Gericht: Amtsgericht Passau, HRB 7466


From: Vinod Kone vinodk...@gmail.com
Sent: Thursday, July 16, 2015 7:09 PM
To: user@mesos.apache.org
Subject: Re: Marathon can no longer deploy any apps after a failover

Sounds like a marathon issue. Mind asking in marathon's mailing list?

On Thu, Jul 16, 2015 at 8:02 AM, Nikolay Borodachev 
nbo...@adobe.com wrote:
Maciej,

I had a similar problem, but it got solved by setting the LIBPROCESS_IP environment 
variable to the host IP address for the Marathon process.
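
A minimal sketch of that fix, assuming a Debian-style package whose init script sources 
/etc/default/marathon (the IP address is illustrative); otherwise export the variable 
in whatever wrapper starts Marathon:

  echo 'LIBPROCESS_IP=172.31.50.60' | sudo tee -a /etc/default/marathon
  sudo service marathon restart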

Nikolay


From: Maciej Strzelecki [mailto:maciej.strzele...@crealytics.com]
Sent: Thursday, July 16, 2015 7:30 AM
To: user@mesos.apache.org
Subject: Marathon can no longer deploy any apps after a failover


Problem:



If I restart the current framework leader for Marathon (the host from the active 
frameworks tab in the Mesos UI), a new one is elected after a moment, but any new 
deployments are stuck indefinitely in the 'deploying' state (empty black bar, 0/1 
and hanging - even at debug level I don't see any errors in the Marathon/Mesos logs).

Also, the old tasks are untouchable at that time - yes, they keep running, but I 
can't kill, restart, or scale them.


When that happens I can do the following (a sketch of the commands appears below):

stop Marathon on all masters

remove the framework via a curl to the Mesos API /shutdown endpoint

purge /marathon from the ZooKeeper CLI

restart the Docker service on all slaves (that kills the zombie containers)
restart the mesos-slave service on all slaves (pampering my paranoia here)
Then I can deploy apps again.
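
A rough sketch of those recovery steps; the master host, framework ID, and the path to 
zkCli.sh are assumptions for illustration:

  # On every master:
  sudo service marathon stop

  # Tear down the stuck framework (Mesos 0.22.x exposes this as /master/shutdown):
  curl -X POST http://mesos-master1:5050/master/shutdown \
       -d 'frameworkId=<marathon-framework-id>'

  # Remove Marathon's state from ZooKeeper:
  /usr/share/zookeeper/bin/zkCli.sh -server localhost:2181 rmr /marathon

  # On every slave:
  sudo service docker restart
  sudo service mesos-slave restart

  # Then start Marathon again on the masters:
  sudo service marathon start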



How can I avoid this problem? Are there any basic settings I'm missing? This is scary, 
as the reboot of a single master (out of 3 or 5 servers) freezes everything that 
is deployed using Marathon, and the steps to reclaim control introduce downtime 
for every single app running there.









Configuration:



Running Ubuntu 14.04.2 LTS

mesos   0.22.1-1.0.ubuntu1404

marathon0.9.0-1.0.381.ubuntu1404

chronos 2.3.4-1.0.81.ubuntu1404



The cluster uses 3 masters and 15 slaves. The master machines also run the 
mesos-slave process (although those machines offer only a portion of their 
resources).



The configuration for Mesos/Marathon relies mostly on defaults; the options 
specified are shown below. The quorum is 2.



The Marathon service runs on the 3 master machines.



root@mesos-master1 ~ # tree /etc/marathon/
/etc/marathon/
`-- conf
|-- event_subscriber
|-- framework_name
|-- hostname
|-- logging_level
`-- zk

1 directory, 5 files
root@mesos-master1 ~ # tree /etc/mesos
/etc/mesos
`-- zk

0 directories, 1 file
root@mesos-master1 ~ # tree /etc/mesos-slave/
/etc/mesos-slave/
|-- containerizers
|-- docker_stop_timeout
|-- executor_registration_timeout
|-- executor_shutdown_grace_period
|-- hostname
|-- ip
|-- logging_level
`-- resources

0 directories, 8 files
root@mesos-master1 ~ # tree /etc/mesos-master
/etc/mesos-master
|-- cluster
|-- hostname
|-- ip
|-- logging_level
|-- quorum
`-- work_dir



Re: Mesos-DNS configuration problem with dockerized web application

2015-07-17 Thread Grzegorz Graczyk
It’s not possible now, but hopefully it will be in the future. Related mesos 
issue: https://issues.apache.org/jira/browse/MESOS-2044

 On 16 Jul 2015, at 22:57, Ondrej Smola ondrej.sm...@gmail.com wrote:
 
 I don't think Mesos-DNS can help you in this case ... classic DNS lookups 
 resolve a name to an IP (A, AAAA records) - there is no port lookup (clients 
 use the port you provide). 
 
 https://mesosphere.github.io/marathon/docs/service-discovery-load-balancing.html
 
 is a good starting point.
 
 
 
 
 2015-07-16 22:24 GMT+02:00 Dvorkin-Contractor, Eugene (CORP) 
 eugene.dvorkin-contrac...@adp.com:
 Thanks. I was hoping that Mesos-DNS would do it for me and that I could run services 
 on different ports, even on the same node. I was hesitant to use HAProxy, but I 
 think I have to use HAProxy/Bamboo to achieve this functionality. 
 
 From: Ondrej Smola ondrej.sm...@gmail.com
 Reply-To: user@mesos.apache.org
 Date: Thursday, July 16, 2015 at 2:55 PM
 To: user@mesos.apache.org
 Subject: Re: Mesos-DNS configuration problem with dockerized web application
 
 Hi,
 
 "portMappings": [
   { "containerPort": 8080, "hostPort": 80, "servicePort": 9000, "protocol": "tcp" }
 ]
 
 will work - you need to specify the required port as hostPort.
 The only limitation of this setup is that you won't be able to run multiple 
 services on a single host with the same hostPort (port collision),
 but for most setups you should be OK just choosing random/different ports for 
 different services, or ensuring there are more nodes than requested instances 
 with the same port.
 If you want to use a random port, you will need some logic to query DNS, parse 
 the SRV records and, for example, set up HAProxy with the correctly assigned 
 ports (see the dig sketch below).
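 
 A minimal sketch of such an SRV lookup (the service name matches the example app in 
 this thread; the exact answer format depends on the Mesos-DNS version):
 
   dig _slick-swagger-demo._tcp.marathon.mesos SRV +short
   # e.g. "0 0 31990 slick-swagger-demo-15491-s42.marathon.mesos."  ->  port 31990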
 
 This problem can also be solved using SDN (for example flannel/weave), assigning 
 each service a unique IP address so you don't have to care about port collisions 
 - but this is not related to Mesos-DNS, just info :)
 
 
 
 
 2015-07-16 17:58 GMT+02:00 Dvorkin-Contractor, Eugene (CORP) 
 eugene.dvorkin-contrac...@adp.com:
 Hi,
 I can’t access my application using Mesos-DNS. Neither port 8123 nor 8080 is 
 responding. I think I'm missing something in the configuration but can’t find the 
 problem myself. 
 
 I have a very basic Java application that listens on port 8080. I have created a 
 Docker image and deployed this application to Marathon.
 My deployment configuration is the following:
 $ cat app-slick.json
 {
   "container": {
     "type": "DOCKER",
     "docker": {
       "image": "edvorkin/slick-swagger:1",
       "network": "BRIDGE",
       "portMappings": [
         { "containerPort": 8080, "hostPort": 0, "servicePort": 9000, "protocol": "tcp" }
       ]
     }
   },
   "cmd": "java -jar /tmp/spray-slick-swagger-assembly-0.0.2.jar Boot",
   "id": "slick-swagger-demo",
   "instances": 1,
   "cpus": 0.1,
   "mem": 256,
   "constraints": [
     ["hostname", "UNIQUE"]
   ]
 }
 The application deployed successfully to 2 nodes and was assigned the random ports 
 31990 and 31000 (one on each node).
 Now I installed and configured Mesos-DNS with config.json:
 {
   "zk": "zk://172.31.50.58:2181,172.31.50.59:2181,172.31.50.60:2181/mesos",
   "refreshSeconds": 60,
   "ttl": 60,
   "domain": "mesos",
   "port": 53,
   "resolvers": ["172.31.0.2"],
   "timeout": 5,
   "email": "root.mesos-dns.mesos"
 }
 
 
 and I got following:
 
 $ dig slick-swagger-demo.marathon.mesos
 
 ; <<>> DiG 9.9.4-RedHat-9.9.4-18.el7_1.1 <<>> slick-swagger-demo.marathon.mesos
 ;; global options: +cmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20376
 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
 
 ;; QUESTION SECTION:
 ;slick-swagger-demo.marathon.mesos. IN  A
 
 ;; ANSWER SECTION:
 slick-swagger-demo.marathon.mesos. 60 IN A  172.31.11.202
 slick-swagger-demo.marathon.mesos. 60 IN A  172.31.11.203
 
 ;; Query time: 1 msec
 ;; SERVER: 54.86.164.193#53(54.86.164.193)
 ;; WHEN: Thu Jul 16 15:23:04 UTC 2015
 ;; MSG SIZE  rcvd: 83
 
 
  curl http://localhost:8123/v1/services/_slick-swagger-demo._tcp.marathon.mesos | python -m json.tool
   % Total% Received % Xferd  Average Speed   TimeTime Time  
 Current
  Dload  Upload   Total   SpentLeft  Speed
 100   289  100   2890 0   1916  0 --:--:-- --:--:-- --:--:--  1926
 [
     {
         "host": "slick-swagger-demo-15491-s42.marathon.mesos.",
         "ip": "172.31.11.203",
         "port": "31990",
         "service": 

High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Philip Weaver
I'm trying to understand the behavior of mesos, and if what I am observing
is typical or if I'm doing something wrong, and what options I have for
improving the performance of how offers are made and how tasks are executed
for my particular use case.

I have written a Scheduler that has a queue of very small tasks (for
testing, they are "echo hello world", but in production many of them won't
be much more expensive than that). Each task is configured to use 1 cpu
resource. When resourceOffers is called, I launch as many tasks as I can in
the given offers; that is, one call to driver.launchTasks for each offer,
with a list of tasks that has one task for each cpu in that offer.

On a cluster of 3 nodes and 4 cores each (12 total cores), it takes 120s to
execute 1000 tasks out of the queue. We are evaluating mesos because we want
to use it to replace our current homegrown cluster controller, which can
execute 1000 tasks in way less than 120s.

I am seeing two things that concern me:

   - The time between driver.launchTasks and receiving a callback to
   statusUpdate when the task completes is typically 200-500ms, and sometimes
   even as high as 1000-2000ms.
   - The time between when a task completes and when I get an offer for the
   newly freed resource is another 500ms or so.

These latencies explain why I can only execute tasks at a rate of about 8/s.

It looks like my offers always include all 4 cores on each machine, which
would indicate that mesos doesn't like to send an offer as soon as a single
resource is available, and prefers to delay and send an offer with more
resources in it. Is this true?

Thanks in advance for any advice you can offer!

- Philip


Re: [VOTE] Release Apache Mesos 0.23.0 (rc4)

2015-07-17 Thread Marco Massenzio
Ubuntu 14.04

Not sure if I'm doing something wrong, `sudo make distcheck` fails -
re-running after a `make clean`

If it continues failing, I'll provide more detailed log output.
In the meantime, if anyone has any suggestions as to what I may be doing
wrong, please let me know.

$ ../configure && make -j8 V=0 && make -j12 V=0 check

[==] 649 tests from 94 test cases ran. (254152 ms total)
[  PASSED  ] 649 tests.

$ sudo make -j12 V=0 distcheck

[==] 712 tests from 116 test cases ran. (325751 ms total)
[  PASSED  ] 702 tests.
[  FAILED  ] 10 tests, listed below:
[  FAILED  ] LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
[  FAILED  ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample
[  FAILED  ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where
TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess
[  FAILED  ]
MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
[  FAILED  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
[  FAILED  ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
[  FAILED  ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
[  FAILED  ] NsTest.ROOT_setns
[  FAILED  ] PerfTest.ROOT_Events
[  FAILED  ] PerfTest.ROOT_SamplePid

10 FAILED TESTS
  YOU HAVE 12 DISABLED TESTS


*Marco Massenzio*
*Distributed Systems Engineer*

On Fri, Jul 17, 2015 at 6:49 PM, Vinod Kone vinodk...@gmail.com wrote:

 +1 (binding)

 Successfully built RPMs for CentOS5 and CentOS6 with network isolator.


 On Fri, Jul 17, 2015 at 4:56 PM, Khanduja, Vaibhav 
 vaibhav.khand...@emc.com
  wrote:

  +1
 
  Sent from my iPhone. Please excuse the typos and brevity of this message.
 
   On Jul 17, 2015, at 4:43 PM, Adam Bordelon a...@mesosphere.io wrote:
  
   Hello Mesos community,
  
   Please vote on releasing the following candidate as Apache Mesos
 0.23.0.
  
   0.23.0 includes the following:
  
 
 
   - Per-container network isolation
   - Dockerized slaves will properly recover Docker containers upon
  failover.
   - Upgraded minimum required compilers to GCC 4.8+ or clang 3.5+.
  
   as well as experimental support for:
   - Fetcher Caching
   - Revocable Resources
   - SSL encryption
   - Persistent Volumes
   - Dynamic Reservations
  
   The CHANGELOG for the release is available at:
  
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.0-rc4
  
 
 
  
   The candidate for Mesos 0.23.0 release is available at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz
  
   The tag to be voted on is 0.23.0-rc4:
  
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.0-rc4
  
   The MD5 checksum of the tarball can be found at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.md5
  
   The signature of the tarball can be found at:
  
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.asc
  
   The PGP key used to sign the release is here:
   https://dist.apache.org/repos/dist/release/mesos/KEYS
  
   The JAR is up in Maven in a staging repository here:
   https://repository.apache.org/content/repositories/orgapachemesos-1062
  
   Please vote on releasing this package as Apache Mesos 0.23.0!
  
   The vote is open until Wed July 22nd, 17:00 PDT 2015 and passes if a
   majority of at least 3 +1 PMC votes are cast.
  
   [ ] +1 Release this package as Apache Mesos 0.23.0 (I've tested it!)
   [ ] -1 Do not release this package because ...
  
   Thanks,
   -Adam-
 



Re: [VOTE] Release Apache Mesos 0.23.0 (rc4)

2015-07-17 Thread Marco Massenzio
.RecoverUnregisteredExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.RecoverTerminatedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.CleanupExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.RemoveNonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.NonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.KillTask, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.GCExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.ShutdownSlave, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.RegisterDisconnectedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.ReconcileKillTask, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.ReconcileShutdownFramework, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.ReconcileTasksMissingFromSlave, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.SchedulerFailover, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.PartitionedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.MasterFailover, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.MultipleFrameworks, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.MultipleSlaves, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] SlaveRecoveryTest/0.RestartBeforeContainerizerLaunch, where TypeParam = mesos::internal::slave::MesosContainerizer
[  FAILED  ] MesosContainerizerSlaveRecoveryTest.ResourceStatistics
[  FAILED  ] MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
[  FAILED  ] MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PidNamespaceForward
[  FAILED  ] MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PidNamespaceBackward
[  FAILED  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
[  FAILED  ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
[  FAILED  ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
[  FAILED  ] NsTest.ROOT_setns
[  FAILED  ] PerfTest.ROOT_Events
[  FAILED  ] PerfTest.ROOT_SamplePid

36 FAILED TESTS
  YOU HAVE 12 DISABLED TESTS




[ RUN  ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
../../src/tests/isolator_tests.cpp:1210: Failure
isolator: Failed to create PerfEvent isolator, invalid events: { cpu-cycles }
[  FAILED  ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess (3 ms)
userdel: mesos.test.unprivileged.user mail spool (/var/mail/mesos.test.unprivileged.user) not found
userdel: mesos.test.unprivileged.user home directory (/home/mesos.test.unprivileged.user) not found
[--] 1 test from UserCgroupIsolatorTest/2 (3 ms total)

[--] 24 tests from SlaveRecoveryTest/0, where TypeParam = mesos::internal::slave::MesosContainerizer
[ RUN  ] SlaveRecoveryTest/0.RecoverSlaveState
2015-07-17 20:08:14,697:8318(0x2ab579b2f700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:58356] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
I0717 20:08:14.833111 13827 exec.cpp:132] Version: 0.23.0
I0717 20:08:14.839572 13842 exec.cpp:206] Executor registered on slave 20150717-200814-855746752-41031-8318-S0
Registered executor on gondor
Starting task bfb318c0-fa7e-430b-bc27-2c1e2d593ba9
Forked command at 13854
sh -c 'sleep 1000'
I0717 20:08:14.880101 13845 exec.cpp:379] Executor asked to shutdown
Shutting down
Sending SIGTERM to process tree at pid 13854
../../src/tests/mesos.cpp:757: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
[  FAILED  ] SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = mesos::internal::slave::MesosContainerizer (399 ms)
[ RUN  ] SlaveRecoveryTest/0.RecoverStatusUpdateManager
I0717 20:08:15.233203 13876 exec.cpp:132] Version: 0.23.0
I0717 20:08:15.241922 13901 exec.cpp:206] Executor registered on slave 20150717-200814-855746752-41031-8318-S0
Registered executor on gondor
Starting task 17e4b799-642b

Re: [VOTE] Release Apache Mesos 0.23.0 (rc4)

2015-07-17 Thread Vinod Kone
+1 (binding)

Successfully built RPMs for CentOS5 and CentOS6 with network isolator.


On Fri, Jul 17, 2015 at 4:56 PM, Khanduja, Vaibhav vaibhav.khand...@emc.com
 wrote:

 +1

 Sent from my iPhone. Please excuse the typos and brevity of this message.

  On Jul 17, 2015, at 4:43 PM, Adam Bordelon a...@mesosphere.io wrote:
 
  Hello Mesos community,
 
  Please vote on releasing the following candidate as Apache Mesos 0.23.0.
 
  0.23.0 includes the following:
 
 
  - Per-container network isolation
  - Dockerized slaves will properly recover Docker containers upon
 failover.
  - Upgraded minimum required compilers to GCC 4.8+ or clang 3.5+.
 
  as well as experimental support for:
  - Fetcher Caching
  - Revocable Resources
  - SSL encryption
  - Persistent Volumes
  - Dynamic Reservations
 
  The CHANGELOG for the release is available at:
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.0-rc4
 
 
 
  The candidate for Mesos 0.23.0 release is available at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz
 
  The tag to be voted on is 0.23.0-rc4:
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.0-rc4
 
  The MD5 checksum of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.md5
 
  The signature of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.asc
 
  The PGP key used to sign the release is here:
  https://dist.apache.org/repos/dist/release/mesos/KEYS
 
  The JAR is up in Maven in a staging repository here:
  https://repository.apache.org/content/repositories/orgapachemesos-1062
 
  Please vote on releasing this package as Apache Mesos 0.23.0!
 
  The vote is open until Wed July 22nd, 17:00 PDT 2015 and passes if a
  majority of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Mesos 0.23.0 (I've tested it!)
  [ ] -1 Do not release this package because ...
 
  Thanks,
  -Adam-



Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Benjamin Mahler
Currently, recovered resources are not immediately re-offered as you
noticed, and the default allocation interval is 1 second. I'd recommend
lowering that (e.g. --allocation_interval=50ms), that should improve the
second bullet you listed. Although, in your case it would be better to
immediately re-offer recovered resources (feel free to file a ticket for
supporting that).
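
A minimal sketch of lowering the interval, assuming the Mesosphere packaging mentioned 
later in this thread, where each file under /etc/mesos-master/ supplies one 
mesos-master flag:

  echo "50ms" | sudo tee /etc/mesos-master/allocation_interval
  sudo service mesos-master restart
  # equivalent to starting the master with --allocation_interval=50ms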

For the first bullet, mind providing some more information? E.g. master
flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
We would need to trace through a task launch to see where the latency is
being introduced.

On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver philip.wea...@gmail.com
wrote:

 I'm trying to understand the behavior of mesos, and if what I am observing
 is typical or if I'm doing something wrong, and what options I have for
 improving the performance of how offers are made and how tasks are executed
 for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes 120s
 to execute 1000 tasks out of the queue. We are evaluting mesos because we
 want to use it to replace our current homegrown cluster controller, which
 can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer for
the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of about
 8/s.

 It looks like my offers always include all 4 cores on each machine, which
 would indicate that mesos doesn't like to send an offer as soon as a single
 resource is avaiable, and prefers to delay and send an offer with more
 resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip




Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Benjamin Mahler
I've filed a ticket to immediately re-offer recovered resources from
terminal tasks / executors:

https://issues.apache.org/jira/browse/MESOS-3078

On Fri, Jul 17, 2015 at 2:24 PM, Philip Weaver philip.wea...@gmail.com
wrote:

 Your advice worked and made a huge difference. With
 allocation_interval=50ms, the 1000 tasks now execute in 21s instead of
 120s. Thanks.

 On Fri, Jul 17, 2015 at 2:20 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 Ok, thanks!

 On Fri, Jul 17, 2015 at 2:18 PM, Alexander Gallego agall...@concord.io
 wrote:

 I use a similar pattern.

 I have my own scheduler as you have. I deploy my own executor which
 downloads a tar from some storage and effectively ` execvp ( ... ) ` a
 proc. It monitors the child proc and reports status of child pid exit
 status.

 Check out the Marathon code if you are writing in scala. It is an
 excellent example for both scheduler and executor templates.

 -ag

 On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 Awesome, I suspected that was the case, but hadn't discovered the
 --allocation_interval flag, so I will use that.

 I installed from the mesosphere RPMs and didn't change any flags from
 there. I will try to find some logs that provide some insight into the
 execution times.

 I am using a command task. I haven't looked into executors yet; I had a
 hard time finding some examples in my language (Scala).

 On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g.
 master flags, slave flags, scheduler logs, master logs, slave logs,
 executor logs? We would need to trace through a task launch to see where
 the latency is being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver 
 philip.wea...@gmail.com wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks 
 are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them 
 won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I 
 can in
 the given offers; that is, one call to driver.launchTasks for each 
 offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
 120s to execute 1000 tasks out of the queue. We are evaluting mesos 
 because
 we want to use it to replace our current homegrown cluster controller,
 which can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback
to statusUpdate when the task completes is typically 200-500ms, and
sometimes even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer
for the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of
 about 8/s.

 It looks like my offers always include all 4 cores on each machine,
 which would indicate that mesos doesn't like to send an offer as soon 
 as a
 single resource is avaiable, and prefers to delay and send an offer with
 more resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip













Re: Mesos Slave Failover time

2015-07-17 Thread Vinod Kone
It's not configurable yet, but will be in the upcoming 0.23.0 release.
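
A sketch of what that could look like once 0.23.0 is out, assuming the new flag names 
match the constants mentioned below (slave_ping_timeout, max_slave_ping_timeouts) and 
the Mesosphere-style /etc/mesos-master layout; the values are illustrative:

  # Detection window is roughly slave_ping_timeout * max_slave_ping_timeouts,
  # so 5secs * 2 gives about 10 seconds:
  echo "5secs" | sudo tee /etc/mesos-master/slave_ping_timeout
  echo "2" | sudo tee /etc/mesos-master/max_slave_ping_timeouts
  sudo service mesos-master restart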

On Fri, Jul 17, 2015 at 3:46 PM, Nastooh Avessta (navesta) 
nave...@cisco.com wrote:

  Hi

 Trying to adjust the current failover time to below 10 seconds and don’t
 seem to be able to find the right set of parameters. Currently, it takes
 around minute and half for master to detect that a slave has gone offline,
 which seems to correspond to
 slave_ping_timeout=15*max_slave_ping_timeouts=5. However, I can’t find
 these parameters in mesos-master:



 # mesos-master --version

 mesos 0.22.1

 #mesos-master --help

 Usage: mesos-master [...]



 Supported options:

   --acls=VALUE The value could be a JSON
 formatted string of ACLs

or a file path containing the
 JSON formatted ACLs used

for authorization. Path could
 be of the form 'file:///path/to/file'

or '/path/to/file'.



See the ACLs protobuf in
 mesos.proto for the expected format.



Example:

{

  register_frameworks: [

   {


 principals: { type: ANY },


 roles: { values: [a] }

   }

 ],

  run_tasks: [

  {


principals: {
 values: [a, b] },

 users: {
 values: [c] }

  }

],

  shutdown_frameworks: [

{


 principals: { values: [a, b] },


 framework_principals: { values: [c] }

}

  ]

}

   --allocation_interval=VALUE  Amount of time to wait between
 performing

 (batch) allocations (e.g.,
 500ms, 1sec, etc). (default: 1secs)

   --[no-]authenticate  If authenticate is 'true' only
 authenticated frameworks are allowed

to register. If 'false'
 unauthenticated frameworks are also

allowed to register. (default:
 false)

   --[no-]authenticate_slaves   If 'true' only authenticated
 slaves are allowed to register.

If 'false' unauthenticated
 slaves are also allowed to register. (default: false)

   --authenticators=VALUE   Authenticator implementation to
 use when authenticating frameworks

and/or slaves. Use the default
 'crammd5', or

load an alternate authenticator
 module using --modules. (default: crammd5)

   --cluster=VALUE  Human readable name for the
 cluster,

displayed in the webui.

   --credentials=VALUE  Either a path to a text file
 with a list of credentials,

each line containing
 'principal' and 'secret' separated by whitespace,

or, a path to a JSON-formatted
 file containing credentials.

Path could be of the form
 'file:///path/to/file' or '/path/to/file'.

JSON file Example:

{

  credentials: [

{


 principal: sherman,


  secret: kitesurf,

}

   ]

}

Text file Example:

username secret



   --external_log_file=VALUESpecified the externally
 managed log file. This file will be

exposed in the webui and HTTP
 api. This is useful when using

stderr logging as the log file
 is otherwise unknown to Mesos.

   --framework_sorter=VALUE

Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Philip Weaver
Awesome, I suspected that was the case, but hadn't discovered the
--allocation_interval flag, so I will use that.

I installed from the mesosphere RPMs and didn't change any flags from
there. I will try to find some logs that provide some insight into the
execution times.

I am using a command task. I haven't looked into executors yet; I had a
hard time finding some examples in my language (Scala).

On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler benjamin.mah...@gmail.com
wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g. master
 flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
 We would need to trace through a task launch to see where the latency is
 being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes 120s
 to execute 1000 tasks out of the queue. We are evaluting mesos because we
 want to use it to replace our current homegrown cluster controller, which
 can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and 
 sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer for
the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of about
 8/s.

 It looks like my offers always include all 4 cores on each machine,
 which would indicate that mesos doesn't like to send an offer as soon as a
 single resource is avaiable, and prefers to delay and send an offer with
 more resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip






Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Alexander Gallego
I use a similar pattern.

I have my own scheduler as you have. I deploy my own executor which
downloads a tar from some storage and effectively ` execvp ( ... ) ` a
proc. It monitors the child proc and reports status of child pid exit
status.

Check out the Marathon code if you are writing in scala. It is an excellent
example for both scheduler and executor templates.

-ag

On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver philip.wea...@gmail.com
wrote:

 Awesome, I suspected that was the case, but hadn't discovered the
 --allocation_interval flag, so I will use that.

 I installed from the mesosphere RPMs and didn't change any flags from
 there. I will try to find some logs that provide some insight into the
 execution times.

 I am using a command task. I haven't looked into executors yet; I had a
 hard time finding some examples in my language (Scala).

 On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g. master
 flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
 We would need to trace through a task launch to see where the latency is
 being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver philip.wea...@gmail.com
  wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
 120s to execute 1000 tasks out of the queue. We are evaluting mesos because
 we want to use it to replace our current homegrown cluster controller,
 which can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and 
 sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer
for the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of about
 8/s.

 It looks like my offers always include all 4 cores on each machine,
 which would indicate that mesos doesn't like to send an offer as soon as a
 single resource is avaiable, and prefers to delay and send an offer with
 more resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip







Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Philip Weaver
Ok, thanks!

On Fri, Jul 17, 2015 at 2:18 PM, Alexander Gallego agall...@concord.io
wrote:

 I use a similar pattern.

 I have my own scheduler as you have. I deploy my own executor which
 downloads a tar from some storage and effectively ` execvp ( ... ) ` a
 proc. It monitors the child proc and reports status of child pid exit
 status.

 Check out the Marathon code if you are writing in scala. It is an
 excellent example for both scheduler and executor templates.

 -ag

 On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 Awesome, I suspected that was the case, but hadn't discovered the
 --allocation_interval flag, so I will use that.

 I installed from the mesosphere RPMs and didn't change any flags from
 there. I will try to find some logs that provide some insight into the
 execution times.

 I am using a command task. I haven't looked into executors yet; I had a
 hard time finding some examples in my language (Scala).

 On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g. master
 flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
 We would need to trace through a task launch to see where the latency is
 being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver 
 philip.wea...@gmail.com wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks 
 are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can 
 in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
 120s to execute 1000 tasks out of the queue. We are evaluting mesos 
 because
 we want to use it to replace our current homegrown cluster controller,
 which can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and 
 sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer
for the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of
 about 8/s.

 It looks like my offers always include all 4 cores on each machine,
 which would indicate that mesos doesn't like to send an offer as soon as a
 single resource is avaiable, and prefers to delay and send an offer with
 more resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip











Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Philip Weaver
Your advice worked and made a huge difference. With
allocation_interval=50ms, the 1000 tasks now execute in 21s instead of
120s. Thanks.

On Fri, Jul 17, 2015 at 2:20 PM, Philip Weaver philip.wea...@gmail.com
wrote:

 Ok, thanks!

 On Fri, Jul 17, 2015 at 2:18 PM, Alexander Gallego agall...@concord.io
 wrote:

 I use a similar pattern.

 I have my own scheduler as you have. I deploy my own executor which
 downloads a tar from some storage and effectively ` execvp ( ... ) ` a
 proc. It monitors the child proc and reports status of child pid exit
 status.

 Check out the Marathon code if you are writing in scala. It is an
 excellent example for both scheduler and executor templates.

 -ag

 On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 Awesome, I suspected that was the case, but hadn't discovered the
 --allocation_interval flag, so I will use that.

 I installed from the mesosphere RPMs and didn't change any flags from
 there. I will try to find some logs that provide some insight into the
 execution times.

 I am using a command task. I haven't looked into executors yet; I had a
 hard time finding some examples in my language (Scala).

 On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g.
 master flags, slave flags, scheduler logs, master logs, slave logs,
 executor logs? We would need to trace through a task launch to see where
 the latency is being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver 
 philip.wea...@gmail.com wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks 
 are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them 
 won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can 
 in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
 120s to execute 1000 tasks out of the queue. We are evaluting mesos 
 because
 we want to use it to replace our current homegrown cluster controller,
 which can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and 
 sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer
for the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of
 about 8/s.

 It looks like my offers always include all 4 cores on each machine,
 which would indicate that mesos doesn't like to send an offer as soon as 
 a
 single resource is avaiable, and prefers to delay and send an offer with
 more resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip












Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Alexander Gallego
I take back what I said about the executor in Scala. I just looked at the source, and
both PathExecutor and CommandExecutor proxy to Mesos' TaskBuilder.setCommand:


executor match {
  case CommandExecutor() =>
    builder.setCommand(TaskBuilder.commandInfo(app, Some(taskId), host, ports, envPrefix))
    containerProto.foreach(builder.setContainer)

  case PathExecutor(path) =>
    val executorId = f"marathon-${taskId.getValue}" // Fresh executor
    val executorPath = s"'$path'" // TODO: Really escape this.
    val cmd = app.cmd orElse app.args.map(_ mkString " ") getOrElse ""
    val shell = s"chmod ug+rx $executorPath && exec $executorPath $cmd"
    val command = TaskBuilder.commandInfo(app, Some(taskId), host, ports, envPrefix).toBuilder.setValue(shell)

    val info = ExecutorInfo.newBuilder()
      .setExecutorId(ExecutorID.newBuilder().setValue(executorId))
      .setCommand(command)
    containerProto.foreach(info.setContainer)
    builder.setExecutor(info)
    val binary = new ByteArrayOutputStream()
    mapper.writeValue(binary, app)
    builder.setData(ByteString.copyFrom(binary.toByteArray))
}


The pattern of execvp'ing is still what I use and in fact what mesos uses:

if (task.command().shell()) {
  execl(
      "/bin/sh",
      "sh",
      "-c",
      task.command().value().c_str(),
      (char*) NULL);
} else {
  execvp(task.command().value().c_str(), argv);
}


Sorry for the misinformation about the executor in Marathon.



On Fri, Jul 17, 2015 at 5:20 PM, Philip Weaver philip.wea...@gmail.com
wrote:

 Ok, thanks!

 On Fri, Jul 17, 2015 at 2:18 PM, Alexander Gallego agall...@concord.io
 wrote:

 I use a similar pattern.

 I have my own scheduler as you have. I deploy my own executor which
 downloads a tar from some storage and effectively ` execvp ( ... ) ` a
 proc. It monitors the child proc and reports status of child pid exit
 status.

 Check out the Marathon code if you are writing in scala. It is an
 excellent example for both scheduler and executor templates.

 -ag

 On Fri, Jul 17, 2015 at 5:06 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 Awesome, I suspected that was the case, but hadn't discovered the
 --allocation_interval flag, so I will use that.

 I installed from the mesosphere RPMs and didn't change any flags from
 there. I will try to find some logs that provide some insight into the
 execution times.

 I am using a command task. I haven't looked into executors yet; I had a
 hard time finding some examples in my language (Scala).

 On Fri, Jul 17, 2015 at 2:00 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 One other thing, do you use an executor to run many tasks? Or are you
 using a command task?

 On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler 
 benjamin.mah...@gmail.com wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g.
 master flags, slave flags, scheduler logs, master logs, slave logs,
 executor logs? We would need to trace through a task launch to see where
 the latency is being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver 
 philip.wea...@gmail.com wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks 
 are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them 
 won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can 
 in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes
 120s to execute 1000 tasks out of the queue. We are evaluting mesos 
 because
 we want to use it to replace our current homegrown cluster controller,
 which can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and 
 sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer
for the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate 

[VOTE] Release Apache Mesos 0.23.0 (rc4)

2015-07-17 Thread Adam Bordelon
Hello Mesos community,

Please vote on releasing the following candidate as Apache Mesos 0.23.0.

0.23.0 includes the following:

- Per-container network isolation
- Dockerized slaves will properly recover Docker containers upon failover.
- Upgraded minimum required compilers to GCC 4.8+ or clang 3.5+.

as well as experimental support for:
- Fetcher Caching
- Revocable Resources
- SSL encryption
- Persistent Volumes
- Dynamic Reservations

The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.0-rc4


The candidate for Mesos 0.23.0 release is available at:
https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz

The tag to be voted on is 0.23.0-rc4:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.0-rc4

The MD5 checksum of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.md5

The signature of the tarball can be found at:
https://dist.apache.org/repos/dist/dev/mesos/0.23.0-rc4/mesos-0.23.0.tar.gz.asc

The PGP key used to sign the release is here:
https://dist.apache.org/repos/dist/release/mesos/KEYS

The JAR is up in Maven in a staging repository here:
https://repository.apache.org/content/repositories/orgapachemesos-1062

Please vote on releasing this package as Apache Mesos 0.23.0!

The vote is open until Wed July 22nd, 17:00 PDT 2015 and passes if a
majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Mesos 0.23.0 (I've tested it!)
[ ] -1 Do not release this package because ...

Thanks,
-Adam-


Mesos Slave Failover time

2015-07-17 Thread Nastooh Avessta (navesta)
Hi
Trying to adjust the current failover time to below 10 seconds and I don't seem 
to be able to find the right set of parameters. Currently, it takes around a 
minute and a half for the master to detect that a slave has gone offline, which 
seems to correspond to slave_ping_timeout=15 * max_slave_ping_timeouts=5. However, 
I can't find these parameters in mesos-master:

# mesos-master --version
mesos 0.22.1
#mesos-master --help
Usage: mesos-master [...]

Supported options:
  --acls=VALUE The value could be a JSON formatted 
string of ACLs
   or a file path containing the JSON 
formatted ACLs used
   for authorization. Path could be of 
the form 'file:///path/to/file'
   or '/path/to/file'.

   See the ACLs protobuf in mesos.proto 
for the expected format.

   Example:
   {
     "register_frameworks": [
       {
         "principals": { "type": "ANY" },
         "roles": { "values": ["a"] }
       }
     ],
     "run_tasks": [
       {
         "principals": { "values": ["a", "b"] },
         "users": { "values": ["c"] }
       }
     ],
     "shutdown_frameworks": [
       {
         "principals": { "values": ["a", "b"] },
         "framework_principals": { "values": ["c"] }
       }
     ]
   }
  --allocation_interval=VALUE  Amount of time to wait between 
performing
(batch) allocations (e.g., 500ms, 
1sec, etc). (default: 1secs)
  --[no-]authenticate  If authenticate is 'true' only 
authenticated frameworks are allowed
   to register. If 'false' 
unauthenticated frameworks are also
   allowed to register. (default: false)
  --[no-]authenticate_slaves   If 'true' only authenticated slaves 
are allowed to register.
   If 'false' unauthenticated slaves 
are also allowed to register. (default: false)
  --authenticators=VALUE   Authenticator implementation to use 
when authenticating frameworks
   and/or slaves. Use the default 
'crammd5', or
   load an alternate authenticator 
module using --modules. (default: crammd5)
  --cluster=VALUE  Human readable name for the cluster,
   displayed in the webui.
  --credentials=VALUE  Either a path to a text file with a 
list of credentials,
   each line containing 'principal' and 
'secret' separated by whitespace,
   or, a path to a JSON-formatted file 
containing credentials.
   Path could be of the form 
'file:///path/to/file' or '/path/to/file'.
   JSON file Example:
   {
     "credentials": [
       {
         "principal": "sherman",
         "secret": "kitesurf"
       }
     ]
   }
   Text file Example:
   username secret

  --external_log_file=VALUESpecified the externally managed log 
file. This file will be
   exposed in the webui and HTTP api. 
This is useful when using
   

Re: High latency when scheduling and executing many tiny tasks.

2015-07-17 Thread Benjamin Mahler
One other thing, do you use an executor to run many tasks? Or are you using
a command task?

On Fri, Jul 17, 2015 at 1:54 PM, Benjamin Mahler benjamin.mah...@gmail.com
wrote:

 Currently, recovered resources are not immediately re-offered as you
 noticed, and the default allocation interval is 1 second. I'd recommend
 lowering that (e.g. --allocation_interval=50ms), that should improve the
 second bullet you listed. Although, in your case it would be better to
 immediately re-offer recovered resources (feel free to file a ticket for
 supporting that).

 For the first bullet, mind providing some more information? E.g. master
 flags, slave flags, scheduler logs, master logs, slave logs, executor logs?
 We would need to trace through a task launch to see where the latency is
 being introduced.

 On Fri, Jul 17, 2015 at 12:26 PM, Philip Weaver philip.wea...@gmail.com
 wrote:

 I'm trying to understand the behavior of mesos, and if what I am
 observing is typical or if I'm doing something wrong, and what options I
 have for improving the performance of how offers are made and how tasks are
 executed for my particular use case.

 I have written a Scheduler that has a queue of very small tasks (for
 testing, they are echo hello world, but in production many of them won't
 be much more expensive than that). Each task is configured to use 1 cpu
 resource. When resourceOffers is called, I launch as many tasks as I can in
 the given offers; that is, one call to driver.launchTasks for each offer,
 with a list of tasks that has one task for each cpu in that offer.

 On a cluster of 3 nodes and 4 cores each (12 total cores), it takes 120s
 to execute 1000 tasks out of the queue. We are evaluting mesos because we
 want to use it to replace our current homegrown cluster controller, which
 can execute 1000 tasks in way less than 120s.

 I am seeing two things that concern me:

- The time between driver.launchTasks and receiving a callback to
statusUpdate when the task completes is typically 200-500ms, and sometimes
even as high as 1000-2000ms.
- The time between when a task completes and when I get an offer for
the newly freed resource is another 500ms or so.

 These latencies explain why I can only execute tasks at a rate of about
 8/s.

 It looks like my offers always include all 4 cores on each machine, which
 would indicate that mesos doesn't like to send an offer as soon as a single
 resource is avaiable, and prefers to delay and send an offer with more
 resources in it. Is this true?

 Thanks in advance for any advice you can offer!

 - Phllip