Re: Health Checks for Updates design review
Hi Maxim,

I am not keen on the potential risk of tasks getting stuck in STARTING. We perform auto-scaling of jobs, so there might be nobody around to notice and correct the problem in time.

How about keeping initial_interval_secs and just changing its meaning to a grace period, so that health checks are triggered but errors are ignored during this interval? initial_interval_secs is then a user-configurable upper bound on when a job is meant to be working. It can even be set rather high, because it won't affect the update performance.

What do you think?

Best Regards,
Stephan

From: Maxim Khutornenko ma...@apache.org
Sent: Tuesday, May 5, 2015 10:24 PM
To: dev@aurora.apache.org
Subject: Health Checks for Updates design review

Hi,

I have put together a design proposal for improving health-enabled job update performance. Please review and leave your comments:
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit

Thanks,
Maxim
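To illustrate the grace-period semantics Stephan suggests above (health checks run during initial_interval_secs, but failures are ignored until it has elapsed), here is a minimal Python sketch. It is not the Aurora executor's implementation: the class GracePeriodHealthChecker, its run_once method, the running flag, and the injectable clock are all hypothetical names introduced for illustration, and the max_consecutive_failures handling is an assumption about how failures would be counted after the grace period.

```python
import time
from typing import Callable


class GracePeriodHealthChecker:
    """Hypothetical sketch: treat initial_interval_secs as a grace period."""

    def __init__(
        self,
        check: Callable[[], bool],
        initial_interval_secs: float,
        max_consecutive_failures: int = 0,  # assumed failure budget after grace
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._grace = initial_interval_secs
        self._max_failures = max_consecutive_failures
        self._clock = clock
        self._start = clock()
        self._failures = 0
        # Hypothetical flag: set on the first successful check, so an updater
        # would not need to wait for the whole grace period to elapse.
        self.running = False

    def run_once(self) -> bool:
        """Run one health check; return False if the task should be failed."""
        healthy = self._check()
        in_grace = (self._clock() - self._start) < self._grace
        if healthy:
            self._failures = 0
            self.running = True
            return True
        if in_grace:
            # The check was triggered, but its failure is ignored.
            return True
        self._failures += 1
        return self._failures <= self._max_failures
```

Under these assumptions a task that becomes healthy early is reported as working right away, which is why a high initial_interval_secs would not slow down updates, while a task that never becomes healthy is failed once the grace period ends instead of staying in STARTING indefinitely.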
Aurora + Docker containerizer
Hello,

We would like to use Aurora, but some features that are important to us are missing, mostly those related to the Docker containerizer.

- We need to pass arbitrary docker run parameters, e.g. environment variables and volumes. At the moment we are limited by Aurora, because Mesos (correct me if I’m wrong) provides many more options to frameworks.
- We need Docker networking options other than --net=host. AFAIU this does not relate only to my first bullet point.
- We need to be able to schedule “pure” Docker containers, e.g. docker run -d nginx, without overriding CMD and without Python in the base image. Do you think this is possible with Aurora?

Please let me know what you think about it and whether it makes sense to you.

Regards,
Łukasz Adamczyk

--
Łukasz Adamczyk
AVSystem | ul. Radzikowskiego 47d | 31-315 Kraków | Poland
phone: +48126194700 | avsys...@avsystem.com | www.avsystem.com
Build failed in Jenkins: Aurora #1023
See https://builds.apache.org/job/Aurora/1023/changes

Changes:

[wfarner] Updgrade to gradle 2.4.

--
Started by an SCM change
Building remotely on ubuntu-1 (docker Ubuntu ubuntu ubuntu1) in workspace https://builds.apache.org/job/Aurora/ws/
Wiping out workspace first.
Cloning the remote Git repository
Cloning repository https://git-wip-us.apache.org/repos/asf/aurora.git
git init https://builds.apache.org/job/Aurora/ws/ # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/aurora.git
git --version # timeout=10
git fetch --tags --progress https://git-wip-us.apache.org/repos/asf/aurora.git +refs/heads/*:refs/remotes/origin/*
git config remote.origin.url https://git-wip-us.apache.org/repos/asf/aurora.git # timeout=10
git config remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
git config remote.origin.url https://git-wip-us.apache.org/repos/asf/aurora.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/aurora.git
git fetch --tags --progress https://git-wip-us.apache.org/repos/asf/aurora.git +refs/heads/*:refs/remotes/origin/*
git rev-parse origin/master^{commit} # timeout=10
Checking out Revision fc9cb02ef8ecd30ea163a64d88738c461a18e0fb (origin/master)
git config core.sparsecheckout # timeout=10
git checkout -f fc9cb02ef8ecd30ea163a64d88738c461a18e0fb
git rev-list a31acbb6c59db6cf592be03a5e51a77c8bc50549 # timeout=10
Cleaning workspace
git rev-parse --verify HEAD # timeout=10
Resetting working tree
git reset --hard # timeout=10
git clean -fdx # timeout=10
[Aurora] $ /bin/bash -xe /tmp/hudson2952627025816345596.sh
+ ./build-support/jenkins/build.sh
+ date
Wed May 6 19:37:27 UTC 2015
+ ./gradlew -Pq clean build
Downloading https://services.gradle.org/distributions/gradle-2.4-bin.zip
.
Unzipping /home/jenkins/.gradle/wrapper/dists/gradle-2.4-bin/1lebsnfoptv8qpa10w6kyy5mp/gradle-2.4-bin.zip to /home/jenkins/.gradle/wrapper/dists/gradle-2.4-bin/1lebsnfoptv8qpa10w6kyy5mp
Set executable permissions for: /home/jenkins/.gradle/wrapper/dists/gradle-2.4-bin/1lebsnfoptv8qpa10w6kyy5mp/gradle-2.4/bin/gradle
:buildSrc:clean UP-TO-DATE
:buildSrc:compileJava UP-TO-DATE
:buildSrc:compileGroovy
:buildSrc:processResources UP-TO-DATE
:buildSrc:classes
:buildSrc:jar
:buildSrc:assemble
:buildSrc:compileTestJava UP-TO-DATE
:buildSrc:compileTestGroovy UP-TO-DATE
:buildSrc:processTestResources UP-TO-DATE
:buildSrc:testClasses UP-TO-DATE
:buildSrc:test UP-TO-DATE
:buildSrc:check UP-TO-DATE
:buildSrc:build

FAILURE: Build failed with an exception.

* What went wrong:
A problem occurred configuring root project 'aurora'.
Could not
Re: the status of pesos
In order to unblock the current situation, I've come around to just setting up aurora-compactor and aurora-pesos for now (later it may make sense to do aurora-pystachio and aurora-thermos as well). If there is ever a consistent story for the mesos github organization, we can reevaluate.

Jake, what do we do about setting up these repositories?

On Thu, Apr 23, 2015 at 7:14 PM, Jake Farrell jfarr...@apache.org wrote:

Creating a project on Github as suggested does not give the IP rights to the ASF for any of the code. It would be an external project and would be no different than keeping it as your personal github project. I do not think this is a good route to start down for any Apache Aurora/Mesos additions like this.

-Jake

On Wed, Apr 22, 2015 at 4:33 PM, Brian Wickman wick...@apache.org wrote:

My only reservation with aurora-* repos is that it discourages discovery and will lead to confusion about the scope of the projects. pesos and compactor are broadly useful to the mesos ecosystem, so names like 'aurora-pesos' can genuinely draw people away.

It sounds like the main concerns people have with the status quo revolve around ownership (who can merge patches) and quality (that all code merged to master is reviewed with the same scrutiny as the rest of Aurora). I think these are reasonable concerns, but I think they're more valid once we rely upon the code for production. Right now pesos is purely an optional feature, so I don't think that the above review should be blocked on the incubating nature of pesos; otherwise we'll be stuck in a chicken-and-egg situation where we have little way to vet the code in a meaningful way.

Here's a counterproposal: we create an Aurora top level project on github a la mesos (call it aurora-incubating, aurora-project, apache-aurora, whatever, since aurora is taken), giving all committers write access to all projects therein. We may not be able to rely upon reviewboard, but we can at least solve the problem of ownership.

Thoughts?
On Mon, Apr 20, 2015 at 7:29 PM, Jake Farrell jfarr...@apache.org wrote:

We only sync reviewboard repos from our git-wip or svn servers. I would recommend that we move them into aurora-project name git repos so they can have their own release cycles.

-Jake

On Mon, Apr 20, 2015 at 5:50 PM, Brian Wickman wick...@apache.org wrote:

I started work in r/32373 <https://reviews.apache.org/r/32373/> to add pesos <https://github.com/wickman/pesos> support for the Aurora executor. Pesos is a pure python implementation of the Mesos API. Adding Pesos support to Aurora will pave the way towards pip install and the standard python packaging toolchain as a means to package/install the Aurora executor, without relying upon a cumbersome Mesos build process that is predicated on the nuances of libmesos and its myriad dependencies, e.g. glibc, C++11 and libsvn/apr.

Pesos and its dependent library, compactor <https://github.com/wickman/compactor>, are both projects on my personal github. I'd like to keep them as independent repositories. My experience shows that vendoring these sorts of things reduces discoverability and people's willingness to contribute, and increases the likelihood of forks.

That being said, I'm not convinced they should be under my personal github either, because I'm a poor BDFL <http://en.wikipedia.org/wiki/Benevolent_dictator_for_life> candidate. Instead they should either be under the moniker of the mesos github organization (there is precedent <https://github.com/mesos/mesos-go> for this) or we should create an Aurora organization for third party projects that tend to be developed under the Aurora umbrella, e.g. pystachio.

Regardless of where they live, I think we should immediately start using reviewboard to do code reviews for patches. Does anyone know if this is feasible using reviews.apache.org if the code does not live under the apache umbrella? (The code itself is Apache licensed.)

~brian
Re: Health Checks for Updates design review
Thanks for your comment, Stephan. I have moved it into the doc to keep discussion history in one place.

On Wed, May 6, 2015 at 1:33 AM, Erb, Stephan stephan@blue-yonder.com wrote:

Hi Maxim,

I am not keen on the potential risk of tasks getting stuck in STARTING. We perform auto-scaling of jobs, so there might be nobody around to notice and correct the problem in time.

How about keeping the initial_interval_secs and just change its meaning to be grace period, so that health checks are triggered but errors ignored during this interval. The initial_interval_secs is then a user-configurable upper bound of when a job is meant to be working. It can even be set rather high, because it won't affect the update performance.

What do you think?

Best Regards,
Stephan

From: Maxim Khutornenko ma...@apache.org
Sent: Tuesday, May 5, 2015 10:24 PM
To: dev@aurora.apache.org
Subject: Health Checks for Updates design review

Hi,

I have put together a design proposal for improving health-enabled job update performance. Please, review and leave your comments:
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit

Thanks,
Maxim