Re: [VOTE] Release Apache Aurora 0.18.x packages

2017-07-21 Thread Erb, Stephan
+1 The verification scripts for all distributions in the test repository have passed for me. On 19.07.17, 01:58, "Santhosh Kumar Shanmugham" wrote: I forgot to update one of the bintray links. It should be https://dl.bintray.com/shanmugh/aurora/ On Tue, Jul 18, 2017 at

Re: Redesign of the Aurora UI

2017-07-23 Thread Erb, Stephan
A big +1 from me as well. We have not touched or updated the existing UI for quite some time, which is a bad sign for code health. I would even be OK with a couple of bigger initial code dumps. I am not really a web-developer, so a working piece of code to play around with would probably be the

Re: [Design Doc] Hot Standby in Replicas to Reduce Failover Time

2017-09-04 Thread Erb, Stephan
Thanks for the detailed design document and the in-depth walkthrough [1]! Your proposal seems to be sound. (But be warned, I don’t have much experience in this part of Aurora or Mesos :-)) [1] https://docs.google.com/presentation/d/1fQMfNLaRex9rJyq3h08HIujtpULoYnpFFV7-P6p6Zt0/edit#slide=id.p4

Re: Make drain MAX_STATUS_WAIT configurable

2017-09-05 Thread Erb, Stephan
+1 On 05.09.17, 19:17, "David McLaughlin" wrote: +1 On Tue, Sep 5, 2017 at 10:09 AM, Mauricio Garavaglia < mauriciogaravag...@gmail.com> wrote: > Hi folks, > > The aurora-admin drain command currently has a hardcoded limit of 5 minutes > waiting for a node

Re: Future of storage in Aurora

2017-10-02 Thread Erb, Stephan
What do you have in mind for the in-memory replacement? Revert back to the usage of thrift objects within plain Java containers like we do for the task store? On 02.10.17, 00:59, "Bill Farner" wrote: I would like to revive this discussion in light of some work i have been doing around

Re: Build failed in Jenkins: Aurora #1858

2017-10-23 Thread Erb, Stephan
Ah, again a node with insufficient memory. I once added a mechanism to abort the build early rather than running and eventually failing in these cases. This was very helpful for the regular reviewbot but is not that helpful for the normal SCM-triggered build. Can anyone think of a better way to

Re: 0.19.0 release preparation

2017-10-30 Thread Erb, Stephan
Sounds good to me. Getting the release out quickly will allow us to remove the old mybatis/h2 code sooner. I planned on upgrading to Mesos 1.4. Unfortunately this is currently blocked by a missing mesos.interface package on PyPI. I sent a mail out to the Mesos Dev list but I am still waiting fo

Re: [VOTE] Release Apache Aurora 0.19.0 RC0

2017-11-10 Thread Erb, Stephan
+1 from me. Verification script has passed. I also intended to deploy this to a test cluster, but won’t be able to do so before the vote is closing. On 09.11.17, 17:16, "Bill Farner" wrote: Friendly reminder to vote, folks! We are currently one binding vote shy of a release, and the

Pass slave-ip to user process

2015-03-30 Thread Erb, Stephan
Hi everyone, we are running our Mesos slaves on hosts with multiple network interfaces and would like to specifically bind started services to the ip used to start the Mesos slave (as specified via --ip). Mesos seems to export this ip via the LIBPROCESS_IP environment variable. However thermos

Re: [jira] [Created] (AURORA-1258) Improve procedure for adding instances to a job

2015-04-07 Thread Erb, Stephan
+1 as well. We are implementing an external auto-scaler for certain types of jobs which would greatly benefit from such a command. The scaling can be simulated using the updater, but the latter always needs to know about the job configuration which makes the entire implementation more complica

Re: Pass slave-ip to user process

2015-04-07 Thread Erb, Stephan
would be to file a MESOS ticket and see what the developers there think. On Mon, Mar 30, 2015 at 7:28 AM, Erb, Stephan wrote: > Hi everyone, > > we are running our Mesos slaves on hosts with multiple network interfaces > and would like to specifically bind started services to the i

Re: Graceful task shutdown

2015-04-07 Thread Erb, Stephan
> I do think the lifecycle modules idea would solve Stephan's issue. > > On Tue, Mar 24, 2015 at 5:06 PM, Brian Brazil > wrote: > > > On 24 March 2015 at 20:57, Erb, Stephan > > wrote: > > > > > Hi everyone, > > > > > > we are impleme

Re: Pass slave-ip to user process

2015-04-07 Thread Erb, Stephan
reasonable variables. Please file a ticket requesting these. On Tue, Apr 7, 2015 at 1:11 PM, Erb, Stephan wrote: > Hi, > > you are right, network interface handling should probably be done in > Mesos. However, I thought Aurora could simply expose the information it > already has about

Re: Health Checks for Updates design review

2015-05-06 Thread Erb, Stephan
Hi Maxim, I am not keen on the potential risk of tasks getting stuck in STARTING. We perform auto-scaling of jobs, so there might be nobody around to notice and correct the problem in time. How about keeping the initial_interval_secs and just change its meaning to be grace period, so that heal

non-prod SLA stats

2015-05-29 Thread Erb, Stephan
Hi everyone, we are interested in the job uptime percentiles and the aggregate cluster uptime percentage not only for production jobs, but also for our non-production jobs. Are there any reasons why those are not available in a non-prod version, similar to the current handling of mtta and

Re: non-prod SLA stats

2015-06-01 Thread Erb, Stephan
collection set. What do you think? Thanks, Maxim On Fri, May 29, 2015 at 7:09 AM, Erb, Stephan wrote: > Hi everyone, > > we are interested in the job uptime percentiles and the aggregate cluster > uptime percentage not only for production jobs, but also for our > non-

Re: aurora replica log snapshot interval

2015-06-04 Thread Erb, Stephan
We also used to have timeout issues on a smallish but VM-based test cluster. We ended up doing the following: * run Zookeeper and Aurora with concurrent mark and sweep collector (without any further tuning) * doubled the Aurora native log read and write timeouts. This is in line with the defaul

Re: non-prod SLA stats

2015-06-16 Thread Erb, Stephan
- feel free to propose patches. I'd also encourage to invest some time into an SLA benchmark using our JMH-based harness to back your changes with real perf data. Thanks, Maxim On Mon, Jun 1, 2015 at 3:26 AM, Erb, Stephan wrote: > Hi Maxim, > > introducing some toggles for met

Re: Using Prometheus with Aurora

2015-07-01 Thread Erb, Stephan
Hi Brian, that will come in handy. Thanks for that! I have thought about using Prometheus together with the Mesos and Aurora exporters. Your feature would even allow us to start both exporters on Aurora and let Prometheus discover them automatically. However, it might be a bit risky to use Auror

Blue Yonder is using Aurora

2015-07-18 Thread Erb, Stephan
Hi everyone, we are happy to announce that Blue Yonder (http://www.blue-yonder.com/en/) is using Apache Aurora. We have published a short explanation of our current usecase [1] and are looking forward to further collaboration with the community. [1] http://www.blue-yonder.com/blog-e/2015/07/

SSL Support

2015-08-13 Thread Erb, Stephan
Hi, let's say I want to run Aurora using SSL. How would I do that? Hide it behind a reverse proxy? At least the client seems to have some kind of https related bits and pieces [1]. Best Regards, Stephan [1] https://github.com/apache/aurora/blob/8bdfb8500e792da199bd8cc9fed38d36e2448e81/src/m

Re: MesosCon Europe CFP Extended to Friday

2015-08-28 Thread Erb, Stephan
We have submitted two Aurora related talks: One about service discovery and one about why we adopted Aurora and how we are using it. @Bill: I would love to hear more about upcoming and planned Aurora features (for example, oversubscription, user/filesystem isolation). Cheers, Stephan __

Meaning and usage of job environments

2015-09-02 Thread Erb, Stephan
Hi everyone, I have been wondering about the environment part of Aurora jobkeys (devel, test, staging, staging1, ...prod). I guess you made the environment a first class citizen in order to enforce some kind of standardization. How is this working out for you in practice? IIRC, the current s

Aurora master seems broken

2015-10-07 Thread Erb, Stephan
Hi, does anyone (who is not on the move today) have time to look into this bug? If Aurora is broken during MesosCon it will make it much more difficult for new people to play around with it. https://issues.apache.org/jira/browse/AURORA-1513 Thanks and Best Regards, Stephan

Re: Unified container support in Aurora

2015-10-21 Thread Erb, Stephan
Hi Maxim, we would be interested in the unified container support as well. It would allow us to independently update the major version of the slave OS and the OS used within containers. Nevertheless, while very interesting for the future, it is not a pressing issue for us right now. In additio

Re: Questions about Aurora scheduling policy

2015-11-25 Thread Erb, Stephan
Hi Riccardo, have you looked into the quota feature and the updater settings? I have the impression that a combination of both is what you are looking for. Regarding your requirement to have no resource constraints, have you considered starting a certain set of Mesos slaves defaulting to the po

Re: Colocate services support in Aurora

2015-12-07 Thread Erb, Stephan
I am curious, why exactly do you want to co-locate? What are the requirements imposed by your legacy app? There might be different solutions for different problems (e.g., bind-mount a shared filesystem into the sandbox, set up reverse proxies to allow communication via localhost etc...) Reg

Re: Website fixes

2015-12-13 Thread Erb, Stephan
Looks great, thanks for that! The only issue I have come across: images on the presentation list are broken https://aurora.apache.org/documentation/latest/presentations/ Regards, Stephan From: Bill Farner Sent: Saturday, December 12, 2015 11:20 PM To: dev@a

Re: [VOTE] Release Apache Aurora 0.11.0 RC1

2015-12-20 Thread Erb, Stephan
What's up with this ticket here: https://issues.apache.org/jira/browse/AURORA-1520 Was this forgotten? Should we do it now? Regards, Stephan From: John Sirois Sent: Friday, December 18, 2015 3:37 AM To: dev@aurora.apache.org Subject: Re: [VOTE] Release

Re: Ticket cleanup

2015-12-28 Thread Erb, Stephan
+1. Having a well-groomed bug tracker is very helpful for everyone involved. In particular, it would be great if we could get the bug count to 0 over the course of the next months. Either bugs are important and we get them fixed, or we have the guts to close them as won't fix and update the docum

Re: AURORA-1440 Evaluate Fenzo scheduling library

2015-12-29 Thread Erb, Stephan
Someone also expressed interest in Fenzo by adding it to the community-driven roadmap [1]. AFAIK nobody has looked at it in detail yet. Or at least nobody has posted about it on the mailing list. Feel free to be that someone and take a closer look at what would be necessary to leverage the powe

Re: [PROPOSAL] Amend 0.12.0 release goals

2016-01-14 Thread Erb, Stephan
+1 for catching up From: John Sirois Sent: Thursday, January 14, 2016 4:18 PM To: dev@aurora.apache.org Subject: Re: [PROPOSAL] Amend 0.12.0 release goals On Thu, Jan 14, 2016 at 8:02 AM, Bill Farner wrote: > Given that we are still playing catch-up to m

Re: [PROPOSAL] Job instance scaling APIs

2016-01-15 Thread Erb, Stephan
I really like the proposal. The gain in simplicity on the client-side by not having to provide an aurora config is quite significant. The implementation on the scheduler side is probably rather straightforward as the update can be reused. That would also provide us with the update UI, which ha

Re: [PROPOSAL] Job instance scaling APIs

2016-01-17 Thread Erb, Stephan
about scaleOut() looking more like startJobUpdate() if we keep adding features. If health watching, throttling (batch_size) or rollback on failure is required then I believe the startJobUpdate() should be used instead of scaleOut(). On Fri, Jan 15, 2016 at 1:09 AM, Erb, Stephan wrote: > I really like th

Job-Aggregation

2016-01-26 Thread Erb, Stephan
Hi, I've noticed that a couple of people [1, 2] have independently talked about aggregating multiple Aurora jobs to 'logical services'. Internally, we also do something similar. I am wondering if there is a broader concept waiting to be discovered as an Aurora feature. As a kind of related con

Re: [PROPOSAL] Revisit task ID format

2016-01-26 Thread Erb, Stephan
+1 for dropping the timestamp However, I am not sure regarding the mangled jobkey. It tends to make it easier to correlate Mesos tasks to Aurora jobs when skimming log files, viewing the Mesos-UI or even when using the Thermos [1]. I guess the traceability of all of those usecases could be impr

Re: [PROPOSAL] Revisit task ID format

2016-01-26 Thread Erb, Stephan
#L135 From: Erb, Stephan Sent: Wednesday, January 27, 2016 12:17 AM To: dev@aurora.apache.org Subject: Re: [PROPOSAL] Revisit task ID format +1 for dropping the timestamp However, I am not sure regarding the mangled jobkey. It tends to make it easier to

Re: Job-Aggregation

2016-01-26 Thread Erb, Stephan
esday, January 27, 2016 12:53 AM To: dev@aurora.apache.org Subject: Re: Job-Aggregation Are there any specific things you be looking to do with these groups, or just view them as a logical collection? On Tue, Jan 26, 2016 at 3:01 PM, Erb, Stephan wrote: > Hi, > > I've noticed that a cou

Allow dots and hyphens in metric names

2016-01-27 Thread Erb, Stephan
Bill suggested that I send a note here regarding a planned change of valid metric names: https://reviews.apache.org/r/42879. Goal is to allow any character that may be part of a user-defined job name, in particular dots and hyphens. Fixing this will significantly reduce the logging noise of Auro

NEWS Layout

2016-02-02 Thread Erb, Stephan
Hi everyone, I'd like to propose that we give our NEWS file a little bit more structure. Currently, it is quite cluttered [1]. To keep it simple, I'd suggest that we adopt the style from the 0.11 Aurora blog post: * New/updated * Deprecations and removals [1] https://github.com/apache/aurora/

Re: NEWS Layout

2016-02-02 Thread Erb, Stephan
: > +1 > > On Tue, Feb 2, 2016 at 12:29 PM, Joshua Cohen wrote: > > > +1 > > > > On Tue, Feb 2, 2016 at 1:09 PM, Bill Farner wrote: > > > > > +1 > > > > > > On Tuesday, February 2, 2016, Erb, Stephan < > stephan@blue-yonder.com

Re: New committer and PMC member: Stephan Erb

2016-02-03 Thread Erb, Stephan
Awesome, thanks! Great to be on board! :-) From: Bill Farner Sent: Wednesday, February 3, 2016 7:01 PM To: dev@aurora.apache.org; Erb, Stephan Subject: New committer and PMC member: Stephan Erb Folks, Please join me in welcoming Stephan Erb, who is now an

Re: Further thoughts on config deprecations

2016-02-03 Thread Erb, Stephan
Are you suggesting a tool that operates against a running Aurora cluster and performs a serverside inspection? Or are you implying a tool that works on .aurora files? I'd find the first one way more useful, as the latter one would suggest that you had to have a monorepo with access to the all

Re: Subject: [VOTE] Release Apache Aurora 0.12.0 RC4

2016-02-06 Thread Erb, Stephan
+1 "./build-support/release/verify-release-candidate 0.12.0-rc4 " has passed successfully for me. From: Bill Farner Sent: Saturday, February 6, 2016 00:22 To: dev@aurora.apache.org; jsir...@apache.org Subject: Re: Subject: [VOTE] Release Apache Aurora 0.

Re: [PROPOSAL] Disallow instance removal in job update

2016-02-07 Thread Erb, Stephan
A related idea that recently crossed my mind was some kind of pystachio variable / binding helper: {{aurora.instances}}. When updating a job, the scheduler would fill in the current instance count. However, when I want to change the number of instances, I could simply bind another value locall

Re: [RESULT][VOTE] Release Apache Aurora 0.12.0 RC4

2016-02-28 Thread Erb, Stephan
Even though we have done the voting, the release is still pending. We still have to build the packages and update the website. Is there a way we can help out here? Best, Stephan From: John Sirois Sent: Monday, February 8, 2016 23:47 To: dev@aurora.apache

Weekly community meeting

2016-02-28 Thread Erb, Stephan
Hi everyone, seems like we have been sloppy with the community meeting in the last weeks. It doesn't feel right to have a regular meeting that is skipped silently. Any thoughts or ideas what we could do about that? Best Regards, Stephan

Re: [DRAFT] [REPORT] Apache Aurora

2016-03-01 Thread Erb, Stephan
+1 From: Dave Lester Sent: Tuesday, March 1, 2016 07:15 To: dev@aurora.apache.org Subject: Re: [DRAFT] [REPORT] Apache Aurora +1 On Mon, Feb 29, 2016, at 05:27 PM, Jake Farrell wrote: > +1 looks good > > -Jake > > On Mon, Feb 29, 2016 at 8:15 PM, Bill Far

Re: aurora-packaging builds failling due to people.apache.org dependency

2016-03-02 Thread Erb, Stephan
That's the normal thrift package we use here. Could be hosted anywhere. Fortunately, another user was already so kind to submit a review request in order to fix the issue: https://reviews.apache.org/r/44277/diff/1#index_header Regards, Stephan From: Maur

Re: [PROPOSAL] DB snapshotting

2016-03-02 Thread Erb, Stephan
+1 for the plan and the ticket. In addition, for reference a couple of messages from IRC from yesterday: 23:42 mkhutornenko: interesting storage proposal on the mailing list! I only wondered one thing... 23:42 it feels kind of weird that we use H2 as a non-replicated database and build some s

MesosCon 2016

2016-03-04 Thread Erb, Stephan
Short reminder for everyone: Submission deadline for MesosCon 2016 is next week (March 9). You can find all details on the MesosCon website http://events.linuxfoundation.org/events/mesoscon/program/cfp So hurry up to submit your Aurora-related talk :-) Kind Regards, Stephan

Re: Populate DiscoveryInfo in Mesos

2016-03-07 Thread Erb, Stephan
That sounds like a good idea! Great. If you go ahead with this, please be so kind and start by posting a short design document here on the mailing list (similar to those here https://github.com/apache/aurora/blob/master/docs/design-documents.md, but probably shorter). This will allow us to split t

Re: [VOTE] Release Apache Aurora 0.12.0 rpms

2016-03-12 Thread Erb, Stephan
I'll try to spin up a centos box myself and see how that goes. > Dependency naming aside, i think we should omit docker from our > dependencies, as it really should be a mesos dep if anything. *I can send > a patch for that if others agree.* We also have to be more diligent in tracking the Mesos

Re: [PROPOSAL] Supporting Mesos Universal Containers

2016-03-13 Thread Erb, Stephan
As mentioned in IRC, I like the proposal. Still, we need a discussion regarding the future of current Docker support. Especially since Bill and John have now started improving it. What are our plans here? What are the plans of the Mesos community (i.e., deprecation of the docker containerizer)

Re: [PROPOSAL] Supporting Mesos Universal Containers

2016-03-15 Thread Erb, Stephan
an, or is it reasonable to assess the situation as we go? On Sun, Mar 13, 2016 at 7:23 AM, Erb, Stephan wrote: > As mentioned in IRC, I like the proposal. > > Still, we need a discussion regarding the future of current Docker > support. Especially since Bill and John have now started

Re: [VOTE] Release Apache Aurora 0.12.0 rpms

2016-03-21 Thread Erb, Stephan
+1 Verified using the install instructions here https://github.com/apache/aurora-packaging/blob/master/test/rpm/centos-7/README.md#released but together with John's pkg_root. I have noticed two things: * our thermos_root variable does not seem to play well with the default work_dir of th

Re: Populate DiscoveryInfo in Mesos

2016-03-25 Thread Erb, Stephan
ut which fields would be useful for community, or what value they should take? On Mon, Mar 7, 2016 at 1:00 AM, Erb, Stephan wrote: > That sounds like a good idea! Great. > > If you go ahead with this, please be so kind and start by posting a short > design document here on mailing

Re: Looking for feedback - Setting CommandInfo.user by default when launching tasks.

2016-03-29 Thread Erb, Stephan
I am in favor of your proposal. We offer less attack surface if the executor is not running as root. Interestingly though, this introduces another security problem: The credentials file in the incoming Zookeeper ACL patch (https://reviews.apache.org/r/45042/) will have to be readable by everyone

Re: Populate DiscoveryInfo in Mesos

2016-03-31 Thread Erb, Stephan
> of > > > rules applicable to all orgs using Aurora + Mesos, because cluster > > > management and service discovery stack could differ from org to org. > > > > > > In a recent Mesos work group, some experience folks (Jie Yu and Ben > > > Mahler) m

Re: Are we ready to remove the observer?

2016-04-01 Thread Erb, Stephan
From an operator and Aurora developer perspective, it would be really great to get rid of the thermos observer quickly. However, from a user perspective the usability gap between observer and plain Mesos sandbox browsing is quite large right now. I agree with Benjamin here that it would proba

Re: Are we ready to remove the observer?

2016-04-04 Thread Erb, Stephan
never been very useful to us (since they > don't > > > work for docker), however, even being able to see the processes that > are > > > running, how many times they've restarted, when they launched, etc is > > > invaluable. > > > > > >

Re: Populate DiscoveryInfo in Mesos

2016-04-05 Thread Erb, Stephan
<https://github.com/mesosphere/mesos-dns/issues/414> to > describe > > what I want. > > > > I've updated my patch to include unit test and command flag switch, and > > it's ready for review now. > > > > On Thu, Mar 31, 2016 at 2:32 AM, Erb, S

Re: [DISCUSS]: 0.13.0 release candidate

2016-04-06 Thread Erb, Stephan
Short heads up: I believe I might be blocking the release candidate right now :-/. * Goal was to get https://reviews.apache.org/r/45177/ merged * Before we can merge this, we need a rebuild of the vagrant base image due to this change https://reviews.apache.org/r/45782/ * Unfortunately, I don'

Re: Looking for feedback - Setting CommandInfo.user by default when launching tasks.

2016-04-12 Thread Erb, Stephan
st in the feature I proposed or >> should >> > I >> > > > just >> > > > > > drop it? It's not a lot of code, but also it's not a >> requirement >> > for >> > > > > > anything

Re: [VOTE] Release Apache Aurora 0.13.0 RC0

2016-04-12 Thread Erb, Stephan
+1 for releasing 0.13.0-rc0 as Aurora 0.13.0 * tested with the verification script * deployed the RC to an inhouse test cluster I am also OK with fixing the changelog afterwards. From: Bill Farner Sent: Tuesday, April 12, 2016 16:18 To: jfarr...@apache.o

Re: [PROPOSAL] Switch aurora client from service discovery to HTTP redirects.

2016-04-20 Thread Erb, Stephan
So, scheduler_uri would point to a domain name with multiple A-records? We could probably also extend the interface to support a list of scheduler uris. That would make an initial HA-setup simpler, as it would not require the DNS entries or a load balancer. People could just use a list of IPs.
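
The idea of accepting a list of scheduler URIs can be sketched roughly as follows. This is only an illustration under stated assumptions: `first_reachable` and the `probe` callback are hypothetical names, not part of the Aurora client, and the URIs are placeholders.

```python
from typing import Callable, Iterable, Optional


def first_reachable(uris: Iterable[str],
                    probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first URI for which probe() succeeds, or None.

    `probe` is a hypothetical callback (an assumption for this sketch);
    a real client might issue an HTTP request and follow the redirect
    to the leading scheduler.
    """
    for uri in uris:
        try:
            if probe(uri):
                return uri
        except OSError:
            # Treat connection-level errors as "this scheduler is down"
            # and move on to the next entry in the list.
            continue
    return None


def fake_probe(uri: str) -> bool:
    # Stand-in for a network check: the first host is down,
    # the second one answers.
    if uri == "http://10.0.0.1:8081":
        raise OSError("connection refused")
    return uri == "http://10.0.0.2:8081"


schedulers = ["http://10.0.0.1:8081", "http://10.0.0.2:8081"]
chosen = first_reachable(schedulers, fake_probe)
```

Trying the entries in order keeps the configuration a plain list of IPs, at the cost of a slower first request whenever the leading entries are unreachable.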

Default value of Filter.refuse_seconds

2016-04-26 Thread Erb, Stephan
Hi everyone, within this Aurora review request https://reviews.apache.org/r/46603/ we are wondering about the current Filter.refuse_seconds default value of 5 seconds [1]. Is there any reason why this particular value was chosen? Would we have to expect increased load for the leading Mesos ma

Re: Aurora and Mesos

2016-05-02 Thread Erb, Stephan
Hi Supun, Aurora is a Mesos framework and therefore does not work without Mesos. Details how Mesos & Aurora work together can be found here https://github.com/apache/aurora/blob/master/docs/getting-started/overview.md It is possible to install Mesos masters and Aurora schedulers on different n

Re: Log aggregation to Kafka

2016-05-02 Thread Erb, Stephan
As of today, this is not possible with the thermos logger. However, I can think of a couple of potential solutions: * you start an additional (ephemeral) process together with your job that forwards the content of the stdout & stderr files to Kafka (https://github.com/apache/aurora/blob/17ade117b8

Agent / Slave renaming

2016-05-18 Thread Erb, Stephan
I got some spare time yesterday and used it to rebase the first of a series of stale review requests by Kevin that implement the Mesos agent renaming in Aurora (https://reviews.apache.org/r/47495/). The renaming is ranked high on the Mesos roadmap. The specific epic is work-in-progress but proc

Re: aurora-packaging for 0.13

2016-06-06 Thread Erb, Stephan
Hi Mauricio, the master of https://github.com/apache/aurora-packaging should be ready to use if you want to build 0.13 binaries. There are some minor cleanups needed before we can do an official release though. I will try to look into this until next week. Best Regards, Stephan ___

Re: Aurora 0.14.0 release

2016-06-06 Thread Erb, Stephan
+1 for a RC this week. I'd volunteer to serve as a release manager, but will probably need someone to walk me through the necessary steps. From: Maxim Khutornenko Sent: Tuesday, June 7, 2016 00:15 To: dev@aurora.apache.org Subject: Aurora 0.14.0 release

Re: Aurora 0.14.0 release

2016-06-09 Thread Erb, Stephan
ee to ping me with any questions or bring them up in #aurora > as a number of us have now been RM's before > > -Jake > > On Mon, Jun 6, 2016 at 6:54 PM, Erb, Stephan > wrote: > > > +1 for a RC this week. > > > > I'd volunteer to serve as a release manag

Re: Aurora performance impact with hourly query runs

2016-06-09 Thread Erb, Stephan
I am no expert here, but I would assume that slow task store operations could result from a slow replicated log. Have you tried keeping it on an SSD? (https://github.com/apache/aurora/blob/e89521f1eebd9a5301eb02e2ed6ffebdecd54c9a/docs/operations/configuration.md#-native_log_file_path) FWIW, ther

Re: aurora-packaging for 0.13

2016-06-14 Thread Erb, Stephan
6, 2016 at 4:37 AM, Erb, Stephan wrote: > Hi Mauricio, > > the master of https://github.com/apache/aurora-packaging should be ready > to use if you want to build 0.13 binaries. > > There are some minor cleanups needed before we can do an official release > though. I will try to

Re: Few things we would like to support in aurora scheduler

2016-06-19 Thread Erb, Stephan
>> The next problem is related to the way we collect service cluster >> status. I couldn't find a way to quickly get latest statuses for all >> instances/shards for a job in one query. Instead we query all task statuses >> for a job, then manually iterate through all the statuses and filter the >

Re: [PROPOSAL] Job as a first-class citizen

2016-06-29 Thread Erb, Stephan
I recently thought about the same idea. Use case for us would be to scale a job to 0 instances. While this sounds useless at first, it can be quite powerful when trying to implement a feature like socket activation. From: Maxim Khutornenko Sent: Wednesday,

Re: [VOTE] Release Apache Aurora 0.14.0 packages

2016-06-29 Thread Erb, Stephan
All, I propose that we accept the following artifacts as the official deb and rpm packaging for Apache Aurora 0.14.0. https://dl.bintray.com/stephanerb/aurora/ The Aurora deb and rpm packaging includes the following: --- The CHANGELOGs are available at: https://git1-us-west.apache.org/repos/asf

[RESULT][VOTE] Release Apache Aurora 0.13.0 packages

2016-06-29 Thread Erb, Stephan
This vote has passed. Vote summary: +1 votes: 4 binding +0 votes: 0 -1 votes: 0 Artifacts are now available in the official Apache Aurora bintray repos: https://bintray.com/apache/aurora/ From: Joshua Cohen Sent: Wednesday, June 29, 2016 21:22 To: dev@

Re: [RESULT][VOTE] Release Apache Aurora 0.14.0 packages

2016-07-03 Thread Erb, Stephan
@aurora.apache.org Subject: Re: [VOTE] Release Apache Aurora 0.14.0 packages +1 as verified for debian-jessie, ubuntu-trusty and centos-7 On Wed, Jun 29, 2016 at 5:11 PM, Maxim Khutornenko wrote: > +1 > > Verified for centos-7 and ubuntu-trusty. > > On Wed, Jun 29, 2016 at 2:20 P

Re: Build failed in Jenkins: Aurora #1548

2016-07-04 Thread Erb, Stephan
Looks like src.test.python.apache.thermos.core.test_process.test_log_tee is flaky. I have seen it fail at least twice. On 04/07/16 16:46, "Apache Jenkins Server" wrote: See Changes: [john.sirois] Close `PathChildrenCache` before its framewo

Multi-Role Frameworks

2016-07-10 Thread Erb, Stephan
Some people started an initiative to support frameworks with multiple Mesos roles: · Epic: https://issues.apache.org/jira/browse/MESOS-1763 · Design doc: https://docs.google.com/document/d/1gnDdADUMhPgvPa_lN7y97riUEhTCxmrLinOZM471HLc/edit?pref=2&pli=1#heading=h.czfhotx8xvs5 I

Re: [VOTE] Release Apache Aurora 0.15.0 packages

2016-07-12 Thread Erb, Stephan
+1 Ran the test instructions for all three architectures. There is a minor hiccup with the provision.sh for Debian, but nothing too serious https://gist.github.com/StephanErb/f96f2e4038499efba4fafebada58a9a7. On 08/07/16 23:59, "John Sirois" wrote: Ah yes - and that's covered in the release no

Re: Supporting the mesos HTTP executor API

2016-07-13 Thread Erb, Stephan
The HTTP executor API sounds interesting and I would like to see it land in Aurora one day. As of today, I see two major hurdles with the libmesos dependency of the executor: a) Users have to ship Mesos dependencies within their Docker images b) We need to build and upload the libmesos eggs whe

Re: Dynamic Reservation Meeting

2016-08-02 Thread Erb, Stephan
Making your meeting public is a great service to the community. Thanks for that! I wish all of you a productive meeting :-) On 02/08/16 01:08, "Mehrdad Nurolahzade" wrote: Hi Devs, Folks here at Twitter (Maxim and myself) will be meeting with Uber folks (Dmitriy and Zameer) over lunch to disc

Re: [VOTE] Release Apache Aurora 0.15.0 packages

2016-08-02 Thread Erb, Stephan
This vote is still open and the packages have not been released yet. Formally, everything looks correct to me. We should be able to proceed here, right? On 12/07/16 22:54, "Erb, Stephan" wrote: +1 Ran the test instructions for all three architectures. There is a minor hiccup

Re: [FEEDBACK] Transitioning Aurora leader election to Apache Curator (`-zk_use_curator`)

2016-08-24 Thread Erb, Stephan
The curator backend has been working well for us so far. I believe it is safe to make it the default for the next release, and to drop the old code in the release after that. From: John Sirois Reply-To: "u...@aurora.apache.org" , "jsir...@apache.org" Date: Thursday 7 July 2016 at 01:13 To: Ma

Re: Build failed in Jenkins: aurora-packaging-nightly #408

2016-08-29 Thread Erb, Stephan
Filed an issue with pants: https://github.com/pantsbuild/pants/issues/3817 On 29/08/16 02:34, "Apache Jenkins Server" wrote: See -- [...truncated 1404 lines...] 77% |#

Re: Aurora 0.16.0 release

2016-09-08 Thread Erb, Stephan
I would like to get https://reviews.apache.org/r/51664/ into the release. I am open for feedback and will have time to update the review request on early Friday. Thanks, Stephan On 06/09/16 17:36, "Joshua Cohen" wrote: Hi Aurorans, I plan to kick off the 0.16.0 release some time

Re: [VOTE] Release Apache Aurora 0.16.0 RC1

2016-09-21 Thread Erb, Stephan
Unfortunate -1 from me. I bumped into this: https://issues.apache.org/jira/browse/AURORA-1779 On 20/09/16 22:37, "Joshua Cohen" wrote: I'll start with my own +1 vote. Verified with the verify-release-candidate script. On Tue, Sep 20, 2016 at 3:01 PM, Joshua Cohen wrote:

Re: A mini postmortem on snapshot failures

2016-10-04 Thread Erb, Stephan
An immediate failover seems rather drastic to me. However, I have no anecdotal evidence to back up this feeling or any other default config changes. Maybe Joshua can share what they are using so that we can adjust the default values accordingly? Other thoughts: • Have you tried this magic tric

Re: A mini postmortem on snapshot failures

2016-10-04 Thread Erb, Stephan
Thanks for the pointers regarding the broken documentation. I will fix that. The configuration options have moved and are now described here http://aurora.apache.org/documentation/latest/operations/configuration/#replicated-log-configuration On 03/10/16 09:05, "meghdoot bhattacharya" wrote:

Aurora CNI support

2016-10-11 Thread Erb, Stephan
Hi everyone, I have filed an issue to explore the necessary changes for CNI support in Aurora: https://issues.apache.org/jira/browse/AURORA-1790 I would love to hear your feedback. Best regards, Stephan

Re: Build failed in Jenkins: aurora-packaging-nightly #452

2016-10-12 Thread Erb, Stephan
All those recent packaging failures are due to the gradle major version upgrade. Fix is here: https://reviews.apache.org/r/52777/ On 12/10/16 02:31, "Apache Jenkins Server" wrote: See Changes: [serb] Upda

Re: [VOTE] Release Apache Aurora 0.16.0 packages

2016-10-24 Thread Erb, Stephan
+1 (binding) Also verified via the test instructions. On 24/10/16 21:41, "John Sirois" wrote: +1 (binding) Tested using the 3 vagrant test setups under aurora-packaging/test. There were the `libcurl4-nss-dev` missing dep issues in each of the deb vms that I worked around wi

AttributeAggregate performance

2016-10-25 Thread Erb, Stephan
Hi everyone, I believe I might have found a small performance regression in our scheduling code: https://issues.apache.org/jira/browse/AURORA-1802 I don’t have the cycles (or the necessity) to look into this any further at that point in time. However, for some of you with huge clusters this ma

Preparations for 0.17.0

2016-11-02 Thread Erb, Stephan
Hi everyone, I’d like to volunteer as our next release manager and set the release train for 0.17 into motion. Since 0.16 we have fixed several important bugs and should therefore aim for a release in the next 2-4 weeks, if possible. If everything goes according to plan for the current Mesos r

Re: Preparations for 0.17.0

2016-11-21 Thread Erb, Stephan
release until it is out. > > > > On Wed, Nov 2, 2016 at 9:24 AM, Erb, Stephan < > stephan@blue-yonder.com> > > wrote: > > > > > Hi everyone, > > > > > > I’d like to volunteer as our next release manager a

Re: [DRAFT][REPORT]: Apache Aurora - December 2016

2016-12-07 Thread Erb, Stephan
Looks good to me. Thanks! On 07/12/16 17:34, "Joshua Cohen" wrote: +1, thanks for doing this Jake! On Wed, Dec 7, 2016 at 9:55 AM, Jake Farrell wrote: > Below is our draft board report which is due next week. Please let me know > if you see any additions or changes t

Proposal: Move snapshots into a separate log

2016-12-27 Thread Erb, Stephan
David has posted a great patch & design document on Review Board: RB: https://reviews.apache.org/r/54883/ Design Doc: https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing I am merely reposting his work here so that it won’t be missed by casual readers
