Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
ok so :) While fixing your broken stuff, he broke some other stuff which you attempted to fix but broke other stuff doing it, so he decided to fix your broken stuff that was supposed to fix the broken stuff he did to improve old stuff. Great 0-0, ball in the middle. Let’s fix some blockers,

Re: Blameless post mortem

2015-09-28 Thread Wilder Rodrigues
Koushik, Please, say my name! Don’t mention stuff like “the person blah blah blah”, please! If you want to start pointing fingers, I can also play this game and get references to the PRs which were not tested, pushed straight to master (when we agreed on not doing so) or got 2 LGTM without any

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Wilder, I think you are taking this the wrong way. I am not pissed because people are asking for tests. I am pissed with the blame game that is being played in the community. I do not understand why we need to cite PRs and try to blame others, others who are trying to fix the bugs which

Re: Blameless post mortem

2015-09-28 Thread Koushik Das
inline On 28-Sep-2015, at 9:15 PM, Sebastien Goasguen wrote: > Let me try to reply, > >> On Sep 28, 2015, at 5:17 PM, Koushik Das wrote: >> >> I had asked for the documentation on persistent VR (PR # 118) changes in the >> context of another

Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
Let me try to reply, > On Sep 28, 2015, at 5:17 PM, Koushik Das wrote: > > I had asked for the documentation on persistent VR (PR # 118) changes in the > context of another discussion and this is what I got at that time. >

Re: Blameless post mortem

2015-09-28 Thread Koushik Das
I had asked for the documentation on persistent VR (PR # 118) changes in the context of another discussion and this is what I got at that time. http://dev.cloudstack.apache.narkive.com/MH47etbS/discuss-out-of-band-vr-migration-should-we-reboot-vr-or-not#post39 Right now as I see from the

Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
> On Sep 28, 2015, at 7:22 AM, Sanjeev N wrote: > > I have a concern here. Some of us are actively involved in reviewing the > PRs related to marvin tests(Enhancing existing tests/Adding new tests). If > we have to test a PR it requires an environment to be created with

Re: Blameless post mortem

2015-09-28 Thread Remi Bergsma
Hi Bharat, There is no bigger problem. We should always run the tests and if we find a case that isn’t currently covered by the tests we should simply add tests for it. There’s no way we’ll get a stable master without them. The fact that they may not cover everything, is no reason to not rely

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Remi, Thank you for the blameless postmortem. I think there is a bigger problem here than just the review process and running tests. Even if we run the tests we cannot be sure that everything will work as intended. The tests will only give some level of confidence. The tests may not

Re: Blameless post mortem

2015-09-28 Thread Remi Bergsma
Hi Bharat, There is only one way to prove a feature works: with tests. That’s why I say actually _running_ the tests we have today on any new PR, is the most important thing. Having no documentation is a problem, I agree, but it is not more important IMHO. If we had the documentation, we

Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
> On Sep 28, 2015, at 1:29 PM, Sebastien Goasguen wrote: > > >> On Sep 28, 2015, at 1:14 PM, Remi Bergsma >> wrote: >> >> Hi Bharat, >> >> >> There is only one way to prove a feature works: with tests. That’s why I say >> actually _running_

Re: Blameless post mortem

2015-09-28 Thread Remi Bergsma
+1 There are two VR related issues left: - CLOUDSTACK-8697: Assign VPC Internal LB rule to a VM fails - CLOUDSTACK-8915: Cannot SSH into VMs deployed Redundant VPC routers The first one has been tested today and seems still present. The second we discovered this weekend while testing. It was

Re: Blameless post mortem

2015-09-28 Thread Wilder Rodrigues
Hi Bharat, Perhaps you have been away, or have not been reading all the emails that were sent to the list in the past. Why am I saying that? Just based on your sentence where you said “i wonder why was this ignored when merging the VR refactor code”. Is there any particular point you want to make that

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi guys, Anyway, with all the things said and done, I think we all agree that we need some documentation related to the Python changes. Regards, Bharat. On 28-Sep-2015, at 5:46 pm, Wilder Rodrigues wrote: > Hi Bharat, > > Perhaps you haven’t been away of not reading

RE: Blameless post mortem

2015-09-28 Thread Raja Pullela
@cloudstack.apache.org Subject: Re: Blameless post mortem Hi Remi, Whatever ever we think we have discovered are all well known best practices while developing code in community. I agree that tests need to be run on a new PR, but i wonder why was this ignored when merging the VR refactor code

Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
Folks let’s chill for a second here, Let’s be pragmatic: First, - Master got unstable with lots of issues related to the VPC - Issues were fixed - Let’s go back to blockers, fix and release 4.6 Second, - We have a postmortem from Remi. - Let’s talk it out, first with the folks that will be

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Remi, Whatever we think we have discovered are all well-known best practices for developing code in a community. I agree that tests need to be run on a new PR, but I wonder why this was ignored when merging the VR refactor code. Perhaps we will uncover some more issues if we

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Sebastien, You are confused; we are talking about the persistent VR config changes. Below is the PR related to it: https://github.com/apache/cloudstack/pull/118 If you look at it you will notice that there are more than 250 commits and only a few tests that were run. Regards, Bharat. On

Re: Blameless post mortem

2015-09-28 Thread Daan Hoogland
On Mon, Sep 28, 2015 at 2:32 PM, Wilder Rodrigues < wrodrig...@schubergphilis.com> wrote: > Only few tests…. 51 tests against a real environment. > ... and then a lot of people wrote a lot more. @Bharat, @Raja, I hope you don't see design as part of quality assurance. It is not. It is only

Re: Blameless post mortem

2015-09-28 Thread Sebastien Goasguen
> On Sep 28, 2015, at 1:14 PM, Remi Bergsma wrote: > > Hi Bharat, > > > There is only one way to prove a feature works: with tests. That’s why I say > actually _running_ the tests we have today on any new PR, is the most > important thing. Having no

Re: Blameless post mortem

2015-09-28 Thread Remi Bergsma
Dude, this is the final friendly email about this. All points have been made in previous mails. This has nothing to do with ‘blameless’ and ‘learning’ anymore. Read Seb’s mail. We will move on now. Regards, Remi On 28/09/15 13:54, "Bharat Kumar" wrote: >Hi Remi, >

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Dude, There was nothing friendly about the postmortem you did. It was only partial; we should do a complete postmortem and then draw conclusions. I think post-mortems like this are of no use if we do not do them completely. Regards, Bharat. On 28-Sep-2015, at 5:39 pm, Remi Bergsma

Re: Blameless post mortem

2015-09-28 Thread Wilder Rodrigues
Only few tests…. 51 tests against a real environment. At that time Nux also tested it and we tried to get Paul Angus, Geoff and Rohit from Shape Blue to test it as well. Nux found a couple of issues that were reported and fixed (see email below). When I came back from holidays, 4 weeks ago, a

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Remi, I never intended to say that we should not run tests, but even before tests we should have proper documentation. My concern was that if a major change is being introduced, it should be properly documented. All the issues which we are trying to fix are mainly due to the VR refactor. If there

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Remi, I do not agree with the “There is no bigger problem” part of your reply, so I had to repeat myself to make it clearer, not because I am not aware of what this thread is supposed to do. Regards, Bharat. On 28-Sep-2015, at 2:51 pm, Remi Bergsma wrote: >

Re: Blameless post mortem

2015-09-28 Thread Remi Bergsma
Hi Bharat, I understand your frustrations but we already agreed on this so no need to repeat. This thread is supposed to list some improvements and learn from it. Your point has been taken so let’s move on. We need documentation first, then do a change after which all tests should pass. Even

Re: Blameless post mortem

2015-09-28 Thread Wilder Rodrigues
I agree with the docs stuff; I said that 5 emails ago. Once things are fixed, I will take the time to understand the code as a whole and write the documentation: we will need it for release purposes anyway. Cheers, Wilder > On 28 Sep 2015, at 14:47, Bharat Kumar

Re: Blameless post mortem

2015-09-28 Thread Bharat Kumar
Hi Wilder, I am not talking about just the VPC networks. There are many other areas being affected because of this; some of them are VPN (not implemented), RVR in isolated networks, etc. All I am saying is that the design doc will help us understand the complete impact of the changes and deal with

Re: Blameless post mortem

2015-09-27 Thread Sanjeev N
I have a concern here. Some of us are actively involved in reviewing the PRs related to Marvin tests (enhancing existing tests/adding new tests). If we have to test a PR, it requires an environment to be created with actual resources, and this is going to take a lot of time. Some of the tests can run

Re: Blameless post mortem

2015-09-26 Thread sebgoa
Remi, thanks for the detailed post-mortem, it's a good read and great learning. I hope everyone reads it. The one thing to emphasize is that we now have a very visible way to get code into master, we have folks investing time to provide review (great), we need the submitters to make due

RE: Blameless post mortem

2015-09-25 Thread Raja Pullela
Thanks for the update Remi! -Original Message- From: Remi Bergsma [mailto:rberg...@schubergphilis.com] Sent: Saturday, September 26, 2015 1:21 AM To: dev@cloudstack.apache.org Subject: Blameless post mortem Hi all, This mail is intended to be blameless. We need to learn something from

Blameless post mortem

2015-09-25 Thread Remi Bergsma
Hi all, This mail is intended to be blameless. We need to learn something from it. That's why I left out who exactly did what because it’s not relevant. There are multiple examples but it's about the why. Let's learn from this without blaming anyone. We know we need automated testing. We have