Re: [discovery] First two weeks as part of the team

Guillaume Lederrey Fri, 19 Feb 2016 02:32:11 -0800

And here is the post I wanted to write about not being smart:
https://slashdevslashrandom.wordpress.com/2016/02/19/on-the-importance-of-not-being-smart/.
Or at least a post that is vaguely related to what I had in mind
(funny how words always sound better and much more clear when they are
still inside my own head...)


On Thu, Feb 18, 2016 at 10:37 AM, Giuseppe Lavagetto
<[email protected]> wrote:
> [X-posting to ops as this discussion is relevant there too]
>
> On Wed, Feb 17, 2016 at 5:53 PM, Erik Bernhardson
> <[email protected]> wrote:
>> On Feb 17, 2016 1:50 AM, "Guillaume Lederrey" <[email protected]>
>> wrote:
>>>
>>> Hello team!
>>> == Versioning ==
>>>
>>> ** my belief **
>>> anything deployed must have a version number
>>>
>>> ** what happens at WMF **
>>> * deployments on labs are pretty much free-form, cherry pick whatever
>>> you want on puppetmaster
>>> * deployments on prod seem to have version numbers, at least for
>>> mediawiki code; puppet code is deployed directly from the production
>>> branch
>>>
>>> ** comments **
>>> Having clear version numbers implies a conscious decision to
>>> create a version, potentially with the appropriate checks of the
>>> content of that version and additional testing. It allows a clear
>>> separation between creating a version and promoting it to production.
>>> Not having versions everywhere allows for more flexibility and puts
>>> responsibility of making the right choices more on the people than on
>>> the process. Probably a good thing if you have smart enough people
>>> (and WMF seems to have a pretty smart crowd).
>>>
>>> Having a shared git repository on deployment-puppetmaster scares the
>>> hell out of me! I'm so used to preparing anything I want to push
>>> locally and then just applying a specific tag / version...
>>>
>>
>> Puppet being unversioned certainly makes it different from the rest of
>> deployments. I think ops gets away with this by having relatively few people
>> committing code. It also has to do with the careful nature of puppet
>> deployments: puppet is typically deployed one patch at a time. I think
>> this helps with understanding what just broke everything, rather than having
>> a big release with many disparate changes.
>>
>
> Puppet is _always_ deployed one patch at a time, except in very, very
> special cases. I do think that's a very good thing for operations, for
> a few reasons:
>
> 1) Minimize change risk/surface: given we're a very high traffic
> website with a mildly complex architecture, you can't realistically
> think you can validate a large set of changes without throwing live
> traffic at them. I've seen ops teams working with stricter change
> management strategies and the risk for *big troubles* has always been
> higher.
> 2) Speed of deployment: we're a very small team for the amount of
> things we're doing in parallel. We can't seriously think to keep up
> the pace with a stricter change management (as in, deploy a new
> version of our puppet code N times a week after rigorous testing and
> picking the changes that make the cut).
> 3) Keeping changes independent: since the puppet repo is large and
> includes all of production, having changes to independent systems
> being tied together is a recipe for disaster: rolling back one change
> would mean rolling back all of them, frustrating a lot of people and
> probably requiring coordination with other teams. You could just
> revert the affected change and make a new point release, but then I
> completely fail to see how having releases does us any good.
>
> About cherry-picks in beta: the problem is not cherry-picking itself (I
> think it's a reasonable way to test things) but persistent cherry-picking
> to monkey-patch problems. I think if we follow the flow of:
>
> - writing a patch
> - testing it on beta with a cherry-pick
> - getting it merged on ops/puppet and production
>
> and all of this happens within a week, it would be a decent compromise.
>
>>> * I still have not found a global architecture schema (something like
>>> a high-level component or deployment diagram). But I have never seen
>>> any company having those...
>>
>> Pretty sure one doesn't exist :(
>
> Luca (the new analytics opsen) has started to work on
> https://wikitech.wikimedia.org/wiki/File:Infrastructure_overview.png
>
> I asked him to share the sources for it so that everyone can improve it.
>
> Also, if you need some oral history, just ask opsens and we'll be
> happy to give you an overview of how things work :)
>
> Cheers,
>
> Giuseppe
>
> _______________________________________________
> discovery mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/discovery

