Hey, I talked more about this with Josh Rosen, who has helped with the
automation since I became less involved in release management.

I can think of a few different things that would improve our RM based on
these suggestions:

(1) We could remove the signing step from the rest of the automation and ask
the RM to sign the artifacts locally as a last step. This does mean we'd trust
the RM's environment not to be owned, but it could be better if there is
concern about centralization of risk. I'm curious how other projects do
this.

(2) We could rotate the RM position. BTW Holden Karau is doing this and
that's how this whole discussion started.

(3) We should make sure all build tooling automation is in the repo itself
so that the build is 100% reproducible by anyone. I think most of it is
already in dev/ [1] but there might be Jenkins configs, etc. that could be
put into the spark repo.

[1] https://github.com/apache/spark/tree/master/dev/create-release
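
As a rough illustration of (1), here is a minimal sketch of a local signing
step the RM could run after the automated build finishes. This is not the
existing tooling in dev/create-release: the function names and the key ID are
hypothetical, and it assumes gpg is installed with the RM's own key in the
local keyring. It also writes a SHA-512 checksum file next to each artifact,
since checksums came up earlier in the thread.

```python
import hashlib
import subprocess
from pathlib import Path
from typing import List


def build_sign_command(artifact: Path, key_id: str) -> List[str]:
    """Return the gpg invocation for a detached, ASCII-armored signature."""
    return [
        "gpg", "--armor", "--detach-sign",
        "--local-user", key_id,          # the RM's own key, not a shared one
        "--output", f"{artifact}.asc",
        str(artifact),
    ]


def sha512_hex(artifact: Path) -> str:
    """Hash the artifact in chunks so large tarballs are not loaded whole."""
    digest = hashlib.sha512()
    with artifact.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sign_and_checksum(artifact: Path, key_id: str, run=subprocess.run) -> Path:
    """Sign one artifact locally and write a .sha512 file beside it."""
    run(build_sign_command(artifact, key_id), check=True)
    checksum_file = artifact.with_name(artifact.name + ".sha512")
    checksum_file.write_text(f"{sha512_hex(artifact)}  {artifact.name}\n")
    return checksum_file
```

Calling sign_and_checksum(Path("example.tgz"), "ABCD1234") would leave
example.tgz.asc and example.tgz.sha512 next to the artifact. The run parameter
is there only so the gpg call can be stubbed out when testing.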

- Patrick

On Mon, Sep 18, 2017 at 6:23 PM, Patrick Wendell <patr...@databricks.com>
wrote:

> One thing we could do is modify the release tooling to allow the key to be
> injected each time, thus allowing any RM to insert their own key at build
> time.
>
> Patrick
>
> On Mon, Sep 18, 2017 at 4:56 PM Ryan Blue <rb...@netflix.com> wrote:
>
>> I don't understand why it is necessary to share a release key. If this is
>> something that can be automated in a Jenkins job, then can it be a script
>> with a reasonable set of build requirements for Mac and Ubuntu? That's the
>> approach I've seen the most in other projects.
>>
>> I'm also not just concerned about release managers. Having a key stored
>> persistently on outside infrastructure adds the most risk, as Luciano noted
>> as well. We should also start publishing checksums in the Spark VOTE
>> thread, which are currently missing. The risk I'm concerned about is that
>> if the key were compromised, it would be possible to replace binaries with
>> perfectly valid ones, at least on some mirrors. If the Apache copy were
>> replaced, then we wouldn't even be able to catch that it had happened.
>> Given the high profile of Spark and the number of companies that run it, I
>> think we need to take extra care to make sure that can't happen, even if it
>> is an annoyance for the release managers.
>>
>> On Sun, Sep 17, 2017 at 10:12 PM, Patrick Wendell <patr...@databricks.com>
>> wrote:
>>
>>> Spark's release pipeline is automated and part of that automation
>>> includes securely injecting this key for the purpose of signing. I asked
>>> the ASF to provide a service account key several years ago but they
>>> suggested that we use a key attributed to an individual even if the process
>>> is automated.
>>>
>>> I believe other projects that release with high frequency also have
>>> automated the signing process.
>>>
>>> This key is injected during the build process. A really ambitious
>>> release manager could reverse engineer this in a way that reveals the
>>> private key; however, if someone is a release manager then they themselves
>>> can do quite a few nefarious things anyway.
>>>
>>> It is true that we trust all previous release managers instead of only
>>> one. We could probably rotate the Jenkins credentials periodically in order
>>> to compensate for this, if we think this is a nontrivial risk.
>>>
>>> - Patrick
>>>
>>> On Sun, Sep 17, 2017 at 7:04 PM, Holden Karau <hol...@pigscanfly.ca>
>>> wrote:
>>>
>>>> Would any of Patrick/Josh/Shane (or other PMC folks with
>>>> understanding/opinions on this setup) care to comment? If this is a
>>>> blocking issue I can cancel the current release vote thread while we
>>>> discuss this some more.
>>>>
>>>> On Fri, Sep 15, 2017 at 5:18 PM Holden Karau <hol...@pigscanfly.ca>
>>>> wrote:
>>>>
>>>>> Oh yes and to keep people more informed I've been updating a PR for
>>>>> the release documentation as I go to write down some of this unwritten
>>>>> knowledge -- https://github.com/apache/spark-website/pull/66
>>>>>
>>>>>
>>>>> On Fri, Sep 15, 2017 at 5:12 PM Holden Karau <hol...@pigscanfly.ca>
>>>>> wrote:
>>>>>
>>>>>> Also continuing the discussion from the vote threads, Shane probably
>>>>>> has the best idea on the ACLs for Jenkins so I've CC'd him as well.
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 15, 2017 at 5:09 PM Holden Karau <hol...@pigscanfly.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> Changing the release jobs, beyond the available parameters, right
>>>>>>> now depends on Josh Rosen as there are some scripts which generate the
>>>>>>> jobs which aren't public. I've done temporary fixes in the past with the
>>>>>>> Python packaging but my understanding is that in the medium term it
>>>>>>> requires access to the scripts.
>>>>>>>
>>>>>>> So +CC Josh.
>>>>>>>
>>>>>>> On Fri, Sep 15, 2017 at 4:38 PM Ryan Blue <rb...@netflix.com> wrote:
>>>>>>>
>>>>>>>> I think this needs to be fixed. It's true that there are barriers
>>>>>>>> to publication, but the signature is what we use to authenticate Apache
>>>>>>>> releases.
>>>>>>>>
>>>>>>>> If Patrick's key is available on Jenkins for any Spark committer to
>>>>>>>> use, then the chances of a compromise are much higher than for a normal
>>>>>>>> RM key.
>>>>>>>>
>>>>>>>> rb
>>>>>>>>
>>>>>>>> On Fri, Sep 15, 2017 at 12:34 PM, Sean Owen <so...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yeah I had meant to ask about that in the past. While I presume
>>>>>>>>> Patrick consents to this and all that, it does mean that anyone with
>>>>>>>>> access to said Jenkins scripts can create a signed Spark release,
>>>>>>>>> regardless of who they are.
>>>>>>>>>
>>>>>>>>> I haven't thought through whether that's a theoretical issue we
>>>>>>>>> can ignore or something we need to fix up. For example you can't get a
>>>>>>>>> release on the ASF mirrors without more authentication.
>>>>>>>>>
>>>>>>>>> How hard would it be to make the script take in a key? It sort of
>>>>>>>>> looks like the script already takes GPG_KEY, but I don't know how to
>>>>>>>>> modify the jobs. I suppose it would be ideal, in any event, for the
>>>>>>>>> actual release manager to sign.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 15, 2017 at 8:28 PM Holden Karau <hol...@pigscanfly.ca>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> That's a good question. I built the release candidate; however, the
>>>>>>>>>> Jenkins scripts don't take a parameter for configuring who signs them.
>>>>>>>>>> Instead they always sign with Patrick's key. You can see this from
>>>>>>>>>> previous releases which were managed by other folks but still signed
>>>>>>>>>> by Patrick.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 15, 2017 at 12:16 PM, Ryan Blue <rb...@netflix.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> The signature is valid, but why was the release signed with
>>>>>>>>>>> Patrick Wendell's private key? Did Patrick build the release
>>>>>>>>>>> candidate?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ryan Blue
>>>>>>>> Software Engineer
>>>>>>>> Netflix
>>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>
>>>>> --
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
