> On Feb 20, 2018, at 11:00 , cryptography-dev-requ...@python.org wrote:
> Date: Mon, 19 Feb 2018 17:14:25 -0800
> From: Paul Kehrer <paul.l.keh...@gmail.com>
> To: cryptography-dev@python.org
> Subject: Re: [Cryptography-dev] Cryptography-dev Digest, Vol 54, Issue
>       2
> Message-ID:
>       <cabj5tktv8cp6xgdcrabxf6qxjnzoipufe6_xy2y_n+r95ks...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Access to Python's memory (via side channel or dumping as root) is not part
> of pyca/cryptography's threat model at this time so we don't attempt to
> protect against it. Making it part of our threat model would be difficult
> due in part to the reasons you stated above as well as the difficulty in
> writing tests to prevent regression, but let's talk about what CPython does
> in this case.



Paul,



        Thank you for your enlightening comments above and below.



What Am I Really Trying To Do?

This function is initially an AWS Lambda function that I also intend to support
on Google Cloud and MS Azure. This august group of engineers obviously knows of
my concern but I nonetheless wish to emphasize the problem. Perhaps I can also
contribute to its solution? While I have some control of the function’s
environment, I don’t have total control. For example, while I can hope the host
OS has enough entropy in its pool to support os.urandom(), the AWS
recommendation, with which I concur, is that I get my random_bytes from the AWS
KMS service and stir them up with an HKDF. As in:



random_bytes = os.urandom(512//8)  # <= Should come from AWS KMS random bytes service.

# Derive a SECP384R1 private key using SHA-512, _salt512, and guid.
hkdf = HKDF(algorithm=sha512, length=384//8, salt=_salt512, info=guid, 
backend=backend)
secret_bytes = hkdf.derive(random_bytes)
secret_int = int.from_bytes(secret_bytes, byteorder='big')
private_key = ec.derive_private_key(secret_int, ec.SECP384R1(), backend)



As I understand it, each of the cloud vendors offers a high entropy source of 
random bytes. Upon completion of this function, I want to scrub RAM of key 
material. I will never get another chance to address the existence of this key 
material. There is no goodbye kiss from the cloud function execution 
environments. These are one-shot function calls. If I am to ensure that this 
crypto toxic waste doesn’t come back to bite my service, then I must dispose of 
it in the function context that created it. 
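A minimal sketch of the kind of disposal I mean, using only the standard library. The helper name scrub() is hypothetical, and the sketch assumes the secret lives in a mutable bytearray. It overwrites that one buffer only; immutable bytes and int objects, and any copies made by the interpreter or a library, are not reached by it.

```python
import os


def scrub(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zeros (hypothetical helper)."""
    for i in range(len(buf)):
        buf[i] = 0


# Hold the secret in a bytearray so it can be overwritten later.
# A bytes object is immutable and cannot be scrubbed this way.
secret = bytearray(os.urandom(384 // 8))
try:
    # ... use the secret here ...
    secret_length = len(secret)
finally:
    # Runs even if the work above raises, since this is the only chance.
    scrub(secret)
```

The try/finally placement reflects the one-shot nature of a cloud function call: the cleanup has to happen in the same context that created the material.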



In other words, pyca/cryptography is likely to be used in a much more dynamic 
environment than heretofore. Considering that pyca/cryptography has largely 
succeeded in building a civilized interface to crypto routines, helping folks 
implement good secret/key hygiene seems in scope for the project’s goals. 
Considering Amazon’s embrace of pyca/cryptography for their Python lambda 
functions and AWS Encryption SDK, I am unlikely to be the sole user.



> int.from_bytes will unfortunately make a copy (to a Python integer). That
> int will then be copied into a BN via _int_to_bn (
> https://github.com/pyca/cryptography/blob/master/src/cryptography/hazmat/backends/openssl/backend.py#L317-L346)
> when you call derive_private_key. It will actually be converted twice (a
> thing we should fix) (
> https://github.com/pyca/cryptography/blob/master/src/cryptography/hazmat/backends/openssl/backend.py#L1383-L1419).
> Although the resulting BNs will themselves be zeroed as freed, this means a
> secret scalar bytestring created in Python will be resident in memory no
> less than 5 times (3 byte strings, 2 numbers).
> 
> Obviously the next logical question is why you'd provide a Python integer
> when we're just going to convert it back to big endian bytes anyway.
> Disregarding the memory clearing issue it's also inefficient. When
> originally designing some of the APIs we made a mistake and chose integers
> instead of big endian bytes (see: numbers classes). We have not yet added
> alternate APIs to potentially enable us to deprecate numbers because the
> improvement in efficiency probably isn't worth the pain of trying to
> convert the huge number of users of those classes.
> 
> ec.derive_private_key_from_bytes(secret_bytes, ec.SECP384R1(), backend)
> could potentially be a way to do this specific operation while reducing the
> number of copies (to zero in Python and 2-3 in OpenSSL, although the latter
> are zeroed), but without tests that can detect non-required copies of
> secret material it would be extremely hard to prevent regression in the
> long term as the code is updated.
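To make the copy count above concrete, here is a small sketch that counts only the Python-level objects in the bytes -> int -> bytes round trip. It uses a fixed dummy value rather than real key material and does not inspect anything inside OpenSSL or pyca/cryptography; it only shows that each conversion yields a new, immutable object holding the same scalar.

```python
# Dummy 48-byte value standing in for a SECP384R1 secret scalar.
secret_bytes = bytes(range(1, 49))

# First copy: the scalar as a Python integer.
secret_int = int.from_bytes(secret_bytes, byteorder="big")

# Second copy: converting back to big-endian bytes creates a new object.
round_trip = secret_int.to_bytes(48, byteorder="big")

# Same value, but three separate objects now hold it in memory,
# and none of them can be overwritten in place.
same_value = round_trip == secret_bytes
same_object = round_trip is secret_bytes
```

Here same_value is True while same_object is False, which is the Python half of the problem described above.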



        Thank you for the above excellent exposition of the state of data 
copying in pyca/cryptography. Considering the appropriate "Sturm und Drang" 
that accompanied Meltdown, Spectre, and other exploits around key material, I 
think many developers have an interest in making the changes to their code that 
allow them to ensure precise key material lifetimes. Of course, with modern 
internet survey tools, the pyca/cryptography team can easily ask their users if 
they would make these changes.

        To your point about regression testing the copying of key material: I 
place this in the category of expertise of the library developer. There are 
plenty of similar cryptographic issues that the pyca/cryptography team has had 
to tackle through discipline and intra-project communication. Obviously, 
because your team already clears and then frees items passed to and from 
OpenSSL, you know how to do this. How can I help?




        I also note that ec.derive_private_key() has a paired routine where 
pyca/cryptography takes responsibility for the secret_bytes: 
ec.generate_private_key(). If it manages its secret copies carefully, I might 
consider using it instead of providing my own secret_bytes. It would, of 
course, depend upon pyca/cryptography’s high entropy data source 
(os.urandom()?). As this is in the hazmat section of the library, perhaps a 
discussion in the documentation about digital toxic/hazardous waste is 
appropriate? How can I help?



> Given your chosen constraints have you considered deriving a key in
> subprocess, serializing it, and reading it from stdout in the parent
> process? By doing this you'd have a precisely defined intermediate object
> lifetime and the only secret in the parent process's memory would be a
> single DER or PEM bytestring containing the EC key.
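A rough sketch of the subprocess pattern suggested above, using only the standard library. The child script here is a stand-in: it hashes os.urandom() output and prints hex, where a real child would derive the EC key with pyca/cryptography and write its DER or PEM serialization. The names CHILD_SOURCE and derive_in_subprocess are hypothetical, and whether a given cloud function environment permits spawning a child process is an assumption.

```python
import subprocess
import sys

# Stand-in child: all intermediate secret objects live and die in this process.
CHILD_SOURCE = r"""
import hashlib, os, sys
random_bytes = os.urandom(64)
secret = hashlib.sha512(random_bytes).digest()[:48]
sys.stdout.write(secret.hex())
"""


def derive_in_subprocess() -> bytes:
    """Run the derivation in a child interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout


# The parent only ever holds this one serialized bytestring.
serialized = derive_in_subprocess()
```

The intermediate values never exist in the parent's memory; their lifetime ends when the child process exits.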



        That is one excellent way to achieve my goal in a normal execution 
environment. All of the industry trends lead me to believe that AWS lambda 
function-like environments will become a high percentage of where crypto 
operations are performed. Due to their small scope and statelessness, they lend 
themselves to focussed and highly tuned solutions. They are a natural place for 
the cryptographic core functions of many systems to reside. Put all of your 
keys in one basket and then watch that basket very carefully.



        Thank you for taking the time to examine my concerns. 



Anon,
Andrew
____________________________________
Andrew W. Donoho
Donoho Design Group, L.L.C.
andrew.don...@gmail.com, +1 (512) 666-7596, twitter.com/adonoho

No risk, no art.
        No art, no reward.
                -- Seth Godin


_______________________________________________
Cryptography-dev mailing list
Cryptography-dev@python.org
https://mail.python.org/mailman/listinfo/cryptography-dev
