Re: [bitcoin-dev] Mitigating Differential Power Analysis in BIP-340
Hi Pieter, Thanks for the detailed response. > /secret key/secret keyI'll try to summarize the discussion we had that led > to this choice, but most of it is on > https://github.com/sipa/bips/issues/195 if you want the details. Ahh I can't believe I missed that github issue while searching. I guess I started reading a paper on DPA and got carried away. I can see you got to where I was and then went much further including some empirical analysis. Nice. I agree with the conclusion that xor is more robust than just hashing randomness in the same block as the secret key. > Let me first try to address what I think you're overlooking: in a > BIP32/Taproot like scenario, the private key that goes into the signing > algorithm functions as *both* secret and known to the attacker. That is to > say, there is a master secret s, and signing key x that goes into the hash > is x=s+a (mod n) for some value a that the attacker knows, and can modify > (he cannot control it directly, but he may be able to grind it to have a > structure he likes). I believe that means that feeding x to a hash directly > itself is already a problem, regardless of what else goes into the hash - > interactions between bits inside the hash operation that all come from x > itself can leak bit-level information of x. XORing (or any other simple > mix operation that does not expose bit-level information) into the private > key before giving it to a hash function seems like it would address this. > This is an subtle point that I didn't cross my mind. My gut feeling is there isn't even a computational argument to made that what I was suggesting is secure against DPA in that setting. DPA seems to be a PITA. A footnote in the BIP with a citation for DPA (the ed25519 one from the issue is good) and a hint about why you should avoid hashing Bitcoin secret keys altogether would be good. This brings us to the next point. It also assumes that somehow the computation of x itself is immune from > leaks (something you pointed out in a previous e-mail, I noticed). > >From my reading of the HMAC papers it seems you might be able to vary the BIP32 child index derivation to do this attack. Just thinking about it now, these attacks seem far fetched just because in order for it to be useful you need to have physical access to the device and to be able to accurately measure power consumption in high resolution (which I guess you can't do from a typical USB bus from corrupted software). Then you also need to get the user to do lots of signing or derivation with their device. I guess a malicious cable with some way of exfiltrating power consumption could do it. > I'm happy for any input you may have here. In particular, the recent > discussions around the interactions between anti-covert channel protection, > randomness, and the ability to spot check hardware devices may mean we > should revise the advice to consider not adding randomness unless such a > anti-covert channel scheme is used. > My only comment here is that there will end up being more than one way to do it and I think what you and your collaborators have put forward is at a local optimum of design (now that I understand it). Thanks and well done! It won't be the right optimum for everyone. To me, it seems like a good place to start. If you develop a decent nonce exfiltration protected signing protocol later then I don't see why HW wallets wouldn't compete for favour amongst the community by implementing and updating their devices to conform to it. LL ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
Re: [bitcoin-dev] Mitigating Differential Power Analysis in BIP-340
On Tuesday, March 24, 2020 6:00 AM, Lloyd Fournier via bitcoin-dev wrote: > Hi List, > > I felt this topic deserved it's own thread but it follows on from the mailing > list post [2] announcing a new PR [1] to change BIP-340 in several ways, > including adding random auxiliary data into the nonce derivation function. > Rather than hashing the randomness with the secret key and message etc, the > randomness is hashed then XOR'd (^) with the secret key and the result is > hashed like so to determine the secret nonce k: > > (1) k = H_derive( sec_key ^ H_aux(rand) || pub_key_x || message) > > The claim made in the mailing list post is that this is more secure against > "differential power analysis" (DPA) attacks than just doing the simpler and > more efficient: > > (2) k = H_derive(sec_key || rand || pub_key_x || message) > > The TL;DR here is that I don't think this is the case. Hi Lloyd, Thank you for looking into this. I very much agree we haven't given nearly enough justification for the use of a non-standard scheme here. I'll try to summarize the discussion we had that led to this choice, but most of it is on https://github.com/sipa/bips/issues/195 if you want the details. Let me first try to address what I think you're overlooking: in a BIP32/Taproot like scenario, the private key that goes into the signing algorithm functions as *both* secret and known to the attacker. That is to say, there is a master secret s, and signing key x that goes into the hash is x=s+a (mod n) for some value a that the attacker knows, and can modify (he cannot control it directly, but he may be able to grind it to have a structure he likes). I believe that means that feeding x to a hash directly itself is already a problem, regardless of what else goes into the hash - interactions between bits inside the hash operation that all come from x itself can leak bit-level information of x. XORing (or any other simple mix operation that does not expose bit-level information) into the private key before giving it to a hash function seems like it would address this. That said, all these DPA issues are very hard to reason about, as it's hard to find a realistic attack model that both (a) leaks some information but (b) doesn't obviously leak the entire key immediately. In the reasoning above I assumed an attacker who can observe word-level Hamming weight aggregates of all variables in the algorithm (which seems to match what one of the papers observed), but not bit level information. It also assumes that somehow the computation of x itself is immune from leaks (something you pointed out in a previous e-mail, I noticed). So really, all of this is trying to choose one alternative among a set of (when ignoring DPA) nearly equally good constructions. Note that randomness is useful for protection against fault attacks, but for that purpose it doesn't matter where in the hash it goes, or even that it's particularly strong randomness (a counter would also work). There are a number of other concerns we discussed in the linked thread: * Efficiency (how many SHA256 transformations, including the ability for some to be precomputed) * The risk that the randomness added is correlated with the private key in a way that cancels things out when they're naively XORed together. * The risk of having a midstate in the hash function leak (without leaking the actual private key, but enough to predict nonces). * The issue with public keys that are input to the signing algorithm which come directly from an attacker (which is the reason why pubkey goes into the nonce function too). The solution we came up with (H(priv XOR H(aux) || pub || msg)) is the only that ticks most of the boxes - but a different prioritization may certainly lead to a different conclusion. I'm happy for any input you may have here. In particular, the recent discussions around the interactions between anti-covert channel protection, randomness, and the ability to spot check hardware devices may mean we should revise the advice to consider not adding randomness unless such a anti-covert channel scheme is used. Cheers, -- Pieter ___ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
[bitcoin-dev] Mitigating Differential Power Analysis in BIP-340
Hi List, I felt this topic deserved it's own thread but it follows on from the mailing list post [2] announcing a new PR [1] to change BIP-340 in several ways, including adding random auxiliary data into the nonce derivation function. Rather than hashing the randomness with the secret key and message etc, the randomness is hashed then XOR'd (^) with the secret key and the result is hashed like so to determine the secret nonce k: (1) k = H_derive( sec_key ^ H_aux(rand) || pub_key_x || message) The claim made in the mailing list post is that this is more secure against "differential power analysis" (DPA) attacks than just doing the simpler and more efficient: (2) k = H_derive(sec_key || rand || pub_key_x || message) The TL;DR here is that I don't think this is the case. There was no citation for this claim, so I did some digging and found two papers that seemed like they might be the origin of the idea [3,4] (I had no idea about these attacks before). A relatively easy to understand explanation of DPA attacks against is in [3]: The fundamental principle behind all DPA attacks is that at some point in > an algorithm’s execution, a function f exists that combines a fixed secret > value with a variable which an attacker knows. An attacker can form > hypotheses about the fixed secret value, and compute the corresponding > output values of f by using an appropriate leakage model, such as the > Hamming Distance model. The attacker can then use the acquired power > consumption traces to verify her hypotheses, by partitioning the > acquisitions or using Pearson’s correlation coefficient. These side-channel > analysis attacks are aided by knowledge of details of the implementation > under attack. Moreover, these attacks can be used to validate hypotheses > about implementation details. In subsequent sections, these side-channel > analysis attacks are referred to as DPA attacks. For example, in the original BIP-340 proposal the nonce derivation was vulnerable to DPA attacks as it was derived simply by doing H_derive(sec_key || message). Since, the message is known to the attacker and variable (even if it is not controller by her), the SHA256 compression function run on (sec_key || message) may leak information about sec_key. It is crucial to understand that just hashing sec_key before passing it into the H_derive does *not* fix the problem. Although the attacker would be unable to find sec_key directly, they could learn H(sec_key) and with that know all the inputs into H_derive and therefore get the value of the secret nonce k and from there extract the secret key from any signature made with this nonce derivation algorithm. The key thing I want to argue with this post is that there is no advantage of (1) over (2) against DPA attacks, at least not given my understanding of these papers. The way the attack in [3] works is by assuming that operations in the compression function leak the "hamming distance" [5] (HD) between the static secret thing that is being combined with the variable public thing. In practice the attack involves many particulars about SHA256 but that is, at a high level, the right way to simplify it I think. The way the paper suggests to fix the problem is to mask the secret data with secret randomness before each sensitive operation and then strip off the secret randomness afterwards. This seems to be the inspiration for the structure of updated BIP-340 (1), however I don't believe that it provides any extra protection over (2). My argument is as follows: Claim A: If the randomness used during signing is kept secret from the attacker then (2) is secure against DPA. Since SHA256 has 64-byte blocks the hash H_derive(sec_key || rand || pub_key_x || message) will be split up into two 64 byte blocks, one containing secret data (sec_key || rand) and the other containing data known to the attacker (pub_key_x || message). The compression function will run on (sec_key || rand) but DPA will be useless here because the HD(sec_key, rand) will contain no information about sec_key since rand is also secret. The output of the compression function on the first block will be secret but *variable* so the intermediate hash state will not reveal useful information when compressed with the second block. Then I thought perhaps (1) is more robust in the case where the randomness is known by the attacker (maybe the attacker can physically modify the chipset to control the rng). We'd have to assume that the sec_key ^ H_aux(rand) isn't vulnerable to DPA (the LHS is under the control of the attacker) to be true. Even under this assumption it turned out not to be the case: Claim B: If the randomness used during signing is known to the attacker, then (1) is not secure against DPA. In (1) there are 96 bytes to be hashed and therefore two SHA256 blocks: (H_aux(sec_key) ^ rand || pub_key_x) and (message). During the first compression function call the attacker gets the HD of: HD( sec_key ^ H_aux(rand), pub_key_x)