[GitHub] [accumulo] ctubbsii commented on issue #2700: Seeing lots of lock contention and CPU around sha512 computation during authentication

GitBox Tue, 24 May 2022 13:11:56 -0700


ctubbsii commented on issue #2700:
URL: https://github.com/apache/accumulo/issues/2700#issuecomment-1136386342


   > > I don't want to do anything custom.
   > 
   > To be clear I wasn't doing anything custom, `PBKDF2WithHmacSHA512` is a 
JCA supported algorithm.
   
   Right, so, there's the algorithm, but then there's the serialization format. 
Crypt emits them in a well known serialization format. We'd have to do 
something custom for the serialization, even if we're using a built-in 
algorithm. I'd prefer to stick to the crypt standard for serialization, and 
limit ourselves to the algorithms it supports, rather than use a different 
algorithm and customize the serialization.
   
   > Looking at [CODEC-133](https://issues.apache.org/jira/browse/CODEC-133), 
the commons-codec classes implement the Linux crypt algorithm, but I'm not 
quite sure of its lineage. There are several things mentioned in that ticket 
that seem to be merged together.
   
   Linux's crypt evolved out of the need to use something stronger than the DES 
algorithm, but needed to be serialized in a backwards-compatible way, and 
human-readable since it was going in a text file. They came up with the 
well-known prefix `$<algorithm_id>$`. The format of everything after that 
prefix depends on the algorithm, but ultimately, the hash is encoded using 
printable ASCII characters. The SHA-2 algorithms support a `rounds` parameter 
(called the "CPU time cost parameter" in the man page) that ranges from 1000 to 
999,999,999 and defaults to 5000, plus a 6-96 bit salt that is base-64 encoded.
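
   To make the layout concrete, here is a minimal sketch of splitting a 
SHA-2 crypt string into its parts. The `CryptString` class is hypothetical (it 
is not part of Accumulo or commons-codec), and the pattern assumes the 
`$<id>$[rounds=<n>$]<salt>$<hash>` layout with a 1-16 character salt, as 
described above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Accumulo or commons-codec. */
final class CryptString {
  // Default used when the string carries no explicit rounds= field.
  static final int DEFAULT_ROUNDS = 5000;

  // Assumed layout: $<id>$[rounds=<n>$]<salt>$<hash>, where id 5 is SHA-256 and 6 is SHA-512.
  private static final Pattern SHA2_CRYPT = Pattern.compile(
      "^\\$([56])\\$(?:rounds=(\\d{1,9})\\$)?([./0-9A-Za-z]{1,16})\\$([./0-9A-Za-z]+)$");

  final String algorithmId;
  final int rounds;
  final String salt;
  final String hash;

  private CryptString(String algorithmId, int rounds, String salt, String hash) {
    this.algorithmId = algorithmId;
    this.rounds = rounds;
    this.salt = salt;
    this.hash = hash;
  }

  /** Splits an encoded string into its fields, or throws if it does not match the layout. */
  static CryptString parse(String encoded) {
    Matcher m = SHA2_CRYPT.matcher(encoded);
    if (!m.matches()) {
      throw new IllegalArgumentException("not a SHA-2 crypt string");
    }
    int rounds = m.group(2) == null ? DEFAULT_ROUNDS : Integer.parseInt(m.group(2));
    return new CryptString(m.group(1), rounds, m.group(3), m.group(4));
  }

  public static void main(String[] args) {
    // The hash portion here is a placeholder, not a real digest.
    CryptString c = parse("$6$rounds=10000$saltstring$abcDEF123./");
    System.out.println(c.algorithmId + " " + c.rounds + " " + c.salt);
  }
}
```

   This sketch only reports the rounds value as written; it does not clamp 
out-of-range values the way a real crypt implementation might.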
   
   If there are several things mentioned in the ticket, it's probably because 
there are multiple algorithms that are supported.
   
   > > Something else to keep in mind: faster hashing makes brute force attacks 
more efficient. The fact that sha512crypt is slower isn't necessarily a bad 
thing. It would be nice if it didn't use as much CPU, though.
   > 
   > Agreed, my test performed 1000 iterations to match what the commons codec 
code is doing by default. I don't think we are providing commons-codec with any 
number of iterations. NIST SP800-63B suggests doing 10,000 iterations; it says:
   
   Even though the javadoc for Crypt incorrectly says the salt is specified 
without the rounds= parameter, you *can* specify that parameter to override the 
default of 5000. However, the Crypt API doesn't make this easy for us... we 
would have to randomly generate our own salt, base-64 encode it and pass it in, 
rather than rely on the library generating a random salt for us. And their 
utility code to generate the salt and base-64 encode it efficiently is not 
public, so we'd have to rewrite it or copy it.
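
   As a rough sketch of what "generate our own salt" could look like, the 
following builds a SHA-512 crypt salt prefix with an explicit `rounds=` field 
using only the JDK. The `CryptSalt` class and its method name are hypothetical, 
and picking 16 characters directly from the crypt alphabet is a simplification 
swapped in for commons-codec's non-public salt utility, not a copy of it:

```java
import java.security.SecureRandom;

/** Hypothetical helper for illustration; not the commons-codec implementation. */
final class CryptSalt {
  // The crypt base-64 alphabet: ./0-9A-Za-z (64 characters, 6 bits each).
  private static final String B64 =
      "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  private static final SecureRandom RANDOM = new SecureRandom();

  /** Returns a salt prefix such as $6$rounds=10000$<16 random characters>. */
  static String sha512Salt(int rounds) {
    if (rounds < 1000 || rounds > 999_999_999) {
      throw new IllegalArgumentException("rounds must be between 1000 and 999,999,999");
    }
    StringBuilder sb = new StringBuilder("$6$rounds=").append(rounds).append('$');
    for (int i = 0; i < 16; i++) {
      // Each character contributes 6 bits, so 16 characters give a 96-bit salt.
      sb.append(B64.charAt(RANDOM.nextInt(B64.length())));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(sha512Salt(10_000));
  }
}
```

   The result would then be passed as the salt argument, e.g. 
`Crypt.crypt(password, CryptSalt.sha512Salt(10_000))`, assuming the library 
honors the `rounds=` field as described above.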
   
   It would be nice to have an upstream change that makes it easier to specify 
the rounds (and also to support newer algorithms, like yescrypt). commons-codec 
is a bit out of date in this regard.
   
   > 
   > > For PBKDF2, the cost factor is an iteration count: the more times the 
PBKDF2 function is iterated, the longer it takes to compute the password hash. 
Therefore, the iteration count SHOULD be as large as verification server 
performance will allow, typically at least 10,000 iterations.
   > 
   > So, my test is faster doing an apples-to-apples comparison, but would be 
slower when performing 10,000 iterations. We would still need the caching, IMO, 
since we are currently checking that the passwords match on each API call.
   
   Yours is not faster. The default is 5000, not 1000. Your test code uses 
1000. When I ran your test with 5000, Crypt was faster. When I pre-generate the 
salts for Crypt, like you did with the PBKDF2, then it's slightly faster still.
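
   For anyone repeating the comparison, the JCA side can be sketched with the 
iteration count as an explicit parameter, so it can be set to 5000 to match 
the Crypt default or to 10,000 per the NIST passage quoted above. The 
`Pbkdf2Sketch` class is hypothetical and is not the test code referred to in 
this thread; the 512-bit output length is an assumption:

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/** Hypothetical sketch; not the test code discussed in this issue. */
final class Pbkdf2Sketch {
  private static final int KEY_LENGTH_BITS = 512; // assumed output size

  /** Derives a hash from the password using PBKDF2WithHmacSHA512. */
  static byte[] hash(char[] password, byte[] salt, int iterations) {
    PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_LENGTH_BITS);
    try {
      return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512")
          .generateSecret(spec)
          .getEncoded();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("PBKDF2WithHmacSHA512 unavailable", e);
    } finally {
      spec.clearPassword();
    }
  }

  /** Recomputes the hash and compares it in constant time. */
  static boolean verify(char[] password, byte[] salt, int iterations, byte[] expected) {
    return MessageDigest.isEqual(hash(password, salt, iterations), expected);
  }

  public static void main(String[] args) {
    byte[] salt = "fixed-demo-salt!".getBytes(java.nio.charset.StandardCharsets.UTF_8);
    long start = System.nanoTime();
    hash("secret".toCharArray(), salt, 5000);
    System.out.println("5000 iterations took " + (System.nanoTime() - start) / 1_000_000 + " ms");
  }
}
```

   A fixed salt is used in `main` only to keep the timing run repeatable; real 
use would need a random per-password salt.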
   
   But the question remains: do we want fast and efficient checks, or do we 
want slow checks with large iteration counts? The original issue was about 
performance... but if the slower performance here is intentional, and the lock 
contention is an intentional throttle to protect against brute force attacks, 
perhaps there's no bug here to fix? Can we have both? Throttled, but with less 
CPU usage?


