ctubbsii commented on issue #2700: URL: https://github.com/apache/accumulo/issues/2700#issuecomment-1136386342
> > I don't want to do anything custom.
>
> To be clear I wasn't doing anything custom, `PBKDF2WithHmacSHA512` is a JCA supported algorithm.

Right, so, there's the algorithm, but then there's the serialization format. Crypt emits hashes in a well-known serialization format. We'd have to do something custom for the serialization, even if we're using a built-in algorithm. I'd prefer to stick to the crypt standard for serialization, and limit ourselves to the algorithms it supports, rather than use a different algorithm and customize the serialization.

> Looking at [CODEC-133](https://issues.apache.org/jira/browse/CODEC-133), the commons-codec classes implement the Linux crypt algorithm, but I'm not quite sure of it's lineage. There are several things mentioned in that ticket that seem to be merged together.

Linux's crypt evolved out of the need to use something stronger than the DES algorithm, but its output needed to be serialized in a backwards-compatible way, and to be human-readable since it was going in a text file. They came up with the well-known prefix `$<algorithm_id>$`. The format of everything after that prefix depends on the algorithm, but ultimately, the hash is encoded using printable ASCII characters. The SHA-2 algorithms support a rounds parameter (called a "CPU time cost parameter" in the man page) that ranges from 1000 to 999,999,999 and defaults to 5000, and a 6-96 bit salt that is base-64 encoded. If there are several things mentioned in the ticket, it's probably because there are multiple algorithms that are supported.

> > Something else to keep in mind: faster hashing makes brute force attacks more efficient. The fact that sha512crypt is slower isn't necessarily a bad thing. It would be nice if it didn't use as much CPU, though.
>
> Agreed, my test performed 1000 iterations to match what the commons codec code is doing by default. I don't think we are providing commons-codec with any number of iterations. NIST SP800-63B suggest doing 10,000 iterations, it says:

Even though the javadoc for Crypt incorrectly says the salt is specified without the `rounds=` parameter, you *can* specify that parameter to override the default of 5000. However, the Crypt API doesn't make this easy for us... we would have to randomly generate our own salt, base-64 encode it, and pass it in, rather than rely on the library generating a random salt for us. And their utility code to generate the salt and efficiently base-64 encode it is not public, so we'd have to rewrite it or copy it. It would be nice to have an upstream change that makes it easier to specify the rounds (and also to support newer algorithms, like yescrypt). commons-codec is a bit out of date in this regard.

> > For PBKDF2, the cost factor is an iteration count: the more times the PBKDF2 function is iterated, the longer it takes to compute the password hash. Therefore, the iteration count SHOULD be as large as verification server performance will allow, typically at least 10,000 iterations.
>
> So, my test is faster doing an apples-to-apples comparison, but would be slower when performing 10,000 iterations. We would still need the caching, IMO, since we are currently checking that the passwords match on each API call.

Yours is not faster. The default is 5000, not 1000. Your test code uses 1000. When I ran your test with 5000, Crypt was faster. When I pre-generate the salts for Crypt, like you did with PBKDF2, then it's slightly faster still.

But the question remains: do we want fast and efficient checks, or do we want slow checks with a large number of iterations? The original issue was about performance... but if the slowness here is intentional, and the lock contention is an intentional throttle that protects against brute force attacks, perhaps there's no bug here to fix? Can we have both? Throttled, but less CPU usage?
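To make the `rounds=` point concrete, here is a minimal sketch of generating a sha512crypt salt string with an explicit round count, using only the JDK. The class name `CryptSaltSketch` and the helper `sha512Salt` are hypothetical names invented for this illustration; they are not part of commons-codec or Accumulo. The sketch assumes the usual crypt layout `$6$rounds=<n>$<salt>` with up to 16 salt characters drawn from the crypt base-64 alphabet.

```java
import java.security.SecureRandom;

class CryptSaltSketch {
    // The 64-character alphabet crypt(3) uses for salts and hashes.
    private static final String B64 =
        "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Hypothetical helper: builds "$6$rounds=<rounds>$<salt>" for sha512crypt.
    static String sha512Salt(int rounds, int saltChars, SecureRandom rng) {
        if (rounds < 1000 || rounds > 999_999_999) {
            throw new IllegalArgumentException("rounds out of range: " + rounds);
        }
        if (saltChars < 1 || saltChars > 16) {
            throw new IllegalArgumentException("salt length out of range: " + saltChars);
        }
        StringBuilder sb = new StringBuilder("$6$rounds=").append(rounds).append('$');
        for (int i = 0; i < saltChars; i++) {
            // Each character carries 6 bits of randomness.
            sb.append(B64.charAt(rng.nextInt(B64.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String salt = sha512Salt(10_000, 16, new SecureRandom());
        System.out.println(salt);
        // Assumption: with commons-codec on the classpath, this string would be
        // passed as the salt argument, e.g. Crypt.crypt(passwordBytes, salt).
    }
}
```

Whether this is worth carrying in our own code, rather than waiting for an upstream change, is still the open question.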
