In the course of some troubleshooting I happened to look at the sha.c code, which creates an SHA1 message digest. This code did not originate with Evergreen; we presumably got it somewhere off the net. It bears the names of several contributors unknown to me. I have found it, or at least versions of it, via Google, but I don't know where it came from originally.
This code includes a call to malloc() that disturbs me, for two reasons:

1. It doesn't check the return value for NULL.

2. It is completely unnecessary. We can allocate the same space on the stack without incurring the overhead of malloc() and free().

I can remove the malloc(), and I will do so whenever I get around to it.

However, there is another issue that will be harder to address. The code is not portable, because it assumes that ints are always 32 bits wide. This assumption appears to be deeply embedded in the algorithm.

I considered changing every int to an int32_t so as to specify its width in a portable way. However, I'm not convinced that this change would reliably fix the problem. The algorithm uses arrays of ints, and assumes that they are contiguous in memory. Specifically, it uses a cast to redefine a 64-byte buffer as an array of 16 ints. So far as I understand, the compiler is allowed to align 32-bit ints on 64-bit boundaries, leaving padding bytes between them. In that case the code will break even if you use int32_t.

If the code does break, we may not notice. We compare hashes created by sha.c to other hashes created by the same code. If they're both wrong in the same way, they'll still match -- but they're still wrong, and the chance of a spurious match will be greater than it should be.

I expect to go shopping for an implementation of SHA1 that works on any architecture, regardless of int size or endianness. Naturally its licensing must be compatible with the GPL.

This issue will likely simmer on the back burner for a while, if not forever, but I'm open to suggestions.

Scott McKellar
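To illustrate the malloc() point, here is a minimal sketch of the pattern. The function and buffer names are mine, invented for illustration -- this is not the actual sha.c code -- but the shape of the problem and the stack-based fix are as described above:

```c
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 64  /* SHA1 processes the message in 64-byte blocks */

/* The troubling pattern: a fixed-size scratch buffer from malloc(),
   with the return value never checked for NULL. */
unsigned int checksum_heap(const unsigned char *block)
{
    unsigned char *scratch = malloc(BLOCK_SIZE);  /* may return NULL... */
    unsigned int sum = 0;
    size_t i;

    memcpy(scratch, block, BLOCK_SIZE);           /* ...and crash here */
    for (i = 0; i < BLOCK_SIZE; i++)
        sum += scratch[i];
    free(scratch);
    return sum;
}

/* Because the size is a compile-time constant, the same space can live
   on the stack, eliminating both the unchecked failure mode and the
   overhead of malloc() and free(). */
unsigned int checksum_stack(const unsigned char *block)
{
    unsigned char scratch[BLOCK_SIZE];
    unsigned int sum = 0;
    size_t i;

    memcpy(scratch, block, BLOCK_SIZE);
    for (i = 0; i < BLOCK_SIZE; i++)
        sum += scratch[i];
    return sum;
}
```

Both functions compute the same result; only the lifetime and failure behavior of the scratch buffer differ.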

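On the portability point: the cast that redefines the 64-byte buffer as an array of 16 ints can be avoided entirely by assembling each 32-bit word from individual bytes with shifts. A minimal sketch of that technique follows; the helper names are mine, not sha.c's, and I am assuming SHA1's big-endian word order. Built this way, the result does not depend on the host's int width, alignment rules, or endianness:

```c
#include <stdint.h>
#include <stddef.h>

/* Assemble one 32-bit big-endian word from four bytes.  This works on
   any host, regardless of int size or byte order, unlike the cast
   (uint32_t *)buffer, whose result varies with the architecture. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |
           ((uint32_t)p[3]);
}

/* Expand a 64-byte block into the sixteen 32-bit words the algorithm
   operates on, without any pointer casting. */
static void block_to_words(const unsigned char block[64], uint32_t w[16])
{
    size_t i;
    for (i = 0; i < 16; i++)
        w[i] = load_be32(block + 4 * i);
}
```

Any off-the-shelf implementation worth adopting should be doing something like this internally, so it may also serve as a checklist item when shopping for one.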