One of my current projects is revising the internal keystore API used within the Cryptech HSM software. One aspect of this is figuring out how to make better use of the keystore flash than we have to date. I've talked over some of the details of this with Fredrik and Paul, but wanted to share the discussion with a wider audience in case anybody has advice or sees serious problems with the approach we're considering.
We've talked about layering some kind of filesystem on top of the flash, whether that filesystem be something that understands flash wear leveling (e.g., UFFS, YAFFS) or something that doesn't (e.g., FATFS). While all of these are potentially doable, all also have potential issues (licensing, lack of wear leveling, code complexity). More to the point, a filesystem per se may not really be appropriate here in any case: what we really want is some kind of record storage, so we'd be layering that on top of a filesystem, adding still more code. Maybe we don't need all of that, and maybe we can do better.

Peter Gutmann advised us to look at PKCS #15, which is sort of an ASN.1-structured filesystem. We could do that, but it doesn't really address the underlying flash issues either, and itself appears to be layered on top of a sort of rudimentary proto-filesystem that's part of the underlying smartcard specification. Again, probably doable, but still seems more complex than we need. So it occurred to me that if we really just need a record store, maybe we should just write that. Hence the following.

Data sheet for the flash that (I think) is in the Alpha, for reference:

https://www.micron.com/~/media/documents/products/data-sheet/nor-flash/serial-nor/n25q/n25q_128mb_3v_65nm.pdf

For the current discussion, it's probably most convenient to think of the flash as a sequence of 4096 4KB sub-sectors, since that's the minimum erasable unit. As I think I understand it, flash wear is caused by the erase cycle, so the basic goal of wear leveling is to spread the erasure load somewhat evenly among the available sub-sectors. Fredrik tells me that, at least with the current Alpha hardware design, mapping the flash directly into the ARM's address space isn't practical, so reading or writing the flash looks like an I/O operation to the software.

So the basic proposal is just to store a fixed set of N records per sub-sector, where N may well end up being one, and not bother with a filesystem per se at all. Exactly how many records should be stored per sub-sector depends on the target key size:

* An 8192-bit RSA key in the current representation won't fit in 4KB, so although we could probably squeeze that a little smaller, we might need a scheme which allows us to chain multiple sub-sectors if we're serious about 8192-bit RSA (not sure we are -- 8192-bit RSA is painfully slow with the current hardware and software);

* We probably want to leave some room for things like storing a set of opaque attributes, so that we can move stuff like the PKCS #11 attributes currently stored on the host in an SQLite3 database into the HSM and get rid of the separate database.

For the moment, take it as read that the number of keys per sub-sector would be a small integer, most likely one, with the exact value to be chosen later. Whether all key blocks are the same size or we vary the size based on the key type is another question, TBD.

Managing this would be relatively simple: we'd have an in-memory index mapping key names (16 bytes) to the sub-sectors currently holding those keys, and an in-memory bitvector and cursor tracking free sub-sectors. New sub-sectors would be allocated round-robin, which seems like about the right level of work for simple wear leveling. All of this in-memory stuff would of course go away when the HSM reboots, but if we handle the flash correctly it should be straightforward to build the in-memory stuff by examining the flash at boot time.
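To make that a bit more concrete, here's roughly what the in-memory bookkeeping and the round-robin allocation might look like in C. This is only a sketch: all of the names, types, and sizes are placeholders, and it bakes in the one-key-block-per-sub-sector assumption.

/*
 * Very rough sketch -- all names, types, and sizes here are
 * placeholders, and this assumes one key block per sub-sector.
 */

#include <stdint.h>

#define NUM_SUBSECTORS  4096
#define SUBSECTOR_SIZE  4096
#define KEY_NAME_LEN    16

typedef struct {
  uint8_t  name[KEY_NAME_LEN];  /* 16-byte key name */
  uint32_t subsector;           /* sub-sector currently holding this key */
} ks_index_entry_t;

static ks_index_entry_t ks_index[NUM_SUBSECTORS];  /* kept sorted by name */
static unsigned         ks_index_len;

static uint32_t ks_free_map[NUM_SUBSECTORS / 32];  /* 1 bit per sub-sector, 1 = free */
static uint32_t ks_cursor;                          /* round-robin allocation cursor */

static int  ks_is_free(uint32_t n)   { return (ks_free_map[n / 32] >> (n % 32)) & 1; }
static void ks_mark_used(uint32_t n) { ks_free_map[n / 32] &= ~(1U << (n % 32)); }
static void ks_mark_free(uint32_t n) { ks_free_map[n / 32] |=  (1U << (n % 32)); }

/*
 * Allocate the next free sub-sector round-robin, starting from the
 * cursor.  Returns -1 if the flash is completely full.
 */
static int ks_allocate_subsector(uint32_t *n)
{
  for (uint32_t i = 0; i < NUM_SUBSECTORS; i++) {
    uint32_t candidate = (ks_cursor + i) % NUM_SUBSECTORS;
    if (ks_is_free(candidate)) {
      ks_mark_used(candidate);
      ks_cursor = (candidate + 1) % NUM_SUBSECTORS;
      *n = candidate;
      return 0;
    }
  }
  return -1;
}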
We probably want each flash sub-sector to start with a few header fields:

* A simple type code (PIN block, key block, perhaps overflow block if we need to chain multiple blocks to hold things larger than 4KB);

* A field indicating whether this sub-sector is pristine (all-ones, i.e., the state just after flash erasure), in use, or zeroed (no longer in use, but not yet erased);

* Field(s) to allow us to distinguish between data under construction and data that was fully written to flash (precise details vary depending on exactly how we lay things out, but the general form follows the pristine -> intermediate -> final model, and the state transitions have to be marked by zeroing bits, since setting bits would require erasing the sub-sector).

We can probably keep the index structures fairly simple: a 4096-entry index doesn't take up a lot of space, and given that we expect lookup to be much more common than insertion or deletion, something as simple as binary search of a sorted array may be perfectly adequate (I confess to a fondness for search algorithms that can be expressed in half a dozen lines of C code). If necessary, we can use hash buckets / balanced trees / .... No dynamic memory required: sizes and counts determined at compile time would tell the boot code how much SDRAM to grab.

The use of both "pristine" (all-ones) and "zeroed" (all-zeros) states above is not accidental. One of the minor issues in this design is how to figure out where to resume the round-robin rotation when rebuilding the in-memory stuff on reboot. Leaving the most recently discarded sub-sectors zeroed instead of erasing them immediately should make this fairly straightforward. There are some corner cases if we ever manage to completely fill the flash, so maybe we also need a timestamp field in the sub-sector headers, handwave, details to be worked out.

To make some of this more concrete, I've appended a few more rough C sketches below (sub-sector header, index lookup, boot-time scan); none of the names or values in them are final.

Anyway, that's the general idea under consideration. Comments?
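First, the sub-sector header. The particular type codes, field widths, and status values are just placeholders, and I've assumed the key name lives in the header so the boot-time scan can rebuild the index; the only real constraint is the one noted above, that every state transition has to get from one value to the next by clearing bits, never setting them.

/*
 * Sketch of a possible sub-sector header.  Erasure leaves the whole
 * sub-sector all-ones, so each status value below is reachable from
 * the previous one purely by zeroing bits.
 */

#include <stdint.h>

#define BLOCK_TYPE_PIN        0x50494e00   /* PIN block */
#define BLOCK_TYPE_KEY        0x4b455900   /* key block */
#define BLOCK_TYPE_OVERFLOW   0x4f565200   /* continuation of a chained block */

#define BLOCK_STATUS_PRISTINE 0xFFFFFFFF   /* fresh from erasure */
#define BLOCK_STATUS_WRITING  0xFFFF0000   /* header written, payload in progress */
#define BLOCK_STATUS_LIVE     0xFF000000   /* payload completely written */
#define BLOCK_STATUS_ZEROED   0x00000000   /* discarded, not yet erased */

typedef struct {
  uint32_t block_type;     /* what kind of block this is */
  uint32_t block_status;   /* pristine -> writing -> live -> zeroed */
  uint8_t  name[16];       /* key name, so the boot scan can rebuild the index */
  /* ... lengths, attribute offsets, maybe a timestamp, TBD ... */
} ks_block_header_t;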
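Next, the half-dozen-line lookup, using the sorted ks_index array and KEY_NAME_LEN placeholder from the earlier sketch:

#include <string.h>

/*
 * Binary search of the sorted in-memory index.  Returns the matching
 * entry, or NULL if the named key isn't present.
 */
static ks_index_entry_t *ks_find(const uint8_t name[KEY_NAME_LEN])
{
  int lo = 0, hi = (int) ks_index_len - 1;

  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    int cmp = memcmp(name, ks_index[mid].name, KEY_NAME_LEN);
    if (cmp == 0)
      return &ks_index[mid];
    if (cmp < 0)
      hi = mid - 1;
    else
      lo = mid + 1;
  }
  return NULL;
}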
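And finally the boot-time scan that rebuilds the in-memory state. ks_flash_read_header() and ks_index_add() are stand-ins for whatever the real flash-read and sorted-insert routines end up being, and the cursor-resume logic is the naive version that ignores the wrap-around and flash-full corner cases mentioned above (which is where the hypothetical timestamp field would come in).

/* Placeholders for the real flash I/O and sorted-insert routines. */
extern void ks_flash_read_header(uint32_t n, ks_block_header_t *hdr);
extern void ks_index_add(const uint8_t *name, uint32_t n);

/*
 * Boot-time rebuild of the in-memory index, free map, and round-robin
 * cursor by scanning every sub-sector header.
 */
static void ks_rebuild(void)
{
  uint32_t last_zeroed = 0;
  int      seen_zeroed = 0;

  ks_index_len = 0;

  for (uint32_t n = 0; n < NUM_SUBSECTORS; n++) {
    ks_block_header_t hdr;

    ks_flash_read_header(n, &hdr);

    if (hdr.block_status == BLOCK_STATUS_LIVE) {
      ks_mark_used(n);
      ks_index_add(hdr.name, n);        /* sorted insert into ks_index */
    }
    else {
      /*
       * Pristine, zeroed, and half-written blocks are all available
       * for reallocation (the non-pristine ones need erasing first).
       */
      ks_mark_free(n);
      if (hdr.block_status == BLOCK_STATUS_ZEROED) {
        last_zeroed = n;
        seen_zeroed = 1;
      }
    }
  }

  /*
   * Naive cursor resume: start just past the highest-numbered zeroed
   * sub-sector, on the theory that zeroed-but-not-yet-erased blocks
   * are the ones discarded most recently.  Doesn't handle wrap-around
   * or a completely full flash; details to be worked out.
   */
  ks_cursor = seen_zeroed ? (last_zeroed + 1) % NUM_SUBSECTORS : 0;
}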