Hi all,

I've been working on a proposal for how we might allow RedBoot to use a NAND array to provide the FIS.
I've included the current version of my proposed design, although following the conversation with Simon this week it has occurred to me that it might well be reasonable to fillet out the core idea and make it available as a simple proto-filesystem. (Essentially it would be a block device with very large blocks. The high-level operations provided would be reading and writing up to a blockful, and erasing a whole block; this doesn't seem to fit well with the eCos block device model, so I suspect it would be better off as its own interface.)

Please do speak up if that would be a useful development, and if so I'll happily rework this proposal into two parts as time allows.

Ross

============================================================================
NOR-in-NAND design (3/7/09)

1. Scenario and assumptions:

We want RedBoot to be able to use NAND flash as if it were NOR flash, for FIS and fconfig. We don't care about supporting non-RB apps, so can rely on RB's behaviour.

Analysis:
* fconfig-in-FIS always calls FIS code.
* fconfig not in FIS always erases before programming.
* FIS always erases before it programs, only deals with block-aligned regions, and never rewrites within such a region. [+]

Design goals:
* Clean and simple, in keeping with the eCos philosophy.
* Robust: copes with blocks going bad and indulges in some sort of wear-levelling.
* No use of malloc. (As we're targeting only RB, we can use the workspace.)

[+] This isn't 100% true; if the CYGOPT_REDBOOT_REDUNDANT_FIS option is enabled, it rewrites two bytes (its status flag) in the redundant copy of the FIS block after writing out the data. We can provide suitable redundant-write logic in the NOR-in-NAND driver; we can get this project going initially by not allowing its combination with REDUNDANT_FIS. Later, if desired, we could nobble the REDUNDANT_FIS code to not do its page-rewrite and to drive the NOR-in-NAND layer so as to take advantage of its redundancy if the NOR-in-NAND option was set.
(Or, more flexibly, it could do so if the flash driver were to set an as-yet-unspecified bit of its as-yet-nonexistent flags word to say that it did redundant writes; but this is perhaps wishful thinking.)

2. High level design

The NAND will present as a NOR device with block_size the same as the NAND eraseblock size[#]. We will use a log-like storage mechanism with block-level granularity.

[#] Nick pointed out that this could be souped up with an option to combine multiple NAND blocks into a single logical block.

The logical blocks that RedBoot thinks it is using will be internally numbered from zero and mapped to a cyg_flash_addr:

    flash_addr := logical_block_number * block_size + flash_base

where flash_base is a suitable hex-round number defined by the platform HAL to keep it away from any existing NOR devices.

The number of blocks available - i.e. (1 + maximum valid logical block number) - will be computed as:
* Number of physical blocks in the chip or partition,
* minus the number of factory bad blocks,
* minus one (to allow for robust block rewrites),
* minus an allowance for bad blocks to develop during the life of the device.

The allowance for blocks going bad will be computed as four blocks, plus a configurable % of the number of blocks in the NAND partition (default 1%; rounded up).

The NAND library will need to be enhanced to cope with part-page writes and reads, in particular getting the ECC right on them. (While we're at it, we could also enhance it to get the ECC right on multiple non-overlapping writes to a page, but this is harder, and seems unnecessary in this case. Whatever we do, we need to document the precise cases which are expected to work and to not work.)

The NOR-in-NAND layer will be making multiple-page reads and writes; it may be worthwhile to put code to do this into the NAND layer, as opposed to forcing all users to reinvent the wheel.

2.1 Physical block storage

NAND blocks are used as a dumb datastore, with their physical addresses bearing no relation to their logical addresses. In-use blocks are tagged in the OOB area with the logical block number they refer to (see below).

Physical storage blocks are used sequentially from the beginning of the device, then cycling back to the start to reuse erased blocks. This is managed by maintaining a next-write "pointer". At runtime, this is the next vacant block after the last written block; at boot time, it is initialised by scanning the filesystem to find the block with the highest serial number, which is taken to be the latest-written block.

This scheme is much better than a simple block mapping due to its robustness: it intrinsically provides reasonable wear-levelling, and is very robust against physical blocks going bad. Provided the number of blocks going bad in the field stays within the allowance above (four blocks plus 1% of the partition by default), no operation should ever fail from the point of view of the caller, even with a full FIS.

In essence this is a nod towards log-structured filesystems, but with very simple addressing.

2.1.1 OOB tag format

The tag is written into the OOB area of the first page of each block (as "application" OOB data, from the NAND library's point of view).

The tag is a packed byte array, in processor-local endian, with the following contents:
* Magic number - 2 bytes, 0xEF15. (This is a compile-time constant and demonstrates that the block is one of ours. It's a corruption of "eCos FIS".)
* Logical block number - 2 bytes.
* Master serial number - 4 bytes (see below).

We deliberately keep the tag size to eight packed bytes so as to conveniently support devices with 512-byte pages, which have 16 OOB bytes per page, of which 8 are available for application use. (If desired, 256-byte page devices could be supported by spreading the tag over the first two pages of the block. This adds complexity, IMHO needlessly as those devices are firmly obsolete.)
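To make the tag layout concrete, here is a rough sketch of packing and unpacking the eight tag bytes. It is illustrative only and not part of the proposal: the names nin_tag, nin_tag_pack, nin_tag_unpack and nin_tag_kind are hypothetical, and the empty/ours/foreign classification follows the magic-number check described under 2.2. memcpy is used so that the fields keep processor-local byte order, as specified above.

```c
#include <stdint.h>
#include <string.h>

#define NIN_TAG_MAGIC 0xEF15u  /* magic number from the proposal */
#define NIN_TAG_SIZE  8        /* 2 (magic) + 2 (logical block) + 4 (serial) */

/* Hypothetical in-memory form of the OOB tag. */
typedef struct {
    uint16_t magic;
    uint16_t logical_block;
    uint32_t serial;
} nin_tag;

/* Hypothetical classification of a block, based on its magic number. */
typedef enum { NIN_TAG_EMPTY, NIN_TAG_OURS, NIN_TAG_FOREIGN } nin_tag_kind;

/* Pack the tag into 8 OOB bytes, in processor-local byte order. */
void nin_tag_pack(const nin_tag *t, uint8_t oob[NIN_TAG_SIZE])
{
    memcpy(oob + 0, &t->magic, 2);
    memcpy(oob + 2, &t->logical_block, 2);
    memcpy(oob + 4, &t->serial, 4);
}

/* Unpack 8 OOB bytes and classify the block by its magic number. */
nin_tag_kind nin_tag_unpack(const uint8_t oob[NIN_TAG_SIZE], nin_tag *t)
{
    memcpy(&t->magic, oob + 0, 2);
    memcpy(&t->logical_block, oob + 2, 2);
    memcpy(&t->serial, oob + 4, 4);

    if (t->magic == 0xFFFFu)
        return NIN_TAG_EMPTY;    /* erased block: magic reads as all-0xFF */
    if (t->magic == NIN_TAG_MAGIC)
        return NIN_TAG_OURS;
    return NIN_TAG_FOREIGN;      /* something odd: report and ignore */
}
```

A real driver would read and write these bytes through the NAND library's application-OOB interface, which this sketch does not attempt to model.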

2.1.2 Master serial number

This is a 4 byte counter which increments on every write. At boot time, the array is scanned and the serial number is initialised to be one greater than the largest serial number found in the array.

We assume that 2^32 writes are sufficient and that we will never have to worry about overflow.

(Justification: The typical 1Gbit chip on my desk has 1024 blocks and is rated for 100,000 write/erase cycles. 1.024e8 max serials versus 4.295e9 capacity represents a safety margin of approx 42x. This will have to be rethought should chips with significantly longer lifespans and more blocks appear. To be safe, we should impose an upper limit on the number of physical NAND blocks that this system will use, and hence cap the number of logical NOR blocks the system will support. I suggest 1024, which ought to be enough for anybody; it's more than many (most?) NOR chips.)

2.2 Flash operations

All operations which read NAND blocks must check the data tags. A magic number of all-0xFF means the block is empty; if it is anything other than that or the expected magic number, something Odd is going on and we will report and ignore the block.

2.2.1 Query flash

This simply sets up the flash_info struct in the standard way.

2.2.2 Read from flash

Convert the flash address to a logical block number, call this B. Find blocks containing data for block B. Switch on how many there were:
* if 0: Synthesise an all-FFs return.
* if 1: Return data from the pages of that block as appropriate.
* if 2: The block with the higher serial number was probably incompletely written, so erase it. Proceed as for case 1.
* if >2: Something strange has happened. Erase the block(s) with the lowest serial numbers until two remain, then proceed as for case 2.

Iterate the above as necessary given the size of the read and the NAND page/block size.

Read errors (i.e. NAND layer reported ECC uncorrectable) are not expected; if one does occur, we do our best and report back.
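
The read-side case analysis above boils down to "if more than one physical block claims B, the survivor is the one with the second-highest serial number". A minimal sketch of that decision follows, assuming the caller has already scanned the array and collected the candidates for B. The names nin_candidate, nin_resolve_read and NIN_NO_BLOCK are made up for illustration; the function only decides, it does not touch the NAND.

```c
#include <stddef.h>
#include <stdint.h>

#define NIN_NO_BLOCK ((size_t)-1)  /* hypothetical "no candidate" marker */

/* Hypothetical record of one physical block tagged with logical block B. */
typedef struct {
    uint32_t phys_block;
    uint32_t serial;
} nin_candidate;

/*
 * Decide which candidate to read from. Returns an index into c[], or
 * NIN_NO_BLOCK if there are none (the caller then synthesises all-0xFF).
 * On return, erase[i] is 1 for every candidate that should be erased.
 * Serial numbers are assumed unique, since the counter increments per write.
 */
size_t nin_resolve_read(const nin_candidate *c, size_t n, uint8_t *erase)
{
    size_t hi = NIN_NO_BLOCK;      /* index of highest serial seen */
    size_t second = NIN_NO_BLOCK;  /* index of second-highest serial seen */

    for (size_t i = 0; i < n; i++) {
        erase[i] = 0;
        if (hi == NIN_NO_BLOCK || c[i].serial > c[hi].serial) {
            second = hi;
            hi = i;
        } else if (second == NIN_NO_BLOCK || c[i].serial > c[second].serial) {
            second = i;
        }
    }

    if (n == 0)
        return NIN_NO_BLOCK;       /* case 0 */
    if (n == 1)
        return hi;                 /* case 1 */

    /* Cases 2 and >2: keep only the second-highest serial. */
    for (size_t i = 0; i < n; i++)
        if (i != second)
            erase[i] = 1;
    return second;
}
```

Keeping the decision separate from the erase calls is just a choice made for this sketch; it makes the logic easy to exercise without hardware.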

2.2.3 Program flash

Convert the flash address to a logical block number, call this B. Find blocks containing data for block B. Switch on how many there were:
* if 0: write and tag the programmed data into the physical block pointed to by the next-write pointer, then increment that pointer until it points to an empty block, and increment the serial number counter.
* if 1: we are doing a redundant write. Write and tag the programmed data - as for case 0 - then erase the older block.
* if >1: Something strange has happened. Erase all existing blocks for block B except for the one with the second-highest serial number, then treat as case 1.

For robustness, by default we will perform a read-back of all pages as we write them, but this can be switched off with a CDL option.

If a write fails, or fails to read-back correctly: mark the physical block as bad, increment the next-write pointer until it points to an empty block, then retry.

Iterate the above as necessary given the size of the write and the NAND page/block size.

2.2.4 Erase flash region (given base address and erase size)

Convert the flash address to a logical block number, call this B. Seek and erase all physical blocks bearing data for block B. If the erase length was larger than the block size, repeat as necessary for subsequent logical blocks.

If the NAND layer reports an erase failure, the block will already have been marked as bad (by the NAND layer), so we need do nothing (except perhaps whingeing about it).

To improve robustness, we will check that the tags area of a physical block we've just erased is all-0xFF; if not, we shall retry the erase once, and if that fails, mark the block as bad.

2.2.5 Block locking

These operations are not supported; they are provided as no-ops, and CYG_FLASH_FUNS does the right thing.

2.3 Handling of bad blocks

When choosing where to write, marked-bad blocks are skipped over. When scanning the array for a block, marked-bad blocks are similarly ignored.
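
As a sketch of the skipping rule when choosing where to write, the next-write pointer could be advanced as below. This assumes, purely for illustration, that the per-block state is available as an array; in practice it would come from the bad block table and the OOB tags. nin_blk_state and nin_next_write are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-block state, as seen by the NOR-in-NAND layer. */
typedef enum { NIN_BLK_EMPTY, NIN_BLK_IN_USE, NIN_BLK_BAD } nin_blk_state;

/*
 * Advance the next-write pointer: starting just after `start`, find the
 * next block that is empty (neither in use nor marked bad), wrapping
 * round to the beginning of the partition. Returns the block index, or
 * nblocks if no empty block exists.
 */
size_t nin_next_write(const nin_blk_state *state, size_t nblocks, size_t start)
{
    for (size_t step = 1; step <= nblocks; step++) {
        size_t b = (start + step) % nblocks;
        if (state[b] == NIN_BLK_EMPTY)
            return b;              /* bad and in-use blocks are skipped */
    }
    return nblocks;                /* partition has no empty block */
}
```

With states { in use, bad, empty, in use } and the pointer at block 3, the search wraps past blocks 0 and 1 and settles on block 2.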

The number of logical blocks in the array must never be changed after its first use. Therefore it is essential that nothing, other than the one-shot initial setup of the bad block table, ever marks a block used for this scheme as factory-bad. This should be a safe assumption, as to do so would require "forbidden" use of NAND internals.

2.4 Runtime considerations

If a runtime performance boost was required, the system could on startup scan the NAND partition and build up a cached mapping in RAM of the physical addresses of each (non-zeroed) logical block. However, we don't expect this code will be used on gigantic NAND arrays [partitions], and it should only see light duty via RedBoot. Therefore it won't take long to linear-scan for blocks during operations, so this optimisation may not be worth its complexity.

(Bart suggested that a scan at startup could also take care of the cases above where more than one physical block is tagged with the same logical block number, hence reducing the time and complexity of the block access code. There's a startup time vs access time vs memory trade-off here, though we haven't fully analysed it. Nick agreed that we ought to think this through very carefully.)

============================================================================

--
Embedded Software Engineer, eCosCentric Limited.
Barnwell House, Barnwell Drive, Cambridge CB5 8UU, UK.
Registered in England no. 4422071.
www.ecoscentric.com