NAND "FIS" in RedBoot proposal

Ross Younger Fri, 03 Jul 2009 03:17:53 -0700

Hi all,

I've been working on a proposal for how we might allow RedBoot to use a NAND
array to provide the FIS.


I've included the current version of my proposed design, although following
the conversation with Simon this week it has occurred to me that it might
well be reasonable to fillet out the core idea and make it available as a
simple proto-filesystem. (Essentially it would be a block device with very
large blocks. The high-level operations provided would be reading and
writing up to a blockful, and erasing a whole block; this doesn't seem to
fit well with the eCos block device model, so I suspect it would be better
off as its own interface.) Please do speak up if that would be a useful
development, and if so I'll happily rework this proposal into two parts as
time allows.


Ross


============================================================================

NOR-in-NAND design (3/7/09)

1. Scenario and assumptions:

We want RedBoot to be able to use NAND flash as if it was NOR flash,
for FIS and fconfig.

We don't care about supporting non-RB apps, so can rely on RB's behaviour.

Analysis:
* fconfig-in-FIS always calls FIS code.
* fconfig not in FIS always erases before programming.
* FIS always erases before it programs, only deals with block-aligned
regions, and never rewrites within such a region. [+]

Design goals:
* Clean and simple, in keeping with the eCos philosophy.
* Robust: copes with blocks going bad and indulges in some sort of
wear-levelling.
* No use of malloc. (As we're targeting only RB, we can use the workspace.)

[+] This isn't 100% true; if CYGOPT_REDBOOT_REDUNDANT_FIS option is
enabled, it rewrites two bytes (its status flag) in the redundant copy
of the FIS block after writing out the data. We can provide suitable
redundant-write logic in the NOR-in-NAND driver; we can get this project
going initially by not allowing its combination with REDUNDANT_FIS. Later,
if desired, we could nobble the REDUNDANT_FIS code to not do its
page-rewrite and to drive the NOR-in-NAND layer so as to take advantage
of its redundancy if the NOR-in-NAND option was set. (Or, more flexibly,
it could do so if the flash driver were to set an as-yet-unspecified bit
of its as-yet-nonexistent flags word to say that it did redundant writes;
but this is perhaps wishful thinking.)

2. High level design

The NAND will present as a NOR device with block_size the same as the
NAND eraseblock size[#]. We will use a log-like storage mechanism with
block-level granularity.

[#] Nick pointed out that this could be souped up with an option to
combine multiple NAND blocks into a single logical block.

The logical blocks that RedBoot thinks it is using will be internally
numbered from zero and mapped to a cyg_flash_addr:
        flash_addr := logical_block_number * block_size + flash_base
where flash_base is a suitable hex-round number defined by the platform
HAL to keep it away from any existing NOR devices.
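
As a rough illustration of the mapping above, here is a sketch in C. The
function names and the flash_addr_t typedef are invented for this example
(a real driver would use the eCos flash address type), and the base
address is whatever the platform HAL chooses.

```c
#include <stdint.h>

/* Stand-in for the eCos flash address type; an assumption for this sketch. */
typedef uint32_t flash_addr_t;

/* Logical block number -> address presented to RedBoot. */
static flash_addr_t nin_block_to_addr(uint32_t logical_block,
                                      uint32_t block_size,
                                      flash_addr_t flash_base)
{
    return flash_base + (flash_addr_t)logical_block * block_size;
}

/* Address presented to RedBoot -> logical block number.
 * Any offset within the block is discarded by the division. */
static uint32_t nin_addr_to_block(flash_addr_t addr,
                                  uint32_t block_size,
                                  flash_addr_t flash_base)
{
    return (addr - flash_base) / block_size;
}
```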

The number of blocks available - i.e. (1 + maximum valid logical block
number) - will be computed as:
        * Number of physical blocks in the chip or partition,
        * minus the number of factory bad blocks,
        * minus one (to allow for robust block rewrites),
        * minus an allowance for bad blocks to develop during the life
        of the device.

The allowance for blocks going bad will be computed as four blocks,
plus a configurable % of the number of blocks in the NAND partition
(default 1%; rounded up).
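
The block-count arithmetic above might look something like the following
sketch. The function names are hypothetical, and returning zero for a
partition too small to be useful is my assumption, not part of the design.

```c
#include <stdint.h>

/* Allowance for blocks going bad in the field: four blocks plus a
 * percentage of the partition, rounded up. */
static uint32_t nin_bad_block_allowance(uint32_t physical_blocks,
                                        uint32_t percent)
{
    return 4 + (physical_blocks * percent + 99) / 100;
}

/* Number of logical blocks presented to RedBoot.
 * Returns 0 if the partition is too small (assumed behaviour). */
static uint32_t nin_logical_block_count(uint32_t physical_blocks,
                                        uint32_t factory_bad,
                                        uint32_t percent)
{
    /* factory bad blocks + one spare for rewrites + in-field allowance */
    uint32_t overhead = factory_bad + 1
                      + nin_bad_block_allowance(physical_blocks, percent);

    return (physical_blocks > overhead) ? physical_blocks - overhead : 0;
}
```

For example, a 1024-block partition with two factory bad blocks and the
default 1% gives an allowance of 15 blocks and 1006 logical blocks.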

The NAND library will need to be enhanced to cope with part-page
writes and reads, in particular getting the ECC right on them. (While
we're at it, we could also enhance it to get the ECC right on multiple
non-overlapping writes to a page, but this is harder, and seems
unnecessary in this case. Whatever we do, we need to document the precise
cases which are expected to work and to not work.)

The NOR-in-NAND layer will be making multiple-page reads and writes;
it may be worthwhile to put code to do this into the NAND layer, as
opposed to forcing all users to reinvent the wheel.

2.1. Physical block storage

NAND blocks are used as a dumb datastore, with their physical addresses
bearing no relation to their logical addresses. In-use blocks are tagged
in the OOB area with the logical block number they refer to (see below).

Physical storage blocks are used sequentially from the beginning of
the device, then cycling back to the start to reuse erased blocks. This
is managed by maintaining a next-write "pointer". At runtime, this is
the next vacant block after the last written block; at boot time, it
is initialised by scanning the filesystem to find the block with the
highest serial number, which is taken to be the latest-written block.

This scheme is much better than a simple block mapping due to its
robustness: it intrinsically provides reasonable wear-levelling, and is
very robust against physical blocks going bad. Provided the number of blocks
that go bad in the field stays within the allowance above (four blocks
plus 1% of the partition by default), no operation should ever fail from
the point of view of the caller, even with a full FIS.

In essence this is a nod towards log-structured filesystems, but with
very simple addressing.

2.1.1 OOB tag format

The tag is written into the OOB area of the first page of each block
(as "application" OOB data, from the NAND library's point of view).

The tag is a packed byte array, in processor-local endian, with the
following contents:

        * Magic number - 2 bytes, 0xEF15. (This is a compile-time constant
        and demonstrates that the block is one of ours. It's a corruption
        of "eCos FIS".)
        * Logical block number - 2 bytes.
        * Master serial number - 4 bytes (see below).

We deliberately keep the tag size to eight packed bytes so as to
conveniently support devices with 512-byte pages, which have 16 OOB bytes
per page, of which 8 are available for application use. (If desired,
256-byte page devices could be supported by spreading the tag over the
first two pages of the block. This adds complexity, IMHO needlessly as
those devices are firmly obsolete.)
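
To make the layout concrete, here is one way the tag could be packed and
unpacked. The struct and function names are made up for this sketch; only
the field sizes, their order and the magic value come from the text above.

```c
#include <stdint.h>
#include <string.h>

#define NIN_TAG_MAGIC 0xEF15u
#define NIN_TAG_SIZE  8

/* In-memory form of the tag; names are hypothetical. */
struct nin_tag {
    uint16_t magic;          /* NIN_TAG_MAGIC if the block is ours */
    uint16_t logical_block;  /* logical block number */
    uint32_t serial;         /* master serial number */
};

/* Pack into eight bytes, each field in processor-local byte order.
 * memcpy avoids relying on compiler struct padding rules. */
static void nin_tag_pack(const struct nin_tag *t, uint8_t out[NIN_TAG_SIZE])
{
    memcpy(out + 0, &t->magic, 2);
    memcpy(out + 2, &t->logical_block, 2);
    memcpy(out + 4, &t->serial, 4);
}

/* Inverse of nin_tag_pack. */
static void nin_tag_unpack(const uint8_t in[NIN_TAG_SIZE], struct nin_tag *t)
{
    memcpy(&t->magic, in + 0, 2);
    memcpy(&t->logical_block, in + 2, 2);
    memcpy(&t->serial, in + 4, 4);
}
```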

2.1.2 Master serial number

This is a 4 byte counter which increments on every write. At boot time,
the array is scanned and the serial number is initialised to be one
greater than the largest serial number found in the array.
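
A boot-time scan along these lines could set up both the serial counter
and the next-write pointer. This is only a sketch over a simplified in-RAM
view of the tags: struct nin_phys, struct nin_state and nin_boot_scan are
hypothetical names, and starting the counter at zero on a completely
empty array is my assumption.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC   0xEF15u
#define NIN_MAGIC_EMPTY 0xFFFFu

/* Simplified view of one physical block's tag; hypothetical. */
struct nin_phys {
    uint16_t magic;          /* 0xFFFF when erased */
    uint16_t logical_block;
    uint32_t serial;
    int      bad;            /* non-zero if marked bad */
};

struct nin_state {
    uint32_t next_serial;    /* serial for the next write */
    uint32_t next_write;     /* physical index of the next vacant block */
};

/* Returns 0 on success, -1 if there is no vacant block at all. */
static int nin_boot_scan(const struct nin_phys *blk, uint32_t n,
                         struct nin_state *st)
{
    uint32_t i, start, newest = 0, max_serial = 0;
    int found = 0;

    /* Find the latest-written block: the one with the highest serial. */
    for (i = 0; i < n; i++) {
        if (blk[i].bad || blk[i].magic != NIN_TAG_MAGIC)
            continue;
        if (!found || blk[i].serial > max_serial) {
            max_serial = blk[i].serial;
            newest = i;
            found = 1;
        }
    }
    st->next_serial = found ? max_serial + 1 : 0;  /* 0 is an assumption */

    /* Next vacant block after the newest one, wrapping round. */
    start = found ? newest + 1 : 0;
    for (i = 0; i < n; i++) {
        uint32_t p = (start + i) % n;
        if (!blk[p].bad && blk[p].magic == NIN_MAGIC_EMPTY) {
            st->next_write = p;
            return 0;
        }
    }
    return -1;
}
```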

We assume that 2^32 writes are sufficient and that we will never have
to worry about overflow. (Justification: The typical 1Gbit chip on my
desk has 1024 blocks and is rated for 100,000 write/erase cycles. 1.024e8
max serials versus 4.295e9 capacity represents a safety margin of approx
42x. This will have to be rethought should chips with significantly longer
lifespans and more blocks appear.  To be safe, we should impose an upper
limit on the number of physical NAND blocks that this system will use,
and hence cap the number of logical NOR blocks the system will support. I
suggest 1024, which ought to be enough for anybody; it's more than many
(most?) NOR chips.)

2.2 Flash operations

All operations which read NAND blocks must check the data tags.

A magic number of all-0xFF means the block is empty; if it is anything
else other than the expected magic number, something Odd is going on
and we will report and ignore the block.
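
In code the three-way check on the magic number is tiny; the enum and
function names here are invented for illustration.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC   0xEF15u
#define NIN_MAGIC_EMPTY 0xFFFFu

enum nin_tag_kind {
    NIN_TAG_EMPTY,  /* erased block */
    NIN_TAG_OURS,   /* carries one of our logical blocks */
    NIN_TAG_ODD     /* unexpected contents: report and ignore */
};

static enum nin_tag_kind nin_classify_magic(uint16_t magic)
{
    if (magic == NIN_MAGIC_EMPTY)
        return NIN_TAG_EMPTY;
    if (magic == NIN_TAG_MAGIC)
        return NIN_TAG_OURS;
    return NIN_TAG_ODD;
}
```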

2.2.1 Query flash
This simply sets up the flash_info struct in the standard way.

2.2.2 Read from flash
Convert the flash address to a logical block number, call this B.
Find blocks containing data for block B. Switch on how many there were:
*       if 0: Synthesise an all-FFs return.
*       if 1: Return data from the pages of that block as appropriate
*       if 2: The block with higher serial number was probably
        incompletely written, so erase it. Proceed as for case 1.
*       if >2: Something strange has happened. Erase the block(s) with
        lowest serial numbers until two remain, then proceed as for
        case 2.

Iterate the above as necessary given the size of the read and the NAND
page/block size.

Read errors (i.e. NAND layer reported ECC uncorrectable) are not expected;
if one does occur, we do our best and report back.
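
The case analysis for reads can be sketched as below, again over a
simplified in-RAM view of the tags rather than real NAND. All names are
hypothetical, and nin_sim_erase merely resets the tag where a real driver
would call the NAND layer's erase.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC 0xEF15u

struct nin_phys {
    uint16_t magic, logical_block;
    uint32_t serial;
    int bad;
};

/* Stand-in for a NAND block erase: resets the in-RAM tag only. */
static void nin_sim_erase(struct nin_phys *p)
{
    p->magic = 0xFFFFu;
    p->logical_block = 0xFFFFu;
    p->serial = 0xFFFFFFFFu;
}

static int nin_holds(const struct nin_phys *p, uint16_t b)
{
    return !p->bad && p->magic == NIN_TAG_MAGIC && p->logical_block == b;
}

/* Returns the physical index to read from, or -1 if logical block b
 * has no data (the caller then synthesises all-0xFF). */
static int nin_resolve_for_read(struct nin_phys *blk, uint32_t n, uint16_t b)
{
    for (;;) {
        uint32_t i, count = 0;
        int lo = -1, hi = -1;

        for (i = 0; i < n; i++) {
            if (!nin_holds(&blk[i], b))
                continue;
            count++;
            if (lo < 0 || blk[i].serial < blk[lo].serial)
                lo = (int)i;
            if (hi < 0 || blk[i].serial > blk[hi].serial)
                hi = (int)i;
        }
        if (count == 0)
            return -1;
        if (count == 1)
            return lo;
        if (count == 2) {
            /* Newer copy was probably incompletely written. */
            nin_sim_erase(&blk[hi]);
            return lo;
        }
        /* More than two: drop the oldest and look again. */
        nin_sim_erase(&blk[lo]);
    }
}
```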

2.2.3 Program flash
Convert the flash address to a logical block number, call this B.
Find blocks containing data for block B. Switch on how many there were:
*       if 0: write and tag the programmed data into the physical
        block pointed to by the next-write pointer, then increment that
        pointer until it points to an empty block, and increment the
        serial number counter.
*       if 1: we are doing a redundant write. Write and tag the programmed
        data - as for case 0 - then erase the older block.
*       if >1: Something strange has happened. Erase all existing blocks
        for block B except for the one with the second-highest serial
        number, then treat as case 1.

For robustness, by default we will perform a read-back of all pages as
we write them, but this can be switched off with a CDL option.

If a write fails, or fails to read-back correctly: mark the physical
block as bad, increment the next-write pointer until it points to an
empty block, then retry.

Iterate the above as necessary given the size of the write and the NAND
page/block size.
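
Here is a sketch of the program path for one whole logical block,
simulated in RAM. Every name is hypothetical; the write_fails flag stands
in for a failed write or read-back, and the sketch assumes next_write
already points at a vacant block on entry. It erases the old copy just
before advancing the pointer, so that the freed block counts as vacant;
the new data is still written before the old copy goes.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC 0xEF15u
#define NIN_EMPTY     0xFFFFu

struct nin_phys {
    uint16_t magic, logical_block;
    uint32_t serial;
    int bad;          /* marked bad */
    int write_fails;  /* simulation only: writes to this block fail */
};

struct nin_dev {
    struct nin_phys *blk;
    uint32_t n, next_write, next_serial;
};

static void nin_sim_erase(struct nin_phys *p)
{
    p->magic = NIN_EMPTY;
    p->logical_block = 0xFFFFu;
    p->serial = 0xFFFFFFFFu;
}

static int nin_holds(const struct nin_phys *p, uint16_t b)
{
    return !p->bad && p->magic == NIN_TAG_MAGIC && p->logical_block == b;
}

/* Move next_write to the next vacant, non-bad block; -1 if none. */
static int nin_advance(struct nin_dev *d)
{
    uint32_t i;
    for (i = 1; i <= d->n; i++) {
        uint32_t p = (d->next_write + i) % d->n;
        if (!d->blk[p].bad && d->blk[p].magic == NIN_EMPTY) {
            d->next_write = p;
            return 0;
        }
    }
    return -1;
}

/* Returns the physical index written, or -1 if no block was usable. */
static int nin_program(struct nin_dev *d, uint16_t b)
{
    uint32_t i, count = 0;
    int top = -1, second = -1, old, written;

    /* Find the highest and second-highest serials for block b. */
    for (i = 0; i < d->n; i++) {
        if (!nin_holds(&d->blk[i], b))
            continue;
        count++;
        if (top < 0 || d->blk[i].serial > d->blk[top].serial) {
            second = top;
            top = (int)i;
        } else if (second < 0 || d->blk[i].serial > d->blk[second].serial) {
            second = (int)i;
        }
    }
    /* Case >1: keep only the second-highest, then treat as case 1. */
    old = (count > 1) ? second : top;
    for (i = 0; i < d->n; i++)
        if (nin_holds(&d->blk[i], b) && (int)i != old)
            nin_sim_erase(&d->blk[i]);

    for (;;) {
        struct nin_phys *w = &d->blk[d->next_write];
        if (w->write_fails) {         /* mark bad, move on, retry */
            w->bad = 1;
            if (nin_advance(d) < 0)
                return -1;
            continue;
        }
        w->magic = NIN_TAG_MAGIC;
        w->logical_block = b;
        w->serial = d->next_serial++;
        written = (int)d->next_write;
        break;
    }
    if (old >= 0)
        nin_sim_erase(&d->blk[old]);  /* only after the new copy exists */
    (void)nin_advance(d);             /* a real driver must handle -1 */
    return written;
}
```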

2.2.4 Erase flash region (given base address and erase size).
Convert the flash address to a logical block number, call this B.
Seek and erase all physical blocks bearing data for block B.  If the
eraselength was larger than the block size, repeat as necessary for
subsequent logical blocks.

If the NAND layer reports an erase failure, the block will already have
been marked as bad (by the NAND layer), so we need do nothing (except
perhaps whingeing about it).

To improve robustness, we will check that the tags area of a physical
block we've just erased is all-0xFF; if not, we shall retry the erase
once, and if that fails, mark the block as bad.
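
The erase-verify-retry logic might be sketched like this. As before it
runs against a RAM model, not real NAND: all names are hypothetical, and
stuck_erases exists only so the simulation can mimic an erase that leaves
the tag behind.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC 0xEF15u

struct nin_phys {
    uint16_t magic, logical_block;
    uint32_t serial;
    int bad;
    int stuck_erases;  /* simulation only: erases that leave the tag intact */
};

static void nin_sim_erase(struct nin_phys *p)
{
    if (p->stuck_erases > 0) {
        p->stuck_erases--;
        return;
    }
    p->magic = 0xFFFFu;
    p->logical_block = 0xFFFFu;
    p->serial = 0xFFFFFFFFu;
}

static int nin_tag_is_blank(const struct nin_phys *p)
{
    return p->magic == 0xFFFFu && p->logical_block == 0xFFFFu
        && p->serial == 0xFFFFFFFFu;
}

/* Erase one physical block, check the tag, retry once, else mark bad. */
static void nin_erase_checked(struct nin_phys *p)
{
    nin_sim_erase(p);
    if (nin_tag_is_blank(p))
        return;
    nin_sim_erase(p);
    if (!nin_tag_is_blank(p))
        p->bad = 1;
}

/* Erase every physical copy of logical blocks [first, first + count).
 * Returns the number of physical blocks an erase was attempted on. */
static uint32_t nin_erase_logical(struct nin_phys *blk, uint32_t n,
                                  uint16_t first, uint16_t count)
{
    uint32_t i, hits = 0;

    for (i = 0; i < n; i++) {
        if (blk[i].bad || blk[i].magic != NIN_TAG_MAGIC)
            continue;
        if (blk[i].logical_block >= first
            && blk[i].logical_block - first < count) {
            nin_erase_checked(&blk[i]);
            hits++;
        }
    }
    return hits;
}
```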

2.2.5 Block locking
These operations are not supported; they are provided as no-ops, and
CYG_FLASH_FUNS does the right thing.

2.3 Handling of bad blocks

When choosing where to write, marked-bad blocks are skipped over.
When scanning the array for a block, marked-bad blocks are similarly ignored.

The number of logical blocks in the array must never be changed after
its first use. Therefore it is essential that nothing, other than the
one-shot initial setup of the bad block table, ever marks a block used
for this scheme as factory-bad. This should be a safe assumption, as to
do so would require "forbidden" use of NAND internals.

2.4 Runtime considerations

If a runtime performance boost was required, the system could on startup
scan the NAND partition and build up a cached mapping in-RAM of the
physical addresses of each (non-zeroed) logical block. However, we don't
expect this code will be used on gigantic NAND arrays [partitions],
and it should only see light duty via RedBoot. Therefore it won't take
long to linear-scan for blocks during operations, so this optimisation
may not be worth its complexity.

(Bart suggested that a scan at startup could also take care of the cases
above where more than one physical block is tagged with the same logical
block number, hence reducing the time and complexity of the block access
code. There's a startup time vs access time vs memory trade-off here,
though we haven't fully analysed it. Nick agreed that we ought to think
this through very carefully.)
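
Should the cached mapping turn out to be wanted, a malloc-free version
could be as small as the sketch below. The names are hypothetical, the
table is sized by the 1024-block cap suggested in 2.1.2, and the sketch
deliberately does not resolve duplicates: it only counts surplus copies
so that the caller knows to clean up (as in 2.2.2) and rebuild.

```c
#include <stdint.h>

#define NIN_TAG_MAGIC   0xEF15u
#define NIN_MAX_LOGICAL 1024u    /* cap suggested in 2.1.2 */
#define NIN_NO_BLOCK    0xFFFFu  /* "no physical block" marker */

struct nin_phys {
    uint16_t magic, logical_block;
    uint32_t serial;
    int bad;
};

/* Static table, so no malloc: logical block -> physical index.
 * Assumes fewer than 0xFFFF physical blocks. */
static uint16_t nin_map[NIN_MAX_LOGICAL];

/* Returns the number of surplus physical copies seen; if non-zero the
 * map is not trustworthy until duplicates have been resolved. */
static uint32_t nin_build_map(const struct nin_phys *blk, uint32_t n)
{
    uint32_t i, dups = 0;

    for (i = 0; i < NIN_MAX_LOGICAL; i++)
        nin_map[i] = NIN_NO_BLOCK;

    for (i = 0; i < n; i++) {
        uint16_t b = blk[i].logical_block;

        if (blk[i].bad || blk[i].magic != NIN_TAG_MAGIC
            || b >= NIN_MAX_LOGICAL)
            continue;
        if (nin_map[b] == NIN_NO_BLOCK)
            nin_map[b] = (uint16_t)i;
        else
            dups++;
    }
    return dups;
}
```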

============================================================================


-- 
Embedded Software Engineer, eCosCentric Limited.
Barnwell House, Barnwell Drive, Cambridge CB5 8UU, UK.
Registered in England no. 4422071.                  www.ecoscentric.com
