Ross Younger wrote:
I've included the current version of my proposed design, although following
the conversation with Simon this week it has occurred to me that it might
well be reasonable to fillet out the core idea and make it available as a
simple proto-filesystem. (Essentially it would be a block device with very
large blocks. The high-level operations provided would be reading and
writing up to a blockful, and erasing a whole block; this doesn't seem to
fit well with the eCos block device model, so I suspect it would be better
off as its own interface.) Please do speak up if that would be a useful
development, and if so I'll happily rework this proposal into two parts as
time allows.

I think there would be merit in doing something like that. I agree a block device interface wouldn't be appropriate.

============================================================================

NOR-in-NAND design (3/7/09)

1. Scenario and assumptions:

We want RedBoot to be able to use NAND flash as if it was NOR flash,
for FIS and fconfig.

We don't care about supporting non-RB apps, so can rely on RB's behaviour.

Not sure that is wise. And I'm not sure what specific behaviour you would wish to rely on. I wouldn't make the code itself dependent on RedBoot in any case. That's what got us into a mess with the v1 flash drivers. Ultimately the RedBoot FIS approach is not something we'd like to continue with - it was designed for the common case of the late '90s/early '00s of single NOR flash parts (and in some respects wasn't so great even then, such as not being able to support bootblocks properly). That code has a limited lifespan, and many flaws. It's better to do something that can last beyond that.

I can certainly see this layer being used for a non-RedBoot boot loader. Lots of people may use RedBoot during development, but this sort of layer is useful for any boot loader. Far fewer people want to use RedBoot for their final product (outside of development) - they just want to load their apps and go. I'm not suggesting writing a boot loader to do this now, but I do believe it would be a mistake to think that people won't want to.

Analysis:
* fconfig-in-FIS always calls FIS code.
* fconfig not in FIS always erases before programming.
* FIS always erases before it programs, only deals with block-aligned
regions, and never rewrites within such a region. [+]

Design goals:
* Clean and simple, in keeping with the eCos philosophy.
* Robust: copes with blocks going bad and indulges in some sort of
wear-levelling.
* No use of malloc. (As we're targeting only RB, we can use the workspace.)

The workspace is a very crude way to allocate memory, but if you can constrain yourself to it, so much the better.

The number of blocks available - i.e. (1 + maximum valid logical block
number) - will be computed as:
        * Number of physical blocks in the chip or partition,
        * minus the number of factory bad blocks,
        * minus one (to allow for robust block rewrites),
        * minus an allowance for bad blocks to develop during the life
        of the device.
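For concreteness, that computation might be sketched as below; the function name, parameters and clamp-to-zero are mine, not from the proposal:

```c
/* Hypothetical sketch of the usable-block computation described above.
 * All names and the clamping policy are illustrative. */
static int nin_logical_blocks(int physical_blocks,
                              int factory_bad,
                              int lifetime_bad_allowance)
{
    /* minus one spare block so a robust block rewrite can always complete */
    int usable = physical_blocks - factory_bad - 1 - lifetime_bad_allowance;
    return usable > 0 ? usable : 0;
}
```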

Of course a (NOR) flash driver has to specify a constant device size, whereas the bad blocks are board specific, so really you have to have an allowance for bad blocks full stop.

The NOR-in-NAND layer will be making multiple-page reads and writes;
it may be worthwhile to put code to do this into the NAND layer, as
opposed to forcing all users to reinvent the wheel.

Sounds like a better approach.

NAND blocks are used as a dumb datastore, with their physical addresses
bearing no relation to their logical addresses. In-use blocks are tagged
in the OOB area with the logical block number they refer to (see below).

Physical storage blocks are used sequentially from the beginning of
the device, then cycling back to the start to reuse erased blocks. This
is managed by maintaining a next-write "pointer". At runtime, this is
the next vacant block after the last written block; at boot time, it
is initialised by scanning the filesystem to find the block with the
highest serial number, which is taken to be the latest-written block.
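The boot-time scan could be sketched roughly as follows, run here against an in-memory mock of the per-block tags rather than real OOB reads. Only the 0xEF15 magic value is from the proposal; everything else is illustrative:

```c
#include <stdint.h>

#define NIN_MAGIC 0xEF15   /* tag magic from the proposal */

/* mock of the per-block OOB tag; real code would read this from each
 * block's OOB area via the NAND layer */
struct tag { uint16_t magic; uint16_t logical; uint32_t serial; };

/* Return the index of the block with the highest serial number, i.e.
 * the latest-written block, or -1 if none of our blocks are present.
 * The next-write pointer is then the first vacant block after it. */
static int find_latest(const struct tag *tags, int n)
{
    uint32_t best_serial = 0;
    int i, best = -1;

    for (i = 0; i < n; i++) {
        if (tags[i].magic != NIN_MAGIC)
            continue;                      /* erased or foreign block */
        if (best < 0 || tags[i].serial > best_serial) {
            best_serial = tags[i].serial;
            best = i;
        }
    }
    return best;
}
```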

That sounds potentially painful for non-trivial numbers of blocks, unless you are going to use a large logical block size. More below.

This scheme is much better than a simple block mapping due to its
robustness: it intrinsically provides reasonable wear-levelling,

Not much if most/all blocks are used. I suspect with the normal usage pattern of flash under RedBoot, you wouldn't get much wear-levelling. You'd need to start occasionally reallocating used blocks too to wear-level properly, which admittedly probably wouldn't be /that/ difficult. That said, I think we should indeed probably just put up with theoretically inadequate wear-levelling for now.

Conversely, note that MLC NAND may start to become more common, but it is (I think) rated for ~10K cycles which may increase the need for more thorough wear-levelling.

2.1.1 OOB tag format

The tag is written into the OOB area of the first page of each block
(as "application" OOB data, from the NAND library's point of view).

The tag is a packed byte array, in processor-local endian, with the
following contents:

        * Magic number - 2 bytes, 0xEF15. (This is a compile-time constant
        and demonstrates that the block is one of ours. It's a corruption
        of "eCos FIS".)
        * Logical block number - 2 bytes.
        * Master serial number - 4 bytes (see below).
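One plausible C rendering of this tag, assuming the 2/2/4 split means two 16-bit words and one 32-bit word, packed, in processor-local endian (field names are mine):

```c
#include <stdint.h>

#define NIN_TAG_MAGIC 0xEF15   /* the "eCos FIS" magic from the proposal */

/* Hypothetical layout of the OOB tag: 8 bytes total, written as
 * "application" OOB data in the first page of each block. */
struct nin_oob_tag {
    uint16_t magic;     /* NIN_TAG_MAGIC for blocks that are ours */
    uint16_t logical;   /* logical block number */
    uint32_t serial;    /* master serial number */
} __attribute__((packed));
```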

Given you say processor-local endian, I assume you mean there are two 16-bit words and a 32-bit word here.

Given you are assuming partial-page writes, I think you can do something more intelligent here to handle the seeking through NAND space that your proposal entails for every read/write:

- For a start, the serial number seems potentially overkill unless I'm missing something. All you need to know is whether a discovered logical block number is the most recent version of it. The serial only needs to reflect that block. When you write a new revision of a block, you mark the previous one dead by overwriting it with a partial write (without erasure). Thus you only have one valid version of a block at one time. Duplicates are dealt with solely at initial device scan time (stomping on the old one at that point). This way you only need 2 bits to represent the serial, theoretically (as the difference between serials can only be 1, so you can always tell which is older).
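The 2-bit comparison suggested here can be done modulo 4; a sketch (the function name is mine):

```c
/* With 2-bit serials, and the guarantee that the serials of the two
 * live copies of a block differ by exactly 1, "newer" can be decided
 * modulo 4 even across wrap-around. */
static int serial2_is_newer(unsigned a, unsigned b)
{
    return ((a - b) & 3u) == 1u;   /* a == b+1 (mod 4) => a is newer */
}
```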

- That frees up space which we can use for potential optimisations. In particular, the common use-case we are envisaging is wholly sequential reads of fairly large images. So we could use 2 bytes to point to the next block in the logical block chain. This is very useful if most use is sequential. If that block turns out not to be the correct block number, we lose very little and just revert to scanning the medium. Most times it should be correct. (We could make this behaviour a CDL option anyway). This does mean knowing which block will be used next at the time you are writing the current block, but that doesn't seem to be much of an issue - it's primarily just bringing forward a determination you'd have to make anyway. This all doesn't seem a particularly hard-to-implement optimisation.
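A sketch of how such a next-block hint might be consulted on the read path, again with a mocked tag array standing in for real OOB reads; all names are mine, and lookup_by_scan() stands for the existing linear scan of the medium:

```c
#include <stdint.h>

#define NIN_MAGIC 0xEF15   /* tag magic from the proposal */

/* mock per-block tag; the array stands in for real OOB reads */
struct tag { uint16_t magic; uint16_t logical; uint32_t serial; };

/* the fallback: linear scan of the medium for a logical block */
static int lookup_by_scan(uint16_t logical, const struct tag *tags, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (tags[i].magic == NIN_MAGIC && tags[i].logical == logical)
            return i;
    return -1;
}

/* Probe the hinted physical block first; on a miss, revert to the
 * scan.  In the common sequential case this costs one OOB read. */
static int find_physical(uint16_t wanted_logical, int hint,
                         const struct tag *tags, int n)
{
    if (hint >= 0 && hint < n &&
        tags[hint].magic == NIN_MAGIC &&
        tags[hint].logical == wanted_logical)
        return hint;
    return lookup_by_scan(wanted_logical, tags, n);
}
```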

I should note though that multiple writes are not supported on newer MLC NAND flash. This could be an issue as this class of NAND may become more common. Perhaps in that case an obsoleted block can just be erased immediately.

Also, if scanning the medium for each block isn't as slow as I fear it may be, then the above may be unnecessary (although there's still a good argument for freeing up the OOB bits for later use, if they can be freed).

To be safe, we should impose an upper
limit on the number of physical NAND blocks that this system will use,
and hence cap the number of logical NOR blocks the system will support. I
suggest 1024, which ought to be enough for anybody; it's more than many
(most?) NOR chips.

The number of blocks should be configurable anyway, so I don't think we need go beyond that surely? Setting a default of 1024 for such an option should be adequate.

2.4 Runtime considerations

If a runtime performance boost was required, the system could on startup
scan the NAND partition and build up a cached mapping in-RAM of the
physical addresses of each (non-zeroed) logical block. However, we don't
expect this code will be used on gigantic NAND arrays [partitions],

Hmm. I'm not as certain about that. People like having lots of space to play in, with e.g. multiple app versions or linux kernel images or root fs's or initrd's to load etc. A linear scan will work ok on a brand new board for a while - starting the scan for the next block from the current block of course - but in due course, performance would deteriorate, mostly irreversibly.

and it should only see light duty via RedBoot.

For a production system sure, but less so on a developer's board, with apps frequently getting written/rewritten. In particular the FIS directory updates will get interspersed frequently as a result which will cause increasing fragmentation. Put it like this - if you're considering something where wear-levelling is a concern (and for the mooted proto-fs, that's certainly valid), then you'd definitely need to consider the length of time scanning every block read as the block mappings will drift further away from 1:1 logical to virtual, and stop being linear.

Therefore it won't take
long to linear-scan for blocks during operations, so this optimisation
may not be worth its complexity.

And 4Kbytes RAM (for, say, 1024 blocks).

(Bart suggested that a scan at startup could also take care of the cases
above where more than one physical block is tagged with the same logical
block number, hence reducing the time and complexity of the block access
code. There's a startup time vs access time vs memory trade-off here,
though we haven't fully analysed it. Nick agreed that we ought to think
this through very carefully.)
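For what it's worth, building such a map at startup could be a single pass over the tags, resolving any duplicate logical numbers in favour of the higher serial along the way. A sketch against a mocked tag array (the names and the 0xFFFF "not yet written" sentinel are mine):

```c
#include <stdint.h>

#define NIN_MAGIC    0xEF15    /* tag magic from the proposal */
#define NIN_NO_BLOCK 0xFFFF    /* sentinel: logical block never written */

/* mock per-block tag; the array stands in for real OOB reads */
struct tag { uint16_t magic; uint16_t logical; uint32_t serial; };

/* Build the in-RAM logical->physical map in one pass.  Where two
 * physical blocks claim the same logical number, the one with the
 * higher serial wins. */
static void build_map(const struct tag *tags, int n,
                      uint16_t *map, int logical_blocks)
{
    int i;

    for (i = 0; i < logical_blocks; i++)
        map[i] = NIN_NO_BLOCK;

    for (i = 0; i < n; i++) {
        uint16_t l = tags[i].logical;
        if (tags[i].magic != NIN_MAGIC || l >= logical_blocks)
            continue;
        if (map[l] == NIN_NO_BLOCK || tags[i].serial > tags[map[l]].serial)
            map[l] = (uint16_t)i;
    }
}
```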

I think that being forced as a matter of course to scan the whole medium _every_ block read is a bad thing and should be avoided.


Something you may want to think about is that there are problems that RedBoot's FIS and config code have with multiple flash devices in a system. This may happen, whether with multiple NANDs or a mixture of flash types - possibly increasingly so these days. Work by myself and IIRC to some extent Bart at eCosCentric has ameliorated problems somewhat, but not fixed them. RedBoot's FIS code orients itself around a single flash device, with a single fixed block size for the entirety of that device. You may stumble across problems here as NAND may not be the only flash device on many boards, so it's something to bear in mind.

Jifl
--
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine
