Re: GSoC Final Report on mnemofs

Saurav Pal Fri, 13 Sep 2024 13:10:59 -0700

Hi all,

Thank you for the feedback and support.

Sebastien:
Thanks for the valuable points you raised. Some of them were already
considered while designing this, and others can be used to improve it.

Theoretically, power loss should not be a problem. However, I agree that it
isn't a very concrete claim, as it has yet to be tested. This is mostly due
to a lack of actual drivers for the hardware at hand. The NAND sim I wrote
simulates the structure of a NAND flash, including bad blocks, but physical
irregularities are a different matter. It does not yet model any form of
bit flipping, nor can it simulate power loss.

Again theoretically, the worst that should happen in a typical power loss
is that all the changes still held in the LRU are lost.

The FS follows a copy-on-write scheme, and the old version of the file
system is always accessible until the new version is *completely written*.
However, there's one edge case that I'm worried about: when there aren't
enough pages in the flash to hold both the new and the old copy (both need
to exist at one point during the update procedure). As for the journal, it
does ensure that the checksum is written *after* the journal log. Assuming
the bytes are written to the flash from start to end (I have not seen
anything to suggest otherwise), the checksum will get written at the very
end. However, thanks for the suggestion, I will look more into the best way
to apply checksums in this situation. For now, the checksum is written
along with the data to reduce wasted space (a separate write would have to
go onto a new page), but I will look into this in more detail.
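
As a rough illustration of the "checksum goes last" idea, here is a minimal
Python sketch. It is not the actual mnemofs on-flash format: the record
layout (length, payload, CRC32) and the names `encode_log` and `decode_log`
are assumptions made up for this example.

```python
import struct
import zlib


def encode_log(payload: bytes) -> bytes:
    """Build a record: 4-byte length, payload, then a CRC32 at the very end.

    Hypothetical layout for illustration only.
    """
    body = struct.pack("<I", len(payload)) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def decode_log(raw: bytes):
    """Return the payload, or None if the record is incomplete or corrupt."""
    if len(raw) < 8:                      # too short for length + checksum
        return None
    (length,) = struct.unpack_from("<I", raw, 0)
    end = 4 + length                      # offset where the checksum starts
    if len(raw) < end + 4:                # checksum never fully written
        return None
    (stored,) = struct.unpack_from("<I", raw, end)
    if zlib.crc32(raw[:end]) != stored:   # payload or checksum damaged
        return None
    return raw[4:end]


record = encode_log(b"rename /a -> /b")
print(decode_log(record))        # complete record: payload comes back
print(decode_log(record[:-1]))   # cut short before the last byte: None
```

With a layout like this, a replay can stop at the first record that fails
to decode. It does not address the metastability issue Sebastien describes,
where a marginal checksum reads correctly once and incorrectly later.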

To apply these logs, the LRU and the journal are used together to update
the FS from the leaves (files) upwards. However, the old FS still remains,
and until the root of the FS is updated, the old one is the only valid FS.
When the new root is written, it becomes the latest revision of the root,
and the latest revision (along with its checksum) is used as the root of
the FS on mount if power is lost during the update. A partially completed
update can be resumed later on (of course, this is theoretical, but it
should be testable once the driver gets done, or the power loss feature
gets added to the NAND flash sim).
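
The root selection on mount can be sketched roughly as follows. This is
only an illustration of the idea, not the mnemofs code: `RootCandidate`,
`make_root`, `is_valid` and `pick_root` are hypothetical names, and CRC32
over the revision number plus the root data is an assumed checksum scheme.

```python
import zlib
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RootCandidate:
    revision: int   # monotonically increasing revision number
    data: bytes     # serialized root (hypothetical format)
    checksum: int   # CRC32 over revision + data


def _crc(revision: int, data: bytes) -> int:
    return zlib.crc32(revision.to_bytes(4, "little") + data)


def make_root(revision: int, data: bytes) -> RootCandidate:
    return RootCandidate(revision, data, _crc(revision, data))


def is_valid(c: RootCandidate) -> bool:
    return _crc(c.revision, c.data) == c.checksum


def pick_root(candidates: Iterable[RootCandidate]) -> Optional[RootCandidate]:
    """Pick the highest revision whose checksum verifies, else None."""
    valid = [c for c in candidates if is_valid(c)]
    return max(valid, key=lambda c: c.revision, default=None)


old = make_root(7, b"old tree")
new = make_root(8, b"new tree")
# Simulate a new root whose checksum did not land correctly.
torn = RootCandidate(new.revision, new.data, new.checksum ^ 1)
print(pick_root([old, torn]).revision)   # falls back to the old root
print(pick_root([old, new]).revision)    # uses the new root once complete
```

The point of the sketch is that a half-written new root never replaces the
old one: it only becomes the root once its checksum verifies.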

As for bit flipping, which is a more common occurrence in NAND flashes
because of their compact nature, the NAND driver takes care of bit flips.
There is a dedicated spare area that contains ECC bits for this purpose,
and the driver uses them to correct the data before providing it to the FS.
However, there is still a limit on the number of bits that can be
corrected, and testing in these situations is yet to be done.
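
To illustrate what a correction limit means, here is a toy Hamming(7,4)
code in Python. This is an assumption chosen purely for illustration: it is
not the ECC used by the NuttX NAND driver or by any particular chip, and
real NAND ECC works on whole pages with much stronger codes. The function
names are hypothetical.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)                    # do not modify the caller's list
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4       # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
word = hamming74_encode(data)

one_flip = list(word)
one_flip[5] ^= 1
print(hamming74_decode(one_flip) == data)   # True: one flip is corrected

two_flips = list(word)
two_flips[0] ^= 1
two_flips[1] ^= 1
print(hamming74_decode(two_flips) == data)  # False: beyond the limit
```

Note that in the two-flip case this toy decoder returns wrong data without
signalling an error, which is why the behaviour beyond the correction limit
needs to be tested.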

I'll be rather busy until the end of October for personal reasons, but
after that I'll be spending time improving the FS. For now, it works in
theory, but the implementation of mnemofs still needs quite a bit of work,
which I plan to do, as it was developed under the fairly short deadline
required by GSoC.

Thanks a lot to all of you for this feedback. It highlighted the main
points I need to focus on to improve this.

Best regards,
Saurav.

On Fri, Sep 13, 2024 at 6:57 PM Sebastien Lorquet <sebast...@lorquet.fr>
wrote:

> Hi,
>
> Yes, before we claim to have a power-cut-resilient FS, it has to be
> tested on real hardware.
>
> It will also be the occasion to actually fire-proof littlefs, because I
> have no idea if that was ever done before.
>
> We suggested the idea in a littlefs github issue but it got nowhere IIRC.
>
> Sebastien
>
> On 13/09/2024 15:20, Alan C. Assis wrote:
> > Hi Tomek,
> >
> > I think it is possible to modify the NAND Simulator to help testing
> > partial writes and induced errors.
> >
> > But I agree with Sebastien that it is better to test in real hardware.
> >
> > Since the MnemoFS already works in the simulator we can get the
> > initial version working in any flash, even SPI NOR Flash.
> >
> > BR,
> >
> > Alan
> >
> > On Fri, Sep 13, 2024 at 10:05 AM Tomek CEDRO <to...@cedro.info> wrote:
> >
> >     very nice discussion!
> >
> >     can problems with NAND create a semi-deterministic group of
> >     well-defined issues with random characteristics?
> >
> >     if so then we could create a model nand for sim with controllable
> >     errors in order to verify various nand drivers, filesystems, etc?
> >
> >     :-)
> >
> >     --
> >     CeDeROM, SQ7MHZ, http://www.tomek.cedro.info
> >
> >     On Fri, Sep 13, 2024, 14:52 Alan C. Assis <acas...@gmail.com> wrote:
> >
> >     > Hi Sebastien,
> >     >
> >     > Thank you for your helpful considerations.
> >     >
> >     > As I explained before he used the SIM NAND Simulator that he
> >     > created and integrated on NuttX.
> >     >
> >     > Also as I explained in my previous email, we need help to test in
> >     > real hardware.
> >     >
> >     > Since you have previous experience with NAND Flash, maybe you
> >     > could help here (of course, if you are interested to help)!
> >     >
> >     > First we need to create a driver for a SPI NAND Flash (I bought
> >     > this model: https://aliexpress.com/item/1005005307786079.html)
> >     > and use it with MnemoFS.
> >     >
> >     > This model that I selected has internal error detection, etc, it
> >     > means we don't need to worry about taking care of bad blocks
> >     > ourselves.
> >     >
> >     > If you look inside nuttx/drivers/mtd/ many of the pieces we need
> >     > are already there, we just need to understand how to use SPI NAND,
> >     > FTL, MTD, etc.
> >     >
> >     > Xiang, since you and your team ported YAFFS to NuttX, maybe you
> >     > guys could help us to get MnemoFS working on real flash on NuttX.
> >     >
> >     > BR,
> >     >
> >     > Alan
> >     >
> >     >
> >     >
> >     > On Fri, Sep 13, 2024 at 6:17 AM Sebastien Lorquet
> >     > <sebast...@lorquet.fr> wrote:
> >     >
> >     > > Hello
> >     > >
> >     > >
> >     > > This is quite a complete report with a lot of details, this
> >     > > shows that you have put some large amount of mental energy in
> >     > > this project, so congratulations and thank you.
> >     > >
> >     > > What I'm about to write is not a criticism but a complement that
> >     > > may interest you.
> >     > >
> >     > >
> >     > > Since I've worked with critical flash systems for more than 10
> >     > > years now, I have read the part of your document that deals with
> >     > > power loss with great interest.
> >     > >
> >     > > Resilience to power loss is *absolutely critical* to any embedded
> >     > > filesystem.
> >     > >
> >     > >
> >     > > Did you do power interruption tests on your code? Can you
> >     > > guarantee that the device format stays consistent/recoverable
> >     > > when the power is cut at any code location? Did you identify
> >     > > power critical code sections (with relation to power cut, not
> >     > > cpu access)?
> >     > >
> >     > > Remember, if it's not tested, it doesn't work...
> >     > >
> >     > >
> >     > > The most critical part of your work is the journal. Do you make
> >     > > sure that the checksum is written 1-last, and 2-completely? How
> >     > > do you make sure that the journal entries are correctly applied
> >     > > to their final storage locations?
> >     > >
> >     > > The largest problem in that area is flash metastability. The
> >     > > checksum MIGHT appear correct on one read, but not correct at
> >     > > the next access. The reason for this is the analog nature of
> >     > > flash writes (and erases), which injects a number of electrons
> >     > > in a floating gate. 0 and 1 bits are separated by thresholds,
> >     > > but these thresholds vary with temperature and time (wear), so
> >     > > it might appear that a bit is correct by being just at the
> >     > > threshold, but the next access will result in a flipped bit.
> >     > >
> >     > > These issues are NOT theoretical, they happen all the time in
> >     > > all flash devices, you just have to tickle the devices often
> >     > > enough at the right moment so you begin to see these.
> >     > >
> >     > > These tests require the ability to fully cut the power to a test
> >     > > board with microsecond precision. No need for pulses, just an
> >     > > adjustable delay. Test is triggered by a command that also
> >     > > starts a countdown, and timeout is increased microsecond by
> >     > > microsecond until you reach the point that the flash is actually
> >     > > written. Usually, there is a point where timeouts result in
> >     > > partial writes. Then the board will start acting funny and will
> >     > > start entering the error branches that are usually never taken.
> >     > > Board capacitors are not a problem, they just increase the
> >     > > delays. They always discharge the same way during all repeated
> >     > > tests, so they have no influence on the process.
> >     > >
> >     > > It is quite hard to make sure that everything is correct, but a
> >     > > sufficient amount of dedication is required to be aware of the
> >     > > potential problems.
> >     > >
> >     > > How do you know in your filesystem that the checksum has been
> >     > > written only after all the previous data are written? How do you
> >     > > know the checksum write is complete? There are software
> >     > > techniques for this. This also requires the flash to support
> >     > > overwrites, so making this work with ECC is harder (but
> >     > > possible).
> >     > >
> >     > > Fine details absolutely matter here.
> >     > >
> >     > > Thanks,
> >     > >
> >     > > Sebastien
> >     > >
> >     > >
> >     > > On 12/09/2024 17:48, Saurav Pal wrote:
> >     > > > Hi all,
> >     > > >
> >     > > > Here's my final report
> >     > > > <https://resyfer.github.io/blogs/mnemofs/endeval/>
> >     > > > on mnemofs, a NAND flash file system for NuttX, on which I
> >     > > > worked during my tenure as a GSoC 2024 Contributor for ASF. I
> >     > > > would be grateful for any suggestions and criticism.
> >     > > >
> >     > > > Best regards,
> >     > > > Saurav Pal.
> >     > > >
> >     > >
> >     >
> >
