Re: [MeeGo-dev] N900 internal flash image

David Woodhouse Thu, 23 Dec 2010 07:15:30 -0800

On Thu, 2010-12-23 at 07:35 -0600, Nishanth Menon wrote:
> Just curious: why can't we use eMMC?


eMMC is bog-roll technology; I want to be able to *trust* my storage.

I firmly believe that we should be working with file systems natively on
flash, not these "translation layers" which use flash to pretend to be a
disk.

What you have inside an eMMC device is essentially a file system. It's
not a normal one offering POSIX file system semantics, but it's a file
system nonetheless. It's a file system which does nothing except pretend
to be spinning rust, with 512-byte sectors that you can overwrite
atomically, etc. And then you use a "normal" file system on top of that.
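
As a rough illustration of what "pretending to be a disk" involves, here is a
toy sketch. It is not how any particular eMMC part works; the class name
ToyFTL and its methods are made up for the example, and a real FTL also has to
handle garbage collection, wear levelling, bad blocks and power loss.

```python
class ToyFTL:
    """Toy flash translation layer: out-of-place writes behind a 'disk' API.

    Hypothetical illustration only; no real device is modelled here.
    """

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # physical flash pages (None = erased)
        self.stale = set()               # physical pages holding obsolete data
        self.map = {}                    # logical sector -> physical page
        self.next_free = 0               # naive append-only allocator

    def write_sector(self, sector, data):
        if self.next_free >= len(self.pages):
            raise RuntimeError("no erased pages left; a real FTL would garbage-collect")
        old = self.map.get(sector)
        if old is not None:
            # Flash cannot be overwritten in place, so the old copy is only
            # marked stale and the new data goes to a fresh page.
            self.stale.add(old)
        self.pages[self.next_free] = data
        self.map[sector] = self.next_free
        self.next_free += 1

    def read_sector(self, sector):
        page = self.map.get(sector)
        return None if page is None else self.pages[page]


ftl = ToyFTL(num_pages=4)
ftl.write_sector(0, "a")
ftl.write_sector(0, "b")   # "overwrite" of the same logical sector
```

The host sees sector 0 change from "a" to "b", but underneath the old page is
still sitting there as garbage until something reclaims it. That bookkeeping
is the "file system" hidden inside the device.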

This was a reasonably sane approach in the 1990s, when you had to hook
into the INT 13h DISK BIOS services vector and then DOS would "just
work". But it has numerous issues.

One of the biggest issues, in practice, is reliability. This "flash
translation layer" is almost always implemented badly. The common
estimate is that it takes a minimum of about 5 years for a file system
to truly come to maturity. And that's an open source file system where
you can debug it, and where you can recover from errors with a viable
'fsck' tool, etc.

Inside these SSD-type devices you have the flash translation layer
implemented in a "black box". Not only is it closed source, but you
can't even access the underlying medium directly, for diagnosis and
recovery if^W WHEN it goes wrong.

If you do any kind of serious testing on these devices, including stress
testing and powerfail testing, you'll see that almost all of them are a
complete pile of crap. And if you *do* manage to get a batch which
actually passes a full powerfail and stress test suite, you may well find
that a repeat order of the *same* devices, in the *same* package with
the *same* model number, actually starts failing. And then when you take
them apart you find they're *completely* different hardware.

And the failure mode is often catastrophic. If the internal file system
gets itself screwed, then because you can't access the underlying medium
there's no fsck or even reformat; you just buy a new device and you've
lost *all* your data.

But it's not just the reliability; there's no fundamental reason why all
the devices out there have to be *quite* so incompetently implemented.
Even with an ideal implementation, the design itself is flawed; you just
don't *want* that extra layering of one file system on top of another.

One of the biggest problems with the gratuitous layering has always been
the fact that the underlying FTL doesn't actually know which 'sectors'
of its fake disk contain real data, and which sectors the real file
system just doesn't care about any more. The TRIM command helps with
that to a certain extent, but is *so* badly implemented on *so* many
devices that it's actually disabled by default in btrfs.
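
To put a number on that, here is a minimal sketch with a made-up eraseblock
of eight sectors. The function name gc_copy_cost and the sector numbers are
assumptions for illustration, not taken from any real FTL.

```python
def gc_copy_cost(sectors_in_block, known_dead):
    """Pages the FTL must relocate before it can erase this eraseblock.

    known_dead is the set of sectors the FTL has been told are unused.
    """
    return sum(1 for s in sectors_in_block if s not in known_dead)


block = list(range(8))           # eraseblock holding logical sectors 0..7
deleted_by_fs = {4, 5, 6, 7}     # the file system freed these (a deleted file, say)

# Without TRIM the FTL was never told, so it treats every sector as live.
without_trim = gc_copy_cost(block, known_dead=set())

# With a working TRIM/discard the FTL can skip the freed sectors.
with_trim = gc_copy_cost(block, known_dead=deleted_by_fs)
```

In this toy case the FTL copies all eight pages without TRIM and only four
with it; the other four copies are pure wasted wear on data nobody wants.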

But there's still the issue that the underlying FTL will be doing
garbage collection on eraseblocks which contain a mixture of short-term
and long-term data. When garbage-collecting, a true flash file system
could take the opportunity to separate those, so that data with a high
expected longevity gets shifted into one eraseblock, while data which we
expect to become obsolete in the short term gets put elsewhere. That helps to
increase the efficiency of future garbage collection.
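
A small sketch of why the separation pays off, assuming two eraseblocks of
four pages each. The labels h1..h4 (short-lived) and c1..c4 (long-lived) and
the function name are hypothetical, chosen only for this example.

```python
def live_pages_to_copy(eraseblocks, obsolete):
    """For each eraseblock, count the live pages GC must move before erasing it."""
    return [sum(1 for item in block if item not in obsolete) for block in eraseblocks]


hot = {"h1", "h2", "h3", "h4"}   # short-lived data, all obsolete by now

# Layout A: short-term and long-term data interleaved in every eraseblock.
mixed = [["h1", "c1", "h2", "c2"], ["h3", "c3", "h4", "c4"]]

# Layout B: data grouped by expected lifetime.
separated = [["h1", "h2", "h3", "h4"], ["c1", "c2", "c3", "c4"]]

mixed_cost = live_pages_to_copy(mixed, obsolete=hot)
separated_cost = live_pages_to_copy(separated, obsolete=hot)
```

In the mixed layout both blocks are half garbage, and each needs two live
pages copied out before it can be erased. In the separated layout the first
block is entirely dead and can be erased with no copying at all, while the
second holds no garbage and can simply be left alone.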

We're also duplicating the data verification; btrfs has checksums, and
the FTL *also* needs to do ECC. There are a bunch of ways in which this
gratuitous extra layering makes things less efficient.

Really, we just need to ditch this stupid microcontroller with its
substandard internal "file system", and let the CPU have direct access
to the flash with a decent NAND flash controller (DMA, queueing
operations, etc.).

This eMMC crap, and the whole "pretend to be a disk" nonsense, needs to
die. It's OK for digital cameras and USB sticks where it's *short* term
storage and you expect it to die once a year.

It was OK to use this kind of thing when phones were
'disposable' and didn't have much on them in the first place. So it was
OK for them to just die every year or two and have to be replaced. But
with the type of device that'll be running MeeGo, we don't have that
excuse. These devices may well hold the *only* copy of certain bits of
information, and will be a complete pain to rebuild from scratch if the
storage device goes south. We need to do better, and we can only do that
if we have *control* from the operating system of how our data gets
stored. Trusting it to a closed-source black box, especially when that
black box has a history of being complete and utter crap, is just
insane.

Anyway, those are *my* reasons for not wanting to use eMMC... :)

-- 
dwmw2

_______________________________________________
MeeGo-dev mailing list
[email protected]
http://lists.meego.com/listinfo/meego-dev
