Re: SSD Optimizations

Hubert Kario Sat, 13 Mar 2010 11:01:46 -0800

On Saturday 13 March 2010 18:02:10 Stephan von Krawczynski wrote:
> On Fri, 12 Mar 2010 17:00:08 +0100
> Hubert Kario <[email protected]> wrote:
> > > Even on true
> > > spinning disks your assumption is wrong for relocated sectors.
> > 
> > Which we don't have to worry about because if the drive has less than 5
> > of 'em, the impact of hitting them is marginal and if there are more,
> > the user has much more pressing problem than the performance of the
> > drive or FS.
> 
> Are you really sure that a drive firmware tells you about the true number
> of relocated sectors? I mean if it makes the product look better in
> comparison to another product, are you really sure that the firmware will
> not tell you what you expect to see only to make you content and happy
> with your drive?


because Joe Sixpack reads SMART values, and even if he does, he will be much 
more angry when a drive that reports no or few relocations fails than when a 
drive that reports that it's failing fails.

If a drive arrives with bad sectors and meets an IT guy worth his salt, it goes 
back where it came from the same day. Any IT guy knows that some HDDs develop 
bad sectors no matter the make and model, but when they do, you replace them.

And as the Google disk survey showed, SMART has a very high percentage of 
Type I errors, but very few Type II errors.

But we're off-topic here.

> > > Which
> > > basically means that every disk controller firmware fiddles around with
> > > the physical layout since decades. Please accept that you cannot do a
> > > disks' job in FS. The more advanced technology gets the more disks
> > > become black boxes with a defined software interface. Use this
> > > interface and drop the idea of having inside knowledge of such a
> > > device. That's other peoples' work. If you want to design smart SSD
> > > controllers hire at a company that builds those.
> > 
> > And I don't think that doing disks' job in the FS is good idea, but I
> > think that we should be able to minimise the impact of the translation
> > layer.
> > 
> > The way to do this, is to threat the device as a block device with
> > sectors the size of erase-blocks. That's nothing too fancy, don't you
> > think?
> 
> I don't believe anyone is able to tell the size of erase-blocks of some
> device - current and future - for sure.

Well, if the engineer that designed it doesn't know this, I don't know how he 
got his degree.

Just because it isn't publicised now doesn't mean it won't be in the near future.

Besides that, detecting the size of the erase-blocks is easy if they have any 
impact on performance. If they don't have any impact (whatever the reason), 
tuning for their size is pointless anyway.
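
As a rough sketch of what such detection could look like: time aligned writes 
of increasing size and look for the size at which the per-byte cost stops 
improving. The function name, the tolerance and the sample numbers below are 
all made up for illustration, they are not measurements of any real device, 
and real firmware may well behave less cleanly than this.

```python
def estimate_erase_block(cost_per_byte, tolerance=0.10):
    """Guess the erase-block size from timing data.

    cost_per_byte maps an aligned write size in bytes to the measured
    average cost per byte (any consistent unit). Returns the smallest
    size whose cost is within `tolerance` of the best cost seen, or
    None when all sizes cost about the same (nothing to tune for).
    """
    if not cost_per_byte:
        raise ValueError("no measurements given")
    best = min(cost_per_byte.values())
    worst = max(cost_per_byte.values())
    limit = best * (1 + tolerance)
    if worst <= limit:
        # Flat curve: the write size has no visible impact.
        return None
    for size in sorted(cost_per_byte):
        if cost_per_byte[size] <= limit:
            return size


# Hypothetical measurements, invented for this example.
sample = {
    4096: 9.0,
    16384: 5.5,
    65536: 3.1,
    131072: 1.05,
    262144: 1.0,
    524288: 1.0,
}
guess = estimate_erase_block(sample)
print(guess)
```

With a flat curve the function returns None, which is the "tuning is 
pointless anyway" case.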

> I do believe that making this
> guess only reduces the future design options for new devices - if its
> creators care at all about your guess.

Did I, or anyone else, say that we want to hardwire a specific erase-block 
size into the design of the FS?! That would be utter stupidity!

> Why not let the fs designer take his creative options in fs layer and let
> the device designer use his brain on the device level and all meet at the
> predefined software interface in between - and nowhere _else_.

We (well, at least Gordon and I) just want a "stripe_width" option added to 
mkfs.btrfs, just like the one that exists for ext2/3/4, reiserfs, xfs and jfs, 
to name a few. It would need very few additional tweaks to make it SSD 
friendly, hardly any considering how -o ssd or -o ssd_spread already work.
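
To illustrate why a stripe_width hint matters, here is a small sketch of the 
alignment arithmetic. It is not how btrfs allocates space. The helper names 
are hypothetical and the 128 KiB width is an assumed value chosen only for 
the example.

```python
def align_up(offset, stripe_width):
    """Round offset up to the next multiple of stripe_width."""
    if stripe_width <= 0:
        raise ValueError("stripe_width must be positive")
    return -(-offset // stripe_width) * stripe_width


def blocks_touched(offset, length, stripe_width):
    """Count the stripe_width-sized blocks a write overlaps."""
    if length <= 0 or stripe_width <= 0:
        raise ValueError("length and stripe_width must be positive")
    first = offset // stripe_width
    last = (offset + length - 1) // stripe_width
    return last - first + 1


WIDTH = 128 * 1024  # assumed erase-block size, for illustration only

# A 128 KiB write starting 4 KiB into a block straddles two blocks ...
unaligned = blocks_touched(4096, WIDTH, WIDTH)
# ... while the same write moved to the next boundary touches only one.
aligned = blocks_touched(align_up(4096, WIDTH), WIDTH, WIDTH)
print(unaligned, aligned)
```

Every extra block touched is one more block the translation layer may have 
to read, erase and rewrite.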

You're forgetting there's an elephant in the room that won't talk to 
devices that don't have 512B sectors. If not for it, there wouldn't 
even _be_ SSDs with 512B sectors.

It's not the way Flash memory works.

The 512B abstraction is there to be compatible, to work with one current OS. 
It's not there because it better describes the way Flash memory works or 
because it is the best way to address the data on the device itself.

There are already consumer HDDs with 4kiB sector size, so the situation is 
getting better. We can only hope that in a few years' time SSDs will have 
sectors the size of erase-blocks. But in the meantime, stripe_width would be 
enough.


Besides, the stripe_width option will be useful not only for SSDs but also 
in environments where btrfs sits on a device that is a RAID5/6 array 
(reconfiguring a server with many virtual machines is far from easy and 
sometimes just can't be done because of heterogeneous virtualised OSs that 
need the data protection provided by lower layers).
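
For the RAID5/6 case the natural value for such an option is the full data 
stripe: the chunk size times the number of data-bearing disks. A minimal 
sketch follows; the function name, disk counts and the 64 KiB chunk size are 
assumptions picked for the example.

```python
def full_stripe_bytes(disks, chunk_bytes, parity_disks):
    """Bytes of data in one full stripe of a parity RAID array."""
    data_disks = disks - parity_disks
    if data_disks < 1 or chunk_bytes <= 0:
        raise ValueError("need at least one data disk and a positive chunk")
    return data_disks * chunk_bytes


CHUNK = 64 * 1024  # assumed chunk size

raid5 = full_stripe_bytes(disks=4, chunk_bytes=CHUNK, parity_disks=1)
raid6 = full_stripe_bytes(disks=6, chunk_bytes=CHUNK, parity_disks=2)
print(raid5, raid6)
```

Writes that cover a whole stripe let the array compute parity without first 
reading back the old data.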

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50