Re: File systems usage patterns and NAND lifetime

2008-10-16 Thread Artem Bityutskiy
Hi,

Philippe Clérié wrote:
 Valerie Henson blogged about SSDs a while back 
 (http://valhenson.livejournal.com/25228.html). Since then I've made 
 sure I back up anything I have on flash.

Yeah, managed flashes are tricky. If you are interested, here is some
short writing about raw flash vs. managed flash:

http://www.linux-mtd.infradead.org/doc/ubifs.html#L_raw_vs_ftl

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)
___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel


File systems usage patterns and NAND lifetime

2008-10-10 Thread Deepak Saxena

I attended an Embedded Linux Conference [1] last week, at which I
saw a great talk on Managing NAND Over A Product Lifecycle [2].

The speaker presented the case of determining whether a chosen
NAND HW and SW combination will survive the estimated lifecycle 
of a product. As an example, he used a GPS device his firm worked
on in which they had some very specific usage data such as:

- The average runtime for the device is 4 hours a day, during
  which we will see 100 bytes/second of application logs
  written, 2300 bytes written for the addressbook, and
  1KiB/second used for temporary storage as maps are
  decompressed.

- The user will on average update the map data from his/her
  PC often enough that it requires 3GiB of writes/quarter.

- OS and application updates require 32MiB/quarter.

There were many other data points, please refer to the slides
for full details.
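
As a rough illustration of how figures like these add up, here is a
minimal Python sketch of the write budget. The rates come from the list
above; the 91-day quarter and the function names are assumptions made for
this example, and the addressbook figure is left out because no rate was
given for it. This is not the speaker's actual model.

```python
# Back-of-the-envelope write budget for the GPS example.
ACTIVE_SECONDS_PER_DAY = 4 * 3600        # 4 hours of runtime per day
LOG_BYTES_PER_SEC = 100                  # application logs
TEMP_BYTES_PER_SEC = 1024                # 1 KiB/s of temporary storage
MAP_BYTES_PER_QUARTER = 3 * 1024**3      # 3 GiB of map updates
UPDATE_BYTES_PER_QUARTER = 32 * 1024**2  # 32 MiB of OS/application updates
DAYS_PER_QUARTER = 91                    # assumption: quarter length in days
QUARTERS_IN_LIFETIME = 12                # 3-year product lifetime


def daily_runtime_bytes():
    """Bytes written per day by logging and temporary storage."""
    return ACTIVE_SECONDS_PER_DAY * (LOG_BYTES_PER_SEC + TEMP_BYTES_PER_SEC)


def lifetime_bytes():
    """Total bytes written over the assumed 3-year lifetime."""
    per_quarter = (daily_runtime_bytes() * DAYS_PER_QUARTER
                   + MAP_BYTES_PER_QUARTER
                   + UPDATE_BYTES_PER_QUARTER)
    return per_quarter * QUARTERS_IN_LIFETIME


if __name__ == "__main__":
    print("bytes/day while running:", daily_runtime_bytes())
    print("GiB over lifetime: %.1f" % (lifetime_bytes() / 1024**3))
```

Under these assumptions the quarterly map updates account for most of the
volume, which is the kind of observation an I/O model makes visible.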

With this data, they were able to generate an I/O model of the
application that was used to drive nandsim, an in-kernel NAND device
simulator. By doing this, they were able to replicate the product's
expected lifetime before user replacement (3 years) in a matter of a
few days. nandsim + the UBI reporting mechanisms were used to generate
detailed reports of the wear leveling behaviour of the system, how the
filesystem reacted to bitflips, bad pages, etc. Using this they were
able to determine how to lay out their filesystem to meet the
lifecycle requirement. After this was done, the same I/O model was
used to rapidly drive a real device toward failure modes to see how it
would react. If it didn't survive for the expected lifecycle,
they could analyze the data and figure out what settings to tweak.
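
To make the "replay the I/O model, then check survival" idea concrete,
here is a toy Python sketch. It does not use nandsim or UBI and does not
reflect their interfaces; `replay` and `survives` are hypothetical names,
and the model assumes perfectly even wear leveling with one erase per
block-sized write, which real systems will not achieve.

```python
def replay(total_bytes, block_size, num_blocks):
    """Spread total_bytes evenly over the blocks (idealized wear leveling).

    Returns a list with the erase count of each block.
    """
    erases_needed = -(-total_bytes // block_size)  # ceiling division
    base, extra = divmod(erases_needed, num_blocks)
    return [base + (1 if i < extra else 0) for i in range(num_blocks)]


def survives(erase_counts, endurance_cycles):
    """True if no block was erased more often than its rated endurance."""
    return max(erase_counts) <= endurance_cycles


if __name__ == "__main__":
    # Hypothetical device: 8 blocks of 128 KiB, 100 MiB written in total.
    counts = replay(100 * 1024**2, 128 * 1024, 8)
    print("erase counts:", counts)
    print("within 100K cycles:", survives(counts, 100_000))
```

A real run would replace the even spread with the erase counters reported
by the simulator or the device, and keep only the final check.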

In this talk I also learned about the MLC NAND property of read 
disturbance, where a read to one page can cause a bit-flip on an 
adjacent page.

I found the talk fascinating and it has made me wonder if we 
have any idea what our typical deployed usage patterns might 
look like?  How often does the journal write to disk and how 
big is each write?  How often do systems reboot and 
require a full filesystem read vs simply suspending/resuming?

Related to this topic, I am also wondering what is the expected 
usable life of the XO? We're used to product replacement every few 
years, sometimes faster depending on the product segment, but I 
doubt countries that are investing $millions expect to only get 
2-3 years of use out of the XO. 

~Deepak

[1] http://mvista.com/vision/
[2] http://www.mvista.com/download/fetchdoc.php?docid=329

-- 
Deepak Saxena - Kernel Developer - [EMAIL PROTECTED]


Re: File systems usage patterns and NAND lifetime

2008-10-10 Thread Martin Langhoff
On Fri, Oct 10, 2008 at 7:17 PM, Deepak Saxena [EMAIL PROTECTED] wrote:
 Related to this topic, I am also wondering what is the expected
 usable life of the XO?

5 years is what I heard many times. I can't find a formal source for
it right now, but it's tattooed on my forehead by raw repetition.

cheers,

m
-- 
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: File systems usage patterns and NAND lifetime

2008-10-10 Thread Ed McNierney
Deepak -

Thanks very much for the report and the notes; this is great stuff.
Of course, my first question is to wonder how long that GPS has
actually been in use by customers <g>.  Any other real-world NAND data
would certainly be worth sharing with the team.

- Ed



Re: File systems usage patterns and NAND lifetime

2008-10-10 Thread John Watlington

On Oct 10, 2008, at 2:17 AM, Deepak Saxena wrote:

 I attended an Embedded Linux Conference [1] last week, at which I
 saw a great talk on Managing NAND Over A Product Lifecycle [2].

 The speaker presented the case of determining whether a chosen
 NAND HW and SW combination will survive the estimated lifecycle
 of a product. As an example, he used a GPS device his firm worked
 on in which they had some very specific usage data such as:

 - The average runtime for the device is 4 hours a day, during
   which we will see 100 bytes/second of application logs
   written, 2300 bytes written for the addressbook, and
   1KiB/second used for temporary storage as maps are
   decompressed.

 - The user will on average update the map data from his/her
   PC often enough that it requires 3GiB of writes/quarter.

 - OS and application updates require 32MiB/quarter.

 There were many other data points, please refer to the slides
 for full details.

 With this data, they were able to generate an I/O model of the
 application that was used to drive nandsim, an in-kernel NAND device
 simulator. By doing this, they were able to replicate the product's
 expected lifetime before user replacement (3 years) in a matter of a
 few days. nandsim + the UBI reporting mechanisms were used to generate
 detailed reports of the wear leveling behaviour of the system, how the
 filesystem reacted to bitflips, bad pages, etc. Using this they were
 able to determine how to lay out their filesystem to meet the
 lifecycle requirement. After this was done, the same I/O model was
 used to rapidly drive a real device toward failure modes to see how it
 would react. If it didn't survive for the expected lifecycle,
 they could analyze the data and figure out what settings to tweak.

Did he discuss trying to use the same approach for managed NAND, where
there is no visibility into the wear levelling?  Given the difference
in manufacturing volumes, I fear that raw NAND will go the way of SCSI
disks (same basic storage medium, twice the price).

 In this talk I also learned about the MLC NAND property of read
 disturbance, where a read to one page can cause a bit-flip on an
 adjacent page.

And write disturbs are a much bigger problem than in SLC NAND.

 I found the talk fascinating and it has made me wonder if we
 have any idea what our typical deployed usage patterns might
 look like?  How often does the journal write to disk and how
 big is each write?  How often do systems reboot and
 require a full filesystem read vs simply suspending/resuming?

Unfortunately, the XO is a general purpose device.  There is a
huge difference in storage patterns between a kid that is just using
the laptop for reading and writing, and a kid that is trying to be
the next Stanley Kubrick.

There was talk of getting typical profiles for disk usage for the XO,
and using them to drive some device testing (and I'm still willing to
do such).  Until such are available, I'm simulating the case of a
kid who fills up their laptop with data they don't want to delete, and
then keeps acquiring data and deleting it.  If naive wear levelling
(just using unused blocks) is used, this is a worst case for device
wear.
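
A toy Python sketch of why this pattern is a worst case for naive wear
levelling. This is not the test rig described above: the function name,
block counts and write counts are all made up for illustration, and the
"static-aware" variant ignores the extra erases needed to relocate the
static data.

```python
def simulate(num_blocks, static_blocks, block_writes, move_static):
    """Return per-block erase counts after block_writes block rewrites.

    Blocks 0..static_blocks-1 hold data the user never deletes.
    move_static=False models naive wear levelling: only the remaining
    free blocks are ever reused.
    move_static=True models an idealized leveller that may also reuse
    the static blocks (relocation cost is ignored in this sketch).
    """
    erase = [0] * num_blocks
    if move_static:
        candidates = list(range(num_blocks))
    else:
        candidates = list(range(static_blocks, num_blocks))
    for _ in range(block_writes):
        # Always reuse the least-worn block among the candidates.
        victim = min(candidates, key=lambda b: erase[b])
        erase[victim] += 1
    return erase


if __name__ == "__main__":
    naive = simulate(10, 8, 100, move_static=False)
    ideal = simulate(10, 8, 100, move_static=True)
    print("naive, max erases:", max(naive))
    print("static-aware, max erases:", max(ideal))
```

With 80% of the device pinned by static data, the naive scheme
concentrates all wear on the remaining 20% of the blocks, so those blocks
wear out about five times sooner than under even distribution.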

At the same time, I'm trying to get a measurement for error rates.
So far, unless you count the repeated kernel crashes as errors,
we've only seen what looks like SD bus errors (the data in the
device didn't change, but a read returned erroneous data.)

 Related to this topic, I am also wondering what is the expected
 usable life of the XO? We're used to product replacement every few
 years, sometimes faster depending on the product segment, but I
 doubt countries that are investing $millions expect to only get
 2-3 years of use out of the XO.

The desired lifetime of the XO is five years.  But all the components
that would wear out after that time (main battery, backlight, keyboard)
are easily replaceable.  If the NAND dies, it kills the most expensive
component.

Unfortunately, devices are getting less reliable faster than they are
growing in size.   Upcoming devices expect write/erase cycle lifetimes
of 5 - 10K instead of the 100K expected of our current SLC NAND.
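
To put the drop in write/erase endurance into perspective, here is a
minimal Python sketch of a lifetime estimate. It assumes perfect wear
levelling; the device size and daily write volume are hypothetical numbers
chosen for illustration, not measurements from the XO.

```python
def years_until_worn(capacity_bytes, cycles, bytes_per_day,
                     write_amplification=1.0):
    """Years until every block has used up its rated erase cycles.

    Assumes wear is spread perfectly evenly; write_amplification > 1
    models extra internal writes (garbage collection, metadata).
    """
    total_writable = capacity_bytes * cycles / write_amplification
    return total_writable / bytes_per_day / 365


if __name__ == "__main__":
    capacity = 1024**3        # hypothetical 1 GiB device
    daily_writes = 1024**3    # hypothetical 1 GiB written per day
    for cycles in (100_000, 10_000, 5_000):
        years = years_until_worn(capacity, cycles, daily_writes)
        print("%6d cycles: %.1f years" % (cycles, years))
```

The estimate scales linearly with the cycle rating, so going from 100K to
5K cycles cuts the lifetime by a factor of 20 for the same workload, and
uneven wear shortens it further.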

Thanks for the links,
wad



Re: File systems usage patterns and NAND lifetime

2008-10-10 Thread Philippe Clérié
Valerie Henson blogged about SSDs a while back 
(http://valhenson.livejournal.com/25228.html). Since then I've made 
sure I back up anything I have on flash.

Philippe

--
The trouble with common sense is that it is so uncommon.
Anonymous