Re: File systems usage patterns and NAND lifetime
Hi,

Philippe Clérié wrote:
> Valerie Henson blogged about SSDs a while back
> (http://valhenson.livejournal.com/25228.html). Since then I've made
> sure I back up anything I have on flash.

Yeah, managed flash devices are tricky. If you are interested, here is a short write-up about raw flash vs. managed flash:
http://www.linux-mtd.infradead.org/doc/ubifs.html#L_raw_vs_ftl

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

___
Devel mailing list
Devel@lists.laptop.org
http://lists.laptop.org/listinfo/devel
File systems usage patterns and NAND lifetime
I attended an Embedded Linux Conference [1] last week, at which I saw a great talk on Managing NAND Over A Product Lifecycle [2]. The speaker presented the case of determining whether a chosen NAND HW and SW combination will survive the estimated lifecycle of a product. As an example, he used a GPS device his firm worked on, for which they had some very specific usage data such as:

- The average runtime for the device is 4 hours a day, during which we will see 100 bytes/second of application logs written, 2300 bytes written for the addressbook, and 1 KiB/second used for temporary storage as maps are decompressed.
- The user will on average update the map data from his/her PC often enough to require 3 GiB of writes per quarter.
- OS and application updates require 32 MiB per quarter.

There were many other data points; please refer to the slides for full details.

With this data, they were able to generate an I/O model of the application that was used to drive nandsim, an in-kernel NAND device simulator. By doing this, they were able to replicate the product's expected lifetime before user replacement (3 years) in a matter of a few days. nandsim plus the UBI reporting mechanisms were used to generate detailed reports of the wear leveling behaviour of the system, how the filesystem reacted to bitflips, bad pages, etc. Using this, they were able to determine how to lay out their filesystem to meet the lifecycle requirement.

After this was done, the same I/O model was used to rapidly drive a real device toward failure modes to see how it would react. If it didn't survive for the expected lifecycle, they could analyze the data and figure out what settings to tweak.

In this talk I also learned about the MLC NAND property of read disturbance, where a read to one page can cause a bit-flip on an adjacent page.

I found the talk fascinating, and it has made me wonder whether we have any idea what our typical deployed usage patterns might look like. How often does the journal write to disk, and how big is each write? How often do systems reboot and require a full filesystem read vs. simply suspending/resuming?

Related to this topic, I am also wondering what the expected usable life of the XO is. We're used to product replacement every few years, sometimes faster depending on the product segment, but I doubt countries that are investing $millions expect to get only 2-3 years of use out of the XO.

~Deepak

[1] http://mvista.com/vision/
[2] http://www.mvista.com/download/fetchdoc.php?docid=329

--
Deepak Saxena - Kernel Developer - [EMAIL PROTECTED]
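The usage figures quoted in the post above can be turned into a rough lifetime write budget. The following is only a back-of-the-envelope sketch in Python, not the speaker's I/O model. The 3-year lifetime, the 4 hours/day runtime, and the per-second and per-quarter rates come from the post; the once-per-day addressbook write, the 1 GiB device capacity, and the write-amplification factor are assumptions made here for illustration, and the erase-cycle estimate assumes perfectly even wear levelling.

```python
# Rough lifetime write budget for the GPS example (figures from the post,
# except where marked ASSUMPTION).
KIB, MIB, GIB = 1024, 1024**2, 1024**3

ON_SECONDS_PER_DAY = 4 * 3600   # device runs 4 hours a day
DAYS = 3 * 365                  # 3-year lifetime before replacement
QUARTERS = 3 * 4


def lifetime_writes():
    """Return bytes written per source over the whole product lifetime."""
    return {
        "app_logs": 100 * ON_SECONDS_PER_DAY * DAYS,
        "temp_map_data": 1 * KIB * ON_SECONDS_PER_DAY * DAYS,
        # ASSUMPTION: the post gives no period for the 2300-byte addressbook
        # write; one write per day is used here.
        "addressbook": 2300 * DAYS,
        "map_updates": 3 * GIB * QUARTERS,
        "os_app_updates": 32 * MIB * QUARTERS,
    }


def average_erase_cycles(total_bytes, capacity_bytes, write_amplification=1.0):
    """Average erase cycles per block, assuming perfectly even wear.

    ASSUMPTION: write_amplification stands in for filesystem and
    garbage-collection overhead; its real value must be measured.
    """
    return total_bytes * write_amplification / capacity_bytes


if __name__ == "__main__":
    writes = lifetime_writes()
    total = sum(writes.values())
    for name, nbytes in writes.items():
        print(f"{name:15s} {nbytes / GIB:8.2f} GiB")
    print(f"{'total':15s} {total / GIB:8.2f} GiB")
    # ASSUMPTION: a hypothetical 1 GiB device and 3x write amplification.
    print("avg erase cycles:", round(average_erase_cycles(total, 1 * GIB, 3.0), 1))
```

Under these assumptions the quarterly map updates are the largest single contributor. The average says nothing about how evenly the wear is spread across blocks, which is the part the nandsim runs and UBI reports were used to examine.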
Re: File systems usage patterns and NAND lifetime
On Fri, Oct 10, 2008 at 7:17 PM, Deepak Saxena [EMAIL PROTECTED] wrote:
> Related to this topic, I am also wondering what is the expected usable
> life of the XO?

5 years is what I have heard many times. I can't find a formal source for it right now, but it's tattooed on my forehead by raw repetition.

cheers,

m
--
[EMAIL PROTECTED] [EMAIL PROTECTED]
-- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff
- working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
Re: File systems usage patterns and NAND lifetime
Deepak -

Thanks very much for the report and the notes; this is great stuff. Of course, my first question is to wonder how long that GPS has actually been in use by customers (g). Any other real-world NAND data would certainly be worth sharing with the team.

- Ed
Re: File systems usage patterns and NAND lifetime
On Oct 10, 2008, at 2:17 AM, Deepak Saxena wrote:

> [...]
> With this data, they were able to generate an I/O model of the
> application that was used to drive nandsim, an in-kernel NAND device
> simulator. [...] nandsim plus the UBI reporting mechanisms were used to
> generate detailed reports of the wear leveling behaviour of the system,
> how the filesystem reacted to bitflips, bad pages, etc. [...]
>
> After this was done, the same I/O model was used to rapidly drive a
> real device toward failure modes to see how it would react. If it
> didn't survive for the expected lifecycle, they could analyze the data
> and figure out what settings to tweak.

Did he discuss trying to use the same approach for managed NAND, where there is no visibility into the wear levelling? Given the difference in manufacturing volumes, I fear that raw NAND will go the way of SCSI disks (same basic storage medium, twice the price).

> In this talk I also learned about the MLC NAND property of read
> disturbance, where a read to one page can cause a bit-flip on an
> adjacent page.

And write disturbs are a much bigger problem than in SLC NAND.

> I found the talk fascinating and it has made me wonder if we have any
> idea what our typical deployed usage patterns might look like? How
> often does the journal write to disk and how big is each write? How
> often do systems reboot and require a full filesystem read vs simply
> suspending/resuming?

Unfortunately, the XO is a general purpose device. There is a huge difference in storage patterns between a kid who is just using the laptop for reading and writing, and a kid who is trying to be the next Stanley Kubrick.

There was talk of getting typical profiles for disk usage for the XO, and using them to drive some device testing (and I'm still willing to do such). Until such profiles are available, I'm simulating the case of a kid who fills up their laptop with data they don't want to delete, and then keeps acquiring data and deleting it. If naive wear levelling (just using unused blocks) is used, this is a worst case for device wear.

At the same time, I'm trying to get a measurement for error rates. So far, unless you count the repeated kernel crashes as errors, we've only seen what looks like SD bus errors (the data in the device didn't change, but a read returned erroneous data).

> Related to this topic, I am also wondering what is the expected usable
> life of the XO? We're used to product replacement every few years,
> sometimes faster depending on the product segment, but I doubt
> countries that are investing $millions expect to only get 2-3 years of
> use out of the XO.

The desired lifetime of the XO is five years. But all the components that would wear out after that time (main battery, backlight, keyboard) are easily replaceable. If the NAND dies, it kills the most expensive component.

Unfortunately, devices are getting less reliable faster than they are growing in size. Upcoming devices expect write/erase cycle lifetimes of 5K to 10K instead of the 100K expected of our current SLC NAND.

Thanks for the links,
wad
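The worst case described above (a device mostly full of data that is never deleted, with churn confined to the remaining free space) can be illustrated with a toy model. The Python sketch below is not the test being run on the XO and does not model any real flash translation layer; the function name `simulate_wear`, the block counts, and the swap threshold are all hypothetical. It compares naive wear levelling, which only reuses unused blocks, against a simple scheme that occasionally relocates the static data onto worn blocks.

```python
def simulate_wear(num_blocks, static_blocks, block_writes,
                  static_wl=False, threshold=8):
    """Toy wear model; returns the erase count of every block.

    static_blocks hold data that is never deleted; every write is one
    block-sized chunk of short-lived data. All parameters are hypothetical.
    """
    erases = [0] * num_blocks
    is_static = [i < static_blocks for i in range(num_blocks)]

    for _ in range(block_writes):
        # Naive (dynamic) wear levelling: erase and reuse the least-worn
        # block that currently holds no long-lived data.
        free = [i for i in range(num_blocks) if not is_static[i]]
        target = min(free, key=erases.__getitem__)
        erases[target] += 1

        if static_wl and static_blocks > 0:
            cold = min((i for i in range(num_blocks) if is_static[i]),
                       key=erases.__getitem__)
            hot = max(free, key=erases.__getitem__)
            if erases[hot] - erases[cold] > threshold:
                # Move the long-lived data onto the most-worn free block
                # (one extra erase) so the barely used block joins the
                # free pool.
                erases[hot] += 1
                is_static[hot], is_static[cold] = True, False
    return erases


if __name__ == "__main__":
    # Hypothetical device: 64 blocks, 56 of them full of data never deleted.
    naive = simulate_wear(64, 56, 8000)
    levelled = simulate_wear(64, 56, 8000, static_wl=True)
    print("naive:    max erases per block =", max(naive))
    print("levelled: max erases per block =", max(levelled))
```

In this model the naive policy spreads all erases over the free blocks only, so the most-worn block reaches a given cycle rating after roughly that rating multiplied by the number of free blocks, counted in block writes. That is why a 5K to 10K rating leaves far less margin than 100K when the device is nearly full.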
Re: File systems usage patterns and NAND lifetime
Valerie Henson blogged about SSDs a while back (http://valhenson.livejournal.com/25228.html). Since then I've made sure I back up anything I have on flash.

Philippe

--
The trouble with common sense is that it is so uncommon.
Anonymous