On 04/08/2014 10:12 AM, Lux, Jim (337C) wrote:
On 4/7/14 6:48 PM, "Ellis H. Wilson III" <[email protected]> wrote:

On 04/07/2014 09:34 PM, Prentice Bisbal wrote:
Was it wear out, or some other failure mode?

And if wear out, was it because consumer SSDs have lame leveling or
something like that?

Here's how I remember it. You took the capacity of the disk, figured out
how much data would have to be written to it to wear it out, and then
divided that by the bandwidth of the drive to figure out how long it
would take to write that much data to the disk if data was constantly
being written to it. I think the answer was on the order of 5-10 years,
which is a bit more than the expected lifespan of a cluster, making it a
non-issue.

This would be the ideal case, but requires perfect wear-leveling and
write amplification factor of 1.  Unfortunately, those properties rarely
hold.
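To make the arithmetic concrete, here is a hedged back-of-envelope sketch of that lifetime estimate. All of the figures are illustrative assumptions (roughly an early SLC-era drive), not the specs of any particular product, and the WAF=4 line is just an arbitrary "random-write-heavy" scenario:

```python
# Back-of-envelope SSD wear-out estimate. Every number below is an
# assumption for illustration, not a measurement of a real drive.
capacity_gb = 160          # drive capacity
pe_cycles = 100_000        # rated program/erase cycles per cell (SLC-era)
write_bw_mb_s = 100        # sustained sequential write bandwidth
waf = 1.0                  # write amplification factor (the ideal case)

# Host data the flash can absorb before wear-out, in MB.
endurance_mb = capacity_gb * 1024 * pe_cycles / waf

# Time to write that much data at full bandwidth, nonstop.
seconds = endurance_mb / write_bw_mb_s
years = seconds / (3600 * 24 * 365)
print(f"~{years:.1f} years of continuous writes at WAF={waf:g}")
print(f"~{years / 4:.1f} years if WAF were 4 instead")
```

With these inputs you land in the "about 5 years" range; swap in a modern high-bandwidth MLC/TLC drive with a few thousand P/E cycles and the same math shrinks to months, which is why the WAF and wear-leveling caveats matter.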

However, again, in the case of using it as a Hadoop intermediate disk,
write amp would be a non-issue because you'd be blowing away data after
runs (make sure to use a scripted trim or something, unless the FS
auto-trims, which you may not want), and wear-leveling would be less
important because the data written/read would be large and highly
sequential.  Wear-leveling would be trivial under those conditions.
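The "blow away data after runs, then trim" step might look something like the sketch below. The scratch path, the `job-*` naming, and the use of fstrim(8) are all assumptions to adapt to your layout; note that fstrim needs root and filesystem/device TRIM support:

```python
# Hedged sketch: delete per-job Hadoop intermediate data, then TRIM the
# freed blocks so the SSD's garbage collector can reclaim them cheaply.
import shutil
import subprocess
from pathlib import Path

def clean_scratch(scratch_dir, trim=True):
    """Remove per-job intermediate directories, then TRIM the filesystem."""
    scratch = Path(scratch_dir)
    # "job-*" is a placeholder pattern; match however your jobs lay out data.
    for job_dir in scratch.glob("job-*"):
        shutil.rmtree(job_dir, ignore_errors=True)
    if trim:
        # Alternative: mount with -o discard for automatic TRIM on unlink,
        # at some latency cost -- which you may not want, as noted above.
        subprocess.run(["fstrim", "-v", str(scratch)], check=False)

# Example (path is an assumption): clean_scratch("/mnt/ssd/hadoop-tmp")
```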


Wear leveling would be trivial, if one were designing the wear leveling
algorithms.

Or if you were using a workload that would operate well under any given wear-leveling algorithm, as the example I gave. Hence, "under those conditions."

I could easily see a consumer device having a different algorithm from an
enterprise device, either because they just spend more time and money
getting a good algorithm, or because of different underlying assumptions
about write/read patterns.

This is my understanding. Further, sometimes different algorithms are in use due to acquisitions -- the old algorithm just gets used in the commodity drives, and the new one only in enterprise. Sometimes there are resource reasons for this (the enterprise algorithm is more CPU-intensive or requires more DRAM within the SSD).

Even in an enterprise environment, there's some very different write
patterns possible.  A "scratch" device might get written randomly, while a
"logging" device will tend to be written sequentially.  Consider something
like a credit card processing system.  This is going to have a lot of "add
at the end" transaction data.  As opposed to, say, a library catalog where
books are checked out essentially at random, and you update the "check
out/check in" status, and writes are sprinkled randomly throughout the
data.

I agree, which is what makes wear-leveling such an interesting (and well-researched) area in the SSD field. However, my suggestion for Prentice on how to use it in his system (keeping the discussion on point) avoided dealing with the wide variety of issues SSD manufacturers have to cope with.

Sadly, much of this will not be particularly well documented, if at all.

Supposedly more APIs are being exposed to control wear-leveling, when GC kicks in, etc. (I believe Samsung is at the forefront here). But this is just what I have heard; I don't have examples to share just yet. Very little has been said in this space in the past because these were the most highly guarded of the proprietary algorithms in the SSD arena. As more and more algorithms get researched and made effectively open-source (i.e., yet another sad case of computer science catching up with industry), the pressure is off to protect them so closely, and on to hand the reins to the user.

Best,

ellis

--
Ph.D. Candidate
Department of Computer Science and Engineering
The Pennsylvania State University
www.ellisv3.com
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
