Well, if you look at the very fine print on their warranty statement and
some spec sheets, they say the drives are only supposed to be used in
"Client PCs", and if an application exceeds a certain amount of writes per
day it voids the warranty, even if the total stays below the volume of
writes the drive is rated to handle. I suspect it's the write rate that is
killing them. Purely by the amount of writes, mine should have been at about
50% life or better.
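For what it's worth, the drive's own counters can be used to sanity-check that "50% life" figure. Below is a rough sketch of pulling them out of smartctl output; the two attribute names are Samsung-style and vendor-specific, and the sample lines are made-up stand-ins for real `smartctl -A /dev/sda` output:

```shell
# Made-up sample of the two relevant SMART attribute lines (vendor-specific;
# Samsung consumer drives expose 177 Wear_Leveling_Count and 241 Total_LBAs_Written).
smart_sample='177 Wear_Leveling_Count   0x0013 095 095 000 Pre-fail Always - 163
241 Total_LBAs_Written    0x0032 099 099 000 Old_age  Always - 48828125000'

# Normalized wear value (counts down from 100 as the flash wears out).
wear=$(echo "$smart_sample" | awk '/Wear_Leveling_Count/ { print $4 }')
# Total host writes: 512-byte LBAs converted to terabytes.
tb_written=$(echo "$smart_sample" | awk '/Total_LBAs_Written/ { printf "%.2f", $10 * 512 / 1e12 }')

echo "wear normalized value: $wear"
echo "total written: ${tb_written} TB"
```

Run against a real device, the same two greps give you both the drive's own wear estimate and the raw write volume to compare it against.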

So, according to the strict letter of their specs, a ceph server would be
an unintended use. Of course, all that detail gets omitted in lots of
places where one would do research. In the end though, they are taking care
of me, and frankly that means a lot more in my book. And for what it's
worth, I have many drives from them in PCs and laptops that have been
rolling happily along for years.

QH

On Thu, Sep 17, 2015 at 5:07 PM, Andrija Panic <[email protected]>
wrote:

> "      came to the conclusion they we put to an "unintended use".   "
> wtf ? :)))) Best to install them inside shutdown workstation... :)
>
> On 18 September 2015 at 01:04, Quentin Hartman <
> [email protected]> wrote:
>
>> I ended up having 7 total die. 5 while in service, 2 more when I hooked
>> them up to a test machine to collect information from them. To Samsung's
>> credit, they've been great to deal with and are replacing the failed
>> drives, on the condition that I don't use them for ceph again. Apparently
>> they sent some of my failed drives to an engineer in Korea and they did a
>> failure analysis on them and came to the conclusion they were put to an
>> "unintended use". I have seven left that I'm not sure what to do with.
>>
>> I've honestly always really liked Samsung, and I'm disappointed that I
>> wasn't able to find anyone with their DC-class drives actually in stock so
>> I ended up switching to Intel S3700s. My users will be happy to have
>> some SSDs to put in their workstations though!
>>
>> QH
>>
>> On Thu, Sep 17, 2015 at 4:49 PM, Andrija Panic <[email protected]>
>> wrote:
>>
>>> Another one bites the dust...
>>>
>>> This is Samsung 850 PRO 256GB... (6 journals on this SSDs just died...)
>>>
>>> [root@cs23 ~]# smartctl -a /dev/sda
>>> smartctl 5.43 2012-06-30 r3573
>>> [x86_64-linux-3.10.66-1.el6.elrepo.x86_64] (local build)
>>> Copyright (C) 2002-12 by Bruce Allen,
>>> http://smartmontools.sourceforge.net
>>>
>>> Vendor:               /1:0:0:0
>>> Product:
>>> User Capacity:        600,332,565,813,390,450 bytes [600 PB]
>>> Logical block size:   774843950 bytes
>>> >> Terminate command early due to bad response to IEC mode page
>>> A mandatory SMART command failed: exiting. To continue, add one or more
>>> '-T permissive' options
>>>
>>> On 8 September 2015 at 18:01, Quentin Hartman <
>>> [email protected]> wrote:
>>>
>>>> On Tue, Sep 8, 2015 at 9:05 AM, Mark Nelson <[email protected]> wrote:
>>>>
>>>>> A list of hardware that is known to work well would be incredibly
>>>>>> valuable to people getting started. It doesn't have to be exhaustive,
>>>>>> nor does it have to provide all the guidance someone could want. A
>>>>>> simple "these things have worked for others" would be sufficient. If
>>>>>> nothing else, it will help people justify more expensive gear when
>>>>>> their
>>>>>> approval people say "X seems just as good and is cheaper, why can't we
>>>>>> get that?".
>>>>>>
>>>>>
>>>>> So I have my opinions on different drives, but I think we do need to
>>>>> be really careful not to appear to endorse or pick on specific vendors. 
>>>>> The
>>>>> more we can stick to high-level statements like:
>>>>>
>>>>> - Drives should have high write endurance
>>>>> - Drives should perform well with O_DSYNC writes
>>>>> - Drives should support power loss protection for data in motion
>>>>>
>>>>> The better, I think.  Once those are established, I think it's
>>>>> reasonable to point out that certain drives meet (or do not meet) those
>>>>> criteria and get feedback from the community as to whether or not vendors'
>>>>> marketing actually reflects reality.  It'd also be really nice to see more
>>>>> information available like the actual hardware (capacitors, flash cells,
>>>>> etc) used in the drives.  I've had to show photos of the innards of
>>>>> specific drives to vendors to get them to give me accurate information
>>>>> regarding certain drive capabilities.  Having a database of such things
>>>>> available to the community would be really helpful.
>>>>>
>>>>>
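On the O_DSYNC criterion above: the usual quick check is a small dd run with oflag=dsync, which issues each 4k write synchronously the way a ceph journal does. Journal-class SSDs sustain this at tens of MB/s, while many consumer drives collapse to a few MB/s. A sketch, assuming GNU dd on Linux; the target path is a placeholder and should live on the drive under test:

```shell
# Synchronous 4k writes, roughly the pattern a ceph journal issues.
# GNU dd on Linux assumed (oflag=dsync); point TARGET at the SSD under test.
TARGET=/tmp/dsync-test.bin
dd if=/dev/zero of="$TARGET" bs=4k count=1000 oflag=dsync 2>&1 | tail -n1

size=$(wc -c < "$TARGET")   # sanity check: 1000 * 4096 bytes written
rm -f "$TARGET"
```

The last line of dd's output reports the sustained rate; comparing that number across candidate drives is a far better predictor of journal behavior than the headline sequential-write spec.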
>>>> That's probably a very good approach. I think it would be pretty simple
>>>> to avoid the appearance of endorsement if the data is presented correctly.
>>>>
>>>>
>>>>>
>>>>>> To that point, I think perhaps though something more important than a
>>>>>> list of known "good" hardware would be a list of known "bad" hardware,
>>>>>>
>>>>>
>>>>> I'm rather hesitant to do this unless it's been specifically confirmed
>>>>> by the vendor.  It's too easy to point fingers (see the recent kernel trim
>>>>> bug situation).
>>>>
>>>>
>>>> I disagree. I think that only comes into play if you claim to know why
>>>> the hardware has problems. In this case, if you simply state "people who
>>>> have used this drive have experienced a large number of seemingly premature
>>>> failures when using them as journals" that provides sufficient warning to
>>>> users, and if the vendor wants to engage the community and potentially pin
>>>> down why and help us find a way to make the device work or confirm that
>>>> it's just not suited, then that's on them. Samsung seems to be doing
>>>> exactly that. It would be great to have them help provide that level of
>>>> detail, but again, I don't think it's necessary. We're not saying
>>>> "ceph/redhat/$whatever says this hardware sucks"; we're saying "The
>>>> community has found that using this hardware with ceph has exhibited these
>>>> negative behaviors...". At that point you're just relaying experiences and
>>>> collecting them in a central location. It's up to the reader to draw
>>>> conclusions from it.
>>>>
>>>> But again, I think more important than either of these would be a
>>>> collection of use cases with actual journal write volumes that have
>>>> occurred in those use cases so that people can make more informed
>>>> purchasing decisions. The fact that my small openstack cluster created 3.6T
>>>> of writes per month on my journal drives (3 OSDs each) is somewhat
>>>> mind-blowing. That's almost four times the amount of writes my best guess
>>>> estimates indicated we'd be doing. Clearly there's more going on than we
>>>> are used to paying attention to. Someone coming to ceph and seeing the cost
>>>> of DC-class SSDs versus consumer-class SSDs will almost certainly suffer
>>>> from some amount of sticker shock, and even if they don't their purchasing
>>>> approval people almost certainly will. This is especially true for people
>>>> in smaller organizations where SSDs are still somewhat exotic. And when
>>>> they come back with the "Why won't cheaper thing X be OK?" they need to
>>>> have sufficient information to answer that. Without a test environment to
>>>> generate data with, they will need to rely on the experiences of others,
>>>> and right now those experiences don't seem to be documented anywhere, and
>>>> if they are, they are not very discoverable.
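To put numbers behind that purchasing conversation, the arithmetic is short enough to script. Everything here except the 3.6 TB/month figure from above is a made-up stand-in; the 256 GB capacity and the 150 TBW rating would come from the candidate drive's spec sheet:

```shell
# Back-of-envelope journal endurance math, all volumes in TB.
writes_tb_per_month=3.6   # observed journal write volume (from above)
drive_tb=0.256            # hypothetical 256 GB drive
rated_tbw=150             # hypothetical rated endurance from the spec sheet

dwpd=$(awk -v w="$writes_tb_per_month" -v c="$drive_tb" \
    'BEGIN { printf "%.2f", (w / 30) / c }')          # drive-writes-per-day
months=$(awk -v w="$writes_tb_per_month" -v t="$rated_tbw" \
    'BEGIN { printf "%.0f", t / w }')                 # time to rated TBW

echo "DWPD: $dwpd"
echo "months to rated endurance: $months"
```

Even a modest write load like this turns into a concrete drives-writes-per-day figure and a time-to-wearout that can be set next to the drive's warranty terms, which is exactly the comparison the approval people need to see.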
>>>>
>>>> QH
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> [email protected]
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Andrija Panić
>>>
>>
>>
>
>
> --
>
> Andrija Panić
>