"came to the conclusion they were put to an "unintended use"." wtf? :)))) Best to install them inside a shut-down workstation... :)

On 18 September 2015 at 01:04, Quentin Hartman <[email protected]> wrote:

> I ended up having 7 total die. 5 while in service, 2 more when I hooked
> them up to a test machine to collect information from them. To Samsung's
> credit, they've been great to deal with and are replacing the failed
> drives, on the condition that I don't use them for ceph again. Apparently
> they sent some of my failed drives to an engineer in Korea and they did a
> failure analysis on them and came to the conclusion they were put to an
> "unintended use". I have seven left I'm not sure what to do with.
>
> I've honestly always really liked Samsung, and I'm disappointed that I
> wasn't able to find anyone with their DC-class drives actually in stock so
> I ended up switching to the Intel S3700s. My users will be happy to have
> some SSDs to put in their workstations though!
>
> QH
>
> On Thu, Sep 17, 2015 at 4:49 PM, Andrija Panic <[email protected]>
> wrote:
>
>> Another one bites the dust...
>>
>> This is Samsung 850 PRO 256GB... (6 journals on this SSD just died...)
>>
>> [root@cs23 ~]# smartctl -a /dev/sda
>> smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.10.66-1.el6.elrepo.x86_64]
>> (local build)
>> Copyright (C) 2002-12 by Bruce Allen,
>> http://smartmontools.sourceforge.net
>>
>> Vendor: /1:0:0:0
>> Product:
>> User Capacity: 600,332,565,813,390,450 bytes [600 PB]
>> Logical block size: 774843950 bytes
>>
>> Terminate command early due to bad response to IEC mode page
>> A mandatory SMART command failed: exiting. To continue, add one or more
>> '-T permissive' options
>>
>> On 8 September 2015 at 18:01, Quentin Hartman <
>> [email protected]> wrote:
>>
>>> On Tue, Sep 8, 2015 at 9:05 AM, Mark Nelson <[email protected]> wrote:
>>>
>>>>> A list of hardware that is known to work well would be incredibly
>>>>> valuable to people getting started. It doesn't have to be exhaustive,
>>>>> nor does it have to provide all the guidance someone could want. A
>>>>> simple "these things have worked for others" would be sufficient. If
>>>>> nothing else, it will help people justify more expensive gear when
>>>>> their approval people say "X seems just as good and is cheaper, why
>>>>> can't we get that?".
>>>>>
>>>>
>>>> So I have my opinions on different drives, but I think we do need to be
>>>> really careful not to appear to endorse or pick on specific vendors. The
>>>> more we can stick to high-level statements like:
>>>>
>>>> - Drives should have high write endurance
>>>> - Drives should perform well with O_DSYNC writes
>>>> - Drives should support power loss protection for data in motion
>>>>
>>>> The better I think. Once those are established, I think it's
>>>> reasonable to point out that certain drives meet (or do not meet) those
>>>> criteria and get feedback from the community as to whether or not vendor's
>>>> marketing actually reflects reality. It'd also be really nice to see more
>>>> information available like the actual hardware (capacitors, flash cells,
>>>> etc) used in the drives. I've had to show photos of the innards of
>>>> specific drives to vendors to get them to give me accurate information
>>>> regarding certain drive capabilities. Having a database of such things
>>>> available to the community would be really helpful.
>>>>
>>>
>>> That's probably a very good approach. I think it would be pretty simple
>>> to avoid the appearance of endorsement if the data is presented correctly.
>>>
>>>>> To that point, I think perhaps though something more important than a
>>>>> list of known "good" hardware would be a list of known "bad" hardware,
>>>>>
>>>>
>>>> I'm rather hesitant to do this unless it's been specifically confirmed
>>>> by the vendor. It's too easy to point fingers (see the recent kernel trim
>>>> bug situation).
>>>
>>>
>>> I disagree. I think that only comes into play if you claim to know why
>>> the hardware has problems. In this case, if you simply state "people who
>>> have used this drive have experienced a large number of seemingly premature
>>> failures when using them as journals" that provides sufficient warning to
>>> users, and if the vendor wants to engage the community and potentially pin
>>> down why and help us find a way to make the device work or confirm that
>>> it's just not suited, then that's on them. Samsung seems to be doing
>>> exactly that. It would be great to have them help provide that level of
>>> detail, but again, I don't think it's necessary. We're not saying
>>> "ceph/redhat/$whatever says this hardware sucks" we're saying "The
>>> community has found that using this hardware with ceph has exhibited these
>>> negative behaviors...". At that point you're just relaying experiences and
>>> collecting them in a central location. It's up to the reader to draw
>>> conclusions from it.
>>>
>>> But again, I think more important than either of these would be a
>>> collection of use cases with actual journal write volumes that have
>>> occurred in those use cases so that people can make more informed
>>> purchasing decisions. The fact that my small openstack cluster created 3.6T
>>> of writes per month on my journal drives (3 OSD each) is somewhat
>>> mind-blowing. That's almost four times the amount of writes my best guess
>>> estimates indicated we'd be doing. Clearly there's more going on than we
>>> are used to paying attention to. Someone coming to ceph and seeing the cost
>>> of DC-class SSDs versus consumer-class SSDs will almost certainly suffer
>>> from some amount of sticker shock, and even if they don't their purchasing
>>> approval people almost certainly will. This is especially true for people
>>> in smaller organizations where SSDs are still somewhat exotic. And when
>>> they come back with the "Why won't cheaper thing X be OK?" they need to
>>> have sufficient information to answer that. Without a test environment to
>>> generate data with, they will need to rely on the experiences of others,
>>> and right now those experiences don't seem to be documented anywhere, and
>>> if they are, they are not very discoverable.
>>>
>>> QH
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> [email protected]
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>> --
>>
>> Andrija Panić
>>
>

--
Andrija Panić
