On Tue, Sep 8, 2015 at 9:05 AM, Mark Nelson <mnel...@redhat.com> wrote:

>> A list of hardware that is known to work well would be incredibly
>> valuable to people getting started. It doesn't have to be exhaustive,
>> nor does it have to provide all the guidance someone could want. A
>> simple "these things have worked for others" would be sufficient. If
>> nothing else, it will help people justify more expensive gear when their
>> approval people say "X seems just as good and is cheaper, why can't we
>> get that?".
>>
>
> So I have my opinions on different drives, but I think we do need to be
> really careful not to appear to endorse or pick on specific vendors. The
> more we can stick to high-level statements like:
>
> - Drives should have high write endurance
> - Drives should perform well with O_DSYNC writes
> - Drives should support power loss protection for data in motion
>
> ...the better, I think.  Once those are established, I think it's reasonable
> to point out that certain drives meet (or do not meet) those criteria and
> get feedback from the community as to whether or not vendors' marketing
> actually reflects reality.  It'd also be really nice to see more
> information available like the actual hardware (capacitors, flash cells,
> etc) used in the drives.  I've had to show photos of the innards of
> specific drives to vendors to get them to give me accurate information
> regarding certain drive capabilities.  Having a database of such things
> available to the community would be really helpful.
>
>
That's probably a very good approach. I think it would be pretty simple to
avoid the appearance of endorsement if the data is presented correctly.
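
For the O_DSYNC criterion in particular, community reports are easier to
compare if everyone measures roughly the same thing. Below is a minimal
sketch of such a probe, not an established benchmark: the function names,
the 4 KiB block size, and the write count are my own assumptions, and it
expects a Unix-like system. Run it against a scratch file on a filesystem
backed by the drive you care about, never against a device in use.

```python
import os
import sys
import time


def dsync_write_latencies(path, block_size=4096, count=100):
    """Time `count` sequential writes of `block_size` bytes using O_DSYNC.

    Returns a list of per-write latencies in seconds.
    """
    # O_DSYNC asks the kernel to return from write() only once the data is
    # on stable storage. Fall back to O_SYNC where O_DSYNC is not exposed.
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o600)
    block = os.urandom(block_size)  # random data, so it is not trivially compressible
    latencies = []
    try:
        for _ in range(count):
            start = time.perf_counter()
            written = os.write(fd, block)
            latencies.append(time.perf_counter() - start)
            if written != block_size:
                raise OSError("short write")
    finally:
        os.close(fd)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies (seconds) to a few figures in milliseconds."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "writes": len(ordered),
        "mean_ms": 1000 * sum(ordered) / len(ordered),
        "p99_ms": 1000 * ordered[p99_index],
        "max_ms": 1000 * ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical usage: pass a directory on the drive under test.
    target_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    probe = os.path.join(target_dir, "dsync_probe.tmp")
    try:
        print(summarize(dsync_write_latencies(probe)))
    finally:
        if os.path.exists(probe):
            os.remove(probe)
```

The numbers only mean something relative to other drives measured the same
way, and a filesystem in between adds its own overhead, so treat the output
as a rough comparison rather than a spec-sheet figure.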


>
>> To that point, though, I think something more important than a
>> list of known "good" hardware would be a list of known "bad" hardware,
>>
>
> I'm rather hesitant to do this unless it's been specifically confirmed by
> the vendor.  It's too easy to point fingers (see the recent kernel trim bug
> situation).


I disagree. I think that only comes into play if you claim to know why the
hardware has problems. In this case, if you simply state "people who have
used this drive have experienced a large number of seemingly premature
failures when using it as a journal", that provides sufficient warning to
users, and if the vendor wants to engage the community and potentially pin
down why and help us find a way to make the device work or confirm that
it's just not suited, then that's on them. Samsung seems to be doing
exactly that. It would be great to have them help provide that level of
detail, but again, I don't think it's necessary. We're not saying
"ceph/redhat/$whatever says this hardware sucks"; we're saying "The
community has found that this hardware exhibits these negative behaviors
when used with ceph...". At that point you're just relaying experiences and
collecting them in a central location. It's up to the reader to draw
conclusions from it.

But again, I think more important than either of these would be a
collection of use cases with actual journal write volumes that have
occurred in those use cases so that people can make more informed
purchasing decisions. The fact that my small OpenStack cluster created 3.6T
of writes per month on my journal drives (3 OSDs each) is somewhat
mind-blowing. That's almost four times the write volume my best-guess
estimates indicated we'd be doing. Clearly there's more going on than we
are used to paying attention to. Someone coming to Ceph and seeing the cost
of DC-class SSDs versus consumer-class SSDs will almost certainly suffer
from some amount of sticker shock, and even if they don't, their purchasing
approval people almost certainly will. This is especially true for people
in smaller organizations where SSDs are still somewhat exotic. And when
the approvers come back with "Why won't cheaper thing X be OK?", they need
to have sufficient information to answer that. Without a test environment to
generate data with, they will need to rely on the experiences of others,
and right now those experiences don't seem to be documented anywhere, and
if they are, they are not very discoverable.
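
To show how a measured write volume would feed a purchasing decision, here
is a rough sketch of the arithmetic. It assumes the 3.6T figure above means
terabytes written per journal drive per month; the endurance ratings and
capacity in the example are hypothetical placeholders, not numbers for any
real drive, and the function names are my own.

```python
def months_until_rated_endurance(rated_tbw, writes_tb_per_month):
    """Months until a drive reaches its rated endurance (in TB written)
    at a constant monthly write volume."""
    if writes_tb_per_month <= 0:
        raise ValueError("writes_tb_per_month must be positive")
    return rated_tbw / writes_tb_per_month


def drive_writes_per_day(writes_tb_per_month, capacity_gb, days_per_month=30):
    """Express a monthly write volume as full-drive writes per day (DWPD)."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    gb_per_day = writes_tb_per_month * 1000 / days_per_month
    return gb_per_day / capacity_gb


if __name__ == "__main__":
    observed_tb_per_month = 3.6  # from the cluster described above (assumed TB per drive)

    # Hypothetical endurance ratings in TB written; substitute datasheet values.
    hypothetical_ratings = {"cheaper drive X": 75, "DC-class drive Y": 3600}
    for name, rated_tbw in hypothetical_ratings.items():
        months = months_until_rated_endurance(rated_tbw, observed_tb_per_month)
        print(f"{name}: about {months:.0f} months to rated endurance")

    # Hypothetical 200 GB journal device.
    print(f"DWPD at 200 GB: {drive_writes_per_day(observed_tb_per_month, 200):.2f}")
```

Rated endurance is a warranty figure rather than a failure prediction, so
this only gives an approver a first-order comparison; it is no substitute
for the collected real-world experiences argued for above.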

QH
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
