Are you using the 4TB disks for the journal?

*Nate Curry*
IT Manager
ISSM
*Mosaic ATM*
mobile: 240.285.7341
office: 571.223.7036 x226
[email protected]

On Thu, Jul 2, 2015 at 12:16 PM, Shane Gibson <[email protected]>
wrote:

>
> I'd definitely be happy to share what numbers I can get out of it.  I'm still a
> neophyte with Ceph, and learning how to operate it, set it up ... etc...
>
> My limited performance testing to date has been with "stock" XFS ceph-disk
> built filesystems for the OSDs, basic PG/CRUSH map stuff - and using "dd"
> across RBD-mounted volumes ...  I'm learning how to scale it up and start
> tweaking and tuning.
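For context, the kind of dd-over-RBD smoke test described above might look like the following sketch. The pool name, image name, and mount point are illustrative placeholders, not from the thread, and this assumes a working cluster and the rbd CLI.

```shell
# Create, map, and mount a throwaway RBD image (names are hypothetical)
rbd create --size 10240 rbd/bench-test          # 10 GB test image
DEV=$(rbd map rbd/bench-test)                   # prints the device, e.g. /dev/rbd0
mkfs.xfs "$DEV"
mkdir -p /mnt/bench-test
mount "$DEV" /mnt/bench-test

# Sequential write, bypassing the page cache so the numbers reflect the cluster
dd if=/dev/zero of=/mnt/bench-test/zero.dat bs=4M count=1024 oflag=direct

# Clean up
umount /mnt/bench-test
rbd unmap "$DEV"
rbd rm rbd/bench-test
```

Using `oflag=direct` matters here; without it, dd mostly measures the client's page cache rather than the RBD volume.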
>
> If anyone on the list is interested in specific tests and can provide
> specific, detailed instructions on configuration, test patterns, etc., I'm
> happy to run them if I can ...  We're baking in automation around the Ceph
> deployment from a fresh build using the Open Crowbar deployment tooling,
> with a Ceph workload on it.  Right now we're modifying the Ceph workload to
> work across multiple L3 rack boundaries in the cluster.
>
> Physical servers are Dell R720xd platforms, with 12 spinning (4TB, 7200
> rpm) data disks and 2x 10k 600 GB mirrored OS disks.  Each has 128 GB of
> memory and dual 6-core HT CPUs.
>
> ~~shane
>
>
>
> On 7/1/15, 5:24 PM, "German Anders" <[email protected]> wrote:
>
> I'm interested in such a configuration; can you share some performance
> tests/numbers?
>
> Thanks in advance,
>
> Best regards,
>
> *German*
>
> 2015-07-01 21:16 GMT-03:00 Shane Gibson <[email protected]>:
>
>>
>> It also depends a lot on the size of your cluster ... I have a test
>> cluster I'm standing up right now with 60 nodes - a total of 600 OSDs, each
>> at 4 TB ... If I lose 4 TB, that's a very small fraction of the data.  My
>> replicas are going to be spread out across a lot of spindles, and
>> replicating that missing 4 TB isn't much of an issue across 3 racks, each
>> with 80 gbit/sec ToR uplinks to the spine.  Each node has 20 gbit/sec to the
>> ToR in a bond.
>>
>> On the other hand ... if you only have 4, 8, or 10 servers, and a
>> smaller number of OSDs, you have fewer spindles replicating that loss,
>> and it might be more of an issue.
>>
>> It just depends on the size/scale of your environment.
>>
>> We're going to 8 TB drives - and that will ultimately be spread over
>> 100 or more physical servers w/ 10 OSD disks per server.   This will be
>> across 7 to 10 racks (same network topology) ... so an 8 TB drive loss
>> isn't too big of an issue.   Now, that assumes that replication actually
>> works well at that cluster size.  We're still sussing out this part of the
>> PoC engagement.
>>
>> ~~shane
>>
>>
>>
>>
>> On 7/1/15, 5:05 PM, "ceph-users on behalf of German Anders" <
>> [email protected] on behalf of [email protected]>
>> wrote:
>>
>> Ask the other guys on the list, but for me losing 4TB of data is too
>> much. The cluster will still run fine, but at some point you need to
>> recover that disk, and if you lose one server with all its 4TB disks,
>> that will definitely hurt the cluster. Also take into account that with
>> that kind of disk you will get no more than 100-110 IOPS per disk.
>>
>> *German Anders*
>> Storage System Engineer Leader
>> *Despegar* | IT Team
>> *office* +54 11 4894 3500 x3408
>> *mobile* +54 911 3493 7262
>> *mail* [email protected]
>>
>> 2015-07-01 20:54 GMT-03:00 Nate Curry <[email protected]>:
>>
>>> 4TB is too much to lose?  Why would it matter if you lost one 4TB with
>>> the redundancy?  Won't it auto recover from the disk failure?
>>>
>>> Nate Curry
>>> On Jul 1, 2015 6:12 PM, "German Anders" <[email protected]> wrote:
>>>
>>>> I would probably go with smaller OSD disks; 4TB is too much to lose in
>>>> case of a broken disk, so maybe more OSD daemons with smaller disks, 1TB
>>>> or 2TB each. A 4:1 ratio is good enough, and I also think that 200G disks
>>>> for the journals would be OK, so you can save some money there. Configure
>>>> the OSDs as a JBOD - don't use any RAID under them - and use
>>>> two different networks for the public and cluster traffic.
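The public/cluster network split suggested above is set in ceph.conf; a minimal sketch (the subnets are placeholders, not values from the thread):

```ini
[global]
# Client-facing traffic (clients <-> MONs/OSDs)
public network = 192.168.1.0/24
# OSD-to-OSD replication, recovery, and heartbeat traffic
cluster network = 192.168.2.0/24
```

Keeping replication traffic off the public network prevents recovery storms from starving client I/O.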
>>>>
>>>> *German*
>>>>
>>>> 2015-07-01 18:49 GMT-03:00 Nate Curry <[email protected]>:
>>>>
>>>>> I would like to get some clarification on the size of the journal
>>>>> disks that I should get for the new Ceph cluster I am planning.  I read
>>>>> about the journal settings at
>>>>> http://ceph.com/docs/master/rados/configuration/osd-config-ref/#journal-settings
>>>>> but that didn't really clarify it for me, or I just didn't get it.  I
>>>>> found that the Learning Ceph Packt book states you should have one
>>>>> journal disk for every 4 OSDs.  Using that as a reference, I was
>>>>> planning on getting multiple systems with 8 x 6TB inline SAS drives for
>>>>> OSDs, two SSDs for journalling per host, 2 hot spares for the
>>>>> 6TB drives, and 2 drives for the OS.  I was thinking of 400GB SSD drives
>>>>> but am wondering if that is too much.  Any informed opinions would be
>>>>> appreciated.
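For what it's worth, the docs page linked above gives a rule of thumb for sizing FileStore journals: 2 × expected throughput × filestore max sync interval. A quick sketch of that arithmetic for the layout being discussed; the ~150 MB/s per-spindle throughput is an assumption, not a number from the thread:

```shell
# Rule of thumb from the Ceph OSD config reference:
#   journal size = 2 * expected throughput (MB/s) * filestore max sync interval (s)
THROUGHPUT_MB_S=150     # assumed sequential throughput of one spinning OSD disk
SYNC_INTERVAL_S=5       # default "filestore max sync interval"
PER_OSD_MB=$((2 * THROUGHPUT_MB_S * SYNC_INTERVAL_S))
PER_SSD_MB=$((4 * PER_OSD_MB))   # one SSD journalling for 4 OSDs
echo "per-OSD journal: ${PER_OSD_MB} MB; per-SSD total: ${PER_SSD_MB} MB"
```

By that estimate each journal needs on the order of 1.5 GB, or roughly 6 GB per SSD at a 4:1 ratio, so a 400GB SSD is far larger than the journals strictly require; the extra capacity mainly buys write endurance and headroom.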
>>>>>
>>>>> Thanks,
>>>>>
>>>>> *Nate Curry*
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> [email protected]
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>>>>
>>>>
>>
>
