On Tuesday, April 28, 2015, Dominik Hannen <[email protected]> wrote:

> Hi ceph-users,
>
> I am currently planning a cluster and would like some input, specifically
> about the storage nodes.
>
> The non-OSD systems will be running on more powerful hardware.
>
> Interconnect as currently planned:
> 4 x 1Gbit LACP bonds over a pair of MLAG-capable switches (planned: EX3300)

One problem with LACP is that it only gives you 1Gbps between any two IPs or MACs (depending on your switch's hashing configuration). This will most likely limit the throughput of any single client to 1Gbps, which is roughly 125MB/s of storage throughput. It is not really equivalent to a 4Gbps interface, or to 2x 2Gbps interfaces if you plan to have separate client and cluster networks.
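To make the arithmetic concrete, here is a minimal sketch of the per-flow vs. aggregate limits (the numbers are just unit conversion under the stated assumption that LACP pins each flow to one member link, not a benchmark):

```python
# Per-flow vs. aggregate throughput of a 4 x 1Gbit LACP bond.
# LACP hashes each flow (by MAC or IP pair, depending on the switch /
# bonding hash policy) onto a single member link, so one client flow
# can never exceed the speed of one link.

def gbps_to_mb_s(gbps):
    """Convert link speed in Gbit/s to MB/s (1 Gbit/s = 1000/8 = 125 MB/s)."""
    return gbps * 1000 / 8

link_gbps = 1   # one bond member
links = 4       # members in the bond

per_flow = gbps_to_mb_s(link_gbps)           # what a single client sees
aggregate = gbps_to_mb_s(link_gbps * links)  # best case, many concurrent flows

print(f"per-flow limit:  {per_flow:.0f} MB/s")    # 125 MB/s
print(f"aggregate limit: {aggregate:.0f} MB/s")   # 500 MB/s
```

The aggregate figure is only reachable when enough distinct flows hash onto different members, which is why a bond helps a busy cluster but not any one client.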
> So far I would go with Supermicro's 5018A-MHN4 offering; rack-space is not
> really a concern, so only 4 OSDs per U is fine.
> (The cluster is planned to start with 8 osd-nodes.)
>
> osd-node:
> Avoton C2758 - 8 x 2.40GHz
> 16 GB RAM ECC
> 16 GB SSD - OS - SATA-DOM
> 250GB SSD - Journal (MX200 250GB with extreme over-provisioning, staggered
> deployment, monitored for TBW value)
> 4 x 3 TB OSD - Seagate Surveillance HDD (ST3000VX000) 7200rpm 24/7
> 4 x 1 Gbit
>
> per-osd breakdown:
> 3 TB HDD
> 2 x 2.40GHz (Avoton cores)
> 4 GB RAM
> 8 GB SSD journal (~125 MB/s r/w)
> 1 Gbit
>
> The main question is, will the Avoton CPU suffice? (I reckon the common
> 1GHz/OSD suggestion is in regard to much more powerful CPUs.)

I don't have any experience with this CPU, but 8x 2.4GHz cores for 4 OSDs seems like plenty of CPU. I have 32GB of RAM for 7 OSDs, which has been enough for me.

> Are there any cost-effective suggestions to improve this configuration?

I have implemented a small cluster with no SSD journals, and the performance is pretty good: 42 OSDs, 3x replication, 40Gb NICs. rados bench shows me 2000 IOPS at 4k writes and 500MB/s at 4M writes.

I would trade your SSD journals for 10Gb NICs and switches. I started out with the same 4x 1Gb LACP config, and things like rebalancing/recovery were terribly slow, in addition to the per-client throughput limit I mentioned above. When you get more funding next quarter/year, you can choose to add the SSD journals or more OSD nodes. Moving to 10Gb networking after you get the cluster up and running will be much harder.

> Will erasure coding be a feasible possibility?
>
> Does it hurt to run OSD-nodes CPU-capped, if you have enough of them?
>
> ___
> Dominik Hannen
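To illustrate why the 1Gb links hurt recovery, here is a back-of-the-envelope sketch. The figures are hypothetical: it assumes a full osd-node (4 x 3 TB) must be re-replicated and that the network ceiling dominates (125 MB/s for a single 1Gbit flow vs. 1250 MB/s for 10Gbit); real recovery parallelizes across many peers and is also disk-bound, so treat these as rough bounds, not predictions:

```python
# Rough estimate of how long re-replicating one failed osd-node takes
# when the network is the bottleneck. All inputs are illustrative.

def recovery_hours(data_tb, net_mb_s):
    """Hours to move data_tb terabytes at a sustained net_mb_s MB/s."""
    data_mb = data_tb * 1_000_000  # TB -> MB (decimal units)
    return data_mb / net_mb_s / 3600

node_tb = 4 * 3  # 4 OSDs x 3 TB each on the failed node

for label, mb_s in [("1Gbit flow", 125), ("10Gbit", 1250)]:
    print(f"{label}: ~{recovery_hours(node_tb, mb_s):.1f} h")
```

Even as a crude bound, the order-of-magnitude gap is the point: a day-plus of degraded redundancy vs. a few hours.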
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
