[ceph-users] Best Practices for Managing Multiple Pools

2016-09-20 Thread Heath Albritton
I'm wondering if anyone has some tips for managing different types of pools, each of which falls on a different type of OSD. Right now, I have a small cluster running with two kinds of OSD nodes: one with spinning disks (and SSD journals) and another with all SATA SSDs. I'm currently running
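The usual hammer-era pattern for this is a CRUSH hierarchy with a separate root per media type and a rule per root; a minimal sketch follows (bucket, host, and pool names are illustrative, not taken from the thread):

    # decompiled CRUSH map excerpt: dedicated root and rule for the SSD hosts
    root ssd {
        id -20
        alg straw
        hash 0  # rjenkins1
        item node-ssd-1 weight 7.0
        item node-ssd-2 weight 7.0
        item node-ssd-3 weight 7.0
    }

    rule ssd-replicated {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
    }

Each pool is then pointed at the matching rule, e.g. "ceph osd pool set ssd-pool crush_ruleset 1" (pre-Luminous syntax).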

Re: [ceph-users] OSPF to the host

2016-06-06 Thread Heath Albritton
Separate OSPF areas would make this unnecessarily complex. In a world where (some) routers are built to carry the full Internet routing table of more than half a million prefixes, your few hundred or few thousand /32s represent very little load to a modern network element. The number of links will
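For reference, advertising a host's loopback /32 into a single backbone area is only a few lines of Quagga/FRR ospfd configuration; the router ID, addresses, and uplinks below are made up for illustration:

    ! /etc/quagga/ospfd.conf (illustrative)
    router ospf
     ospf router-id 10.1.1.11
     passive-interface lo
     ! the host's loopback /32, plus two point-to-point uplinks to the ToRs
     network 10.1.1.11/32 area 0.0.0.0
     network 192.168.10.0/31 area 0.0.0.0
     network 192.168.10.2/31 area 0.0.0.0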

Re: [ceph-users] Blocked ops, OSD consuming memory, hammer

2016-05-25 Thread Heath Albritton
I fear I've hit a bug as well. Considering an upgrade to the latest release of hammer. Somewhat concerned that I may lose those PGs. -H > On May 25, 2016, at 07:42, Gregory Farnum <gfar...@redhat.com> wrote: > >> On Tue, May 24, 2016 at 11:19 PM, Heath Albritton <hal

Re: [ceph-users] Blocked ops, OSD consuming memory, hammer

2016-05-25 Thread Heath Albritton
>> On Tue, May 24, 2016 at 2:16 PM, Heath Albritton <halbr...@harm.org> >> wrote: >> > Having some problems with my cluster. Wondering if I could get some >> > troubleshooting tips: >> > >> > Running hammer 0.94.5. Small cluster with cache t

[ceph-users] Blocked ops, OSD consuming memory, hammer

2016-05-24 Thread Heath Albritton
Having some problems with my cluster. Wondering if I could get some troubleshooting tips: Running hammer 0.94.5. Small cluster with cache tiering. 3 spinning nodes and 3 SSD nodes. Lots of blocked ops. OSDs are consuming the entirety of the system memory (128GB) and then falling over. Lots

[ceph-users] blocked ops

2016-05-24 Thread Heath Albritton
Having some issues with blocked ops on a small cluster. Running 0.94.5 with cache tiering. 3 cache nodes with 8 SSDs each and 3 spinning nodes with 12 spinning disks and journals. All the pools are 3x replicas. Started experiencing problems with OSDs in the cold tier consuming the entirety of
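The usual first steps for pinning down where blocked ops are stuck look something like the commands below (osd.12 is a placeholder; the daemon commands have to run on the node hosting that OSD):

    # which requests are slow, and on which OSDs
    ceph health detail

    # in-flight and recently completed ops on a suspect OSD
    ceph daemon osd.12 dump_ops_in_flight
    ceph daemon osd.12 dump_historic_ops

    # with tcmalloc builds, check how much heap an OSD is holding
    ceph tell osd.12 heap stats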

Re: [ceph-users] ZFS or BTRFS for performance?

2016-03-19 Thread Heath Albritton
If you google "ceph bluestore" you'll be able to find a couple slide decks on the topic. One of them, by Sage, is easy to follow without the benefit of the presentation. There's also the "Red Hat Ceph Storage Roadmap 2016" deck. In any case, bluestore is not intended to address bitrot. Given

Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-03-19 Thread Heath Albritton
The rule of thumb is to match the journal throughput to the OSD throughput. I'm seeing ~180MB/s sequential write on my OSDs and I'm using one of the P3700 400GB units per six OSDs. The 400GB P3700 yields around 1200MB/s* and has around 1/10th the latency of any SATA SSD I've tested. I put a
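The arithmetic behind that pairing, using the figures from the post:

    6 OSDs x ~180 MB/s sequential write  ~= 1080 MB/s aggregate journal traffic
    1 x P3700 400GB                      ~= 1200 MB/s sequential write

so a single 400GB P3700 can absorb the combined journal writes of six spinning OSDs with a little headroom to spare.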

Re: [ceph-users] ZFS or BTRFS for performance?

2016-03-19 Thread Heath Albritton
Neither of these file systems is recommended for production use underlying an OSD. The general direction for ceph is to move away from having a file system at all. That effort is called "bluestore" and is supposed to show up in the jewel release. -H > On Mar 18, 2016, at 11:15, Schlacta,

Re: [ceph-users] Fwd: List of SSDs

2016-03-05 Thread Heath Albritton
> On Thu, Mar 3, 2016 at 10:17 PM, Christian Balzer wrote: > Fair enough. > Sync tests would be nice, if nothing else to confirm that the Samsung DC > level SSDs are suitable and how they compare in that respect to the Intels. I'll do some sync testing next week and maybe gather

Re: [ceph-users] Fwd: List of SSDs

2016-02-29 Thread Heath Albritton
> Did you just do these tests or did you also do the "suitable for Ceph" > song and dance, as in sync write speed? These were done with libaio, so async. I can do a sync test if that helps. My goal for testing wasn't specifically suitability with ceph, but overall suitability in my environment,
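For comparison, the sync test being asked about is typically a single-job, queue-depth-1 O_DIRECT write with sync semantics; a sketch of such an fio run (the device path is a placeholder, and the run is destructive to data on it):

    fio --name=journal-sync-test --filename=/dev/sdX \
        --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting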

[ceph-users] Fwd: List of SSDs

2016-02-27 Thread Heath Albritton
I've done a bit of testing with the Intel units: S3600, S3700, S3710, and P3700. I've also tested the Samsung 850 Pro, 845DC Pro, and SM863. All of my testing was "worst case IOPS" as described here: http://www.anandtech.com/show/8319/samsung-ssd-845dc-evopro-preview-exploring-worstcase-iops/6
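The linked worst-case methodology amounts to sustained 4K random writes at high queue depth until the drive reaches steady state; something along these lines with fio (parameters are illustrative, not the article's exact settings, and the device path is a placeholder):

    fio --name=steady-state-randwrite --filename=/dev/sdX \
        --direct=1 --ioengine=libaio --rw=randwrite --bs=4k \
        --iodepth=32 --numjobs=4 --runtime=1800 --time_based \
        --group_reporting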

Re: [ceph-users] List of SSDs

2016-02-26 Thread Heath Albritton
I've done a bit of testing with the Intel units: S3600, S3700, S3710, and P3700. I've also tested the Samsung 850 Pro, 845DC Pro, and SM863. All of my testing was "worst case IOPS" as described here: http://www.anandtech.com/show/8319/samsung-ssd-845dc-evopro-preview-exploring-worstcase-iops/6

Re: [ceph-users] Tips for faster openstack instance boot

2016-02-08 Thread Heath Albritton
I'm not sure what's normal, but I'm on OpenStack Juno with ceph 0.94.5 using separate pools for nova, glance, and cinder. Takes 16 seconds to start an instance (el7 minimal). Everything is on 10GE and I'm using cache tiering, which I'm sure speeds things up. Can personally verify that COW is
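The config bits that COW cloning depends on, roughly as they look on a Juno-era deployment (pool names and the secret UUID are placeholders; images have to be stored in raw format for cloning to kick in):

    # glance-api.conf
    [DEFAULT]
    show_image_direct_url = True

    [glance_store]
    default_store = rbd
    rbd_store_pool = images

    # nova.conf
    [libvirt]
    images_type = rbd
    images_rbd_pool = vms
    rbd_user = cinder
    rbd_secret_uuid = <uuid>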