[ceph-users] IRQ balancing, distribution

2014-09-22 Thread Christian Balzer
Hello, not really specific to Ceph, but since one of the default questions by the Ceph team when people are facing performance problems seems to be "Have you tried turning it off and on again?" ^o^ err, "Are all your interrupts on one CPU?", I'm going to wax on about this for a bit and hope for
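To see whether that is the case on a node, a minimal sketch (the IRQ number is a placeholder; stop irqbalance first or it may rewrite the mask):

    # Show how hardware interrupts are spread across the CPU columns;
    # one hot column means everything is landing on a single core.
    cat /proc/interrupts

    # Pin IRQ 57 (placeholder) to CPU0 via a hex CPU bitmask.
    sudo service irqbalance stop
    echo 1 | sudo tee /proc/irq/57/smp_affinity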

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
hi christian, we once were debugging some performance issues, and IRQ balancing was one of the issues we looked into, but no real benefit there for us. all interrupts on one cpu is only an issue if the hardware itself is not the bottleneck. we were running some default SAS HBA (Dell H200), and

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Christian Balzer
Hello, On Mon, 22 Sep 2014 09:35:10 +0200 Stijn De Weirdt wrote: hi christian, we once were debugging some performance issues, and IRQ balancing was one of the issues we looked into, but no real benefit there for us. all interrupts on one cpu is only an issue if the hardware itself is not

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Florian Haas
On Mon, Sep 22, 2014 at 10:21 AM, Christian Balzer ch...@gol.com wrote: The Linux scheduler is usually quite decent at keeping processes where the action is, thus you see, for example, a clear preference of DRBD or KVM vnet processes to be near or on the CPU(s) where the IRQs are. Since you're

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
but another issue is the OSD processes: do you pin those as well? and how much data do they actually handle? to checksum, the OSD process needs all the data, so that can also cause a lot of NUMA traffic, especially if they are not pinned. That's why all my (production) storage nodes have only a single 6
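For the pinning itself, one possible sketch (the PID, core range, and OSD id are all placeholders):

    # Restrict a running OSD to cores 0-5, i.e. the socket that owns
    # its HBA/NIC interrupts; -a moves all of the process's threads.
    sudo taskset -acp 0-5 12345

    # Or launch the daemon bound to NUMA node 0 for both CPU and memory:
    sudo numactl --cpunodebind=0 --membind=0 /usr/bin/ceph-osd -i 3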

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Anand Bhat
Page reclamation in Linux is NUMA aware, so page reclamation is not an issue. You can see performance improvements only if all the components of a given IO complete on a single core. This is hard to achieve in Ceph, as a single IO goes through multiple thread switches and the threads are not

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-22 Thread Udo Lembke
Hi Christian, On 22.09.2014 05:36, Christian Balzer wrote: Hello, On Sun, 21 Sep 2014 21:00:48 +0200 Udo Lembke wrote: Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster than XFS in nearly all use cases and the lack of full, real kernel

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
hi, Page reclamation in Linux is NUMA aware. So page reclamation is not an issue. except for the first min_free_kbytes? those can come from anywhere, no? or is the reclamation such that it tries to free an equal portion for each NUMA domain? if the OSD allocates memory in chunks smaller than
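To look at this on a live box, a small sketch (the paths are the standard sysctl/sysfs locations):

    # The reserve the kernel tries to keep free (system-wide, not per node):
    sysctl vm.min_free_kbytes

    # Per-NUMA-node allocation hits/misses and free memory:
    numastat
    cat /sys/devices/system/node/node0/meminfo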

[ceph-users] Pgs are in stale+down+peering state

2014-09-22 Thread Sahana Lokeshappa
Hi all, I used the 'ceph osd thrash' command, and after all OSDs are up and in, 3 PGs are in the stale+down+peering state: sudo ceph -s cluster 99ffc4a5-2811-4547-bd65-34c7d4c58758 health HEALTH_WARN 3 pgs down; 3 pgs peering; 3 pgs stale; 3 pgs stuck inactive; 3 pgs stuck stale; 3
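A few stock commands for narrowing down which PGs are stuck and why (the PG id below is a placeholder):

    # List the PGs stuck in each state:
    sudo ceph pg dump_stuck stale
    sudo ceph pg dump_stuck inactive

    # Interrogate one of the affected PGs in detail:
    sudo ceph pg 2.1f query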

Re: [ceph-users] Timeout on ceph-disk activate

2014-09-22 Thread Alfredo Deza
I would run that one command (sudo ceph-disk -v activate --mark-init sysvinit --mount /data/osd) on the hp10 box and see what is going on when you do so. On Thu, Sep 18, 2014 at 12:09 PM, BG bglac...@nyx.com wrote: I've hit a timeout issue on calls to ceph-disk activate. Initially, I

Re: [ceph-users] ceph health related message

2014-09-22 Thread Sean Sullivan
I had this happen to me as well. It turned out to be a connlimit thing for me. I would check dmesg/the kernel log and see if you see any "conntrack limit reached, connection dropped" messages, then increase connlimit. Odd, as I connected over SSH for this, but I can't deny syslog.
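A sketch of what that check and bump might look like (the new limit is only an example value):

    # Look for dropped connections from the connection tracker:
    dmesg | grep -i conntrack

    # Check and raise the conntrack table limit:
    sysctl net.netfilter.nf_conntrack_max
    sudo sysctl -w net.netfilter.nf_conntrack_max=262144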

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Mark Nelson
On 09/22/2014 01:55 AM, Christian Balzer wrote: Hello, not really specific to Ceph, but since one of the default questions by the Ceph team when people are facing performance problems seems to be "Have you tried turning it off and on again?" ^o^ err, "Are all your interrupts on one CPU?" I'm going

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-22 Thread Christian Balzer
On Mon, 22 Sep 2014 13:35:26 +0200 Udo Lembke wrote: Hi Christian, On 22.09.2014 05:36, Christian Balzer wrote: Hello, On Sun, 21 Sep 2014 21:00:48 +0200 Udo Lembke wrote: Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster

[ceph-users] Adding another radosgw node

2014-09-22 Thread Jon Kåre Hellan
Hi, We've got a three-node Ceph cluster, and radosgw on a fourth machine. We would like to add another radosgw machine for high availability. Here are a few questions I have: - We aren't expecting to deploy to multiple regions and zones anywhere soon. So presumably, we do not have to worry
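For the HA side, a second gateway is essentially another client section plus its own keyring; a minimal ceph.conf sketch under the classic FastCGI setup of that era (all names and paths are placeholders):

    [client.radosgw.gw2]
    host = rgw2
    keyring = /etc/ceph/ceph.client.radosgw.gw2.keyring
    rgw socket path = /var/run/ceph/ceph.radosgw.gw2.fastcgi.sock
    log file = /var/log/ceph/client.radosgw.gw2.log

A load balancer such as haproxy would then sit in front of the two instances.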

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-22 Thread Sage Weil
Stale means that the primary OSD for the PG went down and the status is stale. They all seem to be from OSD.12... Seems like something is preventing that OSD from reporting to the mon? sage On September 22, 2014 7:51:48 AM EDT, Sahana Lokeshappa sahana.lokesha...@sandisk.com wrote: Hi all,

Re: [ceph-users] [Ceph-community] Pgs are in stale+down+peering state

2014-09-22 Thread Varada Kari
Hi Sage, To give more context on this problem: this cluster has two pools, rbd and one user-created. Osd.12 is a primary for some other PGs, but the problem happens for these three PGs. $ sudo ceph osd lspools 0 rbd,2 pool1, $ sudo ceph -s cluster 99ffc4a5-2811-4547-bd65-34c7d4c58758
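To confirm the mapping and whether osd.12 is actually reporting, something like this (the PG id is a placeholder):

    # Show the up/acting OSD sets for one of the affected PGs:
    sudo ceph pg map 2.7d

    # Check where osd.12 lives and whether the cluster can find it:
    sudo ceph osd find 12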

[ceph-users] XenServer and Ceph - any updates?

2014-09-22 Thread Andrei Mikhailovsky
Hello guys, I was wondering if there have been any updates on getting XenServer ready for Ceph? I've seen a howto that was written well over a year ago (I think) for a PoC integration of XenServer and Ceph. However, I've not seen any developments lately. It would be cool to see other

[ceph-users] Reassigning admin server

2014-09-22 Thread LaBarre, James (CTR) A6IT
If I have a machine/VM I am using as an Admin node for a Ceph cluster, can I relocate that admin role to another machine/VM after I've built the cluster? I would expect that, as the Admin isn't an actual operating part of the cluster itself (other than Calamari, if it happens to be running), the rest of the
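In practice, the admin role mostly travels with the cluster config and admin keyring, so a move can be as simple as this sketch (hostnames and default paths are examples):

    # On the new admin machine, after installing the Ceph CLI packages,
    # copy the config and keyring from any existing cluster node:
    scp mon1:/etc/ceph/ceph.conf /etc/ceph/
    scp mon1:/etc/ceph/ceph.client.admin.keyring /etc/ceph/
    ceph -s   # the new node can now administer the cluster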

Re: [ceph-users] Bcache / Enhanceio with osds

2014-09-22 Thread Robert LeBlanc
We are still in the middle of testing things, but so far we have had more improvement with SSD journals than with the OSDs cached by bcache (five OSDs fronted by one SSD). We have yet to test whether adding a bcache layer in addition to the SSD journals provides any additional improvement. Robert

Re: [ceph-users] Bcache / Enhanceio with osds

2014-09-22 Thread Mark Nelson
Likely it won't, since the OSD is already coalescing journal writes. FWIW, I ran through a bunch of tests using seekwatcher and blktrace at 4k, 128k, and 4m IO sizes on a 4-OSD cluster (3x replication) to get a feel for what the IO patterns are like for the dm-cache developers. I included both
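For anyone wanting to repeat that kind of capture, a rough sketch (the device name and trace prefix are placeholders):

    # Trace a data or journal device for 60 seconds during a benchmark:
    sudo blktrace -d /dev/sdb -o osd-trace -w 60

    # Turn the trace into a seek/throughput visualization:
    seekwatcher -t osd-trace -o osd-trace.png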

[ceph-users] Ceph Day Speaking Slots

2014-09-22 Thread Patrick McGarry
Hey cephers, As we finalize the next couple of schedules for Ceph Days in NYC and London, it looks like there are still a couple of speaking slots open. If you are available in NYC on 08 OCT or in London on 22 OCT and would be interested in speaking about your Ceph experiences (of any kind), please

Re: [ceph-users] Bcache / Enhanceio with osds

2014-09-22 Thread Andrei Mikhailovsky
I've done a bit of testing with Enhanceio on my cluster and I can see a definite improvement in read performance for cached data. The performance increase is around 3-4 times the cluster speed prior to using Enhanceio, based on large-block-size IO (1M and 4M). I've done a concurrent test of

Re: [ceph-users] OSDs are crashing with Cannot fork or cannot create thread but plenty of memory is left

2014-09-22 Thread Nathan O'Sullivan
Hi Christian, Your problem is probably that your kernel.pid_max (the maximum number of threads+processes across the entire system) needs to be increased - the default is 32768, which is too low for even a medium-density deployment. You can test this easily enough with $ ps axms | wc -l If you get a
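Spelled out, the check and the bump might look like this (the new value is only an example):

    # Count threads+processes currently alive:
    ps axms | wc -l

    # Check the system-wide limit, then raise it:
    sysctl kernel.pid_max
    sudo sysctl -w kernel.pid_max=4194303
    # persist via /etc/sysctl.conf:  kernel.pid_max = 4194303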

[ceph-users] get amount of space used by snapshots

2014-09-22 Thread Steve Anthony
Hello, If I have an rbd image and a series of snapshots of that image, is there a fast way to determine how much space the objects composing the original image and all the snapshots are using in the cluster, or even just the space used by the snaps? The only way I've been able to find so far is
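One rough approach, counting the backing objects by their name prefix (pool, image, and prefix are placeholders; 'rbd du', on releases newer than this thread, reports per-snapshot usage directly):

    # Find the image's object prefix, then count its objects:
    rbd info rbd/myimage | grep block_name_prefix
    rados -p rbd ls | grep rbd_data.1234 | wc -l

    # On newer releases:
    rbd du rbd/myimage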

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Christian Balzer
Hello, On Mon, 22 Sep 2014 08:55:48 -0500 Mark Nelson wrote: On 09/22/2014 01:55 AM, Christian Balzer wrote: Hello, not really specific to Ceph, but since one of the default questions by the Ceph team when people are facing performance problems seems to be "Have you tried turning it