[ceph-users] mon memory usage (again)

2014-04-12 Thread Christian Balzer
Hello, 3 node cluster (2 storage with 2 OSDs one dedicated mon), 3 mons total. Debian Jessie, thus 3.13 kernel and Ceph 0.72.2. 2 of the mons (including the leader) are using around 100MB RSS and one was using about 1.1GB. I did my homework and scoured the ML archives and found at least 2

Re: [ceph-users] mon memory usage (again)

2014-04-12 Thread Gregory Farnum
On Fri, Apr 11, 2014 at 11:12 PM, Christian Balzer ch...@gol.com wrote: Hello, 3 node cluster (2 storage with 2 OSDs one dedicated mon), 3 mons total. Debian Jessie, thus 3.13 kernel and Ceph 0.72.2. 2 of the mons (including the leader) are using around 100MB RSS and one was using about

Re: [ceph-users] mon memory usage (again)

2014-04-12 Thread Christian Balzer
On Fri, 11 Apr 2014 23:33:42 -0700 Gregory Farnum wrote: On Fri, Apr 11, 2014 at 11:12 PM, Christian Balzer ch...@gol.com wrote: [snip] Questions remaining: a) Is that non-deterministic ceph heap behavior expected and if yes can it be fixed? You can specify the monitor you want

[ceph-users] Useful visualizations / metrics

2014-04-12 Thread Greg Poirier
I'm in the process of building a dashboard for our Ceph nodes. I was wondering if anyone out there had instrumented their OSD / MON clusters and found particularly useful visualizations. At first, I was trying to do ridiculous things (like graphing % used for every disk in every OSD host), but I

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Jason Villalta
Hi, I have not done anything with metrics yet, but the only ones I personally would be interested in are total capacity utilization and cluster latency. Just my 2 cents. On Sat, Apr 12, 2014 at 10:02 AM, Greg Poirier greg.poir...@opower.com wrote: I'm in the process of building a dashboard for

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Greg Poirier
Curious as to how you define cluster latency. On Sat, Apr 12, 2014 at 7:21 AM, Jason Villalta ja...@rubixnet.com wrote: Hi, I have not done anything with metrics yet, but the only ones I personally would be interested in are total capacity utilization and cluster latency. Just my 2 cents.

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Jason Villalta
I know ceph throws some warnings if there is high write latency. But I would be most interested in the delay for I/O requests, linking directly to IOPS. If IOPS start to drop because the disks are overwhelmed, then latency for requests would be increasing. This would tell me that I need to add more

[ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?

2014-04-12 Thread Alexandre DERUMIER
Hello, I know that qemu live migration of disks with cache=writeback is not safe with storage like NFS, iSCSI... Is that also true with rbd? If yes, is it possible to manually disable writeback online with QMP? Best Regards, Alexandre

Re: [ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?

2014-04-12 Thread Alex Crow
Hi. I've read in many places that you should never use writeback on any kind of shared storage. Caching is better dealt with on the storage side anyway as you have hopefully provided resilience there. In fact if your SAN/NAS is good enough it's supposed to be best to use none as the caching

Re: [ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?

2014-04-12 Thread Christian Balzer
Hello, On Sat, 12 Apr 2014 16:26:40 +0100 Alex Crow wrote: Hi. I've read in many places that you should never use writeback on any kind of shared storage. Caching is better dealt with on the storage side anyway as you have hopefully provided resilience there. In fact if your SAN/NAS

Re: [ceph-users] OSD: GPT Partition for journal on different partition ?

2014-04-12 Thread Sage Weil
Hi Florent, GPT partitions are required if the udev-based magic is going to work. If you opt out of that strategy, you need to mount your file systems using fstab or similar and start the daemons manually. sage On April 12, 2014 6:38:13 AM PDT, Florent B flor...@coppint.com wrote: Hi all, I

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Mark Nelson
One thing I do right now for ceph performance testing is run a copy of collectl during every test. This gives you a TON of information about CPU usage, network stats, disk stats, etc. It's pretty easy to import the output data into gnuplot. Mark Seger (the creator of collectl) also has some

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Greg Poirier
We are collecting system metrics through sysstat every minute and getting those to OpenTSDB via Sensu. We have a plethora of metrics, but I am finding it difficult to create meaningful visualizations. We have alerting for things like individual OSDs reaching capacity thresholds, memory spikes on

[ceph-users] DCBX Ceph...

2014-04-12 Thread N. Richard Solis
Anyone using DCB-X features on their cluster network in conjunction with Ceph? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] pg incomplete, won't create

2014-04-12 Thread Craig Lewis
I reformatted 2 OSDs in a cluster with 2 replicas. I tried to get as much data off them as possible beforehand, using ceph osd out, but I couldn't get it all. I know I've lost data. I have 1 incomplete PG, which is better than I expected. Following previous advice, I ran ceph pg

Re: [ceph-users] Useful visualizations / metrics

2014-04-12 Thread Craig Lewis
I've been graphing disk latency, osd latency, and RGW latency. It's a bit tricky to pull out of ceph --admin-daemon ceph-osd.0.asok perf dump though. perf dump gives you the total ops and total op time. You have to track the delta of those two values, then
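A minimal sketch of the delta approach described in this message, assuming each latency counter in the perf dump output is an object with a cumulative op count and a cumulative op time. The key names `avgcount` and `sum`, and the function name `mean_latency`, are assumptions for illustration and are not taken from the message; adjust them to whatever your perf dump actually reports.

```python
from typing import Optional


def mean_latency(prev: dict, curr: dict) -> Optional[float]:
    """Average time per op between two samples of one cumulative counter.

    Assumes each sample looks like {"avgcount": <total ops>, "sum": <total op time>}.
    Returns None when no ops completed in the interval, or when the counter
    went backwards (for example after a daemon restart).
    """
    delta_ops = curr["avgcount"] - prev["avgcount"]
    delta_time = curr["sum"] - prev["sum"]
    if delta_ops <= 0 or delta_time < 0:
        return None
    return delta_time / delta_ops


if __name__ == "__main__":
    # Two hypothetical samples taken one polling interval apart.
    earlier = {"avgcount": 100, "sum": 2.0}
    later = {"avgcount": 150, "sum": 3.0}
    print(mean_latency(earlier, later))
```

Graphing the per-interval value rather than the lifetime totals keeps old history from flattening recent latency spikes.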

Re: [ceph-users] pg incomplete, won't create

2014-04-12 Thread Craig Lewis
From another discussion, I learned about ceph osd lost. I'm draining osd 1 and 3 (ceph osd out). Once they're empty, I'll mark them lost and see if that helps.

Re: [ceph-users] pg incomplete, won't create

2014-04-12 Thread Craig Lewis
While I'm waiting for these OSDs to drain, is there any way to prioritize certain PGs to recover/backfill first? In this case, I'd prefer to prioritize the PGs that are on the two OSDs that I'm draining. There have been other times I've wanted to manually boost a recovery though. Most

Re: [ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?

2014-04-12 Thread Alexandre DERUMIER
And I read: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg06890.html Now don't get me wrong, erring on the side of safety is quite sensible, but the impact of having no caching by qemu is easily on the order of 2 magnitudes. Thanks for the link reference! - Original message