Re: Crash and strange things on MDS

2013-02-13 Thread Kevin Decherf
On Mon, Feb 11, 2013 at 12:25:59PM -0800, Gregory Farnum wrote: On Mon, Feb 11, 2013 at 10:54 AM, Kevin Decherf ke...@kdecherf.com wrote: Furthermore, I observe another strange thing more or less related to the storms. During a rsync command to write ~20G of data on Ceph and during (and

Re: OSD Weights

2013-02-13 Thread sheng qiu
Hi Gregory, once Ceph is running, will it change the weights dynamically (if they are not set properly), or can they only be changed by the user through the command line, or can they not be changed online at all? Thanks, Sheng On Mon, Feb 11, 2013 at 3:31 PM, Gregory Farnum g...@inktank.com wrote: On Mon, Feb 11,
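For context on the question above: Ceph does not adjust CRUSH weights on its own; an OSD's weight is set when the OSD is created and only changed afterwards by an administrator (e.g. with `ceph osd crush reweight`). By convention the weight is roughly the drive's capacity in TiB. A minimal sketch of that sizing convention, assuming it applies to your deployment (the helper name is ours, not a Ceph API):

```python
def crush_weight_from_bytes(capacity_bytes):
    # Conventional CRUSH weight: the drive's capacity expressed in TiB.
    # Illustrative only -- not a Ceph API; the weight is ultimately
    # whatever the administrator sets via `ceph osd crush reweight`.
    TIB = 1024 ** 4
    return round(capacity_bytes / TIB, 3)

print(crush_weight_from_bytes(2 * 10**12))   # a 2 TB drive -> ~1.819
```

Weights only influence how CRUSH distributes data; changing one triggers rebalancing, so it is an explicit administrative action rather than something Ceph does behind your back.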

Fwd: optimizing ceph-fuse performance

2013-02-13 Thread femi anjorin
I set up a Ceph cluster on 28 nodes: 24 nodes for OSDs. Each storage node has 16 drives, with RAID0 across 4 drives, so I have 4 OSD daemons on each node, each allocated one RAID volume, for a total of 96 OSD daemons in the entire cluster. 3 nodes for mon ... 1 node for mds ... I mounted a
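The arithmetic behind that layout is a quick check (all counts taken from the message above):

```python
# Layout from the message: 28 nodes total.
storage_nodes = 24           # nodes running OSDs
mon_nodes, mds_nodes = 3, 1  # the remaining nodes
drives_per_node = 16
drives_per_raid0 = 4         # one 4-drive RAID0 volume backs one OSD daemon

osds_per_node = drives_per_node // drives_per_raid0   # 4
total_osds = storage_nodes * osds_per_node            # 96
print(osds_per_node, total_osds)
```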

Re: optimizing ceph-fuse performance

2013-02-13 Thread Sam Lang
On Wed, Feb 13, 2013 at 6:31 AM, femi anjorin femi.anjo...@gmail.com wrote: Hi, please can somebody help interpret what this configuration signifies. [global] debug ms = 0 [mon] debug mon = 20 debug paxos = 20 debug auth = 20 [osd] debug osd = 20
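Reassembled with line breaks, the quoted snippet is this ceph.conf fragment. Each `debug <subsystem> = N` option sets per-subsystem log verbosity, where 0 disables debug logging and 20 is the most verbose level (useful for troubleshooting, very noisy in production):

```ini
[global]
    debug ms = 0      # messenger (network) layer: debug logging off
[mon]
    debug mon = 20    # monitor: maximum verbosity
    debug paxos = 20  # paxos consensus traffic
    debug auth = 20   # authentication
[osd]
    debug osd = 20    # OSD daemon: maximum verbosity
```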

Re: rbd export speed limit

2013-02-13 Thread Sage Weil
On Wed, 13 Feb 2013, Stefan Priebe - Profihost AG wrote: Hi, Am 12.02.2013 21:45, schrieb Andrey Korolyov: you may be interested in throttle(1) as a side solution with stdout export option. What's throttle? Never seen this. Wouldn't it be possible to use tc? By the way, on which

OSD dies after seconds

2013-02-13 Thread Jesus Cuenca
Hi, I'm setting up a small ceph 0.56.2 cluster on 3 64-bit Debian 6 servers with kernel 3.7.2. My problem is that the OSDs die. First I try to start them with the init script: /etc/init.d/ceph start osd.0 ... starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal

Re: OSD dies after seconds

2013-02-13 Thread Sage Weil
On Wed, 13 Feb 2013, Jesus Cuenca wrote: Hi, I'm setting up a small ceph 0.56.2 cluster on 3 64-bit Debian 6 servers with kernel 3.7.2. This might be http://tracker.ceph.com/issues/3595, which is a problem with google perftools (which we use by default) and the version in squeeze,

Re: Write Replication on Degraded PGs

2013-02-13 Thread Gregory Farnum
On Wed, Feb 13, 2013 at 3:40 AM, Ben Rowland ben.rowl...@gmail.com wrote: Hi, Apologies that this is a fairly long post, but hopefully all my questions are similar (or even invalid!) Does Ceph allow writes to proceed if it's not possible to satisfy the rules for replica placement across

Re: Crash and strange things on MDS

2013-02-13 Thread Gregory Farnum
On Wed, Feb 13, 2013 at 3:47 AM, Kevin Decherf ke...@kdecherf.com wrote: On Mon, Feb 11, 2013 at 12:25:59PM -0800, Gregory Farnum wrote: On Mon, Feb 11, 2013 at 10:54 AM, Kevin Decherf ke...@kdecherf.com wrote: Furthermore, I observe another strange thing more or less related to the storms.

Re: Links to various language bindings

2013-02-13 Thread Wido den Hollander
On 02/12/2013 11:53 PM, John Wilkins wrote: Also, be sure to open bugs and assign them to me. Yes! Although I'll probably write some docs as well (if I get them to build...). I'll send patches or pull requests. So we have the following bindings/extensions/whatever: librados: * Python *

Re: rbd export speed limit

2013-02-13 Thread Stefan Priebe
Hi, first, sorry, I got this totally wrong. The speed is not correct; I mixed it up with another operation going on at the same time. The problem in my case isn't the rbd export, it's the fstrim issued on all VMs. This results in writes of up to 400Mb/s per OSD and then results in aborted /

Questions on some minor issues when upgrading from 0.48 to 0.56

2013-02-13 Thread Daniel Hoang
Hi All, Just in case these issues have not been reported yet: I am on ubuntu 12.04, upgraded librados2/librados-dev from 0.48 to 0.56, and I notice the following issues: 1. librados2 / librados-dev still reports the minor version as 48 Should the minor version change to 56? 2. In 0.48,

OSD failure on start

2013-02-13 Thread Mandell Degerness
I'm getting this error on one of my OSDs when I try to start it. I can gather more complete log data if no one recognizes the error from this: Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.612847 7f4f607e7780 0 filestore(/mnt/osd96) mount found snaps Feb 13 19:30:04

Re: rbd export speed limit

2013-02-13 Thread Sage Weil
On Wed, 13 Feb 2013, Stefan Priebe wrote: Hi, first sorry i got this totally wrong. The speed is not correct and i mixed this with another operation going on at the same time. The problem isn't the rbd export in my case it the fstrim issued on all VMs. This results in writes up to

Re: rbd export speed limit

2013-02-13 Thread Stefan Priebe
Hi, Am 13.02.2013 21:21, schrieb Sage Weil: This results in writes up to 400Mb/s per OSD and then results in aborted / hanging task in VMs. Is it possible to give trim commands lower priority? Is that 400Mb or MB? Measured over the network, or on the disk itself? Sorry it's MB - so the

Re: rbd export speed limit

2013-02-13 Thread Gregory Farnum
On Wed, Feb 13, 2013 at 12:27 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi, Am 13.02.2013 21:21, schrieb Sage Weil: This results in writes up to 400Mb/s per OSD and then results in aborted / hanging task in VMs. Is it possible to give trim commands lower priority? Is that 400Mb or

Re: rbd export speed limit

2013-02-13 Thread Stefan Priebe
Hi Greg, Am 13.02.2013 21:38, schrieb Gregory Farnum: Sorry it's MB - so the SSDs get fully utilized, measured via /proc/diskstats. I'm wondering if this is a lack of punch support in the kernel.. I'm using 3.7.7 running XFS. Sounds like maybe the client trim is ending up issuing a truly

Re: OSD failure on start

2013-02-13 Thread Mike Dawson
Mandell, A few of us saw a similar failure on 0.56.1. http://tracker.ceph.com/issues/3770 Sam Just patched the issue for 0.56.2. My understanding is Sam's patch prevents the issue in the future, but doesn't repair a previously damaged OSD. If you have good replication (or a good backup), I

v0.56.3 released

2013-02-13 Thread Sage Weil
We've fixed an important bug that a few users were hitting with unresponsive OSDs and internal heartbeat timeouts. This, along with a range of less critical fixes, was sufficient to justify another point release. Any production users should upgrade. Notable changes include: * osd: flush

Re: OSD failure on start

2013-02-13 Thread Samuel Just
Actually, that bug did not exist in 48.1; it must have been something different. Was that the node you had the trouble with the pg logs on? -Sam On Wed, Feb 13, 2013 at 2:47 PM, Mandell Degerness mand...@pistoncloud.com wrote: Thanks. I'm glad to hear it is fixed in the new version. Wiping the OSD

The Ceph Census

2013-02-13 Thread Ross David Turk
Hi! It's been a while since my last poll about Ceph deployments and use cases. Since there are so many more of us now, I think it's a good time to do it again. This time, I've set up a survey. I am particularly interested in how many deployments of Ceph there are, how much underlying

Re: Questions on some minor issues when upgrading from 0.48 to 0.56

2013-02-13 Thread Wido den Hollander
Hi, On 02/13/2013 08:26 PM, Daniel Hoang wrote: Hi All, Just in case these issues have not been reported yet, I am on ubuntu 12.04, upgrade librados2/librados-dev from 0.48 to 0.56, and I notice the following issues: 1. librados2 / librados-dev still reports minor version as 48 Should