Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-14 Thread Jason Dillaman
I would need to see the log from the point where you've frozen the disks until the point when you attempt to create a snapshot. The logs below just show normal IO. I've opened a new ticket [1] where you can attach the logs. [1] http://tracker.ceph.com/issues/14373 -- Jason Dillaman
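For reference, a minimal sketch of the client-side logging one might enable to capture such a trace before freezing the disks (the log path and verbosity are assumptions, not something stated in this thread):

    [client]
        debug rbd = 20
        log file = /var/log/ceph/client.$pid.log

The guest/client process has to be restarted (or the settings injected) for librbd to start writing the log.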

Re: [ceph-users] ceph-fuse on Jessie not mounted at boot

2016-01-14 Thread Gregory Farnum
Try using "id=client.my_user". It's not taking daemonize arguments because auto-mount in fstab requires the use of CLI arguments (of which daemonize isn't a member), IIRC. -Greg On Wednesday, January 6, 2016, Florent B wrote: > Hi everyone, > > I have a problem with
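For context, a sketch of the fstab form being discussed (mount point and user are hypothetical; whether the id needs the "client." prefix appears to depend on the ceph-fuse version):

    id=my_user  /mnt/cephfs  fuse.ceph  defaults,_netdev  0 0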

Re: [ceph-users] where is the fsid field coming from in ceph -s ?

2016-01-14 Thread Gregory Farnum
It sounds like you *didn't* change the fsid for the existing osd/mon daemons since you say they're getting refused. So I think you created a new "cluster" of just the one monitor, and your client is choosing to connect to it first. If that's the case, killing that monitor and creating it properly
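A quick way to see which cluster is answering versus which one was configured (a sketch; the conf path is just the default):

    ceph fsid                          # fsid reported by whichever monitor the client reaches
    grep fsid /etc/ceph/ceph.conf      # fsid the local daemons/clients were configured with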

Re: [ceph-users] where is the client

2016-01-14 Thread Gregory Farnum
There's not a great unified tracking solution, but newer MDS code has admin socket commands to dump client sessions. Look for those. This question is good for the user list, but if you can't send mail to the dev list you're probably using HTML email or something. vger.kernel.org has some pretty
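The admin socket command being referred to looks roughly like this (a sketch; the MDS name is hypothetical and the exact command set depends on the MDS version):

    ceph daemon mds.<name> session ls     # dump connected client sessions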

Re: [ceph-users] CEPH Replication

2016-01-14 Thread Gregory Farnum
We went to 3 copies because 2 isn't safe enough for the default. With 3 copies and a properly configured system your data is approximately as safe as the data center it's in. With 2 copies the durability is a lot lower than that (two 9s versus four 9s or something). The actual safety numbers did
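As a rough illustration only (a toy model, not the actual numbers Greg refers to): if p is the probability that an OSD holding a given placement group fails within the re-replication window and failures are independent, losing that PG requires all n replicas to fail together, so P(loss) ~ p^n. With p ~ 10^-2, n=2 gives ~10^-4 while n=3 gives ~10^-6, i.e. each extra copy buys roughly two more nines; correlated failures (same host, rack, or power feed) make the real figures worse.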

[ceph-users] Observations after upgrading to latest Firefly (0.80.11)

2016-01-14 Thread Kostis Fardelas
Hello cephers, after being on 0.80.10 for a while, we upgraded to 0.80.11 and we noticed the following things: a. ~13% paxos refresh latency increase (from about 0.015 to 0.017 on average) b. ~15% paxos commit latency increase (from 0.019 to 0.022 on average) c. osd commitcycle latencies were decreased
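For anyone wanting to reproduce the comparison, these figures come from the monitor perf counters; a sketch (the monitor name is hypothetical, and counter names may differ slightly by release):

    ceph daemon mon.a perf dump | grep -A 3 '"refresh_latency"'
    ceph daemon mon.a perf dump | grep -A 3 '"commit_latency"'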

Re: [ceph-users] lost OSD due to failing disk

2016-01-14 Thread Mihai Gheorghe
2016-01-14 11:25 GMT+02:00 Magnus Hagdorn : > On 13/01/16 13:32, Andy Allan wrote: > >> On 13 January 2016 at 12:26, Magnus Hagdorn >> wrote: >> >>> Hi there, >>> we recently had a problem with two OSDs failing because of I/O errors of >>> the

Re: [ceph-users] lost OSD due to failing disk

2016-01-14 Thread Magnus Hagdorn
On 13/01/16 13:32, Andy Allan wrote: On 13 January 2016 at 12:26, Magnus Hagdorn wrote: Hi there, we recently had a problem with two OSDs failing because of I/O errors of the underlying disks. We run a small ceph cluster with 3 nodes and 18 OSDs in total. All 3 nodes

[ceph-users] v10.0.2 released

2016-01-14 Thread Sage Weil
This development release includes a raft of changes and improvements for Jewel. Key additions include CephFS scrub/repair improvements, an AIX and Solaris port of librados, many librbd journaling additions and fixes, extended per-pool options, and an NBD driver for RBD (rbd-nbd) that allows

Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-14 Thread Василий Ангапов
Thank you very much, Jason! I've updated the ticket with new data, but I'm not sure if I attached logs correctly. Please let me know if anything more is needed. 2016-01-14 23:29 GMT+08:00 Jason Dillaman : > I would need to see the log from the point where you've frozen the

[ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-14 Thread seapasu...@uchicago.edu
I am not sure why this is happening someone used s3cmd to upload around 130,000 7mb objects to a single bucket. Now we are tearing down the cluster to rebuild it better, stronger, and hopefully faster. Before we destroy it we need to download all of the data. I am running through all of the
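A sketch of one way to pull everything back down with the same tool that uploaded it (bucket name and local path are hypothetical):

    s3cmd ls s3://mybucket | wc -l              # sanity-check the key count
    s3cmd sync s3://mybucket/ ./mybucket-dump/  # download all objects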

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Mike Carlson
Hey Zheng, I've been in the #ceph irc channel all day about this. We did that, we set max_mds back to 1, but, instead of stopping mds 1, we did a "ceph mds rmfailed 1". Running ceph mds stop 1 produces: # ceph mds stop 1 Error EEXIST: mds.1 not active (???) Our mds is in a state of resolve, and

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Yan, Zheng
> On Jan 15, 2016, at 08:01, Gregory Farnum wrote: > > On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote: >> Hey Zheng, >> >> I've been in the #ceph irc channel all day about this. >> >> We did that, we set max_mds back to 1, but, instead of stopping

Re: [ceph-users] Infernalis upgrade breaks when journal on separate partition

2016-01-14 Thread Stuart Longland
On 12/01/16 01:22, Stillwell, Bryan wrote: >> Well, it seems I spoke too soon. Not sure what logic the udev rules use >> to identify ceph journals, but it doesn't seem to pick up on the >> journals in our case as after a reboot, those partitions are owned by >> root:disk with permissions 0660.

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Yan, Zheng
On Fri, Jan 15, 2016 at 3:28 AM, Mike Carlson wrote: > Thank you for the reply Zheng > > We tried setting mds bal frag to true, but the end result was less than > desirable. All nfs and smb clients could no longer browse the share, they > would hang on a directory with anything
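For reference, the setting being discussed is a plain ceph.conf option along these lines (a sketch; typically set on the MDS nodes):

    [mds]
        mds bal frag = true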

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Gregory Farnum
On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote: > Hey Zheng, > > I've been in the #ceph irc channel all day about this. > > We did that, we set max_mds back to 1, but, instead of stopping mds 1, we > did a "ceph mds rmfailed 1". Running ceph mds stop 1 produces: > > # ceph

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Mike Carlson
okay, that sounds really good. Would it help if you had access to our cluster? On Thu, Jan 14, 2016 at 4:19 PM, Yan, Zheng wrote: > > > On Jan 15, 2016, at 08:16, Mike Carlson wrote: > > > > Did I just lose all of my data? > > > > If we were able to export

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Mike Carlson
Do I apply this against the v9.2.0 git tag? On Thu, Jan 14, 2016 at 4:48 PM, Dyweni - Ceph-Users < 6exbab4fy...@dyweni.com> wrote: > Your patch lists the command as "addfailed" but the email lists the > command as "add failed". (Note the space). > > > > > > On 2016-01-14 18:46, Yan, Zheng

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Yan, Zheng
> On Jan 15, 2016, at 08:16, Mike Carlson wrote: > > Did I just lose all of my data? > > If we were able to export the journal, could we create a brand new mds out of > that and retrieve our data? No. it's easy to fix, but you need to re-compile ceph-mon from source

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread John Spray
On Fri, Jan 15, 2016 at 12:23 AM, Sage Weil wrote: > On Fri, 15 Jan 2016, Yan, Zheng wrote: >> > On Jan 15, 2016, at 08:16, Mike Carlson wrote: >> > >> > Did I just lose all of my data? >> > >> > If we were able to export the journal, could we create a

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Dyweni - Ceph-Users
Your patch lists the command as "addfailed" but the email lists the command as "add failed". (Note the space). On 2016-01-14 18:46, Yan, Zheng wrote: Here is a patch for v9.2.0. After installing the modified version of ceph-mon, run “ceph mds add failed 1” On Jan 15, 2016, at 08:20,

Re: [ceph-users] osd process threads stack up on osds failure

2016-01-14 Thread Samuel Just
Probably worth filing a bug. Make sure to include the usual stuff: 1) version 2) logs from a crashing osd For this one, it would also be handy if you used gdb to dump the thread backtraces for an osd which is experiencing "an increase of approximately 230-260 threads for every other OSD node"
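A sketch of how one might capture those backtraces (assumes gdb and the ceph debug symbols are installed; pick the pid of one affected ceph-osd):

    gdb -p <osd-pid> -batch -ex 'thread apply all bt' > osd.<id>.threads.txt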

Re: [ceph-users] v10.0.2 released

2016-01-14 Thread Bill Sanders
Is there some information about rbd-nbd somewhere? If it has feature parity with librbd and is easier to maintain, will this eventually deprecate krbd? We're using the RBD kernel client right now, and so this looks like something we might want to explore at my employer. Bill On Thu, Jan 14,

Re: [ceph-users] RGW -- 404 on keys in bucket.list() thousands of multipart ids listed as well.

2016-01-14 Thread seapasu...@uchicago.edu
It looks like the gateway is experiencing a similar race condition to what we reported before. The rados object has a size of 0 bytes but the bucket index shows the object listed and the object metadata shows a size of 7147520 bytes. I have a lot of logs but I don't think any of them have
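A sketch of the kind of comparison described here (bucket/object names and the default data pool are assumptions):

    radosgw-admin object stat --bucket=mybucket --object=mykey   # size according to RGW metadata
    rados -p .rgw.buckets ls | grep mykey                        # locate the backing RADOS object(s)
    rados -p .rgw.buckets stat <rados-object-name>               # size actually stored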

[ceph-users] Odd single VM ceph error

2016-01-14 Thread Robert LeBlanc
We have a single VM that is acting odd. We had 7 SSD OSDs (out of 40) go down over a period of about 12 hours. These are a cache tier and have size 4, min_size 2. I'm not able to make heads or tails of the error and hoped someone here could help. 2016-01-14 23:09:54.559121 osd.136 [ERR] 13.503

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Mike Carlson
Hey ceph-users, I wanted to follow up: Zheng's patch did the trick. We re-added the removed mds, and it all came back. We're syncing our data off to a backup server. Thanks for all of the help, Ceph has a great community to work with! Mike C On Thu, Jan 14, 2016 at 4:46 PM, Yan, Zheng

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Yan, Zheng
> On Jan 15, 2016, at 08:01, Gregory Farnum wrote: > > On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote: >> Hey Zheng, >> >> I've been in the #ceph irc channel all day about this. >> >> We did that, we set max_mds back to 1, but, instead of stopping

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Sage Weil
On Fri, 15 Jan 2016, Yan, Zheng wrote: > > On Jan 15, 2016, at 08:16, Mike Carlson wrote: > > > > Did I just lose all of my data? > > > > If we were able to export the journal, could we create a brand new mds out > > of that and retrieve our data? > > No. it's easy to

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Yan, Zheng
Here is a patch for v9.2.0. After installing the modified version of ceph-mon, run “ceph mds add failed 1” mds_addfailed.patch Description: Binary data > On Jan 15, 2016, at 08:20, Mike Carlson wrote: > > okay, that sounds really good. > > Would it help if you had access
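For anyone following along, the rough build flow implied here (a sketch against the v9.2.0 tag; the patch path and the exact make invocation may vary):

    git clone --branch v9.2.0 https://github.com/ceph/ceph.git
    cd ceph && git submodule update --init --recursive
    git apply /path/to/mds_addfailed.patch
    ./autogen.sh && ./configure
    cd src && make ceph-mon          # build just the patched monitor
    # replace ceph-mon on the monitor hosts and restart them, then:
    ceph mds add failed 1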

Re: [ceph-users] v10.0.2 released

2016-01-14 Thread Dyweni - Ceph-Users
Does this support rbd images with stripe count > 1? If yes, then this is also a solution for this problem: http://tracker.ceph.com/issues/3837 Thanks, Dyweni On 2016-01-14 13:27, Bill Sanders wrote: Is there some information about rbd-nbd somewhere? If it has feature parity with

[ceph-users] Ceph Advisory Board: meeting minutes 2016-01-12

2016-01-14 Thread Patrick McGarry
This month’s Ceph Advisory Board meeting notes have been added to the Ceph wiki: wiki.ceph.com/Ceph_Advisory_Board Please let me know if you have any questions or concerns. Thanks. -- Best Regards, Patrick McGarry Director Ceph Community || Red Hat http://ceph.com ||

Re: [ceph-users] v10.0.2 released

2016-01-14 Thread Jason Dillaman
rbd-nbd uses librbd directly -- it runs as a user-space daemon process and interacts with the kernel NBD commands via a UNIX socket. As a result, it supports all image features supported by librbd. You can use the rbd CLI to map/unmap RBD-based NBDs [1] similar to how you map/unmap images via
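For the curious, mapping looks roughly like this (pool/image names are hypothetical, and the exact invocation may differ between releases):

    rbd-nbd map rbd/myimage     # prints the device it attached, e.g. /dev/nbd0
    rbd-nbd unmap /dev/nbd0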

[ceph-users] Community Update

2016-01-14 Thread Patrick McGarry
Hey cephers, It has been quite a while since I distilled the highlights of what is going on in the community into a single post, so I figured it was long overdue. Please check out the latest Ceph.com blog and some of the many great things that are on our short-term radar at the moment:

Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-14 Thread Mike Carlson
Thank you for the reply Zheng We tried setting mds bal frag to true, but the end result was less than desirable. All nfs and smb clients could no longer browse the share, they would hang on a directory with anything more than a few hundred files. We then tried to back out the active/active mds

Re: [ceph-users] Observations after upgrading to latest Firefly (0.80.11)

2016-01-14 Thread Gregory Farnum
On Thu, Jan 14, 2016 at 12:50 AM, Kostis Fardelas wrote: > Hello cephers, > after being on 0.80.10 for a while, we upgraded to 0.80.11 and we > noticed the following things: > a. ~13% paxos refresh latency increase (from about 0.015 to 0.017 on average) > b. ~15% paxos commit

Re: [ceph-users] v10.0.2 released

2016-01-14 Thread Yehuda Sadeh-Weinraub
On Thu, Jan 14, 2016 at 7:37 AM, Sage Weil wrote: > This development release includes a raft of changes and improvements for > Jewel. Key additions include CephFS scrub/repair improvements, an AIX and > Solaris port of librados, many librbd journaling additions and fixes, >