Re: [ceph-users] NFS interaction with RBD

2015-06-11 Thread Christian Schnidrig
Hi George, well that’s strange. I wonder why our systems behave so differently. We’ve got: hypervisors running on Ubuntu 14.04; VMs with 9 Ceph volumes, 2 TB each; XFS instead of your ext4. Maybe the number of placement groups plays a major role as well. Jens-Christian may be able to give you
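The placement-group hunch can at least be sanity-checked with the common rule of thumb. A sketch only: the 102-OSD figure is taken from later in this thread, the 3x replication factor is an assumption, and the recommended PG ratio varies by Ceph release.

```shell
# Rule of thumb: total PGs ~= (OSDs * 100) / replica count,
# rounded up to the next power of two.
osds=102        # cluster size mentioned elsewhere in this thread
replicas=3      # assumed replication factor
raw=$(( osds * 100 / replicas ))
pgs=1
while [ "$pgs" -lt "$raw" ]; do pgs=$(( pgs * 2 )); done
echo "$pgs"     # prints 4096
```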

Re: [ceph-users] NFS interaction with RBD

2015-06-11 Thread Christian Schnidrig
Hi George, in order to experience the error it was enough to simply run mkfs.xfs on all the volumes. In the meantime it became clear what the problem was: ~ ; cat /proc/183016/limits ... Max open files    1024    4096    files ... This can be changed by
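A sketch of how to inspect and lift that limit. The PID 183016 is the one quoted above and will differ on your system; `max_files` is the usual knob in libvirt's `/etc/libvirt/qemu.conf`, but verify against your libvirt version.

```shell
# Check the soft/hard open-file limits of a running process
# (183016 is the PID quoted above; substitute your QEMU PID):
grep 'open files' /proc/183016/limits 2>/dev/null \
    || echo "no such PID on this host"

# The current shell's own soft limit:
ulimit -n

# For libvirt-managed guests the ceiling can be raised in
# /etc/libvirt/qemu.conf (restart libvirtd afterwards):
#   max_files = 32768
```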

Re: [ceph-users] NFS interaction with RBD

2015-05-29 Thread Georgios Dimitrakakis
All, I've tried to recreate the issue without success! My configuration is the following: OS (Hypervisor + VM): CentOS 6.6 (2.6.32-504.1.3.el6.x86_64) QEMU: qemu-kvm-0.12.1.2-2.415.el6.3ceph.x86_64 Ceph: ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047), 20x4TB OSDs equally

Re: [ceph-users] NFS interaction with RBD

2015-05-29 Thread John-Paul Robinson
In the end this came down to one slow OSD. There were no hardware issues, so I have to assume something gummed up during rebalancing and peering. I restarted the osd process after setting the cluster to noout. After the osd was restarted the rebalance completed and the cluster returned to
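The recovery sequence described above looks roughly like this. A dry-run sketch: the OSD id 12 and the sysvinit-style restart command are assumptions, so adjust to your init system; `run` only echoes each command.

```shell
# run() echoes instead of executing, so this is a dry run; change
# the body to '"$@"' to actually execute on a monitor/OSD node.
run() { echo "+ $*"; }

run ceph osd set noout            # stop CRUSH marking the OSD out
run service ceph restart osd.12   # hypothetical slow OSD id
run ceph -w                       # watch the rebalance finish
run ceph osd unset noout
```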

Re: [ceph-users] NFS interaction with RBD

2015-05-28 Thread John-Paul Robinson
To follow up on the original post, further digging indicates this is a problem with RBD image access and is not related to NFS-RBD interaction as initially suspected. The nfsd is simply hanging as a result of a hung request to the XFS file system mounted on our RBD-NFS gateway. This hung XFS
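Hung XFS requests like this usually leave kernel hung-task warnings behind. A sketch of how to spot them (reading dmesg may require root; the 120-second threshold is the kernel default `hung_task_timeout_secs`):

```shell
# Kernel hung-task warnings from the stalled XFS mount:
dmesg 2>/dev/null | grep -i 'blocked for more than' || true

# Processes stuck in uninterruptible sleep (state D) -- typically
# nfsd/xfs threads waiting on the stalled RBD I/O:
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'
```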

Re: [ceph-users] NFS interaction with RBD

2015-05-28 Thread Georgios Dimitrakakis
Thanks a million for the feedback, Christian! I've tried to recreate the issue with 10 RBD volumes mounted on a single server, without success! I've issued the mkfs.xfs command simultaneously (or at least as fast as I could in different terminals) without noticing any problems. Can you
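Driving the volumes truly in parallel, rather than terminal by terminal, can be done by backgrounding the jobs. A dry-run sketch: the device names are hypothetical and mkfs.xfs destroys data, so the command is only echoed here.

```shell
# Remove the leading "echo" to really format; devices are examples.
for dev in /dev/vdb /dev/vdc /dev/vdd /dev/vde; do
    echo mkfs.xfs -f "$dev" &
done
wait    # all four mkfs invocations run concurrently
```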

Re: [ceph-users] NFS interaction with RBD

2015-05-28 Thread Trent Lloyd
Jens-Christian Fischer jens-christian.fischer@... writes: I think we (i.e. Christian) found the problem: We created a test VM with 9 mounted RBD volumes (no NFS server). As soon as he hit all disks, we started to experience these 120 second timeouts. We realized that the QEMU process on the

Re: [ceph-users] NFS interaction with RBD

2015-05-27 Thread Jens-Christian Fischer
George, I will let Christian provide you the details. As far as I know, it was enough to just do a ‘ls’ on all of the attached drives. We are using QEMU 2.0: $ dpkg -l | grep qemu ii ipxe-qemu 1.0.0+git-2013.c3d1e78-2ubuntu1 all PXE boot firmware -

Re: [ceph-users] NFS interaction with RBD

2015-05-26 Thread Jens-Christian Fischer
I think we (i.e. Christian) found the problem: We created a test VM with 9 mounted RBD volumes (no NFS server). As soon as he hit all disks, we started to experience these 120 second timeouts. We realized that the QEMU process on the hypervisor is opening a TCP connection to every OSD for
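The arithmetic behind that failure mode is worth spelling out. A sketch only: the 102-OSD count comes from elsewhere in this thread, and how aggressively librbd opens and caches OSD connections varies by version.

```shell
osds=102     # OSDs in the cluster (figure from this thread)
volumes=9    # RBD volumes attached to the one QEMU process
echo $(( osds * volumes ))   # up to 918 sockets -- close to the
                             # 1024 default open-file limit
```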

Re: [ceph-users] NFS interaction with RBD

2015-05-26 Thread Georgios Dimitrakakis
Jens-Christian, how did you test that? Did you just try to write to them simultaneously? Any other tests that one can perform to verify that? In our installation we have a VM with 30 RBD volumes mounted which are all exported via NFS to other VMs. No one has complained for the moment but

Re: [ceph-users] NFS interaction with RBD

2015-05-24 Thread Christian Balzer
Hello, let's compare your case with John-Paul's. Different OS and Ceph versions (thus we can assume different NFS versions as well). The only common thing is that both of you added OSDs and are likely suffering from delays stemming from Ceph re-balancing or deep-scrubbing. Ceph logs will only
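A sketch of the commands for confirming re-balancing or deep-scrub activity. These are standard ceph CLI invocations, but output format varies by release; each is guarded so the sketch exits cleanly on a host without a cluster (drop the guards on a monitor node).

```shell
ceph -s 2>/dev/null || echo "ceph -s: no cluster on this host"
ceph health detail 2>/dev/null || echo "ceph health detail: no cluster on this host"

# Deep scrubs can be paused while testing the stall theory:
#   ceph osd set nodeep-scrub     (remember to unset afterwards)
```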

Re: [ceph-users] NFS interaction with RBD

2015-05-23 Thread Jens-Christian Fischer
We see something very similar on our Ceph cluster, starting as of today. We use a 16-node, 102-OSD Ceph installation as the basis for an Icehouse OpenStack cluster (we applied the RBD patches for live migration etc). On this cluster we have a big ownCloud installation (Sync Share) that stores

[ceph-users] NFS interaction with RBD

2015-05-23 Thread John-Paul Robinson (Campus)
We've had an NFS gateway serving up RBD images successfully for over a year. Ubuntu 12.04 and ceph .73 iirc. In the past couple of weeks we have developed a problem where the nfs clients hang while accessing exported rbd containers. We see errors on the server about nfsd hanging for 120sec