Re: [ceph-users] CephFS: No space left on device

2016-10-04 Thread Yan, Zheng
On Mon, Oct 3, 2016 at 5:48 AM, Mykola Dvornik wrote: > Hi Johan, > > Many thanks for your reply. I will try to play with the mds tunables and > report back to you ASAP. > > So far I see that the mds log contains a lot of errors of the following kind: > > 2016-10-02
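
The ENOSPC errors CephFS returns are usually driven by MDS-side limits rather than actual pool capacity. As a minimal sketch only, assuming (it is not stated in this message) that the limit being hit is the directory-fragment entry cap that also applies to the stray directories, the current value can be inspected and raised at runtime roughly like this; the value shown is illustrative, not a recommendation:

    # check the current limit on the running MDS
    ceph daemon mds.<name> config show | grep mds_bal_fragment_size_max
    # raise it at runtime (illustrative value)
    ceph tell mds.<name> injectargs '--mds_bal_fragment_size_max 200000'

Whether this is the right tunable depends on the actual errors showing up in the MDS log.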

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread Yan, Zheng
On Tue, Oct 4, 2016 at 11:30 PM, John Spray wrote: > On Tue, Oct 4, 2016 at 5:09 PM, Stephen Horton wrote: >> Thank you John. Both my Openstack hosts and the VMs are all running >> 4.4.0-38-generic #57-Ubuntu SMP x86_64. I can see no evidence that any of

Re: [ceph-users] Crash in ceph_readdir.

2016-10-04 Thread Yan, Zheng
> On 3 Oct 2016, at 20:27, Ilya Dryomov wrote: > > On Mon, Oct 3, 2016 at 1:19 PM, Nikolay Borisov wrote: >> Hello, >> >> I've been investigating the following crash with cephfs: >> >> [8734559.785146] general protection fault: [#1] SMP >>

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Christian Balzer
Hello, replying to the original post for quoting reasons. Totally agree with what the others (Nick and Burkhard) wrote. On Tue, 04 Oct 2016 15:43:18 +0200 Denny Fuchs wrote: > Hello, > > we are brand new to Ceph and planning it as our future storage for > KVM/LXC VMs as a replacement for Xen /

Re: [ceph-users] Investigating active+remapped+wait_backfill pg status

2016-10-04 Thread Ivan Grcic
I've just seen that the send_pg_creates command is obsolete and has already been removed in commit 6cbdd6750cf330047d52817b9ee9af31a7d318ae, so I guess it doesn't do too much :) Tnx, Ivan On Wed, Oct 5, 2016 at 2:37 AM, Ivan Grcic wrote: > Hi everybody, > > I am trying to understand

[ceph-users] Investigating active+remapped+wait_backfill pg status

2016-10-04 Thread Ivan Grcic
Hi everybody, I am trying to understand why I keep getting remapped+wait_backfill pg statuses when doing some cluster pg shuffling. Sometimes it happens just by doing a small reweight-by-utilization operation, and sometimes when I modify the crushmap (bigger movement of data). Taking a look
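
As a first pass at this kind of investigation, a hedged sketch of the commands commonly used to see which PGs are stuck and which OSDs they are waiting on (the PG id is a placeholder, and test-reweight-by-utilization needs a reasonably recent release):

    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg <pgid> query                    # look at the up/acting sets and backfill_targets
    ceph osd test-reweight-by-utilization   # dry run before the real reweight

remapped+wait_backfill generally just means the PG has been mapped to new OSDs and is queued behind other backfills, so the interesting part of the query output is what the PG is waiting for.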

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread Stephen Horton
Thanks again John. I am installing the 4.8.0-040800 kernel on my VM clients and will report back. Just to confirm: for this issue there is no reason to try the newer kernel on the mds node as well, correct? > On Oct 4, 2016, at 10:30 AM, John Spray wrote: > >> On Tue, Oct 4, 2016 at

[ceph-users] Bug 14396 Calamari Dashboard :: can't connect to the cluster??

2016-10-04 Thread McFarland, Bruce
I am attempting to bring up Calamari using SuSE rpms. I am able to connect to all salt minions, execute ceph.get_heartbeats from all minions on the master, see the whisper DB getting populated from diamond data passed to the master, and initialize calamari. When I open the Calamari Dashboard
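
When the dashboard reports that it cannot connect to the cluster even though the minions look healthy, a hedged first round of checks (the log locations depend on the packaging, so treat them as assumptions) is to confirm the master really receives heartbeat data and then re-run the Calamari initialization:

    salt '*' test.ping
    salt '*' ceph.get_heartbeats
    calamari-ctl initialize
    ls /var/log/calamari/        # inspect the backend/httpd logs for errors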

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread John Spray
On Tue, Oct 4, 2016 at 5:09 PM, Stephen Horton wrote: > Thank you John. Both my Openstack hosts and the VMs are all running > 4.4.0-38-generic #57-Ubuntu SMP x86_64. I can see no evidence that any of the > VMs are holding large numbers of files open. If this is likely a
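
One way to check the "large numbers of files open" theory from the cluster side instead of inside each VM is to look at how many caps each client session is pinning on the MDS; the cache-pressure warning fires when the MDS asks a session to trim caps and the session does not shrink. A minimal sketch, with the MDS name as a placeholder:

    ceph daemon mds.<name> session ls | grep -E '"id"|"num_caps"'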

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Burkhard Linke
Hi, some thoughts about network and disks inline. On 10/04/2016 03:43 PM, Denny Fuchs wrote: Hello, *snipsnap* * Storage NIC: 1 x Infiniband MCX314A-BCCT ** I read that the ConnectX-3 Pro is better supported than the X-4 and a bit cheaper ** Switch: 2 x Mellanox SX6012 (56Gb/s) ** Active

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread Stephen Horton
Thank you John. Both my Openstack hosts and the VMs are all running 4.4.0-38-generic #57-Ubuntu SMP x86_64. I can see no evidence that any of the VMs are holding large numbers of files open. If this is likely a client bug, is there some process I can follow to file a bug report? > On Oct 4,

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Denny Fuchs > Sent: 04 October 2016 15:51 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware > planning / agreement > > Hi, > >

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Denny Fuchs
Hi, thanks for taking a look :-) On 04.10.2016 at 16:11, Nick Fisk wrote: We have two goals: * High availability * Short latency for our transaction services How low? See below re CPUs. As low as is possible without doing crazy stuff. We are thinking of putting the database on Ceph too, instead

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread John Spray
On Tue, Oct 4, 2016 at 4:27 PM, Stephen Horton wrote: > Adding that all of my ceph components are version: > 10.2.2-0ubuntu0.16.04.2 > > Openstack is Mitaka on Ubuntu 16.04x. Manila file share is 1:2.0.0-0ubuntu1 > > My scenario is that I have a 3-node ceph cluster running

[ceph-users] Recovery/Backfill Speedup

2016-10-04 Thread Reed Dier
We are currently attempting to expand our small Ceph cluster. We have 8 nodes, 3 mons, and went from a single 8TB disk per node to 2x 8TB disks per node, and the rebalancing process is excruciatingly slow. We were at 576 PGs before expansion and wanted to allow the rebalance to finish before expanding
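
The throttles that normally gate how fast backfill proceeds are per-OSD, so a common (if blunt) way to speed up a rebalance is to raise them temporarily at runtime and revert afterwards. A sketch with illustrative values only, not tuned for this particular cluster:

    ceph tell osd.* injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'
    ceph -s     # watch the recovery/backfill rate

Higher values speed up backfill at the cost of client I/O latency, which matters less while a small cluster is otherwise mostly idle.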

Re: [ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread Stephen Horton
Adding that all of my Ceph components are version 10.2.2-0ubuntu0.16.04.2. Openstack is Mitaka on Ubuntu 16.04x. Manila file share is 1:2.0.0-0ubuntu1. My scenario is that I have a 3-node Ceph cluster running Openstack Mitaka. Each node has 256 GB RAM and a 14 TB RAID 5 array. I have 30 VMs running in

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Nick Fisk
Hi, Comments inline > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Denny Fuchs > Sent: 04 October 2016 14:43 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning > / agreement >

[ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Denny Fuchs
Hello, we are brand new to Ceph and planning it as our future storage for KVM/LXC VMs as a replacement for Xen / DRBD / Pacemaker / Synology (NFS) stuff. We have two goals: * High availability * Short latency for our transaction services * For later: replication to a different datacenter

[ceph-users] cephfs kernel driver - failing to respond to cache pressure

2016-10-04 Thread Stephen Horton
I am using Ceph to back Openstack Nova ephemeral, Cinder volumes, Glance images, and Openstack Manila File Share storage. Originally, I was using ceph-fuse with Manila, but performance and resource usage were poor, so I changed to using the CephFS kernel driver. Now, however, I am getting
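
For context on the switch: moving from ceph-fuse to the kernel client is only a change in how the filesystem is mounted. A hedged sketch of a typical kernel mount, where the monitor address, user name, and secret file are placeholders rather than details from this deployment:

    mount -t ceph 192.0.2.10:6789:/ /mnt/cephfs \
        -o name=manila,secretfile=/etc/ceph/manila.secret

How the kernel client behaves under cache pressure depends heavily on the kernel version, which is what the rest of this thread turns on.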

[ceph-users] status of ceph performance weekly video archives

2016-10-04 Thread Nikolay Borisov
Hello, I'd like to ask whether the recordings of Ceph's performance weekly meetings are going to be updated at http://pad.ceph.com/p/performance_weekly. I can see that the minutes from the meetings are being updated; however, the links to the videos of those discussions are lagging by more than a year

Re: [ceph-users] CephFS: No space left on device

2016-10-04 Thread Mykola Dvornik
To the best of my knowledge nobody used hardlinks within the fs. So I have unmounted everything to see what would happen: [root@005-s-ragnarok ragnarok]# ceph daemon mds.fast-test session ls [] -mds-- --mds_server-- ---objecter--- -mds_cache- ---mds_log rlat inos caps|hsr hcs hcr
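
With an empty session list the stray counters should normally drain as the MDS purges them. A hedged way to watch that happening (counter names come from the MDS perf counters and can differ slightly between releases):

    ceph daemon mds.fast-test perf dump | grep -i stray
    ceph daemonperf mds.fast-test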

Re: [ceph-users] Blog post about Ceph cache tiers - feedback welcome

2016-10-04 Thread Sascha Vogt
Hi Lindsay, On 03.10.2016 at 23:57, Lindsay Mathieson wrote: > Thanks, that clarified things a lot - much easier to follow than the > official docs :) Thank you for the kind words, very much appreciated! > Do cache tiers help with writes as well? Basically there are two cache modes (you specify
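
For readers following along, a minimal sketch of putting a cache tier in front of a base pool and choosing between the two modes discussed; pool names are placeholders, and the hit_set and target-size settings a writeback tier also needs are omitted:

    ceph osd tier add base-pool cache-pool
    ceph osd tier cache-mode cache-pool writeback    # or: readonly
    ceph osd tier set-overlay base-pool cache-pool

In writeback mode writes land on the cache pool first and are flushed to the base pool later, which is why it is the mode that can help write latency; readonly only serves reads from the cache.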

[ceph-users] upgrade from v0.94.6 or lower and 'failed to encode map X with expected crc'

2016-10-04 Thread kefu chai
Hi ceph users, if a user upgrades the cluster from a prior release to v0.94.7 or later by following these steps: 1. upgrade the monitors first, 2. then the OSDs, it is expected that the cluster log will be flooded with messages like: 2016-07-12 08:42:42.1234567 osd.1234 [WRN] failed to encode map
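
The warning is generally a symptom of a mixed-version cluster: daemons on different releases encode OSD maps slightly differently, so the CRCs stop matching while the upgrade is in flight. A hedged sketch of checking which daemons are still on the old release mid-upgrade (the mon id is a placeholder):

    ceph tell osd.* version
    ceph tell mon.<id> version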

Re: [ceph-users] CephFS: No space left on device

2016-10-04 Thread John Spray
(Re-adding list) The 7.5k stray dentries while idle probably indicate that clients are holding onto references to them (unless you unmount the clients and they still don't purge, in which case you may well have found a bug). The other way you can end up with lots of dentries sitting in stray dirs
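
If the suspicion is clients holding references, a hedged way to test it without fully unmounting is to make the kernel clients drop their dentry and inode caches, which releases the corresponding capabilities back to the MDS; this is the generic Linux cache-drop mechanism, not something specific to CephFS:

    # on each CephFS client
    sync
    echo 2 > /proc/sys/vm/drop_caches    # drop dentries and inodes only

If the stray count still does not go down afterwards, that points back toward the purging bug mentioned above.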