Re: [ceph-users] Ceph-deploy new OSD addition issue

2016-06-28 Thread Pisal, Ranjit Dnyaneshwar
This is another error I get while trying to activate the disk - [ceph@MYOPTPDN16 ~]$ sudo ceph-disk activate /dev/sdl1 2016-06-29 11:25:17.436256 7f8ed85ef700 0 -- :/1032777 >> 10.115.1.156:6789/0 pipe(0x7f8ed4021610 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8ed40218a0).fault 2016-06-29 11:25:20.436362
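The pipe(...).fault against 10.115.1.156:6789 suggests the new host cannot reach that monitor at all. A minimal connectivity check from the OSD host, assuming the monitor address from the log above and a default /etc/ceph layout, might look like:

  # Can the host reach the monitor and its port? (IP/port taken from the fault line)
  ping -c 3 10.115.1.156
  nc -zv 10.115.1.156 6789
  # Does the host have a usable ceph.conf and admin keyring?
  ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
  ceph -s

If ceph -s also hangs or prints the same fault, fix networking/firewalling and the keyring first; ceph-disk activate cannot register the OSD without a monitor connection.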

Re: [ceph-users] CephFS mds cache pressure

2016-06-28 Thread Mykola Dvornik
I have the same issues with a variety of kernel clients running 4.6.3 and 4.4.12, and with fuse clients from 10.2.2. -Mykola -Original Message- From: xiaoxi chen To: João Castro , ceph-users@lists.ceph.com Subject: Re: [ceph-users] CephFS

[ceph-users] Ceph-deploy new OSD addition issue

2016-06-28 Thread Pisal, Ranjit Dnyaneshwar
Hi, I am stuck adding a new OSD host to an existing Ceph cluster. I have tried multiple combinations for creating OSDs on the new host, but every time it fails during disk activation and no partition for the OSD (/var/lib/ceph/osd/ceph-xxx) is created; instead (/var/lib/ceph/tmp/bhbjnk.mnt)
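For reference, a typical ceph-deploy sequence for bringing a fresh disk up as an OSD is sketched below. The hostname and device are taken from the log snippet elsewhere in this thread and are only assumptions about this setup:

  # From the admin node: wipe, prepare and activate the disk
  ceph-deploy disk zap MYOPTPDN16:sdl
  ceph-deploy osd prepare MYOPTPDN16:sdl
  ceph-deploy osd activate MYOPTPDN16:/dev/sdl1
  # On the OSD host: the data partition should now be mounted under /var/lib/ceph/osd/
  df -h | grep /var/lib/ceph/osd

If the partition still only appears under /var/lib/ceph/tmp, activation is failing before the OSD can be registered with the monitors, which points back at the monitor fault shown in the reply above.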

Re: [ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Alex Gorbachev
Hi Stefan, On Tue, Jun 28, 2016 at 1:46 PM, Stefan Priebe - Profihost AG wrote: > Please be aware that you may need even more patches. Overall this needs 3 > patches. Where the first two try to fix a bug and the 3rd one fixes the > fixes + even more bugs related to the

[ceph-users] FIO Performance test

2016-06-28 Thread Mohd Zainal Abidin Rabani
Hi, We have managed to deploy Ceph with CloudStack. We are now running 3 monitors and 5 OSDs. We are sharing some output, and we are very proud to have got Ceph working. We will move Ceph to production shortly. We have managed to build VSM (a GUI) to monitor Ceph. Result: This test uses one VM only. The result

Re: [ceph-users] CephFS mds cache pressure

2016-06-28 Thread xiaoxi chen
Hmm, I asked this on the ML some days ago. :) You likely hit the kernel bug which was fixed by commit 5e804ac482 "ceph: don't invalidate page cache when inode is no longer used". This fix is in 4.4 but not in 4.2. I haven't had a chance to play with 4.4; it would be great if you could give it a try. For
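If it helps, a quick way to check whether a given client kernel already carries that fix (this assumes access to a Linux kernel git checkout; the short hash is the one quoted above):

  uname -r                        # kernel the CephFS client is running
  git tag --contains 5e804ac482   # run inside a Linux kernel git tree

The second command lists the release tags that include the commit; per the message above it should show v4.4 and later.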

Re: [ceph-users] CPU use for OSD daemon

2016-06-28 Thread Christian Balzer
Hello, re-adding list. On Tue, 28 Jun 2016 20:52:51 +0300 George Shuklin wrote: > On 06/28/2016 06:46 PM, Christian Balzer wrote: > > Hello, > > > > On Tue, 28 Jun 2016 18:23:02 +0300 George Shuklin wrote: > > > >> Hello. > >> > >> I'm testing different configuration for Ceph. > > What

Re: [ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Brendan Moloney
The Ubuntu bug report is here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1568729 > Please be aware that you may need even more patches. Overall this needs 3 > patches. Where the first two try to fix a bug and the 3rd one fixes the fixes > + even more bugs related to the scheduler.

Re: [ceph-users] CephFS mds cache pressure

2016-06-28 Thread João Castro
Sorry, forgot. Kernel!

Re: [ceph-users] CephFS mds cache pressure

2016-06-28 Thread João Castro
Hey John, ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) 4.2.0-36-generic Thanks!

Re: [ceph-users] CephFS mds cache pressure

2016-06-28 Thread John Spray
On Tue, Jun 28, 2016 at 10:25 PM, João Castro wrote: > Hello guys, > From time to time I have MDS cache pressure error (Client failing to respond > to cache pressure). > > If I try to increase the mds_cache_size to increase the number of inodes two > things happen: > > 1)

[ceph-users] CephFS mds cache pressure

2016-06-28 Thread João Castro
Hello guys, From time to time I get the MDS cache pressure error (Client failing to respond to cache pressure). If I try to increase mds_cache_size to raise the number of inodes, two things happen: 1) inodes keep growing until I get to the limit again 2) the more inodes, mds runs out of
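For anyone wanting to experiment, a minimal sketch of inspecting and raising the MDS cache limit on 10.2.x; the MDS name "a" and the new value are placeholders, not taken from this cluster:

  # Current limit (the default is 100000 inodes); run on the MDS host
  ceph daemon mds.a config get mds_cache_size
  # Raise it at runtime
  ceph tell mds.a injectargs '--mds_cache_size 500000'
  # Watch how many inodes the MDS actually holds
  ceph daemon mds.a perf dump mds | grep inode

As the rest of the thread shows, raising the limit only postpones the warning if the clients never release capabilities, so the client kernel/fuse versions are the more important lead.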

Re: [ceph-users] Rebalancing cluster and client access

2016-06-28 Thread Oliver Dzombic
Hi Sergey, if you have size = 2 and min_size = 1 then with 2 replicas all should be fine and accessible, even when 1 node goes down. -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de Anschrift: IP Interactive UG ( haftungsbeschraenkt )
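A quick way to confirm what the affected pool is actually configured with (the pool name "rbd" below is just a placeholder):

  ceph osd pool get rbd size
  ceph osd pool get rbd min_size
  # On a size-2 pool, min_size=2 pauses I/O as soon as one replica is lost
  ceph osd pool set rbd min_size 1

Keep in mind that size=2 with min_size=1 trades availability for a higher risk of data loss if a second failure hits before recovery finishes.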

[ceph-users] Rebalancing cluster and client access

2016-06-28 Thread Sergey Osherov
Hi everybody! We have a cluster with 12 storage nodes and replication 2. When one node was destroyed, our clients could not access the Ceph cluster. I read that with two copies in a replicated pool, Ceph interrupts write operations during a degraded state. But it is not clear why clients cannot read

Re: [ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Stefan Priebe - Profihost AG
Please be aware that you may need even more patches. Overall this needs 3 patches. Where the first two try to fix a bug and the 3rd one fixes the fixes + even more bugs related to the scheduler. I've no idea on which patch level Ubuntu is. Stefan Excuse my typo sent from my mobile phone. >

Re: [ceph-users] CPU use for OSD daemon

2016-06-28 Thread Alexandre DERUMIER
>>And when I benchmark it I see some horribly-low performance and clear >>bottleneck at ceph-osd process: it consumes about 110% of CPU and giving >>me following results: 127 iops in fio benchmark (4k randwrite) for rbd >>device, rados benchmark gives me ~21 IOPS and 76Mb/s (write). on a 2x xeon

Re: [ceph-users] Another cluster completely hang

2016-06-28 Thread Stefan Priebe - Profihost AG
And ceph health detail Stefan Excuse my typo sent from my mobile phone. > Am 28.06.2016 um 19:28 schrieb Oliver Dzombic : > > Hi Mario, > > please give some more details: > > Please the output of: > > ceph osd pool ls detail > ceph osd df > ceph --version > > ceph

Re: [ceph-users] Another cluster completely hang

2016-06-28 Thread Oliver Dzombic
Hi Mario, please give some more details: Please the output of: ceph osd pool ls detail ceph osd df ceph --version ceph -w for 10 seconds ( use http://pastebin.com/ please ) ceph osd crush dump ( also pastebin pls ) -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive

[ceph-users] Mounting Ceph RBD under xenserver

2016-06-28 Thread Mike Jacobacci
Hi all, Is there anyone using rbd for xenserver vm storage? I have XenServer 7 and the latest Ceph, and I am looking for the best way to mount the rbd volume under XenServer. There is not much recent info out there that I have found except for this:
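One frequently described approach, offered here only as a rough sketch: map the image with the kernel RBD client in dom0 and create a XenServer SR on the mapped device. It assumes the XenServer 7 kernel ships a usable rbd module, and the pool/image names are invented for the example:

  # On the XenServer host
  rbd create xen-pool/vm-sr --size 1048576      # 1 TB image; names are placeholders
  rbd map xen-pool/vm-sr
  xe sr-create name-label=ceph-rbd-sr type=lvm device-config:device=/dev/rbd0

Whether the stock dom0 kernel and the cluster's tunables are compatible is exactly the kind of detail worth verifying before trusting VM storage to it.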

[ceph-users] Another cluster completely hang

2016-06-28 Thread Mario Giammarco
Hello, this is the second time this has happened to me; I hope that someone can explain what I can do. Proxmox ceph cluster with 8 servers, 11 hdd. Min_size=1, size=2. One hdd goes down due to bad sectors. Ceph recovers but it ends with: cluster f2a8dd7d-949a-4a29-acab-11d4900249f4 health
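For readers hitting the same situation, a few commands that usually narrow down why PGs stay stuck after a disk failure (nothing here is specific to this cluster):

  ceph health detail
  ceph pg dump_stuck inactive
  ceph pg dump_stuck unclean
  # For one stuck PG, show which OSDs it wants and what it is blocked on
  ceph pg <pgid> query

With size=2 and min_size=1 a single failed disk should not block I/O by itself, so the query output for the stuck PGs is the interesting part.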

Re: [ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Tim Bishop
Yes - I noticed this today on Ubuntu 16.04 with the default kernel. No useful information to add other than it's not just you. Tim. On Tue, Jun 28, 2016 at 11:05:40AM -0400, Alex Gorbachev wrote: > After upgrading to kernel 4.4.13 on Ubuntu, we are seeing a few of > these issues where an OSD

Re: [ceph-users] CPU use for OSD daemon

2016-06-28 Thread Christian Balzer
Hello, On Tue, 28 Jun 2016 18:23:02 +0300 George Shuklin wrote: > Hello. > > I'm testing different configuration for Ceph. What version... > I found that osd are > REALLY hungry for cpu. > They can be, but unlikely in your case. > I've created a tiny pool with size 1 with single OSD made

[ceph-users] CPU use for OSD daemon

2016-06-28 Thread George Shuklin
Hello. I'm testing different configurations for Ceph. I found that OSDs are REALLY hungry for CPU. I've created a tiny pool with size 1 with a single OSD made of a fast Intel SSD (2500-series), on an old Dell server (R210), Xeon E3-1230 V2 @ 3.30GHz. And when I benchmark it I see some horribly-low
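For comparison, a fio invocation along the lines of the test being described; the pool and image names are placeholders and this assumes fio was built with the rbd ioengine:

  fio --name=rbd-4k-randwrite --ioengine=rbd --clientname=admin \
      --pool=testpool --rbdname=testimage \
      --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based

Note that a size-1 pool on a single OSD serialises every write through one ceph-osd process and its journal, so one core pegged near 100% with modest IOPS is not by itself evidence of a misconfiguration.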

Re: [ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Stefan Priebe - Profihost AG
Yes you need those lkml patches. I added them to our custom 4.4 Kernel too to prevent this. Stefan Excuse my typo sent from my mobile phone. > Am 28.06.2016 um 17:05 schrieb Alex Gorbachev : > > After upgrading to kernel 4.4.13 on Ubuntu, we are seeing a few of >

[ceph-users] Is anyone seeing issues with task_numa_find_cpu?

2016-06-28 Thread Alex Gorbachev
After upgrading to kernel 4.4.13 on Ubuntu, we are seeing a few of these issues where an OSD would fail with the stack below. I logged a bug at https://bugzilla.kernel.org/show_bug.cgi?id=121101 and there is a similar description at https://lkml.org/lkml/2016/6/22/102, but the odd part is we have

Re: [ceph-users] ceph not replicating to all osds

2016-06-28 Thread Ishmael Tsoaela
Thanks Brad, I have looked into OCFS2 and it does exactly what I wanted. On Tue, Jun 28, 2016 at 1:04 PM, Brad Hubbard wrote: > On Tue, Jun 28, 2016 at 4:17 PM, Ishmael Tsoaela > wrote: > > Hi, > > > > I am new to Ceph and most of the concepts are

Re: [ceph-users] How many nodes/OSD can fail

2016-06-28 Thread David
Hi, This is probably the min_size on your cephfs data and/or metadata pool. I believe the default is 2, if you have less than 2 replicas available I/O will stop. See: http://docs.ceph.com/docs/master/rados/operations/pools/#set-the-number-of-object-replicas On Tue, Jun 28, 2016 at 10:23 AM,
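A quick way to check this on an existing CephFS (the pool names cephfs_data/cephfs_metadata are only the usual defaults; ceph fs ls reports the real ones):

  ceph fs ls
  ceph osd pool get cephfs_data size
  ceph osd pool get cephfs_data min_size
  ceph osd pool get cephfs_metadata min_size

With size=3 and min_size=2, losing two of three hosts leaves a single replica, which is below min_size, so I/O pauses until a second copy is available again.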

Re: [ceph-users] ceph not replicating to all osds

2016-06-28 Thread Brad Hubbard
On Tue, Jun 28, 2016 at 4:17 PM, Ishmael Tsoaela wrote: > Hi, > > I am new to Ceph and most of the concepts are new. > > image mounted on nodeA, FS is XFS > > sudo mkfs.xfs /dev/rbd/data/data_01 > > sudo mount /dev/rbd/data/data_01 /mnt > > cluster_master@nodeB:~$ mount|grep

Re: [ceph-users] VM shutdown because of PG increase

2016-06-28 Thread Brad Hubbard
On Tue, Jun 28, 2016 at 7:39 PM, Torsten Urbas wrote: > Hello, > > are you sure about your Ceph version? Below’s output states "0.94.1“. I suspect it's quite likely that the cluster was upgraded but not the clients or, if the clients were upgraded, that the VMs were not
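A hedged way to confirm this kind of client/cluster version skew (the package query assumes a Debian/Ubuntu compute node; adjust for your distribution):

  # What the daemons are actually running
  ceph tell osd.* version
  # What librbd/librados the hypervisor host has installed
  dpkg -l | grep -E 'librbd|librados'
  # A long-running qemu process keeps the old library mapped until the VM is
  # restarted or live-migrated, which is what the advice below refers to
  sudo cat /proc/$(pgrep -of qemu)/maps | grep librbd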

Re: [ceph-users] OSD Cache

2016-06-28 Thread David
Hi, Please clarify what you mean by "osd cache". Raid controller cache or Ceph's cache tiering feature? On Tue, Jun 28, 2016 at 10:21 AM, Mohd Zainal Abidin Rabani < zai...@nocser.net> wrote: > Hi, > > > > We have using osd on production. SSD as journal. We have test io and show > good result.

Re: [ceph-users] VM shutdown because of PG increase

2016-06-28 Thread Torsten Urbas
Hello, are you sure about your Ceph version? The output below states "0.94.1". We ran into a similar issue with Ceph 0.94.3 and can confirm that we no longer see it with Ceph 0.94.5. If you upgraded during operation, did you migrate all of your VMs at least once to make sure they

[ceph-users] How many nodes/OSD can fail

2016-06-28 Thread willi.feh...@t-online.de
Hello, I'm still very new to Ceph. I've created a small test cluster: ceph-node1 (osd0 osd1 osd2), ceph-node2 (osd3 osd4 osd5), ceph-node3 (osd6 osd7 osd8). My pool for CephFS has a replication count of 3. I powered off 2 nodes (6 OSDs went down) and my cluster status became critical and my ceph

[ceph-users] OSD Cache

2016-06-28 Thread Mohd Zainal Abidin Rabani
Hi, We are using OSDs in production, with SSDs as journals. We have tested I/O and the results look good. We plan to use an OSD cache to get better IOPS. Has anyone here deployed an OSD cache successfully? Please share or advise here. Thanks.
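Assuming "OSD cache" here means Ceph's cache tiering feature (see the clarification request above), a minimal writeback tier looks roughly like this; the pool names are placeholders and the hit_set/target values need tuning for any real deployment:

  # cold-pool is the existing backing pool, hot-pool a new SSD-backed pool
  ceph osd tier add cold-pool hot-pool
  ceph osd tier cache-mode hot-pool writeback
  ceph osd tier set-overlay cold-pool hot-pool
  ceph osd pool set hot-pool hit_set_type bloom
  ceph osd pool set hot-pool target_max_bytes 100000000000   # ~100 GB, example only

Cache tiering has enough sharp edges (promotion cost, eviction behaviour) that testing it against the real workload before production is strongly advisable.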

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Stefan Priebe - Profihost AG
Am 28.06.2016 um 09:42 schrieb Christian Balzer: > On Tue, 28 Jun 2016 09:15:50 +0200 Stefan Priebe - Profihost AG wrote: > >> >> Am 28.06.2016 um 09:06 schrieb Christian Balzer: >>> >>> Hello, >>> >>> On Tue, 28 Jun 2016 08:34:26 +0200 Stefan Priebe - Profihost AG wrote: >>> Am 27.06.2016

[ceph-users] VM shutdown because of PG increase

2016-06-28 Thread 한승진
Hi, Cephers. Our Ceph version is Hammer (0.94.7). I implemented Ceph with OpenStack; all instances use block storage as a local volume. After increasing the PG number from 256 to 768, many VMs shut down. That was a very strange case for me. Below is the VM's libvirt error log. osd/osd_types.cc:
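For what it is worth, the usual advice is to grow placement groups in small steps and let the cluster settle in between, rather than in one 256 -> 768 jump. A sketch, where the pool name "volumes" is just the common OpenStack convention and not taken from the original post:

  ceph osd pool set volumes pg_num 384
  # wait for creating/peering to finish, then
  ceph osd pool set volumes pgp_num 384
  # repeat towards the target, watching
  ceph -s

This does not explain the osd_types.cc assert in the libvirt log, but it reduces the peering storm that older client libraries appear to handle badly, which is what the replies above are probing at.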

Re: [ceph-users] Should I use different pool?

2016-06-28 Thread EM - SC
Thanks for the answers. SSD could be an option, but the idea is to grow (if business goes well) from those 18TB. After reading some negative comments about CephFS with very large directories containing many subdirectories (which is our case), however, I am concerned it doesn't perform very well. The big

Re: [ceph-users] Should I use different pool?

2016-06-28 Thread Brian ::
+1 for 18TB and all SSD - if you need any decent IOPS with a cluster this size then all SSDs are the way to go. On Mon, Jun 27, 2016 at 11:47 AM, David wrote: > Yes you should definitely create different pools for different HDD types. > Another decision you need to
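For completeness, a sketch of how per-disk-type pools are usually wired up on the releases discussed here; every bucket, rule, and pool name below is a placeholder:

  # Separate CRUSH roots; hosts/OSDs must then be moved under the right root
  # with 'ceph osd crush move'
  ceph osd crush add-bucket ssd-root root
  ceph osd crush add-bucket hdd-root root
  # One rule per root, replicating across hosts
  ceph osd crush rule create-simple ssd-rule ssd-root host
  ceph osd crush rule create-simple hdd-rule hdd-root host
  # Pools bound to each rule
  ceph osd pool create fast-pool 128 128 replicated ssd-rule
  ceph osd pool create slow-pool 512 512 replicated hdd-rule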

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Lionel Bouton
Hi, On 28/06/2016 08:34, Stefan Priebe - Profihost AG wrote: > [...] > Yes but at least BTRFS is still not working for ceph due to > fragmentation. I've even tested a 4.6 kernel a few weeks ago. But it > doubles its I/O after a few days. BTRFS autodefrag is not working over the long term.

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Christian Balzer
On Tue, 28 Jun 2016 09:15:50 +0200 Stefan Priebe - Profihost AG wrote: > > Am 28.06.2016 um 09:06 schrieb Christian Balzer: > > > > Hello, > > > > On Tue, 28 Jun 2016 08:34:26 +0200 Stefan Priebe - Profihost AG wrote: > > > >> Am 27.06.2016 um 02:14 schrieb Christian Balzer: > >>> On Sun, 26

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Stefan Priebe - Profihost AG
Am 28.06.2016 um 09:06 schrieb Christian Balzer: > > Hello, > > On Tue, 28 Jun 2016 08:34:26 +0200 Stefan Priebe - Profihost AG wrote: > >> Am 27.06.2016 um 02:14 schrieb Christian Balzer: >>> On Sun, 26 Jun 2016 19:48:18 +0200 Stefan Priebe wrote: >>> Hi, is there any option or

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Christian Balzer
Hello, On Tue, 28 Jun 2016 08:34:26 +0200 Stefan Priebe - Profihost AG wrote: > Am 27.06.2016 um 02:14 schrieb Christian Balzer: > > On Sun, 26 Jun 2016 19:48:18 +0200 Stefan Priebe wrote: > > > >> Hi, > >> > >> is there any option or chance to have auto repair of pgs in hammer? > >> > > Short

Re: [ceph-users] pg scrub and auto repair in hammer

2016-06-28 Thread Stefan Priebe - Profihost AG
Am 27.06.2016 um 02:14 schrieb Christian Balzer: > On Sun, 26 Jun 2016 19:48:18 +0200 Stefan Priebe wrote: > >> Hi, >> >> is there any option or chance to have auto repair of pgs in hammer? >> > Short answer: > No, in any version of Ceph. > > Long answer: > There are currently no checksums
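For reference, the manual flow on Hammer when a scrub does flag an inconsistency (the PG id is a placeholder):

  ceph health detail      # lists the PGs marked inconsistent
  ceph pg repair 3.1a     # ask the primary to repair that PG

As the long answer quoted above explains, without data checksums the repair generally just re-copies the primary's version of the object, so it is worth working out which replica is actually the damaged one before running it.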

Re: [ceph-users] ceph not replicating to all osds

2016-06-28 Thread Ishmael Tsoaela
Hi, I am new to Ceph and most of the concepts are new. image mounted on nodeA, FS is XFS sudo mkfs.xfs /dev/rbd/data/data_01 sudo mount /dev/rbd/data/data_01 /mnt cluster_master@nodeB:~$ mount|grep rbd /dev/rbd0 on /mnt type xfs (rw) Basically I need a way to write on nodeA, mount the same
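As the replies in this thread point out, XFS is a single-host filesystem: mapping the same image on a second node and mounting it read-write will corrupt it. Two safer options, sketched with the pool/image names from the message above:

  # Option 1: strictly read-only access from nodeB (data written on nodeA only
  # becomes visible after a fresh mount)
  sudo rbd map data/data_01
  sudo mount -o ro,norecovery /dev/rbd0 /mnt
  # Option 2: reformat the image with a cluster filesystem such as OCFS2,
  # which is the route taken earlier in this thread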