Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-13 Thread Dai Xiang
On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote: > Your kernel is way too old for CephFS Luminous. I'd use one of the newer > kernels from elrepo.org. :) We're on 4.12 here on RHEL 7.4. I have updated the kernel to the newest version: [root@d32f3a7b6eb8 ~]$ uname -a Linux d32f3a7b6eb8
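For anyone hitting the same "failed to load ceph kernel module" message, a minimal sketch of the usual checks, assuming a RHEL/CentOS host (inside a container the kernel seen is the host's, and module loading generally has to happen on the host, not in the container):

$ uname -r                # kernel version (a container reports the host kernel)
$ lsmod | grep ceph       # is the cephfs kernel module already loaded?
# modprobe ceph           # load it on the host; from inside an unprivileged container this will normally fail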

Re: [ceph-users] features required for live migration

2017-11-13 Thread Konstantin Shalygin
I'd like to use the live migration feature of KVM. In this scenario, what features may be enabled in the rbd base image, and in my EV (snapshot clone)? You can use live migration without any features enabled. For KVM I can recommend a minimal "rbd default features = 3" (layering, striping).
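As a hedged illustration of that suggestion (feature bits: layering = 1, striping = 2, so 3 enables just those two), a ceph.conf sketch plus an explicit per-image variant with a placeholder image name:

[client]
rbd default features = 3          # layering + striping only

$ rbd create --size 102400 --image-feature layering --image-feature striping rbd/vm-base   # "vm-base" is a placeholder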

Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-13 Thread Dai Xiang
On Tue, Nov 14, 2017 at 02:24:06AM +, Linh Vu wrote: > Your kernel is way too old for CephFS Luminous. I'd use one of the newer > kernels from elrepo.org. :) We're on 4.12 here on RHEL 7.4. There is still a question: why, on my host (3.10.0-327.el7.x86_64), can cephfs mount and load the kernel

Re: [ceph-users] rocksdb: Corruption: missing start of fragmented record

2017-11-13 Thread Konstantin Shalygin
Which isn't released yet, yes. I could try building the development repository if you think that has a chance of resolving the issue? For tests - yes... This mailing list thread says that 12.2.2 should be based on commit

Re: [ceph-users] mount failed since failed to load ceph kernel module

2017-11-13 Thread Linh Vu
Your kernel is way too old for CephFS Luminous. I'd use one of the newer kernels from elrepo.org. :) We're on 4.12 here on RHEL 7.4.

[ceph-users] mount failed since failed to load ceph kernel module

2017-11-13 Thread xiang....@sky-data.cn
Hi! I hit a confusing issue in Docker, as below: after installing Ceph successfully, I want to mount CephFS but it failed: [root@dbffa72704e4 ~]$ /bin/mount 172.17.0.4:/ /cephfs -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret -v failed to load ceph kernel module (1) parsing options:
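For reference, a hedged sketch of the secretfile-based mount being attempted (monitor address and key name taken from the post; the 6789 port and the admin keyring are assumptions), run on a host where the ceph kernel module can actually be loaded:

$ ceph auth get-key client.admin > /etc/ceph/admin.secret
$ chmod 600 /etc/ceph/admin.secret
$ mount -t ceph 172.17.0.4:6789:/ /cephfs -o name=admin,secretfile=/etc/ceph/admin.secret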

[ceph-users] Incorrect pool usage statistics

2017-11-13 Thread Karun Josy
Hello, Recently I deleted all the disks from an erasure pool 'ecpool'. The pool is empty. However, the space usage still shows around 400GB. What might be wrong? $ rbd ls -l ecpool $ $ ceph df GLOBAL: SIZE AVAIL RAW USED %RAW USED 19019G 16796G 2223G 11.69
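A hedged sketch of commands that may help narrow down where the ~400GB is being counted (pool name taken from the post):

$ ceph df detail              # per-pool USED, OBJECTS and raw usage
$ rados df                    # object and clone counts per pool
$ rados -p ecpool ls | head   # any leftover objects, e.g. rbd data objects?
$ rbd trash ls ecpool         # images pending deferred deletion still consume space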

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Anthony D'Atri
Oscar, a few thoughts: o I think you might have some misunderstandings about how Ceph works. Ceph is best deployed as a single cluster spanning multiple servers, generally at least 3. Is that your plan? It sort of sounds as though you're thinking of Ceph managing only the drives local to

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Christian Wuerdig
I haven't used the rados command line utility, but it has an "-o object_size" option as well as "--striper" to make it use the libradosstriper library, so I'd suggest giving these options a go. On Mon, Nov 13, 2017 at 9:40 PM, Marc Roos wrote: > > 1. I don’t think an osd
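A hedged sketch of what that could look like (pool, object and file names are placeholders, and exact option spelling may vary between releases):

$ rados -p ecpool --striper put bigobject /tmp/8gb.img   # libradosstriper splits the upload across many smaller RADOS objects
$ rados -p ecpool --striper stat bigobject
$ rados -p ecpool --striper rm bigobject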

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi Brady, For me it is very difficult to make a PoC because servers are very expensive. Then, may I understand that your advice is a RAID0 for each 4TB? For a balanced configuration... 1 osd x 1 disk of 4TB, 1 osd x 2 disks of 2TB, 1 osd x 4 disks of 1TB. Isn't it? Thanks a lot. On 13 Nov 2017

[ceph-users] Adding a monitor freezes the cluster

2017-11-13 Thread Bishoy Mikhael
Hi All, I've tried adding 2 monitors to a 3-node cluster that had 1 monitor, 1 MGR and 1 MDS. The cluster was in a CLEAN state when it just had 1 monitor. # ceph status cluster: id: 46a122a0-8670-4935-b644-399e744c1c03 health: HEALTH_OK services: mon: 1 daemons, quorum
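The snippet doesn't show what froze, but as a general hedged sketch: growing from 1 to 3 monitors is normally done one monitor at a time, waiting for quorum in between (hostnames are placeholders and ceph-deploy is only an assumption about the tooling used):

$ ceph-deploy mon add mon2
$ ceph quorum_status --format json-pretty   # wait until mon2 joins the quorum
$ ceph-deploy mon add mon3
$ ceph -s                                   # expect "mon: 3 daemons, quorum mon1,mon2,mon3"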

[ceph-users] ceph-disk should wait for device file /dev/sdXX file to be created before trying to run mkfs

2017-11-13 Thread Subhachandra Chandra
Hi, I am using ceph-ansible to deploy ceph to run as a container on VMs running on my laptop. The VMs run CoreOS and the docker image being installed has the tag "tag-build-master-luminous-ubuntu-16.04". The backend is "bluestore". While running the "ceph-osd-prepare" stage, the
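A hedged sketch of the kind of wait being suggested before mkfs runs (device and partition names are placeholders):

partprobe /dev/sdb || true
udevadm settle --timeout=30
while [ ! -b /dev/sdb1 ]; do sleep 1; done   # block until the partition node exists
mkfs.xfs -f /dev/sdb1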

Re: [ceph-users] Object gateway and LDAP Auth

2017-11-13 Thread Josh Haft
Finally got back around to working on this and wanted to provide a solution in case anyone else runs into the same problem. I was able to reproduce the problem using s3cmd, and noticed different calls utilized different signature versions. Doing a GET operation on '/' seemed to use v2 while a
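For anyone else debugging this, a hedged sketch of pinning s3cmd to one signature version while testing (the bucket name is a placeholder); persistently, signature_v2 = True can be set in ~/.s3cfg:

$ s3cmd --signature-v2 ls s3://testbucket                     # force AWS v2 signatures
$ s3cmd --debug ls s3://testbucket 2>&1 | grep -i signature   # see which version each request used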

[ceph-users] Reuse pool id

2017-11-13 Thread Karun Josy
Hi, Is there any way we can change or reuse a pool ID? I had created and deleted a lot of test pools, so the IDs kind of look like this now: --- $ ceph osd lspools 34 imagepool,37 cvmpool,40 testecpool,41 ecpool1, -- Can I change them to 0,1,2,3 etc.? Karun

Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-13 Thread Robert Stanford
ceph osd pool create scbench 100 100 rados bench -p scbench 10 write --no-cleanup rados bench -p scbench 10 seq On Mon, Nov 13, 2017 at 1:28 AM, Rudi Ahlers wrote: > Would you mind telling me what rados command set you use, and share the > output? I would like to compare
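For completeness, a hedged sketch of the remaining read benchmark and the cleanup step that pairs with --no-cleanup above:

$ rados bench -p scbench 10 rand   # random reads against the objects written earlier
$ rados -p scbench cleanup         # remove the benchmark objects when done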

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Lionel Bouton
On 13/11/2017 at 15:47, Oscar Segarra wrote: > Thanks Mark, Peter, > > For clarification, the configuration with RAID5 is having many servers > (2 or more) with RAID5 and CEPH on top of it. Ceph will replicate data > between servers. Of course, each server will have just one OSD daemon >

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Thanks Mark, Peter, For clarification, the configuration with RAID5 is having many servers (2 or more) with RAID5 and CEPH on top of it. Ceph will replicate data between servers. Of course, each server will have just one OSD daemon managing a big disk. Functionally, it looks the same as using

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Marc Roos
Keep in mind also whether you want to have failover in the future. We were running a 2nd server and were replicating the RAID arrays via DRBD. Expanding that storage is quite a hassle compared to just adding a few OSDs.

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi Peter, Thanks a lot for your consideration in terms of storage consumption. The other question is about having one OSD vs. 8 OSDs... will 8 OSDs consume more CPU than 1 OSD (RAID5)? As I want to share compute and OSD on the same box, the resources consumed by the OSDs can be a handicap.

[ceph-users] Active+clean PGs reported many times in log

2017-11-13 Thread Matteo Dacrema
Hi, I noticed that sometimes the monitors start to log active+clean PGs many times in the same line. For example, I have 18432 PGs and the log shows "2136 active+clean, 28 active+clean, 2 active+clean+scrubbing+deep, 16266 active+clean;" After a minute the monitor starts to log correctly again. Is it

Re: [ceph-users] No ops on some OSD

2017-11-13 Thread Marc Roos
Very very nice, thanks! Is there a heavy penalty to pay for enabling this? On Sun, Nov 12,

[ceph-users] CRUSH - adding device class to existing rule without causing complete rebalance

2017-11-13 Thread Patrick Fruh
Hi everyone, I only have a single rule in my crushmap and only OSDs classed as hdd (after the luminous update): rule replicated_ruleset { id 0 type replicated min_size 1 max_size 10 step take default step chooseleaf firstn 0 type host step
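A hedged sketch of the Luminous device-class route (rule and pool names are placeholders); note that moving a pool onto a class-restricted rule can still trigger data movement, so it is worth testing on a small pool first:

$ ceph osd crush rule create-replicated replicated_hdd default host hdd
$ ceph osd pool set mypool crush_rule replicated_hdd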

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Michael
Oscar Segarra wrote: I'd like to hear your opinion about these two configurations: 1.- RAID5 with 8 disks (I will have 7TB but for me it is enough) + 1 OSD daemon 2.- 8 OSD daemons You mean 1 OSD daemon on top of RAID5? I don't think I'd do that. You'll probably want redundancy at Ceph's

Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Peter Maloney
Once you've replaced an OSD, you'll see it is quite simple... doing it for a few is not much more work (you've scripted it, right?). I don't see RAID as giving any benefit here at all. It's not tricky...it's perfectly normal operation. Just get used to ceph, and it'll be as normal as replacing a

[ceph-users] Ubuntu upgrade Zesty => Aardvark, Implications for Ceph?

2017-11-13 Thread Ranjan Ghosh
Hi everyone, In January, support for Ubuntu Zesty will run out and we're planning to upgrade our servers to Aardvark. We have a two-node cluster (and one additional monitoring-only server) and we're using the packages that come with the distro. We have mounted CephFS on the same server with

[ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Oscar Segarra
Hi, I'm designing my infrastructure. I want to provide 8TB (8 disks x 1TB each) of data per host, just for Microsoft Windows 10 VDI. In each host I will have storage (ceph osd) and compute (on kvm). I'd like to hear your opinion about these two configurations: 1.- RAID5 with 8 disks (I will

Re: [ceph-users] rocksdb: Corruption: missing start of fragmented record

2017-11-13 Thread Michael
Konstantin Shalygin wrote: > I think Christian talks about version 12.2.2, not 12.2.* Which isn't released yet, yes. I could try building the development repository if you think that has a chance of resolving the issue? Although I'd still like to know how I could theoretically get my hands

Re: [ceph-users] 答复: 答复: Where can I find the fix commit of #3370 ?

2017-11-13 Thread Ilya Dryomov
On Mon, Nov 13, 2017 at 10:53 AM, 周 威 wrote: > Hi, Ilya > > The kernel version is 3.10.106. > Part of dmesg related to ceph: > [7349718.004905] libceph: osd297 down > [7349718.005190] libceph: osd299 down > [7349785.671015] libceph: osd295 down > [7350006.357509] libceph: osd291

Re: [ceph-users] No ops on some OSD

2017-11-13 Thread John Spray
On Sun, Nov 12, 2017 at 2:56 PM, Marc Roos wrote: > > [@c03 ~]# ceph osd status > 2017-11-12 15:54:13.164823 7f478a6ad700 -1 WARNING: the following > dangerous and experimental features are enabled: bluestore > 2017-11-12 15:54:13.211219 7f478a6ad700 -1 WARNING: the

Re: [ceph-users] No ops on some OSD

2017-11-13 Thread Marc Roos
Indeed, this is what I have [@c01 ceph]# ceph --version ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable) [@c01 ceph]# ceph tell osd.* version|head osd.0: { "version": "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)" } osd.1:

[ceph-users] 答复: 答复: Where can I find the fix commit of #3370 ?

2017-11-13 Thread 周 威
Hi, Ilya The kernel version is 3.10.106. Part of dmesg related to ceph: [7349718.004905] libceph: osd297 down [7349718.005190] libceph: osd299 down [7349785.671015] libceph: osd295 down [7350006.357509] libceph: osd291 weight 0x0 (out) [7350006.357795] libceph: osd292 weight 0x0 (out)

Re: [ceph-users] No ops on some OSD

2017-11-13 Thread Caspar Smit
Weird. # ceph --version ceph version 12.2.1 (fc129ad90a65dc0b419412e77cb85ac230da42a6) luminous (stable) # ceph osd status | id | host | used | avail | wr ops | wr data | rd ops | rd data |

Re: [ceph-users] 答复: Where can I find the fix commit of #3370 ?

2017-11-13 Thread Ilya Dryomov
On Mon, Nov 13, 2017 at 10:18 AM, 周 威 wrote: > Hi, Ilya > > I'm using the CentOS 7 kernel, which should be 3.10 > I checked the patch, and it appears in my kernel source. > We got the same stack as #3370; the process is hung in sleep_on_page_killable. > The debugfs ceph/osdc file shows

[ceph-users] 答复: Where can I find the fix commit of #3370 ?

2017-11-13 Thread 周 威
Hi, Ilya I'm using the CentOS 7 kernel, which should be 3.10. I checked the patch, and it appears in my kernel source. We got the same stack as #3370; the process is hung in sleep_on_page_killable. The debugfs ceph/osdc file shows there is a read request waiting for a response, while the command `ceph
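For reference, a hedged sketch of how those in-flight requests can be inspected on the client (requires debugfs; the fsid-named directory varies per cluster):

$ mount -t debugfs none /sys/kernel/debug    # if not already mounted
$ cat /sys/kernel/debug/ceph/*/osdc          # outstanding OSD requests: tid, osd, pg, object, op
$ cat /sys/kernel/debug/ceph/*/monc          # monitor session state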

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
I have been asking myself (and here) the same question. I think it is because of having this in ceph.conf: "enable experimental unrecoverable data corrupting features = bluestore". But I am not sure if I can remove this, or whether I have to replace it with something else. ceph-12.2.1-0.el7.x86_64
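BlueStore is a stable, fully supported backend in Luminous, so (hedged) that line looks like a leftover from pre-Luminous testing and should be safe to remove from ceph.conf on every node and client; the warning should then disappear once daemons and clients are restarted. To verify the option is no longer set on a running daemon:

$ ceph daemon osd.0 config get enable_experimental_unrecoverable_data_corrupting_features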

[ceph-users] force scrubbing

2017-11-13 Thread Kenneth Waegeman
Hi all, Is there a way to force a scrub of a PG in an erasure coded pool? I tried ceph pg deep-scrub 5.4c7, but after a week it still hasn't scrubbed the PG (the last scrub timestamp has not changed). Thanks! Kenneth
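A hedged sketch of checking whether the deep scrub was ever scheduled and re-issuing it (the PG id is taken from the post; the OSD id is a placeholder for the PG's primary):

$ ceph pg 5.4c7 query | grep -E 'scrub_stamp'   # last_scrub_stamp / last_deep_scrub_stamp
$ ceph pg deep-scrub 5.4c7                      # ask the primary OSD to queue a deep scrub
$ ceph daemon osd.12 config show | grep -E 'osd_max_scrubs|osd_scrub_during_recovery'   # settings that can keep scrubs from running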

Re: [ceph-users] Where can I find the fix commit of #3370 ?

2017-11-13 Thread Ilya Dryomov
On Mon, Nov 13, 2017 at 7:45 AM, 周 威 wrote: > I met the same issue as http://tracker.ceph.com/issues/3370, > > But I can’t find the commit id of 2978257c56935878f8a756c6cb169b569e99bb91, > Can someone help me? I updated the ticket. It's very old though; which kernel are you

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Caspar Smit
Hi, Why would Ceph 12.2.1 give you this message: "2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following dangerous and experimental features are enabled: bluestore"? Or is that a leftover warning message from an old client? Kind regards, Caspar 2017-11-10 21:27 GMT+01:00 Marc Roos

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
1. I don’t think an OSD should 'crash' in such a situation. 2. How else should I 'rados put' an 8GB file?

Re: [ceph-users] Cluster hang (deep scrub bug? "waiting for scrub")

2017-11-13 Thread Matteo Dacrema
I’ve seen that only once and noticed that there’s a bug fixed in 10.2.10 (http://tracker.ceph.com/issues/20041). Yes, I use snapshots. As far as I can see, in my case the PG had been scrubbing for 20 days, but I only have 7 days of logs, so I’m not able to identify