Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Aravind Ramesh
Hi, I did a make install in my ceph build and also a make install of fio, and ensured the latest binaries were installed. Now fio is failing with the below errors for the rbd device with an EC pool as the data pool. I have shared the "rbd ls" output and my rbd.fio conf file details below. Let me know if

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Joshua Schmid
On 12/08/2016 03:09 PM, Khramchikhin Nikolay wrote: > No luck, I don't even have logs associated with ceph-disk at startup: In that case your udev rules probably didn't trigger. Have a look in the logs and see if udev invokes (/usr/sbin/ceph-disk --log-stdout -v trigger /dev/$name). You could
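For anyone following along, a minimal debugging sketch of the check Joshua describes; the partition name /dev/sdb1 and the rules file name are placeholders to adapt to your layout:

    $ ls /lib/udev/rules.d/ | grep -i ceph        # the ceph-disk udev rules (e.g. 95-ceph-osd.rules) should be installed
    $ journalctl -b | grep -i ceph-disk           # did udev invoke ceph-disk at boot?
    $ /usr/sbin/ceph-disk --log-stdout -v trigger /dev/sdb1   # run the trigger by hand and watch its output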

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Joshua Schmid
Hey Khramchikhin, On 12/08/2016 12:22 PM, Khramchikhin Nikolay wrote: > Hello folks, > > I have a problem with OSDs coming up after a server reboot; all disks were > deployed with ceph-deploy osd prepare --dmcrypt. I've tried ceph-deploy osd create > with no luck, same issue. > > After a server reboot the OSDs

Re: [ceph-users] filestore_split_multiple hardcoded maximum?

2016-12-08 Thread Frédéric Nass
David, you might also be interested in the new Jewel 10.2.4 tool called 'ceph-objectstore-tool' from Josh. It allows splitting filestore directories offline (http://tracker.ceph.com/issues/17220). Unfortunately not merging them, apparently. Regards, Frédéric. - Le 27 Sep 16, à 0:42, David
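A rough sketch of how the offline split is expected to be run, assuming the operation is the apply-layout-settings op referenced in the tracker issue; the OSD id and pool name are placeholders, and the exact op name should be confirmed with ceph-objectstore-tool --help on your build:

    $ systemctl stop ceph-osd@12                               # the OSD must be offline
    $ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
          --op apply-layout-settings --pool rbd                # re-splits directories to match the current split/merge settings
    $ systemctl start ceph-osd@12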

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Khramchikhin Nikolay
No luck, I don't even have logs associated with ceph-disk at startup: $ journalctl -u ceph- shows me only the standard ceph services, such as ceph-osd@, but after ceph-disk activate-lockbox I can get logs: journalctl -u

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Jason Dillaman
On Thu, Dec 8, 2016 at 6:53 AM, Aravind Ramesh wrote: > I did a make install in my ceph build and also did make install on the fio > and ensured the latest binaries were installed. Now, fio is failing with > below errors for the rbd device with EC pool as data pool.

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Venky Shankar
On 16-12-08 08:45:44, Jason Dillaman wrote: > On Thu, Dec 8, 2016 at 6:53 AM, Aravind Ramesh > wrote: > > I did a make install in my ceph build and also did make install on the fio > > and ensured the latest binaries were installed. Now, fio is failing with > > below

[ceph-users] CEPH failuers after 5 journals down

2016-12-08 Thread Wojciech Kobryń
Hi Ceph Users! I've got a CEPH cluster here: 6 nodes, 12 OSDs on HDD and SSD disks. All journal OSDs are on SSDs; 25 various HDDs in total. We had several HDD failures in the past, but every time it was an HDD failure and never journal related. After replacing the HDD and running the recovery procedures, all was

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Varada Kari
Fio with librbd takes the pool as an argument. The fio arguments look like below: ioengine=rbd clientname=admin pool=rbd rbdname=test If we can create the rbd image in a specific pool, we can run fio without any issues. BTW, I am using the latest fio (built from the github repo). Varada On Thursday 08
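For reference, a minimal fio job file built from the arguments Varada lists; the rw/bs/runtime values are arbitrary additions, and the setup assumes an image called 'test' already exists in the 'rbd' pool with a key for client.admin available:

    $ cat > rbd.fio <<'EOF'
    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=test
    rw=randwrite
    bs=4k
    runtime=60

    [rbd-job]
    EOF
    $ fio rbd.fio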

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Venky Shankar
On 16-12-08 09:32:02, Aravind Ramesh wrote: > You can specify the --data-pool option while creating the rbd image. > Example: > rbd create rbdimg_EC1 --size 1024 --pool replicated_pool1 --data-pool ecpool > Once the image is created, you can add the image name (rbdimg_EC1) and the > replicated pool

Re: [ceph-users] 10.2.4 Jewel released

2016-12-08 Thread Ruben Kerkhof
On Thu, Dec 8, 2016 at 1:24 AM, Gregory Farnum wrote: > Okay, we think we know what's happened; explanation at > http://tracker.ceph.com/issues/18184 and first PR at > https://github.com/ceph/ceph/pull/12372 > > If you haven't already installed the previous branch, please try

[ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Khramchikhin Nikolay
Hello folks, I have a problem with OSDs coming up after a server reboot. All disks were deployed with ceph-deploy osd prepare --dmcrypt; I've tried ceph-deploy osd create with no luck, same issue. After a server reboot the OSDs don't come up, and I need to run ceph-disk activate-lockbox /dev/sdb3 for

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Aravind Ramesh
You can specify the --data-pool option while creating the rbd image. Example: rbd create rbdimg_EC1 --size 1024 --pool replicated_pool1 --data-pool ecpool Once the image is created, you can add the image name (rbdimg_EC1) and the replicated pool name (replicated_pool1) in the fio config file and set
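A hedged sketch putting the above together; the EC pool creation line and the rbd info check are my additions rather than commands from Aravind's mail, and the pg counts are arbitrary:

    $ ceph osd pool create ecpool 64 64 erasure                        # hypothetical EC pool
    $ rbd create rbdimg_EC1 --size 1024 --pool replicated_pool1 --data-pool ecpool
    $ rbd info replicated_pool1/rbdimg_EC1                             # should report ecpool as the data pool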

Re: [ceph-users] 10.2.4 Jewel released

2016-12-08 Thread Francois Lafont
On 12/08/2016 11:24 AM, Ruben Kerkhof wrote: > I've been running this on one of my servers now for half an hour, and > it fixes the issue. It's the same for me. ;) ~$ ceph -v ceph version 10.2.4-1-g5d3c76c (5d3c76c1c6e991649f0beedb80e6823606176d9e) Thanks for the help. Bye.

Re: [ceph-users] 10.2.4 Jewel released

2016-12-08 Thread Micha Krause
Hi, If you haven't already installed the previous branch, please try wip-msgr-jewel-fix2 instead. That's a cleaner and more precise solution to the real problem. :) Any predictions when this fix will hit the Debian repositories? Micha Krause

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Aravind Ramesh
The issue was due to a version mismatch in the rbd binary that I was using. It was picking up the old rbd instead of the one I had compiled. -Original Message- From: Jason Dillaman [mailto:jdill...@redhat.com] Sent: Thursday, December 08, 2016 2:05 AM To: Aravind Ramesh
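A quick way to catch this kind of mismatch (generic shell, nothing Ceph-specific assumed):

    $ which -a rbd                        # every rbd on $PATH; the first one wins
    $ rbd --version                       # should match the build just installed
    $ ldd "$(which rbd)" | grep librbd    # confirm which librbd it is linked against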

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Nick Fisk
Fio has a direct RBD engine which uses librbd. I've just had a quick look at the code and I can't see an option in the latest fio to specify datapool, but I'm not sure if librbd handles this all behind the scenes. Might be worth a try. From: ceph-users

Re: [ceph-users] filestore_split_multiple hardcoded maximum?

2016-12-08 Thread Frédéric Nass
Hi David, I'm surprised your message didn't get any echo yet. I guess it depends on how many files your OSDs get to store on the filesystem, which depends essentially on use cases. We're having similar issues with a 144-OSD cluster running 2 pools. Each one holds 100 M objects. One is replication

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Rob Pickerill
Hi, I am working alongside Sean with this assertion issue. We see problems with the stray calls when the MDS starts; the last entry in the log before the assertion failure is a reference: try_remove_dentries_for_stray. I have provided a link for the ceph-collect logs, which were collected after

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Pierre BLONDEAU
Hello, I have a similar problem. I am not sure it's the same, but maybe I can help. I'm on jewel (upgraded from firefly) on jessie. The temporary solution that I found to start the OSDs is to force udev to launch its triggers: udevadm trigger --action=add Regards
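For completeness, the workaround plus a quick check that it actually fired; the follow-up commands are only a suggestion for confirming the result:

    $ udevadm trigger --action=add              # replay the 'add' events so the ceph udev rules run
    $ journalctl -b | grep -i ceph-disk         # the ceph-disk activation should now show up
    $ ceph osd tree                             # the OSDs should come back up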

[ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Sean Redmond
Hi, I have a CephFS cluster that is currently unable to start the mds server as it is hitting an assert, the extract from the mds log is below, any pointers are welcome: ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) 2016-12-08 14:50:18.577038 7f7d9faa3700 1 mds.0.47077

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread John Spray
On Thu, Dec 8, 2016 at 3:11 PM, Sean Redmond wrote: > Hi, > > I have a CephFS cluster that is currently unable to start the mds server as > it is hitting an assert, the extract from the mds log is below, any pointers > are welcome: > > ceph version 10.2.3

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Khramchikhin Nikolay
Udev doesn't trigger at start, but I have the correct rules in the udev rules file. OS: Debian jessie; Ceph: jewel. Hello, I have a similar problem. I am not sure it's the same, but maybe I can help. I'm on jewel (upgraded from firefly) on jessie. The temporary solution that I

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Sean Redmond
Hi, We had no changes going on with the ceph pools or ceph servers at the time. We have, however, been hitting this in the last week and it may be related: http://tracker.ceph.com/issues/17177 Thanks On Thu, Dec 8, 2016 at 3:34 PM, John Spray wrote: > On Thu, Dec 8, 2016 at

Re: [ceph-users] problem after reinstalling system

2016-12-08 Thread Jake Young
Hey Dan, I had the same issue that Jacek had after changing my OS and Ceph version from Ubuntu 14 - Hammer to Centos 7 - Jewel. I was also able to recover from the failure by renaming the .ldb files to .sst files. Do you know why this works? Is it just because leveldb changed the file naming
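For readers hitting the same thing, a heavily hedged sketch of the rename Jake describes; the OSD id is a placeholder, the current/omap location assumes a filestore OSD, and whether the rename is needed at all depends on the leveldb version in play, so back the directory up first:

    $ systemctl stop ceph-osd@12
    $ cp -a /var/lib/ceph/osd/ceph-12/current/omap /root/omap-backup     # safety copy before touching anything
    $ cd /var/lib/ceph/osd/ceph-12/current/omap
    $ for f in *.ldb; do mv "$f" "${f%.ldb}.sst"; done                   # rename .ldb files to the older .sst naming
    $ systemctl start ceph-osd@12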

Re: [ceph-users] RBD: Failed to map rbd device with data pool enabled.

2016-12-08 Thread Aravind Ramesh
Thanks Jason, Yes, after enabling the flag, it is able to run IOs on it. Aravind -Original Message- From: Jason Dillaman [mailto:jdill...@redhat.com] Sent: Thursday, December 08, 2016 7:16 PM To: Aravind Ramesh Cc: Venky Shankar ;

Re: [ceph-users] dmcrypt osd startup problem

2016-12-08 Thread Joshua Schmid
[..] > > > Yep, it's the same problem. It looks like a bug. A manual udevadm trigger > worked fine. Good, seems like at least the ceph part is functioning correctly. > [..]

Re: [ceph-users] filestore_split_multiple hardcoded maximum?

2016-12-08 Thread Mark Nelson
I don't want to retype it all, but you guys might be interested in the discussion under section 3 of this post here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/012987.html basically the gist of it is: 1) Make sure SELinux isn't doing security xattr lookups for
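On point 1, a quick way to check where SELinux stands on a node; note that, as I understand it, permissive mode may still incur the xattr lookups, so fully disabling it in /etc/selinux/config and rebooting is what actually avoids them (verify against the linked thread):

    $ getenforce                                   # Enforcing / Permissive / Disabled
    $ sestatus                                     # more detail, including the loaded policy
    $ grep '^SELINUX=' /etc/selinux/config         # set to 'disabled' (and reboot) to skip the lookups entirely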

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread John Spray
On Thu, Dec 8, 2016 at 3:45 PM, Sean Redmond wrote: > Hi, > > We had no changes going on with the ceph pools or ceph servers at the time. > > We have however been hitting this in the last week and it maybe related: > > http://tracker.ceph.com/issues/17177 Oh, okay -- so

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Sean Redmond
Hi John, Thanks for your pointers. I have extracted the omap_keys and omap_values for an object I found in the metadata pool called '600.' and dropped them at the below location: https://www.dropbox.com/sh/wg6irrjg7kie95p/AABk38IB4PXsn2yINpNa9Js5a?dl=0 Could you explain how is it
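For anyone wanting to pull the same data, a sketch of how omap keys and values can be extracted with rados; the pool name and the full object name below are placeholders, not the exact ones from Sean's cluster:

    $ rados -p cephfs_metadata ls | grep '^600\.'        # locate the object (pool name is a placeholder)
    $ OBJ=600.00000000                                   # placeholder object name -- use the one reported above
    $ rados -p cephfs_metadata listomapkeys "$OBJ"
    $ rados -p cephfs_metadata listomapvals "$OBJ"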

[ceph-users] documentation: osd crash tunables optimal and "some data movement"

2016-12-08 Thread Peter Gervai
"Hello List, This could be transformed a bugreport if anyone feels like, I just kind of share my harsh experience of today. We had a few OSDs getting full while others being below 40%; while these were properly weighted (full ones of 800GB being 0.800 and fairly empty ones being 2.7TB and 2.700

Re: [ceph-users] documentation: osd crash tunables optimal and "some data movement"

2016-12-08 Thread David Welch
I've seen this before and would recommend upgrading from Hammer. On 12/08/2016 04:26 PM, Peter Gervai wrote: Hello List, This could be transformed into a bug report if anyone feels like it; I'm just sharing my harsh experience of today. We had a few OSDs getting full while others were below

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Rob Pickerill
Hi John / All Thank you for the help so far. To add a further point to Sean's previous email, I see this log entry before the assertion failure: -6> 2016-12-08 15:47:08.483700 7fb133dca700 12 mds.0.cache.dir(1000a453344) remove_dentry [dentry #100/stray9/1000a453344/config [2,head] auth

Re: [ceph-users] CephFS FAILED assert(dn->get_linkage()->is_null())

2016-12-08 Thread Goncalo Borges
Hi John. I have been hitting that issue also, although I have not seen any asserts in my mds yet. Could you please clarify a bit further your proposal about manually removing omap info from strays? Should it be applied: - to the problematic replicas of the stray object which triggered the

Re: [ceph-users] node and its OSDs down...

2016-12-08 Thread M Ranga Swami Reddy
No noout or other flags are set... How do we confirm if the down OSD is out of the cluster? Thanks Swami On Fri, Dec 9, 2016 at 11:18 AM, Brad Hubbard wrote: > > > On Fri, Dec 9, 2016 at 3:28 PM, M Ranga Swami Reddy > wrote: >> Confused... >> a

Re: [ceph-users] node and its OSDs down...

2016-12-08 Thread M Ranga Swami Reddy
Confused... a few OSDs went down and the cluster did the recovery and rebalanced to a HEALTH_OK state. Now I can see that the down OSDs are in the down state in the crushmap and are not part of the OSD up or in set. After 5 days or so, it is still the same state. How or when will Ceph move the down-state OSDs to the out state?
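As a side note, a few commands that show whether a down OSD has actually been marked out, plus the mon setting that normally controls the automatic down-to-out transition; the OSD id and mon id are placeholders, and whether the interval applies in this particular case depends on the cluster's flags and configuration:

    $ ceph osd tree                                             # UP/DOWN per OSD; the REWEIGHT column drops to 0 once it is out
    $ ceph osd dump | grep '^osd\.12 '                          # placeholder id; look for 'down in' vs 'down out'
    $ ceph daemon mon.a config get mon_osd_down_out_interval    # run on the mon host; mon.a is a placeholder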

Re: [ceph-users] node and its OSDs down...

2016-12-08 Thread Brad Hubbard
On Fri, Dec 9, 2016 at 3:28 PM, M Ranga Swami Reddy wrote: > Confused... > a few OSDs went down and the cluster did the recovery and rebalanced to a HEALTH_OK > state. > Now I can see that the down OSDs are in the down state in the crushmap and are not > part of the OSD up or in set. > After 5

Re: [ceph-users] CEPH failuers after 5 journals down

2016-12-08 Thread Krzysztof Nowicki
Hi, We have managed to work around this problem. It seems that one of the objects lost due to the SSD failure was part of the hitset for the cache tier. The OSD was dying after an attempt to open and read the object's data file. In an effort to progress I have found out which file is missing by

[ceph-users] jewel/ceph-osd/filestore: Moving omap to separate filesystem/device

2016-12-08 Thread Kjetil Jørgensen
Hi, so we're considering moving omap out to faster media than our rather slow spinning rust. There's been some discussion around this here: https://github.com/ceph/ceph/pull/6421 Since this hasn't landed in jewel, nor have the ceph-disk convenience bits, we're thinking of "other ways" of doing
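Not speaking for what the thread settles on, but one approach that gets discussed for this is stopping the OSD, relocating the omap directory, and symlinking it back; everything below is an assumption (OSD id, mount point, and the current/omap location for filestore), so treat it as a sketch only:

    $ systemctl stop ceph-osd@12
    $ mv /var/lib/ceph/osd/ceph-12/current/omap /mnt/ssd/ceph-12-omap       # /mnt/ssd is the faster device (hypothetical)
    $ ln -s /mnt/ssd/ceph-12-omap /var/lib/ceph/osd/ceph-12/current/omap
    $ chown -h ceph:ceph /var/lib/ceph/osd/ceph-12/current/omap
    $ systemctl start ceph-osd@12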

Re: [ceph-users] Parallel reads with CephFS

2016-12-08 Thread Andreas Gerstmayr
Thanks for your response! 2016-12-09 1:27 GMT+01:00 Gregory Farnum : > On Wed, Dec 7, 2016 at 5:45 PM, Andreas Gerstmayr > wrote: >> Hi, >> >> does the CephFS kernel module (as of kernel version 4.8.8) support >> parallel reads of file stripes? >>

Re: [ceph-users] Parallel reads with CephFS

2016-12-08 Thread Gregory Farnum
On Wed, Dec 7, 2016 at 5:45 PM, Andreas Gerstmayr wrote: > Hi, > > does the CephFS kernel module (as of kernel version 4.8.8) support > parallel reads of file stripes? > When an application requests a 500MB block from a file (which is > split into multiple objects