Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread Sage Weil
On Wed, 7 Oct 2015, David Zafman wrote: > > There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after > deep-scrub reads for objects not recently accessed by clients. Yeah, it's the 'except for stuff already in cache' part that we don't do (and the kernel doesn't give us a good
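
For reference, the fadvise pattern under discussion looks roughly like the sketch below. It is a minimal illustration of a plain POSIX read path, not Ceph's scrub code; note that POSIX_FADV_DONTNEED drops cached pages for the given range whether or not a client was using them, which is exactly the "stuff already in cache" caveat mentioned above.

#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Read a file sequentially (as a scrub would), then hint the kernel
 * that its pages are no longer needed so they can be dropped from the
 * page cache instead of displacing hotter client data. */
static int scrub_style_read(const char *path)
{
    char buf[1 << 16];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while (read(fd, buf, sizeof(buf)) > 0)
        ;  /* a real scrubber would checksum the data here */

    /* offset 0, length 0 means "the whole file" */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));

    return close(fd);
}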

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Christian Balzer
Hello, On Wed, 07 Oct 2015 07:34:16 +0200 Loic Dachary wrote: > Hi Christian, > > Interesting use case :-) How many OSDs / hosts do you have ? And how are > they connected together ? > If you look far back in the archives you'd find that design. And of course there will be a lot of "I told

Re: [ceph-users] pgs stuck inactive and unclean, too few PGs per OSD

2015-10-07 Thread Christian Balzer
Hello, On Thu, 8 Oct 2015 11:27:46 +0800 (CST) wikison wrote: > Hi, > I've removed the rbd pool and created it again. It picked up my > default settings but there are still some problems. After running "sudo > ceph -s", the output is as follows: > cluster

Re: [ceph-users] CephFS "corruption" -- Nulled bytes

2015-10-07 Thread Adam Tygart
Does this patch fix files that have been corrupted in this manner? If not, or I guess even if it does, is there a way to walk the metadata and data pools and find objects that are affected? Is that '_' xattr in hammer? If so, how can I access it? Doing a listxattr on the inode just lists
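
For anyone following along, the listxattr/getxattr mechanics referred to above look roughly like the sketch below. This is a generic dump of a file's extended attributes; the object-file location under /var/lib/ceph/osd/ and the exact names FileStore uses for the object-info ("_") attribute are assumptions for illustration, not taken from this thread.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* List every extended attribute of a file and print its name and value
 * size.  On an OSD the object files typically live under
 * /var/lib/ceph/osd/ceph-N/current/ (assumption for illustration). */
static int dump_xattrs(const char *path)
{
    char names[64 * 1024];
    ssize_t n = listxattr(path, names, sizeof(names));
    if (n < 0) {
        perror("listxattr");
        return -1;
    }
    /* the buffer holds NUL-separated attribute names */
    for (const char *p = names; p < names + n; p += strlen(p) + 1) {
        ssize_t vlen = getxattr(path, p, NULL, 0);  /* size query only */
        printf("%s (%zd bytes)\n", p, vlen);
    }
    return 0;
}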

Re: [ceph-users] pgs stuck inactive and unclean, too few PGs per OSD

2015-10-07 Thread wikison
Here, like this: esta@monitorOne:~$ sudo ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -3 4.39996 root defualt -2 1.0 host storageTwo 0 0.0 osd.0 up 1.0 1.0 1 1.0 osd.1 up 1.0

Re: [ceph-users] CephFS "corruption" -- Nulled bytes

2015-10-07 Thread Sage Weil
On Wed, 7 Oct 2015, Adam Tygart wrote: > Does this patch fix files that have been corrupted in this manner? Nope, it'll only prevent it from happening to new files (that haven't yet been migrated between the cache and base tier). > If not, or I guess even if it does, is there a way to walk the

Re: [ceph-users] pgs stuck inactive and unclean, too few PGs per OSD

2015-10-07 Thread Christian Balzer
Hello, On Thu, 8 Oct 2015 12:21:40 +0800 (CST) wikison wrote: > Here, like this: > esta@monitorOne:~$ sudo ceph osd tree > ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY > -3 4.39996 root defualt That's your problem. It should be "default". Your

Re: [ceph-users] pgs stuck inactive and unclean, too few PGs per OSD

2015-10-07 Thread wikison
Hi, I've removed the rbd pool and created it again. It picked up my default settings but there are still some problems. After running "sudo ceph -s", the output is as follows: cluster 0b9b05db-98fe-49e6-b12b-1cce0645c015 health HEALTH_WARN 512 pgs stuck

[ceph-users] ceph osd start failed

2015-10-07 Thread Fulin Sun
The failure message looks like the following. What would be the root cause? === osd.0 === failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.0 --keyring=/var/lib/ceph/osd/ceph-0/keyring osd crush create-or-move -- 0 1.74 host=certstor-18 root=default' === osd.1 === failed:

Re: [ceph-users] leveldb compaction error

2015-10-07 Thread Narendra Trivedi (natrived)
Hi Selcuk, Which version of Ceph did you upgrade to Hammer (0.94) from? --Narendra From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Selcuk TUNC Sent: Thursday, September 17, 2015 12:41 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] leveldb compaction error

Re: [ceph-users] pgs stuck inactive and unclean, too few PGs per OSD

2015-10-07 Thread Chris Jones
One possibility is that the crush map is not being created. Look at your /etc/ceph/ceph.conf file and see if you have something under the OSD section (it could actually be in global too) that looks like the following: osd crush update on start = false If that line is there and if you're not

Re: [ceph-users] Potential OSD deadlock?

2015-10-07 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 We forgot to upload the ceph.log yesterday. It is there now. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Oct 6, 2015 at 5:40 PM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED

Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread David Zafman
There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after deep-scrub reads for objects not recently accessed by clients. I see the NewStore objectstore sometimes using the O_DIRECT flag for writes. This concerns me because the open(2) man page says: "Applications should avoid
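
To make the O_DIRECT constraints concrete, a direct write on Linux roughly has to look like the sketch below. This is not NewStore's actual write path, just a minimal illustration; the 4 KiB figure is an assumption, since the real requirement is alignment to the device/filesystem logical block size.

#define _GNU_SOURCE  /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write a buffer with O_DIRECT: the buffer address, the file offset and
 * the I/O length all have to be block-aligned, so the payload is copied
 * into an aligned, padded bounce buffer first. */
static int direct_write(const char *path, const void *data, size_t len)
{
    const size_t align = 4096;                         /* assumed block size */
    size_t padded = (len + align - 1) / align * align;
    void *buf = NULL;
    int rc = -1;

    if (posix_memalign(&buf, align, padded) != 0)
        return -1;
    memset(buf, 0, padded);
    memcpy(buf, data, len);

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd >= 0) {
        if (write(fd, buf, padded) == (ssize_t)padded)
            rc = 0;
        close(fd);
    }
    free(buf);
    return rc;
}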

Re: [ceph-users] Q on the heterogeneity

2015-10-07 Thread John Spray
Pretty unusual but not necessarily a problem -- if anything didn't work I think we'd consider it a bug. Discussions about getting the uids/gids to match on OSD filesystems when running as non-root were the last time I heard people chatting about this (and even then, it only matters if you move

Re: [ceph-users] Q on the heterogeneity

2015-10-07 Thread Jan Schermer
We recently mixed CentOS/Ubuntu OSDs and ran into some issues, but I don't think those have anything to do with the distros but more likely with the fact that we ran -grsec kernel there. YMMV. I don't think there's a reason it shouldn't work. It will be much harder to debug and tune, though if

[ceph-users] Q on the heterogeneity

2015-10-07 Thread Andrey Shevel
Hello everybody, we are discussing an experimental installation of a Ceph cluster under Scientific Linux 7.x (or CentOS 7.x). I wonder if it is possible to use different operating platforms (e.g. CentOS, Ubuntu) within one Ceph cluster? Does anybody have information/experience on the topic? Many

[ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread Paweł Sadowski
Hi, Can anyone tell if deep scrub is done using the O_DIRECT flag or not? I'm not able to verify that in the source code. If not, would it be possible to add such a feature (maybe a config option) to help keep the Linux page cache in better shape? Thanks, -- PS

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Udo Lembke
Hi Christian, On 07.10.2015 09:04, Christian Balzer wrote: > > ... > > My main suspect for the excessive slowness is actually the Toshiba DT > type drives used. > We only found out after deployment that these can go into a zombie mode > (20% of their usual performance for ~8 hours if not

Re: [ceph-users] Q on the heterogeneity

2015-10-07 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 We just transitioned a pre-production cluster from CentOS7 to Debian Jessie without any issues. I rebooted half of the nodes (6 nodes) one day, let it sit for 24 hours and then rebooted the other half the next day. This was running 0.94.3. I'll be

Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread Sage Weil
It's not, but it would not be hard to do this. There are fadvise-style hints being passed down that could trigger O_DIRECT reads in this case. That may not be the best choice, though--it won't use data that happens to be in cache and it'll also throw it out. On Wed, 7 Oct 2015, Paweł

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Christian Balzer
Hello Udo, On Wed, 07 Oct 2015 11:40:11 +0200 Udo Lembke wrote: > Hi Christian, > > On 07.10.2015 09:04, Christian Balzer wrote: > > > > ... > > > > My main suspect for the excessive slowness is actually the Toshiba DT > > type drives used. > > We only found out after deployment that these

Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread Milosz Tanski
On Wed, Oct 7, 2015 at 10:50 AM, Sage Weil wrote: > It's not, but it would not be hard to do this. There are fadvise-style > hints being passed down that could trigger O_DIRECT reads in this case. > That may not be the best choice, though--it won't use data that happens > to

Re: [ceph-users] avoid 3-mds fs laggy on 1 rejoin?

2015-10-07 Thread Dzianis Kahanovich
John Spray writes: [...] Here is part of the log for the restarted mds at debug 7 (without standby-replay, but IMHO it doesn't matter): (PS How [un]safe are multiple MDSs in current hammer? For now I am trying to work temporarily with "set_max_mds 3", but shutting down 1 mds is still too laggy for the related clients. And