[ceph-users] PG:: recovery optimazation: recovery what is really modified by mslovy ・ Pull Request #3837 ・ ceph/ceph ・ GitHub

2017-07-27 Thread donglifec...@gmail.com
yaoning, haomai, Json, what about the "recovery what is really modified" feature? I didn't see any update on GitHub recently; will it be developed further? https://github.com/ceph/ceph/pull/3837 (PG:: recovery optimazation: recovery what is really modified) Thanks a lot.

Re: [ceph-users] XFS attempt to access beyond end of device

2017-07-27 Thread Brad Hubbard
An update on this. The "attempt to access beyond end of device" messages are caused by a kernel bug which is rectified by the following patches: - 59d43914ed7b9625 (vfs: make guard_bh_eod() more generic) - 4db96b71e3caea (vfs: guard end of device for mpage interface) An upgraded Red Hat

Re: [ceph-users] Networking/naming doubt

2017-07-27 Thread David Turner
The only daemons that are supposed to use the cluster network are the OSDs. Not even the MONs access the cluster network. I am sure that if you have a need to make this work you can find a way, but I don't know that one exists in the standard tool set. You might try temporarily setting the

Re: [ceph-users] High iowait on OSD node

2017-07-27 Thread Anthony D'Atri
My first suspicion would be the HBA. Are you using a RAID HBA? If so, I suggest checking the status of your BBU/FBWC and your cache policy.
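For anyone following Anthony's suggestion, a hedged sketch of how one might check BBU status and cache policy on an LSI/Avago MegaRAID controller; the binary name (MegaCli, MegaCli64, or storcli) and exact flags vary by package and controller, so treat these invocations as illustrative rather than authoritative:

    # Battery/flash-backed write cache status (LSI MegaRAID; binary name
    # is an assumption, it may be MegaCli, MegaCli64, or storcli on your box)
    MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL
    # Current cache policy of all logical drives on all adapters
    MegaCli64 -LDGetProp -Cache -LAll -aAll

A degraded BBU often forces the controller from write-back to write-through, which would match a sudden jump in await.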

Re: [ceph-users] Networking/naming doubt

2017-07-27 Thread Oscar Segarra
Sorry! I'd like to add that I want to use the cluster network for both purposes: ceph-deploy --username vdicceph new vdicnode01 --cluster-network 192.168.100.0/24 --public-network 192.168.100.0/24 Thanks a lot. 2017-07-28 0:29 GMT+02:00 Oscar Segarra : > Hi, > > Do you
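For reference, the ceph-deploy invocation above should land in the generated ceph.conf roughly as follows (a minimal sketch based on the subnets Oscar quotes; the option names are the standard ones, the rest of the file is elided):

    [global]
    public network = 192.168.100.0/24
    cluster network = 192.168.100.0/24

With both options pointing at the same subnet, all traffic effectively shares one network, which is consistent with the replies elsewhere in this thread.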

Re: [ceph-users] how to troubleshoot "heartbeat_check: no reply" in OSD log

2017-07-27 Thread Brad Hubbard
On Fri, Jul 28, 2017 at 6:06 AM, Jared Watts wrote: > I’ve got a cluster where a bunch of OSDs are down/out (only 6/21 are up/in). > ceph status and ceph osd tree output can be found at: > > https://gist.github.com/jbw976/24895f5c35ef0557421124f4b26f6a12 > > > > In osd.4

Re: [ceph-users] Networking/naming doubt

2017-07-27 Thread Roger Brown
I could be wrong, but I think you cannot achieve this objective. If you declare a cluster network, OSDs will route heartbeat, object replication, and recovery traffic over the cluster network. We prefer that the cluster network NOT be reachable from the public network or the Internet, for added
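The more conventional layout Roger describes, with the cluster network isolated, would look something like this in ceph.conf (a sketch using the two subnets from Oscar's setup):

    [global]
    public network  = 192.168.2.0/24     # clients, MONs, DNS-resolvable names
    cluster network = 192.168.100.0/24   # OSD replication/heartbeat only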

Re: [ceph-users] Error in boot.log - Failed to start Ceph disk activation - Luminous

2017-07-27 Thread Oscar Segarra
Hi Roger, Thanks a lot, I will try your workaround. I have opened a bug so that the devs can review it as soon as they have availability. http://tracker.ceph.com/issues/20807 2017-07-27 23:39 GMT+02:00 Roger Brown : > I had the same issue on Luminous and worked around it by

[ceph-users] Networking/naming doubt

2017-07-27 Thread Oscar Segarra
Hi, In my environment I have 3 hosts, and every host has 2 network interfaces: public: 192.168.2.0/24, cluster: 192.168.100.0/24. The hostnames "vdicnode01", "vdicnode02" and "vdicnode03" are resolved by public DNS through the public interface, which means that "ping vdicnode01" will resolve
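To make those names resolve on the public side without DNS, an /etc/hosts sketch; the host octets below are hypothetical, only the subnets are given in the post:

    # public-network addresses (192.168.2.0/24); last octets are assumptions
    192.168.2.11 vdicnode01
    192.168.2.12 vdicnode02
    192.168.2.13 vdicnode03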

Re: [ceph-users] Error in boot.log - Failed to start Ceph disk activation - Luminous

2017-07-27 Thread Roger Brown
I had the same issue on Luminous and worked around it by disabling ceph-disk. The OSDs can start without it. On Thu, Jul 27, 2017 at 3:36 PM Oscar Segarra wrote: > Hi, > > First of all, my version: > > [root@vdicnode01 ~]# ceph -v > ceph version 12.1.1
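One plausible way to apply Roger's workaround on a systemd host, hedged since the exact unit name depends on the device that failed to activate (this assumes the ceph-disk@ instance systemd reports for /dev/sdb2):

    # Identify the failed activation unit, then mask it so boot no longer
    # runs it; the ceph-osd@N unit still starts the OSD daemon itself
    systemctl list-units --failed | grep ceph-disk
    systemctl mask 'ceph-disk@dev-sdb2.service'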

[ceph-users] Error in boot.log - Failed to start Ceph disk activation - Luminous

2017-07-27 Thread Oscar Segarra
Hi, First of all, my version: [root@vdicnode01 ~]# ceph -v ceph version 12.1.1 (f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc) When I boot my ceph node (I have an all-in-one) I get the following message in boot.log: [FAILED] Failed to start Ceph disk activation: /dev/sdb2. See
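A hedged first step for digging into a failure like this, assuming standard systemd tooling and the unit name implied by the boot.log message:

    # Inspect why the disk-activation unit failed and read its full log
    systemctl status 'ceph-disk@dev-sdb2.service'
    journalctl -u 'ceph-disk@dev-sdb2.service' --no-pager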

[ceph-users] how to troubleshoot "heartbeat_check: no reply" in OSD log

2017-07-27 Thread Jared Watts
I’ve got a cluster where a bunch of OSDs are down/out (only 6/21 are up/in). ceph status and ceph osd tree output can be found at: https://gist.github.com/jbw976/24895f5c35ef0557421124f4b26f6a12 In osd.4 log, I see many of these: 2017-07-27 19:38:53.468852 7f3855c1c700 -1 osd.4 120
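When OSDs log "heartbeat_check: no reply", a common first check is raw reachability between the OSD hosts on the heartbeat ports. A sketch; the peer address and port below are placeholders, the real ones can be read out of the log line or ceph osd dump:

    # See the per-OSD addresses and up/down state as the cluster records them
    ceph osd dump | grep '^osd\.'
    # Probe one of the reported heartbeat ports from the affected host
    # (address and port are placeholders for the values in your osd dump)
    nc -vz 192.168.100.12 6805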

Re: [ceph-users] Client behavior when OSD is unreachable

2017-07-27 Thread David Turner
The clients receive up-to-date versions of the OSD map, which includes which OSDs are down. So yes, when an OSD is marked down in the cluster, the clients know about it. If an OSD is unreachable but isn't marked down in the cluster, the result is blocked requests. On Thu, Jul 27, 2017, 1:21 PM
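A quick way to observe what David describes, sketched with standard ceph CLI commands: the osdmap epoch and per-OSD state are visible in the dump, and stuck requests surface in health detail:

    # Current osdmap epoch and header
    ceph osd dump | head -n 10
    # Which OSDs the map currently marks down
    ceph osd tree | grep down
    # Blocked/slow requests behind an unreachable-but-not-down OSD
    ceph health detail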

[ceph-users] Client behavior when OSD is unreachable

2017-07-27 Thread Daniel K
Does the client track which OSDs are reachable? How does it behave if some are not reachable? For example: cluster network with all OSD hosts on one switch; public network with OSD hosts split between two switches; failure domain is the switch. copies=3, so with a failure of the public switch, 1 copy

Re: [ceph-users] High iowait on OSD node

2017-07-27 Thread Peter Maloney
I'm using bcache for all 12 HDDs on the 2 SSDs (starting around the middle of December; before that, see the much higher await), and NVMe for the journals. (And some months ago I changed all the 2TB disks to 6TB and added ceph4,5.) Here's my iostat in ganglia: just raw per-disk await
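For anyone reproducing Peter's layout, a hedged sketch of creating a bcache-backed OSD disk; the device names are placeholders, and -C/-B are bcache's standard cache/backing designators:

    # Format one SSD partition as a cache device and one HDD as its backing
    # device; creating both in one command attaches them automatically
    make-bcache -C /dev/sda -B /dev/sdb
    # Once udev creates /dev/bcache0, it is used like any other disk
    lsblk /dev/bcache0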

Re: [ceph-users] Fwd: [lca-announce] Call for Proposals for linux.conf.au 2018 in Sydney are open!

2017-07-27 Thread Tim Serong
On 07/03/2017 02:36 PM, Tim Serong wrote: > It's that time of year again, folks! Please everyone go submit talks, > or at least plan to attend this most excellent of F/OSS conferences. CFP closes in a bit over a week (August 6). Get into it if you haven't already :-) > (I thought I might put in

[ceph-users] Ceph Developers Monthly - August

2017-07-27 Thread Leonardo Vaz
Hey Cephers, This is just a friendly reminder that the next Ceph Developer Monthly meeting is coming up: https://wiki.ceph.com/Planning If you have work you're doing that is a feature, significant backports, or anything you would like to discuss with the core team, please add it to

[ceph-users] High iowait on OSD node

2017-07-27 Thread John Petrini
Hello list, Just curious if anyone has ever seen this behavior and might have some ideas on how to troubleshoot it. We're seeing very high iowait in iostat across all OSDs on a single OSD host. It's very spiky, dropping to zero and then shooting up to as high as 400 in some cases. Despite
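To watch for the spikes John describes, the usual view is extended iostat from sysstat; a minimal sketch:

    # Extended device stats every 5 s with timestamps; watch %iowait on the
    # CPU line and the per-disk await/%util columns for the spiky device
    iostat -xmt 5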

Re: [ceph-users] Ceph object recovery

2017-07-27 Thread Daniel K
So I'm not sure if this was the best or right way to do this, but: using rados, I confirmed the unfound object was in the cephfs_data pool: # rados -p cephfs_data ls | grep 001c0ed4 Using the osdmaptool, I found the pg/osd the unfound object was in: # osdmaptool --test-map-object
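For completeness, a sketch of the full sequence Daniel describes; the pool id and object name below are illustrative, while --test-map-object and --pool are real osdmaptool options:

    # Grab the current osdmap, then ask which PG/OSDs the object maps to
    ceph osd getmap -o /tmp/osdmap
    osdmaptool /tmp/osdmap --test-map-object 10000001c0ed4.00000000 --pool 1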