Re: [ceph-users] Potential OSD deadlock?

2015-10-09 Thread Jan Schermer
Are there any errors on the NICs? (ethtool -S ethX) Also take a look at the switch and look for flow control statistics - do you have flow control enabled or disabled? We had to disable flow control as it would pause all IO on the port whenever any path got congested which you don't want to

Re: wip-addr

2015-10-09 Thread Sage Weil
Hey Marcus, On Fri, 2 Oct 2015, Marcus Watts wrote: > wip-addr > > 1. where is it? > 2. current state > 3. more info > 4. cheap fixes > 5. in case you were wondering why? > > 1. where is it? > > I've just pushed another update to wip-addr: > > g...@github.com:linuxbox2/linuxbox-ceph.git

After a reboot, OSD cannot come up because of leveldb corruption

2015-10-09 Thread lin zhou 周林
Hi guys, the mon and OSDs on one node of our Ceph cluster cannot come up now because of leveldb. Ceph 0.80.7, Ubuntu 12.04. The OSD log is: -- 2015-10-10 11:12:58.896724 7f4cfcf9d7c0 -1 ** ERROR: error converting store

Re: wip-addr

2015-10-09 Thread Haomai Wang
resend to ML On Sat, Oct 10, 2015 at 11:20 AM, Haomai Wang wrote: > > > On Sat, Oct 10, 2015 at 5:49 AM, Sage Weil wrote: >> >> Hey Marcus, >> >> On Fri, 2 Oct 2015, Marcus Watts wrote: >> > wip-addr >> > >> > 1. where is it? >> > 2. current state >> >

How to reduce the influence on the IO when an osd is marked out?

2015-10-09 Thread wangsongbo
Hi all, when an OSD is marked out, the related IO is blocked, in which case applications built on Ceph will fail. According to our test results, the larger the data, the longer it takes. How can we reduce the impact of this process on IO? Thanks and Regards, WangSongbo

Re: [ceph-users] Potential OSD deadlock?

2015-10-09 Thread Max A. Krasilnikov
Hello! On Fri, Oct 09, 2015 at 11:05:59AM +0200, jan wrote: > Are there any errors on the NICs? (ethtool -S ethX) No errors. Neither on nodes, nor on switches. > Also take a look at the switch and look for flow control statistics - do you > have flow control enabled or disabled? flow control

Re: [ceph-users] Potential OSD deadlock?

2015-10-09 Thread Jan Schermer
Have you tried running iperf between the nodes? Capturing a pcap of the (failing) Ceph comms from both sides could help narrow it down. Is there any SDN layer involved that could add overhead/padding to the frames? What about some intermediate MTU like 8000 - does that work? Oh and if there's

Re: [ceph-users] Potential OSD deadlock?

2015-10-09 Thread Max A. Krasilnikov
Hello! On Fri, Oct 09, 2015 at 01:45:42PM +0200, jan wrote: > Have you tried running iperf between the nodes? Capturing a pcap of the > (failing) Ceph comms from both sides could help narrow it down. > Is there any SDN layer involved that could add overhead/padding to the frames? No

Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-09 Thread Milosz Tanski
On Thu, Oct 8, 2015 at 4:11 AM, Paweł Sadowski wrote: > > On 10/07/2015 10:52 PM, Sage Weil wrote: > > On Wed, 7 Oct 2015, David Zafman wrote: > >> There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after > >> deep-scrub reads for objects not recently accessed by

[PATCH] ceph/osd_client: add support for CEPH_OSD_OP_GETXATTR

2015-10-09 Thread David Disseldorp
Allows for xattr retrieval. Response data buffer allocation is the responsibility of the osd_req_op_xattr_init() caller. Signed-off-by: David Disseldorp --- include/linux/ceph/osd_client.h | 8 +- net/ceph/osd_client.c | 55 -

[PATCH 0/1] ceph/osd_client: GETXATTR support

2015-10-09 Thread David Disseldorp
The following patch adds support for xattr retrieval via CEPH_OSD_OP_GETXATTR. This allows for future RBD Persistent Reservation support to be implemented with state retained in an xattr on the RBD header object. RBD get/set/cmpset xattr functionality can be tested using the debug DEVICE_ATTR