Re: Reply: Reboot blocked when undoing unmap op.

2016-01-04 Thread Ilya Dryomov
On Mon, Jan 4, 2016 at 10:51 AM, Wukongming wrote: > Hi, Ilya, > > It is an old problem. > When you say "when you issue a reboot, daemons get killed and the kernel > client ends up waiting for them to come back, because of outstanding > writes issued by umount called by

Re: puzzling disappearance of /dev/sdc1

2015-12-18 Thread Ilya Dryomov
On Fri, Dec 18, 2015 at 1:38 PM, Loic Dachary wrote: > Hi Ilya, > > It turns out that sgdisk 0.8.6 -i 2 /dev/vdb removes partitions and re-adds > them on CentOS 7 with a 3.10.0-229.11.1.el7 kernel, in the same way partprobe > does. It is used intensively by ceph-disk and

Re: puzzling disappearance of /dev/sdc1

2015-12-17 Thread Ilya Dryomov
On Thu, Dec 17, 2015 at 3:10 PM, Loic Dachary wrote: > Hi Sage, > > On 17/12/2015 14:31, Sage Weil wrote: >> On Thu, 17 Dec 2015, Loic Dachary wrote: >>> Hi Ilya, >>> >>> This is another puzzling behavior (the log of all commands is at >>>

Re: understanding partprobe failure

2015-12-17 Thread Ilya Dryomov
On Thu, Dec 17, 2015 at 1:19 PM, Loic Dachary wrote: > Hi Ilya, > > I'm seeing a partprobe failure right after a disk was zapped with sgdisk > --clear --mbrtogpt -- /dev/vdb: > > partprobe /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been > written, but we have

Re: Rbd map failure in 3.16.0-55

2015-12-12 Thread Ilya Dryomov
On Sat, Dec 12, 2015 at 7:56 AM, Varada Kari wrote: > Hi all, > > We are working on jewel branch on a test cluster to validate some of the > fixes. But landed up in the following error when mapping an image using krbd > on Ubuntu 14.04.2 with 3.16.0-55 kernel version.

Re: Rbd map failure in 3.16.0-55

2015-12-12 Thread Ilya Dryomov
On Sat, Dec 12, 2015 at 6:42 PM, Somnath Roy wrote: > Ilya, > If we map with 'nocrc' would that help ? No, it will disable data crcs, header and middle crcs will still be checked. The header/data separation in userspace is fairly new, if that's something you care about,

Re: CEPH_MAX_OID_NAME_LEN in rbd kernel driver

2015-12-11 Thread Ilya Dryomov
On Fri, Dec 11, 2015 at 3:48 PM, Jean-Tiare Le Bigot wrote: > Hi, > > I hit a use case where rbd was failing to map an image because of the > name length. > > dmesg output: > > WARNING: CPU: 0 PID: 20851 at include/linux/ceph/osdmap.h:97 >

Re: Kernel RBD hang on OSD Failure

2015-12-11 Thread Ilya Dryomov
On Fri, Dec 11, 2015 at 1:37 AM, Matt Conner wrote: > Hi Ilya, > > I had already recovered but I managed to recreate the problem again. I ran How did you recover? > the commands against rbd_data.f54f9422698a8. which was one > of those listed in osdc

Re: Kernel RBD hang on OSD Failure

2015-12-10 Thread Ilya Dryomov
On Fri, Dec 11, 2015 at 12:16 AM, Matt Conner wrote: > Hi Ilya, > > Took me a little time to reproduce, but I've once again got into the > state. In the past I had reproduced the issue with a single OSD > failure but in this case I failed an entire server. Ignore the

Re: [ceph-users] Kernel RBD hang on OSD Failure

2015-12-08 Thread Ilya Dryomov
On Tue, Dec 8, 2015 at 10:57 AM, Tom Christensen wrote: > We aren't running NFS, but regularly use the kernel driver to map RBDs and > mount filesystems in same. We see very similar behavior across nearly all > kernel versions we've tried. In my experience only very few

Re: Kernel RBD hang on OSD Failure

2015-12-08 Thread Ilya Dryomov
On Mon, Dec 7, 2015 at 9:56 PM, Matt Conner wrote: > Hi, > > We have a Ceph cluster in which we have been having issues with RBD > clients hanging when an OSD failure occurs. We are using a NAS gateway > server which maps RBD images to filesystems and serves the

Re: testing the /dev/cciss/c0d0 device names

2015-12-06 Thread Ilya Dryomov
On Sat, Dec 5, 2015 at 7:36 PM, Loic Dachary wrote: > Hi Ilya, > > ceph-disk has special handling for device names like /dev/cciss/c0d1 [1] and > it was partially broken when support for device mapper was introduced. > Ideally there would be a way to test that support when

[PATCH] rbd: don't put snap_context twice in rbd_queue_workfn()

2015-12-01 Thread Ilya Dryomov
umes a ref on snapc, so calling ceph_put_snap_context() after a successful rbd_img_request_create() leads to an extra put. Fix it. Cc: sta...@vger.kernel.org # 3.18+ Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drive

Re: block-rbd: One function call less in rbd_dev_probe_parent() after error detection

2015-11-26 Thread Ilya Dryomov
On Thu, Nov 26, 2015 at 8:54 AM, SF Markus Elfring wrote: >>> I interpreted the eventual passing of a null pointer to the >>> rbd_dev_destroy() >>> function as an indication for further source code adjustments. >> >> If all error paths could be adjusted so that

Re: [PATCH 2/2] block-rbd: One function call less in rbd_dev_probe_parent() after error detection

2015-11-25 Thread Ilya Dryomov
On Wed, Nov 25, 2015 at 12:55 PM, Dan Carpenter <dan.carpen...@oracle.com> wrote: > On Tue, Nov 24, 2015 at 09:21:06PM +0100, Ilya Dryomov wrote: >> >> Cleanup here is (and should be) done in reverse order. >> > > > Yes. This is true. > >> > I ha

Re: block-rbd: One function call less in rbd_dev_probe_parent() after error detection

2015-11-24 Thread Ilya Dryomov
On Tue, Nov 24, 2015 at 9:34 PM, SF Markus Elfring wrote: >> Well, there isn't any _literal_ linking (e.g. adding to a link list, >> etc) in this case. We just bump some refs and do probe to fill in the >> newly allocated parent. > > Thanks for your clarification.

Re: [CEPH-DEVEL] [ceph-users] occasional failure to unmap rbd

2015-11-24 Thread Ilya Dryomov
esg attached. > But does not show much except for one mon being down. > The mon is down for hardware reasons. > > > > On Mon, Nov 23, 2015 at 11:26 PM, Ilya Dryomov <idryo...@gmail.com> wrote: >> >> On Mon, Nov 23, 2015 at 11:03 PM, Markus Kienast <

Re: Reboot blocked when undoing unmap op.

2015-11-20 Thread Ilya Dryomov
On Fri, Nov 20, 2015 at 3:19 AM, Wukongming wrote: > Hi Sage, > > I created a rbd image, and mapped to a local which means I can find > /dev/rbd0, at this time I reboot the system, in last step of shutting down, > it blocked with an error > > [235618.0202207] libceph:

Re: Kernel client OSD message version

2015-11-19 Thread Ilya Dryomov
On Thu, Nov 19, 2015 at 11:27 AM, Lakis, Jacek wrote: > Ilya, thank you for the quick reply. > > For example it's about split decoding. I'm asking not because of specific > changes, I'm rather curious about when we should sync the kernel client > encoding to the master

Re: [PATCH 2/3] net/ceph: do not define list_entry_next

2015-11-18 Thread Ilya Dryomov
On Wed, Nov 18, 2015 at 1:13 PM, Sergey Senozhatsky wrote: > Cosmetic. > > Do not define list_entry_next() and use list_next_entry() > from list.h. > > Signed-off-by: Sergey Senozhatsky > --- > net/ceph/messenger.c | 8 +++- > 1

Re: [PATCH 2/3] libceph: use list_next_entry instead of list_entry_next

2015-11-17 Thread Ilya Dryomov
On Mon, Nov 16, 2015 at 2:46 PM, Geliang Tang wrote: > list_next_entry has been defined in list.h, so I replace list_entry_next > with it. > > Signed-off-by: Geliang Tang > --- > net/ceph/messenger.c | 7 ++- > 1 file changed, 2 insertions(+), 5

request_queue use-after-free - inode_detach_wb()

2015-11-16 Thread Ilya Dryomov
Hello, Last week, while running an rbd test which does a lot of maps and unmaps (read losetup / losetup -d) with slab debugging enabled, I hit the attached splat. That 6a byte corresponds to the atomic_long_t count of the percpu_ref refcnt in request_queue::backing_dev_info::wb, pointing to a

Re: [PATCH] ceph:Fix error handling in the function down_reply

2015-11-09 Thread Ilya Dryomov
On Mon, Nov 9, 2015 at 11:15 AM, Yan, Zheng wrote: > >> On Nov 9, 2015, at 11:11, Nicholas Krause wrote: >> >> This fixes error handling in the function down_reply in order to >> properly check and jump to the goto label, out_err for this >> particular

[PATCH 2/4] libceph: drop authorizer check from cephx msg signing routines

2015-11-02 Thread Ilya Dryomov
I don't see a way for auth->authorizer to be NULL in ceph_x_sign_message() or ceph_x_check_message_signature(). Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- net/ceph/auth_x.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/net/ceph/auth_x.c b/net/ceph

[PATCH 1/4] libceph: msg signing callouts don't need con argument

2015-11-02 Thread Ilya Dryomov
provided authorizer is of no use. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- fs/ceph/mds_client.c | 14 -- include/linux/ceph/messenger.h | 5 ++--- net/ceph/messenger.c | 4 ++-- net/ceph/osd_client.c | 14 -- 4 files changed

[PATCH 0/4] libceph: nocephx_sign_messages option + misc

2015-11-02 Thread Ilya Dryomov
Hello, This adds nocephx_sign_messages libceph option (a lack of which is something people are running into, see [1]), plus a couple of related cleanups. [1] https://forum.proxmox.com/threads/24116-new-krbd-option-on-pve4-don-t-work Thanks, Ilya Ilya Dryomov (4): libceph

Re: Re: [PATCH] mark rbd requiring stable pages

2015-10-30 Thread Ilya Dryomov
On Fri, Oct 23, 2015 at 9:06 PM, Ilya Dryomov <idryo...@gmail.com> wrote: > On Fri, Oct 23, 2015 at 9:00 PM, ronny.hegew...@online.de > <ronny.hegew...@online.de> wrote: >>> Could you share the entire log snippet for those 10 minutes? >> >> Thats all in t

Re: [PATCH] net: ceph: osd_client: change osd_req_op_data() macro

2015-10-29 Thread Ilya Dryomov
On Thu, Oct 22, 2015 at 5:06 PM, Ioana Ciornei wrote: > This patch changes the osd_req_op_data() macro to not evaluate > parameters more than once in order to follow the kernel coding style. > > Signed-off-by: Ioana Ciornei > Reviewed-by: Alex

[PATCH] libceph: introduce ceph_x_authorizer_cleanup()

2015-10-26 Thread Ilya Dryomov
y(), which currently always leaks key and ceph_x_build_authorizer() error paths. Cc: Yan, Zheng <z...@redhat.com> Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- net/ceph/auth_x.c | 28 +--- net/ceph/crypto.h | 4 +++- 2 files changed, 20 insertions(+), 12 de

Re: Re: [PATCH] mark rbd requiring stable pages

2015-10-23 Thread Ilya Dryomov
On Fri, Oct 23, 2015 at 9:00 PM, ronny.hegew...@online.de wrote: >> Could you share the entire log snippet for those 10 minutes? > > That's all in the logs. But if more information would be useful, tell me which > logs > to activate and I will give it another run. At

[PATCH] rbd: remove duplicate calls to rbd_dev_mapping_clear()

2015-10-23 Thread Ilya Dryomov
f moving the old one. Around the same time, another duplicate was introduced in rbd_dev_device_release() - kill both. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index

Re: [PATCH] mark rbd requiring stable pages

2015-10-23 Thread Ilya Dryomov
On Fri, Oct 23, 2015 at 1:56 AM, Ronny Hegewald <ronny.hegew...@online.de> wrote: > On Thursday 22 October 2015, Ilya Dryomov wrote: >> Well, checksum mismatches are to be expected given what we are doing >> now, but I wouldn't expect any data corruptions. Ronny writes th

Re: [PATCH] mark rbd requiring stable pages

2015-10-22 Thread Ilya Dryomov
On Thu, Oct 22, 2015 at 7:22 PM, Mike Christie <micha...@cs.wisc.edu> wrote: > On 10/22/15, 11:52 AM, Ilya Dryomov wrote: >> >> On Thu, Oct 22, 2015 at 5:37 PM, Mike Christie <micha...@cs.wisc.edu> >> wrote: >>> >>> On 10/22/2015 06:20 AM, Ilya Dr

Re: [PATCH] mark rbd requiring stable pages

2015-10-22 Thread Ilya Dryomov
On Thu, Oct 22, 2015 at 6:07 AM, Mike Christie <micha...@cs.wisc.edu> wrote: > On 10/21/2015 03:57 PM, Ilya Dryomov wrote: >> On Wed, Oct 21, 2015 at 10:51 PM, Ilya Dryomov <idryo...@gmail.com> wrote: >>> On Fri, Oct 16, 2015 at 1:09 PM, Ilya Dryomov <id

Re: [PATCH] mark rbd requiring stable pages

2015-10-22 Thread Ilya Dryomov
On Thu, Oct 22, 2015 at 5:37 PM, Mike Christie <micha...@cs.wisc.edu> wrote: > On 10/22/2015 06:20 AM, Ilya Dryomov wrote: >> >>> > >>> > If we are just talking about if stable pages are not used, and someone >>> > is re-writing data

Re: [PATCH] mark rbd requiring stable pages

2015-10-21 Thread Ilya Dryomov
On Wed, Oct 21, 2015 at 10:51 PM, Ilya Dryomov <idryo...@gmail.com> wrote: > On Fri, Oct 16, 2015 at 1:09 PM, Ilya Dryomov <idryo...@gmail.com> wrote: >> Hmm... On the one hand, yes, we do compute CRCs, but that's optional, >> so enabling this unconditionally is p

Re: [PATCH] mark rbd requiring stable pages

2015-10-21 Thread Ilya Dryomov
On Fri, Oct 16, 2015 at 1:09 PM, Ilya Dryomov <idryo...@gmail.com> wrote: > Hmm... On the one hand, yes, we do compute CRCs, but that's optional, > so enabling this unconditionally is probably too harsh. OTOH we are > talking to the network, which means all sorts of delays,

Re: [PATCH v2] Net: ceph: messenger: Use local variable cursor in read_partial_msg_data()

2015-10-19 Thread Ilya Dryomov
On Mon, Oct 19, 2015 at 4:49 AM, Shraddha Barke wrote: > Use local variable cursor in place of &msg->cursor in > read_partial_msg_data() > > Signed-off-by: Shraddha Barke > --- > Changes in v2- > Drop incorrect use of cursor > >

Re: [PATCH v3] net: ceph: messenger: Use local variable cursor instead of &msg->cursor

2015-10-19 Thread Ilya Dryomov
On Mon, Oct 19, 2015 at 6:29 PM, Shraddha Barke wrote: > Use local variable cursor in place of &msg->cursor in > read_partial_msg_data() and write_partial_msg_data() > > Signed-off-by: Shraddha Barke > --- > Changes in v3- > Replace &msg->cursor with

[PATCH] rbd: return -ENOMEM instead of pool id if rbd_dev_create() fails

2015-10-19 Thread Ilya Dryomov
Returning pool id (i.e. >= 0) from a sysfs ->store() callback makes userspace think it needs to retry the write. Fix it - it's a leftover from the times when the equivalent of rbd_dev_create() was the first action in rbd_add(). Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- d

[PATCH] rbd: set device_type::release instead of device::release

2015-10-19 Thread Ilya Dryomov
No point in providing an empty device_type::release callback and then setting device::release for each rbd_dev dynamically. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/block/r

[PATCH] rbd: don't free rbd_dev outside of the release callback

2015-10-19 Thread Ilya Dryomov
/issues/12697 Cc: Alex Elder <el...@linaro.org> Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 89 - 1 file changed, 47 insertions(+), 42 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd

Re: [PATCH] Net: ceph: osd_client: Remove con argument in handle_reply

2015-10-18 Thread Ilya Dryomov
On Sun, Oct 18, 2015 at 10:25 AM, Shraddha Barke wrote: > Since the function handle_reply does not use its con argument, > remove it. > > Signed-off-by: Shraddha Barke > --- > net/ceph/osd_client.c | 5 ++--- > 1 file changed, 2 insertions(+),

Re: [PATCH] Net: ceph: messenger: Use local variable cursor in read_partial_msg_data()

2015-10-18 Thread Ilya Dryomov
On Sun, Oct 18, 2015 at 12:00 PM, Shraddha Barke wrote: > Use local variable cursor in place of &msg->cursor in > read_partial_msg_data() > > Signed-off-by: Shraddha Barke > --- > net/ceph/messenger.c | 6 +++--- > 1 file changed, 3 insertions(+), 3

Re: [PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()

2015-10-16 Thread Ilya Dryomov
On Thu, Oct 15, 2015 at 11:10 PM, Alex Elder <el...@ieee.org> wrote: > On 10/11/2015 01:03 PM, Ilya Dryomov wrote: >> Currently we leak parent_spec and trigger a "parent reference >> underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails. >>

Re: [PATCH] mark rbd requiring stable pages

2015-10-16 Thread Ilya Dryomov
On Thu, Oct 15, 2015 at 8:50 PM, Ronny Hegewald wrote: > rbd requires stable pages, as it performs a crc of the page data before they > are sent to the OSDs. > > But since kernel 3.9 (patch 1d1d1a767206fbe5d4c69493b7e6d2a8d08cc0a0 "mm: only > enforce stable page writes

Re: [PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()

2015-10-16 Thread Ilya Dryomov
On Fri, Oct 16, 2015 at 1:50 PM, Alex Elder <el...@ieee.org> wrote: > On 10/16/2015 04:50 AM, Ilya Dryomov wrote: >> On Thu, Oct 15, 2015 at 11:10 PM, Alex Elder <el...@ieee.org> wrote: >>> On 10/11/2015 01:03 PM, Ilya Dryomov wrote: >>>> Currently

Re: [PATCH] rbd: set max_sectors explicitly

2015-10-16 Thread Ilya Dryomov
On Fri, Oct 16, 2015 at 2:22 PM, Alex Elder <el...@ieee.org> wrote: > On 10/07/2015 12:00 PM, Ilya Dryomov wrote: >> Commit 30e2bc08b2bb ("Revert "block: remove artifical max_hw_sectors >> cap"") restored a clamp on max_sectors. It's now 2560 sectors ins

Re: [PATCH] ceph/osd_client: add support for CEPH_OSD_OP_GETXATTR

2015-10-15 Thread Ilya Dryomov
On Thu, Oct 15, 2015 at 12:51 AM, David Disseldorp <dd...@suse.de> wrote: > On Wed, 14 Oct 2015 19:57:46 +0200, Ilya Dryomov wrote: > >> On Wed, Oct 14, 2015 at 7:37 PM, David Disseldorp <dd...@suse.de> wrote: > ... >> > Ping, any feedback on the patch? >

Re: [PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()

2015-10-15 Thread Ilya Dryomov
On Thu, Oct 15, 2015 at 7:10 PM, Alex Elder <el...@ieee.org> wrote: > On 10/11/2015 01:03 PM, Ilya Dryomov wrote: >> Currently we leak parent_spec and trigger a "parent reference >> underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails. >>

Re: [PATCH] ceph/osd_client: add support for CEPH_OSD_OP_GETXATTR

2015-10-14 Thread Ilya Dryomov
On Wed, Oct 14, 2015 at 7:37 PM, David Disseldorp wrote: > On Fri, 9 Oct 2015 16:43:09 +0200, David Disseldorp wrote: > >> Allows for xattr retrieval. Response data buffer allocation is the >> responsibility of the osd_req_op_xattr_init() caller. > > Ping, any feedback on the

Re: Kernel RBD Readahead

2015-10-13 Thread Ilya Dryomov
On Tue, Oct 13, 2015 at 11:33 AM, Olivier Bonvalet <ceph.l...@daevel.fr> wrote: > Le mardi 25 août 2015 à 17:50 +0300, Ilya Dryomov a écrit : >> > Ok. I might try and create a 4.1 kernel with the blk-mq queue >> depth/IO size + readahead +max_segments fixes in as I'm thin

Re: Reply: [PATCH] rbd: prevent kernel stack blow up on rbd map

2015-10-12 Thread Ilya Dryomov
On Mon, Oct 12, 2015 at 4:22 AM, Caoxudong wrote: > By the way, do you think it's necessary that we add the clone-chain-length > limit in user-space code too? librbd is different in a lot of ways and there isn't a clean separation between the client part (i.e. what is

[PATCH] rbd: don't leak parent_spec in rbd_dev_probe_parent()

2015-10-11 Thread Ilya Dryomov
() remains and triggers rbd_warn() in rbd_dev_parent_put() - at that point we have parent_spec != NULL and parent_ref == 0, so counter ends up being -1 after the decrement. Redo rbd_dev_probe_parent() to fix this. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/

[PATCH] rbd: prevent kernel stack blow up on rbd map

2015-10-11 Thread Ilya Dryomov
() rbd_img_parent_read_callback() rbd_obj_request_complete() ... Limit the parent chain to 8 images, which is ~3K worth of stack. It's probably a good thing to do regardless - performance with more than a few images long parent chain is likely to be pretty bad. Signed-off-by: Ilya

[PATCH] rbd: set max_sectors explicitly

2015-10-07 Thread Ilya Dryomov
default object size is 4M. So, set max_sectors to max_hw_sectors in rbd at queue init time. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 05072464d25e..04e69b4df664

[PATCH] rbd: use writefull op for object size writes

2015-10-07 Thread Ilya Dryomov
an assert, I didn't do it because its only user is cephfs. All other sites were updated. Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 9 +++-- net/ceph/osd_client.c | 13 + 2

Re: [PATCH] rbd: use writefull op for object size writes

2015-10-07 Thread Ilya Dryomov
On Wed, Oct 7, 2015 at 5:36 PM, Alex Elder <el...@ieee.org> wrote: > On 10/07/2015 12:02 PM, Ilya Dryomov wrote: >> This covers only the simplest case - an object size sized write, but >> it's still useful in tiering setups when EC is used for the base tier >> as

Re: [PATCH] ceph:Remove unused goto labels in decode crush map functions

2015-10-02 Thread Ilya Dryomov
On Fri, Oct 2, 2015 at 9:48 PM, Nicholas Krause wrote: > This removes unused goto labels in decode crush map functions related > to error paths due to them never being used on any error path for these > particular functions in the file, osdmap.c. > > Signed-off-by: Nicholas

Re: [CEPH-DEVEL] [ceph-users] occasional failure to unmap rbd

2015-09-26 Thread Ilya Dryomov
On Sat, Sep 26, 2015 at 5:54 AM, Shinobu Kinjo wrote: > I think it's more helpful to put returned value in: > > # ./src/krbd.cc > 530 cerr << "rbd: sysfs write failed" << std::endl; > > like: > > 530 cerr << "rbd: sysfs write failed (" << r << ")" << std::endl; > >

Re: how to specify the point a rbd mapped?

2015-09-25 Thread Ilya Dryomov
On Fri, Sep 25, 2015 at 6:25 AM, Jaze Lee wrote: > Hello, > I know we can map an rbd image to a block device in the kernel with the > ‘rbd map’ command. > But we cannot specify which block device. For example, if I have > an rbd named rbd_0, > I want it mapped into

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 4:08 PM, Jason Dillaman wrote: >> > In this case the commands look a little confusing to me, as from their >> > names I would rather think they enable/disable mirror for existent >> > images too. Also, I don't see a command to check what current >> >

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 9:28 PM, Jason Dillaman wrote: >> So a pool policy is just a set of feature bits? > > It would have to store additional details as well. > >> I think Cinder at least creates images with rbd_default_features from >> ceph.conf and adds in layering if

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 9:33 AM, Mykola Golub wrote: > On Tue, Sep 22, 2015 at 01:32:49PM -0400, Jason Dillaman wrote: > >> > > * rbd mirror pool enable >> > > This will, by default, ensure that all images created in this >> > > pool have exclusive lock,

Re: partprobe or partx or ... ?

2015-09-21 Thread Ilya Dryomov
On Sat, Sep 19, 2015 at 11:08 PM, Loic Dachary wrote: > > > On 19/09/2015 17:23, Loic Dachary wrote: >> Hi Ilya, >> >> At present ceph-disk uses partprobe to ensure the kernel is aware of the >> latest partition changes after a new one is created, or after zapping the >>

Re: [Ceph-community] Getting WARN in __kick_osd_requests doing stress testing

2015-09-18 Thread Ilya Dryomov
On Fri, Sep 18, 2015 at 9:48 AM, Abhishek L wrote: > Redirecting to ceph-devel, where such a question might have a better > chance of a reply. > > On Fri, Sep 18, 2015 at 4:03 AM, wrote: >> I'm running in a 3-node cluster and doing osd/rbd

Re: [PATCH] libceph: advertise support for keepalive2

2015-09-16 Thread Ilya Dryomov
On Wed, Sep 16, 2015 at 9:28 AM, Yan, Zheng <uker...@gmail.com> wrote: > On Mon, Sep 14, 2015 at 9:51 PM, Ilya Dryomov <idryo...@gmail.com> wrote: >> We are the client, but advertise keepalive2 anyway - for consistency, >> if nothing else. In the future the server

[PATCH] libceph: advertise support for keepalive2

2015-09-14 Thread Ilya Dryomov
We are the client, but advertise keepalive2 anyway - for consistency, if nothing else. In the future the server might want to know whether its clients support keepalive2. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- include/linux/ceph/ceph_features.h | 1 + 1 file changed, 1 ins

[PATCH] libceph: don't access invalid memory in keepalive2 path

2015-09-14 Thread Ilya Dryomov
Fix this by encoding into a ceph_timespec member, similar to how acks are read and written. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- include/linux/ceph/messenger.h | 4 +++- net/ceph/messenger.c | 9 + 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/include/l

Re: [PATCH 33/39] rbd: drop null test before destroy functions

2015-09-14 Thread Ilya Dryomov
On Sun, Sep 13, 2015 at 3:15 PM, Julia Lawall wrote: > Remove unneeded NULL test. > > The semantic patch that makes this change is as follows: > (http://coccinelle.lip6.fr/) > > // > @@ expression x; @@ > -if (x != NULL) { >

size_t and related types on mn10300

2015-09-13 Thread Ilya Dryomov
On Thu, Sep 10, 2015 at 10:57 AM, kbuild test robot wrote: > tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git > master > head: 22dc312d56ba077db27a9798b340e7d161f1df05 > commit: 5f1c79a71766ba656762636936edf708089bdb14 [12335/12685] libceph:

Re: [PATCH] libceph: use keepalive2 to verify the mon session is alive

2015-09-02 Thread Ilya Dryomov
On Wed, Sep 2, 2015 at 5:22 AM, Yan, Zheng wrote: > timespec_to_jiffies() does not work this way. It converts a time delta in the form > of a timespec to a time delta in the form of jiffies. Ah sorry, con->last_keepalive_ack is a realtime timespec from userspace. > > I will update the patch

Re: [PATCH] libceph: use keepalive2 to verify the mon session is alive

2015-09-02 Thread Ilya Dryomov
On Wed, Sep 2, 2015 at 12:25 PM, Yan, Zheng <z...@redhat.com> wrote: > >> On Sep 2, 2015, at 17:12, Ilya Dryomov <idryo...@gmail.com> wrote: >> >> On Wed, Sep 2, 2015 at 5:22 AM, Yan, Zheng <z...@redhat.com> wrote: >>> timespec_to_jiffies() does

[PATCH] libceph: check data_len in ->alloc_msg()

2015-09-02 Thread Ilya Dryomov
rrupt random memory should a buggy ->alloc_msg() return an unfit ceph_msg. While at it, I changed the "unknown tid" dout() to a pr_warn() to make sure all skips are seen and unified format strings. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- net/ceph/messenger.c | 7

Re: [PATCH] libceph: use keepalive2 to verify the mon session is alive

2015-09-02 Thread Ilya Dryomov
On Wed, Sep 2, 2015 at 12:25 PM, Yan, Zheng <z...@redhat.com> wrote: > >> On Sep 2, 2015, at 17:12, Ilya Dryomov <idryo...@gmail.com> wrote: >> >> On Wed, Sep 2, 2015 at 5:22 AM, Yan, Zheng <z...@redhat.com> wrote: >>> timespec_to_jiffies() does

Re: devicemapper and udev events

2015-09-01 Thread Ilya Dryomov
On Tue, Sep 1, 2015 at 1:09 AM, Loic Dachary wrote: > Hi Ilya, > > While working on multipath, I noticed that udev add event are not triggered > when /dev/dm-0 etc. are created. Should I expect a udev add event for every > device ? Or does it not apply to devices created by /

Re: [PATCH] libceph: use keepalive2 to verify the mon session is alive

2015-09-01 Thread Ilya Dryomov
On Tue, Sep 1, 2015 at 5:21 PM, Yan, Zheng wrote: > Signed-off-by: Yan, Zheng > --- > include/linux/ceph/libceph.h | 2 ++ > include/linux/ceph/messenger.h | 4 +++ > include/linux/ceph/msgr.h | 4 ++- > net/ceph/ceph_common.c | 18

Re: format 2TB rbd device is too slow

2015-08-31 Thread Ilya Dryomov
On Mon, Aug 31, 2015 at 4:21 AM, Ma, Jianpeng wrote: > Ilya > I modify kernel code. The patch like this: > [root@dev linux]# git diff > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c > index bc67a93..e4c4ea9 100644 > --- a/drivers/block/rbd.c > +++

[PATCH] rbd: fix double free on rbd_dev->header_name

2015-08-31 Thread Ilya Dryomov
it shouldn't mock with clone's fields. Signed-off-by: Ilya Dryomov <idryo...@gmail.com> --- drivers/block/rbd.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index bc67a93aa4f4..324bf35ec4dd 100644 --- a/drivers/block/rbd.c +++ b/drivers/block

Re: format 2TB rbd device is too slow

2015-08-28 Thread Ilya Dryomov
On Thu, Aug 27, 2015 at 3:43 AM, huang jun hjwsm1...@gmail.com wrote: hi,llya 2015-08-26 23:56 GMT+08:00 Ilya Dryomov idryo...@gmail.com: On Wed, Aug 26, 2015 at 6:22 PM, Haomai Wang haomaiw...@gmail.com wrote: On Wed, Aug 26, 2015 at 11:16 PM, huang jun hjwsm1...@gmail.com wrote: hi,all we

Re: format 2TB rbd device is too slow

2015-08-28 Thread Ilya Dryomov
On Fri, Aug 28, 2015 at 10:36 AM, Ma, Jianpeng jianpeng...@intel.com wrote: Hi Ilya, We can change sector size from 512 to 4096. This can reduce the count of write. I did a simple test: for 900G, mkfs.xfs -f For default: 1m10s Physical sector size = 4096: 0m10s. But if change sector

Re: format 2TB rbd device is too slow

2015-08-26 Thread Ilya Dryomov
On Wed, Aug 26, 2015 at 6:22 PM, Haomai Wang haomaiw...@gmail.com wrote: On Wed, Aug 26, 2015 at 11:16 PM, huang jun hjwsm1...@gmail.com wrote: hi,all we create a 2TB rbd image, after map it to local, then we format it to xfs with 'mkfs.xfs /dev/rbd0', it spent 318 seconds to finish, but

Re: Kernel RBD Readahead

2015-08-25 Thread Ilya Dryomov
On Tue, Aug 25, 2015 at 5:05 PM, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 25 August 2015 09:45 To: Nick Fisk n...@fisk.me.uk Cc: Ceph Development ceph-devel@vger.kernel.org Subject: Re: Kernel RBD Readahead On Tue, Aug

Re: Kernel RBD Readahead

2015-08-25 Thread Ilya Dryomov
On Tue, Aug 25, 2015 at 10:40 AM, Nick Fisk n...@fisk.me.uk wrote: I have done two tests one with 1MB objects and another with 4MB objects, my cluster is a little busier than when I did the quick test yesterday, so all speeds are slightly down across the board but you can see the scaling

Re: Kernel RBD Readahead

2015-08-24 Thread Ilya Dryomov
On Sun, Aug 23, 2015 at 10:23 PM, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 23 August 2015 18:33 To: Nick Fisk n...@fisk.me.uk Cc: Ceph Development ceph-devel@vger.kernel.org Subject: Re: Kernel RBD Readahead On Sat

Re: Kernel RBD Readahead

2015-08-24 Thread Ilya Dryomov
On Mon, Aug 24, 2015 at 5:43 PM, Ilya Dryomov idryo...@gmail.com wrote: On Sun, Aug 23, 2015 at 10:23 PM, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 23 August 2015 18:33 To: Nick Fisk n...@fisk.me.uk Cc: Ceph Development

Re: Kernel RBD Readahead

2015-08-24 Thread Ilya Dryomov
On Mon, Aug 24, 2015 at 7:00 PM, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 24 August 2015 16:07 To: Nick Fisk n...@fisk.me.uk Cc: Ceph Development ceph-devel@vger.kernel.org Subject: Re: Kernel RBD Readahead On Mon, Aug

Re: Kernel RBD Readahead

2015-08-24 Thread Ilya Dryomov
On Mon, Aug 24, 2015 at 11:11 PM, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: 24 August 2015 18:19 To: Nick Fisk n...@fisk.me.uk Cc: Ceph Development ceph-devel@vger.kernel.org Subject: Re: Kernel RBD Readahead On Mon

Re: Kernel RBD Readahead

2015-08-23 Thread Ilya Dryomov
On Sat, Aug 22, 2015 at 11:45 PM, Nick Fisk n...@fisk.me.uk wrote: Hi Ilya, I was wondering if I could just get your thoughts on a matter I have run into? It concerns the read performance of the RBD kernel client and blk-mq, mainly when doing large single threaded reads. During testing

Re: /sys/block and /dev and partitions

2015-08-18 Thread Ilya Dryomov
On Tue, Aug 18, 2015 at 12:46 AM, Loic Dachary l...@dachary.org wrote: Hi Ilya, For regular devices such as /dev/vdb2 or /dev/sda3, do you think it is safe to use /sys/dev/block/M:m/partition to figure out the partition number ? Or could it vary depending on the disk driver or the partition

Re: rbd: default map options, new options, misc (Re: creating the issue for the v0.94.4 release)

2015-08-18 Thread Ilya Dryomov
On Mon, Aug 17, 2015 at 8:41 PM, Loic Dachary l...@dachary.org wrote: On 17/08/2015 18:31, Josh Durgin wrote: Yes, this would make sense in firefly too, since it's useful for newer kernels regardless of ceph userspace versions. Ok, I'll schedule it for backport. Is there an issue

Re: /sys/block and /dev and partitions

2015-08-15 Thread Ilya Dryomov
On Sat, Aug 15, 2015 at 11:56 PM, Loic Dachary l...@dachary.org wrote: Hi Ilya, On 15/08/2015 19:42, Ilya Dryomov wrote: On Sat, Aug 15, 2015 at 6:35 PM, Loic Dachary l...@dachary.org wrote: Hi Sage, On 15/08/2015 16:28, Sage Weil wrote: On Sat, 15 Aug 2015, Loic Dachary wrote: Hi

Re: /sys/block and /dev and partitions

2015-08-15 Thread Ilya Dryomov
On Sat, Aug 15, 2015 at 6:35 PM, Loic Dachary l...@dachary.org wrote: Hi Sage, On 15/08/2015 16:28, Sage Weil wrote: On Sat, 15 Aug 2015, Loic Dachary wrote: Hi, Is there a portable and consistent way to figure out if a given /dev/XXX path (for instance /dev/dm-1) is a partition of a whole

Re: rbd object map

2015-08-07 Thread Ilya Dryomov
On Fri, Aug 7, 2015 at 9:13 AM, Max Yehorov myeho...@skytap.com wrote: Hi, Object map feature was added in hammer. It is possible to create and delete an image with object map enabled using rbd, though it is not possible to map the image which has object map or exclusive lock features

Re: Newbie question about metadata_list.

2015-08-07 Thread Ilya Dryomov
On Fri, Aug 7, 2015 at 10:12 AM, Łukasz Szymczyk lukasz.szymc...@corp.ovh.com wrote: On Thu, 6 Aug 2015 15:01:54 +0300 Ilya Dryomov idryo...@gmail.com wrote: Hi, On Thu, Aug 6, 2015 at 12:26 PM, Łukasz Szymczyk lukasz.szymc...@corp.ovh.com wrote: Hi, I'm writing some program

Re: Newbie question about metadata_list.

2015-08-06 Thread Ilya Dryomov
On Thu, Aug 6, 2015 at 12:26 PM, Łukasz Szymczyk lukasz.szymc...@corp.ovh.com wrote: Hi, I'm writing a program to replace an image in the cluster with its copy. But I have a problem with metadata_list. I created a pool: #rados mkpool dupa then I created an image: #rbd create --size 1000 -p mypool

Re: [PATCH] rbd: fix copyup completion race

2015-07-29 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 8:48 PM, Alex Elder el...@ieee.org wrote: On 07/17/2015 05:36 AM, Ilya Dryomov wrote: For write/discard obj_requests that involved a copyup method call, the opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is rbd_img_obj_copyup_callback(). The latter frees

Re: max rbd devices

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 5:19 PM, Wyllys Ingersoll wyllys.ingers...@keepertech.com wrote: Actually, it's a 4.1rc8 kernel. And modinfo shows that the single_major parameter defaults to false. $ sudo modinfo rbd filename: /lib/modules/4.1.0-040100rc8-generic/kernel/drivers/block/rbd.ko

Re: max rbd devices

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 5:59 PM, Wyllys Ingersoll wyllys.ingers...@keepertech.com wrote: Interesting to note - /etc/modules contains a line for rbd, which means it gets loaded at boot time and NOT via the rbd cli, so perhaps this is why it is loaded with the flag set to N. Yeah, that's

Re: max rbd devices

2015-07-28 Thread Ilya Dryomov
On Tue, Jul 28, 2015 at 5:35 PM, Wyllys Ingersoll wyllys.ingers...@keepertech.com wrote: OK, I misunderstood. It is not loaded by hand. We have permanent mappings listed in /etc/ceph/rbdmap which then get mapped automatically when the system boots. So, in that case, shouldn't single_major=Y
