Re: [PATCH] rbd: fix I/O error propagation for reads

2013-08-27 Thread Alex Elder
or not they were completed successfully. This looks good to me. Reviewed-by: Alex Elder el...@linaro.org they always set xferred = length, but reads were skipping that step if an error other than -ENOENT occurred. Instead, rbd would end up passing 0 xferred to blk_end_request(), which would always

Re: [PATCH 02/15] rbd: convert bus code to use bus_groups

2013-08-27 Thread Alex Elder
On 08/23/2013 04:24 PM, Greg Kroah-Hartman wrote: The bus_attrs field of struct bus_type is going away soon, dev_groups should be used instead. This converts the RBD bus code to use the correct field. Cc: Yehuda Sadeh yeh...@inktank.com Cc: Sage Weil s...@inktank.com Cc: Alex Elder el

Re: [PATCH] rbd: fix I/O error propagation for reads

2013-08-27 Thread Alex Elder
On 08/27/2013 10:36 AM, Sage Weil wrote: On Tue, 27 Aug 2013, Alex Elder wrote: On 08/26/2013 08:34 PM, Josh Durgin wrote: When a request returns an error, the driver needs to report the entire extent of the request as completed. Writes already did this, since You're right. The block layer

Re: [PATCH] libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc

2013-08-29 Thread Alex Elder
with pgp_num (not pg_num) in a ceph_stable_mod() call. Reviewed-by: Alex Elder el...@linaro.org CC: sta...@vger.kernel.org Signed-off-by: Sage Weil s...@inktank.com --- net/ceph/osdmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ceph/osdmap.c b/net/ceph

Re: [PATCH 1/3] rbd: fix null dereference in dout

2013-08-29 Thread Alex Elder
On 08/29/2013 01:24 AM, Josh Durgin wrote: The order parameter is sometimes NULL in _rbd_dev_v2_snap_size(), but the dout() always derefences it. Move this to another dout() protected by a check that order is non-NULL. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh

Re: [PATCH] rbd: fix buffer size for writes to images with snapshots

2013-08-29 Thread Alex Elder
-by: Alex Elder alex.el...@linaro.org I must have suggested it in e-mail... I guess I should have updated the bug. This looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh Durgin josh.dur...@inktank.com --- drivers/block/rbd.c | 10 +- 1 files changed, 5 insertions

Re: [PATCH 3/3] rbd: close remove vs. notify race leading to use-after-free

2013-08-29 Thread Alex Elder
. Fixes: http://tracker.ceph.com/issues/5636 Signed-off-by: Josh Durgin josh.dur...@inktank.com Please verify there's no lockdep reports if you haven't already. And consider my other comments, but barring lockdep issues: Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c | 25

Re: [PATCH 2/3] libceph: add function to ensure notifies are complete

2013-08-29 Thread Alex Elder
in the queue that will call the watch callback for the event. If the queue is flushed after the event is unregistered, the caller can be sure no more watch callbacks will occur for the canceled watch. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh Durgin josh.dur

Re: [PATCH v2 1/3] rbd: fix use-after free of rbd_dev-disk

2013-09-03 Thread Alex Elder
://tracker.ceph.com/issues/5636 Signed-off-by: Josh Durgin josh.dur...@inktank.com A few comments below. I don't think you need the shutting_down flag after all. If you disagree, say so. Either way though, this looks good to me. Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c | 39

Re: [PATCH v2 2/3] rbd: complete notifies before cleaning up osd_client and rbd_dev

2013-09-03 Thread Alex Elder
. Signed-off-by: Josh Durgin josh.dur...@inktank.com Looks good. Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 8ab3362b..2223617 100644

Re: [PATCH v2 3/3] rbd: make rbd_obj_notify_ack() synchronous

2013-09-03 Thread Alex Elder
On 08/29/2013 07:57 PM, Josh Durgin wrote: The only user is rbd_watch_cb(). If this is asynchronous, there is no The only user *of rbd_obj_notify_ack()* is rbd_watch_cb(). ... tracking of when it completes, so it may still be in progress when the I think what you're saying is that, because

Re: [PATCH v2 3/3] rbd: make rbd_obj_notify_ack() synchronous

2013-09-03 Thread Alex Elder
On 09/03/2013 08:05 AM, Alex Elder wrote: There's another problem though. This function used a clever trick of assigning rbd_obj_request_put() as the object request's callback function. This allowed the request to just be sent and cleaned up whenever it completed. Oh, sorry, I neglected

Re: [PATCH] rbd: ignore unmapped snapshots that no longer exist

2013-09-03 Thread Alex Elder
. I mention something about a one of your comments, but even so this looks good to me, so for this patch: Reviewed-by: Alex Elder el...@linaro.org However, in looking at this, I found something else that probably should be resolved. Note that rbd_dev_spec_update() calls rbd_snap_name

Re: [PATCH 1/2] ceph: remove ceph_lookup_inode()

2013-09-03 Thread Alex Elder
(), use ceph_find_inode() instead. Signed-off-by: Yan, Zheng zheng.z@intel.com Looks good. Reviewed-by: Alex Elder el...@linaro.org --- fs/ceph/inode.c | 8 fs/ceph/mds_client.c | 2 +- fs/ceph/super.h | 2 -- 3 files changed, 1 insertion(+), 11 deletions(-) diff

Re: [PATCH 2/2] ceph: use d_invalidate() to invalidate aliases

2013-09-03 Thread Alex Elder
On 09/02/2013 02:19 AM, Yan, Zheng wrote: From: Yan, Zheng zheng.z@intel.com d_invalidate() is the standard VFS method to invalidate dentry. compare to d_delete(), it also try shrinking children dentries. I'm less familiar with this code than rbd and the osd client, so you probably want

Re: [PATCH v3 2/5] rbd: make rbd_obj_notify_ack() synchronous

2013-09-09 Thread Alex Elder
() to rbd_obj_notify_ack_sync() to reflect its new synchronous nature. Signed-off-by: Josh Durgin josh.dur...@inktank.com Looks great. Nice description. Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c | 11 ++- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/drivers

Re: [PATCH v3 4/5] rbd: ignore unmapped snapshots that no longer exist

2013-09-09 Thread Alex Elder
a distraction. (But don't bother changing it, what you have is good.) Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh Durgin josh.dur...@inktank.com --- drivers/block/rbd.c |9 +++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/block/rbd.c b

Re: [PATCH v3 5/5] rbd: fix error handling from rbd_snap_name()

2013-09-09 Thread Alex Elder
between the two calls or memory was exhausted. Use an ERR_PTR in rbd_dev_v1_snap_name() so that the specific error can be propagated, and it is consistent with rbd_dev_v2_snap_name(). Handle the ERR_PTR in the only rbd_snap_name() caller. Looks good. Reviewed-by: Alex Elder el...@linaro.org

Re: [PATCH v3 0/5] fix shutdown races and snapshot error handling

2013-09-09 Thread Alex Elder
On 09/09/2013 02:16 AM, Josh Durgin wrote: Patches 1-3 fix races between device removal and notify processing. Patch 2 has an improved summary, fixes reference counting, and renames the function as suggested by Alex. Patch 3 is a reworked and simplified version that uses the existing

Re: [PATCH v3 3/5] rbd: fix use-after free of rbd_dev-disk

2013-09-09 Thread Alex Elder
On 09/09/2013 11:15 AM, Josh Durgin wrote: On 09/09/2013 06:37 AM, Alex Elder wrote: On 09/09/2013 02:17 AM, Josh Durgin wrote: Removing a device deallocates the disk, unschedules the watch, and finally cleans up the rbd_dev structure. rbd_dev_refresh(), called from the watch callback

Re: [PATCH v2] rbd: add ioctl for rbd

2013-09-17 Thread Alex Elder
On 09/17/2013 02:04 AM, Guangliang Zhao wrote: When running the following commands: [root@ceph0 mnt]# blockdev --setro /dev/rbd2 [root@ceph0 mnt]# blockdev --getro /dev/rbd2 0 I think this is a good change to make, and I think what you've done is overall fine. I do see one bug

Re: [PATCH v2] rbd: add ioctl for rbd

2013-09-17 Thread Alex Elder
On 09/17/2013 10:24 AM, Josh Durgin wrote: . . . OK, now I have a very broad (and too detailed) suggestion. (I got a little carried away, sorry about that.) At this point there is only one IOCTL request that is handled by this function. I know it doesn't match the general-purpose

Re: [PATCH v2] rbd: add ioctl for rbd

2013-09-17 Thread Alex Elder
On 09/17/2013 10:24 AM, Josh Durgin wrote: As long as you're testing... It would be good to demonstrate that it's writable after a setrw call (i.e., by actually writing to it). And it would be good to show an attempt to change to read-write a mapped base image as well as a mapped snapshot

Re: [PATCH v3] rbd: add ioctl for rbd

2013-09-21 Thread Alex Elder
change, and Josh has no problem with the read-only state thing: Reviewed-by: Alex Elder el...@linaro.org The required change is something Josh mentioned. In rbd_request_fn(), the variable read_only is defined and assigned at the top of the function. That line needs to be inside the while loop

Re: [PATCH 1/2] rbd: move calls that may sleep out of spin lock range

2013-10-01 Thread Alex Elder
get_user() and set_disk_ro(). This fix looks good. I have a couple small comments to consider. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh Durgin josh.dur...@inktank.com --- drivers/block/rbd.c | 36 +++- 1 files changed, 23 insertions(+), 13

Re: [PATCH 2/2] rbd: only set disk to read-only once

2013-10-01 Thread Alex Elder
block_device isn't available to us there. Looks good; set_disk_ro() sort of makes more sense anyway without any partitions. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Josh Durgin josh.dur...@inktank.com --- drivers/block/rbd.c |2 +- 1 files changed, 1 insertions(+), 1

[PATCH] MAINTAINERS: update an e-mail address

2013-11-08 Thread Alex Elder
I no longer have direct access to my Inktank e-mail. I still pay attention to rbd, so update its entry in MAINTAINERS accordingly. Signed-off-by: Alex Elder el...@linaro.org --- MAINTAINERS |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index

[PATCH] MAINTAINERS: update an e-mail address

2013-12-26 Thread Alex Elder
[Sage, perhaps you could sign off on or ack this, and take it in through your tree. Thanks. -Alex] I no longer have direct access to my Inktank e-mail. I still pay attention to rbd, so update its entry in MAINTAINERS accordingly. Signed-off-by: Alex Elder el...@linaro.org --- MAINTAINERS

[PATCH 0/3] kernel.h: centrally define U32_MAX, etc.

2013-12-26 Thread Alex Elder
. The first makes existing definitions be done conditionally; the second defines them in kernel.h; and the third gets rid of the conditional definitions. (These three can obviously be squashed into a single commit.) -Alex Alex Elder (3): conditionally

Re: [PATCH 1/5] ceph: properly handle XATTR_CREATE and XATTR_REPLACE

2014-02-11 Thread Alex Elder
should fix the man page to resolve the discrepancy.) I have one other comment at the end--basically that you're fixing a bug in addition to what you've mentioned above. Please at least mention that in your explanation. Otherwise this looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off

Re: [PATCH 2/5] ceph: remove zero-size xattr

2014-02-11 Thread Alex Elder
On 02/10/2014 11:30 PM, Yan, Zheng wrote: Signed-off-by: Yan, Zheng zheng.z@intel.com You really need to explain better under what circumstances a zero-size xattr is getting removed. But apparently it's only when you're updating an xattr (not building it up from a blob from storage). Why

Re: [PATCH 2/5] ceph: remove zero-size xattr

2014-02-11 Thread Alex Elder
On 02/11/2014 09:10 AM, Yan, Zheng wrote: On Tue, Feb 11, 2014 at 10:47 PM, Alex Elder el...@ieee.org wrote: On 02/10/2014 11:30 PM, Yan, Zheng wrote: Signed-off-by: Yan, Zheng zheng.z@intel.com You really need to explain better under what circumstances a zero-size xattr is getting

Re: [PATCH 4/5] ceph: add missing init_acl() for mkdir() and atomic_open()

2014-02-11 Thread Alex Elder
On 02/10/2014 11:30 PM, Yan, Zheng wrote: This looks OK to me but you should get another opinion, I haven't really given it as thorough a review as I normally do. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Yan, Zheng zheng.z@intel.com --- fs/ceph/dir.c | 9 + fs

Re: [PATCH 5/5] ceph: fix ceph_set_acl()

2014-02-11 Thread Alex Elder
is important, to be able to verify the code change matches what the developer intended it to do.) The problem here is that ceph_set_acl() could return -ENODATA if it attempted to remove a non-existent ACL attribute, and that shouldn't happen. (Right?) Reviewed-by: Alex Elder el...@linaro.org PS

Re: rbd: use watch/notify for changes in rbd header

2014-02-13 Thread Alex Elder
On 02/13/2014 06:04 AM, Ilya Dryomov wrote: On Thu, Feb 13, 2014 at 11:50 AM, Dan Carpenter dan.carpen...@oracle.com wrote: Hello Yehuda Sadeh, The patch 59c2be1e4d42: rbd: use watch/notify for changes in rbd header from Mar 21, 2011, leads to the following static checker warning:

Re: [PATCH 1/3] ceph: make ceph_forget_all_cached_acls() static inline

2014-02-14 Thread Alex Elder
asking.) Reviewed-by: Alex Elder el...@linaro.org --- fs/ceph/acl.c |5 - fs/ceph/super.h |6 +- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index 64fddbc..48af0b3 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -54,11

Re: [PATCH 3/3] ceph: make ceph ACL for symlink inheritable

2014-02-14 Thread Alex Elder
On 02/13/2014 11:29 PM, Guangliang Zhao wrote: Default ACL couldn't be inherited by the symlink in the parent directory. This resolve it. This looks good to me, and it seems to be consistent with what other file systems do. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Guangliang

Re: [PATCH 2/3] ceph: add acl, noacl options for cephfs mount

2014-02-14 Thread Alex Elder
On 02/13/2014 11:29 PM, Guangliang Zhao wrote: Signed-off-by: Guangliang Zhao lucienc...@gmail.com If CONFIG_CEPH_FS_POSIX_ACL is defined, the default is to have them enabled; if it is not, the default is to have it disabled. But now, whether enabled or not is possible to enable/disable them

Re: [PATCH 4/5] ceph: add missing init_acl() for mkdir() and atomic_open()

2014-02-14 Thread Alex Elder
On 02/14/2014 12:12 AM, Yan, Zheng wrote: updated patch This looks OK, but I just finished reviewing a separate patch that seems to do this... One or the other. Provided you resolve that: Reviewed-by: Alex Elder el...@linaro.org --- From 9c517a5af5af472a0bee9c900b7df400b10f3098 Mon Sep 17

Re: [PATCH 2/3] ceph: add acl, noacl options for cephfs mount

2014-02-17 Thread Alex Elder
On 02/16/2014 12:26 PM, Sage Weil wrote: Hi Alex, Guangliang, Looks good to me. Reviewed-by: Alex Elder el...@linaro.org On Fri, 14 Feb 2014, Alex Elder wrote: On 02/13/2014 11:29 PM, Guangliang Zhao wrote: Signed-off-by: Guangliang Zhao lucienc...@gmail.com If CONFIG_CEPH_FS_POSIX_ACL

Re: [PATCH 0/6] libceph: CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-23 Thread Alex Elder
On 02/23/2014 10:14 AM, Sage Weil wrote: Reviewed-by: Sage Weil s...@inktank.com It would be great if Josh or Alex could also look (at 6 especially) as they are more familiar with the RBD code than I am, but I think it's safe to put this in testing for now. I will. I have been planning

Re: [PATCH 2/6] libceph: add support for CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-24 Thread Alex Elder
for all SETALLOCHINT ops. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com I have a few small comments for you to consider but this looks good to me. Reviewed-by: Alex Elder el...@linaro.org --- include/linux/ceph/osd_client.h |9 + include/linux/ceph/rados.h |8

Re: [PATCH 0/6] libceph: CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-24 Thread Alex Elder
On 02/21/2014 12:55 PM, Ilya Dryomov wrote: Hello, This series adds support for CEPH_OSD_OP_SETALLOCHINT osd op to libceph along with adjusting rbd to make use of it. The rationale and the basic desing was outlined in the rados io hints thread on ceph-devel about a month ago. It looks

Re: [PATCH 3/6] libceph: bump CEPH_OSD_MAX_OP to 3

2014-02-24 Thread Alex Elder
the assertion). That being said, I appreciate your breaking up the functionality into logically separate changes, it's important. So I guess, do what you like... Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/osd_client.h |2

Re: [PATCH 5/6] rbd: num_ops parameter for rbd_osd_req_create()

2014-02-24 Thread Alex Elder
to be in the callers. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com Looks good. Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c | 28 ++-- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/drivers/block/rbd.c b/drivers

Re: [PATCH 4/6] rbd: do not hard-code CEPH_OSD_MAX_OP in rbd_osd_req_callback()

2014-02-24 Thread Alex Elder
On 02/21/2014 12:55 PM, Ilya Dryomov wrote: CEPH_OSD_MAX_OP value in rbd_osd_req_callback() is hard-coded to 2. Fix it. Please squash this in with the previous patch (at least). Change the BUG_ON() to rbd_assert() while you're at it (and invert the logic). Reviewed-by: Alex Elder el

Re: [PATCH 6/6] rbd: prefix rbd writes with CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-24 Thread Alex Elder
On 02/21/2014 12:55 PM, Ilya Dryomov wrote: In an effort to reduce fragmentation, prefix every rbd write with a CEPH_OSD_OP_SETALLOCHINT osd op with an expected_write_size value set to the object size (1 order). Backwards compatibility is taken care of on the libceph/osd side. If *every*

Re: [PATCH 2/6] libceph: add support for CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-25 Thread Alex Elder
On 02/25/2014 06:52 AM, Ilya Dryomov wrote: On Mon, Feb 24, 2014 at 4:59 PM, Alex Elder el...@ieee.org wrote: On 02/21/2014 12:55 PM, Ilya Dryomov wrote: This is primarily for rbd's benefit and is supposed to combat fragmentation: ... knowing that rbd images have a 4m size, librbd can pass

Re: [PATCH 6/6] rbd: prefix rbd writes with CEPH_OSD_OP_SETALLOCHINT osd op

2014-02-25 Thread Alex Elder
On 02/25/2014 06:58 AM, Ilya Dryomov wrote: On Mon, Feb 24, 2014 at 4:59 PM, Alex Elder el...@ieee.org wrote: On 02/21/2014 12:55 PM, Ilya Dryomov wrote: In an effort to reduce fragmentation, prefix every rbd write with a CEPH_OSD_OP_SETALLOCHINT osd op with an expected_write_size value set

Re: [PATCH] ceph: fix ceph_dir_llseek()

2014-02-28 Thread Alex Elder
an llseek on a directory that ended up in any fragment but the first. I have one question/suggestion below, but otherwise: Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Yan, Zheng zheng.z@intel.com --- fs/ceph/dir.c | 12 ++-- fs/ceph/super.h | 2 +- 2 files changed, 7

Re: [PATCH] ceph: fix reset_readdir()

2014-02-28 Thread Alex Elder
-by: Alex Elder el...@linaro.org Signed-off-by: Yan, Zheng zheng.z@intel.com --- fs/ceph/dir.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index a7eaf96..8ce8833 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -454,7 +454,7

Re: [PATCH 1/3] rbd: skip the copyup when an entire object writing

2014-03-11 Thread Alex Elder
looks good but I have something for you to consider, below. Either way: Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Guangliang Zhao lucienc...@gmail.com --- drivers/block/rbd.c |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/block/rbd.c b

Re: [PATCH 2/3] rbd: extend the operation type

2014-03-11 Thread Alex Elder
On 03/11/2014 11:24 PM, Guangliang Zhao wrote: It could only handle the read and write operations now, extend it for the coming discard support. OK, here too I think this looks good, but please consider the comments I've provided below. Reviewed-by: Alex Elder el...@linaro.org Signed-off

Re: [PATCH 3/3] rbd: add discard support for rbd

2014-03-11 Thread Alex Elder
case, this looks nice to me. It's another case where adding the feature seems not that intrusive, and fits within the existing framework pretty well. There is a bug below, which you should fix. I have a few other comments too, for you to consider. Reviewed-by: Alex Elder el...@linaro.org Signed

Re: [PATCH 0/3] patches for rbd

2014-03-11 Thread Alex Elder
On 03/11/2014 11:24 PM, Guangliang Zhao wrote: Hi, I am sorry that didn't notice Jean-Tiare's mail at all, just sage's reminds me. There are 3 patches, the first is optimization for rbd writing, the remaining two are support for rbd discard. If you have completed the discard things,

Re: [PATCH 1/3 v2] rbd: skip the copyup when an entire object writing

2014-03-12 Thread Alex Elder
, this looks good. Reviewed-by: Alex Elder el...@linaro.org { struct rbd_img_request *img_request; struct rbd_device *rbd_dev; - bool known; + u64 obj_size; rbd_assert(obj_request_img_data_test(obj_request)); img_request = obj_request-img_request

Re: [PATCH 2/3 v2] rbd: extend the operation type

2014-03-12 Thread Alex Elder
On 03/12/2014 10:21 PM, Guangliang Zhao wrote: It could only handle the read and write operations now, extend it for the coming discard support. Wow, it looks like you took all of my suggestions. This looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Guangliang Zhao

Re: [PATCH 3/3 v2] rbd: add discard support for rbd

2014-03-12 Thread Alex Elder
: http://tracker.ceph.com/issues/190 Signed-off-by: Guangliang Zhao lucienc...@gmail.com Looks good. Reviewed-by: Alex Elder el...@linaro.org -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info

Re: [PATCH] libceph: fix oops in ceph_msg_data_pagelist_advance()

2014-03-22 Thread Alex Elder
On 03/22/2014 05:54 PM, Yan, Zheng wrote: When there is no more data, ceph_msg_data_pagelist_advance() should not move on to the next page. Without looking very hard at this, this looks right. Does ceph_msg_data_pages_advance() need the same fix? And ceph_msg_data_bio_advance()? I'm going to

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 04:04 AM, Ilya Dryomov wrote: On Tue, Mar 25, 2014 at 10:39 AM, Olivier Bonvalet ceph.l...@daevel.fr wrote: Hi, what can/should I do to help fix that problem ? for now, RBD kernel client hang on : Assertion failure in rbd_img_obj_callback() at line 2131:

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 07:34 AM, Ilya Dryomov wrote: On 03/25/2014 04:04 AM, Ilya Dryomov wrote: On Tue, Mar 25, 2014 at 10:39 AM, Olivier Bonvalet ceph.l...@daevel.fr wrote: Hi, what can/should I do to help fix that problem ? for now, RBD kernel client hang on : Assertion failure in

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 08:18 AM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 14:57 +0200, Ilya Dryomov a écrit : On Tue, Mar 25, 2014 at 2:51 PM, Alex Elder el...@ieee.org wrote: On 03/25/2014 07:34 AM, Ilya Dryomov wrote: On 03/25/2014 04:04 AM, Ilya Dryomov wrote: On Tue, Mar 25, 2014 at 10

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
... So, a (partial) fix can be this patch ? --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -2123,6 +2123,7 @@ static void rbd_img_obj_callback(struct rbd_obj_request *obj_request) rbd_assert(obj_request_img_data_test(obj_request)); img_request =

Re: [PATCH] rbd: assert next_completion under completion_lock

2014-03-25 Thread Alex Elder
On 03/25/2014 08:47 AM, Ilya Dryomov wrote: completion_lock is there specifically to protect next_completion. Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com Reviewed-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 12:15 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 08:31 -0500, Alex Elder a écrit : ... So, a (partial) fix can be this patch ? Yes, roughly. I'd do the following instead. It would be great to learn whether it eliminates the one form of assertion failure you were

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
Please try applying this, on top of the previous patch. If you can then reproduce the problem we'll have a bunch of new information about the particular request that's leading to the failure. That might tell us what more we can do to find the root cause. Thank you.

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 01:53 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit : Please try applying this, on top of the previous patch. If you can then reproduce the problem we'll have a bunch of new information about the particular request that's leading

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 03:21 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 22:18 +0200, Ilya Dryomov a écrit : On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder el...@ieee.org wrote: On 03/25/2014 01:53 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 12:43 -0500, Alex Elder a écrit : Please try

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 03:21 PM, Olivier Bonvalet wrote: Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information: Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request 88025a2b3c48 Mar 25 21:17:58 murmillia kernel: [ 2205.255940]

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
Olivier, it appears this is a layered image, i.e., the image is a clone of another. Can you tell us any more about the way these images are organized? Do you have one master image and others are based on that? Is there more than one layer to the organization? (Do these questions make sense?)

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 05:17 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 22:54 +0100, Olivier Bonvalet a écrit : Le mardi 25 mars 2014 à 23:49 +0200, Ilya Dryomov a écrit : On Tue, Mar 25, 2014 at 11:41 PM, Olivier Bonvalet ceph.l...@daevel.fr wrote: mmm the cluster seems to be in a really

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 06:04 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 17:46 -0500, Alex Elder a écrit : On 03/25/2014 05:17 PM, Olivier Bonvalet wrote: I now have this one very often (here 5 minutes after the host boot) : I am fairly sure this indicates a use-after-free scenario, likely

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 08:50 PM, Olivier Bonvalet wrote: Le mercredi 26 mars 2014 à 02:33 +0100, Olivier Bonvalet a écrit : Thanks for your patch. This is an output of a crash case : Mar 26 02:31:18 alg kernel: [ 965.366895] rbd_img_obj_callback: bad image object request information: Mar 26

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 09:40 PM, Olivier Bonvalet wrote: Thanks again to took time to analyze that problem. All my RBD images have daily snapshots, can this bug be related to snapshots ? Maybe it's a stupid question, but is there a workaround that I could use to reduce that problem in production,

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
On 03/25/2014 09:45 PM, Olivier Bonvalet wrote: Le mardi 25 mars 2014 à 21:42 -0500, Alex Elder a écrit : PS I thought you said you were going to stop for the night! Yes, I would love ! But my phone doesn't stop ring about ceph crash :D OK, one more thing to try and I'm going to bed. I'm

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-25 Thread Alex Elder
I think I've got it. I'll explain, and we'll have to look at the explanation more closely in the morning. Bottom line is that I think the assertion is bogus. Like the other assertion that was not done under protection of a lock, this one is being done in a way that's not safe. First, here's a

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-26 Thread Alex Elder
On 03/26/2014 12:00 AM, Alex Elder wrote: I think I've got it. I'll explain, and we'll have to look at the explanation more closely in the morning. Bottom line is that I think the assertion is bogus. Like the other assertion that was not done under protection of a lock, this one is being

Re: Issue #5876 : assertion failure in rbd_img_obj_callback()

2014-03-26 Thread Alex Elder
On 03/26/2014 06:43 AM, Ilya Dryomov wrote: It looks like img_request kref currenlty exists for posterity only. Unless I'm missing something, its counter is set to 1 in rbd_img_request_create() and is not incremented anywhere else, which means that the instant rbd_img_request_put() is called,

[PATCH] rbd: drop an unsafe assertion

2014-03-26 Thread Alex Elder
-by: Olivier Bonvalet o...@daevel.fr Signed-off-by: Alex Elder el...@linaro.org --- drivers/block/rbd.c |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index f044fab..4c95b50 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c

Re: [PATCH 01/33] libceph: refer to osdmap directly in osdmap_show()

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: To make it more readable and save screen space. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/debugfs.c | 26 ++ 1 file changed, 14 insertions

Re: [PATCH 02/33] libceph: do not prefix osd lines with \t in debugfs output

2014-03-27 Thread Alex Elder
into a smaller set of patches. Reviewed-by: Alex Elder el...@linaro.org --- net/ceph/debugfs.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c index d225842c7b41..112d98edb156 100644 --- a/net/ceph/debugfs.c +++ b/net/ceph/debugfs.c

Re: [PATCH 04/33] libceph: dump osdmap and enhance output on decode errors

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: Dump osdmap in hex on both full and incremental decode errors, to make it easier to match the contents with error offset. dout() map epoch and max_osd value on success. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya

Re: [PATCH 05/33] libceph: split osdmap allocation and decode steps

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: Split osdmap allocation and initialization into a separate function, ceph_osdmap_decode(). Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/osdmap.h |2 +- net

Re: [PATCH 03/33] libceph: dump pg_temp mappings to debugfs

2014-03-27 Thread Alex Elder
looks good. Reviewed-by: Alex Elder el...@linaro.org --- net/ceph/debugfs.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c index 112d98edb156..c45d235e774e 100644 --- a/net/ceph/debugfs.c +++ b/net/ceph/debugfs.c @@ -82,6 +82,17

Re: [PATCH 06/33] libceph: fixup error handling in osdmap_decode()

2014-03-27 Thread Alex Elder
a special e_inval label to be used by all ceph_decode_* macros. I don't see where it's returning 0 on error, but I think this is a good change anyway. I'd use einval or err_inval instead of e_inval. But no matter. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya

Re: [PATCH 07/33] libceph: safely decode max_osd value in osdmap_decode()

2014-03-27 Thread Alex Elder
about to go decode... Anyway, this is the right thing to do. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/osdmap.c |6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ceph/osdmap.c b/net

Re: [PATCH 08/33] libceph: assert length of osdmap osd arrays

2014-03-27 Thread Alex Elder
: Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/osdmap.c |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c index ec06010657b3..19aca4d3c5dd 100644 --- a/net/ceph

Re: [PATCH 09/33] libceph: fix crush_decode() call site in osdmap_decode()

2014-03-27 Thread Alex Elder
argument and advance the pointer by as much as it uses (like most of the other routines do). That's a suggestion, but I don't really care, this is fine. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/osdmap.c |7 ++- 1 file

Re: [PATCH 10/33] libceph: fixup error handling in osdmap_apply_incremental()

2014-03-27 Thread Alex Elder
osdmap_decode() and fix this by adding a special e_inval label to be used by all ceph_decode_* macros. Same comments as last time. Otherwise, looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/osdmap.c | 66

Re: [PATCH 11/33] libceph: nuke bogus encoding version check in osdmap_apply_incremental()

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: Only version 6 of osdmap encoding is supported, anything other than version 6 results in an error and halts the decoding process. Checking if version is = 5 is therefore bogus. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off

Re: [PATCH 12/33] libceph: fix and clarify ceph_decode_need() sizes

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: Sum up sizeof(...) results instead of (incorrectly) hard-coding the number of bytes, expressed in ints and longs. Yay!!! Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph

Re: [PATCH 13/33] libceph: rename __decode_pool{,_names}() to decode_pool{,_names}()

2014-03-27 Thread Alex Elder
On 03/27/2014 01:17 PM, Ilya Dryomov wrote: To be in line with all the other osdmap decode helpers. I wouldn't object to folding this into another patch, it doesn't change anything functionally. Looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo

Re: [PATCH 14/33] libceph: introduce decode{,_new}_pools() and switch to them

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Consolidate pools (full map, mapu64, pg_pool_t) and new_pools (inc map, same) decoding logic into a common helper and switch to it. Nice refactoring. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com

Re: [PATCH 15/33] libceph: switch osdmap_set_max_osd() to krealloc()

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Use krealloc() instead of rolling our own. (krealloc() with a NULL first argument acts as a kmalloc()). Properly initalize the new array elements. This is needed to make future additions to osdmap easier. Looks good. Reviewed-by: Alex Elder el

Re: [PATCH 16/33] libceph: introduce decode{,_new}_pg_temp() and switch to them

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Consolidate pg_temp (full map, mappg_t, vectoru32) and new_pg_temp (inc map, same) decoding logic into a common helper and switch to it. Again, it's nice to see this kind of refactoring being done. Looks good. Reviewed-by: Alex Elder el

Re: [PATCH 17/33] libceph: introduce get_osdmap_client_data_v()

2014-03-27 Thread Alex Elder
clear to me at first that this was adding a new bit of functionality--support for v7 OSD map encodings. A couple comments below but this looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- net/ceph/osdmap.c | 81

Re: [PATCH 19/33] libceph: primary_temp infrastructure

2014-03-27 Thread Alex Elder
/client/osdmap, one 'primary_temp pgid osd' per line, e.g: primary_temp 2.6 4 So this just sets up the infrastructure, but doesn't use it yet. OK. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com --- include/linux/ceph/osdmap.h |5

Re: [PATCH 20/33] libceph: primary_temp decode bits

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Add a common helper to decode both primary_temp (full map, mappg_t, u32) and new_primary_temp (inc map, same) and switch to it. The code looks reasonable. I'll have to assume it's doing the decoding properly. Reviewed-by: Alex Elder el...@linaro.org

Re: [PATCH 22/33] libceph: primary_affinity decode bits

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Add two helpers to decode primary_affinity (full map, vectoru32) and new_primary_affinity (inc map, mapu32, u32) and switch to them. One comment below, but otherwise looks good. Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov

Re: [PATCH 23/33] libceph: enable OSDMAP_ENC feature bit

2014-03-27 Thread Alex Elder
On 03/27/2014 01:18 PM, Ilya Dryomov wrote: Announce our support for new osdmap enconding. Looks OK to me. Isn't there a version of this OSD map encoding? Maybe there'll be a newer one someday? Reviewed-by: Alex Elder el...@linaro.org Signed-off-by: Ilya Dryomov ilya.dryo...@inktank.com

  1   2   3   4   5   6   7   8   9   10   >