Re: [PATCH 2/2] block-rbd: One function call less in rbd_dev_probe_parent() after error detection
On Tue, Nov 24, 2015 at 09:21:06PM +0100, Ilya Dryomov wrote: > >> Cleanup here is (and should be) done in reverse order. > > Yes. This is true. > > I have got an other impression about the appropriate order for the > > corresponding > > clean-up function calls. > > > > > >> We allocate parent rbd_device and then link it with what we already have, > > > > I guess that we have got a different understanding about the relevant > > "linking". > > Well, there isn't any _literal_ linking (e.g. adding to a link list, > etc) in this case. We just bump some refs and do probe to fill in the > newly allocated parent. If probe fails, we put refs and free parent, > reversing the "alloc parent, bump refs" order. > > The actual linking (rbd_dev->parent = parent) is done right before > returning so we never have to undo it in rbd_dev_probe_parent() and > that's the only reason your patch probably doesn't break anything. > Think about what happens if, after your patch is applied, someone moves > that assignment up or adds an extra step that can fail after it... > The problem is that the unwind code should be a mirror of the allocate code but rbd_dev_unparent() doesn't mirror anything. Generally, writing future proof stubs like this is a wrong thing because predicting the future is hard and in the mean time we are left stubs which confuse everyone. > If all error paths could be adjusted so that NULL pointers are never > passed in, destroy functions wouldn't need to have a NULL check, would > they? Yep. We agree on the right way to do it. I am probably the number one kernel developer for removing the most sanity checks. :P (As opposed to patch 1/1 where we now rely on the sanity check inside rbd_dev_destroy().) drivers/block/rbd.c 5149 static int rbd_dev_probe_parent(struct rbd_device *rbd_dev, int depth) 5150 { 5151 struct rbd_device *parent = NULL; 5152 int ret; 5153 5154 if (!rbd_dev->parent_spec) 5155 return 0; 5156 5157 if (++depth > RBD_MAX_PARENT_CHAIN_LEN) { 5158 pr_info("parent chain is too long (%d)\n", depth); 5159 ret = -EINVAL; 5160 goto out_err; We haven't allocated anything so this should just be return -EINVAL; In the original code, we decrement the kref count on ->parent_spec on this error path so that is a classic One Err Bug. 5161 } 5162 5163 parent = rbd_dev_create(rbd_dev->rbd_client, rbd_dev->parent_spec, 5164 NULL); 5165 if (!parent) { 5166 ret = -ENOMEM; 5167 goto out_err; Still haven't allocated anything so return -ENOMEM, but if we fail after this point we will need to call rbd_dev_destroy(). 5168 } 5169 5170 /* 5171 * Images related by parent/child relationships always share 5172 * rbd_client and spec/parent_spec, so bump their refcounts. 5173 */ 5174 __rbd_get_client(rbd_dev->rbd_client); 5175 rbd_spec_get(rbd_dev->parent_spec); We will need to put these on any later error paths. 5176 5177 ret = rbd_dev_image_probe(parent, depth); 5178 if (ret < 0) 5179 goto out_err; Ok. We need to put the ->parent_spec, ->rbd_client and free the parent. 5180 5181 rbd_dev->parent = parent; 5182 atomic_set(_dev->parent_ref, 1); 5183 return 0; 5184 5185 out_err: 5186 rbd_dev_unparent(rbd_dev); This is a complicated way to say rbd_spec_put(rbd_dev->parent_spec); Also, is it really necessary to set ->parent_spec to NULL? If we didn't put the last reference then doesn't setting it to NULL mean we are leaking? Setting it to NULL is confusing and feels like a layering violation. 5187 if (parent) 5188 rbd_dev_destroy(parent); 5189 return ret; 5190 } I feel like we should be calling rbd_put_client() on this error path or else the code is buggy or has layer violations. So I *think* it should look like this: dec_ref_counts: rbd_spec_put(rbd_dev->parent_spec); rbd_put_client(rbd_dev->rbd_client); rbd_dev_destroy(parent); return ret; regards, dan carpenter -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] block-rbd: One function call less in rbd_dev_probe_parent() after error detection
On Wed, Nov 25, 2015 at 12:55 PM, Dan Carpenterwrote: > On Tue, Nov 24, 2015 at 09:21:06PM +0100, Ilya Dryomov wrote: >> >> Cleanup here is (and should be) done in reverse order. >> > > > Yes. This is true. > >> > I have got an other impression about the appropriate order for the >> > corresponding >> > clean-up function calls. >> > >> > >> >> We allocate parent rbd_device and then link it with what we already have, >> > >> > I guess that we have got a different understanding about the relevant >> > "linking". >> >> Well, there isn't any _literal_ linking (e.g. adding to a link list, >> etc) in this case. We just bump some refs and do probe to fill in the >> newly allocated parent. If probe fails, we put refs and free parent, >> reversing the "alloc parent, bump refs" order. >> >> The actual linking (rbd_dev->parent = parent) is done right before >> returning so we never have to undo it in rbd_dev_probe_parent() and >> that's the only reason your patch probably doesn't break anything. >> Think about what happens if, after your patch is applied, someone moves >> that assignment up or adds an extra step that can fail after it... >> > > The problem is that the unwind code should be a mirror of the allocate > code but rbd_dev_unparent() doesn't mirror anything. Generally, writing > future proof stubs like this is a wrong thing because predicting the > future is hard and in the mean time we are left stubs which confuse > everyone. It's not a future proof stub. It's just some crufty code that was fixed over time to not leak things. I won't defend it - it is confusing and could definitely be improved - but that can't be done without refactoring a fair bunch of calling code. A patch changing rbd_dev_probe_parent() alone just won't do it. > >> If all error paths could be adjusted so that NULL pointers are never >> passed in, destroy functions wouldn't need to have a NULL check, would >> they? > > Yep. We agree on the right way to do it. I am probably the number one > kernel developer for removing the most sanity checks. :P (As opposed > to patch 1/1 where we now rely on the sanity check inside > rbd_dev_destroy().) > > drivers/block/rbd.c > 5149 static int rbd_dev_probe_parent(struct rbd_device *rbd_dev, int depth) > 5150 { > 5151 struct rbd_device *parent = NULL; > 5152 int ret; > 5153 > 5154 if (!rbd_dev->parent_spec) > 5155 return 0; > 5156 > 5157 if (++depth > RBD_MAX_PARENT_CHAIN_LEN) { > 5158 pr_info("parent chain is too long (%d)\n", depth); > 5159 ret = -EINVAL; > 5160 goto out_err; > > We haven't allocated anything so this should just be return -EINVAL; > In the original code, we decrement the kref count on ->parent_spec on > this error path so that is a classic One Err Bug. The caller expects rbd_dev->parent_spec to be put on any error. Notice that we return right away if !rbd_dev->parent_spec. > > 5161 } > 5162 > 5163 parent = rbd_dev_create(rbd_dev->rbd_client, > rbd_dev->parent_spec, > 5164 NULL); > 5165 if (!parent) { > 5166 ret = -ENOMEM; > 5167 goto out_err; > > Still haven't allocated anything so return -ENOMEM, but if we fail after > this point we will need to call rbd_dev_destroy(). > > 5168 } > 5169 > 5170 /* > 5171 * Images related by parent/child relationships always share > 5172 * rbd_client and spec/parent_spec, so bump their refcounts. > 5173 */ > 5174 __rbd_get_client(rbd_dev->rbd_client); > 5175 rbd_spec_get(rbd_dev->parent_spec); > > We will need to put these on any later error paths. And we do, in rbd_dev_destroy(parent), since these are references for the parent. > > 5176 > 5177 ret = rbd_dev_image_probe(parent, depth); > 5178 if (ret < 0) > 5179 goto out_err; > > Ok. We need to put the ->parent_spec, ->rbd_client and free the parent. > > 5180 > 5181 rbd_dev->parent = parent; > 5182 atomic_set(_dev->parent_ref, 1); > 5183 return 0; > 5184 > 5185 out_err: > 5186 rbd_dev_unparent(rbd_dev); > > This is a complicated way to say rbd_spec_put(rbd_dev->parent_spec); > > Also, is it really necessary to set ->parent_spec to NULL? If we didn't > put the last reference then doesn't setting it to NULL mean we are > leaking? Setting it to NULL is confusing and feels like a layering > violation. Yes, because as it is, ->parent_spec is a determinant of whether or not the image has a parent. If we fail in rbd_dev_probe_parent(), it needs to be set to NULL to signify that the image doesn't have a parent. Even if the entire thing was refactored, we'd still have to do the same because not every image has a parent and the same error path has to
Re: [PATCH 2/2] block-rbd: One function call less in rbd_dev_probe_parent() after error detection
>> @@ -5157,14 +5157,14 @@ static int rbd_dev_probe_parent(struct rbd_device >> *rbd_dev, int depth) >> if (++depth > RBD_MAX_PARENT_CHAIN_LEN) { >> pr_info("parent chain is too long (%d)\n", depth); >> ret = -EINVAL; >> - goto out_err; >> + goto unparent_device; >> } >> >> parent = rbd_dev_create(rbd_dev->rbd_client, rbd_dev->parent_spec, >> NULL); >> if (!parent) { >> ret = -ENOMEM; >> - goto out_err; >> + goto unparent_device; >> } >> >> /* >> @@ -5176,15 +5176,15 @@ static int rbd_dev_probe_parent(struct rbd_device >> *rbd_dev, int depth) >> >> ret = rbd_dev_image_probe(parent, depth); >> if (ret < 0) >> - goto out_err; >> + goto destroy_device; >> >> rbd_dev->parent = parent; >> atomic_set(_dev->parent_ref, 1); >> return 0; >> - >> -out_err: >> - rbd_dev_unparent(rbd_dev); >> +destroy_device: >> rbd_dev_destroy(parent); >> +unparent_device: >> + rbd_dev_unparent(rbd_dev); >> return ret; >> } > > Cleanup here is (and should be) done in reverse order. I have got an other impression about the appropriate order for the corresponding clean-up function calls. > We allocate parent rbd_device and then link it with what we already have, I guess that we have got a different understanding about the relevant "linking". > so the order in which we cleanup is unlink ("unparent"), destroy. I interpreted the eventual passing of a null pointer to the rbd_dev_destroy() function as an indication for further source code adjustments. > Changing it is just asking for use-after-free bugs. Do the affected implementation details need a bit more clarification? Regards, Markus -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html