Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Sat, Dec 12 2015, Al Viro wrote: > On Fri, Dec 11, 2015 at 11:16:24PM +, Al Viro wrote: > >> #define set_delayed_type(call, f, arg) \ >> sizeof(f(arg),0), \ >> __set_delayed_type(call, __closure_##f, (void *)arg) >> >> That could be reused for timers with typechecking - we have a lot of timer >> callbacks that start with casting the argument (unsigned long, not void *, >> but that's not a big deal) to whatever it is that callback really wants, >> with setup_timer() callers explicitly casting that whatever the callback >> really wants to unsigned long. Which, of course, defeats the typechecking >> by both cc(1) and sparse(1)... >> >> I still hope for better solution, though... Comments? > > Hmm... > > #define set_delayed_type(call, f, arg) > \ > (sizeof(f(arg),0), \ > __set_delayed_type(call, ({ \ > void _(void *__) {f((typeof(arg))(unsigned long)__);} \ > _;}), (void *)arg)) > > woult do nicely, but it's a gccism, and the one clang doesn't support > ;-/ Careful. While this may appear to work, I'm pretty sure it would break horribly when f is not the actual name of a function but a function pointer. Not that it won't compile, but it would presumably make gcc emit what they call a 'trampoline', a small piece of executable code located on the stack, allowing the stub to access the containing function's local variables (namely, f). I don't know if kernel stacks are even executable, but even then it's a bad idea, once the stack frame containing the trampoline is gone. All of this could easily happen down the line when someone sees a bunch of similar code and decides to factor that into a helper(), and then it works whenever gcc decides to inline the helper to both call sites and otherwise, as the gcc docs put it, "all hell breaks loose". IOWs, an accident waiting to happen. One could probably ensure that the macro is always used with a compile/link-time constant by putting in some trick like static void *link_time_constant_please = f; but even then I wouldn't be completely sure that gcc would always emit a stub as simple as the one you showed. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Sat, Dec 12 2015, Al Virowrote: > On Fri, Dec 11, 2015 at 11:16:24PM +, Al Viro wrote: > >> #define set_delayed_type(call, f, arg) \ >> sizeof(f(arg),0), \ >> __set_delayed_type(call, __closure_##f, (void *)arg) >> >> That could be reused for timers with typechecking - we have a lot of timer >> callbacks that start with casting the argument (unsigned long, not void *, >> but that's not a big deal) to whatever it is that callback really wants, >> with setup_timer() callers explicitly casting that whatever the callback >> really wants to unsigned long. Which, of course, defeats the typechecking >> by both cc(1) and sparse(1)... >> >> I still hope for better solution, though... Comments? > > Hmm... > > #define set_delayed_type(call, f, arg) > \ > (sizeof(f(arg),0), \ > __set_delayed_type(call, ({ \ > void _(void *__) {f((typeof(arg))(unsigned long)__);} \ > _;}), (void *)arg)) > > woult do nicely, but it's a gccism, and the one clang doesn't support > ;-/ Careful. While this may appear to work, I'm pretty sure it would break horribly when f is not the actual name of a function but a function pointer. Not that it won't compile, but it would presumably make gcc emit what they call a 'trampoline', a small piece of executable code located on the stack, allowing the stub to access the containing function's local variables (namely, f). I don't know if kernel stacks are even executable, but even then it's a bad idea, once the stack frame containing the trampoline is gone. All of this could easily happen down the line when someone sees a bunch of similar code and decides to factor that into a helper(), and then it works whenever gcc decides to inline the helper to both call sites and otherwise, as the gcc docs put it, "all hell breaks loose". IOWs, an accident waiting to happen. One could probably ensure that the macro is always used with a compile/link-time constant by putting in some trick like static void *link_time_constant_please = f; but even then I wouldn't be completely sure that gcc would always emit a stub as simple as the one you showed. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 01:54:25AM +, Al Viro wrote: > BTW, why are we passing unsigned long to free_page()? We have > a bit under 700 callers; excluding the ones that have an explicit cast > to unsigned long right in the argument of call leaves ~150, and the rest > tend to contain a lot of pointer casts of unsigned long thing they are feeding > to free_page() (IOW, they would be just as happy if they kept it as a pointer > all along). Sure, that would mean __get_free_page() et.al. returning void *, > but I don't see any problems with that either... Is that just for historical > reasons, or is there anything more subtle I'm missing here? The situation with free_pages() is even funnier - we have 313 call sites, and after converting it to void(void *, unsigned) 31 of them need casts to void *. Right now the mainline has 249 of those call sites with cast to unsigned long (or ulong, in several places). Then there's a bunch of places where we do __get_free_pages(), then use it a lot (all with casts to pointers), then pass to free_pages() - those would be just fine with storing it as void *, but even leaving those aside... The current signature is contrary to actual use - nearly 80% of call sites are forced to cast a pointer to unsigned long, only to have it cast back to void * in free_pages() itself, we would obviously be better off if we'd switched to just passing the damn thing as a pointer. Especially since that 80% turn into 90% once you add the callers that could easily switch to storing the value eventually passed to free_pages() as a pointer. And free_page() is basically the same story, only with twice as many call sites... While we are at it: __get_free_pages() has 238 call sites. 193 of them immediately cast to pointer. And there's a bunch of places like this: static inline void __tlb_alloc_page(struct mmu_gather *tlb) { unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0); if (addr) { tlb->pages = (void *)addr; tlb->max = PAGE_SIZE / sizeof(struct page *); } } where the cast is not immediate, but might as well had been. And conversion of free_pages() whittles it down even more. For __get_free_page() the situation is about the same, ditto for get_zeroed_page(). I realize that get_free_page() has been returning unsigned long since 0.01, but looking at 0.01... it might as well had been returning void * - wouldn't be more inconvenient. The kludge you had in get_pipe_inode() would've been a bit more obviously wrong, but that's about it ;-) Seriously, though - what do you think about a flagday commit right after 4.5-rc1 switching all those guys to void *? For __get_user_pages(), __get_user_page(), get_zeroed_page() - return values, for free_pages(), free_page() - argument. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 01:54:25AM +, Al Viro wrote: > BTW, why are we passing unsigned long to free_page()? We have > a bit under 700 callers; excluding the ones that have an explicit cast > to unsigned long right in the argument of call leaves ~150, and the rest > tend to contain a lot of pointer casts of unsigned long thing they are feeding > to free_page() (IOW, they would be just as happy if they kept it as a pointer > all along). Sure, that would mean __get_free_page() et.al. returning void *, > but I don't see any problems with that either... Is that just for historical > reasons, or is there anything more subtle I'm missing here? The situation with free_pages() is even funnier - we have 313 call sites, and after converting it to void(void *, unsigned) 31 of them need casts to void *. Right now the mainline has 249 of those call sites with cast to unsigned long (or ulong, in several places). Then there's a bunch of places where we do __get_free_pages(), then use it a lot (all with casts to pointers), then pass to free_pages() - those would be just fine with storing it as void *, but even leaving those aside... The current signature is contrary to actual use - nearly 80% of call sites are forced to cast a pointer to unsigned long, only to have it cast back to void * in free_pages() itself, we would obviously be better off if we'd switched to just passing the damn thing as a pointer. Especially since that 80% turn into 90% once you add the callers that could easily switch to storing the value eventually passed to free_pages() as a pointer. And free_page() is basically the same story, only with twice as many call sites... While we are at it: __get_free_pages() has 238 call sites. 193 of them immediately cast to pointer. And there's a bunch of places like this: static inline void __tlb_alloc_page(struct mmu_gather *tlb) { unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0); if (addr) { tlb->pages = (void *)addr; tlb->max = PAGE_SIZE / sizeof(struct page *); } } where the cast is not immediate, but might as well had been. And conversion of free_pages() whittles it down even more. For __get_free_page() the situation is about the same, ditto for get_zeroed_page(). I realize that get_free_page() has been returning unsigned long since 0.01, but looking at 0.01... it might as well had been returning void * - wouldn't be more inconvenient. The kludge you had in get_pipe_inode() would've been a bit more obviously wrong, but that's about it ;-) Seriously, though - what do you think about a flagday commit right after 4.5-rc1 switching all those guys to void *? For __get_user_pages(), __get_user_page(), get_zeroed_page() - return values, for free_pages(), free_page() - argument. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 11:16:24PM +, Al Viro wrote: > #define set_delayed_type(call, f, arg) \ > sizeof(f(arg),0), \ > __set_delayed_type(call, __closure_##f, (void *)arg) > > That could be reused for timers with typechecking - we have a lot of timer > callbacks that start with casting the argument (unsigned long, not void *, > but that's not a big deal) to whatever it is that callback really wants, > with setup_timer() callers explicitly casting that whatever the callback > really wants to unsigned long. Which, of course, defeats the typechecking > by both cc(1) and sparse(1)... > > I still hope for better solution, though... Comments? Hmm... #define set_delayed_type(call, f, arg) \ (sizeof(f(arg),0), \ __set_delayed_type(call, ({ \ void _(void *__) {f((typeof(arg))(unsigned long)__);} \ _;}), (void *)arg)) woult do nicely, but it's a gccism, and the one clang doesn't support ;-/ gcc support goes back at least to 4.1, not sure about the earlier branches (and not at all sure how flaky it was/is, obviously). But this thing works for e.g. f being void(char) and arg being char, and AFAICS generates correct code - we end up with a stub like _.1765: movsbl %dil, %edi jmp f and its address gets stored in call->fn, which should DTRT on call... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 08:49:24AM +0100, Rasmus Villemoes wrote: > union delayed_call_fn { > void (*fn)(void *); > void (*kfree_like)(const void *); > } __attribute__((__transparent_union__)); > > void > set_delayed_call(struct delayed_call *call, union delayed_call_fn u, void > *arg) > { > call->fn = u.fn; > call->arg = arg; > } Yecc... If we are into that kind of gccisms, I'd rather use __builtin_choose_expr/__builtin_types_compatible_p. At least that is used elsewhere in the kernel; __transparent_union__ kludge isn't. Sure, it means having set_delayed_call() a macro, but IMO it's less nasty that way... FWIW, another possibility is to have #define CLOSURE_CALLBACK(f,type) \ static inline void __closure_##f(void *p) {return f((type)p);} #define set_delayed_type(call, f, arg) \ sizeof(f(arg),0), \ __set_delayed_type(call, __closure_##f, (void *)arg) That could be reused for timers with typechecking - we have a lot of timer callbacks that start with casting the argument (unsigned long, not void *, but that's not a big deal) to whatever it is that callback really wants, with setup_timer() callers explicitly casting that whatever the callback really wants to unsigned long. Which, of course, defeats the typechecking by both cc(1) and sparse(1)... I still hope for better solution, though... Comments? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 08:49:24AM +0100, Rasmus Villemoes wrote: > union delayed_call_fn { > void (*fn)(void *); > void (*kfree_like)(const void *); > } __attribute__((__transparent_union__)); > > void > set_delayed_call(struct delayed_call *call, union delayed_call_fn u, void > *arg) > { > call->fn = u.fn; > call->arg = arg; > } Yecc... If we are into that kind of gccisms, I'd rather use __builtin_choose_expr/__builtin_types_compatible_p. At least that is used elsewhere in the kernel; __transparent_union__ kludge isn't. Sure, it means having set_delayed_call() a macro, but IMO it's less nasty that way... FWIW, another possibility is to have #define CLOSURE_CALLBACK(f,type) \ static inline void __closure_##f(void *p) {return f((type)p);} #define set_delayed_type(call, f, arg) \ sizeof(f(arg),0), \ __set_delayed_type(call, __closure_##f, (void *)arg) That could be reused for timers with typechecking - we have a lot of timer callbacks that start with casting the argument (unsigned long, not void *, but that's not a big deal) to whatever it is that callback really wants, with setup_timer() callers explicitly casting that whatever the callback really wants to unsigned long. Which, of course, defeats the typechecking by both cc(1) and sparse(1)... I still hope for better solution, though... Comments? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11, 2015 at 11:16:24PM +, Al Viro wrote: > #define set_delayed_type(call, f, arg) \ > sizeof(f(arg),0), \ > __set_delayed_type(call, __closure_##f, (void *)arg) > > That could be reused for timers with typechecking - we have a lot of timer > callbacks that start with casting the argument (unsigned long, not void *, > but that's not a big deal) to whatever it is that callback really wants, > with setup_timer() callers explicitly casting that whatever the callback > really wants to unsigned long. Which, of course, defeats the typechecking > by both cc(1) and sparse(1)... > > I still hope for better solution, though... Comments? Hmm... #define set_delayed_type(call, f, arg) \ (sizeof(f(arg),0), \ __set_delayed_type(call, ({ \ void _(void *__) {f((typeof(arg))(unsigned long)__);} \ _;}), (void *)arg)) woult do nicely, but it's a gccism, and the one clang doesn't support ;-/ gcc support goes back at least to 4.1, not sure about the earlier branches (and not at all sure how flaky it was/is, obviously). But this thing works for e.g. f being void(char) and arg being char, and AFAICS generates correct code - we end up with a stub like _.1765: movsbl %dil, %edi jmp f and its address gets stored in call->fn, which should DTRT on call... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11 2015, Al Viro wrote: > I would really love to be able to say > set_delayed_call(done, kfree, p); > but as it is I had to keep a wrapper - void kfree_link(void *). The problem > is, you can't assign void f(const void *) to void (*p)(void *) - mismatch of > qualifiers in the arguments makes the latter not assignment-compatible with > the former. If there's a clever trick allowing to sidestep that, I'd be > very happy; I don't know one. Any ideas not starting with "use C11" (or, > worse yet, "use such and such C++ misfeature with arseloads of RTL required > in order to implement it") would be welcome... I _think_ this satisfies these very reasonable criteria. What you're looking for is presumably __attribute__((__transparent_union__)). At least this compiles without warnings at -Wall -Wextra and gives the expected disassembly, and the gcc docs mention transparent_union at least back to 4.0.4. #include struct delayed_call { void (*fn)(void *); void *arg; }; union delayed_call_fn { void (*fn)(void *); void (*kfree_like)(const void *); } __attribute__((__transparent_union__)); void set_delayed_call(struct delayed_call *call, union delayed_call_fn u, void *arg) { call->fn = u.fn; call->arg = arg; } void some_cb(void *); void kfree(const void *); extern struct delayed_call done; void test(void) { set_delayed_call(, some_cb, NULL); set_delayed_call(, kfree, NULL); } Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Thu, Dec 10, 2015 at 02:40:50AM +, Al Viro wrote: > ... except that it would have to be something like work_struct, since > callback_head gets embedded into the argument and here we don't want to > do that. _And_ we've no use for ->entry in work_struct, which makes it > a bad fit as well. IOW, please ignore that idiocy... OK, the following (incremental to that series) seems to be better; in particular, overlayfs ->get_link() doesn't allocate anything whatsoever, ->put_link() as a method is gone and instead of *cookie = we do set_delayed_call(done, callback, argument) which is a whole lot more straightforward. Memory footprint cost is trivial - one extra pointer in struct nameidata. Basically, we add a new structure (struct delayed_call; poor man's closure) and pass a reference to that to ->get_link(); if anything needs to be done once we are done with the string returned by ->get_link(), just fill that structure (as above) and that's it. It will be evaluated by caller once it's through with the symlink body. I'd put it in linux/fs.h for now, but it certainly belongs in something like linux/types.h; the current location is temporary. And I'm quite sure that we have other places where something of that sort is open-coded - timers, for one thing and probably quite a few other things. I would really love to be able to say set_delayed_call(done, kfree, p); but as it is I had to keep a wrapper - void kfree_link(void *). The problem is, you can't assign void f(const void *) to void (*p)(void *) - mismatch of qualifiers in the arguments makes the latter not assignment-compatible with the former. If there's a clever trick allowing to sidestep that, I'd be very happy; I don't know one. Any ideas not starting with "use C11" (or, worse yet, "use such and such C++ misfeature with arseloads of RTL required in order to implement it") would be welcome... BTW, why are we passing unsigned long to free_page()? We have a bit under 700 callers; excluding the ones that have an explicit cast to unsigned long right in the argument of call leaves ~150, and the rest tend to contain a lot of pointer casts of unsigned long thing they are feeding to free_page() (IOW, they would be just as happy if they kept it as a pointer all along). Sure, that would mean __get_free_page() et.al. returning void *, but I don't see any problems with that either... Is that just for historical reasons, or is there anything more subtle I'm missing here? Anyway, for now the patch below is strictly for review and (if one is brave enough) testing. Comments would be very welcome. diff --git a/drivers/staging/lustre/lustre/llite/symlink.c b/drivers/staging/lustre/lustre/llite/symlink.c index e5c4137..2610348 100644 --- a/drivers/staging/lustre/lustre/llite/symlink.c +++ b/drivers/staging/lustre/lustre/llite/symlink.c @@ -118,8 +118,14 @@ failed: return rc; } +static void ll_put_link(void *p) +{ + ptlrpc_req_finished(p); +} + static const char *ll_get_link(struct dentry *dentry, - struct inode *inode, void **cookie) + struct inode *inode, + struct delayed_call *done) { struct ptlrpc_request *request = NULL; int rc; @@ -137,22 +143,16 @@ static const char *ll_get_link(struct dentry *dentry, } /* symname may contain a pointer to the request message buffer, -* we delay request releasing until ll_put_link then. +* we delay request releasing then. */ - *cookie = request; + set_delayed_call(done, ll_put_link, request); return symname; } -static void ll_put_link(struct inode *unused, void *cookie) -{ - ptlrpc_req_finished(cookie); -} - const struct inode_operations ll_fast_symlink_inode_operations = { .readlink = generic_readlink, .setattr= ll_setattr, .get_link = ll_get_link, - .put_link = ll_put_link, .getattr= ll_getattr, .permission = ll_inode_permission, .setxattr = ll_setxattr, diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index a237f47..c7cc7c3 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -1226,11 +1226,12 @@ ino_t v9fs_qid2ino(struct p9_qid *qid) * v9fs_vfs_get_link - follow a symlink path * @dentry: dentry for symlink * @inode: inode for symlink - * @cookie: place to pass the data to put_link() + * @done: delayed call for when we are done with the return value */ static const char *v9fs_vfs_get_link(struct dentry *dentry, -struct inode *inode, void **cookie) +struct inode *inode, +struct delayed_call *done) { struct v9fs_session_info *v9ses; struct p9_fid *fid; @@ -1266,7 +1267,8 @@ static const char *v9fs_vfs_get_link(struct dentry *dentry,
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Thu, Dec 10, 2015 at 02:40:50AM +, Al Viro wrote: > ... except that it would have to be something like work_struct, since > callback_head gets embedded into the argument and here we don't want to > do that. _And_ we've no use for ->entry in work_struct, which makes it > a bad fit as well. IOW, please ignore that idiocy... OK, the following (incremental to that series) seems to be better; in particular, overlayfs ->get_link() doesn't allocate anything whatsoever, ->put_link() as a method is gone and instead of *cookie = we do set_delayed_call(done, callback, argument) which is a whole lot more straightforward. Memory footprint cost is trivial - one extra pointer in struct nameidata. Basically, we add a new structure (struct delayed_call; poor man's closure) and pass a reference to that to ->get_link(); if anything needs to be done once we are done with the string returned by ->get_link(), just fill that structure (as above) and that's it. It will be evaluated by caller once it's through with the symlink body. I'd put it in linux/fs.h for now, but it certainly belongs in something like linux/types.h; the current location is temporary. And I'm quite sure that we have other places where something of that sort is open-coded - timers, for one thing and probably quite a few other things. I would really love to be able to say set_delayed_call(done, kfree, p); but as it is I had to keep a wrapper - void kfree_link(void *). The problem is, you can't assign void f(const void *) to void (*p)(void *) - mismatch of qualifiers in the arguments makes the latter not assignment-compatible with the former. If there's a clever trick allowing to sidestep that, I'd be very happy; I don't know one. Any ideas not starting with "use C11" (or, worse yet, "use such and such C++ misfeature with arseloads of RTL required in order to implement it") would be welcome... BTW, why are we passing unsigned long to free_page()? We have a bit under 700 callers; excluding the ones that have an explicit cast to unsigned long right in the argument of call leaves ~150, and the rest tend to contain a lot of pointer casts of unsigned long thing they are feeding to free_page() (IOW, they would be just as happy if they kept it as a pointer all along). Sure, that would mean __get_free_page() et.al. returning void *, but I don't see any problems with that either... Is that just for historical reasons, or is there anything more subtle I'm missing here? Anyway, for now the patch below is strictly for review and (if one is brave enough) testing. Comments would be very welcome. diff --git a/drivers/staging/lustre/lustre/llite/symlink.c b/drivers/staging/lustre/lustre/llite/symlink.c index e5c4137..2610348 100644 --- a/drivers/staging/lustre/lustre/llite/symlink.c +++ b/drivers/staging/lustre/lustre/llite/symlink.c @@ -118,8 +118,14 @@ failed: return rc; } +static void ll_put_link(void *p) +{ + ptlrpc_req_finished(p); +} + static const char *ll_get_link(struct dentry *dentry, - struct inode *inode, void **cookie) + struct inode *inode, + struct delayed_call *done) { struct ptlrpc_request *request = NULL; int rc; @@ -137,22 +143,16 @@ static const char *ll_get_link(struct dentry *dentry, } /* symname may contain a pointer to the request message buffer, -* we delay request releasing until ll_put_link then. +* we delay request releasing then. */ - *cookie = request; + set_delayed_call(done, ll_put_link, request); return symname; } -static void ll_put_link(struct inode *unused, void *cookie) -{ - ptlrpc_req_finished(cookie); -} - const struct inode_operations ll_fast_symlink_inode_operations = { .readlink = generic_readlink, .setattr= ll_setattr, .get_link = ll_get_link, - .put_link = ll_put_link, .getattr= ll_getattr, .permission = ll_inode_permission, .setxattr = ll_setxattr, diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index a237f47..c7cc7c3 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -1226,11 +1226,12 @@ ino_t v9fs_qid2ino(struct p9_qid *qid) * v9fs_vfs_get_link - follow a symlink path * @dentry: dentry for symlink * @inode: inode for symlink - * @cookie: place to pass the data to put_link() + * @done: delayed call for when we are done with the return value */ static const char *v9fs_vfs_get_link(struct dentry *dentry, -struct inode *inode, void **cookie) +struct inode *inode, +struct delayed_call *done) { struct v9fs_session_info *v9ses; struct p9_fid *fid; @@ -1266,7 +1267,8 @@ static const char *v9fs_vfs_get_link(struct dentry *dentry,
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Fri, Dec 11 2015, Al Virowrote: > I would really love to be able to say > set_delayed_call(done, kfree, p); > but as it is I had to keep a wrapper - void kfree_link(void *). The problem > is, you can't assign void f(const void *) to void (*p)(void *) - mismatch of > qualifiers in the arguments makes the latter not assignment-compatible with > the former. If there's a clever trick allowing to sidestep that, I'd be > very happy; I don't know one. Any ideas not starting with "use C11" (or, > worse yet, "use such and such C++ misfeature with arseloads of RTL required > in order to implement it") would be welcome... I _think_ this satisfies these very reasonable criteria. What you're looking for is presumably __attribute__((__transparent_union__)). At least this compiles without warnings at -Wall -Wextra and gives the expected disassembly, and the gcc docs mention transparent_union at least back to 4.0.4. #include struct delayed_call { void (*fn)(void *); void *arg; }; union delayed_call_fn { void (*fn)(void *); void (*kfree_like)(const void *); } __attribute__((__transparent_union__)); void set_delayed_call(struct delayed_call *call, union delayed_call_fn u, void *arg) { call->fn = u.fn; call->arg = arg; } void some_cb(void *); void kfree(const void *); extern struct delayed_call done; void test(void) { set_delayed_call(, some_cb, NULL); set_delayed_call(, kfree, NULL); } Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Thu, Dec 10, 2015 at 12:10:12AM +, Al Viro wrote: > PS: I toyed with the idea of replacing cookie a struct callback_head, to make ... except that it would have to be something like work_struct, since callback_head gets embedded into the argument and here we don't want to do that. _And_ we've no use for ->entry in work_struct, which makes it a bad fit as well. IOW, please ignore that idiocy... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Wed, Dec 09, 2015 at 06:23:09PM +, Al Viro wrote: > What's more, that dentry might very well have gone negative by that > point. Think what happens if, during the symlink traversal, we run > into the hard "restart from scratch in non-RCU mode". We'll need to > do ->put_link() on everything we have in stack. Regardless of what > might've happened to dentries/inodes of symlinks involved - all we > are really promised is that RCU-delayed parts of their destruction hadn't > been entered yet. > > Note, BTW, that *all* ->put_link() instances ignore the inode argument > and I seriously considered dropping it completely. PS: I toyed with the idea of replacing cookie a struct callback_head, to make it really obvious what's going on - we are leaving a closure to be evaluated once the caller is done with our return value. That way ->put_link() as a method would've been gone and we would've had something like init_callback(destructor, page_put_link, page); instead of *cookie = page; plus .put_link = page_put_link, we have right now, etc. The reason not to go that way (and not all that strong, at that) is that we get an extra pointer in struct saved, i.e. slightly heavier nameidata (there are two struct saved embedded into it). That, or play silly buggers with union { struct callback_head callback; struct inode *inode; }; in struct saved - we need to stash a pointer to inode between pick_link() and get_link(). Hell knows, still might make sense to go there... What we want is to set a closure to be evaluated once the caller of ->get_link() is done with the string ->get_link() returns. We end up picking the function to be called at saved->inode->i_op->put_link and store its argument in saved->cookie. Might as well just use the representation we use for such stuff (rcu_head, etc.) and be done with that... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
Hi Al, thanks a lot for this. I particularly like that you renamed "follow_link" which didn't really describe what the function did any more. Apart from the changelog comment in 4/11 which starts "It was to needed for ..." I can't fault anything. Thanks, NeilBrown signature.asc Description: PGP signature
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Wed, Dec 09, 2015 at 09:24:32AM -0800, Linus Torvalds wrote: >That's because of the HIGHMEM crap that you removed. Without > highmem, the page data address specifies the page fine without any > cookie. > > - overlayfs, which uses it to create a temporary data storage. > >Why? Mainly because it wants to hide the cookie in there. D'oh! Not really - it also wants to stash a way to underlying inode in there. See below... > - proc, which uses it to save the proc_dir_entry > >Why? The reason seems to be that we don't call put_link() with the > dentry that get_link got, so it can't look it up the sane way. What's more, that dentry might very well have gone negative by that point. Think what happens if, during the symlink traversal, we run into the hard "restart from scratch in non-RCU mode". We'll need to do ->put_link() on everything we have in stack. Regardless of what might've happened to dentries/inodes of symlinks involved - all we are really promised is that RCU-delayed parts of their destruction hadn't been entered yet. Note, BTW, that *all* ->put_link() instances ignore the inode argument and I seriously considered dropping it completely. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Tue, Dec 8, 2015 at 9:32 PM, Al Viro wrote: > > Any help with review and testing would be welcome. FWIW, it seems to survive > the beating here. It looks ok to me, but I *hate* that "cookie" thing we have. And I don't actually see any reason for it. Every single user except for a couple seem to just put the resulting symlink address into the cookie (and then put_link() will kfree it or put_page() it). And the calling convention would be *so* much clearer if we just made things be char *get_link(dentry, inode); put_link(dentry, inode, char *); and get rid of that cookie entirely. The only users of the cookie seem to be - we currently use it for the page pointer vs the page data address. That's because of the HIGHMEM crap that you removed. Without highmem, the page data address specifies the page fine without any cookie. - overlayfs, which uses it to create a temporary data storage. Why? Mainly because it wants to hide the cookie in there. D'oh! - proc, which uses it to save the proc_dir_entry Why? The reason seems to be that we don't call put_link() with the dentry that get_link got, so it can't look it up the sane way. In other words, as far as I can tell, the cookie is just garbage now. But maybe I missed something. Other than that, I like the series. And I realize that technically the "cookie removal" would be a separate thing, but since you're changing the calling convention from "follow_link()" to "get_link()" in this series, I _really_ think that as part of that calling convention change we should just get rid of the cookie, rather than have *another* calling convention change later. Hmm? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Tue, Dec 8, 2015 at 9:32 PM, Al Virowrote: > > Any help with review and testing would be welcome. FWIW, it seems to survive > the beating here. It looks ok to me, but I *hate* that "cookie" thing we have. And I don't actually see any reason for it. Every single user except for a couple seem to just put the resulting symlink address into the cookie (and then put_link() will kfree it or put_page() it). And the calling convention would be *so* much clearer if we just made things be char *get_link(dentry, inode); put_link(dentry, inode, char *); and get rid of that cookie entirely. The only users of the cookie seem to be - we currently use it for the page pointer vs the page data address. That's because of the HIGHMEM crap that you removed. Without highmem, the page data address specifies the page fine without any cookie. - overlayfs, which uses it to create a temporary data storage. Why? Mainly because it wants to hide the cookie in there. D'oh! - proc, which uses it to save the proc_dir_entry Why? The reason seems to be that we don't call put_link() with the dentry that get_link got, so it can't look it up the sane way. In other words, as far as I can tell, the cookie is just garbage now. But maybe I missed something. Other than that, I like the series. And I realize that technically the "cookie removal" would be a separate thing, but since you're changing the calling convention from "follow_link()" to "get_link()" in this series, I _really_ think that as part of that calling convention change we should just get rid of the cookie, rather than have *another* calling convention change later. Hmm? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Wed, Dec 09, 2015 at 09:24:32AM -0800, Linus Torvalds wrote: >That's because of the HIGHMEM crap that you removed. Without > highmem, the page data address specifies the page fine without any > cookie. > > - overlayfs, which uses it to create a temporary data storage. > >Why? Mainly because it wants to hide the cookie in there. D'oh! Not really - it also wants to stash a way to underlying inode in there. See below... > - proc, which uses it to save the proc_dir_entry > >Why? The reason seems to be that we don't call put_link() with the > dentry that get_link got, so it can't look it up the sane way. What's more, that dentry might very well have gone negative by that point. Think what happens if, during the symlink traversal, we run into the hard "restart from scratch in non-RCU mode". We'll need to do ->put_link() on everything we have in stack. Regardless of what might've happened to dentries/inodes of symlinks involved - all we are really promised is that RCU-delayed parts of their destruction hadn't been entered yet. Note, BTW, that *all* ->put_link() instances ignore the inode argument and I seriously considered dropping it completely. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
Hi Al, thanks a lot for this. I particularly like that you renamed "follow_link" which didn't really describe what the function did any more. Apart from the changelog comment in 4/11 which starts "It was to needed for ..." I can't fault anything. Thanks, NeilBrown signature.asc Description: PGP signature
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Wed, Dec 09, 2015 at 06:23:09PM +, Al Viro wrote: > What's more, that dentry might very well have gone negative by that > point. Think what happens if, during the symlink traversal, we run > into the hard "restart from scratch in non-RCU mode". We'll need to > do ->put_link() on everything we have in stack. Regardless of what > might've happened to dentries/inodes of symlinks involved - all we > are really promised is that RCU-delayed parts of their destruction hadn't > been entered yet. > > Note, BTW, that *all* ->put_link() instances ignore the inode argument > and I seriously considered dropping it completely. PS: I toyed with the idea of replacing cookie a struct callback_head, to make it really obvious what's going on - we are leaving a closure to be evaluated once the caller is done with our return value. That way ->put_link() as a method would've been gone and we would've had something like init_callback(destructor, page_put_link, page); instead of *cookie = page; plus .put_link = page_put_link, we have right now, etc. The reason not to go that way (and not all that strong, at that) is that we get an extra pointer in struct saved, i.e. slightly heavier nameidata (there are two struct saved embedded into it). That, or play silly buggers with union { struct callback_head callback; struct inode *inode; }; in struct saved - we need to stash a pointer to inode between pick_link() and get_link(). Hell knows, still might make sense to go there... What we want is to set a closure to be evaluated once the caller of ->get_link() is done with the string ->get_link() returns. We end up picking the function to be called at saved->inode->i_op->put_link and store its argument in saved->cookie. Might as well just use the representation we use for such stuff (rcu_head, etc.) and be done with that... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHSET v2] ->follow_link() without dropping from RCU mode
On Thu, Dec 10, 2015 at 12:10:12AM +, Al Viro wrote: > PS: I toyed with the idea of replacing cookie a struct callback_head, to make ... except that it would have to be something like work_struct, since callback_head gets embedded into the argument and here we don't want to do that. _And_ we've no use for ->entry in work_struct, which makes it a bad fit as well. IOW, please ignore that idiocy... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCHSET v2] ->follow_link() without dropping from RCU mode
On Tue, Nov 17, 2015 at 10:57:52PM +, Al Viro wrote: > Right now we stay in RCU mode for fast symlink traversal. > However, anything trickier drops out of RCU mode - back in 4.2 > the symlink-related pile had grown too large to add this on top > of everything else. Below is an attempt to do that now. > > Those who prefer to use git for review can find that series in > git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.symlinks Compared to the previous iteration: * rebased to 4.4-rc4. * NFS (in case of pagecache hit, of course) also stays in RCU mode. * renamed inode_nohigh() as suggested by dchinner, moved to fs/inode.c - less header PITA that way. What's there: 01/11: switch befs long symlinks to page_symlink_operations 02/11: logfs: don't duplicate page_symlink_inode_operations 03/11: udf: don't duplicate page_symlink_inode_operations 04/11: ufs: get rid of ->setattr() for symlinks Simplifying things a bit by switching them to page_symlink_operations where possible. 05/11: namei: page_getlink() and page_follow_link_light() are the same thing Get rid of some code duplication 06/11: don't put symlink bodies in pagecache into highmem Bugfix for a long-standing mess. For pagecache-based symlinks we end up with kmap() of the body for the duration of traversal. Which could take a long time if we walk into e.g. slow automount along the way. It's DoSable, actually. Not hard to fix - there's no reason to use GFP_HIGHUSER_MOVABLE for those guys; GFP_USER serves just as well. 07/11: replace ->follow_link() with new method that could stay Meat of the series - we switch to a new method (->get_link()) that differs from ->follow_link() in getting dentry and inode separately; it can be called in RCU mode (with NULL dentry) and it should bail out with ERR_PTR(-ECHILD) if it needs non-RCU. All instances converted, most of them by making them bail out immediately in RCU mode. Some are trivialy RCU-safe, though; those do not bail out at all. 08/11: teach page_get_link() to work in RCU mode 09/11: teach shmem_get_link() to work in RCU mode 10/11: teach proc_self_get_link()/proc_thread_self_get_link() 11/11: teach nfs_get_link() to work in RCU mode Teaching more instances to stay in RCU mode if they can. (8) and (11) are similar to what Neil had done back in March, except for the lack of kmap mess to deal with - that's one of the benefits of having (6) done... Missing, probably worth adding: XFS, overlayfs. Maybe gfs2 - if it's at all possible (not sure if it is). Maybe autofs4 as well. Procfs magical symlinks definitely are *not* worth trying to handle in RCU mode - ->d_revalidate() on them is such that it'll kick us out of RCU mode before we even see that it's a symlink, let alone try to follow it. Any help with review and testing would be welcome. FWIW, it seems to survive the beating here. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCHSET v2] ->follow_link() without dropping from RCU mode
On Tue, Nov 17, 2015 at 10:57:52PM +, Al Viro wrote: > Right now we stay in RCU mode for fast symlink traversal. > However, anything trickier drops out of RCU mode - back in 4.2 > the symlink-related pile had grown too large to add this on top > of everything else. Below is an attempt to do that now. > > Those who prefer to use git for review can find that series in > git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.symlinks Compared to the previous iteration: * rebased to 4.4-rc4. * NFS (in case of pagecache hit, of course) also stays in RCU mode. * renamed inode_nohigh() as suggested by dchinner, moved to fs/inode.c - less header PITA that way. What's there: 01/11: switch befs long symlinks to page_symlink_operations 02/11: logfs: don't duplicate page_symlink_inode_operations 03/11: udf: don't duplicate page_symlink_inode_operations 04/11: ufs: get rid of ->setattr() for symlinks Simplifying things a bit by switching them to page_symlink_operations where possible. 05/11: namei: page_getlink() and page_follow_link_light() are the same thing Get rid of some code duplication 06/11: don't put symlink bodies in pagecache into highmem Bugfix for a long-standing mess. For pagecache-based symlinks we end up with kmap() of the body for the duration of traversal. Which could take a long time if we walk into e.g. slow automount along the way. It's DoSable, actually. Not hard to fix - there's no reason to use GFP_HIGHUSER_MOVABLE for those guys; GFP_USER serves just as well. 07/11: replace ->follow_link() with new method that could stay Meat of the series - we switch to a new method (->get_link()) that differs from ->follow_link() in getting dentry and inode separately; it can be called in RCU mode (with NULL dentry) and it should bail out with ERR_PTR(-ECHILD) if it needs non-RCU. All instances converted, most of them by making them bail out immediately in RCU mode. Some are trivialy RCU-safe, though; those do not bail out at all. 08/11: teach page_get_link() to work in RCU mode 09/11: teach shmem_get_link() to work in RCU mode 10/11: teach proc_self_get_link()/proc_thread_self_get_link() 11/11: teach nfs_get_link() to work in RCU mode Teaching more instances to stay in RCU mode if they can. (8) and (11) are similar to what Neil had done back in March, except for the lack of kmap mess to deal with - that's one of the benefits of having (6) done... Missing, probably worth adding: XFS, overlayfs. Maybe gfs2 - if it's at all possible (not sure if it is). Maybe autofs4 as well. Procfs magical symlinks definitely are *not* worth trying to handle in RCU mode - ->d_revalidate() on them is such that it'll kick us out of RCU mode before we even see that it's a symlink, let alone try to follow it. Any help with review and testing would be welcome. FWIW, it seems to survive the beating here. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/