Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/02/2013 03:30 PM, Jason Low wrote:
> On Wed, Oct 2, 2013 at 12:19 PM, Waiman Long wrote:
>> On 09/26/2013 06:42 PM, Jason Low wrote:
>>> On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
>>>> Okay, that would make sense for consistency because we always
>>>> first set node->locked = 0 at the top of the function.
>>>>
>>>> If we prefer to optimize this a bit though, perhaps we can
>>>> first move the node->locked = 0 so that it gets executed after the
>>>> "if (likely(prev == NULL)) {}" code block and then delete
>>>> "node->locked = 1" inside the code block.
>>>>
>>>> static noinline
>>>> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
>>>> {
>>>>         struct mcs_spin_node *prev;
>>>>
>>>>         /* Init node */
>>>>         node->next = NULL;
>>>>
>>>>         prev = xchg(lock, node);
>>>>         if (likely(prev == NULL)) {
>>>>                 /* Lock acquired */
>>>>                 return;
>>>>         }
>>>>         node->locked = 0;
>>
>> You can remove the locked flag setting statement inside if (prev == NULL),
>> but you can't clear the locked flag after xchg(). In the interval between
>> xchg() and locked=0, the previous lock owner may come in and set the flag.
>> Now if you clear it, the thread will loop forever. You have to clear it
>> before xchg().
>
> Yes, in my most recent version, I left locked = 0 in its original place
> so that the xchg() can act as a barrier for it. The other option would
> have been to put another barrier after locked = 0. I went with leaving
> locked = 0 in its original place so that we don't need that extra
> barrier.

I don't think putting another barrier after locked=0 will work.
Chronologically, the flag must be cleared before the node address is
saved in the lock field. There is no way to guarantee that except by
putting the locked=0 before xchg().

-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/02/2013 02:43 PM, Tim Chen wrote:
> On Tue, 2013-10-01 at 21:25 -0400, Waiman Long wrote:
>> If the lock and unlock functions are done right, there should be no
>> overlap of critical section. So it is the job of the lock/unlock
>> functions to make sure that critical section code won't leak out.
>> There should be some kind of memory barrier at the beginning of the
>> lock function and the end of the unlock function.
>>
>> The critical section is also likely to have branches. The CPU may
>> speculatively execute code on the 2 branches, but one of them will be
>> discarded once the branch condition is known. Also
>> arch_mutex_cpu_relax() is a compiler barrier by itself. So we may not
>> need a barrier() after all. The while statement is a branch
>> instruction; any code after that can only be speculatively executed
>> and cannot be committed until the branch is done.
>
> But the condition code may be checked after speculative execution?
> The condition may not be true during speculative execution and only
> turns true when we check the condition and take that branch?
>
> The thing that bothers me is that without a memory barrier after the
> while statement, we could speculatively execute before affirming the
> lock is in the acquired state. Then when we check the lock, the lock
> is set to the acquired state in the meantime. We could be loading some
> memory entry *before* node->locked has been set true. I think an
> smp_rmb (if not an smp_mb) should be placed after the while statement.

Yes, I think a smp_rmb() makes sense here to correspond to the
smp_wmb() in the unlock path.

BTW, you need to move the node->locked = 0; statement before xchg() if
you haven't done so.

-Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Wed, Oct 2, 2013 at 12:19 PM, Waiman Long wrote:
> On 09/26/2013 06:42 PM, Jason Low wrote:
>> On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
>>> Okay, that would make sense for consistency because we always
>>> first set node->locked = 0 at the top of the function.
>>>
>>> If we prefer to optimize this a bit though, perhaps we can
>>> first move the node->locked = 0 so that it gets executed after the
>>> "if (likely(prev == NULL)) {}" code block and then delete
>>> "node->locked = 1" inside the code block.
>>>
>>> static noinline
>>> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
>>> {
>>>         struct mcs_spin_node *prev;
>>>
>>>         /* Init node */
>>>         node->next = NULL;
>>>
>>>         prev = xchg(lock, node);
>>>         if (likely(prev == NULL)) {
>>>                 /* Lock acquired */
>>>                 return;
>>>         }
>>>         node->locked = 0;
>
> You can remove the locked flag setting statement inside if (prev == NULL),
> but you can't clear the locked flag after xchg(). In the interval between
> xchg() and locked=0, the previous lock owner may come in and set the flag.
> Now if you clear it, the thread will loop forever. You have to clear it
> before xchg().

Yes, in my most recent version, I left locked = 0 in its original place
so that the xchg() can act as a barrier for it. The other option would
have been to put another barrier after locked = 0. I went with leaving
locked = 0 in its original place so that we don't need that extra
barrier.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 09/26/2013 06:42 PM, Jason Low wrote:
> On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote:
>> Okay, that would make sense for consistency because we always
>> first set node->locked = 0 at the top of the function.
>>
>> If we prefer to optimize this a bit though, perhaps we can
>> first move the node->locked = 0 so that it gets executed after the
>> "if (likely(prev == NULL)) {}" code block and then delete
>> "node->locked = 1" inside the code block.
>>
>> static noinline
>> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
>> {
>>         struct mcs_spin_node *prev;
>>
>>         /* Init node */
>>         node->next = NULL;
>>
>>         prev = xchg(lock, node);
>>         if (likely(prev == NULL)) {
>>                 /* Lock acquired */
>>                 return;
>>         }
>>         node->locked = 0;

You can remove the locked flag setting statement inside if (prev == NULL),
but you can't clear the locked flag after xchg(). In the interval between
xchg() and locked=0, the previous lock owner may come in and set the flag.
Now if you clear it, the thread will loop forever. You have to clear it
before xchg().

-Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Tue, 2013-10-01 at 21:25 -0400, Waiman Long wrote:
> On 10/01/2013 05:16 PM, Tim Chen wrote:
>> On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote:
>>>> The cpu could still be executing an out-of-order load instruction
>>>> from the critical section before checking node->locked? Probably
>>>> smp_mb() is still needed.
>>>>
>>>> Tim
>>> But this is the lock function, a barrier() call should be enough to
>>> prevent the critical section from creeping up there. We certainly
>>> need some kind of memory barrier at the end of the unlock function.
>> I may be missing something. My understanding is that barrier only
>> prevents the compiler from rearranging instructions, but not cpu out
>> of order execution (as in smp_mb). So the cpu could read memory in
>> the next critical section, before node->locked is true (i.e. unlock
>> has been completed). If we only have a simple barrier at the end of
>> mcs_lock, then say the code on CPU1 is
>>
>> mcs_lock
>> x = 1;
>> ...
>> x = 2;
>> mcs_unlock
>>
>> and CPU 2 is
>>
>> mcs_lock
>> y = x;
>> ...
>> mcs_unlock
>>
>> We expect y to be 2 after the "y = x" assignment. But we
>> may execute the code as
>>
>> CPU1                    CPU2
>>
>> x = 1;
>> ...                     y = x;  (y=1, out of order load)
>> x = 2
>> mcs_unlock
>>                         Check node->locked==true
>>                         continue executing critical section
>>                         (y=1 when we expect y=2)
>>
>> So we get y to be 1 when we expect that it should be 2. Adding smp_mb
>> after the node->locked check in lock code
>>
>> ACCESS_ONCE(prev->next) = node;
>> /* Wait until the lock holder passes the lock down */
>> while (!ACCESS_ONCE(node->locked))
>>         arch_mutex_cpu_relax();
>> smp_mb();
>>
>> should prevent this scenario.
>>
>> Thanks.
>> Tim
> If the lock and unlock functions are done right, there should be no
> overlap of critical section. So it is the job of the lock/unlock
> functions to make sure that critical section code won't leak out.
> There should be some kind of memory barrier at the beginning of the
> lock function and the end of the unlock function.
>
> The critical section is also likely to have branches. The CPU may
> speculatively execute code on the 2 branches, but one of them will be
> discarded once the branch condition is known. Also
> arch_mutex_cpu_relax() is a compiler barrier by itself. So we may not
> need a barrier() after all. The while statement is a branch
> instruction; any code after that can only be speculatively executed
> and cannot be committed until the branch is done.

But the condition code may be checked after speculative execution?
The condition may not be true during speculative execution and only
turns true when we check the condition and take that branch?

The thing that bothers me is that without a memory barrier after the
while statement, we could speculatively execute before affirming the
lock is in the acquired state. Then when we check the lock, the lock is
set to the acquired state in the meantime. We could be loading some
memory entry *before* node->locked has been set true. I think an
smp_rmb (if not an smp_mb) should be placed after the while statement.

At first I was also thinking that the memory barrier is not necessary
but Paul convinced me otherwise in a previous email.
https://lkml.org/lkml/2013/9/27/523

> In x86, the smp_mb() function translates to an mfence instruction,
> which costs time. That is why I try to get rid of it if it is not
> necessary.

I also hope that the memory barrier is not necessary and I am missing
something obvious. But I haven't been able to persuade myself.

Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/01/2013 05:16 PM, Tim Chen wrote:
> On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote:
>>> The cpu could still be executing an out-of-order load instruction
>>> from the critical section before checking node->locked? Probably
>>> smp_mb() is still needed.
>>>
>>> Tim
>> But this is the lock function, a barrier() call should be enough to
>> prevent the critical section from creeping up there. We certainly
>> need some kind of memory barrier at the end of the unlock function.
> I may be missing something. My understanding is that barrier only
> prevents the compiler from rearranging instructions, but not cpu out
> of order execution (as in smp_mb). So the cpu could read memory in
> the next critical section, before node->locked is true (i.e. unlock
> has been completed). If we only have a simple barrier at the end of
> mcs_lock, then say the code on CPU1 is
>
> mcs_lock
> x = 1;
> ...
> x = 2;
> mcs_unlock
>
> and CPU 2 is
>
> mcs_lock
> y = x;
> ...
> mcs_unlock
>
> We expect y to be 2 after the "y = x" assignment. But we
> may execute the code as
>
> CPU1                    CPU2
>
> x = 1;
> ...                     y = x;  (y=1, out of order load)
> x = 2
> mcs_unlock
>                         Check node->locked==true
>                         continue executing critical section
>                         (y=1 when we expect y=2)
>
> So we get y to be 1 when we expect that it should be 2. Adding smp_mb
> after the node->locked check in lock code
>
> ACCESS_ONCE(prev->next) = node;
> /* Wait until the lock holder passes the lock down */
> while (!ACCESS_ONCE(node->locked))
>         arch_mutex_cpu_relax();
> smp_mb();
>
> should prevent this scenario.
>
> Thanks.
> Tim

If the lock and unlock functions are done right, there should be no
overlap of critical section. So it is the job of the lock/unlock
functions to make sure that critical section code won't leak out.
There should be some kind of memory barrier at the beginning of the
lock function and the end of the unlock function.

The critical section is also likely to have branches. The CPU may
speculatively execute code on the 2 branches, but one of them will be
discarded once the branch condition is known. Also
arch_mutex_cpu_relax() is a compiler barrier by itself. So we may not
need a barrier() after all. The while statement is a branch
instruction; any code after that can only be speculatively executed
and cannot be committed until the branch is done.

In x86, the smp_mb() function translates to an mfence instruction,
which costs time. That is why I try to get rid of it if it is not
necessary.

Regards,
Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote: > On 10/01/2013 12:48 PM, Tim Chen wrote: > > On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote: > >> On 09/30/2013 12:10 PM, Jason Low wrote: > >>> On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: > On 09/28/2013 12:34 AM, Jason Low wrote: > >> Also, below is what the mcs_spin_lock() and mcs_spin_unlock() > >> functions would look like after applying the proposed changes. > >> > >> static noinline > >> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > >> *node) > >> { > >>struct mcs_spin_node *prev; > >> > >>/* Init node */ > >>node->locked = 0; > >>node->next = NULL; > >> > >>prev = xchg(lock, node); > >>if (likely(prev == NULL)) { > >>/* Lock acquired. No need to set node->locked since > >> it > >> won't be used */ > >>return; > >>} > >>ACCESS_ONCE(prev->next) = node; > >>/* Wait until the lock holder passes the lock down */ > >>while (!ACCESS_ONCE(node->locked)) > >>arch_mutex_cpu_relax(); > >>smp_mb(); > I wonder if a memory barrier is really needed here. > >>> If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check > >>> so that the check occurs after an instruction in the critical section, > >>> then the barrier may be necessary. > >>> > >> In that case, just a barrier() call should be enough. > > The cpu could still be executing out of order load instruction from the > > critical section before checking node->locked? Probably smp_mb() is > > still needed. > > > > Tim > > But this is the lock function, a barrier() call should be enough to > prevent the critical section from creeping up there. We certainly need > some kind of memory barrier at the end of the unlock function. I may be missing something. My understanding is that barrier only prevents the compiler from rearranging instructions, but not for cpu out of order execution (as in smp_mb). So cpu could read memory in the next critical section, before node->locked is true, (i.e. unlock has been completed). 
If we only have a simple barrier at the end of mcs_lock, then say the
code on CPU1 is

mcs_lock
x = 1;
...
x = 2;
mcs_unlock

and CPU 2 is

mcs_lock
y = x;
...
mcs_unlock

We expect y to be 2 after the "y = x" assignment. But we
may execute the code as

CPU1                    CPU2

x = 1;
...                     y = x;  (y=1, out of order load)
x = 2
mcs_unlock
                        Check node->locked==true
                        continue executing critical section
                        (y=1 when we expect y=2)

So we get y to be 1 when we expect that it should be 2. Adding smp_mb
after the node->locked check in lock code

ACCESS_ONCE(prev->next) = node;
/* Wait until the lock holder passes the lock down */
while (!ACCESS_ONCE(node->locked))
        arch_mutex_cpu_relax();
smp_mb();

should prevent this scenario.

Thanks.
Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/01/2013 12:48 PM, Tim Chen wrote:
> On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote:
>> On 09/30/2013 12:10 PM, Jason Low wrote:
>>> On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote:
>>>> On 09/28/2013 12:34 AM, Jason Low wrote:
>>>>>> Also, below is what the mcs_spin_lock() and mcs_spin_unlock()
>>>>>> functions would look like after applying the proposed changes.
>>>>>>
>>>>>> static noinline
>>>>>> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
>>>>>> {
>>>>>>         struct mcs_spin_node *prev;
>>>>>>
>>>>>>         /* Init node */
>>>>>>         node->locked = 0;
>>>>>>         node->next = NULL;
>>>>>>
>>>>>>         prev = xchg(lock, node);
>>>>>>         if (likely(prev == NULL)) {
>>>>>>                 /* Lock acquired. No need to set node->locked since it
>>>>>>                    won't be used */
>>>>>>                 return;
>>>>>>         }
>>>>>>         ACCESS_ONCE(prev->next) = node;
>>>>>>         /* Wait until the lock holder passes the lock down */
>>>>>>         while (!ACCESS_ONCE(node->locked))
>>>>>>                 arch_mutex_cpu_relax();
>>>>>>         smp_mb();
>>>> I wonder if a memory barrier is really needed here.
>>> If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check
>>> so that the check occurs after an instruction in the critical section,
>>> then the barrier may be necessary.
>> In that case, just a barrier() call should be enough.
> The cpu could still be executing an out-of-order load instruction from
> the critical section before checking node->locked? Probably smp_mb()
> is still needed.
>
> Tim

But this is the lock function, a barrier() call should be enough to
prevent the critical section from creeping up there. We certainly need
some kind of memory barrier at the end of the unlock function.

-Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote:
> On 09/30/2013 12:10 PM, Jason Low wrote:
>> On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote:
>>> On 09/28/2013 12:34 AM, Jason Low wrote:
>>>>> Also, below is what the mcs_spin_lock() and mcs_spin_unlock()
>>>>> functions would look like after applying the proposed changes.
>>>>>
>>>>> static noinline
>>>>> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
>>>>> {
>>>>>         struct mcs_spin_node *prev;
>>>>>
>>>>>         /* Init node */
>>>>>         node->locked = 0;
>>>>>         node->next = NULL;
>>>>>
>>>>>         prev = xchg(lock, node);
>>>>>         if (likely(prev == NULL)) {
>>>>>                 /* Lock acquired. No need to set node->locked since it
>>>>>                    won't be used */
>>>>>                 return;
>>>>>         }
>>>>>         ACCESS_ONCE(prev->next) = node;
>>>>>         /* Wait until the lock holder passes the lock down */
>>>>>         while (!ACCESS_ONCE(node->locked))
>>>>>                 arch_mutex_cpu_relax();
>>>>>         smp_mb();
>>> I wonder if a memory barrier is really needed here.
>> If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check
>> so that the check occurs after an instruction in the critical section,
>> then the barrier may be necessary.
> In that case, just a barrier() call should be enough.

The cpu could still be executing an out-of-order load instruction from
the critical section before checking node->locked? Probably smp_mb() is
still needed.

Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote: On 09/30/2013 12:10 PM, Jason Low wrote: On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node-locked = 0; node-next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. No need to set node-locked since it won't be used */ return; } ACCESS_ONCE(prev-next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node-locked)) arch_mutex_cpu_relax(); smp_mb(); I wonder if a memory barrier is really needed here. If the compiler can reorder the while (!ACCESS_ONCE(node-locked)) check so that the check occurs after an instruction in the critical section, then the barrier may be necessary. In that case, just a barrier() call should be enough. The cpu could still be executing out of order load instruction from the critical section before checking node-locked? Probably smp_mb() is still needed. Tim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/01/2013 12:48 PM, Tim Chen wrote: On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote: On 09/30/2013 12:10 PM, Jason Low wrote: On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); I wonder if a memory barrier is really needed here. If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check so that the check occurs after an instruction in the critical section, then the barrier may be necessary. In that case, just a barrier() call should be enough. The cpu could still be executing out-of-order load instructions from the critical section before checking node->locked? Probably smp_mb() is still needed. Tim But this is the lock function, a barrier() call should be enough to prevent the critical section from creeping up there. We certainly need some kind of memory barrier at the end of the unlock function. -Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote: On 10/01/2013 12:48 PM, Tim Chen wrote: On Mon, 2013-09-30 at 12:36 -0400, Waiman Long wrote: On 09/30/2013 12:10 PM, Jason Low wrote: On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); I wonder if a memory barrier is really needed here. If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check so that the check occurs after an instruction in the critical section, then the barrier may be necessary. In that case, just a barrier() call should be enough. The cpu could still be executing out-of-order load instructions from the critical section before checking node->locked? Probably smp_mb() is still needed. Tim But this is the lock function, a barrier() call should be enough to prevent the critical section from creeping up there. We certainly need some kind of memory barrier at the end of the unlock function. I may be missing something. My understanding is that barrier() only prevents the compiler from rearranging instructions, but not cpu out-of-order execution (as smp_mb does). So the cpu could read memory in the next critical section before node->locked is true (i.e. before unlock has completed). If we only have a simple barrier at the end of mcs_lock, then say the code on CPU1 is mcs_lock x = 1; ... x = 2; mcs_unlock and the code on CPU2 is mcs_lock y = x; ...
mcs_unlock We expect y to be 2 after the y = x assignment. But we may execute the code as: CPU1: x = 1; ... x = 2; mcs_unlock CPU2: y = x; (y=1, out-of-order load) check node->locked==true, continue executing critical section (y=1 when we expect y=2) So we get y to be 1 when we expect that it should be 2. Adding smp_mb after the node->locked check in the lock code ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); should prevent this scenario. Thanks. Tim
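[Editorial note] Tim's two-CPU scenario can be reproduced as a small user-space litmus sketch. This is not the kernel code under discussion: C11 atomics and pthreads stand in for the kernel primitives. The release store models the barrier before `ACCESS_ONCE(next->locked) = 1` in unlock, and the acquire load models the `smp_mb()` after the spin loop in lock; with both in place, the waiter is guaranteed to observe x == 2.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int x;              /* plain data "protected" by the handoff */
static int y;
static atomic_int locked;  /* stands in for node->locked */

static void *holder(void *arg)
{
    (void)arg;
    x = 1;
    x = 2;
    /* Release store: models the memory barrier before
     * ACCESS_ONCE(next->locked) = 1 in mcs_spin_unlock(). */
    atomic_store_explicit(&locked, 1, memory_order_release);
    return NULL;
}

static void *waiter(void *arg)
{
    (void)arg;
    /* Acquire load: models the smp_mb() after the spin loop in
     * mcs_spin_lock(); without it, y = x could read x early. */
    while (!atomic_load_explicit(&locked, memory_order_acquire))
        ;
    y = x;
    return NULL;
}

/* Runs the two "critical sections" and returns the value the waiter saw. */
static int run_handoff(void)
{
    pthread_t t1, t2;

    x = 0;
    y = 0;
    atomic_store(&locked, 0);
    pthread_create(&t1, NULL, holder, NULL);
    pthread_create(&t2, NULL, waiter, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return y;
}
```

The release/acquire pairing establishes a happens-before edge from `x = 2` to `y = x`, which is exactly the ordering the proposed `smp_mb()` is meant to guarantee.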
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 10/01/2013 05:16 PM, Tim Chen wrote: On Tue, 2013-10-01 at 16:01 -0400, Waiman Long wrote: The cpu could still be executing out-of-order load instructions from the critical section before checking node->locked? Probably smp_mb() is still needed. Tim But this is the lock function, a barrier() call should be enough to prevent the critical section from creeping up there. We certainly need some kind of memory barrier at the end of the unlock function. I may be missing something. My understanding is that barrier() only prevents the compiler from rearranging instructions, but not cpu out-of-order execution (as smp_mb does). So the cpu could read memory in the next critical section before node->locked is true (i.e. before unlock has completed). If we only have a simple barrier at the end of mcs_lock, then say the code on CPU1 is mcs_lock x = 1; ... x = 2; mcs_unlock and the code on CPU2 is mcs_lock y = x; ... mcs_unlock We expect y to be 2 after the y = x assignment. But we may execute the code as: CPU1: x = 1; ... x = 2; mcs_unlock CPU2: y = x; (y=1, out-of-order load) check node->locked==true, continue executing critical section (y=1 when we expect y=2) So we get y to be 1 when we expect that it should be 2. Adding smp_mb after the node->locked check in the lock code ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); should prevent this scenario. Thanks. Tim If the lock and unlock functions are done right, there should be no overlap of critical sections. So it is the job of the lock/unlock functions to make sure that critical section code won't leak out. There should be some kind of memory barrier at the beginning of the lock function and at the end of the unlock function. The critical section is also likely to have branches. The CPU may speculatively execute code on the 2 branches, but one of them will be discarded once the branch condition is known.
Also, arch_mutex_cpu_relax() is a compiler barrier by itself, so we may not need a barrier() after all. The while statement is a branch instruction; any code after that can only be speculatively executed and cannot be committed until the branch is done. In x86, smp_mb() translates to a mfence instruction, which costs time. That is why I try to get rid of it if it is not necessary. Regards, Longman
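[Editorial note] For reference, the lock/unlock pair being debated can be sketched in user space with C11 atomics. This is an illustrative analogue under stated assumptions, not the kernel patch itself: a seq_cst `atomic_exchange` stands in for the kernel's full-barrier `xchg()`, the acquire load on `locked` stands in for the `smp_mb()` after the spin, and the release store in unlock stands in for the barrier before the handoff.

```c
#include <stdatomic.h>
#include <stddef.h>

struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_int locked;              /* 1 once the lock is passed to us */
};

typedef _Atomic(struct mcs_node *) mcs_lock_t;

static void mcs_lock(mcs_lock_t *lock, struct mcs_node *node)
{
    struct mcs_node *prev;

    /* Init node before publishing it via the exchange below */
    atomic_store_explicit(&node->locked, 0, memory_order_relaxed);
    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

    /* seq_cst exchange: analogue of the kernel's full-barrier xchg() */
    prev = atomic_exchange(lock, node);
    if (prev == NULL)
        return;                     /* uncontended: lock acquired */

    atomic_store_explicit(&prev->next, node, memory_order_release);
    /* Acquire load plays the role of the smp_mb() after the spin loop */
    while (!atomic_load_explicit(&node->locked, memory_order_acquire))
        ;
}

static void mcs_unlock(mcs_lock_t *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_relaxed);

    if (next == NULL) {
        struct mcs_node *expected = node;
        /* No waiter visible: try to release the lock outright */
        if (atomic_compare_exchange_strong(lock, &expected, NULL))
            return;
        /* A waiter is enqueueing; wait for its next pointer */
        while (!(next = atomic_load_explicit(&node->next,
                                             memory_order_relaxed)))
            ;
    }
    /* Release store plays the role of the barrier before the handoff */
    atomic_store_explicit(&next->locked, 1, memory_order_release);
}
```

In this formulation the "where does the barrier go" question answers itself: the ordering is attached to the handoff store and the spin-loop load, i.e. before `next->locked = 1` in unlock and after the `while` in lock.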
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 09/30/2013 12:10 PM, Jason Low wrote: On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); I wonder if a memory barrier is really needed here. If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check so that the check occurs after an instruction in the critical section, then the barrier may be necessary. In that case, just a barrier() call should be enough. -Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 19:19 -0700, Paul E. McKenney wrote: > On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote: > > On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney > > wrote: > > > Yep. The previous lock holder's smp_wmb() won't keep either the compiler > > > or the CPU from reordering things for the new lock holder. They could for > > > example reorder the critical section to precede the node->locked check, > > > which would be very bad. > > > > Paul, Tim, Longman, > > > > How would you like the proposed changes below? > > Could you point me at what this applies to? I can find flaws looking > at random pieces, given a little luck, but at some point I need to look > at the whole thing. ;-) > > Thanx, Paul Jason's patch is on top of the following patchset: https://lkml.org/lkml/2013/9/26/674 Thanks. Tim > > > --- > > Subject: [PATCH] MCS: optimizations and barrier corrections > > > > Delete the node->locked = 1 assignment if the lock is free as it won't be > > used. > > > > Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the > > end of the mcs_spin_lock() function. As Paul McKenney suggested, "you do > > need a > > full memory barrier here in order to ensure that you see the effects of the > > previous lock holder's critical section." And in the mcs_spin_unlock(), > > move the > > memory barrier so that it is before the "ACCESS_ONCE(next->locked) = 1;". > > > > Signed-off-by: Jason Low > > Signed-off-by: Paul E. 
McKenney > > Signed-off-by: Tim Chen > > --- > > include/linux/mcslock.h |7 +++ > > 1 files changed, 3 insertions(+), 4 deletions(-) > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > index 20fd3f0..edd57d2 100644 > > --- a/include/linux/mcslock.h > > +++ b/include/linux/mcslock.h > > @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, > > struct mcs_spin_node *node) > > > > prev = xchg(lock, node); > > if (likely(prev == NULL)) { > > - /* Lock acquired */ > > - node->locked = 1; > > + /* Lock acquired. No need to set node->locked since it > > won't be used */ > > return; > > } > > ACCESS_ONCE(prev->next) = node; > > - smp_wmb(); > > /* Wait until the lock holder passes the lock down */ > > while (!ACCESS_ONCE(node->locked)) > > arch_mutex_cpu_relax(); > > + smp_mb(); > > } > > > > static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > > mcs_spin_node *node) > > @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node > > **lock, struct mcs_spin_node *n > > while (!(next = ACCESS_ONCE(node->next))) > > arch_mutex_cpu_relax(); > > } > > - ACCESS_ONCE(next->locked) = 1; > > smp_wmb(); > > + ACCESS_ONCE(next->locked) = 1; > > } > > > > #endif > > -- > > 1.7.1
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Mon, 2013-09-30 at 11:51 -0400, Waiman Long wrote: > On 09/28/2013 12:34 AM, Jason Low wrote: > >> Also, below is what the mcs_spin_lock() and mcs_spin_unlock() > >> functions would look like after applying the proposed changes. > >> > >> static noinline > >> void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > >> { > >> struct mcs_spin_node *prev; > >> > >> /* Init node */ > >> node->locked = 0; > >> node->next = NULL; > >> > >> prev = xchg(lock, node); > >> if (likely(prev == NULL)) { > >> /* Lock acquired. No need to set node->locked since it > >> won't be used */ > >> return; > >> } > >> ACCESS_ONCE(prev->next) = node; > >> /* Wait until the lock holder passes the lock down */ > >> while (!ACCESS_ONCE(node->locked)) > >> arch_mutex_cpu_relax(); > >> smp_mb(); > > I wonder if a memory barrier is really needed here. If the compiler can reorder the while (!ACCESS_ONCE(node->locked)) check so that the check occurs after an instruction in the critical section, then the barrier may be necessary.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 09/28/2013 12:34 AM, Jason Low wrote: Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); I wonder if a memory barrier is really needed here. } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *next = ACCESS_ONCE(node->next); if (likely(!next)) { /* * Release the lock by setting it to NULL */ if (cmpxchg(lock, node, NULL) == node) return; /* Wait until the next pointer is set */ while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } smp_wmb(); ACCESS_ONCE(next->locked) = 1; } Instead, I think what we need may be: if (likely(!next)) { } else smp_mb(); ACCESS_ONCE(next->locked) = 1; That will ensure a memory barrier in the unlock path. Regards, Longman
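[Editorial note] Waiman's suggested restructuring — an explicit barrier only on the waiter-handoff path, relying on cmpxchg() being a full barrier on the no-waiter path — can be sketched in user-space C11 terms as follows. This is an analogue, not the kernel code: a seq_cst compare-exchange stands in for the kernel's cmpxchg(), and a release fence stands in for the proposed barrier.

```c
#include <stdatomic.h>
#include <stddef.h>

struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_int locked;
};

typedef _Atomic(struct mcs_node *) mcs_lock_t;

/* Unlock with the barrier only where one is not already implied:
 * the seq_cst compare-exchange orders the no-waiter path (mirroring
 * the kernel assumption that cmpxchg() is a full barrier), so an
 * explicit fence is issued only when a successor is already queued. */
static void mcs_unlock(mcs_lock_t *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_relaxed);

    if (next == NULL) {
        struct mcs_node *expected = node;
        if (atomic_compare_exchange_strong(lock, &expected, NULL))
            return;                 /* no waiter: lock released */
        /* A waiter is enqueueing; wait for its next pointer */
        while (!(next = atomic_load_explicit(&node->next,
                                             memory_order_relaxed)))
            ;
    } else {
        /* The proposed barrier before the handoff store */
        atomic_thread_fence(memory_order_release);
    }
    atomic_store_explicit(&next->locked, 1, memory_order_relaxed);
}
```

Note the sketch leans on the kernel's cmpxchg-is-a-full-barrier guarantee for the failed-cmpxchg path; in portable C11 one would normally just make the final store a release store unconditionally.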
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 7:19 PM, Paul E. McKenney wrote: > On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote: >> On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney >> wrote: >> > Yep. The previous lock holder's smp_wmb() won't keep either the compiler >> > or the CPU from reordering things for the new lock holder. They could for >> > example reorder the critical section to precede the node->locked check, >> > which would be very bad. >> >> Paul, Tim, Longman, >> >> How would you like the proposed changes below? > > Could you point me at what this applies to? I can find flaws looking > at random pieces, given a little luck, but at some point I need to look > at the whole thing. ;-) Sure. Here is a link to the patch we are trying to modify: https://lkml.org/lkml/2013/9/25/532 Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired. 
No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *next = ACCESS_ONCE(node->next); if (likely(!next)) { /* * Release the lock by setting it to NULL */ if (cmpxchg(lock, node, NULL) == node) return; /* Wait until the next pointer is set */ while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } smp_wmb(); ACCESS_ONCE(next->locked) = 1; }
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 09/27/2013 02:09 PM, Tim Chen wrote: On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen Signed-off-by: Davidlohr Bueso --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. 
+ */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node->locked = 0; + node->next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; + return; + } + ACCESS_ONCE(prev->next) = node; + smp_wmb(); + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node->next); + + if (likely(!next)) { + /* +* Release the lock by setting it to NULL +*/ + if (cmpxchg(lock, node, NULL) == node) + return; + /* Wait until the next pointer is set */ + while (!(next = ACCESS_ONCE(node->next))) + arch_mutex_cpu_relax(); + } + ACCESS_ONCE(next->locked) = 1; + smp_wmb(); Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? Maybe in an "else" clause of the prior "if" statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the "if" condition is false, the critical section could bleed out past the unlock. Yes, I agree with you that the smp_wmb should be moved before ACCESS_ONCE to prevent the critical section from bleeding. Copying Waiman, who is the original author of the mcs code, to see if he has any comments on things we may have missed. Tim As a more general lock/unlock mechanism, I also agree that we should move smp_wmb() before ACCESS_ONCE(). For the mutex case, it is used as a queuing mechanism rather than guarding a critical section, so it doesn't really matter. Regards, Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote: > On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney > wrote: > > Yep. The previous lock holder's smp_wmb() won't keep either the compiler > > or the CPU from reordering things for the new lock holder. They could for > > example reorder the critical section to precede the node->locked check, > > which would be very bad. > > Paul, Tim, Longman, > > How would you like the proposed changes below? Could you point me at what this applies to? I can find flaws looking at random pieces, given a little luck, but at some point I need to look at the whole thing. ;-) Thanx, Paul > --- > Subject: [PATCH] MCS: optimizations and barrier corrections > > Delete the node->locked = 1 assignment if the lock is free as it won't be > used. > > Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the > end of the mcs_spin_lock() function. As Paul McKenney suggested, "you do need > a > full memory barrier here in order to ensure that you see the effects of the > previous lock holder's critical section." And in the mcs_spin_unlock(), move > the > memory barrier so that it is before the "ACCESS_ONCE(next->locked) = 1;". > > Signed-off-by: Jason Low > Signed-off-by: Paul E. McKenney > Signed-off-by: Tim Chen > --- > include/linux/mcslock.h |7 +++ > 1 files changed, 3 insertions(+), 4 deletions(-) > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > index 20fd3f0..edd57d2 100644 > --- a/include/linux/mcslock.h > +++ b/include/linux/mcslock.h > @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, > struct mcs_spin_node *node) > > prev = xchg(lock, node); > if (likely(prev == NULL)) { > - /* Lock acquired */ > - node->locked = 1; > + /* Lock acquired. 
No need to set node->locked since it > won't be used */ > return; > } > ACCESS_ONCE(prev->next) = node; > - smp_wmb(); > /* Wait until the lock holder passes the lock down */ > while (!ACCESS_ONCE(node->locked)) > arch_mutex_cpu_relax(); > + smp_mb(); > } > > static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node > **lock, struct mcs_spin_node *n > while (!(next = ACCESS_ONCE(node->next))) > arch_mutex_cpu_relax(); > } > - ACCESS_ONCE(next->locked) = 1; > smp_wmb(); > + ACCESS_ONCE(next->locked) = 1; > } > > #endif > -- > 1.7.1
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 16:54 -0700, Jason Low wrote: > On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney > wrote: > > Yep. The previous lock holder's smp_wmb() won't keep either the compiler > > or the CPU from reordering things for the new lock holder. They could for > > example reorder the critical section to precede the node->locked check, > > which would be very bad. > > Paul, Tim, Longman, > > How would you like the proposed changes below? > > --- > Subject: [PATCH] MCS: optimizations and barrier corrections We *really* need to comment those barriers - explicitly that is :) > > Delete the node->locked = 1 assignment if the lock is free as it won't be > used. > > Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the > end of the mcs_spin_lock() function. As Paul McKenney suggested, "you do need > a > full memory barrier here in order to ensure that you see the effects of the > previous lock holder's critical section." And in the mcs_spin_unlock(), move > the > memory barrier so that it is before the "ACCESS_ONCE(next->locked) = 1;". > > Signed-off-by: Jason Low > Signed-off-by: Paul E. McKenney > Signed-off-by: Tim Chen > --- > include/linux/mcslock.h |7 +++ > 1 files changed, 3 insertions(+), 4 deletions(-) > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > index 20fd3f0..edd57d2 100644 > --- a/include/linux/mcslock.h > +++ b/include/linux/mcslock.h > @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, > struct mcs_spin_node *node) > > prev = xchg(lock, node); > if (likely(prev == NULL)) { > - /* Lock acquired */ > - node->locked = 1; > + /* Lock acquired. No need to set node->locked since it > won't be used */ Then, we need to explain/comment then the relationship between this situation and the locked being set in mspin_unlock(), passing the lock holder down the list. 
> return; > } > ACCESS_ONCE(prev->next) = node; > - smp_wmb(); > /* Wait until the lock holder passes the lock down */ > while (!ACCESS_ONCE(node->locked)) > arch_mutex_cpu_relax(); > + smp_mb(); > } > > static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node > **lock, struct mcs_spin_node *n > while (!(next = ACCESS_ONCE(node->next))) > arch_mutex_cpu_relax(); > } > - ACCESS_ONCE(next->locked) = 1; > smp_wmb(); > + ACCESS_ONCE(next->locked) = 1; > } > > #endif
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney wrote: > Yep. The previous lock holder's smp_wmb() won't keep either the compiler > or the CPU from reordering things for the new lock holder. They could for > example reorder the critical section to precede the node->locked check, > which would be very bad. Paul, Tim, Longman, How would you like the proposed changes below? --- Subject: [PATCH] MCS: optimizations and barrier corrections Delete the node->locked = 1 assignment if the lock is free as it won't be used. Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the end of the mcs_spin_lock() function. As Paul McKenney suggested, "you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section." And in the mcs_spin_unlock(), move the memory barrier so that it is before the "ACCESS_ONCE(next->locked) = 1;". Signed-off-by: Jason Low Signed-off-by: Paul E. McKenney Signed-off-by: Tim Chen --- include/linux/mcslock.h |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h index 20fd3f0..edd57d2 100644 --- a/include/linux/mcslock.h +++ b/include/linux/mcslock.h @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) prev = xchg(lock, node); if (likely(prev == NULL)) { - /* Lock acquired */ - node->locked = 1; + /* Lock acquired. 
No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; - smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); + smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *n while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } - ACCESS_ONCE(next->locked) = 1; smp_wmb(); + ACCESS_ONCE(next->locked) = 1; } #endif -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 03:46:45PM -0700, Tim Chen wrote: > On Fri, 2013-09-27 at 13:38 -0700, Paul E. McKenney wrote: > > On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: > > > On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: > > > > On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > > > > > We will need the MCS lock code for doing optimistic spinning for > > > > > rwsem. > > > > > Extracting the MCS code from mutex.c and put into its own file allow > > > > > us > > > > > to reuse this code easily for rwsem. > > > > > > > > > > Signed-off-by: Tim Chen > > > > > Signed-off-by: Davidlohr Bueso > > > > > --- > > > > > include/linux/mcslock.h | 58 > > > > > +++ > > > > > kernel/mutex.c | 58 > > > > > +- > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > new file mode 100644 > > > > > index 000..20fd3f0 > > > > > --- /dev/null > > > > > +++ b/include/linux/mcslock.h > > > > > @@ -0,0 +1,58 @@ > > > > > +/* > > > > > + * MCS lock defines > > > > > + * > > > > > + * This file contains the main data structure and API definitions of > > > > > MCS lock. > > > > > + */ > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > +#define __LINUX_MCSLOCK_H > > > > > + > > > > > +struct mcs_spin_node { > > > > > + struct mcs_spin_node *next; > > > > > + int locked; /* 1 if lock acquired */ > > > > > +}; > > > > > + > > > > > +/* > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > account for the > > > > > + * time spent in this lock function. 
> > > > > + */ > > > > > +static noinline > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > > > *node) > > > > > +{ > > > > > + struct mcs_spin_node *prev; > > > > > + > > > > > + /* Init node */ > > > > > + node->locked = 0; > > > > > + node->next = NULL; > > > > > + > > > > > + prev = xchg(lock, node); > > > > > + if (likely(prev == NULL)) { > > > > > + /* Lock acquired */ > > > > > + node->locked = 1; > > > > > + return; > > > > > + } > > > > > + ACCESS_ONCE(prev->next) = node; > > > > > + smp_wmb(); > > > > > > BTW, is the above memory barrier necessary? It seems like the xchg > > > instruction already provided a memory barrier. > > > > > > Now if we made the changes that Jason suggested: > > > > > > > > > /* Init node */ > > > - node->locked = 0; > > > node->next = NULL; > > > > > > prev = xchg(lock, node); > > > if (likely(prev == NULL)) { > > > /* Lock acquired */ > > > - node->locked = 1; > > > return; > > > } > > > + node->locked = 0; > > > ACCESS_ONCE(prev->next) = node; > > > smp_wmb(); > > > > > > We are probably still okay as other cpus do not read the value of > > > node->locked, which is a local variable. > > > > I don't immediately see the need for the smp_wmb() in either case. > > > Thinking a bit more, the following could happen in Jason's > initial patch proposal. In this case variable "prev" referenced > by CPU1 points to "node" referenced by CPU2 > > CPU 1 (calling lock)CPU 2 (calling unlock) > ACCESS_ONCE(prev->next) = node > *next = ACCESS_ONCE(node->next); > ACCESS_ONCE(next->locked) = 1; > node->locked = 0; > > Then we will be spinning forever on CPU1 as we overwrite the lock passed > from CPU2 before we check it. The original code assign > "node->locked = 0" before xchg does not have this issue. 
> Doing the following change of moving smp_wmb immediately > after node->locked assignment (suggested by Jason) > > node->locked = 0; > smp_wmb(); > ACCESS_ONCE(prev->next) = node; > > could avoid the problem, but will need closer scrutiny to see if > there are other pitfalls if wmb happen before > > ACCESS_ONCE(prev->next) = node; I could believe that an smp_wmb() might be needed before the "ACCESS_ONCE(prev->next) = node;", just not after. > > > > > + /* Wait until the lock holder passes the lock down */ > > > > > + while (!ACCESS_ONCE(node->locked)) > > > > > + arch_mutex_cpu_relax(); > > However, you do need a full memory barrier here in order to ensure that > > you see the effects of the previous lock holder's critical section. > > Is it necessary to add a memory barrier after acquiring > the lock if the previous lock holder execute smp_wmb before passing > the lock? Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 13:38 -0700, Paul E. McKenney wrote: > On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: > > On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: > > > On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > > > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > > > Extracting the MCS code from mutex.c and put into its own file allow us > > > > to reuse this code easily for rwsem. > > > > > > > > Signed-off-by: Tim Chen > > > > Signed-off-by: Davidlohr Bueso > > > > --- > > > > include/linux/mcslock.h | 58 > > > > +++ > > > > kernel/mutex.c | 58 > > > > +- > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > new file mode 100644 > > > > index 000..20fd3f0 > > > > --- /dev/null > > > > +++ b/include/linux/mcslock.h > > > > @@ -0,0 +1,58 @@ > > > > +/* > > > > + * MCS lock defines > > > > + * > > > > + * This file contains the main data structure and API definitions of > > > > MCS lock. > > > > + */ > > > > +#ifndef __LINUX_MCSLOCK_H > > > > +#define __LINUX_MCSLOCK_H > > > > + > > > > +struct mcs_spin_node { > > > > + struct mcs_spin_node *next; > > > > + int locked; /* 1 if lock acquired */ > > > > +}; > > > > + > > > > +/* > > > > + * We don't inline mcs_spin_lock() so that perf can correctly account > > > > for the > > > > + * time spent in this lock function. 
> > > > + */ > > > > +static noinline > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > > *node) > > > > +{ > > > > + struct mcs_spin_node *prev; > > > > + > > > > + /* Init node */ > > > > + node->locked = 0; > > > > + node->next = NULL; > > > > + > > > > + prev = xchg(lock, node); > > > > + if (likely(prev == NULL)) { > > > > + /* Lock acquired */ > > > > + node->locked = 1; > > > > + return; > > > > + } > > > > + ACCESS_ONCE(prev->next) = node; > > > > + smp_wmb(); > > > > BTW, is the above memory barrier necessary? It seems like the xchg > > instruction already provided a memory barrier. > > > > Now if we made the changes that Jason suggested: > > > > > > /* Init node */ > > - node->locked = 0; > > node->next = NULL; > > > > prev = xchg(lock, node); > > if (likely(prev == NULL)) { > > /* Lock acquired */ > > - node->locked = 1; > > return; > > } > > + node->locked = 0; > > ACCESS_ONCE(prev->next) = node; > > smp_wmb(); > > > > We are probably still okay as other cpus do not read the value of > > node->locked, which is a local variable. > > I don't immediately see the need for the smp_wmb() in either case. Thinking a bit more, the following could happen in Jason's initial patch proposal. In this case variable "prev" referenced by CPU1 points to "node" referenced by CPU2 CPU 1 (calling lock)CPU 2 (calling unlock) ACCESS_ONCE(prev->next) = node *next = ACCESS_ONCE(node->next); ACCESS_ONCE(next->locked) = 1; node->locked = 0; Then we will be spinning forever on CPU1 as we overwrite the lock passed from CPU2 before we check it. The original code assign "node->locked = 0" before xchg does not have this issue. 
Doing the following change of moving smp_wmb immediately after node->locked assignment (suggested by Jason) node->locked = 0; smp_wmb(); ACCESS_ONCE(prev->next) = node; could avoid the problem, but will need closer scrutiny to see if there are other pitfalls if wmb happen before ACCESS_ONCE(prev->next) = node; > > > > > > + /* Wait until the lock holder passes the lock down */ > > > > + while (!ACCESS_ONCE(node->locked)) > > > > + arch_mutex_cpu_relax(); > > However, you do need a full memory barrier here in order to ensure that > you see the effects of the previous lock holder's critical section. Is it necessary to add a memory barrier after acquiring the lock if the previous lock holder execute smp_wmb before passing the lock? Thanks. Tim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: > On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: > > On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > > Extracting the MCS code from mutex.c and put into its own file allow us > > > to reuse this code easily for rwsem. > > > > > > Signed-off-by: Tim Chen > > > Signed-off-by: Davidlohr Bueso > > > --- > > > include/linux/mcslock.h | 58 > > > +++ > > > kernel/mutex.c | 58 > > > +- > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > create mode 100644 include/linux/mcslock.h > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > new file mode 100644 > > > index 000..20fd3f0 > > > --- /dev/null > > > +++ b/include/linux/mcslock.h > > > @@ -0,0 +1,58 @@ > > > +/* > > > + * MCS lock defines > > > + * > > > + * This file contains the main data structure and API definitions of MCS > > > lock. > > > + */ > > > +#ifndef __LINUX_MCSLOCK_H > > > +#define __LINUX_MCSLOCK_H > > > + > > > +struct mcs_spin_node { > > > + struct mcs_spin_node *next; > > > + int locked; /* 1 if lock acquired */ > > > +}; > > > + > > > +/* > > > + * We don't inline mcs_spin_lock() so that perf can correctly account > > > for the > > > + * time spent in this lock function. > > > + */ > > > +static noinline > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > *node) > > > +{ > > > + struct mcs_spin_node *prev; > > > + > > > + /* Init node */ > > > + node->locked = 0; > > > + node->next = NULL; > > > + > > > + prev = xchg(lock, node); > > > + if (likely(prev == NULL)) { > > > + /* Lock acquired */ > > > + node->locked = 1; > > > + return; > > > + } > > > + ACCESS_ONCE(prev->next) = node; > > > + smp_wmb(); > > BTW, is the above memory barrier necessary? It seems like the xchg > instruction already provided a memory barrier. 
> > Now if we made the changes that Jason suggested: > > > /* Init node */ > - node->locked = 0; > node->next = NULL; > > prev = xchg(lock, node); > if (likely(prev == NULL)) { > /* Lock acquired */ > - node->locked = 1; > return; > } > + node->locked = 0; > ACCESS_ONCE(prev->next) = node; > smp_wmb(); > > We are probably still okay as other cpus do not read the value of > node->locked, which is a local variable. I don't immediately see the need for the smp_wmb() in either case. > Tim > > > > + /* Wait until the lock holder passes the lock down */ > > > + while (!ACCESS_ONCE(node->locked)) > > > + arch_mutex_cpu_relax(); However, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. Thanx, Paul > > > +} > > > + > > > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > > > mcs_spin_node *node) > > > +{ > > > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > > > + > > > + if (likely(!next)) { > > > + /* > > > + * Release the lock by setting it to NULL > > > + */ > > > + if (cmpxchg(lock, node, NULL) == node) > > > + return; > > > + /* Wait until the next pointer is set */ > > > + while (!(next = ACCESS_ONCE(node->next))) > > > + arch_mutex_cpu_relax(); > > > + } > > > + ACCESS_ONCE(next->locked) = 1; > > > + smp_wmb(); > > > > Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? > > Maybe in an "else" clause of the prior "if" statement, given that the > > cmpxchg() does it otherwise. > > > > Otherwise, in the case where the "if" conditionn is false, the critical > > section could bleed out past the unlock. > > > > Thanx, Paul > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 12:38 PM, Tim Chen wrote: > BTW, is the above memory barrier necessary? It seems like the xchg > instruction already provided a memory barrier. > > Now if we made the changes that Jason suggested: > > > /* Init node */ > - node->locked = 0; > node->next = NULL; > > prev = xchg(lock, node); > if (likely(prev == NULL)) { > /* Lock acquired */ > - node->locked = 1; > return; > } > + node->locked = 0; > ACCESS_ONCE(prev->next) = node; > smp_wmb(); > > We are probably still okay as other cpus do not read the value of > node->locked, which is a local variable. Similarly, I was wondering if we should also move smp_wmb() so that it is before the ACCESS_ONCE(prev->next) = node and after the node->locked = 0. Would we want to guarantee that the node->locked gets set before it is added to the linked list where a previous thread calling mcs_spin_unlock() would potentially modify node->locked?
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: > On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > Extracting the MCS code from mutex.c and put into its own file allow us > > to reuse this code easily for rwsem. > > > > Signed-off-by: Tim Chen > > Signed-off-by: Davidlohr Bueso > > --- > > include/linux/mcslock.h | 58 > > +++ > > kernel/mutex.c | 58 > > +- > > 2 files changed, 65 insertions(+), 51 deletions(-) > > create mode 100644 include/linux/mcslock.h > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > new file mode 100644 > > index 000..20fd3f0 > > --- /dev/null > > +++ b/include/linux/mcslock.h > > @@ -0,0 +1,58 @@ > > +/* > > + * MCS lock defines > > + * > > + * This file contains the main data structure and API definitions of MCS > > lock. > > + */ > > +#ifndef __LINUX_MCSLOCK_H > > +#define __LINUX_MCSLOCK_H > > + > > +struct mcs_spin_node { > > + struct mcs_spin_node *next; > > + int locked; /* 1 if lock acquired */ > > +}; > > + > > +/* > > + * We don't inline mcs_spin_lock() so that perf can correctly account for > > the > > + * time spent in this lock function. > > + */ > > +static noinline > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > > +{ > > + struct mcs_spin_node *prev; > > + > > + /* Init node */ > > + node->locked = 0; > > + node->next = NULL; > > + > > + prev = xchg(lock, node); > > + if (likely(prev == NULL)) { > > + /* Lock acquired */ > > + node->locked = 1; > > + return; > > + } > > + ACCESS_ONCE(prev->next) = node; > > + smp_wmb(); BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier. 
Now if we made the changes that Jason suggested: /* Init node */ - node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ - node->locked = 1; return; } + node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); We are probably still okay as other cpus do not read the value of node->locked, which is a local variable. Tim > > + /* Wait until the lock holder passes the lock down */ > > + while (!ACCESS_ONCE(node->locked)) > > + arch_mutex_cpu_relax(); > > +} > > + > > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > > mcs_spin_node *node) > > +{ > > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > > + > > + if (likely(!next)) { > > + /* > > +* Release the lock by setting it to NULL > > +*/ > > + if (cmpxchg(lock, node, NULL) == node) > > + return; > > + /* Wait until the next pointer is set */ > > + while (!(next = ACCESS_ONCE(node->next))) > > + arch_mutex_cpu_relax(); > > + } > > + ACCESS_ONCE(next->locked) = 1; > > + smp_wmb(); > > Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? > Maybe in an "else" clause of the prior "if" statement, given that the > cmpxchg() does it otherwise. > > Otherwise, in the case where the "if" conditionn is false, the critical > section could bleed out past the unlock. > > Thanx, Paul > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: > On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > Extracting the MCS code from mutex.c and put into its own file allow us > > to reuse this code easily for rwsem. > > > > Signed-off-by: Tim Chen > > Signed-off-by: Davidlohr Bueso > > --- > > include/linux/mcslock.h | 58 > > +++ > > kernel/mutex.c | 58 > > +- > > 2 files changed, 65 insertions(+), 51 deletions(-) > > create mode 100644 include/linux/mcslock.h > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > new file mode 100644 > > index 000..20fd3f0 > > --- /dev/null > > +++ b/include/linux/mcslock.h > > @@ -0,0 +1,58 @@ > > +/* > > + * MCS lock defines > > + * > > + * This file contains the main data structure and API definitions of MCS > > lock. > > + */ > > +#ifndef __LINUX_MCSLOCK_H > > +#define __LINUX_MCSLOCK_H > > + > > +struct mcs_spin_node { > > + struct mcs_spin_node *next; > > + int locked; /* 1 if lock acquired */ > > +}; > > + > > +/* > > + * We don't inline mcs_spin_lock() so that perf can correctly account for > > the > > + * time spent in this lock function. 
> > + */ > > +static noinline > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > > +{ > > + struct mcs_spin_node *prev; > > + > > + /* Init node */ > > + node->locked = 0; > > + node->next = NULL; > > + > > + prev = xchg(lock, node); > > + if (likely(prev == NULL)) { > > + /* Lock acquired */ > > + node->locked = 1; > > + return; > > + } > > + ACCESS_ONCE(prev->next) = node; > > + smp_wmb(); > > + /* Wait until the lock holder passes the lock down */ > > + while (!ACCESS_ONCE(node->locked)) > > + arch_mutex_cpu_relax(); > > +} > > + > > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > > mcs_spin_node *node) > > +{ > > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > > + > > + if (likely(!next)) { > > + /* > > +* Release the lock by setting it to NULL > > +*/ > > + if (cmpxchg(lock, node, NULL) == node) > > + return; > > + /* Wait until the next pointer is set */ > > + while (!(next = ACCESS_ONCE(node->next))) > > + arch_mutex_cpu_relax(); > > + } > > + ACCESS_ONCE(next->locked) = 1; > > + smp_wmb(); > > Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? > Maybe in an "else" clause of the prior "if" statement, given that the > cmpxchg() does it otherwise. > > Otherwise, in the case where the "if" conditionn is false, the critical > section could bleed out past the unlock. Yes, I agree with you that the smp_wmb should be moved before ACCESS_ONCE to prevent critical section from bleeding. Copying Waiman who is the original author of the mcs code to see if he has any comments on things we may have missed. Tim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 09:12 -0700, Jason Low wrote: > On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote: > > Would be nice to have this as a separate, add-on patch. Every single > > instruction removal that has no downside is an upside! > > Okay, so here is a patch. Tim, would you like to add this to v7? Okay. Will do. Tim > > ... > Subject: MCS lock: Remove and reorder unnecessary assignments in > mcs_spin_lock() > > In mcs_spin_lock(), if (likely(prev == NULL)) is true, then the lock is free > and we won't spin on the local node. In that case, we don't have to assign > node->locked because it won't be used. We can also move the node->locked = 0 > assignment so that it occurs after the if (likely(prev == NULL)) check. > > This might also help make it clearer as to how the node->locked variable > is used in MCS locks. > > Signed-off-by: Jason Low > --- > include/linux/mcslock.h |3 +-- > 1 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > index 20fd3f0..1167d57 100644 > --- a/include/linux/mcslock.h > +++ b/include/linux/mcslock.h > @@ -21,15 +21,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > struct mcs_spin_node *prev; > > /* Init node */ > - node->locked = 0; > node->next = NULL; > > prev = xchg(lock, node); > if (likely(prev == NULL)) { > /* Lock acquired */ > - node->locked = 1; > return; > } > + node->locked = 0; > ACCESS_ONCE(prev->next) = node; > smp_wmb(); > /* Wait until the lock holder passes the lock down */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote: > Would be nice to have this as a separate, add-on patch. Every single > instruction removal that has no downside is an upside! Okay, so here is a patch. Tim, would you like to add this to v7? ... Subject: MCS lock: Remove and reorder unnecessary assignments in mcs_spin_lock() In mcs_spin_lock(), if (likely(prev == NULL)) is true, then the lock is free and we won't spin on the local node. In that case, we don't have to assign node->locked because it won't be used. We can also move the node->locked = 0 assignment so that it occurs after the if (likely(prev == NULL)) check. This might also help make it clearer as to how the node->locked variable is used in MCS locks. Signed-off-by: Jason Low --- include/linux/mcslock.h |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h index 20fd3f0..1167d57 100644 --- a/include/linux/mcslock.h +++ b/include/linux/mcslock.h @@ -21,15 +21,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) struct mcs_spin_node *prev; /* Init node */ - node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ - node->locked = 1; return; } + node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); /* Wait until the lock holder passes the lock down */ -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: > We will need the MCS lock code for doing optimistic spinning for rwsem. > Extracting the MCS code from mutex.c and put into its own file allow us > to reuse this code easily for rwsem. > > Signed-off-by: Tim Chen > Signed-off-by: Davidlohr Bueso > --- > include/linux/mcslock.h | 58 > +++ > kernel/mutex.c | 58 +- > 2 files changed, 65 insertions(+), 51 deletions(-) > create mode 100644 include/linux/mcslock.h > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > new file mode 100644 > index 000..20fd3f0 > --- /dev/null > +++ b/include/linux/mcslock.h > @@ -0,0 +1,58 @@ > +/* > + * MCS lock defines > + * > + * This file contains the main data structure and API definitions of MCS > lock. > + */ > +#ifndef __LINUX_MCSLOCK_H > +#define __LINUX_MCSLOCK_H > + > +struct mcs_spin_node { > + struct mcs_spin_node *next; > + int locked; /* 1 if lock acquired */ > +}; > + > +/* > + * We don't inline mcs_spin_lock() so that perf can correctly account for the > + * time spent in this lock function. 
> + */ > +static noinline > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > +{ > + struct mcs_spin_node *prev; > + > + /* Init node */ > + node->locked = 0; > + node->next = NULL; > + > + prev = xchg(lock, node); > + if (likely(prev == NULL)) { > + /* Lock acquired */ > + node->locked = 1; > + return; > + } > + ACCESS_ONCE(prev->next) = node; > + smp_wmb(); > + /* Wait until the lock holder passes the lock down */ > + while (!ACCESS_ONCE(node->locked)) > + arch_mutex_cpu_relax(); > +} > + > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > +{ > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > + > + if (likely(!next)) { > + /* > + * Release the lock by setting it to NULL > + */ > + if (cmpxchg(lock, node, NULL) == node) > + return; > + /* Wait until the next pointer is set */ > + while (!(next = ACCESS_ONCE(node->next))) > + arch_mutex_cpu_relax(); > + } > + ACCESS_ONCE(next->locked) = 1; > + smp_wmb(); Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? Maybe in an "else" clause of the prior "if" statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the "if" conditionn is false, the critical section could bleed out past the unlock. Thanx, Paul > +} > + > +#endif > diff --git a/kernel/mutex.c b/kernel/mutex.c > index 6d647ae..1b6ba3f 100644 > --- a/kernel/mutex.c > +++ b/kernel/mutex.c > @@ -25,6 +25,7 @@ > #include > #include > #include > +#include > > /* > * In the DEBUG case we are using the "NULL fastpath" for mutexes, > @@ -111,54 +112,9 @@ EXPORT_SYMBOL(mutex_lock); > * more or less simultaneously, the spinners need to acquire a MCS lock > * first before spinning on the owner field. > * > - * We don't inline mspin_lock() so that perf can correctly account for the > - * time spent in this lock function. 
> */ > -struct mspin_node { > - struct mspin_node *next ; > - int locked; /* 1 if lock acquired */ > -}; > -#define MLOCK(mutex)((struct mspin_node **)&((mutex)->spin_mlock)) > > -static noinline > -void mspin_lock(struct mspin_node **lock, struct mspin_node *node) > -{ > - struct mspin_node *prev; > - > - /* Init node */ > - node->locked = 0; > - node->next = NULL; > - > - prev = xchg(lock, node); > - if (likely(prev == NULL)) { > - /* Lock acquired */ > - node->locked = 1; > - return; > - } > - ACCESS_ONCE(prev->next) = node; > - smp_wmb(); > - /* Wait until the lock holder passes the lock down */ > - while (!ACCESS_ONCE(node->locked)) > - arch_mutex_cpu_relax(); > -} > - > -static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node) > -{ > - struct mspin_node *next = ACCESS_ONCE(node->next); > - > - if (likely(!next)) { > - /* > - * Release the lock by setting it to NULL > - */ > - if (cmpxchg(lock, node, NULL) == node) > - return; > - /* Wait until the next pointer is set */ > - while (!(next = ACCESS_ONCE(node->next))) > - arch_mutex_cpu_relax(); > - } > - ACCESS_ONCE(next->locked) = 1; > - smp_wmb(); > -} > +#define MLOCK(mutex)((struct mcs_spin_node > **)&((mutex)->spin_mlock)) > > /* > * Mutex spinning code migrated from kernel/sched/core.c > @@ -448,7 +404,7 @@
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 07:05:00AM -0700, Joe Perches wrote: > On Fri, 2013-09-27 at 15:48 +0200, Peter Zijlstra wrote: > > On Fri, Sep 27, 2013 at 06:44:55AM -0700, Joe Perches wrote: > > > It's a CHK test, so it's only tested with --strict > > > > > > $ scripts/checkpatch.pl -f --strict kernel/mutex.c 2>&1 | grep memory > > > CHECK: memory barrier without comment > > > CHECK: memory barrier without comment > > > > > > It could be changed to WARN so it's always on. > > > > Yes please, we can't be too careful with memory barriers. > > I'll send the patch separately. > > It seems a pretty noisy test. > There are 13 hits just in arch/x86/kernel/ Urgh. that wants fixing. We really need to stop getting more and that's where checkpatch is good at. At very very bare minimum the comment should mention where the pairing barrier is; but ideally it should describe the actual ordering.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 15:48 +0200, Peter Zijlstra wrote:
> On Fri, Sep 27, 2013 at 06:44:55AM -0700, Joe Perches wrote:
> > It's a CHK test, so it's only tested with --strict
> >
> > $ scripts/checkpatch.pl -f --strict kernel/mutex.c 2>&1 | grep memory
> > CHECK: memory barrier without comment
> > CHECK: memory barrier without comment
> >
> > It could be changed to WARN so it's always on.
>
> Yes please, we can't be too careful with memory barriers.

I'll send the patch separately.

It seems a pretty noisy test. There are 13 hits just in arch/x86/kernel/

$ ./scripts/checkpatch.pl -f arch/x86/kernel/*.c | grep -A3 "memory barrier"
WARNING: memory barrier without comment
#685: FILE: x86/kernel/alternative.c:685:
+	smp_wmb();
--
WARNING: memory barrier without comment
#401: FILE: x86/kernel/kvm.c:401:
+	rmb();
WARNING: memory barrier without comment
#403: FILE: x86/kernel/kvm.c:403:
+	rmb();
--
WARNING: memory barrier without comment
#702: FILE: x86/kernel/kvm.c:702:
+	smp_wmb();
WARNING: memory barrier without comment
#704: FILE: x86/kernel/kvm.c:704:
+	smp_wmb();
--
WARNING: memory barrier without comment
#62: FILE: x86/kernel/ldt.c:62:
+	wmb();
WARNING: memory barrier without comment
#64: FILE: x86/kernel/ldt.c:64:
+	wmb();
--
WARNING: memory barrier without comment
#204: FILE: x86/kernel/smpboot.c:204:
+	wmb();
WARNING: memory barrier without comment
#265: FILE: x86/kernel/smpboot.c:265:
+	wmb();
--
WARNING: memory barrier without comment
#557: FILE: x86/kernel/smpboot.c:557:
+	mb();
--
WARNING: memory barrier without comment
#1065: FILE: x86/kernel/smpboot.c:1065:
+	mb();
--
WARNING: memory barrier without comment
#1321: FILE: x86/kernel/smpboot.c:1321:
+	mb();
WARNING: memory barrier without comment
#1399: FILE: x86/kernel/smpboot.c:1399:
+	mb();
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 06:44:55AM -0700, Joe Perches wrote:
> It's a CHK test, so it's only tested with --strict
>
> $ scripts/checkpatch.pl -f --strict kernel/mutex.c 2>&1 | grep memory
> CHECK: memory barrier without comment
> CHECK: memory barrier without comment
>
> It could be changed to WARN so it's always on.

Yes please, we can't be too careful with memory barriers.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 13:23 +0200, Peter Zijlstra wrote:
> checkpatch.pl should really warn about that; and it appears there's
> code in there for that; however:
>
> # grep -C3 smp_mb scripts/checkpatch.pl
[]
> # check for memory barriers without a comment.
> 	if ($line =~ /\b(mb|rmb|wmb|read_barrier_depends|smp_mb|smp_rmb|smp_wmb|smp_read_barrier_depends)\(/) {
> 		if (!ctx_has_comment($first_line, $linenr)) {
> 			CHK("MEMORY_BARRIER",
> 			    "memory barrier without comment\n" . $herecurr);
[]
> # scripts/checkpatch.pl -f kernel/mutex.c 2>&1 | grep memory
> #
>
> so that appears to be completely broken :/
>
> Joe, any clue what's up with that?

It's a CHK test, so it's only tested with --strict

$ scripts/checkpatch.pl -f --strict kernel/mutex.c 2>&1 | grep memory
CHECK: memory barrier without comment
CHECK: memory barrier without comment

It could be changed to WARN so it's always on.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 08:02:13AM +0200, Ingo Molnar wrote:
> Would be nice to have this as a separate, add-on patch. Every single
> instruction removal that has no downside is an upside!
>
> You can add a comment that explains it.

If someone is going to do add-on patches to the mcslock.h file, please also consider doing a patch that adds comments to the memory barriers in there.

Also, checkpatch.pl should really warn about that; and it appears there's code in there for that; however:

# grep -C3 smp_mb scripts/checkpatch.pl
	}
	}
# check for memory barriers without a comment.
	if ($line =~ /\b(mb|rmb|wmb|read_barrier_depends|smp_mb|smp_rmb|smp_wmb|smp_read_barrier_depends)\(/) {
		if (!ctx_has_comment($first_line, $linenr)) {
			CHK("MEMORY_BARRIER",
			    "memory barrier without comment\n" . $herecurr);

# grep -C3 smp_wmb kernel/mutex.c
		return;
	}
	ACCESS_ONCE(prev->next) = node;
	smp_wmb();
	/* Wait until the lock holder passes the lock down */
	while (!ACCESS_ONCE(node->locked))
		arch_mutex_cpu_relax();
--
		arch_mutex_cpu_relax();
	}
	ACCESS_ONCE(next->locked) = 1;
	smp_wmb();
}

/*

# scripts/checkpatch.pl -f kernel/mutex.c 2>&1 | grep memory
#

so that appears to be completely broken :/

Joe, any clue what's up with that?
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote:
> * Tim Chen wrote:
>
> > > If we prefer to optimize this a bit though, perhaps we can first move
> > > the node->lock = 0 so that it gets executed after the "if (likely(prev
> > > == NULL)) {}" code block and then delete "node->lock = 1" inside the
> > > code block.
> >
> > I suppose we can save one single assignment. The gain is probably not
> > noticeable as once we set node->next to NULL, node->locked is likely in
> > local cache line and the assignment operation is cheap.
>
> Would be nice to have this as a separate, add-on patch. Every single
> instruction removal that has no downside is an upside!
>
> You can add a comment that explains it.

Yup, especially a spin lock (and one that I have found to be used very frequently when running workloads on big machines).
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
* Tim Chen wrote:

> > If we prefer to optimize this a bit though, perhaps we can first move
> > the node->lock = 0 so that it gets executed after the "if (likely(prev
> > == NULL)) {}" code block and then delete "node->lock = 1" inside the
> > code block.
>
> I suppose we can save one single assignment. The gain is probably not
> noticeable as once we set node->next to NULL, node->locked is likely in
> local cache line and the assignment operation is cheap.

Would be nice to have this as a separate, add-on patch. Every single instruction removal that has no downside is an upside!

You can add a comment that explains it.

Thanks,

	Ingo
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote:

We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem.

Signed-off-by: Tim Chen tim.c.c...@linux.intel.com
Signed-off-by: Davidlohr Bueso davidl...@hp.com
---
 include/linux/mcslock.h | 58 +++
 kernel/mutex.c          | 58 +-
 2 files changed, 65 insertions(+), 51 deletions(-)
 create mode 100644 include/linux/mcslock.h

diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
new file mode 100644
index 000..20fd3f0
--- /dev/null
+++ b/include/linux/mcslock.h
@@ -0,0 +1,58 @@
+/*
+ * MCS lock defines
+ *
+ * This file contains the main data structure and API definitions of MCS lock.
+ */
+#ifndef __LINUX_MCSLOCK_H
+#define __LINUX_MCSLOCK_H
+
+struct mcs_spin_node {
+	struct mcs_spin_node *next;
+	int		locked;	/* 1 if lock acquired */
+};
+
+/*
+ * We don't inline mcs_spin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+static noinline
+void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *prev;
+
+	/* Init node */
+	node->locked = 0;
+	node->next   = NULL;
+
+	prev = xchg(lock, node);
+	if (likely(prev == NULL)) {
+		/* Lock acquired */
+		node->locked = 1;
+		return;
+	}
+	ACCESS_ONCE(prev->next) = node;
+	smp_wmb();
+	/* Wait until the lock holder passes the lock down */
+	while (!ACCESS_ONCE(node->locked))
+		arch_mutex_cpu_relax();
+}
+
+static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *next = ACCESS_ONCE(node->next);
+
+	if (likely(!next)) {
+		/*
+		 * Release the lock by setting it to NULL
+		 */
+		if (cmpxchg(lock, node, NULL) == node)
+			return;
+		/* Wait until the next pointer is set */
+		while (!(next = ACCESS_ONCE(node->next)))
+			arch_mutex_cpu_relax();
+	}
+	ACCESS_ONCE(next->locked) = 1;
+	smp_wmb();

Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? Maybe in an else clause of the prior if statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the if condition is false, the critical section could bleed out past the unlock.

							Thanx, Paul

+}
+
+#endif
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 6d647ae..1b6ba3f 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -25,6 +25,7 @@
 #include <linux/spinlock.h>
 #include <linux/interrupt.h>
 #include <linux/debug_locks.h>
+#include <linux/mcslock.h>

 /*
  * In the DEBUG case we are using the NULL fastpath for mutexes,
@@ -111,54 +112,9 @@ EXPORT_SYMBOL(mutex_lock);
  * more or less simultaneously, the spinners need to acquire a MCS lock
  * first before spinning on the owner field.
  *
- * We don't inline mspin_lock() so that perf can correctly account for the
- * time spent in this lock function.
  */
-struct mspin_node {
-	struct mspin_node *next ;
-	int		locked;	/* 1 if lock acquired */
-};
-#define	MLOCK(mutex)	((struct mspin_node **)&((mutex)->spin_mlock))
-
-static noinline
-void mspin_lock(struct mspin_node **lock, struct mspin_node *node)
-{
-	struct mspin_node *prev;
-
-	/* Init node */
-	node->locked = 0;
-	node->next   = NULL;
-
-	prev = xchg(lock, node);
-	if (likely(prev == NULL)) {
-		/* Lock acquired */
-		node->locked = 1;
-		return;
-	}
-	ACCESS_ONCE(prev->next) = node;
-	smp_wmb();
-	/* Wait until the lock holder passes the lock down */
-	while (!ACCESS_ONCE(node->locked))
-		arch_mutex_cpu_relax();
-}
-
-static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node)
-{
-	struct mspin_node *next = ACCESS_ONCE(node->next);
-
-	if (likely(!next)) {
-		/*
-		 * Release the lock by setting it to NULL
-		 */
-		if (cmpxchg(lock, node, NULL) == node)
-			return;
-		/* Wait until the next pointer is set */
-		while (!(next = ACCESS_ONCE(node->next)))
-			arch_mutex_cpu_relax();
-	}
-	ACCESS_ONCE(next->locked) = 1;
-	smp_wmb();
-}
+#define	MLOCK(mutex)	((struct mcs_spin_node **)&((mutex)->spin_mlock))

 /*
  * Mutex spinning code migrated from kernel/sched/core.c
@@ -448,7 +404,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote:

Would be nice to have this as a separate, add-on patch. Every single instruction removal that has no downside is an upside! You can add a comment that explains it.

Okay, so here is a patch. Tim, would you like to add this to v7?

...

Subject: MCS lock: Remove and reorder unnecessary assignments in mcs_spin_lock()

In mcs_spin_lock(), if (likely(prev == NULL)) is true, then the lock is free and we won't spin on the local node. In that case, we don't have to assign node->locked because it won't be used. We can also move the node->locked = 0 assignment so that it occurs after the if (likely(prev == NULL)) check.

This might also help make it clearer as to how the node->locked variable is used in MCS locks.

Signed-off-by: Jason Low jason.l...@hp.com
---
 include/linux/mcslock.h | 3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
index 20fd3f0..1167d57 100644
--- a/include/linux/mcslock.h
+++ b/include/linux/mcslock.h
@@ -21,15 +21,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
 	struct mcs_spin_node *prev;

 	/* Init node */
-	node->locked = 0;
 	node->next   = NULL;

 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
-		node->locked = 1;
 		return;
 	}
+	node->locked = 0;
 	ACCESS_ONCE(prev->next) = node;
 	smp_wmb();
 	/* Wait until the lock holder passes the lock down */
--
1.7.1
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 09:12 -0700, Jason Low wrote:

On Fri, 2013-09-27 at 08:02 +0200, Ingo Molnar wrote: Would be nice to have this as a separate, add-on patch. Every single instruction removal that has no downside is an upside!

Okay, so here is a patch. Tim, would you like to add this to v7?

Okay. Will do.

Tim

...

Subject: MCS lock: Remove and reorder unnecessary assignments in mcs_spin_lock()

In mcs_spin_lock(), if (likely(prev == NULL)) is true, then the lock is free and we won't spin on the local node. In that case, we don't have to assign node->locked because it won't be used. We can also move the node->locked = 0 assignment so that it occurs after the if (likely(prev == NULL)) check.

This might also help make it clearer as to how the node->locked variable is used in MCS locks.

Signed-off-by: Jason Low jason.l...@hp.com
---
 include/linux/mcslock.h | 3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
index 20fd3f0..1167d57 100644
--- a/include/linux/mcslock.h
+++ b/include/linux/mcslock.h
@@ -21,15 +21,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
 	struct mcs_spin_node *prev;

 	/* Init node */
-	node->locked = 0;
 	node->next   = NULL;

 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
-		node->locked = 1;
 		return;
 	}
+	node->locked = 0;
 	ACCESS_ONCE(prev->next) = node;
 	smp_wmb();
 	/* Wait until the lock holder passes the lock down */
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote:
On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote:

We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem.

Signed-off-by: Tim Chen tim.c.c...@linux.intel.com
Signed-off-by: Davidlohr Bueso davidl...@hp.com
---
 include/linux/mcslock.h | 58 +++
 kernel/mutex.c          | 58 +-
 2 files changed, 65 insertions(+), 51 deletions(-)
 create mode 100644 include/linux/mcslock.h

diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
new file mode 100644
index 000..20fd3f0
--- /dev/null
+++ b/include/linux/mcslock.h
@@ -0,0 +1,58 @@
+/*
+ * MCS lock defines
+ *
+ * This file contains the main data structure and API definitions of MCS lock.
+ */
+#ifndef __LINUX_MCSLOCK_H
+#define __LINUX_MCSLOCK_H
+
+struct mcs_spin_node {
+	struct mcs_spin_node *next;
+	int		locked;	/* 1 if lock acquired */
+};
+
+/*
+ * We don't inline mcs_spin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+static noinline
+void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *prev;
+
+	/* Init node */
+	node->locked = 0;
+	node->next   = NULL;
+
+	prev = xchg(lock, node);
+	if (likely(prev == NULL)) {
+		/* Lock acquired */
+		node->locked = 1;
+		return;
+	}
+	ACCESS_ONCE(prev->next) = node;
+	smp_wmb();
+	/* Wait until the lock holder passes the lock down */
+	while (!ACCESS_ONCE(node->locked))
+		arch_mutex_cpu_relax();
+}
+
+static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *next = ACCESS_ONCE(node->next);
+
+	if (likely(!next)) {
+		/*
+		 * Release the lock by setting it to NULL
+		 */
+		if (cmpxchg(lock, node, NULL) == node)
+			return;
+		/* Wait until the next pointer is set */
+		while (!(next = ACCESS_ONCE(node->next)))
+			arch_mutex_cpu_relax();
+	}
+	ACCESS_ONCE(next->locked) = 1;
+	smp_wmb();

Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? Maybe in an else clause of the prior if statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the if condition is false, the critical section could bleed out past the unlock.

Yes, I agree with you that the smp_wmb should be moved before ACCESS_ONCE to prevent critical section from bleeding. Copying Waiman who is the original author of the mcs code to see if he has any comments on things we may have missed.

Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote:
On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote:

We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem.

Signed-off-by: Tim Chen tim.c.c...@linux.intel.com
Signed-off-by: Davidlohr Bueso davidl...@hp.com
---
 include/linux/mcslock.h | 58 +++
 kernel/mutex.c          | 58 +-
 2 files changed, 65 insertions(+), 51 deletions(-)
 create mode 100644 include/linux/mcslock.h

diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h
new file mode 100644
index 000..20fd3f0
--- /dev/null
+++ b/include/linux/mcslock.h
@@ -0,0 +1,58 @@
+/*
+ * MCS lock defines
+ *
+ * This file contains the main data structure and API definitions of MCS lock.
+ */
+#ifndef __LINUX_MCSLOCK_H
+#define __LINUX_MCSLOCK_H
+
+struct mcs_spin_node {
+	struct mcs_spin_node *next;
+	int		locked;	/* 1 if lock acquired */
+};
+
+/*
+ * We don't inline mcs_spin_lock() so that perf can correctly account for the
+ * time spent in this lock function.
+ */
+static noinline
+void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *prev;
+
+	/* Init node */
+	node->locked = 0;
+	node->next   = NULL;
+
+	prev = xchg(lock, node);
+	if (likely(prev == NULL)) {
+		/* Lock acquired */
+		node->locked = 1;
+		return;
+	}
+	ACCESS_ONCE(prev->next) = node;
+	smp_wmb();

BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier. Now if we made the changes that Jason suggested:

 	/* Init node */
-	node->locked = 0;
 	node->next   = NULL;

 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
-		node->locked = 1;
 		return;
 	}
+	node->locked = 0;
 	ACCESS_ONCE(prev->next) = node;
 	smp_wmb();

We are probably still okay as other cpus do not read the value of node->locked, which is a local variable.

Tim

+	/* Wait until the lock holder passes the lock down */
+	while (!ACCESS_ONCE(node->locked))
+		arch_mutex_cpu_relax();
+}
+
+static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node)
+{
+	struct mcs_spin_node *next = ACCESS_ONCE(node->next);
+
+	if (likely(!next)) {
+		/*
+		 * Release the lock by setting it to NULL
+		 */
+		if (cmpxchg(lock, node, NULL) == node)
+			return;
+		/* Wait until the next pointer is set */
+		while (!(next = ACCESS_ONCE(node->next)))
+			arch_mutex_cpu_relax();
+	}
+	ACCESS_ONCE(next->locked) = 1;
+	smp_wmb();

Shouldn't the memory barrier precede the "ACCESS_ONCE(next->locked) = 1;"? Maybe in an else clause of the prior if statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the if condition is false, the critical section could bleed out past the unlock.

							Thanx, Paul
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 12:38 PM, Tim Chen tim.c.c...@linux.intel.com wrote:

BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier. Now if we made the changes that Jason suggested:

 	/* Init node */
-	node->locked = 0;
 	node->next   = NULL;

 	prev = xchg(lock, node);
 	if (likely(prev == NULL)) {
 		/* Lock acquired */
-		node->locked = 1;
 		return;
 	}
+	node->locked = 0;
 	ACCESS_ONCE(prev->next) = node;
 	smp_wmb();

We are probably still okay as other cpus do not read the value of node->locked, which is a local variable.

Similarly, I was wondering if we should also move smp_wmb() so that it is before the ACCESS_ONCE(prev->next) = node and after the node->locked = 0. Would we want to guarantee that the node->locked gets set before it is added to the linked list where a previous thread calling mcs_spin_unlock() would potentially modify node->locked?
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. + */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node-locked = 0; + node-next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node-locked = 1; + return; + } + ACCESS_ONCE(prev-next) = node; + smp_wmb(); BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier. Now if we made the changes that Jason suggested: /* Init node */ - node-locked = 0; node-next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ - node-locked = 1; return; } + node-locked = 0; ACCESS_ONCE(prev-next) = node; smp_wmb(); We are probably still okay as other cpus do not read the value of node-locked, which is a local variable. 
I don't immediately see the need for the smp_wmb() in either case. Tim + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); However, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. Thanx, Paul +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node->next); + + if (likely(!next)) { + /* + * Release the lock by setting it to NULL + */ + if (cmpxchg(lock, node, NULL) == node) + return; + /* Wait until the next pointer is set */ + while (!(next = ACCESS_ONCE(node->next))) + arch_mutex_cpu_relax(); + } + ACCESS_ONCE(next->locked) = 1; + smp_wmb(); Shouldn't the memory barrier precede the ACCESS_ONCE(next->locked) = 1;? Maybe in an else clause of the prior if statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the if condition is false, the critical section could bleed out past the unlock. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 13:38 -0700, Paul E. McKenney wrote: On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. + */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node->locked = 0; + node->next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; + return; + } + ACCESS_ONCE(prev->next) = node; + smp_wmb(); BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier.
Now if we made the changes that Jason suggested: /* Init node */ - node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ - node->locked = 1; return; } + node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); We are probably still okay as other cpus do not read the value of node->locked, which is a local variable. I don't immediately see the need for the smp_wmb() in either case. Thinking a bit more, the following could happen in Jason's initial patch proposal. In this case the variable prev referenced by CPU 1 points to the node referenced by CPU 2:

CPU 1 (calling lock)                     CPU 2 (calling unlock)
ACCESS_ONCE(prev->next) = node
                                         *next = ACCESS_ONCE(node->next);
                                         ACCESS_ONCE(next->locked) = 1;
node->locked = 0;

Then we will be spinning forever on CPU 1 as we overwrite the lock passed from CPU 2 before we check it. The original code, assigning node->locked = 0 before the xchg, does not have this issue. Doing the following change of moving the smp_wmb immediately after the node->locked assignment (suggested by Jason) node->locked = 0; smp_wmb(); ACCESS_ONCE(prev->next) = node; could avoid the problem, but will need closer scrutiny to see if there are other pitfalls if the wmb happens before ACCESS_ONCE(prev->next) = node; + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); However, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. Is it necessary to add a memory barrier after acquiring the lock if the previous lock holder executes smp_wmb before passing the lock? Thanks. Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 03:46:45PM -0700, Tim Chen wrote: On Fri, 2013-09-27 at 13:38 -0700, Paul E. McKenney wrote: On Fri, Sep 27, 2013 at 12:38:53PM -0700, Tim Chen wrote: On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. + */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node->locked = 0; + node->next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; + return; + } + ACCESS_ONCE(prev->next) = node; + smp_wmb(); BTW, is the above memory barrier necessary? It seems like the xchg instruction already provided a memory barrier.
Now if we made the changes that Jason suggested: /* Init node */ - node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ - node->locked = 1; return; } + node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); We are probably still okay as other cpus do not read the value of node->locked, which is a local variable. I don't immediately see the need for the smp_wmb() in either case. Thinking a bit more, the following could happen in Jason's initial patch proposal. In this case the variable prev referenced by CPU 1 points to the node referenced by CPU 2:

CPU 1 (calling lock)                     CPU 2 (calling unlock)
ACCESS_ONCE(prev->next) = node
                                         *next = ACCESS_ONCE(node->next);
                                         ACCESS_ONCE(next->locked) = 1;
node->locked = 0;

Then we will be spinning forever on CPU 1 as we overwrite the lock passed from CPU 2 before we check it. The original code, assigning node->locked = 0 before the xchg, does not have this issue. Doing the following change of moving the smp_wmb immediately after the node->locked assignment (suggested by Jason) node->locked = 0; smp_wmb(); ACCESS_ONCE(prev->next) = node; could avoid the problem, but will need closer scrutiny to see if there are other pitfalls if the wmb happens before ACCESS_ONCE(prev->next) = node; I could believe that an smp_wmb() might be needed before the ACCESS_ONCE(prev->next) = node;, just not after. + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); However, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. Is it necessary to add a memory barrier after acquiring the lock if the previous lock holder executes smp_wmb before passing the lock? Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad.
Thanx, Paul
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad. Paul, Tim, Longman, How would you like the proposed changes below? --- Subject: [PATCH] MCS: optimizations and barrier corrections Delete the node->locked = 1 assignment if the lock is free as it won't be used. Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the end of the mcs_spin_lock() function. As Paul McKenney suggested, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. And in the mcs_spin_unlock(), move the memory barrier so that it is before the ACCESS_ONCE(next->locked) = 1;. Signed-off-by: Jason Low jason.l...@hp.com Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com Signed-off-by: Tim Chen tim.c.c...@linux.intel.com --- include/linux/mcslock.h |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h index 20fd3f0..edd57d2 100644 --- a/include/linux/mcslock.h +++ b/include/linux/mcslock.h @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) prev = xchg(lock, node); if (likely(prev == NULL)) { - /* Lock acquired */ - node->locked = 1; + /* Lock acquired.
No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; - smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); + smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *n while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } - ACCESS_ONCE(next->locked) = 1; smp_wmb(); + ACCESS_ONCE(next->locked) = 1; } #endif -- 1.7.1
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, 2013-09-27 at 16:54 -0700, Jason Low wrote: On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad. Paul, Tim, Longman, How would you like the proposed changes below? --- Subject: [PATCH] MCS: optimizations and barrier corrections We *really* need to comment those barriers - explicitly that is :) Delete the node->locked = 1 assignment if the lock is free as it won't be used. Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the end of the mcs_spin_lock() function. As Paul McKenney suggested, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. And in the mcs_spin_unlock(), move the memory barrier so that it is before the ACCESS_ONCE(next->locked) = 1;. Signed-off-by: Jason Low jason.l...@hp.com Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com Signed-off-by: Tim Chen tim.c.c...@linux.intel.com --- include/linux/mcslock.h |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h index 20fd3f0..edd57d2 100644 --- a/include/linux/mcslock.h +++ b/include/linux/mcslock.h @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) prev = xchg(lock, node); if (likely(prev == NULL)) { - /* Lock acquired */ - node->locked = 1; + /* Lock acquired. No need to set node->locked since it won't be used */ Then, we need to explain/comment the relationship between this situation and the locked flag being set in mcs_spin_unlock(), passing the lock holder down the list.
return; } ACCESS_ONCE(prev->next) = node; - smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); + smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *n while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } - ACCESS_ONCE(next->locked) = 1; smp_wmb(); + ACCESS_ONCE(next->locked) = 1; } #endif
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote: On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad. Paul, Tim, Longman, How would you like the proposed changes below? Could you point me at what this applies to? I can find flaws looking at random pieces, given a little luck, but at some point I need to look at the whole thing. ;-) Thanx, Paul --- Subject: [PATCH] MCS: optimizations and barrier corrections Delete the node->locked = 1 assignment if the lock is free as it won't be used. Delete the smp_wmb() in mcs_spin_lock() and add a full memory barrier at the end of the mcs_spin_lock() function. As Paul McKenney suggested, you do need a full memory barrier here in order to ensure that you see the effects of the previous lock holder's critical section. And in the mcs_spin_unlock(), move the memory barrier so that it is before the ACCESS_ONCE(next->locked) = 1;. Signed-off-by: Jason Low jason.l...@hp.com Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com Signed-off-by: Tim Chen tim.c.c...@linux.intel.com --- include/linux/mcslock.h |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h index 20fd3f0..edd57d2 100644 --- a/include/linux/mcslock.h +++ b/include/linux/mcslock.h @@ -26,15 +26,14 @@ void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) prev = xchg(lock, node); if (likely(prev == NULL)) { - /* Lock acquired */ - node->locked = 1; + /* Lock acquired.
No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; - smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); + smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) @@ -51,8 +50,8 @@ static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *n while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } - ACCESS_ONCE(next->locked) = 1; smp_wmb(); + ACCESS_ONCE(next->locked) = 1; } #endif -- 1.7.1
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On 09/27/2013 02:09 PM, Tim Chen wrote: On Fri, 2013-09-27 at 08:29 -0700, Paul E. McKenney wrote: On Wed, Sep 25, 2013 at 03:10:49PM -0700, Tim Chen wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function.
+ */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node->locked = 0; + node->next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; + return; + } + ACCESS_ONCE(prev->next) = node; + smp_wmb(); + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node->next); + + if (likely(!next)) { + /* +* Release the lock by setting it to NULL +*/ + if (cmpxchg(lock, node, NULL) == node) + return; + /* Wait until the next pointer is set */ + while (!(next = ACCESS_ONCE(node->next))) + arch_mutex_cpu_relax(); + } + ACCESS_ONCE(next->locked) = 1; + smp_wmb(); Shouldn't the memory barrier precede the ACCESS_ONCE(next->locked) = 1;? Maybe in an else clause of the prior if statement, given that the cmpxchg() does it otherwise. Otherwise, in the case where the if condition is false, the critical section could bleed out past the unlock. Yes, I agree with you that the smp_wmb should be moved before the ACCESS_ONCE to prevent the critical section from bleeding. Copying Waiman, who is the original author of the mcs code, to see if he has any comments on things we may have missed. Tim As a more general lock/unlock mechanism, I also agree that we should move smp_wmb() before ACCESS_ONCE(). For the mutex case, it is used as a queuing mechanism rather than guarding a critical section, so it doesn't really matter. Regards, Longman
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Fri, Sep 27, 2013 at 7:19 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Fri, Sep 27, 2013 at 04:54:06PM -0700, Jason Low wrote: On Fri, Sep 27, 2013 at 4:01 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: Yep. The previous lock holder's smp_wmb() won't keep either the compiler or the CPU from reordering things for the new lock holder. They could for example reorder the critical section to precede the node->locked check, which would be very bad. Paul, Tim, Longman, How would you like the proposed changes below? Could you point me at what this applies to? I can find flaws looking at random pieces, given a little luck, but at some point I need to look at the whole thing. ;-) Sure. Here is a link to the patch we are trying to modify: https://lkml.org/lkml/2013/9/25/532 Also, below is what the mcs_spin_lock() and mcs_spin_unlock() functions would look like after applying the proposed changes. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->locked = 0; node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired.
No need to set node->locked since it won't be used */ return; } ACCESS_ONCE(prev->next) = node; /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); smp_mb(); } static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *next = ACCESS_ONCE(node->next); if (likely(!next)) { /* * Release the lock by setting it to NULL */ if (cmpxchg(lock, node, NULL) == node) return; /* Wait until the next pointer is set */ while (!(next = ACCESS_ONCE(node->next))) arch_mutex_cpu_relax(); } smp_wmb(); ACCESS_ONCE(next->locked) = 1; }
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 15:42 -0700, Jason Low wrote: > On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: > > On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: > > > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: > > > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > > > > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > > > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > > > > > wrote: > > > > > > > > We will need the MCS lock code for doing optimistic spinning > > > > > > > > for rwsem. > > > > > > > > Extracting the MCS code from mutex.c and put into its own file > > > > > > > > allow us > > > > > > > > to reuse this code easily for rwsem. > > > > > > > > > > > > > > > > Signed-off-by: Tim Chen > > > > > > > > Signed-off-by: Davidlohr Bueso > > > > > > > > --- > > > > > > > > include/linux/mcslock.h | 58 > > > > > > > > +++ > > > > > > > > kernel/mutex.c | 58 > > > > > > > > +- > > > > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > > > > new file mode 100644 > > > > > > > > index 000..20fd3f0 > > > > > > > > --- /dev/null > > > > > > > > +++ b/include/linux/mcslock.h > > > > > > > > @@ -0,0 +1,58 @@ > > > > > > > > +/* > > > > > > > > + * MCS lock defines > > > > > > > > + * > > > > > > > > + * This file contains the main data structure and API > > > > > > > > definitions of MCS lock. 
> > > > > > > > + */ > > > > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > > > > +#define __LINUX_MCSLOCK_H > > > > > > > > + > > > > > > > > +struct mcs_spin_node { > > > > > > > > + struct mcs_spin_node *next; > > > > > > > > + int locked; /* 1 if lock acquired */ > > > > > > > > +}; > > > > > > > > + > > > > > > > > +/* > > > > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > > > > account for the > > > > > > > > + * time spent in this lock function. > > > > > > > > + */ > > > > > > > > +static noinline > > > > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct > > > > > > > > mcs_spin_node *node) > > > > > > > > +{ > > > > > > > > + struct mcs_spin_node *prev; > > > > > > > > + > > > > > > > > + /* Init node */ > > > > > > > > + node->locked = 0; > > > > > > > > + node->next = NULL; > > > > > > > > + > > > > > > > > + prev = xchg(lock, node); > > > > > > > > + if (likely(prev == NULL)) { > > > > > > > > + /* Lock acquired */ > > > > > > > > + node->locked = 1; > > > > > > > > > > > > > > If we don't spin on the local node, is it necessary to set this > > > > > > > variable? > > > > > > > > > > > > I don't follow, the whole idea is to spin on the local variable. > > > > > > > > > > If prev == NULL, doesn't that mean it won't proceed to spin on the > > > > > variable because the lock is already free and we call return? In that > > > > > case where we directly acquire the lock, I was wondering if it is > > > > > necessary to set node->locked = 1. 
> > > > > > > > Yes, that's true, but we need to flag the lock as acquired (the node's > > > > lock is initially set to unlocked), otherwise others trying to acquire > > > > the lock can spin forever: > > > > > > > > /* Wait until the lock holder passes the lock down */ > > > > while (!ACCESS_ONCE(node->locked)) > > > > arch_mutex_cpu_relax(); > > > > > > > > The ->locked variable in this implementation refers to if the lock is > > > > acquired, and *not* to if busy-waiting is necessary. > > > > > > hmm, others threads acquiring the lock will be spinning on their own > > > local nodes, not this node's node->locked. And if prev == NULL, the > > > current thread won't be reading it's node->lock either since we return. > > > So what other thread is going to be reading this node's node->lock? > > > > > > Thanks, > > > Jason > > > > I think setting node->locked = 1 for the prev==NULL case is not > > necessary functionally, but was done for semantics consistency. > > Okay, that would makes sense for consistency because we always > first set node->lock = 0 at the top of the function. > > If we prefer to optimize this a bit though, perhaps we can > first move the node->lock = 0 so that it gets executed after the > "if (likely(prev == NULL)) {}" code block and then delete > "node->lock = 1" inside the code block. I suppose we can save one single assignment. The gain is probably not noticeable as once we set node->next to NULL, node->locked is likely in local cache line and the assignment operation is cheap. Regards, Tim > > static noinline > void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: > On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: > > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: > > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > > > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > > > > wrote: > > > > > > > We will need the MCS lock code for doing optimistic spinning for > > > > > > > rwsem. > > > > > > > Extracting the MCS code from mutex.c and put into its own file > > > > > > > allow us > > > > > > > to reuse this code easily for rwsem. > > > > > > > > > > > > > > Signed-off-by: Tim Chen > > > > > > > Signed-off-by: Davidlohr Bueso > > > > > > > --- > > > > > > > include/linux/mcslock.h | 58 > > > > > > > +++ > > > > > > > kernel/mutex.c | 58 > > > > > > > +- > > > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > > > new file mode 100644 > > > > > > > index 000..20fd3f0 > > > > > > > --- /dev/null > > > > > > > +++ b/include/linux/mcslock.h > > > > > > > @@ -0,0 +1,58 @@ > > > > > > > +/* > > > > > > > + * MCS lock defines > > > > > > > + * > > > > > > > + * This file contains the main data structure and API > > > > > > > definitions of MCS lock. > > > > > > > + */ > > > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > > > +#define __LINUX_MCSLOCK_H > > > > > > > + > > > > > > > +struct mcs_spin_node { > > > > > > > + struct mcs_spin_node *next; > > > > > > > + int locked; /* 1 if lock acquired */ > > > > > > > +}; > > > > > > > + > > > > > > > +/* > > > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > > > account for the > > > > > > > + * time spent in this lock function. 
> > > > > > > + */ > > > > > > > +static noinline > > > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct > > > > > > > mcs_spin_node *node) > > > > > > > +{ > > > > > > > + struct mcs_spin_node *prev; > > > > > > > + > > > > > > > + /* Init node */ > > > > > > > + node->locked = 0; > > > > > > > + node->next = NULL; > > > > > > > + > > > > > > > + prev = xchg(lock, node); > > > > > > > + if (likely(prev == NULL)) { > > > > > > > + /* Lock acquired */ > > > > > > > + node->locked = 1; > > > > > > > > > > > > If we don't spin on the local node, is it necessary to set this > > > > > > variable? > > > > > > > > > > I don't follow, the whole idea is to spin on the local variable. > > > > > > > > If prev == NULL, doesn't that mean it won't proceed to spin on the > > > > variable because the lock is already free and we call return? In that > > > > case where we directly acquire the lock, I was wondering if it is > > > > necessary to set node->locked = 1. > > > > > > Yes, that's true, but we need to flag the lock as acquired (the node's > > > lock is initially set to unlocked), otherwise others trying to acquire > > > the lock can spin forever: > > > > > > /* Wait until the lock holder passes the lock down */ > > > while (!ACCESS_ONCE(node->locked)) > > > arch_mutex_cpu_relax(); > > > > > > The ->locked variable in this implementation refers to if the lock is > > > acquired, and *not* to if busy-waiting is necessary. > > > > hmm, others threads acquiring the lock will be spinning on their own > > local nodes, not this node's node->locked. And if prev == NULL, the > > current thread won't be reading it's node->lock either since we return. > > So what other thread is going to be reading this node's node->lock? > > > > Thanks, > > Jason > > I think setting node->locked = 1 for the prev==NULL case is not > necessary functionally, but was done for semantics consistency. 
Okay, that would makes sense for consistency because we always first set node->lock = 0 at the top of the function. If we prefer to optimize this a bit though, perhaps we can first move the node->lock = 0 so that it gets executed after the "if (likely(prev == NULL)) {}" code block and then delete "node->lock = 1" inside the code block. static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ return; } node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); }
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > > > wrote: > > > > > > We will need the MCS lock code for doing optimistic spinning for > > > > > > rwsem. > > > > > > Extracting the MCS code from mutex.c and put into its own file > > > > > > allow us > > > > > > to reuse this code easily for rwsem. > > > > > > > > > > > > Signed-off-by: Tim Chen > > > > > > Signed-off-by: Davidlohr Bueso > > > > > > --- > > > > > > include/linux/mcslock.h | 58 > > > > > > +++ > > > > > > kernel/mutex.c | 58 > > > > > > +- > > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > > new file mode 100644 > > > > > > index 000..20fd3f0 > > > > > > --- /dev/null > > > > > > +++ b/include/linux/mcslock.h > > > > > > @@ -0,0 +1,58 @@ > > > > > > +/* > > > > > > + * MCS lock defines > > > > > > + * > > > > > > + * This file contains the main data structure and API definitions > > > > > > of MCS lock. > > > > > > + */ > > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > > +#define __LINUX_MCSLOCK_H > > > > > > + > > > > > > +struct mcs_spin_node { > > > > > > + struct mcs_spin_node *next; > > > > > > + int locked; /* 1 if lock acquired */ > > > > > > +}; > > > > > > + > > > > > > +/* > > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > > account for the > > > > > > + * time spent in this lock function. 
> > > > > > + */ > > > > > > +static noinline > > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct > > > > > > mcs_spin_node *node) > > > > > > +{ > > > > > > + struct mcs_spin_node *prev; > > > > > > + > > > > > > + /* Init node */ > > > > > > + node->locked = 0; > > > > > > + node->next = NULL; > > > > > > + > > > > > > + prev = xchg(lock, node); > > > > > > + if (likely(prev == NULL)) { > > > > > > + /* Lock acquired */ > > > > > > + node->locked = 1; > > > > > > > > > > If we don't spin on the local node, is it necessary to set this > > > > > variable? > > > > > > > > I don't follow, the whole idea is to spin on the local variable. > > > > > > If prev == NULL, doesn't that mean it won't proceed to spin on the > > > variable because the lock is already free and we call return? In that > > > case where we directly acquire the lock, I was wondering if it is > > > necessary to set node->locked = 1. > > > > Yes, that's true, but we need to flag the lock as acquired (the node's > > lock is initially set to unlocked), otherwise others trying to acquire > > the lock can spin forever: > > > > /* Wait until the lock holder passes the lock down */ > > while (!ACCESS_ONCE(node->locked)) > > arch_mutex_cpu_relax(); > > > > The ->locked variable in this implementation refers to if the lock is > > acquired, and *not* to if busy-waiting is necessary. > > hmm, others threads acquiring the lock will be spinning on their own > local nodes, not this node's node->locked. But we're xchg'ing memory locations, not values - so after the xchg() call, when we modify node->locked, we're also changing lock, which isn't local. > And if prev == NULL, the > current thread won't be reading it's node->lock either since we return. > So what other thread is going to be reading this node's node->lock? 
> > Thanks, > Jason > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: > On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: > > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > > > wrote: > > > > > > We will need the MCS lock code for doing optimistic spinning for > > > > > > rwsem. > > > > > > Extracting the MCS code from mutex.c and put into its own file > > > > > > allow us > > > > > > to reuse this code easily for rwsem. > > > > > > > > > > > > Signed-off-by: Tim Chen > > > > > > Signed-off-by: Davidlohr Bueso > > > > > > --- > > > > > > include/linux/mcslock.h | 58 > > > > > > +++ > > > > > > kernel/mutex.c | 58 > > > > > > +- > > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > > new file mode 100644 > > > > > > index 000..20fd3f0 > > > > > > --- /dev/null > > > > > > +++ b/include/linux/mcslock.h > > > > > > @@ -0,0 +1,58 @@ > > > > > > +/* > > > > > > + * MCS lock defines > > > > > > + * > > > > > > + * This file contains the main data structure and API definitions > > > > > > of MCS lock. > > > > > > + */ > > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > > +#define __LINUX_MCSLOCK_H > > > > > > + > > > > > > +struct mcs_spin_node { > > > > > > + struct mcs_spin_node *next; > > > > > > + int locked; /* 1 if lock acquired */ > > > > > > +}; > > > > > > + > > > > > > +/* > > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > > account for the > > > > > > + * time spent in this lock function. 
> > > > > > + */ > > > > > > +static noinline > > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct > > > > > > mcs_spin_node *node) > > > > > > +{ > > > > > > + struct mcs_spin_node *prev; > > > > > > + > > > > > > + /* Init node */ > > > > > > + node->locked = 0; > > > > > > + node->next = NULL; > > > > > > + > > > > > > + prev = xchg(lock, node); > > > > > > + if (likely(prev == NULL)) { > > > > > > + /* Lock acquired */ > > > > > > + node->locked = 1; > > > > > > > > > > If we don't spin on the local node, is it necessary to set this > > > > > variable? > > > > > > > > I don't follow, the whole idea is to spin on the local variable. > > > > > > If prev == NULL, doesn't that mean it won't proceed to spin on the > > > variable because the lock is already free and we call return? In that > > > case where we directly acquire the lock, I was wondering if it is > > > necessary to set node->locked = 1. > > > > Yes, that's true, but we need to flag the lock as acquired (the node's > > lock is initially set to unlocked), otherwise others trying to acquire > > the lock can spin forever: > > > > /* Wait until the lock holder passes the lock down */ > > while (!ACCESS_ONCE(node->locked)) > > arch_mutex_cpu_relax(); > > > > The ->locked variable in this implementation refers to if the lock is > > acquired, and *not* to if busy-waiting is necessary. > > hmm, others threads acquiring the lock will be spinning on their own > local nodes, not this node's node->locked. And if prev == NULL, the > current thread won't be reading it's node->lock either since we return. > So what other thread is going to be reading this node's node->lock? > > Thanks, > Jason I think setting node->locked = 1 for the prev==NULL case is not necessary functionally, but was done for semantics consistency. 
Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: > On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > > wrote: > > > > > We will need the MCS lock code for doing optimistic spinning for > > > > > rwsem. > > > > > Extracting the MCS code from mutex.c and put into its own file allow > > > > > us > > > > > to reuse this code easily for rwsem. > > > > > > > > > > Signed-off-by: Tim Chen > > > > > Signed-off-by: Davidlohr Bueso > > > > > --- > > > > > include/linux/mcslock.h | 58 > > > > > +++ > > > > > kernel/mutex.c | 58 > > > > > +- > > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > > new file mode 100644 > > > > > index 000..20fd3f0 > > > > > --- /dev/null > > > > > +++ b/include/linux/mcslock.h > > > > > @@ -0,0 +1,58 @@ > > > > > +/* > > > > > + * MCS lock defines > > > > > + * > > > > > + * This file contains the main data structure and API definitions of > > > > > MCS lock. > > > > > + */ > > > > > +#ifndef __LINUX_MCSLOCK_H > > > > > +#define __LINUX_MCSLOCK_H > > > > > + > > > > > +struct mcs_spin_node { > > > > > + struct mcs_spin_node *next; > > > > > + int locked; /* 1 if lock acquired */ > > > > > +}; > > > > > + > > > > > +/* > > > > > + * We don't inline mcs_spin_lock() so that perf can correctly > > > > > account for the > > > > > + * time spent in this lock function. 
> > > > > + */ > > > > > +static noinline > > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > > > *node) > > > > > +{ > > > > > + struct mcs_spin_node *prev; > > > > > + > > > > > + /* Init node */ > > > > > + node->locked = 0; > > > > > + node->next = NULL; > > > > > + > > > > > + prev = xchg(lock, node); > > > > > + if (likely(prev == NULL)) { > > > > > + /* Lock acquired */ > > > > > + node->locked = 1; > > > > > > > > If we don't spin on the local node, is it necessary to set this > > > > variable? > > > > > > I don't follow, the whole idea is to spin on the local variable. > > > > If prev == NULL, doesn't that mean it won't proceed to spin on the > > variable because the lock is already free and we call return? In that > > case where we directly acquire the lock, I was wondering if it is > > necessary to set node->locked = 1. > > Yes, that's true, but we need to flag the lock as acquired (the node's > lock is initially set to unlocked), otherwise others trying to acquire > the lock can spin forever: > > /* Wait until the lock holder passes the lock down */ > while (!ACCESS_ONCE(node->locked)) > arch_mutex_cpu_relax(); > > The ->locked variable in this implementation refers to if the lock is > acquired, and *not* to if busy-waiting is necessary. hmm, others threads acquiring the lock will be spinning on their own local nodes, not this node's node->locked. And if prev == NULL, the current thread won't be reading it's node->lock either since we return. So what other thread is going to be reading this node's node->lock? Thanks, Jason -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: > On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > > wrote: > > > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > > > Extracting the MCS code from mutex.c and put into its own file allow us > > > > to reuse this code easily for rwsem. > > > > > > > > Signed-off-by: Tim Chen > > > > Signed-off-by: Davidlohr Bueso > > > > --- > > > > include/linux/mcslock.h | 58 > > > > +++ > > > > kernel/mutex.c | 58 > > > > +- > > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > > create mode 100644 include/linux/mcslock.h > > > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > > new file mode 100644 > > > > index 000..20fd3f0 > > > > --- /dev/null > > > > +++ b/include/linux/mcslock.h > > > > @@ -0,0 +1,58 @@ > > > > +/* > > > > + * MCS lock defines > > > > + * > > > > + * This file contains the main data structure and API definitions of > > > > MCS lock. > > > > + */ > > > > +#ifndef __LINUX_MCSLOCK_H > > > > +#define __LINUX_MCSLOCK_H > > > > + > > > > +struct mcs_spin_node { > > > > + struct mcs_spin_node *next; > > > > + int locked; /* 1 if lock acquired */ > > > > +}; > > > > + > > > > +/* > > > > + * We don't inline mcs_spin_lock() so that perf can correctly account > > > > for the > > > > + * time spent in this lock function. > > > > + */ > > > > +static noinline > > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > > *node) > > > > +{ > > > > + struct mcs_spin_node *prev; > > > > + > > > > + /* Init node */ > > > > + node->locked = 0; > > > > + node->next = NULL; > > > > + > > > > + prev = xchg(lock, node); > > > > + if (likely(prev == NULL)) { > > > > + /* Lock acquired */ > > > > + node->locked = 1; > > > > > > If we don't spin on the local node, is it necessary to set this variable? 
> > > > I don't follow, the whole idea is to spin on the local variable. > > If prev == NULL, doesn't that mean it won't proceed to spin on the > variable because the lock is already free and we call return? In that > case where we directly acquire the lock, I was wondering if it is > necessary to set node->locked = 1. Yes, that's true, but we need to flag the lock as acquired (the node's lock is initially set to unlocked), otherwise others trying to acquire the lock can spin forever: /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); The ->locked variable in this implementation refers to if the lock is acquired, and *not* to if busy-waiting is necessary. Thanks, Davidlohr
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: > On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen > > wrote: > > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > > Extracting the MCS code from mutex.c and put into its own file allow us > > > to reuse this code easily for rwsem. > > > > > > Signed-off-by: Tim Chen > > > Signed-off-by: Davidlohr Bueso > > > --- > > > include/linux/mcslock.h | 58 > > > +++ > > > kernel/mutex.c | 58 > > > +- > > > 2 files changed, 65 insertions(+), 51 deletions(-) > > > create mode 100644 include/linux/mcslock.h > > > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > > new file mode 100644 > > > index 000..20fd3f0 > > > --- /dev/null > > > +++ b/include/linux/mcslock.h > > > @@ -0,0 +1,58 @@ > > > +/* > > > + * MCS lock defines > > > + * > > > + * This file contains the main data structure and API definitions of MCS > > > lock. > > > + */ > > > +#ifndef __LINUX_MCSLOCK_H > > > +#define __LINUX_MCSLOCK_H > > > + > > > +struct mcs_spin_node { > > > + struct mcs_spin_node *next; > > > + int locked; /* 1 if lock acquired */ > > > +}; > > > + > > > +/* > > > + * We don't inline mcs_spin_lock() so that perf can correctly account > > > for the > > > + * time spent in this lock function. > > > + */ > > > +static noinline > > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node > > > *node) > > > +{ > > > + struct mcs_spin_node *prev; > > > + > > > + /* Init node */ > > > + node->locked = 0; > > > + node->next = NULL; > > > + > > > + prev = xchg(lock, node); > > > + if (likely(prev == NULL)) { > > > + /* Lock acquired */ > > > + node->locked = 1; > > > > If we don't spin on the local node, is it necessary to set this variable? > > I don't follow, the whole idea is to spin on the local variable. 
If prev == NULL, doesn't that mean it won't proceed to spin on the variable because the lock is already free and we call return? In that case where we directly acquire the lock, I was wondering if it is necessary to set node->locked = 1.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: > On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen wrote: > > We will need the MCS lock code for doing optimistic spinning for rwsem. > > Extracting the MCS code from mutex.c and put into its own file allow us > > to reuse this code easily for rwsem. > > > > Signed-off-by: Tim Chen > > Signed-off-by: Davidlohr Bueso > > --- > > include/linux/mcslock.h | 58 > > +++ > > kernel/mutex.c | 58 > > +- > > 2 files changed, 65 insertions(+), 51 deletions(-) > > create mode 100644 include/linux/mcslock.h > > > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > > new file mode 100644 > > index 000..20fd3f0 > > --- /dev/null > > +++ b/include/linux/mcslock.h > > @@ -0,0 +1,58 @@ > > +/* > > + * MCS lock defines > > + * > > + * This file contains the main data structure and API definitions of MCS > > lock. > > + */ > > +#ifndef __LINUX_MCSLOCK_H > > +#define __LINUX_MCSLOCK_H > > + > > +struct mcs_spin_node { > > + struct mcs_spin_node *next; > > + int locked; /* 1 if lock acquired */ > > +}; > > + > > +/* > > + * We don't inline mcs_spin_lock() so that perf can correctly account for > > the > > + * time spent in this lock function. > > + */ > > +static noinline > > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > > +{ > > + struct mcs_spin_node *prev; > > + > > + /* Init node */ > > + node->locked = 0; > > + node->next = NULL; > > + > > + prev = xchg(lock, node); > > + if (likely(prev == NULL)) { > > + /* Lock acquired */ > > + node->locked = 1; > > If we don't spin on the local node, is it necessary to set this variable? I don't follow, the whole idea is to spin on the local variable. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
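The "spin on the local variable" point is easiest to see against the lock MCS replaces. In a plain test-and-set spinlock every waiter busy-waits on the same word, so each release (and each failed test-and-set) bounces one shared cache line between all waiting CPUs; in MCS each waiter spins only on the ->locked flag in its own queue node, and a release touches exactly one successor. A minimal test-and-set lock for contrast (a hypothetical user-space sketch, not the kernel's implementation):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_flag tas_lock = ATOMIC_FLAG_INIT;
static long tas_counter;

static void tas_acquire(void)
{
	/* Every waiter hammers the SAME flag: one shared cache line. */
	while (atomic_flag_test_and_set(&tas_lock))
		;
}

static void tas_release(void)
{
	/* Invalidates the line for all spinners, who then race on it again. */
	atomic_flag_clear(&tas_lock);
}

static void *tas_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < 10000; i++) {
		tas_acquire();
		tas_counter++;		/* critical section */
		tas_release();
	}
	return NULL;
}

static long run_tas_demo(void)
{
	pthread_t tids[4];

	tas_counter = 0;
	for (int i = 0; i < 4; i++)
		pthread_create(&tids[i], NULL, tas_worker, NULL);
	for (int i = 0; i < 4; i++)
		pthread_join(tids[i], NULL);
	return tas_counter;
}
```

Both locks are correct in the mutual-exclusion sense; what MCS changes under contention is where the waiting CPUs spin, not what they compute.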
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen wrote: > We will need the MCS lock code for doing optimistic spinning for rwsem. > Extracting the MCS code from mutex.c and put into its own file allow us > to reuse this code easily for rwsem. > > Signed-off-by: Tim Chen > Signed-off-by: Davidlohr Bueso > --- > include/linux/mcslock.h | 58 > +++ > kernel/mutex.c | 58 +- > 2 files changed, 65 insertions(+), 51 deletions(-) > create mode 100644 include/linux/mcslock.h > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > new file mode 100644 > index 000..20fd3f0 > --- /dev/null > +++ b/include/linux/mcslock.h > @@ -0,0 +1,58 @@ > +/* > + * MCS lock defines > + * > + * This file contains the main data structure and API definitions of MCS > lock. > + */ > +#ifndef __LINUX_MCSLOCK_H > +#define __LINUX_MCSLOCK_H > + > +struct mcs_spin_node { > + struct mcs_spin_node *next; > + int locked; /* 1 if lock acquired */ > +}; > + > +/* > + * We don't inline mcs_spin_lock() so that perf can correctly account for the > + * time spent in this lock function. > + */ > +static noinline > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > +{ > + struct mcs_spin_node *prev; > + > + /* Init node */ > + node->locked = 0; > + node->next = NULL; > + > + prev = xchg(lock, node); > + if (likely(prev == NULL)) { > + /* Lock acquired */ > + node->locked = 1; If we don't spin on the local node, is it necessary to set this variable? > + return; > + } > + ACCESS_ONCE(prev->next) = node; > + smp_wmb(); > + /* Wait until the lock holder passes the lock down */ > + while (!ACCESS_ONCE(node->locked)) > + arch_mutex_cpu_relax(); > +} > + > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > +{ > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > + > + if (likely(!next)) { > + /* > +* Release the lock by setting it to NULL > +*/ > + if (cmpxchg(lock, node, NULL) == node) And can we make this check likely()? 
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 10:40 +0200, Peter Zijlstra wrote: > On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: > > > +/* > > > + * MCS lock defines > > > + * > > > + * This file contains the main data structure and API definitions of MCS > > > lock. > > > > A (very) short blurb about what an MCS lock is would be nice here. > > A while back I suggested including a link to something like: > > http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf > > Its a fairly concise write-up of the idea; only 6 pages. The sad part > about linking to the web is that links tend to go dead after a while. Link rot is a problem. If I provide a few details about MCS lock I think people should be able to google for it. Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
* Peter Zijlstra wrote: > On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: > > > +/* > > > + * MCS lock defines > > > + * > > > + * This file contains the main data structure and API definitions of MCS > > > lock. > > > > A (very) short blurb about what an MCS lock is would be nice here. > > A while back I suggested including a link to something like: > > http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf > > Its a fairly concise write-up of the idea; only 6 pages. The sad part > about linking to the web is that links tend to go dead after a while. So what I wanted to see was to add just a few sentences summing up the concept - so that people blundering into this file in include/linux/ have an idea what it's all about! Thanks, Ingo
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: > > +/* > > + * MCS lock defines > > + * > > + * This file contains the main data structure and API definitions of MCS > > lock. > > A (very) short blurb about what an MCS lock is would be nice here. A while back I suggested including a link to something like: http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf Its a fairly concise write-up of the idea; only 6 pages. The sad part about linking to the web is that links tend to go dead after a while.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
* Tim Chen wrote: > We will need the MCS lock code for doing optimistic spinning for rwsem. > Extracting the MCS code from mutex.c and put into its own file allow us > to reuse this code easily for rwsem. > > Signed-off-by: Tim Chen > Signed-off-by: Davidlohr Bueso > --- > include/linux/mcslock.h | 58 > +++ > kernel/mutex.c | 58 +- > 2 files changed, 65 insertions(+), 51 deletions(-) > create mode 100644 include/linux/mcslock.h > > diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h > new file mode 100644 > index 000..20fd3f0 > --- /dev/null > +++ b/include/linux/mcslock.h > @@ -0,0 +1,58 @@ > +/* > + * MCS lock defines > + * > + * This file contains the main data structure and API definitions of MCS > lock. A (very) short blurb about what an MCS lock is would be nice here. > + */ > +#ifndef __LINUX_MCSLOCK_H > +#define __LINUX_MCSLOCK_H > + > +struct mcs_spin_node { > + struct mcs_spin_node *next; > + int locked; /* 1 if lock acquired */ > +}; The vertical alignment looks broken here. > + > +/* > + * We don't inline mcs_spin_lock() so that perf can correctly account for the > + * time spent in this lock function. 
> + */ > +static noinline > +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) > +{ > + struct mcs_spin_node *prev; > + > + /* Init node */ > + node->locked = 0; > + node->next = NULL; > + > + prev = xchg(lock, node); > + if (likely(prev == NULL)) { > + /* Lock acquired */ > + node->locked = 1; > + return; > + } > + ACCESS_ONCE(prev->next) = node; > + smp_wmb(); > + /* Wait until the lock holder passes the lock down */ > + while (!ACCESS_ONCE(node->locked)) > + arch_mutex_cpu_relax(); > +} > + > +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct > mcs_spin_node *node) > +{ > + struct mcs_spin_node *next = ACCESS_ONCE(node->next); > + > + if (likely(!next)) { > + /* > + * Release the lock by setting it to NULL > + */ > + if (cmpxchg(lock, node, NULL) == node) > + return; > + /* Wait until the next pointer is set */ > + while (!(next = ACCESS_ONCE(node->next))) > + arch_mutex_cpu_relax(); > + } > + ACCESS_ONCE(next->locked) = 1; > + smp_wmb(); > +} > + > +#endif We typically close header guards not via a plain #endif but like this: #endif /* __LINUX_SPINLOCK_H */ #endif /* __LINUX_SPINLOCK_TYPES_H */ etc. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
* Tim Chen tim.c.c...@linux.intel.com wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. A (very) short blurb about what an MCS lock is would be nice here. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; The vertical alignment looks broken here. + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. 
+ */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node-locked = 0; + node-next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node-locked = 1; + return; + } + ACCESS_ONCE(prev-next) = node; + smp_wmb(); + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node-locked)) + arch_mutex_cpu_relax(); +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node-next); + + if (likely(!next)) { + /* + * Release the lock by setting it to NULL + */ + if (cmpxchg(lock, node, NULL) == node) + return; + /* Wait until the next pointer is set */ + while (!(next = ACCESS_ONCE(node-next))) + arch_mutex_cpu_relax(); + } + ACCESS_ONCE(next-locked) = 1; + smp_wmb(); +} + +#endif We typically close header guards not via a plain #endif but like this: #endif /* __LINUX_SPINLOCK_H */ #endif /* __LINUX_SPINLOCK_TYPES_H */ etc. Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. A (very) short blurb about what an MCS lock is would be nice here. A while back I suggested including a link to something like: http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf Its a fairly concise write-up of the idea; only 6 pages. The sad part about linking to the web is that links tend to go dead after a while. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
* Peter Zijlstra pet...@infradead.org wrote: On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. A (very) short blurb about what an MCS lock is would be nice here. A while back I suggested including a link to something like: http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf Its a fairly concise write-up of the idea; only 6 pages. The sad part about linking to the web is that links tend to go dead after a while. So what I wanted to see was to add just a few sentences summing up the concept - so that people blundering into this file in include/linux/ have an idea what it's all about! Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 10:40 +0200, Peter Zijlstra wrote: On Thu, Sep 26, 2013 at 08:46:29AM +0200, Ingo Molnar wrote: +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. A (very) short blurb about what an MCS lock is would be nice here. A while back I suggested including a link to something like: http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf Its a fairly concise write-up of the idea; only 6 pages. The sad part about linking to the web is that links tend to go dead after a while. Link rot is a problem. If I provide a few details about MCS lock I think people should be able to google for it. Tim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen tim.c.c...@linux.intel.com wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen tim.c.c...@linux.intel.com Signed-off-by: Davidlohr Bueso davidl...@hp.com --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. + */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node-locked = 0; + node-next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node-locked = 1; If we don't spin on the local node, is it necessary to set this variable? + return; + } + ACCESS_ONCE(prev-next) = node; + smp_wmb(); + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node-locked)) + arch_mutex_cpu_relax(); +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node-next); + + if (likely(!next)) { + /* +* Release the lock by setting it to NULL +*/ + if (cmpxchg(lock, node, NULL) == node) And can we make this check likely()? 
-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: On Wed, Sep 25, 2013 at 3:10 PM, Tim Chen tim.c.c...@linux.intel.com wrote: We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. [...] + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; If we don't spin on the local node, is it necessary to set this variable?

I don't follow, the whole idea is to spin on the local variable.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: On Thu, 2013-09-26 at 12:27 -0700, Jason Low wrote: [...] + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; If we don't spin on the local node, is it necessary to set this variable? I don't follow, the whole idea is to spin on the local variable.

If prev == NULL, doesn't that mean it won't proceed to spin on the variable, because the lock is already free and we call return? In that case where we directly acquire the lock, I was wondering if it is necessary to set node->locked = 1.
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: On Thu, 2013-09-26 at 13:06 -0700, Davidlohr Bueso wrote: [...] If prev == NULL, doesn't that mean it won't proceed to spin on the variable, because the lock is already free and we call return? In that case where we directly acquire the lock, I was wondering if it is necessary to set node->locked = 1.

Yes, that's true, but we need to flag the lock as acquired (the node's lock is initially set to unlocked), otherwise others trying to acquire the lock can spin forever: /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); The ->locked variable in this implementation refers to whether the lock is acquired, and *not* to whether busy-waiting is necessary. Thanks, Davidlohr
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: On Thu, 2013-09-26 at 13:23 -0700, Jason Low wrote: [...] Yes, that's true, but we need to flag the lock as acquired (the node's lock is initially set to unlocked), otherwise others trying to acquire the lock can spin forever: /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); The ->locked variable in this implementation refers to whether the lock is acquired, and *not* to whether busy-waiting is necessary.

hmm, other threads acquiring the lock will be spinning on their own local nodes, not this node's node->locked. And if prev == NULL, the current thread won't be reading its node->locked either, since we return. So what other thread is going to be reading this node's node->locked? Thanks, Jason
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: [...] hmm, other threads acquiring the lock will be spinning on their own local nodes, not this node's node->locked. And if prev == NULL, the current thread won't be reading its node->locked either, since we return. So what other thread is going to be reading this node's node->locked?

I think setting node->locked = 1 for the prev == NULL case is not necessary functionally, but was done for semantics consistency. Tim
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: On Thu, 2013-09-26 at 13:40 -0700, Davidlohr Bueso wrote: [...] hmm, other threads acquiring the lock will be spinning on their own local nodes, not this node's node->locked.

But we're xchg'ing memory locations, not values - so after the xchg() call, when we modify node->locked, we're also changing lock, which isn't local.

And if prev == NULL, the current thread won't be reading its node->locked either, since we return. So what other thread is going to be reading this node's node->locked? Thanks, Jason
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: On Thu, 2013-09-26 at 14:09 -0700, Jason Low wrote: [...] So what other thread is going to be reading this node's node->locked? I think setting node->locked = 1 for the prev == NULL case is not necessary functionally, but was done for semantics consistency.

Okay, that makes sense for consistency, because we always first set node->locked = 0 at the top of the function. If we prefer to optimize this a bit though, perhaps we can first move the node->locked = 0 so that it gets executed after the "if (likely(prev == NULL)) {}" code block, and then delete "node->locked = 1" inside the code block.

static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ return; } node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); }
Re: [PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
On Thu, 2013-09-26 at 15:42 -0700, Jason Low wrote: On Thu, 2013-09-26 at 14:41 -0700, Tim Chen wrote: [...] Okay, that makes sense for consistency, because we always first set node->locked = 0 at the top of the function. If we prefer to optimize this a bit though, perhaps we can first move the node->locked = 0 so that it gets executed after the "if (likely(prev == NULL)) {}" code block, and then delete "node->locked = 1" inside the code block.

I suppose we can save one single assignment. The gain is probably not noticeable, as once we set node->next to NULL, node->locked is likely in the local cache line and the assignment operation is cheap. Regards, Tim

static noinline void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) { struct mcs_spin_node *prev; /* Init node */ node->next = NULL; prev = xchg(lock, node); if (likely(prev == NULL)) { /* Lock acquired */ return; } node->locked = 0; ACCESS_ONCE(prev->next) = node; smp_wmb(); /* Wait until the lock holder passes the lock down */ while (!ACCESS_ONCE(node->locked)) arch_mutex_cpu_relax(); }
[PATCH v6 5/6] MCS Lock: Restructure the MCS lock defines and locking code into its own file
We will need the MCS lock code for doing optimistic spinning for rwsem. Extracting the MCS code from mutex.c and put into its own file allow us to reuse this code easily for rwsem. Signed-off-by: Tim Chen Signed-off-by: Davidlohr Bueso --- include/linux/mcslock.h | 58 +++ kernel/mutex.c | 58 +- 2 files changed, 65 insertions(+), 51 deletions(-) create mode 100644 include/linux/mcslock.h diff --git a/include/linux/mcslock.h b/include/linux/mcslock.h new file mode 100644 index 000..20fd3f0 --- /dev/null +++ b/include/linux/mcslock.h @@ -0,0 +1,58 @@ +/* + * MCS lock defines + * + * This file contains the main data structure and API definitions of MCS lock. + */ +#ifndef __LINUX_MCSLOCK_H +#define __LINUX_MCSLOCK_H + +struct mcs_spin_node { + struct mcs_spin_node *next; + int locked; /* 1 if lock acquired */ +}; + +/* + * We don't inline mcs_spin_lock() so that perf can correctly account for the + * time spent in this lock function. + */ +static noinline +void mcs_spin_lock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *prev; + + /* Init node */ + node->locked = 0; + node->next = NULL; + + prev = xchg(lock, node); + if (likely(prev == NULL)) { + /* Lock acquired */ + node->locked = 1; + return; + } + ACCESS_ONCE(prev->next) = node; + smp_wmb(); + /* Wait until the lock holder passes the lock down */ + while (!ACCESS_ONCE(node->locked)) + arch_mutex_cpu_relax(); +} + +static void mcs_spin_unlock(struct mcs_spin_node **lock, struct mcs_spin_node *node) +{ + struct mcs_spin_node *next = ACCESS_ONCE(node->next); + + if (likely(!next)) { + /* +* Release the lock by setting it to NULL +*/ + if (cmpxchg(lock, node, NULL) == node) + return; + /* Wait until the next pointer is set */ + while (!(next = ACCESS_ONCE(node->next))) + arch_mutex_cpu_relax(); + } + ACCESS_ONCE(next->locked) = 1; + smp_wmb(); +} + +#endif diff --git a/kernel/mutex.c b/kernel/mutex.c index 6d647ae..1b6ba3f 100644 --- a/kernel/mutex.c +++ b/kernel/mutex.c @@ -25,6 
+25,7 @@ #include <linux/spinlock.h> #include <linux/interrupt.h> #include <linux/debug_locks.h> +#include <linux/mcslock.h> /* * In the DEBUG case we are using the "NULL fastpath" for mutexes, @@ -111,54 +112,9 @@ EXPORT_SYMBOL(mutex_lock); * more or less simultaneously, the spinners need to acquire a MCS lock * first before spinning on the owner field. * - * We don't inline mspin_lock() so that perf can correctly account for the - * time spent in this lock function. */ -struct mspin_node { - struct mspin_node *next ; - int locked; /* 1 if lock acquired */ -}; -#define MLOCK(mutex) ((struct mspin_node **)&((mutex)->spin_mlock)) -static noinline -void mspin_lock(struct mspin_node **lock, struct mspin_node *node) -{ - struct mspin_node *prev; - - /* Init node */ - node->locked = 0; - node->next = NULL; - - prev = xchg(lock, node); - if (likely(prev == NULL)) { - /* Lock acquired */ - node->locked = 1; - return; - } - ACCESS_ONCE(prev->next) = node; - smp_wmb(); - /* Wait until the lock holder passes the lock down */ - while (!ACCESS_ONCE(node->locked)) - arch_mutex_cpu_relax(); -} - -static void mspin_unlock(struct mspin_node **lock, struct mspin_node *node) -{ - struct mspin_node *next = ACCESS_ONCE(node->next); - - if (likely(!next)) { - /* -* Release the lock by setting it to NULL -*/ - if (cmpxchg(lock, node, NULL) == node) - return; - /* Wait until the next pointer is set */ - while (!(next = ACCESS_ONCE(node->next))) - arch_mutex_cpu_relax(); - } - ACCESS_ONCE(next->locked) = 1; - smp_wmb(); -} +#define MLOCK(mutex) ((struct mcs_spin_node **)&((mutex)->spin_mlock)) /* * Mutex spinning code migrated from kernel/sched/core.c @@ -448,7 +404,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, for (;;) { struct task_struct *owner; - struct mspin_node node; + struct mcs_spin_node node; if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) { struct ww_mutex *ww; @@ -470,10 +426,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, * If there's an owner, wait for it to either *
release the lock or go to sleep. */ -