Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On 08/17/2014 07:41 PM, Dave Chinner wrote:
> On Fri, Aug 15, 2014 at 01:58:09PM -0400, Waiman Long wrote:
> > On 08/14/2014 11:34 PM, Dave Chinner wrote:
> > > create sparse vm image file of 500TB on ssd with XFS on it
> > >
> > > xfs_io -f -c "truncate 500t" -c "extsize 1m" /path/to/vm/image/file
> >
> > Thanks for the testing recipe. I am afraid that I can't find a 500TB
> > SSD for testing purposes.
>
> Which bit of "sparse vm image file" didn't you understand? I'm
> using a 400GB SSD for this testing:
>
> $ df -h /mnt/fast-ssd
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdf        400G  275G  125G  69% /mnt/fast-ssd
> $ ls -lh /mnt/fast-ssd/vm-500t.img
> -rw--- 1 root root 500T Aug 15 13:21 /mnt/fast-ssd/vm-500t.img
> $ du -sh /mnt/fast-ssd/vm-500t.img
> 275G    /mnt/fast-ssd/vm-500t.img
>
> That is on a Samsung 840 EVO SSD, which just about everyone should
> be able to obtain. Do you *really* think I have 500TB of SSDs lying
> around?
>
> Cheers,
>
> Dave.

I am sorry that I misunderstood your instruction. I am not a filesystem
guy and haven't run this kind of test before. Thanks for the clarification.

-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Fri, Aug 15, 2014 at 01:58:09PM -0400, Waiman Long wrote:
> On 08/14/2014 11:34 PM, Dave Chinner wrote:
> > create sparse vm image file of 500TB on ssd with XFS on it
> >
> > xfs_io -f -c "truncate 500t" -c "extsize 1m" /path/to/vm/image/file
>
> Thanks for the testing recipe. I am afraid that I can't find a 500TB
> SSD for testing purposes.

Which bit of "sparse vm image file" didn't you understand? I'm
using a 400GB SSD for this testing:

$ df -h /mnt/fast-ssd
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf        400G  275G  125G  69% /mnt/fast-ssd
$ ls -lh /mnt/fast-ssd/vm-500t.img
-rw--- 1 root root 500T Aug 15 13:21 /mnt/fast-ssd/vm-500t.img
$ du -sh /mnt/fast-ssd/vm-500t.img
275G    /mnt/fast-ssd/vm-500t.img

That is on a Samsung 840 EVO SSD, which just about everyone should
be able to obtain. Do you *really* think I have 500TB of SSDs lying
around?

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
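The ls/du difference above is what a sparse file looks like: the apparent size is large, but almost no blocks are allocated until data is written. A minimal sketch of the same effect at a much smaller scale follows. It assumes GNU coreutils (`truncate`, `stat -c`) and a filesystem that supports sparse files; the 1 GiB size and the temporary path are illustrative choices, not part of the recipe in this thread.

```shell
#!/bin/sh
# Sketch only: create a sparse file and compare its apparent size with
# the space actually allocated for it. Names and sizes are hypothetical.

workdir=$(mktemp -d)
img="$workdir/sparse.img"

# Extend the file to 1 GiB without writing any data blocks.
truncate -s 1G "$img"

apparent=$(stat -c %s "$img")      # logical size in bytes (what ls -l reports)
blocks=$(stat -c %b "$img")        # number of allocated blocks
blocksize=$(stat -c %B "$img")     # size in bytes of each block counted by %b
allocated=$((blocks * blocksize))  # bytes actually backed by storage (what du reports)

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"

rm -rf "$workdir"
```

The same reasoning explains how a 500T image can sit on a 400G device: only the regions that have been written consume space.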
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Fri, 2014-08-15 at 13:58 -0400, Waiman Long wrote:

> Thanks for the testing recipe. I am afraid that I can't find a 500TB SSD
> for testing purposes. Do you think the test will still be valid for
> exercising rwsem if I use a smaller SSD or maybe mechanical hard disk?

I suspect fs_mark will fit in less than a cubic meter of silicon.

You definitely don't want to use a 400GB USB2 drive to find out though.

FSUse%        Count         Size    Files/sec     App Overhead
     0          800                   22474.9          5551907
     0         1600                    1550.0        314304154
     0         2400                     832.3        928216719

z... nope, nobody is _that_ bored ^C

(starts xfs_repair, perf top.. snort)

Nope, you definitely don't want USB2 crapware for that either :)

-Mike
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On 08/14/2014 11:34 PM, Dave Chinner wrote:
> create sparse vm image file of 500TB on ssd with XFS on it
>
> xfs_io -f -c "truncate 500t" -c "extsize 1m" /path/to/vm/image/file
>
> start 16p/16GB RAM vm with image file configured as:
>
> -drive file=/path/to/vm/image/file,if=virtio,cache=none
>
> In vm:
>
> download and build fsmark from here:
>
> git://oss.sgi.com/dgc/fs_mark
>
> download and install xfsprogs v3.2.1 from here:
>
> git://oss.sgi.com/xfs/cmds/xfsprogs.git tags/v3.2.1
>
> Set up the target filesystem:
>
> # mkfs.xfs -f -m "crc=1,finobt=1" /dev/vda
> # mount -o logbsize=262144,nobarrier /dev/vda /mnt/scratch
>
> Run:
>
> # fs_mark -D 1 -S0 -n 5 -s 0 -L 32 \
>     -d /mnt/scratch/0 -d /mnt/scratch/1 \
>     -d /mnt/scratch/2 -d /mnt/scratch/3 \
>     -d /mnt/scratch/4 -d /mnt/scratch/5 \
>     -d /mnt/scratch/6 -d /mnt/scratch/7 \
>     -d /mnt/scratch/8 -d /mnt/scratch/9 \
>     -d /mnt/scratch/10 -d /mnt/scratch/11 \
>     -d /mnt/scratch/12 -d /mnt/scratch/13 \
>     -d /mnt/scratch/14 -d /mnt/scratch/15 \
>
> If you've got everything set up right, that should run at around
> 200-250,000 file creates/s. When finished, unmount and run:
>
> # xfs_repair -o bhash=50 /dev/vda
>
> And that should spend quite a long while pounding on the mmap_sem
> until the userspace buffer cache stops growing.
>
> I just ran the above on 3.16, saw this from perf:
>
>   37.30%  [kernel]  [k] _raw_spin_unlock_irqrestore
>    - _raw_spin_unlock_irqrestore
>       - 62.00% rwsem_wake
>          - call_rwsem_wake
>             + 83.52% sys_mprotect
>             + 16.23% __do_page_fault
>       + 35.15% try_to_wake_up
>       + 0.96% update_blocked_averages
>       + 0.61% pagevec_lru_move_fn
> -  23.35%  [kernel]  [k] _raw_spin_unlock_irq
>    - _raw_spin_unlock_irq
>       + 51.37% finish_task_switch
>       + 39.37% rwsem_down_write_failed
>       + 8.49% rwsem_down_read_failed
>         0.62% run_timer_softirq
> +   5.22%  [kernel]  [k] native_read_tsc
> +   3.89%  [kernel]  [k] rwsem_down_write_failed
> .
>
> Cheers,
>
> Dave.

Thanks for the testing recipe. I am afraid that I can't find a 500TB SSD
for testing purposes. Do you think the test will still be valid for
exercising rwsem if I use a smaller SSD or maybe mechanical hard disk?

-Longman
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Wed, Aug 13, 2014 at 12:41:06PM -0400, Waiman Long wrote:
> On 08/13/2014 01:51 AM, Dave Chinner wrote:
> > On Mon, Aug 04, 2014 at 11:44:19AM -0400, Waiman Long wrote:
> > > On 08/04/2014 12:10 AM, Jason Low wrote:
> > > > On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> > > > > The rwsem_can_spin_on_owner() function currently allows optimistic
> > > > > spinning only if the owner field is defined and is running. That is
> > > > > too conservative as it will cause some tasks to miss the opportunity
> > > > > of doing spinning in case the owner hasn't been able to set the owner
> > > > > field in time or the lock has just become available.
> > > > >
> > > > > This patch enables more aggressive use of optimistic spinning by
> > > > > assuming that the lock is spinnable unless proved otherwise.
> > > > >
> > > > > Signed-off-by: Waiman Long <waiman.l...@hp.com>
> > > > > ---
> > > > >  kernel/locking/rwsem-xadd.c |    2 +-
> > > > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > > >
> > > > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > > > > index d058946..dce22b8 100644
> > > > > --- a/kernel/locking/rwsem-xadd.c
> > > > > +++ b/kernel/locking/rwsem-xadd.c
> > > > > @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
> > > > >  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > > >  {
> > > > >  	struct task_struct *owner;
> > > > > -	bool on_cpu = false;
> > > > > +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */
> > > >
> > > > Hi,
> > > >
> > > > So "on_cpu = true" was recently converted to "on_cpu = false" in order
> > > > to address issues such as a 5x performance regression in the xfs_repair
> > > > workload that was caused by the original rwsem optimistic spinning code.
> > > >
> > > > However, patch 4 in this patchset does address some of the problems with
> > > > spinning when there are readers. CC'ing Dave Chinner, who did the
> > > > testing with the xfs_repair workload.
> > >
> > > This patch set enables proper reader spinning and so the problem
> > > that we see with the xfs_repair workload should go away. I should have
> > > placed this patch after patch 4 to make it less confusing. BTW, patch 3
> > > can significantly reduce spinlock contention in rwsem. So I believe the
> > > xfs_repair workload should run faster with this patch than both 3.15
> > > and 3.16.
> >
> > I see lots of handwaving. I documented the test I ran when I
> > reported the problem so anyone with a 16p system and an SSD can
> > reproduce it. I don't have the bandwidth to keep track of the lunacy
> > of making locks scale these days - that's what you guys are doing.
> >
> > I gave you a simple, reliable workload that is extremely sensitive
> > to rwsem perturbations, so you should be adding it to your
> > regression tests rather than leaving it for others to notice you
> > screwed up
> >
> > Cheers,
> >
> > Dave.
>
> If you can send me a rwsem workload that I can use for testing
> purposes, it will be highly appreciated.

create sparse vm image file of 500TB on ssd with XFS on it

xfs_io -f -c "truncate 500t" -c "extsize 1m" /path/to/vm/image/file

start 16p/16GB RAM vm with image file configured as:

-drive file=/path/to/vm/image/file,if=virtio,cache=none

In vm:

download and build fsmark from here:

git://oss.sgi.com/dgc/fs_mark

download and install xfsprogs v3.2.1 from here:

git://oss.sgi.com/xfs/cmds/xfsprogs.git tags/v3.2.1

Set up the target filesystem:

# mkfs.xfs -f -m "crc=1,finobt=1" /dev/vda
# mount -o logbsize=262144,nobarrier /dev/vda /mnt/scratch

Run:

# fs_mark -D 1 -S0 -n 5 -s 0 -L 32 \
    -d /mnt/scratch/0 -d /mnt/scratch/1 \
    -d /mnt/scratch/2 -d /mnt/scratch/3 \
    -d /mnt/scratch/4 -d /mnt/scratch/5 \
    -d /mnt/scratch/6 -d /mnt/scratch/7 \
    -d /mnt/scratch/8 -d /mnt/scratch/9 \
    -d /mnt/scratch/10 -d /mnt/scratch/11 \
    -d /mnt/scratch/12 -d /mnt/scratch/13 \
    -d /mnt/scratch/14 -d /mnt/scratch/15 \

If you've got everything set up right, that should run at around
200-250,000 file creates/s. When finished, unmount and run:

# xfs_repair -o bhash=50 /dev/vda

And that should spend quite a long while pounding on the mmap_sem
until the userspace buffer cache stops growing.

I just ran the above on 3.16, saw this from perf:

  37.30%  [kernel]  [k] _raw_spin_unlock_irqrestore
   - _raw_spin_unlock_irqrestore
      - 62.00% rwsem_wake
         - call_rwsem_wake
            + 83.52% sys_mprotect
            + 16.23% __do_page_fault
      + 35.15% try_to_wake_up
      + 0.96% update_blocked_averages
      + 0.61% pagevec_lru_move_fn
-  23.35%  [kernel]  [k] _raw_spin_unlock_irq
   - _raw_spin_unlock_irq
      + 51.37% finish_task_switch
      + 39.37% rwsem_down_write_failed
      + 8.49% rwsem_down_read_failed
        0.62% run_timer_softirq
+   5.22%  [kernel]  [k] native_read_tsc
+   3.89%  [kernel]  [k] rwsem_down_write_failed
.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
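The fs_mark invocation in the recipe lists sixteen target directories by hand. As a convenience, the directory arguments can be generated with a loop. This is only a sketch: the variable names are hypothetical, the flag values are copied exactly as they appear in the recipe, and the script prints the command line instead of running it, so it needs neither fs_mark nor the scratch filesystem.

```shell
#!/bin/sh
# Sketch only: build the "-d /mnt/scratch/N" arguments for N = 0..15.
# dir_args and fs_mark_cmd are illustrative names, not part of fs_mark.

dir_args=""
i=0
while [ "$i" -lt 16 ]; do
    dir_args="$dir_args -d /mnt/scratch/$i"
    i=$((i + 1))
done

# Flag values are taken verbatim from the recipe in this thread.
fs_mark_cmd="fs_mark -D 1 -S0 -n 5 -s 0 -L 32$dir_args"

# Print the command for inspection; run it by hand inside the VM.
echo "$fs_mark_cmd"
```

Printing first makes it easy to check the directory count against the number of vCPUs before committing to a long benchmark run.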
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On 08/13/2014 01:51 AM, Dave Chinner wrote:
> On Mon, Aug 04, 2014 at 11:44:19AM -0400, Waiman Long wrote:
> > On 08/04/2014 12:10 AM, Jason Low wrote:
> > > On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> > > > The rwsem_can_spin_on_owner() function currently allows optimistic
> > > > spinning only if the owner field is defined and is running. That is
> > > > too conservative as it will cause some tasks to miss the opportunity
> > > > of doing spinning in case the owner hasn't been able to set the owner
> > > > field in time or the lock has just become available.
> > > >
> > > > This patch enables more aggressive use of optimistic spinning by
> > > > assuming that the lock is spinnable unless proved otherwise.
> > > >
> > > > Signed-off-by: Waiman Long <waiman.l...@hp.com>
> > > > ---
> > > >  kernel/locking/rwsem-xadd.c |    2 +-
> > > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > >
> > > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > > > index d058946..dce22b8 100644
> > > > --- a/kernel/locking/rwsem-xadd.c
> > > > +++ b/kernel/locking/rwsem-xadd.c
> > > > @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
> > > >  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > > >  {
> > > >  	struct task_struct *owner;
> > > > -	bool on_cpu = false;
> > > > +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */
> > >
> > > Hi,
> > >
> > > So "on_cpu = true" was recently converted to "on_cpu = false" in order
> > > to address issues such as a 5x performance regression in the xfs_repair
> > > workload that was caused by the original rwsem optimistic spinning code.
> > >
> > > However, patch 4 in this patchset does address some of the problems with
> > > spinning when there are readers. CC'ing Dave Chinner, who did the
> > > testing with the xfs_repair workload.
> >
> > This patch set enables proper reader spinning and so the problem
> > that we see with the xfs_repair workload should go away. I should have
> > placed this patch after patch 4 to make it less confusing. BTW, patch 3
> > can significantly reduce spinlock contention in rwsem. So I believe the
> > xfs_repair workload should run faster with this patch than both 3.15
> > and 3.16.
>
> I see lots of handwaving. I documented the test I ran when I
> reported the problem so anyone with a 16p system and an SSD can
> reproduce it. I don't have the bandwidth to keep track of the lunacy
> of making locks scale these days - that's what you guys are doing.
>
> I gave you a simple, reliable workload that is extremely sensitive
> to rwsem perturbations, so you should be adding it to your
> regression tests rather than leaving it for others to notice you
> screwed up
>
> Cheers,
>
> Dave.

If you can send me a rwsem workload that I can use for testing
purposes, it will be highly appreciated.

-Longman
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Mon, Aug 04, 2014 at 11:44:19AM -0400, Waiman Long wrote:
> On 08/04/2014 12:10 AM, Jason Low wrote:
> > On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> > > The rwsem_can_spin_on_owner() function currently allows optimistic
> > > spinning only if the owner field is defined and is running. That is
> > > too conservative as it will cause some tasks to miss the opportunity
> > > of doing spinning in case the owner hasn't been able to set the owner
> > > field in time or the lock has just become available.
> > >
> > > This patch enables more aggressive use of optimistic spinning by
> > > assuming that the lock is spinnable unless proved otherwise.
> > >
> > > Signed-off-by: Waiman Long <waiman.l...@hp.com>
> > > ---
> > >  kernel/locking/rwsem-xadd.c |    2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > > index d058946..dce22b8 100644
> > > --- a/kernel/locking/rwsem-xadd.c
> > > +++ b/kernel/locking/rwsem-xadd.c
> > > @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
> > >  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> > >  {
> > >  	struct task_struct *owner;
> > > -	bool on_cpu = false;
> > > +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */
> >
> > Hi,
> >
> > So "on_cpu = true" was recently converted to "on_cpu = false" in order
> > to address issues such as a 5x performance regression in the xfs_repair
> > workload that was caused by the original rwsem optimistic spinning code.
> >
> > However, patch 4 in this patchset does address some of the problems with
> > spinning when there are readers. CC'ing Dave Chinner, who did the
> > testing with the xfs_repair workload.
>
> This patch set enables proper reader spinning and so the problem
> that we see with the xfs_repair workload should go away. I should have
> placed this patch after patch 4 to make it less confusing. BTW, patch 3
> can significantly reduce spinlock contention in rwsem. So I believe the
> xfs_repair workload should run faster with this patch than both 3.15
> and 3.16.

I see lots of handwaving. I documented the test I ran when I
reported the problem so anyone with a 16p system and an SSD can
reproduce it. I don't have the bandwidth to keep track of the lunacy
of making locks scale these days - that's what you guys are doing.

I gave you a simple, reliable workload that is extremely sensitive
to rwsem perturbations, so you should be adding it to your
regression tests rather than leaving it for others to notice you
screwed up

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On 08/04/2014 12:10 AM, Jason Low wrote:
> On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> > The rwsem_can_spin_on_owner() function currently allows optimistic
> > spinning only if the owner field is defined and is running. That is
> > too conservative as it will cause some tasks to miss the opportunity
> > of doing spinning in case the owner hasn't been able to set the owner
> > field in time or the lock has just become available.
> >
> > This patch enables more aggressive use of optimistic spinning by
> > assuming that the lock is spinnable unless proved otherwise.
> >
> > Signed-off-by: Waiman Long <waiman.l...@hp.com>
> > ---
> >  kernel/locking/rwsem-xadd.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index d058946..dce22b8 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
> >  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
> >  {
> >  	struct task_struct *owner;
> > -	bool on_cpu = false;
> > +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */
>
> Hi,
>
> So "on_cpu = true" was recently converted to "on_cpu = false" in order
> to address issues such as a 5x performance regression in the xfs_repair
> workload that was caused by the original rwsem optimistic spinning code.
>
> However, patch 4 in this patchset does address some of the problems with
> spinning when there are readers. CC'ing Dave Chinner, who did the
> testing with the xfs_repair workload.

This patch set enables proper reader spinning and so the problem that we
see with xfs_repair workload should go away. I should have this patch
after patch 4 to make it less confusing. BTW, patch 3 can significantly
reduce spinlock contention in rwsem. So I believe the xfs_repair
workload should run faster with this patch than both 3.15 and 3.16.

-Longman
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> The rwsem_can_spin_on_owner() function currently allows optimistic
> spinning only if the owner field is defined and is running. That is
> too conservative as it will cause some tasks to miss the opportunity
> of doing spinning in case the owner hasn't been able to set the owner
> field in time or the lock has just become available.
>
> This patch enables more aggressive use of optimistic spinning by
> assuming that the lock is spinnable unless proved otherwise.
>
> Signed-off-by: Waiman Long <waiman.l...@hp.com>
> ---
>  kernel/locking/rwsem-xadd.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index d058946..dce22b8 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
>  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
>  {
>  	struct task_struct *owner;
> -	bool on_cpu = false;
> +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */

Hi,

So "on_cpu = true" was recently converted to "on_cpu = false" in order
to address issues such as a 5x performance regression in the xfs_repair
workload that was caused by the original rwsem optimistic spinning code.

However, patch 4 in this patchset does address some of the problems with
spinning when there are readers. CC'ing Dave Chinner, who did the
testing with the xfs_repair workload.
Re: [PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
On Sun, 2014-08-03 at 22:36 -0400, Waiman Long wrote:
> The rwsem_can_spin_on_owner() function currently allows optimistic
> spinning only if the owner field is defined and is running. That is
> too conservative as it will cause some tasks to miss the opportunity
> of doing spinning in case the owner hasn't been able to set the owner
> field in time or the lock has just become available.
>
> This patch enables more aggressive use of optimistic spinning by
> assuming that the lock is spinnable unless proved otherwise.
>
> Signed-off-by: Waiman Long <waiman.l...@hp.com>
> ---
>  kernel/locking/rwsem-xadd.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index d058946..dce22b8 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
>  static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
>  {
>  	struct task_struct *owner;
> -	bool on_cpu = false;
> +	bool on_cpu = true;	/* Assume spinnable unless proved not to be */

Nope, unfortunately we need this as it fixes some pretty bad regressions
when dealing with multiple readers -- as readers do not deal with lock
ownership, so another thread can spin for too long in !owner. See commit
37e95624.
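The ambiguity described in the reply above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the kernel's data layout: `toy_task`, `toy_rwsem`, `toy_down_write()`, `toy_down_read()` and `toy_owner_unknown()` are hypothetical names, the model is single-threaded, and it ignores atomics and wait queues. The one property it keeps is that only the writer path records an owner.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_task {
	int id;
};

/* Hypothetical lock state: a reader count plus a writer-only owner field. */
struct toy_rwsem {
	int readers;		/* number of readers holding the lock */
	struct toy_task *owner;	/* set by writers only, NULL otherwise */
};

/* Writer acquire: succeeds only on a free lock and records the owner. */
bool toy_down_write(struct toy_rwsem *sem, struct toy_task *me)
{
	if (sem->readers > 0 || sem->owner)
		return false;
	sem->owner = me;
	return true;
}

/* Reader acquire: succeeds unless a writer holds the lock; owner stays NULL. */
bool toy_down_read(struct toy_rwsem *sem)
{
	if (sem->owner)
		return false;
	sem->readers++;
	return true;
}

/* All that a would-be spinner learns from the owner field alone. */
bool toy_owner_unknown(const struct toy_rwsem *sem)
{
	return sem->owner == NULL;
}
```

In this model a free lock and a reader-held lock both report `toy_owner_unknown() == true`. A waiter that treats a missing owner as "spinnable" therefore cannot tell the two apart, and no owner appears for as long as readers hold the lock.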
[PATCH 2/7] locking/rwsem: more aggressive use of optimistic spinning
The rwsem_can_spin_on_owner() function currently allows optimistic
spinning only if the owner field is defined and is running. That is
too conservative as it will cause some tasks to miss the opportunity
of doing spinning in case the owner hasn't been able to set the owner
field in time or the lock has just become available.

This patch enables more aggressive use of optimistic spinning by
assuming that the lock is spinnable unless proved otherwise.

Signed-off-by: Waiman Long <waiman.l...@hp.com>
---
 kernel/locking/rwsem-xadd.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index d058946..dce22b8 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -285,7 +285,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
-	bool on_cpu = false;
+	bool on_cpu = true;	/* Assume spinnable unless proved not to be */

 	if (need_resched())
 		return false;
--
1.7.1
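The effect of the one-line change can be sketched in user space. The model below is an illustration under stated assumptions rather than the kernel function: `model_task`, `model_rwsem` and `model_can_spin_on_owner()` are hypothetical names, and the `need_resched()` check is reduced to a plain boolean argument with no RCU or memory-ordering concerns. The `assume_spinnable` parameter stands for the initial value of `on_cpu` that the patch flips.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler state of a task. */
struct model_task {
	bool on_cpu;			/* true while the task runs on a CPU */
};

/* Hypothetical stand-in for the one rwsem field that matters here. */
struct model_rwsem {
	struct model_task *owner;	/* recorded lock holder, or NULL */
};

/*
 * Decide whether a waiter should attempt optimistic spinning.
 * assume_spinnable models the initial value of on_cpu:
 * false is the behaviour before the patch, true the behaviour after it.
 */
bool model_can_spin_on_owner(const struct model_rwsem *sem,
			     bool resched_needed,
			     bool assume_spinnable)
{
	bool on_cpu = assume_spinnable;

	/* Never spin when this CPU has to reschedule anyway. */
	if (resched_needed)
		return false;

	/* A recorded owner always overrides the initial assumption. */
	if (sem->owner)
		on_cpu = sem->owner->on_cpu;

	return on_cpu;
}
```

The two initial values only disagree when no owner is recorded: with `owner == NULL` the model returns false for `assume_spinnable == false` and true for `assume_spinnable == true`. Once an owner is set, the result follows `owner->on_cpu` in both cases.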