Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 10/23/2017 07:25 PM, kemi wrote: > > > On 2017年10月24日 09:21, Andi Kleen wrote: >> kemiwrites: >>> >>> I'll see if I can find some time to implement the above in a nice way. >>> >>> Agree. Maybe something like test_and_set_bit() would be more suitable. >> >> test_and_set_bit is a very different operation for the CPU because >> it is atomic for both. But we want the initial read to not >> be atomic. >> > > I meant to express the meaning of test before setting bit. > Apologize to make you confused. That's why I suggested something like set_bit_if_not_set(), test_and_set_bit() is both already used and has entirely different semantics. -- Jens Axboe
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 10/23/2017 07:25 PM, kemi wrote: > > > On 2017年10月24日 09:21, Andi Kleen wrote: >> kemi writes: >>> >>> I'll see if I can find some time to implement the above in a nice way. >>> >>> Agree. Maybe something like test_and_set_bit() would be more suitable. >> >> test_and_set_bit is a very different operation for the CPU because >> it is atomic for both. But we want the initial read to not >> be atomic. >> > > I meant to express the meaning of test before setting bit. > Apologize to make you confused. That's why I suggested something like set_bit_if_not_set(), test_and_set_bit() is both already used and has entirely different semantics. -- Jens Axboe
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 2017年10月24日 09:21, Andi Kleen wrote: > kemiwrites: >> >> I'll see if I can find some >>> time to implement the above in a nice way. >> >> Agree. Maybe something like test_and_set_bit() would be more suitable. > > test_and_set_bit is a very different operation for the CPU because > it is atomic for both. But we want the initial read to not > be atomic. > I meant to express the meaning of test before setting bit. Apologize to make you confused. > If you add special functions use a different variant that is only > atomic for the set. > > -Andi >
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 2017年10月24日 09:21, Andi Kleen wrote: > kemi writes: >> >> I'll see if I can find some >>> time to implement the above in a nice way. >> >> Agree. Maybe something like test_and_set_bit() would be more suitable. > > test_and_set_bit is a very different operation for the CPU because > it is atomic for both. But we want the initial read to not > be atomic. > I meant to express the meaning of test before setting bit. Apologize to make you confused. > If you add special functions use a different variant that is only > atomic for the set. > > -Andi >
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
kemiwrites: > > I'll see if I can find some >> time to implement the above in a nice way. > > Agree. Maybe something like test_and_set_bit() would be more suitable. test_and_set_bit is a very different operation for the CPU because it is atomic for both. But we want the initial read to not be atomic. If you add special functions use a different variant that is only atomic for the set. -Andi
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
kemi writes: > > I'll see if I can find some >> time to implement the above in a nice way. > > Agree. Maybe something like test_and_set_bit() would be more suitable. test_and_set_bit is a very different operation for the CPU because it is atomic for both. But we want the initial read to not be atomic. If you add special functions use a different variant that is only atomic for the set. -Andi
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 2017年10月24日 00:19, Jens Axboe wrote: > On 10/23/2017 10:27 AM, Kemi Wang wrote: >> It's expensive to set buffer flags that are already set, because that >> causes a costly cache line transition. >> >> A common case is setting the "verified" flag during ext4 writes. >> This patch checks for the flag being set first. >> >> With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4 >> file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on >> a 2-sockets broadwell platform. >> >> What the benchmark does is: it forks 3000 processes, and each process do >> the following: >> a) open a new file >> b) close the file >> c) delete the file >> until loop=100*1000 times. >> >> The original patch is contributed by Andi Kleen. > > We discussed this recently, in reference to this commit: > > commit 7fcbbaf18392f0b17c95e2f033c8ccf87eecde1d > Author: Jens Axboe> Date: Thu May 22 11:54:16 2014 -0700 > > mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT > > which made a massive difference, as the changelog details. > > blk-mq uses this extensively as well, where possible. The problem is > that it always has to be explained, hence the recent discussion was > around perhaps adding > > set_bit_if_not_set() > clear_bit_if_set() > > or similar functions, to document in a single location why this matters. > Additionally, some archs may be able to implement that in an efficient > manner. > > You can add my reviewed-by to the below, Thanks. I'll see if I can find some > time to implement the above in a nice way. Agree. Maybe something like test_and_set_bit() would be more suitable. In the mean time, you may > want to consider adding a comment to the function explaining why you > have done it that way. > Sure.
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 2017年10月24日 00:19, Jens Axboe wrote: > On 10/23/2017 10:27 AM, Kemi Wang wrote: >> It's expensive to set buffer flags that are already set, because that >> causes a costly cache line transition. >> >> A common case is setting the "verified" flag during ext4 writes. >> This patch checks for the flag being set first. >> >> With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4 >> file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on >> a 2-sockets broadwell platform. >> >> What the benchmark does is: it forks 3000 processes, and each process do >> the following: >> a) open a new file >> b) close the file >> c) delete the file >> until loop=100*1000 times. >> >> The original patch is contributed by Andi Kleen. > > We discussed this recently, in reference to this commit: > > commit 7fcbbaf18392f0b17c95e2f033c8ccf87eecde1d > Author: Jens Axboe > Date: Thu May 22 11:54:16 2014 -0700 > > mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT > > which made a massive difference, as the changelog details. > > blk-mq uses this extensively as well, where possible. The problem is > that it always has to be explained, hence the recent discussion was > around perhaps adding > > set_bit_if_not_set() > clear_bit_if_set() > > or similar functions, to document in a single location why this matters. > Additionally, some archs may be able to implement that in an efficient > manner. > > You can add my reviewed-by to the below, Thanks. I'll see if I can find some > time to implement the above in a nice way. Agree. Maybe something like test_and_set_bit() would be more suitable. In the mean time, you may > want to consider adding a comment to the function explaining why you > have done it that way. > Sure.
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 10/23/2017 10:27 AM, Kemi Wang wrote: > It's expensive to set buffer flags that are already set, because that > causes a costly cache line transition. > > A common case is setting the "verified" flag during ext4 writes. > This patch checks for the flag being set first. > > With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4 > file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on > a 2-sockets broadwell platform. > > What the benchmark does is: it forks 3000 processes, and each process do > the following: > a) open a new file > b) close the file > c) delete the file > until loop=100*1000 times. > > The original patch is contributed by Andi Kleen. We discussed this recently, in reference to this commit: commit 7fcbbaf18392f0b17c95e2f033c8ccf87eecde1d Author: Jens AxboeDate: Thu May 22 11:54:16 2014 -0700 mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT which made a massive difference, as the changelog details. blk-mq uses this extensively as well, where possible. The problem is that it always has to be explained, hence the recent discussion was around perhaps adding set_bit_if_not_set() clear_bit_if_set() or similar functions, to document in a single location why this matters. Additionally, some archs may be able to implement that in an efficient manner. You can add my reviewed-by to the below, I'll see if I can find some time to implement the above in a nice way. In the mean time, you may want to consider adding a comment to the function explaining why you have done it that way. -- Jens Axboe
Re: [PATCH] buffer: Avoid setting buffer bits that are already set
On 10/23/2017 10:27 AM, Kemi Wang wrote: > It's expensive to set buffer flags that are already set, because that > causes a costly cache line transition. > > A common case is setting the "verified" flag during ext4 writes. > This patch checks for the flag being set first. > > With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4 > file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on > a 2-sockets broadwell platform. > > What the benchmark does is: it forks 3000 processes, and each process do > the following: > a) open a new file > b) close the file > c) delete the file > until loop=100*1000 times. > > The original patch is contributed by Andi Kleen. We discussed this recently, in reference to this commit: commit 7fcbbaf18392f0b17c95e2f033c8ccf87eecde1d Author: Jens Axboe Date: Thu May 22 11:54:16 2014 -0700 mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT which made a massive difference, as the changelog details. blk-mq uses this extensively as well, where possible. The problem is that it always has to be explained, hence the recent discussion was around perhaps adding set_bit_if_not_set() clear_bit_if_set() or similar functions, to document in a single location why this matters. Additionally, some archs may be able to implement that in an efficient manner. You can add my reviewed-by to the below, I'll see if I can find some time to implement the above in a nice way. In the mean time, you may want to consider adding a comment to the function explaining why you have done it that way. -- Jens Axboe