Re: [PATCHv3 1/1] ext4: Optimize file overwrites

2020-10-02 Thread Theodore Y. Ts'o
On Fri, Sep 18, 2020 at 10:36:35AM +0530, Ritesh Harjani wrote:
> In case if the file already has underlying blocks/extents allocated
> then we don't need to start a journal txn and can directly return
> the underlying mapping. Currently ext4_iomap_begin() is used by
> both DAX & DIO path. We can check if the write request is an
> overwrite & then directly return the mapping information.
> 
> This could give a significant perf boost for multi-threaded writes
> specially random overwrites.
> On PPC64 VM with simulated pmem(DAX) device, ~10x perf improvement
> could be seen in random writes (overwrite). Also bcoz this optimizes
> away the spinlock contention during jbd2 slab cache allocation
> (jbd2_journal_handle). On x86 VM, ~2x perf improvement was observed.
> 
> Reported-by: Dan Williams 
> Suggested-by: Jan Kara 
> Signed-off-by: Ritesh Harjani 

Thanks, applied.

- Ted


Re: [PATCHv3 1/1] ext4: Optimize file overwrites

2020-09-18 Thread Jan Kara
On Fri 18-09-20 10:36:35, Ritesh Harjani wrote:
> In case if the file already has underlying blocks/extents allocated
> then we don't need to start a journal txn and can directly return
> the underlying mapping. Currently ext4_iomap_begin() is used by
> both DAX & DIO path. We can check if the write request is an
> overwrite & then directly return the mapping information.
> 
> This could give a significant perf boost for multi-threaded writes
> specially random overwrites.
> On PPC64 VM with simulated pmem(DAX) device, ~10x perf improvement
> could be seen in random writes (overwrite). Also bcoz this optimizes
> away the spinlock contention during jbd2 slab cache allocation
> (jbd2_journal_handle). On x86 VM, ~2x perf improvement was observed.
> 
> Reported-by: Dan Williams 
> Suggested-by: Jan Kara 
> Signed-off-by: Ritesh Harjani 

The patch looks good to me. You can add:

Reviewed-by: Jan Kara 

Honza

> ---
>  fs/ext4/inode.c | 18 +++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 10dd470876b3..6eae17758ece 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3437,14 +3437,26 @@ static int ext4_iomap_begin(struct inode *inode, 
> loff_t offset, loff_t length,
>   map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
> EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
>  
> - if (flags & IOMAP_WRITE)
> + if (flags & IOMAP_WRITE) {
> + /*
> +  * We check here if the blocks are already allocated, then we
> +  * don't need to start a journal txn and we can directly return
> +  * the mapping information. This could boost performance
> +  * especially in multi-threaded overwrite requests.
> +  */
> + if (offset + length <= i_size_read(inode)) {
> + ret = ext4_map_blocks(NULL, inode, , 0);
> + if (ret > 0 && (map.m_flags & EXT4_MAP_MAPPED))
> + goto out;
> + }
>   ret = ext4_iomap_alloc(inode, , flags);
> - else
> + } else {
>   ret = ext4_map_blocks(NULL, inode, , 0);
> + }
>  
>   if (ret < 0)
>   return ret;
> -
> +out:
>   ext4_set_iomap(inode, iomap, , offset, length);
>  
>   return 0;
> -- 
> 2.26.2
> 
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCHv3 1/1] ext4: Optimize file overwrites

2020-09-18 Thread Sedat Dilek
On Fri, Sep 18, 2020 at 7:09 AM Ritesh Harjani  wrote:
>
> In case if the file already has underlying blocks/extents allocated
> then we don't need to start a journal txn and can directly return
> the underlying mapping. Currently ext4_iomap_begin() is used by
> both DAX & DIO path. We can check if the write request is an
> overwrite & then directly return the mapping information.
>
> This could give a significant perf boost for multi-threaded writes
> specially random overwrites.
> On PPC64 VM with simulated pmem(DAX) device, ~10x perf improvement
> could be seen in random writes (overwrite). Also bcoz this optimizes
> away the spinlock contention during jbd2 slab cache allocation
> (jbd2_journal_handle). On x86 VM, ~2x perf improvement was observed.
>
> Reported-by: Dan Williams 
> Suggested-by: Jan Kara 
> Signed-off-by: Ritesh Harjani 

I have applied your patch on top of recent Linus Git and boot-tested on x86-64.

Here I have LTP installed.
If you have a LTP filesystem test-/use-case you know for testing,
please let me know.

Yes, I have seen the FIO config in the cover-letter.
Maybe you have a different FIO config - 16G AFAIK is too big here.

Feel free to add...

Tested-by: Sedat Dilek  # Compile and boot on
x86-64 Debian/unstable

Thanks.

- Sedat -

> ---
>  fs/ext4/inode.c | 18 +++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 10dd470876b3..6eae17758ece 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3437,14 +3437,26 @@ static int ext4_iomap_begin(struct inode *inode, 
> loff_t offset, loff_t length,
> map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
>   EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
>
> -   if (flags & IOMAP_WRITE)
> +   if (flags & IOMAP_WRITE) {
> +   /*
> +* We check here if the blocks are already allocated, then we
> +* don't need to start a journal txn and we can directly 
> return
> +* the mapping information. This could boost performance
> +* especially in multi-threaded overwrite requests.
> +*/
> +   if (offset + length <= i_size_read(inode)) {
> +   ret = ext4_map_blocks(NULL, inode, , 0);
> +   if (ret > 0 && (map.m_flags & EXT4_MAP_MAPPED))
> +   goto out;
> +   }
> ret = ext4_iomap_alloc(inode, , flags);
> -   else
> +   } else {
> ret = ext4_map_blocks(NULL, inode, , 0);
> +   }
>
> if (ret < 0)
> return ret;
> -
> +out:
> ext4_set_iomap(inode, iomap, , offset, length);
>
> return 0;
> --
> 2.26.2
>


[PATCHv3 1/1] ext4: Optimize file overwrites

2020-09-17 Thread Ritesh Harjani
In case if the file already has underlying blocks/extents allocated
then we don't need to start a journal txn and can directly return
the underlying mapping. Currently ext4_iomap_begin() is used by
both DAX & DIO path. We can check if the write request is an
overwrite & then directly return the mapping information.

This could give a significant perf boost for multi-threaded writes
specially random overwrites.
On PPC64 VM with simulated pmem(DAX) device, ~10x perf improvement
could be seen in random writes (overwrite). Also bcoz this optimizes
away the spinlock contention during jbd2 slab cache allocation
(jbd2_journal_handle). On x86 VM, ~2x perf improvement was observed.

Reported-by: Dan Williams 
Suggested-by: Jan Kara 
Signed-off-by: Ritesh Harjani 
---
 fs/ext4/inode.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 10dd470876b3..6eae17758ece 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3437,14 +3437,26 @@ static int ext4_iomap_begin(struct inode *inode, loff_t 
offset, loff_t length,
map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
  EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
 
-   if (flags & IOMAP_WRITE)
+   if (flags & IOMAP_WRITE) {
+   /*
+* We check here if the blocks are already allocated, then we
+* don't need to start a journal txn and we can directly return
+* the mapping information. This could boost performance
+* especially in multi-threaded overwrite requests.
+*/
+   if (offset + length <= i_size_read(inode)) {
+   ret = ext4_map_blocks(NULL, inode, , 0);
+   if (ret > 0 && (map.m_flags & EXT4_MAP_MAPPED))
+   goto out;
+   }
ret = ext4_iomap_alloc(inode, , flags);
-   else
+   } else {
ret = ext4_map_blocks(NULL, inode, , 0);
+   }
 
if (ret < 0)
return ret;
-
+out:
ext4_set_iomap(inode, iomap, , offset, length);
 
return 0;
-- 
2.26.2