On Thu 21-01-16 10:46:02, Ross Zwisler wrote:
> Several of the subtleties and assumptions of the DAX fsync/msync
> implementation are not immediately obvious, so document them with comments.
> 
> Signed-off-by: Ross Zwisler <[email protected]>
> Reported-by: Jan Kara <[email protected]>

Thanks, the comments really help! Just two nits below, otherwise feel free
to add:

Reviewed-by: Jan Kara <[email protected]>

> ---
>  fs/dax.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index d589113..55ae394 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -350,6 +350,13 @@ static int dax_radix_entry(struct address_space *mapping, pgoff_t index,
>  
>               if (!pmd_entry || type == RADIX_DAX_PMD)
>                       goto dirty;
> +
> +             /*
> +              * We only insert dirty PMD entries into the radix tree.  This
> +              * means we don't need to worry about removing a dirty PTE
> +              * entry and inserting a clean PMD entry, thus reducing the
> +              * range we would flush with a follow-up fsync/msync call.
> +              */

Maybe accompany this with:

                WARN_ON(pmd_entry && !dirty);

somewhere in dax_radix_entry()?
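
For illustration only, a minimal user-space sketch of the invariant that
check would enforce. The warn_on() helper and radix_entry_args_ok() are
hypothetical stand-ins invented for this example (the real WARN_ON() is a
kernel macro, and the real check would sit inside dax_radix_entry()); only
the condition pmd_entry && !dirty comes from the suggestion above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the kernel's WARN_ON(): it logs
 * the failed condition, counts it, and returns the condition's value. */
static int warn_count;

static bool warn_on(bool cond, const char *expr)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

#define WARN_ON(cond) warn_on((cond), #cond)

/* Hypothetical model of the argument check only: PMD entries are expected
 * to be inserted dirty, so a clean PMD insert trips the warning.
 * Returns true when the arguments respect that invariant. */
static bool radix_entry_args_ok(bool pmd_entry, bool dirty)
{
	return !WARN_ON(pmd_entry && !dirty);
}
```

With this model, a clean PTE insert and a dirty PMD insert pass silently,
while a clean PMD insert is the only combination that warns.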

>               radix_tree_delete(&mapping->page_tree, index);
>               mapping->nrexceptional--;
>       }
> @@ -912,6 +919,21 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>               }
>               dax_unmap_atomic(bdev, &dax);
>  
> +             /*
> +              * For PTE faults we insert a radix tree entry for reads, and
> +              * leave it clean.  Then on the first write we dirty the radix
> +              * tree entry via the dax_pnf_mkwrite() path.  This sequence
                                          ^^^ pfn

                                                                Honza
-- 
Jan Kara <[email protected]>
SUSE Labs, CR
