On Thu 13-10-16 14:34:34, Ross Zwisler wrote:
> On Mon, Oct 03, 2016 at 01:13:58PM +0200, Jan Kara wrote:
> > On Mon 03-10-16 02:32:48, Christoph Hellwig wrote:
> > > On Mon, Oct 03, 2016 at 10:15:49AM +0200, Jan Kara wrote:
> > > > Yeah, so DAX path is special because it installs its own PTE directly
> > > > from the fault handler which we don't do in any other case (only
> > > > driver fault handlers commonly do this but those generally don't care
> > > > about ->page_mkwrite or file mappings for that matter).
> > > >
> > > > I don't say there are no simplifications or unifications possible, but
> > > > I'd prefer to leave them for a bit later once the current churn with
> > > > ongoing work somewhat settles...
> > >
> > > All right, let's keep it simple for now. That being said, this series clearly
> > > is 4.9 material, but any chance to get a respin of the invalidate_pages
> >
> > Agreed (actually 4.10).
> >
> > > series as that might still be 4.8 material?
> >
> > The problem with the invalidate_pages series is that it depends on the
> > ability to clear the dirty bits in the radix tree of DAX mappings (i.e.
> > the first series). Otherwise radix tree entries that have once become
> > dirty can never be safely evicted, invalidate_inode_pages2_range() will
> > keep returning EBUSY and callers get confused (I tried that a few weeks
> > ago).
> >
> > If I dropped patch 5/6 for the 4.9 merge (i.e., we would still happily
> > discard dirty radix tree entries from invalidate_inode_pages2_range()),
> > things would run fine, just fsync() may fail to flush caches for some
> > pages. I'm not sure that's much better than the current status quo
> > though. Thoughts?
>
> I'm not sure if I'm understanding this correctly, but if you're saying
> that we might end up in a case where fsync()/msync() would fail to
> properly flush pages that are/should be dirty, I think this is a no-go.
> That could result in data corruption if a user calls fsync(), thinks
> they've achieved a synchronization point (updating other metadata or
> whatever), then via power loss they lose data they had flushed via that
> previous fsync() because it was still in the CPU cache and never really
> made it out to media.

I know, and the current code is actually buggy in exactly that way as well;
this patch set fixes it. My point was only that applying just part of the
fixes, so that the main problem remains unfixed, would not be very
beneficial anyway.
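
To make the failure mode discussed above concrete, here is a minimal
userspace sketch of the kind of pattern that relies on msync()/fsync() being
a real synchronization point. It is only an illustration under stated
assumptions: the record layout, the commit_record() helper and the file path
passed to it are hypothetical and are not taken from either patch series.
If the first msync() returned without the payload having reached media, a
power loss after the flag update could leave a record marked committed with
stale payload.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 4096
#define PAYLOAD_MAX (REGION_SIZE - sizeof(uint64_t))

/* Hypothetical layout: a commit flag followed by the payload. */
struct record {
	uint64_t committed;
	char payload[PAYLOAD_MAX];
};

/*
 * Hypothetical helper: store payload in a shared file mapping and mark it
 * committed only after the payload itself has been synced.
 * Returns 0 on success, -1 on failure.
 */
int commit_record(const char *path, const char *payload)
{
	assert(path != NULL && payload != NULL);

	size_t len = strlen(payload) + 1;
	if (len > PAYLOAD_MAX)
		return -1;

	int fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, REGION_SIZE) != 0) {
		close(fd);
		return -1;
	}

	struct record *r = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
				MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (r == MAP_FAILED)
		return -1;

	int rc = -1;

	/* Invalidate any previous record before touching the payload. */
	r->committed = 0;
	if (msync(r, REGION_SIZE, MS_SYNC) != 0)
		goto out;

	/* Sync point: the payload must be durable before the flag is set. */
	memcpy(r->payload, payload, len);
	if (msync(r, REGION_SIZE, MS_SYNC) != 0)
		goto out;

	/* Only now publish the record. */
	r->committed = 1;
	if (msync(r, REGION_SIZE, MS_SYNC) != 0)
		goto out;

	rc = 0;
out:
	munmap(r, REGION_SIZE);
	return rc;
}
```

The ordering only holds if each msync() really flushes everything that was
dirtied through the mapping, which is the guarantee under discussion here.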

This week I plan to rebase both series on top of rc1 + your THP patches so
that we can move on with merging the stuff.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR