Re: [PATCH 0/3] remove rw_page() from brd, pmem and btt

Ross Zwisler Fri, 04 Aug 2017 11:21:27 -0700

On Fri, Aug 04, 2017 at 11:01:08AM -0700, Dan Williams wrote:
> [ adding Dave who is working on a blk-mq + dma offload version of the
> pmem driver ]
> 
> On Fri, Aug 4, 2017 at 1:17 AM, Minchan Kim <[email protected]> wrote:
> > On Fri, Aug 04, 2017 at 12:54:41PM +0900, Minchan Kim wrote:
> [..]
> >> Thanks for the testing. Your testing number is within noise level?
> >>
> >> I cannot understand why PMEM doesn't have enough gain while BTT is a
> >> significant win (8%). I guess no rw_page with BTT testing had more chances
> >> to wait bio dynamic allocation and mine and rw_page testing reduced it
> >> significantly. However, in no rw_page with pmem, there wasn't many cases to
> >> wait bio allocations due to the device is so fast so the number comes from
> >> purely the number of instructions has done. At a quick glance of bio
> >> init/submit, it's not trivial so indeed, i understand where the 12%
> >> enhancement comes from but I'm not sure it's really big difference in real
> >> practice at the cost of maintenance burden.
> >
> > I tested pmbench 10 times in my local machine(4 core) with zram-swap.
> > In my machine, even, on-stack bio is faster than rw_page. Unbelievable.
> >
> > I guess it's really hard to get stable result in severe memory pressure.
> > It would be a result within noise level(see below stddev).
> > So, I think it's hard to conclude rw_page is far faster than onstack-bio.
> >
> > rw_page
> > avg     5.54us
> > stddev  8.89%
> > max     6.02us
> > min     4.20us
> >
> > onstack bio
> > avg     5.27us
> > stddev  13.03%
> > max     5.96us
> > min     3.55us
> 
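To make the "within noise level" reading of the numbers above concrete, here is a rough sketch of the arithmetic. It assumes the stddev percentages are relative to each average and that each configuration was run 10 times; the mail only says pmbench was run 10 times, so the per-configuration run count is an assumption, as are the helper names.

```python
import math

# Figures quoted above: (average latency in us, stddev as % of the average).
RW_PAGE = (5.54, 8.89)
ONSTACK_BIO = (5.27, 13.03)
RUNS = 10  # assumption: 10 runs per configuration


def abs_stddev(avg_us, rel_pct):
    """Convert a relative stddev (percent of the average) to microseconds."""
    return avg_us * rel_pct / 100.0


def welch_t(a, b, n):
    """Welch's t statistic for two samples of size n, given (avg, rel stddev)."""
    sa, sb = abs_stddev(*a), abs_stddev(*b)
    std_err = math.sqrt(sa * sa / n + sb * sb / n)
    return (a[0] - b[0]) / std_err


diff_us = RW_PAGE[0] - ONSTACK_BIO[0]
t_stat = welch_t(RW_PAGE, ONSTACK_BIO, RUNS)

print(f"difference of averages: {diff_us:.2f} us")
print(f"rw_page stddev:         {abs_stddev(*RW_PAGE):.2f} us")
print(f"onstack bio stddev:     {abs_stddev(*ONSTACK_BIO):.2f} us")
print(f"Welch t statistic:      {t_stat:.2f}")
```

Under those assumptions the 0.27us gap between the averages is smaller than either stddev, and the t statistic comes out near 1, which is consistent with the difference being noise rather than a real win for either path.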
> The maintenance burden of having alternative submission paths is
> significant especially as we consider the pmem driver using more
> services of the core block layer. Ideally, I'd want to complete the
> rw_page removal work before we look at the blk-mq + dma offload
> reworks.
> 
> The change to introduce BDI_CAP_SYNC is interesting because we might
> have use for switching between dma offload and cpu copy based on
> whether the I/O is synchronous or otherwise hinted to be a low latency
> request. Right now the dma offload patches are using "bio_segments() >
> 1" as the gate for selecting offload vs cpu copy which seems
> inadequate.


Okay, so based on the feedback above and from Jens[1], it sounds like we want
to go forward with removing the rw_page() interface, and instead optimize the
regular I/O path via on-stack bios and dma offload, correct?

If so, I'll prepare patches that fully remove the rw_page() code, and let
Minchan and Dave work on their optimizations.

[1]: https://lkml.org/lkml/2017/8/3/803
