On Mon, Sep 30, 2013 at 03:14:52PM +0200, Jack Wang wrote:
> On 09/30/2013 12:11 PM, Luis Henriques wrote:
> > 3.5.7.22 -stable review patch.  If anyone has any objections, please let me 
> > know.
> > 
> > ------------------
> > 
> > From: Khalid Aziz <[email protected]>
> > 
> > commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream.
> > 
> > I am working with a tool that simulates oracle database I/O workload.
> > This tool (orion to be specific -
> > <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>)
> > allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag.  It then
> > does aio into these pages from flash disks using various common block
> > sizes used by database.  I am looking at performance with two of the most
> > common block sizes - 1M and 64K.  aio performance with these two block
> > sizes plunged after Transparent HugePages was introduced in the kernel.
> > Here are performance numbers:
> > 
> >             pre-THP         2.6.39          3.11-rc5
> > 1M read     8384 MB/s       5629 MB/s       6501 MB/s
> > 64K read    7867 MB/s       4576 MB/s       4251 MB/s
> > 
> > I have narrowed the performance impact down to the overheads introduced by
> > THP in __get_page_tail() and put_compound_page() routines.  perf top shows
> > more than 40% of cycles being spent in these two routines.  Every time direct I/O
> > to hugetlbfs pages starts, kernel calls get_page() to grab a reference to
> > the pages and calls put_page() when I/O completes to put the reference
> > away.  THP introduced significant amount of locking overhead to get_page()
> > and put_page() when dealing with compound pages because hugepages can be
> > split underneath get_page() and put_page().  It added this overhead
> > irrespective of whether it is dealing with hugetlbfs pages or transparent
> > hugepages.  This resulted in 20%-45% drop in aio performance when using
> > hugetlbfs pages.
> > 
> > Since hugetlbfs pages can not be split, there is no reason to go through
> > all the locking overhead for these pages from what I can see.  I added
> > code to __get_page_tail() and put_compound_page() to bypass all the
> > locking code when working with hugetlbfs pages.  This improved performance
> > significantly.  Performance numbers with this patch:
> > 
> >             pre-THP         3.11-rc5        3.11-rc5 + Patch
> > 1M read     8384 MB/s       6501 MB/s       8371 MB/s
> > 64K read    7867 MB/s       4251 MB/s       6510 MB/s
> > 
> > Performance with 64K read is still lower than what it was before THP, but
> > still a 53% improvement.  It does mean there is more work to be done but I
> > will take a 53% improvement for now.
> > 
> > Please take a look at the following patch and let me know if it looks
> > reasonable.
> > 
> > [[email protected]: tweak comments]
> > Signed-off-by: Khalid Aziz <[email protected]>
> > Cc: Pravin B Shelar <[email protected]>
> > Cc: Christoph Lameter <[email protected]>
> > Cc: Andrea Arcangeli <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: Mel Gorman <[email protected]>
> > Cc: Rik van Riel <[email protected]>
> > Cc: Minchan Kim <[email protected]>
> > Cc: Andi Kleen <[email protected]>
> > Signed-off-by: Andrew Morton <[email protected]>
> > Signed-off-by: Linus Torvalds <[email protected]>
> > [ luis: backported to 3.5: adjusted context ]
> > Signed-off-by: Luis Henriques <[email protected]>
> Hi Greg,
> 
> I suppose this patch is also needed for 3.4, right?

As it didn't originally apply there, I didn't apply it.

If people think it should be applicable for 3.4, I'll take it.

thanks,

greg k-h
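
For anyone weighing the 3.4 backport, the percentages quoted in the commit
message can be rechecked from its two tables.  The sketch below is not part
of the patch; it only redoes the arithmetic on the MB/s figures as quoted,
and the helper name pct_change is made up for illustration.

```python
# Throughput figures (MB/s) copied from the tables in the quoted commit message.
pre_thp = {"1M": 8384, "64K": 7867}
v2_6_39 = {"1M": 5629, "64K": 4576}
v3_11_rc5 = {"1M": 6501, "64K": 4251}
patched = {"1M": 8371, "64K": 6510}


def pct_change(old, new):
    """Percentage change going from old to new (negative means a drop)."""
    return (new - old) / old * 100.0


for size in ("1M", "64K"):
    print("%s read:" % size)
    # Regression after THP was introduced, relative to the pre-THP kernel.
    print("  pre-THP  -> 2.6.39:   %+.1f%%" % pct_change(pre_thp[size], v2_6_39[size]))
    print("  pre-THP  -> 3.11-rc5: %+.1f%%" % pct_change(pre_thp[size], v3_11_rc5[size]))
    # Gain from the patch, relative to the unpatched 3.11-rc5 kernel.
    print("  3.11-rc5 -> patched:  %+.1f%%" % pct_change(v3_11_rc5[size], patched[size]))
    # Remaining gap to the pre-THP numbers once the patch is applied.
    print("  pre-THP  -> patched:  %+.1f%%" % pct_change(pre_thp[size], patched[size]))
```

The 64K gain over unpatched 3.11-rc5 comes out at about 53%, matching the
commit message, and the 3.11-rc5 drops bracket the quoted 20%-45% range.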