[PATCH v3 2/2] mm/page_ref: add tracepoint to track down page reference manipulation

2016-02-22 Thread js1304
From: Joonsoo Kim 

CMA allocation should be guaranteed to succeed by definition, but,
unfortunately, it sometimes fails. The problem is hard to track down
because it is related to page reference manipulation, and we have no
facility to analyze it.

This patch adds tracepoints to track down page reference manipulation.
With them, we can find the exact reason for a failure and fix the
problem. The following is an example of the tracepoint output. (Note:
this example is from a stale version that prints the flags as a raw
number. The current version prints them as a human-readable string.)

<...>-9018  [004]  92.678375: page_ref_set: pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1
<...>-9018  [004]  92.678378: kernel_stack:
 => get_page_from_freelist (81176659)
 => __alloc_pages_nodemask (81176d22)
 => alloc_pages_vma (811bf675)
 => handle_mm_fault (8119e693)
 => __do_page_fault (810631ea)
 => trace_do_page_fault (81063543)
 => do_async_page_fault (8105c40a)
 => async_page_fault (817581d8)
[snip]
<...>-9018  [004]  92.678379: page_ref_mod: pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0x880015a78dc1 mt=4 val=1
[snip]
...
...
<...>-9131  [001]  93.174468: test_pages_isolated: start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail
[snip]
<...>-9018  [004]  93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0x880015a78dc1 mt=4 val=-1 ret=1
 => release_pages (8117c9e4)
 => free_pages_and_swap_cache (811b0697)
 => tlb_flush_mmu_free (81199616)
 => tlb_finish_mmu (8119a62c)
 => exit_mmap (811a53f7)
 => mmput (81073f47)
 => do_exit (810794e9)
 => do_group_exit (81079def)
 => SyS_exit_group (81079e74)
 => entry_SYSCALL_64_fastpath (817560b6)

This output shows that the problem comes from the exit path. In the exit
path, to improve performance, pages are not freed immediately; they are
gathered and processed in batches. While a page sits in such a batch, it
cannot be migrated, so the CMA allocation fails. This problem would be
hard to find without this page reference tracepoint facility.
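The failure mode above can be sketched in userspace C (illustrative
only; the names `release_batch`, `batch_add`, and `batch_drain` are made
up for this sketch and are not the kernel code): a page queued for
deferred release keeps an elevated reference count, so migration of that
page, and hence a CMA allocation covering it, fails until the batch is
drained.

```c
/* Illustrative userspace sketch, not the kernel code. */
#define BATCH_MAX 8

struct page {
	int refcount;
};

struct release_batch {
	int nr;
	struct page *pages[BATCH_MAX];
};

/* Drop the references held while the pages sat in the batch. */
static void batch_drain(struct release_batch *b)
{
	for (int i = 0; i < b->nr; i++)
		b->pages[i]->refcount--;
	b->nr = 0;
}

/* Queue a page for deferred release; its reference is still held,
 * which is what blocks migration in the meantime. */
static void batch_add(struct release_batch *b, struct page *page)
{
	b->pages[b->nr++] = page;
	if (b->nr == BATCH_MAX)
		batch_drain(b);
}
```

The tracepoints make this visible: the `page_ref_mod_and_test` event
with `val=-1 ret=1` fires only when the batch is finally drained, well
after the isolation check already failed.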

Enabling this feature bloats the kernel text by about 30 KB in my configuration.

    text    data     bss      dec    hex filename
12127327 2243616 1507328 15878271 f2487f vmlinux_disabled
12157208 2258880 1507328 15923416 f2f8d8 vmlinux_enabled

Note that, due to a header file dependency problem between mm.h and
tracepoint.h, this feature has to open-code the static key functions
for tracepoints. This was proposed by Steven Rostedt in the following
link.

https://lkml.org/lkml/2015/12/9/699
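The open-coded guard pattern can be sketched in userspace C
(illustrative only; the real kernel uses `struct static_key` and
`static_key_false()`, for which a plain bool merely stands in here, and
the counter `trace_calls` is invented for the sketch): the fast path
pays only a default-false branch, and the out-of-line trace function is
called only once the tracepoint key has been enabled.

```c
/* Illustrative userspace sketch of the static-key guard pattern. */
#include <stdbool.h>

struct tracepoint {
	bool key;		/* stand-in for the real static key */
};

static struct tracepoint __tracepoint_page_ref_mod;

/* The kernel version wraps static_key_false() instead. */
#define page_ref_tracepoint_active(t) ((t).key)

static int trace_calls;		/* counts slow-path invocations */

/* Out-of-line slow path, analogous to __page_ref_mod() in this patch. */
static void __page_ref_mod(unsigned long pfn, int v)
{
	(void)pfn;
	(void)v;
	trace_calls++;
}

/* Fast path: modify the count; trace only if the key is enabled. */
static void page_ref_inc(unsigned long pfn, int *count)
{
	(*count)++;
	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
		__page_ref_mod(pfn, 1);
}
```

With the key disabled, the reference count manipulation costs only one
predicted-not-taken branch beyond the bare increment.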

v3:
o Add a commit description and a code comment explaining why this patch
open-codes the static key functions for tracepoints.
o Note that the example output is from a stale version.
o Add "depends on TRACEPOINTS".

v2:
o Use a static key per tracepoint to avoid function call overhead
when tracepoints are disabled.
o Print human-readable page flags thanks to the newly introduced %pGp option.
o Add more description to Kconfig.debug.

Acked-by: Michal Nazarewicz 
Signed-off-by: Joonsoo Kim 
---
 include/linux/page_ref.h        |  98 +++--
 include/trace/events/page_ref.h | 133
 mm/Kconfig.debug                |  14 +
 mm/Makefile                     |   1 +
 mm/debug_page_ref.c             |  53
 5 files changed, 294 insertions(+), 5 deletions(-)
 create mode 100644 include/trace/events/page_ref.h
 create mode 100644 mm/debug_page_ref.c

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 534249c..e2631ac 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -1,6 +1,62 @@
 #include <linux/atomic.h>
 #include <linux/mm_types.h>
 #include <linux/page-flags.h>
+#include <linux/tracepoint-defs.h>
+
+extern struct tracepoint __tracepoint_page_ref_set;
+extern struct tracepoint __tracepoint_page_ref_mod;
+extern struct tracepoint __tracepoint_page_ref_mod_and_test;
+extern struct tracepoint __tracepoint_page_ref_mod_and_return;
+extern struct tracepoint __tracepoint_page_ref_mod_unless;
+extern struct tracepoint __tracepoint_page_ref_freeze;
+extern struct tracepoint __tracepoint_page_ref_unfreeze;
+
+#ifdef CONFIG_DEBUG_PAGE_REF
+
+/*
+ * Ideally we would want to use the trace_<tracepoint>_enabled() helper
+ * functions. But due to include header file issues, that is not
+ * feasible. Instead we have to open code the static key functions.
+ *
+ * See trace_##name##_enabled(void) in include/linux/tracepoint.h
+ */
+#define page_ref_tracepoint_active(t) static_key_false(&(t).key)
+
+extern void __page_ref_set(struct page *page, int v);
+extern void __page_ref_mod(struct page *page, int v);
+extern void __page_ref_mod_and_test(struct page *page, int v, int ret);
+extern void __page_ref_mod_and_return(struct page *page, int v, int ret);
+extern void __page_ref_mod_unless(struct page *page, int v, int u);
+extern void __page_ref_freeze(struct page
