Re: [PATCH -mm -v7 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-31 Thread Huang, Ying
Johannes Weiner  writes:

> On Thu, Mar 30, 2017 at 12:28:17PM +0800, Huang, Ying wrote:
>> Johannes Weiner  writes:
>> > On Tue, Mar 28, 2017 at 01:32:04PM +0800, Huang, Ying wrote:
>> >> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
>> >>  
>> >>  #endif /* CONFIG_SWAP */
>> >>  
>> >> +#ifdef CONFIG_THP_SWAP_CLUSTER
>> >> +static inline swp_entry_t get_huge_swap_page(void)
>> >> +{
>> >> + swp_entry_t entry;
>> >> +
>> >> + if (get_swap_pages(1, &entry, true))
>> >> + return entry;
>> >> + else
>> >> + return (swp_entry_t) {0};
>> >> +}
>> >> +#else
>> >> +static inline swp_entry_t get_huge_swap_page(void)
>> >> +{
>> >> + return (swp_entry_t) {0};
>> >> +}
>> >> +#endif
>> >
>> > You're introducing a function without a user, making it very hard to
>> > judge whether the API is well-designed for the callers or not.
>> >
>> > I pointed this out as a systemic problem with this patch series in v3,
>> > along with other stuff, but with the way this series is structured I'm
>> > having a hard time seeing whether you implemented my other feedback or
>> > whether your counter arguments to them are justified.
>> >
>> > I cannot review and ack these patches this way.
>> 
>> Sorry for the inconvenience. I will send a new version that combines
>> the function definition and usage into one patch, at least for you to
>> review.
>
> We tried this before. I reviewed the self-contained patch and you
> incorporated the feedback into the split-out structure that made it
> impossible for me to verify the updates.
>
> I'm not sure why you insist on preserving this series format. It's not
> good for review, and it's not good for merging and git history.

I had thought some reviewers would prefer the original series format.
But I will use your suggested format in the future, unless more
reviewers prefer the original format.

Best Regards,
Huang, Ying

>> But I think we can first continue our discussion of the comments you
>> have raised so far. What do you think?
>
> Yeah, let's finish the discussions before -v8.


Re: [PATCH -mm -v7 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-31 Thread Johannes Weiner
On Thu, Mar 30, 2017 at 12:28:17PM +0800, Huang, Ying wrote:
> Johannes Weiner  writes:
> > On Tue, Mar 28, 2017 at 01:32:04PM +0800, Huang, Ying wrote:
> >> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
> >>  
> >>  #endif /* CONFIG_SWAP */
> >>  
> >> +#ifdef CONFIG_THP_SWAP_CLUSTER
> >> +static inline swp_entry_t get_huge_swap_page(void)
> >> +{
> >> +  swp_entry_t entry;
> >> +
> >> +  if (get_swap_pages(1, &entry, true))
> >> +  return entry;
> >> +  else
> >> +  return (swp_entry_t) {0};
> >> +}
> >> +#else
> >> +static inline swp_entry_t get_huge_swap_page(void)
> >> +{
> >> +  return (swp_entry_t) {0};
> >> +}
> >> +#endif
> >
> > You're introducing a function without a user, making it very hard to
> > judge whether the API is well-designed for the callers or not.
> >
> > I pointed this out as a systemic problem with this patch series in v3,
> > along with other stuff, but with the way this series is structured I'm
> > having a hard time seeing whether you implemented my other feedback or
> > whether your counter arguments to them are justified.
> >
> > I cannot review and ack these patches this way.
> 
> Sorry for the inconvenience. I will send a new version that combines
> the function definition and usage into one patch, at least for you to
> review.

We tried this before. I reviewed the self-contained patch and you
incorporated the feedback into the split-out structure that made it
impossible for me to verify the updates.

I'm not sure why you insist on preserving this series format. It's not
good for review, and it's not good for merging and git history.

> But I think we can first continue our discussion of the comments you
> have raised so far. What do you think?

Yeah, let's finish the discussions before -v8.


Re: [PATCH -mm -v7 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-29 Thread Huang, Ying
Johannes Weiner  writes:

> On Tue, Mar 28, 2017 at 01:32:04PM +0800, Huang, Ying wrote:
>> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
>>  
>>  #endif /* CONFIG_SWAP */
>>  
>> +#ifdef CONFIG_THP_SWAP_CLUSTER
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +swp_entry_t entry;
>> +
>> +if (get_swap_pages(1, &entry, true))
>> +return entry;
>> +else
>> +return (swp_entry_t) {0};
>> +}
>> +#else
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +return (swp_entry_t) {0};
>> +}
>> +#endif
>
> You're introducing a function without a user, making it very hard to
> judge whether the API is well-designed for the callers or not.
>
> I pointed this out as a systemic problem with this patch series in v3,
> along with other stuff, but with the way this series is structured I'm
> having a hard time seeing whether you implemented my other feedback or
> whether your counter arguments to them are justified.
>
> I cannot review and ack these patches this way.

Sorry for the inconvenience. I will send a new version that combines the
function definition and usage into one patch, at least for you to
review.  But I think we can first continue our discussion of the comments
you have raised so far. What do you think?

Best Regards,
Huang, Ying


Re: [PATCH -mm -v7 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-29 Thread Johannes Weiner
On Tue, Mar 28, 2017 at 01:32:04PM +0800, Huang, Ying wrote:
> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
>  
>  #endif /* CONFIG_SWAP */
>  
> +#ifdef CONFIG_THP_SWAP_CLUSTER
> +static inline swp_entry_t get_huge_swap_page(void)
> +{
> + swp_entry_t entry;
> +
> + if (get_swap_pages(1, &entry, true))
> + return entry;
> + else
> + return (swp_entry_t) {0};
> +}
> +#else
> +static inline swp_entry_t get_huge_swap_page(void)
> +{
> + return (swp_entry_t) {0};
> +}
> +#endif

You're introducing a function without a user, making it very hard to
judge whether the API is well-designed for the callers or not.

I pointed this out as a systemic problem with this patch series in v3,
along with other stuff, but with the way this series is structured I'm
having a hard time seeing whether you implemented my other feedback or
whether your counter arguments to them are justified.

I cannot review and ack these patches this way.


[PATCH -mm -v7 4/9] mm, THP, swap: Add get_huge_swap_page()

2017-03-27 Thread Huang, Ying
From: Huang Ying 

A variation of get_swap_page(), get_huge_swap_page(), is added to
allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap
cluster allocation function.  A fairly simple algorithm is used: only
the first swap device in the priority list is tried when allocating
the swap cluster.  The function fails if that attempt is not
successful, and the caller falls back to allocating a single swap slot
instead.  This works well enough for normal cases.

This will be used for THP (Transparent Huge Page) swap support,
where get_huge_swap_page() will be used to allocate one swap cluster for
each THP swapped out.
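
As an illustration of the fallback described above, here is a minimal
userspace sketch of how a caller might use get_huge_swap_page().  It is
not part of this patch.  The swp_entry_t typedef is a stand-in for the
kernel type, the two allocators are mocked with simple counters, and
alloc_swap_for_page(), must_split and the model_* variables are
hypothetical names introduced only for this example.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's swp_entry_t; val == 0 means "no entry". */
typedef struct { unsigned long val; } swp_entry_t;

/* Hypothetical model state, not kernel data structures. */
static int model_free_clusters = 1;	/* free clusters on the first device */
static int model_free_slots = 4;	/* free single slots */
static unsigned long model_next_val = 1;

/* Mock: returns a zero entry when no whole cluster is available. */
static swp_entry_t get_huge_swap_page(void)
{
	if (model_free_clusters <= 0)
		return (swp_entry_t) {0};
	model_free_clusters--;
	return (swp_entry_t) { model_next_val++ };
}

/* Mock: returns a zero entry when no single slot is available. */
static swp_entry_t get_swap_page(void)
{
	if (model_free_slots <= 0)
		return (swp_entry_t) {0};
	model_free_slots--;
	return (swp_entry_t) { model_next_val++ };
}

/*
 * Hypothetical caller: try a whole cluster for a THP first.  If that
 * fails, report that the THP has to be split and fall back to a single
 * swap slot.
 */
static swp_entry_t alloc_swap_for_page(bool is_thp, bool *must_split)
{
	swp_entry_t entry;

	*must_split = false;
	if (is_thp) {
		entry = get_huge_swap_page();
		if (entry.val)
			return entry;
		*must_split = true;
	}
	return get_swap_page();
}
```

In this model the first THP request consumes the only free cluster, and
the second one reports must_split and takes a single slot instead.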

Because of the algorithm adopted, if the number of free swap clusters
differs significantly among multiple swap devices, some THPs may be
split earlier than necessary.  For example, this could be caused by a
big size difference among the swap devices.
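
The limitation above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: struct model_swap_dev
and model_alloc_huge_cluster() are hypothetical names, and the array
order stands in for the priority list.  It models only the "try the
first device, then give up" policy, not the real swap_info_struct
handling.

```c
#include <stddef.h>

/* Hypothetical model of a swap device; devs[0] has the highest priority. */
struct model_swap_dev {
	int prio;
	long free_clusters;
};

/*
 * Model of the policy described above: only the first device in the
 * priority list is tried.  Returns the index of the device that
 * supplied the cluster, or -1 on failure.
 */
static int model_alloc_huge_cluster(struct model_swap_dev *devs, size_t ndevs)
{
	if (ndevs == 0)
		return -1;
	if (devs[0].free_clusters <= 0)
		return -1;	/* lower-priority devices are not tried */
	devs[0].free_clusters--;
	return 0;
}
```

With devs = { {10, 0}, {5, 100} }, the model fails even though the
second device still has 100 free clusters, so the THP would be split.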

Cc: Andrea Arcangeli 
Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Shaohua Li 
Cc: Minchan Kim 
Cc: Rik van Riel 
Signed-off-by: "Huang, Ying" 
---
 include/linux/swap.h | 19 ++-
 mm/swap_slots.c  |  5 +++--
 mm/swapfile.c| 18 +++---
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 278e1349a424..e3a7609a8989 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -388,7 +388,7 @@ static inline long get_nr_swap_pages(void)
 extern void si_swapinfo(struct sysinfo *);
 extern swp_entry_t get_swap_page(void);
 extern swp_entry_t get_swap_page_of_type(int);
-extern int get_swap_pages(int n, swp_entry_t swp_entries[]);
+extern int get_swap_pages(int n, swp_entry_t swp_entries[], bool huge);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t);
 extern int swap_duplicate(swp_entry_t);
@@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
 
 #endif /* CONFIG_SWAP */
 
+#ifdef CONFIG_THP_SWAP_CLUSTER
+static inline swp_entry_t get_huge_swap_page(void)
+{
+	swp_entry_t entry;
+
+	if (get_swap_pages(1, &entry, true))
+		return entry;
+	else
+		return (swp_entry_t) {0};
+}
+#else
+static inline swp_entry_t get_huge_swap_page(void)
+{
+   return (swp_entry_t) {0};
+}
+#endif
+
 #ifdef CONFIG_MEMCG
 static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 9b5bc86f96ad..075bb39e03c5 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -258,7 +258,8 @@ static int refill_swap_slots_cache(struct swap_slots_cache *cache)
 
cache->cur = 0;
if (swap_slot_cache_active)
-   cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots);
+   cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots,
+  false);
 
return cache->nr;
 }
@@ -334,7 +335,7 @@ swp_entry_t get_swap_page(void)
return entry;
}
 
-   get_swap_pages(1, &entry);
+   get_swap_pages(1, &entry, false);
 
return entry;
 }
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 54480acbbeef..382e84541e16 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -904,13 +904,14 @@ static unsigned long scan_swap_map(struct swap_info_struct *si,
 
 }
 
-int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
+int get_swap_pages(int n_goal, swp_entry_t swp_entries[], bool huge)
 {
struct swap_info_struct *si, *next;
long avail_pgs;
int n_ret = 0;
+   int nr_pages = huge_cluster_nr_entries(huge);
 
-   avail_pgs = atomic_long_read(&nr_swap_pages);
+   avail_pgs = atomic_long_read(&nr_swap_pages) / nr_pages;
if (avail_pgs <= 0)
goto noswap;
 
@@ -920,7 +921,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
if (n_goal > avail_pgs)
n_goal = avail_pgs;
 
-   atomic_long_sub(n_goal, &nr_swap_pages);
+   atomic_long_sub(n_goal * nr_pages, &nr_swap_pages);
 
spin_lock(&swap_avail_lock);
 
@@ -946,10 +947,13 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
spin_unlock(&si->lock);
goto nextsi;
}
-   n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
-   n_goal, swp_entries);
+   if (likely(!huge))
+   n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
+   n_goal, swp_entries);
+   else
+   n_ret = swap_alloc_huge_cluster(si, swp_entries);
spin_unlock(&si->lock);
-   if (n_ret)
+   if (n_ret || unlikely(huge))
goto check_out;
pr_debug("scan_swap_map of si %d failed to find offset\n",
si->type);
@@ -975,7 +979,7 @@ int get_swap_pages(int n_goal,