Re: [virtio-dev] Re: [PATCH v13 4/5] mm: support reporting free page blocks

2017-08-10 Thread Michal Hocko
On Thu 10-08-17 15:38:34, Wei Wang wrote:
> On 08/10/2017 03:05 PM, Michal Hocko wrote:
> >On Tue 08-08-17 14:34:25, Wei Wang wrote:
> >>On 08/08/2017 02:12 PM, Wei Wang wrote:
> >>>On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >>>>On Thu 03-08-17 14:38:18, Wei Wang wrote:
> >>>>This is just too ugly and wrong actually. Never provide struct page
> >>>>pointers outside of the zone->lock. What I've had in mind was to simply
> >>>>walk free lists of the suitable order and call the callback for each
> >>>>one.
> >>>>Something as simple as
> >>>>
> >>>>for (i = 0; i < MAX_NR_ZONES; i++) {
> >>>>	struct zone *zone = &pgdat->node_zones[i];
> >>>>
> >>>>	if (!populated_zone(zone))
> >>>>		continue;
> >>>Can we directly use for_each_populated_zone(zone) here?
> >yes, my example couldn't because I was still assuming per-node API
> >
> >>>>spin_lock_irqsave(&zone->lock, flags);
> >>>>for (order = min_order; order < MAX_ORDER; ++order) {
> >>>
> >>>This appears to be covered by for_each_migratetype_order(order, mt) below.
> >yes but
> >#define for_each_migratetype_order(order, type) \
> > for (order = 0; order < MAX_ORDER; order++) \
> > for (type = 0; type < MIGRATE_TYPES; type++)
> >
> >so you would have to skip orders < min_order
> 
> Yes, that's why we have a new macro
> 
> #define for_each_migratetype_order_decend(min_order, order, type) \
> 	for (order = MAX_ORDER - 1; order < MAX_ORDER && order >= min_order; \
> 	     order--) \
> 		for (type = 0; type < MIGRATE_TYPES; type++)
> 
> If you don't like the macro, we can also open-code the loop directly.
> 
> I think it would be better to report the larger free page blocks first, since
> the callback may (though only as a theoretical possibility, it is good to
> take that into consideration if possible) skip reporting a given free page
> block to the hypervisor when the ring gets full. Losing a small block is
> better than losing a larger one, in terms of the optimization work.

I see. But I think this is so specialized that open-coding the macro
would be easier to read.
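
For illustration only, an open-coded descending walk might look roughly like the sketch below. It is a userspace model, not kernel code: MAX_ORDER and MIGRATE_TYPES are stand-in values, and walk_orders_descending(), pair_fn, struct order_trace and record_pair() are hypothetical names that do not appear in the patch. The counter is a signed int, which is presumably what the extra "order < MAX_ORDER" test in the macro works around when the counter is unsigned.

```c
#include <assert.h>

/* Stand-in values for illustration; the kernel's depend on configuration. */
#define MAX_ORDER     11
#define MIGRATE_TYPES 6

/* Hypothetical callback type: receives one (order, type) pair. */
typedef void (*pair_fn)(void *opaque, int order, int type);

/*
 * Open-coded equivalent of the descending macro: largest order first,
 * stopping at min_order. A signed counter lets "order >= min_order"
 * terminate even when min_order is 0.
 */
static void walk_orders_descending(int min_order, pair_fn fn, void *opaque)
{
	int order, type;

	assert(min_order >= 0 && min_order < MAX_ORDER);

	for (order = MAX_ORDER - 1; order >= min_order; order--)
		for (type = 0; type < MIGRATE_TYPES; type++)
			fn(opaque, order, type);
}

/* Hypothetical recorder used to observe the iteration order. */
struct order_trace {
	int first;	/* order seen on the first call */
	int last;	/* order seen on the most recent call */
	int calls;	/* number of (order, type) pairs seen */
};

static void record_pair(void *opaque, int order, int type)
{
	struct order_trace *t = opaque;

	(void)type;
	if (t->calls == 0)
		t->first = order;
	t->last = order;
	t->calls++;
}
```

Calling walk_orders_descending(3, record_pair, &trace) on a zeroed struct order_trace visits orders MAX_ORDER - 1 down to 3, each with every migrate type.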

-- 
Michal Hocko
SUSE Labs
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [virtio-dev] Re: [PATCH v13 4/5] mm: support reporting free page blocks

2017-08-10 Thread Wei Wang

On 08/10/2017 03:05 PM, Michal Hocko wrote:
> On Tue 08-08-17 14:34:25, Wei Wang wrote:
> >On 08/08/2017 02:12 PM, Wei Wang wrote:
> >>On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >>>On Thu 03-08-17 14:38:18, Wei Wang wrote:
> >>>This is just too ugly and wrong actually. Never provide struct page
> >>>pointers outside of the zone->lock. What I've had in mind was to simply
> >>>walk free lists of the suitable order and call the callback for each
> >>>one.
> >>>Something as simple as
> >>>
> >>>for (i = 0; i < MAX_NR_ZONES; i++) {
> >>>	struct zone *zone = &pgdat->node_zones[i];
> >>>
> >>>	if (!populated_zone(zone))
> >>>		continue;
> >>
> >>Can we directly use for_each_populated_zone(zone) here?
> 
> yes, my example couldn't because I was still assuming per-node API
> 
> >>>spin_lock_irqsave(&zone->lock, flags);
> >>>for (order = min_order; order < MAX_ORDER; ++order) {
> >>
> >>This appears to be covered by for_each_migratetype_order(order, mt) below.
> 
> yes but
> #define for_each_migratetype_order(order, type) \
> 	for (order = 0; order < MAX_ORDER; order++) \
> 		for (type = 0; type < MIGRATE_TYPES; type++)
> 
> so you would have to skip orders < min_order


Yes, that's why we have a new macro

#define for_each_migratetype_order_decend(min_order, order, type) \
	for (order = MAX_ORDER - 1; order < MAX_ORDER && order >= min_order; \
	     order--) \
		for (type = 0; type < MIGRATE_TYPES; type++)

If you don't like the macro, we can also open-code the loop directly.

I think it would be better to report the larger free page blocks first, since
the callback may (though only as a theoretical possibility, it is good to
take that into consideration if possible) skip reporting a given free page
block to the hypervisor when the ring gets full. Losing a small block is
better than losing a larger one, in terms of the optimization work.
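
To make the ordering argument concrete, here is a small userspace model, under loudly stated assumptions: the ring has a fixed number of slots, one slot holds one free block of any order, and nothing drains the ring while the walk runs. struct ring, ring_report() and report_all() are hypothetical names for this sketch and have nothing to do with the actual virtio ring code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ORDER 11	/* stand-in value for illustration */

/* Hypothetical fixed-capacity ring: one slot per reported free block. */
struct ring {
	size_t capacity;		/* total slots */
	size_t used;			/* slots already filled */
	unsigned long pages_reported;	/* pages covered by accepted blocks */
};

/* Returns 1 if the block was queued, 0 if it was dropped (ring full). */
static int ring_report(struct ring *r, int order)
{
	if (r->used == r->capacity)
		return 0;
	r->used++;
	r->pages_reported += 1UL << order;
	return 1;
}

/*
 * blocks[order] is the number of free blocks of that order. Report them
 * all, either largest order first (descending != 0) or smallest first,
 * and return how many pages made it into the ring.
 */
static unsigned long report_all(const unsigned int blocks[MAX_ORDER],
				size_t capacity, int descending)
{
	struct ring r = { capacity, 0, 0 };
	unsigned int n;
	int i, order;

	for (i = 0; i < MAX_ORDER; i++) {
		order = descending ? MAX_ORDER - 1 - i : i;
		for (n = 0; n < blocks[order]; n++)
			(void)ring_report(&r, order);
	}
	return r.pages_reported;
}
```

With four order-0 blocks, two order-10 blocks and a three-slot ring, the descending walk gets both large blocks in before the ring fills, while the ascending walk spends every slot on single pages. When the ring is large enough for everything, the order makes no difference.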


Best,
Wei





Re: [virtio-dev] Re: [PATCH v13 4/5] mm: support reporting free page blocks

2017-08-10 Thread Michal Hocko
On Tue 08-08-17 14:34:25, Wei Wang wrote:
> On 08/08/2017 02:12 PM, Wei Wang wrote:
> >On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >>On Thu 03-08-17 14:38:18, Wei Wang wrote:
> >>This is just too ugly and wrong actually. Never provide struct page
> >>pointers outside of the zone->lock. What I've had in mind was to simply
> >>walk free lists of the suitable order and call the callback for each
> >>one.
> >>Something as simple as
> >>
> >>for (i = 0; i < MAX_NR_ZONES; i++) {
> >>struct zone *zone = &pgdat->node_zones[i];
> >>
> >>if (!populated_zone(zone))
> >>continue;
> >
> >Can we directly use for_each_populated_zone(zone) here?

yes, my example couldn't because I was still assuming per-node API

> >>spin_lock_irqsave(&zone->lock, flags);
> >>for (order = min_order; order < MAX_ORDER; ++order) {
> >
> >
> >This appears to be covered by for_each_migratetype_order(order, mt) below.

yes but
#define for_each_migratetype_order(order, type) \
for (order = 0; order < MAX_ORDER; order++) \
for (type = 0; type < MIGRATE_TYPES; type++)

so you would have to skip orders < min_order
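
A minimal userspace sketch of what that skip would look like with the existing macro shape. MAX_ORDER and MIGRATE_TYPES are stand-in values here, and count_visited() is a hypothetical helper written only to show the effect:

```c
#include <assert.h>

/* Stand-in values for illustration; the kernel's depend on configuration. */
#define MAX_ORDER     11
#define MIGRATE_TYPES 6

/* Same shape as the macro quoted above. */
#define for_each_migratetype_order(order, type) \
	for (order = 0; order < MAX_ORDER; order++) \
		for (type = 0; type < MIGRATE_TYPES; type++)

/*
 * Hypothetical helper: count the (order, type) pairs a walker would
 * actually process when it has to ignore orders below min_order.
 */
static unsigned int count_visited(unsigned int min_order)
{
	unsigned int order, type, visited = 0;

	assert(min_order < MAX_ORDER);

	for_each_migratetype_order(order, type) {
		/*
		 * 'continue' only advances the inner (type) loop, so every
		 * type of every too-small order is still iterated and
		 * rejected one by one.
		 */
		if (order < min_order)
			continue;
		visited++;
	}
	return visited;
}
```

The skip works, but the loop still spins through all the small orders, which is one reason a loop that starts at min_order (or stops there when descending) reads more directly.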
-- 
Michal Hocko
SUSE Labs


Re: [virtio-dev] Re: [PATCH v13 4/5] mm: support reporting free page blocks

2017-08-08 Thread Wei Wang

On 08/08/2017 02:12 PM, Wei Wang wrote:
> On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >On Thu 03-08-17 14:38:18, Wei Wang wrote:
> >This is just too ugly and wrong actually. Never provide struct page
> >pointers outside of the zone->lock. What I've had in mind was to simply
> >walk free lists of the suitable order and call the callback for each
> >one.
> >
> >Something as simple as
> >
> >for (i = 0; i < MAX_NR_ZONES; i++) {
> >	struct zone *zone = &pgdat->node_zones[i];
> >
> >	if (!populated_zone(zone))
> >		continue;
> 
> Can we directly use for_each_populated_zone(zone) here?
> 
> >	spin_lock_irqsave(&zone->lock, flags);
> >	for (order = min_order; order < MAX_ORDER; ++order) {
> 
> This appears to be covered by for_each_migratetype_order(order, mt)
> below.
> 
> >		struct free_area *free_area = &zone->free_area[order];
> >		enum migratetype mt;
> >		struct page *page;
> >
> >		if (!free_area->nr_pages)
> >			continue;
> >
> >		for_each_migratetype_order(order, mt) {
> >			list_for_each_entry(page,
> >					&free_area->free_list[mt], lru) {
> >
> >				pfn = page_to_pfn(page);
> >				visit(opaque2, pfn, 1 << order);
> >			}
> >		}
> >	}
> >
> >	spin_unlock_irqrestore(&zone->lock, flags);
> >}
> >
> >[...]
> 
> What do you think if we further simplify the above implementation like
> this:
> 
> for_each_populated_zone(zone) {
> 	for_each_migratetype_order_decend(1, order, mt) {

here it will be min_order (passed by the caller), instead of "1",
that is, for_each_migratetype_order_decend(min_order, order, mt)

> 		spin_lock_irqsave(&zone->lock, flags);
> 		list_for_each_entry(page,
> 				&zone->free_area[order].free_list[mt], lru) {
> 			pfn = page_to_pfn(page);
> 			visit(opaque1, pfn, 1 << order);
> 		}
> 		spin_unlock_irqrestore(&zone->lock, flags);
> 	}
> }
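
A userspace model of the simplified walk above, to show the shape of the interface rather than the real thing. Everything in it is an assumption made for the sketch: struct model_zone and struct model_free_list stand in for the kernel's zone and free lists, the 'locked' flag stands in for zone->lock, and walk_free_blocks(), visit_fn and sum_pages() are hypothetical names. The point it illustrates is that the callback only ever sees a pfn and a page count, never a struct page pointer, and that it runs while the lock is held.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in values for illustration; the kernel's depend on configuration. */
#define MAX_ORDER     11
#define MIGRATE_TYPES 6
#define MAX_BLOCKS    4		/* per free list, model only */

/* Model of one free list: just the first pfn of each free block. */
struct model_free_list {
	size_t nr;
	unsigned long pfn[MAX_BLOCKS];
};

struct model_zone {
	int locked;		/* stand-in for zone->lock */
	struct model_free_list free_list[MAX_ORDER][MIGRATE_TYPES];
};

/* The callback gets a pfn and a block size, not a page pointer. */
typedef void (*visit_fn)(void *opaque, unsigned long pfn,
			 unsigned long nr_pages);

static void walk_free_blocks(struct model_zone *zone, int min_order,
			     visit_fn visit, void *opaque)
{
	int order, mt;
	size_t i;

	assert(min_order >= 0 && min_order < MAX_ORDER);

	/* Largest blocks first, as proposed in the thread. */
	for (order = MAX_ORDER - 1; order >= min_order; order--) {
		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
			struct model_free_list *fl = &zone->free_list[order][mt];

			zone->locked = 1;	/* spin_lock_irqsave() in the kernel */
			for (i = 0; i < fl->nr; i++)
				visit(opaque, fl->pfn[i], 1UL << order);
			zone->locked = 0;	/* spin_unlock_irqrestore() */
		}
	}
}

/* Example callback: add up the pages covered by the reported blocks. */
static void sum_pages(void *opaque, unsigned long pfn, unsigned long nr_pages)
{
	(void)pfn;
	*(unsigned long *)opaque += nr_pages;
}
```

In the kernel the callback would run under a spinlock with interrupts disabled, so it would have to be short and must not sleep. The model does not enforce that.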





Best,
Wei