Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-23 Thread Hugh Dickins
On Wed, 23 Nov 2016, Daniel Vetter wrote:
> On Tue, Nov 22, 2016 at 09:26:11PM -0800, Hugh Dickins wrote:
> > On Tue, 22 Nov 2016, Matthew Auld wrote:
> > > On 9 November 2016 at 18:36, Hugh Dickins  wrote:
> > > > On Wed, 9 Nov 2016, Daniel Vetter wrote:
> > > >>
> > > >> Hi all -mm folks!
> > > >>
> > > >> Any feedback on these two? It's kinda an intermediate step towards a
> > > >> full-blown gemfs, and I think useful for that. Or do we need to go
> > > >> directly to our own backing storage thing? Aside from ack/nack from 
> > > >> -mm I
> > > >> think this is ready for merging.
> > > >
> > > > I'm currently considering them at last: will report back later.
> > > >
> > > > Full-blown gemfs does not come in here, of course; but let me
> > > > fire a warning shot since you mention it: if it's going to use swap,
> > > > then we shall probably have to nak it in favour of continuing to use
> > > > infrastructure from mm/shmem.c.  I very much understand why you would
> > > > love to avoid that dependence, but I doubt it can be safely bypassed.
> > >
> > > Could you please elaborate on what specifically you don't like about
> > > gemfs implementing swap, just to make sure I'm following?
> > 
> > If we're talking about swap as implemented in mm/swapfile.c, and
> > managed for tmpfs mainly through shmem_getpage_gfp(): that's slippery
> > stuff, private to mm, and I would not want such trickiness duplicated
> > somewhere down in drivers/gpu/drm, where mm developers and drm developers
> > will keep on forgetting to keep it working correctly.
> > 
> > But you write of gemfs "implementing" swap (and I see Daniel wrote of
> > "our own backing storage"): perhaps you intend a disk or slow-mem file
> > of your own, dedicated to paging gemfs objects according to your own
> > rules, poked from memory reclaim via a shrinker.  I certainly don't
> > have the same quick objection to that: it may be a good way forward,
> > though I'm not at all sure (and would prefer a name distinct from
> > swap, so we wouldn't get confused - maybe gemswap).
> 
> "our backing storage" was from the pov of the gpu, which is just
> memory (and then normal swap). I think that's exactly the part you don't
> like ;-)

Yes ;) but never mind, reassuring answer below...

> 
> Anwyway, objections noted, we'll go and beef up the interfaces exposed by
> shmem in the style of this patch series here. What I'll expect in the
> future beyong the migrate callback so we can unpin pages is asking shmem
> to move the pages around to a different numa node, and also asking for
> hugepages (if available). Thanks a lot for your feedback meanwhile.

Migration callback, NUMA improvements, hugepages: all are reasonable
things to be asking of shmem.  I expect we'll have some to and fro on
how best to fit whatever interface you want with how those are already
handled, but none of those requests is reason to replace shmem by an
independently backed gemfs.

Hugh
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-23 Thread Daniel Vetter
On Tue, Nov 22, 2016 at 09:26:11PM -0800, Hugh Dickins wrote:
> On Tue, 22 Nov 2016, Matthew Auld wrote:
> > On 9 November 2016 at 18:36, Hugh Dickins  wrote:
> > > On Wed, 9 Nov 2016, Daniel Vetter wrote:
> > >>
> > >> Hi all -mm folks!
> > >>
> > >> Any feedback on these two? It's kinda an intermediate step towards a
> > >> full-blown gemfs, and I think useful for that. Or do we need to go
> > >> directly to our own backing storage thing? Aside from ack/nack from -mm I
> > >> think this is ready for merging.
> > >
> > > I'm currently considering them at last: will report back later.
> > >
> > > Full-blown gemfs does not come in here, of course; but let me
> > > fire a warning shot since you mention it: if it's going to use swap,
> > > then we shall probably have to nak it in favour of continuing to use
> > > infrastructure from mm/shmem.c.  I very much understand why you would
> > > love to avoid that dependence, but I doubt it can be safely bypassed.
> >
> > Could you please elaborate on what specifically you don't like about
> > gemfs implementing swap, just to make sure I'm following?
> 
> If we're talking about swap as implemented in mm/swapfile.c, and
> managed for tmpfs mainly through shmem_getpage_gfp(): that's slippery
> stuff, private to mm, and I would not want such trickiness duplicated
> somewhere down in drivers/gpu/drm, where mm developers and drm developers
> will keep on forgetting to keep it working correctly.
> 
> But you write of gemfs "implementing" swap (and I see Daniel wrote of
> "our own backing storage"): perhaps you intend a disk or slow-mem file
> of your own, dedicated to paging gemfs objects according to your own
> rules, poked from memory reclaim via a shrinker.  I certainly don't
> have the same quick objection to that: it may be a good way forward,
> though I'm not at all sure (and would prefer a name distinct from
> swap, so we wouldn't get confused - maybe gemswap).

"our backing storage" was from the pov of the gpu, which is just
memory (and then normal swap). I think that's exactly the part you don't
like ;-)

Anwyway, objections noted, we'll go and beef up the interfaces exposed by
shmem in the style of this patch series here. What I'll expect in the
future beyong the migrate callback so we can unpin pages is asking shmem
to move the pages around to a different numa node, and also asking for
hugepages (if available). Thanks a lot for your feedback meanwhile.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-22 Thread Hugh Dickins
On Tue, 22 Nov 2016, Matthew Auld wrote:
> On 9 November 2016 at 18:36, Hugh Dickins  wrote:
> > On Wed, 9 Nov 2016, Daniel Vetter wrote:
> >>
> >> Hi all -mm folks!
> >>
> >> Any feedback on these two? It's kinda an intermediate step towards a
> >> full-blown gemfs, and I think useful for that. Or do we need to go
> >> directly to our own backing storage thing? Aside from ack/nack from -mm I
> >> think this is ready for merging.
> >
> > I'm currently considering them at last: will report back later.
> >
> > Full-blown gemfs does not come in here, of course; but let me
> > fire a warning shot since you mention it: if it's going to use swap,
> > then we shall probably have to nak it in favour of continuing to use
> > infrastructure from mm/shmem.c.  I very much understand why you would
> > love to avoid that dependence, but I doubt it can be safely bypassed.
>
> Could you please elaborate on what specifically you don't like about
> gemfs implementing swap, just to make sure I'm following?

If we're talking about swap as implemented in mm/swapfile.c, and
managed for tmpfs mainly through shmem_getpage_gfp(): that's slippery
stuff, private to mm, and I would not want such trickiness duplicated
somewhere down in drivers/gpu/drm, where mm developers and drm developers
will keep on forgetting to keep it working correctly.

But you write of gemfs "implementing" swap (and I see Daniel wrote of
"our own backing storage"): perhaps you intend a disk or slow-mem file
of your own, dedicated to paging gemfs objects according to your own
rules, poked from memory reclaim via a shrinker.  I certainly don't
have the same quick objection to that: it may be a good way forward,
though I'm not at all sure (and would prefer a name distinct from
swap, so we wouldn't get confused - maybe gemswap).

Hugh
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-22 Thread Matthew Auld
On 9 November 2016 at 18:36, Hugh Dickins  wrote:
> On Wed, 9 Nov 2016, Daniel Vetter wrote:
>>
>> Hi all -mm folks!
>>
>> Any feedback on these two? It's kinda an intermediate step towards a
>> full-blown gemfs, and I think useful for that. Or do we need to go
>> directly to our own backing storage thing? Aside from ack/nack from -mm I
>> think this is ready for merging.
>
> I'm currently considering them at last: will report back later.
>
> Full-blown gemfs does not come in here, of course; but let me
> fire a warning shot since you mention it: if it's going to use swap,
> then we shall probably have to nak it in favour of continuing to use
> infrastructure from mm/shmem.c.  I very much understand why you would
> love to avoid that dependence, but I doubt it can be safely bypassed.
Could you please elaborate on what specifically you don't like about
gemfs implementing swap, just to make sure I'm following?

Thanks,
Matt
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-15 Thread akash goel
On Wed, Nov 16, 2016 at 6:55 AM, Hugh Dickins  wrote:
> On Mon, 14 Nov 2016, akash goel wrote:
>> On Thu, Nov 10, 2016 at 1:00 PM, Goel, Akash  wrote:
>> > On 11/10/2016 12:09 PM, Hugh Dickins wrote:
>> >> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>> >>> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
>> >>>
>> >>> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
>> >>> +   if (IS_ENABLED(MIGRATION))
>
> Oh, I knew I'd seen a line like that recently, and it was bugging me
> that I ought to search my mailboxes for it; but now I'm glad to find
> it again.  If that condition stays, it would really need to say
>   if (IS_ENABLED(CONFIG_MIGRATION))
> wouldn't it?
>
Sorry this was a blooper, should have been
 if (IS_ENABLED(CONFIG_MIGRATION))

>> >>> +   mask |= __GFP_MOVABLE;
>> >>
>> >>
>> >> I was going to suggest just make that unconditional,
>> >> mask = GFP_HIGHUSER_MOVABLE | __GFP_RECLAIMABLE;
>> >>
>> >> But then I wondered what that __GFP_RECLAIMABLE actually achieves?
>> >> These pages are already __GFP_RECLAIM (inside GFP_HIGHUSER) and on
>> >> the LRU.  It affects gfpflags_to_migratetype(), but I'm not familiar
>> >> with what that different migratetype will end up doing.
>> >>
>> >
>> > Will check for this.
>> >
>>
>> The anti-fragmentation technique used by kernel is based on the idea
>> of grouping pages with identical mobility (UNMOVABLE, RECLAIMABLE,
>> MOVABLE) together.
>
> Yes.
>
>> __GFP_RECLAIMABLE, like  __GFP_MOVABLE, specifies the
>> mobility/migration type of the page and serves a different purpose
>> than __GFP_RECLAIM.
>
> Yes, I was wrong to mention __GFP_RECLAIM above: it describes what
> to do when in difficulty allocating a page, but says nothing at all
> about the nature of the page to be allocated.
>
Right, nicely phrased, thanks.

>>
>> Also as per the below snippet from gfpflags_to_migratetype(), looks
>> like __GFP_MOVABLE &  __GFP_RECLAIMABLE can't be used together, which
>> makes sense.
>> /* Convert GFP flags to their corresponding migrate type */
>> #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE | __GFP_MOVABLE)
>> static inline int gfpflags_to_migratetype(const gfp_t gfp_flags)
>> {
>> VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
>> .
>
> You're right, that does exclude them from being used together.  And it
> makes sense inasmuch as they're expected to be appled to quite different
> uses of a page (lru pages versus slab pages).
>
> The comment on __GFP_MOVABLE says "or can be reclaimed"; and
> the comment on __GFP_RECLAIMABLE says "used for slab allocations...".
> Though it does not say "used for allocations not put on a reclaimable
> lru", I think that is the intention; whereas shmem allocations are put
> on a reclaimable lru (though they might need your shrinker to unpin them).
>

As per my understanding both  __GFP_MOVABLE & __GFP_RECLAIMABLE type
pages would get added to the LRU list for reclaiming.
Irrespective of whether a shmem page is allocated as __GFP_MOVABLE
type or  __GFP_RECLAIMABLE type, it will be added to the LRU list.

>>
>> So probably would need to update the mask like this,
>>mask = GFP_HIGHUSER;
>>if (IS_ENABLED(MIGRATION))
>>  mask |= __GFP_MOVABLE;
>>else
>>  mask |=  __GFP_RECLAIMABLE;
>>
>> Please kindly let us know if this looks fine to you or not.
>
> Thanks for looking into it more deeply.  You leave me thinking that
> it should simply say
>
> mask = GFP_HIGHUSER_MOVABLE;
>
> Which is the default anyway, but it then has the Crestline+Broadwater
> condition to modify the mask further, so it's probably clearest to
> leave the mask = GFP_HIGHUSER_MOVABLE explicit.
>
> GFP_HIGHUSER_MOVABLE is used in many places, and includes __GFP_MOVABLE
> without any condition on CONFIG_MIGRATION - because the migratetype is
> irrelevant if there is no migration, I presume.
>
> Would you lose something by not or'ing in __GFP_RECLAIMABLE when
> CONFIG_MIGRATION=n?  No, because __GFP_RECLAIMABLE is not used for
> anything but the migratetype, and the migratetype is then irrelevant.
> (I didn't study the code closely enough to say whether the grouping
> can still happen even when migration is disabled, but even if it
> does still happen, I can't see that it would have any benefit.)
>
The freelist, for a particular order,  in buddy allocator is always
organized based on the migrate type.
from 
struct free_area {
   struct list_head free_list[MIGRATE_TYPES];
unsigned long nr_free;
};
And the page grouping, based on migrate type, is also always done by the kernel.

Its just that when CONFIG_MIGRATION=n, the actual migration does not
take place, leveraging the page grouping, so no compaction effectively
happens.
I think even without the migration, the page grouping could still be
beneficial in limiting the fragmentation to some extent, as at least
the non-movable pages will 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-15 Thread Hugh Dickins
On Mon, 14 Nov 2016, akash goel wrote:
> On Thu, Nov 10, 2016 at 1:00 PM, Goel, Akash  wrote:
> > On 11/10/2016 12:09 PM, Hugh Dickins wrote:
> >> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
> >>> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
> >>>
> >>> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> >>> +   if (IS_ENABLED(MIGRATION))

Oh, I knew I'd seen a line like that recently, and it was bugging me
that I ought to search my mailboxes for it; but now I'm glad to find
it again.  If that condition stays, it would really need to say
  if (IS_ENABLED(CONFIG_MIGRATION))
wouldn't it?

> >>> +   mask |= __GFP_MOVABLE;
> >>
> >>
> >> I was going to suggest just make that unconditional,
> >> mask = GFP_HIGHUSER_MOVABLE | __GFP_RECLAIMABLE;
> >>
> >> But then I wondered what that __GFP_RECLAIMABLE actually achieves?
> >> These pages are already __GFP_RECLAIM (inside GFP_HIGHUSER) and on
> >> the LRU.  It affects gfpflags_to_migratetype(), but I'm not familiar
> >> with what that different migratetype will end up doing.
> >>
> >
> > Will check for this.
> >
> 
> The anti-fragmentation technique used by kernel is based on the idea
> of grouping pages with identical mobility (UNMOVABLE, RECLAIMABLE,
> MOVABLE) together.

Yes.

> __GFP_RECLAIMABLE, like  __GFP_MOVABLE, specifies the
> mobility/migration type of the page and serves a different purpose
> than __GFP_RECLAIM.

Yes, I was wrong to mention __GFP_RECLAIM above: it describes what
to do when in difficulty allocating a page, but says nothing at all
about the nature of the page to be allocated.

> 
> Also as per the below snippet from gfpflags_to_migratetype(), looks
> like __GFP_MOVABLE &  __GFP_RECLAIMABLE can't be used together, which
> makes sense.
> /* Convert GFP flags to their corresponding migrate type */
> #define GFP_MOVABLE_MASK (__GFP_RECLAIMABLE | __GFP_MOVABLE)
> static inline int gfpflags_to_migratetype(const gfp_t gfp_flags)
> {
> VM_WARN_ON((gfp_flags & GFP_MOVABLE_MASK) == GFP_MOVABLE_MASK);
> .

You're right, that does exclude them from being used together.  And it
makes sense inasmuch as they're expected to be appled to quite different
uses of a page (lru pages versus slab pages).

The comment on __GFP_MOVABLE says "or can be reclaimed"; and
the comment on __GFP_RECLAIMABLE says "used for slab allocations...".
Though it does not say "used for allocations not put on a reclaimable
lru", I think that is the intention; whereas shmem allocations are put
on a reclaimable lru (though they might need your shrinker to unpin them).

> 
> So probably would need to update the mask like this,
>mask = GFP_HIGHUSER;
>if (IS_ENABLED(MIGRATION))
>  mask |= __GFP_MOVABLE;
>else
>  mask |=  __GFP_RECLAIMABLE;
> 
> Please kindly let us know if this looks fine to you or not.

Thanks for looking into it more deeply.  You leave me thinking that
it should simply say

mask = GFP_HIGHUSER_MOVABLE;

Which is the default anyway, but it then has the Crestline+Broadwater
condition to modify the mask further, so it's probably clearest to
leave the mask = GFP_HIGHUSER_MOVABLE explicit.

GFP_HIGHUSER_MOVABLE is used in many places, and includes __GFP_MOVABLE
without any condition on CONFIG_MIGRATION - because the migratetype is
irrelevant if there is no migration, I presume.

Would you lose something by not or'ing in __GFP_RECLAIMABLE when
CONFIG_MIGRATION=n?  No, because __GFP_RECLAIMABLE is not used for
anything but the migratetype, and the migratetype is then irrelevant.
(I didn't study the code closely enough to say whether the grouping
can still happen even when migration is disabled, but even if it
does still happen, I can't see that it would have any benefit.)

Hugh
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-13 Thread akash goel
On Thu, Nov 10, 2016 at 1:00 PM, Goel, Akash  wrote:
>
>
> On 11/10/2016 12:09 PM, Hugh Dickins wrote:
>>
>> On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
>>>
>>> From: Chris Wilson 
>>>
>>> On a long run of more than 2-3 days, physical memory tends to get
>>> fragmented severely, which considerably slows down the system. In such a
>>> scenario, the shrinker is also unable to help as lack of memory is not
>>> the actual problem, since it has been observed that there are enough free
>>> pages of 0 order. This also manifests itself when an indiviual zone in
>>> the mm runs out of pages and if we cannot migrate pages between zones,
>>> the kernel hits an out-of-memory even though there are free pages (and
>>> often all of swap) available.
>>>
>>> To address the issue of external fragementation, kernel does a compaction
>>> (which involves migration of pages) but it's efficacy depends upon how
>>> many pages are marked as MOVABLE, as only those pages can be migrated.
>>>
>>> Currently the backing pages for GPU buffers are allocated from shmemfs
>>> with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
>>> swap space, it may not be possible always to reclaim or swap-out pages of
>>> all the inactive objects, to make way for free space allowing formation
>>> of higher order groups of physically-contiguous pages on compaction.
>>>
>>> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
>>> pin the pages if they are in use by GPU, which will prevent their
>>> migration. So the migratepage callback in shmem is also hooked up to get
>>> a notification when kernel initiates the page migration. On the
>>> notification, i915.ko appropriately unpin the pages.  With this we can
>>> effectively mark the GPU pages as MOVABLE and hence mitigate the
>>> fragmentation problem.
>>>
>>> v2:
>>>  - Rename the migration routine to gem_shrink_migratepage, move it to the
>>>shrinker file, and use the existing constructs (Chris)
>>>  - To cleanup, add a new helper function to encapsulate all page
>>> migration
>>>skip conditions (Chris)
>>>  - Add a new local helper function in shrinker file, for dropping the
>>>backing pages, and call the same from gem_shrink() also (Chris)
>>>
>>> v3:
>>>  - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>>>
>>> v4:
>>>  - Minor tidy
>>>
>>> v5:
>>>  - Fix unsafe usage of unsafe_drop_pages()
>>>  - Rebase onto vmap-notifier
>>>
>>> v6:
>>> - Remove i915_gem_object_get/put across unsafe_drop_pages() as with
>>>   struct_mutex protection object can't disappear. (Chris)
>>>
>>> Testcase: igt/gem_shrink
>>> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
>>> Cc: Hugh Dickins 
>>> Cc: linux...@kvack.org
>>> Signed-off-by: Sourab Gupta 
>>> Signed-off-by: Akash Goel 
>>> Signed-off-by: Chris Wilson 
>>> Reviewed-by: Joonas Lahtinen 
>>> Reviewed-by: Chris Wilson 
>>
>>
>> I'm confused!  But perhaps it's gone around and around between you all,
>> I'm not sure what the rules are then.  I think this sequence implies
>> that Sourab wrote it originally, then Akash and Chris passed it on
>> with refinements - but then Chris wouldn't add Reviewed-by.
>>
> Thank you very much for the review and sorry for all the needless confusion.
>
> Chris actually conceived the patches and prepared an initial version of them
> (hence he is the Author).
> I & Sourab did the further refinements and fixed issues (all those
> page_private stuff).
> Chris then reviewed the final patch and also recently did a rebase for it.
>
>
>>> ---
>>>  drivers/gpu/drm/i915/i915_drv.h  |   2 +
>>>  drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
>>>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 132
>>> +++
>>>  3 files changed, 142 insertions(+), 1 deletion(-)
>>>
snip
>>
>>> @@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
>>> goto fail;
>>>
>>> mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
>>> +   if (IS_ENABLED(MIGRATION))
>>> +   mask |= __GFP_MOVABLE;
>>
>>
>> I was going to suggest just make that unconditional,
>> mask = GFP_HIGHUSER_MOVABLE | __GFP_RECLAIMABLE;
>>
>> But then I wondered what that __GFP_RECLAIMABLE actually achieves?
>> These pages are already __GFP_RECLAIM (inside GFP_HIGHUSER) and on
>> the LRU.  It affects gfpflags_to_migratetype(), but I'm not familiar
>> with what that different migratetype will end up doing.
>>
>
> Will check for this.
>

The anti-fragmentation technique used by kernel is based on the idea
of grouping pages with identical mobility (UNMOVABLE, RECLAIMABLE,
MOVABLE) together.
__GFP_RECLAIMABLE, like  __GFP_MOVABLE, specifies the
mobility/migration type of the page and serves a different purpose
than __GFP_RECLAIM.

Also as per 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-09 Thread Goel, Akash



On 11/10/2016 12:09 PM, Hugh Dickins wrote:

On Fri, 4 Nov 2016, akash.g...@intel.com wrote:

From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an indiviual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragementation, kernel does a compaction
(which involves migration of pages) but it's efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpin the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

v6:
- Remove i915_gem_object_get/put across unsafe_drop_pages() as with
  struct_mutex protection object can't disappear. (Chris)

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 


I'm confused!  But perhaps it's gone around and around between you all,
I'm not sure what the rules are then.  I think this sequence implies
that Sourab wrote it originally, then Akash and Chris passed it on
with refinements - but then Chris wouldn't add Reviewed-by.


Thank you very much for the review and sorry for all the needless confusion.

Chris actually conceived the patches and prepared an initial version of 
them (hence he is the Author).
I & Sourab did the further refinements and fixed issues (all those 
page_private stuff).

Chris then reviewed the final patch and also recently did a rebase for it.


---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
 3 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };

 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct 
drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);

+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);

/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
@@ 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-09 Thread Hugh Dickins
On Fri, 4 Nov 2016, akash.g...@intel.com wrote:
> From: Chris Wilson 
> 
> On a long run of more than 2-3 days, physical memory tends to get
> fragmented severely, which considerably slows down the system. In such a
> scenario, the shrinker is also unable to help as lack of memory is not
> the actual problem, since it has been observed that there are enough free
> pages of 0 order. This also manifests itself when an indiviual zone in
> the mm runs out of pages and if we cannot migrate pages between zones,
> the kernel hits an out-of-memory even though there are free pages (and
> often all of swap) available.
> 
> To address the issue of external fragementation, kernel does a compaction
> (which involves migration of pages) but it's efficacy depends upon how
> many pages are marked as MOVABLE, as only those pages can be migrated.
> 
> Currently the backing pages for GPU buffers are allocated from shmemfs
> with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
> swap space, it may not be possible always to reclaim or swap-out pages of
> all the inactive objects, to make way for free space allowing formation
> of higher order groups of physically-contiguous pages on compaction.
> 
> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
> pin the pages if they are in use by GPU, which will prevent their
> migration. So the migratepage callback in shmem is also hooked up to get
> a notification when kernel initiates the page migration. On the
> notification, i915.ko appropriately unpin the pages.  With this we can
> effectively mark the GPU pages as MOVABLE and hence mitigate the
> fragmentation problem.
> 
> v2:
>  - Rename the migration routine to gem_shrink_migratepage, move it to the
>shrinker file, and use the existing constructs (Chris)
>  - To cleanup, add a new helper function to encapsulate all page migration
>skip conditions (Chris)
>  - Add a new local helper function in shrinker file, for dropping the
>backing pages, and call the same from gem_shrink() also (Chris)
> 
> v3:
>  - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
> 
> v4:
>  - Minor tidy
> 
> v5:
>  - Fix unsafe usage of unsafe_drop_pages()
>  - Rebase onto vmap-notifier
> 
> v6:
> - Remove i915_gem_object_get/put across unsafe_drop_pages() as with
>   struct_mutex protection object can't disappear. (Chris)
> 
> Testcase: igt/gem_shrink
> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
> Cc: Hugh Dickins 
> Cc: linux...@kvack.org
> Signed-off-by: Sourab Gupta 
> Signed-off-by: Akash Goel 
> Signed-off-by: Chris Wilson 
> Reviewed-by: Joonas Lahtinen 
> Reviewed-by: Chris Wilson 

I'm confused!  But perhaps it's gone around and around between you all,
I'm not sure what the rules are then.  I think this sequence implies
that Sourab wrote it originally, then Akash and Chris passed it on
with refinements - but then Chris wouldn't add Reviewed-by.

> ---
>  drivers/gpu/drm/i915/i915_drv.h  |   2 +
>  drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 
> +++
>  3 files changed, 142 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4735b417..7f2717b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1357,6 +1357,8 @@ struct intel_l3_parity {
>  };
>  
>  struct i915_gem_mm {
> + struct shmem_dev_info shmem_info;
> +
>   /** Memory allocator for GTT stolen memory */
>   struct drm_mm stolen;
>   /** Protects the usage of the GTT stolen memory allocator. This is
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1f995ce..f0d4ce7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct 
> drm_i915_gem_object *obj)
>   if (obj->mm.madv == I915_MADV_WILLNEED)
>   mark_page_accessed(page);
>  
> + set_page_private(page, 0);
>   put_page(page);
>   }
>   obj->mm.dirty = false;
> @@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
>   sg->length += PAGE_SIZE;
>   }
>   last_pfn = page_to_pfn(page);
> + set_page_private(page, (unsigned long)obj);
>  
>   /* Check that the i965g/gm workaround works. */
>   WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
> @@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
>  
>  err_pages:
>   sg_mark_end(sg);
> - for_each_sgt_page(page, sgt_iter, st)
> + for_each_sgt_page(page, sgt_iter, st) {
> + 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-09 Thread Hugh Dickins
On Wed, 9 Nov 2016, Daniel Vetter wrote:
> 
> Hi all -mm folks!
> 
> Any feedback on these two? It's kinda an intermediate step towards a
> full-blown gemfs, and I think useful for that. Or do we need to go
> directly to our own backing storage thing? Aside from ack/nack from -mm I
> think this is ready for merging.

I'm currently considering them at last: will report back later.

Full-blown gemfs does not come in here, of course; but let me
fire a warning shot since you mention it: if it's going to use swap,
then we shall probably have to nak it in favour of continuing to use 
infrastructure from mm/shmem.c.  I very much understand why you would
love to avoid that dependence, but I doubt it can be safely bypassed.

Hugh
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-09 Thread Daniel Vetter
On Fri, Nov 04, 2016 at 08:32:56PM +0530, akash.g...@intel.com wrote:
> From: Chris Wilson 
> 
> On a long run of more than 2-3 days, physical memory tends to get
> fragmented severely, which considerably slows down the system. In such a
> scenario, the shrinker is also unable to help as lack of memory is not
> the actual problem, since it has been observed that there are enough free
> pages of 0 order. This also manifests itself when an indiviual zone in
> the mm runs out of pages and if we cannot migrate pages between zones,
> the kernel hits an out-of-memory even though there are free pages (and
> often all of swap) available.
> 
> To address the issue of external fragementation, kernel does a compaction
> (which involves migration of pages) but it's efficacy depends upon how
> many pages are marked as MOVABLE, as only those pages can be migrated.
> 
> Currently the backing pages for GPU buffers are allocated from shmemfs
> with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
> swap space, it may not be possible always to reclaim or swap-out pages of
> all the inactive objects, to make way for free space allowing formation
> of higher order groups of physically-contiguous pages on compaction.
> 
> Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
> pin the pages if they are in use by GPU, which will prevent their
> migration. So the migratepage callback in shmem is also hooked up to get
> a notification when kernel initiates the page migration. On the
> notification, i915.ko appropriately unpin the pages.  With this we can
> effectively mark the GPU pages as MOVABLE and hence mitigate the
> fragmentation problem.
> 
> v2:
>  - Rename the migration routine to gem_shrink_migratepage, move it to the
>shrinker file, and use the existing constructs (Chris)
>  - To cleanup, add a new helper function to encapsulate all page migration
>skip conditions (Chris)
>  - Add a new local helper function in shrinker file, for dropping the
>backing pages, and call the same from gem_shrink() also (Chris)
> 
> v3:
>  - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
> 
> v4:
>  - Minor tidy
> 
> v5:
>  - Fix unsafe usage of unsafe_drop_pages()
>  - Rebase onto vmap-notifier
> 
> v6:
> - Remove i915_gem_object_get/put across unsafe_drop_pages() as with
>   struct_mutex protection object can't disappear. (Chris)
> 
> Testcase: igt/gem_shrink
> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
> Cc: Hugh Dickins 
> Cc: linux...@kvack.org
> Signed-off-by: Sourab Gupta 
> Signed-off-by: Akash Goel 
> Signed-off-by: Chris Wilson 
> Reviewed-by: Joonas Lahtinen 
> Reviewed-by: Chris Wilson 

Hi all -mm folks!

Any feedback on these two? It's kinda an intermediate step towards a
full-blown gemfs, and I think useful for that. Or do we need to go
directly to our own backing storage thing? Aside from ack/nack from -mm I
think this is ready for merging.

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h  |   2 +
>  drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 
> +++
>  3 files changed, 142 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4735b417..7f2717b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1357,6 +1357,8 @@ struct intel_l3_parity {
>  };
>  
>  struct i915_gem_mm {
> + struct shmem_dev_info shmem_info;
> +
>   /** Memory allocator for GTT stolen memory */
>   struct drm_mm stolen;
>   /** Protects the usage of the GTT stolen memory allocator. This is
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1f995ce..f0d4ce7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct 
> drm_i915_gem_object *obj)
>   if (obj->mm.madv == I915_MADV_WILLNEED)
>   mark_page_accessed(page);
>  
> + set_page_private(page, 0);
>   put_page(page);
>   }
>   obj->mm.dirty = false;
> @@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
>   sg->length += PAGE_SIZE;
>   }
>   last_pfn = page_to_pfn(page);
> + set_page_private(page, (unsigned long)obj);
>  
>   /* Check that the i965g/gm workaround works. */
>   WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
> @@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
>  
>  err_pages:
>   sg_mark_end(sg);
> - for_each_sgt_page(page, sgt_iter, st)
> + for_each_sgt_page(page, sgt_iter, 

[Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread akash . goel
From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an indiviual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragementation, kernel does a compaction
(which involves migration of pages) but it's efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpin the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

v6:
- Remove i915_gem_object_get/put across unsafe_drop_pages() as with
  struct_mutex protection object can't disappear. (Chris)

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 132 +++
 3 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct 
drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);
 
+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);
 
/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
@@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
 
 err_pages:
sg_mark_end(sg);
-   for_each_sgt_page(page, sgt_iter, st)
+   for_each_sgt_page(page, sgt_iter, st) {
+   set_page_private(page, 0);
put_page(page);
+   }
sg_free_table(st);
kfree(st);
 
@@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
goto fail;
 
mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+   if (IS_ENABLED(MIGRATION))
+   mask |= __GFP_MOVABLE;
if (IS_CRESTLINE(dev_priv) || IS_BROADWATER(dev_priv)) {
/* 965gm cannot relocate objects above 4GiB. */
mask &= ~__GFP_HIGHMEM;
@@ -4193,6 

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread Goel, Akash



On 11/4/2016 7:07 PM, Chris Wilson wrote:

Best if we send these as a new series to unconfuse CI.


Okay will send as a new series.


On Fri, Nov 04, 2016 at 06:18:26PM +0530, akash.g...@intel.com wrote:

+static int do_migrate_page(struct drm_i915_gem_object *obj)
+{
+   struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+   int ret = 0;
+
+   if (!can_migrate_page(obj))
+   return -EBUSY;
+
+   /* HW access would be required for a GGTT bound object, for which
+* device has to be kept awake. But a deadlock scenario can arise if
+* the attempt is made to resume the device, when either a suspend
+* or a resume operation is already happening concurrently from some
+* other path and that only also triggers compaction. So only unbind
+* if the device is currently awake.
+*/
+   if (!intel_runtime_pm_get_if_in_use(dev_priv))
+   return -EBUSY;
+
+   i915_gem_object_get(obj);
+   if (!unsafe_drop_pages(obj))
+   ret = -EBUSY;
+   i915_gem_object_put(obj);


Since the object release changes, we can now do this without the
i915_gem_object_get / i915_gem_object_put (as we are guarded by the BKL
struct_mutex).
Fine will remove object_get/put as with struct_mutex protection object 
can't disappear across unsafe_drop_pages().


Best regards
Akash



-Chris


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread Chris Wilson
Best if we send these as a new series to unconfuse CI.

On Fri, Nov 04, 2016 at 06:18:26PM +0530, akash.g...@intel.com wrote:
> +static int do_migrate_page(struct drm_i915_gem_object *obj)
> +{
> + struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
> + int ret = 0;
> +
> + if (!can_migrate_page(obj))
> + return -EBUSY;
> +
> + /* HW access would be required for a GGTT bound object, for which
> +  * device has to be kept awake. But a deadlock scenario can arise if
> +  * the attempt is made to resume the device, when either a suspend
> +  * or a resume operation is already happening concurrently from some
> +  * other path and that only also triggers compaction. So only unbind
> +  * if the device is currently awake.
> +  */
> + if (!intel_runtime_pm_get_if_in_use(dev_priv))
> + return -EBUSY;
> +
> + i915_gem_object_get(obj);
> + if (!unsafe_drop_pages(obj))
> + ret = -EBUSY;
> + i915_gem_object_put(obj);

Since the object release changes, we can now do this without the
i915_gem_object_get / i915_gem_object_put (as we are guarded by the BKL
struct_mutex).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/2] drm/i915: Make GPU pages movable

2016-11-04 Thread akash . goel
From: Chris Wilson 

On a long run of more than 2-3 days, physical memory tends to get
fragmented severely, which considerably slows down the system. In such a
scenario, the shrinker is also unable to help as lack of memory is not
the actual problem, since it has been observed that there are enough free
pages of 0 order. This also manifests itself when an indiviual zone in
the mm runs out of pages and if we cannot migrate pages between zones,
the kernel hits an out-of-memory even though there are free pages (and
often all of swap) available.

To address the issue of external fragementation, kernel does a compaction
(which involves migration of pages) but it's efficacy depends upon how
many pages are marked as MOVABLE, as only those pages can be migrated.

Currently the backing pages for GPU buffers are allocated from shmemfs
with GFP_RECLAIMABLE flag, in units of 4KB pages.  In the case of limited
swap space, it may not be possible always to reclaim or swap-out pages of
all the inactive objects, to make way for free space allowing formation
of higher order groups of physically-contiguous pages on compaction.

Just marking the GPU pages as MOVABLE will not suffice, as i915.ko has to
pin the pages if they are in use by GPU, which will prevent their
migration. So the migratepage callback in shmem is also hooked up to get
a notification when kernel initiates the page migration. On the
notification, i915.ko appropriately unpin the pages.  With this we can
effectively mark the GPU pages as MOVABLE and hence mitigate the
fragmentation problem.

v2:
 - Rename the migration routine to gem_shrink_migratepage, move it to the
   shrinker file, and use the existing constructs (Chris)
 - To cleanup, add a new helper function to encapsulate all page migration
   skip conditions (Chris)
 - Add a new local helper function in shrinker file, for dropping the
   backing pages, and call the same from gem_shrink() also (Chris)

v3:
 - Fix/invert the check on the return value of unsafe_drop_pages (Chris)

v4:
 - Minor tidy

v5:
 - Fix unsafe usage of unsafe_drop_pages()
 - Rebase onto vmap-notifier

Testcase: igt/gem_shrink
Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Signed-off-by: Sourab Gupta 
Signed-off-by: Akash Goel 
Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
Reviewed-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_drv.h  |   2 +
 drivers/gpu/drm/i915/i915_gem.c  |   9 ++-
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 134 +++
 3 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4735b417..7f2717b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1357,6 +1357,8 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
+   struct shmem_dev_info shmem_info;
+
/** Memory allocator for GTT stolen memory */
struct drm_mm stolen;
/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1f995ce..f0d4ce7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2164,6 +2164,7 @@ void __i915_gem_object_invalidate(struct 
drm_i915_gem_object *obj)
if (obj->mm.madv == I915_MADV_WILLNEED)
mark_page_accessed(page);
 
+   set_page_private(page, 0);
put_page(page);
}
obj->mm.dirty = false;
@@ -2310,6 +2311,7 @@ static unsigned int swiotlb_max_size(void)
sg->length += PAGE_SIZE;
}
last_pfn = page_to_pfn(page);
+   set_page_private(page, (unsigned long)obj);
 
/* Check that the i965g/gm workaround works. */
WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x0010UL));
@@ -2334,8 +2336,10 @@ static unsigned int swiotlb_max_size(void)
 
 err_pages:
sg_mark_end(sg);
-   for_each_sgt_page(page, sgt_iter, st)
+   for_each_sgt_page(page, sgt_iter, st) {
+   set_page_private(page, 0);
put_page(page);
+   }
sg_free_table(st);
kfree(st);
 
@@ -4185,6 +4189,8 @@ struct drm_i915_gem_object *
goto fail;
 
mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+   if (IS_ENABLED(MIGRATION))
+   mask |= __GFP_MOVABLE;
if (IS_CRESTLINE(dev_priv) || IS_BROADWATER(dev_priv)) {
/* 965gm cannot relocate objects above 4GiB. */
mask &= ~__GFP_HIGHMEM;
@@ -4193,6 +4199,7 @@ struct drm_i915_gem_object *
 
mapping = obj->base.filp->f_mapping;
mapping_set_gfp_mask(mapping, mask);
+