On 12/05/2014 18:02, Eric Anholt wrote:
arun.siluv...@linux.intel.com writes:

From: "Siluvery, Arun" <arun.siluv...@intel.com>

This patch adds support to have gem objects of variable size.
The size of a gem object, obj->size, is always constant and the driver
depends heavily on this fact; this implementation allows the effective
size to be varied using an interface similar to fallocate().

A new ioctl() is introduced to mark a range as scratch/usable.
Once marked as scratch, associated backing store is released and the
region is filled with scratch pages. The region can also be unmarked
at a later point in which case new backing pages are created.
A range can be anywhere within the object space; an object can have
multiple ranges, possibly overlapping, that together form a larger
contiguous range.
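
As a rough user-space illustration of how overlapping or adjacent ranges
could coalesce into one contiguous range, here is a minimal sketch. The
names struct scratch_range and merge_ranges() are hypothetical and are not
taken from the patch; the driver may track ranges quite differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical description of one scratch range inside an object. */
struct scratch_range {
	uint64_t start;   /* byte offset within the object */
	uint64_t length;  /* length in bytes */
};

/* qsort() comparator: order ranges by ascending start offset. */
static int cmp_start(const void *a, const void *b)
{
	const struct scratch_range *ra = a, *rb = b;

	if (ra->start < rb->start)
		return -1;
	if (ra->start > rb->start)
		return 1;
	return 0;
}

/*
 * Sort the ranges by start offset, then merge overlapping or adjacent
 * ranges in place. Returns the number of ranges that remain.
 */
static size_t merge_ranges(struct scratch_range *r, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), cmp_start);

	for (size_t i = 1; i < n; i++) {
		uint64_t cur_end = r[out].start + r[out].length;

		if (r[i].start <= cur_end) {
			/* Overlaps or touches the current range: extend it. */
			uint64_t end = r[i].start + r[i].length;

			if (end > cur_end)
				r[out].length = end - r[out].start;
		} else {
			/* Gap before this range: start a new output range. */
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

With 64-bit start and length values, three ranges such as {0, 8192},
{4096, 8192} and {12288, 4096} collapse into a single range {0, 16384}.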

There is only a single scratch page and the kernel allows writes to this
page; userspace needs to keep track of the scratch page ranges, otherwise
any subsequent writes to these pages will overwrite previous content.

This feature is useful where the exact size of the object is not clear
at the time of its creation. In such cases we usually create an object
larger than required but end up using only part of it.
On devices with tight memory constraints it would be useful to release
the additional space that is currently unused. Using this interface, the
region can simply be marked as scratch, which releases its backing store
and thus reduces the memory pressure on the kernel.

Many thanks to Daniel, ChrisW, Tvrtko, Bob for the idea and feedback
on this implementation.

v2: fix holes in error handling and use consistent data types (Tvrtko)
  - If page allocation fails simply return error; do not try to invoke
    shrinker to free backing store.
  - Release new pages created by us in case of error during page allocation
    or sg_table update.
  - Use 64-bit data types for start and length values to avoid truncation.

The idea sounds nice to have for Mesa.  We've got this ugly code right
now for guessing how many levels a miptree is going to have, and then
doing copies if we find out we were wrong about how many the app was
going to use.  This will let us allocate for a maximum-depth miptree, and
mark the smaller levels as unused until an image gets put there.

The problem I see with this plan is if the page table twiddling ends up
being too expensive in our BO reallocation path (right now, if we make
the same guess on every allocation, we'll reuse cached BOs with the same
size and no mapping cost).

It would be nice to see some performance data from real applications, if
possible.  But then, I don't think I've seen any real applications hit
the copy path.

The way I am planning to test this is to measure the time it takes to
falloc a big object. Could you suggest the best way to test the
performance of this change?
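
A minimal sketch of such a measurement, under stated assumptions: the
operation being timed is passed in as a callback, because this message
does not give the ioctl name or its argument struct. time_op_ns(),
bench_op and count_calls() are hypothetical names used only for
illustration; count_calls() is a stand-in for the real call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdint.h>
#include <time.h>

/* Operation under test; returns 0 on success, non-zero on failure. */
typedef int (*bench_op)(void *ctx);

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Run op() iters times and store the average time per call, in
 * nanoseconds, in *avg_ns. Returns 0 on success and -1 if iters is
 * zero or any call fails.
 */
static int time_op_ns(bench_op op, void *ctx, unsigned iters,
		      uint64_t *avg_ns)
{
	uint64_t start, end;

	if (iters == 0)
		return -1;

	start = now_ns();
	for (unsigned i = 0; i < iters; i++) {
		if (op(ctx) != 0)
			return -1;
	}
	end = now_ns();

	*avg_ns = (end - start) / iters;
	return 0;
}

/* Stand-in operation: only counts how often it was called. */
static int count_calls(void *ctx)
{
	++*(unsigned *)ctx;
	return 0;
}
```

In a real test the callback would issue the new ioctl on a large object,
and the same harness could time an unmarked allocation for comparison.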

regards
Arun

