Jerome Glisse wrote:
> On Wed, 14 May 2008 12:09:06 +0200
> Thomas Hellström <[EMAIL PROTECTED]> wrote:
>
>> Jerome Glisse wrote:
>> Jerome, Dave, Keith
>>
>> 1) The inability to map device memory. The design arguments and proposed
>> solution for VRAM are not really valid. Think of this, probably not too
>> uncommon, scenario of a single-pixel fallback composite to a scanout
>> buffer in VRAM. Or a texture or video frame upload:
>>
>> A) Page in all GEM pages, because they've been paged out.
>> B) Copy the complete scanout buffer to GEM because it's dirty. Untile.
>> C) Write the pixel.
>> D) Copy the complete buffer back while tiling.
>
> With pwrite/pread you give the offset and size of the things you are
> interested in. So for the single-pixel case it will pread a page and
> pwrite it once the fallback is finished. I totally agree that downloading
> the whole object on fallback is to be avoided. But as long as we don't
> have a fallback which draws the whole screen then we are fine, and since
> such a fallback will be disastrous whether we map VRAM or not anyway,
> that leads me to discard this drawback and just accept the pain for such
> a fallback.

I don't agree with you here. EXA is much faster for small composite
operations, and even small fill blits, if fallbacks are used. Even to
write-combined memory, though that of course depends on the hardware.
This is going to be even more pronounced with acceleration architectures
like Glucose and similar, which don't have an optimized path for small
hardware composite operations.
My personal feeling is that pwrites are a workaround for a workaround for
a very bad decision: to avoid user-space allocators on device-mapped
memory. This led to a hack to avoid caching-policy changes, which led to
cache-thrashing problems, which put us in the current situation. How far
are we going to follow this path before people wake up? What's wrong with
the performance of good old i915tex, which even beats "classic" i915 in
many cases?

Having to go through potentially (and even probably) paged-out memory to
access buffers that are present in VRAM sounds like a very odd approach
(to say the least) to me, even if it's a single page, and implementing
per-page dirty checks for domain flushing isn't very appealing either.

> Also I am confident that we can find a more clever way to handle such a
> case, like doing the whole rendering in RAM and updating the final
> result, so assuming that the up-to-date copy is in RAM and that VRAM
> might be out of sync.

Why should we have to, when we can do it right?

>> 2) Reserving pages when allocating VRAM buffers is also a very bad
>> solution, particularly on systems with a lot of VRAM and little system
>> RAM (multiple-card machines?). GEM basically needs to reserve swap
>> space when buffers are created, and put a limit on the pinned physical
>> pages. We basically should not be able to fail memory allocation during
>> execbuf, because we cannot recover from that.
>
> Well, this solves the suspend problem we were discussing at XDS, i.e.
> what to do with buffers. If we know that we have room to put every
> buffer, then we don't need to worry about which buffers we are ready to
> lose. Given that OpenGL doesn't give any clue on that, this sounds like
> a good approach.
>
> For embedded devices, where every piece of RAM still matters, I guess
> you also have to deal with the suspend case, so you have a way to either
> save VRAM contents or preserve them. I don't see any problem with GEM
> coping with this case too.

No. GEM can't cope with it.
Let's say you have a 512M system with two 1G video cards, 4G of swap
space, and you want to fill both cards' video RAM with render-and-forget
textures for whatever purpose. What happens? After you've generated the
first, say, 300M, the system mysteriously starts to page, and when, after
a couple of minutes of crawling texture-upload speeds, you're done, the
system is using, and has written, almost 2G of swap. Now you want to
update the textures and expect fast texsubimage...

So having a backing object that you have to access to get things into
VRAM is not the way to go. The correct way to do this is to reserve, but
not use, swap space. Then you can start using it on suspend, provided
that the swapping system is still up (which it has to be with the current
GEM approach anyway). If pwrite is used in this case, it must not dirty
any backing-object pages.

/Thomas

>> Other things like GFP_HIGHUSER etc. are probably fixable if there is a
>> will to do it.
>>
>> So if GEM is the future, these shortcomings must IMHO be addressed. In
>> particular, GEM should not stop people from mapping device memory
>> directly, particularly not in view of the arguments against TTM
>> previously outlined.
>
> As I said, I have come to the opinion that not mapping VRAM into
> user-space VMAs sounds like a good plan. I am even thinking that
> avoiding all mappings and encouraging pread/pwrite is a better solution.
> For me, VRAM is temporary storage that card makers use to speed up their
> hardware, and so it should not be directly used by user space. Note that
> this does not go against having user space choose the policy for VRAM
> usage, i.e. which object to put where.
>
> Cheers,
> Jerome Glisse
--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel