On Wed, 14 May 2008 10:21:15 -0700 (PDT) Keith Whitwell <[EMAIL PROTECTED]> wrote:
> > On Wed, 14 May 2008 16:36:54 +0200, Thomas Hellström wrote:
> > > Jerome Glisse wrote:
> > >
> > > I don't agree with you here. EXA is much faster for small composite
> > > operations, and even for small fill blits if fallbacks are used. Even to
> > > write-combined memory, though that of course depends on the hardware. This
> > > is going to be even more pronounced with acceleration architectures like
> > > Glucose and similar, which don't have an optimized path for small
> > > hardware composite operations.
> > >
> > > My personal feeling is that pwrites are a workaround for a workaround
> > > for a very bad decision: avoiding user-space allocators on device-mapped
> > > memory. That led to a hack to avoid caching-policy changes, which led to
> > > cache-trashing problems, which put us in the current situation. How far
> > > are we going to follow this path before people wake up? What's wrong with
> > > the performance of good old i915tex, which even beats "classic" i915 in
> > > many cases?
> > >
> > > Having to go through potentially (and even probably) paged-out memory to
> > > access buffers that are present in VRAM sounds like a very odd approach
> > > (to say the least) to me. Even if it's a single page, implementing
> > > per-page dirty checks for domain flushing isn't very appealing either.
> >
> > I don't have numbers or benchmarks to show how fast the pread/pwrite path
> > might be in this case, so I am just expressing my feeling, which happens
> > to be that we should avoid VMA TLB flushes as much as we can. My
> > impression is that the kernel goes through numerous tricks to avoid TLB
> > flushing for good reason, and I am also pretty sure that, with the number
> > of cores growing, anything that needs CPU-wide synchronization is to be
> > avoided.
> >
> > Hopefully, once I get a decent amount of time to benchmark GEM, I will
> > check out my theory.
> > I think a simple benchmark can be done on Intel hardware: just return
> > FALSE in the EXA PrepareAccess hook to force use of DownloadFromScreen,
> > and in DownloadFromScreen use pread. Comparing benchmarks of this hacked
> > Intel DDX against a normal one should already give some numbers.
> >
> > > Why should we have to when we can do it right?
> >
> > Well, my point was that mapping VRAM is not right. I am not saying that
> > I know the truth; it's just a feeling based on my experiments with TTM,
> > on the BAR restriction issues, and on other considerations of the same
> > kind.
> >
> > > No. GEM can't cope with it. Let's say you have a 512M system with two
> > > 1G video cards and 4G of swap space, and you want to fill both cards'
> > > video RAM with render-and-forget textures for whatever purpose.
> > >
> > > What happens? After you've generated the first, say, 300M, the system
> > > mysteriously starts to page, and when, after a couple of minutes of
> > > crawling texture upload speeds, you're done, the system is using, and
> > > has written, almost 2G of swap. Now you want to update the textures and
> > > expect fast texsubimage...
> > >
> > > So having a backing object that you have to access to get things into
> > > VRAM is not the way to go. The correct way to do this is to reserve,
> > > but not use, swap space. Then you can start using it on suspend,
> > > provided that the swapping system is still up (which it has to be with
> > > the current GEM approach anyway). If pwrite is used in this case, it
> > > must not dirty any backing-object pages.
> >
> > For a normal desktop I don't expect the VRAM amount to exceed the RAM
> > amount; people with 1G of VRAM are usually hardcore gamers with 4G of
> > RAM :). Also, most objects in the 3D world are stored in memory: if
> > programs are not stupid and trust GL to keep their textures, then you
> > just have the usual RAM copy and possibly a VRAM copy, so I don't see
> > any waste in the normal use case.
> > Of course we can always come up with crazy, weird setups, but I am more
> > interested in dealing well with average Joe than in dealing mostly well
> > with every use case.
>
> It's always been a big win to go to single-copy texturing. Textures tend to
> be large, and nobody has so much memory that doubling up on textures has
> ever been appealing... And there are obvious use cases like textured video
> where only having a single copy is a big performance win.
>
> It certainly makes things easier for the driver to duplicate textures,
> which is why all the old DRI drivers did it, but it doesn't make it
> right... And the old DRI drivers also copped out on things like
> render-to-texture, etc., so whatever gains you make in simplicity by
> treating VRAM as a cache, some of those will be lost because you'll have to
> keep track of which of the two copies of a texture is up to date, and
> you'll still have to preserve (modified) texture contents on eviction,
> which old DRI never had to.
>
> Ultimately it boils down to a choice between making your life easier as a
> driver developer and producing a driver that makes the most of all the
> system resources.
>
> Nobody can force you to take one path or the other, but it's certainly my
> intention, when considering drivers for VRAM hardware, to support
> single-copy textures, and for that reason I'd be unhappy to see a system
> adopted that prevented that.
>
> Keith

I am also for saving memory, and I think you can do it in GEM. Here is the
call chain I foresee:

- create buffer
- driver-specific ioctl to set a buffer hint: ask for the object to be in VRAM
- pwrite the texture

From there, pwrite goes through the DRM core and into a driver-specific
callback:

- the driver sees the VRAM hint and checks for VRAM space
- if there is space in VRAM, take it and write the object there; allocate the
  backing-store object, but its pages are not instantiated, so no RAM is
  actually used, only swap area is reserved
- if there is no space in VRAM, you fall back to normal RAM
The drawback is that it's up to the driver to take care of saving the VRAM
copy on eviction, but I believe this is driver-specific enough to be fine.
So in this scheme you get one copy of the object, plus reserved swap area
(or I am just severely misunderstanding a few kernel areas, which could
happen).

Cheers,
Jerome Glisse <[EMAIL PROTECTED]>

--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel