On Thu, 2009-01-22 at 20:54 +0100, Thomas Hellström wrote:
> Jerome Glisse wrote:
> > Hi Thomas,
> >
> > I went through the new ttm and I would like your feedback on changes
> > I think are worthwhile. The current design forces the driver to use
> > the ttm memory placement and movement logic. I would like to leave
> > that up to the driver and factor out the parts which do the low level
> > work. Basically it ends up splitting:
> >
> > -mmap buffer to userspace
> > -kmap buffer
> > -allocate pages
> > -set up pat or mtrr
> > -invalidate mapping
> > -other things I might just be forgetting right now :)
> >
> > from:
> > -buffer movement
> > -buffer validation
> > -buffer eviction
> > -...
> >
> > The motivation behind this is that the current design will likely
> > often end up stalling the GPU each time we do buffer validation,
> > which doesn't sound like a good plan.
> >
> > Here is what would happen on radeon, if I correctly understand the
> > current ttm movement code (e.g. the eviction path when we move a bo
> > from system to vram, or vram to system): the bo movement is scheduled
> > in the command stream and the CPU waits until the GPU is done moving.
> > Once that happens there is likely no more scheduled work left in the
> > GPU pipeline.
> >
> > For radeon the idea I had was to use 2 lists:
> > -GART (could be AGP, PCIE, PCI) bind list
> > -GART (same) unbind list
> > The GPU waits on the bind list and the CPU waits on the unbind list.
> > So when we do validation we find where all the BOs of the CS should
> > be placed (this could depend on hw and userspace hints) and we set
> > things up accordingly. Validation of a command stream buffer produces
> > the following (some of it could be empty if there is enough room):
> > -GART unbind (to make room in gart if necessary)
> > -GART bind (to bind the system memory where we will save BOs evicted
> >  from vram)
> > -gpu cmd to dma the evicted bos into GART
> > -GART unbind (to make room in gart if necessary; could be merged with
> >  the first unbind)
> > -GART bind (BOs referenced in the CS which are moved into VRAM)
> > -gpu cmd to dma into VRAM
> > -GART unbind (if necessary; again could be merged with a previous
> >  unbind)
> > -GART bind (remaining BOs, if any, which stay in the gart for the cs)
> > -cs
> >
> > That is the worst case scenario where we use the whole GART area. It
> > looks complex, but the idea is that most of the unbinding and binding
> > can be scheduled before the GPU even has to wait for it. This scheme
> > also allows adding, in the future, some kind of GPU cs scheduler
> > which can try to minimize BO movement by rescheduling cs. Anyway the
> > assumption here is that most of the time we will have this situation:
> > -bind BOs referenced by cs1
> > -GPU dma
> > -signal the CPU to unbind the DMAed bos (GPU keeps working)
> > -cs1
> > -bind BOs referenced by cs2 (GPU is still working on cs1
> >  but we have enough room)
> > -signal cs1 is done to the CPU so all referenced BOs can
> >  be unbound
> > -GPU dma
> > -cs2
> > ...
> >
> > The GPU is always busy and the CPU is woken up through an irq to
> > bind/unbind, which minimizes the chance of the GPU having to wait on
> > the CPU binding/unbinding activities.
> >
> > Also on some newer GPUs and PCIE, if there is no iommu (or it can be
> > ignored), the GPU can do the binding/unbinding on its own and so
> > never waits on the CPU. I need to check what the iommu design
> > requires.
> >
> > So I would like to isolate the ttm functions which provide core
> > utilities (mapping, swapping, allocating pages, setting caches, ...)
> > from the logic which drives BO movement/validation.
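A minimal sketch of the kind of split being described here; every name below is hypothetical and purely illustrative, not the existing ttm API.

```c
#include <stdbool.h>

struct page;				/* kernel page descriptor, forward decl for the sketch */

/* A small ttm_tt-like object the low level helpers operate on. */
struct ttm_backend_pages {
	unsigned long	num_pages;
	struct page	**pages;	/* system pages backing the object */
	bool		bound;		/* currently bound to GART/TT?     */
	int		caching;	/* WB/WC/UC, via PAT or MTRR       */
};

/* Core utilities ttm would keep providing (page/mapping side). */
struct ttm_lowlevel_ops {
	int  (*populate)(struct ttm_backend_pages *tt);	/* allocate pages */
	void (*clear)(struct ttm_backend_pages *tt);		/* free pages     */
	int  (*set_caching)(struct ttm_backend_pages *tt, int caching);
	int  (*kmap)(struct ttm_backend_pages *tt, unsigned long page, void **virt);
	int  (*mmap_to_user)(struct ttm_backend_pages *tt /* , vma ... */);
	void (*invalidate_mapping)(struct ttm_backend_pages *tt);
};

/* Placement/movement policy the driver would now own instead of ttm. */
struct driver_placement_ops {
	int (*validate)(void *bo, unsigned placement_hint);
	int (*evict)(void *bo);
	int (*move)(void *bo, unsigned old_mem, unsigned new_mem);
};
```

With a split like this, a driver could derive its own buffer object from the small page-backed structure and plug in whatever migration scheme it wants, while still reusing the mapping/caching helpers.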
> > Do you think it's a good plan? Do you have any requirements or
> > wishes on the way I could do that?
> >
> > Note that I don't intend to implement the scheme I just described
> > in the first shot, but rather to stick with the bounded BO movement/
> > placement/validation scheme as a first step. Then I can slowly try
> > implementing this scheme, or any better scheme one might come up
> > with :)
> >
> > Cheers,
> > Jerome Glisse
>
> Hi, Jerome!
>
> So, if I understand you correctly, you want to make a smaller bo class
> on which the low level functions operate (basically just something
> similar to a struct ttm_tt), so that one can derive from that and
> implement a more elaborate migration scheme?
>
> I think that is a good idea, and in that case one could perhaps keep
> the current implementation as a special case.
>
> What I'm a little worried about, though, is that a lot of new ideas
> will come up and postpone an upstream move again, but at this point I
> don't see such a split doing that, since it only touches kernel
> internals.
>
> But before you start looking at this, remember that the current
> implementation uses unnecessary synchronization when moving. There is,
> for example, no need to synchronize if the GPU is used to DMA from
> system to VRAM. That's basically because I've been lazy; it's a bit
> hard to debug when buffer data is in flight, and even worse if a
> pipelined buffer move fails :(.
>
> Anyway, it's entirely possible to remove the synchronization from the
> current buffer move logic and let the driver handle it. The only
> synchronization points would be bo_move_memcpy and bo_move_tt. The
> driver could then, in its own move functions, do what's necessary to
> sync.
>
> /Thomas
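A minimal sketch of the driver-owned, fence-based move Thomas describes above: the driver schedules the blit and defers the wait until the CPU actually needs the data. All names are hypothetical; this is not the current ttm code.

```c
#include <stdbool.h>

struct fence;				/* driver's GPU fence object */

struct dummy_bo {
	unsigned	mem_type;	/* e.g. SYSTEM, TT, VRAM     */
	struct fence	*last_move;	/* fence for an in-flight move */
};

/* Assumed driver primitives (hypothetical). */
struct fence *drv_emit_copy(struct dummy_bo *bo, unsigned dst_mem);
int drv_fence_wait(struct fence *f);

static int drv_bo_move(struct dummy_bo *bo, unsigned new_mem)
{
	/*
	 * Schedule the copy on the GPU ring and remember the fence.
	 * No CPU wait here, so the pipeline stays busy; the only forced
	 * sync points left are the memcpy/tt fallbacks handled elsewhere.
	 */
	bo->last_move = drv_emit_copy(bo, new_mem);
	if (!bo->last_move)
		return -1;

	bo->mem_type = new_mem;
	return 0;
}

/* Whoever needs CPU access (kmap, mmap fault, ...) waits lazily. */
static int drv_bo_wait_idle(struct dummy_bo *bo)
{
	return bo->last_move ? drv_fence_wait(bo->last_move) : 0;
}
```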
I don't want to postpone upstream acceptance of this code either, which
is why I asked first. I will try to come up with a patch; the ttm_tt
struct is exactly what I had in mind. I haven't looked deeply yet, but
I think I mostly need to isolate a few things so that most of the
functions dealing with ttm_tt don't require too much from the other
parts of ttm. Also, I think there are a few things kernel people might
pester us about (like using int where the data is a boolean), so I will
also try to go through the code with a kernel reviewer's mindset ;)

I will push to my own repo once I have something which at least
compiles, so you can review & comment.

Cheers,
Jerome
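For illustration only, a rough sketch of the two-list GART bind/unbind idea from the first mail, where unbinds are retired as command-stream fences signal and pending binds are applied so the next CS finds its BOs in place. Everything here is hypothetical, under the assumption of a simple fence object and per-entry bind/unbind helpers.

```c
#include <stdbool.h>

struct fence;

struct gart_entry {
	struct gart_entry	*next;
	void			*bo;
	struct fence		*release_fence;	/* unbind once this signals */
};

struct gart_manager {
	struct gart_entry	*bind_list;	/* waiting to be bound        */
	struct gart_entry	*unbind_list;	/* waiting on their CS fence  */
};

/* Assumed primitives (hypothetical). */
bool fence_signaled(struct fence *f);
void gart_bind_one(struct gart_entry *e);
void gart_unbind_one(struct gart_entry *e);

/* Called from the fence irq: retire finished entries, then bind new ones,
 * so the CPU bind/unbind work overlaps GPU execution of earlier CS. */
static void gart_manager_kick(struct gart_manager *gm)
{
	struct gart_entry **p = &gm->unbind_list;

	/* Unbind everything whose CS has completed (frees GART room). */
	while (*p) {
		struct gart_entry *e = *p;

		if (fence_signaled(e->release_fence)) {
			*p = e->next;
			gart_unbind_one(e);
		} else {
			p = &e->next;
		}
	}

	/* Bind pending entries (a real version would check GART room). */
	while (gm->bind_list) {
		struct gart_entry *e = gm->bind_list;

		gm->bind_list = e->next;
		gart_bind_one(e);
	}
}
```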