[PATCH 14/20] drm/radeon: multiple ring allocator v2
On 07.05.2012 20:52, Jerome Glisse wrote:
> On Mon, May 7, 2012 at 1:59 PM, Jerome Glisse wrote:
>>> On 07.05.2012 17:23, Jerome Glisse wrote:
>>>> On Mon, May 7, 2012 at 7:42 AM, Christian König wrote:
>>>>> A startover with a new idea for a multiple ring allocator.
>>>>> Should perform as well as a normal ring allocator as long
>>>>> as only one ring does something, but falls back to a more
>>>>> complex algorithm if more complex things start to happen.
>>>>>
>>>>> We store the last allocated bo in last, we always try to allocate
>>>>> after the last allocated bo. Principle is that in a linear GPU ring
>>>>> progression what is after last is the oldest bo we allocated and thus
>>>>> the first one that should no longer be in use by the GPU.
>>>>>
>>>>> If it's not the case we skip over the bo after last to the closest
>>>>> done bo if such one exists. If none exists and we are not asked to
>>>>> block we report failure to allocate.
>>>>>
>>>>> If we are asked to block we wait on all the oldest fences of all
>>>>> rings. We just wait for any of those fences to complete.
>>>>>
>>>>> v2: We need to be able to let hole point to the list_head, otherwise
>>>>>     try free will never free the first allocation of the list. Also
>>>>>     stop calling radeon_fence_signalled more than necessary.
>>>>>
>>>>> Signed-off-by: Christian König
>>>>> Signed-off-by: Jerome Glisse
>>>>
>>>> This one is NAK, please use my patch. Yes in my patch we never try to
>>>> free anything if there is only one sa_bo in the list; if you really
>>>> care about this it's a one line change:
>>>> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch
>>>
>>> Nope that won't work correctly, "last" is pointing to the last
>>> allocation and that's the most unlikely to be freed at this time. Also
>>> in this version (like in the one before) radeon_sa_bo_next_hole lets
>>> hole point to the "prev" of the found sa_bo without checking if this
>>> isn't the list's head. That might cause a crash if a to-be-freed
>>> allocation is the first one in the buffer.
>>> What radeon_sa_bo_try_free would need to do to get your approach
>>> working is to loop over the end of the buffer and also try to free at
>>> the beginning. But keeping the last allocation results in a whole bunch
>>> of extra cases and "if"s, while just keeping a pointer to the "hole"
>>> (i.e. where the next allocation is most likely to succeed) simplifies
>>> the code quite a bit (but I agree that on the downside it makes it
>>> harder to understand).
>>>
>>>> Your patch here can enter an infinite loop and never return holding
>>>> the lock. See below.
>>>>
>>>> [SNIP]
>>>>> + } while (radeon_sa_bo_next_hole(sa_manager, fences));
>>>>
>>>> Here you can infinite loop: in the case there is a bunch of holes in
>>>> the allocator but none of them allows fulfilling the allocation,
>>>> radeon_sa_bo_next_hole will keep returning true, looping over and over
>>>> on them all. That's why I restrict my patch to skipping 2 holes at
>>>> most and then failing the allocation or trying to wait. I believe
>>>> sadly we need a heuristic, and skipping at most 2 holes sounded like a
>>>> good one.
>>>
>>> Nope, that can't be an infinite loop, cause radeon_sa_bo_next_hole in
>>> conjunction with radeon_sa_bo_try_free is eating up the opportunities
>>> for holes.
>>>
>>> Look again, it probably will never loop more than RADEON_NUM_RINGS + 1
>>> times, with the exception of allocating in a completely scattered
>>> buffer, and even then it will never loop more often than half the
>>> number of current allocations (and that is really really unlikely).
>>>
>>> Cheers,
>>> Christian.
>>
>> I looked again and yes it can loop infinitely: think of a hole you can
>> never free, i.e. radeon_sa_bo_try_free can't free anything. This
>> situation can happen if you have several threads allocating sa bos at
>> the same time while none of them are yet done with their sa_bo (i.e.
>> none have called sa_bo_free yet). I updated a v3 that tracks the oldest
>> and fixes all the things you were pointing out above.
No that isn't a problem: radeon_sa_bo_next_hole takes the first entries of
the flist, so it only considers holes that have a signaled fence and so can
be freed. Having multiple threads allocate objects that can't be freed yet
will just result in empty flists, and so radeon_sa_bo_next_hole will return
false, resulting in calling radeon_fence_wait_any with an empty fence list,
which in turn will result in an ENOENT and aborting the allocation (ok,
maybe we should catch that and return -ENOMEM instead).

So even the corner cases should now be handled fine.

>> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch
>>
>> Cheers,
>> Jerome
>
> Of course by tracking the oldest it defeats the algo, so updated patch:
>
> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch
>
> Just fix the corner case of list of single
[Bug 49603] New: [regression] Fullscreen video no longer smooth with GPU in low power mode
https://bugs.freedesktop.org/show_bug.cgi?id=49603

          Bug #: 49603
         Summary: [regression] Fullscreen video no longer smooth with GPU
                  in low power mode
  Classification: Unclassified
         Product: Mesa
         Version: 8.0
        Platform: Other
      OS/Version: All
          Status: NEW
        Severity: normal
        Priority: medium
       Component: Drivers/Gallium/r600
      AssignedTo: dri-devel at lists.freedesktop.org
      ReportedBy: sa at whiz.se

With Mesa 8.0 I can watch fullscreen videos using Totem (and other players
that use an OpenGL sink) with my GPU set to "low" power mode without
problems. With 8.0.1 (and later) this is no longer possible. Every so often
I get stalls and what looks like dropped frames. It's a
blink-and-you'll-miss-it kind of thing, but over a longer period of time
(such as watching a movie) it's quite noticeable and annoying.

Setting the card to "mid" or higher works around this, but as it's a
passive card I would prefer to keep it running in low as much as possible.

Bisecting for this bug turns up the below commit; I have confirmed it by
reverting this change. (I'm not sure if adding the patch author to the cc
list is considered good practice or not?)

System environment:
-- system architecture: 32-bit
-- Linux distribution: Debian unstable
-- GPU: REDWOOD
-- Model: XFX Radeon HD 5670 1GB
-- Display connector: DVI
-- xf86-video-ati: 6.14.4
-- xserver: 1.12.1
-- mesa: 8.0.2
-- drm: 2.4.33
-- kernel: 3.3.4

106ea10d1b246aba1a0f4e171fd7d21268f3960f is the first bad commit
commit 106ea10d1b246aba1a0f4e171fd7d21268f3960f
Author: Simon Farnsworth
Date:   Tue Feb 14 12:06:20 2012 +

    r600g: Use a fake reloc to sleep for fences

    r300g is able to sleep until a fence completes rather than busywait
    because it creates a special buffer object and relocation that stays
    busy until the CS containing the fence is finished.

    Copy the idea into r600g, and use it to sleep if the user asked for an
    infinite wait, falling back to busywaiting if the user provided a
    timeout.
Signed-off-by: Simon Farnsworth Signed-off-by: Alex Deucher (cherry picked from commit 8cd03b933cf868ff867e2db4a0937005a02fd0e4) Conflicts: src/gallium/drivers/r600/r600_pipe.c :04 04 390170e370f86ee323dce284906ed21693ed9d09 cccea412e6be4f3619422196231e02b375ab4772 Msrc -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
Hello Thomasz, Laurent, I found an issue in the function vb2_dc_pages_to_sgt() below. I saw that during the attach, the size of the SGT and size requested mis-matched (by atleast 8k bytes). Hence I made a small correction to the code as below. I could then attach the importer properly. Regards, Subash On 04/20/2012 08:15 PM, Tomasz Stanislawski wrote: > From: Andrzej Pietrasiewicz > > This patch introduces usage of dma_map_sg to map memory behind > a userspace pointer to a device as dma-contiguous mapping. > > Signed-off-by: Andrzej Pietrasiewicz > Signed-off-by: Marek Szyprowski > [bugfixing] > Signed-off-by: Kamil Debski > [bugfixing] > Signed-off-by: Tomasz Stanislawski > [add sglist subroutines/code refactoring] > Signed-off-by: Kyungmin Park > --- > drivers/media/video/videobuf2-dma-contig.c | 279 > ++-- > 1 files changed, 262 insertions(+), 17 deletions(-) > > diff --git a/drivers/media/video/videobuf2-dma-contig.c > b/drivers/media/video/videobuf2-dma-contig.c > index 476e536..9cbc8d4 100644 > --- a/drivers/media/video/videobuf2-dma-contig.c > +++ b/drivers/media/video/videobuf2-dma-contig.c > @@ -11,6 +11,8 @@ >*/ > > #include > +#include > +#include > #include > #include > > @@ -22,6 +24,8 @@ struct vb2_dc_buf { > void*vaddr; > unsigned long size; > dma_addr_t dma_addr; > + enum dma_data_direction dma_dir; > + struct sg_table *dma_sgt; > > /* MMAP related */ > struct vb2_vmarea_handler handler; > @@ -32,6 +36,95 @@ struct vb2_dc_buf { > }; > > /*/ > +/*scatterlist table functions*/ > +/*/ > + > +static struct sg_table *vb2_dc_pages_to_sgt(struct page **pages, > + unsigned int n_pages, unsigned long offset, unsigned long size) > +{ > + struct sg_table *sgt; > + unsigned int chunks; > + unsigned int i; > + unsigned int cur_page; > + int ret; > + struct scatterlist *s; > + > + sgt = kzalloc(sizeof *sgt, GFP_KERNEL); > + if (!sgt) > + return ERR_PTR(-ENOMEM); > + > + /* compute number of chunks */ > + chunks = 1; > + for (i = 1; i< n_pages; ++i) > + if 
(pages[i] != pages[i - 1] + 1) > + ++chunks; > + > + ret = sg_alloc_table(sgt, chunks, GFP_KERNEL); > + if (ret) { > + kfree(sgt); > + return ERR_PTR(-ENOMEM); > + } > + > + /* merging chunks and putting them into the scatterlist */ > + cur_page = 0; > + for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { > + unsigned long chunk_size; > + unsigned int j; size = PAGE_SIZE; > + > + for (j = cur_page + 1; j< n_pages; ++j) for (j = cur_page + 1; j < n_pages; ++j) { > + if (pages[j] != pages[j - 1] + 1) > + break; size += PAGE } > + > + chunk_size = ((j - cur_page)<< PAGE_SHIFT) - offset; > + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); [DELETE] size -= chunk_size; > + offset = 0; > + cur_page = j; > + } > + > + return sgt; > +} > + > +static void vb2_dc_release_sgtable(struct sg_table *sgt) > +{ > + sg_free_table(sgt); > + kfree(sgt); > +} > + > +static void vb2_dc_sgt_foreach_page(struct sg_table *sgt, > + void (*cb)(struct page *pg)) > +{ > + struct scatterlist *s; > + unsigned int i; > + > + for_each_sg(sgt->sgl, s, sgt->nents, i) { > + struct page *page = sg_page(s); > + unsigned int n_pages = PAGE_ALIGN(s->offset + s->length) > + >> PAGE_SHIFT; > + unsigned int j; > + > + for (j = 0; j< n_pages; ++j, ++page) > + cb(page); > + } > +} > + > +static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt) > +{ > + struct scatterlist *s; > + dma_addr_t expected = sg_dma_address(sgt->sgl); > + unsigned int i; > + unsigned long size = 0; > + > + for_each_sg(sgt->sgl, s, sgt->nents, i) { > + if (sg_dma_address(s) != expected) > + break; > + expected = sg_dma_address(s) + sg_dma_len(s); > + size += sg_dma_len(s); > + } > + return size; > +} > + > +/*/ > /* callbacks for all buffers */ > /*/ > > @@ -116,42 +209,194 @@ static int vb2_dc_mmap(void *buf_priv, struct > vm_area_struct *vma) > /* callbacks for USERPTR buffers */ > /*/ > > +static inline int vma_is_io(struct vm_area_struct *vma) > +{ > + return
[Bug 42490] NUTMEG DP to VGA bridge not working
https://bugs.freedesktop.org/show_bug.cgi?id=42490

--- Comment #28 from Jerome Glisse 2012-05-07 12:57:34 PDT ---
Do people here have better luck with the patch mentioned previously:

drm/radeon/kms: need to set up ss on DP bridges as well
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567

--- Comment #5 from Mike Mestnik 2012-05-07 11:59:39 PDT ---
This patch worked for me and got me to the next undefined reference.
[PATCH 14/20] drm/radeon: multiple ring allocator v2
On 07.05.2012 17:23, Jerome Glisse wrote:
> On Mon, May 7, 2012 at 7:42 AM, Christian König wrote:
>> A startover with a new idea for a multiple ring allocator.
>> Should perform as well as a normal ring allocator as long
>> as only one ring does something, but falls back to a more
>> complex algorithm if more complex things start to happen.
>>
>> We store the last allocated bo in last, we always try to allocate
>> after the last allocated bo. Principle is that in a linear GPU ring
>> progression what is after last is the oldest bo we allocated and thus
>> the first one that should no longer be in use by the GPU.
>>
>> If it's not the case we skip over the bo after last to the closest
>> done bo if such one exists. If none exists and we are not asked to
>> block we report failure to allocate.
>>
>> If we are asked to block we wait on all the oldest fences of all
>> rings. We just wait for any of those fences to complete.
>>
>> v2: We need to be able to let hole point to the list_head, otherwise
>>     try free will never free the first allocation of the list. Also
>>     stop calling radeon_fence_signalled more than necessary.
>>
>> Signed-off-by: Christian König
>> Signed-off-by: Jerome Glisse
>
> This one is NAK, please use my patch. Yes in my patch we never try to
> free anything if there is only one sa_bo in the list; if you really care
> about this it's a one line change:
> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch

Nope, that won't work correctly: "last" is pointing to the last allocation
and that's the most unlikely to be freed at this time. Also in this version
(like in the one before) radeon_sa_bo_next_hole lets hole point to the
"prev" of the found sa_bo without checking if this isn't the list's head.
That might cause a crash if a to-be-freed allocation is the first one in
the buffer.
What radeon_sa_bo_try_free would need to do to get your approach working is
to loop over the end of the buffer and also try to free at the beginning.
But keeping the last allocation results in a whole bunch of extra cases and
"if"s, while just keeping a pointer to the "hole" (i.e. where the next
allocation is most likely to succeed) simplifies the code quite a bit (but
I agree that on the downside it makes it harder to understand).

> Your patch here can enter an infinite loop and never return holding
> the lock. See below.
>
> [SNIP]
>> + } while (radeon_sa_bo_next_hole(sa_manager, fences));
>
> Here you can infinite loop: in the case there is a bunch of holes in
> the allocator but none of them allows fulfilling the allocation,
> radeon_sa_bo_next_hole will keep returning true, looping over and over
> on them all. That's why I restrict my patch to skipping 2 holes at most
> and then failing the allocation or trying to wait. I believe sadly we
> need a heuristic, and skipping at most 2 holes sounded like a good one.

Nope, that can't be an infinite loop, cause radeon_sa_bo_next_hole in
conjunction with radeon_sa_bo_try_free is eating up the opportunities for
holes.

Look again, it probably will never loop more than RADEON_NUM_RINGS + 1
times, with the exception of allocating in a completely scattered buffer,
and even then it will never loop more often than half the number of current
allocations (and that is really really unlikely).

Cheers,
Christian.
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484

Michal Suchanek changed:

           What      |Removed   |Added
   ------------------|----------|----------
           Status    |NEW       |RESOLVED
       Resolution    |          |INVALID

--- Comment #6 from Michal Suchanek 2012-05-07 11:03:54 PDT ---
Indeed, it works with the texture compression library installed. I guess
this is something that Wine should report. Unfortunately, the available
messages are very unhelpful.

Sorry about the noise.
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567

--- Comment #4 from Tom Stellard 2012-05-07 10:43:27 PDT ---
Created attachment 61159
  --> https://bugs.freedesktop.org/attachment.cgi?id=61159
Possible fix

Does it build with this patch?
Enhancing EDID quirk functionality
I'd fixed nouveau for this before. I'll send the fix along, thanks for
catching it.

- ajax
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484

--- Comment #5 from Henri Verbeet 2012-05-07 10:37:29 PDT ---
That generally happens when an application tries to use a (D3D) format
(e.g. DXT/s3tc) even though it's not available. A
WINEDEBUG=+d3d,+d3d_surface log should show which format, although
typically it's either s3tc or one of the floating point formats.
[PATCH 14/20] drm/radeon: multiple ring allocator v2
On Mon, May 7, 2012 at 4:38 PM, Christian König wrote:
> On 07.05.2012 20:52, Jerome Glisse wrote:
>> On Mon, May 7, 2012 at 1:59 PM, Jerome Glisse wrote:
>>>> On 07.05.2012 17:23, Jerome Glisse wrote:
>>>>> On Mon, May 7, 2012 at 7:42 AM, Christian König wrote:
>>>>>> A startover with a new idea for a multiple ring allocator.
>>>>>> Should perform as well as a normal ring allocator as long
>>>>>> as only one ring does something, but falls back to a more
>>>>>> complex algorithm if more complex things start to happen.
>>>>>>
>>>>>> We store the last allocated bo in last, we always try to allocate
>>>>>> after the last allocated bo. Principle is that in a linear GPU ring
>>>>>> progression what is after last is the oldest bo we allocated and
>>>>>> thus the first one that should no longer be in use by the GPU.
>>>>>>
>>>>>> If it's not the case we skip over the bo after last to the closest
>>>>>> done bo if such one exists. If none exists and we are not asked to
>>>>>> block we report failure to allocate.
>>>>>>
>>>>>> If we are asked to block we wait on all the oldest fences of all
>>>>>> rings. We just wait for any of those fences to complete.
>>>>>>
>>>>>> v2: We need to be able to let hole point to the list_head, otherwise
>>>>>>     try free will never free the first allocation of the list. Also
>>>>>>     stop calling radeon_fence_signalled more than necessary.
>>>>>>
>>>>>> Signed-off-by: Christian König
>>>>>> Signed-off-by: Jerome Glisse
>>>>>
>>>>> This one is NAK, please use my patch. Yes in my patch we never try to
>>>>> free anything if there is only one sa_bo in the list; if you really
>>>>> care about this it's a one line change:
>>>>>
>>>>> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch
>>>>
>>>> Nope that won't work correctly, "last" is pointing to the last
>>>> allocation and that's the most unlikely to be freed at this time. Also
>>>> in this version (like in the one before) radeon_sa_bo_next_hole lets
>>>> hole point to the "prev" of the found sa_bo without checking if this
>>>> isn't the list's head. That might cause a crash if a to-be-freed
>>>> allocation is the first one in the buffer.
>>>>
>>>> What radeon_sa_bo_try_free would need to do to get your approach
>>>> working is to loop over the end of the buffer and also try to free at
>>>> the beginning. But keeping the last allocation results in a whole
>>>> bunch of extra cases and "if"s, while just keeping a pointer to the
>>>> "hole" (i.e. where the next allocation is most likely to succeed)
>>>> simplifies the code quite a bit (but I agree that on the downside it
>>>> makes it harder to understand).
>>>>
>>>>> Your patch here can enter an infinite loop and never return holding
>>>>> the lock. See below.
>>>>>
>>>>> [SNIP]
>>>>>> + } while (radeon_sa_bo_next_hole(sa_manager, fences));
>>>>>
>>>>> Here you can infinite loop: in the case there is a bunch of holes in
>>>>> the allocator but none of them allows fulfilling the allocation,
>>>>> radeon_sa_bo_next_hole will keep returning true, looping over and
>>>>> over on them all. That's why I restrict my patch to skipping 2 holes
>>>>> at most and then failing the allocation or trying to wait. I believe
>>>>> sadly we need a heuristic, and skipping at most 2 holes sounded like
>>>>> a good one.
>>>>
>>>> Nope, that can't be an infinite loop, cause radeon_sa_bo_next_hole in
>>>> conjunction with radeon_sa_bo_try_free is eating up the opportunities
>>>> for holes.
>>>>
>>>> Look again, it probably will never loop more than RADEON_NUM_RINGS + 1
>>>> times, with the exception of allocating in a completely scattered
>>>> buffer, and even then it will never loop more often than half the
>>>> number of current allocations (and that is really really unlikely).
>>>>
>>>> Cheers,
>>>> Christian.
>>>
>>> I looked again and yes it can loop infinitely: think of a hole you can
>>> never free, i.e. radeon_sa_bo_try_free can't free anything. This
>>> situation can happen if you have several threads allocating sa bos at
>>> the same time while none of them are yet done with their sa_bo (i.e.
>>> none have called sa_bo_free yet). I updated a v3 that tracks the oldest
>>> and fixes all the things you were pointing out above.
>
> No that isn't a problem: radeon_sa_bo_next_hole takes the first entries
> of the flist, so it only considers holes that have a signaled fence and
> so can be freed.
>
> Having multiple threads allocate objects that can't be freed yet will
> just result in empty flists, and so radeon_sa_bo_next_hole will return
> false, resulting in calling radeon_fence_wait_any with an empty fence
> list, which in turn will result in an ENOENT and aborting the allocation
> (ok, maybe we should catch that and return -ENOMEM instead).
>
> So even the corner cases should now be handled fine.

No, there is still an infinite loop possible with a GPU lockup; I am
against the while (next_hole)
[PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
Hi Subash,

Could you provide a detailed description of a test case that causes a
failure of vb2_dc_pages_to_sgt?

Regards,
Tomasz Stanislawski
[PATCH 04/20] drm/radeon: convert fence to uint64_t v4
On 07.05.2012 16:39, Jerome Glisse wrote:
> On Mon, May 7, 2012 at 7:42 AM, Christian König wrote:
>> From: Jerome Glisse
>>
>> This converts fence to use a uint64_t sequence number; the intention is
>> to use the fact that uint64_t is big enough that we don't need to
>> care about wrap around.
>>
>> Tested with and without writeback using 0xF000 as initial
>> fence sequence and thus allowing to test the wrap around from
>> 32bits to 64bits.
>>
>> v2: Add comment about possible race btw CPU & GPU, add comment
>>     stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET
>>     Read fence sequence in reverse order of GPU write them so we
>>     mitigate the race btw CPU and GPU.
>>
>> v3: Drop the need for ring to emit the 64bits fence, and just have
>>     each ring emit the lower 32bits of the fence sequence. We
>>     handle the wrap over 32bits in fence_process.
>>
>> v4: Just a small optimization: Don't reread the last_seq value
>>     if loop restarts, since we already know its value anyway.
>>     Also start at zero not one for seq value and use pre instead
>>     of post increment in emit, otherwise wait_empty will deadlock.
>
> Why changing that? v3 was already good, no deadlock. I started at 1
> especially for that: a signaled fence is set to 0 so it always compares
> as signaled. Just using preincrement is exactly like starting at one.
> I don't see the need for this change but if it makes you happy.

Not exactly: the last emitted sequence is also used in
radeon_fence_wait_empty. So when you use post increment
radeon_fence_wait_empty will actually not wait for the last emitted fence
to be signaled, but for last emitted + 1, so it practically waits forever.

Without this change suspend (for example) will just lock up.

Cheers,
Christian.
> > Cheers, > Jerome >> Signed-off-by: Jerome Glisse >> Signed-off-by: Christian K?nig >> --- >> drivers/gpu/drm/radeon/radeon.h | 39 ++- >> drivers/gpu/drm/radeon/radeon_fence.c | 116 >> +++-- >> drivers/gpu/drm/radeon/radeon_ring.c |9 ++- >> 3 files changed, 107 insertions(+), 57 deletions(-) >> >> diff --git a/drivers/gpu/drm/radeon/radeon.h >> b/drivers/gpu/drm/radeon/radeon.h >> index e99ea81..cdf46bc 100644 >> --- a/drivers/gpu/drm/radeon/radeon.h >> +++ b/drivers/gpu/drm/radeon/radeon.h >> @@ -100,28 +100,32 @@ extern int radeon_lockup_timeout; >> * Copy from radeon_drv.h so we don't have to include both and have >> conflicting >> * symbol; >> */ >> -#define RADEON_MAX_USEC_TIMEOUT10 /* 100 ms */ >> -#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) >> +#define RADEON_MAX_USEC_TIMEOUT10 /* 100 ms */ >> +#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) >> /* RADEON_IB_POOL_SIZE must be a power of 2 */ >> -#define RADEON_IB_POOL_SIZE16 >> -#define RADEON_DEBUGFS_MAX_COMPONENTS 32 >> -#define RADEONFB_CONN_LIMIT4 >> -#define RADEON_BIOS_NUM_SCRATCH8 >> +#define RADEON_IB_POOL_SIZE16 >> +#define RADEON_DEBUGFS_MAX_COMPONENTS 32 >> +#define RADEONFB_CONN_LIMIT4 >> +#define RADEON_BIOS_NUM_SCRATCH8 >> >> /* max number of rings */ >> -#define RADEON_NUM_RINGS 3 >> +#define RADEON_NUM_RINGS 3 >> + >> +/* fence seq are set to this number when signaled */ >> +#define RADEON_FENCE_SIGNALED_SEQ 0LL >> +#define RADEON_FENCE_NOTEMITED_SEQ (~0LL) >> >> /* internal ring indices */ >> /* r1xx+ has gfx CP ring */ >> -#define RADEON_RING_TYPE_GFX_INDEX 0 >> +#define RADEON_RING_TYPE_GFX_INDEX 0 >> >> /* cayman has 2 compute CP rings */ >> -#define CAYMAN_RING_TYPE_CP1_INDEX 1 >> -#define CAYMAN_RING_TYPE_CP2_INDEX 2 >> +#define CAYMAN_RING_TYPE_CP1_INDEX 1 >> +#define CAYMAN_RING_TYPE_CP2_INDEX 2 >> >> /* hardcode those limit for now */ >> -#define RADEON_VA_RESERVED_SIZE(8<< 20) >> -#define RADEON_IB_VM_MAX_SIZE (64<< 10) >> +#define RADEON_VA_RESERVED_SIZE(8<< 20) >> +#define 
RADEON_IB_VM_MAX_SIZE (64<< 10) >> >> /* >> * Errata workarounds. >> @@ -254,8 +258,9 @@ struct radeon_fence_driver { >> uint32_tscratch_reg; >> uint64_tgpu_addr; >> volatile uint32_t *cpu_addr; >> - atomic_tseq; >> - uint32_tlast_seq; >> + /* seq is protected by ring emission lock */ >> + uint64_tseq; >> + atomic64_t last_seq; >> unsigned long last_activity; >> wait_queue_head_t queue; >> struct list_heademitted; >> @@ -268,11 +273,9 @@ struct radeon_fence { >> struct kref kref; >>
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484 --- Comment #4 from Michal Suchanek 2012-05-07 10:02:05 PDT --- invalid value: Breakpoint 1, _mesa_error (ctx=0xccba90, error=1281, fmtString=0x74706278 "glTexImage%dD(internalFormat=%s)") at main/errors.c:996 996main/errors.c: No such file or directory. (gdb) bt full #0 _mesa_error (ctx=0xccba90, error=1281, fmtString=0x74706278 "glTexImage%dD(internalFormat=%s)") at main/errors.c:996 do_output = 225 '\341' do_log = #1 0x745e2a84 in texture_error_check (border=0, depth=1, height=64, width=64, type=0, format=0, internalFormat=0, level=0, target=3553, dimensions=2, ctx=0xccba90) at main/teximage.c:1621 proxyTarget = err = indexFormat = 0 '\000' isProxy = sizeOK = 1 '\001' colorFormat = #2 teximage (ctx=0xccba90, dims=2, target=3553, level=0, internalFormat=0, width=64, height=64, depth=1, border=0, format=0, type=0, pixels=0x0) at main/teximage.c:2501 error = 1 '\001' unpack_no_border = {Alignment = -7152, RowLength = 32767, SkipPixels = 9180912, SkipRows = 0, ImageHeight = -8144, SkipImages = 32767, SwapBytes = 45 '-', LsbFirst = 17 '\021', Invert = 90 'Z', BufferObj = 0x7fffe410} unpack = 0xcd22e0 #3 0x745e2fc4 in _mesa_TexImage2D (target=, level=, internalFormat=, width=, height=, border=, format=0, type=0, pixels=0x0) at main/teximage.c:2639 No locals. #4 0x00480145 in ?? () No symbol table info available. #5 0x004bbfd6 in ?? () No symbol table info available. #6 0x00440687 in ?? () No symbol table info available. #7 0x0043b985 in ?? () No symbol table info available. #8 0x0043c092 in ?? () No symbol table info available. 
#9 0x7696eead in __libc_start_main (main=, argc=, ubp_av=, init=, fini=, rtld_fini=, stack_end=0x7fffe408) at libc-start.c:228 result = unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 738182590451561014, 4428032, 140737488348176, 0, 0, -738182590032367050, -738203265834012106}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x5ddb20, 0x7fffe418}, data = {prev = 0x0, cleanup = 0x0, canceltype = 6150944}}} not_first_call = #10 0x00439129 in _start () No symbol table info available. (gdb) c Continuing. 37923 glTexImage2D(target = GL_TEXTURE_2D, level = 0, internalformat = GL_ZERO, width = 64, height = 64, border = 0, format = GL_ZERO, type = GL_ZERO, pixels = NULL) 37923: warning: glGetError(glTexImage2D) = GL_INVALID_VALUE -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567

--- Comment #3 from Mike Mestnik 2012-05-07 09:58:45 PDT ---
Tom,
  The short of it: I'm already doing that.

The long: I took a look at that script and it eventually just calls
"autoreconf -v --install"; my log clearly shows "autoreconf -vfi" being
called. Also note that that script will call configure.

Thanks!
[PATCH] mm: Work around Intel SNB GTT bug with some physical pages.
Stéphane Marchesin writes:
> While investigating some Sandy Bridge rendering corruption, I found out
> that all physical memory pages below 1MiB were returning garbage when
> read through the GTT. This has been causing graphics corruption (when
> it's used for textures, render targets and pixmaps) and GPU hangups
> (when it's used for GPU batch buffers).

It would be possible to exclude GFP_DMA from the page allocator. That
covers the first 16MB. You just need a custom zone list with ZONE_DMA.

-Andi

-- 
ak at linux.intel.com -- Speaking for myself only
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484

--- Comment #3 from Michal Suchanek 2012-05-07 09:55:15 PDT ---
I get no Mesa warnings, only warnings from Wine about Mesa returning
GL_INVALID*.
[PATCH] mm: Work around Intel SNB GTT bug with some physical pages.
While investing some Sandy Bridge rendering corruption, I found out that all physical memory pages below 1MiB were returning garbage when read through the GTT. This has been causing graphics corruption (when it's used for textures, render targets and pixmaps) and GPU hangups (when it's used for GPU batch buffers). I talked with some people at Intel and they confirmed my findings, and said that a couple of other random pages were also affected. We could fix this problem by adding an e820 region preventing the memory below 1 MiB to be used, but that prevents at least my machine from booting. One could think that we should be able to fix it in i915, but since the allocation is done by the backing shmem this is not possible. In the end, I came up with the ugly workaround of just leaking the offending pages in shmem.c. I do realize it's truly ugly, but I'm looking for a fix to the existing code, and am wondering if people on this list have a better idea, short of rewriting i915_gem.c to allocate its own pages directly. Signed-off-by: St?phane Marchesin Change-Id: I957e125fb280e0b0d6b05a83cc4068df2f05aa0a --- mm/shmem.c | 39 +-- 1 files changed, 37 insertions(+), 2 deletions(-) diff --git a/mm/shmem.c b/mm/shmem.c index 6c253f7..dcbb58b 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -768,6 +768,31 @@ redirty: return 0; } +/* + * Some intel GPUs can't use those pages in the GTT, which results in + * graphics corruption. Sadly, it's impossible to prevent usage of those + * pages in the intel allocator. + * + * Instead, we test for those areas here and leak the corresponding pages. + * + * Some day, when the intel GPU memory is not backed by shmem any more, + * we'll be able to come up with a solution which is contained in i915. 
+ */ +static bool i915_usable_page(struct page *page) +{ + dma_addr_t addr = page_to_phys(page); + + if (unlikely((addr < 1 * 1024 * 1024) || + (addr == 0x2005) || + (addr == 0x2011) || + (addr == 0x2013) || + (addr == 0x20138000) || + (addr == 0x40004000))) + return false; + + return true; +} + #ifdef CONFIG_NUMA #ifdef CONFIG_TMPFS static void shmem_show_mpol(struct seq_file *seq, struct mempolicy *mpol) @@ -816,6 +841,7 @@ static struct page *shmem_alloc_page(gfp_t gfp, struct shmem_inode_info *info, pgoff_t index) { struct vm_area_struct pvma; + struct page *page; /* Create a pseudo vma that just contains the policy */ pvma.vm_start = 0; @@ -826,7 +852,11 @@ static struct page *shmem_alloc_page(gfp_t gfp, /* * alloc_page_vma() will drop the shared policy reference */ - return alloc_page_vma(gfp, &pvma, 0); + do { + page = alloc_page_vma(gfp, &pvma, 0); + } while (!i915_usable_page(page)); + + return page; } #else /* !CONFIG_NUMA */ #ifdef CONFIG_TMPFS @@ -844,7 +874,12 @@ static inline struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp, static inline struct page *shmem_alloc_page(gfp_t gfp, struct shmem_inode_info *info, pgoff_t index) { - return alloc_page(gfp); + struct page *page; + do { + page = alloc_page(gfp); + } while (!i915_usable_page(page)); + + return page; } #endif /* CONFIG_NUMA */ -- 1.7.5.3.367.ga9930
[RFC 05/13] v4l: vb2-dma-contig: add support for DMABUF exporting
Hi Tomasz, Sorry for the late reply, this one slipped through the cracks. On Thursday 19 April 2012 12:42:12 Tomasz Stanislawski wrote: > On 04/17/2012 04:08 PM, Laurent Pinchart wrote: > > On Tuesday 10 April 2012 15:10:39 Tomasz Stanislawski wrote: > >> This patch adds support for exporting a dma-contig buffer using > >> DMABUF interface. > >> > >> Signed-off-by: Tomasz Stanislawski > >> Signed-off-by: Kyungmin Park > >> --- > > [snip] > > >> +static struct sg_table *vb2_dc_dmabuf_ops_map( > >> + struct dma_buf_attachment *db_attach, enum dma_data_direction dir) > >> +{ > >> + struct dma_buf *dbuf = db_attach->dmabuf; > >> + struct vb2_dc_buf *buf = dbuf->priv; > >> + struct vb2_dc_attachment *attach = db_attach->priv; > >> + struct sg_table *sgt; > >> + struct scatterlist *rd, *wr; > >> + int i, ret; > > > > You can make i an unsigned int :-) > > Right... splitting the declaration may also be a good idea :) > > >> + > >> + /* return previously mapped sg table */ > >> + if (attach) > >> + return &attach->sgt; > > > > This effectively keeps the mapping around as long as the attachment > > exists. We don't try to swap out buffers in V4L2 as is done in DRM at the > > moment, so it might not be too much of an issue, but the behaviour of the > > implementation will change if we later decide to map/unmap the buffers in > > the map/unmap handlers. Do you think that could be a problem ? > > I don't think that it is a problem. If an importer calls dma_map_sg then caching > the sgt on the exporter side reduces the cost of allocating and > initializing the sgt. > > >> + > >> + attach = kzalloc(sizeof *attach, GFP_KERNEL); > >> + if (!attach) > >> + return ERR_PTR(-ENOMEM); > > > > Why don't you allocate the vb2_dc_attachment here instead of > > vb2_dc_dmabuf_ops_attach() ? > > Good point. > The attachment could be allocated at vb2_dc_attachment but all its > fields would be uninitialized. I mean an empty sgt and an undefined > dma direction.
I decided to allocate the attachment in vb2_dc_dmabuf_ops_map > because only then is all the information needed to create a valid attachment > object available. > > The other solution might be the allocation at vb2_dc_attachment. The field > dir would be set to DMA_NONE. If this field is equal to DMA_NONE at > vb2_dc_dmabuf_ops_map then the sgt is allocated and mapped and the direction field > is updated. If the value is not DMA_NONE then the sgt is reused. > > Do you think that it is a good idea? I think I would prefer that. It sounds more logical to allocate the attachment in the attach operation handler. > >> + sgt = &attach->sgt; > >> + attach->dir = dir; > >> + > >> + /* copying the buf->base_sgt to attachment */ > > > > I would add an explanation regarding why you need to copy the SG list. > > Something like. > > > > "Copy the buf->base_sgt scatter list to the attachment, as we can't map > > the same scatter list to multiple devices at the same time." > > ok > > >> + ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL); > >> + if (ret) { > >> + kfree(attach); > >> + return ERR_PTR(-ENOMEM); > >> + } > >> + > >> + rd = buf->sgt_base->sgl; > >> + wr = sgt->sgl; > >> + for (i = 0; i < sgt->orig_nents; ++i) { > >> + sg_set_page(wr, sg_page(rd), rd->length, rd->offset); > >> + rd = sg_next(rd); > >> + wr = sg_next(wr); > >> + } > >> > >> + /* mapping new sglist to the client */ > >> + ret = dma_map_sg(db_attach->dev, sgt->sgl, sgt->orig_nents, dir); > >> + if (ret <= 0) { > >> + printk(KERN_ERR "failed to map scatterlist\n"); > >> + sg_free_table(sgt); > >> + kfree(attach); > >> + return ERR_PTR(-EIO); > >> + } > >> + > >> + db_attach->priv = attach; > >> + > >> + return sgt; > >> +} > >> + > >> +static void vb2_dc_dmabuf_ops_unmap(struct dma_buf_attachment > >> *db_attach, > >> + struct sg_table *sgt, enum dma_data_direction dir) > >> +{ > >> + /* nothing to be done here */ > >> +} > >> + > >> +static void vb2_dc_dmabuf_ops_release(struct dma_buf *dbuf) > >> +{ > >> + /*
drop reference obtained in vb2_dc_get_dmabuf */ > >> + vb2_dc_put(dbuf->priv); > > > > Shouldn't you set vb2_dc_buf::dma_buf to NULL here ? Otherwise the next > > vb2_dc_get_dmabuf() call will return a DMABUF object that has been freed. > > No. > > The buffer object is destroyed at vb2_dc_put when the reference count drops to > 0. It could happen only after REQBUF(count=0) or on the last close(). > The DMABUF object is created only for MMAP buffers. The DMABUF object is > based only on results of dma_alloc_coherent and dma_get_pages (or its future > equivalent). Therefore the DMABUF object is valid as long as the buffer is > valid. OK. > Notice that the dmabuf object could be created in vb2_dc_alloc. I moved it to > vb2_dc_get_dmabuf to avoid the creation of an object that may not be used. > > >> +} > >> + > >> +static struct dma_buf_ops vb2_dc_dmabuf_ops = { > >> + .attach =
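For readers skimming the thread, the lazy-map scheme Laurent prefers (allocate the attachment in attach with dir == DMA_NONE, build the mapping on first map, reuse it afterwards) can be modelled in a few lines of plain C. All names here are made up for illustration — this is not the real dma-buf/V4L2 API:

```c
#include <assert.h>

enum toy_dir { TOY_DMA_NONE, TOY_DMA_TO_DEVICE, TOY_DMA_FROM_DEVICE };

/* Toy attachment: dir == TOY_DMA_NONE means "nothing mapped yet". */
struct toy_attachment {
	enum toy_dir dir;
	int map_calls;   /* how many times the expensive mapping ran */
};

/* attach op: just create the attachment, defer the mapping */
static void toy_attach(struct toy_attachment *a)
{
	a->dir = TOY_DMA_NONE;
	a->map_calls = 0;
}

/* map op: build the mapping lazily, return the cached one afterwards */
static int toy_map(struct toy_attachment *a, enum toy_dir dir)
{
	if (a->dir != TOY_DMA_NONE)
		return 0;        /* reuse the cached mapping */
	a->dir = dir;
	a->map_calls++;      /* stands in for sg_alloc_table + dma_map_sg */
	return 0;
}
```

Repeated map calls then hit the cached mapping, which is exactly the cost-saving Tomasz describes.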
[PATCH 14/20] drm/radeon: multiple ring allocator v2
On Mon, May 7, 2012 at 1:59 PM, Jerome Glisse wrote: >> On 07.05.2012 17:23, Jerome Glisse wrote: >>> >>> On Mon, May 7, 2012 at 7:42 AM, Christian König >>> wrote: A startover with a new idea for a multiple ring allocator. Should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen. We store the last allocated bo in last, we always try to allocate after the last allocated bo. The principle is that in a linear GPU ring progression what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU. If it's not the case we skip over the bo after last to the closest done bo if such a one exists. If none exists and we are not asked to block we report failure to allocate. If we are asked to block we wait on the oldest fences of all rings. We just wait for any of those fences to complete. v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary. Signed-off-by: Christian König Signed-off-by: Jerome Glisse >>> >>> This one is NAK please use my patch. Yes in my patch we never try to >>> free anything if there is only one sa_bo in the list if you really care >>> about this it's a one line change: >>> >>> http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch >> >> Nope that won't work correctly, "last" is pointing to the last allocation >> and that's the most unlikely to be freed at this time. Also in this version >> (like in the one before) radeon_sa_bo_next_hole lets hole point to the >> "prev" of the found sa_bo without checking if this isn't the list's head. >> That might cause a crash if a to-be-freed allocation is the first one in >> the buffer.
>> >> What radeon_sa_bo_try_free would need to do to get your approach working is >> to loop over the end of the buffer and also try to free at the beginning, >> but saying that: keeping the last allocation results in a whole bunch of >> extra cases and "if"s, while just keeping a pointer to the "hole" (e.g. >> where the next allocation is most likely to succeed) simplifies the code >> quite a bit (but I agree that on the down side it makes it harder to >> understand). >> >>> Your patch here can enter an infinite loop and never return holding >>> the lock. See below. >>> >>> [SNIP] >>> + } while (radeon_sa_bo_next_hole(sa_manager, fences)); >>> >>> Here you can loop infinitely, in the case there is a bunch of holes in >>> the allocator but none of them allows fulfilling the allocation. >>> radeon_sa_bo_next_hole will keep returning true, looping over and over >>> all of them. That's why I restrict my patch to 2-hole skipping >>> and then fail the allocation or try to wait. I believe sadly we need >>> a heuristic, and 2-hole skipping at most sounded like a good one. >> >> Nope, that can't be an infinite loop, because radeon_sa_bo_next_hole in >> conjunction with radeon_sa_bo_try_free are eating up the opportunities for >> holes. >> >> Look again, it probably will never loop more than RADEON_NUM_RINGS + 1, with >> the exception of allocating in a completely scattered buffer, and even then >> it will never loop more often than half the number of current allocations >> (and that is really really unlikely). >> >> Cheers, >> Christian. > > I looked again and yes it can loop infinitely, think of a hole you can > never free, ie radeon_sa_bo_try_free can't free anything. This > situation can happen if you have several threads allocating sa_bos at > the same time while none of them are yet done with their sa_bo (ie > none have called sa_bo_free yet). I updated a v3 that tracks the oldest and > fixes all the things you were pointing out above.
> > http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch > > Cheers, > Jerome Of course by tracking the oldest it defeats the algo, so updated patch: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch Just fixes the corner case of a list with a single entry. Cheers, Jerome
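For readers following the thread, the contested algorithm is easier to see in a toy model. The following is a self-contained sketch of only the "allocate right after the last allocation, retire the oldest first" idea — it is not the real radeon_sa code and deliberately leaves out fences, multiple rings and the hole-skipping heuristic being debated:

```c
#include <assert.h>
#include <string.h>

#define RING_SIZE 16   /* tiny pool, for illustration only */
#define MAX_LIVE   8

/*
 * Toy sub-allocator: new blocks go right after the last allocation,
 * and blocks retire strictly in FIFO order, mimicking fences
 * completing in order on a single ring.
 */
struct toy_sa {
	unsigned head;               /* end of the last allocation */
	unsigned tail;               /* start of the oldest live allocation */
	unsigned off[MAX_LIVE];      /* live allocations, oldest first */
	unsigned len[MAX_LIVE];
	unsigned nlive;
};

static void toy_init(struct toy_sa *sa)
{
	memset(sa, 0, sizeof(*sa));
}

/* Returns the allocated offset, or -1 when no hole fits (the real code
 * would then wait on the oldest fence and retry). */
static int toy_alloc(struct toy_sa *sa, unsigned size)
{
	unsigned off;

	if (size == 0 || size > RING_SIZE || sa->nlive == MAX_LIVE)
		return -1;

	if (sa->nlive == 0) {
		off = 0;
		sa->tail = 0;
		sa->head = size % RING_SIZE;
	} else if (sa->head == sa->tail) {
		return -1;                          /* completely full */
	} else if (sa->head > sa->tail) {
		if (RING_SIZE - sa->head >= size) { /* room after "last" */
			off = sa->head;
			sa->head = (sa->head + size) % RING_SIZE;
		} else if (sa->tail >= size) {      /* wrap, skipping the end */
			off = 0;
			sa->head = size;
		} else {
			return -1;
		}
	} else {                                    /* wrapped: hole is [head, tail) */
		if (sa->tail - sa->head >= size) {
			off = sa->head;
			sa->head += size;
		} else {
			return -1;
		}
	}
	sa->off[sa->nlive] = off;
	sa->len[sa->nlive] = size;
	sa->nlive++;
	return (int)off;
}

/* "Oldest fence signalled": retire the oldest allocation, advance tail. */
static int toy_free_oldest(struct toy_sa *sa)
{
	unsigned i;

	if (sa->nlive == 0)
		return -1;
	for (i = 1; i < sa->nlive; i++) {
		sa->off[i - 1] = sa->off[i];
		sa->len[i - 1] = sa->len[i];
	}
	sa->nlive--;
	sa->tail = sa->nlive ? sa->off[0] : sa->head;
	return 0;
}
```

In the single-ring case allocation degenerates to a plain ring buffer, which is the "performs as well as a normal ring allocator" claim; the infinite-loop argument in the thread is about what happens when frees can *not* make progress, which this FIFO toy cannot exhibit by construction.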
Enhancing EDID quirk functionality
On 05/03/2012 02:42 PM, Adam Jackson wrote: > This looks good, thank you for taking it on. It was either that or give up on my big display, so ... you're welcome. > I'd like to see documentation for the bit values of the quirks as well. > And, ideally, this would also have some runtime API for manipulating the > quirk list, so that way you can test new quirks without needing a reboot > cycle. I agree that the bit values should be documented. I'm not sure where that documentation should go, however, since I can't find any documentation of the existing drm module parameters. Tell me where it should go, and I'll happily write the doc. I also agree that it would be nice to be able to manipulate the quirk list at runtime, and I did think about trying to enable that. I held off for a couple of reasons: 1) I'm a total noob at kernel code, so things like in-kernel locking, sysfs, memory management, etc., that would be required for a more dynamic API are all new to me. That said, I'm more than willing to give it a go, if I can get some guidance on those (and similar) topics. 2) I'm not sure how a runtime API should work. The simplest possibility is to just take a string, parse it, and overwrite the old extra quirk list with the new list. The downside to this is that all of the existing extra quirks need to be repeated to change a single quirk. > To close the loop all the way on that I'd also want to be able to scrape > the quirk list back out from that API, but that's not completely clean > right now. Sounds like a couple of sysfs files to me, one for the built-in quirks and one for the extra quirks -- maybe one quirk per line? See my comments about the sysfs API above. > We're being a little cavalier with the quirk list as it > stands because we don't differentiate among phy layers, and I can easily > imagine a monitor that needs a quirk on DVI but where the same quirk on > the same monitors' VGA would break it. I don't think this has caused > problems yet, but.
Now you're above my pay grade. What little I've read about the way DisplayPort, HDMI, VGA, and DVI play together makes me think this is a nightmare best deferred, hopefully forever. > InfoFrames are not valid for non-HDMI sinks, so yes, I'd call that a bug. That's pretty much what I figured. > Where the EDID for DP-1 appears to be truncated: the "extension" field > (second byte from the end) is 1 as you'd expect for an HDMI monitor, but > there's no extension block. How big of a file do you get from > /sys/class/drm/*/edid for that port? The EDID data in sysfs is 256 bytes, which I believe means that it does include the extension block. I just tried connecting an HDMI TV to my laptop, and I saw the same behavior -- a 256-byte edid file in sysfs, but "xrandr --verbose" only shows 128 bytes. When I attach the same TV to my workstation with Intel "HD 2000" graphics, "xrandr --verbose" shows all 256 bytes of EDID data. So it appears that the full data is being read by both systems, but the behavior of xrandr (or presumably whatever API xrandr uses to get the EDID data that it displays) differs between the two drivers. Fun. Thanks! -- Ian Pilcher arequipeno at gmail.com "If you're going to shift my paradigm ... at least buy me dinner first."
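The 256- vs 128-byte discrepancy discussed above is easy to sanity-check: per the EDID specification, byte 126 of the 128-byte base block holds the extension count, so the expected total blob size can be computed directly from the base block. A minimal sketch:

```c
#include <stddef.h>

/* Byte 126 of the 128-byte EDID base block is the extension count, so
 * the total blob length should be 128 * (1 + edid[126]).  A 256-byte
 * sysfs edid file with edid[126] == 1 therefore really does contain
 * the extension block, even if a client only parses the first 128 bytes. */
static size_t edid_expected_len(const unsigned char *base)
{
	return 128u * (1u + (size_t)base[126]);
}
```

Comparing this value against the size of /sys/class/drm/*/edid distinguishes "the kernel didn't read the extension" from "the client didn't show it".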
[PULL] drm-intel-next manual merge
Hi Dave, As discussed on irc, here's the pull request for the manual merge to unconfuse git about the changes in intel_display.c. Note that I've manually frobbed the shortlog to exclude all the changes merged through Linus' tree. Yours, Daniel The following changes since commit 5bc69bf9aeb73547cad8e1ce683a103fe9728282: Merge tag 'drm-intel-next-2012-04-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next (2012-05-02 09:22:29 +0100) are available in the git repository at: git://people.freedesktop.org/~danvet/drm-intel for-airlied for you to fetch changes up to dc257cf154be708ecc47b8b89c12ad8cd2cc35e4: Merge tag 'v3.4-rc6' into drm-intel-next (2012-05-07 14:02:14 +0200) Daniel Vetter (1): Merge tag 'v3.4-rc6' into drm-intel-next -- Daniel Vetter Mail: daniel at ffwll.ch Mobile: +41 (0)79 365 57 48
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #2 from Tom Stellard 2012-05-07 07:18:45 PDT --- If you re-run autogen.sh and configure does that fix the problem?
[PATCH v2 3/4] drm/exynos: added userptr feature.
On Sat, May 5, 2012 at 6:22 AM, Dave Airlie wrote: > On Sat, May 5, 2012 at 11:19 AM, wrote: >> Hi Dave, >> >> On Apr 25, 2012, at 7:15 PM, Dave Airlie wrote: >> >>> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae wrote: this feature could be used to use a memory region allocated by malloc() in user mode and an mmapped memory region allocated by other memory allocators. The userptr interface can identify the memory type through the vm_flags value and would get pages or page frame numbers to user space appropriately. >>> >>> Is there anything to stop the unprivileged userspace driver locking >>> all the RAM in the machine inside userptr? >>> >> >> you mean that there is something that can stop a user space driver locking >> some memory region of RAM? and if any user space driver locked some region >> then anyone in user space can't access the region? could you please tell me >> about your concerns in more detail so that we can solve the issue? I guess >> you mean that any user level driver such as a specific EGL library can >> allocate some memory region and also lock the region so that other user >> space applications can't access the region until rendering is completed by >> a hw accelerator such as a 2d/3d core, or the opposite case. >> >> actually, this feature has already been used by v4l2 so I didn't try to >> consider we could face any problem with this, and I've got a feeling >> maybe there is something I missed, so I'd be happy if you or anyone could give me >> any advice. > > Well v4l gets to make their own bad design decisions. > > The problem is if an unprivileged user accessing the drm can lock all > the pages it allocates into memory, by passing them to the kernel as > userptrs, thus bypassing the swap and blocking all other users on the > system. > > Dave. Besides that, you are not locking the vma, and afaik this means that the pages backing the vma might change; yes, you will still own the pages you get, but userspace might be reading/writing to different pages.
The vma would need to be locked, but then the userspace might unlock it behind your back and you start right from the beginning. Cheers, Jerome
[RFC][PATCH] drm/radeon/hdmi: define struct for AVI infoframe
On Mon, May 7, 2012 at 3:38 AM, Michel Dänzer wrote: > On Sun, 2012-05-06 at 18:29 +0200, Rafał Miłecki wrote: >> 2012/5/6 Dave Airlie : >> > On Sun, May 6, 2012 at 5:19 PM, Rafał Miłecki wrote: >> >> 2012/5/6 Rafał Miłecki : >> >>> diff --git a/drivers/gpu/drm/radeon/r600_hdmi.c >> >>> b/drivers/gpu/drm/radeon/r600_hdmi.c >> >>> index c308432..b14c90a 100644 >> >>> --- a/drivers/gpu/drm/radeon/r600_hdmi.c >> >>> +++ b/drivers/gpu/drm/radeon/r600_hdmi.c >> >>> @@ -134,78 +134,22 @@ static void r600_hdmi_infoframe_checksum(uint8_t >> >>> packetType, >> >>> } >> >>> >> >>> /* >> >>> - * build a HDMI Video Info Frame >> >>> + * Upload a HDMI AVI Infoframe >> >>> */ >> >>> -static void r600_hdmi_videoinfoframe( >> >>> - struct drm_encoder *encoder, >> >>> - enum r600_hdmi_color_format color_format, >> >>> - int active_information_present, >> >>> - uint8_t active_format_aspect_ratio, >> >>> - uint8_t scan_information, >> >>> - uint8_t colorimetry, >> >>> - uint8_t ex_colorimetry, >> >>> - uint8_t quantization, >> >>> - int ITC, >> >>> - uint8_t picture_aspect_ratio, >> >>> - uint8_t video_format_identification, >> >>> - uint8_t pixel_repetition, >> >>> - uint8_t non_uniform_picture_scaling, >> >>> - uint8_t bar_info_data_valid, >> >>> - uint16_t top_bar, >> >>> - uint16_t bottom_bar, >> >>> - uint16_t left_bar, >> >>> - uint16_t right_bar >> >>> -) >> >> >> >> In case someone wonders about the reason: I think it's really ugly to >> >> have a function taking 18 arguments, 17 of them related to the >> >> infoframe. It makes much more sense for me to use a struct for that. >> >> While working on that I thought it's reasonable to prepare a nice >> >> bitfield __packed struct ready-to-be-written to the GPU registers. >> > >> > won't this screw up on other endian machines? >> >> Hm, maybe it can. Is there some easy way to handle it correctly?
Some trick like >> __le8 foo: 3 >> __le8 bar: 1 >> maybe? > > Not really. The memory layout of bitfields is basically completely up to > the C implementation, so IMHO they're just inadequate for describing > fixed memory layouts. > > Yes, I agree, please stay away from bitfields; I know they look cool, but bitshifts are cool too. Cheers, Jerome
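To illustrate the bitshift alternative being recommended, here is a minimal endian-safe packing helper. The field widths below are hypothetical, chosen only for illustration — they are not the actual AVI infoframe bit layout:

```c
#include <stdint.h>

/* Shifts and masks always produce the same byte value regardless of
 * host endianness or how the compiler happens to lay out bitfields,
 * which is exactly why they are preferred for fixed register layouts.
 * Field positions here are made up for the example. */
static uint8_t pack_byte(uint8_t scan_info, uint8_t bar_info,
                         uint8_t active_fmt, uint8_t color_fmt)
{
	return (uint8_t)((scan_info & 0x3) |
	                 ((bar_info & 0x3) << 2) |
	                 ((active_fmt & 0x1) << 4) |
	                 ((color_fmt & 0x3) << 5));
}
```

A `__packed` bitfield struct encoding the same fields could legally order the bits differently on another ABI; the shift version cannot.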
[PATCH 20/20] drm/radeon: make the ib an inline object
From: Jerome Glisse

No need to malloc it any more. Signed-off-by: Jerome Glisse Signed-off-by: Christian König --- drivers/gpu/drm/radeon/evergreen_cs.c | 10 +++--- drivers/gpu/drm/radeon/r100.c | 38 ++-- drivers/gpu/drm/radeon/r200.c |2 +- drivers/gpu/drm/radeon/r300.c |4 +- drivers/gpu/drm/radeon/r600.c | 16 drivers/gpu/drm/radeon/r600_cs.c | 22 +-- drivers/gpu/drm/radeon/radeon.h |8 ++-- drivers/gpu/drm/radeon/radeon_cs.c| 63 - drivers/gpu/drm/radeon/radeon_ring.c | 41 +++-- 9 files changed, 93 insertions(+), 111 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen_cs.c b/drivers/gpu/drm/radeon/evergreen_cs.c index 70089d3..4e7dd2b 100644 --- a/drivers/gpu/drm/radeon/evergreen_cs.c +++ b/drivers/gpu/drm/radeon/evergreen_cs.c @@ -1057,7 +1057,7 @@ static int evergreen_cs_packet_parse_vline(struct radeon_cs_parser *p) uint32_t header, h_idx, reg, wait_reg_mem_info; volatile uint32_t *ib; - ib = p->ib->ptr; + ib = p->ib.ptr; /* parse the WAIT_REG_MEM */ r = evergreen_cs_packet_parse(p, &wait_reg_mem, p->idx); @@ -1215,7 +1215,7 @@ static int evergreen_cs_check_reg(struct radeon_cs_parser *p, u32 reg, u32 idx) if (!(evergreen_reg_safe_bm[i] & m)) return 0; } - ib = p->ib->ptr; + ib = p->ib.ptr; switch (reg) { /* force following reg to 0 in an attempt to disable out buffer * which will need us to better understand how it works to perform @@ -1896,7 +1896,7 @@ static int evergreen_packet3_check(struct radeon_cs_parser *p, u32 idx_value; track = (struct evergreen_cs_track *)p->track; - ib = p->ib->ptr; + ib = p->ib.ptr; idx = pkt->idx + 1; idx_value = radeon_get_ib_value(p, idx); @@ -2610,8 +2610,8 @@ int evergreen_cs_parse(struct radeon_cs_parser *p) } } while (p->idx < p->chunks[p->chunk_ib_idx].length_dw); #if 0 - for (r = 0; r < p->ib->length_dw; r++) { - printk(KERN_INFO "%05d 0x%08X\n", r, p->ib->ptr[r]); + for (r = 0; r < p->ib.length_dw; r++) { + printk(KERN_INFO "%05d 0x%08X\n", r, p->ib.ptr[r]); mdelay(1); } #endif diff --git a/drivers/gpu/drm/radeon/r100.c
b/drivers/gpu/drm/radeon/r100.c index ad6ceb7..0874a6d 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -139,9 +139,9 @@ int r100_reloc_pitch_offset(struct radeon_cs_parser *p, } tmp |= tile_flags; - p->ib->ptr[idx] = (value & 0x3fc0) | tmp; + p->ib.ptr[idx] = (value & 0x3fc0) | tmp; } else - p->ib->ptr[idx] = (value & 0xffc0) | tmp; + p->ib.ptr[idx] = (value & 0xffc0) | tmp; return 0; } @@ -156,7 +156,7 @@ int r100_packet3_load_vbpntr(struct radeon_cs_parser *p, volatile uint32_t *ib; u32 idx_value; - ib = p->ib->ptr; + ib = p->ib.ptr; track = (struct r100_cs_track *)p->track; c = radeon_get_ib_value(p, idx++) & 0x1F; if (c > 16) { @@ -1275,7 +1275,7 @@ void r100_cs_dump_packet(struct radeon_cs_parser *p, unsigned i; unsigned idx; - ib = p->ib->ptr; + ib = p->ib.ptr; idx = pkt->idx; for (i = 0; i <= (pkt->count + 1); i++, idx++) { DRM_INFO("ib[%d]=0x%08X\n", idx, ib[idx]); @@ -1354,7 +1354,7 @@ int r100_cs_packet_parse_vline(struct radeon_cs_parser *p) uint32_t header, h_idx, reg; volatile uint32_t *ib; - ib = p->ib->ptr; + ib = p->ib.ptr; /* parse the wait until */ r = r100_cs_packet_parse(p, , p->idx); @@ -1533,7 +1533,7 @@ static int r100_packet0_check(struct radeon_cs_parser *p, u32 tile_flags = 0; u32 idx_value; - ib = p->ib->ptr; + ib = p->ib.ptr; track = (struct r100_cs_track *)p->track; idx_value = radeon_get_ib_value(p, idx); @@ -1889,7 +1889,7 @@ static int r100_packet3_check(struct radeon_cs_parser *p, volatile uint32_t *ib; int r; - ib = p->ib->ptr; + ib = p->ib.ptr; idx = pkt->idx + 1; track = (struct r100_cs_track *)p->track; switch (pkt->opcode) { @@ -3684,7 +3684,7 @@ void r100_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib) int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) { - struct radeon_ib *ib; + struct radeon_ib ib; uint32_t scratch; uint32_t tmp = 0; unsigned i; @@ -3700,22 +3700,22 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring) if (r) { return 
r; } - ib->ptr[0] = PACKET0(scratch, 0); - ib->ptr[1] = 0xDEADBEEF; -
[PATCH 19/20] drm/radeon: remove r600 blit mutex v2
If we don't store local data into global variables it isn't necessary to lock anything. v2: rebased on new SA interface Signed-off-by: Christian König --- drivers/gpu/drm/radeon/evergreen_blit_kms.c |1 - drivers/gpu/drm/radeon/r600.c | 13 +--- drivers/gpu/drm/radeon/r600_blit_kms.c | 99 +++ drivers/gpu/drm/radeon/radeon.h |3 - drivers/gpu/drm/radeon/radeon_asic.h|9 ++- 5 files changed, 50 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c index 222acd2..30f0480 100644 --- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c +++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c @@ -637,7 +637,6 @@ int evergreen_blit_init(struct radeon_device *rdev) if (rdev->r600_blit.shader_obj) goto done; - mutex_init(&rdev->r600_blit.mutex); rdev->r600_blit.state_offset = 0; if (rdev->family < CHIP_CAYMAN) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 0ae2d2d..9d6009a 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2363,20 +2363,15 @@ int r600_copy_blit(struct radeon_device *rdev, unsigned num_gpu_pages, struct radeon_fence *fence) { + struct radeon_sa_bo *vb = NULL; int r; - mutex_lock(&rdev->r600_blit.mutex); - rdev->r600_blit.vb_ib = NULL; - r = r600_blit_prepare_copy(rdev, num_gpu_pages); + r = r600_blit_prepare_copy(rdev, num_gpu_pages, &vb); if (r) { - if (rdev->r600_blit.vb_ib) - radeon_ib_free(rdev, &rdev->r600_blit.vb_ib); - mutex_unlock(&rdev->r600_blit.mutex); return r; } - r600_kms_blit_copy(rdev, src_offset, dst_offset, num_gpu_pages); - r600_blit_done_copy(rdev, fence); - mutex_unlock(&rdev->r600_blit.mutex); + r600_kms_blit_copy(rdev, src_offset, dst_offset, num_gpu_pages, vb); + r600_blit_done_copy(rdev, fence, vb); return 0; } diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index db38f58..ef20822 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -513,7 +513,6 @@
int r600_blit_init(struct radeon_device *rdev) rdev->r600_blit.primitives.set_default_state = set_default_state; rdev->r600_blit.ring_size_common = 40; /* shaders + def state */ - rdev->r600_blit.ring_size_common += 16; /* fence emit for VB IB */ rdev->r600_blit.ring_size_common += 5; /* done copy */ rdev->r600_blit.ring_size_common += 16; /* fence emit for done copy */ @@ -528,7 +527,6 @@ int r600_blit_init(struct radeon_device *rdev) if (rdev->r600_blit.shader_obj) goto done; - mutex_init(&rdev->r600_blit.mutex); rdev->r600_blit.state_offset = 0; if (rdev->family >= CHIP_RV770) @@ -621,27 +619,6 @@ void r600_blit_fini(struct radeon_device *rdev) radeon_bo_unref(&rdev->r600_blit.shader_obj); } -static int r600_vb_ib_get(struct radeon_device *rdev, unsigned size) -{ - int r; - r = radeon_ib_get(rdev, RADEON_RING_TYPE_GFX_INDEX, - &rdev->r600_blit.vb_ib, size); - if (r) { - DRM_ERROR("failed to get IB for vertex buffer\n"); - return r; - } - - rdev->r600_blit.vb_total = size; - rdev->r600_blit.vb_used = 0; - return 0; -} - -static void r600_vb_ib_put(struct radeon_device *rdev) -{ - radeon_fence_emit(rdev, rdev->r600_blit.vb_ib->fence); - radeon_ib_free(rdev, &rdev->r600_blit.vb_ib); -} - static unsigned r600_blit_create_rect(unsigned num_gpu_pages, int *width, int *height, int max_dim) { @@ -688,7 +665,8 @@ static unsigned r600_blit_create_rect(unsigned num_gpu_pages, } -int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages) +int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages, + struct radeon_sa_bo **vb) { struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]; int r; @@ -705,46 +683,54 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages) } /* 48 bytes for vertex per loop */ - r = r600_vb_ib_get(rdev, (num_loops*48)+256); - if (r) + r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, vb, +(num_loops*48)+256, 256, true); + if (r) { return r; + } /* calculate number of loops correctly */ ring_size = num_loops *
dwords_per_loop; ring_size += rdev->r600_blit.ring_size_common; r = radeon_ring_lock(rdev, ring, ring_size); - if (r) + if (r) { + radeon_sa_bo_free(rdev, vb, NULL); return r; + }
[PATCH 18/20] drm/radeon: move the semaphore from the fence into the ib
From: Jerome GlisseIt never really belonged there in the first place. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h | 16 drivers/gpu/drm/radeon/radeon_cs.c|4 ++-- drivers/gpu/drm/radeon/radeon_fence.c |3 --- drivers/gpu/drm/radeon/radeon_ring.c |2 ++ 4 files changed, 12 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 6170307..9507be0 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -272,7 +272,6 @@ struct radeon_fence { uint64_tseq; /* RB, DMA, etc. */ unsignedring; - struct radeon_semaphore *semaphore; }; int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring); @@ -624,13 +623,14 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc); */ struct radeon_ib { - struct radeon_sa_bo *sa_bo; - uint32_tlength_dw; - uint64_tgpu_addr; - uint32_t*ptr; - struct radeon_fence *fence; - unsignedvm_id; - boolis_const_ib; + struct radeon_sa_bo *sa_bo; + uint32_tlength_dw; + uint64_tgpu_addr; + uint32_t*ptr; + struct radeon_fence *fence; + unsignedvm_id; + boolis_const_ib; + struct radeon_semaphore *semaphore; }; struct radeon_ring { diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 5c065bf..dcfe2a0 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -138,12 +138,12 @@ static int radeon_cs_sync_rings(struct radeon_cs_parser *p) return 0; } - r = radeon_semaphore_create(p->rdev, >ib->fence->semaphore); + r = radeon_semaphore_create(p->rdev, >ib->semaphore); if (r) { return r; } - return radeon_semaphore_sync_rings(p->rdev, p->ib->fence->semaphore, + return radeon_semaphore_sync_rings(p->rdev, p->ib->semaphore, sync_to_ring, p->ring); } diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 6767381..c1f5233 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c 
@@ -137,8 +137,6 @@ static void radeon_fence_destroy(struct kref *kref) fence = container_of(kref, struct radeon_fence, kref); fence->seq = RADEON_FENCE_NOTEMITED_SEQ; - if (fence->semaphore) - radeon_semaphore_free(fence->rdev, fence->semaphore, NULL); kfree(fence); } @@ -154,7 +152,6 @@ int radeon_fence_create(struct radeon_device *rdev, (*fence)->rdev = rdev; (*fence)->seq = RADEON_FENCE_NOTEMITED_SEQ; (*fence)->ring = ring; - (*fence)->semaphore = NULL; return 0; } diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index b3d6942..af8e1ee 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -93,6 +93,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring, (*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo); (*ib)->vm_id = 0; (*ib)->is_const_ib = false; + (*ib)->semaphore = NULL; return 0; } @@ -105,6 +106,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib) if (tmp == NULL) { return; } + radeon_semaphore_free(rdev, tmp->semaphore, tmp->fence); radeon_sa_bo_free(rdev, >sa_bo, tmp->fence); radeon_fence_unref(>fence); kfree(tmp); -- 1.7.5.4
[PATCH 17/20] drm/radeon: immediately free ttm-move semaphore
We can now protect the semaphore ram with a fence, so free it immediately.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 5e3d54d..0f6aee8 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -223,6 +223,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 	struct radeon_device *rdev;
 	uint64_t old_start, new_start;
 	struct radeon_fence *fence, *old_fence;
+	struct radeon_semaphore *sem = NULL;
 	int r;

 	rdev = radeon_get_rdev(bo->bdev);
@@ -272,15 +273,16 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 		bool sync_to_ring[RADEON_NUM_RINGS] = { };
 		sync_to_ring[old_fence->ring] = true;

-		r = radeon_semaphore_create(rdev, &fence->semaphore);
+		r = radeon_semaphore_create(rdev, &sem);
 		if (r) {
 			radeon_fence_unref(&fence);
 			return r;
 		}

-		r = radeon_semaphore_sync_rings(rdev, fence->semaphore,
+		r = radeon_semaphore_sync_rings(rdev, sem,
 						sync_to_ring, fence->ring);
 		if (r) {
+			radeon_semaphore_free(rdev, sem, NULL);
 			radeon_fence_unref(&fence);
 			return r;
 		}
@@ -292,6 +294,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 	/* FIXME: handle copy error */
 	r = ttm_bo_move_accel_cleanup(bo, (void *)fence, NULL,
 				      evict, no_wait_reserve, no_wait_gpu, new_mem);
+	radeon_semaphore_free(rdev, sem, fence);
 	radeon_fence_unref(&fence);
 	return r;
 }
--
1.7.5.4
[PATCH 16/20] drm/radeon: rip out the ib pool
From: Jerome GlisseIt isn't necessary any more and the suballocator seems to perform even better. Signed-off-by: Christian K?nig Signed-off-by: Jerome Glisse --- drivers/gpu/drm/radeon/radeon.h | 17 +-- drivers/gpu/drm/radeon/radeon_device.c|1 - drivers/gpu/drm/radeon/radeon_gart.c | 12 +- drivers/gpu/drm/radeon/radeon_ring.c | 241 - drivers/gpu/drm/radeon/radeon_semaphore.c |2 +- 5 files changed, 71 insertions(+), 202 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 45164e1..6170307 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -625,7 +625,6 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc); struct radeon_ib { struct radeon_sa_bo *sa_bo; - unsignedidx; uint32_tlength_dw; uint64_tgpu_addr; uint32_t*ptr; @@ -634,18 +633,6 @@ struct radeon_ib { boolis_const_ib; }; -/* - * locking - - * mutex protects scheduled_ibs, ready, alloc_bm - */ -struct radeon_ib_pool { - struct radeon_mutex mutex; - struct radeon_sa_managersa_manager; - struct radeon_ibibs[RADEON_IB_POOL_SIZE]; - boolready; - unsignedhead_id; -}; - struct radeon_ring { struct radeon_bo*ring_obj; volatile uint32_t *ring; @@ -787,7 +774,6 @@ struct si_rlc { int radeon_ib_get(struct radeon_device *rdev, int ring, struct radeon_ib **ib, unsigned size); void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib); -bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib); int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib); int radeon_ib_pool_init(struct radeon_device *rdev); void radeon_ib_pool_fini(struct radeon_device *rdev); @@ -1522,7 +1508,8 @@ struct radeon_device { wait_queue_head_t fence_queue; struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; - struct radeon_ib_pool ib_pool; + boolib_pool_ready; + struct radeon_sa_managerring_tmp_bo; struct radeon_irq irq; struct radeon_asic *asic; struct radeon_gem gem; diff --git 
a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 48876c1..e1bc7e9 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -724,7 +724,6 @@ int radeon_device_init(struct radeon_device *rdev, /* mutex initialization are all done here so we * can recall function without having locking issues */ radeon_mutex_init(>cs_mutex); - radeon_mutex_init(>ib_pool.mutex); mutex_init(>ring_lock); mutex_init(>dc_hw_i2c_mutex); if (rdev->family >= CHIP_R600) diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index 53dba8e..8e9ef34 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -432,8 +432,8 @@ retry_id: rdev->vm_manager.use_bitmap |= 1 << id; vm->id = id; list_add_tail(>list, >vm_manager.lru_vm); - return radeon_vm_bo_update_pte(rdev, vm, rdev->ib_pool.sa_manager.bo, - >ib_pool.sa_manager.bo->tbo.mem); + return radeon_vm_bo_update_pte(rdev, vm, rdev->ring_tmp_bo.bo, + >ring_tmp_bo.bo->tbo.mem); } /* object have to be reserved */ @@ -631,7 +631,7 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm) /* map the ib pool buffer at 0 in virtual address space, set * read only */ - r = radeon_vm_bo_add(rdev, vm, rdev->ib_pool.sa_manager.bo, 0, + r = radeon_vm_bo_add(rdev, vm, rdev->ring_tmp_bo.bo, 0, RADEON_VM_PAGE_READABLE | RADEON_VM_PAGE_SNOOPED); return r; } @@ -648,12 +648,12 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm) radeon_mutex_unlock(>cs_mutex); /* remove all bo */ - r = radeon_bo_reserve(rdev->ib_pool.sa_manager.bo, false); + r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false); if (!r) { - bo_va = radeon_bo_va(rdev->ib_pool.sa_manager.bo, vm); + bo_va = radeon_bo_va(rdev->ring_tmp_bo.bo, vm); list_del_init(_va->bo_list); list_del_init(_va->vm_list); - radeon_bo_unreserve(rdev->ib_pool.sa_manager.bo); + radeon_bo_unreserve(rdev->ring_tmp_bo.bo); kfree(bo_va); } if
[PATCH 15/20] drm/radeon: simplify semaphore handling v2
From: Jerome GlisseDirectly use the suballocator to get small chunks of memory. It's equally fast and doesn't crash when we encounter a GPU reset. v2: rebased on new SA interface. Signed-off-by: Christian K?nig Signed-off-by: Jerome Glisse --- drivers/gpu/drm/radeon/evergreen.c|1 - drivers/gpu/drm/radeon/ni.c |1 - drivers/gpu/drm/radeon/r600.c |1 - drivers/gpu/drm/radeon/radeon.h | 29 +- drivers/gpu/drm/radeon/radeon_device.c|2 - drivers/gpu/drm/radeon/radeon_fence.c |2 +- drivers/gpu/drm/radeon/radeon_semaphore.c | 137 + drivers/gpu/drm/radeon/radeon_test.c |4 +- drivers/gpu/drm/radeon/rv770.c|1 - drivers/gpu/drm/radeon/si.c |1 - 10 files changed, 30 insertions(+), 149 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index ecc29bc..7e7ac3d 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3550,7 +3550,6 @@ void evergreen_fini(struct radeon_device *rdev) evergreen_pcie_gart_fini(rdev); r600_vram_scratch_fini(rdev); radeon_gem_fini(rdev); - radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); radeon_agp_fini(rdev); radeon_bo_fini(rdev); diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index 9cd2657..107b217 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -1744,7 +1744,6 @@ void cayman_fini(struct radeon_device *rdev) cayman_pcie_gart_fini(rdev); r600_vram_scratch_fini(rdev); radeon_gem_fini(rdev); - radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); radeon_bo_fini(rdev); radeon_atombios_fini(rdev); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 87a2333..0ae2d2d 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2658,7 +2658,6 @@ void r600_fini(struct radeon_device *rdev) r600_vram_scratch_fini(rdev); radeon_agp_fini(rdev); radeon_gem_fini(rdev); - radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); 
radeon_bo_fini(rdev); radeon_atombios_fini(rdev); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index cc7f16a..45164e1 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -434,34 +434,13 @@ int radeon_mode_dumb_destroy(struct drm_file *file_priv, /* * Semaphores. */ -struct radeon_ring; - -#defineRADEON_SEMAPHORE_BO_SIZE256 - -struct radeon_semaphore_driver { - rwlock_tlock; - struct list_headbo; -}; - -struct radeon_semaphore_bo; - /* everything here is constant */ struct radeon_semaphore { - struct list_headlist; + struct radeon_sa_bo *sa_bo; + signed waiters; uint64_tgpu_addr; - uint32_t*cpu_ptr; - struct radeon_semaphore_bo *bo; }; -struct radeon_semaphore_bo { - struct list_headlist; - struct radeon_ib*ib; - struct list_headfree; - struct radeon_semaphore semaphores[RADEON_SEMAPHORE_BO_SIZE/8]; - unsignednused; -}; - -void radeon_semaphore_driver_fini(struct radeon_device *rdev); int radeon_semaphore_create(struct radeon_device *rdev, struct radeon_semaphore **semaphore); void radeon_semaphore_emit_signal(struct radeon_device *rdev, int ring, @@ -473,7 +452,8 @@ int radeon_semaphore_sync_rings(struct radeon_device *rdev, bool sync_to[RADEON_NUM_RINGS], int dst_ring); void radeon_semaphore_free(struct radeon_device *rdev, - struct radeon_semaphore *semaphore); + struct radeon_semaphore *semaphore, + struct radeon_fence *fence); /* * GART structures, functions & helpers @@ -1540,7 +1520,6 @@ struct radeon_device { struct radeon_mman mman; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; wait_queue_head_t fence_queue; - struct radeon_semaphore_driver semaphore_drv; struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; struct radeon_ib_pool ib_pool; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index b827b2e..48876c1 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -732,11 +732,9 @@ 
int
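After this patch a semaphore is nothing more than a small GPU-visible sub-allocation plus a waiter count. A minimal host-side sketch of that bookkeeping (toy types, not the driver's real structs; the idea that signals decrement and waits increment the counter is from the patch, everything else is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the reworked struct radeon_semaphore: an 8-byte
 * GPU-visible sub-allocation plus a signed waiter count. */
struct toy_semaphore {
    uint64_t gpu_addr;  /* address of the 8-byte SA block */
    int waiters;        /* waits emitted minus signals emitted */
};

/* Each ring that signals decrements, each ring that waits increments;
 * a balanced signal/wait pair ends back at zero. */
static void toy_sem_emit_signal(struct toy_semaphore *s) { s->waiters--; }
static void toy_sem_emit_wait(struct toy_semaphore *s)   { s->waiters++; }
```

Tracking `waiters` like this makes an unbalanced sync sequence visible at free time, without needing the old per-driver semaphore bo list.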
[PATCH 14/20] drm/radeon: multiple ring allocator v2
A startover with a new idea for a multiple ring allocator. It should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen.

We store the last allocated bo in "last" and always try to allocate after the last allocated bo. The principle is that in a linear GPU ring progression, what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU.

If that's not the case, we skip over the bo after last to the closest done bo, if such a one exists. If none exists and we are not asked to block, we report failure to allocate.

If we are asked to block, we wait on the oldest fence of each ring; we just wait for any of those fences to complete.

v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary.

Signed-off-by: Christian König
Signed-off-by: Jerome Glisse
---
 drivers/gpu/drm/radeon/radeon.h      |   7 +-
 drivers/gpu/drm/radeon/radeon_ring.c |  19 +--
 drivers/gpu/drm/radeon/radeon_sa.c   | 292 +++---
 3 files changed, 210 insertions(+), 108 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 37a7459..cc7f16a 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -385,7 +385,9 @@ struct radeon_bo_list {
 struct radeon_sa_manager {
 	spinlock_t		lock;
 	struct radeon_bo	*bo;
-	struct list_head	sa_bo;
+	struct list_head	*hole;
+	struct list_head	flist[RADEON_NUM_RINGS];
+	struct list_head	olist;
 	unsigned		size;
 	uint64_t		gpu_addr;
 	void			*cpu_ptr;
@@ -396,7 +398,8 @@ struct radeon_sa_bo;

 /* sub-allocation buffer */
 struct radeon_sa_bo {
-	struct list_head	list;
+	struct list_head	olist;
+	struct list_head	flist;
 	struct radeon_sa_manager	*manager;
 	unsigned		soffset;
 	unsigned		eoffset;
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c
b/drivers/gpu/drm/radeon/radeon_ring.c index 1748d93..e074ff5 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -204,25 +204,22 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) int radeon_ib_pool_init(struct radeon_device *rdev) { - struct radeon_sa_manager tmp; int i, r; - r = radeon_sa_bo_manager_init(rdev, , - RADEON_IB_POOL_SIZE*64*1024, - RADEON_GEM_DOMAIN_GTT); - if (r) { - return r; - } - radeon_mutex_lock(>ib_pool.mutex); if (rdev->ib_pool.ready) { radeon_mutex_unlock(>ib_pool.mutex); - radeon_sa_bo_manager_fini(rdev, ); return 0; } - rdev->ib_pool.sa_manager = tmp; - INIT_LIST_HEAD(>ib_pool.sa_manager.sa_bo); + r = radeon_sa_bo_manager_init(rdev, >ib_pool.sa_manager, + RADEON_IB_POOL_SIZE*64*1024, + RADEON_GEM_DOMAIN_GTT); + if (r) { + radeon_mutex_unlock(>ib_pool.mutex); + return r; + } + for (i = 0; i < RADEON_IB_POOL_SIZE; i++) { rdev->ib_pool.ibs[i].fence = NULL; rdev->ib_pool.ibs[i].idx = i; diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 90ee8ad..757a9d4 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -27,21 +27,42 @@ * Authors: *Jerome Glisse */ +/* Algorithm: + * + * We store the last allocated bo in "hole", we always try to allocate + * after the last allocated bo. Principle is that in a linear GPU ring + * progression was is after last is the oldest bo we allocated and thus + * the first one that should no longer be in use by the GPU. + * + * If it's not the case we skip over the bo after last to the closest + * done bo if such one exist. If none exist and we are not asked to + * block we report failure to allocate. + * + * If we are asked to block we wait on all the oldest fence of all + * rings. We just wait for any of those fence to complete. 
+ */ #include "drmP.h" #include "drm.h" #include "radeon.h" +static void radeon_sa_bo_remove_locked(struct radeon_sa_bo *sa_bo); +static void radeon_sa_bo_try_free(struct radeon_sa_manager *sa_manager); + int radeon_sa_bo_manager_init(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager, unsigned size, u32 domain) { - int r; + int i,
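The allocation principle described in the commit message and the new comment block boils down to very little code. The following is only an illustrative toy (fixed pool size, power-of-two alignment, no hole list, no fence checking): it shows the "always place after the last allocation, wrap at the end" part, while the real radeon_sa.c additionally skips to already-signaled blocks or blocks on the oldest fences when the wrap would collide with live allocations.

```c
#include <assert.h>

#define POOL_SIZE 1024

/* Toy model of the v2 allocator idea: remember where the most recent
 * allocation ended and always try to place the next block right after
 * it, wrapping to the start of the buffer when we run out of room. */
struct toy_sa {
    unsigned last_end;   /* end offset of the most recent allocation */
};

/* Returns the start offset of the new block, or -1 if it can never fit.
 * align must be a power of two.  Note the wrap blindly reuses the front
 * of the buffer -- the real allocator only does that after proving (via
 * fences) that the GPU is done with the blocks living there. */
static int toy_sa_alloc(struct toy_sa *sa, unsigned size, unsigned align)
{
    unsigned start = (sa->last_end + align - 1) & ~(align - 1);

    if (start + size > POOL_SIZE) {
        if (size > POOL_SIZE)
            return -1;
        start = 0;               /* wrap to the buffer start */
    }
    sa->last_end = start + size;
    return (int)start;
}
```

In the steady state with a single ring this behaves exactly like a classic ring allocator, which is why the commit message expects no regression for the one-ring case.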
[PATCH 13/20] drm/radeon: use one wait queue for all rings, add fence_wait_any v2
From: Jerome Glisse

Use one wait queue for all rings. When one ring progresses, the others likely do too, and we are not expecting to have a lot of waiters anyway.

Also add a fence_wait_any that will wait until the first fence in the fence array (one fence per ring) is signaled. This allows waiting on all rings.

v2: some minor cleanups and improvements.

Signed-off-by: Christian König
Signed-off-by: Jerome Glisse
---
 drivers/gpu/drm/radeon/radeon.h       |   5 +-
 drivers/gpu/drm/radeon/radeon_fence.c | 163 -
 2 files changed, 162 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index ada70d1..37a7459 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -262,7 +262,6 @@ struct radeon_fence_driver {
 	uint64_t		seq;
 	atomic64_t		last_seq;
 	unsigned long		last_activity;
-	wait_queue_head_t	queue;
 	bool			initialized;
 };

@@ -286,6 +285,9 @@ bool radeon_fence_signaled(struct radeon_fence *fence);
 int radeon_fence_wait(struct radeon_fence *fence, bool interruptible);
 int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring);
 int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring);
+int radeon_fence_wait_any(struct radeon_device *rdev,
+			  struct radeon_fence **fences,
+			  bool intr);
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence);
 void radeon_fence_unref(struct radeon_fence **fence);
 unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring);
@@ -1534,6 +1536,7 @@ struct radeon_device {
 	struct radeon_scratch		scratch;
 	struct radeon_mman		mman;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
+	wait_queue_head_t		fence_queue;
 	struct radeon_semaphore_driver	semaphore_drv;
 	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 8034b42..45d4e6e 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@
-222,11 +222,11 @@ static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, trace_radeon_fence_wait_begin(rdev->ddev, seq); radeon_irq_kms_sw_irq_get(rdev, ring); if (intr) { - r = wait_event_interruptible_timeout(rdev->fence_drv[ring].queue, + r = wait_event_interruptible_timeout(rdev->fence_queue, (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), timeout); } else { - r = wait_event_timeout(rdev->fence_drv[ring].queue, + r = wait_event_timeout(rdev->fence_queue, (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), timeout); } @@ -300,6 +300,159 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr) return 0; } +bool radeon_fence_any_seq_signaled(struct radeon_device *rdev, u64 *seq) +{ + unsigned i; + + for (i = 0; i < RADEON_NUM_RINGS; ++i) { + if (seq[i] && radeon_fence_seq_signaled(rdev, seq[i], i)) { + return true; + } + } + return false; +} + +static int radeon_fence_wait_any_seq(struct radeon_device *rdev, +u64 *target_seq, bool intr) +{ + unsigned long timeout, last_activity, tmp; + unsigned i, ring = RADEON_NUM_RINGS; + bool signaled; + int r; + + for (i = 0, last_activity = 0; i < RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) { + continue; + } + + /* use the most recent one as indicator */ + if (time_after(rdev->fence_drv[i].last_activity, last_activity)) { + last_activity = rdev->fence_drv[i].last_activity; + } + + /* For lockup detection just pick the lowest ring we are +* actively waiting for +*/ + if (i < ring) { + ring = i; + } + } + + /* nothing to wait for ? */ + if (ring == RADEON_NUM_RINGS) { + return 0; + } + + while (!radeon_fence_any_seq_signaled(rdev, target_seq)) { + timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT; + if (time_after(last_activity, timeout)) { + /* the normal case, timeout is somewhere before last_activity */ +
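The core of the new fence_wait_any path is the per-ring target-sequence check that the wait loop above spins on. A simplified host-side model (plain integers instead of the driver's atomic per-ring counters; the ring count is made up for the example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_RINGS 3

/* Mirrors the idea of radeon_fence_any_seq_signaled(): one target
 * sequence per ring, where 0 means "not waiting on this ring", and
 * the wait is satisfied as soon as ANY ring's last completed sequence
 * has reached its target. */
static bool any_seq_signaled(const uint64_t last_seq[NUM_RINGS],
                             const uint64_t target_seq[NUM_RINGS])
{
    for (unsigned i = 0; i < NUM_RINGS; ++i) {
        if (target_seq[i] && last_seq[i] >= target_seq[i])
            return true;
    }
    return false;
}
```

With a single shared wait queue, every fence IRQ wakes this check for all waiters, which is exactly why the patch can drop the per-ring queues.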
[PATCH 12/20] drm/radeon: define new SA interface v3
Define the interface without modifying the allocation algorithm in any way. v2: rebase on top of fence new uint64 patch v3: add ring to debugfs output Signed-off-by: Jerome Glisse Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_gart.c |6 +- drivers/gpu/drm/radeon/radeon_object.h|5 +- drivers/gpu/drm/radeon/radeon_ring.c |8 ++-- drivers/gpu/drm/radeon/radeon_sa.c| 60 drivers/gpu/drm/radeon/radeon_semaphore.c |2 +- 6 files changed, 63 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 9374ab1..ada70d1 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -398,6 +398,7 @@ struct radeon_sa_bo { struct radeon_sa_manager*manager; unsignedsoffset; unsignedeoffset; + struct radeon_fence *fence; }; /* diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index c5789ef..53dba8e 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -326,7 +326,7 @@ static void radeon_vm_unbind_locked(struct radeon_device *rdev, rdev->vm_manager.use_bitmap &= ~(1 << vm->id); list_del_init(>list); vm->id = -1; - radeon_sa_bo_free(rdev, >sa_bo); + radeon_sa_bo_free(rdev, >sa_bo, NULL); vm->pt = NULL; list_for_each_entry(bo_va, >va, vm_list) { @@ -395,7 +395,7 @@ int radeon_vm_bind(struct radeon_device *rdev, struct radeon_vm *vm) retry: r = radeon_sa_bo_new(rdev, >vm_manager.sa_manager, >sa_bo, RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8), -RADEON_GPU_PAGE_SIZE); +RADEON_GPU_PAGE_SIZE, false); if (r) { if (list_empty(>vm_manager.lru_vm)) { return r; @@ -426,7 +426,7 @@ retry_id: /* do hw bind */ r = rdev->vm_manager.funcs->bind(rdev, vm, id); if (r) { - radeon_sa_bo_free(rdev, >sa_bo); + radeon_sa_bo_free(rdev, >sa_bo, NULL); return r; } rdev->vm_manager.use_bitmap |= 1 << id; diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index 
4fc7f07..befec7d 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -169,9 +169,10 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev, extern int radeon_sa_bo_new(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager, struct radeon_sa_bo **sa_bo, - unsigned size, unsigned align); + unsigned size, unsigned align, bool block); extern void radeon_sa_bo_free(struct radeon_device *rdev, - struct radeon_sa_bo **sa_bo); + struct radeon_sa_bo **sa_bo, + struct radeon_fence *fence); #if defined(CONFIG_DEBUG_FS) extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, struct seq_file *m); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 45adb37..1748d93 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -85,7 +85,7 @@ bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib) if (ib->fence && ib->fence->seq < RADEON_FENCE_NOTEMITED_SEQ) { if (radeon_fence_signaled(ib->fence)) { radeon_fence_unref(>fence); - radeon_sa_bo_free(rdev, >sa_bo); + radeon_sa_bo_free(rdev, >sa_bo, NULL); done = true; } } @@ -124,7 +124,7 @@ retry: if (rdev->ib_pool.ibs[idx].fence == NULL) { r = radeon_sa_bo_new(rdev, >ib_pool.sa_manager, >ib_pool.ibs[idx].sa_bo, -size, 256); +size, 256, false); if (!r) { *ib = >ib_pool.ibs[idx]; (*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo); @@ -173,7 +173,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib) } radeon_mutex_lock(>ib_pool.mutex); if (tmp->fence && tmp->fence->seq == RADEON_FENCE_NOTEMITED_SEQ) { - radeon_sa_bo_free(rdev, >sa_bo); + radeon_sa_bo_free(rdev, >sa_bo, NULL); radeon_fence_unref(>fence); }
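The interface change above gives radeon_sa_bo_free() an optional fence argument: NULL means "release right away", a fence means "the GPU may still be reading this block, reclaim it only once the fence signals". A toy model of that contract (toy_fence stands in for struct radeon_fence; the reclaim walk is reduced to a single call):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fence { bool signaled; };

struct toy_sa_bo {
    struct toy_fence *fence;   /* guard fence, NULL once reclaimable */
    bool in_use;
};

/* free(bo, NULL) releases immediately; free(bo, fence) defers the
 * release until the fence has signaled. */
static void toy_sa_bo_free(struct toy_sa_bo *bo, struct toy_fence *fence)
{
    bo->fence = fence;
    if (!fence || fence->signaled)
        bo->in_use = false;
}

/* What the allocator does lazily while scanning for holes: reclaim
 * any block whose guard fence has signaled in the meantime. */
static void toy_sa_try_reclaim(struct toy_sa_bo *bo)
{
    if (bo->in_use && bo->fence && bo->fence->signaled)
        bo->in_use = false;
}
```

This is what lets callers like the semaphore and blit code drop their own "keep it alive until the GPU is done" lists: the fence handed to free carries that information instead.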
[PATCH 11/20] drm/radeon: make sa bo a stand alone object
Allocating and freeing it separately.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon.h           |  4 ++--
 drivers/gpu/drm/radeon/radeon_cs.c        |  4 ++--
 drivers/gpu/drm/radeon/radeon_gart.c      |  4 ++--
 drivers/gpu/drm/radeon/radeon_object.h    |  4 ++--
 drivers/gpu/drm/radeon/radeon_ring.c      |  6 +++---
 drivers/gpu/drm/radeon/radeon_sa.c        | 28 +++-
 drivers/gpu/drm/radeon/radeon_semaphore.c |  4 ++--
 7 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index d1c2154..9374ab1 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -638,7 +638,7 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
 */
 struct radeon_ib {
-	struct radeon_sa_bo	sa_bo;
+	struct radeon_sa_bo	*sa_bo;
 	unsigned		idx;
 	uint32_t		length_dw;
 	uint64_t		gpu_addr;
@@ -693,7 +693,7 @@ struct radeon_vm {
 	unsigned		last_pfn;
 	u64			pt_gpu_addr;
 	u64			*pt;
-	struct radeon_sa_bo	sa_bo;
+	struct radeon_sa_bo	*sa_bo;
 	struct mutex		mutex;
 	/* last fence for cs using this vm */
 	struct radeon_fence	*fence;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index b778037..5c065bf 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -477,7 +477,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 	 * offset inside the pool bo
 	 */
-	parser->const_ib->gpu_addr = parser->const_ib->sa_bo.soffset;
+	parser->const_ib->gpu_addr = parser->const_ib->sa_bo->soffset;
 	r = radeon_ib_schedule(rdev, parser->const_ib);
 	if (r)
 		goto out;
@@ -487,7 +487,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 	 * offset inside the pool bo
 	 */
-	parser->ib->gpu_addr = parser->ib->sa_bo.soffset;
+	parser->ib->gpu_addr = parser->ib->sa_bo->soffset;
 	parser->ib->is_const_ib = false;
 	r = radeon_ib_schedule(rdev,
parser->ib); out: diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index 4a5d9d4..c5789ef 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -404,8 +404,8 @@ retry: radeon_vm_unbind(rdev, vm_evict); goto retry; } - vm->pt = radeon_sa_bo_cpu_addr(>sa_bo); - vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(>sa_bo); + vm->pt = radeon_sa_bo_cpu_addr(vm->sa_bo); + vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(vm->sa_bo); memset(vm->pt, 0, RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8)); retry_id: diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index 99ab46a..4fc7f07 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -168,10 +168,10 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager); extern int radeon_sa_bo_new(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager, - struct radeon_sa_bo *sa_bo, + struct radeon_sa_bo **sa_bo, unsigned size, unsigned align); extern void radeon_sa_bo_free(struct radeon_device *rdev, - struct radeon_sa_bo *sa_bo); + struct radeon_sa_bo **sa_bo); #if defined(CONFIG_DEBUG_FS) extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, struct seq_file *m); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index f49c9c0..45adb37 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -127,8 +127,8 @@ retry: size, 256); if (!r) { *ib = >ib_pool.ibs[idx]; - (*ib)->ptr = radeon_sa_bo_cpu_addr(&(*ib)->sa_bo); - (*ib)->gpu_addr = radeon_sa_bo_gpu_addr(&(*ib)->sa_bo); + (*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo); + (*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
[PATCH 10/20] drm/radeon: keep start and end offset in the SA
Instead of offset + size keep start and end offset directly. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h|4 ++-- drivers/gpu/drm/radeon/radeon_cs.c |4 ++-- drivers/gpu/drm/radeon/radeon_object.h |4 ++-- drivers/gpu/drm/radeon/radeon_sa.c | 13 +++-- 4 files changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 8a6b1b3..d1c2154 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -396,8 +396,8 @@ struct radeon_sa_bo; struct radeon_sa_bo { struct list_headlist; struct radeon_sa_manager*manager; - unsignedoffset; - unsignedsize; + unsignedsoffset; + unsignedeoffset; }; /* diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 289b0d7..b778037 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -477,7 +477,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev, /* ib pool is bind at 0 in virtual address space to gpu_addr is the * offset inside the pool bo */ - parser->const_ib->gpu_addr = parser->const_ib->sa_bo.offset; + parser->const_ib->gpu_addr = parser->const_ib->sa_bo.soffset; r = radeon_ib_schedule(rdev, parser->const_ib); if (r) goto out; @@ -487,7 +487,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev, /* ib pool is bind at 0 in virtual address space to gpu_addr is the * offset inside the pool bo */ - parser->ib->gpu_addr = parser->ib->sa_bo.offset; + parser->ib->gpu_addr = parser->ib->sa_bo.soffset; parser->ib->is_const_ib = false; r = radeon_ib_schedule(rdev, parser->ib); out: diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index d9fca1e..99ab46a 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -149,12 +149,12 @@ extern struct radeon_bo_va *radeon_bo_va(struct radeon_bo *rbo, static inline uint64_t radeon_sa_bo_gpu_addr(struct radeon_sa_bo 
*sa_bo) { - return sa_bo->manager->gpu_addr + sa_bo->offset; + return sa_bo->manager->gpu_addr + sa_bo->soffset; } static inline void * radeon_sa_bo_cpu_addr(struct radeon_sa_bo *sa_bo) { - return sa_bo->manager->cpu_ptr + sa_bo->offset; + return sa_bo->manager->cpu_ptr + sa_bo->soffset; } extern int radeon_sa_bo_manager_init(struct radeon_device *rdev, diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 1db0568..3bea7ba 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -152,11 +152,11 @@ int radeon_sa_bo_new(struct radeon_device *rdev, offset = 0; list_for_each_entry(tmp, _manager->sa_bo, list) { /* room before this object ? */ - if (offset < tmp->offset && (tmp->offset - offset) >= size) { + if (offset < tmp->soffset && (tmp->soffset - offset) >= size) { head = tmp->list.prev; goto out; } - offset = tmp->offset + tmp->size; + offset = tmp->eoffset; wasted = offset % align; if (wasted) { wasted = align - wasted; @@ -166,7 +166,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev, /* room at the end ? */ head = sa_manager->sa_bo.prev; tmp = list_entry(head, struct radeon_sa_bo, list); - offset = tmp->offset + tmp->size; + offset = tmp->eoffset; wasted = offset % align; if (wasted) { wasted = align - wasted; @@ -180,8 +180,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev, out: sa_bo->manager = sa_manager; - sa_bo->offset = offset; - sa_bo->size = size; + sa_bo->soffset = offset; + sa_bo->eoffset = offset + size; list_add(_bo->list, head); spin_unlock(_manager->lock); return 0; @@ -202,7 +202,8 @@ void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, spin_lock(_manager->lock); list_for_each_entry(i, _manager->sa_bo, list) { - seq_printf(m, "offset %08d: size %4d\n", i->offset, i->size); + seq_printf(m, "[%08x %08x] size %4d [%p]\n", + i->soffset, i->eoffset, i->eoffset - i->soffset, i); } spin_unlock(_manager->lock); } -- 1.7.5.4
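With only an offset and a size, finding the free room between two blocks always needs an addition first; with the half-open [soffset, eoffset) pair this patch introduces, both a block's size and the gap to its neighbour fall out of single subtractions, which is what the hole-finding loop above relies on. A small sketch (toy_block is a hypothetical stand-in, not the kernel struct):

```c
#include <assert.h>

/* Half-open interval per block: [soffset, eoffset). */
struct toy_block { unsigned soffset, eoffset; };

/* Free room between two adjacent blocks in offset order. */
static unsigned toy_gap(const struct toy_block *prev,
                        const struct toy_block *next)
{
    return next->soffset - prev->eoffset;
}

/* Block size needs no separate field any more. */
static unsigned toy_size(const struct toy_block *b)
{
    return b->eoffset - b->soffset;
}
```

The debugfs dump in the patch prints exactly these derived values (`i->eoffset - i->soffset`), so the representation change costs nothing in observability.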
[PATCH 09/20] drm/radeon: add sub allocator debugfs file
Dumping the current allocations. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon_object.h |5 + drivers/gpu/drm/radeon/radeon_ring.c | 22 ++ drivers/gpu/drm/radeon/radeon_sa.c | 14 ++ 3 files changed, 41 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index c120ab9..d9fca1e 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -172,5 +172,10 @@ extern int radeon_sa_bo_new(struct radeon_device *rdev, unsigned size, unsigned align); extern void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo); +#if defined(CONFIG_DEBUG_FS) +extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, +struct seq_file *m); +#endif + #endif diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 116be5e..f49c9c0 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -601,6 +601,23 @@ static int radeon_debugfs_ib_info(struct seq_file *m, void *data) static struct drm_info_list radeon_debugfs_ib_list[RADEON_IB_POOL_SIZE]; static char radeon_debugfs_ib_names[RADEON_IB_POOL_SIZE][32]; static unsigned radeon_debugfs_ib_idx[RADEON_IB_POOL_SIZE]; + +static int radeon_debugfs_sa_info(struct seq_file *m, void *data) +{ + struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_device *dev = node->minor->dev; + struct radeon_device *rdev = dev->dev_private; + + radeon_sa_bo_dump_debug_info(>ib_pool.sa_manager, m); + + return 0; + +} + +static struct drm_info_list radeon_debugfs_sa_list[] = { +{"radeon_sa_info", _debugfs_sa_info, 0, NULL}, +}; + #endif int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *ring) @@ -627,6 +644,11 @@ int radeon_debugfs_ib_init(struct radeon_device *rdev) { #if defined(CONFIG_DEBUG_FS) unsigned i; + int r; + + r = radeon_debugfs_add_files(rdev, 
radeon_debugfs_sa_list, 1); + if (r) + return r; for (i = 0; i < RADEON_IB_POOL_SIZE; i++) { sprintf(radeon_debugfs_ib_names[i], "radeon_ib_%04u", i); diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index aed0a8c..1db0568 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -193,3 +193,17 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo) list_del_init(_bo->list); spin_unlock(_bo->manager->lock); } + +#if defined(CONFIG_DEBUG_FS) +void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, + struct seq_file *m) +{ + struct radeon_sa_bo *i; + + spin_lock(_manager->lock); + list_for_each_entry(i, _manager->sa_bo, list) { + seq_printf(m, "offset %08d: size %4d\n", i->offset, i->size); + } + spin_unlock(_manager->lock); +} +#endif -- 1.7.5.4
[PATCH 08/20] drm/radeon: add proper locking to the SA v3
Make the suballocator self-contained with respect to locking.

v2: split the bugfix into a separate patch.
v3: remove some unrelated changes.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon.h    |    1 +
 drivers/gpu/drm/radeon/radeon_sa.c |    6 ++++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 701094b..8a6b1b3 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -381,6 +381,7 @@ struct radeon_bo_list {
  * alignment).
  */
 struct radeon_sa_manager {
+	spinlock_t		lock;
 	struct radeon_bo	*bo;
 	struct list_head	sa_bo;
 	unsigned		size;
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 8fbfe69..aed0a8c 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -37,6 +37,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 {
 	int r;

+	spin_lock_init(&sa_manager->lock);
 	sa_manager->bo = NULL;
 	sa_manager->size = size;
 	sa_manager->domain = domain;
@@ -139,6 +140,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	BUG_ON(align > RADEON_GPU_PAGE_SIZE);
 	BUG_ON(size > sa_manager->size);

+	spin_lock(&sa_manager->lock);
 	/* no one ? */
 	head = sa_manager->sa_bo.prev;
@@ -172,6 +174,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	offset += wasted;
 	if ((sa_manager->size - offset) < size) {
 		/* failed to find somethings big enough */
+		spin_unlock(&sa_manager->lock);
 		return -ENOMEM;
 	}
@@ -180,10 +183,13 @@ out:
 	sa_bo->manager = sa_manager;
 	sa_bo->offset = offset;
 	sa_bo->size = size;
 	list_add(&sa_bo->list, head);
+	spin_unlock(&sa_manager->lock);
 	return 0;
 }

 void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo)
 {
+	spin_lock(&sa_bo->manager->lock);
 	list_del_init(&sa_bo->list);
+	spin_unlock(&sa_bo->manager->lock);
 }
--
1.7.5.4
[PATCH 07/20] drm/radeon: use inline functions to calc sa_bo addr
Instead of hacking the calculation multiple times. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon_gart.c |6 ++ drivers/gpu/drm/radeon/radeon_object.h| 11 +++ drivers/gpu/drm/radeon/radeon_ring.c |6 ++ drivers/gpu/drm/radeon/radeon_semaphore.c |6 ++ 4 files changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index c58a036..4a5d9d4 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -404,10 +404,8 @@ retry: radeon_vm_unbind(rdev, vm_evict); goto retry; } - vm->pt = rdev->vm_manager.sa_manager.cpu_ptr; - vm->pt += (vm->sa_bo.offset >> 3); - vm->pt_gpu_addr = rdev->vm_manager.sa_manager.gpu_addr; - vm->pt_gpu_addr += vm->sa_bo.offset; + vm->pt = radeon_sa_bo_cpu_addr(>sa_bo); + vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(>sa_bo); memset(vm->pt, 0, RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8)); retry_id: diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index f9104be..c120ab9 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -146,6 +146,17 @@ extern struct radeon_bo_va *radeon_bo_va(struct radeon_bo *rbo, /* * sub allocation */ + +static inline uint64_t radeon_sa_bo_gpu_addr(struct radeon_sa_bo *sa_bo) +{ + return sa_bo->manager->gpu_addr + sa_bo->offset; +} + +static inline void * radeon_sa_bo_cpu_addr(struct radeon_sa_bo *sa_bo) +{ + return sa_bo->manager->cpu_ptr + sa_bo->offset; +} + extern int radeon_sa_bo_manager_init(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager, unsigned size, u32 domain); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 2fdc8c3..116be5e 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -127,10 +127,8 @@ retry: size, 256); if (!r) { *ib = >ib_pool.ibs[idx]; - (*ib)->ptr = rdev->ib_pool.sa_manager.cpu_ptr; - 
(*ib)->ptr += ((*ib)->sa_bo.offset >> 2); - (*ib)->gpu_addr = rdev->ib_pool.sa_manager.gpu_addr; - (*ib)->gpu_addr += (*ib)->sa_bo.offset; + (*ib)->ptr = radeon_sa_bo_cpu_addr(&(*ib)->sa_bo); + (*ib)->gpu_addr = radeon_sa_bo_gpu_addr(&(*ib)->sa_bo); (*ib)->fence = fence; (*ib)->vm_id = 0; (*ib)->is_const_ib = false; diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c index c5b3d8e..f312ba5 100644 --- a/drivers/gpu/drm/radeon/radeon_semaphore.c +++ b/drivers/gpu/drm/radeon/radeon_semaphore.c @@ -53,10 +53,8 @@ static int radeon_semaphore_add_bo(struct radeon_device *rdev) kfree(bo); return r; } - gpu_addr = rdev->ib_pool.sa_manager.gpu_addr; - gpu_addr += bo->ib->sa_bo.offset; - cpu_ptr = rdev->ib_pool.sa_manager.cpu_ptr; - cpu_ptr += (bo->ib->sa_bo.offset >> 2); + gpu_addr = radeon_sa_bo_gpu_addr(>ib->sa_bo); + cpu_ptr = radeon_sa_bo_cpu_addr(>ib->sa_bo); for (i = 0; i < (RADEON_SEMAPHORE_BO_SIZE/8); i++) { bo->semaphores[i].gpu_addr = gpu_addr; bo->semaphores[i].cpu_ptr = cpu_ptr; -- 1.7.5.4
[PATCH 06/20] drm/radeon: rework locking ring emission mutex in fence deadlock detection
Some callers illegal called fence_wait_next/empty while holding the ring emission mutex. So don't relock the mutex in that cases, and move the actual locking into the fence code. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h|4 +- drivers/gpu/drm/radeon/radeon_device.c |5 +++- drivers/gpu/drm/radeon/radeon_fence.c | 39 --- drivers/gpu/drm/radeon/radeon_pm.c |8 +- drivers/gpu/drm/radeon/radeon_ring.c |6 + 5 files changed, 33 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 7c87117..701094b 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -284,8 +284,8 @@ int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence); void radeon_fence_process(struct radeon_device *rdev, int ring); bool radeon_fence_signaled(struct radeon_fence *fence); int radeon_fence_wait(struct radeon_fence *fence, bool interruptible); -int radeon_fence_wait_next(struct radeon_device *rdev, int ring); -int radeon_fence_wait_empty(struct radeon_device *rdev, int ring); +int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring); +int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring); struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence); void radeon_fence_unref(struct radeon_fence **fence); unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring); diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 0e7b72a..b827b2e 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -912,9 +912,12 @@ int radeon_suspend_kms(struct drm_device *dev, pm_message_t state) } /* evict vram memory */ radeon_bo_evict_vram(rdev); + + mutex_lock(>ring_lock); /* wait for gpu to finish processing current batch */ for (i = 0; i < RADEON_NUM_RINGS; i++) - radeon_fence_wait_empty(rdev, i); + radeon_fence_wait_empty_locked(rdev, i); + 
mutex_unlock(>ring_lock); radeon_save_bios_scratch_regs(rdev); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index f386807..8034b42 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -192,7 +192,7 @@ bool radeon_fence_signaled(struct radeon_fence *fence) } static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, -unsigned ring, bool intr) +unsigned ring, bool intr, bool lock_ring) { unsigned long timeout, last_activity; uint64_t seq; @@ -247,8 +247,14 @@ static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, if (seq != atomic64_read(>fence_drv[ring].last_seq)) { continue; } + + if (lock_ring) { + mutex_lock(>ring_lock); + } + /* test if somebody else has already decided that this is a lockup */ if (last_activity != rdev->fence_drv[ring].last_activity) { + mutex_unlock(>ring_lock); continue; } @@ -262,15 +268,15 @@ static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, rdev->fence_drv[i].last_activity = jiffies; } - /* change last activity so nobody else think there is a lockup */ - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - rdev->fence_drv[i].last_activity = jiffies; - } - /* mark the ring as not ready any more */ rdev->ring[ring].ready = false; + mutex_unlock(>ring_lock); return -EDEADLK; } + + if (lock_ring) { + mutex_unlock(>ring_lock); + } } } return 0; @@ -285,7 +291,8 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr) return -EINVAL; } - r = radeon_fence_wait_seq(fence->rdev, fence->seq, fence->ring, intr); + r = radeon_fence_wait_seq(fence->rdev, fence->seq, + fence->ring, intr, true); if (r) { return r; } @@ -293,7 +300,7 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr) return 0; } -int radeon_fence_wait_next(struct radeon_device *rdev, int ring) +int
[PATCH 05/20] drm/radeon: rework fence handling, drop fence list v5
From: Jerome GlisseUsing 64bits fence sequence we can directly compare sequence number to know if a fence is signaled or not. Thus the fence list became useless, so does the fence lock that mainly protected the fence list. Things like ring.ready are no longer behind a lock, this should be ok as ring.ready is initialized once and will only change when facing lockup. Worst case is that we return an -EBUSY just after a successfull GPU reset, or we go into wait state instead of returning -EBUSY (thus delaying reporting -EBUSY to fence wait caller). v2: Remove left over comment, force using writeback on cayman and newer, thus not having to suffer from possibly scratch reg exhaustion v3: Rebase on top of change to uint64 fence patch v4: Change DCE5 test to force write back on cayman and newer but also any APU such as PALM or SUMO family v5: Rebase on top of new uint64 fence patch v6: Just break if seq doesn't change any more. Use radeon_fence prefix for all function names. Even if it's now highly optimized, try avoiding polling to often. Signed-off-by: Jerome Glisse Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h|6 +- drivers/gpu/drm/radeon/radeon_device.c |8 +- drivers/gpu/drm/radeon/radeon_fence.c | 289 +--- 3 files changed, 118 insertions(+), 185 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index cdf46bc..7c87117 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -263,15 +263,12 @@ struct radeon_fence_driver { atomic64_t last_seq; unsigned long last_activity; wait_queue_head_t queue; - struct list_heademitted; - struct list_headsignaled; boolinitialized; }; struct radeon_fence { struct radeon_device*rdev; struct kref kref; - struct list_headlist; /* protected by radeon_fence.lock */ uint64_tseq; /* RB, DMA, etc. 
*/ @@ -291,7 +288,7 @@ int radeon_fence_wait_next(struct radeon_device *rdev, int ring); int radeon_fence_wait_empty(struct radeon_device *rdev, int ring); struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence); void radeon_fence_unref(struct radeon_fence **fence); -int radeon_fence_count_emitted(struct radeon_device *rdev, int ring); +unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring); /* * Tiling registers @@ -1534,7 +1531,6 @@ struct radeon_device { struct radeon_mode_info mode_info; struct radeon_scratch scratch; struct radeon_mman mman; - rwlock_tfence_lock; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; struct radeon_semaphore_driver semaphore_drv; struct mutexring_lock; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 3f6ff2a..0e7b72a 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -225,9 +225,9 @@ int radeon_wb_init(struct radeon_device *rdev) /* disable event_write fences */ rdev->wb.use_event = false; /* disabled via module param */ - if (radeon_no_wb == 1) + if (radeon_no_wb == 1) { rdev->wb.enabled = false; - else { + } else { if (rdev->flags & RADEON_IS_AGP) { /* often unreliable on AGP */ rdev->wb.enabled = false; @@ -237,8 +237,9 @@ int radeon_wb_init(struct radeon_device *rdev) } else { rdev->wb.enabled = true; /* event_write fences are only available on r600+ */ - if (rdev->family >= CHIP_R600) + if (rdev->family >= CHIP_R600) { rdev->wb.use_event = true; + } } } /* always use writeback/events on NI, APUs */ @@ -731,7 +732,6 @@ int radeon_device_init(struct radeon_device *rdev, mutex_init(>gem.mutex); mutex_init(>pm.mutex); mutex_init(>vram_mutex); - rwlock_init(>fence_lock); rwlock_init(>semaphore_drv.lock); INIT_LIST_HEAD(>gem.objects); init_waitqueue_head(>irq.vblank_queue); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index feb2bbc..f386807 100644 --- 
a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -63,30 +63,18 @@ static u32 radeon_fence_read(struct radeon_device *rdev, int ring) int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence) { -
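The reason the fence list can be dropped, as the commit message says, is that with a monotonically increasing 64-bit sequence a fence's state reduces to a single comparison against the last sequence seen completed on its ring. A minimal illustrative model of that check — not the actual radeon code; names and layout are simplified here:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only: with 64-bit sequence numbers that never
 * wrap in practice, a fence is signaled iff its sequence is not beyond
 * the last sequence the driver has seen completed on its ring, so no
 * per-fence signaled/emitted lists are needed.
 */
struct fence {
	uint64_t seq;
	unsigned ring;
};

static bool fence_signaled(const struct fence *f,
			   const uint64_t *last_seq /* one entry per ring */)
{
	return f->seq <= last_seq[f->ring];
}
```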
[PATCH 04/20] drm/radeon: convert fence to uint64_t v4
From: Jerome GlisseThis convert fence to use uint64_t sequence number intention is to use the fact that uin64_t is big enough that we don't need to care about wrap around. Tested with and without writeback using 0xF000 as initial fence sequence and thus allowing to test the wrap around from 32bits to 64bits. v2: Add comment about possible race btw CPU & GPU, add comment stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET Read fence sequenc in reverse order of GPU write them so we mitigate the race btw CPU and GPU. v3: Drop the need for ring to emit the 64bits fence, and just have each ring emit the lower 32bits of the fence sequence. We handle the wrap over 32bits in fence_process. v4: Just a small optimization: Don't reread the last_seq value if loop restarts, since we already know its value anyway. Also start at zero not one for seq value and use pre instead of post increment in emmit, otherwise wait_empty will deadlock. Signed-off-by: Jerome Glisse Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h | 39 ++- drivers/gpu/drm/radeon/radeon_fence.c | 116 +++-- drivers/gpu/drm/radeon/radeon_ring.c |9 ++- 3 files changed, 107 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index e99ea81..cdf46bc 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -100,28 +100,32 @@ extern int radeon_lockup_timeout; * Copy from radeon_drv.h so we don't have to include both and have conflicting * symbol; */ -#define RADEON_MAX_USEC_TIMEOUT10 /* 100 ms */ -#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) +#define RADEON_MAX_USEC_TIMEOUT10 /* 100 ms */ +#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) /* RADEON_IB_POOL_SIZE must be a power of 2 */ -#define RADEON_IB_POOL_SIZE16 -#define RADEON_DEBUGFS_MAX_COMPONENTS 32 -#define RADEONFB_CONN_LIMIT4 -#define RADEON_BIOS_NUM_SCRATCH8 +#define RADEON_IB_POOL_SIZE16 +#define RADEON_DEBUGFS_MAX_COMPONENTS 32 +#define 
RADEONFB_CONN_LIMIT4 +#define RADEON_BIOS_NUM_SCRATCH8 /* max number of rings */ -#define RADEON_NUM_RINGS 3 +#define RADEON_NUM_RINGS 3 + +/* fence seq are set to this number when signaled */ +#define RADEON_FENCE_SIGNALED_SEQ 0LL +#define RADEON_FENCE_NOTEMITED_SEQ (~0LL) /* internal ring indices */ /* r1xx+ has gfx CP ring */ -#define RADEON_RING_TYPE_GFX_INDEX 0 +#define RADEON_RING_TYPE_GFX_INDEX 0 /* cayman has 2 compute CP rings */ -#define CAYMAN_RING_TYPE_CP1_INDEX 1 -#define CAYMAN_RING_TYPE_CP2_INDEX 2 +#define CAYMAN_RING_TYPE_CP1_INDEX 1 +#define CAYMAN_RING_TYPE_CP2_INDEX 2 /* hardcode those limit for now */ -#define RADEON_VA_RESERVED_SIZE(8 << 20) -#define RADEON_IB_VM_MAX_SIZE (64 << 10) +#define RADEON_VA_RESERVED_SIZE(8 << 20) +#define RADEON_IB_VM_MAX_SIZE (64 << 10) /* * Errata workarounds. @@ -254,8 +258,9 @@ struct radeon_fence_driver { uint32_tscratch_reg; uint64_tgpu_addr; volatile uint32_t *cpu_addr; - atomic_tseq; - uint32_tlast_seq; + /* seq is protected by ring emission lock */ + uint64_tseq; + atomic64_t last_seq; unsigned long last_activity; wait_queue_head_t queue; struct list_heademitted; @@ -268,11 +273,9 @@ struct radeon_fence { struct kref kref; struct list_headlist; /* protected by radeon_fence.lock */ - uint32_tseq; - boolemitted; - boolsignaled; + uint64_tseq; /* RB, DMA, etc. */ - int ring; + unsignedring; struct radeon_semaphore *semaphore; }; diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 5bb78bf..feb2bbc 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -66,14 +66,14 @@ int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence) unsigned long irq_flags; write_lock_irqsave(>fence_lock, irq_flags); - if (fence->emitted) { + if (fence->seq && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) { write_unlock_irqrestore(>fence_lock, irq_flags);
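The v3 note above — rings emit only the lower 32 bits and fence_process handles the wrap — boils down to splicing the hardware's 32-bit value onto the upper half of the last known 64-bit sequence. A hedged sketch of that extension step, with illustrative names rather than the exact radeon_fence_process() code:

```c
#include <stdint.h>

/*
 * Sketch of the 32->64 bit extension: the ring writes only the low
 * 32 bits; the CPU splices them onto the upper half of the last known
 * 64-bit sequence and bumps the upper half when the low half wrapped.
 */
static uint64_t extend_seq(uint64_t last_seq, uint32_t hw_seq)
{
	uint64_t seq = (last_seq & 0xffffffff00000000ULL) | hw_seq;

	if (seq < last_seq)		/* lower 32 bits wrapped around */
		seq += 0x100000000ULL;
	return seq;
}
```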
[PATCH 03/20] drm/radeon: replace the per ring mutex with a global one
A single global mutex for ring submissions seems sufficient. Signed-off-by: Christian K?nig --- drivers/gpu/drm/radeon/radeon.h |3 +- drivers/gpu/drm/radeon/radeon_device.c|3 +- drivers/gpu/drm/radeon/radeon_pm.c| 10 +- drivers/gpu/drm/radeon/radeon_ring.c | 28 +++--- drivers/gpu/drm/radeon/radeon_semaphore.c | 42 + 5 files changed, 41 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 82ffa6a..e99ea81 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -676,7 +676,6 @@ struct radeon_ring { uint64_tgpu_addr; uint32_talign_mask; uint32_tptr_mask; - struct mutexmutex; boolready; u32 ptr_reg_shift; u32 ptr_reg_mask; @@ -815,6 +814,7 @@ int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *cp, unsign int radeon_ring_lock(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ndw); void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *cp); void radeon_ring_unlock_commit(struct radeon_device *rdev, struct radeon_ring *cp); +void radeon_ring_undo(struct radeon_ring *ring); void radeon_ring_unlock_undo(struct radeon_device *rdev, struct radeon_ring *cp); int radeon_ring_test(struct radeon_device *rdev, struct radeon_ring *cp); void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *ring); @@ -1534,6 +1534,7 @@ struct radeon_device { rwlock_tfence_lock; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; struct radeon_semaphore_driver semaphore_drv; + struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; struct radeon_ib_pool ib_pool; struct radeon_irq irq; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index ff28210..3f6ff2a 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -724,8 +724,7 @@ int radeon_device_init(struct radeon_device *rdev, * can recall function without having locking issues */ 
radeon_mutex_init(>cs_mutex); radeon_mutex_init(>ib_pool.mutex); - for (i = 0; i < RADEON_NUM_RINGS; ++i) - mutex_init(>ring[i].mutex); + mutex_init(>ring_lock); mutex_init(>dc_hw_i2c_mutex); if (rdev->family >= CHIP_R600) spin_lock_init(>ih.lock); diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c index caa55d6..7c38745 100644 --- a/drivers/gpu/drm/radeon/radeon_pm.c +++ b/drivers/gpu/drm/radeon/radeon_pm.c @@ -252,10 +252,7 @@ static void radeon_pm_set_clocks(struct radeon_device *rdev) mutex_lock(>ddev->struct_mutex); mutex_lock(>vram_mutex); - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - if (rdev->ring[i].ring_obj) - mutex_lock(>ring[i].mutex); - } + mutex_lock(>ring_lock); /* gui idle int has issues on older chips it seems */ if (rdev->family >= CHIP_R600) { @@ -311,10 +308,7 @@ static void radeon_pm_set_clocks(struct radeon_device *rdev) rdev->pm.dynpm_planned_action = DYNPM_ACTION_NONE; - for (i = 0; i < RADEON_NUM_RINGS; ++i) { - if (rdev->ring[i].ring_obj) - mutex_unlock(>ring[i].mutex); - } + mutex_unlock(>ring_lock); mutex_unlock(>vram_mutex); mutex_unlock(>ddev->struct_mutex); } diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 2eb4c6e..a4d60ae 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -346,9 +346,9 @@ int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *ring, unsi if (ndw < ring->ring_free_dw) { break; } - mutex_unlock(>mutex); + mutex_unlock(>ring_lock); r = radeon_fence_wait_next(rdev, radeon_ring_index(rdev, ring)); - mutex_lock(>mutex); + mutex_lock(>ring_lock); if (r) return r; } @@ -361,10 +361,10 @@ int radeon_ring_lock(struct radeon_device *rdev, struct radeon_ring *ring, unsig { int r; - mutex_lock(>mutex); + mutex_lock(>ring_lock); r = radeon_ring_alloc(rdev, ring, ndw); if (r) { - mutex_unlock(>mutex); + mutex_unlock(>ring_lock); return r; } return 0; @@ -389,20 +389,25 @@ void 
radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring)
[PATCH 02/20] drm/radeon: clarify and extend wb setup on APUs and NI+ asics
From: Alex Deucher

Use family rather than a DCE check for clarity; also always use wb on APUs, as there will never be AGP variants.

Signed-off-by: Alex Deucher
Reviewed-by: Christian König
---
 drivers/gpu/drm/radeon/radeon_device.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index d18f0c4..ff28210 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -241,8 +241,8 @@ int radeon_wb_init(struct radeon_device *rdev)
 			rdev->wb.use_event = true;
 		}
 	}
-	/* always use writeback/events on NI */
-	if (ASIC_IS_DCE5(rdev)) {
+	/* always use writeback/events on NI, APUs */
+	if (rdev->family >= CHIP_PALM) {
 		rdev->wb.enabled = true;
 		rdev->wb.use_event = true;
 	}
--
1.7.5.4
[PATCH 01/20] drm/radeon: fix possible lack of synchronization btw ttm and other ring
From: Jerome GlisseWe need to sync with the GFX ring as ttm might have schedule bo move on it and new command scheduled for other ring need to wait for bo data to be in place. Signed-off-by: Jerome Glisse Reviewed by: Christian K?nig --- drivers/gpu/drm/radeon/radeon_cs.c | 12 ++-- include/drm/radeon_drm.h |1 - 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index c66beb1..289b0d7 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -122,15 +122,15 @@ static int radeon_cs_sync_rings(struct radeon_cs_parser *p) int i, r; for (i = 0; i < p->nrelocs; i++) { + struct radeon_fence *fence; + if (!p->relocs[i].robj || !p->relocs[i].robj->tbo.sync_obj) continue; - if (!(p->relocs[i].flags & RADEON_RELOC_DONT_SYNC)) { - struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj; - if (fence->ring != p->ring && !radeon_fence_signaled(fence)) { - sync_to_ring[fence->ring] = true; - need_sync = true; - } + fence = p->relocs[i].robj->tbo.sync_obj; + if (fence->ring != p->ring && !radeon_fence_signaled(fence)) { + sync_to_ring[fence->ring] = true; + need_sync = true; } } diff --git a/include/drm/radeon_drm.h b/include/drm/radeon_drm.h index 7c491b4..5805686 100644 --- a/include/drm/radeon_drm.h +++ b/include/drm/radeon_drm.h @@ -926,7 +926,6 @@ struct drm_radeon_cs_chunk { }; /* drm_radeon_cs_reloc.flags */ -#define RADEON_RELOC_DONT_SYNC 0x01 struct drm_radeon_cs_reloc { uint32_thandle; -- 1.7.5.4
SA and other Patches.
Hi Jerome & everybody on the list,

this gathers together every patch we developed over the last week or so and which is not already in drm-next. I've run quite a few tests with them yesterday and today, and as far as I can see we've hammered out every known bug.

For the SA allocator I reverted to tracking the hole pointer instead of just the last allocation, because otherwise we would never release the first allocation on the list. Glxgears now even keeps happily running if I deadlock the non-GFX rings on purpose.

Please take a second look at them, and if nobody objects any more we should commit them to drm-next.

Cheers,
Christian.
[RFC v2 5/5] drm: Add NVIDIA Tegra support
On 25.04.2012 12:45, Thierry Reding wrote:
> +/ {
> +	...
> +
> +	/* host1x */
> +	host1x: host1x@50000000 {
> +		compatible = "nvidia,tegra20-host1x";
> +		reg = <0x50000000 0x00024000>;
> +		interrupts = <0 64 0x04   /* cop syncpt */
> +			      0 65 0x04   /* mpcore syncpt */
> +			      0 66 0x04   /* cop general */
> +			      0 67 0x04>; /* mpcore general */
> +	};
> +
> +	/* video-encoding/decoding */
> +	mpe@54040000 {
> +		reg = <0x54040000 0x00040000>;
> +		interrupts = <0 68 0x04>;
> +	};
> (...)

Hi Thierry,

I still have lots of questions about how device trees work. For now I'm just trying to match the device tree structure to the hardware - let me know if that goes wrong.

There's a hierarchy in the hardware, which should be represented in the device tree. All of the hardware blocks are client modules of host1x - with the exception of host1x itself, obviously. The CPU has two methods for accessing the hardware: the clients' register apertures and host1x channels. Both of these operate via the host1x hardware.

We should define a host1x bus in the device tree, and move all nodes except host1x under that bus. This will help us in the long run, as we will have multiple drivers (drm, v4l2) each accessing hardware under host1x. We will need to model the bus, and the bus_type will need to take over the responsibility of managing the common resources. The same goes for clocking the hardware: whenever we want to access the display's register aperture, host1x needs to be clocked.
> +	/* graphics host */
> +	graphics@54000000 {
> +		compatible = "nvidia,tegra20-graphics";
> +
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges;
> +
> +		display-controllers = < >;
> +		carveout = <0x0e000000 0x02000000>;
> +		host1x = <&host1x>;
> +		gart = <&gart>;
> +
> +		connectors {
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +
> +			connector@0 {
> +				reg = <0>;
> +				edid = /incbin/("machine.edid");
> +				output = <>;
> +			};
> +
> +			connector@1 {
> +				reg = <1>;
> +				output = <>;
> +				ddc = <>;
> +
> +				hpd-gpio = <&gpio 111 0>; /* PN7 */
> +			};
> +		};
> +	};
> +};

I'm not sure what this node means. The register range from 0x54000000 onwards is actually the one you just described in the nodes of the individual client modules. Why is it represented here again?

Terje
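P.S. The host1x bus hierarchy I'm suggesting might look roughly like this — an illustrative sketch only, with the register values copied from your example; the nesting is just a proposal, not an agreed binding:

```dts
/* clients sit under the host1x bus they are accessed through */
host1x: host1x@50000000 {
	compatible = "nvidia,tegra20-host1x";
	reg = <0x50000000 0x00024000>;
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	/* video-encoding/decoding, now a child of host1x */
	mpe@54040000 {
		reg = <0x54040000 0x00040000>;
		interrupts = <0 68 0x04>;
	};
};
```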
[git pull] drm fixes
Two fixes from Intel, one a regression, one because I merged an early version of a fix. Also the nouveau revert of the i2c code that was tested on the list. Dave. The following changes since commit febb72a6e4cc6c8cffcc1ea649a3fb364f1ea432: IA32 emulation: Fix build problem for modular ia32 a.out support (2012-05-06 18:26:20 -0700) are available in the git repository at: git://people.freedesktop.org/~airlied/linux drm-fixes Ben Skeggs (1): drm/nouveau/i2c: resume use of i2c-algo-bit, rather than custom stack Daniel Vetter (2): drm/i915: disable sdvo hotplug on i945g/gm drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ Dave Airlie (1): Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes drivers/gpu/drm/i915/intel_ringbuffer.c |9 +- drivers/gpu/drm/i915/intel_sdvo.c |6 + drivers/gpu/drm/nouveau/nouveau_i2c.c | 199 --- drivers/gpu/drm/nouveau/nouveau_i2c.h |1 + 4 files changed, 34 insertions(+), 181 deletions(-)
[PATCH 04/20] drm/radeon: convert fence to uint64_t v4
On Mon, May 7, 2012 at 11:04 AM, Christian K?nig wrote: > On 07.05.2012 16:39, Jerome Glisse wrote: >> >> On Mon, May 7, 2012 at 7:42 AM, Christian K?nig >> ?wrote: >>> >>> From: Jerome Glisse >>> >>> This convert fence to use uint64_t sequence number intention is >>> to use the fact that uin64_t is big enough that we don't need to >>> care about wrap around. >>> >>> Tested with and without writeback using 0xF000 as initial >>> fence sequence and thus allowing to test the wrap around from >>> 32bits to 64bits. >>> >>> v2: Add comment about possible race btw CPU& ?GPU, add comment >>> >>> ? ?stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET >>> ? ?Read fence sequenc in reverse order of GPU write them so we >>> ? ?mitigate the race btw CPU and GPU. >>> >>> v3: Drop the need for ring to emit the 64bits fence, and just have >>> ? ?each ring emit the lower 32bits of the fence sequence. We >>> ? ?handle the wrap over 32bits in fence_process. >>> >>> v4: Just a small optimization: Don't reread the last_seq value >>> ? ?if loop restarts, since we already know its value anyway. >>> ? ?Also start at zero not one for seq value and use pre instead >>> ? ?of post increment in emmit, otherwise wait_empty will deadlock. >> >> Why changing that v3 was already good no deadlock. I started at 1 >> especialy for that, a signaled fence is set to 0 so it always compare >> as signaled. Just using preincrement is exactly like starting at one. >> I don't see the need for this change but if it makes you happy. > > > Not exactly, the last emitted sequence is also used in > radeon_fence_wait_empty. So when you use post increment > radeon_fence_wait_empty will actually not wait for the last emitted fence to > be signaled, but for last emitted + 1, so it practically waits forever. > > Without this change suspend (for example) will just lockup. > > Cheers, > Christian. Yeah you right, my tree had a fix for that. I probably messed up the rebase patch at one point. 
Well, as your version fixes it I am fine with it. Cheers, Jerome >> >> Cheers, >> Jerome >>> >>> Signed-off-by: Jerome Glisse >>> Signed-off-by: Christian König >>> --- >>> drivers/gpu/drm/radeon/radeon.h | 39 ++- >>> drivers/gpu/drm/radeon/radeon_fence.c | 116 +++-- >>> drivers/gpu/drm/radeon/radeon_ring.c | 9 ++- >>> 3 files changed, 107 insertions(+), 57 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/radeon/radeon.h >>> b/drivers/gpu/drm/radeon/radeon.h >>> index e99ea81..cdf46bc 100644 >>> --- a/drivers/gpu/drm/radeon/radeon.h >>> +++ b/drivers/gpu/drm/radeon/radeon.h >>> @@ -100,28 +100,32 @@ extern int radeon_lockup_timeout; >>> * Copy from radeon_drv.h so we don't have to include both and have conflicting >>> * symbol; >>> */ >>> -#define RADEON_MAX_USEC_TIMEOUT  100000 /* 100 ms */ >>> -#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) >>> +#define RADEON_MAX_USEC_TIMEOUT      100000 /* 100 ms */ >>> +#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) >>> /* RADEON_IB_POOL_SIZE must be a power of 2 */ >>> -#define RADEON_IB_POOL_SIZE  16 >>> -#define RADEON_DEBUGFS_MAX_COMPONENTS  32 >>> -#define RADEONFB_CONN_LIMIT  4 >>> -#define RADEON_BIOS_NUM_SCRATCH  8 >>> +#define RADEON_IB_POOL_SIZE      16 >>> +#define RADEON_DEBUGFS_MAX_COMPONENTS      32 >>> +#define RADEONFB_CONN_LIMIT      4 >>> +#define RADEON_BIOS_NUM_SCRATCH      8 >>> >>> /* max number of rings */ >>> -#define RADEON_NUM_RINGS 3 >>> +#define RADEON_NUM_RINGS      3 >>> + >>> +/* fence seq are set to this number when signaled */ >>> +#define RADEON_FENCE_SIGNALED_SEQ      0LL >>> +#define RADEON_FENCE_NOTEMITED_SEQ      (~0LL) >>> >>> /* internal ring indices */ >>> /* r1xx+ has gfx CP ring */ >>> -#define RADEON_RING_TYPE_GFX_INDEX  0 >>> +#define RADEON_RING_TYPE_GFX_INDEX      0 >>> >>> /* cayman has 2 compute CP rings */ >>> -#define CAYMAN_RING_TYPE_CP1_INDEX 1 >>> -#define CAYMAN_RING_TYPE_CP2_INDEX 2 >>> +#define CAYMAN_RING_TYPE_CP1_INDEX      1 >>> +#define CAYMAN_RING_TYPE_CP2_INDEX      2 >>> >>> /* hardcode those limit for now */ >>> -#define RADEON_VA_RESERVED_SIZE  (8 << 20) >>> -#define RADEON_IB_VM_MAX_SIZE  (64 << 10) >>> +#define RADEON_VA_RESERVED_SIZE      (8 << 20) >>> +#define RADEON_IB_VM_MAX_SIZE      (64 << 10) >>> >>> /* >>> * Errata workarounds. >>> @@ -254,8 +258,9 @@ struct radeon_fence_driver { >>> uint32_t scratch_reg; >>> uint64_t gpu_addr; >>> volatile uint32_t *cpu_addr; >>> - atomic_t seq; >>> - uint32_t last_seq; >>> + /* seq is protected by ring emission lock */ >>> + uint64_t seq;
[PATCH 14/20] drm/radeon: multiple ring allocator v2
On Mon, May 7, 2012 at 7:42 AM, Christian König wrote: > A start over with a new idea for a multiple ring allocator. > Should perform as well as a normal ring allocator as long > as only one ring does something, but falls back to a more > complex algorithm if more complex things start to happen. > > We store the last allocated bo in last, and we always try to allocate > after the last allocated bo. The principle is that in a linear GPU ring > progression what is after last is the oldest bo we allocated and thus > the first one that should no longer be in use by the GPU. > > If that's not the case we skip over the bo after last to the closest > done bo if such a one exists. If none exists and we are not asked to > block we report failure to allocate. > > If we are asked to block we wait on the oldest fences of all > rings. We just wait for any of those fences to complete. > > v2: We need to be able to let hole point to the list_head, otherwise >    try free will never free the first allocation of the list. Also >    stop calling radeon_fence_signalled more than necessary. > > Signed-off-by: Christian König > Signed-off-by: Jerome Glisse This one is NAK, please use my patch. Yes, in my patch we never try to free anything if there is only one sa_bo in the list; if you really care about this it's a one line change: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch Your patch here can enter an infinite loop and never return, holding the lock. See below. Cheers, Jerome > --- > drivers/gpu/drm/radeon/radeon.h | 7 +- > drivers/gpu/drm/radeon/radeon_ring.c | 19 +-- > drivers/gpu/drm/radeon/radeon_sa.c
| 292 > +++--- > 3 files changed, 210 insertions(+), 108 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index 37a7459..cc7f16a 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -385,7 +385,9 @@ struct radeon_bo_list { > struct radeon_sa_manager { > spinlock_t lock; > struct radeon_bo *bo; > - struct list_head sa_bo; > + struct list_head *hole; > + struct list_head flist[RADEON_NUM_RINGS]; > + struct list_head olist; > unsigned size; > uint64_t gpu_addr; > void *cpu_ptr; > @@ -396,7 +398,8 @@ struct radeon_sa_bo; > > /* sub-allocation buffer */ > struct radeon_sa_bo { > - struct list_head list; > + struct list_head olist; > + struct list_head flist; > struct radeon_sa_manager *manager; > unsigned soffset; > unsigned eoffset; > diff --git a/drivers/gpu/drm/radeon/radeon_ring.c > b/drivers/gpu/drm/radeon/radeon_ring.c > index 1748d93..e074ff5 100644 > --- a/drivers/gpu/drm/radeon/radeon_ring.c > +++ b/drivers/gpu/drm/radeon/radeon_ring.c > @@ -204,25 +204,22 @@ int radeon_ib_schedule(struct radeon_device *rdev, > struct radeon_ib *ib) > > int radeon_ib_pool_init(struct radeon_device *rdev) > { > - struct radeon_sa_manager tmp; > int i, r; > > - r = radeon_sa_bo_manager_init(rdev, &tmp, > - RADEON_IB_POOL_SIZE*64*1024, > - RADEON_GEM_DOMAIN_GTT); > - if (r) { > - return r; > - } > - > radeon_mutex_lock(&rdev->ib_pool.mutex); > if (rdev->ib_pool.ready) { > radeon_mutex_unlock(&rdev->ib_pool.mutex); > - radeon_sa_bo_manager_fini(rdev, &tmp); > return 0; > } > > - rdev->ib_pool.sa_manager = tmp; > - INIT_LIST_HEAD(&rdev->ib_pool.sa_manager.sa_bo); > + r = radeon_sa_bo_manager_init(rdev, &rdev->ib_pool.sa_manager, > + RADEON_IB_POOL_SIZE*64*1024, > + RADEON_GEM_DOMAIN_GTT); > + if (r) { > + radeon_mutex_unlock(&rdev->ib_pool.mutex); > + return r; > + } > + > for (i = 0; i < RADEON_IB_POOL_SIZE; i++) { > rdev->ib_pool.ibs[i].fence = NULL; > rdev->ib_pool.ibs[i].idx = i; > diff --git a/drivers/gpu/drm/radeon/radeon_sa.c > b/drivers/gpu/drm/radeon/radeon_sa.c > index 90ee8ad..757a9d4 100644 > --- a/drivers/gpu/drm/radeon/radeon_sa.c > +++ b/drivers/gpu/drm/radeon/radeon_sa.c > @@ -27,21 +27,42 @@ > * Authors: > * Jerome Glisse > */ > +/* Algorithm: > + * > + * We store the last allocated bo in "hole", we always try to allocate > + * after the last allocated bo. Principle is that in a linear GPU ring > + * progression what is after last is the oldest bo we allocated and thus > + * the first one that should no longer be in use by the GPU. > + * > + * If it's not the case we skip over the bo after last to the closest > + * done bo if such one exist. If
[PULL] drm-intel-fixes
Hi Dave, 2 little patches: - One regression fix to disable sdvo hotplug on broken hw. - One patch to upconvert the snb hang workaround from patch v1 to patch v2. Yours, Daniel The following changes since commit d48b97b403d23f6df0b990cee652bdf9a52337a3: Linux 3.4-rc6 (2012-05-06 15:07:32 -0700) are available in the git repository at: git://people.freedesktop.org/~danvet/drm-intel drm-intel-fixes for you to fetch changes up to 2e7a44814d802c8ba479164b8924070cd908d6b5: drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ (2012-05-07 10:37:56 +0200) Daniel Vetter (2): drm/i915: disable sdvo hotplug on i945g/gm drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ drivers/gpu/drm/i915/intel_ringbuffer.c |9 ++--- drivers/gpu/drm/i915/intel_sdvo.c |6 ++ 2 files changed, 12 insertions(+), 3 deletions(-) -- Daniel Vetter Mail: daniel at ffwll.ch Mobile: +41 (0)79 365 57 48
[PATCH 04/20] drm/radeon: convert fence to uint64_t v4
On Mon, May 7, 2012 at 7:42 AM, Christian König wrote: > From: Jerome Glisse > > This converts the fence to use a uint64_t sequence number; the intention is > to use the fact that uint64_t is big enough that we don't need to > care about wrap around. > > Tested with and without writeback using 0xF000 as initial > fence sequence and thus allowing to test the wrap around from > 32bits to 64bits. > > v2: Add comment about possible race btw CPU & GPU, add comment >    stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET. >    Read fence sequence in reverse order of GPU writing them so we >    mitigate the race btw CPU and GPU. > > v3: Drop the need for the ring to emit the 64bits fence, and just have >    each ring emit the lower 32bits of the fence sequence. We >    handle the wrap over 32bits in fence_process. > > v4: Just a small optimization: don't reread the last_seq value >    if the loop restarts, since we already know its value anyway. >    Also start at zero not one for the seq value and use pre instead >    of post increment in emit, otherwise wait_empty will deadlock. Why change that? v3 was already good, no deadlock. I started at 1 especially for that: a signaled fence is set to 0 so it always compares as signaled. Just using preincrement is exactly like starting at one. I don't see the need for this change, but if it makes you happy. Cheers, Jerome > > Signed-off-by: Jerome Glisse > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/radeon.h | 39 ++- > drivers/gpu/drm/radeon/radeon_fence.c | 116 > +++-- > drivers/gpu/drm/radeon/radeon_ring.c |
9 ++- > 3 files changed, 107 insertions(+), 57 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index e99ea81..cdf46bc 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -100,28 +100,32 @@ extern int radeon_lockup_timeout; > * Copy from radeon_drv.h so we don't have to include both and have > conflicting > * symbol; > */ > -#define RADEON_MAX_USEC_TIMEOUT  100000 /* 100 ms */ > -#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) > +#define RADEON_MAX_USEC_TIMEOUT      100000 /* 100 ms */ > +#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) > /* RADEON_IB_POOL_SIZE must be a power of 2 */ > -#define RADEON_IB_POOL_SIZE  16 > -#define RADEON_DEBUGFS_MAX_COMPONENTS  32 > -#define RADEONFB_CONN_LIMIT  4 > -#define RADEON_BIOS_NUM_SCRATCH  8 > +#define RADEON_IB_POOL_SIZE      16 > +#define RADEON_DEBUGFS_MAX_COMPONENTS      32 > +#define RADEONFB_CONN_LIMIT      4 > +#define RADEON_BIOS_NUM_SCRATCH      8 > > /* max number of rings */ > -#define RADEON_NUM_RINGS 3 > +#define RADEON_NUM_RINGS      3 > + > +/* fence seq are set to this number when signaled */ > +#define RADEON_FENCE_SIGNALED_SEQ      0LL > +#define RADEON_FENCE_NOTEMITED_SEQ      (~0LL) > > /* internal ring indices */ > /* r1xx+ has gfx CP ring */ > -#define RADEON_RING_TYPE_GFX_INDEX  0 > +#define RADEON_RING_TYPE_GFX_INDEX      0 > > /* cayman has 2 compute CP rings */ > -#define CAYMAN_RING_TYPE_CP1_INDEX 1 > -#define CAYMAN_RING_TYPE_CP2_INDEX 2 > +#define CAYMAN_RING_TYPE_CP1_INDEX      1 > +#define CAYMAN_RING_TYPE_CP2_INDEX      2 > > /* hardcode those limit for now */ > -#define RADEON_VA_RESERVED_SIZE  (8 << 20) > -#define RADEON_IB_VM_MAX_SIZE  (64 << 10) > +#define RADEON_VA_RESERVED_SIZE      (8 << 20) > +#define RADEON_IB_VM_MAX_SIZE      (64 << 10) > > /* > * Errata workarounds. > @@ -254,8 +258,9 @@ struct radeon_fence_driver { > uint32_t scratch_reg; > uint64_t gpu_addr; > volatile uint32_t *cpu_addr; > - atomic_t seq; > - uint32_t last_seq; > + /* seq is protected by ring emission lock */ > + uint64_t seq; > + atomic64_t last_seq; > unsigned long last_activity; > wait_queue_head_t queue; > struct list_head emitted; > @@ -268,11 +273,9 @@ struct radeon_fence { > struct kref kref; > struct list_head list; > /* protected by radeon_fence.lock */ > - uint32_t seq; > - bool emitted; > - bool signaled; > + uint64_t seq; > /* RB, DMA, etc. */ > - int ring; > + unsigned ring; > struct radeon_semaphore *semaphore; > }; > > diff --git a/drivers/gpu/drm/radeon/radeon_fence.c >
SA and other Patches.
On Mon, May 7, 2012 at 7:42 AM, Christian König wrote: > Hi Jerome & everybody on the list, > > this gathers together every patch we developed over the last week or so and > which is not already in drm-next. > > I've run quite some tests with them yesterday and today and as far as I can > see hammered out every known bug. For the SA allocator I reverted to tracking > the hole pointer instead of just the last allocation, cause otherwise we will > never release the first allocation on the list. Glxgears now even keeps > happily running if I deadlock on the non-GFX rings on purpose. Now we will release the first entry even if we use the last allocated ptr; I believe it's cleaner to use the last ptr. > Please take a second look at them and if nobody objects any more we should > commit them to drm-next. > > Cheers, > Christian. > Cheers, Jerome
[RFC v2 5/5] drm: Add NVIDIA Tegra support
On 05/07/2012 02:50 AM, Terje Bergström wrote: > On 25.04.2012 12:45, Thierry Reding wrote: > >> +/ { >> + ... >> + >> + /* host1x */ >> + host1x: host1x@5000 { >> + compatible = "nvidia,tegra20-host1x"; >> + reg = <0x5000 0x00024000>; >> + interrupts = <0 64 0x04 /* cop syncpt */ >> + 0 65 0x04 /* mpcore syncpt */ >> + 0 66 0x04 /* cop general */ >> + 0 67 0x04>; /* mpcore general */ >> + }; >> + >> + /* video-encoding/decoding */ >> + mpe@5404 { >> + reg = <0x5404 0x0004>; >> + interrupts = <0 68 0x04>; >> + }; >> + > > (...) > > Hi Thierry, > > I still have lots of questions regarding how device trees work. I'm now > just trying to match the device tree structure with hardware - let me > know if that goes wrong. > > There's a hierarchy in the hardware, which should be represented in the > device trees. All of the hardware are client modules for host1x - with > the exception of host1x obviously. The CPU has two methods for accessing the > hardware: the clients' register apertures and host1x channels. Both of these > operate via host1x hardware. > > We should define a host1x bus in the device tree, and move all nodes > except host1x under that bus. I think the host1x node /is/ that bus.
[Bug 45018] [bisected] rendering regression since added support for virtual address space on cayman v11
https://bugs.freedesktop.org/show_bug.cgi?id=45018 --- Comment #55 from Michel Dänzer 2012-05-07 03:07:07 PDT --- (In reply to comment #54) > On latest git (3cd7bee48f7caf7850ea64d40f43875d4c975507), in > src/gallium/drivers/r600/r600_hw_context.c, on line 194, shouldn't it be: > - int offset > + unsigned offset That might be slightly better, but it doesn't really matter. It's the offset from the start of the MMIO aperture, so it would only matter if the register aperture grew beyond 2GB, which we're almost 5 orders of magnitude short of. Very unlikely. > Also, at line 1259, I'm not quite sure why it is shifted by 2. Most of the > time, offset is usually shifted by 8. It's just converting offset from units of 32 bits to bytes. > Just looking through the code to see if something could have been missed... Right now it would be most useful to track down why radeon_bomgr_find_va / radeon_bomgr_force_va ends up returning the offset the kernel complains about. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[RFC][PATCH] drm/radeon/hdmi: define struct for AVI infoframe
On Sun, 2012-05-06 at 18:29 +0200, Rafał Miłecki wrote: > 2012/5/6 Dave Airlie : > > On Sun, May 6, 2012 at 5:19 PM, Rafał Miłecki wrote: > >> 2012/5/6 Rafał Miłecki : > >>> diff --git a/drivers/gpu/drm/radeon/r600_hdmi.c > >>> b/drivers/gpu/drm/radeon/r600_hdmi.c > >>> index c308432..b14c90a 100644 > >>> --- a/drivers/gpu/drm/radeon/r600_hdmi.c > >>> +++ b/drivers/gpu/drm/radeon/r600_hdmi.c > >>> @@ -134,78 +134,22 @@ static void r600_hdmi_infoframe_checksum(uint8_t > >>> packetType, > >>> } > >>> > >>> /* > >>> - * build a HDMI Video Info Frame > >>> + * Upload a HDMI AVI Infoframe > >>> */ > >>> -static void r600_hdmi_videoinfoframe( > >>> - struct drm_encoder *encoder, > >>> - enum r600_hdmi_color_format color_format, > >>> - int active_information_present, > >>> - uint8_t active_format_aspect_ratio, > >>> - uint8_t scan_information, > >>> - uint8_t colorimetry, > >>> - uint8_t ex_colorimetry, > >>> - uint8_t quantization, > >>> - int ITC, > >>> - uint8_t picture_aspect_ratio, > >>> - uint8_t video_format_identification, > >>> - uint8_t pixel_repetition, > >>> - uint8_t non_uniform_picture_scaling, > >>> - uint8_t bar_info_data_valid, > >>> - uint16_t top_bar, > >>> - uint16_t bottom_bar, > >>> - uint16_t left_bar, > >>> - uint16_t right_bar > >>> -) > >> > >> In case someone wonders about the reason: I think it's really ugly to > >> have a function taking 18 arguments, 17 of them related to the > >> infoframe. It makes much more sense to me to use a struct for that. > >> While working on that I thought it's reasonable to prepare a nice > >> bitfield __packed struct ready-to-be-written to the GPU registers. > > > > won't this screw up on other endian machines? > > Hm, maybe it can. Is there some easy way to handle it correctly? Some trick like > __le8 foo: 3 > __le8 bar: 1 > maybe? Not really. The memory layout of bitfields is basically completely up to the C implementation, so IMHO they're just inadequate for describing fixed memory layouts.
-- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #1 from Mike Mestnik 2012-05-06 20:47:52 PDT --- I got this same error with llvm1.3-rc1 and rc2.
[Bug 49110] debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 --- Comment #4 from Mike Mestnik 2012-05-06 20:44:15 PDT --- This bug cloned to: https://bugs.freedesktop.org/show_bug.cgi?id=49567 No rule to make target libradeon.a, needed by libr600.a.
[Bug 49110] debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 Mike Mestnik changed: See Also: added https://bugs.freedesktop.org/show_bug.cgi?id=49567
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 Mike Mestnik changed: Depends on: removed 49110
[Bug 49110] debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 Mike Mestnik changed: Blocks: removed 49567
[Bug 49110] debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 Mike Mestnik changed: Priority: medium → low
[Bug 49110] debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 Mike Mestnik changed: Summary: "AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope" changed to "debug build: AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope"
[Bug 49567] New: No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 Bug #: 49567 Summary: No rule to make target libradeon.a, needed by libr600.a. Classification: Unclassified Product: Mesa Version: git Platform: x86 (IA32) OS/Version: Linux (All) Status: NEW Severity: major Priority: medium Component: Drivers/Gallium/r600 AssignedTo: dri-devel@lists.freedesktop.org ReportedBy: cheako+bugs_freedesktop_org@mikemestnik.net CC: cheako+bugs_freedesktop_org@mikemestnik.net, fabio.ped@libero.it Depends on: 49110 +++ This bug was initially created as a clone of Bug #49110 +++ make[5]: *** No rule to make target `../../../../../../src/gallium/drivers/radeon/libradeon.a', needed by `libr600.a'. Stop. Full log at: https://launchpadlibrarian.net/103127393/buildlog_ubuntu-precise-i386.mesa_8.1~git1204261417.a2f7ec~gd~p_FAILEDTOBUILD.txt.gz https://launchpadlibrarian.net/104275700/buildlog_ubuntu-precise-i386.mesa_8.1~git20120504.5cc4b4aa-1ubuntu0cheako2~precise_FAILEDTOBUILD.txt.gz
[Bug 49110] AMDILCFGStructurizer.cpp:1751:3: error: 'isCurrentDebugType' was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=49110 Mike Mestnik changed: Blocks: added 49567
Re: [RFC v2 5/5] drm: Add NVIDIA Tegra support
On 25.04.2012 12:45, Thierry Reding wrote: +/ { + ... + + /* host1x */ + host1x: host1x@5000 { + compatible = "nvidia,tegra20-host1x"; + reg = <0x5000 0x00024000>; + interrupts = <0 64 0x04 /* cop syncpt */ + 0 65 0x04 /* mpcore syncpt */ + 0 66 0x04 /* cop general */ + 0 67 0x04>; /* mpcore general */ + }; + + /* video-encoding/decoding */ + mpe@5404 { + reg = <0x5404 0x0004>; + interrupts = <0 68 0x04>; + }; + (...) Hi Thierry, I still have lots of questions regarding how device trees work. I'm now just trying to match the device tree structure with hardware - let me know if that goes wrong. There's a hierarchy in the hardware, which should be represented in the device trees. All of the hardware are client modules for host1x - with the exception of host1x obviously. The CPU has two methods for accessing the hardware: the clients' register apertures and host1x channels. Both of these operate via host1x hardware. We should define a host1x bus in the device tree, and move all nodes except host1x under that bus. This will help us in the long run, as we will have multiple drivers (drm, v4l2) each accessing hardware under host1x. We will need to model the bus, and the bus_type will need to take over responsibilities of managing the common resources. When we are clocking hardware, whenever we want to access the display's register aperture, host1x needs to be clocked. + /* graphics host */ + graphics@5400 { + compatible = "nvidia,tegra20-graphics"; + + #address-cells = <1>; + #size-cells = <1>; + ranges; + + display-controllers = <&disp1 &disp2>; + carveout = <0x0e00 0x0200>; + host1x = <&host1x>; + gart = <&gart>; + + connectors { + #address-cells = <1>; + #size-cells = <0>; + + connector@0 { + reg = <0>; + edid = /incbin/("machine.edid"); + output = lvds; + }; + + connector@1 { + reg = <1>; + output = hdmi; + ddc = <&i2c2>; + + hpd-gpio = <&gpio 111 0>; /* PN7 */ + }; + }; + }; +}; I'm not sure what this node means.
The register range from 5400 onwards is actually the one that you just described in the nodes of the individual client modules. Why is it represented here again? Terje ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[git pull] drm fixes
Two fixes from Intel, one a regression, one because I merged an early version of a fix. Also the nouveau revert of the i2c code that was tested on the list. Dave. The following changes since commit febb72a6e4cc6c8cffcc1ea649a3fb364f1ea432: IA32 emulation: Fix build problem for modular ia32 a.out support (2012-05-06 18:26:20 -0700) are available in the git repository at: git://people.freedesktop.org/~airlied/linux drm-fixes Ben Skeggs (1): drm/nouveau/i2c: resume use of i2c-algo-bit, rather than custom stack Daniel Vetter (2): drm/i915: disable sdvo hotplug on i945g/gm drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ Dave Airlie (1): Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes drivers/gpu/drm/i915/intel_ringbuffer.c |9 +- drivers/gpu/drm/i915/intel_sdvo.c |6 + drivers/gpu/drm/nouveau/nouveau_i2c.c | 199 --- drivers/gpu/drm/nouveau/nouveau_i2c.h |1 + 4 files changed, 34 insertions(+), 181 deletions(-)
SA and other Patches.
Hi Jerome & everybody on the list, this gathers together every patch we developed over the last week or so and which is not already in drm-next. I've run quite some tests with them yesterday and today and as far as I can see hammered out every known bug. For the SA allocator I reverted to tracking the hole pointer instead of just the last allocation, cause otherwise we will never release the first allocation on the list. Glxgears now even keeps happily running if I deadlock on the non-GFX rings on purpose. Please take a second look at them and if nobody objects any more we should commit them to drm-next. Cheers, Christian.
[PATCH 02/20] drm/radeon: clarify and extend wb setup on APUs and NI+ asics
From: Alex Deucher alexander.deuc...@amd.com

Use the family rather than a DCE check for clarity; also always use wb on APUs, there will never be AGP variants.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com Reviewed-by: Christian König christian.koe...@amd.com --- drivers/gpu/drm/radeon/radeon_device.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index d18f0c4..ff28210 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -241,8 +241,8 @@ int radeon_wb_init(struct radeon_device *rdev) rdev-wb.use_event = true; } } - /* always use writeback/events on NI */ - if (ASIC_IS_DCE5(rdev)) { + /* always use writeback/events on NI, APUs */ + if (rdev-family = CHIP_PALM) { rdev-wb.enabled = true; rdev-wb.use_event = true; } -- 1.7.5.4
[PATCH 09/20] drm/radeon: add sub allocator debugfs file
Dumping the current allocations. Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon_object.h |5 + drivers/gpu/drm/radeon/radeon_ring.c | 22 ++ drivers/gpu/drm/radeon/radeon_sa.c | 14 ++ 3 files changed, 41 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h index c120ab9..d9fca1e 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -172,5 +172,10 @@ extern int radeon_sa_bo_new(struct radeon_device *rdev, unsigned size, unsigned align); extern void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo); +#if defined(CONFIG_DEBUG_FS) +extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, +struct seq_file *m); +#endif + #endif diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 116be5e..f49c9c0 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -601,6 +601,23 @@ static int radeon_debugfs_ib_info(struct seq_file *m, void *data) static struct drm_info_list radeon_debugfs_ib_list[RADEON_IB_POOL_SIZE]; static char radeon_debugfs_ib_names[RADEON_IB_POOL_SIZE][32]; static unsigned radeon_debugfs_ib_idx[RADEON_IB_POOL_SIZE]; + +static int radeon_debugfs_sa_info(struct seq_file *m, void *data) +{ + struct drm_info_node *node = (struct drm_info_node *) m-private; + struct drm_device *dev = node-minor-dev; + struct radeon_device *rdev = dev-dev_private; + + radeon_sa_bo_dump_debug_info(rdev-ib_pool.sa_manager, m); + + return 0; + +} + +static struct drm_info_list radeon_debugfs_sa_list[] = { +{radeon_sa_info, radeon_debugfs_sa_info, 0, NULL}, +}; + #endif int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *ring) @@ -627,6 +644,11 @@ int radeon_debugfs_ib_init(struct radeon_device *rdev) { #if defined(CONFIG_DEBUG_FS) unsigned i; + int r; + + r = 
radeon_debugfs_add_files(rdev, radeon_debugfs_sa_list, 1); + if (r) + return r; for (i = 0; i RADEON_IB_POOL_SIZE; i++) { sprintf(radeon_debugfs_ib_names[i], radeon_ib_%04u, i); diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index aed0a8c..1db0568 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -193,3 +193,17 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo) list_del_init(sa_bo-list); spin_unlock(sa_bo-manager-lock); } + +#if defined(CONFIG_DEBUG_FS) +void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, + struct seq_file *m) +{ + struct radeon_sa_bo *i; + + spin_lock(sa_manager-lock); + list_for_each_entry(i, sa_manager-sa_bo, list) { + seq_printf(m, offset %08d: size %4d\n, i-offset, i-size); + } + spin_unlock(sa_manager-lock); +} +#endif -- 1.7.5.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 12/20] drm/radeon: define new SA interface v3
Define the interface without modifying the allocation algorithm in any way. v2: rebase on top of fence new uint64 patch v3: add ring to debugfs output Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon.h |1 + drivers/gpu/drm/radeon/radeon_gart.c |6 +- drivers/gpu/drm/radeon/radeon_object.h|5 +- drivers/gpu/drm/radeon/radeon_ring.c |8 ++-- drivers/gpu/drm/radeon/radeon_sa.c| 60 drivers/gpu/drm/radeon/radeon_semaphore.c |2 +- 6 files changed, 63 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 9374ab1..ada70d1 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -398,6 +398,7 @@ struct radeon_sa_bo { struct radeon_sa_manager*manager; unsignedsoffset; unsignedeoffset; + struct radeon_fence *fence; }; /* diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index c5789ef..53dba8e 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -326,7 +326,7 @@ static void radeon_vm_unbind_locked(struct radeon_device *rdev, rdev-vm_manager.use_bitmap = ~(1 vm-id); list_del_init(vm-list); vm-id = -1; - radeon_sa_bo_free(rdev, vm-sa_bo); + radeon_sa_bo_free(rdev, vm-sa_bo, NULL); vm-pt = NULL; list_for_each_entry(bo_va, vm-va, vm_list) { @@ -395,7 +395,7 @@ int radeon_vm_bind(struct radeon_device *rdev, struct radeon_vm *vm) retry: r = radeon_sa_bo_new(rdev, rdev-vm_manager.sa_manager, vm-sa_bo, RADEON_GPU_PAGE_ALIGN(vm-last_pfn * 8), -RADEON_GPU_PAGE_SIZE); +RADEON_GPU_PAGE_SIZE, false); if (r) { if (list_empty(rdev-vm_manager.lru_vm)) { return r; @@ -426,7 +426,7 @@ retry_id: /* do hw bind */ r = rdev-vm_manager.funcs-bind(rdev, vm, id); if (r) { - radeon_sa_bo_free(rdev, vm-sa_bo); + radeon_sa_bo_free(rdev, vm-sa_bo, NULL); return r; } rdev-vm_manager.use_bitmap |= 1 id; diff --git a/drivers/gpu/drm/radeon/radeon_object.h 
b/drivers/gpu/drm/radeon/radeon_object.h index 4fc7f07..befec7d 100644 --- a/drivers/gpu/drm/radeon/radeon_object.h +++ b/drivers/gpu/drm/radeon/radeon_object.h @@ -169,9 +169,10 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev, extern int radeon_sa_bo_new(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager, struct radeon_sa_bo **sa_bo, - unsigned size, unsigned align); + unsigned size, unsigned align, bool block); extern void radeon_sa_bo_free(struct radeon_device *rdev, - struct radeon_sa_bo **sa_bo); + struct radeon_sa_bo **sa_bo, + struct radeon_fence *fence); #if defined(CONFIG_DEBUG_FS) extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager, struct seq_file *m); diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 45adb37..1748d93 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -85,7 +85,7 @@ bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib) if (ib-fence ib-fence-seq RADEON_FENCE_NOTEMITED_SEQ) { if (radeon_fence_signaled(ib-fence)) { radeon_fence_unref(ib-fence); - radeon_sa_bo_free(rdev, ib-sa_bo); + radeon_sa_bo_free(rdev, ib-sa_bo, NULL); done = true; } } @@ -124,7 +124,7 @@ retry: if (rdev-ib_pool.ibs[idx].fence == NULL) { r = radeon_sa_bo_new(rdev, rdev-ib_pool.sa_manager, rdev-ib_pool.ibs[idx].sa_bo, -size, 256); +size, 256, false); if (!r) { *ib = rdev-ib_pool.ibs[idx]; (*ib)-ptr = radeon_sa_bo_cpu_addr((*ib)-sa_bo); @@ -173,7 +173,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib) } radeon_mutex_lock(rdev-ib_pool.mutex); if (tmp-fence tmp-fence-seq == RADEON_FENCE_NOTEMITED_SEQ) { - radeon_sa_bo_free(rdev, tmp-sa_bo); + radeon_sa_bo_free(rdev, tmp-sa_bo, NULL);
[PATCH 13/20] drm/radeon: use one wait queue for all rings add fence_wait_any v2
From: Jerome Glisse jgli...@redhat.com

Use one wait queue for all rings. When one ring progresses, the others likely do too, and we are not expecting to have a lot of waiters anyway. Also add a fence_wait_any that will wait until the first fence in the fence array (one fence per ring) is signaled. This allows waiting on all rings.

v2: some minor cleanups and improvements.

Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/radeon.h |5 +- drivers/gpu/drm/radeon/radeon_fence.c | 163 - 2 files changed, 162 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index ada70d1..37a7459 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -262,7 +262,6 @@ struct radeon_fence_driver { uint64_tseq; atomic64_t last_seq; unsigned long last_activity; - wait_queue_head_t queue; boolinitialized; }; @@ -286,6 +285,9 @@ bool radeon_fence_signaled(struct radeon_fence *fence); int radeon_fence_wait(struct radeon_fence *fence, bool interruptible); int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring); int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring); +int radeon_fence_wait_any(struct radeon_device *rdev, + struct radeon_fence **fences, + bool intr); struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence); void radeon_fence_unref(struct radeon_fence **fence); unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring); @@ -1534,6 +1536,7 @@ struct radeon_device { struct radeon_scratch scratch; struct radeon_mman mman; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; + wait_queue_head_t fence_queue; struct radeon_semaphore_driver semaphore_drv; struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 8034b42..45d4e6e 100644 ---
a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -222,11 +222,11 @@ static int radeon_fence_wait_seq(struct radeon_device *rdev, u64 target_seq, trace_radeon_fence_wait_begin(rdev-ddev, seq); radeon_irq_kms_sw_irq_get(rdev, ring); if (intr) { - r = wait_event_interruptible_timeout(rdev-fence_drv[ring].queue, + r = wait_event_interruptible_timeout(rdev-fence_queue, (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), timeout); } else { - r = wait_event_timeout(rdev-fence_drv[ring].queue, + r = wait_event_timeout(rdev-fence_queue, (signaled = radeon_fence_seq_signaled(rdev, target_seq, ring)), timeout); } @@ -300,6 +300,159 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr) return 0; } +bool radeon_fence_any_seq_signaled(struct radeon_device *rdev, u64 *seq) +{ + unsigned i; + + for (i = 0; i RADEON_NUM_RINGS; ++i) { + if (seq[i] radeon_fence_seq_signaled(rdev, seq[i], i)) { + return true; + } + } + return false; +} + +static int radeon_fence_wait_any_seq(struct radeon_device *rdev, +u64 *target_seq, bool intr) +{ + unsigned long timeout, last_activity, tmp; + unsigned i, ring = RADEON_NUM_RINGS; + bool signaled; + int r; + + for (i = 0, last_activity = 0; i RADEON_NUM_RINGS; ++i) { + if (!target_seq[i]) { + continue; + } + + /* use the most recent one as indicator */ + if (time_after(rdev-fence_drv[i].last_activity, last_activity)) { + last_activity = rdev-fence_drv[i].last_activity; + } + + /* For lockup detection just pick the lowest ring we are +* actively waiting for +*/ + if (i ring) { + ring = i; + } + } + + /* nothing to wait for ? */ + if (ring == RADEON_NUM_RINGS) { + return 0; + } + + while (!radeon_fence_any_seq_signaled(rdev, target_seq)) { + timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT; + if (time_after(last_activity, timeout)) { + /* the normal case, timeout is somewhere
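The core check behind fence_wait_any can be sketched in a few lines of plain C (a userspace illustration, not the driver code; `NUM_RINGS` and the array layout are assumptions). A zero target means the caller is not waiting on that ring, and the wait loop exits as soon as any remaining target is covered by that ring's last signaled sequence:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_RINGS 3  /* illustrative; the driver uses RADEON_NUM_RINGS */

/* Userspace sketch of the "any fence signaled" predicate used by the
 * wait-any loop: target[i] == 0 means "not waiting on ring i", and
 * last[i] is the highest sequence known signaled on ring i. */
static bool any_seq_signaled(const uint64_t *target, const uint64_t *last)
{
    unsigned i;

    for (i = 0; i < NUM_RINGS; ++i) {
        if (target[i] && last[i] >= target[i])
            return true;
    }
    return false;
}
```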
[PATCH 15/20] drm/radeon: simplify semaphore handling v2
From: Jerome Glisse jgli...@redhat.com Directly use the suballocator to get small chunks of memory. It's equally fast and doesn't crash when we encounter a GPU reset. v2: rebased on new SA interface. Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/evergreen.c|1 - drivers/gpu/drm/radeon/ni.c |1 - drivers/gpu/drm/radeon/r600.c |1 - drivers/gpu/drm/radeon/radeon.h | 29 +- drivers/gpu/drm/radeon/radeon_device.c|2 - drivers/gpu/drm/radeon/radeon_fence.c |2 +- drivers/gpu/drm/radeon/radeon_semaphore.c | 137 + drivers/gpu/drm/radeon/radeon_test.c |4 +- drivers/gpu/drm/radeon/rv770.c|1 - drivers/gpu/drm/radeon/si.c |1 - 10 files changed, 30 insertions(+), 149 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index ecc29bc..7e7ac3d 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3550,7 +3550,6 @@ void evergreen_fini(struct radeon_device *rdev) evergreen_pcie_gart_fini(rdev); r600_vram_scratch_fini(rdev); radeon_gem_fini(rdev); - radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); radeon_agp_fini(rdev); radeon_bo_fini(rdev); diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c index 9cd2657..107b217 100644 --- a/drivers/gpu/drm/radeon/ni.c +++ b/drivers/gpu/drm/radeon/ni.c @@ -1744,7 +1744,6 @@ void cayman_fini(struct radeon_device *rdev) cayman_pcie_gart_fini(rdev); r600_vram_scratch_fini(rdev); radeon_gem_fini(rdev); - radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); radeon_bo_fini(rdev); radeon_atombios_fini(rdev); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 87a2333..0ae2d2d 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2658,7 +2658,6 @@ void r600_fini(struct radeon_device *rdev) r600_vram_scratch_fini(rdev); radeon_agp_fini(rdev); radeon_gem_fini(rdev); - 
radeon_semaphore_driver_fini(rdev); radeon_fence_driver_fini(rdev); radeon_bo_fini(rdev); radeon_atombios_fini(rdev); diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index cc7f16a..45164e1 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -434,34 +434,13 @@ int radeon_mode_dumb_destroy(struct drm_file *file_priv, /* * Semaphores. */ -struct radeon_ring; - -#defineRADEON_SEMAPHORE_BO_SIZE256 - -struct radeon_semaphore_driver { - rwlock_tlock; - struct list_headbo; -}; - -struct radeon_semaphore_bo; - /* everything here is constant */ struct radeon_semaphore { - struct list_headlist; + struct radeon_sa_bo *sa_bo; + signed waiters; uint64_tgpu_addr; - uint32_t*cpu_ptr; - struct radeon_semaphore_bo *bo; }; -struct radeon_semaphore_bo { - struct list_headlist; - struct radeon_ib*ib; - struct list_headfree; - struct radeon_semaphore semaphores[RADEON_SEMAPHORE_BO_SIZE/8]; - unsignednused; -}; - -void radeon_semaphore_driver_fini(struct radeon_device *rdev); int radeon_semaphore_create(struct radeon_device *rdev, struct radeon_semaphore **semaphore); void radeon_semaphore_emit_signal(struct radeon_device *rdev, int ring, @@ -473,7 +452,8 @@ int radeon_semaphore_sync_rings(struct radeon_device *rdev, bool sync_to[RADEON_NUM_RINGS], int dst_ring); void radeon_semaphore_free(struct radeon_device *rdev, - struct radeon_semaphore *semaphore); + struct radeon_semaphore *semaphore, + struct radeon_fence *fence); /* * GART structures, functions helpers @@ -1540,7 +1520,6 @@ struct radeon_device { struct radeon_mman mman; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; wait_queue_head_t fence_queue; - struct radeon_semaphore_driver semaphore_drv; struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; struct radeon_ib_pool ib_pool; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index b827b2e..48876c1 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c 
+++ b/drivers/gpu/drm/radeon/radeon_device.c @@
[PATCH 17/20] drm/radeon: immediately free ttm-move semaphore
We can now protect the semaphore RAM with a fence, so free it immediately.

Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon_ttm.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 5e3d54d..0f6aee8 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -223,6 +223,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo, struct radeon_device *rdev; uint64_t old_start, new_start; struct radeon_fence *fence, *old_fence; + struct radeon_semaphore *sem = NULL; int r; rdev = radeon_get_rdev(bo-bdev); @@ -272,15 +273,16 @@ static int radeon_move_blit(struct ttm_buffer_object *bo, bool sync_to_ring[RADEON_NUM_RINGS] = { }; sync_to_ring[old_fence-ring] = true; - r = radeon_semaphore_create(rdev, fence-semaphore); + r = radeon_semaphore_create(rdev, sem); if (r) { radeon_fence_unref(fence); return r; } - r = radeon_semaphore_sync_rings(rdev, fence-semaphore, + r = radeon_semaphore_sync_rings(rdev, sem, sync_to_ring, fence-ring); if (r) { + radeon_semaphore_free(rdev, sem, NULL); radeon_fence_unref(fence); return r; } @@ -292,6 +294,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo, /* FIXME: handle copy error */ r = ttm_bo_move_accel_cleanup(bo, (void *)fence, NULL, evict, no_wait_reserve, no_wait_gpu, new_mem); + radeon_semaphore_free(rdev, sem, fence); radeon_fence_unref(fence); return r; } -- 1.7.5.4
[PATCH 14/20] drm/radeon: multiple ring allocator v2
A startover with a new idea for a multiple ring allocator. Should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen.

We store the last allocated bo in last, and we always try to allocate after the last allocated bo. The principle is that in a linear GPU ring progression, what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU.

If that's not the case, we skip over the bo after last to the closest done bo, if such a one exists. If none exists and we are not asked to block, we report failure to allocate.

If we are asked to block, we wait on the oldest fence of each ring. We just wait for any of those fences to complete.

v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary.

Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/radeon.h |7 +- drivers/gpu/drm/radeon/radeon_ring.c | 19 +-- drivers/gpu/drm/radeon/radeon_sa.c | 292 +++--- 3 files changed, 210 insertions(+), 108 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 37a7459..cc7f16a 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -385,7 +385,9 @@ struct radeon_bo_list { struct radeon_sa_manager { spinlock_t lock; struct radeon_bo*bo; - struct list_headsa_bo; + struct list_head*hole; + struct list_headflist[RADEON_NUM_RINGS]; + struct list_headolist; unsignedsize; uint64_tgpu_addr; void*cpu_ptr; @@ -396,7 +398,8 @@ struct radeon_sa_bo; /* sub-allocation buffer */ struct radeon_sa_bo { - struct list_headlist; + struct list_headolist; + struct list_headflist; struct radeon_sa_manager*manager; unsignedsoffset; unsignedeoffset; diff --git
a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 1748d93..e074ff5 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -204,25 +204,22 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) int radeon_ib_pool_init(struct radeon_device *rdev) { - struct radeon_sa_manager tmp; int i, r; - r = radeon_sa_bo_manager_init(rdev, tmp, - RADEON_IB_POOL_SIZE*64*1024, - RADEON_GEM_DOMAIN_GTT); - if (r) { - return r; - } - radeon_mutex_lock(rdev-ib_pool.mutex); if (rdev-ib_pool.ready) { radeon_mutex_unlock(rdev-ib_pool.mutex); - radeon_sa_bo_manager_fini(rdev, tmp); return 0; } - rdev-ib_pool.sa_manager = tmp; - INIT_LIST_HEAD(rdev-ib_pool.sa_manager.sa_bo); + r = radeon_sa_bo_manager_init(rdev, rdev-ib_pool.sa_manager, + RADEON_IB_POOL_SIZE*64*1024, + RADEON_GEM_DOMAIN_GTT); + if (r) { + radeon_mutex_unlock(rdev-ib_pool.mutex); + return r; + } + for (i = 0; i RADEON_IB_POOL_SIZE; i++) { rdev-ib_pool.ibs[i].fence = NULL; rdev-ib_pool.ibs[i].idx = i; diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 90ee8ad..757a9d4 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -27,21 +27,42 @@ * Authors: *Jerome Glisse gli...@freedesktop.org */ +/* Algorithm: + * + * We store the last allocated bo in hole, we always try to allocate + * after the last allocated bo. Principle is that in a linear GPU ring + * progression was is after last is the oldest bo we allocated and thus + * the first one that should no longer be in use by the GPU. + * + * If it's not the case we skip over the bo after last to the closest + * done bo if such one exist. If none exist and we are not asked to + * block we report failure to allocate. + * + * If we are asked to block we wait on all the oldest fence of all + * rings. We just wait for any of those fence to complete. 
+ */ #include drmP.h #include drm.h #include radeon.h +static void radeon_sa_bo_remove_locked(struct radeon_sa_bo *sa_bo); +static void radeon_sa_bo_try_free(struct radeon_sa_manager *sa_manager); + int radeon_sa_bo_manager_init(struct radeon_device *rdev, struct radeon_sa_manager *sa_manager,
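The linear placement rule the algorithm comment describes (allocate right after the last bo, wrap to the start when the end of the buffer is reached, otherwise wait or fail) reduces to a little offset arithmetic. Here is a simplified userspace sketch, with the free region described only by the end of the hole and the start of the oldest live allocation; all names are invented here:

```c
/* Illustrative sketch of the linear-ring placement rule, not the
 * kernel code.  The free region runs from hole_end to oldest_start,
 * possibly wrapping around the end of a buffer of `total` bytes.
 * Returns 1 and sets *soffset on success, 0 if the caller must wait
 * for fences (or fail, if non-blocking). */
static int try_place(unsigned hole_end, unsigned oldest_start,
                     unsigned size, unsigned total, unsigned *soffset)
{
    if (hole_end < oldest_start) {
        /* free space is the gap [hole_end, oldest_start) */
        if (size <= oldest_start - hole_end) {
            *soffset = hole_end;
            return 1;
        }
        return 0;
    }
    /* free space wraps around the end of the buffer */
    if (size <= total - hole_end) {
        *soffset = hole_end;
        return 1;
    }
    if (size <= oldest_start) {
        *soffset = 0;   /* wrap to the start */
        return 1;
    }
    return 0;           /* caller must block on the oldest fences */
}
```

When try_place fails, the real allocator either skips ahead to the closest already-signaled bo or, if asked to block, sleeps on the oldest fence of every ring as described in the commit message.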
[PATCH 16/20] drm/radeon: rip out the ib pool
From: Jerome Glisse jgli...@redhat.com It isn't necessary any more and the suballocator seems to perform even better. Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/radeon.h | 17 +-- drivers/gpu/drm/radeon/radeon_device.c|1 - drivers/gpu/drm/radeon/radeon_gart.c | 12 +- drivers/gpu/drm/radeon/radeon_ring.c | 241 - drivers/gpu/drm/radeon/radeon_semaphore.c |2 +- 5 files changed, 71 insertions(+), 202 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 45164e1..6170307 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -625,7 +625,6 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc); struct radeon_ib { struct radeon_sa_bo *sa_bo; - unsignedidx; uint32_tlength_dw; uint64_tgpu_addr; uint32_t*ptr; @@ -634,18 +633,6 @@ struct radeon_ib { boolis_const_ib; }; -/* - * locking - - * mutex protects scheduled_ibs, ready, alloc_bm - */ -struct radeon_ib_pool { - struct radeon_mutex mutex; - struct radeon_sa_managersa_manager; - struct radeon_ibibs[RADEON_IB_POOL_SIZE]; - boolready; - unsignedhead_id; -}; - struct radeon_ring { struct radeon_bo*ring_obj; volatile uint32_t *ring; @@ -787,7 +774,6 @@ struct si_rlc { int radeon_ib_get(struct radeon_device *rdev, int ring, struct radeon_ib **ib, unsigned size); void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib); -bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib); int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib); int radeon_ib_pool_init(struct radeon_device *rdev); void radeon_ib_pool_fini(struct radeon_device *rdev); @@ -1522,7 +1508,8 @@ struct radeon_device { wait_queue_head_t fence_queue; struct mutexring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; - struct radeon_ib_pool ib_pool; + boolib_pool_ready; + struct radeon_sa_managerring_tmp_bo; struct radeon_irq irq; struct 
radeon_asic *asic; struct radeon_gem gem; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 48876c1..e1bc7e9 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -724,7 +724,6 @@ int radeon_device_init(struct radeon_device *rdev, /* mutex initialization are all done here so we * can recall function without having locking issues */ radeon_mutex_init(rdev-cs_mutex); - radeon_mutex_init(rdev-ib_pool.mutex); mutex_init(rdev-ring_lock); mutex_init(rdev-dc_hw_i2c_mutex); if (rdev-family = CHIP_R600) diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index 53dba8e..8e9ef34 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -432,8 +432,8 @@ retry_id: rdev-vm_manager.use_bitmap |= 1 id; vm-id = id; list_add_tail(vm-list, rdev-vm_manager.lru_vm); - return radeon_vm_bo_update_pte(rdev, vm, rdev-ib_pool.sa_manager.bo, - rdev-ib_pool.sa_manager.bo-tbo.mem); + return radeon_vm_bo_update_pte(rdev, vm, rdev-ring_tmp_bo.bo, + rdev-ring_tmp_bo.bo-tbo.mem); } /* object have to be reserved */ @@ -631,7 +631,7 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm) /* map the ib pool buffer at 0 in virtual address space, set * read only */ - r = radeon_vm_bo_add(rdev, vm, rdev-ib_pool.sa_manager.bo, 0, + r = radeon_vm_bo_add(rdev, vm, rdev-ring_tmp_bo.bo, 0, RADEON_VM_PAGE_READABLE | RADEON_VM_PAGE_SNOOPED); return r; } @@ -648,12 +648,12 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm) radeon_mutex_unlock(rdev-cs_mutex); /* remove all bo */ - r = radeon_bo_reserve(rdev-ib_pool.sa_manager.bo, false); + r = radeon_bo_reserve(rdev-ring_tmp_bo.bo, false); if (!r) { - bo_va = radeon_bo_va(rdev-ib_pool.sa_manager.bo, vm); + bo_va = radeon_bo_va(rdev-ring_tmp_bo.bo, vm); list_del_init(bo_va-bo_list); list_del_init(bo_va-vm_list); - 
radeon_bo_unreserve(rdev-ib_pool.sa_manager.bo); +
[PULL] drm-intel-next manual merge
Hi Dave,

As discussed on ircmail, here's the pull request for the manual merge to unconfuse git about the changes in intel_display.c. Note that I've manually frobbed the shortlog to exclude all the changes merged through Linus' tree.

Yours, Daniel

The following changes since commit 5bc69bf9aeb73547cad8e1ce683a103fe9728282: Merge tag 'drm-intel-next-2012-04-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next (2012-05-02 09:22:29 +0100) are available in the git repository at: git://people.freedesktop.org/~danvet/drm-intel for-airlied for you to fetch changes up to dc257cf154be708ecc47b8b89c12ad8cd2cc35e4: Merge tag 'v3.4-rc6' into drm-intel-next (2012-05-07 14:02:14 +0200)

Daniel Vetter (1): Merge tag 'v3.4-rc6' into drm-intel-next

-- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48
Re: [RFC 05/13] v4l: vb2-dma-contig: add support for DMABUF exporting
Hi Tomasz, Sorry for the late reply, this one slipped through the cracks. On Thursday 19 April 2012 12:42:12 Tomasz Stanislawski wrote: On 04/17/2012 04:08 PM, Laurent Pinchart wrote: On Tuesday 10 April 2012 15:10:39 Tomasz Stanislawski wrote: This patch adds support for exporting a dma-contig buffer using the DMABUF interface. Signed-off-by: Tomasz Stanislawski t.stanisl...@samsung.com Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com --- [snip] +static struct sg_table *vb2_dc_dmabuf_ops_map( + struct dma_buf_attachment *db_attach, enum dma_data_direction dir) +{ + struct dma_buf *dbuf = db_attach-dmabuf; + struct vb2_dc_buf *buf = dbuf-priv; + struct vb2_dc_attachment *attach = db_attach-priv; + struct sg_table *sgt; + struct scatterlist *rd, *wr; + int i, ret; You can make i an unsigned int :-) Right.. splitting the declaration may also be a good idea :) + + /* return previously mapped sg table */ + if (attach) + return attach-sgt; This effectively keeps the mapping around as long as the attachment exists. We don't try to swap out buffers in V4L2 as is done in DRM at the moment, so it might not be too much of an issue, but the behaviour of the implementation will change if we later decide to map/unmap the buffers in the map/unmap handlers. Do you think that could be a problem ? I don't think that it is a problem. If an importer calls dma_map_sg then caching the sgt on the exporter side reduces the cost of allocating and initializing the sgt. + + attach = kzalloc(sizeof *attach, GFP_KERNEL); + if (!attach) + return ERR_PTR(-ENOMEM); Why don't you allocate the vb2_dc_attachment here instead of vb2_dc_dmabuf_ops_attach() ? Good point. The attachment could be allocated at vb2_dc_attachment but all its fields would be uninitialized. I mean an empty sgt and an undefined dma direction. I decided to allocate the attachment in vb2_dc_dmabuf_ops_map because only then is all the information needed to create a valid attachment object available.
The other solution might be the allocation at vb2_dc_attachment. The field dir would be set to DMA_NONE. If this field is equal to DMA_NONE at vb2_dc_dmabuf_ops_map then the sgt is allocated and mapped and the direction field is updated. If the value is not DMA_NONE then the sgt is reused. Do you think that it is a good idea? I think I would prefer that. It sounds more logical to allocate the attachment in the attach operation handler. + sgt = attach-sgt; + attach-dir = dir; + + /* copying the buf-base_sgt to attachment */ I would add an explanation regarding why you need to copy the SG list. Something like: Copy the buf-base_sgt scatter list to the attachment, as we can't map the same scatter list to multiple devices at the same time. ok + ret = sg_alloc_table(sgt, buf-sgt_base-orig_nents, GFP_KERNEL); + if (ret) { + kfree(attach); + return ERR_PTR(-ENOMEM); + } + + rd = buf-sgt_base-sgl; + wr = sgt-sgl; + for (i = 0; i sgt-orig_nents; ++i) { + sg_set_page(wr, sg_page(rd), rd-length, rd-offset); + rd = sg_next(rd); + wr = sg_next(wr); + } + /* mapping new sglist to the client */ + ret = dma_map_sg(db_attach-dev, sgt-sgl, sgt-orig_nents, dir); + if (ret = 0) { + printk(KERN_ERR failed to map scatterlist\n); + sg_free_table(sgt); + kfree(attach); + return ERR_PTR(-EIO); + } + + db_attach-priv = attach; + + return sgt; +} + +static void vb2_dc_dmabuf_ops_unmap(struct dma_buf_attachment *db_attach, + struct sg_table *sgt, enum dma_data_direction dir) +{ + /* nothing to be done here */ +} + +static void vb2_dc_dmabuf_ops_release(struct dma_buf *dbuf) +{ + /* drop reference obtained in vb2_dc_get_dmabuf */ + vb2_dc_put(dbuf-priv); Shouldn't you set vb2_dc_buf::dma_buf to NULL here ? Otherwise the next vb2_dc_get_dmabuf() call will return a DMABUF object that has been freed. No. The buffer object is destroyed at vb2_dc_put when the reference count drops to 0. It can happen only after REQBUF(count=0) or on the last close(). The DMABUF object is created only for MMAP buffers.
The DMABUF object is based only on results of dma_alloc_coherent and dma_get_pages (or its future equivalent). Therefore the DMABUF object is valid as long as the buffer is valid. OK. Notice that dmabuf object could be created in vb2_dc_alloc. I moved it to vb2_dc_get_dmabuf to avoid a creation of an object that may not be used. +} + +static struct dma_buf_ops vb2_dc_dmabuf_ops = { + .attach = vb2_dc_dmabuf_ops_attach, + .detach = vb2_dc_dmabuf_ops_detach, + .map_dma_buf = vb2_dc_dmabuf_ops_map, + .unmap_dma_buf = vb2_dc_dmabuf_ops_unmap, + .release = vb2_dc_dmabuf_ops_release, +}; + +static struct dma_buf *vb2_dc_get_dmabuf(void
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #2 from Tom Stellard tstel...@gmail.com 2012-05-07 07:18:45 PDT --- If you re-run autogen.sh and configure, does that fix the problem?
Re: SA and other Patches.
On Mon, May 7, 2012 at 7:42 AM, Christian König deathsim...@vodafone.de wrote: Hi Jerome & everybody on the list, this gathers together every patch we developed over the last week or so and which is not already in drm-next. I've run quite some tests with them yesterday and today and as far as I can see hammered out every known bug. For the SA allocator I reverted to tracking the hole pointer instead of just the last allocation, cause otherwise we will never release the first allocation on the list. Glxgears now even keeps happily running if I deadlock on the non-GFX rings on purpose. Now we will release the first entry even if we use the last allocated ptr; I believe it's cleaner to use the last ptr. Please take a second look at them and if nobody objects any more we should commit them to drm-next. Cheers, Christian. Cheers, Jerome
Re: [PATCH 04/20] drm/radeon: convert fence to uint64_t v4
On Mon, May 7, 2012 at 7:42 AM, Christian König deathsim...@vodafone.de wrote: From: Jerome Glisse jgli...@redhat.com This converts fence to use a uint64_t sequence number; the intention is to use the fact that uint64_t is big enough that we don't need to care about wrap around. Tested with and without writeback using 0xF000 as initial fence sequence and thus allowing to test the wrap around from 32bits to 64bits. v2: Add comment about possible race btw CPU & GPU, add comment stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET Read fence sequence in reverse order of GPU write them so we mitigate the race btw CPU and GPU. v3: Drop the need for ring to emit the 64bits fence, and just have each ring emit the lower 32bits of the fence sequence. We handle the wrap over 32bits in fence_process. v4: Just a small optimization: Don't reread the last_seq value if loop restarts, since we already know its value anyway. Also start at zero not one for seq value and use pre instead of post increment in emit, otherwise wait_empty will deadlock. Why change that? v3 was already good, no deadlock. I started at 1 especially for that: a signaled fence is set to 0 so it always compares as signaled. Just using preincrement is exactly like starting at one. I don't see the need for this change but if it makes you happy. 
Cheers, Jerome Signed-off-by: Jerome Glisse jgli...@redhat.com Signed-off-by: Christian König deathsim...@vodafone.de --- drivers/gpu/drm/radeon/radeon.h | 39 ++- drivers/gpu/drm/radeon/radeon_fence.c | 116 +++-- drivers/gpu/drm/radeon/radeon_ring.c | 9 ++- 3 files changed, 107 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index e99ea81..cdf46bc 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -100,28 +100,32 @@ extern int radeon_lockup_timeout; * Copy from radeon_drv.h so we don't have to include both and have conflicting * symbol; */ -#define RADEON_MAX_USEC_TIMEOUT 100000 /* 100 ms */ -#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) +#define RADEON_MAX_USEC_TIMEOUT 100000 /* 100 ms */ +#define RADEON_FENCE_JIFFIES_TIMEOUT (HZ / 2) /* RADEON_IB_POOL_SIZE must be a power of 2 */ -#define RADEON_IB_POOL_SIZE 16 -#define RADEON_DEBUGFS_MAX_COMPONENTS 32 -#define RADEONFB_CONN_LIMIT 4 -#define RADEON_BIOS_NUM_SCRATCH 8 +#define RADEON_IB_POOL_SIZE 16 +#define RADEON_DEBUGFS_MAX_COMPONENTS 32 +#define RADEONFB_CONN_LIMIT 4 +#define RADEON_BIOS_NUM_SCRATCH 8 /* max number of rings */ -#define RADEON_NUM_RINGS 3 +#define RADEON_NUM_RINGS 3 + +/* fence seq are set to this number when signaled */ +#define RADEON_FENCE_SIGNALED_SEQ 0LL +#define RADEON_FENCE_NOTEMITED_SEQ (~0LL) /* internal ring indices */ /* r1xx+ has gfx CP ring */ -#define RADEON_RING_TYPE_GFX_INDEX 0 +#define RADEON_RING_TYPE_GFX_INDEX 0 /* cayman has 2 compute CP rings */ -#define CAYMAN_RING_TYPE_CP1_INDEX 1 -#define CAYMAN_RING_TYPE_CP2_INDEX 2 +#define CAYMAN_RING_TYPE_CP1_INDEX 1 +#define CAYMAN_RING_TYPE_CP2_INDEX 2 /* hardcode those limit for now */ -#define RADEON_VA_RESERVED_SIZE (8 << 20) -#define RADEON_IB_VM_MAX_SIZE (64 << 10) +#define RADEON_VA_RESERVED_SIZE (8 << 20) +#define RADEON_IB_VM_MAX_SIZE (64 << 10) /* * Errata workarounds. 
@@ -254,8 +258,9 @@ struct radeon_fence_driver { uint32_t scratch_reg; uint64_t gpu_addr; volatile uint32_t *cpu_addr; - atomic_t seq; - uint32_t last_seq; + /* seq is protected by ring emission lock */ + uint64_t seq; + atomic64_t last_seq; unsigned long last_activity; wait_queue_head_t queue; struct list_head emitted; @@ -268,11 +273,9 @@ struct radeon_fence { struct kref kref; struct list_head list; /* protected by radeon_fence.lock */ - uint32_t seq; - bool emitted; - bool signaled; + uint64_t seq; /* RB, DMA, etc. */ - int ring; + unsigned ring; struct radeon_semaphore *semaphore; }; diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index
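The v3 scheme discussed above (each ring emits only the lower 32 bits of the sequence, with the wrap handled in fence_process) can be sketched as a small user-space model. The function name and carry logic below are illustrative only, not the actual radeon_fence_process code:

```c
#include <stdint.h>

/* Toy model of extending a 32-bit hardware fence sequence to the
 * 64-bit software sequence: keep the high word of the last known
 * sequence and add one wrap whenever the low word moves backwards. */
static uint64_t fence_extend_seq(uint32_t hw_seq, uint64_t last_seq)
{
    uint64_t seq = (last_seq & 0xFFFFFFFF00000000ULL) | hw_seq;

    if (seq < last_seq)             /* low 32 bits wrapped around */
        seq += 0x100000000ULL;
    return seq;
}
```

Because comparisons are then done on the full 64-bit values, the "is this fence signaled" check never has to reason about 32-bit wrap.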
Re: [PATCHv5 08/13] v4l: vb2-dma-contig: add support for scatterlist in userptr mode
Hi Subash, Could you provide a detailed description of a test case that causes a failure of vb2_dc_pages_to_sgt? Regards, Tomasz Stanislawski
Re: [PATCH 14/20] drm/radeon: multiple ring allocator v2
On Mon, May 7, 2012 at 7:42 AM, Christian König deathsim...@vodafone.de wrote: A startover with a new idea for a multiple ring allocator. Should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen. We store the last allocated bo in last, we always try to allocate after the last allocated bo. Principle is that in a linear GPU ring progression what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU. If it's not the case we skip over the bo after last to the closest done bo if such one exist. If none exist and we are not asked to block we report failure to allocate. If we are asked to block we wait on all the oldest fence of all rings. We just wait for any of those fence to complete. v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary. Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com This one is NAK please use my patch. Yes in my patch we never try to free anything if there is only one sa_bo in the list; if you really care about this it's a one line change: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch Your patch here can enter an infinite loop and never return holding the lock. See below. 
Cheers, Jerome --- drivers/gpu/drm/radeon/radeon.h | 7 +- drivers/gpu/drm/radeon/radeon_ring.c | 19 +-- drivers/gpu/drm/radeon/radeon_sa.c | 292 +++--- 3 files changed, 210 insertions(+), 108 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 37a7459..cc7f16a 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -385,7 +385,9 @@ struct radeon_bo_list { struct radeon_sa_manager { spinlock_t lock; struct radeon_bo *bo; - struct list_head sa_bo; + struct list_head *hole; + struct list_head flist[RADEON_NUM_RINGS]; + struct list_head olist; unsigned size; uint64_t gpu_addr; void *cpu_ptr; @@ -396,7 +398,8 @@ struct radeon_sa_bo; /* sub-allocation buffer */ struct radeon_sa_bo { - struct list_head list; + struct list_head olist; + struct list_head flist; struct radeon_sa_manager *manager; unsigned soffset; unsigned eoffset; diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 1748d93..e074ff5 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -204,25 +204,22 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib) int radeon_ib_pool_init(struct radeon_device *rdev) { - struct radeon_sa_manager tmp; int i, r; - r = radeon_sa_bo_manager_init(rdev, &tmp, - RADEON_IB_POOL_SIZE*64*1024, - RADEON_GEM_DOMAIN_GTT); - if (r) { - return r; - } - radeon_mutex_lock(&rdev->ib_pool.mutex); if (rdev->ib_pool.ready) { radeon_mutex_unlock(&rdev->ib_pool.mutex); - radeon_sa_bo_manager_fini(rdev, &tmp); return 0; } - rdev->ib_pool.sa_manager = tmp; - INIT_LIST_HEAD(&rdev->ib_pool.sa_manager.sa_bo); + r = radeon_sa_bo_manager_init(rdev, &rdev->ib_pool.sa_manager, + RADEON_IB_POOL_SIZE*64*1024, + RADEON_GEM_DOMAIN_GTT); + if (r) { + radeon_mutex_unlock(&rdev->ib_pool.mutex); + return r; + } + for (i = 0; i < RADEON_IB_POOL_SIZE; i++) { rdev->ib_pool.ibs[i].fence = NULL; rdev->ib_pool.ibs[i].idx = i; diff --git 
a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c index 90ee8ad..757a9d4 100644 --- a/drivers/gpu/drm/radeon/radeon_sa.c +++ b/drivers/gpu/drm/radeon/radeon_sa.c @@ -27,21 +27,42 @@ * Authors: * Jerome Glisse gli...@freedesktop.org */ +/* Algorithm: + * + * We store the last allocated bo in hole, we always try to allocate + * after the last allocated bo. Principle is that in a linear GPU ring + * progression what is after last is the oldest bo we allocated and thus + * the first one that should no longer be in use by the GPU. + * + * If it's not the case we skip over the bo after last to the closest + * done bo if such one exist. If none exist
Re: [PATCH 14/20] drm/radeon: multiple ring allocator v2
On 07.05.2012 17:23, Jerome Glisse wrote: On Mon, May 7, 2012 at 7:42 AM, Christian König deathsim...@vodafone.de wrote: A startover with a new idea for a multiple ring allocator. Should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen. We store the last allocated bo in last, we always try to allocate after the last allocated bo. Principle is that in a linear GPU ring progression what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU. If it's not the case we skip over the bo after last to the closest done bo if such one exist. If none exist and we are not asked to block we report failure to allocate. If we are asked to block we wait on all the oldest fence of all rings. We just wait for any of those fence to complete. v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary. Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com This one is NAK please use my patch. Yes in my patch we never try to free anything if there is only one sa_bo in the list; if you really care about this it's a one line change: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch Nope that won't work correctly, last is pointing to the last allocation and that's the most unlikely to be freed at this time. Also in this version (like in the one before) radeon_sa_bo_next_hole lets hole point to the prev of the found sa_bo without checking if this isn't the list's head. That might cause a crash if a to-be-freed allocation is the first one in the buffer. 
What radeon_sa_bo_try_free would need to do to get your approach working is to loop over the end of the buffer and also try to free at the beginning, but that said, keeping the last allocation results in a whole bunch of extra cases and ifs, while just keeping a pointer to the hole (e.g. where the next allocation is most likely to succeed) simplifies the code quite a bit (but I agree that on the down side it makes it harder to understand). Your patch here can enter an infinite loop and never return holding the lock. See below. [SNIP] + } while (radeon_sa_bo_next_hole(sa_manager, fences)); Here you can infinite loop, in the case there is a bunch of holes in the allocator but none of them allows fulfilling the allocation. radeon_sa_bo_next_hole will keep returning true, looping over and over on them all. That's why I only restrict my patch to 2-hole skipping and then fail the allocation or try to wait. I believe sadly we need a heuristic, and 2-hole skipping at most sounded like a good one. Nope, that can't be an infinite loop, cause radeon_sa_bo_next_hole in conjunction with radeon_sa_bo_try_free is eating up the opportunities for holes. Look again, it probably will never loop more than RADEON_NUM_RINGS + 1, with the exception of allocating in a completely scattered buffer, and even then it will never loop more often than half the number of current allocations (and that is really really unlikely). Cheers, Christian.
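For readers following the hole debate above, here is a minimal user-space model of the gap-scanning step (a plain sorted array standing in for the kernel's olist, with no wrap-around or fence logic). The names and structure are illustrative only, not the radeon_sa code:

```c
#include <stddef.h>

/* Allocated blocks, kept sorted by offset, modeling the olist. */
struct toy_sa_bo { unsigned soffset, eoffset; };

/* Return the start offset of the first gap of at least `need` bytes,
 * or (unsigned)-1 when no such hole exists.  The real allocator scans
 * from the hole pointer and frees signaled blocks as it goes; this
 * sketch only shows the gap-finding part. */
static unsigned toy_sa_find_hole(const struct toy_sa_bo *bos, size_t n,
                                 unsigned size, unsigned need)
{
    unsigned start = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (bos[i].soffset - start >= need)
            return start;           /* hole before this block */
        start = bos[i].eoffset;
    }
    return (size - start >= need) ? start : (unsigned)-1;
}
```

The termination argument in the thread maps onto this model: each pass either finds a gap or frees blocks, so the set of candidate holes shrinks instead of being revisited forever.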
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484 --- Comment #3 from Michal Suchanek hramr...@gmail.com 2012-05-07 09:55:15 PDT --- I get no mesa warnings, only warnings from wine about Mesa returning GL_INVALID*
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #3 from Mike Mestnik cheako+bugs_freedesktop_...@mikemestnik.net 2012-05-07 09:58:45 PDT --- Tom, The short of it: I'm already doing that. The long: I took a look at that script and it eventually just calls autoreconf -v --install; my log clearly shows autoreconf -vfi being called. Also note that that script will call configure. Thanks!
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484 --- Comment #4 from Michal Suchanek hramr...@gmail.com 2012-05-07 10:02:05 PDT --- invalid value: Breakpoint 1, _mesa_error (ctx=0xccba90, error=1281, fmtString=0x74706278 "glTexImage%dD(internalFormat=%s)") at main/errors.c:996 996 main/errors.c: No such file or directory. (gdb) bt full #0 _mesa_error (ctx=0xccba90, error=1281, fmtString=0x74706278 "glTexImage%dD(internalFormat=%s)") at main/errors.c:996 do_output = 225 '\341' do_log = <optimized out> #1 0x745e2a84 in texture_error_check (border=0, depth=1, height=64, width=64, type=0, format=0, internalFormat=0, level=0, target=3553, dimensions=2, ctx=0xccba90) at main/teximage.c:1621 proxyTarget = <optimized out> err = <optimized out> indexFormat = 0 '\000' isProxy = <optimized out> sizeOK = 1 '\001' colorFormat = <optimized out> #2 teximage (ctx=0xccba90, dims=2, target=3553, level=0, internalFormat=0, width=64, height=64, depth=1, border=0, format=0, type=0, pixels=0x0) at main/teximage.c:2501 error = 1 '\001' unpack_no_border = {Alignment = -7152, RowLength = 32767, SkipPixels = 9180912, SkipRows = 0, ImageHeight = -8144, SkipImages = 32767, SwapBytes = 45 '-', LsbFirst = 17 '\021', Invert = 90 'Z', BufferObj = 0x7fffe410} unpack = 0xcd22e0 #3 0x745e2fc4 in _mesa_TexImage2D (target=<optimized out>, level=<optimized out>, internalFormat=<optimized out>, width=<optimized out>, height=<optimized out>, border=<optimized out>, format=0, type=0, pixels=0x0) at main/teximage.c:2639 No locals. #4 0x00480145 in ?? () No symbol table info available. #5 0x004bbfd6 in ?? () No symbol table info available. #6 0x00440687 in ?? () No symbol table info available. #7 0x0043b985 in ?? () No symbol table info available. #8 0x0043c092 in ?? () No symbol table info available. 
#9 0x7696eead in __libc_start_main (main=<optimized out>, argc=<optimized out>, ubp_av=<optimized out>, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffe408) at libc-start.c:228 result = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 738182590451561014, 4428032, 140737488348176, 0, 0, -738182590032367050, -738203265834012106}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x5ddb20, 0x7fffe418}, data = {prev = 0x0, cleanup = 0x0, canceltype = 6150944}}} not_first_call = <optimized out> #10 0x00439129 in _start () No symbol table info available. (gdb) c Continuing. 37923 glTexImage2D(target = GL_TEXTURE_2D, level = 0, internalformat = GL_ZERO, width = 64, height = 64, border = 0, format = GL_ZERO, type = GL_ZERO, pixels = NULL) 37923: warning: glGetError(glTexImage2D) = GL_INVALID_VALUE
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484 --- Comment #5 from Henri Verbeet hverb...@gmail.com 2012-05-07 10:37:29 PDT --- That generally happens when an application tries to use a (D3D) format (e.g. DXT/s3tc) even though it's not available. A WINEDEBUG=+d3d,+d3d_surface log should show which format, although typically it's either s3tc or one of the floating point formats.
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #4 from Tom Stellard tstel...@gmail.com 2012-05-07 10:43:27 PDT --- Created attachment 61159 -- https://bugs.freedesktop.org/attachment.cgi?id=61159 Possible fix Does it build with this patch?
[Bug 49484] invalid enum 0x500, invalid value 0x501
https://bugs.freedesktop.org/show_bug.cgi?id=49484 Michal Suchanek hramr...@gmail.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||INVALID --- Comment #6 from Michal Suchanek hramr...@gmail.com 2012-05-07 11:03:54 PDT --- Indeed, it works with the texture compression library installed. I guess this is something that Wine should report. Unfortunately, the available messages are very unhelpful. Sorry about the noise.
Re: [RFC][PATCH] drm/radeon/hdmi: define struct for AVI infoframe
On Mon, May 7, 2012 at 3:38 AM, Michel Dänzer mic...@daenzer.net wrote: On Son, 2012-05-06 at 18:29 +0200, Rafał Miłecki wrote: 2012/5/6 Dave Airlie airl...@gmail.com: On Sun, May 6, 2012 at 5:19 PM, Rafał Miłecki zaj...@gmail.com wrote: 2012/5/6 Rafał Miłecki zaj...@gmail.com: diff --git a/drivers/gpu/drm/radeon/r600_hdmi.c b/drivers/gpu/drm/radeon/r600_hdmi.c index c308432..b14c90a 100644 --- a/drivers/gpu/drm/radeon/r600_hdmi.c +++ b/drivers/gpu/drm/radeon/r600_hdmi.c @@ -134,78 +134,22 @@ static void r600_hdmi_infoframe_checksum(uint8_t packetType, } /* - * build a HDMI Video Info Frame + * Upload a HDMI AVI Infoframe */ -static void r600_hdmi_videoinfoframe( - struct drm_encoder *encoder, - enum r600_hdmi_color_format color_format, - int active_information_present, - uint8_t active_format_aspect_ratio, - uint8_t scan_information, - uint8_t colorimetry, - uint8_t ex_colorimetry, - uint8_t quantization, - int ITC, - uint8_t picture_aspect_ratio, - uint8_t video_format_identification, - uint8_t pixel_repetition, - uint8_t non_uniform_picture_scaling, - uint8_t bar_info_data_valid, - uint16_t top_bar, - uint16_t bottom_bar, - uint16_t left_bar, - uint16_t right_bar -) In case someone wonders about the reason: I think it's really ugly to have a function taking 18 arguments, 17 of them related to the infoframe. It makes much more sense for me to use struct for that. While working on that I though it's reasonable to prepare nice bitfield __packed struct ready-to-be-written to the GPU registers. won't this screw up on other endian machines? Hm, maybe it can. Is there some easy to handle it correctly? Some trick like __le8 foo: 3 __le8 bar: 1 maybe? Not really. The memory layout of bitfields is basically completely up to the C implementation, so IMHO they're just inadequate for describing fixed memory layouts. Yes i agree please stay away from bitfields, i know it looks cool but bitshift is cool too. 
Cheers, Jerome
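As a concrete illustration of the "bitshifts instead of bitfields" advice above: packing a byte with explicit shifts is endian-safe because a value is computed rather than a memory layout being described. The field positions below follow the usual CEA-861 AVI InfoFrame data byte 1 layout (scan info in bits 0-1, bar info in bits 2-3, active format present in bit 4, color format in bits 5-6), but treat the exact layout as an illustrative assumption:

```c
#include <stdint.h>

/* Pack AVI InfoFrame data byte 1 with explicit shifts; unlike a
 * __packed bitfield struct, the result is identical on any endianness
 * and any compiler, because bitfield ordering is implementation-defined
 * while shifts on a value are not. */
static uint8_t avi_infoframe_byte1(uint8_t scan_info, uint8_t bar_info,
                                   uint8_t active_fmt_present,
                                   uint8_t color_format)
{
    return (uint8_t)((scan_info & 0x3) |
                     ((bar_info & 0x3) << 2) |
                     ((active_fmt_present & 0x1) << 4) |
                     ((color_format & 0x3) << 5));
}
```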
Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
On Sat, May 5, 2012 at 6:22 AM, Dave Airlie airl...@gmail.com wrote: On Sat, May 5, 2012 at 11:19 AM, daei...@gmail.com wrote: Hi Dave, On Apr 25, 2012, at 7:15 PM, Dave Airlie airl...@gmail.com wrote: On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae inki@samsung.com wrote: this feature could be used to use a memory region allocated by malloc() in user mode and mmaped memory regions allocated by other memory allocators. The userptr interface can identify the memory type through the vm_flags value and would get pages or page frame numbers to user space appropriately. Is there anything to stop the unprivileged userspace driver locking all the RAM in the machine inside userptr? you mean that there is something that can stop the user space driver locking some memory region of RAM? and if any user space driver locked some region then anyone in user space can't access the region? could you please tell me about your concerns in more detail so that we can solve the issue? I guess you mean that any user level driver such as a specific EGL library can allocate some memory region and also lock the region so that other user space applications can't access the region until rendering is completed by a hw accelerator such as a 2d/3d core, or the opposite case. actually, this feature has already been used by v4l2 so I didn't consider that we could face any problem with this, and I've got a feeling maybe there is something I missed, so I'd be happy for you or anyone to give me any advice. Well v4l gets to make their own bad design decisions. The problem is if an unprivileged user accessing the drm can lock all the pages it allocates into memory, by passing them to the kernel as userptrs, thus bypassing the swap and blocking all other users on the system. Dave. Besides that you are not locking the vma and afaik this means that the pages backing the vma might change; yes you will still own the pages you get but userspace might be reading/writing to different pages. 
The vma would need to be locked, but then the userspace might unlock it behind your back and you start right from the beginning. Cheers, Jerome
Re: [PATCH 14/20] drm/radeon: multiple ring allocator v2
On Mon, May 7, 2012 at 1:59 PM, Jerome Glisse j.gli...@gmail.com wrote: On 07.05.2012 17:23, Jerome Glisse wrote: On Mon, May 7, 2012 at 7:42 AM, Christian König deathsim...@vodafone.de wrote: A startover with a new idea for a multiple ring allocator. Should perform as well as a normal ring allocator as long as only one ring does something, but falls back to a more complex algorithm if more complex things start to happen. We store the last allocated bo in last, we always try to allocate after the last allocated bo. Principle is that in a linear GPU ring progression what is after last is the oldest bo we allocated and thus the first one that should no longer be in use by the GPU. If it's not the case we skip over the bo after last to the closest done bo if such one exist. If none exist and we are not asked to block we report failure to allocate. If we are asked to block we wait on all the oldest fence of all rings. We just wait for any of those fence to complete. v2: We need to be able to let hole point to the list_head, otherwise try free will never free the first allocation of the list. Also stop calling radeon_fence_signalled more than necessary. Signed-off-by: Christian König deathsim...@vodafone.de Signed-off-by: Jerome Glisse jgli...@redhat.com This one is NAK please use my patch. Yes in my patch we never try to free anything if there is only one sa_bo in the list; if you really care about this it's a one line change: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v2.patch Nope that won't work correctly, last is pointing to the last allocation and that's the most unlikely to be freed at this time. Also in this version (like in the one before) radeon_sa_bo_next_hole lets hole point to the prev of the found sa_bo without checking if this isn't the list's head. That might cause a crash if a to-be-freed allocation is the first one in the buffer. 
What radeon_sa_bo_try_free would need to do to get your approach working is to loop over the end of the buffer and also try to free at the beginning, but that said, keeping the last allocation results in a whole bunch of extra cases and ifs, while just keeping a pointer to the hole (e.g. where the next allocation is most likely to succeed) simplifies the code quite a bit (but I agree that on the down side it makes it harder to understand). Your patch here can enter an infinite loop and never return holding the lock. See below. [SNIP] + } while (radeon_sa_bo_next_hole(sa_manager, fences)); Here you can infinite loop, in the case there is a bunch of holes in the allocator but none of them allows fulfilling the allocation. radeon_sa_bo_next_hole will keep returning true, looping over and over on them all. That's why I only restrict my patch to 2-hole skipping and then fail the allocation or try to wait. I believe sadly we need a heuristic, and 2-hole skipping at most sounded like a good one. Nope, that can't be an infinite loop, cause radeon_sa_bo_next_hole in conjunction with radeon_sa_bo_try_free is eating up the opportunities for holes. Look again, it probably will never loop more than RADEON_NUM_RINGS + 1, with the exception of allocating in a completely scattered buffer, and even then it will never loop more often than half the number of current allocations (and that is really really unlikely). Cheers, Christian. I looked again and yes it can loop infinitely; think of holes you can never free, i.e. radeon_sa_bo_try_free can't free anything. This situation can happen if you have several threads allocating sa bos at the same time while none of them are yet done with their sa_bo (i.e. none have called sa_bo_free yet). I updated a v3 that tracks oldest and fixes all the things you were pointing out above. 
http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch Cheers, Jerome Of course by tracking oldest it defeats the algo, so updated patch: http://people.freedesktop.org/~glisse/reset5/0001-drm-radeon-multiple-ring-allocator-v3.patch Just fixes the corner case of a list with a single entry. Cheers, Jerome
[Bug 49567] No rule to make target libradeon.a, needed by libr600.a.
https://bugs.freedesktop.org/show_bug.cgi?id=49567 --- Comment #5 from Mike Mestnik cheako+bugs_freedesktop_...@mikemestnik.net 2012-05-07 11:59:39 PDT --- This patch worked for me and got me to the next undefined reference.
Re: Enhancing EDID quirk functionality
On 05/03/2012 02:42 PM, Adam Jackson wrote: This looks good, thank you for taking it on. It was either that or give up on my big display, so ... you're welcome. I'd like to see documentation for the bit values of the quirks as well. And, ideally, this would also have some runtime API for manipulating the quirk list, so that way you can test new quirks without needing a reboot cycle. I agree that the bit values should be documented. I'm not sure where that documentation should go, however, since I can't find any documentation of the existing drm module parameters. Tell me where it should go, and I'll happily write the doc. I also agree that it would be nice to be able to manipulate the quirk list at runtime, and I did think about trying to enable that. I held off for a couple of reasons: 1) I'm a total noob at kernel code, so things like in-kernel locking, sysfs, memory management, etc., that would be required for a more dynamic API are all new to me. That said, I'm more than willing to give it a go, if I can get some guidance on those (and similar) topics. 2) I'm not sure how a runtime API should work. The simplest possibility is to just take a string, parse it, and overwrite the old extra quirk list with the new list. The downside to this is that all of the existing extra quirks need to be repeated to change a single quirk. To close the loop all the way on that I'd also want to be able to scrape the quirk list back out from that API, but that's not completely clean right now. Sounds like a couple of sysfs files to me, one for the built-in quirks and one for the extra quirks -- maybe one quirk per line? See my comments about the sysfs API above. We're being a little cavalier with the quirk list as it stands because we don't differentiate among phy layers, and I can easily imagine a monitor that needs a quirk on DVI but where the same quirk on the same monitor's VGA would break it. I don't think this has caused problems yet, but. Now you're above my pay grade. 
What little I've read and discovered about the way DisplayPort, HDMI, VGA, and DVI play together makes me think this is a nightmare best deferred, hopefully forever. InfoFrames are not valid for non-HDMI sinks, so yes, I'd call that a bug. That's pretty much what I figured. Where the EDID for DP-1 appears to be truncated: the extension field (second byte from the end) is 1 as you'd expect for an HDMI monitor, but there's no extension block. How big of a file do you get from /sys/class/drm/*/edid for that port? The EDID data in sysfs is 256 bytes, which I believe means that it does include the extension block. I just tried connecting an HDMI TV to my laptop, and I saw the same behavior -- 256-byte edid file in sysfs, but xrandr --verbose only shows 128 bytes. When I attach the same TV to my workstation with Intel HD 2000 graphics, xrandr --verbose shows all 256 bytes of EDID data. So it appears that the full data is being read by both systems, but the behavior of xrandr (or presumably whatever API xrandr uses to get the EDID data that it displays) differs between the two drivers. Fun. Thanks! -- Ian Pilcher arequip...@gmail.com If you're going to shift my paradigm ... at least buy me dinner first.
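To make the "take a string, parse it" idea from this thread concrete, here is a user-space sketch of parsing one extra-quirk entry. The vendor:product:quirks entry syntax and all names below are hypothetical illustrations, not an actual drm interface:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of one extra EDID quirk entry. */
struct toy_edid_quirk {
    char vendor[4];        /* 3-letter PNP vendor id, e.g. "ACR" */
    int  product;          /* product code */
    int  quirks;           /* quirk bitmask */
};

/* Parse an entry of the form "VND:product:quirks", accepting hex or
 * decimal numbers (e.g. "ACR:0x597:0x20").
 * Returns 0 on success, -1 on malformed input. */
static int toy_parse_edid_quirk(const char *entry, struct toy_edid_quirk *q)
{
    memset(q, 0, sizeof(*q));
    if (sscanf(entry, "%3[A-Z]:%i:%i",
               q->vendor, &q->product, &q->quirks) != 3)
        return -1;
    return 0;
}
```

A runtime sysfs file as suggested above could accept one such entry per write (or per line) and rebuild the extra-quirk list from the parsed entries.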
[Bug 42490] NUTMEG DP to VGA bridge not working
https://bugs.freedesktop.org/show_bug.cgi?id=42490 --- Comment #28 from Jerome Glisse gli...@freedesktop.org 2012-05-07 12:57:34 PDT --- Do people here have better luck with the patch mentioned previously: drm/radeon/kms: need to set up ss on DP bridges as well