Re: general protection fault on ttm_init()

2012-07-10 Thread Fengguang Wu
On Sat, Jul 07, 2012 at 11:31:42PM +0800, Fengguang Wu wrote:
> On Sat, Jul 07, 2012 at 10:08:47AM +0800, Fengguang Wu wrote:
> > On Fri, Jul 06, 2012 at 06:09:20PM +0100, Dave Airlie wrote:
> > > On Fri, Jul 6, 2012 at 5:49 PM, Dave Airlie  wrote:
> > > > On Fri, Jul 6, 2012 at 3:48 PM, Fengguang Wu  
> > > > wrote:
> > > >> ... The missed kconfig.
> > > >>
> > > >> On Fri, Jul 06, 2012 at 10:46:22PM +0800, Fengguang Wu wrote:
> > > >>> Hi Thomas,
> > > >
> > > > Weird, I'm sorta tempted to just make drm depend on CONFIG_PROC_FS,
> > > > but it looks like the error path is failing to dtrt.
> > > 
> > > I've attached a patch that should fix it, let me know if it works.
> > 
> > It does not work. The dmesg (attached) remains the same.
> 
> I got more interesting back traces in a clean kernel:

Another trace shows that ttm_init tries to register with an empty name:

[2.917489] device: 'ttm': device_add
[2.918179] [ cut here ]
[2.919061] WARNING: at /c/kernel-tests/tip/lib/kobject.c:166 
kobject_add_internal+0x1a3/0x210()
 ==>[2.920704] kobject: (8826ecc0): attempted to be registered 
with empty name!
[2.922129] Pid: 1, comm: swapper Not tainted 3.5.0-rc2+ #28
[2.923172] Call Trace:
[2.923638]  [] ? kobject_add_internal+0x1a3/0x210
[2.924827]  [] warn_slowpath_common+0x66/0x90
[2.925993]  [] ? drm_core_init+0xca/0xca
[2.927028]  [] warn_slowpath_fmt+0x41/0x50
[2.928093]  [] kobject_add_internal+0x1a3/0x210
[2.929261]  [] ? drm_core_init+0xca/0xca
[2.930327]  [] ? drm_core_init+0xca/0xca
[2.931473]  [] kobject_add+0x67/0xc0
[2.932589]  [] ? get_device_parent+0x118/0x1b7
[2.933790]  [] get_device_parent+0x161/0x1b7
[2.934895]  [] device_add+0x151/0x5f0
[2.935907]  [] ? drm_core_init+0xca/0xca
[2.936940]  [] ? __raw_spin_lock_init+0x38/0x70
[2.938099]  [] ? drm_core_init+0xca/0xca
[2.939132]  [] device_register+0x19/0x20
[2.940254]  [] drm_class_device_register+0x17/0x20
[2.941437]  [] ttm_init+0x37/0x62
[2.942360]  [] do_one_initcall+0x78/0x136
[2.943413]  [] kernel_init+0x122/0x1a6
[2.944415]  [] ? loglevel+0x31/0x31
[2.945402]  [] kernel_thread_helper+0x4/0x10
[2.946506]  [] ? retint_restore_args+0x13/0x13
[2.947635]  [] ? do_one_initcall+0x136/0x136
[2.948739]  [] ? gs_change+0x13/0x13
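
For reference, here is a minimal sketch of the failure mode this suggests
(illustrative C only, NOT the actual drm/ttm code): drm_core_init() fails to
create /proc/dri and unregisters its class, but if the global class pointer
is left dangling, the later ttm_init() initcall registers a device against
freed memory -- which would explain both the empty-name kobject warning
above and the earlier GPF in sysfs_do_create_link():

#include <linux/device.h>
#include <linux/err.h>
#include <linux/module.h>

static struct class *example_class;	/* stands in for drm_class */

static int __init example_core_init(void)	/* like drm_core_init() */
{
	example_class = class_create(THIS_MODULE, "example");
	if (IS_ERR(example_class))
		return PTR_ERR(example_class);

	/* pretend the /proc registration failed, e.g. CONFIG_PROC_FS=n */
	class_destroy(example_class);
	/* BUG: example_class is left pointing at freed memory */
	return -ENOMEM;
}

static struct device example_dev;

static int __init example_late_init(void)	/* like ttm_init() */
{
	/* consumes the stale class: the kobject gets a garbage parent */
	example_dev.class = example_class;
	dev_set_name(&example_dev, "example");
	return device_register(&example_dev);
}

The obvious hardening for a sketch like this would be to NULL the pointer
(or store an ERR_PTR) on the error path so later users can detect it.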

Thanks,
Fengguang

> device class 'drm': registering
> kobject: 'drm' (88000f07f050): kobject_add_internal: parent: 'class', 
> set: 'class'
> kobject: 'drm' (88000f07f050): kobject_uevent_env
> kobject: 'drm' (88000f07f050): fill_kobj_path: path = '/class/drm'
> [drm:drm_core_init] *ERROR* Cannot create /proc/dri
> device class 'drm': unregistering
> kobject: 'drm' (88000f07f050): kobject_cleanup
> kobject: 'drm' (88000f07f050): auto cleanup 'remove' event
> kobject: 'drm' (88000f07f050): kobject_uevent_env
> kobject: 'drm' (88000f07f050): fill_kobj_path: path = '/class/drm'
> kobject: 'drm' (88000f07f050): auto cleanup kobject_del
> kobject: 'drm' (88000f07f050): calling ktype release
> class 'drm': release.
> class_create_release called for drm
> kobject: 'drm': free name
> kobject: 'drm' (88000f080070): kobject_cleanup
> kobject: 'drm' (88000f080070): calling ktype release
> kobject: 'drm': free name
> device: 'ttm': device_add
> kobject: '(null)' (88000f080230): kobject_add_internal: parent: 
> 'virtual', set: '(null)'
> kobject: 'ttm' (824709b0): kobject_add_internal: parent: '(null)', 
> set: 'devices'
> general protection fault:  [#1] SMP
> CPU 1
> Pid: 1, comm: swapper/0 Not tainted 3.5.0-rc5-bisect #207
> RIP: 0010:[]  [] 
> sysfs_do_create_link+0x59/0x1c0
> RSP: 0018:88107db0  EFLAGS: 00010206
> RAX: 8810 RBX: 00cc RCX: dad9
> RDX: d9d9 RSI:  RDI: 8243b320
> RBP: 88107e00 R08: 88100580 R09: fe80
> R10: 8810 R11: 0200 R12: 821622db
> R13: 88000f080150 R14: 0001 R15: 88000f080308
> FS:  () GS:88000df0() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2:  CR3: 02411000 CR4: 06a0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process swapper/0 (pid: 1, threadinfo 88106000, task 8810)
> Stack:
>  88000f080308  824709b0 02ec
>    824709b

3.5-rc5: radeon acceleration regression on Transmeta system

2012-07-10 Thread valdis.kletni...@vt.edu
On Mon, 09 Jul 2012 14:30:40 +0300, Meelis Roos said:

> It's actually more complicated than that. Old kernel images started
> misbehaving from around 2.6.35-rc5 and any kernel older than that was
> OK. When I recompiled the older kernels with squeeze gcc (might have been
> lenny gcc before, or different answers to make oldconfig), anything from
> current git down to 2.6.33 is broken with radeon.modeset=1 and works (I

What releases of GCC were those?  I'm chasing an issue where compiling
with 4.7.[01] breaks but 4.6.2 is OK, wondering if we're chasing the same thing.


[PATCH 15/15] drm/radeon: implement ring saving on reset v2

2012-07-10 Thread Michel Dänzer
On Tue, 2012-07-10 at 14:51 +0200, Christian König wrote:
> Try to save whatever is on the rings when
> we encounter a lockup.
> 
> v2: Fix spelling error. Free saved ring data if reset fails.
> Add documentation for the new functions.
> 
> Signed-off-by: Christian König

Just some more spelling nits, otherwise this patch and patch 13 are

Reviewed-by: Michel Dänzer


> +/**
> + * radeon_ring_backup - Backup the content of a ring
> + *
> + * @rdev: radeon_device pointer
> + * @ring: the ring we want to backup

'back up', in both cases.

> + * Saves all unprocessed commits to a ring, returns the number of dwords 
> saved.
> + */

'unprocessed commands from'?


> +/**
> + * radeon_ring_restore - append saved commands to the ring again
> + *
> + * @rdev: radeon_device pointer
> + * @ring: ring to append commands to
> + * @size: number of dwords we want to write
> + * @data: saved commands
> + *
> + * Allocates space on the ring and restore the previusly saved commands.

'previously'


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


Mesa shader compiling/optimizing process is too slow

2012-07-10 Thread Chris Forbes
Presumably there needs to be an API-level mechanism to wait for the
background optimization to finish, so that piglit etc. can validate the
behavior of the optimized shader?

-- Chris

On Tue, Jul 10, 2012 at 5:17 AM, Eric Anholt  wrote:
> Tiziano Bacocco  writes:
>
>> I've done benchmarks comparing the proprietary drivers and Mesa; Mesa
>> seems to be up to 200x slower compiling the same shader. Since I
>> understand optimizing that part of the code may take months or even
>> more, I have thought to solve it this way:
>>
>> Upon calling glLinkProgram, an unoptimized version of the shader (which
>> compiles much, much faster) is uploaded to the GPU.
>> Then a separate thread is launched that will optimize the shader; as
>> soon as it is done, the next call to glUseProgram will upload the
>> optimized version in place of the unoptimized one.
>>
>> This will solve many performance issues and temporary freezes with games
>> that load/unload content while running, while not reducing performance
>> once the background optimization is done.
>
> Yeah, we've thought of this, and it would take some work.  Sounds like a
> fun project for someone.
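
For what it's worth, a rough sketch of that scheme in plain C with pthreads
(hypothetical helpers quick_codegen(), run_full_optimizer() and
upload_to_gpu(); this is not Mesa's internal API): linking returns a quickly
generated binary, a detached worker optimizes in the background, and the
next use atomically swaps in the optimized variant. An API-level "wait for
optimization" hook for piglit could then simply block until the atomic
becomes non-NULL.

#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

void *quick_codegen(const char *source);   /* hypothetical: fast, unoptimized */
void *run_full_optimizer(void *unopt);     /* hypothetical: the slow pass */
void upload_to_gpu(void *binary);          /* hypothetical */

struct program {
	void *unoptimized;              /* uploaded at link time */
	_Atomic(void *) optimized;      /* NULL until the worker finishes */
};

static void *optimize_worker(void *arg)
{
	struct program *prog = arg;

	atomic_store(&prog->optimized, run_full_optimizer(prog->unoptimized));
	return NULL;
}

void link_program(struct program *prog, const char *source)
{
	pthread_t thread;

	prog->unoptimized = quick_codegen(source);
	atomic_store(&prog->optimized, (void *)NULL);
	pthread_create(&thread, NULL, optimize_worker, prog);
	pthread_detach(thread);
}

void use_program(struct program *prog)
{
	void *opt = atomic_load(&prog->optimized);

	/* prefer the optimized binary once the background pass is done */
	upload_to_gpu(opt ? opt : prog->unoptimized);
}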


[PATCH 3/3] drm/exynos: implement kmap/kunmap/kmap_atomic/kunmap_atomic functions of dma_buf_ops

2012-07-10 Thread Cooper Yuan
Implement kmap/kmap_atomic, kunmap/kunmap_atomic functions of dma_buf_ops.

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |   17 +++--
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 913a23e..805b344 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -138,30 +138,35 @@ static void exynos_dmabuf_release(struct dma_buf *dmabuf)
 static void *exynos_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
unsigned long page_num)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;

-   return NULL;
+   return kmap_atomic(buf->pages[page_num]);
 }

 static void exynos_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
unsigned long page_num,
void *addr)
 {
-   /* TODO */
+   kunmap_atomic(addr);
 }

 static void *exynos_gem_dmabuf_kmap(struct dma_buf *dma_buf,
unsigned long page_num)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;

-   return NULL;
+   return kmap(buf->pages[page_num]);
 }

 static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
unsigned long page_num, void *addr)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
+
+   kunmap(buf->pages[page_num]);
 }

 static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct
vm_area_struct *vma)
-- 
1.7.0.4


[PATCH 2/3] drm/exynos: add dmabuf mmap function

2012-07-10 Thread Cooper Yuan
implement mmap function of dma_buf_ops.

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |   38 
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index e4eeb0b..913a23e 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -164,6 +164,43 @@ static void exynos_gem_dmabuf_kunmap(struct
dma_buf *dma_buf,
/* TODO */
 }

+static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct
vm_area_struct *vma)
+{
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct drm_device *dev = exynos_gem_obj->base.dev;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
+   int ret = 0;
+   if (WARN_ON(!exynos_gem_obj->base.filp))
+   return -EINVAL;
+
+   /* Check for valid size. */
+   if (buf->size < vma->vm_end - vma->vm_start) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   if (!dev->driver->gem_vm_ops) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;
+   vma->vm_ops = dev->driver->gem_vm_ops;
+   vma->vm_private_data = exynos_gem_obj;
+   vma->vm_page_prot =  
pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+
+   /* Take a ref for this mapping of the object, so that the fault
+* handler can dereference the mmap offset's pointer to the object.
+* This reference is cleaned up by the corresponding vm_close
+* (which should happen whether the vma was created by this call, or
+* by a vm_open due to mremap or partial unmap or whatever).
+*/
+   vma->vm_ops->open(vma);
+
+out_unlock:
+   return ret;
+}
+
 static struct dma_buf_ops exynos_dmabuf_ops = {
.map_dma_buf= exynos_gem_map_dma_buf,
.unmap_dma_buf  = exynos_gem_unmap_dma_buf,
@@ -172,6 +209,7 @@ static struct dma_buf_ops exynos_dmabuf_ops = {
.kunmap = exynos_gem_dmabuf_kunmap,
.kunmap_atomic  = exynos_gem_dmabuf_kunmap_atomic,
.release= exynos_dmabuf_release,
+   .mmap   = exynos_gem_dmabuf_mmap,
 };

 struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev,
-- 
1.7.0.4


[PATCH 1/3] drm/exynos: correct dma_buf exporter permission as ReadWrite

2012-07-10 Thread Cooper Yuan
Set dma_buf exporter permission as ReadWrite, otherwise mmap will get
errno 13: permission denied.

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 613bf8a..e4eeb0b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -29,6 +29,7 @@
 #include "exynos_drm_drv.h"
 #include "exynos_drm_gem.h"

+#include 
 #include 

 static struct sg_table *exynos_pages_to_sg(struct page **pages, int nr_pages,
@@ -179,7 +180,7 @@ struct dma_buf *exynos_dmabuf_prime_export(struct
drm_device *drm_dev,
struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);

return dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops,
-   exynos_gem_obj->base.size, 0600);
+   exynos_gem_obj->base.size, O_RDWR);
 }

 struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
-- 
1.7.0.4


[RFC] drm/radeon: restoring ring commands in case of a lockup

2012-07-10 Thread Christian König
On 09.07.2012 18:10, Jerome Glisse wrote:
> On Mon, Jul 9, 2012 at 11:59 AM, Michel Dänzer  wrote:
>> On Mon, 2012-07-09 at 12:41 +0200, Christian König wrote:
>>> Hi,
>>>
>>> The following patchset tries to save and restore the not yet processed
>>> commands from the rings in case of a lockup, and with that should make a
>>> userspace problem with a single application far less problematic.
>>>
>>> The first four patches are just stuff this patchset is based upon,
>>> followed by four patches which fix various bugs found while working on
>>> this feature.
>>>
>>> Followed by patches which change the way memory is saved/restored on
>>> suspend/resume: previously, before suspend we unpinned most of the
>>> buffer objects so they could be moved from vram into system memory. But
>>> that is mostly unnecessary because the buffer objects either are already
>>> in system memory or their content can be easily reinitialized.
>>>
>>> The last three patches implement the actual tracking and restoring of
>>> commands in case of a lockup. Please take a look and review.
>> Patches 3, 5 and 14 are
>>
>> Reviewed-by: Michel Dänzer
>>
> Patches 1-9 are
> Reviewed-by: Jerome Glisse 
>
> The others look good but I want to test them too and spend a bit more
> time to double-check a few things. Will try to do that tomorrow.
Just sent out v2 of the patchset. Mainly it integrates your idea of just
saving the rptr right before we call into the IB, but it also contains all
the other comments and fixes from Michel.
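
For those who don't want to dig through the diffs, the save path boils down
to something like this (simplified sketch only -- it ignores the
rptr_save_reg handling from patch 14 and the error paths):

unsigned ring_backup_sketch(struct radeon_ring *ring, uint32_t **data)
{
	unsigned size, ptr, i;

	/* number of dwords between the last processed rptr and wptr */
	size = (ring->wptr - ring->rptr) & ring->ptr_mask;
	if (!size) {
		*data = NULL;
		return 0;
	}

	*data = kmalloc(size * sizeof(uint32_t), GFP_KERNEL);
	if (!*data)
		return 0;

	/* copy the unprocessed commands out of the (wrapping) ring */
	ptr = ring->rptr;
	for (i = 0; i < size; ++i) {
		(*data)[i] = ring->ring[ptr];
		ptr = (ptr + 1) & ring->ptr_mask;
	}
	return size;
}

The restore side is then essentially radeon_ring_lock() followed by
radeon_ring_write() of the saved dwords and a commit.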

Cheers,
Christian.


[PATCH 15/15] drm/radeon: implement ring saving on reset v2

2012-07-10 Thread Christian König
Try to save whatever is on the rings when
we encounter a lockup.

v2: Fix spelling error. Free saved ring data if reset fails.
Add documentation for the new functions.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/radeon.h|4 ++
 drivers/gpu/drm/radeon/radeon_device.c |   48 
 drivers/gpu/drm/radeon/radeon_ring.c   |   75 
 3 files changed, 119 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 64d39ad..6715e4c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -768,6 +768,10 @@ int radeon_ring_test(struct radeon_device *rdev, struct 
radeon_ring *cp);
 void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring 
*ring);
 void radeon_ring_lockup_update(struct radeon_ring *ring);
 bool radeon_ring_test_lockup(struct radeon_device *rdev, struct radeon_ring 
*ring);
+unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring 
*ring,
+   uint32_t **data);
+int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
+   unsigned size, uint32_t *data);
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, 
unsigned ring_size,
 unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg,
 u32 ptr_reg_shift, u32 ptr_reg_mask, u32 nop);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index bbd0971..0302a9f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -996,7 +996,12 @@ int radeon_resume_kms(struct drm_device *dev)

 int radeon_gpu_reset(struct radeon_device *rdev)
 {
-   int r;
+   unsigned ring_sizes[RADEON_NUM_RINGS];
+   uint32_t *ring_data[RADEON_NUM_RINGS];
+
+   bool saved = false;
+
+   int i, r;
int resched;

down_write(&rdev->exclusive_lock);
@@ -1005,20 +1010,47 @@ int radeon_gpu_reset(struct radeon_device *rdev)
resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
radeon_suspend(rdev);

+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   ring_sizes[i] = radeon_ring_backup(rdev, &rdev->ring[i],
+  &ring_data[i]);
+   if (ring_sizes[i]) {
+   saved = true;
+   dev_info(rdev->dev, "Saved %d dwords of commands "
+"on ring %d.\n", ring_sizes[i], i);
+   }
+   }
+
+retry:
r = radeon_asic_reset(rdev);
if (!r) {
-   dev_info(rdev->dev, "GPU reset succeed\n");
+   dev_info(rdev->dev, "GPU reset succeeded, trying to resume\n");
radeon_resume(rdev);
+   }

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   DRM_ERROR("ib ring test failed (%d).\n", r);
+   radeon_restore_bios_scratch_regs(rdev);
+   drm_helper_resume_force_mode(rdev->ddev);

-   radeon_restore_bios_scratch_regs(rdev);
-   drm_helper_resume_force_mode(rdev->ddev);
-   ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
+   if (!r) {
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   radeon_ring_restore(rdev, &rdev->ring[i],
+   ring_sizes[i], ring_data[i]);
+   }
+
+   r = radeon_ib_ring_tests(rdev);
+   if (r) {
+   dev_err(rdev->dev, "ib ring test failed (%d).\n", r);
+   if (saved) {
+   radeon_suspend(rdev);
+   goto retry;
+   }
+   }
+   } else {
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   kfree(ring_data[i]);
+   }
}

+   ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
if (r) {
/* bad news, how to tell it to userspace ? */
dev_info(rdev->dev, "GPU reset failed\n");
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index ce8eb9d..a4fa2c7 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -362,6 +362,81 @@ bool radeon_ring_test_lockup(struct radeon_device *rdev, 
struct radeon_ring *rin
return false;
 }

+/**
+ * radeon_ring_backup - Backup the content of a ring
+ *
+ * @rdev: radeon_device pointer
+ * @ring: the ring we want to backup
+ *
+ * Saves all unprocessed commits to a ring, returns the number of dwords saved.
+ */
+unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring 
*ring,
+   uint32_t **data)
+{
+   unsigned size, ptr, i;
+
+   /* just in case l

[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v2

2012-07-10 Thread Christian König
Before emitting any indirect buffer, emit the offset of the next
valid ring content, if any. This allows code that wants to resume a
ring to do so right after the IB that caused the GPU lockup.

v2: use scratch registers instead of storing it into memory

Signed-off-by: Jerome Glisse 
Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/evergreen.c   |8 +++-
 drivers/gpu/drm/radeon/ni.c  |   11 ++-
 drivers/gpu/drm/radeon/r600.c|   18 --
 drivers/gpu/drm/radeon/radeon.h  |1 +
 drivers/gpu/drm/radeon/radeon_ring.c |4 
 drivers/gpu/drm/radeon/rv770.c   |4 +++-
 drivers/gpu/drm/radeon/si.c  |   22 +++---
 7 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index f39b900..40de347 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device 
*rdev, struct radeon_ib *ib)
/* set to DX10/11 mode */
radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
radeon_ring_write(ring, 1);
-   /* FIXME: implement */
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index f2afefb..6e3d448 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
/* set to DX10/11 mode */
radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
radeon_ring_write(ring, 1);
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
@@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)

 static void cayman_cp_fini(struct radeon_device *rdev)
 {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
cayman_cp_enable(rdev, false);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
 }

 int cayman_cp_resume(struct radeon_device *rdev)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index c808fa9..74fca15 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
 void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, 
unsigned ring_size)
 {
u32 rb_bufsz;
+   int r;

/* Align ring size */
rb_bufsz = drm_order(ring_size / 8);
ring_size = (1 << (rb_bufsz + 1)) * 4;
ring->ring_size = ring_size;
ring->align_mask = 16 - 1;
+
+   r = radeon_scratch_get(rdev, &ring->rptr_save_reg);
+   if (r) {
+   DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", r);
+   ring->rptr_save_reg = 0;
+   }
 }

 void r600_cp_fini(struct radeon_device *rdev)
 {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
r600_cp_stop(rdev);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
 }


@@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
 {
struct radeon_ring *ring = &rdev->ring[ib->ring];

-   /* FIXME: implement */
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 872270c..64d39ad 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -622,6 +622,7 @@ struct radeon_ring {
unsignedrptr;
unsignedrptr_offs;
unsignedrptr_reg;
+   unsignedrptr_save_reg;
unsignedwptr;
unsignedwptr_old;
unsignedwptr_reg;

[PATCH 13/15] drm/radeon: move radeon_ib_ring_tests out of chipset code

2012-07-10 Thread Christian König
Making it easier to control when it is executed.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/evergreen.c |4 
 drivers/gpu/drm/radeon/ni.c|4 
 drivers/gpu/drm/radeon/r100.c  |4 
 drivers/gpu/drm/radeon/r300.c  |4 
 drivers/gpu/drm/radeon/r420.c  |4 
 drivers/gpu/drm/radeon/r520.c  |4 
 drivers/gpu/drm/radeon/r600.c  |4 
 drivers/gpu/drm/radeon/radeon_device.c |   15 +++
 drivers/gpu/drm/radeon/rs400.c |4 
 drivers/gpu/drm/radeon/rs600.c |4 
 drivers/gpu/drm/radeon/rs690.c |4 
 drivers/gpu/drm/radeon/rv515.c |4 
 drivers/gpu/drm/radeon/rv770.c |4 
 drivers/gpu/drm/radeon/si.c|   21 -
 14 files changed, 15 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 82f7aea..f39b900 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3093,10 +3093,6 @@ static int evergreen_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index ec5307c..f2afefb 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1276,10 +1276,6 @@ static int cayman_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = radeon_vm_manager_init(rdev);
if (r) {
dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 9524bd4..e0f5ae8 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3887,10 +3887,6 @@ static int r100_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }

diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index b396e34..646a192 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -1397,10 +1397,6 @@ static int r300_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }

diff --git a/drivers/gpu/drm/radeon/r420.c b/drivers/gpu/drm/radeon/r420.c
index 0062938..f2f5bf6 100644
--- a/drivers/gpu/drm/radeon/r420.c
+++ b/drivers/gpu/drm/radeon/r420.c
@@ -281,10 +281,6 @@ static int r420_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }

diff --git a/drivers/gpu/drm/radeon/r520.c b/drivers/gpu/drm/radeon/r520.c
index 6df3e51..079d3c5 100644
--- a/drivers/gpu/drm/radeon/r520.c
+++ b/drivers/gpu/drm/radeon/r520.c
@@ -209,10 +209,6 @@ static int r520_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index af2f74a..c808fa9 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2395,10 +2395,6 @@ int r600_startup(struct radeon_device *rdev)
return r;
}

-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 254fdb4..bbd0971 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -822,6 +822,10 @@ int radeon_device_init(struct radeon_device *rdev,
if (r)
return r;

+   r = radeon_ib_ring_tests(rdev);
+   if (r)
+   DRM_ERROR("ib ring test failed (%d).\n", r);
+
if (rdev->flags & RADEON_IS_AGP && !rdev->accel_working) {
/* Acceleration not working on AGP card try again
 * with fallback to PCI or PCIE GART
@@ -946,6 +950,7 @@ int radeon_resume_kms(struct drm_device *dev)
 {
struct drm_connector *connector;
struct radeon_device *rdev = dev->dev_private;
+   int r;

if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
return 0;
@@ -960,6 +965,11 @@ int radeon_resume_kms(struct drm_device *dev)
/* resume AGP if in use */
radeon_agp_resume(rdev);
 

[PATCH 12/15] drm/radeon: remove vm_manager start/suspend

2012-07-10 Thread Christian König
Just restore the page table instead. Addressing three
problems with this change:

1. Calling vm_manager_suspend in the suspend path is
   problematic because it wants to wait for the VM use
   to end, which in case of a lockup never happens.

2. In case of a locked up memory controller
   unbinding the VM seems to make it even more
   unstable, creating an unrecoverable lockup
   in the end.

3. If we want to backup/restore the leftover ring
   content we must not unbind VMs in between.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/ni.c  |   12 ++---
 drivers/gpu/drm/radeon/radeon.h  |2 -
 drivers/gpu/drm/radeon/radeon_gart.c |   83 +-
 drivers/gpu/drm/radeon/si.c  |   12 ++---
 4 files changed, 59 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 4004376..ec5307c 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1280,9 +1280,11 @@ static int cayman_startup(struct radeon_device *rdev)
if (r)
return r;

-   r = radeon_vm_manager_start(rdev);
-   if (r)
+   r = radeon_vm_manager_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
return r;
+   }

r = r600_audio_init(rdev);
if (r)
@@ -1315,7 +1317,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   radeon_vm_manager_suspend(rdev);
cayman_cp_enable(rdev, false);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
evergreen_irq_suspend(rdev);
@@ -1392,11 +1393,6 @@ int cayman_init(struct radeon_device *rdev)
return r;

rdev->accel_working = true;
-   r = radeon_vm_manager_init(rdev);
-   if (r) {
-   dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
-   }
-
r = cayman_startup(rdev);
if (r) {
dev_err(rdev->dev, "disabling GPU acceleration\n");
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 8a8c3f8..872270c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1759,8 +1759,6 @@ extern void radeon_ttm_set_active_vram_size(struct 
radeon_device *rdev, u64 size
  */
 int radeon_vm_manager_init(struct radeon_device *rdev);
 void radeon_vm_manager_fini(struct radeon_device *rdev);
-int radeon_vm_manager_start(struct radeon_device *rdev);
-int radeon_vm_manager_suspend(struct radeon_device *rdev);
 int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm);
 void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm);
 int radeon_vm_bind(struct radeon_device *rdev, struct radeon_vm *vm);
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index ee11c50..56752da 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -282,27 +282,58 @@ void radeon_gart_fini(struct radeon_device *rdev)
  *
  * TODO bind a default page at vm initialization for default address
  */
+
 int radeon_vm_manager_init(struct radeon_device *rdev)
 {
+   struct radeon_vm *vm;
+   struct radeon_bo_va *bo_va;
int r;

-   rdev->vm_manager.enabled = false;
+   if (!rdev->vm_manager.enabled) {
+   /* mark first vm as always in use, it's the system one */
+   r = radeon_sa_bo_manager_init(rdev, 
&rdev->vm_manager.sa_manager,
+ rdev->vm_manager.max_pfn * 8,
+ RADEON_GEM_DOMAIN_VRAM);
+   if (r) {
+   dev_err(rdev->dev, "failed to allocate vm bo (%dKB)\n",
+   (rdev->vm_manager.max_pfn * 8) >> 10);
+   return r;
+   }

-   /* mark first vm as always in use, it's the system one */
-   r = radeon_sa_bo_manager_init(rdev, &rdev->vm_manager.sa_manager,
- rdev->vm_manager.max_pfn * 8,
- RADEON_GEM_DOMAIN_VRAM);
-   if (r) {
-   dev_err(rdev->dev, "failed to allocate vm bo (%dKB)\n",
-   (rdev->vm_manager.max_pfn * 8) >> 10);
-   return r;
+   r = rdev->vm_manager.funcs->init(rdev);
+   if (r)
+   return r;
+   
+   rdev->vm_manager.enabled = true;
+
+   r = radeon_sa_bo_manager_start(rdev, 
&rdev->vm_manager.sa_manager);
+   if (r)
+   return r;
}

-   r = rdev->vm_manager.funcs->init(rdev);
-   if (r == 0)
-   rdev->vm_manager.enabled = true;
+   /* restore page table */
+   list_for_each_entry(vm, &rdev->vm_manager.lru_vm, list) {
+   if (vm->id == -1)

[PATCH 11/15] drm/radeon: remove r600_blit_suspend

2012-07-10 Thread Christian König
Just reinitialize the shader content on resume instead.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/evergreen.c  |1 -
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |   40 +--
 drivers/gpu/drm/radeon/ni.c |1 -
 drivers/gpu/drm/radeon/r600.c   |   15 --
 drivers/gpu/drm/radeon/r600_blit_kms.c  |   40 +--
 drivers/gpu/drm/radeon/radeon.h |2 --
 drivers/gpu/drm/radeon/rv770.c  |1 -
 drivers/gpu/drm/radeon/si.c |3 --
 8 files changed, 40 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 64e06e6..82f7aea 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3139,7 +3139,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];

r600_audio_fini(rdev);
-   r600_blit_suspend(rdev);
r700_cp_stop(rdev);
ring->ready = false;
evergreen_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c 
b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index e512560..89cb9fe 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -634,10 +634,6 @@ int evergreen_blit_init(struct radeon_device *rdev)

rdev->r600_blit.max_dim = 16384;

-   /* pin copy shader into vram if already initialized */
-   if (rdev->r600_blit.shader_obj)
-   goto done;
-
rdev->r600_blit.state_offset = 0;

if (rdev->family < CHIP_CAYMAN)
@@ -668,11 +664,26 @@ int evergreen_blit_init(struct radeon_device *rdev)
obj_size += cayman_ps_size * 4;
obj_size = ALIGN(obj_size, 256);

-   r = radeon_bo_create(rdev, obj_size, PAGE_SIZE, true, 
RADEON_GEM_DOMAIN_VRAM,
-NULL, &rdev->r600_blit.shader_obj);
-   if (r) {
-   DRM_ERROR("evergreen failed to allocate shader\n");
-   return r;
+   /* pin copy shader into vram if not already initialized */
+   if (!rdev->r600_blit.shader_obj) {
+   r = radeon_bo_create(rdev, obj_size, PAGE_SIZE, true,
+RADEON_GEM_DOMAIN_VRAM,
+NULL, &rdev->r600_blit.shader_obj);
+   if (r) {
+   DRM_ERROR("evergreen failed to allocate shader\n");
+   return r;
+   }
+
+   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
+   if (unlikely(r != 0))
+   return r;
+   r = radeon_bo_pin(rdev->r600_blit.shader_obj, 
RADEON_GEM_DOMAIN_VRAM,
+ &rdev->r600_blit.shader_gpu_addr);
+   radeon_bo_unreserve(rdev->r600_blit.shader_obj);
+   if (r) {
+   dev_err(rdev->dev, "(%d) pin blit object failed\n", r);
+   return r;
+   }
}

DRM_DEBUG("evergreen blit allocated bo %08x vs %08x ps %08x\n",
@@ -714,17 +725,6 @@ int evergreen_blit_init(struct radeon_device *rdev)
radeon_bo_kunmap(rdev->r600_blit.shader_obj);
radeon_bo_unreserve(rdev->r600_blit.shader_obj);

-done:
-   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
-   if (unlikely(r != 0))
-   return r;
-   r = radeon_bo_pin(rdev->r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM,
- &rdev->r600_blit.shader_gpu_addr);
-   radeon_bo_unreserve(rdev->r600_blit.shader_obj);
-   if (r) {
-   dev_err(rdev->dev, "(%d) pin blit object failed\n", r);
-   return r;
-   }
radeon_ttm_set_active_vram_size(rdev, rdev->mc.real_vram_size);
return 0;
 }
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index fe55310..4004376 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1316,7 +1316,6 @@ int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
radeon_vm_manager_suspend(rdev);
-   r600_blit_suspend(rdev);
cayman_cp_enable(rdev, false);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
evergreen_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 9750f53..af2f74a 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2307,20 +2307,6 @@ int r600_copy_blit(struct radeon_device *rdev,
return 0;
 }

-void r600_blit_suspend(struct radeon_device *rdev)
-{
-   int r;
-
-   /* unpin shaders bo */
-   if (rdev->r600_blit.shader_obj) {
-   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
-   if (!r) {
-   radeon_bo_unpin(rdev->r600_blit.sha

[PATCH 10/15] drm/radeon: remove ib_pool start/suspend

2012-07-10 Thread Christian König
The IB pool is in gart memory, so it is completely
superfluous to unpin / repin it on suspend / resume.

Signed-off-by: Christian König
---
 drivers/gpu/drm/radeon/evergreen.c   |   17 ++---
 drivers/gpu/drm/radeon/ni.c  |   16 ++--
 drivers/gpu/drm/radeon/r100.c|   23 ++-
 drivers/gpu/drm/radeon/r300.c|   17 ++---
 drivers/gpu/drm/radeon/r420.c|   17 ++---
 drivers/gpu/drm/radeon/r520.c|   14 +-
 drivers/gpu/drm/radeon/r600.c|   17 ++---
 drivers/gpu/drm/radeon/radeon.h  |2 --
 drivers/gpu/drm/radeon/radeon_asic.h |1 -
 drivers/gpu/drm/radeon/radeon_ring.c |   17 +++--
 drivers/gpu/drm/radeon/rs400.c   |   17 ++---
 drivers/gpu/drm/radeon/rs600.c   |   17 ++---
 drivers/gpu/drm/radeon/rs690.c   |   17 ++---
 drivers/gpu/drm/radeon/rv515.c   |   16 ++--
 drivers/gpu/drm/radeon/rv770.c   |   17 ++---
 drivers/gpu/drm/radeon/si.c  |   16 ++--
 16 files changed, 84 insertions(+), 157 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index eb9a71a..64e06e6 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3087,9 +3087,11 @@ static int evergreen_startup(struct radeon_device *rdev)
if (r)
return r;

-   r = radeon_ib_pool_start(rdev);
-   if (r)
+   r = radeon_ib_pool_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
return r;
+   }

r = radeon_ib_ring_tests(rdev);
if (r)
@@ -3137,7 +3139,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];

r600_audio_fini(rdev);
-   radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
r700_cp_stop(rdev);
ring->ready = false;
@@ -3224,20 +3225,14 @@ int evergreen_init(struct radeon_device *rdev)
if (r)
return r;

-   r = radeon_ib_pool_init(rdev);
rdev->accel_working = true;
-   if (r) {
-   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
-   rdev->accel_working = false;
-   }
-
r = evergreen_startup(rdev);
if (r) {
dev_err(rdev->dev, "disabling GPU acceleration\n");
r700_cp_fini(rdev);
r600_irq_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_irq_kms_fini(rdev);
evergreen_pcie_gart_fini(rdev);
rdev->accel_working = false;
@@ -3264,7 +3259,7 @@ void evergreen_fini(struct radeon_device *rdev)
r700_cp_fini(rdev);
r600_irq_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_irq_kms_fini(rdev);
evergreen_pcie_gart_fini(rdev);
r600_vram_scratch_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 8b1df33..fe55310 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1270,9 +1270,11 @@ static int cayman_startup(struct radeon_device *rdev)
if (r)
return r;

-   r = radeon_ib_pool_start(rdev);
-   if (r)
+   r = radeon_ib_pool_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
return r;
+   }

r = radeon_ib_ring_tests(rdev);
if (r)
@@ -1313,7 +1315,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
r600_blit_suspend(rdev);
cayman_cp_enable(rdev, false);
@@ -1391,12 +1392,7 @@ int cayman_init(struct radeon_device *rdev)
if (r)
return r;

-   r = radeon_ib_pool_init(rdev);
rdev->accel_working = true;
-   if (r) {
-   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
-   rdev->accel_working = false;
-   }
r = radeon_vm_manager_init(rdev);
if (r) {
dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
@@ -1410,7 +1406,7 @@ int cayman_init(struct radeon_device *rdev)
if (rdev->flags & RADEON_IS_IGP)
si_rlc_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_vm_manager_fini(rdev);
radeon_irq_kms_fini(rdev);
cayman_pcie_gart_fini(rdev);
@@ -1441,7 +1437,7 @@ void cayman_fini(struct radeon

[PATCH 09/15] drm/radeon: make cp init on cayman more robust

2012-07-10 Thread Christian König
It's not critical, but the current code isn't
100% correct.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/ni.c |  133 ++-
 1 file changed, 56 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 32a6082..8b1df33 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -987,10 +987,33 @@ static void cayman_cp_fini(struct radeon_device *rdev)

 int cayman_cp_resume(struct radeon_device *rdev)
 {
+   static const int ridx[] = {
+   RADEON_RING_TYPE_GFX_INDEX,
+   CAYMAN_RING_TYPE_CP1_INDEX,
+   CAYMAN_RING_TYPE_CP2_INDEX
+   };
+   static const unsigned cp_rb_cntl[] = {
+   CP_RB0_CNTL,
+   CP_RB1_CNTL,
+   CP_RB2_CNTL,
+   };
+   static const unsigned cp_rb_rptr_addr[] = {
+   CP_RB0_RPTR_ADDR,
+   CP_RB1_RPTR_ADDR,
+   CP_RB2_RPTR_ADDR
+   };
+   static const unsigned cp_rb_rptr_addr_hi[] = {
+   CP_RB0_RPTR_ADDR_HI,
+   CP_RB1_RPTR_ADDR_HI,
+   CP_RB2_RPTR_ADDR_HI
+   };
+   static const unsigned cp_rb_base[] = {
+   CP_RB0_BASE,
+   CP_RB1_BASE,
+   CP_RB2_BASE
+   };
struct radeon_ring *ring;
-   u32 tmp;
-   u32 rb_bufsz;
-   int r;
+   int i, r;

/* Reset cp; if cp is reset, then PA, SH, VGT also need to be reset */
WREG32(GRBM_SOFT_RESET, (SOFT_RESET_CP |
@@ -1012,91 +1035,47 @@ int cayman_cp_resume(struct radeon_device *rdev)

WREG32(CP_DEBUG, (1 << 27));

-   /* ring 0 - compute and gfx */
-   /* Set ring buffer size */
-   ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
-#ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
-#endif
-   WREG32(CP_RB0_CNTL, tmp);
-
-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB0_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB0_WPTR, ring->wptr);
-
/* set the wb address wether it's enabled or not */
-   WREG32(CP_RB0_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 
& 0xFFFC);
-   WREG32(CP_RB0_RPTR_ADDR_HI, upper_32_bits(rdev->wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET) & 0xFF);
WREG32(SCRATCH_ADDR, ((rdev->wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) >> 
8) & 0x);
+   WREG32(SCRATCH_UMSK, 0xff);

-   if (rdev->wb.enabled)
-   WREG32(SCRATCH_UMSK, 0xff);
-   else {
-   tmp |= RB_NO_UPDATE;
-   WREG32(SCRATCH_UMSK, 0);
-   }
-
-   mdelay(1);
-   WREG32(CP_RB0_CNTL, tmp);
-
-   WREG32(CP_RB0_BASE, ring->gpu_addr >> 8);
-
-   ring->rptr = RREG32(CP_RB0_RPTR);
+   for (i = 0; i < 3; ++i) {
+   uint32_t rb_cntl;
+   uint64_t addr;

-   /* ring1  - compute only */
-   /* Set ring buffer size */
-   ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
+   /* Set ring buffer size */
+   ring = &rdev->ring[ridx[i]];
+   rb_cntl = drm_order(ring->ring_size / 8);
+   rb_cntl |= drm_order(RADEON_GPU_PAGE_SIZE/8) << 8;
 #ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
+   rb_cntl |= BUF_SWAP_32BIT;
 #endif
-   WREG32(CP_RB1_CNTL, tmp);
+   WREG32(cp_rb_cntl[i], rb_cntl);

-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB1_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB1_WPTR, ring->wptr);
-
-   /* set the wb address wether it's enabled or not */
-   WREG32(CP_RB1_RPTR_ADDR, (rdev->wb.gpu_addr + 
RADEON_WB_CP1_RPTR_OFFSET) & 0xFFFC);
-   WREG32(CP_RB1_RPTR_ADDR_HI, upper_32_bits(rdev->wb.gpu_addr + 
RADEON_WB_CP1_RPTR_OFFSET) & 0xFF);
-
-   mdelay(1);
-   WREG32(CP_RB1_CNTL, tmp);
-
-   WREG32(CP_RB1_BASE, ring->gpu_addr >> 8);
-
-   ring->rptr = RREG32(CP_RB1_RPTR);
-
-   /* ring2 - compute only */
-   /* Set ring buffer size */
-   ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
-#ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
-#endif
-   WREG32(CP_RB2_CNTL, tmp);
-
-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB2_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB2_WPTR, ring->wptr);
+   /* set the wb address wether it's enabled or not */
+   addr = rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET;
+   WREG32(cp_r

[PATCH 08/15] drm/radeon: remove FIXME comment from chipset suspend

2012-07-10 Thread Christian König
For a normal suspend/resume we already wait for
the rings to be empty, and for a suspend/resume
in case of a lockup we REALLY don't want to wait
for anything.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/evergreen.c |1 -
 drivers/gpu/drm/radeon/ni.c|1 -
 drivers/gpu/drm/radeon/r600.c  |1 -
 drivers/gpu/drm/radeon/rv770.c |1 -
 drivers/gpu/drm/radeon/si.c|1 -
 5 files changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index f716e08..eb9a71a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3137,7 +3137,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];

r600_audio_fini(rdev);
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
r700_cp_stop(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 2366be3..32a6082 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1334,7 +1334,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
r600_blit_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 43d0c41..de4de2d 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2461,7 +2461,6 @@ int r600_suspend(struct radeon_device *rdev)
r600_audio_fini(rdev);
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
-   /* FIXME: we should wait for ring to be empty */
r600_cp_stop(rdev);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
r600_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index b4f51c5..7e230f6 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -996,7 +996,6 @@ int rv770_suspend(struct radeon_device *rdev)
r600_audio_fini(rdev);
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
-   /* FIXME: we should wait for ring to be empty */
r700_cp_stop(rdev);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
r600_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 34603b3c8..78c790f 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -3807,7 +3807,6 @@ int si_resume(struct radeon_device *rdev)

 int si_suspend(struct radeon_device *rdev)
 {
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
 #if 0
-- 
1.7.9.5



[PATCH 07/15] drm/radeon: fix fence init after resume

2012-07-10 Thread Christian König
Start with the last signaled fence number instead
of the last emitted one.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_fence.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index a194a14..76c5b22 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -578,7 +578,7 @@ int radeon_fence_driver_start_ring(struct radeon_device 
*rdev, int ring)
}
rdev->fence_drv[ring].cpu_addr = &rdev->wb.wb[index/4];
rdev->fence_drv[ring].gpu_addr = rdev->wb.gpu_addr + index;
-   radeon_fence_write(rdev, rdev->fence_drv[ring].sync_seq[ring], ring);
+   radeon_fence_write(rdev, 
atomic64_read(&rdev->fence_drv[ring].last_seq), ring);
rdev->fence_drv[ring].initialized = true;
dev_info(rdev->dev, "fence driver on ring %d use gpu addr 0x%016llx and 
cpu addr 0x%p\n",
 ring, rdev->fence_drv[ring].gpu_addr, 
rdev->fence_drv[ring].cpu_addr);
-- 
1.7.9.5



[PATCH 06/15] drm/radeon: fix fence value access

2012-07-10 Thread Christian König
It is possible that radeon_fence_process is called
after writeback is disabled for suspend, leading
to an invalid read of register 0x0.

This fixes a problem for me where the fence value
is temporarily incremented by 0x1 on
suspend/resume.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_fence.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index be4e4f3..a194a14 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -42,21 +42,23 @@

 static void radeon_fence_write(struct radeon_device *rdev, u32 seq, int ring)
 {
-   if (rdev->wb.enabled) {
-   *rdev->fence_drv[ring].cpu_addr = cpu_to_le32(seq);
+   struct radeon_fence_driver *drv = &rdev->fence_drv[ring];
+   if (likely(rdev->wb.enabled || !drv->scratch_reg)) {
+   *drv->cpu_addr = cpu_to_le32(seq);
} else {
-   WREG32(rdev->fence_drv[ring].scratch_reg, seq);
+   WREG32(drv->scratch_reg, seq);
}
 }

 static u32 radeon_fence_read(struct radeon_device *rdev, int ring)
 {
+   struct radeon_fence_driver *drv = &rdev->fence_drv[ring];
u32 seq = 0;

-   if (rdev->wb.enabled) {
-   seq = le32_to_cpu(*rdev->fence_drv[ring].cpu_addr);
+   if (likely(rdev->wb.enabled || !drv->scratch_reg)) {
+   seq = le32_to_cpu(*drv->cpu_addr);
} else {
-   seq = RREG32(rdev->fence_drv[ring].scratch_reg);
+   seq = RREG32(drv->scratch_reg);
}
return seq;
 }
-- 
1.7.9.5



[PATCH 05/15] drm/radeon: fix ring commit padding

2012-07-10 Thread Christian König
We don't need to pad anything if the number of dwords
written to the ring already matches the requirements.

Fixes some "writting more dword to ring than expected"
warnings.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse
Reviewed-by: Michel Dänzer
---
 drivers/gpu/drm/radeon/radeon_ring.c |7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index 0826e77..674aaba 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -272,13 +272,8 @@ int radeon_ring_lock(struct radeon_device *rdev, struct 
radeon_ring *ring, unsig

 void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring)
 {
-   unsigned count_dw_pad;
-   unsigned i;
-
/* We pad to match fetch size */
-   count_dw_pad = (ring->align_mask + 1) -
-  (ring->wptr & ring->align_mask);
-   for (i = 0; i < count_dw_pad; i++) {
+   while (ring->wptr & ring->align_mask) {
radeon_ring_write(ring, ring->nop);
}
DRM_MEMORYBARRIER();
-- 
1.7.9.5



[PATCH 04/15] drm/radeon: add an exclusive lock for GPU reset v2

2012-07-10 Thread Christian König
From: Jerome Glisse 

GPU reset needs to be exclusive, one happening at a time. For this,
add a rw semaphore so that any path that triggers GPU activity
has to take the semaphore as a reader, thus allowing concurrency.

The GPU reset path takes the semaphore as a writer, ensuring that
no concurrent resets take place.

v2: init rw semaphore

Signed-off-by: Jerome Glisse 
Reviewed-by: Christian König
---
 drivers/gpu/drm/radeon/radeon.h|1 +
 drivers/gpu/drm/radeon/radeon_cs.c |5 +
 drivers/gpu/drm/radeon/radeon_device.c |3 +++
 drivers/gpu/drm/radeon/radeon_gem.c|8 
 4 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 5861ec8..4487873 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1446,6 +1446,7 @@ struct radeon_device {
struct device   *dev;
struct drm_device   *ddev;
struct pci_dev  *pdev;
+   struct rw_semaphore exclusive_lock;
/* ASIC */
union radeon_asic_configconfig;
enum radeon_family  family;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index d5aec09..553da67 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -499,7 +499,9 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
struct radeon_cs_parser parser;
int r;

+   down_read(&rdev->exclusive_lock);
if (!rdev->accel_working) {
+   up_read(&rdev->exclusive_lock);
return -EBUSY;
}
/* initialize parser */
@@ -512,6 +514,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r) {
DRM_ERROR("Failed to initialize parser !\n");
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
}
@@ -520,6 +523,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r != -ERESTARTSYS)
DRM_ERROR("Failed to parse relocation %d!\n", r);
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
}
@@ -533,6 +537,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
}
 out:
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
 }
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index f654ba8..254fdb4 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -734,6 +734,7 @@ int radeon_device_init(struct radeon_device *rdev,
mutex_init(&rdev->gem.mutex);
mutex_init(&rdev->pm.mutex);
init_rwsem(&rdev->pm.mclk_lock);
+   init_rwsem(&rdev->exclusive_lock);
init_waitqueue_head(&rdev->irq.vblank_queue);
init_waitqueue_head(&rdev->irq.idle_queue);
r = radeon_gem_init(rdev);
@@ -988,6 +989,7 @@ int radeon_gpu_reset(struct radeon_device *rdev)
int r;
int resched;

+   down_write(&rdev->exclusive_lock);
radeon_save_bios_scratch_regs(rdev);
/* block TTM */
resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
@@ -1007,6 +1009,7 @@ int radeon_gpu_reset(struct radeon_device *rdev)
dev_info(rdev->dev, "GPU reset failed\n");
}

+   up_write(&rdev->exclusive_lock);
return r;
 }

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index d9b0809..b0be9c4 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -215,12 +215,14 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
*data,
uint32_t handle;
int r;

+   down_read(&rdev->exclusive_lock);
/* create a gem object to contain this object in */
args->size = roundup(args->size, PAGE_SIZE);
r = radeon_gem_object_create(rdev, args->size, args->alignment,
args->initial_domain, false,
false, &gobj);
if (r) {
+   up_read(&rdev->exclusive_lock);
r = radeon_gem_handle_lockup(rdev, r);
return r;
}
@@ -228,10 +230,12 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
*data,
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
if (r) {
+   up_read(&rdev->exclusive_lock);
r = radeon_gem_handle_locku

[PATCH 03/15] drm/radeon: fix fence related segfault in CS

2012-07-10 Thread Christian König
Don't return success if scheduling the IB fails, otherwise
we end up with an oops in ttm_eu_fence_buffer_objects.

Signed-off-by: Christian König
Reviewed-by: Jerome Glisse
Reviewed-by: Michel Dänzer
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_cs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index f1b7527..d5aec09 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -358,7 +358,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
-   return 0;
+   return r;
 }

 static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
-- 
1.7.9.5



[PATCH 02/15] drm/radeon: add error handling to radeon_vm_unbind_locked

2012-07-10 Thread Christian König
Waiting for a fence can fail for different reasons;
the most common is a deadlock.

Signed-off-by: Christian König 
Reviewed-by: Michel Dänzer 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_gart.c |   17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index 2b34c1a..ee11c50 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -316,10 +316,21 @@ static void radeon_vm_unbind_locked(struct radeon_device 
*rdev,
}

/* wait for vm use to end */
-   if (vm->fence) {
-   radeon_fence_wait(vm->fence, false);
-   radeon_fence_unref(&vm->fence);
+   while (vm->fence) {
+   int r;
+   r = radeon_fence_wait(vm->fence, false);
+   if (r)
+   DRM_ERROR("error while waiting for fence: %d\n", r);
+   if (r == -EDEADLK) {
+   mutex_unlock(&rdev->vm_manager.lock);
+   r = radeon_gpu_reset(rdev);
+   mutex_lock(&rdev->vm_manager.lock);
+   if (!r)
+   continue;
+   }
+   break;
}
+   radeon_fence_unref(&vm->fence);

/* hw unbind */
rdev->vm_manager.funcs->unbind(rdev, vm);
-- 
1.7.9.5



[PATCH 01/15] drm/radeon: add error handling to fence_wait_empty_locked

2012-07-10 Thread Christian König
Instead of returning the error, handle it directly,
and while at it fix the comments about the ring lock.
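
Since the function now returns void, a call site simply holds the ring
lock and calls it; a hedged sketch (ring index assumed):

    mutex_lock(&rdev->ring_lock);
    radeon_fence_wait_empty_locked(rdev, RADEON_RING_TYPE_GFX_INDEX);
    mutex_unlock(&rdev->ring_lock);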

Signed-off-by: Christian König 
Reviewed-by: Michel Dänzer 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon.h   |2 +-
 drivers/gpu/drm/radeon/radeon_fence.c |   33 +
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 77b4519b..5861ec8 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -239,7 +239,7 @@ void radeon_fence_process(struct radeon_device *rdev, int 
ring);
 bool radeon_fence_signaled(struct radeon_fence *fence);
 int radeon_fence_wait(struct radeon_fence *fence, bool interruptible);
 int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring);
-int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring);
+void radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring);
 int radeon_fence_wait_any(struct radeon_device *rdev,
  struct radeon_fence **fences,
  bool intr);
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index 7b55625..be4e4f3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -440,14 +440,11 @@ int radeon_fence_wait_any(struct radeon_device *rdev,
return 0;
 }

+/* caller must hold ring lock */
 int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring)
 {
uint64_t seq;

-   /* We are not protected by ring lock when reading current seq but
-* it's ok as worst case is we return to early while we could have
-* wait.
-*/
seq = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL;
if (seq >= rdev->fence_drv[ring].sync_seq[ring]) {
/* nothing to wait for, last_seq is
@@ -457,15 +454,27 @@ int radeon_fence_wait_next_locked(struct radeon_device 
*rdev, int ring)
return radeon_fence_wait_seq(rdev, seq, ring, false, false);
 }

-int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring)
+/* caller must hold ring lock */
+void radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring)
 {
-   /* We are not protected by ring lock when reading current seq
-* but it's ok as wait empty is call from place where no more
-* activity can be scheduled so there won't be concurrent access
-* to seq value.
-*/
-   return radeon_fence_wait_seq(rdev, rdev->fence_drv[ring].sync_seq[ring],
-ring, false, false);
+   uint64_t seq = rdev->fence_drv[ring].sync_seq[ring];
+
+   while(1) {
+   int r;
+   r = radeon_fence_wait_seq(rdev, seq, ring, false, false);
+   if (r == -EDEADLK) {
+   mutex_unlock(&rdev->ring_lock);
+   r = radeon_gpu_reset(rdev);
+   mutex_lock(&rdev->ring_lock);
+   if (!r)
+   continue;
+   }
+   if (r) {
+   dev_err(rdev->dev, "error waiting for ring to become"
+   " idle (%d)\n", r);
+   }
+   return;
+   }
 }

 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence)
-- 
1.7.9.5



[RFC PATCH 8/8] nouveau: Prime execbuffer submission synchronization

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_gem.c |  121 +++--
 1 file changed, 116 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 11c9c2a..e5d36bb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -31,6 +31,7 @@
 #include "nouveau_drm.h"
 #include "nouveau_dma.h"
 #include "nouveau_fence.h"
+#include 

 #define nouveau_gem_pushbuf_sync(chan) 0

@@ -277,6 +278,7 @@ struct validate_op {
struct list_head vram_list;
struct list_head gart_list;
struct list_head both_list;
+   struct list_head prime_list;
 };

 static void
@@ -305,9 +307,36 @@ validate_fini_list(struct list_head *list, struct 
nouveau_fence *fence)
 static void
 validate_fini(struct validate_op *op, struct nouveau_fence* fence)
 {
+   struct list_head *entry, *tmp;
+   struct nouveau_bo *nvbo;
+   struct dma_buf *sync_buf;
+   u32 ofs, val;
+
validate_fini_list(&op->vram_list, fence);
validate_fini_list(&op->gart_list, fence);
validate_fini_list(&op->both_list, fence);
+
+   if (list_empty(&op->prime_list))
+   return;
+
+   if (fence &&
+   !nouveau_fence_prime_get(fence, &sync_buf, &ofs, &val)) {
+   dmabufmgr_eu_fence_buffer_objects(sync_buf, ofs, val,
+ &op->prime_list);
+   dma_buf_put(sync_buf);
+   } else
+   dmabufmgr_eu_backoff_reservation(&op->prime_list);
+
+   list_for_each_safe(entry, tmp, &op->prime_list) {
+   struct dmabufmgr_validate *val;
+   val = list_entry(entry, struct dmabufmgr_validate, head);
+   nvbo = val->priv;
+
+   list_del(&val->head);
+   nvbo->reserved_by = NULL;
+   drm_gem_object_unreference_unlocked(nvbo->gem);
+   kfree(val);
+   }
 }

 static int
@@ -319,9 +348,9 @@ validate_init(struct nouveau_channel *chan, struct drm_file 
*file_priv,
struct drm_nouveau_private *dev_priv = dev->dev_private;
uint32_t sequence;
int trycnt = 0;
-   int ret, i;
+   int i;

-   sequence = atomic_add_return(1, &dev_priv->ttm.validate_sequence);
+   sequence = atomic_inc_return(&dev_priv->ttm.validate_sequence);
 retry:
if (++trycnt > 10) {
NV_ERROR(dev, "%s failed and gave up.\n", __func__);
@@ -332,6 +361,8 @@ retry:
struct drm_nouveau_gem_pushbuf_bo *b = &pbbo[i];
struct drm_gem_object *gem;
struct nouveau_bo *nvbo;
+   int ret = 0, is_prime;
+   struct dmabufmgr_validate *validate = NULL;

gem = drm_gem_object_lookup(dev, file_priv, b->handle);
if (!gem) {
@@ -340,6 +371,7 @@ retry:
return -ENOENT;
}
nvbo = gem->driver_private;
+   is_prime = gem->export_dma_buf || gem->import_attach;

if (nvbo->reserved_by && nvbo->reserved_by == file_priv) {
NV_ERROR(dev, "multiple instances of buffer %d on "
@@ -349,7 +381,21 @@ retry:
return -EINVAL;
}

-   ret = ttm_bo_reserve(&nvbo->bo, true, false, true, sequence);
+   if (likely(!is_prime))
+   ret = ttm_bo_reserve(&nvbo->bo, true, false,
+true, sequence);
+   else {
+   validate = kzalloc(sizeof(*validate), GFP_KERNEL);
+   if (validate) {
+   if (gem->import_attach)
+   validate->bo =
+   gem->import_attach->dmabuf;
+   else
+   validate->bo = gem->export_dma_buf;
+   validate->priv = nvbo;
+   } else
+   ret = -ENOMEM;
+   }
if (ret) {
validate_fini(op, NULL);
if (unlikely(ret == -EAGAIN))
@@ -366,6 +412,9 @@ retry:
b->user_priv = (uint64_t)(unsigned long)nvbo;
nvbo->reserved_by = file_priv;
nvbo->pbbo_index = i;
+   if (is_prime) {
+   list_add_tail(&validate->head, &op->prime_list);
+   } else
if ((b->valid_domains & NOUVEAU_GEM_DOMAIN_VRAM) &&
(b->valid_domains & NOUVEAU_GEM_DOMAIN_GART))
list_add_tail(&nvbo->entry, &op->both_list);
@@ -473,6 +522,60 @@ validate_list(struct nouveau_channel *chan, struct 
list_head *list,
 }

 static int
+validate_prime(struct nouveau_channel *chan,

[RFC PATCH 7/8] nouveau: nvc0 fence prime implementation

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Create a read-only mapping for every imported bo, and create a prime
bo in system memory.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nvc0_fence.c |  104 +-
 1 file changed, 89 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvc0_fence.c 
b/drivers/gpu/drm/nouveau/nvc0_fence.c
index 198e31f..dc6ccab 100644
--- a/drivers/gpu/drm/nouveau/nvc0_fence.c
+++ b/drivers/gpu/drm/nouveau/nvc0_fence.c
@@ -37,6 +37,7 @@ struct nvc0_fence_priv {
 struct nvc0_fence_chan {
struct nouveau_fence_chan base;
struct nouveau_vma vma;
+   struct nouveau_vma prime_vma;
 };

 static int
@@ -45,19 +46,23 @@ nvc0_fence_emit(struct nouveau_fence *fence, bool prime)
struct nouveau_channel *chan = fence->channel;
struct nvc0_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
u64 addr = fctx->vma.offset + chan->id * 16;
-   int ret;
+   int ret, i;

-   ret = RING_SPACE(chan, 5);
-   if (ret == 0) {
+   ret = RING_SPACE(chan, prime ? 10 : 5);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < (prime ? 2 : 1); ++i) {
+   if (i)
+   addr = fctx->prime_vma.offset + chan->id * 16;
BEGIN_NVC0(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
OUT_RING  (chan, upper_32_bits(addr));
OUT_RING  (chan, lower_32_bits(addr));
OUT_RING  (chan, fence->sequence);
OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_WRITE_LONG);
-   FIRE_RING (chan);
}
-
-   return ret;
+   FIRE_RING(chan);
+   return 0;
 }

 static int
@@ -95,6 +100,8 @@ nvc0_fence_context_del(struct nouveau_channel *chan, int 
engine)
struct nvc0_fence_priv *priv = nv_engine(chan->dev, engine);
struct nvc0_fence_chan *fctx = chan->engctx[engine];

+   if (priv->base.prime_bo)
+   nouveau_bo_vma_del(priv->base.prime_bo, &fctx->prime_vma);
nouveau_bo_vma_del(priv->bo, &fctx->vma);
nouveau_fence_context_del(chan->dev, &fctx->base);
chan->engctx[engine] = NULL;
@@ -115,10 +122,16 @@ nvc0_fence_context_new(struct nouveau_channel *chan, int 
engine)
nouveau_fence_context_new(&fctx->base);

ret = nouveau_bo_vma_add(priv->bo, chan->vm, &fctx->vma);
+   if (!ret && priv->base.prime_bo)
+   ret = nouveau_bo_vma_add(priv->base.prime_bo, chan->vm,
+&fctx->prime_vma);
if (ret)
nvc0_fence_context_del(chan, engine);

-   nouveau_bo_wr32(priv->bo, chan->id * 16/4, 0x);
+   fctx->base.sequence = nouveau_bo_rd32(priv->bo, chan->id * 16/4);
+   if (priv->base.prime_bo)
+   nouveau_bo_wr32(priv->base.prime_bo, chan->id * 16/4,
+   fctx->base.sequence);
return ret;
 }

@@ -140,12 +153,55 @@ nvc0_fence_destroy(struct drm_device *dev, int engine)
struct drm_nouveau_private *dev_priv = dev->dev_private;
struct nvc0_fence_priv *priv = nv_engine(dev, engine);

+   nouveau_fence_prime_del(&priv->base);
nouveau_bo_unmap(priv->bo);
+   nouveau_bo_unpin(priv->bo);
nouveau_bo_ref(NULL, &priv->bo);
dev_priv->eng[engine] = NULL;
kfree(priv);
 }

+static int
+nvc0_fence_prime_sync(struct nouveau_channel *chan,
+ struct nouveau_bo *bo,
+ u32 ofs, u32 val, u64 sema_start)
+{
+   struct nvc0_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
+   struct nvc0_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
+   int ret = RING_SPACE(chan, 5);
+   if (ret)
+   return ret;
+
+   if (bo == priv->base.prime_bo)
+   sema_start = fctx->prime_vma.offset;
+   else
+   NV_ERROR(chan->dev, "syncing with %08Lx + %08x >= %08x\n",
+   sema_start, ofs, val);
+   sema_start += ofs;
+
+   BEGIN_NVC0(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
+   OUT_RING  (chan, upper_32_bits(sema_start));
+   OUT_RING  (chan, lower_32_bits(sema_start));
+   OUT_RING  (chan, val);
+   OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_ACQUIRE_GEQUAL |
+NVC0_SUBCHAN_SEMAPHORE_TRIGGER_YIELD);
+   FIRE_RING (chan);
+   return ret;
+}
+
+static void
+nvc0_fence_prime_del_import(struct nouveau_fence_prime_bo_entry *entry) {
+   nouveau_bo_vma_del(entry->bo, &entry->vma);
+}
+
+static int
+nvc0_fence_prime_add_import(struct nouveau_fence_prime_bo_entry *entry) {
+   int ret = nouveau_bo_vma_add_access(entry->bo, entry->chan->vm,
+   &entry->vma, NV_MEM_ACCESS_RO);
+   entry->sema_start = entry->vma.offset;
+   return ret;
+}
+
 int
 nvc0_fence_create(struct drm_device *dev)
 {
@@ -168,17 +224,35 @@ nvc0_fence_create(struct d

[RFC PATCH 6/8] nouveau: nv84 fence prime implementation

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Create a dma object for the prime semaphore and every imported sync bo.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nv84_fence.c |  121 --
 1 file changed, 115 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c 
b/drivers/gpu/drm/nouveau/nv84_fence.c
index b5cfbcb..f739dfc 100644
--- a/drivers/gpu/drm/nouveau/nv84_fence.c
+++ b/drivers/gpu/drm/nouveau/nv84_fence.c
@@ -31,6 +31,7 @@

 struct nv84_fence_chan {
struct nouveau_fence_chan base;
+   u32 sema_start;
 };

 struct nv84_fence_priv {
@@ -42,21 +43,25 @@ static int
 nv84_fence_emit(struct nouveau_fence *fence, bool prime)
 {
struct nouveau_channel *chan = fence->channel;
-   int ret = RING_SPACE(chan, 7);
-   if (ret == 0) {
+   int i, ret;
+
+   ret = RING_SPACE(chan, prime ? 14 : 7);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < (prime ? 2 : 1); ++i) {
BEGIN_NV04(chan, 0, NV11_SUBCHAN_DMA_SEMAPHORE, 1);
-   OUT_RING  (chan, NvSema);
+   OUT_RING  (chan, i ? NvSemaPrime : NvSema);
BEGIN_NV04(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
OUT_RING  (chan, upper_32_bits(chan->id * 16));
OUT_RING  (chan, lower_32_bits(chan->id * 16));
OUT_RING  (chan, fence->sequence);
OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_WRITE_LONG);
-   FIRE_RING (chan);
}
+   FIRE_RING (chan);
return ret;
 }

-
 static int
 nv84_fence_sync(struct nouveau_fence *fence,
struct nouveau_channel *prev, struct nouveau_channel *chan)
@@ -82,12 +87,94 @@ nv84_fence_read(struct nouveau_channel *chan)
return nv_ro32(priv->mem, chan->id * 16);
 }

+static int
+nv84_fence_prime_sync(struct nouveau_channel *chan,
+ struct nouveau_bo *bo,
+ u32 ofs, u32 val, u64 sema_start)
+{
+   struct nv84_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
+   int ret = RING_SPACE(chan, 7);
+   u32 sema = 0;
+   if (ret < 0)
+   return ret;
+
+   if (bo == priv->base.prime_bo) {
+   sema = NvSema;
+   } else {
+   struct sg_table *sgt = bo->bo.sg;
+   struct scatterlist *sg;
+   u32 i;
+   sema = sema_start;
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   if (ofs < sg->offset + sg->length) {
+   ofs -= sg->offset;
+   break;
+   }
+   sema++;
+   }
+   }
+
+   BEGIN_NV04(chan, 0, NV11_SUBCHAN_DMA_SEMAPHORE, 1);
+   OUT_RING  (chan, sema);
+   BEGIN_NV04(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
+   OUT_RING  (chan, 0);
+   OUT_RING  (chan, ofs);
+   OUT_RING  (chan, val);
+   OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_ACQUIRE_GEQUAL);
+   FIRE_RING (chan);
+   return ret;
+}
+
+static void
+nv84_fence_prime_del_import(struct nouveau_fence_prime_bo_entry *entry) {
+   u32 i;
+   for (i = entry->sema_start; i <  entry->sema_start + entry->sema_len; 
++i)
+   nouveau_ramht_remove(entry->chan, i);
+}
+
+static int
+nv84_fence_prime_add_import(struct nouveau_fence_prime_bo_entry *entry) {
+   struct sg_table *sgt = entry->bo->bo.sg;
+   struct nouveau_channel *chan = entry->chan;
+   struct nv84_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
+   struct scatterlist *sg;
+   u32 i, sema;
+   int ret;
+
+   sema = entry->sema_start = fctx->sema_start;
+   entry->sema_len = 0;
+
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   struct nouveau_gpuobj *obj;
+   ret = nouveau_gpuobj_dma_new(chan, NV_CLASS_DMA_FROM_MEMORY,
+sg_dma_address(sg), PAGE_SIZE,
+NV_MEM_ACCESS_RO,
+NV_MEM_TARGET_PCI, &obj);
+   if (ret)
+   goto err;
+
+   ret = nouveau_ramht_insert(chan, sema, obj);
+   nouveau_gpuobj_ref(NULL, &obj);
+   if (ret)
+   goto err;
+   entry->sema_len++;
+   sema++;
+   }
+   fctx->sema_start += (entry->sema_len + 0xff) & ~0xff;
+   return 0;
+
+err:
+   nv84_fence_prime_del_import(entry);
+   return ret;
+}
+
 static void
 nv84_fence_context_del(struct nouveau_channel *chan, int engine)
 {
struct nv84_fence_chan *fctx = chan->engctx[engine];
nouveau_fence_context_del(chan->dev, &fctx->base);
chan->engctx[engine] = NULL;
+
kfree(fctx);
 }

@@ -104,6 +191,7 @@ nv84_fence_context_new(struct nouveau_channel *chan, int 
engine)
return -ENOMEM;

[RFC PATCH 5/8] nouveau: Add methods preparing for prime fencing

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

This can be used by nv84 and nvc0 to implement hardware fencing;
earlier systems will require more thought but can fall back to
software for now.

Signed-off-by: Maarten Lankhorst 

---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |6 +-
 drivers/gpu/drm/nouveau/nouveau_channel.c |2 +-
 drivers/gpu/drm/nouveau/nouveau_display.c |2 +-
 drivers/gpu/drm/nouveau/nouveau_dma.h |1 +
 drivers/gpu/drm/nouveau/nouveau_drv.h |5 +
 drivers/gpu/drm/nouveau/nouveau_fence.c   |  242 -
 drivers/gpu/drm/nouveau/nouveau_fence.h   |   44 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c |6 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c   |2 +
 drivers/gpu/drm/nouveau/nv04_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nv10_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nv84_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nvc0_fence.c  |4 +-
 13 files changed, 304 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 4318320..a97025a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -52,6 +52,9 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
DRM_ERROR("bo %p still attached to GEM object\n", bo);

nv10_mem_put_tile_region(dev, nvbo->tile, NULL);
+
+   if (nvbo->fence_import_attach)
+   nouveau_fence_prime_del_bo(nvbo);
kfree(nvbo);
 }

@@ -109,6 +112,7 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
INIT_LIST_HEAD(&nvbo->head);
INIT_LIST_HEAD(&nvbo->entry);
INIT_LIST_HEAD(&nvbo->vma_list);
+   INIT_LIST_HEAD(&nvbo->prime_chan_entries);
nvbo->tile_mode = tile_mode;
nvbo->tile_flags = tile_flags;
nvbo->bo.bdev = &dev_priv->ttm.bdev;
@@ -480,7 +484,7 @@ nouveau_bo_move_accel_cleanup(struct nouveau_channel *chan,
struct nouveau_fence *fence = NULL;
int ret;

-   ret = nouveau_fence_new(chan, &fence);
+   ret = nouveau_fence_new(chan, &fence, false);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/nouveau/nouveau_channel.c 
b/drivers/gpu/drm/nouveau/nouveau_channel.c
index 629d8a2..85a8556 100644
--- a/drivers/gpu/drm/nouveau/nouveau_channel.c
+++ b/drivers/gpu/drm/nouveau/nouveau_channel.c
@@ -362,7 +362,7 @@ nouveau_channel_idle(struct nouveau_channel *chan)
struct nouveau_fence *fence = NULL;
int ret;

-   ret = nouveau_fence_new(chan, &fence);
+   ret = nouveau_fence_new(chan, &fence, false);
if (!ret) {
ret = nouveau_fence_wait(fence, false, false);
nouveau_fence_unref(&fence);
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c 
b/drivers/gpu/drm/nouveau/nouveau_display.c
index 69688ef..7c76776 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -466,7 +466,7 @@ nouveau_page_flip_emit(struct nouveau_channel *chan,
}
FIRE_RING (chan);

-   ret = nouveau_fence_new(chan, pfence);
+   ret = nouveau_fence_new(chan, pfence, false);
if (ret)
goto fail;

diff --git a/drivers/gpu/drm/nouveau/nouveau_dma.h 
b/drivers/gpu/drm/nouveau/nouveau_dma.h
index 8db68be..d02ffd3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dma.h
+++ b/drivers/gpu/drm/nouveau/nouveau_dma.h
@@ -74,6 +74,7 @@ enum {
NvEvoSema0  = 0x8010,
NvEvoSema1  = 0x8011,
NvNotify1   = 0x8012,
+   NvSemaPrime = 0x801f,

/* G80+ display objects */
NvEvoVRAM   = 0x0100,
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 2c17989..ad49594 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -126,6 +126,11 @@ struct nouveau_bo {

struct ttm_bo_kmap_obj dma_buf_vmap;
int vmapping_count;
+
+   /* fence related stuff */
+   struct nouveau_bo *sync_bo;
+   struct list_head prime_chan_entries;
+   struct dma_buf_attachment *fence_import_attach;
 };

 #define nouveau_bo_tile_layout(nvbo)   \
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 3c18049..d4c9c40 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -29,17 +29,64 @@

 #include 
 #include 
+#include 

 #include "nouveau_drv.h"
 #include "nouveau_ramht.h"
 #include "nouveau_fence.h"
 #include "nouveau_software.h"
 #include "nouveau_dma.h"
+#include "nouveau_fifo.h"
+
+int nouveau_fence_prime_init(struct drm_device *dev,
+struct nouveau_fence_priv *priv, u32 align)
+{
+   int ret = 0;
+#ifdef CONFIG_DMA_SHARED_BUFFER
+   struct nouveau_fifo_priv *pfifo = nv_engine(dev, NVOBJ_ENGINE_FIFO);
+   u32 size = PAGE_ALIGN(pfifo->c

[RFC PATCH 4/8] nouveau: add nouveau_bo_vma_add_access

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

This is needed to allow creation of read-only vm mappings
of fence objects.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |6 +++---
 drivers/gpu/drm/nouveau/nouveau_drv.h |6 --
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 7f80ed5..4318320 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1443,15 +1443,15 @@ nouveau_bo_vma_find(struct nouveau_bo *nvbo, struct 
nouveau_vm *vm)
 }

 int
-nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nouveau_vm *vm,
-  struct nouveau_vma *vma)
+nouveau_bo_vma_add_access(struct nouveau_bo *nvbo, struct nouveau_vm *vm,
+ struct nouveau_vma *vma, u32 access)
 {
const u32 size = nvbo->bo.mem.num_pages << PAGE_SHIFT;
struct nouveau_mem *node = nvbo->bo.mem.mm_node;
int ret;

ret = nouveau_vm_get(vm, size, nvbo->page_shift,
-NV_MEM_ACCESS_RW, vma);
+access, vma);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 7c52eba..2c17989 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -1350,8 +1350,10 @@ extern int nouveau_bo_validate(struct nouveau_bo *, bool 
interruptible,

 extern struct nouveau_vma *
 nouveau_bo_vma_find(struct nouveau_bo *, struct nouveau_vm *);
-extern int  nouveau_bo_vma_add(struct nouveau_bo *, struct nouveau_vm *,
-  struct nouveau_vma *);
+#define nouveau_bo_vma_add(nvbo, vm, vma) \
+   nouveau_bo_vma_add_access((nvbo), (vm), (vma), NV_MEM_ACCESS_RW)
+extern int nouveau_bo_vma_add_access(struct nouveau_bo *, struct nouveau_vm *,
+struct nouveau_vma *, u32 access);
 extern void nouveau_bo_vma_del(struct nouveau_bo *, struct nouveau_vma *);

 /* nouveau_gem.c */
-- 
1.7.9.5



[RFC PATCH 3/8] nouveau: Extend prime code

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

The prime code no longer requires the bo to be backed by a gem object,
and CPU access calls have been implemented. This will be needed for
exporting fence bos.
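
A hedged sketch of how the non-gem export path might be used (the
caller, the flags value and the error handling are assumptions,
presumably matching the fence code later in this series):

    struct dma_buf *buf;
    int ret;

    /* export a bo that has no gem wrapper, e.g. a fence bo */
    ret = nouveau_gem_prime_export_bo(nvbo, 0600, size, &buf);
    if (ret)
        return ret;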

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_drv.h   |6 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c |  106 +--
 2 files changed, 79 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8613cb2..7c52eba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -1374,11 +1374,15 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device 
*, void *,
 extern int nouveau_gem_ioctl_info(struct drm_device *, void *,
  struct drm_file *);

+extern int nouveau_gem_prime_export_bo(struct nouveau_bo *nvbo, int flags,
+  u32 size, struct dma_buf **ret);
 extern struct dma_buf *nouveau_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *obj, int flags);
 extern struct drm_gem_object *nouveau_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf);
-
+extern int nouveau_prime_import_bo(struct drm_device *dev,
+  struct dma_buf *dma_buf,
+  struct nouveau_bo **pnvbo, bool gem);
 /* nouveau_display.c */
 int nouveau_display_create(struct drm_device *dev);
 void nouveau_display_destroy(struct drm_device *dev);
diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c 
b/drivers/gpu/drm/nouveau/nouveau_prime.c
index a25cf2c..537154d3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -35,7 +35,8 @@ static struct sg_table *nouveau_gem_map_dma_buf(struct 
dma_buf_attachment *attac
  enum dma_data_direction dir)
 {
struct nouveau_bo *nvbo = attachment->dmabuf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;
int npages = nvbo->bo.num_pages;
struct sg_table *sg;
int nents;
@@ -59,29 +60,37 @@ static void nouveau_gem_dmabuf_release(struct dma_buf 
*dma_buf)
 {
struct nouveau_bo *nvbo = dma_buf->priv;

-   if (nvbo->gem->export_dma_buf == dma_buf) {
-   nvbo->gem->export_dma_buf = NULL;
+   nouveau_bo_unpin(nvbo);
+   if (!nvbo->gem)
+   nouveau_bo_ref(NULL, &nvbo);
+   else {
+   if (nvbo->gem->export_dma_buf == dma_buf)
+   nvbo->gem->export_dma_buf = NULL;
drm_gem_object_unreference_unlocked(nvbo->gem);
}
 }

 static void *nouveau_gem_kmap_atomic(struct dma_buf *dma_buf, unsigned long 
page_num)
 {
-   return NULL;
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kmap_atomic(nvbo->bo.ttm->pages[page_num]);
 }

 static void nouveau_gem_kunmap_atomic(struct dma_buf *dma_buf, unsigned long 
page_num, void *addr)
 {
-
+   kunmap_atomic(addr);
 }
+
 static void *nouveau_gem_kmap(struct dma_buf *dma_buf, unsigned long page_num)
 {
-   return NULL;
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kmap(nvbo->bo.ttm->pages[page_num]);
 }

 static void nouveau_gem_kunmap(struct dma_buf *dma_buf, unsigned long 
page_num, void *addr)
 {
-
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kunmap(nvbo->bo.ttm->pages[page_num]);
 }

 static int nouveau_gem_prime_mmap(struct dma_buf *dma_buf, struct 
vm_area_struct *vma)
@@ -92,7 +101,8 @@ static int nouveau_gem_prime_mmap(struct dma_buf *dma_buf, 
struct vm_area_struct
 static void *nouveau_gem_prime_vmap(struct dma_buf *dma_buf)
 {
struct nouveau_bo *nvbo = dma_buf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;
int ret;

mutex_lock(&dev->struct_mutex);
@@ -116,7 +126,8 @@ out_unlock:
 static void nouveau_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr)
 {
struct nouveau_bo *nvbo = dma_buf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;

mutex_lock(&dev->struct_mutex);
nvbo->vmapping_count--;
@@ -140,10 +151,9 @@ static const struct dma_buf_ops nouveau_dmabuf_ops =  {
 };

 static int
-nouveau_prime_new(struct drm_device *dev,
- size_t size,
+nouveau_prime_new(struct drm_device *dev, size_t size,
  struct sg_table *sg,
- struct nouveau_bo **pnvbo)
+ struct nouveau_bo **pnvbo, bool gem)
 {
struct nouveau_bo *nvbo;
u32 flags = 0;
@@

[RFC PATCH 2/8] prime wip: i915

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Export the hardware status page so others can read seqno.
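
With the page exported, another device's driver (or a plain CPU wait,
which is all the i915 side implements here) can poll the seqno through
the regular dma-buf kmap interface. A rough sketch, with the
(sync_buf, ofs, val) triple assumed to come from the reservation step
and begin/end_cpu_access elided:

    u32 *p = dma_buf_kmap(sync_buf, ofs >> PAGE_SHIFT);
    /* seqnos compare with wraparound, hence the signed difference */
    while ((s32)(p[(ofs & ~PAGE_MASK) / 4] - val) < 0)
        cpu_relax();
    dma_buf_kunmap(sync_buf, ofs >> PAGE_SHIFT, p);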

Signed-off-by: Maarten Lankhorst 

---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |   29 --
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   87 
 drivers/gpu/drm/i915/intel_ringbuffer.c|   42 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h|3 +
 4 files changed, 145 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index aa308e1..d6bcfdc 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -66,12 +66,25 @@ static void i915_gem_unmap_dma_buf(struct 
dma_buf_attachment *attachment,
 static void i915_gem_dmabuf_release(struct dma_buf *dma_buf)
 {
struct drm_i915_gem_object *obj = dma_buf->priv;
+   struct drm_device *dev = obj->base.dev;
+
+   mutex_lock(&dev->struct_mutex);

if (obj->base.export_dma_buf == dma_buf) {
-   /* drop the reference on the export fd holds */
obj->base.export_dma_buf = NULL;
-   drm_gem_object_unreference_unlocked(&obj->base);
+   } else {
+   drm_i915_private_t *dev_priv = dev->dev_private;
+   struct intel_ring_buffer *ring;
+   int i;
+
+   for_each_ring(ring, dev_priv, i)
+   WARN_ON(ring->sync_buf == dma_buf);
}
+
+   /* drop the reference on the export fd holds */
+   drm_gem_object_unreference(&obj->base);
+
+   mutex_unlock(&dev->struct_mutex);
 }

 static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
@@ -129,21 +142,25 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf 
*dma_buf, void *vaddr)

 static void *i915_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, unsigned 
long page_num)
 {
-   return NULL;
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   return kmap_atomic(obj->pages[page_num]);
 }

 static void i915_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf, unsigned 
long page_num, void *addr)
 {
-
+   kunmap_atomic(addr);
 }
+
 static void *i915_gem_dmabuf_kmap(struct dma_buf *dma_buf, unsigned long 
page_num)
 {
-   return NULL;
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   return kmap(obj->pages[page_num]);
 }

 static void i915_gem_dmabuf_kunmap(struct dma_buf *dma_buf, unsigned long 
page_num, void *addr)
 {
-
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   kunmap(obj->pages[page_num]);
 }

 static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct 
*vma)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 88e2e11..245340e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,6 +33,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include 
+#include 

 struct change_domains {
uint32_t invalidate_domains;
@@ -556,7 +557,8 @@ err_unpin:
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
struct drm_file *file,
-   struct list_head *objects)
+   struct list_head *objects,
+   struct list_head *prime_val)
 {
drm_i915_private_t *dev_priv = ring->dev->dev_private;
struct drm_i915_gem_object *obj;
@@ -564,6 +566,31 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
struct list_head ordered_objects;

+   list_for_each_entry(obj, objects, exec_list) {
+   struct dmabufmgr_validate *val;
+
+   if (!(obj->base.import_attach ||
+ obj->base.export_dma_buf))
+   continue;
+
+   val = kzalloc(sizeof(*val), GFP_KERNEL);
+   if (!val)
+   return -ENOMEM;
+
+   if (obj->base.export_dma_buf)
+   val->bo = obj->base.export_dma_buf;
+   else
+   val->bo = obj->base.import_attach->dmabuf;
+   val->priv = obj;
+   list_add_tail(&val->head, prime_val);
+   }
+
+   if (!list_empty(prime_val)) {
+   ret = dmabufmgr_eu_reserve_buffers(prime_val);
+   if (ret)
+   return ret;
+   }
+
INIT_LIST_HEAD(&ordered_objects);
while (!list_empty(objects)) {
struct drm_i915_gem_exec_object2 *entry;
@@ -712,6 +739,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
  struct drm_file *file,
  struct intel_ring_buffer *ring,
  struct list_head *objects,
+ struct list_head *prime_val,
  struct eb_objects *

[RFC PATCH 1/8] dma-buf-mgr: Try 2

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Core code based on ttm_bo and ttm_execbuf_util

Signed-off-by: Maarten Lankhorst 

---
 drivers/base/Makefile |2 +-
 drivers/base/dma-buf-mgr-eu.c |  263 +
 drivers/base/dma-buf-mgr.c|  149 +++
 drivers/base/dma-buf.c|4 +
 include/linux/dma-buf-mgr.h   |  150 +++
 include/linux/dma-buf.h   |   24 
 6 files changed, 591 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-buf-mgr-eu.c
 create mode 100644 drivers/base/dma-buf-mgr.c
 create mode 100644 include/linux/dma-buf-mgr.h

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 5aa2d70..86e7598 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y  += power/
 obj-$(CONFIG_HAS_DMA)  += dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-buf-mgr.o dma-buf-mgr-eu.o
 obj-$(CONFIG_ISA)  += isa.o
 obj-$(CONFIG_FW_LOADER)+= firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
diff --git a/drivers/base/dma-buf-mgr-eu.c b/drivers/base/dma-buf-mgr-eu.c
new file mode 100644
index 000..ed5e01c
--- /dev/null
+++ b/drivers/base/dma-buf-mgr-eu.c
@@ -0,0 +1,263 @@
+/*
+ * Copyright (C) 2012 Canonical Ltd
+ *
+ * Based on ttm_bo.c which bears the following copyright notice,
+ * but is dual licensed:
+ *
+ * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+#include 
+#include 
+#include 
+
+static void dmabufmgr_eu_backoff_reservation_locked(struct list_head *list)
+{
+   struct dmabufmgr_validate *entry;
+
+   list_for_each_entry(entry, list, head) {
+   struct dma_buf *bo = entry->bo;
+   if (!entry->reserved)
+   continue;
+   entry->reserved = false;
+
+   bo->sync_buf = entry->sync_buf;
+   entry->sync_buf = NULL;
+
+   atomic_set(&bo->reserved, 0);
+   wake_up_all(&bo->event_queue);
+   }
+}
+
+static int
+dmabufmgr_eu_wait_unreserved_locked(struct list_head *list,
+   struct dma_buf *bo)
+{
+   int ret;
+
+   spin_unlock(&dmabufmgr.lru_lock);
+   ret = dmabufmgr_bo_wait_unreserved(bo, true);
+   spin_lock(&dmabufmgr.lru_lock);
+   if (unlikely(ret != 0))
+   dmabufmgr_eu_backoff_reservation_locked(list);
+   return ret;
+}
+
+void
+dmabufmgr_eu_backoff_reservation(struct list_head *list)
+{
+   if (list_empty(list))
+   return;
+
+   spin_lock(&dmabufmgr.lru_lock);
+   dmabufmgr_eu_backoff_reservation_locked(list);
+   spin_unlock(&dmabufmgr.lru_lock);
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_eu_backoff_reservation);
+
+int
+dmabufmgr_eu_reserve_buffers(struct list_head *list)
+{
+   struct dmabufmgr_validate *entry;
+   int ret;
+   u32 val_seq;
+
+   if (list_empty(list))
+   return 0;
+
+   list_for_each_entry(entry, list, head) {
+   entry->reserved = false;
+   entry->sync_buf = NULL;
+   }
+
+retry:
+   spin_lock(&dmabufmgr.lru_lock);
+   val_seq = dmabufmgr.counter++;
+
+   list_for_each_entry(entry, list, head) {
+   struct dma_buf *bo = entry->bo;
+
+retry_this_bo:
+   ret = dmabufmgr_bo_reserve_locked(bo, true, true, true, 
val_seq);
+   switch (ret) {
+   case 0:
+   break;
+   case -EBUSY:
+   ret = dmabufmgr_eu_wait_unreserved_locked(list, bo);
+ 

[RFC PATCH 0/8] Dmabuf synchronization

2012-07-10 Thread Maarten Lankhorst
This patch implements my attempt at dmabuf synchronization.
The core idea is that a lot of devices will have their own
methods of synchronization, but more complicated devices
allow some way of fencing, so why not export those as
dma-buf?

This patchset implements dmabufmgr, which is based on ttm's code.
The ttm code deals with a lot more than just reservation, however,
so I took out almost all the code not dealing with reservations.

I used the drm-intel-next-queued tree as base. It contains some i915
flushing changes. I would rather use linux-next, but the deferred
fput code makes my system unbootable. That is unfortunate since
it would reduce the deadlocks happening in dma_buf_put when 2
devices release each other's dmabuf.

The i915 changes implement a simple cpu wait only, the nouveau code
imports the sync dmabuf read-only and maps it to affected channels,
then performs a wait on it in hardware. Since the hardware may still
be processing other commands, it could be the case that no hardware
wait would have to be performed at all.

Only the nouveau nv84 code is tested, but the nvc0 code should work
as well.
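
For reference, the driver-side flow looks roughly like this (a condensed
sketch of what the i915/nouveau patches below do; gem, driver_bo,
fence_emitted and the err label are placeholders, and error handling is
omitted):

    struct list_head prime_list;
    struct dmabufmgr_validate *v;

    INIT_LIST_HEAD(&prime_list);

    /* one entry per shared (imported or exported) buffer */
    v = kzalloc(sizeof(*v), GFP_KERNEL);
    v->bo = gem->export_dma_buf ?: gem->import_attach->dmabuf;
    v->priv = driver_bo;
    list_add_tail(&v->head, &prime_list);

    /* reserve across all devices, submit work, then fence or back off */
    if (dmabufmgr_eu_reserve_buffers(&prime_list))
        goto err;
    /* ... validate, emit commands, obtain (sync_buf, ofs, seqno) ... */
    if (fence_emitted)
        dmabufmgr_eu_fence_buffer_objects(sync_buf, ofs, seqno, &prime_list);
    else
        dmabufmgr_eu_backoff_reservation(&prime_list);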



[PATCH 1/2] drm: Add colouring to the range allocator

2012-07-10 Thread Chris Wilson
In order to support snoopable memory on non-LLC architectures (so that
we can bind vgem objects into the i915 GATT for example), we have to
avoid the prefetcher on the GPU from crossing memory domains and so
prevent allocation of a snoopable PTE immediately following an uncached
PTE. To do that, we need to extend the range allocator with support for
tracking and segregating different node colours.

This will be used by i915 to segregate memory domains within the GTT.

v2: Now with more drm_mm helpers and less driver interference.
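
The allocator itself stays policy-free: drivers provide an
mm->color_adjust callback that shrinks a candidate hole wherever the
neighbouring nodes have a different colour. A hypothetical callback
(illustrative only, with an assumed 4 KiB guard page; not the i915 one):

    static void guard_color_adjust(struct drm_mm_node *node,
                                   unsigned long color,
                                   unsigned long *start, unsigned long *end)
    {
        /* 'node' precedes the hole; pad if its colour differs */
        if (node->allocated && node->color != color)
            *start += 4096;

        /* same for the node following the hole */
        node = list_entry(node->node_list.next, struct drm_mm_node, node_list);
        if (node->allocated && node->color != color)
            *end -= 4096;
    }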

Signed-off-by: Chris Wilson 
Cc: Dave Airlie 
Cc: Ben Skeggs 
Cc: Jerome Glisse 
Cc: Alex Deucher 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/drm_gem.c |2 +-
 drivers/gpu/drm/drm_mm.c  |  169 -
 drivers/gpu/drm/i915/i915_gem.c   |6 +-
 drivers/gpu/drm/i915/i915_gem_evict.c |9 +-
 include/drm/drm_mm.h  |   93 +++---
 5 files changed, 191 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index d58e69d..fbe0842 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -354,7 +354,7 @@ drm_gem_create_mmap_offset(struct drm_gem_object *obj)

/* Get a DRM GEM mmap offset allocated... */
list->file_offset_node = drm_mm_search_free(&mm->offset_manager,
-   obj->size / PAGE_SIZE, 0, 0);
+   obj->size / PAGE_SIZE, 0, false);

if (!list->file_offset_node) {
DRM_ERROR("failed to allocate offset for bo %d\n", obj->name);
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 961fb54..9bb82f7 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -118,45 +118,53 @@ static inline unsigned long drm_mm_hole_node_end(struct 
drm_mm_node *hole_node)

 static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 struct drm_mm_node *node,
-unsigned long size, unsigned alignment)
+unsigned long size, unsigned alignment,
+unsigned long color)
 {
struct drm_mm *mm = hole_node->mm;
-   unsigned long tmp = 0, wasted = 0;
unsigned long hole_start = drm_mm_hole_node_start(hole_node);
unsigned long hole_end = drm_mm_hole_node_end(hole_node);
+   unsigned long adj_start = hole_start;
+   unsigned long adj_end = hole_end;

BUG_ON(!hole_node->hole_follows || node->allocated);

-   if (alignment)
-   tmp = hole_start % alignment;
+   if (mm->color_adjust)
+   mm->color_adjust(hole_node, color, &adj_start, &adj_end);

-   if (!tmp) {
+   if (alignment) {
+   unsigned tmp = adj_start % alignment;
+   if (tmp)
+   adj_start += alignment - tmp;
+   }
+
+   if (adj_start == hole_start) {
hole_node->hole_follows = 0;
-   list_del_init(&hole_node->hole_stack);
-   } else
-   wasted = alignment - tmp;
+   list_del(&hole_node->hole_stack);
+   }

-   node->start = hole_start + wasted;
+   node->start = adj_start;
node->size = size;
node->mm = mm;
+   node->color = color;
node->allocated = 1;

INIT_LIST_HEAD(&node->hole_stack);
list_add(&node->node_list, &hole_node->node_list);

-   BUG_ON(node->start + node->size > hole_end);
+   BUG_ON(node->start + node->size > adj_end);

+   node->hole_follows = 0;
if (node->start + node->size < hole_end) {
list_add(&node->hole_stack, &mm->hole_stack);
node->hole_follows = 1;
-   } else {
-   node->hole_follows = 0;
}
 }

 struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
 unsigned long size,
 unsigned alignment,
+unsigned long color,
 int atomic)
 {
struct drm_mm_node *node;
@@ -165,7 +173,7 @@ struct drm_mm_node *drm_mm_get_block_generic(struct 
drm_mm_node *hole_node,
if (unlikely(node == NULL))
return NULL;

-   drm_mm_insert_helper(hole_node, node, size, alignment);
+   drm_mm_insert_helper(hole_node, node, size, alignment, color);

return node;
 }
@@ -181,11 +189,11 @@ int drm_mm_insert_node(struct drm_mm *mm, struct 
drm_mm_node *node,
 {
struct drm_mm_node *hole_node;

-   hole_node = drm_mm_search_free(mm, size, alignment, 0);
+   hole_node = drm_mm_search_free(mm, size, alignment, false);
if (!hole_node)
return -ENOSPC;

-   drm_mm_insert_helper(hole_node, node, size, alignment);
+   drm_mm_insert_helper(hole_node, node, size, 

Re: [PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v2

2012-07-10 Thread Jerome Glisse
On Tue, Jul 10, 2012 at 8:51 AM, Christian König
 wrote:
> Before emitting any indirect buffer, emit the offset of the next
> valid ring content, if any. This allows code that wants to resume a
> ring to do so right after the IB that caused the GPU lockup.
>
> v2: use scratch registers instead of storing it into memory
>

Why use scratch registers? To minimize bus activity?

> Signed-off-by: Jerome Glisse 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/evergreen.c   |8 +++-
>  drivers/gpu/drm/radeon/ni.c  |   11 ++-
>  drivers/gpu/drm/radeon/r600.c|   18 --
>  drivers/gpu/drm/radeon/radeon.h  |1 +
>  drivers/gpu/drm/radeon/radeon_ring.c |4 
>  drivers/gpu/drm/radeon/rv770.c   |4 +++-
>  drivers/gpu/drm/radeon/si.c  |   22 +++---
>  7 files changed, 60 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/evergreen.c 
> b/drivers/gpu/drm/radeon/evergreen.c
> index f39b900..40de347 100644
> --- a/drivers/gpu/drm/radeon/evergreen.c
> +++ b/drivers/gpu/drm/radeon/evergreen.c
> @@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device 
> *rdev, struct radeon_ib *ib)
> /* set to DX10/11 mode */
> radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
> radeon_ring_write(ring, 1);
> -   /* FIXME: implement */
> +
> +   if (ring->rptr_save_reg) {
> +   uint32_t next_rptr = ring->wptr + 2 + 4;
> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
> +   radeon_ring_write(ring, next_rptr);
> +   }
> +
> radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
> radeon_ring_write(ring,
>  #ifdef __BIG_ENDIAN
> diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
> index f2afefb..6e3d448 100644
> --- a/drivers/gpu/drm/radeon/ni.c
> +++ b/drivers/gpu/drm/radeon/ni.c
> @@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, 
> struct radeon_ib *ib)
> /* set to DX10/11 mode */
> radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
> radeon_ring_write(ring, 1);
> +
> +   if (ring->rptr_save_reg) {
> +   uint32_t next_rptr = ring->wptr + 2 + 4;

I would rather also skip the surface sync, so add another + 8.
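
For what it's worth, my reading of the dword accounting (packet sizes
assumed, not stated in the patch):

    next_rptr = wptr
              + 2   /* PACKET0(rptr_save_reg): header + value */
              + 4;  /* PACKET3(INDIRECT_BUFFER, 2): header + 3 payload dwords */
    /* skipping the surface sync that follows the IB, as suggested,
     * would add another 8 dwords on NI. */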

> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
> +   radeon_ring_write(ring, next_rptr);
> +   }
> +
> radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
> radeon_ring_write(ring,
>  #ifdef __BIG_ENDIAN
> @@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)
>
>  static void cayman_cp_fini(struct radeon_device *rdev)
>  {
> +   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
> cayman_cp_enable(rdev, false);
> -   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
> +   radeon_ring_fini(rdev, ring);
> +   radeon_scratch_free(rdev, ring->rptr_save_reg);
>  }
>
>  int cayman_cp_resume(struct radeon_device *rdev)
> diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
> index c808fa9..74fca15 100644
> --- a/drivers/gpu/drm/radeon/r600.c
> +++ b/drivers/gpu/drm/radeon/r600.c
> @@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
>  void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, 
> unsigned ring_size)
>  {
> u32 rb_bufsz;
> +   int r;
>
> /* Align ring size */
> rb_bufsz = drm_order(ring_size / 8);
> ring_size = (1 << (rb_bufsz + 1)) * 4;
> ring->ring_size = ring_size;
> ring->align_mask = 16 - 1;
> +
> +   r = radeon_scratch_get(rdev, &ring->rptr_save_reg);
> +   if (r) {
> +   DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", 
> r);
> +   ring->rptr_save_reg = 0;
> +   }
>  }
>
>  void r600_cp_fini(struct radeon_device *rdev)
>  {
> +   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
> r600_cp_stop(rdev);
> -   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
> +   radeon_ring_fini(rdev, ring);
> +   radeon_scratch_free(rdev, ring->rptr_save_reg);
>  }
>
>
> @@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, 
> struct radeon_ib *ib)
>  {
> struct radeon_ring *ring = &rdev->ring[ib->ring];
>
> -   /* FIXME: implement */
> +   if (ring->rptr_save_reg) {
> +   uint32_t next_rptr = ring->wptr + 2 + 4;
> +   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
> +   radeon_ring_write(ring, next_rptr);
> +   }
> +
> radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
> radeon_ring_write(ring,
>  #ifdef __BIG_ENDIAN
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 872270c..64d39ad 

Re: [PATCH 15/15] drm/radeon: implement ring saving on reset v2

2012-07-10 Thread Michel Dänzer
On Die, 2012-07-10 at 14:51 +0200, Christian König wrote: 
> Try to save whatever is on the rings when
> we encounter a lockup.
> 
> v2: Fix spelling error. Free saved ring data if reset fails.
> Add documentation for the new functions.
> 
> Signed-off-by: Christian König 

Just some more spelling nits, otherwise this patch and patch 13 are

Reviewed-by: Michel Dänzer 


> +/**
> + * radeon_ring_backup - Backup the content of a ring
> + *
> + * @rdev: radeon_device pointer
> + * @ring: the ring we want to backup

'back up', in both cases.

> + * Saves all unprocessed commits to a ring, returns the number of dwords 
> saved.
> + */

'unprocessed commands from'?


> +/**
> + * radeon_ring_restore - append saved commands to the ring again
> + *
> + * @rdev: radeon_device pointer
> + * @ring: ring to append commands to
> + * @size: number of dwords we want to write
> + * @data: saved commands
> + *
> + * Allocates space on the ring and restore the previusly saved commands.

'previously'


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[Bug 45018] [bisected] rendering regression since added support for virtual address space on cayman v11

2012-07-10 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=45018

--- Comment #65 from Alexandre Demers  
2012-07-10 00:23:46 PDT ---
Created attachment 64053
  --> https://bugs.freedesktop.org/attachment.cgi?id=64053
xsession with drm-next

.xsession with drm-next branch



[Bug 45018] [bisected] rendering regression since added support for virtual address space on cayman v11

2012-07-10 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=45018

--- Comment #64 from Alexandre Demers  
2012-07-10 00:22:55 PDT ---
Created attachment 64052
  --> https://bugs.freedesktop.org/attachment.cgi?id=64052
dmesg drm-next

dmesg with latest drm-next branch



[Bug 45018] [bisected] rendering regression since added support for virtual address space on cayman v11

2012-07-10 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=45018

--- Comment #63 from Alexandre Demers  
2012-07-10 00:21:56 PDT ---
Now running the latest drm-next just in case. Always the same error, but with a
little something new: with the regular kernel, once the GPU crashes, it stays that
way; with the drm-next branch, it loops. Attaching some files in a moment.

I just started Gnome Shell, then opened a terminal window and launched piglit
r600 tests.

I'm pretty sure this (dmesg):
[   66.238981] radeon :01:00.0: bo 88020f46bc00 va 0x0183B000 conflict
with (bo 88021b65d000 0x0183B000 0x0183C000)
[   66.271373] radeon :01:00.0: bo 880222cc9400 va 0x01814000 conflict
with (bo 880221a50800 0x01814000 0x01815000)
[   66.334540] radeon :01:00.0: bo 880222b7 va 0x01809000 conflict
with (bo 8802230a9000 0x01809000 0x0180A000)

corresponds to (.xsession-error):

radeon: Failed to allocate a buffer:
radeon:size  : 256 bytes
radeon:alignment : 256 bytes
radeon:domains   : 2
EE r600_texture.c:869 r600_texture_get_transfer - failed to create temporary
texture to hold untiled copy
Mesa: User error: GL_OUT_OF_MEMORY in glTexSubImage
radeon: Failed to allocate a buffer:
radeon:size  : 256 bytes
radeon:alignment : 256 bytes
radeon:domains   : 2
EE r600_texture.c:869 r600_texture_get_transfer - failed to create temporary
texture to hold untiled copy
radeon: Failed to allocate a buffer:
radeon:size  : 256 bytes
radeon:alignment : 256 bytes
radeon:domains   : 2
EE r600_texture.c:869 r600_texture_get_transfer - failed to create temporary
texture to hold untiled copy

Then (dmesg):

[  196.710933] radeon :01:00.0: GPU lockup CP stall for more than 1msec
[  196.710946] radeon :01:00.0: GPU lockup (waiting for 0x0675
last fence id 0x066c)
[  196.711129] radeon :01:00.0: couldn't schedule ib
[  196.711239] radeon :01:00.0: couldn't schedule ib
[  196.711805] radeon :01:00.0: couldn't schedule ib
[  196.715732] radeon :01:00.0: couldn't schedule ib
[  196.715975] radeon :01:00.0: couldn't schedule ib
[  196.716362] radeon :01:00.0: couldn't schedule ib
[  196.716627] radeon :01:00.0: couldn't schedule ib
[  196.718012] radeon :01:00.0: couldn't schedule ib
[  196.718262] radeon :01:00.0: couldn't schedule ib
[  196.718480] radeon :01:00.0: couldn't schedule ib
[  196.718985] radeon :01:00.0: couldn't schedule ib
[  196.920396] radeon :01:00.0: couldn't schedule ib
[  196.920703] radeon :01:00.0: couldn't schedule ib
[  196.921084] radeon :01:00.0: couldn't schedule ib
[  196.921318] radeon :01:00.0: couldn't schedule ib
[  196.921558] radeon :01:00.0: couldn't schedule ib
[  196.921898] radeon :01:00.0: couldn't schedule ib
[  196.952350] radeon :01:00.0: couldn't schedule ib
[  196.952386] [drm:radeon_cs_ib_chunk] *ERROR* Failed to schedule IB !
[  196.952439] BUG: unable to handle kernel NULL pointer dereference at
0008
[  196.952494] IP: [] radeon_fence_ref+0xd/0x40 [radeon]
[  196.952531] PGD 221dc4067 PUD 2228ff067 PMD 0 
[  196.952556] Oops:  [#1] PREEMPT SMP 
[  196.952579] CPU 1 
[  196.952617] Modules linked in: fuse snd_usb_audio snd_usbmidi_lib
snd_rawmidi powernow_k8 snd_seq_device radeon ttm joydev snd_hda_codec_hdmi
ppdev evdev pwc snd_hda_codec_realtek r8712u(C) r8169 mperf parport_pc parport
sp5100_tco usb_storage uas drm_kms_helper drm videobuf2_vmalloc
videobuf2_memops hid_logitech_dj pcspkr processor snd_hda_intel snd_hda_codec
i2c_algo_bit mii hid_generic videobuf2_core videodev media wmi kvm_amd
snd_hwdep snd_pcm snd_page_alloc snd_timer psmouse i2c_piix4 usbhid
firewire_ohci hid serio_raw i2c_core firewire_core k10temp kvm microcode
crc_itu_t snd edac_core button soundcore edac_mce_amd ext4 crc16 jbd2 mbcache
pata_acpi sr_mod sd_mod cdrom pata_atiixp ata_generic ohci_hcd ahci libahci
libata ehci_hcd usbcore scsi_mod usb_common
[  196.952957] 
[  196.952969] Pid: 715, comm: Xorg Tainted: G C  
3.5.0-rc4-VANILLA-46957-g74da01d #1 Gigabyte Technology Co., Ltd.
GA-MA78GM-S2H/GA-MA78GM-S2H
[  196.953044] RIP: 0010:[]  []
radeon_fence_ref+0xd/0x40 [radeon]
[  196.953092] RSP: 0018:8802230e9b48  EFLAGS: 00010286
...


and it loops.
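
For what it's worth, the "va ... conflict" lines mean that the virtual address
range being assigned to one buffer object overlaps a range already reserved by
another BO in the same VM. A minimal sketch of that interval test (illustrative
only, not the actual radeon code):

    /* does [new_start, new_end) collide with [cur_start, cur_end)? */
    static bool va_ranges_overlap(uint64_t new_start, uint64_t new_end,
                                  uint64_t cur_start, uint64_t cur_end)
    {
            return new_start < cur_end && cur_start < new_end;
    }

With the addresses above, 0x0183B000 falls exactly at the start of the other
BO's reserved range [0x0183B000, 0x0183C000), so the check fires.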



[PATCH 3/3] drm/exynos: implement kmap/kunmap/kmap_atomic/kunmap_atomic functions of dma_buf_ops

2012-07-10 Thread Cooper Yuan
Implement the kmap/kmap_atomic and kunmap/kunmap_atomic functions of dma_buf_ops.

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |   17 +++--
 1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 913a23e..805b344 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -138,30 +138,35 @@ static void exynos_dmabuf_release(struct dma_buf *dmabuf)
 static void *exynos_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
unsigned long page_num)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;

-   return NULL;
+   return kmap_atomic(buf->pages[page_num]);
 }

 static void exynos_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
unsigned long page_num,
void *addr)
 {
-   /* TODO */
+   kunmap_atomic(addr);
 }

 static void *exynos_gem_dmabuf_kmap(struct dma_buf *dma_buf,
unsigned long page_num)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;

-   return NULL;
+   return kmap(buf->pages[page_num]);
 }

 static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
unsigned long page_num, void *addr)
 {
-   /* TODO */
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
+
+   kunmap(buf->pages[page_num]);
 }

 static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct
vm_area_struct *vma)
-- 
1.7.0.4
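
For context, the dma-buf core drives these callbacks through its kmap helpers;
a minimal importer-side sketch (assuming dmabuf was exported by the code above;
error handling and dma_buf_begin/end_cpu_access bracketing omitted for
brevity):

    #include <linux/dma-buf.h>

    /* read the first byte of page 0 of an imported exynos dma-buf */
    static u8 peek_first_byte(struct dma_buf *dmabuf)
    {
            /* ends up in exynos_gem_dmabuf_kmap() */
            u8 *vaddr = dma_buf_kmap(dmabuf, 0);
            u8 first = vaddr[0];

            /* ends up in exynos_gem_dmabuf_kunmap() */
            dma_buf_kunmap(dmabuf, 0, vaddr);
            return first;
    }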


[PATCH 2/3] drm/exynos: add dmabuf mmap function

2012-07-10 Thread Cooper Yuan
Implement the mmap function of dma_buf_ops.

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |   38 
 1 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index e4eeb0b..913a23e 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -164,6 +164,43 @@ static void exynos_gem_dmabuf_kunmap(struct
dma_buf *dma_buf,
/* TODO */
 }

+static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct
vm_area_struct *vma)
+{
+   struct exynos_drm_gem_obj *exynos_gem_obj = dma_buf->priv;
+   struct drm_device *dev = exynos_gem_obj->base.dev;
+   struct exynos_drm_gem_buf *buf = exynos_gem_obj->buffer;
+   int ret = 0;
+   if (WARN_ON(!exynos_gem_obj->base.filp))
+   return -EINVAL;
+
+   /* Check for valid size. */
+   if (buf->size < vma->vm_end - vma->vm_start) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   if (!dev->driver->gem_vm_ops) {
+   ret = -EINVAL;
+   goto out_unlock;
+   }
+
+   vma->vm_flags |= VM_RESERVED | VM_IO | VM_MIXEDMAP | VM_DONTEXPAND;
+   vma->vm_ops = dev->driver->gem_vm_ops;
+   vma->vm_private_data = exynos_gem_obj;
+   vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+
+   /* Take a ref for this mapping of the object, so that the fault
+* handler can dereference the mmap offset's pointer to the object.
+* This reference is cleaned up by the corresponding vm_close
+* (which should happen whether the vma was created by this call, or
+* by a vm_open due to mremap or partial unmap or whatever).
+*/
+   vma->vm_ops->open(vma);
+
+out_unlock:
+   return ret;
+}
+
 static struct dma_buf_ops exynos_dmabuf_ops = {
.map_dma_buf= exynos_gem_map_dma_buf,
.unmap_dma_buf  = exynos_gem_unmap_dma_buf,
@@ -172,6 +209,7 @@ static struct dma_buf_ops exynos_dmabuf_ops = {
.kunmap = exynos_gem_dmabuf_kunmap,
.kunmap_atomic  = exynos_gem_dmabuf_kunmap_atomic,
.release= exynos_dmabuf_release,
+   .mmap   = exynos_gem_dmabuf_mmap,
 };

 struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev,
-- 
1.7.0.4
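
With this in place, userspace can map a PRIME fd directly; a minimal sketch
(the fd is assumed to come from something like DRM_IOCTL_PRIME_HANDLE_TO_FD,
and the size from the exporting driver):

    #include <stdio.h>
    #include <sys/mman.h>

    /* map a dma-buf fd read/write; returns NULL on failure */
    static void *map_prime_buffer(int buf_fd, size_t size)
    {
            void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                             MAP_SHARED, buf_fd, 0);
            if (ptr == MAP_FAILED) {
                    perror("mmap dma-buf");
                    return NULL;
            }
            return ptr;
    }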


[PATCH 1/3] drm/exynos: correct dma_buf exporter permission as ReadWrite

2012-07-10 Thread Cooper Yuan
Set the dma_buf exporter permission to read-write; otherwise mmap will fail
with errno 13 (permission denied).

Signed-off-by: Cooper Yuan 
---
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 613bf8a..e4eeb0b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -29,6 +29,7 @@
 #include "exynos_drm_drv.h"
 #include "exynos_drm_gem.h"

+#include 
 #include 

 static struct sg_table *exynos_pages_to_sg(struct page **pages, int nr_pages,
@@ -179,7 +180,7 @@ struct dma_buf *exynos_dmabuf_prime_export(struct
drm_device *drm_dev,
struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);

return dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops,
-   exynos_gem_obj->base.size, 0600);
+   exynos_gem_obj->base.size, O_RDWR);
 }

 struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
-- 
1.7.0.4
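
The reasoning, as far as I can tell: the last argument of dma_buf_export() is
passed on as open(2)-style file flags for the anonymous inode, not as a
permission mode. A short sketch of the distinction (my reading, not
authoritative):

    /* 0600 is a permission mode; read as open flags its O_ACCMODE
     * bits are 0, i.e. O_RDONLY, so the dma-buf file never gets
     * FMODE_WRITE and a PROT_WRITE + MAP_SHARED mmap fails with
     * -EACCES (errno 13).  O_RDWR sets the access mode properly: */
    dmabuf = dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops,
                            exynos_gem_obj->base.size, O_RDWR);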


Re: [RFC] drm/radeon: restoring ring commands in case of a lockup

2012-07-10 Thread Christian König

On 09.07.2012 18:10, Jerome Glisse wrote:

On Mon, Jul 9, 2012 at 11:59 AM, Michel Dänzer  wrote:

On Mon, 2012-07-09 at 12:41 +0200, Christian König wrote:

Hi,

The following patchset tries to save and restore the not yet processed commands
from the rings in case of a lockup, and with that should make a problem with a
single userspace application far less problematic.

The first four patches are just stuff this patchset is based upon, followed by
four patches which fix various bugs found while working on this feature.

Those are followed by patches which change the way memory is saved/restored on
suspend/resume: previously we unpinned most of the buffer objects so they could
be moved from VRAM into system memory. But that is mostly unnecessary, because
the buffer objects either are already in system memory or their content can
easily be reinitialized.

The last three patches implement the actual tracking and restoring of commands
in case of a lockup. Please take a look and review.

Patches 3, 5 and 14 are

Reviewed-by: Michel Dänzer 


Patches 1-9 are
Reviewed-by: Jerome Glisse 

The others look good, but I want to test them too and spend a bit more time
to double-check a few things. Will try to do that tomorrow.
Just sent out v2 of the patchset. Mainly it integrates your idea of just
saving the rptr right before we call into the IB, but it also contains all
the other comments and fixes from Michel.
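
For reference, the whole recovery flow after v2 looks roughly like this (a
paraphrased sketch of radeon_gpu_reset() as it ends up after patch 15, not the
verbatim code):

    /* save whatever the GPU has not processed yet */
    radeon_suspend(rdev);
    for (i = 0; i < RADEON_NUM_RINGS; ++i)
            ring_sizes[i] = radeon_ring_backup(rdev, &rdev->ring[i],
                                               &ring_data[i]);

    /* reset and, if that worked, replay the saved commands */
    r = radeon_asic_reset(rdev);
    if (!r) {
            radeon_resume(rdev);
            for (i = 0; i < RADEON_NUM_RINGS; ++i)
                    radeon_ring_restore(rdev, &rdev->ring[i],
                                        ring_sizes[i], ring_data[i]);
    }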


Cheers,
Christian.


[PATCH 15/15] drm/radeon: implement ring saving on reset v2

2012-07-10 Thread Christian König
Try to save whatever is on the rings when
we encounter a lockup.

v2: Fix spelling error. Free saved ring data if reset fails.
Add documentation for the new functions.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h|4 ++
 drivers/gpu/drm/radeon/radeon_device.c |   48 
 drivers/gpu/drm/radeon/radeon_ring.c   |   75 
 3 files changed, 119 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 64d39ad..6715e4c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -768,6 +768,10 @@ int radeon_ring_test(struct radeon_device *rdev, struct 
radeon_ring *cp);
 void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring 
*ring);
 void radeon_ring_lockup_update(struct radeon_ring *ring);
 bool radeon_ring_test_lockup(struct radeon_device *rdev, struct radeon_ring 
*ring);
+unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring 
*ring,
+   uint32_t **data);
+int radeon_ring_restore(struct radeon_device *rdev, struct radeon_ring *ring,
+   unsigned size, uint32_t *data);
 int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *cp, 
unsigned ring_size,
 unsigned rptr_offs, unsigned rptr_reg, unsigned wptr_reg,
 u32 ptr_reg_shift, u32 ptr_reg_mask, u32 nop);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index bbd0971..0302a9f 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -996,7 +996,12 @@ int radeon_resume_kms(struct drm_device *dev)
 
 int radeon_gpu_reset(struct radeon_device *rdev)
 {
-   int r;
+   unsigned ring_sizes[RADEON_NUM_RINGS];
+   uint32_t *ring_data[RADEON_NUM_RINGS];
+
+   bool saved = false;
+
+   int i, r;
int resched;
 
down_write(&rdev->exclusive_lock);
@@ -1005,20 +1010,47 @@ int radeon_gpu_reset(struct radeon_device *rdev)
resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
radeon_suspend(rdev);
 
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   ring_sizes[i] = radeon_ring_backup(rdev, &rdev->ring[i],
+  &ring_data[i]);
+   if (ring_sizes[i]) {
+   saved = true;
+   dev_info(rdev->dev, "Saved %d dwords of commands "
+"on ring %d.\n", ring_sizes[i], i);
+   }
+   }
+
+retry:
r = radeon_asic_reset(rdev);
if (!r) {
-   dev_info(rdev->dev, "GPU reset succeed\n");
+   dev_info(rdev->dev, "GPU reset succeeded, trying to resume\n");
radeon_resume(rdev);
+   }
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   DRM_ERROR("ib ring test failed (%d).\n", r);
+   radeon_restore_bios_scratch_regs(rdev);
+   drm_helper_resume_force_mode(rdev->ddev);
 
-   radeon_restore_bios_scratch_regs(rdev);
-   drm_helper_resume_force_mode(rdev->ddev);
-   ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
+   if (!r) {
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   radeon_ring_restore(rdev, &rdev->ring[i],
+   ring_sizes[i], ring_data[i]);
+   }
+
+   r = radeon_ib_ring_tests(rdev);
+   if (r) {
+   dev_err(rdev->dev, "ib ring test failed (%d).\n", r);
+   if (saved) {
+   radeon_suspend(rdev);
+   goto retry;
+   }
+   }
+   } else {
+   for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+   kfree(ring_data[i]);
+   }
}
 
+   ttm_bo_unlock_delayed_workqueue(&rdev->mman.bdev, resched);
if (r) {
/* bad news, how to tell it to userspace ? */
dev_info(rdev->dev, "GPU reset failed\n");
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index ce8eb9d..a4fa2c7 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -362,6 +362,81 @@ bool radeon_ring_test_lockup(struct radeon_device *rdev, 
struct radeon_ring *rin
return false;
 }
 
+/**
+ * radeon_ring_backup - Backup the content of a ring
+ *
+ * @rdev: radeon_device pointer
+ * @ring: the ring we want to backup
+ *
+ * Saves all unprocessed commands from a ring, returns the number of dwords saved.
+ */
+unsigned radeon_ring_backup(struct radeon_device *rdev, struct radeon_ring 
*ring,
+   uint32_t **data)
+{
+   unsigned size, ptr, i;
+
+   /* just in
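
(The hunk is cut off above.) Conceptually the function snapshots every dword
between the last processed read pointer and the current write pointer; a
hedged reconstruction of that loop (the names follow the patch, the body is my
sketch, not the missing hunk text):

    /* copy the unprocessed part of the ring into *data */
    size = (ring->wptr - ptr) & ring->ptr_mask;  /* ptr: resume point */
    *data = kmalloc(size * sizeof(uint32_t), GFP_KERNEL);
    if (!*data)
            return 0;
    for (i = 0; i < size; ++i) {
            (*data)[i] = ring->ring[ptr++];
            ptr &= ring->ptr_mask;
    }
    return size;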

[PATCH 14/15] drm/radeon: record what is next valid wptr for each ring v2

2012-07-10 Thread Christian König
Before emitting any indirect buffer, emit the offset of the next
valid ring content, if any. This allows code that wants to resume
the ring to resume it right after the IB that caused the GPU lockup.

v2: use scratch registers instead of storing it into memory

Signed-off-by: Jerome Glisse 
Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/evergreen.c   |8 +++-
 drivers/gpu/drm/radeon/ni.c  |   11 ++-
 drivers/gpu/drm/radeon/r600.c|   18 --
 drivers/gpu/drm/radeon/radeon.h  |1 +
 drivers/gpu/drm/radeon/radeon_ring.c |4 
 drivers/gpu/drm/radeon/rv770.c   |4 +++-
 drivers/gpu/drm/radeon/si.c  |   22 +++---
 7 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index f39b900..40de347 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1368,7 +1368,13 @@ void evergreen_ring_ib_execute(struct radeon_device 
*rdev, struct radeon_ib *ib)
/* set to DX10/11 mode */
radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
radeon_ring_write(ring, 1);
-   /* FIXME: implement */
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index f2afefb..6e3d448 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -855,6 +855,13 @@ void cayman_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
/* set to DX10/11 mode */
radeon_ring_write(ring, PACKET3(PACKET3_MODE_CONTROL, 0));
radeon_ring_write(ring, 1);
+
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
@@ -981,8 +988,10 @@ static int cayman_cp_start(struct radeon_device *rdev)
 
 static void cayman_cp_fini(struct radeon_device *rdev)
 {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
cayman_cp_enable(rdev, false);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
 }
 
 int cayman_cp_resume(struct radeon_device *rdev)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index c808fa9..74fca15 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2155,18 +2155,27 @@ int r600_cp_resume(struct radeon_device *rdev)
 void r600_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, 
unsigned ring_size)
 {
u32 rb_bufsz;
+   int r;
 
/* Align ring size */
rb_bufsz = drm_order(ring_size / 8);
ring_size = (1 << (rb_bufsz + 1)) * 4;
ring->ring_size = ring_size;
ring->align_mask = 16 - 1;
+
+   r = radeon_scratch_get(rdev, &ring->rptr_save_reg);
+   if (r) {
+   DRM_ERROR("failed to get scratch reg for rptr save (%d).\n", r);
+   ring->rptr_save_reg = 0;
+   }
 }
 
 void r600_cp_fini(struct radeon_device *rdev)
 {
+   struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
r600_cp_stop(rdev);
-   radeon_ring_fini(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+   radeon_ring_fini(rdev, ring);
+   radeon_scratch_free(rdev, ring->rptr_save_reg);
 }
 
 
@@ -2568,7 +2577,12 @@ void r600_ring_ib_execute(struct radeon_device *rdev, 
struct radeon_ib *ib)
 {
struct radeon_ring *ring = &rdev->ring[ib->ring];
 
-   /* FIXME: implement */
+   if (ring->rptr_save_reg) {
+   uint32_t next_rptr = ring->wptr + 2 + 4;
+   radeon_ring_write(ring, PACKET0(ring->rptr_save_reg, 0));
+   radeon_ring_write(ring, next_rptr);
+   }
+
radeon_ring_write(ring, PACKET3(PACKET3_INDIRECT_BUFFER, 2));
radeon_ring_write(ring,
 #ifdef __BIG_ENDIAN
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 872270c..64d39ad 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -622,6 +622,7 @@ struct radeon_ring {
   unsigned        rptr;
   unsigned        rptr_offs;
   unsigned        rptr_reg;
+   unsigned        rptr_save_reg;
   unsigned        wptr;
   unsigned        wptr_old;
   unsigned        wp
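
(Cut off above.) The "+ 2 + 4" in the ib_execute hooks is plain dword
accounting; a worked example of how I read it:

    /* next_rptr = ring->wptr + 2 + 4
     *   + 2: the PACKET0(rptr_save_reg, 0) write emitted right here
     *        (1 header dword + 1 data dword)
     *   + 4: the PACKET3(PACKET3_INDIRECT_BUFFER, 2) that follows
     *        (1 header dword + 3 payload dwords)
     * so next_rptr names the first dword after the IB packet, letting
     * a post-lockup resume skip exactly the IB that hung the GPU. */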

[PATCH 13/15] drm/radeon: move radeon_ib_ring_tests out of chipset code

2012-07-10 Thread Christian König
Making it easier to control when it is executed.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/evergreen.c |4 
 drivers/gpu/drm/radeon/ni.c|4 
 drivers/gpu/drm/radeon/r100.c  |4 
 drivers/gpu/drm/radeon/r300.c  |4 
 drivers/gpu/drm/radeon/r420.c  |4 
 drivers/gpu/drm/radeon/r520.c  |4 
 drivers/gpu/drm/radeon/r600.c  |4 
 drivers/gpu/drm/radeon/radeon_device.c |   15 +++
 drivers/gpu/drm/radeon/rs400.c |4 
 drivers/gpu/drm/radeon/rs600.c |4 
 drivers/gpu/drm/radeon/rs690.c |4 
 drivers/gpu/drm/radeon/rv515.c |4 
 drivers/gpu/drm/radeon/rv770.c |4 
 drivers/gpu/drm/radeon/si.c|   21 -
 14 files changed, 15 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 82f7aea..f39b900 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3093,10 +3093,6 @@ static int evergreen_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index ec5307c..f2afefb 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1276,10 +1276,6 @@ static int cayman_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = radeon_vm_manager_init(rdev);
if (r) {
dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index 9524bd4..e0f5ae8 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3887,10 +3887,6 @@ static int r100_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index b396e34..646a192 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -1397,10 +1397,6 @@ static int r300_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/r420.c b/drivers/gpu/drm/radeon/r420.c
index 0062938..f2f5bf6 100644
--- a/drivers/gpu/drm/radeon/r420.c
+++ b/drivers/gpu/drm/radeon/r420.c
@@ -281,10 +281,6 @@ static int r420_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/r520.c b/drivers/gpu/drm/radeon/r520.c
index 6df3e51..079d3c5 100644
--- a/drivers/gpu/drm/radeon/r520.c
+++ b/drivers/gpu/drm/radeon/r520.c
@@ -209,10 +209,6 @@ static int r520_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index af2f74a..c808fa9 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2395,10 +2395,6 @@ int r600_startup(struct radeon_device *rdev)
return r;
}
 
-   r = radeon_ib_ring_tests(rdev);
-   if (r)
-   return r;
-
r = r600_audio_init(rdev);
if (r) {
DRM_ERROR("radeon: audio init failed\n");
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index 254fdb4..bbd0971 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -822,6 +822,10 @@ int radeon_device_init(struct radeon_device *rdev,
if (r)
return r;
 
+   r = radeon_ib_ring_tests(rdev);
+   if (r)
+   DRM_ERROR("ib ring test failed (%d).\n", r);
+
if (rdev->flags & RADEON_IS_AGP && !rdev->accel_working) {
/* Acceleration not working on AGP card try again
 * with fallback to PCI or PCIE GART
@@ -946,6 +950,7 @@ int radeon_resume_kms(struct drm_device *dev)
 {
struct drm_connector *connector;
struct radeon_device *rdev = dev->dev_private;
+   int r;
 
if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
return 0;
@@ -960,6 +965,11 @@ int radeon_resume_kms(struct drm_device *dev)
/* resume AGP if in use */
radeon_agp_resume

[PATCH 11/15] drm/radeon: remove r600_blit_suspend

2012-07-10 Thread Christian König
Just reinitialize the shader content on resume instead.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/evergreen.c  |1 -
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |   40 +--
 drivers/gpu/drm/radeon/ni.c |1 -
 drivers/gpu/drm/radeon/r600.c   |   15 --
 drivers/gpu/drm/radeon/r600_blit_kms.c  |   40 +--
 drivers/gpu/drm/radeon/radeon.h |2 --
 drivers/gpu/drm/radeon/rv770.c  |1 -
 drivers/gpu/drm/radeon/si.c |3 --
 8 files changed, 40 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index 64e06e6..82f7aea 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3139,7 +3139,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 
r600_audio_fini(rdev);
-   r600_blit_suspend(rdev);
r700_cp_stop(rdev);
ring->ready = false;
evergreen_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c 
b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index e512560..89cb9fe 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -634,10 +634,6 @@ int evergreen_blit_init(struct radeon_device *rdev)
 
rdev->r600_blit.max_dim = 16384;
 
-   /* pin copy shader into vram if already initialized */
-   if (rdev->r600_blit.shader_obj)
-   goto done;
-
rdev->r600_blit.state_offset = 0;
 
if (rdev->family < CHIP_CAYMAN)
@@ -668,11 +664,26 @@ int evergreen_blit_init(struct radeon_device *rdev)
obj_size += cayman_ps_size * 4;
obj_size = ALIGN(obj_size, 256);
 
-   r = radeon_bo_create(rdev, obj_size, PAGE_SIZE, true, 
RADEON_GEM_DOMAIN_VRAM,
-NULL, &rdev->r600_blit.shader_obj);
-   if (r) {
-   DRM_ERROR("evergreen failed to allocate shader\n");
-   return r;
+   /* pin copy shader into vram if not already initialized */
+   if (!rdev->r600_blit.shader_obj) {
+   r = radeon_bo_create(rdev, obj_size, PAGE_SIZE, true,
+RADEON_GEM_DOMAIN_VRAM,
+NULL, &rdev->r600_blit.shader_obj);
+   if (r) {
+   DRM_ERROR("evergreen failed to allocate shader\n");
+   return r;
+   }
+
+   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
+   if (unlikely(r != 0))
+   return r;
+   r = radeon_bo_pin(rdev->r600_blit.shader_obj, 
RADEON_GEM_DOMAIN_VRAM,
+ &rdev->r600_blit.shader_gpu_addr);
+   radeon_bo_unreserve(rdev->r600_blit.shader_obj);
+   if (r) {
+   dev_err(rdev->dev, "(%d) pin blit object failed\n", r);
+   return r;
+   }
}
 
DRM_DEBUG("evergreen blit allocated bo %08x vs %08x ps %08x\n",
@@ -714,17 +725,6 @@ int evergreen_blit_init(struct radeon_device *rdev)
radeon_bo_kunmap(rdev->r600_blit.shader_obj);
radeon_bo_unreserve(rdev->r600_blit.shader_obj);
 
-done:
-   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
-   if (unlikely(r != 0))
-   return r;
-   r = radeon_bo_pin(rdev->r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM,
- &rdev->r600_blit.shader_gpu_addr);
-   radeon_bo_unreserve(rdev->r600_blit.shader_obj);
-   if (r) {
-   dev_err(rdev->dev, "(%d) pin blit object failed\n", r);
-   return r;
-   }
radeon_ttm_set_active_vram_size(rdev, rdev->mc.real_vram_size);
return 0;
 }
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index fe55310..4004376 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1316,7 +1316,6 @@ int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
radeon_vm_manager_suspend(rdev);
-   r600_blit_suspend(rdev);
cayman_cp_enable(rdev, false);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
evergreen_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 9750f53..af2f74a 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2307,20 +2307,6 @@ int r600_copy_blit(struct radeon_device *rdev,
return 0;
 }
 
-void r600_blit_suspend(struct radeon_device *rdev)
-{
-   int r;
-
-   /* unpin shaders bo */
-   if (rdev->r600_blit.shader_obj) {
-   r = radeon_bo_reserve(rdev->r600_blit.shader_obj, false);
-   if (!r) {
-   radeon_bo_unpin(rdev->r600_

[PATCH 12/15] drm/radeon: remove vm_manager start/suspend

2012-07-10 Thread Christian König
Just restore the page table instead. This addresses three
problems:

1. Calling vm_manager_suspend in the suspend path is
   problematic because it wants to wait for the VM use
   to end, which in case of a lockup never happens.

2. In case of a locked up memory controller
   unbinding the VM seems to make it even more
   unstable, creating an unrecoverable lockup
   in the end.

3. If we want to backup/restore the leftover ring
   content we must not unbind VMs in between.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/ni.c  |   12 ++---
 drivers/gpu/drm/radeon/radeon.h  |2 -
 drivers/gpu/drm/radeon/radeon_gart.c |   83 +-
 drivers/gpu/drm/radeon/si.c  |   12 ++---
 4 files changed, 59 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 4004376..ec5307c 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1280,9 +1280,11 @@ static int cayman_startup(struct radeon_device *rdev)
if (r)
return r;
 
-   r = radeon_vm_manager_start(rdev);
-   if (r)
+   r = radeon_vm_manager_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
return r;
+   }
 
r = r600_audio_init(rdev);
if (r)
@@ -1315,7 +1317,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   radeon_vm_manager_suspend(rdev);
cayman_cp_enable(rdev, false);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
evergreen_irq_suspend(rdev);
@@ -1392,11 +1393,6 @@ int cayman_init(struct radeon_device *rdev)
return r;
 
rdev->accel_working = true;
-   r = radeon_vm_manager_init(rdev);
-   if (r) {
-   dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
-   }
-
r = cayman_startup(rdev);
if (r) {
dev_err(rdev->dev, "disabling GPU acceleration\n");
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 8a8c3f8..872270c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1759,8 +1759,6 @@ extern void radeon_ttm_set_active_vram_size(struct 
radeon_device *rdev, u64 size
  */
 int radeon_vm_manager_init(struct radeon_device *rdev);
 void radeon_vm_manager_fini(struct radeon_device *rdev);
-int radeon_vm_manager_start(struct radeon_device *rdev);
-int radeon_vm_manager_suspend(struct radeon_device *rdev);
 int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm);
 void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm);
 int radeon_vm_bind(struct radeon_device *rdev, struct radeon_vm *vm);
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index ee11c50..56752da 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -282,27 +282,58 @@ void radeon_gart_fini(struct radeon_device *rdev)
  *
  * TODO bind a default page at vm initialization for default address
  */
+
 int radeon_vm_manager_init(struct radeon_device *rdev)
 {
+   struct radeon_vm *vm;
+   struct radeon_bo_va *bo_va;
int r;
 
-   rdev->vm_manager.enabled = false;
+   if (!rdev->vm_manager.enabled) {
+   /* mark first vm as always in use, it's the system one */
+   r = radeon_sa_bo_manager_init(rdev, 
&rdev->vm_manager.sa_manager,
+ rdev->vm_manager.max_pfn * 8,
+ RADEON_GEM_DOMAIN_VRAM);
+   if (r) {
+   dev_err(rdev->dev, "failed to allocate vm bo (%dKB)\n",
+   (rdev->vm_manager.max_pfn * 8) >> 10);
+   return r;
+   }
 
-   /* mark first vm as always in use, it's the system one */
-   r = radeon_sa_bo_manager_init(rdev, &rdev->vm_manager.sa_manager,
- rdev->vm_manager.max_pfn * 8,
- RADEON_GEM_DOMAIN_VRAM);
-   if (r) {
-   dev_err(rdev->dev, "failed to allocate vm bo (%dKB)\n",
-   (rdev->vm_manager.max_pfn * 8) >> 10);
-   return r;
+   r = rdev->vm_manager.funcs->init(rdev);
+   if (r)
+   return r;
+   
+   rdev->vm_manager.enabled = true;
+
+   r = radeon_sa_bo_manager_start(rdev, 
&rdev->vm_manager.sa_manager);
+   if (r)
+   return r;
}
 
-   r = rdev->vm_manager.funcs->init(rdev);
-   if (r == 0)
-   rdev->vm_manager.enabled = true;
+   /* restore page table */
+   list_for_each_entry(vm, &rdev->vm_manager.lru_vm, list) {
+   if (vm->id =
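
(Cut off above.) The remainder of this hunk carries the actual restore idea;
in sketch form (my summary, not the missing patch text):

    /* Instead of unbinding every VM on suspend, keep the bindings and,
     * on init/reset, walk vm_manager.lru_vm: every VM that still owns
     * a hardware id gets re-bound and has the page-table entries of
     * all its bo_va mappings written back.  A locked-up memory
     * controller is thus never poked with an unbind, and leftover
     * ring contents stay valid for the replay. */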

[PATCH 10/15] drm/radeon: remove ib_pool start/suspend

2012-07-10 Thread Christian König
The IB pool is in gart memory, so it is completely
superfluous to unpin / repin it on suspend / resume.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/evergreen.c   |   17 ++---
 drivers/gpu/drm/radeon/ni.c  |   16 ++--
 drivers/gpu/drm/radeon/r100.c|   23 ++-
 drivers/gpu/drm/radeon/r300.c|   17 ++---
 drivers/gpu/drm/radeon/r420.c|   17 ++---
 drivers/gpu/drm/radeon/r520.c|   14 +-
 drivers/gpu/drm/radeon/r600.c|   17 ++---
 drivers/gpu/drm/radeon/radeon.h  |2 --
 drivers/gpu/drm/radeon/radeon_asic.h |1 -
 drivers/gpu/drm/radeon/radeon_ring.c |   17 +++--
 drivers/gpu/drm/radeon/rs400.c   |   17 ++---
 drivers/gpu/drm/radeon/rs600.c   |   17 ++---
 drivers/gpu/drm/radeon/rs690.c   |   17 ++---
 drivers/gpu/drm/radeon/rv515.c   |   16 ++--
 drivers/gpu/drm/radeon/rv770.c   |   17 ++---
 drivers/gpu/drm/radeon/si.c  |   16 ++--
 16 files changed, 84 insertions(+), 157 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index eb9a71a..64e06e6 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3087,9 +3087,11 @@ static int evergreen_startup(struct radeon_device *rdev)
if (r)
return r;
 
-   r = radeon_ib_pool_start(rdev);
-   if (r)
+   r = radeon_ib_pool_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
return r;
+   }
 
r = radeon_ib_ring_tests(rdev);
if (r)
@@ -3137,7 +3139,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 
r600_audio_fini(rdev);
-   radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
r700_cp_stop(rdev);
ring->ready = false;
@@ -3224,20 +3225,14 @@ int evergreen_init(struct radeon_device *rdev)
if (r)
return r;
 
-   r = radeon_ib_pool_init(rdev);
rdev->accel_working = true;
-   if (r) {
-   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
-   rdev->accel_working = false;
-   }
-
r = evergreen_startup(rdev);
if (r) {
dev_err(rdev->dev, "disabling GPU acceleration\n");
r700_cp_fini(rdev);
r600_irq_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_irq_kms_fini(rdev);
evergreen_pcie_gart_fini(rdev);
rdev->accel_working = false;
@@ -3264,7 +3259,7 @@ void evergreen_fini(struct radeon_device *rdev)
r700_cp_fini(rdev);
r600_irq_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_irq_kms_fini(rdev);
evergreen_pcie_gart_fini(rdev);
r600_vram_scratch_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 8b1df33..fe55310 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1270,9 +1270,11 @@ static int cayman_startup(struct radeon_device *rdev)
if (r)
return r;
 
-   r = radeon_ib_pool_start(rdev);
-   if (r)
+   r = radeon_ib_pool_init(rdev);
+   if (r) {
+   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
return r;
+   }
 
r = radeon_ib_ring_tests(rdev);
if (r)
@@ -1313,7 +1315,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
r600_blit_suspend(rdev);
cayman_cp_enable(rdev, false);
@@ -1391,12 +1392,7 @@ int cayman_init(struct radeon_device *rdev)
if (r)
return r;
 
-   r = radeon_ib_pool_init(rdev);
rdev->accel_working = true;
-   if (r) {
-   dev_err(rdev->dev, "IB initialization failed (%d).\n", r);
-   rdev->accel_working = false;
-   }
r = radeon_vm_manager_init(rdev);
if (r) {
dev_err(rdev->dev, "vm manager initialization failed (%d).\n", 
r);
@@ -1410,7 +1406,7 @@ int cayman_init(struct radeon_device *rdev)
if (rdev->flags & RADEON_IS_IGP)
si_rlc_fini(rdev);
radeon_wb_fini(rdev);
-   r100_ib_fini(rdev);
+   radeon_ib_pool_fini(rdev);
radeon_vm_manager_fini(rdev);
radeon_irq_kms_fini(rdev);
cayman_pcie_gart_fini(rdev);
@@ -1441,7 +1437,7 @@ void cayman_fini(struct

[PATCH 09/15] drm/radeon: make cp init on cayman more robust

2012-07-10 Thread Christian König
It's not critical, but the current code isn't
100% correct.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/ni.c |  133 ++-
 1 file changed, 56 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 32a6082..8b1df33 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -987,10 +987,33 @@ static void cayman_cp_fini(struct radeon_device *rdev)
 
 int cayman_cp_resume(struct radeon_device *rdev)
 {
+   static const int ridx[] = {
+   RADEON_RING_TYPE_GFX_INDEX,
+   CAYMAN_RING_TYPE_CP1_INDEX,
+   CAYMAN_RING_TYPE_CP2_INDEX
+   };
+   static const unsigned cp_rb_cntl[] = {
+   CP_RB0_CNTL,
+   CP_RB1_CNTL,
+   CP_RB2_CNTL,
+   };
+   static const unsigned cp_rb_rptr_addr[] = {
+   CP_RB0_RPTR_ADDR,
+   CP_RB1_RPTR_ADDR,
+   CP_RB2_RPTR_ADDR
+   };
+   static const unsigned cp_rb_rptr_addr_hi[] = {
+   CP_RB0_RPTR_ADDR_HI,
+   CP_RB1_RPTR_ADDR_HI,
+   CP_RB2_RPTR_ADDR_HI
+   };
+   static const unsigned cp_rb_base[] = {
+   CP_RB0_BASE,
+   CP_RB1_BASE,
+   CP_RB2_BASE
+   };
struct radeon_ring *ring;
-   u32 tmp;
-   u32 rb_bufsz;
-   int r;
+   int i, r;
 
/* Reset cp; if cp is reset, then PA, SH, VGT also need to be reset */
WREG32(GRBM_SOFT_RESET, (SOFT_RESET_CP |
@@ -1012,91 +1035,47 @@ int cayman_cp_resume(struct radeon_device *rdev)
 
WREG32(CP_DEBUG, (1 << 27));
 
-   /* ring 0 - compute and gfx */
-   /* Set ring buffer size */
-   ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
-#ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
-#endif
-   WREG32(CP_RB0_CNTL, tmp);
-
-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB0_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB0_WPTR, ring->wptr);
-
   /* set the wb address whether it's enabled or not */
-   WREG32(CP_RB0_RPTR_ADDR, (rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 
& 0xFFFC);
-   WREG32(CP_RB0_RPTR_ADDR_HI, upper_32_bits(rdev->wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET) & 0xFF);
WREG32(SCRATCH_ADDR, ((rdev->wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) >> 
8) & 0x);
+   WREG32(SCRATCH_UMSK, 0xff);
 
-   if (rdev->wb.enabled)
-   WREG32(SCRATCH_UMSK, 0xff);
-   else {
-   tmp |= RB_NO_UPDATE;
-   WREG32(SCRATCH_UMSK, 0);
-   }
-
-   mdelay(1);
-   WREG32(CP_RB0_CNTL, tmp);
-
-   WREG32(CP_RB0_BASE, ring->gpu_addr >> 8);
-
-   ring->rptr = RREG32(CP_RB0_RPTR);
+   for (i = 0; i < 3; ++i) {
+   uint32_t rb_cntl;
+   uint64_t addr;
 
-   /* ring1  - compute only */
-   /* Set ring buffer size */
-   ring = &rdev->ring[CAYMAN_RING_TYPE_CP1_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
+   /* Set ring buffer size */
+   ring = &rdev->ring[ridx[i]];
+   rb_cntl = drm_order(ring->ring_size / 8);
+   rb_cntl |= drm_order(RADEON_GPU_PAGE_SIZE/8) << 8;
 #ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
+   rb_cntl |= BUF_SWAP_32BIT;
 #endif
-   WREG32(CP_RB1_CNTL, tmp);
+   WREG32(cp_rb_cntl[i], rb_cntl);
 
-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB1_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB1_WPTR, ring->wptr);
-
-   /* set the wb address wether it's enabled or not */
-   WREG32(CP_RB1_RPTR_ADDR, (rdev->wb.gpu_addr + 
RADEON_WB_CP1_RPTR_OFFSET) & 0xFFFC);
-   WREG32(CP_RB1_RPTR_ADDR_HI, upper_32_bits(rdev->wb.gpu_addr + 
RADEON_WB_CP1_RPTR_OFFSET) & 0xFF);
-
-   mdelay(1);
-   WREG32(CP_RB1_CNTL, tmp);
-
-   WREG32(CP_RB1_BASE, ring->gpu_addr >> 8);
-
-   ring->rptr = RREG32(CP_RB1_RPTR);
-
-   /* ring2 - compute only */
-   /* Set ring buffer size */
-   ring = &rdev->ring[CAYMAN_RING_TYPE_CP2_INDEX];
-   rb_bufsz = drm_order(ring->ring_size / 8);
-   tmp = (drm_order(RADEON_GPU_PAGE_SIZE/8) << 8) | rb_bufsz;
-#ifdef __BIG_ENDIAN
-   tmp |= BUF_SWAP_32BIT;
-#endif
-   WREG32(CP_RB2_CNTL, tmp);
-
-   /* Initialize the ring buffer's read and write pointers */
-   WREG32(CP_RB2_CNTL, tmp | RB_RPTR_WR_ENA);
-   ring->wptr = 0;
-   WREG32(CP_RB2_WPTR, ring->wptr);
+   /* set the wb address whether it's enabled or not */
+   addr = rdev->wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET;
+   WREG

[PATCH 08/15] drm/radeon: remove FIXME comment from chipset suspend

2012-07-10 Thread Christian König
For a normal suspend/resume we already wait for
the rings to be empty, and for a suspend/resume
in case of a lockup we REALLY don't want to wait
for anything.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/evergreen.c |1 -
 drivers/gpu/drm/radeon/ni.c|1 -
 drivers/gpu/drm/radeon/r600.c  |1 -
 drivers/gpu/drm/radeon/rv770.c |1 -
 drivers/gpu/drm/radeon/si.c|1 -
 5 files changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index f716e08..eb9a71a 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3137,7 +3137,6 @@ int evergreen_suspend(struct radeon_device *rdev)
struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 
r600_audio_fini(rdev);
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
r700_cp_stop(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 2366be3..32a6082 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1334,7 +1334,6 @@ int cayman_resume(struct radeon_device *rdev)
 int cayman_suspend(struct radeon_device *rdev)
 {
r600_audio_fini(rdev);
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
r600_blit_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 43d0c41..de4de2d 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2461,7 +2461,6 @@ int r600_suspend(struct radeon_device *rdev)
r600_audio_fini(rdev);
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
-   /* FIXME: we should wait for ring to be empty */
r600_cp_stop(rdev);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
r600_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index b4f51c5..7e230f6 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -996,7 +996,6 @@ int rv770_suspend(struct radeon_device *rdev)
r600_audio_fini(rdev);
radeon_ib_pool_suspend(rdev);
r600_blit_suspend(rdev);
-   /* FIXME: we should wait for ring to be empty */
r700_cp_stop(rdev);
rdev->ring[RADEON_RING_TYPE_GFX_INDEX].ready = false;
r600_irq_suspend(rdev);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 34603b3c8..78c790f 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -3807,7 +3807,6 @@ int si_resume(struct radeon_device *rdev)
 
 int si_suspend(struct radeon_device *rdev)
 {
-   /* FIXME: we should wait for ring to be empty */
radeon_ib_pool_suspend(rdev);
radeon_vm_manager_suspend(rdev);
 #if 0
-- 
1.7.9.5



[PATCH 07/15] drm/radeon: fix fence init after resume

2012-07-10 Thread Christian König
Start with the last signaled fence number instead
of the last emitted one.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_fence.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index a194a14..76c5b22 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -578,7 +578,7 @@ int radeon_fence_driver_start_ring(struct radeon_device 
*rdev, int ring)
}
rdev->fence_drv[ring].cpu_addr = &rdev->wb.wb[index/4];
rdev->fence_drv[ring].gpu_addr = rdev->wb.gpu_addr + index;
-   radeon_fence_write(rdev, rdev->fence_drv[ring].sync_seq[ring], ring);
+   radeon_fence_write(rdev, 
atomic64_read(&rdev->fence_drv[ring].last_seq), ring);
rdev->fence_drv[ring].initialized = true;
dev_info(rdev->dev, "fence driver on ring %d use gpu addr 0x%016llx and 
cpu addr 0x%p\n",
 ring, rdev->fence_drv[ring].gpu_addr, 
rdev->fence_drv[ring].cpu_addr);
-- 
1.7.9.5
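
A worked example of the failure mode this closes (my reading of the patch):

    /* after a reset:
     *   last_seq = 40   (highest fence the GPU actually signaled)
     *   sync_seq = 42   (highest fence ever emitted to the ring)
     *
     * seeding the fence slot with sync_seq (42) makes fences 41 and 42
     * look signaled although they never executed; seeding with
     * last_seq (40) keeps them pending, so the restored ring can
     * complete them for real. */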



[PATCH 06/15] drm/radeon: fix fence value access

2012-07-10 Thread Christian König
It is possible that radeon_fence_process is called
after writeback is disabled for suspend, leading
to an invalid read of register 0x0.

This fixes a problem for me where the fence value
is temporarily incremented by 0x1 on
suspend/resume.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_fence.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index be4e4f3..a194a14 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -42,21 +42,23 @@
 
 static void radeon_fence_write(struct radeon_device *rdev, u32 seq, int ring)
 {
-   if (rdev->wb.enabled) {
-   *rdev->fence_drv[ring].cpu_addr = cpu_to_le32(seq);
+   struct radeon_fence_driver *drv = &rdev->fence_drv[ring];
+   if (likely(rdev->wb.enabled || !drv->scratch_reg)) {
+   *drv->cpu_addr = cpu_to_le32(seq);
} else {
-   WREG32(rdev->fence_drv[ring].scratch_reg, seq);
+   WREG32(drv->scratch_reg, seq);
}
 }
 
 static u32 radeon_fence_read(struct radeon_device *rdev, int ring)
 {
+   struct radeon_fence_driver *drv = &rdev->fence_drv[ring];
u32 seq = 0;
 
-   if (rdev->wb.enabled) {
-   seq = le32_to_cpu(*rdev->fence_drv[ring].cpu_addr);
+   if (likely(rdev->wb.enabled || !drv->scratch_reg)) {
+   seq = le32_to_cpu(*drv->cpu_addr);
} else {
-   seq = RREG32(rdev->fence_drv[ring].scratch_reg);
+   seq = RREG32(drv->scratch_reg);
}
return seq;
 }
-- 
1.7.9.5
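
The failure mode the new test avoids, spelled out (a sketch based on the
commit message):

    /* - writeback gets disabled in the suspend path
     * - a late radeon_fence_process() then takes the !wb.enabled
     *   branch and does RREG32(drv->scratch_reg)
     * - rings whose fence slot lives in the writeback page have
     *   scratch_reg == 0, so that is a read of register 0x0
     * with the added "|| !drv->scratch_reg" such rings always go
     * through drv->cpu_addr instead. */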



[PATCH 05/15] drm/radeon: fix ring commit padding

2012-07-10 Thread Christian König
We don't need to pad anything if the number of dwords
written to the ring already matches the requirements.

Fixes some "writting more dword to ring than expected"
warnings.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
Reviewed-by: Michel Dänzer 
---
 drivers/gpu/drm/radeon/radeon_ring.c |7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c 
b/drivers/gpu/drm/radeon/radeon_ring.c
index 0826e77..674aaba 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -272,13 +272,8 @@ int radeon_ring_lock(struct radeon_device *rdev, struct 
radeon_ring *ring, unsig
 
 void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring)
 {
-   unsigned count_dw_pad;
-   unsigned i;
-
/* We pad to match fetch size */
-   count_dw_pad = (ring->align_mask + 1) -
-  (ring->wptr & ring->align_mask);
-   for (i = 0; i < count_dw_pad; i++) {
+   while (ring->wptr & ring->align_mask) {
radeon_ring_write(ring, ring->nop);
}
DRM_MEMORYBARRIER();
-- 
1.7.9.5
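
A quick worked example with align_mask = 15 (16-dword fetch size):

    /* ring->wptr == 32, i.e. already aligned:
     *   old: count_dw_pad = 16 - (32 & 15) = 16  -> 16 nops written
     *   new: (32 & 15) == 0                      -> loop never runs
     * the old code thus padded a full fetch block even when nothing
     * was needed, overrunning the dword count the caller reserved. */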



[PATCH 04/15] drm/radeon: add an exclusive lock for GPU reset v2

2012-07-10 Thread Christian König
From: Jerome Glisse 

GPU reset needs to be exclusive, one happening at a time. For this,
add a rw semaphore so that any path that triggers GPU activities
has to take the semaphore as a reader, thus allowing concurrency
among them.

The GPU reset path takes the semaphore as a writer, ensuring that
no concurrent resets take place.

v2: init rw semaphore

Signed-off-by: Jerome Glisse 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h|1 +
 drivers/gpu/drm/radeon/radeon_cs.c |5 +
 drivers/gpu/drm/radeon/radeon_device.c |3 +++
 drivers/gpu/drm/radeon/radeon_gem.c|8 
 4 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 5861ec8..4487873 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1446,6 +1446,7 @@ struct radeon_device {
struct device   *dev;
struct drm_device   *ddev;
struct pci_dev  *pdev;
+   struct rw_semaphore exclusive_lock;
/* ASIC */
union radeon_asic_configconfig;
enum radeon_family  family;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index d5aec09..553da67 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -499,7 +499,9 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
struct radeon_cs_parser parser;
int r;
 
+   down_read(&rdev->exclusive_lock);
if (!rdev->accel_working) {
+   up_read(&rdev->exclusive_lock);
return -EBUSY;
}
/* initialize parser */
@@ -512,6 +514,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r) {
DRM_ERROR("Failed to initialize parser !\n");
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
}
@@ -520,6 +523,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
if (r != -ERESTARTSYS)
DRM_ERROR("Failed to parse relocation %d!\n", r);
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
}
@@ -533,6 +537,7 @@ int radeon_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
}
 out:
radeon_cs_parser_fini(&parser, r);
+   up_read(&rdev->exclusive_lock);
r = radeon_cs_handle_lockup(rdev, r);
return r;
 }
diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index f654ba8..254fdb4 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -734,6 +734,7 @@ int radeon_device_init(struct radeon_device *rdev,
mutex_init(&rdev->gem.mutex);
mutex_init(&rdev->pm.mutex);
init_rwsem(&rdev->pm.mclk_lock);
+   init_rwsem(&rdev->exclusive_lock);
init_waitqueue_head(&rdev->irq.vblank_queue);
init_waitqueue_head(&rdev->irq.idle_queue);
r = radeon_gem_init(rdev);
@@ -988,6 +989,7 @@ int radeon_gpu_reset(struct radeon_device *rdev)
int r;
int resched;
 
+   down_write(&rdev->exclusive_lock);
radeon_save_bios_scratch_regs(rdev);
/* block TTM */
resched = ttm_bo_lock_delayed_workqueue(&rdev->mman.bdev);
@@ -1007,6 +1009,7 @@ int radeon_gpu_reset(struct radeon_device *rdev)
dev_info(rdev->dev, "GPU reset failed\n");
}
 
+   up_write(&rdev->exclusive_lock);
return r;
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index d9b0809..b0be9c4 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -215,12 +215,14 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
*data,
uint32_t handle;
int r;
 
+   down_read(&rdev->exclusive_lock);
/* create a gem object to contain this object in */
args->size = roundup(args->size, PAGE_SIZE);
r = radeon_gem_object_create(rdev, args->size, args->alignment,
args->initial_domain, false,
false, &gobj);
if (r) {
+   up_read(&rdev->exclusive_lock);
r = radeon_gem_handle_lockup(rdev, r);
return r;
}
@@ -228,10 +230,12 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
*data,
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
if (r) {
+   up_read(&rdev->exclusive_lock);
r = radeon_gem_handle_
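
(Cut off above.) The locking pattern in a nutshell (condensed from the hunks,
not a complete listing):

    /* every path that triggers GPU activity is a reader ... */
    down_read(&rdev->exclusive_lock);
    /* ... submit a CS, create GEM objects, ... */
    up_read(&rdev->exclusive_lock);

    /* ... while the reset path is the single writer */
    down_write(&rdev->exclusive_lock);
    /* ... reset and re-init the GPU ... */
    up_write(&rdev->exclusive_lock);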

[PATCH 01/15] drm/radeon: add error handling to fence_wait_empty_locked

2012-07-10 Thread Christian König
Instead of returning the error, handle it directly,
and while at it fix the comments about the ring lock.

Signed-off-by: Christian König 
Reviewed-by: Michel Dänzer 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon.h   |2 +-
 drivers/gpu/drm/radeon/radeon_fence.c |   33 +
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 77b4519b..5861ec8 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -239,7 +239,7 @@ void radeon_fence_process(struct radeon_device *rdev, int 
ring);
 bool radeon_fence_signaled(struct radeon_fence *fence);
 int radeon_fence_wait(struct radeon_fence *fence, bool interruptible);
 int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring);
-int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring);
+void radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring);
 int radeon_fence_wait_any(struct radeon_device *rdev,
  struct radeon_fence **fences,
  bool intr);
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c 
b/drivers/gpu/drm/radeon/radeon_fence.c
index 7b55625..be4e4f3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -440,14 +440,11 @@ int radeon_fence_wait_any(struct radeon_device *rdev,
return 0;
 }
 
+/* caller must hold ring lock */
 int radeon_fence_wait_next_locked(struct radeon_device *rdev, int ring)
 {
uint64_t seq;
 
-   /* We are not protected by ring lock when reading current seq but
-* it's ok as worst case is we return to early while we could have
-* wait.
-*/
seq = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL;
if (seq >= rdev->fence_drv[ring].sync_seq[ring]) {
/* nothing to wait for, last_seq is
@@ -457,15 +454,27 @@ int radeon_fence_wait_next_locked(struct radeon_device 
*rdev, int ring)
return radeon_fence_wait_seq(rdev, seq, ring, false, false);
 }
 
-int radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring)
+/* caller must hold ring lock */
+void radeon_fence_wait_empty_locked(struct radeon_device *rdev, int ring)
 {
-   /* We are not protected by ring lock when reading current seq
-* but it's ok as wait empty is call from place where no more
-* activity can be scheduled so there won't be concurrent access
-* to seq value.
-*/
-   return radeon_fence_wait_seq(rdev, rdev->fence_drv[ring].sync_seq[ring],
-ring, false, false);
+   uint64_t seq = rdev->fence_drv[ring].sync_seq[ring];
+
+   while(1) {
+   int r;
+   r = radeon_fence_wait_seq(rdev, seq, ring, false, false);
+   if (r == -EDEADLK) {
+   mutex_unlock(&rdev->ring_lock);
+   r = radeon_gpu_reset(rdev);
+   mutex_lock(&rdev->ring_lock);
+   if (!r)
+   continue;
+   }
+   if (r) {
+   dev_err(rdev->dev, "error waiting for ring to become"
+   " idle (%d)\n", r);
+   }
+   return;
+   }
 }
 
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence)
-- 
1.7.9.5



[PATCH 03/15] drm/radeon: fix fence related segfault in CS

2012-07-10 Thread Christian König
Don't return success if scheduling the IB fails, otherwise
we end up with an oops in ttm_eu_fence_buffer_objects.

Signed-off-by: Christian König 
Reviewed-by: Jerome Glisse 
Reviewed-by: Michel Dänzer 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_cs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index f1b7527..d5aec09 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -358,7 +358,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
-   return 0;
+   return r;
 }
 
 static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
-- 
1.7.9.5



[PATCH 02/15] drm/radeon: add error handling to radeon_vm_unbind_locked

2012-07-10 Thread Christian König
Waiting for a fence can fail for different reasons,
the most common being a deadlock.

Signed-off-by: Christian König 
Reviewed-by: Michel Dänzer 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon_gart.c |   17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index 2b34c1a..ee11c50 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -316,10 +316,21 @@ static void radeon_vm_unbind_locked(struct radeon_device 
*rdev,
}
 
/* wait for vm use to end */
-   if (vm->fence) {
-   radeon_fence_wait(vm->fence, false);
-   radeon_fence_unref(&vm->fence);
+   while (vm->fence) {
+   int r;
+   r = radeon_fence_wait(vm->fence, false);
+   if (r)
+   DRM_ERROR("error while waiting for fence: %d\n", r);
+   if (r == -EDEADLK) {
+   mutex_unlock(&rdev->vm_manager.lock);
+   r = radeon_gpu_reset(rdev);
+   mutex_lock(&rdev->vm_manager.lock);
+   if (!r)
+   continue;
+   }
+   break;
}
+   radeon_fence_unref(&vm->fence);
 
/* hw unbind */
rdev->vm_manager.funcs->unbind(rdev, vm);
-- 
1.7.9.5



[RFC PATCH 8/8] nouveau: Prime execbuffer submission synchronization

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_gem.c |  121 +++--
 1 file changed, 116 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 11c9c2a..e5d36bb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -31,6 +31,7 @@
 #include "nouveau_drm.h"
 #include "nouveau_dma.h"
 #include "nouveau_fence.h"
+#include 
 
 #define nouveau_gem_pushbuf_sync(chan) 0
 
@@ -277,6 +278,7 @@ struct validate_op {
struct list_head vram_list;
struct list_head gart_list;
struct list_head both_list;
+   struct list_head prime_list;
 };
 
 static void
@@ -305,9 +307,36 @@ validate_fini_list(struct list_head *list, struct 
nouveau_fence *fence)
 static void
 validate_fini(struct validate_op *op, struct nouveau_fence* fence)
 {
+   struct list_head *entry, *tmp;
+   struct nouveau_bo *nvbo;
+   struct dma_buf *sync_buf;
+   u32 ofs, val;
+
validate_fini_list(&op->vram_list, fence);
validate_fini_list(&op->gart_list, fence);
validate_fini_list(&op->both_list, fence);
+
+   if (list_empty(&op->prime_list))
+   return;
+
+   if (fence &&
+   !nouveau_fence_prime_get(fence, &sync_buf, &ofs, &val)) {
+   dmabufmgr_eu_fence_buffer_objects(sync_buf, ofs, val,
+ &op->prime_list);
+   dma_buf_put(sync_buf);
+   } else
+   dmabufmgr_eu_backoff_reservation(&op->prime_list);
+
+   list_for_each_safe(entry, tmp, &op->prime_list) {
+   struct dmabufmgr_validate *val;
+   val = list_entry(entry, struct dmabufmgr_validate, head);
+   nvbo = val->priv;
+
+   list_del(&val->head);
+   nvbo->reserved_by = NULL;
+   drm_gem_object_unreference_unlocked(nvbo->gem);
+   kfree(val);
+   }
 }
 
 static int
@@ -319,9 +348,9 @@ validate_init(struct nouveau_channel *chan, struct drm_file *file_priv,
struct drm_nouveau_private *dev_priv = dev->dev_private;
uint32_t sequence;
int trycnt = 0;
-   int ret, i;
+   int i;
 
-   sequence = atomic_add_return(1, &dev_priv->ttm.validate_sequence);
+   sequence = atomic_inc_return(&dev_priv->ttm.validate_sequence);
 retry:
if (++trycnt > 10) {
NV_ERROR(dev, "%s failed and gave up.\n", __func__);
@@ -332,6 +361,8 @@ retry:
struct drm_nouveau_gem_pushbuf_bo *b = &pbbo[i];
struct drm_gem_object *gem;
struct nouveau_bo *nvbo;
+   int ret = 0, is_prime;
+   struct dmabufmgr_validate *validate = NULL;
 
gem = drm_gem_object_lookup(dev, file_priv, b->handle);
if (!gem) {
@@ -340,6 +371,7 @@ retry:
return -ENOENT;
}
nvbo = gem->driver_private;
+   is_prime = gem->export_dma_buf || gem->import_attach;
 
if (nvbo->reserved_by && nvbo->reserved_by == file_priv) {
NV_ERROR(dev, "multiple instances of buffer %d on "
@@ -349,7 +381,21 @@ retry:
return -EINVAL;
}
 
-   ret = ttm_bo_reserve(&nvbo->bo, true, false, true, sequence);
+   if (likely(!is_prime))
+   ret = ttm_bo_reserve(&nvbo->bo, true, false,
+true, sequence);
+   else {
+   validate = kzalloc(sizeof(*validate), GFP_KERNEL);
+   if (validate) {
+   if (gem->import_attach)
+   validate->bo =
+   gem->import_attach->dmabuf;
+   else
+   validate->bo = gem->export_dma_buf;
+   validate->priv = nvbo;
+   } else
+   ret = -ENOMEM;
+   }
if (ret) {
validate_fini(op, NULL);
if (unlikely(ret == -EAGAIN))
@@ -366,6 +412,9 @@ retry:
b->user_priv = (uint64_t)(unsigned long)nvbo;
nvbo->reserved_by = file_priv;
nvbo->pbbo_index = i;
+   if (is_prime) {
+   list_add_tail(&validate->head, &op->prime_list);
+   } else
if ((b->valid_domains & NOUVEAU_GEM_DOMAIN_VRAM) &&
(b->valid_domains & NOUVEAU_GEM_DOMAIN_GART))
list_add_tail(&nvbo->entry, &op->both_list);
@@ -473,6 +522,60 @@ validate_list(struct nouveau_channel *chan, struct list_head *list,
 }
 
 static int
+validate_prime(struct nouveau_chann

[RFC PATCH 7/8] nouveau: nvc0 fence prime implementation

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Create a read-only mapping for every imported bo, and create a prime
bo in system memory.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nvc0_fence.c |  104 +-
 1 file changed, 89 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvc0_fence.c b/drivers/gpu/drm/nouveau/nvc0_fence.c
index 198e31f..dc6ccab 100644
--- a/drivers/gpu/drm/nouveau/nvc0_fence.c
+++ b/drivers/gpu/drm/nouveau/nvc0_fence.c
@@ -37,6 +37,7 @@ struct nvc0_fence_priv {
 struct nvc0_fence_chan {
struct nouveau_fence_chan base;
struct nouveau_vma vma;
+   struct nouveau_vma prime_vma;
 };
 
 static int
@@ -45,19 +46,23 @@ nvc0_fence_emit(struct nouveau_fence *fence, bool prime)
struct nouveau_channel *chan = fence->channel;
struct nvc0_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
u64 addr = fctx->vma.offset + chan->id * 16;
-   int ret;
+   int ret, i;
 
-   ret = RING_SPACE(chan, 5);
-   if (ret == 0) {
+   ret = RING_SPACE(chan, prime ? 10 : 5);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < (prime ? 2 : 1); ++i) {
+   if (i)
+   addr = fctx->prime_vma.offset + chan->id * 16;
BEGIN_NVC0(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
OUT_RING  (chan, upper_32_bits(addr));
OUT_RING  (chan, lower_32_bits(addr));
OUT_RING  (chan, fence->sequence);
OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_WRITE_LONG);
-   FIRE_RING (chan);
}
-
-   return ret;
+   FIRE_RING(chan);
+   return 0;
 }
 
 static int
@@ -95,6 +100,8 @@ nvc0_fence_context_del(struct nouveau_channel *chan, int engine)
struct nvc0_fence_priv *priv = nv_engine(chan->dev, engine);
struct nvc0_fence_chan *fctx = chan->engctx[engine];
 
+   if (priv->base.prime_bo)
+   nouveau_bo_vma_del(priv->base.prime_bo, &fctx->prime_vma);
nouveau_bo_vma_del(priv->bo, &fctx->vma);
nouveau_fence_context_del(chan->dev, &fctx->base);
chan->engctx[engine] = NULL;
@@ -115,10 +122,16 @@ nvc0_fence_context_new(struct nouveau_channel *chan, int engine)
nouveau_fence_context_new(&fctx->base);
 
ret = nouveau_bo_vma_add(priv->bo, chan->vm, &fctx->vma);
+   if (!ret && priv->base.prime_bo)
+   ret = nouveau_bo_vma_add(priv->base.prime_bo, chan->vm,
+&fctx->prime_vma);
if (ret)
nvc0_fence_context_del(chan, engine);
 
-   nouveau_bo_wr32(priv->bo, chan->id * 16/4, 0x);
+   fctx->base.sequence = nouveau_bo_rd32(priv->bo, chan->id * 16/4);
+   if (priv->base.prime_bo)
+   nouveau_bo_wr32(priv->base.prime_bo, chan->id * 16/4,
+   fctx->base.sequence);
return ret;
 }
 
@@ -140,12 +153,55 @@ nvc0_fence_destroy(struct drm_device *dev, int engine)
struct drm_nouveau_private *dev_priv = dev->dev_private;
struct nvc0_fence_priv *priv = nv_engine(dev, engine);
 
+   nouveau_fence_prime_del(&priv->base);
nouveau_bo_unmap(priv->bo);
+   nouveau_bo_unpin(priv->bo);
nouveau_bo_ref(NULL, &priv->bo);
dev_priv->eng[engine] = NULL;
kfree(priv);
 }
 
+static int
+nvc0_fence_prime_sync(struct nouveau_channel *chan,
+ struct nouveau_bo *bo,
+ u32 ofs, u32 val, u64 sema_start)
+{
+   struct nvc0_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
+   struct nvc0_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
+   int ret = RING_SPACE(chan, 5);
+   if (ret)
+   return ret;
+
+   if (bo == priv->base.prime_bo)
+   sema_start = fctx->prime_vma.offset;
+   else
+   NV_ERROR(chan->dev, "syncing with %08Lx + %08x >= %08x\n",
+   sema_start, ofs, val);
+   sema_start += ofs;
+
+   BEGIN_NVC0(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
+   OUT_RING  (chan, upper_32_bits(sema_start));
+   OUT_RING  (chan, lower_32_bits(sema_start));
+   OUT_RING  (chan, val);
+   OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_ACQUIRE_GEQUAL |
+NVC0_SUBCHAN_SEMAPHORE_TRIGGER_YIELD);
+   FIRE_RING (chan);
+   return ret;
+}
+
+static void
+nvc0_fence_prime_del_import(struct nouveau_fence_prime_bo_entry *entry) {
+   nouveau_bo_vma_del(entry->bo, &entry->vma);
+}
+
+static int
+nvc0_fence_prime_add_import(struct nouveau_fence_prime_bo_entry *entry) {
+   int ret = nouveau_bo_vma_add_access(entry->bo, entry->chan->vm,
+   &entry->vma, NV_MEM_ACCESS_RO);
+   entry->sema_start = entry->vma.offset;
+   return ret;
+}
+
 int
 nvc0_fence_create(struct drm_device *dev)
 {
@@ -168,17 +224,35 @@ nvc0_fence_create

[RFC PATCH 6/8] nouveau: nv84 fence prime implementation

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Create a dma object for the prime semaphore and every imported sync bo.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nv84_fence.c |  121 --
 1 file changed, 115 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c
index b5cfbcb..f739dfc 100644
--- a/drivers/gpu/drm/nouveau/nv84_fence.c
+++ b/drivers/gpu/drm/nouveau/nv84_fence.c
@@ -31,6 +31,7 @@
 
 struct nv84_fence_chan {
struct nouveau_fence_chan base;
+   u32 sema_start;
 };
 
 struct nv84_fence_priv {
@@ -42,21 +43,25 @@ static int
 nv84_fence_emit(struct nouveau_fence *fence, bool prime)
 {
struct nouveau_channel *chan = fence->channel;
-   int ret = RING_SPACE(chan, 7);
-   if (ret == 0) {
+   int i, ret;
+
+   ret = RING_SPACE(chan, prime ? 14 : 7);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < (prime ? 2 : 1); ++i) {
BEGIN_NV04(chan, 0, NV11_SUBCHAN_DMA_SEMAPHORE, 1);
-   OUT_RING  (chan, NvSema);
+   OUT_RING  (chan, i ? NvSemaPrime : NvSema);
BEGIN_NV04(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
OUT_RING  (chan, upper_32_bits(chan->id * 16));
OUT_RING  (chan, lower_32_bits(chan->id * 16));
OUT_RING  (chan, fence->sequence);
OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_WRITE_LONG);
-   FIRE_RING (chan);
}
+   FIRE_RING (chan);
return ret;
 }
 
-
 static int
 nv84_fence_sync(struct nouveau_fence *fence,
struct nouveau_channel *prev, struct nouveau_channel *chan)
@@ -82,12 +87,94 @@ nv84_fence_read(struct nouveau_channel *chan)
return nv_ro32(priv->mem, chan->id * 16);
 }
 
+static int
+nv84_fence_prime_sync(struct nouveau_channel *chan,
+ struct nouveau_bo *bo,
+ u32 ofs, u32 val, u64 sema_start)
+{
+   struct nv84_fence_priv *priv = nv_engine(chan->dev, NVOBJ_ENGINE_FENCE);
+   int ret = RING_SPACE(chan, 7);
+   u32 sema = 0;
+   if (ret < 0)
+   return ret;
+
+   if (bo == priv->base.prime_bo) {
+   sema = NvSema;
+   } else {
+   struct sg_table *sgt = bo->bo.sg;
+   struct scatterlist *sg;
+   u32 i;
+   sema = sema_start;
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   if (ofs < sg->offset + sg->length) {
+   ofs -= sg->offset;
+   break;
+   }
+   sema++;
+   }
+   }
+
+   BEGIN_NV04(chan, 0, NV11_SUBCHAN_DMA_SEMAPHORE, 1);
+   OUT_RING  (chan, sema);
+   BEGIN_NV04(chan, 0, NV84_SUBCHAN_SEMAPHORE_ADDRESS_HIGH, 4);
+   OUT_RING  (chan, 0);
+   OUT_RING  (chan, ofs);
+   OUT_RING  (chan, val);
+   OUT_RING  (chan, NV84_SUBCHAN_SEMAPHORE_TRIGGER_ACQUIRE_GEQUAL);
+   FIRE_RING (chan);
+   return ret;
+}
+
+static void
+nv84_fence_prime_del_import(struct nouveau_fence_prime_bo_entry *entry) {
+   u32 i;
+   for (i = entry->sema_start; i < entry->sema_start + entry->sema_len; ++i)
+   nouveau_ramht_remove(entry->chan, i);
+}
+
+static int
+nv84_fence_prime_add_import(struct nouveau_fence_prime_bo_entry *entry) {
+   struct sg_table *sgt = entry->bo->bo.sg;
+   struct nouveau_channel *chan = entry->chan;
+   struct nv84_fence_chan *fctx = chan->engctx[NVOBJ_ENGINE_FENCE];
+   struct scatterlist *sg;
+   u32 i, sema;
+   int ret;
+
+   sema = entry->sema_start = fctx->sema_start;
+   entry->sema_len = 0;
+
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   struct nouveau_gpuobj *obj;
+   ret = nouveau_gpuobj_dma_new(chan, NV_CLASS_DMA_FROM_MEMORY,
+sg_dma_address(sg), PAGE_SIZE,
+NV_MEM_ACCESS_RO,
+NV_MEM_TARGET_PCI, &obj);
+   if (ret)
+   goto err;
+
+   ret = nouveau_ramht_insert(chan, sema, obj);
+   nouveau_gpuobj_ref(NULL, &obj);
+   if (ret)
+   goto err;
+   entry->sema_len++;
+   sema++;
+   }
+   fctx->sema_start += (entry->sema_len + 0xff) & ~0xff;
+   return 0;
+
+err:
+   nv84_fence_prime_del_import(entry);
+   return ret;
+}
+
 static void
 nv84_fence_context_del(struct nouveau_channel *chan, int engine)
 {
struct nv84_fence_chan *fctx = chan->engctx[engine];
nouveau_fence_context_del(chan->dev, &fctx->base);
chan->engctx[engine] = NULL;
+
kfree(fctx);
 }
 
@@ -104,6 +191,7 @@ nv84_fence_context_new(struct nouveau_channel *chan, int engine)
return -EN

[RFC PATCH 5/8] nouveau: Add methods preparing for prime fencing

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

This can be used by nv84 and nvc0 to implement hardware fencing;
earlier systems will require more thought, but can fall back to
software for now.

Signed-off-by: Maarten Lankhorst 

---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |6 +-
 drivers/gpu/drm/nouveau/nouveau_channel.c |2 +-
 drivers/gpu/drm/nouveau/nouveau_display.c |2 +-
 drivers/gpu/drm/nouveau/nouveau_dma.h |1 +
 drivers/gpu/drm/nouveau/nouveau_drv.h |5 +
 drivers/gpu/drm/nouveau/nouveau_fence.c   |  242 -
 drivers/gpu/drm/nouveau/nouveau_fence.h   |   44 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c |6 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c   |2 +
 drivers/gpu/drm/nouveau/nv04_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nv10_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nv84_fence.c  |4 +-
 drivers/gpu/drm/nouveau/nvc0_fence.c  |4 +-
 13 files changed, 304 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 4318320..a97025a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -52,6 +52,9 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
DRM_ERROR("bo %p still attached to GEM object\n", bo);
 
nv10_mem_put_tile_region(dev, nvbo->tile, NULL);
+
+   if (nvbo->fence_import_attach)
+   nouveau_fence_prime_del_bo(nvbo);
kfree(nvbo);
 }
 
@@ -109,6 +112,7 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
INIT_LIST_HEAD(&nvbo->head);
INIT_LIST_HEAD(&nvbo->entry);
INIT_LIST_HEAD(&nvbo->vma_list);
+   INIT_LIST_HEAD(&nvbo->prime_chan_entries);
nvbo->tile_mode = tile_mode;
nvbo->tile_flags = tile_flags;
nvbo->bo.bdev = &dev_priv->ttm.bdev;
@@ -480,7 +484,7 @@ nouveau_bo_move_accel_cleanup(struct nouveau_channel *chan,
struct nouveau_fence *fence = NULL;
int ret;
 
-   ret = nouveau_fence_new(chan, &fence);
+   ret = nouveau_fence_new(chan, &fence, false);
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_channel.c b/drivers/gpu/drm/nouveau/nouveau_channel.c
index 629d8a2..85a8556 100644
--- a/drivers/gpu/drm/nouveau/nouveau_channel.c
+++ b/drivers/gpu/drm/nouveau/nouveau_channel.c
@@ -362,7 +362,7 @@ nouveau_channel_idle(struct nouveau_channel *chan)
struct nouveau_fence *fence = NULL;
int ret;
 
-   ret = nouveau_fence_new(chan, &fence);
+   ret = nouveau_fence_new(chan, &fence, false);
if (!ret) {
ret = nouveau_fence_wait(fence, false, false);
nouveau_fence_unref(&fence);
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index 69688ef..7c76776 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -466,7 +466,7 @@ nouveau_page_flip_emit(struct nouveau_channel *chan,
}
FIRE_RING (chan);
 
-   ret = nouveau_fence_new(chan, pfence);
+   ret = nouveau_fence_new(chan, pfence, false);
if (ret)
goto fail;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_dma.h b/drivers/gpu/drm/nouveau/nouveau_dma.h
index 8db68be..d02ffd3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dma.h
+++ b/drivers/gpu/drm/nouveau/nouveau_dma.h
@@ -74,6 +74,7 @@ enum {
NvEvoSema0  = 0x8010,
NvEvoSema1  = 0x8011,
NvNotify1   = 0x8012,
+   NvSemaPrime = 0x801f,
 
/* G80+ display objects */
NvEvoVRAM   = 0x0100,
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 2c17989..ad49594 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -126,6 +126,11 @@ struct nouveau_bo {
 
struct ttm_bo_kmap_obj dma_buf_vmap;
int vmapping_count;
+
+   /* fence related stuff */
+   struct nouveau_bo *sync_bo;
+   struct list_head prime_chan_entries;
+   struct dma_buf_attachment *fence_import_attach;
 };
 
 #define nouveau_bo_tile_layout(nvbo)   \
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 3c18049..d4c9c40 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -29,17 +29,64 @@
 
 #include 
 #include 
+#include 
 
 #include "nouveau_drv.h"
 #include "nouveau_ramht.h"
 #include "nouveau_fence.h"
 #include "nouveau_software.h"
 #include "nouveau_dma.h"
+#include "nouveau_fifo.h"
+
+int nouveau_fence_prime_init(struct drm_device *dev,
+struct nouveau_fence_priv *priv, u32 align)
+{
+   int ret = 0;
+#ifdef CONFIG_DMA_SHARED_BUFFER
+   struct nouveau_fifo_priv *pfifo = nv_engine(dev, NVOBJ_ENGINE_FIFO);
+   u32 size = PAGE_AL

[RFC PATCH 4/8] nouveau: add nouveau_bo_vma_add_access

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

This is needed to allow creation of read-only vm mappings
in fence objects.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |6 +++---
 drivers/gpu/drm/nouveau/nouveau_drv.h |6 --
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 7f80ed5..4318320 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1443,15 +1443,15 @@ nouveau_bo_vma_find(struct nouveau_bo *nvbo, struct nouveau_vm *vm)
 }
 
 int
-nouveau_bo_vma_add(struct nouveau_bo *nvbo, struct nouveau_vm *vm,
-  struct nouveau_vma *vma)
+nouveau_bo_vma_add_access(struct nouveau_bo *nvbo, struct nouveau_vm *vm,
+ struct nouveau_vma *vma, u32 access)
 {
const u32 size = nvbo->bo.mem.num_pages << PAGE_SHIFT;
struct nouveau_mem *node = nvbo->bo.mem.mm_node;
int ret;
 
ret = nouveau_vm_get(vm, size, nvbo->page_shift,
-NV_MEM_ACCESS_RW, vma);
+access, vma);
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 7c52eba..2c17989 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -1350,8 +1350,10 @@ extern int nouveau_bo_validate(struct nouveau_bo *, bool interruptible,
 
 extern struct nouveau_vma *
 nouveau_bo_vma_find(struct nouveau_bo *, struct nouveau_vm *);
-extern int  nouveau_bo_vma_add(struct nouveau_bo *, struct nouveau_vm *,
-  struct nouveau_vma *);
+#define nouveau_bo_vma_add(nvbo, vm, vma) \
+   nouveau_bo_vma_add_access((nvbo), (vm), (vma), NV_MEM_ACCESS_RW)
+extern int nouveau_bo_vma_add_access(struct nouveau_bo *, struct nouveau_vm *,
+struct nouveau_vma *, u32 access);
 extern void nouveau_bo_vma_del(struct nouveau_bo *, struct nouveau_vma *);
 
 /* nouveau_gem.c */
-- 
1.7.9.5

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[RFC PATCH 3/8] nouveau: Extend prime code

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

The prime code no longer requires the bo to be backed by a gem object,
and cpu access calls have been implemented. This will be needed for
exporting fence bos.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_drv.h   |6 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c |  106 +--
 2 files changed, 79 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 8613cb2..7c52eba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -1374,11 +1374,15 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device *, void *,
 extern int nouveau_gem_ioctl_info(struct drm_device *, void *,
  struct drm_file *);
 
+extern int nouveau_gem_prime_export_bo(struct nouveau_bo *nvbo, int flags,
+  u32 size, struct dma_buf **ret);
 extern struct dma_buf *nouveau_gem_prime_export(struct drm_device *dev,
struct drm_gem_object *obj, int flags);
 extern struct drm_gem_object *nouveau_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf);
-
+extern int nouveau_prime_import_bo(struct drm_device *dev,
+  struct dma_buf *dma_buf,
+  struct nouveau_bo **pnvbo, bool gem);
 /* nouveau_display.c */
 int nouveau_display_create(struct drm_device *dev);
 void nouveau_display_destroy(struct drm_device *dev);
diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index a25cf2c..537154d3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -35,7 +35,8 @@ static struct sg_table *nouveau_gem_map_dma_buf(struct dma_buf_attachment *attachment,
  enum dma_data_direction dir)
 {
struct nouveau_bo *nvbo = attachment->dmabuf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;
int npages = nvbo->bo.num_pages;
struct sg_table *sg;
int nents;
@@ -59,29 +60,37 @@ static void nouveau_gem_dmabuf_release(struct dma_buf *dma_buf)
 {
struct nouveau_bo *nvbo = dma_buf->priv;
 
-   if (nvbo->gem->export_dma_buf == dma_buf) {
-   nvbo->gem->export_dma_buf = NULL;
+   nouveau_bo_unpin(nvbo);
+   if (!nvbo->gem)
+   nouveau_bo_ref(NULL, &nvbo);
+   else {
+   if (nvbo->gem->export_dma_buf == dma_buf)
+   nvbo->gem->export_dma_buf = NULL;
drm_gem_object_unreference_unlocked(nvbo->gem);
}
 }
 
static void *nouveau_gem_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num)
 {
-   return NULL;
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kmap_atomic(nvbo->bo.ttm->pages[page_num]);
 }
 
static void nouveau_gem_kunmap_atomic(struct dma_buf *dma_buf, unsigned long page_num, void *addr)
 {
-
+   kunmap_atomic(addr);
 }
+
 static void *nouveau_gem_kmap(struct dma_buf *dma_buf, unsigned long page_num)
 {
-   return NULL;
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kmap(nvbo->bo.ttm->pages[page_num]);
 }
 
static void nouveau_gem_kunmap(struct dma_buf *dma_buf, unsigned long page_num, void *addr)
 {
-
+   struct nouveau_bo *nvbo = dma_buf->priv;
+   return kunmap(nvbo->bo.ttm->pages[page_num]);
 }
 
static int nouveau_gem_prime_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
@@ -92,7 +101,8 @@ static int nouveau_gem_prime_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
 static void *nouveau_gem_prime_vmap(struct dma_buf *dma_buf)
 {
struct nouveau_bo *nvbo = dma_buf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;
int ret;
 
mutex_lock(&dev->struct_mutex);
@@ -116,7 +126,8 @@ out_unlock:
 static void nouveau_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr)
 {
struct nouveau_bo *nvbo = dma_buf->priv;
-   struct drm_device *dev = nvbo->gem->dev;
+   struct drm_nouveau_private *dev_priv = nouveau_bdev(nvbo->bo.bdev);
+   struct drm_device *dev = dev_priv->dev;
 
mutex_lock(&dev->struct_mutex);
nvbo->vmapping_count--;
@@ -140,10 +151,9 @@ static const struct dma_buf_ops nouveau_dmabuf_ops =  {
 };
 
 static int
-nouveau_prime_new(struct drm_device *dev,
- size_t size,
+nouveau_prime_new(struct drm_device *dev, size_t size,
  struct sg_table *sg,
- struct nouveau_bo **pnvbo)
+ struct nouveau_bo **pnvbo, bool gem)
 {
struct nouveau_bo *nvbo;
u32 flag

[RFC PATCH 2/8] prime wip: i915

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Export the hardware status page so others can read the seqno.

Signed-off-by: Maarten Lankhorst 

---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |   29 --
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   87 
 drivers/gpu/drm/i915/intel_ringbuffer.c|   42 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h|3 +
 4 files changed, 145 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index aa308e1..d6bcfdc 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -66,12 +66,25 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
 static void i915_gem_dmabuf_release(struct dma_buf *dma_buf)
 {
struct drm_i915_gem_object *obj = dma_buf->priv;
+   struct drm_device *dev = obj->base.dev;
+
+   mutex_lock(&dev->struct_mutex);
 
if (obj->base.export_dma_buf == dma_buf) {
-   /* drop the reference on the export fd holds */
obj->base.export_dma_buf = NULL;
-   drm_gem_object_unreference_unlocked(&obj->base);
+   } else {
+   drm_i915_private_t *dev_priv = dev->dev_private;
+   struct intel_ring_buffer *ring;
+   int i;
+
+   for_each_ring(ring, dev_priv, i)
+   WARN_ON(ring->sync_buf == dma_buf);
}
+
+   /* drop the reference on the export fd holds */
+   drm_gem_object_unreference(&obj->base);
+
+   mutex_unlock(&dev->struct_mutex);
 }
 
 static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
@@ -129,21 +142,25 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 
static void *i915_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num)
 {
-   return NULL;
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   return kmap_atomic(obj->pages[page_num]);
 }
 
static void i915_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf, unsigned long page_num, void *addr)
 {
-
+   kunmap_atomic(addr);
 }
+
static void *i915_gem_dmabuf_kmap(struct dma_buf *dma_buf, unsigned long page_num)
 {
-   return NULL;
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   return kmap(obj->pages[page_num]);
 }
 
static void i915_gem_dmabuf_kunmap(struct dma_buf *dma_buf, unsigned long page_num, void *addr)
 {
-
+   struct drm_i915_gem_object *obj = dma_buf->priv;
+   kunmap(obj->pages[page_num]);
 }
 
static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 88e2e11..245340e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -33,6 +33,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include 
+#include 
 
 struct change_domains {
uint32_t invalidate_domains;
@@ -556,7 +557,8 @@ err_unpin:
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
struct drm_file *file,
-   struct list_head *objects)
+   struct list_head *objects,
+   struct list_head *prime_val)
 {
drm_i915_private_t *dev_priv = ring->dev->dev_private;
struct drm_i915_gem_object *obj;
@@ -564,6 +566,31 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
bool has_fenced_gpu_access = INTEL_INFO(ring->dev)->gen < 4;
struct list_head ordered_objects;
 
+   list_for_each_entry(obj, objects, exec_list) {
+   struct dmabufmgr_validate *val;
+
+   if (!(obj->base.import_attach ||
+ obj->base.export_dma_buf))
+   continue;
+
+   val = kzalloc(sizeof(*val), GFP_KERNEL);
+   if (!val)
+   return -ENOMEM;
+
+   if (obj->base.export_dma_buf)
+   val->bo = obj->base.export_dma_buf;
+   else
+   val->bo = obj->base.import_attach->dmabuf;
+   val->priv = obj;
+   list_add_tail(&val->head, prime_val);
+   }
+
+   if (!list_empty(prime_val)) {
+   ret = dmabufmgr_eu_reserve_buffers(prime_val);
+   if (ret)
+   return ret;
+   }
+
INIT_LIST_HEAD(&ordered_objects);
while (!list_empty(objects)) {
struct drm_i915_gem_exec_object2 *entry;
@@ -712,6 +739,7 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
  struct drm_file *file,
  struct intel_ring_buffer *ring,
  struct list_head *objects,
+ struct list_head *prime_val,
  struct eb_o

[RFC PATCH 1/8] dma-buf-mgr: Try 2

2012-07-10 Thread Maarten Lankhorst
From: Maarten Lankhorst 

Core code based on ttm_bo and ttm_execbuf_util

Signed-off-by: Maarten Lankhorst 

---
 drivers/base/Makefile |2 +-
 drivers/base/dma-buf-mgr-eu.c |  263 +
 drivers/base/dma-buf-mgr.c|  149 +++
 drivers/base/dma-buf.c|4 +
 include/linux/dma-buf-mgr.h   |  150 +++
 include/linux/dma-buf.h   |   24 
 6 files changed, 591 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-buf-mgr-eu.c
 create mode 100644 drivers/base/dma-buf-mgr.c
 create mode 100644 include/linux/dma-buf-mgr.h

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 5aa2d70..86e7598 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y  += power/
 obj-$(CONFIG_HAS_DMA)  += dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-buf-mgr.o dma-buf-mgr-eu.o
 obj-$(CONFIG_ISA)  += isa.o
 obj-$(CONFIG_FW_LOADER)+= firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
diff --git a/drivers/base/dma-buf-mgr-eu.c b/drivers/base/dma-buf-mgr-eu.c
new file mode 100644
index 0000000..ed5e01c
--- /dev/null
+++ b/drivers/base/dma-buf-mgr-eu.c
@@ -0,0 +1,263 @@
+/*
+ * Copyright (C) 2012 Canonical Ltd
+ *
+ * Based on ttm_bo.c which bears the following copyright notice,
+ * but is dual licensed:
+ *
+ * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+#include 
+#include 
+#include 
+
+static void dmabufmgr_eu_backoff_reservation_locked(struct list_head *list)
+{
+   struct dmabufmgr_validate *entry;
+
+   list_for_each_entry(entry, list, head) {
+   struct dma_buf *bo = entry->bo;
+   if (!entry->reserved)
+   continue;
+   entry->reserved = false;
+
+   bo->sync_buf = entry->sync_buf;
+   entry->sync_buf = NULL;
+
+   atomic_set(&bo->reserved, 0);
+   wake_up_all(&bo->event_queue);
+   }
+}
+
+static int
+dmabufmgr_eu_wait_unreserved_locked(struct list_head *list,
+   struct dma_buf *bo)
+{
+   int ret;
+
+   spin_unlock(&dmabufmgr.lru_lock);
+   ret = dmabufmgr_bo_wait_unreserved(bo, true);
+   spin_lock(&dmabufmgr.lru_lock);
+   if (unlikely(ret != 0))
+   dmabufmgr_eu_backoff_reservation_locked(list);
+   return ret;
+}
+
+void
+dmabufmgr_eu_backoff_reservation(struct list_head *list)
+{
+   if (list_empty(list))
+   return;
+
+   spin_lock(&dmabufmgr.lru_lock);
+   dmabufmgr_eu_backoff_reservation_locked(list);
+   spin_unlock(&dmabufmgr.lru_lock);
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_eu_backoff_reservation);
+
+int
+dmabufmgr_eu_reserve_buffers(struct list_head *list)
+{
+   struct dmabufmgr_validate *entry;
+   int ret;
+   u32 val_seq;
+
+   if (list_empty(list))
+   return 0;
+
+   list_for_each_entry(entry, list, head) {
+   entry->reserved = false;
+   entry->sync_buf = NULL;
+   }
+
+retry:
+   spin_lock(&dmabufmgr.lru_lock);
+   val_seq = dmabufmgr.counter++;
+
+   list_for_each_entry(entry, list, head) {
+   struct dma_buf *bo = entry->bo;
+
+retry_this_bo:
+   ret = dmabufmgr_bo_reserve_locked(bo, true, true, true, val_seq);
+   switch (ret) {
+   case 0:
+   break;
+   case -EBUSY:
+   ret = dmabufmgr_eu_wait_unreserved_locked(list, bo);
+ 

[RFC PATCH 0/8] Dmabuf synchronization

2012-07-10 Thread Maarten Lankhorst
This patch series implements my attempt at dmabuf synchronization.
The core idea is that a lot of devices will have their own
methods of synchronization, but more complicated devices
allow some way of fencing, so why not export those fences
as dma-bufs?

This patchset implements dmabufmgr, which is based on ttm's code.
The ttm code deals with a lot more than just reservation, however,
so I took out almost all the code not dealing with reservations.
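
To make the flow concrete, here is a minimal driver-side sketch of the
intended usage, written against the dmabufmgr_eu API as posted in patch
1/8; submit_with_prime_sync() is a made-up name, the command emission is
left as a comment, and the sync point (sync_buf, ofs, val) would come
from the driver's own fence code:

#include <linux/dma-buf.h>
#include <linux/dma-buf-mgr.h>
#include <linux/list.h>
#include <linux/slab.h>

/* Reserve a set of shared dma-bufs, emit the hardware commands, then
 * publish the new sync point (sync_buf, ofs, val) on all of them, or
 * back off if submission failed. */
static int submit_with_prime_sync(struct dma_buf **bufs, int n,
                                  struct dma_buf *sync_buf, u32 ofs, u32 val)
{
        struct dmabufmgr_validate *v, *tmp;
        LIST_HEAD(prime_list);
        int i, ret = 0;

        for (i = 0; i < n; i++) {
                v = kzalloc(sizeof(*v), GFP_KERNEL);
                if (!v) {
                        ret = -ENOMEM;
                        goto out_free;
                }
                v->bo = bufs[i];        /* shared dma-buf to reserve */
                list_add_tail(&v->head, &prime_list);
        }

        /* take the reservations on all listed buffers at once */
        ret = dmabufmgr_eu_reserve_buffers(&prime_list);
        if (ret)
                goto out_free;

        /* ... validate the buffers and emit the hardware commands
         * here, setting ret on failure ... */

        if (ret)
                dmabufmgr_eu_backoff_reservation(&prime_list);
        else
                dmabufmgr_eu_fence_buffer_objects(sync_buf, ofs, val,
                                                  &prime_list);

out_free:
        list_for_each_entry_safe(v, tmp, &prime_list, head) {
                list_del(&v->head);
                kfree(v);
        }
        return ret;
}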

I used the drm-intel-next-queued tree as base. It contains some i915
flushing changes. I would rather use linux-next, but the deferred
fput code makes my system unbootable. That is unfortunate, since
deferred fput would reduce the deadlocks that happen in dma_buf_put
when two devices release each other's dmabuf.

The i915 changes implement a simple cpu wait only; the nouveau code
imports the sync dmabuf read-only, maps it to the affected channels,
and then performs a wait on it in hardware. Since the hardware may
still be processing other commands, it may turn out that no hardware
wait has to be performed at all.

Only the nouveau nv84 code is tested, but the nvc0 code should work
as well.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/2] drm: Add colouring to the range allocator

2012-07-10 Thread Chris Wilson
In order to support snoopable memory on non-LLC architectures (so that
we can bind vgem objects into the i915 GATT for example), we have to
prevent the prefetcher on the GPU from crossing memory domains, and so
must prevent allocation of a snoopable PTE immediately following an
uncached PTE. To do that, we need to extend the range allocator with
support for tracking and segregating different node colours.

This will be used by i915 to segregate memory domains within the GTT.
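
To illustrate the hook (a sketch only, not the actual i915 callback;
the function name and the one-page guard policy are made up), a driver
could keep differently-coloured nodes from abutting like this:

#include <drm/drm_mm.h>
#include <linux/mm.h>

/* Shrink the usable range of a hole so that a node of another colour
 * never directly borders the new allocation. hole_node is the node
 * that precedes the hole being considered. */
static void guard_page_color_adjust(struct drm_mm_node *hole_node,
                                    unsigned long color,
                                    unsigned long *start, unsigned long *end)
{
        struct drm_mm_node *next;

        /* Different colour in front of the hole: pad the start. */
        if (hole_node->allocated && hole_node->color != color)
                *start += PAGE_SIZE;

        /* Different colour behind the hole: pad the end. */
        next = list_entry(hole_node->node_list.next,
                          struct drm_mm_node, node_list);
        if (next->allocated && next->color != color)
                *end -= PAGE_SIZE;
}

A driver installs such a callback with mm->color_adjust; the allocator
then applies it to every candidate hole before fitting a node, as the
drm_mm_insert_helper() change below shows.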

v2: Now with more drm_mm helpers and less driver interference.

Signed-off-by: Chris Wilson 
Cc: Dave Airlie 
Cc: Ben Skeggs 
Cc: Jerome Glisse 
Cc: Alex Deucher 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/drm_gem.c |2 +-
 drivers/gpu/drm/drm_mm.c  |  169 -
 drivers/gpu/drm/i915/i915_gem.c   |6 +-
 drivers/gpu/drm/i915/i915_gem_evict.c |9 +-
 include/drm/drm_mm.h  |   93 +++---
 5 files changed, 191 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index d58e69d..fbe0842 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -354,7 +354,7 @@ drm_gem_create_mmap_offset(struct drm_gem_object *obj)
 
/* Get a DRM GEM mmap offset allocated... */
list->file_offset_node = drm_mm_search_free(&mm->offset_manager,
-   obj->size / PAGE_SIZE, 0, 0);
+   obj->size / PAGE_SIZE, 0, false);
 
if (!list->file_offset_node) {
DRM_ERROR("failed to allocate offset for bo %d\n", obj->name);
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 961fb54..9bb82f7 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -118,45 +118,53 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
 
 static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 struct drm_mm_node *node,
-unsigned long size, unsigned alignment)
+unsigned long size, unsigned alignment,
+unsigned long color)
 {
struct drm_mm *mm = hole_node->mm;
-   unsigned long tmp = 0, wasted = 0;
unsigned long hole_start = drm_mm_hole_node_start(hole_node);
unsigned long hole_end = drm_mm_hole_node_end(hole_node);
+   unsigned long adj_start = hole_start;
+   unsigned long adj_end = hole_end;
 
BUG_ON(!hole_node->hole_follows || node->allocated);
 
-   if (alignment)
-   tmp = hole_start % alignment;
+   if (mm->color_adjust)
+   mm->color_adjust(hole_node, color, &adj_start, &adj_end);
 
-   if (!tmp) {
+   if (alignment) {
+   unsigned tmp = adj_start % alignment;
+   if (tmp)
+   adj_start += alignment - tmp;
+   }
+
+   if (adj_start == hole_start) {
hole_node->hole_follows = 0;
-   list_del_init(&hole_node->hole_stack);
-   } else
-   wasted = alignment - tmp;
+   list_del(&hole_node->hole_stack);
+   }
 
-   node->start = hole_start + wasted;
+   node->start = adj_start;
node->size = size;
node->mm = mm;
+   node->color = color;
node->allocated = 1;
 
INIT_LIST_HEAD(&node->hole_stack);
list_add(&node->node_list, &hole_node->node_list);
 
-   BUG_ON(node->start + node->size > hole_end);
+   BUG_ON(node->start + node->size > adj_end);
 
+   node->hole_follows = 0;
if (node->start + node->size < hole_end) {
list_add(&node->hole_stack, &mm->hole_stack);
node->hole_follows = 1;
-   } else {
-   node->hole_follows = 0;
}
 }
 
 struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
 unsigned long size,
 unsigned alignment,
+unsigned long color,
 int atomic)
 {
struct drm_mm_node *node;
@@ -165,7 +173,7 @@ struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
if (unlikely(node == NULL))
return NULL;
 
-   drm_mm_insert_helper(hole_node, node, size, alignment);
+   drm_mm_insert_helper(hole_node, node, size, alignment, color);
 
return node;
 }
@@ -181,11 +189,11 @@ int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node,
 {
struct drm_mm_node *hole_node;
 
-   hole_node = drm_mm_search_free(mm, size, alignment, 0);
+   hole_node = drm_mm_search_free(mm, size, alignment, false);
if (!hole_node)
return -ENOSPC;
 
-   drm_mm_insert_helper(hole_node, node, size, alignment);
+   drm_mm_insert_helper(hole_node, 

Re: [REGRESSION] nouveau: Memory corruption using nva3 engine for 0xaf

2012-07-10 Thread Henrik Rydberg
On Mon, Jul 09, 2012 at 03:13:25PM +0200, Henrik Rydberg wrote:
> On Thu, Jul 05, 2012 at 10:34:10AM +0200, Henrik Rydberg wrote:
> > On Thu, Jul 05, 2012 at 08:54:46AM +0200, Henrik Rydberg wrote:
> > > > Thanks for tracking down the source of this corruption.  I don't have
> > > > any such hardware, so until someone can figure it out, I think we
> > > > should apply this patch.
> > > 
> > > In that case, I would have to massage the patch a bit first; it
> > > creates a problem with suspend/resume. Might be something with
> > > nva3_pm.c, who knows. I am really stabbing in the dark here. :-)
> > 
> > It seems the suspend/resume problem is unrelated (bad systemd update),
> > so I am fine with applying this as is. Obviously not the best
> > solution, and if I have time I will continue to look for problems in
> > the nva3 copy code, but for now,
> > 
> > Signed-off-by: Henrik Rydberg 
> 
> I have not encountered the problem in a long while, and I do not have
> the patch applied. It is entirely possible that this was fixed by
> something else. Unless you have already applied the patch, I would
> suggest holding on to it to see if the problem reappears.
> 
> Sorry for the churn.

... and there it was again, hours after giving up on it. Oh well.

What makes this bug particularly difficult is that as soon as the
patch is applied, the problem disappears and does not show itself
again, with or without the patch applied. It sounds very much like
the problem is a failure state that does not get reset by current
mainline, but somehow does get reset with the patch applied.

I also learnt that the problem is not in the nva3_copy code itself; I
reverted nva3_copy.c and nva3_pm.c back to v3.4, but the problem persisted.

A DMA problem elsewhere, in the drm code or in the pci layer, seems
more likely than this particular hardware having problems with this
particular copy engine. As it stands, though, applying the patch is
the only thing known to work.

Thanks,
Henrik
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm/exynos: Add exynos drm specific fb_mmap function

2012-07-10 Thread Prathyush K
This patch adds an exynos drm specific implementation of fb_mmap
which supports mapping a non-contiguous buffer to user space.
The new function does not assume that the frame buffer is contiguous,
and calls dma_mmap_writecombine to map the buffer to user space.
dma_mmap_writecombine can map a contiguous as well as a non-contiguous
buffer, depending on whether an IOMMU mapping has been created
for drm.
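
For illustration, a minimal userspace sketch of the fbdev path this
serves (standard fbdev API only; the device node and the 32bpp pixel
format are assumptions):

#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        struct fb_fix_screeninfo fix;
        uint32_t *fb;
        int fd = open("/dev/fb0", O_RDWR);

        if (fd < 0)
                return 1;
        if (ioctl(fd, FBIOGET_FSCREENINFO, &fix) < 0)
                return 1;

        /* With this patch the mapping works whether the memory behind
         * the framebuffer is physically contiguous or IOMMU-backed. */
        fb = mmap(NULL, fix.smem_len, PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, 0);
        if (fb == MAP_FAILED)
                return 1;

        fb[0] = 0x00ff0000;     /* first pixel red, assuming 32bpp ARGB */

        munmap(fb, fix.smem_len);
        close(fd);
        return 0;
}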

Signed-off-by: Prathyush K 
---
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c |   16 
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index d5586cc..b53e638 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -46,8 +46,24 @@ struct exynos_drm_fbdev {
struct exynos_drm_gem_obj   *exynos_gem_obj;
 };
 
+static int exynos_drm_fb_mmap(struct fb_info *info,
+ struct vm_area_struct *vma)
+{
+   if ((vma->vm_end - vma->vm_start) > info->fix.smem_len)
+   return -EINVAL;
+
+   vma->vm_pgoff = 0;
+   vma->vm_flags |= VM_IO | VM_RESERVED;
+   if (dma_mmap_writecombine(info->device, vma, info->screen_base,
+   info->fix.smem_start, vma->vm_end - vma->vm_start))
+   return -EAGAIN;
+
+   return 0;
+}
+
 static struct fb_ops exynos_drm_fb_ops = {
.owner  = THIS_MODULE,
+   .fb_mmap= exynos_drm_fb_mmap,
.fb_fillrect= cfb_fillrect,
.fb_copyarea= cfb_copyarea,
.fb_imageblit   = cfb_imageblit,
-- 
1.7.0.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel