Fence, timeline and android sync points

2014-08-14 Thread Maarten Lankhorst


On 14-08-14 21:15, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 08:47:16PM +0200, Daniel Vetter wrote:
>> On Thu, Aug 14, 2014 at 8:18 PM, Jerome Glisse  wrote:
>>> Sucks because you can not do weird synchronization like the one i depicted
>>> in another mail in this thread, and for as long as cmdbuf_ioctl does not
>>> give you a fence|syncpt you can not do such a thing cleanly in a non-hackish
>>> way.
>>
>> Actually i915 can, and soon will, do that.
> 
> So you will return fence|syncpoint with each cmdbuf_ioctl ?

It might, soon. There have been patches on the ml about it. It can create a 
userspace android fence backed by a kernel dma fence.
And it will be created like any other userspace android fence. ;-)

Yet even with that, it will continue to support the implicit sync model since 
they're not mutually exclusive.

I also fail to understand why you think a fence should be associated with a 
buffer object. It's the other way around. TTM currently requires buffer objects 
to be fenced off to protect against eviction.

reservation_object is used for this, and by sharing the reservation_object 
pointer with a dma-buf you get cross-device synchronization.

It has a bunch of helpers to make common operations easy, see
include/linux/reservation.h and drivers/dma-buf/reservation.c.
It also allows multiple readers simultaneously across any number of devices. I
intend to use this in nouveau.
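
To make that concrete, here is a minimal sketch (mine, not a definitive
implementation) of the usual pattern for publishing a driver fence through the
reservation_object shared with a dma-buf; it assumes the helpers from the
include/linux/reservation.h mentioned above and trims error handling:

#include <linux/fence.h>
#include <linux/reservation.h>

/* Sketch only: attach a driver fence to the reservation_object shared with
 * the dma-buf so other devices can synchronize against it. */
static int publish_fence(struct reservation_object *resv,
                         struct fence *fence, bool write)
{
    int ret = 0;

    ww_mutex_lock(&resv->lock, NULL);   /* single object, no acquire context */

    if (write) {
        /* an exclusive fence blocks readers and writers on other devices */
        reservation_object_add_excl_fence(resv, fence);
    } else {
        /* shared fences are what allow multiple simultaneous readers */
        ret = reservation_object_reserve_shared(resv);
        if (!ret)
            reservation_object_add_shared_fence(resv, fence);
    }

    ww_mutex_unlock(&resv->lock);
    return ret;
}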

But there's no requirement to use reservation_objects apart from the fact that
that's how TTM currently works, and for implicit syncing in dma-buf. If you
don't need either, go crazy with fence and write your own mechanism on top of
it. Although with android sync and TTM I think I've handled all the common use
cases.

~Maarten


Fence, timeline and android sync points

2014-08-14 Thread Maarten Lankhorst


On 14-08-14 20:26, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 05:58:48PM +0200, Daniel Vetter wrote:
>> On Thu, Aug 14, 2014 at 10:12:06AM -0400, Jerome Glisse wrote:
>>> On Thu, Aug 14, 2014 at 09:16:02AM -0400, Rob Clark wrote:
 On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  
 wrote:
> So this is fundamentally different: fences as they are now allow random driver
> callbacks, and this is bound to get ugly; this is bound to lead to one driver
> doing something that seems innocuous but turns out to wreak havoc when called
> from some other driver's function.


 tbh, that seems solvable by some strict rules about what you can do in
 the callback.. ie. don't do anything you couldn't do in atomic, and
 don't signal another fence.. off the top of my head that seems
 sufficient.

 If the driver getting the callback needs to do more, then it can
 always schedule a worker..

 But I could certainly see the case where the driver waiting on fence
 sets everything up before installing the cb and then just needs to
 write one or a couple regs from the cb.
>>>
>>> Yes, sane code will do sane things; sadly i fear we can not enforce sane
>>> code everywhere, especially with out of tree drivers, and i would rather
>>> force their hand to only allow sane implementations. Providing a callback
>>> api obviously allows them to do crazy stuff.
>>
>> Well then don't support out of tree drivers. Fairly easy problem really,
>> and last time I checked "out of tree drivers suck" isn't a valid
>> objection for upstream code ... It's kinda assumed that they all do, it's
>> why we have staging after all.
> 
> As usual i fail at expressing my point. I am not saying do not merge this
> because of out of tree drivers, i am saying that while doing an api let's make
> it sane and try to make it so that it enforces sanity on anything that lives
> outside our realm.
> 
> And not even thinking outside the kernel tree: someone might come along and
> start using fence, see the callback stuff and start doing crazy stuff with
> it, all this inside a different obscure kernel subsystem. Then someone in
> graphics sees that and uses it as justification to do crazy things too,
> because hey, if he is doing it why can't i?
Since when has crazy been contagious?

And here's what stops you:
1. LOCKDEP
2. PROVE_RCU
3. rcu sparse annotations (kbuild test bot)
4. DEBUG_ATOMIC_SLEEP
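
For illustration (my sketch, not from the thread), this is the kind of callback
those options catch: taking a sleeping lock from a fence callback, which runs
in atomic context, makes DEBUG_ATOMIC_SLEEP splat the first time it executes.
The mutex name is made up.

#include <linux/fence.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(some_driver_mutex);    /* illustrative name */

/* Broken on purpose: fence callbacks run in atomic context (the fence
 * spinlock is held), so the sleeping mutex_lock() below triggers a
 * "sleeping function called from invalid context" warning. */
static void bad_fence_cb(struct fence *f, struct fence_cb *cb)
{
    mutex_lock(&some_driver_mutex);
    /* ... */
    mutex_unlock(&some_driver_mutex);
}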

~Maarten


Fence, timeline and android sync points

2014-08-14 Thread Daniel Vetter
On Thu, Aug 14, 2014 at 8:18 PM, Jerome Glisse  wrote:
> Sucks because you can not do weird synchronization like the one i depicted in
> another mail in this thread, and for as long as cmdbuf_ioctl does not give you
> a fence|syncpt you can not do such a thing cleanly in a non-hackish way.

Actually i915 can, and soon will, do that.

> Sucks because you have a fence object per buffer object and thus overhead grows
> with the number of objects. Not even mentioning fence lifetime issues.
>
> Sucks because sub-buffer allocation is just one of many tricks that can not be
> achieved properly and cleanly with implicit sync.
>
> ...

Well I heard all those reasons and I'm well aware of them. The
problem is that with current hardware the kernel needs to know for
each buffer how long it needs to be kept around, since the hw just can't do
page faulting. Yeah, you can pin them, but for a UMA design that
doesn't go down well with folks.

The other problem is that the Linux Desktop I don't seem to care about
any more kinda relies on implicit syncing right now, so we better keep
that working fairly well. Of course we could dream up a shiny new
world where all of the Linux desktop does explicit syncing, but that
world doesn't exist right now. I mean really if you want to right away
throw implicit syncing overboard that doesn't bode well for the
current linux desktop.

So I don't understand all the implicit syncing bashing. It's here, and
it'll stay for a while longer whether you like it or not ...

Of course that doesn't mean we (intel here) won't support explicit
syncing too, and I really don't see a conflict here when mixing and
matching these two approaches.
-Daniel

> Having code that works around or alleviates some of this is in no way a
> testimony that it's the best solution. I do believe explicit sync to be
> superior; in terms of use cases it allows way more freedom, while the only
> drawback i see is that you have to put some trust into userspace.
>
> So yes implicit sync sucks and it does map to i915 reality as well.




-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Bug 82586] UBO matrix in std140 struct does not work

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82586

--- Comment #4 from pavol at klacansky.com ---
(In reply to comment #3)
> You say it doesn't work.  That's pretty vague.  If it fails in some way,
> what is the mode of the failure?  (See also
> http://en.wikipedia.org/wiki/Failure_causes#Component_failure)
It just does not display anything, hence either the whole pipeline crashes (and
OpenGL KHR_debug does not print it out) or the transform moves vertices outside
the clipping region (I might try transform feedback over the weekend).

> Are the member locations incorrect (query through the API)?  Is the
> row-major / column-major orientation of the matrices incorrect (query
> through the API)?  Is the generated code reading the wrong locations?  Does
> it crash in the driver?  Does your computer catch on fire?
Locations are correct (at least if gl query returns them properly). I do not
know, I do not read anything (or I do not follow). How do I know if it crashes
in driver? No, though DPM is really bad.

> Without any information about how it actually fails, it's pretty much
> impossible to fix.

If I apply one of the matrices, it works, if I multiply two of them, it does
not.

> When I check it now, it should have everything.  The other question: does
> this work on older Mesa?  Say... a 10.2 stable release?
Yes, it works on 10.2.5

> You can... at some point there will probably be a patch for you to test, and
> that won't be in PPA.  You can probably wait to worry about it until such a
> patch exists.

Ok, hope the process is seamless.



[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #16 from Andy Furniss  ---
(In reply to comment #15)
> Does pyrit transfer much data from the GPU to the CPU? If so, my patch
> "gallium/radeon: Do not use u_upload_mgr for buffer downloads" that I have
> just sent to the mesa-dev mailing list might help...

It does help pyrit, but as expected I guess, not valley.



[PATCH] drm/msm: Fix missing unlock on error in msm_fbdev_create()

2014-08-14 Thread Rob Clark
On Wed, Aug 13, 2014 at 9:01 PM,   wrote:
> From: Wei Yongjun 
>
> Add the missing unlock before return from function msm_fbdev_create()
> in the error handling case.
>
> Signed-off-by: Wei Yongjun 

Thanks, I've got it queued up..

BR,
-R

> ---
>  drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
> index 9c5221c..ab5bfd2 100644
> --- a/drivers/gpu/drm/msm/msm_fbdev.c
> +++ b/drivers/gpu/drm/msm/msm_fbdev.c
> @@ -143,7 +143,7 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
> ret = msm_gem_get_iova_locked(fbdev->bo, 0, );
> if (ret) {
> dev_err(dev->dev, "failed to get buffer obj iova: %d\n", ret);
> -   goto fail;
> +   goto fail_unlock;
> }
>
> fbi = framebuffer_alloc(0, dev->dev);
>


Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 11:23:01PM +0200, Daniel Vetter wrote:
> On Thu, Aug 14, 2014 at 9:15 PM, Jerome Glisse  wrote:
> > Cost 1 uint32 per buffer and simple if without locking to check status of
> > a buffer.
> 
> Yeah well except it doesn't and that's why we switch to full blown
> fence objects internally instead of smacking seqno values all over the
> place. At least in i915 because I have seen what this "simple"
> solution looks like if you throw all the complexities into the bin and
> mix it well.

I am kind of curious what can go wrong here? Unless you have thousands
of different hw blocks inside your gpu, all of them with different cmd queues,
then i do not see any issue.

Note that this global seqno i am talking about is only useful for bind/unbind of
buffers in an ideal world of explicit sync, not for synchronization between
command buffers. So the pseudo code would be:

// If buffer_can_unbind(buf, dev) returns true then the buffer is no longer in
// use and can be unbound. If false you can wait on the device wait queue.
bool buffer_can_unbind(buf, dev)
{
  // Is the last gseqno associated with the buffer smaller than the current
  // smallest signaled gseqno?
  if (buf->gseqno <= dev->gseqno_cmin)
return true;
  hw_update_gseqno_min(dev);
  if (buf->gseqno <= dev->gseqno_cmin)
return true;
  for_each_hw_block(hwblock, dev) {
// If that hwblock has not signaled a gseqno bigger than the
// buffer's, and also has work scheduled that might be using
// the buffer (ie the last scheduled gseqno is greater than the
// buffer gseqno), then the buffer might still be
// in use, so assume the worst.
if (hwblock->gseqno < buf->gseqno &&
hwblock->gseqno_last_scheduled >= buf->gseqno)
  return false;
// This means either this block is already past buf->gseqno,
// or this block has nothing in its pipeline that will ever
// use buf.
// As an extra optimization one might add a hwblock mask to buf
// and unmask that hwblock, so further waits will not wait
// on this block as we know for sure it's not involved.
  }
  // Well none of the hwblock is still using that buffer.
  return true;
}

hw_update_gseqno_min(dev)
{
   unsigned nmin = -1;

   for_each_hw_block(hwblock, dev) {
 nmin = min(nmin, hwblock->gseqno);
   }
   // atomic compare-and-exchange: set the new min only if it's bigger than
   // the current min
   atomic_cmp_xchg(dev->gseqno_cmin, nmin)
}
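
The atomic_cmp_xchg() above is pseudo code; a sketch of that "only ever raise
the minimum" update, assuming dev->gseqno_cmin is an atomic_t, would be the
usual compare-and-swap loop:

static void gseqno_raise_min(atomic_t *cmin, u32 nmin)
{
    u32 old = atomic_read(cmin);

    /* only ever move the cached minimum forward; retry if we race */
    while ((s32)(nmin - old) > 0) {
        u32 prev = atomic_cmpxchg(cmin, old, nmin);

        if (prev == old)
            break;          /* we won, minimum raised */
        old = prev;         /* lost the race, re-check against new value */
    }
}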


So this is how it looks in my mind: each hwblock writes to its dedicated
hwblock->gseqno, and for each hwblock you store the last gseqno that was
scheduled. There is no need for any lock, and this is outside any other sync
mechanism.

For hw with preemption, i assume you then have a hwcontext; you use the same
code except that now you have a gseqno per context, which means you need to
store a seqno per context per buffer. With some trickery this might be avoided,
especially if the hw can do atomic cmp_xchg.

Cheers,
Jérôme


> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Bug 77181] radeon -- GPU lockup when hibernating or waking up

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=77181

--- Comment #5 from Rostislav Devyatov  ---
(In reply to Alex Deucher from comment #4)
> Did it used to work previously?  If so what kernel?  If you know what kernel
> used to work, you can use git to bisect the changes and identify what change
> broke it.

Previously, in kernel 3.12.21-r1-gentoo, I had the same problem (the screen is
sometimes messy after hibernation, and the same "fence wait failed" error in
syslog). Before that, I ran deblobbed kernels 3.8.13-gentoo and 3.4.9-gentoo,
and there were also problems with the screen, but a bit different. The screen
just became black (sometimes it happened after hibernation, sometimes just
during the normal work), but the mouse pointer was there, and the error in
syslog was different: it started with "kernel BUG" at some location. I don't
have the syslog from 3.8.13, but in 3.4.9 it was at mm/slub.c:3474 .

On my system, both problems (the one I have now and the one I had previously)
happen rather irregularly.



[Bug 82139] [r600g, bisected] multiple ubo piglit regressions

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82139

Marek Olšák  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Marek Olšák  ---
Fixed by da9c3ed304be5d08ff989d61c6e2d1be8a845767. Closing.



[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #15 from Niels Ole Salscheider  ---
Does pyrit transfer much data from the GPU to the CPU? If so, my patch
"gallium/radeon: Do not use u_upload_mgr for buffer downloads" that I have just
sent to the mesa-dev mailing list might help...



[Bug 82019] Unreal Engine Effects Cave demo lockup HD 7970M

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82019

Christoph Haag  changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |MOVED

--- Comment #5 from Christoph Haag  ---
(In reply to comment #4)

Good news: It has nothing to do with nine.
I think it's caused by the GALLIUM_HUD and so I opened bug 82628.

On second thought, I could have seen that the hang is actually the same one I
had in the first place, and this bug would probably have been perfectly fine to
post in too, but now it's too late.



[PATCH 4/4] drm/radeon: add flag to NOT fence a BO on CS

2014-08-14 Thread Christian König
From: Christian König 

This is useful and only allowed with pure sync objects. E.g.
the CS depends on finishing a previous CS, but should not add
the new fence to the handle.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_cs.c | 29 -
 include/uapi/drm/radeon_drm.h  |  1 +
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 11e4789..bdb8eda 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -225,7 +225,7 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, 
u32 ring, s32 priority
return 0;
 }

-static void radeon_cs_sync_rings(struct radeon_cs_parser *p)
+static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
 {
int i;

@@ -237,6 +237,18 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser 
*p)
if (!bo)
continue;

+   fence = bo->tbo.sync_obj;
+   if (reloc->flags & RADEON_RELOC_DONT_FENCE) {
+
+   /* only allowed for sync objects */
+   if (radeon_bo_size(bo) != 0)
+   return -EINVAL;
+
+   list_del(>relocs[i].tv.head);
+   radeon_bo_unreserve(bo);
+   p->relocs[i].robj = NULL;
+   }
+
/* always sync to the last operation
   the clients doesn't know about */
radeon_semaphore_sync_to(p->ib.presync, bo->last_sync);
@@ -245,8 +257,6 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser *p)
reloc->flags & RADEON_RELOC_DONT_SYNC)
continue;

-   fence = bo->tbo.sync_obj;
-
if (bo->written && radeon_fence_signaled(bo->written))
radeon_fence_unref(>written);

@@ -257,6 +267,8 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser *p)
else
radeon_semaphore_sync_to(p->ib.postsync, fence);
}
+
+   return 0;
 }

 /* XXX: note that this is called from the legacy UMS CS ioctl as well */
@@ -498,7 +510,10 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 (parser->ring == TN_RING_TYPE_VCE2_INDEX))
radeon_vce_note_usage(rdev);

-   radeon_cs_sync_rings(parser);
+   r = radeon_cs_sync_rings(parser);
+   if (r)
+   return r;
+
r = radeon_ib_schedule(rdev, >ib, NULL);
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
@@ -585,7 +600,11 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
*rdev,
if (r) {
goto out;
}
-   radeon_cs_sync_rings(parser);
+
+   r = radeon_cs_sync_rings(parser);
+   if (r)
+   goto out;
+
radeon_semaphore_sync_to(parser->ib.presync, vm->fence);

if ((rdev->family >= CHIP_TAHITI) &&
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 5bd3f68..289a0ca 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -946,6 +946,7 @@ struct drm_radeon_cs_chunk {
 /* drm_radeon_cs_reloc.flags */
 #define RADEON_RELOC_PRIO_MASK (0xf << 0)
 #define RADEON_RELOC_DONT_SYNC (1 << 4)
+#define RADEON_RELOC_DONT_FENCE(1 << 5)

 struct drm_radeon_cs_reloc {
uint32_thandle;
-- 
1.9.1



[PATCH 3/4] drm/radeon: add DONT_SYNC flag to CS relocs

2014-08-14 Thread Christian König
From: Christian König 

This allows userspace to explicitly not sync submissions to
different rings as long as all uses stay in the same client.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h |  3 +++
 drivers/gpu/drm/radeon/radeon_cs.c  | 23 ++-
 drivers/gpu/drm/radeon/radeon_gem.c |  1 +
 drivers/gpu/drm/radeon/radeon_ttm.c |  3 +++
 include/uapi/drm/radeon_drm.h   |  2 ++
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index c0f7773..740a0b2 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -478,6 +478,8 @@ struct radeon_bo {
u32 tiling_flags;
u32 pitch;
int surface_reg;
+   struct drm_file *last_user;
+   struct radeon_fence *last_sync;
struct radeon_fence *written;
/* list of all virtual address to which this bo
 * is associated to
@@ -1018,6 +1020,7 @@ struct radeon_cs_reloc {
unsignedallowed_domains;
uint32_ttiling_flags;
uint32_thandle;
+   uint32_tflags;
boolwritten;
 };

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 3aa7e48..11e4789 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -166,6 +166,7 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser 
*p)

p->relocs[i].tv.bo = >relocs[i].robj->tbo;
p->relocs[i].handle = r->handle;
+   p->relocs[i].flags = r->flags;
p->relocs[i].written = !!r->write_domain;

radeon_cs_buckets_add(, >relocs[i].tv.head,
@@ -236,6 +237,14 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser 
*p)
if (!bo)
continue;

+   /* always sync to the last operation
+  the clients doesn't know about */
+   radeon_semaphore_sync_to(p->ib.presync, bo->last_sync);
+
+   if (bo->last_user == p->filp &&
+   reloc->flags & RADEON_RELOC_DONT_SYNC)
+   continue;
+
fence = bo->tbo.sync_obj;

if (bo->written && radeon_fence_signaled(bo->written))
@@ -421,7 +430,19 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser 
*parser, int error, bo
struct radeon_cs_reloc *reloc = >relocs[i];
struct radeon_bo *bo = reloc->robj;

-   if (!bo || !reloc->written)
+   if (!bo)
+   continue;
+
+   /* if the client changes remember that and always 
serialize
+  all operations from different clients */
+   if (bo->last_user != parser->filp && bo->tbo.sync_obj) {
+   struct radeon_fence *fence = bo->tbo.sync_obj;
+   radeon_fence_unref(>last_sync);
+   bo->last_sync = radeon_fence_ref(fence);
+   }
+   bo->last_user = parser->filp;
+
+   if (!reloc->written)
continue;

radeon_fence_unref(>written);
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index bfd7e1b..c73dbc1 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -259,6 +259,7 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
*data,
r = radeon_gem_handle_lockup(rdev, r);
return r;
}
+   gem_to_radeon_bo(gobj)->last_user = filp;
r = drm_gem_handle_create(filp, gobj, );
/* drop reference from allocate - handle holds it now */
drm_gem_object_unreference_unlocked(gobj);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 76be612..a4f964f 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -273,6 +273,9 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
);

if (!r) {
+   rbo->last_user = NULL;
+   radeon_fence_unref(>last_sync);
+   rbo->last_sync = radeon_fence_ref(fence);
radeon_fence_unref(>written);
rbo->written = radeon_fence_ref(fence);
}
diff --git a/include/uapi/drm/radeon_drm.h b/include/uapi/drm/radeon_drm.h
index 509b2d7..5bd3f68 100644
--- a/include/uapi/drm/radeon_drm.h
+++ b/include/uapi/drm/radeon_drm.h
@@ -944,6 +944,8 @@ struct 

[PATCH 2/4] drm/radeon: allow concurrent BO access by different engines

2014-08-14 Thread Christian König
From: Christian König 

This patch allows concurrent access of different engines to the same BO
as long as everybody only reads from it. Since TTM can't (yet) handle
multiple fences for one BO we still sync the fence after executing the IB.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h |  2 ++
 drivers/gpu/drm/radeon/radeon_cs.c  | 24 +++-
 drivers/gpu/drm/radeon/radeon_ttm.c |  8 
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 4579361..c0f7773 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -478,6 +478,7 @@ struct radeon_bo {
u32 tiling_flags;
u32 pitch;
int surface_reg;
+   struct radeon_fence *written;
/* list of all virtual address to which this bo
 * is associated to
 */
@@ -1017,6 +1018,7 @@ struct radeon_cs_reloc {
unsignedallowed_domains;
uint32_ttiling_flags;
uint32_thandle;
+   boolwritten;
 };

 struct radeon_cs_chunk {
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 2be4fc5..3aa7e48 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -166,6 +166,7 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser 
*p)

p->relocs[i].tv.bo = >relocs[i].robj->tbo;
p->relocs[i].handle = r->handle;
+   p->relocs[i].written = !!r->write_domain;

radeon_cs_buckets_add(, >relocs[i].tv.head,
  priority);
@@ -236,7 +237,16 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser 
*p)
continue;

fence = bo->tbo.sync_obj;
-   radeon_semaphore_sync_to(p->ib.presync, fence);
+
+   if (bo->written && radeon_fence_signaled(bo->written))
+   radeon_fence_unref(>written);
+
+   /* if either this CS or the last one write to
+  the BO we sync before executing the IB */
+   if (reloc->written || bo->written)
+   radeon_semaphore_sync_to(p->ib.presync, fence);
+   else
+   radeon_semaphore_sync_to(p->ib.postsync, fence);
}
 }

@@ -406,6 +416,18 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser 
*parser, int error, bo
 */
list_sort(NULL, >validated, cmp_size_smaller_first);

+   /* remember which BOs we write to */
+   for (i = 0; i < parser->nrelocs; i++) {
+   struct radeon_cs_reloc *reloc = >relocs[i];
+   struct radeon_bo *bo = reloc->robj;
+
+   if (!bo || !reloc->written)
+   continue;
+
+   radeon_fence_unref(>written);
+   bo->written = radeon_fence_ref(parser->ib.fence);
+   }
+
ttm_eu_fence_buffer_objects(>ticket,
>validated,
parser->ib.fence);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 72afe82..76be612 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -228,10 +228,12 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
struct radeon_device *rdev;
uint64_t old_start, new_start;
struct radeon_fence *fence;
+   struct radeon_bo *rbo;
int r, ridx;

rdev = radeon_get_rdev(bo->bdev);
ridx = radeon_copy_ring_index(rdev);
+   rbo = container_of(bo, struct radeon_bo, tbo);
old_start = old_mem->start << PAGE_SHIFT;
new_start = new_mem->start << PAGE_SHIFT;

@@ -269,6 +271,12 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
r = radeon_copy(rdev, old_start, new_start,
new_mem->num_pages * (PAGE_SIZE / 
RADEON_GPU_PAGE_SIZE), /* GPU pages */
);
+
+   if (!r) {
+   radeon_fence_unref(>written);
+   rbo->written = radeon_fence_ref(fence);
+   }
+
/* FIXME: handle copy error */
r = ttm_bo_move_accel_cleanup(bo, (void *)fence,
  evict, no_wait_gpu, new_mem);
-- 
1.9.1



[PATCH 1/4] drm/radeon: add pre and post IB sync

2014-08-14 Thread Christian König
From: Christian König 

This allows us to run different engines concurrently and
still have the fences sync to all operations as currently
required by TTM.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon.h|  3 ++-
 drivers/gpu/drm/radeon/radeon_cs.c | 18 --
 drivers/gpu/drm/radeon/radeon_ib.c | 30 ++
 drivers/gpu/drm/radeon/radeon_vm.c |  8 
 4 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index e715e0c..4579361 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -800,7 +800,8 @@ struct radeon_ib {
struct radeon_fence *fence;
struct radeon_vm*vm;
boolis_const_ib;
-   struct radeon_semaphore *semaphore;
+   struct radeon_semaphore *presync;
+   struct radeon_semaphore *postsync;
 };

 struct radeon_ring {
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index ee712c1..2be4fc5 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -228,11 +228,15 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser 
*p)
int i;

for (i = 0; i < p->nrelocs; i++) {
-   if (!p->relocs[i].robj)
+   struct radeon_cs_reloc *reloc = >relocs[i];
+   struct radeon_bo *bo = reloc->robj;
+   struct radeon_fence *fence;
+
+   if (!bo)
continue;

-   radeon_semaphore_sync_to(p->ib.semaphore,
-p->relocs[i].robj->tbo.sync_obj);
+   fence = bo->tbo.sync_obj;
+   radeon_semaphore_sync_to(p->ib.presync, fence);
}
 }

@@ -252,9 +256,11 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void 
*data)
INIT_LIST_HEAD(>validated);
p->idx = 0;
p->ib.sa_bo = NULL;
-   p->ib.semaphore = NULL;
+   p->ib.presync = NULL;
+   p->ib.postsync = NULL;
p->const_ib.sa_bo = NULL;
-   p->const_ib.semaphore = NULL;
+   p->const_ib.presync = NULL;
+   p->const_ib.postsync = NULL;
p->chunk_ib_idx = -1;
p->chunk_relocs_idx = -1;
p->chunk_flags_idx = -1;
@@ -537,7 +543,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
goto out;
}
radeon_cs_sync_rings(parser);
-   radeon_semaphore_sync_to(parser->ib.semaphore, vm->fence);
+   radeon_semaphore_sync_to(parser->ib.presync, vm->fence);

if ((rdev->family >= CHIP_TAHITI) &&
(parser->chunk_const_ib_idx != -1)) {
diff --git a/drivers/gpu/drm/radeon/radeon_ib.c 
b/drivers/gpu/drm/radeon/radeon_ib.c
index 65b0c21..be09d70 100644
--- a/drivers/gpu/drm/radeon/radeon_ib.c
+++ b/drivers/gpu/drm/radeon/radeon_ib.c
@@ -64,10 +64,13 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
return r;
}

-   r = radeon_semaphore_create(rdev, >semaphore);
-   if (r) {
+   r = radeon_semaphore_create(rdev, >presync);
+   if (r)
+   return r;
+
+   r = radeon_semaphore_create(rdev, >postsync);
+   if (r)
return r;
-   }

ib->ring = ring;
ib->fence = NULL;
@@ -96,7 +99,8 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
  */
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib)
 {
-   radeon_semaphore_free(rdev, >semaphore, ib->fence);
+   radeon_semaphore_free(rdev, >presync, ib->fence);
+   radeon_semaphore_free(rdev, >postsync, ib->fence);
radeon_sa_bo_free(rdev, >sa_bo, ib->fence);
radeon_fence_unref(>fence);
 }
@@ -144,11 +148,11 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct 
radeon_ib *ib,
if (ib->vm) {
struct radeon_fence *vm_id_fence;
vm_id_fence = radeon_vm_grab_id(rdev, ib->vm, ib->ring);
-   radeon_semaphore_sync_to(ib->semaphore, vm_id_fence);
+   radeon_semaphore_sync_to(ib->presync, vm_id_fence);
}

-   /* sync with other rings */
-   r = radeon_semaphore_sync_rings(rdev, ib->semaphore, ib->ring);
+   /* sync with other rings before IB execution */
+   r = radeon_semaphore_sync_rings(rdev, ib->presync, ib->ring);
if (r) {
dev_err(rdev->dev, "failed to sync rings (%d)\n", r);
radeon_ring_unlock_undo(rdev, ring);
@@ -160,9 +164,19 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct 
radeon_ib *ib,

if (const_ib) {
radeon_ring_ib_execute(rdev, const_ib->ring, const_ib);
-   radeon_semaphore_free(rdev, _ib->semaphore, NULL);
+   radeon_semaphore_free(rdev, _ib->presync, NULL);
+   radeon_semaphore_free(rdev, _ib->postsync, 

drm/radeon: giving userspace control over hardware engine sync

2014-08-14 Thread Christian König
Hello everyone,

this set of patches adds the ability for userspace to better control when
synchronization between the different hardware engines on radeon happens.
Previously every access to a buffer object was serialized, and concurrent
execution could only happen if command submissions didn't share any buffer
handle.

Patch #1 in this series adds the ability to sync not only before the IB
execution, but also after it, before the fence value is written. This is a
workaround because TTM currently can't handle multiple fences attached to a
single buffer object.

Patch #2 allows concurrent execution of command submissions if there is only
read-only access to the same buffer.

Patch #3 adds a DONT_SYNC flag to each buffer object sent to the kernel, which
allows userspace to explicitly note that concurrent access to a buffer is ok.
The usage of this flag is restricted in such a way that each operation the
client doesn't know about (eviction, access by other clients, etc.) is still
implicitly synced to.

Patch #4 adds a DONT_FENCE flag that tells the kernel to sync to all operations
on a buffer handle, but not fence that handle with the current command
submission. This is necessary because we currently abuse zero-sized buffer
objects as fence handles.
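
As a rough sketch of how userspace would consume the two new flags (not part of
the series; the handles and domain below are placeholders, the struct fields
are the ones extended in patches #3 and #4):

#include <stdint.h>
#include <string.h>
#include "radeon_drm.h"   /* with the DONT_SYNC/DONT_FENCE additions applied */

static void fill_relocs(struct drm_radeon_cs_reloc *relocs,
                        uint32_t data_bo, uint32_t sync_obj_bo)
{
    memset(relocs, 0, sizeof(*relocs) * 2);

    /* normal data buffer: this client accepts concurrent access to it */
    relocs[0].handle       = data_bo;
    relocs[0].read_domains = RADEON_GEM_DOMAIN_VRAM;
    relocs[0].flags        = RADEON_RELOC_DONT_SYNC;

    /* zero-sized BO abused as a fence handle: wait for it, but don't
     * attach this submission's fence to it */
    relocs[1].handle       = sync_obj_bo;
    relocs[1].flags        = RADEON_RELOC_DONT_FENCE;
}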

Please review and comment,
Christian.



Fence, timeline and android sync points

2014-08-14 Thread Rob Clark
On Thu, Aug 14, 2014 at 10:12 AM, Jerome Glisse  wrote:
> On Thu, Aug 14, 2014 at 09:16:02AM -0400, Rob Clark wrote:
>> On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  wrote:
>> > So this is fundamentally different: fences as they are now allow random
>> > driver callbacks, and this is bound to get ugly; this is bound to lead to
>> > one driver doing something that seems innocuous but turns out to wreak havoc
>> > when called from some other driver's function.
>>
>>
>> tbh, that seems solvable by some strict rules about what you can do in
>> the callback.. ie. don't do anything you couldn't do in atomic, and
>> don't signal another fence.. off the top of my head that seems
>> sufficient.
>>
>> If the driver getting the callback needs to do more, then it can
>> always schedule a worker..
>>
>> But I could certainly see the case where the driver waiting on fence
>> sets everything up before installing the cb and then just needs to
>> write one or a couple regs from the cb.
>
> Yes, sane code will do sane things; sadly i fear we can not enforce sane
> code everywhere, especially with out of tree drivers, and i would rather
> force their hand to only allow sane implementations. Providing a callback
> api obviously allows them to do crazy stuff.

callback's hard, let's go shopping..

But seriously, we've solved problems like this w/ various kernel debug
features before.  I am pretty sure we could make lock debugging, and
maybe some new fence debugging option, catch a lot of things.  There
is probably a better way, but a dummy spinlock around the callback, or
maybe preempt_enable()/disable() around the callback?  And we should
be able to come up with a way to catch signalling a fence from cb..
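
Something like this sketch of the dummy-spinlock idea (names made up) would let
LOCKDEP and DEBUG_ATOMIC_SLEEP do the policing:

#include <linux/fence.h>
#include <linux/spinlock.h>

/* The lock is never contended; it only exists so the debug options yell if
 * a callback sleeps or takes a sleeping lock. */
static DEFINE_SPINLOCK(fence_cb_debug_lock);

static void run_fence_cb_checked(struct fence *fence, struct fence_cb *cb)
{
    unsigned long flags;

    spin_lock_irqsave(&fence_cb_debug_lock, flags);
    cb->func(fence, cb);        /* must be safe in atomic context */
    spin_unlock_irqrestore(&fence_cb_debug_lock, flags);
}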

A bit of extra rope with a warning sign not to hang yourself and debug
features to tell people when they are hanging themselves seems like
the better option to me.

BR,
-R

>>
>> BR,
>> -R


[Intel-gfx] [PATCH] drm: fix plane rotation when restoring fbdev configuration

2014-08-14 Thread Daniel Vetter
On Thu, Aug 14, 2014 at 04:33:18PM +0100, Thomas Wood wrote:
> Make sure plane rotation is reset correctly when restoring the fbdev
> configuration by using drm_mode_plane_set_obj_prop. This calls the
> driver's set_property callback and ensures the rotation is actually
> changed.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82236
> Signed-off-by: Thomas Wood 

Commit message is missing the citation of the offending commit that
introduced this. With that addressed this is

Reviewed-by: Daniel Vetter 

And please cc all the people involved in the offending commit next time
around, too.
-Daniel
> ---
>  drivers/gpu/drm/drm_crtc.c  | 25 -
>  drivers/gpu/drm/drm_fb_helper.c |  6 +++---
>  include/drm/drm_crtc.h  |  3 +++
>  3 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
> index f09b752..95f330a 100644
> --- a/drivers/gpu/drm/drm_crtc.c
> +++ b/drivers/gpu/drm/drm_crtc.c
> @@ -4175,12 +4175,25 @@ static int drm_mode_crtc_set_obj_prop(struct 
> drm_mode_object *obj,
>   return ret;
>  }
>  
> -static int drm_mode_plane_set_obj_prop(struct drm_mode_object *obj,
> -   struct drm_property *property,
> -   uint64_t value)
> +/**
> + * drm_mode_plane_set_obj_prop - set the value of a property
> + * @plane: drm plane object to set property value for
> + * @property: property to set
> + * @val: value the property should be set to
> + *
> + * This functions sets a given property on a given plane object. This 
> function
> + * calls the driver's ->set_property callback and changes the software state 
> of
> + * the property if the callback succeeds.
> + *
> + * Returns:
> + * Zero on success, error code on failure.
> + */
> +int drm_mode_plane_set_obj_prop(struct drm_plane *plane,
> + struct drm_property *property,
> + uint64_t value)
>  {
>   int ret = -EINVAL;
> - struct drm_plane *plane = obj_to_plane(obj);
> + struct drm_mode_object *obj = >base;
>  
>   if (plane->funcs->set_property)
>   ret = plane->funcs->set_property(plane, property, value);
> @@ -4189,6 +4202,7 @@ static int drm_mode_plane_set_obj_prop(struct 
> drm_mode_object *obj,
>  
>   return ret;
>  }
> +EXPORT_SYMBOL(drm_mode_plane_set_obj_prop);
>  
>  /**
>   * drm_mode_getproperty_ioctl - get the current value of a object's property
> @@ -4327,7 +4341,8 @@ int drm_mode_obj_set_property_ioctl(struct drm_device 
> *dev, void *data,
>   ret = drm_mode_crtc_set_obj_prop(arg_obj, property, arg->value);
>   break;
>   case DRM_MODE_OBJECT_PLANE:
> - ret = drm_mode_plane_set_obj_prop(arg_obj, property, 
> arg->value);
> + ret = drm_mode_plane_set_obj_prop(obj_to_plane(arg_obj),
> +   property, arg->value);
>   break;
>   }
>  
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 63d7b8e..0c0c39b 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -296,9 +296,9 @@ static bool restore_fbdev_mode(struct drm_fb_helper 
> *fb_helper)
>   drm_plane_force_disable(plane);
>  
>   if (dev->mode_config.rotation_property) {
> - drm_object_property_set_value(>base,
> - dev->mode_config.rotation_property,
> - BIT(DRM_ROTATE_0));
> + drm_mode_plane_set_obj_prop(plane,
> + 
> dev->mode_config.rotation_property,
> + BIT(DRM_ROTATE_0));
>   }
>   }
>  
> diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
> index 0375d75..31344bf 100644
> --- a/include/drm/drm_crtc.h
> +++ b/include/drm/drm_crtc.h
> @@ -1127,6 +1127,9 @@ extern int drm_mode_obj_get_properties_ioctl(struct 
> drm_device *dev, void *data,
>struct drm_file *file_priv);
>  extern int drm_mode_obj_set_property_ioctl(struct drm_device *dev, void 
> *data,
>  struct drm_file *file_priv);
> +extern int drm_mode_plane_set_obj_prop(struct drm_plane *plane,
> +struct drm_property *property,
> +uint64_t value);
>  
>  extern void drm_fb_get_bpp_depth(uint32_t format, unsigned int *depth,
>int *bpp);
> -- 
> 1.9.3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #29 from sarnex  ---
(In reply to comment #28)
> That's right -- it's working now. No need for PPAs on Arch when you have the
> AUR. There's also a pacman repository for mesa-git that compiles daily so
> it's just a matter of installing lib32-mesa-git and mesa-git.

Ah, I'm a debian-based plebian. Glad to see it works.



[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #28 from mmstickman at gmail.com ---
That's right -- it's working now. No need for PPAs on Arch when you have the
AUR. There's also a pacman repository for mesa-git that compiles daily so it's
just a matter of installing lib32-mesa-git and mesa-git.



Fence, timeline and android sync points

2014-08-14 Thread Daniel Vetter
On Thu, Aug 14, 2014 at 10:12:06AM -0400, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 09:16:02AM -0400, Rob Clark wrote:
> > On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  
> > wrote:
> > > So this is fundamentally different: fences as they are now allow random
> > > driver callbacks, and this is bound to get ugly; this is bound to lead to
> > > one driver doing something that seems innocuous but turns out to wreak
> > > havoc when called from some other driver's function.
> > 
> > 
> > tbh, that seems solvable by some strict rules about what you can do in
> > the callback.. ie. don't do anything you couldn't do in atomic, and
> > don't signal another fence.. off the top of my head that seems
> > sufficient.
> > 
> > If the driver getting the callback needs to do more, then it can
> > always schedule a worker..
> > 
> > But I could certainly see the case where the driver waiting on fence
> > sets everything up before installing the cb and then just needs to
> > write one or a couple regs from the cb.
> 
> Yes, sane code will do sane things; sadly i fear we can not enforce sane
> code everywhere, especially with out of tree drivers, and i would rather
> force their hand to only allow sane implementations. Providing a callback
> api obviously allows them to do crazy stuff.

Well then don't support out of tree drivers. Fairly easy problem really,
and last time I checked "out of tree drivers suck" isn't a valid
objection for upstream code ... It's kinda assumed that they all do, it's
why we have staging after all.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Fence, timeline and android sync points

2014-08-14 Thread Daniel Vetter
On Thu, Aug 14, 2014 at 10:23:30AM -0400, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 11:08:34AM +0200, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 01:07:20PM -0400, Jerome Glisse wrote:
> > > Let me make this crystal clear this must be a valid kernel page that have 
> > > a
> > > valid kernel mapping for the lifetime of the device. Hence there is no 
> > > access
> > > to mmio space or anything, just a regular kernel page. If can not rely on 
> > > that
> > > this is a sad world.
> > > 
> > > That being said, yes i am aware that some device incapacity to write to 
> > > such
> > > a page. For those dumb hardware what you need to do is have the irq 
> > > handler
> > > write to this page on behalf of the hardware. But i would like to know any
> > > hardware than can not write a single dword from its ring buffer.
> > > 
> > > The only tricky part in this, is when device is unloaded and driver is 
> > > removing
> > > itself, it obviously need to synchronize itself with anyone possibly 
> > > waiting on
> > > it and possibly reading. But truly this is not that hard to solve.
> > > 
> > > So tell me once the above is clear what kind of scary thing can happen 
> > > when cpu
> > > or a device _read_ a kernel page ?
> > 
> > It's not reading it, it's making sense of what you read. In i915 we had
> > exactly the (timeline, seqno) value pair design for fences for a long
> > time, and we're switching away from it since it stops working when you
> > have preemption and scheduler. Or at least it gets really interesting to
> > interpret the data written into the page.
> > 
> > So I don't want to expose that to other drivers if we decided that
> > exposing this internally is a stupid idea.
> 
> I am well aware of that, but context scheduling really is what the timeline i
> talk about is. The user timeline should be consider like a single cpu thread 
> on
> to which operation involving different hw are scheduled. You can have the hw
> switch from one timeline to another and a seqno is only meaningful per 
> timeline.
> 
> The whole preemption and scheduling is something bound to happen on gpu and we
> will want to integrate with core scheduler to manage time slice allocated to
> process, but the we need the concept of thread in which operation on same hw
> block are processed in a linear fashion but still allowing concurrency with
> other hw block.

Well yeah, it's just that a gpu thread doesn't have a lot to do with a cpu
process, at least in the current drm world. It would be nice to make that a
bit more cross-driver for management, but we didn't even manage to pull
that off for gpu memory. So I don't think it'll happen anytime soonish.

> > > > > > So from that pov (presuming I didn't miss anything) your proposal is
> > > > > > identical to what we have, minor some different color choices (like 
> > > > > > where
> > > > > > to place the callback queue).
> > > > > 
> > > > > No callback is the mantra here, and instead of bolting free living 
> > > > > fence
> > > > > to buffer object, they are associated with timeline which means you 
> > > > > do not
> > > > > need to go over all buffer object to know what you need to wait for.
> > > > 
> > > > Ok, then I guess I didn't understand that part of your the proposal. Can
> > > > you please elaborate a bit more how you want to synchronize mulitple
> > > > drivers accessing a dma-buf object and what piece of state we need to
> > > > associate to the dma-buf to make this happen?
> > > 
> > > Beauty of it you associate ziltch to the buffer. So for existing cs ioctl 
> > > where
> > > the implicit synchronization is the rule it enforce mandatory 
> > > synchronization
> > > accross all hw timeline on which a buffer shows up :
> > >   for_each_buffer_in_cmdbuffer(buffer, cmdbuf) {
> > > if (!cmdbuf_write_to_buffer(buffer, cmdbuf))
> > >   continue;
> > > for_each_process_sharing_buffer(buffer, process) {
> > >   schedule_fence(process->implicit_timeline, cmdbuf->fence)
> > > }
> > >   }
> > > 
> > > Next time another process use current ioctl with implicit sync it will 
> > > synch with
> > > the last fence for any shared resource. This might sounds bad but truely 
> > > as it is
> > > right now this is already how it happens (at least for radeon).
> > 
> > Well i915 is a lot better than that. And I'm not going to implement some
> > special-case for dma-buf shared buffers just because radeon sucks and
> > wants to enforce that suckage on everyone else.
> 
> I guess i am having a hard time expressing myself. What i am saying here is
> that implicit synchronization sucks because it has to assume the worst case,
> and this is what the current code does, and i am sure intel is doing something
> similar with today's code.
> 
> Explicit synchronization allows more flexibility, but fence code as it is
> designed does not allow us to fully go down that line, because it associates
> fences with buffer objects, which is the biggest shortcoming of implicit sync.

I don't see how 

[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #27 from sarnex  ---
(In reply to comment #26)
> After compiling the latest version of mesa from git, I'm still getting this
> bug with my Radeon HD 7950 spamming the system log.
> 
> [  372.835016] [TTM] Illegal buffer object size
> [  372.835019] [TTM] Illegal buffer object size
> [  372.835021] [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM
> object (0, 6, 4096, -22)
> 
> So it doesn't seem to be fixed for me at least.

You need to compile the 32 bit mesa packages you are using as well. It's much
easier to just install the oibaf PPA. I have no issues after the fix.



[Bug 82162] Syslog flooded by [drm:radeon_gem_object_create] errors

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82162

--- Comment #26 from mmstickman at gmail.com ---
After compiling the latest version of mesa from git, I'm still getting this bug
with my Radeon HD 7950 spamming the system log.

[  372.835016] [TTM] Illegal buffer object size
[  372.835019] [TTM] Illegal buffer object size
[  372.835021] [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM
object (0, 6, 4096, -22)

So it doesn't seem to be fixed for me at least.



[Bug 82019] Unreal Engine Effects Cave demo lockup HD 7970M

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82019

Christoph Haag  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #4 from Christoph Haag  ---
(In reply to comment #3)
> Well, not sure what exactly fixed it, but with a recent mesa git build it
> runs without a hang now.

Oh wow, no.

I have a mesa build from two days ago with the master branch of the nine
repository merged into it. With that build it works without hang.

I now tried it with a clean mesa master git build from today but it hangs the
GPU.
I then tried merging the nine master branch in, had to disable opencl and
osmesa, and it hangs too.

I do not know yet if having the nine repository merged makes any difference.

At least it's a small time window to bisect.



[PATCH 3/4] drm/radeon: add DONT_SYNC flag to CS relocs

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 04:34:29PM -0400, Alex Deucher wrote:
> On Thu, Aug 14, 2014 at 2:43 PM, Jerome Glisse  wrote:
> > On Thu, Aug 14, 2014 at 06:12:04PM +0200, Christian König wrote:
> >> From: Christian König 
> >>
> >> This allows userspace to explicitly don't sync submission to
> >> different rings as long as all uses stay in the same client.
> >>
> >> Signed-off-by: Christian König 
> >> ---
> >>  drivers/gpu/drm/radeon/radeon.h |  3 +++
> >>  drivers/gpu/drm/radeon/radeon_cs.c  | 23 ++-
> >>  drivers/gpu/drm/radeon/radeon_gem.c |  1 +
> >>  drivers/gpu/drm/radeon/radeon_ttm.c |  3 +++
> >>  include/uapi/drm/radeon_drm.h   |  2 ++
> >>  5 files changed, 31 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/radeon/radeon.h 
> >> b/drivers/gpu/drm/radeon/radeon.h
> >> index c0f7773..740a0b2 100644
> >> --- a/drivers/gpu/drm/radeon/radeon.h
> >> +++ b/drivers/gpu/drm/radeon/radeon.h
> >> @@ -478,6 +478,8 @@ struct radeon_bo {
> >>   u32 tiling_flags;
> >>   u32 pitch;
> >>   int surface_reg;
> >> + struct drm_file *last_user;
> >> + struct radeon_fence *last_sync;
> >>   struct radeon_fence *written;
> >>   /* list of all virtual address to which this bo
> >>* is associated to
> >> @@ -1018,6 +1020,7 @@ struct radeon_cs_reloc {
> >>   unsignedallowed_domains;
> >>   uint32_ttiling_flags;
> >>   uint32_thandle;
> >> + uint32_tflags;
> >>   boolwritten;
> >>  };
> >>
> >> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> >> b/drivers/gpu/drm/radeon/radeon_cs.c
> >> index 3aa7e48..11e4789 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> >> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> >> @@ -166,6 +166,7 @@ static int radeon_cs_parser_relocs(struct 
> >> radeon_cs_parser *p)
> >>
> >>   p->relocs[i].tv.bo = >relocs[i].robj->tbo;
> >>   p->relocs[i].handle = r->handle;
> >> + p->relocs[i].flags = r->flags;
> >>   p->relocs[i].written = !!r->write_domain;
> >>
> >>   radeon_cs_buckets_add(, >relocs[i].tv.head,
> >> @@ -236,6 +237,14 @@ static void radeon_cs_sync_rings(struct 
> >> radeon_cs_parser *p)
> >>   if (!bo)
> >>   continue;
> >>
> >> + /* always sync to the last operation
> >> +the clients doesn't know about */
> >> + radeon_semaphore_sync_to(p->ib.presync, bo->last_sync);
> >> +
> >> + if (bo->last_user == p->filp &&
> >> + reloc->flags & RADEON_RELOC_DONT_SYNC)
> >> + continue;
> >> +
> >>   fence = bo->tbo.sync_obj;
> >>
> >>   if (bo->written && radeon_fence_signaled(bo->written))
> >> @@ -421,7 +430,19 @@ static void radeon_cs_parser_fini(struct 
> >> radeon_cs_parser *parser, int error, bo
> >>   struct radeon_cs_reloc *reloc = >relocs[i];
> >>   struct radeon_bo *bo = reloc->robj;
> >>
> >> - if (!bo || !reloc->written)
> >> + if (!bo)
> >> + continue;
> >> +
> >> + /* if the client changes remember that and always 
> >> serialize
> >> +all operations from different clients */
> >> + if (bo->last_user != parser->filp && 
> >> bo->tbo.sync_obj) {
> >> + struct radeon_fence *fence = 
> >> bo->tbo.sync_obj;
> >> + radeon_fence_unref(>last_sync);
> >> + bo->last_sync = radeon_fence_ref(fence);
> >> + }
> >> + bo->last_user = parser->filp;
> >> +
> >> + if (!reloc->written)
> >>   continue;
> >>
> >>   radeon_fence_unref(>written);
> >> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> >> b/drivers/gpu/drm/radeon/radeon_gem.c
> >> index bfd7e1b..c73dbc1 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> >> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> >> @@ -259,6 +259,7 @@ int radeon_gem_create_ioctl(struct drm_device *dev, 
> >> void *data,
> >>   r = radeon_gem_handle_lockup(rdev, r);
> >>   return r;
> >>   }
> >> + gem_to_radeon_bo(gobj)->last_user = filp;
> >>   r = drm_gem_handle_create(filp, gobj, );
> >>   /* drop reference from allocate - handle holds it now */
> >>   drm_gem_object_unreference_unlocked(gobj);
> >> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
> >> b/drivers/gpu/drm/radeon/radeon_ttm.c
> >> index 76be612..a4f964f 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> >> +++ 

[Bug 77181] radeon -- GPU lockup when hibernating or waking up

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=77181

--- Comment #4 from Alex Deucher  ---
(In reply to Rostislav Devyatov from comment #3)
> (In reply to Alex Deucher from comment #2)
> 
> I am sorry, but I don't think I understand you. What do you mean by
> "regression"? What am I trying to bisect?

Did it used to work previously?  If so what kernel?  If you know what kernel
used to work, you can use git to bisect the changes and identify what change
broke it.



[Bug 82586] UBO matrix in std140 struct does not work

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82586

--- Comment #3 from Ian Romanick  ---
(In reply to comment #2)
> "What is the failure mode?"
> Don't follow

You say it doesn't work.  That's pretty vague.  If it fails in some way, what
is the mode of the failure?  (See also
http://en.wikipedia.org/wiki/Failure_causes#Component_failure)

Are the member locations incorrect (query through the API)?  Is the row-major /
column-major orientation of the matrices incorrect (query through the API)?  Is
the generated code reading the wrong locations?  Does it crash in the driver? 
Does your computer catch on fire?

Without any information about how it actually fails, it's pretty much
impossible to fix.

> "What version of Mesa are you on?"
> https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers
> Seem couple days old.

When I check it now, it should have everything.  The other question: does this
work on older Mesa?  Say... a 10.2 stable release?

> "Can you try a build of master?"
> Can't I just wait for new build in PPA?

You can... at some point there will probably be a patch for you to test, and
that won't be in PPA.  You can probably wait to worry about it until such a
patch exists.



[PATCH 3/4] drm/radeon: add DONT_SYNC flag to CS relocs

2014-08-14 Thread Alex Deucher
On Thu, Aug 14, 2014 at 2:43 PM, Jerome Glisse  wrote:
> On Thu, Aug 14, 2014 at 06:12:04PM +0200, Christian König wrote:
>> From: Christian König 
>>
>> This allows userspace to explicitly don't sync submission to
>> different rings as long as all uses stay in the same client.
>>
>> Signed-off-by: Christian König 
>> ---
>>  drivers/gpu/drm/radeon/radeon.h |  3 +++
>>  drivers/gpu/drm/radeon/radeon_cs.c  | 23 ++-
>>  drivers/gpu/drm/radeon/radeon_gem.c |  1 +
>>  drivers/gpu/drm/radeon/radeon_ttm.c |  3 +++
>>  include/uapi/drm/radeon_drm.h   |  2 ++
>>  5 files changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon.h 
>> b/drivers/gpu/drm/radeon/radeon.h
>> index c0f7773..740a0b2 100644
>> --- a/drivers/gpu/drm/radeon/radeon.h
>> +++ b/drivers/gpu/drm/radeon/radeon.h
>> @@ -478,6 +478,8 @@ struct radeon_bo {
>>   u32 tiling_flags;
>>   u32 pitch;
>>   int surface_reg;
>> + struct drm_file *last_user;
>> + struct radeon_fence *last_sync;
>>   struct radeon_fence *written;
>>   /* list of all virtual address to which this bo
>>* is associated to
>> @@ -1018,6 +1020,7 @@ struct radeon_cs_reloc {
>>   unsignedallowed_domains;
>>   uint32_ttiling_flags;
>>   uint32_thandle;
>> + uint32_tflags;
>>   boolwritten;
>>  };
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
>> b/drivers/gpu/drm/radeon/radeon_cs.c
>> index 3aa7e48..11e4789 100644
>> --- a/drivers/gpu/drm/radeon/radeon_cs.c
>> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
>> @@ -166,6 +166,7 @@ static int radeon_cs_parser_relocs(struct 
>> radeon_cs_parser *p)
>>
>>   p->relocs[i].tv.bo = >relocs[i].robj->tbo;
>>   p->relocs[i].handle = r->handle;
>> + p->relocs[i].flags = r->flags;
>>   p->relocs[i].written = !!r->write_domain;
>>
>>   radeon_cs_buckets_add(, >relocs[i].tv.head,
>> @@ -236,6 +237,14 @@ static void radeon_cs_sync_rings(struct 
>> radeon_cs_parser *p)
>>   if (!bo)
>>   continue;
>>
>> + /* always sync to the last operation
>> +the clients doesn't know about */
>> + radeon_semaphore_sync_to(p->ib.presync, bo->last_sync);
>> +
>> + if (bo->last_user == p->filp &&
>> + reloc->flags & RADEON_RELOC_DONT_SYNC)
>> + continue;
>> +
>>   fence = bo->tbo.sync_obj;
>>
>>   if (bo->written && radeon_fence_signaled(bo->written))
>> @@ -421,7 +430,19 @@ static void radeon_cs_parser_fini(struct 
>> radeon_cs_parser *parser, int error, bo
>>   struct radeon_cs_reloc *reloc = >relocs[i];
>>   struct radeon_bo *bo = reloc->robj;
>>
>> - if (!bo || !reloc->written)
>> + if (!bo)
>> + continue;
>> +
>> + /* if the client changes remember that and always 
>> serialize
>> +all operations from different clients */
>> + if (bo->last_user != parser->filp && bo->tbo.sync_obj) 
>> {
>> + struct radeon_fence *fence = bo->tbo.sync_obj;
>> + radeon_fence_unref(>last_sync);
>> + bo->last_sync = radeon_fence_ref(fence);
>> + }
>> + bo->last_user = parser->filp;
>> +
>> + if (!reloc->written)
>>   continue;
>>
>>   radeon_fence_unref(>written);
>> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
>> b/drivers/gpu/drm/radeon/radeon_gem.c
>> index bfd7e1b..c73dbc1 100644
>> --- a/drivers/gpu/drm/radeon/radeon_gem.c
>> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
>> @@ -259,6 +259,7 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
>> *data,
>>   r = radeon_gem_handle_lockup(rdev, r);
>>   return r;
>>   }
>> + gem_to_radeon_bo(gobj)->last_user = filp;
>>   r = drm_gem_handle_create(filp, gobj, );
>>   /* drop reference from allocate - handle holds it now */
>>   drm_gem_object_unreference_unlocked(gobj);
>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
>> b/drivers/gpu/drm/radeon/radeon_ttm.c
>> index 76be612..a4f964f 100644
>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>> @@ -273,6 +273,9 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
>>   );
>>
>>   if (!r) {
>> + rbo->last_user = NULL;
>> + radeon_fence_unref(>last_sync);
>> + rbo->last_sync = radeon_fence_ref(fence);
>>

[PATCH] drm: fix plane rotation when restoring fbdev configuration

2014-08-14 Thread Thomas Wood
Make sure plane rotation is reset correctly when restoring the fbdev
configuration by using drm_mode_plane_set_obj_prop. This calls the
driver's set_property callback and ensures the rotation is actually
changed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82236
Signed-off-by: Thomas Wood 
---
 drivers/gpu/drm/drm_crtc.c  | 25 -
 drivers/gpu/drm/drm_fb_helper.c |  6 +++---
 include/drm/drm_crtc.h  |  3 +++
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index f09b752..95f330a 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -4175,12 +4175,25 @@ static int drm_mode_crtc_set_obj_prop(struct 
drm_mode_object *obj,
return ret;
 }

-static int drm_mode_plane_set_obj_prop(struct drm_mode_object *obj,
- struct drm_property *property,
- uint64_t value)
+/**
+ * drm_mode_plane_set_obj_prop - set the value of a property
+ * @plane: drm plane object to set property value for
+ * @property: property to set
+ * @val: value the property should be set to
+ *
+ * This functions sets a given property on a given plane object. This function
+ * calls the driver's ->set_property callback and changes the software state of
+ * the property if the callback succeeds.
+ *
+ * Returns:
+ * Zero on success, error code on failure.
+ */
+int drm_mode_plane_set_obj_prop(struct drm_plane *plane,
+   struct drm_property *property,
+   uint64_t value)
 {
int ret = -EINVAL;
-   struct drm_plane *plane = obj_to_plane(obj);
+   struct drm_mode_object *obj = >base;

if (plane->funcs->set_property)
ret = plane->funcs->set_property(plane, property, value);
@@ -4189,6 +4202,7 @@ static int drm_mode_plane_set_obj_prop(struct 
drm_mode_object *obj,

return ret;
 }
+EXPORT_SYMBOL(drm_mode_plane_set_obj_prop);

 /**
  * drm_mode_getproperty_ioctl - get the current value of a object's property
@@ -4327,7 +4341,8 @@ int drm_mode_obj_set_property_ioctl(struct drm_device 
*dev, void *data,
ret = drm_mode_crtc_set_obj_prop(arg_obj, property, arg->value);
break;
case DRM_MODE_OBJECT_PLANE:
-   ret = drm_mode_plane_set_obj_prop(arg_obj, property, 
arg->value);
+   ret = drm_mode_plane_set_obj_prop(obj_to_plane(arg_obj),
+ property, arg->value);
break;
}

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 63d7b8e..0c0c39b 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -296,9 +296,9 @@ static bool restore_fbdev_mode(struct drm_fb_helper 
*fb_helper)
drm_plane_force_disable(plane);

if (dev->mode_config.rotation_property) {
-   drm_object_property_set_value(>base,
-   dev->mode_config.rotation_property,
-   BIT(DRM_ROTATE_0));
+   drm_mode_plane_set_obj_prop(plane,
+   
dev->mode_config.rotation_property,
+   BIT(DRM_ROTATE_0));
}
}

diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 0375d75..31344bf 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -1127,6 +1127,9 @@ extern int drm_mode_obj_get_properties_ioctl(struct 
drm_device *dev, void *data,
 struct drm_file *file_priv);
 extern int drm_mode_obj_set_property_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file_priv);
+extern int drm_mode_plane_set_obj_prop(struct drm_plane *plane,
+  struct drm_property *property,
+  uint64_t value);

 extern void drm_fb_get_bpp_depth(uint32_t format, unsigned int *depth,
 int *bpp);
-- 
1.9.3



Fence, timeline and android sync points

2014-08-14 Thread Christian König
On 14.08.2014 at 14:37, Maarten Lankhorst wrote:
> On 14-08-14 at 13:53, Christian König wrote:
>>> But because of driver differences I can't implement it as a straight wait 
>>> queue. Some drivers may not have a reliable interrupt, so they need a 
>>> custom wait function. (qxl)
>>> Some may need to do extra flushing to get fences signaled (vmwgfx), others 
>>> need some locking to protect against gpu lockup races (radeon, i915??).  
>>> And nouveau
>>> doesn't use wait queues, but rolls its own (nouveau).
>> But when all those drivers need a special wait function how can you still 
>> justify the common callback when a fence is signaled?
>>
>> If I understood it right the use case for this was waiting for any fence of 
>> a list of fences from multiple drivers, but if each driver needs special 
>> handling how for it's wait how can that work reliable?
> TTM doesn't rely on the callbacks. It will call .enable_signaling when 
> .is_signaled is NULL, to make sure that fence_is_signaled returns true sooner.
>
> QXL is completely awful, I've seen some patches to add dma-buf support but 
> I'll make sure it never supports importing from/exporting to other devices. 
> This should reduce insanity factor there.
So if I understand you right, QXL doesn't really implement the whole
fence interface? It just implements enough to make TTM happy, and you
forbid DMA-buf support because the rest isn't really working?

Sorry, but that sounds like your fence design just isn't doing the
right thing here.

> If I understand QXL correctly, sometimes fences may never signal at all due 
> to virt-hw bugs.
Radeon has the same problem: with the hardware scheduler each client
essentially has its own fence sequence number range. If you kill a
client, the remaining fences don't necessarily get signaled by the hardware.

>
> nouveau (pre nv84) has no interrupt for completed work, but it has a reliable 
> is_signaled. So .enable_signaling only checks if fence is signaled here.
> A custom waiting function makes sure things work correctly, and also signals 
> all unsignaled fences for that context. I preserved the original wait from 
> before the fence conversion.
> Nouveau keeps a global list of unsignaled fences, so they will all signal 
> eventually.
> I may have to limit importing/exporting dma-buf's to other devices, or add 
> delayed work that periodically checks all contexts for completed fences for 
> this to work cross-device.
>
> nv84+ has a sane interrupt, so I use it. :-)
>
> Radeon with fence conversion has the delayed work for handling lockups that 
> also checks.

Which I still haven't had time to check completely, but it sounds more and
more like those fallbacks for interrupts that never fire should be in the
common fence code and not in each individual driver.

Christian.

>
> ~Maarten



Dual-channel DSI

2014-08-14 Thread Tomi Valkeinen
&dsi1 {
    port {
        dsi1_out_ep: endpoint {
            remote-endpoint = <&lcd_in0>;
            lanes = <0 1 2 3 4 5>;
        };
    };

    display {
        compatible = "sharp,lq101r1sx01";

        secondary = <&panel_secondary>;

        port {
            lcd_in0: endpoint {
                remote-endpoint = <&dsi1_out_ep>;
            };
        };
    };
};

&dsi2 {
    port {
        dsi2_out_ep: endpoint {
            remote-endpoint = <&lcd_in1>;
            lanes = <0 1 2 3 4 5>;
        };
    };

    panel_secondary: display {
        compatible = "sharp,lq101r1sx01-secondary";

        port {
            lcd_in1: endpoint {
                remote-endpoint = <&dsi2_out_ep>;
            };
        };
    };
};

I guess the above was already more or less presented in the thread, but I
didn't see it written out.

So there would be two devices, a master and a slave, and on the driver side
the slave would not do anything by itself.

 Tomi
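A hedged sketch of how the master panel driver might resolve that "secondary"
phandle; of_parse_phandle()/of_node_put() are the real OF helpers, the function
name is made up:

#include <linux/of.h>

static struct device_node *sharp_panel_get_secondary(struct device_node *master)
{
    /* Returns the slave's node, or NULL for a single-link setup.
     * The caller must of_node_put() the node once it is done with it,
     * typically after matching it against the registered slave device. */
    return of_parse_phandle(master, "secondary", 0);
}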




Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 09:40:08PM +0200, Maarten Lankhorst wrote:
> 
> 
> On 14-08-14 21:15, Jerome Glisse wrote:
> > On Thu, Aug 14, 2014 at 08:47:16PM +0200, Daniel Vetter wrote:
> >> On Thu, Aug 14, 2014 at 8:18 PM, Jerome Glisse  
> >> wrote:
> >>> Sucks because you can not do weird synchronization like one i depicted in 
> >>> another
> >>> mail in this thread and for as long as cmdbuf_ioctl do not give you 
> >>> fence|syncpt
> >>> you can not such thing cleanly in non hackish way.
> >>
> >> Actually i915 can soon will do that that.
> > 
> > So you will return fence|syncpoint with each cmdbuf_ioctl ?
> 
> It might, soon. There have been patches on the ml about it. It can create a
> userspace android fence backed by a kernel dma fence. And it will be created
> like any other userspace android fence. ;-)

Android fences are not, in my mind, a nice thing :)

> Yet even with that, it will continue to support the implicit sync model
> since they're not mutually exclusive.

Again I have failed to express myself. I have repeatedly tried to say that
what I proposed is not mutually exclusive.

> I also fail to understand why you think a fence should be associated with
> a buffer object. It's the other way around. TTM currently requires buffer
> objects to be fenced off to protect against eviction.

Again I failed to explain myself properly. I said no fence should be
associated with a buffer or a dma-buf, whether shared or not. The only thing
you really need for a buffer is a seqno, and that is only useful for
binding/unbinding the object from the GART/GPU address space/... So no, I am
not saying a fence should be associated with a buffer. I am saying a fence
should be associated with each cmdbuf, and there should be no link whatsoever
between fence and buffer object, i.e. you do not need to store a pointer to a
fence inside the dma-buf or any buffer object structure.

I am well aware of how TTM works, and I am saying it can be replaced with a
simpler seqno.

> 
> reservation_object is used for this, and by sharing the reservation_object
> pointer with a dma-buf you get cross-device synchronization.

My point is that dma-buf should not have a reservation_object pointer; it is
useless and counterproductive and only adds complexity. You do not care about
the buffer, you care about synchronizing cmdbufs/hw blocks with each other; a
buffer is just what those blocks consume. In other words, reservation_object
and per-buffer fences are useless and only complicate things for no added
value; on the contrary, they are not as flexible as fences associated with
cmdbufs.

> 
> It has a bunch of helpers to make common operations easy, see
include/linux/reservation.h and drivers/dma-buf/reservation.c
> It also allows multiple readers simultaneously across any number of devices.
> I intend to use this in nouveau.

So does what I am talking about. The reservation stuff is more a limiting
factor than a helper in my eyes.

> 
> But there's no requirement to use reservation_object's apart from that's how
> ttm currently works, and for implicit syncing in dma-buf. If you don't need
> either go crazy with fence and write your own mechanism on top of fence.
> Although with android sync and TTM I think I handled all common usecases.

My point is that implicit syncing can be implemented in a saner way, one that
would also allow explicit syncing to be implemented with no restrictions and
with room for extra optimisation. By forcing the CPU to be involved in hw
block synchronization you are limiting what can be done, and clever hardware
is forced into suboptimal solutions.

> ~Maarten


[Bug 77181] radeon -- GPU lockup when hibernating or waking up

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=77181

--- Comment #3 from Rostislav Devyatov  ---
(In reply to Alex Deucher from comment #2)

I am sorry, but I don't think I understand you. What do you mean by
"regression"? What am I trying to bisect?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


GEM memory DOS (WAS Re: [PATCH 3/3] drm/ttm: under memory pressure minimize the size of memory pool)

2014-08-14 Thread Jesse Barnes
On Wed, 13 Aug 2014 17:13:56 +0200
Thomas Hellstrom  wrote:

> On 08/13/2014 03:01 PM, Daniel Vetter wrote:
> > On Wed, Aug 13, 2014 at 02:35:52PM +0200, Thomas Hellstrom wrote:
> >> On 08/13/2014 12:42 PM, Daniel Vetter wrote:
> >>> On Wed, Aug 13, 2014 at 11:06:25AM +0200, Thomas Hellstrom wrote:
>  On 08/13/2014 05:52 AM, Jérôme Glisse wrote:
> > From: Jérôme Glisse 
> >
> > When experiencing memory pressure we want to minimize pool size so that
> > memory we just shrinked is not added back again just as the next thing.
> >
> > This will divide by 2 the maximum pool size for each device each time
> > the pool have to shrink. The limit is bumped again is next allocation
> > happen after one second since the last shrink. The one second delay is
> > obviously an arbitrary choice.
>  Jérôme,
> 
>  I don't like this patch. It adds extra complexity and its usefulness is
>  highly questionable.
>  There are a number of caches in the system, and if all of them added
>  some sort of voluntary shrink heuristics like this, we'd end up with
>  impossible-to-debug unpredictable performance issues.
> 
>  We should let the memory subsystem decide when to reclaim pages from
>  caches and what caches to reclaim them from.
> >>> Yeah, artificially limiting your cache from growing when your shrinker
> >>> gets called will just break the equal-memory pressure the core mm uses to
> >>> rebalance between all caches when workload changes. In i915 we let
> >>> everything grow without artificial bounds and only rely upon the shrinker
> >>> callbacks to ensure we don't consume more than our fair share of available
> >>> memory overall.
> >>> -Daniel
> >> Now when you bring i915 memory usage up, Daniel,
> >> I can't refrain from bringing up the old user-space unreclaimable kernel
> >> memory issue, for which gem open is a good example ;) Each time
> >> user-space opens a gem handle, some un-reclaimable kernel memory is
> >> allocated, for which there is no accounting, so theoretically I think a
> >> user can bring a system to unusability this way.
> >>
> >> Typically there are various limits on unreclaimable objects like this,
> >> like open file descriptors, and IIRC the kernel even has an internal
> >> limit on the number of struct files you initialize, based on the
> >> available system memory, so dma-buf / prime should already have some
> >> sort of protection.
> > Oh yeah, we have zero cgroups limits or similar stuff for gem allocations,
> > so there's not really a way to isolate gpu memory usage in a sane way for
> > specific processes. But there's also zero limits on actual gpu usage
> > itself (timeslices or whatever) so I guess no one asked for this yet.
> 
> In its simplest form (like in TTM if correctly implemented by drivers)
> this type of accounting stops non-privileged malicious GPU-users from
> exhausting all system physical memory causing grief for other kernel
> systems but not from causing grief for other GPU users. I think that's
> the minimum level that's intended also for example also for the struct
> file accounting.
> 
> > My comment really was about balancing mm users under the assumption that
> > they're all unlimited.
> 
> Yeah, sorry for stealing the thread. I usually bring this up now and
> again but nowadays with an exponential backoff.

Yeah I agree we're missing some good limits stuff in i915 and DRM in
general probably.  Chris started looking at this awhile back, but I
haven't seen anything recently.  Tying into the ulimits/rlimits might
make sense, and at the very least we need to account for things
properly so we can add new limits where needed.

-- 
Jesse Barnes, Intel Open Source Technology Center


Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 08:47:16PM +0200, Daniel Vetter wrote:
> On Thu, Aug 14, 2014 at 8:18 PM, Jerome Glisse  wrote:
> > Sucks because you can not do weird synchronization like one i depicted in 
> > another
> > mail in this thread and for as long as cmdbuf_ioctl do not give you 
> > fence|syncpt
> > you can not such thing cleanly in non hackish way.
> 
> Actually i915 can soon will do that that.

So you will return fence|syncpoint with each cmdbuf_ioctl ?

> 
> > Sucks because you have a fence object per buffer object and thus overhead 
> > grow
> > with the number of objects. Not even mentioning fence lifetime issue.
> >
> > Sucks because sub-buffer allocation is just one of many tricks that can not 
> > be
> > achieved properly and cleanly with implicit sync.
> >
> > ...
> 
> Well I heard all those reasons and I'm well of aware of them. The
> problem is that with current hardware the kernel needs to know for
> each buffer how long it needs to be kept around since hw just can't do
> page faulting. Yeah you can pin them but for an uma design that
> doesn't go down well with folks.

I am not thinking of fancy hw here; on the contrary, I thought about all of
this with the crappiest hw I could think of in mind.

Yes, you can get rid of fences and still avoid pinning memory with current hw.
What matters for unpinning is knowing that all hw blocks are done using the
memory. This is easily achievable with your beloved seqno. Have one seqno per
driver (one driver can have different blocks: 3D, video decoding, CRTC, ...);
each time a buffer is used as part of a command on one block, increment the
common seqno and tag the buffer with that number. Have each hw block write the
latest seqno it has completed to a per-block location. Now, to determine
whether a buffer is done, compare the buffer's seqno against the completed
seqnos of the blocks.

Cost: 1 uint32 per buffer, and a simple comparison, with no locking, to check
the status of a buffer.
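As a concrete illustration, a minimal sketch of that bookkeeping, assuming one
shared counter per driver and a per-block completion slot written by the
hardware (or its irq handler); all names are illustrative. The idle check
compares against the lowest per-block completion value, which is the
conservative way to perform the comparison described above:

#include <linux/types.h>

#define FOO_NUM_BLOCKS 3        /* e.g. 3D, video decode, CRTC */

struct foo_seqno_state {
    u32 counter;                   /* bumped for every command on any block */
    u32 completed[FOO_NUM_BLOCKS]; /* last retired seqno, per hw block */
};

/* Tag a buffer whenever a command on any block uses it. */
static void foo_buffer_mark_busy(struct foo_seqno_state *s, u32 *buf_seqno)
{
    *buf_seqno = ++s->counter;
}

/* Lock-free check: once every block has retired past the buffer's tag,
 * nothing can still be using it, so it is safe to unbind/evict. */
static bool foo_buffer_is_idle(const struct foo_seqno_state *s, u32 buf_seqno)
{
    u32 oldest = s->completed[0];
    int i;

    for (i = 1; i < FOO_NUM_BLOCKS; i++)
        if ((s32)(s->completed[i] - oldest) < 0)
            oldest = s->completed[i];

    return (s32)(oldest - buf_seqno) >= 0;   /* wrap-safe comparison */
}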

Yes, preemption and GPU scheduling would break such a scheme, but my point is
that when you have such a GPU you want to implement a proper solution, which
of course requires quite some work across the stack. So the past can live on,
but the future needs to get its act together.

> The other problem is that the Linux Desktop I don't seem to care about
> any more kinda relies on implicit syncing right now, so we better keep
> that working fairly well. Of course we could dream up a shiny new
> world where all of the Linux desktop does explicit syncing, but that
> world doesn't exist right now. I mean really if you want to right away
> throw implicit syncing overboard that doesn't bode well for the
> current linux desktop.

Again I fail at expressing myself. I am not saying throw things overboard;
I am well aware of the current reliance on implicit fencing. I am saying that
if fence wants to be this new thing that allows explicit fencing in the
future, then it had better be done correctly in the first place.

> So I don't understand all the implicit syncing bashing. It's here, and
> it'll stay for a while longer whether you like it or not ...

I am saying this is where we are and it sucks for a number of reasons. Then,
looking at fence, I am saying it tries to go in the right direction but does
crazy things that I am convinced we will regret. In other words, if we ever
get to explicit fencing, we had better start on the right path with the right
tools. Moreover, I am saying that this can be done without breaking the
implicit sync we have today.

> Of course that doesn't mean we (intel here) won't support explicit
> syncing too, and I really don't see a conflict here when mixing and
> matching these two approaches.

Again I fail to express myself. I am not saying there is a conflict. I am
saying it is better to take a path which allows going all the way with
explicit fencing while still allowing a less optimal use for an implicit sync
model.

My point is that the fence code proposed here keeps the worst thing about the
implicit fencing we have today. This can be done differently, in what I
believe to be a better way. And that different approach still allows implicit
sync for existing userspace.

Cheers,
Jérôme

> -Daniel
> 
> > Having code that work around or alieviate some of this, is in no way a 
> > testimony
> > that it's the best solution. I do believe explicit sync to be superior in 
> > use
> > case it allows way more freedom while the only drawback i see is that you 
> > have to
> > put some trust into userspace.
> >
> > So yes implicit sync sucks and it does map to i915 reality as well.
> 
> 
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


[PATCH 3/4] drm/radeon: add DONT_SYNC flag to CS relocs

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 06:12:04PM +0200, Christian König wrote:
> From: Christian König 
> 
> This allows userspace to explicitly don't sync submission to
> different rings as long as all uses stay in the same client.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/radeon/radeon.h |  3 +++
>  drivers/gpu/drm/radeon/radeon_cs.c  | 23 ++-
>  drivers/gpu/drm/radeon/radeon_gem.c |  1 +
>  drivers/gpu/drm/radeon/radeon_ttm.c |  3 +++
>  include/uapi/drm/radeon_drm.h   |  2 ++
>  5 files changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index c0f7773..740a0b2 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -478,6 +478,8 @@ struct radeon_bo {
>   u32 tiling_flags;
>   u32 pitch;
>   int surface_reg;
> + struct drm_file *last_user;
> + struct radeon_fence *last_sync;
>   struct radeon_fence *written;
>   /* list of all virtual address to which this bo
>* is associated to
> @@ -1018,6 +1020,7 @@ struct radeon_cs_reloc {
>   unsignedallowed_domains;
>   uint32_ttiling_flags;
>   uint32_thandle;
> + uint32_tflags;
>   boolwritten;
>  };
>  
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 3aa7e48..11e4789 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -166,6 +166,7 @@ static int radeon_cs_parser_relocs(struct 
> radeon_cs_parser *p)
>  
>   p->relocs[i].tv.bo = >relocs[i].robj->tbo;
>   p->relocs[i].handle = r->handle;
> + p->relocs[i].flags = r->flags;
>   p->relocs[i].written = !!r->write_domain;
>  
>   radeon_cs_buckets_add(, >relocs[i].tv.head,
> @@ -236,6 +237,14 @@ static void radeon_cs_sync_rings(struct radeon_cs_parser 
> *p)
>   if (!bo)
>   continue;
>  
> + /* always sync to the last operation
> +the clients doesn't know about */
> + radeon_semaphore_sync_to(p->ib.presync, bo->last_sync);
> +
> + if (bo->last_user == p->filp &&
> + reloc->flags & RADEON_RELOC_DONT_SYNC)
> + continue;
> +
>   fence = bo->tbo.sync_obj;
>  
>   if (bo->written && radeon_fence_signaled(bo->written))
> @@ -421,7 +430,19 @@ static void radeon_cs_parser_fini(struct 
> radeon_cs_parser *parser, int error, bo
>   struct radeon_cs_reloc *reloc = >relocs[i];
>   struct radeon_bo *bo = reloc->robj;
>  
> - if (!bo || !reloc->written)
> + if (!bo)
> + continue;
> +
> + /* if the client changes remember that and always 
> serialize
> +all operations from different clients */
> + if (bo->last_user != parser->filp && bo->tbo.sync_obj) {
> + struct radeon_fence *fence = bo->tbo.sync_obj;
> + radeon_fence_unref(>last_sync);
> + bo->last_sync = radeon_fence_ref(fence);
> + }
> + bo->last_user = parser->filp;
> +
> + if (!reloc->written)
>   continue;
>  
>   radeon_fence_unref(>written);
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> b/drivers/gpu/drm/radeon/radeon_gem.c
> index bfd7e1b..c73dbc1 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -259,6 +259,7 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void 
> *data,
>   r = radeon_gem_handle_lockup(rdev, r);
>   return r;
>   }
> + gem_to_radeon_bo(gobj)->last_user = filp;
>   r = drm_gem_handle_create(filp, gobj, );
>   /* drop reference from allocate - handle holds it now */
>   drm_gem_object_unreference_unlocked(gobj);
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
> b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 76be612..a4f964f 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -273,6 +273,9 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
>   );
>  
>   if (!r) {
> + rbo->last_user = NULL;
> + radeon_fence_unref(>last_sync);
> + rbo->last_sync = radeon_fence_ref(fence);
>   radeon_fence_unref(>written);
>   rbo->written = radeon_fence_ref(fence);
>   }
> diff --git a/include/uapi/drm/radeon_drm.h 

Fence, timeline and android sync points

2014-08-14 Thread Maarten Lankhorst
On 14-08-14 at 13:53, Christian König wrote:
>> But because of driver differences I can't implement it as a straight wait 
>> queue. Some drivers may not have a reliable interrupt, so they need a custom 
>> wait function. (qxl)
>> Some may need to do extra flushing to get fences signaled (vmwgfx), others 
>> need some locking to protect against gpu lockup races (radeon, i915??).  And 
>> nouveau
>> doesn't use wait queues, but rolls its own (nouveau).
> But when all those drivers need a special wait function how can you still 
> justify the common callback when a fence is signaled?
>
> If I understood it right the use case for this was waiting for any fence of a 
> list of fences from multiple drivers, but if each driver needs special 
> handling how for it's wait how can that work reliable?
TTM doesn't rely on the callbacks. It will call .enable_signaling when 
.is_signaled is NULL, to make sure that fence_is_signaled returns true sooner.

QXL is completely awful, I've seen some patches to add dma-buf support but I'll 
make sure it never supports importing from/exporting to other devices. This 
should reduce insanity factor there.
If I understand QXL correctly, sometimes fences may never signal at all due to 
virt-hw bugs.

nouveau (pre nv84) has no interrupt for completed work, but it has a reliable 
is_signaled. So .enable_signaling only checks if fence is signaled here.
A custom waiting function makes sure things work correctly, and also signals 
all unsignaled fences for that context. I preserved the original wait from 
before the fence conversion.
Nouveau keeps a global list of unsignaled fences, so they will all signal 
eventually.
I may have to limit importing/exporting dma-buf's to other devices, or add 
delayed work that periodically checks all contexts for completed fences for 
this to work cross-device.

nv84+ has a sane interrupt, so I use it. :-)

Radeon with fence conversion has the delayed work for handling lockups that 
also checks.

~Maarten
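To make the driver differences above more concrete, a minimal sketch of a
driver-specific fence_ops, assuming the include/linux/fence.h API of this
series (fence_init(), fence_context_alloc(), fence_default_wait()). The "foo"
driver, its timeline name and the stubbed hardware checks are placeholders,
not code from any of the drivers mentioned:

#include <linux/fence.h>
#include <linux/spinlock.h>

struct foo_fence {
    struct fence base;
    /* whatever is needed to poll/enable the hw seqno irq lives here */
};

static const char *foo_get_driver_name(struct fence *f)
{
    return "foo";
}

static const char *foo_get_timeline_name(struct fence *f)
{
    return "foo-ring0";
}

static bool foo_enable_signaling(struct fence *f)
{
    /* Enable the completion interrupt here; return false if the fence
     * turned out to be signaled already. */
    return true;
}

/* Optional; when this is NULL, callers have to rely on signaling, which is
 * why TTM calls .enable_signaling in that case, as described above. */
static bool foo_signaled(struct fence *f)
{
    return false;   /* poll the CPU-visible seqno here */
}

static const struct fence_ops foo_fence_ops = {
    .get_driver_name   = foo_get_driver_name,
    .get_timeline_name = foo_get_timeline_name,
    .enable_signaling  = foo_enable_signaling,
    .signaled          = foo_signaled,
    .wait              = fence_default_wait, /* or a custom wait (qxl, nouveau) */
};

A driver would allocate a context once with fence_context_alloc(1) and then
call fence_init(&f->base, &foo_fence_ops, &lock, context, seqno) for each new
fence it emits.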


drm/radeon: giving userspace control over hardware engine sync

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 06:12:01PM +0200, Christian König wrote:
> Hello everyone,
> 
> this set of patch adds the ability for userspace to better control when
> synchronization between the different hardware engines on radeon happens.
> Previously every access to a buffer object was serialized and concurrent
> execution could only happen if command submissions didn't shared any
> buffer handle.

I must say I do not like the whole direction of abusing buffer objects by
exposing them as fences to userspace. Allowing zero-sized BOs was the first
step in this bad direction.

I do understand that it's a lot easier to implement such things in isolation,
and that trying to design a common kernel subsystem and get it accepted is way
more painful and unpredictable.

> Patch #1 in this series adds the ability to not only sync before the IB
> execution, but also after it before the fence value is written. This is
> a workaround because TTM currently can't handle multiple fences attached
> to a single buffer object.

We do not want multiple fences to be associated with a buffer object, ever.
In fact, I believe getting rid of fences associated with buffer objects is
what would make sense.

Others comment as a per patch reply.

Cheers,
Jérôme

> Patch #2 allows concurrent execution of command submission if there is
> only read only access to the same buffer.
> 
> Patch #3 adds a DONT_SYNC flag to each buffer object send to the kernel
> which allows userspace to explicitly note that concurrent access to a
> buffer is ok. The usage of this flag is restricted in that way that each
> operation the client doesn't knows about (eviction, access by other clients
> etc...) is still implicitly synced to.
> 
> Patch #4 adds a DONT_FENCE flag that tells the kernel to sync all operations
> to a buffer handle, but don't fence that handle with the current command
> submission. This is necessarily because we currently abuses zero sized buffer
> objects as fence handles.
> 
> Please review and comment,
> Christian.
> 
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
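For reference, a hedged sketch of the userspace side of the DONT_SYNC flag
described for patch #3, showing only how one reloc entry of the CS ioctl's
reloc chunk might be filled; the function and bo_handle are placeholders and
the surrounding chunk setup is omitted:

#include "radeon_drm.h"   /* the UAPI header carrying the new flag */

static void example_fill_reloc(struct drm_radeon_cs_reloc *reloc,
                               uint32_t bo_handle)
{
    reloc->handle       = bo_handle;               /* GEM handle of the buffer */
    reloc->read_domains = RADEON_GEM_DOMAIN_VRAM;
    reloc->write_domain = 0;
    reloc->flags        = RADEON_RELOC_DONT_SYNC;  /* skip intra-client ring sync */
}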


Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 05:58:48PM +0200, Daniel Vetter wrote:
> On Thu, Aug 14, 2014 at 10:12:06AM -0400, Jerome Glisse wrote:
> > On Thu, Aug 14, 2014 at 09:16:02AM -0400, Rob Clark wrote:
> > > On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  
> > > wrote:
> > > > So this is fundamentaly different, fence as they are now allow random 
> > > > driver
> > > > callback and this is bound to get ugly this is bound to lead to one 
> > > > driver
> > > > doing something that seems innocuous but turn out to break heavoc when 
> > > > call
> > > > from some other driver function.
> > > 
> > > 
> > > tbh, that seems solvable by some strict rules about what you can do in
> > > the callback.. ie. don't do anything you couldn't do in atomic, and
> > > don't signal another fence.. off the top of my head that seems
> > > sufficient.
> > > 
> > > If the driver getting the callback needs to do more, then it can
> > > always schedule a worker..
> > > 
> > > But I could certainly see the case where the driver waiting on fence
> > > sets everything up before installing the cb and then just needs to
> > > write one or a couple regs from the cb.
> > 
> > Yes sane code will do sane things, sadly i fear we can not enforce sane
> > code everywhere especialy with out of tree driver and i would rather
> > force there hand to only allow sane implementation. Providing call back
> > api obviously allows them to do crazy stuff.
> 
> Well then don't support out of tree drivers. Fairly easy problem really,
> and last time I checked "out of tree drivers suck" isn't a valid
> objections for upstream code ... It's kinda assumed that they all do, it's
> why we have staging after all.

As usual i fail at expressing my point. I am not saying do not merge this
because of out-of-tree drivers; I am saying that while we are designing an
API, let's make it sane and have it enforce sanity on anything that lives
outside our realm.

And I am not even thinking outside the kernel tree: someone might come along,
start using fence, see the callback stuff, and start doing crazy things with
it, all inside some different, obscure kernel subsystem. Then someone in
graphics sees that and uses it as justification to do crazy things too,
because hey, if he is doing it, why can't I?

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


[PATCH] drm/radeon: Remove duplicate include from Makefile

2014-08-14 Thread Christian König
On 14.08.2014 at 14:17, Andreas Ruprecht wrote:
> In the Makefile, radeon_uvd.o is added to radeon-y twice.
>
> As it belongs to the UVD block marked with a comment, the other include
> from the block of includes labelled as "KMS driver" is deleted.
>
> Signed-off-by: Andreas Ruprecht 

Nice catch, patch is Reviewed-by: Christian König 

> ---
>   drivers/gpu/drm/radeon/Makefile | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
> index 0013ad0..f77b713 100644
> --- a/drivers/gpu/drm/radeon/Makefile
> +++ b/drivers/gpu/drm/radeon/Makefile
> @@ -76,7 +76,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
>   evergreen.o evergreen_cs.o evergreen_blit_shaders.o \
>   evergreen_hdmi.o radeon_trace_points.o ni.o cayman_blit_shaders.o \
>   atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o si.o \
> - si_blit_shaders.o radeon_prime.o radeon_uvd.o cik.o cik_blit_shaders.o \
> + si_blit_shaders.o radeon_prime.o cik.o cik_blit_shaders.o \
>   r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
>   rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o 
> \
>   trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \



Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 05:55:51PM +0200, Daniel Vetter wrote:
> On Thu, Aug 14, 2014 at 10:23:30AM -0400, Jerome Glisse wrote:
> > On Thu, Aug 14, 2014 at 11:08:34AM +0200, Daniel Vetter wrote:
> > > On Wed, Aug 13, 2014 at 01:07:20PM -0400, Jerome Glisse wrote:
> > > > Let me make this crystal clear this must be a valid kernel page that 
> > > > have a
> > > > valid kernel mapping for the lifetime of the device. Hence there is no 
> > > > access
> > > > to mmio space or anything, just a regular kernel page. If can not rely 
> > > > on that
> > > > this is a sad world.
> > > > 
> > > > That being said, yes i am aware that some device incapacity to write to 
> > > > such
> > > > a page. For those dumb hardware what you need to do is have the irq 
> > > > handler
> > > > write to this page on behalf of the hardware. But i would like to know 
> > > > any
> > > > hardware than can not write a single dword from its ring buffer.
> > > > 
> > > > The only tricky part in this, is when device is unloaded and driver is 
> > > > removing
> > > > itself, it obviously need to synchronize itself with anyone possibly 
> > > > waiting on
> > > > it and possibly reading. But truly this is not that hard to solve.
> > > > 
> > > > So tell me once the above is clear what kind of scary thing can happen 
> > > > when cpu
> > > > or a device _read_ a kernel page ?
> > > 
> > > It's not reading it, it's making sense of what you read. In i915 we had
> > > exactly the (timeline, seqno) value pair design for fences for a long
> > > time, and we're switching away from it since it stops working when you
> > > have preemption and scheduler. Or at least it gets really interesting to
> > > interpret the data written into the page.
> > > 
> > > So I don't want to expose that to other drivers if we decided that
> > > exposing this internally is a stupid idea.
> > 
> > I am well aware of that, but context scheduling really is what the timeline 
> > i
> > talk about is. The user timeline should be consider like a single cpu 
> > thread on
> > to which operation involving different hw are scheduled. You can have the hw
> > switch from one timeline to another and a seqno is only meaningful per 
> > timeline.
> > 
> > The whole preemption and scheduling is something bound to happen on gpu and 
> > we
> > will want to integrate with core scheduler to manage time slice allocated to
> > process, but the we need the concept of thread in which operation on same hw
> > block are processed in a linear fashion but still allowing concurrency with
> > other hw block.
> 
> Well yeah, it's just that a gpu thread doesn't have a lot to do with a cpu
> process, at least in the current drm world. It would be nice make that a
> bit more cross-driver for management, but we didn't even manage to pull
> that of for gpu memory. So I don't think it'll happen anytime soonish.
> 
> > > > > > > So from that pov (presuming I didn't miss anything) your proposal 
> > > > > > > is
> > > > > > > identical to what we have, minor some different color choices 
> > > > > > > (like where
> > > > > > > to place the callback queue).
> > > > > > 
> > > > > > No callback is the mantra here, and instead of bolting free living 
> > > > > > fence
> > > > > > to buffer object, they are associated with timeline which means you 
> > > > > > do not
> > > > > > need to go over all buffer object to know what you need to wait for.
> > > > > 
> > > > > Ok, then I guess I didn't understand that part of your the proposal. 
> > > > > Can
> > > > > you please elaborate a bit more how you want to synchronize mulitple
> > > > > drivers accessing a dma-buf object and what piece of state we need to
> > > > > associate to the dma-buf to make this happen?
> > > > 
> > > > Beauty of it you associate ziltch to the buffer. So for existing cs 
> > > > ioctl where
> > > > the implicit synchronization is the rule it enforce mandatory 
> > > > synchronization
> > > > accross all hw timeline on which a buffer shows up :
> > > >   for_each_buffer_in_cmdbuffer(buffer, cmdbuf) {
> > > > if (!cmdbuf_write_to_buffer(buffer, cmdbuf))
> > > >   continue;
> > > > for_each_process_sharing_buffer(buffer, process) {
> > > >   schedule_fence(process->implicit_timeline, cmdbuf->fence)
> > > > }
> > > >   }
> > > > 
> > > > Next time another process use current ioctl with implicit sync it will 
> > > > synch with
> > > > the last fence for any shared resource. This might sounds bad but 
> > > > truely as it is
> > > > right now this is already how it happens (at least for radeon).
> > > 
> > > Well i915 is a lot better than that. And I'm not going to implement some
> > > special-case for dma-buf shared buffers just because radeon sucks and
> > > wants to enforce that suckage on everyone else.
> > 
> > I guess i am having hard time to express myself, what i am saying here is 
> > that
> > implicit synchronization sucks because it has to assume the worst case and 
> > this
> > is what current code does, 

[PATCH] drm/radeon: Remove duplicate include from Makefile

2014-08-14 Thread Andreas Ruprecht
In the Makefile, radeon_uvd.o is added to radeon-y twice.

As it belongs to the UVD block marked with a comment, the other include
from the block of includes labelled as "KMS driver" is deleted.

Signed-off-by: Andreas Ruprecht 
---
 drivers/gpu/drm/radeon/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 0013ad0..f77b713 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -76,7 +76,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
evergreen.o evergreen_cs.o evergreen_blit_shaders.o \
evergreen_hdmi.o radeon_trace_points.o ni.o cayman_blit_shaders.o \
atombios_encoders.o radeon_semaphore.o radeon_sa.o atombios_i2c.o si.o \
-   si_blit_shaders.o radeon_prime.o radeon_uvd.o cik.o cik_blit_shaders.o \
+   si_blit_shaders.o radeon_prime.o cik.o cik_blit_shaders.o \
r600_dpm.o rs780_dpm.o rv6xx_dpm.o rv770_dpm.o rv730_dpm.o rv740_dpm.o \
rv770_smc.o cypress_dpm.o btc_dpm.o sumo_dpm.o sumo_smc.o trinity_dpm.o 
\
trinity_smc.o ni_dpm.o si_smc.o si_dpm.o kv_smc.o kv_dpm.o ci_smc.o \
-- 
1.9.1



Fence, timeline and android sync points

2014-08-14 Thread Christian König
> But because of driver differences I can't implement it as a straight wait 
> queue. Some drivers may not have a reliable interrupt, so they need a custom 
> wait function. (qxl)
> Some may need to do extra flushing to get fences signaled (vmwgfx), others 
> need some locking to protect against gpu lockup races (radeon, i915??).  And 
> nouveau
> doesn't use wait queues, but rolls its own (nouveau).
But when all those drivers need a special wait function, how can you
still justify the common callback when a fence is signaled?

If I understood it right, the use case for this was waiting for any fence of
a list of fences from multiple drivers, but if each driver needs special
handling for its wait, how can that work reliably?

Christian.

On 14.08.2014 at 11:15, Maarten Lankhorst wrote:
> On 13-08-14 at 19:07, Jerome Glisse wrote:
>> On Wed, Aug 13, 2014 at 05:54:20PM +0200, Daniel Vetter wrote:
>>> On Wed, Aug 13, 2014 at 09:36:04AM -0400, Jerome Glisse wrote:
 On Wed, Aug 13, 2014 at 10:28:22AM +0200, Daniel Vetter wrote:
> On Tue, Aug 12, 2014 at 06:13:41PM -0400, Jerome Glisse wrote:
>> Hi,
>>
>> So i want over the whole fence and sync point stuff as it's becoming a 
>> pressing
>> issue. I think we first need to agree on what is the problem we want to 
>> solve
>> and what would be the requirements to solve it.
>>
>> Problem :
>>Explicit synchronization btw different hardware block over a buffer 
>> object.
>>
>> Requirements :
>>Share common infrastructure.
>>Allow optimal hardware command stream scheduling accross hardware 
>> block.
>>Allow android sync point to be implemented on top of it.
>>Handle/acknowledge exception (like good old gpu lockup).
>>Minimize driver changes.
>>
>> Glossary :
>>hardware timeline: timeline bound to a specific hardware block.
>>pipeline timeline: timeline bound to a userspace rendering pipeline, 
>> each
>>   point on that timeline can be a composite of 
>> several
>>   different hardware pipeline point.
>>pipeline: abstract object representing userspace application graphic 
>> pipeline
>>  of each of the application graphic operations.
>>fence: specific point in a timeline where synchronization needs to 
>> happen.
>>
>>
>> So now, current include/linux/fence.h implementation is i believe 
>> missing the
>> objective by confusing hardware and pipeline timeline and by bolting 
>> fence to
>> buffer object while what is really needed is true and proper timeline 
>> for both
>> hardware and pipeline. But before going further down that road let me 
>> look at
>> things and explain how i see them.
> fences can be used free-standing and no one forces you to integrate them
> with buffers. We actually plan to go this way with the intel svm stuff.
> Ofc for dma-buf the plan is to synchronize using such fences, but that's
> somewhat orthogonal I think. At least you only talk about fences and
> timeline and not dma-buf here.
>   
>> Current ttm fence have one and a sole purpose, allow synchronization for 
>> buffer
>> object move even thought some driver like radeon slightly abuse it and 
>> use them
>> for things like lockup detection.
>>
>> The new fence want to expose an api that would allow some implementation 
>> of a
>> timeline. For that it introduces callback and some hard requirement on 
>> what the
>> driver have to expose :
>>enable_signaling
>>[signaled]
>>wait
>>
>> Each of those have to do work inside the driver to which the fence 
>> belongs and
>> each of those can be call more or less from unexpected (with restriction 
>> like
>> outside irq) context. So we end up with thing like :
>>
>>   Process 1  Process 2   Process 3
>>   I_A_schedule(fence0)
>>  CI_A_F_B_signaled(fence0)
>>  I_A_signal(fence0)
>>  
>> CI_B_F_A_callback(fence0)
>>  CI_A_F_B_wait(fence0)
>> Lexique:
>> I_x  in driver x (I_A == in driver A)
>> CI_x_F_y call in driver X from driver Y (CI_A_F_B call in driver A from 
>> driver B)
>>
>> So this is an happy mess everyone call everyone and this bound to get 
>> messy.
>> Yes i know there is all kind of requirement on what happen once a fence 
>> is
>> signaled. But those requirement only looks like they are trying to atone 
>> any
>> mess that can happen from the whole callback dance.
>>
>> While i was too seduced by the whole callback idea long time ago, i 
>> think it is
>> a highly 

[Bug 77181] radeon -- GPU lockup when hibernating or waking up

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=77181

Alex Deucher  changed:

   What|Removed |Added

 CC||alexdeucher at gmail.com

--- Comment #2 from Alex Deucher  ---
Is this a regression?  If so, can you bisect?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 82586] UBO matrix in std140 struct does not work

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82586

--- Comment #2 from pavol at klacansky.com ---
"What is the failure mode?"
Don't follow

"What version of Mesa are you on?"
https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers
Seem couple days old.

"Can you try a build of master?"
Can't I just wait for new build in PPA?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 77181] radeon -- GPU lockup when hibernating or waking up

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=77181

Rostislav Devyatov  changed:

   What|Removed |Added

 CC||deviatov at gmail.com

--- Comment #1 from Rostislav Devyatov  ---
Same problem here (mobility radeon 4550, kernel 3.14.14-gentoo)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 82533] Line of tearing while playing games

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82533

--- Comment #11 from dawide2211 at gmail.com ---
Compton fixed the tearing issue, thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 78221] 3.16 RC1: AMD R9 270 GPU locks up on some heavy 2D activity - GPU VM fault occurs. (possibly DMA copying issue strikes back?)

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=78221

Tomasz Mloduchowski  changed:

   What|Removed |Added

 CC||q at qdot.me

--- Comment #17 from Tomasz Mloduchowski  ---
I can confirm that the bug still occurs on 3.16 as well. 

Different hardware:
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Curacao XT [Radeon R9 270X]

Non-AMD-Vi  (Intel Xeon), IO-MMU disabled. 

Occasionally on large window resizes (4K display running awesome WM, moving a
2D window from a small tile to a large one) this issue triggers. 


[ 6735.965953] radeon :02:00.0: ring 0 stalled for more than 10081msec
[ 6735.965958] radeon :02:00.0: GPU lockup (waiting for 0x00041872
last fence id 0x00041871 on ring 0)
[ 6735.965962] radeon :02:00.0: failed to get a new IB (-35)
[ 6736.546504] radeon :02:00.0: Saved 12093 dwords of commands on ring 0.
[ 6736.546647] radeon :02:00.0: GPU softreset: 0x006C
[ 6736.546651] radeon :02:00.0:   GRBM_STATUS   = 0xA0003028
[ 6736.546654] radeon :02:00.0:   GRBM_STATUS_SE0   = 0x0006
[ 6736.546657] radeon :02:00.0:   GRBM_STATUS_SE1   = 0x0006
[ 6736.546660] radeon :02:00.0:   SRBM_STATUS   = 0x20C0
[ 6736.546773] radeon :02:00.0:   SRBM_STATUS2  = 0x
[ 6736.546777] radeon :02:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[ 6736.546780] radeon :02:00.0:   R_008678_CP_STALLED_STAT2 = 0x0001
[ 6736.546783] radeon :02:00.0:   R_00867C_CP_BUSY_STAT = 0x0002
[ 6736.546786] radeon :02:00.0:   R_008680_CP_STAT  = 0x80010243
[ 6736.546789] radeon :02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
[ 6736.546793] radeon :02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C84246
[ 6736.546796] radeon :02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR  
0x
[ 6736.546802] radeon :02:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS
0x
[ 6737.119141] radeon :02:00.0: GRBM_SOFT_RESET=0xDDFF
[ 6737.119197] radeon :02:00.0: SRBM_SOFT_RESET=0x00100140
[ 6737.120382] radeon :02:00.0:   GRBM_STATUS   = 0x3028
[ 6737.120385] radeon :02:00.0:   GRBM_STATUS_SE0   = 0x0006
[ 6737.120388] radeon :02:00.0:   GRBM_STATUS_SE1   = 0x0006
[ 6737.120391] radeon :02:00.0:   SRBM_STATUS   = 0x2AC0
[ 6737.120503] radeon :02:00.0:   SRBM_STATUS2  = 0x
[ 6737.120507] radeon :02:00.0:   R_008674_CP_STALLED_STAT1 = 0x
[ 6737.120510] radeon :02:00.0:   R_008678_CP_STALLED_STAT2 = 0x
[ 6737.120513] radeon :02:00.0:   R_00867C_CP_BUSY_STAT = 0x
[ 6737.120516] radeon :02:00.0:   R_008680_CP_STAT  = 0x
[ 6737.120519] radeon :02:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[ 6737.120522] radeon :02:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[ 6737.120770] radeon :02:00.0: GPU reset succeeded, trying to resume
[ 6737.169219] [drm] probing gen 2 caps for device 8086:340a = 3b3d02/0
[ 6737.169230] [drm] PCIE gen 2 link speeds already enabled
[ 6737.172143] [drm] PCIE GART of 1024M enabled (table at 0x00276000).
[ 6737.172320] radeon :02:00.0: WB enabled
[ 6737.172324] radeon :02:00.0: fence driver on ring 0 use gpu addr
0x8c00 and cpu addr 0x880197695c00
[ 6737.172327] radeon :02:00.0: fence driver on ring 1 use gpu addr
0x8c04 and cpu addr 0x880197695c04
[ 6737.172330] radeon :02:00.0: fence driver on ring 2 use gpu addr
0x8c08 and cpu addr 0x880197695c08
[ 6737.172335] radeon :02:00.0: fence driver on ring 3 use gpu addr
0x8c0c and cpu addr 0x880197695c0c
[ 6737.172338] radeon :02:00.0: fence driver on ring 4 use gpu addr
0x8c10 and cpu addr 0x880197695c10
[ 6737.216900] radeon :02:00.0: fence driver on ring 5 use gpu addr
0x00075a18 and cpu addr 0xc90001735a18
[ 6737.402614] [drm] ring test on 0 succeeded in 3 usecs
[ 6737.402627] [drm] ring test on 1 succeeded in 1 usecs
[ 6737.402634] [drm] ring test on 2 succeeded in 1 usecs
[ 6737.402701] [drm] ring test on 3 succeeded in 2 usecs
[ 6737.402713] [drm] ring test on 4 succeeded in 1 usecs
[ 6737.579764] [drm] ring test on 5 succeeded in 2 usecs
[ 6737.579778] [drm] UVD initialized successfully.
[ 6747.574404] radeon :02:00.0: ring 0 stalled for more than 1msec
[ 6747.574410] radeon :02:00.0: GPU lockup (waiting for 0x00041920
last fence id 0x00041871 on ring 0)
[ 6747.574414] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[ 6747.574418] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on
GFX ring (-35).
[ 6747.574421] radeon :02:00.0: ib ring test failed (-35).
[ 6748.140502] 

[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #14 from Andy Furniss  ---
(In reply to comment #13)
> Does reverting Mesa commit 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 help for
> Valley or pyrit with the latest kernel?

Yes, with that reverted perf is roughly back to "good" kernel for both.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[PATCH] drm/exynos: dsi: fix exynos_dsi_set_pll() wrong return value

2014-08-14 Thread YoungJun Cho
The return type of this function is unsigned long, and it is expected
to return a proper fout value, or zero if something goes wrong.
So this patch fixes the wrong return values in the error cases.

Signed-off-by: YoungJun Cho 
Acked-by: Inki Dae 
Acked-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_dsi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 86aebd8..061017b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -421,7 +421,7 @@ static unsigned long exynos_dsi_set_pll(struct exynos_dsi 
*dsi,
if (!fout) {
dev_err(dsi->dev,
"failed to find PLL PMS for requested frequency\n");
-   return -EFAULT;
+   return 0;
}
dev_dbg(dsi->dev, "PLL freq %lu, (p %d, m %d, s %d)\n", fout, p, m, s);

@@ -453,7 +453,7 @@ static unsigned long exynos_dsi_set_pll(struct exynos_dsi 
*dsi,
do {
if (timeout-- == 0) {
dev_err(dsi->dev, "PLL failed to stabilize\n");
-   return -EFAULT;
+   return 0;
}
reg = readl(dsi->reg_base + DSIM_STATUS_REG);
} while ((reg & DSIM_PLL_STABLE) == 0);
-- 
1.9.0
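A hypothetical caller (not taken from the driver; the function name and
requested_hz are placeholders) illustrating why zero is the right sentinel
here: the helper returns unsigned long, so the old "return -EFAULT" came back
as a huge positive frequency, and a check like the one below could never catch
it:

static int example_enable_pll(struct exynos_dsi *dsi, unsigned long requested_hz)
{
    unsigned long fout = exynos_dsi_set_pll(dsi, requested_hz);

    /* With the old code, -EFAULT turned into 0xfffffffffffffff2 on 64-bit,
     * so this test never fired on error. */
    if (!fout)
        return -EFAULT;   /* the caller, not the helper, picks the errno */

    return 0;
}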



Fence, timeline and android sync points

2014-08-14 Thread Maarten Lankhorst
On 13-08-14 at 19:07, Jerome Glisse wrote:
> On Wed, Aug 13, 2014 at 05:54:20PM +0200, Daniel Vetter wrote:
>> On Wed, Aug 13, 2014 at 09:36:04AM -0400, Jerome Glisse wrote:
>>> On Wed, Aug 13, 2014 at 10:28:22AM +0200, Daniel Vetter wrote:
 On Tue, Aug 12, 2014 at 06:13:41PM -0400, Jerome Glisse wrote:
> Hi,
>
> So i want over the whole fence and sync point stuff as it's becoming a 
> pressing
> issue. I think we first need to agree on what is the problem we want to 
> solve
> and what would be the requirements to solve it.
>
> Problem :
>   Explicit synchronization btw different hardware block over a buffer 
> object.
>
> Requirements :
>   Share common infrastructure.
>   Allow optimal hardware command stream scheduling accross hardware block.
>   Allow android sync point to be implemented on top of it.
>   Handle/acknowledge exception (like good old gpu lockup).
>   Minimize driver changes.
>
> Glossary :
>   hardware timeline: timeline bound to a specific hardware block.
>   pipeline timeline: timeline bound to a userspace rendering pipeline, 
> each
>  point on that timeline can be a composite of several
>  different hardware pipeline point.
>   pipeline: abstract object representing userspace application graphic 
> pipeline
> of each of the application graphic operations.
>   fence: specific point in a timeline where synchronization needs to 
> happen.
>
>
> So now, current include/linux/fence.h implementation is i believe missing 
> the
> objective by confusing hardware and pipeline timeline and by bolting 
> fence to
> buffer object while what is really needed is true and proper timeline for 
> both
> hardware and pipeline. But before going further down that road let me 
> look at
> things and explain how i see them.
 fences can be used free-standing and no one forces you to integrate them
 with buffers. We actually plan to go this way with the intel svm stuff.
 Ofc for dma-buf the plan is to synchronize using such fences, but that's
 somewhat orthogonal I think. At least you only talk about fences and
 timeline and not dma-buf here.
  
> Current ttm fence have one and a sole purpose, allow synchronization for 
> buffer
> object move even thought some driver like radeon slightly abuse it and 
> use them
> for things like lockup detection.
>
> The new fence want to expose an api that would allow some implementation 
> of a
> timeline. For that it introduces callback and some hard requirement on 
> what the
> driver have to expose :
>   enable_signaling
>   [signaled]
>   wait
>
> Each of those have to do work inside the driver to which the fence 
> belongs and
> each of those can be call more or less from unexpected (with restriction 
> like
> outside irq) context. So we end up with thing like :
>
>  Process 1  Process 2   Process 3
>  I_A_schedule(fence0)
> CI_A_F_B_signaled(fence0)
> I_A_signal(fence0)
> 
> CI_B_F_A_callback(fence0)
> CI_A_F_B_wait(fence0)
> Lexique:
> I_x  in driver x (I_A == in driver A)
> CI_x_F_y call in driver X from driver Y (CI_A_F_B call in driver A from 
> driver B)
>
> So this is an happy mess everyone call everyone and this bound to get 
> messy.
> Yes i know there is all kind of requirement on what happen once a fence is
> signaled. But those requirement only looks like they are trying to atone 
> any
> mess that can happen from the whole callback dance.
>
> While i was too seduced by the whole callback idea long time ago, i think 
> it is
> a highly dangerous path to take where the combinatorial of what could 
> happen
> are bound to explode with the increase in the number of players.
>
>
> So now back to how to solve the problem we are trying to address. First i 
> want
> to make an observation, almost all GPU that exist today have a command 
> ring
> on to which userspace command buffer are executed and inside the command 
> ring
> you can do something like :
>
>   if (condition) execute_command_buffer else skip_command_buffer
>
> where condition is a simple expression (memory_address cop value)) with 
> cop one
> of the generic comparison (==, <, >, <=, >=). I think it is a safe 
> assumption
> that any gpu that slightly matter can do that. Those who can not should 
> fix
> there command ring processor.
>
>
> With that in mind, i think proper solution is implementing timeline and 

Fence, timeline and android sync points

2014-08-14 Thread Daniel Vetter
On Wed, Aug 13, 2014 at 01:07:20PM -0400, Jerome Glisse wrote:
> Let me make this crystal clear this must be a valid kernel page that have a
> valid kernel mapping for the lifetime of the device. Hence there is no access
> to mmio space or anything, just a regular kernel page. If can not rely on that
> this is a sad world.
> 
> That being said, yes i am aware that some device incapacity to write to such
> a page. For those dumb hardware what you need to do is have the irq handler
> write to this page on behalf of the hardware. But i would like to know any
> hardware than can not write a single dword from its ring buffer.
> 
> The only tricky part in this, is when device is unloaded and driver is 
> removing
> itself, it obviously need to synchronize itself with anyone possibly waiting 
> on
> it and possibly reading. But truly this is not that hard to solve.
> 
> So tell me once the above is clear what kind of scary thing can happen when 
> cpu
> or a device _read_ a kernel page ?

It's not reading it, it's making sense of what you read. In i915 we had
exactly the (timeline, seqno) value pair design for fences for a long
time, and we're switching away from it since it stops working when you
have preemption and scheduler. Or at least it gets really interesting to
interpret the data written into the page.

So I don't want to expose that to other drivers if we decided that
exposing this internally is a stupid idea.
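
For illustration, here is a minimal sketch of the (timeline, seqno) scheme under
discussion; all names are hypothetical and this is not the i915 implementation,
only the naive model that preemption and scheduling break.

#include <linux/types.h>

/* One page per hardware ring; the ring writes back the seqno of the last
 * retired batch. */
struct hw_timeline {
	volatile u32 *breadcrumb;	/* kernel page written by the ring */
};

struct simple_fence {
	struct hw_timeline *tl;
	u32 seqno;			/* value written when the batch retires */
};

/* Assumes batches retire strictly in submission order.  Once a scheduler can
 * preempt and reorder batches, "breadcrumb has passed my seqno" no longer
 * implies "my batch has completed", which is exactly the interpretation
 * problem described above. */
static bool simple_fence_signaled(const struct simple_fence *f)
{
	return (s32)(*f->tl->breadcrumb - f->seqno) >= 0;
}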

> > 
> > > > So from that pov (presuming I didn't miss anything) your proposal is
> > > > identical to what we have, minor some different color choices (like 
> > > > where
> > > > to place the callback queue).
> > > 
> > > No callback is the mantra here, and instead of bolting free living fence
> > > to buffer object, they are associated with timeline which means you do not
> > > need to go over all buffer object to know what you need to wait for.
> > 
> > Ok, then I guess I didn't understand that part of your the proposal. Can
> > you please elaborate a bit more how you want to synchronize mulitple
> > drivers accessing a dma-buf object and what piece of state we need to
> > associate to the dma-buf to make this happen?
> 
> Beauty of it you associate ziltch to the buffer. So for existing cs ioctl 
> where
> the implicit synchronization is the rule it enforce mandatory synchronization
> accross all hw timeline on which a buffer shows up :
>   for_each_buffer_in_cmdbuffer(buffer, cmdbuf) {
> if (!cmdbuf_write_to_buffer(buffer, cmdbuf))
>   continue;
> for_each_process_sharing_buffer(buffer, process) {
>   schedule_fence(process->implicit_timeline, cmdbuf->fence)
> }
>   }
> 
> Next time another process use current ioctl with implicit sync it will synch 
> with
> the last fence for any shared resource. This might sounds bad but truely as 
> it is
> right now this is already how it happens (at least for radeon).

Well i915 is a lot better than that. And I'm not going to implement some
special-case for dma-buf shared buffers just because radeon sucks and
wants to enforce that suckage on everyone else.

So let's cut this short: If you absolutely insist I guess we could ditch
the callback stuff from fences, but I really don't see the problem with
radeon just not using that and then being happy. We can easily implement a
bit of insulation code _just_ for radeon so that the only thing radeon
does is wake up a process (which then calls the callback if it's something
special).

Otoh I don't care about what ttm and radeon do, for i915 the important
stuff is integration with android syncpts and being able to do explicit
fencing for e.g. svm stuff. We can do that with what's merged in 3.17 and
I expect that those patches will land in 3.18, at least the internal
integration.

It would be cool if we could get tear-free optimus working on desktop
linux, but that flat out doesn't pay my bills here. So I think I'll let
you guys figure this out yourself.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[patch] drm/nouveau: indent an else statement

2014-08-14 Thread Dan Carpenter
This else statement wasn't indented so it was confusing.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/gpu/drm/nouveau/nouveau_usif.c 
b/drivers/gpu/drm/nouveau/nouveau_usif.c
index cb1182d..41c9b404 100644
--- a/drivers/gpu/drm/nouveau/nouveau_usif.c
+++ b/drivers/gpu/drm/nouveau/nouveau_usif.c
@@ -83,9 +83,10 @@ usif_notify(const void *header, u32 length, const void 
*data, u32 size)
if (WARN_ON(!(ntfy = (void *)(unsigned long)rep->v0.token)))
return NVIF_NOTIFY_DROP;
BUG_ON(rep->v0.route != NVDRM_NOTIFY_USIF);
-   } else
-   if (WARN_ON(1))
-   return NVIF_NOTIFY_DROP;
+   } else {
+   if (WARN_ON(1))
+   return NVIF_NOTIFY_DROP;
+   }

if (WARN_ON(!ntfy->p || ntfy->reply != (length + size)))
return NVIF_NOTIFY_DROP;


Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 11:08:34AM +0200, Daniel Vetter wrote:
> On Wed, Aug 13, 2014 at 01:07:20PM -0400, Jerome Glisse wrote:
> > Let me make this crystal clear this must be a valid kernel page that have a
> > valid kernel mapping for the lifetime of the device. Hence there is no 
> > access
> > to mmio space or anything, just a regular kernel page. If can not rely on 
> > that
> > this is a sad world.
> > 
> > That being said, yes i am aware that some device incapacity to write to such
> > a page. For those dumb hardware what you need to do is have the irq handler
> > write to this page on behalf of the hardware. But i would like to know any
> > hardware than can not write a single dword from its ring buffer.
> > 
> > The only tricky part in this, is when device is unloaded and driver is 
> > removing
> > itself, it obviously need to synchronize itself with anyone possibly 
> > waiting on
> > it and possibly reading. But truly this is not that hard to solve.
> > 
> > So tell me once the above is clear what kind of scary thing can happen when 
> > cpu
> > or a device _read_ a kernel page ?
> 
> It's not reading it, it's making sense of what you read. In i915 we had
> exactly the (timeline, seqno) value pair design for fences for a long
> time, and we're switching away from it since it stops working when you
> have preemption and scheduler. Or at least it gets really interesting to
> interpret the data written into the page.
> 
> So I don't want to expose that to other drivers if we decided that
> exposing this internally is a stupid idea.

I am well aware of that, but context scheduling is exactly what the timeline I am
talking about is. The user timeline should be considered like a single CPU thread
onto which operations involving different hardware blocks are scheduled. The
hardware can switch from one timeline to another, and a seqno is only meaningful
per timeline (a rough sketch follows below).

Preemption and scheduling are bound to happen on GPUs, and we will want to
integrate with the core scheduler to manage the time slice allocated to each
process. But for that we need the concept of a thread, in which operations on the
same hardware block are processed in a linear fashion while still allowing
concurrency with other hardware blocks.
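
To make that concrete, here is a rough sketch of a per-user timeline with
per-timeline seqnos; the structures and names are hypothetical, not taken from
any existing driver.

#include <linux/types.h>
#include <linux/list.h>

/* A user timeline behaves like a single CPU thread: operations are queued in
 * order and each gets the next seqno.  Seqnos are only comparable within the
 * same timeline.  Caller provides whatever locking is needed. */
struct user_timeline {
	u64 next_seqno;
	struct list_head ops;		/* operations in submission order */
};

struct timeline_op {
	struct list_head link;
	u64 seqno;			/* position on the owning timeline */
	void *hw_block;			/* hardware block that executes it */
};

static u64 timeline_schedule(struct user_timeline *tl, struct timeline_op *op)
{
	op->seqno = tl->next_seqno++;	/* linear order within the timeline */
	list_add_tail(&op->link, &tl->ops);
	return op->seqno;
}

Operations on different hardware blocks can still run concurrently with other
timelines; only the ordering within one timeline is linear.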

> 
> > > 
> > > > > So from that pov (presuming I didn't miss anything) your proposal is
> > > > > identical to what we have, minor some different color choices (like 
> > > > > where
> > > > > to place the callback queue).
> > > > 
> > > > No callback is the mantra here, and instead of bolting free living fence
> > > > to buffer object, they are associated with timeline which means you do 
> > > > not
> > > > need to go over all buffer object to know what you need to wait for.
> > > 
> > > Ok, then I guess I didn't understand that part of your the proposal. Can
> > > you please elaborate a bit more how you want to synchronize mulitple
> > > drivers accessing a dma-buf object and what piece of state we need to
> > > associate to the dma-buf to make this happen?
> > 
> > Beauty of it you associate ziltch to the buffer. So for existing cs ioctl 
> > where
> > the implicit synchronization is the rule it enforce mandatory 
> > synchronization
> > accross all hw timeline on which a buffer shows up :
> >   for_each_buffer_in_cmdbuffer(buffer, cmdbuf) {
> > if (!cmdbuf_write_to_buffer(buffer, cmdbuf))
> >   continue;
> > for_each_process_sharing_buffer(buffer, process) {
> >   schedule_fence(process->implicit_timeline, cmdbuf->fence)
> > }
> >   }
> > 
> > Next time another process use current ioctl with implicit sync it will 
> > synch with
> > the last fence for any shared resource. This might sounds bad but truely as 
> > it is
> > right now this is already how it happens (at least for radeon).
> 
> Well i915 is a lot better than that. And I'm not going to implement some
> special-case for dma-buf shared buffers just because radeon sucks and
> wants to enforce that suckage on everyone else.

I guess I am having a hard time expressing myself. What I am saying here is that
implicit synchronization sucks because it has to assume the worst case, which is
what the current code does, and I am sure Intel's current code does something
similar.

Explicit synchronization allows more flexibility, but the fence code as designed
does not let us go fully down that path, because it associates fences with buffer
objects, which is the biggest shortcoming of implicit sync (a sketch of the
worst-case implicit path follows below).
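
As a rough illustration of the worst-case implicit path quoted above, the
scheduling loop could look like the sketch below; every structure and helper
here is hypothetical, not the radeon or i915 code.

#include <linux/types.h>
#include <linux/list.h>

struct fence;				/* opaque for this sketch */
struct timeline;			/* opaque for this sketch */
void schedule_fence(struct timeline *tl, struct fence *fence);

struct process {
	struct list_head share_link;	/* membership in buffer->sharers */
	struct timeline *implicit_timeline;
};

struct buffer {
	struct list_head sharers;	/* processes sharing this buffer */
};

struct buffer_ref {
	struct list_head link;		/* membership in cmdbuf->buffers */
	struct buffer *buffer;
	bool write;
};

struct cmdbuf {
	struct list_head buffers;	/* buffers referenced by this cmdbuf */
	struct fence *fence;		/* fence emitted for this cmdbuf */
};

/* Worst case: every process sharing a buffer we write to is forced to wait on
 * this command buffer's fence, whether it cares about the contents or not. */
static void implicit_sync_cmdbuf(struct cmdbuf *cmdbuf)
{
	struct buffer_ref *ref;
	struct process *process;

	list_for_each_entry(ref, &cmdbuf->buffers, link) {
		if (!ref->write)
			continue;	/* read-only use adds no ordering here */
		list_for_each_entry(process, &ref->buffer->sharers, share_link)
			schedule_fence(process->implicit_timeline,
				       cmdbuf->fence);
	}
}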

> 
> So let's cut this short: If you absolutely insist I guess we could ditch
> the callback stuff from fences, but I really don't see the problem with
> radeon just not using that and then being happy. We can easily implement a
> bit of insulation code _just_ for radeon so that the only thing radeon
> does is wake up a process (which then calls the callback if it's something
> special).

Like I said, feel free to ignore me. I just genuinely want to have the best
solution inside the Linux kernel, and I do think that fences with callbacks and
buffer association are not that 

Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 09:16:02AM -0400, Rob Clark wrote:
> On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  wrote:
> > So this is fundamentaly different, fence as they are now allow random driver
> > callback and this is bound to get ugly this is bound to lead to one driver
> > doing something that seems innocuous but turn out to break heavoc when call
> > from some other driver function.
> 
> 
> tbh, that seems solvable by some strict rules about what you can do in
> the callback.. ie. don't do anything you couldn't do in atomic, and
> don't signal another fence.. off the top of my head that seems
> sufficient.
> 
> If the driver getting the callback needs to do more, then it can
> always schedule a worker..
> 
> But I could certainly see the case where the driver waiting on fence
> sets everything up before installing the cb and then just needs to
> write one or a couple regs from the cb.

Yes sane code will do sane things, sadly i fear we can not enforce sane
code everywhere especialy with out of tree driver and i would rather
force there hand to only allow sane implementation. Providing call back
api obviously allows them to do crazy stuff.

> 
> BR,
> -R


Fence, timeline and android sync points

2014-08-14 Thread Jerome Glisse
On Thu, Aug 14, 2014 at 11:15:11AM +0200, Maarten Lankhorst wrote:
> On 13-08-14 at 19:07, Jerome Glisse wrote:
> > On Wed, Aug 13, 2014 at 05:54:20PM +0200, Daniel Vetter wrote:
> >> On Wed, Aug 13, 2014 at 09:36:04AM -0400, Jerome Glisse wrote:
> >>> On Wed, Aug 13, 2014 at 10:28:22AM +0200, Daniel Vetter wrote:
>  On Tue, Aug 12, 2014 at 06:13:41PM -0400, Jerome Glisse wrote:
> > Hi,
> >
> > So i want over the whole fence and sync point stuff as it's becoming a 
> > pressing
> > issue. I think we first need to agree on what is the problem we want to 
> > solve
> > and what would be the requirements to solve it.
> >
> > Problem :
> >   Explicit synchronization btw different hardware block over a buffer 
> > object.
> >
> > Requirements :
> >   Share common infrastructure.
> >   Allow optimal hardware command stream scheduling accross hardware 
> > block.
> >   Allow android sync point to be implemented on top of it.
> >   Handle/acknowledge exception (like good old gpu lockup).
> >   Minimize driver changes.
> >
> > Glossary :
> >   hardware timeline: timeline bound to a specific hardware block.
> >   pipeline timeline: timeline bound to a userspace rendering pipeline, 
> > each
> >  point on that timeline can be a composite of 
> > several
> >  different hardware pipeline point.
> >   pipeline: abstract object representing userspace application graphic 
> > pipeline
> > of each of the application graphic operations.
> >   fence: specific point in a timeline where synchronization needs to 
> > happen.
> >
> >
> > So now, current include/linux/fence.h implementation is i believe 
> > missing the
> > objective by confusing hardware and pipeline timeline and by bolting 
> > fence to
> > buffer object while what is really needed is true and proper timeline 
> > for both
> > hardware and pipeline. But before going further down that road let me 
> > look at
> > things and explain how i see them.
>  fences can be used free-standing and no one forces you to integrate them
>  with buffers. We actually plan to go this way with the intel svm stuff.
>  Ofc for dma-buf the plan is to synchronize using such fences, but that's
>  somewhat orthogonal I think. At least you only talk about fences and
>  timeline and not dma-buf here.
>   
> > Current ttm fence have one and a sole purpose, allow synchronization 
> > for buffer
> > object move even thought some driver like radeon slightly abuse it and 
> > use them
> > for things like lockup detection.
> >
> > The new fence want to expose an api that would allow some 
> > implementation of a
> > timeline. For that it introduces callback and some hard requirement on 
> > what the
> > driver have to expose :
> >   enable_signaling
> >   [signaled]
> >   wait
> >
> > Each of those have to do work inside the driver to which the fence 
> > belongs and
> > each of those can be call more or less from unexpected (with 
> > restriction like
> > outside irq) context. So we end up with thing like :
> >
> >  Process 1  Process 2   Process 3
> >  I_A_schedule(fence0)
> > CI_A_F_B_signaled(fence0)
> > I_A_signal(fence0)
> > 
> > CI_B_F_A_callback(fence0)
> > CI_A_F_B_wait(fence0)
> > Lexique:
> > I_x  in driver x (I_A == in driver A)
> > CI_x_F_y call in driver X from driver Y (CI_A_F_B call in driver A from 
> > driver B)
> >
> > So this is an happy mess everyone call everyone and this bound to get 
> > messy.
> > Yes i know there is all kind of requirement on what happen once a fence 
> > is
> > signaled. But those requirement only looks like they are trying to 
> > atone any
> > mess that can happen from the whole callback dance.
> >
> > While i was too seduced by the whole callback idea long time ago, i 
> > think it is
> > a highly dangerous path to take where the combinatorial of what could 
> > happen
> > are bound to explode with the increase in the number of players.
> >
> >
> > So now back to how to solve the problem we are trying to address. First 
> > i want
> > to make an observation, almost all GPU that exist today have a command 
> > ring
> > on to which userspace command buffer are executed and inside the 
> > command ring
> > you can do something like :
> >
> >   if (condition) execute_command_buffer else skip_command_buffer
> >
> > where condition is a simple expression (memory_address cop value)) with 
> > 

[PATCH] drm/exynos: dsi: fix exynos_dsi_set_pll() wrong return value

2014-08-14 Thread Andrzej Hajda
Hi YoungJun,

Thanks for spotting it.

On 08/14/2014 04:22 AM, YoungJun Cho wrote:
> The return type of this function is unsigned long, and it is expected
> to return a proper fout value, or zero if something is wrong.
> So this patch fixes the wrong return values for the error cases.
>
> Signed-off-by: YoungJun Cho 
> Acked-by: Inki Dae 
> Acked-by: Kyungmin Park 
Acked-by: Andrzej Hajda 

Regards
Andrzej
> ---
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
> b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> index 86aebd8..061017b 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> @@ -421,7 +421,7 @@ static unsigned long exynos_dsi_set_pll(struct exynos_dsi 
> *dsi,
>   if (!fout) {
>   dev_err(dsi->dev,
>   "failed to find PLL PMS for requested frequency\n");
> - return -EFAULT;
> + return 0;
>   }
>   dev_dbg(dsi->dev, "PLL freq %lu, (p %d, m %d, s %d)\n", fout, p, m, s);
>  
> @@ -453,7 +453,7 @@ static unsigned long exynos_dsi_set_pll(struct exynos_dsi 
> *dsi,
>   do {
>   if (timeout-- == 0) {
>   dev_err(dsi->dev, "PLL failed to stabilize\n");
> - return -EFAULT;
> + return 0;
>   }
>   reg = readl(dsi->reg_base + DSIM_STATUS_REG);
>   } while ((reg & DSIM_PLL_STABLE) == 0);



Fence, timeline and android sync points

2014-08-14 Thread Rob Clark
On Wed, Aug 13, 2014 at 1:07 PM, Jerome Glisse  wrote:
> So this is fundamentaly different, fence as they are now allow random driver
> callback and this is bound to get ugly this is bound to lead to one driver
> doing something that seems innocuous but turn out to break heavoc when call
> from some other driver function.


tbh, that seems solvable by some strict rules about what you can do in
the callback.. ie. don't do anything you couldn't do in atomic, and
don't signal another fence.. off the top of my head that seems
sufficient.

If the driver getting the callback needs to do more, then it can
always schedule a worker..

But I could certainly see the case where the driver waiting on fence
sets everything up before installing the cb and then just needs to
write one or a couple regs from the cb.

BR,
-R
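
For illustration, here is a minimal sketch of a callback written under the strict
rules above, assuming the fence API merged for 3.17 (struct fence_cb and
fence_add_callback()); the waiter structure and register names are hypothetical.

#include <linux/fence.h>
#include <linux/workqueue.h>
#include <linux/io.h>
#include <linux/kernel.h>

struct waiter {
	struct fence_cb cb;
	void __iomem *doorbell;		/* mmio register set up in advance */
	u32 kick_value;
	struct work_struct heavy_work;	/* anything that might sleep */
};

/* Runs in whatever context fence_signal() is called from, so treat it like
 * atomic context: poke a pre-programmed register, defer everything else. */
static void waiter_fence_cb(struct fence *fence, struct fence_cb *cb)
{
	struct waiter *w = container_of(cb, struct waiter, cb);

	writel(w->kick_value, w->doorbell);
	schedule_work(&w->heavy_work);
}

/* Installed with fence_add_callback(fence, &w->cb, waiter_fence_cb) once all
 * the setup has been done. */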


[Bug 82455] Failed to allocate virtual address for buffer

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82455

--- Comment #8 from charlie <407883775 at qq.com> ---
Our system PAGE_SIZE is 8k, but the GPU PAGE_SIZE is 4k. The mesa radeon



[PATCH] drm/msm: Fix missing unlock on error in msm_fbdev_create()

2014-08-14 Thread weiyj...@163.com
From: Wei Yongjun 

Add the missing unlock before return from function msm_fbdev_create()
in the error handling case.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
index 9c5221c..ab5bfd2 100644
--- a/drivers/gpu/drm/msm/msm_fbdev.c
+++ b/drivers/gpu/drm/msm/msm_fbdev.c
@@ -143,7 +143,7 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
ret = msm_gem_get_iova_locked(fbdev->bo, 0, &paddr);
if (ret) {
dev_err(dev->dev, "failed to get buffer obj iova: %d\n", ret);
-   goto fail;
+   goto fail_unlock;
}

fbi = framebuffer_alloc(0, dev->dev);



[PATCH] drm: sti: Fix return value check in sti_drm_platform_probe()

2014-08-14 Thread weiyj...@163.com
From: Wei Yongjun 

In case of error, the function platform_device_register_resndata()
returns ERR_PTR() and never returns NULL. The NULL test in the return
value check should be replaced with IS_ERR().

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/sti/sti_drm_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_drm_drv.c 
b/drivers/gpu/drm/sti/sti_drm_drv.c
index a7cc249..c8a8412 100644
--- a/drivers/gpu/drm/sti/sti_drm_drv.c
+++ b/drivers/gpu/drm/sti/sti_drm_drv.c
@@ -201,8 +201,8 @@ static int sti_drm_platform_probe(struct platform_device 
*pdev)
master = platform_device_register_resndata(dev,
DRIVER_NAME "__master", -1,
NULL, 0, NULL, 0);
-   if (!master)
-   return -EINVAL;
+   if (IS_ERR(master))
+   return PTR_ERR(master);

platform_set_drvdata(pdev, master);
return 0;
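
For context, the two error-return conventions that this series of sti fixes is
about can be illustrated with a small, purely hypothetical example (these are
not the sti functions):

#include <linux/err.h>
#include <linux/slab.h>

/* Style 1: errno encoded in the pointer, never NULL on failure. */
static void *alloc_errptr_style(size_t n)
{
	void *p = kzalloc(n, GFP_KERNEL);

	return p ? p : ERR_PTR(-ENOMEM);
}

/* Style 2: plain NULL on failure, caller picks the errno. */
static void *alloc_null_style(size_t n)
{
	return kzalloc(n, GFP_KERNEL);
}

static int caller_example(void)
{
	void *a, *b;

	a = alloc_errptr_style(64);
	if (IS_ERR(a))			/* never NULL, so use IS_ERR() */
		return PTR_ERR(a);

	b = alloc_null_style(64);
	if (!b) {			/* never ERR_PTR(), so test for NULL */
		kfree(a);
		return -ENOMEM;
	}

	kfree(b);
	kfree(a);
	return 0;
}

platform_device_register_resndata() follows the first style, while
devm_ioremap_nocache() follows the second, which is why the checks in these
patches go in opposite directions.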



[PATCH] drm: sti: hda: fix return value check in sti_hda_probe()

2014-08-14 Thread weiyj...@163.com
From: Wei Yongjun 

In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/sti/sti_hda.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_hda.c b/drivers/gpu/drm/sti/sti_hda.c
index 72d957f..2802b81 100644
--- a/drivers/gpu/drm/sti/sti_hda.c
+++ b/drivers/gpu/drm/sti/sti_hda.c
@@ -730,16 +730,16 @@ static int sti_hda_probe(struct platform_device *pdev)
return -ENOMEM;
}
hda->regs = devm_ioremap_nocache(dev, res->start, resource_size(res));
-   if (IS_ERR(hda->regs))
-   return PTR_ERR(hda->regs);
+   if (!hda->regs)
+   return -ENOMEM;

res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
"video-dacs-ctrl");
if (res) {
hda->video_dacs_ctrl = devm_ioremap_nocache(dev, res->start,
resource_size(res));
-   if (IS_ERR(hda->video_dacs_ctrl))
-   return PTR_ERR(hda->video_dacs_ctrl);
+   if (!hda->video_dacs_ctrl)
+   return -ENOMEM;
} else {
/* If no existing video-dacs-ctrl resource continue the probe */
DRM_DEBUG_DRIVER("No video-dacs-ctrl resource\n");



[PATCH] drm: sti: hdmi: fix return value check in sti_hdmi_probe()

2014-08-14 Thread weiyj...@163.com
From: Wei Yongjun 

In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/sti/sti_hdmi.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_hdmi.c b/drivers/gpu/drm/sti/sti_hdmi.c
index 284e541..8319f76 100644
--- a/drivers/gpu/drm/sti/sti_hdmi.c
+++ b/drivers/gpu/drm/sti/sti_hdmi.c
@@ -713,8 +713,8 @@ static int sti_hdmi_probe(struct platform_device *pdev)
return -ENOMEM;
}
hdmi->regs = devm_ioremap_nocache(dev, res->start, resource_size(res));
-   if (IS_ERR(hdmi->regs))
-   return PTR_ERR(hdmi->regs);
+   if (!hdmi->regs)
+   return -ENOMEM;

if (of_device_is_compatible(np, "st,stih416-hdmi")) {
res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
@@ -725,8 +725,8 @@ static int sti_hdmi_probe(struct platform_device *pdev)
}
hdmi->syscfg = devm_ioremap_nocache(dev, res->start,
resource_size(res));
-   if (IS_ERR(hdmi->syscfg))
-   return PTR_ERR(hdmi->syscfg);
+   if (!hdmi->syscfg)
+   return -ENOMEM;

}




[PATCH] drm: sti: tvout: fix return value check in sti_tvout_probe()

2014-08-14 Thread weiyj...@163.com
From: Wei Yongjun 

In case of error, the function devm_ioremap_nocache() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/sti/sti_tvout.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/sti/sti_tvout.c b/drivers/gpu/drm/sti/sti_tvout.c
index b69e26f..9ad2d44 100644
--- a/drivers/gpu/drm/sti/sti_tvout.c
+++ b/drivers/gpu/drm/sti/sti_tvout.c
@@ -591,8 +591,8 @@ static int sti_tvout_probe(struct platform_device *pdev)
return -ENOMEM;
}
tvout->regs = devm_ioremap_nocache(dev, res->start, resource_size(res));
-   if (IS_ERR(tvout->regs))
-   return PTR_ERR(tvout->regs);
+   if (!tvout->regs)
+   return -ENOMEM;

/* get reset resources */
tvout->reset = devm_reset_control_get(dev, "tvout");



[Bug 82588] X fails to start with linus-tip or drm-next

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82588

--- Comment #4 from Mike Lothian  ---
Created attachment 104605
  --> https://bugs.freedesktop.org/attachment.cgi?id=104605&action=edit
Dmesg Revert

It made no difference



[Bug 82371] Capa Verde (Radeon 7750) GPU lockup using UVD

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=82371

Pablo Wagner  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Pablo Wagner  ---
Yeah, I've purged the oibaf PPA and installed xorg-edgers, which has a Git build
with that patch included, and it's working again.

Thanks for your fast response. I'm marking this as invalid, since it's related to
Mesa (correct me if I'm wrong) and already fixed.



[PATCH 1/2 v2] x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup()

2014-08-14 Thread Andreas Noever
On Sun, Aug 10, 2014 at 6:34 PM, Bruno Prémont
 wrote:
> On Sun, 10 August 2014 Andreas Noever  wrote:
>> On Sun, Aug 10, 2014 at 11:26 AM, Bruno Prémont wrote:
>> > On Sun, 10 August 2014 Andreas Noever wrote:
>> >
>> >> On Sun, Aug 10, 2014 at 2:21 AM, Andreas Noever wrote:
>> >> > I just tried to run the latest kernel.  It failed to boot and git
>> >> > bisect points to this commit (MacBookPro10,1 with Nvidia
>> >> > graphics).
>> >> >
>> >> > The (now removed) code in efifb_setup did always set default_vga, even
>> >> > if it had already been set earlier. The new code in pci_fixup_video
>> >> > runs only if vga_default_device() is NULL. Removing the check fixes
>> >> > the regression.
>> >> >
>> >> >
>> >> > The following calls to vga_set_default_device are made during boot:
>> >> >
>> >> > vga_arbiter_add_pci_device -> vga_set_default_device(intel)
>> >> > pci_fixup_video -> vga_set_default_device(intel) (there are two calls
>> >> > in pci_fixup_video, this one is the one near "Boot video device")
>> >> > pci_fixup_video -> vga_set_default_device(nvidia) (from the "Does
>> >> > firmware framebuffer belong to us?" loop, only if I remove the check)
>> >> >
>> >> > vga_arbiter_add_pci_device chooses intel simply because it is the
>> >> > first device. Next pci_fixup_video(intel) sees that it is the default
>> >> > device, sets the IORESOURCE_ROM_SHADOW flag and calls
>> >> > vga_set_default_device again. And finally (if the check is removed)
>> >> > pci_fixup_video(nvidia) sees that it owns the framebuffer and sets
>> >> > itself as the default device which allows the system to boot again.
>> >> >
>> >> > Does setting the ROM_SHADOW flag on (possibly) the wrong device have
>> >> > any effect?
>> >> Yes it does. Removing the line changes a long standing
>> >> i915 0000:00:02.0: Invalid ROM contents
>> >> into a
>> >> i915 0000:00:02.0: BAR 6: can't assign [??? 0x flags
>> >> 0x2000] (bogus alignment).
>> >>
>> >> The first is logged at KERN_ERR and the second one only at KERN_INFO.
>> >> We are making progress.
>> >
>> > How does your system behave if you change vga_arbiter_add_pci_device()
>> > not to set vga_set_default_device() with the first device registered?
>> >
>> > That is remove the
>> > #ifndef __ARCH_HAS_VGA_DEFAULT_DEVICE
>> > code block in vga_arbiter_add_pci_device().
>> The system does not boot.  The Intel device is still set as the
>> default device in pci_fixup_video (near "Boot video device") and
>> prevents the nvidia device (which is initialized later) from becoming
>> the default one.
>>
>> > How did your system behave in the past if you did not enable efifb?
>> I don't think that I ever did not enable efifb. It seems to have been
>> around for quite a while?
>
> The question here is if you system just refuses to boot as well.

I have just tested 3.16 without FB_EFI and it refuses to boot as well.

> The aim of my patch is to make system work (properly) when efifb is not used
> (why use efifb if built-in drm drivers handle the GPU(s)?)
>
> If you decided to replace efifb with platform-framebuffer+simplefb/simpledrm
> you would probably see the same issue as that would be no efifb as well.
>
> The presence of efifb should not be mandatory for successfully booting
> and running Xorg.
>
>> Andreas
>
> How does you system behave with below patch on top of Linus tree?

This patch fixes the problem (with and without FB_EFI).

Andreas


> Bruno
>
>
>
> ---
> diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
> index c61ea57..15d0068 100644
> --- a/arch/x86/pci/fixup.c
> +++ b/arch/x86/pci/fixup.c
> @@ -367,7 +367,7 @@ static void pci_fixup_video(struct pci_dev *pdev)
> }
> bus = bus->parent;
> }
> -   if (!vga_default_device() || pdev == vga_default_device()) {
> +   if (pdev == vga_default_device()) {
> pci_read_config_word(pdev, PCI_COMMAND, &config);
> if (config & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
> pdev->resource[PCI_ROM_RESOURCE].flags |= 
> IORESOURCE_ROM_SHADOW;
> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> index d2077f0..ac44d87 100644
> --- a/drivers/gpu/vga/vgaarb.c
> +++ b/drivers/gpu/vga/vgaarb.c
> @@ -112,10 +112,8 @@ both:
> return 1;
>  }
>
> -#ifndef __ARCH_HAS_VGA_DEFAULT_DEVICE
>  /* this is only used a cookie - it should not be dereferenced */
>  static struct pci_dev *vga_default;
> -#endif
>
>  static void vga_arb_device_card_gone(struct pci_dev *pdev);
>
> @@ -131,7 +129,6 @@ static struct vga_device *vgadev_find(struct pci_dev 
> *pdev)
>  }
>
>  /* Returns the default VGA device (vgacon's babe) */
> -#ifndef __ARCH_HAS_VGA_DEFAULT_DEVICE
>  struct pci_dev *vga_default_device(void)
>  {
> return vga_default;
> @@ -147,7 +144,6 @@ void vga_set_default_device(struct pci_dev *pdev)
> pci_dev_put(vga_default);
> vga_default = pci_dev_get(pdev);
>  }
> -#endif
>
>  static inline void 

[Bug 82371] Capa Verde (Radeon 7750) GPU lockup using UVD

2014-08-14 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=82371

--- Comment #2 from Michel Dänzer  ---
Sounds like https://bugs.freedesktop.org/show_bug.cgi?id=82428 , fixed in
current Mesa Git.



[Bug 82588] X fails to start with linus-tip or drm-next

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82588

--- Comment #3 from Michel Dänzer  ---
Does reverting
http://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-3.17=1490434f0da63afc6006411c8829c6a7935a4e7e
help?



[PATCH] drm/radeon/dpm: select the appropriate vce power state for KV/KB/ML

2014-08-14 Thread Alex Deucher
Compare the clock in the limits table to the requested evclk rather
than just taking the first value.

Signed-off-by: Alex Deucher 
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/radeon/kv_dpm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/kv_dpm.c b/drivers/gpu/drm/radeon/kv_dpm.c
index c667d83..8b58e11 100644
--- a/drivers/gpu/drm/radeon/kv_dpm.c
+++ b/drivers/gpu/drm/radeon/kv_dpm.c
@@ -1438,14 +1438,14 @@ static int kv_update_uvd_dpm(struct radeon_device 
*rdev, bool gate)
return kv_enable_uvd_dpm(rdev, !gate);
 }

-static u8 kv_get_vce_boot_level(struct radeon_device *rdev)
+static u8 kv_get_vce_boot_level(struct radeon_device *rdev, u32 evclk)
 {
u8 i;
struct radeon_vce_clock_voltage_dependency_table *table =
&rdev->pm.dpm.dyn_state.vce_clock_voltage_dependency_table;

for (i = 0; i < table->count; i++) {
-   if (table->entries[i].evclk >= 0) /* XXX */
+   if (table->entries[i].evclk >= evclk)
break;
}

@@ -1468,7 +1468,7 @@ static int kv_update_vce_dpm(struct radeon_device *rdev,
if (pi->caps_stable_p_state)
pi->vce_boot_level = table->count - 1;
else
-   pi->vce_boot_level = kv_get_vce_boot_level(rdev);
+   pi->vce_boot_level = kv_get_vce_boot_level(rdev, 
radeon_new_state->evclk);

ret = kv_copy_bytes_to_smc(rdev,
   pi->dpm_table_start +
-- 
1.8.3.1



[Bug 82050] R9270X pyrit benchmark perf regressions with latest kernel/llvm

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82050

--- Comment #13 from Michel Dänzer  ---
Does reverting Mesa commit 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 help for
Valley or pyrit with the latest kernel?



[Bug 82586] UBO matrix in std140 struct does not work

2014-08-14 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=82586

Ian Romanick  changed:

   What|Removed |Added

 Status|NEW |NEEDINFO

--- Comment #1 from Ian Romanick  ---
What is the failure mode?  A bunch of changes recently occurred in the GLSL
compiler (especially in the linker) for UBO matrices.  What version of Mesa are
you on?  Can you try a build of master?
