date:20140904

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

--- Comment #3 from smoki  ---

The screenshot somehow reminds me of the Unigine Sanctuary shadow artifacts, so
this looks like a HyperZ issue to me... I guess yes, but did you have it enabled?

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL:

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

--- Comment #4 from smoki  ---
Maybe the very same commit that fixes Sanctuary broke Europa?


http://cgit.freedesktop.org/mesa/mesa/commit/?id=91050ff2154417d7f3a16b582f28c8bbdcea6cfb


TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

Hi folks !

I've been tracking down some problems with the recent DRI on powerpc and
stumbled upon something that doesn't look right, and not necessarily
only for us.

Now it's possible that I haven't fully understood the code here and I
also don't know to what extent some of that behaviour is necessary for
some platforms such as Intel GTT bits.

What I've observed with a simple/dumb (no DMA) driver like AST (but this
probably happens more generally) is that when evicting a BO from VRAM
into System memory, the TTM tries to preserve the existing caching
attributes of the VRAM object.

From what I can tell, we end up with going from VRAM to System memory
type, and we eventually call ttm_bo_select_caching() to select the
caching option for the target.

This will, from what I can tell, try to use the same caching mode as the
original object:

if ((cur_placement & caching) != 0)
result |= (cur_placement & caching);

And cur_placement comes from bo->mem.placement which as far as I can
tell is based on the placement array which the drivers set up.

Now they tend to uniformly set up the placement for System memory as
TTM_PL_MASK_CACHING which enables all caching modes.

So I end up with, for example, my System memory BOs having
TTM_PL_FLAG_CACHED not set (though they also don't have
TTM_PL_FLAG_UNCACHED) and TTM_PL_FLAG_WC.

We don't seem to use the man->default_caching (which will have
TTM_PL_FLAG_CACHED) unless there is no matching bit at all between the
proposed placement and the existing caching mode.
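
The selection order described above can be modelled in user space. This is only a sketch: the flag values and the function name select_caching() are stand-ins chosen for the demo, and only the two branches discussed in this mail are modelled; the kernel sources remain authoritative.

```c
#include <stdint.h>

/* Stand-in flag values for this demo only. */
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

/*
 * Model of the behaviour described in the mail:
 *  1. keep the caching bits the current placement shares with the
 *     proposed placement;
 *  2. only when nothing overlaps, use the manager's default caching.
 */
static uint32_t select_caching(uint32_t default_caching,
                               uint32_t cur_placement,
                               uint32_t proposed_placement)
{
    uint32_t caching = proposed_placement & PL_MASK_CACHING;
    uint32_t result = proposed_placement & ~PL_MASK_CACHING;

    if (cur_placement & caching)
        result |= cur_placement & caching;
    else if (default_caching & caching)
        result |= default_caching;

    return result;
}
```

With a BO that was WC in VRAM and a System placement that allows every caching mode, the model keeps WC even though the default is CACHED, which is the situation reported here.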

Now this is a problem for several reasons that I can think of:

 - On a number of powerpc platforms, such as all our 64-bit server ones
for example, it's actually illegal to map system memory non-cached. The
system is fully cache coherent for all possible DMA originators (that we
care about at least) and mapping memory non-cachable while it's mapped
cachable in the linear mapping can cause a nasty cache paradox which, when
detected by HW, can checkstop the system.

 - A similar issue exists, afaik, on ARM >= v7, so anything mapped
non-cachable must be removed from the linear mapping explicitly since
otherwise it can be speculatively prefetched into the cache.

 - I don't know about x86, but even then, it looks quite sub-optimal to
map the memory backing of the BOs and access it using a WC rather than a
cachable mapping attribute.

Now, some folks on IRC mentioned that there might be reasons for the
current behaviour as to not change the caching attributes when going
in/out of the GTT on Intel, I don't know how that relates and how that
works, but maybe that should be enforced by having a different placement
mask specifically on those chipsets.

Dave, should we change the various PCI drivers for generally coherent
devices such that the System memory type doesn't allow placements
without CACHED attribute ? Or at least on coherent platforms ? How do
we detect that ? Should we have a TTM helper to establish the default
memory placement attributes that "normal PCI" drivers call to set that
up so we can have all the necessary arch ifdefs in one single place, at
least for "classic PCI/PCIe" stuff (AGP might need additional tweaks) ?

Non-PCI and "special" drivers like Intel can use a different set of
placement attributes to represent the requirements of those specific
platforms (mostly thinking of embedded ARM here which under some
circumstances might actually require non-cached mappings).
Or am I missing another part of the puzzle ?

As it is, things are broken for me even for dumb drivers, and I suspect
to a large extent with radeon and nouveau too, though in some case we
might get away with it most of the time ... until the machine locks up
for some unexplainable reason... This might cause problems on existing
distros such as RHEL7 with our radeon adapters even.

Any suggestion of what's the best approach to fix it ? I'm happy to
produce the patches but I'm not that familiar with the TTM so I would
like to make sure I'm on the right track first :-)

Cheers,
Ben.

[Bug 81644] Random crashes on RadeonSI with Chromium.

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #81 from Michel Dänzer  ---
(In reply to comment #80)
> Hmm, okay, I'll try again. I'll test for about 8 hours this time.

Thanks. Basically, you need to test for at least as long as it's ever taken for
the problem to appear. A multiple of that would be even better.


> I don't have the best grasp of how Mesa works, but I didn't think I screwed up
> as it seemed to be something related to multisampler blitting which sounds
> darn close to what is screwed up.

The meta code is only used by classic drivers, not by Gallium based drivers.


[Bug 83416] [radeonsi] Serious Sam 3 lockup during its start

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83416

--- Comment #13 from Michel Dänzer  ---
(In reply to comment #12)
> Can you try this patch?

The patch fixes the GPUVM faults for me while replaying the apitrace.


[Bug 83453] dpm not working in Gigabyte RADEON HD 7870 (bios F11)

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83453

Michel Dänzer  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
         QA Contact|xorg-team at lists.x.org       |
            Product|xorg                           |DRI
          Component|Driver/Radeon                  |DRM/Radeon

--- Comment #1 from Michel Dänzer  ---
Sounds like bug 73338.


TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Wed, 2014-09-03 at 21:55 -0400, Jerome Glisse wrote:
> So i think we need to get a platform flag and/or set_pages_array_wc|uc
> needs to fail and this would fall back to cached mapping if the fallback
> code still works. So if your arch properly returns an error for those
> cache changing functions then you should be fine.
> 
> This also means that we need to fix ttm_tt_set_placement_caching so that
> when it returns an error it switches to cached mapping. Which will always
> work.

Can't I just filter the mem_type definitions in the mem_type_manager
with something along the lines of this totally untested patch ?

Or do I *also* need to make those set_page_array_* things fail ?

--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1308,6 +1308,24 @@ int ttm_bo_evict_mm(struct ttm_bo_device *bdev, unsigned 
 }
 EXPORT_SYMBOL(ttm_bo_evict_mm);

+static void ttm_bo_filter_mem_type(struct ttm_bo_device *bdev, unsigned type,
+  struct ttm_mem_type_manager *man)
+{
+   /*
+    * On some architectures/platforms, we cannot allow non-cachable
+    * mappings of system memory. This can be a problem with AGP on
+    * old G5 systems vs. TTM_PL_TT but we don't really have a choice
+    * at this point on ppc64 at least and the AGP on these never
+    * worked reliably anyway.
+    */
+#if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
+   if (type == TTM_PL_SYSTEM || type == TTM_PL_TT) {
+   man->available_caching &= TTM_PL_FLAG_CACHED;
+   man->default_caching &= man->available_caching;
+   }
+#endif
+}
+
 int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
unsigned long p_size)
 {
@@ -1327,6 +1345,8 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned ty
return ret;
man->bdev = bdev;

+   ttm_bo_filter_mem_type(bdev, type, man);
+
ret = 0;
if (type != TTM_PL_SYSTEM) {
ret = (*man->func->init)(man, p_size);
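
The effect of the two masking statements in the patch above can be tried outside the kernel. The following is a sketch under stated assumptions: the struct, the enum, the flag values and the coherent_only parameter (standing in for the #if condition) are all hypothetical names for the demo, not TTM definitions.

```c
#include <stdint.h>

/* Stand-in flag values for this demo only. */
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

/* Hypothetical, reduced view of a memory type manager. */
struct mem_type_manager {
    uint32_t available_caching;
    uint32_t default_caching;
};

enum mem_type { MEM_SYSTEM = 0, MEM_TT = 1, MEM_VRAM = 2 };

/*
 * On a platform that must keep system memory cached (coherent_only != 0),
 * restrict System and TT managers to the CACHED mode. Other memory types
 * are left untouched.
 */
static void filter_mem_type(enum mem_type type,
                            struct mem_type_manager *man,
                            int coherent_only)
{
    if (!coherent_only)
        return;
    if (type == MEM_SYSTEM || type == MEM_TT) {
        man->available_caching &= PL_FLAG_CACHED;
        man->default_caching &= man->available_caching;
    }
}
```

One property worth noting in this model: a manager whose default caching was not CACHED to begin with (for example a WC-only default) ends up with an empty default_caching after the filter.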

TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Wed, 2014-09-03 at 22:07 -0400, Jerome Glisse wrote:

> So in the meantime the attached patch should work, it just silently ignores
> the caching attribute request on non x86 instead of pretending that things
> are set up as expected and then later the radeon or nouveau hw unsetting
> the snoop bit.
> 
> It's not tested but i think it should work.

I'm still getting placements with !CACHED going from bo_memcpy in
ttm_io_prot() though ... I'm looking at filtering the placement
attributes instead.

Ben.


[Bug 72921] DPM Power Cycle with AMD A8-6600K & MSI FM2-A55M-E33

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=72921

--- Comment #38 from Matt  ---
Where do I look to check if the patch has landed yet? I have been scanning git
repos and mailing lists since you stated that but am apparently looking in all
the wrong places. Or does it just take longer than 2 weeks?

This is the only commit I can find anywhere that seems to be related but it's
in 3.16:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=730a336c33a3398d65896e8ee3ef9f5679fe30a9


[Bug 83436] Sudden framerate drops in multiple games

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83436

--- Comment #8 from Michel Dänzer  ---
(In reply to comment #4)
> Unigine is affected, but it's 64bit.

You're saying it's affected by both the framerate drops and performance
decrease?

Can you guys bisect?


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #7 from Michel Dänzer  ---
(In reply to comment #6)
> my xorg.conf is empty. if i add your lines the complete screen turns black
> the next time i login making kubuntu unusable.

Sounds like you didn't create a valid xorg.conf. Something like this should
work:

Section "Device"
    Identifier "default"
    Option "AccelMethod" "glamor"
EndSection


> or is this update available through the update center?

No idea, but there are PPAs.


[Bug 83416] [radeonsi] Serious Sam 3 lockup during its start

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83416

--- Comment #14 from Laurent carlier  ---
(In reply to comment #12)
> Created attachment 105709 [details] [review]
> Fix suggested by Vadim
> 
> Can you try this patch?

It doesn't fix the lockup for me. I've tested both the trace and the game with
mesa-git and llvm 3.4.3, and both failed with the following error:

LLVM ERROR: Cannot select: 0x1671def0: i32 = truncate 0x16716ff4 [ORD=21] [ID=121]
  0x16716ff4: i128 = srl 0x1671cb14, 0x16717198 [ORD=21] [ID=102]
    0x1671cb14: i128,ch = load 0x166a9484, 0x167123bc, 0x16712e20<LD16[%32](tbaa=!"const")> [ORD=21] [ID=90]
      0x167123bc: i64,ch = CopyFromReg 0x166a9484, 0x16712330 [ID=81]
        0x16712330: i64 = Register %vreg66 [ID=2]
      0x16712e20: i64 = undef [ID=8]
    0x16717198: i32 = Constant<96> [ID=76]
In function: main


TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Wed, 2014-09-03 at 22:36 -0400, Jerome Glisse wrote:
> On Wed, Sep 03, 2014 at 10:31:18PM -0400, Jerome Glisse wrote:
> > On Thu, Sep 04, 2014 at 12:25:23PM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2014-09-03 at 22:07 -0400, Jerome Glisse wrote:
> > > 
> > > > So in the meantime the attached patch should work, it just silently ignores
> > > > the caching attribute request on non x86 instead of pretending that things
> > > > are set up as expected and then later the radeon or nouveau hw unsetting
> > > > the snoop bit.
> > > > 
> > > > It's not tested but i think it should work.
> > > 
> > > I'm still getting placements with !CACHED going from bo_memcpy in
> > > ttm_io_prot() though ... I'm looking at filtering the placement
> > > attributes instead.
> > > 
> > > Ben.
> > 
> > Ok so this one should do the trick.
> 
> Ok final version ... famous last word.

Minus a couple of obvious typos that prevent it from building, it seems
to do the trick for me with the AST driver, no more bad mappings.

I'll still send a patch that catches the incorrect mapping attempts
inside ttm_io_prot() and warns to help future debugging and avoid
"random" behaviour. (I need to fix other things in the powerpc code
in there anyway).
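
The kind of check mentioned above could look roughly like the following user-space sketch. It is not the patch referred to in this mail: the helper name check_mapping_attrs(), its parameters and the flag values are hypothetical, and a real version would live in the kernel and use its warning facilities.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in flag values for this demo only. */
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)

/*
 * Return 0 when the requested mapping is acceptable, -1 (after printing a
 * warning) when a non-cached mapping of system memory is requested.
 * is_system_ram is non-zero for memory that is also in the linear mapping.
 */
static int check_mapping_attrs(int is_system_ram, uint32_t caching_flags)
{
    if (is_system_ram && !(caching_flags & PL_FLAG_CACHED)) {
        fprintf(stderr,
                "warning: non-cached mapping of system memory (flags 0x%x)\n",
                (unsigned int)caching_flags);
        return -1;
    }
    return 0;
}
```

Failing loudly at the point where the page protection is computed makes a bad placement visible immediately instead of showing up later as an unexplained lockup.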

Cheers,
Ben.

[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #8 from simon  ---
"Something like this should work:"

and it did indeed. everything looks fine now.


[Bug 73338] Fan speed in idle at 40% with radeonsi and at 18% with catalyst

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=73338

--- Comment #23 from Zermond  ---
I have a similar problem with a Radeon HD 7870:
https://bugs.freedesktop.org/show_bug.cgi?id=83453


But there is a difference: in Windows, Catalyst says that the fan is spinning
at 40% and it is quiet, but on Linux it sounds much louder ...


[Bug 83453] dpm not working in Gigabyte RADEON HD 7870 (bios F11)

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83453

--- Comment #2 from Zermond  ---
Thank you, Michel Dänzer


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #9 from Michel Dänzer  ---
(In reply to comment #8)
> and it did indeed. everything looks fine now.

Great, can you attach another Xorg.0.log corresponding to that?


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #10 from simon  ---
Created attachment 105719
  --> https://bugs.freedesktop.org/attachment.cgi?id=105719&action=edit
updated xorg.0.log

updated xorg.0.log


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

simon  changed:

           What                   |Removed |Added
----------------------------------------------------------------------------
 Attachment #105660 is obsolete   |0       |1
 Attachment #105719 is obsolete   |0       |1

--- Comment #11 from simon  ---
Created attachment 105720
  --> https://bugs.freedesktop.org/attachment.cgi?id=105720&action=edit
updated xorg.0.log

updated xorg.0.log


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #12 from simon  ---
@Michel Dänzer

the problem remains. in the meantime i have updated my kde plasma next via
kubuntu ppa's like i am doing every day, and while the content of the xorg.conf
remains the same, i now have the black screen again :(

next stop mesa drivers?
could you tell me what exactly i have to look for in the ppa's, as this seems to
be a custom install and not included in the updates.


TTM placement & caching issue/questions

2014-09-04 Thread Gabriel Paubert

On Wed, Sep 03, 2014 at 10:36:57PM -0400, Jerome Glisse wrote:
> On Wed, Sep 03, 2014 at 10:31:18PM -0400, Jerome Glisse wrote:
> > On Thu, Sep 04, 2014 at 12:25:23PM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2014-09-03 at 22:07 -0400, Jerome Glisse wrote:
> > > 
> > > > So in the meantime the attached patch should work, it just silently ignores
> > > > the caching attribute request on non x86 instead of pretending that things
> > > > are set up as expected and then later the radeon or nouveau hw unsetting
> > > > the snoop bit.
> > > > 
> > > > It's not tested but i think it should work.
> > > 
> > > I'm still getting placements with !CACHED going from bo_memcpy in
> > > ttm_io_prot() though ... I'm looking at filtering the placement
> > > attributes instead.
> > > 
> > > Ben.
> > 
> > Ok so this one should do the trick.
> 
> Ok final version ... famous last word.
[snipped older version]
> From 236038e18dc303bb9aa877922e01963d3fb0b7af Mon Sep 17 00:00:00 2001
> From: Jérôme Glisse 
> Date: Wed, 3 Sep 2014 22:04:34 -0400
> Subject: [PATCH] drm/ttm: force cached mapping on non x86 platform.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> People interested in providing uncached or write combined mapping
> on there architecture need to do the ground work inside there arch

s/there/their/g

> specific code to allow to break the linear kernel mapping so that
> page mapping attributes can be updated, in the meantime force cached
> mapping for non x86 architecture.
> 
> Signed-off-by: Jérôme Glisse 
> ---
>  drivers/gpu/drm/radeon/radeon_ttm.c |  2 +-
>  drivers/gpu/drm/ttm/ttm_bo.c        |  2 +-
>  drivers/gpu/drm/ttm/ttm_tt.c        | 32 +++++++++++++++++++++-----------
>  include/drm/ttm/ttm_bo_driver.h     |  2 +-
>  4 files changed, 24 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 72afe82..4dd5060 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -304,7 +304,7 @@ static int radeon_move_vram_ram(struct ttm_buffer_object *bo,
>   return r;
>   }
>  
> - r = ttm_tt_set_placement_caching(bo->ttm, tmp_mem.placement);
> + r = ttm_tt_set_placement_caching(bo->ttm, &tmp_mem.placement);
>   if (unlikely(r)) {
>   goto out_cleanup;
>   }
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 3da89d5..4dc21c3 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -305,7 +305,7 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
>   goto out_err;
>   }
>  
> - ret = ttm_tt_set_placement_caching(bo->ttm, mem->placement);
> + ret = ttm_tt_set_placement_caching(bo->ttm, &mem->placement);
>   if (ret)
>   goto out_err;
>  
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index bf080ab..a0df803 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -89,14 +89,6 @@ static inline int ttm_tt_set_page_caching(struct page *p,
>  
>   return ret;
>  }
> -#else /* CONFIG_X86 */
> -static inline int ttm_tt_set_page_caching(struct page *p,
> -   enum ttm_caching_state c_old,
> -   enum ttm_caching_state c_new)
> -{
> - return 0;
> -}
> -#endif /* CONFIG_X86 */
>  
>  /*
>   * Change caching policy for the linear kernel map
> @@ -149,19 +141,37 @@ out_err:
>   return ret;
>  }
>  
> -int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t placement)
> +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
>  {
>   enum ttm_caching_state state;
>  
> - if (placement & TTM_PL_FLAG_WC)
> + if (*placement & TTM_PL_FLAG_WC)
>   state = tt_wc;
> - else if (placement & TTM_PL_FLAG_UNCACHED)
> + else if (*placement & TTM_PL_FLAG_UNCACHED)
>   state = tt_uncached;
>   else
>   state = tt_cached;
>  
>   return ttm_tt_set_caching(ttm, state);
>  }
> +#else /* CONFIG_X86 */
> +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
> +{
> + if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
> + ttm->caching_state = tt_cached;
> + *placement &= ~TTM_PL_MASK_CACHING;
> + *placement |= TTM_PL_FLAG_CACHED;
> + } else {
> + if (*placement & TTM_PL_FLAG_WC)
> + ttm->caching_state = tt_wc;
> + else if (placement & TTM_PL_FLAG_UNCACHED)
> + ttm->caching_state = tt_uncached;
> + else
> + ttm->caching_state = tt_cached;
> + }
> + return 0;
> +}
> +#endif /* CONFIG_X86 */
>
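
The non-x86 branch of the quoted patch can be modelled in user space to see what it does to a placement word. This is a sketch, not the patch: the flag values, the enum and the name force_cached_placement() are stand-ins for the demo, and testing flag bits for both the System and TT memory types is an assumption about the intent of the first condition.

```c
#include <stdint.h>

/* Stand-in flag values for this demo only. */
#define PL_FLAG_SYSTEM   (1u << 0)
#define PL_FLAG_TT       (1u << 1)
#define PL_FLAG_VRAM     (1u << 2)
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

enum caching_state { CS_UNCACHED, CS_WC, CS_CACHED };

/*
 * For System or TT placements, rewrite the caching bits in place so that
 * only CACHED remains, and report a cached state. For anything else,
 * leave the placement alone and derive the state from its flags.
 */
static enum caching_state force_cached_placement(uint32_t *placement)
{
    if (*placement & (PL_FLAG_SYSTEM | PL_FLAG_TT)) {
        *placement &= ~PL_MASK_CACHING;
        *placement |= PL_FLAG_CACHED;
        return CS_CACHED;
    }
    if (*placement & PL_FLAG_WC)
        return CS_WC;
    if (*placement & PL_FLAG_UNCACHED)
        return CS_UNCACHED;
    return CS_CACHED;
}
```

Passing the placement by pointer is what lets the caller see the corrected flags, which is why the call sites in the patch change to pass an address.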

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

--- Comment #5 from smoki  ---

Actually, it happens later; I bisected this to:

 r600g,radeonsi: initialize HTILE to fully-expanded state


http://cgit.freedesktop.org/mesa/mesa/commit/?id=f05fe294e7e8dfb08be172f426252192c0ba17ab


TTM placement & caching issue/questions

2014-09-04 Thread Michel Dänzer

On 04.09.2014 10:55, Jerome Glisse wrote:
>
> While i agree about the issue of incoherent double map of same page, i
> think we have more issues. For instance lately AMD has been pushing a
> lot of patches to move things to use uncached memory for radeon and as
> usual those patches come with no comment on the motivations of those
> changes.

That would have been a fair review comment...


> What i understand is that uncached mapping for some frequently used buffers
> gives a significant performance boost (i am assuming this has to do with
> all the snoop pci transaction overhead).

Exactly, although it's a win even if the data is written by the CPU only 
once and read by the GPU only once.


> This also means that we need to fix ttm_tt_set_placement_caching so that
> when it returns an error it switches to cached mapping. Which will always
> work.

GTT with AGP being one exception.


-- 
Earthling Michel Dänzer    |  http://www.amd.com
Libre software enthusiast  |  Mesa and X developer

[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #13 from simon  ---
@Michel Dänzer

also the entry in the xorg.conf seems to be causing some graphical glitches in
kde4 
see here:
http://s30.postimg.org/dorcvsxi9/bug1.jpg


TTM placement & caching issue/questions

2014-09-04 Thread Michel Dänzer

On 04.09.2014 11:36, Jerome Glisse wrote:
> On Wed, Sep 03, 2014 at 10:31:18PM -0400, Jerome Glisse wrote:
>> On Thu, Sep 04, 2014 at 12:25:23PM +1000, Benjamin Herrenschmidt wrote:
>>> On Wed, 2014-09-03 at 22:07 -0400, Jerome Glisse wrote:
>>>
>>>> So in the meantime the attached patch should work, it just silently ignores
>>>> the caching attribute request on non x86 instead of pretending that things
>>>> are set up as expected and then later the radeon or nouveau hw unsetting
>>>> the snoop bit.
>>>>
>>>> It's not tested but i think it should work.
>>>
>>> I'm still getting placements with !CACHED going from bo_memcpy in
>>> ttm_io_prot() though ... I'm looking at filtering the placement
>>> attributes instead.
>>>
>>> Ben.
>>
>> Ok so this one should do the trick.
>
> Ok final version ... famous last word.

[...]

> +#else /* CONFIG_X86 */
> +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
> +{
> + if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
> + ttm->caching_state = tt_cached;
> + *placement &= ~TTM_PL_MASK_CACHING;
> + *placement |= TTM_PL_FLAG_CACHED;

NAK, this will break AGP on PowerMacs.


-- 
Earthling Michel Dänzer    |  http://www.amd.com
Libre software enthusiast  |  Mesa and X developer

[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #14 from Michel Dänzer  ---
So this is probably indeed a Mesa issue, and you can remove the xorg.conf file
again.

Sorry, I can't help you with Ubuntu packages.


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #15 from simon  ---
so where do i go from here?
the kubuntu/KDE guys sent me over to you to file the bug.


[PATCH 1/9] drm/ast: Try to use MMIO registers when PIO isn't supported

2014-09-04 Thread Benjamin Herrenschmidt

If the PIO resources haven't been assigned, then we have no choice
but try to use the MMIO version. This is the case for example on
POWER8 which doesn't support PIO at all.

Chips rev 0x20 or later have MMIO decoding enabled by default.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ast/ast_drv.h  |  5 ++++-
 drivers/gpu/drm/ast/ast_main.c | 20 +++++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 957d4fa..a203a6f 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -125,8 +125,9 @@ struct ast_gem_object;

 #define AST_IO_AR_PORT_WRITE   (0x40)
 #define AST_IO_MISC_PORT_WRITE (0x42)
+#define AST_IO_VGA_ENABLE_PORT (0x43)
 #define AST_IO_SEQ_PORT(0x44)
-#define AST_DAC_INDEX_READ (0x3c7)
+#define AST_IO_DAC_INDEX_READ  (0x47)
 #define AST_IO_DAC_INDEX_WRITE (0x48)
 #define AST_IO_DAC_DATA(0x49)
 #define AST_IO_GR_PORT (0x4E)
@@ -134,6 +135,8 @@ struct ast_gem_object;
 #define AST_IO_INPUT_STATUS1_READ  (0x5A)
 #define AST_IO_MISC_PORT_READ  (0x4C)

+#define AST_IO_MM_OFFSET   (0x380)
+
 #define __ast_read(x) \
 static inline u##x ast_read##x(struct ast_private *ast, u32 reg) { \
 u##x val = 0;\
diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index a2cc6be..c2ff793 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -359,10 +359,24 @@ int ast_driver_load(struct drm_device *dev, unsigned long flags)
ret = -EIO;
goto out_free;
}
-   ast->ioregs = pci_iomap(dev->pdev, 2, 0);
+
+   /*
+* If we don't have IO space at all, use MMIO now and
+* assume the chip has MMIO enabled by default (rev 0x20
+* and higher).
+*/
+   if (!(pci_resource_flags(dev->pdev, 2) & IORESOURCE_IO)) {
+   DRM_INFO("platform has no IO space, trying MMIO\n");
+   ast->ioregs = ast->regs + AST_IO_MM_OFFSET;
+   }
+
+   /* "map" IO regs if the above hasn't done so already */
if (!ast->ioregs) {
-   ret = -EIO;
-   goto out_free;
+   ast->ioregs = pci_iomap(dev->pdev, 2, 0);
+   if (!ast->ioregs) {
+   ret = -EIO;
+   goto out_free;
+   }
}

ast_detect_chip(dev);
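
The fallback order introduced by this patch (MMIO window when the BAR is not an IO resource, otherwise map the PIO BAR) can be exercised in isolation. The sketch below is a user-space model under assumptions: struct fake_ast, pick_ioregs(), RES_IO and the pio_map parameter (standing in for the result of mapping BAR 2) are hypothetical names for the demo.

```c
#include <stddef.h>
#include <stdint.h>

#define RES_IO       0x00000100u /* stand-in for the "is an IO resource" flag */
#define IO_MM_OFFSET 0x380u      /* offset of the IO registers in the MMIO window */

/* Hypothetical, reduced view of the driver's private data. */
struct fake_ast {
    uint8_t *regs;   /* MMIO register window, already mapped */
    uint8_t *ioregs; /* where the VGA-style IO registers are reached */
};

/*
 * Choose how the IO registers are accessed.
 * bar2_flags: resource flags of BAR 2.
 * pio_map:    result of mapping BAR 2, or NULL if that mapping failed.
 * Returns 0 on success, -1 if no access method is available.
 */
static int pick_ioregs(struct fake_ast *ast, unsigned long bar2_flags,
                       uint8_t *pio_map)
{
    /* No IO space at all: assume MMIO decoding is enabled and use it. */
    if (!(bar2_flags & RES_IO))
        ast->ioregs = ast->regs + IO_MM_OFFSET;

    /* Otherwise fall back to the mapped PIO BAR. */
    if (!ast->ioregs) {
        ast->ioregs = pio_map;
        if (!ast->ioregs)
            return -1;
    }
    return 0;
}
```

Deciding on the resource flags rather than on a failed mapping keeps the existing PIO path unchanged on platforms that do have IO space.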

TTM placement & caching issue/questions

2014-09-04 Thread Thomas Hellstrom

Hi!

Let me try to bring some clarity and suggestions into this.

On 09/04/2014 02:12 AM, Benjamin Herrenschmidt wrote:
> Hi folks !
>
> I've been tracking down some problems with the recent DRI on powerpc and
> stumbled upon something that doesn't look right, and not necessarily
> only for us.
>
> Now it's possible that I haven't fully understood the code here and I
> also don't know to what extent some of that behaviour is necessary for
> some platforms such as Intel GTT bits.
>
> What I've observed with a simple/dumb (no DMA) driver like AST (but this
> probably happens more generally) is that when evicting a BO from VRAM
> into System memory, the TTM tries to preserve the existing caching
> attributes of the VRAM object.
>
> From what I can tell, we end up with going from VRAM to System memory
> type, and we eventually call ttm_bo_select_caching() to select the
> caching option for the target.
>
> This will, from what I can tell, try to use the same caching mode as the
> original object:
>
>   if ((cur_placement & caching) != 0)
>   result |= (cur_placement & caching);
>
> And cur_placement comes from bo->mem.placement which as far as I can
> tell is based on the placement array which the drivers set up.

This originates from the fact that when evicting GTT memory, on x86 it's
unnecessary and undesirable to switch caching mode when going to system.

>
> Now they tend to uniformly setup the placement for System memory as
> TTM_PL_MASK_CACHING which enables all caching modes.
>
> So I end up with, for example, my System memory BOs having
> TTM_PL_FLAG_CACHED not set (though they also don't have
> TTM_PL_FLAG_UNCACHED) and TTM_PL_FLAG_WC.
>
> We don't seem to use the man->default_caching (which will have
> TTM_PL_FLAG_CACHED) unless there is no matching bit at all between the
> proposed placement and the existing caching mode.
>
> Now this is a problem for several reasons that I can think of:
>
>  - On a number of powerpc platforms, such as all our server 64-bit ones
> for example, it's actually illegal to map system memory non-cached. The
> system is fully cache coherent for all possible DMA originators (that we
> care about at least) and mapping memory non-cachable while it's mapped
> cachable in the linear mapping can cause nasty cache paradox which, when
> detected by HW, can checkstop the system.
>
>  - A similar issue exists, afaik, on ARM >= v7, so anything mapped
> non-cachable must be removed from the linear mapping explicitly since
> otherwise it can be speculatively prefetched into the cache.
>
>  - I don't know about x86, but even then, it looks quite sub-optimal to
> map the memory backing of the BOs and access it using a WC rather than a
> cachable mapping attribute.

Last time I tested, (and it seems like Michel is on the same track),
writing with the CPU to write-combined memory was substantially faster
than writing to cached memory, with the additional side-effect that CPU
caches are left unpolluted.

Moreover (although only tested on Intel's embedded chipsets), texturing
from cpu-cache-coherent PCI memory was a real GPU performance hog
compared to texturing from non-snooped memory. Hence, whenever a buffer
could be classified as GPU-read-only (or almost at least), it should be
placed in write-combined memory.

>
> Now, some folks on IRC mentioned that there might be reasons for the
> current behaviour as to not change the caching attributes when going
> in/out of the GTT on Intel, I don't know how that relates and how that
> works, but maybe that should be enforced by having a different placement
> mask specifically on those chipsets.
>
> Dave, should we change the various PCI drivers for generally coherent
> devices such that the System memory type doesn't allow placements
> without CACHED attribute ? Or at least on coherent platforms ? How do we
> detect that ? Should we have a TTM helper to establish the default
> memory placement attributes that "normal PCI" drivers call to set that
> up so we can have all the necessary arch ifdefs in one single place, at
> least for "classic PCI/PCIe" stuff (AGP might need additional tweaks) ?
>
> Non-PCI and "special" drivers like Intel can use a different set of
> placement attributes to represent the requirements of those specific
> platforms (mostly thinking of embedded ARM here which under some
> circumstances might actually require non-cached mappings).
> Or am I missing another part of the puzzle ?
>
> As it-is, things are broken for me even for dumb drivers, and I suspect
> to a large extent with radeon and nouveau too, though in some case we
> might get away with it most of the time ... until the machine locks up
> for some unexplainable reason... This might cause problems on existing
> distros such as RHEL7 with our radeon adapters even.
>
> Any suggestion of what's the best approach to fix it ? I'm happy to
> produce the patches but I'm not that familiar with the TTM so I would
> like to make sure I'm on the right track first
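
The selection behaviour discussed in this thread can be modelled in a few lines of userspace C. This is only a sketch: the flag values are stand-ins rather than the kernel's, select_caching() is a hypothetical name, and only the two branches actually described in the thread (keep the current mode, else fall back to the manager default) are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values for this sketch only (not the kernel's). */
#define PL_FLAG_CACHED   (1u << 0)
#define PL_FLAG_UNCACHED (1u << 1)
#define PL_FLAG_WC       (1u << 2)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

static uint32_t select_caching(uint32_t default_caching,
			       uint32_t cur_placement,
			       uint32_t proposed_placement)
{
	uint32_t caching = proposed_placement & PL_MASK_CACHING;
	uint32_t result = proposed_placement & ~PL_MASK_CACHING;

	/* Keep the current caching mode if the target allows it. */
	if ((cur_placement & caching) != 0)
		result |= cur_placement & caching;
	/* Otherwise use the memory type's default, if allowed. */
	else if ((default_caching & caching) != 0)
		result |= default_caching;
	/* Any further fallbacks are left out of this sketch. */

	return result;
}
```

A write-combined VRAM object evicted to a system placement that allows every caching mode stays write-combined in this model. Restricting the system placement mask to CACHED is enough to make the default win.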

[PATCH 2/9] drm/ast: POST chip at probe time if VGA not enabled

2014-09-04 Thread Benjamin Herrenschmidt

We need to do it on machines without a BIOS such as POWER8. Also
for detection to work without triggering PCIe errors, we need
to enable VGA early on, inside ast_detect_chip().

While touching those files, replace a few hard coded register
numbers with the corresponding symbolic constant.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ast/ast_drv.h  |  3 +++
 drivers/gpu/drm/ast/ast_main.c | 47 --
 drivers/gpu/drm/ast/ast_post.c | 23 +
 3 files changed, 62 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index a203a6f..78fc683 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -384,6 +384,9 @@ int ast_bo_push_sysram(struct ast_bo *bo);
 int ast_mmap(struct file *filp, struct vm_area_struct *vma);

 /* ast post */
+void ast_enable_vga(struct drm_device *dev);
+void ast_enable_mmio(struct drm_device *dev);
+bool ast_is_vga_enabled(struct drm_device *dev);
 void ast_post_gpu(struct drm_device *dev);
 u32 ast_mindwm(struct ast_private *ast, u32 r);
 void ast_moutdwm(struct ast_private *ast, u32 r, u32 v);
diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index c2ff793..556d065 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -63,7 +63,7 @@ uint8_t ast_get_index_reg_mask(struct ast_private *ast,
 }


-static int ast_detect_chip(struct drm_device *dev)
+static int ast_detect_chip(struct drm_device *dev, bool *need_post)
 {
struct ast_private *ast = dev->dev_private;
uint32_t data, jreg;
@@ -109,6 +109,21 @@ static int ast_detect_chip(struct drm_device *dev)
}
}

+   /*
+* If VGA isn't enabled, we need to enable now or subsequent
+* access to the scratch registers will fail. We also inform
+* our caller that it needs to POST the chip
+* (Assumption: VGA not enabled -> need to POST)
+*/
+   if (!ast_is_vga_enabled(dev)) {
+   ast_enable_vga(dev);
+   ast_enable_mmio(dev);
+   DRM_INFO("VGA not enabled on entry, requesting chip POST\n");
+   *need_post = true;
+   } else
+   *need_post = false;
+
+   /* Check if we support wide screen */
switch (ast->chip) {
case AST1180:
ast->support_wide_screen = true;
@@ -124,6 +139,7 @@ static int ast_detect_chip(struct drm_device *dev)
ast->support_wide_screen = true;
else {
ast->support_wide_screen = false;
+   /* Read SCU7c (silicon revision register) */
ast_write32(ast, 0xf004, 0x1e6e0000);
ast_write32(ast, 0xf000, 0x1);
data = ast_read32(ast, 0x1207c);
@@ -136,11 +152,23 @@ static int ast_detect_chip(struct drm_device *dev)
break;
}

+   /* Check 3rd Tx option (digital output afaik) */
ast->tx_chip_type = AST_TX_NONE;
+
+   /*
+* VGACRA3 Enhanced Color Mode Register, check if DVO is already
+* enabled, in that case, assume we have a SIL164 TMDS transmitter
+*/
jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xa3, 0xff);
if (jreg & 0x80)
ast->tx_chip_type = AST_TX_SIL164;
+
if ((ast->chip == AST2300) || (ast->chip == AST2400)) {
+   /*
+* On AST2300 and 2400, look the configuration set by the SoC in
+* the SOC scratch register #1 bits 11:8 (interestingly marked
+* as "reserved" in the spec
+*/
jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xd1, 0xff);
switch (jreg) {
case 0x04:
@@ -161,6 +189,17 @@ static int ast_detect_chip(struct drm_device *dev)
}
}

+   /* Print stuff for diagnostic purposes */
+   switch(ast->tx_chip_type) {
+   case AST_TX_SIL164:
+   DRM_INFO("Using Sil164 TMDS transmitter\n");
+   break;
+   case AST_TX_DP501:
+   DRM_INFO("Using DP501 DisplayPort transmitter\n");
+   break;
+   default:
+   DRM_INFO("Analog VGA only\n");
+   }
return 0;
 }

@@ -345,6 +384,7 @@ static u32 ast_get_vram_info(struct drm_device *dev)
 int ast_driver_load(struct drm_device *dev, unsigned long flags)
 {
struct ast_private *ast;
+   bool need_post;
int ret = 0;

ast = kzalloc(sizeof(struct ast_private), GFP_KERNEL);
@@ -379,7 +419,7 @@ int ast_driver_load(struct drm_device *dev, unsigned long flags)
}
}

-   ast_detect_chip(dev);
+   ast_detect_chip(dev, &need_post);

if (ast->chip != AST1180) {
ast_get_dram_info(dev);
@@ -387,6 +427,9 @@ int ast_driver_load(struct drm_device *dev,
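
The out-parameter flow this patch introduces can be sketched as a small userspace model. The struct and function names below are hypothetical stand-ins. Since the last hunk is cut off here, what the caller does with need_post is an assumption based on the patch title (POST the chip at probe time).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state this sketch cares about. */
struct fake_ast {
	bool vga_enabled;
	bool posted;
};

/* Models ast_detect_chip(): enable VGA if needed and report it. */
static int detect_chip(struct fake_ast *ast, bool *need_post)
{
	if (!ast->vga_enabled) {
		/* Stands in for ast_enable_vga() + ast_enable_mmio(). */
		ast->vga_enabled = true;
		*need_post = true;
	} else {
		*need_post = false;
	}
	return 0;
}

/* Models the caller; the POST step is an assumption, see above. */
static int driver_load(struct fake_ast *ast)
{
	bool need_post;
	int ret = detect_chip(ast, &need_post);

	if (ret)
		return ret;
	if (need_post)
		ast->posted = true;	/* stands in for ast_post_gpu() */
	return 0;
}
```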

[PATCH 4/9] drm/ast: Don't assume DVO enabled means SIL164 on uninitialized chips

2014-09-04 Thread Benjamin Herrenschmidt

It looks like the AST2400 comes up with the DVO enable bit set,
which causes us to incorrectly assume we have a SIL164 regardless
of the value of the scratch registers setup by the BMC firmware.

So let's limit that test to the case where the chip has already
been setup by a BIOS.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ast/ast_main.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index 556d065..48998b2 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -158,16 +158,22 @@ static int ast_detect_chip(struct drm_device *dev, bool *need_post)
/*
 * VGACRA3 Enhanced Color Mode Register, check if DVO is already
 * enabled, in that case, assume we have a SIL164 TMDS transmitter
+*
+* Don't make that assumption if the chip wasn't enabled and
+* is at power-on reset, otherwise we'll incorrectly "detect" a
+* SIL164 when there is none.
 */
-   jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xa3, 0xff);
-   if (jreg & 0x80)
-   ast->tx_chip_type = AST_TX_SIL164;
+   if (!*need_post) {
+   jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xa3, 0xff);
+   if (jreg & 0x80)
+   ast->tx_chip_type = AST_TX_SIL164;
+   }

if ((ast->chip == AST2300) || (ast->chip == AST2400)) {
/*
 * On AST2300 and 2400, look the configuration set by the SoC in
 * the SOC scratch register #1 bits 11:8 (interestingly marked
-* as "reserved" in the spec
+* as "reserved" in the spec)
 */
jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xd1, 0xff);
switch (jreg) {
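
The gating this patch adds can be reduced to a small pure function. This is a sketch with hypothetical names: cra3 stands for the value read back from VGACRA3, and the later scratch-register check on AST2300/AST2400 is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tx_type { TX_NONE, TX_SIL164 };

/*
 * Trust the DVO enable bit (bit 7 of VGACRA3) only when the chip was
 * already set up, i.e. when no POST is needed.
 */
static enum tx_type detect_tx_from_dvo(bool need_post, uint8_t cra3)
{
	if (!need_post && (cra3 & 0x80))
		return TX_SIL164;
	return TX_NONE;
}
```

On a chip at power-on reset (need_post true) the DVO bit is ignored, so a set bit no longer yields a false SIL164 detection.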

[PATCH 5/9] drm/ast: Cleanup analog init code path

2014-09-04 Thread Benjamin Herrenschmidt

Move the MMIO mangling to a separate routine and actually
disable the DVO output when using pure analog.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ast/ast_dp501.c | 49 ++---
 1 file changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index 7e2ddde..76f07f3 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -379,11 +379,39 @@ static bool ast_init_dvo(struct drm_device *dev)
return true;
 }

+
+static void ast_init_analog(struct drm_device *dev)
+{
+   struct ast_private *ast = dev->dev_private;
+   u32 data;
+
+   /*
+* Set DAC source to VGA mode in SCU2C via the P2A
+* bridge. First configure the P2U to target the SCU
+* in case it isn't at this stage.
+*/
+   ast_write32(ast, 0xf004, 0x1e6e0000);
+   ast_write32(ast, 0xf000, 0x1);
+
+   /* Then unlock the SCU with the magic password */
+   ast_write32(ast, 0x12000, 0x1688a8a8);
+   ast_write32(ast, 0x12000, 0x1688a8a8);
+   ast_write32(ast, 0x12000, 0x1688a8a8);
+
+   /* Finally, clear bits [17:16] of SCU2c */
+   data = ast_read32(ast, 0x1202c);
+   data &= 0xfffcffff;
+   ast_write32(ast, 0, data);
+
+   /* Disable DVO */
+   ast_set_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xa3, 0xcf, 0x00);
+}
+
 void ast_init_3rdtx(struct drm_device *dev)
 {
struct ast_private *ast = dev->dev_private;
u8 jreg;
-   u32 data;
+
if (ast->chip == AST2300 || ast->chip == AST2400) {
jreg = ast_get_index_reg_mask(ast, AST_IO_CRTC_PORT, 0xd1, 0xff);
switch (jreg & 0x0e) {
@@ -399,23 +427,8 @@ void ast_init_3rdtx(struct drm_device *dev)
default:
if (ast->tx_chip_type == AST_TX_SIL164)
ast_init_dvo(dev);
-   else {
-   /*
-* Set DAC source to VGA mode in SCU2C via the P2A
-* bridge. First configure the P2U to target the SCU
-* in case it isn't at this stage.
-*/
-   ast_write32(ast, 0xf004, 0x1e6e0000);
-   ast_write32(ast, 0xf000, 0x1);
-   /* Then unlock the SCU with the magic password */
-   ast_write32(ast, 0x12000, 0x1688a8a8);
-   ast_write32(ast, 0x12000, 0x1688a8a8);
-   ast_write32(ast, 0x12000, 0x1688a8a8);
-   /* Finally, clear bits [17:16] of SCU2c */
-   data = ast_read32(ast, 0x1202c);
-   data &= 0xfffcffff;
-   ast_write32(ast, 0, data);
-   }
+   else
+   ast_init_analog(dev);
}
}
 }
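
For reference, the mask that clears bits [17:16] of a 32-bit register, as the comment in ast_init_analog() describes, works out to 0xfffcffff. A minimal sketch of that arithmetic follows; clear_bits() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the inclusive bit range [hi:lo] of a 32-bit value. */
static uint32_t clear_bits(uint32_t value, unsigned int hi, unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	uint32_t field = (width >= 32) ? UINT32_MAX
				       : ((UINT32_C(1) << width) - 1);

	return value & ~(field << lo);
}
```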

[PATCH 6/9] drm/ttm: force cached mapping on non x86 platform

2014-09-04 Thread Benjamin Herrenschmidt

From: Jérôme Glisse 

People interested in providing uncached or write-combined mappings
on their architecture need to do the groundwork inside their
arch-specific code to allow breaking the linear kernel mapping so that
page mapping attributes can be updated. In the meantime, force cached
mappings for non-x86 architectures.

Signed-off-by: Jérôme Glisse 
Signed-off-by: Benjamin Herrenschmidt 
---

[Minor compile fixes on top of Jerome original v3]

 drivers/gpu/drm/radeon/radeon_ttm.c |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c|  2 +-
 drivers/gpu/drm/ttm/ttm_bo_util.c   |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c| 32 +---
 include/drm/ttm/ttm_bo_driver.h |  2 +-
 5 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 72afe82..4dd5060 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -304,7 +304,7 @@ static int radeon_move_vram_ram(struct ttm_buffer_object *bo,
return r;
}

-   r = ttm_tt_set_placement_caching(bo->ttm, tmp_mem.placement);
+   r = ttm_tt_set_placement_caching(bo->ttm, &tmp_mem.placement);
if (unlikely(r)) {
goto out_cleanup;
}
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 3da89d5..4dc21c3 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -305,7 +305,7 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object *bo,
goto out_err;
}

-   ret = ttm_tt_set_placement_caching(bo->ttm, mem->placement);
+   ret = ttm_tt_set_placement_caching(bo->ttm, &mem->placement);
if (ret)
goto out_err;

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 30e5d90..e31d48c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -59,7 +59,7 @@ int ttm_bo_move_ttm(struct ttm_buffer_object *bo,
old_mem->mem_type = TTM_PL_SYSTEM;
}

-   ret = ttm_tt_set_placement_caching(ttm, new_mem->placement);
+   ret = ttm_tt_set_placement_caching(ttm, &new_mem->placement);
if (unlikely(ret != 0))
return ret;

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index bf080ab..19ae8ee 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -89,14 +89,6 @@ static inline int ttm_tt_set_page_caching(struct page *p,

return ret;
 }
-#else /* CONFIG_X86 */
-static inline int ttm_tt_set_page_caching(struct page *p,
- enum ttm_caching_state c_old,
- enum ttm_caching_state c_new)
-{
-   return 0;
-}
-#endif /* CONFIG_X86 */

 /*
  * Change caching policy for the linear kernel map
@@ -149,19 +141,37 @@ out_err:
return ret;
 }

-int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t placement)
+int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
 {
enum ttm_caching_state state;

-   if (placement & TTM_PL_FLAG_WC)
+   if (*placement & TTM_PL_FLAG_WC)
state = tt_wc;
-   else if (placement & TTM_PL_FLAG_UNCACHED)
+   else if (*placement & TTM_PL_FLAG_UNCACHED)
state = tt_uncached;
else
state = tt_cached;

return ttm_tt_set_caching(ttm, state);
 }
+#else /* CONFIG_X86 */
+int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
+{
+   if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
+   ttm->caching_state = tt_cached;
+   *placement &= ~TTM_PL_MASK_CACHING;
+   *placement |= TTM_PL_FLAG_CACHED;
+   } else {
+   if (*placement & TTM_PL_FLAG_WC)
+   ttm->caching_state = tt_wc;
+   else if (*placement & TTM_PL_FLAG_UNCACHED)
+   ttm->caching_state = tt_uncached;
+   else
+   ttm->caching_state = tt_cached;
+   }
+   return 0;
+}
+#endif /* CONFIG_X86 */
 EXPORT_SYMBOL(ttm_tt_set_placement_caching);

 void ttm_tt_destroy(struct ttm_tt *ttm)
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 1d9f0f1..cbc5ad2 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -669,7 +669,7 @@ extern int ttm_tt_swapin(struct ttm_tt *ttm);
  * hit RAM. This function may be very costly as it involves global TLB
  * and cache flushes and potential page splitting / combining.
  */
-extern int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t placement);
+extern int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement);
 extern int ttm_tt_swapout(struct ttm_tt *ttm,
  struct file
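
The reason the prototype switches to a pointer is that the non-x86 variant rewrites the caller's placement flags. A minimal userspace sketch of that behaviour follows, with stand-in flag values and a hypothetical function name. Note that the sketch tests a flag-style TT bit, whereas the patch text tests TTM_PL_TT; whether the flag was intended there is not something this message settles.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values for this sketch only (not the kernel's). */
#define PL_FLAG_SYSTEM   (1u << 0)
#define PL_FLAG_TT       (1u << 1)
#define PL_FLAG_VRAM     (1u << 2)
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

enum caching_state { tt_uncached, tt_wc, tt_cached };

/*
 * Models the non-x86 branch: RAM-backed placements are forced to
 * cached and the caller's flags are rewritten to say so.
 */
static enum caching_state force_cached_for_ram(uint32_t *placement)
{
	if (*placement & (PL_FLAG_TT | PL_FLAG_SYSTEM)) {
		*placement &= ~PL_MASK_CACHING;
		*placement |= PL_FLAG_CACHED;
		return tt_cached;
	}
	if (*placement & PL_FLAG_WC)
		return tt_wc;
	if (*placement & PL_FLAG_UNCACHED)
		return tt_uncached;
	return tt_cached;
}
```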

[PATCH 7/9] drm: powerpc can use a simpler drm_io_prot()

2014-09-04 Thread Benjamin Herrenschmidt

What the code does is equivalent to the x86 code, so let's use
it as well

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/drm_vm.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c
index 24e045c..ed02563 100644
--- a/drivers/gpu/drm/drm_vm.c
+++ b/drivers/gpu/drm/drm_vm.c
@@ -48,15 +48,11 @@ static pgprot_t drm_io_prot(struct drm_local_map *map,
 {
pgprot_t tmp = vm_get_page_prot(vma->vm_flags);

-#if defined(__i386__) || defined(__x86_64__)
+#if defined(__i386__) || defined(__x86_64__) || defined(__powerpc__)
if (map->type == _DRM_REGISTERS && !(map->flags & _DRM_WRITE_COMBINING))
tmp = pgprot_noncached(tmp);
else
tmp = pgprot_writecombine(tmp);
-#elif defined(__powerpc__)
-   pgprot_val(tmp) |= _PAGE_NO_CACHE;
-   if (map->type == _DRM_REGISTERS)
-   pgprot_val(tmp) |= _PAGE_GUARDED;
 #elif defined(__ia64__)
if (efi_range_is_wc(vma->vm_start, vma->vm_end -
vma->vm_start))

[PATCH 8/9] drm/ttm: Clean usage of ttm_io_prot() with TTM_PL_FLAG_CACHED

2014-09-04 Thread Benjamin Herrenschmidt

Today, most callers of ttm_io_prot() check TTM_PL_FLAG_CACHED before
calling it since on some archs it will unconditionally create non-cached
mappings.

But not all callers do, which is incorrect as far as I can tell.

Instead, move that check inside ttm_io_prot() itself for all archs
and make powerpc use the same implementation as ia64 and arm.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 19 ---
 drivers/gpu/drm/ttm/ttm_bo_vm.c   |  5 ++---
 2 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index e31d48c..ef2ac3c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -487,28 +487,27 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,

 pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
 {
+   /* Cached mappings need no adjustment */
+   if (caching_flags & TTM_PL_FLAG_CACHED)
+   return tmp;
+
 #if defined(__i386__) || defined(__x86_64__)
if (caching_flags & TTM_PL_FLAG_WC)
tmp = pgprot_writecombine(tmp);
else if (boot_cpu_data.x86 > 3)
tmp = pgprot_noncached(tmp);

-#elif defined(__powerpc__)
-   if (!(caching_flags & TTM_PL_FLAG_CACHED)) {
-   pgprot_val(tmp) |= _PAGE_NO_CACHE;
-   if (caching_flags & TTM_PL_FLAG_UNCACHED)
-   pgprot_val(tmp) |= _PAGE_GUARDED;
+#endif
}
 #endif
-#if defined(__ia64__) || defined(__arm__)
+#if defined(__ia64__) || defined(__arm__) || defined(__powerpc__)
if (caching_flags & TTM_PL_FLAG_WC)
tmp = pgprot_writecombine(tmp);
else
tmp = pgprot_noncached(tmp);
 #endif
 #if defined(__sparc__) || defined(__mips__)
-   if (!(caching_flags & TTM_PL_FLAG_CACHED))
-   tmp = pgprot_noncached(tmp);
+   tmp = pgprot_noncached(tmp);
 #endif
return tmp;
 }
@@ -567,9 +566,7 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 * We need to use vmap to get the desired page protection
 * or to make the buffer object look contiguous.
 */
-   prot = (mem->placement & TTM_PL_FLAG_CACHED) ?
-   PAGE_KERNEL :
-   ttm_io_prot(mem->placement, PAGE_KERNEL);
+   prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
map->bo_kmap_type = ttm_bo_map_vmap;
map->virtual = vmap(ttm->pages + start_page, num_pages,
0, prot);
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 0ce48e5..4ce8dc1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -200,9 +200,8 @@ static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
cvma.vm_page_prot);
} else {
ttm = bo->ttm;
-   if (!(bo->mem.placement & TTM_PL_FLAG_CACHED))
-   cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
-   cvma.vm_page_prot);
+   cvma.vm_page_prot = ttm_io_prot(bo->mem.placement,
+   cvma.vm_page_prot);

/* Allocate all page at once, most common usage */
if (ttm->bdev->driver->ttm_tt_populate(ttm)) {
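
The effect of moving the CACHED check into ttm_io_prot() can be sketched with a simplified model. The names and flag values below are stand-ins, and only the ia64/arm/powerpc style branch is modelled: cached placements leave the protection untouched, so callers no longer have to test the flag themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in bit values for this sketch only (not the kernel's). */
#define PL_FLAG_CACHED   (1u << 0)
#define PL_FLAG_UNCACHED (1u << 1)
#define PL_FLAG_WC       (1u << 2)

enum map_attr { MAP_ATTR_UNCHANGED, MAP_ATTR_WC, MAP_ATTR_NONCACHED };

static enum map_attr io_prot(uint32_t caching_flags)
{
	/* Cached mappings need no adjustment. */
	if (caching_flags & PL_FLAG_CACHED)
		return MAP_ATTR_UNCHANGED;
	if (caching_flags & PL_FLAG_WC)
		return MAP_ATTR_WC;
	return MAP_ATTR_NONCACHED;
}
```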

[PATCH 9/9] drm/ttm: Sanity check mapping attributes on powerpc in ttm_io_prot()

2014-09-04 Thread Benjamin Herrenschmidt

On all current cache coherent powerpc processors, it is not legit
to map system memory non-cachable. This will cause aliases with
the linear mapping which can be fatal.

The TTM should generally avoid it after Jerome placement patches but
let's add a sanity check anyway to catch any possible remaining issue.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ef2ac3c..48095be 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -37,6 +37,9 @@
 #include 
 #include 
 #include 
+#if defined(__powerpc__)
+#include 
+#endif

 void ttm_bo_free_old_node(struct ttm_buffer_object *bo)
 {
@@ -498,6 +501,20 @@ pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
tmp = pgprot_noncached(tmp);

 #endif
+#if defined(__powerpc__) && !defined(CONFIG_NOT_COHERENT_CACHE)
+   /*
+* Using a non-cachable mapping of system memory on
+* cache coherent powerpc's can be fatal, let's make
+* sure this doesn't happen and warn if it does. The
+* only exception is powermac with AGP which has to
+* take the risk.
+*/
+   if (!machine_is(powermac) &&
+   ((caching_flags & TTM_PL_FLAG_SYSTEM) ||
+(caching_flags & TTM_PL_FLAG_TT))) {
+   pr_err_once("TTM: Attempt to use a non-cached"
+   " mapping on RAM unsupported !\n");
+   return tmp;
}
 #endif
 #if defined(__ia64__) || defined(__arm__) || defined(__powerpc__)

[PATCH 3/9] drm/ast: Properly initialize P2A base before using it in ast_init_3rdtx()

2014-09-04 Thread Benjamin Herrenschmidt


If the P2A has been used to target other SOC registers before that
call, we're going to hit the wrong place so make sure we set the
base address up properly before using it.

(P2A stands for PCIe to AHB bridge and is the bridge that allows
accessing the AST's internal AHB bus using a relocatable 64k
window in the second half of the PCIe MMIO BAR)

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/gpu/drm/ast/ast_dp501.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index 5da4b62..7e2ddde 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -400,7 +400,18 @@ void ast_init_3rdtx(struct drm_device *dev)
if (ast->tx_chip_type == AST_TX_SIL164)
ast_init_dvo(dev);
else {
+   /*
+* Set DAC source to VGA mode in SCU2C via the P2A
+* bridge. First configure the P2U to target the SCU
+* in case it isn't at this stage.
+*/
+   ast_write32(ast, 0xf004, 0x1e6e0000);
+   ast_write32(ast, 0xf000, 0x1);
+   /* Then unlock the SCU with the magic password */
ast_write32(ast, 0x12000, 0x1688a8a8);
+   ast_write32(ast, 0x12000, 0x1688a8a8);
+   ast_write32(ast, 0x12000, 0x1688a8a8);
+   /* Finally, clear bits [17:16] of SCU2c */
data = ast_read32(ast, 0x1202c);
data &= 0xfffcffff;
ast_write32(ast, 0, data);
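
The P2A window arithmetic behind offsets such as 0x12000 and 0x1202c can be sketched as follows. This rests on assumptions not spelled out in the message: that the 64k window sits at offset 0x10000 of the MMIO BAR (the "second half"), and that the SCU lives at AHB address 0x1e6e2000. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 64k window in the second half of the MMIO BAR. */
#define P2A_WINDOW_OFFSET 0x10000u
#define P2A_WINDOW_SIZE   0x10000u

/* Window base to program so that the AHB address falls inside it. */
static uint32_t p2a_base(uint32_t ahb_addr)
{
	return ahb_addr & ~(P2A_WINDOW_SIZE - 1);
}

/* MMIO BAR offset at which the AHB address then becomes visible. */
static uint32_t p2a_mmio_offset(uint32_t ahb_addr)
{
	return P2A_WINDOW_OFFSET + (ahb_addr & (P2A_WINDOW_SIZE - 1));
}
```

Under those assumptions, SCU2C at AHB address 0x1e6e202c needs a window base of 0x1e6e0000 and shows up at BAR offset 0x1202c, which matches the read in the hunk above.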

[PATCH 9/9] drm/ttm: Sanity check mapping attributes on powerpc in ttm_io_prot()

2014-09-04 Thread Michel Dänzer

On 04.09.2014 16:47, Benjamin Herrenschmidt wrote:
> On all current cache coherent powerpc processors, it is not legit
> to map system memory non-cachable. This will cause aliases with
> the linear mapping which can be fatal.
>
> The TTM should generally avoid it after Jerome placement patches but
> let's add a sanity check anyway to catch any possible remaining issue.
>
> Signed-off-by: Benjamin Herrenschmidt 

[...]

> @@ -498,6 +501,20 @@ pgprot_t ttm_io_prot(uint32_t caching_flags, pgprot_t tmp)
>   tmp = pgprot_noncached(tmp);
>
>   #endif
> +#if defined(__powerpc__) && !defined(CONFIG_NOT_COHERENT_CACHE)
> + /*
> +  * Using a non-cachable mapping of system memory on
> +  * cache coherent powerpc's can be fatal, let's make
> +  * sure this doesn't happen and warn if it does. The
> +  * only exception is powermac with AGP which has to
> +  * take the risk.
> +  */
> + if (!machine_is(powermac) &&
> + ((caching_flags & TTM_PL_FLAG_SYSTEM) ||
> +  (caching_flags & TTM_PL_FLAG_TT))) {
> + pr_err_once("TTM: Attempt to use a non-cached"
> + " mapping on RAM unsupported !\n");
> + return tmp;

NAK, this breaks AGP on PowerMacs.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Thu, 2014-09-04 at 16:19 +0900, Michel Dänzer wrote:
> > +#else /* CONFIG_X86 */
> > +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
> > +{
> > + if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
> > + ttm->caching_state = tt_cached;
> > + *placement &= ~TTM_PL_MASK_CACHING;
> > + *placement |= TTM_PL_FLAG_CACHED;
> 
> NAK, this will break AGP on PowerMacs.

 ... which doesn't work reliably anyway with DRI2 :-)

The problem is ... with DRI1 I think we had tricks to take out the
AGP from the linear mapping but that went away, didn't it ?

In any case, we are playing with fire on these by allowing the
cache paradox. It just happens that those old CPUs aren't *that*
aggressive at speculative prefetch and we probably rarely hit the
lockups that they would cause...

Michel, what do you recommend we do then ? The patch I sent to
double check in ttm_io_prot() has a specific hack to avoid warning
on PowerMac for the above reason, but we need to fix Jerome's patch if
we want to keep that broken-by-design Mac AGP functionality going :-)

Maybe we could add a similar ifdef in the above ?

Cheers,
Ben.

TTM placement & caching issue/questions

2014-09-04 Thread Michel Dänzer

On 04.09.2014 16:54, Benjamin Herrenschmidt wrote:
> On Thu, 2014-09-04 at 16:19 +0900, Michel Dänzer wrote:
>>> +#else /* CONFIG_X86 */
>>> +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
>>> +{
>>> + if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
>>> + ttm->caching_state = tt_cached;
>>> + *placement &= ~TTM_PL_MASK_CACHING;
>>> + *placement |= TTM_PL_FLAG_CACHED;
>>
>> NAK, this will break AGP on PowerMacs.
>
>   ... which doesn't work reliably anyway with DRI2 :-)

Define 'not reliably'. I have uptimes of weeks, and I'm pretty sure I'm 
not alone, at least with AGP 1x it seems to work quite well for most 
people. So I don't see the justification for intentionally breaking it 
completely for all of us.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

TTM placement & caching issue/questions

2014-09-04 Thread Michel Dänzer

On 04.09.2014 16:59, Michel Dänzer wrote:
> On 04.09.2014 16:54, Benjamin Herrenschmidt wrote:
>> On Thu, 2014-09-04 at 16:19 +0900, Michel Dänzer wrote:
>>>> +#else /* CONFIG_X86 */
>>>> +int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t *placement)
>>>> +{
>>>> + if (*placement & (TTM_PL_TT | TTM_PL_FLAG_SYSTEM)) {
>>>> + ttm->caching_state = tt_cached;
>>>> + *placement &= ~TTM_PL_MASK_CACHING;
>>>> + *placement |= TTM_PL_FLAG_CACHED;
>>>
>>> NAK, this will break AGP on PowerMacs.
>>
>>   ... which doesn't work reliably anyway with DRI2 :-)
>
> Define 'not reliably'. I have uptimes of weeks, and I'm pretty sure I'm
> not alone, at least with AGP 1x it seems to work quite well for most
> people. So I don't see the justification for intentionally breaking it
> completely for all of us.

Even more so because PCI GART is unusably slow in general.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

[PATCH 6/9] drm/ttm: force cached mapping on non x86 platform

2014-09-04 Thread Thomas Hellstrom

On 09/04/2014 09:46 AM, Benjamin Herrenschmidt wrote:
> From: Jérôme Glisse
>
> People interested in providing uncached or write combined mapping
> on their architecture need to do the ground work inside their arch
> specific code to allow to break the linear kernel mapping so that
> page mapping attributes can be updated, in the meantime force cached
> mapping for non x86 architecture.
I don't like this patch. Please see my previous email.

/Thomas

TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Thu, 2014-09-04 at 09:44 +0200, Thomas Hellstrom wrote:

> > This will, from what I can tell, try to use the same caching mode as the
> > original object:
> >
> > if ((cur_placement & caching) != 0)
> > result |= (cur_placement & caching);
> >
> > And cur_placement comes from bo->mem.placement which as far as I can
> > tell is based on the placement array which the drivers set up.
> 
> This originates from the fact that when evicting GTT memory, on x86 it's
> unnecessary and undesirable to switch caching mode when going to system.

But that's what I don't quite understand. We have two different mappings
here. The VRAM and the memory object. We wouldn't be "switching"... we
are creating a temporary mapping for the memory object in order to do
the memcpy, but we seem to be doing it by using the caching attributes
of the VRAM object or am I missing something ? I don't see how that
makes sense so I suppose I'm missing something here :-)
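
For readers following the thread, the selection logic being questioned can be
modelled in user space. This is only a minimal sketch, not the kernel source:
the flag names and values are stand-ins, and every fallback after the first
test is an assumption made for illustration. Only the first `if` comes from
the snippet quoted above.

```c
#include <stdint.h>

/* Illustrative stand-ins for the TTM caching flags (values are assumptions). */
#define PL_FLAG_CACHED   (1u << 16)
#define PL_FLAG_UNCACHED (1u << 17)
#define PL_FLAG_WC       (1u << 18)
#define PL_MASK_CACHING  (PL_FLAG_CACHED | PL_FLAG_UNCACHED | PL_FLAG_WC)

/*
 * Model of the caching selection. "proposed" carries the caching modes the
 * target placement allows, "cur_placement" is the placement the buffer has
 * now. The current mode wins whenever the target also allows it.
 */
static uint32_t select_caching_model(uint32_t default_caching,
                                     uint32_t cur_placement,
                                     uint32_t proposed)
{
	uint32_t caching = proposed & PL_MASK_CACHING;
	uint32_t result = proposed & ~PL_MASK_CACHING;

	if ((cur_placement & caching) != 0)
		result |= (cur_placement & caching);  /* from the quoted snippet */
	else if ((default_caching & caching) != 0)
		result |= default_caching & caching;  /* assumed fallback */
	else if ((caching & PL_FLAG_CACHED) != 0)
		result |= PL_FLAG_CACHED;             /* assumed fallback */
	else
		result |= caching;                    /* assumed fallback */

	return result;
}
```

With a buffer that is currently uncached and a target that allows both cached
and uncached, the model returns uncached, which is the behaviour described in
this message. A buffer whose current mode the target does not allow falls
through to the (assumed) default.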

> Last time I tested, (and it seems like Michel is on the same track),
> writing with the CPU to write-combined memory was substantially faster
> than writing to cached memory, with the additional side-effect that CPU
> caches are left unpolluted.

That's very strange indeed. It's certainly an x86 specific artifact,
even if we were allowed by our hypervisor to map memory non-cachable
(the HW somewhat can), we tend to have a higher throughput by going
cachable, but that could be due to the way the PowerBus works (it's
basically very biased toward cachable transactions).

> I dislike the approach of rewriting placements. In some cases I think it
> won't even work, because placements are declared 'static const'
> 
> What I'd suggest is instead to intercept the driver response from
> init_mem_type() and filter out undesired caching modes from
> available_caching and default_caching, 

This was my original intent but Jerome seems to have different ideas
(see his proposed patches). I'm happy to revive mine as well and post it
as an alternative after I've tested it a bit more (tomorrow).

> perhaps also looking at whether
> the memory type is mappable or not. This should have the additional
> benefit of working everywhere, and if a caching mode is selected that's
> not available on the platform, you'll simply get an error. (I guess?)

You mean that if not mappable we don't bother filtering ?

The rule is really for me pretty simple:

   - If it's system memory (PL_SYSTEM/PL_TT), it MUST be cachable

   - If it's PCIe memory space (VRAM, registers, ...) it MUST be
non-cachable.

Cheers,
Ben.

> /Thomas
> 
> 
> >
> > Cheers,
> > Ben.
> >
> >
> > ___
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
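
The rule stated in this message, combined with Thomas's suggestion to filter
what a driver reports from init_mem_type(), could be sketched roughly as
follows. Everything here is hypothetical: the enum, the flag values, the
struct and the function name are stand-ins rather than the TTM API, and
treating write-combining as an acceptable "non-cachable" mode for PCIe space
is an assumption.

```c
#include <stdint.h>

/* Illustrative stand-ins; the real TTM definitions may differ. */
enum mem_type { MEM_SYSTEM, MEM_TT, MEM_VRAM };

#define CACHING_CACHED   (1u << 0)
#define CACHING_UNCACHED (1u << 1)
#define CACHING_WC       (1u << 2)

struct mem_type_caps {
	uint32_t available_caching; /* modes the driver says it can use */
	uint32_t default_caching;   /* mode used when nothing else is requested */
};

/*
 * Hypothetical filter applied to what a driver reports for one memory type.
 * Returns 0 on success, -1 if no permitted caching mode is left.
 */
static int filter_caching(enum mem_type type, struct mem_type_caps *caps)
{
	uint32_t allowed;

	if (type == MEM_SYSTEM || type == MEM_TT)
		allowed = CACHING_CACHED;                /* RAM: cachable only */
	else
		allowed = CACHING_UNCACHED | CACHING_WC; /* PCIe space: never cached */

	caps->available_caching &= allowed;
	if (caps->available_caching == 0)
		return -1;

	/* If the default was filtered out, fall back to the lowest remaining mode. */
	if ((caps->default_caching & caps->available_caching) == 0)
		caps->default_caching =
			caps->available_caching & (~caps->available_caching + 1u);
	else
		caps->default_caching &= caps->available_caching;

	return 0;
}
```

A driver that advertises only uncached system memory would get the error
return here, which matches the "you'll simply get an error" outcome guessed at
in the quoted text.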

TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Thu, 2014-09-04 at 16:59 +0900, Michel Dänzer wrote:
> 
> Define 'not reliably'. I have uptimes of weeks, and I'm pretty sure I'm 
> not alone, at least with AGP 1x it seems to work quite well for most 
> people. So I don't see the justification for intentionally breaking it 
> completely for all of us.

Oh I wasn't arguing for breaking it, just jesting. We need to keep it
working. It's amazing how well broken stuff actually works though :-)

I mean, it's architecturally broken and if we get a collision between
the cache and the NCU, the chip will crash. We just get lucky I suppose.

Anyway, I'll try a different approach tomorrow see how it goes.

Cheers,
Ben.

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

--- Comment #6 from José Suárez  ---
I had started to try and bisect. Although I could successfully compile mesa git
I was having problems loading the hand compiled mesa drivers rather than the
system ones (not sure why the apps were not using those hand compiled libs and
drivers...). Thank you, smoki, for helping me narrow down the offending commit.

I had tried EU IV both with hyperz on and off and the problem was there in both
cases so I think your second guess might be correct. What I can easily do is
revert that "r600g,radeonsi: initialize HTILE to fully-expanded state" commit
from the ppa sources I usually use to build mesa and get back to you. I'll see
if I can do that during lunch time.


TTM placement & caching issue/questions

2014-09-04 Thread Thomas Hellstrom

On 09/04/2014 10:06 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-09-04 at 09:44 +0200, Thomas Hellstrom wrote:
>
>>> This will, from what I can tell, try to use the same caching mode as the
>>> original object:
>>>
>>> if ((cur_placement & caching) != 0)
>>> result |= (cur_placement & caching);
>>>
>>> And cur_placement comes from bo->mem.placement which as far as I can
>>> tell is based on the placement array which the drivers set up.
>> This originates from the fact that when evicting GTT memory, on x86 it's
>> unnecessary and undesirable to switch caching mode when going to system.
> But that's what I don't quite understand. We have two different mappings
> here. The VRAM and the memory object. We wouldn't be "switching"... we
> are creating a temporary mapping for the memory object in order to do
> the memcpy, but we seem to be doing it by using the caching attributes
> of the VRAM object or am I missing something ? I don't see how that
> makes sense so I suppose I'm missing something here :-)

Well, the intention when TTM was written was that the driver writer
should be smart enough that when he wanted a move from uncached VRAM to
system, he'd request cached system in the placement flags in the first
place.  If TTM somehow overrides such a request, that's a bug in TTM.

If the move, for example, is a result of an eviction, then the driver
evict_flags() function should ideally look at the current placement and
decide about a suitable placement based on that: vram-to-system moves
should generally request cacheable memory if the next access is expected
to be by the CPU, and probably write-combined otherwise.

If the move is the result of a TTM swapout, TTM will automatically
select cachable system, and for most other moves, I think the driver
writer is in full control.
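
The eviction policy described above might look something like the sketch
below. It is not taken from any driver: the helper name, the flag values and
the `cpu_access_next` hint are all hypothetical, and a real evict_flags()
callback fills in a placement structure rather than returning a bitmask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for placement flags (values are assumptions). */
#define PL_FLAG_SYSTEM (1u << 0)
#define PL_FLAG_VRAM   (1u << 2)
#define PL_FLAG_CACHED (1u << 16)
#define PL_FLAG_WC     (1u << 18)

/*
 * Hypothetical helper for a driver's eviction path: given the current
 * placement and a hint about who touches the buffer next, return the
 * placement to request for the evicted buffer.
 */
static uint32_t pick_evict_placement(uint32_t cur_placement, bool cpu_access_next)
{
	if (cur_placement & PL_FLAG_VRAM) {
		/* VRAM -> system: ask for the caching mode explicitly. */
		return PL_FLAG_SYSTEM |
		       (cpu_access_next ? PL_FLAG_CACHED : PL_FLAG_WC);
	}
	/* Anything else: plain cached system memory. */
	return PL_FLAG_SYSTEM | PL_FLAG_CACHED;
}
```

The point of the sketch is that the driver names the caching mode of the
target itself, so the mode of the VRAM object is never silently carried over.
On a platform where system memory must stay cachable, the write-combined
branch would have to be dropped.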

>
>> Last time I tested, (and it seems like Michel is on the same track),
>> writing with the CPU to write-combined memory was substantially faster
>> than writing to cached memory, with the additional side-effect that CPU
>> caches are left unpolluted.
> That's very strange indeed. It's certainly an x86 specific artifact,
> even if we were allowed by our hypervisor to map memory non-cachable
> (the HW somewhat can), we tend to have a higher throughput by going
> cachable, but that could be due to the way the PowerBus works (it's
> basically very biased toward cachable transactions).
>
>> I dislike the approach of rewriting placements. In some cases I think it
>> won't even work, because placements are declared 'static const'
>>
>> What I'd suggest is instead to intercept the driver response from
>> init_mem_type() and filter out undesired caching modes from
>> available_caching and default_caching, 
> This was my original intent but Jerome seems to have different ideas
> (see his proposed patches). I'm happy to revive mine as well and post it
> as an alternative after I've tested it a bit more (tomorrow).
>
>> perhaps also looking at whether
>> the memory type is mappable or not. This should have the additional
>> benefit of working everywhere, and if a caching mode is selected that's
>> not available on the platform, you'll simply get an error. (I guess?)
> You mean that if not mappable we don't bother filtering ?
>
> The rule is really for me pretty simple:
>
>- If it's system memory (PL_SYSTEM/PL_TT), it MUST be cachable
>
>- If it's PCIe memory space (VRAM, registers, ...) it MUST be
> non-cachable.

Yes, something along these lines. I guess checking for VRAM or
TTM_MEMTYPE_FLAG_FIXED would perhaps do the trick

/Thomas

>
> Cheers,
> Ben.
>
>> /Thomas
>>
>>
>>> Cheers,
>>> Ben.
>>>
>>>
>

[Bug 83012] Unigine Tropics horrible performance with vblank_mode=2 (which is the default) or =3

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83012

--- Comment #4 from Eero Tamminen  ---
(In reply to comment #3)
> Why don't we disable vsync by default?
> I think that's what proprietary drivers do.

Because that's relevant only for benchmarks.  Normal applications want to
present complete (non-tearing) frames to the user.

Note: with DRI2, running fullscreen with vsync off causes X to copy every
frame (which has a clear performance penalty for memory-bandwidth-bound
benchmarks). With DRI3, when vsync is disabled on the client side, the X-side
copy is done only on every vsync, but currently you would need quad buffering
to avoid stalling on X server side syncing, which isn't what Mesa does (see
bug 79715).

(If you're very close, but not quite at 60 or 30 FPS, this non-Vsync induced
copy can cause your FPS even to slightly decrease compared to Vsync.)


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #16 from Michel Dänzer  ---
If you can create an apitrace demonstrating the problem, that should be useful.

Or if you could reproduce the crashes and attach gdb backtraces for them, that
might also be interesting.


[Bug 81239] Evolution window content not shown fully (only desktop background)

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81239

Paul Menzel  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTOURBUG

--- Comment #10 from Paul Menzel  ---
The user owen in #gtk+ at irc.gnome.org told me that it is probably not a driver
problem but an Evolution/WebKitGTK/GTK+ problem.

Thank you for all the help.


[Bug 83422] i was opening systemsettings, chose workspace design and the default look and feel screen just stays black when resizing the window or chosing another option on the left panel, it crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83422

--- Comment #17 from simon  ---
@Michel Dänzer

You won't believe it: I just logged in to KDE Plasma Next again and the applet
is now working again.
Dunno what the problem really is.

I will leave it till tomorrow, do a KDE update again and have another look at
it.


[Bug 44126] [r300g] 0ad: carpet textures "flash" and get hidden by ground texture.

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=44126

Fabio Pedretti  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #3 from Fabio Pedretti  ---
The game model was modified moving the carpets up a bit, and the issue is no
longer reproducible now.

No idea if that was the proper fix or just a workaround for a mesa bug. Setting
as WORKSFORME then.


[PATCH 9/9] drm/ttm: Sanity check mapping attributes on powerpc in ttm_io_prot()

2014-09-04 Thread Benjamin Herrenschmidt

On Thu, 2014-09-04 at 16:52 +0900, Michel Dänzer wrote:
> >   #endif
> > +#if defined(__powerpc__) && !defined(CONFIG_NOT_COHERENT_CACHE)
> > + /*
> > +  * Using a non-cachable mapping of system memory on
> > +  * cache coherent powerpc's can be fatal, let's make
> > +  * sure this doesn't happen and warn if it does. The
> > +  * only exception is powermac with AGP which has to
> > +  * take the risk.
> > +  */
> > + if (!machine_is(powermac) &&
> > + ((caching_flags & TTM_PL_FLAG_SYSTEM) ||
> > +  (caching_flags & TTM_PL_FLAG_TT))) {
> > + pr_err_once("TTM: Attempt to use a non-cached"
> > + " mapping on RAM unsupported !\n");
> > + return tmp;
> 
> NAK, this breaks AGP on PowerMacs.

No it doesn't :-)

Cheers,
Ben.
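
The disagreement above hinges on how the condition in the quoted hunk
evaluates on a PowerMac. A small user-space model makes that easy to check.
It is only a sketch: `is_powermac` stands in for machine_is(powermac), the
flag values are assumptions, and it assumes the hunk is reached only when a
non-cached mapping was requested.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins (values are assumptions). */
#define PL_FLAG_SYSTEM (1u << 0)
#define PL_FLAG_TT     (1u << 1)
#define PL_FLAG_VRAM   (1u << 2)

/*
 * Model of the condition in the quoted hunk: returns true when a non-cached
 * mapping of RAM would be refused (and warned about).
 */
static bool noncached_ram_refused(bool is_powermac, uint32_t caching_flags)
{
	return !is_powermac &&
	       ((caching_flags & PL_FLAG_SYSTEM) ||
	        (caching_flags & PL_FLAG_TT));
}
```

Because `!is_powermac` is tested first, the check never fires on a PowerMac,
so AGP mappings there are left as they were. On other coherent powerpc
machines a non-cached SYSTEM or TT mapping is refused, while VRAM is
unaffected.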

TTM placement & caching issue/questions

2014-09-04 Thread Daniel Vetter

On Thu, Sep 04, 2014 at 09:44:04AM +0200, Thomas Hellstrom wrote:
> Last time I tested, (and it seems like Michel is on the same track),
> writing with the CPU to write-combined memory was substantially faster
> than writing to cached memory, with the additional side-effect that CPU
> caches are left unpolluted.
> 
> Moreover (although only tested on Intel's embedded chipsets), texturing
> from cpu-cache-coherent PCI memory was a real GPU performance hog
> compared to texturing from non-snooped memory. Hence, whenever a buffer
> could be classified as GPU-read-only (or almost at least), it should be
> placed in write-combined memory.

Just a quick comment since this explicitly refers to Intel chips: On
desktop/laptop chips with the big shared l3/l4 caches it's the other way
round. Cached uploads are substantially faster than wc and not using
coherent access is a severe perf hit for texturing. I guess the hw guys
worked really hard to hide the snooping costs so that the gpu can benefit
from the massive bandwidth these caches can provide.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[git pull] drm fixes

2014-09-04 Thread Dave Airlie


Hi Linus,

just i915 and vmwgfx fixes,

i915 contains a bunch of fixes for recent regressions in outputs,
vmwgfx fixes a possible infinite loop and a bad return code.

Dave.

The following changes since commit 59753a805499f1ffbca4ac0a24b3dff67bf1:

  Merge tag 'backlight-fixes-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight (2014-08-28 10:47:10 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 3aacfda0ecd9040521fbfb4a2c53cd6bf77ae4ee:

  Merge tag 'drm-intel-fixes-2014-09-03' of git://anongit.freedesktop.org/drm-intel into drm-fixes (2014-09-04 11:20:00 +1000)



Dave Airlie (4):
  Merge tag 'drm-intel-fixes-2014-08-28' of git://anongit.freedesktop.org/drm-intel into drm-fixes
  drm/i915: handle G45/GM45 pulse detection connected state.
  Merge branch 'vmwgfx-fixes-3.17' of git://people.freedesktop.org/~thomash/linux into drm-fixes
  Merge tag 'drm-intel-fixes-2014-09-03' of git://anongit.freedesktop.org/drm-intel into drm-fixes

Mathias Krause (1):
  drm/i915: Remove bogus __init annotation from DMI callbacks

Paulo Zanoni (1):
  drm/i915: fix plane/cursor handling when runtime suspended

Scot Doyle (2):
  drm/i915: Ignore VBT backlight presence check on Acer C720 (4005U)
  drm/i915: don't warn if backlight unexpectedly enabled

Thomas Hellstrom (2):
  drm/vmwgfx: Fix an incorrect OOM return value
  drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle

Ville Syrjälä (2):
  drm/i915: Move intel_ddi_set_vc_payload_alloc(false) to haswell_crtc_disable()
  drm/i915: Fix lock dropping in intel_tv_detect()

 drivers/gpu/drm/i915/intel_bios.c       |  2 +-
 drivers/gpu/drm/i915/intel_crt.c        |  2 +-
 drivers/gpu/drm/i915/intel_display.c    | 34 +---
 drivers/gpu/drm/i915/intel_dp.c         | 55 ++---
 drivers/gpu/drm/i915/intel_lvds.c       |  2 +-
 drivers/gpu/drm/i915/intel_panel.c      |  8 ++---
 drivers/gpu/drm/i915/intel_tv.c         | 10 --
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 11 +++
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c    |  3 +-
 9 files changed, 88 insertions(+), 39 deletions(-)

TTM placement & caching issue/questions

2014-09-04 Thread Benjamin Herrenschmidt

On Thu, 2014-09-04 at 11:34 +0200, Daniel Vetter wrote:
> On Thu, Sep 04, 2014 at 09:44:04AM +0200, Thomas Hellstrom wrote:
> > Last time I tested, (and it seems like Michel is on the same track),
> > writing with the CPU to write-combined memory was substantially faster
> > than writing to cached memory, with the additional side-effect that CPU
> > caches are left unpolluted.
> > 
> > Moreover (although only tested on Intel's embedded chipsets), texturing
> > from cpu-cache-coherent PCI memory was a real GPU performance hog
> > compared to texturing from non-snooped memory. Hence, whenever a buffer
> > could be classified as GPU-read-only (or almost at least), it should be
> > placed in write-combined memory.
> 
> Just a quick comment since this explicitly refers to Intel chips: On
> desktop/laptop chips with the big shared l3/l4 caches it's the other way
> round. Cached uploads are substantially faster than wc and not using
> coherent access is a severe perf hit for texturing. I guess the hw guys
> worked really hard to hide the snooping costs so that the gpu can benefit
> from the massive bandwidth these caches can provide.

This is similar to modern POWER chips as well. We have pretty big L3's
(though not technically shared they are in a separate quadrant and we
have a shared L4 in the memory buffer) and our fabric is generally
optimized for cachable/coherent access performance. In fact, we only
have so many credits for NC accesses on the bus...

What that tells me is that when setting up the desired cachability
attributes for the mapping of a memory object, we need to consider these
things here:

  - The hard requirement of the HW (non-coherent GPUs require NC, AGP
does in some cases, etc...) which I think is basically already handled
using the placement attributes set by the GPU driver for the memory type

  - The optimal attributes (and platform hard requirements) for fast
memory accesses to an object by the processor.  From what I read here,
this can be NC+WC on older Intel, cachable on newer, etc.

  - The optimal attributes for fast GPU DMA accesses to the object in
system memory. Here too, this is fairly platform/chipset dependent.

Do we have flags in the DRM that tell us whether an object in memory is
more likely to be used by the GPU via DMA vs by the CPU via MMIO ? On
powerpc (except in the old AGP case), I wouldn't care and would require
cachable in both cases, but I can see the low latency crowd wanting the
former to be non-cachable while the dumb GPUs like AST who don't do DMA
would benefit greatly from the latter...

Cheers,
Ben.

[Intel-gfx] [PATCH v2] drm/i915: Sysfs interface to get GFX shmem usage stats per process

2014-09-04 Thread Daniel Vetter

On Thu, Sep 4, 2014 at 9:03 AM, Gupta, Sourab  wrote:
> On Wed, 2014-09-03 at 13:09 +, Daniel Vetter wrote:
>> On Wed, Sep 03, 2014 at 11:49:52AM +, Gupta, Sourab wrote:
>> > On Wed, 2014-09-03 at 10:58 +, Daniel Vetter wrote:
>> > > On Wed, Sep 03, 2014 at 03:39:55PM +0530, sourab.gupta at intel.com wrote:
>> > > > From: Sourab Gupta 
>> > > >
>> > > > Currently the Graphics Driver provides an interface through which
>> > > > one can get a snapshot of the overall Graphics memory consumption.
>> > > > Also there is an interface available, which provides information
>> > > > about the several memory related attributes of every single Graphics
>> > > > buffer created by the various clients.
>> > > >
>> > > > There is a requirement of a new interface for achieving below
>> > > > functionalities:
>> > > > 1) Need to provide Client based detailed information about the
>> > > > distribution of Graphics memory
>> > > > 2) Need to provide an interface which can provide info about the
>> > > > sharing of Graphics buffers between the clients.
>> > > >
>> > > > The client based interface would also aid in debugging of
>> > > > memory usage/consumption by each client & debug memleak related issues.
>> > > >
>> > > > With this new interface,
>> > > > 1) In case of memleak scenarios, we can easily zero in on the culprit
>> > > > client which is unexpectedly holding on the Graphics buffers for an
>> > > > inordinate amount of time.
>> > > > 2) We can get an estimate of the instantaneous memory footprint of
>> > > > every Graphics client.
>> > > > 3) We can now trace all the processes sharing a particular Graphics buffer.
>> > > >
>> > > > By means of this patch we try to provide a sysfs interface to achieve
>> > > > the mentioned functionalities.
>> > > >
>> > > > There are two files created in sysfs:
>> > > > 'i915_gem_meminfo' will provide summary of the graphics resources used by
>> > > > each graphics client.
>> > > > 'i915_gem_objinfo' will provide detailed view of each object created by
>> > > > individual clients.
>> > > >
>> > > > v2: Changes made for
>> > > > - adding support to report user virtual addresses of mapped buffers
>> > > > - replacing pid based reporting with tgid based one
>> > > > - checkpatch and other misc cleanup
>> > > >
>> > > > Signed-off-by: Sourab Gupta 
>> > > > Signed-off-by: Akash Goel 
>> > >
>> > > Sorry I didn't spot this the first time around, but I think sysfs is the
>> > > wrong place for this.
>> > >
>> > > Generally sysfs is for setting/reading per-object values, and it has the
>> > > big rule that there should be only _one_ value per file. The error state
>> > > is a bit an exception, but otoh it's also just the full dump as a binary
>> > > file (which for historical reasons is printed as ascii).
>> > >
>> > > The other issue is that imo this should be a generic interface, so that we
>> > > can write a gpu_top tool for dumping memory consumers which works on all
>> > > linux platforms.
>> > >
>> > > To avoid delaying for a long time can we just move ahead by putting this
>> > > into debugfs?
>> > >
>> > > Also in debugfs there's already a lot of this stuff around - why is that
>> > > not sufficient and could we extend it somehow with the missing bits?
>> > >
>> > > Thanks, Daniel
>> >
>> > Hi Daniel,
>> >
>> > Thanks for your inputs.
>> > We had originally put the patch in sysfs, as there was a requirement for
>> > this feature to be available in production kernels also.
>> > We can move it to debugfs to move ahead with this. I'll submit the
>> > debugfs version of this patch next time.
>>
>> Yeah sysfs is the only place where we have a stable api, but that also
>> implies that requirements are a _lot_ more stringent. At least we need
>> testcases to make sure the interface actually do what we want them to do,
>> and to make sure we don't break the interface by accident.
>>
>> > Also,
>> > we developed this new interface to overcome the deficiencies of existing
>> > interface. With this new interface, we can provide client based detailed
>> > information about the distribution of Graphics memory. This gives
>> > information about the various states of the graphics objects opened per
>> > process (summarized as well as detailed info)
>> > It also gives information about Graphics buffers shared between the
>> > clients, and gives user mapped virtual address of all the mapped
>> > graphics buffers.
>> > It was not feasible to fit all this info in the existing interface. So
>> > we decided to go ahead with new interface for these functionality.
>>
>> Well the problem is that adding more files like that increases the
>> maintenance burden. So if there's some way to compute the information you
>> want from information already provided in debugfs, then I prefer we do
>> that at first.
>> -Daniel
>
> Hi Daniel,
>
> We went through the existing debugfs interfaces, but we couldn't derive
> the information we need from these

[PATCH 9/9] drm/ttm: Sanity check mapping attributes on powerpc in ttm_io_prot()

2014-09-04 Thread Michel Dänzer

On 04.09.2014 18:34, Benjamin Herrenschmidt wrote:
> On Thu, 2014-09-04 at 16:52 +0900, Michel Dänzer wrote:
>>>#endif
>>> +#if defined(__powerpc__) && !defined(CONFIG_NOT_COHERENT_CACHE)
>>> + /*
>>> +  * Using a non-cachable mapping of system memory on
>>> +  * cache coherent powerpc's can be fatal, let's make
>>> +  * sure this doesn't happen and warn if it does. The
>>> +  * only exception is powermac with AGP which has to
>>> +  * take the risk.
>>> +  */
>>> + if (!machine_is(powermac) &&
>>> + ((caching_flags & TTM_PL_FLAG_SYSTEM) ||
>>> +  (caching_flags & TTM_PL_FLAG_TT))) {
>>> + pr_err_once("TTM: Attempt to use a non-cached"
>>> + " mapping on RAM unsupported !\n");
>>> + return tmp;
>>
>> NAK, this breaks AGP on PowerMacs.
>
> No it doesn't :-)

Yeah sorry, I was blind.


-- 
Earthling Michel Dänzer            |                  http://www.amd.com
Libre software enthusiast          |                Mesa and X developer

[Bug 82828] Regression: Crash in 3Dmark2001

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=82828

--- Comment #20 from Pavel Ondračka  ---
Your patch does indeed fix the crashing tests, I still see some piglit
regressions but that should be either bug 82882 or bug 82978.
Thanks for the fix.


TTM placement & caching issue/questions

2014-09-04 Thread Thomas Hellstrom

On 09/04/2014 11:43 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2014-09-04 at 11:34 +0200, Daniel Vetter wrote:
>> On Thu, Sep 04, 2014 at 09:44:04AM +0200, Thomas Hellstrom wrote:
>>> Last time I tested, (and it seems like Michel is on the same track),
>>> writing with the CPU to write-combined memory was substantially faster
>>> than writing to cached memory, with the additional side-effect that CPU
>>> caches are left unpolluted.
>>>
>>> Moreover (although only tested on Intel's embedded chipsets), texturing
>>> from cpu-cache-coherent PCI memory was a real GPU performance hog
>>> compared to texturing from non-snooped memory. Hence, whenever a buffer
>>> could be classified as GPU-read-only (or almost at least), it should be
>>> placed in write-combined memory.
>> Just a quick comment since this explicitly refers to Intel chips: On
>> desktop/laptop chips with the big shared l3/l4 caches it's the other way
>> round. Cached uploads are substantially faster than wc and not using
>> coherent access is a severe perf hit for texturing. I guess the hw guys
>> worked really hard to hide the snooping costs so that the gpu can benefit
>> from the massive bandwidth these caches can provide.
> This is similar to modern POWER chips as well. We have pretty big L3's
> (though not technically shared they are in a separate quadrant and we
> have a shared L4 in the memory buffer) and our fabric is generally
> optimized for cachable/coherent access performance. In fact, we only
> have so many credits for NC accesses on the bus...
>

Thanks to both of you for the update. I haven't dealt with real hardware
for a while.

/Thomas
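
To make the trade-off discussed in this thread concrete, here is a minimal
sketch in C of a per-platform caching policy. Everything in it is
hypothetical: struct platform_traits, enum bo_caching and pick_caching() are
illustration-only names, not TTM API, and falling back to cached memory for
buffers that are not GPU-read-mostly is an assumption that the thread does not
settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical caching modes; not the TTM_PL_FLAG_* placement flags. */
enum bo_caching {
	BO_CACHING_CACHED,
	BO_CACHING_WC,
};

/* Hypothetical per-platform trait, filled in by whoever knows the hardware. */
struct platform_traits {
	/* True where snooped/coherent access is cheap (big shared caches). */
	bool coherent_access_is_fast;
};

static enum bo_caching pick_caching(const struct platform_traits *p,
				    bool gpu_read_mostly)
{
	assert(p != NULL);

	/* Desktop Intel and POWER, as described above: prefer cached. */
	if (p->coherent_access_is_fast)
		return BO_CACHING_CACHED;

	/* Elsewhere, GPU-read-mostly buffers go to write-combined memory. */
	return gpu_read_mostly ? BO_CACHING_WC : BO_CACHING_CACHED;
}
```

The point of the sketch is only that the choice depends on the platform, so a
policy that always preserves the previous caching mode cannot be right
everywhere.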

[Bug 83436] Sudden framerate drops in multiple games

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83436

--- Comment #9 from Maciej  ---
I updated today. The performance decrease is still there, but the fps drops are
gone. I had no other apps running in the background, so I'm not sure what's up.
However, the fps drops in TF2 are still a thing.

As for bisecting, I really don't have the skills to do that; I'm just a gamer
with an AMD card :/


[PATCH 0/7] cross-dev synchronization in TTM through dma-buf.

2014-09-04 Thread Maarten Lankhorst

So this is finally it. After all the work writing support for fences, cross-dev
synchronization is now possible. :-)

The last 2 patches of this series are not needed for cross-dev to work, but
without them any waits on cross-device fences will be done synchronously.
I've previously tested this with i915, but the patches for i915 fail to apply
again with the execlist stuff, so I haven't tried with the latest drm-next
changes.

I would like to have the first 2 patches applied on drm-next, and the
radeon/nouveau specific patches when they go through their review.
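
As a rough illustration of the decision a driver makes for each buffer's
exclusive fence, here is a minimal sketch in C. It is a simplified model and
not the kernel code: struct mock_device, struct mock_fence and
pick_sync_action() are hypothetical stand-ins, and the real drivers work on
struct fence and reservation objects instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU device. */
struct mock_device {
	int id;
};

/* Hypothetical stand-in for a fence; owner is the device that signals it. */
struct mock_fence {
	const struct mock_device *owner;
};

enum sync_action {
	SYNC_NONE,         /* no exclusive fence, nothing to wait for */
	SYNC_HW_SEMAPHORE, /* fence from this device: let the hardware wait */
	SYNC_CPU_WAIT,     /* foreign fence: block on the CPU */
};

static enum sync_action pick_sync_action(const struct mock_device *self,
					 const struct mock_fence *excl)
{
	assert(self != NULL);

	if (excl == NULL)
		return SYNC_NONE;
	if (excl->owner == self)
		return SYNC_HW_SEMAPHORE;
	return SYNC_CPU_WAIT;
}
```

In this model, SYNC_CPU_WAIT is the synchronous case; the last 2 patches are
about replacing it with an asynchronous wait.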

[PATCH 1/7] drm: Pass dma-buf as argument to gem_prime_import_sg_table

2014-09-04 Thread Maarten Lankhorst

Allows importing reservation_objects from a dma-buf.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/drm_gem_cma_helper.c| 5 +++--
 drivers/gpu/drm/drm_prime.c | 2 +-
 drivers/gpu/drm/msm/msm_drv.h   | 2 +-
 drivers/gpu/drm/msm/msm_gem_prime.c | 4 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.h   | 2 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c | 5 +++--
 drivers/gpu/drm/qxl/qxl_drv.h   | 2 +-
 drivers/gpu/drm/qxl/qxl_prime.c | 2 +-
 drivers/gpu/drm/radeon/radeon_drv.c | 2 +-
 drivers/gpu/drm/radeon/radeon_prime.c   | 5 +++--
 include/drm/drmP.h  | 3 ++-
 include/drm/drm_gem_cma_helper.h| 3 ++-
 12 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c
index e467e67af6e7..08646af2ddc2 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -316,7 +316,8 @@ out:
 EXPORT_SYMBOL_GPL(drm_gem_cma_prime_get_sg_table);

 struct drm_gem_object *
-drm_gem_cma_prime_import_sg_table(struct drm_device *dev, size_t size,
+drm_gem_cma_prime_import_sg_table(struct drm_device *dev,
+ struct dma_buf_attachment *attach,
  struct sg_table *sgt)
 {
struct drm_gem_cma_object *cma_obj;
@@ -325,7 +326,7 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev, size_t size,
return ERR_PTR(-EINVAL);

/* Create a CMA GEM buffer. */
-   cma_obj = __drm_gem_cma_create(dev, size);
+   cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size);
if (IS_ERR(cma_obj))
return ERR_CAST(cma_obj);

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 99d578bad17e..dc4711f30382 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -522,7 +522,7 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
goto fail_detach;
}

-   obj = dev->driver->gem_prime_import_sg_table(dev, dma_buf->size, sgt);
+   obj = dev->driver->gem_prime_import_sg_table(dev, attach, sgt);
if (IS_ERR(obj)) {
ret = PTR_ERR(obj);
goto fail_unmap;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 8a2c5fd0893e..a0dc2592ffc1 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -170,7 +170,7 @@ struct sg_table *msm_gem_prime_get_sg_table(struct drm_gem_object *obj);
 void *msm_gem_prime_vmap(struct drm_gem_object *obj);
 void msm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
-   size_t size, struct sg_table *sg);
+   struct dma_buf_attachment *attach, struct sg_table *sg);
 int msm_gem_prime_pin(struct drm_gem_object *obj);
 void msm_gem_prime_unpin(struct drm_gem_object *obj);
 void *msm_gem_vaddr_locked(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/msm/msm_gem_prime.c b/drivers/gpu/drm/msm/msm_gem_prime.c
index d48f9fc5129b..b75f9940ee9e 100644
--- a/drivers/gpu/drm/msm/msm_gem_prime.c
+++ b/drivers/gpu/drm/msm/msm_gem_prime.c
@@ -37,9 +37,9 @@ void msm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
 }

 struct drm_gem_object *msm_gem_prime_import_sg_table(struct drm_device *dev,
-   size_t size, struct sg_table *sg)
+   struct dma_buf_attachment *attach, struct sg_table *sg)
 {
-   return msm_gem_import(dev, size, sg);
+   return msm_gem_import(dev, attach->dmabuf->size, sg);
 }

 int msm_gem_prime_pin(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.h b/drivers/gpu/drm/nouveau/nouveau_gem.h
index ddab762d81fe..e4049faca780 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.h
@@ -39,7 +39,7 @@ struct reservation_object *nouveau_gem_prime_res_obj(struct drm_gem_object *);
 extern void nouveau_gem_prime_unpin(struct drm_gem_object *);
 extern struct sg_table *nouveau_gem_prime_get_sg_table(struct drm_gem_object *);
 extern struct drm_gem_object *nouveau_gem_prime_import_sg_table(
-   struct drm_device *, size_t size, struct sg_table *);
+   struct drm_device *, struct dma_buf_attachment *, struct sg_table *);
 extern void *nouveau_gem_prime_vmap(struct drm_gem_object *);
 extern void nouveau_gem_prime_vunmap(struct drm_gem_object *, void *);

diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index 1f51008e4d26..2215cdba587d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -23,6 +23,7 @@
  */

 #include 
+#include 

 #include "nouveau_drm.h"
 #include "nouveau_gem.h"
@@ -56,7 +57,7 @@ void nouveau_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
 }

 struct drm_gem_object

[PATCH 2/7] drm/ttm: add reservation_object as argument to ttm_bo_init

2014-09-04 Thread Maarten Lankhorst

This allows importing reservation objects from dma-bufs.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/ast/ast_ttm.c|  2 +-
 drivers/gpu/drm/bochs/bochs_mm.c |  2 +-
 drivers/gpu/drm/cirrus/cirrus_ttm.c  |  2 +-
 drivers/gpu/drm/mgag200/mgag200_ttm.c|  2 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c |  2 +-
 drivers/gpu/drm/qxl/qxl_object.c |  2 +-
 drivers/gpu/drm/radeon/radeon_object.c   |  2 +-
 drivers/gpu/drm/ttm/ttm_bo.c | 24 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |  2 +-
 include/drm/ttm/ttm_bo_api.h |  2 ++
 10 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
index 8008ea0bc76c..58c19cfe6af0 100644
--- a/drivers/gpu/drm/ast/ast_ttm.c
+++ b/drivers/gpu/drm/ast/ast_ttm.c
@@ -339,7 +339,7 @@ int ast_bo_create(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  ttm_bo_type_device, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size,
- NULL, ast_bo_ttm_destroy);
+ NULL, NULL, ast_bo_ttm_destroy);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/bochs/bochs_mm.c b/drivers/gpu/drm/bochs/bochs_mm.c
index 2af30e7607d7..6c50a7a44864 100644
--- a/drivers/gpu/drm/bochs/bochs_mm.c
+++ b/drivers/gpu/drm/bochs/bochs_mm.c
@@ -377,7 +377,7 @@ static int bochs_bo_create(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  ttm_bo_type_device, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size,
- NULL, bochs_bo_ttm_destroy);
+ NULL, NULL, bochs_bo_ttm_destroy);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/cirrus/cirrus_ttm.c b/drivers/gpu/drm/cirrus/cirrus_ttm.c
index 3e7d758330a9..b3b3d16d1279 100644
--- a/drivers/gpu/drm/cirrus/cirrus_ttm.c
+++ b/drivers/gpu/drm/cirrus/cirrus_ttm.c
@@ -343,7 +343,7 @@ int cirrus_bo_create(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  ttm_bo_type_device, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size,
- NULL, cirrus_bo_ttm_destroy);
+ NULL, NULL, cirrus_bo_ttm_destroy);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/mgag200/mgag200_ttm.c b/drivers/gpu/drm/mgag200/mgag200_ttm.c
index be883ef5a1d3..398b6fb161a6 100644
--- a/drivers/gpu/drm/mgag200/mgag200_ttm.c
+++ b/drivers/gpu/drm/mgag200/mgag200_ttm.c
@@ -339,7 +339,7 @@ int mgag200_bo_create(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  ttm_bo_type_device, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size,
- NULL, mgag200_bo_ttm_destroy);
+ NULL, NULL, mgag200_bo_ttm_destroy);
if (ret)
return ret;

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index eea74b127b03..bda32276bcc2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -230,7 +230,7 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  type, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size, sg,
- nouveau_bo_del_ttm);
+ NULL, nouveau_bo_del_ttm);
if (ret) {
/* ttm will call nouveau_bo_del_ttm if it fails.. */
return ret;
diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 69c104c3240f..cdeaf08fdc74 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -110,7 +110,7 @@ int qxl_bo_create(struct qxl_device *qdev,

r = ttm_bo_init(>mman.bdev, >tbo, size, type,
>placement, 0, !kernel, NULL, size,
-   NULL, _ttm_bo_destroy);
+   NULL, NULL, _ttm_bo_destroy);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS)
dev_err(qdev->dev,
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index aadbd36e64b9..61f3f16bbcbc 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -209,7 +209,7 @@ int radeon_bo_create(struct radeon_device *rdev,
down_read(>pm.mclk_lock);
r = ttm_bo_init(>mman.bdev, >tbo, size, type,
>placement, page_align, !kernel, NULL,
-   acc_size, sg, _ttm_bo_destroy);
+

[PATCH 3/7] drm/radeon: cope with foreign fences inside the reservation object

2014-09-04 Thread Maarten Lankhorst

Not the whole world is a radeon! :-)

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/radeon/radeon.h | 11 -
 drivers/gpu/drm/radeon/radeon_cs.c  | 32 +
 drivers/gpu/drm/radeon/radeon_display.c | 41 -
 drivers/gpu/drm/radeon/radeon_fence.c   |  3 +++
 drivers/gpu/drm/radeon/radeon_mode.h|  1 +
 5 files changed, 61 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index d80dc547a105..dddb2b7dd752 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -699,17 +699,6 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
  * IRQS.
  */

-struct radeon_flip_work {
-   struct work_struct  flip_work;
-   struct work_struct  unpin_work;
-   struct radeon_device*rdev;
-   int crtc_id;
-   uint64_tbase;
-   struct drm_pending_vblank_event *event;
-   struct radeon_bo*old_rbo;
-   struct radeon_fence *fence;
-};
-
 struct r500_irq_stat_regs {
u32 disp_int;
u32 hdmi0_status;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 6e3d1c8f3483..8ad4e2cfae15 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -248,23 +248,34 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority
return 0;
 }

-static void radeon_cs_sync_rings(struct radeon_cs_parser *p)
+static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
 {
int i;

for (i = 0; i < p->nrelocs; i++) {
struct reservation_object *resv;
struct fence *fence;
+   struct radeon_fence *rfence;
+   int r;

if (!p->relocs[i].robj)
continue;

resv = p->relocs[i].robj->tbo.resv;
fence = reservation_object_get_excl(resv);
+   if (!fence)
+   continue;
+   rfence = to_radeon_fence(fence);
+   if (!rfence || rfence->rdev != p->rdev) {
+   r = fence_wait(fence, true);
+   if (r)
+   return r;
+   continue;
+   }

-   radeon_semaphore_sync_to(p->ib.semaphore,
-(struct radeon_fence *)fence);
+   radeon_semaphore_sync_to(p->ib.semaphore, rfence);
}
+   return 0;
 }

 /* XXX: note that this is called from the legacy UMS CS ioctl as well */
@@ -474,13 +485,19 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
return r;
}

+   r = radeon_cs_sync_rings(parser);
+   if (r) {
+   if (r != -ERESTARTSYS)
+   DRM_ERROR("Failed to sync rings: %i\n", r);
+   return r;
+   }
+
if (parser->ring == R600_RING_TYPE_UVD_INDEX)
radeon_uvd_note_usage(rdev);
else if ((parser->ring == TN_RING_TYPE_VCE1_INDEX) ||
 (parser->ring == TN_RING_TYPE_VCE2_INDEX))
radeon_vce_note_usage(rdev);

-   radeon_cs_sync_rings(parser);
r = radeon_ib_schedule(rdev, >ib, NULL, true);
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
@@ -567,7 +584,12 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
if (r) {
goto out;
}
-   radeon_cs_sync_rings(parser);
+   r = radeon_cs_sync_rings(parser);
+   if (r) {
+   if (r != -ERESTARTSYS)
+   DRM_ERROR("Failed to sync rings: %i\n", r);
+   goto out;
+   }
radeon_semaphore_sync_to(parser->ib.semaphore, vm->fence);

if ((rdev->family >= CHIP_TAHITI) &&
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index bc894c17b2f9..715b2d95346c 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -36,6 +36,17 @@

 #include 

+struct radeon_flip_work {
+   struct work_struct  flip_work;
+   struct work_struct  unpin_work;
+   struct radeon_device*rdev;
+   int crtc_id;
+   uint64_tbase;
+   struct drm_pending_vblank_event *event;
+   struct radeon_bo*old_rbo;
+   struct fence*fence;
+};
+
 static void avivo_crtc_load_lut(struct drm_crtc *crtc)
 {
struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
@@ -402,14 +413,21 @@ static void radeon_flip_work_func(struct work_struct *__work)

 down_read(>exclusive_lock);
if (work->fence) {
-   r =

[PATCH 4/7] drm/radeon: export reservation_object from dmabuf to ttm

2014-09-04 Thread Maarten Lankhorst

Adds an extra argument to radeon_bo_create, which is used in radeon_prime.c.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/radeon/cik.c  | 4 ++--
 drivers/gpu/drm/radeon/evergreen.c| 6 +++---
 drivers/gpu/drm/radeon/r600.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_benchmark.c | 4 ++--
 drivers/gpu/drm/radeon/radeon_device.c| 2 +-
 drivers/gpu/drm/radeon/radeon_gart.c  | 2 +-
 drivers/gpu/drm/radeon/radeon_gem.c   | 2 +-
 drivers/gpu/drm/radeon/radeon_object.c| 8 +---
 drivers/gpu/drm/radeon/radeon_object.h| 1 +
 drivers/gpu/drm/radeon/radeon_prime.c | 5 -
 drivers/gpu/drm/radeon/radeon_ring.c  | 2 +-
 drivers/gpu/drm/radeon/radeon_sa.c| 2 +-
 drivers/gpu/drm/radeon/radeon_test.c  | 5 +++--
 drivers/gpu/drm/radeon/radeon_ttm.c   | 2 +-
 drivers/gpu/drm/radeon/radeon_uvd.c   | 3 ++-
 drivers/gpu/drm/radeon/radeon_vce.c   | 3 ++-
 drivers/gpu/drm/radeon/radeon_vm.c| 5 +++--
 17 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 1f598ab3b9a7..d984de903928 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4689,7 +4689,7 @@ static int cik_mec_init(struct radeon_device *rdev)
r = radeon_bo_create(rdev,
 rdev->mec.num_mec *rdev->mec.num_pipe * MEC_HPD_SIZE * 2,
 PAGE_SIZE, true,
-RADEON_GEM_DOMAIN_GTT, 0, NULL,
+RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
 >mec.hpd_eop_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create HDP EOP bo failed\n", r);
@@ -4860,7 +4860,7 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)
 sizeof(struct bonaire_mqd),
 PAGE_SIZE, true,
 RADEON_GEM_DOMAIN_GTT, 0, NULL,
->ring[idx].mqd_obj);
+NULL, >ring[idx].mqd_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create MQD bo failed\n", r);
return r;
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index dbca60c7d097..c6ccef6c3596 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -4023,7 +4023,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
if (rdev->rlc.save_restore_obj == NULL) {
r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
 RADEON_GEM_DOMAIN_VRAM, 0, NULL,
->rlc.save_restore_obj);
+NULL, >rlc.save_restore_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create RLC sr bo failed\n", r);
return r;
@@ -4102,7 +4102,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
if (rdev->rlc.clear_state_obj == NULL) {
r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
 RADEON_GEM_DOMAIN_VRAM, 0, NULL,
->rlc.clear_state_obj);
+NULL, >rlc.clear_state_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create RLC c bo failed\n", r);
sumo_rlc_fini(rdev);
@@ -4179,7 +4179,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
r = radeon_bo_create(rdev, rdev->rlc.cp_table_size,
 PAGE_SIZE, true,
 RADEON_GEM_DOMAIN_VRAM, 0, NULL,
->rlc.cp_table_obj);
+NULL, >rlc.cp_table_obj);
if (r) {
dev_warn(rdev->dev, "(%d) create RLC cp table bo failed\n", r);
sumo_rlc_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index a95ced569d84..94e82c6b03ca 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -1430,7 +1430,7 @@ int r600_vram_scratch_init(struct radeon_device *rdev)
if (rdev->vram_scratch.robj == NULL) {
r = radeon_bo_create(rdev, RADEON_GPU_PAGE_SIZE,
 PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
-0, NULL, >vram_scratch.robj);
+0, NULL, NULL, >vram_scratch.robj);

[PATCH 5/7] drm/nouveau: export reservation_object from dmabuf to ttm

2014-09-04 Thread Maarten Lankhorst

Adds an extra argument to nouveau_bo_new, which is used in nouveau_prime.c.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/dispnv04/crtc.c | 2 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c| 4 ++--
 drivers/gpu/drm/nouveau/nouveau_bo.h| 1 +
 drivers/gpu/drm/nouveau/nouveau_chan.c  | 2 +-
 drivers/gpu/drm/nouveau/nouveau_fence.c | 6 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   | 2 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c | 5 -
 drivers/gpu/drm/nouveau/nv17_fence.c| 2 +-
 drivers/gpu/drm/nouveau/nv50_display.c  | 6 +++---
 drivers/gpu/drm/nouveau/nv50_fence.c| 2 +-
 drivers/gpu/drm/nouveau/nv84_fence.c| 4 ++--
 11 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index b90aa5c1f90a..fca6a1f9c20c 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -1127,7 +1127,7 @@ nv04_crtc_create(struct drm_device *dev, int crtc_num)
drm_mode_crtc_set_gamma_size(_crtc->base, 256);

ret = nouveau_bo_new(dev, 64*64*4, 0x100, TTM_PL_FLAG_VRAM,
-0, 0x, NULL, _crtc->cursor.nvbo);
+0, 0x, NULL, NULL, _crtc->cursor.nvbo);
if (!ret) {
ret = nouveau_bo_pin(nv_crtc->cursor.nvbo, TTM_PL_FLAG_VRAM);
if (!ret) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index bda32276bcc2..f89b4a7c93fe 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -181,7 +181,7 @@ nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags,
 int
 nouveau_bo_new(struct drm_device *dev, int size, int align,
   uint32_t flags, uint32_t tile_mode, uint32_t tile_flags,
-  struct sg_table *sg,
+  struct sg_table *sg, struct reservation_object *robj,
   struct nouveau_bo **pnvbo)
 {
struct nouveau_drm *drm = nouveau_drm(dev);
@@ -230,7 +230,7 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
ret = ttm_bo_init(>ttm.bdev, >bo, size,
  type, >placement,
  align >> PAGE_SHIFT, false, NULL, acc_size, sg,
- NULL, nouveau_bo_del_ttm);
+ robj, nouveau_bo_del_ttm);
if (ret) {
/* ttm will call nouveau_bo_del_ttm if it fails.. */
return ret;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h
index ae95b2d43b36..d20c0b5c4e31 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -68,6 +68,7 @@ extern struct ttm_bo_driver nouveau_bo_driver;
 void nouveau_bo_move_init(struct nouveau_drm *);
 int  nouveau_bo_new(struct drm_device *, int size, int align, u32 flags,
u32 tile_mode, u32 tile_flags, struct sg_table *sg,
+   struct reservation_object *robj,
struct nouveau_bo **);
 int  nouveau_bo_pin(struct nouveau_bo *, u32 flags);
 int  nouveau_bo_unpin(struct nouveau_bo *);
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c b/drivers/gpu/drm/nouveau/nouveau_chan.c
index 99cd9e4a2aa6..d639750379d6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_chan.c
+++ b/drivers/gpu/drm/nouveau/nouveau_chan.c
@@ -106,7 +106,7 @@ nouveau_channel_prep(struct nouveau_drm *drm, struct nvif_device *device,
if (nouveau_vram_pushbuf)
target = TTM_PL_FLAG_VRAM;

-   ret = nouveau_bo_new(drm->dev, size, 0, target, 0, 0, NULL,
+   ret = nouveau_bo_new(drm->dev, size, 0, target, 0, 0, NULL, NULL,
>push.buffer);
if (ret == 0) {
ret = nouveau_bo_pin(chan->push.buffer, target);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index decfe6c4ac07..574517a396fd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -195,8 +195,12 @@ nouveau_fence_work(struct fence *fence,

work = kmalloc(sizeof(*work), GFP_KERNEL);
if (!work) {
+   /*
+* this might not be a nouveau fence any more,
+* so force a lazy wait here
+*/
WARN_ON(nouveau_fence_wait((struct nouveau_fence *)fence,
-  false, false));
+  true, false));
goto err;
}

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index b7dbd16904e0..1bc4eb33b60f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -165,7 +165,7 @@ nouveau_gem_new(struct drm_device *dev, int size, int align, uint32_t domain,
flags |= TTM_PL_FLAG_SYSTEM;

ret = nouveau_bo_new(dev, size,

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Maarten Lankhorst

Use the semaphore mechanism to make this happen; this uses signaling
from the CPU instead of signaling by the GPU.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/radeon/radeon.h   |  17 ++-
 drivers/gpu/drm/radeon/radeon_cs.c|  30 ++---
 drivers/gpu/drm/radeon/radeon_fence.c |  13 ++-
 drivers/gpu/drm/radeon/radeon_semaphore.c | 184 ++
 4 files changed, 221 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index dddb2b7dd752..cd18fa7f801c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -359,6 +359,11 @@ struct radeon_fence_driver {
struct delayed_work lockup_work;
 };

+struct radeon_fence_cb {
+   struct fence_cb base;
+   struct fence *fence;
+};
+
 struct radeon_fence {
struct fence base;

@@ -368,6 +373,10 @@ struct radeon_fence {
unsignedring;

wait_queue_tfence_wake;
+
+   atomic_tnum_cpu_cbs;
+   struct radeon_fence_cb  *cpu_cbs;
+   uint32_t*cpu_sema;
 };

 int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring);
@@ -574,9 +583,11 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  */
 struct radeon_semaphore {
struct radeon_sa_bo *sa_bo;
-   signed  waiters;
+   signed  waiters, cpu_waiters, cpu_waiters_max;
uint64_tgpu_addr;
struct radeon_fence *sync_to[RADEON_NUM_RINGS];
+   uint32_t*cpu_sema;
+   struct radeon_fence_cb  *cpu_cbs;
 };

 int radeon_semaphore_create(struct radeon_device *rdev,
@@ -587,6 +598,10 @@ bool radeon_semaphore_emit_wait(struct radeon_device *rdev, int ring,
struct radeon_semaphore *semaphore);
 void radeon_semaphore_sync_to(struct radeon_semaphore *semaphore,
  struct radeon_fence *fence);
+int radeon_semaphore_sync_obj(struct radeon_device *rdev,
+ struct radeon_semaphore *semaphore,
+ struct reservation_object *resv);
+
 int radeon_semaphore_sync_rings(struct radeon_device *rdev,
struct radeon_semaphore *semaphore,
int waiting_ring);
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 8ad4e2cfae15..b141f5bd029d 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -250,32 +250,16 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority

 static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
 {
-   int i;
-
-   for (i = 0; i < p->nrelocs; i++) {
-   struct reservation_object *resv;
-   struct fence *fence;
-   struct radeon_fence *rfence;
-   int r;
+   int i, ret = 0;

+   for (i = 0; !ret && i < p->nrelocs; i++) {
if (!p->relocs[i].robj)
continue;

-   resv = p->relocs[i].robj->tbo.resv;
-   fence = reservation_object_get_excl(resv);
-   if (!fence)
-   continue;
-   rfence = to_radeon_fence(fence);
-   if (!rfence || rfence->rdev != p->rdev) {
-   r = fence_wait(fence, true);
-   if (r)
-   return r;
-   continue;
-   }
-
-   radeon_semaphore_sync_to(p->ib.semaphore, rfence);
+   ret = radeon_semaphore_sync_obj(p->rdev, p->ib.semaphore,
+   p->relocs[i].robj->tbo.resv);
}
-   return 0;
+   return ret;
 }

 /* XXX: note that this is called from the legacy UMS CS ioctl as well */
@@ -442,6 +426,10 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser *parser, int error, bo
 */
list_sort(NULL, >validated, cmp_size_smaller_first);

+   /* must be called with all reservation_objects still held */
+   radeon_semaphore_free(parser->rdev, >ib.semaphore,
+ parser->ib.fence);
+
ttm_eu_fence_buffer_objects(>ticket,
>validated,
>ib.fence->base);
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 0262fe2580d2..7687a7f8f41b 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -142,6 +142,8 @@ int radeon_fence_emit(struct radeon_device *rdev,
(*fence)->ring = ring;
fence_init(&(*fence)->base, _fence_ops,

[PATCH 7/7] drm/nouveau: allow asynchronous waiting using gart fences

2014-09-04 Thread Maarten Lankhorst

This requires allocating a fence sooner to annotate any
cross-dev fences, and making sure that enough memory is
available before emitting the fence.

The current seqno is written to the GART bo on completion,
and a list of finished fences is kept to allow arbitrary depth.
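
The ordering described above (allocate the fence first, emit it only after the
work has been queued) can be sketched in C as below. This is a simplified
model, not the nouveau code: struct mock_fence, struct mock_chan and the
mock_* functions are hypothetical names, memory allocation is left out, and
the work_ret parameter merely stands in for the sync and copy steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver fence; not struct nouveau_fence. */
struct mock_fence {
	int emitted;
	unsigned int seqno;
};

/* Hypothetical channel that tracks the last sequence number handed out. */
struct mock_chan {
	unsigned int next_seqno;
};

/* Phase 1: set the fence up so it can be referenced before work is queued. */
static int mock_fence_new(struct mock_chan *chan, struct mock_fence *fence)
{
	assert(chan != NULL && fence != NULL);
	fence->emitted = 0;
	fence->seqno = 0;
	return 0;
}

/* Phase 2: assign a sequence number once the work has been queued. */
static int mock_fence_emit(struct mock_chan *chan, struct mock_fence *fence)
{
	if (fence->emitted)
		return -1; /* a fence may only be emitted once */
	fence->seqno = ++chan->next_seqno;
	fence->emitted = 1;
	return 0;
}

/* Runs the steps in order and stops at the first one that fails. */
static int mock_move(struct mock_chan *chan, struct mock_fence *fence,
		     int work_ret)
{
	int ret = mock_fence_new(chan, fence);

	if (ret == 0)
		ret = work_ret; /* stands in for sync + copy */
	if (ret == 0)
		ret = mock_fence_emit(chan, fence);
	return ret;
}
```

If the queued work fails, the fence is never emitted and no sequence number is
consumed, which is the property the flattened error handling relies on.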

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |  28 ++--
 drivers/gpu/drm/nouveau/nouveau_chan.c|   6 +-
 drivers/gpu/drm/nouveau/nouveau_display.c |  45 ---
 drivers/gpu/drm/nouveau/nouveau_fence.c   | 212 ++
 drivers/gpu/drm/nouveau/nouveau_fence.h   |  29 ++--
 drivers/gpu/drm/nouveau/nouveau_gem.c |  25 ++--
 drivers/gpu/drm/nouveau/nv04_fence.c  |   9 +-
 drivers/gpu/drm/nouveau/nv10_fence.c  |   9 +-
 drivers/gpu/drm/nouveau/nv84_fence.c  |  31 +++--
 drivers/gpu/drm/nouveau/nvc0_fence.c  |   4 +-
 10 files changed, 305 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index f89b4a7c93fe..24c941927926 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -970,21 +970,21 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict, bool intr,
}

mutex_lock_nested(>mutex, SINGLE_DEPTH_NESTING);
-   ret = nouveau_fence_sync(nouveau_bo(bo), chan, true);
-   if (ret == 0) {
+   ret = nouveau_fence_new(chan, );
+   if (ret)
+   goto out;
+
+   ret = nouveau_fence_sync(nouveau_bo(bo), fence, true);
+   if (ret == 0)
ret = drm->ttm.move(chan, bo, >mem, new_mem);
-   if (ret == 0) {
-   ret = nouveau_fence_new(chan, false, );
-   if (ret == 0) {
-   ret = ttm_bo_move_accel_cleanup(bo,
-   >base,
-   evict,
-   no_wait_gpu,
-   new_mem);
-   nouveau_fence_unref();
-   }
-   }
-   }
+   if (ret == 0)
+   ret = nouveau_fence_emit(fence);
+   if (ret == 0)
+   ret = ttm_bo_move_accel_cleanup(bo, >base, evict,
+   no_wait_gpu, new_mem);
+   nouveau_fence_unref();
+
+out:
mutex_unlock(>mutex);
return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_chan.c b/drivers/gpu/drm/nouveau/nouveau_chan.c
index d639750379d6..1e5c76dfed3a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_chan.c
+++ b/drivers/gpu/drm/nouveau/nouveau_chan.c
@@ -46,9 +46,11 @@ nouveau_channel_idle(struct nouveau_channel *chan)
struct nouveau_fence *fence = NULL;
int ret;

-   ret = nouveau_fence_new(chan, false, );
+   ret = nouveau_fence_new(chan, );
if (!ret) {
-   ret = nouveau_fence_wait(fence, false, false);
+   ret = nouveau_fence_emit(fence);
+   if (!ret)
+   ret = nouveau_fence_wait(fence, false, false);
nouveau_fence_unref();
}

diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index a9ec525c0994..adbf870686aa 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -26,6 +26,7 @@

 #include 
 #include 
+#include 

 #include 

@@ -36,7 +37,6 @@
 #include "nouveau_gem.h"
 #include "nouveau_connector.h"
 #include "nv50_display.h"
-
 #include "nouveau_fence.h"

 #include 
@@ -644,7 +644,7 @@ nouveau_page_flip_emit(struct nouveau_channel *chan,
   struct nouveau_bo *old_bo,
   struct nouveau_bo *new_bo,
   struct nouveau_page_flip_state *s,
-  struct nouveau_fence **pfence)
+  struct nouveau_fence *fence)
 {
struct nouveau_fence_chan *fctx = chan->fence;
struct nouveau_drm *drm = chan->drm;
@@ -657,11 +657,6 @@ nouveau_page_flip_emit(struct nouveau_channel *chan,
list_add_tail(>head, >flip);
spin_unlock_irqrestore(>event_lock, flags);

-   /* Synchronize with the old framebuffer */
-   ret = nouveau_fence_sync(old_bo, chan, false);
-   if (ret)
-   goto fail;
-
/* Emit the pageflip */
ret = RING_SPACE(chan, 2);
if (ret)
@@ -674,7 +669,7 @@ nouveau_page_flip_emit(struct nouveau_channel *chan,
OUT_RING  (chan, 0x);
FIRE_RING (chan);

-   ret = nouveau_fence_new(chan, false, pfence);
+   ret = nouveau_fence_emit(fence);
if (ret)
goto fail;

@@ -700,6 +695,12 @@ nouveau_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
struct nouveau_cli *cli;
struct

[PATCH 3/7] drm/radeon: cope with foreign fences inside the reservation object

2014-09-04 Thread Christian König

On 04.09.2014 at 13:40, Maarten Lankhorst wrote:
> Not the whole world is a radeon! :-)
>
> Signed-off-by: Maarten Lankhorst 
> ---
>   drivers/gpu/drm/radeon/radeon.h | 11 -
>   drivers/gpu/drm/radeon/radeon_cs.c  | 32 +
>   drivers/gpu/drm/radeon/radeon_display.c | 41 -
>   drivers/gpu/drm/radeon/radeon_fence.c   |  3 +++
>   drivers/gpu/drm/radeon/radeon_mode.h|  1 +
>   5 files changed, 61 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index d80dc547a105..dddb2b7dd752 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -699,17 +699,6 @@ void radeon_doorbell_free(struct radeon_device *rdev, u32 doorbell);
>* IRQS.
>*/
>   
> -struct radeon_flip_work {
> - struct work_struct  flip_work;
> - struct work_struct  unpin_work;
> - struct radeon_device*rdev;
> - int crtc_id;
> - uint64_tbase;
> - struct drm_pending_vblank_event *event;
> - struct radeon_bo*old_rbo;
> - struct radeon_fence *fence;
> -};
> -

Please keep this structure where it was. Apart from that, the patch looks
good at first glance.

Christian.

>   struct r500_irq_stat_regs {
>   u32 disp_int;
>   u32 hdmi0_status;
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> index 6e3d1c8f3483..8ad4e2cfae15 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -248,23 +248,34 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority
>   return 0;
>   }
>   
> -static void radeon_cs_sync_rings(struct radeon_cs_parser *p)
> +static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
>   {
>   int i;
>   
>   for (i = 0; i < p->nrelocs; i++) {
>   struct reservation_object *resv;
>   struct fence *fence;
> + struct radeon_fence *rfence;
> + int r;
>   
>   if (!p->relocs[i].robj)
>   continue;
>   
>   resv = p->relocs[i].robj->tbo.resv;
>   fence = reservation_object_get_excl(resv);
> + if (!fence)
> + continue;
> + rfence = to_radeon_fence(fence);
> + if (!rfence || rfence->rdev != p->rdev) {
> + r = fence_wait(fence, true);
> + if (r)
> + return r;
> + continue;
> + }
>   
> - radeon_semaphore_sync_to(p->ib.semaphore,
> -  (struct radeon_fence *)fence);
> + radeon_semaphore_sync_to(p->ib.semaphore, rfence);
>   }
> + return 0;
>   }
>   
>   /* XXX: note that this is called from the legacy UMS CS ioctl as well */
> @@ -474,13 +485,19 @@ static int radeon_cs_ib_chunk(struct radeon_device 
> *rdev,
>   return r;
>   }
>   
> + r = radeon_cs_sync_rings(parser);
> + if (r) {
> + if (r != -ERESTARTSYS)
> + DRM_ERROR("Failed to sync rings: %i\n", r);
> + return r;
> + }
> +
>   if (parser->ring == R600_RING_TYPE_UVD_INDEX)
>   radeon_uvd_note_usage(rdev);
>   else if ((parser->ring == TN_RING_TYPE_VCE1_INDEX) ||
>(parser->ring == TN_RING_TYPE_VCE2_INDEX))
>   radeon_vce_note_usage(rdev);
>   
> - radeon_cs_sync_rings(parser);
>   r = radeon_ib_schedule(rdev, &parser->ib, NULL, true);
>   if (r) {
>   DRM_ERROR("Failed to schedule IB !\n");
> @@ -567,7 +584,12 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device 
> *rdev,
>   if (r) {
>   goto out;
>   }
> - radeon_cs_sync_rings(parser);
> + r = radeon_cs_sync_rings(parser);
> + if (r) {
> + if (r != -ERESTARTSYS)
> + DRM_ERROR("Failed to sync rings: %i\n", r);
> + goto out;
> + }
>   radeon_semaphore_sync_to(parser->ib.semaphore, vm->fence);
>   
>   if ((rdev->family >= CHIP_TAHITI) &&
> diff --git a/drivers/gpu/drm/radeon/radeon_display.c 
> b/drivers/gpu/drm/radeon/radeon_display.c
> index bc894c17b2f9..715b2d95346c 100644
> --- a/drivers/gpu/drm/radeon/radeon_display.c
> +++ b/drivers/gpu/drm/radeon/radeon_display.c
> @@ -36,6 +36,17 @@
>   
>   #include 
>   
> +struct radeon_flip_work {
> + struct work_struct  flip_work;
> + struct work_struct  unpin_work;
> + struct radeon_device*rdev;
> + int crtc_id;
> + uint64_tbase;
> + struct drm_pending_vblank_event *event;
> + struct radeon_bo*old_rbo;
> + struct fence*fence;
> +};
>
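
The per-fence decision made by the quoted radeon_cs_sync_rings() hunk can be sketched in isolation. This is a simplified model under stated assumptions, not the driver code: struct device, struct fence and pick_sync() are hypothetical stand-ins, and where the patch calls fence_wait() or radeon_semaphore_sync_to(), the sketch only returns a label naming the choice.

```c
#include <stddef.h>

enum sync_action { SYNC_NONE, SYNC_CPU_WAIT, SYNC_GPU_SEMAPHORE };

/* Hypothetical stand-ins for radeon_device and a fence with an owner. */
struct device { int id; };
struct fence  { const struct device *owner; /* NULL: fence from another driver */ };

/*
 * Choose how to synchronize against one exclusive fence:
 *  - no fence: nothing to do
 *  - fence from another driver or another device: block on the CPU
 *  - fence from this device: let the GPU wait through a semaphore
 */
enum sync_action pick_sync(const struct device *self, const struct fence *f)
{
    if (f == NULL)
        return SYNC_NONE;
    if (f->owner == NULL || f->owner != self)
        return SYNC_CPU_WAIT;
    return SYNC_GPU_SEMAPHORE;
}
```

The CPU wait is the conservative fallback: it works for any fence, at the cost of stalling command submission until the foreign fence signals.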

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Christian König

On 04.09.2014 at 13:42, Maarten Lankhorst wrote:
> Use the semaphore mechanism to make this happen, this uses signaling
> from the cpu instead of signaling by the gpu.

I'm not sure if this will work reliably when the semaphores are in 
system memory. We might need to reserve some VRAM for them instead.

Regards,
Christian.

>
> Signed-off-by: Maarten Lankhorst 
> ---
>   drivers/gpu/drm/radeon/radeon.h   |  17 ++-
>   drivers/gpu/drm/radeon/radeon_cs.c|  30 ++---
>   drivers/gpu/drm/radeon/radeon_fence.c |  13 ++-
>   drivers/gpu/drm/radeon/radeon_semaphore.c | 184 
> ++
>   4 files changed, 221 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index dddb2b7dd752..cd18fa7f801c 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -359,6 +359,11 @@ struct radeon_fence_driver {
>   struct delayed_work lockup_work;
>   };
>   
> +struct radeon_fence_cb {
> + struct fence_cb base;
> + struct fence *fence;
> +};
> +
>   struct radeon_fence {
>   struct fence base;
>   
> @@ -368,6 +373,10 @@ struct radeon_fence {
>   unsignedring;
>   
>   wait_queue_tfence_wake;
> +
> + atomic_tnum_cpu_cbs;
> + struct radeon_fence_cb  *cpu_cbs;
> + uint32_t*cpu_sema;
>   };
>   
>   int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring);
> @@ -574,9 +583,11 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
>*/
>   struct radeon_semaphore {
>   struct radeon_sa_bo *sa_bo;
> - signed  waiters;
> + signed  waiters, cpu_waiters, cpu_waiters_max;
>   uint64_tgpu_addr;
>   struct radeon_fence *sync_to[RADEON_NUM_RINGS];
> + uint32_t*cpu_sema;
> + struct radeon_fence_cb  *cpu_cbs;
>   };
>   
>   int radeon_semaphore_create(struct radeon_device *rdev,
> @@ -587,6 +598,10 @@ bool radeon_semaphore_emit_wait(struct radeon_device 
> *rdev, int ring,
>   struct radeon_semaphore *semaphore);
>   void radeon_semaphore_sync_to(struct radeon_semaphore *semaphore,
> struct radeon_fence *fence);
> +int radeon_semaphore_sync_obj(struct radeon_device *rdev,
> +   struct radeon_semaphore *semaphore,
> +   struct reservation_object *resv);
> +
>   int radeon_semaphore_sync_rings(struct radeon_device *rdev,
>   struct radeon_semaphore *semaphore,
>   int waiting_ring);
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 8ad4e2cfae15..b141f5bd029d 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -250,32 +250,16 @@ static int radeon_cs_get_ring(struct radeon_cs_parser 
> *p, u32 ring, s32 priority
>   
>   static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
>   {
> - int i;
> -
> - for (i = 0; i < p->nrelocs; i++) {
> - struct reservation_object *resv;
> - struct fence *fence;
> - struct radeon_fence *rfence;
> - int r;
> + int i, ret = 0;
>   
> + for (i = 0; !ret && i < p->nrelocs; i++) {
>   if (!p->relocs[i].robj)
>   continue;
>   
> - resv = p->relocs[i].robj->tbo.resv;
> - fence = reservation_object_get_excl(resv);
> - if (!fence)
> - continue;
> - rfence = to_radeon_fence(fence);
> - if (!rfence || rfence->rdev != p->rdev) {
> - r = fence_wait(fence, true);
> - if (r)
> - return r;
> - continue;
> - }
> -
> - radeon_semaphore_sync_to(p->ib.semaphore, rfence);
> + ret = radeon_semaphore_sync_obj(p->rdev, p->ib.semaphore,
> + p->relocs[i].robj->tbo.resv);
>   }
> - return 0;
> + return ret;
>   }
>   
>   /* XXX: note that this is called from the legacy UMS CS ioctl as well */
> @@ -442,6 +426,10 @@ static void radeon_cs_parser_fini(struct 
> radeon_cs_parser *parser, int error, bo
>*/
>   list_sort(NULL, &parser->validated, cmp_size_smaller_first);
>   
> + /* must be called with all reservation_objects still held */
> + radeon_semaphore_free(parser->rdev, &parser->ib.semaphore,
> +   parser->ib.fence);
> +
>   ttm_eu_fence_buffer_objects(&parser->ticket,
>   &parser->validated,
>   &parser->ib.fence->base);
> diff --git

[Bug 83436] Sudden framerate drops in multiple games

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83436

--- Comment #10 from smoki  ---
OK, I will bisect this; I now have a pretty clear case here, something like a
3x performance drop in OpenJK :)

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140904/acc1c2a6/attachment-0001.html>

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Maarten Lankhorst

Hey,

On 04-09-14 at 13:54, Christian König wrote:
> On 04.09.2014 at 13:42, Maarten Lankhorst wrote:
>> Use the semaphore mechanism to make this happen, this uses signaling
>> from the cpu instead of signaling by the gpu.
>
> I'm not sure if this will work reliably when the semaphores are in system 
> memory. We might need to reserve some VRAM for them instead.
>
> Regards,
> Christian.
Why would it be unreliable? I mostly kept it in semaphore for simplicity.

~Maarten

[PATCH 14/19] drm: Don't update vblank timestamp when the counter didn't change

2014-09-04 Thread Mario Kleiner

I thought about this one again and, contrary to my previous comment, now think
it's fine, also for drivers without hw vblank counter queries.

-mario



On Wed, Aug 6, 2014 at 1:49 PM,  wrote:

> From: Ville Syrjälä 
>
> If we already have a timestamp for the current vblank counter, don't
> update it with a new timestamp. Small errors can creep in between two
> timestamp queries for the same vblank count, which could be confusing to
> userspace when it queries the timestamp for the same vblank sequence
> number twice.
>
> This problem gets exposed when the vblank disable timer is not used
> (or is set to expire quickly) and thus we can get multiple vblank
> disable<->enable transition during the same frame which would all
> attempt to update the timestamp with the latest estimate.
>
> Testcase: igt/kms_flip/flip-vs-expired-vblank
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/drm_irq.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af33df1..0523f5b 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device
> *dev, int crtc)
> DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n",
>   crtc, diff);
>
> +   if (diff == 0)
> +   return;
> +
> /* Reinitialize corresponding vblank timestamp if high-precision
> query
>  * available. Skip this step if query unsupported or failed. Will
>  * reinitialize delayed at next vblank interrupt in that case.
> --
> 1.8.5.5
>
>
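
The effect of the quoted `diff == 0` check can be sketched outside the kernel. This is a minimal model, assuming a free-running 32-bit hardware frame counter; struct vblank_state and vblank_update() are hypothetical names and do not match the drm_irq.c data structures.

```c
#include <stdint.h>

/* Hypothetical per-CRTC bookkeeping, not the real drm_vblank state. */
struct vblank_state {
    uint32_t count;        /* software vblank counter */
    uint32_t last_hw;      /* hardware counter value at the last update */
    int64_t  timestamp_ns; /* timestamp stored for 'count' */
};

/*
 * Fold a new hardware counter sample into the software state.
 * Returns 1 if the stored timestamp was replaced, 0 if it was kept.
 */
int vblank_update(struct vblank_state *v, uint32_t hw_count, int64_t now_ns)
{
    uint32_t diff = hw_count - v->last_hw; /* unsigned math handles wraparound */

    /* Same vblank as before: keep the timestamp already handed out. */
    if (diff == 0)
        return 0;

    v->last_hw = hw_count;
    v->count += diff;
    v->timestamp_ns = now_ns;
    return 1;
}
```

With the early return, two queries for the same vblank sequence number see the same timestamp even if a disable/enable transition happened in between.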

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Christian König

On 04.09.2014 at 14:08, Maarten Lankhorst wrote:
> Hey,
>
> On 04-09-14 at 13:54, Christian König wrote:
>> On 04.09.2014 at 13:42, Maarten Lankhorst wrote:
>>> Use the semaphore mechanism to make this happen, this uses signaling
>>> from the cpu instead of signaling by the gpu.
>> I'm not sure if this will work reliably when the semaphores are in system 
>> memory. We might need to reserve some VRAM for them instead.
>>
>> Regards,
>> Christian.
> Why would it be unreliable? I mostly kept it in semaphore for simplicity.

The semaphore block tries to avoid memory accesses whenever possible.

For example, when a signal for address A arrives, the block doesn't 
necessarily write that to memory but instead tries to match it 
immediately with a wait for address A. The same is true if a wait for 
address A arrives and the semaphore block thinks it knows the memory 
value at address A.

Also, I'm not sure if the semaphore block really polls the memory address 
for changes; instead it might just snoop the MC (memory controller) for 
writes to this address. Since CPU writes to system memory aren't seen by 
the GPU MC, the semaphore block would never know something changed.

I need to check the docs on how to do this correctly,
Christian.

>
> ~Maarten
>

[Intel-gfx] [PATCH v2] drm/i915: Sysfs interface to get GFX shmem usage stats per process

2014-09-04 Thread Daniel Vetter

On Thu, Sep 04, 2014 at 11:52:15AM +, Gupta, Sourab wrote:
> On Thu, 2014-09-04 at 10:01 +, Daniel Vetter wrote:
> > Interface design discussions should happen in public (so that
> > non-intel people can jump in, which happens rather often for other
> > drivers actually). But at least include internal mailing lists next
> > time around. Also adding dri-devel.
> > 
> > The problem I see with your approach is that "process-wise" is not a
> > solid concept with drm. We can dump information per open drm file, but
> > that file descriptor can be shared between processes. And the latest
> > generation of linux compositor protocols (like dri3) actually take
> > advantage of this.
> 
> By "process-wise" sharing, do you mean the sharing of the drm file
> across different processes (having different tgid's), or is it sharing
> across the threads of a single process (having same tgid)?
> Sorry, we are not aware of the sharing of drm file across processes in
> dri3 protocols, as in android userspace, we have not come across such
> scenario. Can you please shed some light on it.
> 
> In our design, we have a tgid based accounting mechanism. As long as the
> drm file is shared within the threads of the same process, its resources
> (objects and memory) are accounted together. But if the drm file is
> shared across different processes (diff tgid's), this case is still an
> open.
> Will our tgid based accounting cover the dri3 usecases also (if they
> share drm file within same tgid)?

Well in unix a file descriptor is simply not tied to a process/thread at
all, so if you expose accounting data for resources which are tied to file
descriptors then that doesn't work. E.g.
- fork inherits all the file descriptors from its parent, same for exec
- you can pass file descriptors explicitly between processes over unix
  domain sockets (this is what dri3 does).

So if you'd use the tgid of the process that opened the file you'd account
everything to the X server with dri3. Which is not really useful.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
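
The fork case above is easy to demonstrate from userspace. The following is a minimal sketch using a pipe in place of a drm file; opener_differs_from_user() is a hypothetical helper written for illustration. The descriptor is created by one process and then used by another with a different pid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create a pipe in the parent, fork, and let the child write its own pid
 * through the inherited descriptor. Returns 1 if the process that used the
 * descriptor differs from the one that opened it, 0 if not, -1 on error.
 */
int opener_differs_from_user(void)
{
    int fds[2];
    pid_t opener = getpid();
    pid_t user = 0;
    pid_t pid;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: never opened the pipe, yet can write to it. */
        pid_t me = getpid();
        close(fds[0]);
        n = write(fds[1], &me, sizeof(me));
        _exit(n == (ssize_t)sizeof(me) ? 0 : 1);
    }

    close(fds[1]);
    n = read(fds[0], &user, sizeof(user));
    close(fds[0]);
    waitpid(pid, NULL, 0);

    if (n != (ssize_t)sizeof(user))
        return -1;
    return user != opener;
}
```

Accounting keyed on the pid that created the descriptor would charge the parent for everything the child does through it, which is the same effect as charging the X server under dri3.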

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

--- Comment #7 from José Suárez  ---
As stated by smoki, reverting that commit indeed fixes the problem.

[Bug 44126] [r300g] 0ad: carpet textures "flash" and get hidden by ground texture.

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=44126

--- Comment #4 from Marek Olšák  ---
Flickering when 2 primitives exactly intersect each other and vertex positions
are not equal is a quite common programming mistake in games. Drivers cannot do
anything about it.

[Bug 75112] Meta Bug for HyperZ issues on r600g and radeonsi

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=75112

smoki  changed:

   What|Removed |Added

 Depends on||83418

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

smoki  changed:

   What|Removed |Added

 Blocks||75112

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Christian König

> I need to check the docs how to do this correctly,
The docs don't really cover this case.

For the GPU waiting on an address there is an extra document just for 
this case which I don't have at hand right now. But IIRC it was 
recommended to use the local memory of the device waiting on the 
semaphore. I'm just not sure if that's for pure performance reasons to 
avoid accessing the bus or if there's a hard and unavoidable hardware 
reason to do so.

For the GPU signaling case there is a special bit in the semaphore 
instructions that you need to set if any user outside of the GPU should 
see the write.

In general it is explicitly supported to use semaphores for inter-device 
synchronization on the bus (that's what the block is made for), but it's 
not intended to be used for synchronization between the CPU and the 
device. So I'm not sure if things like cache snooping are implemented and 
correctly supported.

Well, I see the feature more as a nice-to-have that needs a bunch of 
testing, so I would say either wait with the patch for now or make it 
optional to use or something like that.

Regards,
Christian.

On 04.09.2014 at 14:25, Christian König wrote:
> On 04.09.2014 at 14:08, Maarten Lankhorst wrote:
>> Hey,
>>
>> On 04-09-14 at 13:54, Christian König wrote:
>>> On 04.09.2014 at 13:42, Maarten Lankhorst wrote:
>>>> Use the semaphore mechanism to make this happen, this uses signaling
>>>> from the cpu instead of signaling by the gpu.
>>> I'm not sure if this will work reliably when the semaphores are in 
>>> system memory. We might need to reserve some VRAM for them instead.
>>>
>>> Regards,
>>> Christian.
>> Why would it be unreliable? I mostly kept it in semaphore for 
>> simplicity.
>
> The semaphore block tries to avoid memory accesses whenever possible.
>
> For example when a signal for address A arrives the block doesn't 
> necessary writes that to memory but instead tries to match it 
> immediately with a wait for address A. Similar is true if a wait for 
> address A arrives and the semaphore block thinks it knows the memory 
> value at address A.
>
> Also I'm not sure if the semaphore block really polls the memory 
> address for changes, instead it might just snoop the MC for writes to 
> this address. Since CPU writes to system memory aren't seen by the GPU 
> MC the semaphore block would never know something changed.
>
> I need to check the docs how to do this correctly,
> Christian.
>
>>
>> ~Maarten
>>
>

[Bug 83500] New: si_dma_copy_tile causes GPU hangs

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83500

  Priority: medium
Bug ID: 83500
  Assignee: dri-devel at lists.freedesktop.org
   Summary: si_dma_copy_tile causes GPU hangs
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: greg at chown.ath.cx
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Drivers/Gallium/radeonsi
   Product: Mesa

Created attachment 105745
  --> https://bugs.freedesktop.org/attachment.cgi?id=105745&action=edit
Workaround

Async DMA linear-to-tiled copies are causing GPU hangs in some cases. On Cape
Verde, I can easily trigger this as described in [1]. The game Brutal Legend
also triggers similar hangs when it streams assets during gameplay.

Disabling usage of this function and using the resource_copy_region fallback
instead fixes all hangs. The attached patch does that.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=79980#c124

[Bug 79980] Random radeonsi crashes

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=79980

--- Comment #128 from Grigori Goronzy  ---
You might want to try the patch in
https://bugs.freedesktop.org/show_bug.cgi?id=83500

Maybe some of these issues have a common cause.

[Bug 83418] EU IV is incorrectly rendered after git1409011930.d571f2

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83418

Marek Olšák  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Marek Olšák  ---
I reverted the problematic commit as 8bd67231797e5d79d72a4e91b37ea81da30c6df3.
Closing.

[Bug 75112] Meta Bug for HyperZ issues on r600g and radeonsi

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=75112

Bug 75112 depends on bug 83418, which changed state.

Bug 83418 Summary: EU IV is incorrectly rendered after git1409011930.d571f2
https://bugs.freedesktop.org/show_bug.cgi?id=83418

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[PATCH 6/7] drm/radeon: allow asynchronous waiting on foreign fences

2014-09-04 Thread Maarten Lankhorst

Hey,

On 04-09-14 at 15:34, Christian König wrote:
>> I need to check the docs how to do this correctly,
> The docs don't really cover this case.
>
> For the GPU waiting on an address there is an extra document just for this 
> case which I don't have at hand right now. But IIRC it was recommended to use 
> the local memory of the device waiting on the semaphore. I'm just not sure if 
> that's for pure performance reasons to avoid accessing the bus or if there's 
> a hard and unavoidable hardware reason to do so.
>
> For the GPU signaling case there is a special bit in the semaphore 
> instructions that you need to set if any user outside of the GPU should see 
> the write.
>
> In general it is explicitly supported to use semaphores for inter device 
> synchronization on the bus (that's what the block is made for), but it's not 
> intended to be used for synchronization between the CPU and the device. So 
> I'm not sure if things like cache snooping is implemented and correctly 
> supported.
>
> Well I see the feature more like nice to have and needs a bunch of testing, 
> so I would say either wait with the patch for now or make it optional to use 
> or something like that.
You're right, it's meant as something 'nice to have'. This is why it came after 
the patch that exports reservation_object to/from dma-buf. :-)

~Maarten

[Bug 83500] si_dma_copy_tile causes GPU hangs

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83500

--- Comment #1 from Marek Olšák  ---
Thank you very much for tracking this down.

[Bug 83432] r600_query.c:269:r600_emit_query_end: Assertion `ctx->num_pipelinestat_queries > 0' failed [Gallium HUD]

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83432

Marek Olšák  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Marek Olšák  ---
Fixed by 3dbf55c1be5a8867616e475d943c776d8245d0c. Closing.

SI display gap for more than 2 displays

2014-09-04 Thread Sylvain BERTRAND

Hi,

In si_program_display_gap we have DISP1_GAP and DISP2_GAP.

Where are DISP3_GAP to DISP6_GAP? What does this hardware
block expect when more than 2 displays are connected? Does DISP2_GAP
actually stand for DISP[3-6]_GAP?

Still in the same function, what happened to the pipes for
DCCG_DISP[2-6]_SLOW_SELECT?

regards,

-- 
Sylvain

P.S. It seems that all this was "fixed" in CI with new hardware
blocks, but I'm focussing on SI blocks.

[Bug 81644] Random crashes on RadeonSI with Chromium.

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81644

--- Comment #82 from Aaron B  ---
I'm still bisecting, but I just want to say I suck at it and I'll probably need
at least 2 bisects to the same point, if not more. I'm trying to be patient,
but on the old Mesa versions the glitch just takes so long to reproduce, even
when I set things up to trigger it.

So, should I skip ahead and check whether the issue behind the DMA patch that
was just proposed is the source of our problem, too?

https://bugs.freedesktop.org/show_bug.cgi?id=83500

SI display gap for more than 2 displays

2014-09-04 Thread Sylvain BERTRAND

On Thu, Sep 04, 2014 at 03:52:20PM +0200, Sylvain BERTRAND wrote:
> Hi,
> 
> In si_program_display_gap we have DISP1_GAP and DISP2_GAP.
> 
> Where are DISP3_GAP to DISP6_GAP? What does expect this hardware
> block when more than 2 displays are connected? Is DISP2_GAP
> actually stand for DISP[3-6]_GAP?
> 
> Still in the same function, what happened to the pipes for
> DCCG_DISP[2-6]_SLOW_SELECT?

I noticed something else: in si_enable_display_gap, the
DISP1_GAP_MCHG and DISP2_GAP_MCHG fields from CG_DISPLAY_GAP_CNTL
get initialized with DISP1 only to vblank, and are never reprogrammed
for new displays the way DISP[12]_GAP are. This seems inconsistent;
is it expected?

regards,

-- 
Sylvain BERTRAND

[PATCH -v3 3/4] drm/i915: split intel_cursor_plane_update() into check() and commit()

2014-09-04 Thread Ville Syrjälä

On Wed, Sep 03, 2014 at 05:10:17PM -0300, Gustavo Padovan wrote:
> From: Gustavo Padovan 
> 
> Due to the upcoming atomic modesetting feature we need to separate
> some update functions into a check step that can fail and a commit
> step that should, ideally, never fail.
> 
> The commit part can still fail, but that should be solved in another
> upcoming patch.
> 
> Signed-off-by: Gustavo Padovan 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 104 
> ++-
>  1 file changed, 67 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 22d3902..c3f1967 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11896,51 +11896,42 @@ intel_cursor_plane_disable(struct drm_plane *plane)
>  }
>  
>  static int
> -intel_cursor_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
> -   struct drm_framebuffer *fb, int crtc_x, int crtc_y,
> -   unsigned int crtc_w, unsigned int crtc_h,
> -   uint32_t src_x, uint32_t src_y,
> -   uint32_t src_w, uint32_t src_h)
> +intel_check_cursor_plane(struct drm_plane *plane,
> +  struct intel_plane_state *state)
>  {
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> - struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
> - struct drm_i915_gem_object *obj = intel_fb->obj;
> - struct drm_rect dest = {
> - /* integer pixels */
> - .x1 = crtc_x,
> - .y1 = crtc_y,
> - .x2 = crtc_x + crtc_w,
> - .y2 = crtc_y + crtc_h,
> - };
> - struct drm_rect src = {
> - /* 16.16 fixed point */
> - .x1 = src_x,
> - .y1 = src_y,
> - .x2 = src_x + src_w,
> - .y2 = src_y + src_h,
> - };
> - const struct drm_rect clip = {
> - /* integer pixels */
> - .x2 = intel_crtc->active ? intel_crtc->config.pipe_src_w : 0,
> - .y2 = intel_crtc->active ? intel_crtc->config.pipe_src_h : 0,
> - };
> - bool visible;
> - int ret;
> + struct drm_crtc *crtc = state->crtc;
> + struct drm_framebuffer *fb = state->fb;
> + struct drm_rect *dest = &state->dst;
> + struct drm_rect *src = &state->src;
> + const struct drm_rect *clip = &state->clip;
>  
> - ret = drm_plane_helper_check_update(plane, crtc, fb,
> - &src, &dest, &clip,
> + return drm_plane_helper_check_update(plane, crtc, fb,
> + src, dest, clip,
>   DRM_PLANE_HELPER_NO_SCALING,
>   DRM_PLANE_HELPER_NO_SCALING,
> - true, true, &visible);
> - if (ret)
> - return ret;
> + true, true, &state->visible);
> +}
>  
> - crtc->cursor_x = crtc_x;
> - crtc->cursor_y = crtc_y;
> +static int
> +intel_commit_cursor_plane(struct drm_plane *plane,
> +   struct intel_plane_state *state)
> +{
> + struct drm_crtc *crtc = state->crtc;
> + struct drm_framebuffer *fb = state->fb;
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
> + struct drm_i915_gem_object *obj = intel_fb->obj;
> + struct drm_rect *dest = &state->dst;
> + int crtc_w, crtc_h;
> +
> + crtc->cursor_x = state->dst.x1;
> + crtc->cursor_y = state->dst.y1;
>   if (fb != crtc->cursor->fb) {
> + crtc_w = drm_rect_width(dest);
> + crtc_h = drm_rect_height(dest);

These would need to be the original unclipped coordinates since we
program the cursor hardware with those and the hardware clips itself.

>   return intel_crtc_cursor_set_obj(crtc, obj, crtc_w, crtc_h);
>   } else {
> - intel_crtc_update_cursor(crtc, visible);
> + intel_crtc_update_cursor(crtc, state->visible);
>  
>   intel_frontbuffer_flip(crtc->dev,
>  
> INTEL_FRONTBUFFER_CURSOR(intel_crtc->pipe));
> @@ -11948,6 +11939,45 @@ intel_cursor_plane_update(struct drm_plane *plane, 
> struct drm_crtc *crtc,
>   return 0;
>   }
>  }
> +
> +static int
> +intel_cursor_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
> +   struct drm_framebuffer *fb, int crtc_x, int crtc_y,
> +   unsigned int crtc_w, unsigned int crtc_h,
> +   uint32_t src_x, uint32_t src_y,
> +   uint32_t src_w, uint32_t src_h)
> +{
> + struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_plane_state state;
> + int ret;
> +
> + state.crtc = crtc;
> + state.fb = fb;
> +
> + /* sample coordinates in 16.16 fixed point */

[PATCH -v3 4/4] drm/i915: split intel_primary_plane_setplane() into check() and commit()

2014-09-04 Thread Ville Syrjälä

On Wed, Sep 03, 2014 at 05:10:18PM -0300, Gustavo Padovan wrote:
> From: Gustavo Padovan 
> 
> As a preparation for atomic updates we need to split the code to check
> everything we are going to commit first. This patch starts the work to
> split intel_primary_plane_setplane() into check() and commit() parts.
> 
> More work is expected on this to get a better split of the two steps.
> Ideally the commit() step should never fail.
> 
> Signed-off-by: Gustavo Padovan 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 130 
> +++
>  1 file changed, 72 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index c3f1967..1e3985b 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11663,63 +11663,37 @@ disable_unpin:
>  }
>  
>  static int
> -intel_primary_plane_setplane(struct drm_plane *plane, struct drm_crtc *crtc,
> -  struct drm_framebuffer *fb, int crtc_x, int crtc_y,
> -  unsigned int crtc_w, unsigned int crtc_h,
> -  uint32_t src_x, uint32_t src_y,
> -  uint32_t src_w, uint32_t src_h)
> +intel_check_primary_plane(struct drm_plane *plane,
> +   struct intel_plane_state *state)
> +{
> + struct drm_crtc *crtc = state->crtc;
> + struct drm_framebuffer *fb = state->fb;
> + struct drm_rect *dest = &state->dst;
> + struct drm_rect *src = &state->src;
> + const struct drm_rect *clip = &state->clip;
> +
> + return drm_plane_helper_check_update(plane, crtc, fb,
> + src, dest, clip,
> + DRM_PLANE_HELPER_NO_SCALING,
> + DRM_PLANE_HELPER_NO_SCALING,
> + false, true, &state->visible);
> +}
> +
> +static int
> +intel_commit_primary_plane(struct drm_plane *plane,
> +struct intel_plane_state *state)
>  {
> + struct drm_crtc *crtc = state->crtc;
> + struct drm_framebuffer *fb = state->fb;
>   struct drm_device *dev = crtc->dev;
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>   struct drm_i915_gem_object *obj = intel_fb_obj(fb);
>   struct drm_i915_gem_object *old_obj = intel_fb_obj(plane->fb);
> - struct drm_rect dest = {
> - /* integer pixels */
> - .x1 = crtc_x,
> - .y1 = crtc_y,
> - .x2 = crtc_x + crtc_w,
> - .y2 = crtc_y + crtc_h,
> - };
> - struct drm_rect src = {
> - /* 16.16 fixed point */
> - .x1 = src_x,
> - .y1 = src_y,
> - .x2 = src_x + src_w,
> - .y2 = src_y + src_h,
> - };
> - const struct drm_rect clip = {
> - /* integer pixels */
> - .x2 = intel_crtc->active ? intel_crtc->config.pipe_src_w : 0,
> - .y2 = intel_crtc->active ? intel_crtc->config.pipe_src_h : 0,
> - };
> - const struct {
> - int crtc_x, crtc_y;
> - unsigned int crtc_w, crtc_h;
> - uint32_t src_x, src_y, src_w, src_h;
> - } orig = {
> - .crtc_x = crtc_x,
> - .crtc_y = crtc_y,
> - .crtc_w = crtc_w,
> - .crtc_h = crtc_h,
> - .src_x = src_x,
> - .src_y = src_y,
> - .src_w = src_w,
> - .src_h = src_h,
> - };
>   struct intel_plane *intel_plane = to_intel_plane(plane);
> - bool visible;
> + struct drm_rect *src = &state->src;
>   int ret;
>  
> - ret = drm_plane_helper_check_update(plane, crtc, fb,
> - &src, &dest, &clip,
> - DRM_PLANE_HELPER_NO_SCALING,
> - DRM_PLANE_HELPER_NO_SCALING,
> - false, true, &visible);
> -
> - if (ret)
> - return ret;
> -
>   /*
>* If the CRTC isn't enabled, we're just pinning the framebuffer,
>* updating the fb pointer, and returning without touching the
> @@ -11754,7 +11728,7 @@ intel_primary_plane_setplane(struct drm_plane *plane, struct drm_crtc *crtc,
>* happens if userspace explicitly disables the plane by passing fb=0
>* because plane->fb still gets set and pinned.
>*/
> - if (!visible) {
> + if (!state->visible) {
>   mutex_lock(&dev->struct_mutex);
>  
>   /*
> @@ -11801,7 +11775,7 @@ intel_primary_plane_setplane(struct drm_plane *plane, struct drm_crtc *crtc,
>   intel_disable_fbc(dev);
>   }
>   }
> - ret = intel_pipe_set_base(crtc, src.x1, src.y1, fb);
> + ret = intel_pipe_set_base(crtc, src->x1, src->y1, fb);
>   if
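
The check()/commit() split described in the commit message above can be illustrated outside the driver. The following is only a minimal sketch under stated assumptions: struct plane_state, struct plane_hw, plane_check(), plane_commit() and plane_update() are hypothetical names invented for this example and are not the i915 structures or functions from the patch. The point is that all validation and derived values live in check(), so that commit() has no failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a plane state; not the i915 struct. */
struct plane_state {
	int crtc_x, crtc_y;          /* destination position, integer pixels */
	unsigned int crtc_w, crtc_h; /* destination size, integer pixels */
	bool checked;                /* set once plane_check() has succeeded */
	bool visible;                /* computed by check, consumed by commit */
};

/* Hypothetical shadow of the hardware registers that commit writes to. */
struct plane_hw {
	int x, y;
	unsigned int w, h;
	bool enabled;
};

/* check(): validate and derive values. May fail, never touches hardware. */
static int plane_check(struct plane_state *state,
		       unsigned int pipe_w, unsigned int pipe_h)
{
	if (state->crtc_w == 0 || state->crtc_h == 0)
		return -EINVAL;

	/* The plane is visible only if it intersects the pipe area. */
	state->visible = state->crtc_x < (int)pipe_w &&
			 state->crtc_y < (int)pipe_h &&
			 state->crtc_x + (int)state->crtc_w > 0 &&
			 state->crtc_y + (int)state->crtc_h > 0;
	state->checked = true;
	return 0;
}

/* commit(): apply an already-checked state. No error return by design. */
static void plane_commit(struct plane_hw *hw, const struct plane_state *state)
{
	assert(state->checked);

	hw->x = state->crtc_x;
	hw->y = state->crtc_y;
	hw->w = state->crtc_w;
	hw->h = state->crtc_h;
	hw->enabled = state->visible;
}

/* Legacy-style entry point: check first, commit only on success. */
static int plane_update(struct plane_hw *hw, struct plane_state *state,
			unsigned int pipe_w, unsigned int pipe_h)
{
	int ret = plane_check(state, pipe_w, pipe_h);

	if (ret)
		return ret;

	plane_commit(hw, state);
	return 0;
}
```

With this shape, an atomic-style caller could run plane_check() on every plane first and call plane_commit() only once all checks have passed. In the real patch the commit step can still fail because of framebuffer pinning, which this sketch does not model.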

[PATCH -v3 2/4] drm/i915: split intel_update_plane into check() and commit()

2014-09-04 Thread Ville Syrjälä

On Wed, Sep 03, 2014 at 05:10:16PM -0300, Gustavo Padovan wrote:
> From: Gustavo Padovan 
> 
> Due to the upcoming atomic modesetting feature we need to separate
> some update functions into a check step that can fail and a commit
> step that should, ideally, never fail.
> 
> This commit splits intel_update_plane(); its commit part can still
> fail due to the fb pinning procedure.

Crap. I wrote a reply and somehow it seems to have gotten lost before I
sent it out. I'll try to recall it all again...

> 
> Signed-off-by: Gustavo Padovan 
> ---
>  drivers/gpu/drm/i915/intel_sprite.c | 253 +---
>  1 file changed, 150 insertions(+), 103 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index 07a74ef..7b0d1a9 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -845,57 +845,24 @@ static bool colorkey_enabled(struct intel_plane *intel_plane)
>  }
>  
>  static int
> -intel_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
> -struct drm_framebuffer *fb, int crtc_x, int crtc_y,
> -unsigned int crtc_w, unsigned int crtc_h,
> -uint32_t src_x, uint32_t src_y,
> -uint32_t src_w, uint32_t src_h)
> +intel_check_sprite_plane(struct drm_plane *plane,
> +  struct intel_plane_state *state)
>  {
> - struct drm_device *dev = plane->dev;
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> + struct intel_crtc *intel_crtc = to_intel_crtc(state->crtc);
>   struct intel_plane *intel_plane = to_intel_plane(plane);
> - enum pipe pipe = intel_crtc->pipe;
> + struct drm_framebuffer *fb = state->fb;
>   struct intel_framebuffer *intel_fb = to_intel_framebuffer(fb);
>   struct drm_i915_gem_object *obj = intel_fb->obj;
> - struct drm_i915_gem_object *old_obj = intel_plane->obj;
> - int ret;
> - bool primary_enabled;
> - bool visible;
> + int crtc_x, crtc_y;
> + unsigned int crtc_w, crtc_h;
> + uint32_t src_x, src_y, src_w, src_h;
> + struct drm_rect *src = &state->src;
> + struct drm_rect *dst = &state->dst;
> + struct drm_rect *orig_src = &state->orig_src;
> + const struct drm_rect *clip = &state->clip;
>   int hscale, vscale;
>   int max_scale, min_scale;
>   int pixel_size = drm_format_plane_cpp(fb->pixel_format, 0);
> - struct drm_rect src = {
> - /* sample coordinates in 16.16 fixed point */
> - .x1 = src_x,
> - .x2 = src_x + src_w,
> - .y1 = src_y,
> - .y2 = src_y + src_h,
> - };
> - struct drm_rect dst = {
> - /* integer pixels */
> - .x1 = crtc_x,
> - .x2 = crtc_x + crtc_w,
> - .y1 = crtc_y,
> - .y2 = crtc_y + crtc_h,
> - };
> - const struct drm_rect clip = {
> - .x2 = intel_crtc->active ? intel_crtc->config.pipe_src_w : 0,
> - .y2 = intel_crtc->active ? intel_crtc->config.pipe_src_h : 0,
> - };
> - const struct {
> - int crtc_x, crtc_y;
> - unsigned int crtc_w, crtc_h;
> - uint32_t src_x, src_y, src_w, src_h;
> - } orig = {
> - .crtc_x = crtc_x,
> - .crtc_y = crtc_y,
> - .crtc_w = crtc_w,
> - .crtc_h = crtc_h,
> - .src_x = src_x,
> - .src_y = src_y,
> - .src_w = src_w,
> - .src_h = src_h,
> - };
>  
>   /* Don't modify another pipe's plane */
>   if (intel_plane->pipe != intel_crtc->pipe) {
> @@ -927,55 +894,55 @@ intel_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
>   max_scale = intel_plane->max_downscale << 16;
>   min_scale = intel_plane->can_scale ? 1 : (1 << 16);
>  
> - drm_rect_rotate(&src, fb->width << 16, fb->height << 16,
> + drm_rect_rotate(src, fb->width << 16, fb->height << 16,
>   intel_plane->rotation);
>  
> - hscale = drm_rect_calc_hscale_relaxed(&src, &dst, min_scale, max_scale);
> + hscale = drm_rect_calc_hscale_relaxed(src, dst, min_scale, max_scale);
>   BUG_ON(hscale < 0);
>  
> - vscale = drm_rect_calc_vscale_relaxed(&src, &dst, min_scale, max_scale);
> + vscale = drm_rect_calc_vscale_relaxed(src, dst, min_scale, max_scale);
>   BUG_ON(vscale < 0);
>  
> - visible = drm_rect_clip_scaled(&src, &dst, &clip, hscale, vscale);
> + state->visible =  drm_rect_clip_scaled(src, dst, clip, hscale, vscale);
>  
> - crtc_x = dst.x1;
> - crtc_y = dst.y1;
> - crtc_w = drm_rect_width(&dst);
> - crtc_h = drm_rect_height(&dst);
> + crtc_x = dst->x1;
> + crtc_y = dst->y1;
> + crtc_w = drm_rect_width(dst);
> + crtc_h = drm_rect_height(dst);
>  
> - if (visible) {
> + if (state->visible) {
>   /* check again in case clipping clamped the results */
> - hscale = drm_rect_calc_hscale(&src, &dst, min_scale,
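
For readers unfamiliar with the 16.16 fixed-point convention used in the hunk above: the source rectangle is in 16.16 fixed point while the destination is in integer pixels, so dividing the source width by the destination width gives a 16.16 scale factor directly, and 1 << 16 means "no scaling". The following is a simplified sketch of that arithmetic only. struct rect, rect_width() and calc_hscale() are hypothetical names for this example; this is not the DRM helper itself, and its error handling is an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical rectangle type, loosely modelled on struct drm_rect. */
struct rect {
	int x1, y1, x2, y2;
};

static int rect_width(const struct rect *r)
{
	return r->x2 - r->x1;
}

/*
 * Horizontal scale factor in 16.16 fixed point.
 *
 * src is in 16.16 fixed point and dst is in integer pixels, so
 * src_w / dst_w is already a 16.16 ratio. Returns a negative errno
 * if the rectangles are inverted or the factor falls outside
 * [min_scale, max_scale].
 */
static int calc_hscale(const struct rect *src, const struct rect *dst,
		       int min_scale, int max_scale)
{
	int src_w = rect_width(src);
	int dst_w = rect_width(dst);
	int hscale;

	if (src_w < 0 || dst_w < 0)
		return -EINVAL;

	/* An empty destination has no meaningful scale factor. */
	if (dst_w == 0)
		return 0;

	hscale = src_w / dst_w;
	if (hscale < min_scale || hscale > max_scale)
		return -ERANGE;

	return hscale;
}
```

With min_scale and max_scale both set to 1 << 16, any source width that differs from the destination width is rejected, which matches the "can_scale ? 1 : (1 << 16)" lower bound in the quoted code for planes that cannot scale.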

[Bug 83416] [radeonsi] Serious Sam 3 lockup during its start

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83416

Laurent carlier  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Laurent carlier  ---
I can confirm that 8bd67231797e5d79d72a4e91b37ea81da30c6df3 is fixing the hang.

Thanks Marek, closing!

[Bug 83416] [radeonsi] Serious Sam 3 lockup during its start

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83416

Laurent carlier  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #16 from Laurent carlier  ---
Bad luck, it's hanging again! -> reopened

[Bug 83500] si_dma_copy_tile causes GPU hangs

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83500

--- Comment #2 from Grigori Goronzy  ---
Created attachment 105755
  --> https://bugs.freedesktop.org/attachment.cgi?id=105755&action=edit
Better fix

This is a possibly better fix that only disables DMA if 1D tiling is involved.
Please give it a try.

[Bug 83416] [radeonsi] Serious Sam 3 lockup during its start

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=83416

--- Comment #17 from Grigori Goronzy  ---
Does this Mesa patch help?

https://bugs.freedesktop.org/attachment.cgi?id=105755

[Bug 81239] Evolution window content not shown fully (only desktop background)

2014-09-04 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=81239

--- Comment #11 from Paul Menzel  ---
I created ticket 736069 [1] in the GNOME Bugzilla bug tracker.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=736069
