[Bug 41740] New: Mesa 7.12-devel gallium/state_trackers/d3d1x compilation error

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=41740

   Summary: Mesa 7.12-devel gallium/state_trackers/d3d1x
compilation error
   Product: Mesa
   Version: git
  Platform: x86-64 (AMD64)
OS/Version: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: wolput at onsneteindhoven.nl


Compiling Mesa 7.12-devel configured with --enable-d3d1x shows the following
error:
---
In file included from d3d11.cpp:220:0:
d3d11_context.h: In member function ‘void
GalliumD3D11DeviceContext::init_context()’:
d3d11_context.h:153:34: error: ‘screen’ was not declared in this scope
d3d11.cpp: In function ‘HRESULT GalliumD3D11DeviceCreate(pipe_screen*,
pipe_context*, BOOL, unsigned int, IDXGIAdapter*, ID3D11Device**)’:
d3d11.cpp:224:200: error: new declaration ‘HRESULT
GalliumD3D11DeviceCreate(pipe_screen*, pipe_context*, BOOL, unsigned int,
IDXGIAdapter*, ID3D11Device**)’
../gd3dapi/galliumd3d11.h:65:10: error: ambiguates old declaration ‘HRESULT
GalliumD3D11DeviceCreate(pipe_screen*, pipe_context*, BOOL, unsigned int,
IDXGIAdapter*, ID3D11Device**)’
In file included from d3d11.cpp:220:0:
d3d11_context.h: In member function ‘HRESULT
GalliumD3D11DeviceContext::Map(ID3D11Resource*, unsigned int,
D3D11_MAP, unsigned int, D3D11_MAPPED_SUBRESOURCE*) [with PtrTraits =
nonatomic_device_child_ptr_traits, HRESULT = int, ID3D11Resource =
ID3D11Resource, D3D11_MAP = D3D11_MAP, D3D11_MAPPED_SUBRESOURCE =
D3D11_MAPPED_SUBRESOURCE]’:
d3d11.cpp:231:1:   instantiated from here
d3d11_context.h:1484:12: warning: unused variable ‘face’ [-Wunused-variable]
d3d11_context.h: In member function ‘void
GalliumD3D11DeviceContext::CopySubresourceRegion(ID3D11Resource*,
unsigned int, unsigned int, unsigned int, unsigned int, ID3D11Resource*,
unsigned int, const D3D11_BOX*) [with PtrTraits =
nonatomic_device_child_ptr_traits, ID3D11Resource = ID3D11Resource, D3D11_BOX =
D3D11_BOX]’:
d3d11.cpp:231:1:   instantiated from here
d3d11_context.h:1545:12: warning: unused variable ‘dst_face’
[-Wunused-variable]
d3d11_context.h:1547:12: warning: unused variable ‘src_face’
[-Wunused-variable]
make[5]: *** [d3d11.o] Error 1
make[5]: Leaving directory
`/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers/d3d1x/gd3d11'
make[4]: *** [all] Error 2
make[4]: Leaving directory
`/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers/d3d1x'
make[3]: *** [subdirs] Error 1
make[3]: Leaving directory
`/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers'
make[2]: *** [default] Error 1
make[2]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src/gallium'
make[1]: *** [subdirs] Error 1
make[1]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src'
make: *** [default] Error 1
---
Building Mesa 7.11 produces a similar error.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


Power profiles low and mid are identical on Radeon HD6470M

2011-10-12 Thread Wolfgang Fritz
Am 11.10.2011 23:53, schrieb Alex Deucher:
> On Sat, Oct 8, 2011 at 2:25 PM, Wolfgang Fritz  
> wrote:
>> Hello,
>>
>> I have an HP Elitebook 8560p with Radeon HD6470M graphics, running Debian
>> sid with kernel 3.0.4.
>>
>> I noticed that the power profiles low and mid are setting identical clocks
>> and voltage, the lowest possible values:
>>
>> default engine clock: 75 kHz
>> current engine clock: 0 kHz
>> default memory clock: 90 kHz
>> current memory clock: 149970 kHz
>> voltage: 900 mV
>>
>> Looking at the code, this seems to be intentional at least for the mobility
>> chips, but the chip provides more modes:
>>
>> [9.361401] [drm] R600: Number of power states = 7
>> [9.361402] [drm] Is mobility = YES
>> [9.361403] [drm] ps #0 type 0, modes=3
>> [9.361404] [drm] 0: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361406] [drm] 1: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361407] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361409] [drm] ps #1 type 4, modes=3
>> [9.361410] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361411] [drm] 1: mclk=9, sclk=4, volt=1000, vddci=0
>> [9.361413] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361414] [drm] ps #2 type 0, modes=3
>> [9.361415] [drm] 0: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361417] [drm] 1: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361418] [drm] 2: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361419] [drm] ps #3 type 2, modes=3
>> [9.361420] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361422] [drm] 1: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361423] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361424] [drm] ps #4 type 2, modes=3
>> [9.361426] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361427] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361428] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361430] [drm] ps #5 type 2, modes=3
>> [9.361431] [drm] 0: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361433] [drm] 1: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361434] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361435] [drm] ps #6 type 0, modes=3
>> [9.361436] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361438] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361439] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361440] [drm] NOT CHIP_R600
>>
>> (dmesg output from patched radeon module)
>>
>> Questions:
>> 1. Is this a bug or a feature? (I see that it is not obvious which power
>> state to choose)
>
> It's the way it is.
>

:-)

>> 2. What do the 3 clock/voltage modes per power state mean?
>
> On r6xx+, each power state defines an operating state (e.g., single
> head battery, multi-head battery, single head performance, multi-head
> performance, etc.).  Within each operating state, there are
> high/mid/low clock modes that define that operating state.  So if
> you have one head active and are on battery, the driver should switch
> between the high/mid/low clock modes defined in that power state based
> on the GPU load.  If you enable multi-head and are still on battery,
> the driver would switch to the multi-head battery state and switch
> between the high/mid/low modes in that state.
>

OK. That's what I assumed after short code inspection.
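
The scheme described above (one power state per operating condition, each holding low/mid/high clock modes chosen by GPU load) can be sketched roughly as follows. This is only an illustrative model: the struct layout, the function names and the load thresholds are assumptions made for the example and are not taken from the radeon driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names come from the radeon driver. */
enum clock_level { CLOCK_LOW = 0, CLOCK_MID = 1, CLOCK_HIGH = 2 };

struct clock_mode {
    unsigned sclk;   /* engine clock, arbitrary units */
    unsigned mclk;   /* memory clock, arbitrary units */
    unsigned mvolt;  /* voltage in mV */
};

struct power_state {
    int on_battery;              /* 1 = battery, 0 = mains */
    int multi_head;              /* 1 = more than one active head */
    struct clock_mode modes[3];  /* indexed by enum clock_level */
};

/* Pick the operating state that matches power source and head count. */
const struct power_state *pick_power_state(const struct power_state *states,
                                           size_t count,
                                           int on_battery, int multi_head)
{
    assert(states != NULL);
    for (size_t i = 0; i < count; i++) {
        if (states[i].on_battery == on_battery &&
            states[i].multi_head == multi_head)
            return &states[i];
    }
    return NULL; /* no matching operating state */
}

/* Map GPU load (percent) to a clock level; the thresholds are assumed. */
enum clock_level pick_clock_level(unsigned load_percent)
{
    if (load_percent >= 70)
        return CLOCK_HIGH;
    if (load_percent >= 30)
        return CLOCK_MID;
    return CLOCK_LOW;
}
```

In this model a change of power source or head count selects a different power_state, while a change in load only moves between the three modes inside the current one.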

So, this is not cooperating well with the current dynamic clock 
interface in sysfs (at least as I understand it now).

I understand that there are the dynamic and the profile power methods.
In dynamic, I see the clocks switching, probably using the 3 power 
states in the second operation state in the list above (maximum 
performance). This results in an average power consumption similar to 
the catalyst driver (the fan is off most of the time). But it is not 
usable because the screen flickers when the clock state is changed, and 
this happens quite frequently. Also it seems to be independent of 
battery/mains mode.

In the profile power mode, the clocks are at full speed with clock 
profiles default, high and at lowest speed with profiles mid and low. 
The high profile keeps the fan running continuously. This seems to be 
independent of mains or battery mode (I have to double check this)

Low and mid profiles are unusably slow with 3D effects enabled, but work
quite well with effects disabled, so this would be a suitable profile on 
low battery.

With power profile auto, power state is high performance in mains mode 
and low in battery mode.

So, as long as true dynamic clocking is not working flicker free, it 
would be nice to be able to change the clock modes manually to a value 
that keeps the fan quiet but is sufficient for ordinary work with 
effects enabled. I am currently running at 400/650 MHz @ 900mV with a 
patched driver.

Finally some questions:

Q1: Are all the power modes safe 

[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)

2011-10-12 Thread alexdeuc...@gmail.com
From: Alex Deucher 

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c 
b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..eb3f6dc 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder 
*encoder)
break;
case 2:
args.v2.ucCRTC = radeon_crtc->crtc_id;
-   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
+   if (radeon_encoder_is_dp_bridge(encoder)) {
+   struct drm_connector *connector = 
radeon_get_connector_for_encoder(encoder);
+
+   if (connector->connector_type == 
DRM_MODE_CONNECTOR_LVDS)
+   args.v2.ucEncodeMode = 
ATOM_ENCODER_MODE_LVDS;
+   else if (connector->connector_type == 
DRM_MODE_CONNECTOR_VGA)
+   args.v2.ucEncodeMode = 
ATOM_ENCODER_MODE_CRT;
+   else
+   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
+   } else
+   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
switch (radeon_encoder->encoder_id) {
case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1
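
The selection logic this v2 patch adds can be read as a small pure function. The sketch below restates it outside the kernel; the enum names and the function name are hypothetical stand-ins for the DRM connector types and ATOM encoder modes, and the fallback value stands for whatever atombios_get_encoder_mode() would have returned.

```c
/* Hypothetical stand-ins for DRM_MODE_CONNECTOR_* and ATOM_ENCODER_MODE_*. */
enum connector_type { CONNECTOR_LVDS, CONNECTOR_VGA, CONNECTOR_OTHER };
enum encode_mode { MODE_LVDS, MODE_CRT, MODE_DP };

/*
 * For a DP bridge, report the physical panel/connector type;
 * otherwise keep the mode derived from the internal dig encoding.
 */
enum encode_mode crtc_source_encode_mode(int is_dp_bridge,
                                         enum connector_type connector,
                                         enum encode_mode internal_mode)
{
    if (is_dp_bridge) {
        switch (connector) {
        case CONNECTOR_LVDS:
            return MODE_LVDS;
        case CONNECTOR_VGA:
            return MODE_CRT;  /* the case corrected in v2 */
        default:
            break;
        }
    }
    return internal_mode;
}
```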



[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges

2011-10-12 Thread alexdeuc...@gmail.com
From: Alex Deucher 

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c 
b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..bfe1662 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder 
*encoder)
break;
case 2:
args.v2.ucCRTC = radeon_crtc->crtc_id;
-   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
+   if (radeon_encoder_is_dp_bridge(encoder)) {
+   struct drm_connector *connector = 
radeon_get_connector_for_encoder(encoder);
+
+   if (connector->connector_type == 
DRM_MODE_CONNECTOR_LVDS)
+   args.v2.ucEncodeMode = 
ATOM_ENCODER_MODE_LVDS;
+   else if (connector->connector_type == 
DRM_MODE_CONNECTOR_VGA)
+   args.v2.ucEncodeMode = 
ATOM_ENCODER_MODE_LVDS;
+   else
+   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
+   } else
+   args.v2.ucEncodeMode = 
atombios_get_encoder_mode(encoder);
switch (radeon_encoder->encoder_id) {
case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1



[PATCH 2/3] drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls

2011-10-12 Thread alexdeuc...@gmail.com
From: Alex Deucher 

It's handled via external clock.  It should already be protected
by the external ss flag, but add an explicit check just in case.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/radeon/atombios_crtc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c 
b/drivers/gpu/drm/radeon/atombios_crtc.c
index c742944..a515b2a 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -466,7 +466,7 @@ static void atombios_crtc_program_ss(struct drm_crtc *crtc,
return;
}
args.v2.ucEnable = enable;
-   if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK))
+   if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK) 
|| ASIC_IS_DCE41(rdev))
args.v2.ucEnable = ATOM_DISABLE;
} else if (ASIC_IS_DCE3(rdev)) {
args.v1.usSpreadSpectrumPercentage = 
cpu_to_le16(ss->percentage);
-- 
1.7.1.1



[PATCH 1/3] drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping

2011-10-12 Thread alexdeuc...@gmail.com
From: Alex Deucher 

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c 
b/drivers/gpu/drm/radeon/radeon_encoders.c
index 8a171b2..a90d9ee 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1756,10 +1756,15 @@ static int radeon_atom_pick_dig_encoder(struct 
drm_encoder *encoder)
if (ASIC_IS_DCE4(rdev)) {
dig = radeon_encoder->enc_priv;
if (ASIC_IS_DCE41(rdev)) {
-   if (dig->linkb)
-   return 1;
-   else
-   return 0;
+   /* ontario follows DCE4 */
+   if (rdev->family == CHIP_PALM) {
+   if (dig->linkb)
+   return 1;
+   else
+   return 0;
+   } else
+   /* llano follows DCE3.2 */
+   return radeon_crtc->crtc_id;
} else {
switch (radeon_encoder->encoder_id) {
case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
-- 
1.7.1.1



[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Daniel Vetter
On Wed, Oct 12, 2011 at 03:34:54PM +0100, Dave Airlie wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark  wrote:
> > On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie  wrote:
> >>> But then we'd need a different set of accessors for every different
> >>> drm/v4l/etc driver, wouldn't we?
> >>
> >> Not any more different than you need for this, you just have a new
> >> interface that you request a sw object from,
> >> then mmap that object, and underneath it knows who owns it in the kernel.
> >
> > oh, ok, so you are talking about a kernel level interface, rather than
> > userspace..
> >
> > but I guess in this case I don't quite see the difference.  It amounts
> > to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
> > directly then you don't have to pass around a 2nd fd.
> >
> > [*] there is nothing stopping defining some dmabuf ioctls (such as for
> > synchronization).. although the thinking was to keep it simple for
> > first version of dmabuf
> >
> 
> Yes a separate kernel level interface.
> 
> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
> shoehorning in a sw mapping API isn't making it simpler.
> 
> The problem I have with implementing mmap on the sharing fd, is that
> nothing says this should be purely optional and userspace shouldn't
> rely on it.
> 
> In the Intel GEM space alone you have two types of mapping, one direct
> > to shmem one via GTT, the GTT could even be a linear view. The
> intel guys initially did GEM mmaps direct to the shmem pages because
> it seemed simple, up until they
> had to do step two which was do mmaps on the GTT copy and ended up
> having two separate mmap methods. I think the problem here is it seems
> deceptively simple to add this to the API now because the API is
> simple, however I think in the future it'll become a burden that we'll
> > have to work around.

Yeah, that's my feeling, too. Adding mmap sounds like a neat, simple idea,
that could simplify things for simple devices like v4l. But as soon as
you're dealing with a real gpu, nothing is simple. Those who don't believe
this, just take a look at the data upload/download paths in the
open-source i915,nouveau,radeon drivers. Making this fast (and for gpus,
it needs to be fast) requires tons of tricks, special-cases and jumping
through hoops.

You absolutely want the device-specific ioctls to do that. Adding a
generic mmap just makes matters worse, especially if userspace expects
this to work synchronized with everything else that is going on.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


[PATCH 19/21] drm/i915: Asynchronous eDP panel power off

2011-10-12 Thread Dave Airlie
> Using the same basic plan as the VDD force delayed power off, make
> turning the panel power off asynchronous.

NAK, tested on my 2540p, up to this patch in macbook-air branch stuff
worked, after this I just get black screen on resume.

Dave.


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark  wrote:
> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie  wrote:
>>> But then we'd need a different set of accessors for every different
>>> drm/v4l/etc driver, wouldn't we?
>>
>> Not any more different than you need for this, you just have a new
>> interface that you request a sw object from,
>> then mmap that object, and underneath it knows who owns it in the kernel.
>
> oh, ok, so you are talking about a kernel level interface, rather than
> userspace..
>
> but I guess in this case I don't quite see the difference.  It amounts
> to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
> directly then you don't have to pass around a 2nd fd.
>
> [*] there is nothing stopping defining some dmabuf ioctls (such as for
> synchronization).. although the thinking was to keep it simple for
> first version of dmabuf
>

Yes a separate kernel level interface.

Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
shoehorning in a sw mapping API isn't making it simpler.

The problem I have with implementing mmap on the sharing fd, is that
nothing says this should be purely optional and userspace shouldn't
rely on it.

In the Intel GEM space alone you have two types of mapping, one direct
to shmem one via GTT, the GTT could even be a linear view. The
intel guys initially did GEM mmaps direct to the shmem pages because
it seemed simple, up until they
had to do step two which was do mmaps on the GTT copy and ended up
having two separate mmap methods. I think the problem here is it seems
deceptively simple to add this to the API now because the API is
simple, however I think in the future it'll become a burden that we'll
have to work around.

Dave.


[RFC 2/2] dma-buf: Documentation for buffer sharing framework

2011-10-12 Thread Randy Dunlap
On 10/11/2011 02:23 AM, Sumit Semwal wrote:
> Add documentation for dma buffer sharing framework, explaining the
> various operations, members and API of the dma buffer sharing
> framework.
> 
> Signed-off-by: Sumit Semwal 
> Signed-off-by: Sumit Semwal 
> ---
>  Documentation/dma-buf-sharing.txt |  210 
> +
>  1 files changed, 210 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/dma-buf-sharing.txt
> 
> diff --git a/Documentation/dma-buf-sharing.txt 
> b/Documentation/dma-buf-sharing.txt
> new file mode 100644
> index 000..4da6644
> --- /dev/null
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -0,0 +1,210 @@
> +DMA Buffer Sharing API Guide
> +
> +
> +Sumit Semwal
> +
> + 
> +
> +This document serves as a guide to device-driver writers on what is the 
> dma-buf
> +buffer sharing API, how to use it for exporting and using shared buffers.
> +
> +Any device driver which wishes to be a part of dma buffer sharing, can do so 
> as

Please use DMA instead of dma (except combinations like dma-buf are OK).  
[multiple]

> +either the 'exporter' of buffers, or the 'user' of buffers.
> +
> +Say a driver A wants to use buffers created by driver B, then we call B as 
> the
> +exporter, and B as buffer-user.

 and A

> +
> +The exporter
> +- implements and manages operations[1] for the buffer
> +- allows other users to share the buffer by using dma_buf sharing APIs,
> +- manages the details of buffer allocation,
> +- decides about the actual backing storage where this allocation happens,
> +- takes care of any migration of scatterlist - for all (shared) users of this
> +   buffer,
> +- optionally, provides mmap capability for drivers that need it.
> +
> +The buffer-user
> +- is one of (many) sharing users of the buffer.
> +- doesn't need to worry about how the buffer is allocated, or where.
> +- needs a mechanism to get access to the scatterlist that makes up this 
> buffer
> +   in memory, mapped into its own address space, so it can access the same 
> area
> +   of memory.
> +
> +
> +The dma_buf buffer sharing API usage contains the following steps:
> +
> +1. Exporter announces that it wishes to export a buffer
> +2. Userspace gets the file descriptor associated with the exported buffer, 
> and
> +   passes it around to potential buffer-users based on use case
> +3. Each buffer-user 'connects' itself to the buffer
> +4. When needed, buffer-user requests access to the buffer from exporter
> +5. When finished with its use, the buffer-user notifies end-of-dma to 
> exporter
> +6. when buffer-user is done using this buffer completely, it 'disconnects'
> +   itself from the buffer.
> +
> +
> +1. Exporter's announcement of buffer export
> +
> +   The buffer exporter announces its wish to export a buffer. In this, it
> +   connects its own private buffer data, provides implementation for 
> operations
> +   that can be performed on the exported dma_buf, and flags for the file
> +   associated with this buffer.
> +
> +   Interface:
> +  struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
> +int flags)
> +
> +   If this succeeds, dma_buf_export allocates a dma_buf structure, and 
> returns a
> +   pointer to the same. It also associates an anon file with this buffer, so 
> it

s/anon/anonymous/ (multiple)

> +   can be exported. On failure to allocate the dma_buf object, it returns 
> NULL.
> +
> +2. Userspace gets a handle to pass around to potential buffer-users
> +
> +   Userspace entity requests for a file-descriptor (fd) which is a handle to 
> the
> +   anon file associated with the buffer. It can then share the fd with other
> +   drivers and/or processes.
> +
> +   Interface:
> +  int dma_buf_fd(struct dma_buf *dmabuf)
> +
> +   This API installs an fd for the anon file associated with this buffer;
> +   returns either 'fd', or error.
> +
> +3. Each buffer-user 'connects' itself to the buffer
> +
> +   Each buffer-user now gets a reference to the buffer, using the fd passed 
> to
> +   it.
> +
> +   Interface:
> +  struct dma_buf *dma_buf_get(int fd)
> +
> +   This API will return a reference to the dma_buf, and increment refcount 
> for
> +   it.
> +
> +   After this, the buffer-user needs to attach its device with the buffer, 
> which
> +   helps the exporter to know of device buffer constraints.
> +
> +   Interface:
> +  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> +struct device *dev)
> +
> +   This API returns reference to an attachment structure, which is then used
> +   for scatterlist operations. It will optionally call the 'attach' dma_buf
> +   operation, if provided by the exporter.
> +
> +   The dma-buf sharing framework does the book-keeping bits related to 
> keeping


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
> But then we'd need a different set of accessors for every different
> drm/v4l/etc driver, wouldn't we?

Not any more different than you need for this, you just have a new
interface that you request a sw object from,
then mmap that object, and underneath it knows who owns it in the kernel.

mmap just feels wrong in this API, which is a buffer sharing API not a
buffer mapping API.

> I guess if sharing a buffer between multiple drm devices, there is
> nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
> pass when the buffer is allocated, then you don't have to support
> dmabuf->mmap(), and instead mmap via device and use some sort of
> DRM_CPU_PREP/FINI ioctls for synchronization..

Or we could make a generic CPU accessor that we don't have to worry about.

Dave.


[PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI

2011-10-12 Thread Dave Airlie
On Wed, Oct 12, 2011 at 2:25 PM, Ilija Hadzic
 wrote:
>
> Hi Dave,
>
> A few weeks ago I sent the two patches that allow PCI Express interface to
> run at Gen 2 speed on NI parts. Links to the patches in the mailing list
> archive + review from Alex quoted below:
>
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html
>
> I saw some activity on drm-next and drm-core-next branches, but I have not
> seen these two patches merge yet. Just wondering if they are in the queue
> for merging or if they may have fallen through the cracks?

/me misses patchwork a lot.

I've picked them up now.

Thanks,
Dave.


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
>
> well, the mmap is actually implemented by the buffer allocator
> (v4l/drm).. although not sure if this was the point

Then why not use the correct interface? doing some sort of not-quite
generic interface isn't really helping anyone except adding an ABI
that we have to support.

If someone wants to bypass the current kernel APIs we should add a new
API for them not shove it into this generic buffer sharing layer.

> The intent was that this is for well defined formats.. ie. it would
> need to be a format that both v4l and drm understood in the first
> place for sharing to make sense at all..

How will you know the stride, to take a simple example? Userspace
had to create this buffer somehow and wants to share it with
"something"; it sounds like
you really need another API that is a simple accessor API that can
handle mmaps.

> Anyways, the basic reason is to handle random edge cases where you
> need sw access to the buffer.  For example, you are decoding video and
> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

Again, doesn't sound like it should be part of this API, and also
sounds like the sw jpeg encoder will need more info about the buffer
anyways like stride and format.

> With this current scheme, synchronization could be handled in
> dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
> require mmap/munmap for each sw access, but I suppose this isn't
> really for the high-performance use case.  It is just so that some
> random bit of sw that gets passed a dmabuf handle without knowing who
> allocated it can have sw access if really needed.

So I think that's fine; write a sw accessor provider, don't go
overloading the buffer sharing code.

This API will limit what people can use this buffer sharing for with
pure hw accessors. You might say it's okay to fail the mmap
then, but I'm not so sure sw will actually handle that.

Dave.


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal  wrote:
> This is the first step in defining a dma buffer sharing mechanism.
>
> A new buffer object dma_buf is added, with operations and API to allow easy
> sharing of this buffer object across devices.
>
> The framework allows:
> - a new buffer-object to be created with fixed size.
> - different devices to 'attach' themselves to this buffer, to facilitate
>   backing storage negotiation, using dma_buf_attach() API.
> - association of a file pointer with each user-buffer and associated
>   allocator-defined operations on that buffer. This operation is called the
>   'export' operation.
> - this exported buffer-object to be shared with the other entity by asking for
>   its 'file-descriptor (fd)', and sharing the fd across.
> - a received fd to get the buffer object back, where it can be accessed using
>   the associated exporter-defined operations.
> - the exporter and user to share the scatterlist using get_scatterlist and
>   put_scatterlist operations.
>
> At least one 'attach()' call is required to be made prior to calling the
> get_scatterlist() operation.
>
> Couple of building blocks in get_scatterlist() are added to ease introduction
> of sync'ing across exporter and users, and late allocation by the exporter.
>
> mmap() file operation is provided for the associated 'fd', as wrapper over the
> optional allocator defined mmap(), to be used by devices that might need one.

Why is this needed? it really doesn't make sense to be mmaping objects
independent of some front-end like drm or v4l.

how will you know what contents are in them, how will you synchronise
access. Unless someone has a hard use-case for this I'd say we drop it
until someone does.

Dave.


[PATCH 19/21] drm/i915: Asynchronous eDP panel power off

2011-10-12 Thread Keith Packard
On Wed, 12 Oct 2011 15:41:11 +0100, Dave Airlie  wrote:

> > Using the same basic plan as the VDD force delayed power off, make
> > turning the panel power off asynchronous.
> 
> NAK, tested on my 2540p, up to this patch in macbook-air branch stuff
> worked, after this I just get black screen on resume.

Thanks for testing. I've created a new edp-training-fixes branch that
removes the async panel power off and leaves the rest of the branch.

I'll test on a 2540p that I've got access to today, and on the MBA when
I get home this evening.

-- 
keith.packard at intel.com


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 9:34 AM, Dave Airlie  wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark  wrote:
>> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie  wrote:
>>>> But then we'd need a different set of accessors for every different
>>>> drm/v4l/etc driver, wouldn't we?
>>>
>>> Not any more different than you need for this, you just have a new
>>> interface that you request a sw object from,
>>> then mmap that object, and underneath it knows who owns it in the kernel.
>>
>> oh, ok, so you are talking about a kernel level interface, rather than
>> userspace..
>>
>> but I guess in this case I don't quite see the difference.  It amounts
>> to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
>> directly then you don't have to pass around a 2nd fd.
>>
>> [*] there is nothing stopping defining some dmabuf ioctls (such as for
>> synchronization).. although the thinking was to keep it simple for
>> first version of dmabuf
>>
>
> Yes a separate kernel level interface.

I'm not against it, but if it is a device-independent interface, it
just seems like six of one, half-dozen of the other..

Ie. how does it differ if the dmabuf fd is the fd used for ioctl/mmap,
vs if some other /dev/buffer-sharer file that you open?

But I think maybe I'm misunderstanding what you have in mind?

BR,
-R

> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
> shoehorning in a sw mapping API isn't making it simpler.
>
> The problem I have with implementing mmap on the sharing fd, is that
> nothing says this should be purely optional and userspace shouldn't
> rely on it.
>
> In the Intel GEM space alone you have two types of mapping, one direct
> to shmem one via GTT, the GTT could even be a linear view. The
> intel guys initially did GEM mmaps direct to the shmem pages because
> it seemed simple, up until they
> had to do step two which was do mmaps on the GTT copy and ended up
> having two separate mmap methods. I think the problem here is it seems
> deceptively simple to add this to the API now because the API is
> simple, however I think in the future it'll become a burden that we'll
> have to work around.
>
> Dave.
>


[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Jeremy Huddleston  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #20 from Jeremy Huddleston  2011-10-12 
10:00:48 PDT ---
Thanks.  Closing based on the above comment.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie  wrote:
>> But then we'd need a different set of accessors for every different
>> drm/v4l/etc driver, wouldn't we?
>
> Not any more different than you need for this, you just have a new
> interface that you request a sw object from,
> then mmap that object, and underneath it knows who owns it in the kernel.

oh, ok, so you are talking about a kernel level interface, rather than
userspace..

but I guess in this case I don't quite see the difference.  It amounts
to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
directly then you don't have to pass around a 2nd fd.

[*] there is nothing stopping defining some dmabuf ioctls (such as for
synchronization).. although the thinking was to keep it simple for
first version of dmabuf

BR,
-R

> mmap just feels wrong in this API, which is a buffer sharing API not a
> buffer mapping API.
>
>> I guess if sharing a buffer between multiple drm devices, there is
>> nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
>> pass when the buffer is allocated, then you don't have to support
>> dmabuf->mmap(), and instead mmap via device and use some sort of
>> DRM_CPU_PREP/FINI ioctls for synchronization..
>
> Or we could make a generic CPU accessor that we don't have to worry about.
>
> Dave.
>


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 8:35 AM, Dave Airlie  wrote:
>>
>> well, the mmap is actually implemented by the buffer allocator
>> (v4l/drm).. although not sure if this was the point
>
> Then why not use the correct interface? doing some sort of not-quite
> generic interface isn't really helping anyone except adding an ABI
> that we have to support.

But what if you don't know who allocated the buffer?  How do you know
what interface to use to mmap?

> If someone wants to bypass the current kernel APIs we should add a new
> API for them not shove it into this generic buffer sharing layer.
>
>> The intent was that this is for well defined formats.. ie. it would
>> need to be a format that both v4l and drm understood in the first
>> place for sharing to make sense at all..
>
> How will you know the stride, to take a simple example? The userspace
> had to create this buffer somehow and wants to share it with
> "something"; it sounds like you really need another API that is a
> simple accessor API that can handle mmaps.

Well, things like stride, width, height, color format, userspace needs
to know all this already, even for malloc()'d sw buffers.  The
assumption is userspace already has a way to pass this information
around so it was not required to be duplicated by dmabuf.
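
A rough sketch of what "userspace already passes this around" could look like: a small descriptor travelling next to the buffer fd, plus the offset arithmetic a sw consumer would need. The struct, its field names, and pixel_offset() are hypothetical and not part of the dmabuf RFC; a linear (untiled) layout is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor sent alongside the buffer fd.
 * None of these names come from the dmabuf RFC. */
struct frame_desc {
    int      fd;      /* buffer handle received from the exporter */
    uint32_t width;   /* pixels per row */
    uint32_t height;  /* number of rows */
    uint32_t stride;  /* bytes between the starts of consecutive rows */
    uint32_t bpp;     /* bytes per pixel; linear layout assumed */
};

/* Byte offset of pixel (x, y), or (size_t)-1 if it is out of range. */
static size_t pixel_offset(const struct frame_desc *d, uint32_t x, uint32_t y)
{
    assert(d != NULL);
    if (x >= d->width || y >= d->height)
        return (size_t)-1;
    return (size_t)y * d->stride + (size_t)x * d->bpp;
}
```

Note that the stride can be larger than width * bpp when rows are padded, which is why it has to be communicated explicitly rather than derived.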

>> Anyways, the basic reason is to handle random edge cases where you
>> need sw access to the buffer.  For example, you are decoding video and
>> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..
>
> Again, doesn't sound like it should be part of this API, and also
> sounds like the sw jpeg encoder will need more info about the buffer
> anyways like stride and format.
>
>> With this current scheme, synchronization could be handled in
>> dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
>> require mmap/munmap for each sw access, but I suppose this isn't
>> really for the high-performance use case.  It is just so that some
>> random bit of sw that gets passed a dmabuf handle without knowing who
>> allocated it can have sw access if really needed.
>
> So I think that's fine: write a sw accessor provider, don't go
> overloading the buffer sharing code.

But then we'd need a different set of accessors for every different
drm/v4l/etc driver, wouldn't we?

> This API will limit what people can use this buffer sharing for with
> pure hw accessors. You might say, oh but it's okay to fail the mmap
> then, but the chances of sw handling that, I'm not so sure of.

I'm not entirely sure which case you are worried about.. sharing buffers
between multiple GPUs that understand the same tiled formats?  I guess
that is a bit different from a case like a jpeg encoder that is passed
a dmabuf handle without any idea where it came from..

I guess if sharing a buffer between multiple drm devices, there is
nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
pass when the buffer is allocated, then you don't have to support
dmabuf->mmap(), and instead mmap via device and use some sort of
DRM_CPU_PREP/FINI ioctls for synchronization..
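
To make the "optional mmap" idea concrete, here is a minimal userspace sketch of a generic wrapper over an exporter-supplied hook that refuses the mapping when the buffer was flagged as not mappable. Everything here (struct shared_buf, BUF_FLAG_NOT_MMAPABLE, shared_buf_mmap(), the choice of -ENODEV) is a hypothetical stand-in for illustration, not the interface proposed in the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag, mirroring the NOT_DMABUF_MMAPABLE idea above. */
#define BUF_FLAG_NOT_MMAPABLE 0x1u

struct shared_buf;

struct shared_buf_ops {
    /* Optional hook: exporters that cannot offer CPU access leave it NULL. */
    int (*mmap)(struct shared_buf *buf, void **out);
};

struct shared_buf {
    unsigned int flags;
    const struct shared_buf_ops *ops;
    void *backing;
};

/* Generic wrapper: fail cleanly instead of guessing at a mapping. */
static int shared_buf_mmap(struct shared_buf *buf, void **out)
{
    assert(buf != NULL && out != NULL);
    if ((buf->flags & BUF_FLAG_NOT_MMAPABLE) || !buf->ops || !buf->ops->mmap)
        return -ENODEV;
    return buf->ops->mmap(buf, out);
}

/* Example exporter hook for a plain linear buffer. */
static int linear_mmap(struct shared_buf *buf, void **out)
{
    *out = buf->backing;
    return 0;
}
```

The concern raised earlier in the thread still applies to a design like this: callers must actually check the return value and have a fallback for buffers that refuse to be mapped.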

BR,
-R

> Dave.
>


[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 7:41 AM, Dave Airlie  wrote:
> On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal  wrote:
>> This is the first step in defining a dma buffer sharing mechanism.
>>
>> A new buffer object dma_buf is added, with operations and API to allow easy
>> sharing of this buffer object across devices.
>>
>> The framework allows:
>> - a new buffer-object to be created with fixed size.
>> - different devices to 'attach' themselves to this buffer, to facilitate
>>   backing storage negotiation, using dma_buf_attach() API.
>> - association of a file pointer with each user-buffer and associated
>>   allocator-defined operations on that buffer. This operation is called the
>>   'export' operation.
>> - this exported buffer-object to be shared with the other entity by asking
>>   for its 'file-descriptor (fd)', and sharing the fd across.
>> - a received fd to get the buffer object back, where it can be accessed using
>>   the associated exporter-defined operations.
>> - the exporter and user to share the scatterlist using get_scatterlist and
>>   put_scatterlist operations.
>>
>> At least one 'attach()' call is required to be made prior to calling the
>> get_scatterlist() operation.
>>
>> Couple of building blocks in get_scatterlist() are added to ease introduction
>> of sync'ing across exporter and users, and late allocation by the exporter.
>>
>> mmap() file operation is provided for the associated 'fd', as wrapper over
>> the optional allocator defined mmap(), to be used by devices that might
>> need one.
>
> Why is this needed? it really doesn't make sense to be mmaping objects
> independent of some front-end like drm or v4l.

well, the mmap is actually implemented by the buffer allocator
(v4l/drm).. although not sure if this was the point

> how will you know what contents are in them, how will you synchronise
> access. Unless someone has a hard use-case for this I'd say we drop it
> until someone does.

The intent was that this is for well defined formats.. ie. it would
need to be a format that both v4l and drm understood in the first
place for sharing to make sense at all..

Anyways, the basic reason is to handle random edge cases where you
need sw access to the buffer.  For example, you are decoding video and
pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

On gstreamer 0.11 branch, for example, there is already a map/unmap
virtual method on the gst buffer for sw access (ie. same purpose as
PrepareAccess/FinishAccess in EXA).  The idea w/ dmabuf mmap() support
is that we could implement support to mmap()/munmap() before/after sw
access.

With this current scheme, synchronization could be handled in
dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
require mmap/munmap for each sw access, but I suppose this isn't
really for the high-performance use case.  It is just so that some
random bit of sw that gets passed a dmabuf handle without knowing who
allocated it can have sw access if really needed.
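
As an illustration of the mmap()/munmap() bracket described above, a minimal userspace sketch follows. It only shows the shape of the calls: sw_access_begin(), sw_access_end() and make_standin_fd() are hypothetical names, and an unlinked temporary file stands in for a real dmabuf fd, so no device synchronization actually happens here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Stand-in for a received buffer fd: an unlinked temp file of len bytes. */
static int make_standin_fd(size_t len)
{
    FILE *f = tmpfile();
    if (!f)
        return -1;
    int fd = dup(fileno(f));
    fclose(f);
    if (fd >= 0 && ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the buffer for CPU access. In the scheme discussed above, the
 * exporter's mmap hook is where pending device access would be waited on. */
static void *sw_access_begin(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Drop the mapping; with a real exporter, vm_ops->close() would observe
 * this and could hand the buffer back to the device. */
static int sw_access_end(void *p, size_t len)
{
    assert(p != NULL);
    return munmap(p, len);
}
```

A consumer such as a sw thumbnail encoder would call sw_access_begin(), read the pixels, and call sw_access_end() before the buffer is handed back, paying one map/unmap per access as noted above.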

BR,
-R

> Dave.


[patch] drm/nva3: checking the wrong variable

2011-10-12 Thread Ben Skeggs
On Tue, 2011-10-11 at 17:34 +0300, Dan Carpenter wrote:
> "id" is unsigned here and it's never less than zero.  I believe the
> intent was to check the return value from nva3_pm_pll_offset().
> Also I've changed it to pass on the -ENOENT error code from the lower
> levels instead of returning -EINVAL.
The patch looks correct.  It's worth noting though that a complete
rewrite of that particular code is queued for 3.2 already.

Ben.
> 
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/drivers/gpu/drm/nouveau/nva3_pm.c b/drivers/gpu/drm/nouveau/nva3_pm.c
> index e4b2b9e..0be517d 100644
> --- a/drivers/gpu/drm/nouveau/nva3_pm.c
> +++ b/drivers/gpu/drm/nouveau/nva3_pm.c
> @@ -112,8 +112,8 @@ nva3_pm_clock_pre(struct drm_device *dev, struct nouveau_pm_level *perflvl,
>  		return (ret == -ENOENT) ? NULL : ERR_PTR(ret);
>  
>  	off = nva3_pm_pll_offset(id);
> -	if (id < 0)
> -		return ERR_PTR(-EINVAL);
> +	if (off < 0)
> +		return ERR_PTR(off);
>  
> 
>  	pll = kzalloc(sizeof(*pll), GFP_KERNEL);




[PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI

2011-10-12 Thread Ilija Hadzic

Hi Dave,

A few weeks ago I sent the two patches that allow PCI Express interface to 
run at Gen 2 speed on NI parts. Links to the patches in the mailing list 
archive + review from Alex quoted below:

http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html

I saw some activity on drm-next and drm-core-next branches, but I have 
not seen these two patches merged yet. Just wondering if they are in the 
queue for merging or if they may have fallen through the cracks?

thanks,

Ilija

On Tue, 20 Sep 2011, Alex Deucher wrote:

> On Tue, Sep 20, 2011 at 10:22 AM, Ilija Hadzic
>  wrote:
>> Enabling pcie gen2 speed was skipped for Northern Islands
>> ASICs, although it looks like it works just fine with the same
>> initialization sequence used for evergreen.
>>
>> According to Alex D. gen2 init was skipped to prevent a crash
>> that has been caused by some other bug that has been
>> fixed in the meantime; so now it should be safe to enable it.
>>
>> Signed-off-by: Ilija Hadzic 
>
> I just double checked and BTC and cayman use the same programming
> method.  Both patches:
>
> Reviewed-by: Alex Deucher 
>
> Thanks!
>
> Alex
>
>
>> ---
>>  drivers/gpu/drm/radeon/evergreen.c |    3 +--
>>  1 files changed, 1 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
>> index f09bace..208b59c 100644
>> --- a/drivers/gpu/drm/radeon/evergreen.c
>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>> @@ -2987,8 +2987,7 @@ static int evergreen_startup(struct radeon_device *rdev)
>>        int r;
>>
>>        /* enable pcie gen2 link */
>> -       if (!ASIC_IS_DCE5(rdev))
>> -               evergreen_pcie_gen2_enable(rdev);
>> +       evergreen_pcie_gen2_enable(rdev);
>>
>>        if (ASIC_IS_DCE5(rdev)) {
>>                if (!rdev->me_fw || !rdev->pfp_fw || !rdev->rlc_fw || !rdev->mc_fw) {
>> --
>> 1.7.6
>>
>> ___
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>


[Bug 41698] [r300g] Flickering user interface in WoW

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=41698

--- Comment #1 from Chris Rankin 2011-10-12 03:35:09 PDT ---
Reverting this single patch in git (with the exception of the header file that
no longer exists, of course) has fixed the flickering problem. So far, anyway.



[Bug 41668] Screen locks up at random points when using a 3D compositing wm (gnome-shell) on an rv515 (radeon mobility x1300)

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=41668

--- Comment #11 from dmotd 2011-10-12 03:24:07 PDT ---
(In reply to comment #10)
> (In reply to comment #6)
> > > > running glxgears just shows an empty black box.. all other glx demos
> > > > are the same empty boxes..
> > > 
> > > Do they work with the environment variable vblank_mode=0? If yes, does the
> > > number for radeon increase in /proc/interrupts once the problem occurs?
> > 
> > setting vblank_mode=0 works and displays an output.. but not much change in
> > /proc/interrupts (irq 46 for radeon)
> 
> Not much change for the radeon number, or none at all? If the latter,
> apparently the IRQ for the radeon card stops working for some reason, which
> would explain the core symptoms of the freeze.

no change to the radeon irq number.
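
For anyone repeating this check, a small helper can make "no change" unambiguous by summing the per-CPU counters on the radeon line of /proc/interrupts and comparing two samples. This is only a sketch: irq_count() is a hypothetical helper, and it assumes the usual line layout of an IRQ label, one counter per CPU, then a controller name that does not start with a digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters of one /proc/interrupts line if the line
 * mentions `name`; return -1 if it belongs to another device. */
static long long irq_count(const char *line, const char *name)
{
    assert(line != NULL && name != NULL);
    if (!strstr(line, name))
        return -1;
    const char *p = strchr(line, ':');   /* skip the "46:" IRQ label */
    if (!p)
        return -1;
    p++;
    long long total = 0;
    for (;;) {
        char *end;
        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;                        /* reached the controller name */
        total += strtoll(p, &end, 10);
        p = end;
    }
    return total;
}
```

Reading the file twice a few seconds apart and feeding each line to irq_count(line, "radeon") gives two totals; identical totals while the display is supposed to be updating would match the "IRQ stops working" hypothesis above.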



[Bug 23103] screen not lighting up on resume when using kms

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=23103

Michel Dänzer changed:

           What    |Removed                        |Added

            Product|xorg                           |DRI
            Version|7.4                            |unspecified
          Component|Driver/Radeon                  |DRM/Radeon
         AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
          QAContact|xorg-team at lists.x.org       |



[Bug 24097] screen backlight off after resume-from-suspend when using ATI KMS

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=24097

Michel Dänzer changed:

           What    |Removed                        |Added

            Product|xorg                           |DRI
            Version|7.4                            |unspecified
  Status Whiteboard|2011BRB_Reviewed               |
          Component|Driver/Radeon                  |DRM/Radeon
         AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
          QAContact|xorg-team at lists.x.org       |



[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Michel Dänzer changed:

           What    |Removed                        |Added

            Product|xorg                           |DRI
            Version|7.6                            |unspecified
  Status Whiteboard|2011BRB_Reviewed               |
          Component|Driver/Radeon                  |DRM/Radeon
         AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
          QAContact|xorg-team at lists.x.org       |



[Bug 38694] Server freezes with latest commit on 22/06/2011

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=38694

Michel Dänzer changed:

           What    |Removed                        |Added

            Product|xorg                           |DRI
            Version|git                            |unspecified
          Component|Driver/Radeon                  |DRM/Radeon
         AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
          QAContact|xorg-team at lists.x.org       |



[Bug 41668] Screen locks up at random points when using a 3D compositing wm (gnome-shell) on an rv515 (radeon mobility x1300)

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=41668

--- Comment #10 from Michel Dänzer 2011-10-12 03:07:04 PDT ---
(In reply to comment #6)
> > > running glxgears just shows an empty black box.. all other glx demos are
> > > the same empty boxes..
> > 
> > Do they work with the environment variable vblank_mode=0? If yes, does the
> > number for radeon increase in /proc/interrupts once the problem occurs?
> 
> setting vblank_mode=0 works and displays an output.. but not much change in
> /proc/interrupts (irq 46 for radeon)

Not much change for the radeon number, or none at all? If the latter,
apparently the IRQ for the radeon card stops working for some reason, which
would explain the core symptoms of the freeze.



[Bug 41579] R300 Segfaults when using mupen64plus

2011-10-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=41579

--- Comment #3 from Michel Dänzer 2011-10-12 02:59:21 PDT ---
(In reply to comment #2)
> Just tested current Git, and it seems to be working.

Great. If it's still broken with the current 7.11 branch, and you can isolate
the change that fixed it, maybe we can backport it.



Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal sumit.sem...@ti.com wrote:
> This is the first step in defining a dma buffer sharing mechanism.
>
> A new buffer object dma_buf is added, with operations and API to allow easy
> sharing of this buffer object across devices.
>
> The framework allows:
> - a new buffer-object to be created with fixed size.
> - different devices to 'attach' themselves to this buffer, to facilitate
>   backing storage negotiation, using dma_buf_attach() API.
> - association of a file pointer with each user-buffer and associated
>   allocator-defined operations on that buffer. This operation is called the
>   'export' operation.
> - this exported buffer-object to be shared with the other entity by asking for
>   its 'file-descriptor (fd)', and sharing the fd across.
> - a received fd to get the buffer object back, where it can be accessed using
>   the associated exporter-defined operations.
> - the exporter and user to share the scatterlist using get_scatterlist and
>   put_scatterlist operations.
>
> At least one 'attach()' call is required to be made prior to calling the
> get_scatterlist() operation.
>
> Couple of building blocks in get_scatterlist() are added to ease introduction
> of sync'ing across exporter and users, and late allocation by the exporter.
>
> mmap() file operation is provided for the associated 'fd', as wrapper over the
> optional allocator defined mmap(), to be used by devices that might need one.

Why is this needed? it really doesn't make sense to be mmaping objects
independent of some front-end like drm or v4l.

how will you know what contents are in them, how will you synchronise
access. Unless someone has a hard use-case for this I'd say we drop it
until someone does.

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie

> well, the mmap is actually implemented by the buffer allocator
> (v4l/drm).. although not sure if this was the point

Then why not use the correct interface? doing some sort of not-quite
generic interface isn't really helping anyone except adding an ABI
that we have to support.

If someone wants to bypass the current kernel APIs we should add a new
API for them not shove it into this generic buffer sharing layer.

> The intent was that this is for well defined formats.. ie. it would
> need to be a format that both v4l and drm understood in the first
> place for sharing to make sense at all..

How will you know the stride, to take a simple example? The userspace
had to create this buffer somehow and wants to share it with
"something"; it sounds like you really need another API that is a
simple accessor API that can handle mmaps.

> Anyways, the basic reason is to handle random edge cases where you
> need sw access to the buffer.  For example, you are decoding video and
> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

Again, doesn't sound like it should be part of this API, and also
sounds like the sw jpeg encoder will need more info about the buffer
anyways like stride and format.

> With this current scheme, synchronization could be handled in
> dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
> require mmap/munmap for each sw access, but I suppose this isn't
> really for the high-performance use case.  It is just so that some
> random bit of sw that gets passed a dmabuf handle without knowing who
> allocated it can have sw access if really needed.

So I think that's fine: write a sw accessor provider, don't go
overloading the buffer sharing code.

This API will limit what people can use this buffer sharing for with
pure hw accessors. You might say, oh but it's okay to fail the mmap
then, but the chances of sw handling that, I'm not so sure of.

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI

2011-10-12 Thread Dave Airlie
On Wed, Oct 12, 2011 at 2:25 PM, Ilija Hadzic
ihad...@research.bell-labs.com wrote:

 Hi Dave,

 A few weeks ago I sent the two patches that allow PCI Express interface to
 run at Gen 2 speed on NI parts. Links to the patches in the mailing list
 archive + review from Alex quoted below:

 http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
 http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html

 I saw some activity on drm-next and drm-core-next branches, but I have not
 seen these two patches merge yet. Just wondering if they are in the queue
 for merging or if they may have fallen through the cracks?

/me misses patchwork a lot.

I've picked them up now.

Thanks,
Dave.


Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 8:35 AM, Dave Airlie airl...@gmail.com wrote:

 well, the mmap is actually implemented by the buffer allocator
 (v4l/drm).. although not sure if this was the point

 Then why not use the correct interface? doing some sort of not-quite
 generic interface isn't really helping anyone except adding an ABI
 that we have to support.

But what if you don't know who allocated the buffer?  How do you know
what interface to use to mmap?

 If someone wants to bypass the current kernel APIs we should add a new
 API for them not shove it into this generic buffer sharing layer.

 The intent was that this is for well defined formats.. ie. it would
 need to be a format that both v4l and drm understood in the first
 place for sharing to make sense at all..

 How will you know the stride to take a simple example? The userspace
 had to create this buffer somehow and wants to share it with
 something, you sound like
 you really needs another API that is a simple accessor API that can
 handle mmaps.

Well, things like stride, width, height, color format, userspace needs
to know all this already, even for malloc()'d sw buffers.  The
assumption is userspace already has a way to pass this information
around so it was not required to be duplicated by dmabuf.
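To illustrate the point about out-of-band metadata, here is a minimal sketch of what a software consumer needs besides the mapping itself. The function name and parameters are hypothetical and are not part of the proposed dmabuf API; it assumes a linear (untiled) layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of pixel (x, y) in a linear buffer.
 * stride: bytes per row (may be larger than width * cpp because of padding)
 * cpp:    bytes per pixel
 * Both values have to reach the consumer by some channel other than the
 * buffer handle itself (an assumption of this sketch).
 */
size_t pixel_offset(uint32_t stride, uint32_t cpp, uint32_t x, uint32_t y)
{
	assert(cpp > 0 && stride >= cpp);
	assert((size_t)x * cpp + cpp <= stride);
	return (size_t)y * stride + (size_t)x * cpp;
}
```

Without the stride, the consumer could only guess `width * cpp`, which is wrong as soon as the allocator pads rows.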

 Anyways, the basic reason is to handle random edge cases where you
 need sw access to the buffer.  For example, you are decoding video and
 pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

 Again, doesn't sound like it should be part of this API, and also
 sounds like the sw jpeg encoder will need more info about the buffer
 anyways like stride and format.

 With this current scheme, synchronization could be handled in
 dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
 require mmap/munmap for each sw access, but I suppose this isn't
 really for the high-performance use case.  It is just so that some
 random bit of sw that gets passed a dmabuf handle without knowing who
 allocated it can have sw access if really needed.

 So I think thats fine, write a sw accessor providers, don't go
 overloading the buffer sharing code.

But then we'd need a different set of accessors for every different
drm/v4l/etc driver, wouldn't we?

 This API will limit what people can use this buffer sharing for with
 pure hw accessors, you might say, oh buts its okay to fail the mmap
 then, but the chances of sw handling that I'm not so sure off.

I'm not entirely sure which case you are worried about.. sharing buffers
between multiple GPUs that understand the same tiled formats?  I guess
that is a bit different from a case like a jpeg encoder that is passed
a dmabuf handle without any idea where it came from..

I guess if sharing a buffer between multiple drm devices, there is
nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
pass when the buffer is allocated, then you don't have to support
dmabuf->mmap(), and instead mmap via device and use some sort of
DRM_CPU_PREP/FINI ioctls for synchronization..

BR,
-R

 Dave.



Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
 But then we'd need a different set of accessors for every different
 drm/v4l/etc driver, wouldn't we?

No more than you need for this: you just have a new interface that you
request a sw object from, then mmap that object, and underneath it
knows who owns it in the kernel.

mmap just feels wrong in this API, which is a buffer sharing API not a
buffer mapping API.

 I guess if sharing a buffer between multiple drm devices, there is
 nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
 pass when the buffer is allocated, then you don't have to support
 dmabuf->mmap(), and instead mmap via device and use some sort of
 DRM_CPU_PREP/FINI ioctls for synchronization..

Or we could make a generic CPU accessor that we don't have to worry about.

Dave.


Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
 But then we'd need a different set of accessors for every different
 drm/v4l/etc driver, wouldn't we?

 Not any more different than you need for this, you just have a new
 interface that you request a sw object from,
 then mmap that object, and underneath it knows who owns it in the kernel.

oh, ok, so you are talking about a kernel level interface, rather than
userspace..

but I guess in this case I don't quite see the difference.  It amounts
to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
directly then you don't have to pass around a 2nd fd.

[*] there is nothing stopping defining some dmabuf ioctls (such as for
synchronization).. although the thinking was to keep it simple for
first version of dmabuf

BR,
-R

 mmap just feels wrong in this API, which is a buffer sharing API not a
 buffer mapping API.

 I guess if sharing a buffer between multiple drm devices, there is
 nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
 pass when the buffer is allocated, then you don't have to support
 dmabuf->mmap(), and instead mmap via device and use some sort of
 DRM_CPU_PREP/FINI ioctls for synchronization..

 Or we could make a generic CPU accessor that we don't have to worry about.

 Dave.



Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Dave Airlie
On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
 On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
 But then we'd need a different set of accessors for every different
 drm/v4l/etc driver, wouldn't we?

 Not any more different than you need for this, you just have a new
 interface that you request a sw object from,
 then mmap that object, and underneath it knows who owns it in the kernel.

 oh, ok, so you are talking about a kernel level interface, rather than
 userspace..

 but I guess in this case I don't quite see the difference.  It amounts
 to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
 directly then you don't have to pass around a 2nd fd.

 [*] there is nothing stopping defining some dmabuf ioctls (such as for
 synchronization).. although the thinking was to keep it simple for
 first version of dmabuf


Yes a separate kernel level interface.

Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
shoehorning in a sw mapping API isn't making it simpler.

The problem I have with implementing mmap on the sharing fd, is that
nothing says this should be purely optional and userspace shouldn't
rely on it.

In the Intel GEM space alone you have two types of mapping, one direct
to shmem and one via the GTT; the GTT could even be a linear view. The
Intel guys initially did GEM mmaps direct to the shmem pages because
it seemed simple, up until they had to do step two, which was mmaps on
the GTT copy, and ended up having two separate mmap methods. I think
the problem here is that it seems deceptively simple to add this to the
API now because the API is simple; however, I think in the future it'll
become a burden that we'll have to work around.

Dave.


Re: [PATCH 19/21] drm/i915: Asynchronous eDP panel power off

2011-10-12 Thread Dave Airlie
 Using the same basic plan as the VDD force delayed power off, make
 turning the panel power off asynchronous.

NAK. Tested on my 2540p: everything in the macbook-air branch up to this
patch worked; with this patch applied I just get a black screen on resume.

Dave.


Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Daniel Vetter
On Wed, Oct 12, 2011 at 03:34:54PM +0100, Dave Airlie wrote:
 On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
  On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
  But then we'd need a different set of accessors for every different
  drm/v4l/etc driver, wouldn't we?
 
  Not any more different than you need for this, you just have a new
  interface that you request a sw object from,
  then mmap that object, and underneath it knows who owns it in the kernel.
 
  oh, ok, so you are talking about a kernel level interface, rather than
  userspace..
 
  but I guess in this case I don't quite see the difference.  It amounts
  to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
  directly then you don't have to pass around a 2nd fd.
 
  [*] there is nothing stopping defining some dmabuf ioctls (such as for
  synchronization).. although the thinking was to keep it simple for
  first version of dmabuf
 
 
 Yes a separate kernel level interface.
 
 Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
 shoehorning in a sw mapping API isn't making it simpler.
 
 The problem I have with implementing mmap on the sharing fd, is that
 nothing says this should be purely optional and userspace shouldn't
 rely on it.
 
 In the Intel GEM space alone you have two types of mapping, one direct
 to shmem one via GTT, the GTT could be even be a linear view. The
 intel guys initially did GEM mmaps direct to the shmem pages because
 it seemed simple, up until they
 had to do step two which was do mmaps on the GTT copy and ended up
 having two separate mmap methods. I think the problem here is it seems
 deceptively simple to add this to the API now because the API is
 simple, however I think in the future it'll become a burden that we'll
 have to workaround.

Yeah, that's my feeling, too. Adding mmap sounds like a neat, simple idea
that could simplify things for simple devices like v4l. But as soon as
you're dealing with a real gpu, nothing is simple. Those who don't believe
this should just take a look at the data upload/download paths in the
open-source i915, nouveau and radeon drivers. Making this fast (and for gpus,
it needs to be fast) requires tons of tricks, special cases and jumping
through hoops.

You absolutely want the device-specific ioctls to do that. Adding a
generic mmap just makes matters worse, especially if userspace expects
this to work synchronized with everything else that is going on.

Cheers, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism

2011-10-12 Thread Rob Clark
On Wed, Oct 12, 2011 at 9:34 AM, Dave Airlie airl...@gmail.com wrote:
 On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
 On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
 But then we'd need a different set of accessors for every different
 drm/v4l/etc driver, wouldn't we?

 Not any more different than you need for this, you just have a new
 interface that you request a sw object from,
 then mmap that object, and underneath it knows who owns it in the kernel.

 oh, ok, so you are talking about a kernel level interface, rather than
 userspace..

 but I guess in this case I don't quite see the difference.  It amounts
 to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
 directly then you don't have to pass around a 2nd fd.

 [*] there is nothing stopping defining some dmabuf ioctls (such as for
 synchronization).. although the thinking was to keep it simple for
 first version of dmabuf


 Yes a separate kernel level interface.

I'm not against it, but if it is a device-independent interface, it
just seems like six of one, half-dozen of the other..

Ie. how does it differ if the dmabuf fd is the fd used for ioctl/mmap,
vs. if it is some other /dev/buffer-sharer file that you open?

But I think maybe I'm misunderstanding what you have in mind?

BR,
-R

 Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
 shoehorning in a sw mapping API isn't making it simpler.

 The problem I have with implementing mmap on the sharing fd, is that
 nothing says this should be purely optional and userspace shouldn't
 rely on it.

 In the Intel GEM space alone you have two types of mapping, one direct
 to shmem one via GTT, the GTT could be even be a linear view. The
 intel guys initially did GEM mmaps direct to the shmem pages because
 it seemed simple, up until they
 had to do step two which was do mmaps on the GTT copy and ended up
 having two separate mmap methods. I think the problem here is it seems
 deceptively simple to add this to the API now because the API is
 simple, however I think in the future it'll become a burden that we'll
 have to workaround.

 Dave.



Re: [PATCH 19/21] drm/i915: Asynchronous eDP panel power off

2011-10-12 Thread Keith Packard
On Wed, 12 Oct 2011 15:41:11 +0100, Dave Airlie airl...@gmail.com wrote:

  Using the same basic plan as the VDD force delayed power off, make
  turning the panel power off asynchronous.
 
 NAK, tested on my 2540p, up to this patch in macbook-air branch stuff
 worked, after this I just get black screen on resume.

Thanks for testing. I've created a new edp-training-fixes branch that
removes the async panel power off and leaves the rest of the branch.

I'll test on a 2540p that I've got access to today, and on the MBA when
I get home this evening.

-- 
keith.pack...@intel.com




[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)

2011-10-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Jeremy Huddleston jerem...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #20 from Jeremy Huddleston jerem...@freedesktop.org 2011-10-12 10:00:48 PDT ---
Thanks.  Closing based on the above comment.



Re: Power profiles low and mid are identical on Radeon HD6470M

2011-10-12 Thread Wolfgang Fritz

On 11.10.2011 23:53, Alex Deucher wrote:

On Sat, Oct 8, 2011 at 2:25 PM, Wolfgang Fritz wolfgang.fr...@gmx.net wrote:

Hello,

I have an HP Elitebook 8560p with Radeon HD6470M graphics, running Debian
sid with kernel 3.0.4.

I noticed that the power profiles low and mid are setting identical clocks
and voltage, the lowest possible values:

default engine clock: 75 kHz
current engine clock: 0 kHz
default memory clock: 90 kHz
current memory clock: 149970 kHz
voltage: 900 mV

Looking at the code, this seems to be intentional at least for the mobility
chips, but the chip provides more modes:

[9.361401] [drm] R600: Number of power states = 7
[9.361402] [drm] Is mobility = YES
[9.361403] [drm] ps #0 type 0, modes=3
[9.361404] [drm] 0: mclk=9, sclk=75000, volt=1100, vddci=0
[9.361406] [drm] 1: mclk=9, sclk=75000, volt=1100, vddci=0
[9.361407] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
[9.361409] [drm] ps #1 type 4, modes=3
[9.361410] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
[9.361411] [drm] 1: mclk=9, sclk=4, volt=1000, vddci=0
[9.361413] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
[9.361414] [drm] ps #2 type 0, modes=3
[9.361415] [drm] 0: mclk=9, sclk=7, volt=1100, vddci=0
[9.361417] [drm] 1: mclk=9, sclk=7, volt=1100, vddci=0
[9.361418] [drm] 2: mclk=9, sclk=7, volt=1100, vddci=0
[9.361419] [drm] ps #3 type 2, modes=3
[9.361420] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
[9.361422] [drm] 1: mclk=15000, sclk=1, volt=900, vddci=0
[9.361423] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
[9.361424] [drm] ps #4 type 2, modes=3
[9.361426] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
[9.361427] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
[9.361428] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
[9.361430] [drm] ps #5 type 2, modes=3
[9.361431] [drm] 0: mclk=3, sclk=3, volt=900, vddci=0
[9.361433] [drm] 1: mclk=3, sclk=3, volt=900, vddci=0
[9.361434] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
[9.361435] [drm] ps #6 type 0, modes=3
[9.361436] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
[9.361438] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
[9.361439] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
[9.361440] [drm] NOT CHIP_R600

(dmesg output from patched radeon module)

Questions:
1. Is this a bug or a feature? (I see that it is not obvious which power
state to choose)


It's the way it is.



:-)


2. What do the 3 clock/voltage modes per power state mean?


On r6xx+, each power state defines an operating state (e.g., single
head battery, multi-head battery, single head performance, multi-head
performance, etc.).  Within each operating state, there are
high/mid/low clock modes that define that operating state.  So if
you have one head active and are on battery, the driver should switch
between the high/mid/low clock modes defined in that power state based
on the GPU load.  If you enable multi-head and are still on battery,
the driver would switch to the multi-head battery state and switch
between the high/mid/low modes in that state.
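The two-level selection described above (operating state first, then a clock mode within it) can be sketched as follows. This is only an illustration: the enum names and the load thresholds are hypothetical and are not taken from the radeon driver.

```c
#include <assert.h>

/* Operating states, chosen from power source and number of active heads. */
enum op_state {
	STATE_BATTERY_SINGLE,
	STATE_BATTERY_MULTI,
	STATE_PERF_SINGLE,
	STATE_PERF_MULTI
};

/* Clock modes available inside every operating state. */
enum clock_mode { MODE_LOW, MODE_MID, MODE_HIGH };

/* Step 1: pick the operating state. */
enum op_state pick_state(int on_battery, unsigned int active_heads)
{
	if (on_battery)
		return active_heads > 1 ? STATE_BATTERY_MULTI : STATE_BATTERY_SINGLE;
	return active_heads > 1 ? STATE_PERF_MULTI : STATE_PERF_SINGLE;
}

/* Step 2: pick a clock mode within that state from the GPU load.
 * The 30% and 70% thresholds are made up for this sketch. */
enum clock_mode pick_mode(unsigned int load_percent)
{
	assert(load_percent <= 100);
	if (load_percent < 30)
		return MODE_LOW;
	if (load_percent < 70)
		return MODE_MID;
	return MODE_HIGH;
}
```

Plugging in a second head while on battery changes only the result of `pick_state()`; the load-based choice in `pick_mode()` then applies to the modes of the new state.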



OK. That's what I assumed after short code inspection.

So, this is not cooperating well with the current dynamic clock 
interface in sysfs (at least as I understand it now).


I understand that there are the dynamic and the profile power methods.
In dynamic, I see the clocks switching, probably using the 3 power 
states in the second operation state in the list above (maximum 
performance). This results in an average power consumption similar to 
the catalyst driver (the fan is off most of the time). But it is not 
usable because the screen flickers when the clock state is changed, and 
this happens quite frequently. Also it seems to be independent of 
battery/mains mode.


In the profile power mode, the clocks are at full speed with clock 
profiles default, high and at lowest speed with profiles mid and low. 
The high profile keeps the fan running continuously. This seems to be 
independent of mains or battery mode (I have to double check this)


Low and mid profiles are unusably slow with 3D effects enabled, but work 
quite well with effects disabled, so this would be a suitable profile on 
low battery.


With power profile auto, power state is high performance in mains mode 
and low in battery mode.


So, as long as true dynamic clocking is not working flicker free, it 
would be nice to be able to change the clock modes manually to a value 
that keeps the fan quiet but is sufficient for ordinary work with 
effects enabled. I am currently running at 400/650 MHz @ 900mV with a 
patched driver.


Finally some questions:

Q1: Are all the power modes safe (maybe not optimal) to be used in all 
configurations (dual/single, battery/mains) or is it dangerous (meaning 
for the HW) using for example a dual 

[PATCH 1/3] drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping

2011-10-12 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index 8a171b2..a90d9ee 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1756,10 +1756,15 @@ static int radeon_atom_pick_dig_encoder(struct drm_encoder *encoder)
 	if (ASIC_IS_DCE4(rdev)) {
 		dig = radeon_encoder->enc_priv;
 		if (ASIC_IS_DCE41(rdev)) {
-			if (dig->linkb)
-				return 1;
-			else
-				return 0;
+			/* ontario follows DCE4 */
+			if (rdev->family == CHIP_PALM) {
+				if (dig->linkb)
+					return 1;
+				else
+					return 0;
+			} else
+				/* llano follows DCE3.2 */
+				return radeon_crtc->crtc_id;
 		} else {
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
-- 
1.7.1.1



[PATCH 2/3] drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls

2011-10-12 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

It's handled via external clock.  It should already be protected
by the external ss flag, but add an explicit check just in case.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/atombios_crtc.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index c742944..a515b2a 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -466,7 +466,7 @@ static void atombios_crtc_program_ss(struct drm_crtc *crtc,
 			return;
 		}
 		args.v2.ucEnable = enable;
-		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK))
+		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK) || ASIC_IS_DCE41(rdev))
 			args.v2.ucEnable = ATOM_DISABLE;
 	} else if (ASIC_IS_DCE3(rdev)) {
 		args.v1.usSpreadSpectrumPercentage = cpu_to_le16(ss->percentage);
-- 
1.7.1.1



[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges

2011-10-12 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..bfe1662 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1



[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)

2011-10-12 Thread alexdeucher
From: Alex Deucher alexander.deuc...@amd.com

Settings in this table reflect the physical panel/connector rather
than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..eb3f6dc 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_CRT;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1

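The decision made by the v2 patch can be restated outside the driver as a small standalone function. The enums below are stand-ins invented for this sketch (their values are not the kernel's), and `internal_mode` plays the role of whatever `atombios_get_encoder_mode()` would return.

```c
#include <assert.h>

/* Stand-in types for illustration only. */
enum connector_type { CONNECTOR_LVDS, CONNECTOR_VGA, CONNECTOR_OTHER };
enum encode_mode { ENCODE_MODE_DP, ENCODE_MODE_LVDS, ENCODE_MODE_CRT };

/*
 * Behind a DP bridge, report the mode of the physical connector
 * (LVDS panel or VGA/CRT); otherwise keep the internal dig encoding.
 */
enum encode_mode crtc_source_encode_mode(int is_dp_bridge,
					 enum connector_type type,
					 enum encode_mode internal_mode)
{
	assert(type <= CONNECTOR_OTHER);
	if (is_dp_bridge) {
		if (type == CONNECTOR_LVDS)
			return ENCODE_MODE_LVDS;
		if (type == CONNECTOR_VGA)
			return ENCODE_MODE_CRT;
	}
	return internal_mode;
}
```

The first posting of this patch returned the LVDS mode for both connector types; the VGA branch returning the CRT mode is the typo fix that v2 describes.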


drm/radeon/kms: improve performance of blit-copy

2011-10-12 Thread Ilija Hadzic

The following set of patches will improve the performance
of blit-copy functions for Radeon GPUs based on 
R600, R700, Evergreen and NI ASICs.

The foundation for improvement is the use of tiled mode access
(which for copying bo's can be used regardless of whether the
content is tiled or not), and segmenting the memory block
being copied into rectangles whose edge ratio is between 1:1
and 1:2. This maximizes the number of PCIe transactions that
use maximum payload size (typically 128 bytes) and also 
creates a memory access pattern that is more favorable for
both VRAM and host DRAM than what's currently in the kernel.
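As a rough illustration of the segmentation idea, the sketch below picks, for a given number of 4 KiB pages, the largest rectangle whose width:height ratio is 1:1 or 2:1. It is not the code from the patches: the function and struct names are hypothetical, it assumes 4 bytes per pixel and power-of-two growth, and the constants only mirror the RECT_UNIT_H and MAX_RECT_DIM defines that patch 1/9 adds.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SIZE	4096u
#define RECT_UNIT_H	32u
#define RECT_UNIT_W	(GPU_PAGE_SIZE / 4u / RECT_UNIT_H)	/* 32 pixels */
#define MAX_RECT_DIM	16384u

struct blit_rect {
	uint32_t w;	/* width in pixels */
	uint32_t h;	/* height in pixels */
	uint32_t pages;	/* pages covered by w x h */
};

/* One page is one RECT_UNIT_W x RECT_UNIT_H unit of 4-byte pixels. */
struct blit_rect pick_rect(uint32_t num_pages)
{
	struct blit_rect r = { 0, 0, 0 };
	uint32_t wu = 1, hu = 1;	/* rectangle size in units */

	if (num_pages == 0)
		return r;

	/* Alternately double width and height while the pages still fit. */
	for (;;) {
		if (wu == hu) {
			if (2 * wu * hu > num_pages ||
			    2 * wu * RECT_UNIT_W > MAX_RECT_DIM)
				break;
			wu *= 2;
		} else {
			if (2 * wu * hu > num_pages)
				break;
			hu *= 2;
		}
	}
	assert(wu == hu || wu == 2 * hu);

	r.w = wu * RECT_UNIT_W;
	r.h = hu * RECT_UNIT_H;
	r.pages = wu * hu;
	return r;
}
```

A caller would copy `r.pages` pages per rectangle and call `pick_rect()` again on the remainder until nothing is left.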

To come up with the new blit-copy code, I did a lot of 
PCIe traffic analysis with the bus analyzer and also 
had many discussions with Alex, trying to explain what's 
going on (thanks to Alex for his time).

Below (at the end of this note) are the results of some benchmarks
that I did with various GPUs (all in the same host: Intel i7 CPU,
X58 chipset, three DRAM channels). To run the tests on your machine
load the radeon module with 'benchmark=1 pcie_gen2=1' parameters.
Most significant improvement is in the upstream (VRAM to GART)
direction because that's where the PCIe transactions were fragmented 
and also where memory access pattern was such that it created a lot of 
backpressure from the host.

It is also interesting that high-end devices (e.g. Cayman) exhibit
the least improvement and were the worst to begin with. This is
because high-end devices copy more tiles in parallel which 
in turn can create bank conflicts on host memory and cause the
host to do lots of bank-close/precharge/bank-open cycles. 

As an added bonus, I also did some code cleanup and consolidated
the repeated code into a common function, so r600 and evergreen/NI
parts now share the blit-copy code. I also expanded the
benchmark coverage, so the module now takes a benchmark parameter
value between 1 and 8, each of which runs a different
benchmark.

For details, see the commit log messages and the code.
I have been running with these patches for a few months 
(and I kept rebasing them to drm-core-next as the public 
git progressed) and I used them in a system setup that does
*many* copying of this kind (and does them frequently); I 
have not seen instabilities introduced by these patches. I also
verified the correctness of the copy using test=1 parameter
for each GPU that I had and the test passed.

I would welcome some feedback and if you run the benchmarks
with the new blit code, I would very much like to hear
what kind of improvement you are seeing.


BENCHMARK RESULTS:
==================

1) VRAM to GTT
==============

Card (ASIC)      VRAM           Before   After
----------------------------------------------
5570 (Redwood)   DDR3 1600MHz      454    3912
6450 (Caicos)    DDR5 3200MHz     3718    5090
6570 (Turks)     DDR3 1800MHz      484    4144
5450 (Cedar)     DDR3 1600MHz     3679    5090
5450 (Cedar)     DDR2  800MHz     2695    4639
E4690 (RV730)    DDR3 1400MHz      485    4969
E6760 (Turks)    DDR5 3200MHz      474    4177
V5700 (RV730)    DDR3 MHz          488    4297
2260 (RV620)     DDR2 MHz          494    3093
6870 (Barts)     DDR5 4200MHz      475    1113
6970 (Cayman)    DDR5 4200MHz      473     710

2) GTT to VRAM
==============

Card (ASIC)      VRAM           Before   After
----------------------------------------------
5570 (Redwood)   DDR3 1600MHz     3158    3360
6450 (Caicos)    DDR5 3200MHz     2995    3393
6570 (Turks)     DDR3 1800MHz     3039    3339
5450 (Cedar)     DDR3 1600MHz     3246    3404
5450 (Cedar)     DDR2  800MHz     2614    3371
E4690 (RV730)    DDR3 1400MHz     3084    3426
E6760 (Turks)    DDR5 3200MHz     2443    2570
V5700 (RV730)    DDR3 MHz         3187    3506
2260 (RV620)     DDR2 MHz          584    3246
6870 (Barts)     DDR5 4200MHz     2472    2601
6970 (Cayman)    DDR5 4200MHz     2460    2737


[PATCH 1/9] drm/radeon/kms: improve evergreen blit code

2011-10-12 Thread Ilija Hadzic
start with first-cut conceptual patch from Alex Deucher
(commit info below); turn on 1D tiling
make rectangular buffer always 2:1 or 1:2 ratio
make buffer dimensions an integer multiple of unit dimensions
make sure that an integral number of pages maps to the buffer
fix a few bugs that resulted in incorrect dimensions
tidy up a little bit to get rid of an ugly if/else
parametrize some magic constants
add protections from illegal buffer sizes etc.

From 77e6703c37f0ad8673b9ab285589d5c26782a515 Mon Sep 17 00:00:00 2001
From: Alex Deucher alexdeuc...@gmail.com
Date: Tue, 17 May 2011 05:08:58 -0400
Subject: [PATCH 1/2] drm/radeon/kms: simplify evergreen blit code

Convert 4k pages to multiples of 64x64x4 tiles.
This is also more efficient than a scanline based
approach from the MC's perspective.

Signed-off-by: Alex Deucher alexdeuc...@gmail.com
Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/evergreen.c  |4 +-
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |  295 +++
 drivers/gpu/drm/radeon/radeon_asic.h|4 +-
 3 files changed, 123 insertions(+), 180 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 5df39bf..5f0ecc7 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3180,14 +3180,14 @@ int evergreen_copy_blit(struct radeon_device *rdev,
 
 	mutex_lock(&rdev->r600_blit.mutex);
 	rdev->r600_blit.vb_ib = NULL;
-	r = evergreen_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE);
+	r = evergreen_blit_prepare_copy(rdev, num_pages);
 	if (r) {
 		if (rdev->r600_blit.vb_ib)
 			radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 		mutex_unlock(&rdev->r600_blit.mutex);
 		return r;
 	}
-	evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE);
+	evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages);
 	evergreen_blit_done_copy(rdev, fence);
 	mutex_unlock(&rdev->r600_blit.mutex);
 	return 0;
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 2eb2518..3b24137 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -44,6 +44,10 @@
 #define COLOR_5_6_5   0x8
 #define COLOR_8_8_8_8 0x1a
 
+#define RECT_UNIT_H   32
+#define RECT_UNIT_W   (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
+#define MAX_RECT_DIM  16384
+
 /* emits 17 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -56,7 +60,7 @@ set_render_target(struct radeon_device *rdev, int format,
    if (h < 8)
        h = 8;
 
-   cb_color_info = ((format << 2) | (1 << 24) | (1 << 8));
+   cb_color_info = ((format << 2) | (1 << 24) | (2 << 8));
    pitch = (w / 8) - 1;
    slice = ((w * h) / 64) - 1;
 
@@ -67,7 +71,7 @@ set_render_target(struct radeon_device *rdev, int format,
    radeon_ring_write(rdev, slice);
    radeon_ring_write(rdev, 0);
    radeon_ring_write(rdev, cb_color_info);
-   radeon_ring_write(rdev, (1 << 4));
+   radeon_ring_write(rdev, 0);
    radeon_ring_write(rdev, (w - 1) | ((h - 1) << 16));
    radeon_ring_write(rdev, 0);
    radeon_ring_write(rdev, 0);
@@ -179,7 +183,7 @@ set_tex_resource(struct radeon_device *rdev,
    sq_tex_resource_word0 = (1 << 0); /* 2D */
    sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 6) |
                              ((w - 1) << 18));
-   sq_tex_resource_word1 = ((h - 1) << 0) | (1 << 28);
+   sq_tex_resource_word1 = ((h - 1) << 0) | (2 << 28);
    /* xyzw swizzles */
    sq_tex_resource_word4 = (0 << 16) | (1 << 19) | (2 << 22) | (3 << 25);
 
@@ -751,30 +755,80 @@ static void evergreen_vb_ib_put(struct radeon_device *rdev)
    radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-int evergreen_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
+
+/* maps the rectangle to the buffer so that satisfies the following properties:
+ * - dimensions are less or equal to the hardware limit (MAX_RECT_DIM)
+ * - rectangle consists of integer number of pages
+ * - height is an integer multiple of RECT_UNIT_H
+ * - width is an integer multiple of RECT_UNIT_W
+ * - (the above three conditions also guarantee tile-aligned size)
+ * - it is as square as possible (sides ratio never greater than 2:1)
+ * - uses maximum number of pages that fit the above constraints
+ *
+ *  input:  buffer size, pointers to width/height variables
+ *  return: number of pages that were successfully mapped to the rectangle
+ *  width/height of the rectangle
+ */
+static unsigned evergreen_blit_create_rect(unsigned num_pages, int *width, int *height)
+{
+   unsigned max_pages;
+   unsigned pages = num_pages;
+   int w, h;
+
+   if (num_pages 
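
The constraints listed in the comment above evergreen_blit_create_rect can be checked with a small userspace sketch. This is not code from the patch: rect_is_valid() and rect_pages() are hypothetical helper names, and GPU_PAGE_SIZE stands in for RADEON_GPU_PAGE_SIZE under the assumption that it is 4096 bytes. It only illustrates why alignment to RECT_UNIT_W and RECT_UNIT_H already implies a whole number of pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Constants mirror the patch; GPU_PAGE_SIZE is an assumed stand-in for
 * RADEON_GPU_PAGE_SIZE (4096 bytes). */
#define GPU_PAGE_SIZE 4096
#define RECT_UNIT_H   32
#define RECT_UNIT_W   (GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
#define MAX_RECT_DIM  16384

/* Hypothetical helper: true when a w x h rectangle of 32-bit pixels meets
 * the size limit and alignment rules from the comment. */
static bool rect_is_valid(int w, int h)
{
	if (w <= 0 || h <= 0)
		return false;
	if (w > MAX_RECT_DIM || h > MAX_RECT_DIM)
		return false;
	if (w % RECT_UNIT_W || h % RECT_UNIT_H)
		return false;
	/* both alignment checks together imply a whole number of pages */
	assert(((long long)w * h * 4) % GPU_PAGE_SIZE == 0);
	return true;
}

/* Hypothetical helper: number of GPU pages covered by a valid rectangle. */
static unsigned rect_pages(int w, int h)
{
	return (unsigned)(((long long)w * h) / (RECT_UNIT_W * RECT_UNIT_H));
}
```

With these values one unit rectangle is 32x32 pixels at 4 bytes each, that is, exactly one 4k page.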

[PATCH 2/9] drm/radeon/kms: improve r6xx blit code

2011-10-12 Thread Ilija Hadzic
start with first-cut conceptual patch from Alex Deucher
(commit info below); turn on 1D tiling
make rectangular buffer always 2:1 or 1:2 ratio
make buffer dimensions an integer multiple of unit dimensions
make sure that an integral number of pages maps to the buffer
fix a few bugs that resulted in incorrect dimensions
tidy up a little bit to get rid of an ugly if/else
parametrize some magic constants
add protections from illegal buffer sizes etc.

From 2cd7a267d6cbcdf414b7a724237aa24525c12b54 Mon Sep 17 00:00:00 2001
From: Alex Deucher alexdeuc...@gmail.com
Date: Tue, 17 May 2011 05:09:43 -0400
Subject: [PATCH 2/2] drm/radeon/kms: simplify r6xx blit code

Convert 4k pages to multiples of 64x64x4 tiles.
This is also more efficient than a scanline based
approach from the MC's perspective.

Signed-off-by: Alex Deucher alexdeuc...@gmail.com
Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/r600.c  |4 +-
 drivers/gpu/drm/radeon/r600_blit_kms.c |  276 
 drivers/gpu/drm/radeon/radeon_asic.h   |4 +-
 3 files changed, 109 insertions(+), 175 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 334aee6..9fc6844 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2363,14 +2363,14 @@ int r600_copy_blit(struct radeon_device *rdev,
 
    mutex_lock(&rdev->r600_blit.mutex);
    rdev->r600_blit.vb_ib = NULL;
-   r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE);
+   r = r600_blit_prepare_copy(rdev, num_pages);
    if (r) {
        if (rdev->r600_blit.vb_ib)
            radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
        mutex_unlock(&rdev->r600_blit.mutex);
        return r;
    }
-   r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE);
+   r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages);
    r600_blit_done_copy(rdev, fence);
    mutex_unlock(&rdev->r600_blit.mutex);
    return 0;
diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index 9aa74c3..d9994c9 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -42,6 +42,10 @@
 #define COLOR_5_6_5   0x8
 #define COLOR_8_8_8_8 0x1a
 
+#define RECT_UNIT_H   32
+#define RECT_UNIT_W   (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
+#define MAX_RECT_DIM  8192
+
 /* emits 21 on rv770+, 23 on r600 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -600,13 +604,59 @@ static void r600_vb_ib_put(struct radeon_device *rdev)
    radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
+/* FIXME: the function is very similar to evergreen_blit_create_rect, except
+   that it different predefined constants; consider commonizing */
+static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *height)
+{
+   unsigned max_pages;
+   unsigned pages = num_pages;
+   int w, h;
+
+   if (num_pages == 0) {
+       /* not supposed to be called with no pages, but just in case */
+       h = 0;
+       w = 0;
+       pages = 0;
+       WARN_ON(1);
+   } else {
+       int rect_order = 2;
+       h = RECT_UNIT_H;
+       while (num_pages / rect_order) {
+           h *= 2;
+           rect_order *= 4;
+           if (h >= MAX_RECT_DIM) {
+               h = MAX_RECT_DIM;
+               break;
+           }
+       }
+       max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
+       if (pages > max_pages)
+           pages = max_pages;
+       w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
+       w = (w / RECT_UNIT_W) * RECT_UNIT_W;
+       pages = (w * h) / (RECT_UNIT_W * RECT_UNIT_H);
+       BUG_ON(pages == 0);
+   }
+
+
+   DRM_DEBUG("blit_rectangle: h=%d, w=%d, pages=%d\n", h, w, pages);
+
+   /* return width and height only if the caller wants it */
+   if (height)
+       *height = h;
+   if (width)
+       *width = w;
+
+   return pages;
+}
+
+
+int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_pages)
 {
int r;
-   int ring_size, line_size;
-   int max_size;
+   int ring_size;
/* loops of emits 64 + fence emit possible */
-   int dwords_per_loop = 76, num_loops;
+   int dwords_per_loop = 76, num_loops = 0;
 
r = r600_vb_ib_get(rdev);
if (r)
@@ -616,18 +666,12 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
    if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
        dwords_per_loop += 2;
 
-   /* 8 
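
To see how the rectangle sizing in r600_blit_create_rect behaves for concrete page counts, the loop can be restated as a userspace sketch. Assumptions: GPU_PAGE_SIZE replaces RADEON_GPU_PAGE_SIZE and is taken to be 4096, the hardware limit is passed as a max_dim argument instead of the MAX_RECT_DIM define, assert() replaces BUG_ON(), and blit_count_loops() is a hypothetical helper that mimics the "num loops" counting in the prepare_copy function.

```c
#include <assert.h>
#include <stddef.h>

#define GPU_PAGE_SIZE 4096   /* assumed value of RADEON_GPU_PAGE_SIZE */
#define RECT_UNIT_H   32
#define RECT_UNIT_W   (GPU_PAGE_SIZE / 4 / RECT_UNIT_H)

/* Userspace restatement of the sizing loop; returns the number of pages
 * that fit the chosen rectangle and optionally reports its dimensions. */
static unsigned blit_create_rect(unsigned num_pages, int *width, int *height,
				 int max_dim)
{
	unsigned pages = num_pages;
	unsigned max_pages;
	unsigned rect_order = 2;
	int w = 0, h = 0;

	if (num_pages != 0) {
		h = RECT_UNIT_H;
		/* double the height each time the page count grows by 4x */
		while (num_pages / rect_order) {
			h *= 2;
			rect_order *= 4;
			if (h >= max_dim) {
				h = max_dim;
				break;
			}
		}
		max_pages = ((unsigned)max_dim * (unsigned)h) /
			    (RECT_UNIT_W * RECT_UNIT_H);
		if (pages > max_pages)
			pages = max_pages;
		w = (int)((pages * RECT_UNIT_W * RECT_UNIT_H) / (unsigned)h);
		w = (w / RECT_UNIT_W) * RECT_UNIT_W;	/* round width down */
		pages = ((unsigned)w * (unsigned)h) / (RECT_UNIT_W * RECT_UNIT_H);
		assert(pages != 0);
	}
	if (height)
		*height = h;
	if (width)
		*width = w;
	return pages;
}

/* Hypothetical helper: how many rectangles a copy of num_pages needs. */
static int blit_count_loops(unsigned num_pages, int max_dim)
{
	int loops = 0;

	while (num_pages) {
		num_pages -= blit_create_rect(num_pages, NULL, NULL, max_dim);
		loops++;
	}
	return loops;
}
```

For example, 8 pages map to a single 64x128 rectangle, while 3 pages need two passes (a 32x64 rectangle holding 2 pages, then a 32x32 rectangle for the last page), because the width is rounded down to a multiple of RECT_UNIT_W.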

[PATCH 3/9] drm/radeon/kms: demystify evergreen blit code

2011-10-12 Thread Ilija Hadzic
some bits in 3D registers used by blit functions look like
magic and this is hard to follow; change them to a little bit
more meaningful pre-defined constants

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |   29 +--
 drivers/gpu/drm/radeon/evergreend.h |   42 +++
 2 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 3b24137..68d0de2 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -60,7 +60,9 @@ set_render_target(struct radeon_device *rdev, int format,
    if (h < 8)
        h = 8;
 
-   cb_color_info = ((format << 2) | (1 << 24) | (2 << 8));
+   cb_color_info = CB_FORMAT(format) |
+   CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
+   CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
pitch = (w / 8) - 1;
slice = ((w * h) / 64) - 1;
 
@@ -137,12 +139,16 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
u32 sq_vtx_constant_word2, sq_vtx_constant_word3;
 
/* high addr, stride */
-   sq_vtx_constant_word2 = ((upper_32_bits(gpu_addr) & 0xff) | (16 << 8));
+   sq_vtx_constant_word2 = SQ_VTXC_BASE_ADDR_HI(upper_32_bits(gpu_addr) & 0xff) |
+       SQ_VTXC_STRIDE(16);
 #ifdef __BIG_ENDIAN
-   sq_vtx_constant_word2 |= (2 << 30);
+   sq_vtx_constant_word2 |= SQ_VTXC_ENDIAN_SWAP(SQ_ENDIAN_8IN32);
 #endif
    /* xyzw swizzles */
-   sq_vtx_constant_word3 = (0 << 3) | (1 << 6) | (2 << 9) | (3 << 12);
+   sq_vtx_constant_word3 = SQ_VTCX_SEL_X(SQ_SEL_X) |
+   SQ_VTCX_SEL_Y(SQ_SEL_Y) |
+   SQ_VTCX_SEL_Z(SQ_SEL_Z) |
+   SQ_VTCX_SEL_W(SQ_SEL_W);
 
radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 8));
radeon_ring_write(rdev, 0x580);
@@ -153,7 +159,7 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
radeon_ring_write(rdev, 0);
radeon_ring_write(rdev, 0);
radeon_ring_write(rdev, 0);
-   radeon_ring_write(rdev, SQ_TEX_VTX_VALID_BUFFER << 30);
+   radeon_ring_write(rdev, S__SQ_CONSTANT_TYPE(SQ_TEX_VTX_VALID_BUFFER));
 
    if ((rdev->family == CHIP_CEDAR) ||
        (rdev->family == CHIP_PALM) ||
@@ -180,14 +186,19 @@ set_tex_resource(struct radeon_device *rdev,
    if (h < 1)
        h = 1;
 
-   sq_tex_resource_word0 = (1 << 0); /* 2D */
+   sq_tex_resource_word0 = TEX_DIM(SQ_TEX_DIM_2D);
    sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 6) |
                              ((w - 1) << 18));
-   sq_tex_resource_word1 = ((h - 1) << 0) | (2 << 28);
+   sq_tex_resource_word1 = ((h - 1) << 0) |
+       TEX_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
    /* xyzw swizzles */
-   sq_tex_resource_word4 = (0 << 16) | (1 << 19) | (2 << 22) | (3 << 25);
+   sq_tex_resource_word4 = TEX_DST_SEL_X(SQ_SEL_X) |
+       TEX_DST_SEL_Y(SQ_SEL_Y) |
+       TEX_DST_SEL_Z(SQ_SEL_Z) |
+       TEX_DST_SEL_W(SQ_SEL_W);
 
-   sq_tex_resource_word7 = format | (SQ_TEX_VTX_VALID_TEXTURE << 30);
+   sq_tex_resource_word7 = format |
+   S__SQ_CONSTANT_TYPE(SQ_TEX_VTX_VALID_TEXTURE);
 
radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 8));
radeon_ring_write(rdev, 0);
diff --git a/drivers/gpu/drm/radeon/evergreend.h b/drivers/gpu/drm/radeon/evergreend.h
index 7363d9d..b937c49 100644
--- a/drivers/gpu/drm/radeon/evergreend.h
+++ b/drivers/gpu/drm/radeon/evergreend.h
@@ -941,11 +941,15 @@
 #define CB_COLOR0_SLICE     0x28c68
 #define CB_COLOR0_VIEW      0x28c6c
 #define CB_COLOR0_INFO      0x28c70
+#  define CB_FORMAT(x) ((x) << 2)
 #   define CB_ARRAY_MODE(x) ((x) << 8)
 #   define ARRAY_LINEAR_GENERAL 0
 #   define ARRAY_LINEAR_ALIGNED 1
 #   define ARRAY_1D_TILED_THIN1 2
 #   define ARRAY_2D_TILED_THIN1 4
+#  define CB_SOURCE_FORMAT(x)  ((x) << 24)
+#  define CB_SF_EXPORT_FULL    0
+#  define CB_SF_EXPORT_NORM    1
 #define CB_COLOR0_ATTRIB    0x28c74
 #define CB_COLOR0_DIM       0x28c78
 /* only CB0-7 blocks have these regs */
@@ -1107,15 +1111,53 @@
 #define CB_COLOR7_CLEAR_WORD3   0x28e3c
 
 #define SQ_TEX_RESOURCE_WORD0_0 0x30000
+#  define TEX_DIM(x)   ((x) << 0)
+#  define SQ_TEX_DIM_1D    0
+#  define SQ_TEX_DIM_2D   
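
As a quick sanity check that the named constants describe the same register value as the magic numbers they replace, the CB_COLOR0_INFO word can be built both ways in a userspace sketch. The macro definitions are copied from the evergreend.h hunk above; cb_color_info_magic() and cb_color_info_named() are hypothetical function names used only for this comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Field macros as added to evergreend.h by this patch */
#define CB_FORMAT(x)         ((x) << 2)
#define CB_ARRAY_MODE(x)     ((x) << 8)
#define ARRAY_1D_TILED_THIN1 2
#define CB_SOURCE_FORMAT(x)  ((x) << 24)
#define CB_SF_EXPORT_NORM    1

/* Format code from evergreen_blit_kms.c */
#define COLOR_8_8_8_8        0x1a

/* Old style: raw shifts, as on the removed line */
static uint32_t cb_color_info_magic(uint32_t format)
{
	return (format << 2) | (1u << 24) | (2u << 8);
}

/* New style: the same word spelled with named fields */
static uint32_t cb_color_info_named(uint32_t format)
{
	return CB_FORMAT(format) |
	       CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
	       CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
}
```

Both functions produce 0x01000268 for COLOR_8_8_8_8, so the change is purely cosmetic for this register.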

[PATCH 4/9] drm/radeon/kms: demystify r600 blit code

2011-10-12 Thread Ilija Hadzic
some 3d register bits look like magic in r600 blit functions
use predefined constants to make it more intuitive what they are

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   30 +-
 drivers/gpu/drm/radeon/r600d.h |   22 ++
 2 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index d9994c9..71fec92 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -58,7 +58,9 @@ set_render_target(struct radeon_device *rdev, int format,
    if (h < 8)
        h = 8;
 
-   cb_color_info = ((format << 2) | (1 << 27) | (1 << 8));
+   cb_color_info = CB_FORMAT(format) |
+   CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
+   CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
pitch = (w / 8) - 1;
slice = ((w * h) / 64) - 1;
 
@@ -168,9 +170,10 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
 {
u32 sq_vtx_constant_word2;
 
-   sq_vtx_constant_word2 = ((upper_32_bits(gpu_addr) & 0xff) | (16 << 8));
+   sq_vtx_constant_word2 = SQ_VTXC_BASE_ADDR_HI(upper_32_bits(gpu_addr) & 0xff) |
+       SQ_VTXC_STRIDE(16);
 #ifdef __BIG_ENDIAN
-   sq_vtx_constant_word2 |= (2 << 30);
+   sq_vtx_constant_word2 |=  SQ_VTXC_ENDIAN_SWAP(SQ_ENDIAN_8IN32);
 #endif
 
radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 7));
@@ -206,18 +209,19 @@ set_tex_resource(struct radeon_device *rdev,
    if (h < 1)
        h = 1;
 
-   sq_tex_resource_word0 = (1 << 0) | (1 << 3);
-   sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 8) |
-                             ((w - 1) << 19));
+   sq_tex_resource_word0 = S_038000_DIM(V_038000_SQ_TEX_DIM_2D) |
+       S_038000_TILE_MODE(V_038000_ARRAY_1D_TILED_THIN1);
+   sq_tex_resource_word0 |= S_038000_PITCH((pitch >> 3) - 1) |
+       S_038000_TEX_WIDTH(w - 1);
 
-   sq_tex_resource_word1 = (format << 26);
-   sq_tex_resource_word1 |= ((h - 1) << 0);
+   sq_tex_resource_word1 = S_038004_DATA_FORMAT(format);
+   sq_tex_resource_word1 |= S_038004_TEX_HEIGHT(h - 1);
 
-   sq_tex_resource_word4 = ((1 << 14) |
-                            (0 << 16) |
-                            (1 << 19) |
-                            (2 << 22) |
-                            (3 << 25));
+   sq_tex_resource_word4 = S_038010_REQUEST_SIZE(1) |
+   S_038010_DST_SEL_X(SQ_SEL_X) |
+   S_038010_DST_SEL_Y(SQ_SEL_Y) |
+   S_038010_DST_SEL_Z(SQ_SEL_Z) |
+   S_038010_DST_SEL_W(SQ_SEL_W);
 
radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 7));
radeon_ring_write(rdev, 0);
diff --git a/drivers/gpu/drm/radeon/r600d.h b/drivers/gpu/drm/radeon/r600d.h
index 0245ae6..bfe1b5d 100644
--- a/drivers/gpu/drm/radeon/r600d.h
+++ b/drivers/gpu/drm/radeon/r600d.h
@@ -79,6 +79,11 @@
 #define CB_COLOR0_SIZE  0x28060
 #define CB_COLOR0_VIEW  0x28080
 #define CB_COLOR0_INFO  0x280a0
+#  define CB_FORMAT(x) ((x) << 2)
+#   define CB_ARRAY_MODE(x) ((x) << 8)
+#  define CB_SOURCE_FORMAT(x)  ((x) << 27)
+#  define CB_SF_EXPORT_FULL    0
+#  define CB_SF_EXPORT_NORM    1
 #define CB_COLOR0_TILE  0x280c0
 #define CB_COLOR0_FRAG  0x280e0
 #define CB_COLOR0_MASK  0x28100
@@ -417,6 +422,17 @@
 #define SQ_PGM_START_VS     0x28858
 #define SQ_PGM_RESOURCES_VS 0x28868
 #define SQ_PGM_CF_OFFSET_VS 0x288d0
+
+#define SQ_VTX_CONSTANT_WORD0_0 0x30000
+#define SQ_VTX_CONSTANT_WORD1_0 0x30004
+#define SQ_VTX_CONSTANT_WORD2_0 0x30008
+#  define SQ_VTXC_BASE_ADDR_HI(x)  ((x) << 0)
+#  define SQ_VTXC_STRIDE(x)        ((x) << 8)
+#  define SQ_VTXC_ENDIAN_SWAP(x)   ((x) << 30)
+#  define SQ_ENDIAN_NONE   0
+#  define SQ_ENDIAN_8IN16  1
+#  define SQ_ENDIAN_8IN32  2
+#define SQ_VTX_CONSTANT_WORD3_0 0x3000c
 #define SQ_VTX_CONSTANT_WORD6_0 0x38018
 #define S__SQ_VTX_CONSTANT_TYPE(x)  (((x) & 3) << 30)
 #define G__SQ_VTX_CONSTANT_TYPE(x)  (((x) >> 30) & 3)
@@ -1352,6 +1368,12 @@
 #define   S_038010_DST_SEL_W(x)    (((x) & 0x7) << 25)
 #define   G_038010_DST_SEL_W(x) 
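
The S_/G_ macro pairs in r600d.h follow a set/get pattern: S_ masks a value and shifts it into its field, G_ shifts the register word back and masks. The sketch below round-trips a field through such a pair. Assumptions: the G_038010_DST_SEL_W body is inferred from the S_ macro, since the archived line is cut off; uint32_t casts are added so the shifts are well defined in portable userspace C; set_dst_sel_w() is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdint.h>

/* Setter/getter pairs in the style of r600d.h, with casts added */
#define S__SQ_VTX_CONSTANT_TYPE(x)  (((uint32_t)(x) & 3) << 30)
#define G__SQ_VTX_CONSTANT_TYPE(x)  (((uint32_t)(x) >> 30) & 3)
#define S_038010_DST_SEL_W(x)       (((uint32_t)(x) & 0x7) << 25)
/* assumed getter body: the archived definition is truncated */
#define G_038010_DST_SEL_W(x)       (((uint32_t)(x) >> 25) & 0x7)

/* Hypothetical helper: replace only the DST_SEL_W field of a word,
 * leaving all other bits untouched. */
static uint32_t set_dst_sel_w(uint32_t word, uint32_t sel)
{
	word &= ~S_038010_DST_SEL_W(0x7);	/* clear the 3-bit field */
	return word | S_038010_DST_SEL_W(sel);
}
```

Because the setter masks its argument, an out-of-range value such as 9 is silently reduced to 1 rather than spilling into neighbouring fields.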

[PATCH 5/9] drm/radeon/kms: cleanup benchmark code

2011-10-12 Thread Ilija Hadzic
factor out repeated code into functions
fix units in which the throughput is reported (megabytes per second
and megabits per second make sense, others are kind of confusing)
make report more amenable to awk and friends (e.g. whitespace is
always the separator, unit is separated from the number, etc)
add #defines for some hard coded constants

besides beautification this reorg is done in preparation
for writing more elaborate benchmarks

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/radeon_benchmark.c |  156 -
 1 files changed, 86 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 10191d9..6951426 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -26,21 +26,80 @@
 #include "radeon_reg.h"
 #include "radeon.h"
 
-void radeon_benchmark_move(struct radeon_device *rdev, unsigned bsize,
-  unsigned sdomain, unsigned ddomain)
+#define RADEON_BENCHMARK_COPY_BLIT 1
+#define RADEON_BENCHMARK_COPY_DMA  0
+
+#define RADEON_BENCHMARK_ITERATIONS 1024
+
+static int radeon_benchmark_do_move(struct radeon_device *rdev, unsigned size,
+                                    uint64_t saddr, uint64_t daddr,
+                                    int flag, int n)
+{
+   unsigned long start_jiffies;
+   unsigned long end_jiffies;
+   struct radeon_fence *fence = NULL;
+   int i, r;
+
+   start_jiffies = jiffies;
+   for (i = 0; i < n; i++) {
+       r = radeon_fence_create(rdev, &fence);
+       if (r)
+           return r;
+
+       switch (flag) {
+       case RADEON_BENCHMARK_COPY_DMA:
+           r = radeon_copy_dma(rdev, saddr, daddr,
+                               size / RADEON_GPU_PAGE_SIZE,
+                               fence);
+           break;
+       case RADEON_BENCHMARK_COPY_BLIT:
+           r = radeon_copy_blit(rdev, saddr, daddr,
+                                size / RADEON_GPU_PAGE_SIZE,
+                                fence);
+           break;
+       default:
+           DRM_ERROR("Unknown copy method\n");
+           r = -EINVAL;
+       }
+       if (r)
+           goto exit_do_move;
+       r = radeon_fence_wait(fence, false);
+       if (r)
+           goto exit_do_move;
+       radeon_fence_unref(&fence);
+   }
+   end_jiffies = jiffies;
+   r = jiffies_to_msecs(end_jiffies - start_jiffies);
+
+exit_do_move:
+   if (fence)
+       radeon_fence_unref(&fence);
+   return r;
+}
+
+
+static void radeon_benchmark_log_results(int n, unsigned size,
+                                         unsigned int time,
+                                         unsigned sdomain, unsigned ddomain,
+                                         char *kind)
+{
+   unsigned int throughput = (n * (size >> 10)) / time;
+   DRM_INFO("radeon: %s %u bo moves of %u kB from"
+            " %d to %d in %u ms, throughput: %u Mb/s or %u MB/s\n",
+            kind, n, size >> 10, sdomain, ddomain, time,
+            throughput * 8, throughput);
+}
+
+static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
+ unsigned sdomain, unsigned ddomain)
 {
struct radeon_bo *dobj = NULL;
struct radeon_bo *sobj = NULL;
-   struct radeon_fence *fence = NULL;
uint64_t saddr, daddr;
-   unsigned long start_jiffies;
-   unsigned long end_jiffies;
-   unsigned long time;
-   unsigned i, n, size;
-   int r;
+   int r, n;
+   unsigned int time;
 
-   size = bsize;
-   n = 1024;
+   n = RADEON_BENCHMARK_ITERATIONS;
    r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, &sobj);
    if (r) {
        goto out_cleanup;
@@ -68,64 +127,23 @@ void radeon_benchmark_move(struct radeon_device *rdev, unsigned bsize,
 
    /* r100 doesn't have dma engine so skip the test */
    if (rdev->asic->copy_dma) {
-
-       start_jiffies = jiffies;
-       for (i = 0; i < n; i++) {
-           r = radeon_fence_create(rdev, &fence);
-           if (r) {
-               goto out_cleanup;
-           }
-
-           r = radeon_copy_dma(rdev, saddr, daddr,
-                               size / RADEON_GPU_PAGE_SIZE, fence);
-
-           if (r) {
-               goto out_cleanup;
-           }
-           r = radeon_fence_wait(fence, false);
-           if (r) {
-               goto out_cleanup;
-           }
-   
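
The unit handling in radeon_benchmark_log_results can be worked through in a userspace sketch. The function names below are hypothetical; the arithmetic follows the patch: n moves of (size >> 10) kB in time milliseconds gives kB per ms, which the patch reports as MB/s, and eight times that as Mb/s. The zero-time guard is an addition for this sketch and is not in the patch.

```c
#include <assert.h>

/* kB moved per millisecond; the patch reports this figure as MB/s.
 * Strictly it is 1024 bytes per 1/1000 s, so the label is approximate. */
static unsigned bench_throughput_mbytes(unsigned n, unsigned size_bytes,
					unsigned time_ms)
{
	if (time_ms == 0)
		return 0;	/* guard added for the sketch only */
	return (n * (size_bytes >> 10)) / time_ms;
}

/* Same figure expressed in megabits per second. */
static unsigned bench_throughput_mbits(unsigned n, unsigned size_bytes,
				       unsigned time_ms)
{
	return 8 * bench_throughput_mbytes(n, size_bytes, time_ms);
}
```

For instance, 1024 moves of a 1 MiB buffer in 500 ms come out as 2097 MB/s, or 16776 Mb/s.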

[PATCH 6/9] drm/radeon/kms: add more elaborate benchmarks

2011-10-12 Thread Ilija Hadzic
Lots of new (and hopefully useful) benchmarks. Load the driver
with radeon_benchmark=test_number and enjoy. Among the tests
added are VRAM to VRAM blits and blits with buffer size sweeps.
The latter can be from GTT to VRAM, VRAM to GTT, and VRAM to VRAM,
and there are two types of sweeps: powers of two and (probably
more interesting) buffer sizes that correspond to common modes.

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/radeon.h   |2 +-
 drivers/gpu/drm/radeon/radeon_benchmark.c |   91 +++--
 drivers/gpu/drm/radeon/radeon_device.c|2 +-
 3 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index ff5424e..5361dd7 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -868,7 +868,7 @@ struct radeon_pm {
 /*
  * Benchmarking
  */
-void radeon_benchmark(struct radeon_device *rdev);
+void radeon_benchmark(struct radeon_device *rdev, int test_number);
 
 
 /*
diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 6951426..5cafc90 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -30,6 +30,7 @@
 #define RADEON_BENCHMARK_COPY_DMA  0
 
 #define RADEON_BENCHMARK_ITERATIONS 1024
+#define RADEON_BENCHMARK_COMMON_MODES_N 17
 
 static int radeon_benchmark_do_move(struct radeon_device *rdev, unsigned size,
uint64_t saddr, uint64_t daddr,
@@ -126,7 +127,9 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
    }
 
    /* r100 doesn't have dma engine so skip the test */
-   if (rdev->asic->copy_dma) {
+   /* also, VRAM-to-VRAM test doesn't make much sense for DMA */
+   /* skip it as well if domains are the same */
+   if ((rdev->asic->copy_dma) && (sdomain != ddomain)) {
time = radeon_benchmark_do_move(rdev, size, saddr, daddr,
RADEON_BENCHMARK_COPY_DMA, n);
if (time  0)
@@ -167,10 +170,86 @@ out_cleanup:
}
 }
 
-void radeon_benchmark(struct radeon_device *rdev)
+void radeon_benchmark(struct radeon_device *rdev, int test_number)
 {
-   radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_GTT,
- RADEON_GEM_DOMAIN_VRAM);
-   radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
- RADEON_GEM_DOMAIN_GTT);
+   int i;
+   int common_modes[RADEON_BENCHMARK_COMMON_MODES_N] = {
+   640 * 480 * 4,
+   720 * 480 * 4,
+   800 * 600 * 4,
+   848 * 480 * 4,
+   1024 * 768 * 4,
+   1152 * 768 * 4,
+   1280 * 720 * 4,
+   1280 * 800 * 4,
+   1280 * 854 * 4,
+   1280 * 960 * 4,
+   1280 * 1024 * 4,
+   1440 * 900 * 4,
+   1400 * 1050 * 4,
+   1680 * 1050 * 4,
+   1600 * 1200 * 4,
+   1920 * 1080 * 4,
+   1920 * 1200 * 4
+   };
+
+   switch (test_number) {
+   case 1:
+   /* simple test, VRAM to GTT and GTT to VRAM */
+   radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_GTT,
+ RADEON_GEM_DOMAIN_VRAM);
+   radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
+ RADEON_GEM_DOMAIN_GTT);
+   break;
+   case 2:
+   /* simple test, VRAM to VRAM */
+   radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
+ RADEON_GEM_DOMAIN_VRAM);
+   break;
+   case 3:
+   /* GTT to VRAM, buffer size sweep, powers of 2 */
+   for (i = 1; i <= 65536; i <<= 1)
+   radeon_benchmark_move(rdev, i*1024,
+ RADEON_GEM_DOMAIN_GTT,
+ RADEON_GEM_DOMAIN_VRAM);
+   break;
+   case 4:
+   /* VRAM to GTT, buffer size sweep, powers of 2 */
+   for (i = 1; i <= 65536; i <<= 1)
+   radeon_benchmark_move(rdev, i*1024,
+ RADEON_GEM_DOMAIN_VRAM,
+ RADEON_GEM_DOMAIN_GTT);
+   break;
+   case 5:
+   /* VRAM to VRAM, buffer size sweep, powers of 2 */
+   for (i = 1; i <= 65536; i <<= 1)
+   radeon_benchmark_move(rdev, i*1024,
+ RADEON_GEM_DOMAIN_VRAM,
+ RADEON_GEM_DOMAIN_VRAM);
+   break;
+   case 6:
+   /* GTT to VRAM, buffer size sweep, common modes */
+ 
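
The two kinds of buffer size sweeps can be tabulated in a userspace sketch to see what page counts they hand to the copy routines. The helper names are hypothetical and GPU_PAGE_SIZE is an assumed stand-in for RADEON_GPU_PAGE_SIZE (4096). One thing the arithmetic shows: the first power-of-two sizes (1 kB and 2 kB) are smaller than a page, so size / GPU_PAGE_SIZE truncates to zero pages for them.

```c
#include <assert.h>
#include <stddef.h>

#define GPU_PAGE_SIZE 4096	/* assumed value of RADEON_GPU_PAGE_SIZE */

/* Fill out[] with the power-of-two sweep sizes in bytes (1 kB .. 64 MB),
 * as in tests 3 to 5; returns how many entries were written. */
static size_t sweep_pow2_sizes(unsigned *out, size_t cap)
{
	size_t n = 0;
	unsigned i;

	for (i = 1; i <= 65536; i <<= 1) {
		if (n == cap)
			break;
		out[n++] = i * 1024;
	}
	return n;
}

/* Size in bytes of a 32-bit-per-pixel buffer for a given mode. */
static unsigned mode_size_bytes(unsigned w, unsigned h)
{
	return w * h * 4;
}

/* Page count as computed by size / RADEON_GPU_PAGE_SIZE in the benchmark. */
static unsigned size_to_pages(unsigned size_bytes)
{
	return size_bytes / GPU_PAGE_SIZE;
}
```

The power-of-two sweep has 17 steps, and a 1920x1080 buffer corresponds to exactly 2025 pages.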

[PATCH 7/9] drm/radeon/kms: cleanup r600 blit code

2011-10-12 Thread Ilija Hadzic
reorganize the code such that only the primitives (i.e., the functions
that load the CP ring) are hardware specific; dynamically link the
primitives in a (new) pointer structure inside r600_blit at
blit initialization time so that the functions that control the blit
operations can be made common for r600 and evergreen parts

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   94 +---
 drivers/gpu/drm/radeon/radeon.h|   21 +++
 2 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index 71fec92..07e3df4 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -44,7 +44,6 @@
 
 #define RECT_UNIT_H   32
 #define RECT_UNIT_W   (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
-#define MAX_RECT_DIM  8192
 
 /* emits 21 on rv770+, 23 on r600 */
 static void
@@ -491,6 +490,27 @@ int r600_blit_init(struct radeon_device *rdev)
u32 packet2s[16];
int num_packet2s = 0;
 
+   rdev->r600_blit.primitives.set_render_target = set_render_target;
+   rdev->r600_blit.primitives.cp_set_surface_sync = cp_set_surface_sync;
+   rdev->r600_blit.primitives.set_shaders = set_shaders;
+   rdev->r600_blit.primitives.set_vtx_resource = set_vtx_resource;
+   rdev->r600_blit.primitives.set_tex_resource = set_tex_resource;
+   rdev->r600_blit.primitives.set_scissors = set_scissors;
+   rdev->r600_blit.primitives.draw_auto = draw_auto;
+   rdev->r600_blit.primitives.set_default_state = set_default_state;
+
+   rdev->r600_blit.ring_size_common = 40; /* shaders + def state */
+   rdev->r600_blit.ring_size_common += 10; /* fence emit for VB IB */
+   rdev->r600_blit.ring_size_common += 5; /* done copy */
+   rdev->r600_blit.ring_size_common += 10; /* fence emit for done copy */
+
+   rdev->r600_blit.ring_size_per_loop = 76;
+   /* set_render_target emits 2 extra dwords on rv6xx */
+   if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
+       rdev->r600_blit.ring_size_per_loop += 2;
+
+   rdev->r600_blit.max_dim = 8192;
+
    /* pin copy shader into vram if already initialized */
    if (rdev->r600_blit.shader_obj)
        goto done;
@@ -608,9 +628,8 @@ static void r600_vb_ib_put(struct radeon_device *rdev)
    radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-/* FIXME: the function is very similar to evergreen_blit_create_rect, except
-   that it different predefined constants; consider commonizing */
-static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *height)
+static unsigned r600_blit_create_rect(unsigned num_pages,
+ int *width, int *height, int max_dim)
 {
unsigned max_pages;
unsigned pages = num_pages;
@@ -628,12 +647,12 @@ static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *heigh
        while (num_pages / rect_order) {
            h *= 2;
            rect_order *= 4;
-           if (h >= MAX_RECT_DIM) {
-               h = MAX_RECT_DIM;
+           if (h >= max_dim) {
+               h = max_dim;
                break;
            }
        }
-       max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
+       max_pages = (max_dim * h) / (RECT_UNIT_W * RECT_UNIT_H);
        if (pages > max_pages)
pages = max_pages;
w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
@@ -659,36 +678,29 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_pages)
 {
    int r;
    int ring_size;
-   /* loops of emits 64 + fence emit possible */
-   int dwords_per_loop = 76, num_loops = 0;
+   int num_loops = 0;
+   int dwords_per_loop = rdev->r600_blit.ring_size_per_loop;
 
    r = r600_vb_ib_get(rdev);
    if (r)
        return r;
 
-   /* set_render_target emits 2 extra dwords on rv6xx */
-   if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
-       dwords_per_loop += 2;
-
    /* num loops */
    while (num_pages) {
-       num_pages -= r600_blit_create_rect(num_pages, NULL, NULL);
+       num_pages -= r600_blit_create_rect(num_pages, NULL, NULL,
+                                          rdev->r600_blit.max_dim);
        num_loops++;
    }
 
/* calculate number of loops correctly */
ring_size = num_loops * dwords_per_loop;
-   /* set default  + shaders */
-   ring_size += 40; /* shaders + def state */
-   ring_size += 10; /* fence emit for VB IB */
-   ring_size += 5; /* done copy */
-   ring_size += 10; /* fence emit for done copy */
+   ring_size += 
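
The idea of linking hardware-specific primitives into a pointer structure at init time, so that the control code stays common, can be sketched in userspace C. Everything here is a trimmed-down, hypothetical stand-in: struct blit_state and struct blit_primitives are not the driver's structures, only one primitive is modelled, and the final "+ ring_size_common" term is an assumption, since the archived hunk is cut off at that line. The numeric values (40 + 10 + 5 + 10, 76, the 2 extra dwords, 8192) are taken from the r600 init hunk above.

```c
#include <assert.h>

struct blit_state;

/* Hypothetical, reduced table of hardware-specific primitives */
struct blit_primitives {
	void (*draw_auto)(struct blit_state *st);
};

/* Hypothetical stand-in for the blit bookkeeping */
struct blit_state {
	struct blit_primitives primitives;
	int ring_size_common;
	int ring_size_per_loop;
	int max_dim;
	int draws;	/* sketch-only counter of emitted draws */
};

static void r600_like_draw_auto(struct blit_state *st)
{
	st->draws++;	/* a real primitive would write to the CP ring */
}

/* Per-ASIC init: the only place that knows hardware details */
static void r600_like_blit_init(struct blit_state *st, int is_rv6xx)
{
	st->primitives.draw_auto = r600_like_draw_auto;
	st->ring_size_common = 40 + 10 + 5 + 10;
	st->ring_size_per_loop = 76 + (is_rv6xx ? 2 : 0);
	st->max_dim = 8192;
	st->draws = 0;
}

/* Common code: sizes the ring without any ASIC checks */
static int blit_ring_size(const struct blit_state *st, int num_loops)
{
	return num_loops * st->ring_size_per_loop + st->ring_size_common;
}

/* Common code: drives the copy through the function pointers */
static void blit_run(struct blit_state *st, int num_loops)
{
	for (int i = 0; i < num_loops; i++)
		st->primitives.draw_auto(st);
}
```

An evergreen-style back-end would only need its own init function that fills in different pointers and constants; blit_ring_size() and blit_run() would stay untouched.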

[PATCH 8/9] drm/radeon/kms: blit code commoning

2011-10-12 Thread Ilija Hadzic
factor out most of evergreen blit code and use the refactored code
from r600 that is now common for both r600 and evergreen

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/evergreen.c  |   25 +---
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |  260 ++-
 drivers/gpu/drm/radeon/ni.c |4 +-
 drivers/gpu/drm/radeon/radeon_asic.c|   16 +-
 drivers/gpu/drm/radeon/radeon_asic.h|   10 -
 5 files changed, 30 insertions(+), 285 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 5f0ecc7..69dded2 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3087,7 +3087,7 @@ static int evergreen_startup(struct radeon_device *rdev)
 
    r = evergreen_blit_init(rdev);
    if (r) {
-       evergreen_blit_fini(rdev);
+       r600_blit_fini(rdev);
        rdev->asic->copy = NULL;
        dev_warn(rdev->dev, "failed blitter (%d) falling back to memcpy\n", r);
    }
@@ -3172,27 +3172,6 @@ int evergreen_suspend(struct radeon_device *rdev)
return 0;
 }
 
-int evergreen_copy_blit(struct radeon_device *rdev,
-                        uint64_t src_offset, uint64_t dst_offset,
-                        unsigned num_pages, struct radeon_fence *fence)
-{
-   int r;
-
-   mutex_lock(&rdev->r600_blit.mutex);
-   rdev->r600_blit.vb_ib = NULL;
-   r = evergreen_blit_prepare_copy(rdev, num_pages);
-   if (r) {
-       if (rdev->r600_blit.vb_ib)
-           radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
-       mutex_unlock(&rdev->r600_blit.mutex);
-       return r;
-   }
-   evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages);
-   evergreen_blit_done_copy(rdev, fence);
-   mutex_unlock(&rdev->r600_blit.mutex);
-   return 0;
-}
-
 /* Plan is to move initialization in that function and use
  * helper function so that radeon_device_init pretty much
  * do nothing more than calling asic specific function. This
@@ -3301,7 +3280,7 @@ int evergreen_init(struct radeon_device *rdev)
 
 void evergreen_fini(struct radeon_device *rdev)
 {
-   evergreen_blit_fini(rdev);
+   r600_blit_fini(rdev);
r700_cp_fini(rdev);
r600_irq_fini(rdev);
radeon_wb_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 68d0de2..dcf11bb 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -44,10 +44,6 @@
 #define COLOR_5_6_5   0x8
 #define COLOR_8_8_8_8 0x1a
 
-#define RECT_UNIT_H   32
-#define RECT_UNIT_W   (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
-#define MAX_RECT_DIM  16384
-
 /* emits 17 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -599,31 +595,6 @@ set_default_state(struct radeon_device *rdev)
 
 }
 
-static inline uint32_t i2f(uint32_t input)
-{
-   u32 result, i, exponent, fraction;
-
-   if ((input & 0x3fff) == 0)
-       result = 0; /* 0 is a special case */
-   else {
-       exponent = 140; /* exponent biased by 127; */
-       fraction = (input & 0x3fff) << 10; /* cheat and only
-                                             handle numbers below 2^^15 */
-       for (i = 0; i < 14; i++) {
-           if (fraction & 0x800000)
-               break;
-           else {
-               fraction = fraction << 1; /* keep
-                                            shifting left until top bit = 1 */
-               exponent = exponent - 1;
-           }
-       }
-       result = exponent << 23 | (fraction & 0x7fffff); /* mask
-                                                           off top bit; assumed 1 */
-   }
-   return result;
-}
-
 int evergreen_blit_init(struct radeon_device *rdev)
 {
u32 obj_size;
@@ -632,6 +603,24 @@ int evergreen_blit_init(struct radeon_device *rdev)
u32 packet2s[16];
int num_packet2s = 0;
 
+   rdev->r600_blit.primitives.set_render_target = set_render_target;
+   rdev->r600_blit.primitives.cp_set_surface_sync = cp_set_surface_sync;
+   rdev->r600_blit.primitives.set_shaders = set_shaders;
+   rdev->r600_blit.primitives.set_vtx_resource = set_vtx_resource;
+   rdev->r600_blit.primitives.set_tex_resource = set_tex_resource;
+   rdev->r600_blit.primitives.set_scissors = set_scissors;
+   rdev->r600_blit.primitives.draw_auto = draw_auto;
+   rdev->r600_blit.primitives.set_default_state = set_default_state;
+
+   rdev->r600_blit.ring_size_common = 55; /* shaders + def state */
+   rdev->r600_blit.ring_size_common += 10; /* fence emit for VB IB */
+   
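
The i2f() helper removed from the evergreen file here builds an IEEE 754 single-precision bit pattern from a small integer without using floating point, which kernel code generally avoids. A userspace sketch can compare it against the compiler's own int-to-float conversion. Assumptions: the masks 0x800000 and 0x7fffff are reconstructed from the logic (a 14-bit input shifted left by 10 fills 24 bits, so bit 23 is the implicit leading one), float is a 32-bit IEEE 754 type on the host, and float_bits() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Integer to float bit pattern for inputs in 0..16383 (14 bits);
 * larger inputs are masked, exactly as in the driver helper. */
static uint32_t i2f(uint32_t input)
{
	uint32_t i, exponent, fraction;

	if ((input & 0x3fff) == 0)
		return 0;			/* 0 is a special case */

	exponent = 140;				/* 127 bias + 13 */
	fraction = (input & 0x3fff) << 10;
	for (i = 0; i < 14; i++) {
		if (fraction & 0x800000)
			break;
		fraction <<= 1;			/* normalize: shift until bit 23 is set */
		exponent--;
	}
	/* drop the implicit leading one, keep the 23 mantissa bits */
	return exponent << 23 | (fraction & 0x7fffff);
}

/* Hypothetical helper: raw bits of a host float (assumes IEEE 754). */
static uint32_t float_bits(float f)
{
	uint32_t u;

	memcpy(&u, &f, sizeof(u));
	return u;
}
```

On a host with IEEE 754 floats the two agree for every 14-bit input, for example i2f(1) is 0x3f800000 (1.0f) and i2f(8192) is 0x46000000 (8192.0f).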

Re: [RFC 2/2] dma-buf: Documentation for buffer sharing framework

2011-10-12 Thread Semwal, Sumit
Hi Randy,
On Thu, Oct 13, 2011 at 4:00 AM, Randy Dunlap rdun...@xenotime.net wrote:
> On 10/11/2011 02:23 AM, Sumit Semwal wrote:
>> Add documentation for dma buffer sharing framework, explaining the
>> various operations, members and API of the dma buffer sharing
>> framework.
>>
>> Signed-off-by: Sumit Semwal sumit.sem...@linaro.org
>> Signed-off-by: Sumit Semwal sumit.sem...@ti.com
>> ---
>>  Documentation/dma-buf-sharing.txt |  210 +
snip
>> +    if the new buffer-user has stricter 'backing-storage constraints', and the
>> +    exporter can handle these constraints, the exporter can just stall on the
>> +    get_scatterlist till all outstanding access is completed (as signalled by
>
>                       until
>
Thanks for your review; I will update all these in the next version.
>> +    put_scatterlist).
>> +    Once all ongoing access is completed, the exporter could potentially move
>> +    the buffer to the stricter backing-storage, and then allow further
>> +    {get,put}_scatterlist operations from any buffer-user from the migrated
>> +    backing-storage.
>> +
>> +   If the exporter cannot fulfill the backing-storage constraints of the new
>> +   buffer-user device as requested, dma_buf_attach() would return an error to
>> +   denote non-compatibility of the new buffer-sharing request with the current
>> +   buffer.
>> +
>> +   If the exporter chooses not to allow an attach() operation once a
>> +   get_scatterlist has been called, it simply returns an error.
>> +
>> +- mmap file operation
>> +   An mmap() file operation is provided for the fd associated with the buffer.
>> +   If the exporter defines an mmap operation, the mmap() fop calls this to allow
>> +   mmap for devices that might need it; if not, it returns an error.
>> +
>> +References:
>> +[1] struct dma_buf_ops in include/linux/dma-buf.h
>> +[2] All interfaces mentioned above defined in include/linux/dma-buf.h
>
>
> --
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***

Best regards,
~Sumit.