Re: [PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread Michel Dänzer
On Wed, 2011-11-16 at 20:42 +0200, ville.syrj...@linux.intel.com wrote:
> 
> Name the formats as DRM_FORMAT_X instead of DRM_FOURCC_X. Use consistent
> names, especially for the RGB formats. Component order and byte order are
> now strictly specified for each format.
> 
> The RGB format naming follows a convention where the component names
> and sizes are listed from left to right, matching the order within a
> single pixel from most significant bit to least significant bit. Lower
> case letters are used when listing the components to improve
> readability. I believe this convention matches the one used by pixman.

The RGB formats are all defined in the CPU native byte order. But e.g.
pre-R600 Radeons can only scan out little endian formats. For the
framebuffer device, we use GPU byte swapping facilities to make the
pixels appear to the CPU in its native byte order, so these format
definitions make sense for that. But I'm not sure they make sense for
the KMS APIs, e.g. the userspace drivers don't use these facilities but
handle byte swapping themselves.
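
To make the byte order question concrete, here is a minimal sketch. The
helper names are made up for illustration and are not part of the patch.
It packs an xrgb8888 pixel following the quoted convention (components
listed from most to least significant bit) and stores it either in CPU
native order or in an explicit little endian order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: x8 r8 g8 b8, listed from MSB to LSB of a 32-bit word. */
static uint32_t example_pack_xrgb8888(uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Store in CPU native byte order: the memory layout depends on the host. */
static void example_store_native(uint8_t out[4], uint32_t pixel)
{
	memcpy(out, &pixel, sizeof(pixel));
}

/* Store as little endian on any host: byte 0 always holds blue. */
static void example_store_le(uint8_t out[4], uint32_t pixel)
{
	out[0] = (uint8_t)(pixel & 0xff);
	out[1] = (uint8_t)((pixel >> 8) & 0xff);
	out[2] = (uint8_t)((pixel >> 16) & 0xff);
	out[3] = (uint8_t)((pixel >> 24) & 0xff);
}
```

On a little endian CPU both stores produce the same bytes. On a big
endian CPU the native store puts the unused x byte first, which is the
layout a little-endian-only scanout engine cannot display without some
form of swapping.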


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


drm pixel formats update

2011-11-16 Thread Ville Syrjälä
On Wed, Nov 16, 2011 at 07:54:12PM +, Alan Cox wrote:
> > If anyone has problems with the way the formats are defined, please
> > speak up now! Since only Jesse has bothered to comment on my rantings
> > I can only assume people are happy with my approach to things.
> 
> Umm .. no. I don't see why they are needed. It's just an extra layer of
> gratuitous, confusing indirection. The rest of the world speaks and
> understands FourCC, so for all the formats covered by an existing FourCC
> name we should just use the existing name.
> 
> You might need to check one now and then but everyone doing video
> processing is familiar with them including all the Windows folk.

I think the only format in my list where I didn't use an existing fourcc
is I420/IYUV. And BTW, for that one I used the same "fake" fourcc that
v4l2 uses (YU12). 

And that brings another matter to the table. How should we deal with
duplicate fourccs? I420/IYUV and YUY2/YUYV come to mind.

Also, if I now add these ad-hoc fourccs for the RGB formats, and some
time later someone comes in with a format with a conflicting official
fourcc, what should we do?

Oh, and one extra detail just occurred to me regarding the three-plane
formats. Should we even define formats for both the YUV and YVU
variants? Seeing as we now have independent handles and offsets for
each plane, we can make do with just one format definition.
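
A rough sketch of the two points above, with made-up names (none of
this is from the patch). It assumes the usual fourcc packing, where the
first character lands in the least significant byte, and shows that a
YVU layout can be described with the YUV format code simply by swapping
the two chroma plane offsets.

```c
#include <stdint.h>

/* Assumed fourcc packing: first character in the least significant byte. */
#define EXAMPLE_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical per-plane description: luma first, then the two chroma planes. */
struct example_planes {
	uint32_t offsets[3];
};

/* Turn a Y/U/V description into Y/V/U (or back) by swapping chroma offsets. */
static struct example_planes example_swap_chroma(struct example_planes p)
{
	uint32_t tmp = p.offsets[1];

	p.offsets[1] = p.offsets[2];
	p.offsets[2] = tmp;
	return p;
}
```

With per-plane offsets, a buffer laid out as Y, V, U can be passed under
the 'YU12' code once the second and third offsets are swapped, so a
separate YVU code adds no information.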

-- 
Ville Syrjälä
Intel OTC


[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #3 from interweiss at yahoo.ca 2011-11-16 15:03:59 PST ---
Created attachment 53613
  --> https://bugs.freedesktop.org/attachment.cgi?id=53613
xorg log

Here is my xorg log.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #2 from interweiss at yahoo.ca 2011-11-16 15:00:15 PST ---
Created attachment 53612
  --> https://bugs.freedesktop.org/attachment.cgi?id=53612
dmesg output

Here is my dmesg output.



[PATCH 4/4] gma500: Move the API

2011-11-16 Thread Alan Cox
From: Alan Cox 

Finally move the API where it can be seen

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/cdv_device.c  |2 -
 drivers/gpu/drm/gma500/gem.c |2 -
 drivers/gpu/drm/gma500/intel_bios.c  |2 -
 drivers/gpu/drm/gma500/mid_bios.c|2 -
 drivers/gpu/drm/gma500/oaktrail_device.c |2 -
 drivers/gpu/drm/gma500/psb_device.c  |2 -
 drivers/gpu/drm/gma500/psb_drm.h |   91 --
 drivers/gpu/drm/gma500/psb_drv.c |2 -
 drivers/gpu/drm/gma500/psb_drv.h |2 -
 include/drm/gma_drm.h|   91 ++
 10 files changed, 99 insertions(+), 99 deletions(-)
 delete mode 100644 drivers/gpu/drm/gma500/psb_drm.h
 create mode 100644 include/drm/gma_drm.h


diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c
index 87614e0..c0583df 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/gem.c b/drivers/gpu/drm/gma500/gem.c
index d743679..fdc8b5d 100644
--- a/drivers/gpu/drm/gma500/gem.c
+++ b/drivers/gpu/drm/gma500/gem.c
@@ -25,7 +25,7 @@

 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"

 int psb_gem_init_object(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/gma500/intel_bios.c b/drivers/gpu/drm/gma500/intel_bios.c
index 096757f..d4d0c5b 100644
--- a/drivers/gpu/drm/gma500/intel_bios.c
+++ b/drivers/gpu/drm/gma500/intel_bios.c
@@ -20,7 +20,7 @@
  */
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_intel_drv.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/mid_bios.c b/drivers/gpu/drm/gma500/mid_bios.c
index 7115d1a..018ab46 100644
--- a/drivers/gpu/drm/gma500/mid_bios.c
+++ b/drivers/gpu/drm/gma500/mid_bios.c
@@ -25,7 +25,7 @@

 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "mid_bios.h"

diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c
index 41c418f..57ad3ea6 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -22,7 +22,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c
index 4659132..9d6959a 100644
--- a/drivers/gpu/drm/gma500/psb_device.c
+++ b/drivers/gpu/drm/gma500/psb_device.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
deleted file mode 100644
index 1136867..000
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ /dev/null
@@ -1,91 +0,0 @@
-/**
- * Copyright (c) 2007-2011, Intel Corporation.
- * All Rights Reserved.
- * Copyright (c) 2008, Tungsten Graphics Inc.  Cedar Park, TX., USA.
- * All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License,
- * version 2, as published by the Free Software Foundation.
- *
- * This program is distributed in the hope it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
- *
- **/
-
-#ifndef _PSB_DRM_H_
-#define _PSB_DRM_H_
-
-/*
- * Manage the LUT for an output
- */
-struct drm_psb_dpst_lut_arg {
-   uint8_t lut[256];
-   int output_id;
-};
-
-/*
- * Validate modes
- */
-struct drm_psb_mode_operation_arg {
-   u32 obj_id;
-   u16 operation;
-   struct drm_mode_modeinfo mode;
-   u64 data;
-};
-
-/*
- * Query the stolen memory for smarter management of
- * memory by the server
- */
-struct drm_psb_stolen_memory_arg {
-   u32 base;
-   u32 size;
-};
-
-struct drm_psb_get_pipe_from_crtc_id_arg {
-   /** ID of CRTC being requested **/
-   u32 crtc_id;
-   /** pipe of requested CRTC **/
-   u32 pipe;
-};
-
-struct drm_psb_gem_create {
-   __u64 size;
-   __u32 handle;
-   __u32 flags;
-#define GMA_GEM_CREATE_STOLEN  1 

[PATCH 3/4] gma500: kill off NUM_PIPE define

2011-11-16 Thread Alan Cox
From: Alan Cox 

We don't want this external in case someone adds more to the hardware. We
want it out of the ABI.

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/psb_drm.h |3 ---
 drivers/gpu/drm/gma500/psb_drv.h |2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)


diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index 6ded343..1136867 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -22,9 +22,6 @@
 #ifndef _PSB_DRM_H_
 #define _PSB_DRM_H_

-#define PSB_NUM_PIPE 3
-
-
 /*
  * Manage the LUT for an output
  */
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 9567748..ffb05f2 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -258,6 +258,8 @@ struct psb_intel_opregion {

 struct psb_ops;

+#define PSB_NUM_PIPE   3
+
 struct drm_psb_private {
struct drm_device *dev;
const struct psb_ops *ops;



[PATCH 2/4] gma500: Rename the ioctls to avoid clashing with the legacy drivers

2011-11-16 Thread Alan Cox
From: Alan Cox 

Signed-off-by: Alan Cox 
---
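
Side note on why the rename is ABI-neutral: userspace only ever sees the
encoded ioctl number, not the macro name. The sketch below rebuilds that
number with local stand-in macros. It assumes the generic Linux ioctl
layout (8-bit nr, 8-bit type, 14-bit size, 2-bit direction), 'd' as the
DRM ioctl type and 0x40 as the driver command base. All EX_ names are
invented for the example.

```c
#include <stdint.h>

/* Assumed generic Linux ioctl encoding; some architectures differ. */
#define EX_IOC_WRITE 1u
#define EX_IOC_READ  2u
#define EX_IOWR(type, nr, size) \
	(((EX_IOC_READ | EX_IOC_WRITE) << 30) | ((uint32_t)(size) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

#define EX_DRM_IOCTL_BASE   'd'   /* assumed DRM ioctl type */
#define EX_DRM_COMMAND_BASE 0x40  /* assumed start of driver-private range */

/* Old and new names for the same command index. */
#define EX_DRM_PSB_GEM_CREATE 0x00
#define EX_DRM_GMA_GEM_CREATE 0x00

/* Same field layout as struct drm_psb_gem_create in the patch. */
struct ex_gem_create {
	uint64_t size;
	uint32_t handle;
	uint32_t flags;
};

/* Encode the GEM create ioctl number for a given command index. */
static uint32_t ex_gem_create_ioctl_nr(unsigned int index)
{
	return EX_IOWR(EX_DRM_IOCTL_BASE, EX_DRM_COMMAND_BASE + index,
		       sizeof(struct ex_gem_create));
}
```

Both names encode to the same value, so only source-level users of the
old macro names notice the change.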

 drivers/gpu/drm/gma500/gem.c |4 ++--
 drivers/gpu/drm/gma500/psb_drm.h |   20 ++--
 drivers/gpu/drm/gma500/psb_drv.c |   16 
 3 files changed, 20 insertions(+), 20 deletions(-)


diff --git a/drivers/gpu/drm/gma500/gem.c b/drivers/gpu/drm/gma500/gem.c
index 65fdd6b..d743679 100644
--- a/drivers/gpu/drm/gma500/gem.c
+++ b/drivers/gpu/drm/gma500/gem.c
@@ -274,13 +274,13 @@ int psb_gem_create_ioctl(struct drm_device *dev, void *data,
 {
struct drm_psb_gem_create *args = data;
int ret;
-   if (args->flags & PSB_GEM_CREATE_STOLEN) {
+   if (args->flags & GMA_GEM_CREATE_STOLEN) {
ret = psb_gem_create_stolen(file, dev, args->size,
&args->handle);
if (ret == 0)
return 0;
/* Fall throguh */
-   args->flags &= ~PSB_GEM_CREATE_STOLEN;
+   args->flags &= ~GMA_GEM_CREATE_STOLEN;
}
return psb_gem_create(file, dev, args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index 72eeb7a..6ded343 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -63,7 +63,7 @@ struct drm_psb_gem_create {
__u64 size;
__u32 handle;
__u32 flags;
-#define PSB_GEM_CREATE_STOLEN  1   /* Stolen memory can be used */
+#define GMA_GEM_CREATE_STOLEN  1   /* Stolen memory can be used */
 };

 struct drm_psb_gem_mmap {
@@ -79,15 +79,15 @@ struct drm_psb_gem_mmap {

 /* Controlling the kernel modesetting buffers */

-#define DRM_PSB_GEM_CREATE 0x00/* Create a GEM object */
-#define DRM_PSB_GEM_MMAP   0x01/* Map GEM memory */
-#define DRM_PSB_STOLEN_MEMORY  0x02/* Report stolen memory */
-#define DRM_PSB_2D_OP  0x03/* Will be merged later */
-#define DRM_PSB_GAMMA  0x04/* Set gamma table */
-#define DRM_PSB_ADB0x05/* Get backlight */
-#define DRM_PSB_DPST_BL0x06/* Set backlight */
-#define DRM_PSB_GET_PIPE_FROM_CRTC_ID 0x1  /* CRTC to physical pipe# */
-#define DRM_PSB_MODE_OPERATION 0x07/* Mode validation/DC set */
+#define DRM_GMA_GEM_CREATE 0x00/* Create a GEM object */
+#define DRM_GMA_GEM_MMAP   0x01/* Map GEM memory */
+#define DRM_GMA_STOLEN_MEMORY  0x02/* Report stolen memory */
+#define DRM_GMA_2D_OP  0x03/* Will be merged later */
+#define DRM_GMA_GAMMA  0x04/* Set gamma table */
+#define DRM_GMA_ADB0x05/* Get backlight */
+#define DRM_GMA_DPST_BL0x06/* Set backlight */
+#define DRM_GMA_GET_PIPE_FROM_CRTC_ID 0x1  /* CRTC to physical pipe# */
+#define DRM_GMA_MODE_OPERATION 0x07/* Mode validation/DC set */
 #definePSB_MODE_OPERATION_MODE_VALID   0x01


diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 2d5050e..9294e71 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -81,27 +81,27 @@ MODULE_DEVICE_TABLE(pci, pciidlist);
  */

 #define DRM_IOCTL_PSB_ADB  \
-   DRM_IOWR(DRM_PSB_ADB + DRM_COMMAND_BASE, uint32_t)
+   DRM_IOWR(DRM_GMA_ADB + DRM_COMMAND_BASE, uint32_t)
 #define DRM_IOCTL_PSB_MODE_OPERATION   \
-   DRM_IOWR(DRM_PSB_MODE_OPERATION + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_MODE_OPERATION + DRM_COMMAND_BASE, \
 struct drm_psb_mode_operation_arg)
 #define DRM_IOCTL_PSB_STOLEN_MEMORY\
-   DRM_IOWR(DRM_PSB_STOLEN_MEMORY + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_STOLEN_MEMORY + DRM_COMMAND_BASE, \
 struct drm_psb_stolen_memory_arg)
 #define DRM_IOCTL_PSB_GAMMA\
-   DRM_IOWR(DRM_PSB_GAMMA + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GAMMA + DRM_COMMAND_BASE, \
 struct drm_psb_dpst_lut_arg)
 #define DRM_IOCTL_PSB_DPST_BL  \
-   DRM_IOWR(DRM_PSB_DPST_BL + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_DPST_BL + DRM_COMMAND_BASE, \
 uint32_t)
 #define DRM_IOCTL_PSB_GET_PIPE_FROM_CRTC_ID\
-   DRM_IOWR(DRM_PSB_GET_PIPE_FROM_CRTC_ID + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GET_PIPE_FROM_CRTC_ID + DRM_COMMAND_BASE, \
 struct drm_psb_get_pipe_from_crtc_id_arg)
 #define DRM_IOCTL_PSB_GEM_CREATE   \
-   DRM_IOWR(DRM_PSB_GEM_CREATE + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GEM_CREATE + DRM_COMMAND_BASE, \
 struct drm_psb_gem_create)
 #define DRM_IOCTL_PSB_GEM_MMAP \
-   DRM_IOWR(DRM_PSB_GEM_MMAP + DRM_COMMAND_BASE, \
+

[PATCH 1/4] gma500: begin pruning dead bits of API

2011-11-16 Thread Alan Cox
From: Alan Cox 

At this point we won't add an external set of definitions. We want to get
everything out before we admit to a public API beyond the standardised
ones.

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/psb_drm.h |  159 ++--
 drivers/gpu/drm/gma500/psb_drv.c |  507 --
 drivers/gpu/drm/gma500/psb_drv.h |2 
 3 files changed, 24 insertions(+), 644 deletions(-)


diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index dca7b20..72eeb7a 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -24,168 +24,41 @@

 #define PSB_NUM_PIPE 3

-#define PSB_GPU_ACCESS_READ (1ULL << 32)
-#define PSB_GPU_ACCESS_WRITE(1ULL << 33)
-#define PSB_GPU_ACCESS_MASK (PSB_GPU_ACCESS_READ | PSB_GPU_ACCESS_WRITE)
-
-#define PSB_BO_FLAG_COMMAND (1ULL << 52)

 /*
- * Feedback components:
+ * Manage the LUT for an output
  */
-
-struct drm_psb_sizes_arg {
-   u32 ta_mem_size;
-   u32 mmu_size;
-   u32 pds_size;
-   u32 rastgeom_size;
-   u32 tt_size;
-   u32 vram_size;
-};
-
 struct drm_psb_dpst_lut_arg {
uint8_t lut[256];
int output_id;
 };

-#define PSB_DC_CRTC_SAVE 0x01
-#define PSB_DC_CRTC_RESTORE 0x02
-#define PSB_DC_OUTPUT_SAVE 0x04
-#define PSB_DC_OUTPUT_RESTORE 0x08
-#define PSB_DC_CRTC_MASK 0x03
-#define PSB_DC_OUTPUT_MASK 0x0C
-
-struct drm_psb_dc_state_arg {
-   u32 flags;
-   u32 obj_id;
-};
-
+/*
+ * Validate modes
+ */
 struct drm_psb_mode_operation_arg {
u32 obj_id;
u16 operation;
struct drm_mode_modeinfo mode;
-   void *data;
+   u64 data;
 };

+/*
+ * Query the stolen memory for smarter management of
+ * memory by the server
+ */
 struct drm_psb_stolen_memory_arg {
u32 base;
u32 size;
 };

-/*Display Register Bits*/
-#define REGRWBITS_PFIT_CONTROLS(1 << 0)
-#define REGRWBITS_PFIT_AUTOSCALE_RATIOS(1 << 1)
-#define REGRWBITS_PFIT_PROGRAMMED_SCALE_RATIOS (1 << 2)
-#define REGRWBITS_PIPEASRC (1 << 3)
-#define REGRWBITS_PIPEBSRC (1 << 4)
-#define REGRWBITS_VTOTAL_A (1 << 5)
-#define REGRWBITS_VTOTAL_B (1 << 6)
-#define REGRWBITS_DSPACNTR (1 << 8)
-#define REGRWBITS_DSPBCNTR (1 << 9)
-#define REGRWBITS_DSPCCNTR (1 << 10)
-
-/*Overlay Register Bits*/
-#define OV_REGRWBITS_OVADD (1 << 0)
-#define OV_REGRWBITS_OGAM_ALL  (1 << 1)
-
-#define OVC_REGRWBITS_OVADD  (1 << 2)
-#define OVC_REGRWBITS_OGAM_ALL (1 << 3)
-
-struct drm_psb_register_rw_arg {
-   u32 b_force_hw_on;
-
-   u32 display_read_mask;
-   u32 display_write_mask;
-
-   struct {
-   u32 pfit_controls;
-   u32 pfit_autoscale_ratios;
-   u32 pfit_programmed_scale_ratios;
-   u32 pipeasrc;
-   u32 pipebsrc;
-   u32 vtotal_a;
-   u32 vtotal_b;
-   } display;
-
-   u32 overlay_read_mask;
-   u32 overlay_write_mask;
-
-   struct {
-   u32 OVADD;
-   u32 OGAMC0;
-   u32 OGAMC1;
-   u32 OGAMC2;
-   u32 OGAMC3;
-   u32 OGAMC4;
-   u32 OGAMC5;
-   u32 IEP_ENABLED;
-   u32 IEP_BLE_MINMAX;
-   u32 IEP_BSSCC_CONTROL;
-   u32 b_wait_vblank;
-   } overlay;
-
-   u32 sprite_enable_mask;
-   u32 sprite_disable_mask;
-
-   struct {
-   u32 dspa_control;
-   u32 dspa_key_value;
-   u32 dspa_key_mask;
-   u32 dspc_control;
-   u32 dspc_stride;
-   u32 dspc_position;
-   u32 dspc_linear_offset;
-   u32 dspc_size;
-   u32 dspc_surface;
-   } sprite;
-
-   u32 subpicture_enable_mask;
-   u32 subpicture_disable_mask;
-};
-
-/* Controlling the kernel modesetting buffers */
-
-#define DRM_PSB_SIZES   0x07
-#define DRM_PSB_FUSE_REG   0x08
-#define DRM_PSB_DC_STATE   0x0A
-#define DRM_PSB_ADB0x0B
-#define DRM_PSB_MODE_OPERATION 0x0C
-#define DRM_PSB_STOLEN_MEMORY  0x0D
-#define DRM_PSB_REGISTER_RW0x0E
-
-/*
- * NOTE: Add new commands here, but increment
- * the values below and increment their
- * corresponding defines where they're
- * defined elsewhere.
- */
-
-#define DRM_PSB_GEM_CREATE 0x10
-#define DRM_PSB_2D_OP  0x11/* Will be merged later */
-#define DRM_PSB_GEM_MMAP   0x12
-#define DRM_PSB_DPST   0x1B
-#define DRM_PSB_GAMMA  0x1C
-#define DRM_PSB_DPST_BL0x1D
-#define DRM_PSB_GET_PIPE_FROM_CRTC_ID 0x1F
-
-#define PSB_MODE_OPERATION_MODE_VALID  0x01
-#define PSB_MODE_OPERATION_SET_DC_BASE  0x02
-
 struct drm_psb_get_pipe_from_crtc_id_arg {
/** ID of CR

[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #3 from Ian Romanick 2011-11-16 14:25:23 PST ---
If this was a recent change, I'll guess that it will bisect to my changes to
the way uniforms are handled.  I pushed a patch today that may restore previous
performance:

commit 010dc29283cfc7791a29ba8a0570d8f7f9edef05
Author: Ian Romanick 
Date:   Thu Nov 10 12:32:35 2011 -0800

mesa: Only update sampler uniforms that are used by the shader stage

Previously a vertex shader that used no samplers would get updated (by
calling the driver's ProgramStringNotify) when a sampler in the
fragment shader was updated.  This was discovered while investigating
some spurious code generation for shaders in Cogs.  The behavior in
Cogs is especially pessimal because it ping-pongs sampler uniform
settings:

glUniform1i(sampler1, 0);
glUniform1i(sampler2, 1);
draw();
glUniform1i(sampler1, 1);
glUniform1i(sampler2, 0);
draw();
glUniform1i(sampler1, 0);
glUniform1i(sampler2, 1);
draw();
// etc.

ProgramStringNotify is still too big of a hammer.  Applications like
Cogs will still defeat the shader cache.  A lighter-weight mechanism
that can work with the shader cache is needed.  However, this patch at
least restores the previous behavior.

Signed-off-by: Ian Romanick 
Reviewed-by: Kenneth Graunke 
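
The idea in the commit message can be sketched as follows. This is not
Mesa code: the structure and names are hypothetical, and the return
value stands in for "which stages would get ProgramStringNotify". Each
sampler uniform records which shader stages reference it, and an update
only touches those stages.

```c
#include <stdint.h>

enum {
	EX_STAGE_VERTEX   = 1u << 0,
	EX_STAGE_FRAGMENT = 1u << 1
};

/* Hypothetical bookkeeping for one sampler uniform. */
struct ex_sampler_uniform {
	uint32_t used_by; /* bitmask of stages that reference this sampler */
	int unit;         /* texture unit currently assigned */
};

/*
 * Assign a texture unit and return the stages that need to be notified.
 * Stages that do not use the sampler are left alone.
 */
static uint32_t ex_set_sampler(struct ex_sampler_uniform *u, int unit)
{
	u->unit = unit;
	return u->used_by;
}
```

Note that the ping-pong pattern from the commit message still notifies
the fragment stage on every call in this sketch, which matches the
remark that the mechanism is still too heavy for such applications.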



[RFC PATCH] drm: Fix off-by-one races on vblank disable

2011-11-16 Thread Andy Lutomirski
There are two possible races when disabling vblanks.  If the IRQ
fired but the hardware didn't update its counter yet, then we store
too low a hardware counter.  (Sensible hardware never does this.
Apparently not all hardware is sensible.)  If, on the other hand,
the counter updated but the IRQ didn't fire yet, we store too high a
counter.

We handled the former case with a heuristic based on timestamps and
we did not handle the latter case.  By saving a little more state,
we can handle both cases exactly: all we need to do is watch for
changes in the difference between the hardware and software vblank
counts.

Signed-off-by: Andy Lutomirski 
---

Rather than tweaking more things to reduce the chance of hitting a race
while keeping the vblank disable timeout as low as possible, why not
just fix the race?

This compiles but is not very well tested, because I don't know what
tests to run.  I haven't been able to provoke either race on my SNB
laptop.
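
For readers who want the arithmetic spelled out, here is a standalone
sketch of the "watch the difference" idea. It is not the code from the
diff below. The names are made up, and it assumes the hardware counter
wraps at a power of two given by hw_mask (for example 0xffffff for a
24-bit counter) and that the drift stays well below half that range.

```c
#include <stdint.h>

/* Snapshot taken when the vblank interrupt was last enabled. */
struct ex_vbl_counts {
	uint32_t hwcount;  /* hardware count at snapshot time */
	uint32_t drmcount; /* software count at snapshot time */
};

/*
 * Compare how far each counter has advanced since the snapshot.
 * Returns +1 if the hardware counter is one ahead (irq not handled yet),
 * -1 if the software counter is one ahead (irq fired before the hardware
 * counter updated), and 0 if they agree.
 */
static int32_t ex_vblank_drift(const struct ex_vbl_counts *saved,
			       uint32_t hw_now, uint32_t drm_now,
			       uint32_t hw_mask)
{
	uint32_t hw_seen = (hw_now - saved->hwcount) & hw_mask;
	uint32_t drm_seen = (drm_now - saved->drmcount) & hw_mask;
	uint32_t diff = (hw_seen - drm_seen) & hw_mask;

	/* Values in the upper half of the range represent negative drift. */
	if (diff > hw_mask / 2)
		return -(int32_t)(hw_mask - diff + 1);
	return (int32_t)diff;
}
```

A nonzero result tells the disable path by how much, and in which
direction, the saved counts need to be corrected.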

 drivers/gpu/drm/drm_irq.c |   92 
 include/drm/drmP.h|2 +-
 2 files changed, 59 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index 3830e9e..1674a33 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -56,6 +56,12 @@
  */
 #define DRM_REDUNDANT_VBLIRQ_THRESH_NS 100
 
+/* Saved vblank count data, used only in this file. */
+struct drm_vbl_counts {
+   u32 hwcount;/* hw count at last state save or load */
+   u32 drmcount;   /* drm count at last state save or load */
+};
+
 /**
  * Get interrupt from bus id.
  *
@@ -101,7 +107,8 @@ static void clear_vblank_timestamps(struct drm_device *dev, int crtc)
 static void vblank_disable_and_save(struct drm_device *dev, int crtc)
 {
unsigned long irqflags;
-   u32 vblcount;
+   u32 drmcount, hwcount;
+   u32 drm_counts_seen, hw_counts_seen, offset;
s64 diff_ns;
int vblrc;
struct timeval tvblank;
@@ -121,44 +128,53 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc)
/* No further vblank irq's will be processed after
 * this point. Get current hardware vblank count and
 * vblank timestamp, repeat until they are consistent.
-*
-* FIXME: There is still a race condition here and in
-* drm_update_vblank_count() which can cause off-by-one
-* reinitialization of software vblank counter. If gpu
-* vblank counter doesn't increment exactly at the leading
-* edge of a vblank interval, then we can lose 1 count if
-* we happen to execute between start of vblank and the
-* delayed gpu counter increment.
 */
do {
-   dev->last_vblank[crtc] = dev->driver->get_vblank_counter(dev, crtc);
+   hwcount = dev->driver->get_vblank_counter(dev, crtc);
vblrc = drm_get_last_vbltimestamp(dev, crtc, &tvblank, 0);
-   } while (dev->last_vblank[crtc] != dev->driver->get_vblank_counter(dev, crtc));
+   } while (hwcount != dev->driver->get_vblank_counter(dev, crtc));
 
/* Compute time difference to stored timestamp of last vblank
 * as updated by last invocation of drm_handle_vblank() in vblank irq.
 */
-   vblcount = atomic_read(&dev->_vblank_count[crtc]);
+   drmcount = atomic_read(&dev->_vblank_count[crtc]);
diff_ns = timeval_to_ns(&tvblank) -
- timeval_to_ns(&vblanktimestamp(dev, crtc, vblcount));
+ timeval_to_ns(&vblanktimestamp(dev, crtc, drmcount));
 
-   /* If there is at least 1 msec difference between the last stored
-* timestamp and tvblank, then we are currently executing our
-* disable inside a new vblank interval, the tvblank timestamp
-* corresponds to this new vblank interval and the irq handler
-* for this vblank didn't run yet and won't run due to our disable.
-* Therefore we need to do the job of drm_handle_vblank() and
-* increment the vblank counter by one to account for this vblank.
+   /* We could be off by one in either direction.  If a vblank just
+* happened but the IRQ hasn't been handled yet, then drmcount is
+* too low by one.  On the other hand, if the GPU fires its vblank
+* interrupts *before* updating its counter, then hwcount could
+* be too low by one.  (If both happen, they cancel out.)
 *
-* Skip this step if there isn't any high precision timestamp
-* available. In that case we can't account for this and just
-* hope for the best.
+* Fortunately, we have enough information to figure out what
+* happened.  Assuming the hardware counter works right, the
+* difference between drmcount and hwcount should be a constant
+* (modulo max_vblank_count).  We have both saved values from last
+* time we turned the interrupt on.
 */
- 

Re: [Intel-gfx] [RFC] Reduce idle vblank wakeups

2011-11-16 Thread Andrew Lutomirski
On Wed, Nov 16, 2011 at 6:19 PM, Matthew Garrett  wrote:
> On Thu, Nov 17, 2011 at 01:26:37AM +0100, Mario Kleiner wrote:
>> On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:
>> >For Radeon, I'd have thought you could handle this by scheduling
>> >an irq
>> >for the beginning of scanout (avivo has a register for that) and
>> >delaying the vblank disable until you hit it?
>>
>> For Radeon there is such an irq, but iirc we had some discussions on
>> this, also with Alex Deucher, a while ago and some irq's weren't
>> considered very reliable, or already used for other stuff. The idea
>> i had goes like this:
>>
>> Use the crtc scanout position queries together with the vblank
>> counter queries inside some calibration loop, maybe executed after
>> each modeset, to find out the scanline range in which the hardware
>> vblank counter increments -- basically a forbidden range of scanline
>> positions where the race would happen. Then at each vblank off/on,
>> query scanout position before and after the hw vblank counter query.
>> If according to the scanout positions the vblank counter query
>> happened within the forbidden time window, retry the query. With a
>> well working calibration that should add no delay in most cases and
>> a delay to the on/off code of a few dozen microseconds (=duration of
>> a few scanlines) worst case.
>
> Assuming we're sleeping rather than busy-looping, that's certainly ok.
> My previous experiments with radeon indicated that the scanout irq was
> certainly not entirely reliable - on the other hand, I was trying to use
> it for completing memory reclocking within the vblank interval. It was
> typically still within a few scanlines, so a sanity check there wouldn't
> pose too much of a problem.
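
The calibration-and-retry scheme quoted above could look roughly like
the sketch below. Everything here is hypothetical: the CRTC is simulated
(each scanline read advances the beam by one line), the forbidden range
is assumed to be already calibrated and not to wrap around the end of
the frame, and the loop busy-retries instead of sleeping.

```c
#include <stdint.h>

/* Simulated CRTC: every scanline read advances the beam by one line. */
struct ex_fake_crtc {
	int line;            /* current scanline */
	int lines_per_frame;
	int tick_line;       /* line at which the hw counter increments */
	uint32_t counter;    /* simulated hardware vblank counter */
};

static int ex_read_scanline(struct ex_fake_crtc *c)
{
	int line = c->line;

	c->line = (c->line + 1) % c->lines_per_frame;
	if (c->line == c->tick_line)
		c->counter++;
	return line;
}

/* Calibrated window of scanlines in which the counter may increment. */
struct ex_forbidden_range {
	int first;
	int last;
};

static int ex_in_range(const struct ex_forbidden_range *r, int line)
{
	return line >= r->first && line <= r->last;
}

/* Read the counter, retrying while the read may have raced with the tick. */
static uint32_t ex_read_counter_safely(struct ex_fake_crtc *c,
				       const struct ex_forbidden_range *r,
				       int max_tries)
{
	uint32_t count;
	int before, after;

	do {
		before = ex_read_scanline(c);
		count = c->counter;
		after = ex_read_scanline(c);
	} while ((ex_in_range(r, before) || ex_in_range(r, after)) &&
		 --max_tries > 0);

	return count;
}
```

Outside the forbidden window the loop runs once, so the common case adds
only two scanline reads. Near the window it costs a few extra scanlines,
in line with the worst case estimated in the quoted text.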

I think there's a simpler fix: just keep the hardware and software
counts in sync -- if everything is working correctly (even with all
these crazy races), the difference should be constant.  Patch coming
momentarily.

--Andy


[PATCH 3/3 v2] drm/i915: hot removal notification to HDMI audio driver

2011-11-16 Thread Wu Fengguang
On monitor hot removal:

1) clear SDVO_AUDIO_ENABLE or DP_AUDIO_OUTPUT_ENABLE
2) clear ELD Valid bit

So that the audio driver will receive hot plug events and take action to
refresh its device state and ELD contents.

cc: Wang Zhenyu 
Signed-off-by: Wu Fengguang 
---
 drivers/gpu/drm/drm_crtc_helper.c |4 
 drivers/gpu/drm/i915/intel_dp.c   |   17 +
 drivers/gpu/drm/i915/intel_hdmi.c |   17 +
 include/drm/drm_crtc.h|1 +
 4 files changed, 39 insertions(+)

--- linux.orig/drivers/gpu/drm/i915/intel_dp.c  2011-11-16 21:36:58.0 +0800
+++ linux/drivers/gpu/drm/i915/intel_dp.c   2011-11-16 21:37:00.0 +0800
@@ -1984,6 +1984,22 @@ intel_dp_detect(struct drm_connector *co
return connector_status_connected;
 }

+static void intel_dp_hot_remove(struct drm_connector *connector)
+{
+   struct intel_dp *intel_dp = intel_attached_dp(connector);
+   struct drm_device *dev = intel_dp->base.base.dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+   struct drm_crtc *crtc = intel_dp->base.base.crtc;
+
+   intel_dp->DP &= ~DP_AUDIO_OUTPUT_ENABLE;
+   I915_WRITE(intel_dp->output_reg, intel_dp->DP);
+   POSTING_READ(intel_dp->output_reg);
+
+   connector->eld[0] = 0;
+   if (dev_priv->display.write_eld)
+       dev_priv->display.write_eld(connector, crtc);
+}
+
 static int intel_dp_get_modes(struct drm_connector *connector)
 {
struct intel_dp *intel_dp = intel_attached_dp(connector);
@@ -2143,6 +2159,7 @@ static const struct drm_connector_funcs 
.detect = intel_dp_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
.set_property = intel_dp_set_property,
+   .hot_remove = intel_dp_hot_remove,
.destroy = intel_dp_destroy,
 };

--- linux.orig/drivers/gpu/drm/i915/intel_hdmi.c    2011-11-16 21:36:58.0 +0800
+++ linux/drivers/gpu/drm/i915/intel_hdmi.c 2011-11-16 21:37:00.0 +0800
@@ -350,6 +350,22 @@ intel_hdmi_detect(struct drm_connector *
return status;
 }

+static void intel_hdmi_hot_remove(struct drm_connector *connector)
+{
+   struct intel_hdmi *intel_hdmi = intel_attached_hdmi(connector);
+   struct drm_i915_private *dev_priv = connector->dev->dev_private;
+   u32 temp;
+
+   temp = I915_READ(intel_hdmi->sdvox_reg);
+   I915_WRITE(intel_hdmi->sdvox_reg, temp & ~SDVO_AUDIO_ENABLE);
+   POSTING_READ(intel_hdmi->sdvox_reg);
+
+   connector->eld[0] = 0;
+   if (dev_priv->display.write_eld)
+       dev_priv->display.write_eld(connector,
+                                   intel_hdmi->base.base.crtc);
+}
+
 static int intel_hdmi_get_modes(struct drm_connector *connector)
 {
struct intel_hdmi *intel_hdmi = intel_attached_hdmi(connector);
@@ -459,6 +475,7 @@ static const struct drm_connector_funcs 
.detect = intel_hdmi_detect,
.fill_modes = drm_helper_probe_single_connector_modes,
.set_property = intel_hdmi_set_property,
+   .hot_remove = intel_hdmi_hot_remove,
.destroy = intel_hdmi_destroy,
 };

--- linux.orig/drivers/gpu/drm/drm_crtc_helper.c    2011-11-16 21:36:58.0 +0800
+++ linux/drivers/gpu/drm/drm_crtc_helper.c 2011-11-16 21:37:00.0 +0800
@@ -905,6 +905,10 @@ static void output_poll_execute(struct w
  old_status, connector->status);
if (old_status != connector->status)
changed = true;
+   if (old_status == connector_status_connected &&
+       connector->status == connector_status_disconnected)
+       connector->funcs->hot_remove(connector);
+
}

mutex_unlock(&dev->mode_config.mutex);
--- linux.orig/include/drm/drm_crtc.h   2011-11-16 21:36:58.0 +0800
+++ linux/include/drm/drm_crtc.h2011-11-16 21:37:00.0 +0800
@@ -419,6 +419,7 @@ struct drm_connector_funcs {
int (*fill_modes)(struct drm_connector *connector, uint32_t max_width, uint32_t max_height);
int (*set_property)(struct drm_connector *connector, struct drm_property *property,
 uint64_t val);
+   void (*hot_remove)(struct drm_connector *connector);
void (*destroy)(struct drm_connector *connector);
void (*force)(struct drm_connector *connector);
 };
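
A note on the output_poll_execute() hunk: the helper runs for every
connector, while in this patch only the DP and HDMI connectors fill in
->hot_remove. The sketch below shows one way an optional hook can be
guarded before it is called. It is only an illustration under that
assumption; the toy_ names are hypothetical stand-ins, not the real DRM
structures.

```c
#include <stddef.h>

/* Hypothetical stand-ins for drm_connector and drm_connector_funcs. */
enum toy_status { TOY_CONNECTED, TOY_DISCONNECTED };

struct toy_connector;

struct toy_connector_funcs {
	/* Optional hook; connector types without audio leave it NULL. */
	void (*hot_remove)(struct toy_connector *connector);
};

struct toy_connector {
	const struct toy_connector_funcs *funcs;
	enum toy_status status;
	int removed_events;	/* counts hook invocations, for illustration */
};

static void toy_hot_remove(struct toy_connector *connector)
{
	connector->removed_events++;
}

/*
 * Call the hook only on a connected -> disconnected transition, and
 * only if the connector type provides one. Returns 1 if it was called.
 */
static int toy_notify_hot_remove(struct toy_connector *connector,
				 enum toy_status old_status)
{
	if (old_status != TOY_CONNECTED ||
	    connector->status != TOY_DISCONNECTED)
		return 0;
	if (!connector->funcs->hot_remove)
		return 0;
	connector->funcs->hot_remove(connector);
	return 1;
}
```

With a check like this, connector types that never set the hook are
simply skipped on hot removal.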


[Intel-gfx] [RFC] Reduce idle vblank wakeups

2011-11-16 Thread Andrew Lutomirski
On Wed, Nov 16, 2011 at 6:19 PM, Matthew Garrett  wrote:
> On Thu, Nov 17, 2011 at 01:26:37AM +0100, Mario Kleiner wrote:
>> On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:
>> >For Radeon, I'd have thought you could handle this by scheduling
>> >an irq
>> >for the beginning of scanout (avivo has a register for that) and
>> >delaying the vblank disable until you hit it?
>>
>> For Radeon there is such an irq, but iirc we had some discussions on
>> this, also with Alex Deucher, a while ago and some irq's weren't
>> considered very reliable, or already used for other stuff. The idea
>> i had goes like this:
>>
>> Use the crtc scanout position queries together with the vblank
>> counter queries inside some calibration loop, maybe executed after
>> each modeset, to find out the scanline range in which the hardware
>> vblank counter increments -- basically a forbidden range of scanline
>> positions where the race would happen. Then at each vblank off/on,
>> query scanout position before and after the hw vblank counter query.
>> If according to the scanout positions the vblank counter query
>> happened within the forbidden time window, retry the query. With a
>> well working calibration that should add no delay in most cases and
>> a delay to the on/off code of a few dozen microseconds (=duration of
>> a few scanlines) worst case.
>
> Assuming we're sleeping rather than busy-looping, that's certainly ok.
> My previous experiments with radeon indicated that the scanout irq was
> certainly not entirely reliable - on the other hand, I was trying to use
> it for completing memory reclocking within the vblank interval. It was
> typically still within a few scanlines, so a sanity check there wouldn't
> pose too much of a problem.

I think there's a simpler fix: just keep the hardware and software
counts in sync -- if everything is working correctly (even with all
these crazy races), the difference should be constant.  Patch coming
momentarily.

--Andy
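
For readers following the thread, the "constant difference" idea can be
sketched roughly as follows. This is not Andy's patch (which is not part
of this message); it assumes 32-bit counters that wrap, and vbl_sync and
its functions are hypothetical names.

```c
#include <stdint.h>

/* Remembered relation between the software and hardware counters. */
struct vbl_sync {
	uint32_t offset;	/* sw_count - hw_count, modulo 2^32 */
};

/* Record the offset at a moment when both counters are known to agree. */
static void vbl_sync_init(struct vbl_sync *s, uint32_t sw_count,
			  uint32_t hw_count)
{
	s->offset = sw_count - hw_count;
}

/* Rebuild the software count from the hardware count, e.g. after the
 * vblank irq was off for a while. Unsigned wraparound is intended. */
static uint32_t vbl_sync_sw_from_hw(const struct vbl_sync *s,
				    uint32_t hw_count)
{
	return hw_count + s->offset;
}

/* Nonzero if the pair (sw, hw) still has the recorded difference. */
static int vbl_sync_consistent(const struct vbl_sync *s, uint32_t sw_count,
			       uint32_t hw_count)
{
	return (uint32_t)(sw_count - hw_count) == s->offset;
}
```

Because all arithmetic is modulo 2^32, the relation survives a wrap of
the hardware counter as long as both counters are treated as 32 bits.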


[PATCH 3/3] drm/i915: hot removal notification to HDMI audio driver

2011-11-16 Thread Wu Fengguang
Sorry forgot to remove this left over chunk...

Note that I've not yet got the hardware to test the DisplayPort part
of this patch, but should be able to do so this week.

> --- linux.orig/drivers/gpu/drm/i915/intel_drv.h   2011-11-16 20:54:27.0 +0800
> +++ linux/drivers/gpu/drm/i915/intel_drv.h    2011-11-16 21:19:42.0 +0800
> @@ -382,6 +382,10 @@ extern void intel_fb_restore_mode(struct
>  extern void intel_init_clock_gating(struct drm_device *dev);
>  extern void intel_write_eld(struct drm_encoder *encoder,
>   struct drm_display_mode *mode);
> +extern void intel_hotplug_status(struct drm_device *dev,
> +   struct drm_connector *connector,
> +   struct drm_crtc *crtc,
> +   enum drm_connector_status status);
>  extern void intel_cpt_verify_modeset(struct drm_device *dev, int pipe);
>  
>  #endif /* __INTEL_DRV_H__ */


drm pixel formats update

2011-11-16 Thread Alan Cox
> I think the only format in my list where I didn't use an existing fourcc
> is I420/IYUV. And BTW, for that one I used the same "fake" fourcc that

Right, but in your patch you redefine all sorts of stuff in the driver to
non-FourCC names, which is just confusing (and painful given the format
picked).

> v4l2 uses (YU12). 
> 
> And that brings another matter to the table. How should we deal with
> duplicate fourccs? I420/IYUV and YUY2/YUYV come to mind.

Just accept both. FourCC, as with all APIs, is not perfect.

> Also, if I now add these ad-hoc fourccs for the RGB formats, and some
> time later someone comes in with a format with a conflicting official
> fourcc, what should we do?

One possibility I suggested originally was to mix FourCC codes and native
formats which are numbered. That works fine in both endiannesses in
theory, because a small number always has a \0 byte in it, which is
invalid in a FourCC.

i.e. just number the Linux specific DRM formats 0, 1, 2, 3, 4, 5, ...
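
A minimal sketch of how such a mixed scheme could be told apart, assuming
the common convention of packing the first character into the low byte.
TOY_FOURCC and toy_is_native_format are hypothetical names used only for
illustration.

```c
#include <stdint.h>

/* Pack four characters into a 32-bit code, first character in the low byte. */
#define TOY_FOURCC(a, b, c, d) \
	((uint32_t)(uint8_t)(a) | ((uint32_t)(uint8_t)(b) << 8) | \
	 ((uint32_t)(uint8_t)(c) << 16) | ((uint32_t)(uint8_t)(d) << 24))

/*
 * Nonzero if any of the four bytes is 0. Such a value cannot be a
 * FourCC, so it can be treated as a numbered native format. The test
 * does not depend on byte order: swapping the bytes of a small number
 * still leaves zero bytes in it.
 */
static int toy_is_native_format(uint32_t code)
{
	int i;

	for (i = 0; i < 4; i++)
		if (((code >> (8 * i)) & 0xff) == 0)
			return 1;
	return 0;
}
```

This only holds while the native numbers stay small; a value such as
0x01010101 has no zero byte and would be read as a FourCC.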

> Oh and one extra detail just occured to me regarding the three plane
> formats. Should we even define formats for both the YUV vs. YVU
> variant. Seeing as we now have independent handles and offsets for
> each plane, we can make do with just one format definition.

I think so - or the helper should do the translation and flip the planes.
We want the user to get flexibility and the driver to be as simple as
possible.
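
As a rough illustration of the "helper flips the planes" idea: if each
plane carries its own handle, pitch and offset, a Y/V/U description can be
turned into Y/U/V by swapping the entries for planes 1 and 2. The struct
layout and names below are assumptions made for this sketch, not the
actual DRM interface.

```c
#include <stdint.h>

/* Hypothetical per-plane description of a three-plane framebuffer. */
struct toy_fb_planes {
	uint32_t handles[3];
	uint32_t pitches[3];
	uint32_t offsets[3];
};

static void toy_swap_u32(uint32_t *a, uint32_t *b)
{
	uint32_t tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Convert plane order Y/V/U into Y/U/V by swapping planes 1 and 2. */
static void toy_yvu_to_yuv(struct toy_fb_planes *fb)
{
	toy_swap_u32(&fb->handles[1], &fb->handles[2]);
	toy_swap_u32(&fb->pitches[1], &fb->pitches[2]);
	toy_swap_u32(&fb->offsets[1], &fb->offsets[2]);
}
```

The driver then only ever sees one plane order, which keeps it simple.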

(and btw I've no problem at all with the idea that you can pass in a
FourCC *or* a format specifying structure, or with an internal API where
a fourCC is always internally turned into a struct of offsets and other
useful info before hitting the drivers)

Alan


[PATCH 3/3] drm/i915: hot removal notification to HDMI audio driver

2011-11-16 Thread Wu Fengguang
An embedded and charset-unspecified text was scrubbed...
Name: eld-hot-removal
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20111116/abe8b563/attachment.asc>


[PATCH 2/3] drm/i915: dont trigger hotplug events on unchanged ELD

2011-11-16 Thread Wu Fengguang
An embedded and charset-unspecified text was scrubbed...
Name: no-extra-eld-passing
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20111116/01c9d33c/attachment.asc>


[PATCH 1/3] drm/i915: fix ELD writing for SandyBridge

2011-11-16 Thread Wu Fengguang
An embedded and charset-unspecified text was scrubbed...
Name: sandybridge-eld-fix
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20111116/914003d5/attachment.asc>


[PATCH 0/3] HDMI ELD fixes for 3.2

2011-11-16 Thread Wu Fengguang
Keith,

Here are 3 fixes on HDMI/ELD audio.

The third one adds a ->hot_remove hook to drm_connector_funcs. Please review.

[PATCH 1/3] drm/i915: fix ELD writing for SandyBridge
[PATCH 2/3] drm/i915: dont trigger hotplug events on unchanged ELD
[PATCH 3/3] drm/i915: hot removal notification to HDMI audio driver

Thanks,
Fengguang



[PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread ville.syrj...@linux.intel.com
From: Ville Syrjälä

Name the formats as DRM_FORMAT_X instead of DRM_FOURCC_X. Use consistent
names, especially for the RGB formats. Component order and byte order are
now strictly specified for each format.

The RGB format naming follows a convention where the components names
and sizes are listed from left to right, matching the order within a
single pixel from most significant bit to least significant bit. Lower
case letters are used when listing the components to improve
readability. I believe this convention matches the one used by pixman.

The YUV format names vary more. For the 4:2:2 packed formats and 2
plane formats use the fourcc. For the three plane formats the
name includes the plane order and subsampling information using the
standard subsampling notation. Some of those also happen to match
the official fourcc definition.

The fourccs for all the RGB formats and some of the YUV formats
I invented myself. The idea was that looking at just the fourcc you
get some idea what the format is about without having to decode it
using some external reference.

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_crtc.c   |   18 +++---
 drivers/gpu/drm/drm_crtc_helper.c|   39 --
 drivers/gpu/drm/i915/intel_display.c |   18 ---
 include/drm/drm_fourcc.h |   96 --
 4 files changed, 121 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 30a70a4..761f265 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -1918,28 +1918,28 @@ uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth)

switch (bpp) {
case 8:
-   fmt = DRM_FOURCC_RGB332;
+   fmt = DRM_FORMAT_r3g3b2;
break;
case 16:
if (depth == 15)
-   fmt = DRM_FOURCC_RGB555;
+   fmt = DRM_FORMAT_x1r5g5b5;
else
-   fmt = DRM_FOURCC_RGB565;
+   fmt = DRM_FORMAT_r5g6b5;
break;
case 24:
-   fmt = DRM_FOURCC_RGB24;
+   fmt = DRM_FORMAT_r8g8b8;
break;
case 32:
if (depth == 24)
-   fmt = DRM_FOURCC_RGB24;
+   fmt = DRM_FORMAT_x8r8g8b8;
else if (depth == 30)
-   fmt = DRM_INTEL_RGB30;
+   fmt = DRM_FORMAT_x2r10g10b10;
else
-   fmt = DRM_FOURCC_RGB32;
+   fmt = DRM_FORMAT_a8r8g8b8;
break;
default:
-   DRM_ERROR("bad bpp, assuming RGB24 pixel format\n");
-   fmt = DRM_FOURCC_RGB24;
+   DRM_ERROR("bad bpp, assuming x8r8g8b8 pixel format\n");
+   fmt = DRM_FORMAT_x8r8g8b8;
break;
}

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 3e0645c..4ef19d37 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -816,27 +816,54 @@ void drm_helper_get_fb_bpp_depth(uint32_t format, unsigned int *depth,
 int *bpp)
 {
switch (format) {
-   case DRM_FOURCC_RGB332:
+   case DRM_FORMAT_r3g3b2:
+   case DRM_FORMAT_b2g3r3:
*depth = 8;
*bpp = 8;
break;
-   case DRM_FOURCC_RGB555:
+   case DRM_FORMAT_x1r5g5b5:
+   case DRM_FORMAT_x1b5g5r5:
+   case DRM_FORMAT_r5g5b5x1:
+   case DRM_FORMAT_b5g5r5x1:
+   case DRM_FORMAT_a1r5g5b5:
+   case DRM_FORMAT_a1b5g5r5:
+   case DRM_FORMAT_r5g5b5a1:
+   case DRM_FORMAT_b5g5r5a1:
*depth = 15;
*bpp = 16;
break;
-   case DRM_FOURCC_RGB565:
+   case DRM_FORMAT_r5g6b5:
+   case DRM_FORMAT_b5g6r5:
*depth = 16;
*bpp = 16;
break;
-   case DRM_FOURCC_RGB24:
+   case DRM_FORMAT_r8g8b8:
+   case DRM_FORMAT_b8g8r8:
+   *depth = 24;
+   *bpp = 24;
+   break;
+   case DRM_FORMAT_x8r8g8b8:
+   case DRM_FORMAT_x8b8g8r8:
+   case DRM_FORMAT_r8g8b8x8:
+   case DRM_FORMAT_b8g8r8x8:
*depth = 24;
*bpp = 32;
break;
-   case DRM_INTEL_RGB30:
+   case DRM_FORMAT_x2r10g10b10:
+   case DRM_FORMAT_x2b10g10r10:
+   case DRM_FORMAT_r10g10b10x2:
+   case DRM_FORMAT_b10g10r10x2:
+   case DRM_FORMAT_a2r10g10b10:
+   case DRM_FORMAT_a2b10g10r10:
+   case DRM_FORMAT_r10g10b10a2:
+   case DRM_FORMAT_b10g10r10a2:
*depth = 30;
*bpp = 32;
break;
-   case DRM_FOURCC_RGB32:
+   case DRM_FORMAT_a8r8g8b8:
+   case DRM_FORMAT_a8b8g8r8:
+   case DRM_FORMAT_r8g8b8a8:
+   case DRM_FORMAT_b8g8r8a8:
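
(The archived copy of the diff ends here.)

To make the naming convention from the commit message concrete: reading a
name such as x8r8g8b8 from left to right gives the fields of one pixel
from the most significant bit down to the least significant bit. The
sketch below packs two such pixels as plain integers; the toy_ helpers are
hypothetical and say nothing about how the resulting word is laid out in
memory, which is the separate byte order question.

```c
#include <stdint.h>

/* x8r8g8b8: bits 31..24 unused, 23..16 red, 15..8 green, 7..0 blue. */
static uint32_t toy_pack_x8r8g8b8(uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* r5g6b5: bits 15..11 red, 10..5 green, 4..0 blue.
 * The 8-bit inputs are truncated to 5, 6 and 5 bits respectively. */
static uint16_t toy_pack_r5g6b5(uint8_t r, uint8_t g, uint8_t b)
{
	return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```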
 

[PATCH 1/2] drm: Add a missing ')'

2011-11-16 Thread ville.syrj...@linux.intel.com
From: Ville Syrjälä

The code happened to compile because the flag wasn't actually used yet.

Signed-off-by: Ville Syrjälä
---
 include/drm/drm_mode.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/drm/drm_mode.h b/include/drm/drm_mode.h
index 094da8a..27d7faf 100644
--- a/include/drm/drm_mode.h
+++ b/include/drm/drm_mode.h
@@ -266,7 +266,7 @@ struct drm_mode_fb_cmd {
__u32 handle;
 };

-#define DRM_MODE_FB_INTERLACED (1<<0 /* for interlaced framebuffers */
+#define DRM_MODE_FB_INTERLACED (1<<0) /* for interlaced framebuffers */

 struct drm_mode_fb_cmd2 {
__u32 fb_id;
-- 
1.7.3.4



drm pixel formats update

2011-11-16 Thread ville.syrj...@linux.intel.com
I decided to go all out with the pixel format definitions. Added pretty
much all of the possible RGB/BGR variations. Just left out ones with
16bit components and floats. Also added a whole bunch of YUV formats,
and 8 bit pseudocolor for good measure.

I'm sure some of the fourccs now clash with the ones used by v4l2,
but that's life.

If anyone has problems with the way the formats are defined, please
speak up now! Since only Jesse has bothered to comment on my rantings
I can only assume people are happy with my approach to things.

These patches should apply on top of Jesse's v3+v5 set... I think.
I sort of lost track of things when the patches started having
different version numbers. At least we're not yet into two-digit
numbers ;)


[PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread Ilyes Gouta
Hi Ville,

Regarding 3 plane YCbCr, DRM_FORMAT_yuv444 (non sub-sampled YCbCr)
would also be useful.

-Ilyes

On Wed, Nov 16, 2011 at 7:42 PM,   wrote:
> From: Ville Syrjälä
>
> Name the formats as DRM_FORMAT_X instead of DRM_FOURCC_X. Use consistent
> names, especially for the RGB formats. Component order and byte order are
> now strictly specified for each format.
>
> The RGB format naming follows a convention where the components names
> and sizes are listed from left to right, matching the order within a
> single pixel from most significant bit to least significant bit. Lower
> case letters are used when listing the components to improve
> readability. I believe this convention matches the one used by pixman.
>
> The YUV format names vary more. For the 4:2:2 packed formats and 2
> plane formats use the fourcc. For the three plane formats the
> name includes the plane order and subsampling information using the
> standard subsampling notation. Some of those also happen to match
> the official fourcc definition.
>
> The fourccs for all the RGB formats and some of the YUV formats
> I invented myself. The idea was that looking at just the fourcc you
> get some idea what the format is about without having to decode it
> using some external reference.
>
> Signed-off-by: Ville Syrjälä

[Bug 42960] Display does not work when resuming from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42960

--- Comment #2 from Sandeep  2011-11-16 20:02:02 PST ---
Created attachment 53616
  --> https://bugs.freedesktop.org/attachment.cgi?id=53616
dmesg output before suspending and after resuming

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42960] Display does not work when resuming from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42960

--- Comment #1 from Sandeep  2011-11-16 20:01:12 UTC ---
Created attachment 53615
  --> https://bugs.freedesktop.org/attachment.cgi?id=53615
glxinfo output


drm pixel formats update

2011-11-16 Thread Alan Cox
> If anyone has problems with the way the formats are defined, please
> speak up now! Since only Jesse has bothered to comment on my rantings
> I can only assume people are happy with my approach to things.

Umm .. no. I don't see why they are needed. It's just an extra layer of
gratuitous, confusing indirection. The rest of the world speaks and
understands FourCC, so for all the formats covered by an existing FourCC
name we should just use the existing name.

You might need to check one now and then but everyone doing video
processing is familiar with them including all the Windows folk.


[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #2 from almos  2011-11-16 11:52:30 PST ---
The hw is barts pro (hd6850). The only part changed is mesa: 7.11 is installed
(debian unstable), and I compiled one from git. In the latter case I start
programs as
LD_LIBRARY_PATH=/home/almos/SRC/mesa/lib/
LIBGL_DRIVERS_PATH=/home/almos/SRC/mesa/lib/gallium "$@"

I'll try to bisect later.


[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #1 from Alex Deucher  2011-11-16 11:45:32 PST ---
I doubt you are using radeonhd.  Please attach your xorg log and dmesg output.


[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #1 from Alex Deucher  2011-11-16 11:42:52 PST ---
What hardware are you using?  Is mesa the only part that changed or did you
also update your kernel and/or ddx?  If it's just mesa, can you bisect?  If
it's multiple parts that you upgraded can you track down what component caused
the problem?


[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42999

Alex Deucher  changed:

           What    |Removed                     |Added

         AssignedTo|eich at pdx.freedesktop.org  |dri-devel at lists.freedesktop.org
          QAContact|xorg-team at lists.x.org     |
            Product|xorg                        |DRI
            Version|git                         |unspecified
          Component|Driver/radeonhd             |DRM/Radeon


[Bug 43000] New: huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=43000

 Bug #: 43000
   Summary: huge performance regression in ut2004 since 7.11
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: Other
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: aaalmosss at gmail.com


With 7.11 I get 60fps during the nvidia logo and in the menu. Ingame it is e.g.
~44fps if I load ons-torlan and look at the central tower from the base.

With 7.12-dev (git-b618e78) I get <30fps during the nvidia logo, and ~6fps on
the same level.

I must add, that 7.11 isn't quite playable either, because the fps has very
high variance: it jumps between 20 and 60, which makes the game very laggy.


[RFC] Reduce idle vblank wakeups

2011-11-16 Thread Mario Kleiner

On Nov 16, 2011, at 6:11 PM, Matthew Garrett wrote:

> On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:
>
>> I thought the main reason for the delay wasn't broken hardware but to
>> avoid constantly ping-ponging the vblank IRQ between on and off with
>> apps which regularly need the vblank counter value, as that could
>> make the counter unreliable. Maybe I'm misremembering though.
>
> If turning it on and off results in the counter value being wrong then
> surely that's a hardware problem? I've tested that turning it on and
> off between every IRQ still gives valid counter values on sandybridge.
>
> -- 
> Matthew Garrett | mjg59 at srcf.ucam.org


> On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:

> Even if I'm not, lowering the delay shouldn't be a problem, so long as
> it's long enough that at least apps which need the vblank counter every
> or every other frame don't cause the IRQ to ping-pong. But that
> shouldn't depend on the hardware.
Hi, and thanks Michel for cc'ing me,

It's not broken hardware, but fast ping-ponging it on and off can  
make the vblank counter and vblank timestamps unreliable for apps  
that need high timing precision, especially for the ones that use the  
OML_sync_control extensions for precise swap scheduling. My target  
application is vision science
  neuroscience, where (sub-)milliseconds often matter for visual  
stimulation.

I think making the vblank off delay driver specific via these patches  
is a good idea. Lowering the timeout to something like a few refresh  
cycles, maybe somewhere between 50 msecs and 100 msecs would be also  
fine by me. I still would like to keep some drm config option to  
disable or override the vblank off delay by users.

The intel and radeon kms drivers implement everything that's needed  
to make it mostly work. Except for a small race between the cpu and  
gpu in the vblank_disable_and_save() function  and  
drm_update_vblank_count(). It can cause an off-by-one error when  
reinitializing the drm vblank counter from the gpu's hardware counter  
if the enable/disable function is called at the wrong moment while  
the gpu's scanout is inside the vblank interval, see comments in the  
code. I have some sketchy idea for a patch that could detect when the  
race happens and retry hw counter queries to fix this. Without that  
patch, there's some chance between 0% and 4% of being off-by-one.
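
The retry idea can be sketched roughly like this: read the scanout
position before and after the hardware counter query and try again if
either read falls inside the range of scanlines where the counter may
increment. This is only an illustration under stated assumptions, not the
patch mentioned above; the toy_ names, the window bounds and the scripted
fake hardware are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical hardware access hooks. */
struct toy_crtc_ops {
	int (*scanout_line)(void *hw);		/* current scanline */
	uint32_t (*vblank_count)(void *hw);	/* hardware frame counter */
};

/* Nonzero if line lies in the inclusive window [lo, hi]. */
static int toy_in_window(int line, int lo, int hi)
{
	return line >= lo && line <= hi;
}

/*
 * Read the hw counter, bracketed by two scanout position reads. If
 * either position is inside the forbidden window, or the position
 * wrapped into the next frame in between, retry up to max_tries times.
 * Returns 0 with *count set on success, -1 if every attempt raced.
 * (A real driver would presumably wait a little between attempts.)
 */
static int toy_read_vblank_count(const struct toy_crtc_ops *ops, void *hw,
				 int lo, int hi, int max_tries,
				 uint32_t *count)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		int before = ops->scanout_line(hw);
		uint32_t c = ops->vblank_count(hw);
		int after = ops->scanout_line(hw);

		if (!toy_in_window(before, lo, hi) &&
		    !toy_in_window(after, lo, hi) && after >= before) {
			*count = c;
			return 0;
		}
	}
	return -1;
}

/* Fake hardware for demonstration: replays a script of scanline values
 * and repeats the last one once the script is exhausted. */
struct toy_fake_hw {
	const int *lines;
	int n;
	int pos;
	uint32_t count;
};

static int toy_fake_line(void *hw)
{
	struct toy_fake_hw *f = hw;
	int line = f->lines[f->pos];

	if (f->pos < f->n - 1)
		f->pos++;
	return line;
}

static uint32_t toy_fake_count(void *hw)
{
	struct toy_fake_hw *f = hw;

	return f->count;
}
```

The cost in the common case is two extra position reads; only a query
that lands inside the window pays for a retry.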

On current nouveau kms, disabling vblank irqs guarantees you wrong  
vblank counts and wrong vblank timestamps after turning them on  
again, because the kms driver doesn't implement the hook for hardware  
vblank counter query correctly. The drm vblank counter misses all  
counts during the vblank irq off period. Other timing related hooks  
are missing as well. I have a couple of patches queued up and some  
more to come for the ddx and kms driver to mostly fix this. NVidia  
gpu's only have hardware vblank counters for NV-50 and later, fixing  
this for earlier gpu's would require some emulation of a hw vblank  
counter inside the kms driver.

Apps that rely on the vblank counter being totally reliable over long  
periods of time currently would be in a bad situation with a lowered  
vblank off delay, but that's probably generally not a good  
assumption. Toolkits like mine, which are more paranoid, currently  
can work fine as long as the off delay is at least a few video  
refresh cycles. I do the following for scheduling a reliably timed swap:

1. Query current vblank counter current_msc and vblank timestamp  
current_ust.
2. Calculate a target vblank count target_msc, based on current_msc,  
current_ust and some target time from usercode.
3. Schedule bufferswap for target_msc.

As long as the vblank off delay is long enough so that vblanks don't  
get turned off between 1. and 3, everything is fine, otherwise bad  
things will happen.
Keeping a way to override the default off delay would be good to  
allow poor scientists to work around potentially broken drivers or  
gpu's in the future. @Matthew: I'm appealing here to your ex-  
Drosophila biologist heritage ;-)
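
Step 2 of the list above can be sketched as follows. This is a simplified
illustration, not the toolkit's actual code: it assumes timestamps in
microseconds and a known, constant refresh duration, and toy_target_msc is
a hypothetical name.

```c
#include <stdint.h>

/*
 * Pick the first vblank count whose expected timestamp is at or after
 * target_ust. current_msc and current_ust are the counter value and
 * timestamp of the most recent vblank; refresh_us is the duration of
 * one refresh cycle. All times are in microseconds.
 */
static int64_t toy_target_msc(int64_t current_msc, int64_t current_ust,
			      int64_t target_ust, int64_t refresh_us)
{
	int64_t delta, frames;

	/* Target already passed (or bad input): take the next vblank. */
	if (refresh_us <= 0 || target_ust <= current_ust)
		return current_msc + 1;

	delta = target_ust - current_ust;
	frames = (delta + refresh_us - 1) / refresh_us;	/* round up */
	return current_msc + frames;
}
```

The result is only meaningful if the vblank counter keeps counting between
the query and the scheduled swap, which is the dependency on the off delay
described above.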

thanks,
-mario


*
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.kleiner at tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:+49 (0)7071/601-616
www:http://www.kyb.tuebingen.mpg.de/~kleinerm
*
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)



drm pixel formats update

2011-11-16 Thread James Simmons

> I decided to go all out with the pixel format definitions. Added pretty
> much all of the possible RGB/BGR variations. Just left out ones with
> 16bit components and floats. Also added a whole bunch of YUV formats,
> and 8 bit pseudocolor for good measure.

Thank you for including the pseudocolor as well.


[RFC] Reduce idle vblank wakeups

2011-11-16 Thread Matthew Garrett
On Wed, Nov 16, 2011 at 07:27:51PM +0100, Mario Kleiner wrote:

> It's not broken hardware, but fast ping-ponging it on and off can
> make the vblank counter and vblank timestamps unreliable for apps
> that need high timing precision, especially for the ones that use
> the OML_sync_control extensions for precise swap scheduling. My
> target application is vision science
>  neuroscience, where (sub-)milliseconds often matter for visual
> stimulation.

I'll admit that I'm struggling to understand the issue here. If the 
vblank counter is incremented at the time of vblank (which isn't the 
case for radeon, it seems, but as far as I can tell is the case for 
Intel) then how does ping-ponging the IRQ matter? 
vblank_disable_and_save() appears to handle this case.

> I think making the vblank off delay driver specific via these
> patches is a good idea. Lowering the timeout to something like a few
> refresh cycles, maybe somewhere between 50 msecs and 100 msecs would
> be also fine by me. I still would like to keep some drm config
> option to disable or override the vblank off delay by users.

Does the timeout serve any purpose other than letting software 
effectively prevent vblanks from being disabled?

> The intel and radeon kms drivers implement everything that's needed
> to make it mostly work. Except for a small race between the cpu and
> gpu in the vblank_disable_and_save() function (drivers/gpu/drm/drm_irq.c#L101) and
> drm_update_vblank_count(). It can cause an off-by-one error when
> reinitializing the drm vblank counter from the gpu's hardware
> counter if the enable/disable function is called at the wrong moment
> while the gpu's scanout is inside the vblank interval, see comments
> in the code. I have some sketchy idea for a patch that could detect
> when the race happens and retry hw counter queries to fix this.
> Without that patch, there's some chance between 0% and 4% of being
> off-by-one.

For Radeon, I'd have thought you could handle this by scheduling an irq 
for the beginning of scanout (avivo has a register for that) and 
delaying the vblank disable until you hit it?

> On current nouveau kms, disabling vblank irqs guarantees you wrong
> vblank counts and wrong vblank timestamps after turning them on
> again, because the kms driver doesn't implement the hook for
> hardware vblank counter query correctly. The drm vblank counter
> misses all counts during the vblank irq off period. Other timing
> related hooks are missing as well. I have a couple of patches queued
> up and some more to come for the ddx and kms driver to mostly fix
> this. NVidia gpu's only have hardware vblank counters for NV-50 and
> later, fixing this for earlier gpu's would require some emulation of
> a hw vblank counter inside the kms driver.

I've no problem with all of this work being case by case.

> Apps that rely on the vblank counter being totally reliable over
> long periods of time currently would be in a bad situation with a
> lowered vblank off delay, but that's probably generally not a good
> assumption. Toolkits like mine, which are more paranoid, currently
> can work fine as long as the off delay is at least a few video
> refresh cycles. I do the following for scheduling a reliably timed
> swap:
> 
> 1. Query current vblank counter current_msc and vblank timestamp
> current_ust.
> 2. Calculate a target vblank count target_msc, based on current_msc,
> current_ust and some target time from usercode.
> 3. Schedule bufferswap for target_msc.
> 
> As long as the vblank off delay is long enough so that vblanks don't
> get turned off between 1. and 3, everything is fine, otherwise bad
> things will happen.
> Keeping a way to override the default off delay would be good to
> allow poor scientists to work around potentially broken drivers or
> gpu's in the future. @Matthew: I'm appealing here to your ex-
> Drosophila biologist heritage ;-)

If vblanks are disabled and then re-enabled between 1 and 3, what's the 
negative outcome?
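
For reference, step 2 of the quoted procedure can be sketched roughly as
below. This is only an illustration under stated assumptions: the name
target_msc_for and the refresh_us parameter are hypothetical, a constant
refresh period is assumed, and the toolkit's real calculation is not shown
in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of any real API): given the last vblank
 * count and its timestamp, return the count of the first vblank at or
 * after target_ust, assuming a constant refresh period of refresh_us.
 * All times are in microseconds.
 */
static uint64_t target_msc_for(uint64_t current_msc, int64_t current_ust,
                               int64_t target_ust, int64_t refresh_us)
{
    int64_t delta;

    /* The current vblank is already past; the earliest usable one is next. */
    if (target_ust <= current_ust)
        return current_msc + 1;

    delta = target_ust - current_ust;
    /* Round up to a whole number of refresh periods. */
    return current_msc + (uint64_t)((delta + refresh_us - 1) / refresh_us);
}
```

If the count returned in step 1 were off by one, target_msc would be off by
one as well, and the swap would land one refresh period early or late.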

-- 
Matthew Garrett | mjg59 at srcf.ucam.org


[Bug 42998] New: [r600g] Regression: EVE Online graphics borked again (bisected)

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42998

 Bug #: 42998
   Summary: [r600g] Regression: EVE Online graphics borked again
(bisected)
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: x86-64 (AMD64)
OS/Version: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: luziphermcleod at yahoo.ie


Created attachment 53609
  --> https://bugs.freedesktop.org/attachment.cgi?id=53609
Screenshot of login screen (bad)

EVE Online (via wine, usually works well) has severe graphical glitches on
r600g (HD4870). It's unplayable now. I bisected this down to:

r600g: lazy load for AR register
commit: 8e366dc365d01213b71b87ace47d30938db74845
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8e366dc365d01213b71b87ace47d30938db74845

It seems that all triangles (I guess) are malformed, jump around and aren't
textured as they should. Also the system gets quite slow as long as the game is
active.

System:
Gentoo on vanilla kernel linux-3.2.0-rc2
Mesa from git, r600g driver
GPU: Radeon HD4870 (x2, only one used), CPU: Intel Core i7-965, MB: ASUS P6T
Deluxe
libdrm from git, xf86-video-ati from git, wine-1.3.32, xorg-server-1.11.2

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


Re: [RFC] Reduce idle vblank wakeups

2011-11-16 Thread Matthew Garrett
On Thu, Nov 17, 2011 at 01:26:37AM +0100, Mario Kleiner wrote:
> On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:
> >I'll admit that I'm struggling to understand the issue here. If the
> >vblank counter is incremented at the time of vblank (which isn't the
> >case for radeon, it seems, but as far as I can tell is the case for
> >Intel) then how does ping-ponging the IRQ matter?
> >vblank_disable_and_save() appears to handle this case.
> >
> 
> The drm software vblank counter which is used for scheduling swaps
> etc. gets incremented in the gpu's vblank irq handler. The gpu's
> hardware counter gets incremented somewhere inside the vblank
> interval. The increments don't happen at the same point in time. The
> race is between the vblank on/off code, which gets scheduled either
> by the timeout timer for the "off" case, or by usercode for the "on"
> case and the gpu's hardware vblank counter.

Yes, that makes sense given the current design.

> The existing code uses a lock (vblank_time_lock) to prevent some
> races between itself and the vblank irq handler, but it can't "lock"
> the gpu, so if the enable/disable code executes while the gpu is
> scanning out the vblank interval, it is undefined if the final
> vblank count will be correct or off by one. Vblank duration is
> somewhere up to 4.5% of refresh interval duration, so there's your
> up to 4% chance of getting it wrong.

Well presumably by "undefined" we really mean "hardware-specific" - for 
any given well-functioning hardware I'd expect the answer to be 
well-defined. Like I said, with Radeon we have the option of triggering 
an interrupt at the point where the hardware counter is actually 
incremented. If that then shuts down the vblank interrupt then we should 
have a well-defined answer.

> >Does the timeout serve any purpose other than letting software
> >effectively prevent vblanks from being disabled?
> 
> With perfect drivers and gpu's in a perfect world, no. In reality
> there's the race i described above, and nouveau and all other
> drivers except intel and radeon. The vblank irq also drives
> timestamping of vblanks, one update per vblank. The timestamps are
> cached if a client needs them inbetween updates. Turning off vblank
> irq invalidates the timestamps. radeon and intel can recreate the
> timestamp anytime as needed, but nouveau lacks this atm., so
> timestamps remain invalid for a whole video refresh cycle after
> vblank irq on. We have patches for nouveau kms almost ready, so only
> the race mentioned above would remain.

Sure. We'd certainly need to improve things for other GPUs before 
enabling the same functionality there.

> >For Radeon, I'd have thought you could handle this by scheduling
> >an irq
> >for the beginning of scanout (avivo has a register for that) and
> >delaying the vblank disable until you hit it?
> 
> For Radeon there is such an irq, but iirc we had some discussions on
> this, also with Alex Deucher, a while ago and some irq's weren't
> considered very reliable, or already used for other stuff. The idea
> i had goes like this:
> 
> Use the crtc scanout position queries together with the vblank
> counter queries inside some calibration loop, maybe executed after
> each modeset, to find out the scanline range in which the hardware
> vblank counter increments -- basically a forbidden range of scanline
> positions where the race would happen. Then at each vblank off/on,
> query scanout position before and after the hw vblank counter query.
> If according to the scanout positions the vblank counter query
> happened within the forbidden time window, retry the query. With a
> well working calibration that should add no delay in most cases and
> a delay to the on/off code of a few dozen microseconds (=duration of
> a few scanlines) worst case.

Assuming we're sleeping rather than busy-looping, that's certainly ok. 
My previous experiments with radeon indicated that the scanout irq was 
certainly not entirely reliable - on the other hand, I was trying to use 
it for completing memory reclocking within the vblank interval. It was 
typically still within a few scanlines, so a sanity check there wouldn't 
pose too much of a problem.

> >I've no problem with all of this work being case by case.
> 
> Me neither. I just say if you'd go to the extreme and disable vblank
> irq's immediately with zero delay, without reliably fixing the
> problem i mentioned, you'd get those off by one counts, which would
> be very bad for apps that rely on precise timing. Doing so would
> basically make the whole oml_sync_control implementation mostly
> useless.

Right. My testing of sandybridge suggests that there wasn't a problem 
here - even with the ping-ponging I was reliably getting 60 interrupts 
in 60 seconds, with the counter incrementing by 1 each time. I certainly 
wouldn't push to enable it elsewhere without making sure that the 
results are reliable.

> >If vblanks are disabled and then re-enabled between 1 and 3,
> >what's the

Strange effect with i915 backlight controller

2011-11-16 Thread Takashi Iwai
At Wed, 16 Nov 2011 13:58:57 +0100,
Daniel Mack wrote:
> 
> On 11/14/2011 11:39 AM, Takashi Iwai wrote:
> > OK, then perhaps a better fix is to change the check to be equivalent
> > with pineview, as you mentioned in the original post.  The handling of
> > bit 0 for old chips was lost during the refactoring of backlight code
> > since 2.6.37.
> > 
> > Does the patch below work for you?
> > 
> > The only concern by this fix is that it changes the max value.  If
> > apps expect some certain (e.g. recorded) value, it may screw up.  But
> > I don't expect this would happen with sane apps.
> 
> Works perfectly - let's ship it :)

OK, now the patch was resent to intel-gfx with proper tags.


thanks,

Takashi


[RFC] Reduce idle vblank wakeups

2011-11-16 Thread Michel Dänzer
On Mit, 2011-11-16 at 09:20 -0500, Matthew Garrett wrote: 
> The drm core currently waits 5 seconds from userspace dropping a request
> for vblanks to vblanks actually being disabled. This appears to be a
> workaround for broken hardware, but results in a mostly idle desktop
> generating a huge number of wakeups that are entirely unnecessary but which
> consume measurable amounts of power. This patchset makes the vblank timeout
> per-device rather than global, making it easy for drivers to override the
> behaviour without compromising other graphics hardware in the system. It
> then removes the delay on Intel hardware. I've tested this successfully on
> Sandybridge without any evidence of spurious or missing irqs, but I don't
> know how well it works on older hardware. Feedback not only welcome, but
> positively desired.

Have you discussed this with Mario Kleiner (CC'd)?

I thought the main reason for the delay wasn't broken hardware but to
avoid constantly ping-ponging the vblank IRQ between on and off with
apps which regularly need the vblank counter value, as that could make
the counter unreliable. Maybe I'm misremembering though.

Even if I'm not, lowering the delay shouldn't be a problem, so long as
it's long enough that at least apps which need the vblank counter every
or every other frame don't cause the IRQ to ping-pong. But that
shouldn't depend on the hardware.
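
To put numbers on "long enough", here is a small sketch that converts
between an off delay and the number of refresh cycles it spans. The helper
names are hypothetical and a fixed refresh rate is assumed; nothing here is
taken from the patchset.

```c
#include <stdint.h>

/* Hypothetical helper: smallest delay in ms that spans `frames` refresh
 * cycles at refresh_hz (rounded up). */
static uint32_t min_off_delay_ms(uint32_t refresh_hz, uint32_t frames)
{
    return (frames * 1000u + refresh_hz - 1) / refresh_hz;
}

/* Hypothetical helper: number of complete refresh cycles within delay_ms. */
static uint32_t frames_covered(uint32_t refresh_hz, uint32_t delay_ms)
{
    return delay_ms * refresh_hz / 1000u;
}
```

At 60 Hz, covering two frames needs at least 34 ms, a 50 ms delay spans 3
full refresh cycles, and the current 5 second delay spans 300.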


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[PATCH 00/23] kill drm cruft with fire

2011-11-16 Thread Daniel Vetter
On Mon, Nov 14, 2011 at 17:10, James Simmons  wrote:
>> > Should I test this set of patches for the VIA driver or wait until you
>> > have a second version of this patch?
>>
>> Testing this on via would be awesome! Iirc I haven't changed anything in
>> the via specific patches, but if it's more convenient you can also
>> directly test my branch:
>>
>> http://cgit.freedesktop.org/~danvet/drm/log/?h=kill-with-fire
>
> Okay I tried the patches and it locked up the openchrome X server. I'm
> going to try your branch tonight to see if it makes any difference. If it
> still fails I will have to track down what the problem is.

If you can bisect the issue, that would be awesome. Meanwhile my sis
card arrived, so I'll hopefully get around to testing that part of the
series rsn. I'm traveling atm though, so response time will suffer a
bit.
-Daniel
-- 
Daniel Vetter
daniel.vetter at ffwll.ch - +41 (0) 79 364 57 48 - http://blog.ffwll.ch


[Bug 42997] [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

--- Comment #4 from Thomas Wendt 2011-11-16 09:15:13 PST ---
Created attachment 53608
  --> https://bugs.freedesktop.org/attachment.cgi?id=53608
dmesg output

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42997] [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

--- Comment #3 from Alex Deucher 2011-11-16 09:13:24 PST ---
Please attach your dmesg output.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42997] [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53605|text/x-log                  |text/plain
          mime type|                            |
  Attachment #53605|0                           |1
           is patch|                            |

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[RFC] Reduce idle vblank wakeups

2011-11-16 Thread Matthew Garrett
On Wed, Nov 16, 2011 at 06:03:15PM +0100, Michel Dänzer wrote:

> I thought the main reason for the delay wasn't broken hardware but to
> avoid constantly ping-ponging the vblank IRQ between on and off with
> apps which regularly need the vblank counter value, as that could make
> the counter unreliable. Maybe I'm misremembering though.

If turning it on and off results in the counter value being wrong then 
surely that's a hardware problem? I've tested that turning it on and off 
between every IRQ still gives valid counter values on sandybridge.
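
The test procedure itself is not shown in this thread. As a rough sketch,
assuming the counter is sampled once per vblank IRQ, a check like the one
below would flag any step that is not exactly +1. The function name is
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the sample-to-sample steps that are not exactly +1.
 * Unsigned subtraction keeps the check valid across 32-bit wraparound.
 */
static size_t count_bad_steps(const uint32_t *samples, size_t n)
{
    size_t bad = 0;

    for (size_t i = 1; i < n; i++)
        if ((uint32_t)(samples[i] - samples[i - 1]) != 1u)
            bad++;
    return bad;
}
```

A result of zero over a long run is what "valid counter values" would look
like under this assumption.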

-- 
Matthew Garrett | mjg59 at srcf.ucam.org


[Bug 42997] [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

--- Comment #2 from Thomas Wendt 2011-11-16 09:05:23 PST ---
Created attachment 53607
  --> https://bugs.freedesktop.org/attachment.cgi?id=53607
Screenshot of the corruption

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42997] [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

--- Comment #1 from Thomas Wendt 2011-11-16 09:04:56 PST ---
Created attachment 53606
  --> https://bugs.freedesktop.org/attachment.cgi?id=53606
Screenshot of the corruption

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42997] New: [R600] Corruption after resume from suspend to ram

2011-11-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=42997

 Bug #: 42997
   Summary: [R600] Corruption after resume from suspend to ram
Classification: Unclassified
   Product: DRI
   Version: XOrg CVS
  Platform: x86-64 (AMD64)
OS/Version: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: DRM/Radeon
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: thoemy at gmail.com


Created attachment 53605
  --> https://bugs.freedesktop.org/attachment.cgi?id=53605
Xorg log showing start, suspend and resume

After resuming from suspend to ram I experience corruptions on the whole
screen. I'm using a compositing desktop with gnome-shell but the corruption is
also visible in the text consoles. I can recover from the corruption and
continue work normally if I change the resolution or screen orientation.
This bug is present in the kernel versions 3.0 - 3.2rc2 and I think also in
some releases before them. The graphics card is a HD3870, X.Org X Server
1.11.1.902
The radeon module is loaded with dynclks=1 audio=1

Attached are pictures of the corruption and my Xorg.log. Any other
information I can provide?

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42998] [r600g] Regression: EVE Online graphics borked again (bisected)

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42998

--- Comment #1 from Vadim  2011-11-16 16:47:41 UTC ---
This patch should help:
http://lists.freedesktop.org/archives/mesa-dev/2011-November/014688.html

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[PATCH] drm/i915: Hook up Ivybridge eDP

2011-11-16 Thread Keith Packard
The Ivybridge eDP control register looks like a cross between a
Cougarpoint PCH DP control register and a Sandybridge eDP control
register.

Where things trivially match, share the code. Where there are any
tricky bits, just split things out into two obviously separate code paths.

Signed-off-by: Keith Packard 
---
 drivers/gpu/drm/i915/i915_reg.h |   18 +
 drivers/gpu/drm/i915/intel_dp.c |  151 ++-
 2 files changed, 135 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b080cc8..43f27ad 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3447,6 +3447,24 @@
 #define  EDP_LINK_TRAIN_800_1200MV_0DB_SNB_B   (0x38<<22)
 #define  EDP_LINK_TRAIN_VOL_EMP_MASK_SNB   (0x3f<<22)
 
+/* IVB */
+#define EDP_LINK_TRAIN_400MV_0DB_IVB   (0x24 <<22)
+#define EDP_LINK_TRAIN_400MV_3_5DB_IVB (0x2a <<22)
+#define EDP_LINK_TRAIN_400MV_6DB_IVB   (0x2f <<22)
+#define EDP_LINK_TRAIN_600MV_0DB_IVB   (0x30 <<22)
+#define EDP_LINK_TRAIN_600MV_3_5DB_IVB (0x36 <<22)
+#define EDP_LINK_TRAIN_800MV_0DB_IVB   (0x38 <<22)
+#define EDP_LINK_TRAIN_800MV_3_5DB_IVB (0x33 <<22)
+
+/* legacy values */
+#define EDP_LINK_TRAIN_500MV_0DB_IVB   (0x00 <<22)
+#define EDP_LINK_TRAIN_1000MV_0DB_IVB  (0x20 <<22)
+#define EDP_LINK_TRAIN_500MV_3_5DB_IVB (0x02 <<22)
+#define EDP_LINK_TRAIN_1000MV_3_5DB_IVB(0x22 <<22)
+#define EDP_LINK_TRAIN_1000MV_6DB_IVB  (0x23 <<22)
+
+#define  EDP_LINK_TRAIN_VOL_EMP_MASK_IVB   (0x3f<<22)
+
 #define  FORCEWAKE 0xA18C
 #define  FORCEWAKE_ACK 0x130090
 
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index ec28aeb..f63c6b2 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -361,8 +361,8 @@ intel_dp_aux_ch(struct intel_dp *intel_dp,
 * clock divider.
 */
if (is_cpu_edp(intel_dp)) {
-   if (IS_GEN6(dev))
-   aux_clock_divider = 200; /* SNB eDP input clock at 
400Mhz */
+   if (IS_GEN6(dev) || IS_GEN7(dev))
+   aux_clock_divider = 200; /* SNB & IVB eDP input clock 
at 400Mhz */
else
aux_clock_divider = 225; /* eDP input clock at 450Mhz */
} else if (HAS_PCH_SPLIT(dev))
@@ -816,10 +816,11 @@ intel_dp_mode_set(struct drm_encoder *encoder, struct 
drm_display_mode *mode,
}
 
/*
-* There are three kinds of DP registers:
+* There are four kinds of DP registers:
 *
 *  IBX PCH
-*  CPU
+*  SNB CPU
+*  IVB CPU
 *  CPT PCH
 *
 * IBX PCH and CPU are the same for almost everything,
@@ -872,7 +873,26 @@ intel_dp_mode_set(struct drm_encoder *encoder, struct 
drm_display_mode *mode,
 
/* Split out the IBX/CPU vs CPT settings */
 
-   if (!HAS_PCH_CPT(dev) || is_cpu_edp(intel_dp)) {
+   if (is_cpu_edp(intel_dp) && IS_GEN7(dev)) {
+   if (adjusted_mode->flags & DRM_MODE_FLAG_PHSYNC)
+   intel_dp->DP |= DP_SYNC_HS_HIGH;
+   if (adjusted_mode->flags & DRM_MODE_FLAG_PVSYNC)
+   intel_dp->DP |= DP_SYNC_VS_HIGH;
+   intel_dp->DP |= DP_LINK_TRAIN_OFF_CPT;
+
+   if (intel_dp->link_configuration[1] & 
DP_LANE_COUNT_ENHANCED_FRAME_EN)
+   intel_dp->DP |= DP_ENHANCED_FRAMING;
+
+   intel_dp->DP |= intel_crtc->pipe << 29;
+
+   /* don't miss out required setting for eDP */
+   intel_dp->DP |= DP_PLL_ENABLE;
+   if (adjusted_mode->clock < 20)
+   intel_dp->DP |= DP_PLL_FREQ_160MHZ;
+   else
+   intel_dp->DP |= DP_PLL_FREQ_270MHZ;
+   
+   } else if (!HAS_PCH_CPT(dev) || is_cpu_edp(intel_dp)) {
intel_dp->DP |= intel_dp->color_range;
 
if (adjusted_mode->flags & DRM_MODE_FLAG_PHSYNC)
@@ -1374,34 +1394,60 @@ static char *link_train_names[] = {
  * These are source-specific values; current Intel hardware supports
  * a maximum voltage of 800mV and a maximum pre-emphasis of 6dB
  */
-#define I830_DP_VOLTAGE_MAXDP_TRAIN_VOLTAGE_SWING_800
-#define I830_DP_VOLTAGE_MAX_CPTDP_TRAIN_VOLTAGE_SWING_1200
 
 static uint8_t
-intel_dp_pre_emphasis_max(uint8_t voltage_swing)
+intel_dp_voltage_max(struct intel_dp *intel_dp)
 {
-   switch (voltage_swing & DP_TRAIN_VOLTAGE_SWING_MASK) {
-   case DP_TRAIN_VOLTAGE_SWING_400:
-   return DP_TRAIN_PRE_EMPHASIS_6;
-   case DP_TRAIN_VOLTAGE_SWING_600:
-   return DP_TRAIN_PRE_EMPHASIS_6;
-   case DP_TRAIN_VOLTAGE_SWING_800:
-   return DP_TRAIN_PRE_EMPHASIS_3_5;
-   case DP_

Re: [RFC] Reduce idle vblank wakeups

2011-11-16 Thread Mario Kleiner

On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:


> On Wed, Nov 16, 2011 at 07:27:51PM +0100, Mario Kleiner wrote:
>
> > It's not broken hardware, but fast ping-ponging it on and off can
> > make the vblank counter and vblank timestamps unreliable for apps
> > that need high timing precision, especially for the ones that use
> > the OML_sync_control extensions for precise swap scheduling. My
> > target application is vision science and neuroscience, where
> > (sub-)milliseconds often matter for visual stimulation.
>
> I'll admit that I'm struggling to understand the issue here. If the
> vblank counter is incremented at the time of vblank (which isn't the
> case for radeon, it seems, but as far as I can tell is the case for
> Intel) then how does ping-ponging the IRQ matter?
> vblank_disable_and_save() appears to handle this case.



The drm software vblank counter which is used for scheduling swaps  
etc. gets incremented in the gpu's vblank irq handler. The gpu's  
hardware counter gets incremented somewhere inside the vblank  
interval. The increments don't happen at the same point in time. The  
race is between the vblank on/off code, which gets scheduled either  
by the timeout timer for the "off" case, or by usercode for the "on"  
case and the gpu's hardware vblank counter.


The existing code uses a lock (vblank_time_lock) to prevent some  
races between itself and the vblank irq handler, but it can't "lock"  
the gpu, so if the enable/disable code executes while the gpu is  
scanning out the vblank interval, it is undefined if the final vblank  
count will be correct or off by one. Vblank duration is somewhere up  
to 4.5% of refresh interval duration, so there's your up to 4% chance  
of getting it wrong.


If one could reliably avoid enabling/disabling in the problematic  
time period, the problem would go away.
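
As a rough illustration of where a figure like 4% comes from: if the on/off
call arrives at a time unrelated to scanout, the chance of it landing inside
vblank is about the share of the refresh cycle spent in vblank. The helper
below is hypothetical, and the 1080 of 1125 lines timing is an assumed
example, not a number from the drivers discussed here.

```c
#include <stdint.h>

/* Share of a refresh cycle spent in vertical blanking, in parts per
 * thousand, from the active and total line counts of a mode. */
static uint32_t vblank_permille(uint32_t vdisplay, uint32_t vtotal)
{
    return (vtotal - vdisplay) * 1000u / vtotal;
}
```

With 1080 active lines out of 1125 total this gives 40 permille, i.e. 4%.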



> > I think making the vblank off delay driver specific via these
> > patches is a good idea. Lowering the timeout to something like a few
> > refresh cycles, maybe somewhere between 50 msecs and 100 msecs would
> > be also fine by me. I still would like to keep some drm config
> > option to disable or override the vblank off delay by users.
>
> Does the timeout serve any purpose other than letting software
> effectively prevent vblanks from being disabled?


With perfect drivers and gpu's in a perfect world, no. In reality  
there's the race i described above, and nouveau and all other drivers  
except intel and radeon. The vblank irq also drives timestamping of  
vblanks, one update per vblank. The timestamps are cached if a client  
needs them inbetween updates. Turning off vblank irq invalidates the  
timestamps. radeon and intel can recreate the timestamp anytime as  
needed, but nouveau lacks this atm., so timestamps remain invalid for  
a whole video refresh cycle after vblank irq on. We have patches for  
nouveau kms almost ready, so only the race mentioned above would remain.



> > The intel and radeon kms drivers implement everything that's needed
> > to make it mostly work. Except for a small race between the cpu and
> > gpu in the vblank_disable_and_save() function and
> > drm_update_vblank_count(). It can cause an off-by-one error when
> > reinitializing the drm vblank counter from the gpu's hardware
> > counter if the enable/disable function is called at the wrong moment
> > while the gpu's scanout is inside the vblank interval, see comments
> > in the code. I have some sketchy idea for a patch that could detect
> > when the race happens and retry hw counter queries to fix this.
> > Without that patch, there's some chance between 0% and 4% of being
> > off-by-one.
>
> For Radeon, I'd have thought you could handle this by scheduling an
> irq for the beginning of scanout (avivo has a register for that) and
> delaying the vblank disable until you hit it?


For Radeon there is such an irq, but iirc we had some discussions on  
this, also with Alex Deucher, a while ago and some irq's weren't  
considered very reliable, or already used for other stuff. The idea i  
had goes like this:


Use the crtc scanout position queries together with the vblank  
counter queries inside some calibration loop, maybe executed after  
each modeset, to find out the scanline range in which the hardware  
vblank counter increments -- basically a forbidden range of scanline  
positions where the race would happen. Then at each vblank off/on,  
query scanout position before and after the hw vblank counter query.  
If according to the scanout positions the vblank counter query  
happened within the forbidden time window, retry the query. With a  
well working calibration that should add no delay in most cases and a  
delay to the on/off code of a few dozen microseconds (=duration of a  
few scanlines) worst case.


With that and the pending nouveau patches in place i think we  
wouldn't need the vblank off delay anymore on such drivers.
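
The retry part of the idea above could look roughly like the sketch below.
It is not driver code: struct fake_crtc and its two query functions are
hypothetical stand-ins for the real scanout position and hw counter
queries, and the forbidden range is assumed to come from the calibration
step, which is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive scanline range in which the hw counter may increment. */
struct scanline_range {
    int first;
    int last;
};

/* Hypothetical stand-in for the hardware: scripted scanout positions. */
struct fake_crtc {
    const int *lines;   /* positions returned by successive queries */
    int n;              /* number of scripted positions */
    int next;           /* number of position queries made so far */
    uint32_t hw_count;  /* value the "hardware" counter reports */
};

static int fake_scanout_pos(struct fake_crtc *c)
{
    return c->lines[c->next++ % c->n];
}

static uint32_t fake_hw_count(struct fake_crtc *c)
{
    return c->hw_count;
}

/*
 * Read the hw counter, retrying while the read may have overlapped the
 * forbidden window. Returns false if no clean read happened in max_tries.
 */
static bool read_count_outside_window(struct fake_crtc *c,
                                      const struct scanline_range *forbidden,
                                      int max_tries, uint32_t *count)
{
    for (int i = 0; i < max_tries; i++) {
        int before = fake_scanout_pos(c);
        uint32_t val = fake_hw_count(c);
        int after = fake_scanout_pos(c);

        /*
         * Accept only if scanout did not wrap into a new frame and both
         * positions lie on the same side of the forbidden range.
         */
        if (after >= before &&
            (after < forbidden->first || before > forbidden->last)) {
            *count = val;
            return true;
        }
    }
    return false;
}
```

The sketch retries immediately; whether a real version should wait between
tries is left open here.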



> > On current nouveau kms, disabling vblank irqs guarantees you wrong
> > vblank counts and wrong vbl

[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #3 from interwe...@yahoo.ca 2011-11-16 15:03:59 PST ---
Created attachment 53613
  --> https://bugs.freedesktop.org/attachment.cgi?id=53613
xorg log

Here is my xorg log.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #2 from interwe...@yahoo.ca 2011-11-16 15:00:15 PST ---
Created attachment 53612
  --> https://bugs.freedesktop.org/attachment.cgi?id=53612
dmesg output

Here is my dmesg output.



[PATCH 08/12] gma500: Add the core DRM files and headers

2011-11-16 Thread Alan Cox
> So generally we need to provide a userspace interface via ioctls, we
> do this with a shared header file that goes in include/drm/ with the
> Kbuild bits

At the moment the only public API is the generic bits.

> Now I'm assuming psb_drm.h is meant to be this file? but as-is its not
> really what I'd expect,

Not yet.

> If you look at radeon you'll see defines for device specific ioctls,
> same for i915, now I note these tables start at 0 and work their way
> up,

I'll move the ones we want to zero.

> Now if I understand the holes in this are due to some old userspace
> code that is probably broken anyway,

Yep.. but really I can kill them off. The total number of people affected
will be < 10.

> willing to listen to why this is a bad plan, but I'd rather not push
> psb_drm.h into kernel header packages and libdrm if there is a
> conflicting one in existence (or conflicting 6).

Last time I looked there were conflicts between the various IMG based
Intel ones, let alone with anything else - lost cause, so we might as well
do it right.

We could use a new namespace - that might be smarter, in fact.

> > +#define PSB_BO_FLAG_COMMAND         (1ULL << 52)
> 
> Any reason it start at 52?

That may be obsolete

> 
> > +struct drm_psb_mode_operation_arg {
> > +       u32 obj_id;
> > +       u16 operation;
> > +       struct drm_mode_modeinfo mode;
> > +       void *data;
> > +};
> 
> No void * in ioctl args, no u16 in ioctls args, not really sure what a
> mode "operation" is here either. Try and design the ioctls to avoid
> compat wrapper requirements.
> Maybe move the operation flags up here.

Again I need to see if we can simply kill it off
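
A minimal sketch of the review point quoted above (no void * and no u16 in ioctl arguments), assuming a hypothetical argument struct; ex_mode_op_arg and the two helpers are illustrative names, not part of the gma500 API. The goal is a layout that is identical for 32-bit and 64-bit user space, so that no compat wrapper is needed: only fixed-width fields, no implicit padding, and user pointers carried as 64-bit integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument: same size and offsets on 32-bit and 64-bit. */
struct ex_mode_op_arg {
    uint32_t obj_id;
    uint32_t operation; /* widened from u16 so there is no hidden padding */
    uint64_t data;      /* user pointer carried as a 64-bit integer */
};

/* Compile-time layout checks; these hold on the common ILP32 and LP64 ABIs. */
static_assert(sizeof(struct ex_mode_op_arg) == 16, "unexpected padding");
static_assert(offsetof(struct ex_mode_op_arg, data) == 8, "data misplaced");

/* User space stores its pointer through uintptr_t ... */
static uint64_t ex_ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* ... and the consumer converts it back the same way. */
static void *ex_u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

The "[PATCH 1/4] gma500: begin pruning dead bits of API" diff in this thread makes the pointer half of this change by turning "void *data" into "u64 data".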

> > +
> > +struct drm_psb_stolen_memory_arg {
> > +       u32 base;
> > +       u32 size;
> > +};
> 
> Why does someone care about this? is it a workaround for lack of
> decent memory management?

A smart GMA500 aware server can allocate GEM objects from host memory or
from the stolen memory area. Being smart about it really means knowing
how much stolen memory is so it can see how best to use it.

> > +/* Controlling the kernel modesetting buffers */
> > +
> > +#define DRM_PSB_SIZES           0x07
> > +#define DRM_PSB_FUSE_REG        0x08
> > +#define DRM_PSB_DC_STATE        0x0A
> > +#define DRM_PSB_ADB             0x0B
> > +#define DRM_PSB_MODE_OPERATION  0x0C
> > +#define DRM_PSB_STOLEN_MEMORY   0x0D
> > +#define DRM_PSB_REGISTER_RW     0x0E
> 
> Direct read/writing of registers is not something, the regs being hit
> here seem like workarounds for not having good overlay support in the
> kernel perhaps,
> can the new plane stuff help workaround this?

That can go - it's basically left for debugging.

> So you can probably consider these patches merged at least, I'll
> just keep them in a topic branch which I'll stick into the drm-next
> queue.

Cool.

I'll take the secateurs to the interfaces.



[PATCH 4/4] gma500: Move the API

2011-11-16 Thread Alan Cox
From: Alan Cox 

Finally move the API where it can be seen

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/cdv_device.c  |2 -
 drivers/gpu/drm/gma500/gem.c |2 -
 drivers/gpu/drm/gma500/intel_bios.c  |2 -
 drivers/gpu/drm/gma500/mid_bios.c|2 -
 drivers/gpu/drm/gma500/oaktrail_device.c |2 -
 drivers/gpu/drm/gma500/psb_device.c  |2 -
 drivers/gpu/drm/gma500/psb_drm.h |   91 --
 drivers/gpu/drm/gma500/psb_drv.c |2 -
 drivers/gpu/drm/gma500/psb_drv.h |2 -
 include/drm/gma_drm.h|   91 ++
 10 files changed, 99 insertions(+), 99 deletions(-)
 delete mode 100644 drivers/gpu/drm/gma500/psb_drm.h
 create mode 100644 include/drm/gma_drm.h


diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c
index 87614e0..c0583df 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/gem.c b/drivers/gpu/drm/gma500/gem.c
index d743679..fdc8b5d 100644
--- a/drivers/gpu/drm/gma500/gem.c
+++ b/drivers/gpu/drm/gma500/gem.c
@@ -25,7 +25,7 @@
 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 
 int psb_gem_init_object(struct drm_gem_object *obj)
diff --git a/drivers/gpu/drm/gma500/intel_bios.c b/drivers/gpu/drm/gma500/intel_bios.c
index 096757f..d4d0c5b 100644
--- a/drivers/gpu/drm/gma500/intel_bios.c
+++ b/drivers/gpu/drm/gma500/intel_bios.c
@@ -20,7 +20,7 @@
  */
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_intel_drv.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/mid_bios.c b/drivers/gpu/drm/gma500/mid_bios.c
index 7115d1a..018ab46 100644
--- a/drivers/gpu/drm/gma500/mid_bios.c
+++ b/drivers/gpu/drm/gma500/mid_bios.c
@@ -25,7 +25,7 @@
 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "mid_bios.h"
 
diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c
index 41c418f..57ad3ea6 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -22,7 +22,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c
index 4659132..9d6959a 100644
--- a/drivers/gpu/drm/gma500/psb_device.c
+++ b/drivers/gpu/drm/gma500/psb_device.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include "psb_drm.h"
+#include "gma_drm.h"
 #include "psb_drv.h"
 #include "psb_reg.h"
 #include "psb_intel_reg.h"
diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
deleted file mode 100644
index 1136867..000
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ /dev/null
@@ -1,91 +0,0 @@
-/**
- * Copyright (c) 2007-2011, Intel Corporation.
- * All Rights Reserved.
- * Copyright (c) 2008, Tungsten Graphics Inc.  Cedar Park, TX., USA.
- * All Rights Reserved.
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License,
- * version 2, as published by the Free Software Foundation.
- *
- * This program is distributed in the hope it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
- *
- **/
-
-#ifndef _PSB_DRM_H_
-#define _PSB_DRM_H_
-
-/*
- * Manage the LUT for an output
- */
-struct drm_psb_dpst_lut_arg {
-   uint8_t lut[256];
-   int output_id;
-};
-
-/*
- * Validate modes
- */
-struct drm_psb_mode_operation_arg {
-   u32 obj_id;
-   u16 operation;
-   struct drm_mode_modeinfo mode;
-   u64 data;
-};
-
-/*
- * Query the stolen memory for smarter management of
- * memory by the server
- */
-struct drm_psb_stolen_memory_arg {
-   u32 base;
-   u32 size;
-};
-
-struct drm_psb_get_pipe_from_crtc_id_arg {
-   /** ID of CRTC being requested **/
-   u32 crtc_id;
-   /** pipe of requested CRTC **/
-   u32 pipe;
-};
-
-struct drm_psb_gem_create {
-   __u64 size;
-   __u32 handle;
-   __u32 flags;
-#define GMA_GEM_CREATE_STOLEN

[PATCH 3/4] gma500: kill off NUM_PIPE define

2011-11-16 Thread Alan Cox
From: Alan Cox 

We don't want this external in case someone adds more to the hardware. We
want it out of the ABI.

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/psb_drm.h |3 ---
 drivers/gpu/drm/gma500/psb_drv.h |2 ++
 2 files changed, 2 insertions(+), 3 deletions(-)


diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index 6ded343..1136867 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -22,9 +22,6 @@
 #ifndef _PSB_DRM_H_
 #define _PSB_DRM_H_
 
-#define PSB_NUM_PIPE 3
-
-
 /*
  * Manage the LUT for an output
  */
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 9567748..ffb05f2 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -258,6 +258,8 @@ struct psb_intel_opregion {
 
 struct psb_ops;
 
+#define PSB_NUM_PIPE   3
+
 struct drm_psb_private {
struct drm_device *dev;
const struct psb_ops *ops;



[PATCH 2/4] gma500: Rename the ioctls to avoid clashing with the legacy drivers

2011-11-16 Thread Alan Cox
From: Alan Cox 

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/gem.c |4 ++--
 drivers/gpu/drm/gma500/psb_drm.h |   20 ++--
 drivers/gpu/drm/gma500/psb_drv.c |   16 
 3 files changed, 20 insertions(+), 20 deletions(-)


diff --git a/drivers/gpu/drm/gma500/gem.c b/drivers/gpu/drm/gma500/gem.c
index 65fdd6b..d743679 100644
--- a/drivers/gpu/drm/gma500/gem.c
+++ b/drivers/gpu/drm/gma500/gem.c
@@ -274,13 +274,13 @@ int psb_gem_create_ioctl(struct drm_device *dev, void *data,
 {
struct drm_psb_gem_create *args = data;
int ret;
-   if (args->flags & PSB_GEM_CREATE_STOLEN) {
+   if (args->flags & GMA_GEM_CREATE_STOLEN) {
ret = psb_gem_create_stolen(file, dev, args->size,
&args->handle);
if (ret == 0)
return 0;
/* Fall throguh */
-   args->flags &= ~PSB_GEM_CREATE_STOLEN;
+   args->flags &= ~GMA_GEM_CREATE_STOLEN;
}
return psb_gem_create(file, dev, args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index 72eeb7a..6ded343 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -63,7 +63,7 @@ struct drm_psb_gem_create {
__u64 size;
__u32 handle;
__u32 flags;
-#define PSB_GEM_CREATE_STOLEN  1   /* Stolen memory can be used */
+#define GMA_GEM_CREATE_STOLEN  1   /* Stolen memory can be used */
 };
 
 struct drm_psb_gem_mmap {
@@ -79,15 +79,15 @@ struct drm_psb_gem_mmap {
 
 /* Controlling the kernel modesetting buffers */
 
-#define DRM_PSB_GEM_CREATE 0x00/* Create a GEM object */
-#define DRM_PSB_GEM_MMAP   0x01/* Map GEM memory */
-#define DRM_PSB_STOLEN_MEMORY  0x02/* Report stolen memory */
-#define DRM_PSB_2D_OP  0x03/* Will be merged later */
-#define DRM_PSB_GAMMA  0x04/* Set gamma table */
-#define DRM_PSB_ADB0x05/* Get backlight */
-#define DRM_PSB_DPST_BL0x06/* Set backlight */
-#define DRM_PSB_GET_PIPE_FROM_CRTC_ID 0x1  /* CRTC to physical pipe# */
-#define DRM_PSB_MODE_OPERATION 0x07/* Mode validation/DC set */
+#define DRM_GMA_GEM_CREATE 0x00/* Create a GEM object */
+#define DRM_GMA_GEM_MMAP   0x01/* Map GEM memory */
+#define DRM_GMA_STOLEN_MEMORY  0x02/* Report stolen memory */
+#define DRM_GMA_2D_OP  0x03/* Will be merged later */
+#define DRM_GMA_GAMMA  0x04/* Set gamma table */
+#define DRM_GMA_ADB0x05/* Get backlight */
+#define DRM_GMA_DPST_BL0x06/* Set backlight */
+#define DRM_GMA_GET_PIPE_FROM_CRTC_ID 0x1  /* CRTC to physical pipe# */
+#define DRM_GMA_MODE_OPERATION 0x07/* Mode validation/DC set */
 #definePSB_MODE_OPERATION_MODE_VALID   0x01
 
 
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 2d5050e..9294e71 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -81,27 +81,27 @@ MODULE_DEVICE_TABLE(pci, pciidlist);
  */
 
 #define DRM_IOCTL_PSB_ADB  \
-   DRM_IOWR(DRM_PSB_ADB + DRM_COMMAND_BASE, uint32_t)
+   DRM_IOWR(DRM_GMA_ADB + DRM_COMMAND_BASE, uint32_t)
 #define DRM_IOCTL_PSB_MODE_OPERATION   \
-   DRM_IOWR(DRM_PSB_MODE_OPERATION + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_MODE_OPERATION + DRM_COMMAND_BASE, \
 struct drm_psb_mode_operation_arg)
 #define DRM_IOCTL_PSB_STOLEN_MEMORY\
-   DRM_IOWR(DRM_PSB_STOLEN_MEMORY + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_STOLEN_MEMORY + DRM_COMMAND_BASE, \
 struct drm_psb_stolen_memory_arg)
 #define DRM_IOCTL_PSB_GAMMA\
-   DRM_IOWR(DRM_PSB_GAMMA + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GAMMA + DRM_COMMAND_BASE, \
 struct drm_psb_dpst_lut_arg)
 #define DRM_IOCTL_PSB_DPST_BL  \
-   DRM_IOWR(DRM_PSB_DPST_BL + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_DPST_BL + DRM_COMMAND_BASE, \
 uint32_t)
 #define DRM_IOCTL_PSB_GET_PIPE_FROM_CRTC_ID\
-   DRM_IOWR(DRM_PSB_GET_PIPE_FROM_CRTC_ID + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GET_PIPE_FROM_CRTC_ID + DRM_COMMAND_BASE, \
 struct drm_psb_get_pipe_from_crtc_id_arg)
 #define DRM_IOCTL_PSB_GEM_CREATE   \
-   DRM_IOWR(DRM_PSB_GEM_CREATE + DRM_COMMAND_BASE, \
+   DRM_IOWR(DRM_GMA_GEM_CREATE + DRM_COMMAND_BASE, \
 struct drm_psb_gem_create)
 #define DRM_IOCTL_PSB_GEM_MMAP \
-   DRM_IOWR(DRM_PSB_GEM_MMAP + DRM_COMMAND_BASE, \
+  

[PATCH 1/4] gma500: begin pruning dead bits of API

2011-11-16 Thread Alan Cox
From: Alan Cox 

At this point we won't add an external set of definitions. We want to get
everything out before we admit to a public API beyond the standardised
ones.

Signed-off-by: Alan Cox 
---

 drivers/gpu/drm/gma500/psb_drm.h |  159 ++--
 drivers/gpu/drm/gma500/psb_drv.c |  507 --
 drivers/gpu/drm/gma500/psb_drv.h |2 
 3 files changed, 24 insertions(+), 644 deletions(-)


diff --git a/drivers/gpu/drm/gma500/psb_drm.h b/drivers/gpu/drm/gma500/psb_drm.h
index dca7b20..72eeb7a 100644
--- a/drivers/gpu/drm/gma500/psb_drm.h
+++ b/drivers/gpu/drm/gma500/psb_drm.h
@@ -24,168 +24,41 @@
 
 #define PSB_NUM_PIPE 3
 
-#define PSB_GPU_ACCESS_READ (1ULL << 32)
-#define PSB_GPU_ACCESS_WRITE(1ULL << 33)
-#define PSB_GPU_ACCESS_MASK (PSB_GPU_ACCESS_READ | PSB_GPU_ACCESS_WRITE)
-
-#define PSB_BO_FLAG_COMMAND (1ULL << 52)
 
 /*
- * Feedback components:
+ * Manage the LUT for an output
  */
-
-struct drm_psb_sizes_arg {
-   u32 ta_mem_size;
-   u32 mmu_size;
-   u32 pds_size;
-   u32 rastgeom_size;
-   u32 tt_size;
-   u32 vram_size;
-};
-
 struct drm_psb_dpst_lut_arg {
uint8_t lut[256];
int output_id;
 };
 
-#define PSB_DC_CRTC_SAVE 0x01
-#define PSB_DC_CRTC_RESTORE 0x02
-#define PSB_DC_OUTPUT_SAVE 0x04
-#define PSB_DC_OUTPUT_RESTORE 0x08
-#define PSB_DC_CRTC_MASK 0x03
-#define PSB_DC_OUTPUT_MASK 0x0C
-
-struct drm_psb_dc_state_arg {
-   u32 flags;
-   u32 obj_id;
-};
-
+/*
+ * Validate modes
+ */
 struct drm_psb_mode_operation_arg {
u32 obj_id;
u16 operation;
struct drm_mode_modeinfo mode;
-   void *data;
+   u64 data;
 };
 
+/*
+ * Query the stolen memory for smarter management of
+ * memory by the server
+ */
 struct drm_psb_stolen_memory_arg {
u32 base;
u32 size;
 };
 
-/*Display Register Bits*/
-#define REGRWBITS_PFIT_CONTROLS(1 << 0)
-#define REGRWBITS_PFIT_AUTOSCALE_RATIOS(1 << 1)
-#define REGRWBITS_PFIT_PROGRAMMED_SCALE_RATIOS (1 << 2)
-#define REGRWBITS_PIPEASRC (1 << 3)
-#define REGRWBITS_PIPEBSRC (1 << 4)
-#define REGRWBITS_VTOTAL_A (1 << 5)
-#define REGRWBITS_VTOTAL_B (1 << 6)
-#define REGRWBITS_DSPACNTR (1 << 8)
-#define REGRWBITS_DSPBCNTR (1 << 9)
-#define REGRWBITS_DSPCCNTR (1 << 10)
-
-/*Overlay Register Bits*/
-#define OV_REGRWBITS_OVADD (1 << 0)
-#define OV_REGRWBITS_OGAM_ALL  (1 << 1)
-
-#define OVC_REGRWBITS_OVADD  (1 << 2)
-#define OVC_REGRWBITS_OGAM_ALL (1 << 3)
-
-struct drm_psb_register_rw_arg {
-   u32 b_force_hw_on;
-
-   u32 display_read_mask;
-   u32 display_write_mask;
-
-   struct {
-   u32 pfit_controls;
-   u32 pfit_autoscale_ratios;
-   u32 pfit_programmed_scale_ratios;
-   u32 pipeasrc;
-   u32 pipebsrc;
-   u32 vtotal_a;
-   u32 vtotal_b;
-   } display;
-
-   u32 overlay_read_mask;
-   u32 overlay_write_mask;
-
-   struct {
-   u32 OVADD;
-   u32 OGAMC0;
-   u32 OGAMC1;
-   u32 OGAMC2;
-   u32 OGAMC3;
-   u32 OGAMC4;
-   u32 OGAMC5;
-   u32 IEP_ENABLED;
-   u32 IEP_BLE_MINMAX;
-   u32 IEP_BSSCC_CONTROL;
-   u32 b_wait_vblank;
-   } overlay;
-
-   u32 sprite_enable_mask;
-   u32 sprite_disable_mask;
-
-   struct {
-   u32 dspa_control;
-   u32 dspa_key_value;
-   u32 dspa_key_mask;
-   u32 dspc_control;
-   u32 dspc_stride;
-   u32 dspc_position;
-   u32 dspc_linear_offset;
-   u32 dspc_size;
-   u32 dspc_surface;
-   } sprite;
-
-   u32 subpicture_enable_mask;
-   u32 subpicture_disable_mask;
-};
-
-/* Controlling the kernel modesetting buffers */
-
-#define DRM_PSB_SIZES   0x07
-#define DRM_PSB_FUSE_REG   0x08
-#define DRM_PSB_DC_STATE   0x0A
-#define DRM_PSB_ADB0x0B
-#define DRM_PSB_MODE_OPERATION 0x0C
-#define DRM_PSB_STOLEN_MEMORY  0x0D
-#define DRM_PSB_REGISTER_RW0x0E
-
-/*
- * NOTE: Add new commands here, but increment
- * the values below and increment their
- * corresponding defines where they're
- * defined elsewhere.
- */
-
-#define DRM_PSB_GEM_CREATE 0x10
-#define DRM_PSB_2D_OP  0x11/* Will be merged later */
-#define DRM_PSB_GEM_MMAP   0x12
-#define DRM_PSB_DPST   0x1B
-#define DRM_PSB_GAMMA  0x1C
-#define DRM_PSB_DPST_BL0x1D
-#define DRM_PSB_GET_PIPE_FROM_CRTC_ID 0x1F
-
-#define PSB_MODE_OPERATION_MODE_VALID  0x01
-#define PSB_MODE_OPERATION_SET_DC_BASE  0x02
-
 struct drm_psb_get_pipe_from_crtc_id_arg {
/** ID

Re: [PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread Ville Syrjälä
On Wed, Nov 16, 2011 at 08:16:31PM +0100, Ilyes Gouta wrote:
> Hi Ville,
> 
> Regarding 3 plane YCbCr, DRM_FORMAT_yuv444 (non sub-sampled YCbCr)
> would also be useful.

Yeah I was wondering whether to add that. So far I've not run into
hardware which can eat that, so I left it out for now.

Packed 4:4:4 is supported by the overlays in most ATI chips IIRC,
so adding both packed and planar 4:4:4 formats would likely make
sense.

-- 
Ville Syrjälä
Intel OTC


[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #3 from Ian Romanick 2011-11-16 14:25:23 PST ---
If this was a recent change, I'll guess that it will bisect to my changes to
the way uniforms are handled.  I pushed a patch today that may restore previous
performance:

commit 010dc29283cfc7791a29ba8a0570d8f7f9edef05
Author: Ian Romanick 
Date:   Thu Nov 10 12:32:35 2011 -0800

mesa: Only update sampler uniforms that are used by the shader stage

Previously a vertex shader that used no samplers would get updated (by
calling the driver's ProgramStringNotify) when a sampler in the
fragment shader was updated.  This was discovered while investigating
some spurious code generation for shaders in Cogs.  The behavior in
Cogs is especially pessimal because it ping-pongs sampler uniform
settings:

glUniform1i(sampler1, 0);
glUniform1i(sampler2, 1);
draw();
glUniform1i(sampler1, 1);
glUniform1i(sampler2, 0);
draw();
glUniform1i(sampler1, 0);
glUniform1i(sampler2, 1);
draw();
// etc.

ProgramStringNotify is still too big of a hammer.  Applications like
Cogs will still defeat the shader cache.  A lighter-weight mechanism
that can work with the shader cache is needed.  However, this patch at
least restores the previous behavior.

Signed-off-by: Ian Romanick 
Reviewed-by: Kenneth Graunke 
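
A minimal sketch of the idea in the commit message above, not Mesa's actual code: every name here (ex_program, ex_set_sampler, the per-stage notify_count standing in for the driver's ProgramStringNotify call) is hypothetical. Each stage keeps a bitmask of the samplers it reads, and a sampler uniform update only notifies the stages whose mask contains that sampler.

```c
#include <assert.h>
#include <stdint.h>

enum { EX_STAGE_VERTEX, EX_STAGE_FRAGMENT, EX_STAGE_COUNT };

struct ex_program {
    uint32_t samplers_used[EX_STAGE_COUNT]; /* bit i set: stage reads sampler i */
    int sampler_unit[32];                   /* texture unit bound to each sampler */
    unsigned int notify_count[EX_STAGE_COUNT]; /* stand-in for driver notifications */
};

/* Update one sampler uniform and notify only the stages that use it. */
static void ex_set_sampler(struct ex_program *p, unsigned int sampler, int unit)
{
    assert(p && sampler < 32);

    p->sampler_unit[sampler] = unit;
    for (int s = 0; s < EX_STAGE_COUNT; s++) {
        if (p->samplers_used[s] & (UINT32_C(1) << sampler))
            p->notify_count[s]++;
    }
}
```

With the ping-pong pattern from the commit message, a vertex shader that reads no samplers is never notified in this model, while the fragment stage is still notified on every update, which matches the remark that the mechanism is still too big a hammer.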



Re: drm pixel formats update

2011-11-16 Thread Ville Syrjälä
On Wed, Nov 16, 2011 at 01:23:01PM -0800, Jesse Barnes wrote:
> On Wed, 16 Nov 2011 23:19:38 +0200
> Ville Syrjälä  wrote:
> > Oh and one extra detail just occurred to me regarding the three plane
> > formats. Should we even define formats for both the YUV vs. YVU
> > variant. Seeing as we now have independent handles and offsets for
> > each plane, we can make do with just one format definition.
> 
> Don't you still need to know the order?  I.e. what's in handle[1], U or
> V?

We could define it so that handle[1] is always Cb and handle[2] is Cr,
for example. Then it's up to user space to set the handles correctly,
which it has to do anyway.
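
A minimal sketch of what that convention would mean for user space, assuming a hypothetical per-plane description (ex_fb_planes is an illustrative name, not a DRM structure). A client holding a buffer laid out as Y, Cr, Cb only has to swap planes 1 and 2 before handing the description over, so a separate YVU format definition is not needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-plane framebuffer description: one handle, offset
 * and pitch per plane, with plane 0 = Y, plane 1 = Cb, plane 2 = Cr. */
struct ex_fb_planes {
    uint32_t handle[3];
    uint32_t offset[3];
    uint32_t pitch[3];
};

static void ex_swap_u32(uint32_t *a, uint32_t *b)
{
    uint32_t tmp = *a;

    *a = *b;
    *b = tmp;
}

/* Turn a Y, Cr, Cb (YVU) description into the canonical Y, Cb, Cr order. */
static void ex_yvu_to_canonical(struct ex_fb_planes *p)
{
    assert(p);

    ex_swap_u32(&p->handle[1], &p->handle[2]);
    ex_swap_u32(&p->offset[1], &p->offset[2]);
    ex_swap_u32(&p->pitch[1], &p->pitch[2]);
}
```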

-- 
Ville Syrjälä
Intel OTC


[PATCH] drm/radeon: introduce a sub allocator and convert ib pool to it

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Somewhat specialized sub-allocator designed to perform sub-allocation
for command buffers, not only for the current cs ioctl but for future command
submission ioctls as well. The patch also converts the current ib pool to use
the sub-allocator. The idea is that the ib pool buffer can be shared with other
command buffer submissions not having 64K granularity.

Signed-off-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/Makefile|2 +-
 drivers/gpu/drm/radeon/radeon.h|   66 --
 drivers/gpu/drm/radeon/radeon_object.h |   18 +++
 drivers/gpu/drm/radeon/radeon_ring.c   |  239 
 drivers/gpu/drm/radeon/radeon_sa.c |  186 +
 5 files changed, 346 insertions(+), 165 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_sa.c

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 94dcdc7..2139fe8 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -71,7 +71,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
r600_blit_kms.o radeon_pm.o atombios_dp.o r600_audio.o r600_hdmi.o \
evergreen.o evergreen_cs.o evergreen_blit_shaders.o evergreen_blit_kms.o \
radeon_trace_points.o ni.o cayman_blit_shaders.o atombios_encoders.o \
-   radeon_semaphore.o
+   radeon_semaphore.o radeon_sa.o

 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b85f8a9..267bd92 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -305,6 +305,53 @@ struct radeon_bo_list {
u32 tiling_flags;
 };

+/* sub-allocation manager, it has to be protected by another lock.
+ * By design this is a helper for other parts of the driver
+ * like the indirect buffer or semaphore, which both have their
+ * own locking.
+ *
+ * The principle is simple: we keep a list of sub-allocations in offset
+ * order (first entry has offset == 0, last entry has the highest
+ * offset).
+ *
+ * When allocating a new object we first check if there is room at
+ * the end: total_size - (last_object_offset + last_object_size) >=
+ * alloc_size. If so we allocate the new object there.
+ *
+ * When there is not enough room at the end, we start waiting for
+ * each sub object until we reach object_offset+object_size >=
+ * alloc_size; this object then becomes the sub object we return.
+ *
+ * Alignment can't be bigger than page size.
+ *
+ * Holes are not considered for allocation to keep things simple.
+ * The assumption is that there won't be holes (all objects on the same
+ * alignment).
+ */
+struct radeon_sa_manager {
+   struct radeon_bo*bo;
+   struct list_headsa_bo;
+   unsignedsize;
+   uint64_tgpu_addr;
+   void*cpu_ptr;
+};
+
+struct radeon_sa_bo;
+typedef void (*radeon_sa_bo_destroy_t)(struct radeon_device *rdev,
+  struct radeon_sa_bo *sa_bo);
+typedef bool (*radeon_sa_bo_done_t)(struct radeon_device *rdev,
+   struct radeon_sa_bo *sa_bo);
+
+/* sub-allocation buffer */
+struct radeon_sa_bo {
+   struct list_headlist;
+   struct radeon_sa_manager*manager;
+   unsignedoffset;
+   unsignedsize;
+   radeon_sa_bo_destroy_t  destroy;
+   radeon_sa_bo_done_t done;
+};
+
 /*
  * GEM objects.
  */
@@ -503,13 +550,12 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
 #define CAYMAN_RING_TYPE_CP2_INDEX 2

 struct radeon_ib {
-   struct list_headlist;
+   struct radeon_sa_bo sa_bo;
unsignedidx;
+   uint32_tlength_dw;
uint64_tgpu_addr;
-   struct radeon_fence *fence;
uint32_t*ptr;
-   uint32_tlength_dw;
-   boolfree;
+   struct radeon_fence *fence;
 };

 /*
@@ -517,12 +563,11 @@ struct radeon_ib {
  * mutex protects scheduled_ibs, ready, alloc_bm
  */
 struct radeon_ib_pool {
-   struct mutexmutex;
-   struct radeon_bo*robj;
-   struct list_headbogus_ib;
-   struct radeon_ibibs[RADEON_IB_POOL_SIZE];
-   boolready;
-   unsignedhead_id;
+   struct mutexmutex;
+   struct radeon_sa_managersa_manager;
+   struct radeon_ibibs[RADEON_IB_POOL_SIZE];
+   boolready;
+   unsignedhead_id;
 };

 struct radeon_ring {
@@ -601,7 +646,6 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void rad
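
A simplified sketch of the allocation policy described in the sub-allocation manager comment of the patch above. This is not the code from radeon_sa.c, which is not shown here; all names (ex_sa_manager, ex_sa_bo, ex_sa_alloc, ex_sa_done) are hypothetical, a fixed array replaces the kernel list, a "busy" flag replaces fences, and where the real code would wait for a sub object this sketch simply fails with -1. Alignment and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EX_MAX_ALLOCS 16

struct ex_sa_bo {
    unsigned int offset;
    unsigned int size;
    bool busy; /* stand-in for "the GPU has not finished with it yet" */
};

struct ex_sa_manager {
    unsigned int size;  /* total size of the backing buffer */
    unsigned int count; /* live sub-allocations, kept in offset order */
    struct ex_sa_bo bo[EX_MAX_ALLOCS];
};

/* Returns 0 and stores the offset on success, -1 if nothing fits right now. */
static int ex_sa_alloc(struct ex_sa_manager *m, unsigned int size,
                       unsigned int *offset)
{
    unsigned int tail = 0, freed = 0, i;

    assert(m && offset);
    if (size == 0 || size > m->size)
        return -1;

    if (m->count)
        tail = m->bo[m->count - 1].offset + m->bo[m->count - 1].size;

    /* Case 1: enough room after the last sub-allocation. */
    if (m->count < EX_MAX_ALLOCS && m->size - tail >= size) {
        m->bo[m->count++] = (struct ex_sa_bo){ tail, size, true };
        *offset = tail;
        return 0;
    }

    /* Case 2: reclaim entries from the start until enough space is free. */
    for (i = 0; i < m->count && freed < size; i++) {
        if (m->bo[i].busy)
            return -1; /* a real driver would wait here instead */
        freed = m->bo[i].offset + m->bo[i].size;
    }

    /* Drop the i reclaimed entries and put the new one at offset 0. */
    memmove(&m->bo[1], &m->bo[i], (m->count - i) * sizeof(m->bo[0]));
    m->count = m->count - i + 1;
    m->bo[0] = (struct ex_sa_bo){ 0, size, true };
    *offset = 0;
    return 0;
}

/* Mark the sub-allocation at the given offset as finished. */
static void ex_sa_done(struct ex_sa_manager *m, unsigned int offset)
{
    for (unsigned int i = 0; i < m->count; i++) {
        if (m->bo[i].offset == offset)
            m->bo[i].busy = false;
    }
}
```

As in the comment, holes between live sub-allocations are never reused; space only becomes available again from the start of the buffer.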

Re: drm pixel formats update

2011-11-16 Thread Ville Syrjälä
On Wed, Nov 16, 2011 at 09:26:20PM +, Alan Cox wrote:
> > I think the only format in my list where I didn't use an existing fourcc
> > is I420/IYUV. And BTW, for that one I used the same "fake" fourcc that
> 
> Right but you redefine all sorts of stuff in the driver in your patch to
> non FourCC names which is just confusing (and painful given the format
> picked)

Sorry, now I lost you completely. Care to elaborate, or perhaps
point to a specific line or lines in the patch?

> > v4l2 uses (YU12). 
> > 
> > And that brings another matter to the table. How should we deal with
> > duplicate fourccs? I420/IYUV and YUY2/YUYV come to mind.
> 
> Just accept both. FourCC as with all API's is not perfect
>  
> > Also, if I now add these ad-hoc fourccs for the RGB formats, and some
> > time later someone comes in with a format with a conflicting official
> > fourcc, what should we do?
> 
> One possibility I suggested originally was to mix FourCC codes and native
> formats which are numbered. That works fine in both endiannesses in
> theory because you'll always have a \0 in it which is invalid FourCC
> 
> ie just number the Linux specific DRM formats 0, 1, 2, 3, 4, 5, ...

I suggested a running number too. But I'd rather leave the fourccs to
user space completely. But if people insist that the kernel should eat
them too, we could just convert them to the simple number format in
some helper function, to isolate the rest of the code from fourccs.
And then there'd be no point in even defining any fourcc stuff in the
headers, as everyone knows how to construct them.
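
A minimal sketch of such a helper, assuming the scheme discussed above: native formats are small running numbers, and a value counts as a FourCC only if none of its four bytes is zero. The macro, the enum values and the function names are hypothetical and are not the DRM definitions; the duplicate codes (I420/IYUV/YU12 and YUY2/YUYV) are the ones named in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack four characters into a 32-bit code, first character in the low byte. */
#define EX_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

static_assert(EX_FOURCC('Y', 'U', '1', '2') == 0x32315559, "packing order");

/* Hypothetical running-number formats; 0 is reserved for "unknown". */
enum ex_format {
    EX_FMT_INVALID = 0,
    EX_FMT_YUV420 = 1,
    EX_FMT_YUYV = 2,
};

/* A FourCC has no zero byte, so small running numbers never look like one. */
static bool ex_is_fourcc(uint32_t code)
{
    return (code & 0xff) && (code & 0xff00) &&
           (code & 0xff0000) && (code & 0xff000000);
}

/* Accept either a running number or a FourCC and return the running number. */
static uint32_t ex_to_format(uint32_t code)
{
    if (!ex_is_fourcc(code))
        return code; /* already a running number */

    switch (code) {
    case EX_FOURCC('I', '4', '2', '0'):
    case EX_FOURCC('I', 'Y', 'U', 'V'):
    case EX_FOURCC('Y', 'U', '1', '2'):
        return EX_FMT_YUV420;
    case EX_FOURCC('Y', 'U', 'Y', '2'):
    case EX_FOURCC('Y', 'U', 'Y', 'V'):
        return EX_FMT_YUYV;
    default:
        return EX_FMT_INVALID;
    }
}
```

Duplicate FourCCs collapse to one number at the boundary, so the rest of the code only ever sees the running numbers.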

> > Oh and one extra detail just occurred to me regarding the three plane
> > formats. Should we even define formats for both the YUV vs. YVU
> > variant. Seeing as we now have independent handles and offsets for
> > each plane, we can make do with just one format definition.
> 
> I think so - or the helper should do the translation and flip the planes.
> We want the user to get flexibility and the driver to be as simple as
> possible.
> 
> (and btw I've no problem at all with the idea that you can pass in a
> FourCC *or* a format specifying structure, or with an internal API where
> a fourCC is always internally turned into a struct of offsets and other
> useful info before hitting the drivers)

Even if there's such a structure, I think it's still beneficial to have
a constant identifier for each format. It allows you to use switch
statements, whereas otherwise you'd possibly need to look at multiple
bits of information inside the structure.

-- 
Ville Syrjälä
Intel OTC


[PATCH] drm/radeon: introduce a sub allocator and convert ib pool to it

2011-11-16 Thread Jerome Glisse
On Wed, Nov 16, 2011 at 2:18 PM,   wrote:
> From: Jerome Glisse 
>
> Somewhat specialized sub-allocator designed to perform sub-allocation
> for command buffers, not only for the current cs ioctl but for future command
> submission ioctls as well. The patch also converts the current ib pool to use
> the sub-allocator. The idea is that the ib pool buffer can be shared with other
> command buffer submissions not having 64K granularity.
>
> Signed-off-by: Jerome Glisse 

Ignore first send (was wrong patch).

Cheers,
Jerome


[PATCH] drm/radeon: introduce a sub allocator and convert ib pool to it

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Somewhat specializaed sub-allocator designed to perform sub-allocation
for command buffer not only for current cs ioctl but for future command
submission ioctl as well. Patch also convert current ib pool to use
the sub allocator. Idea is that ib poll buffer can be share with other
command buffer submission not having 64K granularity.

Signed-off-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/Makefile|2 +-
 drivers/gpu/drm/radeon/radeon.h|   66 --
 drivers/gpu/drm/radeon/radeon_object.h |   18 +++
 drivers/gpu/drm/radeon/radeon_ring.c   |  239 
 drivers/gpu/drm/radeon/radeon_sa.c |  186 +
 5 files changed, 346 insertions(+), 165 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_sa.c

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 94dcdc7..2139fe8 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -71,7 +71,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
r600_blit_kms.o radeon_pm.o atombios_dp.o r600_audio.o r600_hdmi.o \
evergreen.o evergreen_cs.o evergreen_blit_shaders.o 
evergreen_blit_kms.o \
radeon_trace_points.o ni.o cayman_blit_shaders.o atombios_encoders.o \
-   radeon_semaphore.o
+   radeon_semaphore.o radeon_sa.o

 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b85f8a9..267bd92 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -305,6 +305,53 @@ struct radeon_bo_list {
u32 tiling_flags;
 };

+/* sub-allocation manager, it has to be protected by another lock.
+ * By conception this is an helper for other part of the driver
+ * like the indirect buffer or semaphore, which both have their
+ * locking.
+ *
+ * Principe is simple, we keep a list of sub allocation in offset
+ * order (first entry has offset == 0, last entry has the highest
+ * offset).
+ *
+ * When allocating new object we first check if there is room at
+ * the end total_size - (last_object_offset + last_object_size) >=
+ * alloc_size. If so we allocate new object there.
+ *
+ * When there is not enough room at the end, we start waiting for
+ * each sub object until we reach object_offset+object_size >=
+ * alloc_size, this object then become the sub object we return.
+ *
+ * Alignment can't be bigger than page size.
+ *
+ * Hole are not considered for allocation to keep things simple.
+ * Assumption is that there won't be hole (all object on same
+ * alignment).
+ */
+struct radeon_sa_manager {
+   struct radeon_bo*bo;
+   struct list_headsa_bo;
+   unsignedsize;
+   uint64_tgpu_addr;
+   void*cpu_ptr;
+};
+
+struct radeon_sa_bo;
+typedef void (*radeon_sa_bo_destroy_t)(struct radeon_device *rdev,
+  struct radeon_sa_bo *sa_bo);
+typedef bool (*radeon_sa_bo_done_t)(struct radeon_device *rdev,
+   struct radeon_sa_bo *sa_bo);
+
+/* sub-allocation buffer */
+struct radeon_sa_bo {
+   struct list_headlist;
+   struct radeon_sa_manager*manager;
+   unsignedoffset;
+   unsignedsize;
+   radeon_sa_bo_destroy_t  destroy;
+   radeon_sa_bo_done_t done;
+};
+
 /*
  * GEM objects.
  */
@@ -503,13 +550,12 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
 #define CAYMAN_RING_TYPE_CP2_INDEX 2

 struct radeon_ib {
-   struct list_head    list;
+   struct radeon_sa_bo sa_bo;
    unsigned            idx;
+   uint32_t            length_dw;
    uint64_t            gpu_addr;
-   struct radeon_fence *fence;
    uint32_t            *ptr;
-   uint32_t            length_dw;
-   bool                free;
+   struct radeon_fence *fence;
 };

 /*
@@ -517,12 +563,11 @@ struct radeon_ib {
  * mutex protects scheduled_ibs, ready, alloc_bm
  */
 struct radeon_ib_pool {
-   struct mutex             mutex;
-   struct radeon_bo         *robj;
-   struct list_head         bogus_ib;
-   struct radeon_ib         ibs[RADEON_IB_POOL_SIZE];
-   bool                     ready;
-   unsigned                 head_id;
+   struct mutex             mutex;
+   struct radeon_sa_manager sa_manager;
+   struct radeon_ib         ibs[RADEON_IB_POOL_SIZE];
+   bool                     ready;
+   unsigned                 head_id;
 };

 struct radeon_ring {
@@ -601,7 +646,6 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct 
radeon_ib *ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void rad
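
The allocation policy described in the sub-allocation manager comment of the patch above can be sketched in plain C. This is only an illustration under assumptions: it uses a fixed array instead of the kernel list, it does no locking or fence waiting, and the names `sa_manager`, `sa_obj`, `sa_alloc_tail` and `sa_entries_to_reclaim` are hypothetical, not taken from the radeon driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SA_MAX_OBJS 16

/* One sub-allocation; entries are kept sorted by offset. */
struct sa_obj {
	unsigned offset;
	unsigned size;
};

/* Hypothetical stand-in for the manager: a buffer size plus its entries. */
struct sa_manager {
	unsigned size;			/* total size of the backing buffer */
	struct sa_obj objs[SA_MAX_OBJS];
	size_t count;
};

/*
 * First step: try to place alloc_size bytes after the last entry.
 * Returns true and stores the chosen offset on success.
 */
static bool sa_alloc_tail(struct sa_manager *m, unsigned alloc_size,
			  unsigned *offset)
{
	unsigned end = 0;

	assert(m->count <= SA_MAX_OBJS);
	if (m->count > 0) {
		const struct sa_obj *last = &m->objs[m->count - 1];

		end = last->offset + last->size;
	}
	if (m->count == SA_MAX_OBJS || m->size - end < alloc_size)
		return false;
	m->objs[m->count].offset = end;
	m->objs[m->count].size = alloc_size;
	m->count++;
	*offset = end;
	return true;
}

/*
 * Fallback step: how many leading entries would have to be waited on
 * before the space they cover reaches alloc_size bytes.
 * Returns 0 if no prefix of the list covers that many bytes.
 */
static size_t sa_entries_to_reclaim(const struct sa_manager *m,
				    unsigned alloc_size)
{
	size_t i;

	for (i = 0; i < m->count; i++) {
		if (m->objs[i].offset + m->objs[i].size >= alloc_size)
			return i + 1;
	}
	return 0;
}
```

A caller would try `sa_alloc_tail()` first and, if it fails, wait on the entries counted by `sa_entries_to_reclaim()` before reusing their space. Holes in the middle are never searched, which matches the simplification stated in the comment.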

Re: drm pixel formats update

2011-11-16 Thread Jesse Barnes
On Wed, 16 Nov 2011 23:19:38 +0200
Ville Syrjälä  wrote:
> Oh and one extra detail just occurred to me regarding the three plane
> formats. Should we even define formats for both the YUV and YVU
> variants? Seeing as we now have independent handles and offsets for
> each plane, we can make do with just one format definition.

Don't you still need to know the order?  I.e. what's in handle[1], U or
V?

-- 
Jesse Barnes, Intel Open Source Technology Center




Strange effect with i915 backlight controller

2011-11-16 Thread Daniel Mack
On 11/14/2011 11:39 AM, Takashi Iwai wrote:
> OK, then perhaps a better fix is to change the check to be equivalent
> with pineview, as you mentioned in the original post.  The handling of
> bit 0 for old chips was lost during the refactoring of backlight code
> since 2.6.37.
> 
> Does the patch below work for you?
> 
> The only concern by this fix is that it changes the max value.  If
> apps expect some certain (e.g. recorded) value, it may screw up.  But
> I don't expect this would happen with sane apps.

Works perfectly - let's ship it :)


Thanks again,
Daniel


> ===
> From: Takashi Iwai 
> Subject: drm/i915: Fix invalid backpanel values for GEN3 or older chips
> 
> While refactoring of backlight control code in commit [a95735569:
> drm/i915: Refactor panel backlight controls], the handling of the bit
> 0 of duty-cycle was gone except for pineview.  This resulted in invalid
> register values for old chips like 915GM.  When the bit 0 is set, the
> backlight is turned off suddenly.
> 
> This patch changes the bit-0 check by replacing with the condition of
> gen < 4 (pineview is included in this condition, too).
> 
> Reported-by: Daniel Mack 
> Signed-off-by: Takashi Iwai 
> ---
>  drivers/gpu/drm/i915/intel_panel.c |8 +++-
>  1 files changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_panel.c 
> b/drivers/gpu/drm/i915/intel_panel.c
> index 499d4c0..737d00f 100644
> --- a/drivers/gpu/drm/i915/intel_panel.c
> +++ b/drivers/gpu/drm/i915/intel_panel.c
> @@ -178,12 +178,10 @@ u32 intel_panel_get_max_backlight(struct drm_device 
> *dev)
>   if (HAS_PCH_SPLIT(dev)) {
>   max >>= 16;
>   } else {
> - if (IS_PINEVIEW(dev)) {
> + if (INTEL_INFO(dev)->gen < 4) {
>   max >>= 17;
>   } else {
>   max >>= 16;
> - if (INTEL_INFO(dev)->gen < 4)
> - max &= ~1;
>   }
>  
>   if (is_backlight_combination_mode(dev))
> @@ -203,7 +201,7 @@ u32 intel_panel_get_backlight(struct drm_device *dev)
>   val = I915_READ(BLC_PWM_CPU_CTL) & BACKLIGHT_DUTY_CYCLE_MASK;
>   } else {
>   val = I915_READ(BLC_PWM_CTL) & BACKLIGHT_DUTY_CYCLE_MASK;
> - if (IS_PINEVIEW(dev))
> + if (INTEL_INFO(dev)->gen < 4)
>   val >>= 1;
>  
>   if (is_backlight_combination_mode(dev)) {
> @@ -246,7 +244,7 @@ static void intel_panel_actually_set_backlight(struct 
> drm_device *dev, u32 level
>   }
>  
>   tmp = I915_READ(BLC_PWM_CTL);
> - if (IS_PINEVIEW(dev)) {
> + if (INTEL_INFO(dev)->gen < 4) {
>   tmp &= ~(BACKLIGHT_DUTY_CYCLE_MASK - 1);
>   level <<= 1;
>   } else
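
The bit-0 handling that the commit message describes can be sketched as two pure functions on a raw register value. This is a simplified illustration, not the driver code: the mask value `0xffff`, the final `else` branch (truncated in the quoted patch) and the names `backlight_get`/`backlight_set` are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of the duty-cycle field mask (low 16 bits). */
#define DUTY_CYCLE_MASK 0xffffu

/* Read the brightness level back from a raw control register value. */
static uint32_t backlight_get(uint32_t reg, int gen)
{
	uint32_t val = reg & DUTY_CYCLE_MASK;

	if (gen < 4)
		val >>= 1;	/* bit 0 is not part of the level */
	return val;
}

/* Merge a new level into the register value. */
static uint32_t backlight_set(uint32_t reg, uint32_t level, int gen)
{
	if (gen < 4) {
		assert(level <= (DUTY_CYCLE_MASK >> 1));
		reg &= ~(DUTY_CYCLE_MASK - 1);	/* clear bits 15:1, leave bit 0 as found */
		level <<= 1;			/* the level never lands on bit 0 */
	} else {
		assert(level <= DUTY_CYCLE_MASK);
		reg &= ~DUTY_CYCLE_MASK;	/* assumed: the full field is the level */
	}
	return reg | level;
}
```

Shifting the level left by one on gen < 4 is also why the reported maximum halves there, which is the behaviour change Takashi flags as the only concern.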



Re: drm pixel formats update

2011-11-16 Thread Alan Cox
> I think the only format in my list where I didn't use an existing fourcc
> is I420/IYUV. And BTW, for that one I used the same "fake" fourcc that

Right but you redefine all sorts of stuff in the driver in your patch to
non FourCC names which is just confusing (and painful given the format
picked)

> v4l2 uses (YU12). 
> 
> And that brings another matter to the table. How should we deal with
> duplicate fourccs? I420/IYUV and YUY2/YUYV come to mind.

Just accept both. FourCC, as with all APIs, is not perfect.
 
> Also, if I now add these ad-hoc fourccs for the RGB formats, and some
> time later someone comes in with a format with a conflicting official
> fourcc, what should we do?

One possibility I suggested originally was to mix FourCC codes and native
formats which are numbered. That works fine in both endiannesses in
theory because you'll always have a \0 in it which is invalid FourCC

ie just number the Linux specific DRM formats 0, 1, 2, 3, 4, 5, ...
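
A minimal sketch of that idea, assuming a 32-bit format code: real FourCC values have four non-zero character bytes, so any small sequence number is recognisable by its zero byte regardless of byte order. The names `FOURCC`, `NATIVE_FORMAT_MAX`, `native_format` and `format_is_fourcc` are made up for illustration and are not part of any DRM header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack four characters into a 32-bit code, first character in the low byte. */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical ceiling for locally numbered formats (0, 1, 2, ...). */
#define NATIVE_FORMAT_MAX 0x00ffffffu

/* A locally numbered format; the top byte stays zero by construction. */
static uint32_t native_format(uint32_t n)
{
	assert(n <= NATIVE_FORMAT_MAX);
	return n;
}

/* A code containing a zero byte cannot be a valid FourCC. */
static bool format_is_fourcc(uint32_t code)
{
	int shift;

	for (shift = 0; shift < 32; shift += 8) {
		if (((code >> shift) & 0xff) == 0)
			return false;
	}
	return true;
}
```

With this split, `format_is_fourcc(FOURCC('Y', 'U', '1', '2'))` is true while `format_is_fourcc(native_format(5))` is false, so the two namespaces cannot collide.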

> Oh and one extra detail just occurred to me regarding the three plane
> formats. Should we even define formats for both the YUV and YVU
> variants? Seeing as we now have independent handles and offsets for
> each plane, we can make do with just one format definition.

I think so - or the helper should do the translation and flip the planes.
We want the user to get flexibility and the driver to be as simple as
possible.
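
A rough sketch of such a helper, assuming the caller still says which chroma order it is passing in (which is Jesse's point about needing to know what is in handle[1]). The structure and names here are hypothetical and do not mirror the actual DRM framebuffer structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-plane description; plane 0 is always Y. */
struct fb_planes {
	uint32_t handles[3];
	uint32_t offsets[3];
	uint32_t pitches[3];
};

/* Order of the two chroma planes as supplied by userspace. */
enum chroma_order { CHROMA_UV, CHROMA_VU };

static void swap_u32(uint32_t *a, uint32_t *b)
{
	uint32_t tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Rewrite a Y/V/U description as Y/U/V so drivers see a single layout. */
static void planes_to_yuv(struct fb_planes *p, enum chroma_order order)
{
	assert(p != NULL);
	if (order == CHROMA_VU) {
		swap_u32(&p->handles[1], &p->handles[2]);
		swap_u32(&p->offsets[1], &p->offsets[2]);
		swap_u32(&p->pitches[1], &p->pitches[2]);
	}
}
```

The driver then only ever handles the Y/U/V case, while userspace keeps the freedom to describe either ordering.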

(and btw I've no problem at all with the idea that you can pass in a
FourCC *or* a format specifying structure, or with an internal API where
a fourCC is always internally turned into a struct of offsets and other
useful info before hitting the drivers)

Alan


Re: drm pixel formats update

2011-11-16 Thread Ville Syrjälä
On Wed, Nov 16, 2011 at 07:54:12PM +, Alan Cox wrote:
> > If anyone has problems with the way the formats are defined, please
> > speak up now! Since only Jesse has bothered to comment on my rantings
> > I can only assume people are happy with my approach to things.
> 
> Umm .. no. I don't see why they are needed. It's just an extra layer of
> gratuitous confusing indirection. The rest of the world speaks and
> understands FourCC so for all the formats covered by an existing FourCC
> name we should just use the existing name.
> 
> You might need to check one now and then but everyone doing video
> processing is familiar with them including all the Windows folk.

I think the only format in my list where I didn't use an existing fourcc
is I420/IYUV. And BTW, for that one I used the same "fake" fourcc that
v4l2 uses (YU12). 

And that brings another matter to the table. How should we deal with
duplicate fourccs? I420/IYUV and YUY2/YUYV come to mind.

Also, if I now add these ad-hoc fourccs for the RGB formats, and some
time later someone comes in with a format with a conflicting official
fourcc, what should we do?

Oh and one extra detail just occurred to me regarding the three plane
formats. Should we even define formats for both the YUV and YVU
variants? Seeing as we now have independent handles and offsets for
each plane, we can make do with just one format definition.

-- 
Ville Syrjälä
Intel OTC


[PATCH 13/14] drm/ttm: isolate dma data from ttm_tt V3

2011-11-16 Thread Konrad Rzeszutek Wilk
> >> -int ttm_dma_populate(struct ttm_tt *ttm, struct device *dev);
> >> -extern void ttm_dma_unpopulate(struct ttm_tt *ttm, struct device *dev);
> >> +int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev);
> >> +extern void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device 
> >> *dev);
> >>
> >>  #else
> >>  static inline int ttm_dma_page_alloc_init(struct ttm_mem_global *glob,
> >
> > You are missing changes to the static implementations in case 
> > CONFIG_SWIOTLB is not set.
> >
> 
> Actually i don't think i miss anything
> ttm_dma_populate/ttm_dma_unpopulate is conditional on CONFIG_SWIOTLB
> in both radeon and nouveau. So i should be fine. Or did i miss
> something else ?

You are completely right. Somehow I had it in my mind that this was present:


diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h
index 5fe2740..bb006c7 100644
--- a/include/drm/ttm/ttm_page_alloc.h
+++ b/include/drm/ttm/ttm_page_alloc.h
@@ -94,6 +94,16 @@ static inline int ttm_dma_page_alloc_debugfs(struct seq_file 
*m, void *data)
 {
return 0;
 }
+static inline int ttm_dma_populate(struct ttm_tt *ttm,
+  struct device *dev)
+{
+   return -ENODEV;
+}
+static inline void ttm_dma_unpopulate(struct ttm_tt *ttm,
+ struct device *dev)
+{
+   return;
+}
 #endif

 #endif

But that does not make sense, as the nouveau and radeon callers are both
guarded by #ifdef CONFIG_SWIOTLB. So if "# CONFIG_SWIOTLB is not set" is
present, well, nobody will be referencing the ttm_dma_[un|]populate calls.


[PATCH 08/12] gma500: Add the core DRM files and headers

2011-11-16 Thread Dave Airlie
On Thu, Nov 3, 2011 at 6:22 PM, Alan Cox  wrote:
> From: Alan Cox 
>
> Not really a nice way to split this up further for submission. This
> provides all the DRM interfacing logic, the headers and relevant glue.

I've started merging it, and my main review focus is as always the
userspace interfaces, which we are now setting in stone for ever (or
5-10 years whichever is shorter).

So generally we need to provide a userspace interface via ioctls, we
do this with a shared header file that goes in include/drm/ with the
Kbuild bits

Now I'm assuming psb_drm.h is meant to be this file? But as-is it's not
really what I'd expect.

If you look at radeon you'll see defines for device specific ioctls,
same for i915, now I note these tables start at 0 and work their way
up,

Now if I understand the holes in this are due to some old userspace
code that is probably broken anyway,

It might be worth renaming psb_drm.h to gma500_drm.h to reflect the
overall driver name and then maybe start with a clean interface. I'm
willing to listen to why this is a bad plan, but I'd rather not push
psb_drm.h into kernel header packages and libdrm if there is a
conflicting one in existence (or conflicting 6).

some more comments below,

> +#define PSB_GPU_ACCESS_READ         (1ULL << 32)
> +#define PSB_GPU_ACCESS_WRITE        (1ULL << 33)
> +#define PSB_GPU_ACCESS_MASK         (PSB_GPU_ACCESS_READ | PSB_GPU_ACCESS_WRITE)
> +
> +#define PSB_BO_FLAG_COMMAND         (1ULL << 52)

Any reason it starts at 52?

> +struct drm_psb_mode_operation_arg {
> +       u32 obj_id;
> +       u16 operation;
> +       struct drm_mode_modeinfo mode;
> +       void *data;
> +};

No void * in ioctl args, no u16 in ioctls args, not really sure what a
mode "operation" is here either. Try and design the ioctls to avoid
compat wrapper requirements.
Maybe move the operation flags up here.
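
To illustrate the kind of layout that avoids compat wrappers, here is a sketch of an argument block that uses only fixed-width fields and carries the user pointer as a 64-bit integer. It is a hypothetical rework, not a proposal from the thread: the struct and helper names are invented, and the embedded `struct drm_mode_modeinfo` is left out to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reworked argument block; field names are illustrative. */
struct mode_operation_arg {
	uint32_t obj_id;
	uint32_t operation;	/* widened from u16, so no implicit padding */
	uint64_t data_ptr;	/* user pointer carried as a 64-bit integer */
};

/* The layout must not depend on pointer size or compiler padding. */
static_assert(sizeof(struct mode_operation_arg) == 16,
	      "unexpected struct size");
static_assert(offsetof(struct mode_operation_arg, data_ptr) == 8,
	      "64-bit member must sit on an 8-byte boundary");

/* Userspace side: store a pointer in the 64-bit field. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Receiving side: recover the pointer from the 64-bit field. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

Because the struct has the same size and field offsets for 32-bit and 64-bit callers, a 64-bit kernel can accept it from a 32-bit process without translation.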

> +
> +struct drm_psb_stolen_memory_arg {
> +       u32 base;
> +       u32 size;
> +};

Why does someone care about this? is it a workaround for lack of
decent memory management?


> +/* Controlling the kernel modesetting buffers */
> +
> +#define DRM_PSB_SIZES           0x07
> +#define DRM_PSB_FUSE_REG        0x08
> +#define DRM_PSB_DC_STATE        0x0A
> +#define DRM_PSB_ADB             0x0B
> +#define DRM_PSB_MODE_OPERATION  0x0C
> +#define DRM_PSB_STOLEN_MEMORY   0x0D
> +#define DRM_PSB_REGISTER_RW     0x0E

Direct reading/writing of registers is not something I'd want in the
interface. The regs being hit here seem like workarounds for not having
good overlay support in the kernel, perhaps.
Can the new plane stuff help work around this?

> +#define DRM_PSB_GEM_CREATE      0x10
> +#define DRM_PSB_2D_OP           0x11            /* Will be merged later */
> +#define DRM_PSB_GEM_MMAP        0x12
> +#define DRM_PSB_DPST            0x1B
> +#define DRM_PSB_GAMMA           0x1C
> +#define DRM_PSB_DPST_BL         0x1D
> +#define DRM_PSB_GET_PIPE_FROM_CRTC_ID 0x1F

If these are compat with somewhere else please state where the master
database is kept so future people can avoid collisions with closed
source drivers.

> +
> +#define PSB_MODE_OPERATION_MODE_VALID   0x01
> +#define PSB_MODE_OPERATION_SET_DC_BASE  0x02
> +
> +struct drm_psb_get_pipe_from_crtc_id_arg {
> +       /** ID of CRTC being requested **/
> +       u32 crtc_id;
> +
> +       /** pipe of requested CRTC **/
> +       u32 pipe;
> +};

I'm happy that we can fix all these problems incrementally with
patches on top of these, as I take the notion that the userspace
ioctls aren't set in stone until Linus merges and does a release
containing them.

So you can probably consider these patches merged at least. I'll
just keep them in a topic branch which I'll stick into the drm-next
queue.

Dave.


[PATCH 14/14] drm/ttm: simplify memory accounting for ttm user

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Provide a helper function to compute the kernel memory size needed
for each buffer object. Move all the accounting inside ttm, simplifying
drivers and avoiding code duplication across them.

Signed-off-by: Jerome Glisse 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c |6 +++-
 drivers/gpu/drm/radeon/radeon_object.c   |8 +++-
 drivers/gpu/drm/ttm/ttm_bo.c |   52 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |   35 +---
 include/drm/ttm/ttm_bo_api.h |   19 ++-
 include/drm/ttm/ttm_bo_driver.h  |5 ---
 6 files changed, 70 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 4347776..857bca4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -93,6 +93,7 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
 {
struct drm_nouveau_private *dev_priv = dev->dev_private;
struct nouveau_bo *nvbo;
+   size_t acc_size;
int ret;

nvbo = kzalloc(sizeof(struct nouveau_bo), GFP_KERNEL);
@@ -115,9 +116,12 @@ nouveau_bo_new(struct drm_device *dev, int size, int align,
nvbo->bo.mem.num_pages = size >> PAGE_SHIFT;
nouveau_bo_placement_set(nvbo, flags, 0);

+   acc_size = ttm_bo_dma_acc_size(&dev_priv->ttm.bdev, size,
+  sizeof(struct nouveau_bo));
+
ret = ttm_bo_init(&dev_priv->ttm.bdev, &nvbo->bo, size,
  ttm_bo_type_device, &nvbo->placement,
- align >> PAGE_SHIFT, 0, false, NULL, size,
+ align >> PAGE_SHIFT, 0, false, NULL, acc_size,
  nouveau_bo_del_ttm);
if (ret) {
/* ttm will call nouveau_bo_del_ttm if it fails.. */
diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
b/drivers/gpu/drm/radeon/radeon_object.c
index 1c85152..695b480 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -95,6 +95,7 @@ int radeon_bo_create(struct radeon_device *rdev,
enum ttm_bo_type type;
unsigned long page_align = roundup(byte_align, PAGE_SIZE) >> PAGE_SHIFT;
unsigned long max_size = 0;
+   size_t acc_size;
int r;

size = ALIGN(size, PAGE_SIZE);
@@ -117,6 +118,9 @@ int radeon_bo_create(struct radeon_device *rdev,
return -ENOMEM;
}

+   acc_size = ttm_bo_dma_acc_size(&rdev->mman.bdev, size,
+  sizeof(struct radeon_bo));
+
 retry:
bo = kzalloc(sizeof(struct radeon_bo), GFP_KERNEL);
if (bo == NULL)
@@ -134,8 +138,8 @@ retry:
/* Kernel allocation are uninterruptible */
mutex_lock(&rdev->vram_mutex);
r = ttm_bo_init(&rdev->mman.bdev, &bo->tbo, size, type,
-   &bo->placement, page_align, 0, !kernel, NULL, size,
-   &radeon_ttm_bo_destroy);
+   &bo->placement, page_align, 0, !kernel, NULL,
+   acc_size, &radeon_ttm_bo_destroy);
mutex_unlock(&rdev->vram_mutex);
if (unlikely(r != 0)) {
if (r != -ERESTARTSYS) {
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index cb73527..de7ad99 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -137,6 +137,7 @@ static void ttm_bo_release_list(struct kref *list_kref)
struct ttm_buffer_object *bo =
container_of(list_kref, struct ttm_buffer_object, list_kref);
struct ttm_bo_device *bdev = bo->bdev;
+   size_t acc_size = bo->acc_size;

BUG_ON(atomic_read(&bo->list_kref.refcount));
BUG_ON(atomic_read(&bo->kref.refcount));
@@ -152,9 +153,9 @@ static void ttm_bo_release_list(struct kref *list_kref)
if (bo->destroy)
bo->destroy(bo);
else {
-   ttm_mem_global_free(bdev->glob->mem_glob, bo->acc_size);
kfree(bo);
}
+   ttm_mem_global_free(bdev->glob->mem_glob, acc_size);
 }

 int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo, bool interruptible)
@@ -1157,6 +1158,17 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
 {
int ret = 0;
unsigned long num_pages;
+   struct ttm_mem_global *mem_glob = bdev->glob->mem_glob;
+
+   ret = ttm_mem_global_alloc(mem_glob, acc_size, false, false);
+   if (ret) {
+   printk(KERN_ERR TTM_PFX "Out of kernel memory.\n");
+   if (destroy)
+   (*destroy)(bo);
+   else
+   kfree(bo);
+   return -ENOMEM;
+   }

size += buffer_start & ~PAGE_MASK;
num_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
@@ -1227,14 +1239,34 @@ out_err:
 }
 EXPORT_SYMBOL(ttm_bo_init);

-static inline size_t ttm_bo_size(struct ttm_bo_global *glob,
-   

[PATCH 13/14] drm/ttm: isolate dma data from ttm_tt V4

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Move dma data to a superset ttm_dma_tt structure which inherits
from ttm_tt. This allows drivers that don't use dma functionality
to avoid wasting memory on it.
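
The "inherits" relationship here is the usual C idiom of embedding the base structure as the first member, which is what makes the `(void *)ttm` casts in the diff below valid. A minimal sketch with stand-in types (`base_tt`, `dma_tt` and `to_dma_tt` are illustrative names, and `uint64_t` stands in for `dma_addr_t`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the base object that every driver uses. */
struct base_tt {
	unsigned long num_pages;
};

/* Stand-in for the DMA-aware superset; only DMA users pay for it. */
struct dma_tt {
	struct base_tt ttm;	/* must be first so a base pointer can be cast */
	uint64_t *dma_address;	/* per-page bus addresses */
};

/* The downcast is only valid while the base is the first member. */
static_assert(offsetof(struct dma_tt, ttm) == 0,
	      "base must be the first member");

/* Recover the superset from a pointer to its embedded base. */
static struct dma_tt *to_dma_tt(struct base_tt *ttm)
{
	return (struct dma_tt *)ttm;
}
```

Code that only knows about the base type keeps working unchanged, and the cast is safe only for objects that were actually allocated as the larger structure.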

V2 Rebase on top of no memory account changes (where/when is my
   delorean when i need it ?)
V3 Make sure page list is initialized empty
V4 typo/syntax fixes

Signed-off-by: Jerome Glisse 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c |   18 +++--
 drivers/gpu/drm/nouveau/nouveau_sgdma.c  |   22 --
 drivers/gpu/drm/radeon/radeon_ttm.c  |   43 ++--
 drivers/gpu/drm/ttm/ttm_page_alloc.c |  114 +++---
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |   35 +
 drivers/gpu/drm/ttm/ttm_tt.c |   60 +---
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c   |2 +
 include/drm/ttm/ttm_bo_driver.h  |   32 -
 include/drm/ttm/ttm_page_alloc.h |   33 +
 9 files changed, 203 insertions(+), 156 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index e603909..4347776 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1052,6 +1052,7 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct 
nouveau_fence *fence)
 static int
 nouveau_ttm_tt_populate(struct ttm_tt *ttm)
 {
+   struct ttm_dma_tt *ttm_dma = (void *)ttm;
struct drm_nouveau_private *dev_priv;
struct drm_device *dev;
unsigned i;
@@ -1065,7 +1066,7 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm)

 #ifdef CONFIG_SWIOTLB
if (swiotlb_nr_tbl()) {
-   return ttm_dma_populate(ttm, dev->dev);
+   return ttm_dma_populate((void *)ttm, dev->dev);
}
 #endif

@@ -1075,14 +1076,14 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm)
}

for (i = 0; i < ttm->num_pages; i++) {
-   ttm->dma_address[i] = pci_map_page(dev->pdev, ttm->pages[i],
+   ttm_dma->dma_address[i] = pci_map_page(dev->pdev, ttm->pages[i],
   0, PAGE_SIZE,
   PCI_DMA_BIDIRECTIONAL);
-   if (pci_dma_mapping_error(dev->pdev, ttm->dma_address[i])) {
+   if (pci_dma_mapping_error(dev->pdev, ttm_dma->dma_address[i])) {
while (--i) {
-   pci_unmap_page(dev->pdev, ttm->dma_address[i],
+   pci_unmap_page(dev->pdev, 
ttm_dma->dma_address[i],
   PAGE_SIZE, 
PCI_DMA_BIDIRECTIONAL);
-   ttm->dma_address[i] = 0;
+   ttm_dma->dma_address[i] = 0;
}
ttm_pool_unpopulate(ttm);
return -EFAULT;
@@ -1094,6 +1095,7 @@ nouveau_ttm_tt_populate(struct ttm_tt *ttm)
 static void
 nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm)
 {
+   struct ttm_dma_tt *ttm_dma = (void *)ttm;
struct drm_nouveau_private *dev_priv;
struct drm_device *dev;
unsigned i;
@@ -1103,14 +1105,14 @@ nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm)

 #ifdef CONFIG_SWIOTLB
if (swiotlb_nr_tbl()) {
-   ttm_dma_unpopulate(ttm, dev->dev);
+   ttm_dma_unpopulate((void *)ttm, dev->dev);
return;
}
 #endif

for (i = 0; i < ttm->num_pages; i++) {
-   if (ttm->dma_address[i]) {
-   pci_unmap_page(dev->pdev, ttm->dma_address[i],
+   if (ttm_dma->dma_address[i]) {
+   pci_unmap_page(dev->pdev, ttm_dma->dma_address[i],
   PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
}
}
diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c 
b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
index ee1eb7c..47f245e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
@@ -8,7 +8,10 @@
 #define NV_CTXDMA_PAGE_MASK  (NV_CTXDMA_PAGE_SIZE - 1)

 struct nouveau_sgdma_be {
-   struct ttm_tt ttm;
+   /* this has to be the first field so populate/unpopulate in
+    * nouveau_bo.c works properly, otherwise have to move them here
+    */
+   struct ttm_dma_tt ttm;
struct drm_device *dev;
u64 offset;
 };
@@ -20,6 +23,7 @@ nouveau_sgdma_destroy(struct ttm_tt *ttm)

if (ttm) {
NV_DEBUG(nvbe->dev, "\n");
+   ttm_dma_tt_fini(&nvbe->ttm);
kfree(nvbe);
}
 }
@@ -38,7 +42,7 @@ nv04_sgdma_bind(struct ttm_tt *ttm, struct ttm_mem_reg *mem)
nvbe->offset = mem->start << PAGE_SHIFT;
pte = (nvbe->offset >> NV_CTXDMA_PAGE_SHIFT) + 2;
for (i = 0; i < ttm->num_pages; i++) {
-   dma_addr_t dma_offset = ttm->dma_address[i];
+   dma_addr_t dma_offset = nvbe->ttm.dma_address[i];
uint3

[PATCH 12/14] drm/nouveau: enable the ttm dma pool when swiotlb is active V3

2011-11-16 Thread j.gli...@gmail.com
From: Konrad Rzeszutek Wilk 

If the card is capable of more than 32-bit, then use the default
TTM page pool code which allocates from anywhere in the memory.

Note: If the 'ttm.no_dma' parameter is set, the override is ignored
and the default TTM pool is used.

V2 use pci_set_consistent_dma_mask
V3 Rebase on top of no memory account changes (where/when is my
   delorean when i need it ?)

CC: Ben Skeggs 
CC: Francisco Jerez 
CC: Dave Airlie 
Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c  |   73 -
 drivers/gpu/drm/nouveau/nouveau_debugfs.c |1 +
 drivers/gpu/drm/nouveau/nouveau_mem.c |6 ++
 drivers/gpu/drm/nouveau/nouveau_sgdma.c   |   60 +---
 4 files changed, 79 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 3271001..e603909 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1049,10 +1049,79 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct 
nouveau_fence *fence)
nouveau_fence_unref(&old_fence);
 }

+static int
+nouveau_ttm_tt_populate(struct ttm_tt *ttm)
+{
+   struct drm_nouveau_private *dev_priv;
+   struct drm_device *dev;
+   unsigned i;
+   int r;
+
+   if (ttm->state != tt_unpopulated)
+   return 0;
+
+   dev_priv = nouveau_bdev(ttm->bdev);
+   dev = dev_priv->dev;
+
+#ifdef CONFIG_SWIOTLB
+   if (swiotlb_nr_tbl()) {
+   return ttm_dma_populate(ttm, dev->dev);
+   }
+#endif
+
+   r = ttm_pool_populate(ttm);
+   if (r) {
+   return r;
+   }
+
+   for (i = 0; i < ttm->num_pages; i++) {
+   ttm->dma_address[i] = pci_map_page(dev->pdev, ttm->pages[i],
+  0, PAGE_SIZE,
+  PCI_DMA_BIDIRECTIONAL);
+   if (pci_dma_mapping_error(dev->pdev, ttm->dma_address[i])) {
+   while (--i) {
+   pci_unmap_page(dev->pdev, ttm->dma_address[i],
+  PAGE_SIZE, 
PCI_DMA_BIDIRECTIONAL);
+   ttm->dma_address[i] = 0;
+   }
+   ttm_pool_unpopulate(ttm);
+   return -EFAULT;
+   }
+   }
+   return 0;
+}
+
+static void
+nouveau_ttm_tt_unpopulate(struct ttm_tt *ttm)
+{
+   struct drm_nouveau_private *dev_priv;
+   struct drm_device *dev;
+   unsigned i;
+
+   dev_priv = nouveau_bdev(ttm->bdev);
+   dev = dev_priv->dev;
+
+#ifdef CONFIG_SWIOTLB
+   if (swiotlb_nr_tbl()) {
+   ttm_dma_unpopulate(ttm, dev->dev);
+   return;
+   }
+#endif
+
+   for (i = 0; i < ttm->num_pages; i++) {
+   if (ttm->dma_address[i]) {
+   pci_unmap_page(dev->pdev, ttm->dma_address[i],
+  PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+   }
+   }
+
+   ttm_pool_unpopulate(ttm);
+}
+
 struct ttm_bo_driver nouveau_bo_driver = {
.ttm_tt_create = &nouveau_ttm_tt_create,
-   .ttm_tt_populate = &ttm_pool_populate,
-   .ttm_tt_unpopulate = &ttm_pool_unpopulate,
+   .ttm_tt_populate = &nouveau_ttm_tt_populate,
+   .ttm_tt_unpopulate = &nouveau_ttm_tt_unpopulate,
.invalidate_caches = nouveau_bo_invalidate_caches,
.init_mem_type = nouveau_bo_init_mem_type,
.evict_flags = nouveau_bo_evict_flags,
diff --git a/drivers/gpu/drm/nouveau/nouveau_debugfs.c 
b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
index 8e15923..f52c2db 100644
--- a/drivers/gpu/drm/nouveau/nouveau_debugfs.c
+++ b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
@@ -178,6 +178,7 @@ static struct drm_info_list nouveau_debugfs_list[] = {
{ "memory", nouveau_debugfs_memory_info, 0, NULL },
{ "vbios.rom", nouveau_debugfs_vbios_image, 0, NULL },
{ "ttm_page_pool", ttm_page_alloc_debugfs, 0, NULL },
+   { "ttm_dma_page_pool", ttm_dma_page_alloc_debugfs, 0, NULL },
 };
 #define NOUVEAU_DEBUGFS_ENTRIES ARRAY_SIZE(nouveau_debugfs_list)

diff --git a/drivers/gpu/drm/nouveau/nouveau_mem.c 
b/drivers/gpu/drm/nouveau/nouveau_mem.c
index 36bec48..37fcaa2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_mem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_mem.c
@@ -407,6 +407,12 @@ nouveau_mem_vram_init(struct drm_device *dev)
ret = pci_set_dma_mask(dev->pdev, DMA_BIT_MASK(dma_bits));
if (ret)
return ret;
+   ret = pci_set_consistent_dma_mask(dev->pdev, DMA_BIT_MASK(dma_bits));
+   if (ret) {
+   /* Reset to default value. */
+   pci_set_consistent_dma_mask(dev->pdev, DMA_BIT_MASK(32));
+   }
+

ret = nouveau_ttm_global_init(dev_priv);
if (ret)
diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c 

[PATCH 11/14] drm/radeon/kms: enable the ttm dma pool if swiotlb is on V4

2011-11-16 Thread j.gli...@gmail.com
From: Konrad Rzeszutek Wilk 

With the exception that we do not handle the AGP case. We only
deal with PCIe cards such as ATI ES1000 or HD3200 that have been
detected to only do DMA up to 32-bits.

V2 force dma32 if we fail to set bigger dma mask
V3 Rebase on top of no memory account changes (where/when is my
   delorean when i need it ?)
V4 add debugfs entry if swiotlb is active, not only if we are
   on a gpu limited to 32-bit dma

CC: Dave Airlie 
CC: Alex Deucher 
Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/radeon.h|1 -
 drivers/gpu/drm/radeon/radeon_device.c |6 ++
 drivers/gpu/drm/radeon/radeon_gart.c   |   29 +---
 drivers/gpu/drm/radeon/radeon_ttm.c|   83 +--
 4 files changed, 84 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index fc5a1d6..de38e70 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -320,7 +320,6 @@ struct radeon_gart {
unsignedtable_size;
struct page **pages;
dma_addr_t  *pages_addr;
-   bool*ttm_alloced;
boolready;
 };

diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
b/drivers/gpu/drm/radeon/radeon_device.c
index c4d00a1..fb347a8 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -765,8 +765,14 @@ int radeon_device_init(struct radeon_device *rdev,
r = pci_set_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
if (r) {
rdev->need_dma32 = true;
+   dma_bits = 32;
printk(KERN_WARNING "radeon: No suitable DMA available.\n");
}
+   r = pci_set_consistent_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
+   if (r) {
+   pci_set_consistent_dma_mask(rdev->pdev, DMA_BIT_MASK(32));
+   printk(KERN_WARNING "radeon: No coherent DMA available.\n");
+   }

/* Registers mapping */
/* TODO: block userspace mapping of io register */
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c 
b/drivers/gpu/drm/radeon/radeon_gart.c
index ba7ab79..a4d9816 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -157,9 +157,6 @@ void radeon_gart_unbind(struct radeon_device *rdev, 
unsigned offset,
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);
for (i = 0; i < pages; i++, p++) {
if (rdev->gart.pages[p]) {
-   if (!rdev->gart.ttm_alloced[p])
-   pci_unmap_page(rdev->pdev, 
rdev->gart.pages_addr[p],
-   PAGE_SIZE, 
PCI_DMA_BIDIRECTIONAL);
rdev->gart.pages[p] = NULL;
rdev->gart.pages_addr[p] = rdev->dummy_page.addr;
page_base = rdev->gart.pages_addr[p];
@@ -191,23 +188,7 @@ int radeon_gart_bind(struct radeon_device *rdev, unsigned 
offset,
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);

for (i = 0; i < pages; i++, p++) {
-   /* we reverted the patch using dma_addr in TTM for now but this
-* code stops building on alpha so just comment it out for now 
*/
-   if (0) { /*dma_addr[i] != DMA_ERROR_CODE) */
-   rdev->gart.ttm_alloced[p] = true;
-   rdev->gart.pages_addr[p] = dma_addr[i];
-   } else {
-   /* we need to support large memory configurations */
-   /* assume that unbind have already been call on the 
range */
-   rdev->gart.pages_addr[p] = pci_map_page(rdev->pdev, 
pagelist[i],
-   0, PAGE_SIZE,
-   PCI_DMA_BIDIRECTIONAL);
-   if (pci_dma_mapping_error(rdev->pdev, 
rdev->gart.pages_addr[p])) {
-   /* FIXME: failed to map page (return -ENOMEM?) 
*/
-   radeon_gart_unbind(rdev, offset, pages);
-   return -ENOMEM;
-   }
-   }
+   rdev->gart.pages_addr[p] = dma_addr[i];
rdev->gart.pages[p] = pagelist[i];
if (rdev->gart.ptr) {
page_base = rdev->gart.pages_addr[p];
@@ -274,12 +255,6 @@ int radeon_gart_init(struct radeon_device *rdev)
radeon_gart_fini(rdev);
return -ENOMEM;
}
-   rdev->gart.ttm_alloced = kzalloc(sizeof(bool) *
-rdev->gart.num_cpu_pages, GFP_KERNEL);
-   if (rdev->gart.ttm_alloced == NULL) {
-   radeon_gart_fini(rdev);
-   return -ENOMEM;
-   }
/* set GART entry to point to the d

[PATCH 10/14] drm/ttm: provide dma aware ttm page pool code V9

2011-11-16 Thread j.gli...@gmail.com
From: Konrad Rzeszutek Wilk 

In the TTM world the pages for the graphic drivers are kept in three different
pools: write combined, uncached, and cached (write-back). When the pages
are used by the graphic driver, the graphic adapter programs these pages in
via its built-in MMU (or AGP). The programming requires the virtual address
(from the graphic adapter's perspective) and the physical address (either
system RAM or the memory on the card), which is obtained using the pci_map_*
calls (which do the virtual to physical - or bus address - translation).
During the graphic application's "life" those pages can be shuffled around,
swapped out to disk, or moved from VRAM to system RAM and vice versa. This all
works with the existing TTM pool code, except when we want to use the software
IOTLB (SWIOTLB) code to "map" the physical addresses to the graphic adapter
MMU. We end up programming the bounce buffer's physical address instead of the
TTM pool memory's, and the driver does not work.
There are two solutions:
1) using the DMA API to allocate pages that are screened by the DMA API, or
2) using the pci_sync_* calls to copy the pages from the bounce buffer and back.

This patch fixes the issue by allocating pages using the DMA API. The second
solution is a viable option, but it has performance drawbacks and potential
correctness issues: think of the write cache page being bounced
(SWIOTLB->TTM), the WC being set on the TTM page, and the copy from SWIOTLB not
making it to the TTM page until the page has been recycled in the pool (and
used by another application).
The bounce buffer does not get activated often - only in cases where we have
a 32-bit capable card and we want to use a page that is allocated above the
4GB limit. The bounce buffer offers the solution of copying the contents
of that above-4GB page to a location below 4GB and then back when the operation
has been completed (or vice-versa). This is done by using the 'pci_sync_*' calls.
Note: If you look carefully enough in the existing TTM page pool code you will
notice the GFP_DMA32 flag is used - which should guarantee that the provided
page is under 4GB. It certainly is the case, except this gets ignored in two cases:
 - If user specifies 'swiotlb=force' which bounces _every_ page.
 - If user is using a Xen's PV Linux guest (which uses the SWIOTLB and the
   underlying PFN's aren't necessarily under 4GB).
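
The bounce condition described above comes down to whether a buffer fits under
the device's DMA mask. A minimal userspace sketch, assuming hypothetical names
(needs_bounce, DMA_MASK_32) that are not part of the kernel API and that only
model the decision, not the actual SWIOTLB code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit DMA mask: the device can address only the low 4GB. */
#define DMA_MASK_32 UINT64_C(0xFFFFFFFF)

/*
 * Illustrative model only (not kernel code): returns true when any byte of
 * the buffer [addr, addr + len) lies beyond what the device can address,
 * which is the situation where a bounce buffer would be substituted.
 * 'force' models the 'swiotlb=force' case, where every page is bounced.
 * Address overflow is not handled in this sketch.
 */
static bool needs_bounce(uint64_t addr, uint64_t len, uint64_t dma_mask,
                         bool force)
{
    if (force)
        return true;
    if (len == 0)
        return false;
    return addr + len - 1 > dma_mask;
}
```

A page at 0x100000000 with a 32-bit mask would bounce under this model, while
a page ending exactly at 0xFFFFFFFF would not.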

To avoid this extra copying, the other option is to allocate the pages
using the DMA API so that there is no need to map the page and perform the
expensive 'pci_sync_*' calls.

This DMA API capable TTM pool requires the 'struct device' in order to
properly call the DMA API. It also has to track the virtual and bus address of
the page being handed out in case it ends up being swapped out or de-allocated -
to make sure it is de-allocated using the proper 'struct device'.

Implementation wise the code keeps two lists: one that is attached to the
'struct device' (via the dev->dma_pools list) and a global one to be used when
the 'struct device' is unavailable (think shrinker code). The global list can
iterate over all of the 'struct device' and its associated dma_pool. The list
in dev->dma_pools can only iterate the device's dma_pool.
[Diagram: a 'struct device' links to its two pools, 'struct dma_pool for WC'
and 'struct dma_pool for uncached' (WC and UC). In parallel, each
'struct device_pool' entry on the global list holds a 'dev' pointer back to
the device and a 'dma_pool' pointer to one of those pools.]
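
The two-list arrangement can be sketched in plain C. This is a simplified
userspace model under stated assumptions: fixed-size arrays stand in for the
kernel's linked lists, and every name here (pool_register, global_list,
total_pooled_pages) is hypothetical rather than taken from the patch:

```c
#include <stddef.h>

#define MAX_POOLS   6   /* a device has at most six pools */
#define MAX_ENTRIES 32  /* arbitrary capacity for this sketch */

struct dma_pool {
    const char *name;
    unsigned npages;    /* pages currently cached in the pool */
};

struct device {
    struct dma_pool *pools[MAX_POOLS];  /* stands in for dev->dma_pools */
    size_t npools;
};

/* One entry of the global list: pairs a device with one of its pools. */
struct device_pool {
    struct device *dev;
    struct dma_pool *pool;
};

static struct device_pool global_list[MAX_ENTRIES];
static size_t global_count;

/* Register a pool on both the per-device list and the global list. */
static int pool_register(struct device *dev, struct dma_pool *pool)
{
    if (dev->npools == MAX_POOLS || global_count == MAX_ENTRIES)
        return -1;
    dev->pools[dev->npools++] = pool;
    global_list[global_count].dev = dev;
    global_list[global_count].pool = pool;
    global_count++;
    return 0;
}

/* Shrinker-style walk: no 'struct device' in hand, so use the global list. */
static unsigned total_pooled_pages(void)
{
    unsigned total = 0;
    for (size_t i = 0; i < global_count; i++)
        total += global_list[i].pool->npages;
    return total;
}
```

The per-device array only ever sees that device's pools, while the global
array can reach every pool of every device, which mirrors the split described
above.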

The maximum amount of dma pools a device can have is six: write-combined,
uncached, and cached; then there are the DMA32 variants which are:
write-combined dma32, uncached dma32, and cached dma32.
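
The count of six follows from three caching modes times an optional DMA32
restriction. A small sketch that enumerates the combinations, assuming
hypothetical flag names and bit values (the real pool code may encode types
differently):

```c
#include <stdbool.h>

/* Hypothetical flag bits; names and values are illustrative only. */
enum pool_flag {
    POOL_WC     = 1 << 0,
    POOL_UC     = 1 << 1,
    POOL_CACHED = 1 << 2,
    POOL_DMA32  = 1 << 3,
};

/* Valid when exactly one caching bit is set; DMA32 is an optional modifier. */
static bool pool_type_valid(unsigned type)
{
    const unsigned caching_mask = POOL_WC | POOL_UC | POOL_CACHED;
    unsigned caching = type & caching_mask;

    if (type & ~(caching_mask | (unsigned)POOL_DMA32))
        return false;   /* unknown bits set */
    return caching != 0 && (caching & (caching - 1)) == 0;
}

/* Count every valid combination of the four flag bits. */
static unsigned count_valid_pool_types(void)
{
    unsigned n = 0;
    for (unsigned t = 0; t < 16; t++)
        if (pool_type_valid(t))
            n++;
    return n;
}
```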

Currently this code only gets activated when any variant of the SWIOTLB IOMMU
code is running (Intel without VT-d, AMD without GART, IBM Calgary and Xen PV
with PCI devices).

Tested-by: Michel Dänzer 
[v1: Using swiotlb_nr_tbl instead of swiotlb_enabled]
[v2: Major overhaul - added 'inuse_list' to separate used from inuse and reorder
the order of lists to get better performance.]
[v3: Added comments/and some logic based on review, Added Jerome tag]
[v4: rebase on top of ttm_tt & ttm_backend merge]
[v5: rebase on top of ttm memory accounting overhaul]
[v6: New rebase on top of more memory accounting changes]
[v7: well rebase on top of no memory accounting changes]
[v8: make sure pages list is initialized empty]
[v9: call ttm_mem_global_free_page in unpopulate for accurate accounting]
Signed-off-by: Konrad Rzeszutek Wilk 
Reviewed-by: J

[PATCH 09/14] drm/ttm: introduce callback for ttm_tt populate & unpopulate V4

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Move the page allocation and freeing to driver callback and
provide ttm code helper function for those.

The most intrusive change is that we now only ever fully
populate an object. This simplifies some of the code designed around
the page fault design.

V2 Rebase on top of memory accounting overhaul
V3 New rebase on top of more memory accounting changes
V4 Rebase on top of no memory account changes (where/when is my
   delorean when i need it ?)
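
The callback pattern can be sketched in userspace C. This is a simplified
model, not the TTM code: struct tt, struct tt_driver, pool_populate and
tt_ensure_populated are hypothetical stand-ins, and malloc() replaces the
real page allocator:

```c
#include <stdlib.h>

enum tt_state { tt_bound, tt_unbound, tt_unpopulated };

struct tt;

/* Driver-provided hooks, standing in for the new bo driver callbacks. */
struct tt_driver {
    int  (*populate)(struct tt *tt);
    void (*unpopulate)(struct tt *tt);
};

struct tt {
    const struct tt_driver *driver;
    enum tt_state state;
    void **pages;               /* one slot per page */
    unsigned long num_pages;
};

/* Simplified helper: allocate every page up front, or none at all. */
static int pool_populate(struct tt *tt)
{
    for (unsigned long i = 0; i < tt->num_pages; i++) {
        tt->pages[i] = malloc(4096);
        if (!tt->pages[i]) {
            while (i--) {       /* unwind the pages already allocated */
                free(tt->pages[i]);
                tt->pages[i] = NULL;
            }
            return -1;
        }
    }
    tt->state = tt_unbound;
    return 0;
}

static void pool_unpopulate(struct tt *tt)
{
    for (unsigned long i = 0; i < tt->num_pages; i++) {
        free(tt->pages[i]);
        tt->pages[i] = NULL;
    }
    tt->state = tt_unpopulated;
}

/* Callers populate lazily but completely before indexing tt->pages[]. */
static int tt_ensure_populated(struct tt *tt)
{
    if (tt->state == tt_unpopulated)
        return tt->driver->populate(tt);
    return 0;
}
```

Because population is all-or-nothing, code that follows a successful
tt_ensure_populated() call can index pages[] directly instead of fetching
pages one at a time.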

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c   |3 +
 drivers/gpu/drm/radeon/radeon_ttm.c|2 +
 drivers/gpu/drm/ttm/ttm_bo_util.c  |   31 ++-
 drivers/gpu/drm/ttm/ttm_bo_vm.c|9 +++-
 drivers/gpu/drm/ttm/ttm_page_alloc.c   |   57 
 drivers/gpu/drm/ttm/ttm_tt.c   |   91 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c |3 +
 include/drm/ttm/ttm_bo_driver.h|   41 --
 include/drm/ttm/ttm_page_alloc.h   |   18 ++
 9 files changed, 135 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 3f116ba..3271001 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -28,6 +28,7 @@
  */

 #include "drmP.h"
+#include "ttm/ttm_page_alloc.h"

 #include "nouveau_drm.h"
 #include "nouveau_drv.h"
@@ -1050,6 +1051,8 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence)

 struct ttm_bo_driver nouveau_bo_driver = {
.ttm_tt_create = &nouveau_ttm_tt_create,
+   .ttm_tt_populate = &ttm_pool_populate,
+   .ttm_tt_unpopulate = &ttm_pool_unpopulate,
.invalidate_caches = nouveau_bo_invalidate_caches,
.init_mem_type = nouveau_bo_init_mem_type,
.evict_flags = nouveau_bo_evict_flags,
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index af4d5f2..b1768cb 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -581,6 +581,8 @@ struct ttm_tt *radeon_ttm_tt_create(struct ttm_bo_device *bdev,

 static struct ttm_bo_driver radeon_bo_driver = {
.ttm_tt_create = &radeon_ttm_tt_create,
+   .ttm_tt_populate = &ttm_pool_populate,
+   .ttm_tt_unpopulate = &ttm_pool_unpopulate,
.invalidate_caches = &radeon_invalidate_caches,
.init_mem_type = &radeon_init_mem_type,
.evict_flags = &radeon_evict_flags,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 082fcae..60f204d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -244,7 +244,7 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src,
unsigned long page,
pgprot_t prot)
 {
-   struct page *d = ttm_tt_get_page(ttm, page);
+   struct page *d = ttm->pages[page];
void *dst;

if (!d)
@@ -281,7 +281,7 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst,
unsigned long page,
pgprot_t prot)
 {
-   struct page *s = ttm_tt_get_page(ttm, page);
+   struct page *s = ttm->pages[page];
void *src;

if (!s)
@@ -342,6 +342,12 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo,
if (old_iomap == NULL && ttm == NULL)
goto out2;

+   if (ttm->state == tt_unpopulated) {
+   ret = ttm->bdev->driver->ttm_tt_populate(ttm);
+   if (ret)
+   goto out1;
+   }
+
add = 0;
dir = 1;

@@ -502,10 +508,16 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 {
struct ttm_mem_reg *mem = &bo->mem; pgprot_t prot;
struct ttm_tt *ttm = bo->ttm;
-   struct page *d;
-   int i;
+   int ret;

BUG_ON(!ttm);
+
+   if (ttm->state == tt_unpopulated) {
+   ret = ttm->bdev->driver->ttm_tt_populate(ttm);
+   if (ret)
+   return ret;
+   }
+
if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED)) {
/*
 * We're mapping a single page, and the desired
@@ -513,18 +525,9 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 */

map->bo_kmap_type = ttm_bo_map_kmap;
-   map->page = ttm_tt_get_page(ttm, start_page);
+   map->page = ttm->pages[start_page];
map->virtual = kmap(map->page);
} else {
-   /*
-* Populate the part we're mapping;
-*/
-   for (i = start_page; i < start_page + num_pages; ++i) {
-   d = ttm_tt_get_page(ttm, i);
-   if (!d)
-   return -ENOMEM;
-   }
-
/*

[PATCH 08/14] drm/ttm: merge ttm_backend and ttm_tt V5

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

ttm_backend will only exist with a ttm_tt, and ttm_tt
will only be of interest when bound to a backend. Merge them
to avoid code and data duplication.

V2 Rebase on top of memory accounting overhaul
V3 Rebase on top of more memory accounting changes
V4 Rebase on top of no memory account changes (where/when is my
   delorean when i need it ?)
V5 make sure ttm is unbound before destroying, change commit
   message on suggestion from Tormod Volden
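
The merge relies on embedding the common structure as the first member of the
driver structure, so a pointer to one can be cast to the other. A minimal
sketch, assuming hypothetical type names (ttm_like, driver_tt) in place of
the real ttm_tt and driver structures:

```c
#include <stddef.h>

/* Stand-in for the common structure; not the real struct ttm_tt. */
struct ttm_like {
    unsigned long num_pages;
};

/* Driver-private wrapper that embeds the common part at offset zero. */
struct driver_tt {
    struct ttm_like ttm;        /* must stay first for the cast below */
    unsigned long long offset;
};

/* Compile-time guard: the plain cast is only valid at offset zero. */
_Static_assert(offsetof(struct driver_tt, ttm) == 0,
               "ttm must be the first member");

/* Recover the driver structure from a pointer to its embedded member. */
static struct driver_tt *to_driver_tt(struct ttm_like *ttm)
{
    return (struct driver_tt *)ttm;
}
```

With a single allocation holding both parts, the separate backend object and
the fields it duplicated are no longer needed.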

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c|   14 ++-
 drivers/gpu/drm/nouveau/nouveau_drv.h   |5 +-
 drivers/gpu/drm/nouveau/nouveau_sgdma.c |  188 ---
 drivers/gpu/drm/radeon/radeon_ttm.c |  219 ---
 drivers/gpu/drm/ttm/ttm_agp_backend.c   |   88 +
 drivers/gpu/drm/ttm/ttm_bo.c|9 +-
 drivers/gpu/drm/ttm/ttm_tt.c|   59 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c  |   66 +++---
 include/drm/ttm/ttm_bo_driver.h |  104 ++-
 9 files changed, 294 insertions(+), 458 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 7cc37e6..3f116ba 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -343,8 +343,10 @@ nouveau_bo_wr32(struct nouveau_bo *nvbo, unsigned index, u32 val)
*mem = val;
 }

-static struct ttm_backend *
-nouveau_bo_create_ttm_backend_entry(struct ttm_bo_device *bdev)
+static struct ttm_tt *
+nouveau_ttm_tt_create(struct ttm_bo_device *bdev,
+ unsigned long size, uint32_t page_flags,
+ struct page *dummy_read_page)
 {
struct drm_nouveau_private *dev_priv = nouveau_bdev(bdev);
struct drm_device *dev = dev_priv->dev;
@@ -352,11 +354,13 @@ nouveau_bo_create_ttm_backend_entry(struct ttm_bo_device *bdev)
switch (dev_priv->gart_info.type) {
 #if __OS_HAS_AGP
case NOUVEAU_GART_AGP:
-   return ttm_agp_backend_init(bdev, dev->agp->bridge);
+   return ttm_agp_tt_create(bdev, dev->agp->bridge,
+size, page_flags, dummy_read_page);
 #endif
case NOUVEAU_GART_PDMA:
case NOUVEAU_GART_HW:
-   return nouveau_sgdma_init_ttm(dev);
+   return nouveau_sgdma_create_ttm(bdev, size, page_flags,
+   dummy_read_page);
default:
NV_ERROR(dev, "Unknown GART type %d\n",
 dev_priv->gart_info.type);
@@ -1045,7 +1049,7 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence)
 }

 struct ttm_bo_driver nouveau_bo_driver = {
-   .create_ttm_backend_entry = nouveau_bo_create_ttm_backend_entry,
+   .ttm_tt_create = &nouveau_ttm_tt_create,
.invalidate_caches = nouveau_bo_invalidate_caches,
.init_mem_type = nouveau_bo_init_mem_type,
.evict_flags = nouveau_bo_evict_flags,
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 29837da..0c53e39 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -1000,7 +1000,10 @@ extern int nouveau_sgdma_init(struct drm_device *);
 extern void nouveau_sgdma_takedown(struct drm_device *);
 extern uint32_t nouveau_sgdma_get_physical(struct drm_device *,
   uint32_t offset);
-extern struct ttm_backend *nouveau_sgdma_init_ttm(struct drm_device *);
+extern struct ttm_tt *nouveau_sgdma_create_ttm(struct ttm_bo_device *bdev,
+  unsigned long size,
+  uint32_t page_flags,
+  struct page *dummy_read_page);

 /* nouveau_debugfs.c */
 #if defined(CONFIG_DRM_NOUVEAU_DEBUG)
diff --git a/drivers/gpu/drm/nouveau/nouveau_sgdma.c b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
index b75258a..bc2ab90 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c
@@ -8,44 +8,23 @@
 #define NV_CTXDMA_PAGE_MASK  (NV_CTXDMA_PAGE_SIZE - 1)

 struct nouveau_sgdma_be {
-   struct ttm_backend backend;
+   struct ttm_tt ttm;
struct drm_device *dev;
-
-   dma_addr_t *pages;
-   unsigned nr_pages;
-   bool unmap_pages;
-
u64 offset;
-   bool bound;
 };

 static int
-nouveau_sgdma_populate(struct ttm_backend *be, unsigned long num_pages,
-  struct page **pages, struct page *dummy_read_page,
-  dma_addr_t *dma_addrs)
+nouveau_sgdma_dma_map(struct ttm_tt *ttm)
 {
-   struct nouveau_sgdma_be *nvbe = (struct nouveau_sgdma_be *)be;
+   struct nouveau_sgdma_be *nvbe = (struct nouveau_sgdma_be *)ttm;
struct drm_device *dev = nvbe->dev;
int i;

[PATCH 07/14] drm/ttm: page allocation use page array instead of list

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Use the ttm_tt pages array for pages allocations, move the list
unwinding into the page allocation functions.
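
The list unwinding amounts to walking the pool's list and storing each entry
into the caller's array. A minimal sketch, assuming a hypothetical singly
linked page_node type in place of the kernel's struct page and list_head:

```c
#include <stddef.h>

/* Stand-in for a page linked on a pool list; not the kernel's struct page. */
struct page_node {
    int id;
    struct page_node *next;
};

/*
 * Copy up to 'cap' node pointers from the list starting at 'head' into
 * pages[], preserving list order. Returns the number of slots filled.
 */
static size_t list_to_array(struct page_node *head,
                            struct page_node **pages, size_t cap)
{
    size_t count = 0;

    for (struct page_node *p = head; p != NULL && count < cap; p = p->next)
        pages[count++] = p;
    return count;
}
```

Once the pointers are in an array, callers can address a page by index rather
than splicing list entries in and out.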

Signed-off-by: Jerome Glisse 
---
 drivers/gpu/drm/ttm/ttm_page_alloc.c |   85 +-
 drivers/gpu/drm/ttm/ttm_tt.c |   36 +++
 include/drm/ttm/ttm_page_alloc.h |8 ++--
 3 files changed, 63 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 727e93d..0f3e6d2 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -619,8 +619,10 @@ static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool,
  * @return count of pages still required to fulfill the request.
  */
 static unsigned ttm_page_pool_get_pages(struct ttm_page_pool *pool,
-   struct list_head *pages, int ttm_flags,
-   enum ttm_caching_state cstate, unsigned count)
+   struct list_head *pages,
+   int ttm_flags,
+   enum ttm_caching_state cstate,
+   unsigned count)
 {
unsigned long irq_flags;
struct list_head *p;
@@ -664,13 +666,15 @@ out:
  * On success pages list will hold count number of correctly
  * cached pages.
  */
-int ttm_get_pages(struct list_head *pages, int flags,
- enum ttm_caching_state cstate, unsigned count,
+int ttm_get_pages(struct page **pages, int flags,
+ enum ttm_caching_state cstate, unsigned npages,
  dma_addr_t *dma_address)
 {
struct ttm_page_pool *pool = ttm_get_pool(flags, cstate);
+   struct list_head plist;
struct page *p = NULL;
gfp_t gfp_flags = GFP_USER;
+   unsigned count;
int r;

/* set zero flag for page allocation if required */
@@ -684,7 +688,7 @@ int ttm_get_pages(struct list_head *pages, int flags,
else
gfp_flags |= GFP_HIGHUSER;

-   for (r = 0; r < count; ++r) {
+   for (r = 0; r < npages; ++r) {
p = alloc_page(gfp_flags);
if (!p) {

@@ -693,85 +697,100 @@ int ttm_get_pages(struct list_head *pages, int flags,
return -ENOMEM;
}

-   list_add(&p->lru, pages);
+   pages[r] = p;
}
return 0;
}

-
/* combine zero flag to pool flags */
gfp_flags |= pool->gfp_flags;

/* First we take pages from the pool */
-   count = ttm_page_pool_get_pages(pool, pages, flags, cstate, count);
+   INIT_LIST_HEAD(&plist);
+   npages = ttm_page_pool_get_pages(pool, &plist, flags, cstate, npages);
+   count = 0;
+   list_for_each_entry(p, &plist, lru) {
+   pages[count++] = p;
+   }

/* clear the pages coming from the pool if requested */
if (flags & TTM_PAGE_FLAG_ZERO_ALLOC) {
-   list_for_each_entry(p, pages, lru) {
+   list_for_each_entry(p, &plist, lru) {
clear_page(page_address(p));
}
}

/* If pool didn't have enough pages allocate new one. */
-   if (count > 0) {
+   if (npages > 0) {
/* ttm_alloc_new_pages doesn't reference pool so we can run
 * multiple requests in parallel.
 **/
-   r = ttm_alloc_new_pages(pages, gfp_flags, flags, cstate, count);
+   INIT_LIST_HEAD(&plist);
+   r = ttm_alloc_new_pages(&plist, gfp_flags, flags, cstate, npages);
+   list_for_each_entry(p, &plist, lru) {
+   pages[count++] = p;
+   }
if (r) {
/* If there is any pages in the list put them back to
 * the pool. */
printk(KERN_ERR TTM_PFX
   "Failed to allocate extra pages "
   "for large request.");
-   ttm_put_pages(pages, 0, flags, cstate, NULL);
+   ttm_put_pages(pages, count, flags, cstate, NULL);
return r;
}
}

-
return 0;
 }

 /* Put all pages in pages list to correct pool to wait for reuse */
-void ttm_put_pages(struct list_head *pages, unsigned page_count, int flags,
+void ttm_put_pages(struct page **pages, unsigned npages, int flags,
   enum ttm_caching_state cstate, dma_addr_t *dma_address)
 {
unsigned long irq_flags;
struct ttm_page_pool *pool = ttm_get_pool(flags, cstate);
-   struct page *p, *tmp;
+   unsigned i;

if (pool == NULL) {
/* No pool for this memory type so free the pages */
-
-   list_for_each_entry_safe

[PATCH 06/14] drm/ttm: test for dma_address array allocation failure

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_tt.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 90527a2..54bbbad 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -320,7 +320,7 @@ struct ttm_tt *ttm_tt_create(struct ttm_bo_device *bdev, unsigned long size,
ttm->dummy_read_page = dummy_read_page;

ttm_tt_alloc_page_directory(ttm);
-   if (!ttm->pages) {
+   if (!ttm->pages || !ttm->dma_address) {
ttm_tt_destroy(ttm);
printk(KERN_ERR TTM_PFX "Failed allocating page table\n");
return NULL;
-- 
1.7.7.1



[PATCH 05/14] drm/ttm: use ttm put pages function to properly restore cache attribute

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

On failure we need to make sure the page we free has the wb cache
attribute. Do this by calling the proper ttm page helper function.

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_tt.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index f0c5ffd..90527a2 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -90,7 +90,10 @@ static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index)
}
return p;
 out_err:
-   put_page(p);
+   INIT_LIST_HEAD(&h);
+   list_add(&p->lru, &h);
+   ttm_put_pages(&h, 1, ttm->page_flags,
+ ttm->caching_state, &ttm->dma_address[index]);
return NULL;
 }

-- 
1.7.7.1



[PATCH 04/14] drm/ttm: remove unused backend flags field

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

This field is not used by any of the drivers; just drop it.

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/radeon/radeon_ttm.c |1 -
 include/drm/ttm/ttm_bo_driver.h |2 --
 2 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 0b5468b..97c76ae 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -787,7 +787,6 @@ struct ttm_backend *radeon_ttm_backend_create(struct radeon_device *rdev)
return NULL;
}
gtt->backend.bdev = &rdev->mman.bdev;
-   gtt->backend.flags = 0;
gtt->backend.func = &radeon_backend_func;
gtt->rdev = rdev;
gtt->pages = NULL;
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 9da182b..6d17140 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -106,7 +106,6 @@ struct ttm_backend_func {
  * struct ttm_backend
  *
  * @bdev: Pointer to a struct ttm_bo_device.
- * @flags: For driver use.
  * @func: Pointer to a struct ttm_backend_func that describes
  * the backend methods.
  *
@@ -114,7 +113,6 @@ struct ttm_backend_func {

 struct ttm_backend {
struct ttm_bo_device *bdev;
-   uint32_t flags;
struct ttm_backend_func *func;
 };

-- 
1.7.7.1



[PATCH 03/14] drm/ttm: remove split btw highmem and lowmem page

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

Split btw highmem and lowmem page was rendered useless by the
pool code. Remove it. Note further cleanup would change the
ttm page allocation helper to actually take an array instead
of relying on a list; this could drastically reduce the number of
function calls in the common case of allocating a whole buffer.

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_tt.c|   11 ++-
 include/drm/ttm/ttm_bo_driver.h |7 ---
 2 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index c68b0e7..f0c5ffd 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -70,7 +70,7 @@ static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index)
struct ttm_mem_global *mem_glob = ttm->glob->mem_glob;
int ret;

-   while (NULL == (p = ttm->pages[index])) {
+   if (NULL == (p = ttm->pages[index])) {

INIT_LIST_HEAD(&h);

@@ -86,10 +86,7 @@ static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index)
if (unlikely(ret != 0))
goto out_err;

-   if (PageHighMem(p))
-   ttm->pages[--ttm->first_himem_page] = p;
-   else
-   ttm->pages[++ttm->last_lomem_page] = p;
+   ttm->pages[index] = p;
}
return p;
 out_err:
@@ -271,8 +268,6 @@ static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm)
ttm_put_pages(&h, count, ttm->page_flags, ttm->caching_state,
  ttm->dma_address);
ttm->state = tt_unpopulated;
-   ttm->first_himem_page = ttm->num_pages;
-   ttm->last_lomem_page = -1;
 }

 void ttm_tt_destroy(struct ttm_tt *ttm)
@@ -316,8 +311,6 @@ struct ttm_tt *ttm_tt_create(struct ttm_bo_device *bdev, unsigned long size,

ttm->glob = bdev->glob;
ttm->num_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
-   ttm->first_himem_page = ttm->num_pages;
-   ttm->last_lomem_page = -1;
ttm->caching_state = tt_cached;
ttm->page_flags = page_flags;

diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 37527d6..9da182b 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -136,11 +136,6 @@ enum ttm_caching_state {
  * @dummy_read_page: Page to map where the ttm_tt page array contains a NULL
  * pointer.
  * @pages: Array of pages backing the data.
- * @first_himem_page: Himem pages are put last in the page array, which
- * enables us to run caching attribute changes on only the first part
- * of the page array containing lomem pages. This is the index of the
- * first himem page.
- * @last_lomem_page: Index of the last lomem page in the page array.
  * @num_pages: Number of pages in the page array.
  * @bdev: Pointer to the current struct ttm_bo_device.
  * @be: Pointer to the ttm backend.
@@ -157,8 +152,6 @@ enum ttm_caching_state {
 struct ttm_tt {
struct page *dummy_read_page;
struct page **pages;
-   long first_himem_page;
-   long last_lomem_page;
uint32_t page_flags;
unsigned long num_pages;
struct ttm_bo_global *glob;
-- 
1.7.7.1



[PATCH 02/14] drm/ttm: remove userspace backed ttm object support

2011-11-16 Thread j.gli...@gmail.com
From: Jerome Glisse 

This was never used by any of the drivers; properly using userspace
pages for bo would need more code (vma interaction mostly). Remove
this dead code in preparation of the ttm_tt & backend merge.

Signed-off-by: Jerome Glisse 
Reviewed-by: Konrad Rzeszutek Wilk 
Reviewed-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_bo.c|   22 
 drivers/gpu/drm/ttm/ttm_tt.c|  105 +--
 include/drm/ttm/ttm_bo_api.h|5 --
 include/drm/ttm/ttm_bo_driver.h |   24 -
 4 files changed, 1 insertions(+), 155 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 617b646..4bde335 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -342,22 +342,6 @@ static int ttm_bo_add_ttm(struct ttm_buffer_object *bo, bool zero_alloc)
if (unlikely(bo->ttm == NULL))
ret = -ENOMEM;
break;
-   case ttm_bo_type_user:
-   bo->ttm = ttm_tt_create(bdev, bo->num_pages << PAGE_SHIFT,
-   page_flags | TTM_PAGE_FLAG_USER,
-   glob->dummy_read_page);
-   if (unlikely(bo->ttm == NULL)) {
-   ret = -ENOMEM;
-   break;
-   }
-
-   ret = ttm_tt_set_user(bo->ttm, current,
- bo->buffer_start, bo->num_pages);
-   if (unlikely(ret != 0)) {
-   ttm_tt_destroy(bo->ttm);
-   bo->ttm = NULL;
-   }
-   break;
default:
printk(KERN_ERR TTM_PFX "Illegal buffer object type\n");
ret = -EINVAL;
@@ -907,16 +891,12 @@ static uint32_t ttm_bo_select_caching(struct ttm_mem_type_manager *man,
 }

 static bool ttm_bo_mt_compatible(struct ttm_mem_type_manager *man,
-bool disallow_fixed,
 uint32_t mem_type,
 uint32_t proposed_placement,
 uint32_t *masked_placement)
 {
uint32_t cur_flags = ttm_bo_type_flags(mem_type);

-   if ((man->flags & TTM_MEMTYPE_FLAG_FIXED) && disallow_fixed)
-   return false;
-
if ((cur_flags & proposed_placement & TTM_PL_MASK_MEM) == 0)
return false;

@@ -961,7 +941,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
man = &bdev->man[mem_type];

type_ok = ttm_bo_mt_compatible(man,
-   bo->type == ttm_bo_type_user,
mem_type,
placement->placement[i],
&cur_flags);
@@ -1009,7 +988,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
if (!man->has_type)
continue;
if (!ttm_bo_mt_compatible(man,
-   bo->type == ttm_bo_type_user,
mem_type,
placement->busy_placement[i],
&cur_flags))
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index f9cc548..c68b0e7 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -63,43 +63,6 @@ static void ttm_tt_free_page_directory(struct ttm_tt *ttm)
ttm->dma_address = NULL;
 }

-static void ttm_tt_free_user_pages(struct ttm_tt *ttm)
-{
-   int write;
-   int dirty;
-   struct page *page;
-   int i;
-   struct ttm_backend *be = ttm->be;
-
-   BUG_ON(!(ttm->page_flags & TTM_PAGE_FLAG_USER));
-   write = ((ttm->page_flags & TTM_PAGE_FLAG_WRITE) != 0);
-   dirty = ((ttm->page_flags & TTM_PAGE_FLAG_USER_DIRTY) != 0);
-
-   if (be)
-   be->func->clear(be);
-
-   for (i = 0; i < ttm->num_pages; ++i) {
-   page = ttm->pages[i];
-   if (page == NULL)
-   continue;
-
-   if (page == ttm->dummy_read_page) {
-   BUG_ON(write);
-   continue;
-   }
-
-   if (write && dirty && !PageReserved(page))
-   set_page_dirty_lock(page);
-
-   ttm->pages[i] = NULL;
-   ttm_mem_global_free(ttm->glob->mem_glob, PAGE_SIZE);
-   put_page(page);
-   }
-   ttm->state = tt_unpopulated;
-   ttm->first_himem_page = ttm->num_pages;
-   ttm->last_lomem_page = -1;
-}
-
 static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index)
 {
struct page *p;
@@ -326,10 +289,7 @@ void ttm_tt_destroy(struct ttm_tt *ttm)
}

if (likely(ttm->pages != NULL)) {
-   if (ttm->page_flags & TTM_PAGE_FLAG_

[PATCH 01/14] swiotlb: Expose swiotlb_nr_tbl function to modules

2011-11-16 Thread j.gli...@gmail.com
From: Konrad Rzeszutek Wilk 

As a mechanism to detect whether SWIOTLB is enabled or not.
We also fix the spelling - it was swioltb instead of
swiotlb.
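
The detection idea is that a non-zero slab count means SWIOTLB is in use. A
userspace model of that check, assuming hypothetical helper names
(swiotlb_nr_tbl_model, swiotlb_active, swiotlb_free_model); only
io_tlb_nslabs mirrors a name from the diff below:

```c
#include <stdbool.h>

/* Models the slab counter that the real accessor returns. */
static unsigned long io_tlb_nslabs;

/* Hypothetical stand-in for the exported accessor. */
static unsigned long swiotlb_nr_tbl_model(void)
{
    return io_tlb_nslabs;
}

/* A module would treat a non-zero slab count as "SWIOTLB is enabled". */
static bool swiotlb_active(void)
{
    return swiotlb_nr_tbl_model() != 0;
}

/* Models the free path: resetting the counter keeps the check accurate. */
static void swiotlb_free_model(void)
{
    io_tlb_nslabs = 0;
}
```

Resetting the counter on free is what keeps the check from reporting a stale
"enabled" state after the buffer is released.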

CC: FUJITA Tomonori 
[v1: Ripped out swiotlb_enabled]
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/xen/swiotlb-xen.c |2 +-
 include/linux/swiotlb.h   |2 +-
 lib/swiotlb.c |5 +++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 8e964b9..4864e5d 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -153,7 +153,7 @@ void __init xen_swiotlb_init(int verbose)
char *m = NULL;
unsigned int repeat = 3;

-   nr_tbl = swioltb_nr_tbl();
+   nr_tbl = swiotlb_nr_tbl();
if (nr_tbl)
xen_io_tlb_nslabs = nr_tbl;
else {
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 445702c..e872526 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -24,7 +24,7 @@ extern int swiotlb_force;

 extern void swiotlb_init(int verbose);
 extern void swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
-extern unsigned long swioltb_nr_tbl(void);
+extern unsigned long swiotlb_nr_tbl(void);

 /*
  * Enumeration for sync targets
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 99093b3..058935e 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -110,11 +110,11 @@ setup_io_tlb_npages(char *str)
 __setup("swiotlb=", setup_io_tlb_npages);
 /* make io_tlb_overflow tunable too? */

-unsigned long swioltb_nr_tbl(void)
+unsigned long swiotlb_nr_tbl(void)
 {
return io_tlb_nslabs;
 }
-
+EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 /* Note that this doesn't work with highmem page */
 static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev,
  volatile void *address)
@@ -321,6 +321,7 @@ void __init swiotlb_free(void)
free_bootmem_late(__pa(io_tlb_start),
  PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
}
+   io_tlb_nslabs = 0;
 }

 static int is_swiotlb_buffer(phys_addr_t paddr)
-- 
1.7.7.1



ttm: merge ttm_backend & ttm_tt, introduce ttm dma allocator V6

2011-11-16 Thread j.gli...@gmail.com
Respin of some of the patches with syntax/typo fixes + patch 10 with proper
memory accounting of pages being freed.

Cheers,
Jerome



Re: drm pixel formats update

2011-11-16 Thread Alan Cox
> If anyone has problems with the way the formats are defined, please
> speak up now! Since only Jesse has bothered to comment on my rantings
> I can only assume people are happy with my approach to things.

Umm .. no. I don't see why they are needed. It's just an extra layer of
gratuitous confusing indirection. The rest of the world speaks and
understands FourCC so for all the formats covered by an existing FourCC
name we should just use the existing name.

You might need to check one now and then but everyone doing video
processing is familiar with them including all the Windows folk.


[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #2 from almos  2011-11-16 11:52:30 PST ---
The hw is barts pro (hd6850). The only part changed is mesa: 7.11 is installed
(debian unstable), and I compiled one from git. In the latter case I start
programs as
LD_LIBRARY_PATH=/home/almos/SRC/mesa/lib/ \
LIBGL_DRIVERS_PATH=/home/almos/SRC/mesa/lib/gallium "$@"

I'll try to bisect later.



[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42999

--- Comment #1 from Alex Deucher  2011-11-16 11:45:32 PST ---
I doubt you are using radeonhd.  Please attach your xorg log and dmesg output.



[Bug 43000] huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43000

--- Comment #1 from Alex Deucher  2011-11-16 11:42:52 PST ---
What hardware are you using?  Is mesa the only part that changed or did you
also update your kernel and/or ddx?  If it's just mesa, can you bisect?  If
it's multiple parts that you upgraded can you track down what component caused
the problem?



[Bug 42999] Notebook with AMD 6520G (A6-3400M) does not resume from suspend

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42999

Alex Deucher  changed:

   What|Removed |Added

 AssignedTo|e...@pdx.freedesktop.org|dri-devel@lists.freedesktop
   ||.org
  QAContact|xorg-t...@lists.x.org   |
Product|xorg|DRI
Version|git |unspecified
  Component|Driver/radeonhd |DRM/Radeon



[Bug 43000] New: huge performance regression in ut2004 since 7.11

2011-11-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43000

 Bug #: 43000
   Summary: huge performance regression in ut2004 since 7.11
Classification: Unclassified
   Product: Mesa
   Version: git
  Platform: Other
OS/Version: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Gallium/r600
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: aaalmo...@gmail.com


With 7.11 I get 60fps during the nvidia logo and in the menu. In game it is, e.g.,
~44fps if I load ons-torlan and look at the central tower from the base.

With 7.12-dev (git-b618e78) I get <30fps during the nvidia logo, and ~6fps on
the same level.

I must add, that 7.11 isn't quite playable either, because the fps has very
high variance: it jumps between 20 and 60, which makes the game very laggy.



Re: [PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread Ilyes Gouta
Hi Ville,

Regarding 3 plane YCbCr, DRM_FORMAT_yuv444 (non sub-sampled YCbCr)
would also be useful.

-Ilyes

On Wed, Nov 16, 2011 at 7:42 PM,   wrote:
> From: Ville Syrjälä 
>
> Name the formats as DRM_FORMAT_X instead of DRM_FOURCC_X. Use consistent
> names, especially for the RGB formats. Component order and byte order are
> now strictly specified for each format.
>
> The RGB format naming follows a convention where the component names
> and sizes are listed from left to right, matching the order within a
> single pixel from most significant bit to least significant bit. Lower
> case letters are used when listing the components to improve
> readability. I believe this convention matches the one used by pixman.
>
> The YUV format names vary more. For the 4:2:2 packed formats and 2
> plane formats use the fourcc. For the three plane formats the
> name includes the plane order and subsampling information using the
> standard subsampling notation. Some of those also happen to match
> the official fourcc definition.
>
> The fourccs for all the RGB formats and some of the YUV formats
> I invented myself. The idea was that looking at just the fourcc you
> get some idea what the format is about without having to decode it
> using some external reference.
>
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/drm_crtc.c           |   18 +++---
>  drivers/gpu/drm/drm_crtc_helper.c    |   39 --
>  drivers/gpu/drm/i915/intel_display.c |   18 ---
>  include/drm/drm_fourcc.h             |   96 
> --
>  4 files changed, 121 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
> index 30a70a4..761f265 100644
> --- a/drivers/gpu/drm/drm_crtc.c
> +++ b/drivers/gpu/drm/drm_crtc.c
> @@ -1918,28 +1918,28 @@ uint32_t drm_mode_legacy_fb_format(uint32_t bpp, uint32_t depth)
>
>        switch (bpp) {
>        case 8:
> -               fmt = DRM_FOURCC_RGB332;
> +               fmt = DRM_FORMAT_r3g3b2;
>                break;
>        case 16:
>                if (depth == 15)
> -                       fmt = DRM_FOURCC_RGB555;
> +                       fmt = DRM_FORMAT_x1r5g5b5;
>                else
> -                       fmt = DRM_FOURCC_RGB565;
> +                       fmt = DRM_FORMAT_r5g6b5;
>                break;
>        case 24:
> -               fmt = DRM_FOURCC_RGB24;
> +               fmt = DRM_FORMAT_r8g8b8;
>                break;
>        case 32:
>                if (depth == 24)
> -                       fmt = DRM_FOURCC_RGB24;
> +                       fmt = DRM_FORMAT_x8r8g8b8;
>                else if (depth == 30)
> -                       fmt = DRM_INTEL_RGB30;
> +                       fmt = DRM_FORMAT_x2r10g10b10;
>                else
> -                       fmt = DRM_FOURCC_RGB32;
> +                       fmt = DRM_FORMAT_a8r8g8b8;
>                break;
>        default:
> -               DRM_ERROR("bad bpp, assuming RGB24 pixel format\n");
> -               fmt = DRM_FOURCC_RGB24;
> +               DRM_ERROR("bad bpp, assuming x8r8g8b8 pixel format\n");
> +               fmt = DRM_FORMAT_x8r8g8b8;
>                break;
>        }
>
> diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
> index 3e0645c..4ef19d37 100644
> --- a/drivers/gpu/drm/drm_crtc_helper.c
> +++ b/drivers/gpu/drm/drm_crtc_helper.c
> @@ -816,27 +816,54 @@ void drm_helper_get_fb_bpp_depth(uint32_t format, unsigned int *depth,
>                                 int *bpp)
>  {
>        switch (format) {
> -       case DRM_FOURCC_RGB332:
> +       case DRM_FORMAT_r3g3b2:
> +       case DRM_FORMAT_b2g3r3:
>                *depth = 8;
>                *bpp = 8;
>                break;
> -       case DRM_FOURCC_RGB555:
> +       case DRM_FORMAT_x1r5g5b5:
> +       case DRM_FORMAT_x1b5g5r5:
> +       case DRM_FORMAT_r5g5b5x1:
> +       case DRM_FORMAT_b5g5r5x1:
> +       case DRM_FORMAT_a1r5g5b5:
> +       case DRM_FORMAT_a1b5g5r5:
> +       case DRM_FORMAT_r5g5b5a1:
> +       case DRM_FORMAT_b5g5r5a1:
>                *depth = 15;
>                *bpp = 16;
>                break;
> -       case DRM_FOURCC_RGB565:
> +       case DRM_FORMAT_r5g6b5:
> +       case DRM_FORMAT_b5g6r5:
>                *depth = 16;
>                *bpp = 16;
>                break;
> -       case DRM_FOURCC_RGB24:
> +       case DRM_FORMAT_r8g8b8:
> +       case DRM_FORMAT_b8g8r8:
> +               *depth = 24;
> +               *bpp = 24;
> +               break;
> +       case DRM_FORMAT_x8r8g8b8:
> +       case DRM_FORMAT_x8b8g8r8:
> +       case DRM_FORMAT_r8g8b8x8:
> +       case DRM_FORMAT_b8g8r8x8:
>                *depth = 24;
>                *bpp = 32;
>                break;
> -       case DRM_INTEL_RGB30:
> +       case DRM_FORMAT_x2r10g10b10:
> +       case DRM_FORMAT_x2b10g10r10:
> +       case DRM_FORMAT_r10g10b10x2:
> +       case DRM_FORMAT_b10g10r10x2:
> +       case DRM_FORMAT_a2r10g10b1
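The naming convention described in the quoted commit message (components listed from most significant bit to least significant bit) can be illustrated with a minimal sketch. The helper names below are hypothetical and are not part of the patch; they only show where each component would sit inside the pixel value for three of the formats named above. The sketch says nothing about how that value is laid out in memory, which is the byte order question raised elsewhere in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, NOT from the patch: they pack components so that
 * the leftmost component in the format name lands in the most significant
 * bits of the pixel value.
 */

/* x8r8g8b8: bits 31..24 unused, 23..16 red, 15..8 green, 7..0 blue */
static uint32_t pack_x8r8g8b8(uint8_t r, uint8_t g, uint8_t b)
{
	return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* r5g6b5: bits 15..11 red, 10..5 green, 4..0 blue */
static uint16_t pack_r5g6b5(uint8_t r, uint8_t g, uint8_t b)
{
	return (uint16_t)(((r & 0x1f) << 11) | ((g & 0x3f) << 5) | (b & 0x1f));
}

/* x1r5g5b5: bit 15 unused, 14..10 red, 9..5 green, 4..0 blue */
static uint16_t pack_x1r5g5b5(uint8_t r, uint8_t g, uint8_t b)
{
	return (uint16_t)(((r & 0x1f) << 10) | ((g & 0x1f) << 5) | (b & 0x1f));
}
```

Under these assumptions, a full-intensity red pixel in r5g6b5 sets only the top five bits of the 16-bit value, while the same pixel in x1r5g5b5 leaves bit 15 clear.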

Re: [PATCH] drm/radeon: introduce a sub allocator and convert ib pool to it

2011-11-16 Thread Jerome Glisse
On Wed, Nov 16, 2011 at 2:18 PM,   wrote:
> From: Jerome Glisse 
>
> Somewhat specialized sub-allocator designed to perform sub-allocation
> for command buffers, not only for the current cs ioctl but for future
> command submission ioctls as well. The patch also converts the current ib
> pool to use the sub-allocator. The idea is that the ib pool buffer can be
> shared with other command buffer submissions not having 64K granularity.
>
> Signed-off-by: Jerome Glisse 

Ignore first send (was wrong patch).

Cheers,
Jerome


[PATCH] drm/radeon: introduce a sub allocator and convert ib pool to it

2011-11-16 Thread j . glisse
From: Jerome Glisse 

Somewhat specialized sub-allocator designed to perform sub-allocation
for command buffers, not only for the current cs ioctl but for future
command submission ioctls as well. The patch also converts the current ib
pool to use the sub-allocator. The idea is that the ib pool buffer can be
shared with other command buffer submissions not having 64K granularity.

Signed-off-by: Jerome Glisse 
---
 drivers/gpu/drm/radeon/Makefile        |    2 +-
 drivers/gpu/drm/radeon/radeon.h        |   66 --
 drivers/gpu/drm/radeon/radeon_object.h |   18 +++
 drivers/gpu/drm/radeon/radeon_ring.c   |  239 
 drivers/gpu/drm/radeon/radeon_sa.c     |  186 +
 5 files changed, 346 insertions(+), 165 deletions(-)
 create mode 100644 drivers/gpu/drm/radeon/radeon_sa.c

diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
index 94dcdc7..2139fe8 100644
--- a/drivers/gpu/drm/radeon/Makefile
+++ b/drivers/gpu/drm/radeon/Makefile
@@ -71,7 +71,7 @@ radeon-y += radeon_device.o radeon_asic.o radeon_kms.o \
 	r600_blit_kms.o radeon_pm.o atombios_dp.o r600_audio.o r600_hdmi.o \
 	evergreen.o evergreen_cs.o evergreen_blit_shaders.o evergreen_blit_kms.o \
 	radeon_trace_points.o ni.o cayman_blit_shaders.o atombios_encoders.o \
-	radeon_semaphore.o
+	radeon_semaphore.o radeon_sa.o
 
 radeon-$(CONFIG_COMPAT) += radeon_ioc32.o
 radeon-$(CONFIG_VGA_SWITCHEROO) += radeon_atpx_handler.o
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index b85f8a9..267bd92 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -305,6 +305,53 @@ struct radeon_bo_list {
 	u32			tiling_flags;
 };
 
+/* Sub-allocation manager. It has to be protected by another lock.
+ * By design this is a helper for other parts of the driver, such as
+ * the indirect buffer or semaphore code, which both have their own
+ * locking.
+ *
+ * The principle is simple: we keep a list of sub-allocations in offset
+ * order (the first entry has offset == 0, the last entry has the
+ * highest offset).
+ *
+ * When allocating a new object we first check whether there is room
+ * at the end: total_size - (last_object_offset + last_object_size) >=
+ * alloc_size. If so, we allocate the new object there.
+ *
+ * When there is not enough room at the end, we wait for each sub-object
+ * in turn until we reach one with object_offset + object_size >=
+ * alloc_size; this object then becomes the sub-object we return.
+ *
+ * Alignment can't be bigger than the page size.
+ *
+ * Holes are not considered for allocation, to keep things simple.
+ * The assumption is that there won't be holes (all objects have the
+ * same alignment).
+ */
+struct radeon_sa_manager {
+	struct radeon_bo	*bo;
+	struct list_head	sa_bo;
+	unsigned		size;
+	uint64_t		gpu_addr;
+	void			*cpu_ptr;
+};
+
+struct radeon_sa_bo;
+typedef void (*radeon_sa_bo_destroy_t)(struct radeon_device *rdev,
+				       struct radeon_sa_bo *sa_bo);
+typedef bool (*radeon_sa_bo_done_t)(struct radeon_device *rdev,
+				    struct radeon_sa_bo *sa_bo);
+
+/* sub-allocation buffer */
+struct radeon_sa_bo {
+	struct list_head		list;
+	struct radeon_sa_manager	*manager;
+	unsigned			offset;
+	unsigned			size;
+	radeon_sa_bo_destroy_t		destroy;
+	radeon_sa_bo_done_t		done;
+};
+
 /*
  * GEM objects.
  */
@@ -503,13 +550,12 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
 #define CAYMAN_RING_TYPE_CP2_INDEX 2
 
 struct radeon_ib {
-	struct list_head	list;
+	struct radeon_sa_bo	sa_bo;
 	unsigned		idx;
+	uint32_t		length_dw;
 	uint64_t		gpu_addr;
-	struct radeon_fence	*fence;
 	uint32_t		*ptr;
-	uint32_t		length_dw;
-	bool			free;
+	struct radeon_fence	*fence;
 };
 
 /*
@@ -517,12 +563,11 @@ struct radeon_ib {
  * mutex protects scheduled_ibs, ready, alloc_bm
  */
 struct radeon_ib_pool {
-	struct mutex		mutex;
-	struct radeon_bo	*robj;
-	struct list_head	bogus_ib;
-	struct radeon_ib	ibs[RADEON_IB_POOL_SIZE];
-	bool			ready;
-	unsigned		head_id;
+	struct mutex			mutex;
+	struct radeon_sa_manager	sa_manager;
+	struct radeon_ib		ibs[RADEON_IB_POOL_SIZE];
+	bool				ready;
+	unsigned			head_id;
 };
 
 struct radeon_ring {
@@ -601,7 +646,6 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 voi
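
The allocation policy described in the sub-allocation manager comment above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not the code from radeon_sa.c (which is cut off in this archive): the names sa_manager, sa_object and sa_alloc are hypothetical, a fixed array stands in for the kernel list, a plain done flag stands in for the done() callback, alignment is ignored, and where the driver would wait for an object the sketch simply fails.

```c
#include <stdbool.h>
#include <string.h>

#define SA_MAX_OBJECTS 16	/* arbitrary limit for this sketch */

struct sa_object {
	unsigned offset;
	unsigned size;
	bool done;	/* assumption: stands in for the done() callback */
};

struct sa_manager {
	unsigned size;				/* total size of the backing buffer */
	unsigned count;				/* number of live sub-allocations */
	struct sa_object objs[SA_MAX_OBJECTS];	/* kept in offset order */
};

/* Returns the offset of the new sub-allocation, or -1 on failure. */
static long sa_alloc(struct sa_manager *m, unsigned alloc_size)
{
	struct sa_object obj = { 0, alloc_size, false };
	unsigned end = 0, i;

	if (alloc_size == 0 || alloc_size > m->size)
		return -1;

	if (m->count)
		end = m->objs[m->count - 1].offset + m->objs[m->count - 1].size;

	/* First choice: room after the last object. */
	if (m->count < SA_MAX_OBJECTS && m->size - end >= alloc_size) {
		obj.offset = end;
		m->objs[m->count++] = obj;
		return (long)end;
	}

	/* Otherwise reclaim finished objects from the start of the buffer. */
	for (i = 0; i < m->count; i++) {
		if (!m->objs[i].done)
			return -1;	/* a real driver would wait here */
		if (m->objs[i].offset + m->objs[i].size >= alloc_size)
			break;
	}
	if (i == m->count)
		return -1;

	/* Objects 0..i are finished; replace them with one object at offset 0. */
	memmove(&m->objs[1], &m->objs[i + 1],
		(m->count - i - 1) * sizeof(m->objs[0]));
	m->objs[0] = obj;
	m->count -= i;
	return 0;
}
```

With a 100-byte manager, two 40-byte allocations land at offsets 0 and 40; a third fails until the first object is marked done, after which it reuses offset 0. Space between the reused range and the next live object is never handed out, which mirrors the "holes are not considered" simplification in the comment.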

Re: drm pixel formats update

2011-11-16 Thread James Simmons

> I decided to go all out with the pixel format definitions. Added pretty
> much all of the possible RGB/BGR variations. Just left out ones with
> 16bit components and floats. Also added a whole bunch of YUV formats,
> and 8 bit pseudocolor for good measure.

Thank you for including the pseudocolor as well.

