[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Michel Dänzer
On Don, 2012-02-02 at 10:26 -0500, Ilija Hadzic wrote: 
> Evergreen and NI blit copy was broken if the buffer maps to a rectangle
> whose one dimension is 16384 (max dimension allowed by these chips).
> In the mainline kernel, the problem is exposed only when buffers are
> very large (1G), but it's still a problem. The problem could be exposed
> for smaller buffers if anyone modifies the algorithm for rectangle
> construction in r600_blit_create_rect() (the reason why someone would
> modify that algorithm is to tune the performance of buffer moves).
> 
> The root cause was in i2f() function which only operated on range between
> 0 and 16383. Fix this by extending the range of i2f() function to 0 to
> 32767.
> 
> While at it improve the function so that the range can be easily
> extended in the future (if it becomes necessary), cleanup lines
> over 80 characters, and replace in-line comments with one strategic
> comment that explains the crux of the function.
> 
> Credits to michel at daenzer.net for pointing out the root cause of
> the bug.
> 
> v2: Fix I2F_MAX_INPUT constant definition goof and warn only once
> if input argument is out of range. Edit the comment a little
> bit to avoid some linguistic confusion and make it look better
> in general.
> 
> Signed-off-by: Ilija Hadzic 

Reviewed-by: Michel D?nzer 


-- 
Earthling Michel D?nzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Alex Deucher
On Thu, Feb 2, 2012 at 10:26 AM, Ilija Hadzic
 wrote:
> Evergreen and NI blit copy was broken if the buffer maps to a rectangle
> whose one dimension is 16384 (max dimension allowed by these chips).
> In the mainline kernel, the problem is exposed only when buffers are
> very large (1G), but it's still a problem. The problem could be exposed
> for smaller buffers if anyone modifies the algorithm for rectangle
> construction in r600_blit_create_rect() (the reason why someone would
> modify that algorithm is to tune the performance of buffer moves).
>
> The root cause was in i2f() function which only operated on range between
> 0 and 16383. Fix this by extending the range of i2f() function to 0 to
> 32767.
>
> While at it improve the function so that the range can be easily
> extended in the future (if it becomes necessary), cleanup lines
> over 80 characters, and replace in-line comments with one strategic
> comment that explains the crux of the function.
>
> Credits to michel at daenzer.net for pointing out the root cause of
> the bug.
>
> v2: Fix I2F_MAX_INPUT constant definition goof and warn only once
> ? ?if input argument is out of range. Edit the comment a little
> ? ?bit to avoid some linguistic confusion and make it look better
> ? ?in general.
>
> Signed-off-by: Ilija Hadzic 

Should probably CC stable as well.

Reviewed-by: Alex Deucher 

> ---
> ?drivers/gpu/drm/radeon/r600_blit_kms.c | ? 35 ++-
> ?1 files changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c 
> b/drivers/gpu/drm/radeon/r600_blit_kms.c
> index d996f43..accc032 100644
> --- a/drivers/gpu/drm/radeon/r600_blit_kms.c
> +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
> @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev)
> ? ? ? ?radeon_ring_write(ring, sq_stack_resource_mgmt_2);
> ?}
>
> +#define I2F_MAX_BITS 15
> +#define I2F_MAX_INPUT ?((1 << I2F_MAX_BITS) - 1)
> +#define I2F_SHIFT (24 - I2F_MAX_BITS)
> +
> +/*
> + * Converts unsigned integer into 32-bit IEEE floating point representation.
> + * Conversion is not universal and only works for the range from 0
> + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between
> + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary,
> + * I2F_MAX_BITS can be increased, but that will add to the loop iterations
> + * and slow us down. Conversion is done by shifting the input and counting
> + * down until the first 1 reaches bit position 23. The resulting counter
> + * and the shifted input are, respectively, the exponent and the fraction.
> + * The sign is always zero.
> + */
> ?static uint32_t i2f(uint32_t input)
> ?{
> ? ? ? ?u32 result, i, exponent, fraction;
>
> - ? ? ? if ((input & 0x3fff) == 0)
> - ? ? ? ? ? ? ? result = 0; /* 0 is a special case */
> + ? ? ? WARN_ON_ONCE(input > I2F_MAX_INPUT);
> +
> + ? ? ? if ((input & I2F_MAX_INPUT) == 0)
> + ? ? ? ? ? ? ? result = 0;
> ? ? ? ?else {
> - ? ? ? ? ? ? ? exponent = 140; /* exponent biased by 127; */
> - ? ? ? ? ? ? ? fraction = (input & 0x3fff) << 10; /* cheat and only
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? handle numbers below 
> 2^^15 */
> - ? ? ? ? ? ? ? for (i = 0; i < 14; i++) {
> + ? ? ? ? ? ? ? exponent = 126 + I2F_MAX_BITS;
> + ? ? ? ? ? ? ? fraction = (input & I2F_MAX_INPUT) << I2F_SHIFT;
> +
> + ? ? ? ? ? ? ? for (i = 0; i < I2F_MAX_BITS; i++) {
> ? ? ? ? ? ? ? ? ? ? ? ?if (fraction & 0x80)
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?break;
> ? ? ? ? ? ? ? ? ? ? ? ?else {
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? fraction = fraction << 1; /* keep
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?shifting left 
> until top bit = 1 */
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? fraction = fraction << 1;
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?exponent = exponent - 1;
> ? ? ? ? ? ? ? ? ? ? ? ?}
> ? ? ? ? ? ? ? ?}
> - ? ? ? ? ? ? ? result = exponent << 23 | (fraction & 0x7f); /* mask
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? off top 
> bit; assumed 1 */
> + ? ? ? ? ? ? ? result = exponent << 23 | (fraction & 0x7f);
> ? ? ? ?}
> ? ? ? ?return result;
> ?}
> --
> 1.7.7
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Ilija Hadzic
Evergreen and NI blit copy was broken if the buffer maps to a rectangle
whose one dimension is 16384 (max dimension allowed by these chips).
In the mainline kernel, the problem is exposed only when buffers are
very large (1G), but it's still a problem. The problem could be exposed
for smaller buffers if anyone modifies the algorithm for rectangle
construction in r600_blit_create_rect() (the reason why someone would
modify that algorithm is to tune the performance of buffer moves).

The root cause was in i2f() function which only operated on range between
0 and 16383. Fix this by extending the range of i2f() function to 0 to
32767.

While at it improve the function so that the range can be easily
extended in the future (if it becomes necessary), cleanup lines
over 80 characters, and replace in-line comments with one strategic
comment that explains the crux of the function.

Credits to michel at daenzer.net for pointing out the root cause of
the bug.

v2: Fix I2F_MAX_INPUT constant definition goof and warn only once
if input argument is out of range. Edit the comment a little
bit to avoid some linguistic confusion and make it look better
in general.

Signed-off-by: Ilija Hadzic 
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   35 ++-
 1 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c 
b/drivers/gpu/drm/radeon/r600_blit_kms.c
index d996f43..accc032 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev)
radeon_ring_write(ring, sq_stack_resource_mgmt_2);
 }

+#define I2F_MAX_BITS 15
+#define I2F_MAX_INPUT  ((1 << I2F_MAX_BITS) - 1)
+#define I2F_SHIFT (24 - I2F_MAX_BITS)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Conversion is not universal and only works for the range from 0
+ * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between
+ * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary,
+ * I2F_MAX_BITS can be increased, but that will add to the loop iterations
+ * and slow us down. Conversion is done by shifting the input and counting
+ * down until the first 1 reaches bit position 23. The resulting counter
+ * and the shifted input are, respectively, the exponent and the fraction.
+ * The sign is always zero.
+ */
 static uint32_t i2f(uint32_t input)
 {
u32 result, i, exponent, fraction;

-   if ((input & 0x3fff) == 0)
-   result = 0; /* 0 is a special case */
+   WARN_ON_ONCE(input > I2F_MAX_INPUT);
+
+   if ((input & I2F_MAX_INPUT) == 0)
+   result = 0;
else {
-   exponent = 140; /* exponent biased by 127; */
-   fraction = (input & 0x3fff) << 10; /* cheat and only
- handle numbers below 
2^^15 */
-   for (i = 0; i < 14; i++) {
+   exponent = 126 + I2F_MAX_BITS;
+   fraction = (input & I2F_MAX_INPUT) << I2F_SHIFT;
+
+   for (i = 0; i < I2F_MAX_BITS; i++) {
if (fraction & 0x80)
break;
else {
-   fraction = fraction << 1; /* keep
-shifting left 
until top bit = 1 */
+   fraction = fraction << 1;
exponent = exponent - 1;
}
}
-   result = exponent << 23 | (fraction & 0x7f); /* mask
-   off top 
bit; assumed 1 */
+   result = exponent << 23 | (fraction & 0x7f);
}
return result;
 }
-- 
1.7.7



[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Ilija Hadzic


On Thu, 2 Feb 2012, Alex Deucher wrote:

>
> Should probably CC stable as well.
>

I was thinking of that, but decided not to because it's in the gray area 
of this rule per Documentation/stable_kernel_rules.txt

  - It must fix a real bug that bothers people (not a, "This could be a
problem..." type thing).

-- Ilija


[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Ilija Hadzic
Evergreen and NI blit copy was broken if the buffer maps to a rectangle
whose one dimension is 16384 (max dimension allowed by these chips).
In the mainline kernel, the problem is exposed only when buffers are
very large (1G), but it's still a problem. The problem could be exposed
for smaller buffers if anyone modifies the algorithm for rectangle
construction in r600_blit_create_rect() (the reason why someone would
modify that algorithm is to tune the performance of buffer moves).

The root cause was in i2f() function which only operated on range between
0 and 16383. Fix this by extending the range of i2f() function to 0 to
32767.

While at it improve the function so that the range can be easily
extended in the future (if it becomes necessary), cleanup lines
over 80 characters, and replace in-line comments with one strategic
comment that explains the crux of the function.

Credits to mic...@daenzer.net for pointing out the root cause of
the bug.

v2: Fix I2F_MAX_INPUT constant definition goof and warn only once
if input argument is out of range. Edit the comment a little
bit to avoid some linguistic confusion and make it look better
in general.

Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   35 ++-
 1 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c 
b/drivers/gpu/drm/radeon/r600_blit_kms.c
index d996f43..accc032 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev)
radeon_ring_write(ring, sq_stack_resource_mgmt_2);
 }
 
+#define I2F_MAX_BITS 15
+#define I2F_MAX_INPUT  ((1  I2F_MAX_BITS) - 1)
+#define I2F_SHIFT (24 - I2F_MAX_BITS)
+
+/*
+ * Converts unsigned integer into 32-bit IEEE floating point representation.
+ * Conversion is not universal and only works for the range from 0
+ * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between
+ * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary,
+ * I2F_MAX_BITS can be increased, but that will add to the loop iterations
+ * and slow us down. Conversion is done by shifting the input and counting
+ * down until the first 1 reaches bit position 23. The resulting counter
+ * and the shifted input are, respectively, the exponent and the fraction.
+ * The sign is always zero.
+ */
 static uint32_t i2f(uint32_t input)
 {
u32 result, i, exponent, fraction;
 
-   if ((input  0x3fff) == 0)
-   result = 0; /* 0 is a special case */
+   WARN_ON_ONCE(input  I2F_MAX_INPUT);
+
+   if ((input  I2F_MAX_INPUT) == 0)
+   result = 0;
else {
-   exponent = 140; /* exponent biased by 127; */
-   fraction = (input  0x3fff)  10; /* cheat and only
- handle numbers below 
2^^15 */
-   for (i = 0; i  14; i++) {
+   exponent = 126 + I2F_MAX_BITS;
+   fraction = (input  I2F_MAX_INPUT)  I2F_SHIFT;
+
+   for (i = 0; i  I2F_MAX_BITS; i++) {
if (fraction  0x80)
break;
else {
-   fraction = fraction  1; /* keep
-shifting left 
until top bit = 1 */
+   fraction = fraction  1;
exponent = exponent - 1;
}
}
-   result = exponent  23 | (fraction  0x7f); /* mask
-   off top 
bit; assumed 1 */
+   result = exponent  23 | (fraction  0x7f);
}
return result;
 }
-- 
1.7.7

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Alex Deucher
On Thu, Feb 2, 2012 at 10:26 AM, Ilija Hadzic
ihad...@research.bell-labs.com wrote:
 Evergreen and NI blit copy was broken if the buffer maps to a rectangle
 whose one dimension is 16384 (max dimension allowed by these chips).
 In the mainline kernel, the problem is exposed only when buffers are
 very large (1G), but it's still a problem. The problem could be exposed
 for smaller buffers if anyone modifies the algorithm for rectangle
 construction in r600_blit_create_rect() (the reason why someone would
 modify that algorithm is to tune the performance of buffer moves).

 The root cause was in i2f() function which only operated on range between
 0 and 16383. Fix this by extending the range of i2f() function to 0 to
 32767.

 While at it improve the function so that the range can be easily
 extended in the future (if it becomes necessary), cleanup lines
 over 80 characters, and replace in-line comments with one strategic
 comment that explains the crux of the function.

 Credits to mic...@daenzer.net for pointing out the root cause of
 the bug.

 v2: Fix I2F_MAX_INPUT constant definition goof and warn only once
    if input argument is out of range. Edit the comment a little
    bit to avoid some linguistic confusion and make it look better
    in general.

 Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com

Should probably CC stable as well.

Reviewed-by: Alex Deucher alexander.deuc...@amd.com

 ---
  drivers/gpu/drm/radeon/r600_blit_kms.c |   35 ++-
  1 files changed, 25 insertions(+), 10 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c 
 b/drivers/gpu/drm/radeon/r600_blit_kms.c
 index d996f43..accc032 100644
 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c
 +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
 @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev)
        radeon_ring_write(ring, sq_stack_resource_mgmt_2);
  }

 +#define I2F_MAX_BITS 15
 +#define I2F_MAX_INPUT  ((1  I2F_MAX_BITS) - 1)
 +#define I2F_SHIFT (24 - I2F_MAX_BITS)
 +
 +/*
 + * Converts unsigned integer into 32-bit IEEE floating point representation.
 + * Conversion is not universal and only works for the range from 0
 + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between
 + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary,
 + * I2F_MAX_BITS can be increased, but that will add to the loop iterations
 + * and slow us down. Conversion is done by shifting the input and counting
 + * down until the first 1 reaches bit position 23. The resulting counter
 + * and the shifted input are, respectively, the exponent and the fraction.
 + * The sign is always zero.
 + */
  static uint32_t i2f(uint32_t input)
  {
        u32 result, i, exponent, fraction;

 -       if ((input  0x3fff) == 0)
 -               result = 0; /* 0 is a special case */
 +       WARN_ON_ONCE(input  I2F_MAX_INPUT);
 +
 +       if ((input  I2F_MAX_INPUT) == 0)
 +               result = 0;
        else {
 -               exponent = 140; /* exponent biased by 127; */
 -               fraction = (input  0x3fff)  10; /* cheat and only
 -                                                     handle numbers below 
 2^^15 */
 -               for (i = 0; i  14; i++) {
 +               exponent = 126 + I2F_MAX_BITS;
 +               fraction = (input  I2F_MAX_INPUT)  I2F_SHIFT;
 +
 +               for (i = 0; i  I2F_MAX_BITS; i++) {
                        if (fraction  0x80)
                                break;
                        else {
 -                               fraction = fraction  1; /* keep
 -                                                            shifting left 
 until top bit = 1 */
 +                               fraction = fraction  1;
                                exponent = exponent - 1;
                        }
                }
 -               result = exponent  23 | (fraction  0x7f); /* mask
 -                                                                   off top 
 bit; assumed 1 */
 +               result = exponent  23 | (fraction  0x7f);
        }
        return result;
  }
 --
 1.7.7

 ___
 dri-devel mailing list
 dri-devel@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers

2012-02-02 Thread Ilija Hadzic



On Thu, 2 Feb 2012, Alex Deucher wrote:



Should probably CC stable as well.



I was thinking of that, but decided not to because it's in the gray area 
of this rule per Documentation/stable_kernel_rules.txt


 - It must fix a real bug that bothers people (not a, This could be a
   problem... type thing).

-- Ilija
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel