[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
On Don, 2012-02-02 at 10:26 -0500, Ilija Hadzic wrote: > Evergreen and NI blit copy was broken if the buffer maps to a rectangle > whose one dimension is 16384 (max dimension allowed by these chips). > In the mainline kernel, the problem is exposed only when buffers are > very large (1G), but it's still a problem. The problem could be exposed > for smaller buffers if anyone modifies the algorithm for rectangle > construction in r600_blit_create_rect() (the reason why someone would > modify that algorithm is to tune the performance of buffer moves). > > The root cause was in i2f() function which only operated on range between > 0 and 16383. Fix this by extending the range of i2f() function to 0 to > 32767. > > While at it improve the function so that the range can be easily > extended in the future (if it becomes necessary), cleanup lines > over 80 characters, and replace in-line comments with one strategic > comment that explains the crux of the function. > > Credits to michel at daenzer.net for pointing out the root cause of > the bug. > > v2: Fix I2F_MAX_INPUT constant definition goof and warn only once > if input argument is out of range. Edit the comment a little > bit to avoid some linguistic confusion and make it look better > in general. > > Signed-off-by: Ilija Hadzic Reviewed-by: Michel D?nzer -- Earthling Michel D?nzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer
[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
On Thu, Feb 2, 2012 at 10:26 AM, Ilija Hadzic wrote: > Evergreen and NI blit copy was broken if the buffer maps to a rectangle > whose one dimension is 16384 (max dimension allowed by these chips). > In the mainline kernel, the problem is exposed only when buffers are > very large (1G), but it's still a problem. The problem could be exposed > for smaller buffers if anyone modifies the algorithm for rectangle > construction in r600_blit_create_rect() (the reason why someone would > modify that algorithm is to tune the performance of buffer moves). > > The root cause was in i2f() function which only operated on range between > 0 and 16383. Fix this by extending the range of i2f() function to 0 to > 32767. > > While at it improve the function so that the range can be easily > extended in the future (if it becomes necessary), cleanup lines > over 80 characters, and replace in-line comments with one strategic > comment that explains the crux of the function. > > Credits to michel at daenzer.net for pointing out the root cause of > the bug. > > v2: Fix I2F_MAX_INPUT constant definition goof and warn only once > ? ?if input argument is out of range. Edit the comment a little > ? ?bit to avoid some linguistic confusion and make it look better > ? ?in general. > > Signed-off-by: Ilija Hadzic Should probably CC stable as well. Reviewed-by: Alex Deucher > --- > ?drivers/gpu/drm/radeon/r600_blit_kms.c | ? 35 ++- > ?1 files changed, 25 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c > b/drivers/gpu/drm/radeon/r600_blit_kms.c > index d996f43..accc032 100644 > --- a/drivers/gpu/drm/radeon/r600_blit_kms.c > +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c > @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev) > ? ? ? ?radeon_ring_write(ring, sq_stack_resource_mgmt_2); > ?} > > +#define I2F_MAX_BITS 15 > +#define I2F_MAX_INPUT ?((1 << I2F_MAX_BITS) - 1) > +#define I2F_SHIFT (24 - I2F_MAX_BITS) > + > +/* > + * Converts unsigned integer into 32-bit IEEE floating point representation. > + * Conversion is not universal and only works for the range from 0 > + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between > + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary, > + * I2F_MAX_BITS can be increased, but that will add to the loop iterations > + * and slow us down. Conversion is done by shifting the input and counting > + * down until the first 1 reaches bit position 23. The resulting counter > + * and the shifted input are, respectively, the exponent and the fraction. > + * The sign is always zero. > + */ > ?static uint32_t i2f(uint32_t input) > ?{ > ? ? ? ?u32 result, i, exponent, fraction; > > - ? ? ? if ((input & 0x3fff) == 0) > - ? ? ? ? ? ? ? result = 0; /* 0 is a special case */ > + ? ? ? WARN_ON_ONCE(input > I2F_MAX_INPUT); > + > + ? ? ? if ((input & I2F_MAX_INPUT) == 0) > + ? ? ? ? ? ? ? result = 0; > ? ? ? ?else { > - ? ? ? ? ? ? ? exponent = 140; /* exponent biased by 127; */ > - ? ? ? ? ? ? ? fraction = (input & 0x3fff) << 10; /* cheat and only > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? handle numbers below > 2^^15 */ > - ? ? ? ? ? ? ? for (i = 0; i < 14; i++) { > + ? ? ? ? ? ? ? exponent = 126 + I2F_MAX_BITS; > + ? ? ? ? ? ? ? fraction = (input & I2F_MAX_INPUT) << I2F_SHIFT; > + > + ? ? ? ? ? ? ? for (i = 0; i < I2F_MAX_BITS; i++) { > ? ? ? ? ? ? ? ? ? ? ? ?if (fraction & 0x80) > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?break; > ? ? ? ? ? ? ? ? ? ? ? ?else { > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? fraction = fraction << 1; /* keep > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?shifting left > until top bit = 1 */ > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? fraction = fraction << 1; > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?exponent = exponent - 1; > ? ? ? ? ? ? ? ? ? ? ? ?} > ? ? ? ? ? ? ? ?} > - ? ? ? ? ? ? ? result = exponent << 23 | (fraction & 0x7f); /* mask > - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? off top > bit; assumed 1 */ > + ? ? ? ? ? ? ? result = exponent << 23 | (fraction & 0x7f); > ? ? ? ?} > ? ? ? ?return result; > ?} > -- > 1.7.7 > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
Evergreen and NI blit copy was broken if the buffer maps to a rectangle whose one dimension is 16384 (max dimension allowed by these chips). In the mainline kernel, the problem is exposed only when buffers are very large (1G), but it's still a problem. The problem could be exposed for smaller buffers if anyone modifies the algorithm for rectangle construction in r600_blit_create_rect() (the reason why someone would modify that algorithm is to tune the performance of buffer moves). The root cause was in i2f() function which only operated on range between 0 and 16383. Fix this by extending the range of i2f() function to 0 to 32767. While at it improve the function so that the range can be easily extended in the future (if it becomes necessary), cleanup lines over 80 characters, and replace in-line comments with one strategic comment that explains the crux of the function. Credits to michel at daenzer.net for pointing out the root cause of the bug. v2: Fix I2F_MAX_INPUT constant definition goof and warn only once if input argument is out of range. Edit the comment a little bit to avoid some linguistic confusion and make it look better in general. Signed-off-by: Ilija Hadzic --- drivers/gpu/drm/radeon/r600_blit_kms.c | 35 ++- 1 files changed, 25 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index d996f43..accc032 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev) radeon_ring_write(ring, sq_stack_resource_mgmt_2); } +#define I2F_MAX_BITS 15 +#define I2F_MAX_INPUT ((1 << I2F_MAX_BITS) - 1) +#define I2F_SHIFT (24 - I2F_MAX_BITS) + +/* + * Converts unsigned integer into 32-bit IEEE floating point representation. + * Conversion is not universal and only works for the range from 0 + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary, + * I2F_MAX_BITS can be increased, but that will add to the loop iterations + * and slow us down. Conversion is done by shifting the input and counting + * down until the first 1 reaches bit position 23. The resulting counter + * and the shifted input are, respectively, the exponent and the fraction. + * The sign is always zero. + */ static uint32_t i2f(uint32_t input) { u32 result, i, exponent, fraction; - if ((input & 0x3fff) == 0) - result = 0; /* 0 is a special case */ + WARN_ON_ONCE(input > I2F_MAX_INPUT); + + if ((input & I2F_MAX_INPUT) == 0) + result = 0; else { - exponent = 140; /* exponent biased by 127; */ - fraction = (input & 0x3fff) << 10; /* cheat and only - handle numbers below 2^^15 */ - for (i = 0; i < 14; i++) { + exponent = 126 + I2F_MAX_BITS; + fraction = (input & I2F_MAX_INPUT) << I2F_SHIFT; + + for (i = 0; i < I2F_MAX_BITS; i++) { if (fraction & 0x80) break; else { - fraction = fraction << 1; /* keep -shifting left until top bit = 1 */ + fraction = fraction << 1; exponent = exponent - 1; } } - result = exponent << 23 | (fraction & 0x7f); /* mask - off top bit; assumed 1 */ + result = exponent << 23 | (fraction & 0x7f); } return result; } -- 1.7.7
[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
On Thu, 2 Feb 2012, Alex Deucher wrote: > > Should probably CC stable as well. > I was thinking of that, but decided not to because it's in the gray area of this rule per Documentation/stable_kernel_rules.txt - It must fix a real bug that bothers people (not a, "This could be a problem..." type thing). -- Ilija
[PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
Evergreen and NI blit copy was broken if the buffer maps to a rectangle whose one dimension is 16384 (max dimension allowed by these chips). In the mainline kernel, the problem is exposed only when buffers are very large (1G), but it's still a problem. The problem could be exposed for smaller buffers if anyone modifies the algorithm for rectangle construction in r600_blit_create_rect() (the reason why someone would modify that algorithm is to tune the performance of buffer moves). The root cause was in i2f() function which only operated on range between 0 and 16383. Fix this by extending the range of i2f() function to 0 to 32767. While at it improve the function so that the range can be easily extended in the future (if it becomes necessary), cleanup lines over 80 characters, and replace in-line comments with one strategic comment that explains the crux of the function. Credits to mic...@daenzer.net for pointing out the root cause of the bug. v2: Fix I2F_MAX_INPUT constant definition goof and warn only once if input argument is out of range. Edit the comment a little bit to avoid some linguistic confusion and make it look better in general. Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com --- drivers/gpu/drm/radeon/r600_blit_kms.c | 35 ++- 1 files changed, 25 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index d996f43..accc032 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev) radeon_ring_write(ring, sq_stack_resource_mgmt_2); } +#define I2F_MAX_BITS 15 +#define I2F_MAX_INPUT ((1 I2F_MAX_BITS) - 1) +#define I2F_SHIFT (24 - I2F_MAX_BITS) + +/* + * Converts unsigned integer into 32-bit IEEE floating point representation. + * Conversion is not universal and only works for the range from 0 + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary, + * I2F_MAX_BITS can be increased, but that will add to the loop iterations + * and slow us down. Conversion is done by shifting the input and counting + * down until the first 1 reaches bit position 23. The resulting counter + * and the shifted input are, respectively, the exponent and the fraction. + * The sign is always zero. + */ static uint32_t i2f(uint32_t input) { u32 result, i, exponent, fraction; - if ((input 0x3fff) == 0) - result = 0; /* 0 is a special case */ + WARN_ON_ONCE(input I2F_MAX_INPUT); + + if ((input I2F_MAX_INPUT) == 0) + result = 0; else { - exponent = 140; /* exponent biased by 127; */ - fraction = (input 0x3fff) 10; /* cheat and only - handle numbers below 2^^15 */ - for (i = 0; i 14; i++) { + exponent = 126 + I2F_MAX_BITS; + fraction = (input I2F_MAX_INPUT) I2F_SHIFT; + + for (i = 0; i I2F_MAX_BITS; i++) { if (fraction 0x80) break; else { - fraction = fraction 1; /* keep -shifting left until top bit = 1 */ + fraction = fraction 1; exponent = exponent - 1; } } - result = exponent 23 | (fraction 0x7f); /* mask - off top bit; assumed 1 */ + result = exponent 23 | (fraction 0x7f); } return result; } -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
On Thu, Feb 2, 2012 at 10:26 AM, Ilija Hadzic ihad...@research.bell-labs.com wrote: Evergreen and NI blit copy was broken if the buffer maps to a rectangle whose one dimension is 16384 (max dimension allowed by these chips). In the mainline kernel, the problem is exposed only when buffers are very large (1G), but it's still a problem. The problem could be exposed for smaller buffers if anyone modifies the algorithm for rectangle construction in r600_blit_create_rect() (the reason why someone would modify that algorithm is to tune the performance of buffer moves). The root cause was in i2f() function which only operated on range between 0 and 16383. Fix this by extending the range of i2f() function to 0 to 32767. While at it improve the function so that the range can be easily extended in the future (if it becomes necessary), cleanup lines over 80 characters, and replace in-line comments with one strategic comment that explains the crux of the function. Credits to mic...@daenzer.net for pointing out the root cause of the bug. v2: Fix I2F_MAX_INPUT constant definition goof and warn only once if input argument is out of range. Edit the comment a little bit to avoid some linguistic confusion and make it look better in general. Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com Should probably CC stable as well. Reviewed-by: Alex Deucher alexander.deuc...@amd.com --- drivers/gpu/drm/radeon/r600_blit_kms.c | 35 ++- 1 files changed, 25 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index d996f43..accc032 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -468,27 +468,42 @@ set_default_state(struct radeon_device *rdev) radeon_ring_write(ring, sq_stack_resource_mgmt_2); } +#define I2F_MAX_BITS 15 +#define I2F_MAX_INPUT ((1 I2F_MAX_BITS) - 1) +#define I2F_SHIFT (24 - I2F_MAX_BITS) + +/* + * Converts unsigned integer into 32-bit IEEE floating point representation. + * Conversion is not universal and only works for the range from 0 + * to 2^I2F_MAX_BITS-1. Currently we only use it with inputs between + * 0 and 16384 (inclusive), so I2F_MAX_BITS=15 is enough. If necessary, + * I2F_MAX_BITS can be increased, but that will add to the loop iterations + * and slow us down. Conversion is done by shifting the input and counting + * down until the first 1 reaches bit position 23. The resulting counter + * and the shifted input are, respectively, the exponent and the fraction. + * The sign is always zero. + */ static uint32_t i2f(uint32_t input) { u32 result, i, exponent, fraction; - if ((input 0x3fff) == 0) - result = 0; /* 0 is a special case */ + WARN_ON_ONCE(input I2F_MAX_INPUT); + + if ((input I2F_MAX_INPUT) == 0) + result = 0; else { - exponent = 140; /* exponent biased by 127; */ - fraction = (input 0x3fff) 10; /* cheat and only - handle numbers below 2^^15 */ - for (i = 0; i 14; i++) { + exponent = 126 + I2F_MAX_BITS; + fraction = (input I2F_MAX_INPUT) I2F_SHIFT; + + for (i = 0; i I2F_MAX_BITS; i++) { if (fraction 0x80) break; else { - fraction = fraction 1; /* keep - shifting left until top bit = 1 */ + fraction = fraction 1; exponent = exponent - 1; } } - result = exponent 23 | (fraction 0x7f); /* mask - off top bit; assumed 1 */ + result = exponent 23 | (fraction 0x7f); } return result; } -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] drm/radeon/kms/blit: fix blit copy for very large buffers
On Thu, 2 Feb 2012, Alex Deucher wrote: Should probably CC stable as well. I was thinking of that, but decided not to because it's in the gray area of this rule per Documentation/stable_kernel_rules.txt - It must fix a real bug that bothers people (not a, This could be a problem... type thing). -- Ilija ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel