On Thu, 14 Aug 2025 21:50:06 -0600 Alex Hung <[email protected]> wrote:
> From: Harry Wentland <[email protected]> > > Certain operations require us to preserve values below 0.0 and > above 1.0 (0x0 and 0xffff respectively in 16 bpc unorm). One > such operation is a BT709 encoding operation followed by its > decoding operation, or the reverse. > > We'll use s32 values as intermediate in and outputs of our > color operations, for the operations where it matters. > > For now this won't apply to LUT operations. We might want to > update those to work on s32 as well, but it's unclear how > that should work for unorm LUT definitions. We'll revisit > that once we add LUT + CTM tests. > > In order to allow for this we'll also invert the nesting of our > colorop processing loops. We now use the pixel iteration loop > on the outside and the colorop iteration on the inside. Hi Alex, is this an out-dated paragraph in the commit message? I don't see the patch inverting the nesting of loops. That statement worried me, because changing the loop structures has tanked the performance before. Thanks, pq > > Reviewed-by: Louis Chauvet <[email protected]> > Signed-off-by: Alex Hung <[email protected]> > Signed-off-by: Harry Wentland <[email protected]> > Reviewed-by: Daniel Stone <[email protected]> > --- > v7: > - Fix checkpatch warnings > - Add a commit messages > - Fix code styles by adding and removing spaces (new lines, tabs and so on) > > v6: > - use clamp_val instead of manual clamping (Louis Chauvet) > > v4: > - Clarify that we're packing 16-bit UNORM into s32, not > converting values to a different representation (Pekka) > > v3: > - Use new colorop->next pointer > > drivers/gpu/drm/vkms/vkms_composer.c | 27 +++++++++++++++++++++++++-- > drivers/gpu/drm/vkms/vkms_drv.h | 4 ++++ > 2 files changed, 29 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/vkms/vkms_composer.c > b/drivers/gpu/drm/vkms/vkms_composer.c > index 0f3fcd6a5925..6630dccd68a4 100644 > --- a/drivers/gpu/drm/vkms/vkms_composer.c > +++ b/drivers/gpu/drm/vkms/vkms_composer.c > @@ -128,7 +128,7 @@ static void apply_lut(const struct vkms_crtc_state > *crtc_state, struct line_buff > } > } > > -static void apply_colorop(struct pixel_argb_u16 *pixel, struct drm_colorop > *colorop) > +static void apply_colorop(struct pixel_argb_s32 *pixel, struct drm_colorop > *colorop) > { > struct drm_colorop_state *colorop_state = colorop->state; > struct drm_device *dev = colorop->dev; > @@ -157,9 +157,26 @@ static void apply_colorop(struct pixel_argb_u16 *pixel, > struct drm_colorop *colo > static void pre_blend_color_transform(const struct vkms_plane_state > *plane_state, > struct line_buffer *output_buffer) > { > + struct pixel_argb_s32 pixel; > + > for (size_t x = 0; x < output_buffer->n_pixels; x++) { > struct drm_colorop *colorop = > plane_state->base.base.color_pipeline; > > + /* > + * Some operations, such as applying a BT709 encoding matrix, > + * followed by a decoding matrix, require that we preserve > + * values above 1.0 and below 0.0 until the end of the pipeline. > + * > + * Pack the 16-bit UNORM values into s32 to give us head-room to > + * avoid clipping until we're at the end of the pipeline. Clip > + * intentionally at the end of the pipeline before packing > + * UNORM values back into u16. > + */ > + pixel.a = output_buffer->pixels[x].a; > + pixel.r = output_buffer->pixels[x].r; > + pixel.g = output_buffer->pixels[x].g; > + pixel.b = output_buffer->pixels[x].b; > + > while (colorop) { > struct drm_colorop_state *colorop_state; > > @@ -169,10 +186,16 @@ static void pre_blend_color_transform(const struct > vkms_plane_state *plane_state > return; > > if (!colorop_state->bypass) > - apply_colorop(&output_buffer->pixels[x], > colorop); > + apply_colorop(&pixel, colorop); > > colorop = colorop->next; > } > + > + /* clamp values */ > + output_buffer->pixels[x].a = clamp_val(pixel.a, 0, 0xffff); > + output_buffer->pixels[x].r = clamp_val(pixel.r, 0, 0xffff); > + output_buffer->pixels[x].g = clamp_val(pixel.g, 0, 0xffff); > + output_buffer->pixels[x].b = clamp_val(pixel.b, 0, 0xffff); > } > } > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h > index 30941714cd0f..55440ec6db52 100644 > --- a/drivers/gpu/drm/vkms/vkms_drv.h > +++ b/drivers/gpu/drm/vkms/vkms_drv.h > @@ -45,6 +45,10 @@ struct vkms_frame_info { > unsigned int rotation; > }; > > +struct pixel_argb_s32 { > + s32 a, r, g, b; > +}; > + > /** > * struct pixel_argb_u16 - Internal representation of a pixel color. > * @a: Alpha component value, stored in 16 bits, without padding, using
pgpKKT8_C8TpK.pgp
Description: OpenPGP digital signature
