When linearizing, the input is an encoded signal bounded to [0,1] and PQ/sRGB EOTFs are steepest near 1, requiring more precision near the bright end.

Take the 8-bit sRGB case as a reference: 256 possible inputs and 256 HW
LUT points line up, so the LUT acts as plain indexing. Float
representations don't land perfectly, but LERP-ing between two HW
entries, when input is within a small epsilon of one of them, doesn't
materially change the result.

Replace the uniform 12-region distribution (16 points each, 192 total,
range [2^-12, 1]) with a 9-region halving distribution for the PQ/sRGB
pre-defined EOTF: 128 points in the top region [0.5, 1], 64 in the next,
32 in the next, and so on, down to 1 point in each of the two darkest
regions. Total samples grow from 192 to 256, with uniform 1/256 spacing
across [0, 1]. The dark tail below 2^-9 is no longer sampled separately,
which is acceptable for PQ/sRGB.

Suggested-by: Krunoslav Kovac <[email protected]>
Signed-off-by: Melissa Wen <[email protected]>
---
 .../amd/display/dc/dcn30/dcn30_cm_common.c    | 33 ++++++++++++++-----
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c
index 70b7bc3494a2..66fe7f313ea3 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_cm_common.c
@@ -303,8 +303,6 @@ bool cm3_helper_translate_curve_to_hw_format(struct dc_context *ctx,
 	return true;
 }
 
-#define NUM_DEGAMMA_REGIONS 12
-
 /* Linear interpolation of tf_pts entries, where (i >> 4) is the integer tf_pts
  * index, (i & 0xf) is the 1/16 sub-position.
  */
@@ -345,17 +343,34 @@ bool cm3_helper_translate_curve_to_degamma_hw_format(
 	memset(lut_params, 0, sizeof(struct pwl_params));
 	memset(seg_distr, 0, sizeof(seg_distr));
 
-	region_start = -NUM_DEGAMMA_REGIONS;
-	region_end = 0;
+	if (output_tf->tf == TRANSFER_FUNCTION_PQ ||
+	    output_tf->tf == TRANSFER_FUNCTION_SRGB) {
+		/* 9 segments
+		 * segments are from 2^-9 to 0
+		 */
+		const uint8_t SEG_COUNT = 9;
+		seg_distr[0] = 0; // Since we only have one point in darkest region
+		for (k = 1; k < SEG_COUNT; k++)
+			seg_distr[k] = k - 1; // 2^(k-1) points per region; halves as k decreases
+		region_start = -SEG_COUNT;
+		region_end = 0;
+	} else {
+		/* 12 segments
+		 * segments are from 2^-12 to 2^0
+		 * There are less than 256 points, for optimization
+		 */
+		const uint8_t SEG_COUNT = 12;
+
+		for (i = 0; i < SEG_COUNT; i++)
+			seg_distr[i] = 4;
+
+		region_start = -SEG_COUNT;
+		region_end = 0;
+	}
 
 	for (i = region_end - region_start; i < MAX_REGIONS_NUMBER ; i++)
 		seg_distr[i] = -1;
-	/* 12 segments
-	 * segments are from 2^-12 to 0
-	 */
-	for (i = 0; i < NUM_DEGAMMA_REGIONS ; i++)
-		seg_distr[i] = 4;
 
 	for (k = 0; k < MAX_REGIONS_NUMBER; k++) {
 		if (seg_distr[k] != -1)
-- 
2.53.0
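
As a sanity check on the point counts quoted in the commit message, here
is a minimal standalone sketch, not driver code. It assumes seg_distr[k]
holds log2 of the number of points in region k, with -1 marking an unused
region, as the "2^(k-1) points per region" comment in the patch suggests.
The helper names (region_points, total_points, fill_halving, fill_uniform,
self_check) are made up for illustration and do not exist in the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: points in a region whose seg_distr entry is e.
 * Assumes e is log2 of the point count and e < 0 means "unused".
 */
static unsigned int region_points(int e)
{
	return e < 0 ? 0u : 1u << e;
}

/* Sum the points over the first n regions. */
static unsigned int total_points(const int *seg_distr, size_t n)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += region_points(seg_distr[i]);
	return total;
}

/* Mirrors the PQ/sRGB branch of the patch: 1, 1, 2, 4, ..., 2^(n-2). */
static void fill_halving(int *seg_distr, int seg_count)
{
	int k;

	seg_distr[0] = 0;
	for (k = 1; k < seg_count; k++)
		seg_distr[k] = k - 1;
}

/* Mirrors the fallback branch: 2^4 = 16 points in every region. */
static void fill_uniform(int *seg_distr, int seg_count)
{
	int i;

	for (i = 0; i < seg_count; i++)
		seg_distr[i] = 4;
}

/* Checks the totals quoted in the commit message. */
static void self_check(void)
{
	int halving[9], uniform[12];

	fill_halving(halving, 9);
	fill_uniform(uniform, 12);
	assert(total_points(halving, 9) == 256);
	assert(total_points(uniform, 12) == 192);
	assert(region_points(halving[8]) == 128);
}
```

Calling self_check() from a small test program confirms the 192 and 256
totals and the 128 points in the top region, under the assumptions above.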
