Re: [RFC] Plane color pipeline KMS uAPI
On Thu, 15 Jun 2023 17:44:33 -0400 Christopher Braga wrote:
> On 6/14/2023 5:00 AM, Pekka Paalanen wrote:
> > On Tue, 13 Jun 2023 12:29:55 -0400
> > Christopher Braga wrote:
> >
> >> On 6/13/2023 4:23 AM, Pekka Paalanen wrote:
> >>> On Mon, 12 Jun 2023 12:56:57 -0400
> >>> Christopher Braga wrote:
> >>>
> On 6/12/2023 5:21 AM, Pekka Paalanen wrote:
> > On Fri, 9 Jun 2023 19:11:25 -0400
> > Christopher Braga wrote:
> >
> >> On 6/9/2023 12:30 PM, Simon Ser wrote:
> >>> Hi Christopher,
> >>>
> >>> On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote:
> >>>
> > The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation:
> >
> > Color operation 42
> > ├─ "type": enum {Bypass, 1D curve} = 1D curve
> > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT
> The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values?
> >>>
> >>> Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries.
> >>>
> > ├─ "lut_size": immutable range = 4096
> > ├─ "lut_data": blob
> > └─ "next": immutable color operation ID = 43
> >
> Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form.
> >>>
> >>> Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block.
> >>>
> > To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties.
> The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior.
>
> Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values?
> >>>
> >>> Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.
> >>>
> >>> Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.
> >>>
> >> Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.
> >
> > Hi Christopher,
> >
> > are you thinking of precision loss, or the overhead of conversion?
> >
> > Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well.
> >
> > What exactly would be messy?
> >
> Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.
> >>>
> >>> What is burdensome in that calculation? I don't think you would need to use any actual floating-point instructions. Logarithm for finding the exponent is about finding the highest bit set in an integer and everything is conveniently expressed in base-2. Finding significand is just masking the integer based on the exponent.
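The bit trick described above can be sketched in a few lines of integer-only C. This is an illustrative sketch, not driver code: the function name is made up, the input is treated as 0.16 fixed point with 0x10000 standing for 1.0, it truncates rather than rounds, and it flushes binary16 subnormals to zero.

```c
#include <stdint.h>

/* Illustrative only: 0.16 fixed point (0x10000 == 1.0) to IEEE-754
 * binary16 bits using integer ops alone. Truncates instead of
 * rounding and flushes subnormals to zero for brevity. */
static uint16_t fixed16_to_fp16(uint32_t v)
{
	if (v == 0)
		return 0;

	/* Highest set bit gives the base-2 exponent: value = v * 2^-16. */
	int hi = 31 - __builtin_clz(v);
	int exp = hi - 16;

	if (exp < -14)		/* below the binary16 normal range */
		return 0;

	/* Align the leading 1 just above the 10 mantissa bits, then
	 * drop the implicit bit. */
	uint32_t mant = hi >= 10 ? (v >> (hi - 10)) & 0x3ff
				 : (v << (10 - hi)) & 0x3ff;

	return (uint16_t)(((uint32_t)(exp + 15) << 10) | mant);
}
```

For example, fixed16_to_fp16(0x8000) yields 0x3800, the binary16 encoding of 0.5 — no floating-point instructions involved.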
Re: [RFC] Plane color pipeline KMS uAPI
On 6/14/2023 5:00 AM, Pekka Paalanen wrote: On Tue, 13 Jun 2023 12:29:55 -0400 Christopher Braga wrote: On 6/13/2023 4:23 AM, Pekka Paalanen wrote: On Mon, 12 Jun 2023 12:56:57 -0400 Christopher Braga wrote: On 6/12/2023 5:21 AM, Pekka Paalanen wrote: On Fri, 9 Jun 2023 19:11:25 -0400 Christopher Braga wrote: On 6/9/2023 12:30 PM, Simon Ser wrote: Hi Christopher, On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote: The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values? Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form. Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/ DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block. To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior. 
Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values?

Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.

Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.

Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.

Hi Christopher, are you thinking of precision loss, or the overhead of conversion? Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well. What exactly would be messy?

Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.

What is burdensome in that calculation? I don't think you would need to use any actual floating-point instructions. Logarithm for finding the exponent is about finding the highest bit set in an integer and everything is conveniently expressed in base-2. Finding significand is just masking the integer based on the exponent.

Oh it definitely can be done, but I think this is just a difference of opinion at this point.
At the end of the day we will do it if we have to, but it is just more optimal if a more agreeable common type is used.

Can you not cache the converted data, keyed by the DRM blob unique identity vs. the KMS property it is attached to?

If the userspace compositor has N common transforms (ex: standard P3 -> sRGB matrix), they would likely have N unique blobs. Obviously from the kernel end we wouldn't want to cache the transform of every blob passed down through the UAPI.

Hi Christopher, as long as the blob exists, why not?

Generally because this is an unbounded amount of blobs. I'm not 100% sure what the typical behavior is upstream, but in our driver we have scenarios where we can have per-frame blob updates (unique per-frame blobs).

Speaking of per-frame blob updates, there is one concern I neglected to bring up. Internally we have seen scenarios where frequent blob allocation can lead to memory allocation delays of two frames or higher. This
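The caching suggested above can be sketched as a small lookup keyed by the immutable blob ID. Everything here is hypothetical pseudo-driver code (the names, the fixed slot count, the naive FIFO eviction); it only illustrates that a reused blob ID never triggers a second conversion.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: cache the driver-converted form of a LUT blob,
 * keyed by the immutable DRM blob ID, so re-setting the same blob
 * skips the fixed-to-float conversion. Not a real kernel API. */
#define CONV_CACHE_SLOTS 8

struct conv_entry {
	uint32_t blob_id;	/* 0 == empty slot */
	void *hw_lut;		/* hardware-ready representation */
};

static struct conv_entry conv_cache[CONV_CACHE_SLOTS];

void *lookup_or_convert(uint32_t blob_id,
			void *(*convert)(uint32_t blob_id))
{
	/* Hit: the blob is immutable, so the cached result stays valid. */
	for (int i = 0; i < CONV_CACHE_SLOTS; i++)
		if (conv_cache[i].blob_id == blob_id)
			return conv_cache[i].hw_lut;

	/* Miss: evict the oldest slot (naive FIFO) and convert once. */
	free(conv_cache[0].hw_lut);
	for (int i = 0; i < CONV_CACHE_SLOTS - 1; i++)
		conv_cache[i] = conv_cache[i + 1];
	conv_cache[CONV_CACHE_SLOTS - 1].blob_id = blob_id;
	conv_cache[CONV_CACHE_SLOTS - 1].hw_lut = convert(blob_id);
	return conv_cache[CONV_CACHE_SLOTS - 1].hw_lut;
}
```

The unbounded-blob concern above maps to the eviction policy: a real driver would have to bound the cache and accept re-conversion on per-frame unique blobs.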
Re: [RFC] Plane color pipeline KMS uAPI
On Tue, 13 Jun 2023 12:29:55 -0400 Christopher Braga wrote: > On 6/13/2023 4:23 AM, Pekka Paalanen wrote: > > On Mon, 12 Jun 2023 12:56:57 -0400 > > Christopher Braga wrote: > > > >> On 6/12/2023 5:21 AM, Pekka Paalanen wrote: > >>> On Fri, 9 Jun 2023 19:11:25 -0400 > >>> Christopher Braga wrote: > >>> > On 6/9/2023 12:30 PM, Simon Ser wrote: > > Hi Christopher, > > > > On Friday, June 9th, 2023 at 17:52, Christopher Braga > > wrote: > > > >>> The new COLOROP objects also expose a number of KMS properties. Each > >>> has a > >>> type, a reference to the next COLOROP object in the linked list, and > >>> other > >>> type-specific properties. Here is an example for a 1D LUT operation: > >>> > >>> Color operation 42 > >>> ├─ "type": enum {Bypass, 1D curve} = 1D curve > >>> ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = > >>> LUT > >> The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D > >> curves? Will different hardware be allowed to expose a subset of these > >> enum values? > > > > Yes. Only hardcoded LUTs supported by the HW are exposed as enum > > entries. > > > >>> ├─ "lut_size": immutable range = 4096 > >>> ├─ "lut_data": blob > >>> └─ "next": immutable color operation ID = 43 > >>> > >> Some hardware has per channel 1D LUT values, while others use the same > >> LUT for all channels. We will definitely need to expose this in the > >> UAPI in some form. > > > > Hm, I was assuming per-channel 1D LUTs here, just like the existing > > GAMMA_LUT/ > > DEGAMMA_LUT properties work. If some hardware can't support that, it'll > > need > > to get exposed as another color operation block. > > > >>> To configure this hardware block, user-space can fill a KMS blob with > >>> 4096 u32 > >>> entries, then set "lut_data" to the blob ID. Other color operation > >>> types > >>> might > >>> have different properties. > >>> > >> The bit-depth of the LUT is an important piece of information we should > >> include by default. 
Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior.
> >>
> >> Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values?
> >
> > Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.
> >
> > Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.
> >
> Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.
> >>>
> >>> Hi Christopher,
> >>>
> >>> are you thinking of precision loss, or the overhead of conversion?
> >>>
> >>> Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well.
> >>>
> >>> What exactly would be messy?
> >>>
> >> Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.
> >
> > What is burdensome in that calculation? I don't think you would need to use any actual floating-point instructions.
> > Logarithm for finding the exponent is about finding the highest bit set in an integer and everything is conveniently expressed in base-2. Finding significand is just masking the integer based on the exponent.
> >
> Oh it definitely can be done, but I think this is just a difference of opinion at this point. At the end of the day we will do it if we have to, but it is just more optimal if a more agreeable common type is used.
> >
> > Can you not cache the converted data, keyed by the DRM blob unique identity vs. the KMS property it is attached to?
> If the userspace compositor has N common transforms (ex: standard P3 -> sRGB matrix), they would likely have N unique blobs.
Re: [RFC] Plane color pipeline KMS uAPI
On 6/13/2023 4:23 AM, Pekka Paalanen wrote: On Mon, 12 Jun 2023 12:56:57 -0400 Christopher Braga wrote: On 6/12/2023 5:21 AM, Pekka Paalanen wrote: On Fri, 9 Jun 2023 19:11:25 -0400 Christopher Braga wrote: On 6/9/2023 12:30 PM, Simon Ser wrote: Hi Christopher, On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote: The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values? Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form. Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/ DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block. To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior. Additionally, some pipelines are floating point while others are fixed. 
How would user space know if it needs to pack 32 bit integer values vs 32 bit float values?

Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.

Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.

Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.

Hi Christopher, are you thinking of precision loss, or the overhead of conversion? Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well. What exactly would be messy?

Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.

What is burdensome in that calculation? I don't think you would need to use any actual floating-point instructions. Logarithm for finding the exponent is about finding the highest bit set in an integer and everything is conveniently expressed in base-2. Finding significand is just masking the integer based on the exponent.

Oh it definitely can be done, but I think this is just a difference of opinion at this point. At the end of the day we will do it if we have to, but it is just more optimal if a more agreeable common type is used.
Can you not cache the converted data, keyed by the DRM blob unique identity vs. the KMS property it is attached to?

If the userspace compositor has N common transforms (ex: standard P3 -> sRGB matrix), they would likely have N unique blobs. Obviously from the kernel end we wouldn't want to cache the transform of every blob passed down through the UAPI.

You can assume that userspace will not be re-creating DRM blobs without a reason to believe the contents have changed. If the same blob is set on the same property repeatedly, I would definitely not expect a driver to convert the data again.

If the blob ID is unchanged there is no issue since caching the last result is already common. As you say, blobs are immutable so no update is needed. I'd question why the compositor keeps trying to send down the same blob ID though.

If a driver does that, it seems like it should be easy to avoid, though I'm no kernel dev. Even if the conversion was just a memcpy, I would still posit it needs to be avoided when the data has obviously not changed.
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 12 Jun 2023 12:56:57 -0400 Christopher Braga wrote: > On 6/12/2023 5:21 AM, Pekka Paalanen wrote: > > On Fri, 9 Jun 2023 19:11:25 -0400 > > Christopher Braga wrote: > > > >> On 6/9/2023 12:30 PM, Simon Ser wrote: > >>> Hi Christopher, > >>> > >>> On Friday, June 9th, 2023 at 17:52, Christopher Braga > >>> wrote: > >>> > > The new COLOROP objects also expose a number of KMS properties. Each > > has a > > type, a reference to the next COLOROP object in the linked list, and > > other > > type-specific properties. Here is an example for a 1D LUT operation: > > > >Color operation 42 > >├─ "type": enum {Bypass, 1D curve} = 1D curve > >├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D > curves? Will different hardware be allowed to expose a subset of these > enum values? > >>> > >>> Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. > >>> > >├─ "lut_size": immutable range = 4096 > >├─ "lut_data": blob > >└─ "next": immutable color operation ID = 43 > > > Some hardware has per channel 1D LUT values, while others use the same > LUT for all channels. We will definitely need to expose this in the > UAPI in some form. > >>> > >>> Hm, I was assuming per-channel 1D LUTs here, just like the existing > >>> GAMMA_LUT/ > >>> DEGAMMA_LUT properties work. If some hardware can't support that, it'll > >>> need > >>> to get exposed as another color operation block. > >>> > > To configure this hardware block, user-space can fill a KMS blob with > > 4096 u32 > > entries, then set "lut_data" to the blob ID. Other color operation types > > might > > have different properties. > > > The bit-depth of the LUT is an important piece of information we should > include by default. Are we assuming that the DRM driver will always > reduce the input values to the resolution supported by the pipeline? > This could result in differences between the hardware behavior > and the shader behavior. 
>
> Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values?
> >>>
> >>> Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.
> >>>
> >>> Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.
> >>>
> >> Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.
> >
> > Hi Christopher,
> >
> > are you thinking of precision loss, or the overhead of conversion?
> >
> > Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well.
> >
> > What exactly would be messy?
> >
> Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.

What is burdensome in that calculation? I don't think you would need to use any actual floating-point instructions. Logarithm for finding the exponent is about finding the highest bit set in an integer and everything is conveniently expressed in base-2. Finding significand is just masking the integer based on the exponent.
Can you not cache the converted data, keyed by the DRM blob unique identity vs. the KMS property it is attached to?

You can assume that userspace will not be re-creating DRM blobs without a reason to believe the contents have changed. If the same blob is set on the same property repeatedly, I would definitely not expect a driver to convert the data again. If a driver does that, it seems like it should be easy to avoid, though I'm no kernel dev. Even if the conversion was just a memcpy, I would still posit it needs to be avoided when the data has obviously not changed.

Blobs are immutable. Userspace having to use hardware-specific number formats would probably not be well received.

> I agree normalization of the value causing precision loss and rounding we can't avoid.
Re: [RFC] Plane color pipeline KMS uAPI
On 6/12/2023 5:21 AM, Pekka Paalanen wrote: On Fri, 9 Jun 2023 19:11:25 -0400 Christopher Braga wrote: On 6/9/2023 12:30 PM, Simon Ser wrote: Hi Christopher, On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote: The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values? Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form. Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/ DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block. To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior. Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values? Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. 
These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert.

Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts.

Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal.

Hi Christopher, are you thinking of precision loss, or the overhead of conversion? Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well. What exactly would be messy?

Overhead of conversion is the primary concern here. Having to extract and / or calculate the significand + exponent components in the kernel is burdensome and imo a task better suited for user space. This also has to be done every blob set, meaning that if user space is re-using pre-calculated blobs we would be repeating the same conversion operations in kernel space unnecessarily.

I agree normalization of the value causing precision loss and rounding we can't avoid. We should also consider the fact that float pipelines have been known to use the scRGB definition for floating point values (https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_gl_colorspace_scrgb_linear.txt). In cases like this where there may be an expected value range in the pipeline, how to normalize a larger input becomes a little confusing. Ex - Does U32 MAX become FP16 MAX or value MAX (i.e. 127)?

Exposing the actual hardware precision is something we've talked about during the hackfest. It'll probably be useful to some extent, but will require some discussion to figure out how to design the uAPI.
Maybe a simple property is enough, maybe not (e.g. fully describing the precision of segmented LUTs would probably be trickier). I'd rather keep things simple for the first pass, we can always add more properties for bit depth etc later on.

Indicating if a block operates on / with fixed vs float values is significant enough that I think we should account for this in initial design. It will have an effect on both the user space value packing + expected value ranges in the hardware.

What do you mean by "value packing"? Memory layout of the bits forming a value? Or possible exact values of a specific type?

Both really. If the kernel is provided a U32 value, we need to know if this is a U32 value, or a float packed into a U32 container. Likewise as mentioned with the scRGB above, float could even adjust the value range expectations.

I don't think fixed vs. float is the most important thing. Even fixed point formats can have different numbers of bits for whole numbers, which changes the usable value range and not only precision.
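The point that integer-bit allocation changes the usable range, not just the precision, is easy to quantify. The helpers below are purely illustrative, for unsigned U(I.F) fixed-point formats with i + f < 32:

```c
/* Illustrative only: largest representable value and step size of an
 * unsigned fixed-point format with i integer bits and f fractional
 * bits (i + f < 32). Two 16-bit formats differ in range, not just
 * precision: U4.12 reaches ~15.9998 in steps of 1/4096, while U0.16
 * tops out just below 1.0 in steps of 1/65536. */
static double fixed_max(unsigned i, unsigned f)
{
	return (double)((1u << (i + f)) - 1) / (double)(1u << f);
}

static double fixed_step(unsigned f)
{
	return 1.0 / (double)(1u << f);
}
```

So "16 bits of precision" alone tells userspace neither the input range nor the quantization step; both depend on how the bits are split.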
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, 9 Jun 2023 19:11:25 -0400 Christopher Braga wrote: > On 6/9/2023 12:30 PM, Simon Ser wrote: > > Hi Christopher, > > > > On Friday, June 9th, 2023 at 17:52, Christopher Braga > > wrote: > > > >>> The new COLOROP objects also expose a number of KMS properties. Each has a > >>> type, a reference to the next COLOROP object in the linked list, and other > >>> type-specific properties. Here is an example for a 1D LUT operation: > >>> > >>> Color operation 42 > >>> ├─ "type": enum {Bypass, 1D curve} = 1D curve > >>> ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > >> The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D > >> curves? Will different hardware be allowed to expose a subset of these > >> enum values? > > > > Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. > > > >>> ├─ "lut_size": immutable range = 4096 > >>> ├─ "lut_data": blob > >>> └─ "next": immutable color operation ID = 43 > >>> > >> Some hardware has per channel 1D LUT values, while others use the same > >> LUT for all channels. We will definitely need to expose this in the > >> UAPI in some form. > > > > Hm, I was assuming per-channel 1D LUTs here, just like the existing > > GAMMA_LUT/ > > DEGAMMA_LUT properties work. If some hardware can't support that, it'll need > > to get exposed as another color operation block. > > > >>> To configure this hardware block, user-space can fill a KMS blob with > >>> 4096 u32 > >>> entries, then set "lut_data" to the blob ID. Other color operation types > >>> might > >>> have different properties. > >>> > >> The bit-depth of the LUT is an important piece of information we should > >> include by default. Are we assuming that the DRM driver will always > >> reduce the input values to the resolution supported by the pipeline? > >> This could result in differences between the hardware behavior > >> and the shader behavior. > >> > >> Additionally, some pipelines are floating point while others are fixed. 
> >> How would user space know if it needs to pack 32 bit integer values vs > >> 32 bit float values? > > > > Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a > > common > > definition of LUT blob (u16 elements) and it's up to the driver to convert. > > > > Using a very precise format for the uAPI has the nice property of making the > > uAPI much simpler to use. User-space sends high precision data and it's up > > to > > drivers to map that to whatever the hardware accepts. > > > Conversion from a larger uint type to a smaller type sounds low effort, > however if a block works in a floating point space things are going to > get messy really quickly. If the block operates in FP16 space and the > interface is 16 bits we are good, but going from 32 bits to FP16 (such > as in the matrix case or 3DLUT) is less than ideal. Hi Christopher, are you thinking of precision loss, or the overhead of conversion? Conversion from N-bit fixed point to N-bit floating-point is generally lossy, too, and the other direction as well. What exactly would be messy? > > > Exposing the actual hardware precision is something we've talked about > > during > > the hackfest. It'll probably be useful to some extent, but will require some > > discussion to figure out how to design the uAPI. Maybe a simple property is > > enough, maybe not (e.g. fully describing the precision of segmented LUTs > > would > > probably be trickier). > > > > I'd rather keep things simple for the first pass, we can always add more > > properties for bit depth etc later on. > > > Indicating if a block operates on / with fixed vs float values is > significant enough that I think we should account for this in initial > design. It will have a affect on both the user space value packing + > expected value ranges in the hardware. What do you mean by "value packing"? Memory layout of the bits forming a value? Or possible exact values of a specific type? I don't think fixed vs. 
float is the most important thing. Even fixed point formats can have different numbers of bits for whole numbers, which changes the usable value range and not only precision. Userspace at the very least needs to know the usable value range for the block's inputs, outputs, and parameters. When defining the precision for inputs, outputs and parameters, then fixed- vs. floating-point becomes meaningful in explaining what "N bits of precision" means. Then there is the question of variable precision that depends on the actual block input and parameter values, how to represent that. Worst case precision might be too pessimistic alone. > >>> Here is another example with a 3D LUT: > >>> > >>> Color operation 42 > >>> ├─ "type": enum {Bypass, 3D LUT} = 3D LUT > >>> ├─ "lut_size": immutable range = 33 > >>> ├─ "lut_data": blob > >>> └─ "next": immutable color operation ID = 43 > >>> > >> We are going to
Re: [RFC] Plane color pipeline KMS uAPI
On 6/9/2023 12:30 PM, Simon Ser wrote: Hi Christopher, On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote: The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values? Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form. Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/ DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block. To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior. Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values? Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert. 
Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts. Conversion from a larger uint type to a smaller type sounds low effort, however if a block works in a floating point space things are going to get messy really quickly. If the block operates in FP16 space and the interface is 16 bits we are good, but going from 32 bits to FP16 (such as in the matrix case or 3DLUT) is less than ideal. Exposing the actual hardware precision is something we've talked about during the hackfest. It'll probably be useful to some extent, but will require some discussion to figure out how to design the uAPI. Maybe a simple property is enough, maybe not (e.g. fully describing the precision of segmented LUTs would probably be trickier). I'd rather keep things simple for the first pass, we can always add more properties for bit depth etc later on. Indicating if a block operates on / with fixed vs float values is significant enough that I think we should account for this in the initial design. It will have an effect on both the user space value packing + expected value ranges in the hardware. Here is another example with a 3D LUT: Color operation 42 ├─ "type": enum {Bypass, 3D LUT} = 3D LUT ├─ "lut_size": immutable range = 33 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 We are going to need to expose the packing order here to avoid any programming uncertainty. I don't think we can safely assume all hardware is equivalent. The driver can easily change the layout of the matrix and do any conversion necessary when programming the hardware. We do need to document what layout is used in the uAPI for sure. And one last example with a matrix: Color operation 42 ├─ "type": enum {Bypass, Matrix} = Matrix ├─ "matrix_data": blob └─ "next": immutable color operation ID = 43 It is unclear to me what the default sizing of this matrix is. 
Any objections to exposing these details with an additional property? The existing CTM property uses 9 uint64 (S31.32) values. Is there a case where that wouldn't be enough? Larger cases do exist, but as you mention this can be resolved with a different type then. I don't have any issues with the default 'Matrix' type being 9 entries. Dithering logic exists in some pipelines. I think we need a plan to expose that here as well. Hm, I'm not too familiar with dithering. Do you think it would make sense to expose as an additional colorop block? Do you think it would have more consequences on the design? I want to re-iterate that we don't need to ship all features from day 1. We just need to come up with a uAPI design on which new features can be built on. Agreed. I don't think this will affect the proposed design so this can be figured out once we have a DRM driver impl that
Re: [RFC] Plane color pipeline KMS uAPI
Hi Christopher, On Friday, June 9th, 2023 at 17:52, Christopher Braga wrote: > > The new COLOROP objects also expose a number of KMS properties. Each has a > > type, a reference to the next COLOROP object in the linked list, and other > > type-specific properties. Here is an example for a 1D LUT operation: > > > > Color operation 42 > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D > curves? Will different hardware be allowed to expose a subset of these > enum values? Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries. > > ├─ "lut_size": immutable range = 4096 > > ├─ "lut_data": blob > > └─ "next": immutable color operation ID = 43 > > > Some hardware has per channel 1D LUT values, while others use the same > LUT for all channels. We will definitely need to expose this in the > UAPI in some form. Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/ DEGAMMA_LUT properties work. If some hardware can't support that, it'll need to get exposed as another color operation block. > > To configure this hardware block, user-space can fill a KMS blob with > > 4096 u32 > > entries, then set "lut_data" to the blob ID. Other color operation types > > might > > have different properties. > > > The bit-depth of the LUT is an important piece of information we should > include by default. Are we assuming that the DRM driver will always > reduce the input values to the resolution supported by the pipeline? > This could result in differences between the hardware behavior > and the shader behavior. > > Additionally, some pipelines are floating point while others are fixed. > How would user space know if it needs to pack 32 bit integer values vs > 32 bit float values? Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. 
These use a common definition of LUT blob (u16 elements) and it's up to the driver to convert. Using a very precise format for the uAPI has the nice property of making the uAPI much simpler to use. User-space sends high precision data and it's up to drivers to map that to whatever the hardware accepts. Exposing the actual hardware precision is something we've talked about during the hackfest. It'll probably be useful to some extent, but will require some discussion to figure out how to design the uAPI. Maybe a simple property is enough, maybe not (e.g. fully describing the precision of segmented LUTs would probably be trickier). I'd rather keep things simple for the first pass, we can always add more properties for bit depth etc later on. > > Here is another example with a 3D LUT: > > > > Color operation 42 > > ├─ "type": enum {Bypass, 3D LUT} = 3D LUT > > ├─ "lut_size": immutable range = 33 > > ├─ "lut_data": blob > > └─ "next": immutable color operation ID = 43 > > > We are going to need to expose the packing order here to avoid any > programming uncertainty. I don't think we can safely assume all hardware > is equivalent. The driver can easily change the layout of the matrix and do any conversion necessary when programming the hardware. We do need to document what layout is used in the uAPI for sure. > > And one last example with a matrix: > > > > Color operation 42 > > ├─ "type": enum {Bypass, Matrix} = Matrix > > ├─ "matrix_data": blob > > └─ "next": immutable color operation ID = 43 > > > It is unclear to me what the default sizing of this matrix is. Any > objections to exposing these details with an additional property? The existing CTM property uses 9 uint64 (S31.32) values. Is there a case where that wouldn't be enough? > Dithering logic exists in some pipelines. I think we need a plan to > expose that here as well. Hm, I'm not too familiar with dithering. Do you think it would make sense to expose as an additional colorop block? 
Do you think it would have more consequences on the design? I want to re-iterate that we don't need to ship all features from day 1. We just need to come up with a uAPI design on which new features can be built on. > > [Simon note: an alternative would be to split the color pipeline into > > two, by > > having two plane properties ("color_pipeline_pre_scale" and > > "color_pipeline_post_scale") instead of a single one. This would be > > similar to > > the way we want to split pre-blending and post-blending. This could be less > > expressive for drivers, there may be hardware where there are dependencies > > between the pre- and post-scaling pipeline?] > > > As others have noted, breaking up the pipeline with immutable blocks > makes the most sense to me here. This way we don't have to predict ahead > of time every type of block that maybe affected by pipeline ordering. > Splitting the pipeline into two properties now means future > logical splits would require introduction of further plane
Re: [RFC] Plane color pipeline KMS uAPI
Hi all, The goal of this RFC is to expose a generic KMS uAPI to configure the color pipeline before blending, ie. after a pixel is tapped from a plane's framebuffer and before it's blended with other planes. With this new uAPI we aim to reduce the battery life impact of color management and HDR on mobile devices, to improve performance and to decrease latency by skipping composition on the 3D engine. This proposal is the result of discussions at the Red Hat HDR hackfest [1] which took place a few days ago. Engineers familiar with the AMD, Intel and NVIDIA hardware have participated in the discussion. This proposal takes a prescriptive approach instead of a descriptive approach. Drivers describe the available hardware blocks in terms of low-level mathematical operations, then user-space configures each block. We decided against a descriptive approach where user-space would provide a high-level description of the colorspace and other parameters: we want to give more control and flexibility to user-space, e.g. to be able to replicate exactly the color pipeline with shaders and switch between shaders and KMS pipelines seamlessly, and to avoid forcing user-space into a particular color management policy. Thanks for posting this Simon! This overview does a great job of breaking down the proposal. A few questions inline below. We've decided against mirroring the existing CRTC properties DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management pipeline can significantly differ between vendors and this approach cannot accurately abstract all hardware. In particular, the availability, ordering and capabilities of hardware blocks is different on each display engine. So, we've decided to go for a highly detailed hardware capability discovery. This new uAPI should not be in conflict with existing standard KMS properties, since there are none which control the pre-blending color pipeline at the moment. 
It does conflict with any vendor-specific properties like NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific properties. Drivers will need to either reject atomic commits configuring both uAPIs, or alternatively we could add a DRM client cap which hides the vendor properties and shows the new generic properties when enabled. To use this uAPI, first user-space needs to discover hardware capabilities via KMS objects and properties, then user-space can configure the hardware via an atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. Our proposal introduces a new "color_pipeline" plane property, and a new KMS object type, "COLOROP" (short for color operation). The "color_pipeline" plane property is an enum, each enum entry represents a color pipeline supported by the hardware. The special zero entry indicates that the pipeline is in "bypass"/"no-op" mode. For instance, the following plane properties describe a primary plane with 2 supported pipelines but currently configured in bypass mode: Plane 10 ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary ├─ … └─ "color_pipeline": enum {0, 42, 52} = 0 The non-zero entries describe color pipelines as a linked list of COLOROP KMS objects. The entry value is an object ID pointing to the head of the linked list (the first operation in the color pipeline). The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D curves? Will different hardware be allowed to expose a subset of these enum values? 
├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 Some hardware has per channel 1D LUT values, while others use the same LUT for all channels. We will definitely need to expose this in the UAPI in some form. To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. The bit-depth of the LUT is an important piece of information we should include by default. Are we assuming that the DRM driver will always reduce the input values to the resolution supported by the pipeline? This could result in differences between the hardware behavior and the shader behavior. Additionally, some pipelines are floating point while others are fixed. How would user space know if it needs to pack 32 bit integer values vs 32 bit float values? Here is another example with a 3D LUT: Color operation 42 ├─ "type": enum {Bypass, 3D LUT} = 3D LUT ├─ "lut_size": immutable range = 33 ├─ "lut_data":
Re: [RFC] Plane color pipeline KMS uAPI
On Thu, 11 May 2023 19:29:27 + Simon Ser wrote: > On Thursday, May 11th, 2023 at 18:56, Joshua Ashton wrote: > > > When we are talking about being 'prescriptive' in the API, are we > > outright saying we don't want to support arbitrary 3D LUTs, or are we > > just offering certain algorithms to be 'executed' for a plane/crtc/etc > > in the atomic API? I am confused... > > From a kernel PoV: > > - Prescriptive = here are the available hardware blocks, feel free to > configure each as you like > - Descriptive = give me the source and destination color-spaces and I > take care of everything > > This proposal is a prescriptive API. We haven't explored _that_ much > how a descriptive API would look like, probably it can include some way > to do Night Light and similar features but not sure how high-level > they'd look like. A descriptive API is inherently more restrictive than > a prescriptive API. Right. Just like Jonas said, an arbitrary 3D LUT is a well-defined mathematical operation with no semantics at all, therefore it is a prescriptive element. A 3D LUT does not fit well in a descriptive API design, one would need to jump through lots of hoops to turn it into something descriptive'ish (like ICC does). I think Joshua mixed up the definitions of "descriptive" and "prescriptive". If Gamescope was using a descriptive KMS UAPI, then it would have very little or no say in what color operations are done and how. If Gamescope is using prescriptive KMS UAPI, then Gamescope has to know exactly what it wants to do, how it wants to achieve that, and map that to the available mathematical processing blocks. A descriptive UAPI would mean all color policy is in the kernel. A prescriptive UAPI means all policy is in userspace. Wayland uses the opposite design principle of KMS UAPI. Wayland is descriptive, KMS is prescriptive. This puts the color policy into a Wayland compositor. If we have a library converting descriptive to prescriptive, then that library contains a policy. 
Going from descriptive to prescriptive is easy, just add policy. Going from prescriptive to descriptive is practically impossible, because you'd have to "subtract" any policy that has already been applied, in order to understand what the starting point was. Coming back to KMS, the color transformations must be prescriptive, but then we also need to be able to send descriptive information to video sinks so that video sinks understand what our pixel values mean. Thanks, pq
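Pekka's point that an arbitrary 3D LUT is a well-defined mathematical operation with no semantics can be made concrete. Below is a minimal userspace-side sketch of applying such a LUT to one normalized pixel, assuming a red-major entry layout and nearest-neighbor sampling; neither detail has been fixed by the RFC, and real hardware typically interpolates, so treat this purely as an illustration of the math:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: entries indexed red-major as lut[r][g][b].
 * The RFC has not fixed the actual blob layout; illustration only. */
struct rgb { float r, g, b; };

static size_t lut_index(int size, int ri, int gi, int bi)
{
	return ((size_t)ri * size + gi) * size + bi;
}

/* Nearest-neighbor application of a size^3 LUT to one pixel with
 * channels normalized to [0.0, 1.0]. */
static struct rgb lut3d_apply(const struct rgb *lut, int size, struct rgb in)
{
	int ri = (int)(in.r * (size - 1) + 0.5f);
	int gi = (int)(in.g * (size - 1) + 0.5f);
	int bi = (int)(in.b * (size - 1) + 0.5f);

	return lut[lut_index(size, ri, gi, bi)];
}
```

Nothing in the operation says what the LUT is *for* (gamut mapping, tone mapping, night mode, ...); that meaning lives entirely in whoever filled in the entries, which is exactly why it fits a prescriptive API and not a descriptive one.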
Re: [RFC] Plane color pipeline KMS uAPI
On Friday, May 5th, 2023 at 15:30, Joshua Ashton wrote: > > > AMD would expose the following objects and properties: > > > > > > Plane 10 > > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > > └─ "color_pipeline": enum {0, 42} = 0 > > > Color operation 42 (input CSC) > > > ├─ "type": enum {Bypass, Matrix} = Matrix > > > ├─ "matrix_data": blob > > > └─ "next": immutable color operation ID = 43 > > > Color operation 43 > > > ├─ "type": enum {Scaling} = Scaling > > > └─ "next": immutable color operation ID = 44 > > > Color operation 44 (DeGamma) > > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > > ├─ "1d_curve_type": enum {sRGB, PQ, …} = sRGB > > > └─ "next": immutable color operation ID = 45 > > Some vendors have per-tap degamma and some have a degamma after the sample. > How do we distinguish that behaviour? > It is important to know. Can you elaborate? What is "per-tap" and "sample"? Is the "Scaling" color operation above not enough to indicate where in the pipeline the hw performs scaling? 
> > > Color operation 45 (gamut remap) > > > ├─ "type": enum {Bypass, Matrix} = Matrix > > > ├─ "matrix_data": blob > > > └─ "next": immutable color operation ID = 46 > > > Color operation 46 (shaper LUT RAM) > > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > > ├─ "1d_curve_type": enum {LUT} = LUT > > > ├─ "lut_size": immutable range = 4096 > > > ├─ "lut_data": blob > > > └─ "next": immutable color operation ID = 47 > > > Color operation 47 (3D LUT RAM) > > > ├─ "type": enum {Bypass, 3D LUT} = 3D LUT > > > ├─ "lut_size": immutable range = 17 > > > ├─ "lut_data": blob > > > └─ "next": immutable color operation ID = 48 > > > Color operation 48 (blend gamma) > > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, …} = LUT > > > ├─ "lut_size": immutable range = 4096 > > > ├─ "lut_data": blob > > > └─ "next": immutable color operation ID = 0 > > > > > > To configure the pipeline for an HDR10 PQ plane (path at the top) and a > > > HDR > > > display, gamescope would perform an atomic commit with the following > > > property > > > values: > > > > > > Plane 10 > > > └─ "color_pipeline" = 42 > > > Color operation 42 (input CSC) > > > └─ "matrix_data" = PQ → scRGB (TF) > > ^ > Not sure what this is. > We don't use an input CSC before degamma. > > > > Color operation 44 (DeGamma) > > > └─ "type" = Bypass > > ^ > If we did PQ, this would be PQ -> Linear / 80 > If this was sRGB, it'd be sRGB -> Linear > If this was scRGB this would be just treating it as it is. So... Linear / 80. > > > > Color operation 45 (gamut remap) > > > └─ "matrix_data" = scRGB (TF) → PQ > > ^ > This is wrong, we just use this to do scRGB primaries (709) to 2020. > > We then go from scRGB -> PQ to go into our shaper + 3D LUT. > > > > Color operation 46 (shaper LUT RAM) > > > └─ "lut_data" = PQ → Display native > > ^ > "Display native" is just the response curve of the display. 
> In HDR10, this would just be PQ -> PQ > If we were doing HDR10 on SDR, this would be PQ -> Gamma 2.2 (mapped > from 0 to display native luminance) [with a potential bit of headroom > for tonemapping in the 3D LUT] > For SDR on HDR10 this would be Gamma 2.2 -> PQ (Not intending to start > an sRGB vs G2.2 argument here! :P) > > > > Color operation 47 (3D LUT RAM) > > > └─ "lut_data" = Gamut mapping + tone mapping + night mode > > > Color operation 48 (blend gamma) > > > └─ "1d_curve_type" = PQ > > ^ > This is wrong, this should be Display Native -> Linearized Display Referred In the HDR case, isn't this the inverse of PQ? > > You cannot do a TF with a matrix, and a gamut remap with a matrix on > > electrical values is certainly surprising, so the example here is a > > bit odd, but I don't think that hurts the intention of demonstration. > > I have done some corrections inline. > > You can see our fully correct color pipeline here: > https://raw.githubusercontent.com/ValveSoftware/gamescope/master/src/docs/Steam%20Deck%20Display%20Pipeline.png > > Please let me know if you have any more questions about our color pipeline. As expected, I got the gamescope part wrong. I'm pretty confident that the proposed API would still work since the AMD vendor-specific props would just be exposed as color operation objects. Can you confirm we can make the gamescope pipeline work with the AMD color pipeline outlined above?
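The matrix colorops in the AMD example above (input CSC, gamut remap) take their coefficients from a "matrix_data" blob whose exact layout the RFC has not pinned down. The existing KMS CTM property stores 9 values in sign-magnitude S31.32 fixed point (sign in bit 63, magnitude below); a minimal sketch of that packing, assuming the new matrix colorop were to keep the CTM convention, looks like:

```c
#include <assert.h>
#include <stdint.h>
#include <math.h>

/* Pack a double into the sign-magnitude S31.32 format used by the
 * existing KMS CTM property: sign in bit 63, magnitude in the low
 * 63 bits with 32 fractional bits. Whether the proposed matrix
 * colorop reuses this convention is an assumption. */
static uint64_t s31_32_pack(double v)
{
	uint64_t sign = v < 0 ? 1ULL << 63 : 0;

	return sign | (uint64_t)(fabs(v) * (double)(1ULL << 32));
}

static double s31_32_unpack(uint64_t v)
{
	double mag = (double)(v & ~(1ULL << 63)) / (double)(1ULL << 32);

	return (v & 1ULL << 63) ? -mag : mag;
}
```

Note this is sign-magnitude, not two's complement, which is an easy thing for userspace to get wrong when filling the 9-entry blob.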
Re: [RFC] Plane color pipeline KMS uAPI
On Thursday, May 11th, 2023 at 18:56, Joshua Ashton wrote: > When we are talking about being 'prescriptive' in the API, are we > outright saying we don't want to support arbitrary 3D LUTs, or are we > just offering certain algorithms to be 'executed' for a plane/crtc/etc > in the atomic API? I am confused... From a kernel PoV: - Prescriptive = here are the available hardware blocks, feel free to configure each as you like - Descriptive = give me the source and destination color-spaces and I take care of everything This proposal is a prescriptive API. We haven't explored _that_ much how a descriptive API would look like, probably it can include some way to do Night Light and similar features but not sure how high-level they'd look like. A descriptive API is inherently more restrictive than a prescriptive API.
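On the prescriptive side, hardware discovery in the proposed uAPI boils down to walking the "next" links from the pipeline head (e.g. 42 → 43 → ... → 0). A toy in-memory model of that walk, using plain structs instead of real KMS object/property queries and entirely invented names, might look like:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Toy stand-in for COLOROP objects: in the real uAPI these would be
 * KMS objects whose "type" and "next" are read via property queries. */
enum colorop_type {
	COLOROP_BYPASS,
	COLOROP_CURVE_1D,
	COLOROP_MATRIX,
	COLOROP_LUT_3D,
};

struct colorop {
	uint32_t id;
	enum colorop_type type;
	uint32_t next;	/* 0 terminates the pipeline */
};

/* Walk the linked list from the head, counting operations. */
static size_t pipeline_length(const struct colorop *ops, size_t n,
			      uint32_t head)
{
	size_t len = 0;

	while (head != 0) {
		const struct colorop *cur = NULL;

		for (size_t i = 0; i < n; i++)
			if (ops[i].id == head)
				cur = &ops[i];
		if (!cur)
			break;	/* dangling "next" reference */
		len++;
		head = cur->next;
	}
	return len;
}
```

A compositor would run this kind of walk once per enum entry of "color_pipeline" to learn which block sequences the hardware offers, then pick the pipeline that best matches the operations it wants to offload.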
Re: [RFC] Plane color pipeline KMS uAPI
On Thu, May 11, 2023 at 04:56:47PM +0000, Joshua Ashton wrote: > When we are talking about being 'prescriptive' in the API, are we > outright saying we don't want to support arbitrary 3D LUTs, or are we > just offering certain algorithms to be 'executed' for a plane/crtc/etc > in the atomic API? I am confused... The 'prescriptive' idea that the RFC of this thread proposes *is* a way to support arbitrary 3D LUTs (and other mathematical operations), arbitrarily, in a somewhat vendored way, only that it will not be vendor prefixed hard coded properties with specific positions in the pipeline, but instead more or less an introspectable pipeline, describing what kind of LUT's, Matrix multiplication (and in what order) etc a hardware can do. The theoretical userspace library would be the one turning descriptive "please turn this into that" requests into the "prescriptive" color pipeline operations. It would target general purpose compositors, but it wouldn't be mandatory. Doing vendor specific implementations in gamescope would be possible; it wouldn't look like the version that exists somewhere now that uses a bunch of AMD_* properties, it'd look more like the example Simon had in the initial RFC. Jonas > > There is so much stuff to do with color, that I don't think a > prescriptive API in the kernel could ever keep up with the things that > we want to be pushing from Gamescope/SteamOS. For example, we have so > many things going on, night mode, SDR gamut widening, HDR/SDR gain, > the ability to apply 'looks' for eg. invert luma or for retro looks, > enhanced contrast, tonemapping, inverse tonemapping... We also are > going to be doing a bunch of stuff with EETFs for handling out of > range HDR content for scanout. > > Some of what we do is kinda standard, regular "there is a paper on > this" algorithms, and others are not. > While yes, it might be very possible to do simple things, once you > start wanting to do something 'different', that's kinda lock-in. 
> > Whether this co-exists with arbitrary LUTs (that we definitely want > for SteamOS) or not: > I think putting a bunch of math-y stuff like this into the kernel is > probably the complete wrong approach. Everything would need to be > fixed point and it would be a huge pain in the butt to deal with on > that side. > > Maybe this is a "hot take", but IMO, DRM atomic is already waaay too > much being done in the kernel space. I think making it go even further > and having it be a prescriptive color API is a complete step in the > wrong direction. > > There is also the problem of... if there is a bug in the math here or > we want to add a new feature, if it's kernel side, you are locked in > to having that bug until the next release on your distro and probably > years if it's a new feature! > Updating kernels is much harder for 'enterprise' distros if it is not > mission critical. Having all of this in userspace is completely fine > however... > > If you want to make some userspace prescriptive -> descriptive color > library I am all for that for general case compositors, but I don't > think I would use something like that in Gamescope. > That's not to be rude, we are just picky and want freedom to do what > we want and iterate on it easily. > > I guess this all comes back to my initial point... having some > userspace to handle stuff that is either kinda or entirely vendor > specific is the right way of solving this problem :-P > > - Joshie ✨ > > On Thu, 11 May 2023 at 09:51, Karol Herbst wrote: > > > > On Wed, May 10, 2023 at 9:59 AM Jonas Ådahl wrote: > > > > > > On Tue, May 09, 2023 at 08:22:30PM +, Simon Ser wrote: > > > > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie > > > > wrote: > > > > > > > > > There are also other vendor side effects to having this in userspace. > > > > > > > > > > Will the library have a loader? > > > > > Will it allow proprietary plugins? > > > > > Will it allow proprietary reimplementations? 
> > > > > What will happen when a vendor wants distros to ship their > > > > > proprietary fork of said library? > > > > > > > > > > How would NVIDIA integrate this with their proprietary stack? > > > > > > > > Since all color operations exposed by KMS are standard, the library > > > > would just be a simple one: no loader, no plugin, no proprietary pieces, > > > > etc. > > > > > > > > > > There might be pipelines/color-ops only exposed by proprietary out of > > > tree drivers; the operation types and semantics should ideally be > > > defined upstream, but the code paths would in practice be vendor > > > specific, potentially without any upstream driver using them. It should > > > be clear whether an implementation that makes such a pipeline work is in > > > scope for the upstream library. > > > > > > The same applies to the kernel; it must be clear whether pipeline > > > elements that potentially will only be exposed by out of tree drivers > > > will be acceptable upstream, at least as documented operations. > > > > >
Re: [RFC] Plane color pipeline KMS uAPI
When we are talking about being 'prescriptive' in the API, are we outright saying we don't want to support arbitrary 3D LUTs, or are we just offering certain algorithms to be 'executed' for a plane/crtc/etc in the atomic API? I am confused... There is so much stuff to do with color, that I don't think a prescriptive API in the kernel could ever keep up with the things that we want to be pushing from Gamescope/SteamOS. For example, we have so many things going on, night mode, SDR gamut widening, HDR/SDR gain, the ability to apply 'looks' for eg. invert luma or for retro looks, enhanced contrast, tonemapping, inverse tonemapping... We also are going to be doing a bunch of stuff with EETFs for handling out of range HDR content for scanout. Some of what we do is kinda standard, regular "there is a paper on this" algorithms, and others are not. While yes, it might be very possible to do simple things, once you start wanting to do something 'different', that's kinda lock-in. Whether this co-exists with arbitrary LUTs (that we definitely want for SteamOS) or not: I think putting a bunch of math-y stuff like this into the kernel is probably the complete wrong approach. Everything would need to be fixed point and it would be a huge pain in the butt to deal with on that side. Maybe this is a "hot take", but IMO, DRM atomic is already waaay too much being done in the kernel space. I think making it go even further and having it be a prescriptive color API is a complete step in the wrong direction. There is also the problem of... if there is a bug in the math here or we want to add a new feature, if it's kernel side, you are locked in to having that bug until the next release on your distro and probably years if it's a new feature! Updating kernels is much harder for 'enterprise' distros if it is not mission critical. Having all of this in userspace is completely fine however... 
If you want to make some userspace prescriptive -> descriptive color library I am all for that for general case compositors, but I don't think I would use something like that in Gamescope. That's not to be rude, we are just picky and want freedom to do what we want and iterate on it easily. I guess this all comes back to my initial point... having some userspace to handle stuff that is either kinda or entirely vendor specific is the right way of solving this problem :-P - Joshie ✨ On Thu, 11 May 2023 at 09:51, Karol Herbst wrote: > > On Wed, May 10, 2023 at 9:59 AM Jonas Ådahl wrote: > > > > On Tue, May 09, 2023 at 08:22:30PM +, Simon Ser wrote: > > > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > > > > > > > There are also other vendor side effects to having this in userspace. > > > > > > > > Will the library have a loader? > > > > Will it allow proprietary plugins? > > > > Will it allow proprietary reimplementations? > > > > What will happen when a vendor wants distros to ship their > > > > proprietary fork of said library? > > > > > > > > How would NVIDIA integrate this with their proprietary stack? > > > > > > Since all color operations exposed by KMS are standard, the library > > > would just be a simple one: no loader, no plugin, no proprietary pieces, > > > etc. > > > > > > > There might be pipelines/color-ops only exposed by proprietary out of > > tree drivers; the operation types and semantics should ideally be > > defined upstream, but the code paths would in practice be vendor > > specific, potentially without any upstream driver using them. It should > > be clear whether an implementation that makes such a pipeline work is in > > scope for the upstream library. > > > > The same applies to the kernel; it must be clear whether pipeline > > elements that potentially will only be exposed by out of tree drivers > > will be acceptable upstream, at least as documented operations. > > > > they aren't. 
All code in the kernel needs to be used by in-tree > drivers otherwise it's fair to delete it. DRM requires any UAPI change > to have a real open source user space user. > > Nvidia knows this and they went to great lengths to fulfill this > requirement in the past. They'll manage. > > > > > Jonas > > >
Re: [RFC] Plane color pipeline KMS uAPI
On Wed, May 10, 2023 at 9:59 AM Jonas Ådahl wrote: > > On Tue, May 09, 2023 at 08:22:30PM +0000, Simon Ser wrote: > > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > > > > > There are also other vendor side effects to having this in userspace. > > > > > > Will the library have a loader? > > > Will it allow proprietary plugins? > > > Will it allow proprietary reimplementations? > > > What will happen when a vendor wants distros to ship their > > > proprietary fork of said library? > > > > > > How would NVIDIA integrate this with their proprietary stack? > > > > Since all color operations exposed by KMS are standard, the library > > would just be a simple one: no loader, no plugin, no proprietary pieces, > > etc. > > > > There might be pipelines/color-ops only exposed by proprietary out of > tree drivers; the operation types and semantics should ideally be > defined upstream, but the code paths would in practice be vendor > specific, potentially without any upstream driver using them. It should > be clear whether an implementation that makes such a pipeline work is in > scope for the upstream library. > > The same applies to the kernel; it must be clear whether pipeline > elements that potentially will only be exposed by out of tree drivers > will be acceptable upstream, at least as documented operations. > they aren't. All code in the kernel needs to be used by in-tree drivers otherwise it's fair to delete it. DRM requires any UAPI change to have a real open source user space user. Nvidia knows this and they went to great lengths to fulfill this requirement in the past. They'll manage. > > Jonas >
Re: [RFC] Plane color pipeline KMS uAPI
On Wed, 10 May 2023 09:59:21 +0200 Jonas Ådahl wrote: > On Tue, May 09, 2023 at 08:22:30PM +0000, Simon Ser wrote: > > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > > > > > There are also other vendor side effects to having this in userspace. > > > > > > Will the library have a loader? > > > Will it allow proprietary plugins? > > > Will it allow proprietary reimplementations? > > > What will happen when a vendor wants distros to ship their > > > proprietary fork of said library? > > > > > > How would NVIDIA integrate this with their proprietary stack? > > > > Since all color operations exposed by KMS are standard, the library > > would just be a simple one: no loader, no plugin, no proprietary pieces, > > etc. > > > > There might be pipelines/color-ops only exposed by proprietary out of > tree drivers; the operation types and semantics should ideally be > defined upstream, but the code paths would in practice be vendor > specific, potentially without any upstream driver using them. It should > be clear whether an implementation that makes such a pipeline work is in > scope for the upstream library. > > The same applies to the kernel; it must be clear whether pipeline > elements that potentially will only be exposed by out of tree drivers > will be acceptable upstream, at least as documented operations. In my opinion, a COLOROP element definition can be accepted in the upstream kernel documentation only if there is also an upstream driver implementing it. It does not need to be a "direct" hardware implementation, it could also be the upstream driver mapping the COLOROP to whatever hardware block or block chain it has. For the userspace library I don't know. I am puzzled whether people want to allow proprietary components or deny them. Thanks, pq
Re: [RFC] Plane color pipeline KMS uAPI
On Tue, 09 May 2023 20:22:30 +0000 Simon Ser wrote: > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > > > There are also other vendor side effects to having this in userspace. > > > > Will the library have a loader? > > Will it allow proprietary plugins? > > Will it allow proprietary reimplementations? > > What will happen when a vendor wants distros to ship their > > proprietary fork of said library? > > > > How would NVIDIA integrate this with their proprietary stack? > > Since all color operations exposed by KMS are standard, the library > would just be a simple one: no loader, no plugin, no proprietary pieces, > etc. Hi, that's certainly the long term goal, and *if* Linux software can in any way guide hardware design, then I believe it is an achievable goal. I understand "standard" as something that is widely implemented in various hardware rather than only "well-defined and documented and free to implement in any hardware if its vendor cared". However, like I mentioned in my other reply to Steven, I expect there will be a time period when each hardware has custom processing blocks no other hardware (same or different vendor) has. I might not call them outright proprietary though, because in order to have them exposed via UAPI, the mathematical model of the processing block must be documented with its UAPI. This means there cannot be secrets on what the hardware does, which means there cannot be a requirement for secret sauce in userspace either. I wonder if we can also require new COLOROP elements to be freely implementable by anyone anywhere in any way one wants? Or do kernel maintainers just need to NAK proposals for elements that might not be that free? Anything that is driver-chosen or automatic can also be proprietary, because today's KMS UAPI rules do not require documenting how automatic features work, e.g. the existing YUV-to-RGB conversion. Hardware could have whatever wild skin tone improvement algorithms hidden in there for example.
In this new proposal, there cannot be undocumented behaviour. Dave, if we went with a descriptive UAPI model, everything behind it could be proprietary and secret. That's not open in the least. On Wed, 10 May 2023 at 00:31, Harry Wentland wrote: > > I am debating whether we need to be serious about a userspace library > (or maybe a user-mode driver) to provide an abstraction from the > descriptive to the prescriptive model. HW vendors need a way to provide > timely support for new HW generations without requiring updates to a > large number of compositors. Drivers can always map old COLOROP elements to new style hardware blocks if they can achieve the same mathematical operation up to whatever precision was promised before. I think that should be the main form of supporting hardware evolution. Then also add new alternative COLOROP elements that can better utilize the hardware block. Naturally that means that COLOROP elements must be designed to be somewhat generic to have a reasonable life time. They cannot be extremely tightly married to the hardware implementation that might cease to exist in the very next hardware revision. Let's say some vendor has a hardware block that does a series of operations in an optimized fashion, perhaps with hardwired constants. This is exposed as a custom COLOROP element. The next hardware revision no longer has this block, but it has a bunch of new blocks that can produce the exact same result. The driver for this hardware can expose two different pipelines: one using the old COLOROP element, and another using a bunch of other COLOROP elements which exposes the new flexibility of the hardware design better. If userspace chooses the former pipeline, the driver just programs the bunch of blocks to behave accordingly. Hopefully the other COLOROP elements will be more standard than the old element. 
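The two-pipeline compatibility scenario described in this message can be sketched as capability matching in userspace. This is a hypothetical illustration only: the pipeline IDs (42, 52) echo examples used in this thread, but the element type names and the selection heuristic are invented, not part of the proposed uAPI.

```python
# Hypothetical sketch: a compositor picking one of the color pipelines a
# driver advertises. In the real proposal, pipelines are entries of the
# "color_pipeline" plane property and their elements are COLOROP objects
# discovered by walking the "next" links; here they are plain dicts.

def choose_pipeline(pipelines, wanted_ops):
    """Return the ID of the shortest pipeline whose element types cover
    every wanted operation, or 0 (bypass) to fall back to shaders."""
    candidates = [
        (pid, ops) for pid, ops in pipelines.items()
        if all(op in ops for op in wanted_ops)
    ]
    if not candidates:
        return 0  # bypass: compose with shaders instead
    return min(candidates, key=lambda c: len(c[1]))[0]

pipelines = {
    42: ["custom_compound_tonemap"],                   # legacy hardwired block
    52: ["1d_curve", "matrix", "3d_lut", "1d_curve"],  # generic chain
}

assert choose_pipeline(pipelines, ["1d_curve", "3d_lut"]) == 52
assert choose_pipeline(pipelines, ["custom_compound_tonemap"]) == 42
assert choose_pipeline(pipelines, ["6x6_matrix"]) == 0
```

A driver emulating the legacy element on newer silicon would keep pipeline 42 in the enum and internally program whichever new blocks reproduce the same math, so userspace written against the old element keeps working unchanged.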
Over time, I hope this causes an evolution where hardware implements only the most standard COLOROP elements, and special-case compound elements will eventually fall out of use over the decades. Thanks, pq
Re: [RFC] Plane color pipeline KMS uAPI
On Tue, May 09, 2023 at 08:22:30PM +, Simon Ser wrote: > On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > > > There are also other vendor side effects to having this in userspace. > > > > Will the library have a loader? > > Will it allow proprietary plugins? > > Will it allow proprietary reimplementations? > > What will happen when a vendor wants distros to ship their > > proprietary fork of said library? > > > > How would NVIDIA integrate this with their proprietary stack? > > Since all color operations exposed by KMS are standard, the library > would just be a simple one: no loader, no plugin, no proprietary pieces, > etc. > There might be pipelines/color-ops only exposed by proprietary out of tree drivers; the operation types and semantics should ideally be defined upstream, but the code paths would in practice be vendor specific, potentially without any upstream driver using them. It should be clear whether an implementation that makes such a pipeline work is in scope for the upstream library. The same applies to the kernel; it must be clear whether pipeline elements that potentially will only be exposed by out of tree drivers will be acceptable upstream, at least as documented operations. Jonas
Re: [RFC] Plane color pipeline KMS uAPI
On Tuesday, May 9th, 2023 at 21:53, Dave Airlie wrote: > There are also other vendor side effects to having this in userspace. > > Will the library have a loader? > Will it allow proprietary plugins? > Will it allow proprietary reimplementations? > What will happen when a vendor wants distros to ship their > proprietary fork of said library? > > How would NVIDIA integrate this with their proprietary stack? Since all color operations exposed by KMS are standard, the library would just be a simple one: no loader, no plugin, no proprietary pieces, etc.
Re: [RFC] Plane color pipeline KMS uAPI
On Wed, 10 May 2023 at 00:31, Harry Wentland wrote: > > > > On 5/7/23 19:14, Dave Airlie wrote: > > On Sat, 6 May 2023 at 08:21, Sebastian Wick > > wrote: > >> > >> On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: > >>> > >>> On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > Hi all, > > The goal of this RFC is to expose a generic KMS uAPI to configure the > color > pipeline before blending, ie. after a pixel is tapped from a plane's > framebuffer and before it's blended with other planes. With this new > uAPI we > aim to reduce the battery life impact of color management and HDR on > mobile > devices, to improve performance and to decrease latency by skipping > composition on the 3D engine. This proposal is the result of discussions > at > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > familiar with the AMD, Intel and NVIDIA hardware have participated in the > discussion. > > This proposal takes a prescriptive approach instead of a descriptive > approach. > Drivers describe the available hardware blocks in terms of low-level > mathematical operations, then user-space configures each block. We > decided > against a descriptive approach where user-space would provide a > high-level > description of the colorspace and other parameters: we want to give more > control and flexibility to user-space, e.g. to be able to replicate > exactly the > color pipeline with shaders and switch between shaders and KMS pipelines > seamlessly, and to avoid forcing user-space into a particular color > management > policy. > >>> > >>> I'm not 100% sold on the prescriptive here, let's see if someone can > >>> get me over the line with some questions later. 
> >>> > >>> My feeling is color pipeline hw is not a done deal, and that hw > >>> vendors will be revising/evolving/churning the hw blocks for a while > >>> longer, as there is no real standards in the area to aim for, all the > >>> vendors are mostly just doing whatever gets Windows over the line and > >>> keeps hw engineers happy. So I have some concerns here around forwards > >>> compatibility and hence the API design. > >>> > >>> I guess my main concern is if you expose a bunch of hw blocks and > >>> someone comes up with a novel new thing, will all existing userspace > >>> work, without falling back to shaders? > >>> Do we have minimum guarantees on what hardware blocks have to be > >>> exposed to build a useable pipeline? > >>> If a hardware block goes away in a new silicon revision, do I have to > >>> rewrite my compositor? or will it be expected that the kernel will > >>> emulate the old pipelines on top of whatever new fancy thing exists. > >> > >> I think there are two answers to those questions. > > > > These aren't selling me much better :-) > >> > >> The first one is that right now KMS already doesn't guarantee that > >> every property is supported on all hardware. The guarantee we have is > >> that properties that are supported on a piece of hardware on a > >> specific kernel will be supported on the same hardware on later > >> kernels. The color pipeline is no different here. For a specific piece > >> of hardware a newer kernel might only change the pipelines in a > >> backwards compatible way and add new pipelines. > >> > >> So to answer your question: if some hardware with a novel pipeline > >> will show up it might not be supported and that's fine. We already > >> have cases where some hardware does not support the gamma lut property > >> but only the CSC property and that breaks night light because we never > >> bothered to write a shader fallback. 
KMS provides ways to offload work > >> but a generic user space always has to provide a fallback and this > >> doesn't change. Hardware specific user space on the other hand will > >> keep working with the forward compatibility guarantees we want to > >> provide. > > > > In my mind we've screwed up already, isn't a case to be made for > > continue down the same path. > > > > The kernel is meant to be a hardware abstraction layer, not just a > > hardware exposure layer. The kernel shouldn't set policy and there are > > cases where it can't act as an abstraction layer (like where you need > > a compiler), but I'm not sold that this case is one of those yet. I'm > > open to being educated here on why it would be. > > > > Thanks for raising these points. When I started out looking at color > management I favored the descriptive model. Most other HW vendors > I've talked to also tell me that they think about descriptive APIs > since that allows HW vendors to map that to whatever their HW supports. > > Sebastian, Pekka, and others managed to change my mind about this > but I still keep having difficult questions within AMD. > > Sebastian, Pekka, and Jonas have already done a good job to describe > our reasoning behind the
Re: [RFC] Plane color pipeline KMS uAPI
On 05/09, Pekka Paalanen wrote: > On Tue, 9 May 2023 10:23:49 -0100 > Melissa Wen wrote: > > > On 05/05, Joshua Ashton wrote: > > > Some corrections and replies inline. > > > > > > On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: > > > > > > > > On Thu, 04 May 2023 15:22:59 + > > > > Simon Ser wrote: > > > > > > ... > > > > > > Color operation 47 (3D LUT RAM) > > > > > └─ "lut_data" = Gamut mapping + tone mapping + night mode > > > > > Color operation 48 (blend gamma) > > > > > └─ "1d_curve_type" = PQ > > > > > > ^ > > > This is wrong, this should be Display Native -> Linearized Display > > > Referred > > > > This is a good point to discuss. I understand for the HDR10 case that we > > are just setting an enumerated TF (that is PQ for this case - correct me > > if I got it wrong) but, unlike when we use a user-LUT, we don't know > > from the API that this enumerated TF value with an empty LUT is used for > > linearizing/degamma. Perhaps this could come as a pair? Any idea? > > PQ curve is an EOTF, so it's always from electrical to optical. > > Are you asking for something like > > "1d_curve_type" = "PQ EOTF" > > vs. > > "1d_curve_type" = "inverse PQ EOTF"? > > I think that's how it should work. It's not a given that if a > hardware block can do a curve, it can also do its inverse. They need to > be advertised explicitly. Sounds good and clear to me. Thanks! Melissa > > > Thanks, > pq > > ps. I picked my nick in the 90s. Any resemblance to Perceptual > Quantizer is unintended. ;-) :D > > > > > > > > > > You cannot do a TF with a matrix, and a gamut remap with a matrix on > > > > electrical values is certainly surprising, so the example here is a > > > > bit odd, but I don't think that hurts the intention of demonstration. > > > > > > I have done some corrections inline. 
> > > > > > You can see our fully correct color pipeline here: > > > https://raw.githubusercontent.com/ValveSoftware/gamescope/master/src/docs/Steam%20Deck%20Display%20Pipeline.png > > > > > > Please let me know if you have any more questions about our color > > > pipeline.
Re: [RFC] Plane color pipeline KMS uAPI
On 5/7/23 19:14, Dave Airlie wrote: > On Sat, 6 May 2023 at 08:21, Sebastian Wick wrote: >> >> On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: >>> >>> On Fri, 5 May 2023 at 01:23, Simon Ser wrote: Hi all, The goal of this RFC is to expose a generic KMS uAPI to configure the color pipeline before blending, ie. after a pixel is tapped from a plane's framebuffer and before it's blended with other planes. With this new uAPI we aim to reduce the battery life impact of color management and HDR on mobile devices, to improve performance and to decrease latency by skipping composition on the 3D engine. This proposal is the result of discussions at the Red Hat HDR hackfest [1] which took place a few days ago. Engineers familiar with the AMD, Intel and NVIDIA hardware have participated in the discussion. This proposal takes a prescriptive approach instead of a descriptive approach. Drivers describe the available hardware blocks in terms of low-level mathematical operations, then user-space configures each block. We decided against a descriptive approach where user-space would provide a high-level description of the colorspace and other parameters: we want to give more control and flexibility to user-space, e.g. to be able to replicate exactly the color pipeline with shaders and switch between shaders and KMS pipelines seamlessly, and to avoid forcing user-space into a particular color management policy. >>> >>> I'm not 100% sold on the prescriptive here, let's see if someone can >>> get me over the line with some questions later. >>> >>> My feeling is color pipeline hw is not a done deal, and that hw >>> vendors will be revising/evolving/churning the hw blocks for a while >>> longer, as there is no real standards in the area to aim for, all the >>> vendors are mostly just doing whatever gets Windows over the line and >>> keeps hw engineers happy. So I have some concerns here around forwards >>> compatibility and hence the API design. 
>>> >>> I guess my main concern is if you expose a bunch of hw blocks and >>> someone comes up with a novel new thing, will all existing userspace >>> work, without falling back to shaders? >>> Do we have minimum guarantees on what hardware blocks have to be >>> exposed to build a useable pipeline? >>> If a hardware block goes away in a new silicon revision, do I have to >>> rewrite my compositor? or will it be expected that the kernel will >>> emulate the old pipelines on top of whatever new fancy thing exists. >> >> I think there are two answers to those questions. > > These aren't selling me much better :-) >> >> The first one is that right now KMS already doesn't guarantee that >> every property is supported on all hardware. The guarantee we have is >> that properties that are supported on a piece of hardware on a >> specific kernel will be supported on the same hardware on later >> kernels. The color pipeline is no different here. For a specific piece >> of hardware a newer kernel might only change the pipelines in a >> backwards compatible way and add new pipelines. >> >> So to answer your question: if some hardware with a novel pipeline >> will show up it might not be supported and that's fine. We already >> have cases where some hardware does not support the gamma lut property >> but only the CSC property and that breaks night light because we never >> bothered to write a shader fallback. KMS provides ways to offload work >> but a generic user space always has to provide a fallback and this >> doesn't change. Hardware specific user space on the other hand will >> keep working with the forward compatibility guarantees we want to >> provide. > > In my mind we've screwed up already, isn't a case to be made for > continue down the same path. > > The kernel is meant to be a hardware abstraction layer, not just a > hardware exposure layer. 
The kernel shouldn't set policy and there are > cases where it can't act as an abstraction layer (like where you need > a compiler), but I'm not sold that this case is one of those yet. I'm > open to being educated here on why it would be. > Thanks for raising these points. When I started out looking at color management I favored the descriptive model. Most other HW vendors I've talked to also tell me that they think about descriptive APIs since that allows HW vendors to map that to whatever their HW supports. Sebastian, Pekka, and others managed to change my mind about this but I still keep having difficult questions within AMD. Sebastian, Pekka, and Jonas have already done a good job to describe our reasoning behind the prescriptive model. It might be helpful to see how different the results of different tone-mapping operators can look: http://helgeseetzen.com/wp-content/uploads/2017/06/HS1.pdf According to my understanding all other platforms that have HDR now have a single compositor. At least
Re: [RFC] Plane color pipeline KMS uAPI
On Tue, 9 May 2023 10:23:49 -0100 Melissa Wen wrote: > On 05/05, Joshua Ashton wrote: > > Some corrections and replies inline. > > > > On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: > > > > > > On Thu, 04 May 2023 15:22:59 +0000 > > > Simon Ser wrote: > > > ... > > > > Color operation 47 (3D LUT RAM) > > > > └─ "lut_data" = Gamut mapping + tone mapping + night mode > > > > Color operation 48 (blend gamma) > > > > └─ "1d_curve_type" = PQ > > > > ^ > > This is wrong, this should be Display Native -> Linearized Display Referred > > > > This is a good point to discuss. I understand for the HDR10 case that we > are just setting an enumerated TF (that is PQ for this case - correct me > if I got it wrong) but, unlike when we use a user-LUT, we don't know > from the API that this enumerated TF value with an empty LUT is used for > linearizing/degamma. Perhaps this could come as a pair? Any idea? PQ curve is an EOTF, so it's always from electrical to optical. Are you asking for something like "1d_curve_type" = "PQ EOTF" vs. "1d_curve_type" = "inverse PQ EOTF"? I think that's how it should work. It's not a given that if a hardware block can do a curve, it can also do its inverse. They need to be advertised explicitly. Thanks, pq ps. I picked my nick in the 90s. Any resemblance to Perceptual Quantizer is unintended. ;-) > > > > > > You cannot do a TF with a matrix, and a gamut remap with a matrix on > > > electrical values is certainly surprising, so the example here is a > > > bit odd, but I don't think that hurts the intention of demonstration. > > > > I have done some corrections inline. > > > > You can see our fully correct color pipeline here: > > https://raw.githubusercontent.com/ValveSoftware/gamescope/master/src/docs/Steam%20Deck%20Display%20Pipeline.png > > > > Please let me know if you have any more questions about our color pipeline.
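For reference, the PQ curve discussed here is the SMPTE ST 2084 EOTF, which has a closed form, so both directions can be written down exactly. A minimal sketch, with both electrical and optical values normalized to [0, 1] (optical 1.0 corresponding to 10000 cd/m²):

```python
# SMPTE ST 2084 (PQ) EOTF and its inverse. The constants are the
# standard ST 2084 ones; e is the normalized electrical value, y the
# normalized optical (linear light) value.
m1 = 1305 / 8192   # 0.1593017578125
m2 = 2523 / 32     # 78.84375
c1 = 107 / 128     # 0.8359375
c2 = 2413 / 128    # 18.8515625
c3 = 2392 / 128    # 18.6875

def pq_eotf(e):
    """Electrical -> optical: what a "PQ EOTF" 1D curve would apply."""
    p = e ** (1 / m2)
    return (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

def pq_inv_eotf(y):
    """Optical -> electrical: the separately advertised inverse."""
    p = y ** m1
    return ((c1 + c2 * p) / (1 + c3 * p)) ** m2

# The two directions really are each other's inverses:
for y in (0.0, 0.01, 0.1, 0.5, 1.0):
    assert abs(pq_eotf(pq_inv_eotf(y)) - y) < 1e-9
```

The point made above stands regardless of the math: even though the inverse exists analytically, a hardware block that evaluates one direction cannot be assumed to evaluate the other, so each direction needs its own enum entry.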
Re: [RFC] Plane color pipeline KMS uAPI
On 05/05, Joshua Ashton wrote: > Some corrections and replies inline. > > On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: > > > > On Thu, 04 May 2023 15:22:59 + > > Simon Ser wrote: > > > > > Hi all, > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > color > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > framebuffer and before it's blended with other planes. With this new uAPI > > > we > > > aim to reduce the battery life impact of color management and HDR on > > > mobile > > > devices, to improve performance and to decrease latency by skipping > > > composition on the 3D engine. This proposal is the result of discussions > > > at > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > > discussion. > > > > Hi Simon, > > > > this is an excellent write-up, thank you! > > > > Harry's question about what constitutes UAPI is a good one for danvet. > > > > I don't really have much to add here, a couple inline comments. I think > > this could work. > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > approach. > > > Drivers describe the available hardware blocks in terms of low-level > > > mathematical operations, then user-space configures each block. We decided > > > against a descriptive approach where user-space would provide a high-level > > > description of the colorspace and other parameters: we want to give more > > > control and flexibility to user-space, e.g. to be able to replicate > > > exactly the > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > seamlessly, and to avoid forcing user-space into a particular color > > > management > > > policy. > > > > > > We've decided against mirroring the existing CRTC properties > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. 
Indeed, the color management > > > pipeline can significantly differ between vendors and this approach cannot > > > accurately abstract all hardware. In particular, the availability, > > > ordering and > > > capabilities of hardware blocks is different on each display engine. So, > > > we've > > > decided to go for a highly detailed hardware capability discovery. > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > properties, > > > since there are none which control the pre-blending color pipeline at the > > > moment. It does conflict with any vendor-specific properties like > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > > > properties. Drivers will need to either reject atomic commits configuring > > > both > > > uAPIs, or alternatively we could add a DRM client cap which hides the > > > vendor > > > properties and shows the new generic properties when enabled. > > > > > > To use this uAPI, first user-space needs to discover hardware > > > capabilities via > > > KMS objects and properties, then user-space can configure the hardware > > > via an > > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > > > > > Our proposal introduces a new "color_pipeline" plane property, and a new > > > KMS > > > object type, "COLOROP" (short for color operation). The "color_pipeline" > > > plane > > > property is an enum, each enum entry represents a color pipeline > > > supported by > > > the hardware. The special zero entry indicates that the pipeline is in > > > "bypass"/"no-op" mode. 
For instance, the following plane properties > > > describe a > > > primary plane with 2 supported pipelines but currently configured in > > > bypass > > > mode: > > > > > > Plane 10 > > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > > ├─ … > > > └─ "color_pipeline": enum {0, 42, 52} = 0 > > > > > > The non-zero entries describe color pipelines as a linked list of COLOROP > > > KMS > > > objects. The entry value is an object ID pointing to the head of the > > > linked > > > list (the first operation in the color pipeline). > > > > > > The new COLOROP objects also expose a number of KMS properties. Each has a > > > type, a reference to the next COLOROP object in the linked list, and other > > > type-specific properties. Here is an example for a 1D LUT operation: > > > > > > Color operation 42 > > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > > > ├─ "lut_size": immutable range = 4096 > > > ├─ "lut_data": blob > > > └─ "next": immutable color operation ID = 43 > > > > > > To configure this hardware block, user-space can fill a KMS blob with > > > 4096 u32 > > > entries, then set "lut_data" to the blob ID. Other color operation types > > > might > > > have different properties. > > > > > > Here is another example with a 3D LUT: > > > > > > Color operation 42 > > >
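The blob-filling step in the quoted example can be sketched from userspace. Note the RFC text only says the blob holds 4096 u32 entries; the full-range unsigned encoding and the identity ramp below are assumptions made for illustration, and the libdrm calls in the trailing comment are how such a blob would typically be created and attached.

```python
import struct

LUT_SIZE = 4096  # from the immutable "lut_size" property in the example

def identity_lut_blob(size=LUT_SIZE):
    """Pack an identity 1D LUT as `size` little-endian u32 entries.
    ASSUMPTION: entries span the full u32 range; the real uAPI would
    pin down the exact fixed-point format of each entry."""
    maxval = (1 << 32) - 1
    entries = [i * maxval // (size - 1) for i in range(size)]
    return struct.pack(f"<{size}I", *entries)

blob = identity_lut_blob()
assert len(blob) == LUT_SIZE * 4  # 4096 u32 entries, a 16 KiB payload

# Userspace (in C, via libdrm) would then do roughly:
#   drmModeCreatePropertyBlob(fd, data, len, &blob_id);
#   drmModeAtomicAddProperty(req, colorop_id, lut_data_prop_id, blob_id);
# and commit the atomic request.
```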
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 18:54:09 -0500 Steven Kucharzyk wrote: > I'd like to ask if there is a block/flow chart/diagram that has been > created that represent the elements that are being discussed for this > RFC? If so, would you be so kind as to point me to it or send it to me? Hi Steven, the whole point of the design is that there is no predefined block diagram or flow chart. It would not fit hardware well, as hardware generations and vendors do not generally have a common design. Instead, the idea is to model what the hardware can do, and for that each driver will create a set of specific pipelines the hardware implements. Userspace then chooses a pipeline that suits it and populates its parameters. As for the elements themselves, we can hopefully define some commonly available types, but undoubtedly there will be a few hardware-specific elements as well. Otherwise some piece of special hardware functionality cannot be used at all. The job of defining a generic pipeline model and mapping that to actual hardware elements is left for a userspace library. I expect there will be multiple pipeline models, more to be introduced over time. Hence putting that in a userspace library instead of carving it in stone at the kernel UAPI. Next time, please do use reply-to-all, you have again dropped everyone and other mailing lists from the CC. Thanks, pq
Re: [RFC] Plane color pipeline KMS uAPI
On Thu, 04 May 2023 15:22:59 + Simon Ser wrote: > Hi all, > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > pipeline before blending, ie. after a pixel is tapped from a plane's > framebuffer and before it's blended with other planes. With this new uAPI we > aim to reduce the battery life impact of color management and HDR on mobile > devices, to improve performance and to decrease latency by skipping > composition on the 3D engine. This proposal is the result of discussions at > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > familiar with the AMD, Intel and NVIDIA hardware have participated in the > discussion. > > This proposal takes a prescriptive approach instead of a descriptive approach. > Drivers describe the available hardware blocks in terms of low-level > mathematical operations, then user-space configures each block. We decided > against a descriptive approach where user-space would provide a high-level > description of the colorspace and other parameters: we want to give more > control and flexibility to user-space, e.g. to be able to replicate exactly > the > color pipeline with shaders and switch between shaders and KMS pipelines > seamlessly, and to avoid forcing user-space into a particular color management > policy. > > We've decided against mirroring the existing CRTC properties > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > pipeline can significantly differ between vendors and this approach cannot > accurately abstract all hardware. In particular, the availability, ordering > and > capabilities of hardware blocks is different on each display engine. So, we've > decided to go for a highly detailed hardware capability discovery. > > This new uAPI should not be in conflict with existing standard KMS properties, > since there are none which control the pre-blending color pipeline at the > moment. 
It does conflict with any vendor-specific properties like > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > properties. Drivers will need to either reject atomic commits configuring both > uAPIs, or alternatively we could add a DRM client cap which hides the vendor > properties and shows the new generic properties when enabled. Hi, I have some further ideas which do conflict with some existing KMS properties. This dives into the color-encoding-specific side of the UAPI. The main idea is to make the color pipeline not specific to RGB. We might actually have YCbCr, XYZ, ICtCp and whatnot instead, at least in the middle of a pipeline. The aim is to avoid the confusion from statements like "my red channel is actually luma and not red". So it's purely syntactic. ISTR some people being against saying "R is just a channel name, it's not necessarily a red component." Therefore I propose to address the color channels with indices instead: ch 0, ch 1, ch 2, ch 3. Then we define the mapping between pixel and wire formats and the indices:

R = ch 0
G = ch 1
B = ch 2
A = ch 3

Y = ch 0
U = ch 1
V = ch 2

If necessary, the following can also be defined:

Z = ch 1
X = ch 2

L = ch 0
M = ch 1
S = ch 2

The Y from YUV and the Y from XYZ share the designation for the name's sake although they are not the same quantity. If YUV is not a well-defined designation wrt. YCbCr, ICtCp and everything else in the same category, we can assign Cb, Cr, I, Ct, Cp etc. instead. That might be more clear anyway, even if there is a popular convention. We can also choose differently, e.g. to match the H.273 mapping where channels are assigned such that Y=G, Cr=R, Cb=B. H.273 gives mappings between almost all of these, so if using those makes more sense, then let's use those. In the end it shouldn't matter too much, since one does not arbitrarily mix channels from different formats. 
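A hypothetical C spelling of the channel-index mapping proposed above. None of these identifiers are real KMS names; the enums merely restate the mapping table so the aliasing (e.g. R and Y landing on the same index) is explicit:

```c
/* Channels are addressed by index only; each pixel format then defines
 * which component sits on which index. Illustrative names, not uAPI. */
enum color_channel { CH0 = 0, CH1, CH2, CH3 };

/* RGB(A): R = ch 0, G = ch 1, B = ch 2, A = ch 3 */
enum { CH_R = CH0, CH_G = CH1, CH_B = CH2, CH_A = CH3 };

/* YUV: Y = ch 0, U = ch 1, V = ch 2 */
enum { CH_Y = CH0, CH_U = CH1, CH_V = CH2 };
```

The point of the indirection is that a COLOROP defined on "ch 0..2 with ch 3 pass-through" works identically whether the buffer holds RGBA or YUVA data, with no implied conversion between the two.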
Special care needs to be taken when defining COLOROP elements that do not handle all channels interchangeably. (E.g. a curve set element is mostly channel-agnostic when it applies the same curve to channels 0-2, but ch 3 is pass-through.) Then, we define COLOROP elements in terms of channel indices. This removes any implied connection to any specific color coding. Elements that just do not make sense for arbitrary channel ordering, e.g. special-case elements or enumerated matrix elements with a specific purpose, will document the intended usage and the expected channel mapping. The main reason to do all this is to ultimately allow e.g. limited range YCbCr scanout with a fully pass-through pipeline with no implied conversion to or from RGB. This is where some existing KMS properties will conflict: those that affect how current implicit YUV-RGB conversions are done. These properties shall be replaced with COLOROP elements in pipelines, so that they can be controlled explicitly and we can know where they reside wrt. e.g.
Re: [RFC] Plane color pipeline KMS uAPI
On 5/8/23 05:18, Daniel Vetter wrote: > On Mon, 8 May 2023 at 10:58, Simon Ser wrote: >> >> On Friday, May 5th, 2023 at 21:53, Daniel Vetter wrote: >> >>> On Fri, May 05, 2023 at 04:06:26PM +0000, Simon Ser wrote: On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > Ok no comments from me on the actual color operations and semantics of all > that, because I have simply nothing to bring to that except confusion :-) > > Some higher level thoughts instead: > > - I really like that we just go with graph nodes here. I think that was > bound to happen sooner or later with kms (we almost got there with > writeback, and with hindsight maybe should have). I'd really rather not do graphs here. We only need linked lists as Sebastian said. Graphs would significantly add more complexity to this proposal, and I don't think that's a good idea unless there is a strong use-case. >>> >>> You have a graph, because a graph is just nodes + links. I did _not_ >>> propose a full generic graph structure, the link pointer would be in the >>> class/type specific structure only. Like how we have the plane->crtc or >>> connector->crtc links already like that (which already _is_ a full-blown >>> graph). >> >> I really don't get why a pointer in a struct makes plane->crtc a full-blown >> graph. There is only a single parent-child link. A plane has a reference to a >> CRTC, and nothing more. >> >> You could say that anything is a graph. Yes, even an isolated struct >> somewhere >> is a graph: one with a single node and no link. But I don't follow what's the >> point of explaining everything with a graph when we only need a much simpler >> subset of the concept of graphs? >> >> Putting the graph thing aside, what are you suggesting exactly from a >> concrete >> uAPI point-of-view? Introducing a new struct type? Would it be a colorop >> specific struct, or a more generic one? What would be the fields? Why do you >> think that's necessary and better than the current proposal? 
>> >> My understanding so far is that you're suggesting introducing something like >> this at the uAPI level:
>>
>> struct drm_mode_node {
>>     uint32_t id;
>>
>>     uint32_t children_count;
>>     uint32_t *children; // list of child object IDs
>> };
>
> Already too much I think
>
> struct drm_mode_node {
>     struct drm_mode_object base;
>     struct drm_private_obj atomic_base;
>     enum drm_mode_node_enum type;
> };

This would be about as much as we would want for a 'node' struct, for reasons that others already outlined. In short, a good API for a color pipeline needs to do a good job communicating the constraints. Hence the "next" pointer needs to live in a colorop struct, whether it's a drm_private_obj or its own thing. I'm not quite seeing much benefit with a drm_mode_node other than being able to have a GET_NODE IOCTL instead of a GET_COLOROP, the former being able to be re-used for future scenarios that might need a "node." I feel this adds a layer of confusion to the API. Harry > The actual graph links would be in the specific type's state > structure, like they are for everything else. And the limits would be > on the property type, we probably need a new DRM_MODE_PROP_OBJECT_ENUM > to make the new limitations work correctly, since the current > DRM_MODE_PROP_OBJECT only limits to a specific type of object, not an > explicit list of drm_mode_object.id. > > You might not even need a node subclass for the state stuff, that > would directly be a drm_color_op_state that only embeds > drm_private_state. > > Another uapi difference is that the new kms objects would be of type > DRM_MODE_OBJECT_NODE, and would always have a "class" property. > >> I don't think this is a good idea for multiple reasons. First, this is >> overkill: we don't need this complexity, and this complexity will make it >> more >> difficult to reason about the color pipeline. 
This is a premature >> abstraction, >> one we don't need right now, and one I haven't heard a potential future >> use-case for. Sure, one can kill an ant with a sledgehammer if they'd like, >> but >> that's not the right tool for the job. >> >> Second, this will make user-space miserable. User-space already has a tricky >> task to achieve to translate its abstract descriptive color pipeline to our >> proposed simple list of color operations. If we expose a full-blown graph, >> then >> the user-space logic will need to handle arbitrary graphs. This will have a >> significant cost (on implementation and testing), which we will be paying in >> terms of time spent and in terms of bugs. > > The color op pipeline would still be linear. I did not ask for a non-linear > one. > >> Last, this kind of generic "node" struct is at odds with existing KMS object >> types. So far, KMS objects are concrete like CRTC, connector, plane, etc. >> "Node" is abstract. This is inconsistent. > Yeah I think I
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, May 08, 2023 at 09:14:18AM +1000, Dave Airlie wrote: > On Sat, 6 May 2023 at 08:21, Sebastian Wick wrote: > > > > On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: > > > > > > On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > > > > > > > Hi all, > > > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > > color > > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > > framebuffer and before it's blended with other planes. With this new > > > > uAPI we > > > > aim to reduce the battery life impact of color management and HDR on > > > > mobile > > > > devices, to improve performance and to decrease latency by skipping > > > > composition on the 3D engine. This proposal is the result of > > > > discussions at > > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in > > > > the > > > > discussion. > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > > approach. > > > > Drivers describe the available hardware blocks in terms of low-level > > > > mathematical operations, then user-space configures each block. We > > > > decided > > > > against a descriptive approach where user-space would provide a > > > > high-level > > > > description of the colorspace and other parameters: we want to give more > > > > control and flexibility to user-space, e.g. to be able to replicate > > > > exactly the > > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > > seamlessly, and to avoid forcing user-space into a particular color > > > > management > > > > policy. > > > > > > I'm not 100% sold on the prescriptive here, let's see if someone can > > > get me over the line with some questions later. 
> > > > > > My feeling is color pipeline hw is not a done deal, and that hw > > > vendors will be revising/evolving/churning the hw blocks for a while > > > longer, as there is no real standards in the area to aim for, all the > > > vendors are mostly just doing whatever gets Windows over the line and > > > keeps hw engineers happy. So I have some concerns here around forwards > > > compatibility and hence the API design. > > > > > > I guess my main concern is if you expose a bunch of hw blocks and > > > someone comes up with a novel new thing, will all existing userspace > > > work, without falling back to shaders? > > > Do we have minimum guarantees on what hardware blocks have to be > > > exposed to build a useable pipeline? > > > If a hardware block goes away in a new silicon revision, do I have to > > > rewrite my compositor? or will it be expected that the kernel will > > > emulate the old pipelines on top of whatever new fancy thing exists. > > > > I think there are two answers to those questions. > > These aren't selling me much better :-) > > > > The first one is that right now KMS already doesn't guarantee that > > every property is supported on all hardware. The guarantee we have is > > that properties that are supported on a piece of hardware on a > > specific kernel will be supported on the same hardware on later > > kernels. The color pipeline is no different here. For a specific piece > > of hardware a newer kernel might only change the pipelines in a > > backwards compatible way and add new pipelines. > > > > So to answer your question: if some hardware with a novel pipeline > > will show up it might not be supported and that's fine. We already > > have cases where some hardware does not support the gamma lut property > > but only the CSC property and that breaks night light because we never > > bothered to write a shader fallback. 
KMS provides ways to offload work > > but a generic user space always has to provide a fallback and this > > doesn't change. Hardware specific user space on the other hand will > > keep working with the forward compatibility guarantees we want to > > provide. > > In my mind we've screwed up already, isn't a case to be made for > continuing down the same path. > > The kernel is meant to be a hardware abstraction layer, not just a > hardware exposure layer. The kernel shouldn't set policy and there are > cases where it can't act as an abstraction layer (like where you need > a compiler), but I'm not sold that this case is one of those yet. I'm > open to being educated here on why it would be. It would still be an abstraction of the hardware, just that the level of abstraction is a bit "lower" than your intuition currently tells you we should have. IMO it's not too different from the kernel providing low level input events describing what the hardware can do and does, with a rather massive user space library (libinput) turning all of that low level nonsense into actual useful abstractions. In this case it's the other way around, the kernel provides vendor independent knobs that describe what the output hardware can do, and exactly how it
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 09:14:18 +1000 Dave Airlie wrote: > On Sat, 6 May 2023 at 08:21, Sebastian Wick wrote: > > > > On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: > > > > > > On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > > > > > > > Hi all, > > > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > > color > > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > > framebuffer and before it's blended with other planes. With this new > > > > uAPI we > > > > aim to reduce the battery life impact of color management and HDR on > > > > mobile > > > > devices, to improve performance and to decrease latency by skipping > > > > composition on the 3D engine. This proposal is the result of > > > > discussions at > > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in > > > > the > > > > discussion. > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > > approach. > > > > Drivers describe the available hardware blocks in terms of low-level > > > > mathematical operations, then user-space configures each block. We > > > > decided > > > > against a descriptive approach where user-space would provide a > > > > high-level > > > > description of the colorspace and other parameters: we want to give more > > > > control and flexibility to user-space, e.g. to be able to replicate > > > > exactly the > > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > > seamlessly, and to avoid forcing user-space into a particular color > > > > management > > > > policy. > > > > > > I'm not 100% sold on the prescriptive here, let's see if someone can > > > get me over the line with some questions later. Hi Dave, generic userspace must always be able to fall back to GPU shaders or something else, when a window suddenly stops being eligible for a KMS plane. 
That can happen due to a simple window re-stacking operation for example, maybe a notification pops up temporarily. Hence, it is highly desirable to be able to implement the exact same algorithm in shaders as the display hardware does, in order to not cause visible glitches on screen. One way to do that is to have a prescriptive UAPI design. Userspace decides what algorithms to use for color processing, and the UAPI simply offers a way to implement those well-defined mathematical operations. An alternative could be that the UAPI gives userspace back shader programs that implement the same as what the hardware does, but... ugh. Choosing the algorithm is policy. Userspace must be in control of policy, right? Therefore a descriptive UAPI is simply not possible. There is no single correct algorithm for these things, there are many flavors, more and less correct, different quality/performance trade-offs, and even just matters of taste. Sometimes even end user taste, that might need to be configurable. Applications have built-in assumptions too, and they vary. To clarify, a descriptive UAPI is a design where userspace tells the kernel "my source 1 is sRGB, my source 2 is BT.2100/PQ YCbCr 4:2:0 with blahblahblah metadata, do whatever to display those on KMS planes simultaneously". As I mentioned, there is not just one answer to that, and we should also allow for innovation in the algorithms by everyone, not just hardware designers. A prescriptive UAPI is where we communicate mathematical operations without any semantics. It is inherently free of policy in the kernel. > > > > > > My feeling is color pipeline hw is not a done deal, and that hw > > > vendors will be revising/evolving/churning the hw blocks for a while > > > longer, as there is no real standards in the area to aim for, all the > > > vendors are mostly just doing whatever gets Windows over the line and > > > keeps hw engineers happy. 
So I have some concerns here around forwards > > > compatibility and hence the API design. > > > > > > I guess my main concern is if you expose a bunch of hw blocks and > > > someone comes up with a novel new thing, will all existing userspace > > > work, without falling back to shaders? > > > Do we have minimum guarantees on what hardware blocks have to be > > > exposed to build a useable pipeline? > > > If a hardware block goes away in a new silicon revision, do I have to > > > rewrite my compositor? or will it be expected that the kernel will > > > emulate the old pipelines on top of whatever new fancy thing exists. > > > > I think there are two answers to those questions. > > These aren't selling me much better :-) > > > > The first one is that right now KMS already doesn't guarantee that > > every property is supported on all hardware. The guarantee we have is > > that properties that are supported on a piece of hardware on a > > specific kernel will be supported on the same hardware on later > > kernels. The color pipeline is no
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 at 10:58, Simon Ser wrote: > > On Friday, May 5th, 2023 at 21:53, Daniel Vetter wrote: > > > On Fri, May 05, 2023 at 04:06:26PM +0000, Simon Ser wrote: > > > On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > > > > > > > Ok no comments from me on the actual color operations and semantics of > > > > all > > > > that, because I have simply nothing to bring to that except confusion > > > > :-) > > > > > > > > Some higher level thoughts instead: > > > > > > > > - I really like that we just go with graph nodes here. I think that was > > > > bound to happen sooner or later with kms (we almost got there with > > > > writeback, and with hindsight maybe should have). > > > > > > I'd really rather not do graphs here. We only need linked lists as > > > Sebastian > > > said. Graphs would significantly add more complexity to this proposal, and > > > I don't think that's a good idea unless there is a strong use-case. > > > > You have a graph, because a graph is just nodes + links. I did _not_ > > propose a full generic graph structure, the link pointer would be in the > > class/type specific structure only. Like how we have the plane->crtc or > > connector->crtc links already like that (which already _is_ a full-blown > > graph). > > I really don't get why a pointer in a struct makes plane->crtc a full-blown > graph. There is only a single parent-child link. A plane has a reference to a > CRTC, and nothing more. > > You could say that anything is a graph. Yes, even an isolated struct somewhere > is a graph: one with a single node and no link. But I don't follow what's the > point of explaining everything with a graph when we only need a much simpler > subset of the concept of graphs? > > Putting the graph thing aside, what are you suggesting exactly from a concrete > uAPI point-of-view? Introducing a new struct type? Would it be a colorop > specific struct, or a more generic one? What would be the fields? 
Why do you > think that's necessary and better than the current proposal? > > My understanding so far is that you're suggesting introducing something like > this at the uAPI level:
>
> struct drm_mode_node {
>     uint32_t id;
>
>     uint32_t children_count;
>     uint32_t *children; // list of child object IDs
> };

Already too much I think

struct drm_mode_node {
    struct drm_mode_object base;
    struct drm_private_obj atomic_base;
    enum drm_mode_node_enum type;
};

The actual graph links would be in the specific type's state structure, like they are for everything else. And the limits would be on the property type, we probably need a new DRM_MODE_PROP_OBJECT_ENUM to make the new limitations work correctly, since the current DRM_MODE_PROP_OBJECT only limits to a specific type of object, not an explicit list of drm_mode_object.id. You might not even need a node subclass for the state stuff, that would directly be a drm_color_op_state that only embeds drm_private_state. Another uapi difference is that the new kms objects would be of type DRM_MODE_OBJECT_NODE, and would always have a "class" property. > I don't think this is a good idea for multiple reasons. First, this is > overkill: we don't need this complexity, and this complexity will make it more > difficult to reason about the color pipeline. This is a premature abstraction, > one we don't need right now, and one I haven't heard a potential future > use-case for. Sure, one can kill an ant with a sledgehammer if they'd like, > but > that's not the right tool for the job. > > Second, this will make user-space miserable. User-space already has a tricky > task to achieve to translate its abstract descriptive color pipeline to our > proposed simple list of color operations. If we expose a full-blown graph, > then > the user-space logic will need to handle arbitrary graphs. This will have a > significant cost (on implementation and testing), which we will be paying in > terms of time spent and in terms of bugs. 
The color op pipeline would still be linear. I did not ask for a non-linear one. > Last, this kind of generic "node" struct is at odds with existing KMS object > types. So far, KMS objects are concrete like CRTC, connector, plane, etc. > "Node" is abstract. This is inconsistent. Yeah I think we should change that. That's essentially the full extent of my proposal. The classes + possible_foo mask approach just always felt rather brittle to me (and there's plenty of userspace out there to prove that's the case), going more explicit with the links with enumerated combos feels better. Plus it should allow building a bit cleaner interfaces for drivers to construct the correct graphs, because drivers _also_ rather consistently got the entire possible_foo mask business wrong. > Please let me know whether the above is what you have in mind. If not, please > explain what exactly you mean by "graphs" in terms of uAPI, and please explain > why we need it and what real-world use-cases it
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 at 10:24, Pekka Paalanen wrote: > > On Fri, 5 May 2023 21:51:41 +0200 > Daniel Vetter wrote: > > > On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote: > > > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > > > > > > > On Thu, May 04, 2023 at 03:22:59PM +0000, Simon Ser wrote: > > > > > Hi all, > > > > > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > > > color > > > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > > > framebuffer and before it's blended with other planes. With this new > > > > > uAPI we > > > > > aim to reduce the battery life impact of color management and HDR on > > > > > mobile > > > > > devices, to improve performance and to decrease latency by skipping > > > > > composition on the 3D engine. This proposal is the result of > > > > > discussions at > > > > > the Red Hat HDR hackfest [1] which took place a few days ago. > > > > > Engineers > > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in > > > > > the > > > > > discussion. > > > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > > > approach. > > > > > Drivers describe the available hardware blocks in terms of low-level > > > > > mathematical operations, then user-space configures each block. We > > > > > decided > > > > > against a descriptive approach where user-space would provide a > > > > > high-level > > > > > description of the colorspace and other parameters: we want to give > > > > > more > > > > > control and flexibility to user-space, e.g. to be able to replicate > > > > > exactly the > > > > > color pipeline with shaders and switch between shaders and KMS > > > > > pipelines > > > > > seamlessly, and to avoid forcing user-space into a particular color > > > > > management > > > > > policy. > > > > Ack on the prescriptive approach, but generic imo. 
Descriptive pretty > > > > much > > > > means you need the shaders at the same api level for fallback purposes, > > > > and we're not going to have that ever in kms. That would need something > > > > like hwc in userspace to work. > > > > > > Which would be nice to have but that would be forcing a specific color > > > pipeline on everyone and we explicitly want to avoid that. There are > > > just too many trade-offs to consider. > > > > > > > And not generic in its ultimate consequence would mean we just do a blob > > > > for a crtc with all the vendor register stuff like adf (android display > > > > framework) does, because I really don't see a point in trying a > > > > generic-looking-but-not vendor uapi with each color op/stage split out. > > > > > > > > So from very far and pure gut feeling, this seems like a good middle > > > > ground in the uapi design space we have here. > > > > > > Good to hear! > > > > > > > > We've decided against mirroring the existing CRTC properties > > > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color > > > > > management > > > > > pipeline can significantly differ between vendors and this approach > > > > > cannot > > > > > accurately abstract all hardware. In particular, the availability, > > > > > ordering and > > > > > capabilities of hardware blocks is different on each display engine. > > > > > So, we've > > > > > decided to go for a highly detailed hardware capability discovery. > > > > > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > > > properties, > > > > > since there are none which control the pre-blending color pipeline at > > > > > the > > > > > moment. It does conflict with any vendor-specific properties like > > > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding > > > > > AMD-specific > > > > > properties. 
Drivers will need to either reject atomic commits > > > > > configuring both > > > > > uAPIs, or alternatively we could add a DRM client cap which hides the > > > > > vendor > > > > > properties and shows the new generic properties when enabled. > > > > > > > > > > To use this uAPI, first user-space needs to discover hardware > > > > > capabilities via > > > > > KMS objects and properties, then user-space can configure the > > > > > hardware via an > > > > > atomic commit. This works similarly to the existing KMS uAPI, e.g. > > > > > planes. > > > > > > > > > > Our proposal introduces a new "color_pipeline" plane property, and a > > > > > new KMS > > > > > object type, "COLOROP" (short for color operation). The > > > > > "color_pipeline" plane > > > > > property is an enum, each enum entry represents a color pipeline > > > > > supported by > > > > > the hardware. The special zero entry indicates that the pipeline is in > > > > > "bypass"/"no-op" mode. For instance, the following plane properties > > > > > describe a > > > > > primary plane with 2 supported pipelines but currently configured in > > > > > bypass > > > > > mode: > > > > > > > > > >
Re: [RFC] Plane color pipeline KMS uAPI
On Friday, May 5th, 2023 at 21:53, Daniel Vetter wrote: > On Fri, May 05, 2023 at 04:06:26PM +0000, Simon Ser wrote: > > On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > > > > > Ok no comments from me on the actual color operations and semantics of all > > > that, because I have simply nothing to bring to that except confusion :-) > > > > > > Some higher level thoughts instead: > > > > > > - I really like that we just go with graph nodes here. I think that was > > > bound to happen sooner or later with kms (we almost got there with > > > writeback, and with hindsight maybe should have). > > > > I'd really rather not do graphs here. We only need linked lists as Sebastian > > said. Graphs would significantly add more complexity to this proposal, and > > I don't think that's a good idea unless there is a strong use-case. > > You have a graph, because a graph is just nodes + links. I did _not_ > propose a full generic graph structure, the link pointer would be in the > class/type specific structure only. Like how we have the plane->crtc or > connector->crtc links already like that (which already _is_ a full-blown > graph). I really don't get why a pointer in a struct makes plane->crtc a full-blown graph. There is only a single parent-child link. A plane has a reference to a CRTC, and nothing more. You could say that anything is a graph. Yes, even an isolated struct somewhere is a graph: one with a single node and no link. But I don't follow what's the point of explaining everything with a graph when we only need a much simpler subset of the concept of graphs? Putting the graph thing aside, what are you suggesting exactly from a concrete uAPI point-of-view? Introducing a new struct type? Would it be a colorop specific struct, or a more generic one? What would be the fields? Why do you think that's necessary and better than the current proposal? 
My understanding so far is that you're suggesting introducing something like this at the uAPI level:

struct drm_mode_node {
    uint32_t id;

    uint32_t children_count;
    uint32_t *children; // list of child object IDs
};

I don't think this is a good idea for multiple reasons. First, this is overkill: we don't need this complexity, and this complexity will make it more difficult to reason about the color pipeline. This is a premature abstraction, one we don't need right now, and one I haven't heard a potential future use-case for. Sure, one can kill an ant with a sledgehammer if they'd like, but that's not the right tool for the job. Second, this will make user-space miserable. User-space already has a tricky task to achieve to translate its abstract descriptive color pipeline to our proposed simple list of color operations. If we expose a full-blown graph, then the user-space logic will need to handle arbitrary graphs. This will have a significant cost (on implementation and testing), which we will be paying in terms of time spent and in terms of bugs. Last, this kind of generic "node" struct is at odds with existing KMS object types. So far, KMS objects are concrete like CRTC, connector, plane, etc. "Node" is abstract. This is inconsistent. Please let me know whether the above is what you have in mind. If not, please explain what exactly you mean by "graphs" in terms of uAPI, and please explain why we need it and what real-world use-cases it would solve.
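For contrast, the linked-list COLOROP model the thread favors over a generic node graph can be simulated in a few lines of userspace-style C. The structures and the lookup below are stand-ins invented for the example; the real uAPI would follow "next" object IDs via KMS property ioctls rather than a local table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated COLOROP objects: each has an ID, a type, and the ID of
 * the next op in the pipeline (0 terminates the list). */
struct colorop {
    uint32_t id;
    const char *type;
    uint32_t next_id;
};

static const struct colorop ops[] = {
    { 42, "1D curve",   43 },
    { 43, "3x3 matrix", 44 },
    { 44, "1D curve",    0 },
};

static const struct colorop *lookup(uint32_t id)
{
    for (size_t i = 0; i < sizeof(ops) / sizeof(ops[0]); i++)
        if (ops[i].id == id)
            return &ops[i];
    return NULL;
}

/* Walk the pipeline from the head referenced by the plane's
 * "color_pipeline" property and count its operations. */
static int pipeline_length(uint32_t head)
{
    int n = 0;
    for (const struct colorop *op = lookup(head); op; ) {
        n++;
        op = op->next_id ? lookup(op->next_id) : NULL;
    }
    return n;
}
```

Because each op has exactly one "next" link, userspace only ever deals with a straight list — exactly the simplicity Simon is arguing for above.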
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, 5 May 2023 16:04:35 -0500 Steven Kucharzyk wrote: > Hi, > > I'm new to this list and probably can't contribute but interested, I > passed your original posting to a friend and have enclosed his thoughts > ... old hash or food for thought ??? I ask your forgiveness if you find > this inappropriate. (am of the ilk: act first, ask for forgiveness > afterward) ;-) Thanks, but please use reply-to-all, it's a bit painful to add back all the other mailing lists and people. > > Steven > > - start > > Steven Kucharzyk wrote: > > > Thought you might find this of interest. > > Hi, > thanks for sending it to me. > > Unfortunately I don't know enough about the context to say anything > specific about it. > > The best I can do is state the big picture aims I would > look for, as someone with a background in display systems > electronic design, rendering software development > and Color Science. (I apologize in advance if any of this > is preaching to the choir!) > > 1) I would make sure that someone with a strong Color Science > background was consulted in the development of the API. Where can we find someone like that, who would also not start by saying we cannot get anything right, or that we cannot change the old software architecture, and would actually try to understand *our* goals and limitations as well? Who could commit to long discussions over several years in a *friendly* manner? It would take extreme amounts of patience from that person. > 2) I would be measuring the API against its ability to > support a "profiling" color management workflow. This workflow > allows using the full capability of a display, while also allowing > simultaneous display of multiple sources encoded in any colorspace. 
> So the basic architecture is to have a final frame buffer (real > or virtual) in the native displays colorspace, and use any > graphics hardware color transform and rendering capability to > assist with the transformation of data in different source > colorspaces into the displays native colorspace. > > 3) The third thing I would be looking for, is enough > standardization that user mode software can be written > that will get key benefits of what's available in the hardware, > without needing to be customized to lots of different hardware > specifics. For instance, I'd make sure that there was a standard display > frame buffer to display mode that applied per channel curves > that are specified in a standard way. (i.e. make sure that there > is an easy to use replacement for XRRCrtcGamma.) > > Any API that is specific to a type or model of graphics card, > will retard development of color management support to a very large > degree - the financial and development costs of obtaining, configuring > and testing against multiple graphic card makes and models puts this > in the too hard basket for anyone other than a corporation. > > Perhaps little of the above is relevant, if this is a low level API > that is to be used by other operating system sub-systems such > as display graphics API's like X11 or Wayland, which will choose > specific display rendering models and implement them with the hardware > capabilities that are available. That is exactly what it is. It is a way to save power and gain performance when things happen to fit in place just right: what one needs to do matches what the dedicated color processing hardware blocks implement. > From a color management point of view, > it is the operating system & UI graphics API's that are the ones that > are desirable to work with, since they are meant to insulate > applications from hardware details. Indeed. Anything the display controller hardware cannot do will be implemented by other means, e.g. 
on the GPU, by a display server. Thanks, pq
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, 5 May 2023 21:51:41 +0200 Daniel Vetter wrote: > On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote: > > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > > > > > On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > > > > Hi all, > > > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > > color > > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > > framebuffer and before it's blended with other planes. With this new > > > > uAPI we > > > > aim to reduce the battery life impact of color management and HDR on > > > > mobile > > > > devices, to improve performance and to decrease latency by skipping > > > > composition on the 3D engine. This proposal is the result of > > > > discussions at > > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in > > > > the > > > > discussion. > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > > approach. > > > > Drivers describe the available hardware blocks in terms of low-level > > > > mathematical operations, then user-space configures each block. We > > > > decided > > > > against a descriptive approach where user-space would provide a > > > > high-level > > > > description of the colorspace and other parameters: we want to give more > > > > control and flexibility to user-space, e.g. to be able to replicate > > > > exactly the > > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > > seamlessly, and to avoid forcing user-space into a particular color > > > > management > > > > policy. > > > > > > Ack on the prescriptive approach, but generic imo. Descriptive pretty much > > > means you need the shaders at the same api level for fallback purposes, > > > and we're not going to have that ever in kms. That would need something > > > like hwc in userspace to work. 
> > > > Which would be nice to have but that would be forcing a specific color > > pipeline on everyone and we explicitly want to avoid that. There are > > just too many trade-offs to consider. > > > > > And not generic in it's ultimate consquence would mean we just do a blob > > > for a crtc with all the vendor register stuff like adf (android display > > > framework) does, because I really don't see a point in trying a > > > generic-looking-but-not vendor uapi with each color op/stage split out. > > > > > > So from very far and pure gut feeling, this seems like a good middle > > > ground in the uapi design space we have here. > > > > Good to hear! > > > > > > We've decided against mirroring the existing CRTC properties > > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > > > > pipeline can significantly differ between vendors and this approach > > > > cannot > > > > accurately abstract all hardware. In particular, the availability, > > > > ordering and > > > > capabilities of hardware blocks is different on each display engine. > > > > So, we've > > > > decided to go for a highly detailed hardware capability discovery. > > > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > > properties, > > > > since there are none which control the pre-blending color pipeline at > > > > the > > > > moment. It does conflict with any vendor-specific properties like > > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding > > > > AMD-specific > > > > properties. Drivers will need to either reject atomic commits > > > > configuring both > > > > uAPIs, or alternatively we could add a DRM client cap which hides the > > > > vendor > > > > properties and shows the new generic properties when enabled. 
> > > > > > > > To use this uAPI, first user-space needs to discover hardware > > > > capabilities via > > > > KMS objects and properties, then user-space can configure the hardware > > > > via an > > > > atomic commit. This works similarly to the existing KMS uAPI, e.g. > > > > planes. > > > > > > > > Our proposal introduces a new "color_pipeline" plane property, and a > > > > new KMS > > > > object type, "COLOROP" (short for color operation). The > > > > "color_pipeline" plane > > > > property is an enum, each enum entry represents a color pipeline > > > > supported by > > > > the hardware. The special zero entry indicates that the pipeline is in > > > > "bypass"/"no-op" mode. For instance, the following plane properties > > > > describe a > > > > primary plane with 2 supported pipelines but currently configured in > > > > bypass > > > > mode: > > > > > > > > Plane 10 > > > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > > > ├─ … > > > > └─ "color_pipeline": enum {0, 42, 52} = 0 > > > > > > A bit confused, why is this an enum, and not just an immutable prop that > > > points at the first element? You already can
Re: [RFC] Plane color pipeline KMS uAPI
On Sat, 6 May 2023 at 08:21, Sebastian Wick wrote: > > On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: > > > > On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > > > > > Hi all, > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > color > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > framebuffer and before it's blended with other planes. With this new uAPI > > > we > > > aim to reduce the battery life impact of color management and HDR on > > > mobile > > > devices, to improve performance and to decrease latency by skipping > > > composition on the 3D engine. This proposal is the result of discussions > > > at > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > > discussion. > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > approach. > > > Drivers describe the available hardware blocks in terms of low-level > > > mathematical operations, then user-space configures each block. We decided > > > against a descriptive approach where user-space would provide a high-level > > > description of the colorspace and other parameters: we want to give more > > > control and flexibility to user-space, e.g. to be able to replicate > > > exactly the > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > seamlessly, and to avoid forcing user-space into a particular color > > > management > > > policy. > > > > I'm not 100% sold on the prescriptive here, let's see if someone can > > get me over the line with some questions later. 
> > > > My feeling is color pipeline hw is not a done deal, and that hw > > vendors will be revising/evolving/churning the hw blocks for a while > > longer, as there is no real standards in the area to aim for, all the > > vendors are mostly just doing whatever gets Windows over the line and > > keeps hw engineers happy. So I have some concerns here around forwards > > compatibility and hence the API design. > > > > I guess my main concern is if you expose a bunch of hw blocks and > > someone comes up with a novel new thing, will all existing userspace > > work, without falling back to shaders? > > Do we have minimum guarantees on what hardware blocks have to be > > exposed to build a useable pipeline? > > If a hardware block goes away in a new silicon revision, do I have to > > rewrite my compositor? or will it be expected that the kernel will > > emulate the old pipelines on top of whatever new fancy thing exists. > > I think there are two answers to those questions. These aren't selling me much better :-) > > The first one is that right now KMS already doesn't guarantee that > every property is supported on all hardware. The guarantee we have is > that properties that are supported on a piece of hardware on a > specific kernel will be supported on the same hardware on later > kernels. The color pipeline is no different here. For a specific piece > of hardware a newer kernel might only change the pipelines in a > backwards compatible way and add new pipelines. > > So to answer your question: if some hardware with a novel pipeline > will show up it might not be supported and that's fine. We already > have cases where some hardware does not support the gamma lut property > but only the CSC property and that breaks night light because we never > bothered to write a shader fallback. KMS provides ways to offload work > but a generic user space always has to provide a fallback and this > doesn't change. 
Hardware specific user space on the other hand will > keep working with the forward compatibility guarantees we want to > provide. In my mind, "we've screwed up already" isn't a case to be made for continuing down the same path. The kernel is meant to be a hardware abstraction layer, not just a hardware exposure layer. The kernel shouldn't set policy and there are cases where it can't act as an abstraction layer (like where you need a compiler), but I'm not sold that this case is one of those yet. I'm open to being educated here on why it would be. > > The second answer is that we want to provide a user space library > which takes a description of a color pipeline and tries to map that to > the available KMS color pipelines. If there is a novel color > operation, adding support in this library would then make it possible > to offload compatible color pipelines on this new hardware for all > consumers of the library. Obviously there is no guarantee that > whatever color pipeline compositors come up with can actually be > realized on specific hardware but that's just an inherent hardware > issue. > Why does this library need to be in userspace though? If there's a library making device-dependent decisions, why can't we just make those device-dependent decisions in the kernel? This feels like we are trying to go down the Android HWC road, but we aren't
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 5, 2023 at 10:40 PM Dave Airlie wrote: > > On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > > > Hi all, > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > > pipeline before blending, ie. after a pixel is tapped from a plane's > > framebuffer and before it's blended with other planes. With this new uAPI we > > aim to reduce the battery life impact of color management and HDR on mobile > > devices, to improve performance and to decrease latency by skipping > > composition on the 3D engine. This proposal is the result of discussions at > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > discussion. > > > > This proposal takes a prescriptive approach instead of a descriptive > > approach. > > Drivers describe the available hardware blocks in terms of low-level > > mathematical operations, then user-space configures each block. We decided > > against a descriptive approach where user-space would provide a high-level > > description of the colorspace and other parameters: we want to give more > > control and flexibility to user-space, e.g. to be able to replicate exactly > > the > > color pipeline with shaders and switch between shaders and KMS pipelines > > seamlessly, and to avoid forcing user-space into a particular color > > management > > policy. > > I'm not 100% sold on the prescriptive here, let's see if someone can > get me over the line with some questions later. > > My feeling is color pipeline hw is not a done deal, and that hw > vendors will be revising/evolving/churning the hw blocks for a while > longer, as there is no real standards in the area to aim for, all the > vendors are mostly just doing whatever gets Windows over the line and > keeps hw engineers happy. So I have some concerns here around forwards > compatibility and hence the API design. 
> > I guess my main concern is if you expose a bunch of hw blocks and > someone comes up with a novel new thing, will all existing userspace > work, without falling back to shaders? > Do we have minimum guarantees on what hardware blocks have to be > exposed to build a useable pipeline? > If a hardware block goes away in a new silicon revision, do I have to > rewrite my compositor? or will it be expected that the kernel will > emulate the old pipelines on top of whatever new fancy thing exists. I think there are two answers to those questions. The first one is that right now KMS already doesn't guarantee that every property is supported on all hardware. The guarantee we have is that properties that are supported on a piece of hardware on a specific kernel will be supported on the same hardware on later kernels. The color pipeline is no different here. For a specific piece of hardware a newer kernel might only change the pipelines in a backwards compatible way and add new pipelines. So to answer your question: if some hardware with a novel pipeline will show up it might not be supported and that's fine. We already have cases where some hardware does not support the gamma lut property but only the CSC property and that breaks night light because we never bothered to write a shader fallback. KMS provides ways to offload work but a generic user space always has to provide a fallback and this doesn't change. Hardware specific user space on the other hand will keep working with the forward compatibility guarantees we want to provide. The second answer is that we want to provide a user space library which takes a description of a color pipeline and tries to map that to the available KMS color pipelines. If there is a novel color operation, adding support in this library would then make it possible to offload compatible color pipelines on this new hardware for all consumers of the library. 
Obviously there is no guarantee that whatever color pipeline compositors come up with can actually be realized on specific hardware but that's just an inherent hardware issue. > We are not Android, or even Steam OS on a Steamdeck, we have to be > able to independently update the kernel for new hardware and not > require every compositor currently providing HDR to need to support > new hardware blocks and models at the same time. > > Dave. >
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, 5 May 2023 at 01:23, Simon Ser wrote: > > Hi all, > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > pipeline before blending, ie. after a pixel is tapped from a plane's > framebuffer and before it's blended with other planes. With this new uAPI we > aim to reduce the battery life impact of color management and HDR on mobile > devices, to improve performance and to decrease latency by skipping > composition on the 3D engine. This proposal is the result of discussions at > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > familiar with the AMD, Intel and NVIDIA hardware have participated in the > discussion. > > This proposal takes a prescriptive approach instead of a descriptive approach. > Drivers describe the available hardware blocks in terms of low-level > mathematical operations, then user-space configures each block. We decided > against a descriptive approach where user-space would provide a high-level > description of the colorspace and other parameters: we want to give more > control and flexibility to user-space, e.g. to be able to replicate exactly > the > color pipeline with shaders and switch between shaders and KMS pipelines > seamlessly, and to avoid forcing user-space into a particular color management > policy. I'm not 100% sold on the prescriptive here, let's see if someone can get me over the line with some questions later. My feeling is color pipeline hw is not a done deal, and that hw vendors will be revising/evolving/churning the hw blocks for a while longer, as there is no real standards in the area to aim for, all the vendors are mostly just doing whatever gets Windows over the line and keeps hw engineers happy. So I have some concerns here around forwards compatibility and hence the API design. I guess my main concern is if you expose a bunch of hw blocks and someone comes up with a novel new thing, will all existing userspace work, without falling back to shaders? 
Do we have minimum guarantees on what hardware blocks have to be exposed to build a useable pipeline? If a hardware block goes away in a new silicon revision, do I have to rewrite my compositor? or will it be expected that the kernel will emulate the old pipelines on top of whatever new fancy thing exists. We are not Android, or even Steam OS on a Steamdeck, we have to be able to independently update the kernel for new hardware and not require every compositor currently providing HDR to need to support new hardware blocks and models at the same time. Dave.
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 05, 2023 at 04:06:26PM +, Simon Ser wrote: > On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > > > Ok no comments from me on the actual color operations and semantics of all > > that, because I have simply nothing to bring to that except confusion :-) > > > > Some higher level thoughts instead: > > > > - I really like that we just go with graph nodes here. I think that was > > bound to happen sooner or later with kms (we almost got there with > > writeback, and with hindsight maybe should have). > > I'd really rather not do graphs here. We only need linked lists as Sebastian > said. Graphs would significantly add more complexity to this proposal, and > I don't think that's a good idea unless there is a strong use-case. You have a graph, because a graph is just nodes + links. I did _not_ propose a full generic graph structure, the link pointer would be in the class/type-specific structure only. Like how we have the plane->crtc or connector->crtc links already like that (which already _is_ a full-blown graph). Maybe explain what exactly you're thinking under "do graphs here" so I understand what you mean differently than me? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote: > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > > > On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > > > Hi all, > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > color > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > framebuffer and before it's blended with other planes. With this new uAPI > > > we > > > aim to reduce the battery life impact of color management and HDR on > > > mobile > > > devices, to improve performance and to decrease latency by skipping > > > composition on the 3D engine. This proposal is the result of discussions > > > at > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > > discussion. > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > approach. > > > Drivers describe the available hardware blocks in terms of low-level > > > mathematical operations, then user-space configures each block. We decided > > > against a descriptive approach where user-space would provide a high-level > > > description of the colorspace and other parameters: we want to give more > > > control and flexibility to user-space, e.g. to be able to replicate > > > exactly the > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > seamlessly, and to avoid forcing user-space into a particular color > > > management > > > policy. > > > > Ack on the prescriptive approach, but generic imo. Descriptive pretty much > > means you need the shaders at the same api level for fallback purposes, > > and we're not going to have that ever in kms. That would need something > > like hwc in userspace to work. > > Which would be nice to have but that would be forcing a specific color > pipeline on everyone and we explicitly want to avoid that. 
There are > just too many trade-offs to consider. > > > And not generic in it's ultimate consquence would mean we just do a blob > > for a crtc with all the vendor register stuff like adf (android display > > framework) does, because I really don't see a point in trying a > > generic-looking-but-not vendor uapi with each color op/stage split out. > > > > So from very far and pure gut feeling, this seems like a good middle > > ground in the uapi design space we have here. > > Good to hear! > > > > We've decided against mirroring the existing CRTC properties > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > > > pipeline can significantly differ between vendors and this approach cannot > > > accurately abstract all hardware. In particular, the availability, > > > ordering and > > > capabilities of hardware blocks is different on each display engine. So, > > > we've > > > decided to go for a highly detailed hardware capability discovery. > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > properties, > > > since there are none which control the pre-blending color pipeline at the > > > moment. It does conflict with any vendor-specific properties like > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > > > properties. Drivers will need to either reject atomic commits configuring > > > both > > > uAPIs, or alternatively we could add a DRM client cap which hides the > > > vendor > > > properties and shows the new generic properties when enabled. > > > > > > To use this uAPI, first user-space needs to discover hardware > > > capabilities via > > > KMS objects and properties, then user-space can configure the hardware > > > via an > > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > > > > > Our proposal introduces a new "color_pipeline" plane property, and a new > > > KMS > > > object type, "COLOROP" (short for color operation). 
The "color_pipeline" > > > plane > > > property is an enum, each enum entry represents a color pipeline > > > supported by > > > the hardware. The special zero entry indicates that the pipeline is in > > > "bypass"/"no-op" mode. For instance, the following plane properties > > > describe a > > > primary plane with 2 supported pipelines but currently configured in > > > bypass > > > mode: > > > > > > Plane 10 > > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > > ├─ … > > > └─ "color_pipeline": enum {0, 42, 52} = 0 > > > > A bit confused, why is this an enum, and not just an immutable prop that > > points at the first element? You already can disable elements with the > > bypass thing, also bypassing by changing the pointers to the next node in > > the graph seems a bit confusing and redundant. > > We want to allow multiple pipelines to exist and a plane can choose > the pipeline by selecting the first element of the pipeline. The enum > here lists all the possible
Re: [RFC] Plane color pipeline KMS uAPI
On 5/5/23 15:16, Pekka Paalanen wrote: On Fri, 5 May 2023 14:30:11 +0100 Joshua Ashton wrote: Some corrections and replies inline. On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: On Thu, 04 May 2023 15:22:59 + Simon Ser wrote: ... To wrap things up, let's take a real-world example: how would gamescope [2] configure the AMD DCN 3.0 hardware for its color pipeline? The gamescope color pipeline is described in [3]. The AMD DCN 3.0 hardware is described in [4]. AMD would expose the following objects and properties: Plane 10 ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary └─ "color_pipeline": enum {0, 42} = 0 Color operation 42 (input CSC) ├─ "type": enum {Bypass, Matrix} = Matrix ├─ "matrix_data": blob └─ "next": immutable color operation ID = 43 Color operation 43 ├─ "type": enum {Scaling} = Scaling └─ "next": immutable color operation ID = 44 Color operation 44 (DeGamma) ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {sRGB, PQ, …} = sRGB └─ "next": immutable color operation ID = 45 Some vendors have per-tap degamma and some have a degamma after the sample. How do we distinguish that behaviour? It is important to know. ... Btw. ISTR that if you want to do scaling properly with alpha channel, you need optical values multiplied by alpha. Alpha vs. scaling is just yet another thing to look into, and TF operations do not work with pre-mult. What are your concerns here? I believe this is exactly the same question as yours about sampling, at least for up-scaling where sampling the framebuffer interpolates in some way. Oh, interpolation mode would fit in the scaling COLOROP... Having pre-multiplied alpha is fine with a TF: the alpha was premultiplied in linear, then encoded with the TF by the client. There are two different ways to pre-multiply: into optical values (okay), and into electrical values (what everyone actually does, and what Wayland assumes by default). What you described is the thing mostly no-one does in GUI graphics. 
Even in the web. Yeah, I have seen this problem many times before in different fields. There are not many transparent clients that I know of (most of them are Gamescope Overlays), but the ones I do know of do actually do the premultiply in linear space (mainly because they use sRGB image views for their color attachments so it gets handled for them). From my perspective and experience, we definitely shouldn't do anything to try and 'fix' apps doing their premultiply in the wrong space. I've had to deal with this before in game development on a transparent HUD, and my solution and thinking for that was: It was authored (or "mastered") with this behaviour in mind. So that's what we should do. It felt bad to 'break' the blending on the HUD of that game, but it looked better, and it was what was intended before it was 'fixed' in a later engine version. It is still definitely interesting to think about, but I don't think presents a problem at all. In fact, doing anything would just 'break' the expected behaviour of apps. If you think of a TF as something something relative to a bunch of reference state or whatever then you might think "oh you can't do that!", but you really can. It's really best to just think of it as a mathematical encoding of a value in all instances that we touch. True, except when it's false. If you assume that decoding is the exact mathematical inverse of encoding, then your conclusion follows. Unfortunately many video standards do not have it so. BT.601, BT.709, and I forget if BT.2020 (SDR) as well encode with one function and decode with something that is not the inverse, and it is totally intentional and necessary mangling of the values to get the expected result on screen. Someone has called this "implicit color management". So one needs to be very careful here what the actual characteristics are. 
The only issue is that you lose precision from having pre-multiplied alpha as it's quantized to fit into the DRM format rather than using the full range then getting divided by the alpha at blend time. It doesn't end up being a visible issue ever however in my experience, at 8bpc. That's true. Wait, why would you divide by alpha for blending? Blending/interpolation is the only operation where pre-mult is useful. I mis-spoke, I meant multiply. - Joshie ✨ Thanks, pq Thanks - Joshie ✨ Thanks, pq I hope comparing these properties to the diagrams linked above can help understand how the uAPI would be used and give an idea of its viability. Please feel free to provide feedback! It would be especially useful to have someone familiar with Arm SoCs look at this, to confirm that this proposal would work there. Unless there is a show-stopper, we plan to follow up this RFC with implementations for AMD, Intel, NVIDIA, gamescope, and IGT. Many thanks to everybody who contributed to
Re: [RFC] Plane color pipeline KMS uAPI
On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > Ok no comments from me on the actual color operations and semantics of all > that, because I have simply nothing to bring to that except confusion :-) > > Some higher level thoughts instead: > > - I really like that we just go with graph nodes here. I think that was > bound to happen sooner or later with kms (we almost got there with > writeback, and with hindsight maybe should have). I'd really rather not do graphs here. We only need linked lists as Sebastian said. Graphs would significantly add more complexity to this proposal, and I don't think that's a good idea unless there is a strong use-case.
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > > Hi all, > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > > pipeline before blending, ie. after a pixel is tapped from a plane's > > framebuffer and before it's blended with other planes. With this new uAPI we > > aim to reduce the battery life impact of color management and HDR on mobile > > devices, to improve performance and to decrease latency by skipping > > composition on the 3D engine. This proposal is the result of discussions at > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > discussion. > > > > This proposal takes a prescriptive approach instead of a descriptive > > approach. > > Drivers describe the available hardware blocks in terms of low-level > > mathematical operations, then user-space configures each block. We decided > > against a descriptive approach where user-space would provide a high-level > > description of the colorspace and other parameters: we want to give more > > control and flexibility to user-space, e.g. to be able to replicate exactly > > the > > color pipeline with shaders and switch between shaders and KMS pipelines > > seamlessly, and to avoid forcing user-space into a particular color > > management > > policy. > > Ack on the prescriptive approach, but generic imo. Descriptive pretty much > means you need the shaders at the same api level for fallback purposes, > and we're not going to have that ever in kms. That would need something > like hwc in userspace to work. Which would be nice to have but that would be forcing a specific color pipeline on everyone and we explicitly want to avoid that. There are just too many trade-offs to consider. 
> And not generic in its ultimate consequence would mean we just do a blob > for a crtc with all the vendor register stuff like adf (android display > framework) does, because I really don't see a point in trying a > generic-looking-but-not vendor uapi with each color op/stage split out. > > So from very far and pure gut feeling, this seems like a good middle > ground in the uapi design space we have here. Good to hear! > > We've decided against mirroring the existing CRTC properties > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > > pipeline can significantly differ between vendors and this approach cannot > > accurately abstract all hardware. In particular, the availability, ordering > > and > > capabilities of hardware blocks is different on each display engine. So, > > we've > > decided to go for a highly detailed hardware capability discovery. > > > > This new uAPI should not be in conflict with existing standard KMS > > properties, > > since there are none which control the pre-blending color pipeline at the > > moment. It does conflict with any vendor-specific properties like > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > > properties. Drivers will need to either reject atomic commits configuring > > both > > uAPIs, or alternatively we could add a DRM client cap which hides the vendor > > properties and shows the new generic properties when enabled. > > > > To use this uAPI, first user-space needs to discover hardware capabilities > > via > > KMS objects and properties, then user-space can configure the hardware via > > an > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > > > Our proposal introduces a new "color_pipeline" plane property, and a new KMS > > object type, "COLOROP" (short for color operation). The "color_pipeline" > > plane > > property is an enum, each enum entry represents a color pipeline supported > > by > > the hardware.
The special zero entry indicates that the pipeline is in > > "bypass"/"no-op" mode. For instance, the following plane properties > > describe a > > primary plane with 2 supported pipelines but currently configured in bypass > > mode: > > > > Plane 10 > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > ├─ … > > └─ "color_pipeline": enum {0, 42, 52} = 0 > > A bit confused, why is this an enum, and not just an immutable prop that > points at the first element? You already can disable elements with the > bypass thing, also bypassing by changing the pointers to the next node in > the graph seems a bit confusing and redundant. We want to allow multiple pipelines to exist and a plane can choose the pipeline by selecting the first element of the pipeline. The enum here lists all the possible pipelines that can be attached to the surface. > > The non-zero entries describe color pipelines as a linked list of COLOROP > > KMS > > objects. The entry value is an object ID pointing to the head of the linked > > list (the first operation in the color pipeline). > > > > The new COLOROP objects also
Re: [RFC] Plane color pipeline KMS uAPI
On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > Hi all, > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > pipeline before blending, ie. after a pixel is tapped from a plane's > framebuffer and before it's blended with other planes. With this new uAPI we > aim to reduce the battery life impact of color management and HDR on mobile > devices, to improve performance and to decrease latency by skipping > composition on the 3D engine. This proposal is the result of discussions at > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > familiar with the AMD, Intel and NVIDIA hardware have participated in the > discussion. > > This proposal takes a prescriptive approach instead of a descriptive approach. > Drivers describe the available hardware blocks in terms of low-level > mathematical operations, then user-space configures each block. We decided > against a descriptive approach where user-space would provide a high-level > description of the colorspace and other parameters: we want to give more > control and flexibility to user-space, e.g. to be able to replicate exactly > the > color pipeline with shaders and switch between shaders and KMS pipelines > seamlessly, and to avoid forcing user-space into a particular color management > policy. Ack on the prescriptive approach, but generic imo. Descriptive pretty much means you need the shaders at the same api level for fallback purposes, and we're not going to have that ever in kms. That would need something like hwc in userspace to work. And not generic in its ultimate consequence would mean we just do a blob for a crtc with all the vendor register stuff like adf (android display framework) does, because I really don't see a point in trying a generic-looking-but-not vendor uapi with each color op/stage split out. So from very far and pure gut feeling, this seems like a good middle ground in the uapi design space we have here. 
> We've decided against mirroring the existing CRTC properties > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > pipeline can significantly differ between vendors and this approach cannot > accurately abstract all hardware. In particular, the availability, ordering > and > capabilities of hardware blocks is different on each display engine. So, we've > decided to go for a highly detailed hardware capability discovery. > > This new uAPI should not be in conflict with existing standard KMS properties, > since there are none which control the pre-blending color pipeline at the > moment. It does conflict with any vendor-specific properties like > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > properties. Drivers will need to either reject atomic commits configuring both > uAPIs, or alternatively we could add a DRM client cap which hides the vendor > properties and shows the new generic properties when enabled. > > To use this uAPI, first user-space needs to discover hardware capabilities via > KMS objects and properties, then user-space can configure the hardware via an > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > Our proposal introduces a new "color_pipeline" plane property, and a new KMS > object type, "COLOROP" (short for color operation). The "color_pipeline" plane > property is an enum, each enum entry represents a color pipeline supported by > the hardware. The special zero entry indicates that the pipeline is in > "bypass"/"no-op" mode. For instance, the following plane properties describe a > primary plane with 2 supported pipelines but currently configured in bypass > mode: > > Plane 10 > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > ├─ … > └─ "color_pipeline": enum {0, 42, 52} = 0 A bit confused, why is this an enum, and not just an immutable prop that points at the first element? 
You already can disable elements with the bypass thing, also bypassing by changing the pointers to the next node in the graph seems a bit confusing and redundant. > The non-zero entries describe color pipelines as a linked list of COLOROP KMS > objects. The entry value is an object ID pointing to the head of the linked > list (the first operation in the color pipeline). > > The new COLOROP objects also expose a number of KMS properties. Each has a > type, a reference to the next COLOROP object in the linked list, and other > type-specific properties. Here is an example for a 1D LUT operation: Ok no comments from me on the actual color operations and semantics of all that, because I have simply nothing to bring to that except confusion :-) Some higher level thoughts instead: - I really like that we just go with graph nodes here. I think that was bound to happen sooner or later with kms (we almost got there with writeback, and with hindsight maybe should have). - Since there's other use-cases for graph nodes (maybe scaler modes, or histogram
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, 5 May 2023 14:30:11 +0100 Joshua Ashton wrote: > Some corrections and replies inline. > > On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: > > > > On Thu, 04 May 2023 15:22:59 + > > Simon Ser wrote: ... > > > To wrap things up, let's take a real-world example: how would gamescope > > > [2] > > > configure the AMD DCN 3.0 hardware for its color pipeline? The gamescope > > > color > > > pipeline is described in [3]. The AMD DCN 3.0 hardware is described in > > > [4]. > > > > > > AMD would expose the following objects and properties: > > > > > > Plane 10 > > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > > └─ "color_pipeline": enum {0, 42} = 0 > > > Color operation 42 (input CSC) > > > ├─ "type": enum {Bypass, Matrix} = Matrix > > > ├─ "matrix_data": blob > > > └─ "next": immutable color operation ID = 43 > > > Color operation 43 > > > ├─ "type": enum {Scaling} = Scaling > > > └─ "next": immutable color operation ID = 44 > > > Color operation 44 (DeGamma) > > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > > ├─ "1d_curve_type": enum {sRGB, PQ, …} = sRGB > > > └─ "next": immutable color operation ID = 45 > > Some vendors have per-tap degamma and some have a degamma after the sample. > How do we distinguish that behaviour? > It is important to know. ... > > Btw. ISTR that if you want to do scaling properly with alpha channel, > > you need optical values multiplied by alpha. Alpha vs. scaling is just > > yet another thing to look into, and TF operations do not work with > > pre-mult. > > What are your concerns here? I believe this is exactly the same question as yours about sampling, at least for up-scaling where sampling the framebuffer interpolates in some way. Oh, interpolation mode would fit in the scaling COLOROP... > Having pre-multiplied alpha is fine with a TF: the alpha was > premultiplied in linear, then encoded with the TF by the client. 
There are two different ways to pre-multiply: into optical values (okay), and into electrical values (what everyone actually does, and what Wayland assumes by default). What you described is the thing mostly no-one does in GUI graphics. Even in the web. > If you think of a TF as something something relative to a bunch of > reference state or whatever then you might think "oh you can't do > that!", but you really can. > It's really best to just think of it as a mathematical encoding of a > value in all instances that we touch. True, except when it's false. If you assume that decoding is the exact mathematical inverse of encoding, then your conclusion follows. Unfortunately many video standards do not have it so. BT.601, BT.709, and I forget if BT.2020 (SDR) as well encode with one function and decode with something that is not the inverse, and it is totally intentional and necessary mangling of the values to get the expected result on screen. Someone has called this "implicit color management". So one needs to be very careful here what the actual characteristics are. > The only issue is that you lose precision from having pre-multiplied > alpha as it's quantized to fit into the DRM format rather than using > the full range then getting divided by the alpha at blend time. > It doesn't end up being a visible issue ever however in my experience, at > 8bpc. That's true. Wait, why would you divide by alpha for blending? Blending/interpolation is the only operation where pre-mult is useful. Thanks, pq > > Thanks > - Joshie ✨ > > > > > > > Thanks, > > pq > > > > > > > > I hope comparing these properties to the diagrams linked above can help > > > understand how the uAPI would be used and give an idea of its viability. > > > > > > Please feel free to provide feedback! It would be especially useful to > > > have > > > someone familiar with Arm SoCs look at this, to confirm that this proposal > > > would work there. 
> > > > > > Unless there is a show-stopper, we plan to follow up this RFC with > > > implementations for AMD, Intel, NVIDIA, gamescope, and IGT. > > > > > > Many thanks to everybody who contributed to the hackfest, on-site or > > > remotely! > > > Let's work together to make this happen! > > > > > > Simon, on behalf of the hackfest participants > > > > > > [1]: https://wiki.gnome.org/Hackfests/ShellDisplayNext2023 > > > [2]: https://github.com/ValveSoftware/gamescope > > > [3]: > > > https://github.com/ValveSoftware/gamescope/blob/5af321724c8b8a29cef5ae9e31293fd5d560c4ec/src/docs/Steam%20Deck%20Display%20Pipeline.png > > > [4]: https://kernel.org/doc/html/latest/_images/dcn3_cm_drm_current.svg
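The difference between the two kinds of pre-multiplication discussed above can be shown numerically with the standard sRGB encoding function. The alpha and color values here are arbitrary examples:

```python
def srgb_encode(v):
    # sRGB encoding (inverse EOTF), standard piecewise definition
    return 12.92 * v if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055

alpha, linear = 0.5, 0.8

# Premultiply in linear light, then encode: pre-mult into optical values.
optical = srgb_encode(alpha * linear)

# Encode first, then premultiply the encoded value: pre-mult into
# electrical values, which is what most GUI stacks actually do.
electrical = alpha * srgb_encode(linear)

print(round(optical, 3), round(electrical, 3))
# 0.665 0.453
```

The two orderings disagree substantially, which is why "pre-multiplied alpha" is ambiguous until you say which domain the multiplication happened in, and why TF color operations cannot simply be applied to electrically pre-multiplied pixels.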
Re: [RFC] Plane color pipeline KMS uAPI
Some corrections and replies inline. On Fri, 5 May 2023 at 12:42, Pekka Paalanen wrote: > > On Thu, 04 May 2023 15:22:59 + > Simon Ser wrote: > > > Hi all, > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > > pipeline before blending, ie. after a pixel is tapped from a plane's > > framebuffer and before it's blended with other planes. With this new uAPI we > > aim to reduce the battery life impact of color management and HDR on mobile > > devices, to improve performance and to decrease latency by skipping > > composition on the 3D engine. This proposal is the result of discussions at > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > discussion. > > Hi Simon, > > this is an excellent write-up, thank you! > > Harry's question about what constitutes UAPI is a good one for danvet. > > I don't really have much to add here, a couple inline comments. I think > this could work. > > > > > This proposal takes a prescriptive approach instead of a descriptive > > approach. > > Drivers describe the available hardware blocks in terms of low-level > > mathematical operations, then user-space configures each block. We decided > > against a descriptive approach where user-space would provide a high-level > > description of the colorspace and other parameters: we want to give more > > control and flexibility to user-space, e.g. to be able to replicate exactly > > the > > color pipeline with shaders and switch between shaders and KMS pipelines > > seamlessly, and to avoid forcing user-space into a particular color > > management > > policy. > > > > We've decided against mirroring the existing CRTC properties > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > > pipeline can significantly differ between vendors and this approach cannot > > accurately abstract all hardware. 
In particular, the availability, ordering > > and > > capabilities of hardware blocks is different on each display engine. So, > > we've > > decided to go for a highly detailed hardware capability discovery. > > > > This new uAPI should not be in conflict with existing standard KMS > > properties, > > since there are none which control the pre-blending color pipeline at the > > moment. It does conflict with any vendor-specific properties like > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > > properties. Drivers will need to either reject atomic commits configuring > > both > > uAPIs, or alternatively we could add a DRM client cap which hides the vendor > > properties and shows the new generic properties when enabled. > > > > To use this uAPI, first user-space needs to discover hardware capabilities > > via > > KMS objects and properties, then user-space can configure the hardware via > > an > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > > > Our proposal introduces a new "color_pipeline" plane property, and a new KMS > > object type, "COLOROP" (short for color operation). The "color_pipeline" > > plane > > property is an enum, each enum entry represents a color pipeline supported > > by > > the hardware. The special zero entry indicates that the pipeline is in > > "bypass"/"no-op" mode. For instance, the following plane properties > > describe a > > primary plane with 2 supported pipelines but currently configured in bypass > > mode: > > > > Plane 10 > > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > > ├─ … > > └─ "color_pipeline": enum {0, 42, 52} = 0 > > > > The non-zero entries describe color pipelines as a linked list of COLOROP > > KMS > > objects. The entry value is an object ID pointing to the head of the linked > > list (the first operation in the color pipeline). > > > > The new COLOROP objects also expose a number of KMS properties. 
Each has a > > type, a reference to the next COLOROP object in the linked list, and other > > type-specific properties. Here is an example for a 1D LUT operation: > > > > Color operation 42 > > ├─ "type": enum {Bypass, 1D curve} = 1D curve > > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > > ├─ "lut_size": immutable range = 4096 > > ├─ "lut_data": blob > > └─ "next": immutable color operation ID = 43 > > > > To configure this hardware block, user-space can fill a KMS blob with 4096 > > u32 > > entries, then set "lut_data" to the blob ID. Other color operation types > > might > > have different properties. > > > > Here is another example with a 3D LUT: > > > > Color operation 42 > > ├─ "type": enum {Bypass, 3D LUT} = 3D LUT > > ├─ "lut_size": immutable range = 33 > > ├─ "lut_data": blob > > └─ "next": immutable color operation ID = 43 > > > > And one last example with a matrix: > > > > Color operation 42 > > ├─ "type": enum {Bypass, Matrix} = Matrix > > ├─
Re: [RFC] Plane color pipeline KMS uAPI
On Thu, 04 May 2023 15:22:59 + Simon Ser wrote: > Hi all, > > The goal of this RFC is to expose a generic KMS uAPI to configure the color > pipeline before blending, ie. after a pixel is tapped from a plane's > framebuffer and before it's blended with other planes. With this new uAPI we > aim to reduce the battery life impact of color management and HDR on mobile > devices, to improve performance and to decrease latency by skipping > composition on the 3D engine. This proposal is the result of discussions at > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > familiar with the AMD, Intel and NVIDIA hardware have participated in the > discussion. Hi Simon, this is an excellent write-up, thank you! Harry's question about what constitutes UAPI is a good one for danvet. I don't really have much to add here, a couple inline comments. I think this could work. > > This proposal takes a prescriptive approach instead of a descriptive approach. > Drivers describe the available hardware blocks in terms of low-level > mathematical operations, then user-space configures each block. We decided > against a descriptive approach where user-space would provide a high-level > description of the colorspace and other parameters: we want to give more > control and flexibility to user-space, e.g. to be able to replicate exactly > the > color pipeline with shaders and switch between shaders and KMS pipelines > seamlessly, and to avoid forcing user-space into a particular color management > policy. > > We've decided against mirroring the existing CRTC properties > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > pipeline can significantly differ between vendors and this approach cannot > accurately abstract all hardware. In particular, the availability, ordering > and > capabilities of hardware blocks is different on each display engine. So, we've > decided to go for a highly detailed hardware capability discovery. 
> > This new uAPI should not be in conflict with existing standard KMS properties, > since there are none which control the pre-blending color pipeline at the > moment. It does conflict with any vendor-specific properties like > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > properties. Drivers will need to either reject atomic commits configuring both > uAPIs, or alternatively we could add a DRM client cap which hides the vendor > properties and shows the new generic properties when enabled. > > To use this uAPI, first user-space needs to discover hardware capabilities via > KMS objects and properties, then user-space can configure the hardware via an > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > Our proposal introduces a new "color_pipeline" plane property, and a new KMS > object type, "COLOROP" (short for color operation). The "color_pipeline" plane > property is an enum, each enum entry represents a color pipeline supported by > the hardware. The special zero entry indicates that the pipeline is in > "bypass"/"no-op" mode. For instance, the following plane properties describe a > primary plane with 2 supported pipelines but currently configured in bypass > mode: > > Plane 10 > ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary > ├─ … > └─ "color_pipeline": enum {0, 42, 52} = 0 > > The non-zero entries describe color pipelines as a linked list of COLOROP KMS > objects. The entry value is an object ID pointing to the head of the linked > list (the first operation in the color pipeline). > > The new COLOROP objects also expose a number of KMS properties. Each has a > type, a reference to the next COLOROP object in the linked list, and other > type-specific properties. 
Here is an example for a 1D LUT operation: > > Color operation 42 > ├─ "type": enum {Bypass, 1D curve} = 1D curve > ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT > ├─ "lut_size": immutable range = 4096 > ├─ "lut_data": blob > └─ "next": immutable color operation ID = 43 > > To configure this hardware block, user-space can fill a KMS blob with 4096 u32 > entries, then set "lut_data" to the blob ID. Other color operation types might > have different properties. > > Here is another example with a 3D LUT: > > Color operation 42 > ├─ "type": enum {Bypass, 3D LUT} = 3D LUT > ├─ "lut_size": immutable range = 33 > ├─ "lut_data": blob > └─ "next": immutable color operation ID = 43 > > And one last example with a matrix: > > Color operation 42 > ├─ "type": enum {Bypass, Matrix} = Matrix > ├─ "matrix_data": blob > └─ "next": immutable color operation ID = 43 > > [Simon note: having "Bypass" in the "type" enum, and making "type" mutable is > a bit weird. Maybe we can just add an "active"/"bypass" boolean property on > blocks which can be bypassed instead.] > > [Jonas note: perhaps a single "data" property for both LUTs
Re: [RFC] Plane color pipeline KMS uAPI
On 5/4/23 11:22, Simon Ser wrote: Hi all, The goal of this RFC is to expose a generic KMS uAPI to configure the color pipeline before blending, ie. after a pixel is tapped from a plane's framebuffer and before it's blended with other planes. With this new uAPI we aim to reduce the battery life impact of color management and HDR on mobile devices, to improve performance and to decrease latency by skipping composition on the 3D engine. This proposal is the result of discussions at the Red Hat HDR hackfest [1] which took place a few days ago. Engineers familiar with the AMD, Intel and NVIDIA hardware have participated in the discussion. Thanks for typing this up. It does a great job describing the vision. This proposal takes a prescriptive approach instead of a descriptive approach. Drivers describe the available hardware blocks in terms of low-level mathematical operations, then user-space configures each block. We decided against a descriptive approach where user-space would provide a high-level description of the colorspace and other parameters: we want to give more control and flexibility to user-space, e.g. to be able to replicate exactly the color pipeline with shaders and switch between shaders and KMS pipelines seamlessly, and to avoid forcing user-space into a particular color management policy. We've decided against mirroring the existing CRTC properties DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management pipeline can significantly differ between vendors and this approach cannot accurately abstract all hardware. In particular, the availability, ordering and capabilities of hardware blocks is different on each display engine. So, we've decided to go for a highly detailed hardware capability discovery. This new uAPI should not be in conflict with existing standard KMS properties, since there are none which control the pre-blending color pipeline at the moment. 
It does conflict with any vendor-specific properties like NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific properties. Drivers will need to either reject atomic commits configuring both uAPIs, or alternatively we could add a DRM client cap which hides the vendor properties and shows the new generic properties when enabled. To use this uAPI, first user-space needs to discover hardware capabilities via KMS objects and properties, then user-space can configure the hardware via an atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. Our proposal introduces a new "color_pipeline" plane property, and a new KMS object type, "COLOROP" (short for color operation). The "color_pipeline" plane property is an enum, each enum entry represents a color pipeline supported by the hardware. The special zero entry indicates that the pipeline is in "bypass"/"no-op" mode. For instance, the following plane properties describe a primary plane with 2 supported pipelines but currently configured in bypass mode: Plane 10 ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary ├─ … └─ "color_pipeline": enum {0, 42, 52} = 0 The non-zero entries describe color pipelines as a linked list of COLOROP KMS objects. The entry value is an object ID pointing to the head of the linked list (the first operation in the color pipeline). The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. 
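For scale, the 3D LUT variant quoted earlier in the thread (lut_size = 33) needs far more entries than the 1D case. A small sketch of the sizing and indexing arithmetic, assuming an r-major layout with one value per channel (the actual memory layout is not specified by the RFC and would have to be documented by the uAPI):

```python
LUT_SIZE = 33   # from the "lut_size" property in the 3D LUT example
CHANNELS = 3    # assumed: one value per R/G/B channel at each grid point

def lut3d_index(r, g, b, size=LUT_SIZE):
    # Flattened index of grid point (r, g, b), r varying slowest.
    # This ordering is an assumption for illustration only.
    return (r * size + g) * size + b

num_points = LUT_SIZE ** 3
num_values = num_points * CHANNELS
print(num_points, num_values)
# 35937 107811
```

So a 33-point 3D LUT already carries ~36k grid points, which is why hardware typically pairs a coarse 3D LUT with 1D shaper curves rather than growing the grid.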
Other color operation types might have different properties. Here is another example with a 3D LUT: Color operation 42 ├─ "type": enum {Bypass, 3D LUT} = 3D LUT ├─ "lut_size": immutable range = 33 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 And one last example with a matrix: Color operation 42 ├─ "type": enum {Bypass, Matrix} = Matrix ├─ "matrix_data": blob └─ "next": immutable color operation ID = 43 [Simon note: having "Bypass" in the "type" enum, and making "type" mutable is a bit weird. Maybe we can just add an "active"/"bypass" boolean property on blocks which can be bypassed instead.] I would favor a "bypass" boolean property. [Jonas note: perhaps a single "data" property for both LUTs and matrices would make more sense. And a "size" prop for both 1D and 3D LUTs.] I concur. We'll probably want to document for which types a property applies. If some hardware supports re-ordering operations in the color pipeline, the driver can expose multiple pipelines with
[RFC] Plane color pipeline KMS uAPI
Hi all, The goal of this RFC is to expose a generic KMS uAPI to configure the color pipeline before blending, ie. after a pixel is tapped from a plane's framebuffer and before it's blended with other planes. With this new uAPI we aim to reduce the battery life impact of color management and HDR on mobile devices, to improve performance and to decrease latency by skipping composition on the 3D engine. This proposal is the result of discussions at the Red Hat HDR hackfest [1] which took place a few days ago. Engineers familiar with the AMD, Intel and NVIDIA hardware have participated in the discussion. This proposal takes a prescriptive approach instead of a descriptive approach. Drivers describe the available hardware blocks in terms of low-level mathematical operations, then user-space configures each block. We decided against a descriptive approach where user-space would provide a high-level description of the colorspace and other parameters: we want to give more control and flexibility to user-space, e.g. to be able to replicate exactly the color pipeline with shaders and switch between shaders and KMS pipelines seamlessly, and to avoid forcing user-space into a particular color management policy. We've decided against mirroring the existing CRTC properties DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management pipeline can significantly differ between vendors and this approach cannot accurately abstract all hardware. In particular, the availability, ordering and capabilities of hardware blocks is different on each display engine. So, we've decided to go for a highly detailed hardware capability discovery. This new uAPI should not be in conflict with existing standard KMS properties, since there are none which control the pre-blending color pipeline at the moment. It does conflict with any vendor-specific properties like NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific properties. 
Drivers will need to either reject atomic commits configuring both uAPIs, or alternatively we could add a DRM client cap which hides the vendor properties and shows the new generic properties when enabled. To use this uAPI, first user-space needs to discover hardware capabilities via KMS objects and properties, then user-space can configure the hardware via an atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. Our proposal introduces a new "color_pipeline" plane property, and a new KMS object type, "COLOROP" (short for color operation). The "color_pipeline" plane property is an enum, each enum entry represents a color pipeline supported by the hardware. The special zero entry indicates that the pipeline is in "bypass"/"no-op" mode. For instance, the following plane properties describe a primary plane with 2 supported pipelines but currently configured in bypass mode: Plane 10 ├─ "type": immutable enum {Overlay, Primary, Cursor} = Primary ├─ … └─ "color_pipeline": enum {0, 42, 52} = 0 The non-zero entries describe color pipelines as a linked list of COLOROP KMS objects. The entry value is an object ID pointing to the head of the linked list (the first operation in the color pipeline). The new COLOROP objects also expose a number of KMS properties. Each has a type, a reference to the next COLOROP object in the linked list, and other type-specific properties. Here is an example for a 1D LUT operation: Color operation 42 ├─ "type": enum {Bypass, 1D curve} = 1D curve ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT ├─ "lut_size": immutable range = 4096 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 To configure this hardware block, user-space can fill a KMS blob with 4096 u32 entries, then set "lut_data" to the blob ID. Other color operation types might have different properties. 
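As a sketch of the user-space side of the 1D LUT example above, the 4096-entry blob could be built as below. Mapping [0.0, 1.0] onto the full u32 range is an assumption, since the proposal does not yet pin down the fixed-point format "lut_data" expects:

```python
import struct

LUT_SIZE = 4096  # matches "lut_size" in the example above

def srgb_eotf(v):
    # sRGB decoding (EOTF), standard piecewise definition
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

# Sample the curve into 4096 u32 entries; the full-range u32 scale is an
# assumed fixed-point convention for illustration.
entries = [round(srgb_eotf(i / (LUT_SIZE - 1)) * 0xFFFFFFFF)
           for i in range(LUT_SIZE)]
blob = struct.pack(f"<{LUT_SIZE}I", *entries)

# User-space would hand this buffer to drmModeCreatePropertyBlob() and set
# "lut_data" to the returned blob ID in the same atomic commit.
print(len(blob))
# 16384
```

The same buffer-then-blob pattern already exists for the CRTC GAMMA_LUT/DEGAMMA_LUT properties, so tooling for it is well understood.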
Here is another example with a 3D LUT: Color operation 42 ├─ "type": enum {Bypass, 3D LUT} = 3D LUT ├─ "lut_size": immutable range = 33 ├─ "lut_data": blob └─ "next": immutable color operation ID = 43 And one last example with a matrix: Color operation 42 ├─ "type": enum {Bypass, Matrix} = Matrix ├─ "matrix_data": blob └─ "next": immutable color operation ID = 43 [Simon note: having "Bypass" in the "type" enum, and making "type" mutable is a bit weird. Maybe we can just add an "active"/"bypass" boolean property on blocks which can be bypassed instead.] [Jonas note: perhaps a single "data" property for both LUTs and matrices would make more sense. And a "size" prop for both 1D and 3D LUTs.] If some hardware supports re-ordering operations in the color pipeline, the driver can expose multiple pipelines with different operation ordering, and user-space can pick the ordering it prefers by selecting the right pipeline. The same scheme can be used to expose hardware blocks supporting multiple precision levels. That's pretty much all there is to it, but as always the devil