Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-03-07 Thread Daniel Vetter
On Thu, Feb 22, 2018 at 04:16:52PM -0500, Alex Deucher wrote:
> On Thu, Feb 22, 2018 at 1:49 PM, Bas Nieuwenhuizen
>  wrote:
> > On Thu, Feb 22, 2018 at 7:04 PM, Kristian Høgsberg wrote:
> >> On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:
> >>
> >>> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:
> >>> > On Thu 21 Dec 2017, Daniel Vetter wrote:
> >>> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:
> >>> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:
> >>> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:
> >>> >>>>> I'd like to see concrete examples of actual display controllers
> >>> >>>>> supporting more format layouts than what can be specified with a 64
> >>> >>>>> bit modifier.
> >>> >>>>
> >>> >>>> The main problem is our tiling and other metadata parameters can't
> >>> >>>> generally fit in a modifier, so we find passing a blob of metadata a
> >>> >>>> more suitable mechanism.
> >>> >>>
> >>> >>> I understand that you may have n knobs with a total of more than
> >>> >>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
> >>> >>> buy is that you need all those combinations when passing buffers around
> >>> >>> between codecs, cameras and display controllers. Even if you're sharing
> >>> >>> between the same 3D drivers in different processes, I expect just locking
> >>> >>> down, say, 64 different combinations (you can add more over time) and
> >>> >>> assigning each a modifier would be sufficient. I doubt you'd extract
> >>> >>> meaningful performance gains from going all the way to a blob.
> >>> >
> >>> > I agree with Kristian above. In my opinion, choosing to encode in
> >>> > modifiers a precise description of every possible tiling/compression
> >>> > layout is not technically incorrect, but I believe it misses the point.
> >>> > The intention behind modifiers is not to exhaustively describe all
> >>> > possibilities.
> >>> >
> >>> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
> >>> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
> >>> >
> >>> > One goal of modifiers in the Linux ecosystem is to enumerate for each
> >>> > vendor a reasonably sized set of tiling formats that are appropriate for
> >>> > images shared across processes, APIs, and/or devices, where each
> >>> > participating component may possibly be from different vendors.
> >>> > A non-goal is to enumerate all tiling formats supported by all vendors.
> >>> > Some tiling formats used internally by vendors are inappropriate for
> >>> > sharing; no modifiers should be assigned to such tiling formats.
> >>
> >>> Where it gets tricky is how to select that subset?  Our tiling modes
> >>> are defined more by the asic specific constraints than the tiling mode
> >>> itself.  At a high level we have basically 3 tiling modes (out of 16
> >>> possible) that would be the minimum we'd want to expose for gfx6-8.
> >>> gfx9 uses a completely new scheme.
> >>> 1. Linear (per asic stride requirements, not usable by many hw blocks)
> >>> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
> >>> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
> >>> tile split (7 possible), sample split (4 possible), num banks (4
> >>> possible), bank width (4 possible), bank height (4 possible), macro
> >>> tile aspect (4 possible) all of which are asic config specific)
> >>
> >>> I guess we could do something like:
> >>> AMD_GFX6_LINEAR_ALIGNED_64B
> >>> AMD_GFX6_LINEAR_ALIGNED_256B
> >>> AMD_GFX6_LINEAR_ALIGNED_512B
> >>> AMD_GFX6_1D_THIN_DISPLAY
> >>> AMD_GFX6_1D_THIN_DEPTH
> >>> AMD_GFX6_1D_THIN_ROTATED
> >>> AMD_GFX6_1D_THIN_THIN
> >>> AMD_GFX6_1D_THIN_THICK
> >>
> >> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> >>
> >> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_

2018-02-26 Thread James Jones

On 02/22/2018 01:16 PM, Alex Deucher wrote:

On Thu, Feb 22, 2018 at 1:49 PM, Bas Nieuwenhuizen
 wrote:

On Thu, Feb 22, 2018 at 7:04 PM, Kristian Høgsberg  wrote:

On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:


On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:

On Thu 21 Dec 2017, Daniel Vetter wrote:

On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:

On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:

On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:

I'd like to see concrete examples of actual display controllers
supporting more format layouts than what can be specified with a 64
bit modifier.

The main problem is our tiling and other metadata parameters can't
generally fit in a modifier, so we find passing a blob of metadata a
more suitable mechanism.

I understand that you may have n knobs with a total of more than
56 bits that configure your tiling/swizzling for color buffers. What I don't
buy is that you need all those combinations when passing buffers around
between codecs, cameras and display controllers. Even if you're sharing
between the same 3D drivers in different processes, I expect just locking
down, say, 64 different combinations (you can add more over time) and
assigning each a modifier would be sufficient. I doubt you'd extract
meaningful performance gains from going all the way to a blob.

I agree with Kristian above. In my opinion, choosing to encode in
modifiers a precise description of every possible tiling/compression
layout is not technically incorrect, but I believe it misses the point.
The intention behind modifiers is not to exhaustively describe all
possibilities.

I summarized this opinion in VK_EXT_image_drm_format_modifier,
where I wrote an "introduction to modifiers" section. Here's an excerpt:

 One goal of modifiers in the Linux ecosystem is to enumerate for each
 vendor a reasonably sized set of tiling formats that are appropriate for
 images shared across processes, APIs, and/or devices, where each
 participating component may possibly be from different vendors.
 A non-goal is to enumerate all tiling formats supported by all vendors.
 Some tiling formats used internally by vendors are inappropriate for
 sharing; no modifiers should be assigned to such tiling formats.



Where it gets tricky is how to select that subset?  Our tiling modes
are defined more by the asic specific constraints than the tiling mode
itself.  At a high level we have basically 3 tiling modes (out of 16
possible) that would be the minimum we'd want to expose for gfx6-8.
gfx9 uses a completely new scheme.
1. Linear (per asic stride requirements, not usable by many hw blocks)
2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
tile split (7 possible), sample split (4 possible), num banks (4
possible), bank width (4 possible), bank height (4 possible), macro
tile aspect (4 possible) all of which are asic config specific)



I guess we could do something like:
AMD_GFX6_LINEAR_ALIGNED_64B
AMD_GFX6_LINEAR_ALIGNED_256B
AMD_GFX6_LINEAR_ALIGNED_512B
AMD_GFX6_1D_THIN_DISPLAY
AMD_GFX6_1D_THIN_DEPTH
AMD_GFX6_1D_THIN_ROTATED
AMD_GFX6_1D_THIN_THIN
AMD_GFX6_1D_THIN_THICK


AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

etc.



We probably only need 40 bits to encode all of the tiling parameters,
so we could do family plus tiling encoding, but that still seems unwieldy
to deal with from an application perspective.  All of the parameters
affect the alignment requirements.


We discussed this earlier in t

2018-02-22 Thread Alex Deucher
On Thu, Feb 22, 2018 at 1:49 PM, Bas Nieuwenhuizen
 wrote:
> On Thu, Feb 22, 2018 at 7:04 PM, Kristian Høgsberg  
> wrote:
>> On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:
>>
>>> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:
>>> > On Thu 21 Dec 2017, Daniel Vetter wrote:
>>> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:
>>> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:
>>> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:
>>> >>>>> I'd like to see concrete examples of actual display controllers
>>> >>>>> supporting more format layouts than what can be specified with a 64
>>> >>>>> bit modifier.
>>> >>>>
>>> >>>> The main problem is our tiling and other metadata parameters can't
>>> >>>> generally fit in a modifier, so we find passing a blob of metadata a
>>> >>>> more suitable mechanism.
>>> >>>
>>> >>> I understand that you may have n knobs with a total of more than
>>> >>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>>> >>> buy is that you need all those combinations when passing buffers around
>>> >>> between codecs, cameras and display controllers. Even if you're sharing
>>> >>> between the same 3D drivers in different processes, I expect just locking
>>> >>> down, say, 64 different combinations (you can add more over time) and
>>> >>> assigning each a modifier would be sufficient. I doubt you'd extract
>>> >>> meaningful performance gains from going all the way to a blob.
>>> >
>>> > I agree with Kristian above. In my opinion, choosing to encode in
>>> > modifiers a precise description of every possible tiling/compression
>>> > layout is not technically incorrect, but I believe it misses the point.
>>> > The intention behind modifiers is not to exhaustively describe all
>>> > possibilities.
>>> >
>>> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
>>> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
>>> >
>>> > One goal of modifiers in the Linux ecosystem is to enumerate for each
>>> > vendor a reasonably sized set of tiling formats that are appropriate for
>>> > images shared across processes, APIs, and/or devices, where each
>>> > participating component may possibly be from different vendors.
>>> > A non-goal is to enumerate all tiling formats supported by all vendors.
>>> > Some tiling formats used internally by vendors are inappropriate for
>>> > sharing; no modifiers should be assigned to such tiling formats.
>>
>>> Where it gets tricky is how to select that subset?  Our tiling modes
>>> are defined more by the asic specific constraints than the tiling mode
>>> itself.  At a high level we have basically 3 tiling modes (out of 16
>>> possible) that would be the minimum we'd want to expose for gfx6-8.
>>> gfx9 uses a completely new scheme.
>>> 1. Linear (per asic stride requirements, not usable by many hw blocks)
>>> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
>>> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
>>> tile split (7 possible), sample split (4 possible), num banks (4
>>> possible), bank width (4 possible), bank height (4 possible), macro
>>> tile aspect (4 possible) all of which are asic config specific)
>>
>>> I guess we could do something like:
>>> AMD_GFX6_LINEAR_ALIGNED_64B
>>> AMD_GFX6_LINEAR_ALIGNED_256B
>>> AMD_GFX6_LINEAR_ALIGNED_512B
>>> AMD_GFX6_1D_THIN_DISPLAY
>>> AMD_GFX6_1D_THIN_DEPTH
>>> AMD_GFX6_1D_THIN_ROTATED
>>> AMD_GFX6_1D_THIN_THIN
>>> AMD_GFX6_1D_THIN_THICK
>>
>> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>>
>> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64

2018-02-22 Thread Eric Anholt
Kristian Høgsberg  writes:

> On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:
>
>> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:
>> > On Thu 21 Dec 2017, Daniel Vetter wrote:
>> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:
>> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:
>> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:
>> >>>>> I'd like to see concrete examples of actual display controllers
>> >>>>> supporting more format layouts than what can be specified with a 64
>> >>>>> bit modifier.
>> >>>>
>> >>>> The main problem is our tiling and other metadata parameters can't
>> >>>> generally fit in a modifier, so we find passing a blob of metadata a
>> >>>> more suitable mechanism.
>> >>>
>> >>> I understand that you may have n knobs with a total of more than
>> >>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>> >>> buy is that you need all those combinations when passing buffers around
>> >>> between codecs, cameras and display controllers. Even if you're sharing
>> >>> between the same 3D drivers in different processes, I expect just locking
>> >>> down, say, 64 different combinations (you can add more over time) and
>> >>> assigning each a modifier would be sufficient. I doubt you'd extract
>> >>> meaningful performance gains from going all the way to a blob.
>> >
>> > I agree with Kristian above. In my opinion, choosing to encode in
>> > modifiers a precise description of every possible tiling/compression
>> > layout is not technically incorrect, but I believe it misses the point.
>> > The intention behind modifiers is not to exhaustively describe all
>> > possibilities.
>> >
>> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
>> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
>> >
>> > One goal of modifiers in the Linux ecosystem is to enumerate for each
>> > vendor a reasonably sized set of tiling formats that are appropriate for
>> > images shared across processes, APIs, and/or devices, where each
>> > participating component may possibly be from different vendors.
>> > A non-goal is to enumerate all tiling formats supported by all vendors.
>> > Some tiling formats used internally by vendors are inappropriate for
>> > sharing; no modifiers should be assigned to such tiling formats.
>
>> Where it gets tricky is how to select that subset?  Our tiling modes
>> are defined more by the asic specific constraints than the tiling mode
>> itself.  At a high level we have basically 3 tiling modes (out of 16
>> possible) that would be the minimum we'd want to expose for gfx6-8.
>> gfx9 uses a completely new scheme.
>> 1. Linear (per asic stride requirements, not usable by many hw blocks)
>> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
>> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
>> tile split (7 possible), sample split (4 possible), num banks (4
>> possible), bank width (4 possible), bank height (4 possible), macro
>> tile aspect (4 possible) all of which are asic config specific)
>
>> I guess we could do something like:
>> AMD_GFX6_LINEAR_ALIGNED_64B
>> AMD_GFX6_LINEAR_ALIGNED_256B
>> AMD_GFX6_LINEAR_ALIGNED_512B
>> AMD_GFX6_1D_THIN_DISPLAY
>> AMD_GFX6_1D_THIN_DEPTH
>> AMD_GFX6_1D_THIN_ROTATED
>> AMD_GFX6_1D_THIN_THIN
>> AMD_GFX6_1D_THIN_THICK
>
> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>> etc.
>
>> We probably only need 40 bits to encode all of the tiling parameters
>> so we could do family, plus 

2018-02-22 Thread Bas Nieuwenhuizen
On Thu, Feb 22, 2018 at 7:04 PM, Kristian Høgsberg  wrote:
> On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:
>
>> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:
>> > On Thu 21 Dec 2017, Daniel Vetter wrote:
>> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:
>> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:
>> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:
>> >>>>> I'd like to see concrete examples of actual display controllers
>> >>>>> supporting more format layouts than what can be specified with a 64
>> >>>>> bit modifier.
>> >>>>
>> >>>> The main problem is our tiling and other metadata parameters can't
>> >>>> generally fit in a modifier, so we find passing a blob of metadata a
>> >>>> more suitable mechanism.
>> >>>
>> >>> I understand that you may have n knobs with a total of more than
>> >>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>> >>> buy is that you need all those combinations when passing buffers around
>> >>> between codecs, cameras and display controllers. Even if you're sharing
>> >>> between the same 3D drivers in different processes, I expect just locking
>> >>> down, say, 64 different combinations (you can add more over time) and
>> >>> assigning each a modifier would be sufficient. I doubt you'd extract
>> >>> meaningful performance gains from going all the way to a blob.
>> >
>> > I agree with Kristian above. In my opinion, choosing to encode in
>> > modifiers a precise description of every possible tiling/compression
>> > layout is not technically incorrect, but I believe it misses the point.
>> > The intention behind modifiers is not to exhaustively describe all
>> > possibilities.
>> >
>> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
>> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
>> >
>> > One goal of modifiers in the Linux ecosystem is to enumerate for each
>> > vendor a reasonably sized set of tiling formats that are appropriate for
>> > images shared across processes, APIs, and/or devices, where each
>> > participating component may possibly be from different vendors.
>> > A non-goal is to enumerate all tiling formats supported by all vendors.
>> > Some tiling formats used internally by vendors are inappropriate for
>> > sharing; no modifiers should be assigned to such tiling formats.
>
>> Where it gets tricky is how to select that subset?  Our tiling modes
>> are defined more by the asic specific constraints than the tiling mode
>> itself.  At a high level we have basically 3 tiling modes (out of 16
>> possible) that would be the minimum we'd want to expose for gfx6-8.
>> gfx9 uses a completely new scheme.
>> 1. Linear (per asic stride requirements, not usable by many hw blocks)
>> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
>> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
>> tile split (7 possible), sample split (4 possible), num banks (4
>> possible), bank width (4 possible), bank height (4 possible), macro
>> tile aspect (4 possible) all of which are asic config specific)
>
>> I guess we could do something like:
>> AMD_GFX6_LINEAR_ALIGNED_64B
>> AMD_GFX6_LINEAR_ALIGNED_256B
>> AMD_GFX6_LINEAR_ALIGNED_512B
>> AMD_GFX6_1D_THIN_DISPLAY
>> AMD_GFX6_1D_THIN_DEPTH
>> AMD_GFX6_1D_THIN_ROTATED
>> AMD_GFX6_1D_THIN_THIN
>> AMD_GFX6_1D_THIN_THICK
>
> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>
> AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
>> etc.
>
>> We probably only need 40 bits to encode all of the tiling parameters

2018-02-22 Thread Kristian Høgsberg
On Wed, Feb 21, 2018 at 4:00 PM Alex Deucher  wrote:

> On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace wrote:
> > On Thu 21 Dec 2017, Daniel Vetter wrote:
> >> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen <hoegsb...@google.com> wrote:
> >>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico <mvicom...@nvidia.com> wrote:
> >>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg <hoegsb...@gmail.com> wrote:
> >>>>> I'd like to see concrete examples of actual display controllers
> >>>>> supporting more format layouts than what can be specified with a 64
> >>>>> bit modifier.
> >>>>
> >>>> The main problem is our tiling and other metadata parameters can't
> >>>> generally fit in a modifier, so we find passing a blob of metadata a
> >>>> more suitable mechanism.
> >>>
> >>> I understand that you may have n knobs with a total of more than
> >>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
> >>> buy is that you need all those combinations when passing buffers around
> >>> between codecs, cameras and display controllers. Even if you're sharing
> >>> between the same 3D drivers in different processes, I expect just locking
> >>> down, say, 64 different combinations (you can add more over time) and
> >>> assigning each a modifier would be sufficient. I doubt you'd extract
> >>> meaningful performance gains from going all the way to a blob.
> >
> > I agree with Kristian above. In my opinion, choosing to encode in
> > modifiers a precise description of every possible tiling/compression
> > layout is not technically incorrect, but I believe it misses the point.
> > The intention behind modifiers is not to exhaustively describe all
> > possibilities.
> >
> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
> >
> > One goal of modifiers in the Linux ecosystem is to enumerate for each
> > vendor a reasonably sized set of tiling formats that are appropriate for
> > images shared across processes, APIs, and/or devices, where each
> > participating component may possibly be from different vendors.
> > A non-goal is to enumerate all tiling formats supported by all vendors.
> > Some tiling formats used internally by vendors are inappropriate for
> > sharing; no modifiers should be assigned to such tiling formats.

> Where it gets tricky is how to select that subset?  Our tiling modes
> are defined more by the asic specific constraints than the tiling mode
> itself.  At a high level we have basically 3 tiling modes (out of 16
> possible) that would be the minimum we'd want to expose for gfx6-8.
> gfx9 uses a completely new scheme.
> 1. Linear (per asic stride requirements, not usable by many hw blocks)
> 2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
> 3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
> tile split (7 possible), sample split (4 possible), num banks (4
> possible), bank width (4 possible), bank height (4 possible), macro
> tile aspect (4 possible) all of which are asic config specific)

> I guess we could do something like:
> AMD_GFX6_LINEAR_ALIGNED_64B
> AMD_GFX6_LINEAR_ALIGNED_256B
> AMD_GFX6_LINEAR_ALIGNED_512B
> AMD_GFX6_1D_THIN_DISPLAY
> AMD_GFX6_1D_THIN_DEPTH
> AMD_GFX6_1D_THIN_ROTATED
> AMD_GFX6_1D_THIN_THIN
> AMD_GFX6_1D_THIN_THICK

AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1

AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
> etc.

> We probably only need 40 bits to encode all of the tiling parameters,
> so we could do family plus tiling encoding, but that still seems unwieldy
> to deal with from an application perspective.  All of the parameters
> affect the alignment requirements.
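
The "family plus tiling encoding" idea in the quoted paragraph can be sketched as plain bit-packing. This is only an illustration under stated assumptions: the field order, the 8-bit family field, and the helper names pack/unpack are hypothetical, and the field widths are simply the smallest that hold the value counts quoted earlier in the thread. The vendor value 0x02 matches DRM_FORMAT_MOD_VENDOR_AMD in drm_fourcc.h; nothing else here corresponds to a modifier that is actually defined.

```python
# Hypothetical bit layout: (field name, width in bits), packed from bit 0 up.
# Widths are the minimum needed for the value counts quoted in the thread;
# the 8-bit "family" field is an assumption made for this sketch.
FIELDS = [
    ("micro_tile", 3),         # 5 layouts
    ("pipe_config", 5),        # 18 possible
    ("tile_split", 3),         # 7 possible
    ("sample_split", 2),       # 4 possible
    ("num_banks", 2),          # 4 possible
    ("bank_width", 2),         # 4 possible
    ("bank_height", 2),        # 4 possible
    ("macro_tile_aspect", 2),  # 4 possible
    ("family", 8),             # assumed width
]
VENDOR_AMD = 0x02  # same value as DRM_FORMAT_MOD_VENDOR_AMD


def pack(values):
    """Pack named field values into a 64-bit modifier (vendor in the top byte)."""
    payload, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        payload |= value << shift
        shift += width
    if shift > 56:
        raise ValueError("fields exceed the 56-bit payload")
    return (VENDOR_AMD << 56) | payload


def unpack(modifier):
    """Recover the named fields from a modifier built by pack()."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (modifier >> shift) & ((1 << width) - 1)
        shift += width
    return fields
```

An application would still have to understand every field to reason about alignment, which is the usability concern raised in the quoted text.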

2018-02-21 Thread Alex Deucher
On Wed, Feb 21, 2018 at 1:14 AM, Chad Versace  wrote:
> On Thu 21 Dec 2017, Daniel Vetter wrote:
>> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen wrote:
>>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico wrote:
>>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg wrote:
>>>>> I'd like to see concrete examples of actual display controllers
>>>>> supporting more format layouts than what can be specified with a 64
>>>>> bit modifier.
>>>>
>>>> The main problem is our tiling and other metadata parameters can't
>>>> generally fit in a modifier, so we find passing a blob of metadata a
>>>> more suitable mechanism.
>>>
>>> I understand that you may have n knobs with a total of more than
>>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>>> buy is that you need all those combinations when passing buffers around
>>> between codecs, cameras and display controllers. Even if you're sharing
>>> between the same 3D drivers in different processes, I expect just locking
>>> down, say, 64 different combinations (you can add more over time) and
>>> assigning each a modifier would be sufficient. I doubt you'd extract
>>> meaningful performance gains from going all the way to a blob.
>
> I agree with Kristian above. In my opinion, choosing to encode in
> modifiers a precise description of every possible tiling/compression
> layout is not technically incorrect, but I believe it misses the point.
> The intention behind modifiers is not to exhaustively describe all
> possibilities.
>
> I summarized this opinion in VK_EXT_image_drm_format_modifier,
> where I wrote an "introduction to modifiers" section. Here's an excerpt:
>
> One goal of modifiers in the Linux ecosystem is to enumerate for each
> vendor a reasonably sized set of tiling formats that are appropriate for
> images shared across processes, APIs, and/or devices, where each
> participating component may possibly be from different vendors.
> A non-goal is to enumerate all tiling formats supported by all vendors.
> Some tiling formats used internally by vendors are inappropriate for
> sharing; no modifiers should be assigned to such tiling formats.

Where it gets tricky is how to select that subset?  Our tiling modes
are defined more by the asic specific constraints than the tiling mode
itself.  At a high level we have basically 3 tiling modes (out of 16
possible) that would be the minimum we'd want to expose for gfx6-8.
gfx9 uses a completely new scheme.
1. Linear (per asic stride requirements, not usable by many hw blocks)
2. 1D Thin (5 layouts, displayable, depth, thin, rotated, thick)
3. 2D Thin (1D tiling constraints, plus pipe config (18 possible),
tile split (7 possible), sample split (4 possible), num banks (4
possible), bank width (4 possible), bank height (4 possible), macro
tile aspect (4 possible) all of which are asic config specific)

I guess we could do something like:
AMD_GFX6_LINEAR_ALIGNED_64B
AMD_GFX6_LINEAR_ALIGNED_256B
AMD_GFX6_LINEAR_ALIGNED_512B
AMD_GFX6_1D_THIN_DISPLAY
AMD_GFX6_1D_THIN_DEPTH
AMD_GFX6_1D_THIN_ROTATED
AMD_GFX6_1D_THIN_THIN
AMD_GFX6_1D_THIN_THICK
AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P2_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_DISPLAY_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_DEPTH_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_ROTATED_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_THIN_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
AMD_GFX6_2D_1D_THIN_THICK_PIPE_CONFIG_P4_8x16_TILE_SPLIT_64B_SAMPLE_SPLIT_1_NUM_BANKS_2_BANK_WIDTH_1_BANK_HEIGHT_1_MACRO_TILE_ASPECT_1
etc.

We probably only need 40 bits to encode all of the tiling parameters,
so we could do family plus tiling encoding, but that still seems unwieldy
to deal with from an application perspective.  All of the parameters
affect the alignment requirements.

Alex
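
To make the "family plus tiling encoding" idea above concrete, here is a
minimal sketch of packing the listed parameters into the 56 vendor-defined
bits of a 64-bit modifier. The struct, the field order, the 8-bit family ID
and the bit widths are assumptions derived from the counts quoted above
(18 pipe configs need 5 bits, 7 tile splits need 3 bits, and so on); this is
not an actual AMD layout. Only the vendor-in-the-top-8-bits shape and the
AMD vendor ID 0x02 mirror the kernel's drm_fourcc.h.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's fourcc_mod_code(): vendor ID in the top
 * 8 bits, 56 vendor-defined bits below it. */
#define MOD_VENDOR_SHIFT 56
#define MOD_VALUE_MASK   0x00ffffffffffffffULL
#define MOD_VENDOR_AMD   0x02ULL /* same value as DRM_FORMAT_MOD_VENDOR_AMD */

/* Hypothetical field set; widths follow the counts in the mail above. */
struct gfx6_tiling {
    unsigned family;       /* 8 bits: assumed ASIC family ID       */
    unsigned array_mode;   /* 4 bits: 16 possible tiling modes     */
    unsigned micro_tile;   /* 3 bits: 5 layouts                    */
    unsigned pipe_config;  /* 5 bits: 18 possible                  */
    unsigned tile_split;   /* 3 bits: 7 possible                   */
    unsigned sample_split; /* 2 bits: 4 possible                   */
    unsigned num_banks;    /* 2 bits: 4 possible                   */
    unsigned bank_width;   /* 2 bits: 4 possible                   */
    unsigned bank_height;  /* 2 bits: 4 possible                   */
    unsigned macro_aspect; /* 2 bits: 4 possible                   */
};

/* Append one field of the given width at the current bit offset. */
static uint64_t put_field(uint64_t acc, unsigned *shift,
                          unsigned value, unsigned width)
{
    assert(value < (1u << width));
    acc |= (uint64_t)value << *shift;
    *shift += width;
    return acc;
}

/* Pack all fields (33 bits in this sketch) below the vendor prefix. */
uint64_t gfx6_pack_modifier(const struct gfx6_tiling *t)
{
    unsigned shift = 0;
    uint64_t v = 0;

    v = put_field(v, &shift, t->family, 8);
    v = put_field(v, &shift, t->array_mode, 4);
    v = put_field(v, &shift, t->micro_tile, 3);
    v = put_field(v, &shift, t->pipe_config, 5);
    v = put_field(v, &shift, t->tile_split, 3);
    v = put_field(v, &shift, t->sample_split, 2);
    v = put_field(v, &shift, t->num_banks, 2);
    v = put_field(v, &shift, t->bank_width, 2);
    v = put_field(v, &shift, t->bank_height, 2);
    v = put_field(v, &shift, t->macro_aspect, 2);

    assert(shift <= MOD_VENDOR_SHIFT); /* payload must fit in 56 bits */
    return (MOD_VENDOR_AMD << MOD_VENDOR_SHIFT) | (v & MOD_VALUE_MASK);
}

/* Extract the vendor prefix from any modifier. */
unsigned modifier_vendor(uint64_t mod)
{
    return (unsigned)(mod >> MOD_VENDOR_SHIFT);
}
```

Packing is the easy part. The concern raised above still stands: an
application would see an opaque 64-bit number per ASIC configuration and
could not reason about alignment without decoding every field.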
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-02-21 Thread Chad Versace
On Wed 21 Feb 2018, Daniel Vetter wrote:
> On Tue, Feb 20, 2018 at 10:14:47PM -0800, Chad Versace wrote:
> > On Thu 21 Dec 2017, Daniel Vetter wrote:
> > > On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen wrote:
> > >> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico wrote:
> > >>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg wrote:
> > >>>> I'd like to see concrete examples of actual display controllers
> > >>>> supporting more format layouts than what can be specified with a 64
> > >>>> bit modifier.
> > >>>
> > >>> The main problem is our tiling and other metadata parameters can't
> > >>> generally fit in a modifier, so we find passing a blob of metadata a
> > >>> more suitable mechanism.
> > >>
> > >> I understand that you may have n knobs with a total of more than
> > >> 56 bits that configure your tiling/swizzling for color buffers. What I don't
> > >> buy is that you need all those combinations when passing buffers around
> > >> between codecs, cameras and display controllers. Even if you're sharing
> > >> between the same 3D drivers in different processes, I expect just locking
> > >> down, say, 64 different combinations (you can add more over time) and
> > >> assigning each a modifier would be sufficient. I doubt you'd extract
> > >> meaningful performance gains from going all the way to a blob.
> > 
> > I agree with Kristian above. In my opinion, choosing to encode in
> > modifiers a precise description of every possible tiling/compression
> > layout is not technically incorrect, but I believe it misses the point.
> > The intention behind modifiers is not to exhaustively describe all
> > possibilities.
> > 
> > I summarized this opinion in VK_EXT_image_drm_format_modifier,
> > where I wrote an "introduction to modifiers" section. Here's an excerpt:
> > 
> > One goal of modifiers in the Linux ecosystem is to enumerate for each
> > vendor a reasonably sized set of tiling formats that are appropriate for
> > images shared across processes, APIs, and/or devices, where each
> > participating component may possibly be from different vendors.
> > A non-goal is to enumerate all tiling formats supported by all vendors.
> > Some tiling formats used internally by vendors are inappropriate for
> > sharing; no modifiers should be assigned to such tiling formats.
> 
> fwiw (since the source of truth wrt modifiers is the kernel's uapi
> header):
> 
> Acked-by: Daniel Vetter 

Linux would eventually encounter big problems if the kernel and Vulkan
disagreed on the fundamental, unspoken Theory of Modifiers. So your
acked-by is definitely worth something here. Thanks for confirming.

> 
> I'm happy to merge modifier #define additions for pretty much anything
> where there's a need for sharing across devices/drivers/apis, explicitly
> including stuff that's only relevant for userspace and which the kernel
> never sees (in e.g. a kms addfb2 call). Trying to preemptively enumerate
> everything that's possible doesn't seem like a wise idea. But even then we
> can probably spare the oddball vendor prefix if a driver team really
> insists that this is what they want, best using some code that makes the
> case for them.

Yep. I believe Jason Ekstrand has tentative plans for such a modifier
that improves performance for interop in GL and Vulkan but the kernel
and Intel display hw wouldn't understand: a modifier for CCS_E images
that are fully compressed.


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-02-21 Thread Daniel Vetter
On Tue, Feb 20, 2018 at 10:14:47PM -0800, Chad Versace wrote:
> On Thu 21 Dec 2017, Daniel Vetter wrote:
> > On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen wrote:
> >> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico wrote:
> >>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg wrote:
> >>>> I'd like to see concrete examples of actual display controllers
> >>>> supporting more format layouts than what can be specified with a 64
> >>>> bit modifier.
> >>>
> >>> The main problem is our tiling and other metadata parameters can't
> >>> generally fit in a modifier, so we find passing a blob of metadata a
> >>> more suitable mechanism.
> >>
> >> I understand that you may have n knobs with a total of more than
> >> 56 bits that configure your tiling/swizzling for color buffers. What I don't
> >> between codecs, cameras and display controllers. Even if you're sharing
> >> between the same 3D drivers in different processes, I expect just locking
> >> down, say, 64 different combinations (you can add more over time) and
> >> assigning each a modifier would be sufficient. I doubt you'd extract
> >> meaningful performance gains from going all the way to a blob.
> 
> I agree with Kristian above. In my opinion, choosing to encode in
> modifiers a precise description of every possible tiling/compression
> layout is not technically incorrect, but I believe it misses the point.
> The intention behind modifiers is not to exhaustively describe all
> possibilities.
> 
> I summarized this opinion in VK_EXT_image_drm_format_modifier,
> where I wrote an "introduction to modifiers" section. Here's an excerpt:
> 
> One goal of modifiers in the Linux ecosystem is to enumerate for each
> vendor a reasonably sized set of tiling formats that are appropriate for
> images shared across processes, APIs, and/or devices, where each
> participating component may possibly be from different vendors.
> A non-goal is to enumerate all tiling formats supported by all vendors.
> Some tiling formats used internally by vendors are inappropriate for
> sharing; no modifiers should be assigned to such tiling formats.

fwiw (since the source of truth wrt modifiers is the kernel's uapi
header):

Acked-by: Daniel Vetter 

I'm happy to merge modifier #define additions for pretty much anything
where there's a need for sharing across devices/drivers/apis, explicitly
including stuff that's only relevant for userspace and which the kernel
never sees (in e.g. a kms addfb2 call). Trying to preemptively enumerate
everything that's possible doesn't seem like a wise idea. But even then we
can probably spare the oddball vendor prefix if a driver team really
insists that this is what they want, best using some code that makes the
case for them.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-02-20 Thread Chad Versace
On Thu 21 Dec 2017, Daniel Vetter wrote:
> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen wrote:
>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico wrote:
>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg wrote:
>>>> I'd like to see concrete examples of actual display controllers
>>>> supporting more format layouts than what can be specified with a 64
>>>> bit modifier.
>>>
>>> The main problem is our tiling and other metadata parameters can't
>>> generally fit in a modifier, so we find passing a blob of metadata a
>>> more suitable mechanism.
>>
>> I understand that you may have n knobs with a total of more than
>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>> buy is that you need all those combinations when passing buffers around
>> between codecs, cameras and display controllers. Even if you're sharing
>> between the same 3D drivers in different processes, I expect just locking
>> down, say, 64 different combinations (you can add more over time) and
>> assigning each a modifier would be sufficient. I doubt you'd extract
>> meaningful performance gains from going all the way to a blob.

I agree with Kristian above. In my opinion, choosing to encode in
modifiers a precise description of every possible tiling/compression
layout is not technically incorrect, but I believe it misses the point.
The intention behind modifiers is not to exhaustively describe all
possibilities.

I summarized this opinion in VK_EXT_image_drm_format_modifier,
where I wrote an "introduction to modifiers" section. Here's an excerpt:

One goal of modifiers in the Linux ecosystem is to enumerate for each
vendor a reasonably sized set of tiling formats that are appropriate for
images shared across processes, APIs, and/or devices, where each
participating component may possibly be from different vendors.
A non-goal is to enumerate all tiling formats supported by all vendors.
Some tiling formats used internally by vendors are inappropriate for
sharing; no modifiers should be assigned to such tiling formats.

> Tegra just redesigned its modifier space from an ungodly amount of
> bits to just a few layouts. Not even just the ones in use, but simply
> limiting to the ones that make sense (there's dependencies apparently).
> Also note that the modifier alone doesn't need to describe the layout
> precisely, it only makes sense together with a specific pixel format
> and size. E.g. a bunch of the i915 layouts change layout depending
> upon bpp.
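
The point that a modifier only makes sense together with a pixel format can
be sketched as a lookup over (format, modifier) pairs, similar in spirit to
the per-plane list of supported combinations that KMS exposes. The table
contents, the struct and plane_supports() are hypothetical and only
illustrate the pairing; the FOURCC/MOD_CODE macros mirror the shape of
fourcc_code() and fourcc_mod_code() in drm_fourcc.h, and the two Intel
values match I915_FORMAT_MOD_X_TILED and I915_FORMAT_MOD_Y_TILED.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the drm_fourcc.h encoding, to keep the sketch standalone. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define MOD_CODE(vendor, val) \
    (((uint64_t)(vendor) << 56) | ((val) & 0x00ffffffffffffffULL))

#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')
#define FMT_NV12     FOURCC('N', 'V', '1', '2')
#define MOD_LINEAR   MOD_CODE(0x00, 0)
#define MOD_X_TILED  MOD_CODE(0x01, 1) /* Intel vendor ID 0x01 */
#define MOD_Y_TILED  MOD_CODE(0x01, 2)

struct format_modifier {
    uint32_t format;
    uint64_t modifier;
};

/* Hypothetical capability list for one plane: a modifier is only valid
 * in combination with specific formats. */
static const struct format_modifier supported[] = {
    { FMT_XRGB8888, MOD_LINEAR  },
    { FMT_XRGB8888, MOD_X_TILED },
    { FMT_XRGB8888, MOD_Y_TILED },
    { FMT_NV12,     MOD_LINEAR  },
    { FMT_NV12,     MOD_Y_TILED },
};

/* Return true if the (format, modifier) pair appears in the list. */
bool plane_supports(uint32_t format, uint64_t modifier)
{
    for (size_t i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
        if (supported[i].format == format &&
            supported[i].modifier == modifier)
            return true;
    }
    return false;
}
```

Because the format is part of the key, the modifier itself does not have to
carry bpp or size information, which is one reason a small set of modifier
values can cover many concrete layouts.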


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-01-16 Thread Miguel Angel Vico
Hi,

Besides the DRM modifiers discussion in the other forks of this thread
(I should've probably started separate threads), has anyone gotten the
chance to look at least at the mesa changes and allocator changes I
shared below?

With respect to Mesa changes, I think it might be worth merging the
EXT_external_objects nouveau implementation upstream. Should I just
send the list of patches as a formal RFR?

With respect to Allocator changes, it'd be nice to get feedback from
someone outside NVIDIA.

Thanks.

On Wed, 20 Dec 2017 08:51:51 -0800
Miguel Angel Vico  wrote:

> Hi all,
> 
> As many of you already know, I've been working with James Jones on the
> Generic Device Allocator project lately. He started a discussion thread
> some weeks ago seeking feedback on the current prototype of the library
> and advice on how to move all this forward, from a prototype stage to
> production. For further reference, see:
> 
>   https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
> 
> From the thread above, we came up with very interesting high level
> design ideas for one of the currently missing parts in the library:
> Usage transitions. That's something I'll personally work on during the
> following weeks.
> 
> 
> In the meantime, I've been working on putting together an open source
> implementation of the allocator mechanisms using the Nouveau driver for
> all to be able to play with.
> 
> Below I'm seeking feedback on a bunch of changes I had to make to
> different components of the graphics stack:
> 
> ** Allocator **
> 
>   An allocator driver implementation on top of Nouveau. The current
>   implementation only handles pitch linear layouts, but that's enough
>   to have the kmscube port working using the allocator and Nouveau
>   drivers.
> 
>   You can pull these changes from
> 
>   https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
> 
> ** Mesa **
> 
>   James's kmscube port to use the allocator relies on the
>   EXT_external_objects extension to import allocator allocations to
>   OpenGL as a texture object. However, the Nouveau implementation of
>   these mechanisms is missing in Mesa, so I went ahead and added them.
> 
>   You can pull these changes from
> 
>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-nouveau
> 
>   Also, James's kmscube port uses the NVX_unix_allocator_import
>   extension to attach allocator metadata to texture objects so the
>   driver knows how to deal with the imported memory.
> 
>   Note that there isn't a formal spec for this extension yet. For now,
>   it just serves as an experimental mechanism to import allocator
>   memory in OpenGL, and attach metadata to texture objects.
> 
>   You can pull these changes (written on top of the above) from:
> 
>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_import
> 
> ** kmscube **
> 
>   Mostly minor fixes and improvements on top of James's port to use the
>   allocator. Main thing is the allocator initialization path will use
>   EGL_MESA_platform_surfaceless if EGLDevice platform isn't supported
>   by the underlying EGL implementation.
> 
>   You can pull these changes from:
> 
>   https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
> 
> 
> With all the above you should be able to get kmscube working using the
> allocator on top of the Nouveau driver.
> 
> 
> Another of the missing pieces before we can move this to production is
> importing allocations to DRM FB objects. This is probably one of the
> most sensitive parts of the project as it requires modification/addition
> of kernel driver interfaces.
> 
> At XDC2017, James had several hallway conversations with several people
> about this, all having different opinions. I'd like to take this
> opportunity to also start a discussion about what's the best option to
> create a path to get allocator allocations added as DRM FB objects.
> 
> These are the few options we've considered to start with:
> 
>   A) Have vendor-private ioctls to set properties on GEM objects that
>  are inherited by the FB objects. This is how our (NVIDIA) desktop
>  DRM driver currently works. This would require every vendor to add
>  their own ioctl to process allocator metadata, but the metadata is
>  actually a vendor-agnostic object more like DRM modifiers. We'd
>  like to come up with a vendor-agnostic solution that can be
>  integrated into core DRM.
> 
>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
>  metadata blobs for each plane of the FB. Some people in the
>  community have mentioned this is their preferred design. This,
>  however, means we'd have to go through the exercise of adding
>  another metadata mechanism to the whole graphics stack.
> 
>   C) Shove allocator metadata into DRM by defining it to be a separate
>  plane in the image, and using the existing DRM modifiers mechanis

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-01-08 Thread Daniel Vetter
Just wanted to clarify this one thing here, otherwise I think Rob/krh
covered it all.

On Thu, Dec 28, 2017 at 10:24:38AM -0800, Miguel Angel Vico wrote:
> Daniel Vetter wrote:
> > I think in the interim figuring out how to expose kms capabilities
> > better (and necessarily standardizing at least some of them which
> > matter at the compositor level, like size limits of framebuffers)
> > feels like the place to push the ecosystem forward. In some way
> > Miguel's proposal looks a bit backwards, since it adds the pitch
> > capabilities to addfb, but at addfb time you've allocated everything
> > already, so way too late to fix things up. With modifiers we've added
> > a very simple per-plane property to list which modifiers can be
> > combined with which pixel formats. Tiny start, but obviously very far
> > from all that we'll need.  
> 
> Not sure whether I might be misunderstanding your statement, but one of
> the allocator main features is negotiation of nearly optimal allocation
> parameters given a set of uses on different devices/engines by the
> capability merge operation. A client should have queried what every
> device/engine is capable of for the given uses, find the optimal set of
> capabilities, and use it for allocating a buffer. At the moment these
> parameters are given to KMS, they are expected to be good. If they
> aren't, the client didn't do things right.

Your example code has a new capability for PITCH_ALIGNMENT. That looks
wrong for addfb (which should only receive the computed intersection
of all requirements, not the requirements themselves). And since that was the
only thing in your example code besides the bare boilerplate to wire it
all up it looks a bit confused.

Maybe we need to distinguish capabilities into constraints on properties
(like pitch alignment, or power-of-two pitch) and properties (like pitch)
themselves.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
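
The distinction drawn above between constraints on a property (pitch
alignment) and the property itself (pitch) can be sketched as follows. The
struct and function names are hypothetical and not part of any allocator or
KMS API; the assumption is that each device reports a byte alignment for
the pitch, that merging two constraints means taking their least common
multiple, and that only the resulting pitch would ever be handed to addfb.

```c
#include <stdint.h>

/* A constraint on the pitch property: pitch must be a multiple of this. */
struct pitch_constraint {
    uint32_t alignment; /* bytes, must be > 0 */
};

static uint32_t gcd32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Intersect two constraints: a pitch that satisfies both devices must be
 * a multiple of the least common multiple of their alignments. */
struct pitch_constraint merge_constraints(struct pitch_constraint a,
                                          struct pitch_constraint b)
{
    struct pitch_constraint out;

    out.alignment = a.alignment / gcd32(a.alignment, b.alignment) * b.alignment;
    return out;
}

/* Derive the property (the concrete pitch in bytes) from the merged
 * constraint by rounding the row size up to the alignment. */
uint32_t compute_pitch(uint32_t width, uint32_t bytes_per_pixel,
                       struct pitch_constraint c)
{
    uint32_t row = width * bytes_per_pixel;

    return (row + c.alignment - 1) / c.alignment * c.alignment;
}
```

In this split the constraint is consumed during negotiation, before
allocation, and the kernel only ever sees the computed pitch value.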


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-01-03 Thread James Jones

On 12/28/2017 10:24 AM, Miguel Angel Vico wrote:

> (Adding dri-devel back, and trying to respond to some comments from
> the different forks)
>
> James Jones wrote:
>
>> Your worst case analysis above isn't far off from our HW, give or take
>> some bits and axes here and there.  We've started an internal discussion
>> about how to lay out all the bits we need.  It's hard to even enumerate
>> them all without having a complete understanding of what capability sets
>> are going to include, a fully-optimized implementation of the mechanism
>> on our HW, and lots of test scenarios though.
>
> (thanks James for most of the info below)
>
> To elaborate a bit, if we want to share an allocation across GPUs for 3D
> rendering, it seems we would need 12 bits to express our
> swizzling/tiling memory layouts for fermi+. In addition to that,
> maxwell uses 3 more bits for this, and we need an extra bit to identify
> pre-fermi representations.
>
> We also need one bit to differentiate between Tegra and desktop, and
> another one to indicate whether the layout is otherwise linear.
>
> Then things like whether compression is used (one more bit), and we can
> probably get by with 3 bits for the type of compression if we are
> creative. However, it'd be way easier to just track arch + page kind,
> which would be like 32 bits on its own.


Not clear if this is an NV-only term, so for those not familiar, page 
kind is very loosely the equivalent of a format modifier our HW uses 
internally in its memory management subsystem.  The value mappings vary 
a bit for each HW generation.



> Whether Z-culling and/or zero-bandwidth-clears are used may be another 3
> bits.
>
> If device-local properties are included, we might need a couple more
> bits for caching.
>
> We may also need to express locality information, which may take at
> least another 2 or 3 bits.
>
> If we want to share array textures too, you also need to pass the array
> pitch. Is it supposed to be encoded in a modifier too? That's 64 bits on
> its own.
>
> So yes, as James mentioned, with some effort, we could technically fit
> our current allocation parameters in a modifier, but I'm still not
> convinced this is as future proof as it could be as our hardware grows
> in capabilities.
>
>
> Daniel Stone wrote:
>
>> So I reflexively
>> get a bit itchy when I see the kernel being used to transit magic
>> blobs of data which are supplied by userspace, and only interpreted by
>> different userspace. Having tiling formats hidden away means that
>> we've had real-world bugs in AMD hardware, where we end up displaying
>> garbage because we cannot generically reason about the buffer
>> attributes.
>
> I'm a bit confused. Can't modifiers be specified by vendors and only
> interpreted by drivers? My understanding was that modifiers could
> actually be treated as opaque 64-bit data, in which case they would
> qualify as "magic blobs of data". Otherwise, it seems this wouldn't be
> scalable. What am I missing?
>
>
> Daniel Vetter wrote:
>
>> I think in the interim figuring out how to expose kms capabilities
>> better (and necessarily standardizing at least some of them which
>> matter at the compositor level, like size limits of framebuffers)
>> feels like the place to push the ecosystem forward. In some way
>> Miguel's proposal looks a bit backwards, since it adds the pitch
>> capabilities to addfb, but at addfb time you've allocated everything
>> already, so way too late to fix things up. With modifiers we've added
>> a very simple per-plane property to list which modifiers can be
>> combined with which pixel formats. Tiny start, but obviously very far
>> from all that we'll need.
>
> Not sure whether I might be misunderstanding your statement, but one of
> the allocator main features is negotiation of nearly optimal allocation
> parameters given a set of uses on different devices/engines by the
> capability merge operation. A client should have queried what every
> device/engine is capable of for the given uses, find the optimal set of
> capabilities, and use it for allocating a buffer. At the moment these
> parameters are given to KMS, they are expected to be good. If they
> aren't, the client didn't do things right.
>
>
> Rob Clark wrote:
>
>> It does seem like, if possible, starting out with modifiers for now at
>> the kernel interface would make life easier, vs trying to reinvent
>> both kernel and userspace APIs at the same time.  Userspace APIs are
>> easier to change or throw away.  Presumably by the time we get to the
>> point of changing kernel uabi, we are already using, and pretty happy
>> with, serialized liballoc data over the wire in userspace so it is
>> only a matter of changing the kernel interface.
>
> I guess we can indeed start with modifiers for now, if that's what it
> takes to get the allocator mechanisms rolling. However, it seems to me
> that we won't be able to encode the same type of information included
> in capability sets with modifiers in all cases. For instance, if we end
> up encoding usage transition information in capability sets, how that
> would translate to modifiers?
>
> I assume display doesn't really care about a lot of the data capab

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-01-03 Thread Rob Clark
On Thu, Dec 28, 2017 at 1:24 PM, Miguel Angel Vico  wrote:
> (Adding dri-devel back, and trying to respond to some comments from
> the different forks)
>
> James Jones wrote:
>
>> Your worst case analysis above isn't far off from our HW, give or take
>> some bits and axes here and there.  We've started an internal discussion
>> about how to lay out all the bits we need.  It's hard to even enumerate
>> them all without having a complete understanding of what capability sets
>> are going to include, a fully-optimized implementation of the mechanism
>> on our HW, and lots of test scenarios though.
>
> (thanks James for most of the info below)
>
> To elaborate a bit, if we want to share an allocation across GPUs for 3D
> rendering, it seems we would need 12 bits to express our
> swizzling/tiling memory layouts for fermi+. In addition to that,
> maxwell uses 3 more bits for this, and we need an extra bit to identify
> pre-fermi representations.
>
> We also need one bit to differentiate between Tegra and desktop, and
> another one to indicate whether the layout is otherwise linear.
>
> Then things like whether compression is used (one more bit), and we can
> probably get by with 3 bits for the type of compression if we are
> creative. However, it'd be way easier to just track arch + page kind,
> which would be like 32 bits on its own.
>
> Whether Z-culling and/or zero-bandwidth-clears are used may be another 3
> bits.
>
> If device-local properties are included, we might need a couple more
> bits for caching.
>
> We may also need to express locality information, which may take at
> least another 2 or 3 bits.
>
> If we want to share array textures too, you also need to pass the array
> pitch. Is it supposed to be encoded in a modifier too? That's 64 bits on
> its own.
>
> So yes, as James mentioned, with some effort, we could technically fit
> our current allocation parameters in a modifier, but I'm still not
> convinced this is as future proof as it could be as our hardware grows
> in capabilities.
>
>
> Daniel Stone wrote:
>
>> So I reflexively
>> get a bit itchy when I see the kernel being used to transit magic
>> blobs of data which are supplied by userspace, and only interpreted by
>> different userspace. Having tiling formats hidden away means that
>> we've had real-world bugs in AMD hardware, where we end up displaying
>> garbage because we cannot generically reason about the buffer
>> attributes.
>
> I'm a bit confused. Can't modifiers be specified by vendors and only
> interpreted by drivers? My understanding was that modifiers could
> actually be treated as opaque 64-bit data, in which case they would
> qualify as "magic blobs of data". Otherwise, it seems this wouldn't be
> scalable. What am I missing?
>
>
> Daniel Vetter wrote:
>
>> I think in the interim figuring out how to expose kms capabilities
>> better (and necessarily standardizing at least some of them which
>> matter at the compositor level, like size limits of framebuffers)
>> feels like the place to push the ecosystem forward. In some way
>> Miguel's proposal looks a bit backwards, since it adds the pitch
>> capabilities to addfb, but at addfb time you've allocated everything
>> already, so way too late to fix things up. With modifiers we've added
>> a very simple per-plane property to list which modifiers can be
>> combined with which pixel formats. Tiny start, but obviously very far
>> from all that we'll need.
>
> Not sure whether I might be misunderstanding your statement, but one of
> the allocator main features is negotiation of nearly optimal allocation
> parameters given a set of uses on different devices/engines by the
> capability merge operation. A client should have queried what every
> device/engine is capable of for the given uses, find the optimal set of
> capabilities, and use it for allocating a buffer. At the moment these
> parameters are given to KMS, they are expected to be good. If they
> aren't, the client didn't do things right.
>
>
> Rob Clark wrote:
>
>> It does seem like, if possible, starting out with modifiers for now at
>> the kernel interface would make life easier, vs trying to reinvent
>> both kernel and userspace APIs at the same time.  Userspace APIs are
>> easier to change or throw away.  Presumably by the time we get to the
>> point of changing kernel uabi, we are already using, and pretty happy
>> with, serialized liballoc data over the wire in userspace so it is
>> only a matter of changing the kernel interface.
>
> I guess we can indeed start with modifiers for now, if that's what it
> takes to get the allocator mechanisms rolling. However, it seems to me
> that we won't be able to encode the same type of information included
> in capability sets with modifiers in all cases. For instance, if we end
> up encoding usage transition information in capability sets, how that
> would translate to modifiers?
>
> I assume display doesn't really care about a lot of the data capability
> sets may encode, but is 

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-28 Thread Miguel Angel Vico
(Adding dri-devel back, and trying to respond to some comments from
the different forks)

James Jones wrote:

> Your worst case analysis above isn't far off from our HW, give or take
> some bits and axes here and there.  We've started an internal discussion
> about how to lay out all the bits we need.  It's hard to even enumerate
> them all without having a complete understanding of what capability sets
> are going to include, a fully-optimized implementation of the mechanism
> on our HW, and lots of test scenarios though.

(thanks James for most of the info below)

To elaborate a bit, if we want to share an allocation across GPUs for 3D
rendering, it seems we would need 12 bits to express our
swizzling/tiling memory layouts for fermi+. In addition to that,
maxwell uses 3 more bits for this, and we need an extra bit to identify
pre-fermi representations.

We also need one bit to differentiate between Tegra and desktop, and
another one to indicate whether the layout is otherwise linear.

Then things like whether compression is used (one more bit), and we can
probably get by with 3 bits for the type of compression if we are
creative. However, it'd be way easier to just track arch + page kind,
which would be like 32 bits on its own.

Whether Z-culling and/or zero-bandwidth-clears are used may be another 3
bits.

If device-local properties are included, we might need a couple more
bits for caching.

We may also need to express locality information, which may take at
least another 2 or 3 bits.

If we want to share array textures too, we also need to pass the array
pitch. Is it supposed to be encoded in a modifier too? That's 64 bits on
its own.

So yes, as James mentioned, with some effort, we could technically fit
our current allocation parameters in a modifier, but I'm still not
convinced this is as future proof as it could be as our hardware grows
in capabilities.
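
To put rough numbers on the list above, here is a quick tally. It is only
a sketch: the field names are made up for illustration, "a couple" is
counted as 2 and "2 or 3" as 3, and the 56-bit budget assumes the top 8
bits of a 64-bit modifier stay reserved for the vendor code, as in
drm_fourcc.h.

```python
# Rough tally of the bit counts listed above. Field names are illustrative
# only; they do not correspond to any real NVIDIA or DRM definition.
MODIFIER_BITS = 64
VENDOR_BITS = 8                               # top byte names the vendor
PAYLOAD_BITS = MODIFIER_BITS - VENDOR_BITS    # 56 bits left for the vendor

layout_fields = {
    "fermi_plus_tiling": 12,
    "maxwell_extra": 3,
    "pre_fermi_flag": 1,
    "tegra_vs_desktop": 1,
    "linear_flag": 1,
    "compression_enabled": 1,
    "compression_type": 3,
    "zcull_and_zbc": 3,
    "caching": 2,        # "a couple more bits", assumed to be 2
    "locality": 3,       # "2 or 3 bits", rounded up
}


def total_bits(fields):
    """Sum the widths of all fields in the table."""
    return sum(fields.values())


packed = total_bits(layout_fields)
with_array_pitch = packed + 64    # the array pitch is 64 bits on its own

print("layout fields:", packed, "bits")
print("headroom in a modifier:", PAYLOAD_BITS - packed, "bits")
print("fits with array pitch:", with_array_pitch <= PAYLOAD_BITS)
```

Under these assumptions the layout fields alone leave some headroom, and
the array pitch is what pushes the total past a single modifier.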


Daniel Stone wrote:

> So I reflexively
> get a bit itchy when I see the kernel being used to transit magic
> blobs of data which are supplied by userspace, and only interpreted by
> different userspace. Having tiling formats hidden away means that
> we've had real-world bugs in AMD hardware, where we end up displaying
> garbage because we cannot generically reason about the buffer
> attributes.  

I'm a bit confused. Can't modifiers be specified by vendors and only
interpreted by drivers? My understanding was that modifiers could
actually be treated as opaque 64-bit data, in which case they would
qualify as "magic blobs of data". Otherwise, it seems this wouldn't be
scalable. What am I missing?


Daniel Vetter wrote:

> I think in the interim figuring out how to expose kms capabilities
> better (and necessarily standardizing at least some of them which
> matter at the compositor level, like size limits of framebuffers)
> feels like the place to push the ecosystem forward. In some way
> Miguel's proposal looks a bit backwards, since it adds the pitch
> capabilities to addfb, but at addfb time you've allocated everything
> already, so way too late to fix things up. With modifiers we've added
> a very simple per-plane property to list which modifiers can be
> combined with which pixel formats. Tiny start, but obviously very far
> from all that we'll need.  

Not sure whether I might be misunderstanding your statement, but one of
the allocator's main features is negotiation of nearly optimal allocation
parameters given a set of uses on different devices/engines by the
capability merge operation. A client should have queried what every
device/engine is capable of for the given uses, found the optimal set of
capabilities, and used it for allocating a buffer. By the time these
parameters are given to KMS, they are expected to be good. If they
aren't, the client didn't do things right.
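
As a rough illustration of that flow, a merge of two devices' constraints
could look something like the sketch below. The Constraints type, the
merge rules (least common multiple for alignments, minimum for the
maximum pitch) and the example limits are all assumptions made for this
example, not the actual liballoc types or values.

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class Constraints:
    """Hypothetical per-device limits for one usage."""
    pitch_alignment: int     # bytes
    address_alignment: int   # bytes
    max_pitch: int           # bytes


def lcm(a, b):
    return a // gcd(a, b) * b


def merge(a, b):
    """Combine two constraint sets so the result satisfies both devices."""
    return Constraints(
        pitch_alignment=lcm(a.pitch_alignment, b.pitch_alignment),
        address_alignment=lcm(a.address_alignment, b.address_alignment),
        max_pitch=min(a.max_pitch, b.max_pitch),
    )


def choose_pitch(width, bytes_per_pixel, c):
    """Smallest pitch that fits one row and honours the merged constraints."""
    raw = width * bytes_per_pixel
    pitch = -(-raw // c.pitch_alignment) * c.pitch_alignment  # round up
    if pitch > c.max_pitch:
        raise ValueError("no pitch satisfies every device")
    return pitch


# Made-up limits for a render engine and a display engine.
render = Constraints(pitch_alignment=64, address_alignment=4096,
                     max_pitch=1 << 20)
display = Constraints(pitch_alignment=256, address_alignment=65536,
                      max_pitch=1 << 16)

merged = merge(render, display)
print(merged)
print("pitch for 1920 px at 4 bytes/px:", choose_pitch(1920, 4, merged))
```

The point is that the pitch is chosen from the merged result before
allocation, so by addfb time there should be nothing left to fix up.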


Rob Clark wrote:

> It does seem like, if possible, starting out with modifiers for now at
> the kernel interface would make life easier, vs trying to reinvent
> both kernel and userspace APIs at the same time.  Userspace APIs are
> easier to change or throw away.  Presumably by the time we get to the
> point of changing kernel uabi, we are already using, and pretty happy
> with, serialized liballoc data over the wire in userspace so it is
> only a matter of changing the kernel interface.  

I guess we can indeed start with modifiers for now, if that's what it
takes to get the allocator mechanisms rolling. However, it seems to me
that we won't be able to encode the same type of information included
in capability sets with modifiers in all cases. For instance, if we end
up encoding usage transition information in capability sets, how would
that translate to modifiers?

I assume display doesn't really care about a lot of the data capability
sets may encode, but is it correct to think of modifiers as things only
display needs? If we are to treat modifiers as a first-class citizen, I
would expect to use them beyond that.


Kristian Kristensen wrote:

> I agree and let me 

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-21 Thread Rob Clark
On Thu, Dec 21, 2017 at 3:36 AM, Daniel Vetter  wrote:
> On Thu, Dec 21, 2017 at 9:06 AM, James Jones  wrote:
>> However, making some assumptions, I suspect it's probably going to come down
>> to yes we can fit what we need in some number of bits marginally less than
>> 56 now, with the current use cases and hardware, but we're very concerned
>> about extensibility given the number has only ever grown in our HW, is
>> uncomfortably close to the limit if it isn't over it already, and it's been
>> demonstrated it takes a monumental effort to change the mechanism if it
>> isn't extensible.  While it's hard to change the mechanism one more time
>> now, better to change it to something truly extensible now because it will
>> be much, much harder to make such a change ~5 years from now in a world
>> where it's baked in to pervasively deployed Wayland and X protocol, the EGL
>> and Vulkan extensions have been defined for a few years and in use by apps
>> besides Wayland, and the allocator stuff is deployed on ~5 operating systems
>> that have some derivative version of DRM modifiers to support it and a bunch
>> of funky embedded apps using it.  Further, we're volunteering to handle the
>> bulk of the effort needed to make the change now, so I hope architectural
>> correctness and maintainability can be the primary points of debate.
>
> I think that's already happened. So no matter what we do, we're going
> to live with an ecosystem that uses modifiers all over the place in 5
> years. Even if it's not fully pervasive we will have to keep the
> support around for 10 years (at least on the kernel side).
>
> So the option is between revving the entire ecosystem now, or revving it
> in a few years when the current scheme has run out of steam for good.
> And I much prefer the 2nd option for the simple reason that by then
> the magic 8ball has gained another 5 years of clarity for looking into
> the future.

Drive by comment (and disclaimer, haven't had chance to read rest of
thread yet), but I think there is a reasonable path to increase the
modifier space to something like 2^^568 (minus the cases where
modifiers[0]==modifiers[1]==modifiers[2]==modifiers[3]).. (Yeah, yeah,
I'm sure there is a 640k should be enough joke here somewhere)

Fortunately currently modifiers array is at end of 'struct
drm_mode_fb_cmd2', so there may be some other options to extend it as
well.  Possibly reserving the modifier value ~0 now might be a good
idea.
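
For reference, a sketch of what that could look like, using a ctypes
mirror of the struct. The field list is written from memory of the uapi
header and should be checked against drm_mode.h; the escape value and the
uses_escape() helper are purely hypothetical, since nothing like them is
reserved today.

```python
import ctypes


class DrmModeFbCmd2(ctypes.Structure):
    """Mirror of struct drm_mode_fb_cmd2 (check against drm_mode.h)."""
    _fields_ = [
        ("fb_id", ctypes.c_uint32),
        ("width", ctypes.c_uint32),
        ("height", ctypes.c_uint32),
        ("pixel_format", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("handles", ctypes.c_uint32 * 4),
        ("pitches", ctypes.c_uint32 * 4),
        ("offsets", ctypes.c_uint32 * 4),
        ("modifier", ctypes.c_uint64 * 4),
    ]


# Hypothetical escape value (~0): "the real metadata lives elsewhere".
MODIFIER_ESCAPE = (1 << 64) - 1


def modifier_is_last(struct_type):
    """True if the modifier array ends exactly where the struct ends."""
    field = struct_type.modifier
    return field.offset + field.size == ctypes.sizeof(struct_type)


def uses_escape(cmd):
    """True if any plane carries the hypothetical escape modifier."""
    return any(m == MODIFIER_ESCAPE for m in cmd.modifier)


cmd = DrmModeFbCmd2()
cmd.modifier[0] = MODIFIER_ESCAPE
print("modifier array is last:", modifier_is_last(DrmModeFbCmd2))
print("escape in use:", uses_escape(cmd))
```

Because the array sits at the very end, a larger struct could append
extra data after it without moving any existing field.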

It does seem like, if possible, starting out with modifiers for now at
the kernel interface would make life easier, vs trying to reinvent
both kernel and userspace APIs at the same time.  Userspace APIs are
easier to change or throw away.  Presumably by the time we get to the
point of changing kernel uabi, we are already using, and pretty happy
with, serialized liballoc data over the wire in userspace so it is
only a matter of changing the kernel interface.

The downside of this is needing a per-driver userspace bit to map
liballoc to modifiers.  We kinda have this already in mesa, even for
the modesetting-only drivers that can be paired with a render-only
driver.

BR,
-R

> I think in the interim figuring out how to expose kms capabilities
> better (and necessarily standardizing at least some of them which
> matter at the compositor level, like size limits of framebuffers)
> feels like the place to push the ecosystem forward. In some way
> Miguel's proposal looks a bit backwards, since it adds the pitch
> capabilities to addfb, but at addfb time you've allocated everything
> already, so way too late to fix things up. With modifiers we've added
> a very simple per-plane property to list which modifiers can be
> combined with which pixel formats. Tiny start, but obviously very far
> from all that we'll need.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-21 Thread Daniel Vetter
On Thu, Dec 21, 2017 at 9:06 AM, James Jones  wrote:
> However, making some assumptions, I suspect it's probably going to come down
> to yes we can fit what we need in some number of bits marginally less than
> 56 now, with the current use cases and hardware, but we're very concerned
> about extensibility given the number has only ever grown in our HW, is
> uncomfortably close to the limit if it isn't over it already, and it's been
> demonstrated it takes a monumental effort to change the mechanism if it
> isn't extensible.  While it's hard to change the mechanism one more time
> now, better to change it to something truly extensible now because it will
> be much, much harder to make such a change ~5 years from now in a world
> where it's baked in to pervasively deployed Wayland and X protocol, the EGL
> and Vulkan extensions have been defined for a few years and in use by apps
> besides Wayland, and the allocator stuff is deployed on ~5 operating systems
> that have some derivative version of DRM modifiers to support it and a bunch
> of funky embedded apps using it.  Further, we're volunteering to handle the
> bulk of the effort needed to make the change now, so I hope architectural
> correctness and maintainability can be the primary points of debate.

I think that's already happened. So no matter what we do, we're going
to live with an ecosystem that uses modifiers all over the place in 5
years. Even if it's not fully pervasive we will have to keep the
support around for 10 years (at least on the kernel side).

So the option is between revving the entire ecosystem now, or revving it
in a few years when the current scheme has run out of steam for good.
And I much prefer the 2nd option for the simple reason that by then
the magic 8ball has gained another 5 years of clarity for looking into
the future.

I think in the interim figuring out how to expose kms capabilities
better (and necessarily standardizing at least some of them which
matter at the compositor level, like size limits of framebuffers)
feels like the place to push the ecosystem forward. In some way
Miguel's proposal looks a bit backwards, since it adds the pitch
capabilities to addfb, but at addfb time you've allocated everything
already, so way too late to fix things up. With modifiers we've added
a very simple per-plane property to list which modifiers can be
combined with which pixel formats. Tiny start, but obviously very far
from all that we'll need.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-21 Thread James Jones

On 12/20/2017 01:58 PM, Daniel Stone wrote:
> Hi Miguel,
>
> On 20 December 2017 at 16:51, Miguel Angel Vico  wrote:
>> In the meantime, I've been working on putting together an open source
>> implementation of the allocator mechanisms using the Nouveau driver for
>> all to be able to play with.
>
> Thanks for taking a look at this! I'm still winding out my to-do list
> for the year, but hoping to get to this more seriously soon.
>
> As a general comment, now that modifiers are a first-class concept in
> many places (KMS FBs, KMS plane format advertisement, V4L2 buffers,
> EGL/Vulkan image import/export, Wayland buffer import, etc), I'd like
> to see them included as a first-class concept in the allocator. I
> understand one of the primary reservations against using them was that
> QNX didn't have such a concept, but just specifying them to be ignored
> on non-Linux platforms would probably work fine.


The allocator mechanisms and format modifiers are orthogonal though. 
Either capability sets can be represented using format modifiers (the 
direction one part of this thread is suggesting, which I think is a bad 
idea), or format modifiers could easily be included as a vendor-agnostic 
capability, similar to pitch layout.  There are no "First class 
citizens" in the allocator mechanism itself.  That's the whole idea: 
Apps don't need to care about things like how the OS represents its 
surface metadata beyond some truly universal things like width and 
height (assertions).  The rest is abstracted away such that the apps are 
portable, even if the drivers/backends aren't.  Even if the solution 
within Linux is "just use format modifiers", there's still some benefit 
to making the kernel ABI use something slightly higher level that 
translates to DRM format modifiers inside the kernel, just to keep the 
apps OS-agnostic.



>> Another of the missing pieces before we can move this to production is
>> importing allocations to DRM FB objects. This is probably one of the
>> most sensitive parts of the project as it requires modification/addition
>> of kernel driver interfaces.
>>
>> At XDC2017, James had several hallway conversations with several people
>> about this, all having different opinions. I'd like to take this
>> opportunity to also start a discussion about what's the best option to
>> create a path to get allocator allocations added as DRM FB objects.
>>
>> These are the few options we've considered to start with:
>>
>>   A) Have vendor-private ioctls to set properties on GEM objects that
>>      are inherited by the FB objects. This is how our (NVIDIA) desktop
>>      DRM driver currently works. This would require every vendor to add
>>      their own ioctl to process allocator metadata, but the metadata is
>>      actually a vendor-agnostic object more like DRM modifiers. We'd
>>      like to come up with a vendor-agnostic solution that can be
>>      integrated to core DRM.
>
> This worries me. If the data is static for the lifetime of the buffer
> - describing the tiling layout, for instance - then it would form
> effective ABI for all the consumers/producers using that buffer type.
> If it is dynamic, you also have a world of synchronisation problems
> when multiple users race each other with different uses of that buffer
> (and presumably you would need to reload the metadata on every use?).
> Either way, anyone using this would need to have a very well-developed
> compatibility story, given that you can mix and match kernel and
> userspace versions.


I think the metadata is static.  The surface meta-state is not, but that 
would be a commit time thing if anything, not a GEM or FB object thing. 
Still attaching metadata to GEM objects, which seem to be opaque blobs 
of memory in the general case, rather than attaching it to FB's mapped 
onto the GEM objects always felt architecturally wrong to me.  You can 
have multiple FBs in one GEM object, for example.  There's no reason to 
assume they would share the same format let alone tiling layout.



>>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
>>      metadata blobs for each plane of the FB. Some people in the
>>      community have mentioned this is their preferred design. This,
>>      however, means we'd have to go through the exercise of adding
>>      another metadata mechanism to the whole graphics stack.
>
> Similarly, this seems to be missing either a 'mandatory' flag so
> userspace can inform the kernel it must fail if it does not understand
> certain capabilities, or a way for the kernel to inform userspace
> which capabilities it does/doesn't understand.


I think that will fall out of the discussion over exactly what 
capability sets look like.  Regardless, yes, the kernel must fail if it 
can't support a given capability set, just as it would fail if it 
couldn't support a given DRM Format modifier.  Like the format 
modifiers, the userspace allocator driver would have queried the DRM 
kernel driver when reporting supported capability sets for a usage that 
required creating FBs, so it would always be user error to r

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-21 Thread Daniel Vetter
On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen
 wrote:
> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico 
> wrote:
>>
>> Inline.
>>
>> On Wed, 20 Dec 2017 11:54:10 -0800
>> Kristian Høgsberg  wrote:
>>
>> > On Wed, Dec 20, 2017 at 11:51 AM, Daniel Vetter  wrote:
>> > > Since this also involves the kernel let's add dri-devel ...
>>
>> Yeah, I forgot. Thanks Daniel!
>>
>> > >
>> > > On Wed, Dec 20, 2017 at 5:51 PM, Miguel Angel Vico
>> > >  wrote:
>> > >> Hi all,
>> > >>
>> > >> As many of you already know, I've been working with James Jones on
>> > >> the Generic Device Allocator project lately. He started a discussion
>> > >> thread some weeks ago seeking feedback on the current prototype of
>> > >> the library and advice on how to move all this forward, from a
>> > >> prototype stage to production. For further reference, see:
>> > >>
>> > >>
>> > >> https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
>> > >>
>> > >> From the thread above, we came up with very interesting high level
>> > >> design ideas for one of the currently missing parts in the library:
>> > >> Usage transitions. That's something I'll personally work on during
>> > >> the following weeks.
>> > >>
>> > >>
>> > >> In the meantime, I've been working on putting together an open source
>> > >> implementation of the allocator mechanisms using the Nouveau driver
>> > >> for all to be able to play with.
>> > >>
>> > >> Below I'm seeking feedback on a bunch of changes I had to make to
>> > >> different components of the graphics stack:
>> > >>
>> > >> ** Allocator **
>> > >>
>> > >>   An allocator driver implementation on top of Nouveau. The current
>> > >>   implementation only handles pitch linear layouts, but that's enough
>> > >>   to have the kmscube port working using the allocator and Nouveau
>> > >>   drivers.
>> > >>
>> > >>   You can pull these changes from
>> > >>
>> > >>
>> > >> https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
>> > >>
>> > >> ** Mesa **
>> > >>
>> > >>   James's kmscube port to use the allocator relies on the
>> > >>   EXT_external_objects extension to import allocator allocations to
>> > >>   OpenGL as a texture object. However, the Nouveau implementation of
>> > >>   these mechanisms is missing in Mesa, so I went ahead and added
>> > >>   them.
>> > >>
>> > >>   You can pull these changes from
>> > >>
>> > >>
>> > >> https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-nouveau
>> > >>
>> > >>   Also, James's kmscube port uses the NVX_unix_allocator_import
>> > >>   extension to attach allocator metadata to texture objects so the
>> > >>   driver knows how to deal with the imported memory.
>> > >>
>> > >>   Note that there isn't a formal spec for this extension yet. For
>> > >>   now, it just serves as an experimental mechanism to import
>> > >>   allocator memory in OpenGL, and attach metadata to texture objects.
>> > >>
>> > >>   You can pull these changes (written on top of the above) from:
>> > >>
>> > >>
>> > >> https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_import
>> > >>
>> > >> ** kmscube **
>> > >>
>> > >>   Mostly minor fixes and improvements on top of James's port to use
>> > >>   the allocator. Main thing is the allocator initialization path will
>> > >>   use EGL_MESA_platform_surfaceless if EGLDevice platform isn't
>> > >>   supported by the underlying EGL implementation.
>> > >>
>> > >>   You can pull these changes from:
>> > >>
>> > >>
>> > >> https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
>> > >>
>> > >>
>> > >> With all the above you should be able to get kmscube working using
>> > >> the allocator on top of the Nouveau driver.
>> > >>
>> > >>
>> > >> Another of the missing pieces before we can move this to production
>> > >> is importing allocations to DRM FB objects. This is probably one of
>> > >> the most sensitive parts of the project as it requires
>> > >> modification/addition of kernel driver interfaces.
>> > >>
>> > >> At XDC2017, James had several hallway conversations with several
>> > >> people about this, all having different opinions. I'd like to take
>> > >> this opportunity to also start a discussion about what's the best
>> > >> option to create a path to get allocator allocations added as DRM FB
>> > >> objects.
>> > >>
>> > >> These are the few options we've considered to start with:
>> > >>
>> > >>   A) Have vendor-private ioctls to set properties on GEM objects that
>> > >>  are inherited by the FB objects. This is how our (NVIDIA)
>> > >>  desktop DRM driver currently works. This would require every
>> > >>  vendor to add their own ioctl to process allocator metadata, but
>> > >>  the metadata is actually a vendor-agnostic object more like DRM
>> > >>  modifiers. We'd like to come up with a vendor-agnostic solution that

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-20 Thread Ilia Mirkin
On Wed, Dec 20, 2017 at 6:22 PM, Kristian Kristensen
 wrote:
> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico 
> wrote:
>> On Wed, 20 Dec 2017 11:54:10 -0800
>> Kristian Høgsberg  wrote:
>> > I'd like to see concrete examples of actual display controllers
>> > supporting more format layouts than what can be specified with a 64
>> > bit modifier.
>>
>> The main problem is our tiling and other metadata parameters can't
>> generally fit in a modifier, so we find passing a blob of metadata a
>> more suitable mechanism.
>
>
> I understand that you may have n knobs with a total of more than
> 56 bits that configure your tiling/swizzling for color buffers. What I don't
> buy is that you need all those combinations when passing buffers around
> between codecs, cameras and display controllers. Even if you're sharing
> between the same 3D drivers in different processes, I expect just locking
> down, say, 64 different combinations (you can add more over time) and
> assigning each a modifier would be sufficient. I doubt you'd extract
> meaningful performance gains from going all the way to a blob.

There's probably a world of stuff that we don't know about in nouveau,
but I have a hard time coming up with more than 64-bits worth of
tiling info for dGPU surfaces...

There's 8 bits (sorta, not fully populated, but might as well use
them) of "micro" tiling which is done at the PTE level by the memory
controller and includes compression settings, and then there's 4 bits
of tiling per dimension for macro blocks (which configures different
sizes for each dimension for tile sizes) -- that's only 20 bits. MSAA
level (which is part of the micro tiling setting usually, but may not
necessarily have to be) - another couple of bits, maybe something else
weird for another few bits. Anyways, this is *nowhere* close to 64
bits.
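
For what it's worth, packing those fields into a modifier could be
sketched like this. The field order and names are invented for the
example (this is not an existing nouveau or NVIDIA layout); only the
vendor code in the top 8 bits follows the drm_fourcc.h convention.

```python
VENDOR_SHIFT = 56
PAYLOAD_MASK = (1 << VENDOR_SHIFT) - 1
VENDOR_NVIDIA = 0x03    # DRM_FORMAT_MOD_VENDOR_NVIDIA in drm_fourcc.h

# (name, width in bits), lowest bits first. Hypothetical layout.
FIELDS = [
    ("micro", 8),      # PTE-level "micro" tiling / compression kind
    ("macro_x", 4),    # macro block tiling, one field per dimension
    ("macro_y", 4),
    ("macro_z", 4),
    ("msaa", 2),       # MSAA level, if it is kept separate
]


def pack(vendor, **values):
    """Build a 64-bit modifier from the vendor code and named fields."""
    payload, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        payload |= value << shift
        shift += width
    return (vendor << VENDOR_SHIFT) | (payload & PAYLOAD_MASK)


def unpack(modifier):
    """Split a modifier back into the vendor code and named fields."""
    out = {"vendor": modifier >> VENDOR_SHIFT}
    payload, shift = modifier & PAYLOAD_MASK, 0
    for name, width in FIELDS:
        out[name] = (payload >> shift) & ((1 << width) - 1)
        shift += width
    return out


mod = pack(VENDOR_NVIDIA, micro=0xFE, macro_x=2, macro_y=4, macro_z=1, msaa=2)
print(hex(mod))
print(unpack(mod))
print("bits used:", sum(width for _, width in FIELDS), "of 56")
```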

What am I missing?

  -ilia


Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-20 Thread Daniel Stone
Hi Miguel,

On 20 December 2017 at 16:51, Miguel Angel Vico  wrote:
> In the meantime, I've been working on putting together an open source
> implementation of the allocator mechanisms using the Nouveau driver for
> all to be able to play with.

Thanks for taking a look at this! I'm still winding out my to-do list
for the year, but hoping to get to this more seriously soon.

As a general comment, now that modifiers are a first-class concept in
many places (KMS FBs, KMS plane format advertisement, V4L2 buffers,
EGL/Vulkan image import/export, Wayland buffer import, etc), I'd like
to see them included as a first-class concept in the allocator. I
understand one of the primary reservations against using them was that
QNX didn't have such a concept, but just specifying them to be ignored
on non-Linux platforms would probably work fine.

> Another of the missing pieces before we can move this to production is
> importing allocations to DRM FB objects. This is probably one of the
> most sensitive parts of the project as it requires modification/addition
> of kernel driver interfaces.
>
> At XDC2017, James had several hallway conversations with several people
> about this, all having different opinions. I'd like to take this
> opportunity to also start a discussion about what's the best option to
> create a path to get allocator allocations added as DRM FB objects.
>
> These are the few options we've considered to start with:
>
>   A) Have vendor-private ioctls to set properties on GEM objects that
>  are inherited by the FB objects. This is how our (NVIDIA) desktop
>  DRM driver currently works. This would require every vendor to add
>  their own ioctl to process allocator metadata, but the metadata is
>  actually a vendor-agnostic object more like DRM modifiers. We'd
>  like to come up with a vendor-agnostic solution that can be
>  integrated to core DRM.

This worries me. If the data is static for the lifetime of the buffer
- describing the tiling layout, for instance - then it would form
effective ABI for all the consumers/producers using that buffer type.
If it is dynamic, you also have a world of synchronisation problems
when multiple users race each other with different uses of that buffer
(and presumably you would need to reload the metadata on every use?).
Either way, anyone using this would need to have a very well-developed
compatibility story, given that you can mix and match kernel and
userspace versions.

>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
>  metadata blobs for each plane of the FB. Some people in the
>  community have mentioned this is their preferred design. This,
>  however, means we'd have to go through the exercise of adding
>  another metadata mechanism to the whole graphics stack.

Similarly, this seems to be missing either a 'mandatory' flag so
userspace can inform the kernel it must fail if it does not understand
certain capabilities, or a way for the kernel to inform userspace
which capabilities it does/doesn't understand.
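
To make the 'mandatory' idea concrete, a minimal sketch of the behaviour
I have in mind follows. The capability names, the (name, mandatory) pair
format and the import_capabilities() helper are hypothetical and only
model the decision, not any real kernel interface.

```python
def import_capabilities(requested, understood):
    """Return (applied, ignored), or raise on an unknown mandatory entry.

    requested:  list of (name, mandatory) pairs supplied by userspace
    understood: set of capability names this hypothetical kernel knows
    """
    applied, ignored = [], []
    for name, mandatory in requested:
        if name in understood:
            applied.append(name)
        elif mandatory:
            # Unknown and required: fail instead of scanning out garbage.
            raise ValueError(f"unsupported mandatory capability: {name}")
        else:
            # Unknown but optional: safe to skip.
            ignored.append(name)
    return applied, ignored


kernel_knows = {"pitch_linear", "tiled"}
request = [("pitch_linear", True), ("vendor_hint", False)]
print(import_capabilities(request, kernel_knows))
```

Returning the ignored list is one way the kernel could also tell
userspace which capabilities it did not understand.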

The capabilities in the example are also very oddly chosen. Address
alignment, pitch alignment, and maximum pitch are superfluous: the KMS
driver is the single source of truth for these values for FBs, so it
isn't useful for userspace to provide it. Specifically for pitch
alignment and maximum pitch, the pitch values are already given in the
same ioctl, so all you can check with these values (before the driver
does its own check again) is that userspace is self-consistent. These
three capabilities all relate to BO allocation rather than FB
creation: if a BO is rendered into with the wrong pitch, or allocated
at the wrong base address, we've already lost because the allocation
was incorrect.
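
A small sketch of why the userspace-supplied limits add nothing at FB
creation time. The Limits type and all numbers are invented for
illustration; the only point is that a pitch can agree with the limits
userspace passed in and still be rejected by the driver's own limits.

```python
from collections import namedtuple

# Hypothetical limit set: alignment and maximum pitch, both in bytes.
Limits = namedtuple("Limits", "pitch_alignment max_pitch")


def satisfies(pitch, limits):
    """True if the pitch honours the given alignment and maximum."""
    return (pitch > 0
            and pitch % limits.pitch_alignment == 0
            and pitch <= limits.max_pitch)


claimed = Limits(pitch_alignment=64, max_pitch=1 << 16)   # from userspace
driver = Limits(pitch_alignment=256, max_pitch=1 << 15)   # KMS driver's own

pitch = 7744   # 121 * 64 bytes

# This only shows that userspace is consistent with itself.
print("matches claimed limits:", satisfies(pitch, claimed))
# This is the check that decides whether the FB can be created.
print("matches driver limits: ", satisfies(pitch, driver))
```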

Did you have some other capabilities in mind which would be more
relevant to FBs?

>   C) Shove allocator metadata into DRM by defining it to be a separate
>  plane in the image, and using the existing DRM modifiers mechanism
>  to indicate there is another plane for each "real" plane added. It
>  isn't clear how this scales to surfaces that already need several
>  planes, but there are some people that see this as the only way
>  forward. Also, we would have to create a separate GEM buffer for
>  the metadata itself, which seems excessive.

I also have my reservations about this one. The general idea behind
FBs is that, if the buffers are identical but for memory addresses and
pixel content, the parameters should be equal but the per-plane buffer
contents different. Conversely, if the buffers differ in any way but
the above, the parameters should be different. For instance, if
buffers have identical layouts (tiling/swizzling/compression),
identical pixel content once interpreted, but the only thing which
differs is the compression status (fully resolved / not resolved), I
would expect to see identical parameters and differing data in the
auxiliary compression plane. We had quite 

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-20 Thread Miguel Angel Vico
Inline.

On Wed, 20 Dec 2017 11:54:10 -0800
Kristian Høgsberg  wrote:

> On Wed, Dec 20, 2017 at 11:51 AM, Daniel Vetter  wrote:
> > Since this also involves the kernel let's add dri-devel ...

Yeah, I forgot. Thanks Daniel!

> >
> > On Wed, Dec 20, 2017 at 5:51 PM, Miguel Angel Vico  
> > wrote:  
> >> Hi all,
> >>
> >> As many of you already know, I've been working with James Jones on the
> >> Generic Device Allocator project lately. He started a discussion thread
> >> some weeks ago seeking feedback on the current prototype of the library
> >> and advice on how to move all this forward, from a prototype stage to
> >> production. For further reference, see:
> >>
> >>
> >> https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
> >>
> >> From the thread above, we came up with very interesting high level
> >> design ideas for one of the currently missing parts in the library:
> >> Usage transitions. That's something I'll personally work on during the
> >> following weeks.
> >>
> >>
> >> In the meantime, I've been working on putting together an open source
> >> implementation of the allocator mechanisms using the Nouveau driver for
> >> all to be able to play with.
> >>
> >> Below I'm seeking feedback on a bunch of changes I had to make to
> >> different components of the graphics stack:
> >>
> >> ** Allocator **
> >>
> >>   An allocator driver implementation on top of Nouveau. The current
> >>   implementation only handles pitch linear layouts, but that's enough
> >>   to have the kmscube port working using the allocator and Nouveau
> >>   drivers.
> >>
> >>   You can pull these changes from
> >>
> >>   
> >> https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
> >>
> >> ** Mesa **
> >>
> >>   James's kmscube port to use the allocator relies on the
> >>   EXT_external_objects extension to import allocator allocations to
> >>   OpenGL as a texture object. However, the Nouveau implementation of
> >>   these mechanisms is missing in Mesa, so I went ahead and added them.
> >>
> >>   You can pull these changes from
> >>
> >>   
> >> https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-nouveau
> >>
> >>   Also, James's kmscube port uses the NVX_unix_allocator_import
> >>   extension to attach allocator metadata to texture objects so the
> >>   driver knows how to deal with the imported memory.
> >>
> >>   Note that there isn't a formal spec for this extension yet. For now,
> >>   it just serves as an experimental mechanism to import allocator
> >>   memory in OpenGL, and attach metadata to texture objects.
> >>
> >>   You can pull these changes (written on top of the above) from:
> >>
> >>   
> >> https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_import
> >>
> >> ** kmscube **
> >>
> >>   Mostly minor fixes and improvements on top of James's port to use the
> >>   allocator. Main thing is the allocator initialization path will use
> >>   EGL_MESA_platform_surfaceless if EGLDevice platform isn't supported
> >>   by the underlying EGL implementation.
> >>
> >>   You can pull these changes from:
> >>
> >>   
> >> https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
> >>
> >>
> >> With all the above you should be able to get kmscube working using the
> >> allocator on top of the Nouveau driver.
> >>
> >>
> >> Another of the missing pieces before we can move this to production is
> >> importing allocations to DRM FB objects. This is probably one of the
> >> most sensitive parts of the project as it requires modification/addition
> >> of kernel driver interfaces.
> >>
> >> At XDC2017, James had several hallway conversations with several people
> >> about this, all having different opinions. I'd like to take this
> >> opportunity to also start a discussion about what's the best option to
> >> create a path to get allocator allocations added as DRM FB objects.
> >>
> >> These are the few options we've considered to start with:
> >>
> >>   A) Have vendor-private ioctls to set properties on GEM objects that
> >>  are inherited by the FB objects. This is how our (NVIDIA) desktop
> >>  DRM driver currently works. This would require every vendor to add
> >>  their own ioctl to process allocator metadata, but the metadata is
> >>  actually a vendor-agnostic object more like DRM modifiers. We'd
> >>  like to come up with a vendor-agnostic solution that can be
> >>  integrated into core DRM.
> >>
> >>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
> >>  metadata blobs for each plane of the FB. Some people in the
> >>  community have mentioned this is their preferred design. This,
> >>  however, means we'd have to go through the exercise of adding
> >>  another metadata mechanism to the whole graphics stack.
> >>
> >>   C) Shove allocator metadata into DRM by defining it to be a separate
> >>  plane in the image, and using the existing DRM mod

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-20 Thread Kristian Høgsberg
On Wed, Dec 20, 2017 at 11:51 AM, Daniel Vetter wrote:
> Since this also involves the kernel let's add dri-devel ...
>
> On Wed, Dec 20, 2017 at 5:51 PM, Miguel Angel Vico wrote:
>> Hi all,
>>
>> As many of you already know, I've been working with James Jones on the
>> Generic Device Allocator project lately. He started a discussion thread
>> some weeks ago seeking feedback on the current prototype of the library
>> and advice on how to move all this forward, from a prototype stage to
>> production. For further reference, see:
>>
>>   https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
>>
>> From the thread above, we came up with very interesting high level
>> design ideas for one of the currently missing parts in the library:
>> Usage transitions. That's something I'll personally work on during the
>> following weeks.
>>
>>
>> In the meantime, I've been working on putting together an open source
>> implementation of the allocator mechanisms using the Nouveau driver for
>> all to be able to play with.
>>
>> Below I'm seeking feedback on a bunch of changes I had to make to
>> different components of the graphics stack:
>>
>> ** Allocator **
>>
>>   An allocator driver implementation on top of Nouveau. The current
>>   implementation only handles pitch linear layouts, but that's enough
>>   to have the kmscube port working using the allocator and Nouveau
>>   drivers.
>>
>>   You can pull these changes from
>>
>>   https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
>>
>> ** Mesa **
>>
>>   James's kmscube port to use the allocator relies on the
>>   EXT_external_objects extension to import allocator allocations to
>>   OpenGL as a texture object. However, the Nouveau implementation of
>>   these mechanisms is missing in Mesa, so I went ahead and added them.
>>
>>   You can pull these changes from
>>
>>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-nouveau
>>
>>   Also, James's kmscube port uses the NVX_unix_allocator_import
>>   extension to attach allocator metadata to texture objects so the
>>   driver knows how to deal with the imported memory.
>>
>>   Note that there isn't a formal spec for this extension yet. For now,
>>   it just serves as an experimental mechanism to import allocator
>>   memory in OpenGL, and attach metadata to texture objects.
>>
>>   You can pull these changes (written on top of the above) from:
>>
>>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_import
>>
>> ** kmscube **
>>
>>   Mostly minor fixes and improvements on top of James's port to use the
>>   allocator. Main thing is the allocator initialization path will use
>>   EGL_MESA_platform_surfaceless if EGLDevice platform isn't supported
>>   by the underlying EGL implementation.
>>
>>   You can pull these changes from:
>>
>>   https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
>>
>>
>> With all the above you should be able to get kmscube working using the
>> allocator on top of the Nouveau driver.
>>
>>
>> Another of the missing pieces before we can move this to production is
>> importing allocations to DRM FB objects. This is probably one of the
>> most sensitive parts of the project as it requires modification/addition
>> of kernel driver interfaces.
>>
>> At XDC2017, James had several hallway conversations with several people
>> about this, all having different opinions. I'd like to take this
>> opportunity to also start a discussion about what's the best option to
>> create a path to get allocator allocations added as DRM FB objects.
>>
>> These are the few options we've considered to start with:
>>
>>   A) Have vendor-private ioctls to set properties on GEM objects that
>>  are inherited by the FB objects. This is how our (NVIDIA) desktop
>>  DRM driver currently works. This would require every vendor to add
>>  their own ioctl to process allocator metadata, but the metadata is
>>  actually a vendor-agnostic object more like DRM modifiers. We'd
>>  like to come up with a vendor-agnostic solution that can be
>>  integrated into core DRM.
>>
>>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
>>  metadata blobs for each plane of the FB. Some people in the
>>  community have mentioned this is their preferred design. This,
>>  however, means we'd have to go through the exercise of adding
>>  another metadata mechanism to the whole graphics stack.
>>
>>   C) Shove allocator metadata into DRM by defining it to be a separate
>>  plane in the image, and using the existing DRM modifiers mechanism
>>  to indicate there is another plane for each "real" plane added. It
>>  isn't clear how this scales to surfaces that already need several
>>  planes, but there are some people that see this as the only way
>>  forward. Also, we would have to create a separate GEM buffer for
>>  the metadatad

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2017-12-20 Thread Daniel Vetter
Since this also involves the kernel let's add dri-devel ...

On Wed, Dec 20, 2017 at 5:51 PM, Miguel Angel Vico wrote:
> Hi all,
>
> As many of you already know, I've been working with James Jones on the
> Generic Device Allocator project lately. He started a discussion thread
> some weeks ago seeking feedback on the current prototype of the library
> and advice on how to move all this forward, from a prototype stage to
> production. For further reference, see:
>
>   https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
>
> From the thread above, we came up with very interesting high level
> design ideas for one of the currently missing parts in the library:
> Usage transitions. That's something I'll personally work on during the
> following weeks.
>
>
> In the meantime, I've been working on putting together an open source
> implementation of the allocator mechanisms using the Nouveau driver for
> all to be able to play with.
>
> Below I'm seeking feedback on a bunch of changes I had to make to
> different components of the graphics stack:
>
> ** Allocator **
>
>   An allocator driver implementation on top of Nouveau. The current
>   implementation only handles pitch linear layouts, but that's enough
>   to have the kmscube port working using the allocator and Nouveau
>   drivers.
>
>   You can pull these changes from
>
>   https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
>
> ** Mesa **
>
>   James's kmscube port to use the allocator relies on the
>   EXT_external_objects extension to import allocator allocations to
>   OpenGL as a texture object. However, the Nouveau implementation of
>   these mechanisms is missing in Mesa, so I went ahead and added them.
>
>   You can pull these changes from
>
>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-nouveau
>
>   Also, James's kmscube port uses the NVX_unix_allocator_import
>   extension to attach allocator metadata to texture objects so the
>   driver knows how to deal with the imported memory.
>
>   Note that there isn't a formal spec for this extension yet. For now,
>   it just serves as an experimental mechanism to import allocator
>   memory in OpenGL, and attach metadata to texture objects.
>
>   You can pull these changes (written on top of the above) from:
>
>   https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_import
>
> ** kmscube **
>
>   Mostly minor fixes and improvements on top of James's port to use the
>   allocator. Main thing is the allocator initialization path will use
>   EGL_MESA_platform_surfaceless if EGLDevice platform isn't supported
>   by the underlying EGL implementation.
>
>   You can pull these changes from:
>
>   https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
>
>
> With all the above you should be able to get kmscube working using the
> allocator on top of the Nouveau driver.
>
>
> Another of the missing pieces before we can move this to production is
> importing allocations to DRM FB objects. This is probably one of the
> most sensitive parts of the project as it requires modification/addition
> of kernel driver interfaces.
>
> At XDC2017, James had several hallway conversations with several people
> about this, all having different opinions. I'd like to take this
> opportunity to also start a discussion about what's the best option to
> create a path to get allocator allocations added as DRM FB objects.
>
> These are the few options we've considered to start with:
>
>   A) Have vendor-private ioctls to set properties on GEM objects that
>  are inherited by the FB objects. This is how our (NVIDIA) desktop
>  DRM driver currently works. This would require every vendor to add
>  their own ioctl to process allocator metadata, but the metadata is
>  actually a vendor-agnostic object more like DRM modifiers. We'd
>  like to come up with a vendor-agnostic solution that can be
>  integrated into core DRM.
>
>   B) Add a new drmModeAddFBWithMetadata() command that takes allocator
>  metadata blobs for each plane of the FB. Some people in the
>  community have mentioned this is their preferred design. This,
>  however, means we'd have to go through the exercise of adding
>  another metadata mechanism to the whole graphics stack.
>
>   C) Shove allocator metadata into DRM by defining it to be a separate
>  plane in the image, and using the existing DRM modifiers mechanism
>  to indicate there is another plane for each "real" plane added. It
>  isn't clear how this scales to surfaces that already need several
>  planes, but there are some people that see this as the only way
>  forward. Also, we would have to create a separate GEM buffer for
>  the metadata itself, which seems excessive.
>
> We personally like option (B) better, and have already started to
> prototype the new path (which is actually very similar to the
> drmModeA