date:20180904

[PATCH] drm/amd/powerplay: fix compile warning for wrong data type V2

2018-09-04 Thread Evan Quan

do_div expects the 1st argument in 64bit instead of 32bit.
Drop the usage of do_div as it seems unnecessary.

V2: drop usage of do_div completely

Change-Id: Id2032a43727e7f1fa516d3565354d412a561
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 3efd59e984a3..1e65ac01e0f5 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -1195,7 +1195,7 @@ static int vega20_set_sclk_od(
 	int ret = 0;
 
 	od_sclk = golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value * value;
-	do_div(od_sclk, 100);
+	od_sclk /= 100;
 	od_sclk += golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
 
 	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_GFXCLK_FMAX, od_sclk);
@@ -1242,7 +1242,7 @@ static int vega20_set_mclk_od(
 	int ret = 0;
 
 	od_mclk = golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value * value;
-	do_div(od_mclk, 100);
+	od_mclk /= 100;
 	od_mclk += golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value;
 
 	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX, od_mclk);
-- 
2.18.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
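
The arithmetic that the V2 change above leaves behind can be sketched in plain userspace C. This is only an illustration, not the driver code: od_clock_from_percent is a hypothetical helper name, golden stands in for the top entry of the golden DPM table, percent stands in for the value argument, and both are assumed to be 32-bit, as the review question in this thread suggests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the driver): compute the overdriven clock
 * from the highest "golden" DPM level and a percentage, mirroring the
 * V2 hunk: od = golden * percent; od /= 100; od += golden;
 *
 * Everything here is 32-bit, so a plain '/' is enough and do_div(),
 * which requires a 64-bit dividend, is not needed.
 */
static uint32_t od_clock_from_percent(uint32_t golden, uint32_t percent)
{
	uint32_t od;

	/* Assumption: golden * percent fits in 32 bits. */
	assert(percent == 0 || golden <= UINT32_MAX / percent);

	od = golden * percent;
	od /= 100;
	od += golden;

	return od;
}
```

With a golden clock of 1000 and a 10 percent overdrive this yields 1100. The assert documents the overflow assumption of the sketch; it is not something the patch itself checks.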

RE: [PATCH] drm/amd/powerplay: fix compile warning for wrong data type

2018-09-04 Thread Quan, Evan



> -Original Message-
> From: Deucher, Alexander
> Sent: September 5, 2018 12:33
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Quan, Evan 
> Subject: RE: [PATCH] drm/amd/powerplay: fix compile warning for wrong
> data type
> 
> > -Original Message-
> > From: amd-gfx  On Behalf Of
> > Evan Quan
> > Sent: Tuesday, September 4, 2018 10:21 PM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Quan, Evan 
> > Subject: [PATCH] drm/amd/powerplay: fix compile warning for wrong data
> > type
> >
> > do_div expects the 1st argument in 64bit instead of 32bit.
> 
> Do we actually need to use do_div here?  If both arguments are 32 bit, can't
> we just use regular division?
> 
[Quan, Evan] That's a good idea. Will update the patch accordingly.
> Alex
> 
> >
> > Change-Id: Id2032a43727e7f1fa516d3565354d412a561
> > Signed-off-by: Evan Quan 
> > ---
> >  drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> > b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> > index 3efd59e984a3..6ba5f328249d 100644
> > --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> > +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> > @@ -1191,14 +1191,14 @@ static int vega20_set_sclk_od(
> > &(data->dpm_table.gfx_table);
> > struct vega20_single_dpm_table *golden_sclk_table =
> > &(data->golden_dpm_table.gfx_table);
> > -   uint32_t od_sclk;
> > +   uint64_t od_sclk;
> > int ret = 0;
> >
> > od_sclk = golden_sclk_table->dpm_levels[golden_sclk_table->count
> > - 1].value * value;
> > do_div(od_sclk, 100);
> > od_sclk += golden_sclk_table->dpm_levels[golden_sclk_table-
> > >count - 1].value;
> >
> > -   ret = vega20_od8_set_settings(hwmgr,
> > OD8_SETTING_GFXCLK_FMAX, od_sclk);
> > +   ret = vega20_od8_set_settings(hwmgr,
> > OD8_SETTING_GFXCLK_FMAX, (uint32_t)od_sclk);
> > PP_ASSERT_WITH_CODE(!ret,
> > "[SetSclkOD] failed to set od gfxclk!",
> > return ret);
> > @@ -1238,14 +1238,14 @@ static int vega20_set_mclk_od(
> > &(data->dpm_table.mem_table);
> > struct vega20_single_dpm_table *golden_mclk_table =
> > &(data->golden_dpm_table.mem_table);
> > -   uint32_t od_mclk;
> > +   uint64_t od_mclk;
> > int ret = 0;
> >
> > od_mclk = golden_mclk_table->dpm_levels[golden_mclk_table-
> > >count - 1].value * value;
> > do_div(od_mclk, 100);
> > od_mclk += golden_mclk_table->dpm_levels[golden_mclk_table-
> > >count - 1].value;
> >
> > -   ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX,
> > od_mclk);
> > +   ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX,
> > (uint32_t)od_mclk);
> > PP_ASSERT_WITH_CODE(!ret,
> > "[SetMclkOD] failed to set od memclk!",
> > return ret);
> > --
> > 2.18.0
> >

RE: [PATCH] drm/amd/powerplay: fix compile warning for wrong data type

2018-09-04 Thread Deucher, Alexander

> -Original Message-
> From: amd-gfx  On Behalf Of Evan
> Quan
> Sent: Tuesday, September 4, 2018 10:21 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Quan, Evan 
> Subject: [PATCH] drm/amd/powerplay: fix compile warning for wrong data
> type
> 
> do_div expects the 1st argument in 64bit instead of 32bit.

Do we actually need to use do_div here?  If both arguments are 32 bit, can't we 
just use regular division?

Alex

> 
> Change-Id: Id2032a43727e7f1fa516d3565354d412a561
> Signed-off-by: Evan Quan 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> index 3efd59e984a3..6ba5f328249d 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
> @@ -1191,14 +1191,14 @@ static int vega20_set_sclk_od(
>   &(data->dpm_table.gfx_table);
>   struct vega20_single_dpm_table *golden_sclk_table =
>   &(data->golden_dpm_table.gfx_table);
> - uint32_t od_sclk;
> + uint64_t od_sclk;
>   int ret = 0;
> 
>   od_sclk = golden_sclk_table->dpm_levels[golden_sclk_table->count
> - 1].value * value;
>   do_div(od_sclk, 100);
>   od_sclk += golden_sclk_table->dpm_levels[golden_sclk_table-
> >count - 1].value;
> 
> - ret = vega20_od8_set_settings(hwmgr,
> OD8_SETTING_GFXCLK_FMAX, od_sclk);
> + ret = vega20_od8_set_settings(hwmgr,
> OD8_SETTING_GFXCLK_FMAX, (uint32_t)od_sclk);
>   PP_ASSERT_WITH_CODE(!ret,
>   "[SetSclkOD] failed to set od gfxclk!",
>   return ret);
> @@ -1238,14 +1238,14 @@ static int vega20_set_mclk_od(
>   &(data->dpm_table.mem_table);
>   struct vega20_single_dpm_table *golden_mclk_table =
>   &(data->golden_dpm_table.mem_table);
> - uint32_t od_mclk;
> + uint64_t od_mclk;
>   int ret = 0;
> 
>   od_mclk = golden_mclk_table->dpm_levels[golden_mclk_table-
> >count - 1].value * value;
>   do_div(od_mclk, 100);
>   od_mclk += golden_mclk_table->dpm_levels[golden_mclk_table-
> >count - 1].value;
> 
> - ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX,
> od_mclk);
> + ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX,
> (uint32_t)od_mclk);
>   PP_ASSERT_WITH_CODE(!ret,
>   "[SetMclkOD] failed to set od memclk!",
>   return ret);
> --
> 2.18.0
> 

Re: [PATCH 5/5] [RFC]drm: add syncobj timeline support v3

2018-09-04 Thread zhoucm1




On 2018-09-04 17:20, Christian König wrote:
On 04.09.2018 at 11:00, zhoucm1 wrote:
On 2018-09-04 16:42, Christian König wrote:
On 04.09.2018 at 10:27, zhoucm1 wrote:
On 2018-09-04 16:05, Christian König wrote:
On 04.09.2018 at 09:53, zhoucm1 wrote:

[SNIP]


How about this idea:
1. Each signaling point is a fence implementation with an rb node.
2. Each node keeps a reference to the last previously inserted node.
3. Each node is referenced by the sync object itself.
4. Before each signal/wait operation we do a garbage collection and
   remove the first node from the tree as long as it is signaled.
5. When enable_signaling is requested for a node we cascade that to
   the left using rb_prev.
   This ensures that signaling is enabled for the current fence as
   well as all previous fences.
6. A wait just looks into the tree for the signal point lower than
   or equal to the requested sequence number.
After rethinking your idea, I don't think it works, since the timeline
value is not a continuous line: signal pt values don't have to be
contiguous and can jump with each signal operation, like 1, 4, 8, 15,
19. E.g. there are five signal_pts,
signal_pt1->signal_pt4->signal_pt8->signal_pt15->signal_pt19. If a
wait pt is 7, do you mean this wait only needs signal_pt1 and
signal_pt4? That's certainly not right; we need to make sure the
timeline value is bigger than the wait pt value, which means
signal_pt8 is needed for wait_pt7.

That can be defined as we like it, e.g. when a wait operation asks
for 7 we can return 8 as well.
If it is defined like this, the problem comes back: if 8 is removed by
garbage collection, will you return 15?


The garbage collection is only done for signaled nodes. So when 8 is
already garbage collected and 7 is asked we know that we don't need
to return anything.
8 is a signaled node; if a waitA/signal operation does the garbage
collection, how does waitB(7) know the garbage collection history?

Well we of course keep what the last garbage collected number is,
don't we?


Since there is no timeline as a line, I think this is not the right
direction.

That is actually intended. There is no infinite timeline here, just
a window of the last not yet signaled fences.
No one said it's an infinite timeline; the timeline will stop
increasing when the syncobj is released.

Yeah, but the syncobj can live for a very very long time. Not having
some form of shrinking it when fences are signaled is certainly not
going to fly very far.

I will try to fix this problem.
btw, when I tried your suggestion, I found it will be difficult to
implement drm_syncobj_array_wait_timeout with your idea, since it needs
first_signaled. If there is an unsignaled syncobj, we will still
register a cb for the timeline value change, and then we are back to
needing enable_signaling.


Thanks,
David Zhou


Regards,
Christian.



Anyway, kref is a good way to solve the 'free' problem. I will try to
use it to improve my patch and will of course refer to your idea. :)


Thanks,
David Zhou


Otherwise you will never be able to release nodes from the tree 
since you always need to keep them around just in case somebody asks 
for a lower number.


Regards,
Christian.





The key is that as soon as a signal point is added, adding a
previous point is no longer allowed.

That's the intention.

Regards,
David Zhou




7. When the sync object is released we use
   rbtree_postorder_for_each_entry_safe() and drop the extra
   reference to each node, but never call rb_erase!
   This way the rb_tree stays in memory, but without a root
   (e.g. the sync object). It only destructs itself when the
   looked up references to the nodes are dropped.
And here, who will destroy the rb nodes, since no one calls
enable_signaling and there is no callback for them to free themselves?


The node will be destroyed when the last reference drops, not when 
enable_signaling is called.


In other words the sync_obj keeps the references to each tree 
object to provide the wait operation, as soon as the sync_obj is 
destroyed we don't need that functionality any more.


We don't even need to wait for anything to be signaled, this way 
we can drop all unused signal points as soon as the sync_obj is 
destroyed.


Only the used ones will stay alive and provide the necessary 
functionality to provide the signal for each wait operation.


Regards,
Christian.



Regards,
David Zhou


Well that is quite a bunch of logic, but I think that should
work fine.
Yeah, it could work; a simple timeline reference can also solve the
'free' problem.

I think this approach is still quite a bit better, e.g. you don't run
into circular dependency problems, it needs less memory and each node
always has the same size, which means we can use a kmem_cache for it.


Regards,
Christian.



Thanks,
David Zhou
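
The wait lookup and garbage collection argued over in this thread (points 1, 4, 8, 15, 19 and a wait for 7) can be sketched in userspace C. This is a deliberately simplified illustration, not the proposed kernel code: a small sorted array stands in for the rb tree, reference counting and enable_signaling are left out, and every name here (struct timeline, timeline_add, timeline_gc, timeline_wait_lookup, the wait_result values) is hypothetical. It follows the reading agreed on in the exchange, where a wait for 7 is served by point 8, and it keeps the last garbage collected value so that later waits know the history.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_POINTS 16

enum wait_result {
	WAIT_ALREADY_SIGNALED,	/* seq was garbage collected, nothing to wait on */
	WAIT_ON_POINT,		/* *out holds the value of the point to wait on */
	WAIT_NO_POINT_YET,	/* no point >= seq has been added so far */
};

struct signal_pt {
	uint64_t value;
	bool signaled;
};

struct timeline {
	struct signal_pt pts[MAX_POINTS];	/* ascending by value */
	size_t count;
	uint64_t last_collected;		/* newest value dropped by GC */
};

/* Append a point; values must strictly increase (no "previous" points). */
static bool timeline_add(struct timeline *t, uint64_t value)
{
	uint64_t newest = t->count ? t->pts[t->count - 1].value
				   : t->last_collected;

	if (t->count == MAX_POINTS || value <= newest)
		return false;
	t->pts[t->count++] = (struct signal_pt){ .value = value };
	return true;
}

/* Drop points from the front for as long as they are signaled. */
static void timeline_gc(struct timeline *t)
{
	size_t n = 0;

	while (n < t->count && t->pts[n].signaled)
		t->last_collected = t->pts[n++].value;
	for (size_t i = n; i < t->count; i++)
		t->pts[i - n] = t->pts[i];
	t->count -= n;
}

/* Collect garbage, then find the first point with value >= seq. */
static enum wait_result timeline_wait_lookup(struct timeline *t,
					     uint64_t seq, uint64_t *out)
{
	assert(out != NULL);
	timeline_gc(t);
	if (seq <= t->last_collected)
		return WAIT_ALREADY_SIGNALED;
	for (size_t i = 0; i < t->count; i++) {
		if (t->pts[i].value >= seq) {
			*out = t->pts[i].value;
			return WAIT_ON_POINT;
		}
	}
	return WAIT_NO_POINT_YET;
}
```

With points 1, 4, 8, 15, 19 a wait for 7 is directed to point 8. Once 1, 4 and 8 are signaled and collected, the same wait reports that nothing is left to wait on, and a wait for 9 is directed to 15.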







[PATCH] drm/amd/powerplay: fix compile warning for wrong data type

2018-09-04 Thread Evan Quan

do_div expects the 1st argument in 64bit instead of 32bit.

Change-Id: Id2032a43727e7f1fa516d3565354d412a561
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
index 3efd59e984a3..6ba5f328249d 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega20_hwmgr.c
@@ -1191,14 +1191,14 @@ static int vega20_set_sclk_od(
 		&(data->dpm_table.gfx_table);
 	struct vega20_single_dpm_table *golden_sclk_table =
 		&(data->golden_dpm_table.gfx_table);
-	uint32_t od_sclk;
+	uint64_t od_sclk;
 	int ret = 0;
 
 	od_sclk = golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value * value;
 	do_div(od_sclk, 100);
 	od_sclk += golden_sclk_table->dpm_levels[golden_sclk_table->count - 1].value;
 
-	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_GFXCLK_FMAX, od_sclk);
+	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_GFXCLK_FMAX, (uint32_t)od_sclk);
 	PP_ASSERT_WITH_CODE(!ret,
 		"[SetSclkOD] failed to set od gfxclk!",
 		return ret);
@@ -1238,14 +1238,14 @@ static int vega20_set_mclk_od(
 		&(data->dpm_table.mem_table);
 	struct vega20_single_dpm_table *golden_mclk_table =
 		&(data->golden_dpm_table.mem_table);
-	uint32_t od_mclk;
+	uint64_t od_mclk;
 	int ret = 0;
 
 	od_mclk = golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value * value;
 	do_div(od_mclk, 100);
 	od_mclk += golden_mclk_table->dpm_levels[golden_mclk_table->count - 1].value;
 
-	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX, od_mclk);
+	ret = vega20_od8_set_settings(hwmgr, OD8_SETTING_UCLK_FMAX, (uint32_t)od_mclk);
 	PP_ASSERT_WITH_CODE(!ret,
 		"[SetMclkOD] failed to set od memclk!",
 		return ret);
-- 
2.18.0
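
One C detail is worth spelling out for this first version of the patch: widening only the destination to uint64_t does not widen the multiplication. If both operands are 32-bit (an assumption here; the thread suggests it, but the declarations are not shown in the hunks), then on common platforms where int is 32 bits the product is still computed in 32 bits and truncated before the assignment. The helper names below are hypothetical, and the sketch is userspace C rather than driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Product computed in 32 bits, then widened: high bits are already lost. */
static uint64_t mul_then_widen(uint32_t a, uint32_t b)
{
	uint64_t r = a * b;

	return r;
}

/* One operand widened first: the product is computed in 64 bits. */
static uint64_t widen_then_mul(uint32_t a, uint32_t b)
{
	uint64_t r = (uint64_t)a * b;

	return r;
}

/* Self-check: 3000000 * 2000 = 6000000000 does not fit in 32 bits. */
static void widening_demo(void)
{
	assert(widen_then_mul(3000000u, 2000u) == 6000000000ull);
	assert(mul_then_widen(3000000u, 2000u) != 6000000000ull);
}
```

For values that stay within 32 bits both helpers agree, which is consistent with the review suggestion to keep everything 32-bit and use a regular division.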


Re: [PATCH libdrm] libdrm: Allow dynamic drm majors on linux

2018-09-04 Thread Dave Airlie

On Tue, 4 Sep 2018 at 03:00, Thomas Hellstrom  wrote:
>
> On 09/03/2018 06:33 PM, Daniel Vetter wrote:
> > On Mon, Sep 03, 2018 at 11:16:29AM +0200, Thomas Hellstrom wrote:
> >> On 08/31/2018 05:30 PM, Thomas Hellstrom wrote:
> >>> On 08/31/2018 05:27 PM, Emil Velikov wrote:
>  On 31 August 2018 at 15:38, Michel Dänzer  wrote:
> > [ Adding the amd-gfx list ]
> >
> > On 2018-08-31 3:05 p.m., Thomas Hellstrom wrote:
> >> On 08/31/2018 02:30 PM, Emil Velikov wrote:
> >>> On 31 August 2018 at 12:54, Thomas Hellstrom 
> >>> wrote:
>  To determine whether a device node is a drm device
>  node or not, the code
>  currently compares the node's major number to the static drm major
>  device
>  number.
> 
>  This breaks the standalone vmwgfx driver on XWayland dri clients,
> 
> >>> Any particular reason why the code doesn't use a fixed node there?
> >>> It will make the diff vs the in-kernel driver a bit smaller.
> >> Because then it won't be able to interoperate with other in-tree
> >> drivers, like virtual drm drivers or passthrough usb drm drivers.
> >> There is no clean way to share the minor number allocation
> >> with in-tree
> >> drm, so standalone vmwgfx is using dynamic major allocation.
> > I wonder why I haven't heard of any of these issues with the standalone
> > version of amdgpu shipped in packaged AMD releases. Does that
> > also use a
> > different major number? If yes, maybe it's just that nobody has tried
> > Xwayland clients with that driver. If no, how does it avoid the other
> > issues described above?
> >
>  AFAICT, the difference is that the standalone vmwgfx uses an internal
>  copy of drm core.
>  It doesn't reuse the in-kernel drm, hence it cannot know which minor
>  it can use.
> 
>  -Emil
> >>> Actually, standalone vmwgfx could perhaps also try to allocate minors
> >>> from 63 and downwards. That might work, but needs some verification.
> >>>
> >> So unfortunately this doesn't work since the in-tree drm's file operations
> >> are registered with the DRM_MAJOR.
> >> So I still think the patch is the way to go. If people are concerned that
> >> also fbdev file descriptors are allowed, perhaps there are other sysfs
> >> traits we can look at?
> > Somewhat out of curiosity, but why do you have to overwrite all of drm?
> > amdgpu seems to be able to pull their stunt off without ...
> > -Daniel
>
> At the time we launched the standalone vmwgfx, the DRM <-> driver
> interface was moving considerably more rapidly than the DRM <-> kernel
> interface. I think that's still the case. Hence less work for us. Also
> meant we can install the full driver stack with latest features on
> fairly old VMs without backported DRM functionality.
>

I think this should be fine for 99% of drm usage, there may be corner
cases in weird places, but I can't point to any that really matter
(maybe strace?)

Acked-by: Dave Airlie 

Dave.

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Daniel Vetter

On Tue, Sep 04, 2018 at 09:36:01PM +0200, Bas Nieuwenhuizen wrote:
> On Tue, Sep 4, 2018 at 9:28 PM Daniel Vetter  wrote:
> >
> > On Tue, Sep 4, 2018 at 8:31 PM, Bas Nieuwenhuizen
> >  wrote:
> > > On Tue, Sep 4, 2018 at 8:27 PM Daniel Vetter  wrote:
> > >>
> > >> On Tue, Sep 4, 2018 at 7:57 PM, Bas Nieuwenhuizen
> > >>  wrote:
> > >> > On Tue, Sep 4, 2018 at 7:48 PM Christian König
> > >> >  wrote:
> > >> >>
> > >> >> Am 04.09.2018 um 18:37 schrieb Daniel Vetter:
> > >> >> > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
> > >> >> >  wrote:
> > >> >> >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  
> > >> >> >> wrote:
> > >> >> >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
> > >> >> >>>  wrote:
> > >> >>  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  
> > >> >>  wrote:
> > >> >> > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen 
> > >> >> > wrote:
> > >> >> >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  
> > >> >> >> wrote:
> > >> >> >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König 
> > >> >> >>> wrote:
> > >> >>  Am 04.09.2018 um 12:15 schrieb Daniel Stone:
> > >> >> > Hi,
> > >> >> >
> > >> >> > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter 
> > >> >> >  wrote:
> > >> >> >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
> > >> >> >>  wrote:
> > >> >> >>> +/* The chip this is compatible with.
> > >> >> >>> + *
> > >> >> >>> + * If compression is disabled, use
> > >> >> >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
> > >> >> >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
> > >> >> >>> + *
> > >> >> >>> + * With compression enabled please use the exact chip.
> > >> >> >>> + *
> > >> >> >>> + * TODO: Do some generations share DCC format?
> > >> >> >>> + */
> > >> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
> > >> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  
> > >> >> >>> 0xff
> > >> >> >> Do you really need all the combinations here of DCC + gpu 
> > >> >> >> gen + tiling
> > >> >> >> details? When we had the entire discussion with nvidia 
> > >> >> >> folks they
> > >> >> >> eventually agreed that they don't need the massive pile 
> > >> >> >> with every
> > >> >> >> possible combination. Do you really plan to share all 
> > >> >> >> these different
> > >> >> >> things?
> > >> >> >>
> > >> >> >> Note that e.g. on i915 we spec some of the tiling 
> > >> >> >> depending upon
> > >> >> >> buffer size and buffer format (because that's how the hw 
> > >> >> >> works), not
> > >> >> >> using explicit modifier flags for everything.
> > >> >> > Right. The conclusion, after people went through and 
> > >> >> > started sorting
> > >> >> > out the kinds of formats for which they would _actually_ 
> > >> >> > export real
> > >> >> > colour buffers for, was that most vendors definitely have fewer 
> > >> >> > than
> > >> >> > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
> > >> >> > possible formats to represent, very likely fewer than
> > >> >> > 340,282,366,920,938,463,463,374,607,431,768,211,456 
> > >> >> > formats, probably
> > >> >> > fewer than 72,057,594,037,927,936 formats, and even still 
> > >> >> > generally
> > >> >> > fewer than 281,474,976,710,656 if you want to be generous 
> > >> >> > and leave 8
> > >> >> > bits of the 56 available.
> > >> >>  The problem here is that at least for some parameters we 
> > >> >>  actually don't know
> > >> >>  which formats are actually used.
> > >> >> 
> > >> >>  The following are not real world examples, but just to 
> > >> >>  explain the general
> > >> >>  problem.
> > >> >> 
> > >> >>  The memory configuration for example can be not ASIC 
> > >> >>  specific, but rather
> > >> >>  determined by whoever took the ASIC and glued it together 
> > >> >>  with VRAM on a
> > >> >>  board. It is not likely that somebody puts all the VRAM 
> > >> >>  chips on one
> > >> >>  channel, but it is still perfectly possible.
> > >> >> 
> > >> >>  Same is true for things like harvesting, e.g. of 16 channels 
> > >> >>  half of them
> > >> >>  could be bad and we need to know which to actually use.
> > >> >> >>> For my understanding: This leaks outside the chip when 
> > >> >> >>> sharing buffers?
> > >> >> >>> All the information you only need locally to a given amdgpu 
> > >> >> >>> instance
> > >> >> >>> doesn't really need to be encoded in modifiers.
> > >> >>
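
The two macros quoted from the RFC place an 8-bit chip generation field at bit 40 of the 64-bit modifier. How such a shift/mask pair is typically used can be sketched as follows. Only the two macro definitions come from the quoted patch; the helper names are hypothetical, and the sketch is userspace C rather than the proposed header.

```c
#include <assert.h>
#include <stdint.h>

/* From the quoted RFC patch. */
#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff

/* Hypothetical helper: store an 8-bit chip generation in a modifier. */
static uint64_t modifier_set_chip_gen(uint64_t modifier, uint8_t chip_gen)
{
	const uint64_t field = (uint64_t)AMDGPU_MODIFIER_CHIP_GEN_MASK
			       << AMDGPU_MODIFIER_CHIP_GEN_SHIFT;

	modifier &= ~field;	/* clear the old value */
	modifier |= ((uint64_t)chip_gen & AMDGPU_MODIFIER_CHIP_GEN_MASK)
		    << AMDGPU_MODIFIER_CHIP_GEN_SHIFT;
	return modifier;
}

/* Hypothetical helper: read the chip generation back out. */
static uint8_t modifier_get_chip_gen(uint64_t modifier)
{
	return (modifier >> AMDGPU_MODIFIER_CHIP_GEN_SHIFT)
	       & AMDGPU_MODIFIER_CHIP_GEN_MASK;
}

/* Round-trip check: a stored value reads back and stays below bit 56. */
static void chip_gen_roundtrip_check(uint8_t chip_gen)
{
	uint64_t m = modifier_set_chip_gen(0, chip_gen);

	assert(modifier_get_chip_gen(m) == chip_gen);
	assert(m < (UINT64_C(1) << 56));
}
```

The field occupies bits 40 to 47, so it stays inside the 56 bits that the thread describes as available for the format description.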

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Bas Nieuwenhuizen

On Tue, Sep 4, 2018 at 9:28 PM Daniel Vetter  wrote:
>
> On Tue, Sep 4, 2018 at 8:31 PM, Bas Nieuwenhuizen
>  wrote:
> > On Tue, Sep 4, 2018 at 8:27 PM Daniel Vetter  wrote:
> >>
> >> On Tue, Sep 4, 2018 at 7:57 PM, Bas Nieuwenhuizen
> >>  wrote:
> >> > On Tue, Sep 4, 2018 at 7:48 PM Christian König
> >> >  wrote:
> >> >>
> >> >> Am 04.09.2018 um 18:37 schrieb Daniel Vetter:
> >> >> > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
> >> >> >  wrote:
> >> >> >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  wrote:
> >> >> >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
> >> >> >>>  wrote:
> >> >>  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  
> >> >>  wrote:
> >> >> > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:
> >> >> >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  
> >> >> >> wrote:
> >> >> >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:
> >> >>  Am 04.09.2018 um 12:15 schrieb Daniel Stone:
> >> >> > Hi,
> >> >> >
> >> >> > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter 
> >> >> >  wrote:
> >> >> >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
> >> >> >>  wrote:
> >> >> >>> +/* The chip this is compatible with.
> >> >> >>> + *
> >> >> >>> + * If compression is disabled, use
> >> >> >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
> >> >> >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
> >> >> >>> + *
> >> >> >>> + * With compression enabled please use the exact chip.
> >> >> >>> + *
> >> >> >>> + * TODO: Do some generations share DCC format?
> >> >> >>> + */
> >> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
> >> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff
> >> >> >> Do you really need all the combinations here of DCC + gpu 
> >> >> >> gen + tiling
> >> >> >> details? When we had the entire discussion with nvidia folks 
> >> >> >> they
> >> >> >> eventually agreed that they don't need the massive pile with 
> >> >> >> every
> >> >> >> possible combination. Do you really plan to share all these 
> >> >> >> different
> >> >> >> things?
> >> >> >>
> >> >> >> Note that e.g. on i915 we spec some of the tiling depending 
> >> >> >> upon
> >> >> >> buffer size and buffer format (because that's how the hw 
> >> >> >> works), not
> >> >> >> using explicit modifier flags for everything.
> >> >> > Right. The conclusion, after people went through and started 
> >> >> > sorting
> >> >> > out the kinds of formats for which they would _actually_ 
> >> >> > export real
> >> >> > colour buffers for, was that most vendors definitely have fewer 
> >> >> > than
> >> >> > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
> >> >> > possible formats to represent, very likely fewer than
> >> >> > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, 
> >> >> > probably
> >> >> > fewer than 72,057,594,037,927,936 formats, and even still 
> >> >> > generally
> >> >> > fewer than 281,474,976,710,656 if you want to be generous and 
> >> >> > leave 8
> >> >> > bits of the 56 available.
> >> >>  The problem here is that at least for some parameters we 
> >> >>  actually don't know
> >> >>  which formats are actually used.
> >> >> 
> >> >>  The following are not real world examples, but just to explain 
> >> >>  the general
> >> >>  problem.
> >> >> 
> >> >>  The memory configuration for example can be not ASIC specific, 
> >> >>  but rather
> >> >>  determined by whoever took the ASIC and glued it together with 
> >> >>  VRAM on a
> >> >>  board. It is not likely that somebody puts all the VRAM chips 
> >> >>  on one
> >> >>  channel, but it is still perfectly possible.
> >> >> 
> >> >>  Same is true for things like harvesting, e.g. of 16 channels 
> >> >>  half of them
> >> >>  could be bad and we need to know which to actually use.
> >> >> >>> For my understanding: This leaks outside the chip when sharing 
> >> >> >>> buffers?
> >> >> >>> All the information you only need locally to a given amdgpu 
> >> >> >>> instance
> >> >> >>> doesn't really need to be encoded in modifiers.
> >> >> >>>
> >> >> >>> Pointers to code where this is all decided (kernel and radeonsi 
> >> >> >>> would be
> >> >> >>> good starters I guess) would be really good here.
> >> >> >> I extracted the information on which bits are relevant mostly 
> >> >> >> from the
> >> >> >> AddrFromCoord functions in addrlib in

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Daniel Vetter

On Tue, Sep 4, 2018 at 8:31 PM, Bas Nieuwenhuizen
 wrote:
> On Tue, Sep 4, 2018 at 8:27 PM Daniel Vetter  wrote:
>>
>> On Tue, Sep 4, 2018 at 7:57 PM, Bas Nieuwenhuizen
>>  wrote:
>> > On Tue, Sep 4, 2018 at 7:48 PM Christian König
>> >  wrote:
>> >>
>> >> Am 04.09.2018 um 18:37 schrieb Daniel Vetter:
>> >> > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
>> >> >  wrote:
>> >> >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  wrote:
>> >> >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
>> >> >>>  wrote:
>> >>  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  wrote:
>> >> > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:
>> >> >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  
>> >> >> wrote:
>> >> >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:
>> >>  Am 04.09.2018 um 12:15 schrieb Daniel Stone:
>> >> > Hi,
>> >> >
>> >> > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter 
>> >> >  wrote:
>> >> >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
>> >> >>  wrote:
>> >> >>> +/* The chip this is compatible with.
>> >> >>> + *
>> >> >>> + * If compression is disabled, use
>> >> >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
>> >> >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
>> >> >>> + *
>> >> >>> + * With compression enabled please use the exact chip.
>> >> >>> + *
>> >> >>> + * TODO: Do some generations share DCC format?
>> >> >>> + */
>> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
>> >> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff
>> >> >> Do you really need all the combinations here of DCC + gpu gen 
>> >> >> + tiling
>> >> >> details? When we had the entire discussion with nvidia folks 
>> >> >> they
>> >> >> eventually agreed that they don't need the massive pile with 
>> >> >> every
>> >> >> possible combination. Do you really plan to share all these 
>> >> >> different
>> >> >> things?
>> >> >>
>> >> >> Note that e.g. on i915 we spec some of the tiling depending 
>> >> >> upon
>> >> >> buffer size and buffer format (because that's how the hw 
>> >> >> works), not
>> >> >> using explicit modifier flags for everything.
>> >> > Right. The conclusion, after people went through and started 
>> >> > sorting
>> >> > out the kinds of formats for which they would _actually_ export 
>> >> > real
>> >> > colour buffers for, was that most vendors definitely have fewer than
>> >> > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
>> >> > possible formats to represent, very likely fewer than
>> >> > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, 
>> >> > probably
>> >> > fewer than 72,057,594,037,927,936 formats, and even still 
>> >> > generally
>> >> > fewer than 281,474,976,710,656 if you want to be generous and 
>> >> > leave 8
>> >> > bits of the 56 available.
>> >>  The problem here is that at least for some parameters we 
>> >>  actually don't know
>> >>  which formats are actually used.
>> >> 
>> >>  The following are not real world examples, but just to explain 
>> >>  the general
>> >>  problem.
>> >> 
>> >>  The memory configuration for example can be not ASIC specific, 
>> >>  but rather
>> >>  determined by whoever took the ASIC and glued it together with 
>> >>  VRAM on a
>> >>  board. It is not likely that somebody puts all the VRAM chips on 
>> >>  one
>> >>  channel, but it is still perfectly possible.
>> >> 
>> >>  Same is true for things like harvesting, e.g. of 16 channels 
>> >>  half of them
>> >>  could be bad and we need to know which to actually use.
>> >> >>> For my understanding: This leaks outside the chip when sharing 
>> >> >>> buffers?
>> >> >>> All the information you only need locally to a given amdgpu 
>> >> >>> instance
>> >> >>> doesn't really need to be encoded in modifiers.
>> >> >>>
>> >> >>> Pointers to code where this is all decided (kernel and radeonsi 
>> >> >>> would be
>> >> >>> good starters I guess) would be really good here.
>> >> >> I extracted the information on which bits are relevant mostly from 
>> >> >> the
>> >> >> AddrFromCoord functions in addrlib in mesa:
>> >> >>
>> >> >> for macro-tiles:
>> >> >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/r800/egbaddrlib.cpp#L1587
>> >> >>
>> >> >> for micro-tiles (or the micro-tiles in macro-tiles):
>> >> >>
>> >> >>
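
The two smaller figures quoted in this thread are simply powers of two: 72,057,594,037,927,936 is 2^56 (the 56 bits available) and 281,474,976,710,656 is 2^48 (leaving 8 of those bits unused). A minimal sketch of that arithmetic follows; format_space is a hypothetical helper name, and the larger quoted figures (2^128 and 2^256) do not fit in a 64-bit integer, so they are not covered.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values representable in 'bits' bits (bits < 64). */
static uint64_t format_space(unsigned int bits)
{
	assert(bits < 64);
	return UINT64_C(1) << bits;
}
```

For example, format_space(56) gives 72057594037927936 and format_space(48) gives 281474976710656.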

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Daniel Vetter

On Tue, Sep 4, 2018 at 7:57 PM, Bas Nieuwenhuizen
 wrote:
> On Tue, Sep 4, 2018 at 7:48 PM Christian König
>  wrote:
>>
>> Am 04.09.2018 um 18:37 schrieb Daniel Vetter:
>> > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
>> >  wrote:
>> >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  wrote:
>> >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
>> >>>  wrote:
>>  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  wrote:
>> > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:
>> >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  wrote:
>> >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:
>>  Am 04.09.2018 um 12:15 schrieb Daniel Stone:
>> > Hi,
>> >
>> > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter 
>> >  wrote:
>> >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
>> >>  wrote:
>> >>> +/* The chip this is compatible with.
>> >>> + *
>> >>> + * If compression is disabled, use
>> >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
>> >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
>> >>> + *
>> >>> + * With compression enabled please use the exact chip.
>> >>> + *
>> >>> + * TODO: Do some generations share DCC format?
>> >>> + */
>> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
>> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff
>> >> Do you really need all the combinations here of DCC + gpu gen + 
>> >> tiling
>> >> details? When we had the entire discussion with nvidia folks they
>> >> eventually agreed that they don't need the massive pile with every
>> >> possible combination. Do you really plan to share all these 
>> >> different
>> >> things?
>> >>
>> >> Note that e.g. on i915 we spec some of the tiling depending upon
>> >> buffer size and buffer format (because that's how the hw works), 
>> >> not
>> >> using explicit modifier flags for everything.
>> > Right. The conclusion, after people went through and started sorting
>> > out the kinds of formats for which they would _actually_ export real
>> > colour buffers, is that most vendors definitely have fewer than
>> > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
>> > possible formats to represent, very likely fewer than
>> > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, 
>> > probably
>> > fewer than 72,057,594,037,927,936 formats, and even still generally
>> > fewer than 281,474,976,710,656 if you want to be generous and 
>> > leave 8
>> > bits of the 56 available.
>>  The problem here is that at least for some parameters we actually 
>>  don't know
>>  which formats are actually used.
>> 
>>  The following are not real world examples, but just to explain the 
>>  general
>>  problem.
>> 
>>  The memory configuration for example can be not ASIC specific, but 
>>  rather
>>  determined by whoever took the ASIC and glued it together with VRAM 
>>  on a
>>  board. It is not likely that somebody puts all the VRAM chips on one
>>  channel, but it is still perfectly possible.
>> 
>>  Same is true for things like harvesting, e.g. of 16 channels half 
>>  of them
>>  could be bad and we need to know which to actually use.
>> >>> For my understanding: This leaks outside the chip when sharing 
>> >>> buffers?
>> >>> All the information you only need locally to a given amdgpu instance
>> >>> doesn't really need to be encoded in modifiers.
>> >>>
>> >>> Pointers to code where this is all decided (kernel and radeonsi 
>> >>> would be
>> >>> good starters I guess) would be really good here.
>> >> I extracted the information on which bits are relevant mostly from the
>> >> AddrFromCoord functions in addrlib in mesa:
>> >>
>> >> for macro-tiles:
>> >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/r800/egbaddrlib.cpp#L1587
>> >>
>> >> for micro-tiles (or the micro-tiles in macro-tiles):
>> >>
>> >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/core/addrlib1.cpp#L3016
>> > So this is the decoding thing. How many of these actually exist, even 
>> > when
>> > taking all the other information into account?
>> >
>> > E.g. given a platform + memory config (seems needed) + drm_fourcc + 
>> > stride
>> > + height + width, how much of all these bits do you actually still 
>> > freely
>> > pick?
>>  Basically you pick ARRAY_MODE (linear, micro-tile, macro-tile, sparse,
>>  thick

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Christian König

On 04.09.2018 at 20:00, Bas Nieuwenhuizen wrote:

On Tue, Sep 4, 2018 at 7:57 PM Bas Nieuwenhuizen
wrote:

On Tue, Sep 4, 2018 at 7:48 PM Christian König
wrote:

On 04.09.2018 at 18:37, Daniel Vetter wrote:

On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
wrote:

On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter wrote:

On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
wrote:

On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter wrote:

On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:

On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter wrote:

On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:

On 04.09.2018 at 12:15, Daniel Stone wrote:

Hi,

On Tue, 4 Sep 2018 at 11:05, Daniel Vetter wrote:

On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen
wrote:

+/* The chip this is compatible with.
+ *
+ * If compression is disabled, use
+ * - AMDGPU_CHIP_TAHITI for GFX6-GFX8
+ * - AMDGPU_CHIP_VEGA10 for GFX9+
+ *
+ * With compression enabled please use the exact chip.
+ *
+ * TODO: Do some generations share DCC format?
+ */
+#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
+#define AMDGPU_MODIFIER_CHIP_GEN_MASK 0xff
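
For readers following the bit layout, here is a minimal sketch of how a SHIFT/MASK pair like the one quoted above is typically used to store and read back a field in a 64-bit modifier. The two macro values are taken from the RFC patch; the helper names amdgpu_mod_set_chip_gen() and amdgpu_mod_get_chip_gen() are hypothetical and do not appear in the patch, and the 56-bit limit is the figure mentioned later in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted from the RFC patch. */
#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff

/* Hypothetical helper (not in the patch): store chip_gen in the field. */
static inline uint64_t amdgpu_mod_set_chip_gen(uint64_t modifier, uint8_t chip_gen)
{
	uint64_t field = (uint64_t)AMDGPU_MODIFIER_CHIP_GEN_MASK
			 << AMDGPU_MODIFIER_CHIP_GEN_SHIFT;

	modifier &= ~field; /* clear any previous value */
	modifier |= ((uint64_t)chip_gen & AMDGPU_MODIFIER_CHIP_GEN_MASK)
		    << AMDGPU_MODIFIER_CHIP_GEN_SHIFT;
	return modifier;
}

/* Hypothetical helper (not in the patch): read the field back. */
static inline uint8_t amdgpu_mod_get_chip_gen(uint64_t modifier)
{
	return (uint8_t)((modifier >> AMDGPU_MODIFIER_CHIP_GEN_SHIFT)
			 & AMDGPU_MODIFIER_CHIP_GEN_MASK);
}

/* An 8-bit field at bit 40 ends at bit 47, inside the 56 available bits. */
static_assert(AMDGPU_MODIFIER_CHIP_GEN_SHIFT + 8 <= 56,
	      "chip gen field must fit in the 56 available bits");
```

With these helpers, amdgpu_mod_get_chip_gen(amdgpu_mod_set_chip_gen(0, 0x12)) returns 0x12, and the field occupies bits 40 through 47 only.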

Do you really need all the combinations here of DCC + gpu gen + tiling
details? When we had the entire discussion with nvidia folks they
eventually agreed that they don't need the massive pile with every
possible combination. Do you really plan to share all these different
things?

Note that e.g. on i915 we spec some of the tiling depending upon
buffer size and buffer format (because that's how the hw works), not
using explicit modifier flags for everything.

Right. The conclusion, after people went through and started sorting
out the kinds of formats for which they would _actually_ export real
colour buffers, is that most vendors definitely have fewer than
115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
possible formats to represent, very likely fewer than
340,282,366,920,938,463,463,374,607,431,768,211,456 formats, probably
fewer than 72,057,594,037,927,936 formats, and even still generally
fewer than 281,474,976,710,656 if you want to be generous and leave 8
bits of the 56 available.
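
The figures in the paragraph above are powers of two: the first two are 2^256 and 2^128, the third is 2^56, and the last is 2^48 (56 bits minus the 8 left free). A small sketch that checks the two values that fit in a 64-bit integer follows; the helper name encodable_values() is hypothetical, and the two larger numbers are not checked because they do not fit in uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct values a field of 'bits' bits can hold (bits < 64). */
static inline uint64_t encodable_values(unsigned int bits)
{
	return UINT64_C(1) << bits;
}

/* 56 bits: the space quoted as available in the thread. */
static_assert((UINT64_C(1) << 56) == UINT64_C(72057594037927936),
	      "2^56 mismatch");

/* 56 - 8 = 48 bits: what remains after leaving 8 bits free. */
static_assert((UINT64_C(1) << 48) == UINT64_C(281474976710656),
	      "2^48 mismatch");
```

Both static assertions hold, so the last two numbers in the quoted paragraph match 56 and 48 bits exactly.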

The problem here is that at least for some parameters we actually don't know
which formats are actually used.

The following are not real world examples, but just to explain the general
problem.

The memory configuration for example can be not ASIC specific, but rather
determined by whoever took the ASIC and glued it together with VRAM on a
board. It is not likely that somebody puts all the VRAM chips on one
channel, but it is still perfectly possible.

Same is true for things like harvesting, e.g. of 16 channels half of them
could be bad and we need to know which to actually use.

For my understanding: This leaks outside the chip when sharing buffers?
All the information you only need locally to a given amdgpu instance
doesn't really need to be encoded in modifiers.

Pointers to code where this is all decided (kernel and radeonsi would be
good starters I guess) would be really good here.

I extracted the information on which bits are relevant mostly from the
AddrFromCoord functions in addrlib in mesa:

for macro-tiles:
https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/r800/egbaddrlib.cpp#L1587

for micro-tiles (or the micro-tiles in macro-tiles):

https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/core/addrlib1.cpp#L3016

So this is the decoding thing. How many of these actually exist, even when
taking all the other information into account?

E.g. given a platform + memory config (seems needed) + drm_fourcc + stride
+ height + width, how much of all these bits do you actually still freely
pick?

Basically you pick ARRAY_MODE (linear, micro-tile, macro-tile, sparse,
thick variants of macro-tile), MICRO_TILE_MODE(display, non-display,
depth, display-rotated) + whether to use compression, everything else
is fixed given those options, the properties of the chip and the
format.
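
To put a rough number on the freely chosen part described above, here is a sketch that counts the combinations. The enumerator lists are assumptions built only from the names in that paragraph (the "thick variants" are collapsed into a single entry); the real hardware enumerations may well be larger, and none of these identifiers exist in the kernel or in addrlib.

```c
#include <assert.h>

/* Hypothetical: one entry per ARRAY_MODE name mentioned in the thread. */
enum sketch_array_mode {
	SKETCH_ARRAY_LINEAR,
	SKETCH_ARRAY_MICRO_TILED,
	SKETCH_ARRAY_MACRO_TILED,
	SKETCH_ARRAY_SPARSE,
	SKETCH_ARRAY_MACRO_TILED_THICK, /* all thick variants as one entry */
	SKETCH_ARRAY_MODE_COUNT
};

/* Hypothetical: one entry per MICRO_TILE_MODE name mentioned. */
enum sketch_micro_tile_mode {
	SKETCH_MICRO_DISPLAY,
	SKETCH_MICRO_NON_DISPLAY,
	SKETCH_MICRO_DEPTH,
	SKETCH_MICRO_DISPLAY_ROTATED,
	SKETCH_MICRO_TILE_MODE_COUNT
};

/* Compression is either off or on. */
#define SKETCH_COMPRESSION_STATES 2

static_assert(SKETCH_ARRAY_MODE_COUNT == 5, "five array modes assumed");
static_assert(SKETCH_MICRO_TILE_MODE_COUNT == 4, "four micro tile modes assumed");

/* Number of free (array mode, micro tile mode, compression) choices. */
static inline unsigned int sketch_free_combinations(void)
{
	return SKETCH_ARRAY_MODE_COUNT * SKETCH_MICRO_TILE_MODE_COUNT *
	       SKETCH_COMPRESSION_STATES;
}

/* Smallest number of bits that can index 'count' distinct values. */
static inline unsigned int sketch_bits_needed(unsigned int count)
{
	unsigned int bits = 0;

	while ((1u << bits) < count)
		bits++;
	return bits;
}
```

Under these assumptions there are 5 * 4 * 2 = 40 combinations, which fit in 6 bits. The chip-dependent parameters discussed elsewhere in the thread are not counted here.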

It might be that all the things you need to know from the memory config
don't encode smaller than the macro/micro/whatever else stuff. But that's
kinda the angle that we looked at this for everyone else.

E.g. for multi-plane stuff, if everyone picks the same config for the
2nd/3rd plane, then you don't actually need to encode that. It just
becomes part of the implied stuff in the modifier.

The problem is some GPUs are compatible for say 8-bpp images, but not
for 32-bpp surfaces. E.g. let's look at the following table showing the
current configuration for all GFX6-GFX8 GPUs:

format: (bank width, bank height, macro tile aspect, num banks) for
8-bpp, 16-bpp and 32 bpp single-sample followed by the PIPE_CONFIG

verde: (1, 4, 2, 16) (1, 2, 2, 16) (1, 1, 2, 16) ADDR_SURF_P4_8x16
oland: (1, 4, 2, 16) (1, 2, 2, 16) (1, 1, 2, 16) ADDR_SURF_P4_8x16
hainan: (1, 4, 2, 16) (1, 2, 2, 16) (1, 1, 2, 16)

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Bas Nieuwenhuizen

On Tue, Sep 4, 2018 at 7:57 PM Bas Nieuwenhuizen
 wrote:
>
> On Tue, Sep 4, 2018 at 7:48 PM Christian König
>  wrote:
> >
> > On 04.09.2018 at 18:37, Daniel Vetter wrote:
> > > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
> > >  wrote:
> > >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  wrote:
> > >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
> > >>>  wrote:
> >  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  wrote:
> > > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:
> > >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  wrote:
> > >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:
> >  On 04.09.2018 at 12:15, Daniel Stone wrote:
> > > Hi,
> > >
> > > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter 
> > >  wrote:
> > >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
> > >>  wrote:
> > >>> +/* The chip this is compatible with.
> > >>> + *
> > >>> + * If compression is disabled, use
> > >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
> > >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
> > >>> + *
> > >>> + * With compression enabled please use the exact chip.
> > >>> + *
> > >>> + * TODO: Do some generations share DCC format?
> > >>> + */
> > >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
> > >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff
> > >> Do you really need all the combinations here of DCC + gpu gen + 
> > >> tiling
> > >> details? When we had the entire discussion with nvidia folks they
> > >> eventually agreed that they don't need the massive pile with 
> > >> every
> > >> possible combination. Do you really plan to share all these 
> > >> different
> > >> things?
> > >>
> > >> Note that e.g. on i915 we spec some of the tiling depending upon
> > >> buffer size and buffer format (because that's how the hw works), 
> > >> not
> > >> using explicit modifier flags for everything.
> > > Right. The conclusion, after people went through and started sorting
> > > out the kinds of formats for which they would _actually_ export real
> > > colour buffers, is that most vendors definitely have fewer than
> > > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
> > > possible formats to represent, very likely fewer than
> > > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, 
> > > probably
> > > fewer than 72,057,594,037,927,936 formats, and even still 
> > > generally
> > > fewer than 281,474,976,710,656 if you want to be generous and 
> > > leave 8
> > > bits of the 56 available.
> >  The problem here is that at least for some parameters we actually 
> >  don't know
> >  which formats are actually used.
> > 
> >  The following are not real world examples, but just to explain the 
> >  general
> >  problem.
> > 
> >  The memory configuration for example can be not ASIC specific, but 
> >  rather
> >  determined by whoever took the ASIC and glued it together with 
> >  VRAM on a
> >  board. It is not likely that somebody puts all the VRAM chips on 
> >  one
> >  channel, but it is still perfectly possible.
> > 
> >  Same is true for things like harvesting, e.g. of 16 channels half 
> >  of them
> >  could be bad and we need to know which to actually use.
> > >>> For my understanding: This leaks outside the chip when sharing 
> > >>> buffers?
> > >>> All the information you only need locally to a given amdgpu instance
> > >>> doesn't really need to be encoded in modifiers.
> > >>>
> > >>> Pointers to code where this is all decided (kernel and radeonsi 
> > >>> would be
> > >>> good starters I guess) would be really good here.
> > >> I extracted the information on which bits are relevant mostly from 
> > >> the
> > >> AddrFromCoord functions in addrlib in mesa:
> > >>
> > >> for macro-tiles:
> > >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/r800/egbaddrlib.cpp#L1587
> > >>
> > >> for micro-tiles (or the micro-tiles in macro-tiles):
> > >>
> > >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/core/addrlib1.cpp#L3016
> > > So this is the decoding thing. How many of these actually exist, even 
> > > when
> > > taking all the other information into account?
> > >
> > > E.g. given a platform + memory config (seems needed) + drm_fourcc + 
> > > stride
> > > + height + width, how much of all these

Re: [RFC] drm/amdgpu: Add macros and documentation for format modifiers.

2018-09-04 Thread Bas Nieuwenhuizen

On Tue, Sep 4, 2018 at 7:48 PM Christian König
 wrote:
>
> On 04.09.2018 at 18:37, Daniel Vetter wrote:
> > On Tue, Sep 4, 2018 at 5:52 PM, Bas Nieuwenhuizen
> >  wrote:
> >> On Tue, Sep 4, 2018 at 4:43 PM Daniel Vetter  wrote:
> >>> On Tue, Sep 4, 2018 at 3:33 PM, Bas Nieuwenhuizen
> >>>  wrote:
>  On Tue, Sep 4, 2018 at 3:04 PM Daniel Vetter  wrote:
> > On Tue, Sep 04, 2018 at 02:33:02PM +0200, Bas Nieuwenhuizen wrote:
> >> On Tue, Sep 4, 2018 at 2:26 PM Daniel Vetter  wrote:
> >>> On Tue, Sep 04, 2018 at 12:44:19PM +0200, Christian König wrote:
>  On 04.09.2018 at 12:15, Daniel Stone wrote:
> > Hi,
> >
> > On Tue, 4 Sep 2018 at 11:05, Daniel Vetter  
> > wrote:
> >> On Tue, Sep 4, 2018 at 3:00 AM, Bas Nieuwenhuizen 
> >>  wrote:
> >>> +/* The chip this is compatible with.
> >>> + *
> >>> + * If compression is disabled, use
> >>> + *   - AMDGPU_CHIP_TAHITI for GFX6-GFX8
> >>> + *   - AMDGPU_CHIP_VEGA10 for GFX9+
> >>> + *
> >>> + * With compression enabled please use the exact chip.
> >>> + *
> >>> + * TODO: Do some generations share DCC format?
> >>> + */
> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_SHIFT 40
> >>> +#define AMDGPU_MODIFIER_CHIP_GEN_MASK  0xff
> >> Do you really need all the combinations here of DCC + gpu gen + 
> >> tiling
> >> details? When we had the entire discussion with nvidia folks they
> >> eventually agreed that they don't need the massive pile with every
> >> possible combination. Do you really plan to share all these 
> >> different
> >> things?
> >>
> >> Note that e.g. on i915 we spec some of the tiling depending upon
> >> buffer size and buffer format (because that's how the hw works), 
> >> not
> >> using explicit modifier flags for everything.
> > Right. The conclusion, after people went through and started sorting
> > out the kinds of formats for which they would _actually_ export real
> > colour buffers, is that most vendors definitely have fewer than
> > 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936
> > possible formats to represent, very likely fewer than
> > 340,282,366,920,938,463,463,374,607,431,768,211,456 formats, 
> > probably
> > fewer than 72,057,594,037,927,936 formats, and even still generally
> > fewer than 281,474,976,710,656 if you want to be generous and leave 
> > 8
> > bits of the 56 available.
>  The problem here is that at least for some parameters we actually 
>  don't know
>  which formats are actually used.
> 
>  The following are not real world examples, but just to explain the 
>  general
>  problem.
> 
>  The memory configuration for example can be not ASIC specific, but 
>  rather
>  determined by whoever took the ASIC and glued it together with VRAM 
>  on a
>  board. It is not likely that somebody puts all the VRAM chips on one
>  channel, but it is still perfectly possible.
> 
>  Same is true for things like harvesting, e.g. of 16 channels half 
>  of them
>  could be bad and we need to know which to actually use.
> >>> For my understanding: This leaks outside the chip when sharing 
> >>> buffers?
> >>> All the information you only need locally to a given amdgpu instance
> >>> doesn't really need to be encoded in modifiers.
> >>>
> >>> Pointers to code where this is all decided (kernel and radeonsi would 
> >>> be
> >>> good starters I guess) would be really good here.
> >> I extracted the information on which bits are relevant mostly from the
> >> AddrFromCoord functions in addrlib in mesa:
> >>
> >> for macro-tiles:
> >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/r800/egbaddrlib.cpp#L1587
> >>
> >> for micro-tiles (or the micro-tiles in macro-tiles):
> >>
> >> https://gitlab.freedesktop.org/mesa/mesa/blob/master/src/amd/addrlib/core/addrlib1.cpp#L3016
> > So this is the decoding thing. How many of these actually exist, even 
> > when
> > taking all the other information into account?
> >
> > E.g. given a platform + memory config (seems needed) + drm_fourcc + 
> > stride
> > + height + width, how much of all these bits do you actually still 
> > freely
> > pick?
>  Basically you pick ARRAY_MODE (linear, micro-tile, macro-tile, sparse,
>  thick variants of macro-tile), MICRO_TILE_MODE(display, non-display,
>  depth, display-rotated) + whether to use compression, everything else
>  is fixed given those options, the properties of