Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Christian König

Am 16.05.22 um 20:15 schrieb Alex Deucher:

On Mon, May 16, 2022 at 2:10 PM Christian König
 wrote:

Am 16.05.22 um 19:49 schrieb Ernst Sjöstrand:

Den mån 16 maj 2022 kl 17:13 skrev Alex Deucher :

On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand  wrote:

smatch found this problem on amd-staging-drm-next:

drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443 amdgpu_discovery_get_vcn_info() 
error: buffer overflow 'adev->vcn.vcn_codec_disable_mask' 2 <= 3

This is caused by:
#define AMDGPU_MAX_VCN_INSTANCES 2
#define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4

Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and use 
AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?

We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
would waste some memory in the places it is used at this point).
VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
can't change that without breaking the firmware structure.

Alex


It would be nice to get rid of this pattern and make sure it doesn't happen 
again when the VCN info table is raised to 5.
It's very similar to the HWIP_MAX_INSTANCE issue.


No, as Alex explained that distinction is intentional.

The firmware definition is 4 for future extensions, that doesn't mean that this 
is currently used.

There is currently simply no need to set AMDGPU_MAX_VCN_INSTANCES to more than 
2.

Right.  The attached patch should protect against the scenario you are
envisioning.


Acked-by: Christian König 



Alex


Regards,
Christian.


//E






Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Ernst Sjöstrand
Looks good to me!

Den mån 16 maj 2022 kl 20:15 skrev Alex Deucher :

> On Mon, May 16, 2022 at 2:10 PM Christian König
>  wrote:
> >
> > Am 16.05.22 um 19:49 schrieb Ernst Sjöstrand:
> >
> > Den mån 16 maj 2022 kl 17:13 skrev Alex Deucher :
> >>
> >> On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand 
> wrote:
> >> >
> >> > smatch found this problem on amd-staging-drm-next:
> >> >
> >> > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443
> amdgpu_discovery_get_vcn_info() error: buffer overflow
> 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
> >> >
> >> > This is caused by:
> >> > #define AMDGPU_MAX_VCN_INSTANCES 2
> >> > #define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4
> >> >
> >> > Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and use
> AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?
> >>
> >> We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
> >> would waste some memory in the places it is used at this point).
> >> VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
> >> can't change that without breaking the firmware structure.
> >>
> >> Alex
> >
> >
> > It would be nice to get rid of this pattern and make sure it doesn't
> happen again when the VCN info table is raised to 5.
> > It's very similar to the HWIP_MAX_INSTANCE issue.
> >
> >
> > No, as Alex explained that distinction is intentional.
> >
> > The firmware definition is 4 for future extensions, that doesn't mean
> that this is currently used.
> >
> > There is currently simply no need to set AMDGPU_MAX_VCN_INSTANCES to
> more than 2.
>
> Right.  The attached patch should protect against the scenario you are
> envisioning.
>
> Alex
>
> >
> > Regards,
> > Christian.
> >
> >
> > //E
> >
> >
>


Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Alex Deucher
On Mon, May 16, 2022 at 2:10 PM Christian König
 wrote:
>
> Am 16.05.22 um 19:49 schrieb Ernst Sjöstrand:
>
> Den mån 16 maj 2022 kl 17:13 skrev Alex Deucher :
>>
>> On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand  wrote:
>> >
>> > smatch found this problem on amd-staging-drm-next:
>> >
>> > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443 
>> > amdgpu_discovery_get_vcn_info() error: buffer overflow 
>> > 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
>> >
>> > This is caused by:
>> > #define AMDGPU_MAX_VCN_INSTANCES 2
>> > #define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4
>> >
>> > Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and use 
>> > AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?
>>
>> We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
>> would waste some memory in the places it is used at this point).
>> VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
>> can't change that without breaking the firmware structure.
>>
>> Alex
>
>
> It would be nice to get rid of this pattern and make sure it doesn't happen 
> again when the VCN info table is raised to 5.
> It's very similar to the HWIP_MAX_INSTANCE issue.
>
>
> No, as Alex explained that distinction is intentional.
>
> The firmware definition is 4 for future extensions, that doesn't mean that 
> this is currently used.
>
> There is currently simply no need to set AMDGPU_MAX_VCN_INSTANCES to more 
> than 2.

Right.  The attached patch should protect against the scenario you are
envisioning.

Alex

>
> Regards,
> Christian.
>
>
> //E
>
>
From 6fecd3c86fbb7b89d59d16cffea438e5b2a5915c Mon Sep 17 00:00:00 2001
From: Alex Deucher 
Date: Mon, 16 May 2022 14:12:33 -0400
Subject: [PATCH] drm/amdgpu/discovery: validate VCN and SDMA instances

Validate the VCN and SDMA instances against the driver
structure sizes to make sure we don't get into a
situation where the firmware reports more instances than
the driver supports.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 1f4e07a32445..2c4f1adb5343 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1137,13 +1137,24 @@ static int amdgpu_discovery_reg_base_init(struct amdgpu_device *adev)
 adev->vcn.vcn_config[adev->vcn.num_vcn_inst] =
 	ip->revision & 0xc0;
 ip->revision &= ~0xc0;
-adev->vcn.num_vcn_inst++;
+if (adev->vcn.num_vcn_inst < AMDGPU_MAX_VCN_INSTANCES)
+	adev->vcn.num_vcn_inst++;
+else
+	dev_err(adev->dev, "Too many VCN instances: %d vs %d\n",
+		adev->vcn.num_vcn_inst + 1,
+		AMDGPU_MAX_VCN_INSTANCES);
 			}
 			if (le16_to_cpu(ip->hw_id) == SDMA0_HWID ||
 			le16_to_cpu(ip->hw_id) == SDMA1_HWID ||
 			le16_to_cpu(ip->hw_id) == SDMA2_HWID ||
-			le16_to_cpu(ip->hw_id) == SDMA3_HWID)
-adev->sdma.num_instances++;
+			le16_to_cpu(ip->hw_id) == SDMA3_HWID) {
+if (adev->sdma.num_instances < AMDGPU_MAX_SDMA_INSTANCES)
+	adev->sdma.num_instances++;
+else
+	dev_err(adev->dev, "Too many SDMA instances: %d vs %d\n",
+		adev->sdma.num_instances + 1,
+		AMDGPU_MAX_SDMA_INSTANCES);
+			}
 
 			if (le16_to_cpu(ip->hw_id) == UMC_HWID)
 adev->gmc.num_umc++;
-- 
2.35.3



Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Christian König

Am 16.05.22 um 19:49 schrieb Ernst Sjöstrand:

Den mån 16 maj 2022 kl 17:13 skrev Alex Deucher :

On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand
 wrote:
>
> smatch found this problem on amd-staging-drm-next:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443
amdgpu_discovery_get_vcn_info() error: buffer overflow
'adev->vcn.vcn_codec_disable_mask' 2 <= 3
>
> This is caused by:
> #define AMDGPU_MAX_VCN_INSTANCES 2
> #define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4
>
> Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and
use AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?

We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
would waste some memory in the places it is used at this point).
VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
can't change that without breaking the firmware structure.

Alex


It would be nice to get rid of this pattern and make sure it doesn't 
happen again when the VCN info table is raised to 5.

It's very similar to the HWIP_MAX_INSTANCE issue.


No, as Alex explained that distinction is intentional.

The firmware definition is 4 for future extensions, that doesn't mean 
that this is currently used.


There is currently simply no need to set AMDGPU_MAX_VCN_INSTANCES to 
more than 2.


Regards,
Christian.



//E



Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Ernst Sjöstrand
Den mån 16 maj 2022 kl 17:13 skrev Alex Deucher :

> On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand  wrote:
> >
> > smatch found this problem on amd-staging-drm-next:
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443
> amdgpu_discovery_get_vcn_info() error: buffer overflow
> 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
> >
> > This is caused by:
> > #define AMDGPU_MAX_VCN_INSTANCES 2
> > #define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4
> >
> > Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and use
> AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?
>
> We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
> would waste some memory in the places it is used at this point).
> VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
> can't change that without breaking the firmware structure.
>
> Alex
>

It would be nice to get rid of this pattern and make sure it doesn't happen
again when the VCN info table is raised to 5.
It's very similar to the HWIP_MAX_INSTANCE issue.

//E


Re: VCN_INFO_TABLE_MAX_NUM_INSTANCES vs AMDGPU_MAX_VCN_INSTANCES

2022-05-16 Thread Alex Deucher
On Sun, May 15, 2022 at 11:46 AM Ernst Sjöstrand  wrote:
>
> smatch found this problem on amd-staging-drm-next:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:1443 
> amdgpu_discovery_get_vcn_info() error: buffer overflow 
> 'adev->vcn.vcn_codec_disable_mask' 2 <= 3
>
> This is caused by:
> #define AMDGPU_MAX_VCN_INSTANCES 2
> #define VCN_INFO_TABLE_MAX_NUM_INSTANCES 4
>
> Can we just drop VCN_INFO_TABLE_MAX_NUM_INSTANCES completely and use 
> AMDGPU_MAX_VCN_INSTANCES everywhere instead (and bump it to 4)?

We should be able to bump AMDGPU_MAX_VCN_INSTANCES to 4 (although it
would waste some memory in the places it is used at this point).
VCN_INFO_TABLE_MAX_NUM_INSTANCES is part of a firmware structure so we
can't change that without breaking the firmware structure.

Alex


>
> //E
>