Re: [kbuild] drivers/gpu/drm/msm/adreno/a3xx_gpu.c:600 a3xx_gpu_init() error: passing non negative 1 to ERR_PTR
On 4/9/2021 3:07 PM, Dan Carpenter wrote: tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 2d743660786ec51f5c1fefd5782bbdee7b227db0 commit: 5785dd7a8ef0de8049f40a1a109de6a1bf17b479 drm/msm: Fix duplicate gpu node in icc summary config: arm64-randconfig-m031-20210407 (attached as .config) compiler: aarch64-linux-gcc (GCC) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot Reported-by: Dan Carpenter smatch warnings: drivers/gpu/drm/msm/adreno/a3xx_gpu.c:600 a3xx_gpu_init() error: passing non negative 1 to ERR_PTR drivers/gpu/drm/msm/adreno/a4xx_gpu.c:727 a4xx_gpu_init() error: passing non negative 1 to ERR_PTR vim +600 drivers/gpu/drm/msm/adreno/a3xx_gpu.c 7198e6b03155f6 Rob Clark 2013-07-19 515 struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) 7198e6b03155f6 Rob Clark 2013-07-19 516 { 7198e6b03155f6 Rob Clark 2013-07-19 517 struct a3xx_gpu *a3xx_gpu = NULL; 55459968176f13 Rob Clark 2013-12-05 518 struct adreno_gpu *adreno_gpu; 7198e6b03155f6 Rob Clark 2013-07-19 519 struct msm_gpu *gpu; 060530f1ea6740 Rob Clark 2014-03-03 520 struct msm_drm_private *priv = dev->dev_private; 060530f1ea6740 Rob Clark 2014-03-03 521 struct platform_device *pdev = priv->gpu_pdev; 5785dd7a8ef0de Akhil P Oommen 2020-10-28 522 struct icc_path *ocmem_icc_path; 5785dd7a8ef0de Akhil P Oommen 2020-10-28 523 struct icc_path *icc_path; 7198e6b03155f6 Rob Clark 2013-07-19 524 int ret; 7198e6b03155f6 Rob Clark 2013-07-19 525 7198e6b03155f6 Rob Clark 2013-07-19 526 if (!pdev) { 6a41da17e87dee Mamta Shukla 2018-10-20 527 DRM_DEV_ERROR(dev->dev, "no a3xx device\n"); 7198e6b03155f6 Rob Clark 2013-07-19 528 ret = -ENXIO; 7198e6b03155f6 Rob Clark 2013-07-19 529 goto fail; 7198e6b03155f6 Rob Clark 2013-07-19 530 } 7198e6b03155f6 Rob Clark 2013-07-19 531 7198e6b03155f6 Rob Clark 2013-07-19 532 a3xx_gpu = kzalloc(sizeof(*a3xx_gpu), GFP_KERNEL); 7198e6b03155f6 Rob Clark 2013-07-19 533 if (!a3xx_gpu) { 7198e6b03155f6 Rob Clark 2013-07-19 534 ret = -ENOMEM; 7198e6b03155f6 Rob Clark 2013-07-19 535 goto fail; 7198e6b03155f6 Rob Clark 2013-07-19 536 } 7198e6b03155f6 Rob Clark 2013-07-19 537 55459968176f13 Rob Clark 2013-12-05 538 adreno_gpu = _gpu->base; 55459968176f13 Rob Clark 2013-12-05 539 gpu = _gpu->base; 7198e6b03155f6 Rob Clark 2013-07-19 540 70c70f091b1ffd Rob Clark 2014-05-30 541 gpu->perfcntrs = perfcntrs; 70c70f091b1ffd Rob Clark 2014-05-30 542 gpu->num_perfcntrs = ARRAY_SIZE(perfcntrs); 70c70f091b1ffd Rob Clark 2014-05-30 543 3bcefb0497f9fc Rob Clark 2014-09-05 544 adreno_gpu->registers = a3xx_registers; 3bcefb0497f9fc Rob Clark 2014-09-05 545 f97decac5f4c2d Jordan Crouse 2017-10-20 546 ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 1); 7198e6b03155f6 Rob Clark 2013-07-19 547 if (ret) 7198e6b03155f6 Rob Clark 2013-07-19 548 goto fail; 7198e6b03155f6 Rob Clark 2013-07-19 549 55459968176f13 Rob Clark 2013-12-05 550 /* if needed, allocate gmem: */ 55459968176f13 Rob Clark 2013-12-05 551 if (adreno_is_a330(adreno_gpu)) { 26c0b26dcd005d Brian Masney 2019-08-23 552 ret = adreno_gpu_ocmem_init(_gpu->base.pdev->dev, 26c0b26dcd005d Brian Masney 2019-08-23 553 adreno_gpu, _gpu->ocmem); 26c0b26dcd005d Brian Masney 2019-08-23 554 if (ret) 26c0b26dcd005d Brian Masney 2019-08-23 555 goto fail; 55459968176f13 Rob Clark 2013-12-05 556 } 55459968176f13 Rob Clark 2013-12-05 557 667ce33e57d0de Rob Clark 2016-09-28 558 if (!gpu->aspace) { 871d812aa43e63 Rob Clark 2013-11-16 559 /* TODO we think it is possible to configure the GPU to 871d812aa43e63 Rob Clark 2013-11-16 560* restrict access to VRAM carveout. But the required 871d812aa43e63 Rob Clark 2013-11-16 561* registers are unknown. For now just bail out and 871d812aa43e63 Rob Clark 2013-11-16 562* limp along with just modesetting. If it turns out 871d812aa43e63 Rob Clark 2013-11-16 563* to not be possible to restrict access, then we must 871d812aa43e63 Rob Clark 2013-11-16 564* implement a cmdstream validator. 871d812aa43e63 Rob Clark 2013-11-16 565*/ 6a41da17e87dee Mamta Shukla 2018-10-20 566 DRM_DEV_ERROR(dev->dev, "No memory protection without IOMMU\n"); 871d812aa43e63 Rob Clark 2013-11-16 567 ret = -ENXIO; 871d812aa43e63 Rob Clark 2013-11-16 568 goto fail; 871d812aa43e63 Rob Clark 2013-11-16 569 } 871d812aa43e63 Rob Clark 20
[PATCH 1/2] drm/msm/a6xx: Fix perfcounter oob timeout
We were not programing the correct bit while clearing the perfcounter oob. So, clear it correctly using the new 'clear' bit. This fixes the below error: [drm:a6xx_gmu_set_oob] *ERROR* Timeout waiting for GMU OOB set PERFCOUNTER: 0x8000 Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 863047b..6a86cd0 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -246,7 +246,7 @@ static int a6xx_gmu_hfi_start(struct a6xx_gmu *gmu) } struct a6xx_gmu_oob_bits { - int set, ack, set_new, ack_new; + int set, ack, set_new, ack_new, clear, clear_new; const char *name; }; @@ -260,6 +260,8 @@ static const struct a6xx_gmu_oob_bits a6xx_gmu_oob_bits[] = { .ack = 24, .set_new = 30, .ack_new = 31, + .clear = 24, + .clear_new = 31, }, [GMU_OOB_PERFCOUNTER_SET] = { @@ -268,18 +270,22 @@ static const struct a6xx_gmu_oob_bits a6xx_gmu_oob_bits[] = { .ack = 25, .set_new = 28, .ack_new = 30, + .clear = 25, + .clear_new = 29, }, [GMU_OOB_BOOT_SLUMBER] = { .name = "BOOT_SLUMBER", .set = 22, .ack = 30, + .clear = 30, }, [GMU_OOB_DCVS_SET] = { .name = "GPU_DCVS", .set = 23, .ack = 31, + .clear = 31, }, }; @@ -335,9 +341,9 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state) return; if (gmu->legacy) - bit = a6xx_gmu_oob_bits[state].ack; + bit = a6xx_gmu_oob_bits[state].clear; else - bit = a6xx_gmu_oob_bits[state].ack_new; + bit = a6xx_gmu_oob_bits[state].clear_new; gmu_write(gmu, REG_A6XX_GMU_HOST2GMU_INTR_SET, 1 << bit); } -- 2.7.4
[PATCH 2/2] drm/msm: Select CONFIG_NVMEM
The speedbin support requires nvmem driver api. So lets explicitly enable CONFIG_NVMEM to have this support. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index dabb4a1..d12fa35 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -20,6 +20,7 @@ config DRM_MSM select SND_SOC_HDMI_CODEC if SND_SOC select SYNC_FILE select PM_OPP + select NVMEM help DRM/KMS driver for MSM/snapdragon. -- 2.7.4
Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM
On 4/2/2021 3:19 AM, Rob Clark wrote: On Thu, Apr 1, 2021 at 2:03 PM Dmitry Baryshkov wrote: On Thu, 1 Apr 2021 at 23:09, Rob Clark wrote: On Mon, Feb 22, 2021 at 8:06 AM Rob Clark wrote: On Mon, Feb 22, 2021 at 7:45 AM Akhil P Oommen wrote: On 2/19/2021 9:30 PM, Rob Clark wrote: On Fri, Feb 19, 2021 at 2:44 AM Akhil P Oommen wrote: On 2/18/2021 9:41 PM, Rob Clark wrote: On Thu, Feb 18, 2021 at 4:28 AM Akhil P Oommen wrote: On 2/18/2021 2:05 AM, Jonathan Marek wrote: On 2/17/21 3:18 PM, Rob Clark wrote: On Wed, Feb 17, 2021 at 11:08 AM Jordan Crouse wrote: On Wed, Feb 17, 2021 at 07:14:16PM +0530, Akhil P Oommen wrote: On 2/17/2021 8:36 AM, Rob Clark wrote: On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek wrote: Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a ENOENT error, to fix the case where the kernel was compiled without CONFIG_NVMEM. Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..7fe5d97606aa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, cell = nvmem_cell_get(dev, "speed_bin"); /* -* -ENOENT means that the platform doesn't support speedbin which is -* fine +* -ENOENT means no speed bin in device tree, +* -EOPNOTSUPP means kernel was built without CONFIG_NVMEM very minor nit, it would be nice to at least preserve the gist of the "which is fine" (ie. some variation of "this is an optional thing and things won't catch fire without it" ;-)) (which is, I believe, is true, hopefully Akhil could confirm.. if not we should have a harder dependency on CONFIG_NVMEM..) IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw' property, we will see some error during boot up if we don't call dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev, "speed_bin")" is a way to test this. If there is no other harm, we can put a hard dependency on CONFIG_NVMEM. I'm not sure if we want to go this far given the squishiness about module dependencies. As far as I know we are the only driver that uses this seriously on QCOM SoCs and this is only needed for certain targets. I don't know if we want to force every target to build NVMEM and QFPROM on our behalf. But maybe I'm just saying that because Kconfig dependencies tend to break my brain (and then Arnd has to send a patch to fix it). Hmm, good point.. looks like CONFIG_NVMEM itself doesn't have any other dependencies, so I suppose it wouldn't be the end of the world to select that.. but I guess we don't want to require QFPROM I guess at the end of the day, what is the failure mode if you have a speed-bin device, but your kernel config misses QFPROM (and possibly NVMEM)? If the result is just not having the highest clk rate(s) Atleast on sc7180's gpu, using an unsupported FMAX breaks gmu. It won't be very obvious what went wrong when this happens! Ugg, ok.. I suppose we could select NVMEM, but not QFPROM, and then the case where QFPROM is not enabled on platforms that have the speed-bin field in DT will fail gracefully and all other platforms would continue on happily? BR, -R Sounds good to me. You probably should do a quick test with NVMEM enabled but QFPROM disabled to confirm my theory, but I *think* that should work BR, -R I tried it on an sc7180 device. The suggested combo (CONFIG_NVMEM + no CONFIG_QCOM_QFPROM) makes the gpu probe fail with error "failed to read speed-bin. Some OPPs may not be supported by hardware". This is good enough clue for the developer that he should fix the broken speedbin detection. Ok, great.. then sounds like selecting NVMEM is a good approach btw, did anyone ever send a patch to select NVMEM? I'm not seeing one but I could be overlooking something I thought Jonathan would send it as the discussion was going on in his patch. No problem, I will send it out. :) -Akhil. Judging by the amount of issues surrounding speed-bin, I might have a bold suggestion to revert these patches for now and get them once all the issues are sorted, so that we'd have a single working commit instead of scattered patch series breaking git bisect, having bad side-effects on non-sc7180 platforms, etc. We do really need some pre-merge CI like we have on the mesa side of things (and we at least have 845 devices in our CI farm, but it would be useful to add more generations).. but other than the config issue, I *think* this fixes the last of the speedbin fallout? https://patchwork.freedesktop.org/patch/426538/?series=
Re: [PATCH] drm/msm: Fix removal of valid error case when checking speed_bin
On 3/30/2021 7:04 AM, John Stultz wrote: Commit 7bf168c8fe8c ("drm/msm: Fix speed-bin support not to access outside valid memory"), reworked the nvmem reading of "speed_bin", but in doing so dropped handling of the -ENOENT case which was previously documented as "fine". That change resulted in the db845c board display to fail to start, with the following error: adreno 500.gpu: [drm:a6xx_gpu_init] *ERROR* failed to read speed-bin (-2). Some OPPs may not be supported by hardware Thus, this patch simply re-adds the ENOENT handling so the lack of the speed_bin entry isn't fatal for display, and gets things working on db845c. Cc: Rob Clark Cc: Sean Paul Cc: Jordan Crouse Cc: Eric Anholt Cc: Douglas Anderson Cc: linux-arm-...@vger.kernel.org Cc: freedr...@lists.freedesktop.org Cc: Bjorn Andersson Cc: YongQin Liu Reported-by: YongQin Liu Fixes: 7bf168c8fe8c ("drm/msm: Fix speed-bin support not to access outside valid memory") Signed-off-by: John Stultz --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 690409ca8a186..cb2df8736ca85 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1406,7 +1406,13 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, int ret; ret = nvmem_cell_read_u16(dev, "speed_bin", ); - if (ret) { + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (ret == -ENOENT) { + return 0; + } else if (ret) { DRM_DEV_ERROR(dev, "failed to read speed-bin (%d). Some OPPs may not be supported by hardware", ret); Reviewed-by: Akhil P Oommen This looks "fine" to me. ;) -Akhil.
Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM
On 2/19/2021 9:30 PM, Rob Clark wrote: On Fri, Feb 19, 2021 at 2:44 AM Akhil P Oommen wrote: On 2/18/2021 9:41 PM, Rob Clark wrote: On Thu, Feb 18, 2021 at 4:28 AM Akhil P Oommen wrote: On 2/18/2021 2:05 AM, Jonathan Marek wrote: On 2/17/21 3:18 PM, Rob Clark wrote: On Wed, Feb 17, 2021 at 11:08 AM Jordan Crouse wrote: On Wed, Feb 17, 2021 at 07:14:16PM +0530, Akhil P Oommen wrote: On 2/17/2021 8:36 AM, Rob Clark wrote: On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek wrote: Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a ENOENT error, to fix the case where the kernel was compiled without CONFIG_NVMEM. Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..7fe5d97606aa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, cell = nvmem_cell_get(dev, "speed_bin"); /* -* -ENOENT means that the platform doesn't support speedbin which is -* fine +* -ENOENT means no speed bin in device tree, +* -EOPNOTSUPP means kernel was built without CONFIG_NVMEM very minor nit, it would be nice to at least preserve the gist of the "which is fine" (ie. some variation of "this is an optional thing and things won't catch fire without it" ;-)) (which is, I believe, is true, hopefully Akhil could confirm.. if not we should have a harder dependency on CONFIG_NVMEM..) IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw' property, we will see some error during boot up if we don't call dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev, "speed_bin")" is a way to test this. If there is no other harm, we can put a hard dependency on CONFIG_NVMEM. I'm not sure if we want to go this far given the squishiness about module dependencies. As far as I know we are the only driver that uses this seriously on QCOM SoCs and this is only needed for certain targets. I don't know if we want to force every target to build NVMEM and QFPROM on our behalf. But maybe I'm just saying that because Kconfig dependencies tend to break my brain (and then Arnd has to send a patch to fix it). Hmm, good point.. looks like CONFIG_NVMEM itself doesn't have any other dependencies, so I suppose it wouldn't be the end of the world to select that.. but I guess we don't want to require QFPROM I guess at the end of the day, what is the failure mode if you have a speed-bin device, but your kernel config misses QFPROM (and possibly NVMEM)? If the result is just not having the highest clk rate(s) Atleast on sc7180's gpu, using an unsupported FMAX breaks gmu. It won't be very obvious what went wrong when this happens! Ugg, ok.. I suppose we could select NVMEM, but not QFPROM, and then the case where QFPROM is not enabled on platforms that have the speed-bin field in DT will fail gracefully and all other platforms would continue on happily? BR, -R Sounds good to me. You probably should do a quick test with NVMEM enabled but QFPROM disabled to confirm my theory, but I *think* that should work BR, -R I tried it on an sc7180 device. The suggested combo (CONFIG_NVMEM + no CONFIG_QCOM_QFPROM) makes the gpu probe fail with error "failed to read speed-bin. Some OPPs may not be supported by hardware". This is good enough clue for the developer that he should fix the broken speedbin detection. -Akhil.
Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM
On 2/18/2021 9:41 PM, Rob Clark wrote: On Thu, Feb 18, 2021 at 4:28 AM Akhil P Oommen wrote: On 2/18/2021 2:05 AM, Jonathan Marek wrote: On 2/17/21 3:18 PM, Rob Clark wrote: On Wed, Feb 17, 2021 at 11:08 AM Jordan Crouse wrote: On Wed, Feb 17, 2021 at 07:14:16PM +0530, Akhil P Oommen wrote: On 2/17/2021 8:36 AM, Rob Clark wrote: On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek wrote: Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a ENOENT error, to fix the case where the kernel was compiled without CONFIG_NVMEM. Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..7fe5d97606aa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, cell = nvmem_cell_get(dev, "speed_bin"); /* -* -ENOENT means that the platform doesn't support speedbin which is -* fine +* -ENOENT means no speed bin in device tree, +* -EOPNOTSUPP means kernel was built without CONFIG_NVMEM very minor nit, it would be nice to at least preserve the gist of the "which is fine" (ie. some variation of "this is an optional thing and things won't catch fire without it" ;-)) (which is, I believe, is true, hopefully Akhil could confirm.. if not we should have a harder dependency on CONFIG_NVMEM..) IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw' property, we will see some error during boot up if we don't call dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev, "speed_bin")" is a way to test this. If there is no other harm, we can put a hard dependency on CONFIG_NVMEM. I'm not sure if we want to go this far given the squishiness about module dependencies. As far as I know we are the only driver that uses this seriously on QCOM SoCs and this is only needed for certain targets. I don't know if we want to force every target to build NVMEM and QFPROM on our behalf. But maybe I'm just saying that because Kconfig dependencies tend to break my brain (and then Arnd has to send a patch to fix it). Hmm, good point.. looks like CONFIG_NVMEM itself doesn't have any other dependencies, so I suppose it wouldn't be the end of the world to select that.. but I guess we don't want to require QFPROM I guess at the end of the day, what is the failure mode if you have a speed-bin device, but your kernel config misses QFPROM (and possibly NVMEM)? If the result is just not having the highest clk rate(s) Atleast on sc7180's gpu, using an unsupported FMAX breaks gmu. It won't be very obvious what went wrong when this happens! Ugg, ok.. I suppose we could select NVMEM, but not QFPROM, and then the case where QFPROM is not enabled on platforms that have the speed-bin field in DT will fail gracefully and all other platforms would continue on happily? BR, -R Sounds good to me. -Akhil. available, that isn't the end of the world. But if it makes things not-work, that is sub-optimal. Generally, especially on ARM, kconfig seems to be way harder than it should be to build a kernel that works, if we could somehow not add to that problem (for both people with a6xx and older gens) that would be nice ;-) There is a "imply" kconfig option which solves exactly this problem. (you would "imply NVMEM" instead of "select NVMEM". then it would be possible to disable NVMEM but it would get enabled by default) BR, -R ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM
On 2/18/2021 2:05 AM, Jonathan Marek wrote: On 2/17/21 3:18 PM, Rob Clark wrote: On Wed, Feb 17, 2021 at 11:08 AM Jordan Crouse wrote: On Wed, Feb 17, 2021 at 07:14:16PM +0530, Akhil P Oommen wrote: On 2/17/2021 8:36 AM, Rob Clark wrote: On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek wrote: Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a ENOENT error, to fix the case where the kernel was compiled without CONFIG_NVMEM. Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..7fe5d97606aa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, cell = nvmem_cell_get(dev, "speed_bin"); /* - * -ENOENT means that the platform doesn't support speedbin which is - * fine + * -ENOENT means no speed bin in device tree, + * -EOPNOTSUPP means kernel was built without CONFIG_NVMEM very minor nit, it would be nice to at least preserve the gist of the "which is fine" (ie. some variation of "this is an optional thing and things won't catch fire without it" ;-)) (which is, I believe, is true, hopefully Akhil could confirm.. if not we should have a harder dependency on CONFIG_NVMEM..) IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw' property, we will see some error during boot up if we don't call dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev, "speed_bin")" is a way to test this. If there is no other harm, we can put a hard dependency on CONFIG_NVMEM. I'm not sure if we want to go this far given the squishiness about module dependencies. As far as I know we are the only driver that uses this seriously on QCOM SoCs and this is only needed for certain targets. I don't know if we want to force every target to build NVMEM and QFPROM on our behalf. But maybe I'm just saying that because Kconfig dependencies tend to break my brain (and then Arnd has to send a patch to fix it). Hmm, good point.. looks like CONFIG_NVMEM itself doesn't have any other dependencies, so I suppose it wouldn't be the end of the world to select that.. but I guess we don't want to require QFPROM I guess at the end of the day, what is the failure mode if you have a speed-bin device, but your kernel config misses QFPROM (and possibly NVMEM)? If the result is just not having the highest clk rate(s) Atleast on sc7180's gpu, using an unsupported FMAX breaks gmu. It won't be very obvious what went wrong when this happens! -Akhil. available, that isn't the end of the world. But if it makes things not-work, that is sub-optimal. Generally, especially on ARM, kconfig seems to be way harder than it should be to build a kernel that works, if we could somehow not add to that problem (for both people with a6xx and older gens) that would be nice ;-) There is a "imply" kconfig option which solves exactly this problem. (you would "imply NVMEM" instead of "select NVMEM". then it would be possible to disable NVMEM but it would get enabled by default) BR, -R ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/msm/a6xx: fix for kernels without CONFIG_NVMEM
On 2/17/2021 8:36 AM, Rob Clark wrote: On Tue, Feb 16, 2021 at 12:10 PM Jonathan Marek wrote: Ignore nvmem_cell_get() EOPNOTSUPP error in the same way as a ENOENT error, to fix the case where the kernel was compiled without CONFIG_NVMEM. Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..7fe5d97606aa 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1356,10 +1356,10 @@ static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, cell = nvmem_cell_get(dev, "speed_bin"); /* -* -ENOENT means that the platform doesn't support speedbin which is -* fine +* -ENOENT means no speed bin in device tree, +* -EOPNOTSUPP means kernel was built without CONFIG_NVMEM very minor nit, it would be nice to at least preserve the gist of the "which is fine" (ie. some variation of "this is an optional thing and things won't catch fire without it" ;-)) (which is, I believe, is true, hopefully Akhil could confirm.. if not we should have a harder dependency on CONFIG_NVMEM..) IIRC, if the gpu opp table in the DT uses the 'opp-supported-hw' property, we will see some error during boot up if we don't call dev_pm_opp_set_supported_hw(). So calling "nvmem_cell_get(dev, "speed_bin")" is a way to test this. If there is no other harm, we can put a hard dependency on CONFIG_NVMEM. -Akhil. BR, -R */ - if (PTR_ERR(cell) == -ENOENT) + if (PTR_ERR(cell) == -ENOENT || PTR_ERR(cell) == -EOPNOTSUPP) return 0; else if (IS_ERR(cell)) { DRM_DEV_ERROR(dev, -- 2.26.1
Re: [PATCH v2] drm/msm: a6xx: Make sure the SQE microcode is safe
On 2/11/2021 9:32 PM, Jordan Crouse wrote: On Thu, Feb 11, 2021 at 06:50:28PM +0530, Akhil P Oommen wrote: On 2/10/2021 6:22 AM, Jordan Crouse wrote: Most a6xx targets have security issues that were fixed with new versions of the microcode(s). Make sure that we are booting with a safe version of the microcode for the target and print a message and error if not. v2: Add more informative error messages and fix typos Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 77 ++- 1 file changed, 64 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..064b7face504 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -522,28 +522,73 @@ static int a6xx_cp_init(struct msm_gpu *gpu) return a6xx_idle(gpu, ring) ? 0 : -EINVAL; } -static void a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu, +/* + * Check that the microcode version is new enough to include several key + * security fixes. Return true if the ucode is safe. + */ +static bool a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu, struct drm_gem_object *obj) { + struct adreno_gpu *adreno_gpu = _gpu->base; + struct msm_gpu *gpu = _gpu->base; u32 *buf = msm_gem_get_vaddr(obj); + bool ret = false; if (IS_ERR(buf)) - return; + return false; /* -* If the lowest nibble is 0xa that is an indication that this microcode -* has been patched. The actual version is in dword [3] but we only care -* about the patchlevel which is the lowest nibble of dword [3] -* -* Otherwise check that the firmware is greater than or equal to 1.90 -* which was the first version that had this fix built in +* Targets up to a640 (a618, a630 and a640) need to check for a +* microcode version that is patched to support the whereami opcode or +* one that is new enough to include it by default. */ - if (((buf[0] & 0xf) == 0xa) && (buf[2] & 0xf) >= 1) - a6xx_gpu->has_whereami = true; - else if ((buf[0] & 0xfff) > 0x190) - a6xx_gpu->has_whereami = true; + if (adreno_is_a618(adreno_gpu) || adreno_is_a630(adreno_gpu) || + adreno_is_a640(adreno_gpu)) { nit: I feel a 'switch(revn)' would be more readable. Reviewed-by: Akhil P Oommen -Akhil + /* +* If the lowest nibble is 0xa that is an indication that this +* microcode has been patched. The actual version is in dword +* [3] but we only care about the patchlevel which is the lowest +* nibble of dword [3] +* +* Otherwise check that the firmware is greater than or equal +* to 1.90 which was the first version that had this fix built +* in +*/ + if buf[0] & 0xf) == 0xa) && (buf[2] & 0xf) >= 1) || + (buf[0] & 0xfff) >= 0x190) { + a6xx_gpu->has_whereami = true; + ret = true; + goto out; + } + DRM_DEV_ERROR(>pdev->dev, + "a630 SQE ucode is too old. Have version %x need at least %x\n", + buf[0] & 0xfff, 0x190); + } else { + /* +* a650 tier targets don't need whereami but still need to be +* equal to or newer than 1.95 for other security fixes +*/ + if (adreno_is_a650(adreno_gpu)) { + if ((buf[0] & 0xfff) >= 0x195) { + ret = true; + goto out; + } + + DRM_DEV_ERROR(>pdev->dev, + "a650 SQE ucode is too old. Have version %x need at least %x\n", + buf[0] & 0xfff, 0x195); + } + + /* +* When a660 is added those targets should return true here +* since those have all the critical security fixes built in +* from the start +*/ Or we can just initialize 'ret' as true. I thought about it and I think I want to force an accept list here instead of letting new targets get by with an implicit pass. Jordan -Akhil + } +out: msm_gem_put_vaddr(obj); + return ret; } static int a6xx_ucode_init(struct msm_gpu *gpu) @@ -566,7 +611,13 @@ static int a6xx_ucode_init(struct msm_gpu *gpu) } msm_gem_object_set_name(a6xx_gpu->sqe_bo, "sqefw"); -
Re: [PATCH v2] drm/msm: a6xx: Make sure the SQE microcode is safe
On 2/10/2021 6:22 AM, Jordan Crouse wrote: Most a6xx targets have security issues that were fixed with new versions of the microcode(s). Make sure that we are booting with a safe version of the microcode for the target and print a message and error if not. v2: Add more informative error messages and fix typos Signed-off-by: Jordan Crouse --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 77 ++- 1 file changed, 64 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index ba8e9d3cf0fe..064b7face504 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -522,28 +522,73 @@ static int a6xx_cp_init(struct msm_gpu *gpu) return a6xx_idle(gpu, ring) ? 0 : -EINVAL; } -static void a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu, +/* + * Check that the microcode version is new enough to include several key + * security fixes. Return true if the ucode is safe. + */ +static bool a6xx_ucode_check_version(struct a6xx_gpu *a6xx_gpu, struct drm_gem_object *obj) { + struct adreno_gpu *adreno_gpu = _gpu->base; + struct msm_gpu *gpu = _gpu->base; u32 *buf = msm_gem_get_vaddr(obj); + bool ret = false; if (IS_ERR(buf)) - return; + return false; /* -* If the lowest nibble is 0xa that is an indication that this microcode -* has been patched. The actual version is in dword [3] but we only care -* about the patchlevel which is the lowest nibble of dword [3] -* -* Otherwise check that the firmware is greater than or equal to 1.90 -* which was the first version that had this fix built in +* Targets up to a640 (a618, a630 and a640) need to check for a +* microcode version that is patched to support the whereami opcode or +* one that is new enough to include it by default. */ - if (((buf[0] & 0xf) == 0xa) && (buf[2] & 0xf) >= 1) - a6xx_gpu->has_whereami = true; - else if ((buf[0] & 0xfff) > 0x190) - a6xx_gpu->has_whereami = true; + if (adreno_is_a618(adreno_gpu) || adreno_is_a630(adreno_gpu) || + adreno_is_a640(adreno_gpu)) { + /* +* If the lowest nibble is 0xa that is an indication that this +* microcode has been patched. The actual version is in dword +* [3] but we only care about the patchlevel which is the lowest +* nibble of dword [3] +* +* Otherwise check that the firmware is greater than or equal +* to 1.90 which was the first version that had this fix built +* in +*/ + if buf[0] & 0xf) == 0xa) && (buf[2] & 0xf) >= 1) || + (buf[0] & 0xfff) >= 0x190) { + a6xx_gpu->has_whereami = true; + ret = true; + goto out; + } + DRM_DEV_ERROR(>pdev->dev, + "a630 SQE ucode is too old. Have version %x need at least %x\n", + buf[0] & 0xfff, 0x190); + } else { + /* +* a650 tier targets don't need whereami but still need to be +* equal to or newer than 1.95 for other security fixes +*/ + if (adreno_is_a650(adreno_gpu)) { + if ((buf[0] & 0xfff) >= 0x195) { + ret = true; + goto out; + } + + DRM_DEV_ERROR(>pdev->dev, + "a650 SQE ucode is too old. Have version %x need at least %x\n", + buf[0] & 0xfff, 0x195); + } + + /* +* When a660 is added those targets should return true here +* since those have all the critical security fixes built in +* from the start +*/ Or we can just initialize 'ret' as true. -Akhil + } +out: msm_gem_put_vaddr(obj); + return ret; } static int a6xx_ucode_init(struct msm_gpu *gpu) @@ -566,7 +611,13 @@ static int a6xx_ucode_init(struct msm_gpu *gpu) } msm_gem_object_set_name(a6xx_gpu->sqe_bo, "sqefw"); - a6xx_ucode_check_version(a6xx_gpu, a6xx_gpu->sqe_bo); + if (!a6xx_ucode_check_version(a6xx_gpu, a6xx_gpu->sqe_bo)) { + msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->aspace); + drm_gem_object_put(a6xx_gpu->sqe_bo); + + a6xx_gpu->sqe_bo = NULL; + return -EPERM; + } } gpu_write64(gpu, REG_A6XX_CP_SQE_INSTR_BASE_LO,
Re: [PATCH] drm/msm: Fix legacy relocs path
On 2/5/2021 4:26 AM, Rob Clark wrote: From: Rob Clark In moving code around, we ended up using the same pointer to copy_from_user() the relocs tables as we used for the cmd table entry, which is clearly not right. This went unnoticed because modern mesa on non-ancent kernels does not actually use relocs. But this broke ancient mesa on modern kernels. Reported-by: Emil Velikov Fixes: 20224d715a88 ("drm/msm/submit: Move copy_from_user ahead of locking bos") Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem_submit.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index d04c349d8112..5480852bdeda 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -198,6 +198,8 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, submit->cmd[i].idx = submit_cmd.submit_idx; submit->cmd[i].nr_relocs = submit_cmd.nr_relocs; + userptr = u64_to_user_ptr(submit_cmd.relocs); + sz = array_size(submit_cmd.nr_relocs, sizeof(struct drm_msm_gem_submit_reloc)); /* check for overflow: */ Reviewed-by: Akhil P Oommen -Akhil.
Re: [PATCH v4 2/2] arm: dts: sc7180: Add support for gpu fuse
On 2/3/2021 4:22 AM, Bjorn Andersson wrote: On Fri 08 Jan 12:15 CST 2021, Akhil P Oommen wrote: Please align the $subject prefix with other changes in the same file. I fixed it up while picking up the patch this time. Will take of this in future. Thanks, Bjorn. -Akhil. Regards, Bjorn Add support for gpu fuse to help identify the supported opps. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 6678f1e..8cae3eb 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -675,6 +675,11 @@ reg = <0x25b 0x1>; bits = <1 3>; }; + + gpu_speed_bin: gpu_speed_bin@1d2 { + reg = <0x1d2 0x2>; + bits = <5 8>; + }; }; sdhc_1: sdhci@7c4000 { @@ -1907,52 +1912,69 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + nvmem-cells = <_speed_bin>; + nvmem-cell-names = "speed_bin"; + interconnects = <_noc MASTER_GFX3D 0 _virt SLAVE_EBI1 0>; interconnect-names = "gfx-mem"; gpu_opp_table: opp-table { compatible = "operating-points-v2"; +opp-82500 { + opp-hz = /bits/ 64 <82500>; + opp-level = ; + opp-peak-kBps = <8532000>; + opp-supported-hw = <0x04>; + }; + opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; opp-peak-kBps = <8532000>; + opp-supported-hw = <0x07>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; opp-peak-kBps = <7216000>; + opp-supported-hw = <0x07>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; opp-peak-kBps = <1804000>; + opp-supported-hw = <0x07>; }; }; }; -- 2.7.4 ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 03/13] opp: Keep track of currently programmed OPP
On 1/22/2021 10:15 AM, Viresh Kumar wrote: On 22-01-21, 00:41, Dmitry Osipenko wrote: 21.01.2021 14:17, Viresh Kumar пишет: @@ -1074,15 +1091,18 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq) if (!ret) { ret = _set_opp_bw(opp_table, opp, dev, false); - if (!ret) + if (!ret) { opp_table->enabled = true; + dev_pm_opp_put(old_opp); + + /* Make sure current_opp doesn't get freed */ + dev_pm_opp_get(opp); + opp_table->current_opp = opp; + } } I'm a bit surprised that _set_opp_bw() isn't used similarly to _set_opp_voltage() in _generic_set_opp_regulator(). I'd expect the BW requirement to be raised before the clock rate goes UP. I remember discussing that earlier when this stuff came in, and this I believe is the reason for that. We need to scale regulators before/after frequency because when we increase the frequency a regulator may _not_ be providing enough power to sustain that (even for a short while) and this may have undesired effects on the hardware and so it is important to prevent that malfunction. In case of bandwidth such issues will not happen (AFAIK) and doing it just once is normally enough. It is just about allowing more data to be transmitted, and won't make the hardware behave badly. I agree with Dmitry. BW is a shared resource in a lot of architectures. Raising clk before increasing the bw can lead to a scenario where this client saturate the entire BW for whatever small duration it may be. This will impact the latency requirements of other clients. -Akhil.
[PATCH v4 1/2] drm/msm: Add speed-bin support to a618 gpu
Some GPUs support different max frequencies depending on the platform. To identify the correct variant, we should check the gpu speedbin fuse value. Add support for this speedbin detection to a6xx family along with the required fuse details for a618 gpu. Signed-off-by: Akhil P Oommen --- Changes from v2: 1. Made the changes a6xx specific to save space. Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) Changes from v3: 1. Replace a618_speedbins[] with a function. (Jordan) drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 + 2 files changed, 85 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 1306618..499d134 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -10,6 +10,7 @@ #include #include +#include #include #define GPU_PAS_ID 13 @@ -1208,6 +1209,10 @@ static void a6xx_destroy(struct msm_gpu *gpu) a6xx_gmu_remove(a6xx_gpu); adreno_gpu_cleanup(adreno_gpu); + + if (a6xx_gpu->opp_table) + dev_pm_opp_put_supported_hw(a6xx_gpu->opp_table); + kfree(a6xx_gpu); } @@ -1264,6 +1269,78 @@ static uint32_t a6xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) return ring->memptrs->rptr = gpu_read(gpu, REG_A6XX_CP_RB_RPTR); } +static u32 a618_get_speed_bin(u32 fuse) +{ + if (fuse == 0) + return 0; + else if (fuse == 169) + return 1; + else if (fuse == 174) + return 2; + + return UINT_MAX; +} + +static u32 fuse_to_supp_hw(struct device *dev, u32 revn, u32 fuse) +{ + u32 val = UINT_MAX; + + if (revn == 618) + val = a618_get_speed_bin(fuse); + + if (val == UINT_MAX) { + DRM_DEV_ERROR(dev, + "missing support for speed-bin: %u. Some OPPs may not be supported by hardware", + fuse); + return UINT_MAX; + } + + return (1 << val); +} + +static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, + u32 revn) +{ + struct opp_table *opp_table; + struct nvmem_cell *cell; + u32 supp_hw = UINT_MAX; + void *buf; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) { + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + supp_hw = fuse_to_supp_hw(dev, revn, *((u32 *) buf)); + + kfree(buf); + nvmem_cell_put(cell); + +done: + opp_table = dev_pm_opp_set_supported_hw(dev, _hw, 1); + if (IS_ERR(opp_table)) + return PTR_ERR(opp_table); + + a6xx_gpu->opp_table = opp_table; + return 0; +} + static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, @@ -1325,6 +1402,12 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) a6xx_llc_slices_init(pdev, a6xx_gpu); + ret = a6xx_set_supported_hw(>dev, a6xx_gpu, info->revn); + if (ret) { + a6xx_destroy(&(a6xx_gpu->base.base)); + return ERR_PTR(ret); + } + ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 1); if (ret) { a6xx_destroy(&(a6xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index e793d32..ce0610c 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -33,6 +33,8 @@ struct a6xx_gpu { void *llc_slice; void *htw_llc_slice; bool have_mmu500; + + struct opp_table *opp_table; }; #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base) -- 2.7.4
[PATCH v4 2/2] arm: dts: sc7180: Add support for gpu fuse
Add support for gpu fuse to help identify the supported opps. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 6678f1e..8cae3eb 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -675,6 +675,11 @@ reg = <0x25b 0x1>; bits = <1 3>; }; + + gpu_speed_bin: gpu_speed_bin@1d2 { + reg = <0x1d2 0x2>; + bits = <5 8>; + }; }; sdhc_1: sdhci@7c4000 { @@ -1907,52 +1912,69 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + nvmem-cells = <_speed_bin>; + nvmem-cell-names = "speed_bin"; + interconnects = <_noc MASTER_GFX3D 0 _virt SLAVE_EBI1 0>; interconnect-names = "gfx-mem"; gpu_opp_table: opp-table { compatible = "operating-points-v2"; + opp-82500 { + opp-hz = /bits/ 64 <82500>; + opp-level = ; + opp-peak-kBps = <8532000>; + opp-supported-hw = <0x04>; + }; + opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; opp-peak-kBps = <8532000>; + opp-supported-hw = <0x07>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; opp-peak-kBps = <7216000>; + opp-supported-hw = <0x07>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; opp-peak-kBps = <1804000>; + opp-supported-hw = <0x07>; }; }; }; -- 2.7.4
Re: [PATCH v3 2/2] arm: dts: sc7180: Add support for gpu fuse
On 12/7/2020 4:12 PM, Akhil P Oommen wrote: Add support for gpu fuse to help identify the supported opps. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 6678f1e..8cae3eb 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -675,6 +675,11 @@ reg = <0x25b 0x1>; bits = <1 3>; }; + + gpu_speed_bin: gpu_speed_bin@1d2 { + reg = <0x1d2 0x2>; + bits = <5 8>; + }; }; sdhc_1: sdhci@7c4000 { @@ -1907,52 +1912,69 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + nvmem-cells = <_speed_bin>; + nvmem-cell-names = "speed_bin"; + interconnects = <_noc MASTER_GFX3D 0 _virt SLAVE_EBI1 0>; interconnect-names = "gfx-mem"; gpu_opp_table: opp-table { compatible = "operating-points-v2"; +opp-82500 { + opp-hz = /bits/ 64 <82500>; + opp-level = ; + opp-peak-kBps = <8532000>; + opp-supported-hw = <0x04>; + }; + opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; opp-peak-kBps = <8532000>; + opp-supported-hw = <0x07>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; opp-peak-kBps = <7216000>; + opp-supported-hw = <0x07>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; opp-peak-kBps = <1804000>; + opp-supported-hw = <0x07>; }; }; }; A gentle ping. -Akhil.
Re: [PATCH v3 1/2] drm/msm: Add speed-bin support to a618 gpu
On 12/7/2020 4:12 PM, Akhil P Oommen wrote: Some GPUs support different max frequencies depending on the platform. To identify the correct variant, we should check the gpu speedbin fuse value. Add support for this speedbin detection to a6xx family along with the required fuse details for a618 gpu. Signed-off-by: Akhil P Oommen --- Changes from v2: 1. Made the changes a6xx specific to save space. Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 74 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 + 2 files changed, 76 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 1306618..6304578 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -10,10 +10,13 @@ #include #include +#include #include #define GPU_PAS_ID 13 +const u32 a618_speedbins[] = {0, 169, 174}; + static inline bool _a6xx_check_idle(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -1208,6 +1211,10 @@ static void a6xx_destroy(struct msm_gpu *gpu) a6xx_gmu_remove(a6xx_gpu); adreno_gpu_cleanup(adreno_gpu); + + if (a6xx_gpu->opp_table) + dev_pm_opp_put_supported_hw(a6xx_gpu->opp_table); + kfree(a6xx_gpu); } @@ -1264,6 +1271,67 @@ static uint32_t a6xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) return ring->memptrs->rptr = gpu_read(gpu, REG_A6XX_CP_RB_RPTR); } +static u32 fuse_to_supp_hw(struct device *dev, u32 revn, u32 fuse) +{ + int i; + + if (revn == 618) { + for (i = 0; i < ARRAY_SIZE(a618_speedbins); i++) { + if (fuse == a618_speedbins[i]) + return (1 << i); + } + } + + DRM_DEV_ERROR(dev, + "missing support for speed-bin: %u. Some OPPs may not be supported by hardware", + fuse); + return ~0U; +} + +static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, + u32 revn) +{ + + struct opp_table *opp_table; + struct nvmem_cell *cell; + u32 supp_hw = ~0U; + void *buf; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) { + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + supp_hw = fuse_to_supp_hw(dev, revn, *((u32 *) buf)); + + kfree(buf); + nvmem_cell_put(cell); + +done: + opp_table = dev_pm_opp_set_supported_hw(dev, _hw, 1); + if (IS_ERR(opp_table)) + return PTR_ERR(opp_table); + + a6xx_gpu->opp_table = opp_table; + return 0; +} + static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, @@ -1325,6 +1393,12 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) a6xx_llc_slices_init(pdev, a6xx_gpu); + ret = a6xx_set_supported_hw(>dev, a6xx_gpu, info->revn); + if (ret) { + a6xx_destroy(&(a6xx_gpu->base.base)); + return ERR_PTR(ret); + } + ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 1); if (ret) { a6xx_destroy(&(a6xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index e793d32..ce0610c 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -33,6 +33,8 @@ struct a6xx_gpu { void *llc_slice; void *htw_llc_slice; bool have_mmu500; + + struct opp_table *opp_table; }; #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base) A gentle ping. -Akhil.
[PATCH v3 1/2] drm/msm: Add speed-bin support to a618 gpu
Some GPUs support different max frequencies depending on the platform. To identify the correct variant, we should check the gpu speedbin fuse value. Add support for this speedbin detection to a6xx family along with the required fuse details for a618 gpu. Signed-off-by: Akhil P Oommen --- Changes from v2: 1. Made the changes a6xx specific to save space. Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 74 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 + 2 files changed, 76 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 1306618..6304578 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -10,10 +10,13 @@ #include #include +#include #include #define GPU_PAS_ID 13 +const u32 a618_speedbins[] = {0, 169, 174}; + static inline bool _a6xx_check_idle(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -1208,6 +1211,10 @@ static void a6xx_destroy(struct msm_gpu *gpu) a6xx_gmu_remove(a6xx_gpu); adreno_gpu_cleanup(adreno_gpu); + + if (a6xx_gpu->opp_table) + dev_pm_opp_put_supported_hw(a6xx_gpu->opp_table); + kfree(a6xx_gpu); } @@ -1264,6 +1271,67 @@ static uint32_t a6xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring) return ring->memptrs->rptr = gpu_read(gpu, REG_A6XX_CP_RB_RPTR); } +static u32 fuse_to_supp_hw(struct device *dev, u32 revn, u32 fuse) +{ + int i; + + if (revn == 618) { + for (i = 0; i < ARRAY_SIZE(a618_speedbins); i++) { + if (fuse == a618_speedbins[i]) + return (1 << i); + } + } + + DRM_DEV_ERROR(dev, + "missing support for speed-bin: %u. Some OPPs may not be supported by hardware", + fuse); + return ~0U; +} + +static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, + u32 revn) +{ + + struct opp_table *opp_table; + struct nvmem_cell *cell; + u32 supp_hw = ~0U; + void *buf; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) { + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + DRM_DEV_ERROR(dev, + "failed to read speed-bin. Some OPPs may not be supported by hardware"); + goto done; + } + + supp_hw = fuse_to_supp_hw(dev, revn, *((u32 *) buf)); + + kfree(buf); + nvmem_cell_put(cell); + +done: + opp_table = dev_pm_opp_set_supported_hw(dev, _hw, 1); + if (IS_ERR(opp_table)) + return PTR_ERR(opp_table); + + a6xx_gpu->opp_table = opp_table; + return 0; +} + static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, @@ -1325,6 +1393,12 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) a6xx_llc_slices_init(pdev, a6xx_gpu); + ret = a6xx_set_supported_hw(>dev, a6xx_gpu, info->revn); + if (ret) { + a6xx_destroy(&(a6xx_gpu->base.base)); + return ERR_PTR(ret); + } + ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 1); if (ret) { a6xx_destroy(&(a6xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index e793d32..ce0610c 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -33,6 +33,8 @@ struct a6xx_gpu { void *llc_slice; void *htw_llc_slice; bool have_mmu500; + + struct opp_table *opp_table; }; #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base) -- 2.7.4
[PATCH v3 2/2] arm: dts: sc7180: Add support for gpu fuse
Add support for gpu fuse to help identify the supported opps. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 6678f1e..8cae3eb 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -675,6 +675,11 @@ reg = <0x25b 0x1>; bits = <1 3>; }; + + gpu_speed_bin: gpu_speed_bin@1d2 { + reg = <0x1d2 0x2>; + bits = <5 8>; + }; }; sdhc_1: sdhci@7c4000 { @@ -1907,52 +1912,69 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + nvmem-cells = <_speed_bin>; + nvmem-cell-names = "speed_bin"; + interconnects = <_noc MASTER_GFX3D 0 _virt SLAVE_EBI1 0>; interconnect-names = "gfx-mem"; gpu_opp_table: opp-table { compatible = "operating-points-v2"; + opp-82500 { + opp-hz = /bits/ 64 <82500>; + opp-level = ; + opp-peak-kBps = <8532000>; + opp-supported-hw = <0x04>; + }; + opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; opp-peak-kBps = <8532000>; + opp-supported-hw = <0x07>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; opp-peak-kBps = <7216000>; + opp-supported-hw = <0x07>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; opp-peak-kBps = <1804000>; + opp-supported-hw = <0x07>; }; }; }; -- 2.7.4
Re: [PATCH v2 1/3] drm/msm: adreno: Make speed-bin support generic
On 12/2/2020 10:00 PM, Jordan Crouse wrote: On Wed, Dec 02, 2020 at 08:53:51PM +0530, Akhil P Oommen wrote: On 11/30/2020 10:32 PM, Jordan Crouse wrote: On Fri, Nov 27, 2020 at 06:19:44PM +0530, Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..b342fa4 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + if (!speedbins) + goto done; + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); +
Re: [PATCH v2 1/3] drm/msm: adreno: Make speed-bin support generic
<< Resending since Jordan wasn't in the CC list >> On 11/30/2020 10:32 PM, Jordan Crouse wrote: On Fri, Nov 27, 2020 at 06:19:44PM +0530, Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..b342fa4 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + if (!speedbins) + goto done; + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + return PTR_ERR(buf); +
Re: [PATCH v2 1/3] drm/msm: adreno: Make speed-bin support generic
On 11/30/2020 10:32 PM, Jordan Crouse wrote: On Fri, Nov 27, 2020 at 06:19:44PM +0530, Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..b342fa4 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + if (!speedbins) + goto done; + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + return PTR_ERR(buf); + } + + bin = *((u32 *) buf); + + for (i = 0; i < spee
[PATCH v2 1/3] drm/msm: adreno: Make speed-bin support generic
So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- Changes from v1: 1. Added the changes to support a618 sku to the series. 2. Avoid failing probe in case of an unsupported sku. (Rob) drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..b342fa4 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + if (!speedbins) + goto done; + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + return PTR_ERR(buf); + } + + bin = *((u32 *) buf); + + for (i = 0; i < speedbins_count; i++) { + if (bin == speedbins[i]) { + val = (1 << i); +
Re: [Freedreno] [PATCH] drm/msm: adreno: Make speed-bin support generic
On 11/16/2020 10:44 PM, Jordan Crouse wrote: On Mon, Nov 16, 2020 at 07:40:03PM +0530, Akhil P Oommen wrote: On 11/12/2020 10:05 PM, Jordan Crouse wrote: On Thu, Nov 12, 2020 at 09:19:04PM +0530, Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- This patch is rebased on top of msm-next-staging branch in rob's tree. drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..cdd0c11 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; We don't need to make this generic and put it in the table. Just call the function from the target specific code and pass the speedbin array and size from there. I didn't get you entirely. Do you mean we should avoid keeping speedbin array in the adreno_gpu->info table? Exactly. Jordan But why duplicate this code if it can be made generic? Could you please check the v2 version? -Akhil. -Akhil. + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin
[PATCH v2 2/3] drm/msm: Add speed-bin support for a618 gpu
Extend speed-bin support to a618 gpu. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/adreno_device.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index e0ff16c..21db7ae 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,7 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a618_speedbins[] = {0, 169, 174}; const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; static const struct adreno_info gpulist[] = { @@ -196,6 +197,8 @@ static const struct adreno_info gpulist[] = { .gmem = SZ_512K, .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a6xx_gpu_init, + .speedbins = a618_speedbins, + .speedbins_count = ARRAY_SIZE(a618_speedbins), }, { .rev = ADRENO_REV(6, 3, 0, ANY_ID), .revn = 630, -- 2.7.4
[PATCH v2 3/3] arm: dts: sc7180: Add support for gpu fuse
Add support for gpu fuse to help identify the supported opps. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 6678f1e..8cae3eb 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -675,6 +675,11 @@ reg = <0x25b 0x1>; bits = <1 3>; }; + + gpu_speed_bin: gpu_speed_bin@1d2 { + reg = <0x1d2 0x2>; + bits = <5 8>; + }; }; sdhc_1: sdhci@7c4000 { @@ -1907,52 +1912,69 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + nvmem-cells = <_speed_bin>; + nvmem-cell-names = "speed_bin"; + interconnects = <_noc MASTER_GFX3D 0 _virt SLAVE_EBI1 0>; interconnect-names = "gfx-mem"; gpu_opp_table: opp-table { compatible = "operating-points-v2"; + opp-82500 { + opp-hz = /bits/ 64 <82500>; + opp-level = ; + opp-peak-kBps = <8532000>; + opp-supported-hw = <0x04>; + }; + opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; opp-peak-kBps = <8532000>; + opp-supported-hw = <0x07>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; opp-peak-kBps = <7216000>; + opp-supported-hw = <0x07>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; opp-peak-kBps = <5412000>; + opp-supported-hw = <0x07>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; opp-peak-kBps = <3072000>; + opp-supported-hw = <0x07>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; opp-peak-kBps = <1804000>; + opp-supported-hw = <0x07>; }; }; }; -- 2.7.4
Re: [PATCH] drm/msm: adreno: Make speed-bin support generic
On 11/16/2020 9:52 PM, Rob Clark wrote: On Mon, Nov 16, 2020 at 6:34 AM Akhil P Oommen wrote: On 11/12/2020 10:07 PM, Rob Clark wrote: On Thu, Nov 12, 2020 at 7:49 AM Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- This patch is rebased on top of msm-next-staging branch in rob's tree. drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..cdd0c11 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + /* A speedbin table is must if the platform supports speedbin */ + if (!speedbins) { + DRM_DEV_ERROR(dev, "speed-bin table is missing\n"); + return -ENOENT; Hmm, t
Re: [PATCH] drm/msm: adreno: Make speed-bin support generic
On 11/12/2020 10:07 PM, Rob Clark wrote: On Thu, Nov 12, 2020 at 7:49 AM Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- This patch is rebased on top of msm-next-staging branch in rob's tree. drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..cdd0c11 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + /* A speedbin table is must if the platform supports speedbin */ + if (!speedbins) { + DRM_DEV_ERROR(dev, "speed-bin table is missing\n"); + return -ENOENT; Hmm, this means that hw which supports speed-bin, but for which we haven't yet added a speedbin table, will start failing. Which see
Re: [PATCH] drm/msm: adreno: Make speed-bin support generic
On 11/12/2020 10:05 PM, Jordan Crouse wrote: On Thu, Nov 12, 2020 at 09:19:04PM +0530, Akhil P Oommen wrote: So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- This patch is rebased on top of msm-next-staging branch in rob's tree. drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..cdd0c11 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; We don't need to make this generic and put it in the table. Just call the function from the target specific code and pass the speedbin array and size from there. I didn't get you entirely. Do you mean we should avoid keeping speedbin array in the adreno_gpu->info table? -Akhil. + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + /* A speedbin table is must if the platform supports sp
[PATCH] drm/msm: adreno: Make speed-bin support generic
So far a530v2 gpu has support for detecting its supported opps based on a fuse value called speed-bin. This patch makes this support generic across gpu families. This is in preparation to extend speed-bin support to a6x family. Signed-off-by: Akhil P Oommen --- This patch is rebased on top of msm-next-staging branch in rob's tree. drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 34 -- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.c| 71 ++ drivers/gpu/drm/msm/adreno/adreno_gpu.h| 5 +++ 4 files changed, 80 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 8fa5c91..7d42321 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1531,38 +1531,6 @@ static const struct adreno_gpu_funcs funcs = { .get_timestamp = a5xx_get_timestamp, }; -static void check_speed_bin(struct device *dev) -{ - struct nvmem_cell *cell; - u32 val; - - /* -* If the OPP table specifies a opp-supported-hw property then we have -* to set something with dev_pm_opp_set_supported_hw() or the table -* doesn't get populated so pick an arbitrary value that should -* ensure the default frequencies are selected but not conflict with any -* actual bins -*/ - val = 0x80; - - cell = nvmem_cell_get(dev, "speed_bin"); - - if (!IS_ERR(cell)) { - void *buf = nvmem_cell_read(cell, NULL); - - if (!IS_ERR(buf)) { - u8 bin = *((u8 *) buf); - - val = (1 << bin); - kfree(buf); - } - - nvmem_cell_put(cell); - } - - dev_pm_opp_set_supported_hw(dev, , 1); -} - struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; @@ -1588,8 +1556,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) a5xx_gpu->lm_leakage = 0x4E001A; - check_speed_bin(>dev); - ret = adreno_gpu_init(dev, pdev, adreno_gpu, , 4); if (ret) { a5xx_destroy(&(a5xx_gpu->base.base)); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 87c8b03..e0ff16c 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -18,6 +18,8 @@ bool snapshot_debugbus = false; MODULE_PARM_DESC(snapshot_debugbus, "Include debugbus sections in GPU devcoredump (if not fused off)"); module_param_named(snapshot_debugbus, snapshot_debugbus, bool, 0600); +const u32 a530v2_speedbins[] = {0, 1, 2, 3, 4, 5, 6, 7}; + static const struct adreno_info gpulist[] = { { .rev = ADRENO_REV(2, 0, 0, 0), @@ -163,6 +165,8 @@ static const struct adreno_info gpulist[] = { ADRENO_QUIRK_FAULT_DETECT_MASK, .init = a5xx_gpu_init, .zapfw = "a530_zap.mdt", + .speedbins = a530v2_speedbins, + .speedbins_count = ARRAY_SIZE(a530v2_speedbins), }, { .rev = ADRENO_REV(5, 4, 0, 2), .revn = 540, diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index f21561d..cdd0c11 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include "adreno_gpu.h" #include "msm_gem.h" @@ -891,6 +892,69 @@ void adreno_gpu_ocmem_cleanup(struct adreno_ocmem *adreno_ocmem) adreno_ocmem->hdl); } +static int adreno_set_supported_hw(struct device *dev, + struct adreno_gpu *adreno_gpu) +{ + u8 speedbins_count = adreno_gpu->info->speedbins_count; + const u32 *speedbins = adreno_gpu->info->speedbins; + struct nvmem_cell *cell; + u32 bin, i; + u32 val = 0; + void *buf, *opp_table; + + cell = nvmem_cell_get(dev, "speed_bin"); + /* +* -ENOENT means that the platform doesn't support speedbin which is +* fine +*/ + if (PTR_ERR(cell) == -ENOENT) + return 0; + else if (IS_ERR(cell)) + return PTR_ERR(cell); + + /* A speedbin table is must if the platform supports speedbin */ + if (!speedbins) { + DRM_DEV_ERROR(dev, "speed-bin table is missing\n"); + return -ENOENT; + } + + buf = nvmem_cell_read(cell, NULL); + if (IS_ERR(buf)) { + nvmem_cell_put(cell); + return PTR_ERR(buf); + } + + bin = *((u32 *) buf); + + for (i = 0; i < speedbins_count; i++) { +
Re: [Freedreno] [PATCH v5 3/3] dt-bindings: drm/msm/gpu: Add cooling device support
On 11/5/2020 2:28 AM, Rob Clark wrote: On Wed, Nov 4, 2020 at 12:03 PM Rob Herring wrote: On Fri, 30 Oct 2020 16:17:12 +0530, Akhil P Oommen wrote: Add cooling device support to gpu. A cooling device is bound to a thermal zone to allow thermal mitigation. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke --- Documentation/devicetree/bindings/display/msm/gpu.txt | 7 +++ 1 file changed, 7 insertions(+) Please add Acked-by/Reviewed-by tags when posting new versions. However, there's no need to repost patches *only* to add the tags. The upstream maintainer will do that for acks received on the version they apply. If a tag was not added on purpose, please state why and what changed. Thanks Rob I've copied over your ack from the previous version.. but yes, it definitely makes my life easier when patch senders do this for me ;-) BR, -R Robh, you Acked v4 after I shared v5 patches!! -Akhil.
[PATCH v5 3/3] dt-bindings: drm/msm/gpu: Add cooling device support
Add cooling device support to gpu. A cooling device is bound to a thermal zone to allow thermal mitigation. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke --- Documentation/devicetree/bindings/display/msm/gpu.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index 1af0ff1..090dcb3 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -39,6 +39,10 @@ Required properties: a4xx Snapdragon SoCs. See Documentation/devicetree/bindings/sram/qcom,ocmem.yaml. +Optional properties: +- #cooling-cells: The value must be 2. For details, please refer + Documentation/devicetree/bindings/thermal/thermal-cooling-devices.yaml. + Example 3xx/4xx: / { @@ -61,6 +65,7 @@ Example 3xx/4xx: power-domains = < OXILICX_GDSC>; operating-points-v2 = <_opp_table>; iommus = <_iommu 0>; + #cooling-cells = <2>; }; gpu_sram: ocmem@fdd0 { @@ -98,6 +103,8 @@ Example a6xx (with GMU): reg = <0x500 0x4>, <0x509e000 0x10>; reg-names = "kgsl_3d0_reg_memory", "cx_mem"; + #cooling-cells = <2>; + /* * Look ma, no clocks! The GPU clocks and power are * controlled entirely by the GMU -- 2.7.4
[PATCH v5 2/3] arm64: dts: qcom: sc7180: Add gpu cooling support
Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..8e2000c 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1886,6 +1886,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3827,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3845,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3873,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { -- 2.7.4
[PATCH v5 1/3] drm/msm: Add support for GPU cooling
Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen Tested-by: Matthias Kaehlcke --- Changes in v5: 1. Update Reviewed-by/Tested-by tags Changes in v4: 1. Fix gpu cooling map. 2. Add mka's Reviewed-by tag. Changes in v3: 1. Minor fix in binding documentation (RobH) Changes in v2: 1. Update the dt bindings documentation drivers/gpu/drm/msm/msm_gpu.c | 12 drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..9f9db46 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -1005,4 +1015,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); } diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 6c9e1fd..9a8f20d 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -147,6 +147,8 @@ struct msm_gpu { struct msm_gpu_state *crashstate; /* True if the hardware supports expanded apriv (a650 and newer) */ bool hw_apriv; + + struct thermal_cooling_device *cooling; }; static inline struct msm_gpu *dev_to_gpu(struct device *dev) -- 2.7.4
Re: [v4,1/3] drm/msm: Add support for GPU cooling
On 10/30/2020 2:18 AM, m...@chromium.org wrote: On Thu, Oct 29, 2020 at 01:37:19PM +0530, Akhil P Oommen wrote: Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke Wait, I did not post a 'Reviewed-by' tag for this patch! I think the patch should be ok, but I'm still not super happy about the resource management involving devfreq in general (see discussion on https://patchwork.freedesktop.org/patch/394291/?series=82476=1). It's not really something introduced by this patch, but if it ever gets fixed releasing the cooling device at the end of msm_gpu_cleanup() after everything else might cause trouble. In summary, I'm supportive of landing this patch, but reluctant to 'sign it off' because of the above. In any case: Tested-by: Matthias Kaehlcke Sorry, Matthias. My mistake. You shared the reviewed tag for the dt-bindings update. Will fix this ASAP. Thanks for verifying this. -Akhil. ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v4 2/3] arm64: dts: qcom: sc7180: Add gpu cooling support
Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..8e2000c 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1886,6 +1886,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3827,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3845,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3873,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { -- 2.7.4
[PATCH v4 3/3] dt-bindings: drm/msm/gpu: Add cooling device support
Add cooling device support to gpu. A cooling device is bound to a thermal zone to allow thermal mitigation. Signed-off-by: Akhil P Oommen --- Documentation/devicetree/bindings/display/msm/gpu.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index 1af0ff1..090dcb3 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -39,6 +39,10 @@ Required properties: a4xx Snapdragon SoCs. See Documentation/devicetree/bindings/sram/qcom,ocmem.yaml. +Optional properties: +- #cooling-cells: The value must be 2. For details, please refer + Documentation/devicetree/bindings/thermal/thermal-cooling-devices.yaml. + Example 3xx/4xx: / { @@ -61,6 +65,7 @@ Example 3xx/4xx: power-domains = < OXILICX_GDSC>; operating-points-v2 = <_opp_table>; iommus = <_iommu 0>; + #cooling-cells = <2>; }; gpu_sram: ocmem@fdd0 { @@ -98,6 +103,8 @@ Example a6xx (with GMU): reg = <0x500 0x4>, <0x509e000 0x10>; reg-names = "kgsl_3d0_reg_memory", "cx_mem"; + #cooling-cells = <2>; + /* * Look ma, no clocks! The GPU clocks and power are * controlled entirely by the GMU -- 2.7.4
[PATCH v4 1/3] drm/msm: Add support for GPU cooling
Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen Reviewed-by: Matthias Kaehlcke --- Changes in v4: 1. Fix gpu cooling map. 2. Add mka's Reviewed-by tag. Changes in v3: 1. Minor fix in binding documentation (RobH) Changes in v2: 1. Update the dt bindings documentation drivers/gpu/drm/msm/msm_gpu.c | 12 drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..9f9db46 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -1005,4 +1015,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); } diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 6c9e1fd..9a8f20d 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -147,6 +147,8 @@ struct msm_gpu { struct msm_gpu_state *crashstate; /* True if the hardware supports expanded apriv (a650 and newer) */ bool hw_apriv; + + struct thermal_cooling_device *cooling; }; static inline struct msm_gpu *dev_to_gpu(struct device *dev) -- 2.7.4
Re: [v3,2/3] arm64: dts: qcom: sc7180: Add gpu cooling support
On 10/29/2020 6:09 AM, m...@chromium.org wrote: Hi Akhil, On Wed, Oct 28, 2020 at 07:09:53PM +0530, Akhil P Oommen wrote: Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..a7ea029 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1886,6 +1886,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3827,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3845,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3873,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; Copy & paste error, this should be 'gpuss1_alert0'. aah! you are correct. --Akhil + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { Other than the C error: Reviewed-by: Matthias Kaehlcke ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v3 2/3] arm64: dts: qcom: sc7180: Add gpu cooling support
Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..a7ea029 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1886,6 +1886,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3827,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3845,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3873,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { -- 2.7.4
[PATCH v3 1/3] drm/msm: Add support for GPU cooling
Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- Changes in v3: 1. Minor fix in binding documentation (RobH) Changes in v2: 1. Update the dt bindings documentation drivers/gpu/drm/msm/msm_gpu.c | 12 drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..9f9db46 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -1005,4 +1015,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); } diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 6c9e1fd..9a8f20d 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -147,6 +147,8 @@ struct msm_gpu { struct msm_gpu_state *crashstate; /* True if the hardware supports expanded apriv (a650 and newer) */ bool hw_apriv; + + struct thermal_cooling_device *cooling; }; static inline struct msm_gpu *dev_to_gpu(struct device *dev) -- 2.7.4
[PATCH v3 3/3] dt-bindings: drm/msm/gpu: Add cooling device support
Add cooling device support to gpu. A cooling device is bound to a thermal zone to allow thermal mitigation. Signed-off-by: Akhil P Oommen --- Documentation/devicetree/bindings/display/msm/gpu.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index 1af0ff1..090dcb3 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -39,6 +39,10 @@ Required properties: a4xx Snapdragon SoCs. See Documentation/devicetree/bindings/sram/qcom,ocmem.yaml. +Optional properties: +- #cooling-cells: The value must be 2. For details, please refer + Documentation/devicetree/bindings/thermal/thermal-cooling-devices.yaml. + Example 3xx/4xx: / { @@ -61,6 +65,7 @@ Example 3xx/4xx: power-domains = < OXILICX_GDSC>; operating-points-v2 = <_opp_table>; iommus = <_iommu 0>; + #cooling-cells = <2>; }; gpu_sram: ocmem@fdd0 { @@ -98,6 +103,8 @@ Example a6xx (with GMU): reg = <0x500 0x4>, <0x509e000 0x10>; reg-names = "kgsl_3d0_reg_memory", "cx_mem"; + #cooling-cells = <2>; + /* * Look ma, no clocks! The GPU clocks and power are * controlled entirely by the GMU -- 2.7.4
[PATCH v2 2/2] drm/msm: Fix duplicate gpu node in icc summary
The dev_pm_opp_of_add_table() api initializes the icc nodes for gpu indirectly. So we can avoid using of_icc_get() api in the common probe path. To improve this, move of_icc_get() to target specific code where it is required. This patch helps to fix duplicate gpu node listed in the interconnect summary from the debugfs. Signed-off-by: Akhil P Oommen --- Changes in v2: 1. Minor updates (Jordan) drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 21 +++-- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 20 ++-- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 32 +--- 3 files changed, 38 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index f29c77d..93da668 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -519,6 +519,8 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -566,13 +568,28 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); return gpu; diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index 2b93b33..c0be3a0 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c @@ -648,6 +648,8 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -694,13 +696,27 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); return gpu; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index fd8f491..ddbd863 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -899,7 +899,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, struct adreno_platform_config *config = dev->platform_data; struct msm_gpu_config adreno_gpu_config = { 0 }; struct msm_gpu *gpu = _gpu->base; - int ret; adreno_gpu->funcs = funcs; adreno_gpu->info = adreno_info(config->rev); @@ -918,37 +917,8 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, pm_runtime_use_autosuspend(dev); pm_runtime_enable(dev); - ret = msm_gpu_init(drm, pdev, _gpu->base, >base, + return msm_gpu_init(drm, pdev, _gpu->base, >base, adreno_gpu->info->name, _gpu_config); - if (ret) - r
[PATCH v2 1/2] drm/msm: Implement shutdown callback for adreno
Implement the shutdown callback for adreno gpu platform device to safely shutdown it before a system reboot. This helps to avoid futher transactions from gpu after the smmu is moved to bypass mode. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 58e03b2..87c8b03 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -475,6 +475,11 @@ static int adreno_remove(struct platform_device *pdev) return 0; } +static void adreno_shutdown(struct platform_device *pdev) +{ + pm_runtime_force_suspend(>dev); +} + static const struct of_device_id dt_match[] = { { .compatible = "qcom,adreno" }, { .compatible = "qcom,adreno-3xx" }, @@ -509,6 +514,7 @@ static const struct dev_pm_ops adreno_pm_ops = { static struct platform_driver adreno_driver = { .probe = adreno_probe, .remove = adreno_remove, + .shutdown = adreno_shutdown, .driver = { .name = "adreno", .of_match_table = dt_match, -- 2.7.4
Re: [PATCH 2/2] drm/msm: Fix duplicate gpu node in icc summary
On 10/19/2020 8:29 PM, Jordan Crouse wrote: On Mon, Oct 19, 2020 at 06:49:18PM +0530, Akhil P Oommen wrote: On targets with a6xx gpu, there is a duplicate gpu icc node listed in the interconnect summary. On these targets, calling This first sentence is confusing to me. I think the following few sentences do a better job of explaining what you are trying to do. I can just remove that line. dev_pm_opp_of_add_table() api initializes the icc nodes for gpu indirectly. So we should avoid using of_icc_get() api in the common probe path. To fix this, we can move of_icc_get() to target specific code where it is required. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 21 +++-- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 20 ++-- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 29 + 3 files changed, 38 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index f29c77d..93da668 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -519,6 +519,8 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -566,13 +568,28 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); This seems reasonable but I hope we can get somebody to sign off on a real a3xx part. return gpu; diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index 2b93b33..c0be3a0 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c @@ -648,6 +648,8 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -694,13 +696,27 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); Less confident we can find any 4xx fans to test this, but if a3xx works then so should this (in theory). return gpu; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index fd8f491..6e3b820 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -920,35 +920,8 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, ret = msm_gpu_init(drm, pdev, _gpu->base, >base, adreno_gpu->info->name, _gpu_config); - if (ret) - return ret; - - /* -* The legacy case, before "interconnect-names",
[PATCH 1/2] drm/msm: Implement shutdown callback for adreno
Implement the shutdown callback for adreno gpu platform device to safely shutdown it before a system reboot. This helps to avoid futher transactions from gpu after the smmu is moved to bypass mode. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 58e03b2..87c8b03 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -475,6 +475,11 @@ static int adreno_remove(struct platform_device *pdev) return 0; } +static void adreno_shutdown(struct platform_device *pdev) +{ + pm_runtime_force_suspend(>dev); +} + static const struct of_device_id dt_match[] = { { .compatible = "qcom,adreno" }, { .compatible = "qcom,adreno-3xx" }, @@ -509,6 +514,7 @@ static const struct dev_pm_ops adreno_pm_ops = { static struct platform_driver adreno_driver = { .probe = adreno_probe, .remove = adreno_remove, + .shutdown = adreno_shutdown, .driver = { .name = "adreno", .of_match_table = dt_match, -- 2.7.4
[PATCH 2/2] drm/msm: Fix duplicate gpu node in icc summary
On targets with a6xx gpu, there is a duplicate gpu icc node listed in the interconnect summary. On these targets, calling dev_pm_opp_of_add_table() api initializes the icc nodes for gpu indirectly. So we should avoid using of_icc_get() api in the common probe path. To fix this, we can move of_icc_get() to target specific code where it is required. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 21 +++-- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 20 ++-- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 29 + 3 files changed, 38 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c index f29c77d..93da668 100644 --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c @@ -519,6 +519,8 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -566,13 +568,28 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); return gpu; diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c index 2b93b33..c0be3a0 100644 --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c @@ -648,6 +648,8 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) struct msm_gpu *gpu; struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; + struct icc_path *ocmem_icc_path; + struct icc_path *icc_path; int ret; if (!pdev) { @@ -694,13 +696,27 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev) goto fail; } + icc_path = devm_of_icc_get(>dev, "gfx-mem"); + ret = IS_ERR(icc_path); + if (ret) + goto fail; + + ocmem_icc_path = devm_of_icc_get(>dev, "ocmem"); + ret = IS_ERR(ocmem_icc_path); + if (ret) { + /* allow -ENODATA, ocmem icc is optional */ + if (ret != -ENODATA) + goto fail; + ocmem_icc_path = NULL; + } + /* * Set the ICC path to maximum speed for now by multiplying the fastest * frequency by the bus width (8). We'll want to scale this later on to * improve battery life. */ - icc_set_bw(gpu->icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); - icc_set_bw(gpu->ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); + icc_set_bw(ocmem_icc_path, 0, Bps_to_icc(gpu->fast_rate) * 8); return gpu; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index fd8f491..6e3b820 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -920,35 +920,8 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, ret = msm_gpu_init(drm, pdev, _gpu->base, >base, adreno_gpu->info->name, _gpu_config); - if (ret) - return ret; - - /* -* The legacy case, before "interconnect-names", only has a -* single interconnect path which is equivalent to "gfx-mem" -*/ - if (!of_find_property(dev->of_node, "interconnect-names", NULL)) { - gpu->icc_path = of_icc_get(dev, NULL); - } else { - gpu->icc_path = of_icc_get(dev, "gfx-mem"); - gpu->ocmem_icc_path = of_icc_get(dev, "ocmem"); - } - if (IS_ERR(gpu-&g
[PATCH v2 2/3] arm64: dts: qcom: sc7180: Add gpu cooling support
Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- The thermal policy should be set as 'step_wise' for gpu tzones from the userspace during boot up. arch/arm64/boot/dts/qcom/sc7180.dtsi | 30 +++--- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..a7ea029 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1886,6 +1886,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3827,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3845,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3873,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { -- 2.7.4
[PATCH v2 1/3] drm/msm: Add support for GPU cooling
Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- Changes in v2: 1. Update the dt bindings documentation drivers/gpu/drm/msm/msm_gpu.c | 12 drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..9f9db46 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -1005,4 +1015,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); } diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 6c9e1fd..9a8f20d 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -147,6 +147,8 @@ struct msm_gpu { struct msm_gpu_state *crashstate; /* True if the hardware supports expanded apriv (a650 and newer) */ bool hw_apriv; + + struct thermal_cooling_device *cooling; }; static inline struct msm_gpu *dev_to_gpu(struct device *dev) -- 2.7.4
[PATCH v2 3/3] dt-bindings: drm/msm/gpu: Add cooling device support
Add cooling device support to gpu. A cooling device is bound to a thermal zone to allow thermal mitigation. Signed-off-by: Akhil P Oommen --- Documentation/devicetree/bindings/display/msm/gpu.txt | 7 +++ 1 file changed, 7 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index 1af0ff1..a496381 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -39,6 +39,10 @@ Required properties: a4xx Snapdragon SoCs. See Documentation/devicetree/bindings/sram/qcom,ocmem.yaml. +Optional properties: +- #cooling-cells: The value must be 2. Please refer + Documentation/devicetree/bindings/thermal/thermal.txt for detail. + Example 3xx/4xx: / { @@ -61,6 +65,7 @@ Example 3xx/4xx: power-domains = < OXILICX_GDSC>; operating-points-v2 = <_opp_table>; iommus = <_iommu 0>; + #cooling-cells = <2>; }; gpu_sram: ocmem@fdd0 { @@ -98,6 +103,8 @@ Example a6xx (with GMU): reg = <0x500 0x4>, <0x509e000 0x10>; reg-names = "kgsl_3d0_reg_memory", "cx_mem"; + #cooling-cells = <2>; + /* * Look ma, no clocks! The GPU clocks and power are * controlled entirely by the GMU -- 2.7.4
Re: [PATCH 1/2] arm64: dts: qcom: sc7180: Add gpu cooling support
On 10/16/2020 3:49 AM, Matthias Kaehlcke wrote: Hi, On Thu, Oct 15, 2020 at 12:07:01AM +0530, man...@codeaurora.org wrote: On 2020-10-14 18:59, Akhil P Oommen wrote: On 10/9/2020 10:27 PM, Matthias Kaehlcke wrote: On Fri, Oct 09, 2020 at 08:05:10AM -0700, Doug Anderson wrote: Hi, On Thu, Oct 8, 2020 at 10:10 AM Akhil P Oommen wrote: Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 29 ++--- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..40d6a28 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1885,6 +1885,7 @@ iommus = <_smmu 0>; operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; Presumably we should add this to the devicetree bindings, too? Yes, thanks for catching this. Will update in the next patch. interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3826,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; Why did you make this change? I'm pretty sure that we _don't_ want this since we're using interrupts for the thermal sensor. See commit 22337b91022d ("arm64: dts: qcom: sc7180: Changed polling mode in Thermal-zones node"). I was going to ask the same, this shouldn't be needed. As per our understanding unlike "polling-delay", this delay property is intended to activate polling thread on post trip threshold violation and it is irrespective of sensor is capable for trip interrupt or not. This polling is more of governor related. Below are the few references from Documentation/code which tells polling-delay-passive is needed for IPA for better IPA performance. As per Power allocator documentations 1. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/driver-api/thermal/power_allocator.rst?h=v5.4.71#n264 "The power allocator governor's PID controller works best if there is a periodic tick. If you have a driver that calls `thermal_zone_device_update()` (or anything that ends up calling the governor's `throttle()` function) repetitively, the governor response won't be very good. Note that this is not particular to this governor, step-wise will also misbehave if you call its throttle() faster than the normal thermal framework tick (due to interrupts for example) as it will overreact" 2. In Power allocator code, when switch_on/control trip temp violation, it is enabling passive counter to activate passive polling @ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/power_allocator.c?h=v5.4.71#n634 3. while calculating derivative term, it is using passive_delay @ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/power_allocator.c?h=v5.4.71#n243 4. Sensor interrupt will work if temperature is fluctuating between trip_temp and hysteresis. But say a case where we are not enabling polling-delay-passive. In this case if current temperature > control_temp trip(2nd passive trip) and temperature trend is still raising, then sensor high trip will be disabled (OR configured for critical trip threshold). No more trip interrupt from sensor until it reaches critical trip or falls below control_temp hysteresis. How the governor re-evaluate its next mitigation without passive polling thread here ? I think the same is required for CPU thermal zone as well. Thanks for the explication and pointers! I ran some tests to re-confirm. For that I lowered the trip point temperatures of CPU6 to 60/70, to make it easier to trigger throttling without necessarily affecting the other CPUs. Further I enabled tracing for the events 'thermal_temperature', 'thermal_zone_trip' and 'thermal_power_allocator'. With that I ran a CPU intensive task on CPU6. Without polling-delay the trace log looks like this: irq/40-c263000.-157 [000] 48.035986: thermal_temperature: thermal_zone=cpu6-thermal id=6 temp_prev=57800 temp=6 irq/40-c263000.-157 [000] 48.036029: thermal_power_allocator_pid: thermal_zone_id=6 err=1 err_integral=0 p=2402 i=0 d=0 output=1776 irq/40-c263000.-157 [000] 48.036036: thermal_power_alloca
Re: [PATCH 1/2] arm64: dts: qcom: sc7180: Add gpu cooling support
On 10/9/2020 10:27 PM, Matthias Kaehlcke wrote: On Fri, Oct 09, 2020 at 08:05:10AM -0700, Doug Anderson wrote: Hi, On Thu, Oct 8, 2020 at 10:10 AM Akhil P Oommen wrote: Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 29 ++--- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..40d6a28 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1885,6 +1885,7 @@ iommus = <_smmu 0>; operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; Presumably we should add this to the devicetree bindings, too? Yes, thanks for catching this. Will update in the next patch. interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3826,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; Why did you make this change? I'm pretty sure that we _don't_ want this since we're using interrupts for the thermal sensor. See commit 22337b91022d ("arm64: dts: qcom: sc7180: Changed polling mode in Thermal-zones node"). I was going to ask the same, this shouldn't be needed. polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; Matthias probably knows better, but I wonder if we should be making two passive trip levels like we do with CPU. IIRC this is important if someone wants to be able to use this with IPA. Yes, please introduce a second trip point and make both of them 'passive'. ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel Adding Manaf here. -Akhil.
Re: [2/2] drm/msm: Add support for GPU cooling
On 10/13/2020 11:10 PM, m...@chromium.org wrote: On Tue, Oct 13, 2020 at 07:23:34PM +0530, Akhil P Oommen wrote: On 10/12/2020 11:10 PM, m...@chromium.org wrote: On Mon, Oct 12, 2020 at 07:03:51PM +0530, Akhil P Oommen wrote: On 10/10/2020 12:06 AM, m...@chromium.org wrote: Hi Akhil, On Thu, Oct 08, 2020 at 10:39:07PM +0530, Akhil P Oommen wrote: Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gpu.c | 13 - drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..93ffd66 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -926,7 +936,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, msm_devfreq_init(gpu); - Will remove this unintended change. gpu->aspace = gpu->funcs->create_address_space(gpu, pdev); if (gpu->aspace == NULL) @@ -1005,4 +1014,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); Resources should be released in reverse order, otherwise the cooling device could use resources that have already been freed. Why do you think this is not the correct order? If you are thinking about devfreq struct, it is managed device resource. I did not check specifically if changing the frequency really uses any of the resources that are released previously, In any case it's not a good idea to allow other parts of the kernel to use a half initialized/torn down device. Even if it isn't a problem today someone could change the driver to use any of these resources (or add a new one) in a frequency change, without even thinking about the cooling device, just (rightfully) asuming that things are set up and torn down in a sane order. 'sane order' relative to what specifically here? Should we worry about freq change at this point because we have already disabled gpu runtime pm and devfreq? GPU runtime PM and the devfreq being disabled is not evident from the context of the function. You are probably right that it's not a problem in practice, but why give reason for doubts in the first place if this could be avoided by following a common practice? ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel Other option I see is to create a managed device resource (devm) version of the devfreq_cooling_register API and use that. Is that what you are trying to suggest? -Akhil.
Re: [2/2] drm/msm: Add support for GPU cooling
On 10/12/2020 11:10 PM, m...@chromium.org wrote: On Mon, Oct 12, 2020 at 07:03:51PM +0530, Akhil P Oommen wrote: On 10/10/2020 12:06 AM, m...@chromium.org wrote: Hi Akhil, On Thu, Oct 08, 2020 at 10:39:07PM +0530, Akhil P Oommen wrote: Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gpu.c | 13 - drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..93ffd66 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -926,7 +936,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, msm_devfreq_init(gpu); - Will remove this unintended change. gpu->aspace = gpu->funcs->create_address_space(gpu, pdev); if (gpu->aspace == NULL) @@ -1005,4 +1014,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); Resources should be released in reverse order, otherwise the cooling device could use resources that have already been freed. Why do you think this is not the correct order? If you are thinking about devfreq struct, it is managed device resource. I did not check specifically if changing the frequency really uses any of the resources that are released previously, In any case it's not a good idea to allow other parts of the kernel to use a half initialized/torn down device. Even if it isn't a problem today someone could change the driver to use any of these resources (or add a new one) in a frequency change, without even thinking about the cooling device, just (rightfully) asuming that things are set up and torn down in a sane order. 'sane order' relative to what specifically here? Should we worry about freq change at this point because we have already disabled gpu runtime pm and devfreq? -Akhil. ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel -Akhil.
Re: [2/2] drm/msm: Add support for GPU cooling
On 10/10/2020 12:06 AM, m...@chromium.org wrote: Hi Akhil, On Thu, Oct 08, 2020 at 10:39:07PM +0530, Akhil P Oommen wrote: Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gpu.c | 13 - drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..93ffd66 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -926,7 +936,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, msm_devfreq_init(gpu); - gpu->aspace = gpu->funcs->create_address_space(gpu, pdev); if (gpu->aspace == NULL) @@ -1005,4 +1014,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); Resources should be released in reverse order, otherwise the cooling device could use resources that have already been freed. Why do you think this is not the correct order? If you are thinking about devfreq struct, it is managed device resource. -Akhil
[PATCH 1/2] arm64: dts: qcom: sc7180: Add gpu cooling support
Add cooling-cells property and the cooling maps for the gpu tzones to support GPU cooling. Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 29 ++--- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index d46b383..40d6a28 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -2,7 +2,7 @@ /* * SC7180 SoC device tree source * - * Copyright (c) 2019, The Linux Foundation. All rights reserved. + * Copyright (c) 2019-20, The Linux Foundation. All rights reserved. */ #include @@ -1885,6 +1885,7 @@ iommus = <_smmu 0>; operating-points-v2 = <_opp_table>; qcom,gmu = <>; + #cooling-cells = <2>; interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; interconnect-names = "gfx-mem"; @@ -3825,16 +3826,16 @@ }; gpuss0-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 13>; trips { gpuss0_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss0_crit: gpuss0_crit { @@ -3843,19 +3844,26 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; gpuss1-thermal { - polling-delay-passive = <0>; + polling-delay-passive = <100>; polling-delay = <0>; thermal-sensors = < 14>; trips { gpuss1_alert0: trip-point0 { - temperature = <9>; + temperature = <95000>; hysteresis = <2000>; - type = "hot"; + type = "passive"; }; gpuss1_crit: gpuss1_crit { @@ -3864,6 +3872,13 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <_alert0>; + cooling-device = < THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; aoss1-thermal { -- 2.7.4
[PATCH 2/2] drm/msm: Add support for GPU cooling
Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gpu.c | 13 - drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..93ffd66 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -926,7 +936,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, msm_devfreq_init(gpu); - gpu->aspace = gpu->funcs->create_address_space(gpu, pdev); if (gpu->aspace == NULL) @@ -1005,4 +1014,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); } diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h index 6c9e1fd..9a8f20d 100644 --- a/drivers/gpu/drm/msm/msm_gpu.h +++ b/drivers/gpu/drm/msm/msm_gpu.h @@ -147,6 +147,8 @@ struct msm_gpu { struct msm_gpu_state *crashstate; /* True if the hardware supports expanded apriv (a650 and newer) */ bool hw_apriv; + + struct thermal_cooling_device *cooling; }; static inline struct msm_gpu *dev_to_gpu(struct device *dev) -- 2.7.4
Re: [PATCH v2 1/2] drm/msm: Fix premature purging of BO
On 9/23/2020 8:20 PM, Jordan Crouse wrote: On Tue, Sep 22, 2020 at 08:25:26PM +0530, Akhil P Oommen wrote: In the case where we have a back-to-back submission that shares the same BO, this BO will be prematurely moved to inactive_list while retiring the first submit. But it will be still part of the second submit which is being processed by the GPU. Now, if the shrinker happens to be triggered at this point, it will result in a premature purging of this BO. To fix this, we need to refcount BO while doing submit and retire. Then, it should be moved to inactive list when this refcount becomes 0. Signed-off-by: Akhil P Oommen --- Changes in v2: 1. Keep Active List around 2. Put back the deleted WARN_ON drivers/gpu/drm/msm/msm_drv.h | 5 ++--- drivers/gpu/drm/msm/msm_gem.c | 32 drivers/gpu/drm/msm/msm_gem.h | 4 +++- drivers/gpu/drm/msm/msm_gpu.c | 11 +++ 4 files changed, 28 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index 3193274..28e3c8d 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -309,9 +309,8 @@ void msm_gem_put_vaddr(struct drm_gem_object *obj); int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv); int msm_gem_sync_object(struct drm_gem_object *obj, struct msm_fence_context *fctx, bool exclusive); -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence); -void msm_gem_move_to_inactive(struct drm_gem_object *obj); +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu); +void msm_gem_active_put(struct drm_gem_object *obj); int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout); int msm_gem_cpu_fini(struct drm_gem_object *obj); void msm_gem_free_object(struct drm_gem_object *obj); diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 76a6c52..14e14ca 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -743,31 +743,31 @@ int msm_gem_sync_object(struct drm_gem_object *obj, return 0; } -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence) +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu) { struct msm_gem_object *msm_obj = to_msm_bo(obj); + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED); - msm_obj->gpu = gpu; - if (exclusive) - dma_resv_add_excl_fence(obj->resv, fence); - else - dma_resv_add_shared_fence(obj->resv, fence); - list_del_init(_obj->mm_list); - list_add_tail(_obj->mm_list, >active_list); + + if (!atomic_fetch_inc(_obj->active_count)) { + msm_obj->gpu = gpu; + list_del_init(_obj->mm_list); + list_add_tail(_obj->mm_list, >active_list); + } I'm not sure if all the renaming and reorganization are really needed here - this is the meat of the change and it would have fit in reasonably well with the existing function design. This happened due to the way I implemented the v1 patch. In the hindsight, I think you are right. Akhil. } -void msm_gem_move_to_inactive(struct drm_gem_object *obj) +void msm_gem_active_put(struct drm_gem_object *obj) { - struct drm_device *dev = obj->dev; - struct msm_drm_private *priv = dev->dev_private; struct msm_gem_object *msm_obj = to_msm_bo(obj); + struct msm_drm_private *priv = obj->dev->dev_private; - WARN_ON(!mutex_is_locked(>struct_mutex)); + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); - msm_obj->gpu = NULL; - list_del_init(_obj->mm_list); - list_add_tail(_obj->mm_list, >inactive_list); + if (!atomic_dec_return(_obj->active_count)) { + msm_obj->gpu = NULL; + list_del_init(_obj->mm_list); + list_add_tail(_obj->mm_list, >inactive_list); + } Same. Jordan } int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout) diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 7b1c7a5..a1bf741 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -88,12 +88,14 @@ struct msm_gem_object { struct mutex lock; /* Protects resources associated with bo */ char name[32]; /* Identifier to print for the debugfs files */ + + atomic_t active_count; }; #define to_msm_bo(x) container_of(x, struct msm_gem_object, base) static inline bool is_active(struct msm_gem_object *msm_obj) { - return msm_obj->gpu != NULL; + return atomic_read(_obj->active_count); } static inline
[PATCH v2 2/2] drm/msm: Leave inuse count intact on map failure
Leave the inuse count intact on map failure to keep the accounting accurate. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gem_vma.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index 80a8a26..f914ddb 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -88,8 +88,10 @@ msm_gem_map_vma(struct msm_gem_address_space *aspace, ret = aspace->mmu->funcs->map(aspace->mmu, vma->iova, sgt, size, prot); - if (ret) + if (ret) { vma->mapped = false; + vma->inuse--; + } return ret; } -- 2.7.4
[PATCH v2 1/2] drm/msm: Fix premature purging of BO
In the case where we have a back-to-back submission that shares the same BO, this BO will be prematurely moved to inactive_list while retiring the first submit. But it will be still part of the second submit which is being processed by the GPU. Now, if the shrinker happens to be triggered at this point, it will result in a premature purging of this BO. To fix this, we need to refcount BO while doing submit and retire. Then, it should be moved to inactive list when this refcount becomes 0. Signed-off-by: Akhil P Oommen --- Changes in v2: 1. Keep Active List around 2. Put back the deleted WARN_ON drivers/gpu/drm/msm/msm_drv.h | 5 ++--- drivers/gpu/drm/msm/msm_gem.c | 32 drivers/gpu/drm/msm/msm_gem.h | 4 +++- drivers/gpu/drm/msm/msm_gpu.c | 11 +++ 4 files changed, 28 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index 3193274..28e3c8d 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -309,9 +309,8 @@ void msm_gem_put_vaddr(struct drm_gem_object *obj); int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv); int msm_gem_sync_object(struct drm_gem_object *obj, struct msm_fence_context *fctx, bool exclusive); -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence); -void msm_gem_move_to_inactive(struct drm_gem_object *obj); +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu); +void msm_gem_active_put(struct drm_gem_object *obj); int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout); int msm_gem_cpu_fini(struct drm_gem_object *obj); void msm_gem_free_object(struct drm_gem_object *obj); diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 76a6c52..14e14ca 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -743,31 +743,31 @@ int msm_gem_sync_object(struct drm_gem_object *obj, return 0; } -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence) +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu) { struct msm_gem_object *msm_obj = to_msm_bo(obj); + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED); - msm_obj->gpu = gpu; - if (exclusive) - dma_resv_add_excl_fence(obj->resv, fence); - else - dma_resv_add_shared_fence(obj->resv, fence); - list_del_init(_obj->mm_list); - list_add_tail(_obj->mm_list, >active_list); + + if (!atomic_fetch_inc(_obj->active_count)) { + msm_obj->gpu = gpu; + list_del_init(_obj->mm_list); + list_add_tail(_obj->mm_list, >active_list); + } } -void msm_gem_move_to_inactive(struct drm_gem_object *obj) +void msm_gem_active_put(struct drm_gem_object *obj) { - struct drm_device *dev = obj->dev; - struct msm_drm_private *priv = dev->dev_private; struct msm_gem_object *msm_obj = to_msm_bo(obj); + struct msm_drm_private *priv = obj->dev->dev_private; - WARN_ON(!mutex_is_locked(>struct_mutex)); + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); - msm_obj->gpu = NULL; - list_del_init(_obj->mm_list); - list_add_tail(_obj->mm_list, >inactive_list); + if (!atomic_dec_return(_obj->active_count)) { + msm_obj->gpu = NULL; + list_del_init(_obj->mm_list); + list_add_tail(_obj->mm_list, >inactive_list); + } } int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout) diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 7b1c7a5..a1bf741 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -88,12 +88,14 @@ struct msm_gem_object { struct mutex lock; /* Protects resources associated with bo */ char name[32]; /* Identifier to print for the debugfs files */ + + atomic_t active_count; }; #define to_msm_bo(x) container_of(x, struct msm_gem_object, base) static inline bool is_active(struct msm_gem_object *msm_obj) { - return msm_obj->gpu != NULL; + return atomic_read(_obj->active_count); } static inline bool is_purgeable(struct msm_gem_object *msm_obj) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 29c8d73c..55d1648 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -698,8 +698,8 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, for (i = 0; i < submit->nr_bos; i++) { struct msm_gem_object *msm_obj = submit-&g
[PATCH 1/2] drm/msm: Replace active_list with refcount
In the case where we have a back-to-back submission that shares the same BO, this BO will be prematurely moved to inactive_list while retiring the first submit. But it will be still part of the second submit which is being processed by the GPU. Now, if the shrinker happens to be triggered at this point, it will result in premature purging of this BO. To fix this, we can replace the active_list with reference counting and move the BO to inactive list only when this count becomes zero. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_drv.h | 5 ++--- drivers/gpu/drm/msm/msm_gem.c | 30 -- drivers/gpu/drm/msm/msm_gem.h | 4 +++- drivers/gpu/drm/msm/msm_gpu.c | 11 +++ 4 files changed, 28 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index 3193274..28e3c8d 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -309,9 +309,8 @@ void msm_gem_put_vaddr(struct drm_gem_object *obj); int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv); int msm_gem_sync_object(struct drm_gem_object *obj, struct msm_fence_context *fctx, bool exclusive); -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence); -void msm_gem_move_to_inactive(struct drm_gem_object *obj); +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu); +void msm_gem_active_put(struct drm_gem_object *obj); int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout); int msm_gem_cpu_fini(struct drm_gem_object *obj); void msm_gem_free_object(struct drm_gem_object *obj); diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 76a6c52..accc106 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -743,33 +743,36 @@ int msm_gem_sync_object(struct drm_gem_object *obj, return 0; } -void msm_gem_move_to_active(struct drm_gem_object *obj, - struct msm_gpu *gpu, bool exclusive, struct dma_fence *fence) +void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu) { struct msm_gem_object *msm_obj = to_msm_bo(obj); + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED); + msm_obj->gpu = gpu; - if (exclusive) - dma_resv_add_excl_fence(obj->resv, fence); - else - dma_resv_add_shared_fence(obj->resv, fence); list_del_init(_obj->mm_list); - list_add_tail(_obj->mm_list, >active_list); + atomic_inc(_obj->active_count); } -void msm_gem_move_to_inactive(struct drm_gem_object *obj) +static void move_to_inactive(struct msm_gem_object *msm_obj) { - struct drm_device *dev = obj->dev; + struct drm_device *dev = msm_obj->base.dev; struct msm_drm_private *priv = dev->dev_private; - struct msm_gem_object *msm_obj = to_msm_bo(obj); - - WARN_ON(!mutex_is_locked(>struct_mutex)); msm_obj->gpu = NULL; - list_del_init(_obj->mm_list); list_add_tail(_obj->mm_list, >inactive_list); } +void msm_gem_active_put(struct drm_gem_object *obj) +{ + struct msm_gem_object *msm_obj = to_msm_bo(obj); + + WARN_ON(!mutex_is_locked(>dev->struct_mutex)); + + if (atomic_dec_and_test(_obj->active_count)) + move_to_inactive(msm_obj); +} + int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout) { bool write = !!(op & MSM_PREP_WRITE); @@ -1104,7 +1107,6 @@ static struct drm_gem_object *_msm_gem_new(struct drm_device *dev, } if (struct_mutex_locked) { - WARN_ON(!mutex_is_locked(>struct_mutex)); list_add_tail(_obj->mm_list, >inactive_list); } else { mutex_lock(>struct_mutex); diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 7b1c7a5..a1bf741 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -88,12 +88,14 @@ struct msm_gem_object { struct mutex lock; /* Protects resources associated with bo */ char name[32]; /* Identifier to print for the debugfs files */ + + atomic_t active_count; }; #define to_msm_bo(x) container_of(x, struct msm_gem_object, base) static inline bool is_active(struct msm_gem_object *msm_obj) { - return msm_obj->gpu != NULL; + return atomic_read(_obj->active_count); } static inline bool is_purgeable(struct msm_gem_object *msm_obj) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 29c8d73c..55d1648 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -698,8 +698,8 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring, for (i =
[PATCH 2/2] drm/msm: Leave inuse count intact on map failure
Leave the inuse count intact on map failure to keep the accounting accurate. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gem_vma.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index 80a8a26..8367a1c 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -88,8 +88,10 @@ msm_gem_map_vma(struct msm_gem_address_space *aspace, ret = aspace->mmu->funcs->map(aspace->mmu, vma->iova, sgt, size, prot); - if (ret) + if (ret) { vma->mapped = false; + vma->inuse++; + } return ret; } -- 2.7.4
Re: [PATCH 16/20] drm/msm/a6xx: Add support for per-instance pagetables
Reviewed-by: Akhil P Oommen On 8/18/2020 3:31 AM, Rob Clark wrote: From: Jordan Crouse Add support for using per-instance pagetables if all the dependencies are available. Signed-off-by: Jordan Crouse Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 + drivers/gpu/drm/msm/msm_ringbuffer.h | 1 + 3 files changed, 65 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 5eabb0109577..d7ad6c78d787 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -81,6 +81,49 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter, OUT_RING(ring, upper_32_bits(iova)); } +static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, + struct msm_ringbuffer *ring, struct msm_file_private *ctx) +{ + phys_addr_t ttbr; + u32 asid; + u64 memptr = rbmemptr(ring, ttbr0); + + if (ctx == a6xx_gpu->cur_ctx) + return; + + if (msm_iommu_pagetable_params(ctx->aspace->mmu, , )) + return; + + /* Execute the table update */ + OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr))); + + OUT_RING(ring, + CP_SMMU_TABLE_UPDATE_1_TTBR0_HI(upper_32_bits(ttbr)) | + CP_SMMU_TABLE_UPDATE_1_ASID(asid)); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_2_CONTEXTIDR(0)); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(0)); + + /* +* Write the new TTBR0 to the memstore. This is good for debugging. +*/ + OUT_PKT7(ring, CP_MEM_WRITE, 4); + OUT_RING(ring, CP_MEM_WRITE_0_ADDR_LO(lower_32_bits(memptr))); + OUT_RING(ring, CP_MEM_WRITE_1_ADDR_HI(upper_32_bits(memptr))); + OUT_RING(ring, lower_32_bits(ttbr)); + OUT_RING(ring, (asid << 16) | upper_32_bits(ttbr)); + + /* +* And finally, trigger a uche flush to be sure there isn't anything +* lingering in that part of the GPU +*/ + + OUT_PKT7(ring, CP_EVENT_WRITE, 1); + OUT_RING(ring, 0x31); + + a6xx_gpu->cur_ctx = ctx; +} + static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) { unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; @@ -90,6 +133,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i; + a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx); + get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO, rbmemptr_stats(ring, index, cpcycles_start)); @@ -696,6 +741,8 @@ static int a6xx_hw_init(struct msm_gpu *gpu) /* Always come up on rb 0 */ a6xx_gpu->cur_ring = gpu->rb[0]; + a6xx_gpu->cur_ctx = NULL; + /* Enable the SQE_to start the CP engine */ gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 1); @@ -1008,6 +1055,21 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu) return (unsigned long)busy_time; } +static struct msm_gem_address_space * +a6xx_create_private_address_space(struct msm_gpu *gpu) +{ + struct msm_gem_address_space *aspace = NULL; + struct msm_mmu *mmu; + + mmu = msm_iommu_pagetable_create(gpu->aspace->mmu); + + if (!IS_ERR(mmu)) + aspace = msm_gem_address_space_create(mmu, + "gpu", 0x1ULL, 0x1ULL); + + return aspace; +} + static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, @@ -1031,6 +1093,7 @@ static const struct adreno_gpu_funcs funcs = { .gpu_state_put = a6xx_gpu_state_put, #endif .create_address_space = adreno_iommu_create_address_space, + .create_private_address_space = a6xx_create_private_address_space, }, .get_timestamp = a6xx_get_timestamp, }; diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 03ba60d5b07f..da22d7549d9b 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -19,6 +19,7 @@ struct a6xx_gpu { uint64_t sqe_iova; struct msm_ringbuffer *cur_ring; + struct msm_file_private *cur_ctx; struct a6xx_gmu gmu; }; diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h index 7764373d0ed2..0987d6bf848c 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.h +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h @@ -31,6 +31,7 @@ struct msm_rbmemptrs { volatile uint32_t fence; volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT]; + volatile u64 ttbr0; }; struct msm_ringbuffer {
Re: [PATCH 16/19] drm/msm/a6xx: Add support for per-instance pagetables
On 8/14/2020 8:11 AM, Rob Clark wrote: From: Jordan Crouse Add support for using per-instance pagetables if all the dependencies are available. Signed-off-by: Jordan Crouse Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 70 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 + drivers/gpu/drm/msm/msm_ringbuffer.h | 1 + 3 files changed, 72 insertions(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 5eabb0109577..9653ac9b3cb8 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -81,6 +81,56 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter, OUT_RING(ring, upper_32_bits(iova)); } +static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu, + struct msm_ringbuffer *ring, struct msm_file_private *ctx) +{ + phys_addr_t ttbr; + u32 asid; + u64 memptr = rbmemptr(ring, ttbr0); + + if (ctx == a6xx_gpu->cur_ctx) + return; + + if (msm_iommu_pagetable_params(ctx->aspace->mmu, , )) + return; + + /* Execute the table update */ + OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr))); + + /* +* For now ignore the asid since the smmu driver uses a TLBIASID to +* flush the TLB when we use iommu_flush_tlb_all() and the smmu driver +* isn't aware that the asid changed. Instead, keep the default asid +* (0, same as the context bank) to make sure the TLB is properly +* flushed. +*/ + OUT_RING(ring, + CP_SMMU_TABLE_UPDATE_1_TTBR0_HI(upper_32_bits(ttbr)) | + CP_SMMU_TABLE_UPDATE_1_ASID(0)); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_2_CONTEXTIDR(0)); + OUT_RING(ring, CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(0)); + + /* +* Write the new TTBR0 to the memstore. This is good for debugging. +*/ + OUT_PKT7(ring, CP_MEM_WRITE, 4); + OUT_RING(ring, CP_MEM_WRITE_0_ADDR_LO(lower_32_bits(memptr))); + OUT_RING(ring, CP_MEM_WRITE_1_ADDR_HI(upper_32_bits(memptr))); + OUT_RING(ring, lower_32_bits(ttbr)); + OUT_RING(ring, (0 << 16) | upper_32_bits(ttbr)); why (0 << 16) is required here? + + /* +* And finally, trigger a uche flush to be sure there isn't anything +* lingering in that part of the GPU +*/ + + OUT_PKT7(ring, CP_EVENT_WRITE, 1); + OUT_RING(ring, 0x31); This may be unnecessary, but no harm in keeping it. SMMU_TABLE_UPDATE is supposed to do a UCHE flush. -Akhil + + a6xx_gpu->cur_ctx = ctx; +} + static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) { unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT; @@ -90,6 +140,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i; + a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx); + get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO, rbmemptr_stats(ring, index, cpcycles_start)); @@ -696,6 +748,8 @@ static int a6xx_hw_init(struct msm_gpu *gpu) /* Always come up on rb 0 */ a6xx_gpu->cur_ring = gpu->rb[0]; + a6xx_gpu->cur_ctx = NULL; + /* Enable the SQE_to start the CP engine */ gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 1); @@ -1008,6 +1062,21 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu) return (unsigned long)busy_time; } +static struct msm_gem_address_space * +a6xx_create_private_address_space(struct msm_gpu *gpu) +{ + struct msm_gem_address_space *aspace = NULL; + struct msm_mmu *mmu; + + mmu = msm_iommu_pagetable_create(gpu->aspace->mmu); + + if (!IS_ERR(mmu)) + aspace = msm_gem_address_space_create(mmu, + "gpu", 0x1ULL, 0x1ULL); + + return aspace; +} + static const struct adreno_gpu_funcs funcs = { .base = { .get_param = adreno_get_param, @@ -1031,6 +1100,7 @@ static const struct adreno_gpu_funcs funcs = { .gpu_state_put = a6xx_gpu_state_put, #endif .create_address_space = adreno_iommu_create_address_space, + .create_private_address_space = a6xx_create_private_address_space, }, .get_timestamp = a6xx_get_timestamp, }; diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 03ba60d5b07f..da22d7549d9b 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -19,6 +19,7 @@ struct a6xx_gpu { uint64_t sqe_iova; struct msm_ringbuffer *cur_ring; + struct msm_file_private *cur_ctx; struct a6xx_gmu gmu; }; diff --git
Re: [RESEND PATCH] drm/msm/a6xx: fix frequency not always being restored on GMU resume
Why don't we move the early return in a6xx_gmu_set_freq() to msm_devfreq_target() instead? -Akhil. On 8/14/2020 12:24 AM, Jonathan Marek wrote: The patch reorganizing the set_freq function made it so the gmu resume doesn't always set the frequency, because a6xx_gmu_set_freq() exits early when the frequency hasn't been changed. Note this always happens when resuming GMU after recovering from a hang. Use a simple workaround to prevent this from happening. Fixes: 1f60d11423db ("drm: msm: a6xx: send opp instead of a frequency") Signed-off-by: Jonathan Marek --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index b67b38c8fadf..bbbd00020f92 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -845,6 +845,7 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) if (IS_ERR_OR_NULL(gpu_opp)) return; + gmu->freq = 0; /* so a6xx_gmu_set_freq() doesn't exit early */ a6xx_gmu_set_freq(gpu, gpu_opp); dev_pm_opp_put(gpu_opp); }
[PATCH v2] drm: msm: a6xx: fix gpu failure after system resume
On targets where GMU is available, GMU takes over the ownership of GX GDSC during its initialization. So, move the refcount-get on GX PD before we initialize the GMU. This ensures that nobody can collapse the GX GDSC once GMU owns the GX GDSC. This patch fixes some GMU OOB errors seen during GPU wake up during a system resume. Signed-off-by: Akhil P Oommen Reported-by: Matthias Kaehlcke Tested-by: Matthias Kaehlcke --- Changes from v1: - Reworded the commit text - Added Reported-by & Tested-by tags drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 21e77d6..1d33020 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -854,10 +854,19 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) /* Turn on the resources */ pm_runtime_get_sync(gmu->dev); + /* +* "enable" the GX power domain which won't actually do anything but it +* will make sure that the refcounting is correct in case we need to +* bring down the GX after a GMU failure +*/ + if (!IS_ERR_OR_NULL(gmu->gxpd)) + pm_runtime_get_sync(gmu->gxpd); + /* Use a known rate to bring up the GMU */ clk_set_rate(gmu->core_clk, 2); ret = clk_bulk_prepare_enable(gmu->nr_clocks, gmu->clocks); if (ret) { + pm_runtime_put(gmu->gxpd); pm_runtime_put(gmu->dev); return ret; } @@ -903,19 +912,12 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) else a6xx_hfi_set_freq(gmu, gmu->current_perf_index); - /* -* "enable" the GX power domain which won't actually do anything but it -* will make sure that the refcounting is correct in case we need to -* bring down the GX after a GMU failure -*/ - if (!IS_ERR_OR_NULL(gmu->gxpd)) - pm_runtime_get(gmu->gxpd); - out: /* On failure, shut down the GMU to leave it in a good state */ if (ret) { disable_irq(gmu->gmu_irq); a6xx_rpmh_stop(gmu); + pm_runtime_put(gmu->gxpd); pm_runtime_put(gmu->dev); } -- 2.7.4
Re: [PATCH] drm: msm: a6xx: fix gpu failure after system resume
On 7/15/2020 12:12 AM, Rob Clark wrote: On Tue, Jul 14, 2020 at 10:10 AM Matthias Kaehlcke wrote: On Tue, Jul 14, 2020 at 06:55:30PM +0530, Akhil P Oommen wrote: On targets where GMU is available, GMU takes over the ownership of GX GDSC during its initialization. So, take a refcount on the GX PD on behalf of GMU before we initialize it. This makes sure that nobody can collapse the GX GDSC once GMU owns the GX GDSC. This patch fixes some weird failures during GPU wake up during system resume. Signed-off-by: Akhil P Oommen I went through a few dozen suspend/resume cycles on SC7180 and didn't run into the kernel panic that typically occurs after a few iterations without this patch. Reported-by: Matthias Kaehlcke Tested-by: Matthias Kaehlcke On which tree is this patch based on? I had to apply it manually because 'git am' is unhappy when I try to apply it: error: sha1 information is lacking or useless (drivers/gpu/drm/msm/adreno/a6xx_gmu.c). error: could not build fake ancestor Both upstream and drm-msm are in my remotes and synced, so I suspect it's some private tree. Please make sure to base patches on the corresponding maintainer tree or upstream, whichs makes life easier for maintainers, testers and reviewers. I've run into the same issue frequently :-( BR, -R Sorry, I was using msm-next brand as the base, but had the opp-next branch merged too inadvertently. -Akhil
[PATCH v6 6/6] arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp
From: Sharat Masetty Add opp-peak-kBps bindings to the GPU opp table, listing the peak GPU -> DDR bandwidth requirement for each opp level. This will be used to scale the DDR bandwidth along with the GPU frequency dynamically. Signed-off-by: Sharat Masetty Reviewed-by: Matthias Kaehlcke Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 80fe54b..ff4ddf1 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1479,36 +1479,43 @@ opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; + opp-peak-kBps = <8532000>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; + opp-peak-kBps = <1804000>; }; }; }; -- 2.7.4
[PATCH v6 2/6] drm: msm: a6xx: send opp instead of a frequency
From: Sharat Masetty This patch changes the plumbing to send the devfreq recommended opp rather than the frequency. Also consolidate and rearrange the code in a6xx to set the GPU frequency and the icc vote in preparation for the upcoming changes for GPU->DDR scaling votes. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 89 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 4 files changed, 52 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 21e77d6..856db46 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -103,17 +103,45 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF)); } -static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) +void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) { - struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu); - struct adreno_gpu *adreno_gpu = _gpu->base; - struct msm_gpu *gpu = _gpu->base; - int ret; + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + struct a6xx_gmu *gmu = _gpu->gmu; + u32 perf_index; + unsigned long gpu_freq; + int ret = 0; + + gpu_freq = dev_pm_opp_get_freq(opp); + + if (gpu_freq == gmu->freq) + return; + + for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) + if (gpu_freq == gmu->gpu_freqs[perf_index]) + break; + + gmu->current_perf_index = perf_index; + gmu->freq = gmu->gpu_freqs[perf_index]; + + /* +* This can get called from devfreq while the hardware is idle. Don't +* bring up the power if it isn't already active +*/ + if (pm_runtime_get_if_in_use(gmu->dev) == 0) + return; + + if (!gmu->legacy) { + a6xx_hfi_set_freq(gmu, perf_index); + icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + pm_runtime_put(gmu->dev); + return; + } gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0); gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING, - ((3 & 0xf) << 28) | index); + ((3 & 0xf) << 28) | perf_index); /* * Send an invalid index as a vote for the bus bandwidth and let the @@ -134,37 +162,6 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) * for now leave it at max so that the performance is nominal. */ icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); -} - -void a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq) -{ - struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); - struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); - struct a6xx_gmu *gmu = _gpu->gmu; - u32 perf_index = 0; - - if (freq == gmu->freq) - return; - - for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) - if (freq == gmu->gpu_freqs[perf_index]) - break; - - gmu->current_perf_index = perf_index; - gmu->freq = gmu->gpu_freqs[perf_index]; - - /* -* This can get called from devfreq while the hardware is idle. Don't -* bring up the power if it isn't already active -*/ - if (pm_runtime_get_if_in_use(gmu->dev) == 0) - return; - - if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, perf_index); - else - a6xx_hfi_set_freq(gmu, perf_index); - pm_runtime_put(gmu->dev); } @@ -839,6 +836,19 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu) a6xx_gmu_rpmh_off(gmu); } +static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + a6xx_gmu_set_freq(gpu, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -898,10 +908,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) enable_irq(gmu->hfi_irq); /* Set the GPU to the current freq */ - if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, gmu->current_perf_index); - else - a6xx_hfi_set_freq(gmu, gmu->current_perf_index); +
[PATCH v6 3/6] drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR
From: Sharat Masetty This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 856db46..a6f43ff 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -133,7 +133,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (!gmu->legacy) { a6xx_hfi_set_freq(gmu, perf_index); - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); pm_runtime_put(gmu->dev); return; } @@ -157,11 +157,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (ret) dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret); - /* -* Eventually we will want to scale the path vote with the frequency but -* for now leave it at max so that the performance is nominal. -*/ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); pm_runtime_put(gmu->dev); } @@ -849,6 +845,19 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) dev_pm_opp_put(gpu_opp); } +static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + dev_pm_opp_set_bw(>pdev->dev, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -873,7 +882,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) } /* Set the bus quota to a reasonable value for boot */ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(3072)); + a6xx_gmu_set_initial_bw(gpu, gmu); /* Enable the GMU interrupt */ gmu_write(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_CLR, ~0); @@ -1049,7 +1058,7 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu) a6xx_gmu_shutdown(gmu); /* Remove the bus vote */ - icc_set_bw(gpu->icc_path, 0, 0); + dev_pm_opp_set_bw(>pdev->dev, NULL); /* * Make sure the GX domain is off before turning off the GMU (CX) -- 2.7.4
[PATCH v6 5/6] arm64: dts: qcom: sc7180: Add interconnects property for GPU
From: Sharat Masetty This patch adds the interconnects property to the GPU node. This enables the GPU->DDR path bandwidth voting. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 31b9217..80fe54b 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1470,6 +1470,9 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; + interconnect-names = "gfx-mem"; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; -- 2.7.4
[PATCH v6 0/6] Add support for GPU DDR BW scaling
This series add support for GPU DDR bandwidth scaling and is based on the bindings from Georgi [1]. This is mostly a rebase of Sharat's patches [2] on the tip of msm-next branch. [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/vireshk/pm/+log/opp/linux-next/ [2] https://patchwork.freedesktop.org/series/75291/ Changes from v5: - Added "interconnect-names" property Changes from v4: - Squashed a patch to another one to fix Jonathan's comment - Add back the pm_runtime_get_if_in_use() check Changes from v3: - Rebased on top of Jonathan's patch which adds support for changing gpu freq through hfi on newer targets - As suggested by Rob, left the icc_path intact for pre-a6xx GPUs Sharat Masetty (6): dt-bindings: drm/msm/gpu: Document gpu opp table drm: msm: a6xx: send opp instead of a frequency drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling arm64: dts: qcom: sc7180: Add interconnects property for GPU arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp .../devicetree/bindings/display/msm/gpu.txt| 28 ++ arch/arm64/boot/dts/qcom/sc7180.dtsi | 10 ++ arch/arm64/boot/dts/qcom/sdm845.dtsi | 10 ++ drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 108 - drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 7 files changed, 114 insertions(+), 50 deletions(-) -- 2.7.4
[PATCH v6 4/6] arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling
From: Sharat Masetty This patch adds the interconnects property for the gpu node and the opp-peak-kBps property to the opps of the gpu opp table. This should help enable DDR bandwidth scaling dynamically and proportionally to the GPU frequency. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index 8eb5a31..1cd2dae 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -3515,42 +3515,52 @@ qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _noc SLAVE_EBI1>; + interconnect-names = "gfx-mem"; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; opp-71000 { opp-hz = /bits/ 64 <71000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-67500 { opp-hz = /bits/ 64 <67500>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-59600 { opp-hz = /bits/ 64 <59600>; opp-level = ; + opp-peak-kBps = <622>; }; opp-52000 { opp-hz = /bits/ 64 <52000>; opp-level = ; + opp-peak-kBps = <622>; }; opp-41400 { opp-hz = /bits/ 64 <41400>; opp-level = ; + opp-peak-kBps = <4068000>; }; opp-34200 { opp-hz = /bits/ 64 <34200>; opp-level = ; + opp-peak-kBps = <2724000>; }; opp-25700 { opp-hz = /bits/ 64 <25700>; opp-level = ; + opp-peak-kBps = <1648000>; }; }; }; -- 2.7.4
[PATCH v6 1/6] dt-bindings: drm/msm/gpu: Document gpu opp table
From: Sharat Masetty Update documentation to list the gpu opp table bindings including the newly added "opp-peak-kBps" needed for GPU-DDR bandwidth scaling. Signed-off-by: Sharat Masetty Acked-by: Rob Herring Signed-off-by: Akhil P Oommen --- .../devicetree/bindings/display/msm/gpu.txt| 28 ++ 1 file changed, 28 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index fd779cd..1af0ff1 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -112,6 +112,34 @@ Example a6xx (with GMU): interconnects = <_hlos MASTER_GFX3D _hlos SLAVE_EBI1>; interconnect-names = "gfx-mem"; + gpu_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-43000 { + opp-hz = /bits/ 64 <43000>; + opp-level = ; + opp-peak-kBps = <5412000>; + }; + + opp-35500 { + opp-hz = /bits/ 64 <35500>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-26700 { + opp-hz = /bits/ 64 <26700>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-18000 { + opp-hz = /bits/ 64 <18000>; + opp-level = ; + opp-peak-kBps = <1804000>; + }; + }; + qcom,gmu = <>; zap-shader { -- 2.7.4
[PATCH] drm: msm: a6xx: fix gpu failure after system resume
On targets where GMU is available, GMU takes over the ownership of GX GDSC during its initialization. So, take a refcount on the GX PD on behalf of GMU before we initialize it. This makes sure that nobody can collapse the GX GDSC once GMU owns the GX GDSC. This patch fixes some weird failures during GPU wake up during system resume. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index a6f43ff..5b2df7d 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -873,10 +873,19 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) /* Turn on the resources */ pm_runtime_get_sync(gmu->dev); + /* +* "enable" the GX power domain which won't actually do anything but it +* will make sure that the refcounting is correct in case we need to +* bring down the GX after a GMU failure +*/ + if (!IS_ERR_OR_NULL(gmu->gxpd)) + pm_runtime_get_sync(gmu->gxpd); + /* Use a known rate to bring up the GMU */ clk_set_rate(gmu->core_clk, 2); ret = clk_bulk_prepare_enable(gmu->nr_clocks, gmu->clocks); if (ret) { + pm_runtime_put(gmu->gxpd); pm_runtime_put(gmu->dev); return ret; } @@ -919,19 +928,12 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) /* Set the GPU to the current freq */ a6xx_gmu_set_initial_freq(gpu, gmu); - /* -* "enable" the GX power domain which won't actually do anything but it -* will make sure that the refcounting is correct in case we need to -* bring down the GX after a GMU failure -*/ - if (!IS_ERR_OR_NULL(gmu->gxpd)) - pm_runtime_get(gmu->gxpd); - out: /* On failure, shut down the GMU to leave it in a good state */ if (ret) { disable_irq(gmu->gmu_irq); a6xx_rpmh_stop(gmu); + pm_runtime_put(gmu->gxpd); pm_runtime_put(gmu->dev); } -- 2.7.4
[PATCH v5 2/6] drm: msm: a6xx: send opp instead of a frequency
From: Sharat Masetty This patch changes the plumbing to send the devfreq recommended opp rather than the frequency. Also consolidate and rearrange the code in a6xx to set the GPU frequency and the icc vote in preparation for the upcoming changes for GPU->DDR scaling votes. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 89 +++ drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 4 files changed, 52 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 21e77d6..856db46 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -103,17 +103,45 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF)); } -static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) +void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) { - struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu); - struct adreno_gpu *adreno_gpu = _gpu->base; - struct msm_gpu *gpu = _gpu->base; - int ret; + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + struct a6xx_gmu *gmu = _gpu->gmu; + u32 perf_index; + unsigned long gpu_freq; + int ret = 0; + + gpu_freq = dev_pm_opp_get_freq(opp); + + if (gpu_freq == gmu->freq) + return; + + for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) + if (gpu_freq == gmu->gpu_freqs[perf_index]) + break; + + gmu->current_perf_index = perf_index; + gmu->freq = gmu->gpu_freqs[perf_index]; + + /* +* This can get called from devfreq while the hardware is idle. Don't +* bring up the power if it isn't already active +*/ + if (pm_runtime_get_if_in_use(gmu->dev) == 0) + return; + + if (!gmu->legacy) { + a6xx_hfi_set_freq(gmu, perf_index); + icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + pm_runtime_put(gmu->dev); + return; + } gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0); gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING, - ((3 & 0xf) << 28) | index); + ((3 & 0xf) << 28) | perf_index); /* * Send an invalid index as a vote for the bus bandwidth and let the @@ -134,37 +162,6 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) * for now leave it at max so that the performance is nominal. */ icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); -} - -void a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq) -{ - struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); - struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); - struct a6xx_gmu *gmu = _gpu->gmu; - u32 perf_index = 0; - - if (freq == gmu->freq) - return; - - for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) - if (freq == gmu->gpu_freqs[perf_index]) - break; - - gmu->current_perf_index = perf_index; - gmu->freq = gmu->gpu_freqs[perf_index]; - - /* -* This can get called from devfreq while the hardware is idle. Don't -* bring up the power if it isn't already active -*/ - if (pm_runtime_get_if_in_use(gmu->dev) == 0) - return; - - if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, perf_index); - else - a6xx_hfi_set_freq(gmu, perf_index); - pm_runtime_put(gmu->dev); } @@ -839,6 +836,19 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu) a6xx_gmu_rpmh_off(gmu); } +static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + a6xx_gmu_set_freq(gpu, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -898,10 +908,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) enable_irq(gmu->hfi_irq); /* Set the GPU to the current freq */ - if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, gmu->current_perf_index); - else - a6xx_hfi_set_freq(gmu, gmu->current_perf_index); +
[PATCH v5 6/6] arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp
From: Sharat Masetty Add opp-peak-kBps bindings to the GPU opp table, listing the peak GPU -> DDR bandwidth requirement for each opp level. This will be used to scale the DDR bandwidth along with the GPU frequency dynamically. Signed-off-by: Sharat Masetty Reviewed-by: Matthias Kaehlcke Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index a567297..8567e9e 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1478,36 +1478,43 @@ opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; + opp-peak-kBps = <8532000>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; + opp-peak-kBps = <1804000>; }; }; }; -- 2.7.4
[PATCH v5 5/6] arm64: dts: qcom: sc7180: Add interconnects property for GPU
From: Sharat Masetty This patch adds the interconnects property to the GPU node. This enables the GPU->DDR path bandwidth voting. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 31b9217..a567297 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1470,6 +1470,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; -- 2.7.4
[PATCH v5 3/6] drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR
From: Sharat Masetty This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 856db46..a6f43ff 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -133,7 +133,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (!gmu->legacy) { a6xx_hfi_set_freq(gmu, perf_index); - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); pm_runtime_put(gmu->dev); return; } @@ -157,11 +157,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (ret) dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret); - /* -* Eventually we will want to scale the path vote with the frequency but -* for now leave it at max so that the performance is nominal. -*/ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); pm_runtime_put(gmu->dev); } @@ -849,6 +845,19 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) dev_pm_opp_put(gpu_opp); } +static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + dev_pm_opp_set_bw(>pdev->dev, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -873,7 +882,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) } /* Set the bus quota to a reasonable value for boot */ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(3072)); + a6xx_gmu_set_initial_bw(gpu, gmu); /* Enable the GMU interrupt */ gmu_write(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_CLR, ~0); @@ -1049,7 +1058,7 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu) a6xx_gmu_shutdown(gmu); /* Remove the bus vote */ - icc_set_bw(gpu->icc_path, 0, 0); + dev_pm_opp_set_bw(>pdev->dev, NULL); /* * Make sure the GX domain is off before turning off the GMU (CX) -- 2.7.4
[PATCH v5 4/6] arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling
From: Sharat Masetty This patch adds the interconnects property for the gpu node and the opp-peak-kBps property to the opps of the gpu opp table. This should help enable DDR bandwidth scaling dynamically and proportionally to the GPU frequency. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index 8eb5a31..5e9561a 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -3515,42 +3515,51 @@ qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _noc SLAVE_EBI1>; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; opp-71000 { opp-hz = /bits/ 64 <71000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-67500 { opp-hz = /bits/ 64 <67500>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-59600 { opp-hz = /bits/ 64 <59600>; opp-level = ; + opp-peak-kBps = <622>; }; opp-52000 { opp-hz = /bits/ 64 <52000>; opp-level = ; + opp-peak-kBps = <622>; }; opp-41400 { opp-hz = /bits/ 64 <41400>; opp-level = ; + opp-peak-kBps = <4068000>; }; opp-34200 { opp-hz = /bits/ 64 <34200>; opp-level = ; + opp-peak-kBps = <2724000>; }; opp-25700 { opp-hz = /bits/ 64 <25700>; opp-level = ; + opp-peak-kBps = <1648000>; }; }; }; -- 2.7.4
[PATCH v5 1/6] dt-bindings: drm/msm/gpu: Document gpu opp table
From: Sharat Masetty Update documentation to list the gpu opp table bindings including the newly added "opp-peak-kBps" needed for GPU-DDR bandwidth scaling. Signed-off-by: Sharat Masetty Acked-by: Rob Herring Signed-off-by: Akhil P Oommen --- .../devicetree/bindings/display/msm/gpu.txt| 28 ++ 1 file changed, 28 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index fd779cd..1af0ff1 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -112,6 +112,34 @@ Example a6xx (with GMU): interconnects = <_hlos MASTER_GFX3D _hlos SLAVE_EBI1>; interconnect-names = "gfx-mem"; + gpu_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-43000 { + opp-hz = /bits/ 64 <43000>; + opp-level = ; + opp-peak-kBps = <5412000>; + }; + + opp-35500 { + opp-hz = /bits/ 64 <35500>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-26700 { + opp-hz = /bits/ 64 <26700>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-18000 { + opp-hz = /bits/ 64 <18000>; + opp-level = ; + opp-peak-kBps = <1804000>; + }; + }; + qcom,gmu = <>; zap-shader { -- 2.7.4
[PATCH v5 0/6] Add support for GPU DDR BW scaling
This series adds support for GPU DDR bandwidth scaling and is based on the bindings from Georgi [1]. This is mostly a rebase of Sharat's patches [2] on the tip of msm-next branch. Changes from v4: - Squashed a patch to another one to fix Jonathan's comment - Add back the pm_runtime_get_if_in_use() check Changes from v3: - Rebased on top of Jonathan's patch which adds support for changing gpu freq through hfi on newer targets - As suggested by Rob, left the icc_path intact for pre-a6xx GPUs [1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/vireshk/pm/+log/opp/linux-next/ [2] https://patchwork.freedesktop.org/series/75291/ Sharat Masetty (6): dt-bindings: drm/msm/gpu: Document gpu opp table drm: msm: a6xx: send opp instead of a frequency drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling arm64: dts: qcom: sc7180: Add interconnects property for GPU arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp .../devicetree/bindings/display/msm/gpu.txt| 28 ++ arch/arm64/boot/dts/qcom/sc7180.dtsi | 9 ++ arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 ++ drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 108 - drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 7 files changed, 112 insertions(+), 50 deletions(-) -- 2.7.4
Re: [PATCH v4 3/7] drm: msm: a6xx: set gpu freq through hfi
On 7/11/2020 2:43 AM, Akhil P Oommen wrote: On 7/10/2020 1:34 AM, Jonathan Marek wrote: On 7/9/20 4:00 PM, Akhil P Oommen wrote: Newer targets support changing gpu frequency through HFI. So use that wherever supported instead of the legacy method. It was already using HFI on newer targets. Don't break it in one commit then fix it in the next. Oops. I somehow got confused. Will fix and resend. -Akhil I broke the pm_runtime_get_if_in_use() check too. Other than that, just squashing this patch with the previous one should be enough. -Akhil. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 233afea..b547339 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -121,6 +121,12 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (gpu_freq == gmu->gpu_freqs[perf_index]) break; + if (!gmu->legacy) { + a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + return; + } + gmu->current_perf_index = perf_index; gmu->freq = gmu->gpu_freqs[perf_index]; @@ -893,10 +899,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) enable_irq(gmu->hfi_irq); /* Set the GPU to the current freq */ - if (gmu->legacy) - a6xx_gmu_set_initial_freq(gpu, gmu); - else - a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + a6xx_gmu_set_initial_freq(gpu, gmu); /* * "enable" the GX power domain which won't actually do anything but it ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v4 3/7] drm: msm: a6xx: set gpu freq through hfi
On 7/10/2020 1:34 AM, Jonathan Marek wrote: On 7/9/20 4:00 PM, Akhil P Oommen wrote: Newer targets support changing gpu frequency through HFI. So use that wherever supported instead of the legacy method. It was already using HFI on newer targets. Don't break it in one commit then fix it in the next. Oops. I somehow got confused. Will fix and resend. -Akhil Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 233afea..b547339 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -121,6 +121,12 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (gpu_freq == gmu->gpu_freqs[perf_index]) break; + if (!gmu->legacy) { + a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + return; + } + gmu->current_perf_index = perf_index; gmu->freq = gmu->gpu_freqs[perf_index]; @@ -893,10 +899,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) enable_irq(gmu->hfi_irq); /* Set the GPU to the current freq */ - if (gmu->legacy) - a6xx_gmu_set_initial_freq(gpu, gmu); - else - a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + a6xx_gmu_set_initial_freq(gpu, gmu); /* * "enable" the GX power domain which won't actually do anything but it ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Freedreno] [PATCH v4 4/7] drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR
On 7/11/2020 1:11 AM, Rob Clark wrote: On Thu, Jul 9, 2020 at 1:01 PM Akhil P Oommen wrote: From: Sharat Masetty This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index b547339..6fbfd7d 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -123,7 +123,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (!gmu->legacy) { a6xx_hfi_set_freq(gmu, gmu->current_perf_index); - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); What is the status of the patch to add dev_pm_opp_set_bw()? If it is ready to go, and I get an ack-by from the OPP maintainer, I suppose I could merge it via drm/msm. Otherwise should we consider pulling in a private copy of it into drm/msm (and then drop it to use the helper in, hopefully, the next cycle)? I'm pulling the patches preceding this one into msm-next-staging to do some testing. And the dt patches following this one would normally get merged via Bjorn. At the moment, I'm not sure what to do with this one. BR, -R I see Sibi's patch is already picked in opp/linux-next branch. https://kernel.googlesource.com/pub/scm/linux/kernel/git/vireshk/pm/+/b466542f331e221a3628c1cfe5ccff307d7d787f Thanks, -Akhil return; } @@ -149,11 +149,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (ret) dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret); - /* -* Eventually we will want to scale the path vote with the frequency but -* for now leave it at max so that the performance is nominal. -*/ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); } unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu) @@ -840,6 +836,19 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) dev_pm_opp_put(gpu_opp); } +static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + dev_pm_opp_set_bw(>pdev->dev, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -864,7 +873,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) } /* Set the bus quota to a reasonable value for boot */ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(3072)); + a6xx_gmu_set_initial_bw(gpu, gmu); /* Enable the GMU interrupt */ gmu_write(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_CLR, ~0); @@ -1040,7 +1049,7 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu) a6xx_gmu_shutdown(gmu); /* Remove the bus vote */ - icc_set_bw(gpu->icc_path, 0, 0); + dev_pm_opp_set_bw(>pdev->dev, NULL); /* * Make sure the GX domain is off before turning off the GMU (CX) -- 2.7.4 ___ Freedreno mailing list freedr...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno
[PATCH v1] drm/msm: Fix a null pointer access in msm_gem_shrinker_count()
Adding an msm_gem_object object to the inactive_list before completing its initialization is a bad idea because shrinker may pick it up from the inactive_list. Fix this by making sure that the initialization is complete before moving the msm_obj object to the inactive list. This patch fixes the below error: [10027.553044] Unable to handle kernel NULL pointer dereference at virtual address 0068 [10027.573305] Mem abort info: [10027.590160] ESR = 0x9606 [10027.597905] EC = 0x25: DABT (current EL), IL = 32 bits [10027.614430] SET = 0, FnV = 0 [10027.624427] EA = 0, S1PTW = 0 [10027.632722] Data abort info: [10027.638039] ISV = 0, ISS = 0x0006 [10027.647459] CM = 0, WnR = 0 [10027.654345] user pgtable: 4k pages, 39-bit VAs, pgdp=0001e3a6a000 [10027.672681] [0068] pgd=000198c31003, pud=000198c31003, pmd= [10027.693900] Internal error: Oops: 9606 [#1] PREEMPT SMP [10027.738261] CPU: 3 PID: 214 Comm: kswapd0 Tainted: G S5.4.40 #1 [10027.745766] Hardware name: Qualcomm Technologies, Inc. SC7180 IDP (DT) [10027.752472] pstate: 80c9 (Nzcv daif +PAN +UAO) [10027.757409] pc : mutex_is_locked+0x14/0x2c [10027.761626] lr : msm_gem_shrinker_count+0x70/0xec [10027.766454] sp : ffc011323ad0 [10027.769867] x29: ffc011323ad0 x28: ffe677e4b878 [10027.775324] x27: 0cc0 x26: [10027.780783] x25: ff817114a708 x24: 0008 [10027.786242] x23: ff8023ab7170 x22: 0001 [10027.791701] x21: ff817114a080 x20: 0119 [10027.797160] x19: 0068 x18: 03bc [10027.802621] x17: 04a34210 x16: 00c0 [10027.808083] x15: x14: [10027.813542] x13: ffe677e0a3c0 x12: [10027.819000] x11: x10: ff8174b94340 [10027.824461] x9 : x8 : [10027.829919] x7 : 01fc x6 : ffc011323c88 [10027.835373] x5 : 0001 x4 : ffc011323d80 [10027.840832] x3 : 0477b348 x2 : [10027.846290] x1 : ffc011323b68 x0 : 0068 [10027.851748] Call trace: [10027.854264] mutex_is_locked+0x14/0x2c [10027.858121] msm_gem_shrinker_count+0x70/0xec [10027.862603] shrink_slab+0xc0/0x4b4 [10027.866187] shrink_node+0x4a8/0x818 [10027.869860] kswapd+0x624/0x890 [10027.873097] kthread+0x11c/0x12c [10027.876424] ret_from_fork+0x10/0x18 [10027.880102] Code: f9000bf3 910003fd aa0003f3 d503201f (f9400268) [10027.886362] ---[ end trace df5849a1a3543251 ]--- [10027.891518] Kernel panic - not syncing: Fatal exception Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gem.c | 36 +--- 1 file changed, 21 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 6277fde..f63bb7e 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -994,10 +994,8 @@ int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file, static int msm_gem_new_impl(struct drm_device *dev, uint32_t size, uint32_t flags, - struct drm_gem_object **obj, - bool struct_mutex_locked) + struct drm_gem_object **obj) { - struct msm_drm_private *priv = dev->dev_private; struct msm_gem_object *msm_obj; switch (flags & MSM_BO_CACHE_MASK) { @@ -1023,15 +1021,6 @@ static int msm_gem_new_impl(struct drm_device *dev, INIT_LIST_HEAD(_obj->submit_entry); INIT_LIST_HEAD(_obj->vmas); - if (struct_mutex_locked) { - WARN_ON(!mutex_is_locked(>struct_mutex)); - list_add_tail(_obj->mm_list, >inactive_list); - } else { - mutex_lock(>struct_mutex); - list_add_tail(_obj->mm_list, >inactive_list); - mutex_unlock(>struct_mutex); - } - *obj = _obj->base; return 0; @@ -1041,6 +1030,7 @@ static struct drm_gem_object *_msm_gem_new(struct drm_device *dev, uint32_t size, uint32_t flags, bool struct_mutex_locked) { struct msm_drm_private *priv = dev->dev_private; + struct msm_gem_object *msm_obj; struct drm_gem_object *obj = NULL; bool use_vram = false; int ret; @@ -1061,14 +1051,15 @@ static struct drm_gem_object *_msm_gem_new(struct drm_device *dev, if (size == 0) return ERR_PTR(-EINVAL); - ret = msm_gem_new_impl(dev, size, flags, , struct_mutex_locked); + ret = msm_gem_new_impl(dev, size, flags, ); if (ret) goto fail; + msm_obj = to_msm_bo(obj); + if (use_vram) { struct msm_gem_vma *vma; struct page **pages; - struct msm_gem_object *msm_obj = to_msm_bo(obj); mutex_lock(_obj->lo
[PATCH v4 1/7] dt-bindings: drm/msm/gpu: Document gpu opp table
From: Sharat Masetty Update documentation to list the gpu opp table bindings including the newly added "opp-peak-kBps" needed for GPU-DDR bandwidth scaling. Signed-off-by: Sharat Masetty Acked-by: Rob Herring Signed-off-by: Akhil P Oommen --- .../devicetree/bindings/display/msm/gpu.txt| 28 ++ 1 file changed, 28 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/gpu.txt b/Documentation/devicetree/bindings/display/msm/gpu.txt index fd779cd..1af0ff1 100644 --- a/Documentation/devicetree/bindings/display/msm/gpu.txt +++ b/Documentation/devicetree/bindings/display/msm/gpu.txt @@ -112,6 +112,34 @@ Example a6xx (with GMU): interconnects = <_hlos MASTER_GFX3D _hlos SLAVE_EBI1>; interconnect-names = "gfx-mem"; + gpu_opp_table: opp-table { + compatible = "operating-points-v2"; + + opp-43000 { + opp-hz = /bits/ 64 <43000>; + opp-level = ; + opp-peak-kBps = <5412000>; + }; + + opp-35500 { + opp-hz = /bits/ 64 <35500>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-26700 { + opp-hz = /bits/ 64 <26700>; + opp-level = ; + opp-peak-kBps = <3072000>; + }; + + opp-18000 { + opp-hz = /bits/ 64 <18000>; + opp-level = ; + opp-peak-kBps = <1804000>; + }; + }; + qcom,gmu = <>; zap-shader { -- 2.7.4
[PATCH v4 5/7] arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling
From: Sharat Masetty This patch adds the interconnects property for the gpu node and the opp-peak-kBps property to the opps of the gpu opp table. This should help enable DDR bandwidth scaling dynamically and proportionally to the GPU frequency. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index 8eb5a31..5e9561a 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -3515,42 +3515,51 @@ qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _noc SLAVE_EBI1>; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; opp-71000 { opp-hz = /bits/ 64 <71000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-67500 { opp-hz = /bits/ 64 <67500>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-59600 { opp-hz = /bits/ 64 <59600>; opp-level = ; + opp-peak-kBps = <622>; }; opp-52000 { opp-hz = /bits/ 64 <52000>; opp-level = ; + opp-peak-kBps = <622>; }; opp-41400 { opp-hz = /bits/ 64 <41400>; opp-level = ; + opp-peak-kBps = <4068000>; }; opp-34200 { opp-hz = /bits/ 64 <34200>; opp-level = ; + opp-peak-kBps = <2724000>; }; opp-25700 { opp-hz = /bits/ 64 <25700>; opp-level = ; + opp-peak-kBps = <1648000>; }; }; }; -- 2.7.4
[PATCH v4 7/7] arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp
From: Sharat Masetty Add opp-peak-kBps bindings to the GPU opp table, listing the peak GPU -> DDR bandwidth requirement for each opp level. This will be used to scale the DDR bandwidth along with the GPU frequency dynamically. Signed-off-by: Sharat Masetty Reviewed-by: Matthias Kaehlcke Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index a567297..8567e9e 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1478,36 +1478,43 @@ opp-8 { opp-hz = /bits/ 64 <8>; opp-level = ; + opp-peak-kBps = <8532000>; }; opp-65000 { opp-hz = /bits/ 64 <65000>; opp-level = ; + opp-peak-kBps = <7216000>; }; opp-56500 { opp-hz = /bits/ 64 <56500>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-43000 { opp-hz = /bits/ 64 <43000>; opp-level = ; + opp-peak-kBps = <5412000>; }; opp-35500 { opp-hz = /bits/ 64 <35500>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-26700 { opp-hz = /bits/ 64 <26700>; opp-level = ; + opp-peak-kBps = <3072000>; }; opp-18000 { opp-hz = /bits/ 64 <18000>; opp-level = ; + opp-peak-kBps = <1804000>; }; }; }; -- 2.7.4
[PATCH v4 4/7] drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR
From: Sharat Masetty This patches replaces the previously used static DDR vote and uses dev_pm_opp_set_bw() to scale GPU->DDR bandwidth along with scaling GPU frequency. Also since the icc path voting is handled completely in the opp driver, remove the icc_path handle and its usage in the drm driver. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index b547339..6fbfd7d 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -123,7 +123,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (!gmu->legacy) { a6xx_hfi_set_freq(gmu, gmu->current_perf_index); - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); return; } @@ -149,11 +149,7 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (ret) dev_err(gmu->dev, "GMU set GPU frequency error: %d\n", ret); - /* -* Eventually we will want to scale the path vote with the frequency but -* for now leave it at max so that the performance is nominal. -*/ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + dev_pm_opp_set_bw(>pdev->dev, opp); } unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu) @@ -840,6 +836,19 @@ static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) dev_pm_opp_put(gpu_opp); } +static void a6xx_gmu_set_initial_bw(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + dev_pm_opp_set_bw(>pdev->dev, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -864,7 +873,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) } /* Set the bus quota to a reasonable value for boot */ - icc_set_bw(gpu->icc_path, 0, MBps_to_icc(3072)); + a6xx_gmu_set_initial_bw(gpu, gmu); /* Enable the GMU interrupt */ gmu_write(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_CLR, ~0); @@ -1040,7 +1049,7 @@ int a6xx_gmu_stop(struct a6xx_gpu *a6xx_gpu) a6xx_gmu_shutdown(gmu); /* Remove the bus vote */ - icc_set_bw(gpu->icc_path, 0, 0); + dev_pm_opp_set_bw(>pdev->dev, NULL); /* * Make sure the GX domain is off before turning off the GMU (CX) -- 2.7.4
[PATCH v4 2/7] drm: msm: a6xx: send opp instead of a frequency
From: Sharat Masetty This patch changes the plumbing to send the devfreq recommended opp rather than the frequency. Also consolidate and rearrange the code in a6xx to set the GPU frequency and the icc vote in preparation for the upcoming changes for GPU->DDR scaling votes. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 73 --- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 4 files changed, 38 insertions(+), 43 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 21e77d6..233afea 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -103,17 +103,31 @@ bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu) A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF)); } -static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) +void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) { - struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu); - struct adreno_gpu *adreno_gpu = _gpu->base; - struct msm_gpu *gpu = _gpu->base; - int ret; + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); + struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); + struct a6xx_gmu *gmu = _gpu->gmu; + u32 perf_index; + unsigned long gpu_freq; + int ret = 0; + + gpu_freq = dev_pm_opp_get_freq(opp); + + if (gpu_freq == gmu->freq) + return; + + for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) + if (gpu_freq == gmu->gpu_freqs[perf_index]) + break; + + gmu->current_perf_index = perf_index; + gmu->freq = gmu->gpu_freqs[perf_index]; gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0); gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING, - ((3 & 0xf) << 28) | index); + ((3 & 0xf) << 28) | perf_index); /* * Send an invalid index as a vote for the bus bandwidth and let the @@ -136,38 +150,6 @@ static void __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index) icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); } -void a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq) -{ - struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); - struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu); - struct a6xx_gmu *gmu = _gpu->gmu; - u32 perf_index = 0; - - if (freq == gmu->freq) - return; - - for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++) - if (freq == gmu->gpu_freqs[perf_index]) - break; - - gmu->current_perf_index = perf_index; - gmu->freq = gmu->gpu_freqs[perf_index]; - - /* -* This can get called from devfreq while the hardware is idle. Don't -* bring up the power if it isn't already active -*/ - if (pm_runtime_get_if_in_use(gmu->dev) == 0) - return; - - if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, perf_index); - else - a6xx_hfi_set_freq(gmu, perf_index); - - pm_runtime_put(gmu->dev); -} - unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu) { struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); @@ -839,6 +821,19 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu) a6xx_gmu_rpmh_off(gmu); } +static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu) +{ + struct dev_pm_opp *gpu_opp; + unsigned long gpu_freq = gmu->gpu_freqs[gmu->current_perf_index]; + + gpu_opp = dev_pm_opp_find_freq_exact(>pdev->dev, gpu_freq, true); + if (IS_ERR_OR_NULL(gpu_opp)) + return; + + a6xx_gmu_set_freq(gpu, gpu_opp); + dev_pm_opp_put(gpu_opp); +} + int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) { struct adreno_gpu *adreno_gpu = _gpu->base; @@ -899,7 +894,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) /* Set the GPU to the current freq */ if (gmu->legacy) - __a6xx_gmu_set_freq(gmu, gmu->current_perf_index); + a6xx_gmu_set_initial_freq(gpu, gmu); else a6xx_hfi_set_freq(gmu, gmu->current_perf_index); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h index 7239b8b..03ba60d 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h @@ -63,7 +63,7 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum a6xx_gmu_oob_state state); int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node); void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
[PATCH v4 3/7] drm: msm: a6xx: set gpu freq through hfi
Newer targets support changing gpu frequency through HFI. So use that wherever supported instead of the legacy method. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c index 233afea..b547339 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c @@ -121,6 +121,12 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct dev_pm_opp *opp) if (gpu_freq == gmu->gpu_freqs[perf_index]) break; + if (!gmu->legacy) { + a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + icc_set_bw(gpu->icc_path, 0, MBps_to_icc(7216)); + return; + } + gmu->current_perf_index = perf_index; gmu->freq = gmu->gpu_freqs[perf_index]; @@ -893,10 +899,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu) enable_irq(gmu->hfi_irq); /* Set the GPU to the current freq */ - if (gmu->legacy) - a6xx_gmu_set_initial_freq(gpu, gmu); - else - a6xx_hfi_set_freq(gmu, gmu->current_perf_index); + a6xx_gmu_set_initial_freq(gpu, gmu); /* * "enable" the GX power domain which won't actually do anything but it -- 2.7.4
[PATCH v4 6/7] arm64: dts: qcom: sc7180: Add interconnects property for GPU
From: Sharat Masetty This patch adds the interconnects property to the GPU node. This enables the GPU->DDR path bandwidth voting. Signed-off-by: Sharat Masetty Signed-off-by: Akhil P Oommen --- arch/arm64/boot/dts/qcom/sc7180.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi b/arch/arm64/boot/dts/qcom/sc7180.dtsi index 31b9217..a567297 100644 --- a/arch/arm64/boot/dts/qcom/sc7180.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi @@ -1470,6 +1470,8 @@ operating-points-v2 = <_opp_table>; qcom,gmu = <>; + interconnects = <_noc MASTER_GFX3D _virt SLAVE_EBI1>; + gpu_opp_table: opp-table { compatible = "operating-points-v2"; -- 2.7.4
[PATCH v4 0/7] Add support for GPU DDR BW scaling
This is mostly a rebase of Sharat's patches [1] on the tip of msm-next branch. Changes compared to v3: 1. Rebased on top of Jonathan's patch which adds support for changing gpu freq through hfi on newer targets. Created patch 1 to make this the generic approach of setting gpu freq on newer targets. 2. As suggested by Rob, left the icc_path intact for pre-a6xx GPUs. As mentioned in [1], these patches have dependency on Georgi's series from opp/linux-next [2] and also Sibi's patch which adds a helper function to set and clear ddr bandwidth vote [2]. [1] https://patchwork.freedesktop.org/series/75291/ [2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/vireshk/pm/+log/opp/linux-next/ Akhil P Oommen (1): drm: msm: a6xx: set gpu freq through hfi Sharat Masetty (6): dt-bindings: drm/msm/gpu: Document gpu opp table drm: msm: a6xx: send opp instead of a frequency drm: msm: a6xx: use dev_pm_opp_set_bw to scale DDR arm64: dts: qcom: SDM845: Enable GPU DDR bw scaling arm64: dts: qcom: sc7180: Add interconnects property for GPU arm64: dts: qcom: sc7180: Add opp-peak-kBps to GPU opp .../devicetree/bindings/display/msm/gpu.txt| 28 ++ arch/arm64/boot/dts/qcom/sc7180.dtsi | 9 ++ arch/arm64/boot/dts/qcom/sdm845.dtsi | 9 ++ drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 105 +++-- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 +- drivers/gpu/drm/msm/msm_gpu.c | 3 +- drivers/gpu/drm/msm/msm_gpu.h | 3 +- 7 files changed, 106 insertions(+), 53 deletions(-) -- 2.7.4
Re: [PATCH v3] PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.
On 6/22/2018 1:52 PM, Enric Balletbo i Serra wrote: Hi Ezequiel and Akhil, On 22/06/18 09:03, Akhil P Oommen wrote: On 6/22/2018 6:41 AM, Ezequiel Garcia wrote: Hey Enric, On Fri, 2018-06-22 at 00:04 +0200, Enric Balletbo i Serra wrote: When the devfreq driver and the governor driver are built as modules, the call to devfreq_add_device() or governor_store() fails because the governor driver is not loaded at the time the devfreq driver loads. The devfreq driver has a build dependency on the governor but also should have a runtime dependency. We need to make sure that the governor driver is loaded before the devfreq driver. This patch fixes this bug by adding a try_then_request_governor() function. First tries to find the governor, and then, if it is not found, it requests the module and tries again. Fixes: 1b5c1be2c88e (PM / devfreq: map devfreq drivers to governor using name) Signed-off-by: Enric Balletbo i Serra --- Changes in v3: - Remove unneded change in dev_err message. - Fix err returned value in case to not find the governor. Changes in v2: - Add a new function to request the module and call that function from devfreq_add_device and governor_store. drivers/devfreq/devfreq.c | 65 - -- [snip snip] - governor = find_devfreq_governor(devfreq->governor_name); + governor = try_then_request_governor(devfreq- governor_name); if (IS_ERR(governor)) { dev_err(dev, "%s: Unable to find governor for the device\n", __func__); err = PTR_ERR(governor); - goto err_init; + goto err_unregister; } + mutex_lock(_list_lock); + I know it's not something we are introducing in this patch, but still... calling a hook with a mutex held looks fishy to me. This lock should only protect the list, unless I am missing something. I think so too. devfreq->governor = governor; err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START, NULL); @@ -663,14 +703,16 @@ struct devfreq *devfreq_add_device(struct device *dev, __func__); goto err_init; } + + list_add(>node, _list); + mutex_unlock(_list_lock); return devfreq; err_init: - list_del(>node); mutex_unlock(_list_lock); - +err_unregister: device_unregister(>dev); err_dev: if (devfreq) @@ -988,12 +1030,13 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr, if (ret != 1) return -EINVAL; - mutex_lock(_list_lock); - governor = find_devfreq_governor(str_governor); + governor = try_then_request_governor(str_governor); if (IS_ERR(governor)) { - ret = PTR_ERR(governor); - goto out; + return PTR_ERR(governor); } + + mutex_lock(_list_lock); + if (df->governor == governor) { ret = 0; goto out; -- 2.17.1 Regards, Eze Adding to Ezequiel's point, shouldn't we take more granular lock (devfreq->lock) first and then call devfreq_list_lock at the time of adding to the list? Yes, I think so. I think, though, that this should be a separate patch, not sure if a pre or post patch to this one, but for sure it's another topic. Current patch tries to solve different problem an only tries to follow the current locking/unlocking. Anyway this is a maintainer decision I guess. Thanks, Enric -Akhil. I agree. -Akhil.
Re: [PATCH v3] PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.
On 6/22/2018 1:52 PM, Enric Balletbo i Serra wrote: Hi Ezequiel and Akhil, On 22/06/18 09:03, Akhil P Oommen wrote: On 6/22/2018 6:41 AM, Ezequiel Garcia wrote: Hey Enric, On Fri, 2018-06-22 at 00:04 +0200, Enric Balletbo i Serra wrote: When the devfreq driver and the governor driver are built as modules, the call to devfreq_add_device() or governor_store() fails because the governor driver is not loaded at the time the devfreq driver loads. The devfreq driver has a build dependency on the governor but also should have a runtime dependency. We need to make sure that the governor driver is loaded before the devfreq driver. This patch fixes this bug by adding a try_then_request_governor() function. First tries to find the governor, and then, if it is not found, it requests the module and tries again. Fixes: 1b5c1be2c88e (PM / devfreq: map devfreq drivers to governor using name) Signed-off-by: Enric Balletbo i Serra --- Changes in v3: - Remove unneded change in dev_err message. - Fix err returned value in case to not find the governor. Changes in v2: - Add a new function to request the module and call that function from devfreq_add_device and governor_store. drivers/devfreq/devfreq.c | 65 - -- [snip snip] - governor = find_devfreq_governor(devfreq->governor_name); + governor = try_then_request_governor(devfreq- governor_name); if (IS_ERR(governor)) { dev_err(dev, "%s: Unable to find governor for the device\n", __func__); err = PTR_ERR(governor); - goto err_init; + goto err_unregister; } + mutex_lock(_list_lock); + I know it's not something we are introducing in this patch, but still... calling a hook with a mutex held looks fishy to me. This lock should only protect the list, unless I am missing something. I think so too. devfreq->governor = governor; err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START, NULL); @@ -663,14 +703,16 @@ struct devfreq *devfreq_add_device(struct device *dev, __func__); goto err_init; } + + list_add(>node, _list); + mutex_unlock(_list_lock); return devfreq; err_init: - list_del(>node); mutex_unlock(_list_lock); - +err_unregister: device_unregister(>dev); err_dev: if (devfreq) @@ -988,12 +1030,13 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr, if (ret != 1) return -EINVAL; - mutex_lock(_list_lock); - governor = find_devfreq_governor(str_governor); + governor = try_then_request_governor(str_governor); if (IS_ERR(governor)) { - ret = PTR_ERR(governor); - goto out; + return PTR_ERR(governor); } + + mutex_lock(_list_lock); + if (df->governor == governor) { ret = 0; goto out; -- 2.17.1 Regards, Eze Adding to Ezequiel's point, shouldn't we take more granular lock (devfreq->lock) first and then call devfreq_list_lock at the time of adding to the list? Yes, I think so. I think, though, that this should be a separate patch, not sure if a pre or post patch to this one, but for sure it's another topic. Current patch tries to solve different problem an only tries to follow the current locking/unlocking. Anyway this is a maintainer decision I guess. Thanks, Enric -Akhil. I agree. -Akhil.
Re: [PATCH v3] PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.
On 6/22/2018 6:41 AM, Ezequiel Garcia wrote: Hey Enric, On Fri, 2018-06-22 at 00:04 +0200, Enric Balletbo i Serra wrote: When the devfreq driver and the governor driver are built as modules, the call to devfreq_add_device() or governor_store() fails because the governor driver is not loaded at the time the devfreq driver loads. The devfreq driver has a build dependency on the governor but also should have a runtime dependency. We need to make sure that the governor driver is loaded before the devfreq driver. This patch fixes this bug by adding a try_then_request_governor() function. First tries to find the governor, and then, if it is not found, it requests the module and tries again. Fixes: 1b5c1be2c88e (PM / devfreq: map devfreq drivers to governor using name) Signed-off-by: Enric Balletbo i Serra --- Changes in v3: - Remove unneded change in dev_err message. - Fix err returned value in case to not find the governor. Changes in v2: - Add a new function to request the module and call that function from devfreq_add_device and governor_store. drivers/devfreq/devfreq.c | 65 - -- [snip snip] - governor = find_devfreq_governor(devfreq->governor_name); + governor = try_then_request_governor(devfreq- governor_name); if (IS_ERR(governor)) { dev_err(dev, "%s: Unable to find governor for the device\n", __func__); err = PTR_ERR(governor); - goto err_init; + goto err_unregister; } + mutex_lock(_list_lock); + I know it's not something we are introducing in this patch, but still... calling a hook with a mutex held looks fishy to me. This lock should only protect the list, unless I am missing something. devfreq->governor = governor; err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START, NULL); @@ -663,14 +703,16 @@ struct devfreq *devfreq_add_device(struct device *dev, __func__); goto err_init; } + + list_add(>node, _list); + mutex_unlock(_list_lock); return devfreq; err_init: - list_del(>node); mutex_unlock(_list_lock); - +err_unregister: device_unregister(>dev); err_dev: if (devfreq) @@ -988,12 +1030,13 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr, if (ret != 1) return -EINVAL; - mutex_lock(_list_lock); - governor = find_devfreq_governor(str_governor); + governor = try_then_request_governor(str_governor); if (IS_ERR(governor)) { - ret = PTR_ERR(governor); - goto out; + return PTR_ERR(governor); } + + mutex_lock(_list_lock); + if (df->governor == governor) { ret = 0; goto out; -- 2.17.1 Regards, Eze Adding to Ezequiel's point, shouldn't we take more granular lock (devfreq->lock) first and then call devfreq_list_lock at the time of adding to the list? -Akhil.
Re: [PATCH v3] PM / devfreq: Fix devfreq_add_device() when drivers are built as modules.
On 6/22/2018 6:41 AM, Ezequiel Garcia wrote: Hey Enric, On Fri, 2018-06-22 at 00:04 +0200, Enric Balletbo i Serra wrote: When the devfreq driver and the governor driver are built as modules, the call to devfreq_add_device() or governor_store() fails because the governor driver is not loaded at the time the devfreq driver loads. The devfreq driver has a build dependency on the governor but also should have a runtime dependency. We need to make sure that the governor driver is loaded before the devfreq driver. This patch fixes this bug by adding a try_then_request_governor() function. First tries to find the governor, and then, if it is not found, it requests the module and tries again. Fixes: 1b5c1be2c88e (PM / devfreq: map devfreq drivers to governor using name) Signed-off-by: Enric Balletbo i Serra --- Changes in v3: - Remove unneded change in dev_err message. - Fix err returned value in case to not find the governor. Changes in v2: - Add a new function to request the module and call that function from devfreq_add_device and governor_store. drivers/devfreq/devfreq.c | 65 - -- [snip snip] - governor = find_devfreq_governor(devfreq->governor_name); + governor = try_then_request_governor(devfreq- governor_name); if (IS_ERR(governor)) { dev_err(dev, "%s: Unable to find governor for the device\n", __func__); err = PTR_ERR(governor); - goto err_init; + goto err_unregister; } + mutex_lock(_list_lock); + I know it's not something we are introducing in this patch, but still... calling a hook with a mutex held looks fishy to me. This lock should only protect the list, unless I am missing something. devfreq->governor = governor; err = devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_START, NULL); @@ -663,14 +703,16 @@ struct devfreq *devfreq_add_device(struct device *dev, __func__); goto err_init; } + + list_add(>node, _list); + mutex_unlock(_list_lock); return devfreq; err_init: - list_del(>node); mutex_unlock(_list_lock); - +err_unregister: device_unregister(>dev); err_dev: if (devfreq) @@ -988,12 +1030,13 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr, if (ret != 1) return -EINVAL; - mutex_lock(_list_lock); - governor = find_devfreq_governor(str_governor); + governor = try_then_request_governor(str_governor); if (IS_ERR(governor)) { - ret = PTR_ERR(governor); - goto out; + return PTR_ERR(governor); } + + mutex_lock(_list_lock); + if (df->governor == governor) { ret = 0; goto out; -- 2.17.1 Regards, Eze Adding to Ezequiel's point, shouldn't we take more granular lock (devfreq->lock) first and then call devfreq_list_lock at the time of adding to the list? -Akhil.
Re: [RFC] PM / devfreq: Add support for alerts
On 5/31/2018 11:47 AM, MyungJoo Ham wrote: Currently, DEVFREQ reevaluates the device state periodically and/or based on the OPP list changes. Private API has to be exposed to allow the device driver to alert/notify the governor to reevaluate when a new set of data is available. This makes the governor more coupled to a particular device driver. We can improve here by exposing a DEVFREQ API to allow the device drivers to send generic alerts to the governor. Signed-off-by: Akhil P Oommen --- drivers/devfreq/devfreq.c | 21 + drivers/devfreq/governor.h | 1 + include/linux/devfreq.h| 5 + 3 files changed, 27 insertions(+) Hello Akhil, It appears that this will have the same effect with "[PATCH 08/11] PM / devfreq: Make update_devfreq() public" from Matthias Kaehlcke, doesn't it? Cheers, MyungJoo Hi MyngJoo, The patch you mentioned is a step in the right direction. But this patch allows: 1. the governor to decide whether to reevaluate or not. I feel it would be a better architecture (better Separation of Concern) if that decision is left to the governor alone. 2. the devices to share multiple types of alerts. A governor may use these alerts for internal bookkeeping/algorithm and decide to reevaluate policy when it is necessary. Since we are opening up a new devfreq API for devices, isn't it better to go for a generic one? Regards, Akhil.