Re: [PATCH] drm/amd/display: avoid 64-bit division
On Tue, Jul 9, 2019 at 6:40 PM Deucher, Alexander wrote: > > I'll just apply Arnd's patch. If the display team wants to adjust it later > to clarify the > operation, they should go ahead as a follow up patch. Thanks! > From: Abramov, Slava > Sent: Tuesday, July 9, 2019 12:31 PM > > Thanks for bisecting this issue. > > > > I wonder whether you are going to commit your patch or planning to update > > it and it's > > still in your work queue. We have one of our 32-bit builds failing because > > of this > > issue, so that I would like either to fix it or wait to your fix if it has > > chances to go > > upstream today. I was going to update the patch, but had not gotten to that yet. Arnd ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amd/display: avoid 64-bit division
I'll just apply Arnd's patch. If the display team wants to adjust it later to clarify the operation, they should go ahead as a follow up patch. Thanks, Alex From: Abramov, Slava Sent: Tuesday, July 9, 2019 12:31 PM To: Arnd Bergmann; Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter Cc: Liu, Charlene; Park, Chris; Cheng, Tony; Francis, David; linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, Bhawanpreet; Koo, Anthony Subject: Re: [PATCH] drm/amd/display: avoid 64-bit division Hi Arnd! Thanks for bisecting this issue. I wonder whether you are going to commit your patch or planning to update it and it's still in your work queue. We have one of our 32-bit builds failing because of this issue, so that I would like either to fix it or wait to your fix if it has chances to go upstream today. Sincerely, Slava Abramov From: amd-gfx on behalf of Arnd Bergmann Sent: Monday, July 8, 2019 9:52:08 AM To: Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter Cc: Liu, Charlene; Park, Chris; Arnd Bergmann; Cheng, Tony; Francis, David; linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, Bhawanpreet; Koo, Anthony Subject: [PATCH] drm/amd/display: avoid 64-bit division On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Signed-off-by: Arnd Bergmann --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index c17db5c144aa..8dbf759eba45 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( * but the difference is minimal and is in a safe direction, * which all works well around potential ambiguity of DP 1.4a spec. */ - long long fec_link_bw_kbps = link_bw_kbps * 970LL; - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, + link_bw_kbps, 32); } #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index b35327bafbc5..70ac8a95d2db 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; // FCLK:UCLK ratio is 1.08 - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100; + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32); calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? min_dcfclk : min_fclk_required_by_uclk; -- 2.20.0 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amd/display: avoid 64-bit division
Hi Arnd! Thanks for bisecting this issue. I wonder whether you are going to commit your patch or planning to update it and it's still in your work queue. We have one of our 32-bit builds failing because of this issue, so that I would like either to fix it or wait to your fix if it has chances to go upstream today. Sincerely, Slava Abramov From: amd-gfx on behalf of Arnd Bergmann Sent: Monday, July 8, 2019 9:52:08 AM To: Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter Cc: Liu, Charlene; Park, Chris; Arnd Bergmann; Cheng, Tony; Francis, David; linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, Bhawanpreet; Koo, Anthony Subject: [PATCH] drm/amd/display: avoid 64-bit division On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Signed-off-by: Arnd Bergmann --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index c17db5c144aa..8dbf759eba45 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( * but the difference is minimal and is in a safe direction, * which all works well around potential ambiguity of DP 1.4a spec. */ - long long fec_link_bw_kbps = link_bw_kbps * 970LL; - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, + link_bw_kbps, 32); } #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index b35327bafbc5..70ac8a95d2db 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; // FCLK:UCLK ratio is 1.08 - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100; + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32); calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? min_dcfclk : min_fclk_required_by_uclk; -- 2.20.0 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amd/display: avoid 64-bit division
Acked-by: Slava Abramov Tested-by: Slava Abramov From: amd-gfx on behalf of Arnd Bergmann Sent: Monday, July 8, 2019 9:52:08 AM To: Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter Cc: Liu, Charlene; Park, Chris; Arnd Bergmann; Cheng, Tony; Francis, David; linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, Bhawanpreet; Koo, Anthony Subject: [PATCH] drm/amd/display: avoid 64-bit division On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Signed-off-by: Arnd Bergmann --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index c17db5c144aa..8dbf759eba45 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( * but the difference is minimal and is in a safe direction, * which all works well around potential ambiguity of DP 1.4a spec. */ - long long fec_link_bw_kbps = link_bw_kbps * 970LL; - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, + link_bw_kbps, 32); } #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index b35327bafbc5..70ac8a95d2db 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; // FCLK:UCLK ratio is 1.08 - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100; + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32); calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? min_dcfclk : min_fclk_required_by_uclk; -- 2.20.0 ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amd/display: avoid 64-bit division
On 7/8/19 9:52 AM, Arnd Bergmann wrote: > On 32-bit architectures, dividing a 64-bit integer in the kernel > leads to a link error: > > ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! > ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! > > Change the two recently introduced instances to a multiply+shift > operation that is also much cheaper on 32-bit architectures. > We can do that here, since both of them are really 32-bit numbers > that change a few percent. > > Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to > dc_link") > Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic > for NV") > Signed-off-by: Arnd Bergmann > --- > drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- > drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c > b/drivers/gpu/drm/amd/display/dc/core/dc_link.c > index c17db5c144aa..8dbf759eba45 100644 > --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c > +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c > @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( >* but the difference is minimal and is in a safe direction, >* which all works well around potential ambiguity of DP 1.4a > spec. >*/ > - long long fec_link_bw_kbps = link_bw_kbps * 970LL; > - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); > + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, > +link_bw_kbps, 32); > } > #endif > > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > index b35327bafbc5..70ac8a95d2db 100644 > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c > @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct > _vcs_dpi_soc_bounding_box_ > calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / > 1000; > > // FCLK:UCLK ratio is 1.08 > - min_fclk_required_by_uclk = ((unsigned long > long)uclk_states[i]) * 1080 / 100; > + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 > / 100, uclk_states[i], 32); Even though the mul + shift will be faster here, I would prefer that this just be a div_u64 for clarity. Nicholas Kazlauskas > > calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk > < min_dcfclk) ? > min_dcfclk : min_fclk_required_by_uclk; > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/amd/display: avoid 64-bit division
On 32-bit architectures, dividing a 64-bit integer in the kernel leads to a link error: ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! Change the two recently introduced instances to a multiply+shift operation that is also much cheaper on 32-bit architectures. We can do that here, since both of them are really 32-bit numbers that change a few percent. Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link") Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV") Signed-off-by: Arnd Bergmann --- drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c index c17db5c144aa..8dbf759eba45 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps( * but the difference is minimal and is in a safe direction, * which all works well around potential ambiguity of DP 1.4a spec. */ - long long fec_link_bw_kbps = link_bw_kbps * 970LL; - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL); + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000, + link_bw_kbps, 32); } #endif diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c index b35327bafbc5..70ac8a95d2db 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_ calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000; // FCLK:UCLK ratio is 1.08 - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100; + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32); calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ? min_dcfclk : min_fclk_required_by_uclk; -- 2.20.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel