Re: [PATCH] drm/amd/display: avoid 64-bit division

2019-07-09 Thread Arnd Bergmann
On Tue, Jul 9, 2019 at 6:40 PM Deucher, Alexander wrote:
>
> I'll just apply Arnd's patch.  If the display team wants to adjust it later
> to clarify the operation, they should go ahead as a follow up patch.

Thanks!

> From: Abramov, Slava
> Sent: Tuesday, July 9, 2019 12:31 PM
> > Thanks for bisecting this issue.
> >
> > I wonder whether you are going to commit your patch, or whether you plan
> > to update it and it is still in your work queue.  One of our 32-bit builds
> > is failing because of this issue, so I would like either to fix it or to
> > wait for your fix if it has a chance of going upstream today.

I was going to update the patch, but had not gotten to that yet.

 Arnd
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amd/display: avoid 64-bit division

2019-07-09 Thread Deucher, Alexander
I'll just apply Arnd's patch.  If the display team wants to adjust it later to 
clarify the operation, they should go ahead as a follow up patch.

Thanks,

Alex

From: Abramov, Slava
Sent: Tuesday, July 9, 2019 12:31 PM
To: Arnd Bergmann; Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; 
Koenig, Christian; Zhou, David(ChunMing); David Airlie; Daniel Vetter
Cc: Liu, Charlene; Park, Chris; Cheng, Tony; Francis, David; 
linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; 
Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, 
Bhawanpreet; Koo, Anthony
Subject: Re: [PATCH] drm/amd/display: avoid 64-bit division


Hi Arnd!


Thanks for bisecting this issue.


I wonder whether you are going to commit your patch, or whether you plan to
update it and it is still in your work queue.  One of our 32-bit builds is
failing because of this issue, so I would like either to fix it or to wait for
your fix if it has a chance of going upstream today.


Sincerely,



Slava Abramov


From: amd-gfx on behalf of Arnd Bergmann
Sent: Monday, July 8, 2019 9:52:08 AM
To: Wentland, Harry; Li, Sun peng (Leo); Deucher, Alexander; Koenig, Christian; 
Zhou, David(ChunMing); David Airlie; Daniel Vetter
Cc: Liu, Charlene; Park, Chris; Arnd Bergmann; Cheng, Tony; Francis, David; 
linux-ker...@vger.kernel.org; amd-...@lists.freedesktop.org; Cornij, Nikola; 
Laktyushkin, Dmytro; dri-devel@lists.freedesktop.org; Lei, Jun; Lakha, 
Bhawanpreet; Koo, Anthony
Subject: [PATCH] drm/amd/display: avoid 64-bit division

On 32-bit architectures, dividing a 64-bit integer in the kernel
leads to a link error:

ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

Change the two recently introduced instances to a multiply+shift
operation that is also much cheaper on 32-bit architectures.
We can do that here, since both of them are really 32-bit numbers
that change a few percent.

Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index c17db5c144aa..8dbf759eba45 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps(
  * but the difference is minimal and is in a safe direction,
  * which all works well around potential ambiguity of DP 1.4a spec.
  */
-   long long fec_link_bw_kbps = link_bw_kbps * 970LL;
-   link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
+   link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
+  link_bw_kbps, 32);
 }
 #endif

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index b35327bafbc5..70ac8a95d2db 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
 calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;

 // FCLK:UCLK ratio is 1.08
-   min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100;
+   min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32);

 calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
 min_dcfclk : min_fclk_required_by_uclk;
--
2.20.0

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: avoid 64-bit division

2019-07-09 Thread Abramov, Slava
Hi Arnd!


Thanks for bisecting this issue.


I wonder whether you are going to commit your patch, or whether you plan to
update it and it is still in your work queue.  One of our 32-bit builds is
failing because of this issue, so I would like either to fix it or to wait for
your fix if it has a chance of going upstream today.


Sincerely,



Slava Abramov




Re: [PATCH] drm/amd/display: avoid 64-bit division

2019-07-08 Thread Abramov, Slava
Acked-by: Slava Abramov 

Tested-by: Slava Abramov 




Re: [PATCH] drm/amd/display: avoid 64-bit division

2019-07-08 Thread Kazlauskas, Nicholas
On 7/8/19 9:52 AM, Arnd Bergmann wrote:
> On 32-bit architectures, dividing a 64-bit integer in the kernel
> leads to a link error:
> 
> ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> 
> Change the two recently introduced instances to a multiply+shift
> operation that is also much cheaper on 32-bit architectures.
> We can do that here, since both of them are really 32-bit numbers
> that change a few percent.
> 
> Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
> Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
> Signed-off-by: Arnd Bergmann 
> ---
>   drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++--
>   drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
>   2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index c17db5c144aa..8dbf759eba45 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps(
>* but the difference is minimal and is in a safe direction,
>* which all works well around potential ambiguity of DP 1.4a spec.
>*/
> - long long fec_link_bw_kbps = link_bw_kbps * 970LL;
> - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
> + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
> +link_bw_kbps, 32);
>   }
>   #endif
>   
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index b35327bafbc5..70ac8a95d2db 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
>   calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;
>   
>   // FCLK:UCLK ratio is 1.08
> - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100;
> + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32);

Even though the mul + shift will be faster here, I would prefer that 
this just be a div_u64 for clarity.

Nicholas Kazlauskas

>   
>   calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
>   min_dcfclk : min_fclk_required_by_uclk;
> 
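
For comparison, the div_u64 spelling suggested in the reply above might look
roughly like the sketch below. This is user-space C for illustration, not the
driver code: div_u64_sketch() is a hypothetical stand-in for the kernel's
div_u64(u64 dividend, u32 divisor) from <linux/math64.h>, and
min_fclk_from_uclk() and check_min_fclk() are made-up names. The constants
1080 and 100 are copied unchanged from the quoted hunk, and the 64-bit return
type is an assumption made to avoid truncating the result.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel helper div_u64(): 64-bit dividend,
 * 32-bit divisor. In the kernel, the real helper avoids the libgcc call
 * that a plain '/' on 64-bit operands would need on 32-bit targets. */
static inline uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Hypothetical helper: same arithmetic as the removed line in the hunk,
 * with the division spelled through the explicit helper. */
static inline uint64_t min_fclk_from_uclk(uint32_t uclk)
{
	return div_u64_sketch((uint64_t)uclk * 1080, 100);
}

/* Small self-check of the arithmetic. */
static inline void check_min_fclk(void)
{
	assert(min_fclk_from_uclk(0) == 0);
	assert(min_fclk_from_uclk(100) == 1080);
}
```

The trade-off discussed in the thread is visible here: this form keeps the
numerator and denominator readable, while the multiply+shift form avoids a
runtime division.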


[PATCH] drm/amd/display: avoid 64-bit division

2019-07-08 Thread Arnd Bergmann
On 32-bit architectures, dividing a 64-bit integer in the kernel
leads to a link error:

ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

Change the two recently introduced instances to a multiply+shift
operation that is also much cheaper on 32-bit architectures.
We can do that here, since both of them are really 32-bit numbers
that change a few percent.

Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
Signed-off-by: Arnd Bergmann 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index c17db5c144aa..8dbf759eba45 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps(
 * but the difference is minimal and is in a safe direction,
 * which all works well around potential ambiguity of DP 1.4a spec.
 */
-   long long fec_link_bw_kbps = link_bw_kbps * 970LL;
-   link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
+   link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
+  link_bw_kbps, 32);
}
 #endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index b35327bafbc5..70ac8a95d2db 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;
 
// FCLK:UCLK ratio is 1.08
-   min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 100;
+   min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 100, uclk_states[i], 32);
 
calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
min_dcfclk : min_fclk_required_by_uclk;
-- 
2.20.0
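
For readers following the arithmetic, the multiply+shift in the first hunk can
be tried outside the kernel. The sketch below is user-space C, not kernel
code: mul_u64_u32_shr_sketch() is a hypothetical stand-in for the kernel's
mul_u64_u32_shr() and assumes the 64-bit product does not overflow, which
holds for the 970/1000 factor because both operands are below 2^32. The names
FEC_FACTOR_Q32, fec_scale_shift(), fec_scale_div() and check_fec_scale() are
illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's mul_u64_u32_shr(); assumes a * mul fits in
 * 64 bits, so it is NOT a general replacement for the kernel helper. */
static inline uint64_t mul_u64_u32_shr_sketch(uint64_t a, uint32_t mul,
					      unsigned int shift)
{
	return (a * mul) >> shift;
}

/* 970/1000 as a 32.32 fixed-point factor, folded at compile time and
 * rounded down: floor(2^32 * 970 / 1000). */
#define FEC_FACTOR_Q32 ((1ULL << 32) * 970ULL / 1000ULL)

/* Multiply+shift form, mirroring the expression added by the patch. */
static inline uint32_t fec_scale_shift(uint32_t link_bw_kbps)
{
	return (uint32_t)mul_u64_u32_shr_sketch(FEC_FACTOR_Q32, link_bw_kbps, 32);
}

/* Reference form with a runtime 64-bit division, as in the removed lines. */
static inline uint32_t fec_scale_div(uint32_t link_bw_kbps)
{
	return (uint32_t)((uint64_t)link_bw_kbps * 970ULL / 1000ULL);
}

/* Self-check: the shifted result never exceeds the divided result and is
 * at most one lower. */
static inline void check_fec_scale(uint32_t x)
{
	uint32_t exact = fec_scale_div(x);
	uint32_t approx = fec_scale_shift(x);

	assert(approx <= exact && exact - approx <= 1);
}
```

Under these assumptions the shifted result equals the divided result or is one
lower, never higher, because the constant factor is rounded down when it is
folded at compile time.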
