[Freedreno] [PATCH v2 6/7] arm64: dts: sdm845: Set 'bi_tcxo' as ref clock of the DSI PHYs

2018-11-26 Thread Matthias Kaehlcke
Add 'bi_tcxo' as ref clock for the DSI PHYs, it was previously
hardcoded in the PLL 'driver' for the 10nm PHY.

Signed-off-by: Matthias Kaehlcke 
---
based on "[v4,1/3] arm64: dts: qcom: sdm845: Add dpu to sdm845 dts file"
  (https://patchwork.kernel.org/patch/10666253/)

Changes in v2:
- patch added to the series
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 5728b4cfae269..cdb5a9bb23e69 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -1372,8 +1372,9 @@
#clock-cells = <1>;
#phy-cells = <0>;
 
-   clocks = < DISP_CC_MDSS_AHB_CLK>;
-   clock-names = "iface";
+   clocks = < DISP_CC_MDSS_AHB_CLK>,
+< RPMH_CXO_CLK>;
+   clock-names = "iface", "ref";
};
 
dsi1: dsi@ae96000 {
@@ -1434,8 +1435,9 @@
#clock-cells = <1>;
#phy-cells = <0>;
 
-   clocks = < DISP_CC_MDSS_AHB_CLK>;
-   clock-names = "iface";
+   clocks = < DISP_CC_MDSS_AHB_CLK>,
+< RPMH_CXO_CLK>;
+   clock-names = "iface", "ref";
};
};
 
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 1/7] dt-bindings: msm/dsi: Add ref clock for PHYs

2018-11-26 Thread Matthias Kaehlcke
Allow the PHY drivers to get the ref clock from the DT.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- add the ref clock for all PHYs, not only the 10nm one
- updated commit message
---
 Documentation/devicetree/bindings/display/msm/dsi.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/msm/dsi.txt 
b/Documentation/devicetree/bindings/display/msm/dsi.txt
index dfc743219bd88..b0485559a719c 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi.txt
+++ b/Documentation/devicetree/bindings/display/msm/dsi.txt
@@ -106,6 +106,7 @@ Required properties:
 - clocks: Phandles to device clocks. See [1] for details on clock bindings.
 - clock-names: the following clocks are required:
   * "iface"
+  * "ref"
   For 28nm HPM/LP, 28nm 8960 PHYs:
 - vddio-supply: phandle to vdd-io regulator device node
   For 20nm PHY:
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 7/7] drm/msm/dsi: 10nm PHY: Get ref clock from the DT

2018-11-26 Thread Matthias Kaehlcke
Get the ref clock of the PHY from the device tree instead of
hardcoding its name and rate.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- remove anonymous array in clk_init_data assignment
- log error code if devm_clk_get() fails
- don't log devm_clk_get() failures for -EPROBE_DEFER
- updated commit message
---
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c 
b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
index 4c03f0b7343ed..32e31574e1432 100644
--- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
+++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
@@ -91,6 +91,7 @@ struct dsi_pll_10nm {
void __iomem *phy_cmn_mmio;
void __iomem *mmio;
 
+   struct clk *vco_ref_clk;
u64 vco_ref_clk_rate;
u64 vco_current_rate;
 
@@ -629,8 +630,9 @@ static int pll_10nm_register(struct dsi_pll_10nm *pll_10nm)
 {
char clk_name[32], parent[32], vco_name[32];
char parent2[32], parent3[32], parent4[32];
+   const char *ref_clk_name = __clk_get_name(pll_10nm->vco_ref_clk);
struct clk_init_data vco_init = {
-   .parent_names = (const char *[]){ "xo" },
+   .parent_names = _clk_name,
.num_parents = 1,
.name = vco_name,
.flags = CLK_IGNORE_UNUSED,
@@ -786,6 +788,15 @@ struct msm_dsi_pll *msm_dsi_pll_10nm_init(struct 
platform_device *pdev, int id)
pll_10nm->id = id;
pll_10nm_list[id] = pll_10nm;
 
+   pll_10nm->vco_ref_clk = devm_clk_get(>dev, "ref");
+   if (IS_ERR(pll_10nm->vco_ref_clk)) {
+   ret = PTR_ERR(pll_10nm->vco_ref_clk);
+   if (ret != EPROBE_DEFER)
+   dev_err(>dev, "couldn't get 'ref' clock: %d\n",
+   ret);
+   return ERR_PTR(ret);
+   }
+
pll_10nm->phy_cmn_mmio = msm_ioremap(pdev, "dsi_phy", "DSI_PHY");
if (IS_ERR_OR_NULL(pll_10nm->phy_cmn_mmio)) {
dev_err(>dev, "failed to map CMN PHY base\n");
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 4/7] arm64: dts: qcom: msm8916: Set 'xo_board' as ref clock of the DSI PHY

2018-11-26 Thread Matthias Kaehlcke
Add 'xo_board' as ref clock for the DSI PHYs, it was previously
hardcoded in the PLL 'driver' for the 28nm PHY.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- patch added to the series
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi 
b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index d302d8d639a12..89f30f34ff896 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -959,8 +959,9 @@
#clock-cells = <1>;
#phy-cells = <0>;
 
-   clocks = < GCC_MDSS_AHB_CLK>;
-   clock-names = "iface";
+   clocks = < GCC_MDSS_AHB_CLK>,
+<_board>;
+   clock-names = "iface", "ref";
};
};
 
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 5/7] drm/msm/dsi: 28nm PHY: Get ref clock from the DT

2018-11-26 Thread Matthias Kaehlcke
Get the ref clock of the PHY from the device tree instead of
hardcoding its name and rate.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- patch added to the series
---
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c | 29 +++---
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c 
b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c
index 26e3a01a99c2b..a1ab5ecbf7c7d 100644
--- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c
+++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c
@@ -40,7 +40,6 @@
 
 #define NUM_PROVIDED_CLKS  2
 
-#define VCO_REF_CLK_RATE   1920
 #define VCO_MIN_RATE   35000
 #define VCO_MAX_RATE   75000
 
@@ -81,6 +80,7 @@ struct dsi_pll_28nm {
struct platform_device *pdev;
void __iomem *mmio;
 
+   struct clk *vco_ref_clk;
int vco_delay;
 
/* private clocks: */
@@ -139,6 +139,7 @@ static int dsi_pll_28nm_clk_set_rate(struct clk_hw *hw, 
unsigned long rate,
struct dsi_pll_28nm *pll_28nm = to_pll_28nm(pll);
struct device *dev = _28nm->pdev->dev;
void __iomem *base = pll_28nm->mmio;
+   u64 ref_clk_rate = parent_rate;
unsigned long div_fbx1000, gen_vco_clk;
u32 refclk_cfg, frac_n_mode, frac_n_value;
u32 sdm_cfg0, sdm_cfg1, sdm_cfg2, sdm_cfg3;
@@ -166,17 +167,17 @@ static int dsi_pll_28nm_clk_set_rate(struct clk_hw *hw, 
unsigned long rate,
pll_write(base + REG_DSI_28nm_PHY_PLL_LPFC1_CFG, 0x70);
pll_write(base + REG_DSI_28nm_PHY_PLL_LPFC2_CFG, 0x15);
 
-   rem = rate % VCO_REF_CLK_RATE;
+   rem = rate % ref_clk_rate;
if (rem) {
refclk_cfg = DSI_28nm_PHY_PLL_REFCLK_CFG_DBLR;
frac_n_mode = 1;
-   div_fbx1000 = rate / (VCO_REF_CLK_RATE / 500);
-   gen_vco_clk = div_fbx1000 * (VCO_REF_CLK_RATE / 500);
+   div_fbx1000 = rate / (ref_clk_rate / 500);
+   gen_vco_clk = div_fbx1000 * (ref_clk_rate / 500);
} else {
refclk_cfg = 0x0;
frac_n_mode = 0;
-   div_fbx1000 = rate / (VCO_REF_CLK_RATE / 1000);
-   gen_vco_clk = div_fbx1000 * (VCO_REF_CLK_RATE / 1000);
+   div_fbx1000 = rate / (ref_clk_rate / 1000);
+   gen_vco_clk = div_fbx1000 * (ref_clk_rate / 1000);
}
 
DBG("refclk_cfg = %d", refclk_cfg);
@@ -265,7 +266,7 @@ static unsigned long dsi_pll_28nm_clk_recalc_rate(struct 
clk_hw *hw,
void __iomem *base = pll_28nm->mmio;
u32 sdm0, doubler, sdm_byp_div;
u32 sdm_dc_off, sdm_freq_seed, sdm2, sdm3;
-   u32 ref_clk = VCO_REF_CLK_RATE;
+   u32 ref_clk = parent_rate;
unsigned long vco_rate;
 
VERB("parent_rate=%lu", parent_rate);
@@ -273,7 +274,7 @@ static unsigned long dsi_pll_28nm_clk_recalc_rate(struct 
clk_hw *hw,
/* Check to see if the ref clk doubler is enabled */
doubler = pll_read(base + REG_DSI_28nm_PHY_PLL_REFCLK_CFG) &
DSI_28nm_PHY_PLL_REFCLK_CFG_DBLR;
-   ref_clk += (doubler * VCO_REF_CLK_RATE);
+   ref_clk += (doubler * ref_clk);
 
/* see if it is integer mode or sdm mode */
sdm0 = pll_read(base + REG_DSI_28nm_PHY_PLL_SDM_CFG0);
@@ -517,8 +518,9 @@ static void dsi_pll_28nm_destroy(struct msm_dsi_pll *pll)
 static int pll_28nm_register(struct dsi_pll_28nm *pll_28nm)
 {
char clk_name[32], parent1[32], parent2[32], vco_name[32];
+   const char *ref_clk_name = __clk_get_name(pll_28nm->vco_ref_clk);
struct clk_init_data vco_init = {
-   .parent_names = (const char *[]){ "xo" },
+   .parent_names = _clk_name,
.num_parents = 1,
.name = vco_name,
.flags = CLK_IGNORE_UNUSED,
@@ -605,6 +607,15 @@ struct msm_dsi_pll *msm_dsi_pll_28nm_init(struct 
platform_device *pdev,
pll_28nm->pdev = pdev;
pll_28nm->id = id;
 
+   pll_28nm->vco_ref_clk = devm_clk_get(>dev, "ref");
+   if (IS_ERR(pll_28nm->vco_ref_clk)) {
+   ret = PTR_ERR(pll_28nm->vco_ref_clk);
+   if (ret != EPROBE_DEFER)
+   dev_err(>dev, "couldn't get 'ref' clock: %d\n",
+   ret);
+   return ERR_PTR(ret);
+   }
+
pll_28nm->mmio = msm_ioremap(pdev, "dsi_pll", "DSI_PLL");
if (IS_ERR_OR_NULL(pll_28nm->mmio)) {
dev_err(>dev, "%s: failed to map pll base\n", __func__);
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 3/7] drm/msm/dsi: 28nm 8960 PHY: Get ref clock from the DT

2018-11-26 Thread Matthias Kaehlcke
Get the ref clock of the PHY from the device tree instead of
hardcoding its name and rate.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- patch added to the series
---
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c 
b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c
index 49008451085b8..461975c410fc4 100644
--- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c
+++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c
@@ -47,7 +47,6 @@
 
 #define NUM_PROVIDED_CLKS  2
 
-#define VCO_REF_CLK_RATE   2700
 #define VCO_MIN_RATE   6
 #define VCO_MAX_RATE   12
 
@@ -75,6 +74,8 @@ struct dsi_pll_28nm {
struct platform_device *pdev;
void __iomem *mmio;
 
+   struct clk *vco_ref_clk;
+
/* custom byte clock divider */
struct clk_bytediv *bytediv;
 
@@ -125,7 +126,7 @@ static int dsi_pll_28nm_clk_set_rate(struct clk_hw *hw, 
unsigned long rate,
DBG("rate=%lu, parent's=%lu", rate, parent_rate);
 
temp = rate / 10;
-   val = VCO_REF_CLK_RATE / 10;
+   val = parent_rate / 10;
fb_divider = (temp * VCO_PREF_DIV_RATIO) / val;
fb_divider = fb_divider / 2 - 1;
pll_write(base + REG_DSI_28nm_8960_PHY_PLL_CTRL_1,
@@ -409,8 +410,9 @@ static void dsi_pll_28nm_destroy(struct msm_dsi_pll *pll)
 static int pll_28nm_register(struct dsi_pll_28nm *pll_28nm)
 {
char *clk_name, *parent_name, *vco_name;
+   const char *ref_clk_name = __clk_get_name(pll_28nm->vco_ref_clk);
struct clk_init_data vco_init = {
-   .parent_names = (const char *[]){ "pxo" },
+   .parent_names = _clk_name,
.num_parents = 1,
.flags = CLK_IGNORE_UNUSED,
.ops = _ops_dsi_pll_28nm_vco,
@@ -506,6 +508,15 @@ struct msm_dsi_pll *msm_dsi_pll_28nm_8960_init(struct 
platform_device *pdev,
pll_28nm->pdev = pdev;
pll_28nm->id = id + 1;
 
+   pll_28nm->vco_ref_clk = devm_clk_get(>dev, "ref");
+   if (IS_ERR(pll_28nm->vco_ref_clk)) {
+   ret = PTR_ERR(pll_28nm->vco_ref_clk);
+   if (ret != EPROBE_DEFER)
+   dev_err(>dev, "couldn't get 'ref' clock: %d\n",
+   ret);
+   return ERR_PTR(ret);
+   }
+
pll_28nm->mmio = msm_ioremap(pdev, "dsi_pll", "DSI_PLL");
if (IS_ERR_OR_NULL(pll_28nm->mmio)) {
dev_err(>dev, "%s: failed to map pll base\n", __func__);
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 0/7] drm/msm/dsi: Get PHY ref clocks from the DT

2018-11-26 Thread Matthias Kaehlcke
The MSM DSI PHY drivers currently hardcode the name and the rate of
the PHY ref clock. Get the ref clock from the device tree instead.

Note: testing of this series was limited to SDM845 and the 10nm PHY

Major changes in v2:
- apply to all MSM DSI PHY drivers, not only 10nm

Matthias Kaehlcke (7):
  dt-bindings: msm/dsi: Add ref clock for PHYs
  drm/msm/dsi: 14nm PHY: Get ref clock from the DT
  drm/msm/dsi: 28nm 8960 PHY: Get ref clock from the DT
  arm64: dts: qcom: msm8916: Set 'xo_board' as ref clock of the DSI PHY
  drm/msm/dsi: 28nm PHY: Get ref clock from the DT
  arm64: dts: sdm845: Set 'bi_tcxo' as ref clock of the DSI PHYs
  drm/msm/dsi: 10nm PHY: Get ref clock from the DT

 .../devicetree/bindings/display/msm/dsi.txt   |  1 +
 arch/arm64/boot/dts/qcom/msm8916.dtsi |  5 ++--
 arch/arm64/boot/dts/qcom/sdm845.dtsi  | 10 ---
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c| 13 -
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c| 16 --
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_28nm.c| 29 +--
 .../gpu/drm/msm/dsi/pll/dsi_pll_28nm_8960.c   | 17 +--
 7 files changed, 69 insertions(+), 22 deletions(-)

-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


[Freedreno] [PATCH v2 2/7] drm/msm/dsi: 14nm PHY: Get ref clock from the DT

2018-11-26 Thread Matthias Kaehlcke
Get the ref clock of the PHY from the device tree instead of
hardcoding its name and rate.

Signed-off-by: Matthias Kaehlcke 
---
Changes in v2:
- patch added to the series
---
 drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c 
b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c
index 71fe60e5f01f1..f58298bd6c423 100644
--- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c
+++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_14nm.c
@@ -40,7 +40,6 @@
 
 #define NUM_PROVIDED_CLKS  2
 
-#define VCO_REF_CLK_RATE   1920
 #define VCO_MIN_RATE   13UL
 #define VCO_MAX_RATE   26UL
 
@@ -139,6 +138,7 @@ struct dsi_pll_14nm {
/* protects REG_DSI_14nm_PHY_CMN_CLK_CFG0 register */
spinlock_t postdiv_lock;
 
+   struct clk *vco_ref_clk;
u64 vco_current_rate;
u64 vco_ref_clk_rate;
 
@@ -591,7 +591,7 @@ static int dsi_pll_14nm_vco_set_rate(struct clk_hw *hw, 
unsigned long rate,
parent_rate);
 
pll_14nm->vco_current_rate = rate;
-   pll_14nm->vco_ref_clk_rate = VCO_REF_CLK_RATE;
+   pll_14nm->vco_ref_clk_rate = parent_rate;
 
dsi_pll_14nm_input_init(pll_14nm);
 
@@ -950,8 +950,9 @@ static struct clk_hw *pll_14nm_postdiv_register(struct 
dsi_pll_14nm *pll_14nm,
 static int pll_14nm_register(struct dsi_pll_14nm *pll_14nm)
 {
char clk_name[32], parent[32], vco_name[32];
+   const char *ref_clk_name = __clk_get_name(pll_14nm->vco_ref_clk);
struct clk_init_data vco_init = {
-   .parent_names = (const char *[]){ "xo" },
+   .parent_names = _clk_name,
.num_parents = 1,
.name = vco_name,
.flags = CLK_IGNORE_UNUSED,
@@ -1065,6 +1066,15 @@ struct msm_dsi_pll *msm_dsi_pll_14nm_init(struct 
platform_device *pdev, int id)
pll_14nm->id = id;
pll_14nm_list[id] = pll_14nm;
 
+   pll_14nm->vco_ref_clk = devm_clk_get(>dev, "ref");
+   if (IS_ERR(pll_14nm->vco_ref_clk)) {
+   ret = PTR_ERR(pll_14nm->vco_ref_clk);
+   if (ret != EPROBE_DEFER)
+   dev_err(>dev, "couldn't get 'ref' clock: %d\n",
+   ret);
+   return ERR_PTR(ret);
+   }
+
pll_14nm->phy_cmn_mmio = msm_ioremap(pdev, "dsi_phy", "DSI_PHY");
if (IS_ERR_OR_NULL(pll_14nm->phy_cmn_mmio)) {
dev_err(>dev, "failed to map CMN PHY base\n");
-- 
2.20.0.rc0.387.gc7a69e6b6c-goog

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v2 15/24] drm/msm: dpu: Stop using encoder->crtc pointer

2018-11-26 Thread Sean Paul
On Mon, Nov 19, 2018 at 12:03:53PM -0800, Jeykumar Sankaran wrote:
> On 2018-11-16 13:14, Sean Paul wrote:
> > On Fri, Nov 16, 2018 at 12:05:09PM -0800, Jeykumar Sankaran wrote:
> > > On 2018-11-16 10:42, Sean Paul wrote:
> > > > From: Sean Paul 
> > > >
> > > > It's for legacy drivers, for atomic drivers crtc->state->encoder_mask
> > > > should be used to map encoder to crtc.
> > > >
> > > > Changes in v2:
> > > > - None
> > > >
> > > > Signed-off-by: Sean Paul 
> > > > ---
> > > >  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 46
> > 
> > > >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c  | 19 +++---
> > > >  2 files changed, 29 insertions(+), 36 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> > > > b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> > > > index 156f4c77ca44..a008a87a8113 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> > > > @@ -284,9 +284,9 @@ enum dpu_intf_mode dpu_crtc_get_intf_mode(struct
> > > > drm_crtc *crtc)
> > > > return INTF_MODE_NONE;
> > > > }
> > > >
> > > > -   drm_for_each_encoder(encoder, crtc->dev)
> > > > -   if (encoder->crtc == crtc)
> > > > -   return dpu_encoder_get_intf_mode(encoder);
> > > > +   /* TODO: Returns the first INTF_MODE, could there be multiple
> > > > values? */
> > > > +   drm_for_each_encoder_mask(encoder, crtc->dev,
> > > > crtc->state->encoder_mask)
> > > > +   return dpu_encoder_get_intf_mode(encoder);
> > > >
> > > > return INTF_MODE_NONE;
> > > >  }
> > > > @@ -551,13 +551,9 @@ static void dpu_crtc_atomic_begin(struct drm_crtc
> > > > *crtc,
> > > > spin_unlock_irqrestore(>event_lock, flags);
> > > > }
> > > >
> > > > -   list_for_each_entry(encoder, >mode_config.encoder_list, 
> > > > head)
> > > > {
> > > > -   if (encoder->crtc != crtc)
> > > > -   continue;
> > > > -
> > > > -   /* encoder will trigger pending mask now */
> > > > +   /* encoder will trigger pending mask now */
> > > > +   drm_for_each_encoder_mask(encoder, crtc->dev,
> > > > crtc->state->encoder_mask)
> > > > dpu_encoder_trigger_kickoff_pending(encoder);
> > > > -   }
> > > >
> > > > /*
> > > >  * If no mixers have been allocated in dpu_crtc_atomic_check(),
> > > > @@ -704,7 +700,6 @@ static int _dpu_crtc_wait_for_frame_done(struct
> > > > drm_crtc *crtc)
> > > >  void dpu_crtc_commit_kickoff(struct drm_crtc *crtc)
> > > >  {
> > > > struct drm_encoder *encoder;
> > > > -   struct drm_device *dev = crtc->dev;
> > > > struct dpu_crtc *dpu_crtc = to_dpu_crtc(crtc);
> > > > struct dpu_kms *dpu_kms = _dpu_crtc_get_kms(crtc);
> > > > struct dpu_crtc_state *cstate = to_dpu_crtc_state(crtc->state);
> > > > @@ -720,16 +715,13 @@ void dpu_crtc_commit_kickoff(struct drm_crtc
> > > > *crtc)
> > > >
> > > > DPU_ATRACE_BEGIN("crtc_commit");
> > > >
> > > > -   list_for_each_entry(encoder, >mode_config.encoder_list, 
> > > > head)
> > > > {
> > > > +   /*
> > > > +* Encoder will flush/start now, unless it has a tx pending. If
> > > > so, it
> > > > +* may delay and flush at an irq event (e.g. ppdone)
> > > > +*/
> > > > +   drm_for_each_encoder_mask(encoder, crtc->dev,
> > > > + crtc->state->encoder_mask) {
> > > > struct dpu_encoder_kickoff_params params = { 0 };
> > > > -
> > > > -   if (encoder->crtc != crtc)
> > > > -   continue;
> > > > -
> > > > -   /*
> > > > -* Encoder will flush/start now, unless it has a tx
> > > > pending.
> > > > -* If so, it may delay and flush at an irq event (e.g.
> > > > ppdone)
> > > > -*/
> > > > dpu_encoder_prepare_for_kickoff(encoder, );
> > > > }
> > > >
> > > > @@ -754,12 +746,8 @@ void dpu_crtc_commit_kickoff(struct drm_crtc
> > *crtc)
> > > >
> > > > dpu_vbif_clear_errors(dpu_kms);
> > > >
> > > > -   list_for_each_entry(encoder, >mode_config.encoder_list, 
> > > > head)
> > > > {
> > > > -   if (encoder->crtc != crtc)
> > > > -   continue;
> > > > -
> > > > +   drm_for_each_encoder_mask(encoder, crtc->dev,
> > > > crtc->state->encoder_mask)
> > > > dpu_encoder_kickoff(encoder);
> > > > -   }
> > > We wont be holding the modeset locks here (and in crtc_atomic_begin)
> > > in
> > the
> > > display thread. Is
> > > it safe to iterate over encoder_mask?
> > 
> > Hmm, I'm not sure I follow. AFAICT, there are 2 callsites for
> > dpu_crtc_commit_kickoff():
> > 
> > 1- dpu_kms_encoder_enable() which is called via the
> > encoder->funcs->enable
> > hook
> > 2- dpu_kms_commit() which is called in the
> > 

Re: [Freedreno] [DPU PATCH 2/3] drm/msm/dp: add displayPort driver support

2018-11-26 Thread Sean Paul
On Mon, Nov 19, 2018 at 02:34:53PM -0800, chand...@codeaurora.org wrote:
> On 2018-10-23 09:28, Sean Paul wrote:
> > On Wed, Oct 10, 2018 at 10:15:58AM -0700, Chandan Uddaraju wrote:
> > > Add the needed displayPort files to enable DP driver
> > > on msm target.
> > > 
> > > "dp_display" module is the main module that calls into
> > > other sub-modules. "dp_drm" file represents the interface
> > > between DRM framework and DP driver.
> > > 
> > 
> 
> Hello Sean,
> I had few queries regarding your comments. Shared my queries below. Please
> share your feedback.

/snip

> > > diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
> > > b/drivers/gpu/drm/msm/dp/dp_aux.c
> > > new file mode 100644
> > > index 000..00b82c2
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> > > @@ -0,0 +1,570 @@
> > 
> > /snip
> > 
> > > diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h
> > > b/drivers/gpu/drm/msm/dp/dp_aux.h
> > > new file mode 100644
> > > index 000..0c300ed
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
> > 
> > /snip
> > 
> > I didn't really review the dp_aux.* files as-is. Please convert to using
> > drm_dp_aux and I'll take another look.
> > 
> 
> Please correct me if I am wrong. Looks like you are suggesting to use most
> of the drm_dp_aux struct variable
> instead of what we have defined locally.
> 

Correct


/snip


> > > + struct dp_catalog_ctrl ctrl = {
> > > + .state_ctrl = dp_catalog_ctrl_state_ctrl,
> > > + .config_ctrl= dp_catalog_ctrl_config_ctrl,
> > > + .lane_mapping   = dp_catalog_ctrl_lane_mapping,
> > > + .mainlink_ctrl  = dp_catalog_ctrl_mainlink_ctrl,
> > > + .config_misc= dp_catalog_ctrl_config_misc,
> > > + .config_msa = dp_catalog_ctrl_config_msa,
> > > + .set_pattern= dp_catalog_ctrl_set_pattern,
> > > + .reset  = dp_catalog_ctrl_reset,
> > > + .usb_reset  = dp_catalog_ctrl_usb_reset,
> > > + .mainlink_ready = dp_catalog_ctrl_mainlink_ready,
> > > + .enable_irq = dp_catalog_ctrl_enable_irq,
> > > + .hpd_config = dp_catalog_ctrl_hpd_config,
> > > + .phy_reset  = dp_catalog_ctrl_phy_reset,
> > > + .phy_lane_cfg   = dp_catalog_ctrl_phy_lane_cfg,
> > > + .update_vx_px   = dp_catalog_ctrl_update_vx_px,
> > > + .get_interrupt  = dp_catalog_ctrl_get_interrupt,
> > > + .update_transfer_unit = dp_catalog_ctrl_update_transfer_unit,
> > > + .send_phy_pattern= dp_catalog_ctrl_send_phy_pattern,
> > > + .read_phy_pattern = dp_catalog_ctrl_read_phy_pattern,
> > 
> > So, what's the benefit of adding the catalog abstractions? Do you ever
> > have
> > multiple/alternate implementations of a catalog? It doesn't seem like
> > it. You
> > should just remove all of the catalog overhead and call the functions
> > directly.
> 
> We use catalog abstractions for two different reasons:
> 1. To support multiple hardware since these implementations are hardware
> specific.
> 2. To virtualize DP hardware.
> 
> Our internal hardware already supports multiple chips. We want to keep these
> abstractions to
> make the code consistent between upstream and internal implementation.
> 

I'm not sure how this differs from having a common header/function declarations
and multiple definitions in a .c gated by CONFIG_*? Ideally even this wouldn't 
be
necessary since hw changes are either incremental and can be done more precisely
than wholesale changing the implementation.

If you want to virtualize, you can do the CONFIG_* thing. However since that's
not going upstream, you can carry the Kconfig/Makefile change in your
downstream.

All of these abstractions add thousands of lines of code, extra files, and 
multiple
levels of indirection. IMO, I don't think the benefits outweigh the costs.

/snip

> > > +static void dp_ctrl_calc_tu_parameters(struct dp_ctrl_private *ctrl,
> > > +struct dp_vc_tu_mapping_table *tu_table)
> > > +{
> > > + u32 multiplier = 100;
> > > + u64 pclk, lclk;
> > > + u8 bpp, ln_cnt;
> > > + int run_idx = 0;
> > > + u32 lwidth, h_blank;
> > > + u32 fifo_empty = 0;
> > > + u32 ratio_scale = 1001;
> > > + u64 temp, ratio, original_ratio;
> > > + u64 temp2, reminder;
> > > + u64 temp3, temp4, result = 0;
> > > +
> > > + u64 err = multiplier;
> > > + u64 n_err = 0, n_n_err = 0;
> > > + bool n_err_neg, nn_err_neg;
> > > + u8 hblank_margin = 16;
> > > +
> > > + u8 tu_size, tu_size_desired = 0, tu_size_minus1;
> > > + int valid_boundary_link;
> > > + u64 resulting_valid;
> > > + u64 total_valid;
> > > + u64 effective_valid;
> > > + u64 effective_valid_recorded;
> > > + int n_tus;
> > > + int n_tus_per_lane;
> > > + int paired_tus;
> > > + int remainder_tus;
> > > + int remainder_tus_upper, remainder_tus_lower;
> > > + int extra_bytes;
> > > + int filler_size;
> > > + int delay_start_link;
> > > + int boundary_moderation_en = 0;
> > > + int upper_bdry_cnt 

[Freedreno] [PATCH v2 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-26 Thread Vivek Gautam
dma_map_sg() expects a DMA domain. However, the drm devices
have been traditionally using unmanaged iommu domain which
is non-dma type. Using dma mapping APIs with that domain is bad.

Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
to do the cache maintenance.

Signed-off-by: Vivek Gautam 
Suggested-by: Tomasz Figa 
Cc: Rob Clark 
Cc: Jordan Crouse  
Cc: Sean Paul 
---

Changes since v1:
 - Addressed Jordan's and Tomasz's comments for
   - moving sg dma addresses preparation out of the coditional
 check to the main path so we do it irrespective of whether
 the buffer is cached or uncached.
   - Enhance the comment to explain this dma addresses prepartion.

 drivers/gpu/drm/msm/msm_gem.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 00c795ced02c..1811ac23a31c 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
struct drm_device *dev = obj->dev;
struct page **p;
int npages = obj->size >> PAGE_SHIFT;
+   struct scatterlist *s;
+   int i;
 
if (use_pages(obj))
p = drm_gem_get_pages(obj);
@@ -104,12 +106,21 @@ static struct page **get_pages(struct drm_gem_object *obj)
return ptr;
}
 
-   /* For non-cached buffers, ensure the new pages are clean
+   /*
+* dma_sync_sg_*() flush the physical pages, so point
+* sg->dma_address to the physical ones for the right behavior.
+*/
+   for_each_sg(msm_obj->sgt->sgl, s, msm_obj->sgt->nents, i)
+   sg_dma_address(s) = sg_phys(s);
+
+   /*
+* For non-cached buffers, ensure the new pages are clean
 * because display controller, GPU, etc. are not coherent:
 */
-   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
-   dma_map_sg(dev->dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
+   dma_sync_sg_for_device(dev->dev, msm_obj->sgt->sgl,
+  msm_obj->sgt->nents,
+  DMA_TO_DEVICE);
}
 
return msm_obj->pages;
@@ -133,14 +144,16 @@ static void put_pages(struct drm_gem_object *obj)
 
if (msm_obj->pages) {
if (msm_obj->sgt) {
-   /* For non-cached buffers, ensure the new
+   /*
+* For non-cached buffers, ensure the new
 * pages are clean because display controller,
 * GPU, etc. are not coherent:
 */
-   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
-   dma_unmap_sg(obj->dev->dev, msm_obj->sgt->sgl,
-msm_obj->sgt->nents,
-DMA_BIDIRECTIONAL);
+   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
+   dma_sync_sg_for_cpu(obj->dev->dev,
+   msm_obj->sgt->sgl,
+   msm_obj->sgt->nents,
+   DMA_FROM_DEVICE);
 
sg_free_table(msm_obj->sgt);
kfree(msm_obj->sgt);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH] iommu: arm-smmu: Set SCTLR.HUPCF bit

2018-11-26 Thread Rob Clark
On Mon, Nov 26, 2018 at 2:31 PM Will Deacon  wrote:
>
> Hi Rob,
>
> On Tue, Nov 13, 2018 at 08:12:35AM -0500, Rob Clark wrote:
> > On Tue, Nov 13, 2018 at 1:32 AM Will Deacon  wrote:
> > > On Fri, Nov 09, 2018 at 01:01:55PM -0500, Rob Clark wrote:
> > > > On Mon, Oct 29, 2018 at 3:09 PM Will Deacon  wrote:
> > > > > On Thu, Sep 27, 2018 at 06:46:07PM -0400, Rob Clark wrote:
> > > > > > We seem to need to set either this or CFCFG (stall), otherwise gpu
> > > > > > faults trigger problems with other in-flight transactions from the
> > > > > > GPU causing CP errors, etc.
> > > > > >
> > > > > > In the ARM SMMU spec, the 'Hit under previous context fault' bit is
> > > > > > described as:
> > > > > >
> > > > > >  '0' - Stall or terminate subsequent transactions in the presence
> > > > > >of an outstanding context fault
> > > > > >  '1' - Process all subsequent transactions independently of any
> > > > > >outstanding context fault.
> > > > > >
> > > > > > Since we don't enable CFCFG (stall) the behavior of terminating
> > > > > > other transactions makes sense.  And is probably not what we want
> > > > > > (and definately not what we want for GPU).
> > > > > >
> > > > > > Signed-off-by: Rob Clark 
> > > > > > ---
> > > > > > So I hit this issue a long time back on 820 (msm8996) and at the
> > > > > > time I solved it with a patch that enabled CFCFG.  And it resurfaced
> > > > > > more recently on sdm845.  But at the time CFCFG was rejected, iirc
> > > > > > because of concern that it would cause problems on other non-qcom
> > > > > > arm smmu implementations.  And I think I forgot to send this version
> > > > > > of the solution.
> > > > > >
> > > > > > If enabling HUPCF is anticipated to cause problems on other ARM
> > > > > > SMMU implementations, I think I can come up with a variant of this
> > > > > > patch which conditionally enables it for snapdragon.
> > > > > >
> > > > > > Either way, I'd really like to get some variant of this fix merged
> > > > > > (and probably it would be a good idea for stable kernel branches
> > > > > > too), since current behaviour with the GPU means faults turn into
> > > > > > a fantastic cascade of fail.
> > > > >
> > > > > Can you describe how this fantastic cascade of fail improves with this
> > > > > patch, please? If you're getting context faults then something has 
> > > > > already
> > > > > gone horribly wrong, so I'm trying to work out how this improves 
> > > > > things.
> > > > >
> > > >
> > > > There are plenty of cases where getting iommu faults with a GPU is
> > > > "normal", or at least not something the kernel or even GL driver can
> > > > control.
> > >
> > > Such as? All the mainline driver does is print a diagnostic and clear the
> > > fault, which doesn't seem generally useful.
> >
> > it is useful to debug the fault ;-)
> >
> > Although eventually we'll want to be able to do more than that, like
> > have the fault trigger bringing in pages of a mmap'd file and that
> > sort of thing.
>
> Right, and feels very strange to me if we have this bit set because it
> means that your fault is no longer synchronous and therefore diverges
> from the fault model that you get from the CPU, where you certainly
> wouldn't expect stores appearing in the program after a faulting load to
> be visible in memory. However, thinking harder about it, I suppose we're
> already in a situation where the translations are handled out of order
> in the absence of barriers, so maybe it's not the end of the world.

I guess I wouldn't have expected synchronous without CFCFG=1

> Could you dump the FSR value that you see in the fault handler, please?
> From my reading of the architecture spec, as long as clear all of the
> fault bits in the irq handler, then your machine shouldn't die like it
> does with HUPCFG=CFCFG=0..

I expect it dies before the irq handler returns..

possibly the behavior of terminated translations returning zero's
might be some detail of qcom's implementation (or how the gpu reacts
to terminated memory transactions, etc), rather than something the
spec expects/specifies.

I'll try and get you a dump of FSR in next couple days.. (need to
switch kernels and write up some test code to trigger faults)

BR,
-R

>
> > > > With this patch, you still get the iommu fault, but it doesn't cause
> > > > the gpu to crash.  But without it, other memory accesses in flight
> > > > while the fault occurs, like the GPU command-processor reading further
> > > > ahead in the cmdstream to setup next draw, would return zero's,
> > > > causing the GPU to crash or get into a bad state.
> > >
> > > I get that part, but I don't understand why we're seeing faults in the 
> > > first
> > > place and I worry that this patch is just the tip of the iceberg. It's 
> > > also
> > > not clear that processing subsequent transactions is always the right 
> > > thing
> > > to do in a world where we actually want to report (and handle) synchronous
> > > faults from devices.
> >
> > Sure, 

Re: [Freedreno] [PATCH] iommu: arm-smmu: Set SCTLR.HUPCF bit

2018-11-26 Thread Jordan Crouse
On Mon, Nov 26, 2018 at 07:31:48PM +, Will Deacon wrote:
> Hi Rob,
> 
> On Tue, Nov 13, 2018 at 08:12:35AM -0500, Rob Clark wrote:
> > On Tue, Nov 13, 2018 at 1:32 AM Will Deacon  wrote:
> > > On Fri, Nov 09, 2018 at 01:01:55PM -0500, Rob Clark wrote:
> > > > On Mon, Oct 29, 2018 at 3:09 PM Will Deacon  wrote:
> > > > > On Thu, Sep 27, 2018 at 06:46:07PM -0400, Rob Clark wrote:
> > > > > > We seem to need to set either this or CFCFG (stall), otherwise gpu
> > > > > > faults trigger problems with other in-flight transactions from the
> > > > > > GPU causing CP errors, etc.
> > > > > >
> > > > > > In the ARM SMMU spec, the 'Hit under previous context fault' bit is
> > > > > > described as:
> > > > > >
> > > > > >  '0' - Stall or terminate subsequent transactions in the presence
> > > > > >of an outstanding context fault
> > > > > >  '1' - Process all subsequent transactions independently of any
> > > > > >outstanding context fault.
> > > > > >
> > > > > > Since we don't enable CFCFG (stall) the behavior of terminating
> > > > > > other transactions makes sense.  And is probably not what we want
> > > > > > (and definately not what we want for GPU).
> > > > > >
> > > > > > Signed-off-by: Rob Clark 
> > > > > > ---
> > > > > > So I hit this issue a long time back on 820 (msm8996) and at the
> > > > > > time I solved it with a patch that enabled CFCFG.  And it resurfaced
> > > > > > more recently on sdm845.  But at the time CFCFG was rejected, iirc
> > > > > > because of concern that it would cause problems on other non-qcom
> > > > > > arm smmu implementations.  And I think I forgot to send this version
> > > > > > of the solution.
> > > > > >
> > > > > > If enabling HUPCF is anticipated to cause problems on other ARM
> > > > > > SMMU implementations, I think I can come up with a variant of this
> > > > > > patch which conditionally enables it for snapdragon.
> > > > > >
> > > > > > Either way, I'd really like to get some variant of this fix merged
> > > > > > (and probably it would be a good idea for stable kernel branches
> > > > > > too), since current behaviour with the GPU means faults turn into
> > > > > > a fantastic cascade of fail.
> > > > >
> > > > > Can you describe how this fantastic cascade of fail improves with this
> > > > > patch, please? If you're getting context faults then something has 
> > > > > already
> > > > > gone horribly wrong, so I'm trying to work out how this improves 
> > > > > things.
> > > > >
> > > >
> > > > There are plenty of cases where getting iommu faults with a GPU is
> > > > "normal", or at least not something the kernel or even GL driver can
> > > > control.
> > >
> > > Such as? All the mainline driver does is print a diagnostic and clear the
> > > fault, which doesn't seem generally useful.
> > 
> > it is useful to debug the fault ;-)
> > 
> > Although eventually we'll want to be able to do more than that, like
> > have the fault trigger bringing in pages of a mmap'd file and that
> > sort of thing.
> 
> Right, and feels very strange to me if we have this bit set because it
> means that your fault is no longer synchronous and therefore diverges
> from the fault model that you get from the CPU, where you certainly
> wouldn't expect stores appearing in the program after a faulting load to
> be visible in memory. However, thinking harder about it, I suppose we're
> already in a situation where the translations are handled out of order
> in the absence of barriers, so maybe it's not the end of the world.
> 
> Could you dump the FSR value that you see in the fault handler, please?
> From my reading of the architecture spec, as long as clear all of the
> fault bits in the irq handler, then your machine shouldn't die like it
> does with HUPCFG=CFCFG=0..
> 
> > > > With this patch, you still get the iommu fault, but it doesn't cause
> > > > the gpu to crash.  But without it, other memory accesses in flight
> > > > while the fault occurs, like the GPU command-processor reading further
> > > > ahead in the cmdstream to setup next draw, would return zero's,
> > > > causing the GPU to crash or get into a bad state.
> > >
> > > I get that part, but I don't understand why we're seeing faults in the 
> > > first
> > > place and I worry that this patch is just the tip of the iceberg. It's 
> > > also
> > > not clear that processing subsequent transactions is always the right 
> > > thing
> > > to do in a world where we actually want to report (and handle) synchronous
> > > faults from devices.
> > 
> > Sure, it is a bug.. but it can be an application bug that is not
> > something the userspace GL driver or kernel could do anything about.
> > We shouldn't let this kill the GPU.  If the application didn't have
> > this much control, we wouldn't need an IOMMU in the first place[1].
> > With opencl compute, the userspace controlled shader has full blown
> > pointers to GPU memory.
> > 
> > And even in cases where it is a userspace GL driver bug, having some
> > 

Re: [Freedreno] [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-11-26 Thread Will Deacon
On Mon, Nov 26, 2018 at 04:56:42PM +0530, Vivek Gautam wrote:
> On 11/26/2018 11:33 AM, Vivek Gautam wrote:
> >On 11/24/2018 12:06 AM, Will Deacon wrote:
> >>On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote:
> >>>On Wed, Nov 21, 2018 at 11:09 PM Will Deacon 
> >>>wrote:
> On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote:
> >From: Sricharan R 
> >
> >The smmu device probe/remove and add/remove master device callbacks
> >gets called when the smmu is not linked to its master, that is
> >without
> >the context of the master device. So calling runtime apis in those
> >places
> >separately.
> >Global locks are also initialized before enabling runtime pm as the
> >runtime_resume() calls device_reset() which does tlb_sync_global()
> >that ultimately requires locks to be initialized.
> >
> >Signed-off-by: Sricharan R 
> >[vivek: Cleanup pm runtime calls]
> >Signed-off-by: Vivek Gautam 
> >Reviewed-by: Tomasz Figa 
> >Tested-by: Srinivas Kandagatla 
> >Reviewed-by: Robin Murphy 
> >---
> >  drivers/iommu/arm-smmu.c | 101
> >++-
> >  1 file changed, 91 insertions(+), 10 deletions(-)
> Given that you're doing the get/put in the TLBI ops unconditionally:
> 
> >  static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
> >  {
> >   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >+ struct arm_smmu_device *smmu = smmu_domain->smmu;
> >
> >- if (smmu_domain->tlb_ops)
> >+ if (smmu_domain->tlb_ops) {
> >+ arm_smmu_rpm_get(smmu);
> >smmu_domain->tlb_ops->tlb_flush_all(smmu_domain);
> >+ arm_smmu_rpm_put(smmu);
> >+ }
> >  }
> >
> >  static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
> >  {
> >   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >+ struct arm_smmu_device *smmu = smmu_domain->smmu;
> >
> >- if (smmu_domain->tlb_ops)
> >+ if (smmu_domain->tlb_ops) {
> >+ arm_smmu_rpm_get(smmu);
> >smmu_domain->tlb_ops->tlb_sync(smmu_domain);
> >+ arm_smmu_rpm_put(smmu);
> >+ }
> Why do you need them around the map/unmap calls as well?
> >>>We still have .tlb_add_flush path?
> >>Ok, so we could add the ops around that as well. Right now, we've got
> >>the runtime pm hooks crossing two parts of the API.
> >
> >Sure, will do that then, and remove the runtime pm hooks from map/unmap.
> 
> I missed this earlier -
> We are adding runtime pm hooks in the 'iommu_ops' callbacks and not really
> to
> 'tlb_ops'. So how the runtime pm hooks crossing the paths?
> '.map/.unmap' iommu_ops don't call '.flush_iotlb_all' or '.iotlb_sync'
> iommu_ops
> anywhere.
> 
> E.g., only callers to domain->ops->flush_iotlb_all() are:
> iommu_dma_flush_iotlb_all(), or iommu_flush_tlb_all() which are not in
> map/unmap paths.

Yes, sorry, I got confused here and completely misled you. In which case,
your original patch is ok because it intercepts the core IOMMU API via
iommu_ops. Apologies.

At that level, should we also annotate arm_smmu_iova_to_phys_hard()
for the iova_to_phys() implementation?

With that detail and clock bits sorted out, we should be able to get this
queued at last.

Will
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-26 Thread Jordan Crouse
On Thu, Nov 22, 2018 at 03:37:54PM +0530, Vivek Gautam wrote:
> Hi Tomasz, Jordan,
> 
> 
> On 11/21/2018 9:18 AM, Tomasz Figa wrote:
 
> >
> >>>+ for_each_sg(msm_obj->sgt->sgl, s,
> >>>+ msm_obj->sgt->nents, i)
> >>>+ sg_dma_address(s) = sg_phys(s);
> >>>+
> >>I'm wondering - wouldn't we want to do this association for cached buffers 
> >>to so
> >>we could sync them correctly in cpu_prep and cpu_fini?  Maybe it wouldn't 
> >>hurt
> >>to put this association in the main path (obviously the sync should stay 
> >>inside
> >>the conditional for uncached buffers).
> >>
> 
> Sure, I will move this out of the conditional check.
> 
> >I guess it wouldn't hurt indeed. Note that cpu_prep/fini seem to be
> >missing the sync call currently.
> 
> I can't say I understand the usage of cpu_prep and cpu_fini(). But I can add
> the necessary support if you can point me in the right direction.

Not needed for this iteration. We don't have support in those functions for
cached buffers right now so continuing to not support it wouldn't hurt.

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-26 Thread Vivek Gautam

Hi Thor,


On 11/26/2018 8:11 PM, Thor Thayer wrote:

Hi Vivek,

On 11/26/18 4:55 AM, Vivek Gautam wrote:


On 11/24/2018 12:04 AM, Will Deacon wrote:

On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote:
On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa  
wrote:

On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
 wrote:
On Wed, Nov 21, 2018 at 11:09 PM Will Deacon 
 wrote:

On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
@@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, 
ARM_SMMU_V1_64K, GENERIC_SMMU);

  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);

+static const char * const qcom_smmuv2_clks[] = {
+ "bus", "iface",
+};
+
+static const struct arm_smmu_match_data qcom_smmuv2 = {
+ .version = ARM_SMMU_V2,
+ .model = QCOM_SMMUV2,
+ .clks = qcom_smmuv2_clks,
+ .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
+};
These seems redundant if we go down the route proposed by Thor, 
where we
just pull all of the clocks out of the device-tree. In which 
case, why

do we need this match_data at all?

Which is better? Driver relying solely on the device tree to tell
which all clocks
are required to be enabled,
or, the driver deciding itself based on the platform's match data,
that it should
have X, Y, & Z clocks that should be supplied from the device tree.

The former would simplify the driver, but would also make it
impossible to spot mistakes in DT, which would ultimately surface out
as very hard to debug bugs (likely complete system lockups).

Thanks.
Yea, this is how I understand things presently. Relying on device tree
puts the things out of driver's control.

But it also has the undesirable effect of having to update the driver
code whenever we want to add support for a new SMMU implementation. If
we do this all in the DT, as Thor is trying to do, then older kernels
will work well with new hardware.


Hi Will,
Am I unable to understand the intentions here for Thor's clock-fetch
design change?

I'm having trouble parsing your question, sorry. Please work with Thor
so that we have a single way to get the clock information. My 
preference

is to take it from the firmware, for the reason I stated above.

Hi Will,

Sure, thanks. I will work with Thor to get this going.

Hi Thor,
Does it sound okay to you to squash your patch [1] into my patch [2] 
with

your 'Signed-off-by' tag?
I will update the commit log to include the information about getting
clock details from device tree.

[1] https://patchwork.kernel.org/patch/10628725/
[2] https://patchwork.kernel.org/patch/10686061/



Yes, that would be great and easier to understand than my patch on top 
of yours.


Additionally, can you remove the "Error:" as Will requested as part of 
the squash?


Thanks for your consent. I have reworked the patch today, and have 
addressed Will's

comment. I will give a try on the board and post it by tomorrow.

Best regards
Vivek



Thank you!

Thor


Best regards
Vivek


Will







___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v2 5/9] drm/msm: add headless gpu device (for imx5)

2018-11-26 Thread Jordan Crouse
On Wed, Nov 21, 2018 at 08:52:31PM -0500, Jonathan Marek wrote:
> This patch allows using drm/msm without qcom display hardware. This is
> especially useful for iMX5 hardware, which has a a2xx GPU but uses the
> imx-drm driver for display.
> 
> Signed-off-by: Jonathan Marek 
> ---
> v2: added commit message and removed unnecessary comment
> 
>  drivers/gpu/drm/msm/Kconfig   |  4 ++--
>  drivers/gpu/drm/msm/msm_debugfs.c |  2 +-
>  drivers/gpu/drm/msm/msm_drv.c | 21 +++--
>  3 files changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
> index 843a9d40c05e..cf549f1ed403 100644
> --- a/drivers/gpu/drm/msm/Kconfig
> +++ b/drivers/gpu/drm/msm/Kconfig
> @@ -2,7 +2,7 @@
>  config DRM_MSM
>   tristate "MSM DRM"
>   depends on DRM
> - depends on ARCH_QCOM || (ARM && COMPILE_TEST)
> + depends on ARCH_QCOM || SOC_IMX5 || (ARM && COMPILE_TEST)
>   depends on OF && COMMON_CLK
>   depends on MMU
>   select QCOM_MDT_LOADER if ARCH_QCOM
> @@ -11,7 +11,7 @@ config DRM_MSM
>   select DRM_PANEL
>   select SHMEM
>   select TMPFS
> - select QCOM_SCM
> + select QCOM_SCM if ARCH_QCOM
>   select WANT_DEV_COREDUMP
>   select SND_SOC_HDMI_CODEC if SND_SOC
>   select SYNC_FILE

This is good - there are probably a handful of others you can add too -
WANT_DEV_COREDUMP for example (at least for now) but you can audit those later
as you try to cut down your binary size.

Also serves as an important reminder for the rest of us that Adreno just isn't
for Snapdragon anymore.

> diff --git a/drivers/gpu/drm/msm/msm_debugfs.c 
> b/drivers/gpu/drm/msm/msm_debugfs.c
> index f0da0d3c8a80..1ca99ca356a4 100644
> --- a/drivers/gpu/drm/msm/msm_debugfs.c
> +++ b/drivers/gpu/drm/msm/msm_debugfs.c
> @@ -235,7 +235,7 @@ int msm_debugfs_init(struct drm_minor *minor)
>   debugfs_create_file("gpu", S_IRUSR, minor->debugfs_root,
>   dev, _gpu_fops);
>  
> - if (priv->kms->funcs->debugfs_init) {
> + if (priv->kms && priv->kms->funcs->debugfs_init) {
>   ret = priv->kms->funcs->debugfs_init(priv->kms, minor);
>   if (ret)
>   return ret;
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index 9f219e02f3c7..3f35e57202ef 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -344,6 +344,7 @@ static int msm_drm_uninit(struct device *dev)
>   return 0;
>  }
>  
> +#define KMS_HEADLESS 1
>  #define KMS_MDP4 4
>  #define KMS_MDP5 5
>  #define KMS_DPU  3
> @@ -495,6 +496,9 @@ static int msm_drm_init(struct device *dev, struct 
> drm_driver *drv)
>   msm_gem_shrinker_init(ddev);
>  
>   switch (get_mdp_ver(pdev)) {
> + case KMS_HEADLESS:
> + priv->kms = kms = NULL;
> + break;
>   case KMS_MDP4:
>   kms = mdp4_kms_init(ddev);
>   priv->kms = kms;
> @@ -512,12 +516,6 @@ static int msm_drm_init(struct device *dev, struct 
> drm_driver *drv)
>   }
>  
>   if (IS_ERR(kms)) {
> - /*
> -  * NOTE: once we have GPU support, having no kms should not
> -  * be considered fatal.. ideally we would still support gpu
> -  * and (for example) use dmabuf/prime to share buffers with
> -  * imx drm driver on iMX5
> -  */
>   dev_err(dev, "failed to load kms\n");
>   ret = PTR_ERR(kms);
>   goto err_msm_uninit;
> @@ -633,7 +631,7 @@ static int msm_drm_init(struct device *dev, struct 
> drm_driver *drv)
>   drm_mode_config_reset(ddev);
>  
>  #ifdef CONFIG_DRM_FBDEV_EMULATION
> - if (fbdev)
> + if (kms && fbdev)
>   priv->fbdev = msm_fbdev_init(ddev);
>  #endif
>  
> @@ -1315,9 +1313,11 @@ static int msm_pdev_probe(struct platform_device *pdev)
>   struct component_match *match = NULL;
>   int ret;
>  
> - ret = add_display_components(>dev, );
> - if (ret)
> - return ret;
> + if (get_mdp_ver(pdev) != KMS_HEADLESS) {
> + ret = add_display_components(>dev, );
> + if (ret)
> + return ret;
> + }
>  
>   ret = add_gpu_components(>dev, );
>   if (ret)
> @@ -1342,6 +1342,7 @@ static int msm_pdev_remove(struct platform_device *pdev)
>  }
>  
>  static const struct of_device_id dt_match[] = {
> + { .compatible = "qcom,adreno-headless", .data = (void *)KMS_HEADLESS },

I feel that this should be more descriptive for your hardware and not just
generic. You don't need to include the qcom, extension and you'll probably
want to use a more descriptive vendor if you are adding imx5 specific data to
the DT.

>   { .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 },
>   { .compatible = "qcom,mdss", .data = (void *)KMS_MDP5 },
>   { .compatible = "qcom,sdm845-mdss", .data = (void *)KMS_DPU },
> -- 
> 2.17.1
 
-- 
The 

Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-26 Thread Thor Thayer

Hi Vivek,

On 11/26/18 4:55 AM, Vivek Gautam wrote:


On 11/24/2018 12:04 AM, Will Deacon wrote:

On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote:

On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa  wrote:

On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
 wrote:
On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  
wrote:

On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
@@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, 
ARM_SMMU_V1_64K, GENERIC_SMMU);

  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);

+static const char * const qcom_smmuv2_clks[] = {
+ "bus", "iface",
+};
+
+static const struct arm_smmu_match_data qcom_smmuv2 = {
+ .version = ARM_SMMU_V2,
+ .model = QCOM_SMMUV2,
+ .clks = qcom_smmuv2_clks,
+ .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
+};
These seems redundant if we go down the route proposed by Thor, 
where we
just pull all of the clocks out of the device-tree. In which case, 
why

do we need this match_data at all?

Which is better? Driver relying solely on the device tree to tell
which all clocks
are required to be enabled,
or, the driver deciding itself based on the platform's match data,
that it should
have X, Y, & Z clocks that should be supplied from the device tree.

The former would simplify the driver, but would also make it
impossible to spot mistakes in DT, which would ultimately surface out
as very hard to debug bugs (likely complete system lockups).

Thanks.
Yea, this is how I understand things presently. Relying on device tree
puts the things out of driver's control.

But it also has the undesirable effect of having to update the driver
code whenever we want to add support for a new SMMU implementation. If
we do this all in the DT, as Thor is trying to do, then older kernels
will work well with new hardware.


Hi Will,
Am I unable to understand the intentions here for Thor's clock-fetch
design change?

I'm having trouble parsing your question, sorry. Please work with Thor
so that we have a single way to get the clock information. My preference
is to take it from the firmware, for the reason I stated above.

Hi Will,

Sure, thanks. I will work with Thor to get this going.

Hi Thor,
Does it sound okay to you to squash your patch [1] into my patch [2] with
your 'Signed-off-by' tag?
I will update the commit log to include the information about getting
clock details from device tree.

[1] https://patchwork.kernel.org/patch/10628725/
[2] https://patchwork.kernel.org/patch/10686061/



Yes, that would be great and easier to understand than my patch on top 
of yours.


Additionally, can you remove the "Error:" as Will requested as part of 
the squash?


Thank you!

Thor


Best regards
Vivek


Will





___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-11-26 Thread Vivek Gautam



On 11/26/2018 11:33 AM, Vivek Gautam wrote:



On 11/24/2018 12:06 AM, Will Deacon wrote:

On Thu, Nov 22, 2018 at 05:32:24PM +0530, Vivek Gautam wrote:

Hi Will,

On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  
wrote:

On Fri, Nov 16, 2018 at 04:54:27PM +0530, Vivek Gautam wrote:

From: Sricharan R 

The smmu device probe/remove and add/remove master device callbacks
gets called when the smmu is not linked to its master, that is 
without
the context of the master device. So calling runtime apis in those 
places

separately.
Global locks are also initialized before enabling runtime pm as the
runtime_resume() calls device_reset() which does tlb_sync_global()
that ultimately requires locks to be initialized.

Signed-off-by: Sricharan R 
[vivek: Cleanup pm runtime calls]
Signed-off-by: Vivek Gautam 
Reviewed-by: Tomasz Figa 
Tested-by: Srinivas Kandagatla 
Reviewed-by: Robin Murphy 
---
  drivers/iommu/arm-smmu.c | 101 
++-

  1 file changed, 91 insertions(+), 10 deletions(-)

Given that you're doing the get/put in the TLBI ops unconditionally:


  static void arm_smmu_flush_iotlb_all(struct iommu_domain *domain)
  {
   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_device *smmu = smmu_domain->smmu;

- if (smmu_domain->tlb_ops)
+ if (smmu_domain->tlb_ops) {
+ arm_smmu_rpm_get(smmu);
smmu_domain->tlb_ops->tlb_flush_all(smmu_domain);
+ arm_smmu_rpm_put(smmu);
+ }
  }

  static void arm_smmu_iotlb_sync(struct iommu_domain *domain)
  {
   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+ struct arm_smmu_device *smmu = smmu_domain->smmu;

- if (smmu_domain->tlb_ops)
+ if (smmu_domain->tlb_ops) {
+ arm_smmu_rpm_get(smmu);
smmu_domain->tlb_ops->tlb_sync(smmu_domain);
+ arm_smmu_rpm_put(smmu);
+ }

Why do you need them around the map/unmap calls as well?

We still have .tlb_add_flush path?

Ok, so we could add the ops around that as well. Right now, we've got
the runtime pm hooks crossing two parts of the API.


Sure, will do that then, and remove the runtime pm hooks from map/unmap.


I missed this earlier -
We are adding runtime pm hooks in the 'iommu_ops' callbacks and not 
really to

'tlb_ops'. So how the runtime pm hooks crossing the paths?
'.map/.unmap' iommu_ops don't call '.flush_iotlb_all' or '.iotlb_sync' 
iommu_ops

anywhere.

E.g., only callers to domain->ops->flush_iotlb_all() are:
iommu_dma_flush_iotlb_all(), or iommu_flush_tlb_all() which are not in 
map/unmap paths.


Regards
Vivek



Thanks
Vivek


Will




___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-26 Thread Vivek Gautam


On 11/24/2018 12:04 AM, Will Deacon wrote:

On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote:

On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa  wrote:

On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
 wrote:

On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  wrote:

On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:

@@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, 
GENERIC_SMMU);
  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);

+static const char * const qcom_smmuv2_clks[] = {
+ "bus", "iface",
+};
+
+static const struct arm_smmu_match_data qcom_smmuv2 = {
+ .version = ARM_SMMU_V2,
+ .model = QCOM_SMMUV2,
+ .clks = qcom_smmuv2_clks,
+ .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
+};

These seems redundant if we go down the route proposed by Thor, where we
just pull all of the clocks out of the device-tree. In which case, why
do we need this match_data at all?

Which is better? Driver relying solely on the device tree to tell
which all clocks
are required to be enabled,
or, the driver deciding itself based on the platform's match data,
that it should
have X, Y, & Z clocks that should be supplied from the device tree.

The former would simplify the driver, but would also make it
impossible to spot mistakes in DT, which would ultimately surface out
as very hard to debug bugs (likely complete system lockups).

Thanks.
Yea, this is how I understand things presently. Relying on device tree
puts the things out of driver's control.

But it also has the undesirable effect of having to update the driver
code whenever we want to add support for a new SMMU implementation. If
we do this all in the DT, as Thor is trying to do, then older kernels
will work well with new hardware.


Hi Will,
Am I unable to understand the intentions here for Thor's clock-fetch
design change?

I'm having trouble parsing your question, sorry. Please work with Thor
so that we have a single way to get the clock information. My preference
is to take it from the firmware, for the reason I stated above.

Hi Will,

Sure, thanks. I will work with Thor to get this going.

Hi Thor,
Does it sound okay to you to squash your patch [1] into my patch [2] with
your 'Signed-off-by' tag?
I will update the commit log to include the information about getting
clock details from device tree.

[1] https://patchwork.kernel.org/patch/10628725/
[2] https://patchwork.kernel.org/patch/10686061/

Best regards
Vivek


Will


___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno