date:20200804

Hi all,

On Mon, 20 Jul 2020 15:59:17 +1000 Stephen Rothwell  
wrote:
>
> Hi all,
> 
> Today's linux-next merge of the seccomp tree got a conflict in:
> 
>   tools/testing/selftests/seccomp/seccomp_bpf.c
> 
> between commit:
> 
>   4c6614dc86ad ("selftests/seccomp: Check ENOSYS under tracing")
> 
> from the kselftest tree and commit:
> 
>   11eb004ef7ea ("selftests/seccomp: Check ENOSYS under tracing")
> 
> from the seccomp tree.
> 
> I fixed it up (I just used the latter version) and can carry the fix as
> necessary. This is now fixed as far as linux-next is concerned, but any
> non trivial conflicts should be mentioned to your upstream maintainer
> when your tree is submitted for merging.  You may also want to consider
> cooperating with the maintainer of the conflicting tree to minimise any
> particularly complex conflicts.

This is now a conflict between the kselftest tree and Linus' tree.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH bpf-next 2/5] libbpf: support BPF_PROG_TYPE_USER programs

2020-08-04 Thread Andrii Nakryiko

On Tue, Aug 4, 2020 at 8:59 PM Song Liu  wrote:
>
>
>
> > On Aug 4, 2020, at 6:38 PM, Andrii Nakryiko  
> > wrote:
> >
> > On Mon, Aug 3, 2020 at 6:18 PM Song Liu  wrote:
> >>
> >>
> >>
> >>> On Aug 2, 2020, at 6:40 PM, Andrii Nakryiko  
> >>> wrote:
> >>>
> >>> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
> 
> >>
> >> [...]
> >>
> >>>
>  };
> 
>  LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr 
>  *test_attr);
>  diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>  index b9f11f854985b..9ce175a486214 100644
>  --- a/tools/lib/bpf/libbpf.c
>  +++ b/tools/lib/bpf/libbpf.c
>  @@ -6922,6 +6922,7 @@ static const struct bpf_sec_def section_defs[] = {
>    BPF_PROG_SEC("lwt_out", BPF_PROG_TYPE_LWT_OUT),
>    BPF_PROG_SEC("lwt_xmit",BPF_PROG_TYPE_LWT_XMIT),
>    BPF_PROG_SEC("lwt_seg6local",   
>  BPF_PROG_TYPE_LWT_SEG6LOCAL),
>  +   BPF_PROG_SEC("user",BPF_PROG_TYPE_USER),
> >>>
> >>> let's do "user/" for consistency with most other prog types (and nice
> >>> separation between prog type and custom user name)
> >>
> >> About "user" vs. "user/", I still think "user" is better.
> >>
> >> Unlike kprobe and tracepoint, user prog doesn't use the part after "/".
> >> This is similar to "perf_event" for BPF_PROG_TYPE_PERF_EVENT, "xdp" for
> >> BPF_PROG_TYPE_XDP, etc. If we specify "user" here, "user/" and "user/xxx"
> >> would also work. However, if we specify "user/" here, programs that used
> >> "user" by accident will fail to load, with a message like:
> >>
> >>libbpf: failed to load program 'user'
> >>
> >> which is confusing.
> >
> > xdp, perf_event and a bunch of others don't enforce it, that's true,
> > they are a bit of a legacy,
>
> I don't see w/o "/" is a legacy thing. BPF_PROG_TYPE_STRUCT_OPS just uses
> "struct_ops".
>
> > unfortunately. But all the recent ones do,
> > and we explicitly did that for xdp_dev/xdp_cpu, for instance.
> > Specifying just "user" in the spec would allow something nonsensical
> > like "userargh", for instance, due to this being treated as a prefix.
> > There is no harm to require users to do "user/my_prog", though.
>
> I don't see why allowing "userargh" is a problem. Failing "user" is
> more confusing. We can probably improve that by a hint like:
>
> libbpf: failed to load program 'user', do you mean "user/"?
>
> But it is pretty silly. "user/something_never_used" also looks weird.

"userargh" is terrible, IMO. It's a different identifier that just
happens to have the first 4 letters matching "user" program type.
There must be either a standardized separator (which happens to be
'/') or none. See the suggestion below.
>
> > Alternatively, we could introduce a new convention in the spec,
> > something like "user?", which would accept either "user" or
> > "user/something", but not "user/" nor "userblah". We can try that as
> > well.
>
> Again, I don't really understand why allowing "userblah" is a problem.
> We already have "xdp", "xdp_devmap/", and "xdp_cpumap/", they all work
> fine so far.

Right, we have "xdp_devmap/" and "xdp_cpumap/", as you say. I haven't
seen so much pushback against trailing forward slash with those ;)

But anyways, as part of deprecating APIs and preparing libbpf for 1.0
release over this half, I think I'm going to emit warnings for names
like "prog_type_whatever" or "prog_typeevenworse", etc., and ask
users to normalize section names to either "prog_type" or
"prog_type/something/here", whichever makes sense for a specific
program type. Right now libbpf doesn't allow two separate BPF programs
with the same section name, so enforcing strict "user" is limiting to
users. We are going to lift that restriction pretty soon, though. But
for now, please stick with what we've been doing lately and mark it as
"user/", later we'll allow just "user" as well.

>
> Thanks,
> Song
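
For reference, the three matching rules discussed in this thread can be compared side by side with a minimal userspace sketch. This is not libbpf code: the helper names below are hypothetical, and the "user?" convention is only the proposal made above, not an existing feature.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helpers for illustration only; none of this is libbpf API. */

/* Plain prefix match, as described in the thread for a "user" spec:
 * accepts "user", "user/", "user/x" and also "userargh". */
static bool sec_match_prefix(const char *sec, const char *type)
{
	assert(sec && type);
	return strncmp(sec, type, strlen(type)) == 0;
}

/* Prefix match against "type/": accepts "user/" and "user/x",
 * rejects bare "user" and "userargh". */
static bool sec_match_slash(const char *sec, const char *type)
{
	size_t n;

	assert(sec && type);
	n = strlen(type);
	return strncmp(sec, type, n) == 0 && sec[n] == '/';
}

/* The proposed "user?" convention: accepts exactly "user" or
 * "user/<non-empty>", rejects "user/" and "userblah". */
static bool sec_match_optional(const char *sec, const char *type)
{
	size_t n;

	assert(sec && type);
	n = strlen(type);
	if (strncmp(sec, type, n) != 0)
		return false;
	if (sec[n] == '\0')
		return true;
	return sec[n] == '/' && sec[n + 1] != '\0';
}
```

With these helpers, "userargh" passes only the plain prefix rule, bare "user" fails the "user/" rule, and the optional rule accepts "user" and "user/my_prog" while rejecting both "user/" and "userblah".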

[PATCH v7 2/4] scsi: ufs: Introduce HPB feature

2020-08-04 Thread Daejun Park

This is a patch for the HPB feature.
This patch adds HPB function calls to UFS core driver.

The minimum size of the memory pool used in the HPB is implemented as a
Kconfig parameter (SCSI_UFS_HPB_HOST_MEM), so that it can be configurable.

Reviewed-by: Can Guo 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/Kconfig  |  18 +
 drivers/scsi/ufs/Makefile |   1 +
 drivers/scsi/ufs/ufshcd.c |  42 +++
 drivers/scsi/ufs/ufshcd.h |   9 +
 drivers/scsi/ufs/ufshpb.c | 738 ++
 drivers/scsi/ufs/ufshpb.h | 169 +
 6 files changed, 977 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
index f6394999b98c..33296478f411 100644
--- a/drivers/scsi/ufs/Kconfig
+++ b/drivers/scsi/ufs/Kconfig
@@ -182,3 +182,21 @@ config SCSI_UFS_CRYPTO
  Enabling this makes it possible for the kernel to use the crypto
  capabilities of the UFS device (if present) to perform crypto
  operations on data being transferred to/from the device.
+
+config SCSI_UFS_HPB
+   bool "Support UFS Host Performance Booster"
+   depends on SCSI_UFSHCD
+   help
+ A UFS HPB Feature improves random read performance. It caches
+ L2P map of UFS to host DRAM. The driver uses HPB read command
+ by piggybacking physical page number for bypassing FTL's L2P address
+ translation.
+
+config SCSI_UFS_HPB_HOST_MEM
+   int "Host-side cached memory size (KB) for HPB support"
+   default 32
+   depends on SCSI_UFS_HPB
+   help
+ The minimum size of the memory pool used in the HPB module. It can
+ be configurable by the user. If this value is larger than required
+ memory size, kernel resizes cached memory size.
diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
index 4679af1b564e..663e17cee359 100644
--- a/drivers/scsi/ufs/Makefile
+++ b/drivers/scsi/ufs/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SCSI_UFSHCD) += ufshcd-core.o
 ufshcd-core-y  += ufshcd.o ufs-sysfs.o
 ufshcd-core-$(CONFIG_SCSI_UFS_BSG) += ufs_bsg.o
 ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO) += ufshcd-crypto.o
+ufshcd-core-$(CONFIG_SCSI_UFS_HPB) += ufshpb.o
 obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
 obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o
 obj-$(CONFIG_SCSI_UFS_HISI) += ufs-hisi.o
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index cdff7e5ee588..a99afdcf8dc0 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -234,6 +234,17 @@ static int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable);
 static int ufshcd_wb_toggle_flush_during_h8(struct ufs_hba *hba, bool set);
 static inline void ufshcd_wb_toggle_flush(struct ufs_hba *hba, bool enable);
 
+#ifndef CONFIG_SCSI_UFS_HPB
+static void ufshpb_resume(struct ufs_hba *hba) {}
+static void ufshpb_suspend(struct ufs_hba *hba) {}
+static void ufshpb_reset(struct ufs_hba *hba) {}
+static void ufshpb_reset_host(struct ufs_hba *hba) {}
+static void ufshpb_rsp_upiu(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) {}
+static void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) {}
+static void ufshpb_remove(struct ufs_hba *hba) {}
+static void ufshpb_scan_feature(struct ufs_hba *hba) {}
+#endif
+
 static inline bool ufshcd_valid_tag(struct ufs_hba *hba, int tag)
 {
return tag >= 0 && tag < hba->nutrs;
@@ -2559,6 +2570,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 
ufshcd_comp_scsi_upiu(hba, lrbp);
 
+   ufshpb_prep(hba, lrbp);
+
err = ufshcd_map_sg(hba, lrbp);
if (err) {
lrbp->cmd = NULL;
@@ -4681,6 +4694,19 @@ static int ufshcd_change_queue_depth(struct scsi_device *sdev, int depth)
return scsi_change_queue_depth(sdev, depth);
 }
 
+static void ufshcd_hpb_configure(struct ufs_hba *hba, struct scsi_device *sdev)
+{
+   /* skip well-known LU */
+   if (sdev->lun >= UFS_UPIU_MAX_UNIT_NUM_ID)
+   return;
+
+   if (!(hba->dev_info.b_ufs_feature_sup & UFS_DEV_HPB_SUPPORT))
+   return;
+
+   atomic_inc(&hba->ufsf.slave_conf_cnt);
+   wake_up(&hba->ufsf.sdev_wait);
+}
+
 /**
  * ufshcd_slave_configure - adjust SCSI device configurations
  * @sdev: pointer to SCSI device
@@ -4690,6 +4716,8 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
struct ufs_hba *hba = shost_priv(sdev->host);
struct request_queue *q = sdev->request_queue;
 
+   ufshcd_hpb_configure(hba, sdev);
+
blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1);
 
if (ufshcd_is_rpm_autosuspend_allowed(hba))
@@ -4818,6 +4846,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+

[PATCH v2 9/9] arm64: tegra: Audio graph sound card for Jetson Nano and TX1

Enable support for audio-graph based sound card on Jetson-Nano and
Jetson-TX1. Depending on the platform, required I/O interfaces are
enabled.

 * Jetson-Nano: Enable I2S3, I2S4, DMIC1 and DMIC2.
 * Jetson-TX1: Enable all I2S and DMIC interfaces.

Signed-off-by: Sameer Pujar 
---
 arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts | 217 +
 arch/arm64/boot/dts/nvidia/tegra210-p3450-.dts | 122 
 2 files changed, 339 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts b/arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts
index 56adf28..0eefad7 100644
--- a/arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts
+++ b/arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts
@@ -3,6 +3,7 @@
 
 #include "tegra210-p2180.dtsi"
 #include "tegra210-p2597.dtsi"
+#include "tegra210-audio-graph.dtsi"
 
 / {
model = "NVIDIA Jetson TX1 Developer Kit";
@@ -126,4 +127,220 @@
status = "okay";
};
};
+
+   tegra_sound {
+   status = "okay";
+
+   dais = /* FE */
+  <_port>, <_port>, <_port>,
+  <_port>, <_port>, <_port>,
+  <_port>, <_port>, <_port>,
+  <_port>,
+  /* Router */
+  <_i2s1_port>, <_i2s2_port>, <_i2s3_port>,
+  <_i2s4_port>, <_i2s5_port>, <_dmic1_port>,
+  <_dmic2_port>, <_dmic3_port>,
+  /* I/O DAP Ports */
+  <_port>, <_port>, <_port>, <_port>,
+  <_port>, <_port>, <_port>, 
<_port>;
+
+   label = "jetson-tx1-ape";
+   };
+};
+
+_admaif {
+   status = "okay";
+};
+
+_ahub {
+   status = "okay";
+
+   ports {
+   xbar_i2s1_port: port@a {
+   reg = <0xa>;
+   xbar_i2s1_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_i2s2_port: port@b {
+   reg = <0xb>;
+   xbar_i2s2_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_i2s3_port: port@c {
+   reg = <0xc>;
+   xbar_i2s3_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_i2s4_port: port@d {
+   reg = <0xd>;
+   xbar_i2s4_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_i2s5_port: port@e {
+   reg = <0xe>;
+   xbar_i2s5_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_dmic1_port: port@f {
+   reg = <0xf>;
+   xbar_dmic1_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_dmic2_port: port@10 {
+   reg = <0x10>;
+   xbar_dmic2_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   xbar_dmic3_port: port@11 {
+   reg = <0x11>;
+   xbar_dmic3_ep: endpoint {
+   remote-endpoint = <_cif_ep>;
+   };
+   };
+   };
+};
+
+_i2s1 {
+   status = "okay";
+
+   port@0 {
+   i2s1_cif_ep: endpoint {
+   remote-endpoint = <_i2s1_ep>;
+   };
+   };
+
+   i2s1_port: port@1 {
+   i2s1_dap: endpoint {
+   dai-format = "i2s";
+
+   /* Placeholder for external Codec */
+   };
+   };
+};
+
+_i2s2 {
+   status = "okay";
+
+   port@0 {
+   i2s2_cif_ep: endpoint {
+   remote-endpoint = <_i2s2_ep>;
+   };
+   };
+
+   i2s2_port: port@1 {
+   i2s2_dap: endpoint {
+   dai-format = "i2s";
+
+   /* Placeholder for external Codec */
+   };
+   };
+};
+
+_i2s3 {
+   status = "okay";
+
+   port@0 {
+   i2s3_cif_ep: endpoint {
+   remote-endpoint = <_i2s3_ep>;
+   };
+   };
+
+   i2s3_port: port@1 {
+   i2s3_dap_ep: endpoint {
+   dai-format = "i2s";
+
+   /* Placeholder for external Codec */
+   };
+   };
+};
+
+_i2s4 {
+   status = "okay";
+
+   port@0 {
+   i2s4_cif_ep: endpoint {
+

[PATCH v2 7/9] ASoC: audio-graph: Support empty Codec endpoint

For open platforms, which can support pluggable audio cards, Codec
endpoint is not fixed always. It actually depends on the compatible
HW module that is going to be connected. From SoC side the given I/O
interface is always available. Hence such links have fixed CPU endpoint
but no Codec endpoint. This patch helps to support such links where
user can populate Codec endpoint only and its fields in Platform DT
depending on the plugged HW.

Signed-off-by: Sameer Pujar 
---
 sound/soc/generic/audio-graph-card.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
index 4a0a345..b588be9 100644
--- a/sound/soc/generic/audio-graph-card.c
+++ b/sound/soc/generic/audio-graph-card.c
@@ -231,6 +231,14 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv,
struct snd_soc_dai_link_component *codecs = dai_link->codecs;
int ret;
 
+   /*
+* Codec endpoint can be NULL for pluggable audio HW.
+* Platform DT can populate the Codec endpoint depending on the
+* plugged HW.
+*/
+   if (!li->cpu && !codec_ep)
+   return 0;
+
/* Do it all CPU endpoint, and 1st Codec endpoint */
if (!li->cpu && dup_codec)
return 0;
@@ -566,7 +574,7 @@ static int graph_count_dpcm(struct asoc_simple_priv *priv,
li->link++; /* 1xCPU-dummy */
li->dais++; /* 1xCPU */
 
-   if (!dup_codec) {
+   if (!dup_codec && codec_ep) {
li->link++; /* 1xdummy-Codec */
li->conf++; /* 1xdummy-Codec */
li->dais++; /* 1xCodec */
-- 
2.7.4

[PATCH v2 6/9] ASoC: audio-graph: Add support for component chaining

The audio-graph driver supports both normal and DPCM DAI links. The
sound cards requiring DPCM DAI link support, use DPCM_SELECTABLE flag
and DAI links are treated as DPCM links depending on the number of
child nodes in a given DAI link.

There is another requirement where multiple ASoC components need to
be connected together in a chained fashion in a component model. This
patch sets 'component_chaining' flag for such sound cards where FE<->BE
and multiple BE<->BE component connections are required. Hence support
for such sound cards is added under 'audio-graph-cc-card' compatible.
All DAI links under this are treated as DPCM links.

Signed-off-by: Sameer Pujar 
---
 sound/soc/generic/audio-graph-card.c | 32 +---
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
index 93bddf6..4a0a345 100644
--- a/sound/soc/generic/audio-graph-card.c
+++ b/sound/soc/generic/audio-graph-card.c
@@ -20,10 +20,13 @@
 #include 
 #include 
 
-#define DPCM_SELECTABLE 1
-
 #define PREFIX "audio-graph-card,"
 
+struct asoc_simple_soc_data {
+   unsigned int dpcm_selectable:1;
+   unsigned int component_chaining:1;
+};
+
 static int graph_outdrv_event(struct snd_soc_dapm_widget *w,
  struct snd_kcontrol *kcontrol,
  int event)
@@ -447,7 +450,7 @@ static int graph_for_each_link(struct asoc_simple_priv *priv,
struct device_node *codec_port;
struct device_node *codec_port_old = NULL;
struct asoc_simple_data adata;
-   uintptr_t dpcm_selectable = (uintptr_t)of_device_get_match_data(dev);
+   const struct asoc_simple_soc_data *data = of_device_get_match_data(dev);
int rc, ret;
 
/* loop for all listed CPU port */
@@ -474,10 +477,12 @@ static int graph_for_each_link(struct asoc_simple_priv *priv,
 * It is DPCM
 * if Codec port has many endpoints,
 * or has convert-xxx property
+* or component chaining is used
 */
-   if (dpcm_selectable &&
+   if (data && data->dpcm_selectable &&
((of_get_child_count(codec_port) > 1) ||
-adata.convert_rate || adata.convert_channels))
+adata.convert_rate || adata.convert_channels ||
+data->component_chaining))
ret = func_dpcm(priv, cpu_ep, codec_ep, li,
(codec_port_old == codec_port));
/* else normal sound */
@@ -650,6 +655,7 @@ static int graph_probe(struct platform_device *pdev)
 {
struct asoc_simple_priv *priv;
struct device *dev = &pdev->dev;
+   const struct asoc_simple_soc_data *data = of_device_get_match_data(dev);
struct snd_soc_card *card;
struct link_info li;
int ret;
@@ -666,6 +672,9 @@ static int graph_probe(struct platform_device *pdev)
card->num_dapm_widgets  = ARRAY_SIZE(graph_dapm_widgets);
card->probe = graph_card_probe;
 
+   if (data)
+   card->component_chaining = data->component_chaining;
+
memset(&li, 0, sizeof(li));
graph_get_dais_count(priv, &li);
if (!li.link || !li.dais)
@@ -711,10 +720,19 @@ static int graph_remove(struct platform_device *pdev)
return asoc_simple_clean_reference(card);
 }
 
+static const struct asoc_simple_soc_data scu_card_data = {
+   .dpcm_selectable = 1,
+};
+
+static const struct asoc_simple_soc_data cc_card_data = {
+   .dpcm_selectable = 1,
+   .component_chaining = 1,
+};
+
 static const struct of_device_id graph_of_match[] = {
{ .compatible = "audio-graph-card", },
-   { .compatible = "audio-graph-scu-card",
- .data = (void *)DPCM_SELECTABLE },
+   { .compatible = "audio-graph-scu-card", .data = &scu_card_data, },
+   { .compatible = "audio-graph-cc-card", .data = &cc_card_data, },
{},
 };
 MODULE_DEVICE_TABLE(of, graph_of_match);
-- 
2.7.4
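
The per-compatible flags introduced by this patch can be tried out in userspace with a minimal sketch. The lookup function below is a hypothetical stand-in for of_device_get_match_data(), and link_is_dpcm() only mirrors the shape of the condition in graph_for_each_link(); neither is the driver code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the per-compatible flags added by the patch. */
struct soc_data {
	unsigned int dpcm_selectable:1;
	unsigned int component_chaining:1;
};

struct match_entry {
	const char *compatible;
	const struct soc_data *data;	/* NULL for the plain card */
};

static const struct soc_data scu_card_data = { .dpcm_selectable = 1 };
static const struct soc_data cc_card_data = {
	.dpcm_selectable = 1,
	.component_chaining = 1,
};

static const struct match_entry match_table[] = {
	{ "audio-graph-card", NULL },
	{ "audio-graph-scu-card", &scu_card_data },
	{ "audio-graph-cc-card", &cc_card_data },
};

/* Hypothetical userspace stand-in for of_device_get_match_data(). */
static const struct soc_data *get_match_data(const char *compatible)
{
	size_t i;

	assert(compatible);
	for (i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++)
		if (strcmp(match_table[i].compatible, compatible) == 0)
			return match_table[i].data;
	return NULL;
}

/* DPCM is chosen when the card is DPCM-selectable and at least one
 * trigger holds: several Codec endpoints, a convert-xxx property,
 * or component chaining. */
static int link_is_dpcm(const struct soc_data *data, int codec_endpoints,
			int convert_rate, int convert_channels)
{
	return data && data->dpcm_selectable &&
	       (codec_endpoints > 1 || convert_rate || convert_channels ||
		data->component_chaining);
}
```

In this sketch a link under "audio-graph-cc-card" is always treated as DPCM, while one under "audio-graph-scu-card" still needs more than one Codec endpoint or a convert-xxx property.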

[PATCH v2 8/9] arm64: tegra: Audio graph header for Tegra210

Expose a header which describes DT bindings required to use audio-graph
based sound card. All Tegra210 based platforms can include this header
and add platform specific information. Currently, from SoC point of view,
all links are exposed for ADMAIF, AHUB, I2S and DMIC components.

Signed-off-by: Sameer Pujar 
---
 .../boot/dts/nvidia/tegra210-audio-graph.dtsi  | 141 +
 1 file changed, 141 insertions(+)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra210-audio-graph.dtsi

diff --git a/arch/arm64/boot/dts/nvidia/tegra210-audio-graph.dtsi b/arch/arm64/boot/dts/nvidia/tegra210-audio-graph.dtsi
new file mode 100644
index 000..23f524d
--- /dev/null
+++ b/arch/arm64/boot/dts/nvidia/tegra210-audio-graph.dtsi
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/ {
+   tegra_sound {
+   status = "disabled";
+
+   compatible = "audio-graph-cc-card";
+
+   clocks = <_car TEGRA210_CLK_PLL_A>,
+<_car TEGRA210_CLK_PLL_A_OUT0>,
+<_car TEGRA210_CLK_EXTERN1>;
+   clock-names = "pll_a", "plla_out0", "aud_mclk";
+
+   assigned-clocks = <_car TEGRA210_CLK_PLL_A>,
+ <_car TEGRA210_CLK_PLL_A_OUT0>,
+ <_car TEGRA210_CLK_EXTERN1>;
+   assigned-clock-parents = <0>, <0>, <_car 
TEGRA210_CLK_PLL_A_OUT0>;
+   assigned-clock-rates = <36864>, <49152000>, <12288000>;
+   };
+};
+
+_admaif {
+   admaif1_port: port@0 {
+   admaif1_ep: endpoint {
+   remote-endpoint = <_admaif1_ep>;
+   };
+   };
+   admaif2_port: port@1 {
+   admaif2_ep: endpoint {
+   remote-endpoint = <_admaif2_ep>;
+   };
+   };
+   admaif3_port: port@2 {
+   admaif3_ep: endpoint {
+   remote-endpoint = <_admaif3_ep>;
+   };
+   };
+   admaif4_port: port@3 {
+   admaif4_ep: endpoint {
+   remote-endpoint = <_admaif4_ep>;
+   };
+   };
+   admaif5_port: port@4 {
+   admaif5_ep: endpoint {
+   remote-endpoint = <_admaif5_ep>;
+   };
+   };
+   admaif6_port: port@5 {
+   admaif6_ep: endpoint {
+   remote-endpoint = <_admaif6_ep>;
+   };
+   };
+   admaif7_port: port@6 {
+   admaif7_ep: endpoint {
+   remote-endpoint = <_admaif7_ep>;
+   };
+   };
+   admaif8_port: port@7 {
+   admaif8_ep: endpoint {
+   remote-endpoint = <_admaif8_ep>;
+   };
+   };
+   admaif9_port: port@8 {
+   admaif9_ep: endpoint {
+   remote-endpoint = <_admaif9_ep>;
+   };
+   };
+   admaif10_port: port@9 {
+   admaif10_ep: endpoint {
+   remote-endpoint = <_admaif10_ep>;
+   };
+   };
+};
+
+_ahub {
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0x0>;
+   xbar_admaif1_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@1 {
+   reg = <0x1>;
+   xbar_admaif2_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@2 {
+   reg = <0x2>;
+   xbar_admaif3_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@3 {
+   reg = <0x3>;
+   xbar_admaif4_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@4 {
+   reg = <0x4>;
+   xbar_admaif5_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@5 {
+   reg = <0x5>;
+   xbar_admaif6_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@6 {
+   reg = <0x6>;
+   xbar_admaif7_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@7 {
+   reg = <0x7>;
+   xbar_admaif8_ep: endpoint {
+   remote-endpoint = <_ep>;
+   };
+   };
+   port@8 {
+   reg =

[PATCH v2 0/9] Audio graph card updates and usage with Tegra210 audio

This series proposes following enhancements to audio-graph card driver.
 * Support multiple instances of a component.
 * Support open platforms with empty Codec endpoint.
 * Identify no-pcm DPCM DAI links which can be used in BE<->BE connections.
 * Add new compatible to support DPCM based DAI chaining.

This pushes DT support for Tegra210 based platforms which use the
audio-graph card and the above enhancements.

The series is based on the following references, where DPCM usage for Tegra
Audio and the simple-card driver proposal were discussed.
 * https://lkml.org/lkml/2020/4/30/519 (DPCM for Tegra)
 * https://lkml.org/lkml/2020/6/27/4 (simple-card driver)

Changelog
=========

v1 -> v2

 * Re-organized ports/endpoints description for ADMAIF and XBAR.
   Updated DT patches accordingly.
 * After above change, multiple Codec endpoint support is not
   required and hence dropped for now. This will be considered
   separately if at all required in future.
 * Re-ordered patches in the series.

Sameer Pujar (9):
  ASoC: soc-core: Fix component name_prefix parsing
  ASoC: audio-graph: Use of_node and DAI for DPCM DAI link names
  ASoC: audio-graph: Identify 'no_pcm' DAI links for DPCM
  ASoC: soc-pcm: Get all BEs along DAPM path
  ASoC: dt-bindings: audio-graph-card: Support for component chaining
  ASoC: audio-graph: Add support for component chaining
  ASoC: audio-graph: Support empty Codec endpoint
  arm64: tegra: Audio graph header for Tegra210
  arm64: tegra: Audio graph sound card for Jetson Nano and TX1

 .../devicetree/bindings/sound/audio-graph-card.txt |   1 +
 .../boot/dts/nvidia/tegra210-audio-graph.dtsi  | 141 +
 arch/arm64/boot/dts/nvidia/tegra210-p2371-2180.dts | 217 +
 arch/arm64/boot/dts/nvidia/tegra210-p3450-.dts | 122 
 include/sound/soc.h|   1 +
 sound/soc/generic/audio-graph-card.c   |  69 ++-
 sound/soc/soc-core.c   |   3 +-
 sound/soc/soc-pcm.c|   3 +-
 8 files changed, 545 insertions(+), 12 deletions(-)
 create mode 100644 arch/arm64/boot/dts/nvidia/tegra210-audio-graph.dtsi

-- 
2.7.4

[PATCH v2 3/9] ASoC: audio-graph: Identify 'no_pcm' DAI links for DPCM

PCM devices are created for FE DAI links with the 'no_pcm' flag set to '0'.
Such DAI links have a CPU component which implements either pcm_construct()
or pcm_new() at component or DAI level respectively. Based on this, the
current patch exposes a helper function to identify such components
and populate the 'no_pcm' flag for a DPCM DAI link.

This helps to have BE<->BE component links where PCM devices need
not be created for CPU component involved in such links.

Signed-off-by: Sameer Pujar 
---
 sound/soc/generic/audio-graph-card.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
index 1e20562..93bddf6 100644
--- a/sound/soc/generic/audio-graph-card.c
+++ b/sound/soc/generic/audio-graph-card.c
@@ -111,6 +111,17 @@ static int graph_get_dai_id(struct device_node *ep)
return id;
 }
 
+static bool soc_component_is_pcm(struct snd_soc_dai_link_component *dlc)
+{
+   struct snd_soc_dai *dai = snd_soc_find_dai(dlc);
+
+   if (dai && (dai->component->driver->pcm_construct ||
+   dai->driver->pcm_new))
+   return true;
+
+   return false;
+}
+
 static int asoc_simple_parse_dai(struct device_node *ep,
 struct snd_soc_dai_link_component *dlc,
 int *is_single_link)
@@ -259,6 +270,16 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv,
if (ret < 0)
goto out_put_node;
 
+   /*
+* In BE<->BE connections it is not required to create
+* PCM devices at CPU end of the dai link and thus 'no_pcm'
+* flag needs to be set. It is useful when there are many
+* BE components and some of these have to be connected to
+* form a valid audio path.
+*/
+   if (!soc_component_is_pcm(cpus))
+   dai_link->no_pcm = 1;
+
/* card->num_links includes Codec */
asoc_simple_canonicalize_cpu(dai_link, is_single_links);
} else {
-- 
2.7.4

[PATCH v2 1/9] ASoC: soc-core: Fix component name_prefix parsing

The "prefix" can be defined in DAI link node or it can be specified as
part of the component node itself. Currently "sound-name-prefix" defined
in a component is not taking effect. Actually the property is not getting
parsed. It can be fixed by parsing "sound-name-prefix" property whenever
"prefix" is missing in DAI link Codec node.

Signed-off-by: Sameer Pujar 
---
 sound/soc/soc-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 2fe1b2ec..ec000fb 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1113,7 +1113,8 @@ static void soc_set_name_prefix(struct snd_soc_card *card,
for (i = 0; i < card->num_configs; i++) {
struct snd_soc_codec_conf *map = &card->codec_conf[i];
 
-   if (snd_soc_is_matching_component(&map->dlc, component)) {
+   if (snd_soc_is_matching_component(&map->dlc, component) &&
+   map->name_prefix) {
component->name_prefix = map->name_prefix;
return;
}
-- 
2.7.4
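
The fallback described in the commit message can be sketched in userspace as follows. The struct, the function and the strings are hypothetical; node_prefix stands in for whatever "sound-name-prefix" parsing on the component node would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one codec_conf entry. */
struct conf_entry {
	const char *component;		/* component this entry applies to */
	const char *name_prefix;	/* NULL when the link node has no "prefix" */
};

/* Use the prefix from a matching conf entry only when it is actually set;
 * otherwise fall back to the prefix taken from the component node. */
static const char *pick_prefix(const struct conf_entry *conf, size_t n,
			       const char *component, const char *node_prefix)
{
	size_t i;

	assert(component);
	for (i = 0; i < n; i++)
		if (strcmp(conf[i].component, component) == 0 &&
		    conf[i].name_prefix)
			return conf[i].name_prefix;

	return node_prefix;
}
```

Without the extra name_prefix check, a matching entry with an empty prefix would end the search early and the component-level property would never be consulted.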

[PATCH v2 5/9] ASoC: dt-bindings: audio-graph-card: Support for component chaining

New compatible "audio-graph-cc-card" is exposed for audio-graph card
driver which allows usage of DAI link chaining and thus connects multiple
components together in a system.

Signed-off-by: Sameer Pujar 
---
 Documentation/devicetree/bindings/sound/audio-graph-card.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/sound/audio-graph-card.txt b/Documentation/devicetree/bindings/sound/audio-graph-card.txt
index d5f6919..8bf2038 100644
--- a/Documentation/devicetree/bindings/sound/audio-graph-card.txt
+++ b/Documentation/devicetree/bindings/sound/audio-graph-card.txt
@@ -27,6 +27,7 @@ Below are same as Simple-Card.
 Required properties:
 
 - compatible   : "audio-graph-card";
+   : "audio-graph-cc-card";
 - dais : list of CPU DAI port{s}
 
 Optional properties:
-- 
2.7.4

[PATCH v2 2/9] ASoC: audio-graph: Use of_node and DAI for DPCM DAI link names

For multiple instances of components, using DAI name alone for DAI links
is causing conflicts. Components can define multiple DAIs and hence using
just a device name won't help either. Thus DT device node reference and
DAI names are used to uniquely represent DAI link names.

Signed-off-by: Sameer Pujar 
---
 sound/soc/generic/audio-graph-card.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c
index 97b4f54..1e20562 100644
--- a/sound/soc/generic/audio-graph-card.c
+++ b/sound/soc/generic/audio-graph-card.c
@@ -253,7 +253,8 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv,
goto out_put_node;
 
ret = asoc_simple_set_dailink_name(dev, dai_link,
-  "fe.%s",
+  "fe.%pOFP.%s",
+  cpus->of_node,
   cpus->dai_name);
if (ret < 0)
goto out_put_node;
@@ -287,7 +288,8 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv,
goto out_put_node;
 
ret = asoc_simple_set_dailink_name(dev, dai_link,
-  "be.%s",
+  "be.%pOFP.%s",
+  codecs->of_node,
   codecs->dai_name);
if (ret < 0)
goto out_put_node;
-- 
2.7.4
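
The naming conflict and its fix can be reproduced with a small userspace sketch. The node strings and DAI name used with it are made up for illustration; node_spec plays the role of what %pOFP prints in the kernel (node name plus unit address), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old scheme: link name built from the DAI name alone. */
static int make_old_name(char *buf, size_t len, const char *kind,
			 const char *dai_name)
{
	assert(buf && kind && dai_name);
	return snprintf(buf, len, "%s.%s", kind, dai_name);
}

/* New scheme: link name built from node spec plus DAI name. */
static int make_link_name(char *buf, size_t len, const char *kind,
			  const char *node_spec, const char *dai_name)
{
	assert(buf && kind && node_spec && dai_name);
	return snprintf(buf, len, "%s.%s.%s", kind, node_spec, dai_name);
}

/* Two links conflict when their names are identical. */
static int names_conflict(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

For two instances of the same component, say "codec@0" and "codec@1" with a DAI called "hifi", the old scheme yields "be.hifi" twice, while the new one yields "be.codec@0.hifi" and "be.codec@1.hifi".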

[PATCH v2 4/9] ASoC: soc-pcm: Get all BEs along DAPM path

dpcm_end_walk_at_be() stops the graph walk when first BE is found for
the given FE component. In a component model we may want to connect
multiple DAIs from different components. A new flag is introduced in
'snd_soc_card', which when set allows DAI/component chaining. Later
PCM operations can be called for all these listed components for a
valid DAPM path.

Signed-off-by: Sameer Pujar 
---
 include/sound/soc.h | 1 +
 sound/soc/soc-pcm.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/sound/soc.h b/include/sound/soc.h
index 5e3919f..e8531be 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -1084,6 +1084,7 @@ struct snd_soc_card {
unsigned int fully_routed:1;
unsigned int disable_route_checks:1;
unsigned int probed:1;
+   unsigned int component_chaining:1;
 
void *drvdata;
 };
diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 00ac1cb..5f1d8d3 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1323,7 +1323,8 @@ int dpcm_path_get(struct snd_soc_pcm_runtime *fe,
 
/* get number of valid DAI paths and their widgets */
paths = snd_soc_dapm_dai_get_connected_widgets(cpu_dai, stream, list,
-   dpcm_end_walk_at_be);
+   fe->card->component_chaining ?
+   NULL : dpcm_end_walk_at_be);
 
dev_dbg(fe->dev, "ASoC: found %d audio %s paths\n", paths,
stream ? "capture" : "playback");
-- 
2.7.4
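
The effect of passing NULL instead of dpcm_end_walk_at_be can be illustrated with a heavily simplified userspace model. The real DAPM graph is not a linear chain and the types below are hypothetical; the sketch only shows how an optional stop callback changes how many BEs a walk collects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy graph node with a single downstream connection, for brevity. */
struct node {
	bool is_be;
	const struct node *next;
};

typedef bool (*stop_fn)(const struct node *n);

/* Stop condition in the spirit of dpcm_end_walk_at_be(). */
static bool end_walk_at_be(const struct node *n)
{
	return n->is_be;
}

/* Count BE nodes reachable from start. When stop is non-NULL, the walk
 * ends at the first node for which it returns true (still counted). */
static int count_bes(const struct node *start, stop_fn stop)
{
	const struct node *n;
	int count = 0;

	for (n = start; n; n = n->next) {
		if (n->is_be)
			count++;
		if (stop && stop(n))
			break;
	}
	return count;
}

/* Same selection as in the patch: no stop condition when chaining. */
static int count_bes_for_card(const struct node *fe, bool component_chaining)
{
	assert(fe);
	return count_bes(fe, component_chaining ? NULL : end_walk_at_be);
}
```

For a chain FE -> BE1 -> BE2, the walk finds one BE without component chaining and both BEs with it.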

Re: [net v3] drivers/net/wan/lapbether: Use needed_headroom instead of hard_header_len

2020-08-04 Thread Martin Schiller


On 2020-08-04 21:20, Xie He wrote:

On Tue, Aug 4, 2020 at 5:43 AM Martin Schiller  wrote:


I'm not an expert in the field, but after reading the commit message and
the previous comments, I'd say that makes sense.


Thanks!


Shouldn't this kernel panic be intercepted by a skb_cow() before the
skb_push() in lapbeth_data_transmit()?


When a skb is passing down a protocol stack for transmission, there
might be several different skb_push calls to prepend different
headers. It would be the best (in terms of performance) if we can
allocate the needed header space in advance, so that we don't need to
reallocate the skb every time a new header needs to be prepended.


Yes, I agree.


Adding skb_cow before these skb_push calls would indeed help
preventing kernel panics, but that might not be the essential issue
here, and it might also prevent us from discovering the real issue. (I
guess this is also the reason skb_cow is not included in skb_push
itself.)


Well, you are right that the panic is "useful" to discover the real
problem. But on the other hand, if it is possible to prevent a panic, I
think we should do so. Maybe with adding a warning, when skb_cow() needs
to reallocate memory.

But this is getting a little bit off topic. For this patch I can say:

LGTM.

Reviewed-by: Martin Schiller
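
The trade-off discussed above (reserving header space in advance versus reallocating before each push) can be illustrated with a toy userspace buffer. This is not the sk_buff API: every name below is hypothetical, toy_push() returns NULL where the kernel would panic, and toy_ensure_headroom() is only similar in spirit to skb_cow().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy buffer: payload sits at data, free space before it is headroom. */
struct toy_buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the payload */
	size_t len;		/* payload length */
	size_t reallocs;	/* how often the buffer had to be copied */
};

static size_t toy_headroom(const struct toy_buf *b)
{
	return (size_t)(b->data - b->head);
}

static int toy_init(struct toy_buf *b, size_t headroom, size_t len)
{
	assert(b && len > 0);
	b->head = calloc(headroom + len, 1);
	if (!b->head)
		return -1;
	b->data = b->head + headroom;
	b->len = len;
	b->reallocs = 0;
	return 0;
}

/* Guarantee at least need bytes of headroom, copying the payload into a
 * new allocation when the current one is too small. */
static int toy_ensure_headroom(struct toy_buf *b, size_t need)
{
	unsigned char *n;

	if (toy_headroom(b) >= need)
		return 0;
	n = malloc(need + b->len);
	if (!n)
		return -1;
	memcpy(n + need, b->data, b->len);
	free(b->head);
	b->head = n;
	b->data = n + need;
	b->reallocs++;
	return 0;
}

/* Prepend hdr_len bytes; fails when the headroom is insufficient. */
static unsigned char *toy_push(struct toy_buf *b, size_t hdr_len)
{
	if (toy_headroom(b) < hdr_len)
		return NULL;
	b->data -= hdr_len;
	b->len += hdr_len;
	return b->data;
}
```

A buffer created with enough headroom up front takes two pushes without any copy, whereas one created with none needs a reallocation before every push, which is the performance argument for advertising the header space in advance.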

[PATCH v7 1/4] scsi: ufs: Add UFS feature related parameter

2020-08-04 Thread Daejun Park

This is a patch for parameters to be used for UFS features layer and HPB
module.

Reviewed-by: Can Guo 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index f8ab16f30fdc..ae557b8d3eba 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -122,6 +122,7 @@ enum flag_idn {
QUERY_FLAG_IDN_WB_EN= 0x0E,
QUERY_FLAG_IDN_WB_BUFF_FLUSH_EN = 0x0F,
QUERY_FLAG_IDN_WB_BUFF_FLUSH_DURING_HIBERN8 = 0x10,
+   QUERY_FLAG_IDN_HPB_RESET= 0x11,
 };
 
 /* Attribute idn for Query requests */
@@ -195,6 +196,9 @@ enum unit_desc_param {
UNIT_DESC_PARAM_PHY_MEM_RSRC_CNT= 0x18,
UNIT_DESC_PARAM_CTX_CAPABILITIES= 0x20,
UNIT_DESC_PARAM_LARGE_UNIT_SIZE_M1  = 0x22,
+   UNIT_DESC_HPB_LU_MAX_ACTIVE_REGIONS = 0x23,
+   UNIT_DESC_HPB_LU_PIN_REGION_START_OFFSET= 0x25,
+   UNIT_DESC_HPB_LU_NUM_PIN_REGIONS= 0x27,
UNIT_DESC_PARAM_WB_BUF_ALLOC_UNITS  = 0x29,
 };
 
@@ -235,6 +239,8 @@ enum device_desc_param {
DEVICE_DESC_PARAM_PSA_MAX_DATA  = 0x25,
DEVICE_DESC_PARAM_PSA_TMT   = 0x29,
DEVICE_DESC_PARAM_PRDCT_REV = 0x2A,
+   DEVICE_DESC_PARAM_HPB_VER   = 0x40,
+   DEVICE_DESC_PARAM_HPB_CONTROL   = 0x42,
DEVICE_DESC_PARAM_EXT_UFS_FEATURE_SUP   = 0x4F,
DEVICE_DESC_PARAM_WB_PRESRV_USRSPC_EN   = 0x53,
DEVICE_DESC_PARAM_WB_TYPE   = 0x54,
@@ -283,6 +289,10 @@ enum geometry_desc_param {
GEOMETRY_DESC_PARAM_ENM4_MAX_NUM_UNITS  = 0x3E,
GEOMETRY_DESC_PARAM_ENM4_CAP_ADJ_FCTR   = 0x42,
GEOMETRY_DESC_PARAM_OPT_LOG_BLK_SIZE= 0x44,
+   GEOMETRY_DESC_HPB_REGION_SIZE   = 0x48,
+   GEOMETRY_DESC_HPB_NUMBER_LU = 0x49,
+   GEOMETRY_DESC_HPB_SUBREGION_SIZE= 0x4A,
+   GEOMETRY_DESC_HPB_DEVICE_MAX_ACTIVE_REGIONS = 0x4B,
GEOMETRY_DESC_PARAM_WB_MAX_ALLOC_UNITS  = 0x4F,
GEOMETRY_DESC_PARAM_WB_MAX_WB_LUNS  = 0x53,
GEOMETRY_DESC_PARAM_WB_BUFF_CAP_ADJ = 0x54,
@@ -327,6 +337,7 @@ enum {
 
 /* Possible values for dExtendedUFSFeaturesSupport */
 enum {
+   UFS_DEV_HPB_SUPPORT = BIT(7),
UFS_DEV_WRITE_BOOSTER_SUP   = BIT(8),
 };
 
@@ -537,6 +548,7 @@ struct ufs_dev_info {
u8 *model;
u16 wspecversion;
u32 clk_gating_wait_us;
+   u8 b_ufs_feature_sup;
u32 d_ext_ufs_feature_sup;
u8 b_wb_buffer_type;
u32 d_wb_alloc_units;
-- 
2.17.1

Re: [PATCH v2 02/18] gpio: uapi: define uAPI v2

2020-08-04 Thread Kent Gibson

On Tue, Aug 04, 2020 at 07:42:34PM +0200, Bartosz Golaszewski wrote:
> On Sat, Jul 25, 2020 at 6:20 AM Kent Gibson  wrote:
> >
> > Add a new version of the uAPI to address existing 32/64-bit alignment
> > issues, add support for debounce and event sequence numbers, and provide
> > some future proofing by adding padding reserved for future use.
> >
> > The alignment issue relates to the gpioevent_data, which packs to different
> > sizes on 32-bit and 64-bit platforms. That creates problems for 32-bit apps
> > running on 64-bit kernels.  The patch addresses that particular issue, and
> > the problem more generally, by adding pad fields that explicitly pad
> > structs out to 64-bit boundaries, so they will pack to the same size now,
> > and even if some of the reserved padding is used for __u64 fields in the
> > future.
> >
> > The lack of future proofing in v1 makes it impossible to, for example,
> > add the debounce feature that is included in v2.
> > The future proofing is addressed by providing reserved padding in all
> > structs for future features.  Specifically, the line request,
> > config, info, info_changed and event structs receive updated versions,
> > and the first three new ioctls.
> >
> > Signed-off-by: Kent Gibson 
> > ---
> 
> Hi Kent,
> 
> Thanks a lot for your work on this. Please see comments below.
> 
> One thing I'd change globally for better readability is to have all
> new symbols marked as v2 - even if they have no counterparts in v1. I
> know libgpiod will wrap it all anyway but I think it's still a good
> way to make our work in user-space easier.
> 

Fair enough.  Oh joy.

> >
> > I haven't added any padding to gpiochip_info, as I haven't seen any calls
> > for new features for the corresponding ioctl, but I'm open to updating that
> > as well.
> >
> > As the majority of the structs and ioctls were being replaced, it seemed
> > opportune to rework some of the other aspects of the uAPI.
> >
> > Firstly, I've reworked the flags field throughout.  v1 has three different
> > flags fields, each with their own separate bit definitions.  In v2 that is
> > collapsed to one.
> >
> > I've also merged the handle and event requests into a single request, the
> > line request, as the two requests were mostly the same, other than the
> > edge detection provided by event requests.  As a byproduct, the v2 uAPI
> > allows for multiple lines producing edge events on the same line handle.
> > This is a new capability as v1 only supports a single line in an event
> > request.
> >
> > This means there are now only two types of file handle to be concerned with,
> > the chip and the line, and it is clearer which ioctls apply to which type
> > of handle.
> >
> > There is also some minor renaming of fields for consistency compared to
> > their v1 counterparts, e.g. offset rather than lineoffset or line_offset,
> > and consumer rather than consumer_label.
> >
> > Additionally, v1 GPIOHANDLES_MAX becomes GPIOLINES_MAX in v2 for clarity,
> > and the gpiohandle_data __u8 array becomes a bitmap gpioline_values.
> >
> > The v2 uAPI is mostly just a reorganisation of v1, so userspace code,
> > particularly libgpiod, should easily port to it.
> >
> 
> I think the info above is worth putting into the commit message.
> Especially the part about merging the two event types.
> 

OK, but I'll rework it a bit to make it more suitable for a commit
message.

> > Changes since v1:
> >  - lower case V1 and V2, except in capitalized names
> >  - hyphenate 32/64-bit
> >  - rename bitmap field to bits
> >  - drop PAD_SIZE consts in favour of hard coded numbers
> >  - sort includes
> >  - change config flags to __u64
> >  - increase padding of gpioline_event
> >  - relocate GPIOLINE_CHANGED enum into v2 section (is common with v1)
> >  - rework config to collapse direction, drive, bias and edge enums back
> >into flags and add optional attributes that can be associated with a
> >subset of the requested lines.
> >
> > Changes since the RFC:
> >  - document the constraints on array sizes to maintain 32/64 alignment
> >  - add sequence numbers to gpioline_event
> >  - use bitmap for values instead of array of __u8
> >  - gpioline_info_v2 contains gpioline_config instead of its composite fields
> >  - provide constants for all array sizes, especially padding
> >  - renamed "GPIOLINE_FLAG_V2_KERNEL" to "GPIOLINE_FLAG_V2_USED"
> >  - renamed "default_values" to "values"
> >  - made gpioline_direction zero based
> >  - document clock used in gpioline_event timestamp
> >  - add event_buffer_size to gpioline_request
> >  - rename debounce to debounce_period
> >  - rename lines to num_lines
> >
> >  include/uapi/linux/gpio.h | 284 --
> >  1 file changed, 270 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/uapi/linux/gpio.h b/include/uapi/linux/gpio.h
> > index 285cc10355b2..3f6db33014f0 100644
> > --- a/include/uapi/linux/gpio.h
> > +++ b/include/uapi/linux/gpio.h
> > @@ -12,10
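
For readers following along, the 32/64-bit packing problem described in the
commit message can be illustrated with a small user-space sketch. The first
struct mirrors the field layout of the v1 gpioevent_data (a 64-bit timestamp
followed by a 32-bit id). The second struct and both struct names are
hypothetical and are not taken from the patch; they only show the effect of
padding out to a 64-bit boundary explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same field layout as the v1 gpioevent_data. The compiler adds tail
 * padding only if the ABI aligns uint64_t to 8 bytes, so sizeof() is
 * typically 16 on x86-64 but 12 on 32-bit x86.
 */
struct event_v1_like {
	uint64_t timestamp;
	uint32_t id;
};

/*
 * Hypothetical fixed variant: the padding is spelled out, so the struct
 * occupies 16 bytes whether uint64_t is 4- or 8-byte aligned.
 */
struct event_padded {
	uint64_t timestamp;
	uint32_t id;
	uint32_t padding; /* reserved, keeps the size a multiple of 8 */
};

/* These hold on any ABI where uint64_t needs at most 8-byte alignment. */
_Static_assert(offsetof(struct event_padded, id) == 8,
	       "id follows the 64-bit timestamp");
_Static_assert(sizeof(struct event_padded) == 16,
	       "explicit padding fixes the size across ABIs");
```

A 32-bit application reading events from a 64-bit kernel sees the mismatch
in the first struct as a different record size; with explicit padding both
sides agree on 16 bytes.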

Re: [PATCH v2 2/2] ASoC: fsl_sai: Refine enable and disable sequence for synchronous mode

2020-08-04 Thread Nicolin Chen

On Wed, Aug 05, 2020 at 01:03:37PM +0800, Shengjiu Wang wrote:
> > Btw, the new fsl_sai_dir_is_synced() can be probably applied to
> > other places with a followup patch.

> Do you mean move it to the beginning of this file?

There are other existing places testing "sync[tx] && !sync[!tx]"
so you may submit another change to replace them. But, yea, will
be a good idea to move that helper function to the top.
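
Roughly what I have in mind, as a stand-alone sketch. The struct, the RX/TX
index values and the helper name below are placeholders for illustration,
and the body is just the open-coded test quoted above; it is not meant to
describe what fsl_sai_dir_is_synced() in the patch actually checks.

```c
#include <stdbool.h>

/* Placeholder direction indices, assumed for this sketch. */
enum { TOY_RX = 0, TOY_TX = 1 };

/* Placeholder for the driver state; only the sync flags matter here. */
struct toy_sai {
	bool synchronous[2]; /* indexed by TOY_RX / TOY_TX */
};

/*
 * The open-coded test "sync[dir] && !sync[!dir]" wrapped in a helper:
 * true when 'dir' is marked synchronous and the opposite direction is not.
 */
static inline bool toy_dir_sync_only(const struct toy_sai *sai, int dir)
{
	return sai->synchronous[dir] && !sai->synchronous[!dir];
}
```

Each place that currently spells out the two-flag test could then call the
helper with the direction it cares about.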

Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

2020-08-04 Thread hpa

On August 4, 2020 10:08:08 PM PDT, Borislav Petkov  wrote:
>On Tue, Aug 04, 2020 at 09:58:25PM -0700, h...@zytor.com wrote:
>> Because why use an alternative to jump over one instruction?
>>
>> I personally would prefer to have the IRET put out of line
>
>Can't yet - SERIALIZE CPUs are a minority at the moment.
>
>> and have the call/jmp replaced by SERIALIZE inline.
>
>Well, we could do:
>
>   alternative_io("... IRET bunch", __ASM_SERIALIZE,
>X86_FEATURE_SERIALIZE, ...);
>
>and avoid all kinds of jumping. Alternatives get padded so there
>would be a couple of NOPs following when SERIALIZE gets patched in
>but it shouldn't be a problem. I guess one needs to look at what gcc
>generates...

I didn't say behind a trap. IRET is a control transfer instruction, and slow, 
so putting it out of line really isn't unreasonable. Can even do a call to a 
common handler.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

2020-08-04 Thread Borislav Petkov

On Tue, Aug 04, 2020 at 09:58:25PM -0700, h...@zytor.com wrote:
> Because why use an alternative to jump over one instruction?
>
> I personally would prefer to have the IRET put out of line

Can't yet - SERIALIZE CPUs are a minority at the moment.

> and have the call/jmp replaced by SERIALIZE inline.

Well, we could do:

alternative_io("... IRET bunch", __ASM_SERIALIZE, 
X86_FEATURE_SERIALIZE, ...);

and avoid all kinds of jumping. Alternatives get padded so there
would be a couple of NOPs following when SERIALIZE gets patched in
but it shouldn't be a problem. I guess one needs to look at what gcc
generates...

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [GIT pull] core/entry for v5.9

2020-08-04 Thread pr-tracker-bot

The pull request you sent on Tue, 04 Aug 2020 08:21:53 -:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> core-entry-2020-08-04

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/d25c8be67481060782d7e8b84bc0d0355922

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT pull] x86/fsgsbase for v5.9

2020-08-04 Thread pr-tracker-bot

The pull request you sent on Tue, 04 Aug 2020 08:21:58 -:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> x86-fsgsbase-2020-08-04

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/4da9f3302615f4191814f826054846bf843e24fa

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT pull] x86/entry for v5.9

2020-08-04 Thread pr-tracker-bot

The pull request you sent on Tue, 04 Aug 2020 08:21:57 -:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-entry-2020-08-04

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/125cfa0d4d143416ae217c26a72003baae93233d

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH v2 2/2] ASoC: fsl_sai: Refine enable and disable sequence for synchronous mode

2020-08-04 Thread Shengjiu Wang

On Wed, Aug 5, 2020 at 12:13 PM Nicolin Chen  wrote:
>
> On Wed, Aug 05, 2020 at 10:23:53AM +0800, Shengjiu Wang wrote:
> > Tx synchronous with Rx:
> > TCSR.TE does not need to be enabled when only Rx is going to be enabled.
> > Check whether RSCR.RE needs to be disabled before disabling TCSR.TE.
> >
> > Rx synchronous with Tx:
> > RCSR.RE does not need to be enabled when only Tx is going to be enabled.
> > Check whether TSCR.RE needs to be disabled before disabling RCSR.TE.
>
> Please add to the commit log more context such as what we have
> discussed: what's the problem of the current driver, and why we
> _have_to_ apply this change though it's slightly against what RM
> recommends.
>
> (If thing is straightforward, it's okay to make the text short.
>  Yet I believe that this change deserves more than these lines.)
>
> One info that you should mention -- also the main reason why I'm
> convinced to add this change: trigger() is still in the shape of
> the early version where we only supported one operation mode --
> Tx synchronous with Rx. So we need an update for other modes.
>
> > Signed-off-by: Shengjiu Wang 
>
> The git-diff part looks good, please add this in next ver.:
>
> Reviewed-by: Nicolin Chen 
>
> Btw, the new fsl_sai_dir_is_synced() can be probably applied to
> other places with a followup patch.
Do you mean move it to the beginning of this file?

best regards
wang shengjiu

Re: [PATCH V4 linux-next 00/12] VDPA support for Mellanox ConnectX devices

2020-08-04 Thread Eli Cohen

On Tue, Aug 04, 2020 at 05:29:09PM -0400, Michael S. Tsirkin wrote:
> On Tue, Aug 04, 2020 at 07:20:36PM +0300, Eli Cohen wrote:
> > Hi Michael,
> > please note that this series depends on mlx5 core device driver patches
> > in mlx5-next branch in
> > git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git.
> 
> Thanks! OK so what's the plan for merging this?
> Do patches at least build well enough that I can push them
> upstream? Or do they have to go on top of the mellanox tree?
> 

The patches are built on your linux-next branch which I updated
yesterday.

I am based on this commit:
776b7b25f10b (origin/linux-next) vhost: add an RPMsg API

On top of that I merged
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git

and after that I have Jason's patches (five patches), then one patch
from Max and then my patches (seven).

It builds fine on x86_64.
I fixed some conflicts on Jason's patches.

I also tested it to verify it's working.

BTW, for some reason I did not get all the patches into my mailbox and I
suspect they were not all sent. Did you get all the series 0-13?

Please let me know, and if needed I'll resend.

> 
> > git pull git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git 
> > mlx5-next 
> > 
> > They also depend Jason Wang's patches: https://lkml.org/lkml/2020/7/1/301
> 
> The ones you included, right?
>

Right.
 
> > Jason, I had to resolve some conflicts so I would appreciate it if you can
> > verify that it is ok.
> > 
> > The following series of patches provide VDPA support for Mellanox
> > devices. The supported devices are ConnectX6 DX and newer.
> > 
> > Currently, only a network driver is implemented; future patches will
> > introduce a block device driver. iperf performance on a single queue is
> > around 12 Gbps.  Future patches will introduce multi queue support.
> > 
> > The files are organized in such a way that code that can be used by
> > different VDPA implementations is placed in a common area residing in
> > drivers/vdpa/mlx5/core.
> > 
> > Only virtual functions are currently supported. Also, certain firmware
> > capabilities must be set to enable the driver. Physical functions (PFs)
> > are skipped by the driver.
> > 
> > To make use of the VDPA net driver, one must load mlx5_vdpa. In such
> > case, VFs will be operated by the VDPA driver. Although one can see a
> > regular instance of a network driver on the VF, the VDPA driver takes
> > precedence over the NIC driver, steering-wise.
> > 
> > Currently, the device/interface infrastructure in mlx5_core is used to
> > probe drivers. Future patches will introduce virtbus as a means to
> > register devices and drivers and VDPA will be adapted to it.
> > 
> > The mlx5 mode of operation required to support VDPA is switchdev mode.
> > One can use Linux or OVS bridge to take care of layer 2 switching.
> > 
> > In order to provide virtio networking to a guest, an updated version of
> > qemu is required. This series has been tested with the following qemu
> > version:
> > 
> > url: https://github.com/jasowang/qemu.git
> > branch: vdpa
> > Commit ID: 6f4e59b807db
> > 
> > 
> > V2->V3
> > Fix makefile to use include path relative to the root of the kernel
> > 
> > V3->V4
> > Rebase Jason's patches on linux-next branch
> > Fix krobot error on mips arch
> > Make use of the free callback to destroy resources on unload
> > Use VIRTIO_F_ACCESS_PLATFORM instead of legacy VIRTIO_F_IOMMU_PLATFORM
> > Add empty implementations for get_vq_notification() and get_vq_irq()
> > 
> > 
> > Eli Cohen (6):
> >   net/vdpa: Use struct for set/get vq state
> >   vdpa: Modify get_vq_state() to return error code
> >   vdpa/mlx5: Add hardware descriptive header file
> >   vdpa/mlx5: Add support library for mlx5 VDPA implementation
> >   vdpa/mlx5: Add shared memory registration code
> >   vdpa/mlx5: Add VDPA driver for supported mlx5 devices
> > 
> > Jason Wang (5):
> >   vhost-vdpa: refine ioctl pre-processing
> >   vhost: generialize backend features setting/getting
> >   vhost-vdpa: support get/set backend features
> >   vhost-vdpa: support IOTLB batching hints
> >   vdpasim: support batch updating
> > 
> > Max Gurtovoy (1):
> >   vdpa: remove hard coded virtq num
> > 
> >  drivers/vdpa/Kconfig   |   19 +
> >  drivers/vdpa/Makefile  |1 +
> >  drivers/vdpa/ifcvf/ifcvf_base.c|4 +-
> >  drivers/vdpa/ifcvf/ifcvf_base.h|4 +-
> >  drivers/vdpa/ifcvf/ifcvf_main.c|   13 +-
> >  drivers/vdpa/mlx5/Makefile |4 +
> >  drivers/vdpa/mlx5/core/mlx5_vdpa.h |   91 ++
> >  drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h |  168 ++
> >  drivers/vdpa/mlx5/core/mr.c|  484 ++
> >  drivers/vdpa/mlx5/core/resources.c |  284 
> >  drivers/vdpa/mlx5/net/main.c   |   76 +
> >  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 1965 
> >  drivers/vdpa/mlx5/net/mlx5_vnet.h  |   24 +
> >  drivers/vdpa/vdpa.c

Re: [PATCH] RAS/CEC: Fix cec_init prototype

2020-08-04 Thread Borislav Petkov

On Tue, Aug 04, 2020 at 06:18:47PM +0200, Luca Stefani wrote:
> * late_initcall expects a function to return an integer

Please write a proper sentence for a commit message.

> Signed-off-by: Luca Stefani 
> ---
>  drivers/ras/cec.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c
> index 569d9ad2c594..e048e0e3949a 100644
> --- a/drivers/ras/cec.c
> +++ b/drivers/ras/cec.c
> @@ -553,20 +553,20 @@ static struct notifier_block cec_nb = {
>   .priority   = MCE_PRIO_CEC,
>  };
>  
> -static void __init cec_init(void)
> +static int __init cec_init(void)
>  {
>   if (ce_arr.disabled)
> - return;
> + return 0;

Why 0?

I'm thinking all the cases when the init doesn't succeed should return
!0...
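
Something along these lines, as a stand-alone sketch rather than a proposed
diff. The toy_cec struct and toy_cec_init() are made up for illustration,
and the particular errno values are my assumption; which ones fit each
failure path in the real cec_init() is up to the patch author.

```c
#include <errno.h>
#include <stdbool.h>

/* Made-up state for illustration; not the real ce_arr. */
struct toy_cec {
	bool disabled;    /* feature switched off */
	bool alloc_fails; /* simulates a failed allocation */
};

/*
 * Initcall-style init: 0 on success, a negative errno on every path
 * where the init does not succeed.
 */
static int toy_cec_init(const struct toy_cec *c)
{
	if (c->disabled)
		return -ENODEV; /* assumed errno for "not available" */

	if (c->alloc_fails)
		return -ENOMEM;

	return 0;
}
```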

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

2020-08-04 Thread hpa

On August 4, 2020 9:48:40 PM PDT, Borislav Petkov  wrote:
>On Tue, Aug 04, 2020 at 07:10:59PM -0700, Ricardo Neri wrote:
>> The SERIALIZE instruction gives software a way to force the processor to
>> complete all modifications to flags, registers and memory from previous
>> instructions and drain all buffered writes to memory before the next
>> instruction is fetched and executed. Thus, it serves the purpose of
>> sync_core(). Use it when available.
>> 
>> Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
>> invariance in alternatives. The iret-to-self does not comply with such
>> invariance. Thus, it cannot be used inside alternative code. Instead, use
>> an alternative that jumps to SERIALIZE when available.
>> 
>> Cc: Andy Lutomirski 
>> Cc: Cathy Zhang 
>> Cc: Dave Hansen 
>> Cc: Fenghua Yu 
>> Cc: "H. Peter Anvin" 
>> Cc: Kyung Min Park 
>> Cc: Peter Zijlstra 
>> Cc: "Ravi V. Shankar" 
>> Cc: Sean Christopherson 
>> Cc: linux-e...@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Suggested-by: Andy Lutomirski 
>> Signed-off-by: Ricardo Neri 
>> ---
>> This is a v2 from my initial submission [1]. The first three patches of
>> the series have been merged in Linus' tree. Hence, I am submitting only
>> this patch for review.
>> 
>> [1]. https://lkml.org/lkml/2020/7/27/8
>> 
>> Changes since v1:
>>  * Support SERIALIZE using alternative runtime patching.
>>(Peter Zijlstra, H. Peter Anvin)
>>  * Added a note to specify which version of binutils supports SERIALIZE.
>>(Peter Zijlstra)
>>  * Verified that (::: "memory") is used. (H. Peter Anvin)
>> ---
>>  arch/x86/include/asm/special_insns.h |  2 ++
>>  arch/x86/include/asm/sync_core.h | 10 +-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
>> index 59a3e13204c3..25cd67801dda 100644
>> --- a/arch/x86/include/asm/special_insns.h
>> +++ b/arch/x86/include/asm/special_insns.h
>> @@ -10,6 +10,8 @@
>>  #include 
>>  #include 
>>  
>> +/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
>> +#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
>>  /*
>>   * Volatile isn't enough to prevent the compiler from reordering the
>>   * read/write functions for the control registers and messing everything up.
>> diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
>> index fdb5b356e59b..201ea3d9a6bd 100644
>> --- a/arch/x86/include/asm/sync_core.h
>> +++ b/arch/x86/include/asm/sync_core.h
>> @@ -5,15 +5,19 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #ifdef CONFIG_X86_32
>>  static inline void iret_to_self(void)
>>  {
>>  asm volatile (
>> +ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>>  "pushfl\n\t"
>>  "pushl %%cs\n\t"
>>  "pushl $1f\n\t"
>>  "iret\n\t"
>> +"2:\n\t"
>> +__ASM_SERIALIZE "\n"
>>  "1:"
>>  : ASM_CALL_CONSTRAINT : : "memory");
>>  }
>> @@ -23,6 +27,7 @@ static inline void iret_to_self(void)
>>  unsigned int tmp;
>>  
>>  asm volatile (
>> +ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>
>Why is this and above stuck inside the asm statement?
>
>Why can't you simply do:
>
>   if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
>   asm volatile(__ASM_SERIALIZE ::: "memory");
>   return;
>   }
>
>on function entry instead of making it more unreadable for no particular
>reason?

Because why use an alternative to jump over one instruction?

I personally would prefer to have the IRET put out of line and have the 
call/jmp replaced by SERIALIZE inline.
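
For anyone wanting to experiment outside the kernel, below is a user-space
sketch of the "use SERIALIZE when the CPU has it" idea. It is not the
kernel code: it uses a plain runtime CPUID check instead of
static_cpu_has() or alternatives patching, the function names are made up,
and the CPUID fallback is an assumption chosen for the sketch only (the
kernel's fallback is the iret-to-self sequence). The feature bit is
CPUID.(EAX=7,ECX=0):EDX[14], and the byte sequence is the same one as
__ASM_SERIALIZE in the patch.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* SERIALIZE is enumerated by CPUID.(EAX=7,ECX=0):EDX bit 14. */
static bool cpu_has_serialize(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return (edx >> 14) & 1;
}

/* Made-up name; a user-space stand-in for the sync_core() idea. */
static void toy_sync_core(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (cpu_has_serialize()) {
		/* Raw encoding, so old assemblers can still build this. */
		__asm__ volatile(".byte 0x0f, 0x01, 0xe8" ::: "memory");
		return;
	}

	/* Sketch-only fallback: CPUID is also a serializing instruction. */
	__asm__ volatile("cpuid"
			 : "=a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
			 : "a"(0), "c"(0)
			 : "memory");
	(void)eax; (void)ebx; (void)ecx; (void)edx;
}
#else
/* Non-x86 builds: no SERIALIZE; a memory barrier is only a placeholder. */
static bool cpu_has_serialize(void)
{
	return false;
}

static void toy_sync_core(void)
{
	__sync_synchronize();
}
#endif
```

The runtime branch here is what the alternatives machinery removes in the
kernel by patching the instruction stream once at boot.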

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[PATCH 3/3] dt-bindings: serial: Convert NXP lpuart to json-schema

Convert the NXP lpuart binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../devicetree/bindings/serial/fsl-lpuart.txt  | 43 
 .../devicetree/bindings/serial/fsl-lpuart.yaml | 79 ++
 2 files changed, 79 insertions(+), 43 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/serial/fsl-lpuart.txt
 create mode 100644 Documentation/devicetree/bindings/serial/fsl-lpuart.yaml

diff --git a/Documentation/devicetree/bindings/serial/fsl-lpuart.txt b/Documentation/devicetree/bindings/serial/fsl-lpuart.txt
deleted file mode 100644
index e7448b9..000
--- a/Documentation/devicetree/bindings/serial/fsl-lpuart.txt
+++ /dev/null
@@ -1,43 +0,0 @@
-* Freescale low power universal asynchronous receiver/transmitter (lpuart)
-
-Required properties:
-- compatible :
-  - "fsl,vf610-lpuart" for lpuart compatible with the one integrated
-on Vybrid vf610 SoC with 8-bit register organization
-  - "fsl,ls1021a-lpuart" for lpuart compatible with the one integrated
-on LS1021A SoC with 32-bit big-endian register organization
-  - "fsl,ls1028a-lpuart" for lpuart compatible with the one integrated
-on LS1028A SoC with 32-bit little-endian register organization
-  - "fsl,imx7ulp-lpuart" for lpuart compatible with the one integrated
-on i.MX7ULP SoC with 32-bit little-endian register organization
-  - "fsl,imx8qxp-lpuart" for lpuart compatible with the one integrated
-on i.MX8QXP SoC with 32-bit little-endian register organization
-  - "fsl,imx8qm-lpuart" for lpuart compatible with the one integrated
-on i.MX8QM SoC with 32-bit little-endian register organization
-- reg : Address and length of the register set for the device
-- interrupts : Should contain uart interrupt
-- clocks : phandle + clock specifier pairs, one for each entry in clock-names
-- clock-names : For vf610/ls1021a/ls1028a/imx7ulp, "ipg" clock is for uart
-  bus/baud clock. For imx8qxp lpuart, "ipg" clock is bus clock that is used
-  to access lpuart controller registers, it also requires "baud" clock for
-  module to receive/transmit data.
-
-Optional properties:
-- dmas: A list of two dma specifiers, one for each entry in dma-names.
-- dma-names: should contain "tx" and "rx".
-- rs485-rts-active-low, linux,rs485-enabled-at-boot-time: see rs485.txt
-
-Note: Optional properties for DMA support. Write them both or both not.
-
-Example:
-
-uart0: serial@40027000 {
-   compatible = "fsl,vf610-lpuart";
-   reg = <0x40027000 0x1000>;
-   interrupts = <0 61 0x00>;
-   clocks = < VF610_CLK_UART0>;
-   clock-names = "ipg";
-   dmas = < 0 2>,
-   < 0 3>;
-   dma-names = "rx","tx";
-   };
diff --git a/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml b/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml
new file mode 100644
index 000..1b955f3
--- /dev/null
+++ b/Documentation/devicetree/bindings/serial/fsl-lpuart.yaml
@@ -0,0 +1,79 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/serial/fsl-lpuart.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale low power universal asynchronous receiver/transmitter (lpuart)
+
+maintainers:
+  - Kumar Gala 
+
+allOf:
+  - $ref: "rs485.yaml"
+
+properties:
+  compatible:
+enum:
+  - fsl,vf610-lpuart
+  - fsl,ls1021a-lpuart
+  - fsl,ls1028a-lpuart
+  - fsl,imx7ulp-lpuart
+  - fsl,imx8qxp-lpuart
+  - fsl,imx8qm-lpuart
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+items:
+  - description: ipg clock
+  - description: baud clock
+minItems: 1
+maxItems: 2
+
+  clock-names:
+items:
+  - const: ipg
+  - const: baud
+minItems: 1
+maxItems: 2
+
+  dmas:
+items:
+  - description: DMA controller phandle and request line for RX
+  - description: DMA controller phandle and request line for TX
+
+  dma-names:
+items:
+  - const: rx
+  - const: tx
+
+  rs485-rts-active-low: true
+  linux,rs485-enabled-at-boot-time: true
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+
+serial@40027000 {
+compatible = "fsl,vf610-lpuart";
+reg = <0x40027000 0x1000>;
+interrupts = <0 61 0x00>;
+clocks = < VF610_CLK_UART0>;
+clock-names = "ipg";
+dmas = < 0 2>, < 0 3>;
+dma-names = "rx","tx";
+};
-- 
2.7.4

[PATCH 1/3] dt-bindings: serial: Convert i.MX uart to json-schema

Convert the i.MX uart binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../devicetree/bindings/serial/fsl-imx-uart.txt| 40 --
 .../devicetree/bindings/serial/fsl-imx-uart.yaml   | 92 ++
 2 files changed, 92 insertions(+), 40 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/serial/fsl-imx-uart.txt
 create mode 100644 Documentation/devicetree/bindings/serial/fsl-imx-uart.yaml

diff --git a/Documentation/devicetree/bindings/serial/fsl-imx-uart.txt b/Documentation/devicetree/bindings/serial/fsl-imx-uart.txt
deleted file mode 100644
index 9582fc2..000
--- a/Documentation/devicetree/bindings/serial/fsl-imx-uart.txt
+++ /dev/null
@@ -1,40 +0,0 @@
-* Freescale i.MX Universal Asynchronous Receiver/Transmitter (UART)
-
-Required properties:
-- compatible : Should be "fsl,-uart"
-- reg : Address and length of the register set for the device
-- interrupts : Should contain uart interrupt
-
-Optional properties:
-- fsl,dte-mode : Indicate the uart works in DTE mode. The uart works
-  in DCE mode by default.
-- fsl,inverted-tx , fsl,inverted-rx : Indicate that the hardware attached
-  to the peripheral inverts the signal transmitted or received,
-  respectively, and that the peripheral should invert its output/input
-  using the INVT/INVR registers.
-- rs485-rts-delay, rs485-rts-active-low, rs485-rx-during-tx,
-  linux,rs485-enabled-at-boot-time: see rs485.txt. Note that for RS485
-  you must enable either the "uart-has-rtscts" or the "rts-gpios"
-  properties. In case you use "uart-has-rtscts" the signal that controls
-  the transceiver is actually CTS_B, not RTS_B. CTS_B is always output,
-  and RTS_B is input, regardless of dte-mode.
-
-Please check Documentation/devicetree/bindings/serial/serial.yaml
-for the complete list of generic properties.
-
-Note: Each uart controller should have an alias correctly numbered
-in "aliases" node.
-
-Example:
-
-aliases {
-   serial0 = 
-};
-
-uart1: serial@73fbc000 {
-   compatible = "fsl,imx51-uart", "fsl,imx21-uart";
-   reg = <0x73fbc000 0x4000>;
-   interrupts = <31>;
-   uart-has-rtscts;
-   fsl,dte-mode;
-};
diff --git a/Documentation/devicetree/bindings/serial/fsl-imx-uart.yaml b/Documentation/devicetree/bindings/serial/fsl-imx-uart.yaml
new file mode 100644
index 000..cba3f83
--- /dev/null
+++ b/Documentation/devicetree/bindings/serial/fsl-imx-uart.yaml
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/serial/fsl-imx-uart.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX Universal Asynchronous Receiver/Transmitter (UART)
+
+maintainers:
+  - Fabio Estevam 
+
+allOf:
+  - $ref: "serial.yaml"
+  - $ref: "rs485.yaml"
+
+properties:
+  compatible:
+oneOf:
+  - const: fsl,imx1-uart
+  - const: fsl,imx21-uart
+  - const: fsl,imx53-uart
+  - const: fsl,imx6q-uart
+  - items:
+  - enum:
+- fsl,imx25-uart
+- fsl,imx27-uart
+- fsl,imx31-uart
+- fsl,imx35-uart
+- fsl,imx50-uart
+- fsl,imx51-uart
+  - const: fsl,imx21-uart
+  - items:
+  - enum:
+- fsl,imx6sl-uart
+- fsl,imx6sll-uart
+- fsl,imx6sx-uart
+- fsl,imx6ul-uart
+- fsl,imx7d-uart
+  - const: fsl,imx6q-uart
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  fsl,dte-mode:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicate the uart works in DTE mode. The uart works in DCE mode by default.
+
+  fsl,inverted-tx:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicate that the hardware attached to the peripheral inverts the signal
+  transmitted, and that the peripheral should invert its output using the
+  INVT registers.
+
+  fsl,inverted-rx:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicate that the hardware attached to the peripheral inverts the signal
+  received, and that the peripheral should invert its input using the
+  INVR registers.
+
+  uart-has-rtscts: true
+
+  rs485-rts-delay: true
+  rs485-rts-active-low: true
+  rs485-rx-during-tx: true
+  linux,rs485-enabled-at-boot-time: true
+
+required:
+  - compatible
+  - reg
+  - interrupts
+
+unevaluatedProperties: false
+
+examples:
+  - |
+aliases {
+serial0 = 
+};
+
+uart1: serial@73fbc000 {
+compatible = "fsl,imx51-uart", "fsl,imx21-uart";
+reg = <0x73fbc000 0x4000>;
+interrupts = <31>;
+uart-has-rtscts;
+fsl,dte-mode;
+};
-- 
2.7.4

[PATCH 2/3] dt-bindings: serial: Convert MXS auart to json-schema

Convert the MXS auart binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../devicetree/bindings/serial/fsl-mxs-auart.txt   | 53 
 .../devicetree/bindings/serial/fsl-mxs-auart.yaml  | 93 ++
 2 files changed, 93 insertions(+), 53 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/serial/fsl-mxs-auart.txt
 create mode 100644 Documentation/devicetree/bindings/serial/fsl-mxs-auart.yaml

diff --git a/Documentation/devicetree/bindings/serial/fsl-mxs-auart.txt b/Documentation/devicetree/bindings/serial/fsl-mxs-auart.txt
deleted file mode 100644
index 5c96d41..000
--- a/Documentation/devicetree/bindings/serial/fsl-mxs-auart.txt
+++ /dev/null
@@ -1,53 +0,0 @@
-* Freescale MXS Application UART (AUART)
-
-Required properties for all SoCs:
-- compatible : Should be one of fallowing variants:
-   "fsl,imx23-auart" - Freescale i.MX23
-   "fsl,imx28-auart" - Freescale i.MX28
-   "alphascale,asm9260-auart" - Alphascale ASM9260
-- reg : Address and length of the register set for the device
-- interrupts : Should contain the auart interrupt numbers
-- dmas: DMA specifier, consisting of a phandle to DMA controller node
-  and AUART DMA channel ID.
-  Refer to dma.txt and fsl-mxs-dma.txt for details.
-- dma-names: "rx" for RX channel, "tx" for TX channel.
-
-Required properties for "alphascale,asm9260-auart":
-- clocks : the clocks feeding the watchdog timer. See clock-bindings.txt
-- clock-names : should be set to
-   "mod" - source for tick counter.
-   "ahb" - ahb gate.
-
-Optional properties:
-- uart-has-rtscts : Indicate the UART has RTS and CTS lines
-  for hardware flow control,
-   it also means you enable the DMA support for this UART.
-- {rts,cts,dtr,dsr,rng,dcd}-gpios: specify a GPIO for RTS/CTS/DTR/DSR/RI/DCD
-  line respectively. It will use specified PIO instead of the peripheral
-  function pin for the USART feature.
-  If unsure, don't specify this property.
-
-Example:
-auart0: serial@8006a000 {
-   compatible = "fsl,imx28-auart", "fsl,imx23-auart";
-   reg = <0x8006a000 0x2000>;
-   interrupts = <112>;
-   dmas = <_apbx 8>, <_apbx 9>;
-   dma-names = "rx", "tx";
-   cts-gpios = < 15 GPIO_ACTIVE_LOW>;
-   dsr-gpios = < 16 GPIO_ACTIVE_LOW>;
-   dcd-gpios = < 17 GPIO_ACTIVE_LOW>;
-};
-
-Note: Each auart port should have an alias correctly numbered in "aliases"
-node.
-
-Example:
-
-aliases {
-   serial0 = 
-   serial1 = 
-   serial2 = 
-   serial3 = 
-   serial4 = 
-};
diff --git a/Documentation/devicetree/bindings/serial/fsl-mxs-auart.yaml 
b/Documentation/devicetree/bindings/serial/fsl-mxs-auart.yaml
new file mode 100644
index 000..096ef05
--- /dev/null
+++ b/Documentation/devicetree/bindings/serial/fsl-mxs-auart.yaml
@@ -0,0 +1,93 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/serial/fsl-mxs-auart.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale MXS Application UART (AUART)
+
+maintainers:
+  - Kumar Gala 
+
+allOf:
+  - $ref: "serial.yaml"
+
+properties:
+  compatible:
+    enum:
+      - fsl,imx23-auart
+      - fsl,imx28-auart
+      - alphascale,asm9260-auart
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  dmas:
+    items:
+      - description: DMA controller phandle and request line for RX
+      - description: DMA controller phandle and request line for TX
+
+  dma-names:
+    items:
+      - const: rx
+      - const: tx
+
+  clocks:
+    items:
+      - description: mod clock
+      - description: ahb clock
+
+  clock-names:
+    items:
+      - const: mod
+      - const: ahb
+
+  uart-has-rtscts: true
+  rts-gpios: true
+  cts-gpios: true
+  dtr-gpios: true
+  dsr-gpios: true
+  rng-gpios: true
+  dcd-gpios: true
+
+if:
+  properties:
+    compatible:
+      contains:
+        enum:
+          - alphascale,asm9260-auart
+then:
+  required:
+    - clocks
+    - clock-names
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - dmas
+  - dma-names
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include 
+
+    aliases {
+        serial0 = &auart0;
+    };
+
+    auart0: serial@8006a000 {
+        compatible = "fsl,imx28-auart";
+        reg = <0x8006a000 0x2000>;
+        interrupts = <112>;
+        dmas = <_apbx 8>, <_apbx 9>;
+        dma-names = "rx", "tx";
+        cts-gpios = < 15 GPIO_ACTIVE_LOW>;
+        dsr-gpios = < 16 GPIO_ACTIVE_LOW>;
+        dcd-gpios = < 17 GPIO_ACTIVE_LOW>;
+    };
-- 
2.7.4

Re: [PATCH] regulator: Avoid grabbing regulator lock during suspend/resume

2020-08-04 Thread Doug Anderson

Hi,

On Tue, Aug 4, 2020 at 12:08 AM Stephen Boyd  wrote:
>
> I see it takes about 5us per regulator to grab the lock, check that this
> regulator isn't going to do anything for suspend, and then release the
> lock. When that is combined with PMICs that have dozens of regulators we
> get into a state where we spend a few milliseconds doing a bunch of
> locking operations synchronously to figure out that there's nothing to
> do. Let's reorganize the code here a bit so that we don't grab the lock
> until we're actually going to do something so that suspend is a little
> faster.
>
> Cc: Matthias Kaehlcke 
> Cc: Douglas Anderson 
> Signed-off-by: Stephen Boyd 
> ---
>  drivers/regulator/core.c | 75 +++-
>  1 file changed, 51 insertions(+), 24 deletions(-)

Looks good to me.  Agreed that getting a pointer to the relevant
"struct regulator_state" and checking some details about it and our
ops should be safe to do without a lock.  Patch looks clean and
correct.

Reviewed-by: Douglas Anderson
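
For readers following along, the idea under review (decide whether there is any suspend work before taking the lock) can be sketched in user space as below. This is only an illustration under stated assumptions: the demo_ types and fields are hypothetical stand-ins, a pthread mutex replaces the regulator lock, and none of it is the code in drivers/regulator/core.c.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-state data; not the kernel's struct. */
struct demo_state {
	bool changeable;	/* does this state need any suspend work? */
};

/* Hypothetical stand-in for a regulator device. */
struct demo_regulator {
	pthread_mutex_t lock;
	const struct demo_state *suspend_state;	/* assumed fixed after init */
	unsigned int lock_count;		/* how often the lock was taken */
	unsigned int applied;			/* how often work was done */
};

/* Returns true only if suspend work was performed under the lock. */
static bool demo_suspend(struct demo_regulator *r)
{
	const struct demo_state *s = r->suspend_state;

	/* Lock-free early exit: nothing read here changes at runtime. */
	if (s == NULL || !s->changeable)
		return false;

	pthread_mutex_lock(&r->lock);
	r->lock_count++;
	r->applied++;
	pthread_mutex_unlock(&r->lock);
	return true;
}
```

With dozens of idle regulators, the early return means the mutex is never touched for them, which is where the saving described in the commit message would come from.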

Re: [PATCH v2 3/3] ext4: add needed parameter to ext4_mb_discard_preallocations trace

2020-08-04 Thread Andreas Dilger

On Aug 4, 2020, at 7:02 PM, brookxu  wrote:
> 
> Add the needed value to ext4_mb_discard_preallocations trace, so
> we can more easily observe the requested number of trim.
> 
> Signed-off-by: Chunguang Xu 

IMHO, this should be part of the previous patch that is changing the
API for ext4_discard_preallocations().

Cheers, Andreas

> ---
>  include/trace/events/ext4.h | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
> index cc41d69..61736d8 100644
> --- a/include/trace/events/ext4.h
> +++ b/include/trace/events/ext4.h
> @@ -746,24 +746,26 @@
>  );
> 
>  TRACE_EVENT(ext4_discard_preallocations,
> -TP_PROTO(struct inode *inode),
> +TP_PROTO(struct inode *inode, unsigned int needed),
> 
> -TP_ARGS(inode),
> +TP_ARGS(inode, needed),
> 
>  TP_STRUCT__entry(
> -__field(dev_t,dev)
> -__field(ino_t,ino)
> +__field(dev_t,dev)
> +__field(ino_t,ino)
> +__field(unsigned int,needed)
> 
>  ),
> 
>  TP_fast_assign(
>  __entry->dev= inode->i_sb->s_dev;
>  __entry->ino= inode->i_ino;
> +__entry->needed= needed;
>  ),
> 
> -TP_printk("dev %d,%d ino %lu",
> +TP_printk("dev %d,%d ino %lu needed %u",
>MAJOR(__entry->dev), MINOR(__entry->dev),
> -  (unsigned long) __entry->ino)
> +  (unsigned long) __entry->ino, __entry->needed)
>  );
> 
>  TRACE_EVENT(ext4_mb_discard_preallocations,
> 
> --
> 1.8.3.1
> 



Re: [PATCH] Userfaultfd: Avoid double free of userfault_ctx and remove O_CLOEXEC

2020-08-04 Thread Lokesh Gidra

On Tue, Aug 4, 2020 at 9:08 PM Eric Biggers  wrote:
>
> On Wed, Aug 05, 2020 at 01:47:58PM +1000, Aleksa Sarai wrote:
> > On 2020-08-04, Lokesh Gidra  wrote:
> > > when get_unused_fd_flags returns error, ctx will be freed by
> > > userfaultfd's release function, which is indirectly called by fput().
> > > Also, if anon_inode_getfile_secure() returns an error, then
> > > userfaultfd_ctx_put() is called, which calls mmdrop() and frees ctx.
> > >
> > > Also, the O_CLOEXEC was inadvertently added to the call to
> > > get_unused_fd_flags() [1].
> >
> > I disagree that it is "wrong" to do O_CLOEXEC-by-default (after all,
> > it's trivial to disable O_CLOEXEC, but it's non-trivial to enable it on
> > an existing file descriptor because it's possible for another thread to
> > exec() before you set the flag). Several new syscalls and fd-returning
> > facilities are O_CLOEXEC-by-default now (the most obvious being pidfds
> > and seccomp notifier fds).
>
> Sure, O_CLOEXEC *should* be the default, but this is an existing syscall so it
> has to keep the existing behavior.
>
> > At the very least there should be a new flag added that sets O_CLOEXEC.
>
> There already is one (but these patches broke it).
>
I looked at the existing implementation, and the right thing is to
pass on the 'flags' (that is passed in to the syscall) to fetch 'fd'.

Besides, as you said in the other email thread,
anon_inode_getfile_secure() should be replaced with
anon_inode_getfd_secure(), which will remove this ambiguity.

I'll resend the patch series soon with all the changes that you proposed.
> - Eric
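
The race mentioned earlier in the thread (another thread may exec() before the flag is set) is the usual argument for requesting close-on-exec at creation time. A minimal user-space sketch, assuming a plain open() on a path rather than userfaultfd itself, and with hypothetical demo_ helper names:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if fd currently has the close-on-exec flag set. */
static bool demo_is_cloexec(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);

	return fdflags != -1 && (fdflags & FD_CLOEXEC) != 0;
}

/* Atomic: the flag is set by the same call that creates the descriptor. */
static int demo_open_atomic(const char *path)
{
	return open(path, O_RDONLY | O_CLOEXEC);
}

/*
 * Two steps: in a threaded program another thread may fork() and exec()
 * between open() and fcntl(), leaking the descriptor into the new image.
 */
static int demo_open_then_set(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd >= 0 && fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Both helpers end up with FD_CLOEXEC set; only the first one does so without a window in which the descriptor is inheritable.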

Re: [External] linux-next: build warning after merge of the ftrace tree

2020-08-04 Thread Muchun Song

On Wed, Aug 5, 2020 at 12:21 PM Stephen Rothwell  wrote:
>
> Hi all,
>
> After merging the ftrace tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
>
> kernel/kprobes.c: In function 'kill_kprobe':
> kernel/kprobes.c:1116:33: warning: statement with no effect [-Wunused-value]
>  1116 | #define disarm_kprobe_ftrace(p) (-ENODEV)
>   | ^
> kernel/kprobes.c:2154:3: note: in expansion of macro 'disarm_kprobe_ftrace'
>  2154 |   disarm_kprobe_ftrace(p);
>   |   ^~~~
>

Sorry, maybe we should rework the macro of disarm_kprobe_ftrace to an
inline function like below.

-#define disarm_kprobe_ftrace(p)(-ENODEV)
+static inline int disarm_kprobe_ftrace(struct kprobe *p)
+{
+   return -ENODEV;
+}
 #endif
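
To illustrate why the inline function avoids the warning, here is a small standalone sketch. The demo_ names and the placeholder struct are hypothetical and only model the stub case; this is not the code in kernel/kprobes.c.

```c
#include <errno.h>

/* Hypothetical placeholder; the real struct kprobe lives in the kernel. */
struct demo_kprobe {
	int unused;
};

/*
 * As a macro, the statement "demo_disarm_macro(p);" expands to
 * "(-ENODEV);", which is an expression statement with no effect.
 */
#define demo_disarm_macro(p) (-ENODEV)

/* As a function, ignoring the result is an ordinary discarded call. */
static inline int demo_disarm_inline(struct demo_kprobe *p)
{
	(void)p;
	return -ENODEV;
}

static int demo_kill(struct demo_kprobe *p)
{
	demo_disarm_inline(p);		/* result ignored, no -Wunused-value */
	return demo_disarm_macro(p);	/* the macro is fine when its value is used */
}
```

The inline form also type-checks its argument, which the macro form does not.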

> Introduced by commit
>
>   0cb2f1372baa ("kprobes: Fix NULL pointer dereference at 
> kprobe_ftrace_handler")
>
> --
> Cheers,
> Stephen Rothwell



-- 
Yours,
Muchun

Re: [PATCH v2 2/3] ext4: limit the length of per-inode prealloc list

2020-08-04 Thread Andreas Dilger

On Aug 4, 2020, at 7:02 PM, brookxu  wrote:
> 
> In the scenario of writing sparse files, the Per-inode prealloc list may
> be very long, resulting in high overhead for ext4_mb_use_preallocated().
> To circumvent this problem, we limit the maximum length of per-inode
> prealloc list to 512 and allow users to modify it.
> 
> Signed-off-by: Chunguang Xu 

Do you have any kind of measurements that show the benefit of this patch?
For example performance improvement, memory or CPU usage before and after?
How long is "very long"?

Cheers, Andreas

> ---
>  fs/ext4/ext4.h|  3 ++-
>  fs/ext4/extents.c | 10 -
>  fs/ext4/file.c|  2 +-
>  fs/ext4/indirect.c|  2 +-
>  fs/ext4/inode.c   |  6 +++---
>  fs/ext4/ioctl.c   |  2 +-
>  fs/ext4/mballoc.c | 57 
> +++
>  fs/ext4/mballoc.h |  4 
>  fs/ext4/move_extent.c |  4 ++--
>  fs/ext4/super.c   |  2 +-
>  fs/ext4/sysfs.c   |  2 ++
>  11 files changed, 75 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 42f5060..68e0ebe 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1501,6 +1501,7 @@ struct ext4_sb_info {
>  unsigned int s_mb_stats;
>  unsigned int s_mb_order2_reqs;
>  unsigned int s_mb_group_prealloc;
> +unsigned int s_mb_max_inode_prealloc;
>  unsigned int s_max_dir_size_kb;
>  /* where last allocation was done - for stream allocation */
>  unsigned long s_mb_last_group;
> @@ -2651,7 +2652,7 @@ extern int ext4_init_inode_table(struct super_block *sb,
>  extern ext4_fsblk_t ext4_mb_new_blocks(handle_t *,
>  struct ext4_allocation_request *, int *);
>  extern int ext4_mb_reserve_blocks(struct super_block *, int);
> -extern void ext4_discard_preallocations(struct inode *);
> +extern void ext4_discard_preallocations(struct inode *, unsigned int);
>  extern int __init ext4_init_mballoc(void);
>  extern void ext4_exit_mballoc(void);
>  extern void ext4_free_blocks(handle_t *handle, struct inode *inode,
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 221f240..a40f928 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -100,7 +100,7 @@ static int ext4_ext_trunc_restart_fn(struct inode *inode, 
> int *dropped)
>   * i_mutex. So we can safely drop the i_data_sem here.
>   */
>  BUG_ON(EXT4_JOURNAL(inode) == NULL);
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
>  up_write(&EXT4_I(inode)->i_data_sem);
>  *dropped = 1;
>  return 0;
> @@ -4272,7 +4272,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode 
> *inode,
>   * not a good idea to call discard here directly,
>   * but otherwise we'd need to call it every free().
>   */
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
>  if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
>  fb_flags = EXT4_FREE_BLOCKS_NO_QUOT_UPDATE;
>  ext4_free_blocks(handle, inode, NULL, newblock,
> @@ -5299,7 +5299,7 @@ static int ext4_collapse_range(struct inode *inode, 
> loff_t offset, loff_t len)
>  }
> 
>  down_write(&EXT4_I(inode)->i_data_sem);
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
> 
>  ret = ext4_es_remove_extent(inode, punch_start,
>  EXT_MAX_BLOCKS - punch_start);
> @@ -5313,7 +5313,7 @@ static int ext4_collapse_range(struct inode *inode, 
> loff_t offset, loff_t len)
>  up_write(&EXT4_I(inode)->i_data_sem);
>  goto out_stop;
>  }
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
> 
>  ret = ext4_ext_shift_extents(inode, handle, punch_stop,
>   punch_stop - punch_start, SHIFT_LEFT);
> @@ -5445,7 +5445,7 @@ static int ext4_insert_range(struct inode *inode, 
> loff_t offset, loff_t len)
>  goto out_stop;
> 
>  down_write(&EXT4_I(inode)->i_data_sem);
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
> 
>  path = ext4_find_extent(inode, offset_lblk, NULL, 0);
>  if (IS_ERR(path)) {
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 2a01e31..e3ab8ea 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -148,7 +148,7 @@ static int ext4_release_file(struct inode *inode, struct 
> file *filp)
>  !EXT4_I(inode)->i_reserved_data_blocks)
>  {
>  down_write(&EXT4_I(inode)->i_data_sem);
> -ext4_discard_preallocations(inode);
> +ext4_discard_preallocations(inode, 0);
>  up_write(&EXT4_I(inode)->i_data_sem);
>  }
>  if (is_dx(inode) && filp->private_data)
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index be2b66e..ec6b930 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -696,7 +696,7 @@ static int ext4_ind_trunc_restart_fn(handle_t *handle, 
> struct

Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

2020-08-04 Thread Borislav Petkov

On Tue, Aug 04, 2020 at 07:10:59PM -0700, Ricardo Neri wrote:
> The SERIALIZE instruction gives software a way to force the processor to
> complete all modifications to flags, registers and memory from previous
> instructions and drain all buffered writes to memory before the next
> instruction is fetched and executed. Thus, it serves the purpose of
> sync_core(). Use it when available.
> 
> Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
> invariance in alternatives. The iret-to-self does not comply with such
> invariance. Thus, it cannot be used inside alternative code. Instead, use
> an alternative that jumps to SERIALIZE when available.
> 
> Cc: Andy Lutomirski 
> Cc: Cathy Zhang 
> Cc: Dave Hansen 
> Cc: Fenghua Yu 
> Cc: "H. Peter Anvin" 
> Cc: Kyung Min Park 
> Cc: Peter Zijlstra 
> Cc: "Ravi V. Shankar" 
> Cc: Sean Christopherson 
> Cc: linux-e...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Andy Lutomirski 
> Signed-off-by: Ricardo Neri 
> ---
> This is a v2 from my initial submission [1]. The first three patches of
> the series have been merged in Linus' tree. Hence, I am submitting only
> this patch for review.
> 
> [1]. https://lkml.org/lkml/2020/7/27/8
> 
> Changes since v1:
>  * Support SERIALIZE using alternative runtime patching.
>(Peter Zijlstra, H. Peter Anvin)
>  * Added a note to specify which version of binutils supports SERIALIZE.
>(Peter Zijlstra)
>  * Verified that (::: "memory") is used. (H. Peter Anvin)
> ---
>  arch/x86/include/asm/special_insns.h |  2 ++
>  arch/x86/include/asm/sync_core.h | 10 +-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/special_insns.h 
> b/arch/x86/include/asm/special_insns.h
> index 59a3e13204c3..25cd67801dda 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -10,6 +10,8 @@
>  #include 
>  #include 
>  
> +/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
> +#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
>  /*
>   * Volatile isn't enough to prevent the compiler from reordering the
>   * read/write functions for the control registers and messing everything up.
> diff --git a/arch/x86/include/asm/sync_core.h 
> b/arch/x86/include/asm/sync_core.h
> index fdb5b356e59b..201ea3d9a6bd 100644
> --- a/arch/x86/include/asm/sync_core.h
> +++ b/arch/x86/include/asm/sync_core.h
> @@ -5,15 +5,19 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #ifdef CONFIG_X86_32
>  static inline void iret_to_self(void)
>  {
>   asm volatile (
> + ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>   "pushfl\n\t"
>   "pushl %%cs\n\t"
>   "pushl $1f\n\t"
>   "iret\n\t"
> + "2:\n\t"
> + __ASM_SERIALIZE "\n"
>   "1:"
>   : ASM_CALL_CONSTRAINT : : "memory");
>  }
> @@ -23,6 +27,7 @@ static inline void iret_to_self(void)
>   unsigned int tmp;
>  
>   asm volatile (
> + ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)

Why is this and above stuck inside the asm statement?

Why can't you simply do:

	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
		asm volatile(__ASM_SERIALIZE ::: "memory");
		return;
	}

on function entry instead of making it more unreadable for no particular
reason?
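
The control flow being suggested (test the feature at function entry, take the fast path and return, otherwise fall through) can be modelled in user space as follows. This is a sketch under assumptions: the boolean stands in for static_cpu_has(X86_FEATURE_SERIALIZE), the counters stand in for executing SERIALIZE or the iret-to-self sequence, and every demo_ name is hypothetical.

```c
#include <stdbool.h>

/* Models static_cpu_has(X86_FEATURE_SERIALIZE); hypothetical stand-in. */
static bool demo_cpu_has_serialize;

/* Count which path ran, in place of real serializing instructions. */
static unsigned int demo_serialize_count;
static unsigned int demo_iret_count;

static void demo_iret_to_self(void)
{
	if (demo_cpu_has_serialize) {
		demo_serialize_count++;	/* stands in for SERIALIZE */
		return;
	}

	demo_iret_count++;		/* stands in for the iret-to-self asm */
}
```

Keeping the check in C leaves the fallback asm block untouched, which is the readability point being made.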

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG 
Nürnberg)
--

Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog

2020-08-04 Thread Song Liu



> On Aug 4, 2020, at 6:52 PM, Andrii Nakryiko  wrote:
> 
> On Tue, Aug 4, 2020 at 2:01 PM Song Liu  wrote:
>> 
>> 
>> 
>>> On Aug 2, 2020, at 10:10 PM, Andrii Nakryiko  
>>> wrote:
>>> 
>>> On Sun, Aug 2, 2020 at 9:47 PM Song Liu  wrote:
 
 
>>>>> On Aug 2, 2020, at 6:51 PM, Andrii Nakryiko
>>>>> wrote:
>>>>>
>>>>> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
>>>>>>
>>>>>> Add a benchmark to compare performance of
>>>>>> 1) uprobe;
>>>>>> 2) user program w/o args;
>>>>>> 3) user program w/ args;
>>>>>> 4) user program w/ args on random cpu.
>>>>>>
>>>>>
>>>>> Can you please add it to the existing benchmark runner instead, e.g.,
>>>>> along the other bench_trigger benchmarks? No need to re-implement
>>>>> benchmark setup. And also that would also allow to compare existing
>>>>> ways of cheaply triggering a program vs this new _USER program?
>>>>
>>>> Will try.
>>>>
>>>>>
>>>>> If the performance is not significantly better than other ways, do you
>>>>> think it still makes sense to add a new BPF program type? I think
>>>>> triggering KPROBE/TRACEPOINT from bpf_prog_test_run() would be very
>>>>> nice, maybe it's possible to add that instead of a new program type?
>>>>> Either way, let's see comparison with other program triggering
>>>>> mechanisms first.
>>>>
>>>> Triggering KPROBE and TRACEPOINT from bpf_prog_test_run() will be useful.
>>>> But I don't think they can be used instead of user program, for a couple
>>>> reasons. First, KPROBE/TRACEPOINT may be triggered by other programs
>>>> running in the system, so user will have to filter those noise out in
>>>> each program. Second, it is not easy to specify CPU for KPROBE/TRACEPOINT,
>>>> while this feature could be useful in many cases, e.g. get stack trace
>>>> on a given CPU.
>>>>
>>> 
>>> Right, it's not as convenient with KPROBE/TRACEPOINT as with the USER
>>> program you've added specifically with that feature in mind. But if
>>> you pin user-space thread on the needed CPU and trigger kprobe/tp,
>>> then you'll get what you want. As for the "noise", see how
>>> bench_trigger() deals with that: it records thread ID and filters
>>> everything not matching. You can do the same with CPU ID. It's not as
>>> automatic as with a special BPF program type, but still pretty simple,
>>> which is why I'm still deciding (for myself) whether USER program type
>>> is necessary :)
>> 
>> Here are some bench_trigger numbers:
>> 
>> base  :1.698 ± 0.001M/s
>> tp:1.477 ± 0.001M/s
>> rawtp :1.567 ± 0.001M/s
>> kprobe:1.431 ± 0.000M/s
>> fentry:1.691 ± 0.000M/s
>> fmodret   :1.654 ± 0.000M/s
>> user  :1.253 ± 0.000M/s
>> fentry-on-cpu:0.022 ± 0.011M/s
>> user-on-cpu:0.315 ± 0.001M/s
>> 
> 
> Ok, so basically all of raw_tp,tp,kprobe,fentry/fexit are
> significantly faster than USER programs. Sure, when compared to
> uprobe, they are faster, but not when doing on-specific-CPU run, it
> seems (judging from this patch's description, if I'm reading it
> right). Anyways, speed argument shouldn't be a reason for doing this,
> IMO.
> 
>> The two "on-cpu" tests run the program on a different CPU (see the patch
>> at the end).
>> 
>> "user" is about 25% slower than "fentry". I think this is mostly because
>> getpgid() is a faster syscall than bpf(BPF_TEST_RUN).
> 
> Yes, probably.
> 
>> 
>> "user-on-cpu" is more than 10x faster than "fentry-on-cpu", because IPI
>> is way faster than moving the process (via sched_setaffinity).
> 
> I don't think that's a good comparison, because you are actually
> testing sched_setaffinity performance on each iteration vs IPI in the
> kernel, not a BPF overhead.
> 
> I think the fair comparison for this would be to create a thread and
> pin it on necessary CPU, and only then BPF program calls in a loop.
> But I bet any of existing program types would beat USER program.
> 
>> 
>> For use cases that we would like to call BPF program on specific CPU,
>> triggering it via IPI is a lot faster.
> 
> So these use cases would be nice to expand on in the motivational part
> of the patch set. It's not really emphasized and it's not at all clear
> what you are trying to achieve. It also seems, depending on latency
> requirements, it's totally possible to achieve comparable results by
> pre-creating a thread for each CPU, pinning each one to its designated
> CPU and then using any suitable user-space signaling mechanism (a
> queue, condvar, etc) to ask a thread to trigger BPF program (fentry on
> getpgid(), for instance).

I don't see why user space signal plus fentry would be faster than IPI.
If the target cpu is running something, this is gonna add two context
switches.
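
For reference, the "pin once, then trigger in a loop" setup that the quoted text calls the fair comparison could look roughly like the sketch below. It is an assumption-laden illustration: it pins the calling thread instead of creating a dedicated one, it uses getpgid() as the trigger syscall only because the thread mentions it, and the demo_ function names are hypothetical. No BPF program is attached here.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/* Returns the first CPU the calling thread may run on, or -1 on error. */
static int demo_first_allowed_cpu(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;

	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/*
 * Pin the calling thread once, then issue the trigger syscall in a loop.
 * Returns the number of completed triggers, or -1 on error.
 */
static long demo_pinned_trigger(int cpu, long iterations)
{
	cpu_set_t set;
	long done = 0;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;

	for (long i = 0; i < iterations; i++) {
		/* An fentry/kprobe program attached to getpgid would fire here. */
		if (getpgid(0) < 0)
			return -1;
		done++;
	}
	return done;
}
```

The affinity call is paid once, outside the loop, so the loop measures only the trigger path and not sched_setaffinity().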

> I bet in this case the  performance would be
> really nice for a lot of practical use cases. But then again, I don't
> know details of the intended use case, so please provide some more
> details.

Being able to trigger BPF program on a different CPU could enable many
use cases and optimizations. The

Re: [PATCH v2 1/3] ext4: reorganize if statement of ext4_mb_release_context()

2020-08-04 Thread Andreas Dilger

On Aug 4, 2020, at 7:01 PM, brookxu  wrote:
> 
> Reorganize the if statement of ext4_mb_release_context(), make it
> easier to read.
> 
> Signed-off-by: Chunguang Xu 

Reviewed-by: Andreas Dilger 

> ---
>  fs/ext4/mballoc.c | 27 +--
>  1 file changed, 13 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index c0a331e..4f21f34 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4564,20 +4564,19 @@ static int ext4_mb_release_context(struct 
> ext4_allocation_context *ac)
>  pa->pa_free -= ac->ac_b_ex.fe_len;
>  pa->pa_len -= ac->ac_b_ex.fe_len;
>  spin_unlock(&pa->pa_lock);
> -}
> -}
> -if (pa) {
> -/*
> - * We want to add the pa to the right bucket.
> - * Remove it from the list and while adding
> - * make sure the list to which we are adding
> - * doesn't grow big.
> - */
> -if ((pa->pa_type == MB_GROUP_PA) && likely(pa->pa_free)) {
> -spin_lock(pa->pa_obj_lock);
> -list_del_rcu(&pa->pa_inode_list);
> -spin_unlock(pa->pa_obj_lock);
> -ext4_mb_add_n_trim(ac);
> +
> +/*
> + * We want to add the pa to the right bucket.
> + * Remove it from the list and while adding
> + * make sure the list to which we are adding
> + * doesn't grow big.
> + */
> +if (likely(pa->pa_free)) {
> +spin_lock(pa->pa_obj_lock);
> +list_del_rcu(&pa->pa_inode_list);
> +spin_unlock(pa->pa_obj_lock);
> +ext4_mb_add_n_trim(ac);
> +}
>  }
>  ext4_mb_put_pa(ac, ac->ac_sb, pa);
>  }
> 
> --
> 1.8.3.1
> 


Cheers, Andreas

[PATCH v11 1/6] Add KUnit Struct to Current Task

From: Patricia Alfonso 

In order to integrate debugging tools like KASAN into the KUnit
framework, add KUnit struct to the current task to keep track of the
current KUnit test.

Signed-off-by: Patricia Alfonso 
Reviewed-by: Brendan Higgins 
Signed-off-by: David Gow 
---
 include/linux/sched.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 27882a08163f..f3f990b82bde 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1196,6 +1196,10 @@ struct task_struct {
struct kcsan_ctxkcsan_ctx;
 #endif
 
+#if IS_ENABLED(CONFIG_KUNIT)
+   struct kunit*kunit_test;
+#endif
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* Index of current stored address in ret_stack: */
int curr_ret_stack;
-- 
2.28.0.163.g6104cc2f0b6-goog

[PATCH v11 5/6] KASAN: Testing Documentation

From: Patricia Alfonso 

Include documentation on how to test KASAN using CONFIG_KASAN_KUNIT_TEST
and CONFIG_TEST_KASAN_MODULE.

Signed-off-by: Patricia Alfonso 
Signed-off-by: David Gow 
Reviewed-by: Andrey Konovalov 
Reviewed-by: Dmitry Vyukov 
Acked-by: Brendan Higgins 
---
 Documentation/dev-tools/kasan.rst | 70 +++
 1 file changed, 70 insertions(+)

diff --git a/Documentation/dev-tools/kasan.rst 
b/Documentation/dev-tools/kasan.rst
index 38fd5681fade..42991e40cbe1 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -281,3 +281,73 @@ unmapped. This will require changes in arch-specific code.
 
 This allows ``VMAP_STACK`` support on x86, and can simplify support of
 architectures that do not have a fixed module region.
+
+CONFIG_KASAN_KUNIT_TEST & CONFIG_TEST_KASAN_MODULE
+--------------------------------------------------
+
+``CONFIG_KASAN_KUNIT_TEST`` utilizes the KUnit Test Framework for testing.
+This means each test focuses on a small unit of functionality and
+there are a few ways these tests can be run.
+
+Each test will print the KASAN report if an error is detected and then
+print the number of the test and the status of the test:
+
+pass::
+
+        ok 28 - kmalloc_double_kzfree
+or, if kmalloc failed::
+
+        # kmalloc_large_oob_right: ASSERTION FAILED at lib/test_kasan.c:163
+        Expected ptr is not null, but is
+        not ok 4 - kmalloc_large_oob_right
+or, if a KASAN report was expected, but not found::
+
+        # kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:629
+        Expected kasan_data->report_expected == kasan_data->report_found, but
+        kasan_data->report_expected == 1
+        kasan_data->report_found == 0
+        not ok 28 - kmalloc_double_kzfree
+
+All test statuses are tracked as they run and an overall status will
+be printed at the end::
+
+        ok 1 - kasan_kunit_test
+
+or::
+
+        not ok 1 - kasan_kunit_test
+
+(1) Loadable Module
+~~~~~~~~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` enabled, ``CONFIG_KASAN_KUNIT_TEST`` can be built as
+a loadable module and run on any architecture that supports KASAN
+using something like insmod or modprobe.
+
+(2) Built-In
+~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` built-in, ``CONFIG_KASAN_KUNIT_TEST`` can be built-in
+on any architecture that supports KASAN. These and any other KUnit
+tests enabled will run and print the results at boot as a late-init
+call.
+
+(3) Using kunit_tool
+~~~~~~~~~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` and ``CONFIG_KASAN_KUNIT_TEST`` built-in, we can also
+use kunit_tool to see the results of these along with other KUnit
+tests in a more readable way. This will not print the KASAN reports
+of tests that passed. Use `KUnit documentation <https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html>`_ for more up-to-date
+information on kunit_tool.
+
+.. _KUnit: https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html
+
+``CONFIG_TEST_KASAN_MODULE`` is a set of KASAN tests that could not be
+converted to KUnit. These tests can be run only as a module with
+``CONFIG_TEST_KASAN_MODULE`` built as a loadable module and
+``CONFIG_KASAN`` built-in. The type of error expected and the
+function being run is printed before the expression expected to give
+an error. Then the error is printed, if found, and that test
+should be interpreted to pass only if the error was the one expected
+by the test.
-- 
2.28.0.163.g6104cc2f0b6-goog

[PATCH v11 2/6] KUnit: KASAN Integration

From: Patricia Alfonso 

Integrate KASAN into KUnit testing framework.
- Fail tests when KASAN reports an error that is not expected
- Use KUNIT_EXPECT_KASAN_FAIL to expect a KASAN error in KASAN
tests
- Expected KASAN reports pass tests and are still printed when run
without kunit_tool (kunit_tool still bypasses the report due to the
test passing)
- KUnit struct in current task used to keep track of the current
test from KASAN code

Make use of "[PATCH v3 kunit-next 1/2] kunit: generalize
kunit_resource API beyond allocated resources" and "[PATCH v3
kunit-next 2/2] kunit: add support for named resources" from Alan
Maguire [1]
- A named resource is added to a test when a KASAN report is
 expected
- This resource contains a struct for kasan_data containing
booleans representing if a KASAN report is expected and if a
KASAN report is found

[1] 
(https://lore.kernel.org/linux-kselftest/1583251361-12748-1-git-send-email-alan.magu...@oracle.com/T/#t)
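
The expected/found bookkeeping described in the list above can be sketched outside the kernel like this. It is only a model: the demo_ names are hypothetical, the struct merely mirrors the two booleans named in the description, and the KUnit named-resource lookup is replaced by a plain pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the two booleans described above; hypothetical demo type. */
struct demo_kasan_expectation {
	bool report_expected;
	bool report_found;
};

/*
 * Models the hook run when a KASAN report is emitted. Returns true if
 * the running test must be failed because the report was not expected.
 */
static bool demo_on_report(struct demo_kasan_expectation *e)
{
	if (e == NULL || !e->report_expected)
		return true;	/* unexpected report: fail the test */

	e->report_found = true;	/* expected report: just record it */
	return false;
}

/* The expectation holds only when "expected" and "found" agree. */
static bool demo_expectation_met(const struct demo_kasan_expectation *e)
{
	return e->report_expected == e->report_found;
}
```

A test that expects a report but never triggers one ends with expected set and found clear, so the final comparison fails it.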

Signed-off-by: Patricia Alfonso 
Signed-off-by: David Gow 
Reviewed-by: Andrey Konovalov 
Reviewed-by: Dmitry Vyukov 
Acked-by: Brendan Higgins 
---
 include/kunit/test.h  |  5 +
 include/linux/kasan.h |  6 ++
 lib/kunit/test.c  | 13 +++-
 lib/test_kasan.c  | 47 +--
 mm/kasan/report.c | 32 +
 5 files changed, 96 insertions(+), 7 deletions(-)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 59f3144f009a..3391f38389f8 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -224,6 +224,11 @@ struct kunit {
struct list_head resources; /* Protected by lock. */
 };
 
+static inline void kunit_set_failure(struct kunit *test)
+{
+   WRITE_ONCE(test->success, false);
+}
+
 void kunit_init_test(struct kunit *test, const char *name, char *log);
 
 int kunit_run_tests(struct kunit_suite *suite);
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 087fba34b209..30d343b4a40a 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -14,6 +14,12 @@ struct task_struct;
 #include 
 #include 
 
+/* kasan_data struct is used in KUnit tests for KASAN expected failures */
+struct kunit_kasan_expectation {
+   bool report_expected;
+   bool report_found;
+};
+
 extern unsigned char kasan_early_shadow_page[PAGE_SIZE];
 extern pte_t kasan_early_shadow_pte[PTRS_PER_PTE];
 extern pmd_t kasan_early_shadow_pmd[PTRS_PER_PMD];
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index c36037200310..dcc35fd30d95 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -10,16 +10,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "debugfs.h"
 #include "string-stream.h"
 #include "try-catch-impl.h"
 
-static void kunit_set_failure(struct kunit *test)
-{
-   WRITE_ONCE(test->success, false);
-}
-
 static void kunit_print_tap_version(void)
 {
static bool kunit_has_printed_tap_version;
@@ -288,6 +284,10 @@ static void kunit_try_run_case(void *data)
struct kunit_suite *suite = ctx->suite;
struct kunit_case *test_case = ctx->test_case;
 
+#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT))
+   current->kunit_test = test;
+#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT) */
+
/*
 * kunit_run_case_internal may encounter a fatal error; if it does,
 * abort will be called, this thread will exit, and finally the parent
@@ -602,6 +602,9 @@ void kunit_cleanup(struct kunit *test)
spin_unlock(&test->lock);
kunit_remove_resource(test, res);
}
+#if (IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT))
+   current->kunit_test = NULL;
+#endif /* IS_ENABLED(CONFIG_KASAN) && IS_ENABLED(CONFIG_KUNIT)*/
 }
 EXPORT_SYMBOL_GPL(kunit_cleanup);
 
diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 53e953bb1d1d..58bffadd8367 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -23,6 +23,8 @@
 
 #include 
 
+#include 
+
 #include "../mm/kasan/kasan.h"
 
 #define OOB_TAG_OFF (IS_ENABLED(CONFIG_KASAN_GENERIC) ? 0 : 
KASAN_SHADOW_SCALE_SIZE)
@@ -32,14 +34,55 @@
  * are not eliminated as dead code.
  */
 
-int kasan_int_result;
 void *kasan_ptr_result;
+int kasan_int_result;
+
+static struct kunit_resource resource;
+static struct kunit_kasan_expectation fail_data;
+static bool multishot;
+
+static int kasan_test_init(struct kunit *test)
+{
+   /*
+* Temporarily enable multi-shot mode and set panic_on_warn=0.
+* Otherwise, we'd only get a report for the first case.
+*/
+   multishot = kasan_save_enable_multi_shot();
+
+   return 0;
+}
+
+static void kasan_test_exit(struct kunit *test)
+{
+   kasan_restore_multi_shot(multishot);
+}
+
+/**
+ * KUNIT_EXPECT_KASAN_FAIL() - Causes a test failure when the expression does
+ * not cause a KASAN error. This uses a KUnit resource named "kasan_data."
+ * Do not

[PATCH v11 6/6] mm: kasan: Do not panic if both panic_on_warn and kasan_multishot set

KASAN errors will currently trigger a panic when panic_on_warn is set.
This renders kasan_multishot useless, as further KASAN errors won't be
reported if the kernel has already panicked. By making kasan_multishot
disable this behaviour for KASAN errors, we can still have the benefits
of panic_on_warn for non-KASAN warnings, yet be able to use
kasan_multishot.

This is particularly important when running KASAN tests, which need to
trigger multiple KASAN errors: previously these would panic the system
if panic_on_warn was set, now they can run (and will panic the system
should non-KASAN warnings show up).

Signed-off-by: David Gow 
Reviewed-by: Andrey Konovalov 
Reviewed-by: Brendan Higgins 
---
 mm/kasan/report.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index e2c14b10bc81..00a53f1355ae 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -95,7 +95,7 @@ static void end_report(unsigned long *flags)

pr_err("==\n");
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
spin_unlock_irqrestore(&report_lock, *flags);
-   if (panic_on_warn) {
+   if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) {
/*
 * This thread may hit another WARN() in the panic path.
 * Resetting this prevents additional WARN() from panicking the
-- 
2.28.0.163.g6104cc2f0b6-goog
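
The behaviour change described in the commit message above comes down to
a two-input truth table. A minimal userspace sketch follows, assuming the
two settings can be modelled as plain booleans. The toy_* names are
hypothetical; the kernel reads panic_on_warn and a multi-shot flag bit
rather than taking arguments.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the decision made at the end of a KASAN report:
 * panic only when panic_on_warn is set and multi-shot mode is off.
 */
bool toy_report_should_panic(bool panic_on_warn, bool multi_shot)
{
	return panic_on_warn && !multi_shot;
}

/* Walk the whole truth table so the intent is explicit. */
void toy_check_truth_table(void)
{
	assert(!toy_report_should_panic(false, false));
	assert(!toy_report_should_panic(false, true));
	assert(toy_report_should_panic(true, false));
	assert(!toy_report_should_panic(true, true));
}
```

Only the last row changes with this patch: with both settings enabled, a
KASAN report no longer panics, while non-KASAN warnings are unaffected.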

[PATCH v11 0/6] KASAN-KUnit Integration

This patchset contains everything needed to integrate KASAN and KUnit.

KUnit will be able to:
(1) Fail tests when an unexpected KASAN error occurs
(2) Pass tests when an expected KASAN error occurs
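
The two outcomes listed above can be sketched in userspace as an
expectation record that a report handler updates. This is only a model
under stated assumptions: the field names report_expected and
report_found mirror the ones used in the patches, but every toy_* name
is hypothetical and nothing here is the kernel's actual macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_kasan_expectation {
	bool report_expected;	/* the test wants an error here */
	bool report_found;	/* an error was actually reported */
};

/* Expectation of the currently running check, if any. */
static struct toy_kasan_expectation *toy_current;

/* Stand-in for the sanitizer noticing a bad access. */
void toy_report_error(void)
{
	if (toy_current)
		toy_current->report_found = true;
}

/* Run op with an armed expectation; true means the expectation was met. */
bool toy_expect_fail(void (*op)(void))
{
	struct toy_kasan_expectation exp = {
		.report_expected = true,
		.report_found = false,
	};

	assert(op != NULL);
	toy_current = &exp;
	op();
	toy_current = NULL;
	return exp.report_found == exp.report_expected;
}

/* Example operations: one that triggers a report and one that does not. */
void toy_bad_access(void) { toy_report_error(); }
void toy_good_access(void) { }
```

A check wrapped around toy_bad_access passes, while the same check
around toy_good_access fails because no report was seen.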

Convert KASAN tests to KUnit with the exception of copy_user_test
because KUnit is unable to test those.

Add documentation on how to run the KASAN tests with KUnit and what to
expect when running these tests.

This patchset depends on:
- "kunit: extend kunit resources API" [1]
- This is included in the KUnit 5.9-rc1 pull request[8]

I'd _really_ like to get this into 5.9 if possible: we also have some
other changes which depend on some things here.

Changes from v10:
- Fixed some whitespace issues in patch 2.
- Split out the renaming of the KUnit test suite into a separate patch.

Changes from v9:
- Rebased on top of linux-next (20200731) + kselftest/kunit and [7]
- Note that the kasan_rcu_uaf test has not been ported to KUnit, and
remains in test_kasan_module. This is because:
(a) KUnit's expect failure will not check if the RCU stacktraces
show.
(b) KUnit is unable to link the failure to the test, as it occurs in
an RCU callback.

Changes from v8:
- Rebased on top of kselftest/kunit
- (Which, with this patchset, should rebase cleanly on 5.8-rc7)
- Renamed the KUnit test suite and config name to match the proposed
naming guidelines for KUnit tests[6]

Changes from v7:
- Rebased on top of kselftest/kunit
- Rebased on top of v4 of the kunit resources API[1]
- Rebased on top of v4 of the FORTIFY_SOURCE fix[2,3,4]
- Updated the Kconfig entry to support KUNIT_ALL_TESTS

Changes from v6:
- Rebased on top of kselftest/kunit
- Rebased on top of Daniel Axtens' fix for FORTIFY_SOURCE
incompatibilities [2]
- Removed a redundant report_enabled() check.
- Fixed some places with out of date Kconfig names in the
documentation.

Changes from v5:
- Split out the panic_on_warn changes to a separate patch.
- Fix documentation to refer to the new Kconfig names.
- Fix some changes which were in the wrong patch.
- Rebase on top of kselftest/kunit (currently identical to 5.7-rc1)

Changes from v4:
- KASAN no longer will panic on errors if both panic_on_warn and
kasan_multishot are enabled.
- As a result, the KASAN tests will no longer disable panic_on_warn.
- This also means panic_on_warn no longer needs to be exported.
- The use of temporary "kasan_data" variables has been cleaned up
somewhat.
- A potential refcount/resource leak should multiple KASAN errors
appear during an assertion was fixed.
- Some wording changes to the KASAN test Kconfig entries.

Changes from v3:
- KUNIT_SET_KASAN_DATA and KUNIT_DO_EXPECT_KASAN_FAIL have been
combined and included in KUNIT_DO_EXPECT_KASAN_FAIL() instead.
- Reordered logic in kasan_update_kunit_status() in report.c to be
easier to read.
- Added comment to not use the name "kasan_data" for any kunit tests
outside of KUNIT_EXPECT_KASAN_FAIL().

Changes since v2:
- Due to Alan's changes in [1], KUnit can be built as a module.
- The name of the tests that could not be run with KUnit has been
changed to be more generic: test_kasan_module.
- Documentation on how to run the new KASAN tests and what to expect
when running them has been added.
- Some variables and functions are now static.
- Now save/restore panic_on_warn in a similar way to kasan_multi_shot
and renamed the init/exit functions to be more generic to accommodate.
- Due to [4], kasan_strings, kasan_memchr, and kasan_memcmp will fail
if CONFIG_AMD_MEM_ENCRYPT is enabled, so they return early and print a
message explaining this circumstance.
- Changed preprocessor checks to C checks where applicable.

Changes since v1:
- Make use of Alan Maguire's suggestion to use his patch that allows
static resources for integration instead of adding a new attribute to
the kunit struct
- All KUNIT_EXPECT_KASAN_FAIL statements are local to each test
- The definition of KUNIT_EXPECT_KASAN_FAIL is local to the
test_kasan.c file since it seems this is the only place this will
be used.
- Integration relies on KUnit being builtin
- copy_user_test has been separated into its own file since KUnit
is unable to test these. This can be run as a module just as before,
using CONFIG_TEST_KASAN_USER
- The addition to the current task has been separated into its own
patch as this is a significant enough change to be on its own.

[1]
https://lore.kernel.org/linux-kselftest/cafd5g46uu_5tg89uom0dj5cmq+11cwjbnsd-k_cvy6bqueu...@mail.gmail.com/T/#t
[2] https://lore.kernel.org/linux-mm/20200424145521.8203-1-...@axtens.net/
[3]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=adb72ae1915db28f934e9e02c18bfcea2f3ed3b7
[4]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=47227d27e2fcb01a9e8f5958d8997cf47a820afc
[5] https://bugzilla.kernel.org/show_bug.cgi?id=206337
[6]

[PATCH v11 3/6] KASAN: Port KASAN Tests to KUnit

From: Patricia Alfonso 

Transfer all previous tests for KASAN to KUnit so they can be run
more easily. Using kunit_tool, developers can run these tests with their
other KUnit tests and see "pass" or "fail" with the appropriate KASAN
report instead of needing to parse each KASAN report to test KASAN
functionalities. All KASAN reports are still printed to dmesg.

Stack tests do not work properly when KASAN_STACK is not enabled, so
those tests use a check for "if IS_ENABLED(CONFIG_KASAN_STACK)" so they
only run if stack instrumentation is enabled. If KASAN_STACK is not
enabled, KUnit will print a statement to let the user know this test
was not run with KASAN_STACK enabled.

copy_user_test and kasan_rcu_uaf cannot be run in KUnit so there is a
separate test file for those tests, which can be run as before as a
module.

Signed-off-by: Patricia Alfonso 
Signed-off-by: David Gow 
Reviewed-by: Brendan Higgins 
Reviewed-by: Andrey Konovalov 
Reviewed-by: Dmitry Vyukov 
---
 lib/Kconfig.kasan   |  22 +-
 lib/Makefile|   3 +-
 lib/test_kasan.c| 686 +++-
 lib/test_kasan_module.c | 111 +++
 4 files changed, 385 insertions(+), 437 deletions(-)
 create mode 100644 lib/test_kasan_module.c

diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan
index 047b53dbfd58..9a237887e52e 100644
--- a/lib/Kconfig.kasan
+++ b/lib/Kconfig.kasan
@@ -167,12 +167,24 @@ config KASAN_VMALLOC
  for KASAN to detect more sorts of errors (and to support vmapped
  stacks), but at the cost of higher memory usage.
 
-config TEST_KASAN
-   tristate "Module for testing KASAN for bug detection"
-   depends on m
+config KASAN_KUNIT_TEST
+   tristate "KUnit-compatible tests of KASAN bug detection capabilities" 
if !KUNIT_ALL_TESTS
+   depends on KASAN && KUNIT
+   default KUNIT_ALL_TESTS
help
- This is a test module doing various nasty things like
- out of bounds accesses, use after free. It is useful for testing
+ This is a KUnit test suite doing various nasty things like
+ out of bounds and use after free accesses. It is useful for testing
  kernel debugging features like KASAN.
 
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit
+
+config TEST_KASAN_MODULE
+   tristate "KUnit-incompatible tests of KASAN bug detection capabilities"
+   depends on m && KASAN
+   help
+ This is a part of the KASAN test suite that is incompatible with
+ KUnit. Currently includes tests that do bad copy_from/to_user
+ accesses.
+
 endif # KASAN
diff --git a/lib/Makefile b/lib/Makefile
index 46278be53cda..adaebfac81c9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -60,9 +60,10 @@ CFLAGS_test_bitops.o += -Werror
 obj-$(CONFIG_TEST_SYSCTL) += test_sysctl.o
 obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_IDA) += test_ida.o
-obj-$(CONFIG_TEST_KASAN) += test_kasan.o
+obj-$(CONFIG_KASAN_KUNIT_TEST) += test_kasan.o
 CFLAGS_test_kasan.o += -fno-builtin
 CFLAGS_test_kasan.o += $(call cc-disable-warning, vla)
+obj-$(CONFIG_TEST_KASAN_MODULE) += test_kasan_module.o
 obj-$(CONFIG_TEST_UBSAN) += test_ubsan.o
 CFLAGS_test_ubsan.o += $(call cc-disable-warning, vla)
 UBSAN_SANITIZE_test_ubsan.o := y
diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 58bffadd8367..d023fb75fd60 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -5,8 +5,6 @@
  * Author: Andrey Ryabinin 
  */
 
-#define pr_fmt(fmt) "kasan test: %s " fmt, __func__
-
 #include 
 #include 
 #include 
@@ -77,416 +75,327 @@ static void kasan_test_exit(struct kunit *test)
fail_data.report_found); \
 } while (0)
 
-
-
-/*
- * Note: test functions are marked noinline so that their names appear in
- * reports.
- */
-static noinline void __init kmalloc_oob_right(void)
+static void kmalloc_oob_right(struct kunit *test)
 {
char *ptr;
size_t size = 123;
 
-   pr_info("out-of-bounds to right\n");
ptr = kmalloc(size, GFP_KERNEL);
-   if (!ptr) {
-   pr_err("Allocation failed\n");
-   return;
-   }
-
-   ptr[size + OOB_TAG_OFF] = 'x';
+   KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
+   KUNIT_EXPECT_KASAN_FAIL(test, ptr[size + OOB_TAG_OFF] = 'x');
kfree(ptr);
 }
 
-static noinline void __init kmalloc_oob_left(void)
+static void kmalloc_oob_left(struct kunit *test)
 {
char *ptr;
size_t size = 15;
 
-   pr_info("out-of-bounds to left\n");
ptr = kmalloc(size, GFP_KERNEL);
-   if (!ptr) {
-   pr_err("Allocation failed\n");
-   return;
-   }
+   KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
 
-   *ptr = *(ptr - 1);
+   KUNIT_EXPECT_KASAN_FAIL(test, *ptr = *(ptr - 1));
kfree(ptr);
 }
 
-static noinline void __init kmalloc_node_oob_right(void)
+static void

[PATCH v11 4/6] kasan: test: Make KASAN KUnit test comply with naming guidelines

The proposed KUnit test naming guidelines[1] suggest naming KUnit test
modules [suite]_kunit (and hence test source files [suite]_kunit.c).

Rename test_kasan.c to kasan_kunit.c to comply with this, and be
consistent with other KUnit tests.

[1]: 
https://lore.kernel.org/linux-kselftest/20200702071416.1780522-1-david...@google.com/

Signed-off-by: David Gow 
---
 lib/Makefile| 6 +++---
 lib/{test_kasan.c => kasan_kunit.c} | 0
 2 files changed, 3 insertions(+), 3 deletions(-)
 rename lib/{test_kasan.c => kasan_kunit.c} (100%)

diff --git a/lib/Makefile b/lib/Makefile
index adaebfac81c9..8a530bf7078c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -60,9 +60,9 @@ CFLAGS_test_bitops.o += -Werror
 obj-$(CONFIG_TEST_SYSCTL) += test_sysctl.o
 obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_IDA) += test_ida.o
-obj-$(CONFIG_KASAN_KUNIT_TEST) += test_kasan.o
-CFLAGS_test_kasan.o += -fno-builtin
-CFLAGS_test_kasan.o += $(call cc-disable-warning, vla)
+obj-$(CONFIG_KASAN_KUNIT_TEST) += kasan_kunit.o
+CFLAGS_kasan_kunit.o += -fno-builtin
+CFLAGS_kasan_kunit.o += $(call cc-disable-warning, vla)
 obj-$(CONFIG_TEST_KASAN_MODULE) += test_kasan_module.o
 obj-$(CONFIG_TEST_UBSAN) += test_ubsan.o
 CFLAGS_test_ubsan.o += $(call cc-disable-warning, vla)
diff --git a/lib/test_kasan.c b/lib/kasan_kunit.c
similarity index 100%
rename from lib/test_kasan.c
rename to lib/kasan_kunit.c
-- 
2.28.0.163.g6104cc2f0b6-goog

Re: linux-next: Fixes tag needs some work in the watchdog tree

2020-08-04 Thread Ahmad Fatoum

Hello Stephen,

On 8/5/20 12:26 AM, Stephen Rothwell wrote:
> Hi all,
> 
> In commit
> 
>   95d0c04e0cf9 ("watchdog: f71808e_wdt: remove use of wrong watchdog_info 
> option")
> 
> Fixes tag
> 
>   Fixes: 96cb4eb019ce ("watchdog: f71808e_wdt: new watchdog driver for
> 
> has these problem(s):
> 
>   - Subject has leading but no trailing parentheses
>   - Subject has leading but no trailing quotes
> 
> Please do not split Fixes tags over more than one line.

I wasn't sure of the convention. I will take care not to split in future.

Thanks!
Ahmad

> 

-- 
Pengutronix e.K.   | |
Steuerwalder Str. 21   | http://www.pengutronix.de/  |
31137 Hildesheim, Germany  | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
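
For reference, the two complaints in the report above (no trailing
quote, no trailing parenthesis) can be caught with a small single-line
check. This is a hedged sketch, not the script linux-next uses: the
function name is made up, and it only assumes the usual
'Fixes: <at least 12 hex digits> ("subject")' layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if line looks like: Fixes: <sha> ("subject")
 * with the abbreviated sha at least 12 lowercase hex digits long and
 * the closing quote and parenthesis on the same line.
 */
bool fixes_tag_is_well_formed(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t n, len;

	assert(line != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;
	line += sizeof(prefix) - 1;

	/* Length of the leading run of hex digits. */
	n = strspn(line, "0123456789abcdef");
	if (n < 12)
		return false;
	line += n;

	/* The subject must open with space, parenthesis, quote. */
	if (strncmp(line, " (\"", 3) != 0)
		return false;
	line += 3;

	/* Non-empty subject, and the line must end with quote, parenthesis. */
	len = strlen(line);
	return len > 2 && strcmp(line + len - 2, "\")") == 0;
}
```

A tag that was wrapped after "driver for" fails the final comparison,
which corresponds to both reported problems.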

[git pull] Christoph's init series

2020-08-04 Thread Al Viro

Christoph's "getting rid of ksys_...() uses under KERNEL_DS" stuff.
One trivial conflict (drivers/md/md.c).

The following changes since commit f8456690ba8eb18ea4714e68554e242a04f65cff:

  Merge tag 'clk-fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux into master (2020-07-15 
19:00:12 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git hch.init_path

for you to fetch changes up to f073531070d24bbb82cb2658952d949f4851024b:

  init: add an init_dup helper (2020-08-04 21:02:38 -0400)


Christoph Hellwig (50):
  fs: add a vfs_fchown helper
  fs: add a vfs_fchmod helper
  init: remove the bstat helper
  md: move the early init autodetect code to drivers/md/
  md: replace the RAID_AUTORUN ioctl with a direct function call
  md: remove the autoscan partition re-read
  md: remove the kernel version of md_u.h
  md: simplify md_setup_drive
  md: rewrite md_setup_drive to avoid ioctls
  initrd: remove support for multiple floppies
  initrd: remove the BLKFLSBUF call in handle_initrd
  initrd: switch initrd loading to struct file based APIs
  initrd: mark init_linuxrc as __init
  initrd: mark initrd support as deprecated
  initramfs: remove the populate_initrd_image and clean_rootfs stubs
  initramfs: remove clean_rootfs
  initramfs: switch initramfs unpacking to struct file based APIs
  init: open code setting up stdin/stdout/stderr
  fs: remove ksys_getdents64
  fs: remove ksys_open
  fs: remove ksys_dup
  fs: remove ksys_fchmod
  fs: remove ksys_ioctl
  fs: refactor do_utimes
  fs: move timespec validation into utimes_common
  fs: expose utimes_common
  initramfs: use vfs_utimes in do_copy
  fs: refactor do_mount
  fs: refactor ksys_umount
  fs: push the getname from do_rmdir into the callers
  devtmpfs: refactor devtmpfsd()
  init: initialize ramdisk_execute_command at compile time
  init: mark console_on_rootfs as __init
  init: mark create_dev as __init
  init: add an init_mount helper
  init: add an init_umount helper
  init: add an init_unlink helper
  init: add an init_rmdir helper
  init: add an init_chdir helper
  init: add an init_chroot helper
  init: add an init_chown helper
  init: add an init_chmod helper
  init: add an init_eaccess helper
  init: add an init_link helper
  init: add an init_symlink helper
  init: add an init_mkdir helper
  init: add an init_mknod helper
  init: add an init_stat helper
  init: add an init_utimes helper
  init: add an init_dup helper

 arch/arm/kernel/atags_parse.c |   2 -
 arch/sh/kernel/setup.c|   2 -
 arch/sparc/kernel/setup_32.c  |   2 -
 arch/sparc/kernel/setup_64.c  |   2 -
 arch/x86/kernel/setup.c   |   2 -
 drivers/base/devtmpfs.c   |  59 +++--
 drivers/md/Makefile   |   3 +
 init/do_mounts_md.c => drivers/md/md-autodetect.c | 247 ++--
 drivers/md/md.c   |  38 +---
 drivers/md/md.h   |  12 +
 fs/Makefile   |   2 +-
 fs/file.c |   7 +-
 fs/init.c | 265 ++
 fs/internal.h |  19 +-
 fs/ioctl.c|   7 +-
 fs/namei.c|  20 +-
 fs/namespace.c| 107 +
 fs/open.c |  78 +++
 fs/read_write.c   |   2 +-
 fs/readdir.c  |  11 +-
 fs/utimes.c   | 109 -
 include/linux/fs.h|   4 +
 include/linux/init_syscalls.h |  19 ++
 include/linux/initrd.h|   6 -
 include/linux/raid/detect.h   |   8 +
 include/linux/raid/md_u.h |  13 --
 include/linux/syscalls.h  |  83 ---
 init/Makefile |   1 -
 init/do_mounts.c  |  82 ++-
 init/do_mounts.h  |  28 +--
 init/do_mounts_initrd.c   |  39 ++--
 init/do_mounts_rd.c   | 101 -
 init/initramfs.c  | 148 +---
 init/main.c   |  28 +--
 init/noinitramfs.c|   8 +-
 35 files changed, 796 insertions(+), 768

linux-next: build warning after merge of the ftrace tree

Hi all,

After merging the ftrace tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

kernel/kprobes.c: In function 'kill_kprobe':
kernel/kprobes.c:1116:33: warning: statement with no effect [-Wunused-value]
 1116 | #define disarm_kprobe_ftrace(p) (-ENODEV)
  | ^
kernel/kprobes.c:2154:3: note: in expansion of macro 'disarm_kprobe_ftrace'
 2154 |   disarm_kprobe_ftrace(p);
  |   ^~~~

Introduced by commit

  0cb2f1372baa ("kprobes: Fix NULL pointer dereference at 
kprobe_ftrace_handler")
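
The warning can be reproduced outside the kernel. The sketch below is an
illustration only: the toy_* names are invented, and the inline-function
stub is one common way to avoid such a warning, not necessarily the fix
that was applied to kprobes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_kprobe {
	int armed;
};

/*
 * Macro stub: used as a statement, "toy_disarm_macro(p);" expands to
 * "(-ENODEV);", which compilers flag as a statement with no effect.
 */
#define toy_disarm_macro(p) (-ENODEV)

/*
 * Inline-function stub: same value, but ignoring the result of a
 * function call does not trigger -Wunused-value.
 */
static inline int toy_disarm_inline(struct toy_kprobe *p)
{
	(void)p;
	return -ENODEV;
}

/* Caller that deliberately ignores the stub's return value. */
int toy_kill(struct toy_kprobe *p)
{
	assert(p != NULL);
	p->armed = 0;
	toy_disarm_inline(p);
	return 0;
}
```

Replacing the call in toy_kill with toy_disarm_macro(p) and building
with -Wall should bring the warning back.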

-- 
Cheers,
Stephen Rothwell



Re: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation

2020-08-04 Thread Baoquan He

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Currently, initrd image is reserved very early during setup and then it
> might be relocated and re-reserved after the initial physical memory
> mapping is created. The "late" reservation of memblock verifies that mapped
> memory size exceeds the size of initrd, the checks whether the relocation
  ~ then?
> required and, if yes, relocates inirtd to a new memory allocated from
> memblock and frees the old location.
> 
> The check for memory size is excessive as memblock allocation will anyway
> fail if there is not enough memory. Besides, there is no point to allocate
> memory from memblock using memblock_find_in_range() + memblock_reserve()
> when there exists memblock_phys_alloc_range() with required functionality.
> 
> Remove the redundant check and simplify memblock allocation.
> 
> Signed-off-by: Mike Rapoport 
> ---
>  arch/x86/kernel/setup.c | 16 +++-
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index a3767e74c758..d8de4053c5e8 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -262,16 +262,12 @@ static void __init relocate_initrd(void)
>   u64 area_size = PAGE_ALIGN(ramdisk_size);
>  
>   /* We need to move the initrd down into directly mapped mem */
> - relocated_ramdisk = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
> -area_size, PAGE_SIZE);
> -
> + relocated_ramdisk = memblock_phys_alloc_range(area_size, PAGE_SIZE, 0,
> +   PFN_PHYS(max_pfn_mapped));
>   if (!relocated_ramdisk)
>   panic("Cannot find place for new RAMDISK of size %lld\n",
> ramdisk_size);
>  
> - /* Note: this includes all the mem currently occupied by
> -the initrd, we rely on that fact to keep the data intact. */
> - memblock_reserve(relocated_ramdisk, area_size);
>   initrd_start = relocated_ramdisk + PAGE_OFFSET;
>   initrd_end   = initrd_start + ramdisk_size;
>   printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n",
> @@ -298,13 +294,13 @@ static void __init early_reserve_initrd(void)
>  
>   memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
>  }
> +
>  static void __init reserve_initrd(void)
>  {
>   /* Assume only end is not page aligned */
>   u64 ramdisk_image = get_ramdisk_image();
>   u64 ramdisk_size  = get_ramdisk_size();
>   u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
> - u64 mapped_size;
>  
>   if (!boot_params.hdr.type_of_loader ||
>   !ramdisk_image || !ramdisk_size)
> @@ -312,12 +308,6 @@ static void __init reserve_initrd(void)
>  
>   initrd_start = 0;
>  
> - mapped_size = memblock_mem_size(max_pfn_mapped);
> - if (ramdisk_size >= (mapped_size>>1))
> - panic("initrd too large to handle, "
> -"disabling initrd (%lld needed, %lld available)\n",
> -ramdisk_size, mapped_size>>1);

Reviewed-by: Baoquan He 

> -
>   printk(KERN_INFO "RAMDISK: [mem %#010llx-%#010llx]\n", ramdisk_image,
>   ramdisk_end - 1);
>  
> -- 
> 2.26.2
>
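
To illustrate why the explicit size check and the separate reserve step
become redundant, here is a toy userspace model of "find, then reserve"
folded into a single allocating call. All toy_* names are hypothetical,
the search is bottom-up for simplicity, and the real memblock code
tracks regions quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_RESERVED 16

struct toy_region {
	uint64_t base;
	uint64_t size;
};

static struct toy_region toy_reserved[TOY_MAX_RESERVED];
static size_t toy_nr_reserved;

/* True if [base, base + size) intersects any reserved region. */
static int toy_overlaps_reserved(uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < toy_nr_reserved; i++) {
		uint64_t rb = toy_reserved[i].base;
		uint64_t re = rb + toy_reserved[i].size;

		if (base < re && rb < base + size)
			return 1;
	}
	return 0;
}

/* Old step 1: locate a free range but do not claim it. 0 means failure. */
uint64_t toy_find_in_range(uint64_t start, uint64_t end,
			   uint64_t size, uint64_t align)
{
	assert(size != 0 && align != 0);
	uint64_t base = (start + align - 1) / align * align;

	for (; base + size <= end; base += align) {
		if (base == 0)
			continue;	/* address 0 doubles as the error value */
		if (!toy_overlaps_reserved(base, size))
			return base;
	}
	return 0;
}

/* Old step 2: record the range as taken. Returns 0 on success. */
int toy_reserve(uint64_t base, uint64_t size)
{
	if (toy_nr_reserved == TOY_MAX_RESERVED)
		return -1;
	toy_reserved[toy_nr_reserved].base = base;
	toy_reserved[toy_nr_reserved].size = size;
	toy_nr_reserved++;
	return 0;
}

/* Combined call: find and reserve in one go, 0 if nothing fits. */
uint64_t toy_phys_alloc_range(uint64_t size, uint64_t align,
			      uint64_t start, uint64_t end)
{
	uint64_t base = toy_find_in_range(start, end, size, align);

	if (base && toy_reserve(base, size))
		return 0;
	return base;
}
```

A request larger than the available window simply returns 0, so the
caller only needs the one failure check it already has.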

Re: [PATCH v2 2/2] ASoC: fsl_sai: Refine enable and disable sequence for synchronous mode

2020-08-04 Thread Nicolin Chen

On Wed, Aug 05, 2020 at 10:23:53AM +0800, Shengjiu Wang wrote:
> Tx synchronous with Rx:
> The TCSR.TE is no need to enabled when only Rx is going to be enabled.
> Check if need to disable RSCR.RE before disabling TCSR.TE.
> 
> Rx synchronous with Tx:
> The RCSR.RE is no need to enabled when only Tx is going to be enabled.
> Check if need to disable TSCR.RE before disabling RCSR.TE.

Please add to the commit log more context such as what we have
discussed: what's the problem of the current driver, and why we
_have_to_ apply this change though it's slightly against what RM
recommends.

(If things are straightforward, it's okay to make the text short.
 Yet I believe that this change deserves more than these lines.)

One info that you should mention -- also the main reason why I'm
convinced to add this change: trigger() is still in the shape of
the early version where we only supported one operation mode --
Tx synchronous with Rx. So we need an update for other modes.

> Signed-off-by: Shengjiu Wang 

The git-diff part looks good, please add this in next ver.:

Reviewed-by: Nicolin Chen 

Btw, the new fsl_sai_dir_is_synced() can be probably applied to
other places with a followup patch.

Re: linux-next: manual merge of the kspp tree with the drm-misc tree

Hi all,

On Fri, 3 Jul 2020 14:35:50 +1000 Stephen Rothwell  
wrote:
> 
> Today's linux-next merge of the kspp tree got a conflict in:
> 
>   drivers/gpu/drm/drm_edid.c
> 
> between commit:
> 
>   948de84233d3 ("drm : Insert blank lines after declarations.")
> 
> from the drm-misc tree and commit:
> 
>   80b89ab785a4 ("treewide: Remove uninitialized_var() usage")
> 
> from the kspp tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> diff --cc drivers/gpu/drm/drm_edid.c
> index 252e89cb54a3,b98fa573e706..
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@@ -3095,8 -3051,7 +3095,8 @@@ static int drm_cvt_modes(struct drm_con
>   const u8 empty[3] = { 0, 0, 0 };
>   
>   for (i = 0; i < 4; i++) {
> - int uninitialized_var(width), height;
> + int width, height;
>  +
>   cvt = &(timing->data.other_data.data.cvt[i]);
>   
>   if (!memcmp(cvt->code, empty, 3))

This is now a conflict between the drm tree and Linus' tree.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH] Userfaultfd: Avoid double free of userfault_ctx and remove O_CLOEXEC

2020-08-04 Thread Eric Biggers

On Wed, Aug 05, 2020 at 01:47:58PM +1000, Aleksa Sarai wrote:
> On 2020-08-04, Lokesh Gidra  wrote:
> > when get_unused_fd_flags returns error, ctx will be freed by
> > userfaultfd's release function, which is indirectly called by fput().
> > Also, if anon_inode_getfile_secure() returns an error, then
> > userfaultfd_ctx_put() is called, which calls mmdrop() and frees ctx.
> > 
> > Also, the O_CLOEXEC was inadvertently added to the call to
> > get_unused_fd_flags() [1].
> 
> I disagree that it is "wrong" to do O_CLOEXEC-by-default (after all,
> it's trivial to disable O_CLOEXEC, but it's non-trivial to enable it on
> an existing file descriptor because it's possible for another thread to
> exec() before you set the flag). Several new syscalls and fd-returning
> facilities are O_CLOEXEC-by-default now (the most obvious being pidfds
> and seccomp notifier fds).

Sure, O_CLOEXEC *should* be the default, but this is an existing syscall so it
has to keep the existing behavior.

> At the very least there should be a new flag added that sets O_CLOEXEC.

There already is one (but these patches broke it).

- Eric

Re: linux-next: manual merge of the kspp tree with the rdma tree

Hi all,

On Tue, 28 Jul 2020 18:45:20 +1000 Stephen Rothwell  
wrote:
> 
> Today's linux-next merge of the kspp tree got a conflict in:
> 
>   drivers/infiniband/core/uverbs_cmd.c
> 
> between commit:
> 
>   29f3fe1d6854 ("RDMA/uverbs: Remove redundant assignments")
> 
> from the rdma tree and commit:
> 
>   3f649ab728cd ("treewide: Remove uninitialized_var() usage")
> 
> from the kspp tree.
> 
> I fixed it up (the former basically did what the latter did, so I used
> the former version) and can carry the fix as necessary. This is now fixed
> as far as linux-next is concerned, but any non trivial conflicts should
> be mentioned to your upstream maintainer when your tree is submitted for
> merging.  You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.

This is now a conflict between the rdma tree and Linus' tree.

-- 
Cheers,
Stephen Rothwell



Re: linux-next: manual merge of the kspp tree with the net-next tree

Hi all,

On Mon, 27 Jul 2020 19:27:21 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the kspp tree got a conflict in:
> 
>   net/ipv6/ip6_flowlabel.c
> 
> between commit:
> 
>   ff6a4cf214ef ("net/ipv6: split up ipv6_flowlabel_opt")
> 
> from the net-next tree and commit:
> 
>   3f649ab728cd ("treewide: Remove uninitialized_var() usage")
> 
> from the kspp tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> diff --cc net/ipv6/ip6_flowlabel.c
> index 215b6f5e733e,73bb047e6037..
> --- a/net/ipv6/ip6_flowlabel.c
> +++ b/net/ipv6/ip6_flowlabel.c
> @@@ -534,184 -533,181 +534,184 @@@ int ipv6_flowlabel_opt_get(struct sock 
>   return -ENOENT;
>   }
>   
>  -int ipv6_flowlabel_opt(struct sock *sk, char __user *optval, int optlen)
>  +#define socklist_dereference(__sflp) \
>  +rcu_dereference_protected(__sflp, lockdep_is_held(_sk_fl_lock))
>  +
>  +static int ipv6_flowlabel_put(struct sock *sk, struct in6_flowlabel_req 
> *freq)
>   {
>  -int err;
>  -struct net *net = sock_net(sk);
>   struct ipv6_pinfo *np = inet6_sk(sk);
>  -struct in6_flowlabel_req freq;
>  -struct ipv6_fl_socklist *sfl1 = NULL;
>  -struct ipv6_fl_socklist *sfl;
>   struct ipv6_fl_socklist __rcu **sflp;
>  -struct ip6_flowlabel *fl, *fl1 = NULL;
>  +struct ipv6_fl_socklist *sfl;
>   
>  +if (freq->flr_flags & IPV6_FL_F_REFLECT) {
>  +if (sk->sk_protocol != IPPROTO_TCP)
>  +return -ENOPROTOOPT;
>  +if (!np->repflow)
>  +return -ESRCH;
>  +np->flow_label = 0;
>  +np->repflow = 0;
>  +return 0;
>  +}
>   
>  -if (optlen < sizeof(freq))
>  -return -EINVAL;
>  +spin_lock_bh(_sk_fl_lock);
>  +for (sflp = >ipv6_fl_list;
>  + (sfl = socklist_dereference(*sflp)) != NULL;
>  + sflp = >next) {
>  +if (sfl->fl->label == freq->flr_label)
>  +goto found;
>  +}
>  +spin_unlock_bh(_sk_fl_lock);
>  +return -ESRCH;
>  +found:
>  +if (freq->flr_label == (np->flow_label & IPV6_FLOWLABEL_MASK))
>  +np->flow_label &= ~IPV6_FLOWLABEL_MASK;
>  +*sflp = sfl->next;
>  +spin_unlock_bh(_sk_fl_lock);
>  +fl_release(sfl->fl);
>  +kfree_rcu(sfl, rcu);
>  +return 0;
>  +}
>   
>  -if (copy_from_user(, optval, sizeof(freq)))
>  -return -EFAULT;
>  +static int ipv6_flowlabel_renew(struct sock *sk, struct in6_flowlabel_req 
> *freq)
>  +{
>  +struct ipv6_pinfo *np = inet6_sk(sk);
>  +struct net *net = sock_net(sk);
>  +struct ipv6_fl_socklist *sfl;
>  +int err;
>   
>  -switch (freq.flr_action) {
>  -case IPV6_FL_A_PUT:
>  -if (freq.flr_flags & IPV6_FL_F_REFLECT) {
>  -if (sk->sk_protocol != IPPROTO_TCP)
>  -return -ENOPROTOOPT;
>  -if (!np->repflow)
>  -return -ESRCH;
>  -np->flow_label = 0;
>  -np->repflow = 0;
>  -return 0;
>  -}
>  -spin_lock_bh(_sk_fl_lock);
>  -for (sflp = >ipv6_fl_list;
>  - (sfl = rcu_dereference_protected(*sflp,
>  -  
> lockdep_is_held(_sk_fl_lock))) != NULL;
>  - sflp = >next) {
>  -if (sfl->fl->label == freq.flr_label) {
>  -if (freq.flr_label == 
> (np->flow_label_FLOWLABEL_MASK))
>  -np->flow_label &= ~IPV6_FLOWLABEL_MASK;
>  -*sflp = sfl->next;
>  -spin_unlock_bh(_sk_fl_lock);
>  -fl_release(sfl->fl);
>  -kfree_rcu(sfl, rcu);
>  -return 0;
>  -}
>  +rcu_read_lock_bh();
>  +for_each_sk_fl_rcu(np, sfl) {
>  +if (sfl->fl->label == freq->flr_label) {
>  +err = fl6_renew(sfl->fl, freq->flr_linger,
>  +freq->flr_expires);
>  +rcu_read_unlock_bh();
>  +return err;
>   }
>  -spin_unlock_bh(_sk_fl_lock);
>  -return -ESRCH;
>  +}
>  +rcu_read_unlock_bh();
>   
>  -case IPV6_FL_A_RENEW:
>  -rcu_read_lock_bh();
>  -for_each_sk_fl_rcu(np, sfl) {
>  -if (sfl->fl->label == freq.flr_label) {
>  -err = fl6_renew(sfl->fl, freq.flr_linger, 
>

[PATCH v7 0/4] scsi: ufs: Add Host Performance Booster Support

2020-08-04 Thread Daejun Park

Changelog:

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to corresponding flash
memory addresses. Mobile storage devices typically have limited RAM
and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory. 

The current version only supports the DCM (device control mode).
This series consists of 3 parts to support the HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported features, the data structure for the HPB
is initialized according to the device information.

A read I/O in the active sub-region where the map is cached is changed to
HPB READ by the HPB.

The HPB manages the L2P map using information received from the
device. For an active sub-region, the HPB caches the map through a
ufshpb_map request. For an inactive region, the HPB discards the L2P map.
When a write I/O occurs in an active sub-region, the associated dirty
bitmap is marked dirty to prevent stale reads.
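
As a rough model of the read path and dirty tracking described above,
consider one sub-region with a handful of cached entries. Everything
here is a simplification under stated assumptions: the toy_* names, the
eight-entry sub-region and the one-byte bitmap are invented for the
example and do not reflect the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_ENTRIES 8	/* L2P entries per sub-region in this toy model */

enum toy_read_kind { TOY_NORMAL_READ, TOY_HPB_READ };

struct toy_subregion {
	bool active;			/* map is cached on the host */
	uint64_t ppn[TOY_ENTRIES];	/* cached physical page numbers */
	uint8_t dirty;			/* one bit per entry, set on write */
};

/* Cache the map for a sub-region the device reported as active. */
void toy_activate(struct toy_subregion *sr, const uint64_t *map)
{
	assert(sr != NULL && map != NULL);
	memcpy(sr->ppn, map, sizeof(sr->ppn));
	sr->dirty = 0;
	sr->active = true;
}

/* Drop the cached map when the region becomes inactive. */
void toy_inactivate(struct toy_subregion *sr)
{
	assert(sr != NULL);
	sr->active = false;
}

/* A write may remap the page, so the cached entry can no longer be trusted. */
void toy_note_write(struct toy_subregion *sr, unsigned int idx)
{
	assert(sr != NULL && idx < TOY_ENTRIES);
	if (sr->active)
		sr->dirty |= (uint8_t)(1u << idx);
}

/* Use the cached entry only if the sub-region is active and the entry clean. */
enum toy_read_kind toy_classify_read(const struct toy_subregion *sr,
				     unsigned int idx, uint64_t *ppn)
{
	assert(sr != NULL && ppn != NULL && idx < TOY_ENTRIES);
	if (!sr->active || (sr->dirty & (1u << idx)))
		return TOY_NORMAL_READ;
	*ppn = sr->ppn[idx];
	return TOY_HPB_READ;
}
```

A read of a clean entry in an active sub-region becomes an HPB read; the
same read after a write to that entry falls back to a normal read.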

HPB is shown to have a performance improvement of 58 - 67% for random read
workload. [1]

This patch series is based on the 5.9/scsi-queue branch.

[1]:
https://www.usenix.org/conference/hotstorage17/program/presentation/jeong

Daejun Park (4):
 scsi: ufs: Add UFS feature related parameter
 scsi: ufs: Introduce HPB feature
 scsi: ufs: L2P map management for HPB read
 scsi: ufs: Prepare HPB read for cached sub-region
 
 drivers/scsi/ufs/Kconfig  |   18 +
 drivers/scsi/ufs/Makefile |1 +
 drivers/scsi/ufs/ufs.h|   12 +
 drivers/scsi/ufs/ufshcd.c |   42 +
 drivers/scsi/ufs/ufshcd.h |9 +
 drivers/scsi/ufs/ufshpb.c | 1926 
 drivers/scsi/ufs/ufshpb.h |  241 +
 7 files changed, 2249 insertions(+)
 created mode 100644 drivers/scsi/ufs/ufshpb.c
 created mode 100644 drivers/scsi/ufs/ufshpb.h

Re: [PATCH bpf-next 2/5] libbpf: support BPF_PROG_TYPE_USER programs

2020-08-04 Thread Song Liu




> On Aug 4, 2020, at 6:38 PM, Andrii Nakryiko  wrote:
> 
> On Mon, Aug 3, 2020 at 6:18 PM Song Liu  wrote:
>> 
>> 
>> 
>>> On Aug 2, 2020, at 6:40 PM, Andrii Nakryiko  
>>> wrote:
>>> 
>>> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
 
>> 
>> [...]
>> 
>>> 
 };
 
 LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr 
 *test_attr);
 diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
 index b9f11f854985b..9ce175a486214 100644
 --- a/tools/lib/bpf/libbpf.c
 +++ b/tools/lib/bpf/libbpf.c
 @@ -6922,6 +6922,7 @@ static const struct bpf_sec_def section_defs[] = {
   BPF_PROG_SEC("lwt_out", BPF_PROG_TYPE_LWT_OUT),
   BPF_PROG_SEC("lwt_xmit",BPF_PROG_TYPE_LWT_XMIT),
   BPF_PROG_SEC("lwt_seg6local",   BPF_PROG_TYPE_LWT_SEG6LOCAL),
 +   BPF_PROG_SEC("user",BPF_PROG_TYPE_USER),
>>> 
>>> let's do "user/" for consistency with most other prog types (and nice
>>> separation between prog type and custom user name)
>> 
>> About "user" vs. "user/", I still think "user" is better.
>> 
>> Unlike kprobe and tracepoint, user prog doesn't use the part after "/".
>> This is similar to "perf_event" for BPF_PROG_TYPE_PERF_EVENT, "xdp" for
>> BPF_PROG_TYPE_XDP, etc. If we specify "user" here, "user/" and "user/xxx"
>> would also work. However, if we specify "user/" here, programs that used
>> "user" by accident will fail to load, with a message like:
>> 
>>libbpf: failed to load program 'user'
>> 
>> which is confusing.
> 
> xdp, perf_event and a bunch of others don't enforce it, that's true,
> they are a bit of a legacy,

I don't see why omitting "/" is a legacy thing. BPF_PROG_TYPE_STRUCT_OPS just uses
"struct_ops". 

> unfortunately. But all the recent ones do,
> and we explicitly did that for xdp_dev/xdp_cpu, for instance.
> Specifying just "user" in the spec would allow something nonsensical
> like "userargh", for instance, due to this being treated as a prefix.
> There is no harm to require users to do "user/my_prog", though.

I don't see why allowing "userargh" is a problem. Failing "user" is 
more confusing. We can probably improve that by a hint like:

libbpf: failed to load program 'user', do you mean "user/"?

But it is pretty silly. "user/something_never_used" also looks weird.

> Alternatively, we could introduce a new convention in the spec,
> something like "user?", which would accept either "user" or
> "user/something", but not "user/" nor "userblah". We can try that as
> well.

Again, I don't really understand why allowing "userblah" is a problem. 
We already have "xdp", "xdp_devmap/", and "xdp_cpumap/", they all work 
fine so far. 
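
To make the two options concrete, here is a small standalone sketch (not
libbpf code; sec_matches() and sec_matches_opt_slash() are hypothetical
helpers) of plain prefix matching as discussed in this thread, next to the
proposed "user?" style rule:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Plain prefix match: the section name only has to start with the prefix. */
static bool sec_matches(const char *sec_name, const char *prefix)
{
	assert(sec_name && prefix);
	return strncmp(sec_name, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical "user?" rule: accept the bare name or "name/<something>",
 * reject "name/" and "nameblah".
 */
static bool sec_matches_opt_slash(const char *sec_name, const char *name)
{
	size_t len = strlen(name);

	if (strncmp(sec_name, name, len) != 0)
		return false;
	if (sec_name[len] == '\0')
		return true;
	return sec_name[len] == '/' && sec_name[len + 1] != '\0';
}
```

With the plain prefix rule, a spec of "user" accepts "user", "user/xxx"
and "userargh", while a spec of "user/" rejects a bare "user". The second
helper only shows what the alternative convention would accept.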

Thanks,
Song

Re: [RFC PATCH 00/16] Core scheduling v6

2020-08-04 Thread Li, Aubrey

On 2020/8/4 0:53, Joel Fernandes wrote:
> Hi Aubrey,
> 
> On Mon, Aug 3, 2020 at 4:23 AM Li, Aubrey  wrote:
>>
>> On 2020/7/1 5:32, Vineeth Remanan Pillai wrote:
>>> Sixth iteration of the Core-Scheduling feature.
>>>
>>> Core scheduling is a feature that allows only trusted tasks to run
>>> concurrently on cpus sharing compute resources (eg: hyperthreads on a
>>> core). The goal is to mitigate the core-level side-channel attacks
>>> without requiring to disable SMT (which has a significant impact on
>>> performance in some situations). Core scheduling (as of v6) mitigates
>>> user-space to user-space attacks and user to kernel attack when one of
>>> the siblings enters the kernel via interrupts. It is still possible to
>>> have a task attack the sibling thread when it enters the kernel via
>>> syscalls.
>>>
>>> By default, the feature doesn't change any of the current scheduler
>>> behavior. The user decides which tasks can run simultaneously on the
>>> same core (for now by having them in the same tagged cgroup). When a
>>> tag is enabled in a cgroup and a task from that cgroup is running on a
>>> hardware thread, the scheduler ensures that only idle or trusted tasks
>>> run on the other sibling(s). Besides security concerns, this feature
>>> can also be beneficial for RT and performance applications where we
>>> want to control how tasks make use of SMT dynamically.
>>>
>>> This iteration is mostly a cleanup of v5 except for a major feature of
>>> pausing sibling when a cpu enters kernel via nmi/irq/softirq. Also
>>> introducing documentation and includes minor crash fixes.
>>>
>>> One major cleanup was removing the hotplug support and related code.
>>> The hotplug related crashes were not documented and the fixes piled up
>>> over time leading to complex code. We were not able to reproduce the
>>> crashes in the limited testing done. But if they are reproducible, we
>>> don't want to hide them. We should document them and design better
>>> fixes if any.
>>>
>>> In terms of performance, the results in this release are similar to
>>> v5. On a x86 system with N hardware threads:
>>> - if only N/2 hardware threads are busy, the performance is similar
>>>   between baseline, corescheduling and nosmt
>>> - if N hardware threads are busy with N different corescheduling
>>>   groups, the impact of corescheduling is similar to nosmt
>>> - if N hardware threads are busy and multiple active threads share the
>>>   same corescheduling cookie, they gain a performance improvement over
>>>   nosmt.
>>>   The specific performance impact depends on the workload, but for a
>>>   really busy database 12-vcpu VM (1 coresched tag) running on a 36
>>>   hardware threads NUMA node with 96 mostly idle neighbor VMs (each in
>>>   their own coresched tag), the performance drops by 54% with
>>>   corescheduling and drops by 90% with nosmt.
>>>
>>
>> We found uperf(in cgroup) throughput drops by ~50% with corescheduling.
>>
>> The problem is, uperf triggered a lot of softirq and offloaded softirq
>> service to *ksoftirqd* thread.
>>
>> - default, ksoftirqd thread can run with uperf on the same core, we saw
>>   100% CPU utilization.
>> - coresched enabled, ksoftirqd's core cookie is different from uperf, so
>>   they can't run concurrently on the same core, we saw ~15% forced idle.
>>
>> I guess this kind of performance drop can be replicated by other similar
>> (a lot of softirq activities) workloads.
>>
>> Currently core scheduler picks cookie-match tasks for all SMT siblings, does
>> it make sense we add a policy to allow cookie-compatible task running 
>> together?
>> For example, if a task is trusted(set by admin), it can work with kernel 
>> thread.
>> The difference from corescheduling disabled is that we still have user to 
>> user
>> isolation.
> 
> In ChromeOS we are considering all cookie-0 tasks as trusted.
> Basically if you don't trust a task, then that is when you assign the
> task a tag. We do this for the sandboxed processes.

I have a proposal for this, by changing cpu.tag to cpu.coresched_policy,
something like the following:

+/*
+ * Core scheduling policy:
+ * - CORE_SCHED_DISABLED: core scheduling is disabled.
+ * - CORE_SCHED_COOKIE_MATCH: tasks with the same cookie can run
+ * on the same core concurrently.
+ * - CORE_SCHED_COOKIE_TRUST: a trusted task can run with a kernel
+ * thread on the same core concurrently.
+ * - CORE_SCHED_COOKIE_LONELY: tasks with a cookie can run only
+ * with the idle thread on the same core.
+ */
+enum coresched_policy {
+   CORE_SCHED_DISABLED,
+   CORE_SCHED_COOKIE_MATCH,
+   CORE_SCHED_COOKIE_TRUST,
+   CORE_SCHED_COOKIE_LONELY,
+};

We can set the policy of the uperf cgroup to CORE_SCHED_COOKIE_TRUST and
fix this kind of performance regression. Not sure if this sounds attractive?
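
One possible reading of these policies, as a userspace sketch only (struct
sketch_task, one_way_ok() and can_share_core() are hypothetical and not
part of any posted patch), is a pairwise check that both tasks must agree
to before they share a core:

```c
#include <assert.h>
#include <stdbool.h>

enum coresched_policy {
	CORE_SCHED_DISABLED,
	CORE_SCHED_COOKIE_MATCH,
	CORE_SCHED_COOKIE_TRUST,
	CORE_SCHED_COOKIE_LONELY,
};

/* Hypothetical, minimal view of a task for this sketch. */
struct sketch_task {
	unsigned long cookie;	/* 0 means untagged */
	bool is_idle;
	bool is_kthread;
	enum coresched_policy policy;
};

/* Does task a's policy allow b to run on a sibling thread? */
static bool one_way_ok(const struct sketch_task *a,
		       const struct sketch_task *b)
{
	assert(a && b);
	if (b->is_idle)
		return true;

	switch (a->policy) {
	case CORE_SCHED_DISABLED:
		return true;
	case CORE_SCHED_COOKIE_MATCH:
		return a->cookie == b->cookie;
	case CORE_SCHED_COOKIE_TRUST:
		return a->cookie == b->cookie || b->is_kthread;
	case CORE_SCHED_COOKIE_LONELY:
		return false;
	}
	return false;
}

/* Both sides have to accept the pairing. */
static bool can_share_core(const struct sketch_task *a,
			   const struct sketch_task *b)
{
	return one_way_ok(a, b) && one_way_ok(b, a);
}
```

Under these assumptions a uperf task in a COOKIE_TRUST cgroup could run
next to ksoftirqd but still not next to a task from another tagged cgroup,
which keeps the user to user isolation.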

> 
> Is the uperf throughput worse with SMT+core-scheduling versus no-SMT ?

This is a good question, from the data we measured by uperf,
SMT+core-scheduling is

Re: [PATCH v2 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()

2020-08-04 Thread Baoquan He

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> There are several occurrences of the following pattern:
> 
>   for_each_memblock(memory, reg) {
>   start_pfn = memblock_region_memory_base_pfn(reg);
>   end_pfn = memblock_region_memory_end_pfn(reg);
> 
>   /* do something with start_pfn and end_pfn */
>   }
> 
> Rather than iterate over all memblock.memory regions and each time query
> for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
> simpler and clearer code.
> 
> Signed-off-by: Mike Rapoport 
> ---
>  arch/arm/mm/init.c   | 11 ---
>  arch/arm64/mm/init.c | 11 ---
>  arch/powerpc/kernel/fadump.c | 11 ++-
>  arch/powerpc/mm/mem.c| 15 ---
>  arch/powerpc/mm/numa.c   |  7 ++-
>  arch/s390/mm/page-states.c   |  6 ++
>  arch/sh/mm/init.c|  9 +++--
>  mm/memblock.c|  6 ++
>  mm/sparse.c  | 10 --
>  9 files changed, 35 insertions(+), 51 deletions(-)
> 

Reviewed-by: Baoquan He
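
For readers following along, the conversion can be modelled in userspace
roughly as below. This is only a sketch: struct sk_region,
sk_next_pfn_range() and sk_total_pages() are made-up stand-ins, and the
page size is assumed to be 4K. The iterator hands out [start_pfn, end_pfn)
directly, rounding the base up and the end down, and skips regions that do
not contain a whole page:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_PAGE_SHIFT	12			/* assumed 4K pages */
#define SK_PAGE_SIZE	(1UL << SK_PAGE_SHIFT)

/* Stand-in for a memblock region: physical base address and size. */
struct sk_region {
	unsigned long base;
	unsigned long size;
};

/*
 * Iterator in the spirit of a PFN range walk: yields the next
 * [start_pfn, end_pfn) and returns false when no regions are left.
 */
static bool sk_next_pfn_range(const struct sk_region *regs, size_t nr,
			      size_t *idx, unsigned long *start_pfn,
			      unsigned long *end_pfn)
{
	assert(regs || nr == 0);
	while (*idx < nr) {
		const struct sk_region *r = &regs[(*idx)++];

		*start_pfn = (r->base + SK_PAGE_SIZE - 1) >> SK_PAGE_SHIFT;
		*end_pfn = (r->base + r->size) >> SK_PAGE_SHIFT;
		if (*start_pfn < *end_pfn)
			return true;
	}
	return false;
}

/* Caller side: no per-region base/end queries are needed any more. */
static unsigned long sk_total_pages(const struct sk_region *regs, size_t nr)
{
	unsigned long start_pfn, end_pfn, total = 0;
	size_t idx = 0;

	while (sk_next_pfn_range(regs, nr, &idx, &start_pfn, &end_pfn))
		total += end_pfn - start_pfn;
	return total;
}
```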

[PATCH net 2/4] can: j1939: cancel rxtimer on multipacket broadcast session complete

If j1939_xtp_rx_dat_one() receives the last frame of a multipacket
broadcast message, j1939_session_timers_cancel() should be called to
cancel the rxtimer.

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Zhang Changzhong 
---
 net/can/j1939/transport.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index e5188ac..dd6a120 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1788,6 +1788,7 @@ static void j1939_xtp_rx_dat_one(struct j1939_session 
*session,
}
 
if (final) {
+   j1939_session_timers_cancel(session);
j1939_session_completed(session);
} else if (do_cts_eoma) {
j1939_tp_set_rxtimeout(session, 1250);
-- 
2.9.5

[PATCH net 4/4] can: j1939: add rxtimer for multipacket broadcast session

According to SAE J1939/21 (Chapter 5.12.3 and APPENDIX C), on the transmit
side the required time interval between packets of a multipacket
broadcast message is 50 to 200 ms, and the responder shall use a timeout
of 250 ms (which provides margin for the maximum spacing of 200 ms). On
the receive side, a timeout occurs when more than 750 ms elapse between
two message packets while more packets are expected.

So this patch fixes and adds the rxtimer for multipacket broadcast sessions.
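
The timeout selection described above can be summarised by the following
sketch. It is not the driver code: struct sk_session and
sk_rxtimeout_ms() are hypothetical, and only the millisecond values are
taken from the text and the diff below.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_BAM_TX_RXTIMEOUT_MS	250	/* transmit side, margin over 200 ms */
#define SK_BAM_RX_RXTIMEOUT_MS	750	/* receive side, between packets */
#define SK_RTS_CTS_RXTIMEOUT_MS	1250	/* non-broadcast value kept as is */

/* Hypothetical, reduced view of a transport session. */
struct sk_session {
	bool is_bam;		/* multipacket broadcast (BAM) session */
	bool transmitter;	/* true if we are the sending side */
};

/* Pick the rx timeout for a session according to its type and role. */
static int sk_rxtimeout_ms(const struct sk_session *s)
{
	assert(s);
	if (!s->is_bam)
		return SK_RTS_CTS_RXTIMEOUT_MS;
	return s->transmitter ? SK_BAM_TX_RXTIMEOUT_MS
			      : SK_BAM_RX_RXTIMEOUT_MS;
}
```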

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Zhang Changzhong 
---
 net/can/j1939/transport.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index 5757f9f..fad210e 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -716,10 +716,12 @@ static int j1939_session_tx_rts(struct j1939_session 
*session)
return ret;
 
session->last_txcmd = dat[0];
-   if (dat[0] == J1939_TP_CMD_BAM)
+   if (dat[0] == J1939_TP_CMD_BAM) {
j1939_tp_schedule_txtimer(session, 50);
-
-   j1939_tp_set_rxtimeout(session, 1250);
+   j1939_tp_set_rxtimeout(session, 250);
+   } else {
+   j1939_tp_set_rxtimeout(session, 1250);
+   }
 
netdev_dbg(session->priv->ndev, "%s: 0x%p\n", __func__, session);
 
@@ -1665,11 +1667,15 @@ static void j1939_xtp_rx_rts(struct j1939_priv *priv, 
struct sk_buff *skb,
}
session->last_cmd = cmd;
 
-   j1939_tp_set_rxtimeout(session, 1250);
-
-   if (cmd != J1939_TP_CMD_BAM && !session->transmission) {
-   j1939_session_txtimer_cancel(session);
-   j1939_tp_schedule_txtimer(session, 0);
+   if (cmd == J1939_TP_CMD_BAM) {
+   if (!session->transmission)
+   j1939_tp_set_rxtimeout(session, 750);
+   } else {
+   if (!session->transmission) {
+   j1939_session_txtimer_cancel(session);
+   j1939_tp_schedule_txtimer(session, 0);
+   }
+   j1939_tp_set_rxtimeout(session, 1250);
}
 
j1939_session_put(session);
@@ -1720,6 +1726,7 @@ static void j1939_xtp_rx_dat_one(struct j1939_session 
*session,
int offset;
int nbytes;
bool final = false;
+   bool remain = false;
bool do_cts_eoma = false;
int packet;
 
@@ -1781,6 +1788,8 @@ static void j1939_xtp_rx_dat_one(struct j1939_session 
*session,
j1939_cb_is_broadcast(&session->skcb)) {
if (session->pkt.rx >= session->pkt.total)
final = true;
+   else
+   remain = true;
} else {
/* never final, an EOMA must follow */
if (session->pkt.rx >= session->pkt.last)
@@ -1790,6 +1799,9 @@ static void j1939_xtp_rx_dat_one(struct j1939_session 
*session,
if (final) {
j1939_session_timers_cancel(session);
j1939_session_completed(session);
+   } else if (remain) {
+   if (!session->transmission)
+   j1939_tp_set_rxtimeout(session, 750);
} else if (do_cts_eoma) {
j1939_tp_set_rxtimeout(session, 1250);
if (!session->transmission)
-- 
2.9.5

[PATCH net 3/4] can: j1939: abort multipacket broadcast session when timeout occurs

If timeout occurs, j1939_tp_rxtimer() first calls hrtimer_start() to
restart rxtimer, and then calls __j1939_session_cancel() to set
session->state = J1939_SESSION_WAITING_ABORT. At next timeout
expiration, because of the J1939_SESSION_WAITING_ABORT session state
j1939_tp_rxtimer() will call j1939_session_deactivate_activate_next()
to deactivate current session, and rxtimer won't be set.

But for a multipacket broadcast session, __j1939_session_cancel() doesn't
set session->state = J1939_SESSION_WAITING_ABORT, thus the current session
won't be deactivated and hrtimer_start() is called to start a new
rxtimer again and again.

So fix it by moving session->state = J1939_SESSION_WAITING_ABORT out of
the if (!j1939_cb_is_broadcast(&session->skcb)) statement.
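
A toy model of the behaviour described above may help; it is a sketch
only, with hypothetical names (struct sk_session, sk_cancel(),
sk_rxtimer_expire()), and the "fixed" flag selects between the old and
the new placement of the state assignment:

```c
#include <assert.h>
#include <stdbool.h>

enum sk_state { SK_ACTIVE, SK_WAITING_ABORT, SK_DONE };

struct sk_session {
	enum sk_state state;
	bool broadcast;
	int rearms;		/* how often the rxtimer was restarted */
};

/* Cancel path: before the fix only non-broadcast sessions change state. */
static void sk_cancel(struct sk_session *s, bool fixed)
{
	if (fixed || !s->broadcast)
		s->state = SK_WAITING_ABORT;
}

/* One rxtimer expiry; returns true if the timer was re-armed. */
static bool sk_rxtimer_expire(struct sk_session *s, bool fixed)
{
	assert(s);
	if (s->state == SK_WAITING_ABORT) {
		/* second expiry: deactivate, do not restart the timer */
		s->state = SK_DONE;
		return false;
	}

	s->rearms++;		/* models the hrtimer_start() call */
	sk_cancel(s, fixed);
	return true;
}
```

With fixed == false a broadcast session never leaves SK_ACTIVE, so every
expiry re-arms the timer; with fixed == true the second expiry ends the
session.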

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Zhang Changzhong 
---
 net/can/j1939/transport.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index dd6a120..5757f9f 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1055,9 +1055,9 @@ static void __j1939_session_cancel(struct j1939_session 
*session,
lockdep_assert_held(&session->priv->active_session_list_lock);
 
session->err = j1939_xtp_abort_to_errno(priv, err);
+   session->state = J1939_SESSION_WAITING_ABORT;
/* do not send aborts on incoming broadcasts */
if (!j1939_cb_is_broadcast(&session->skcb)) {
-   session->state = J1939_SESSION_WAITING_ABORT;
j1939_xtp_tx_abort(priv, &session->skcb,
   !session->transmission,
   err, session->skcb.addr.pgn);
-- 
2.9.5

[PATCH net 0/4] support multipacket broadcast message

Zhang Changzhong (4):
  can: j1939: fix support for multipacket broadcast message
  can: j1939: cancel rxtimer on multipacket broadcast session complete
  can: j1939: abort multipacket broadcast session when timeout occurs
  can: j1939: add rxtimer for multipacket broadcast session

 net/can/j1939/transport.c | 48 +++
 1 file changed, 36 insertions(+), 12 deletions(-)

-- 
2.9.5

[PATCH net 1/4] can: j1939: fix support for multipacket broadcast message

Currently j1939_tp_im_involved_anydir() in j1939_tp_recv() checks the
previously set flags J1939_ECU_LOCAL_DST and J1939_ECU_LOCAL_SRC of the
incoming skb, thus a multipacket broadcast message is aborted by the
receive side because it may come from remote ECUs and have no exact
dst address. Similarly, j1939_tp_cmd_recv() and j1939_xtp_rx_dat()
don't process broadcast messages.

So fix it by checking for and processing broadcast messages in
j1939_tp_recv(), j1939_tp_cmd_recv() and j1939_xtp_rx_dat().

Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Zhang Changzhong 
---
 net/can/j1939/transport.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index 9f99af5..e5188ac 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1651,8 +1651,12 @@ static void j1939_xtp_rx_rts(struct j1939_priv *priv, 
struct sk_buff *skb,
return;
}
session = j1939_xtp_rx_rts_session_new(priv, skb);
-   if (!session)
+   if (!session) {
+   if (cmd == J1939_TP_CMD_BAM && 
j1939_sk_recv_match(priv, skcb))
+   netdev_info(priv->ndev, "%s: failed to create 
TP BAM session\n",
+   __func__);
return;
+   }
} else {
if (j1939_xtp_rx_rts_session_active(session, skb)) {
j1939_session_put(session);
@@ -1829,6 +1833,13 @@ static void j1939_xtp_rx_dat(struct j1939_priv *priv, 
struct sk_buff *skb)
else
j1939_xtp_rx_dat_one(session, skb);
}
+
+   if (j1939_cb_is_broadcast(skcb)) {
+   session = j1939_session_get_by_addr(priv, &skcb->addr, false,
+   false);
+   if (session)
+   j1939_xtp_rx_dat_one(session, skb);
+   }
 }
 
 /* j1939 main intf */
@@ -1920,7 +1931,7 @@ static void j1939_tp_cmd_recv(struct j1939_priv *priv, 
struct sk_buff *skb)
if (j1939_tp_im_transmitter(skcb))
j1939_xtp_rx_rts(priv, skb, true);
 
-   if (j1939_tp_im_receiver(skcb))
+   if (j1939_tp_im_receiver(skcb) || j1939_cb_is_broadcast(skcb))
j1939_xtp_rx_rts(priv, skb, false);
 
break;
@@ -1984,7 +1995,7 @@ int j1939_tp_recv(struct j1939_priv *priv, struct sk_buff 
*skb)
 {
struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb);
 
-   if (!j1939_tp_im_involved_anydir(skcb))
+   if (!j1939_tp_im_involved_anydir(skcb) && !j1939_cb_is_broadcast(skcb))
return 0;
 
switch (skcb->addr.pgn) {
-- 
2.9.5

Re: [PATCH v2 02/17] dma-contiguous: simplify cma_early_percent_memory()

2020-08-04 Thread Baoquan He

On 08/02/20 at 07:35pm, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> The memory size calculation in cma_early_percent_memory() traverses
> memblock.memory rather than simply call memblock_phys_mem_size(). The
> comment in that function suggests that at some point there should have been
> call to memblock_analyze() before memblock_phys_mem_size() could be used.
> As of now, there is no memblock_analyze() at all and
> memblock_phys_mem_size() can be used as soon as cold-plug memory is
> registered with memblock.
> 
> Replace loop over memblock.memory with a call to memblock_phys_mem_size().
> 
> Signed-off-by: Mike Rapoport 
> Reviewed-by: Christoph Hellwig 
> ---
>  kernel/dma/contiguous.c | 11 +--
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 15bc5026c485..1992afd8ca7b 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -73,16 +73,7 @@ early_param("cma", early_cma);
>  
>  static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
>  {
> - struct memblock_region *reg;
> - unsigned long total_pages = 0;
> -
> - /*
> -  * We cannot use memblock_phys_mem_size() here, because
> -  * memblock_analyze() has not been called yet.
> -  */
> - for_each_memblock(memory, reg)
> - total_pages += memblock_region_memory_end_pfn(reg) -
> -memblock_region_memory_base_pfn(reg);
> + unsigned long total_pages = PHYS_PFN(memblock_phys_mem_size());

Reviewed-by: Baoquan He 

>  
>   return (total_pages * CONFIG_CMA_SIZE_PERCENTAGE / 100) << PAGE_SHIFT;
>  }
> -- 
> 2.26.2
>

Re: [PATCH] Userfaultfd: Avoid double free of userfault_ctx and remove O_CLOEXEC

2020-08-04 Thread Aleksa Sarai

On 2020-08-04, Lokesh Gidra  wrote:
> when get_unused_fd_flags returns error, ctx will be freed by
> userfaultfd's release function, which is indirectly called by fput().
> Also, if anon_inode_getfile_secure() returns an error, then
> userfaultfd_ctx_put() is called, which calls mmdrop() and frees ctx.
> 
> Also, the O_CLOEXEC was inadvertently added to the call to
> get_unused_fd_flags() [1].

I disagree that it is "wrong" to do O_CLOEXEC-by-default (after all,
it's trivial to disable O_CLOEXEC, but it's non-trivial to enable it on
an existing file descriptor because it's possible for another thread to
exec() before you set the flag). Several new syscalls and fd-returning
facilities are O_CLOEXEC-by-default now (the most obvious being pidfds
and seccomp notifier fds).

At the very least there should be a new flag added that sets O_CLOEXEC.
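
To illustrate the asymmetry (a userspace sketch with hypothetical helper
names; only open(), fcntl(), F_GETFD, F_SETFD, FD_CLOEXEC and O_CLOEXEC
are real interfaces): setting the flag at creation time is atomic, and
clearing it afterwards is a single harmless fcntl(), whereas adding it
afterwards leaves a window in which another thread can exec().

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open with close-on-exec set atomically, no window for a racing exec(). */
static int sk_open_cloexec(const char *path)
{
	return open(path, O_RDONLY | O_CLOEXEC);
}

/* Returns 1 if FD_CLOEXEC is set, 0 if not, -1 on error. */
static int sk_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Opting out later is race free: at worst the fd stays private longer. */
static int sk_clear_cloexec(int fd)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFD);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}
```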

> Adding Al Viro's suggested-by, based on [2].
> 
> [1] 
> https://lore.kernel.org/lkml/1f69c0ab-5791-974f-8bc0-3997ab1d6...@dancol.org/
> [2] https://lore.kernel.org/lkml/20200719165746.gj2786...@zeniv.linux.org.uk/
> 
> Fixes: d08ac70b1e0d (Wire UFFD up to SELinux)
> Suggested-by: Al Viro 
> Reported-by: syzbot+75867c44841cb6373...@syzkaller.appspotmail.com
> Signed-off-by: Lokesh Gidra 
> ---
>  fs/userfaultfd.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index ae859161908f..e15eb8fdc083 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -2042,24 +2042,18 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
>   O_RDWR | (flags & UFFD_SHARED_FCNTL_FLAGS),
>   NULL);
>   if (IS_ERR(file)) {
> - fd = PTR_ERR(file);
> - goto out;
> + userfaultfd_ctx_put(ctx);
> + return PTR_ERR(file);
>   }
>  
> - fd = get_unused_fd_flags(O_RDONLY | O_CLOEXEC);
> + fd = get_unused_fd_flags(O_RDONLY);
>   if (fd < 0) {
>   fput(file);
> - goto out;
> + return fd;
>   }
>  
>   ctx->owner = file_inode(file);
>   fd_install(fd, file);
> -
> -out:
> - if (fd < 0) {
> - mmdrop(ctx->mm);
> - kmem_cache_free(userfaultfd_ctx_cachep, ctx);
> - }
>   return fd;
>  }
>  
> -- 
> 2.28.0.163.g6104cc2f0b6-goog
> 

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH




Re: [PATCH v2 1/2] ASoC: fsl_sai: Clean code for synchronous mode

2020-08-04 Thread Nicolin Chen

On Wed, Aug 05, 2020 at 10:23:52AM +0800, Shengjiu Wang wrote:
> Tx synchronous with Rx: The RMR is the word mask register. It is used
> to mask any word in the frame and is not related to clock generation,
> so it does not need to be changed when Tx is going to be enabled.
> 
> Rx synchronous with Tx: The TMR is the word mask register. It is used
> to mask any word in the frame and is not related to clock generation,
> so it does not need to be changed when Rx is going to be enabled.
> 
> Signed-off-by: Shengjiu Wang 

Can you rename the PATCH subject to something more specific?
For example, "Drop TMR/RMR settings for synchronous mode".

Please add this once it's addressed:
Reviewed-by: Nicolin Chen

Re: linux-next: manual merge of the pidfd tree with the risc-v tree

Hi Palmer,

On Tue, 04 Aug 2020 18:17:35 -0700 (PDT) Palmer Dabbelt  
wrote:
>
> >> diff --cc arch/riscv/Kconfig
> >> index 76a0cfad3367,f6a3a2bea3d8..
> >> --- a/arch/riscv/Kconfig
> >> +++ b/arch/riscv/Kconfig
> >> @@@ -57,9 -52,6 +57,8 @@@ config RISC
> >>select HAVE_ARCH_SECCOMP_FILTER
> >>select HAVE_ARCH_TRACEHOOK
> >>select HAVE_ASM_MODVERSIONS
> >>  + select HAVE_CONTEXT_TRACKING
> >> -  select HAVE_COPY_THREAD_TLS
> >>  + select HAVE_DEBUG_KMEMLEAK
> >>select HAVE_DMA_CONTIGUOUS if MMU
> >>select HAVE_EBPF_JIT if MMU
> >>select HAVE_FUTEX_CMPXCHG if FUTEX  
> >
> > This is now a conflict between the risc-v tree and Linus' tree.  
> 
> Thanks.  I'd just pulled in some stuff and was intending on sending a PR to
> Linus tomorrow (we've got some autobuilders that run overnight that I like to
> give a crack at the actual commit before I send anything).  For this one I
> think the best bet is to just mention it to Linus as a conflict to be fixed --
> the only other thing I can think of would be to rebase my tree, which seems
> worse at this point.

It's pretty trivial, just mention it.

-- 
Cheers,
Stephen Rothwell



[PATCH] io_uring: Fix use-after-free in io_sq_wq_submit_work()

2020-08-04 Thread Guoyu Huang

When ctx->sqo_mm is zero, io_sq_wq_submit_work() frees 'req'
without deleting it from 'task_list'. After that, 'req' is
accessed in io_ring_ctx_wait_and_kill(), which leads to
a use-after-free.

Signed-off-by: Guoyu Huang 
---
 fs/io_uring.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e0200406765c..4b5ac381c67f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2242,6 +2242,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
if (io_sqe_needs_user(sqe) && !cur_mm) {
if (!mmget_not_zero(ctx->sqo_mm)) {
ret = -EFAULT;
+   goto end_req;
} else {
cur_mm = ctx->sqo_mm;
use_mm(cur_mm);
--
2.25.1

Re: [PATCH v2] block: check queue's limits.discard_granularity in __blkdev_issue_discard()

2020-08-04 Thread Ming Lei

On Wed, Aug 05, 2020 at 10:57:23AM +0800, Coly Li wrote:
> If a loop device is created with a backing NVMe SSD, the current loop
> device driver doesn't correctly set its queue's limits.discard_granularity
> and leaves it as 0. If a discard request is issued at LBA 0 on this loop
> device, in __blkdev_issue_discard() the calculated req_sects will be 0,
> and a zero length discard request will trigger a BUG() panic in generic
> block layer code at block/blk-mq.c:563.
> 
> [  955.565006][   C39] [ cut here ]
> [  955.559660][   C39] invalid opcode:  [#1] SMP NOPTI
> [  955.622171][   C39] CPU: 39 PID: 248 Comm: ksoftirqd/39 Tainted: G 
>E 5.8.0-default+ #40
> [  955.622171][   C39] Hardware name: Lenovo ThinkSystem SR650 
> -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE160M-2.70]- 07/17/2020
> [  955.622175][   C39] RIP: 0010:blk_mq_end_request+0x107/0x110
> [  955.622177][   C39] Code: 48 8b 03 e9 59 ff ff ff 48 89 df 5b 5d 41 5c e9 
> 9f ed ff ff 48 8b 35 98 3c f4 00 48 83 c7 10 48 83 c6 19 e8 cb 56 c9 ff eb cb 
> <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 54
> [  955.622179][   C39] RSP: 0018:b1288701fe28 EFLAGS: 00010202
> [  955.749277][   C39] RAX: 0001 RBX: 956fffba5080 RCX: 
> 4003
> [  955.749278][   C39] RDX: 0003 RSI:  RDI: 
> 
> [  955.749279][   C39] RBP:  R08:  R09: 
> 
> [  955.749279][   C39] R10: b1288701fd28 R11: 0001 R12: 
> a8e05160
> [  955.749280][   C39] R13: 0004 R14: 0004 R15: 
> a7ad3a1e
> [  955.749281][   C39] FS:  () GS:95bfbda0() 
> knlGS:
> [  955.749282][   C39] CS:  0010 DS:  ES:  CR0: 80050033
> [  955.749282][   C39] CR2: 7f6f0ef766a8 CR3: 005a37012002 CR4: 
> 007606e0
> [  955.749283][   C39] DR0:  DR1:  DR2: 
> 
> [  955.749284][   C39] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [  955.749284][   C39] PKRU: 5554
> [  955.749285][   C39] Call Trace:
> [  955.749290][   C39]  blk_done_softirq+0x99/0xc0
> [  957.550669][   C39]  __do_softirq+0xd3/0x45f
> [  957.550677][   C39]  ? smpboot_thread_fn+0x2f/0x1e0
> [  957.550679][   C39]  ? smpboot_thread_fn+0x74/0x1e0
> [  957.550680][   C39]  ? smpboot_thread_fn+0x14e/0x1e0
> [  957.550684][   C39]  run_ksoftirqd+0x30/0x60
> [  957.550687][   C39]  smpboot_thread_fn+0x149/0x1e0
> [  957.886225][   C39]  ? sort_range+0x20/0x20
> [  957.886226][   C39]  kthread+0x137/0x160
> [  957.886228][   C39]  ? kthread_park+0x90/0x90
> [  957.886231][   C39]  ret_from_fork+0x22/0x30
> [  959.117120][   C39] ---[ end trace 3dacdac97e2ed164 ]---
> 
> This is the procedure to reproduce the panic,
>   # modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
>   # losetup -f /dev/nvme0n1 --direct-io=on
>   # blkdiscard /dev/loop0 -o 0 -l 0x200
> 
> This patch fixes the issue by checking q->limits.discard_granularity in
> __blkdev_issue_discard() before composing the discard bio. If the value
> is 0, it prints a warning and returns -EOPNOTSUPP to
> the caller to indicate that this buggy device driver doesn't support
> discard request.
> 
> Fixes: 9b15d109a6b2 ("block: improve discard bio alignment in 
> __blkdev_issue_discard()")
> Fixes: c52abf563049 ("loop: Better discard support for block devices")
> Reported-and-suggested-by: Ming Lei 
> Signed-off-by: Coly Li 
> Cc: Bart Van Assche 
> Cc: Christoph Hellwig 
> Cc: Enzo Matsumiya 
> Cc: Evan Green 
> Cc: Hannes Reinecke 
> Cc: Jens Axboe 
> Cc: Martin K. Petersen 
> Cc: Ming Lei 
> Cc: Xiao Ni 
> ---
> Changelog:
> v2: fix typo of the wrong return error code.
> v1: first version.
> 
>  block/blk-lib.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 019e09bb9c0e..729f05729529 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -47,6 +47,10 @@ int __blkdev_issue_discard(struct block_device *bdev, 
> sector_t sector,
>   op = REQ_OP_DISCARD;
>   }
>  
> + /* In case the discard granularity isn't set by buggy device driver */
> + if (WARN_ON_ONCE(!q->limits.discard_granularity))
> + return -EOPNOTSUPP;
> +
>   bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
>   if ((sector | nr_sects) & bs_mask)
>   return -EINVAL;
> -- 
> 2.26.2
> 

Looks fine:

Reviewed-by: Ming Lei 

BTW, it might be helpful to print the buggy disk name, so that people
can find the related driver easily.
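
Something along these lines, as a userspace sketch only (struct
sk_queue_limits and sk_check_discard() are hypothetical, and a real
version would use the kernel's own warning and disk name helpers instead
of fprintf()):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical subset of the queue limits used by the check. */
struct sk_queue_limits {
	unsigned int discard_granularity;	/* bytes, 0 if never set */
	unsigned int logical_block_size;	/* bytes */
};

/* Validate a discard request before any bio would be composed. */
static int sk_check_discard(const char *disk_name,
			    const struct sk_queue_limits *lim,
			    unsigned long long sector,
			    unsigned long long nr_sects)
{
	unsigned long long bs_mask;

	assert(disk_name && lim);

	/* Granularity not set by the driver: name the disk and refuse. */
	if (!lim->discard_granularity) {
		fprintf(stderr, "%s: discard granularity is 0\n", disk_name);
		return -EOPNOTSUPP;
	}

	/* Start and length must be aligned to the logical block size. */
	bs_mask = (lim->logical_block_size >> 9) - 1;
	if ((sector | nr_sects) & bs_mask)
		return -EINVAL;

	return 0;
}
```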

Thanks,
Ming

Re: [PATCH v4 12/12] dt-bindings: sound: lpass-cpu: Move to yaml format

2020-08-04 Thread Rohit Kumar




On 8/3/2020 11:52 PM, Rob Herring wrote:

On Mon, Aug 3, 2020 at 2:28 AM Rohit Kumar  wrote:

Thanks Rob for reviewing

On 7/23/2020 10:46 PM, Rob Herring wrote:

On Wed, Jul 22, 2020 at 04:01:55PM +0530, Rohit kumar wrote:

Update lpass-cpu binding with yaml formats.

Signed-off-by: Rohit kumar 
---
   .../devicetree/bindings/sound/qcom,lpass-cpu.txt   | 130 ---
   .../devicetree/bindings/sound/qcom,lpass-cpu.yaml  | 185 
+
   2 files changed, 185 insertions(+), 130 deletions(-)
   delete mode 100644 Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt
   create mode 100644 
Documentation/devicetree/bindings/sound/qcom,lpass-cpu.yaml

diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt 
b/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt
deleted file mode 100644
index c21392e..
--- a/Documentation/devicetree/bindings/sound/qcom,lpass-cpu.txt
+++ /dev/null
@@ -1,130 +0,0 @@
-* Qualcomm Technologies LPASS CPU DAI
-
-This node models the Qualcomm Technologies Low-Power Audio SubSystem (LPASS).
-
-Required properties:
-
-- compatible: "qcom,lpass-cpu" or "qcom,apq8016-lpass-cpu" or
-  "qcom,lpass-cpu-sc7180"
-- clocks: Must contain an entry for each entry in clock-names.
-- clock-names   : A list which must include the following entries:
-* "ahbix-clk"
-* "mi2s-osr-clk"
-* "mi2s-bit-clk"
-: required clocks for "qcom,lpass-cpu-apq8016"
-* "ahbix-clk"
-* "mi2s-bit-clk0"
-* "mi2s-bit-clk1"
-* "mi2s-bit-clk2"
-* "mi2s-bit-clk3"
-* "pcnoc-mport-clk"
-* "pcnoc-sway-clk"
-: required clocks for "qcom,lpass-cpu-sc7180"
-* "audio-core"
-* "mclk0"
-* "mi2s-bit-clk0"
-* "mi2s-bit-clk1"
-* "pcnoc-sway-clk"
-* "pcnoc-mport-clk"
-
-- interrupts: Must contain an entry for each entry in
-  interrupt-names.
-- interrupt-names   : A list which must include the following entries:
-* "lpass-irq-lpaif"
-- pinctrl-N : One property must exist for each entry in
-  pinctrl-names.  See ../pinctrl/pinctrl-bindings.txt
-  for details of the property values.
-- pinctrl-names : Must contain a "default" entry.
-- reg   : Must contain an address for each entry in 
reg-names.
-- reg-names : A list which must include the following entries:
-* "lpass-lpaif"
-- #address-cells: Must be 1
-- #size-cells   : Must be 0
-
-
-
-Optional properties:
-
-- qcom,adsp : Phandle for the audio DSP node
-
-By default, the driver uses up to 4 MI2S SD lines, for a total of 8 channels.
-The SD lines to use can be configured by adding subnodes for each of the DAIs.
-
-Required properties for each DAI (represented by a subnode):
-- reg   : Must be one of the DAI IDs
-  (usually part of dt-bindings header)
-- qcom,playback-sd-lines: List of serial data lines to use for playback
-  Each SD line should be represented by a number from 0-3.
-- qcom,capture-sd-lines : List of serial data lines to use for capture
-  Each SD line should be represented by a number from 0-3.
-
-Note that adding a subnode changes the default to "no lines configured",
-so both playback and capture lines should be configured when a subnode is 
added.
-
-Examples:
-1)
-
-lpass@2810 {
-compatible = "qcom,lpass-cpu";
-clocks = < AHBIX_CLK>, < MI2S_OSR_CLK>, < MI2S_BIT_CLK>;
-clock-names = "ahbix-clk", "mi2s-osr-clk", "mi2s-bit-clk";
-interrupts = <0 85 1>;
-interrupt-names = "lpass-irq-lpaif";
-pinctrl-names = "default", "idle";
-pinctrl-0 = <_default>;
-pinctrl-1 = <_idle>;
-reg = <0x2810 0x1>;
-reg-names = "lpass-lpaif";
-qcom,adsp = <>;
-
-#address-cells = <1>;
-#size-cells = <0>;
-
-/* Optional to set different MI2S SD lines */
-dai@3 {
-reg = ;
-qcom,playback-sd-lines = <0 1>;
-};
-};
-
-2)
-
-#include 
-
-lpass_cpu: lpass {
-compatible = "qcom,lpass-cpu-sc7180";
-
-reg = <0 0x62F0 0 0x29000>;
-
-iommus = <_smmu 0x1020 0>;
-
-power-domains = <_hm LPASS_CORE_HM_GDSCR>;
-clocks = < GCC_LPASS_CFG_NOC_SWAY_CLK>,
-< LPASS_AUDIO_CORE_CORE_CLK>,
-< LPASS_AUDIO_CORE_EXT_MCLK0_CLK>,
-< LPASS_AUDIO_CORE_SYSNOC_MPORT_CORE_CLK>,
-<

Re: [PATCH v6 1/4] IMA: Add func to measure LSM state and policy

2020-08-04 Thread Mimi Zohar

Hi Lakshmi,

There are still a number of other patch sets that need to be reviewed
before I get to this one.  The comment below is a high-level one.

On Tue, 2020-08-04 at 17:43 -0700, Lakshmi Ramasubramanian wrote:
> Critical data structures of security modules need to be measured to
> enable an attestation service to verify if the configuration and
> policies for the security modules have been setup correctly and
> that they haven't been tampered with at runtime. A new IMA policy is
> required for handling this measurement.
> 
> Define two new IMA policy func namely LSM_STATE and LSM_POLICY to
> measure the state and the policy provided by the security modules.
> Update ima_match_rules() and ima_validate_rule() to check for
> the new func and ima_parse_rule() to handle the new func.

I can understand wanting to measure the in-kernel LSM memory state to
make sure it hasn't changed, but policies are stored as files.  Buffer
measurements should be limited to those things that are not files.

Changing how data is passed to the kernel has been happening for a
while.  For example, instead of passing the kernel module or kernel
image in a buffer, the new syscalls - finit_module, kexec_file_load -
pass an open file descriptor.  Similarly, instead of loading the IMA
policy data, a pathname may be provided.

Pre and post security hooks already exist for reading files.  Instead
of adding IMA support for measuring the policy file data, update the
mechanism for loading the LSM policy.  Then not only will you be able
to measure the policy, you'll also be able to require the policy be
signed.

Mimi

arch/mips/kernel/setup.c:459 early_parse_elfcorehdr() warn: inconsistent indenting

2020-08-04 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   442489c219235991de86d0277b5d859ede6d8792
commit: a94e4f24ec836c8984f839594bad7454184975b1 MIPS: init: Drop boot_mem_map
date:   12 months ago
config: mips-randconfig-m031-20200805 (attached as .config)
compiler: mips-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot 

smatch warnings:
arch/mips/kernel/setup.c:459 early_parse_elfcorehdr() warn: inconsistent indenting

vim +459 arch/mips/kernel/setup.c

   450  
   451  #ifdef CONFIG_PROC_VMCORE
   452  unsigned long setup_elfcorehdr, setup_elfcorehdr_size;
   453  static int __init early_parse_elfcorehdr(char *p)
   454  {
   455  struct memblock_region *mem;
   456  
   457  setup_elfcorehdr = memparse(p, &p);
   458  
 > 459   for_each_memblock(memory, mem) {
   460  unsigned long start = mem->base;
   461  unsigned long end = mem->end;
   462  if (setup_elfcorehdr >= start && setup_elfcorehdr < end) {
   463  /*
   464   * Reserve from the elf core header to the end of
   465   * the memory segment, that should all be kdump
   466   * reserved memory.
   467   */
   468  setup_elfcorehdr_size = end - setup_elfcorehdr;
   469  break;
   470  }
   471  }
   472  /*
   473   * If we don't find it in the memory map, then we shouldn't
   474   * have to worry about it, as the new kernel won't use it.
   475   */
   476  return 0;
   477  }
   478  early_param("elfcorehdr", early_parse_elfcorehdr);
   479  #endif
   480  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH v5 4/4] powerpc/pseries/iommu: Allow bigger 64bit window by removing default DMA window




On 05/08/2020 13:04, Leonardo Bras wrote:
> On LoPAR "DMA Window Manipulation Calls", it's recommended to remove the
> default DMA window for the device, before attempting to configure a DDW,
> in order to make the maximum resources available for the next DDW to be
> created.
> 
> This is a requirement for using DDW on devices in which hypervisor
> allows only one DMA window.
> 
> If setting up a new DDW fails anywhere after the removal of this
> default DMA window, it's needed to restore the default DMA window.
> For this, an implementation of ibm,reset-pe-dma-windows rtas call is
> needed:
> 
> Platforms supporting the DDW option starting with LoPAR level 2.7 implement
> ibm,ddw-extensions. The first extension available (index 2) carries the
> token for ibm,reset-pe-dma-windows rtas call, which is used to restore
> the default DMA window for a device, if it has been deleted.
> 
> It does so by resetting the TCE table allocation for the PE to its
> boot time value, available in "ibm,dma-window" device tree node.
> 
> Signed-off-by: Leonardo Bras 
> Tested-by: David Dai 


Reviewed-by: Alexey Kardashevskiy 



> ---
>  arch/powerpc/platforms/pseries/iommu.c | 73 +++---
>  1 file changed, 66 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 4e33147825cc..e4198700ed1a 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -1066,6 +1066,38 @@ static phys_addr_t ddw_memory_hotplug_max(void)
>   return max_addr;
>  }
>  
> +/*
> + * Platforms supporting the DDW option starting with LoPAR level 2.7 implement
> + * ibm,ddw-extensions, which carries the rtas token for
> + * ibm,reset-pe-dma-windows.
> + * That rtas-call can be used to restore the default DMA window for the device.
> + */
> +static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
> +{
> + int ret;
> + u32 cfg_addr, reset_dma_win;
> + u64 buid;
> + struct device_node *dn;
> + struct pci_dn *pdn;
> +
> + ret = ddw_read_ext(par_dn, DDW_EXT_RESET_DMA_WIN, &reset_dma_win);
> + if (ret)
> + return;
> +
> + dn = pci_device_to_OF_node(dev);
> + pdn = PCI_DN(dn);
> + buid = pdn->phb->buid;
> + cfg_addr = (pdn->busno << 16) | (pdn->devfn << 8);
> +
> + ret = rtas_call(reset_dma_win, 3, 1, NULL, cfg_addr, BUID_HI(buid),
> + BUID_LO(buid));
> + if (ret)
> + dev_info(&dev->dev,
> +  "ibm,reset-pe-dma-windows(%x) %x %x %x returned %d ",
> +  reset_dma_win, cfg_addr, BUID_HI(buid), BUID_LO(buid),
> +  ret);
> +}
> +
>  /*
>   * If the PE supports dynamic dma windows, and there is space for a table
>   * that can map all pages in a linear offset, then setup such a table,
> @@ -1090,6 +1122,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
>   struct property *win64;
>   struct dynamic_dma_window_prop *ddwprop;
>   struct failed_ddw_pdn *fpdn;
> + bool default_win_removed = false;
>  
>   mutex_lock(&direct_window_init_mutex);
>  
> @@ -1133,14 +1166,38 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
>   if (ret != 0)
>   goto out_failed;
>  
> + /*
> +  * If there is no window available, remove the default DMA window,
> +  * if it's present. This will make all the resources available to the
> +  * new DDW window.
> +  * If anything fails after this, we need to restore it, so also check
> +  * for extensions presence.
> +  */
>   if (query.windows_available == 0) {
> - /*
> -  * no additional windows are available for this device.
> -  * We might be able to reallocate the existing window,
> -  * trading in for a larger page size.
> -  */
> - dev_dbg(&dev->dev, "no free dynamic windows");
> - goto out_failed;
> + struct property *default_win;
> + int reset_win_ext;
> +
> + default_win = of_find_property(pdn, "ibm,dma-window", NULL);
> + if (!default_win)
> + goto out_failed;
> +
> + reset_win_ext = ddw_read_ext(pdn, DDW_EXT_RESET_DMA_WIN, NULL);
> + if (reset_win_ext)
> + goto out_failed;
> +
> + remove_dma_window(pdn, ddw_avail, default_win);
> + default_win_removed = true;
> +
> + /* Query again, to check if the window is available */
> + ret = query_ddw(dev, ddw_avail, &query, pdn);
> + if (ret != 0)
> + goto out_failed;
> +
> + if (query.windows_available == 0) {
> + /* no windows are available for this device. */
> + dev_dbg(&dev->dev, "no free dynamic windows");
> + goto out_failed;
> +

Re: [PATCH v5 3/4] powerpc/pseries/iommu: Move window-removing part of remove_ddw into remove_dma_window




On 05/08/2020 13:04, Leonardo Bras wrote:
> Move the window-removing part of remove_ddw into a new function
> (remove_dma_window), so it can be used to remove other DMA windows.
> 
> It's useful for removing DMA windows that don't create DIRECT64_PROPNAME
> property, like the default DMA window from the device, which uses
> "ibm,dma-window".
> 
> Signed-off-by: Leonardo Bras 
> Tested-by: David Dai 


Reviewed-by: Alexey Kardashevskiy 



> ---
>  arch/powerpc/platforms/pseries/iommu.c | 45 +++---
>  1 file changed, 27 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 1a933c4e8bba..4e33147825cc 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -781,25 +781,14 @@ static int __init disable_ddw_setup(char *str)
>  
>  early_param("disable_ddw", disable_ddw_setup);
>  
> -static void remove_ddw(struct device_node *np, bool remove_prop)
> +static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
> +   struct property *win)
>  {
>   struct dynamic_dma_window_prop *dwp;
> - struct property *win64;
> - u32 ddw_avail[DDW_APPLICABLE_SIZE];
>   u64 liobn;
> - int ret = 0;
> -
> - ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
> -  &ddw_avail[0], DDW_APPLICABLE_SIZE);
> -
> - win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
> - if (!win64)
> - return;
> -
> - if (ret || win64->length < sizeof(*dwp))
> - goto delprop;
> + int ret;
>  
> - dwp = win64->value;
> + dwp = win->value;
>   liobn = (u64)be32_to_cpu(dwp->liobn);
>  
>   /* clear the whole window, note the arg is in kernel pages */
> @@ -821,10 +810,30 @@ static void remove_ddw(struct device_node *np, bool remove_prop)
>   pr_debug("%pOF: successfully removed direct window: rtas returned "
>   "%d to ibm,remove-pe-dma-window(%x) %llx\n",
>   np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
> +}
> +
> +static void remove_ddw(struct device_node *np, bool remove_prop)
> +{
> + struct property *win;
> + u32 ddw_avail[DDW_APPLICABLE_SIZE];
> + int ret = 0;
> +
> + ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
> +  &ddw_avail[0], DDW_APPLICABLE_SIZE);
> + if (ret)
> + return;
> +
> + win = of_find_property(np, DIRECT64_PROPNAME, NULL);
> + if (!win)
> + return;
> +
> + if (win->length >= sizeof(struct dynamic_dma_window_prop))
> + remove_dma_window(np, ddw_avail, win);
> +
> + if (!remove_prop)
> + return;
>  
> -delprop:
> - if (remove_prop)
> - ret = of_remove_property(np, win64);
> + ret = of_remove_property(np, win);
>   if (ret)
>   pr_warn("%pOF: failed to remove direct window property: %d\n",
>   np, ret);
> 

-- 
Alexey

Re: [PATCH v5 2/4] powerpc/pseries/iommu: Update call to ibm,query-pe-dma-windows




On 05/08/2020 13:04, Leonardo Bras wrote:
> From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can make the number of
> outputs from "ibm,query-pe-dma-windows" go from 5 to 6.
> 
> This change of output size is meant to expand the address size of
> largest_available_block PE TCE from 32-bit to 64-bit, which ends up
> shifting page_size and migration_capable.
> 
> This ends up requiring the update of
> ddw_query_response->largest_available_block from u32 to u64, and manually
> assigning the values from the buffer into this struct, according to
> output size.
> 
> Also, a routine was created for helping reading the ddw extensions as
> suggested by LoPAR: First reading the size of the extension array from
> index 0, checking if the property exists, and then returning its value.
> 
> Signed-off-by: Leonardo Bras 
> Tested-by: David Dai 


Reviewed-by: Alexey Kardashevskiy 



> ---
>  arch/powerpc/platforms/pseries/iommu.c | 91 +++---
>  1 file changed, 81 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index ac0d6376bdad..1a933c4e8bba 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -47,6 +47,12 @@ enum {
>   DDW_APPLICABLE_SIZE
>  };
>  
> +enum {
> + DDW_EXT_SIZE = 0,
> + DDW_EXT_RESET_DMA_WIN = 1,
> + DDW_EXT_QUERY_OUT_SIZE = 2
> +};
> +
>  static struct iommu_table_group *iommu_pseries_alloc_group(int node)
>  {
>   struct iommu_table_group *table_group;
> @@ -342,7 +348,7 @@ struct direct_window {
>  /* Dynamic DMA Window support */
>  struct ddw_query_response {
>   u32 windows_available;
> - u32 largest_available_block;
> + u64 largest_available_block;
>   u32 page_size;
>   u32 migration_capable;
>  };
> @@ -877,14 +883,62 @@ static int find_existing_ddw_windows(void)
>  }
>  machine_arch_initcall(pseries, find_existing_ddw_windows);
>  
> +/**
> + * ddw_read_ext - Get the value of a DDW extension
> + * @np:  device node from which the extension value is to be read.
> + * @extnum:  index number of the extension.
> + * @value:   pointer to return value, modified when extension is available.
> + *
> + * Checks if "ibm,ddw-extensions" exists for this node, and get the value
> + * on index 'extnum'.
> + * It can be used only to check if a property exists, passing value == NULL.
> + *
> + * Returns:
> + *   0 if extension successfully read
> + *   -EINVAL if the "ibm,ddw-extensions" does not exist,
> + *   -ENODATA if "ibm,ddw-extensions" does not have a value, and
> + *   -EOVERFLOW if "ibm,ddw-extensions" does not contain this extension.
> + */
> +static inline int ddw_read_ext(const struct device_node *np, int extnum,
> +u32 *value)
> +{
> + static const char propname[] = "ibm,ddw-extensions";
> + u32 count;
> + int ret;
> +
> + ret = of_property_read_u32_index(np, propname, DDW_EXT_SIZE, &count);
> + if (ret)
> + return ret;
> +
> + if (count < extnum)
> + return -EOVERFLOW;
> +
> + if (!value)
> + value = &count;
> +
> + return of_property_read_u32_index(np, propname, extnum, value);
> +}
> +
>  static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
> - struct ddw_query_response *query)
> +  struct ddw_query_response *query,
> +  struct device_node *parent)
>  {
>   struct device_node *dn;
>   struct pci_dn *pdn;
> - u32 cfg_addr;
> + u32 cfg_addr, ext_query, query_out[5];
>   u64 buid;
> - int ret;
> + int ret, out_sz;
> +
> + /*
> +  * From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can rule how many
> +  * output parameters ibm,query-pe-dma-windows will have, ranging from
> +  * 5 to 6.
> +  */
> + ret = ddw_read_ext(parent, DDW_EXT_QUERY_OUT_SIZE, &ext_query);
> + if (!ret && ext_query == 1)
> + out_sz = 6;
> + else
> + out_sz = 5;
>  
>   /*
>* Get the config address and phb buid of the PE window.
> @@ -897,11 +951,28 @@ static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
>   buid = pdn->phb->buid;
>   cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
>  
> - ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, 5, (u32 *)query,
> + ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, out_sz, query_out,
>   cfg_addr, BUID_HI(buid), BUID_LO(buid));
> - dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
> - " returned %d\n", ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr,
> -  BUID_HI(buid), BUID_LO(buid), ret);
> + dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x returned %d\n",
> +  ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr, BUID_HI(buid),
> +  BUID_LO(buid), ret);
> +
> + switch (out_sz) {
> + case 5:
> +

Re: [PATCH v5 1/4] powerpc/pseries/iommu: Create defines for operations in ibm,ddw-applicable




On 05/08/2020 13:04, Leonardo Bras wrote:
> Create defines to help handling ibm,ddw-applicable values, avoiding
> confusion about the index of given operations.
> 
> Signed-off-by: Leonardo Bras 
> Tested-by: David Dai 

Reviewed-by: Alexey Kardashevskiy 



> ---
>  arch/powerpc/platforms/pseries/iommu.c | 43 --
>  1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> index 6d47b4a3ce39..ac0d6376bdad 100644
> --- a/arch/powerpc/platforms/pseries/iommu.c
> +++ b/arch/powerpc/platforms/pseries/iommu.c
> @@ -39,6 +39,14 @@
>  
>  #include "pseries.h"
>  
> +enum {
> + DDW_QUERY_PE_DMA_WIN  = 0,
> + DDW_CREATE_PE_DMA_WIN = 1,
> + DDW_REMOVE_PE_DMA_WIN = 2,
> +
> + DDW_APPLICABLE_SIZE
> +};
> +
>  static struct iommu_table_group *iommu_pseries_alloc_group(int node)
>  {
>   struct iommu_table_group *table_group;
> @@ -771,12 +779,12 @@ static void remove_ddw(struct device_node *np, bool remove_prop)
>  {
>   struct dynamic_dma_window_prop *dwp;
>   struct property *win64;
> - u32 ddw_avail[3];
> + u32 ddw_avail[DDW_APPLICABLE_SIZE];
>   u64 liobn;
>   int ret = 0;
>  
>   ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
> -  &ddw_avail[0], 3);
> +  &ddw_avail[0], DDW_APPLICABLE_SIZE);
>  
>   win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
>   if (!win64)
> @@ -798,15 +806,15 @@ static void remove_ddw(struct device_node *np, bool remove_prop)
>   pr_debug("%pOF successfully cleared tces in window.\n",
>np);
>  
> - ret = rtas_call(ddw_avail[2], 1, 1, NULL, liobn);
> + ret = rtas_call(ddw_avail[DDW_REMOVE_PE_DMA_WIN], 1, 1, NULL, liobn);
>   if (ret)
>   pr_warn("%pOF: failed to remove direct window: rtas returned "
>   "%d to ibm,remove-pe-dma-window(%x) %llx\n",
> - np, ret, ddw_avail[2], liobn);
> + np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
>   else
>   pr_debug("%pOF: successfully removed direct window: rtas returned "
>   "%d to ibm,remove-pe-dma-window(%x) %llx\n",
> - np, ret, ddw_avail[2], liobn);
> + np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
>  
>  delprop:
>   if (remove_prop)
> @@ -889,11 +897,11 @@ static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
>   buid = pdn->phb->buid;
>   cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
>  
> - ret = rtas_call(ddw_avail[0], 3, 5, (u32 *)query,
> -   cfg_addr, BUID_HI(buid), BUID_LO(buid));
> + ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, 5, (u32 *)query,
> + cfg_addr, BUID_HI(buid), BUID_LO(buid));
>   dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
> - " returned %d\n", ddw_avail[0], cfg_addr, BUID_HI(buid),
> - BUID_LO(buid), ret);
> + " returned %d\n", ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr,
> +  BUID_HI(buid), BUID_LO(buid), ret);
>   return ret;
>  }
>  
> @@ -920,15 +928,16 @@ static int create_ddw(struct pci_dev *dev, const u32 *ddw_avail,
>  
>   do {
>   /* extra outputs are LIOBN and dma-addr (hi, lo) */
> - ret = rtas_call(ddw_avail[1], 5, 4, (u32 *)create,
> - cfg_addr, BUID_HI(buid), BUID_LO(buid),
> - page_shift, window_shift);
> + ret = rtas_call(ddw_avail[DDW_CREATE_PE_DMA_WIN], 5, 4,
> + (u32 *)create, cfg_addr, BUID_HI(buid),
> + BUID_LO(buid), page_shift, window_shift);
>   } while (rtas_busy_delay(ret));
>   dev_info(&dev->dev,
>   "ibm,create-pe-dma-window(%x) %x %x %x %x %x returned %d "
> - "(liobn = 0x%x starting addr = %x %x)\n", ddw_avail[1],
> -  cfg_addr, BUID_HI(buid), BUID_LO(buid), page_shift,
> -  window_shift, ret, create->liobn, create->addr_hi, create->addr_lo);
> + "(liobn = 0x%x starting addr = %x %x)\n",
> +  ddw_avail[DDW_CREATE_PE_DMA_WIN], cfg_addr, BUID_HI(buid),
> +  BUID_LO(buid), page_shift, window_shift, ret, create->liobn,
> +  create->addr_hi, create->addr_lo);
>  
>   return ret;
>  }
> @@ -996,7 +1005,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
>   int page_shift;
>   u64 dma_addr, max_addr;
>   struct device_node *dn;
> - u32 ddw_avail[3];
> + u32 ddw_avail[DDW_APPLICABLE_SIZE];
>   struct direct_window *window;
>   struct property *win64;
>   struct dynamic_dma_window_prop *ddwprop;
> @@ -1029,7 +1038,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
> device_node

Re: [PATCH 1/2] riscv: ptrace: Use the correct API for `fcsr' access

2020-08-04 Thread Palmer Dabbelt

On Tue, 04 Aug 2020 19:48:07 PDT (-0700), v...@zeniv.linux.org.uk wrote:

On Tue, Aug 04, 2020 at 07:20:05PM -0700, Palmer Dabbelt wrote:

On Tue, 04 Aug 2020 19:07:45 PDT (-0700), v...@zeniv.linux.org.uk wrote:
> On Tue, Aug 04, 2020 at 07:01:01PM -0700, Palmer Dabbelt wrote:
>
> > > We currently have @start_pos fixed at 0 across all calls, which works as
> > > a result of the implementation, in particular because we have no padding
> > > between the FP general registers and the FP control and status register,
> > > but appears not to have been the intent of the API and is not what other
> > > ports do, requiring one to study the copy handlers to understand what is
> > > going on here.
>
> start_pos *is* fixed at 0 and it's going to go away, along with the
> sodding user_regset_copyout() very shortly.  ->get() is simply a bad API.
> See vfs.git#work.regset for replacement.  And ->put() is also going to be
> taken out and shot (next cycle, most likely).

I'm not sure I understand what you're saying, but given that branch replaces
all of this I guess it's best to just do nothing on our end here?

It doesn't replace ->put() (for now); it _does_ replace ->get() and AFAICS the
replacement is much saner:

static int riscv_fpr_get(struct task_struct *target,
			 const struct user_regset *regset,
			 struct membuf to)
{
	struct __riscv_d_ext_state *fstate = &target->thread.fstate;

	membuf_write(&to, fstate, offsetof(struct __riscv_d_ext_state, fcsr));
	membuf_store(&to, fstate->fcsr);
	return membuf_zero(&to, 4);	// explicitly pad
}

user_regset_copyout() calling conventions are atrocious and so are those of
regset ->get().  The best thing to do with both is to take them out of their
misery and be done with that.  Do you see any problems with riscv gdbserver
on current linux-next?  If not, I'd rather see that "API" simply go away...
If there are problems, I would very much prefer fixes on top of what's done
in that branch.
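
To make the layout in the snippet above concrete, here is a minimal
user-space sketch of the same idea: a bounded output cursor that is filled
with the FP registers, then fcsr, then explicit zero padding.  It is not the
kernel's membuf code.  The names membuf_sketch, sketch_write, sketch_zero,
fp_state_sketch and fpr_get_sketch are hypothetical, and the state is assumed
to be 32 64-bit registers followed by a 32-bit fcsr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space model of a bounded output cursor. */
struct membuf_sketch {
	unsigned char *p;	/* next byte to fill */
	size_t left;		/* bytes still free */
};

/* Copy up to size bytes, never past the end; returns bytes still free. */
static size_t sketch_write(struct membuf_sketch *s, const void *v, size_t size)
{
	assert(s != NULL && v != NULL);
	if (size > s->left)
		size = s->left;
	memcpy(s->p, v, size);
	s->p += size;
	s->left -= size;
	return s->left;
}

/* Zero-fill up to size bytes; returns bytes still free. */
static size_t sketch_zero(struct membuf_sketch *s, size_t size)
{
	assert(s != NULL);
	if (size > s->left)
		size = s->left;
	memset(s->p, 0, size);
	s->p += size;
	s->left -= size;
	return s->left;
}

/* Assumed layout: 32 double-precision registers, then fcsr. */
struct fp_state_sketch {
	uint64_t f[32];
	uint32_t fcsr;
};

/* The cursor is passed by value, so the caller's copy is not advanced. */
static size_t fpr_get_sketch(const struct fp_state_sketch *st,
			     struct membuf_sketch to)
{
	sketch_write(&to, st->f, sizeof(st->f));
	sketch_write(&to, &st->fcsr, sizeof(st->fcsr));
	return sketch_zero(&to, 4);	/* explicit padding after fcsr */
}
```

The point of the design is that the callee never sees a start position or a
count: it just appends in order, and the cursor silently truncates when the
destination is shorter than the full register set.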

I guess my confusion was about "start_pos *is* fixed at 0": it certainly is
zero in the code right now, but when poking around while reviewing the patch I
didn't see any reason that must be so.  Admittedly all I did was read the
prototype and function, so maybe I'm just missing something.  That said, if
it's all going away anyway then I don't really care either way.

As far as I can tell the patch set in question (the RISC-V one) doesn't change
any functionality.  I don't actually use GDB, but I haven't seen any issues
reported in a few years so if there is one I've missed it.

I did this ptrace stuff many years ago (IIRC it was actually my first RISC-V
Linux patch), and all I really remember is that it seemed way more complicated
than it needed to be.  I'm happy to just drop our patch set, as yours looks way
cleaner to me and if you're already planning on fixing put() then it doesn't
seem worth the churn.

Re: [v2,5/6] reset-controller: ti: Introduce force-update method

2020-08-04 Thread Yingjoe Chen

On Mon, 2020-08-03 at 14:15 +0800, Crystal Guo wrote:
> Introduce force-update method for assert and deassert interface,
> which force the write operation in case the read already happens
> to return the correct value.
> 
> Signed-off-by: Crystal Guo 
> ---
>  drivers/reset/reset-ti-syscon.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/reset/reset-ti-syscon.c b/drivers/reset/reset-ti-syscon.c
> index 1c74bcb9a6c3..f4baf78afd14 100644
> --- a/drivers/reset/reset-ti-syscon.c
> +++ b/drivers/reset/reset-ti-syscon.c
> @@ -57,6 +57,7 @@ struct ti_syscon_reset_data {
>   struct ti_syscon_reset_control *controls;
>   unsigned int nr_controls;
>   bool assert_deassert_together;
> + bool update_force;
>  };
>  
>  #define to_ti_syscon_reset_data(rcdev)   \
> @@ -90,7 +91,10 @@ static int ti_syscon_reset_assert(struct reset_controller_dev *rcdev,
>   mask = BIT(control->assert_bit);
>   value = (control->flags & ASSERT_SET) ? mask : 0x0;
>  
> - return regmap_update_bits(data->regmap, control->assert_offset, mask, value);
> + if (data->update_force)
> + return regmap_write_bits(data->regmap, control->assert_offset, mask, value);
> + else
> + return regmap_update_bits(data->regmap, control->assert_offset, mask, value);
>  }
>  
>  /**
> @@ -121,7 +125,10 @@ static int ti_syscon_reset_deassert(struct reset_controller_dev *rcdev,
>   mask = BIT(control->deassert_bit);
>   value = (control->flags & DEASSERT_SET) ? mask : 0x0;
>  
> - return regmap_update_bits(data->regmap, control->deassert_offset, mask, value);
> + if (data->update_force)
> + return regmap_write_bits(data->regmap, control->deassert_offset, mask, value);
> + else
> + return regmap_update_bits(data->regmap, control->deassert_offset, mask, value);
>  }
>  
>  /**
> @@ -223,6 +230,10 @@ static int ti_syscon_reset_probe(struct platform_device *pdev)
>   data->assert_deassert_together = true;
>   else
>   data->assert_deassert_together = false;
> + if (of_property_read_bool(np, "update-force"))
> + data->update_force = true;
> + else
> + data->update_force = false;

You are using 'force-update' in the commit message, and I think that is the
better name.
Please change it if we still need this property.

Joe.C

Re: [v2,3/6] dt-binding: reset-controller: ti: add generic-reset to compatible

2020-08-04 Thread Yingjoe Chen

On Tue, 2020-08-04 at 10:15 +0200, Philipp Zabel wrote:
> On Mon, 2020-08-03 at 14:15 +0800, Crystal Guo wrote:
> > The TI syscon reset controller provides a common reset management,
> > and should be suitable for other SOCs. Add compatible "generic-reset",
> > which denotes to use a common reset-controller driver.
> > 
> > Signed-off-by: Crystal Guo 
> > ---
> >  Documentation/devicetree/bindings/reset/ti-syscon-reset.txt | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/reset/ti-syscon-reset.txt b/Documentation/devicetree/bindings/reset/ti-syscon-reset.txt
> > index d551161ae785..e36d3631eab2 100644
> > --- a/Documentation/devicetree/bindings/reset/ti-syscon-reset.txt
> > +++ b/Documentation/devicetree/bindings/reset/ti-syscon-reset.txt
> > @@ -25,6 +25,7 @@ Required properties:
> > "ti,k2l-pscrst"
> > "ti,k2hk-pscrst"
> > "ti,syscon-reset"
> > +   "generic-reset", "ti,syscon-reset"
> >   - #reset-cells: Should be 1. Please see the reset consumer node below
> >   for usage details
> >   - ti,reset-bits   : Contains the reset control register information
> > -- 
> > 2.18.0
> 
> My understanding is that it would be better to add a mtk specific
> compatible instead of adding this "generic-reset", especially since we
> can't guarantee this binding will be considered generic in the future.
> I think there is nothing wrong with specifying
>   compatible = "mtk,your-compatible", "ti,syscon-reset";
> in your device tree if your hardware is indeed compatible with the
> specified "ti,syscon-reset" binding, but I may be wrong: Therefore,
> please add devicet...@vger.kernel.org to Cc: for binding changes.
> 

Hi Philipp,

This would work.
But having "ti" in an mtk dts raises alarm for some people inside and
outside of MTK. It would save us some explanation if we could use a more
generic name.

Joe.C

Re: [PATCH v2 05/18] gpiolib: cdev: support GPIO_GET_LINE_IOCTL and GPIOLINE_GET_VALUES_IOCTL

2020-08-04 Thread Kent Gibson

On Tue, Aug 04, 2020 at 07:47:43PM +0200, Bartosz Golaszewski wrote:
> On Tue, Aug 4, 2020 at 1:01 AM Kent Gibson  wrote:
> >
> 
> [snip]
> 
> >
> > > Also: I just started going through the patches - nice idea with the
> > > GPIO attributes, I really like it. Although I need to give it a longer
> > > thought tomorrow - I'm wondering if we can maybe unify them and the
> > > flags.
> > >
> >
> > I had an earlier draft that did just that - and that is partially why
> > the loop is last in wins - I was using slot 0 as the default flags.
> > But the default flags cover a lot of use cases, including all of v1, and
> > it was simple and cheap to provide a default - and it simplified the
> > initial port of libgpiod to v2...
> >
> 
> If porting libgpiod to v2 is the only concern then I wouldn't stress
> about it. At the same time I'm wondering - is there any use-case where
> we wouldn't need the flags attribute for at least some lines? Because
> if it's always required then maybe having a default isn't that bad.
> 

The only case where flags are not required is an AS-IS request. I
have no idea what that use case is useful for, but it is in v1 and
therefore supported by v2 for backward compatibility.

So there is almost always a flags attribute, and I didn't want to
waste an attribute slot on it.

Supporting the default in the kernel is trivial - it is literally just
the default return in gpioline_config_flags:

+   }
+   return lc->flags;
+}

which would otherwise be 0.

Cheers,
Kent.

[PATCH v5 2/4] powerpc/pseries/iommu: Update call to ibm,query-pe-dma-windows

From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can make the number of
outputs from "ibm,query-pe-dma-windows" go from 5 to 6.

This change of output size is meant to expand the address size of
largest_available_block PE TCE from 32-bit to 64-bit, which ends up
shifting page_size and migration_capable.

This ends up requiring the update of
ddw_query_response->largest_available_block from u32 to u64, and manually
assigning the values from the buffer into this struct, according to
output size.

Also, a routine was created to help read the ddw extensions as
suggested by LoPAR: First reading the size of the extension array from
index 0, checking if the property exists, and then returning its value.
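
The read sequence can be sketched in user space as follows, with a plain
array standing in for the device tree property.  ext_prop and
read_ext_sketch are hypothetical names, and the bounds check here is a
little stricter than in the diff below because it also guards the array
length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the "ibm,ddw-extensions" property: cell 0
 * holds the number of extensions, cells 1..N hold their values.
 */
struct ext_prop {
	const uint32_t *cells;	/* NULL models a missing property */
	size_t ncells;
};

/* Returns 0 and stores the value, or a negative errno on failure. */
static int read_ext_sketch(const struct ext_prop *prop, unsigned int extnum,
			   uint32_t *value)
{
	uint32_t count;

	assert(prop != NULL);

	if (!prop->cells)
		return -EINVAL;		/* property does not exist */
	if (prop->ncells == 0)
		return -ENODATA;	/* property has no value */

	count = prop->cells[0];		/* index 0: size of the array */
	if (extnum > count || extnum >= prop->ncells)
		return -EOVERFLOW;	/* extension not present */

	if (value)			/* NULL means "existence check only" */
		*value = prop->cells[extnum];
	return 0;
}
```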

Signed-off-by: Leonardo Bras 
Tested-by: David Dai 
---
 arch/powerpc/platforms/pseries/iommu.c | 91 +++---
 1 file changed, 81 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index ac0d6376bdad..1a933c4e8bba 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -47,6 +47,12 @@ enum {
DDW_APPLICABLE_SIZE
 };
 
+enum {
+   DDW_EXT_SIZE = 0,
+   DDW_EXT_RESET_DMA_WIN = 1,
+   DDW_EXT_QUERY_OUT_SIZE = 2
+};
+
 static struct iommu_table_group *iommu_pseries_alloc_group(int node)
 {
struct iommu_table_group *table_group;
@@ -342,7 +348,7 @@ struct direct_window {
 /* Dynamic DMA Window support */
 struct ddw_query_response {
u32 windows_available;
-   u32 largest_available_block;
+   u64 largest_available_block;
u32 page_size;
u32 migration_capable;
 };
@@ -877,14 +883,62 @@ static int find_existing_ddw_windows(void)
 }
 machine_arch_initcall(pseries, find_existing_ddw_windows);
 
+/**
+ * ddw_read_ext - Get the value of a DDW extension
+ * @np: device node from which the extension value is to be read.
+ * @extnum: index number of the extension.
+ * @value: pointer to return value, modified when extension is available.
+ *
+ * Checks if "ibm,ddw-extensions" exists for this node, and get the value
+ * on index 'extnum'.
+ * It can be used only to check if a property exists, passing value == NULL.
+ *
+ * Returns:
+ * 0 if extension successfully read
+ * -EINVAL if the "ibm,ddw-extensions" does not exist,
+ * -ENODATA if "ibm,ddw-extensions" does not have a value, and
+ * -EOVERFLOW if "ibm,ddw-extensions" does not contain this extension.
+ */
+static inline int ddw_read_ext(const struct device_node *np, int extnum,
+  u32 *value)
+{
+   static const char propname[] = "ibm,ddw-extensions";
+   u32 count;
+   int ret;
+
+   ret = of_property_read_u32_index(np, propname, DDW_EXT_SIZE, &count);
+   if (ret)
+   return ret;
+
+   if (count < extnum)
+   return -EOVERFLOW;
+
+   if (!value)
+   value = &count;
+
+   return of_property_read_u32_index(np, propname, extnum, value);
+}
+
 static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
-   struct ddw_query_response *query)
+struct ddw_query_response *query,
+struct device_node *parent)
 {
struct device_node *dn;
struct pci_dn *pdn;
-   u32 cfg_addr;
+   u32 cfg_addr, ext_query, query_out[5];
u64 buid;
-   int ret;
+   int ret, out_sz;
+
+   /*
+* From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can rule how many
+* output parameters ibm,query-pe-dma-windows will have, ranging from
+* 5 to 6.
+*/
+   ret = ddw_read_ext(parent, DDW_EXT_QUERY_OUT_SIZE, &ext_query);
+   if (!ret && ext_query == 1)
+   out_sz = 6;
+   else
+   out_sz = 5;
 
/*
 * Get the config address and phb buid of the PE window.
@@ -897,11 +951,28 @@ static int query_ddw(struct pci_dev *dev, const u32 
*ddw_avail,
buid = pdn->phb->buid;
cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
 
-   ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, 5, (u32 *)query,
+   ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, out_sz, query_out,
cfg_addr, BUID_HI(buid), BUID_LO(buid));
-   dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
-   " returned %d\n", ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr,
-BUID_HI(buid), BUID_LO(buid), ret);
+   dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x returned %d\n",
+ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr, BUID_HI(buid),
+BUID_LO(buid), ret);
+
+   switch (out_sz) {
+   case 5:
+   query->windows_available = query_out[0];
+   query->largest_available_block = query_out[1];
+   query->page_size = query_out[2];
+   query->migration_capable = query_out[3];
+

[PATCH v5 3/4] powerpc/pseries/iommu: Move window-removing part of remove_ddw into remove_dma_window

Move the window-removing part of remove_ddw into a new function
(remove_dma_window), so it can be used to remove other DMA windows.

It's useful for removing DMA windows that don't create DIRECT64_PROPNAME
property, like the default DMA window from the device, which uses
"ibm,dma-window".

Signed-off-by: Leonardo Bras 
Tested-by: David Dai 
---
 arch/powerpc/platforms/pseries/iommu.c | 45 +++---
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 1a933c4e8bba..4e33147825cc 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -781,25 +781,14 @@ static int __init disable_ddw_setup(char *str)
 
 early_param("disable_ddw", disable_ddw_setup);
 
-static void remove_ddw(struct device_node *np, bool remove_prop)
+static void remove_dma_window(struct device_node *np, u32 *ddw_avail,
+ struct property *win)
 {
struct dynamic_dma_window_prop *dwp;
-   struct property *win64;
-   u32 ddw_avail[DDW_APPLICABLE_SIZE];
u64 liobn;
-   int ret = 0;
-
-   ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
-&ddw_avail[0], DDW_APPLICABLE_SIZE);
-
-   win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
-   if (!win64)
-   return;
-
-   if (ret || win64->length < sizeof(*dwp))
-   goto delprop;
+   int ret;
 
-   dwp = win64->value;
+   dwp = win->value;
liobn = (u64)be32_to_cpu(dwp->liobn);
 
/* clear the whole window, note the arg is in kernel pages */
@@ -821,10 +810,30 @@ static void remove_ddw(struct device_node *np, bool 
remove_prop)
pr_debug("%pOF: successfully removed direct window: rtas 
returned "
"%d to ibm,remove-pe-dma-window(%x) %llx\n",
np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
+}
+
+static void remove_ddw(struct device_node *np, bool remove_prop)
+{
+   struct property *win;
+   u32 ddw_avail[DDW_APPLICABLE_SIZE];
+   int ret = 0;
+
+   ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
+&ddw_avail[0], DDW_APPLICABLE_SIZE);
+   if (ret)
+   return;
+
+   win = of_find_property(np, DIRECT64_PROPNAME, NULL);
+   if (!win)
+   return;
+
+   if (win->length >= sizeof(struct dynamic_dma_window_prop))
+   remove_dma_window(np, ddw_avail, win);
+
+   if (!remove_prop)
+   return;
 
-delprop:
-   if (remove_prop)
-   ret = of_remove_property(np, win64);
+   ret = of_remove_property(np, win);
if (ret)
pr_warn("%pOF: failed to remove direct window property: %d\n",
np, ret);
-- 
2.25.4

[PATCH v5 1/4] powerpc/pseries/iommu: Create defines for operations in ibm,ddw-applicable

Create defines to help handling ibm,ddw-applicable values, avoiding
confusion about the index of given operations.

Signed-off-by: Leonardo Bras 
Tested-by: David Dai 
---
 arch/powerpc/platforms/pseries/iommu.c | 43 --
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 6d47b4a3ce39..ac0d6376bdad 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -39,6 +39,14 @@
 
 #include "pseries.h"
 
+enum {
+   DDW_QUERY_PE_DMA_WIN  = 0,
+   DDW_CREATE_PE_DMA_WIN = 1,
+   DDW_REMOVE_PE_DMA_WIN = 2,
+
+   DDW_APPLICABLE_SIZE
+};
+
 static struct iommu_table_group *iommu_pseries_alloc_group(int node)
 {
struct iommu_table_group *table_group;
@@ -771,12 +779,12 @@ static void remove_ddw(struct device_node *np, bool 
remove_prop)
 {
struct dynamic_dma_window_prop *dwp;
struct property *win64;
-   u32 ddw_avail[3];
+   u32 ddw_avail[DDW_APPLICABLE_SIZE];
u64 liobn;
int ret = 0;
 
ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
-&ddw_avail[0], 3);
+&ddw_avail[0], DDW_APPLICABLE_SIZE);
 
win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
if (!win64)
@@ -798,15 +806,15 @@ static void remove_ddw(struct device_node *np, bool 
remove_prop)
pr_debug("%pOF successfully cleared tces in window.\n",
 np);
 
-   ret = rtas_call(ddw_avail[2], 1, 1, NULL, liobn);
+   ret = rtas_call(ddw_avail[DDW_REMOVE_PE_DMA_WIN], 1, 1, NULL, liobn);
if (ret)
pr_warn("%pOF: failed to remove direct window: rtas returned "
"%d to ibm,remove-pe-dma-window(%x) %llx\n",
-   np, ret, ddw_avail[2], liobn);
+   np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
else
pr_debug("%pOF: successfully removed direct window: rtas 
returned "
"%d to ibm,remove-pe-dma-window(%x) %llx\n",
-   np, ret, ddw_avail[2], liobn);
+   np, ret, ddw_avail[DDW_REMOVE_PE_DMA_WIN], liobn);
 
 delprop:
if (remove_prop)
@@ -889,11 +897,11 @@ static int query_ddw(struct pci_dev *dev, const u32 
*ddw_avail,
buid = pdn->phb->buid;
cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
 
-   ret = rtas_call(ddw_avail[0], 3, 5, (u32 *)query,
- cfg_addr, BUID_HI(buid), BUID_LO(buid));
+   ret = rtas_call(ddw_avail[DDW_QUERY_PE_DMA_WIN], 3, 5, (u32 *)query,
+   cfg_addr, BUID_HI(buid), BUID_LO(buid));
dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
-   " returned %d\n", ddw_avail[0], cfg_addr, BUID_HI(buid),
-   BUID_LO(buid), ret);
+   " returned %d\n", ddw_avail[DDW_QUERY_PE_DMA_WIN], cfg_addr,
+BUID_HI(buid), BUID_LO(buid), ret);
return ret;
 }
 
@@ -920,15 +928,16 @@ static int create_ddw(struct pci_dev *dev, const u32 
*ddw_avail,
 
do {
/* extra outputs are LIOBN and dma-addr (hi, lo) */
-   ret = rtas_call(ddw_avail[1], 5, 4, (u32 *)create,
-   cfg_addr, BUID_HI(buid), BUID_LO(buid),
-   page_shift, window_shift);
+   ret = rtas_call(ddw_avail[DDW_CREATE_PE_DMA_WIN], 5, 4,
+   (u32 *)create, cfg_addr, BUID_HI(buid),
+   BUID_LO(buid), page_shift, window_shift);
} while (rtas_busy_delay(ret));
dev_info(&dev->dev,
"ibm,create-pe-dma-window(%x) %x %x %x %x %x returned %d "
-   "(liobn = 0x%x starting addr = %x %x)\n", ddw_avail[1],
-cfg_addr, BUID_HI(buid), BUID_LO(buid), page_shift,
-window_shift, ret, create->liobn, create->addr_hi, 
create->addr_lo);
+   "(liobn = 0x%x starting addr = %x %x)\n",
+ddw_avail[DDW_CREATE_PE_DMA_WIN], cfg_addr, BUID_HI(buid),
+BUID_LO(buid), page_shift, window_shift, ret, create->liobn,
+create->addr_hi, create->addr_lo);
 
return ret;
 }
@@ -996,7 +1005,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
int page_shift;
u64 dma_addr, max_addr;
struct device_node *dn;
-   u32 ddw_avail[3];
+   u32 ddw_avail[DDW_APPLICABLE_SIZE];
struct direct_window *window;
struct property *win64;
struct dynamic_dma_window_prop *ddwprop;
@@ -1029,7 +1038,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
 * the property is actually in the parent, not the PE
 */
ret = of_property_read_u32_array(pdn, "ibm,ddw-applicable",
-

[PATCH v5 4/4] powerpc/pseries/iommu: Allow bigger 64bit window by removing default DMA window

On LoPAR "DMA Window Manipulation Calls", it's recommended to remove the
default DMA window for the device, before attempting to configure a DDW,
in order to make the maximum resources available for the next DDW to be
created.

This is a requirement for using DDW on devices in which hypervisor
allows only one DMA window.

If setting up a new DDW fails anywhere after the removal of this
default DMA window, the default DMA window must be restored.
This requires an implementation of the ibm,reset-pe-dma-windows rtas
call:

Platforms supporting the DDW option starting with LoPAR level 2.7 implement
ibm,ddw-extensions. The first extension available (index 2) carries the
token for ibm,reset-pe-dma-windows rtas call, which is used to restore
the default DMA window for a device, if it has been deleted.

It does so by resetting the TCE table allocation for the PE to its
boot time value, available in "ibm,dma-window" device tree node.
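
To illustrate how the extension list is indexed, here is a minimal
userspace sketch, not the kernel code. It assumes the property is a flat
array of 32-bit cells whose cell 0 holds the number of extensions, so
that the "index 2" mentioned above (counting from 1) would be cell 1,
i.e. DDW_EXT_RESET_DMA_WIN from patch 2/4; that mapping is my reading,
not something the spec text here confirms. read_cell(),
ddw_read_ext_sketch() and ddw_ext_demo() are hypothetical names, and
read_cell() only stands in for of_property_read_u32_index().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum {
	DDW_EXT_SIZE = 0,		/* cell 0: number of extensions that follow */
	DDW_EXT_RESET_DMA_WIN = 1,	/* cell 1: token for ibm,reset-pe-dma-windows */
	DDW_EXT_QUERY_OUT_SIZE = 2	/* cell 2: selects 5 or 6 query outputs */
};

/* Hypothetical stand-in for of_property_read_u32_index() over a plain array. */
static int read_cell(const uint32_t *prop, size_t ncells, size_t index,
		     uint32_t *out)
{
	if (!prop)
		return -EINVAL;		/* property does not exist */
	if (ncells == 0)
		return -ENODATA;	/* property has no value */
	if (index >= ncells)
		return -EOVERFLOW;	/* property is too short */
	*out = prop[index];
	return 0;
}

/* Mirrors the logic of ddw_read_ext() from patch 2/4 on top of read_cell(). */
static int ddw_read_ext_sketch(const uint32_t *prop, size_t ncells, int extnum,
			       uint32_t *value)
{
	uint32_t count;
	int ret;

	ret = read_cell(prop, ncells, DDW_EXT_SIZE, &count);
	if (ret)
		return ret;

	if (count < (uint32_t)extnum)
		return -EOVERFLOW;

	if (!value)
		value = &count;		/* caller only wants an existence check */

	return read_cell(prop, ncells, (size_t)extnum, value);
}

/* Made-up property contents: one extension, reset token 0x2a. */
void ddw_ext_demo(void)
{
	const uint32_t ext[] = { 1, 0x2a };
	uint32_t token = 0;

	assert(ddw_read_ext_sketch(ext, 2, DDW_EXT_RESET_DMA_WIN, &token) == 0);
	assert(token == 0x2a);
	/* Existence check only: value may be NULL. */
	assert(ddw_read_ext_sketch(ext, 2, DDW_EXT_RESET_DMA_WIN, NULL) == 0);
	/* Only one extension is advertised, so extension 2 is out of range. */
	assert(ddw_read_ext_sketch(ext, 2, DDW_EXT_QUERY_OUT_SIZE, &token) ==
	       -EOVERFLOW);
	assert(ddw_read_ext_sketch(NULL, 0, DDW_EXT_RESET_DMA_WIN, &token) ==
	       -EINVAL);
}
```

Passing value == NULL is what enable_ddw() relies on in this patch: it
only needs to know that the reset token exists before it removes the
default window.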

Signed-off-by: Leonardo Bras 
Tested-by: David Dai 
---
 arch/powerpc/platforms/pseries/iommu.c | 73 +++---
 1 file changed, 66 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 4e33147825cc..e4198700ed1a 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1066,6 +1066,38 @@ static phys_addr_t ddw_memory_hotplug_max(void)
return max_addr;
 }
 
+/*
+ * Platforms supporting the DDW option starting with LoPAR level 2.7 implement
+ * ibm,ddw-extensions, which carries the rtas token for
+ * ibm,reset-pe-dma-windows.
+ * That rtas-call can be used to restore the default DMA window for the device.
+ */
+static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
+{
+   int ret;
+   u32 cfg_addr, reset_dma_win;
+   u64 buid;
+   struct device_node *dn;
+   struct pci_dn *pdn;
+
+   ret = ddw_read_ext(par_dn, DDW_EXT_RESET_DMA_WIN, &reset_dma_win);
+   if (ret)
+   return;
+
+   dn = pci_device_to_OF_node(dev);
+   pdn = PCI_DN(dn);
+   buid = pdn->phb->buid;
+   cfg_addr = (pdn->busno << 16) | (pdn->devfn << 8);
+
+   ret = rtas_call(reset_dma_win, 3, 1, NULL, cfg_addr, BUID_HI(buid),
+   BUID_LO(buid));
+   if (ret)
+   dev_info(&dev->dev,
+"ibm,reset-pe-dma-windows(%x) %x %x %x returned %d ",
+reset_dma_win, cfg_addr, BUID_HI(buid), BUID_LO(buid),
+ret);
+}
+
 /*
  * If the PE supports dynamic dma windows, and there is space for a table
  * that can map all pages in a linear offset, then setup such a table,
@@ -1090,6 +1122,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
struct property *win64;
struct dynamic_dma_window_prop *ddwprop;
struct failed_ddw_pdn *fpdn;
+   bool default_win_removed = false;
 
mutex_lock(&direct_window_init_mutex);
 
@@ -1133,14 +1166,38 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
if (ret != 0)
goto out_failed;
 
+   /*
+* If there is no window available, remove the default DMA window,
+* if it's present. This will make all the resources available to the
+* new DDW window.
+* If anything fails after this, we need to restore it, so also check
+* for extensions presence.
+*/
if (query.windows_available == 0) {
-   /*
-* no additional windows are available for this device.
-* We might be able to reallocate the existing window,
-* trading in for a larger page size.
-*/
-   dev_dbg(&dev->dev, "no free dynamic windows");
-   goto out_failed;
+   struct property *default_win;
+   int reset_win_ext;
+
+   default_win = of_find_property(pdn, "ibm,dma-window", NULL);
+   if (!default_win)
+   goto out_failed;
+
+   reset_win_ext = ddw_read_ext(pdn, DDW_EXT_RESET_DMA_WIN, NULL);
+   if (reset_win_ext)
+   goto out_failed;
+
+   remove_dma_window(pdn, ddw_avail, default_win);
+   default_win_removed = true;
+
+   /* Query again, to check if the window is available */
+   ret = query_ddw(dev, ddw_avail, &query, pdn);
+   if (ret != 0)
+   goto out_failed;
+
+   if (query.windows_available == 0) {
+   /* no windows are available for this device. */
+   dev_dbg(&dev->dev, "no free dynamic windows");
+   goto out_failed;
+   }
}
if (query.page_size & 4) {
page_shift = 24; /* 16MB */
@@ -1231,6 +1288,8 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
device_node *pdn)
kfree(win64);

[PATCH v5 0/4] Allow bigger 64bit window by removing default DMA window

There are some devices in which a hypervisor may only allow 1 DMA window
to exist at a time, and in those cases, a DDW is never created for them,
since the default DMA window keeps using this resource.

LoPAR recommends this procedure:
1. Remove the default DMA window,
2. Query for which configs the DDW can be created,
3. Create a DDW.
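
The procedure above, plus the restore-on-failure path from patch #4, can
be sketched as a small userspace model. This is only an illustration
under stated assumptions: struct pe_model, windows_available(),
enable_ddw_sketch() and ddw_flow_demo() are hypothetical, and the
"hypervisor" is reduced to a window counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a PE whose hypervisor allows a fixed number of windows. */
struct pe_model {
	int max_windows;	/* DMA windows the hypervisor allows (>= 1) */
	bool default_win;	/* default DMA window currently present */
	bool ddw;		/* DDW currently present */
	bool can_reset;		/* ibm,reset-pe-dma-windows is available */
	bool create_fails;	/* force the DDW creation step to fail */
};

/* Stand-in for the windows_available output of the query call. */
static int windows_available(const struct pe_model *pe)
{
	return pe->max_windows - (pe->default_win ? 1 : 0) - (pe->ddw ? 1 : 0);
}

/* Returns true if a DDW ends up configured. */
static bool enable_ddw_sketch(struct pe_model *pe)
{
	bool default_win_removed = false;

	if (windows_available(pe) <= 0) {
		/* Only remove the default window if it can be restored later. */
		if (!pe->default_win || !pe->can_reset)
			return false;

		pe->default_win = false;		/* 1. remove default window */
		default_win_removed = true;

		if (windows_available(pe) <= 0)		/* 2. query again */
			goto out_failed;
	}

	if (pe->create_fails)				/* 3. create the DDW */
		goto out_failed;

	pe->ddw = true;
	return true;

out_failed:
	if (default_win_removed)
		pe->default_win = true;	/* models the reset call restoring it */
	return false;
}

void ddw_flow_demo(void)
{
	struct pe_model one = { .max_windows = 1, .default_win = true,
				.can_reset = true };
	struct pe_model norst = { .max_windows = 1, .default_win = true };
	struct pe_model fail = { .max_windows = 1, .default_win = true,
				 .can_reset = true, .create_fails = true };

	/* Single-window PE: default window is traded for the DDW. */
	assert(enable_ddw_sketch(&one) && one.ddw && !one.default_win);
	/* No reset extension: the default window is left untouched. */
	assert(!enable_ddw_sketch(&norst) && norst.default_win);
	/* Creation fails after removal: the default window is restored. */
	assert(!enable_ddw_sketch(&fail) && fail.default_win && !fail.ddw);
}
```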

Patch #1:
Create defines for outputs of ibm,ddw-applicable, so it's easier to
identify them.

Patch #2:
- After LoPAR level 2.8, there is an extension that can make
  ibm,query-pe-dma-windows return 6 outputs instead of 5. This changes the
  order of the outputs, and that can cause some trouble.
- query_ddw() was updated to check how many outputs
  ibm,query-pe-dma-windows is supposed to have, update the rtas_call() and
  deal correctly with the outputs in both cases.
- This patch looks somewhat unrelated to the series, but it can avoid future
  problems on DDW creation.
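
A rough userspace sketch of the two output layouts follows. The 5-output
mapping matches the hunk shown in patch #2. The 6-output mapping is an
ASSUMPTION on my side (the block size arriving as a high word followed by
a low word, shifting the later fields by one); please check it against
the full patch. struct ddw_query_response_sketch, decode_query_out() and
ddw_query_demo() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Same fields as struct ddw_query_response after patch #2. */
struct ddw_query_response_sketch {
	uint32_t windows_available;
	uint64_t largest_available_block;
	uint32_t page_size;
	uint32_t migration_capable;
};

/* Returns 0 on success, -1 for an output count this sketch does not know. */
static int decode_query_out(const uint32_t *out, int out_sz,
			    struct ddw_query_response_sketch *q)
{
	switch (out_sz) {
	case 5:
		q->windows_available = out[0];
		q->largest_available_block = out[1];
		q->page_size = out[2];
		q->migration_capable = out[3];
		return 0;
	case 6:
		/* ASSUMPTION: block size is split into high and low words. */
		q->windows_available = out[0];
		q->largest_available_block = ((uint64_t)out[1] << 32) | out[2];
		q->page_size = out[3];
		q->migration_capable = out[4];
		return 0;
	default:
		return -1;
	}
}

void ddw_query_demo(void)
{
	/* Made-up output values, for illustration only. */
	const uint32_t out5[5] = { 1, 0x8000, 4, 1, 0 };
	const uint32_t out6[5] = { 1, 0x1, 0x8000, 4, 1 };
	struct ddw_query_response_sketch q;

	assert(decode_query_out(out5, 5, &q) == 0);
	assert(q.largest_available_block == 0x8000 && q.page_size == 4);

	assert(decode_query_out(out6, 6, &q) == 0);
	assert(q.largest_available_block == 0x100008000ULL);
	assert(q.page_size == 4 && q.migration_capable == 1);
}
```

Reading a 6-output reply with the 5-output layout would take the high
word as the block size and the low word as the page size, which is the
kind of trouble the first bullet refers to.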

Patch #3 moves the window-removing code from remove_ddw() to
remove_dma_window(), creating a way to delete any DMA window, so it can be
used to delete the default DMA window.

Patch #4 makes use of the remove_dma_window() from patch #3 to remove the
default DMA window before query_ddw(). It also implements a new rtas call
to recover the default DMA window, in case anything fails after it was
removed, and a DDW couldn't be created.

---
Changes since v4:
- Removed patches 5+ in order to deal with a feature at a time
- Remove unnecessary parentheses in patch #4
- Changed patch #4 title from 
  "Remove default DMA window before creating DDW"
- Included David Dai tested-by
- v4 link: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=190051=%2A=both

Changes since v3:
- Introduces new patch #5, to prepare for an important change in #6
- struct iommu_table was not being updated, so include a way to do this
  in patch #6.
- Improved patch #4 based on a suggestion from Alexey, to make code
  more easily understandable
- v3 link: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=187348=%2A=both

Changes since v2:
- Change the way ibm,ddw-extensions is accessed, using a proper function
  instead of doing this inline every time it's used.
- Remove previous patch #6, as it doesn't look like it would be useful.
- Add new patch, for changing names from direct* to dma*, as indirect 
  mapping can be used from now on.
- Fix some typos, corrects some define usage.
- v2 link: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=185433=%2A=both

Changes since v1:
- Add defines for ibm,ddw-applicable and ibm,ddw-extensions outputs
- Merge aux function query_ddw_out_sz() into query_ddw()
- Merge reset_dma_window() patch (prev. #2) into remove default DMA
  window patch (#4).
- Keep device_node *np name instead of using pdn in remove_*()
- Rename 'device_node *pdn' into 'parent' in new functions
- Rename dfl_win to default_win
- Only remove the default DMA window if there is no window available
  in first query.
- Check if default DMA window can be restored before removing it.
- Fix 'uninitialized use' (found by travis mpe:ci-test)
- New patches #5 and #6
- v1 link: 
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=184420=%2A=both

Special thanks to Alexey Kardashevskiy, Brian King and
Oliver O'Halloran for the feedback provided!


Leonardo Bras (4):
  powerpc/pseries/iommu: Create defines for operations in
ibm,ddw-applicable
  powerpc/pseries/iommu: Update call to ibm,query-pe-dma-windows
  powerpc/pseries/iommu: Move window-removing part of remove_ddw into
remove_dma_window
  powerpc/pseries/iommu: Allow bigger 64bit window by removing default
DMA window

 arch/powerpc/platforms/pseries/iommu.c | 242 -
 1 file changed, 195 insertions(+), 47 deletions(-)

-- 
2.25.4

drivers/video/fbdev/aty/mach64_cursor.c:156:13: sparse: sparse: incorrect type in argument 1 (different address spaces)

2020-08-04 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   4f30a60aa78410496e5ffe632a371c00f0d83a8d
commit: 670d0a4b10704667765f7d18f7592993d02783aa sparse: use identifiers to 
define address spaces
date:   7 weeks ago
config: s390-randconfig-s031-20200805 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-117-g8c7aee71-dirty
git checkout 670d0a4b10704667765f7d18f7592993d02783aa
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

>> drivers/video/fbdev/aty/mach64_cursor.c:156:13: sparse: sparse: incorrect 
>> type in argument 1 (different address spaces) @@ expected void *s @@ 
>> got unsigned char [noderef] [usertype] __iomem *dst @@
   drivers/video/fbdev/aty/mach64_cursor.c:156:13: sparse: expected void *s
   drivers/video/fbdev/aty/mach64_cursor.c:156:13: sparse: got unsigned 
char [noderef] [usertype] __iomem *dst
   drivers/video/fbdev/aty/mach64_cursor.c:187:25: sparse: sparse: cast removes 
address space '__iomem' of expression
   drivers/video/fbdev/aty/mach64_cursor.c:188:25: sparse: sparse: cast removes 
address space '__iomem' of expression
   drivers/video/fbdev/aty/mach64_cursor.c:209:30: sparse: sparse: cast removes 
address space '__iomem' of expression
   drivers/video/fbdev/aty/mach64_cursor.c: note: in included file (through 
arch/s390/include/asm/io.h, include/linux/fb.h):
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:225:22: sparse: sparse: incorrect type in argument 
1 (different base types) @@ expected unsigned int [usertype] val @@ got 
restricted __le32 [usertype] @@
   include/asm-generic/io.h:225:22: sparse: expected unsigned int 
[usertype] val
   include/asm-generic/io.h:225:22: sparse: got restricted __le32 [usertype]
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:225:22: sparse: sparse: incorrect type in argument 
1 (different base types) @@ expected unsigned int [usertype] val @@ got 
restricted __le32 [usertype] @@
   include/asm-generic/io.h:225:22: sparse: expected unsigned int 
[usertype] val
   include/asm-generic/io.h:225:22: sparse: got restricted __le32 [usertype]
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:179:15: sparse: sparse: cast to restricted __le32
   include/asm-generic/io.h:225:22: sparse: sparse: incorrect type in argument 
1 (different base types) @@ expected unsigned int [usertype] val @@ got 
restricted __le32 [usertype] @@
   include/asm-generic/io.h:225:22: sparse: expected unsigned int 
[usertype] val
   include/asm-generic/io.h:225:22: sparse: got restricted __le32 [usertype]
   include/asm-generic/io.h:225:22: sparse: sparse: incorrect type in argument 
1 (different base types) @@ expected unsigned int [usertype] val @@ got 
restricted

Re: [PATCH V2 1/9] pci_ids: Add class code and extended capability for RCEC

2020-08-04 Thread Bjorn Helgaas

On Tue, Aug 04, 2020 at 12:40:44PM -0700, Sean V Kelley wrote:
> From: Qiuxu Zhuo 
> 
> A PCIe Root Complex Event Collector(RCEC) has the base class 0x08,
> sub-class 0x07, and programming interface 0x00. Add the class code
> 0x0807 to identify RCEC devices and add the defines for the RCEC
> Endpoint Association Extended Capability.
> 
> See PCI Express Base Specification, version 5.0-1, section "1.3.4
> Root Complex Event Collector" and section "7.9.10 Root Complex
> Event Collector Endpoint Association Extended Capability"
> 
> Signed-off-by: Qiuxu Zhuo 
> Reviewed-by: Jonathan Cameron 

1) "git log --oneline include/linux/pci_ids.h".  Match it.  Mention
the most important words, like "RCEC", early in the subject.

2) 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst#n555

> ---
>  include/linux/pci_ids.h   | 1 +
>  include/uapi/linux/pci_regs.h | 7 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 5c709a1450b1..bc6d1a4ca02d 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -81,6 +81,7 @@
>  #define PCI_CLASS_SYSTEM_RTC 0x0803
>  #define PCI_CLASS_SYSTEM_PCI_HOTPLUG 0x0804
>  #define PCI_CLASS_SYSTEM_SDHCI   0x0805
> +#define PCI_CLASS_SYSTEM_RCEC0x0807
>  #define PCI_CLASS_SYSTEM_OTHER   0x0880
>  
>  #define PCI_BASE_CLASS_INPUT 0x09
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index f9701410d3b5..f335f65f65d6 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -828,6 +828,13 @@
>  #define  PCI_PWR_CAP_BUDGET(x)   ((x) & 1)   /* Included in system 
> budget */
>  #define PCI_EXT_CAP_PWR_SIZEOF   16
>  
> +/* Root Complex Event Collector Endpoint Association  */
> +#define PCI_RCEC_RCIEP_BITMAP4   /* Associated Bitmap for RCiEPs 
> */
> +#define PCI_RCEC_BUSN8   /* RCEC Associated Bus Numbers 
> */
> +#define  PCI_RCEC_BUSN_REG_VER   0x02/* Least capability version 
> that BUSN present */
> +#define  PCI_RCEC_BUSN_NEXT(x)   (((x) >> 8) & 0xff)
> +#define  PCI_RCEC_BUSN_LAST(x)   (((x) >> 16) & 0xff)
> +
>  /* Vendor-Specific (VSEC, PCI_EXT_CAP_ID_VNDR) */
>  #define PCI_VNDR_HEADER  4   /* Vendor-Specific Header */
>  #define  PCI_VNDR_HEADER_ID(x)   ((x) & 0xffff)
> -- 
> 2.27.0
>
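
For readers following the new BUSN macros, here is a small userspace
sketch of how they pick the bus-number fields out of a 32-bit register
value. The two macros are copied from the quoted patch; the helper
rcec_bus_in_range(), the demo function and the sample register value are
hypothetical, and treating the Next..Last range as inclusive is my
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the quoted patch. */
#define PCI_RCEC_BUSN_NEXT(x)	(((x) >> 8) & 0xff)
#define PCI_RCEC_BUSN_LAST(x)	(((x) >> 16) & 0xff)

/*
 * Hypothetical helper: report whether 'bus' falls inside the bus range
 * encoded in a raw Associated Bus Numbers register value.
 * ASSUMPTION: the range is inclusive on both ends.
 */
static int rcec_bus_in_range(uint32_t busn, uint8_t bus)
{
	uint8_t next = PCI_RCEC_BUSN_NEXT(busn);
	uint8_t last = PCI_RCEC_BUSN_LAST(busn);

	return bus >= next && bus <= last;
}

void rcec_busn_demo(void)
{
	/* Made-up register value: next bus 0x02, last bus 0x05. */
	const uint32_t busn = 0x00050200;

	assert(PCI_RCEC_BUSN_NEXT(busn) == 0x02);
	assert(PCI_RCEC_BUSN_LAST(busn) == 0x05);
	assert(rcec_bus_in_range(busn, 0x03));
	assert(!rcec_bus_in_range(busn, 0x06));
}
```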

Re: Re: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

2020-08-04 Thread Lu Baolu


Hi,

On 8/4/20 11:11 AM, FelixCui-oc wrote:

Hi  baolu ,
When creating a identity mapping for a namespace device in RMRR, 
you need to add the namespace device to the rmrr->device[] , right?


Yes. You are right.

The dmar_acpi_bus_add_dev() in patch adds the enumeration of the
namespace device in RMRR. This is similar to enumerating pci devices.
Do you think this is unreasonable? If it is unreasonable, please tell
me why it is unreasonable.

It looks reasonable. Thanks for the explanation.

But I don't think we need to add acpi_device_create_direct_mappings()
since the rmrr identity mapping is already done in the iommu core.

Best regards,
baolu



Best regards
Felix cui-oc



-Original Message-
From: Lu Baolu 
Sent: August 4, 2020 9:12
To: FelixCui-oc ; Joerg Roedel ; 
io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org; David Woodhouse 

Cc: baolu...@linux.intel.com; RaymondPang-oc ; 
CobeChen-oc 
Subject: Re: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

Hi Felix,

On 2020/8/3 17:41, FelixCui-oc wrote:

Hi baolu,
 dmar_acpi_dev_scope_init() parse ANDD structure and enumerated 
namespaces device in DRHD.


Yes.


 But the namespace device in RMRR is not enumerated, right?


It should be probed in probe_acpi_namespace_devices().

Best regards,
baolu



Best regards
Felix cui-oc




-Original Message-
From: FelixCui-oc
Sent: August 3, 2020 17:02
To: 'Lu Baolu' ; Joerg Roedel
; io...@lists.linux-foundation.org;
linux-kernel@vger.kernel.org; David Woodhouse 
Cc: RaymondPang-oc ; CobeChen-oc

Subject: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

Hi  baolu:
"The namespace devices are enumerated in 
probe_acpi_namespace_devices().
It calls iommu_probe_device() to process the enumeration and setup 
the identity mappings."

This situation only applies to the physical node of the 
namespaces device as the pci device.
In fact, the physical node of the namespaces device can be a 
platform device or NULL.
If the physical node of the namespaces is a platform device or 
NULL, it has not actually been enumerated.
So it is necessary to increase the analysis of the namespaces 
device in RMRR and establish an identity mapping.

Best regards
Felix cui




-Original Message-
From: Lu Baolu 
Sent: August 3, 2020 16:26
To: FelixCui-oc ; Joerg Roedel
; io...@lists.linux-foundation.org;
linux-kernel@vger.kernel.org; David Woodhouse 
Cc: baolu...@linux.intel.com; RaymondPang-oc
; CobeChen-oc 
Subject: Re: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

On 2020/8/3 14:52, FelixCui-oc wrote:

Hi  baolu ,
Yes ,that's right.
This patch is to achieve acpi namespace devices to access the 
RMRR region.


The namespace devices are enumerated in probe_acpi_namespace_devices().
It calls iommu_probe_device() to process the enumeration and setup the identity 
mappings. Can you please check why that code doesn't work for you?

Best regards,
baolu



Best regards
Felix cui




-Original Message-
From: Lu Baolu 
Sent: August 3, 2020 14:19
To: FelixCui-oc ; Joerg Roedel
; io...@lists.linux-foundation.org;
linux-kernel@vger.kernel.org; David Woodhouse 
Cc: baolu...@linux.intel.com; RaymondPang-oc
; CobeChen-oc 
Subject: Re: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

Hi,

On 2020/8/3 12:40, FelixCui-oc wrote:

Hi baolu:
Some ACPI devices need to issue dma requests to access the 
reserved memory area.
So bios uses the device scope type ACPI_NAMESPACE_DEVICE in 
RMRR to report these ACPI devices.
At present, there is no analysis in the kernel that the device 
scope type in RMRR is ACPI_NAMESPACE_DEVICE.
This patch is mainly to add the analysis of the device scope type 
ACPI_NAMESPACE_DEVICE in RMRR structure and establish identity mapping for 
these ACPI devices.


So the problem is "although namespace devices in RMRR have been parsed, but the 
identity map for those devices aren't created. As the result, the namespace devices fail 
to access the RMRR region."

Do I understand it right?

Best regards,
baolu


In addition, some naming changes have been made in patch in order to 
distinguish acpi device from pci device.
You can refer to the description of type in 8.3.1 device scope 
in vt-d spec.

Best regards
FelixCui-oc



-Original Message-
From: Lu Baolu 
Sent: August 3, 2020 10:32
To: FelixCui-oc ; Joerg Roedel
; io...@lists.linux-foundation.org;
linux-kernel@vger.kernel.org; David Woodhouse 
Cc: baolu...@linux.intel.com; Cobe Chen(BJ-RD)
; Raymond Pang(BJ-RD)

Subject: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

Hi,

On 8/2/20 6:07 PM, FelixCuioc wrote:

Some ACPI devices require access to the specified reserved memory
region.BIOS report the specified reserved memory region through
RMRR structures.Add analysis

Re: [PATCH] block: check queue's limits.discard_granularity in __blkdev_issue_discard()

On 2020/8/5 10:50, Coly Li wrote:
> If create a loop device with a backing NVMe SSD, current loop device
> driver doesn't correctly set its  queue's limits.discard_granularity and
> leaves it as 0. If a discard request at LBA 0 on this loop device, in
> __blkdev_issue_discard() the calculated req_sects will be 0, and a zero
> length discard request will trigger a BUG() panic in generic block layer
> code at block/blk-mq.c:563.
> 
> [  955.565006][   C39] [ cut here ]
> [  955.559660][   C39] invalid opcode:  [#1] SMP NOPTI
> [  955.622171][   C39] CPU: 39 PID: 248 Comm: ksoftirqd/39 Tainted: G 
>E 5.8.0-default+ #40
> [  955.622171][   C39] Hardware name: Lenovo ThinkSystem SR650 
> -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE160M-2.70]- 07/17/2020
> [  955.622175][   C39] RIP: 0010:blk_mq_end_request+0x107/0x110
> [  955.622177][   C39] Code: 48 8b 03 e9 59 ff ff ff 48 89 df 5b 5d 41 5c e9 
> 9f ed ff ff 48 8b 35 98 3c f4 00 48 83 c7 10 48 83 c6 19 e8 cb 56 c9 ff eb cb 
> <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 54
> [  955.622179][   C39] RSP: 0018:b1288701fe28 EFLAGS: 00010202
> [  955.749277][   C39] RAX: 0001 RBX: 956fffba5080 RCX: 
> 4003
> [  955.749278][   C39] RDX: 0003 RSI:  RDI: 
> 
> [  955.749279][   C39] RBP:  R08:  R09: 
> 
> [  955.749279][   C39] R10: b1288701fd28 R11: 0001 R12: 
> a8e05160
> [  955.749280][   C39] R13: 0004 R14: 0004 R15: 
> a7ad3a1e
> [  955.749281][   C39] FS:  () GS:95bfbda0() 
> knlGS:
> [  955.749282][   C39] CS:  0010 DS:  ES:  CR0: 80050033
> [  955.749282][   C39] CR2: 7f6f0ef766a8 CR3: 005a37012002 CR4: 
> 007606e0
> [  955.749283][   C39] DR0:  DR1:  DR2: 
> 
> [  955.749284][   C39] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [  955.749284][   C39] PKRU: 5554
> [  955.749285][   C39] Call Trace:
> [  955.749290][   C39]  blk_done_softirq+0x99/0xc0
> [  957.550669][   C39]  __do_softirq+0xd3/0x45f
> [  957.550677][   C39]  ? smpboot_thread_fn+0x2f/0x1e0
> [  957.550679][   C39]  ? smpboot_thread_fn+0x74/0x1e0
> [  957.550680][   C39]  ? smpboot_thread_fn+0x14e/0x1e0
> [  957.550684][   C39]  run_ksoftirqd+0x30/0x60
> [  957.550687][   C39]  smpboot_thread_fn+0x149/0x1e0
> [  957.886225][   C39]  ? sort_range+0x20/0x20
> [  957.886226][   C39]  kthread+0x137/0x160
> [  957.886228][   C39]  ? kthread_park+0x90/0x90
> [  957.886231][   C39]  ret_from_fork+0x22/0x30
> [  959.117120][   C39] ---[ end trace 3dacdac97e2ed164 ]---
> 
> This is the procedure to reproduce the panic,
>   # modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
>   # losetup -f /dev/nvme0n1 --direct-io=on
>   # blkdiscard /dev/loop0 -o 0 -l 0x200
> 
> This patch fixes the issue by checking q->limits.discard_granularity in
> __blkdev_issue_discard() before composing the discard bio. If the value
> is 0, then prints a warning oops information and returns -EOPNOTSUPP to
> the caller to indicate that this buggy device driver doesn't support
> discard request.
> 
> Fixes: 9b15d109a6b2 ("block: improve discard bio alignment in 
> __blkdev_issue_discard()")
> Fixes: c52abf563049 ("loop: Better discard support for block devices")
> Reported-and-suggested-by: Ming Lei 
> Signed-off-by: Coly Li 
> Cc: Bart Van Assche 
> Cc: Christoph Hellwig 
> Cc: Enzo Matsumiya 
> Cc: Evan Green 
> Cc: Hannes Reinecke 
> Cc: Jens Axboe 
> Cc: Martin K. Petersen 
> Cc: Ming Lei 
> Cc: Xiao Ni 
> ---
>  block/blk-lib.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 019e09bb9c0e..fca4f1f0c8c8 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -47,6 +47,10 @@ int __blkdev_issue_discard(struct block_device *bdev, 
> sector_t sector,
>   op = REQ_OP_DISCARD;
>   }
>  
> + /* In case the discard granularity isn't set by buggy device driver */
> + if (WARN_ON_ONCE(!q->limits.discard_granularity))
> + return -EINVAL;
> +

Here -EINVAL should be -EOPNOTSUPP, to indicate that the problem comes
from the driver and not from the discard request. This typo is from a
partial git rebase. I will post a v2 to fix it.

Coly Li

[PATCH v2] block: check queue's limits.discard_granularity in __blkdev_issue_discard()

If a loop device is created with a backing NVMe SSD, the current loop
device driver doesn't correctly set its queue's limits.discard_granularity
and leaves it as 0. If a discard request is issued at LBA 0 on this loop
device, the req_sects calculated in __blkdev_issue_discard() will be 0,
and the zero-length discard request will trigger a BUG() panic in generic
block layer code at block/blk-mq.c:563.

[  955.565006][   C39] [ cut here ]
[  955.559660][   C39] invalid opcode:  [#1] SMP NOPTI
[  955.622171][   C39] CPU: 39 PID: 248 Comm: ksoftirqd/39 Tainted: G   
 E 5.8.0-default+ #40
[  955.622171][   C39] Hardware name: Lenovo ThinkSystem SR650 
-[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE160M-2.70]- 07/17/2020
[  955.622175][   C39] RIP: 0010:blk_mq_end_request+0x107/0x110
[  955.622177][   C39] Code: 48 8b 03 e9 59 ff ff ff 48 89 df 5b 5d 41 5c e9 9f 
ed ff ff 48 8b 35 98 3c f4 00 48 83 c7 10 48 83 c6 19 e8 cb 56 c9 ff eb cb <0f> 
0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 54
[  955.622179][   C39] RSP: 0018:b1288701fe28 EFLAGS: 00010202
[  955.749277][   C39] RAX: 0001 RBX: 956fffba5080 RCX: 
4003
[  955.749278][   C39] RDX: 0003 RSI:  RDI: 

[  955.749279][   C39] RBP:  R08:  R09: 

[  955.749279][   C39] R10: b1288701fd28 R11: 0001 R12: 
a8e05160
[  955.749280][   C39] R13: 0004 R14: 0004 R15: 
a7ad3a1e
[  955.749281][   C39] FS:  () GS:95bfbda0() 
knlGS:
[  955.749282][   C39] CS:  0010 DS:  ES:  CR0: 80050033
[  955.749282][   C39] CR2: 7f6f0ef766a8 CR3: 005a37012002 CR4: 
007606e0
[  955.749283][   C39] DR0:  DR1:  DR2: 

[  955.749284][   C39] DR3:  DR6: fffe0ff0 DR7: 
0400
[  955.749284][   C39] PKRU: 5554
[  955.749285][   C39] Call Trace:
[  955.749290][   C39]  blk_done_softirq+0x99/0xc0
[  957.550669][   C39]  __do_softirq+0xd3/0x45f
[  957.550677][   C39]  ? smpboot_thread_fn+0x2f/0x1e0
[  957.550679][   C39]  ? smpboot_thread_fn+0x74/0x1e0
[  957.550680][   C39]  ? smpboot_thread_fn+0x14e/0x1e0
[  957.550684][   C39]  run_ksoftirqd+0x30/0x60
[  957.550687][   C39]  smpboot_thread_fn+0x149/0x1e0
[  957.886225][   C39]  ? sort_range+0x20/0x20
[  957.886226][   C39]  kthread+0x137/0x160
[  957.886228][   C39]  ? kthread_park+0x90/0x90
[  957.886231][   C39]  ret_from_fork+0x22/0x30
[  959.117120][   C39] ---[ end trace 3dacdac97e2ed164 ]---

The panic can be reproduced with the following steps:
  # modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
  # losetup -f /dev/nvme0n1 --direct-io=on
  # blkdiscard /dev/loop0 -o 0 -l 0x200

This patch fixes the issue by checking q->limits.discard_granularity in
__blkdev_issue_discard() before composing the discard bio. If the value
is 0, it prints a one-time warning and returns -EOPNOTSUPP to the caller,
indicating that this buggy device driver doesn't support discard
requests.

Fixes: 9b15d109a6b2 ("block: improve discard bio alignment in __blkdev_issue_discard()")
Fixes: c52abf563049 ("loop: Better discard support for block devices")
Reported-and-suggested-by: Ming Lei 
Signed-off-by: Coly Li 
Cc: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Enzo Matsumiya 
Cc: Evan Green 
Cc: Hannes Reinecke 
Cc: Jens Axboe 
Cc: Martin K. Petersen 
Cc: Ming Lei 
Cc: Xiao Ni 
---
Changelog:
v2: fix typo of the wrong return error code.
v1: first version.

 block/blk-lib.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 019e09bb9c0e..729f05729529 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -47,6 +47,10 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		op = REQ_OP_DISCARD;
 	}
 
+	/* In case the discard granularity isn't set by buggy device driver */
+	if (WARN_ON_ONCE(!q->limits.discard_granularity))
+		return -EOPNOTSUPP;
+
 	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
 	if ((sector | nr_sects) & bs_mask)
 		return -EINVAL;
-- 
2.26.2
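
For readers following along, the arithmetic described in the commit message
above can be checked in user space. The following is only a minimal sketch:
round_up_pow2(), round_down_pow2(), aligned_discard_max_sectors() and
first_req_sects() are hypothetical stand-ins, written on the assumption that
the kernel's round_up()/round_down() use the usual power-of-two mask trick.
The unaligned branch is likewise an assumption, since the thread only quotes
the aligned case. It is not the kernel code itself.

```c
#include <limits.h>
#include <stdint.h>

#define SECTOR_SHIFT 9

/*
 * User-space stand-ins for round_up()/round_down(). Like the mask-based
 * originals they are only meaningful when y is a power of two; y == 0
 * makes the mask all ones, which is what this sketch wants to show.
 */
static uint64_t round_up_pow2(uint64_t x, uint64_t y)
{
	return ((x - 1) | (y - 1)) + 1;
}

static uint64_t round_down_pow2(uint64_t x, uint64_t y)
{
	return x & ~(y - 1);
}

/* Hypothetical model of bio_aligned_discard_max_sectors(). */
static uint64_t aligned_discard_max_sectors(uint32_t discard_granularity)
{
	return round_down_pow2(UINT_MAX, discard_granularity) >> SECTOR_SHIFT;
}

/*
 * Hypothetical model of how the first bio's req_sects is chosen.
 * The unaligned branch is an assumption; only the aligned branch is
 * quoted in this thread.
 */
static uint64_t first_req_sects(uint64_t sector_mapped, uint64_t nr_sects,
				uint32_t discard_granularity)
{
	uint64_t aligned_lba = round_up_pow2(sector_mapped,
			discard_granularity >> SECTOR_SHIFT);
	uint64_t limit;

	if (aligned_lba == sector_mapped)
		limit = aligned_discard_max_sectors(discard_granularity);
	else
		limit = aligned_lba - sector_mapped;

	return nr_sects < limit ? nr_sects : limit;
}
```

With discard_granularity == 0, first_req_sects(0, 1, 0) evaluates to 0, which
corresponds to the zero-length bio from the report. With a granularity of
4096 the same call yields 1 sector, i.e. the 0x200 bytes requested by
blkdiscard.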

[PATCH 1/2] dt-bindings: fsl: Convert i.MX7ULP PM to json-schema

Convert the i.MX7ULP PM binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../bindings/arm/freescale/fsl,imx7ulp-pm.txt  | 23 -
 .../bindings/arm/freescale/fsl,imx7ulp-pm.yaml | 40 ++
 2 files changed, 40 insertions(+), 23 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.txt
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.yaml

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.txt
deleted file mode 100644
index 75195be..000
--- a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.txt
+++ /dev/null
@@ -1,23 +0,0 @@
-Freescale i.MX7ULP Power Management Components
---
-
-The Multi-System Mode Controller (MSMC) is responsible for sequencing
-the MCU into and out of all stop and run power modes. Specifically, it
-monitors events to trigger transitions between power modes while
-controlling the power, clocks, and memories of the MCU to achieve the
-power consumption and functionality of that mode.
-
-The WFI or WFE instruction is used to invoke a Sleep, Deep Sleep or
-Standby modes for either Cortex family. Run, Wait, and Stop are the
-common terms used for the primary operating modes of Kinetis
-microcontrollers.
-
-Required properties:
-- compatible:  Should be "fsl,imx7ulp-smc1".
-- reg: Specifies base physical address and size of the register sets.
-
-Example:
-smc1: smc1@4041 {
-   compatible = "fsl,imx7ulp-smc1";
-   reg = <0x4041 0x1000>;
-};
diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.yaml b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.yaml
new file mode 100644
index 000..1b00294
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-pm.yaml
@@ -0,0 +1,40 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/freescale/fsl,imx7ulp-pm.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX7ULP Power Management Components
+
+maintainers:
+  - A.s. Dong 
+
+description: |
+  The Multi-System Mode Controller (MSMC) is responsible for sequencing
+  the MCU into and out of all stop and run power modes. Specifically, it
+  monitors events to trigger transitions between power modes while
+  controlling the power, clocks, and memories of the MCU to achieve the
+  power consumption and functionality of that mode.
+
+  The WFI or WFE instruction is used to invoke a Sleep, Deep Sleep or
+  Standby modes for either Cortex family. Run, Wait, and Stop are the
+  common terms used for the primary operating modes of Kinetis
+  microcontrollers.
+
+properties:
+  compatible:
+    const: fsl,imx7ulp-smc1
+
+  reg:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+examples:
+  - |
+    smc1@4041 {
+        compatible = "fsl,imx7ulp-smc1";
+        reg = <0x4041 0x1000>;
+    };
-- 
2.7.4

[PATCH 2/2] dt-bindings: fsl: Convert i.MX7ULP SIM to json-schema

Convert the i.MX7ULP SIM binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../bindings/arm/freescale/fsl,imx7ulp-sim.txt | 16 --
 .../bindings/arm/freescale/fsl,imx7ulp-sim.yaml| 36 ++
 2 files changed, 36 insertions(+), 16 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.txt
 create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.yaml

diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.txt
deleted file mode 100644
index 7d0c7f0..000
--- a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.txt
+++ /dev/null
@@ -1,16 +0,0 @@
-Freescale i.MX7ULP System Integration Module
---
-The system integration module (SIM) provides system control and chip configuration
-registers. In this module, chip revision information is located in JTAG ID register,
-and a set of registers have been made available in DGO domain for SW use, with the
-objective to maintain its value between system resets.
-
-Required properties:
-- compatible:  Should be "fsl,imx7ulp-sim".
-- reg: Specifies base physical address and size of the register sets.
-
-Example:
-sim: sim@410a3000 {
-   compatible = "fsl,imx7ulp-sim", "syscon";
-   reg = <0x410a3000 0x1000>;
-};
diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.yaml b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.yaml
new file mode 100644
index 000..8b4aff6
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,imx7ulp-sim.yaml
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/freescale/fsl,imx7ulp-sim.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX7ULP System Integration Module
+
+maintainers:
+  - Anson Huang 
+
+description: |
+  The system integration module (SIM) provides system control and chip configuration
+  registers. In this module, chip revision information is located in JTAG ID register,
+  and a set of registers have been made available in DGO domain for SW use, with the
+  objective to maintain its value between system resets.
+
+properties:
+  compatible:
+    items:
+      - const: fsl,imx7ulp-sim
+      - const: syscon
+
+  reg:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+examples:
+  - |
+    sim@410a3000 {
+        compatible = "fsl,imx7ulp-sim", "syscon";
+        reg = <0x410a3000 0x1000>;
+    };
-- 
2.7.4

Re: [PATCH] block: tolerate 0 byte discard_granularity in __blkdev_issue_discard()

On 2020/8/5 10:46, Ming Lei wrote:
> On Wed, Aug 05, 2020 at 09:54:00AM +0800, Coly Li wrote:
>> On 2020/8/5 07:58, Ming Lei wrote:
>>> On Tue, Aug 04, 2020 at 10:23:32PM +0800, Coly Li wrote:
 When some buggy driver doesn't set its queue->limits.discard_granularity
 (e.g. current loop device driver), discard at LBA 0 on such device will
 trigger a kernel BUG() panic from block/blk-mq.c:563.

 [  955.565006][   C39] [ cut here ]
 [  955.559660][   C39] invalid opcode:  [#1] SMP NOPTI
 [  955.622171][   C39] CPU: 39 PID: 248 Comm: ksoftirqd/39 Tainted: G  
   E 5.8.0-default+ #40
 [  955.622171][   C39] Hardware name: Lenovo ThinkSystem SR650 
 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE160M-2.70]- 07/17/2020
 [  955.622175][   C39] RIP: 0010:blk_mq_end_request+0x107/0x110
 [  955.622177][   C39] Code: 48 8b 03 e9 59 ff ff ff 48 89 df 5b 5d 41 5c 
 e9 9f ed ff ff 48 8b 35 98 3c f4 00 48 83 c7 10 48 83 c6 19 e8 cb 56 c9 ff 
 eb cb <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 54
 [  955.622179][   C39] RSP: 0018:b1288701fe28 EFLAGS: 00010202
 [  955.749277][   C39] RAX: 0001 RBX: 956fffba5080 RCX: 
 4003
 [  955.749278][   C39] RDX: 0003 RSI:  RDI: 
 
 [  955.749279][   C39] RBP:  R08:  R09: 
 
 [  955.749279][   C39] R10: b1288701fd28 R11: 0001 R12: 
 a8e05160
 [  955.749280][   C39] R13: 0004 R14: 0004 R15: 
 a7ad3a1e
 [  955.749281][   C39] FS:  () 
 GS:95bfbda0() knlGS:
 [  955.749282][   C39] CS:  0010 DS:  ES:  CR0: 80050033
 [  955.749282][   C39] CR2: 7f6f0ef766a8 CR3: 005a37012002 CR4: 
 007606e0
 [  955.749283][   C39] DR0:  DR1:  DR2: 
 
 [  955.749284][   C39] DR3:  DR6: fffe0ff0 DR7: 
 0400
 [  955.749284][   C39] PKRU: 5554
 [  955.749285][   C39] Call Trace:
 [  955.749290][   C39]  blk_done_softirq+0x99/0xc0
 [  957.550669][   C39]  __do_softirq+0xd3/0x45f
 [  957.550677][   C39]  ? smpboot_thread_fn+0x2f/0x1e0
 [  957.550679][   C39]  ? smpboot_thread_fn+0x74/0x1e0
 [  957.550680][   C39]  ? smpboot_thread_fn+0x14e/0x1e0
 [  957.550684][   C39]  run_ksoftirqd+0x30/0x60
 [  957.550687][   C39]  smpboot_thread_fn+0x149/0x1e0
 [  957.886225][   C39]  ? sort_range+0x20/0x20
 [  957.886226][   C39]  kthread+0x137/0x160
 [  957.886228][   C39]  ? kthread_park+0x90/0x90
 [  957.886231][   C39]  ret_from_fork+0x22/0x30
 [  959.117120][   C39] ---[ end trace 3dacdac97e2ed164 ]---

 This is the procedure to reproduce the panic,
   # modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
   # losetup -f /dev/nvme0n1 --direct-io=on
   # blkdiscard /dev/loop0 -o 0 -l 0x200

 This is how the BUG() panic is triggered by __blkdev_issue_discard(),
 - For a NVMe SSD backing loop device, the driver does not initialize
   its queue->limits.discard_granularity and leaves it as 0.
 - When a discard is issued on LBA 0 of the loop device,
   __blkdev_issue_discard() is called before loop device driver code.
 - Inside __blkdev_issue_discard(), when calculating value of
   granularity_aligned_lba by
        granularity_aligned_lba = round_up(sector_mapped,
                q->limits.discard_granularity >> SECTOR_SHIFT);
   because sector_mapped is 0 (at LBA 0 and no partition offset), and
   q->limits.discard_granularity is 0 (by the buggy loop driver), the
   calculated granularity_aligned_lba is 0.
 - The inline function bio_aligned_discard_max_sectors() is defined as
        return round_down(UINT_MAX, q->limits.discard_granularity) >>
                SECTOR_SHIFT;
   when q->limits.discard_granularity is 0 from loop device driver, the
   above calculation returns value 0.
 - Now granularity_aligned_lba and sector_mapped are 0, req_sects is
   calculated by the following lines in __blkdev_issue_discard(),
        if (granularity_aligned_lba == sector_mapped)
                req_sects = min_t(sector_t, nr_sects,
                                  bio_aligned_discard_max_sectors(q));
   because bio_aligned_discard_max_sectors(q) returns 0, req_sects is
   calculated as 0.
 - Now a discard bio is mistakenly initialized as a 0 byte bio by,
        bio->bi_iter.bi_size = req_sects << 9;
   and sent to loop device driver.
 - This discard request is handled by loop device driver by following
   code path,
 loop_handle_cmd => do_req_filebacked => lo_fallocate =>
 file->f_op->fallocate => blkdev_fallocate =>

[PATCH] block: check queue's limits.discard_granularity in __blkdev_issue_discard()