Re: [PATCH v4 10/18] iommu/mediatek: Add mt8183 IOMMU support

2018-12-21 Thread Yong Wu
On Fri, 2018-12-21 at 19:31 +0100, Matthias Brugger wrote:
> 
> On 08/12/2018 09:39, Yong Wu wrote:
> > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > the ARM Short-descriptor like mt8173, and most of the HW registers
> > are the same.
> > 
> > Here list main differences between mt8183 and mt8173/mt2712:
> > 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> > 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> > 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> > mode".
> > 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> > the bit[33:32] in the physical address of the pgtable base, But the
> > standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> > we add a mask.
> > 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> > 6) the larb-id in smi-common is remapped. M4U should enable
> > larbid_remapped support.
> > 
> > Signed-off-by: Yong Wu 
> > ---
[...]
> > +static const struct mtk_iommu_plat_data mt8183_data = {
> > +   .m4u_plat= M4U_MT8183,
> > +   .larbid_remap_enable = true,
> > +   .larbid_remapped = {0, 4, 5, 6, 7, 2, 3, 1},
> 
> Aren't we reinventing the wheel here?
> Why can't we use larb-id to get the correct id insteaf of providing another 
> data
> structure for the remapping?

Sorry, The remapping id is arbitrary, there is no rule to get it from
the larb-id.

>From Nicolas's comment, I plan to delete "larbid_remap_enable" and only
use "larbid_remap". The other SoCs use the linear mapping here.

In addition, I have to apologize that here will may be improved for
mt2712. There are 2 smi-common(smi-common0 and smi-common1) in mt2712,
actually the remapping relationship for smi-common1 is also different.
If it is really needed, I plan to change it from "larbid_remap" to
"larbid_remap[2]" which 0 is for smi-common0 and 1 is for smi-common1.
Of course, it doesn't affect the iommu functions and only prints the
error log when IOMMU translation fault.

> 
> Regards,
> Matthias

[...]




Re: [PATCH v4 03/18] memory: mtk-smi: Use a general config_port interface

2018-12-21 Thread Yong Wu
On Fri, 2018-12-21 at 18:47 +0100, Matthias Brugger wrote:
> 
> On 08/12/2018 09:39, Yong Wu wrote:
> > The config_port of mt2712 and mt8183 are the same. Use a general
> > config_port interface instead.
> > 
> > In addition, in mt2712, larb8 and larb9 are the bdpsys larbs which
> > are not the normal larb, their register space are different from the
> > normal one. thus, we can not call the general config_port. In mt8183,
> > IPU0/1 and CCU connect with smi-common directly, they also are not
> > the normal larb. Hence, we add a "larb_special_mask" for these special
> > larbs.
> > 
> > This is also a preparing patch for adding mt8183 SMI support.
> > 
> > Signed-off-by: Yong Wu 
> > ---
> >  drivers/memory/mtk-smi.c | 12 +---
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
> > index 8f2d152..3b9ad0e 100644
> > --- a/drivers/memory/mtk-smi.c
> > +++ b/drivers/memory/mtk-smi.c
> > @@ -53,6 +53,7 @@ struct mtk_smi_larb_gen {
> > bool need_larbid;
> > int port_in_larb[MTK_LARB_NR_MAX + 1];
> > void (*config_port)(struct device *);
> > +   unsigned int larb_special_mask; /* The special larbs mask. */
> 
> I'm not really happy with the name larb_special_mask but I can't think of
> anything else. The comment is not needed as it just rewords the name of the
> variable.

Thanks this comment. then I reword more detail instead this common
"special", How about "larb_to_common_directly_mask"?

> 
> Other then that (or even without changing anything):
> 
> Reviewed-by: Matthias Brugger 

Thanks.

> 
> >  };
> >  
> >  struct mtk_smi {
> > @@ -176,17 +177,13 @@ void mtk_smi_larb_put(struct device *larbdev)
> > return -ENODEV;
> >  }
> >  
> > -static void mtk_smi_larb_config_port_mt2712(struct device *dev)
> > +static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
> >  {
> > struct mtk_smi_larb *larb = dev_get_drvdata(dev);
> > u32 reg;
> > int i;
> >  
> > -   /*
> > -* larb 8/9 is the bdpsys larb, the iommu_en is enabled defaultly.
> > -* Don't need to set it again.
> > -*/
> > -   if (larb->larbid == 8 || larb->larbid == 9)
> > +   if (BIT(larb->larbid) & larb->larb_gen->larb_special_mask)
> > return;
> >  
> > for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
> > @@ -261,7 +258,8 @@ static void mtk_smi_larb_config_port_gen1(struct device 
> > *dev)
> >  
> >  static const struct mtk_smi_larb_gen mtk_smi_larb_mt2712 = {
> > .need_larbid = true,
> > -   .config_port = mtk_smi_larb_config_port_mt2712,
> > +   .config_port   = mtk_smi_larb_config_port_gen2_general,
> > +   .larb_special_mask = BIT(8) | BIT(9),  /* bdpsys */
> >  };
> >  
> >  static const struct of_device_id mtk_smi_larb_of_ids[] = {
> > 




Re: [PATCH v4 10/18] iommu/mediatek: Add mt8183 IOMMU support

2018-12-21 Thread Yong Wu
On Sat, 2018-12-22 at 08:31 +0800, Nicolas Boichat wrote:
> On Fri, Dec 21, 2018 at 4:02 PM Yong Wu  wrote:
> >
> > On Fri, 2018-12-21 at 12:43 +0800, Nicolas Boichat wrote:
> > > On Sat, Dec 8, 2018 at 4:42 PM Yong Wu  wrote:
> > > >
> > > > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > > > the ARM Short-descriptor like mt8173, and most of the HW registers
> > > > are the same.
> > > >
> > > > Here list main differences between mt8183 and mt8173/mt2712:
> > > > 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> > > > 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> > > > 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> > > > mode".
> > > > 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> > > > the bit[33:32] in the physical address of the pgtable base, But the
> > > > standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> > > > we add a mask.
> > > > 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> > > > 6) the larb-id in smi-common is remapped. M4U should enable
> > > > larbid_remapped support.
> > > >
> > > > Signed-off-by: Yong Wu 
> > > > ---
> > > >  drivers/iommu/mtk_iommu.c | 31 ++-
> > > >  drivers/iommu/mtk_iommu.h |  1 +
> > > >  drivers/memory/mtk-smi.c  | 19 +++
> > > >  3 files changed, 42 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > > index 8ab3b69..d91a554 100644
> > > > --- a/drivers/iommu/mtk_iommu.c
> > > > +++ b/drivers/iommu/mtk_iommu.c
> > > > @@ -36,6 +36,7 @@
> > > >  #include "mtk_iommu.h"
> > > >
> > > >  #define REG_MMU_PT_BASE_ADDR   0x000
> > > > +#define MMU_PT_ADDR_MASK   GENMASK(31, 7)
> > > >
> > > >  #define REG_MMU_INVALIDATE 0x020
> > > >  #define F_ALL_INVLD0x2
> > > > @@ -54,7 +55,7 @@
> > > >  #define REG_MMU_CTRL_REG   0x110
> > > >  #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
> > > >  #define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> > > > -   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> > > > +   ((data)->plat_data->m4u_plat == M4U_MT8173 ? 5 : 4)
> > >
> > > Should the shift value be a member of plat_data instead?
> >
> > It's also ok.
> > This TF_PROTECT_SEL MACRO looks a bit complex. I can use a new patch to
> > refactor it.
> 
> SGTM.
> 
> > >
> > > >  /* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> > > >  #define F_MMU_TF_PROTECT_SEL(prot, data) \
> > > > (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> > > > @@ -347,7 +348,7 @@ static int mtk_iommu_attach_device(struct 
> > > > iommu_domain *domain,
> > > > /* Update the pgtable base address register of the M4U HW */
> > > > if (!data->m4u_dom) {
> > > > data->m4u_dom = dom;
> > > > -   writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> > > > +   writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
> > > >data->base + REG_MMU_PT_BASE_ADDR);
> > > > }
> > > >
> > > > @@ -510,6 +511,7 @@ static int mtk_iommu_of_xlate(struct device *dev, 
> > > > struct of_phandle_args *args)
> > > >
> > > >  static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > > >  {
> > > > +   enum mtk_iommu_plat m4u_plat = data->plat_data->m4u_plat;
> > > > u32 regval;
> > > > int ret;
> > > >
> > > > @@ -520,7 +522,7 @@ static int mtk_iommu_hw_init(const struct 
> > > > mtk_iommu_data *data)
> > > > }
> > > >
> > > > regval = F_MMU_TF_PROTECT_SEL(2, data);
> > > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > > +   if (m4u_plat == M4U_MT8173)
> > > > regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> > > > writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> > > >
> > > > @@ -541,14 +543,14 @@ static int mtk_iommu_hw_init(const struct 
> > > > mtk_iommu_data *data)
> > > > F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
> > > > writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
> > > >
> > > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > > +   if (m4u_plat == M4U_MT8173)
> > > > regval = (data->protect_base >> 1) | (data->enable_4GB 
> > > > << 31);
> > > > else
> > > > regval = lower_32_bits(data->protect_base) |
> > > >  upper_32_bits(data->protect_base);
> > > > writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
> > > >
> > > > -   if (data->enable_4GB && data->plat_data->m4u_plat != 
> > > > M4U_MT8173) {
> > > > +   if (data->enable_4GB && m4u_plat == M4U_MT2712) {
> > > > /*
> > > >  * If 4GB mode is enabled, the validate PA range is from
> > > >  * 0x1__ to 0x1__. here record 

[PATCH] debugfs: remove no need inc_nlink

2018-12-21 Thread yangerkun
Remove inc_nlink in debugfs_create_automount, or this inode will never
be free.

Signed-off-by: yangerkun 
---
 fs/debugfs/inode.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 13b0135..9e6e225 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -516,8 +516,6 @@ struct dentry *debugfs_create_dir(const char *name, struct 
dentry *parent)
inode->i_op = _dir_inode_operations;
inode->i_fop = _dir_operations;
 
-   /* directory inodes start off with i_nlink == 2 (for "." entry) */
-   inc_nlink(inode);
d_instantiate(dentry, inode);
inc_nlink(d_inode(dentry->d_parent));
fsnotify_mkdir(d_inode(dentry->d_parent), dentry);
-- 
2.9.5



Re: [PATCH v1 08/11] ASoC: Intel: Make PCI dependency explicit

2018-12-21 Thread Randy Dunlap
On 12/21/18 4:14 PM, Sinan Kaya wrote:
> IOSF_MBI driver depends on CONFIG_PCI set but this is not specified
> anywhere.

wrong patch description...?

> Signed-off-by: Sinan Kaya 
> ---
>  sound/soc/intel/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
> index 2fd1b61e8331..b0764b2fe001 100644
> --- a/sound/soc/intel/Kconfig
> +++ b/sound/soc/intel/Kconfig
> @@ -91,7 +91,7 @@ config SND_SST_ATOM_HIFI2_PLATFORM_PCI
>  config SND_SST_ATOM_HIFI2_PLATFORM_ACPI
>   tristate "ACPI HiFi2 (Baytrail, Cherrytrail) Platforms"
>   default ACPI
> - depends on X86 && ACPI
> + depends on X86 && ACPI && PCI
>   select SND_SST_IPC_ACPI
>   select SND_SST_ATOM_HIFI2_PLATFORM
>   select SND_SOC_ACPI_INTEL_MATCH
> 


-- 
~Randy


Re: [PATCH] proc: sysctl: fix double drop_sysctl_table()

2018-12-21 Thread Al Viro
On Sat, Dec 22, 2018 at 10:12:21AM +0800, Yafang Shao wrote:
> The callers of insert_header() will drop sysctl table whatever
> insert_header() successes or fails.
> So we can't call drop_sysctl_table() in insert_header().
> 
> The call sites of insert_header() are all in fs/proc/proc_sysctl.c.

Except that at least insert_links() does this:

...
core_parent->header.nreg++;
...
err = insert_header(core_parent, links);
if (err)
kfree(links);
out:
drop_sysctl_table(_parent->header);
with that drop clearly paired with explicit increment of nreg.  So your
patch appears to break at least that one.

Looking at the rest of callers... __register_sysctl_table() demonstrates
the same pattern - explicit increment of nreg, then insert_header(),
then *unconditional* drop undoing that increment.

The third and last caller (get_subdir()) appears to be different.
We do insert_header(), if it succeeds we bump nreg on new and
unconditionally drop a reference to dir, as well as new.

Let's look at the callers of that sucker.

/* Reference moved down the diretory tree get_subdir */
dir->header.nreg++;
spin_unlock(_lock);

/* Find the directory for the ctl_table */
for (name = path; name; name = nextname) {
int namelen;
nextname = strchr(name, '/');
if (nextname) {  
namelen = nextname - name;
nextname++;
} else {
namelen = strlen(name);
} 
if (namelen == 0)
continue;

dir = get_subdir(dir, name, namelen);
if (IS_ERR(dir))
goto fail;
}

spin_lock(_lock);
if (insert_header(dir, header))
goto fail_put_dir_locked;

Aha...  So we are traversing a tree and it smells like get_subdir()
expects dir to be pinned by the caller, drops that reference and
either fails or returns the next tree node pinned.

_IF_ that matches the behaviour of get_subdir(), the code above
makes sense -
grab dir
for each non-empty pathname component
next = get_subdir(dir, component)
// dir got dropped
if get_subdir has failed
goto fail
// next got grabbed
dir = next
insert_header(dir, header)
drop dir
if insert_header was successful
return header
fail:
// all references grabbed by the above are dropped by now

So let's look at get_subdir() from refcounting POV (ignoring the locking
for now):
subdir = find_subdir(dir, name, namelen);
if (!IS_ERR(subdir))
goto found;
if (PTR_ERR(subdir) != -ENOENT)
goto failed;
yeccchhh  What's wrong with if (subdir == ERR_PTR(-ENOENT))?
Anyway, find_subdir() appears to be refcounting-neutral.
new = new_dir(set, name, namelen);
OK, looks like we are creating a new object here.  What's the value of .nreg
in it?  new_dir() itself doesn't seem to set it, but the thing it calls at
the end (init_header()) does set it to 1.  OK, so we'd got a reference to
new object, our counter being 1.  On failure it appears to return NULL.
OK...
subdir = ERR_PTR(-ENOMEM);
if (!new)
goto failed;
right, so if allocation in new_dir() has failed, we are buggering off to the
same 'failed' label as on other exits.  In failure case new_dir() is
refcounting-neutral, so we are in the same environment.
/* Was the subdir added while we dropped the lock? */
subdir = find_subdir(dir, name, namelen);
if (!IS_ERR(subdir))
goto found;
if (PTR_ERR(subdir) != -ENOENT)
goto failed;
Interesting...  So we can get to 'failed' in some cases when new_dir()
has not failed.
/* Nope.  Use the our freshly made directory entry. */
err = insert_header(dir, >header);
subdir = ERR_PTR(err);
if (err)
goto failed;
Looks like it expects insert_header() to be refcounting-neutral in failure
case...
subdir = new;
found:
subdir->header.nreg++;

OK, we can get here from 3 places:
* subdir found by lookup before we even tried to allocate new.
new is NULL, subdir has just gotten a reference grabbed.
* subdir found by re-lookup after new had been allocated.
new is non-NULL and we are holding one reference to it, subdir has
just gotten a reference grabbed and it's not equal to new.
* new got successfully inserted into dir; subdir is equal
to new and we'd just grabbed the second reference to it.

Looks like in all those cases we have a reference grabbed on subdir
*and* a reference grabbed on new (if non-NULL).  Covers all 3 cases.,,

failed:
... and now we rejoin the failure paths (a lousy 

[PATCH] m68k/mac: Remove obsolete comment

2018-12-21 Thread Finn Thain
According to The Guide To Mac Family Hardware, this is the correct way
to disable the VBL interrupt. The confusing comment here doesn't add any
value so remove it.

Signed-off-by: Finn Thain 
---
 arch/m68k/mac/via.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/m68k/mac/via.c b/arch/m68k/mac/via.c
index 4fa32752bd7d..b5d320844b71 100644
--- a/arch/m68k/mac/via.c
+++ b/arch/m68k/mac/via.c
@@ -180,7 +180,6 @@ void __init via_init(void)
 
/*
 * SE/30: disable video IRQ
-* XXX: testing for SE/30 VBL
 */
 
if (macintosh_config->ident == MAC_MODEL_SE30) {
-- 
2.19.2


[PATCH] m68k/mac: Use '030 reset method on SE/30

2018-12-21 Thread Finn Thain
The comment says that calling the ROM routine doesn't work. But testing
shows that the 68030 fall-back reset method does work, so just use that.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 arch/m68k/mac/misc.c | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index ebb3b6d169ea..2d8da060f875 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -410,9 +410,8 @@ void mac_poweroff(void)
 
 void mac_reset(void)
 {
-   if (macintosh_config->adb_type == MAC_ADB_II) {
-   unsigned long flags;
-
+   if (macintosh_config->adb_type == MAC_ADB_II &&
+   macintosh_config->ident != MAC_MODEL_SE30) {
/* need ROMBASE in booter */
/* indeed, plus need to MAP THE ROM !! */
 
@@ -422,17 +421,8 @@ void mac_reset(void)
/* works on some */
rom_reset = (void *) (mac_bi_data.rombase + 0xa);
 
-   if (macintosh_config->ident == MAC_MODEL_SE30) {
-   /*
-* MSch: Machines known to crash on ROM reset ...
-*/
-   } else {
-   local_irq_save(flags);
-
-   rom_reset();
-
-   local_irq_restore(flags);
-   }
+   local_irq_disable();
+   rom_reset();
 #ifdef CONFIG_ADB_CUDA
} else if (macintosh_config->adb_type == MAC_ADB_EGRET ||
   macintosh_config->adb_type == MAC_ADB_CUDA) {
-- 
2.19.2


[PATCH] m68k/mac: Skip VIA port setup unless RTC is connected

2018-12-21 Thread Finn Thain
Those Mac models which don't connect their RTC to VIA1 port B probably
have something else connected to those pins. Just leave them the way we
found them. Make the port B setup conditional on via_type, to match the
RTC accessors in arch/m68k/mac/misc.c.

Signed-off-by: Finn Thain 
---
 arch/m68k/mac/via.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/m68k/mac/via.c b/arch/m68k/mac/via.c
index 3179cb7a..4fa32752bd7d 100644
--- a/arch/m68k/mac/via.c
+++ b/arch/m68k/mac/via.c
@@ -188,13 +188,18 @@ void __init via_init(void)
via1[vBufB] |= 0x40;
}
 
-   /*
-* Set the RTC bits to a known state: all lines to outputs and
-* RTC disabled (yes that's 0 to enable and 1 to disable).
-*/
-
-   via1[vDirB] |= (VIA1B_vRTCEnb | VIA1B_vRTCClk | VIA1B_vRTCData);
-   via1[vBufB] |= (VIA1B_vRTCEnb | VIA1B_vRTCClk);
+   switch (macintosh_config->adb_type) {
+   case MAC_ADB_IOP:
+   case MAC_ADB_II:
+   case MAC_ADB_PB1:
+   /*
+* Set the RTC bits to a known state: all lines to outputs and
+* RTC disabled (yes that's 0 to enable and 1 to disable).
+*/
+   via1[vDirB] |= VIA1B_vRTCEnb | VIA1B_vRTCClk | VIA1B_vRTCData;
+   via1[vBufB] |= VIA1B_vRTCEnb | VIA1B_vRTCClk;
+   break;
+   }
 
/* Everything below this point is VIA2/RBV only... */
 
-- 
2.19.2


[PATCH] m68k/mac: Clean up unused timer definitions

2018-12-21 Thread Finn Thain
Signed-off-by: Finn Thain 
---
 arch/m68k/include/asm/macints.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/m68k/include/asm/macints.h b/arch/m68k/include/asm/macints.h
index cddb2d3ea49b..4da172bd048c 100644
--- a/arch/m68k/include/asm/macints.h
+++ b/arch/m68k/include/asm/macints.h
@@ -121,7 +121,4 @@
 #define SLOT2IRQ(x)  (x + 47)
 #define IRQ2SLOT(x)  (x - 47)
 
-#define INT_CLK   24576/* CLK while int_clk =2.456MHz and divide = 
100 */
-#define INT_TICKS 246  /* to make sched_time = 99.902... HZ */
-
 #endif /* asm/macints.h */
-- 
2.19.2


Re: [PATCH 14/14] misc: ti-st: Drop superseded driver

2018-12-21 Thread Sebastian Reichel
Hi,

On Fri, Dec 21, 2018 at 03:10:52PM -0600, Adam Ford wrote:
> On Fri, Dec 21, 2018 at 2:13 AM Sebastian Reichel  wrote:
> >
> > From: Sebastian Reichel 
> >
> > This driver has been superseded by the serdev based Bluetooth
> > hci_ll driver, which is initialized from DT. All mainline users
> > have been converted and this driver can be safely dropped.
> 
> There seems to be an issue with my wl1283 because the
> logicod-torpedo-37xx-devkit doesn't work with the proposed device tree
> changes, but the older shared transport driver still works.
> I commented on the patch that modifies the board with details of the
> firmware timeout.
> 
> Until this is resolved, I'd like to hold off on applying these changes.

mh :/ I can't help with this, since I don't have this board (nor any
other wl1283 based one). Is the FM part usable on that device? If
its unusable the patchset could be splitted.

> Also, there are references to this driver inside pdata-quirks that
> need to be removed as well once the loading and timeout issues have
> been resolved.

I dropped the pdata-quirks for TI_ST in patch 3 of this series.

-- Sebastian

> 
> adam
> >
> > Signed-off-by: Sebastian Reichel 
> > ---
> >  drivers/misc/Kconfig |   1 -
> >  drivers/misc/Makefile|   1 -
> >  drivers/misc/ti-st/Kconfig   |  18 -
> >  drivers/misc/ti-st/Makefile  |   6 -
> >  drivers/misc/ti-st/st_core.c | 922 ---
> >  drivers/misc/ti-st/st_kim.c  | 868 -
> >  drivers/misc/ti-st/st_ll.c   | 169 ---
> >  include/linux/ti_wilink_st.h | 335 -
> >  8 files changed, 2320 deletions(-)
> >  delete mode 100644 drivers/misc/ti-st/Kconfig
> >  delete mode 100644 drivers/misc/ti-st/Makefile
> >  delete mode 100644 drivers/misc/ti-st/st_core.c
> >  delete mode 100644 drivers/misc/ti-st/st_kim.c
> >  delete mode 100644 drivers/misc/ti-st/st_ll.c
> >
> > diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> > index 3726eacdf65d..a5cc07d33c74 100644
> > --- a/drivers/misc/Kconfig
> > +++ b/drivers/misc/Kconfig
> > @@ -516,7 +516,6 @@ config MISC_RTSX
> >  source "drivers/misc/c2port/Kconfig"
> >  source "drivers/misc/eeprom/Kconfig"
> >  source "drivers/misc/cb710/Kconfig"
> > -source "drivers/misc/ti-st/Kconfig"
> >  source "drivers/misc/lis3lv02d/Kconfig"
> >  source "drivers/misc/altera-stapl/Kconfig"
> >  source "drivers/misc/mei/Kconfig"
> > diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> > index af22bbc3d00c..31c1e3eb4952 100644
> > --- a/drivers/misc/Makefile
> > +++ b/drivers/misc/Makefile
> > @@ -39,7 +39,6 @@ obj-y += cb710/
> >  obj-$(CONFIG_SPEAR13XX_PCIE_GADGET)+= spear13xx_pcie_gadget.o
> >  obj-$(CONFIG_VMWARE_BALLOON)   += vmw_balloon.o
> >  obj-$(CONFIG_PCH_PHUB) += pch_phub.o
> > -obj-y  += ti-st/
> >  obj-y  += lis3lv02d/
> >  obj-$(CONFIG_USB_SWITCH_FSA9480) += fsa9480.o
> >  obj-$(CONFIG_ALTERA_STAPL) +=altera-stapl/
> > diff --git a/drivers/misc/ti-st/Kconfig b/drivers/misc/ti-st/Kconfig
> > deleted file mode 100644
> > index 5bb92698bc80..
> > --- a/drivers/misc/ti-st/Kconfig
> > +++ /dev/null
> > @@ -1,18 +0,0 @@
> > -#
> > -# TI's shared transport line discipline and the protocol
> > -# drivers (BT, FM and GPS)
> > -#
> > -menu "Texas Instruments shared transport line discipline"
> > -config TI_ST
> > -   tristate "Shared transport core driver"
> > -   depends on NET && TTY
> > -   depends on GPIOLIB || COMPILE_TEST
> > -   select FW_LOADER
> > -   help
> > - This enables the shared transport core driver for TI
> > - BT / FM and GPS combo chips. This enables protocol drivers
> > - to register themselves with core and send data, the responses
> > - are returned to relevant protocol drivers based on their
> > - packet types.
> > -
> > -endmenu
> > diff --git a/drivers/misc/ti-st/Makefile b/drivers/misc/ti-st/Makefile
> > deleted file mode 100644
> > index 78d7ebb14749..
> > --- a/drivers/misc/ti-st/Makefile
> > +++ /dev/null
> > @@ -1,6 +0,0 @@
> > -#
> > -# Makefile for TI's shared transport line discipline
> > -# and its protocol drivers (BT, FM, GPS)
> > -#
> > -obj-$(CONFIG_TI_ST)+= st_drv.o
> > -st_drv-objs:= st_core.o st_kim.o st_ll.o
> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> > deleted file mode 100644
> > index eda8d407be28..
> > --- a/drivers/misc/ti-st/st_core.c
> > +++ /dev/null
> > @@ -1,922 +0,0 @@
> > -/*
> > - *  Shared Transport Line discipline driver Core
> > - * This hooks up ST KIM driver and ST LL driver
> > - *  Copyright (C) 2009-2010 Texas Instruments
> > - *  Author: Pavan Savoy 
> > - *
> > - *  This program is free software; you can redistribute it and/or modify
> > - *  it under the terms of the GNU General Public License version 2 as
> > - *  

Re: [PATCH] tty: fix race between flush_to_ldisc and tty_open

2018-12-21 Thread kbuild test robot
Hi Li,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tty/tty-testing]
[also build test ERROR on v4.20-rc7 next-20181221]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Li-RongQing/tty-fix-race-between-flush_to_ldisc-and-tty_open/20181222-030046
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git 
tty-testing
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

>> ERROR: "tty_ldisc_unlock" [drivers/staging/speakup/speakup.ko] undefined!

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[GIT PULL] SCSI fixes for 4.20-rc7

2018-12-21 Thread James Bottomley
This is two simple target fixes and one discard related I/O starvation
problem in sd.  The discard problem occurs because the discard page
doesn't have a mempool backing so if the allocation fails due to memory
pressure, we then lose the forward progress we require if the writeout
is on the same device.  The fix is to back it with a mempool.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Jens Axboe (1):
  scsi: sd: use mempool for discard special page

Varun Prakash (2):
  scsi: target: iscsi: cxgbit: add missing spin_lock_init()
  scsi: target: iscsi: cxgbit: fix csk leak

And the diffstat:

 drivers/scsi/sd.c | 23 +++
 drivers/target/iscsi/cxgbit/cxgbit_cm.c   |  5 -
 drivers/target/iscsi/cxgbit/cxgbit_main.c |  1 +
 3 files changed, 24 insertions(+), 5 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3bb2b3351e35..bd0a5c694a97 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -133,6 +133,7 @@ static DEFINE_MUTEX(sd_ref_mutex);
 
 static struct kmem_cache *sd_cdb_cache;
 static mempool_t *sd_cdb_pool;
+static mempool_t *sd_page_pool;
 
 static const char *sd_cache_types[] = {
"write through", "none", "write back",
@@ -759,9 +760,10 @@ static int sd_setup_unmap_cmnd(struct scsi_cmnd *cmd)
unsigned int data_len = 24;
char *buf;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -792,9 +794,10 @@ static int sd_setup_write_same16_cmnd(struct scsi_cmnd 
*cmd, bool unmap)
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -822,9 +825,10 @@ static int sd_setup_write_same10_cmnd(struct scsi_cmnd 
*cmd, bool unmap)
u32 nr_sectors = blk_rq_sectors(rq) >> (ilog2(sdp->sector_size) - 9);
u32 data_len = sdp->sector_size;
 
-   rq->special_vec.bv_page = alloc_page(GFP_ATOMIC | __GFP_ZERO);
+   rq->special_vec.bv_page = mempool_alloc(sd_page_pool, GFP_ATOMIC);
if (!rq->special_vec.bv_page)
return BLKPREP_DEFER;
+   clear_highpage(rq->special_vec.bv_page);
rq->special_vec.bv_offset = 0;
rq->special_vec.bv_len = data_len;
rq->rq_flags |= RQF_SPECIAL_PAYLOAD;
@@ -1286,7 +1290,7 @@ static void sd_uninit_command(struct scsi_cmnd *SCpnt)
u8 *cmnd;
 
if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
-   __free_page(rq->special_vec.bv_page);
+   mempool_free(rq->special_vec.bv_page, sd_page_pool);
 
if (SCpnt->cmnd != scsi_req(rq)->cmd) {
cmnd = SCpnt->cmnd;
@@ -3623,6 +3627,13 @@ static int __init init_sd(void)
goto err_out_cache;
}
 
+   sd_page_pool = mempool_create_page_pool(SD_MEMPOOL_SIZE, 0);
+   if (!sd_page_pool) {
+   printk(KERN_ERR "sd: can't init discard page pool\n");
+   err = -ENOMEM;
+   goto err_out_ppool;
+   }
+
err = scsi_register_driver(_template.gendrv);
if (err)
goto err_out_driver;
@@ -3630,6 +3641,9 @@ static int __init init_sd(void)
return 0;
 
 err_out_driver:
+   mempool_destroy(sd_page_pool);
+
+err_out_ppool:
mempool_destroy(sd_cdb_pool);
 
 err_out_cache:
@@ -3656,6 +3670,7 @@ static void __exit exit_sd(void)
 
scsi_unregister_driver(_template.gendrv);
mempool_destroy(sd_cdb_pool);
+   mempool_destroy(sd_page_pool);
kmem_cache_destroy(sd_cdb_cache);
 
class_unregister(_disk_class);
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c 
b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
index 71888b979ab5..b19c960d5490 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
@@ -641,8 +641,11 @@ static void cxgbit_send_halfclose(struct cxgbit_sock *csk)
 
 static void cxgbit_arp_failure_discard(void *handle, struct sk_buff *skb)
 {
+   struct cxgbit_sock *csk = handle;
+
pr_debug("%s cxgbit_device %p\n", __func__, handle);
kfree_skb(skb);
+   cxgbit_put_csk(csk);
 }
 
 static void cxgbit_abort_arp_failure(void *handle, 

Re: [PATCH 00/14] Add support for FM radio in hcill and kill TI_ST

2018-12-21 Thread Sebastian Reichel
Hi,

On Fri, Dec 21, 2018 at 10:02:05AM -0800, Tony Lindgren wrote:
> * Sebastian Reichel  [181221 01:18]:
> > The new code has been tested on the Motorola Droid 4. For testing the audio
> > should be configured to route Ext to Speaker or Headphone. Then you need to
> > plug headphone, since its cable is used as antenna. For testing there is a
> > 'radio' utility packages in Debian. When you start the utility you need to
> > specify a frequency, since initial get_frequency returns an error:
> 
> Nice, good to see that ti-st kim stuff gone :) I gave this a quick
> try using fmtools.git and fmscan works just fine. No luck yet with
> fm though, it gives VIDIOC_G_CTRL: Not a tty error somehow so
> maybe I'm missing some options, patch below for omap2plus_defconfig.

I only did a few quick tests with 'radio' utility. I could scan
for stations and I could listen to some station. I suppose the
wl128x-radio driver has some buggy code paths, but that are
unrelated to this patchset.

> Hmm so looks like nothing to configure for the clocks or
> CPCAP_BIT_ST_L_TIMESLOT bits for cap for the EXT? So the
> wl12xx audio is wired directly to cpcap EXT then and not a
> TDM slot on the mcbsp huh?

For FM radio it's directly wired to EXT with no DAI being required.
I think that EXT is only used by FM radio and not by bluetooth. BT
seems to use TDM.

> > Merry Christmas!
> 
> Same to you!
> 
> Tony
> 
> 8< 
> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren 
> Date: Fri, 21 Dec 2018 07:57:09 -0800
> Subject: [PATCH] ARM: omap2plus_defconfig: Add RADIO_WL128X as a loadable
>  module
> 
> This allows using the FM radio in the wl12xx chips after modprobe
> fm_drv using radio from xawt, or fmtools.

Mh. I suppose I forgot to add alias to support autoloading the
FM module.

> Note that the firmware placed into /lib/firmware/ti-connectivity
> directory:
> 
> fm_rx_ch8_1283.2.bts
> fmc_ch8_1283.2.bts

There is also a TX firmware. The Droid 4's chip should support this,
but I don't know if there is audio routing for this feature.

-- Sebastian

> Signed-off-by: Tony Lindgren 
> ---
>  arch/arm/configs/omap2plus_defconfig | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm/configs/omap2plus_defconfig 
> b/arch/arm/configs/omap2plus_defconfig
> --- a/arch/arm/configs/omap2plus_defconfig
> +++ b/arch/arm/configs/omap2plus_defconfig
> @@ -126,6 +126,7 @@ CONFIG_AF_RXRPC=m
>  CONFIG_RXKAD=y
>  CONFIG_CFG80211=m
>  CONFIG_MAC80211=m
> +CONFIG_RFKILL=m
>  CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
>  CONFIG_DMA_CMA=y
> @@ -343,12 +344,14 @@ CONFIG_IR_GPIO_TX=m
>  CONFIG_IR_PWM_TX=m
>  CONFIG_MEDIA_SUPPORT=m
>  CONFIG_MEDIA_CAMERA_SUPPORT=y
> +CONFIG_MEDIA_RADIO_SUPPORT=y
>  CONFIG_MEDIA_CEC_SUPPORT=y
>  CONFIG_MEDIA_CONTROLLER=y
>  CONFIG_VIDEO_V4L2_SUBDEV_API=y
>  CONFIG_V4L_PLATFORM_DRIVERS=y
>  CONFIG_VIDEO_OMAP3=m
>  CONFIG_CEC_PLATFORM_DRIVERS=y
> +CONFIG_RADIO_WL128X=m
>  # CONFIG_MEDIA_SUBDRV_AUTOSELECT is not set
>  CONFIG_VIDEO_TVP5150=m
>  CONFIG_DRM=m
> -- 
> 2.19.2


signature.asc
Description: PGP signature


Re: [PATCH 1/3] pinctrl: armada-37xx: Correct mpp definitions

2018-12-21 Thread Marek Behun
On Fri, 21 Dec 2018 18:32:57 +0100
Gregory CLEMENT  wrote:

> + PIN_GRP_GPIO("pcie1", 3, 1, BIT(5), "pcie"),
> + PIN_GRP_GPIO("pcie1_clkreq", 4, 1, BIT(9), "pcie"),

If the pair is split to clkreq and reset, shouldn't the first be called
pcie1_reset?
Marek


Re: [PATCH] soc: fsl: guts: us devm_kstrdup_const() for RO data

2018-12-21 Thread Scott Wood
On Fri, 2018-12-07 at 09:22 +0100, Nicholas Mc Guire wrote:
> devm_kstrdup() may return NULL if internal allocation failed, but
> as  machine  is from the device tree, and thus RO, devm_kstrdup_const()
> can be used here, which will only copy the reference.

Is it really going to only copy the reference?  That would require that
is_kernel_rodata(machine) be true, which it shouldn't be since it's not part
of the kernel image.

-Scott




[PATCH] x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP and EFI_MIXED_MODE

2018-12-21 Thread Sai Praneeth Prakhya
Commit d5052a7130a6 ("x86/efi: Unmap EFI boot services code/data regions
from efi_pgd") forgets to take two EFI modes into consideration namely
EFI_OLD_MEMMAP and EFI_MIXED_MODE.

EFI_OLD_MEMMAP is a legacy way of mapping EFI regions into swapper_pg_dir
using ioremap() and init_memory_mapping(). This feature can be enabled by
passing "efi=old_map" as kernel command line argument. But,
efi_unmap_pages() unmaps EFI boot services code/data regions *only* from
efi_pgd and hence cannot be used for unmapping EFI boot services code/data
regions from swapper_pg_dir.

Introduce a temporary fix to not unmap EFI boot services code/data regions
when EFI_OLD_MEMMAP is enabled while working on a real fix.

EFI_MIXED_MODE is another feature where a 64-bit kernel runs on a
64-bit platform crippled by a 32-bit firmware. To support EFI_MIXED_MODE,
all RAM (i.e. namely EFI regions like EFI_CONVENTIONAL_MEMORY,
EFI_LOADER_, EFI_BOOT_SERVICES_ and
EFI_RUNTIME_CODE/DATA regions) is mapped into efi_pgd all the time to
facilitate EFI runtime calls access it's arguments in 1:1 mode. Hence,
don't unmap EFI boot services code/data regions when booted in mixed mode.

Signed-off-by: Sai Praneeth Prakhya 
Cc: Borislav Petkov 
Cc: Ingo Molnar 
Cc: Andy Lutomirski 
Cc: Dave Hansen 
Cc: Bhupesh Sharma 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ard Biesheuvel 
---
 arch/x86/platform/efi/quirks.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 09e811b9da26..9c34230aaeae 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -380,6 +380,22 @@ static void __init efi_unmap_pages(efi_memory_desc_t *md)
u64 pa = md->phys_addr;
u64 va = md->virt_addr;
 
+   /*
+* To Do: Remove this check after adding functionality to unmap EFI boot
+* services code/data regions from direct mapping area because
+* "efi=old_map" maps EFI regions in swapper_pg_dir.
+*/
+   if (efi_enabled(EFI_OLD_MEMMAP))
+   return;
+
+   /*
+* EFI mixed mode has all RAM mapped to access arguments while making
+* EFI runtime calls, hence don't unmap EFI boot services code/data
+* regions.
+*/
+   if (!efi_is_native() && IS_ENABLED(CONFIG_EFI_MIXED))
+   return;
+
if (kernel_unmap_pages_in_pgd(pgd, pa, md->num_pages))
pr_err("Failed to unmap 1:1 mapping for 0x%llx\n", pa);
 
-- 
2.19.1



Re: [PATCH 1/1] f2fs: fix validation of the block count in sanity_check_raw_super

2018-12-21 Thread Chao Yu
On 2018/12/22 3:28, Martin Blumenstingl wrote:
> Treat "block_count" from struct f2fs_super_block as 64-bit little endian
> value in sanity_check_raw_super() because struct f2fs_super_block
> declares "block_count" as "__le64".
> 
> This fixes a bug where the superblock validation fails on big endian
> devices with the following error:
>   F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
>   F2FS-fs (sda1): Can't find valid F2FS filesystem in 1th superblock
>   F2FS-fs (sda1): Wrong segment_count / block_count (61439 > 0)
>   F2FS-fs (sda1): Can't find valid F2FS filesystem in 2th superblock
> As result of this the partition cannot be mounted.
> 
> With this patch applied the superblock validation works fine and the
> partition can be mounted again:
>   F2FS-fs (sda1): Mounted with checkpoint version = 7c84
> 
> My little endian x86-64 hardware was able to mount the partition without
> this fix.
> To confirm that mounting f2fs filesystems works on big endian machines
> again I tested this on a 32-bit MIPS big endian (lantiq) device.
> 
> Fixes: 0cfe75c5b01199 ("f2fs: enhance sanity_check_raw_super() to avoid 
> potential overflows")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Martin Blumenstingl 

Reviewed-by: Chao Yu 

Thanks,



[PATCH] proc: sysctl: fix double drop_sysctl_table()

2018-12-21 Thread Yafang Shao
The callers of insert_header() will drop sysctl table whatever
insert_header() successes or fails.
So we can't call drop_sysctl_table() in insert_header().

The call sites of insert_header() are all in fs/proc/proc_sysctl.c.

Signed-off-by: Yafang Shao 
---
 fs/proc/proc_sysctl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 4d598a3..5eddca4 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -241,7 +241,6 @@ static int insert_header(struct ctl_dir *dir, struct 
ctl_table_header *header)
if (header->ctl_table == sysctl_mount_point)
clear_empty_dir(dir);
header->parent = NULL;
-   drop_sysctl_table(>header);
return err;
 }
 
-- 
1.8.3.1



Re: [PATCH] arch/powerpc: Use dma_zalloc_coherent

2018-12-21 Thread Scott Wood
On Thu, 2018-11-15 at 23:26 +0530, Sabyasachi Gupta wrote:
> On Mon, Nov 5, 2018 at 7:52 AM Sabyasachi Gupta
>  wrote:
> > 
> > Replaced dma_alloc_coherent + memset with dma_zalloc_coherent
> > 
> > Signed-off-by: Sabyasachi Gupta 
> 
> Any comment on this patch?

Just that FSL patches should be CCed to me, and the subject line is too broad
(and duplicates another patch posted around the same time).

I'm applying with the subject changed to
"arch/powerpc/fsl_rmu: Use dma_zalloc_coherent".

-Scott




Re: [PATH v7 1/2] mailbox: ZynqMP IPI mailbox controller

2018-12-21 Thread Jassi Brar
On Thu, Dec 20, 2018 at 11:32 AM Wendy Liang  wrote:


> diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c 
> b/drivers/mailbox/zynqmp-ipi-mailbox.c
> new file mode 100644
> index 000..bbddfd5
> --- /dev/null
> +++ b/drivers/mailbox/zynqmp-ipi-mailbox.c
> @@ -0,0 +1,764 @@

..
> +
> +/* IPI SMC Macros */
> +#define IPI_SMC_OPEN_IRQ_MASK  0x0001UL /* IRQ enable bit in IPI
> + * open SMC call
> + */
> +#define IPI_SMC_NOTIFY_BLOCK_MASK  0x0001UL /* Flag to indicate if
> + * IPI notification needs
> + * to be blocking.
> + */
> +#define IPI_SMC_ENQUIRY_DIRQ_MASK  0x0001UL /* Flag to indicate if
> + * notification interrupt
> + * to be disabled.
> + */
> +#define IPI_SMC_ACK_EIRQ_MASK  0x0001UL /* Flag to indicate if
> + * notification interrupt
> + * to be enabled.
> + */
> +
The first two are unused.


> +struct zynqmp_ipi_pdata {
> +   struct device *dev;
> +   int irq;
> +   unsigned int method;
>
'method' doesn't track the HVC option in the driver. Please have a look.

..
> +
> +static void zynqmp_ipi_fw_call(struct zynqmp_ipi_mbox *ipi_mbox,
> +  unsigned long a0, unsigned long a3,
> +  unsigned long a4, unsigned long a5,
> +  unsigned long a6, unsigned long a7,
> +  struct arm_smccc_res *res)
>
[a4,a7] are always 0, so maybe just drop them?


> +static bool zynqmp_ipi_last_tx_done(struct mbox_chan *chan)
> +{
> +   struct device *dev = chan->mbox->dev;
> +   struct zynqmp_ipi_mbox *ipi_mbox = dev_get_drvdata(dev);
> +   struct zynqmp_ipi_mchan *mchan = chan->con_priv;
> +   int ret;
> +   u64 arg0;
> +   struct arm_smccc_res res;
> +   struct zynqmp_ipi_message *msg;
> +
> +   if (WARN_ON(!ipi_mbox)) {
> +   dev_err(dev, "no platform drv data??\n");
> +   return false;
> +   }
> +
> +   if (mchan->chan_type == IPI_MB_CHNL_TX) {
> +   /* We only need to check if the message been taken
> +* by the remote in the TX channel
> +*/
> +   arg0 = SMC_IPI_MAILBOX_STATUS_ENQUIRY;
> +   zynqmp_ipi_fw_call(ipi_mbox, arg0, 0, 0, 0, 0, 0, );
> +   /* Check the SMC call status, a0 of the result */
> +   ret = (int)(res.a0 & 0x);
> +   if (ret < 0 || ret & IPI_MB_STATUS_SEND_PENDING)
> +   return false;
> +
OK, but ...

> +   msg = mchan->rx_buf;
> +   msg->len = mchan->resp_buf_size;
> +   memcpy_fromio(msg->data, mchan->resp_buf, msg->len);
> +   mbox_chan_received_data(chan, (void *)msg);
>
 wouldn't this be done from zynqmp_ipi_interrupt()?



> +static int zynqmp_ipi_send_data(struct mbox_chan *chan, void *data)
> +{
  .
> +   /* Enquire if the mailbox is free to send message */
> +   arg0 = SMC_IPI_MAILBOX_STATUS_ENQUIRY;
> +   timeout = 10;
> +   if (msg && msg->len) {
> +   timeout = 10;
> +   do {
> +   zynqmp_ipi_fw_call(ipi_mbox, arg0,
> +  0, 0, 0, 0, 0, );
> +   ret = res.a0 & 0x;
> +   if (ret >= 0 &&
> +   !(ret & IPI_MB_STATUS_SEND_PENDING))
> +   break;
> +   usleep_range(1, 2);
> +   timeout--;
> +   } while (timeout);
> +   if (!timeout) {
> +   dev_warn(dev, "chan %d sending msg 
> timeout.\n",
> +ipi_mbox->remote_id);
> +   return -ETIME;
> +   }
> +   memcpy_toio(mchan->req_buf, msg->data, msg->len);
> +   }
>
This check should be done in last_tx_done, and not here.
send_data is never called unless last_tx_done returns true.

> +   /* Kick IPI mailbox to send message */
> +   arg0 = SMC_IPI_MAILBOX_NOTIFY;
> +   zynqmp_ipi_fw_call(ipi_mbox, arg0, 0, 0, 0, 0, 0, );
> +   } else {
> +   /* Send response message */
> +   

Re: [PATCH v5 1/5] Bluetooth: hci_qca: use wait_until_sent() for power pulses

2018-12-21 Thread Matthias Kaehlcke
On Thu, Dec 20, 2018 at 08:16:35PM +0530, Balakrishna Godavarthi wrote:
> wcn3990 requires a power pulse to turn ON/OFF along with
> regulators. Sometimes we are observing the power pulses are sent
> out with some time delay, due to queuing these commands. This is
> causing synchronization issues with chip, which intern delay the
> chip setup or may end up with communication issues.
> 
> Signed-off-by: Balakrishna Godavarthi 
> ---
>  drivers/bluetooth/hci_qca.c | 38 ++---
>  1 file changed, 14 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index f036c8f98ea3..5a07c2370289 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -1013,11 +1013,9 @@ static inline void host_set_baudrate(struct hci_uart 
> *hu, unsigned int speed)
>   hci_uart_set_baudrate(hu, speed);
>  }
>  
> -static int qca_send_power_pulse(struct hci_dev *hdev, u8 cmd)
> +static int qca_send_power_pulse(struct hci_uart *hu, u8 cmd)
>  {
> - struct hci_uart *hu = hci_get_drvdata(hdev);
> - struct qca_data *qca = hu->priv;
> - struct sk_buff *skb;
> + int ret;
>  
>   /* These power pulses are single byte command which are sent
>* at required baudrate to wcn3990. On wcn3990, we have an external
> @@ -1029,19 +1027,16 @@ static int qca_send_power_pulse(struct hci_dev *hdev, 
> u8 cmd)
>* save power. Disabling hardware flow control is mandatory while
>* sending power pulses to SoC.
>*/
> - bt_dev_dbg(hdev, "sending power pulse %02x to SoC", cmd);
> -
> - skb = bt_skb_alloc(sizeof(cmd), GFP_KERNEL);
> - if (!skb)
> - return -ENOMEM;
> -
> + bt_dev_dbg(hu->hdev, "sending power pulse %02x to SoC", cmd);
>   hci_uart_set_flow_control(hu, true);
> + ret = serdev_device_write_buf(hu->serdev, , sizeof(cmd));
> + if (ret < 0) {
> + bt_dev_err(hu->hdev, "failed to send power pulse %02x to SoC",
> +cmd);
> + return ret;
> + }
>  
> - skb_put_u8(skb, cmd);
> - hci_skb_pkt_type(skb) = HCI_COMMAND_PKT;
> -
> - skb_queue_tail(>txq, skb);
> - hci_uart_tx_wakeup(hu);
> + serdev_device_wait_until_sent(hu->serdev, 0);

serdev_device_wait_until_sent() might only guarantee that the UART
circular buffer is empty (see
https://elixir.bootlin.com/linux/v4.19/source/drivers/tty/tty_ioctl.c#L225),
not that the data has actually sent (e.g. it might be sitting in
the UART FIFO). However with this:

>   /* Wait for 100 uS for SoC to settle down */
>   usleep_range(100, 200);

we should probably be fine, unless there's tons of data in the
FIFO.

You currently call serdev_device_write_flush() in
qca_power_shutdown(), I wonder if it would make sense to call it in
qca_send_power_pulse(), regardless of whether it's an on or off
pulse. In any case we don't care if the chip receives any 'pending'
data when we switch it on or off, right?

Cheers

Matthias


Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2018-12-21 Thread Tim Chen
On 12/21/18 1:59 PM, Ben Greear wrote:
> On 12/21/18 9:44 AM, Tim Chen wrote:
>> Thomas,
>>
>> Andi and I have made an update to our draft of the Spectre admin guide.
>> We may be out on Christmas vacation for a while.  But we want to
>> send it out for everyone to take a look.
> 
> Can you add a section on how to compile out all mitigations that have anything
> beyond negligible performance impact for those running systems where 
> performance
> is more important than security?
> 

If you don't worry about security and performance is paramount, then
boot with "nospectre_v2".  That's explained in the document.

Tim



Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Yang Shi




On 12/21/18 2:42 PM, Tim Chen wrote:

On 12/21/18 1:40 PM, Yang Shi wrote:

Swap readahead would read in a few pages regardless if the underlying
device is busy or not.  It may incur long waiting time if the device is
congested, and it may also exacerbate the congestion.

Use inode_read_congested() to check if the underlying device is busy or
not like what file page readahead does.  Get inode from swap_info_struct.
Although we can add inode information in swap_address_space
(address_space->host), it may lead some unexpected side effect, i.e.
it may break mapping_cap_account_dirty().  Using inode from
swap_info_struct seems simple and good enough.

Just does the check in vma_cluster_readahead() since
swap_vma_readahead() is just used for non-rotational device which
much less likely has congestion than traditional HDD.

Although swap slots may be consecutive on swap partition, it still may be
fragmented on swap file. This check would help to reduce excessive stall
for such case.

Cc: Huang Ying 
Cc: Tim Chen 
Cc: Minchan Kim 
Signed-off-by: Yang Shi 
---
v3: Move inode deference under swap device type check per Tim Chen
v2: Check the swap device type per Tim Chen

  mm/swap_state.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/mm/swap_state.c b/mm/swap_state.c
index fd2f21e..78d500e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -538,11 +538,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, 
gfp_t gfp_mask,
bool do_poll = true, page_allocated;
struct vm_area_struct *vma = vmf->vma;
unsigned long addr = vmf->address;
+   struct inode *inode = NULL;
  
  	mask = swapin_nr_pages(offset) - 1;

if (!mask)
goto skip;
  
+	if (si->flags & (SWP_BLKDEV | SWP_FS)) {

+   inode = si->swap_file->f_mapping->host;
+   if (inode_read_congested(inode))
+   goto skip;
+   }
+
do_poll = false;
/* Read a page_cluster sized and aligned cluster around offset. */
start_offset = offset & ~mask;


Acked-by: Tim Chen 


Thanks. Happy holiday.



Re: [PATCH] touchscreen: elants: fix a missing check of return values

2018-12-21 Thread Dmitry Torokhov
On Fri, Dec 21, 2018 at 03:05:29PM -0600, Kangjie Lu wrote:
> Hi Dmitry,
> 
> Thanks for the feedback.
> 
> On Fri, Dec 21, 2018 at 2:27 AM Dmitry Torokhov 
> wrote:
> 
> > Hi Kangjie,
> >
> > On Fri, Dec 21, 2018 at 12:59:16AM -0600, Kangjie Lu wrote:
> > > elants_i2c_send() may fail, let's check its return values. The fix does
> > > the check and reports an error message upon the failure.
> > >
> > > Signed-off-by: Kangjie Lu 
> > > ---
> > >  drivers/input/touchscreen/elants_i2c.c | 10 --
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/input/touchscreen/elants_i2c.c
> > b/drivers/input/touchscreen/elants_i2c.c
> > > index f2cb23121833..cb3c1470bb68 100644
> > > --- a/drivers/input/touchscreen/elants_i2c.c
> > > +++ b/drivers/input/touchscreen/elants_i2c.c
> > > @@ -245,8 +245,14 @@ static int elants_i2c_calibrate(struct elants_data
> > *ts)
> > >   ts->state = ELAN_WAIT_RECALIBRATION;
> > >   reinit_completion(>cmd_done);
> > >
> > > - elants_i2c_send(client, w_flashkey, sizeof(w_flashkey));
> > > - elants_i2c_send(client, rek, sizeof(rek));
> > > + error = elants_i2c_send(client, w_flashkey, sizeof(w_flashkey));
> > > + error |= elants_i2c_send(client, rek, sizeof(rek));
> >
> > I dislike this kind of error handling as this may result in invalid
> > error code being reported, in case 2 commands produce different results.
> >
> 
> I will fix this.
> 
> 
> > > + if (error) {
> > > + dev_err(>dev,
> > > + "error in sending I2C messages for
> > calibration: %d\n",
> > > + error);
> > > + return error;
> >
> > If we just return like you do it here, interrupts will stay disabled and
> > touchscreen will be completely dead. With the old code we'd report
> > timeout on calibration, and touchscreen would have chance of working. We
> > would also be able to retry calibration.
> >
> 
> How about this: we print out the error message but still continue the
> following execution?

Yes, we could do dev_warn() here, but elants_i2c_send() already logs
errors, so I do not see much benefit from doing this.

> Also, if elants_i2c_send() fails,
> would wait_for_completion_interruptible_timeout() always capture a timeout?

Well, if controller does not get the calibration command(s) it will not
do anything and at worst in  time
wait_for_completion_interruptible_timeout() will return and we will
properly report this condition.

Another option is to rearrange the code to ensure we are not leaving
interrupts disabled on error.

Thanks.

-- 
Dmitry


Re: [PATCH] input: drv2667: fix indentation issues, remove extra tabs

2018-12-21 Thread Dmitry Torokhov
On Fri, Dec 21, 2018 at 11:12:41PM +, Colin King wrote:
> From: Colin Ian King 
> 
> There are some statements that are indented incorrectly, fix this by
> removinf the extra tabs.
> 
> Signed-off-by: Colin Ian King 

Applied, thank you.

> ---
>  drivers/input/misc/drv2667.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/input/misc/drv2667.c b/drivers/input/misc/drv2667.c
> index 2849bb6906a8..6091f0490e8f 100644
> --- a/drivers/input/misc/drv2667.c
> +++ b/drivers/input/misc/drv2667.c
> @@ -177,9 +177,9 @@ static int drv2667_set_waveform_freq(struct drv2667_data 
> *haptics)
>   error = regmap_write(haptics->regmap, DRV2667_PAGE, read_buf);
>   if (error) {
>   dev_err(>client->dev,
> - "Failed to set the page: %d\n", error);
> - return -EIO;
> - }
> + "Failed to set the page: %d\n", error);
> + return -EIO;
> + }
>   }
>  
>   return error;
> -- 
> 2.19.1
> 

-- 
Dmitry


Re: [PATCH] ASoC: Intel: fix unmet dependencies in Kconfig

2018-12-21 Thread Randy Dunlap
On 12/21/18 12:49 PM, Pierre-Louis Bossart wrote:
> Remove warnings by selecting SND_HDA_INTEL_DSP_DETECTION_XYZ only when
> SND_HDA_INTEL is part of the config.
> 
> Fixes: c337104b1a16 ('ALSA: HD-Audio: SKL+: abort probe if DSP is present and 
> Skylake driver selected')
> Reported-by: Randy Dunlap 
> Signed-off-by: Pierre-Louis Bossart 

Acked-by: Randy Dunlap 

Thanks.

> ---
>  sound/soc/intel/Kconfig | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
> index 2fd1b61e8331..555cdb1d87d4 100644
> --- a/sound/soc/intel/Kconfig
> +++ b/sound/soc/intel/Kconfig
> @@ -188,12 +188,12 @@ config SND_SOC_INTEL_SKYLAKE_COMMON
>   select SND_SOC_TOPOLOGY
>   select SND_SOC_INTEL_SST
>   select SND_SOC_HDAC_HDA if SND_SOC_INTEL_SKYLAKE_HDAUDIO_CODEC
> - select SND_HDA_INTEL_DSP_DETECTION_SKL if SND_SOC_INTEL_SKL
> - select SND_HDA_INTEL_DSP_DETECTION_APL if SND_SOC_INTEL_APL
> - select SND_HDA_INTEL_DSP_DETECTION_KBL if SND_SOC_INTEL_KBL
> - select SND_HDA_INTEL_DSP_DETECTION_GLK if SND_SOC_INTEL_GLK
> - select SND_HDA_INTEL_DSP_DETECTION_CNL if SND_SOC_INTEL_CNL
> - select SND_HDA_INTEL_DSP_DETECTION_CFL if SND_SOC_INTEL_CFL
> + select SND_HDA_INTEL_DSP_DETECTION_SKL if SND_SOC_INTEL_SKL && 
> SND_HDA_INTEL
> + select SND_HDA_INTEL_DSP_DETECTION_APL if SND_SOC_INTEL_APL && 
> SND_HDA_INTEL
> + select SND_HDA_INTEL_DSP_DETECTION_KBL if SND_SOC_INTEL_KBL && 
> SND_HDA_INTEL
> + select SND_HDA_INTEL_DSP_DETECTION_GLK if SND_SOC_INTEL_GLK && 
> SND_HDA_INTEL
> + select SND_HDA_INTEL_DSP_DETECTION_CNL if SND_SOC_INTEL_CNL && 
> SND_HDA_INTEL
> + select SND_HDA_INTEL_DSP_DETECTION_CFL if SND_SOC_INTEL_CFL && 
> SND_HDA_INTEL
>   select SND_SOC_ACPI_INTEL_MATCH
>   help
> If you have a Intel Skylake/Broxton/ApolloLake/KabyLake/
> 


-- 
~Randy


Re: [PATCH 09/21] PCI: imx6: Drop imx6_pcie_link_up()

2018-12-21 Thread Andrey Smirnov
On Fri, Dec 21, 2018 at 10:55 AM Trent Piepho  wrote:
>
> On Thu, 2018-12-20 at 23:27 -0800, Andrey Smirnov wrote:
> > Until commit 4d107d3b5a68 ("PCI: imx6: Move link up check into
> > imx6_pcie_wait_for_link()") the driver relied on both LINK_UP and
> > LINK_IN_TRAINING to determine if PCIE link was up.
>
> I've already got a patch in that fixed this issue.  Queued for 4.19 and
> 4.14 stable.
>
> The problem was created by 886bc5ceb5cc in one branch and by
> 4d107d3b5a68 on another branch, that when merged in 562df5c8521e, the
> merge kept some pieces from each commit and dropped other pieces, which
> resulted in this check getting dropped.

OK, cool, this patch can be dropped then.

Thanks,
Andrey Smirnov


Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 16:32:58 -0800
Linus Torvalds  wrote:

> On Fri, Dec 21, 2018, 16:06 Steven Rostedt  
> > On Fri, 21 Dec 2018 18:13:16
> >
> > And I'll make a separate patch that adds:
> >
> > static __always_inline bool
> > str_has_prefix_len(const char *str, const char *prefix, unsigned int *len)
> 
> 
> Why would this ever be a good idea? What's the advantage over returning the
> length?

Style?

I was just thinking that some people (like Joe) think it's in bad taste
to have:

if ((len = str_has_prefix(str, "const"))) {

and it might look better to have:

if (str_has_prefix_len(str, "const", )) {

Honestly, I'm good with either and don't really have a preference.

Let me know which one you prefer, and I'll work to get a patch out.

-- Steve



Re: [PATCH v4 10/18] iommu/mediatek: Add mt8183 IOMMU support

2018-12-21 Thread Nicolas Boichat
On Fri, Dec 21, 2018 at 4:02 PM Yong Wu  wrote:
>
> On Fri, 2018-12-21 at 12:43 +0800, Nicolas Boichat wrote:
> > On Sat, Dec 8, 2018 at 4:42 PM Yong Wu  wrote:
> > >
> > > The M4U IP blocks in mt8183 is MediaTek's generation2 M4U which use
> > > the ARM Short-descriptor like mt8173, and most of the HW registers
> > > are the same.
> > >
> > > Here list main differences between mt8183 and mt8173/mt2712:
> > > 1) mt8183 has only one M4U HW like mt8173 while mt2712 has two.
> > > 2) mt8183 don't have the "bclk" clock, it use the EMI clock instead.
> > > 3) mt8183 can support the dram over 4GB, but it doesn't call this "4GB
> > > mode".
> > > 4) mt8183 pgtable base register(0x0) extend bit[1:0] which represent
> > > the bit[33:32] in the physical address of the pgtable base, But the
> > > standard ttbr0[1] means the S bit which is enabled defaultly, Hence,
> > > we add a mask.
> > > 5) mt8183 HW has a GALS modules, SMI should enable "has_gals" support.
> > > 6) the larb-id in smi-common is remapped. M4U should enable
> > > larbid_remapped support.
> > >
> > > Signed-off-by: Yong Wu 
> > > ---
> > >  drivers/iommu/mtk_iommu.c | 31 ++-
> > >  drivers/iommu/mtk_iommu.h |  1 +
> > >  drivers/memory/mtk-smi.c  | 19 +++
> > >  3 files changed, 42 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > > index 8ab3b69..d91a554 100644
> > > --- a/drivers/iommu/mtk_iommu.c
> > > +++ b/drivers/iommu/mtk_iommu.c
> > > @@ -36,6 +36,7 @@
> > >  #include "mtk_iommu.h"
> > >
> > >  #define REG_MMU_PT_BASE_ADDR   0x000
> > > +#define MMU_PT_ADDR_MASK   GENMASK(31, 7)
> > >
> > >  #define REG_MMU_INVALIDATE 0x020
> > >  #define F_ALL_INVLD0x2
> > > @@ -54,7 +55,7 @@
> > >  #define REG_MMU_CTRL_REG   0x110
> > >  #define F_MMU_PREFETCH_RT_REPLACE_MOD  BIT(4)
> > >  #define F_MMU_TF_PROTECT_SEL_SHIFT(data) \
> > > -   ((data)->plat_data->m4u_plat == M4U_MT2712 ? 4 : 5)
> > > +   ((data)->plat_data->m4u_plat == M4U_MT8173 ? 5 : 4)
> >
> > Should the shift value be a member of plat_data instead?
>
> It's also ok.
> This TF_PROTECT_SEL MACRO looks a bit complex. I can use a new patch to
> refactor it.

SGTM.

> >
> > >  /* It's named by F_MMU_TF_PROT_SEL in mt2712. */
> > >  #define F_MMU_TF_PROTECT_SEL(prot, data) \
> > > (((prot) & 0x3) << F_MMU_TF_PROTECT_SEL_SHIFT(data))
> > > @@ -347,7 +348,7 @@ static int mtk_iommu_attach_device(struct 
> > > iommu_domain *domain,
> > > /* Update the pgtable base address register of the M4U HW */
> > > if (!data->m4u_dom) {
> > > data->m4u_dom = dom;
> > > -   writel(dom->cfg.arm_v7s_cfg.ttbr[0],
> > > +   writel(dom->cfg.arm_v7s_cfg.ttbr[0] & MMU_PT_ADDR_MASK,
> > >data->base + REG_MMU_PT_BASE_ADDR);
> > > }
> > >
> > > @@ -510,6 +511,7 @@ static int mtk_iommu_of_xlate(struct device *dev, 
> > > struct of_phandle_args *args)
> > >
> > >  static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> > >  {
> > > +   enum mtk_iommu_plat m4u_plat = data->plat_data->m4u_plat;
> > > u32 regval;
> > > int ret;
> > >
> > > @@ -520,7 +522,7 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > > }
> > >
> > > regval = F_MMU_TF_PROTECT_SEL(2, data);
> > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > +   if (m4u_plat == M4U_MT8173)
> > > regval |= F_MMU_PREFETCH_RT_REPLACE_MOD;
> > > writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> > >
> > > @@ -541,14 +543,14 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > > F_INT_PRETETCH_TRANSATION_FIFO_FAULT;
> > > writel_relaxed(regval, data->base + REG_MMU_INT_MAIN_CONTROL);
> > >
> > > -   if (data->plat_data->m4u_plat == M4U_MT8173)
> > > +   if (m4u_plat == M4U_MT8173)
> > > regval = (data->protect_base >> 1) | (data->enable_4GB << 
> > > 31);
> > > else
> > > regval = lower_32_bits(data->protect_base) |
> > >  upper_32_bits(data->protect_base);
> > > writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
> > >
> > > -   if (data->enable_4GB && data->plat_data->m4u_plat != M4U_MT8173) {
> > > +   if (data->enable_4GB && m4u_plat == M4U_MT2712) {
> > > /*
> > >  * If 4GB mode is enabled, the validate PA range is from
> > >  * 0x1__ to 0x1__. here record bit[32:30].
> > > @@ -558,8 +560,11 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > > }
> > > writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
> > >
> > > -   /* It's MISC control register whose default value is ok except 
> > > 

Re: [PATCH v5 2/5] Bluetooth: hci_qca: Deassert RTS while baudrate change command

2018-12-21 Thread Matthias Kaehlcke
On Thu, Dec 20, 2018 at 08:16:36PM +0530, Balakrishna Godavarthi wrote:
> This patch will help to stop frame reassembly errors while changing
> the baudrate. This is because host send a change baudrate request
> command to the chip with 115200 bps, Whereas chip will change their
> UART clocks to the enable for new baudrate and sends the response
> for the change request command with newer baudrate, On host side
> we are still operating in 115200 bps which results of reading garbage
> data. Here we are pulling RTS line, so that chip we will wait to send data
> to host until host change its baudrate.
> 
> Signed-off-by: Balakrishna Godavarthi 
> Tested-by: Matthias Kaehlcke 
> Reviewed-by: Matthias Kaehlcke 
> ---
>  drivers/bluetooth/hci_qca.c | 24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
> index 5a07c2370289..1680ead6cc3d 100644
> --- a/drivers/bluetooth/hci_qca.c
> +++ b/drivers/bluetooth/hci_qca.c
> @@ -963,7 +963,6 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t 
> baudrate)
>   struct hci_uart *hu = hci_get_drvdata(hdev);
>   struct qca_data *qca = hu->priv;
>   struct sk_buff *skb;
> - struct qca_serdev *qcadev;
>   u8 cmd[] = { 0x01, 0x48, 0xFC, 0x01, 0x00 };
>  
>   if (baudrate > QCA_BAUDRATE_320)
> @@ -977,13 +976,6 @@ static int qca_set_baudrate(struct hci_dev *hdev, 
> uint8_t baudrate)
>   return -ENOMEM;
>   }
>  
> - /* Disabling hardware flow control is mandatory while
> -  * sending change baudrate request to wcn3990 SoC.
> -  */
> - qcadev = serdev_device_get_drvdata(hu->serdev);
> - if (qcadev->btsoc_type == QCA_WCN3990)
> - hci_uart_set_flow_control(hu, true);
> -
>   /* Assign commands to change baudrate and packet type. */
>   skb_put_data(skb, cmd, sizeof(cmd));
>   hci_skb_pkt_type(skb) = HCI_COMMAND_PKT;
> @@ -999,9 +991,6 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t 
> baudrate)
>   schedule_timeout(msecs_to_jiffies(BAUDRATE_SETTLE_TIMEOUT_MS));
>   set_current_state(TASK_RUNNING);
>  
> - if (qcadev->btsoc_type == QCA_WCN3990)
> - hci_uart_set_flow_control(hu, false);
> -
>   return 0;
>  }
>  
> @@ -1086,6 +1075,7 @@ static int qca_check_speeds(struct hci_uart *hu)
>  static int qca_set_speed(struct hci_uart *hu, enum qca_speed_type speed_type)
>  {
>   unsigned int speed, qca_baudrate;
> + struct qca_serdev *qcadev;
>   int ret;
>  
>   if (speed_type == QCA_INIT_SPEED) {
> @@ -1097,6 +1087,15 @@ static int qca_set_speed(struct hci_uart *hu, enum 
> qca_speed_type speed_type)
>   if (!speed)
>   return 0;
>  
> + /* Deassert RTS while changing the baudrate of chip and host.
> +  * This will prevent chip from transmitting its response with
> +  * the new baudrate while the host port is still operating at
> +  * the old speed.
> +  */
> + qcadev = serdev_device_get_drvdata(hu->serdev);
> + if (qcadev->btsoc_type == QCA_WCN3990)
> + serdev_device_set_rts(hu->serdev, false);
> +
>   qca_baudrate = qca_get_baudrate_value(speed);
>   bt_dev_dbg(hu->hdev, "Set UART speed to %d", speed);
>   ret = qca_set_baudrate(hu->hdev, qca_baudrate);
> @@ -1104,6 +1103,9 @@ static int qca_set_speed(struct hci_uart *hu, enum 
> qca_speed_type speed_type)
>   return ret;
>  
>   host_set_baudrate(hu, speed);
> +
> + if (qcadev->btsoc_type == QCA_WCN3990)
> + serdev_device_set_rts(hu->serdev, true);
>   }
>  
>   return 0;

I looked for ways to do without this change, but didn't find a good
solution. There are several possible problems with baudrate changes:

1) send request to BT controller to change the baudrate

  this is an asynchronous operation, the actual baudrate change can
  be delayed for multiple reasons, e.g.:

  - request sits in the BT driver's TX queue

this could be worked around by checking skb_queue_empty()

  - request sits in the UART buffer

a workaround for this could be calling
serdev_device_wait_until_sent() (only available with serdev though)

  - the request sits in the UART FIFO

will be sent out 'immediately'. no neat solution available AFAIK,
a short sleep could be an effective workaround

  - the controller may have a short delay to apply the change

Also no neat solution here. A/the same short sleep could work
around this

2) change baudrate of the host UART
  - this must not happen before the baudrate change request has been
sent to the BT controller, otherwise things are messed up
seriously

Ideally set_termios would make sure all pending data is sent
before the change is applied, some UART drivers do this, others
   

[PATCH v4 02/10] selftests/resctrl: Add basic resctrl file system operations and data

2018-12-21 Thread Fenghua Yu
From: Sai Praneeth Prakhya 

The basic resctrl file system operations and data are added for future
usage by resctrl selftest tool.

Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile|  10 +
 tools/testing/selftests/resctrl/resctrl.h   |  47 ++
 tools/testing/selftests/resctrl/resctrlfs.c | 454 
 3 files changed, 511 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/Makefile
 create mode 100644 tools/testing/selftests/resctrl/resctrl.h
 create mode 100644 tools/testing/selftests/resctrl/resctrlfs.c

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
new file mode 100644
index ..bd5c5418961e
--- /dev/null
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -0,0 +1,10 @@
+CC = gcc
+CFLAGS = -g -Wall
+
+*.o: *.c
+   $(CC) $(CFLAGS) -c *.c
+
+.PHONY: clean
+
+clean:
+   $(RM) *.o *~
diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
new file mode 100644
index ..01822f6258af
--- /dev/null
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#define _GNU_SOURCE
+#ifndef RESCTRL_H
+#define RESCTRL_H
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define RESCTRL_PATH   "/sys/fs/resctrl"
+#define PHYS_ID_PATH   "/sys/devices/system/cpu/cpu"
+
+#define PARENT_EXIT(err_msg)   \
+   do {\
+   perror(err_msg);\
+   kill(ppid, SIGKILL);\
+   exit(EXIT_FAILURE); \
+   } while (0)
+
+pid_t bm_pid, ppid;
+
+int remount_resctrlfs(bool mum_resctrlfs);
+char get_sock_num(int cpu_no);
+int validate_bw_report_request(char *bw_report);
+int validate_resctrl_feature_request(char *resctrl_val);
+int taskset_benchmark(pid_t bm_pid, int cpu_no);
+void run_benchmark(int signum, siginfo_t *info, void *ucontext);
+int write_schemata(char *ctrlgrp, char *schemata, int cpu_no,
+  char *resctrl_val);
+int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp,
+   char *resctrl_val);
+int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu,
+   int group_fd, unsigned long flags);
+int run_fill_buf(int span, int malloc_and_init_memory, int memflush, int op);
+
+#endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
b/tools/testing/selftests/resctrl/resctrlfs.c
new file mode 100644
index ..9d9962947c9c
--- /dev/null
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -0,0 +1,454 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Basic resctrl file system operations
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ *Arshiya Hayatkhan Pathan 
+ *Sai Praneeth Prakhya ,
+ *Fenghua Yu 
+ */
+#include "resctrl.h"
+
+#define RESCTRL_MBM"L3 monitoring detected"
+#define RESCTRL_MBA"MB allocation detected"
+#define MAX_RESCTRL_FEATURES   2
+
+/*
+ * remount_resctrlfs - Remount resctrl FS at /sys/fs/resctrl
+ * @mum_resctrlfs: Should the resctrl FS be remounted?
+ *
+ * If not mounted, mount it.
+ * If mounted and mum_resctrlfs then remount resctrl FS.
+ * If mounted and !mum_resctrlfs then noop
+ *
+ * Return: 0 on success, non-zero on failure
+ */
+int remount_resctrlfs(bool mum_resctrlfs)
+{
+   DIR *dp;
+   struct dirent *ep;
+   unsigned int count = 0;
+
+   /*
+* If kernel is built with CONFIG_RESCTRL, then /sys/fs/resctrl should
+* be present by default
+*/
+   dp = opendir(RESCTRL_PATH);
+   if (dp) {
+   while ((ep = readdir(dp)) != NULL)
+   count++;
+   closedir(dp);
+   } else {
+   perror("Unable to read /sys/fs/resctrl");
+
+   return -1;
+   }
+
+   /*
+* If resctrl FS has more than two entries, it means that resctrl FS has
+* already been mounted. The two default entries are "." and "..", these
+* are present even when resctrl FS is not mounted
+*/
+   if (count > 2) {
+   if (mum_resctrlfs) {
+   if (umount(RESCTRL_PATH) != 0) {
+   perror("Unable to umount resctrl");
+
+   return errno;
+   }
+   printf("Remount: done!\n");
+   } else {
+   printf("Mounted already. Not remounting!\n");
+
+   return 0;
+   }
+   }
+
+   if (mount("resctrl", RESCTRL_PATH, "resctrl", 0, NULL) != 0) {
+   perror("Unable to mount 

[PATCH v4 09/10] selftests/resctrl: Add Cache Allocation Technology (CAT) selftest

2018-12-21 Thread Fenghua Yu
From: Arshiya Hayatkhan Pathan 

Cache Allocation Technology (CAT) selftest allocates a portion of
last level cache and starts a benchmark to read each cache
line in this portion of cache. Measure the cache misses in perf and
the misses should be equal to the number of cache lines in this
portion of cache.

We don't use CQM to calculate cache usage because some CAT enabled
platforms don't have CQM.

Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/cache.c   | 175 -
 tools/testing/selftests/resctrl/cat_test.c| 243 ++
 tools/testing/selftests/resctrl/fill_buf.c|  10 +-
 tools/testing/selftests/resctrl/resctrl.h |   3 +
 .../testing/selftests/resctrl/resctrl_tests.c |  15 +-
 tools/testing/selftests/resctrl/resctrlfs.c   |   8 +-
 7 files changed, 448 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/cat_test.c

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index 664561cd76e6..028b4c22 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -8,7 +8,7 @@ all: resctrl_tests
 
 resctrl_tests: *.o
$(CC) $(CFLAGS) -o resctrl_tests resctrl_tests.o resctrlfs.o \
-resctrl_val.o fill_buf.o mbm_test.o mba_test.o cache.o 
cqm_test.o
+resctrl_val.o fill_buf.o mbm_test.o mba_test.o cache.o 
cqm_test.o cat_test.o
 
 .PHONY: clean
 
diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
index 1256590ef804..c98b7bc6ad9c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -10,10 +10,107 @@ struct read_format {
} values[2];
 };
 
+static struct perf_event_attr pea_llc_miss;
+static struct read_format rf_cqm;
+static int fd_lm;
 char cbm_mask[256];
 unsigned long long_mask;
 char llc_occup_path[1024];
 
+static void initialize_perf_event_attr(void)
+{
+   pea_llc_miss.type = PERF_TYPE_HARDWARE;
+   pea_llc_miss.size = sizeof(struct perf_event_attr);
+   pea_llc_miss.read_format = PERF_FORMAT_GROUP;
+   pea_llc_miss.exclude_kernel = 1;
+   pea_llc_miss.exclude_hv = 1;
+   pea_llc_miss.exclude_idle = 1;
+   pea_llc_miss.exclude_callchain_kernel = 1;
+   pea_llc_miss.inherit = 1;
+   pea_llc_miss.exclude_guest = 1;
+   pea_llc_miss.disabled = 1;
+}
+
+static void ioctl_perf_event_ioc_reset_enable(void)
+{
+   ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
+   ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
+}
+
+static int perf_event_open_llc_miss(pid_t pid, int cpu_no)
+{
+   fd_lm = perf_event_open(_llc_miss, pid, cpu_no, -1,
+   PERF_FLAG_FD_CLOEXEC);
+   if (fd_lm == -1) {
+   perror("Error opening leader");
+   ctrlc_handler(0, NULL, NULL);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int initialize_llc_perf(void)
+{
+   memset(_llc_miss, 0, sizeof(struct perf_event_attr));
+   memset(_cqm, 0, sizeof(struct read_format));
+
+   /* Initialize perf_event_attr structures for HW_CACHE_MISSES */
+   initialize_perf_event_attr();
+
+   pea_llc_miss.config = PERF_COUNT_HW_CACHE_MISSES;
+
+   rf_cqm.nr = 1;
+
+   return 0;
+}
+
+static int reset_enable_llc_perf(pid_t pid, int cpu_no)
+{
+   int ret = 0;
+
+   ret = perf_event_open_llc_miss(pid, cpu_no);
+   if (ret < 0)
+   return ret;
+
+   /* Start counters to log values */
+   ioctl_perf_event_ioc_reset_enable();
+
+   return 0;
+}
+
+/*
+ * get_llc_perf:   llc cache miss through perf events
+ * @cpu_no:CPU number that the benchmark PID is binded to
+ *
+ * Perf events like HW_CACHE_MISSES could be used to validate number of
+ * cache lines allocated.
+ *
+ * Return: =0 on success.  <0 on failure.
+ */
+static int get_llc_perf(unsigned long *llc_perf_miss)
+{
+   __u64 total_misses;
+
+   /* Stop counters after one span to get miss rate */
+
+   ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
+
+   if (read(fd_lm, _cqm, sizeof(struct read_format)) == -1) {
+   perror("Could not get llc misses through perf");
+
+   return -1;
+   }
+
+   total_misses = rf_cqm.values[0].value;
+
+   close(fd_lm);
+
+   *llc_perf_miss = total_misses;
+
+   return 0;
+}
+
 /*
  * Get LLC Occupancy as reported by RESCTRL FS
  * For CQM,
@@ -82,9 +179,19 @@ static int print_results_cache(char *filename, int bm_pid,
 
 int measure_cache_vals(struct resctrl_val_param *param, int bm_pid)
 {
-   unsigned long llc_occu_resc = 0, llc_value = 0;
+   unsigned long llc_perf_miss = 0, llc_occu_resc = 0, llc_value = 0;
int ret;
 
+   /*
+* Measure cache miss from perf.
+*/

[PATCH v4 07/10] selftests/resctrl: Add MBA test

2018-12-21 Thread Fenghua Yu
From: Arshiya Hayatkhan Pathan 

MBA (Memory Bandwidth Allocation) test starts a stressful memory
bandwidth benchmark and allocates memory bandwidth from 100% down
to 10% for the benchmark process. For each allocation, compare
perf IMC counter and mbm total bytes from resctrl. The difference
between the two values should be within a threshold to pass the test.

Default benchmark is built-in fill_buf. But users can specify their
own benchmark by option "-b".

We can add memory bandwidth allocation for multiple processes in the
future.

Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Sai Praneeth Prakhya ,
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/mba_test.c| 173 ++
 tools/testing/selftests/resctrl/resctrl.h |   2 +
 .../testing/selftests/resctrl/resctrl_tests.c |  15 +-
 4 files changed, 190 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/mba_test.c

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index 51165f9f8a9d..bf9f55e71d0c 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -8,7 +8,7 @@ all: resctrl_tests
 
 resctrl_tests: *.o
$(CC) $(CFLAGS) -o resctrl_tests resctrl_tests.o resctrlfs.o \
-membw.o fill_buf.o mbm_test.o
+membw.o fill_buf.o mbm_test.o mba_test.o
 
 .PHONY: clean
 
diff --git a/tools/testing/selftests/resctrl/mba_test.c 
b/tools/testing/selftests/resctrl/mba_test.c
new file mode 100644
index ..7a5d869e5382
--- /dev/null
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -0,0 +1,173 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Memory Bandwidth Allocation (MBA) test
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ *Arshiya Hayatkhan Pathan 
+ *Sai Praneeth Prakhya ,
+ *Fenghua Yu 
+ */
+#include "resctrl.h"
+
+#define RESULT_FILE_NAME   "result_mba"
+#define NUM_OF_RUNS5
+#define MAX_DIFF   300
+#define ALLOCATION_MAX 100
+#define ALLOCATION_MIN 10
+#define ALLOCATION_STEP10
+
+/*
+ * Change schemata percentage from 100 to 10%. Write schemata to specified
+ * con_mon grp, mon_grp in resctrl FS.
+ * For each allocation, run 5 times in order to get average values.
+ */
+static int mba_setup(int num, ...)
+{
+   static int runs_per_allocation, allocation = 100;
+   struct resctrl_val_param *p;
+   char allocation_str[64];
+   va_list param;
+
+   va_start(param, num);
+   p = va_arg(param, struct resctrl_val_param *);
+   va_end(param);
+
+   if (runs_per_allocation >= NUM_OF_RUNS)
+   runs_per_allocation = 0;
+
+   /* Only set up schemata once every NUM_OF_RUNS of allocations */
+   if (runs_per_allocation++ != 0)
+   return 0;
+
+   if (allocation < ALLOCATION_MIN || allocation > ALLOCATION_MAX)
+   return -1;
+
+   sprintf(allocation_str, "%d", allocation);
+
+   write_schemata(p->ctrlgrp, allocation_str, p->cpu_no, p->resctrl_val);
+   printf("changed schemata to : %d\n", allocation);
+   allocation -= ALLOCATION_STEP;
+
+   return 0;
+}
+
+static void show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
+{
+   int allocation, failed = 0, runs;
+
+   /* Memory bandwidth from 100% down to 10% */
+   for (allocation = 0; allocation < ALLOCATION_MAX / ALLOCATION_STEP;
+allocation++) {
+   unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
+   unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+   long  avg_diff = 0;
+
+   /*
+* The first run is discarded due to inaccurate value from
+* phase transition.
+*/
+   for (runs = NUM_OF_RUNS * allocation + 1;
+runs < NUM_OF_RUNS * allocation + NUM_OF_RUNS ; runs++) {
+   sum_bw_imc += bw_imc[runs];
+   sum_bw_resc += bw_resc[runs];
+   }
+
+   avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
+   avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
+   avg_diff = avg_bw_resc - avg_bw_imc;
+
+   printf("\nschemata percentage: %d \t",
+  ALLOCATION_MAX - ALLOCATION_STEP * allocation);
+   printf("avg_bw_imc: %lu\t", avg_bw_imc);
+   printf("avg_bw_resc: %lu\t", avg_bw_resc);
+   printf("avg_diff: %lu\t", labs(avg_diff));
+   if (labs(avg_diff) > MAX_DIFF) {
+   printf("failed\n");
+   failed = 1;
+   } else {
+   printf("passed\n");
+   }
+   }
+
+   if (failed) {
+   printf("\nTest for schemata change using MBA failed");
+   printf("as atleast one 

Re: Fix "perf tools: Synthesize GROUP_DESC feature in pipe mode" in the LT 4.14 branch

2018-12-21 Thread Sasha Levin

On Fri, Dec 21, 2018 at 11:57:53AM +0100, Jinpu Wang wrote:

+cc Greg, stable

Greensky, James J  于2018年12月21日周五 上午11:48写道:


Commit d38d272592737ea88a20 ("perf tools: Synthesize GROUP_DESC feature in pipe 
mode") broke the LT 4.14 branch when using event groups in pipe-mode.

  # perf record -e '{cycles,instructions,branches}' -- sleep 4 | perf report
  # To display the perf.data header info, please use --header/--header-only 
options
  #
  Oxd7c [0x60]: failed to process type: 80
  Error:
  Failed to process sample

Commit a2015516c5c0be932a69 ("perf record: Synthesize features before events in pipe 
mode") is the fix. Can we get this cherry-picked and applied?


I've queued d38d272592737ea88a20 for 4.14, thank you.

--
Thanks,
Sasha


[PATCH v4 01/10] selftests/resctrl: Add README for resctrl tests

2018-12-21 Thread Fenghua Yu
resctrl tests will be implemented. README is added for the tool first.

Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/README | 53 ++
 1 file changed, 53 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/README

diff --git a/tools/testing/selftests/resctrl/README 
b/tools/testing/selftests/resctrl/README
new file mode 100644
index ..f9621d6146df
--- /dev/null
+++ b/tools/testing/selftests/resctrl/README
@@ -0,0 +1,53 @@
+resctrl_tests - resctrl file system test suit
+
+Authors:
+   Fenghua Yu 
+   Sai Praneeth Prakhya ,
+   Arshiya Hayatkhan Pathan 
+
+resctrl_tests tests various resctrl functionalities and interfaces including
+both software and hardware.
+
+Currently it supports Memory Bandwidth Monitoring test and Memory Bandwidth
+Allocation test on Intel RDT hardware. More tests will be added in the future.
+And the test suit can be extended to cover AMD QoS and ARM MPAM hardware
+as well.
+
+BUILD
+-
+
+Run "make" to build executable file "resctrl_tests".
+
+RUN
+---
+
+To use resctrl_tests, root or sudoer privileges are required. This is because
+the test needs to mount resctrl file system and change contents in the file
+system.
+
+Executing the test without any parameter will run all supported tests:
+
+   sudo ./resctrl_tests
+
+OVERVIEW OF EXECUTION
+-
+
+A test case has four stages:
+
+  - setup: mount resctrl file system, create group, setup schemata, move test
+process pids to tasks, start benchmark.
+  - execute: let benchmark run
+  - verify: get resctrl data and verify the data with another source, e.g.
+perf event.
+  - teardown: umount resctrl and clear temporary files.
+
+ARGUMENTS
+-
+
+Parameter '-h' shows usage information.
+
+usage: resctrl_tests [-h] [-b "benchmark_cmd [options]"] [-t test list] [-n 
no_of_bits]
+-b benchmark_cmd [options]: run specified benchmark for MBM, MBA and 
CQM default benchmark is builtin fill_buf
+-t test list: run tests specified in the test list, e.g. -t mbm, mba, 
cqm, cat
+-n no_of_bits: run cache tests using specified no of bits in cache bit 
mask
+-h: help
-- 
2.19.1



[PATCH v4 03/10] selftests/resctrl: Read memory bandwidth from perf IMC counter and from resctrl file system

2018-12-21 Thread Fenghua Yu
From: Sai Praneeth Prakhya 

Total memory bandwidth can be monitored from perf IMC counter and from
resctrl file system. Later the two will be compared to verify the total
memory bandwidth read from resctrl is correct.

Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl_val.c | 431 ++
 1 file changed, 431 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/resctrl_val.c

diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
new file mode 100644
index ..f381aafc0523
--- /dev/null
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -0,0 +1,431 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Memory bandwidth monitoring and allocation library
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ *Arshiya Hayatkhan Pathan 
+ *Sai Praneeth Prakhya ,
+ *Fenghua Yu 
+ */
+#include "resctrl.h"
+
+#define UNCORE_IMC "uncore_imc"
+#define READ_FILE_NAME "events/cas_count_read"
+#define WRITE_FILE_NAME"events/cas_count_write"
+#define DYN_PMU_PATH   "/sys/bus/event_source/devices"
+#define SCALE  0.6103515625
+#define MAX_IMCS   20
+#define MAX_TOKENS 5
+#define READ   0
+#define WRITE  1
+#define CON_MON_MBM_LOCAL_BYTES_PATH   \
+   "%s/%s/mon_groups/%s/mon_data/mon_L3_0%c/mbm_local_bytes"
+
+#define CON_MBM_LOCAL_BYTES_PATH   \
+   "%s/%s/mon_data/mon_L3_0%c/mbm_local_bytes"
+
+#define MON_MBM_LOCAL_BYTES_PATH   \
+   "%s/mon_groups/%s/mon_data/mon_L3_0%c/mbm_local_bytes"
+
+#define MBM_LOCAL_BYTES_PATH   \
+   "%s/mon_data/mon_L3_0%c/mbm_local_bytes"
+
+struct membw_read_format {
+   __u64 value; /* The value of the event */
+   __u64 time_enabled;  /* if PERF_FORMAT_TOTAL_TIME_ENABLED */
+   __u64 time_running;  /* if PERF_FORMAT_TOTAL_TIME_RUNNING */
+   __u64 id;/* if PERF_FORMAT_ID */
+};
+
+struct imc_counter_config {
+   __u32 type;
+   __u64 event;
+   __u64 umask;
+   struct perf_event_attr pe;
+   struct membw_read_format return_value;
+   int fd;
+};
+
+static struct imc_counter_config imc_counters_config[MAX_IMCS][2];
+static char mbm_total_path[1024];
+static int imcs;
+
+void membw_initialize_perf_event_attr(int i, int j)
+{
+   memset(_counters_config[i][j].pe, 0,
+  sizeof(struct perf_event_attr));
+   imc_counters_config[i][j].pe.type = imc_counters_config[i][j].type;
+   imc_counters_config[i][j].pe.size = sizeof(struct perf_event_attr);
+   imc_counters_config[i][j].pe.disabled = 1;
+   imc_counters_config[i][j].pe.inherit = 1;
+   imc_counters_config[i][j].pe.exclude_guest = 1;
+   imc_counters_config[i][j].pe.config =
+   imc_counters_config[i][j].umask << 8 |
+   imc_counters_config[i][j].event;
+   imc_counters_config[i][j].pe.sample_type = PERF_SAMPLE_IDENTIFIER;
+   imc_counters_config[i][j].pe.read_format =
+   PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING;
+}
+
+static int open_perf_event(int i, int cpu_no, int j)
+{
+   imc_counters_config[i][j].fd =
+   perf_event_open(_counters_config[i][j].pe, -1, cpu_no, -1,
+   PERF_FLAG_FD_CLOEXEC);
+
+   if (imc_counters_config[i][j].fd == -1) {
+   fprintf(stderr, "Error opening leader %llx\n",
+   imc_counters_config[i][j].pe.config);
+
+   return -1;
+   }
+
+   return 0;
+}
+
+void membw_ioctl_perf_event_ioc_reset_enable(int i, int j)
+{
+   ioctl(imc_counters_config[i][j].fd, PERF_EVENT_IOC_RESET, 0);
+   ioctl(imc_counters_config[i][j].fd, PERF_EVENT_IOC_ENABLE, 0);
+}
+
+void membw_ioctl_perf_event_ioc_disable(int i, int j)
+{
+   ioctl(imc_counters_config[i][j].fd, PERF_EVENT_IOC_DISABLE, 0);
+}
+
+/*
+ * get_event_and_umask:Parse config into event and umask
+ * @cas_count_cfg: Config
+ * @count: iMC number
+ * @op:Operation (read/write)
+ */
+void get_event_and_umask(char *cas_count_cfg, int count, bool op)
+{
+   char *token[MAX_TOKENS];
+   int i = 0;
+
+   strcat(cas_count_cfg, ",");
+   token[0] = strtok(cas_count_cfg, "=,");
+
+   for (i = 1; i < MAX_TOKENS; i++)
+   token[i] = strtok(NULL, "=,");
+
+   for (i = 0; i < MAX_TOKENS; i++) {
+   if (!token[i])
+   break;
+   if (strcmp(token[i], "event") == 0) {
+   if (op == READ)
+   imc_counters_config[count][READ].event =
+   strtol(token[i + 1], NULL, 16);
+   else
+ 

[PATCH v4 04/10] selftests/resctrl: Add callback to start a benchmark

2018-12-21 Thread Fenghua Yu
From: Sai Praneeth Prakhya 

The callback starts a child process and puts the child pid in created
resctrl group with specified memory bandwidth in schemata. The child
starts running benchmark.

Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/resctrl.h |  27 ++
 tools/testing/selftests/resctrl/resctrl_val.c | 272 ++
 2 files changed, 299 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl.h 
b/tools/testing/selftests/resctrl/resctrl.h
index 01822f6258af..9820af253d4d 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -3,6 +3,7 @@
 #ifndef RESCTRL_H
 #define RESCTRL_H
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -28,9 +29,34 @@
exit(EXIT_FAILURE); \
} while (0)
 
+/*
+ * resctrl_val_param:  resctrl test parameters
+ * @resctrl_val:   Resctrl feature (Eg: mbm, mba.. etc)
+ * @ctrlgrp:   Name of the control monitor group (con_mon grp)
+ * @mongrp:Name of the monitor group (mon grp)
+ * @cpu_no:CPU number to which the benchmark would be binded
+ * @span:  Memory bytes accessed in each benchmark iteration
+ * @mum_resctrlfs: Should the resctrl FS be remounted?
+ * @filename:  Name of file to which the o/p should be written
+ * @bw_report: Bandwidth report type (reads vs writes)
+ * @setup: Call back function to setup test environment
+ */
+struct resctrl_val_param {
+   char*resctrl_val;
+   charctrlgrp[64];
+   charmongrp[64];
+   int cpu_no;
+   int span;
+   int mum_resctrlfs;
+   charfilename[64];
+   char*bw_report;
+   int (*setup)(int num, ...);
+};
+
 pid_t bm_pid, ppid;
 
 int remount_resctrlfs(bool mum_resctrlfs);
+int umount_resctrlfs(void);
 char get_sock_num(int cpu_no);
 int validate_bw_report_request(char *bw_report);
 int validate_resctrl_feature_request(char *resctrl_val);
@@ -43,5 +69,6 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char 
*mongrp,
 int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu,
int group_fd, unsigned long flags);
 int run_fill_buf(int span, int malloc_and_init_memory, int memflush, int op);
+int membw_val(char **benchmark_cmd, struct resctrl_val_param *param);
 
 #endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c 
b/tools/testing/selftests/resctrl/resctrl_val.c
index f381aafc0523..9a79bf669254 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -11,6 +11,7 @@
  */
 #include "resctrl.h"
 
+#define MB (1024 * 1024)
 #define UNCORE_IMC "uncore_imc"
 #define READ_FILE_NAME "events/cas_count_read"
 #define WRITE_FILE_NAME"events/cas_count_write"
@@ -429,3 +430,274 @@ static unsigned long get_mem_bw_resctrl(void)
 
return mbm_total;
 }
+
+pid_t bm_pid, ppid;
+
+static void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
+{
+   kill(bm_pid, SIGKILL);
+   printf("Ending\n\n");
+
+   exit(EXIT_SUCCESS);
+}
+
+/*
+ * print_results_bw:   the memory bandwidth results are stored in a file
+ * @filename:  file that stores the results
+ * @bm_pid:child pid that runs benchmark
+ * @bw_imc:perf imc counter value
+ * @bw_resc:   memory bandwidth value
+ *
+ * Return: 0 on success. non-zero on failure.
+ */
+static int print_results_bw(char *filename,  int bm_pid, float bw_imc,
+   unsigned long bw_resc)
+{
+   unsigned long diff = labs(bw_imc - bw_resc);
+   FILE *fp;
+
+   if (strcmp(filename, "stdio") == 0 || strcmp(filename, "stderr") == 0) {
+   printf("Pid: %d \t Mem_BW_iMC: %f \t ", bm_pid, bw_imc);
+   printf("Mem_BW_resc: %lu \t Difference: %lu\n", bw_resc, diff);
+   } else {
+   fp = fopen(filename, "a");
+   if (!fp) {
+   perror("Cannot open results file");
+
+   return errno;
+   }
+   if (fprintf(fp, "Pid: %d \t Mem_BW_iMC: %f \t Mem_BW_resc: %lu 
\t Difference: %lu\n",
+   bm_pid, bw_imc, bw_resc, diff) <= 0) {
+   fclose(fp);
+   perror("Could not log results.");
+
+   return errno;
+   }
+   fclose(fp);
+   }
+
+   return 0;
+}
+
+static int
+measure_vals(struct resctrl_val_param *param, unsigned long *bw_resc_start)
+{
+   unsigned long bw_imc, bw_resc, bw_resc_end;
+   int ret;
+
+   /*
+* Measure memory bandwidth from resctrl and from
+* another source which is perf imc value or could
+* be something else if perf 

[PATCH v4 00/10] selftests/resctrl: Add resctrl selftest

2018-12-21 Thread Fenghua Yu
With more and more resctrl features are being added by Intel, AMD
and ARM, a test tool is becoming more and more useful to validate
that both hardware and software functionalities work as expected.

We introduce resctrl selftest to cover resctrl features on both
X86 and ARM architectures. It first implements MBM (Memory Bandwidth
Monitoring) and MBA (Memory Bandwidth Allocation) tests. We can enhance
the selftest tool to include more functionality tests in future.

There is an existing resctrl test suite 'intel_cmt_cat'. But the major
purpose of the tool is to test Intel(R) RDT hardware via writing and
reading MSR registers. It does access resctrl file system; but the
functionalities are very limited. And it doesn't support automatic test
and a lot of manual verifications are involved.

So the selftest tool we are introducing here provides a convenient
tool which does automatic resctrl testing, is easily available in kernel
tree, and will be extended to AMD QoS and ARM MPAM.

The selftest tool is in tools/testing/selftests/resctrl in order to have
generic test code for all architectures.

Changelog:
v4: 
- address comments from Balu and Randy
- Add CAT and CQM tests

v3:
- Change code based on comments from Babu Moger
- Remove some unnessary code and use pipe to communicate b/w processes

v2:
- Change code based on comments from Babu Moger
- Clean up other places.

Arshiya Hayatkhan Pathan (4):
  selftests/resctrl: Add MBM test
  selftests/resctrl: Add MBA test
  selftests/resctrl Add Cache QoS Monitoring (CQM) selftest
  selftests/resctrl: Add Cache Allocation Technology (CAT) selftest

Fenghua Yu (2):
  selftests/resctrl: Add README for resctrl tests
  selftests/resctrl: Add the test in MAINTAINERS

Sai Praneeth Prakhya (4):
  selftests/resctrl: Add basic resctrl file system operations and data
  selftests/resctrl: Read memory bandwidth from perf IMC counter and
from resctrl file system
  selftests/resctrl: Add callback to start a benchmark
  selftests/resctrl: Add built in benchmark

 MAINTAINERS   |   1 +
 tools/testing/selftests/resctrl/Makefile  |  16 +
 tools/testing/selftests/resctrl/README|  53 ++
 tools/testing/selftests/resctrl/cache.c   | 275 +++
 tools/testing/selftests/resctrl/cat_test.c| 243 ++
 tools/testing/selftests/resctrl/cqm_test.c| 169 
 tools/testing/selftests/resctrl/fill_buf.c| 198 +
 tools/testing/selftests/resctrl/mba_test.c| 174 +
 tools/testing/selftests/resctrl/mbm_test.c| 146 
 tools/testing/selftests/resctrl/resctrl.h | 104 +++
 .../testing/selftests/resctrl/resctrl_tests.c | 175 +
 tools/testing/selftests/resctrl/resctrl_val.c | 727 ++
 tools/testing/selftests/resctrl/resctrlfs.c   | 643 
 13 files changed, 2924 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/Makefile
 create mode 100644 tools/testing/selftests/resctrl/README
 create mode 100644 tools/testing/selftests/resctrl/cache.c
 create mode 100644 tools/testing/selftests/resctrl/cat_test.c
 create mode 100644 tools/testing/selftests/resctrl/cqm_test.c
 create mode 100644 tools/testing/selftests/resctrl/fill_buf.c
 create mode 100644 tools/testing/selftests/resctrl/mba_test.c
 create mode 100644 tools/testing/selftests/resctrl/mbm_test.c
 create mode 100644 tools/testing/selftests/resctrl/resctrl.h
 create mode 100644 tools/testing/selftests/resctrl/resctrl_tests.c
 create mode 100644 tools/testing/selftests/resctrl/resctrl_val.c
 create mode 100644 tools/testing/selftests/resctrl/resctrlfs.c

-- 
2.19.1



[PATCH v4 06/10] selftests/resctrl: Add MBM test

2018-12-21 Thread Fenghua Yu
From: Arshiya Hayatkhan Pathan 

MBM (Memory Bandwidth Monitoring) test is the first implemented selftest.
It starts a stressful memory bandwidth benchmark and assigns the
bandwidth pid in a resctrl monitoring group. Read and compare perf IMC
counter and MBM total bytes for the benchmark. The numbers should be
close enough to pass the test.

Default benchmark is built-in fill_buf. But users can specify their
own benchmark by option "-b".

We can add memory bandwidth monitoring for multiple processes in the
future.

Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Sai Praneeth Prakhya ,
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile  |   8 +-
 tools/testing/selftests/resctrl/mbm_test.c| 145 ++
 tools/testing/selftests/resctrl/resctrl.h |   3 +
 .../testing/selftests/resctrl/resctrl_tests.c | 123 +++
 tools/testing/selftests/resctrl/resctrl_val.c |   2 +
 5 files changed, 280 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/resctrl/mbm_test.c
 create mode 100644 tools/testing/selftests/resctrl/resctrl_tests.c

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index bd5c5418961e..51165f9f8a9d 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -1,10 +1,16 @@
 CC = gcc
 CFLAGS = -g -Wall
 
+all: resctrl_tests
+
 *.o: *.c
$(CC) $(CFLAGS) -c *.c
 
+resctrl_tests: *.o
+   $(CC) $(CFLAGS) -o resctrl_tests resctrl_tests.o resctrlfs.o \
+membw.o fill_buf.o mbm_test.o
+
 .PHONY: clean
 
 clean:
-   $(RM) *.o *~
+   $(RM) *.o *~ resctrl_tests
diff --git a/tools/testing/selftests/resctrl/mbm_test.c 
b/tools/testing/selftests/resctrl/mbm_test.c
new file mode 100644
index ..fc1b8bab1c71
--- /dev/null
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Memory Bandwidth Monitoring (MBM) test
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ *Arshiya Hayatkhan Pathan 
+ *Sai Praneeth Prakhya ,
+ *Fenghua Yu 
+ */
+#include "resctrl.h"
+
+#define RESULT_FILE_NAME   "result_mbm"
+#define MAX_DIFF   300
+#define NUM_OF_RUNS5
+
+static void
+show_bw_info(unsigned long *bw_imc, unsigned long *bw_resc, int span)
+{
+   unsigned long avg_bw_imc = 0, avg_bw_resc = 0;
+   unsigned long sum_bw_imc = 0, sum_bw_resc = 0;
+   long avg_diff = 0;
+   int runs;
+
+   /*
+* Discard the first value which is inaccurate due to monitoring setup
+* transition phase.
+*/
+   for (runs = 1; runs < NUM_OF_RUNS ; runs++) {
+   sum_bw_imc += bw_imc[runs];
+   sum_bw_resc += bw_resc[runs];
+   }
+
+   avg_bw_imc = sum_bw_imc / 4;
+   avg_bw_resc = sum_bw_resc / 4;
+   avg_diff = avg_bw_resc - avg_bw_imc;
+
+   printf("\nSpan (MB): %d \t", span);
+   printf("avg_bw_imc: %lu\t", avg_bw_imc);
+   printf("avg_bw_resc: %lu\t", avg_bw_resc);
+   printf("avg_diff: %lu\t", labs(avg_diff));
+
+   if (labs(avg_diff) > MAX_DIFF)
+   printf(" failed\n");
+   else
+   printf(" passed\n");
+}
+
+static int check_results(int span)
+{
+   unsigned long bw_imc[NUM_OF_RUNS], bw_resc[NUM_OF_RUNS];
+   char temp[1024], *token_array[8];
+   char output[] = RESULT_FILE_NAME;
+   int runs;
+   FILE *fp;
+
+   printf("\nChecking for pass/fail\n");
+
+   fp = fopen(output, "r");
+   if (!fp) {
+   perror("Error in opening file\n");
+
+   return errno;
+   }
+
+   runs = 0;
+   while (fgets(temp, 1024, fp)) {
+   char *token = strtok(temp, ":\t");
+   int i = 0;
+
+   while (token) {
+   token_array[i++] = token;
+   token = strtok(NULL, ":\t");
+   }
+
+   bw_resc[runs] = atol(token_array[5]);
+   bw_imc[runs] = atol(token_array[3]);
+   runs++;
+   }
+
+   show_bw_info(bw_imc, bw_resc, span);
+
+   fclose(fp);
+
+   return 0;
+}
+
+static int mbm_setup(int num, ...)
+{
+   struct resctrl_val_param *p;
+   static int num_of_runs;
+   va_list param;
+   int ret = 0;
+
+   /* Run NUM_OF_RUNS times */
+   if (num_of_runs++ >= NUM_OF_RUNS)
+   return -1;
+
+   va_start(param, num);
+   p = va_arg(param, struct resctrl_val_param *);
+   va_end(param);
+
+   /* Set up shemata with 100% allocation on the first run. */
+   if (num_of_runs == 0)
+   ret = write_schemata(p->ctrlgrp, "100", p->cpu_no,
+p->resctrl_val);
+
+   return ret;
+}
+
+void mbm_test_cleanup(void)
+{
+   remove(RESULT_FILE_NAME);
+}
+
+int mbm_bw_change(int span, int core_id, char *bw_report, char **benchmark_cmd)

[PATCH v4 08/10] selftests/resctrl Add Cache QoS Monitoring (CQM) selftest

2018-12-21 Thread Fenghua Yu
From: Arshiya Hayatkhan Pathan 

Cache QoS Monitoring (CQM) selftest starts stressful cache benchmark
with specified size of memory to access the cache. Last Level cache
occupancy reported by CQM should be close to the size of the memory.

Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/Makefile  |   2 +-
 tools/testing/selftests/resctrl/cache.c   | 102 
 tools/testing/selftests/resctrl/cqm_test.c| 169 +
 tools/testing/selftests/resctrl/fill_buf.c| 107 
 tools/testing/selftests/resctrl/mba_test.c|   3 +-
 tools/testing/selftests/resctrl/mbm_test.c|   3 +-
 tools/testing/selftests/resctrl/resctrl.h |  30 ++-
 .../testing/selftests/resctrl/resctrl_tests.c |  82 +--
 tools/testing/selftests/resctrl/resctrl_val.c | 106 
 tools/testing/selftests/resctrl/resctrlfs.c   | 231 --
 10 files changed, 691 insertions(+), 144 deletions(-)
 create mode 100644 tools/testing/selftests/resctrl/cache.c
 create mode 100644 tools/testing/selftests/resctrl/cqm_test.c

diff --git a/tools/testing/selftests/resctrl/Makefile 
b/tools/testing/selftests/resctrl/Makefile
index bf9f55e71d0c..664561cd76e6 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -8,7 +8,7 @@ all: resctrl_tests
 
 resctrl_tests: *.o
$(CC) $(CFLAGS) -o resctrl_tests resctrl_tests.o resctrlfs.o \
-membw.o fill_buf.o mbm_test.o mba_test.o
+resctrl_val.o fill_buf.o mbm_test.o mba_test.o cache.o 
cqm_test.o
 
 .PHONY: clean
 
diff --git a/tools/testing/selftests/resctrl/cache.c 
b/tools/testing/selftests/resctrl/cache.c
new file mode 100644
index ..1256590ef804
--- /dev/null
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include "resctrl.h"
+
+struct read_format {
+   __u64 nr;   /* The number of events */
+   struct {
+   __u64 value;/* The value of the event */
+   } values[2];
+};
+
+char cbm_mask[256];
+unsigned long long_mask;
+char llc_occup_path[1024];
+
+/*
+ * Get LLC Occupancy as reported by RESCTRL FS
+ * For CQM,
+ * 1. If con_mon grp and mon grp given, then read from mon grp in
+ * con_mon grp
+ * 2. If only con_mon grp given, then read from con_mon grp
+ * 3. If both not given, then read from root con_mon grp
+ * For CAT,
+ * 1. If con_mon grp given, then read from it
+ * 2. If con_mon grp not given, then read from root con_mon grp
+ *
+ * Return: =0 on success.  <0 on failure.
+ */
+static int get_llc_occu_resctrl(unsigned long *llc_occupancy)
+{
+   FILE *fp;
+
+   fp = fopen(llc_occup_path, "r");
+   if (!fp) {
+   perror("Failed to open results file");
+
+   return errno;
+   }
+   if (fscanf(fp, "%lu", llc_occupancy) <= 0) {
+   perror("Could not get llc occupancy");
+   fclose(fp);
+
+   return -1;
+   }
+   fclose(fp);
+
+   return 0;
+}
+
+/*
+ * print_results_cache:the cache results are stored in a file
+ * @filename:  file that stores the results
+ * @bm_pid:child pid that runs benchmark
+ * @llc_value: perf miss value /
+ * llc occupancy value reported by resctrl FS
+ *
+ * Return: 0 on success. non-zero on failure.
+ */
+static int print_results_cache(char *filename, int bm_pid,
+  unsigned long llc_value)
+{
+   FILE *fp;
+
+   if (strcmp(filename, "stdio") == 0 || strcmp(filename, "stderr") == 0) {
+   printf("Pid: %d \t LLC_value: %lu\n", bm_pid,
+  llc_value);
+   } else {
+   fp = fopen(filename, "a");
+   if (!fp) {
+   perror("Cannot open results file");
+
+   return errno;
+   }
+   fprintf(fp, "Pid: %d \t llc_value: %lu\n", bm_pid,
+   llc_value);
+   fclose(fp);
+   }
+
+   return 0;
+}
+
+int measure_cache_vals(struct resctrl_val_param *param, int bm_pid)
+{
+   unsigned long llc_occu_resc = 0, llc_value = 0;
+   int ret;
+
+   /*
+* Measure llc occupancy from resctrl.
+*/
+   if (!strcmp(param->resctrl_val, "cqm")) {
+   ret = get_llc_occu_resctrl(_occu_resc);
+   if (ret < 0)
+   return ret;
+   llc_value = llc_occu_resc;
+   }
+   ret = print_results_cache(param->filename, bm_pid, llc_value);
+   if (ret)
+   return ret;
+
+   return 0;
+}
diff --git a/tools/testing/selftests/resctrl/cqm_test.c 
b/tools/testing/selftests/resctrl/cqm_test.c
new file mode 100644
index ..cddee364c88a
--- /dev/null
+++ b/tools/testing/selftests/resctrl/cqm_test.c
@@ 

[PATCH v4 05/10] selftests/resctrl: Add built in benchmark

2018-12-21 Thread Fenghua Yu
From: Sai Praneeth Prakhya 

Built-in benchmark fill_buf generates stressful memory bandwidth
and cache traffic.

Later it will be used as a default benchmark by various resctrl tests
such as MBA (Memory Bandwidth Allocation) and MBM (Memory Bandwidth
Monitoring) tests.

Signed-off-by: Sai Praneeth Prakhya 
Signed-off-by: Arshiya Hayatkhan Pathan 
Signed-off-by: Fenghua Yu 
---
 tools/testing/selftests/resctrl/fill_buf.c | 175 +
 1 file changed, 175 insertions(+)
 create mode 100644 tools/testing/selftests/resctrl/fill_buf.c

diff --git a/tools/testing/selftests/resctrl/fill_buf.c 
b/tools/testing/selftests/resctrl/fill_buf.c
new file mode 100644
index ..d9950b5d068d
--- /dev/null
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * fill_buf benchmark
+ *
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ *Arshiya Hayatkhan Pathan 
+ *Sai Praneeth Prakhya ,
+ *Fenghua Yu 
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "resctrl.h"
+
+#define CL_SIZE(64)
+#define PAGE_SIZE  (4 * 1024)
+#define MB (1024 * 1024)
+
+static unsigned char *startptr;
+
+static void sb(void)
+{
+   asm volatile("sfence\n\t"
+: : : "memory");
+}
+
+static void ctrl_handler(int signo)
+{
+   free(startptr);
+   printf("\nEnding\n");
+   sb();
+   exit(EXIT_SUCCESS);
+}
+
+static void cl_flush(void *p)
+{
+   asm volatile("clflush (%0)\n\t"
+: : "r"(p) : "memory");
+}
+
+static void mem_flush(void *p, size_t s)
+{
+   char *cp = (char *)p;
+   size_t i = 0;
+
+   s = s / CL_SIZE; /* mem size in cache llines */
+
+   for (i = 0; i < s; i++)
+   cl_flush([i * CL_SIZE]);
+
+   sb();
+}
+
+static void *malloc_and_init_memory(size_t s)
+{
+   uint64_t *p64;
+   size_t s64;
+
+   void *p = memalign(PAGE_SIZE, s);
+
+   p64 = (uint64_t *)p;
+   s64 = s / sizeof(uint64_t);
+
+   while (s64 > 0) {
+   *p64 = (uint64_t)rand();
+   p64 += (CL_SIZE / sizeof(uint64_t));
+   s64 -= (CL_SIZE / sizeof(uint64_t));
+   }
+
+   return p;
+}
+
+static void fill_cache_read(unsigned char *start_ptr, unsigned char *end_ptr)
+{
+   while (1) {
+   unsigned char sum, *p;
+
+   p = start_ptr;
+   /* Read two chars in each cache line to stress cache */
+   while (p < (end_ptr - 1024)) {
+   sum += p[0] + p[32] + p[64] + p[96] + p[128] +
+  p[160] + p[192] + p[224] + p[256] + p[288] +
+  p[320] + p[352] + p[384] + p[416] + p[448] +
+  p[480] + p[512] + p[544] + p[576] + p[608] +
+  p[640] + p[672] + p[704] + p[736] + p[768] +
+  p[800] + p[832] + p[864] + p[896] + p[928] +
+  p[960] + p[992];
+   p += 1024;
+   }
+   }
+}
+
+static void fill_cache_write(unsigned char *start_ptr, unsigned char *end_ptr)
+{
+   while (1) {
+   while (start_ptr < end_ptr) {
+   *start_ptr = '1';
+   start_ptr += (CL_SIZE / 2);
+   }
+   start_ptr = startptr;
+   }
+}
+
+static void
+fill_cache(unsigned long long buf_size, int malloc_and_init,
+  int memflush, int op)
+{
+   unsigned char *start_ptr, *end_ptr;
+   unsigned long long i;
+
+   if (malloc_and_init) {
+   start_ptr = malloc_and_init_memory(buf_size);
+   printf("Started benchmark with memalign\n");
+   } else {
+   start_ptr = malloc(buf_size);
+   printf("Started benchmark with malloc\n");
+   }
+
+   if (!start_ptr)
+   return;
+
+   startptr = start_ptr;
+   end_ptr = start_ptr + buf_size;
+
+   /*
+* It's better to touch the memory once to avoid any compiler
+* optimizations
+*/
+   if (!malloc_and_init) {
+   for (i = 0; i < buf_size; i++)
+   *start_ptr++ = (unsigned char)rand();
+   }
+
+   start_ptr = startptr;
+
+   /* Flush the memory before using to avoid "cache hot pages" effect */
+   if (memflush) {
+   mem_flush(start_ptr, buf_size);
+   printf("Started benchmark with memflush\n");
+   } else {
+   printf("Started benchmark *without* memflush\n");
+   }
+
+   if (op == 0)
+   fill_cache_read(start_ptr, end_ptr);
+   else
+   fill_cache_write(start_ptr, end_ptr);
+
+   free(startptr);
+}
+
+int run_fill_buf(int span, int malloc_and_init_memory, int memflush, int op)
+{
+   unsigned long long cache_size = span * 

[PATCH v4 10/10] selftests/resctrl: Add the test in MAINTAINERS

2018-12-21 Thread Fenghua Yu
The resctrl selftest will be maintained by RDT maintainers.

Signed-off-by: Fenghua Yu 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 842b697a1511..edafdecb1e41 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12650,6 +12650,7 @@ S:  Supported
 F: arch/x86/kernel/cpu/intel_rdt*
 F: arch/x86/include/asm/intel_rdt_sched.h
 F: Documentation/x86/intel_rdt*
+F: tools/testing/selftests/resctrl
 
 READ-COPY UPDATE (RCU)
 M: "Paul E. McKenney" 
-- 
2.19.1



Re: [PATCH v2] RISC-V: defconfig: Enable Generic PCIE by default

2018-12-21 Thread Paul Walmsley
(fixed "Palmpalmer" again, might want to check your script for sending 
patches)

On Fri, 21 Dec 2018, Alistair Francis wrote:

> Enable generic PCIe by default in the RISC-V defconfig, this allows us
> to use QEMU's PCIe support out of the box.
> 
> Signed-off-by: Alistair Francis 

Thanks for updating this.  

I tried applying it against v4.20-rc7 and running "make savedefconfig", 
and there is a delta.  It is better if there is no delta.  Is this patch 
based on a different commit?  Or, is this patch not the product of "make 
savedefconfig"?   If the latter, please do:

make savedefconfig
cp defconfig arch/riscv/configs/defconfig

to clean up the diff.

thanks

- Paul


Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Joe Perches
On Fri, 2018-12-21 at 19:06 -0500, Steven Rostedt wrote:
> On Fri, 21 Dec 2018 18:13:16 -0500
> Steven Rostedt  wrote:
> 
> > +static __always_inline int str_has_prefix(const char *str, const char 
> > *prefix)
> 
> I'm thinking it is cleaner to have two helper functions and have them
> both return bool.
> 
> static __always_inline bool str_has_prefix(const char *str, const char 
> *prefix)
> {
>   return !strncmp(str, prefix, strlen(prefix));
> }
> 
> ( I still like to keep the __always_inline, it doesn't hurt )
> 
> And I'll make a separate patch that adds:
> 
> static __always_inline bool
> str_has_prefix_len(const char *str, const char *prefix, unsigned int *len)
> {
>   *len = strlen(prefix);
>   return !strncmp(str, prefix, *len) ? *len : 0;
> }
> 
> Everyone OK with that? 

I guess so as you're the one doing all the work.

Trivia:

Shouldn't the latter function use __kernel_size_t
as that's the actual return type of strlen?

Thanks for keeping at it.



[PATCH v1 03/11] vga-switcheroo: make PCI dependency explicit

2018-12-21 Thread Sinan Kaya
This driver depends on the PCI infrastructure but the dependency has not
been explicitly called out.

Signed-off-by: Sinan Kaya 
---
 drivers/gpu/vga/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/vga/Kconfig b/drivers/gpu/vga/Kconfig
index b677e5d524e6..d5f1d8e1c6f8 100644
--- a/drivers/gpu/vga/Kconfig
+++ b/drivers/gpu/vga/Kconfig
@@ -21,6 +21,7 @@ config VGA_SWITCHEROO
bool "Laptop Hybrid Graphics - GPU switching support"
depends on X86
depends on ACPI
+   depends on PCI
select VGA_ARB
help
  Many laptops released in 2008/9/10 have two GPUs with a multiplexer
-- 
2.19.0



[PATCH v1 07/11] drivers: thermal: Hide PCI driver when CONFIG_PCI is unset

2018-12-21 Thread Sinan Kaya
This driver is both a platform and PCI driver. Hide PCI specific pieces
when CONFIG_PCI is unset.

Signed-off-by: Sinan Kaya 
---
 .../intel/int340x_thermal/processor_thermal_device.c  | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c 
b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
index 284cf2c5a8fd..b84a475a1162 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device.c
@@ -374,6 +374,7 @@ static int int3401_remove(struct platform_device *pdev)
return 0;
 }
 
+#ifdef CONFIG_PCI
 static irqreturn_t proc_thermal_pci_msi_irq(int irq, void *devid)
 {
struct proc_thermal_device *proc_priv;
@@ -482,6 +483,7 @@ static struct pci_driver proc_thermal_pci_driver = {
.remove = proc_thermal_pci_remove,
.id_table   = proc_thermal_pci_ids,
 };
+#endif
 
 static const struct acpi_device_id int3401_device_ids[] = {
{"INT3401", 0},
@@ -505,16 +507,18 @@ static int __init proc_thermal_init(void)
ret = platform_driver_register(_driver);
if (ret)
return ret;
-
+#ifdef CONFIG_PCI
ret = pci_register_driver(_thermal_pci_driver);
-
+#endif
return ret;
 }
 
 static void __exit proc_thermal_exit(void)
 {
platform_driver_unregister(_driver);
+#ifdef CONFIG_PCI
pci_unregister_driver(_thermal_pci_driver);
+#endif
 }
 
 module_init(proc_thermal_init);
-- 
2.19.0



[PATCH v1 06/11] platform/x86: apple-gmux: hide PCI specific code

2018-12-21 Thread Sinan Kaya
Code is scanning PCI bus to find out if it is switchable or not. If
CONFIG_PCI is not set, assume unswitchable.

Signed-off-by: Sinan Kaya 
---
 drivers/platform/x86/apple-gmux.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/platform/x86/apple-gmux.c 
b/drivers/platform/x86/apple-gmux.c
index fd2ffebc868f..b552b54bf58b 100644
--- a/drivers/platform/x86/apple-gmux.c
+++ b/drivers/platform/x86/apple-gmux.c
@@ -742,8 +742,12 @@ static int gmux_probe(struct pnp_dev *pnp, const struct 
pnp_device_id *id)
 * If Thunderbolt is present, the external DP port is not fully
 * switchable. Force its AUX channel to the discrete GPU.
 */
+#ifdef CONFIG_PCI
gmux_data->external_switchable =
!bus_for_each_dev(_bus_type, NULL, NULL, is_thunderbolt);
+#else
+   gmux_data->external_switchable = false;
+#endif
if (!gmux_data->external_switchable)
gmux_write8(gmux_data, GMUX_PORT_SWITCH_EXTERNAL, 3);
 
-- 
2.19.0



[PATCH v1 08/11] ASoC: Intel: Make PCI dependency explicit

2018-12-21 Thread Sinan Kaya
IOSF_MBI driver depends on CONFIG_PCI set but this is not specified
anywhere.

Signed-off-by: Sinan Kaya 
---
 sound/soc/intel/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
index 2fd1b61e8331..b0764b2fe001 100644
--- a/sound/soc/intel/Kconfig
+++ b/sound/soc/intel/Kconfig
@@ -91,7 +91,7 @@ config SND_SST_ATOM_HIFI2_PLATFORM_PCI
 config SND_SST_ATOM_HIFI2_PLATFORM_ACPI
tristate "ACPI HiFi2 (Baytrail, Cherrytrail) Platforms"
default ACPI
-   depends on X86 && ACPI
+   depends on X86 && ACPI && PCI
select SND_SST_IPC_ACPI
select SND_SST_ATOM_HIFI2_PLATFORM
select SND_SOC_ACPI_INTEL_MATCH
-- 
2.19.0



[PATCH v1 02/11] ata: make PCI dependency explicit for PATA_ACPI

2018-12-21 Thread Sinan Kaya
PATA_ACPI is a PCI device driver but the PCI dependency has not been
explicitly called out.

Signed-off-by: Sinan Kaya 
---
 drivers/ata/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 4ca7a6b4eaae..8218db17ebdb 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -1091,7 +1091,7 @@ comment "Generic fallback / legacy drivers"
 
 config PATA_ACPI
tristate "ACPI firmware driver for PATA"
-   depends on ATA_ACPI && ATA_BMDMA
+   depends on ATA_ACPI && ATA_BMDMA && PCI
help
  This option enables an ACPI method driver which drives
  motherboard PATA controller interfaces through the ACPI
-- 
2.19.0



[PATCH v1 09/11] mmc: add PCI dependency into IOSF_MBI

2018-12-21 Thread Sinan Kaya
Select IOSF_MBI only when PCI is set.

Signed-off-by: Sinan Kaya 
---
 drivers/mmc/host/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index e26b8145efb3..4c5f037b246d 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -117,7 +117,7 @@ config MMC_RICOH_MMC
 config MMC_SDHCI_ACPI
tristate "SDHCI support for ACPI enumerated SDHCI controllers"
depends on MMC_SDHCI && ACPI
-   select IOSF_MBI if X86
+   select IOSF_MBI if (X86 && PCI)
help
  This selects support for ACPI enumerated SDHCI controllers,
  identified by ACPI Compatibility ID PNP0D40 or specific
-- 
2.19.0



[PATCH v1 04/11] platform/x86: make PCI dependency explicit

2018-12-21 Thread Sinan Kaya
ipss driver is a PCI device driver but this has not been mentioned
anywhere in Kconfig.

Signed-off-by: Sinan Kaya 
---
 drivers/platform/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index e3b62c2ee8d1..b36ea14b41ad 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1009,7 +1009,7 @@ config INTEL_MFLD_THERMAL
 
 config INTEL_IPS
tristate "Intel Intelligent Power Sharing"
-   depends on ACPI
+   depends on ACPI && PCI
---help---
  Intel Calpella platforms support dynamic power sharing between the
  CPU and GPU, maximizing performance in a given TDP.  This driver,
-- 
2.19.0



[PATCH v1 10/11] x86: select IOSF_MBI only when CONFIG_PCI is set

2018-12-21 Thread Sinan Kaya
Need CONFIG_PCI to be set in order to be able to use IOSF_MBI
functionality.

Signed-off-by: Sinan Kaya 
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc4f0c6ee1ed..322a58ababb0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -617,7 +617,7 @@ config X86_INTEL_QUARK
 
 config X86_INTEL_LPSS
bool "Intel Low Power Subsystem Support"
-   depends on X86 && ACPI
+   depends on X86 && ACPI && PCI
select COMMON_CLK
select PINCTRL
select IOSF_MBI
-- 
2.19.0



[PATCH v1 11/11] drivers: thermal: Make PCI dependency explicit

2018-12-21 Thread Sinan Kaya
IOSF_CORE depends on PCI. This was never mentioned.

Signed-off-by: Sinan Kaya 
---
 drivers/thermal/intel/int340x_thermal/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/intel/int340x_thermal/Kconfig 
b/drivers/thermal/intel/int340x_thermal/Kconfig
index 0582bd12a239..0ca908d12750 100644
--- a/drivers/thermal/intel/int340x_thermal/Kconfig
+++ b/drivers/thermal/intel/int340x_thermal/Kconfig
@@ -4,7 +4,7 @@
 
 config INT340X_THERMAL
tristate "ACPI INT340X thermal drivers"
-   depends on X86 && ACPI
+   depends on X86 && ACPI && PCI
select THERMAL_GOV_USER_SPACE
select ACPI_THERMAL_REL
select ACPI_FAN
-- 
2.19.0



[PATCH v1 01/11] ACPI / LPSS: Add guards against CONFIG_PCI

2018-12-21 Thread Sinan Kaya
We can now compile ACPI without PCI support. If code depends on CONFIG_PCI,
it needs to explicitly guard that piece.

Signed-off-by: Sinan Kaya 
---
 drivers/acpi/acpi_lpss.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index 5f94c35d165f..4148abcdf9ef 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -534,8 +534,11 @@ static struct device *acpi_lpss_find_device(const char 
*hid, const char *uid)
dev = bus_find_device(_bus_type, NULL, , match_hid_uid);
if (dev)
return dev;
-
+#ifdef CONFIG_PCI
return bus_find_device(_bus_type, NULL, , match_hid_uid);
+#else
+   return NULL;
+#endif
 }
 
 static bool acpi_lpss_dep(struct acpi_device *adev, acpi_handle handle)
-- 
2.19.0



[PATCH v1 05/11] platform/x86: intel_pmc: Hide PCI specific pieces behind CONFIG_PCI

2018-12-21 Thread Sinan Kaya
In the configuration where CONFIG_PCI is unset, this driver is failing to
compile due to PCI framework dependencies. Hide these behind CONFIG_PCI
ifdef.

Signed-off-by: Sinan Kaya 
---
 drivers/platform/x86/intel_pmc_ipc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/platform/x86/intel_pmc_ipc.c 
b/drivers/platform/x86/intel_pmc_ipc.c
index 7964ba22ef8d..d85dfed3bf9c 100644
--- a/drivers/platform/x86/intel_pmc_ipc.c
+++ b/drivers/platform/x86/intel_pmc_ipc.c
@@ -504,6 +504,7 @@ static irqreturn_t ioc(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+#ifdef CONFIG_PCI
 static int ipc_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
struct intel_pmc_ipc_dev *pmc = 
@@ -556,6 +557,7 @@ static struct pci_driver ipc_pci_driver = {
.id_table = ipc_pci_ids,
.probe = ipc_pci_probe,
 };
+#endif
 
 static ssize_t intel_pmc_ipc_simple_cmd_store(struct device *dev,
  struct device_attribute *attr,
@@ -1007,18 +1009,22 @@ static int __init intel_pmc_ipc_init(void)
pr_err("Failed to register PMC ipc platform driver\n");
return ret;
}
+#ifdef CONFIG_PCI
ret = pci_register_driver(_pci_driver);
if (ret) {
pr_err("Failed to register PMC ipc pci driver\n");
platform_driver_unregister(_plat_driver);
return ret;
}
+#endif
return ret;
 }
 
 static void __exit intel_pmc_ipc_exit(void)
 {
+#ifdef CONFIG_PCI
pci_unregister_driver(_pci_driver);
+#endif
platform_driver_unregister(_plat_driver);
 }
 
-- 
2.19.0



[PATCH 1/1] x86/hyper-v: Fix 'set but not used' warnings

2018-12-21 Thread Michael Kelley
In these two cases, a value returned by rdmsr() or rdmsrl()
is ignored. Indicate that ignoring the value is intentional, so
that with the W=1 compilation option no warning is generated.

Signed-off-by: Michael Kelley 
---
 arch/x86/hyperv/hv_apic.c | 2 +-
 arch/x86/hyperv/hv_spinlock.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 8eb6fbee..66a0f53 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -55,7 +55,7 @@ static void hv_apic_icr_write(u32 low, u32 id)
 
 static u32 hv_apic_read(u32 reg)
 {
-   u32 reg_val, hi;
+   u32 reg_val, __maybe_unused hi;
 
switch (reg) {
case APIC_EOI:
diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b04..e18c63d5 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -25,7 +25,7 @@ static void hv_qlock_kick(int cpu)
 
 static void hv_qlock_wait(u8 *byte, u8 val)
 {
-   unsigned long msr_val;
+   unsigned long __maybe_unused msr_val;
unsigned long flags;
 
if (in_nmi())
-- 
1.8.3.1



Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 18:13:16 -0500
Steven Rostedt  wrote:

> +static __always_inline int str_has_prefix(const char *str, const char 
> *prefix)

I'm thinking it is cleaner to have two helper functions and have them
both return bool.

static __always_inline bool str_has_prefix(const char *str, const char *prefix)
{
return !strncmp(str, prefix, strlen(prefix));
}

( I still like to keep the __always_inline, it doesn't hurt )

And I'll make a separate patch that adds:

static __always_inline bool
str_has_prefix_len(const char *str, const char *prefix, unsigned int *len)
{
*len = strlen(prefix);
return !strncmp(str, prefix, *len) ? *len : 0;
}

Everyone OK with that?

-- Steve


Re: linux-next: Tree for Dec 21 (acpi without CONFIG_PCI)

2018-12-21 Thread Sinan Kaya

On 12/21/2018 6:43 PM, Randy Dunlap wrote:

Any other issues?

There are a few Kconfig dependency issues.  Here are 2 of them:


WARNING: unmet direct dependencies detected for VGA_ARB
   Depends on [n]: HAS_IOMEM [=y] && PCI [=n] && !S390
   Selected by [y]:
   - VGA_SWITCHEROO [=y] && HAS_IOMEM [=y] && X86 [=y] && ACPI [=y]

WARNING: unmet direct dependencies detected for IOSF_MBI
   Depends on [n]: PCI [=n]
   Selected by [y]:
   - X86_INTEL_LPSS [=y] && X86 [=y] && ACPI [=y]
   Selected by [m]:
   - MMC_SDHCI_ACPI [=m] && MMC [=m] && MMC_SDHCI [=m] && ACPI [=y] && X86 [=y]


The randconfig file for these is attached.

(on x86_64; both issues are from the same .config file)


Thanks, I reproduced this. I'm getting ready to post a patch.



Re: linux-next: Tree for Dec 21 (acpi without CONFIG_PCI)

2018-12-21 Thread Sinan Kaya

On 12/21/2018 7:28 PM, Randy Dunlap wrote:

Any other issues?


and on i386, there is this additional kconfig warning:

WARNING: unmet direct dependencies detected for INTEL_SOC_DTS_IOSF_CORE
   Depends on [n]: THERMAL [=y] && (X86 [=y] || X86_INTEL_QUARK [=n] || COMPILE_TEST [=n]) 
&& X86 [=y] && PCI [=n]
   Selected by [m]:
   - INT340X_THERMAL [=m] && THERMAL [=y] && (X86 [=y] || X86_INTEL_QUARK [=n] || COMPILE_TEST 
[=n]) && X86 [=y] && ACPI [=y]


The randconfig file for this is attached.


Thanks, I reproduced this. I'm getting ready to post a patch.


[PATCH v6 5/7] kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()

2018-12-21 Thread Roman Gushchin
If the cgroup destruction races with an exit() of a belonging
process(es), cg_kill_all() may fail. It's not a good reason to make
cg_destroy() fail and leave the cgroup in place, potentially causing
next test runs to fail.

Signed-off-by: Roman Gushchin 
Cc: Shuah Khan 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: linux-kselft...@vger.kernel.org
---
 tools/testing/selftests/cgroup/cgroup_util.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c 
b/tools/testing/selftests/cgroup/cgroup_util.c
index 14c9fe284806..eba06f94433b 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -227,9 +227,7 @@ int cg_destroy(const char *cgroup)
 retry:
ret = rmdir(cgroup);
if (ret && errno == EBUSY) {
-   ret = cg_killall(cgroup);
-   if (ret)
-   return ret;
+   cg_killall(cgroup);
usleep(100);
goto retry;
}
-- 
2.19.2



[PATCH v6 6/7] kselftests: cgroup: add freezer controller self-tests

2018-12-21 Thread Roman Gushchin
This patch implements 8 tests for the freezer controller for
cgroup v2:
1) a simple test, which aims to freeze and unfreeze a cgroup with 100
processes
2) a more complicated tree test, which creates a hierarchy of cgroups,
puts some processes in some cgroups, and tries to freeze and unfreeze
different parts of the subtree
3) a forkbomb test: the test aims to freeze a forkbomb running in a
cgroup, kill all tasks in the cgroup and remove the cgroup without
the unfreezing.
4) rmdir test: the test creates two nested cgroups, freezes the parent
one, checks that the child can be successfully removed, and a new
child can be created
5) migration tests: the test checks migration of a task between
frozen cgroups: from a frozen to a running, from a running to a
frozen, and from a frozen to a frozen.
6) ptrace test: the test checks that it's possible to attach to
a process in a frozen cgroup, get some information and detach, and
the cgroup will remain frozen.
7) stopped test: the test checks that it's possible to freeze a cgroup
with a stopped task
8) ptraced test: the test checks that it's possible to freeze a cgroup
with a ptraced task

Expected output:

  $ ./test_freezer
  ok 1 test_cgfreezer_simple
  ok 2 test_cgfreezer_tree
  ok 3 test_cgfreezer_forkbomb
  ok 4 test_cgrreezer_rmdir
  ok 5 test_cgfreezer_migrate
  ok 6 test_cgfreezer_ptrace
  ok 7 test_cgfreezer_stopped
  ok 8 test_cgfreezer_ptraced

Signed-off-by: Roman Gushchin 
Cc: Shuah Khan 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
Cc: linux-kselft...@vger.kernel.org
---
 tools/testing/selftests/cgroup/.gitignore |   1 +
 tools/testing/selftests/cgroup/Makefile   |   2 +
 tools/testing/selftests/cgroup/cgroup_util.c  |  81 +-
 tools/testing/selftests/cgroup/cgroup_util.h  |   7 +
 tools/testing/selftests/cgroup/test_freezer.c | 821 ++
 5 files changed, 911 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/cgroup/test_freezer.c

diff --git a/tools/testing/selftests/cgroup/.gitignore 
b/tools/testing/selftests/cgroup/.gitignore
index adacda50a4b2..7f9835624793 100644
--- a/tools/testing/selftests/cgroup/.gitignore
+++ b/tools/testing/selftests/cgroup/.gitignore
@@ -1,2 +1,3 @@
 test_memcontrol
 test_core
+test_freezer
diff --git a/tools/testing/selftests/cgroup/Makefile 
b/tools/testing/selftests/cgroup/Makefile
index 23fbaa4a9630..8d369b6a2069 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -5,8 +5,10 @@ all:
 
 TEST_GEN_PROGS = test_memcontrol
 TEST_GEN_PROGS += test_core
+TEST_GEN_PROGS += test_freezer
 
 include ../lib.mk
 
 $(OUTPUT)/test_memcontrol: cgroup_util.c
 $(OUTPUT)/test_core: cgroup_util.c
+$(OUTPUT)/test_freezer: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c 
b/tools/testing/selftests/cgroup/cgroup_util.c
index eba06f94433b..e9cdad673901 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -74,6 +74,16 @@ char *cg_name_indexed(const char *root, const char *name, 
int index)
return ret;
 }
 
+char *cg_control(const char *cgroup, const char *control)
+{
+   size_t len = strlen(cgroup) + strlen(control) + 2;
+   char *ret = malloc(len);
+
+   snprintf(ret, len, "%s/%s", cgroup, control);
+
+   return ret;
+}
+
 int cg_read(const char *cgroup, const char *control, char *buf, size_t len)
 {
char path[PATH_MAX];
@@ -196,7 +206,59 @@ int cg_create(const char *cgroup)
return mkdir(cgroup, 0644);
 }
 
-static int cg_killall(const char *cgroup)
+int cg_for_all_procs(const char *cgroup, int (*fn)(int pid, void *arg),
+void *arg)
+{
+   char buf[PAGE_SIZE];
+   char *ptr = buf;
+   int ret;
+
+   if (cg_read(cgroup, "cgroup.procs", buf, sizeof(buf)))
+   return -1;
+
+   while (ptr < buf + sizeof(buf)) {
+   int pid = strtol(ptr, , 10);
+
+   if (pid == 0)
+   break;
+   if (*ptr)
+   ptr++;
+   else
+   break;
+   ret = fn(pid, arg);
+   if (ret)
+   return ret;
+   }
+
+   return 0;
+}
+
+int cg_wait_for_proc_count(const char *cgroup, int count)
+{
+   char buf[10 * PAGE_SIZE] = {0};
+   int attempts;
+   char *ptr;
+
+   for (attempts = 10; attempts >= 0; attempts--) {
+   int nr = 0;
+
+   if (cg_read(cgroup, "cgroup.procs", buf, sizeof(buf)))
+   break;
+
+   for (ptr = buf; *ptr; ptr++)
+   if (*ptr == '\n')
+   nr++;
+
+   if (nr >= count)
+   return 0;
+
+   usleep(10);
+   }
+
+   return -1;
+}
+
+int cg_killall(const char *cgroup)
 {
char buf[PAGE_SIZE];
char *ptr = buf;
@@ -238,6 +300,14 @@ int cg_destroy(const char *cgroup)
 

[PATCH v6 1/7] cgroup: rename freezer.c into legacy_freezer.c

2018-12-21 Thread Roman Gushchin
Freezer.c will contain an implementation of cgroup v2 freezer,
so let's rename the v1 freezer to avoid naming conflicts.

Signed-off-by: Roman Gushchin 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
---
 kernel/cgroup/Makefile| 2 +-
 kernel/cgroup/{freezer.c => legacy_freezer.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename kernel/cgroup/{freezer.c => legacy_freezer.c} (100%)

diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index bfcdae896122..8d5689ca94b9 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-y := cgroup.o rstat.o namespace.o cgroup-v1.o
 
-obj-$(CONFIG_CGROUP_FREEZER) += freezer.o
+obj-$(CONFIG_CGROUP_FREEZER) += legacy_freezer.o
 obj-$(CONFIG_CGROUP_PIDS) += pids.o
 obj-$(CONFIG_CGROUP_RDMA) += rdma.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/legacy_freezer.c
similarity index 100%
rename from kernel/cgroup/freezer.c
rename to kernel/cgroup/legacy_freezer.c
-- 
2.19.2



[PATCH v6 4/7] cgroup: cgroup v2 freezer

2018-12-21 Thread Roman Gushchin
Cgroup v1 implements the freezer controller, which provides an ability
to stop the workload in a cgroup and temporarily free up some
resources (cpu, io, network bandwidth and, potentially, memory)
for some other tasks. Cgroup v2 lacks this functionality.

This patch implements freezer for cgroup v2.

Cgroup v2 freezer tries to put tasks into a state similar to jobctl
stop. This means that tasks can be killed, ptraced (using
PTRACE_SEIZE*), and interrupted. It is possible to attach to
a frozen task, get some information (e.g. read registers) and detach.
It's also possible to migrate a frozen tasks to another cgroup.

This differs cgroup v2 freezer from cgroup v1 freezer, which mostly
tried to imitate the system-wide freezer. However uninterruptible
sleep is fine when all tasks are going to be frozen (hibernation case),
it's not the acceptable state for some subset of the system.

Cgroup v2 freezer is not supporting freezing kthreads.
If a non-root cgroup contains kthread, the cgroup still can be frozen,
but the kthread will remain running, the cgroup will be shown
as non-frozen, and the notification will not be delivered.

* PTRACE_ATTACH is not working because non-fatal signal delivery
is blocked in frozen state.

There are some interface differences between cgroup v1 and cgroup v2
freezer too, which are required to conform the cgroup v2 interface
design principles:
1) There is no separate controller, which has to be turned on:
the functionality is always available and is represented by
cgroup.freeze and cgroup.events cgroup control files.
2) The desired state is defined by the cgroup.freeze control file.
Any hierarchical configuration is allowed.
3) The interface is asynchronous. The actual state is available
using cgroup.events control file ("frozen" field). There are no
dedicated transitional states.
4) It's allowed to make any changes with the cgroup hierarchy
(create new cgroups, remove old cgroups, move tasks between cgroups)
no matter if some cgroups are frozen.

Signed-off-by: Roman Gushchin 
Cc: Tejun Heo 
Cc: Oleg Nesterov 
Cc: kernel-t...@fb.com
---
 include/linux/cgroup-defs.h  |  34 
 include/linux/cgroup.h   |  46 +
 include/linux/sched.h|   4 +
 include/linux/sched/jobctl.h |   2 +
 kernel/cgroup/Makefile   |   2 +-
 kernel/cgroup/cgroup.c   | 106 ++-
 kernel/cgroup/freezer.c  | 329 +++
 kernel/signal.c  |  93 +-
 8 files changed, 607 insertions(+), 9 deletions(-)
 create mode 100644 kernel/cgroup/freezer.c

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 03355d7008ff..859813ff1ee2 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -64,6 +64,12 @@ enum {
 * specified at mount time and thus is implemented here.
 */
CGRP_CPUSET_CLONE_CHILDREN,
+
+   /* Control group has to be frozen. */
+   CGRP_FREEZE,
+
+   /* Cgroup is frozen. */
+   CGRP_FROZEN,
 };
 
 /* cgroup_root->flags */
@@ -316,6 +322,31 @@ struct cgroup_rstat_cpu {
struct cgroup *updated_next;/* NULL iff not on the list */
 };
 
+struct cgroup_freezer_state {
+   /* Should the cgroup and its descendants be frozen. */
+   bool freeze;
+
+   /* Should the cgroup actually be frozen? */
+   int e_freeze;
+
+   /* Fields below are protected by css_set_lock */
+
+   /* Number of frozen descendant cgroups */
+   int nr_frozen_descendants;
+
+   /* Number of tasks to freeze */
+   int nr_tasks_to_freeze;
+
+   /* Number of frozen tasks */
+   int nr_frozen_tasks;
+
+   /*
+* Number of tasks, which are counted as frozen:
+* SIGSTOPped, and PTRACEd.
+*/
+   int nr_stopped_tasks;
+};
+
 struct cgroup {
/* self css with NULL ->ss, points back to this cgroup */
struct cgroup_subsys_state self;
@@ -452,6 +483,9 @@ struct cgroup {
/* If there is block congestion on this cgroup. */
atomic_t congestion_count;
 
+   /* Used to store internal freezer state */
+   struct cgroup_freezer_state freezer;
+
/* ids of the ancestors at each level including self */
int ancestor_ids[];
 };
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 9d12757a65b0..df51b37928ba 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -877,4 +877,50 @@ static inline void put_cgroup_ns(struct cgroup_namespace 
*ns)
free_cgroup_ns(ns);
 }
 
+#ifdef CONFIG_CGROUPS
+
+void cgroup_enter_frozen(void);
+void cgroup_leave_frozen(void);
+void cgroup_enter_stopped(void);
+void cgroup_leave_stopped(void);
+void cgroup_dec_tasks_to_freeze(struct cgroup *cgrp);
+void cgroup_freeze(struct cgroup *cgrp, bool freeze);
+void cgroup_freezer_migrate_task(struct task_struct *task, struct cgroup *src,
+struct cgroup *dst);
+static inline bool cgroup_task_freeze(struct task_struct 

Re: [PATCH v7 1/5] tpm: dynamically allocate the allocated_banks array

2018-12-21 Thread Jarkko Sakkinen
On Fri, Dec 21, 2018 at 10:40:09AM +0100, Roberto Sassu wrote:
> On 12/20/2018 3:55 PM, Jarkko Sakkinen wrote:
> > On Thu, Dec 13, 2018 at 11:29:41AM +0100, Roberto Sassu wrote:
> > > This patch renames active_banks (member of tpm_chip) to allocated_banks,
> > > stores the number of allocated PCR banks in nr_allocated_banks (new member
> > > of tpm_chip), and replaces the static array with a pointer to a 
> > > dynamically
> > > allocated array.
> > > 
> > > tpm2_get_pcr_allocation() determines if a PCR bank is allocated by 
> > > checking
> > > the mask in the TPML_PCR_SELECTION structure returned by the TPM for
> > > TPM2_Get_Capability(). If a bank is not allocated, the TPM returns that
> > > bank in TPML_PCR_SELECTION, with all bits in the mask set to zero. In this
> > > case, the bank is not included in chip->allocated_banks, to avoid that TPM
> > > driver users unnecessarily calculate a digest for that bank.
> > > 
> > > One PCR bank with algorithm set to SHA1 is always allocated for TPM 1.x.
> > > 
> > > As a consequence of the introduction of nr_allocated_banks,
> > > tpm_pcr_extend() does not check anymore if the algorithm stored in 
> > > tpm_chip
> > > is equal to zero.
> > > 
> > > Signed-off-by: Roberto Sassu 
> > > Tested-by: Jarkko Sakkinen 
> > > ---
> > >   drivers/char/tpm/tpm-chip.c  |  1 +
> > >   drivers/char/tpm/tpm-interface.c | 18 +
> > >   drivers/char/tpm/tpm.h   |  3 ++-
> > >   drivers/char/tpm/tpm1-cmd.c  | 10 ++
> > >   drivers/char/tpm/tpm2-cmd.c  | 34 ++--
> > >   5 files changed, 47 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > > index 32db84683c40..ce851c62bb68 100644
> > > --- a/drivers/char/tpm/tpm-chip.c
> > > +++ b/drivers/char/tpm/tpm-chip.c
> > > @@ -160,6 +160,7 @@ static void tpm_dev_release(struct device *dev)
> > >   kfree(chip->log.bios_event_log);
> > >   kfree(chip->work_space.context_buf);
> > >   kfree(chip->work_space.session_buf);
> > > + kfree(chip->allocated_banks);
> > >   kfree(chip);
> > >   }
> > > diff --git a/drivers/char/tpm/tpm-interface.c 
> > > b/drivers/char/tpm/tpm-interface.c
> > > index d9439f9abe78..7b80919228be 100644
> > > --- a/drivers/char/tpm/tpm-interface.c
> > > +++ b/drivers/char/tpm/tpm-interface.c
> > > @@ -488,8 +488,7 @@ EXPORT_SYMBOL_GPL(tpm_pcr_read);
> > >   int tpm_pcr_extend(struct tpm_chip *chip, u32 pcr_idx, const u8 *hash)
> > >   {
> > >   int rc;
> > > - struct tpm2_digest digest_list[ARRAY_SIZE(chip->active_banks)];
> > > - u32 count = 0;
> > > + struct tpm2_digest *digest_list;
> > >   int i;
> > >   chip = tpm_find_get_ops(chip);
> > > @@ -497,16 +496,19 @@ int tpm_pcr_extend(struct tpm_chip *chip, u32 
> > > pcr_idx, const u8 *hash)
> > >   return -ENODEV;
> > >   if (chip->flags & TPM_CHIP_FLAG_TPM2) {
> > > - memset(digest_list, 0, sizeof(digest_list));
> > > + digest_list = kcalloc(chip->nr_allocated_banks,
> > > +   sizeof(*digest_list), GFP_KERNEL);
> > > + if (!digest_list)
> > > + return -ENOMEM;
> > 
> > You could preallocate digest list and place it to struct tpm_chip
> > instead of doing it everytime tpm_pcr_extend() called.
> 
> This part will be removed with patch 5/5.

Even if it did, it does not make this patch unbroken.

/Jarkko


[PATCH v6 7/7] cgroup: document cgroup v2 freezer interface

2018-12-21 Thread Roman Gushchin
Describe cgroup v2 freezer interface in the cgroup v2 admin guide.

Signed-off-by: Roman Gushchin 
Reviewed-by: Mike Rapoport 
Cc: Tejun Heo 
Cc: linux-...@vger.kernel.org
Cc: kernel-t...@fb.com
---
 Documentation/admin-guide/cgroup-v2.rst | 27 +
 1 file changed, 27 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst 
b/Documentation/admin-guide/cgroup-v2.rst
index 07e06136a550..f8335e26b362 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -864,6 +864,8 @@ All cgroup core files are prefixed with "cgroup."
  populated
1 if the cgroup or its descendants contains any live
processes; otherwise, 0.
+ frozen
+   1 if the cgroup is frozen; otherwise, 0.
 
   cgroup.max.descendants
A read-write single value files.  The default is "max".
@@ -897,6 +899,31 @@ All cgroup core files are prefixed with "cgroup."
A dying cgroup can consume system resources not exceeding
limits, which were active at the moment of cgroup deletion.
 
+  cgroup.freeze
+   A read-write single value file which exists on non-root cgroups.
+   Allowed values are "0" and "1". The default is "0".
+
+   Writing "1" to the file causes freezing of the cgroup and all
+   descendant cgroups. This means that all belonging processes will
+   be stopped and will not run until the cgroup will be explicitly
+   unfrozen. Freezing of the cgroup may take some time; when this action
+   is completed, the "frozen" value in the cgroup.events control file
+   will be updated to "1" and the corresponding notification will be
+   issued.
+
+   A cgroup can be frozen either by its own settings, or by settings
+   of any ancestor cgroups. If any of ancestor cgroups is frozen, the
+   cgroup will remain frozen.
+
+   Processes in the frozen cgroup can be killed by a fatal signal.
+   They also can enter and leave a frozen cgroup: either by an explicit
+   move by a user, or if freezing of the cgroup races with fork().
+   If a process is moved to a frozen cgroup, it stops. If a process is
+   moved out of a frozen cgroup, it becomes running.
+
+   Frozen status of a cgroup doesn't affect any cgroup tree operations:
+   it's possible to delete a frozen (and empty) cgroup, as well as
+   create new sub-cgroups.
 
 Controllers
 ===
-- 
2.19.2



[PATCH v6 3/7] cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock

2018-12-21 Thread Roman Gushchin
The number of descendant cgroups and the number of dying
descendant cgroups are currently synchronized using the cgroup_mutex.

The number of descendant cgroups will be required by the cgroup v2
freezer, which will use it to determine if a cgroup is frozen
(depending on total number of descendants and number of frozen
descendants). It's not always acceptable to grab the cgroup_mutex,
especially from quite hot paths (e.g. exit()).

To avoid this, let's additionally synchronize these counters using
the css_set_lock.

So, it's safe to read these counters with either cgroup_mutex or
css_set_lock locked, and for changing both locks should be acquired.

Signed-off-by: Roman Gushchin 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
---
 include/linux/cgroup-defs.h | 5 +
 kernel/cgroup/cgroup.c  | 6 ++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 8fcbae1b8db0..03355d7008ff 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -348,6 +348,11 @@ struct cgroup {
 * Dying cgroups are cgroups which were deleted by a user,
 * but are still existing because someone else is holding a reference.
 * max_descendants is a maximum allowed number of descent cgroups.
+*
+* nr_descendants and nr_dying_descendants are protected
+* by cgroup_mutex and css_set_lock. It's fine to read them holding
+* any of cgroup_mutex and css_set_lock; for writing both locks
+* should be held.
 */
int nr_descendants;
int nr_dying_descendants;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 7519a4307021..f89dde50f693 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4723,9 +4723,11 @@ static void css_release_work_fn(struct work_struct *work)
if (cgroup_on_dfl(cgrp))
cgroup_rstat_flush(cgrp);
 
+   spin_lock_irq(_set_lock);
for (tcgrp = cgroup_parent(cgrp); tcgrp;
 tcgrp = cgroup_parent(tcgrp))
tcgrp->nr_dying_descendants--;
+   spin_unlock_irq(_set_lock);
 
cgroup_idr_remove(>root->cgroup_idr, cgrp->id);
cgrp->id = -1;
@@ -4943,12 +4945,14 @@ static struct cgroup *cgroup_create(struct cgroup 
*parent)
if (ret)
goto out_psi_free;
 
+   spin_lock_irq(_set_lock);
for (tcgrp = cgrp; tcgrp; tcgrp = cgroup_parent(tcgrp)) {
cgrp->ancestor_ids[tcgrp->level] = tcgrp->id;
 
if (tcgrp != cgrp)
tcgrp->nr_descendants++;
}
+   spin_unlock_irq(_set_lock);
 
if (notify_on_release(parent))
set_bit(CGRP_NOTIFY_ON_RELEASE, >flags);
@@ -5233,10 +5237,12 @@ static int cgroup_destroy_locked(struct cgroup *cgrp)
if (parent && cgroup_is_threaded(cgrp))
parent->nr_threaded_children--;
 
+   spin_lock_irq(_set_lock);
for (tcgrp = cgroup_parent(cgrp); tcgrp; tcgrp = cgroup_parent(tcgrp)) {
tcgrp->nr_descendants--;
tcgrp->nr_dying_descendants++;
}
+   spin_unlock_irq(_set_lock);
 
cgroup1_check_for_release(parent);
 
-- 
2.19.2



[PATCH v6 2/7] cgroup: implement __cgroup_task_count() helper

2018-12-21 Thread Roman Gushchin
The helper is identical to the existing cgroup_task_count()
except it doesn't take the css_set_lock by itself, assuming
that the caller does.

Also, move cgroup_task_count() implementation into
kernel/cgroup/cgroup.c, as there is nothing specific to cgroup v1.

Signed-off-by: Roman Gushchin 
Cc: Tejun Heo 
Cc: kernel-t...@fb.com
---
 kernel/cgroup/cgroup-internal.h |  1 +
 kernel/cgroup/cgroup-v1.c   | 16 
 kernel/cgroup/cgroup.c  | 33 +
 3 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index c950864016e2..a195328431ce 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -226,6 +226,7 @@ int cgroup_rmdir(struct kernfs_node *kn);
 int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
 struct kernfs_root *kf_root);
 
+int __cgroup_task_count(const struct cgroup *cgrp);
 int cgroup_task_count(const struct cgroup *cgrp);
 
 /*
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 51063e7a93c2..6134fef07d57 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -336,22 +336,6 @@ static struct cgroup_pidlist 
*cgroup_pidlist_find_create(struct cgroup *cgrp,
return l;
 }
 
-/**
- * cgroup_task_count - count the number of tasks in a cgroup.
- * @cgrp: the cgroup in question
- */
-int cgroup_task_count(const struct cgroup *cgrp)
-{
-   int count = 0;
-   struct cgrp_cset_link *link;
-
-   spin_lock_irq(_set_lock);
-   list_for_each_entry(link, >cset_links, cset_link)
-   count += link->cset->nr_tasks;
-   spin_unlock_irq(_set_lock);
-   return count;
-}
-
 /*
  * Load a cgroup's pidarray with either procs' tgids or tasks' pids
  */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e06994fd4e34..7519a4307021 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -563,6 +563,39 @@ static void cgroup_get_live(struct cgroup *cgrp)
css_get(>self);
 }
 
+/**
+ * __cgroup_task_count - count the number of tasks in a cgroup. The caller
+ * is responsible for taking the css_set_lock.
+ * @cgrp: the cgroup in question
+ */
+int __cgroup_task_count(const struct cgroup *cgrp)
+{
+   int count = 0;
+   struct cgrp_cset_link *link;
+
+   lockdep_assert_held(_set_lock);
+
+   list_for_each_entry(link, >cset_links, cset_link)
+   count += link->cset->nr_tasks;
+
+   return count;
+}
+
+/**
+ * cgroup_task_count - count the number of tasks in a cgroup.
+ * @cgrp: the cgroup in question
+ */
+int cgroup_task_count(const struct cgroup *cgrp)
+{
+   int count;
+
+   spin_lock_irq(_set_lock);
+   count = __cgroup_task_count(cgrp);
+   spin_unlock_irq(_set_lock);
+
+   return count;
+}
+
 struct cgroup_subsys_state *of_css(struct kernfs_open_file *of)
 {
struct cgroup *cgrp = of->kn->parent->priv;
-- 
2.19.2



[PATCH v6 0/7] freezer for cgroup v2

2018-12-21 Thread Roman Gushchin
This patchset implements freezer for cgroup v2.

It provides similar functionality as v1 freezer, but the interface
conforms to the cgroup v2 interface design principles, and it
provides a better user experience: tasks can be killed, ptrace works,
there is no separate controller, which has to be enabled, etc.

Patches (1), (2) and (3) are some preparational work, patch (4) contains
the implementation, patch (5) is a small cgroup kselftest fix,
patch (6) covers freezer adds 6 new kselftests to cover the freezer
functionality. Patch (7) adds corresponding docs.

v6->v5:
  - reverted to clear TIF_SIGPENDING with additional checks before schedule(),
  as proposed by Oleg Nesterov
  - made cgroup v2 freezer working with the system freezer (by using
  freezable_schedule())
  - make freezer working with SIGSTOPped and PTRACEd tasks
  - added tests to cover freezing a cgroup with SIGSTOPped and PTRACEd tasks

v5->v4:
  - rewored cgroup state transition code (suggested by Tejun Heo)
  - look at JOBCTL_TRAP_FREEZE instead of task->frozen in
recalc_sigpending(), check for task->frozen and JOBCTL_TRAP_FREEZE
in signal_pending_state() (suggested by Oleg Nesterov)
  - some cosmetic changes in signal.c (suggested by Oleg Nesterov)
  - cleaned up comments

v4->v3:
  - reading nr_descendants doesn't require taking css_set_lock anymore
  - fixed docs based on Mike Rapoport's feedback
  - fixed double irq lock found by Dan Carpenter

v3->v2:
  - dropped TASK_FROZEN for now, frozen tasks are put into TASK_INTERRUPTIBLE
  state; it's probably not the final version, but the API question can be
  discussed separately
  - don't clear TIF_SIGPENDING before going to sleep, instead add
  task->frozen check in signal_pending_state() and recalc_sigpending()
  - cgroup-level counter are now synchronized using css_set_lock,
  which simplified the whole code (e.g. per-cgroup works were removed)
  - the amount of comments increased significantly
  - many other improvements incorporating feedback from Tejun and Oleg

v2->v1:
  - fixed locking aroung calling cgroup_freezer_leave()
  - added docs

Roman Gushchin (7):
  cgroup: rename freezer.c into legacy_freezer.c
  cgroup: implement __cgroup_task_count() helper
  cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock
  cgroup: cgroup v2 freezer
  kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()
  kselftests: cgroup: add freezer controller self-tests
  cgroup: document cgroup v2 freezer interface

 Documentation/admin-guide/cgroup-v2.rst   |  27 +
 include/linux/cgroup-defs.h   |  39 +
 include/linux/cgroup.h|  46 +
 include/linux/sched.h |   4 +
 include/linux/sched/jobctl.h  |   2 +
 kernel/cgroup/Makefile|   4 +-
 kernel/cgroup/cgroup-internal.h   |   1 +
 kernel/cgroup/cgroup-v1.c |  16 -
 kernel/cgroup/cgroup.c| 145 +++-
 kernel/cgroup/freezer.c   | 650 ++
 kernel/cgroup/legacy_freezer.c| 481 ++
 kernel/signal.c   |  93 +-
 tools/testing/selftests/cgroup/.gitignore |   1 +
 tools/testing/selftests/cgroup/Makefile   |   2 +
 tools/testing/selftests/cgroup/cgroup_util.c  |  85 +-
 tools/testing/selftests/cgroup/cgroup_util.h  |   7 +
 tools/testing/selftests/cgroup/test_freezer.c | 821 ++
 17 files changed, 1993 insertions(+), 431 deletions(-)
 create mode 100644 kernel/cgroup/legacy_freezer.c
 create mode 100644 tools/testing/selftests/cgroup/test_freezer.c

-- 
2.19.2



Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

2018-12-21 Thread Jarkko Sakkinen
On Fri, Dec 21, 2018 at 10:28:09AM -0800, Sean Christopherson wrote:
> > Why would you want to pass EPC through user space to KVM rather than
> > KVM allocating it through kernel interfaces?
> 
> Delegating EPC management to userspace fits better with KVM's existing
> memory ABI.  KVM provides a single ioctl(), KVM_SET_USER_MEMORY_REGION[1],
> that allows userspace to create, move, modify and delete memory regions.
> 
> Skipping over a lot of details, there are essentially three options for
> exposing EPC to a KVM guest:
> 
>  1) Provide a dedicated KVM ioctl() to manage EPC without routing it
> through KVM_SET_USER_MEMORY_REGION.
> 
>  2) Add a flag to 'struct kvm_userspace_memory_region' that denotes an
> EPC memory region and mmap() / allocate EPC in KVM.
> 
>  3) Provide an ABI to allocate raw EPC and let userspace manage it like
> any other memory region.
> 
> Option (1) requires duplicating all of KVM_SET_USER_MEMORY_REGION's
> functionality unless the ioctl() is severly restricted.
> 
> Option (2) is an ugly abuse of KVM_SET_USER_MEMORY_REGION since the EPC
> flag would have completely different semantics than all other usage of
> KVM_SET_USER_MEMORY_REGION.
> 
> Thus, option (3).

OK, thank you for patience explaining this.

> Probably a better question to answer is why provide the ABI through
> /dev/sgx and not /dev/kvm.  IMO /dev/sgx is a more logical way to
> advertise support to userspace, e.g. userspace can simply check if
> /dev/sgx (or /dev/sgx/epc) exists vs. probing a KVM capability.

You have to understand that for a KVM non-expert like me it was really
important to get the context, which you kindly gave. I have never used
KVM's memory management API but now that I know how it works all of this
makes perfect sense. This is not a better question but it is definitely
a good follow up question :-)

I don't really understand you deduction here, however. If SGX was not
supported, why couldn't the hypothetical /dev/kvm functionality just
return an error?

For me it sounds a bit messy that KVM functionality, which is a client
to the SGX functionality, places some of its functionality to the SGX
core.

/Jarkko


Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 15:44:41 -0800
Joe Perches  wrote:

> On Fri, 2018-12-21 at 18:25 -0500, Steven Rostedt wrote:
> > On Fri, 21 Dec 2018 15:19:33 -0800
> > Joe Perches  wrote:
> >   
> > > I believe this should be bool.
> > > 
> > > I don't find a use for non-zero assigned len value in the kernel
> > > for strncmp and I believe the function should simply be:
> > > 
> > > static inline bool str_has_prefix(const char *str, const char prefix[])
> > > {
> > >   return !strncmp(str, prefix, strlen(prefix));
> > > }  
> > 
> > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c  
> []
> > @@ -172,8 +172,8 @@ cache_type_store(struct device *dev, struct 
> > device_attribute *attr,
> >  * it's not worth the risk */
> > return -EINVAL;
> >  
> > -   if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
> > -   buf += sizeof(temp) - 1;
> > +   if ((len = str_has_prefix(buf, temp))) {
> > +   buf += len;  
> 
> That's not really a use of the non-zero strncmp return value.
> 
> You are attempting an optimization not already done.
> I also wonder if it's actually an optimization as the
> return value may not be precomputed.

Note, temp is this:

static const char temp[] = "temporary ";

> 
> Also the assignment in the test isn't preferred style.

We could have two helper functions:

static __always_inline bool
str_has_prefix(const char *str, const char *prefix)
{
return strncmp(str, prefix, strlen(prefix));
}

and a 

static __always_inline bool
str_has_prefix_len(const char *str, const char *prefix, unsigned int *len)
{
*len = strlen(prefix);
return strncmp(str, prefix, *len);
}

This was my original thought with the first patches. But when Linus
suggested changing the style from the strncmp() I thought it was a way
to encapsulate the two.

Either way, but yes, I do want a way to do the compare and calculate
the length all in one function. That even makes checking options easier
to get to:

if (str_has_prefix_len(cmdline, "param=", )) {
value = cmdline + len;


> 
> > And there's more places like this.  
> 
> Any where the non-zero return value is actually used?
> 
> > > It's hard to believe __always_inline vs inline matters 
> > > for any single line function.  
> > 
> > I've been burnt by gcc deciding to not inline single functions before.  
> 
> Complex single functions sure, but single line inlines?
> I haven't seen that externed anywhere.
> 
> Today no inline function is marked __always_inline in
> string.h
> 
> I don't doubt there should be some standardization
> of inline vs __always_inline in the kernel, but this
> right now seems different just for difference sake.

I got burnt by some crazy gcc config options making local_irq_save()
become a out of line function, and that cause crazy crap to happen with
the function tracer.

Now inlining here is just for guaranteeing that strlen() gets turned
into a constant for constant strings and wont do anything harmful if
that doesn't happen (but slightly slow things down). But again, it
doesn't hurt to have the __always_inline. Why are you so dead against
it? You haven't stated your rational for that.

-- Steve



Re: rfc: bool structure members (was Re: [PATCH V3] net/mlx4: Get rid of page operation after dma_alloc_coherent)

2018-12-21 Thread Jason Gunthorpe
On Thu, Dec 20, 2018 at 09:12:43PM -0800, Joe Perches wrote:
> Care to submit a coding_style.rst patch or
> improve the one below this?

I took yours and revised it a little bit. I spent some time looking at
assembly and decided to drop the performance note, the number of cases
that run into overhead seems pretty small and probably already
requires !! to be correct. There is also an equally unlikely gain, ie
'if (a & b)' optimizes a tiny bit better for bool types.

I also added a small intro on bool, as I know some people are
unfamiliar with C11 _Bool and might think bool is just '#define bool
u8'

(for those added to the cc) I'm looking at cases, like the patch that
spawned this, where the struct has a single bool and no performance
considerations. As CH said, seeing that get converted to int due to
checkpatch is worse than having used bool. Using u8 won't make this
struct smaller or faster.

Cheers,
Jason

>From c5e2c887f6326e1c4369876f39996417da5e90cc Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe 
Date: Fri, 21 Dec 2018 16:29:22 -0700
Subject: [PATCH] coding-style: Clarify the expectations around bool

There has been some confusion since checkpatch started warning about bool
use in structures, and people have been avoiding using it.

Many people feel there is still a legitimate place for bool in structures,
so provide some guidance on bool usage derived from the entire thread that
spawned the checkpatch warning.

Link: 
https://lkml.kernel.org/r/ca+55afwvzk1ofb9t2v014ptakfhtvan_zj2dojncy3x6e4u...@mail.gmail.com
Signed-off-by: Joe Perches 
Signed-off-by: Jason Gunthorpe 
---
 Documentation/process/coding-style.rst | 33 ++
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/Documentation/process/coding-style.rst 
b/Documentation/process/coding-style.rst
index 4e7c0a1c427a9a..efdb1d32035fe1 100644
--- a/Documentation/process/coding-style.rst
+++ b/Documentation/process/coding-style.rst
@@ -918,7 +918,32 @@ result.  Typical examples would be functions that return 
pointers; they use
 NULL or the ERR_PTR mechanism to report failure.
 
 
-17) Don't re-invent the kernel macros
+17) Using bool
+--
+
+The Linux kernel uses the C11 standard for the bool type. bool values can only
+evaluate to 0 or 1, and implicit or explicit conversion to bool automatically
+converts the value to true or false. When using bool types the !! construction
+is not needed, which eliminates a class of bugs.
+
+When working with bool values the true and false labels should be used instead
+of 0 and 1.
+
+bool function return types, arguments and stack variables are always fine to
+use whenever appropriate. Use of bool is encouraged to improve readability and
+is often a better option than 'int' for storing boolean values.
+
+Do not use bool if cache line layout or size of the value matters, its size
+and alignment varies based on the compiled architecture. Structures that are
+optimized for alignment and size should not use bool.
+
+If a structure has many true/false values, consider consolidating them into a
+bitfield with 1 bit members, or using an appropriate fixed width type, such as
+u8.
+
+Otherwise limited use of bool in structures does improve readability.
+
+18) Don't re-invent the kernel macros
 -
 
 The header file include/linux/kernel.h contains a number of macros that
@@ -941,7 +966,7 @@ need them.  Feel free to peruse that header file to see 
what else is already
 defined that you shouldn't reproduce in your code.
 
 
-18) Editor modelines and other cruft
+19) Editor modelines and other cruft
 
 
 Some editors can interpret configuration information embedded in source files,
@@ -975,7 +1000,7 @@ own custom mode, or may have some other magic method for 
making indentation
 work correctly.
 
 
-19) Inline assembly
+20) Inline assembly
 ---
 
 In architecture-specific code, you may need to use inline assembly to interface
@@ -1007,7 +1032,7 @@ the next instruction in the assembly output:
 : /* outputs */ : /* inputs */ : /* clobbers */);
 
 
-20) Conditional Compilation
+21) Conditional Compilation
 ---
 
 Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
-- 
2.19.2



Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Joe Perches
On Fri, 2018-12-21 at 18:25 -0500, Steven Rostedt wrote:
> On Fri, 21 Dec 2018 15:19:33 -0800
> Joe Perches  wrote:
> 
> > I believe this should be bool.
> > 
> > I don't find a use for non-zero assigned len value in the kernel
> > for strncmp and I believe the function should simply be:
> > 
> > static inline bool str_has_prefix(const char *str, const char prefix[])
> > {
> > return !strncmp(str, prefix, strlen(prefix));
> > }
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
[]
> @@ -172,8 +172,8 @@ cache_type_store(struct device *dev, struct 
> device_attribute *attr,
>* it's not worth the risk */
>   return -EINVAL;
>  
> - if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
> - buf += sizeof(temp) - 1;
> + if ((len = str_has_prefix(buf, temp))) {
> + buf += len;

That's not really a use of the non-zero strncmp return value.

You are attempting an optimization not already done.
I also wonder if it's actually an optimization as the
return value may not be precomputed.

Also the assignment in the test isn't preferred style.

> And there's more places like this.

Any where the non-zero return value is actually used?

> > It's hard to believe __always_inline vs inline matters 
> > for any single line function.
> 
> I've been burnt by gcc deciding to not inline single functions before.

Complex single functions sure, but single line inlines?
I haven't seen that externed anywhere.

Today no inline function is marked __always_inline in
string.h

I don't doubt there should be some standardization
of inline vs __always_inline in the kernel, but this
right now seems different just for difference sake.

cheers, Joe



Re: x86/sgx: uapi change proposal

2018-12-21 Thread Jarkko Sakkinen
On Fri, Dec 21, 2018 at 10:24:54AM -0800, Sean Christopherson wrote:
> On Fri, Dec 21, 2018 at 09:12:46AM -0800, Andy Lutomirski wrote:
> > > On Dec 21, 2018, at 9:28 AM, Sean Christopherson 
> > >  wrote:
> > >
> > > On Wed, Dec 19, 2018 at 06:58:48PM -0800, Andy Lutomirski wrote:
> > >>> On Dec 19, 2018, at 6:45 AM, Sean Christopherson 
> > >>>  wrote:
> > >>>
> > > On Wed, Dec 19, 2018 at 09:36:16AM +, Jethro Beekman wrote:
> > >>>
> > >>> I agree with Jethro, passing the enclave_fd as a param is obnoxious.
> > >>> And it means the user needs to open /dev/sgx to do anything with an
> > >>> enclave fd, e.g. the enclave fd might be passed to a builder thread,
> > >>> it shouldn't also need the device fd.
> > >>>
> > >>> E.g.:
> > >>>
> > >>>   sgx_fd = open("/dev/sgx", O_RDWR);
> > >>>   BUG_ON(sgx_fd < 0);
> > >>>
> > >>>   enclave_fd = ioctl(sgx_fd, SGX_ENCLAVE_CREATE, );
> > >>>   BUG_ON(enclave_fd < 0);
> > >>>
> > >>>   ret = ioctl(enclave_fd, SGX_ENCLAVE_ADD_PAGE, );
> > >>>   BUG_ON(ret);
> > >>>
> > >>>   ...
> > >>>
> > >>>   ret = ioctl(enclave_fd, SGX_ENCLAVE_INIT, );
> > >>>   BUG_ON(ret);
> > >>>
> > >>>   ...
> > >>>
> > >>>   close(enclave_fd);
> > >>>   close(sgx_fd);
> > >>>
> > >>>
> > >>> Take a look at virt/kvm/kvm_main.c to see how KVM manages anon inodes
> > >>> and ioctls for VMs and vCPUs.
> > >>
> > >> Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > >> opening a new instance of /dev/sgx for each encalve?
> > >
> > > Directly associating /dev/sgx with an enclave means /dev/sgx can't be
> > > used to provide ioctl()'s for other SGX-related needs, e.g. to mmap()
> > > raw EPC and expose it a VM.  Proposed layout in the link below.  I'll
> > > also respond to Jarkko's question about exposing EPC through /dev/sgx
> > > instead of having KVM allocate it on behalf of the VM.
> > 
> > Hmm. I guess this makes some sense.  My instinct would be to do it a
> > little differently and have:
> > 
> > /dev/sgx/enclave: Each instance is an enclave.
> > 
> > /dev/sgx/epc: Used to get raw EPC for KVM. Might have different
> > permissions, perhaps 0660 and group kvm.
> > 
> > /dev/sgx/something_else: For when SGX v3 adds something else :)
> 
> Mmmm, I like this approach a lot.  It would allow userspace to easily
> manage permissions for each "feature", e.g. give all users access to
> /dev/sgx/epc but restrict /dev/sgx/enclave.
> 
> And we could add e.g. /dev/sgx/admin if we wanted to exposed ioctls()
> that apply to all aspects of SGX.
> 
> Do you know if /dev/sgx/epc could be dynamically created, e.g. by
> KVM when the kvm_intel module is loaded?  That would seal the deal for
> me as it'd keep open the option of having KVM handle oversubscription
> of guest EPC while exposing EPC through /dev/sgx instead of /dev/kvm.  

No issues with this approach from my side but won't use it for v19.

Please share these comments when reviewing v19. Neither objections to
have this interface.

/Jarkko


Re: [PATCH v2] hwmon: (lm80) fix a missing check of bus read in lm80 probe

2018-12-21 Thread Guenter Roeck
On Fri, Dec 21, 2018 at 01:10:39PM -0600, Kangjie Lu wrote:
> In lm80_probe(), if lm80_read_value() fails, it returns a negative
> error number which is stored to data->fan[f_min] and will be further
> used. We should avoid using the data if the read fails.
> 
> The fix checks if lm80_read_value() fails, and if so, returns with the
> error number.
> 
> Signed-off-by: Kangjie Lu 

Applied.

Thanks,
Guenter

> ---
>  drivers/hwmon/lm80.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwmon/lm80.c b/drivers/hwmon/lm80.c
> index d91333557f04..85f797501aaf 100644
> --- a/drivers/hwmon/lm80.c
> +++ b/drivers/hwmon/lm80.c
> @@ -627,6 +627,7 @@ static int lm80_probe(struct i2c_client *client,
>   struct device *dev = >dev;
>   struct device *hwmon_dev;
>   struct lm80_data *data;
> + int rv;
>  
>   data = devm_kzalloc(dev, sizeof(struct lm80_data), GFP_KERNEL);
>   if (!data)
> @@ -639,8 +640,14 @@ static int lm80_probe(struct i2c_client *client,
>   lm80_init_client(client);
>  
>   /* A few vars need to be filled upon startup */
> - data->fan[f_min][0] = lm80_read_value(client, LM80_REG_FAN_MIN(1));
> - data->fan[f_min][1] = lm80_read_value(client, LM80_REG_FAN_MIN(2));
> + rv = lm80_read_value(client, LM80_REG_FAN_MIN(1));
> + if (rv < 0)
> + return rv;
> + data->fan[f_min][0] = rv;
> + rv = lm80_read_value(client, LM80_REG_FAN_MIN(2));
> + if (rv < 0)
> + return rv;
> + data->fan[f_min][1] = rv;
>  
>   hwmon_dev = devm_hwmon_device_register_with_groups(dev, client->name,
>  data, lm80_groups);


Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 18:25:07 -0500
Steven Rostedt  wrote:

> On Fri, 21 Dec 2018 15:19:33 -0800
> Joe Perches  wrote:
> 
> > I believe this should be bool.
> > 
> > I don't find a use for non-zero assigned len value in the kernel
> > for strncmp and I believe the function should simply be:
> > 
> > static inline bool str_has_prefix(const char *str, const char prefix[])
> > {
> > return !strncmp(str, prefix, strlen(prefix));
> > }  
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 3bb2b3351e35..e4566b9c2553 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -172,8 +172,8 @@ cache_type_store(struct device *dev, struct 
> device_attribute *attr,
>* it's not worth the risk */
>   return -EINVAL;
>  
> - if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
> - buf += sizeof(temp) - 1;
> + if ((len = str_has_prefix(buf, temp))) {
> + buf += len;
>   sdkp->cache_override = 1;
>   } else {
>   sdkp->cache_override = 0;
> 
> And there's more places like this.
>

Heck, even the tracing code would like this. From the code that started
this entire discussion:

-- Steve

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 9d590138f870..632410c6e6c0 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -4342,12 +4342,13 @@ static int parse_actions(struct hist_trigger_data 
*hist_data)
unsigned int i;
int ret = 0;
char *str;
+   int len;
 
for (i = 0; i < hist_data->attrs->n_actions; i++) {
str = hist_data->attrs->action_str[i];
 
-   if (str_has_prefix(str, "onmatch(")) {
-   char *action_str = str + sizeof("onmatch(") - 1;
+   if ((len = str_has_prefix(str, "onmatch("))) {
+   char *action_str = str + len;
 
data = onmatch_parse(tr, action_str);
if (IS_ERR(data)) {
@@ -4355,8 +4356,8 @@ static int parse_actions(struct hist_trigger_data 
*hist_data)
break;
}
data->fn = action_trace;
-   } else if (str_has_prefix(str, "onmax(")) {
-   char *action_str = str + sizeof("onmax(") - 1;
+   } else if ((len = str_has_prefix(str, "onmax("))) {
+   char *action_str = str + len;
 
data = onmax_parse(action_str);
if (IS_ERR(data)) {


Re: x86/sgx: uapi change proposal

2018-12-21 Thread Jarkko Sakkinen
On Fri, Dec 21, 2018 at 09:12:46AM -0800, Andy Lutomirski wrote:
> Hmm. I guess this makes some sense.  My instinct would be to do it a
> little differently and have:
> 
> /dev/sgx/enclave: Each instance is an enclave.
> 
> /dev/sgx/epc: Used to get raw EPC for KVM. Might have different
> permissions, perhaps 0660 and group kvm.
> 
> /dev/sgx/something_else: For when SGX v3 adds something else :)

If I make a draw by saying that I will go with the "ioctls for enclave
fd" as I'm already making good progress in implementation for v19 and we
will look at it?

Does not look like a fruitful conversation to continue forward without
functional code. I will also update my selftest (as it is part of the
patch set) to align with whatever we have so you can immediately run
something.

And since no one is giving me anything at all on swapping but instead
cutting hairs on here, I will lock in (at least for v19) to use shmem
again.

Sounds like a plan? All the internals will work for whatever mess we
want to introduce to the uapi.

/Jarkko


Re: [PATCH v2] hwmon: (lm80) fix a missing check of the status of SMBus read

2018-12-21 Thread Guenter Roeck
On Fri, Dec 21, 2018 at 01:01:33PM -0600, Kangjie Lu wrote:
> If lm80_read_value() fails, it returns a negative number instead of the
> correct read data. Therefore, we should avoid using the data if it
> fails.
> 
> The fix checks if lm80_read_value() fails, and if so, returns with the
> error number.
> 
> Signed-off-by: Kangjie Lu 
> ---
[ change log goes here ]

>  drivers/hwmon/lm80.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hwmon/lm80.c b/drivers/hwmon/lm80.c
> index 08e3945a6fbf..d91333557f04 100644
> --- a/drivers/hwmon/lm80.c
> +++ b/drivers/hwmon/lm80.c
> @@ -360,6 +360,7 @@ static ssize_t set_fan_div(struct device *dev, struct 
> device_attribute *attr,
>   struct i2c_client *client = data->client;
>   unsigned long min, val;
>   u8 reg;
> + int rv;
>   int err = kstrtoul(buf, 10, );
>   if (err < 0)
>   return err;

Now we have 'rv' and 'err'.

My earlier comments didn't mean to suggest that we should now have two
different variables to handle errors.

Never mind, I'll fix it all up myself. No need to resend.

Guenter

> @@ -390,8 +391,11 @@ static ssize_t set_fan_div(struct device *dev, struct 
> device_attribute *attr,
>   return -EINVAL;
>   }
>  
> - reg = (lm80_read_value(client, LM80_REG_FANDIV) &
> -~(3 << (2 * (nr + 1 | (data->fan_div[nr] << (2 * (nr + 1)));
> + rv = lm80_read_value(client, LM80_REG_FANDIV);
> + if (rv < 0)
> + return rv;
> + reg = (rv & ~(3 << (2 * (nr + 1
> + | (data->fan_div[nr] << (2 * (nr + 1)));
>   lm80_write_value(client, LM80_REG_FANDIV, reg);
>  
>   /* Restore fan_min */


mmotm 2018-12-21-15-28 uploaded

2018-12-21 Thread akpm
The mm-of-the-moment snapshot 2018-12-21-15-28 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (4.x
or 4.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE--mm-dd-hh-mm-ss.  Both contain the string -mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.

A git tree which contains the memory management portion of this tree is
maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
by Michal Hocko.  It contains the patches which are between the
"#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
file, http://www.ozlabs.org/~akpm/mmotm/series.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/

To develop on top of mmotm git:

  $ git remote add mmotm 
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
  $ git remote update mmotm
  $ git checkout -b topic mmotm/master
  
  $ git send-email mmotm/master.. [...]

To rebase a branch with older patches to a new mmotm release:

  $ git remote update mmotm
  $ git rebase --onto mmotm/master  topic




The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 4.20-rc7:
(patches marked "*" will be included in linux-next)

  origin.patch
* mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section.patch
* mm-thp-fix-flags-for-pmd-migration-when-split.patch
* forkmemcg-fix-crash-in-free_thread_stack-on-memcg-charge-fail.patch
* mm-page_alloc-fix-has_unmovable_pages-for-hugepages.patch
* checkpatch-dont-interpret-stack-dumps-as-commit-ids.patch
* mm-thp-always-specify-disabled-vmas-as-nh-in-smaps.patch
* mm-memcg-fix-reclaim-deadlock-with-writeback.patch
* arm-arch-arm-include-asm-pageh-needs-personalityh.patch
* kasan-mm-change-hooks-signatures.patch
* kasan-slub-handle-pointer-tags-in-early_kmem_cache_node_alloc.patch
* kasan-move-common-generic-and-tag-based-code-to-commonc.patch
* kasan-rename-source-files-to-reflect-the-new-naming-scheme.patch
* kasan-add-config_kasan_generic-and-config_kasan_sw_tags.patch
* kasan-arm64-adjust-shadow-size-for-tag-based-mode.patch
* kasan-rename-kasan_zero_page-to-kasan_early_shadow_page.patch
* kasan-initialize-shadow-to-0xff-for-tag-based-mode.patch
* arm64-move-untagged_addr-macro-from-uaccessh-to-memoryh.patch
* kasan-add-tag-related-helper-functions.patch
* kasan-arm64-untag-address-in-_virt_addr_is_linear.patch
* kasan-preassign-tags-to-objects-with-ctors-or-slab_typesafe_by_rcu.patch
* kasan-arm64-fix-up-fault-handling-logic.patch
* kasan-arm64-enable-top-byte-ignore-for-the-kernel.patch
* kasan-mm-perform-untagged-pointers-comparison-in-krealloc.patch
* kasan-split-out-generic_reportc-from-reportc.patch
* kasan-add-bug-reporting-routines-for-tag-based-mode.patch
* mm-move-obj_to_index-to-include-linux-slab_defh.patch
* kasan-add-hooks-implementation-for-tag-based-mode.patch
* kasan-arm64-add-brk-handler-for-inline-instrumentation.patch
* kasan-mm-arm64-tag-non-slab-memory-allocated-via-pagealloc.patch
* kasan-add-__must_check-annotations-to-kasan-hooks.patch
* kasan-arm64-select-have_arch_kasan_sw_tags.patch
* kasan-update-documentation.patch
* kasan-add-spdx-license-identifier-mark-to-source-files.patch
* bloat-o-meter-ignore-__addressable_-symbols.patch
* scripts-decodecode-set-arch-when-running-natively-on-arm-arm64.patch
* scripts-decode_stacktrace-only-strip-base-path-when-a-prefix-of-the-path.patch
* checkstackpl-dynamic-stack-growth-for-aarch64.patch
* scripts-add-spdxcheckpy-self-test.patch
* scripts-tags-add-more-declarations.patch
* arch-sh-mach-kfr2r09-fix-struct-mtd_oob_ops-build-warning.patch
* sh-kfr2r09-drop-pointless-static-qualifier-in-kfr2r09_usb0_gadget_setup.patch
* sh-boards-convert-to-spdx-identifiers.patch
* 

Re: [PATCH net-next] staging: octeon: fix build failure with XFRM enabled

2018-12-21 Thread Guenter Roeck
On Fri, Dec 21, 2018 at 09:57:26PM +0100, Florian Westphal wrote:
> skb->sp doesn't exist anymore in the next-next tree, so mips defconfig
> no longer builds.  Use helper instead to reset the secpath.
> 
> Not even compile tested.
> 
It does fix the build error.

Tested-by: Guenter Roeck 

> Cc: Greg Kroah-Hartman 
> Reported-by: Guenter Roeck 
> Fixes: 4165079ba328d ("net: switch secpath to use skb extension 
> infrastructure")
> Signed-off-by: Florian Westphal 
> ---
>  Greg, David:
> 
>  The patch will not break build for a tree that lacks the 'Fixes'
>  commit, so this can also go in via staging tree.
>  OTOH, net-next build is broken for mips/octeon, so I think in
>  this case net-next might make more sense?
> 
> diff --git a/drivers/staging/octeon/ethernet-tx.c 
> b/drivers/staging/octeon/ethernet-tx.c
> index df3441b815bb..317c9720467c 100644
> --- a/drivers/staging/octeon/ethernet-tx.c
> +++ b/drivers/staging/octeon/ethernet-tx.c
> @@ -359,8 +359,7 @@ int cvm_oct_xmit(struct sk_buff *skb, struct net_device 
> *dev)
>   dst_release(skb_dst(skb));
>   skb_dst_set(skb, NULL);
>  #ifdef CONFIG_XFRM
> - secpath_put(skb->sp);
> - skb->sp = NULL;
> + secpath_reset(skb);
>  #endif
>   nf_reset(skb);
>  
> -- 
> 2.19.2
> 


[PATCH] Staging: iio: ad7192: replaced bool in struct

2018-12-21 Thread Amir Mahdi Ghorbanian
Replaced bool in struct with unsigned int bitfield to conserve space and
more clearly define size of varibales

Signed-off-by: Amir Mahdi Ghorbanian 
---
 drivers/staging/iio/adc/ad7192.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/iio/adc/ad7192.h b/drivers/staging/iio/adc/ad7192.h
index 7433a43..7d3e62f 100644
--- a/drivers/staging/iio/adc/ad7192.h
+++ b/drivers/staging/iio/adc/ad7192.h
@@ -35,13 +35,13 @@ struct ad7192_platform_data {
u16 vref_mv;
u8  clock_source_sel;
u32 ext_clk_hz;
-   boolrefin2_en;
-   boolrej60_en;
-   boolsinc3_en;
-   boolchop_en;
-   boolbuf_en;
-   boolunipolar_en;
-   boolburnout_curr_en;
+   unsigned intrefin2_en : 1;
+   unsigned intrej60_en : 1;
+   unsigned intsinc3_en : 1;
+   unsigned intchop_en : 1;
+   unsigned intbuf_en : 1;
+   unsigned intunipolar_en : 1;
+   unsigned intburnout_curr_en : 1;
 };
 
 #endif /* IIO_ADC_AD7192_H_ */
-- 
2.7.4



Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 15:19:33 -0800
Joe Perches  wrote:

> I believe this should be bool.
> 
> I don't find a use for non-zero assigned len value in the kernel
> for strncmp and I believe the function should simply be:
> 
> static inline bool str_has_prefix(const char *str, const char prefix[])
> {
>   return !strncmp(str, prefix, strlen(prefix));
> }

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 3bb2b3351e35..e4566b9c2553 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -172,8 +172,8 @@ cache_type_store(struct device *dev, struct 
device_attribute *attr,
 * it's not worth the risk */
return -EINVAL;
 
-   if (strncmp(buf, temp, sizeof(temp) - 1) == 0) {
-   buf += sizeof(temp) - 1;
+   if ((len = str_has_prefix(buf, temp))) {
+   buf += len;
sdkp->cache_override = 1;
} else {
sdkp->cache_override = 0;

And there's more places like this.

> 
> It's hard to believe __always_inline vs inline matters
> for any single line function.

I've been burnt by gcc deciding to not inline single functions before.
Why take the chance? It also documents that I must be inlined, and not
just a hint.

-- Steve



[PATCH] usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup

2018-12-21 Thread Jack Pham
OUT endpoint requests may somtimes have this flag set when
preparing to be submitted to HW indicating that there is an
additional TRB chained to the request for alignment purposes.
If that request is removed before the controller can execute the
transfer (e.g. ep_dequeue/ep_disable), the request will not go
through the dwc3_gadget_ep_cleanup_completed_request() handler
and will not have its needs_extra_trb flag cleared when
dwc3_gadget_giveback() is called.  This same request could be
later requeued for a new transfer that does not require an
extra TRB and if it is successfully completed, the cleanup
and TRB reclamation will incorrectly process the additional TRB
which belongs to the next request, and incorrectly advances the
TRB dequeue pointer, thereby messing up calculation of the next
requeust's actual/remaining count when it completes.

The right thing to do here is to ensure that the flag is cleared
before it is given back to the function driver.  A good place
to do that is in dwc3_gadget_del_and_unmap_request().

Signed-off-by: Jack Pham 
---
Hi Felipe,

There's probably zero chance this is making it to 4.20, so if you take
this after 4.21-rc1 so be it. But should this be Cc: stable? If so it
needs to be sent separately for <= 4.19 as needs_extra_trb was previously
req->unaligned and req->zero.

Thanks,
Jack

 drivers/usb/dwc3/gadget.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 2ecde30ad0b7..e97b14f444c8 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -177,6 +177,7 @@ static void dwc3_gadget_del_and_unmap_request(struct 
dwc3_ep *dep,
req->started = false;
list_del(>list);
req->remaining = 0;
+   req->needs_extra_trb = false;
 
if (req->request.status == -EINPROGRESS)
req->request.status = status;
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Joe Perches
On Fri, 2018-12-21 at 18:13 -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" 
> 
> A discussion came up in the trace triggers thread about converting a
> bunch of:
> 
>  strncmp(str, "const", sizeof("const") - 1)
> 
> use cases into a helper macro. It started with:
> 
>   strncmp(str, const, sizeof(const) - 1)
> 
> But then Joe Perches mentioned that if a const is not used, the
> sizeof() will be the size of a pointer, which can be bad. And that
> gcc will optimize strlen("const") into "sizeof("const") - 1".
> 
> Thinking about this more, a quick grep in the kernel tree found several
> (thousands!) of cases that use this construct. A quick grep also
> revealed that there's probably several bugs in that use case. Some are
> that people forgot the "- 1" (which I found) and others could be that
> the constant for the sizeof is different than the constant (although, I
> haven't found any of those, but I also didn't look hard).
> 
> I figured the best thing to do is to create a helper macro and place it
> into include/linux/string.h. And go around and fix all the open coded
> versions of it later.
> 
> Note, gcc appears to optimize this when we make it into an always_inline
> static function, which removes a lot of issues that a macro produces.
> 
> Link: 
> http://lkml.kernel.org/r/e3e754f2bd18e56eaa8baf79bee619316ebf4cfc.1545161087.git.tom.zanu...@linux.intel.com
> Link: http://lkml.kernel.org/r/20181219211615.2298e...@gandalf.local.home
> Link: 
> http://lkml.kernel.org/r/CAHk-=wg_sr-uec1ggmkzpypouyanl5cmx4r7ceuav4qmf5j...@mail.gmail.com
> 
> Cc: Tom Zanussi 
> Cc: Greg Kroah-Hartman 
> Suggestions-by: Linus Torvalds 
> Suggestions-by: Joe Perches 
> Suggestions-by: Andreas Schwab 
> Signed-off-by: Steven Rostedt (VMware) 
> ---
> 
> Changes since v2:
> 
>  - Rename the helper function to str_has_prefix()
> (Linus Torvalds and Joe Perches)
> 
>  - Use a static inline, as that appears to still optimize correctly
> (Andreas Schwab)
> 
>  include/linux/string.h | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/include/linux/string.h b/include/linux/string.h
> index 27d0482e5e05..a37859d91b57 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -456,4 +456,24 @@ static inline void memcpy_and_pad(void *dest, size_t 
> dest_len,
>   memcpy(dest, src, dest_len);
>  }
>  
> +/**
> + * str_has_prefix - Test if a string has a given prefix
> + * @str: The string to test
> + * @prefix: The string to see if @str starts with
> + *
> + * A common way to test a prefix of a string is to do:
> + *  strncmp(str, prefix, sizeof(prefix) - 1)
> + *
> + * But this can lead to bugs due to typos, or if prefix is a pointer
> + * and not a constant. Instead use str_has_prefix().
> + *
> + * Returns: 0 if @str does not start with @prefix
> + strlen(@prefix) if @str does start with @prefix
> + */
> +static __always_inline int str_has_prefix(const char *str, const char 
> *prefix)
> +{
> + int len = strlen(prefix);
> + return strncmp(str, prefix, len) == 0 ? len : 0;
> +}

I believe this should be bool.

I don't find a use for non-zero assigned len value in the kernel
for strncmp and I believe the function should simply be:

static inline bool str_has_prefix(const char *str, const char prefix[])
{
return !strncmp(str, prefix, strlen(prefix));
}

It's hard to believe __always_inline vs inline matters
for any single line function.




[PATCH] bcache: fix indentation issue, remove tabs on a hunk of code

2018-12-21 Thread Colin King
From: Colin Ian King 

There is a hunk of code that is indented one level too deep, fix this
by removing the extra tabs.

Signed-off-by: Colin Ian King 
---
 drivers/md/bcache/super.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4dee119c3664..a697a3a923cd 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1615,21 +1615,21 @@ static void conditional_stop_bcache_device(struct 
cache_set *c,
 */
pr_warn("stop_when_cache_set_failed of %s is \"auto\" and cache 
is dirty, stop it to avoid potential data corruption.",
d->disk->disk_name);
-   /*
-* There might be a small time gap that cache set is
-* released but bcache device is not. Inside this time
-* gap, regular I/O requests will directly go into
-* backing device as no cache set attached to. This
-* behavior may also introduce potential inconsistence
-* data in writeback mode while cache is dirty.
-* Therefore before calling bcache_device_stop() due
-* to a broken cache device, dc->io_disable should be
-* explicitly set to true.
-*/
-   dc->io_disable = true;
-   /* make others know io_disable is true earlier */
-   smp_mb();
-   bcache_device_stop(d);
+   /*
+* There might be a small time gap that cache set is
+* released but bcache device is not. Inside this time
+* gap, regular I/O requests will directly go into
+* backing device as no cache set attached to. This
+* behavior may also introduce potential inconsistence
+* data in writeback mode while cache is dirty.
+* Therefore before calling bcache_device_stop() due
+* to a broken cache device, dc->io_disable should be
+* explicitly set to true.
+*/
+   dc->io_disable = true;
+   /* make others know io_disable is true earlier */
+   smp_mb();
+   bcache_device_stop(d);
} else {
/*
 * dc->stop_when_cache_set_failed == BCH_CACHED_STOP_AUTO
-- 
2.19.1



[PATCH v3] string.h: Add str_has_prefix() helper

2018-12-21 Thread Steven Rostedt


From: "Steven Rostedt (VMware)" 

A discussion came up in the trace triggers thread about converting a
bunch of:

 strncmp(str, "const", sizeof("const") - 1)

use cases into a helper macro. It started with:

strncmp(str, const, sizeof(const) - 1)

But then Joe Perches mentioned that if a const is not used, the
sizeof() will be the size of a pointer, which can be bad. And that
gcc will optimize strlen("const") into "sizeof("const") - 1".

Thinking about this more, a quick grep in the kernel tree found several
(thousands!) of cases that use this construct. A quick grep also
revealed that there's probably several bugs in that use case. Some are
that people forgot the "- 1" (which I found) and others could be that
the constant for the sizeof is different than the constant (although, I
haven't found any of those, but I also didn't look hard).

I figured the best thing to do is to create a helper macro and place it
into include/linux/string.h. And go around and fix all the open coded
versions of it later.

Note, gcc appears to optimize this when we make it into an always_inline
static function, which removes a lot of issues that a macro produces.

Link: 
http://lkml.kernel.org/r/e3e754f2bd18e56eaa8baf79bee619316ebf4cfc.1545161087.git.tom.zanu...@linux.intel.com
Link: http://lkml.kernel.org/r/20181219211615.2298e...@gandalf.local.home
Link: 
http://lkml.kernel.org/r/CAHk-=wg_sr-uec1ggmkzpypouyanl5cmx4r7ceuav4qmf5j...@mail.gmail.com

Cc: Tom Zanussi 
Cc: Greg Kroah-Hartman 
Suggestions-by: Linus Torvalds 
Suggestions-by: Joe Perches 
Suggestions-by: Andreas Schwab 
Signed-off-by: Steven Rostedt (VMware) 
---

Changes since v2:

 - Rename the helper function to str_has_prefix()
(Linus Torvalds and Joe Perches)

 - Use a static inline, as that appears to still optimize correctly
(Andreas Schwab)

 include/linux/string.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 27d0482e5e05..a37859d91b57 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -456,4 +456,24 @@ static inline void memcpy_and_pad(void *dest, size_t 
dest_len,
memcpy(dest, src, dest_len);
 }
 
+/**
+ * str_has_prefix - Test if a string has a given prefix
+ * @str: The string to test
+ * @prefix: The string to see if @str starts with
+ *
+ * A common way to test a prefix of a string is to do:
+ *  strncmp(str, prefix, sizeof(prefix) - 1)
+ *
+ * But this can lead to bugs due to typos, or if prefix is a pointer
+ * and not a constant. Instead use str_has_prefix().
+ *
+ * Returns: 0 if @str does not start with @prefix
+ strlen(@prefix) if @str does start with @prefix
+ */
+static __always_inline int str_has_prefix(const char *str, const char *prefix)
+{
+   int len = strlen(prefix);
+   return strncmp(str, prefix, len) == 0 ? len : 0;
+}
+
 #endif /* _LINUX_STRING_H_ */
-- 
2.19.1




[PATCH] input: drv2667: fix indentation issues, remove extra tabs

2018-12-21 Thread Colin King
From: Colin Ian King 

There are some statements that are indented incorrectly, fix this by
removinf the extra tabs.

Signed-off-by: Colin Ian King 
---
 drivers/input/misc/drv2667.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/input/misc/drv2667.c b/drivers/input/misc/drv2667.c
index 2849bb6906a8..6091f0490e8f 100644
--- a/drivers/input/misc/drv2667.c
+++ b/drivers/input/misc/drv2667.c
@@ -177,9 +177,9 @@ static int drv2667_set_waveform_freq(struct drv2667_data 
*haptics)
error = regmap_write(haptics->regmap, DRV2667_PAGE, read_buf);
if (error) {
dev_err(>client->dev,
-   "Failed to set the page: %d\n", error);
-   return -EIO;
-   }
+   "Failed to set the page: %d\n", error);
+   return -EIO;
+   }
}
 
return error;
-- 
2.19.1



Re: [PATCH] RISC-V: defconfig: Enable Generic PCIE by default

2018-12-21 Thread Alistair Francis
On Fri, 2018-12-21 at 14:34 -0800, Paul Walmsley wrote:
> 
> On Fri, 21 Dec 2018, Alistair Francis wrote:
> 
> > On Fri, 2018-12-21 at 11:27 -0800, Paul Walmsley wrote:
> > > On Fri, 21 Dec 2018, Alistair Francis wrote:
> > > 
> > > > When the MicroSemi driver does eventually go upstream this will
> > > > probably have to be discussed though as either the config or
> > > > device
> > > > tree will need to be edited to ensure only one PCIe device is
> > > > present.
> > > 
> > > The right way to do that is to have two separate DT files: one
> > > for the 
> > > HiFive-U board alone; the other for the HiFive-U plus the
> > > expansion 
> > > board combo.  There shouldn't be any problems with keeping both 
> > > drivers enabled in the defconfig.
> > 
> > Agreed, the only problem is that the device tree comes from the
> > boards
> > firmware at the moment.
> 
> The switchover is in progress:
> 
> https://lore.kernel.org/linux-riscv/41dac7a1-3f66-6a3f-8a68-206688e5b...@sifive.com/T/#t

Great! I didn't know it was in progress.

Alistair

> 
> 
> - Paul
> 


Re: [for-next][PATCH 23/24] string.h: Add strncmp_prefix() helper macro

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 14:57:13 -0800
Linus Torvalds  wrote:

> On Fri, Dec 21, 2018 at 2:48 PM Steven Rostedt  wrote:
> >  
> > > Your patch actually had them, but in the body of your email you did
> > >  
> > > > #define have_prefix(str, prefix) ({ \
> > > >   const char *__pfx = (const char *)prefix; \  
> > >
> > > which is just completely wrong.
> > >
> > > Considering your _old_ patch had the exact same bug, I really think
> > > you need to start internalizing the whole "macro arguments *have* to
> > > be properly protected" thing.  
> >
> > Well, there's less with assignments that can go wrong than with other
> > code. That is, there's little that can happen with "int x = arg;" where
> > arg is the macro paramater to cause something really nasty.  
> 
> What's wrong, Steven?

We are miscommunicating here...

I was talking about the missing parenthesis on the original patch,
which you stated was missing as well. And the original patch didn't
have the typecast.

> 
> The assignment is entirely irrelevant.
> 
> The problem is the cast.
> 
> A type cast has a very high priority, and so if you do
> 
> (const char *)prefix
> 
> it breaks completely if you might have something like"a+6" as the argument.
> 
> Think about what if "a" is of type "unsigned long", for example?

Yes, when writing the real code, I noticed that the typecast could
cause issues without the parenthesis, and added them.

The email you are replying to was saying there's not much that can go
wrong with:

#define MACRO(x) { \
int __p = x; \


I definitely can see something wrong with:

#define MACRO(x) { \
int __p = (int)x; \

because exactly what you stated.

There's nothing wrong with adding (x) for the first one, but it's
unlikely anything will cause it harm. The second one the (x) *is* most
definitely required.

This is a long winded "I agree with you" ;-)

-- Steve


Re: [for-next][PATCH 23/24] string.h: Add strncmp_prefix() helper macro

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 14:29:30 -0800
Linus Torvalds  wrote:

> On Fri, Dec 21, 2018 at 2:20 PM Joe Perches  wrote:
> >
> > Using
> >
> > static inline bool str_has_prefix(const char *str, const char prefix[])
> > {
> > return !strncmp(str, prefix, strlen(prefix));
> > }
> >
> > eliminates the strlen with gcc 4.8 (oldest I still have)  
> 
> Ok, that looks like the right thing to do.
> 

Agreed, and I posted a new version. I can start running it through my
test suit (I'll update all the instances in the tracing directory to
use it), and then it will be ready for a pull request by next week.

I'll revert the top two patches from my for-next tree now.

-- Steve


Re: [for-next][PATCH 23/24] string.h: Add strncmp_prefix() helper macro

2018-12-21 Thread Linus Torvalds
On Fri, Dec 21, 2018 at 2:48 PM Steven Rostedt  wrote:
>
> > Your patch actually had them, but in the body of your email you did
> >
> > > #define have_prefix(str, prefix) ({ \
> > >   const char *__pfx = (const char *)prefix; \
> >
> > which is just completely wrong.
> >
> > Considering your _old_ patch had the exact same bug, I really think
> > you need to start internalizing the whole "macro arguments *have* to
> > be properly protected" thing.
>
> Well, there's less with assignments that can go wrong than with other
> code. That is, there's little that can happen with "int x = arg;" where
> arg is the macro paramater to cause something really nasty.

What's wrong, Steven?

The assignment is entirely irrelevant.

The problem is the cast.

A type cast has a very high priority, and so if you do

(const char *)prefix

it breaks completely if you might have something like"a+6" as the argument.

Think about what if "a" is of type "unsigned long", for example?

   Linus


Re: [for-next][PATCH 23/24] string.h: Add strncmp_prefix() helper macro

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 14:20:25 -0800
Joe Perches  wrote:

> static inline bool str_has_prefix(const char *str, const char prefix[])
> {
>   return !strncmp(str, prefix, strlen(prefix));
> }
> 
> eliminates the strlen with gcc 4.8 (oldest I still have)

I tested it a bit more before posting.

I tested it against:

gcc 4.5.1, 4.5.4, 4.6.3, 6.2.0, 7.2.0 and 8.2

And the strlen was eliminated each time.

So this looks like the right approach :-)

-- Steve


Re: [for-next][PATCH 23/24] string.h: Add strncmp_prefix() helper macro

2018-12-21 Thread Steven Rostedt
On Fri, 21 Dec 2018 14:08:11 -0800
Linus Torvalds  wrote:

> On Fri, Dec 21, 2018 at 12:55 PM Steven Rostedt  wrote:
> >
> > On Fri, 21 Dec 2018 12:41:17 -0800
> > Linus Torvalds  wrote:
> >  
> > > Parentheses  
> >
> > ?  
> 
> Your patch actually had them, but in the body of your email you did
> 
> > #define have_prefix(str, prefix) ({ \
> >   const char *__pfx = (const char *)prefix; \  
> 
> which is just completely wrong.
> 
> Considering your _old_ patch had the exact same bug, I really think
> you need to start internalizing the whole "macro arguments *have* to
> be properly protected" thing.

Well, there's less with assignments that can go wrong than with other
code. That is, there's little that can happen with "int x = arg;" where
arg is the macro paramater to cause something really nasty. The chances
of that happening is up there with using only two underscores causing an
issue.

But they don't hurt to add.

> 
> And I agree with Joe that using a million underscores just makes code
> less legible. Two underscores at the beginning is the standard
> namespace protection. Underscores at the end do nothing. And using
> *more* than two is just crazy.

The two are standard for static variables in C code, but that makes me
worry about macros. I usually do at least three for macros. The
underscores at the end, to me, make it more balanced and easier to read:

strncmp(str, ___prefix___, ___len___);

To me looks better than:

strncmp(str, ___prefix, ___len);

But again, no reason to fight this bikeshed.

> 
> Finally, I think the cast is actually bogus and wrong. Why would the
> prefix ever be anything but "const char *"? Adding the cast only makes
> it more possible that somebody uses a truly wrong type ("unsigned long
> *" ?), and then the cast just silently hides it.

Actually after I sent the email, I was thinking the same thing, that
the original strncmp() would probably complain about that too, and we
should keep it the same. Not sure why I even suggested that. Perhaps, I
tested too many use cases :-/

> 
> If somebody uses "unsigned char *" for this, they'd get the exact same
> warning if they were using strncmp and do this by hand.
> 
> So why would that helper function act any differently? Particularly
> when it then has the huge downside that it will also accept absolute
> garbage?
> 
> Anyway, that was a long and winding NAK for your patch.


But after all that and said. I tested it as a static __always_inline,
and guess what. It also optimizes out the strlen() for constants!!!

Suggested-by: Andreas Schwab 

-- Steve

diff --git a/include/linux/string.h b/include/linux/string.h
index 27d0482e5e05..538469dfb458 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -456,4 +456,25 @@ static inline void memcpy_and_pad(void *dest, size_t 
dest_len,
memcpy(dest, src, dest_len);
 }
 
+/**
+ * str_has_prefix - Test if a string has a given prefix
+ * @str: The string to test
+ * @prefix: The string to see if @str starts with
+ *
+ * A common way to test a prefix of a string is to do:
+ *  strncmp(str, prefix, sizeof(prefix) - 1)
+ *
+ * But this can lead to bugs due to typos, or if prefix is a pointer
+ * and not a constant. Instead use str_has_prefix().
+ *
+ * Returns: 0 if @str does not start with @prefix
+ strlen(@prefix) if @str does start with @prefix
+ */
+static __always_inline int str_has_prefix(const char *str, const char *prefix)
+{
+   int len = strlen(prefix);
+
+   return strncmp(str, prefix, len) == 0 ? len : 0;
+}
+
 #endif /* _LINUX_STRING_H_ */


Re: [PATCH v4 1/3] driver core: add probe_err log helper

2018-12-21 Thread Rob Herring
On Thu, Dec 20, 2018 at 5:38 AM Andrzej Hajda  wrote:
>
> On 20.12.2018 12:14, Greg Kroah-Hartman wrote:
> > On Thu, Dec 20, 2018 at 11:22:45AM +0100, Andrzej Hajda wrote:
> >> During probe every time driver gets resource it should usually check for 
> >> error
> >> printk some message if it is not -EPROBE_DEFER and return the error. This
> >> pattern is simple but requires adding few lines after any resource 
> >> acquisition
> >> code, as a result it is often omited or implemented only partially.
> >> probe_err helps to replace such code sequences with simple call, so code:
> >>  if (err != -EPROBE_DEFER)
> >>  dev_err(dev, ...);
> >>  return err;
> >> becomes:
> >>  return probe_err(dev, err, ...);
> > Can you show a driver being converted to use this to show if it really
> > will save a bunch of lines and make things simpler?  Usually you are
> > requesting lots of resources so you need to do more than just return,
> > you need to clean stuff up first.
>
>
> I have posted sample conversion patch (generated by cocci) in previous
> version of this patchset [1].
>
> I did not re-posted it again as it is quite big patch and it will not be
> applied without prior splitting it per subsystem.
>
> Regarding stuff cleaning: devm_* usually makes it unnecessary, but also
> even with necessary cleaning you can profit from probe_err, you just
> calls it without leaving probe - you have still handled correctly probe
> deferring.
>
> Here is sample usage (taken from beginning of the mentioned patch):
>
> ---
> diff --git a/drivers/ata/libahci_platform.c b/drivers/ata/libahci_platform.c
> index 4b900fc659f7..52e891fe1586 100644
> --- a/drivers/ata/libahci_platform.c
> +++ b/drivers/ata/libahci_platform.c
> @@ -581,11 +581,8 @@ int ahci_platform_init_host(struct platform_device *pdev,
> int i, irq, n_ports, rc;
>
> irq = platform_get_irq(pdev, 0);
> -   if (irq <= 0) {
> -   if (irq != -EPROBE_DEFER)
> -   dev_err(dev, "no irq\n");
> -   return irq;
> -   }
> +   if (irq <= 0)
> +   return probe_err(dev, irq, "no irq\n");

Shouldn't platform_get_irq (or what it calls) print the error message
(like we do for kmalloc), rather than every driver? We could get rid
of lots of error strings that way. I guess there are cases where no
irq is not an error and we wouldn't want to always print an error. In
some cases like that, we have 2 versions of the function.

Not what you're addressing here exactly, but what I'd like to see is
the ability to print the exact locations generating errors in the
first place. That would require wrapping all the error code
assignments and returns (or at least the common sources). If we're
going to make tree wide changes, then that might be the better place
to put the effort. If we had that, then maybe we'd need a lot fewer
error messages in drivers. I did a prototype implementation and
coccinelle script a while back that I could dust off if there's
interest. It was helpful in finding the source of errors, but did have
some false positives printed.

Rob


[PATCH] gru: clean up indentation issue, remove spaces

2018-12-21 Thread Colin King
From: Colin Ian King 

There are a couple of statements that have extra spaces in the identation,
fix this by removing them.

Signed-off-by: Colin Ian King 
---
 drivers/misc/sgi-gru/grufault.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
index 93be82fc338a..2ec5808ba464 100644
--- a/drivers/misc/sgi-gru/grufault.c
+++ b/drivers/misc/sgi-gru/grufault.c
@@ -616,8 +616,8 @@ irqreturn_t gru_intr_mblade(int irq, void *dev_id)
for_each_possible_blade(blade) {
if (uv_blade_nr_possible_cpus(blade))
continue;
-gru_intr(0, blade);
-gru_intr(1, blade);
+   gru_intr(0, blade);
+   gru_intr(1, blade);
}
return IRQ_HANDLED;
 }
-- 
2.19.1



RE: Fix "perf tools: Synthesize GROUP_DESC feature in pipe mode" in the LT 4.14 branch

2018-12-21 Thread Greensky, James J
This commit does not apply cleanly due to other changes. Is there something I 
can do to make it easier.

This line:
If (data->is_pipe)  {
Must be changed to:
If (file->is_pipe) {

Let me know if there is a more standard way of communicating cherry-picked 
changes for inclusion.

-Original Message-
From: Jinpu Wang [mailto:jinpuw...@gmail.com] 
Sent: Friday, December 21, 2018 2:58 AM
To: Greensky, James J ; Greg Kroah-Hartman 

Cc: linux-kernel@vger.kernel.org; v3.14+, only the raid10 part 

Subject: Re: Fix "perf tools: Synthesize GROUP_DESC feature in pipe mode" in 
the LT 4.14 branch

+cc Greg, stable

Greensky, James J  于2018年12月21日周五 上午11:48写道:
>
> Commit d38d272592737ea88a20 ("perf tools: Synthesize GROUP_DESC feature in 
> pipe mode") broke the LT 4.14 branch when using event groups in pipe-mode.
>
>   # perf record -e '{cycles,instructions,branches}' -- sleep 4 | perf report
>   # To display the perf.data header info, please use --header/--header-only 
> options
>   #
>   Oxd7c [0x60]: failed to process type: 80
>   Error:
>   Failed to process sample
>
> Commit a2015516c5c0be932a69 ("perf record: Synthesize features before events 
> in pipe mode") is the fix. Can we get this cherry-picked and applied?
>


Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Tim Chen
On 12/21/18 1:40 PM, Yang Shi wrote:
> Swap readahead would read in a few pages regardless if the underlying
> device is busy or not.  It may incur long waiting time if the device is
> congested, and it may also exacerbate the congestion.
> 
> Use inode_read_congested() to check if the underlying device is busy or
> not like what file page readahead does.  Get inode from swap_info_struct.
> Although we can add inode information in swap_address_space
> (address_space->host), it may lead some unexpected side effect, i.e.
> it may break mapping_cap_account_dirty().  Using inode from
> swap_info_struct seems simple and good enough.
> 
> Just does the check in vma_cluster_readahead() since
> swap_vma_readahead() is just used for non-rotational device which
> much less likely has congestion than traditional HDD.
> 
> Although swap slots may be consecutive on swap partition, it still may be
> fragmented on swap file. This check would help to reduce excessive stall
> for such case.
> 
> Cc: Huang Ying 
> Cc: Tim Chen 
> Cc: Minchan Kim 
> Signed-off-by: Yang Shi 
> ---
> v3: Move inode deference under swap device type check per Tim Chen
> v2: Check the swap device type per Tim Chen
> 
>  mm/swap_state.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index fd2f21e..78d500e 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -538,11 +538,18 @@ struct page *swap_cluster_readahead(swp_entry_t entry, 
> gfp_t gfp_mask,
>   bool do_poll = true, page_allocated;
>   struct vm_area_struct *vma = vmf->vma;
>   unsigned long addr = vmf->address;
> + struct inode *inode = NULL;
>  
>   mask = swapin_nr_pages(offset) - 1;
>   if (!mask)
>   goto skip;
>  
> + if (si->flags & (SWP_BLKDEV | SWP_FS)) {
> + inode = si->swap_file->f_mapping->host;
> + if (inode_read_congested(inode))
> + goto skip;
> + }
> +
>   do_poll = false;
>   /* Read a page_cluster sized and aligned cluster around offset. */
>   start_offset = offset & ~mask;
> 

Acked-by: Tim Chen 


Re: [GIT PULL] Last straggler for 4.20-rc8 or final

2018-12-21 Thread Borislav Petkov
+ mingo.

On Fri, Dec 21, 2018 at 02:09:02PM +0100, Paolo Bonzini wrote:
> Linus,
> 
> The following changes since commit 7566ec393f4161572ba6f11ad5171fd5d59b0fbd:
> 
>   Linux 4.20-rc7 (2018-12-16 15:46:55 -0800)
> 
> are available in the git repository at:
> 
>   https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus
> 
> for you to fetch changes up to 3cf85f9f6bd7b172122865432b4c6f0ec844e22a:
> 
>   KVM: x86: nSVM: fix switch to guest mmu (2018-12-19 22:19:22 +0100)
> 
> A simple patch for a pretty bad bug.  I haven't yet sent my 4.21 pull
> request because currently I'm waiting for a patch to be removed from
> tip (MPX removal breaks KVM live migration and was committed without
> Cc or Ack from me) and until that happens I cannot finalize my
> conflict resolution instructions.

Lemme make sure I understand it correctly: you'd like us to not send

https://git.kernel.org/tip/eb012ef3b4e331ae479dd7cd9378041d9b7f851c

up now and delay it for 4.22, right?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


[PATCH] drivers: misc: ad525x_dpot: clean indentation issue, remove tabs

2018-12-21 Thread Colin King
From: Colin Ian King 

There is a hunk of code in a case statement that is indented one level
too deeply, fix this by removing extra tabs. Also remove one empty line.

Signed-off-by: Colin Ian King 
---
 drivers/misc/ad525x_dpot.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/misc/ad525x_dpot.c b/drivers/misc/ad525x_dpot.c
index a0afadefcc49..1f6d008e0036 100644
--- a/drivers/misc/ad525x_dpot.c
+++ b/drivers/misc/ad525x_dpot.c
@@ -202,22 +202,20 @@ static s32 dpot_read_i2c(struct dpot_data *dpot, u8 reg)
return dpot_read_r8d8(dpot, ctrl);
case DPOT_UID(AD5272_ID):
case DPOT_UID(AD5274_ID):
-   dpot_write_r8d8(dpot,
+   dpot_write_r8d8(dpot,
(DPOT_AD5270_1_2_4_READ_RDAC << 2), 0);
 
-   value = dpot_read_r8d16(dpot,
-   DPOT_AD5270_1_2_4_RDAC << 2);
-
-   if (value < 0)
-   return value;
-   /*
-* AD5272/AD5274 returns high byte first, however
-* underling smbus expects low byte first.
-*/
-   value = swab16(value);
+   value = dpot_read_r8d16(dpot, DPOT_AD5270_1_2_4_RDAC << 2);
+   if (value < 0)
+   return value;
+   /*
+* AD5272/AD5274 returns high byte first, however
+* underling smbus expects low byte first.
+*/
+   value = swab16(value);
 
-   if (dpot->uid == DPOT_UID(AD5274_ID))
-   value = value >> 2;
+   if (dpot->uid == DPOT_UID(AD5274_ID))
+   value = value >> 2;
return value;
default:
if ((reg & DPOT_REG_TOL) || (dpot->max_pos > 256))
-- 
2.19.1



Re: [PATCH] RISC-V: defconfig: Enable Generic PCIE by default

2018-12-21 Thread Paul Walmsley



On Fri, 21 Dec 2018, Alistair Francis wrote:

> On Fri, 2018-12-21 at 11:27 -0800, Paul Walmsley wrote:
> > On Fri, 21 Dec 2018, Alistair Francis wrote:
> > 
> > > When the MicroSemi driver does eventually go upstream this will
> > > probably have to be discussed though as either the config or device
> > > tree will need to be edited to ensure only one PCIe device is
> > > present.
> > 
> > The right way to do that is to have two separate DT files: one for the 
> > HiFive-U board alone; the other for the HiFive-U plus the expansion 
> > board combo.  There shouldn't be any problems with keeping both 
> > drivers enabled in the defconfig.
> 
> Agreed, the only problem is that the device tree comes from the boards
> firmware at the moment.

The switchover is in progress:

https://lore.kernel.org/linux-riscv/41dac7a1-3f66-6a3f-8a68-206688e5b...@sifive.com/T/#t


- Paul



  1   2   3   4   5   6   7   8   >