date:20180123

[PATCH] ACPI / tables: Add IORT to injectable table list

2018-01-23 Thread Yang Shunyong

This patch adds ACPI_SIG_PPTT to the table, which enables an IORT from
the initrd to override the one from firmware.

Signed-off-by: Yang Shunyong 
Cc: yutang2.ji...@hxt-semitech.com
Cc: yu.zh...@hxt-semitech.com
---
 drivers/acpi/tables.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 80ce2a7d224b..7bcb66f3 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
-   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
+   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
+   NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
1.8.3.1

Re: [PATCH v6 22/36] nds32: Debugging support

2018-01-23 Thread Arnd Bergmann

On Tue, Jan 23, 2018 at 8:28 AM, Vincent Chen  wrote:
> 2018-01-18 18:37 GMT+08:00 Arnd Bergmann :
>> On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu  wrote:
>>> From: Greentime Hu 

>>
>> It appears that you are implementing the old-style ptrace handling
>> with architecture specific commands. Please have a look at how
>> this is done in risc-v or arm64. If this takes too much time
>> to address, I'd suggest using an empty stub function for sys_ptrace
>> and adding it back at a later point, but not send the current version
>> upstream.
>>
>
> After referring to risc-v and arm64, I realize that PTRACE_GETREGSET
> and PTRACE_SETREGSET are used to replace the arch-specific commands.
> The port needed for these two ptrace commands is already done in the
> current version of the patch.
>
> Could I keep them and just remove the code for old-style ptrace
> handling in the next version of the patch?

The important part is to not merge a user space interface into the upstream
kernel that we still want to change. It's clear that it takes some time to
update gdb and other programs using the ptrace interface, so I'd suggest
to simply not have any ptrace interface submitted for inclusion until that
is complete.

In the meantime, you can keep the existing version as an add-on kernel
patch. You probably have other patches that are not ready to get merged
yet, so just keep this one in the same tree as the others.

 Arnd

[PATCH v2] ACPI / tables: Add IORT to injectable table list

2018-01-23 Thread Yang Shunyong

This patch adds ACPI_SIG_IORT to the table, which enables an IORT from
the initrd to override the one from firmware.

Signed-off-by: Yang Shunyong 
Cc: yutang2.ji...@hxt-semitech.com
Cc: yu.zh...@hxt-semitech.com
---

v2: change typo ACPI_SIG_PPTT to ACPI_SIG_IORT in commit message.

---
 drivers/acpi/tables.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 80ce2a7d224b..7bcb66f3 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
-   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
+   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
+   NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
1.8.3.1
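
To illustrate why appending one entry before the NULL terminator is enough,
here is a minimal user-space sketch of a lookup over a NULL-terminated
signature list. The list contents and the helper name sig_is_injectable()
are assumptions made for illustration; this is not the kernel's actual
lookup code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* ACPI table signatures are four ASCII characters long. */
#define SIG_LEN 4

/* Hypothetical, shortened stand-in for the kernel's signature list. */
static const char * const injectable_sigs[] = {
	"DSDT", "FACP", "SSDT", "IORT", NULL,
};

/* Return true if sig matches an entry of the NULL-terminated list. */
static bool sig_is_injectable(const char *sig)
{
	size_t i;

	assert(sig != NULL);

	/* Walk the list until the NULL terminator is reached. */
	for (i = 0; injectable_sigs[i]; i++) {
		if (!strncmp(sig, injectable_sigs[i], SIG_LEN))
			return true;
	}
	return false;
}
```

With this shape, a signature that is missing from the list is simply never
matched, so a table carrying it would not be picked up.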

Re: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu aperture validity check

2018-01-23 Thread Auger Eric

Hi Shameer,

On 18/01/18 01:04, Alex Williamson wrote:
> On Fri, 12 Jan 2018 16:45:27 +
> Shameer Kolothum  wrote:
> 
>> This introduces an iova list that is valid for dma mappings. Make
>> sure the new iommu aperture window is valid and doesn't conflict
>> with any existing dma mappings during attach. Also update the iova
>> list with new aperture window during attach/detach.
>>
>> Signed-off-by: Shameer Kolothum 
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 177 
>> 
>>  1 file changed, 177 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>> b/drivers/vfio/vfio_iommu_type1.c
>> index e30e29a..11cbd49 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -60,6 +60,7 @@
>>  
>>  struct vfio_iommu {
>>  struct list_head    domain_list;
>> +struct list_head    iova_list;
>>  struct vfio_domain  *external_domain; /* domain for external user */
>>  struct mutex        lock;
>>  struct rb_root      dma_list;
>> @@ -92,6 +93,12 @@ struct vfio_group {
>>  struct list_head    next;
>>  };
>>  
>> +struct vfio_iova {
>> +struct list_head    list;
>> +phys_addr_t start;
>> +phys_addr_t end;
>> +};
> 
> dma_list uses dma_addr_t for the iova.  IOVAs are naturally DMA
> addresses, why are we using phys_addr_t?
> 
>> +
>>  /*
>>   * Guest RAM pinning working set or DMA target
>>   */
>> @@ -1192,6 +1199,123 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group 
>> *group, phys_addr_t *base)
>>  return ret;
>>  }
>>  
>> +static int vfio_insert_iova(phys_addr_t start, phys_addr_t end,
>> +struct list_head *head)
>> +{
>> +struct vfio_iova *region;
>> +
>> +region = kmalloc(sizeof(*region), GFP_KERNEL);
>> +if (!region)
>> +return -ENOMEM;
>> +
>> +INIT_LIST_HEAD(&region->list);
>> +region->start = start;
>> +region->end = end;
>> +
>> +list_add_tail(&region->list, head);
>> +return 0;
>> +}
> 
> As I'm reading through this series, I'm learning that there are a lot
> of assumptions and subtle details that should be documented.  For
> instance, the IOMMU API only provides a single geometry and we build
> upon that here as this patch creates a list, but there's only a single
> entry for now.  The following patches carve that single iova range into
> pieces and somewhat subtly use the list_head passed to keep the list
> sorted, allowing the first/last_entry tricks used throughout.  Subtle
> interfaces are prone to bugs.
> 
>> +
>> +/*
>> + * Find whether a mem region overlaps with existing dma mappings
>> + */
>> +static bool vfio_find_dma_overlap(struct vfio_iommu *iommu,
>> +  phys_addr_t start, phys_addr_t end)
>> +{
>> +struct rb_node *n = rb_first(&iommu->dma_list);
>> +
>> +for (; n; n = rb_next(n)) {
>> +struct vfio_dma *dma;
>> +
>> +dma = rb_entry(n, struct vfio_dma, node);
>> +
>> +if (end < dma->iova)
>> +break;
>> +if (start >= dma->iova + dma->size)
>> +continue;
>> +return true;
>> +}
>> +
>> +return false;
>> +}
> 
> Why do we need this in addition to the existing vfio_find_dma()?  Why
> doesn't this use the tree structure of the dma_list?
> 
>> +
>> +/*
>> + * Check the new iommu aperture is a valid one
>> + */
>> +static int vfio_iommu_valid_aperture(struct vfio_iommu *iommu,
>> + phys_addr_t start,
>> + phys_addr_t end)
>> +{
>> +struct vfio_iova *first, *last;
>> +struct list_head *iova = &iommu->iova_list;
>> +
>> +if (list_empty(iova))
>> +return 0;
>> +
>> +/* Check if new one is outside the current aperture */
> 
> "Disjoint sets"
> 
>> +first = list_first_entry(iova, struct vfio_iova, list);
>> +last = list_last_entry(iova, struct vfio_iova, list);
>> +if ((start > last->end) || (end < first->start))
>> +return -EINVAL;
>> +
>> +/* Check for any existing dma mappings outside the new start */
>> +if (start > first->start) {
>> +if (vfio_find_dma_overlap(iommu, first->start, start - 1))
>> +return -EINVAL;
>> +}
>> +
>> +/* Check for any existing dma mappings outside the new end */
>> +if (end < last->end) {
>> +if (vfio_find_dma_overlap(iommu, end + 1, last->end))
>> +return -EINVAL;
>> +}
>> +
>> +return 0;
>> +}
> 
> I think this returns an int because you want to use it for the return
> value below, but it really seems like a bool question, ie. does this
> aperture conflict with existing mappings.  Additionally, the aperture
> is valid, it was provided to us by the IOMMU API, the question is
> whether it conflicts.  Please also name consistently

[PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist

2018-01-23 Thread sean.wang

From: Sean Wang 

commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream.

commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq
device with OPP v2") did not add MediaTek SoCs to the blacklist, which
can cause occasional hangs or unexpected behavior on related boards, as
kernelci reported in [1], specifically for the 4.14 and 4.15 trees.

For those reasons, add MediaTek SoCs to the cpufreq-dt blacklist. Please
apply the patch to the 4.14 and 4.15 trees so that kernelci can complete
the subsequent automated kernel testing.

[1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/

Fixes: edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2")
Signed-off-by: Andrew-sh Cheng 
Signed-off-by: Sean Wang 
Cc: Kevin Hilman 
---
 drivers/cpufreq/cpufreq-dt-platdev.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c 
b/drivers/cpufreq/cpufreq-dt-platdev.c
index a753c50..9e0aa76 100644
--- a/drivers/cpufreq/cpufreq-dt-platdev.c
+++ b/drivers/cpufreq/cpufreq-dt-platdev.c
@@ -111,6 +111,14 @@ static const struct of_device_id blacklist[] __initconst = 
{
 
{ .compatible = "marvell,armadaxp", },
 
+   { .compatible = "mediatek,mt2701", },
+   { .compatible = "mediatek,mt2712", },
+   { .compatible = "mediatek,mt7622", },
+   { .compatible = "mediatek,mt7623", },
+   { .compatible = "mediatek,mt817x", },
+   { .compatible = "mediatek,mt8173", },
+   { .compatible = "mediatek,mt8176", },
+
{ .compatible = "nvidia,tegra124", },
 
{ .compatible = "st,stih407", },
-- 
2.7.4

Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator

2018-01-23 Thread Lothar Waßmann

Hi,

On Mon, 22 Jan 2018 09:42:08 -0800 Dmitry Torokhov wrote:
> Hi Mylène,
> 
> On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand
>  wrote:
> > Add the support of regulator to use it as VCC source.
> >
> > Signed-off-by: Mylène Josserand 
> > ---
> >  .../bindings/input/touchscreen/edt-ft5x06.txt  |  1 +
> >  drivers/input/touchscreen/edt-ft5x06.c | 33 
> > ++
> >  2 files changed, 34 insertions(+)
> >
> > diff --git 
> > a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt 
> > b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > index 025cf8c9324a..48e975b9c1aa 100644
> > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > @@ -30,6 +30,7 @@ Required properties:
> >  Optional properties:
> >   - reset-gpios: GPIO specification for the RESET input
> >   - wake-gpios:  GPIO specification for the WAKE input
> > + - vcc-supply:  Regulator that supplies the touchscreen
> >
> >   - pinctrl-names: should be "default"
> >   - pinctrl-0:   a phandle pointing to the pin settings for the
> > diff --git a/drivers/input/touchscreen/edt-ft5x06.c 
> > b/drivers/input/touchscreen/edt-ft5x06.c
> > index c53a3d7239e7..5ee14a25a382 100644
> > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > @@ -39,6 +39,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #define WORK_REGISTER_THRESHOLD0x00
> >  #define WORK_REGISTER_REPORT_RATE  0x08
> > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> > struct touchscreen_properties prop;
> > u16 num_x;
> > u16 num_y;
> > +   struct regulator *vcc;
> >
> > struct gpio_desc *reset_gpio;
> > struct gpio_desc *wake_gpio;
> > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client 
> > *client,
> >
> > tsdata->max_support_points = chip_data->max_support_points;
> >
> > +   tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > +   if (IS_ERR(tsdata->vcc)) {
> > +   error = PTR_ERR(tsdata->vcc);
> > +   dev_err(&client->dev, "failed to request regulator: %d\n",
> > +   error);
>
I would check for -EPROBE_DEFER here and omit the error message in this
case.


Lothar Waßmann
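
The suggestion above amounts to a small decision: still return the error,
but skip the message when the error is a probe deferral. A minimal
user-space sketch of that decision follows. The helper name
should_log_probe_error() is hypothetical, and EPROBE_DEFER is defined
locally because user-space errno.h does not provide it (517 is the value
the kernel uses).

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal errno value; defined here only for the sketch. */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper: decide whether a failed resource request should be
 * reported. A deferral means the supplier is not ready yet and the probe
 * will be retried, so an error message would only add noise.
 */
static bool should_log_probe_error(int error)
{
	assert(error < 0);
	return error != -EPROBE_DEFER;
}
```

In a driver, the result of such a check would only gate the dev_err() call;
the error itself would be returned in both cases.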

Re: [RFC v2 2/5] vfio/type1: Check reserve region conflict and update iova list

2018-01-23 Thread Auger Eric

Hi Shameer,

On 18/01/18 01:04, Alex Williamson wrote:
> On Fri, 12 Jan 2018 16:45:28 +
> Shameer Kolothum  wrote:
> 
>> This retrieves the reserved regions associated with dev group and
>> checks for conflicts with any existing dma mappings. Also update
>> the iova list excluding the reserved regions.
>>
>> Signed-off-by: Shameer Kolothum 
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 161 
>> +++-
>>  1 file changed, 159 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>> b/drivers/vfio/vfio_iommu_type1.c
>> index 11cbd49..7609070 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -28,6 +28,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -1199,6 +1200,20 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group 
>> *group, phys_addr_t *base)
>>  return ret;
>>  }
>>  
> 
> /* list_sort helper */
> 
>> +static int vfio_resv_cmp(void *priv, struct list_head *a, struct list_head 
>> *b)
>> +{
>> +struct iommu_resv_region *ra, *rb;
>> +
>> +ra = container_of(a, struct iommu_resv_region, list);
>> +rb = container_of(b, struct iommu_resv_region, list);
>> +
>> +if (ra->start < rb->start)
>> +return -1;
>> +if (ra->start > rb->start)
>> +return 1;
>> +return 0;
>> +}
>> +
>>  static int vfio_insert_iova(phys_addr_t start, phys_addr_t end,
>>  struct list_head *head)
>>  {
>> @@ -1274,6 +1289,24 @@ static int vfio_iommu_valid_aperture(struct 
>> vfio_iommu *iommu,
>>  }
>>  
>>  /*
>> + * Check reserved region conflicts with existing dma mappings
>> + */
>> +static int vfio_iommu_resv_region_conflict(struct vfio_iommu *iommu,
>> +struct list_head *resv_regions)
>> +{
>> +struct iommu_resv_region *region;
>> +
>> +/* Check for conflict with existing dma mappings */
>> +list_for_each_entry(region, resv_regions, list) {
>> +if (vfio_find_dma_overlap(iommu, region->start,
>> +region->start + region->length - 1))
>> +return -EINVAL;
>> +}
>> +
>> +return 0;
>> +}
> 
> This basically does the same test as vfio_iommu_valid_aperture but
> properly names it a conflict test.  Please be consistent.  Should this
> also return bool, "conflict" is a yes/no answer.
> 
>> +
>> +/*
>>   * Adjust the iommu aperture window if new aperture is a valid one
>>   */
>>  static int vfio_iommu_iova_aper_adjust(struct vfio_iommu *iommu,
>> @@ -1316,6 +1349,51 @@ static int vfio_iommu_iova_aper_adjust(struct 
>> vfio_iommu *iommu,
>>  return 0;
>>  }
>>  
>> +/*
>> + * Check and update iova region list in case a reserved region
>> + * overlaps the iommu iova range
>> + */
>> +static int vfio_iommu_iova_resv_adjust(struct vfio_iommu *iommu,
>> +struct list_head *resv_regions)
> 
> "resv_region" in previous function, just "resv" here, use consistent
> names.  Also, what are we adjusting.  Maybe "exclude" is a better term.
> 
>> +{
>> +struct iommu_resv_region *resv;
>> +struct list_head *iova = &iommu->iova_list;
>> +struct vfio_iova *n, *next;
>> +
>> +list_for_each_entry(resv, resv_regions, list) {
>> +phys_addr_t start, end;
>> +
>> +start = resv->start;
>> +end = resv->start + resv->length - 1;
>> +
>> +list_for_each_entry_safe(n, next, iova, list) {
>> +phys_addr_t a, b;
>> +int ret = 0;
>> +
>> +a = n->start;
>> +b = n->end;
> 
> 'a' and 'b' variables actually make this incredibly confusing.  Use
> better variable names or just drop them entirely, it's much easier to
> follow as n->start & n->end.
> 
>> +/* No overlap */
>> +if ((start > b) || (end < a))
>> +continue;
>> +/* Split the current node and create holes */
>> +if (start > a)
>> +ret = vfio_insert_iova(a, start - 1, &n->list);
>> +if (!ret && end < b)
>> +ret = vfio_insert_iova(end + 1, b, &n->list);
>> +if (ret)
>> +return ret;
>> +
>> +list_del(&n->list);
> 
> This is trickier than it appears and deserves some explanation.  AIUI,
> we're actually inserting duplicate entries for the remainder at the
> start of the range and then at the end of the range (and the order is
> important here because we're inserting each before the current node),
> and then we delete the current node.  So the iova_list is kept sorted
> through this process, though temporarily includes some bogus, unordered
> sub-sets.
> 
>> +
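
The split described in the review comment above can be sketched in user
space as follows. This is only an illustration of the arithmetic, assuming
inclusive range ends; struct range and range_exclude() are hypothetical
names, and the kernel list handling is replaced by a two-element output
array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Carve the reserved range resv out of cur. Up to two remainders are
 * written to out in ascending order; the return value is how many were
 * written. Without any overlap, cur itself is returned unchanged.
 */
static size_t range_exclude(struct range cur, struct range resv,
			    struct range out[2])
{
	size_t n = 0;

	assert(cur.start <= cur.end && resv.start <= resv.end);

	/* No overlap: keep the current range as it is. */
	if (resv.start > cur.end || resv.end < cur.start) {
		out[n++] = cur;
		return n;
	}
	/* Remainder below the reserved region. */
	if (resv.start > cur.start)
		out[n++] = (struct range){ cur.start, resv.start - 1 };
	/* Remainder above the reserved region. */
	if (resv.end < cur.end)
		out[n++] = (struct range){ resv.end + 1, cur.end };

	return n;
}
```

Emitting the lower remainder before the upper one is what keeps the result
sorted, which mirrors the ordering concern raised in the review.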

Re: problematic rc9 futex changes.

2018-01-23 Thread Peter Zijlstra

On Tue, Jan 23, 2018 at 12:34:46AM -0500, Dave Jones wrote:
> c1e2f0eaf015fb: "futex: Avoid violating the 10th rule of futex" seems to
> make up a few new rules to violate.
> 
> Coverity picked up these two problems in the same code:
> 

Yeah, Geert also spotted it:

https://lkml.kernel.org/r/20180122103947.gd2...@hirez.programming.kicks-ass.net

I've been running the robustpi tests from glibc but have so far failed
to actually trigger the bug. I think I'll just write up a Changelog and
post the fix from the above link.

Re: [PATCH v1] x86/io: Define readq()/writeq() to use 64-bit type

2018-01-23 Thread Andy Shevchenko

On Mon, 2018-01-22 at 16:46 -0800, h...@zytor.com wrote:
> On January 22, 2018 4:32:14 PM PST, "Mehta, Sohil" wrote:
> > On Fri, 2018-01-19 at 16:33 +0200, Andy Shevchenko wrote:

> > > +build_mmio_read(readq, "q", unsigned long long, "=r", :"memory")
> > > +build_mmio_read(__readq, "q", unsigned long long, "=r", )
> > > +build_mmio_write(writeq, "q", unsigned long long, "r", :"memory")
> > > +build_mmio_write(__writeq, "q", unsigned long long, "r", )
> > >  
> > >  #define readq_relaxed(a) __readq(a)
> > >  #define writeq_relaxed(v, a) __writeq(v, a)
> > 
> > The patch works for me:
> > 
> > Tested-by: Sohil Mehta 
> > 

> Wouldn't simply u64 make more sense?

It would break the common style used in this module for the rest of the
accessors.

So, I prefer to go with unsigned long long and change later, if needed,
from POD types to uNN ones in the entire file.

-- 
Andy Shevchenko 
Intel Finland Oy

Re: [RFC v2 3/5] vfio/type1: check dma map request is within a valid iova range

2018-01-23 Thread Auger Eric

Hi Shameer,

On 12/01/18 17:45, Shameer Kolothum wrote:
> This checks and rejects any dma map request outside valid iova
> range.
> 
> Signed-off-by: Shameer Kolothum 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 7609070..47ea490 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -971,6 +971,23 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, 
> struct vfio_dma *dma,
>   return ret;
>  }
>  
> +/*
> + * Check dma map request is within a valid iova range
> + */
> +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
> + phys_addr_t start, phys_addr_t end)
s/phys_addr_t/dma_addr_t here also.
> +{
> + struct list_head *iova = &iommu->iova_list;
> + struct vfio_iova *node;
> +
> + list_for_each_entry(node, iova, list) {
> + if ((start >= node->start) && (end <= node->end))
> + return true;
> + }
> +
> + return false;
> +}
> +
>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  struct vfio_iommu_type1_dma_map *map)
>  {
> @@ -1009,6 +1026,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>   goto out_unlock;
>   }
>  
> + if (!vfio_iommu_iova_dma_valid(iommu, iova, iova + size - 1)) {
> + ret = -EINVAL;
> + goto out_unlock;
> + }
> +
>   dma = kzalloc(sizeof(*dma), GFP_KERNEL);
>   if (!dma) {
>   ret = -ENOMEM;
> 

Thanks

Eric
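
For reference, the containment test in the quoted patch can be tried out in
user space with a sketch like the one below. It assumes inclusive range
ends, and an array stands in for the kernel list; struct iova_range and
iova_range_valid() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct iova_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * Return true if [start, end] lies entirely inside one of the valid
 * ranges. A request that straddles a hole between two ranges is rejected.
 */
static bool iova_range_valid(const struct iova_range *valid, size_t count,
			     uint64_t start, uint64_t end)
{
	size_t i;

	assert(valid != NULL && start <= end);

	for (i = 0; i < count; i++) {
		if (start >= valid[i].start && end <= valid[i].end)
			return true;
	}
	return false;
}
```

Note that the request must fit in a single range; two adjacent valid ranges
separated by a reserved hole do not combine to accept it.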

[PATCH v8 3/5] x86/KASLR: Give a warning if movable_node specified without kaslr_mem=

2018-01-23 Thread Chao Fan

Specifying 'movable_node' without 'kaslr_mem=' may break memory hotplug,
so recommend that users specify 'kaslr_mem=' whenever 'movable_node' is
used.

Acked-by: Baoquan He 
Signed-off-by: Chao Fan 
---
 arch/x86/boot/compressed/kaslr.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index b200a7ceafc1..8703cc764306 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -282,6 +282,16 @@ static int handle_mem_filter(void)
!strstr(args, "kaslr_mem="))
return 0;
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+   /*
+* Check if 'kaslr_mem=' is specified when 'movable_node' is found. If
+* not, just give a warning, because memory hotplug could be affected
+* if the kernel is put on movable memory regions.
+*/
+   if (strstr(args, "movable_node") && !strstr(args, "kaslr_mem="))
+   warn("'kaslr_mem=' should be specified when using 'movable_node'.\n");
+#endif
+
tmp_cmdline = malloc(len + 1);
if (!tmp_cmdline)
error("Failed to allocate space for tmp_cmdline");
-- 
2.14.3
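
The condition added by this patch can be exercised in user space with a
sketch like the following. The helper name movable_node_needs_kaslr_mem()
is hypothetical; like the patch, it uses plain substring matching, so it
makes no attempt to parse the command line into separate parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true when the command line contains 'movable_node' but no
 * 'kaslr_mem=' parameter, i.e. the case the patch warns about.
 */
static bool movable_node_needs_kaslr_mem(const char *cmdline)
{
	assert(cmdline != NULL);

	return strstr(cmdline, "movable_node") != NULL &&
	       strstr(cmdline, "kaslr_mem=") == NULL;
}
```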

Re: [PATCH v2 2/4] drivers: firmware: xilinx: Add ZynqMP firmware driver

2018-01-23 Thread Greg KH

On Wed, Jan 17, 2018 at 12:20:32PM -0800, Jolly Shah wrote:
> This patch is adding communication layer with firmware.
> Firmware driver provides an interface to firmware APIs.
> Interface APIs can be used by any driver to communicate to
> PMUFW(Platform Management Unit). All requests go through ATF.
> 
> Signed-off-by: Jolly Shah 
> Signed-off-by: Rajan Vaja 
> ---
>  arch/arm64/Kconfig.platforms|   1 +
>  drivers/firmware/Kconfig|   1 +
>  drivers/firmware/Makefile   |   1 +
>  drivers/firmware/xilinx/Kconfig |   4 +
>  drivers/firmware/xilinx/Makefile|   4 +
>  drivers/firmware/xilinx/zynqmp/Kconfig  |  16 +
>  drivers/firmware/xilinx/zynqmp/Makefile |   4 +
>  drivers/firmware/xilinx/zynqmp/firmware.c   | 987 
> 
>  include/linux/firmware/xilinx/zynqmp/firmware.h | 570 ++

Why does this file need to be in include/linux/ at all?  Shouldn't it
just live in the driver-specific subdir?

thanks,

greg k-h

[PATCH v8 1/5] x86/KASLR: Add kaslr_mem=nn[KMG]@ss[KMG]

2018-01-23 Thread Chao Fan

Introduce a new kernel parameter kaslr_mem=nn[KMG]@ss[KMG], which is used
by KASLR only during the kernel decompression stage.

Users can use it to specify the memory regions into which the kernel can
be randomized. E.g. if movable_node is specified on the kernel cmdline,
the kernel could be extracted into movable regions, which would make
memory hotplug fail. With 'kaslr_mem=', the kernel is limited to the
immovable regions specified.

Tested-by: Luiz Capitulino 
Acked-by: Baoquan He 
Signed-off-by: Chao Fan 
---
 arch/x86/boot/compressed/kaslr.c | 73 ++--
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 8199a6187251..b21741135673 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -108,6 +108,15 @@ enum mem_avoid_index {
 
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
+/* Only support at most 4 usable memory regions specified for kaslr */
+#define MAX_KASLR_MEM_USABLE   4
+
+/* Store the usable memory regions for kaslr */
+static struct mem_vector mem_usable[MAX_KASLR_MEM_USABLE];
+
+/* The amount of usable regions for kaslr user specify, not more than 4 */
+static int num_usable_region;
+
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
/* Item one is entirely before item two. */
@@ -206,7 +215,62 @@ static void mem_avoid_memmap(char *str)
memmap_too_large = true;
 }
 
-static int handle_mem_memmap(void)
+static int parse_kaslr_mem(char *p,
+  unsigned long long *start,
+  unsigned long long *size)
+{
+   char *oldp;
+
+   if (!p)
+   return -EINVAL;
+
+   oldp = p;
+   *size = memparse(p, &p);
+   if (p == oldp)
+   return -EINVAL;
+
+   switch (*p) {
+   case '@':
+   *start = memparse(p + 1, &p);
+   return 0;
+   default:
+   /*
+* If w/o offset, only size specified, kaslr_mem=nn[KMG]
+* has the same behaviour as kaslr_mem=nn[KMG]@0. It means
+* the region starts from 0.
+*/
+   *start = 0;
+   return 0;
+   }
+
+   return -EINVAL;
+}
+
+static void parse_kaslr_mem_regions(char *str)
+{
+   static int i;
+
+   while (str && (i < MAX_KASLR_MEM_USABLE)) {
+   int rc;
+   unsigned long long start, size;
+   char *k = strchr(str, ',');
+
+   if (k)
+   *k++ = 0;
+
+   rc = parse_kaslr_mem(str, &start, &size);
+   if (rc < 0)
+   break;
+   str = k;
+
+   mem_usable[i].start = start;
+   mem_usable[i].size = size;
+   i++;
+   }
+   num_usable_region = i;
+}
+
+static int handle_mem_filter(void)
 {
char *args = (char *)get_cmd_line_ptr();
size_t len = strlen((char *)args);
@@ -214,7 +278,8 @@ static int handle_mem_memmap(void)
char *param, *val;
u64 mem_size;
 
-   if (!strstr(args, "memmap=") && !strstr(args, "mem="))
+   if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
+   !strstr(args, "kaslr_mem="))
return 0;
 
tmp_cmdline = malloc(len + 1);
@@ -239,6 +304,8 @@ static int handle_mem_memmap(void)
 
if (!strcmp(param, "memmap")) {
mem_avoid_memmap(val);
+   } else if (!strcmp(param, "kaslr_mem")) {
+   parse_kaslr_mem_regions(val);
} else if (!strcmp(param, "mem")) {
char *p = val;
 
@@ -378,7 +445,7 @@ static void mem_avoid_init(unsigned long input, unsigned 
long input_size,
/* We don't need to set a mapping for setup_data. */
 
/* Mark the memmap regions we need to avoid */
-   handle_mem_memmap();
+   handle_mem_filter();
 
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
/* Make sure video RAM can be used. */
-- 
2.14.3
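
To make the nn[KMG]@ss[KMG] syntax concrete, here is a minimal user-space
sketch of the same parsing idea. It is not the boot-stub code: parse_size()
is a hypothetical stand-in for the kernel's memparse() built on strtoull(),
and parse_region() is an assumed name for a helper that handles a single
region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a number with an optional K, M or G suffix (powers of 1024). */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the suffix */
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse "size[@start]". A missing "@start" means the region starts at 0.
 * Returns 0 on success and -1 if no size could be parsed.
 */
static int parse_region(const char *s, unsigned long long *start,
			unsigned long long *size)
{
	char *p;

	assert(start != NULL && size != NULL);

	if (!s)
		return -1;

	*size = parse_size(s, &p);
	if (p == s)
		return -1;	/* nothing was consumed */

	*start = (*p == '@') ? parse_size(p + 1, &p) : 0;
	return 0;
}
```

For example, "512M@4G" would describe a 512 MiB region starting at 4 GiB,
while "1G" alone would describe the first 1 GiB of memory.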

[PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 .../selftests/rcutorture/configs/rcuperf/PRCU-12| 21 +
 .../rcutorture/configs/rcuperf/PRCU-12.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-14| 21 +
 .../rcutorture/configs/rcuperf/PRCU-14.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-15| 21 +
 .../rcutorture/configs/rcuperf/PRCU-15.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-16| 21 +
 .../rcutorture/configs/rcuperf/PRCU-16.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-2 | 21 +
 .../rcutorture/configs/rcuperf/PRCU-2.boot  |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-32| 21 +
 .../rcutorture/configs/rcuperf/PRCU-32.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-4 | 21 +
 .../rcutorture/configs/rcuperf/PRCU-4.boot  |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-48| 21 +
 .../rcutorture/configs/rcuperf/PRCU-48.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-56| 21 +
 .../rcutorture/configs/rcuperf/PRCU-56.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-60| 21 +
 .../rcutorture/configs/rcuperf/PRCU-60.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-62| 21 +
 .../rcutorture/configs/rcuperf/PRCU-62.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-64| 21 +
 .../rcutorture/configs/rcuperf/PRCU-64.boot |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-8 | 21 +
 .../rcutorture/configs/rcuperf/PRCU-8.boot  |  1 +
 .../selftests/rcutorture/configs/rcuperf/TREE-12| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-14| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-15| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-16| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-2 | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-32| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-4 | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-48| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-56| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-60| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-62| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-64| 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-8 | 21 +
 39 files changed, 559 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
 create mode 100644 
tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
 create mode 100644

Re: Ping Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all

2018-01-23 Thread Vincent Legoll

On 1/23/18, Michael Ellerman  wrote:
> This has been broken in linux-next for ~6 weeks now, can we please merge
> this and get it fixed.

Added Stephen Rothwell to cc

-- 
Vincent Legoll

[PATCH RFC 03/16] rcutorture: Add PRCU test config files

2018-01-23 Thread lianglihao

From: Lihao Liang 

Use the same config files as TREE02, TREE03, TREE06, TREE07, and TREE09.

Signed-off-by: Lihao Liang 
---
 .../selftests/rcutorture/configs/rcu/CFLIST|  5 
 .../selftests/rcutorture/configs/rcu/PRCU02| 27 ++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |  1 +
 .../selftests/rcutorture/configs/rcu/PRCU03| 23 ++
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU06| 26 +
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |  5 
 .../selftests/rcutorture/configs/rcu/PRCU07| 25 
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU09| 19 +++
 .../selftests/rcutorture/configs/rcu/PRCU09.boot   |  1 +
 11 files changed, 136 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST 
b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
index a3a1a05a..7359e194 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
@@ -1,3 +1,8 @@
+PRCU02
+PRCU03
+PRCU06
+PRCU07
+PRCU09
 TREE01
 TREE02
 TREE03
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02 
b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
new file mode 100644
index ..5f532f05
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
@@ -0,0 +1,27 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_FANOUT=3
+CONFIG_RCU_FANOUT_LEAF=3
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
new file mode 100644
index ..6c5e626f
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
@@ -0,0 +1 @@
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
new file mode 100644
index ..869cadc8
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
@@ -0,0 +1,23 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=16
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=y
+CONFIG_NO_HZ_IDLE=n
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_TRACE=y
+CONFIG_HOTPLUG_CPU=y
+CONFIG_RCU_FANOUT=2
+CONFIG_RCU_FANOUT_LEAF=2
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_RCU_BOOST=y
+CONFIG_RCU_KTHREAD_PRIO=2
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
new file mode 100644
index ..0be10cba
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
@@ -0,0 +1,2 @@
+rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
new file mode 100644
index ..b1480963
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
@@ -0,0 +1,26 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PRCU=y
+#CHECK#CONFIG_TREE_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n

[PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier()

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 kernel/rcu/prcu.c   | 4 +++-
 kernel/rcu/rcutorture.c | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 2664d091..49cb70e6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -179,8 +179,10 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 
/* Use GFP_ATOMIC with IRQs disabled */
vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
-   if (!vhp)
+   if (!vhp) {
+   WARN_ON(1);
return;
+   }
 
head->func = func;
head->next = NULL;
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 7d65bf0c..9215ebb0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -797,8 +797,8 @@ static struct rcu_torture_ops prcu_ops = {
.exp_sync   = synchronize_prcu,
.get_state  = NULL,
.cond_sync  = NULL,
-   .call   = NULL,
-   .cb_barrier = NULL,
+   .call   = call_prcu,
+   .cb_barrier = prcu_barrier,
.fqs= NULL,
.stats  = NULL,
.irq_capable= 1,
-- 
2.14.1.729.g59c0ea183

[PATCH RFC 16/16] Add GPLv2 license

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h | 4 
 kernel/rcu/prcu.c| 4 
 2 files changed, 8 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 9f740985..9fa74dac 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -4,6 +4,10 @@
  *
  * Authors: Heng Zhang 
  *  Lihao Liang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #ifndef __LINUX_PRCU_H
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index ef2c7730..06375ee6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -10,6 +10,10 @@
  *
  * Authors: Heng Zhang 
  *  Lihao Liang 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #include 
-- 
2.14.1.729.g59c0ea183

[PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h | 14 ++
 init/Kconfig |  7 +++
 kernel/rcu/Makefile  |  3 ++-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index cce967fd..bb20fa40 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -7,8 +7,7 @@
 #include 
 #include 
 
-#define CONFIG_PRCU
-
+#ifdef CONFIG_PRCU
 struct prcu_version_head {
unsigned long long version;
struct prcu_version_head *next;
@@ -48,7 +47,6 @@ struct prcu_struct {
struct completion barrier_completion;
 };
 
-#ifdef CONFIG_PRCU
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
@@ -62,11 +60,11 @@ void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
-#define prcu_read_lock() do {} while (0)
-#define prcu_read_unlock() do {} while (0)
-#define synchronize_prcu() do {} while (0)
-#define call_prcu() do {} while (0)
-#define prcu_barrier() do {} while (0)
+#define prcu_read_lock rcu_read_lock
+#define prcu_read_unlock rcu_read_unlock
+#define synchronize_prcu synchronize_rcu
+#define call_prcu call_rcu
+#define prcu_barrier rcu_barrier
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc..c1fd80f9 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -565,6 +565,13 @@ config TASKS_RCU
  only voluntary context switch (not preemption!), idle, and
  user-mode execution as quiescent states.
 
+config PRCU
+   bool
+   default y
+   help
+ This option selects the PRCU implementation based on a fast
+ consensus protocol.
+
 config RCU_STALL_COMMON
def_bool ( TREE_RCU || PREEMPT_RCU || RCU_TRACE )
help
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 8791419c..9074b395 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o prcu.o
+obj-y += update.o sync.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
@@ -12,4 +12,5 @@ obj-$(CONFIG_TREE_RCU) += tree.o
 obj-$(CONFIG_PREEMPT_RCU) += tree.o
 obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o
 obj-$(CONFIG_TINY_RCU) += tiny.o
+obj-$(CONFIG_PRCU) += prcu.o
 obj-$(CONFIG_RCU_NEED_SEGCBLIST) += rcu_segcblist.o
-- 
2.14.1.729.g59c0ea183

[PATCH v6 0/2] Initial Allwinner V3s CSI Support

2018-01-23 Thread Yong Deng

This patchset adds initial support for the Allwinner V3s CSI.

The Allwinner V3s SoC features two CSI modules. CSI0 is used for the MIPI
CSI-2 interface and CSI1 is used for the parallel interface. This is not
documented in the datasheet; it was determined by testing and guesswork.

This patchset implements a v4l2 framework driver and adds binding
documentation for it.

Currently, the driver only supports the parallel interface. It has been
tested with a BT1120 signal generated by an FPGA. The following
features are not supported by this patchset:
  - ISP 
  - MIPI-CSI2
  - Master clock for camera sensor
  - Power regulator for the front end IC

Changes in v6:
  * Add Rob Herring's review tag.
  * Fix a NULL pointer dereference by picking Maxime Ripard's patch.
  * Add Maxime Ripard's test tag.

Changes in v5:
  * Using the new SPDX tags.
  * Fix MODULE_LICENSE.
  * Add many default cases and warning messages.
  * Detail the parallel bus properties
  * Fix some spelling and syntax mistakes.

Changes in v4:
  * Deal with the CSI 'INNER QUEUE'.
    The CSI looks up the next DMA buffer for the next frame before the
    frame-done IRQ of the current frame is triggered. This is not
    documented but was reported by Ondřej Jirman.
    The BSP code has a workaround for this too. It skips marking the
    first buffer as frame done for VB2 and passes the second buffer
    to the CSI in the first frame-done ISR call. Then, in the second
    frame-done ISR call, it marks the first buffer as frame done for VB2
    and passes the third buffer to the CSI, and so on. The bad thing is
    that the first buffer is written twice and the first frame
    is dropped even when enough buffers are queued.
    So I made an improvement here: pass the next buffer to the CSI
    right after starting the CSI. In this case, the first frame
    is stored in the first buffer and the second frame in the second
    buffer. This method avoids dropping the first frame, although
    frames are still dropped when too few buffers are queued.
  * Fix: using a wrong mbus_code when getting the supported formats
  * Change all fourcc to pixformat
  * Change some function names
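
The buffer handling described under "Deal with the CSI 'INNER QUEUE'" can be
illustrated with a small userspace simulation. This is only a sketch under an
assumed hardware model: the CSI latches the address register for the next
frame before the frame-done IRQ of the current frame fires. The names
simulate(), struct sim_result and the prefetch flag are hypothetical and do
not come from the driver or the BSP code.

```c
#define NBUFS   4	/* queued buffers, enough for this short run */
#define NFRAMES 3	/* frames captured in the simulation */

struct sim_result {
	int first_delivered;	/* frame number found in the first completed buffer */
	int writes_to_buf0;	/* how many frames the hardware wrote into buffer 0 */
};

/*
 * prefetch == 0: BSP-style, the second buffer is programmed in the first ISR.
 * prefetch == 1: the second buffer is programmed right after stream start.
 */
static struct sim_result simulate(int prefetch)
{
	int content[NBUFS] = { 0 };	/* frame number last written to each buffer */
	int addr_reg = 0;		/* buffer index programmed by the driver */
	int hw_cur, hw_next;		/* buffer in use / buffer latched for next frame */
	int next_to_program = 1;
	int next_to_complete = 0;
	struct sim_result res = { 0, 0 };
	int f;

	hw_cur = addr_reg;		/* stream start: hardware picks up buffer 0 */
	if (prefetch)
		addr_reg = next_to_program++;

	for (f = 1; f <= NFRAMES; f++) {
		content[hw_cur] = f;	/* hardware writes frame f */
		if (hw_cur == 0)
			res.writes_to_buf0++;
		hw_next = addr_reg;	/* assumed: latched before the IRQ fires */

		/* frame-done ISR */
		if (prefetch || f > 1) {
			if (!res.first_delivered)
				res.first_delivered = content[next_to_complete];
			next_to_complete++;
		}
		addr_reg = next_to_program++;

		hw_cur = hw_next;	/* next frame uses the latched buffer */
	}
	return res;
}
```

Under this model, simulate(0) delivers frame 2 in the first completed buffer
after writing buffer 0 twice, while simulate(1) delivers frame 1 and writes
buffer 0 only once.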

Changes in v3:
  * Get rid of struct sun6i_csi_ops
  * Move sun6i-csi to new directory drivers/media/platform/sunxi
  * Merge sun6i_csi.c and sun6i_csi_v3s.c into sun6i_csi.c
  * Use generic fwnode endpoints parser
  * Only support a single subdev to make things simple
  * Many complaintion fix

Changes in v2: 
  * Change sunxi-csi to sun6i-csi
  * Rebase to media_tree master branch 

The 'v4l2-compliance -s -f' output follows. I have tested this
with both interlaced and progressive signals:

# ./v4l2-compliance -s -f
v4l2-compliance SHA   : 6049ea8bd64f9d78ef87ef0c2b3dc9b5de1ca4a1

Driver Info:
Driver name   : sun6i-video
Card type : sun6i-csi
Bus info  : platform:csi
Driver version: 4.15.0
Capabilities  : 0x8421
Video Capture
Streaming
Extended Pix Format
Device Capabilities
Device Caps   : 0x0421
Video Capture
Streaming
Extended Pix Format

Compliance test for device /dev/video0 (not using libv4l2):

Required ioctls:
test VIDIOC_QUERYCAP: OK

Allow for multiple opens:
test second video open: OK
test VIDIOC_QUERYCAP: OK
test VIDIOC_G/S_PRIORITY: OK
test for unlimited opens: OK

Debug ioctls:
test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
test VIDIOC_LOG_STATUS: OK (Not Supported)

Input ioctls:
test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
test VIDIOC_ENUMAUDIO: OK (Not Supported)
test VIDIOC_G/S/ENUMINPUT: OK
test VIDIOC_G/S_AUDIO: OK (Not Supported)
Inputs: 1 Audio Inputs: 0 Tuners: 0

Output ioctls:
test VIDIOC_G/S_MODULATOR: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_ENUMAUDOUT: OK (Not Supported)
test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
test VIDIOC_G/S_AUDOUT: OK (Not Supported)
Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
test VIDIOC_G/S_EDID: OK (Not Supported)

Test input 0:

Control ioctls:
test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK (Not Supported)
test VIDIOC_QUERYCTRL: OK (Not Supported)
test VIDIOC_G/S_CTRL: OK (Not Supported)
test VIDIOC_G/S/TRY_EXT_CTRLS: OK (Not Supported)
test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK (Not Supported)
test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
Standard Controls: 0 Private Controls: 0

Format

[PATCH v6 2/2] media: V3s: Add support for Allwinner CSI.

2018-01-23 Thread Yong Deng

The Allwinner V3s SoC features two CSI modules. CSI0 is used for the MIPI
CSI-2 interface and CSI1 is used for the parallel interface. This is not
documented in the datasheet; it was determined by testing and guesswork.

This patch implements a v4l2 framework driver for it.

Currently, the driver only supports the parallel interface. MIPI CSI-2
and ISP support are not included in this patch.

Tested-by: Maxime Ripard 
Signed-off-by: Yong Deng 
---
 MAINTAINERS|   8 +
 drivers/media/platform/Kconfig |   1 +
 drivers/media/platform/Makefile|   2 +
 drivers/media/platform/sunxi/sun6i-csi/Kconfig |   9 +
 drivers/media/platform/sunxi/sun6i-csi/Makefile|   3 +
 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c | 908 +
 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.h | 143 
 .../media/platform/sunxi/sun6i-csi/sun6i_csi_reg.h | 196 +
 .../media/platform/sunxi/sun6i-csi/sun6i_video.c   | 753 +
 .../media/platform/sunxi/sun6i-csi/sun6i_video.h   |  53 ++
 10 files changed, 2076 insertions(+)
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/Kconfig
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/Makefile
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.h
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi_reg.h
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_video.c
 create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_video.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 9501403..b792fe5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3783,6 +3783,14 @@ M:   Jaya Kumar 
 S: Maintained
 F: sound/pci/cs5535audio/
 
+CSI DRIVERS FOR ALLWINNER V3s
+M: Yong Deng 
+L: linux-me...@vger.kernel.org
+T: git git://linuxtv.org/media_tree.git
+S: Maintained
+F: drivers/media/platform/sunxi/sun6i-csi/
+F: Documentation/devicetree/bindings/media/sun6i-csi.txt
+
 CW1200 WLAN driver
 M: Solomon Peachy 
 S: Maintained
diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index fd0c998..41017e3 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -150,6 +150,7 @@ source "drivers/media/platform/am437x/Kconfig"
 source "drivers/media/platform/xilinx/Kconfig"
 source "drivers/media/platform/rcar-vin/Kconfig"
 source "drivers/media/platform/atmel/Kconfig"
+source "drivers/media/platform/sunxi/sun6i-csi/Kconfig"
 
 config VIDEO_TI_CAL
tristate "TI CAL (Camera Adaptation Layer) driver"
diff --git a/drivers/media/platform/Makefile b/drivers/media/platform/Makefile
index 003b0bb..e6e9ce7 100644
--- a/drivers/media/platform/Makefile
+++ b/drivers/media/platform/Makefile
@@ -97,3 +97,5 @@ obj-$(CONFIG_VIDEO_QCOM_CAMSS)+= qcom/camss-8x16/
 obj-$(CONFIG_VIDEO_QCOM_VENUS) += qcom/venus/
 
 obj-y  += meson/
+
+obj-$(CONFIG_VIDEO_SUN6I_CSI)  += sunxi/sun6i-csi/
diff --git a/drivers/media/platform/sunxi/sun6i-csi/Kconfig b/drivers/media/platform/sunxi/sun6i-csi/Kconfig
new file mode 100644
index 000..314188a
--- /dev/null
+++ b/drivers/media/platform/sunxi/sun6i-csi/Kconfig
@@ -0,0 +1,9 @@
+config VIDEO_SUN6I_CSI
+   tristate "Allwinner V3s Camera Sensor Interface driver"
+   depends on VIDEO_V4L2 && COMMON_CLK && VIDEO_V4L2_SUBDEV_API && HAS_DMA
+   depends on ARCH_SUNXI || COMPILE_TEST
+   select VIDEOBUF2_DMA_CONTIG
+   select REGMAP_MMIO
+   select V4L2_FWNODE
+   ---help---
+  Support for the Allwinner Camera Sensor Interface Controller on V3s.
diff --git a/drivers/media/platform/sunxi/sun6i-csi/Makefile b/drivers/media/platform/sunxi/sun6i-csi/Makefile
new file mode 100644
index 000..213cb6b
--- /dev/null
+++ b/drivers/media/platform/sunxi/sun6i-csi/Makefile
@@ -0,0 +1,3 @@
+sun6i-csi-y += sun6i_video.o sun6i_csi.o
+
+obj-$(CONFIG_VIDEO_SUN6I_CSI) += sun6i-csi.o
diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
new file mode 100644
index 000..9c341f0
--- /dev/null
+++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
@@ -0,0 +1,908 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2011-2018 Magewell Electronics Co., Ltd. (Nanjing)
+ * All rights reserved.
+ * Author: Yong Deng 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sun6i_csi.h"
+#include "sun6i_csi_reg.h"
+
+#define MODULE_NAME "sun6i-csi"
+
+struct sun6i_csi_dev {
+   struct sun6i_csi csi;
+   struct device

[PATCH v8 2/5] x86/KASLR: Handle the memory regions specified in kaslr_mem

2018-01-23 Thread Chao Fan

If no 'kaslr_mem=' specified, just handle the e820/efi entries directly
as before. Otherwise, limit kernel to memory regions specified in
'kaslr_mem=' commandline.

Rename process_mem_region to slots_count to match
slots_fetch_random, and name new function as process_mem_region.
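
The filtering described above (keeping only the part of each e820/EFI region
that lies inside a 'kaslr_mem=' region) is an interval intersection. A minimal
userspace sketch follows. struct mem_vector mirrors the name used in the
patch, but intersect() and clamp_ull() are hypothetical helpers written for
illustration; the patch itself uses the kernel's clamp() macro inline.

```c
#include <stdbool.h>

struct mem_vector {
	unsigned long long start;
	unsigned long long size;
};

/* Clamp val into [lo, hi]; stands in for the kernel's clamp() macro. */
static unsigned long long clamp_ull(unsigned long long val,
				    unsigned long long lo,
				    unsigned long long hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Returns true and fills *out when region and usable overlap. */
static bool intersect(struct mem_vector region, struct mem_vector usable,
		      struct mem_vector *out)
{
	unsigned long long start = usable.start;
	unsigned long long end = start + usable.size;
	unsigned long long region_end = region.start + region.size;
	unsigned long long entry_start = clamp_ull(region.start, start, end);
	unsigned long long entry_end = clamp_ull(region_end, start, end);

	/* Both ends clamp to the same point when the ranges are disjoint. */
	if (entry_start >= entry_end)
		return false;

	out->start = entry_start;
	out->size = entry_end - entry_start;
	return true;
}
```

Clamping both ends of the region into the usable range avoids a separate
case analysis for partial overlap, containment, and disjoint ranges.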

Tested-by: Luiz Capitulino 
Acked-by: Baoquan He 
Signed-off-by: Chao Fan 
---
 arch/x86/boot/compressed/kaslr.c | 64 +---
 1 file changed, 53 insertions(+), 11 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index b21741135673..b200a7ceafc1 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -548,9 +548,9 @@ static unsigned long slots_fetch_random(void)
return 0;
 }
 
-static void process_mem_region(struct mem_vector *entry,
-  unsigned long minimum,
-  unsigned long image_size)
+static void slots_count(struct mem_vector *entry,
+   unsigned long minimum,
+   unsigned long image_size)
 {
struct mem_vector region, overlap;
struct slot_area slot_area;
@@ -627,6 +627,52 @@ static void process_mem_region(struct mem_vector *entry,
}
 }
 
+static bool process_mem_region(struct mem_vector region,
+  unsigned long long minimum,
+  unsigned long long image_size)
+{
+   /*
+* If 'kaslr_mem=' specified, walk all the regions, and
+* filter the intersection to slots_count.
+*/
+   if (num_usable_region > 0) {
+   int i;
+
+   for (i = 0; i < num_usable_region; i++) {
+   struct mem_vector entry;
+   unsigned long long start, end, entry_end, region_end;
+
+   start = mem_usable[i].start;
+   end = start + mem_usable[i].size;
+   region_end = region.start + region.size;
+
+   entry.start = clamp(region.start, start, end);
+   entry_end = clamp(region_end, start, end);
+
+                   if (entry.start < entry_end) {
+                   entry.size = entry_end - entry.start;
+                   slots_count(&entry, minimum, image_size);
+   }
+
+   if (slot_area_index == MAX_SLOT_AREA) {
+   debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+   return 1;
+   }
+   }
+   return 0;
+   }
+
+   /*
+* If no kaslr_mem stored, use region directly
+*/
+   slots_count(&region, minimum, image_size);
+   if (slot_area_index == MAX_SLOT_AREA) {
+   debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+   return 1;
+   }
+   return 0;
+}
+
 #ifdef CONFIG_EFI
 /*
  * Returns true if mirror region found (and must have been processed
@@ -692,11 +738,9 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
 
region.start = md->phys_addr;
region.size = md->num_pages << EFI_PAGE_SHIFT;
-   process_mem_region(&region, minimum, image_size);
-   if (slot_area_index == MAX_SLOT_AREA) {
-   debug_putstr("Aborted EFI scan (slot_areas full)!\n");
+
+   if (process_mem_region(region, minimum, image_size))
break;
-   }
}
return true;
 }
@@ -723,11 +767,9 @@ static void process_e820_entries(unsigned long minimum,
continue;
region.start = entry->addr;
region.size = entry->size;
-   process_mem_region(&region, minimum, image_size);
-   if (slot_area_index == MAX_SLOT_AREA) {
-   debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+
+   if (process_mem_region(region, minimum, image_size))
break;
-   }
}
 }
 
-- 
2.14.3

[PATCH v8 0/5] x86/KASLR: Add parameter kaslr_mem=nn[KMG]@ss[KMG]

2018-01-23 Thread Chao Fan

This is a resend of v8. There is no code change; only the code comments
and the documentation are improved. Baoquan's Acked-by and Luiz's
Tested-by are added accordingly.

***Background:
People reported that KASLR may randomly choose positions that are
located in movable memory regions. This breaks the memory hotplug
feature.

Also, on a KVM guest with 4GB of memory, the good unfragmented 1GB
region could be occupied by the randomized kernel. This causes hugetlb
to fail to allocate a 1GB page, while a kernel booted with 'nokaslr' has
no such issue, so this is a regression. Please see the discussion mail:
https://lkml.org/lkml/2018/1/4/236

***Solutions:
Introduce a new kernel parameter 'kaslr_mem=nn@ss' that lets users
specify the memory regions in which the kernel may be randomized
safely.

E.g. if 'movable_node' is specified, we can use 'kaslr_mem=nn@ss' to
tell KASLR where the kernel can be placed safely. The KASLR code can
then avoid the movable regions and choose only from the specified
immovable regions.

For the hugetlb case, users can always add 'kaslr_mem=1G' to the kernel
cmdline, since 0~1G is always a fragmented region because of the BIOS
reserved area. Users who know the system memory layout well can of
course specify regions more precisely.
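
To make the 'nn[KMG]@ss[KMG]' syntax concrete, here is a minimal userspace
sketch of how such an argument could be parsed. It is an illustration only:
parse_size() and parse_region_arg() are hypothetical names, the sketch uses
strtoull() from the C library, and the series itself may parse the option
differently (for example with the kernel's memparse() helper). A missing
'@ss' part is treated as a start address of 0, matching the 'kaslr_mem=1G'
example above.

```c
#include <stdlib.h>

struct mem_vector {
	unsigned long long start;
	unsigned long long size;
};

/* Parse a number with an optional K/M/G suffix; *end points past it. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long val = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		val <<= 10;	/* fall through */
	case 'M': case 'm':
		val <<= 10;	/* fall through */
	case 'K': case 'k':
		val <<= 10;
		(*end)++;	/* consume the suffix character */
		break;
	default:
		break;
	}
	return val;
}

/* Parse "nn[KMG][@ss[KMG]]"; returns 0 on success, -1 on malformed input. */
static int parse_region_arg(const char *arg, struct mem_vector *out)
{
	char *p;
	unsigned long long size = parse_size(arg, &p);

	if (p == arg || size == 0)
		return -1;

	out->size = size;
	out->start = 0;		/* no '@ss' means the region starts at 0 */

	if (*p == '@') {
		const char *off = p + 1;

		out->start = parse_size(off, &p);
		if (p == off)
			return -1;
	}
	return *p == '\0' ? 0 : -1;
}
```

With this sketch, "1G" yields a 1 GiB region starting at 0, and "512M@4G"
yields a 512 MiB region starting at 4 GiB.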

*** Issues that need to be discussed
There are several issues I am not quite sure about; please help review
and give suggestions:

1) There is already mem_avoid[], which stores the memory regions
KASLR needs to avoid. For the regions KASLR can safely use, I named the
array mem_usable[], but I am not sure that is appropriate. Or should it
be kaslr_mem[] directly?

2) In v6, I made 'kaslr_mem=' a kernel parameter with which users can
specify memory regions where the kernel can be extracted safely, via
'kaslr_mem=nn@ss', or regions to avoid when extracting the kernel, via
'kaslr_mem=nn!ss'. On later reflection, 'kaslr_mem=nn@ss' seems to
satisfy the current requirement, so there is no need to introduce
'kaslr_mem=nn!ss'. I therefore dropped the patch handling
'kaslr_mem=nn!ss'; it may be added later if anyone thinks it is
necessary. Any suggestions?
https://www.spinics.net/lists/kernel/msg2698457.html

***Test results:
 - I did some tests for the memory hotplug issue. I specified the memory
   region in one node and found that the kernel was extracted to the
   memory of that node every time.
 - Luiz tested this series with a 4GB KVM guest. With kaslr_mem=1G,
   got one 1GB page allocated 100% of the time in 85 boots. Without
   kaslr_mem=, got 3 failures in only 10 boots (that is, in 3 boots
   no 1GB page allocated). So this series solves the 1GB page problem.

***History
v7->v8:
 - Just improve some comments.
 - Change the wrong spelling.
 - Add the Tested-by and Acked-by.

v6->v7:
 - Drop the unnecessary avoid part for now.
 - Add document for the new parameter.

v5->v6:
 - Add the last patch to save the avoid memory regions.

v4->v5:
 - Fix the problem reported by LKP
Follow Dou's suggestion:
 - Also return when "movable_node" is matched while parsing the kernel
   command line in handle_mem_filter without CONFIG_MEMORY_HOTPLUG
   defined

v3->v4:
Follow Kees's suggestion:
 - Put the functions and variables of immovable_mem under #ifdef
   CONFIG_MEMORY_HOTPLUG and move some code
 - Change the name of "process_mem_region" to "slots_count"
 - Rename the new function "process_immovable_mem" to "process_mem_region"
Follow Baoquan's suggestion:
 - Fail KASLR if "movable_node" specified without "immovable_mem"
 - Adjust the placement of the code handling mem_region directly if no
   immovable_mem specified
Follow Randy's suggestion:
 - Fix the mistake and add a detailed description to the document.

v2->v3:
Follow Baoquan He's suggestion:
 - Change names of several functions.
 - Add a new parameter "immovable_mem" instead of extending movable_node
 - Use clamp() to calculate the memory intersection, which makes the
   logic clearer.
 - Disable memory mirror if movable_node specified

v1->v2:
Follow Dou Liyang's suggestion:
 - Add parsing of movable_node=nn[KMG] without @ss[KMG]
 - Fix the bug for more than one "movable_node=" specified
 - Drop useless variables and use mem_vector region directly
 - Add more comments.


Chao Fan (5):
  x86/KASLR: Add kaslr_mem=nn[KMG]@ss[KMG]
  x86/KASLR: Handle the memory regions specified in kaslr_mem
  x86/KASLR: Give a warning if movable_node specified without kaslr_mem=
  x86/KASLR: Skip memory mirror handling if movable_node specified
  document: add document for kaslr_mem

 Documentation/admin-guide/kernel-parameters.txt |  10 ++
 arch/x86/boot/compressed/kaslr.c| 154 +---
 2 files changed, 150 insertions(+), 14 deletions(-)

-- 
2.14.3

[PATCH v2] iommu/mediatek: Move attach_device after iommu-group is ready for M4Uv1

2018-01-23 Thread Yong Wu

Since commit 05f80300dc8b ("iommu: Finish making iommu_group support
mandatory"), the iommu framework assumes that every iommu driver has
its own iommu-group, and it got rid of the FIXME workarounds for a NULL
group. The flow of Mediatek M4U gen1 is a bit tricky, however, and it
hangs in this case:

==
Unable to handle kernel NULL pointer dereference at virtual address 0030
PC is at mutex_lock+0x28/0x54
LR is at iommu_attach_device+0xa4/0xd4
pc : []lr : []psr: 6013
sp : df0edbb8  ip : df0edbc8  fp : df0edbc4
r10: c114da14  r9 : df2a3e40  r8 : 0003
r7 : df27a210  r6 : df2a90c4  r5 : 0030  r4 : 
r3 : df0f8000  r2 : f000  r1 : df29c610  r0 : 0030
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
xxx
(mutex_lock) from [] (iommu_attach_device+0xa4/0xd4)
(iommu_attach_device) from [] (__arm_iommu_attach_device+0x28/0x90)
(__arm_iommu_attach_device) from [] (arm_iommu_attach_device+0x1c/0x30)
(arm_iommu_attach_device) from [] (mtk_iommu_add_device+0xfc/0x214)
(mtk_iommu_add_device) from [] (add_iommu_group+0x3c/0x68)
(add_iommu_group) from [] (bus_for_each_dev+0x78/0xac)
(bus_for_each_dev) from [] (bus_set_iommu+0xb0/0xec)
(bus_set_iommu) from [] (mtk_iommu_probe+0x328/0x368)
(mtk_iommu_probe) from [] (platform_drv_probe+0x5c/0xc0)
(platform_drv_probe) from [] (driver_probe_device+0x2f4/0x4d8)
(driver_probe_device) from [] (__driver_attach+0x10c/0x128)
(__driver_attach) from [] (bus_for_each_dev+0x78/0xac)
(bus_for_each_dev) from [] (driver_attach+0x2c/0x30)
(driver_attach) from [] (bus_add_driver+0x1e0/0x278)
(bus_add_driver) from [] (driver_register+0x88/0x108)
(driver_register) from [] (__platform_driver_register+0x50/0x58)
(__platform_driver_register) from [] (m4u_init+0x24/0x28)
(m4u_init) from [] (do_one_initcall+0xf0/0x17c)
=

The root cause is that "arm_iommu_attach_device" is called before
"iommu_group_get_for_dev" in the interface "mtk_iommu_add_device". Thus,
we adjust the order of these two functions.

Unfortunately, another issue appears after the fix above. The
function "iommu_attach_device" allows only one device in each iommu
group. In the Mediatek case there is only one m4u group and all the
devices are in it, so the attach fails at this step.

To satisfy this requirement, a new iommu group is allocated for
each iommu consumer device. Meanwhile, we still have to use the
same domain for all the iommu groups, so a global variable
"mtk_domain_v1" is used to save the global domain.
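
The "one shared domain for every group" idea can be sketched in plain
userspace C. This is an illustration only, not the driver code: the struct
definitions are reduced stand-ins, domain_alloc(), domain_free() and
shared_domain are hypothetical names, and calloc()/free() replace the kernel
allocators. The sketch also ignores how many groups still reference the
domain when it is freed.

```c
#include <stdlib.h>

/* Reduced stand-ins for the kernel structures. */
struct iommu_domain {
	int type;
};

struct mtk_iommu_domain {
	struct iommu_domain domain;	/* first member, so pointers coincide */
};

/* The single domain shared by all groups. */
static struct mtk_iommu_domain *shared_domain;

/* Always hand out the same domain; allocate it on first use. */
static struct iommu_domain *domain_alloc(void)
{
	if (shared_domain)
		return &shared_domain->domain;

	shared_domain = calloc(1, sizeof(*shared_domain));
	if (!shared_domain)
		return NULL;

	return &shared_domain->domain;
}

/* Free the domain and clear the cached pointer for symmetry. */
static void domain_free(struct iommu_domain *domain)
{
	/* domain is the first member, so the cast recovers the container. */
	free((struct mtk_iommu_domain *)domain);
	shared_domain = NULL;
}
```

Clearing the cached pointer in domain_free() means a later domain_alloc()
allocates a fresh domain instead of returning a dangling pointer.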

CC: Robin Murphy 
CC: Honghui Zhang 
Fixes: 05f80300dc8b ("iommu: Finish making iommu_group support mandatory")
Reported-by: Ryder Lee 
Tested-by: Bibby Hsieh 
Signed-off-by: Yong Wu 
---
changes since v1:
   Add mtk_domain_v1=NULL in domain_free for symmetry.

v1: https://patchwork.kernel.org/patch/10176255/
---
 drivers/iommu/mtk_iommu_v1.c | 42 +++---
 1 file changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 542930c..86106bf 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -103,6 +103,9 @@ struct mtk_iommu_domain {
struct mtk_iommu_data   *data;
 };
 
+/* There is only a iommu domain in M4U gen1. */
+static struct mtk_iommu_domain *mtk_domain_v1;
+
 static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
 {
return container_of(dom, struct mtk_iommu_domain, domain);
@@ -251,10 +254,15 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
if (type != IOMMU_DOMAIN_UNMANAGED)
return NULL;
 
+   /* Always return the same domain. */
+   if (mtk_domain_v1)
+   return &mtk_domain_v1->domain;
+
dom = kzalloc(sizeof(*dom), GFP_KERNEL);
if (!dom)
return NULL;
 
+   mtk_domain_v1 = dom;
return &dom->domain;
 }
 
@@ -263,6 +271,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
struct mtk_iommu_data *data = dom->data;
 
+   mtk_domain_v1 = NULL;
dma_free_coherent(data->dev, M2701_IOMMU_PGT_SIZE,
dom->pgt_va, dom->pgt_pa);
kfree(to_mtk_domain(domain));
@@ -418,20 +427,12 @@ static int mtk_iommu_create_mapping(struct device *dev,
m4udev->archdata.iommu = mtk_mapping;
}
 
-   ret = arm_iommu_attach_device(dev, mtk_mapping);
-   if (ret)
-   goto err_release_mapping;
-
return 0;
-
-err_release_mapping:
-   arm_iommu_release_mapping(mtk_mapping);
-   m4udev->archdata.iommu = NULL;
-   return ret;
 }
 
 static int mtk_iommu_add_device(struct device *dev)
 {
+   struct dma_iommu_mapping

Re: [PATCH v7 2/2] mfd: syscon: Add hardware spinlock support

2018-01-23 Thread Baolin Wang

Hi Lee,

On 22 January 2018 at 21:43, Lee Jones  wrote:
> On Thu, 11 Jan 2018, Lee Jones wrote:
>> On Mon, 25 Dec 2017, Baolin Wang wrote:
>>
>> > Some system control registers need hardware spinlock to synchronize
>> > between the multiple subsystems, so we should add hardware spinlock
>> > support for syscon.
>> >
>> > Signed-off-by: Baolin Wang 
>> > Acked-by: Rob Herring 
>> > ---
>> > Changes since v6:
>> >  - Treat hwlock id 0 as valid for regmap.
>> >
>> > Changes since v5:
>> >  - Fix the case that hwspinlock is not enabled.
>> >
>> > Changes since v4:
>> >  - Add one exapmle to show how to add hwlock.
>> >  - Fix the coding style issue.
>> >
>> > Changes since v3:
>> >  - Add error handling for of_hwspin_lock_get_id()
>> >
>> > Changes since v2:
>> >  - Add acked tag from Rob.
>> >
>> > Changes since v1:
>> >  - Remove timeout configuration.
>> >  - Modify the binding file to add hwlocks.
>> > ---
>> >  Documentation/devicetree/bindings/mfd/syscon.txt |8 
>> >  drivers/mfd/syscon.c |   19 
>> > +++
>> >  2 files changed, 27 insertions(+)
>>
>> Applied, thanks.
>
> In order to avoid confusion, I should like to tell you that this patch
> is applied for v4.17, not v4.16.

This patch has already been applied to Mark's branch [1] with your ACK.
So should Mark drop this patch from his branch, and you will pick it up
and merge it for v4.17?

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git/commit/?h=topic/hwspinlock&id=3bafc09e779710abaa7b836fe3bbeeeab7754c2b

-- 
Baolin.wang
Best Regards

[PATCH RFC 01/16] prcu: Add PRCU implementation

2018-01-23 Thread lianglihao

From: Heng Zhang 

This RCU implementation (PRCU) is based on a fast consensus protocol
published in the following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

Signed-off-by: Heng Zhang 
Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h |  37 +++
 kernel/rcu/Makefile  |   2 +-
 kernel/rcu/prcu.c| 125 +++
 kernel/sched/core.c  |   2 +
 4 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/prcu.h
 create mode 100644 kernel/rcu/prcu.c

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
new file mode 100644
index ..653b4633
--- /dev/null
+++ b/include/linux/prcu.h
@@ -0,0 +1,37 @@
+#ifndef __LINUX_PRCU_H
+#define __LINUX_PRCU_H
+
+#include 
+#include 
+#include 
+
+#define CONFIG_PRCU
+
+struct prcu_local_struct {
+   unsigned int locked;
+   unsigned int online;
+   unsigned long long version;
+};
+
+struct prcu_struct {
+   atomic64_t global_version;
+   atomic_t active_ctr;
+   struct mutex mtx;
+   wait_queue_head_t wait_q;
+};
+
+#ifdef CONFIG_PRCU
+void prcu_read_lock(void);
+void prcu_read_unlock(void);
+void synchronize_prcu(void);
+void prcu_note_context_switch(void);
+
+#else /* #ifdef CONFIG_PRCU */
+
+#define prcu_read_lock() do {} while (0)
+#define prcu_read_unlock() do {} while (0)
+#define synchronize_prcu() do {} while (0)
+#define prcu_note_context_switch() do {} while (0)
+
+#endif /* #ifdef CONFIG_PRCU */
+#endif /* __LINUX_PRCU_H */
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 23803c7d..8791419c 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o
+obj-y += update.o sync.o prcu.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
new file mode 100644
index ..a00b9420
--- /dev/null
+++ b/kernel/rcu/prcu.c
@@ -0,0 +1,125 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
+
+struct prcu_struct global_prcu = {
+   .global_version = ATOMIC64_INIT(0),
+   .active_ctr = ATOMIC_INIT(0),
+   .mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+   .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
+};
+struct prcu_struct *prcu = &global_prcu;
+
+static inline void prcu_report(struct prcu_local_struct *local)
+{
+   unsigned long long global_version;
+   unsigned long long local_version;
+
+   global_version = atomic64_read(&prcu->global_version);
+   local_version = local->version;
+   if (global_version > local_version)
+       cmpxchg(&local->version, local_version, global_version);
+}
+
+void prcu_read_lock(void)
+{
+   struct prcu_local_struct *local;
+
+   local = get_cpu_ptr(&prcu_local);
+   if (!local->online) {
+       WRITE_ONCE(local->online, 1);
+       smp_mb();
+   }
+
+   local->locked++;
+   put_cpu_ptr(&prcu_local);
+}
+EXPORT_SYMBOL(prcu_read_lock);
+
+void prcu_read_unlock(void)
+{
+   int locked;
+   struct prcu_local_struct *local;
+
+   barrier();
+   local = get_cpu_ptr(&prcu_local);
+   locked = local->locked;
+   if (locked) {
+       local->locked--;
+       if (locked == 1)
+           prcu_report(local);
+       put_cpu_ptr(&prcu_local);
+   } else {
+       put_cpu_ptr(&prcu_local);
+       if (!atomic_dec_return(&prcu->active_ctr))
+           wake_up(&prcu->wait_q);
+   }
+}
+EXPORT_SYMBOL(prcu_read_unlock);
+
+static void prcu_handler(void *info)
+{
+   struct prcu_local_struct *local;
+
+   local = this_cpu_ptr(&prcu_local);
+   if (!local->locked)
+       WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
+}
+
+void synchronize_prcu(void)
+{
+   int cpu;
+   cpumask_t cpus;
+   unsigned long long version;
+   struct prcu_local_struct *local;
+
+   version = atomic64_add_return(1, &prcu->global_version);
+   mutex_lock(&prcu->mtx);
+
+   local = get_cpu_ptr(&prcu_local);
+   local->version = version;
+   put_cpu_ptr(&prcu_local);
+
+   cpumask_clear(&cpus);
+   for_each_possible_cpu(cpu) {
+       local = per_cpu_ptr(&prcu_local, cpu);
+       if (!READ_ONCE(local->online))
+           continue;
+       if (READ_ONCE(local->version) < version) {
+           smp_call_function_single(cpu, prcu_handler, NULL, 0);
+           cpumask_set_cpu(cpu, &cpus);
+       }
+   }

[PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol

2018-01-23 Thread lianglihao

From: Lihao Liang 

Dear Paul,

This patch set implements a preemptive version of RCU (PRCU) based on the
following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

We have also added preliminary callback-handling support.  Thus, the current
version provides the APIs prcu_read_lock(), prcu_read_unlock(),
synchronize_prcu(), call_prcu(), and prcu_barrier().

This is an experimental patch, so it would be good to have some feedback.

A known shortcoming is that the grace-period version is incremented only in
synchronize_prcu().  If call_prcu() or prcu_barrier() is called but
synchronize_prcu() is never invoked, callbacks cannot be invoked.  A later
version should address this issue, e.g. by adding a grace-period expedition
mechanism.  Other possible improvements include using a hierarchical
structure that takes the NUMA topology into account when sending IPIs in
synchronize_prcu().
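
As a rough illustration of the version-based scheme described above, here is
a minimal single-threaded user-space model.  It is only a sketch, not code
from this series: every model_* name and MODEL_NR_CPUS are made up for the
example, and the real code's atomics, memory barriers, IPIs and
context-switch handling are left out.  It does show why nothing makes
progress unless something bumps the global version.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Per-CPU state, loosely mirroring the fields of prcu_local_struct. */
struct model_local {
	unsigned int locked;		/* read-side nesting depth */
	bool online;			/* has this CPU ever run a reader? */
	unsigned long long version;	/* last grace-period version seen */
};

static unsigned long long model_global_version;
static struct model_local model_cpu[MODEL_NR_CPUS];

/* Reader entry: mark the CPU as participating and bump the nesting depth. */
static void model_read_lock(int cpu)
{
	model_cpu[cpu].online = true;
	model_cpu[cpu].locked++;
}

/* Reader exit: the outermost unlock reports the current global version. */
static void model_read_unlock(int cpu)
{
	struct model_local *l = &model_cpu[cpu];

	if (l->locked && --l->locked == 0 &&
	    l->version < model_global_version)
		l->version = model_global_version;
}

/* Updater: open a new grace period and return its version number. */
static unsigned long long model_start_gp(void)
{
	return ++model_global_version;
}

/* Stand-in for the IPI handler: a CPU outside any reader catches up. */
static void model_poke(int cpu)
{
	if (!model_cpu[cpu].locked)
		model_cpu[cpu].version = model_global_version;
}

/* The grace period is over once every participating CPU has caught up. */
static bool model_gp_done(unsigned long long version)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (model_cpu[cpu].online && model_cpu[cpu].version < version)
			return false;
	}
	return true;
}
```

In this model only model_start_gp() advances the global version, so a
callback tagged with a future version would wait forever if nobody called
it, which is the shortcoming noted above.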

We have tested the implementation using rcutorture on both an x86 and an
ARM64 machine.  PRCU passed the 1h and 3h tests on all the newly added config
files, except that PRCU07 reported a BUG in a 1h run.

[ 1593.604201] ---[ end trace b3bae911bec86152 ]---
[ 1594.629450] prcu-torture:torture_onoff task: offlining 14
[ 1594.73] smpboot: CPU 14 is now offline
[ 1594.757732] prcu-torture:torture_onoff task: offlined 14
[ 1597.765149] prcu-torture:torture_onoff task: onlining 11
[ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
[ 1597.804102] prcu-torture:torture_onoff task: onlined 11
[ 1599.365098] prcu-torture: rtc: b0277b90 ver: 66358 tfle: 0 rta: 
66358 rtaf: 0 
rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418 
onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 
cbflood: 225
[ 1599.367946] prcu-torture: !!!
[ 1599.367966] [ cut here ]


We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to
true, that is, synchronize_rcu_expedited() was tested.

The rcuperf results are as follows (average grace-period duration in ms over
ten 10-minute runs):

16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic

CPUs       2       4       8      12      15       16
PRCU    0.14    1.07    4.15    8.02   10.79    15.16
TREE   49.30  104.75  277.55  390.82  620.82  1381.54

64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic

CPUs       2       4        8      16      32       48       63        64
PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27

Best wishes,
Lihao.


Lihao Liang (15):
  rcutorture: Add PRCU rcu_torture_ops
  rcutorture: Add PRCU test config files
  rcuperf: Add PRCU rcu_perf_ops
  rcuperf: Add PRCU test config files
  rcuperf: Set gp_exp to true for tests to run
  prcu: Implement call_prcu() API
  prcu: Implement PRCU callback processing
  prcu: Implement prcu_barrier() API
  rcutorture: Test call_prcu() and prcu_barrier()
  rcutorture: Add basic ARM64 support to run scripts
  prcu: Add PRCU Kconfig parameter
  prcu: Comment source code
  rcuperf: Add config files with various CONFIG_NR_CPUS
  rcutorture: Add scripts to run experiments
  Add GPLv2 license

Heng Zhang (1):
  prcu: Add PRCU implementation

 include/linux/interrupt.h  |   3 +
 include/linux/prcu.h   | 122 +
 include/linux/rcupdate.h   |   1 +
 init/Kconfig   |   7 +
 init/main.c|   2 +
 kernel/rcu/Makefile|   1 +
 kernel/rcu/prcu.c  | 497 +
 kernel/rcu/rcuperf.c   |  33 +-
 kernel/rcu/rcutorture.c|  40 +-
 kernel/rcu/tree.c  |   1 +
 kernel/sched/core.c|   2 +
 kernel/time/timer.c|   2 +
 kvm.sh | 452 +++
 run-rcuperf.sh |  26 ++
 .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
 .../selftests/rcutorture/configs/rcu/CFLIST|   5 +
 .../selftests/rcutorture/configs/rcu/PRCU02|  27 ++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
 .../selftests/rcutorture/configs/rcu/PRCU03|  23 +
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
 .../selftests/rcutorture/configs/rcu/PRCU06|  26 ++
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
 .../selftests/rcutorture/configs/rcu/PRCU07|  25 ++
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +

[PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 kernel/rcu/rcuperf.c | 31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index a4a86fb4..ea80fa3e 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -304,6 +305,34 @@ static bool __maybe_unused torturing_tasks(void)
 
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
+/*
+ * Definitions for prcu perf testing.
+ */
+
+static int prcu_perf_read_lock(void) __acquires(RCU)
+{
+   prcu_read_lock();
+   return 0;
+}
+
+static void prcu_perf_read_unlock(int idx) __releases(RCU)
+{
+   prcu_read_unlock();
+}
+
+static struct rcu_perf_ops prcu_ops = {
+   .ptype  = PRCU_FLAVOR,
+   .init   = rcu_sync_perf_init,
+   .readlock   = prcu_perf_read_lock,
+   .readunlock = prcu_perf_read_unlock,
+   .started= rcu_no_completed,
+   .completed  = rcu_no_completed,
+   .exp_completed  = rcu_no_completed,
+   .sync   = synchronize_prcu,
+   .exp_sync   = synchronize_prcu,
+   .name   = "prcu"
+};
+
 /*
  * If performance tests complete, wait for shutdown to commence.
  */
@@ -554,7 +583,7 @@ rcu_perf_init(void)
long i;
int firsterr = 0;
static struct rcu_perf_ops *perf_ops[] = {
-   &rcu_ops, &rcu_bh_ops, &srcu_ops, &sched_ops,
+   &rcu_ops, &rcu_bh_ops, &srcu_ops, &sched_ops, &prcu_ops,
RCUPERF_TASKS_OPS
};
 
-- 
2.14.1.729.g59c0ea183

[PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops

2018-01-23 Thread lianglihao

From: Lihao Liang 

Reviewed-by: Heng Zhang 
Signed-off-by: Lihao Liang 
---
 include/linux/rcupdate.h |  1 +
 kernel/rcu/rcutorture.c  | 40 +++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e1e5d002..12df9709 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -84,6 +84,7 @@ enum rcutorture_type {
RCU_SCHED_FLAVOR,
RCU_TASKS_FLAVOR,
SRCU_FLAVOR,
+   PRCU_FLAVOR,
INVALID_RCU_FLAVOR
 };
 
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index ae6e574d..7d65bf0c 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -768,6 +769,43 @@ static bool __maybe_unused torturing_tasks(void)
 
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
+/*
+ * Definitions for prcu torture testing.
+ */
+
+static int prcu_torture_read_lock(void) __acquires(RCU)
+{
+   prcu_read_lock();
+   return 0;
+}
+
+static void prcu_torture_read_unlock(int idx) __releases(RCU)
+{
+   prcu_read_unlock();
+}
+
+static struct rcu_torture_ops prcu_ops = {
+   .ttype  = PRCU_FLAVOR,
+   .init   = rcu_sync_torture_init,
+   .readlock   = prcu_torture_read_lock,
+   .read_delay = rcu_read_delay,  /* just reuse rcu's version. */
+   .readunlock = prcu_torture_read_unlock,
+   .started= rcu_no_completed,
+   .completed  = rcu_no_completed,
+   .deferred_free  = NULL,
+   .sync   = synchronize_prcu,
+   .exp_sync   = synchronize_prcu,
+   .get_state  = NULL,
+   .cond_sync  = NULL,
+   .call   = NULL,
+   .cb_barrier = NULL,
+   .fqs= NULL,
+   .stats  = NULL,
+   .irq_capable= 1,
+   .can_boost  = 0,
+   .name   = "prcu"
+};
+
 /*
  * RCU torture priority-boost testing.  Runs one real-time thread per
  * CPU for moderate bursts, repeatedly registering RCU callbacks and
@@ -1764,7 +1802,7 @@ rcu_torture_init(void)
int firsterr = 0;
static struct rcu_torture_ops *torture_ops[] = {
&rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
-   &sched_ops, RCUTORTURE_TASKS_OPS
+   &sched_ops, &prcu_ops, RCUTORTURE_TASKS_OPS
};
 
if (!torture_init_begin(torture_type, verbose, &torture_runnable))
-- 
2.14.1.729.g59c0ea183

[PATCH RFC 05/16] rcuperf: Add PRCU test config files

2018-01-23 Thread lianglihao

From: Lihao Liang 

Use the same config file as TREE.

Signed-off-by: Lihao Liang 
---
 .../selftests/rcutorture/configs/rcuperf/CFLIST  |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU| 20 
 .../selftests/rcutorture/configs/rcuperf/PRCU.boot   |  1 +
 3 files changed, 22 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot

diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
index c9f56cf2..4b80917a 100644
--- a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
@@ -1 +1,2 @@
 TREE
+PRCU
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
new file mode 100644
index ..a312f671
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
@@ -0,0 +1,20 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
new file mode 100644
index ..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
-- 
2.14.1.729.g59c0ea183

[PATCH v6 1/2] dt-bindings: media: Add Allwinner V3s Camera Sensor Interface (CSI)

2018-01-23 Thread Yong Deng

Add binding documentation for Allwinner V3s CSI.

Reviewed-by: Rob Herring 
Signed-off-by: Yong Deng 
---
 .../devicetree/bindings/media/sun6i-csi.txt| 59 ++
 1 file changed, 59 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/sun6i-csi.txt

diff --git a/Documentation/devicetree/bindings/media/sun6i-csi.txt b/Documentation/devicetree/bindings/media/sun6i-csi.txt
new file mode 100644
index 000..2ff47a9
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/sun6i-csi.txt
@@ -0,0 +1,59 @@
+Allwinner V3s Camera Sensor Interface
+-------------------------------------
+
+Allwinner V3s SoC features two CSI modules. CSI0 is used for MIPI CSI-2
+interface and CSI1 is used for parallel interface.
+
+Required properties:
+  - compatible: value must be "allwinner,sun8i-v3s-csi"
+  - reg: base address and size of the memory-mapped region.
+  - interrupts: interrupt associated to this IP
+  - clocks: phandles to the clocks feeding the CSI
+* bus: the CSI interface clock
+* mod: the CSI module clock
+* ram: the CSI DRAM clock
+  - clock-names: the clock names mentioned above
+  - resets: phandles to the reset line driving the CSI
+
+Each CSI node should contain one 'port' child node with one child 'endpoint'
+node, according to the bindings defined in
+Documentation/devicetree/bindings/media/video-interfaces.txt. As mentioned
+above, the endpoint's bus type should be MIPI CSI-2 for CSI0 and parallel or
+Bt656 for CSI1.
+
+Endpoint node properties for CSI1
+---------------------------------
+
+- remote-endpoint  : (required) a phandle to the bus receiver's endpoint
+  node
+- bus-width   : (required) must be 8, 10, 12 or 16
+- pclk-sample  : (optional) (default: sample on falling edge)
+- hsync-active : (only required for parallel)
+- vsync-active : (only required for parallel)
+
+Example:
+
+csi1: csi@1cb4000 {
+   compatible = "allwinner,sun8i-v3s-csi";
+   reg = <0x01cb4000 0x1000>;
+   interrupts = ;
+   clocks = < CLK_BUS_CSI>,
+< CLK_CSI1_SCLK>,
+< CLK_DRAM_CSI>;
+   clock-names = "bus", "mod", "ram";
+   resets = < RST_BUS_CSI>;
+
+   port {
+   /* Parallel bus endpoint */
+   csi1_ep: endpoint {
+   remote-endpoint = <_ep>;
+   bus-width = <16>;
+
+   /* If hsync-active/vsync-active are missing,
+  embedded BT.656 sync is used */
+   hsync-active = <0>; /* Active low */
+   vsync-active = <0>; /* Active low */
+   pclk-sample = <1>;  /* Rising */
+   };
+   };
+};
-- 
1.8.3.1

Re: [PATCH v6 16/36] nds32: DMA mapping API

2018-01-23 Thread Greentime Hu

Hi, Arnd:

2018-01-18 18:26 GMT+08:00 Arnd Bergmann :
> On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu  wrote:
>> From: Greentime Hu 
>>
>> This patch adds support for the DMA mapping API. It uses dma_map_ops for
>> flexibility.
>>
>> Signed-off-by: Vincent Chen 
>> Signed-off-by: Greentime Hu 
>
> I'm still unhappy about the way the cache flushes are done here as discussed
> before. It's not a show-stopper, but no Ack from me.

How about this implementation?

static void
nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
			      size_t size, enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:	/* writeback only */
		break;
	case DMA_FROM_DEVICE:	/* invalidate only */
	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
		cpu_dma_inval_range(start, end);
		break;
	default:
		BUG();
	}
}

static void
nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
				 size_t size, enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_FROM_DEVICE:	/* invalidate only */
		break;
	case DMA_TO_DEVICE:	/* writeback only */
	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
		cpu_dma_wb_range(start, end);
		break;
	default:
		BUG();
	}
}
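
For reference, the direction-to-cache-operation mapping in the proposal above
can be written down as a small user-space table.  This is only a sketch: the
model_* and MODEL_* names are hypothetical, and unlike the proposal it returns
"no operation" for an unknown direction instead of calling BUG().

```c
/* Hypothetical stand-ins for enum dma_data_direction and the cache helpers. */
enum model_dma_dir {
	MODEL_BIDIRECTIONAL,
	MODEL_TO_DEVICE,
	MODEL_FROM_DEVICE
};

enum model_cache_op {
	MODEL_OP_NONE,
	MODEL_OP_WRITEBACK,
	MODEL_OP_INVALIDATE
};

/* Before the CPU reads the buffer: drop stale lines the device overwrote. */
static enum model_cache_op model_sync_for_cpu(enum model_dma_dir dir)
{
	switch (dir) {
	case MODEL_FROM_DEVICE:
	case MODEL_BIDIRECTIONAL:
		return MODEL_OP_INVALIDATE;
	case MODEL_TO_DEVICE:
	default:
		return MODEL_OP_NONE;
	}
}

/* Before the device reads the buffer: push dirty lines out to memory. */
static enum model_cache_op model_sync_for_device(enum model_dma_dir dir)
{
	switch (dir) {
	case MODEL_TO_DEVICE:
	case MODEL_BIDIRECTIONAL:
		return MODEL_OP_WRITEBACK;
	case MODEL_FROM_DEVICE:
	default:
		return MODEL_OP_NONE;
	}
}
```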

Re: [PATCH v2 3/4] drivers: firmware: xilinx: Add sysfs interface

2018-01-23 Thread Greg KH

On Wed, Jan 17, 2018 at 12:20:33PM -0800, Jolly Shah wrote:
> Add Firmware-ggs sysfs interface which provides read/write
> interface to global storage registers.
> 
> Signed-off-by: Jolly Shah 
> Signed-off-by: Rajan Vaja 
> ---
>  .../ABI/stable/sysfs-driver-zynqmp-firmware|  33 +++
>  drivers/firmware/xilinx/zynqmp/Makefile|   2 +-
>  drivers/firmware/xilinx/zynqmp/firmware-ggs.c  | 298 +
>  drivers/firmware/xilinx/zynqmp/firmware.c  |  26 ++
>  include/linux/firmware/xilinx/zynqmp/firmware.h|   2 +
>  5 files changed, 360 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/ABI/stable/sysfs-driver-zynqmp-firmware
>  create mode 100644 drivers/firmware/xilinx/zynqmp/firmware-ggs.c
> 
> diff --git a/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware b/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware
> new file mode 100644
> index 000..2483215
> --- /dev/null
> +++ b/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware
> @@ -0,0 +1,33 @@
> +What:  /sys/devices/platform/zynqmp-firmware/ggs*
> +Date:  January 2018
> +KernelVersion: 4.15.0
> +Contact:   "Jolly Shah" 
> +Description:
> +   Shows PMU global general storage register value,
> +   GLOBAL_GEN_STORAGE{0:3}.
> +   Global general storage register that can be used
> +   by system to pass information between masters.
> +
> +   The register is reset during system or power-on
> +   resets. Three registers are used by the FSBL and
> +   other Xilinx software products: GLOBAL_GEN_STORAGE{4:6}.
> +
> +Users: Xilinx
> +
> +What:  /sys/devices/platform/zynqmp-firmware/pggs*
> +Date:  January 2018
> +KernelVersion: 4.15.0
> +Contact:   "Jolly Shah" 
> +Description:
> +   Shows PMU persistent global general storage register
> +   value, PERS_GLOB_GEN_STORAGE{0:3}.
> +   Persistent global general storage register that
> +   can be used by system to pass information between
> +   masters.
> +
> +   This register is only reset by the power-on reset
> +   and maintains its value through a system reset.
> +   Four registers are used by the FSBL and other Xilinx
> +   software products: PERS_GLOB_GEN_STORAGE{4:7}.
> +   Register is reset only by a POR reset.
> +Users: Xilinx
> diff --git a/drivers/firmware/xilinx/zynqmp/Makefile b/drivers/firmware/xilinx/zynqmp/Makefile
> index c3ec669..6629781 100644
> --- a/drivers/firmware/xilinx/zynqmp/Makefile
> +++ b/drivers/firmware/xilinx/zynqmp/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0+
>  # Makefile for Xilinx firmwares
> 
> -obj-$(CONFIG_ZYNQMP_FIRMWARE) += firmware.o
> +obj-$(CONFIG_ZYNQMP_FIRMWARE) += firmware.o firmware-ggs.o
> diff --git a/drivers/firmware/xilinx/zynqmp/firmware-ggs.c b/drivers/firmware/xilinx/zynqmp/firmware-ggs.c
> new file mode 100644
> index 000..be47ca2
> --- /dev/null
> +++ b/drivers/firmware/xilinx/zynqmp/firmware-ggs.c
> @@ -0,0 +1,298 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Xilinx Zynq MPSoC Firmware layer
> + *
> + *  Copyright (C) 2014-2018 Xilinx, Inc.
> + *
> + *  Jolly Shah 
> + *  Rajan Vaja 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 

That's crazy deep nesting, why?

> +
> +static ssize_t read_register(char *buf, u32 ioctl_id, u32 reg)
> +{
> +   int ret;
> +   u32 ret_payload[PAYLOAD_ARG_CNT];
> +   const struct zynqmp_eemi_ops *eemi_ops = get_eemi_ops();
> +
> +   if (!eemi_ops || !eemi_ops->ioctl)
> +   return 0;

Not an error?

> +
> +   ret = eemi_ops->ioctl(0, ioctl_id, reg, 0, ret_payload);
> +   if (ret)
> +   return ret;
> +
> +   return snprintf(buf, PAGE_SIZE, "0x%x\n", ret_payload[1]);

Minor nit, you never need to use snprintf() for a sysfs file, as you
"know" the size and you can't overflow it with just a single value.

Yeah, some tool-checkers hate to see a "raw" sprintf() call, but really,
ignore them here :)

> +}
> +
> +static ssize_t write_register(const char *buf, size_t count,
> + u32 ioctl_id, u32 reg)
> +{
> +   char *kern_buff;
> +   char *inbuf;
> +   char *tok;
> +   long mask;
> +   long value;
> +   int ret;
> +   u32 ret_payload[PAYLOAD_ARG_CNT];
> +   const struct zynqmp_eemi_ops *eemi_ops = get_eemi_ops();
> +
> +   if (!eemi_ops || !eemi_ops->ioctl)
> +   return -EFAULT;
> +
> +   kern_buff = kzalloc(count, GFP_KERNEL);
> +   if (!kern_buff)
> +   return -ENOMEM;
> +
> +   ret = strlcpy(kern_buff, buf, count);
> +   if (ret < 0)

[PATCH v8 4/5] x86/KASLR: Skip memory mirror handling if movable_node specified

2018-01-23 Thread Chao Fan

The kernel skips the memory mirror feature if 'movable_node' is
specified, so skip the mirror feature in KASLR as well.

Acked-by: Baoquan He 
Signed-off-by: Chao Fan 
---
 arch/x86/boot/compressed/kaslr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 8703cc764306..e4b487f0b7af 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -692,6 +692,7 @@ static bool
 process_efi_entries(unsigned long minimum, unsigned long image_size)
 {
struct efi_info *e = &boot_params->efi_info;
+   char *args = (char *)get_cmd_line_ptr();
bool efi_mirror_found = false;
struct mem_vector region;
efi_memory_desc_t *md;
@@ -725,6 +726,12 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
}
}
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+   /* Skip memory mirror if 'movable_node' specified */
+   if (strstr(args, "movable_node"))
+       efi_mirror_found = false;
+#endif
+
for (i = 0; i < nr_desc; i++) {
md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
 
-- 
2.14.3

[PATCH v8 5/5] document: add document for kaslr_mem

2018-01-23 Thread Chao Fan

Cc: linux-...@vger.kernel.org
Cc: Jonathan Corbet 
Cc: Randy Dunlap 
Signed-off-by: Chao Fan 
---
 Documentation/admin-guide/kernel-parameters.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e2de7c006a74..e6de15715c4c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2350,6 +2350,16 @@
allocations which rules out almost all kernel
allocations. Use with caution!
 
+   kaslr_mem=nn[KMG][@ss[KMG]]
+   [KNL] Force usage of a specific region of memory
+   for KASLR during kernel decompression stage.
+   Region of usable memory is from ss to ss+nn. If ss
+   is omitted, it is equivalent to kaslr_mem=nn[KMG]@0.
+   Multiple regions can be specified, comma delimited.
+   Note: at most 4 regions are currently supported.
+   Example:
+   kaslr_mem=1G,500M@2G,1G@4G
+
MTD_Partition=  [MTD]
Format: ,,,
 
-- 
2.14.3
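
To make the documented kaslr_mem=nn[KMG][@ss[KMG]] format concrete, here is a
small user-space parsing sketch.  It is not the parser from this series: the
function and type names are made up for illustration, the suffix handling is
only loosely modeled on the kernel's memparse(), and silently ignoring
anything after the fourth region is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_MAX_REGIONS 4	/* limit stated in the documentation text */

struct model_region {
	unsigned long long start;
	unsigned long long size;
};

/* Parse a number with an optional K/M/G suffix and advance *s past it. */
static bool model_parse_size(const char **s, unsigned long long *val)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 0);

	if (end == *s)
		return false;	/* no digits at all */

	switch (*end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*s = end;
	*val = v;
	return true;
}

/*
 * Parse a comma-delimited list of nn[KMG][@ss[KMG]] entries.
 * Returns the number of regions stored in out[], or -1 on a syntax error.
 * A missing "@ss" means the region starts at 0.
 */
static int model_parse_kaslr_mem(const char *arg, struct model_region *out)
{
	int n = 0;

	while (*arg && n < MODEL_MAX_REGIONS) {
		unsigned long long size, start = 0;

		if (!model_parse_size(&arg, &size))
			return -1;
		if (*arg == '@') {
			arg++;
			if (!model_parse_size(&arg, &start))
				return -1;
		}
		out[n].start = start;
		out[n].size = size;
		n++;

		if (*arg == ',')
			arg++;
		else if (*arg != '\0')
			return -1;
	}
	return n;
}
```

With the example from the documentation, "1G,500M@2G,1G@4G", this yields
three regions: 1G at 0, 500M at 2G and 1G at 4G.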

Re: [PATCH v5] devres: combine function devm_ioremap*

2018-01-23 Thread Greg KH

On Tue, Jan 16, 2018 at 08:03:41PM +0800, Yisheng Xie wrote:
> While using devm_ioremap() and reviewing the related code, I found
> that the devm_ioremap_*() functions have nearly identical
> implementations, which can be combined.
> 
> In an earlier version, I tried to kill ioremap_cache to reduce the
> size of devres, but that does not work because ioremap is not the
> same as ioremap_nocache on some architectures, such as ia64.
> Therefore, as suggested by Christophe, I introduce a helper
> function __devm_ioremap, make devm_ioremap* inline, and have them
> call __devm_ioremap with different devm_ioremap_type values.
> 
> After applying the patch, the size of devres.o is reduced from
> 8216 bytes to 7352 bytes in my build environment.
> 
> Suggested-by: Christophe LEROY 
> Signed-off-by: Yisheng Xie 
> ---
> v2:
>  - use macro for ioremap
> v3:
>  - kill dev_ioremap_nocache
> v4:
>  - combine function devm_ioremap*
> v5:
>  - fix code style.
> 
>  include/linux/io.h | 61 +++
>  lib/devres.c   | 84 ++
>  2 files changed, 70 insertions(+), 75 deletions(-)
> 
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 32e30e8..4d0a640 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -73,12 +73,61 @@ static inline void devm_ioport_unmap(struct device *dev, void __iomem *addr)
>  
>  #define IOMEM_ERR_PTR(err) (__force void __iomem *)ERR_PTR(err)
>  
> -void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
> -resource_size_t size);
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
> -resource_size_t size);
> -void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
> -resource_size_t size);
> +enum devm_ioremap_type {
> + DEVM_IOREMAP = 0,
> + DEVM_IOREMAP_NC,
> + DEVM_IOREMAP_WC,
> +};

Why do these types need to be in a public .h file?

Why not just keep the .h file as-is and then just put the cleanup in the
.c file like you did?

thanks,

greg k-h
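
One way to read the suggestion above is sketched below as a user-space model:
the three public prototypes stay as they are, while the type enum and the
common helper stay private to the .c file.  This is an assumption about the
intended layout, not the actual devres code; all model_* names, the fake
mapping window and model_last_type are hypothetical and exist only so the
sketch can be exercised.

```c
#include <stddef.h>

/* Public interface: these prototypes are all a header would need to carry. */
void *model_devm_ioremap(unsigned long offset, size_t size);
void *model_devm_ioremap_nocache(unsigned long offset, size_t size);
void *model_devm_ioremap_wc(unsigned long offset, size_t size);

/* Everything below would be private to the .c file. */
enum model_ioremap_type {
	MODEL_IOREMAP = 0,
	MODEL_IOREMAP_NC,
	MODEL_IOREMAP_WC,
};

static char model_window[64];		/* stand-in for a mappable region */
static int model_last_type = -1;	/* last mapping flavour requested */

/* Common helper: the only place that knows about the mapping type. */
static void *model_ioremap_common(unsigned long offset, size_t size,
				  enum model_ioremap_type type)
{
	if (size == 0 || offset >= sizeof(model_window) ||
	    size > sizeof(model_window) - offset)
		return NULL;

	model_last_type = type;
	return model_window + offset;
}

/* Thin exported wrappers, one per mapping flavour. */
void *model_devm_ioremap(unsigned long offset, size_t size)
{
	return model_ioremap_common(offset, size, MODEL_IOREMAP);
}

void *model_devm_ioremap_nocache(unsigned long offset, size_t size)
{
	return model_ioremap_common(offset, size, MODEL_IOREMAP_NC);
}

void *model_devm_ioremap_wc(unsigned long offset, size_t size)
{
	return model_ioremap_common(offset, size, MODEL_IOREMAP_WC);
}
```

The trade-off is that the wrappers are real functions rather than inlines, so
some of the size saving may be given back in exchange for a smaller public
interface.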

Re: [PATCH] x86/retpoline/entry: Disable the entire SYSCALL64 fast path with retpolines on

2018-01-23 Thread Ingo Molnar

* Linus Torvalds  wrote:

> On Mon, Jan 22, 2018 at 10:04 AM, Andy Lutomirski  wrote:
> > The existing retpoline code carefully and awkwardly retpolinifies
> > the SYSCALL64 slow path.  This stops the fast path from being
> > particularly fast, and it's IMO rather messy.
> 
> I'm not convinced your patch isn't messier still.. It's certainly
> subtle. I had to look at that ptregs stub generator thing twice.
> 
> Honestly, I'd rather get rid of the fast-path entirely. Compared to
> all the PTI mess, it's not even noticeable.
> 
> And if we ever get CPU's that have this all fixed, we can re-visit
> introducing the fastpath. But this is all very messy and it doesn't
> seem worth it right now.
> 
> If we get rid of the fastpath, we can lay out the slow path slightly
> better, and get rid of some of those jump-overs. And we'd get rid of
> the ptregs hooks entirely.
> 
> So we can try to make the "slow" path better while at it, but I really
> don't think it matters much now in the post-PTI era. Sadly.

Note that there's another advantage to your proposal: should other
vulnerabilities arise in the future, requiring changes in the syscall entry
path, we'd be more flexible to address them in the C space than in the
assembly space.

In hindsight a _LOT_ of the PTI complexity and fragility centered around
interacting with x86 kernel entry assembly code - which entry code fortunately
got much simpler (and easier to review) in the past 1-2 years due to the
thorough cleanups and the conversion of most of it to C. But it was still
painful.

So I'm fully in favor of that.

Thanks,

Ingo

[PATCH RFC 15/16] rcutorture: Add scripts to run experiments

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 kvm.sh | 452 +
 run-rcuperf.sh |  26 
 2 files changed, 478 insertions(+)
 create mode 100755 kvm.sh
 create mode 100755 run-rcuperf.sh

diff --git a/kvm.sh b/kvm.sh
new file mode 100755
index ..3b3c1b69
--- /dev/null
+++ b/kvm.sh
@@ -0,0 +1,452 @@
+#!/bin/bash
+#
+# Run a series of 14 tests under KVM.  These are not particularly
+# well-selected or well-tuned, but are the current set.  Run from the
+# top level of the source tree.
+#
+# Edit the definitions below to set the locations of the various directories,
+# as well as the test duration.
+#
+# Usage: kvm.sh [ options ]
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+# Copyright (C) IBM Corporation, 2011
+#
+# Authors: Paul E. McKenney 
+
+scriptname=$0
+args="$*"
+
+T=/tmp/kvm.sh.$$
+trap 'rm -rf $T' 0
+mkdir $T
+
+dur=$((30*60))
+dryrun=""
+KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
+PATH=${KVM}/bin:$PATH; export PATH
+TORTURE_DEFCONFIG=defconfig
+TORTURE_BOOT_IMAGE=""
+TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
+TORTURE_KMAKE_ARG=""
+TORTURE_SHUTDOWN_GRACE=180
+TORTURE_SUITE=rcu
+resdir=""
+configs=""
+cpus=0
+ds=`date +%Y.%m.%d-%H:%M:%S`
+jitter="-1"
+
+. functions.sh
+
+usage () {
+   echo "Usage: $scriptname optional arguments:"
+   echo "   --bootargs kernel-boot-arguments"
+   echo "   --bootimage relative-path-to-kernel-boot-image"
+   echo "   --buildonly"
+   echo "   --configs \"config-file list w/ repeat factor (3*TINY01)\""
+   echo "   --cpus N"
+   echo "   --datestamp string"
+   echo "   --defconfig string"
+   echo "   --dryrun sched|script"
+   echo "   --duration minutes"
+   echo "   --interactive"
+   echo "   --jitter N [ maxsleep (us) [ maxspin (us) ] ]"
+   echo "   --kmake-arg kernel-make-arguments"
+   echo "   --mac nn:nn:nn:nn:nn:nn"
+   echo "   --no-initrd"
+   echo "   --qemu-args qemu-system-..."
+   echo "   --qemu-cmd qemu-system-..."
+   echo "   --results absolute-pathname"
+   echo "   --torture rcu"
+   exit 1
+}
+
+while test $# -gt 0
+do
+   case "$1" in
+   --bootargs|--bootarg)
+   checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--'
+   TORTURE_BOOTARGS="$2"
+   shift
+   ;;
+   --bootimage)
+   checkarg --bootimage "(relative path to kernel boot image)" "$#" "$2" '[a-zA-Z0-9][a-zA-Z0-9_]*' '^--'
+   TORTURE_BOOT_IMAGE="$2"
+   shift
+   ;;
+   --buildonly)
+   TORTURE_BUILDONLY=1
+   ;;
+   --configs|--config)
+   checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--'
+   configs="$2"
+   shift
+   ;;
+   --cpus)
+   checkarg --cpus "(number)" "$#" "$2" '^[0-9]*$' '^--'
+   cpus=$2
+   shift
+   ;;
+   --datestamp)
+   checkarg --datestamp "(relative pathname)" "$#" "$2" '^[^/]*$' '^--'
+   ds=$2
+   shift
+   ;;
+   --defconfig)
+   checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--'
+   TORTURE_DEFCONFIG=$2
+   shift
+   ;;
+   --dryrun)
+   checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--'
+   dryrun=$2
+   shift
+   ;;
+   --duration)
+   checkarg --duration "(minutes)" $# "$2" '^[0-9]*$' '^error'
+   dur=$(($2*60))
+   shift
+   ;;
+   --interactive)
+   TORTURE_QEMU_INTERACTIVE=1; export TORTURE_QEMU_INTERACTIVE
+   ;;
+   --jitter)
+   checkarg --jitter "(# threads [ sleep [ spin ] ])" $# "$2" '^-\{,1\}[0-9]\+\( \+[0-9]\+\)\{,2\} *$' '^error$'
+   jitter="$2"
+   shift
+   ;;
+   --kmake-arg)
+   checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
+

[PATCH RFC 13/16] prcu: Comment source code

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h |  73 -
 kernel/rcu/prcu.c| 178 +++
 2 files changed, 225 insertions(+), 26 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index bb20fa40..9f740985 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -1,3 +1,11 @@
+/*
+ * Read-Copy Update mechanism for mutual exclusion (PRCU version).
+ * PRCU public definitions.
+ *
+ * Authors: Heng Zhang 
+ *  Lihao Liang 
+ */
+
 #ifndef __LINUX_PRCU_H
 #define __LINUX_PRCU_H
 
@@ -8,12 +16,26 @@
 #include 
 
 #ifdef CONFIG_PRCU
+
+/*
+ * Simple list structure of callback versions.
+ *
+ * Note: Ideally, we would like to add the version field
+ * to the rcu_head struct.  But if we do so, other users of
+ * rcu_head in the Linux kernel will complain hard and loudly.
+ */
 struct prcu_version_head {
unsigned long long version;
struct prcu_version_head *next;
 };
 
-/* Simple unsegmented callback list for PRCU. */
+/*
+ * Simple unsegmented callback list for PRCU.
+ *
+ * Note: Since we can't add a new version field to rcu_head,
+ * we have to make our own callback list for PRCU instead of
+ * using the existing rcu_cblist. Sigh!
+ */
 struct prcu_cblist {
struct rcu_head *head;
struct rcu_head **tail;
@@ -27,31 +49,47 @@ struct prcu_cblist {
.version_head = NULL, .version_tail = &n.version_head, \
 }
 
+/*
+ * PRCU's per-CPU state.
+ */
 struct prcu_local_struct {
-   unsigned int locked;
-   unsigned int online;
-   unsigned long long version;
-   unsigned long long cb_version;
-   struct rcu_head barrier_head;
-   struct prcu_cblist cblist;
+   unsigned int locked;   /* Nesting level of PRCU read-side */
+  /*  critical sections */
+   unsigned int online;   /* Indicates whether a context-switch */
+  /*  has occurred on this CPU */
+   unsigned long long version;/* Local grace-period version */
+   unsigned long long cb_version; /* Local callback version */
+   struct rcu_head barrier_head;  /* PRCU callback list */
+   struct prcu_cblist cblist; /* PRCU callback version list */
 };
 
+/*
+ * PRCU's global state.
+ */
 struct prcu_struct {
-   atomic64_t global_version;
-   atomic64_t cb_version;
-   atomic_t active_ctr;
-   atomic_t barrier_cpu_count;
-   struct mutex mtx;
-   struct mutex barrier_mtx;
-   wait_queue_head_t wait_q;
-   struct completion barrier_completion;
+   atomic64_t global_version;/* Global grace-period version */
+   atomic64_t cb_version;/* Global callback version */
+   atomic_t active_ctr;  /* Outstanding PRCU tasks */
+ /*  being context-switched */
+   atomic_t barrier_cpu_count;   /* # CPUs waiting on prcu_barrier() */
+   struct mutex mtx; /* Serialize synchronize_prcu() */
+   struct mutex barrier_mtx; /* Serialize prcu_barrier() */
+   wait_queue_head_t wait_q; /* Wait for synchronize_prcu() */
+   struct completion barrier_completion; /* Wait for prcu_barrier() */
 };
 
+/*
+ * PRCU APIs.
+ */
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
 void prcu_barrier(void);
+
+/*
+ * Internal non-public functions.
+ */
 void prcu_init(void);
 void prcu_note_context_switch(void);
 int prcu_pending(void);
@@ -60,11 +98,16 @@ void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
+/*
+ * If CONFIG_PRCU is not defined,
+ * map its APIs to RCU's counterparts.
+ */
 #define prcu_read_lock rcu_read_lock
 #define prcu_read_unlock rcu_read_unlock
 #define synchronize_prcu synchronize_rcu
 #define call_prcu call_rcu
 #define prcu_barrier rcu_barrier
+
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 49cb70e6..ef2c7730 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,3 +1,17 @@
+/*
+ * Read-Copy Update mechanism for mutual exclusion (PRCU version).
+ * This PRCU implementation is based on a fast consensus protocol
+ * published in the following paper:
+ *
+ * Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
+ * Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
+ * IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
+ * https://dl.acm.org/citation.cfm?id=3024114.3024143
+ *
+ * Authors: Heng Zhang 
+ *  Lihao Liang 
+ */
+
 #include 
 #include 
 #include

[PATCH RFC 08/16] prcu: Implement PRCU callback processing

2018-01-23 Thread lianglihao

From: Lihao Liang 

Currently, PRCU core processing only consists of callback processing
in prcu_process_callbacks(), which is triggered by the scheduling-clock
interrupt.

Reviewed-by: Heng Zhang 
Signed-off-by: Lihao Liang 
---
 include/linux/interrupt.h |  3 ++
 include/linux/prcu.h  |  8 +
 kernel/rcu/prcu.c | 86 +++
 kernel/rcu/tree.c |  1 +
 kernel/time/timer.c   |  2 ++
 5 files changed, 100 insertions(+)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 0991f973..f05ef62a 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -456,6 +456,9 @@ enum
SCHED_SOFTIRQ,
HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the
numbering. Sigh! */
+#ifdef CONFIG_PRCU
+   PRCU_SOFTIRQ,
+#endif
RCU_SOFTIRQ,/* Preferable RCU should always be the last softirq */
 
NR_SOFTIRQS
diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index e5e09c9b..4e7d5d65 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -31,11 +31,13 @@ struct prcu_local_struct {
unsigned int locked;
unsigned int online;
unsigned long long version;
+   unsigned long long cb_version;
struct prcu_cblist cblist;
 };
 
 struct prcu_struct {
atomic64_t global_version;
+   atomic64_t cb_version;
atomic_t active_ctr;
struct mutex mtx;
wait_queue_head_t wait_q;
@@ -48,6 +50,9 @@ void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
 void prcu_init(void);
 void prcu_note_context_switch(void);
+int prcu_pending(void);
+void invoke_prcu_core(void);
+void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
@@ -57,6 +62,9 @@ void prcu_note_context_switch(void);
 #define call_prcu() do {} while (0)
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
+#define prcu_pending() 0
+#define invoke_prcu_core() do {} while (0)
+#define prcu_check_callbacks() do {} while (0)
 
 #endif /* #ifdef CONFIG_PRCU */
 #endif /* __LINUX_PRCU_H */
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index f198285c..373039c5 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,6 +1,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -11,6 +12,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
 
 struct prcu_struct global_prcu = {
.global_version = ATOMIC64_INIT(0),
+   .cb_version = ATOMIC64_INIT(0),
.active_ctr = ATOMIC_INIT(0),
.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
@@ -27,6 +29,35 @@ static void prcu_cblist_init(struct prcu_cblist *rclp)
rclp->len = 0;
 }
 
+/*
+ * Dequeue the oldest rcu_head structure from the specified callback list;
+ * store the callback grace period version number into the version pointer.
+ */
+static struct rcu_head *prcu_cblist_dequeue(struct prcu_cblist *rclp)
+{
+   struct rcu_head *rhp;
+   struct prcu_version_head *vhp;
+
+   rhp = rclp->head;
+   if (!rhp) {
+   WARN_ON(rclp->version_head);
+   WARN_ON(rclp->len);
+   return NULL;
+   }
+
+   vhp = rclp->version_head;
+   rclp->version_head = vhp->next;
+   rclp->head = rhp->next;
+   rclp->len--;
+
+   if (!rclp->head) {
+   rclp->tail = &rclp->head;
+   rclp->version_tail = &rclp->version_head;
+   }
+
+   return rhp;
+}
+
 static inline void prcu_report(struct prcu_local_struct *local)
 {
unsigned long long global_version;
@@ -117,6 +148,7 @@ void synchronize_prcu(void)
if (atomic_read(&prcu->active_ctr))
wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
 
+   atomic64_set(&prcu->cb_version, version);
mutex_unlock(&prcu->mtx);
 }
 EXPORT_SYMBOL(synchronize_prcu);
@@ -166,6 +198,58 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL(call_prcu);
 
+int prcu_pending(void)
+{
+   struct prcu_local_struct *local = get_cpu_ptr(&prcu_local);
+   unsigned long long cb_version = local->cb_version;
+   struct prcu_cblist *rclp = &local->cblist;
+
+   put_cpu_ptr(&prcu_local);
+   return cb_version < atomic64_read(&prcu->cb_version) && rclp->head;
+}
+
+void invoke_prcu_core(void)
+{
+   if (cpu_online(smp_processor_id()))
+   raise_softirq(PRCU_SOFTIRQ);
+}
+
+void prcu_check_callbacks(void)
+{
+   if (prcu_pending())
+   invoke_prcu_core();
+}
+
+static __latent_entropy void prcu_process_callbacks(struct softirq_action *unused)
+{
+   unsigned long flags;
+   unsigned long long cb_version;
+   struct prcu_local_struct *local;
+   struct prcu_cblist *rclp;
+   struct rcu_head *rhp;
+   struct prcu_version_head *vhp;
+
+   if

[PATCH RFC 09/16] prcu: Implement prcu_barrier() API

2018-01-23 Thread lianglihao

From: Lihao Liang 

This is PRCU's counterpart of RCU's rcu_barrier() API.

Reviewed-by: Heng Zhang 
Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h |  7 ++
 kernel/rcu/prcu.c| 63 
 2 files changed, 70 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 4e7d5d65..cce967fd 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CONFIG_PRCU
 
@@ -32,6 +33,7 @@ struct prcu_local_struct {
unsigned int online;
unsigned long long version;
unsigned long long cb_version;
+   struct rcu_head barrier_head;
struct prcu_cblist cblist;
 };
 
@@ -39,8 +41,11 @@ struct prcu_struct {
atomic64_t global_version;
atomic64_t cb_version;
atomic_t active_ctr;
+   atomic_t barrier_cpu_count;
struct mutex mtx;
+   struct mutex barrier_mtx;
wait_queue_head_t wait_q;
+   struct completion barrier_completion;
 };
 
 #ifdef CONFIG_PRCU
@@ -48,6 +53,7 @@ void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
+void prcu_barrier(void);
 void prcu_init(void);
 void prcu_note_context_switch(void);
 int prcu_pending(void);
@@ -60,6 +66,7 @@ void prcu_check_callbacks(void);
 #define prcu_read_unlock() do {} while (0)
 #define synchronize_prcu() do {} while (0)
 #define call_prcu() do {} while (0)
+#define prcu_barrier() do {} while (0)
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 373039c5..2664d091 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -15,6 +15,7 @@ struct prcu_struct global_prcu = {
.cb_version = ATOMIC64_INIT(0),
.active_ctr = ATOMIC_INIT(0),
.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+   .barrier_mtx = __MUTEX_INITIALIZER(global_prcu.barrier_mtx),
.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
 };
 struct prcu_struct *prcu = &global_prcu;
@@ -250,6 +251,68 @@ static __latent_entropy void prcu_process_callbacks(struct softirq_action *unuse
local_irq_restore(flags);
 }
 
+/*
+ * PRCU callback function for prcu_barrier().
+ * If we are last, wake up the task executing prcu_barrier().
+ */
+static void prcu_barrier_callback(struct rcu_head *rhp)
+{
+   if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+   complete(&prcu->barrier_completion);
+}
+
+/*
+ * Called with preemption disabled, and from cross-cpu IRQ context.
+ */
+static void prcu_barrier_func(void *info)
+{
+   struct prcu_local_struct *local = this_cpu_ptr(&prcu_local);
+
+   atomic_inc(&prcu->barrier_cpu_count);
+   call_prcu(&local->barrier_head, prcu_barrier_callback);
+}
+
+/* Wait for all PRCU callbacks to complete. */
+void prcu_barrier(void)
+{
+   int cpu;
+
+   /* Take mutex to serialize concurrent prcu_barrier() requests. */
+   mutex_lock(&prcu->barrier_mtx);
+
+   /*
+* Initialize the count to one rather than to zero in order to
+* avoid a too-soon return to zero in case of a short grace period
+* (or preemption of this task).
+*/
+   init_completion(&prcu->barrier_completion);
+   atomic_set(&prcu->barrier_cpu_count, 1);
+
+   /*
+* Register a new callback on each CPU using IPI to prevent races
+* with call_prcu(). When that callback is invoked, we will know
+* that all of the corresponding CPU's preceding callbacks have
+* been invoked.
+*/
+   for_each_possible_cpu(cpu)
+   smp_call_function_single(cpu, prcu_barrier_func, NULL, 1);
+
+   /* Decrement the count as we initialize it to one. */
+   if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+   complete(&prcu->barrier_completion);
+
+   /*
+* Now that we have a prcu_barrier_callback() callback on each
+* CPU, and thus each counted, remove the initial count.
+* Wait for all prcu_barrier_callback() callbacks to be invoked.
+*/
+   wait_for_completion(&prcu->barrier_completion);
+
+   /* Other prcu_barrier() invocations can now safely proceed. */
+   mutex_unlock(&prcu->barrier_mtx);
+}
+EXPORT_SYMBOL(prcu_barrier);
+
 void prcu_init_local_struct(int cpu)
 {
struct prcu_local_struct *local;
-- 
2.14.1.729.g59c0ea183
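
The comment in prcu_barrier() about initializing the count to one rather
than zero can be illustrated outside the kernel. The following is only a
user-space sketch under stated assumptions: struct barrier_sim, barrier_put()
and barrier_run() are hypothetical names that do not appear in the patch, and
each callback is invoked immediately after it is queued to imitate a very
short grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the barrier fields of struct prcu_struct. */
struct barrier_sim {
	atomic_int pending;   /* plays the role of barrier_cpu_count */
	int registered;       /* callbacks queued so far */
	int completions;      /* times the waiter would have been woken */
	int first_wakeup_at;  /* value of 'registered' at the first wakeup */
};

/* Like prcu_barrier_callback(): whoever drops the count to zero wakes the waiter. */
static void barrier_put(struct barrier_sim *b)
{
	/* atomic_fetch_sub() returns the old value, so 1 means "now zero". */
	int old = atomic_fetch_sub(&b->pending, 1);

	assert(old > 0); /* the count must never go negative */
	if (old == 1) {
		if (b->completions == 0)
			b->first_wakeup_at = b->registered;
		b->completions++;
	}
}

/* Queue one callback per simulated CPU; each callback runs right away. */
static void barrier_run(struct barrier_sim *b, int ncpus, bool use_bias)
{
	b->registered = 0;
	b->completions = 0;
	b->first_wakeup_at = -1;
	atomic_store(&b->pending, use_bias ? 1 : 0);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		atomic_fetch_add(&b->pending, 1); /* as in prcu_barrier_func() */
		b->registered++;
		barrier_put(b);                   /* callback fires immediately */
	}

	if (use_bias)
		barrier_put(b);                   /* drop the initial count */
}
```

With the bias, the count cannot reach zero until every CPU has been visited,
so the waiter is woken exactly once. Without it, the first callback that
finishes early already signals completion while other CPUs are still waiting
to be visited.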

[PATCH RFC 07/16] prcu: Implement call_prcu() API

2018-01-23 Thread lianglihao

From: Lihao Liang 

This is PRCU's counterpart of RCU's call_rcu() API.

Reviewed-by: Heng Zhang 
Signed-off-by: Lihao Liang 
---
 include/linux/prcu.h | 25 
 init/main.c  |  2 ++
 kernel/rcu/prcu.c| 67 +---
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 653b4633..e5e09c9b 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -2,15 +2,36 @@
 #define __LINUX_PRCU_H
 
 #include 
+#include 
 #include 
 #include 
 
 #define CONFIG_PRCU
 
+struct prcu_version_head {
+   unsigned long long version;
+   struct prcu_version_head *next;
+};
+
+/* Simple unsegmented callback list for PRCU. */
+struct prcu_cblist {
+   struct rcu_head *head;
+   struct rcu_head **tail;
+   struct prcu_version_head *version_head;
+   struct prcu_version_head **version_tail;
+   long len;
+};
+
+#define PRCU_CBLIST_INITIALIZER(n) { \
+   .head = NULL, .tail = &n.head, \
+   .version_head = NULL, .version_tail = &n.version_head, \
+}
+
 struct prcu_local_struct {
unsigned int locked;
unsigned int online;
unsigned long long version;
+   struct prcu_cblist cblist;
 };
 
 struct prcu_struct {
@@ -24,6 +45,8 @@ struct prcu_struct {
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
+void call_prcu(struct rcu_head *head, rcu_callback_t func);
+void prcu_init(void);
 void prcu_note_context_switch(void);
 
 #else /* #ifdef CONFIG_PRCU */
@@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
 #define prcu_read_lock() do {} while (0)
 #define prcu_read_unlock() do {} while (0)
 #define synchronize_prcu() do {} while (0)
+#define call_prcu() do {} while (0)
+#define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 
 #endif /* #ifdef CONFIG_PRCU */
diff --git a/init/main.c b/init/main.c
index f8665104..4925964e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
workqueue_init_early();
 
rcu_init();
+   prcu_init();
 
/* Trace events are available after this */
trace_init();
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index a00b9420..f198285c 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,11 +1,12 @@
 #include 
-#include 
 #include 
-#include 
+#include 
 #include 
-
+#include 
 #include 
 
+#include "rcu.h"
+
 DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
 
 struct prcu_struct global_prcu = {
@@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
 };
 struct prcu_struct *prcu = &global_prcu;
 
+/* Initialize simple callback list. */
+static void prcu_cblist_init(struct prcu_cblist *rclp)
+{
+   rclp->head = NULL;
+   rclp->tail = &rclp->head;
+   rclp->version_head = NULL;
+   rclp->version_tail = &rclp->version_head;
+   rclp->len = 0;
+}
+
 static inline void prcu_report(struct prcu_local_struct *local)
 {
unsigned long long global_version;
@@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
prcu_report(local);
put_cpu_ptr(&prcu_local);
 }
+
+void call_prcu(struct rcu_head *head, rcu_callback_t func)
+{
+   unsigned long flags;
+   struct prcu_local_struct *local;
+   struct prcu_cblist *rclp;
+   struct prcu_version_head *vhp;
+
+   debug_rcu_head_queue(head);
+
+   /* Use GFP_ATOMIC with IRQs disabled */
+   vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
+   if (!vhp)
+   return;
+
+   head->func = func;
+   head->next = NULL;
+   vhp->next = NULL;
+
+   local_irq_save(flags);
+   local = this_cpu_ptr(&prcu_local);
+   vhp->version = local->version;
+   rclp = &local->cblist;
+   rclp->len++;
+   *rclp->tail = head;
+   rclp->tail = &head->next;
+   *rclp->version_tail = vhp;
+   rclp->version_tail = &vhp->next;
+   local_irq_restore(flags);
+}
+EXPORT_SYMBOL(call_prcu);
+
+void prcu_init_local_struct(int cpu)
+{
+   struct prcu_local_struct *local;
+
+   local = per_cpu_ptr(&prcu_local, cpu);
+   local->locked = 0;
+   local->online = 0;
+   local->version = 0;
+   prcu_cblist_init(&local->cblist);
+}
+
+void __init prcu_init(void)
+{
+   int cpu;
+
+   for_each_possible_cpu(cpu)
+   prcu_init_local_struct(cpu);
+}
-- 
2.14.1.729.g59c0ea183
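
The callback list above keeps two singly linked lists in lockstep, one of
callbacks and one of version records, each with a tail pointer that points
either at the list head or at the last element's next field. A user-space
sketch of that enqueue/dequeue discipline might look as follows. All names
here (struct cb_head, struct cblist, cblist_enqueue(), cblist_dequeue()) are
illustrative and not taken from the patch, malloc() stands in for kmalloc(),
and unlike call_prcu() the enqueue reports allocation failure to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct cb_head {
	struct cb_head *next;
};

struct version_head {
	unsigned long long version;
	struct version_head *next;
};

struct cblist {
	struct cb_head *head;
	struct cb_head **tail;              /* &head, or &last->next */
	struct version_head *version_head;
	struct version_head **version_tail; /* same idea for versions */
	long len;
};

static void cblist_init(struct cblist *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->version_head = NULL;
	l->version_tail = &l->version_head;
	l->len = 0;
}

/* Append a callback and its version record; returns 0 on success, -1 on OOM. */
static int cblist_enqueue(struct cblist *l, struct cb_head *cb,
			  unsigned long long version)
{
	struct version_head *vh = malloc(sizeof(*vh));

	if (!vh)
		return -1;
	cb->next = NULL;
	vh->version = version;
	vh->next = NULL;

	*l->tail = cb;              /* link after the current last element */
	l->tail = &cb->next;
	*l->version_tail = vh;
	l->version_tail = &vh->next;
	l->len++;
	return 0;
}

/* Remove the oldest callback; its version is stored through 'version'. */
static struct cb_head *cblist_dequeue(struct cblist *l,
				      unsigned long long *version)
{
	struct cb_head *cb = l->head;
	struct version_head *vh = l->version_head;

	if (!cb)
		return NULL;
	assert(vh != NULL);         /* both lists advance in lockstep */

	l->head = cb->next;
	l->version_head = vh->next;
	l->len--;
	if (!l->head) {             /* list became empty: reset both tails */
		l->tail = &l->head;
		l->version_tail = &l->version_head;
	}

	*version = vh->version;
	free(vh);
	return cb;
}
```

Resetting the tail pointers when the list empties is what lets a later
enqueue link the new element through the head again.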

[PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run

2018-01-23 Thread lianglihao

From: Lihao Liang 

Signed-off-by: Lihao Liang 
---
 kernel/rcu/rcuperf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index ea80fa3e..baccc123 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney 
");
 #define VERBOSE_PERFOUT_ERRSTRING(s) \
do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
 
-torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
+torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
 torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
 torture_param(int, nreaders, -1, "Number of RCU reader threads");
 torture_param(int, nwriters, -1, "Number of RCU updater threads");
-- 
2.14.1.729.g59c0ea183

[PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts

2018-01-23 Thread lianglihao

From: Lihao Liang 

This commit adds support of the qemu command qemu-system-aarch64
to rcutorture.

Signed-off-by: Lihao Liang 
---
 tools/testing/selftests/rcutorture/bin/functions.sh | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 1426a9b9..4a24b873 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -111,6 +111,9 @@ identify_boot_image () {
qemu-system-x86_64|qemu-system-i386)
echo arch/x86/boot/bzImage
;;
+   qemu-system-aarch64)
+   echo arch/arm64/boot/Image
+   ;;
*)
echo vmlinux
;;
@@ -133,6 +136,9 @@ identify_qemu () {
elif echo $u | grep -q "Intel 80386"
then
echo qemu-system-i386
+   elif echo $u | grep -q aarch64
+   then
+   echo qemu-system-aarch64
elif uname -a | grep -q ppc64
then
echo qemu-system-ppc64
@@ -151,16 +157,20 @@ identify_qemu () {
 # Output arguments for the qemu "-append" string based on CPU type
 # and the TORTURE_QEMU_INTERACTIVE environment variable.
 identify_qemu_append () {
+   local console=ttyS0
case "$1" in
qemu-system-x86_64|qemu-system-i386)
echo noapic selinux=0 initcall_debug debug
;;
+   qemu-system-aarch64)
+   console=ttyAMA0
+   ;;
esac
if test -n "$TORTURE_QEMU_INTERACTIVE"
then
echo root=/dev/sda
else
-   echo console=ttyS0
+   echo console=$console
fi
 }
 
@@ -172,6 +182,9 @@ identify_qemu_args () {
case "$1" in
qemu-system-x86_64|qemu-system-i386)
;;
+   qemu-system-arm|qemu-system-aarch64)
+   echo -machine virt,gic-version=host -cpu host
+   ;;
qemu-system-ppc64)
echo -enable-kvm -M pseries -nodefaults
echo -device spapr-vscsi
@@ -229,7 +242,7 @@ specify_qemu_cpus () {
echo $2
else
case "$1" in
-   qemu-system-x86_64|qemu-system-i386)
+   qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
echo $2 -smp $3
;;
qemu-system-ppc64)
-- 
2.14.1.729.g59c0ea183

Re: [PATCH v3 11/20] arm64: mm: Map entry trampoline into trampoline and kernel page tables

2018-01-23 Thread Yisheng Xie

Hi Will,

On 2017/12/6 20:35, Will Deacon wrote:
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +static int __init map_entry_trampoline(void)
> +{
> + extern char __entry_tramp_text_start[];
> +
> + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
> + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
> +
> + /* The trampoline is always mapped and can therefore be global */
> + pgprot_val(prot) &= ~PTE_NG;
> +
> + /* Map only the text into the trampoline page table */
> + memset(tramp_pg_dir, 0, PGD_SIZE);
> + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> +  prot, pgd_pgtable_alloc, 0);

How is tramp_pg_dir used? Should it be set in TTBR1 when exiting the kernel?
Sorry, I could not find where it is used.

Thanks
Yisheng

> +
> + /* ...as well as the kernel page table */
> + __set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> + return 0;
> +}
> +core_initcall(map_entry_trampoline);
> +#endif
> +
>  /*
>   * Create fine-grained mappings for the kernel.
>   */
>

Re: [PATCH v2 4/4] drivers: firmware: xilinx: Add debugfs interface

2018-01-23 Thread Greg KH

On Wed, Jan 17, 2018 at 12:20:34PM -0800, Jolly Shah wrote:
> +/* Setup debugfs fops */
> +static const struct file_operations fops_zynqmp_pm_dbgfs = {
> +   .owner  =   THIS_MODULE,
> +   .write  =   zynqmp_pm_debugfs_api_write,
> +   .read   =   zynqmp_pm_debugfs_api_version_read,
> +};
> +
> +/**
> + * zynqmp_pm_api_debugfs_init - Initialize debugfs interface
> + *
> + * Return:  Returns 0 on success
> + * Corresponding error code otherwise
> + */
> +int zynqmp_pm_api_debugfs_init(void)
> +{
> +   int err;
> +
> +   /* Initialize debugfs interface */
> +   zynqmp_pm_debugfs_dir = debugfs_create_dir(DRIVER_NAME, NULL);
> +   if (!zynqmp_pm_debugfs_dir) {
> +   pr_err("debugfs_create_dir failed\n");
> +   return -ENODEV;
> +   }

No, you should NEVER care if a debugfs call returned an error or not, no
need to check it at all.  Your code path should not change based on the
return value as no code should depend on the functionality of debugfs.

Any error returned by a debugfs call can be passed right back into it
with no problems, so again, no need to check this.

> +
> +   zynqmp_pm_debugfs_power =
> +   debugfs_create_file("pm", 0220,
> +   zynqmp_pm_debugfs_dir, NULL,
> +   &fops_zynqmp_pm_dbgfs);
> +   if (!zynqmp_pm_debugfs_power) {
> +   pr_err("debugfs_create_file power failed\n");
> +   err = -ENODEV;
> +   goto err_dbgfs;
> +   }
> +
> +   zynqmp_pm_debugfs_api_version =
> +   debugfs_create_file("api_version", 0444,
> +   zynqmp_pm_debugfs_dir, NULL,
> +   &fops_zynqmp_pm_dbgfs);
> +   if (!zynqmp_pm_debugfs_api_version) {
> +   pr_err("debugfs_create_file api_version failed\n");
> +   err = -ENODEV;
> +   goto err_dbgfs;
> +   }

Why do you save these dentries at all anyway?  You never do anything
with them, just create the files and away you go, no need to worry about
anything.

Remember, debugfs was created to be very simple to use, don't make it
more complex than it has to be please.

thanks,

greg k-h
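
As a rough illustration of the review comments above, an init function that
neither checks debugfs return values nor keeps the file dentries could be
shaped like the sketch below. This is not the driver's code. The struct
definitions and the two debugfs_create_*() functions are user-space stand-ins
so the sketch builds outside the kernel (they only count calls), the stub_*
variables are hypothetical, and the directory name "zynqmp_pm" is an
assumption, since the value of DRIVER_NAME is not shown in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins; NOT the kernel's definitions. */
struct dentry { int unused; };
struct file_operations { int unused; };

static int stub_calls;        /* number of debugfs calls made */
static int stub_fail;         /* when nonzero, every call "fails" */
static struct dentry stub_dentry;

static struct dentry *debugfs_create_dir(const char *name,
					 struct dentry *parent)
{
	assert(name != NULL);
	(void)parent;
	stub_calls++;
	return stub_fail ? NULL : &stub_dentry;
}

static struct dentry *debugfs_create_file(const char *name,
					  unsigned short mode,
					  struct dentry *parent, void *data,
					  const struct file_operations *fops)
{
	assert(name != NULL);
	(void)mode;
	(void)parent;
	(void)data;
	(void)fops;
	stub_calls++;
	return stub_fail ? NULL : &stub_dentry;
}

static const struct file_operations fops_zynqmp_pm_dbgfs = { 0 };

/*
 * No error checks and no saved file dentries: whatever the directory call
 * returns is handed straight to the file-creation calls.
 */
static void zynqmp_pm_api_debugfs_init(void)
{
	struct dentry *dir = debugfs_create_dir("zynqmp_pm", NULL);

	debugfs_create_file("pm", 0220, dir, NULL, &fops_zynqmp_pm_dbgfs);
	debugfs_create_file("api_version", 0444, dir, NULL,
			    &fops_zynqmp_pm_dbgfs);
}
```

The function returns void, so the caller's code path cannot change based on
whether debugfs is available.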

Re: [PATCH 2/5] pinctrl: stm32: add STM32F769 MCU support

2018-01-23 Thread Alexandre Torgue




On 01/22/2018 09:25 AM, Linus Walleij wrote:

On Mon, Dec 11, 2017 at 9:54 AM, Alexandre Torgue
 wrote:


This patch which adds STM32F769 pinctrl and GPIO support, relies on the
generic STM32 pinctrl driver.

Signed-off-by: Alexandre Torgue 


Patch applied as Patrice poked me.

I hope it works fine being applied in isolation from the other
patches?


Yes it does. I will add other patches in my next pull request (for v4.17).

Thanks
Alex



Yours,
Linus Walleij

Re: [PATCH arm/aspeed/ast2500 v1] eSPI: add Aspeed AST2500 eSPI driver to boot a host with PCH runs on eSPI

2018-01-23 Thread Greg KH

On Tue, Jan 16, 2018 at 07:52:32PM +0800, Haiyue Wang wrote:
> When PCH works under eSPI mode, the PMC (Power Management Controller) in
> PCH is waiting for SUS_ACK from BMC after it alerts SUS_WARN. It is in
> dead loop if no SUS_ACK assert. This is the basic requirement for the BMC
> works as eSPI slave.
> 
> Also for the host power on / off actions, from BMC side, the following VW
> (Virtual Wire) messages are done in firmware:
> 1. SLAVE_BOOT_LOAD_DONE / SLAVE_BOOT_LOAD_STATUS
> 2. SUS_ACK
> 3. OOB_RESET_ACK
> 4. HOST_RESET_ACK
> 
> Signed-off-by: Haiyue Wang 
> ---
>  .../devicetree/bindings/misc/aspeed-espi-slave.txt |  20 ++
>  Documentation/misc-devices/espi-slave.rst  | 114 +

DT files need to be split out into a separate patch so that the DT
maintainers can properly review them.

> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -471,6 +471,17 @@ config VEXPRESS_SYSCFG
> ARM Ltd. Versatile Express uses specialised platform configuration
> bus. System Configuration interface is one of the possible means
> of generating transactions on this bus.
> +config ASPEED_ESPI_SLAVE

You need a blank line above this one please.

> + depends on ARCH_ASPEED || COMPILE_TEST
> + select REGMAP_MMIO

Select or depend?

> + tristate "Aspeed ast2500 eSPI slave device"
> + ---help---
> +   This allows host to access Baseboard Management Controller (BMC) over the
> +   Enhanced Serial Peripheral Interface (eSPI) bus, which replaces the Low Pin
> +   Count (LPC) bus.
> +
> +   Its interface supports peripheral, virtual wire, out-of-band, and flash
> +   sharing channels.

What is the module name?

>  
>  config ASPEED_LPC_CTRL
>   depends on (ARCH_ASPEED || COMPILE_TEST) && REGMAP && MFD_SYSCON
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 5ca5f64..a1081f4 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -52,6 +52,7 @@ obj-$(CONFIG_GENWQE)+= genwqe/
>  obj-$(CONFIG_ECHO)   += echo/
>  obj-$(CONFIG_VEXPRESS_SYSCFG)+= vexpress-syscfg.o
>  obj-$(CONFIG_CXL_BASE)   += cxl/
> +obj-$(CONFIG_ASPEED_ESPI_SLAVE) += aspeed-espi-slave.o

Why no tab?

thanks,

greg k-h

Re: [PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist

2018-01-23 Thread Greg KH

On Tue, Jan 23, 2018 at 04:31:11PM +0800, sean.w...@mediatek.com wrote:
> From: Sean Wang 
> 
> commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream.
> 
> commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq
> device with OPP v2") not added MediaTek SoCs to the blacklist that would
> lead to cause an occasional hang or unexpected behaviors on related boards
> as kernelci reported and complained on [1] specifically for 4.14 and 4.15
> tree.
> 
> For those reasons, add MediaTek SoCs into cpufreq-dt blacklist and wish
> the patch be applied to 4.14 and 4.15 tree to allow kernelci able to
> complete following automated kernel testing.
> 
> [1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/
> 
> Fixes: edeec420de24 (cpufreq: dt-cpufreq: platdev Automatically create device 
> with OPP v2)
> Signed-off-by: Andrew-sh Cheng 
> Signed-off-by: Sean Wang 
> Cc: Kevin Hilman 
> ---
>  drivers/cpufreq/cpufreq-dt-platdev.c | 8 
>  1 file changed, 8 insertions(+)

What stable kernel tree(s) are you wanting this backported to?

thanks,

greg k-h

[PATCH 2/2] ARM: dts: imx6sx: add ARM power domain support

2018-01-23 Thread Anson Huang

Add ARM power domain in PGC.

Signed-off-by: Anson Huang 
---
this patch should be based on 
0001-ARM-dts-imx6sx-add-pu-power-domain-support.patch
 arch/arm/boot/dts/imx6sx.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/imx6sx.dtsi b/arch/arm/boot/dts/imx6sx.dtsi
index 42ef4c6..aa29ca6 100644
--- a/arch/arm/boot/dts/imx6sx.dtsi
+++ b/arch/arm/boot/dts/imx6sx.dtsi
@@ -768,6 +768,11 @@
#address-cells = <1>;
#size-cells = <0>;
 
+   power-domain@0 {
+   reg = <0>;
+   #power-domain-cells = <0>;
+   };
+
pd_pu: power-domain@1 {
reg = <1>;
#power-domain-cells = <0>;
-- 
2.7.4

[PATCH 1/2] soc: imx: gpc: ARM power domain should be always-on

2018-01-23 Thread Anson Huang

ARM power domain does NOT support runtime off, always-on
flag should be set to avoid incorrect power state in
pm_genpd_summary:

Before:

root@imx6qpdlsolox:~# cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
domain  status  slaves
/device runtime status
--
ARM off-0

After:

root@imx6qpdlsolox:~# cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
domain  status  slaves
/device runtime status
--
ARM on

Signed-off-by: Anson Huang 
---
 drivers/soc/imx/gpc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/soc/imx/gpc.c b/drivers/soc/imx/gpc.c
index 53f7275..6cafa9b 100644
--- a/drivers/soc/imx/gpc.c
+++ b/drivers/soc/imx/gpc.c
@@ -254,6 +254,7 @@ static struct imx_pm_domain imx_gpc_domains[] = {
{
.base = {
.name = "ARM",
+   .flags = GENPD_FLAG_ALWAYS_ON,
},
}, {
.base = {
-- 
2.7.4

Re: [PATCH] Documentation/ABI: clean up sysfs-class-pktcdvd

2018-01-23 Thread Julia Lawall



On Tue, 23 Jan 2018, Aishwarya Pant wrote:

> Clean up the sysfs documentation such that it is in the same format as
> described in Documentation/ABI/README. Mainly, the patch moves the
> attribute names to the 'What:' field. This might be useful for scripting
> and tracking changes in the ABI.
>
> Signed-off-by: Aishwarya Pant 
> ---
>  Documentation/ABI/testing/sysfs-class-pktcdvd | 122 
> +++---
>  1 file changed, 71 insertions(+), 51 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-class-pktcdvd b/Documentation/ABI/testing/sysfs-class-pktcdvd
> index b1c3f0263359..e85ec99c6e31 100644
> --- a/Documentation/ABI/testing/sysfs-class-pktcdvd
> +++ b/Documentation/ABI/testing/sysfs-class-pktcdvd
> @@ -1,60 +1,80 @@
> -What:   /sys/class/pktcdvd/
> +sysfs interface
> +---
> +The pktcdvd module (packet writing driver) creates the following files in the
> +sysfs: ( is in format major:minor)
> +
> +What:   /sys/class/pktcdvd/add
> +What:   /sys/class/pktcdvd/remove
> +What:   /sys/class/pktcdvd/device_map
>  Date:   Oct. 2006
>  KernelVersion:  2.6.20
>  Contact:Thomas Maier 
>  Description:
>
> -sysfs interface
> 
> + add:(WO) Write a block device id (major:minor) to create
> + a new pktcdvd device and map it to the block device.
> +
> + remove: (WO) Write the pktcdvd device id (major:minor) to 
> it to
> + remove the pktcdvd device.
> +
> + device_map: (RO) Shows the device mapping in format:
> + pktcdvd[0-7]  
> +
> +
> +What:   /sys/class/pktcdvd/pktcdvd[0-7]/dev
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/uevent

It looks like there is a small alignment problem here.  Maybe you use
spaces in one case and tabs in the other.

julia

> +Date:   Oct. 2006
> +KernelVersion:  2.6.20
> +Contact:Thomas Maier 
> +Description:
> + dev:(RO) Device id
> +
> + uevent: (WO) To send an uevent
> +
> +
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/packets_started
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/packets_finished
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_written
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_read
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_read_gather
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/reset
> +Date:   Oct. 2006
> +KernelVersion:  2.6.20
> +Contact:Thomas Maier 
> +Description:
> + packets_started:  (RO) Number of started packets.
> +
> + packets_finished: (RO) Number of finished packets.
> +
> + kb_written:   (RO) kBytes written.
> +
> + kb_read:  (RO) kBytes read.
> +
> + kb_read_gather:   (RO) kBytes read to fill write packets.
> +
> + reset:(WO) Write any value to it to reset pktcdvd
> +   device statistic values, like bytes
> +   read/written.
> +
> +
> +What:/sys/class/pktcdvd/pktcdvd[0-7]/write_queue/size
> +What:
> /sys/class/pktcdvd/pktcdvd[0-7]/write_queue/congestion_off
> +What:
> /sys/class/pktcdvd/pktcdvd[0-7]/write_queue/congestion_on
> +Date:   Oct. 2006
> +KernelVersion:  2.6.20
> +Contact:Thomas Maier 
> +Description:
> + size:   (RO) Contains the size of the bio write queue.
> +
> + congestion_off: (RW) If bio write queue size is below this mark,
> + accept new bio requests from the block layer.
>
> -The pktcdvd module (packet writing driver) creates
> -these files in the sysfs:
> -( is in format  major:minor )
> -
> -/sys/class/pktcdvd/
> -add(0200)  Write a block device id (major:minor)
> -   to create a new pktcdvd device and map
> -   it to the block device.
> -
> -remove (0200)  Write the pktcdvd device id (major:minor)
> -   to it to remove the pktcdvd device.
> -
> -device_map (0444)  Shows the device mapping in format:
> - pktcdvd[0-7]  
> -
> -/sys/class/pktcdvd/pktcdvd[0-7]/
> -dev   (0444) Device id
> -uevent(0200) To send an uevent.
> -
> -/sys/class/pktcdvd/pktcdvd[0-7]/stat/
> -packets_started   (0444) Number of started packets.
> -packets_finished  (0444) Number of finished packets.
> -
> -kb_written(0444) kBytes written.
> -kb_read   (0444) kBytes read.
> -kb_read_gather(0444) kBytes read to fill write packets.
> -
> -reset

[PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)

2018-01-23 Thread Mike Snitzer

On Thu, Jan 18 2018 at  5:20pm -0500,
Bart Van Assche  wrote:

> On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > And yet Laurence cannot reproduce any such lockups with your test...
> 
> Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> already succeeded at running an unmodified version of my tests. In one of the
> e-mails Laurence sent me this morning I read that he modified these scripts
> to get past a kernel module unload failure that was reported while starting
> these tests. So the next step is to check which changes were made to the test
> scripts and also whether the test results are still valid.
> 
> > Are you absolutely certain this patch doesn't help you?
> > https://patchwork.kernel.org/patch/10174037/
> > 
> > If it doesn't then that is actually very useful to know.
> 
> The first thing I tried this morning is to run the srp-test software against a
> merge of Jens' for-next branch and your dm-4.16 branch. Since I noticed that the
> dm queue locked up I reinserted a blk_mq_delay_run_hw_queue() call in the dm
> code. Since even that was not sufficient I tried to kick the queues via debugfs
> (for s in /sys/kernel/debug/block/*/state; do echo kick >$s; done). Since that
> was not sufficient to resolve the queue stall I reverted the following three
> patches that are in Jens' tree:
> * "blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"
> * "blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request"
> * "blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy"
> 
> Only after I had done this the srp-test software ran again without triggering
> dm queue lockups.

Given that Ming's notifier-based patchset needs more development time I
think we're unfortunately past the point where we can comfortably wait
for that to be ready.

So we need to explore alternatives to fixing this IO stall regression.
Rather than attempt the above block reverts (which is an incomplete
listing given newer changes): might we develop a more targeted code
change to neutralize commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
merging via blk_insert_cloned_request feedback")? -- which, given Bart's
findings above, seems to be the most problematic block commit.

To that end, assuming I drop this commit from dm-4.16:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=316a795ad388e0c3ca613454851a28079d917a92

Here is my proposal for putting this regression behind us for 4.16
(Ming's line of development would continue and hopefully be included in
4.17):

From: Mike Snitzer 
Date: Tue, 23 Jan 2018 09:40:22 +0100
Subject: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression

A series of blk-mq changes was intended to improve sequential IO
performance (through improved merging with dm-mpath blk-mq stacked on
an underlying blk-mq device).  Unfortunately these changes have caused
dm-mpath blk-mq IO stalls when blk_mq_request_issue_directly()'s call to
q->mq_ops->queue_rq() fails (due to device-specific resource
unavailability).

Fix this by reverting back to how blk_insert_cloned_request() functioned
prior to commit 396eaf21ee -- by using blk_mq_request_bypass_insert()
instead of blk_mq_request_issue_directly().

In the future, this commit should be reverted as the first change in a
followup series of changes that implements a comprehensive solution to
allowing an underlying blk-mq queue's resource limitation to trigger the
upper blk-mq queue to run once that underlying limited resource is
replenished.

Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Signed-off-by: Mike Snitzer 
---
 block/blk-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cdae69be68e9..a224f282b4a6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2520,7 +2520,8 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 * bypass a potential scheduler on the bottom device for
 * insert.
 */
-   return blk_mq_request_issue_directly(rq);
+   blk_mq_request_bypass_insert(rq, true);
+   return BLK_STS_OK;
}

spin_lock_irqsave(q->queue_lock, flags);
-- 
2.15.0

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse

On Tue, 2018-01-23 at 08:53 +0100, Ingo Molnar wrote:
> 
> The patch below demonstrates the principle, it forcibly enables dynamic ftrace
> patching (CONFIG_DYNAMIC_FTRACE=y et al) and turns mcount/__fentry__ into a RET:
> 
>   81a01a40 <__fentry__>:
>   81a01a40:   c3  retq   
> 
> This would have to be extended with (very simple) call stack depth tracking
> (just 3 more instructions would do in the fast path I believe) and a suitable
> SkyLake workaround (and also has to play nice with the ftrace callbacks).
> 
> On non-SkyLake the overhead would be 0 cycles.

The overhead of forcing CONFIG_DYNAMIC_FTRACE=y is precisely zero
cycles? That seems a little optimistic. ;)

I'll grant you if it goes straight to a 'ret' it isn't *that* high
though.

> On SkyLake this would add an overhead of maybe 2-3 cycles per function call and
> obviously all this code and data would be very cache hot. Given that the average
> number of function calls per system call is around a dozen, this would be _much_
> faster than any microcode/MSR based approach.

That's kind of neat, except you don't want it at the top of the
function; you want it at the bottom.

If you could hijack the *return* site, then you could check for
underflow and stuff the RSB right there. But in __fentry__ there's not
a lot you can do other than complain that something bad is going to
happen in the future. You know that a string of 16+ rets is going to
happen, but you've got no gadget in *there* to deal with it when it
does.

HJ did have patches to turn 'ret' into a form of retpoline, which I
don't think ever even got performance-tested. They'd have forced a
mispredict on *every* ret. A cheaper option might be to turn ret into a
'jmp skylake_ret_hack'. Which on pre-SKL will be a bare ret, and SKL+
can do the counting (in conjunction with a 'per_cpu(call_depth)++' in
__fentry__) and stuff the RSB before actually returning, when
appropriate.

By the time you've made it work properly, I suspect we're approaching
the barf-factor of IBRS, for a less complete solution.

> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could run?

Andi's been experimenting at 
https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=spec/deep-chain-3

> Is there a description of the exact speculative execution vulnerability that has
> to be addressed to begin with?

"It takes predictions from the generic branch target buffer when the
RSB underflows".

IBRS filters what can come from the BTB, and resolves the problem that
way. Retpoline avoids the indirect branches that on *earlier* CPUs were
the only things that would use the offending predictions. But on SKL,
now 'ret' is one of the problematic instructions too. Fun! :)
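
A toy user-space model of that underflow, purely as an illustration: the 16-entry depth is the figure used in this thread, while the struct and the function names are assumptions made up for the sketch, not kernel or hardware interfaces.

```c
#include <stdbool.h>

#define RSB_ENTRIES 16	/* depth mentioned in the thread; model only */

struct rsb_model {
	unsigned int valid;		/* number of usable return predictions */
	unsigned int underflows;	/* returns that found the RSB empty */
};

/* A call pushes one return prediction; the oldest entry is lost when full. */
static void model_call(struct rsb_model *m)
{
	if (m->valid < RSB_ENTRIES)
		m->valid++;
}

/*
 * A return pops one prediction. Returns true when the RSB was empty, which
 * is the case where, per the description above, SKL falls back to the BTB.
 */
static bool model_ret(struct rsb_model *m)
{
	if (m->valid == 0) {
		m->underflows++;
		return true;
	}
	m->valid--;
	return false;
}

/* Run 'depth' nested calls followed by 'depth' returns; count underflows. */
static unsigned int model_call_chain(unsigned int depth)
{
	struct rsb_model m = { 0, 0 };
	unsigned int i;

	for (i = 0; i < depth; i++)
		model_call(&m);
	for (i = 0; i < depth; i++)
		model_ret(&m);
	return m.underflows;
}
```

In this model a call chain of up to 16 frames unwinds without underflow, and every frame beyond that produces one return with no RSB prediction. It says nothing about timing or about the other events that empty the RSB.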

> If this approach is workable I'd much prefer it to any MSR writes in the syscall
> entry path not just because it's fast enough in practice to not be turned off by
> everyone, but also because everyone would agree that per function call overhead
> needs to go away on new CPUs. Both deployment and backporting is also _much_ more
> flexible, simpler, faster and more complete than microcode/firmware or compiler
> based solutions.
>
> Assuming the vulnerability can be addressed via this route that is, which is a big
> assumption!

I think it's close. There are some other cases which empty the RSB,
like sleeping and loading microcode, which can happily be special-
cased. Andi's rounded up many of the remaining details already at 
https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=spec/skl-rsb-3

And there's SMI, which is a pain but I think Linus is right we can
possibly just stick our fingers in our ears and pretend we didn't hear
about that one as it's likely to be hard to trigger (famous last
words).

On the whole though, I think you can see why we're keeping IBRS around
for now, sent out purely as an RFC and rebased on top of the stuff
we're *actually* sending to Linus for inclusion.

When we have a clear idea of what we're doing for Skylake, it'll be
useful to have a proper comparison of the security, the performance and
the "ick" factor of whatever we come up with, vs. IBRS.

Right now the plan is just "screw Skylake"; we'll just forget it's a
special snowflake and treat it like everything else, except for a bit
of extra RSB-stuffing on context switch (since we had to add that for
!SMEP anyway). And that's not *entirely* unreasonable but as I said I'd
*really* like to have a decent analysis of the implications of that,
not just some hand-wavy "nah, it'll be fine".


[RFC PATCH 2/2] hv_netvsc: Change GPADL teardown order according to Hyper-V version

2018-01-23 Thread Mohammed Gamal

Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split")
introduced a regression causing VMs not to shut down on pre-Win2016
hosts after netvsc_device_remove() is called. This was caused by the
change to the GPADL teardown sequence.

This patch restores the old behavior for pre-Win2016 hosts, while
keeping the changes from 0cf7378 for Win2016 and higher hosts.

Signed-off-by: Mohammed Gamal 
---
 drivers/net/hyperv/netvsc.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 3982f76..d09bb3b 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -575,8 +575,17 @@ void netvsc_device_remove(struct hv_device *device)
 
cancel_work_sync(&net_device->subchan_work);
 
+   /*
+* Revoke receive buffer. If host is pre-Win2016 then tear down
+* receive buffer GPADL. Do the same for send buffer.
+*/
netvsc_revoke_recv_buf(device, net_device);
+   if (vmbus_proto_version < VERSION_WIN10)
+   netvsc_teardown_recv_buf_gpadl(device, net_device);
+
netvsc_revoke_send_buf(device, net_device);
+   if (vmbus_proto_version < VERSION_WIN10)
+   netvsc_teardown_send_buf_gpadl(device, net_device);
 
RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
@@ -589,8 +598,14 @@ void netvsc_device_remove(struct hv_device *device)
/* Now, we can close the channel safely */
vmbus_close(device->channel);
 
-   netvsc_teardown_recv_buf_gpadl(device, net_device);
-   netvsc_teardown_send_buf_gpadl(device, net_device);
+   /*
+* If host is Win2016 or higher then we do the GPADL tear down
+* here after VMBus is closed, instead of doing it earlier.
+*/
+   if (vmbus_proto_version >= VERSION_WIN10) {
+   netvsc_teardown_recv_buf_gpadl(device, net_device);
+   netvsc_teardown_send_buf_gpadl(device, net_device);
+   }
 
/* And dissassociate NAPI context from device */
for (i = 0; i < net_device->num_chn; i++)
-- 
1.8.3.1

[RFC PATCH 1/2] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()

2018-01-23 Thread Mohammed Gamal

Split each of the functions into two for each of send/recv buffers

Signed-off-by: Mohammed Gamal 
---
 drivers/net/hyperv/netvsc.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index bfc7969..3982f76 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -100,8 +100,8 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
call_rcu(&nvdev->rcu, free_netvsc_device);
 }
 
-static void netvsc_revoke_buf(struct hv_device *device,
- struct netvsc_device *net_device)
+static void netvsc_revoke_recv_buf(struct hv_device *device,
+  struct netvsc_device *net_device)
 {
struct nvsp_message *revoke_packet;
struct net_device *ndev = hv_get_drvdata(device);
@@ -146,6 +146,14 @@ static void netvsc_revoke_buf(struct hv_device *device,
}
net_device->recv_section_cnt = 0;
}
+}
+
+static void netvsc_revoke_send_buf(struct hv_device *device,
+  struct netvsc_device *net_device)
+{
+   struct nvsp_message *revoke_packet;
+   struct net_device *ndev = hv_get_drvdata(device);
+   int ret;
 
/* Deal with the send buffer we may have setup.
 * If we got a  send section size, it means we received a
@@ -189,8 +197,8 @@ static void netvsc_revoke_buf(struct hv_device *device,
}
 }
 
-static void netvsc_teardown_gpadl(struct hv_device *device,
- struct netvsc_device *net_device)
+static void netvsc_teardown_recv_buf_gpadl(struct hv_device *device,
+  struct netvsc_device *net_device)
 {
struct net_device *ndev = hv_get_drvdata(device);
int ret;
@@ -215,6 +223,13 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
vfree(net_device->recv_buf);
net_device->recv_buf = NULL;
}
+}
+
+static void netvsc_teardown_send_buf_gpadl(struct hv_device *device,
+  struct netvsc_device *net_device)
+{
+   struct net_device *ndev = hv_get_drvdata(device);
+   int ret;
 
if (net_device->send_buf_gpadl_handle) {
ret = vmbus_teardown_gpadl(device->channel,
@@ -425,8 +440,10 @@ static int netvsc_init_buf(struct hv_device *device,
goto exit;
 
 cleanup:
-   netvsc_revoke_buf(device, net_device);
-   netvsc_teardown_gpadl(device, net_device);
+   netvsc_revoke_recv_buf(device, net_device);
+   netvsc_revoke_send_buf(device, net_device);
+   netvsc_teardown_recv_buf_gpadl(device, net_device);
+   netvsc_teardown_send_buf_gpadl(device, net_device);
 
 exit:
return ret;
@@ -558,7 +575,8 @@ void netvsc_device_remove(struct hv_device *device)
 
cancel_work_sync(&net_device->subchan_work);
 
-   netvsc_revoke_buf(device, net_device);
+   netvsc_revoke_recv_buf(device, net_device);
+   netvsc_revoke_send_buf(device, net_device);
 
RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
 
@@ -571,7 +589,8 @@ void netvsc_device_remove(struct hv_device *device)
/* Now, we can close the channel safely */
vmbus_close(device->channel);
 
-   netvsc_teardown_gpadl(device, net_device);
+   netvsc_teardown_recv_buf_gpadl(device, net_device);
+   netvsc_teardown_send_buf_gpadl(device, net_device);
 
/* And dissassociate NAPI context from device */
for (i = 0; i < net_device->num_chn; i++)
-- 
1.8.3.1

[RFC PATCH 0/2] hv_netvsc: Fix shutdown regression on Win2012 hosts

2018-01-23 Thread Mohammed Gamal

Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") introduced
a regression that caused VMs not to shutdown after netvsc_device_remove() is
called. This is caused by GPADL teardown sequence change, and while that was 
necessary to fix issues with Win2016 hosts, it did introduce a regression for
earlier versions.

Prior to commit 0cf737808 the call sequence in netvsc_device_remove() was as 
follows (as implemented in netvsc_destroy_buf()):
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Teardown receive buffer GPADL
3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
4- Teardown send buffer GPADL
5- Close vmbus

This didn't work for WS2016 hosts. Commit 0cf737808 split netvsc_destroy_buf()
into two functions and rearranged the order as follows
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Close vmbus
4- Teardown receive buffer GPADL
5- Teardown send buffer GPADL

That worked well for WS2016 hosts, but for WS2012 hosts it prevented VMs from
shutting down. 

This patch series works around this problem. The first patch splits
netvsc_revoke_buf() and netvsc_teardown_gpadl() into two finer grained
functions for tearing down send and receive buffers individually. The second
patch uses the finer grained functions to implement the teardown sequence
according to the host's version. We keep the behavior introduced in
0cf737808ae7 for Windows 2016 hosts, while we re-introduce the old sequence
for earlier versions.
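
The two orderings listed above can be sketched as a small user-space model. This is only an illustration: the enum, the function name and the boolean parameter are hypothetical stand-ins for the driver's functions and its protocol-version check, not the code in the patches.

```c
#include <stddef.h>

/* Hypothetical step identifiers; the real driver calls functions instead. */
enum teardown_step {
	REVOKE_RECV_BUF,
	TEARDOWN_RECV_GPADL,
	REVOKE_SEND_BUF,
	TEARDOWN_SEND_GPADL,
	CLOSE_VMBUS,
	NUM_STEPS
};

/*
 * Fill 'order' with the removal sequence for the given host type and
 * return the number of steps written. 'host_is_win2016_or_later' stands
 * in for the driver's protocol-version comparison.
 */
static size_t netvsc_removal_order(int host_is_win2016_or_later,
				   enum teardown_step order[NUM_STEPS])
{
	size_t n = 0;

	order[n++] = REVOKE_RECV_BUF;
	if (!host_is_win2016_or_later)
		order[n++] = TEARDOWN_RECV_GPADL;

	order[n++] = REVOKE_SEND_BUF;
	if (!host_is_win2016_or_later)
		order[n++] = TEARDOWN_SEND_GPADL;

	order[n++] = CLOSE_VMBUS;

	/* Newer hosts get the GPADL teardown only after the channel is closed. */
	if (host_is_win2016_or_later) {
		order[n++] = TEARDOWN_RECV_GPADL;
		order[n++] = TEARDOWN_SEND_GPADL;
	}
	return n;
}
```

Both host types run the same five steps; only the position of the two GPADL teardown steps relative to the channel close differs.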

Mohammed Gamal (2):
  hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
  hv_netvsc: Change GPADL teardown order according to Hyper-V version

 drivers/net/hyperv/netvsc.c | 50 +
 1 file changed, 42 insertions(+), 8 deletions(-)

-- 
1.8.3.1

Re: [PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist

2018-01-23 Thread Greg KH

On Tue, Jan 23, 2018 at 05:38:34PM +0800, Sean Wang wrote:
> On Tue, 2018-01-23 at 09:46 +0100, Greg KH wrote:
> > On Tue, Jan 23, 2018 at 04:31:11PM +0800, sean.w...@mediatek.com wrote:
> > > From: Sean Wang 
> > > 
> > > commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream.
> > > 
> > > commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq
> > > device with OPP v2") not added MediaTek SoCs to the blacklist that would
> > > lead to cause an occasional hang or unexpected behaviors on related boards
> > > as kernelci reported and complained on [1] specifically for 4.14 and 4.15
> > > tree.
> > > 
> > > For those reasons, add MediaTek SoCs into cpufreq-dt blacklist and wish
> > > the patch be applied to 4.14 and 4.15 tree to allow kernelci able to
> > > complete following automated kernel testing.
> > > 
> > > [1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/
> > > 
> > > Fixes: edeec420de24 (cpufreq: dt-cpufreq: platdev Automatically create device with OPP v2)
> > > Signed-off-by: Andrew-sh Cheng 
> > > Signed-off-by: Sean Wang 
> > > Cc: Kevin Hilman 
> > > ---
> > >  drivers/cpufreq/cpufreq-dt-platdev.c | 8 
> > >  1 file changed, 8 insertions(+)
> > 
> > What stable kernel tree(s) are you wanting this backported to?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Hi, Greg,
> 
> thanks for your help!
> 
> stable and stable-rc are those trees I want this backported to 

I don't understand, what exactly do you mean by this?

Have you read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly?

> Hi, Viresh
> 
> currently, can the patch be permitted to go through tree linux-pm branch
> master to be part of mainline? 

Wait, this is not in Linus's tree already?  If not, what is that big
"commit  upstream" in the changelog for?

I don't see that commit in Linus's tree at all.

totally confused,

greg k-h

Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2

2018-01-23 Thread Will Deacon

On Mon, Jan 22, 2018 at 02:00:59PM -0500, Jon Masters wrote:
> On 01/22/2018 06:33 AM, Will Deacon wrote:
> > On Fri, Jan 19, 2018 at 04:22:47AM -0800, Jayachandran C wrote:
> >> Use PSCI based mitigation for speculative execution attacks targeting
> >> the branch predictor. We use the same mechanism as the one used for
> >> Cortex-A CPUs, we expect the PSCI version call to have a side effect
> >> of clearing the BTBs.
> >>
> >> Signed-off-by: Jayachandran C 
> >> ---
> >>  arch/arm64/kernel/cpu_errata.c | 10 ++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> >> index 70e5f18..45ff9a2 100644
> >> --- a/arch/arm64/kernel/cpu_errata.c
> >> +++ b/arch/arm64/kernel/cpu_errata.c
> >> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> >>.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
> >>MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
> >>},
> >> +  {
> >> +  .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> >> +  MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
> >> +  .enable = enable_psci_bp_hardening,
> >> +  },
> >> +  {
> >> +  .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> >> +  MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
> >> +  .enable = enable_psci_bp_hardening,
> >> +  },
> >>  #endif
> > 
> > Thanks.
> > 
> > Acked-by: Will Deacon 
> 
> Thanks. I have separately asked for a specification tweak to allow us to
> discover whether firmware has been augmented to provide the necessary
> support that we need. That applies beyond Cavium.

AFAIK, there's already an SMCCC/PSCI proposal doing the rounds that is
discoverable and does what we need. Have you seen it? We should be posting
code this week.

Will

Re: [PATCH 2/2] mfd: smsc-ece1099: Improve a size determination in smsc_i2c_probe()

2018-01-23 Thread Lee Jones

On Tue, 16 Jan 2018, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Tue, 16 Jan 2018 08:58:26 +0100
> 
> Replace the specification of a data structure by a pointer dereference
> as the parameter for the operator "sizeof" to make the corresponding size
> determination a bit safer according to the Linux coding style convention.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/mfd/smsc-ece1099.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/mfd/smsc-ece1099.c b/drivers/mfd/smsc-ece1099.c
> index b9d96651cc0d..6681205dd2c0 100644
> --- a/drivers/mfd/smsc-ece1099.c
> +++ b/drivers/mfd/smsc-ece1099.c
> @@ -33,12 +33,10 @@ static const struct regmap_config smsc_regmap_config = {
>  static int smsc_i2c_probe(struct i2c_client *i2c,
>   const struct i2c_device_id *id)
>  {
> - struct smsc *smsc;
>   int devid, rev, venid_l, venid_h;
>   int ret;
> + struct smsc *smsc = devm_kzalloc(&i2c->dev, sizeof(*smsc), GFP_KERNEL);

Please keep these separate.

> - smsc = devm_kzalloc(&i2c->dev, sizeof(struct smsc),
> - GFP_KERNEL);
>   if (!smsc)
>   return -ENOMEM;
>  
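
A minimal user-space sketch of the requested form, with the declaration kept separate from the allocation while still using sizeof(*ptr). It is only an illustration: calloc() stands in for devm_kzalloc(), and struct smsc_example and smsc_example_alloc() are made-up placeholders, not driver code.

```c
#include <stdlib.h>

/* Placeholder structure; the real struct smsc lives in the driver headers. */
struct smsc_example {
	int devid;
	int rev;
};

/* Returns a zeroed object, or NULL on allocation failure. */
static struct smsc_example *smsc_example_alloc(void)
{
	struct smsc_example *smsc;	/* declaration on its own line */

	/* sizeof(*smsc) keeps following the pointer's type if it changes */
	smsc = calloc(1, sizeof(*smsc));
	if (!smsc)
		return NULL;

	return smsc;
}
```

The caller owns the returned memory and releases it with free(); the managed devm_* allocation in the driver has no such explicit release.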

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

RE: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu aperture validity check

2018-01-23 Thread Shameerali Kolothum Thodi

Hi Eric,

> -Original Message-
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: Tuesday, January 23, 2018 8:25 AM
> To: Alex Williamson ; Shameerali Kolothum
> Thodi 
> Cc: pmo...@linux.vnet.ibm.com; k...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linuxarm ; John Garry
> ; xuwei (O) 
> Subject: Re: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu
> aperture validity check
> 
> Hi Shameer,
> 
> On 18/01/18 01:04, Alex Williamson wrote:
> > On Fri, 12 Jan 2018 16:45:27 +
> > Shameer Kolothum  wrote:
> >
> >> This introduces an iova list that is valid for dma mappings. Make
> >> sure the new iommu aperture window is valid and doesn't conflict
> >> with any existing dma mappings during attach. Also update the iova
> >> list with new aperture window during attach/detach.
> >>
> >> Signed-off-by: Shameer Kolothum
> 
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 177
> 
> >>  1 file changed, 177 insertions(+)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index e30e29a..11cbd49 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -60,6 +60,7 @@
> >>
> >>  struct vfio_iommu {
> >>struct list_headdomain_list;
> >> +  struct list_headiova_list;
> >>struct vfio_domain  *external_domain; /* domain for external user */
> >>struct mutexlock;
> >>struct rb_root  dma_list;
> >> @@ -92,6 +93,12 @@ struct vfio_group {
> >>struct list_headnext;
> >>  };
> >>
> >> +struct vfio_iova {
> >> +  struct list_headlist;
> >> +  phys_addr_t start;
> >> +  phys_addr_t end;
> >> +};
> >
> > dma_list uses dma_addr_t for the iova.  IOVAs are naturally DMA
> > addresses, why are we using phys_addr_t?
> >
> >> +
> >>  /*
> >>   * Guest RAM pinning working set or DMA target
> >>   */
> >> @@ -1192,6 +1199,123 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group *group, phys_addr_t *base)
> >>return ret;
> >>  }
> >>
> >> +static int vfio_insert_iova(phys_addr_t start, phys_addr_t end,
> >> +  struct list_head *head)
> >> +{
> >> +  struct vfio_iova *region;
> >> +
> >> +  region = kmalloc(sizeof(*region), GFP_KERNEL);
> >> +  if (!region)
> >> +  return -ENOMEM;
> >> +
> >> +  INIT_LIST_HEAD(&region->list);
> >> +  region->start = start;
> >> +  region->end = end;
> >> +
> >> +  list_add_tail(&region->list, head);
> >> +  return 0;
> >> +}
> >
> > As I'm reading through this series, I'm learning that there are a lot
> > of assumptions and subtle details that should be documented.  For
> > instance, the IOMMU API only provides a single geometry and we build
> > upon that here as this patch creates a list, but there's only a single
> > entry for now.  The following patches carve that single iova range into
> > pieces and somewhat subtly use the list_head passed to keep the list
> > sorted, allowing the first/last_entry tricks used throughout.  Subtle
> > interfaces are prone to bugs.
> >
> >> +
> >> +/*
> >> + * Find whether a mem region overlaps with existing dma mappings
> >> + */
> >> +static bool vfio_find_dma_overlap(struct vfio_iommu *iommu,
> >> +phys_addr_t start, phys_addr_t end)
> >> +{
> >> +  struct rb_node *n = rb_first(&iommu->dma_list);
> >> +
> >> +  for (; n; n = rb_next(n)) {
> >> +  struct vfio_dma *dma;
> >> +
> >> +  dma = rb_entry(n, struct vfio_dma, node);
> >> +
> >> +  if (end < dma->iova)
> >> +  break;
> >> +  if (start >= dma->iova + dma->size)
> >> +  continue;
> >> +  return true;
> >> +  }
> >> +
> >> +  return false;
> >> +}
> >
> > Why do we need this in addition to the existing vfio_find_dma()?  Why
> > doesn't this use the tree structure of the dma_list?
> >
> >> +
> >> +/*
> >> + * Check the new iommu aperture is a valid one
> >> + */
> >> +static int vfio_iommu_valid_aperture(struct vfio_iommu *iommu,
> >> +   phys_addr_t start,
> >> +   phys_addr_t end)
> >> +{
> >> +  struct vfio_iova *first, *last;
> >> +  struct list_head *iova = &iommu->iova_list;
> >> +
> >> +  if (list_empty(iova))
> >> +  return 0;
> >> +
> >> +  /* Check if new one is outside the current aperture */
> >
> > "Disjoint sets"
> >
> >> +  first = list_first_entry(iova, struct vfio_iova, list);
> >> +  last = list_last_entry(iova, struct vfio_iova, list);
> >> +  if ((start > last->end) || (end < first->start))
> >> +  return -EINVAL;
> >> +
> >> +  /* Check for any existing dma mappings outside the new start */
> >> +  if (start > first->start) {
> >> +  if

Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-23 Thread gengdongjiu

sorry fix a typo.

On 2018/1/23 17:23, gengdongjiu wrote:
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I think we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> |  I don't think we can just plumb SEI in like this and be done with it.
>> |  (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> |  This way we solve the same 'cant do this from NMI context' with the same
>> |  code'.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, its not ready
>> to be considered yet.

Yes, I know you are doing that. Your patch series will take all of the above
into account, right?
If it does, this patch can be based on your patchset.
Thanks.

> 
>>

[PATCH] rtc: ds1302: remove redundant initializations of pointer bp

2018-01-23 Thread Colin King

From: Colin Ian King 

Pointer bp is being initialized and this value is never read; it
is updated to the same value later, just before it is going to
be used. Remove the initialization as it is never read and keep
the setting of bp closer to the use of bp.

Cleans up clang warnings:
drivers/rtc/rtc-ds1302.c:115:7: warning: Value stored to 'bp' during
its initialization is never read
drivers/rtc/rtc-ds1302.c:46:7: warning: Value stored to 'bp' during
its initialization is never read

Signed-off-by: Colin Ian King 
---
 drivers/rtc/rtc-ds1302.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/rtc-ds1302.c b/drivers/rtc/rtc-ds1302.c
index 0ec4be62322b..43bcb17c922e 100644
--- a/drivers/rtc/rtc-ds1302.c
+++ b/drivers/rtc/rtc-ds1302.c
@@ -43,7 +43,7 @@ static int ds1302_rtc_set_time(struct device *dev, struct rtc_time *time)
 {
struct spi_device   *spi = dev_get_drvdata(dev);
u8  buf[1 + RTC_CLCK_LEN];
-   u8  *bp = buf;
+   u8  *bp;
int status;
 
/* Enable writing */
@@ -112,7 +112,7 @@ static int ds1302_probe(struct spi_device *spi)
struct rtc_device   *rtc;
u8  addr;
u8  buf[4];
-   u8  *bp = buf;
+   u8  *bp;
int status;
 
/* Sanity check board setup data.  This may be hooked up
-- 
2.15.1
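
The pattern the patch above moves to can be sketched in user space as follows. This is an illustration only: build_frame() and its parameters are hypothetical, not part of rtc-ds1302.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write a command byte followed by 'n' payload bytes into 'buf' and return
 * the number of bytes written. The caller must provide at least n + 1 bytes.
 */
static size_t build_frame(uint8_t *buf, uint8_t cmd,
			  const uint8_t *payload, size_t n)
{
	uint8_t *bp;	/* no initializer: that value would never be read */
	size_t i;

	bp = buf;	/* assigned right before its first use */
	*bp++ = cmd;
	for (i = 0; i < n; i++)
		*bp++ = payload[i];

	return (size_t)(bp - buf);
}
```

With a single assignment next to the first use there is no dead store left for a static analyzer to flag.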

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar


* David Woodhouse  wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call and
> > obviously all this code and data would be very cache hot. Given that the average
> > number of function calls per system call is around a dozen, this would be _much_
> > faster than any microcode/MSR based approach.
> 
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
> 
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that 
CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from 
the CALL trampoline - see my previous email.

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

 - GCC support for it is not as widely available and return instrumentation is 
   less tested in Linux kernel contexts

 - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already
   enabled in distros here and today, so the runtime overhead to non-SkyLake CPUs
   would be literally zero, while still allowing to fix the RSB vulnerability on
   SkyLake.

Thanks,

Ingo

Re: [PULL] alpha.git

2018-01-23 Thread Mikulas Patocka



On Sat, 20 Jan 2018, Matt Turner wrote:

> Hi Linus,
> 
> Please pull my alpha git tree. It contains a build fix and a regression fix.
> 
> Hopefully still in time for 4.15 :)
> 
> Thanks,
> Matt

Hi

Will you also submit these patches? The first one fixes a crash when 
pthread_create races with signal delivery, it could cause random crashing 
in applications.

https://marc.info/?l=linux-alpha&m=151491969711913&w=2
https://marc.info/?l=linux-alpha&m=151491960011839&w=2
https://marc.info/?l=linux-alpha&m=151491963911901&w=2

Mikulas

> The following changes since commit 8cbab92dff778e516064c13113ca15d4869ec883:
> 
>  Merge tag 'for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma (2018-01-16 16:47:40
> -0800)
> 
> are available in the git repository at:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha.git for-linus
> 
> for you to fetch changes up to 86be89939d11a84800f66e2a283b915b704bf33d:
> 
>  alpha/PCI: Fix noname IRQ level detection (2018-01-20 16:22:36 -0800)
> 
> 
> Lorenzo Pieralisi (1):
>  alpha/PCI: Fix noname IRQ level detection
> 
> Michael Cree (1):
>  alpha: extend memset16 to EV6 optimised routines
> 
> arch/alpha/kernel/sys_sio.c | 35 +--
> arch/alpha/lib/ev6-memset.S | 12 ++--
> 2 files changed, 35 insertions(+), 12 deletions(-)
>

Re: [PATCH v6 15/15] MIPS: ingenic: Initial GCW Zero support

2018-01-23 Thread Philippe Ombredanne

Paul:

On Wed, Jan 10, 2018 at 11:59 PM, Paul Cercueil  wrote:
> Hi Philippe,
>
> Le dim. 7 janv. 2018 à 17:18, Philippe Ombredanne  a
> écrit :
>>
>> On Fri, Jan 5, 2018 at 7:25 PM, Paul Cercueil 
>> wrote:
>>>
>>>  The GCW Zero (http://www.gcw-zero.com) is a retro-gaming focused
>>>  handheld game console, successfully kickstarted in ~2012, running Linux.
>>>
>>>  Signed-off-by: Paul Cercueil 
>>>  Acked-by: Mathieu Malaterre 
>>>  ---
>>>   arch/mips/boot/dts/ingenic/Makefile |  1 +
>>>   arch/mips/boot/dts/ingenic/gcw0.dts | 62
>>> +
>>>   arch/mips/configs/gcw0_defconfig| 27 
>>>   arch/mips/jz4740/Kconfig|  4 +++
>>>   4 files changed, 94 insertions(+)
>>>   create mode 100644 arch/mips/boot/dts/ingenic/gcw0.dts
>>>   create mode 100644 arch/mips/configs/gcw0_defconfig
>>>
>>>   v2: No change
>>>   v3: No change
>>>   v4: No change
>>>   v5: Use SPDX license identifier
>>>   Drop custom CROSS_COMPILE from defconfig
>>>   v6: Add "model" property in devicetree
>>
>>
>> For the use of SPDX tags for the whole patch set: thank you!
>>
>> Acked-by: Philippe Ombredanne 
>
>
> Is your Acked-by for the whole patchset? Or just this one patch?

Sorry for the late reply!
This is for the whole patchset, for your use of SPDX tags.

-- 
Cordially
Philippe Ombredanne

Re: Network interface "stops working"

2018-01-23 Thread Turbo Fredriksson

Not sure it’s a bug yet, but does anyone have any ideas on how I can find out?

> On 22 Jan 2018, at 23:32, Cong Wang  wrote:
> 
> (Please always Cc netdev for networking related bugs.)
> 
> On Mon, Jan 22, 2018 at 2:02 AM, Turbo Fredriksson  wrote:
>> I just got a new broadband delivered at home. It is "Hyperoptic 1Gbps fiber" 
>> which comes as a ethernet connector at home. I wasn’t around
>> when they connected up everything, so I’m not sure *where* the fiber starts, 
>> but either way, I have an ethernet jack in one of my rooms.
>> 
>> They also provided me with a ZTE router. I have need for my own services 
>> (firewalling, NATing, IPSEC and what not), so don’t want to use
>> the provided router..
>> 
>> However, I’m having serious trouble keeping the interface up! Works for a 
>> few minutes and then just “stops working”. Don’t know why, there’s
>> nothing in the logs or from dmesg..
>> 
>> Taking the interface down and then up again usually solves it. For a few 
>> minutes.
>> 
>> Also, when running the interface (a Intel 82576 Gigabit dual port, using the 
>> igb driver - tried e1000 and e1000e but they don’t find any interfaces),
>> in 1Gbps mode, the interface starts flapping up and down and I can’t get a 
>> connection at all. So my interface definition runs a script to use
>> ethtool to set the speed to 100Mbps, full duplex, no auto negotiation. Which 
>> “kinda” works (for a while, hence my problems).
>> 
>> Because the provided router works just fine, I’m sure it’s something on my 
>> Linux box (Debian GNU/Linux Jessie) that does it.. I’ve tried running
>> it without any iptables, in both static and DHCP mode but same problem..
>> 
>> I’m not a complete beginner with Linux nor networks, but this is too “close
>> to the hardware” for me. I’m at a loss as to what else to try..
>> 
>> This isn’t a new machine, it has served me very well for five years give or
>> take and I’ve never had any problems with it (not to say that it still
>> can’t be hardware problems, but I find that somewhat unlikely at the moment).
>> 
>> 
>> Could anyone please advise on what I can try to pinpoint the problem
>> (and/or possibly fix it)?




Re: unixbench context switch perfomance & cpu topology

2018-01-23 Thread Wanpeng Li

2018-01-22 20:53 GMT+08:00 Peter Zijlstra :
> On Mon, Jan 22, 2018 at 07:47:45PM +0800, Wanpeng Li wrote:
>> Hi all,
>>
>> We can observe unixbench context switch performance is heavily
>> influenced by cpu topology which is exposed to the guest. the score is
>> posted below, bigger is better, both the guest and the host kernel are
>> 3.15-rc3(we can also reproduce against centos 7.4 693 guest/host), LLC
>> is exposed to the guest, kvm adaptive halt-polling is default enabled,
>> then start a guest w/ 8 logical cpus.
>>
>>
>>
>> unixbench context switch
>>
>> guest topology                                                       score
>> -smp 8, sockets=8, cores=1, threads=1                                382036
>> -smp 8, sockets=4, cores=2, threads=1                                132480
>> -smp 8, sockets=2, cores=4, threads=1                                128032
>> -smp 8, sockets=2, cores=2, threads=2                                131767
>> -smp 8, sockets=1, cores=4, threads=2                                132742
>> -smp 8, sockets=1, cores=4, threads=2 (guest w/ nohz=off idle=poll)  331471
>>
>> I can observe there are a lot of reschedule IPIs sent from one vCPU to
>> another vCPU, the context switch workload switches between running and
>> idle frequently which results in HLT instruction in the idle path, I
>> use idle=poll to avoid vmexit due to HLT and to avoid reschedule IPIs
>> since idle task checks TIF_NEED_RESCHED flags in a loop, nohz=off can
>> stop to program lapic timer/other nohz stuffs. Any idea why sockets=8
>> can get best performance?
>
> I suspect because we load-balance less aggressively across nodes than we
> do within a cache domain.

It is true. After taking a closer look with kernelshark, the context1
task in the guest is migrated to another logical cpu after several
milliseconds for sockets=1, cores=4, threads=2; however, it can stay
on one logical cpu for around several seconds for sockets=8, cores=1,
threads=1 before migrating to another one.

>
> Fix your benchmark to pin itself to a single CPU, that's the only
> sensible way to obtain this number in any case.

Yeah, this setup can get good performance. Actually the two context1
tasks don't stack up on one logical cpu most of the time, as observed
with kernelshark, contrary to Mike's reply. In addition, I can observe
that the sum of RESCHED IPIs in the guest for sockets=1, cores=4,
threads=2 is 4.5 times that for sockets=8, cores=1, threads=1. Any idea
how this can happen? I suspect the TTWU path selects another idle
logical cpu, which makes a RESCHED IPI unavoidable. However, there is
still no performance benefit after I clear SD_BALANCE_WAKE for the
corresponding sched_domains.

Regards,
Wanpeng Li

Re: [PATCH v3 11/20] arm64: mm: Map entry trampoline into trampoline and kernel page tables

2018-01-23 Thread Yisheng Xie

Hi Will,

On 2018/1/23 18:04, Will Deacon wrote:
> On Tue, Jan 23, 2018 at 04:28:45PM +0800, Yisheng Xie wrote:
>> On 2017/12/6 20:35, Will Deacon wrote:
>>> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
>>> +static int __init map_entry_trampoline(void)
>>> +{
>>> +   extern char __entry_tramp_text_start[];
>>> +
>>> +   pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
>>> +   phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
>>> +
>>> +   /* The trampoline is always mapped and can therefore be global */
>>> +   pgprot_val(prot) &= ~PTE_NG;
>>> +
>>> +   /* Map only the text into the trampoline page table */
>>> +   memset(tramp_pg_dir, 0, PGD_SIZE);
>>> +   __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
>>> +prot, pgd_pgtable_alloc, 0);
>>
>> How is tramp_pg_dir used? Should it be set to ttbr1 when exiting the kernel?
>> Sorry, I could not find where it is used.
> 
> Yes, that's what happens when we return to userspace. The code is a little
> convoluted, but the tramp_pg_dir is placed at a fixed offset from swapper
> (see the linker script) so the sub instruction in tramp_unmap_kernel is what
> gives us the ttbr1 value we need.

Oh, I missed that. Maybe an inline comment would make this easier to
understand. Thanks once more for your help and explanation :)

Thanks
Yisheng
> 
> Will
> 
> .
>

Re: [PATCH] kasan: add __asan_report_loadN/storeN_noabort callbacks

2018-01-23 Thread Andrey Ryabinin

On 01/19/2018 08:44 PM, Andrey Konovalov wrote:
> Instead of __asan_report_load_n_noabort and __asan_report_store_n_noabort
> callbacks Clang emits differently named __asan_report_loadN_noabort and
> __asan_report_storeN_noabort (similar to __asan_loadN/storeN_noabort, whose
> names both GCC and Clang agree on).
> 
> Add callback implementation for __asan_report_loadN/storeN_noabort.
> 

This made me wonder why this wasn't observed before. So I noticed that
inline instrumentation with -fsanitize=kernel-address is broken in clang,
and clang never calls __asan_report*() functions. I see that you guys fixed this
just yesterday https://reviews.llvm.org/D42384 .

But it seems that you didn't fix the rest of "if (CompileKernel)" crap.
Clang generates "__asan_report_[load,store]N*" instead of
"__asan_report_[load,store]_n*" only because of this idiocy:

const std::string SuffixStr = CompileKernel ? "N" : "_n";

See 
https://github.com/llvm-mirror/llvm/blob/ca19eaabd75f55865efd321b7a6f1d4ba3db8bc8/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L2250

Note that SuffixStr is used *only* for __asan_report_* callbacks, which makes
no sense because we never ever had __asan_report* callbacks with "N" suffix.

So I think that you should just fix the llvm here.

And there is probably one more "if (CompileKernel)" crap in runOnModule()
which breaks globals instrumentation.

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar


* David Woodhouse  wrote:

> On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> > 
> > BTW., the reason this is enabled on all distro kernels is because the
> > overhead is a single patched-in NOP instruction in the function epilogue,
> > when tracing is disabled. So it's not even a CALL+RET - it's a patched in
> > NOP.
> 
> Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to
> get to the point where we can patch *that* out into a NOP... or are you
> saying we already can?

Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a 
NOP today - all 50,000+ call sites on a typical distro kernel.

We did so for a long time - this is all a well established, working mechanism.

> But this is a digression. I was being pedantic about the "0 cycles" but sure, 
> this would be perfectly tolerable.

It's not a digression in two ways:

- I wanted to make it clear that for distro kernels it _is_ a zero cycles
  overhead mechanism for non-SkyLake CPUs, literally.

- I noticed that Meltdown and the CR3 writes for PTI appear to have
  established a kind of ... insensitivity and numbness to kernel micro-costs,
  which peaked with the per-syscall MSR write nonsense patch of the SkyLake
  workaround. That attitude is totally unacceptable to me as x86 maintainer
  and yes, still every cycle counts.

Thanks,

Ingo

Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd

2018-01-23 Thread Jiri Benc

On Tue, 23 Jan 2018 11:26:58 +0100, Wolfgang Bumiller wrote:
> Even if you know the netnsid, do the mentioned watches work for
> nested/child namespaces if eg. a container creates new namespace before
> and/or after the watch was established and moves interfaces to these
> child namespaces, would you just see them disappear, or can you keep
> track of them later on as well?

What do you mean by "nested namespaces"? There's no such thing for net
name spaces.

As for missing API to get netnsid of the netns the interface is moved
to, see my previous emails in this thread. This needs to be added.

> Even if that works, from what the documentation tells me netlink is an
> unreliable protocol, so if my watcher's socket buffer is full, wouldn't
> I be losing important tracking information?

Sure. But that's fundamentally unfixable independently on netlink, the
kernel needs to take an action if a program is not reading its
messages. Either some messages get dropped or the program is killed or
infinite amount of memory is consumed. This has nothing to do with uAPI
design.

> I think one possible solution to tracking interfaces would be to have a
> unique identifier that never changes (even if it's just a simple
> uint64_t incremented whenever an interface is created). But since
> they're not local to the current namespace that may require a lot of
> extra permission checks (but I'm just speculating here...).

You'll get a hard NACK from CRIU folks if you try to propose this.

> In any case, IFLA_NET_NS_FD/PID are already there and I had been
> wondering previously why they couldn't be used with RTM_GETLINK, it
> would just make sense.

Those predate netnsids and we can't get rid of them now, since they're
part of uAPI. But we can (and should) make sure we don't add more of
those.

 Jiri

[PATCH] staging: comedi: dt2811: remove redundant initialization of 'ns'

2018-01-23 Thread Colin King

From: Colin Ian King 

Variable ns is being initialized with a value that is never read, ns
is being re-assigned a new value later on. Remove the redundant
initialization.

Cleans up clang warning:
drivers/staging/comedi/drivers/dt2811.c:310:21: warning: Value stored
to 'ns' during its initialization is never read

Signed-off-by: Colin Ian King 
---
 drivers/staging/comedi/drivers/dt2811.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/dt2811.c b/drivers/staging/comedi/drivers/dt2811.c
index fea0a1baf10b..05207a519755 100644
--- a/drivers/staging/comedi/drivers/dt2811.c
+++ b/drivers/staging/comedi/drivers/dt2811.c
@@ -307,7 +307,7 @@ static int dt2811_ai_cmd(struct comedi_device *dev,
 static unsigned int dt2811_ns_to_timer(unsigned int *nanosec,
   unsigned int flags)
 {
-   unsigned long long ns = *nanosec;
+   unsigned long long ns;
unsigned int ns_lo = COMEDI_MIN_SPEED;
unsigned int ns_hi = 0;
unsigned int divisor_hi = 0;
-- 
2.15.1

Re: [PATCH v5] devres: combine function devm_ioremap*

2018-01-23 Thread Yisheng Xie



On 2018/1/23 16:42, Greg KH wrote:
> On Tue, Jan 16, 2018 at 08:03:41PM +0800, Yisheng Xie wrote:
>> When I tried to use devm_ioremap function and review related
>> code, I found devm_ioremap_* almost have the similar realize
>> with each other, which can be combined.
>>
>> In the former version, I have tried to kill ioremap_cache to
>> reduce the size of devres, which can not work for ioremap is
>> not the same as ioremap_nocache in some ARCHs likes ia64.
>> Therefore, as the suggestion of Christophe, I introduce a help
>> function __devm_ioremap, let devm_ioremap* inline and call
>> __devm_ioremap with different devm_ioremap_type.
>>
>> After apply the patch, the size of devres.o can be reduce from
>> 8216 Bytes to 7352Bytes in my compile environment.
>>
>> Suggested-by: Christophe LEROY 
>> Signed-off-by: Yisheng Xie 
>> ---
>> v2:
>>  - use MACRO for ioremap
>> v3:
>>  - kill dev_ioremap_nocache
>> v4:
>>  - combine function devm_ioremap*
>> v5:
>>  - fix code style.
>>
>>  include/linux/io.h | 61 +++
>>  lib/devres.c   | 84 
>> ++
>>  2 files changed, 70 insertions(+), 75 deletions(-)
>>
>> diff --git a/include/linux/io.h b/include/linux/io.h
>> index 32e30e8..4d0a640 100644
>> --- a/include/linux/io.h
>> +++ b/include/linux/io.h
>> @@ -73,12 +73,61 @@ static inline void devm_ioport_unmap(struct device *dev, 
>> void __iomem *addr)
>>  
>>  #define IOMEM_ERR_PTR(err) (__force void __iomem *)ERR_PTR(err)
>>  
>> -void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
>> -   resource_size_t size);
>> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t 
>> offset,
>> -   resource_size_t size);
>> -void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
>> -   resource_size_t size);
>> +enum devm_ioremap_type {
>> +DEVM_IOREMAP = 0,
>> +DEVM_IOREMAP_NC,
>> +DEVM_IOREMAP_WC,
>> +};
> 
> Why do these types need to be in a public .h file?
> 
> Why not just keep the .h file as-is and then just put the cleanup in the
> .c file like you did?
> 
Right. I was just trying to inline these functions. Anyway, I
will follow your suggestion. Sorry for sending so many versions.

Thanks
Yisheng

> thanks,
> 
> greg k-h
> 
> .
>
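
The suggestion above (keep the header as-is and do the cleanup only in the
.c file) can be illustrated with a small user-space sketch. This is not the
devres code: every mock_* name is hypothetical, and the "mapping" functions
just return a distinct tag so the dispatch is visible. The point is only
that the type enum and the shared helper stay private, while the public
entry points keep their own names.

```c
/* Private to the (hypothetical) .c file; nothing here would go in a header. */
enum mock_ioremap_type {
	MOCK_IOREMAP = 0,
	MOCK_IOREMAP_NC,
	MOCK_IOREMAP_WC,
};

/* Mock mapping primitives: each returns a distinct tag instead of a mapping. */
static int mock_ioremap(void)         { return 1; }
static int mock_ioremap_nocache(void) { return 2; }
static int mock_ioremap_wc(void)      { return 3; }

/* Single shared helper; the switch is the only per-type code. */
static int mock_devm_ioremap_common(enum mock_ioremap_type type)
{
	switch (type) {
	case MOCK_IOREMAP:
		return mock_ioremap();
	case MOCK_IOREMAP_NC:
		return mock_ioremap_nocache();
	case MOCK_IOREMAP_WC:
		return mock_ioremap_wc();
	}
	return -1; /* unknown type */
}

/* Public wrappers keep their own names, so callers are unaffected. */
int mock_devm_ioremap(void)         { return mock_devm_ioremap_common(MOCK_IOREMAP); }
int mock_devm_ioremap_nocache(void) { return mock_devm_ioremap_common(MOCK_IOREMAP_NC); }
int mock_devm_ioremap_wc(void)      { return mock_devm_ioremap_common(MOCK_IOREMAP_WC); }
```

With this shape the wrappers are ordinary out-of-line functions, so the
enum never has to be exported to users of the header.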

Re: [PATCH 1/4] dmaengine: qcom: bam_dma: make bam clk optional

2018-01-23 Thread Vinod Koul

On Mon, Jan 22, 2018 at 09:55:01AM +, Srinivas Kandagatla wrote:

> >>@@ -1180,13 +1180,14 @@ static int bam_dma_probe(struct platform_device 
> >>*pdev)
> >>"qcom,controlled-remotely");
> >>bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
> >
> >but you still do clk_get unconditionally?
> 
> Only reason to do this way is to not break existing users in the mainline.
> 
> remotely controlled BAM is already supported in upstream driver, there are
> users of this who pass clk from device tree, If I make this conditional then
> subsequent reads to the BAM registers for those instances might crash the
> system.

But these instances are remote controlled, so if we stop representing them
in Linux, why would we read them?

-- 
~Vinod

Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator

2018-01-23 Thread Mylene Josserand

Hello Lothar,

Le Tue, 23 Jan 2018 09:04:14 +0100,
Lothar Waßmann  a écrit :

> Hi,
> 
> On Mon, 22 Jan 2018 09:42:08 -0800 Dmitry Torokhov wrote:
> > Hi Mylène,
> > 
> > On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand
> >  wrote:  
> > > Add the support of regulator to use it as VCC source.
> > >
> > > Signed-off-by: Mylène Josserand 
> > > ---
> > >  .../bindings/input/touchscreen/edt-ft5x06.txt  |  1 +
> > >  drivers/input/touchscreen/edt-ft5x06.c | 33 
> > > ++
> > >  2 files changed, 34 insertions(+)
> > >
> > > > diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > index 025cf8c9324a..48e975b9c1aa 100644
> > > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > @@ -30,6 +30,7 @@ Required properties:
> > >  Optional properties:
> > >   - reset-gpios: GPIO specification for the RESET input
> > >   - wake-gpios:  GPIO specification for the WAKE input
> > > + - vcc-supply:  Regulator that supplies the touchscreen
> > >
> > >   - pinctrl-names: should be "default"
> > >   - pinctrl-0:   a phandle pointing to the pin settings for the
> > > > diff --git a/drivers/input/touchscreen/edt-ft5x06.c b/drivers/input/touchscreen/edt-ft5x06.c
> > > index c53a3d7239e7..5ee14a25a382 100644
> > > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > > @@ -39,6 +39,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #define WORK_REGISTER_THRESHOLD0x00
> > >  #define WORK_REGISTER_REPORT_RATE  0x08
> > > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> > > struct touchscreen_properties prop;
> > > u16 num_x;
> > > u16 num_y;
> > > +   struct regulator *vcc;
> > >
> > > struct gpio_desc *reset_gpio;
> > > struct gpio_desc *wake_gpio;
> > > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client 
> > > *client,
> > >
> > > tsdata->max_support_points = chip_data->max_support_points;
> > >
> > > > +   tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > > > +   if (IS_ERR(tsdata->vcc)) {
> > > > +   error = PTR_ERR(tsdata->vcc);
> > > > +   dev_err(&client->dev, "failed to request regulator: %d\n",
> > > > +   error);  
> >  
> I would check for -EPROBE_DEFER here and omit the error message in this
> case.
> 
> 
> Lothar Waßmann

Sure, I will add this case, thank you for the review.

Best regards,

-- 
Mylène Josserand, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
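
The review comment above (stay quiet when the regulator lookup returns
-EPROBE_DEFER) boils down to a one-line decision. A minimal user-space
sketch follows, under stated assumptions: should_log_probe_error() is a
hypothetical helper, not something from the driver, and EPROBE_DEFER is
redefined locally because it is a kernel-internal errno value that
user-space <errno.h> does not provide.

```c
#include <stdbool.h>

/*
 * Kernel-internal errno value (include/linux/errno.h); redefined here only
 * so the sketch builds outside the kernel.
 */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper: a deferred probe is retried later and is not a real
 * failure, so only other errors are worth an error message.
 */
static bool should_log_probe_error(int error)
{
	return error != -EPROBE_DEFER;
}
```

In the probe path this would gate the dev_err() call, while the error code
itself is still returned in both cases.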

Re: [PATCH v1 1/4] seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro

2018-01-23 Thread Lee Jones

On Mon, 22 Jan 2018, Andy Shevchenko wrote:

> The DEFINE_SHOW_ATTRIBUTE() helper macro would be useful for current
> users, which are many of them, and for new comers to decrease code
> duplication.
> 
> Signed-off-by: Andy Shevchenko 
> ---
>  drivers/mfd/ab8500-debugfs.c| 14 --

Acked-by: Lee Jones 

>  drivers/platform/x86/pmc_atom.c | 14 --
>  include/linux/seq_file.h| 14 ++
>  net/bluetooth/hci_debugfs.c | 13 -
>  4 files changed, 14 insertions(+), 41 deletions(-)

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH 1/4] dmaengine: qcom: bam_dma: make bam clk optional

2018-01-23 Thread Srinivas Kandagatla




On 23/01/18 09:19, Vinod Koul wrote:
> On Mon, Jan 22, 2018 at 09:55:01AM +, Srinivas Kandagatla wrote:
>
> > >> @@ -1180,13 +1180,14 @@ static int bam_dma_probe(struct platform_device *pdev)
> > >> "qcom,controlled-remotely");
> > >> bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
> > >
> > > but you still do clk_get unconditionally?
> >
> > Only reason to do this way is to not break existing users in the mainline.
> >
> > remotely controlled BAM is already supported in upstream driver, there are
> > users of this who pass clk from device tree, If I make this conditional then
> > subsequent reads to the BAM registers for those instances might crash the
> > system.
>
> But these instances are remote controlled, so if we stop representing them
> in Linux, why would we read them?


Plan is that we would transition those users once we get these 
bindings/changes in. Currently I don't have access to any of those 
devices so I made the changes safe, such that it does not break devices 
on mainline.


--srini

Re: [PATCH] PCI: qcom: add missing supplies required for msm8996

2018-01-23 Thread Stanimir Varbanov

Hey Srini,

As there are no comments I'd propose to change the endpoint supplies to
more generic names.

On 12/08/2017 11:20 AM, srinivas.kandaga...@linaro.org wrote:
> From: Srinivas Kandagatla 
> 
> This patch adds supplies that are required for msm8996. Two of them vdda
> and vdda-1p8 are analog supplies that go in to controller, and the rest
> of the two vddpe's are supplies to PCIe endpoints.
> 
> Without these supplies PCIe endpoints which require power supplies are
> not enumerated at all, as there is no one to power it up.
> 
> Signed-off-by: Srinivas Kandagatla 
> ---
>  .../devicetree/bindings/pci/qcom,pcie.txt  | 16 +
>  drivers/pci/dwc/pcie-qcom.c| 28 
> --
>  2 files changed, 42 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.txt b/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> index 3c9d321b3d3b..045102cb3e12 100644
> --- a/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> +++ b/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> @@ -179,6 +179,11 @@
>   Value type: 
>   Definition: A phandle to the core analog power supply
>  
> +- vdda-1p8-supply:
> + Usage: required for msm8996
> + Value type: 
> + Definition: A phandle to the 1.8v analog power supply
> +

This should be dropped, because it is part of the phy.

>  - vdda_phy-supply:
>   Usage: required for ipq/apq8064
>   Value type: 
> @@ -189,6 +194,15 @@
>   Value type: 
>   Definition: A phandle to the analog power supply for IC which generates
>   reference clock
> +- vddpe-supply:
> + Usage: optional
> + Value type: 
> + Definition: A phandle to the PCIe endpoint power supply

vddpe_3v3-supply

> +
> +- vddpe1-supply:
> + Usage: optional
> + Value type: 
> + Definition: A phandle to the PCIe endpoint power supply 1

vddpe_1v5-supply

>  
>  - phys:
>   Usage: required for apq8084
> @@ -205,6 +219,8 @@
>   Value type: 
>   Definition: List of phandle and GPIO specifier pairs. Should contain
>   - "perst-gpios" PCIe endpoint reset signal line
> + - "pe_en-gpios" PCIe endpoint enable signal line
> + - "pe_en1-gpios" PCIe endpoint enable1 signal line

We don't need those gpios, the regulator driver will manipulate these
gpios when we call regulator_enable/disable.


-- 
regards,
Stan

[PATCH net 2/2] vhost: do not try to access device IOTLB when not initialized

2018-01-23 Thread Jason Wang

The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized, which may lead
to a NULL pointer dereference. Fix this by checking dev->iotlb first.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 549771a..5727b18 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1015,6 +1015,10 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
vhost_iotlb_notify_vq(dev, msg);
break;
case VHOST_IOTLB_INVALIDATE:
+   if (!dev->iotlb) {
+   ret = -EFAULT;
+   break;
+   }
vhost_vq_meta_reset(dev);
vhost_del_umem_range(dev->iotlb, msg->iova,
 msg->iova + msg->size - 1);
-- 
2.7.4
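
The guard added by the patch above can be modelled outside the kernel as a
plain NULL check before the first use of the pointer. The sketch below is
only an illustration: struct mock_iotlb, struct mock_dev and
mock_iotlb_invalidate() are hypothetical stand-ins, not the vhost
definitions, and clearing a counter stands in for the real range deletion.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vhost structures; not the kernel types. */
struct mock_iotlb {
	int entries;
};

struct mock_dev {
	struct mock_iotlb *iotlb; /* NULL until the device IOTLB is set up */
};

/*
 * Model of the guarded invalidate path: reject the request with -EFAULT
 * when the IOTLB was never initialized, otherwise drop its entries.
 */
static int mock_iotlb_invalidate(struct mock_dev *dev)
{
	if (!dev->iotlb)
		return -EFAULT;

	dev->iotlb->entries = 0; /* stand-in for deleting the umem range */
	return 0;
}
```

The check has to come before any helper that touches the IOTLB, which is
why the patch places it at the top of the case rather than next to the
delete call.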

Re: [PATCH] ACPI / tables: Add IORT to injectable table list

2018-01-23 Thread Yang, Shunyong

Hi, All

Sorry, please ignore this patch. Please help to review v2.
https://patchwork.kernel.org/patch/10179761/

Thanks
Shunyong

On Tue, 2018-01-23 at 16:06 +0800, Yang Shunyong wrote:
> This patch adds ACPI_SIG_PPTT to the table, which enables IORT from
> initrd to override which from firmware.
> 
> Signed-off-by: Yang Shunyong 
> Cc: yutang2.ji...@hxt-semitech.com
> Cc: yu.zh...@hxt-semitech.com
> ---
>  drivers/acpi/tables.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> index 80ce2a7d224b..7bcb66f3 100644
> --- a/drivers/acpi/tables.c
> +++ b/drivers/acpi/tables.c
> @@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer,
> u32 length)
>   ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
>   ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
>   ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
> - ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
> + ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
> + NULL };
>  
>  #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
>
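
For readers unfamiliar with how such a list is consulted: a NULL-terminated
array of four-character signatures is typically scanned with a fixed-length
compare. The following user-space sketch shows that lookup under stated
assumptions; mock_table_sigs is a shortened, illustrative list and
sig_is_injectable() is a hypothetical helper, neither is copied from
drivers/acpi/tables.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIG_LEN 4 /* ACPI table signatures are four ASCII characters */

/* Shortened, illustrative list; terminated by NULL like the one in the patch. */
static const char * const mock_table_sigs[] = {
	"DSDT", "SSDT", "IORT", NULL
};

/* Return true when the four-byte signature appears in the list. */
static bool sig_is_injectable(const char *sig)
{
	const char * const *s;

	for (s = mock_table_sigs; *s; s++)
		if (!memcmp(sig, *s, SIG_LEN))
			return true;
	return false;
}
```

A table whose signature is missing from the list is simply not matched,
which is why adding one entry is enough to allow the override.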

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread Ingo Molnar


* Ingo Molnar  wrote:

> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could
> run? Is there a description of the exact speculative execution vulnerability
> that has to be addressed to begin with?

Ok, so for now I'm assuming that this is the 16 entries return-stack-buffer
underflow condition where SkyLake falls back to the branch predictor (while
other CPUs wrap the buffer).

> If this approach is workable I'd much prefer it to any MSR writes in the
> syscall entry path not just because it's fast enough in practice to not be
> turned off by everyone, but also because everyone would agree that per
> function call overhead needs to go away on new CPUs. Both deployment and
> backporting is also _much_ more flexible, simpler, faster and more complete
> than microcode/firmware or compiler based solutions.
> 
> Assuming the vulnerability can be addressed via this route that is, which is
> a big assumption!

So I talked this over with PeterZ, and I think it's all doable:

 - the CALL __fentry__ callbacks maintain the depth tracking (on the kernel
   stack, fast to access), and issue an "RSB-stuffing sequence" when depth
   reaches 16 entries.

 - "the RSB-stuffing sequence" is a return trampoline that pushes a CALL on the 
   stack which is executed on the RET.

 - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. (The
   tracking could probably be made IRQ and maybe even NMI safe, but the
   worst-case nesting scenarios make my head ache.)

I.e. IBRS can be mostly replaced with a kernel based solution that is better
than IBRS and which does not negatively impact any other non-SkyLake CPUs or
general code quality.

I.e. a full upstream Spectre solution.

Thanks,

Ingo
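
The depth-tracking idea in the list above can be modelled in plain
user-space C. This is a rough sketch, not kernel code: RSB_DEPTH,
struct depth_tracker and track_call() are hypothetical names, the model
only counts calls and ignores returns entirely, and bumping stuff_count
stands in for emitting a real RSB-stuffing sequence.

```c
#include <stdbool.h>

/* Assumed return-stack-buffer depth, taken from the 16-entry figure above. */
#define RSB_DEPTH 16

/* Hypothetical per-context state (the mail suggests keeping it on the kernel stack). */
struct depth_tracker {
	unsigned int depth;       /* calls seen since the last stuffing */
	unsigned int stuff_count; /* how many times stuffing was requested */
};

/*
 * Model of what a CALL-site callback would do: count the call and, once
 * the depth reaches RSB_DEPTH, request a stuffing sequence and start
 * counting again. Returns true when stuffing was requested.
 */
static bool track_call(struct depth_tracker *t)
{
	if (++t->depth < RSB_DEPTH)
		return false;

	t->stuff_count++; /* stand-in for the real RSB-stuffing sequence */
	t->depth = 0;
	return true;
}
```

In this model the common case is one increment and one compare per call,
and the stuffing path only runs on every sixteenth call.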

Re: [PATCH 2/3] power: supply: add cros-ec USB PD charger driver.

2018-01-23 Thread Lee Jones

On Wed, 17 Jan 2018, Enric Balletbo i Serra wrote:

> From: Sameer Nanda 
> 
> This driver gets various bits of information about what is connected to
> USB PD ports from the EC and converts that into power_supply properties.
> 
> Signed-off-by: Sameer Nanda 
> Signed-off-by: Enric Balletbo i Serra 
> ---
>  drivers/power/supply/Kconfig  |  11 +
>  drivers/power/supply/Makefile |   1 +
>  drivers/power/supply/cros_usbpd-charger.c | 953 
> ++

>  include/linux/mfd/cros_ec.h   |   3 +

Acked-by: Lee Jones 

>  4 files changed, 968 insertions(+)
>  create mode 100644 drivers/power/supply/cros_usbpd-charger.c

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon

2018-01-23 Thread Karol Herbst

Reviewed-by: Karol Herbst 

On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt  wrote:
> The trailing semicolon is an empty statement that does no operation.
> Removing it since it doesn't do anything.
>
> Signed-off-by: Luis de Bethencourt 
> ---
>
> Hi,
>
> After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches
> suggested I fix it treewide [0].
>
> Best regards
> Luis
>
>
> [0] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html
> [1] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> index e35d3e17cd7c..93946dcee319 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 size, bool ref)
> else
> block = (size >> page[i].shift) << page[i].shift;
> } else {
> -   block = (size >> page[i].shift) << page[i].shift;;
> +   block = (size >> page[i].shift) << page[i].shift;
> }
>
> /* Perform operation. */
> --
> 2.15.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau

Re: [PATCH] drm/bridge/synopsys: dsi: Adopt SPDX identifiers

2018-01-23 Thread Philippe CORNU

Hi Laurent,

A big *thank you* for your review.

On 01/23/2018 12:30 AM, Laurent Pinchart wrote:
> Hi Philippe,
> 
> Thank you for the patch.
> 
> On Monday, 22 January 2018 12:26:08 EET Philippe Cornu wrote:
>> Add SPDX identifiers to the Synopsys DesignWare MIPI DSI
>> host controller driver.
>>
>> Signed-off-by: Philippe Cornu 
>> ---
>>   drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 6 +-
>>   1 file changed, 1 insertion(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> index 46b0e73404d1..e06836dec77c 100644
>> --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> @@ -1,12 +1,8 @@
>> +// SPDX-License-Identifier: GPL-2.0
> 
> According to Documentation/process/license-rules.txt this would change the
> existing license. The correct identifier is GPL-2.0+.
> 

You are right, I did not put the correct identifier :(

After reading more on spdx.org, I wonder if the correct value should be
GPL-2.0-or-later instead of GPL-2.0+.

https://spdx.org/licenses/GPL-2.0-or-later.html
https://spdx.org/licenses/GPL-2.0+.html

What is your opinion?

Many thanks,
Philippe :-)

>>   /*
>>* Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd
>>* Copyright (C) STMicroelectronics SA 2017
>>*
>> - * This program is free software; you can redistribute it and/or modify
>> - * it under the terms of the GNU General Public License as published by
>> - * the Free Software Foundation; either version 2 of the License, or
>> - * (at your option) any later version.
>> - *
>>* Modified by Philippe Cornu 
>>* This generic Synopsys DesignWare MIPI DSI host driver is based on the
>>* Rockchip version from rockchip/dw-mipi-dsi.c with phy & bridge APIs.
> 
>

Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd

2018-01-23 Thread Wolfgang Bumiller

On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote:
> On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
> > This is not necessarily true in scenarios where I move a network device
> > via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
> > created. Here is an example:
> > 
> > nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
> > nlmsghdr->nlmsg_type = RTM_NEWLINK;
> > /* move to network namespace of pid */
> > nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
> > /* give interface new name */
> > nla_put_string(nlmsg, IFLA_IFNAME, ifname)
> > 
> > The only thing I have is the pid that identifies the network namespace.
> 
> How do you know the interface did not get renamed in the new netns?
> 
> This is racy and won't work reliably. You really need to know the
> netnsid before moving the interface to the netns to be able to do
> meaningful queries.

Even if you know the netnsid, do the mentioned watches work for
nested/child namespaces? If, e.g., a container creates a new namespace
before and/or after the watch was established and moves interfaces to
these child namespaces, would you just see them disappear, or can you
keep track of them later on as well?

Even if that works, from what the documentation tells me netlink is an
unreliable protocol, so if my watcher's socket buffer is full, wouldn't
I be losing important tracking information?

I think one possible solution to tracking interfaces would be to have a
unique identifier that never changes (even if it's just a simple
uint64_t incremented whenever an interface is created). But since
they're not local to the current namespace that may require a lot of
extra permission checks (but I'm just speculating here...).

In any case, IFLA_NET_NS_FD/PID are already there and I had been
wondering previously why they couldn't be used with RTM_GETLINK, it
would just make sense.
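
A minimal userspace sketch of the "identifier that never changes" idea from the message above. Everything here (struct toy_iface, toy_next_uid, the helper names) is hypothetical and only models the bookkeeping; it is not kernel code and ignores the permission checks mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an interface; not a kernel structure. */
struct toy_iface {
	uint64_t uid;   /* assigned once at creation, never changes */
	char name[16];  /* may change on rename */
	int netns;      /* may change when the interface is moved */
};

/* Simple counter incremented whenever an interface is created. */
static uint64_t toy_next_uid = 1;

/* Create an interface and hand out the next identifier. */
static void toy_iface_create(struct toy_iface *ifc, const char *name, int netns)
{
	ifc->uid = toy_next_uid++;
	strncpy(ifc->name, name, sizeof(ifc->name) - 1);
	ifc->name[sizeof(ifc->name) - 1] = '\0';
	ifc->netns = netns;
}

/* Move and rename in one step; the identifier is left untouched. */
static void toy_iface_move(struct toy_iface *ifc, int netns, const char *new_name)
{
	ifc->netns = netns;
	strncpy(ifc->name, new_name, sizeof(ifc->name) - 1);
	ifc->name[sizeof(ifc->name) - 1] = '\0';
}
```

In this model a watcher that remembers uid can still recognise the interface after it has been renamed and moved, which is the property the proposal is after.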

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-23 Thread David Woodhouse

On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> 
> BTW., the reason this is enabled on all distro kernels is because the overhead
> is a single patched-in NOP instruction in the function epilogue, when tracing
> is disabled. So it's not even a CALL+RET - it's a patched in NOP.

Hm? We still have GCC emitting 'call __fentry__' don't we? Would be
nice to get to the point where we can patch *that* out into a NOP... or
are you saying we already can?

But this is a digression. I was being pedantic about the "0 cycles" but
sure, this would be perfectly tolerable.


Re: [PATCH] PCI: qcom: add missing supplies required for msm8996

2018-01-23 Thread Srinivas Kandagatla




On 23/01/18 10:14, Stanimir Varbanov wrote:

Hi,

On 01/23/2018 11:46 AM, Srinivas Kandagatla wrote:



On 23/01/18 09:23, Stanimir Varbanov wrote:

Hey Srini,

As there are no comments I'd propose to change the endpoint supplies to
more generic names.


Sure, I will respin this with your suggestions, except the 3v3 and 1v5
suffix due to the reasons below:

+- vdda-1p8-supply:
+Usage: required for msm8996
+Value type: 
+Definition: A phandle to the 1.8v analog power supply
+


This should be dropped, because it is part of the phy.

Yep.




   - vdda_phy-supply:
   Usage: required for ipq/apq8064
   Value type: 
@@ -189,6 +194,15 @@
   Value type: 
   Definition: A phandle to the analog power supply for IC which
generates
   reference clock
+- vddpe-supply:
+Usage: optional
+Value type: 
+Definition: A phandle to the PCIe endpoint power supply


vddpe_3v3-supply

Why do we need a suffix here? AFAIU, it does not add any value; instead, it
would confuse users.


vddpe and vddpe1 are already confusing as well.


I partly agree with you.

How would you represent the case where there are two 3v3 power supplies for the
endpoint?




Let's imagine that powering up endpointX needs some specific sequence
between 3v3 and 1v5, and endpointY (which could be connected on the same
PCIe lane) has a different power sequence. How would we handle that in the
qcom pcie host driver?


Power sequencing is altogether a different issue that is not
addressed in this patch. I am hoping that this will be fixed as part of
making the pwrseq interface more generic. Not sure where that ended up!


--srini






These are power supplies for endpoint which could be of any voltage. In


I don't think they could be any value; see the PCIe Mini Card
Electromechanical Specification. The connector provides 3v3
and 1v5.


this case both endpoint supplies are 3v3, these could be 1.8 or 5v or
12v in some other cases.


If we see hw designs with 5v and 12v we could extend the binding and the
driver with support for them. I want to be exact in the names and
voltages in the driver and bindings.

Re: unixbench context switch perfomance & cpu topology

2018-01-23 Thread Wanpeng Li

2018-01-22 21:37 GMT+08:00 Mike Galbraith :
> On Mon, 2018-01-22 at 20:27 +0800, Wanpeng Li wrote:
>> 2018-01-22 20:08 GMT+08:00 Mike Galbraith :
>> > On Mon, 2018-01-22 at 19:47 +0800, Wanpeng Li wrote:
>> >> Hi all,
>> >>
>> >> We can observe unixbench context switch performance is heavily
>> >> influenced by cpu topology which is exposed to the guest. the score is
>> >> posted below, bigger is better, both the guest and the host kernel are
>> >> 3.15-rc3(we can also reproduce against centos 7.4 693 guest/host), LLC
>> >> is exposed to the guest, kvm adaptive halt-polling is default enabled,
>> >> then start a guest w/ 8 logical cpus.
>> >>
>> >>
>> >>
>> >> unixbench context switch
>> >> -smp 8, sockets=8, cores=1, threads=1    382036
>> >> -smp 8, sockets=4, cores=2, threads=1    132480
>> >> -smp 8, sockets=2, cores=4, threads=1    128032
>> >> -smp 8, sockets=2, cores=2, threads=2    131767
>> >> -smp 8, sockets=1, cores=4, threads=2    132742
>> >> -smp 8, sockets=1, cores=4, threads=2 (guest w/ nohz=off idle=poll)    331471
>> >>
>> >> I can observe there are a lot of reschedule IPIs sent from one vCPU to
>> >> another vCPU, the context switch workload switches between running and
>> >> idle frequently which results in HLT instruction in the idle path, I
>> >> use idle=poll to avoid vmexit due to HLT and to avoid reschedule IPIs
>> >> since idle task checks TIF_NEED_RESCHED flags in a loop, nohz=off can
>> >> stop to program lapic timer/other nohz stuffs. Any idea why sockets=8
>> >> can get best performance?
>> >
>> > Probably because with that topology, there is no shared llc, thus no
>> > cross-core scheduling, micro-benchmark waker/wakee are stacked.  If
>> > your benchmark does nothing but schedule, stacking makes beautiful (but
>> > utterly meaningless) numbers.
>>
>> The waker and wakee are only sporadically on the same logical cpu in the
>> guest (-smp 8, sockets=8, cores=1, threads=1) during the testing. In
>> addition, binding the waker/wakee to one logical cpu in the guest (-smp
>> 8, sockets=1, cores=4, threads=2) can also achieve performance as good
>> as the 8-socket setup.
>
> Here, with tip.today and that topology, context1 does stack up on one core.
>
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ P COMMAND
>  4218 root      20   0    4048    808    732 R 52.16 0.022   0:12.77 4 context1
>  4219 root      20   0    4048     80      0 S 47.18 0.002   0:11.96 4 context1
>
> There's a bit of bouncing, but the two stack right back up.  But
> whatever, what Peter said, the benchmark should pin itself to do this.

Thanks for having a try, Mike. :) Actually, the two context1 tasks
don't stack up on one logical cpu most of the time, as observed with
kernelshark. Do you have any idea why there are 4.5 times as many
RESCHED IPIs, as mentioned in another reply in this thread?

Regards,
Wanpeng Li

[PATCH 2/2] x86/microcode: Fix again accessing initrd after having been freed

2018-01-23 Thread Borislav Petkov

From: Borislav Petkov 

Commit

  24c2503255d3 ("x86/microcode: Do not access the initrd after it has been freed")

fixed attempts to access initrd from the microcode loader after it has
been freed. However, a similar KASAN warning was reported (stack trace
edited):

  smpboot: Booting Node 0 Processor 1 APIC 0x11
  ==
  BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
  Read of size 1 at addr 880035ffd000 by task swapper/1/0

  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7
  Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016
  Call Trace:
   dump_stack
   print_address_description
   kasan_report
   ? find_cpio_data
   __asan_report_load1_noabort
   find_cpio_data
   find_microcode_in_initrd
   __load_ucode_amd
   load_ucode_amd_ap
  load_ucode_ap

After some investigation, it turned out that a merge was done using the
wrong side to resolve, leading to picking up the previous state, before
the 24c2503255d3 fix. Therefore the Fixes tag below contains a merge
commit.

Revert the mismerge by catching the save_microcode_in_initrd_amd()
retval and thus letting the function exit with the last return statement
so that initrd_gone can be set to true.

Reported-by: 
Cc:  # 4.11
Fixes: f26483eaedec ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295
Signed-off-by: Borislav Petkov 
---
 arch/x86/kernel/cpu/microcode/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c 
b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a85d4cb..e4fc595cd6ea 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -239,7 +239,7 @@ static int __init save_microcode_in_initrd(void)
break;
case X86_VENDOR_AMD:
if (c->x86 >= 0x10)
-   return save_microcode_in_initrd_amd(cpuid_eax(1));
+   ret = save_microcode_in_initrd_amd(cpuid_eax(1));
break;
default:
break;
-- 
2.13.0
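
A minimal userspace sketch of the control-flow difference the patch above describes: an early return skips the statement that marks the initrd as gone, while storing the result and falling through does not. The function names and the stub are hypothetical stand-ins; only the initrd_gone flag name is taken from the commit message.

```c
#include <stdbool.h>

/* Stand-in for the flag mentioned in the commit message. */
static bool initrd_gone;

/* Hypothetical stub in place of save_microcode_in_initrd_amd(). */
static int save_amd_stub(void)
{
	return 0;
}

/* Mismerged shape: the early return skips the bookkeeping below. */
static int save_in_initrd_early_return(bool is_amd)
{
	if (is_amd)
		return save_amd_stub();

	initrd_gone = true;
	return 0;
}

/* Fixed shape: keep the result in ret and leave via the last return. */
static int save_in_initrd_fall_through(bool is_amd)
{
	int ret = 0;

	if (is_amd)
		ret = save_amd_stub();

	initrd_gone = true;
	return ret;
}
```

With the first variant the flag stays false on the AMD path, so later code would still believe the initrd is accessible; the second variant always reaches the assignment.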

[PATCH 1/2] x86/microcode/intel: Extend BDW late-loading further with LLC size check

2018-01-23 Thread Borislav Petkov

From: Jia Zhang 

The commit

  b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check")

reduced the impact of erratum BDF90 for Broadwell model 79.

The impact can be reduced further by checking the size of the last level
cache portion per core.

Tony: "The erratum says the problem only occurs on the large-cache SKUs.
So we only need to avoid the update if we are on a big cache SKU that is
also running old microcode."

For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.

Signed-off-by: Jia Zhang 
Acked-by: Tony Luck 
Cc: "h...@hmh.eng.br" 
Cc: x86-ml 
Cc:  # v4.14
Link: http://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang@linux.alibaba.com
Signed-off-by: Borislav Petkov 
---
 arch/x86/kernel/cpu/microcode/intel.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c 
b/arch/x86/kernel/cpu/microcode/intel.c
index d9e460fc7a3b..f7c55b0e753a 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -45,6 +45,9 @@ static const char ucode_path[] = 
"kernel/x86/microcode/GenuineIntel.bin";
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *intel_ucode_patch;
 
+/* last level cache size per core */
+static int llc_size_per_core;
+
 static inline bool cpu_signatures_match(unsigned int s1, unsigned int p1,
unsigned int s2, unsigned int p2)
 {
@@ -912,12 +915,14 @@ static bool is_blacklisted(unsigned int cpu)
 
/*
 * Late loading on model 79 with microcode revision less than 0x0b000021
-* may result in a system hang. This behavior is documented in item
-* BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family).
+* and LLC size per core bigger than 2.5MB may result in a system hang.
+* This behavior is documented in item BDF90, #334165 (Intel Xeon
+* Processor E7-8800/4800 v4 Product Family).
 */
if (c->x86 == 6 &&
c->x86_model == INTEL_FAM6_BROADWELL_X &&
c->x86_mask == 0x01 &&
+   llc_size_per_core > 2621440 &&
c->microcode < 0x0b000021) {
pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
@@ -975,6 +980,15 @@ static struct microcode_ops microcode_intel_ops = {
.apply_microcode  = apply_microcode_intel,
 };
 
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
+{
+   u64 llc_size = c->x86_cache_size * 1024;
+
+   do_div(llc_size, c->x86_max_cores);
+
+   return (int)llc_size;
+}
+
 struct microcode_ops * __init init_intel_microcode(void)
 {
struct cpuinfo_x86 *c = &boot_cpu_data;
@@ -985,5 +999,7 @@ struct microcode_ops * __init init_intel_microcode(void)
return NULL;
}
 
+   llc_size_per_core = calc_llc_size_per_core(c);
+
return &microcode_intel_ops;
 }
-- 
2.13.0
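
The per-core cache arithmetic in the patch above can be sketched in userspace as follows. This assumes, as the patch does, that the cache size is given in KB; plain 64-bit division stands in for the kernel's do_div(), and the function names and sample numbers are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 2.5 MB expressed in bytes: 2.5 * 1024 * 1024 = 2621440. */
#define LLC_THRESHOLD_BYTES 2621440

/* cache_size_kb plays the role of x86_cache_size, cores of x86_max_cores. */
static int llc_size_per_core(unsigned int cache_size_kb, unsigned int cores)
{
	uint64_t llc_size = (uint64_t)cache_size_kb * 1024;

	assert(cores > 0); /* guard against division by zero */
	return (int)(llc_size / cores);
}

/* True only when the per-core share is strictly above 2.5 MB. */
static bool llc_over_threshold(unsigned int cache_size_kb, unsigned int cores)
{
	return llc_size_per_core(cache_size_kb, cores) > LLC_THRESHOLD_BYTES;
}
```

For example, a hypothetical 61440 KB cache shared by 24 cores comes to exactly 2621440 bytes per core and is not over the threshold, while the same cache shared by 10 cores comes to 6291456 bytes per core and is.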

Re: [RFC] Add ability to multiplex SPI bus

2018-01-23 Thread Mark Brown

On Mon, Jan 22, 2018 at 10:51:11PM +0000, Ben Whitten wrote:
> A chip that I am working on acts as an SPI multiplexer for downstream radios,
> this patch adds basic support for adding an SPI mux with DT.

Please don't send cover letters for single patches, if there is anything
that needs saying put it in the changelog of the patch or after the ---
if it's administrative stuff.  This reduces mail volume and ensures that 
any important information is recorded in the changelog rather than being
lost. 


Re: [PATCH 05/12] arm64: dts: mt7622: add PMIC MT6380 related nodes

2018-01-23 Thread Philippe Ombredanne

Sean,
Sorry for the late reply, and thank you for this research.

On Fri, Jan 12, 2018 at 4:33 AM, Sean Wang  wrote:
> Currently, I'm really confused about what usage STYLE of SPDX license
> identifier I should use for each type of file.
>
> could you point me where I can find the related document describing SPDX
> usage style for those files expected by the community in the future?

The doc is in this patchset [1]

[1] https://lkml.org/lkml/2017/12/28/326


> I found more than one STYLE of SPDX identifier in the current code, for
> example as below. If there is no absolute definition for them, then
> which way is better?
> 1)
> for *.dts, applied with "// " at head or within " /* */ " not at head
> such as
>
> arch/arm/boot/dts/bcm953012hr.dts:2: *  SPDX-License-Identifier:
> BSD-3-Clause

This is a "style bug". The comment style for .dts should be //

> 2)
> for *.c, applied with "// " at head or within " /* */ " not at head
> such as
> drivers/soc/xilinx/zynqmp/pm.c:10: * SPDX-License-Identifier:   GPL-2.0+

This is a "style bug". The comment style for .c should be //

> 3)
> for *.h, applied with "// " at head or within " /* */ " at head
> such as
> drivers/usb/dwc3/gadget.h:1:// SPDX-License-Identifier: GPL-2.0

This is a "style bug". The comment style for .h should be /* */

> 4)
> no issue, Makefile, or Kconfig, definitely applied with "# " at head

That's the correct way.

So the net-net is that these "style bugs" should be fixed.

-- 
Cordially
Philippe Ombredanne

Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2018-01-23 Thread Konstantin Khlebnikov


On 22.01.2018 23:51, Andy Lutomirski wrote:

On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov
 wrote:

On 08.10.2017 12:16, Christoph Hellwig wrote:


This looks fine in general, but a few comments:

   - can you split adding the new function from switching over the fork
 code

ok




   - at least kasan and vmalloc_user/vmalloc_32_user use very similar
 patterns, can you switch them over as well?



I don't see why VM_USERMAP cannot be set right at allocation.

I'll add a vm_flags argument to __vmalloc_node() and
pass VM_USERMAP from vmalloc_user/vmalloc_32_user
in a separate patch.

KASAN is different: it allocates a shadow area for the area allocated for a module.
The pointer to the module area must be passed from module_alloc().
This isn't worth optimizing.


   - the new __alloc_vm_area looks very different from alloc_vm_area,
 maybe it needs a better name?  vmalloc_range_area for example?



__vmalloc_area() is vacant; this is the most low-level function, so I'll keep the "__".


   - when you split an existing function please keep the more low-level
 function on top of the higher level one that calls it.

ok


Did this ever get re-sent?



It seems not. It was probably lost in a race condition with my vacation.
Will do.

Re: [PATCH v7 2/2] mfd: syscon: Add hardware spinlock support

2018-01-23 Thread Lee Jones

On Tue, 23 Jan 2018, Baolin Wang wrote:

> Hi Lee,
> 
> On 22 January 2018 at 21:43, Lee Jones  wrote:
> > On Thu, 11 Jan 2018, Lee Jones wrote:
> >> On Mon, 25 Dec 2017, Baolin Wang wrote:
> >>
> >> > Some system control registers need hardware spinlock to synchronize
> >> > between the multiple subsystems, so we should add hardware spinlock
> >> > support for syscon.
> >> >
> >> > Signed-off-by: Baolin Wang 
> >> > Acked-by: Rob Herring 
> >> > ---
> >> > Changes since v6:
> >> >  - Treat hwlock id 0 as valid for regmap.
> >> >
> >> > Changes since v5:
> >> >  - Fix the case that hwspinlock is not enabled.
> >> >
> >> > Changes since v4:
> >> >  - Add one exapmle to show how to add hwlock.
> >> >  - Fix the coding style issue.
> >> >
> >> > Changes since v3:
> >> >  - Add error handling for of_hwspin_lock_get_id()
> >> >
> >> > Changes since v2:
> >> >  - Add acked tag from Rob.
> >> >
> >> > Changes since v1:
> >> >  - Remove timeout configuration.
> >> >  - Modify the binding file to add hwlocks.
> >> > ---
> >> >  Documentation/devicetree/bindings/mfd/syscon.txt |8 
> >> >  drivers/mfd/syscon.c |   19 
> >> > +++
> >> >  2 files changed, 27 insertions(+)
> >>
> >> Applied, thanks.
> >
> > In order to avoid confusion, I should like to tell you that this patch
> > is applied for v4.17, not v4.16.
> 
> This patch has been applied into Mark's branch[1] with your ACK, so
> Mark should drop this patch from his branch and you will pick it and
> merge it into v4.17?
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git/commit/?h=topic/hwspinlock&id=3bafc09e779710abaa7b836fe3bbeeeab7754c2b

Ah, this is the one that failed to build when merged alone.

Very well.  Ignore my last.

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator

2018-01-23 Thread Mylene Josserand

Hello Dmitry,

Thank you for the review!

On Mon, 22 Jan 2018 09:42:08 -0800,
Dmitry Torokhov wrote:

> Hi Mylène,
> 
> On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand
>  wrote:
> > Add the support of regulator to use it as VCC source.
> >
> > Signed-off-by: Mylène Josserand 
> > ---
> >  .../bindings/input/touchscreen/edt-ft5x06.txt  |  1 +
> >  drivers/input/touchscreen/edt-ft5x06.c | 33 
> > ++
> >  2 files changed, 34 insertions(+)
> >
> > diff --git 
> > a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt 
> > b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > index 025cf8c9324a..48e975b9c1aa 100644
> > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > @@ -30,6 +30,7 @@ Required properties:
> >  Optional properties:
> >   - reset-gpios: GPIO specification for the RESET input
> >   - wake-gpios:  GPIO specification for the WAKE input
> > + - vcc-supply:  Regulator that supplies the touchscreen
> >
> >   - pinctrl-names: should be "default"
> >   - pinctrl-0:   a phandle pointing to the pin settings for the
> > diff --git a/drivers/input/touchscreen/edt-ft5x06.c 
> > b/drivers/input/touchscreen/edt-ft5x06.c
> > index c53a3d7239e7..5ee14a25a382 100644
> > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > @@ -39,6 +39,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #define WORK_REGISTER_THRESHOLD0x00
> >  #define WORK_REGISTER_REPORT_RATE  0x08
> > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> > struct touchscreen_properties prop;
> > u16 num_x;
> > u16 num_y;
> > +   struct regulator *vcc;
> >
> > struct gpio_desc *reset_gpio;
> > struct gpio_desc *wake_gpio;
> > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client 
> > *client,
> >
> > tsdata->max_support_points = chip_data->max_support_points;
> >
> > +   tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > +   if (IS_ERR(tsdata->vcc)) {
> > +   error = PTR_ERR(tsdata->vcc);
> > +   dev_err(&client->dev, "failed to request regulator: %d\n",
> > +   error);
> > +   return error;
> > +   };  
> 
> As 0-day pointed out, this semicolon is not needed.

Yes, thanks, I will fix that in next version.

> 
> > +
> > +   if (tsdata->vcc) {  
> 
> You do not need to check for non-NULL here, devm_regulator_get() will
> never give you a NULL. If the regulator is not defined in DT/board
> mappings, then a dummy regulator will be provided. You can call
> regulator_enable() and regulator_disable() and other regulator APIs
> with a dummy regulator.

Okay, thanks for the explanation, I will remove that.

> 
> > +   error = regulator_enable(tsdata->vcc);
> > +   if (error < 0) {
> > +   dev_err(&client->dev, "failed to enable vcc: %d\n",
> > +   error);
> > +   return error;
> > +   }
> > +   }
> > +
> > tsdata->reset_gpio = devm_gpiod_get_optional(&client->dev,
> >  "reset", 
> > GPIOD_OUT_HIGH);
> > if (IS_ERR(tsdata->reset_gpio)) {
> > @@ -1122,20 +1141,34 @@ static int edt_ft5x06_ts_remove(struct i2c_client 
> > *client)
> >  static int __maybe_unused edt_ft5x06_ts_suspend(struct device *dev)
> >  {
> > struct i2c_client *client = to_i2c_client(dev);
> > +   struct edt_ft5x06_ts_data *tsdata = i2c_get_clientdata(client);
> >
> > if (device_may_wakeup(dev))
> > enable_irq_wake(client->irq);
> >
> > +   if (tsdata->vcc)  
> 
> Same here.

yep

> 
> > +   regulator_disable(tsdata->vcc);
> > +
> > return 0;
> >  }
> >
> >  static int __maybe_unused edt_ft5x06_ts_resume(struct device *dev)
> >  {
> > struct i2c_client *client = to_i2c_client(dev);
> > +   struct edt_ft5x06_ts_data *tsdata = i2c_get_clientdata(client);
> > +   int ret;
> >
> > if (device_may_wakeup(dev))
> > disable_irq_wake(client->irq);
> >
> > +   if (tsdata->vcc) {  
> 
> And here.

yep

> 
> > +   ret = regulator_enable(tsdata->vcc);
> > +   if (ret < 0) {
> > +   dev_err(dev, "failed to enable vcc: %d\n", ret);
> > +   return ret;
> > +   }  
> 
> Since power to the device may have been cut, I think you need to
> restore the register settings to whatever it was (factory vs work
> mode, threshold, gain and offset registers, etc, etc).

Okay. Could you tell me how I can do that?

> 
> > +   }
> > +
> > return 0;
> >  }
> >
> > --
> > 2.11.0
> >  
> 
> Thanks.
>

[PATCH net 1/2] vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()

2018-01-23 Thread Jason Wang

We used to call mutex_lock() in vhost_dev_lock_vqs(), which tries to
hold the mutexes of all virtqueues. This may confuse lockdep into reporting a
possible deadlock because the locks being held belong to the same
class. Switch to mutex_lock_nested() to avoid the false positive.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0...@syzkaller.appspotmail.com
Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 33ac2b1..549771a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -904,7 +904,7 @@ static void vhost_dev_lock_vqs(struct vhost_dev *d)
 {
int i = 0;
for (i = 0; i < d->nvqs; ++i)
-   mutex_lock(&d->vqs[i]->mutex);
+   mutex_lock_nested(&d->vqs[i]->mutex, i);
 }
 
 static void vhost_dev_unlock_vqs(struct vhost_dev *d)
-- 
2.7.4
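
A toy model of why the subclass argument in the patch above silences the report. This is a heavily simplified assumption about how a same-class check behaves, written for illustration; struct toy_lockdep and the helpers are hypothetical and do not reflect lockdep's real data structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 16

struct held_lock {
	int class_id;
	unsigned int subclass;
};

struct toy_lockdep {
	struct held_lock held[MAX_HELD];
	size_t depth;
	unsigned int reports;
};

/* Record an acquisition; count a report when the same class/subclass pair is already held. */
static void toy_acquire(struct toy_lockdep *ld, int class_id, unsigned int subclass)
{
	for (size_t i = 0; i < ld->depth; i++)
		if (ld->held[i].class_id == class_id &&
		    ld->held[i].subclass == subclass)
			ld->reports++;

	if (ld->depth < MAX_HELD)
		ld->held[ld->depth++] = (struct held_lock){ class_id, subclass };
}

/* Take nvqs queue mutexes that all share one class; returns the number of reports. */
static unsigned int lock_all_vqs(unsigned int nvqs, bool nested)
{
	struct toy_lockdep ld = { .depth = 0, .reports = 0 };

	for (unsigned int i = 0; i < nvqs; i++)
		toy_acquire(&ld, 1, nested ? i : 0);

	return ld.reports;
}
```

In this model, locking two queues without a subclass produces one report because the second lock looks identical to the first, while passing the loop index as the subclass produces none.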

Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd

2018-01-23 Thread Jiri Benc

On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
> This is not necessarily true in scenarios where I move a network device
> via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
> created. Here is an example:
> 
> nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
> nlmsghdr->nlmsg_type = RTM_NEWLINK;
> /* move to network namespace of pid */
> nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
> /* give interface new name */
> nla_put_string(nlmsg, IFLA_IFNAME, ifname)
> 
> The only thing I have is the pid that identifies the network namespace.

How do you know the interface did not get renamed in the new netns?

This is racy and won't work reliably. You really need to know the
netnsid before moving the interface to the netns to be able to do
meaningful queries.

You may argue that for your case, you're fine with the race. But I know
about use cases where it matters a lot: those are tools that show
network topology including changes in real time, such as Skydive. It's
important to have the uAPI designed right. And we don't want two
different APIs for the same thing.

If you want to do any watching over the interfaces (as opposed to "move
to the netns and forget"), you really have to work with netnsids. Let's
focus on how to do that more easily. We don't return netnsid at all
places where we should return it and we need to fix that.

> There's no non-syscall way to learn the netnsid.

And that is the primary problem.

 Jiri

Re: [PATCH v2] kasan: don't emit builtin calls when sanitization is off

2018-01-23 Thread Andrey Ryabinin



On 01/23/2018 05:20 AM, Andrew Morton wrote:
> On Fri, 19 Jan 2018 18:58:12 +0100 Andrey Konovalov  
> wrote:
> 
>> With KASAN enabled the kernel has two different memset() functions, one
>> with KASAN checks (memset) and one without (__memset). KASAN uses some
>> macro tricks to use the proper version where required. For example memset()
>> calls in mm/slub.c are without KASAN checks, since they operate on poisoned
>> slab object metadata.
>>
>> The issue is that clang emits memset() calls even when there is no memset()
>> in the source code. They get linked with improper memset() implementation
>> and the kernel fails to boot due to a huge amount of KASAN reports during
>> early boot stages.
>>
>> The solution is to add -fno-builtin flag for files with KASAN_SANITIZE := n
>> marker.
> 
> This clashes somewhat with Arnd's asan-rework-kconfig-settings.patch. 
> Could you two please put heads together and decide what we want for a
> final result?
> 
> Meanwhile I'll "fix" the reject with
> 
> +ifdef CONFIG_KASAN_EXTRA
>  CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
> +endif
> 

Looks correct.
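
For readers unfamiliar with the issue quoted above, here is a small userspace sketch of source that contains no memset() call but that a compiler may still lower to one. Whether a particular compiler does so depends on its version, optimization level and flags, so this only illustrates the source patterns; the struct and function names are made up.

```c
#include <stddef.h>

struct frame {
	unsigned char payload[256];
	size_t len;
};

/* Aggregate zero-initialization: may be compiled into a memset() call. */
static size_t count_nonzero_after_init(void)
{
	struct frame f = { 0 };
	size_t n = 0;

	for (size_t i = 0; i < sizeof(f.payload); i++)
		n += f.payload[i] != 0;

	return n + f.len;
}

/* A byte-clearing loop: an optimizer may recognize it as a memset idiom. */
static void clear_bytes(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = 0;
}
```

Both functions behave the same whichever way they are compiled; the point is only that the call to memset() can appear in the object code without appearing in the source.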

[PATCH] bnx2: remove redundant initializations of pointers txr and rxr

2018-01-23 Thread Colin King

From: Colin Ian King 

Pointers txr and rxr are being initialized and a few statements later
are being assigned new values without the original values ever being
read. The initialized values are therefore redundant and can be
removed.

Cleans up clang warnings:
drivers/net/ethernet/broadcom/bnx2.c:5821:28: warning: Value stored to
'txr' during its initialization is never read
drivers/net/ethernet/broadcom/bnx2.c:5822:28: warning: Value stored to
'rxr' during its initialization is never read

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/broadcom/bnx2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c 
b/drivers/net/ethernet/broadcom/bnx2.c
index 154866e8517a..5de4c33f682e 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -5818,8 +5818,8 @@ bnx2_run_loopback(struct bnx2 *bp, int loopback_mode)
struct l2_fhdr *rx_hdr;
int ret = -ENODEV;
struct bnx2_napi *bnapi = &bp->bnx2_napi[0], *tx_napi;
-   struct bnx2_tx_ring_info *txr = &bnapi->tx_ring;
-   struct bnx2_rx_ring_info *rxr = &bnapi->rx_ring;
+   struct bnx2_tx_ring_info *txr;
+   struct bnx2_rx_ring_info *rxr;
 
tx_napi = bnapi;
 
-- 
2.15.1
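
The kind of redundant initialization removed by the patch above can be shown with a small standalone sketch. The struct and function names here are hypothetical and much simpler than the bnx2 types; the pattern is the same: a pointer is initialized, then reassigned before the first value is ever read.

```c
struct ring {
	int id;
};

struct napi {
	struct ring tx_ring;
};

/* The initializer is overwritten before any read, so it is a dead store. */
static int ring_id_redundant(struct napi *first, struct napi *chosen)
{
	struct ring *txr = &first->tx_ring; /* value never read */

	txr = &chosen->tx_ring;
	return txr->id;
}

/* Same behavior without the redundant initialization. */
static int ring_id_clean(struct napi *chosen)
{
	struct ring *txr;

	txr = &chosen->tx_ring;
	return txr->id;
}
```

Both variants return the id of the chosen ring, which is why the initial value can be dropped without changing behavior.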
