[PATCH] ACPI / tables: Add IORT to injectable table list
This patch adds ACPI_SIG_PPTT to the table, which enables an IORT from
initrd to override the one from firmware.

Signed-off-by: Yang Shunyong
Cc: yutang2.ji...@hxt-semitech.com
Cc: yu.zh...@hxt-semitech.com
---
 drivers/acpi/tables.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 80ce2a7d224b..7bcb66f3 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
 	ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
 	ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
 	ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
-	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
+	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
+	NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
1.8.3.1
Re: [PATCH v6 22/36] nds32: Debugging support
On Tue, Jan 23, 2018 at 8:28 AM, Vincent Chen wrote:
> 2018-01-18 18:37 GMT+08:00 Arnd Bergmann:
>> On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote:
>>> From: Greentime Hu
>>
>> It appears that you are implementing the old-style ptrace handling
>> with architecture specific commands. Please have a look at how
>> this is done in risc-v or arm64. If this takes more too much time
>> to address, I'd suggest using an empty stub function for sys_ptrace
>> and adding it back at a later point, but not send the current version
>> upstream.
>>
>
> After referring to risc-v and arm64, I realize that PTRACE_GETREGSET
> and PTRACE_SETREGSET is used to replace arch specific command.
> The needed port for the two ptrace commands had done in current
> version patch.
>
> Could I keep them and just removing the code for old-style ptrace
> handling in the next version patch?

The important part is to not merge a user space interface into the
upstream kernel that we still want to change. It's clear that it takes
some time to update gdb and other programs using the ptrace interface,
so I'd suggest to simply not have any ptrace interface submitted for
inclusion until that is complete.

In the meantime, you can keep the existing version as an add-on kernel
patch, you probably have other patches that are not ready to get merged
yet, so just keep this one in the same tree as the others.

       Arnd
[PATCH v2] ACPI / tables: Add IORT to injectable table list
This patch adds ACPI_SIG_IORT to the table, which enables an IORT from
initrd to override the one from firmware.

Signed-off-by: Yang Shunyong
Cc: yutang2.ji...@hxt-semitech.com
Cc: yu.zh...@hxt-semitech.com
---
v2: change typo ACPI_SIG_PPTT to ACPI_SIG_IORT in commit message.
---
 drivers/acpi/tables.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 80ce2a7d224b..7bcb66f3 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
 	ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
 	ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
 	ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
-	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
+	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
+	NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
1.8.3.1
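For readers unfamiliar with why a one-entry change to this array matters: the list is NULL-terminated, and a table image supplied via initrd is only considered when its 4-byte signature matches one of the entries. The following is a minimal userspace sketch of that kind of lookup, not the kernel's actual code. The array name `injectable_sigs`, the helper `sig_is_injectable()`, and the shortened signature list are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, shortened stand-in for the signature list touched by
 * the patch. As in the patch, the list is terminated by NULL.
 */
static const char *const injectable_sigs[] = {
	"DSDT", "FACP", "SSDT", "IORT", NULL
};

/*
 * Return true if the first four bytes of a table image match one of
 * the listed signatures. 'table' must point to at least four bytes.
 */
static bool sig_is_injectable(const void *table)
{
	size_t i;

	for (i = 0; injectable_sigs[i]; i++) {
		if (!memcmp(table, injectable_sigs[i], 4))
			return true;
	}
	return false;
}
```

With a list like this, a table whose signature is missing from the array is simply skipped, which is why IORT has to be listed before an initrd copy can take effect.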
Re: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu aperture validity check
Hi Shameer,

On 18/01/18 01:04, Alex Williamson wrote:
> On Fri, 12 Jan 2018 16:45:27 +0000
> Shameer Kolothum wrote:
>
>> This introduces an iova list that is valid for dma mappings. Make
>> sure the new iommu aperture window is valid and doesn't conflict
>> with any existing dma mappings during attach. Also update the iova
>> list with new aperture window during attach/detach.
>>
>> Signed-off-by: Shameer Kolothum
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 177
>>  1 file changed, 177 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>> b/drivers/vfio/vfio_iommu_type1.c
>> index e30e29a..11cbd49 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -60,6 +60,7 @@
>>
>>  struct vfio_iommu {
>>  	struct list_head	domain_list;
>> +	struct list_head	iova_list;
>>  	struct vfio_domain	*external_domain; /* domain for external user */
>>  	struct mutex		lock;
>>  	struct rb_root		dma_list;
>> @@ -92,6 +93,12 @@ struct vfio_group {
>>  	struct list_head	next;
>>  };
>>
>> +struct vfio_iova {
>> +	struct list_head	list;
>> +	phys_addr_t		start;
>> +	phys_addr_t		end;
>> +};
>
> dma_list uses dma_addr_t for the iova.  IOVAs are naturally DMA
> addresses, why are we using phys_addr_t?
>
>> +
>>  /*
>>   * Guest RAM pinning working set or DMA target
>>   */
>> @@ -1192,6 +1199,123 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group
>> *group, phys_addr_t *base)
>>  	return ret;
>>  }
>>
>> +static int vfio_insert_iova(phys_addr_t start, phys_addr_t end,
>> +				struct list_head *head)
>> +{
>> +	struct vfio_iova *region;
>> +
>> +	region = kmalloc(sizeof(*region), GFP_KERNEL);
>> +	if (!region)
>> +		return -ENOMEM;
>> +
>> +	INIT_LIST_HEAD(&region->list);
>> +	region->start = start;
>> +	region->end = end;
>> +
>> +	list_add_tail(&region->list, head);
>> +	return 0;
>> +}
>
> As I'm reading through this series, I'm learning that there are a lot
> of assumptions and subtle details that should be documented.  For
> instance, the IOMMU API only provides a single geometry and we build
> upon that here as this patch creates a list, but there's only a single
> entry for now.  The following patches carve that single iova range into
> pieces and somewhat subtly use the list_head passed to keep the list
> sorted, allowing the first/last_entry tricks used throughout.  Subtle
> interfaces are prone to bugs.
>
>> +
>> +/*
>> + * Find whether a mem region overlaps with existing dma mappings
>> + */
>> +static bool vfio_find_dma_overlap(struct vfio_iommu *iommu,
>> +				  phys_addr_t start, phys_addr_t end)
>> +{
>> +	struct rb_node *n = rb_first(&iommu->dma_list);
>> +
>> +	for (; n; n = rb_next(n)) {
>> +		struct vfio_dma *dma;
>> +
>> +		dma = rb_entry(n, struct vfio_dma, node);
>> +
>> +		if (end < dma->iova)
>> +			break;
>> +		if (start >= dma->iova + dma->size)
>> +			continue;
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>
> Why do we need this in addition to the existing vfio_find_dma()?  Why
> doesn't this use the tree structure of the dma_list?
>
>> +
>> +/*
>> + * Check the new iommu aperture is a valid one
>> + */
>> +static int vfio_iommu_valid_aperture(struct vfio_iommu *iommu,
>> +				     phys_addr_t start,
>> +				     phys_addr_t end)
>> +{
>> +	struct vfio_iova *first, *last;
>> +	struct list_head *iova = &iommu->iova_list;
>> +
>> +	if (list_empty(iova))
>> +		return 0;
>> +
>> +	/* Check if new one is outside the current aperture */
>
> "Disjoint sets"
>
>> +	first = list_first_entry(iova, struct vfio_iova, list);
>> +	last = list_last_entry(iova, struct vfio_iova, list);
>> +	if ((start > last->end) || (end < first->start))
>> +		return -EINVAL;
>> +
>> +	/* Check for any existing dma mappings outside the new start */
>> +	if (start > first->start) {
>> +		if (vfio_find_dma_overlap(iommu, first->start, start - 1))
>> +			return -EINVAL;
>> +	}
>> +
>> +	/* Check for any existing dma mappings outside the new end */
>> +	if (end < last->end) {
>> +		if (vfio_find_dma_overlap(iommu, end + 1, last->end))
>> +			return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>
> I think this returns an int because you want to use it for the return
> value below, but it really seems like a bool question, ie. does this
> aperture conflict with existing mappings.  Additionally, the aperture
> is valid, it was provided to us by the IOMMU API, the question is
> whether it conflicts.

Please also name consistently
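The review point that "conflict" is a yes/no question can be illustrated outside the kernel. The sketch below is a userspace approximation under stated assumptions: mappings live in an array sorted by `iova` instead of an rb-tree, and the names `struct mapping` and `range_conflicts()` are hypothetical. It applies the same early-exit logic as the quoted `vfio_find_dma_overlap()` and returns a plain `bool`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an existing DMA mapping. */
struct mapping {
	uint64_t iova;	/* first address of the mapping */
	uint64_t size;	/* length in bytes */
};

/*
 * Return true if the inclusive range [start, end] overlaps any mapping.
 * 'maps' is assumed to be sorted by ascending iova.
 */
static bool range_conflicts(const struct mapping *maps, size_t n,
			    uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (end < maps[i].iova)
			break;		/* sorted: later mappings start higher */
		if (start >= maps[i].iova + maps[i].size)
			continue;	/* mapping ends before the range begins */
		return true;
	}
	return false;
}
```

A caller that needs an errno can then translate the answer at the call site, for example by returning -EINVAL when the function reports a conflict.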
[PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist
From: Sean Wang

commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream.

commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq
device with OPP v2") did not add MediaTek SoCs to the blacklist, which
can cause occasional hangs or unexpected behavior on related boards, as
kernelci reported in [1], specifically for the 4.14 and 4.15 trees.

For that reason, add MediaTek SoCs to the cpufreq-dt blacklist. The
patch should also be applied to the 4.14 and 4.15 trees so that kernelci
is able to complete the subsequent automated kernel testing.

[1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/

Fixes: edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2")
Signed-off-by: Andrew-sh Cheng
Signed-off-by: Sean Wang
Cc: Kevin Hilman
---
 drivers/cpufreq/cpufreq-dt-platdev.c | 8
 1 file changed, 8 insertions(+)

diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
index a753c50..9e0aa76 100644
--- a/drivers/cpufreq/cpufreq-dt-platdev.c
+++ b/drivers/cpufreq/cpufreq-dt-platdev.c
@@ -111,6 +111,14 @@ static const struct of_device_id blacklist[] __initconst = {
 
 	{ .compatible = "marvell,armadaxp", },
 
+	{ .compatible = "mediatek,mt2701", },
+	{ .compatible = "mediatek,mt2712", },
+	{ .compatible = "mediatek,mt7622", },
+	{ .compatible = "mediatek,mt7623", },
+	{ .compatible = "mediatek,mt817x", },
+	{ .compatible = "mediatek,mt8173", },
+	{ .compatible = "mediatek,mt8176", },
+
 	{ .compatible = "nvidia,tegra124", },
 
 	{ .compatible = "st,stih407", },
-- 
2.7.4
Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator
Hi,

On Mon, 22 Jan 2018 09:42:08 -0800 Dmitry Torokhov wrote:
> Hi Mylène,
>
> On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand
> wrote:
> > Add the support of regulator to use it as VCC source.
> >
> > Signed-off-by: Mylène Josserand
> > ---
> >  .../bindings/input/touchscreen/edt-ft5x06.txt |  1 +
> >  drivers/input/touchscreen/edt-ft5x06.c        | 33 ++
> >  2 files changed, 34 insertions(+)
> >
> > diff --git
> > a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > index 025cf8c9324a..48e975b9c1aa 100644
> > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > @@ -30,6 +30,7 @@ Required properties:
> >  Optional properties:
> >   - reset-gpios: GPIO specification for the RESET input
> >   - wake-gpios:  GPIO specification for the WAKE input
> > + - vcc-supply:  Regulator that supplies the touchscreen
> >
> >   - pinctrl-names: should be "default"
> >   - pinctrl-0:   a phandle pointing to the pin settings for the
> > diff --git a/drivers/input/touchscreen/edt-ft5x06.c
> > b/drivers/input/touchscreen/edt-ft5x06.c
> > index c53a3d7239e7..5ee14a25a382 100644
> > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > @@ -39,6 +39,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #define WORK_REGISTER_THRESHOLD		0x00
> >  #define WORK_REGISTER_REPORT_RATE	0x08
> > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> >  	struct touchscreen_properties prop;
> >  	u16 num_x;
> >  	u16 num_y;
> > +	struct regulator *vcc;
> >
> >  	struct gpio_desc *reset_gpio;
> >  	struct gpio_desc *wake_gpio;
> > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client
> > *client,
> >
> >  	tsdata->max_support_points = chip_data->max_support_points;
> >
> > +	tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > +	if (IS_ERR(tsdata->vcc)) {
> > +		error = PTR_ERR(tsdata->vcc);
> > +		dev_err(&client->dev, "failed to request regulator: %d\n",
> > +			error);
>
I would check for -EPROBE_DEFER here and omit the error message in this
case.

Lothar Waßmann
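The suggestion above can be sketched as a small userspace model, since the driver code itself cannot run outside the kernel. Everything here is illustrative: `handle_supply_error()` and the `logged` counter are hypothetical names, and `EPROBE_DEFER` is defined locally with the kernel's numeric value (517) because userspace `errno.h` does not provide it.

```c
#include <stdio.h>

/* Kernel-internal errno, defined locally for this userspace sketch. */
#define EPROBE_DEFER 517

/* Counts how many error messages were emitted (for demonstration). */
static int logged;

/*
 * Propagate 'error' to the caller, but only print a message when the
 * failure is not a probe deferral. A deferral means the supply is not
 * available yet and the probe will be retried, so logging it as an
 * error would only add noise.
 */
static int handle_supply_error(int error)
{
	if (error != -EPROBE_DEFER) {
		fprintf(stderr, "failed to request regulator: %d\n", error);
		logged++;
	}
	return error;
}
```

In the driver, the equivalent check would sit between the PTR_ERR() assignment and the dev_err() call, with the error still returned in both cases.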
Re: [RFC v2 2/5] vfio/type1: Check reserve region conflict and update iova list
Hi Shameer,

On 18/01/18 01:04, Alex Williamson wrote:
> On Fri, 12 Jan 2018 16:45:28 +0000
> Shameer Kolothum wrote:
>
>> This retrieves the reserved regions associated with dev group and
>> checks for conflicts with any existing dma mappings. Also update
>> the iova list excluding the reserved regions.
>>
>> Signed-off-by: Shameer Kolothum
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 161 +++-
>>  1 file changed, 159 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c
>> b/drivers/vfio/vfio_iommu_type1.c
>> index 11cbd49..7609070 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -28,6 +28,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -1199,6 +1200,20 @@ static bool vfio_iommu_has_sw_msi(struct iommu_group
>> *group, phys_addr_t *base)
>>  	return ret;
>>  }
>>
>
> /* list_sort helper */
>
>> +static int vfio_resv_cmp(void *priv, struct list_head *a, struct list_head
>> *b)
>> +{
>> +	struct iommu_resv_region *ra, *rb;
>> +
>> +	ra = container_of(a, struct iommu_resv_region, list);
>> +	rb = container_of(b, struct iommu_resv_region, list);
>> +
>> +	if (ra->start < rb->start)
>> +		return -1;
>> +	if (ra->start > rb->start)
>> +		return 1;
>> +	return 0;
>> +}
>> +
>>  static int vfio_insert_iova(phys_addr_t start, phys_addr_t end,
>>  				struct list_head *head)
>>  {
>> @@ -1274,6 +1289,24 @@ static int vfio_iommu_valid_aperture(struct
>> vfio_iommu *iommu,
>>  }
>>
>>  /*
>> + * Check reserved region conflicts with existing dma mappings
>> + */
>> +static int vfio_iommu_resv_region_conflict(struct vfio_iommu *iommu,
>> +				struct list_head *resv_regions)
>> +{
>> +	struct iommu_resv_region *region;
>> +
>> +	/* Check for conflict with existing dma mappings */
>> +	list_for_each_entry(region, resv_regions, list) {
>> +		if (vfio_find_dma_overlap(iommu, region->start,
>> +				    region->start + region->length - 1))
>> +			return -EINVAL;
>> +	}
>> +
>> +	return 0;
>> +}
>
> This basically does the same test as vfio_iommu_valid_aperture but
> properly names it a conflict test.  Please be consistent.  Should this
> also return bool, "conflict" is a yes/no answer.
>
>> +
>> +/*
>>   * Adjust the iommu aperture window if new aperture is a valid one
>>   */
>>  static int vfio_iommu_iova_aper_adjust(struct vfio_iommu *iommu,
>> @@ -1316,6 +1349,51 @@ static int vfio_iommu_iova_aper_adjust(struct
>> vfio_iommu *iommu,
>>  	return 0;
>>  }
>>
>> +/*
>> + * Check and update iova region list in case a reserved region
>> + * overlaps the iommu iova range
>> + */
>> +static int vfio_iommu_iova_resv_adjust(struct vfio_iommu *iommu,
>> +					struct list_head *resv_regions)
>
> "resv_region" in previous function, just "resv" here, use consistent
> names.  Also, what are we adjusting.  Maybe "exclude" is a better term.
>
>> +{
>> +	struct iommu_resv_region *resv;
>> +	struct list_head *iova = &iommu->iova_list;
>> +	struct vfio_iova *n, *next;
>> +
>> +	list_for_each_entry(resv, resv_regions, list) {
>> +		phys_addr_t start, end;
>> +
>> +		start = resv->start;
>> +		end = resv->start + resv->length - 1;
>> +
>> +		list_for_each_entry_safe(n, next, iova, list) {
>> +			phys_addr_t a, b;
>> +			int ret = 0;
>> +
>> +			a = n->start;
>> +			b = n->end;
>
> 'a' and 'b' variables actually make this incredibly confusing.  Use
> better variable names or just drop them entirely, it's much easier to
> follow as n->start & n->end.
>
>> +			/* No overlap */
>> +			if ((start > b) || (end < a))
>> +				continue;
>> +			/* Split the current node and create holes */
>> +			if (start > a)
>> +				ret = vfio_insert_iova(a, start - 1, &n->list);
>> +			if (!ret && end < b)
>> +				ret = vfio_insert_iova(end + 1, b, &n->list);
>> +			if (ret)
>> +				return ret;
>> +
>> +			list_del(&n->list);
>
> This is trickier than it appears and deserves some explanation.  AIUI,
> we're actually inserting duplicate entries for the remainder at the
> start of the range and then at the end of the range (and the order is
> important here because we're inserting each before the current node),
> and then we delete the current node.  So the iova_list is kept sorted
> through this process, though temporarily includes some bogus, unordered
> sub-sets.
>
>> +
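The split-and-delete step discussed above may be easier to follow without the list manipulation. The sketch below is a simplified userspace model, not the patch's implementation: it works on plain values instead of a linked list, and `struct range` and `range_exclude()` are hypothetical names. It shows the arithmetic of carving a reserved region out of one valid range, producing the remainders in ascending order, which is the property that keeps the real list sorted.

```c
#include <stddef.h>
#include <stdint.h>

/* A range with inclusive bounds, mirroring start/end in the patch. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Remove [resv.start, resv.end] from 'in'. Up to two remainders are
 * written to 'out' in ascending order; the return value is how many
 * were written (0 if 'in' is entirely covered by the reserved region).
 */
static size_t range_exclude(struct range in, struct range resv,
			    struct range out[2])
{
	size_t n = 0;

	/* No overlap: the range survives unchanged. */
	if (resv.start > in.end || resv.end < in.start) {
		out[n++] = in;
		return n;
	}

	/* Remainder below the reserved region, if any. */
	if (resv.start > in.start) {
		out[n].start = in.start;
		out[n].end = resv.start - 1;
		n++;
	}

	/* Remainder above the reserved region, if any. */
	if (resv.end < in.end) {
		out[n].start = resv.end + 1;
		out[n].end = in.end;
		n++;
	}

	return n;
}
```

Using descriptive field names throughout, rather than single-letter temporaries, also addresses the readability concern raised in the review.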
Re: problematic rc9 futex changes.
On Tue, Jan 23, 2018 at 12:34:46AM -0500, Dave Jones wrote:
> c1e2f0eaf015fb: "futex: Avoid violating the 10th rule of futex" seems to
> make up a few new rules to violate.
>
> Coverity picked up these two problems in the same code:
>

Yeah, Geert also spotted it:

  https://lkml.kernel.org/r/20180122103947.gd2...@hirez.programming.kicks-ass.net

I've been running the robustpi tests from glibc but have so far failed
to actually trigger the bug.

I think I'll just write up a Changelog and post the fix from the above
link.
Re: [PATCH v1] x86/io: Define readq()/writeq() to use 64-bit type
On Mon, 2018-01-22 at 16:46 -0800, h...@zytor.com wrote:
> On January 22, 2018 4:32:14 PM PST, "Mehta, Sohil" wrote:
> > On Fri, 2018-01-19 at 16:33 +0200, Andy Shevchenko wrote:
> > > +build_mmio_read(readq, "q", unsigned long long, "=r", :"memory")
> > > +build_mmio_read(__readq, "q", unsigned long long, "=r", )
> > > +build_mmio_write(writeq, "q", unsigned long long, "r", :"memory")
> > > +build_mmio_write(__writeq, "q", unsigned long long, "r", )
> > >
> > >  #define readq_relaxed(a) __readq(a)
> > >  #define writeq_relaxed(v, a) __writeq(v, a)
> >
> > The patch works for me:
> >
> > Tested-by: Sohil Mehta
> >
> Wouldn't simply u64 make more sense?

It would break a common style used in this module for the rest of
accessors. So, I prefer to go with unsigned long long and change later,
if needed, from POD types to uNN ones in entire file.

-- 
Andy Shevchenko
Intel Finland Oy
Re: [RFC v2 3/5] vfio/type1: check dma map request is within a valid iova range
Hi Shameer,

On 12/01/18 17:45, Shameer Kolothum wrote:
> This checks and rejects any dma map request outside valid iova
> range.
>
> Signed-off-by: Shameer Kolothum
> ---
>  drivers/vfio/vfio_iommu_type1.c | 22 ++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 7609070..47ea490 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -971,6 +971,23 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu,
> struct vfio_dma *dma,
>  	return ret;
>  }
>
> +/*
> + * Check dma map request is within a valid iova range
> + */
> +static bool vfio_iommu_iova_dma_valid(struct vfio_iommu *iommu,
> +				phys_addr_t start, phys_addr_t end)
s/phys_addr_t/dma_addr_t here also.
> +{
> +	struct list_head *iova = &iommu->iova_list;
> +	struct vfio_iova *node;
> +
> +	list_for_each_entry(node, iova, list) {
> +		if ((start >= node->start) && (end <= node->end))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  			   struct vfio_iommu_type1_dma_map *map)
>  {
> @@ -1009,6 +1026,11 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>  		goto out_unlock;
>  	}
>
> +	if (!vfio_iommu_iova_dma_valid(iommu, iova, iova + size - 1)) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
>  	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
>  	if (!dma) {
>  		ret = -ENOMEM;
>

Thanks

Eric
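As an illustration of the containment rule in the quoted helper, here is a minimal userspace sketch. It assumes the valid ranges are held in an array rather than a kernel list, and the names `struct iova_range` and `iova_request_valid()` are hypothetical. A request is accepted only when it fits entirely inside a single valid range, so a request that straddles a hole between two ranges is rejected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One valid IOVA window with inclusive bounds. */
struct iova_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if the inclusive request [start, end] lies completely
 * within one of the 'n' valid ranges.
 */
static bool iova_request_valid(const struct iova_range *ranges, size_t n,
			       uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (start >= ranges[i].start && end <= ranges[i].end)
			return true;
	}
	return false;
}
```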
[PATCH v8 3/5] x86/KASLR: Give a warning if movable_node specified without kaslr_mem=
Specifying 'movable_node' without 'kaslr_mem=' may break memory hotplug,
so recommend that users specify 'kaslr_mem=' whenever 'movable_node' is
specified.

Acked-by: Baoquan He
Signed-off-by: Chao Fan
---
 arch/x86/boot/compressed/kaslr.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index b200a7ceafc1..8703cc764306 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -282,6 +282,16 @@ static int handle_mem_filter(void)
 		!strstr(args, "kaslr_mem="))
 		return 0;
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+	/*
+	 * Check if 'kaslr_mem=' specified when 'movable_node' found. If not,
+	 * just give a warning. Otherwise memory hotplug could be
+	 * affected if kernel is put on movable memory regions.
+	 */
+	if (strstr(args, "movable_node") && !strstr(args, "kaslr_mem="))
+		warn("'kaslr_mem=' should be specified when using 'movable_node'.\n");
+#endif
+
 	tmp_cmdline = malloc(len + 1);
 	if (!tmp_cmdline)
 		error("Failed to allocate space for tmp_cmdline");
-- 
2.14.3
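The condition that triggers the warning is a plain substring test on the command line. A minimal userspace sketch of that predicate follows; the function name `needs_kaslr_mem_warning()` is hypothetical and only the two `strstr()` checks are taken from the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true when the command line asks for 'movable_node' but does
 * not restrict KASLR with 'kaslr_mem=', i.e. the case the patch warns
 * about.
 */
static bool needs_kaslr_mem_warning(const char *cmdline)
{
	return strstr(cmdline, "movable_node") != NULL &&
	       strstr(cmdline, "kaslr_mem=") == NULL;
}
```

Note that a substring match of this kind would also fire on any longer parameter that merely contains "movable_node"; that is a property of this sketch's approach, not a statement about how the final kernel code behaves.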
Re: [PATCH v2 2/4] drivers: firmware: xilinx: Add ZynqMP firmware driver
On Wed, Jan 17, 2018 at 12:20:32PM -0800, Jolly Shah wrote:
> This patch is adding communication layer with firmware.
> Firmware driver provides an interface to firmware APIs.
> Interface APIs can be used by any driver to communicate to
> PMUFW(Platform Management Unit). All requests go through ATF.
>
> Signed-off-by: Jolly Shah
> Signed-off-by: Rajan Vaja
> ---
>  arch/arm64/Kconfig.platforms                    |   1 +
>  drivers/firmware/Kconfig                        |   1 +
>  drivers/firmware/Makefile                       |   1 +
>  drivers/firmware/xilinx/Kconfig                 |   4 +
>  drivers/firmware/xilinx/Makefile                |   4 +
>  drivers/firmware/xilinx/zynqmp/Kconfig          |  16 +
>  drivers/firmware/xilinx/zynqmp/Makefile         |   4 +
>  drivers/firmware/xilinx/zynqmp/firmware.c       | 987
>  include/linux/firmware/xilinx/zynqmp/firmware.h | 570 ++

Why does this file need to be in include/linux/ at all?  Shouldn't it
just live in the driver-specific subdir?

thanks,

greg k-h
[PATCH v8 1/5] x86/KASLR: Add kaslr_mem=nn[KMG]@ss[KMG]
Introduce a new kernel parameter, kaslr_mem=nn[KMG]@ss[KMG], which is
used by KASLR only during the kernel decompression stage.

Users can use it to specify the memory regions into which the kernel may
be randomized. E.g. if movable_node is specified on the kernel cmdline,
the kernel could be extracted into movable regions, which would make
memory hotplug fail. With 'kaslr_mem=', the kernel is limited to the
immovable regions specified.

Tested-by: Luiz Capitulino
Acked-by: Baoquan He
Signed-off-by: Chao Fan
---
 arch/x86/boot/compressed/kaslr.c | 73 ++--
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 8199a6187251..b21741135673 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -108,6 +108,15 @@ enum mem_avoid_index {
 
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
+/* Only support at most 4 usable memory regions specified for kaslr */
+#define MAX_KASLR_MEM_USABLE	4
+
+/* Store the usable memory regions for kaslr */
+static struct mem_vector mem_usable[MAX_KASLR_MEM_USABLE];
+
+/* The amount of usable regions for kaslr user specify, not more than 4 */
+static int num_usable_region;
+
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
 	/* Item one is entirely before item two. */
@@ -206,7 +215,62 @@ static void mem_avoid_memmap(char *str)
 	memmap_too_large = true;
 }
 
-static int handle_mem_memmap(void)
+static int parse_kaslr_mem(char *p,
+			   unsigned long long *start,
+			   unsigned long long *size)
+{
+	char *oldp;
+
+	if (!p)
+		return -EINVAL;
+
+	oldp = p;
+	*size = memparse(p, &p);
+	if (p == oldp)
+		return -EINVAL;
+
+	switch (*p) {
+	case '@':
+		*start = memparse(p + 1, &p);
+		return 0;
+	default:
+		/*
+		 * If w/o offset, only size specified, kaslr_mem=nn[KMG]
+		 * has the same behaviour as kaslr_mem=nn[KMG]@0. It means
+		 * the region starts from 0.
+		 */
+		*start = 0;
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static void parse_kaslr_mem_regions(char *str)
+{
+	static int i;
+
+	while (str && (i < MAX_KASLR_MEM_USABLE)) {
+		int rc;
+		unsigned long long start, size;
+		char *k = strchr(str, ',');
+
+		if (k)
+			*k++ = 0;
+
+		rc = parse_kaslr_mem(str, &start, &size);
+		if (rc < 0)
+			break;
+		str = k;
+
+		mem_usable[i].start = start;
+		mem_usable[i].size = size;
+		i++;
+	}
+	num_usable_region = i;
+}
+
+static int handle_mem_filter(void)
 {
 	char *args = (char *)get_cmd_line_ptr();
 	size_t len = strlen((char *)args);
@@ -214,7 +278,8 @@ static int handle_mem_memmap(void)
 	char *param, *val;
 	u64 mem_size;
 
-	if (!strstr(args, "memmap=") && !strstr(args, "mem="))
+	if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
+		!strstr(args, "kaslr_mem="))
 		return 0;
 
 	tmp_cmdline = malloc(len + 1);
@@ -239,6 +304,8 @@ static int handle_mem_memmap(void)
 
 		if (!strcmp(param, "memmap")) {
 			mem_avoid_memmap(val);
+		} else if (!strcmp(param, "kaslr_mem")) {
+			parse_kaslr_mem_regions(val);
 		} else if (!strcmp(param, "mem")) {
 			char *p = val;
 
@@ -378,7 +445,7 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	/* We don't need to set a mapping for setup_data. */
 
 	/* Mark the memmap regions we need to avoid */
-	handle_mem_memmap();
+	handle_mem_filter();
 
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
 	/* Make sure video RAM can be used. */
-- 
2.14.3
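To make the nn[KMG]@ss[KMG] syntax concrete, here is a userspace sketch of the parsing rule described in the patch: a size, optionally followed by '@' and a start offset, with the start defaulting to 0 when no offset is given. It is an approximation under stated assumptions. `parse_size()` is a hypothetical stand-in for the kernel's memparse() built on strtoull(), it only understands K, M and G suffixes, and `parse_region()` returns -1 rather than -EINVAL.

```c
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for memparse(): parse a number with
 * an optional K/M/G suffix and advance *end past what was consumed.
 * If no digits are found, *end is left equal to 's' and 0 is returned.
 */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);
	unsigned int shift = 0;

	if (*end == s)
		return 0;

	switch (**end) {
	case 'K': case 'k': shift = 10; break;
	case 'M': case 'm': shift = 20; break;
	case 'G': case 'g': shift = 30; break;
	default: break;
	}
	if (shift) {
		v <<= shift;
		(*end)++;
	}
	return v;
}

/*
 * Parse "nn[KMG]" or "nn[KMG]@ss[KMG]". Returns 0 on success and -1 if
 * the string is missing or does not begin with a size.
 */
static int parse_region(const char *s, unsigned long long *start,
			unsigned long long *size)
{
	char *p;

	if (!s)
		return -1;

	*size = parse_size(s, &p);
	if (p == s)
		return -1;

	/* Without '@', the region is taken to start at 0. */
	*start = (*p == '@') ? parse_size(p + 1, &p) : 0;
	return 0;
}
```

For example, "1G@4G" would describe a 1 GiB region starting at 4 GiB, while "512M" alone describes the first 512 MiB.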
[PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS
From: Lihao Liang

Signed-off-by: Lihao Liang
---
 .../selftests/rcutorture/configs/rcuperf/PRCU-12      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-12.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-14      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-14.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-15      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-15.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-16      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-16.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-2       | 21 +
 .../rcutorture/configs/rcuperf/PRCU-2.boot            |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-32      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-32.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-4       | 21 +
 .../rcutorture/configs/rcuperf/PRCU-4.boot            |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-48      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-48.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-56      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-56.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-60      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-60.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-62      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-62.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-64      | 21 +
 .../rcutorture/configs/rcuperf/PRCU-64.boot           |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-8       | 21 +
 .../rcutorture/configs/rcuperf/PRCU-8.boot            |  1 +
 .../selftests/rcutorture/configs/rcuperf/TREE-12      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-14      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-15      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-16      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-2       | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-32      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-4       | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-48      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-56      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-60      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-62      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-64      | 21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-8       | 21 +
 39 files changed, 559 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
 create mode 100644
Re: Ping Re: [PATCH] virtio: make VIRTIO a menuconfig to ease disabling it all
On 1/23/18, Michael Ellerman wrote:
> This has been broken in linux-next for ~6 weeks now, can we please merge
> this and get it fixed.

Added Stephen Rothwell to cc

-- 
Vincent Legoll
[PATCH RFC 03/16] rcutorture: Add PRCU test config files
From: Lihao Liang

Use the same config files as TREE02, TREE03, TREE06, TREE07, and TREE09.

Signed-off-by: Lihao Liang
---
 .../selftests/rcutorture/configs/rcu/CFLIST        |  5
 .../selftests/rcutorture/configs/rcu/PRCU02        | 27 ++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |  1 +
 .../selftests/rcutorture/configs/rcu/PRCU03        | 23 ++
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU06        | 26 +
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |  5
 .../selftests/rcutorture/configs/rcu/PRCU07        | 25
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU09        | 19 +++
 .../selftests/rcutorture/configs/rcu/PRCU09.boot   |  1 +
 11 files changed, 136 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
index a3a1a05a..7359e194 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
@@ -1,3 +1,8 @@
+PRCU02
+PRCU03
+PRCU06
+PRCU07
+PRCU09
 TREE01
 TREE02
 TREE03
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
new file mode 100644
index ..5f532f05
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
@@ -0,0 +1,27 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_FANOUT=3
+CONFIG_RCU_FANOUT_LEAF=3
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
new file mode 100644
index ..6c5e626f
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
@@ -0,0 +1 @@
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
new file mode 100644
index ..869cadc8
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
@@ -0,0 +1,23 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=16
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=y
+CONFIG_NO_HZ_IDLE=n
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_TRACE=y
+CONFIG_HOTPLUG_CPU=y
+CONFIG_RCU_FANOUT=2
+CONFIG_RCU_FANOUT_LEAF=2
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_RCU_BOOST=y
+CONFIG_RCU_KTHREAD_PRIO=2
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
new file mode 100644
index ..0be10cba
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
@@ -0,0 +1,2 @@
+rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
new file mode 100644
index ..b1480963
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
@@ -0,0 +1,26 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PRCU=y
+#CHECK#CONFIG_TREE_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
[PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier()
From: Lihao Liang

Signed-off-by: Lihao Liang
---
 kernel/rcu/prcu.c       | 4 +++-
 kernel/rcu/rcutorture.c | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 2664d091..49cb70e6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -179,8 +179,10 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 
 	/* Use GFP_ATOMIC with IRQs disabled */
 	vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
-	if (!vhp)
+	if (!vhp) {
+		WARN_ON(1);
 		return;
+	}
 
 	head->func = func;
 	head->next = NULL;
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 7d65bf0c..9215ebb0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -797,8 +797,8 @@ static struct rcu_torture_ops prcu_ops = {
 	.exp_sync	= synchronize_prcu,
 	.get_state	= NULL,
 	.cond_sync	= NULL,
-	.call		= NULL,
-	.cb_barrier	= NULL,
+	.call		= call_prcu,
+	.cb_barrier	= prcu_barrier,
 	.fqs		= NULL,
 	.stats		= NULL,
 	.irq_capable	= 1,
-- 
2.14.1.729.g59c0ea183
[PATCH RFC 16/16] Add GPLv2 license
From: Lihao Liang

Signed-off-by: Lihao Liang
---
 include/linux/prcu.h | 4 ++++
 kernel/rcu/prcu.c    | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 9f740985..9fa74dac 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -4,6 +4,10 @@
  *
  * Authors: Heng Zhang
  *          Lihao Liang
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #ifndef __LINUX_PRCU_H
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index ef2c7730..06375ee6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -10,6 +10,10 @@
  *
  * Authors: Heng Zhang
  *          Lihao Liang
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #include
-- 
2.14.1.729.g59c0ea183
[PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter
From: Lihao Liang

Signed-off-by: Lihao Liang
---
 include/linux/prcu.h | 14 ++++++--------
 init/Kconfig         |  7 +++++++
 kernel/rcu/Makefile  |  3 ++-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index cce967fd..bb20fa40 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -7,8 +7,7 @@
 #include
 #include
 
-#define CONFIG_PRCU
-
+#ifdef CONFIG_PRCU
 struct prcu_version_head {
 	unsigned long long version;
 	struct prcu_version_head *next;
@@ -48,7 +47,6 @@ struct prcu_struct {
 	struct completion barrier_completion;
 };
 
-#ifdef CONFIG_PRCU
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
@@ -62,11 +60,11 @@ void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
-#define prcu_read_lock() do {} while (0)
-#define prcu_read_unlock() do {} while (0)
-#define synchronize_prcu() do {} while (0)
-#define call_prcu() do {} while (0)
-#define prcu_barrier() do {} while (0)
+#define prcu_read_lock rcu_read_lock
+#define prcu_read_unlock rcu_read_unlock
+#define synchronize_prcu synchronize_rcu
+#define call_prcu call_rcu
+#define prcu_barrier rcu_barrier
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc..c1fd80f9 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -565,6 +565,13 @@ config TASKS_RCU
 	  only voluntary context switch (not preemption!), idle, and
 	  user-mode execution as quiescent states.
 
+config PRCU
+	bool
+	default y
+	help
+	  This option selects the PRCU implementation based on a fast
+	  consensus protocol.
+
 config RCU_STALL_COMMON
 	def_bool ( TREE_RCU || PREEMPT_RCU || RCU_TRACE )
 	help
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 8791419c..9074b395 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o prcu.o
+obj-y += update.o sync.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
@@ -12,4 +12,5 @@ obj-$(CONFIG_TREE_RCU) += tree.o
 obj-$(CONFIG_PREEMPT_RCU) += tree.o
 obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o
 obj-$(CONFIG_TINY_RCU) += tiny.o
+obj-$(CONFIG_PRCU) += prcu.o
 obj-$(CONFIG_RCU_NEED_SEGCBLIST) += rcu_segcblist.o
-- 
2.14.1.729.g59c0ea183
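The header change in this patch makes the prcu_* names fall back to the ordinary RCU primitives when CONFIG_PRCU is not set, instead of compiling to no-ops. A minimal userspace sketch of that aliasing pattern follows. The stub rcu_read_lock()/rcu_read_unlock() bodies, the rcu_lock_depth counter, and depth_seen_by_reader() are hypothetical stand-ins for illustration only, not kernel code.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for the kernel's RCU read-side API. */
static int rcu_lock_depth;

static void rcu_read_lock(void)
{
	rcu_lock_depth++;
}

static void rcu_read_unlock(void)
{
	assert(rcu_lock_depth > 0);
	rcu_lock_depth--;
}

#ifdef CONFIG_PRCU
/* With PRCU enabled, a real implementation would be linked in. */
void prcu_read_lock(void);
void prcu_read_unlock(void);
#else
/* Without PRCU, the names resolve to the plain RCU primitives. */
#define prcu_read_lock   rcu_read_lock
#define prcu_read_unlock rcu_read_unlock
#endif

/* A reader written against the PRCU names; returns the depth it saw. */
static int depth_seen_by_reader(void)
{
	int depth;

	prcu_read_lock();
	depth = rcu_lock_depth;	/* 1 when the fallback alias is in effect */
	prcu_read_unlock();
	return depth;
}
```

Built without CONFIG_PRCU defined, the reader still enters a real read-side critical section, which is the behavioural difference from the earlier `do {} while (0)` stubs.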
[PATCH v6 0/2] Initial Allwinner V3s CSI Support
This patchset adds initial support for the Allwinner V3s CSI.

The Allwinner V3s SoC features two CSI modules. CSI0 is used for the MIPI CSI-2 interface and CSI1 is used for the parallel interface. This is not documented in the datasheet; it was determined by testing and guesswork.

This patchset implements a V4L2 framework driver and adds binding documentation for it.

Currently, the driver only supports the parallel interface. It has been tested with a BT1120 signal generated by an FPGA. The following features are not supported by this patchset:
  - ISP
  - MIPI-CSI2
  - Master clock for camera sensor
  - Power regulator for the front end IC

Changes in v6:
  * Add Rob Herring's review tag.
  * Fix a NULL pointer dereference by picking Maxime Ripard's patch.
  * Add Maxime Ripard's test tag.

Changes in v5:
  * Use the new SPDX tags.
  * Fix MODULE_LICENSE.
  * Add many default cases and warning messages.
  * Detail the parallel bus properties.
  * Fix some spelling and syntax mistakes.

Changes in v4:
  * Deal with the CSI 'INNER QUEUE'.
    The CSI looks up the next DMA buffer for the next frame before the
    current frame-done IRQ is triggered. This is not documented but was
    reported by Ondřej Jirman.
    The BSP code has a workaround for this too. It skips marking the
    first buffer as frame done for VB2 and passes the second buffer to
    the CSI in the first frame-done ISR call. Then, in the second
    frame-done ISR call, it marks the first buffer as frame done for
    VB2 and passes the third buffer to the CSI, and so on. The drawback
    is that the first buffer is written twice and the first frame is
    dropped even when enough buffers are queued.
    So I made an improvement here: pass the next buffer to the CSI
    right after starting the CSI. In this case, the first frame is
    stored in the first buffer and the second frame in the second
    buffer. This method avoids dropping the first frame; frames are
    still dropped when too few buffers are queued.
  * Fix: using a wrong mbus_code when getting the supported formats
  * Change all fourcc to pixformat
  * Change some function names

Changes in v3:
  * Get rid of struct sun6i_csi_ops
  * Move sun6i-csi to new directory drivers/media/platform/sunxi
  * Merge sun6i_csi.c and sun6i_csi_v3s.c into sun6i_csi.c
  * Use generic fwnode endpoints parser
  * Only support a single subdev to make things simple
  * Many complaintion fix

Changes in v2:
  * Change sunxi-csi to sun6i-csi
  * Rebase to media_tree master branch

The following is the 'v4l2-compliance -s -f' output. I have tested this with both interlaced and progressive signals:

# ./v4l2-compliance -s -f
v4l2-compliance SHA   : 6049ea8bd64f9d78ef87ef0c2b3dc9b5de1ca4a1

Driver Info:
	Driver name   : sun6i-video
	Card type     : sun6i-csi
	Bus info      : platform:csi
	Driver version: 4.15.0
	Capabilities  : 0x8421
		Video Capture
		Streaming
		Extended Pix Format
		Device Capabilities
	Device Caps   : 0x0421
		Video Capture
		Streaming
		Extended Pix Format

Compliance test for device /dev/video0 (not using libv4l2):

Required ioctls:
	test VIDIOC_QUERYCAP: OK

Allow for multiple opens:
	test second video open: OK
	test VIDIOC_QUERYCAP: OK
	test VIDIOC_G/S_PRIORITY: OK
	test for unlimited opens: OK

Debug ioctls:
	test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
	test VIDIOC_LOG_STATUS: OK (Not Supported)

Input ioctls:
	test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
	test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
	test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
	test VIDIOC_ENUMAUDIO: OK (Not Supported)
	test VIDIOC_G/S/ENUMINPUT: OK
	test VIDIOC_G/S_AUDIO: OK (Not Supported)
	Inputs: 1 Audio Inputs: 0 Tuners: 0

Output ioctls:
	test VIDIOC_G/S_MODULATOR: OK (Not Supported)
	test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
	test VIDIOC_ENUMAUDOUT: OK (Not Supported)
	test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
	test VIDIOC_G/S_AUDOUT: OK (Not Supported)
	Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
	test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
	test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
	test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
	test VIDIOC_G/S_EDID: OK (Not Supported)

Test input 0:

	Control ioctls:
		test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK (Not Supported)
		test VIDIOC_QUERYCTRL: OK (Not Supported)
		test VIDIOC_G/S_CTRL: OK (Not Supported)
		test VIDIOC_G/S/TRY_EXT_CTRLS: OK (Not Supported)
		test VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT: OK (Not Supported)
		test VIDIOC_G/S_JPEGCOMP: OK (Not Supported)
		Standard Controls: 0 Private Controls: 0

	Format
[PATCH v6 2/2] media: V3s: Add support for Allwinner CSI.
Allwinner V3s SoC features two CSI module. CSI0 is used for MIPI CSI-2 interface and CSI1 is used for parallel interface. This is not documented in datasheet but by test and guess. This patch implement a v4l2 framework driver for it. Currently, the driver only support the parallel interface. MIPI-CSI2, ISP's support are not included in this patch. Tested-by: Maxime RipardSigned-off-by: Yong Deng --- MAINTAINERS| 8 + drivers/media/platform/Kconfig | 1 + drivers/media/platform/Makefile| 2 + drivers/media/platform/sunxi/sun6i-csi/Kconfig | 9 + drivers/media/platform/sunxi/sun6i-csi/Makefile| 3 + drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c | 908 + drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.h | 143 .../media/platform/sunxi/sun6i-csi/sun6i_csi_reg.h | 196 + .../media/platform/sunxi/sun6i-csi/sun6i_video.c | 753 + .../media/platform/sunxi/sun6i-csi/sun6i_video.h | 53 ++ 10 files changed, 2076 insertions(+) create mode 100644 drivers/media/platform/sunxi/sun6i-csi/Kconfig create mode 100644 drivers/media/platform/sunxi/sun6i-csi/Makefile create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.h create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_csi_reg.h create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_video.c create mode 100644 drivers/media/platform/sunxi/sun6i-csi/sun6i_video.h diff --git a/MAINTAINERS b/MAINTAINERS index 9501403..b792fe5 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3783,6 +3783,14 @@ M: Jaya Kumar S: Maintained F: sound/pci/cs5535audio/ +CSI DRIVERS FOR ALLWINNER V3s +M: Yong Deng +L: linux-me...@vger.kernel.org +T: git git://linuxtv.org/media_tree.git +S: Maintained +F: drivers/media/platform/sunxi/sun6i-csi/ +F: Documentation/devicetree/bindings/media/sun6i-csi.txt + CW1200 WLAN driver M: Solomon Peachy S: Maintained diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig index fd0c998..41017e3 100644 --- 
a/drivers/media/platform/Kconfig +++ b/drivers/media/platform/Kconfig @@ -150,6 +150,7 @@ source "drivers/media/platform/am437x/Kconfig" source "drivers/media/platform/xilinx/Kconfig" source "drivers/media/platform/rcar-vin/Kconfig" source "drivers/media/platform/atmel/Kconfig" +source "drivers/media/platform/sunxi/sun6i-csi/Kconfig" config VIDEO_TI_CAL tristate "TI CAL (Camera Adaptation Layer) driver" diff --git a/drivers/media/platform/Makefile b/drivers/media/platform/Makefile index 003b0bb..e6e9ce7 100644 --- a/drivers/media/platform/Makefile +++ b/drivers/media/platform/Makefile @@ -97,3 +97,5 @@ obj-$(CONFIG_VIDEO_QCOM_CAMSS)+= qcom/camss-8x16/ obj-$(CONFIG_VIDEO_QCOM_VENUS) += qcom/venus/ obj-y += meson/ + +obj-$(CONFIG_VIDEO_SUN6I_CSI) += sunxi/sun6i-csi/ diff --git a/drivers/media/platform/sunxi/sun6i-csi/Kconfig b/drivers/media/platform/sunxi/sun6i-csi/Kconfig new file mode 100644 index 000..314188a --- /dev/null +++ b/drivers/media/platform/sunxi/sun6i-csi/Kconfig @@ -0,0 +1,9 @@ +config VIDEO_SUN6I_CSI + tristate "Allwinner V3s Camera Sensor Interface driver" + depends on VIDEO_V4L2 && COMMON_CLK && VIDEO_V4L2_SUBDEV_API && HAS_DMA + depends on ARCH_SUNXI || COMPILE_TEST + select VIDEOBUF2_DMA_CONTIG + select REGMAP_MMIO + select V4L2_FWNODE + ---help--- + Support for the Allwinner Camera Sensor Interface Controller on V3s. 
diff --git a/drivers/media/platform/sunxi/sun6i-csi/Makefile b/drivers/media/platform/sunxi/sun6i-csi/Makefile new file mode 100644 index 000..213cb6b --- /dev/null +++ b/drivers/media/platform/sunxi/sun6i-csi/Makefile @@ -0,0 +1,3 @@ +sun6i-csi-y += sun6i_video.o sun6i_csi.o + +obj-$(CONFIG_VIDEO_SUN6I_CSI) += sun6i-csi.o diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c new file mode 100644 index 000..9c341f0 --- /dev/null +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c @@ -0,0 +1,908 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2011-2018 Magewell Electronics Co., Ltd. (Nanjing) + * All rights reserved. + * Author: Yong Deng + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sun6i_csi.h" +#include "sun6i_csi_reg.h" + +#define MODULE_NAME"sun6i-csi" + +struct sun6i_csi_dev { + struct sun6i_csicsi; + struct device
[PATCH v8 2/5] x86/KASLR: Handle the memory regions specified in kaslr_mem
If no 'kaslr_mem=' is specified, just handle the e820/efi entries
directly as before. Otherwise, limit the kernel to the memory regions
specified by 'kaslr_mem=' on the command line.

Rename process_mem_region to slots_count to match slots_fetch_random,
and name the new function process_mem_region.

Tested-by: Luiz Capitulino
Acked-by: Baoquan He
Signed-off-by: Chao Fan
---
 arch/x86/boot/compressed/kaslr.c | 64 +---
 1 file changed, 53 insertions(+), 11 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index b21741135673..b200a7ceafc1 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -548,9 +548,9 @@ static unsigned long slots_fetch_random(void)
 	return 0;
 }
 
-static void process_mem_region(struct mem_vector *entry,
-			       unsigned long minimum,
-			       unsigned long image_size)
+static void slots_count(struct mem_vector *entry,
+			unsigned long minimum,
+			unsigned long image_size)
 {
 	struct mem_vector region, overlap;
 	struct slot_area slot_area;
@@ -627,6 +627,52 @@ static void process_mem_region(struct mem_vector *entry,
 	}
 }
 
+static bool process_mem_region(struct mem_vector region,
+			       unsigned long long minimum,
+			       unsigned long long image_size)
+{
+	/*
+	 * If 'kaslr_mem=' specified, walk all the regions, and
+	 * filter the intersection to slots_count.
+	 */
+	if (num_usable_region > 0) {
+		int i;
+
+		for (i = 0; i < num_usable_region; i++) {
+			struct mem_vector entry;
+			unsigned long long start, end, entry_end, region_end;
+
+			start = mem_usable[i].start;
+			end = start + mem_usable[i].size;
+			region_end = region.start + region.size;
+
+			entry.start = clamp(region.start, start, end);
+			entry_end = clamp(region_end, start, end);
+
+			if (entry.start < entry_end) {
+				entry.size = entry_end - entry.start;
+				slots_count(&entry, minimum, image_size);
+			}
+
+			if (slot_area_index == MAX_SLOT_AREA) {
+				debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+				return 1;
+			}
+		}
+		return 0;
+	}
+
+	/*
+	 * If no kaslr_mem stored, use region directly
+	 */
+	slots_count(&region, minimum, image_size);
+	if (slot_area_index == MAX_SLOT_AREA) {
+		debug_putstr("Aborted e820/efi memmap scan (slot_areas full)!\n");
+		return 1;
+	}
+	return 0;
+}
+
 #ifdef CONFIG_EFI
 /*
  * Returns true if mirror region found (and must have been processed
@@ -692,11 +738,9 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
 
 		region.start = md->phys_addr;
 		region.size = md->num_pages << EFI_PAGE_SHIFT;
-		process_mem_region(&region, minimum, image_size);
-		if (slot_area_index == MAX_SLOT_AREA) {
-			debug_putstr("Aborted EFI scan (slot_areas full)!\n");
+
+		if (process_mem_region(region, minimum, image_size))
 			break;
-		}
 	}
 	return true;
 }
@@ -723,11 +767,9 @@ static void process_e820_entries(unsigned long minimum,
 			continue;
 		region.start = entry->addr;
 		region.size = entry->size;
-		process_mem_region(&region, minimum, image_size);
-		if (slot_area_index == MAX_SLOT_AREA) {
-			debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+
+		if (process_mem_region(region, minimum, image_size))
 			break;
-		}
 	}
 }
 
-- 
2.14.3
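The new process_mem_region() intersects each firmware-reported region with each 'kaslr_mem=' range by clamping both ends of the region into the usable range. A minimal standalone sketch of that clamp-based intersection is shown below. The struct mem_vector here is a userspace copy, and clamp_ull() and mem_intersect() are hypothetical helper names introduced only for this illustration; the kernel code uses its own clamp() macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the boot code's struct mem_vector. */
struct mem_vector {
	unsigned long long start;
	unsigned long long size;
};

/* Clamp val into [lo, hi]; the caller guarantees lo <= hi. */
static unsigned long long clamp_ull(unsigned long long val,
				    unsigned long long lo,
				    unsigned long long hi)
{
	assert(lo <= hi);
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/*
 * Store the overlap of region and usable in *out.
 * Returns false (and leaves *out untouched) when they do not intersect.
 */
static bool mem_intersect(struct mem_vector region, struct mem_vector usable,
			  struct mem_vector *out)
{
	unsigned long long start = usable.start;
	unsigned long long end = start + usable.size;
	unsigned long long region_end = region.start + region.size;
	unsigned long long entry_start = clamp_ull(region.start, start, end);
	unsigned long long entry_end = clamp_ull(region_end, start, end);

	/* Disjoint ranges collapse to the same boundary after clamping. */
	if (entry_start >= entry_end)
		return false;

	out->start = entry_start;
	out->size = entry_end - entry_start;
	return true;
}
```

For a region covering 0x1000..0x4000 and a usable range 0x2000..0x3000, both ends clamp into the usable range and the overlap is 0x2000..0x3000. A region entirely below or above the usable range clamps to a single boundary and is skipped.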
[PATCH v8 0/5] x86/KASLR: Add parameter kaslr_mem=nn[KMG]@ss[KMG]
This is a resend of v8. There is no code change; only the code comments
and the documentation were improved. Baoquan's Acked-by and Luiz's
Tested-by have been added.

***Background:
People reported that KASLR may randomly choose positions located in
movable memory regions, which breaks the memory hotplug feature.

Also, on a KVM guest with 4GB of memory, the good unfragmented 1GB
region could be occupied by the randomized kernel, causing hugetlb to
fail to allocate a 1GB page. A kernel booted with 'nokaslr' has no such
issue, so this is a regression. Please see the discussion mail:
https://lkml.org/lkml/2018/1/4/236

***Solutions:
Introduce a new kernel parameter 'kaslr_mem=nn@ss' to let users specify
the memory regions where the kernel can be randomized safely. E.g., if
'movable_node' is specified, 'kaslr_mem=nn@ss' can be used to tell KASLR
where the kernel can be placed safely. The KASLR code can then avoid the
movable regions and only choose from the immovable regions specified.

For the hugetlb case, users can always add 'kaslr_mem=1G' to the kernel
cmdline, since the 0~1G region is always fragmented because of the BIOS
reserved area. Users can of course specify regions more precisely if
they know the system memory layout well.

***Issues that need to be discussed
There are several issues I am not quite sure about; please help review
and give suggestions:

1) There is already mem_avoid[], which stores the memory regions KASLR
needs to avoid. For the regions KASLR can safely use, I named the array
mem_usable[], and I am not sure whether that is appropriate. Or should
it be kaslr_mem[] directly?

2) In v6, I made 'kaslr_mem=' a kernel parameter that users can use to
specify memory regions where the kernel can be extracted safely with
'kaslr_mem=nn@ss', or regions to avoid when extracting the kernel with
'kaslr_mem=nn!ss'. Thinking about it again later, 'kaslr_mem=nn@ss'
seems to satisfy the current requirement, so there is no need to
introduce 'kaslr_mem=nn!ss'. So I took the 'kaslr_mem=nn!ss' handling
patch out; it may be added later if anyone thinks it is necessary. Any
suggestions?

https://www.spinics.net/lists/kernel/msg2698457.html

***Test results:
 - I did some tests for the memory hotplug issue. I specified the
   memory region in one node, and every time the kernel was extracted
   to the memory of this node.
 - Luiz tested this series with a 4GB KVM guest. With kaslr_mem=1G, one
   1GB page was allocated 100% of the time in 85 boots. Without
   kaslr_mem=, there were 3 failures in only 10 boots (that is, in 3
   boots no 1GB page was allocated). So this series solves the 1GB page
   problem.

***History
v7->v8:
 - Just improve some comments.
 - Fix the wrong spelling.
 - Add the Tested-by and Acked-by.

v6->v7:
 - Drop the unnecessary avoid part for now.
 - Add documentation for the new parameter.

v5->v6:
 - Add the last patch to save the avoid memory regions.

v4->v5:
 - Fix the problem reported by LKP
Follow Dou's suggestion:
 - Also return if "movable_node" matches when parsing the kernel
   command line in handle_mem_filter without CONFIG_MEMORY_HOTPLUG
   defined

v3->v4:
Follow Kees's suggestion:
 - Put the functions and variables for immovable_mem under
   #ifdef CONFIG_MEMORY_HOTPLUG and move some code
 - Change the name of "process_mem_region" to "slots_count"
 - Rename the new function "process_immovable_mem" to
   "process_mem_region"
Follow Baoquan's suggestion:
 - Fail KASLR if "movable_node" is specified without "immovable_mem"
 - Adjust the placement of the code that handles mem_region directly if
   no immovable_mem is specified
Follow Randy's suggestion:
 - Fix the mistake and add a detailed description to the document.

v2->v3:
Follow Baoquan He's suggestion:
 - Change the names of several functions.
 - Add a new parameter "immovable_mem" instead of extending
   movable_node
 - Use clamp to calculate the memory intersection, which makes the
   logic clearer.
 - Disable memory mirror if movable_node is specified

v1->v2:
Follow Dou Liyang's suggestion:
 - Add parsing for movable_node=nn[KMG] without @ss[KMG]
 - Fix the bug when more than one "movable_node=" is specified
 - Drop useless variables and use the mem_vector region directly
 - Add more comments.

Chao Fan (5):
  x86/KASLR: Add kaslr_mem=nn[KMG]@ss[KMG]
  x86/KASLR: Handle the memory regions specified in kaslr_mem
  x86/KASLR: Give a warning if movable_node specified without kaslr_mem=
  x86/KASLR: Skip memory mirror handling if movable_node specified
  document: add document for kaslr_mem

 Documentation/admin-guide/kernel-parameters.txt |  10 ++
 arch/x86/boot/compressed/kaslr.c                | 154 +---
 2 files changed, 150 insertions(+), 14 deletions(-)

-- 
2.14.3
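To make the 'kaslr_mem=nn[KMG]@ss[KMG]' syntax concrete, here is a small userspace sketch of how such a value could be parsed. It is only an illustration under stated assumptions: parse_size() and parse_kaslr_mem() are hypothetical names, the suffix handling is modelled on what the kernel's memparse() helper does, and a missing '@ss' part is taken to mean a start address of 0, matching the 'kaslr_mem=1G' example in the letter above. The series' real parsing code is not shown in this mail and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *endp ends up past it. */
static unsigned long long parse_size(const char *s, char **endp)
{
	unsigned long long val = strtoull(s, endp, 0);

	switch (**endp) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		(*endp)++;
		break;
	default:
		break;
	}
	return val;
}

/*
 * Parse "nn[KMG]@ss[KMG]" into *size and *start.
 * Assumption: without "@ss" the region starts at 0, so "1G" is [0, 1G).
 * Returns false on empty, zero-sized, or trailing-garbage input.
 */
static bool parse_kaslr_mem(const char *arg, unsigned long long *start,
			    unsigned long long *size)
{
	char *p;

	*size = parse_size(arg, &p);
	if (p == arg || *size == 0)
		return false;

	*start = 0;
	if (*p == '@') {
		const char *q = p + 1;

		*start = parse_size(q, &p);
		if (p == q)
			return false;	/* '@' with no address after it */
	}
	return *p == '\0';
}
```

With this sketch, "1G" yields a 1GB region at address 0 and "512M@1G" yields a 512MB region starting at 1GB, while inputs such as "abc" or "1G@" are rejected.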
[PATCH v2] iommu/mediatek: Move attach_device after iommu-group is ready for M4Uv1
In the commit 05f80300dc8b ("iommu: Finish making iommu_group support
mandatory"), the iommu framework assumes that all iommu drivers have
their own iommu-group, and it gets rid of the FIXME workarounds for the
case where the group is NULL. But the flow of the Mediatek M4U gen1 is
a bit tricky, and it hangs in this case:

==========
Unable to handle kernel NULL pointer dereference at virtual address 0030
PC is at mutex_lock+0x28/0x54
LR is at iommu_attach_device+0xa4/0xd4
pc : []lr : []psr: 6013
sp : df0edbb8 ip : df0edbc8 fp : df0edbc4
r10: c114da14 r9 : df2a3e40 r8 : 0003
r7 : df27a210 r6 : df2a90c4 r5 : 0030 r4 :
r3 : df0f8000 r2 : f000 r1 : df29c610 r0 : 0030
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
xxx
(mutex_lock) from [] (iommu_attach_device+0xa4/0xd4)
(iommu_attach_device) from [] (__arm_iommu_attach_device+0x28/0x90)
(__arm_iommu_attach_device) from [] (arm_iommu_attach_device+0x1c/0x30)
(arm_iommu_attach_device) from [] (mtk_iommu_add_device+0xfc/0x214)
(mtk_iommu_add_device) from [] (add_iommu_group+0x3c/0x68)
(add_iommu_group) from [] (bus_for_each_dev+0x78/0xac)
(bus_for_each_dev) from [] (bus_set_iommu+0xb0/0xec)
(bus_set_iommu) from [] (mtk_iommu_probe+0x328/0x368)
(mtk_iommu_probe) from [] (platform_drv_probe+0x5c/0xc0)
(platform_drv_probe) from [] (driver_probe_device+0x2f4/0x4d8)
(driver_probe_device) from [] (__driver_attach+0x10c/0x128)
(__driver_attach) from [] (bus_for_each_dev+0x78/0xac)
(bus_for_each_dev) from [] (driver_attach+0x2c/0x30)
(driver_attach) from [] (bus_add_driver+0x1e0/0x278)
(bus_add_driver) from [] (driver_register+0x88/0x108)
(driver_register) from [] (__platform_driver_register+0x50/0x58)
(__platform_driver_register) from [] (m4u_init+0x24/0x28)
(m4u_init) from [] (do_one_initcall+0xf0/0x17c)
==========

The root cause is that "arm_iommu_attach_device" is called before
"iommu_group_get_for_dev" in the interface "mtk_iommu_add_device". Thus,
we adjust the order of these two functions.

Unfortunately, there is another issue after the fix above. In the
function "iommu_attach_device", only one device per iommu group is
allowed. In the Mediatek case, there is only one m4u group and all the
devices are in that one group, so it fails at this step.

To satisfy this requirement, a new iommu group is allocated for each
iommu consumer device. But meanwhile, we still have to use the same
domain for all the iommu groups. Use a global variable "mtk_domain_v1"
to save the global domain.

CC: Robin Murphy
CC: Honghui Zhang
Fixes: 05f80300dc8b ("iommu: Finish making iommu_group support mandatory")
Reported-by: Ryder Lee
Tested-by: Bibby Hsieh
Signed-off-by: Yong Wu
---
changes since v1: Add mtk_domain_v1=NULL in domain_free for symmetry.

v1: https://patchwork.kernel.org/patch/10176255/
---
 drivers/iommu/mtk_iommu_v1.c | 42 +++---
 1 file changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 542930c..86106bf 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -103,6 +103,9 @@ struct mtk_iommu_domain {
 	struct mtk_iommu_data *data;
 };
 
+/* There is only a iommu domain in M4U gen1. */
+static struct mtk_iommu_domain *mtk_domain_v1;
+
 static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom)
 {
 	return container_of(dom, struct mtk_iommu_domain, domain);
@@ -251,10 +254,15 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
 	if (type != IOMMU_DOMAIN_UNMANAGED)
 		return NULL;
 
+	/* Always return the same domain. */
+	if (mtk_domain_v1)
+		return &mtk_domain_v1->domain;
+
 	dom = kzalloc(sizeof(*dom), GFP_KERNEL);
 	if (!dom)
 		return NULL;
 
+	mtk_domain_v1 = dom;
 	return &dom->domain;
 }
 
@@ -263,6 +271,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
 	struct mtk_iommu_domain *dom = to_mtk_domain(domain);
 	struct mtk_iommu_data *data = dom->data;
 
+	mtk_domain_v1 = NULL;
 	dma_free_coherent(data->dev, M2701_IOMMU_PGT_SIZE,
 			dom->pgt_va, dom->pgt_pa);
 	kfree(to_mtk_domain(domain));
@@ -418,20 +427,12 @@ static int mtk_iommu_create_mapping(struct device *dev,
 		m4udev->archdata.iommu = mtk_mapping;
 	}
 
-	ret = arm_iommu_attach_device(dev, mtk_mapping);
-	if (ret)
-		goto err_release_mapping;
-
 	return 0;
-
-err_release_mapping:
-	arm_iommu_release_mapping(mtk_mapping);
-	m4udev->archdata.iommu = NULL;
-	return ret;
 }
 
 static int mtk_iommu_add_device(struct device *dev)
 {
+	struct dma_iommu_mapping
Re: [PATCH v7 2/2] mfd: syscon: Add hardware spinlock support
Hi Lee,

On 22 January 2018 at 21:43, Lee Jones wrote:
> On Thu, 11 Jan 2018, Lee Jones wrote:
>> On Mon, 25 Dec 2017, Baolin Wang wrote:
>>
>> > Some system control registers need hardware spinlock to synchronize
>> > between the multiple subsystems, so we should add hardware spinlock
>> > support for syscon.
>> >
>> > Signed-off-by: Baolin Wang
>> > Acked-by: Rob Herring
>> > ---
>> > Changes since v6:
>> >  - Treat hwlock id 0 as valid for regmap.
>> >
>> > Changes since v5:
>> >  - Fix the case that hwspinlock is not enabled.
>> >
>> > Changes since v4:
>> >  - Add one example to show how to add hwlock.
>> >  - Fix the coding style issue.
>> >
>> > Changes since v3:
>> >  - Add error handling for of_hwspin_lock_get_id()
>> >
>> > Changes since v2:
>> >  - Add acked tag from Rob.
>> >
>> > Changes since v1:
>> >  - Remove timeout configuration.
>> >  - Modify the binding file to add hwlocks.
>> > ---
>> >  Documentation/devicetree/bindings/mfd/syscon.txt |8
>> >  drivers/mfd/syscon.c | 19 +++
>> >  2 files changed, 27 insertions(+)
>>
>> Applied, thanks.
>
> In order to avoid confusion, I should like to tell you that this patch
> is applied for v4.17, not v4.16.

This patch has been applied into Mark's branch[1] with your ACK, so Mark
should drop this patch from his branch and you will pick it and merge it
into v4.17?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git/commit/?h=topic/hwspinlock=3bafc09e779710abaa7b836fe3bbeeeab7754c2b

-- 
Baolin.wang
Best Regards
[PATCH RFC 01/16] prcu: Add PRCU implementation
From: Heng Zhang

This RCU implementation (PRCU) is based on a fast consensus protocol
published in the following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

Signed-off-by: Heng Zhang
Signed-off-by: Lihao Liang
---
 include/linux/prcu.h |  37 +++
 kernel/rcu/Makefile  |   2 +-
 kernel/rcu/prcu.c    | 125 +++
 kernel/sched/core.c  |   2 +
 4 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/prcu.h
 create mode 100644 kernel/rcu/prcu.c

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
new file mode 100644
index ..653b4633
--- /dev/null
+++ b/include/linux/prcu.h
@@ -0,0 +1,37 @@
+#ifndef __LINUX_PRCU_H
+#define __LINUX_PRCU_H
+
+#include
+#include
+#include
+
+#define CONFIG_PRCU
+
+struct prcu_local_struct {
+	unsigned int locked;
+	unsigned int online;
+	unsigned long long version;
+};
+
+struct prcu_struct {
+	atomic64_t global_version;
+	atomic_t active_ctr;
+	struct mutex mtx;
+	wait_queue_head_t wait_q;
+};
+
+#ifdef CONFIG_PRCU
+void prcu_read_lock(void);
+void prcu_read_unlock(void);
+void synchronize_prcu(void);
+void prcu_note_context_switch(void);
+
+#else /* #ifdef CONFIG_PRCU */
+
+#define prcu_read_lock() do {} while (0)
+#define prcu_read_unlock() do {} while (0)
+#define synchronize_prcu() do {} while (0)
+#define prcu_note_context_switch() do {} while (0)
+
+#endif /* #ifdef CONFIG_PRCU */
+#endif /* __LINUX_PRCU_H */
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 23803c7d..8791419c 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o
+obj-y += update.o sync.o prcu.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
new file mode 100644
index ..a00b9420
--- /dev/null
+++ b/kernel/rcu/prcu.c
@@ -0,0 +1,125 @@
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
+
+struct prcu_struct global_prcu = {
+	.global_version = ATOMIC64_INIT(0),
+	.active_ctr = ATOMIC_INIT(0),
+	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
+};
+struct prcu_struct *prcu = &global_prcu;
+
+static inline void prcu_report(struct prcu_local_struct *local)
+{
+	unsigned long long global_version;
+	unsigned long long local_version;
+
+	global_version = atomic64_read(&prcu->global_version);
+	local_version = local->version;
+	if (global_version > local_version)
+		cmpxchg(&local->version, local_version, global_version);
+}
+
+void prcu_read_lock(void)
+{
+	struct prcu_local_struct *local;
+
+	local = get_cpu_ptr(&prcu_local);
+	if (!local->online) {
+		WRITE_ONCE(local->online, 1);
+		smp_mb();
+	}
+
+	local->locked++;
+	put_cpu_ptr(&prcu_local);
+}
+EXPORT_SYMBOL(prcu_read_lock);
+
+void prcu_read_unlock(void)
+{
+	int locked;
+	struct prcu_local_struct *local;
+
+	barrier();
+	local = get_cpu_ptr(&prcu_local);
+	locked = local->locked;
+	if (locked) {
+		local->locked--;
+		if (locked == 1)
+			prcu_report(local);
+		put_cpu_ptr(&prcu_local);
+	} else {
+		put_cpu_ptr(&prcu_local);
+		if (!atomic_dec_return(&prcu->active_ctr))
+			wake_up(&prcu->wait_q);
+	}
+}
+EXPORT_SYMBOL(prcu_read_unlock);
+
+static void prcu_handler(void *info)
+{
+	struct prcu_local_struct *local;
+
+	local = this_cpu_ptr(&prcu_local);
+	if (!local->locked)
+		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
+}
+
+void synchronize_prcu(void)
+{
+	int cpu;
+	cpumask_t cpus;
+	unsigned long long version;
+	struct prcu_local_struct *local;
+
+	version = atomic64_add_return(1, &prcu->global_version);
+	mutex_lock(&prcu->mtx);
+
+	local = get_cpu_ptr(&prcu_local);
+	local->version = version;
+	put_cpu_ptr(&prcu_local);
+
+	cpumask_clear(&cpus);
+	for_each_possible_cpu(cpu) {
+		local = per_cpu_ptr(&prcu_local, cpu);
+		if (!READ_ONCE(local->online))
+			continue;
+		if (READ_ONCE(local->version) < version) {
+			smp_call_function_single(cpu, prcu_handler, NULL, 0);
+			cpumask_set_cpu(cpu, &cpus);
+		}
+	}
[PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
From: Lihao Liang

Dear Paul,

This patch set implements a preemptive version of RCU (PRCU) based on the
following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

We have also added preliminary callback-handling support. Thus, the current
version provides the APIs prcu_read_lock(), prcu_read_unlock(),
synchronize_prcu(), call_prcu(), and prcu_barrier().

This is an experimental patch, so it would be good to have some feedback.

A known shortcoming is that the grace-period version is incremented in
synchronize_prcu(). If call_prcu() or prcu_barrier() is called but
synchronize_prcu() is never invoked, callbacks cannot be invoked. A later
version should address this issue, e.g. by adding a grace-period expedition
mechanism. Another possible improvement is to use a hierarchical structure,
taking into account the NUMA topology, to send IPIs in synchronize_prcu().

We have tested the implementation using rcutorture on both an x86 and an
ARM64 machine. PRCU passed 1h and 3h tests on all the newly added config
files, except that PRCU07 reported a BUG in a 1h run:

[ 1593.604201] ---[ end trace b3bae911bec86152 ]---
[ 1594.629450] prcu-torture:torture_onoff task: offlining 14
[ 1594.73] smpboot: CPU 14 is now offline
[ 1594.757732] prcu-torture:torture_onoff task: offlined 14
[ 1597.765149] prcu-torture:torture_onoff task: onlining 11
[ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
[ 1597.804102] prcu-torture:torture_onoff task: onlined 11
[ 1599.365098] prcu-torture: rtc: b0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0 rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418 onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
[ 1599.367946] prcu-torture: !!!
[ 1599.367966] [ cut here ]

We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to
true, that is, synchronize_rcu_expedited() was tested.

The rcuperf results are as follows (average grace-period duration in ms of
ten 10min runs):

16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic

CPUs       2       4       8      12      15       16
PRCU    0.14    1.07    4.15    8.02   10.79    15.16
TREE   49.30  104.75  277.55  390.82  620.82  1381.54

64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic

CPUs       2       4        8      16      32       48       63        64
PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27

Best wishes,
Lihao.

Lihao Liang (15):
  rcutorture: Add PRCU rcu_torture_ops
  rcutorture: Add PRCU test config files
  rcuperf: Add PRCU rcu_perf_ops
  rcuperf: Add PRCU test config files
  rcuperf: Set gp_exp to true for tests to run
  prcu: Implement call_prcu() API
  prcu: Implement PRCU callback processing
  prcu: Implement prcu_barrier() API
  rcutorture: Test call_prcu() and prcu_barrier()
  rcutorture: Add basic ARM64 support to run scripts
  prcu: Add PRCU Kconfig parameter
  prcu: Comment source code
  rcuperf: Add config files with various CONFIG_NR_CPUS
  rcutorture: Add scripts to run experiments
  Add GPLv2 license

Heng Zhang (1):
  prcu: Add PRCU implementation

 include/linux/interrupt.h                          |   3 +
 include/linux/prcu.h                               | 122 +
 include/linux/rcupdate.h                           |   1 +
 init/Kconfig                                       |   7 +
 init/main.c                                        |   2 +
 kernel/rcu/Makefile                                |   1 +
 kernel/rcu/prcu.c                                  | 497 +
 kernel/rcu/rcuperf.c                               |  33 +-
 kernel/rcu/rcutorture.c                            |  40 +-
 kernel/rcu/tree.c                                  |   1 +
 kernel/sched/core.c                                |   2 +
 kernel/time/timer.c                                |   2 +
 kvm.sh                                             | 452 +++
 run-rcuperf.sh                                     |  26 ++
 .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
 .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
 .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
 .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
 .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
 .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
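
For readers following the cover letter, the grace-period rule it describes can be modelled in user space roughly as below. This is only a single-threaded sketch under stated assumptions, not the code from the patch: every name prefixed with model_ is hypothetical, the IPI is replaced by a direct function call, and memory barriers, context switches and the writer's own CPU are left out. What it is meant to show is the consensus step: a grace period with version v is over once every CPU that has ever run a reader has acknowledged a version of at least v, and a CPU acknowledges either when poked while idle or when it leaves its outermost read-side critical section.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-CPU PRCU state. */
struct model_cpu {
	bool online;			/* has this CPU ever run a reader? */
	unsigned int locked;		/* read-side nesting depth */
	unsigned long long version;	/* last grace-period version acknowledged */
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];
static unsigned long long model_global_version;

static void model_read_lock(int cpu)
{
	model_cpus[cpu].online = true;
	model_cpus[cpu].locked++;
}

static void model_read_unlock(int cpu)
{
	struct model_cpu *c = &model_cpus[cpu];

	assert(c->locked > 0);
	/* Leaving the outermost critical section reports the current version. */
	if (--c->locked == 0 && c->version < model_global_version)
		c->version = model_global_version;
}

/* Stand-in for the IPI handler: only an idle CPU acknowledges at once. */
static void model_ipi(int cpu)
{
	if (model_cpus[cpu].locked == 0)
		model_cpus[cpu].version = model_global_version;
}

/* Start a grace period: bump the version and poke every lagging CPU. */
static unsigned long long model_gp_start(void)
{
	int cpu;

	model_global_version++;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!model_cpus[cpu].online)
			continue;
		if (model_cpus[cpu].version < model_global_version)
			model_ipi(cpu);
	}
	return model_global_version;
}

/* The grace period is over once no online CPU lags behind its version. */
static bool model_gp_done(unsigned long long version)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (model_cpus[cpu].online && model_cpus[cpu].version < version)
			return false;
	}
	return true;
}
```

In this model, a reader that holds the lock across model_gp_start() keeps model_gp_done() false until its matching model_read_unlock(), which is the property a writer has to wait for.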
[PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops
From: Lihao LiangSigned-off-by: Lihao Liang --- kernel/rcu/rcuperf.c | 31 ++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c index a4a86fb4..ea80fa3e 100644 --- a/kernel/rcu/rcuperf.c +++ b/kernel/rcu/rcuperf.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -304,6 +305,34 @@ static bool __maybe_unused torturing_tasks(void) #endif /* #else #ifdef CONFIG_TASKS_RCU */ +/* + * Definitions for prcu perf testing. + */ + +static int prcu_perf_read_lock(void) __acquires(RCU) +{ + prcu_read_lock(); + return 0; +} + +static void prcu_perf_read_unlock(int idx) __releases(RCU) +{ + prcu_read_unlock(); +} + +static struct rcu_perf_ops prcu_ops = { + .ptype = PRCU_FLAVOR, + .init = rcu_sync_perf_init, + .readlock = prcu_perf_read_lock, + .readunlock = prcu_perf_read_unlock, + .started= rcu_no_completed, + .completed = rcu_no_completed, + .exp_completed = rcu_no_completed, + .sync = synchronize_prcu, + .exp_sync = synchronize_prcu, + .name = "prcu" +}; + /* * If performance tests complete, wait for shutdown to commence. */ @@ -554,7 +583,7 @@ rcu_perf_init(void) long i; int firsterr = 0; static struct rcu_perf_ops *perf_ops[] = { - _ops, _bh_ops, _ops, _ops, + _ops, _bh_ops, _ops, _ops, _ops, RCUPERF_TASKS_OPS }; -- 2.14.1.729.g59c0ea183
[PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops
From: Lihao LiangReviewed-by: Heng Zhang Signed-off-by: Lihao Liang --- include/linux/rcupdate.h | 1 + kernel/rcu/rcutorture.c | 40 +++- 2 files changed, 40 insertions(+), 1 deletion(-) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index e1e5d002..12df9709 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -84,6 +84,7 @@ enum rcutorture_type { RCU_SCHED_FLAVOR, RCU_TASKS_FLAVOR, SRCU_FLAVOR, + PRCU_FLAVOR, INVALID_RCU_FLAVOR }; diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c index ae6e574d..7d65bf0c 100644 --- a/kernel/rcu/rcutorture.c +++ b/kernel/rcu/rcutorture.c @@ -46,6 +46,7 @@ #include #include #include +#include #include #include #include @@ -768,6 +769,43 @@ static bool __maybe_unused torturing_tasks(void) #endif /* #else #ifdef CONFIG_TASKS_RCU */ +/* + * Definitions for prcu torture testing. + */ + +static int prcu_torture_read_lock(void) __acquires(RCU) +{ + prcu_read_lock(); + return 0; +} + +static void prcu_torture_read_unlock(int idx) __releases(RCU) +{ + prcu_read_unlock(); +} + +static struct rcu_torture_ops prcu_ops = { + .ttype = PRCU_FLAVOR, + .init = rcu_sync_torture_init, + .readlock = prcu_torture_read_lock, + .read_delay = rcu_read_delay, /* just reuse rcu's version. */ + .readunlock = prcu_torture_read_unlock, + .started= rcu_no_completed, + .completed = rcu_no_completed, + .deferred_free = NULL, + .sync = synchronize_prcu, + .exp_sync = synchronize_prcu, + .get_state = NULL, + .cond_sync = NULL, + .call = NULL, + .cb_barrier = NULL, + .fqs= NULL, + .stats = NULL, + .irq_capable= 1, + .can_boost = 0, + .name = "prcu" +}; + /* * RCU torture priority-boost testing. 
Runs one real-time thread per * CPU for moderate bursts, repeatedly registering RCU callbacks and @@ -1764,7 +1802,7 @@ rcu_torture_init(void) int firsterr = 0; static struct rcu_torture_ops *torture_ops[] = { _ops, _bh_ops, _busted_ops, _ops, _ops, - _ops, RCUTORTURE_TASKS_OPS + _ops, _ops, RCUTORTURE_TASKS_OPS }; if (!torture_init_begin(torture_type, verbose, _runnable)) -- 2.14.1.729.g59c0ea183
[PATCH RFC 05/16] rcuperf: Add PRCU test config files
From: Lihao LiangUse the same config file of TREE. Signed-off-by: Lihao Liang --- .../selftests/rcutorture/configs/rcuperf/CFLIST | 1 + .../selftests/rcutorture/configs/rcuperf/PRCU| 20 .../selftests/rcutorture/configs/rcuperf/PRCU.boot | 1 + 3 files changed, 22 insertions(+) create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST index c9f56cf2..4b80917a 100644 --- a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST +++ b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST @@ -1 +1,2 @@ TREE +PRCU diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU new file mode 100644 index ..a312f671 --- /dev/null +++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU @@ -0,0 +1,20 @@ +CONFIG_SMP=y +CONFIG_PREEMPT_NONE=n +CONFIG_PREEMPT_VOLUNTARY=n +CONFIG_PREEMPT=y +#CHECK#CONFIG_PREEMPT_RCU=y +CONFIG_HZ_PERIODIC=n +CONFIG_NO_HZ_IDLE=y +CONFIG_NO_HZ_FULL=n +CONFIG_RCU_FAST_NO_HZ=n +CONFIG_RCU_TRACE=n +CONFIG_HOTPLUG_CPU=n +CONFIG_SUSPEND=n +CONFIG_HIBERNATION=n +CONFIG_RCU_NOCB_CPU=n +CONFIG_DEBUG_LOCK_ALLOC=n +CONFIG_PROVE_LOCKING=n +CONFIG_RCU_BOOST=n +CONFIG_DEBUG_OBJECTS_RCU_HEAD=n +CONFIG_RCU_EXPERT=y +CONFIG_RCU_TRACE=y diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot new file mode 100644 index ..7e54ea55 --- /dev/null +++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot @@ -0,0 +1 @@ +rcuperf.perf_type=prcu -- 2.14.1.729.g59c0ea183
[PATCH v6 1/2] dt-bindings: media: Add Allwinner V3s Camera Sensor Interface (CSI)
Add binding documentation for Allwinner V3s CSI. Reviewed-by: Rob HerringSigned-off-by: Yong Deng --- .../devicetree/bindings/media/sun6i-csi.txt| 59 ++ 1 file changed, 59 insertions(+) create mode 100644 Documentation/devicetree/bindings/media/sun6i-csi.txt diff --git a/Documentation/devicetree/bindings/media/sun6i-csi.txt b/Documentation/devicetree/bindings/media/sun6i-csi.txt new file mode 100644 index 000..2ff47a9 --- /dev/null +++ b/Documentation/devicetree/bindings/media/sun6i-csi.txt @@ -0,0 +1,59 @@ +Allwinner V3s Camera Sensor Interface +- + +Allwinner V3s SoC features two CSI module. CSI0 is used for MIPI CSI-2 +interface and CSI1 is used for parallel interface. + +Required properties: + - compatible: value must be "allwinner,sun8i-v3s-csi" + - reg: base address and size of the memory-mapped region. + - interrupts: interrupt associated to this IP + - clocks: phandles to the clocks feeding the CSI +* bus: the CSI interface clock +* mod: the CSI module clock +* ram: the CSI DRAM clock + - clock-names: the clock names mentioned above + - resets: phandles to the reset line driving the CSI + +Each CSI node should contain one 'port' child node with one child 'endpoint' +node, according to the bindings defined in +Documentation/devicetree/bindings/media/video-interfaces.txt. As mentioned +above, the endpoint's bus type should be MIPI CSI-2 for CSI0 and parallel or +Bt656 for CSI1. 
+ +Endpoint node properties for CSI1 +- + +- remote-endpoint : (required) a phandle to the bus receiver's endpoint + node +- bus-width: : (required) must be 8, 10, 12 or 16 +- pclk-sample : (optional) (default: sample on falling edge) +- hsync-active : (only required for parallel) +- vsync-active : (only required for parallel) + +Example: + +csi1: csi@1cb4000 { + compatible = "allwinner,sun8i-v3s-csi"; + reg = <0x01cb4000 0x1000>; + interrupts = ; + clocks = < CLK_BUS_CSI>, +< CLK_CSI1_SCLK>, +< CLK_DRAM_CSI>; + clock-names = "bus", "mod", "ram"; + resets = < RST_BUS_CSI>; + + port { + /* Parallel bus endpoint */ + csi1_ep: endpoint { + remote-endpoint = <_ep>; + bus-width = <16>; + + /* If hsync-active/vsync-active are missing, + embedded BT.656 sync is used */ + hsync-active = <0>; /* Active low */ + vsync-active = <0>; /* Active low */ + pclk-sample = <1>; /* Rising */ + }; + }; +}; -- 1.8.3.1
Re: [PATCH v6 16/36] nds32: DMA mapping API
Hi, Arnd:

2018-01-18 18:26 GMT+08:00 Arnd Bergmann:
> On Mon, Jan 15, 2018 at 6:53 AM, Greentime Hu wrote:
>> From: Greentime Hu
>>
>> This patch adds support for the DMA mapping API. It uses dma_map_ops for
>> flexibility.
>>
>> Signed-off-by: Vincent Chen
>> Signed-off-by: Greentime Hu
>
> I'm still unhappy about the way the cache flushes are done here as discussed
> before. It's not a show-stopped, but no Ack from me.

How about this implementation?

static void
nds32_dma_sync_single_for_cpu(struct device *dev, dma_addr_t handle,
			      size_t size, enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:	/* writeback only */
		break;
	case DMA_FROM_DEVICE:	/* invalidate only */
	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
		cpu_dma_inval_range(start, end);
		break;
	default:
		BUG();
	}
}

static void
nds32_dma_sync_single_for_device(struct device *dev, dma_addr_t handle,
				 size_t size, enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_FROM_DEVICE:	/* invalidate only */
		break;
	case DMA_TO_DEVICE:	/* writeback only */
	case DMA_BIDIRECTIONAL:	/* writeback and invalidate */
		cpu_dma_wb_range(start, end);
		break;
	default:
		BUG();
	}
}
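
The direction handling proposed above can be exercised outside the kernel with a small stand-alone sketch. Everything named sketch_ here is hypothetical: the enum mirrors the values of the kernel's enum dma_data_direction, the two cache helpers only count calls instead of touching a cache, and the address range is derived from a plain address plus size, which is an assumption since the mail does not show how start and end are computed. An unexpected direction returns -1 where the proposal calls BUG().

```c
#include <stddef.h>

/* Local mirror of the kernel's enum dma_data_direction values. */
enum sketch_dma_dir {
	SKETCH_DMA_BIDIRECTIONAL = 0,
	SKETCH_DMA_TO_DEVICE = 1,
	SKETCH_DMA_FROM_DEVICE = 2,
	SKETCH_DMA_NONE = 3,
};

static int sketch_wb_calls;	/* number of writeback requests seen */
static int sketch_inval_calls;	/* number of invalidate requests seen */

/* Stub: a real port would write dirty cache lines back to memory. */
static void sketch_wb_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	sketch_wb_calls++;
}

/* Stub: a real port would discard the cache lines in the range. */
static void sketch_inval_range(unsigned long start, unsigned long end)
{
	(void)start;
	(void)end;
	sketch_inval_calls++;
}

/* Hand the buffer to the CPU: drop stale lines if the device wrote to it. */
static int sketch_sync_for_cpu(unsigned long addr, size_t size,
			       enum sketch_dma_dir dir)
{
	switch (dir) {
	case SKETCH_DMA_TO_DEVICE:
		return 0;		/* device only read, nothing is stale */
	case SKETCH_DMA_FROM_DEVICE:
	case SKETCH_DMA_BIDIRECTIONAL:
		sketch_inval_range(addr, addr + size);
		return 0;
	default:
		return -1;		/* the proposal uses BUG() here */
	}
}

/* Hand the buffer to the device: push out CPU writes it needs to see. */
static int sketch_sync_for_device(unsigned long addr, size_t size,
				  enum sketch_dma_dir dir)
{
	switch (dir) {
	case SKETCH_DMA_FROM_DEVICE:
		return 0;		/* device will overwrite the buffer */
	case SKETCH_DMA_TO_DEVICE:
	case SKETCH_DMA_BIDIRECTIONAL:
		sketch_wb_range(addr, addr + size);
		return 0;
	default:
		return -1;		/* the proposal uses BUG() here */
	}
}
```

The point of the split is that each sync direction performs at most one cache operation: invalidate when ownership moves to the CPU, writeback when it moves to the device.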
Re: [PATCH v2 3/4] drivers: firmware: xilinx: Add sysfs interface
On Wed, Jan 17, 2018 at 12:20:33PM -0800, Jolly Shah wrote: > Add Firmware-ggs sysfs interface which provides read/write > interface to global storage registers. > > Signed-off-by: Jolly Shah> Signed-off-by: Rajan Vaja > --- > .../ABI/stable/sysfs-driver-zynqmp-firmware| 33 +++ > drivers/firmware/xilinx/zynqmp/Makefile| 2 +- > drivers/firmware/xilinx/zynqmp/firmware-ggs.c | 298 > + > drivers/firmware/xilinx/zynqmp/firmware.c | 26 ++ > include/linux/firmware/xilinx/zynqmp/firmware.h| 2 + > 5 files changed, 360 insertions(+), 1 deletion(-) > create mode 100644 Documentation/ABI/stable/sysfs-driver-zynqmp-firmware > create mode 100644 drivers/firmware/xilinx/zynqmp/firmware-ggs.c > > diff --git a/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware > b/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware > new file mode 100644 > index 000..2483215 > --- /dev/null > +++ b/Documentation/ABI/stable/sysfs-driver-zynqmp-firmware > @@ -0,0 +1,33 @@ > +What: /sys/devices/platform/zynqmp-firmware/ggs* > +Date: January 2018 > +KernelVersion: 4.15.0 > +Contact: "Jolly Shah" > +Description: > + Shows PMU global general storage register value, > + GLOBAL_GEN_STORAGE{0:3}. > + Global general storage register that can be used > + by system to pass information between masters. > + > + The register is reset during system or power-on > + resets. Three registers are used by the FSBL and > + other Xilinx software products: GLOBAL_GEN_STORAGE{4:6}. > + > +Users: Xilinx > + > +What: /sys/devices/platform/zynqmp-firmware/pggs* > +Date: January 2018 > +KernelVersion: 4.15.0 > +Contact: "Jolly Shah" > +Description: > + Shows PMU persistent global general storage register > + value, PERS_GLOB_GEN_STORAGE{0:3}. > + Persistent global general storage register that > + can be used by system to pass information between > + masters. > + > + This register is only reset by the power-on reset > + and maintains its value through a system reset. 
> + Four registers are used by the FSBL and other Xilinx > + software products: PERS_GLOB_GEN_STORAGE{4:7}. > + Register is reset only by a POR reset. > +Users: Xilinx > diff --git a/drivers/firmware/xilinx/zynqmp/Makefile > b/drivers/firmware/xilinx/zynqmp/Makefile > index c3ec669..6629781 100644 > --- a/drivers/firmware/xilinx/zynqmp/Makefile > +++ b/drivers/firmware/xilinx/zynqmp/Makefile > @@ -1,4 +1,4 @@ > # SPDX-License-Identifier: GPL-2.0+ > # Makefile for Xilinx firmwares > > -obj-$(CONFIG_ZYNQMP_FIRMWARE) += firmware.o > +obj-$(CONFIG_ZYNQMP_FIRMWARE) += firmware.o firmware-ggs.o > diff --git a/drivers/firmware/xilinx/zynqmp/firmware-ggs.c > b/drivers/firmware/xilinx/zynqmp/firmware-ggs.c > new file mode 100644 > index 000..be47ca2 > --- /dev/null > +++ b/drivers/firmware/xilinx/zynqmp/firmware-ggs.c > @@ -0,0 +1,298 @@ > +// SPDX-License-Identifier: GPL-2.0+ > +/* > + * Xilinx Zynq MPSoC Firmware layer > + * > + * Copyright (C) 2014-2018 Xilinx, Inc. > + * > + * Jolly Shah > + * Rajan Vaja > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include That's crazy deep nesting, why? > + > +static ssize_t read_register(char *buf, u32 ioctl_id, u32 reg) > +{ > + int ret; > + u32 ret_payload[PAYLOAD_ARG_CNT]; > + const struct zynqmp_eemi_ops *eemi_ops = get_eemi_ops(); > + > + if (!eemi_ops || !eemi_ops->ioctl) > + return 0; Not an error? > + > + ret = eemi_ops->ioctl(0, ioctl_id, reg, 0, ret_payload); > + if (ret) > + return ret; > + > + return snprintf(buf, PAGE_SIZE, "0x%x\n", ret_payload[1]); Minor nit, you never need to use snprintf() for a sysfs file, as you "know" the size and you can't overflow it with just a single value. 
Yeah, some tool-checkers hate to see a "raw" sprintf() call, but really, ignore them here :) > +} > + > +static ssize_t write_register(const char *buf, size_t count, > + u32 ioctl_id, u32 reg) > +{ > + char *kern_buff; > + char *inbuf; > + char *tok; > + long mask; > + long value; > + int ret; > + u32 ret_payload[PAYLOAD_ARG_CNT]; > + const struct zynqmp_eemi_ops *eemi_ops = get_eemi_ops(); > + > + if (!eemi_ops || !eemi_ops->ioctl) > + return -EFAULT; > + > + kern_buff = kzalloc(count, GFP_KERNEL); > + if (!kern_buff) > + return -ENOMEM; > + > + ret = strlcpy(kern_buff, buf, count); > + if (ret < 0)
[PATCH v8 4/5] x86/KASLR: Skip memory mirror handling if movable_node specified
In the kernel proper, the mirror feature is skipped if 'movable_node' is
specified. So skip the mirror feature in KASLR as well.

Acked-by: Baoquan He
Signed-off-by: Chao Fan
---
 arch/x86/boot/compressed/kaslr.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 8703cc764306..e4b487f0b7af 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -692,6 +692,7 @@ static bool
 process_efi_entries(unsigned long minimum, unsigned long image_size)
 {
 	struct efi_info *e = &boot_params->efi_info;
+	char *args = (char *)get_cmd_line_ptr();
 	bool efi_mirror_found = false;
 	struct mem_vector region;
 	efi_memory_desc_t *md;
@@ -725,6 +726,12 @@ process_efi_entries(unsigned long minimum, unsigned long image_size)
 		}
 	}

+#ifdef CONFIG_MEMORY_HOTPLUG
+	/* Skip memory mirror if 'movable_node' specified */
+	if (strstr(args, "movable_node"))
+		efi_mirror_found = false;
+#endif
+
 	for (i = 0; i < nr_desc; i++) {
 		md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);

-- 
2.14.3
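
As a side note on the command-line check in this patch, the user-space sketch below contrasts a plain strstr() test, which is what the hunk uses, with a stricter whole-token test. Both helper names are hypothetical, and the stricter variant is not something the patch does; it is shown only to make visible that a substring test also fires on longer words that merely contain the option name.

```c
#include <stdbool.h>
#include <string.h>

/* Substring test, the same kind of check as strstr(args, "movable_node"). */
static bool sketch_has_substring(const char *cmdline, const char *opt)
{
	return strstr(cmdline, opt) != NULL;
}

/* Stricter variant: opt must be a whole space-delimited token. */
static bool sketch_has_token(const char *cmdline, const char *opt)
{
	size_t len = strlen(opt);
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		bool starts = (p == cmdline) || (p[-1] == ' ');
		bool ends = (p[len] == '\0') || (p[len] == ' ');

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

Whether the stricter form is worth having in the decompression stage is a judgment call; the substring form is simpler and matches any spelling that contains the option.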
[PATCH v8 5/5] document: add document for kaslr_mem
Cc: linux-...@vger.kernel.org
Cc: Jonathan Corbet
Cc: Randy Dunlap
Signed-off-by: Chao Fan
---
 Documentation/admin-guide/kernel-parameters.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e2de7c006a74..e6de15715c4c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2350,6 +2350,16 @@
 			allocations which rules out almost all kernel
 			allocations. Use with caution!

+	kaslr_mem=nn[KMG][@ss[KMG]]
+			[KNL] Force usage of a specific region of memory
+			for KASLR during kernel decompression stage.
+			Region of usable memory is from ss to ss+nn. If ss
+			is omitted, it is equivalent to kaslr_mem=nn[KMG]@0.
+			Multiple regions can be specified, comma delimited.
+			Notice: only support 4 regions at most now.
+			Example:
+			kaslr_mem=1G,500M@2G,1G@4G
+
 	MTD_Partition=	[MTD]
 			Format: ,,,

-- 
2.14.3
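
To illustrate the documented syntax, here is a minimal user-space sketch of how a value such as 1G,500M@2G,1G@4G could be split into regions. It is an assumption-laden illustration, not the parser from this series: the names sketch_region, sketch_parse_size and sketch_parse_kaslr_mem are hypothetical, strtoull() stands in for whatever the decompressor actually uses, and the limit of four regions is passed in by the caller.

```c
#include <stdlib.h>

/* Hypothetical result type: one usable region [start, start + size). */
struct sketch_region {
	unsigned long long start;
	unsigned long long size;
};

/* Parse "nn[KMG]" and advance *s past it; leaves *s untouched if no digits. */
static unsigned long long sketch_parse_size(const char **s)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 0);

	if (end == *s)
		return 0;

	switch (*end) {
	case 'G':
	case 'g':
		v <<= 10;
		/* fall through */
	case 'M':
	case 'm':
		v <<= 10;
		/* fall through */
	case 'K':
	case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*s = end;
	return v;
}

/*
 * Parse "nn[KMG][@ss[KMG]],..." into at most max regions.
 * Returns the number of regions, or -1 on a syntax error or overflow of max.
 */
static int sketch_parse_kaslr_mem(const char *arg, struct sketch_region *out,
				  int max)
{
	int n = 0;

	while (*arg) {
		const char *before = arg;

		if (n == max)
			return -1;
		out[n].size = sketch_parse_size(&arg);
		if (arg == before)
			return -1;	/* size is mandatory */
		out[n].start = 0;	/* "@ss" omitted means start at 0 */
		if (*arg == '@') {
			arg++;
			before = arg;
			out[n].start = sketch_parse_size(&arg);
			if (arg == before)
				return -1;
		}
		n++;
		if (*arg == ',')
			arg++;
		else if (*arg != '\0')
			return -1;
	}
	return n;
}
```

With the documented example, the sketch yields three regions: 1G at 0, 500M at 2G, and 1G at 4G.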
Re: [PATCH v5] devres: combine function devm_ioremap*
On Tue, Jan 16, 2018 at 08:03:41PM +0800, Yisheng Xie wrote:
> When I tried to use devm_ioremap function and review related
> code, I found devm_ioremap_* almost have the similar realize
> with each other, which can be combined.
>
> In the former version, I have tried to kill ioremap_cache to
> reduce the size of devres, which can not work for ioremap is
> not the same as ioremap_nocache in some ARCHs likes ia64.
> Therefore, as the suggestion of Christophe, I introduce a help
> function __devm_ioremap, let devm_ioremap* inline and call
> __devm_ioremap with different devm_ioremap_type.
>
> After apply the patch, the size of devres.o can be reduce from
> 8216 Bytes to 7352Bytes in my compile environment.
>
> Suggested-by: Christophe LEROY
> Signed-off-by: Yisheng Xie
> ---
> v2:
> - use MARCO for ioremap
> v3:
> - kill dev_ioremap_nocache
> v4:
> - combine function devm_ioremap*
> v5:
> - fix code style.
>
> include/linux/io.h | 61 +++
> lib/devres.c | 84
> ++
> 2 files changed, 70 insertions(+), 75 deletions(-)
>
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 32e30e8..4d0a640 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -73,12 +73,61 @@ static inline void devm_ioport_unmap(struct device *dev,
> void __iomem *addr)
>
> #define IOMEM_ERR_PTR(err) (__force void __iomem *)ERR_PTR(err)
>
> -void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
> -resource_size_t size);
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t
> offset,
> -resource_size_t size);
> -void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
> -resource_size_t size);
> +enum devm_ioremap_type {
> +	DEVM_IOREMAP = 0,
> +	DEVM_IOREMAP_NC,
> +	DEVM_IOREMAP_WC,
> +};

Why do these types need to be in a public .h file? Why not just keep the
.h file as-is and then just put the cleanup in the .c file like you did?

thanks,

greg k-h
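
One way to read the suggestion above is sketched below as a stand-alone mock, not as the actual lib/devres.c change. All sketch_ names are hypothetical, the mapping itself is faked with a static buffer, and sketch_last_type exists only so the dispatch can be observed. The idea it illustrates: the enum and the common helper stay private to the .c file, while the three entry points keep their existing prototypes, so the public header does not need to change.

```c
#include <stddef.h>

/* Private to the .c file in this sketch: the header never sees the enum. */
enum sketch_ioremap_type {
	SKETCH_IOREMAP = 0,
	SKETCH_IOREMAP_NC,
	SKETCH_IOREMAP_WC,
};

/* Observation hook for the mock: which flavour was requested last. */
static int sketch_last_type = -1;

/* Common helper; a real one would pick the matching ioremap variant. */
static void *sketch_do_ioremap(unsigned long offset, size_t size,
			       enum sketch_ioremap_type type)
{
	static char fake_mmio[64];	/* stands in for the mapped region */

	(void)offset;
	if (size == 0 || size > sizeof(fake_mmio))
		return NULL;
	sketch_last_type = type;
	return fake_mmio;
}

/* The only symbols a header would have to declare, unchanged in shape. */
void *sketch_devm_ioremap(unsigned long offset, size_t size)
{
	return sketch_do_ioremap(offset, size, SKETCH_IOREMAP);
}

void *sketch_devm_ioremap_nocache(unsigned long offset, size_t size)
{
	return sketch_do_ioremap(offset, size, SKETCH_IOREMAP_NC);
}

void *sketch_devm_ioremap_wc(unsigned long offset, size_t size)
{
	return sketch_do_ioremap(offset, size, SKETCH_IOREMAP_WC);
}
```

The trade-off against the inline wrappers in the quoted patch is one extra out-of-line call per wrapper in exchange for an unchanged public header; which matters more is for the thread to settle.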
Re: [PATCH] x86/retpoline/entry: Disable the entire SYSCALL64 fast path with retpolines on
* Linus Torvalds wrote:

> On Mon, Jan 22, 2018 at 10:04 AM, Andy Lutomirski wrote:
> > The existing retpoline code carefully and awkwardly retpolinifies
> > the SYSCALL64 slow path. This stops the fast path from being
> > particularly fast, and it's IMO rather messy.
>
> I'm not convinced your patch isn't messier still.. It's certainly
> subtle. I had to look at that ptregs stub generator thing twice.
>
> Honestly, I'd rather get rid of the fast-path entirely. Compared to
> all the PTI mess, it's not even noticeable.
>
> And if we ever get CPU's that have this all fixed, we can re-visit
> introducing the fastpath. But this is all very messy and it doesn't
> seem worth it right now.
>
> If we get rid of the fastpath, we can lay out the slow path slightly
> better, and get rid of some of those jump-overs. And we'd get rid of
> the ptregs hooks entirely.
>
> So we can try to make the "slow" path better while at it, but I really
> don't think it matters much now in the post-PTI era. Sadly.

Note that there's another advantage to your proposal: should other
vulnerabilities arise in the future, requiring changes in the syscall entry
path, we'd be more flexible to address them in the C space than in the
assembly space.

In hindsight a _LOT_ of the PTI complexity and fragility centered around
interacting with x86 kernel entry assembly code - which entry code fortunately
got much simpler (and easier to review) in the past 1-2 years due to the
thorough cleanups and the conversion of most of it to C. But it was still
painful.

So I'm fully in favor of that.

Thanks,

	Ingo
[PATCH RFC 15/16] rcutorture: Add scripts to run experiments
From: Lihao LiangSigned-off-by: Lihao Liang --- kvm.sh | 452 + run-rcuperf.sh | 26 2 files changed, 478 insertions(+) create mode 100755 kvm.sh create mode 100755 run-rcuperf.sh diff --git a/kvm.sh b/kvm.sh new file mode 100755 index ..3b3c1b69 --- /dev/null +++ b/kvm.sh @@ -0,0 +1,452 @@ +#!/bin/bash +# +# Run a series of 14 tests under KVM. These are not particularly +# well-selected or well-tuned, but are the current set. Run from the +# top level of the source tree. +# +# Edit the definitions below to set the locations of the various directories, +# as well as the test duration. +# +# Usage: kvm.sh [ options ] +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, you can access it online at +# http://www.gnu.org/licenses/gpl-2.0.html. +# +# Copyright (C) IBM Corporation, 2011 +# +# Authors: Paul E. McKenney + +scriptname=$0 +args="$*" + +T=/tmp/kvm.sh.$$ +trap 'rm -rf $T' 0 +mkdir $T + +dur=$((30*60)) +dryrun="" +KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM +PATH=${KVM}/bin:$PATH; export PATH +TORTURE_DEFCONFIG=defconfig +TORTURE_BOOT_IMAGE="" +TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD +TORTURE_KMAKE_ARG="" +TORTURE_SHUTDOWN_GRACE=180 +TORTURE_SUITE=rcu +resdir="" +configs="" +cpus=0 +ds=`date +%Y.%m.%d-%H:%M:%S` +jitter="-1" + +. 
functions.sh + +usage () { + echo "Usage: $scriptname optional arguments:" + echo " --bootargs kernel-boot-arguments" + echo " --bootimage relative-path-to-kernel-boot-image" + echo " --buildonly" + echo " --configs \"config-file list w/ repeat factor (3*TINY01)\"" + echo " --cpus N" + echo " --datestamp string" + echo " --defconfig string" + echo " --dryrun sched|script" + echo " --duration minutes" + echo " --interactive" + echo " --jitter N [ maxsleep (us) [ maxspin (us) ] ]" + echo " --kmake-arg kernel-make-arguments" + echo " --mac nn:nn:nn:nn:nn:nn" + echo " --no-initrd" + echo " --qemu-args qemu-system-..." + echo " --qemu-cmd qemu-system-..." + echo " --results absolute-pathname" + echo " --torture rcu" + exit 1 +} + +while test $# -gt 0 +do + case "$1" in + --bootargs|--bootarg) + checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--' + TORTURE_BOOTARGS="$2" + shift + ;; + --bootimage) + checkarg --bootimage "(relative path to kernel boot image)" "$#" "$2" '[a-zA-Z0-9][a-zA-Z0-9_]*' '^--' + TORTURE_BOOT_IMAGE="$2" + shift + ;; + --buildonly) + TORTURE_BUILDONLY=1 + ;; + --configs|--config) + checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--' + configs="$2" + shift + ;; + --cpus) + checkarg --cpus "(number)" "$#" "$2" '^[0-9]*$' '^--' + cpus=$2 + shift + ;; + --datestamp) + checkarg --datestamp "(relative pathname)" "$#" "$2" '^[^/]*$' '^--' + ds=$2 + shift + ;; + --defconfig) + checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--' + TORTURE_DEFCONFIG=$2 + shift + ;; + --dryrun) + checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--' + dryrun=$2 + shift + ;; + --duration) + checkarg --duration "(minutes)" $# "$2" '^[0-9]*$' '^error' + dur=$(($2*60)) + shift + ;; + --interactive) + TORTURE_QEMU_INTERACTIVE=1; export TORTURE_QEMU_INTERACTIVE + ;; + --jitter) + checkarg --jitter "(# threads [ sleep [ spin ] ])" $# "$2" '^-\{,1\}[0-9]\+\( \+[0-9]\+\)\{,2\} *$' '^error$' + jitter="$2" + shift + ;; + 
--kmake-arg) + checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$' +
[PATCH RFC 13/16] prcu: Comment source code
From: Lihao LiangSigned-off-by: Lihao Liang --- include/linux/prcu.h | 73 - kernel/rcu/prcu.c| 178 +++ 2 files changed, 225 insertions(+), 26 deletions(-) diff --git a/include/linux/prcu.h b/include/linux/prcu.h index bb20fa40..9f740985 100644 --- a/include/linux/prcu.h +++ b/include/linux/prcu.h @@ -1,3 +1,11 @@ +/* + * Read-Copy Update mechanism for mutual exclusion (PRCU version). + * PRCU public definitions. + * + * Authors: Heng Zhang + * Lihao Liang + */ + #ifndef __LINUX_PRCU_H #define __LINUX_PRCU_H @@ -8,12 +16,26 @@ #include #ifdef CONFIG_PRCU + +/* + * Simple list structure of callback versions. + * + * Note: Ideally, we would like to add the version field + * to the rcu_head struct. But if we do so, other users of + * rcu_head in the Linux kernel will complain hard and loudly. + */ struct prcu_version_head { unsigned long long version; struct prcu_version_head *next; }; -/* Simple unsegmented callback list for PRCU. */ +/* + * Simple unsegmented callback list for PRCU. + * + * Note: Since we can't add a new version field to rcu_head, + * we have to make our own callback list for PRCU instead of + * using the existing rcu_cblist. Sigh! + */ struct prcu_cblist { struct rcu_head *head; struct rcu_head **tail; @@ -27,31 +49,47 @@ struct prcu_cblist { .version_head = NULL, .version_tail = _head, \ } +/* + * PRCU's per-CPU state. 
+ */ struct prcu_local_struct { - unsigned int locked; - unsigned int online; - unsigned long long version; - unsigned long long cb_version; - struct rcu_head barrier_head; - struct prcu_cblist cblist; + unsigned int locked; /* Nesting level of PRCU read-side */ + /* critcal sections */ + unsigned int online; /* Indicates whether a context-switch */ + /* has occurred on this CPU */ + unsigned long long version;/* Local grace-period version */ + unsigned long long cb_version; /* Local callback version */ + struct rcu_head barrier_head; /* PRCU callback list */ + struct prcu_cblist cblist; /* PRCU callback version list */ }; +/* + * PRCU's global state. + */ struct prcu_struct { - atomic64_t global_version; - atomic64_t cb_version; - atomic_t active_ctr; - atomic_t barrier_cpu_count; - struct mutex mtx; - struct mutex barrier_mtx; - wait_queue_head_t wait_q; - struct completion barrier_completion; + atomic64_t global_version;/* Global grace-period version */ + atomic64_t cb_version;/* Global callback version */ + atomic_t active_ctr; /* Outstanding PRCU tasks */ + /* being context-switched */ + atomic_t barrier_cpu_count; /* # CPUs waiting on prcu_barrier() */ + struct mutex mtx; /* Serialize synchronize_prcu() */ + struct mutex barrier_mtx; /* Serialize prcu_barrier() */ + wait_queue_head_t wait_q; /* Wait for synchronize_prcu() */ + struct completion barrier_completion; /* Wait for prcu_barrier() */ }; +/* + * PRCU APIs. + */ void prcu_read_lock(void); void prcu_read_unlock(void); void synchronize_prcu(void); void call_prcu(struct rcu_head *head, rcu_callback_t func); void prcu_barrier(void); + +/* + * Internal non-public functions. + */ void prcu_init(void); void prcu_note_context_switch(void); int prcu_pending(void); @@ -60,11 +98,16 @@ void prcu_check_callbacks(void); #else /* #ifdef CONFIG_PRCU */ +/* + * If CONFIG_PRCU is not defined, + * map its APIs to RCU's counterparts. 
+ */ #define prcu_read_lock rcu_read_lock #define prcu_read_unlock rcu_read_unlock #define synchronize_prcu synchronize_rcu #define call_prcu call_rcu #define prcu_barrier rcu_barrier + #define prcu_init() do {} while (0) #define prcu_note_context_switch() do {} while (0) #define prcu_pending() 0 diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c index 49cb70e6..ef2c7730 100644 --- a/kernel/rcu/prcu.c +++ b/kernel/rcu/prcu.c @@ -1,3 +1,17 @@ +/* + * Read-Copy Update mechanism for mutual exclusion (PRCU version). + * This PRCU implementation is based on a fast consensus protocol + * published in the following paper: + * + * Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization. + * Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan. + * IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016. + * https://dl.acm.org/citation.cfm?id=3024114.3024143 + * + * Authors: Heng Zhang + * Lihao Liang + */ + #include #include #include
[PATCH RFC 08/16] prcu: Implement PRCU callback processing
From: Lihao LiangCurrently, PRCU core processing only consists of callback processing in prcu_process_callbacks(), which is triggered by the scheduling-clock interrupt. Reviewed-by: Heng Zhang Signed-off-by: Lihao Liang --- include/linux/interrupt.h | 3 ++ include/linux/prcu.h | 8 + kernel/rcu/prcu.c | 86 +++ kernel/rcu/tree.c | 1 + kernel/time/timer.c | 2 ++ 5 files changed, 100 insertions(+) diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index 0991f973..f05ef62a 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -456,6 +456,9 @@ enum SCHED_SOFTIRQ, HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the numbering. Sigh! */ +#ifdef CONFIG_PRCU + PRCU_SOFTIRQ, +#endif RCU_SOFTIRQ,/* Preferable RCU should always be the last softirq */ NR_SOFTIRQS diff --git a/include/linux/prcu.h b/include/linux/prcu.h index e5e09c9b..4e7d5d65 100644 --- a/include/linux/prcu.h +++ b/include/linux/prcu.h @@ -31,11 +31,13 @@ struct prcu_local_struct { unsigned int locked; unsigned int online; unsigned long long version; + unsigned long long cb_version; struct prcu_cblist cblist; }; struct prcu_struct { atomic64_t global_version; + atomic64_t cb_version; atomic_t active_ctr; struct mutex mtx; wait_queue_head_t wait_q; @@ -48,6 +50,9 @@ void synchronize_prcu(void); void call_prcu(struct rcu_head *head, rcu_callback_t func); void prcu_init(void); void prcu_note_context_switch(void); +int prcu_pending(void); +void invoke_prcu_core(void); +void prcu_check_callbacks(void); #else /* #ifdef CONFIG_PRCU */ @@ -57,6 +62,9 @@ void prcu_note_context_switch(void); #define call_prcu() do {} while (0) #define prcu_init() do {} while (0) #define prcu_note_context_switch() do {} while (0) +#define prcu_pending() 0 +#define invoke_prcu_core() do {} while (0) +#define prcu_check_callbacks() do {} while (0) #endif /* #ifdef CONFIG_PRCU */ #endif /* __LINUX_PRCU_H */ diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c index f198285c..373039c5 100644 --- 
a/kernel/rcu/prcu.c +++ b/kernel/rcu/prcu.c @@ -1,6 +1,7 @@ #include #include #include +#include #include #include #include @@ -11,6 +12,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local); struct prcu_struct global_prcu = { .global_version = ATOMIC64_INIT(0), + .cb_version = ATOMIC64_INIT(0), .active_ctr = ATOMIC_INIT(0), .mtx = __MUTEX_INITIALIZER(global_prcu.mtx), .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q) @@ -27,6 +29,35 @@ static void prcu_cblist_init(struct prcu_cblist *rclp) rclp->len = 0; } +/* + * Dequeue the oldest rcu_head structure from the specified callback list; + * store the callback grace period version number into the version pointer. + */ +static struct rcu_head *prcu_cblist_dequeue(struct prcu_cblist *rclp) +{ + struct rcu_head *rhp; + struct prcu_version_head *vhp; + + rhp = rclp->head; + if (!rhp) { + WARN_ON(vhp); + WARN_ON(rclp->len); + return NULL; + } + + vhp = rclp->version_head; + rclp->version_head = vhp->next; + rclp->head = rhp->next; + rclp->len--; + + if (!rclp->head) { + rclp->tail = >head; + rclp->version_tail = >version_head; + } + + return rhp; +} + static inline void prcu_report(struct prcu_local_struct *local) { unsigned long long global_version; @@ -117,6 +148,7 @@ void synchronize_prcu(void) if (atomic_read(>active_ctr)) wait_event(prcu->wait_q, !atomic_read(>active_ctr)); + atomic64_set(>cb_version, version); mutex_unlock(>mtx); } EXPORT_SYMBOL(synchronize_prcu); @@ -166,6 +198,58 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func) } EXPORT_SYMBOL(call_prcu); +int prcu_pending(void) +{ + struct prcu_local_struct *local = get_cpu_ptr(_local); + unsigned long long cb_version = local->cb_version; + struct prcu_cblist *rclp = >cblist; + + put_cpu_ptr(_local); + return cb_version < atomic64_read(>cb_version) && rclp->head; +} + +void invoke_prcu_core(void) +{ + if (cpu_online(smp_processor_id())) + raise_softirq(PRCU_SOFTIRQ); +} + +void prcu_check_callbacks(void) +{ + if 
(prcu_pending()) + invoke_prcu_core(); +} + +static __latent_entropy void prcu_process_callbacks(struct softirq_action *unused) +{ + unsigned long flags; + unsigned long long cb_version; + struct prcu_local_struct *local; + struct prcu_cblist *rclp; + struct rcu_head *rhp; + struct prcu_version_head *vhp; + + if
[PATCH RFC 09/16] prcu: Implement prcu_barrier() API
From: Lihao Liang

This is PRCU's counterpart of RCU's rcu_barrier() API.

Reviewed-by: Heng Zhang
Signed-off-by: Lihao Liang
---
 include/linux/prcu.h |  7 ++
 kernel/rcu/prcu.c    | 63 
 2 files changed, 70 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 4e7d5d65..cce967fd 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 
 #define CONFIG_PRCU
 
@@ -32,6 +33,7 @@ struct prcu_local_struct {
 	unsigned int online;
 	unsigned long long version;
 	unsigned long long cb_version;
+	struct rcu_head barrier_head;
 	struct prcu_cblist cblist;
 };
 
@@ -39,8 +41,11 @@ struct prcu_struct {
 	atomic64_t global_version;
 	atomic64_t cb_version;
 	atomic_t active_ctr;
+	atomic_t barrier_cpu_count;
 	struct mutex mtx;
+	struct mutex barrier_mtx;
 	wait_queue_head_t wait_q;
+	struct completion barrier_completion;
 };
 
 #ifdef CONFIG_PRCU
@@ -48,6 +53,7 @@ void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
+void prcu_barrier(void);
 void prcu_init(void);
 void prcu_note_context_switch(void);
 int prcu_pending(void);
@@ -60,6 +66,7 @@ void prcu_check_callbacks(void);
 #define prcu_read_unlock() do {} while (0)
 #define synchronize_prcu() do {} while (0)
 #define call_prcu() do {} while (0)
+#define prcu_barrier() do {} while (0)
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 373039c5..2664d091 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -15,6 +15,7 @@ struct prcu_struct global_prcu = {
 	.cb_version = ATOMIC64_INIT(0),
 	.active_ctr = ATOMIC_INIT(0),
 	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+	.barrier_mtx = __MUTEX_INITIALIZER(global_prcu.barrier_mtx),
 	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
 };
 struct prcu_struct *prcu = &global_prcu;
@@ -250,6 +251,68 @@ static __latent_entropy void prcu_process_callbacks(struct softirq_action *unuse
 	local_irq_restore(flags);
 }
 
+/*
+ * PRCU callback function for prcu_barrier().
+ * If we are last, wake up the task executing prcu_barrier().
+ */
+static void prcu_barrier_callback(struct rcu_head *rhp)
+{
+	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+		complete(&prcu->barrier_completion);
+}
+
+/*
+ * Called with preemption disabled, and from cross-cpu IRQ context.
+ */
+static void prcu_barrier_func(void *info)
+{
+	struct prcu_local_struct *local = this_cpu_ptr(&prcu_local);
+
+	atomic_inc(&prcu->barrier_cpu_count);
+	call_prcu(&local->barrier_head, prcu_barrier_callback);
+}
+
+/* Waiting for all PRCU callbacks to complete. */
+void prcu_barrier(void)
+{
+	int cpu;
+
+	/* Take mutex to serialize concurrent prcu_barrier() requests. */
+	mutex_lock(&prcu->barrier_mtx);
+
+	/*
+	 * Initialize the count to one rather than to zero in order to
+	 * avoid a too-soon return to zero in case of a short grace period
+	 * (or preemption of this task).
+	 */
+	init_completion(&prcu->barrier_completion);
+	atomic_set(&prcu->barrier_cpu_count, 1);
+
+	/*
+	 * Register a new callback on each CPU using IPI to prevent races
+	 * with call_prcu(). When that callback is invoked, we will know
+	 * that all of the corresponding CPU's preceding callbacks have
+	 * been invoked.
+	 */
+	for_each_possible_cpu(cpu)
+		smp_call_function_single(cpu, prcu_barrier_func, NULL, 1);
+
+	/* Decrement the count as we initialize it to one. */
+	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+		complete(&prcu->barrier_completion);
+
+	/*
+	 * Now that we have a prcu_barrier_callback() callback on each
+	 * CPU, and thus each counted, remove the initial count.
+	 * Wait for all prcu_barrier_callback() callbacks to be invoked.
+	 */
+	wait_for_completion(&prcu->barrier_completion);
+
+	/* Other prcu_barrier() invocations can now safely proceed. */
+	mutex_unlock(&prcu->barrier_mtx);
+}
+EXPORT_SYMBOL(prcu_barrier);
+
 void prcu_init_local_struct(int cpu)
 {
 	struct prcu_local_struct *local;
-- 
2.14.1.729.g59c0ea183
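The reason prcu_barrier() starts barrier_cpu_count at one rather than zero can be shown in a minimal single-threaded userspace sketch. This is only an illustration under stated assumptions: C11 atomics stand in for atomic_t, a plain flag stands in for the completion, and the names barrier_count, completed, callback_done() and run_barrier() are hypothetical and not part of the patch. The loop models the worst case in which each CPU's barrier callback runs immediately after it is registered.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int barrier_count;	/* stands in for prcu->barrier_cpu_count */
static bool completed;			/* stands in for prcu->barrier_completion */

/* Models prcu_barrier_callback(): signal completion when the count hits zero. */
static void callback_done(void)
{
	/* atomic_fetch_sub() returns the old value, so 1 means "now zero". */
	if (atomic_fetch_sub(&barrier_count, 1) == 1)
		completed = true;
}

/*
 * Register one callback per CPU and let it run at once (worst case).
 * Returns true if completion was signalled before every CPU had been
 * visited, which is the premature wakeup the initial count prevents.
 */
static bool run_barrier(bool hold_initial_ref, int ncpus)
{
	bool fired_early = false;

	completed = false;
	atomic_store(&barrier_count, hold_initial_ref ? 1 : 0);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		atomic_fetch_add(&barrier_count, 1);	/* register on this CPU */
		callback_done();			/* callback runs immediately */
		if (completed && cpu < ncpus - 1)
			fired_early = true;
	}

	if (hold_initial_ref)
		callback_done();	/* drop the initial reference */

	return fired_early;
}
```

In this model, run_barrier(false, 4) signals completion after the first CPU, while run_barrier(true, 4) signals it only once the initial reference is dropped after the loop.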
[PATCH RFC 07/16] prcu: Implement call_prcu() API
From: Lihao LiangThis is PRCU's counterpart of RCU's call_rcu() API. Reviewed-by: Heng Zhang Signed-off-by: Lihao Liang --- include/linux/prcu.h | 25 init/main.c | 2 ++ kernel/rcu/prcu.c| 67 +--- 3 files changed, 91 insertions(+), 3 deletions(-) diff --git a/include/linux/prcu.h b/include/linux/prcu.h index 653b4633..e5e09c9b 100644 --- a/include/linux/prcu.h +++ b/include/linux/prcu.h @@ -2,15 +2,36 @@ #define __LINUX_PRCU_H #include +#include #include #include #define CONFIG_PRCU +struct prcu_version_head { + unsigned long long version; + struct prcu_version_head *next; +}; + +/* Simple unsegmented callback list for PRCU. */ +struct prcu_cblist { + struct rcu_head *head; + struct rcu_head **tail; + struct prcu_version_head *version_head; + struct prcu_version_head **version_tail; + long len; +}; + +#define PRCU_CBLIST_INITIALIZER(n) { \ + .head = NULL, .tail = , \ + .version_head = NULL, .version_tail = _head, \ +} + struct prcu_local_struct { unsigned int locked; unsigned int online; unsigned long long version; + struct prcu_cblist cblist; }; struct prcu_struct { @@ -24,6 +45,8 @@ struct prcu_struct { void prcu_read_lock(void); void prcu_read_unlock(void); void synchronize_prcu(void); +void call_prcu(struct rcu_head *head, rcu_callback_t func); +void prcu_init(void); void prcu_note_context_switch(void); #else /* #ifdef CONFIG_PRCU */ @@ -31,6 +54,8 @@ void prcu_note_context_switch(void); #define prcu_read_lock() do {} while (0) #define prcu_read_unlock() do {} while (0) #define synchronize_prcu() do {} while (0) +#define call_prcu() do {} while (0) +#define prcu_init() do {} while (0) #define prcu_note_context_switch() do {} while (0) #endif /* #ifdef CONFIG_PRCU */ diff --git a/init/main.c b/init/main.c index f8665104..4925964e 100644 --- a/init/main.c +++ b/init/main.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void) workqueue_init_early(); rcu_init(); + 
prcu_init(); /* Trace events are available after this */ trace_init(); diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c index a00b9420..f198285c 100644 --- a/kernel/rcu/prcu.c +++ b/kernel/rcu/prcu.c @@ -1,11 +1,12 @@ #include -#include #include -#include +#include #include - +#include #include +#include "rcu.h" + DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local); struct prcu_struct global_prcu = { @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = { }; struct prcu_struct *prcu = _prcu; +/* Initialize simple callback list. */ +static void prcu_cblist_init(struct prcu_cblist *rclp) +{ + rclp->head = NULL; + rclp->tail = >head; + rclp->version_head = NULL; + rclp->version_tail = >version_head; + rclp->len = 0; +} + static inline void prcu_report(struct prcu_local_struct *local) { unsigned long long global_version; @@ -123,3 +134,53 @@ void prcu_note_context_switch(void) prcu_report(local); put_cpu_ptr(_local); } + +void call_prcu(struct rcu_head *head, rcu_callback_t func) +{ + unsigned long flags; + struct prcu_local_struct *local; + struct prcu_cblist *rclp; + struct prcu_version_head *vhp; + + debug_rcu_head_queue(head); + + /* Use GFP_ATOMIC with IRQs disabled */ + vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC); + if (!vhp) + return; + + head->func = func; + head->next = NULL; + vhp->next = NULL; + + local_irq_save(flags); + local = this_cpu_ptr(_local); + vhp->version = local->version; + rclp = >cblist; + rclp->len++; + *rclp->tail = head; + rclp->tail = >next; + *rclp->version_tail = vhp; + rclp->version_tail = >next; + local_irq_restore(flags); +} +EXPORT_SYMBOL(call_prcu); + +void prcu_init_local_struct(int cpu) +{ + struct prcu_local_struct *local; + + local = per_cpu_ptr(_local, cpu); + local->locked = 0; + local->online = 0; + local->version = 0; + prcu_cblist_init(>cblist); +} + +void __init prcu_init(void) +{ + int cpu; + + for_each_possible_cpu(cpu) + prcu_init_local_struct(cpu); +} -- 2.14.1.729.g59c0ea183
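The tail-pointer queue that call_prcu() appends to can be sketched in userspace C. This is a simplified illustration, not the patch's code: the separate version list is folded into the node, there is no IRQ protection, and struct node, struct cblist and the cblist_*() helpers are hypothetical names. The point it shows is that keeping a pointer to the last next field makes enqueue O(1) without special-casing an empty list, as long as dequeue resets the tail when the list drains.

```c
#include <stddef.h>

struct node {
	struct node *next;
	unsigned long long version;	/* grace-period version at enqueue time */
};

struct cblist {
	struct node *head;
	struct node **tail;	/* points at head, or at the last node's next */
	long len;
};

static void cblist_init(struct cblist *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Append at the tail; works for empty and non-empty lists alike. */
static void cblist_enqueue(struct cblist *l, struct node *n)
{
	n->next = NULL;
	*l->tail = n;
	l->tail = &n->next;
	l->len++;
}

/* Remove the oldest node, or return NULL if the list is empty. */
static struct node *cblist_dequeue(struct cblist *l)
{
	struct node *n = l->head;

	if (!n)
		return NULL;

	l->head = n->next;
	l->len--;
	if (!l->head)
		l->tail = &l->head;	/* list drained: tail must not dangle */

	return n;
}
```

Nodes come back in the order they were enqueued, and after the last dequeue the tail points at head again, so a later enqueue is safe.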
[PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run
From: Lihao Liang

Signed-off-by: Lihao Liang
---
 kernel/rcu/rcuperf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index ea80fa3e..baccc123 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney ");
 #define VERBOSE_PERFOUT_ERRSTRING(s) \
 	do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
 
-torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
+torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
 torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
 torture_param(int, nreaders, -1, "Number of RCU reader threads");
 torture_param(int, nwriters, -1, "Number of RCU updater threads");
-- 
2.14.1.729.g59c0ea183
[PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts
From: Lihao Liang

This commit adds support of the qemu command qemu-system-aarch64 to
rcutorture.

Signed-off-by: Lihao Liang
---
 tools/testing/selftests/rcutorture/bin/functions.sh | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 1426a9b9..4a24b873 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -111,6 +111,9 @@ identify_boot_image () {
 		qemu-system-x86_64|qemu-system-i386)
 			echo arch/x86/boot/bzImage
 			;;
+		qemu-system-aarch64)
+			echo arch/arm64/boot/Image
+			;;
 		*)
 			echo vmlinux
 			;;
@@ -133,6 +136,9 @@ identify_qemu () {
 	elif echo $u | grep -q "Intel 80386"
 	then
 		echo qemu-system-i386
+	elif echo $u | grep -q aarch64
+	then
+		echo qemu-system-aarch64
 	elif uname -a | grep -q ppc64
 	then
 		echo qemu-system-ppc64
@@ -151,16 +157,20 @@ identify_qemu () {
 # Output arguments for the qemu "-append" string based on CPU type
 # and the TORTURE_QEMU_INTERACTIVE environment variable.
 identify_qemu_append () {
+	local console=ttyS0
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
 		echo noapic selinux=0 initcall_debug debug
 		;;
+	qemu-system-aarch64)
+		console=ttyAMA0
+		;;
 	esac
 	if test -n "$TORTURE_QEMU_INTERACTIVE"
 	then
 		echo root=/dev/sda
 	else
-		echo console=ttyS0
+		echo console=$console
 	fi
 }
 
@@ -172,6 +182,9 @@ identify_qemu_args () {
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
 		;;
+	qemu-system-arm|qemu-system-aarch64)
+		echo -machine virt,gic-version=host -cpu host
+		;;
 	qemu-system-ppc64)
 		echo -enable-kvm -M pseries -nodefaults
 		echo -device spapr-vscsi
@@ -229,7 +242,7 @@ specify_qemu_cpus () {
 		echo $2
 	else
 		case "$1" in
-		qemu-system-x86_64|qemu-system-i386)
+		qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
 			echo $2 -smp $3
 			;;
 		qemu-system-ppc64)
-- 
2.14.1.729.g59c0ea183
Re: [PATCH v3 11/20] arm64: mm: Map entry trampoline into trampoline and kernel page tables
Hi Will,

On 2017/12/6 20:35, Will Deacon wrote:
> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
> +static int __init map_entry_trampoline(void)
> +{
> +	extern char __entry_tramp_text_start[];
> +
> +	pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
> +	phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
> +
> +	/* The trampoline is always mapped and can therefore be global */
> +	pgprot_val(prot) &= ~PTE_NG;
> +
> +	/* Map only the text into the trampoline page table */
> +	memset(tramp_pg_dir, 0, PGD_SIZE);
> +	__create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> +			     prot, pgd_pgtable_alloc, 0);

How is tramp_pg_dir used? Should it be set to ttbr1 when exiting the
kernel? Sorry, I could not find where it is used.

Thanks
Yisheng

> +
> +	/* ...as well as the kernel page table */
> +	__set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> +	return 0;
> +}
> +core_initcall(map_entry_trampoline);
> +#endif
> +
>  /*
>   * Create fine-grained mappings for the kernel.
>   */
>
Re: [PATCH v2 4/4] drivers: firmware: xilinx: Add debugfs interface
On Wed, Jan 17, 2018 at 12:20:34PM -0800, Jolly Shah wrote:
> +/* Setup debugfs fops */
> +static const struct file_operations fops_zynqmp_pm_dbgfs = {
> +	.owner = THIS_MODULE,
> +	.write = zynqmp_pm_debugfs_api_write,
> +	.read = zynqmp_pm_debugfs_api_version_read,
> +};
> +
> +/**
> + * zynqmp_pm_api_debugfs_init - Initialize debugfs interface
> + *
> + * Return: Returns 0 on success
> + *	   Corresponding error code otherwise
> + */
> +int zynqmp_pm_api_debugfs_init(void)
> +{
> +	int err;
> +
> +	/* Initialize debugfs interface */
> +	zynqmp_pm_debugfs_dir = debugfs_create_dir(DRIVER_NAME, NULL);
> +	if (!zynqmp_pm_debugfs_dir) {
> +		pr_err("debugfs_create_dir failed\n");
> +		return -ENODEV;
> +	}

No, you should NEVER care if a debugfs call returned an error or not,
no need to check it at all. Your code path should not change based on
the return value, as no code should depend on the functionality of
debugfs.

Any error returned by a debugfs call can be passed right back into it
with no problems, so again, no need to check this.

> +
> +	zynqmp_pm_debugfs_power =
> +		debugfs_create_file("pm", 0220,
> +				    zynqmp_pm_debugfs_dir, NULL,
> +				    &fops_zynqmp_pm_dbgfs);
> +	if (!zynqmp_pm_debugfs_power) {
> +		pr_err("debugfs_create_file power failed\n");
> +		err = -ENODEV;
> +		goto err_dbgfs;
> +	}
> +
> +	zynqmp_pm_debugfs_api_version =
> +		debugfs_create_file("api_version", 0444,
> +				    zynqmp_pm_debugfs_dir, NULL,
> +				    &fops_zynqmp_pm_dbgfs);
> +	if (!zynqmp_pm_debugfs_api_version) {
> +		pr_err("debugfs_create_file api_version failed\n");
> +		err = -ENODEV;
> +		goto err_dbgfs;
> +	}

Why do you save these dentries at all anyway? You never do anything
with them, just create the files and away you go, no need to worry
about anything.

Remember, debugfs was created to be very simple to use, don't make it
more complex than it has to be please.

thanks,

greg k-h
Re: [PATCH 2/5] pinctrl: stm32: add STM32F769 MCU support
On 01/22/2018 09:25 AM, Linus Walleij wrote:
> On Mon, Dec 11, 2017 at 9:54 AM, Alexandre Torgue wrote:
>
>> This patch which adds STM32F769 pinctrl and GPIO support, relies on the
>> generic STM32 pinctrl driver.
>>
>> Signed-off-by: Alexandre Torgue
>
> Patch applied as Patrice poked me.
>
> I hope it works fine being applied in isolation from the other patches?

Yes it does. I will add other patches in my next pull request (for v4.17).

Thanks
Alex

> Yours,
> Linus Walleij
Re: [PATCH arm/aspeed/ast2500 v1] eSPI: add Aspeed AST2500 eSPI driver to boot a host with PCH runs on eSPI
On Tue, Jan 16, 2018 at 07:52:32PM +0800, Haiyue Wang wrote:
> When PCH works under eSPI mode, the PMC (Power Management Controller) in
> PCH is waiting for SUS_ACK from BMC after it alerts SUS_WARN. It is in
> dead loop if no SUS_ACK assert. This is the basic requirement for the BMC
> works as eSPI slave.
>
> Also for the host power on / off actions, from BMC side, the following VW
> (Virtual Wire) messages are done in firmware:
> 1. SLAVE_BOOT_LOAD_DONE / SLAVE_BOOT_LOAD_STATUS
> 2. SUS_ACK
> 3. OOB_RESET_ACK
> 4. HOST_RESET_ACK
>
> Signed-off-by: Haiyue Wang
> ---
>  .../devicetree/bindings/misc/aspeed-espi-slave.txt |  20 ++
>  Documentation/misc-devices/espi-slave.rst          | 114 +

DT files need to be split out into a separate patch so that the DT
maintainers can properly review them.

> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -471,6 +471,17 @@ config VEXPRESS_SYSCFG
>  	  ARM Ltd. Versatile Express uses specialised platform configuration
>  	  bus. System Configuration interface is one of the possible means
>  	  of generating transactions on this bus.
> +config ASPEED_ESPI_SLAVE

You need a blank line above this one please.

> +	depends on ARCH_ASPEED || COMPILE_TEST
> +	select REGMAP_MMIO

Select or depend?

> +	tristate "Aspeed ast2500 eSPI slave device"
> +	---help---
> +	  This allows host to access Baseboard Management Controller (BMC) over the
> +	  Enhanced Serial Peripheral Interface (eSPI) bus, which replaces the Low Pin
> +	  Count (LPC) bus.
> +
> +	  Its interface supports peripheral, virtual wire, out-of-band, and flash
> +	  sharing channels.

What is the module name?

>
>  config ASPEED_LPC_CTRL
>  	depends on (ARCH_ASPEED || COMPILE_TEST) && REGMAP && MFD_SYSCON
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 5ca5f64..a1081f4 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -52,6 +52,7 @@ obj-$(CONFIG_GENWQE)		+= genwqe/
>  obj-$(CONFIG_ECHO)		+= echo/
>  obj-$(CONFIG_VEXPRESS_SYSCFG)	+= vexpress-syscfg.o
>  obj-$(CONFIG_CXL_BASE)		+= cxl/
> +obj-$(CONFIG_ASPEED_ESPI_SLAVE) += aspeed-espi-slave.o

Why no tab?

thanks,

greg k-h
Re: [PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist
On Tue, Jan 23, 2018 at 04:31:11PM +0800, sean.w...@mediatek.com wrote:
> From: Sean Wang
>
> commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream.
>
> commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq
> device with OPP v2") not added MediaTek SoCs to the blacklist that would
> lead to cause an occasional hang or unexpected behaviors on related boards
> as kernelci reported and complained on [1] specifically for 4.14 and 4.15
> tree.
>
> For those reasons, add MediaTek SoCs into cpufreq-dt blacklist and wish
> the patch be applied to 4.14 and 4.15 tree to allow kernelci able to
> complete following automated kernel testing.
>
> [1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/
>
> Fixes: edeec420de24 (cpufreq: dt-cpufreq: platdev Automatically create device with OPP v2)
> Signed-off-by: Andrew-sh Cheng
> Signed-off-by: Sean Wang
> Cc: Kevin Hilman
> ---
>  drivers/cpufreq/cpufreq-dt-platdev.c | 8 
>  1 file changed, 8 insertions(+)

What stable kernel tree(s) are you wanting this backported to?

thanks,

greg k-h
[PATCH 2/2] ARM: dts: imx6sx: add ARM power domain support
Add ARM power domain in PGC.

Signed-off-by: Anson Huang
---
this patch should be based on 0001-ARM-dts-imx6sx-add-pu-power-domain-support.patch

 arch/arm/boot/dts/imx6sx.dtsi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/imx6sx.dtsi b/arch/arm/boot/dts/imx6sx.dtsi
index 42ef4c6..aa29ca6 100644
--- a/arch/arm/boot/dts/imx6sx.dtsi
+++ b/arch/arm/boot/dts/imx6sx.dtsi
@@ -768,6 +768,11 @@
 				#address-cells = <1>;
 				#size-cells = <0>;
 
+				power-domain@0 {
+					reg = <0>;
+					#power-domain-cells = <0>;
+				};
+
 				pd_pu: power-domain@1 {
 					reg = <1>;
 					#power-domain-cells = <0>;
-- 
2.7.4
[PATCH 1/2] soc: imx: gpc: ARM power domain should be always-on
The ARM power domain does NOT support runtime off, so the always-on
flag should be set to avoid an incorrect power state in
pm_genpd_summary:

Before:
root@imx6qpdlsolox:~# cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
domain                          status          slaves
    /device                                             runtime status
----------------------------------------------------------------------
ARM                             off-0

After:
root@imx6qpdlsolox:~# cat /sys/kernel/debug/pm_genpd/pm_genpd_summary
domain                          status          slaves
    /device                                             runtime status
----------------------------------------------------------------------
ARM                             on

Signed-off-by: Anson Huang
---
 drivers/soc/imx/gpc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/soc/imx/gpc.c b/drivers/soc/imx/gpc.c
index 53f7275..6cafa9b 100644
--- a/drivers/soc/imx/gpc.c
+++ b/drivers/soc/imx/gpc.c
@@ -254,6 +254,7 @@ static struct imx_pm_domain imx_gpc_domains[] = {
 	{
 		.base = {
 			.name = "ARM",
+			.flags = GENPD_FLAG_ALWAYS_ON,
 		},
 	}, {
 		.base = {
-- 
2.7.4
Re: [PATCH] Documentation/ABI: clean up sysfs-class-pktcdvd
On Tue, 23 Jan 2018, Aishwarya Pant wrote: > Clean up the sysfs documentation such that it is in the same format as > described in Documentation/ABI/README. Mainly, the patch moves the > attribute names to the 'What:' field. This might be useful for scripting > and tracking changes in the ABI. > > Signed-off-by: Aishwarya Pant> --- > Documentation/ABI/testing/sysfs-class-pktcdvd | 122 > +++--- > 1 file changed, 71 insertions(+), 51 deletions(-) > > diff --git a/Documentation/ABI/testing/sysfs-class-pktcdvd > b/Documentation/ABI/testing/sysfs-class-pktcdvd > index b1c3f0263359..e85ec99c6e31 100644 > --- a/Documentation/ABI/testing/sysfs-class-pktcdvd > +++ b/Documentation/ABI/testing/sysfs-class-pktcdvd > @@ -1,60 +1,80 @@ > -What: /sys/class/pktcdvd/ > +sysfs interface > +--- > +The pktcdvd module (packet writing driver) creates the following files in the > +sysfs: ( is in format major:minor) > + > +What: /sys/class/pktcdvd/add > +What: /sys/class/pktcdvd/remove > +What: /sys/class/pktcdvd/device_map > Date: Oct. 2006 > KernelVersion: 2.6.20 > Contact:Thomas Maier > Description: > > -sysfs interface > > + add:(WO) Write a block device id (major:minor) to create > + a new pktcdvd device and map it to the block device. > + > + remove: (WO) Write the pktcdvd device id (major:minor) to > it to > + remove the pktcdvd device. > + > + device_map: (RO) Shows the device mapping in format: > + pktcdvd[0-7] > + > + > +What: /sys/class/pktcdvd/pktcdvd[0-7]/dev > +What:/sys/class/pktcdvd/pktcdvd[0-7]/uevent It looks like there is a small alignment problem here. Maybe you use spaces in one case and tabs in the other. julia > +Date: Oct. 
2006 > +KernelVersion: 2.6.20 > +Contact:Thomas Maier > +Description: > + dev:(RO) Device id > + > + uevent: (WO) To send an uevent > + > + > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/packets_started > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/packets_finished > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_written > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_read > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/kb_read_gather > +What:/sys/class/pktcdvd/pktcdvd[0-7]/stat/reset > +Date: Oct. 2006 > +KernelVersion: 2.6.20 > +Contact:Thomas Maier > +Description: > + packets_started: (RO) Number of started packets. > + > + packets_finished: (RO) Number of finished packets. > + > + kb_written: (RO) kBytes written. > + > + kb_read: (RO) kBytes read. > + > + kb_read_gather: (RO) kBytes read to fill write packets. > + > + reset:(WO) Write any value to it to reset pktcdvd > + device statistic values, like bytes > + read/written. > + > + > +What:/sys/class/pktcdvd/pktcdvd[0-7]/write_queue/size > +What: > /sys/class/pktcdvd/pktcdvd[0-7]/write_queue/congestion_off > +What: > /sys/class/pktcdvd/pktcdvd[0-7]/write_queue/congestion_on > +Date: Oct. 2006 > +KernelVersion: 2.6.20 > +Contact:Thomas Maier > +Description: > + size: (RO) Contains the size of the bio write queue. > + > + congestion_off: (RW) If bio write queue size is below this mark, > + accept new bio requests from the block layer. > > -The pktcdvd module (packet writing driver) creates > -these files in the sysfs: > -( is in format major:minor ) > - > -/sys/class/pktcdvd/ > -add(0200) Write a block device id (major:minor) > - to create a new pktcdvd device and map > - it to the block device. > - > -remove (0200) Write the pktcdvd device id (major:minor) > - to it to remove the pktcdvd device. > - > -device_map (0444) Shows the device mapping in format: > - pktcdvd[0-7] > - > -/sys/class/pktcdvd/pktcdvd[0-7]/ > -dev (0444) Device id > -uevent(0200) To send an uevent. 
> - > -/sys/class/pktcdvd/pktcdvd[0-7]/stat/ > -packets_started (0444) Number of started packets. > -packets_finished (0444) Number of finished packets. > - > -kb_written(0444) kBytes written. > -kb_read (0444) kBytes read. > -kb_read_gather(0444) kBytes read to fill write packets. > - > -reset
[PATCH] block: neutralize blk_insert_cloned_request IO stall regression (was: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle)
On Thu, Jan 18 2018 at  5:20pm -0500,
Bart Van Assche wrote:

> On Thu, 2018-01-18 at 17:01 -0500, Mike Snitzer wrote:
> > And yet Laurence cannot reproduce any such lockups with your test...
>
> Hmm ... maybe I misunderstood Laurence but I don't think that Laurence has
> already succeeded at running an unmodified version of my tests. In one of the
> e-mails Laurence sent me this morning I read that he modified these scripts
> to get past a kernel module unload failure that was reported while starting
> these tests. So the next step is to check which changes were made to the test
> scripts and also whether the test results are still valid.
>
> > Are you absolutely certain this patch doesn't help you?
> > https://patchwork.kernel.org/patch/10174037/
> >
> > If it doesn't then that is actually very useful to know.
>
> The first I tried this morning is to run the srp-test software against a merge
> of Jens' for-next branch and your dm-4.16 branch. Since I noticed that the dm
> queue locked up I reinserted a blk_mq_delay_run_hw_queue() call in the dm code.
> Since even that was not sufficient I tried to kick the queues via debugfs (for
> s in /sys/kernel/debug/block/*/state; do echo kick >$s; done). Since that was
> not sufficient to resolve the queue stall I reverted the following tree patches
> that are in Jens' tree:
> * "blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"
> * "blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request"
> * "blk-mq: don't dispatch request in blk_mq_request_direct_issue if queue is busy"
>
> Only after I had done this the srp-test software ran again without triggering
> dm queue lockups.

Given that Ming's notifier-based patchset needs more development time I
think we're unfortunately past the point where we can comfortably wait
for that to be ready.

So we need to explore alternatives to fixing this IO stall regression.
Rather than attempt the above block reverts (which is an incomplete
listing given newer changes): might we develop a more targeted code
change to neutralize commit 396eaf21ee ("blk-mq: improve DM's blk-mq IO
merging via blk_insert_cloned_request feedback")? -- which, given Bart's
findings above, seems to be the most problematic block commit.

To that end, assuming I drop this commit from dm-4.16:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=316a795ad388e0c3ca613454851a28079d917a92

Here is my proposal for putting this regression behind us for 4.16
(Ming's line of development would continue and hopefully be included in
4.17):

From: Mike Snitzer
Date: Tue, 23 Jan 2018 09:40:22 +0100
Subject: [PATCH] block: neutralize blk_insert_cloned_request IO stall regression

A series of blk-mq changes was intended to improve sequential IO
performance (through improved merging with dm-mpath blk-mq stacked on an
underlying blk-mq device). Unfortunately these changes have caused
dm-mpath blk-mq IO stalls when blk_mq_request_issue_directly()'s call to
q->mq_ops->queue_rq() fails (due to device-specific resource
unavailability).

Fix this by reverting back to how blk_insert_cloned_request() functioned
prior to commit 396eaf21ee -- by using blk_mq_request_bypass_insert()
instead of blk_mq_request_issue_directly().

In the future, this commit should be reverted as the first change in a
followup series of changes that implements a comprehensive solution to
allowing an underlying blk-mq queue's resource limitation to trigger the
upper blk-mq queue to run once that underlying limited resource is
replenished.

Fixes: 396eaf21ee ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
Signed-off-by: Mike Snitzer
---
 block/blk-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index cdae69be68e9..a224f282b4a6 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2520,7 +2520,8 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		 * bypass a potential scheduler on the bottom device for
 		 * insert.
 		 */
-		return blk_mq_request_issue_directly(rq);
+		blk_mq_request_bypass_insert(rq, true);
+		return BLK_STS_OK;
 	}
 
 	spin_lock_irqsave(q->queue_lock, flags);
-- 
2.15.0
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 08:53 +0100, Ingo Molnar wrote: > > The patch below demonstrates the principle, it forcibly enables dynamic > ftrace > patching (CONFIG_DYNAMIC_FTRACE=y et al) and turns mcount/__fentry__ into a > RET: > > 81a01a40 <__fentry__>: > 81a01a40: c3 retq > > This would have to be extended with (very simple) call stack depth tracking > (just > 3 more instructions would do in the fast path I believe) and a suitable > SkyLake > workaround (and also has to play nice with the ftrace callbacks). > > On non-SkyLake the overhead would be 0 cycles. The overhead of forcing CONFIG_DYNAMIC_FTRACE=y is precisely zero cycles? That seems a little optimistic. ;) I'll grant you if it goes straight to a 'ret' it isn't *that* high though. > On SkyLake this would add an overhead of maybe 2-3 cycles per function call > and > obviously all this code and data would be very cache hot. Given that the > average > number of function calls per system call is around a dozen, this would be > _much_ > faster than any microcode/MSR based approach. That's kind of neat, except you don't want it at the top of the function; you want it at the bottom. If you could hijack the *return* site, then you could check for underflow and stuff the RSB right there. But in __fentry__ there's not a lot you can do other than complain that something bad is going to happen in the future. You know that a string of 16+ rets is going to happen, but you've got no gadget in *there* to deal with it when it does. HJ did have patches to turn 'ret' into a form of retpoline, which I don't think ever even got performance-tested. They'd have forced a mispredict on *every* ret. A cheaper option might be to turn ret into a 'jmp skylake_ret_hack'. Which on pre-SKL will be a bare ret, and SKL+ can do the counting (in conjunction with a 'per_cpu(call_depth)++' in __fentry__) and stuff the RSB before actually returning, when appropriate. 
By the time you've made it work properly, I suspect we're approaching the barf-factor of IBRS, for a less complete solution. > Is there a testcase for the SkyLake 16-deep-call-stack problem that I could > run? Andi's been experimenting at https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=spec/deep-chain-3 > Is there a description of the exact speculative execution vulnerability that > has > to be addressed to begin with? "It takes predictions from the generic branch target buffer when the RSB underflows". IBRS filters what can come from the BTB, and resolves the problem that way. Retpoline avoids the indirect branches that on *earlier* CPUs were the only things that would use the offending predictions. But on SKL, now 'ret' is one of the problematic instructions too. Fun! :) > If this approach is workable I'd much prefer it to any MSR writes in the > syscall > entry path not just because it's fast enough in practice to not be turned off > by > everyone, but also because everyone would agree that per function call > overhead > needs to go away on new CPUs. Both deployment and backporting is also _much_ > more > flexible, simpler, faster and more complete than microcode/firmware or > compiler > based solutions. > > Assuming the vulnerability can be addressed via this route that is, which is > a big > assumption! I think it's close. There are some other cases which empty the RSB, like sleeping and loading microcode, which can happily be special- cased. Andi's rounded up many of the remaining details already at https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=spec/skl-rsb-3 And there's SMI, which is a pain but I think Linus is right we can possibly just stick our fingers in our ears and pretend we didn't hear about that one as it's likely to be hard to trigger (famous last words). 
On the whole though, I think you can see why we're keeping IBRS around for now, sent out purely as an RFC and rebased on top of the stuff we're *actually* sending to Linus for inclusion. When we have a clear idea of what we're doing for Skylake, it'll be useful to have a proper comparison of the security, the performance and the "ick" factor of whatever we come up with, vs. IBRS. Right now the plan is just "screw Skylake"; we'll just forget it's a special snowflake and treat it like everything else, except for a bit of extra RSB-stuffing on context switch (since we had to add that for !SMEP anyway). And that's not *entirely* unreasonable but as I said I'd *really* like to have a decent analysis of the implications of that, not just some hand-wavy "nah, it'll be fine".
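The counting idea discussed in this mail can be illustrated with a toy model. This is only a sketch under stated assumptions, not code from the thread: it assumes a 16-entry RSB, pretends that a software counter tracks the number of usable RSB entries exactly, and ignores interrupts, context switches, SMIs and everything else that can empty the RSB. The names rsb_model, model_call(), model_ret() and run_chain() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define RSB_DEPTH 16	/* RSB entries assumed for this toy model */

struct rsb_model {
	int valid;		/* entries that still hold a usable prediction */
	int underflows;		/* rets that found the RSB empty */
	int stuffs;		/* number of times the RSB was refilled */
	bool ret_hack;		/* emulate the 'jmp skylake_ret_hack' idea */
};

/* A call pushes one prediction; once full, the oldest entry is lost. */
static void model_call(struct rsb_model *m)
{
	if (m->valid < RSB_DEPTH)
		m->valid++;
}

/* A ret pops one prediction; with the hack, refill first when empty. */
static void model_ret(struct rsb_model *m)
{
	if (m->valid == 0 && m->ret_hack) {
		m->valid = RSB_DEPTH;
		m->stuffs++;
	}
	if (m->valid == 0) {
		/* This is the case where SKL would consult the BTB. */
		m->underflows++;
		return;
	}
	m->valid--;
	assert(m->valid >= 0 && m->valid < RSB_DEPTH);
}

/* Run 'depth' nested calls followed by 'depth' returns. */
static struct rsb_model run_chain(int depth, bool ret_hack)
{
	struct rsb_model m = { .ret_hack = ret_hack };
	int i;

	for (i = 0; i < depth; i++)
		model_call(&m);
	for (i = 0; i < depth; i++)
		model_ret(&m);
	return m;
}
```

In this model a 20-deep call chain underflows on its last four returns, while the same chain with the return-side check refills once and never underflows. It says nothing about the real cost of doing this on every ret, which is the open question above.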
[RFC PATCH 2/2] hv_netvsc: Change GPADL teardown order according to Hyper-V version
Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") introduced a regression causing VMs not to shut down on pre-Win2016 hosts after netvsc_device_remove() is called. This was caused by the change to the GPADL teardown sequence. This patch restores the old behavior for pre-Win2016 hosts, while keeping the changes from 0cf7378 for Win2016 and higher hosts.

Signed-off-by: Mohammed Gamal
---
 drivers/net/hyperv/netvsc.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 3982f76..d09bb3b 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -575,8 +575,17 @@ void netvsc_device_remove(struct hv_device *device)

 	cancel_work_sync(&net_device->subchan_work);

+	/*
+	 * Revoke receive buffer. If host is pre-Win2016 then tear down
+	 * receive buffer GPADL. Do the same for send buffer.
+	 */
 	netvsc_revoke_recv_buf(device, net_device);
+	if (vmbus_proto_version < VERSION_WIN10)
+		netvsc_teardown_recv_buf_gpadl(device, net_device);
+
 	netvsc_revoke_send_buf(device, net_device);
+	if (vmbus_proto_version < VERSION_WIN10)
+		netvsc_teardown_send_buf_gpadl(device, net_device);

 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);

@@ -589,8 +598,14 @@ void netvsc_device_remove(struct hv_device *device)
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);

-	netvsc_teardown_recv_buf_gpadl(device, net_device);
-	netvsc_teardown_send_buf_gpadl(device, net_device);
+	/*
+	 * If host is Win2016 or higher then we do the GPADL tear down
+	 * here after VMBus is closed, instead of doing it earlier.
+	 */
+	if (vmbus_proto_version >= VERSION_WIN10) {
+		netvsc_teardown_recv_buf_gpadl(device, net_device);
+		netvsc_teardown_send_buf_gpadl(device, net_device);
+	}

 	/* And dissassociate NAPI context from device */
 	for (i = 0; i < net_device->num_chn; i++)
--
1.8.3.1
[RFC PATCH 1/2] hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
Split each of the functions into two, one for each of the send and receive buffers.

Signed-off-by: Mohammed Gamal
---
 drivers/net/hyperv/netvsc.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index bfc7969..3982f76 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -100,8 +100,8 @@ static void free_netvsc_device_rcu(struct netvsc_device *nvdev)
 	call_rcu(&nvdev->rcu, free_netvsc_device);
 }

-static void netvsc_revoke_buf(struct hv_device *device,
-			      struct netvsc_device *net_device)
+static void netvsc_revoke_recv_buf(struct hv_device *device,
+				   struct netvsc_device *net_device)
 {
 	struct nvsp_message *revoke_packet;
 	struct net_device *ndev = hv_get_drvdata(device);
@@ -146,6 +146,14 @@ static void netvsc_revoke_buf(struct hv_device *device,
 		}
 		net_device->recv_section_cnt = 0;
 	}
+}
+
+static void netvsc_revoke_send_buf(struct hv_device *device,
+				   struct netvsc_device *net_device)
+{
+	struct nvsp_message *revoke_packet;
+	struct net_device *ndev = hv_get_drvdata(device);
+	int ret;

 	/* Deal with the send buffer we may have setup.
 	 * If we got a send section size, it means we received a
@@ -189,8 +197,8 @@ static void netvsc_revoke_buf(struct hv_device *device,
 	}
 }

-static void netvsc_teardown_gpadl(struct hv_device *device,
-				  struct netvsc_device *net_device)
+static void netvsc_teardown_recv_buf_gpadl(struct hv_device *device,
+					   struct netvsc_device *net_device)
 {
 	struct net_device *ndev = hv_get_drvdata(device);
 	int ret;
@@ -215,6 +223,13 @@ static void netvsc_teardown_gpadl(struct hv_device *device,
 		vfree(net_device->recv_buf);
 		net_device->recv_buf = NULL;
 	}
+}
+
+static void netvsc_teardown_send_buf_gpadl(struct hv_device *device,
+					   struct netvsc_device *net_device)
+{
+	struct net_device *ndev = hv_get_drvdata(device);
+	int ret;

 	if (net_device->send_buf_gpadl_handle) {
 		ret = vmbus_teardown_gpadl(device->channel,
@@ -425,8 +440,10 @@ static int netvsc_init_buf(struct hv_device *device,
 	goto exit;

 cleanup:
-	netvsc_revoke_buf(device, net_device);
-	netvsc_teardown_gpadl(device, net_device);
+	netvsc_revoke_recv_buf(device, net_device);
+	netvsc_revoke_send_buf(device, net_device);
+	netvsc_teardown_recv_buf_gpadl(device, net_device);
+	netvsc_teardown_send_buf_gpadl(device, net_device);

 exit:
 	return ret;
@@ -558,7 +575,8 @@ void netvsc_device_remove(struct hv_device *device)

 	cancel_work_sync(&net_device->subchan_work);

-	netvsc_revoke_buf(device, net_device);
+	netvsc_revoke_recv_buf(device, net_device);
+	netvsc_revoke_send_buf(device, net_device);

 	RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);

@@ -571,7 +589,8 @@ void netvsc_device_remove(struct hv_device *device)
 	/* Now, we can close the channel safely */
 	vmbus_close(device->channel);

-	netvsc_teardown_gpadl(device, net_device);
+	netvsc_teardown_recv_buf_gpadl(device, net_device);
+	netvsc_teardown_send_buf_gpadl(device, net_device);

 	/* And dissassociate NAPI context from device */
 	for (i = 0; i < net_device->num_chn; i++)
--
1.8.3.1
[RFC PATCH 0/2] hv_netvsc: Fix shutdown regression on Win2012 hosts
Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") introduced a regression that caused VMs not to shut down after netvsc_device_remove() is called. This is caused by the GPADL teardown sequence change, and while that was necessary to fix issues with Win2016 hosts, it did introduce a regression for earlier versions.

Prior to commit 0cf737808ae7 the call sequence in netvsc_device_remove() was as follows (as implemented in netvsc_destroy_buf()):
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Teardown receive buffer GPADL
3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
4- Teardown send buffer GPADL
5- Close vmbus

This didn't work for WS2016 hosts. Commit 0cf737808ae7 split netvsc_destroy_buf() into two functions and rearranged the order as follows:
1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
3- Close vmbus
4- Teardown receive buffer GPADL
5- Teardown send buffer GPADL

That worked well for WS2016 hosts, but for WS2012 hosts it prevented VMs from shutting down.

This patch series works around this problem. The first patch splits netvsc_revoke_buf() and netvsc_teardown_gpadl() into two finer grained functions for tearing down send and receive buffers individually. The second patch uses the finer grained functions to implement the teardown sequence according to the host's version. We keep the behavior introduced in 0cf737808ae7 for Windows 2016 hosts, while we re-introduce the old sequence for earlier versions.

Mohammed Gamal (2):
  hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
  hv_netvsc: Change GPADL teardown order according to Hyper-V version

 drivers/net/hyperv/netvsc.c | 50 +
 1 file changed, 42 insertions(+), 8 deletions(-)

--
1.8.3.1
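The two orderings described in the cover letter can be sketched as a small stand-alone planner. This is a hedged illustration, not driver code: the enum names and plan_teardown() are hypothetical, and the real patch compares vmbus_proto_version against VERSION_WIN10 instead of using a host_version enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the host version check in the driver. */
enum host_version { HOST_WIN2012 = 1, HOST_WIN2016 = 2 };

enum step {
	REVOKE_RECV,
	TEARDOWN_RECV,
	REVOKE_SEND,
	TEARDOWN_SEND,
	CLOSE_VMBUS,
};

#define NUM_STEPS 5

/*
 * Fill 'out' with the teardown sequence for the given host and return
 * the number of steps. Pre-Win2016 hosts tear down each GPADL right
 * after the matching revoke; newer hosts do it after the channel close.
 */
static size_t plan_teardown(enum host_version host, enum step out[NUM_STEPS])
{
	int early = host < HOST_WIN2016;
	size_t n = 0;

	out[n++] = REVOKE_RECV;
	if (early)
		out[n++] = TEARDOWN_RECV;

	out[n++] = REVOKE_SEND;
	if (early)
		out[n++] = TEARDOWN_SEND;

	out[n++] = CLOSE_VMBUS;

	if (!early) {
		out[n++] = TEARDOWN_RECV;
		out[n++] = TEARDOWN_SEND;
	}

	assert(n == NUM_STEPS);
	return n;
}
```

Both branches always emit the same five steps, which makes it easy to see that only the position of the two GPADL teardowns relative to the channel close differs between host versions.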
Re: [PATCH] cpufreq: mediatek: Add mediatek related projects into blacklist
On Tue, Jan 23, 2018 at 05:38:34PM +0800, Sean Wang wrote: > On Tue, 2018-01-23 at 09:46 +0100, Greg KH wrote: > > On Tue, Jan 23, 2018 at 04:31:11PM +0800, sean.w...@mediatek.com wrote: > > > From: Sean Wang> > > > > > commit 6066998cbd2b1012a8d5bc9a2957cfd0ad53150e upstream. > > > > > > commit edeec420de24 ("cpufreq: dt-platdev: Automatically create cpufreq > > > device with OPP v2") not added MediaTek SoCs to the blacklist that would > > > lead to cause an occasional hang or unexpected behaviors on related boards > > > as kernelci reported and complained on [1] specifically for 4.14 and 4.15 > > > tree. > > > > > > For those reasons, add MediaTek SoCs into cpufreq-dt blacklist and wish > > > the patch be applied to 4.14 and 4.15 tree to allow kernelci able to > > > complete following automated kernel testing. > > > > > > [1] https://kernelci.org/boot/mt7623n-bananapi-bpi-r2/ > > > > > > Fixes: edeec420de24 (cpufreq: dt-cpufreq: platdev Automatically create > > > device with OPP v2) > > > Signed-off-by: Andrew-sh Cheng > > > Signed-off-by: Sean Wang > > > Cc: Kevin Hilman > > > --- > > > drivers/cpufreq/cpufreq-dt-platdev.c | 8 > > > 1 file changed, 8 insertions(+) > > > > What stable kernel tree(s) are you wanting this backported to? > > > > thanks, > > > > greg k-h > > Hi, Greg, > > thanks for your help! > > stable and stable-rc are those trees I want this backported to I don't understand, what exactly do you mean by this? Have you read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly? > Hi, Viresh > > currently, can the patch be permitted to go through tree linux-pm branch > master to be part of mainline? Wait, this is not in Linus's tree already? If not, what is that big "commit upstream" in the changelog for? I don't see that commit in Linus's tree at all. totally confused, greg k-h
Re: [PATCH v3 1/2] arm64: Branch predictor hardening for Cavium ThunderX2
On Mon, Jan 22, 2018 at 02:00:59PM -0500, Jon Masters wrote: > On 01/22/2018 06:33 AM, Will Deacon wrote: > > On Fri, Jan 19, 2018 at 04:22:47AM -0800, Jayachandran C wrote: > >> Use PSCI based mitigation for speculative execution attacks targeting > >> the branch predictor. We use the same mechanism as the one used for > >> Cortex-A CPUs, we expect the PSCI version call to have a side effect > >> of clearing the BTBs. > >> > >> Signed-off-by: Jayachandran C> >> --- > >> arch/arm64/kernel/cpu_errata.c | 10 ++ > >> 1 file changed, 10 insertions(+) > >> > >> diff --git a/arch/arm64/kernel/cpu_errata.c > >> b/arch/arm64/kernel/cpu_errata.c > >> index 70e5f18..45ff9a2 100644 > >> --- a/arch/arm64/kernel/cpu_errata.c > >> +++ b/arch/arm64/kernel/cpu_errata.c > >> @@ -338,6 +338,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = { > >>.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT, > >>MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1), > >>}, > >> + { > >> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, > >> + MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN), > >> + .enable = enable_psci_bp_hardening, > >> + }, > >> + { > >> + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, > >> + MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2), > >> + .enable = enable_psci_bp_hardening, > >> + }, > >> #endif > > > > Thanks. > > > > Acked-by: Will Deacon > > Thanks. I have separately asked for a specification tweak to allow us to > discover whether firmware has been augmented to provide the necessary > support that we need. That applies beyond Cavium. AFAIK, there's already an SMCCC/PSCI proposal doing the rounds that is discoverable and does what we need. Have you seen it? We should be posting code this week. Will
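For readers unfamiliar with the errata table touched by the quoted patch, the following is a simplified user-space sketch of table-driven matching on MIDR. It is not the kernel's implementation: struct cpu_capability, match_midr(), apply_errata() and the stub enable hook are hypothetical, and the implementer and part numbers are assumptions used for illustration that should be checked against the kernel's cputype.h. The mask ignores variant and revision, which is the effect MIDR_ALL_VERSIONS() is meant to have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MIDR_EL1 layout: implementer [31:24], variant [23:20],
 * architecture [19:16], part number [15:4], revision [3:0].
 * The model mask keeps implementer, architecture and part number.
 */
#define MIDR_MODEL_MASK	0xff0ffff0u
#define MIDR_MODEL(imp, part) \
	(((uint32_t)(imp) << 24) | (0xfu << 16) | ((uint32_t)(part) << 4))

/* Assumed ID values, for illustration only. */
#define EX_IMP_BRCM		0x42
#define EX_IMP_CAVIUM		0x43
#define EX_PART_VULCAN		0x516
#define EX_PART_THUNDERX2	0x0af

struct cpu_capability {
	const char *desc;
	uint32_t model;		/* value to match after masking */
	void (*enable)(void);	/* hook run when the entry matches */
};

static int bp_hardening_calls;

/* Stand-in for the real hardening hook; it only counts invocations. */
static void enable_bp_hardening_stub(void)
{
	bp_hardening_calls++;
}

static const struct cpu_capability errata[] = {
	{ "Broadcom Vulcan",
	  MIDR_MODEL(EX_IMP_BRCM, EX_PART_VULCAN), enable_bp_hardening_stub },
	{ "Cavium ThunderX2",
	  MIDR_MODEL(EX_IMP_CAVIUM, EX_PART_THUNDERX2), enable_bp_hardening_stub },
};

/* Return the first entry whose model matches, ignoring variant/revision. */
static const struct cpu_capability *match_midr(uint32_t midr)
{
	size_t i;

	for (i = 0; i < sizeof(errata) / sizeof(errata[0]); i++)
		if ((midr & MIDR_MODEL_MASK) == errata[i].model)
			return &errata[i];
	return NULL;
}

/* Run the enable hook for a matching CPU; returns 1 if one was applied. */
static int apply_errata(uint32_t midr)
{
	const struct cpu_capability *cap = match_midr(midr);

	if (!cap)
		return 0;
	assert(cap->enable != NULL);
	cap->enable();
	return 1;
}
```

A table like this cannot tell whether firmware actually implements the mitigation behind the hook, which is the discoverability gap the replies above are about.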
Re: [PATCH 2/2] mfd: smsc-ece1099: Improve a size determination in smsc_i2c_probe()
On Tue, 16 Jan 2018, SF Markus Elfring wrote:

> From: Markus Elfring
> Date: Tue, 16 Jan 2018 08:58:26 +0100
>
> Replace the specification of a data structure by a pointer dereference
> as the parameter for the operator "sizeof" to make the corresponding size
> determination a bit safer according to the Linux coding style convention.
>
> This issue was detected by using the Coccinelle software.
>
> Signed-off-by: Markus Elfring
> ---
>  drivers/mfd/smsc-ece1099.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/mfd/smsc-ece1099.c b/drivers/mfd/smsc-ece1099.c
> index b9d96651cc0d..6681205dd2c0 100644
> --- a/drivers/mfd/smsc-ece1099.c
> +++ b/drivers/mfd/smsc-ece1099.c
> @@ -33,12 +33,10 @@ static const struct regmap_config smsc_regmap_config = {
>  static int smsc_i2c_probe(struct i2c_client *i2c,
>  				const struct i2c_device_id *id)
>  {
> -	struct smsc *smsc;
>  	int devid, rev, venid_l, venid_h;
>  	int ret;
> +	struct smsc *smsc = devm_kzalloc(&i2c->dev, sizeof(*smsc), GFP_KERNEL);

Please keep these separate.

> -	smsc = devm_kzalloc(&i2c->dev, sizeof(struct smsc),
> -				GFP_KERNEL);
>  	if (!smsc)
>  		return -ENOMEM;
>

--
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
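A user-space sketch of the form the review seems to ask for: the declaration stays separate from the allocation, while the size is still taken from the pointer with sizeof(*ptr). This is only an analogue under assumptions; struct smsc_example and smsc_example_alloc() are hypothetical, and calloc() stands in for devm_kzalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private structure. */
struct smsc_example {
	int devid;
	int rev;
};

static struct smsc_example *smsc_example_alloc(void)
{
	struct smsc_example *smsc;	/* declaration kept on its own */

	/*
	 * sizeof(*smsc) follows the pointer's type, so the size stays
	 * right even if the type of 'smsc' is changed later.
	 */
	smsc = calloc(1, sizeof(*smsc));
	if (!smsc)
		return NULL;

	assert(sizeof(*smsc) == sizeof(struct smsc_example));
	return smsc;
}
```

The caller owns the returned memory and releases it with free(); in the driver, the devm_ variant ties the lifetime to the device instead.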
RE: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu aperture validity check
Hi Eric, > -Original Message- > From: Auger Eric [mailto:eric.au...@redhat.com] > Sent: Tuesday, January 23, 2018 8:25 AM > To: Alex Williamson; Shameerali Kolothum > Thodi > Cc: pmo...@linux.vnet.ibm.com; k...@vger.kernel.org; linux- > ker...@vger.kernel.org; Linuxarm ; John Garry > ; xuwei (O) > Subject: Re: [RFC v2 1/5] vfio/type1: Introduce iova list and add iommu > aperture validity check > > Hi Shameer, > > On 18/01/18 01:04, Alex Williamson wrote: > > On Fri, 12 Jan 2018 16:45:27 + > > Shameer Kolothum wrote: > > > >> This introduces an iova list that is valid for dma mappings. Make > >> sure the new iommu aperture window is valid and doesn't conflict > >> with any existing dma mappings during attach. Also update the iova > >> list with new aperture window during attach/detach. > >> > >> Signed-off-by: Shameer Kolothum > > >> --- > >> drivers/vfio/vfio_iommu_type1.c | 177 > > >> 1 file changed, 177 insertions(+) > >> > >> diff --git a/drivers/vfio/vfio_iommu_type1.c > b/drivers/vfio/vfio_iommu_type1.c > >> index e30e29a..11cbd49 100644 > >> --- a/drivers/vfio/vfio_iommu_type1.c > >> +++ b/drivers/vfio/vfio_iommu_type1.c > >> @@ -60,6 +60,7 @@ > >> > >> struct vfio_iommu { > >>struct list_headdomain_list; > >> + struct list_headiova_list; > >>struct vfio_domain *external_domain; /* domain for external user > */ > >>struct mutexlock; > >>struct rb_root dma_list; > >> @@ -92,6 +93,12 @@ struct vfio_group { > >>struct list_headnext; > >> }; > >> > >> +struct vfio_iova { > >> + struct list_headlist; > >> + phys_addr_t start; > >> + phys_addr_t end; > >> +}; > > > > dma_list uses dma_addr_t for the iova. IOVAs are naturally DMA > > addresses, why are we using phys_addr_t? 
> > > >> + > >> /* > >> * Guest RAM pinning working set or DMA target > >> */ > >> @@ -1192,6 +1199,123 @@ static bool vfio_iommu_has_sw_msi(struct > iommu_group *group, phys_addr_t *base) > >>return ret; > >> } > >> > >> +static int vfio_insert_iova(phys_addr_t start, phys_addr_t end, > >> + struct list_head *head) > >> +{ > >> + struct vfio_iova *region; > >> + > >> + region = kmalloc(sizeof(*region), GFP_KERNEL); > >> + if (!region) > >> + return -ENOMEM; > >> + > >> + INIT_LIST_HEAD(>list); > >> + region->start = start; > >> + region->end = end; > >> + > >> + list_add_tail(>list, head); > >> + return 0; > >> +} > > > > As I'm reading through this series, I'm learning that there are a lot > > of assumptions and subtle details that should be documented. For > > instance, the IOMMU API only provides a single geometry and we build > > upon that here as this patch creates a list, but there's only a single > > entry for now. The following patches carve that single iova range into > > pieces and somewhat subtly use the list_head passed to keep the list > > sorted, allowing the first/last_entry tricks used throughout. Subtle > > interfaces are prone to bugs. > > > >> + > >> +/* > >> + * Find whether a mem region overlaps with existing dma mappings > >> + */ > >> +static bool vfio_find_dma_overlap(struct vfio_iommu *iommu, > >> +phys_addr_t start, phys_addr_t end) > >> +{ > >> + struct rb_node *n = rb_first(>dma_list); > >> + > >> + for (; n; n = rb_next(n)) { > >> + struct vfio_dma *dma; > >> + > >> + dma = rb_entry(n, struct vfio_dma, node); > >> + > >> + if (end < dma->iova) > >> + break; > >> + if (start >= dma->iova + dma->size) > >> + continue; > >> + return true; > >> + } > >> + > >> + return false; > >> +} > > > > Why do we need this in addition to the existing vfio_find_dma()? Why > > doesn't this use the tree structure of the dma_list? 
> > > >> + > >> +/* > >> + * Check the new iommu aperture is a valid one > >> + */ > >> +static int vfio_iommu_valid_aperture(struct vfio_iommu *iommu, > >> + phys_addr_t start, > >> + phys_addr_t end) > >> +{ > >> + struct vfio_iova *first, *last; > >> + struct list_head *iova = >iova_list; > >> + > >> + if (list_empty(iova)) > >> + return 0; > >> + > >> + /* Check if new one is outside the current aperture */ > > > > "Disjoint sets" > > > >> + first = list_first_entry(iova, struct vfio_iova, list); > >> + last = list_last_entry(iova, struct vfio_iova, list); > >> + if ((start > last->end) || (end < first->start)) > >> + return -EINVAL; > >> + > >> + /* Check for any existing dma mappings outside the new start */ > >> + if (start > first->start) { > >> + if
Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8
sorry fix a typo.

On 2018/1/23 17:23, gengdongjiu wrote:
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I thinks we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> | I don't think we can just plumb SEI in like this and be done with it.
>> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> | This way we solve the same 'cant do this from NMI context' with the same
>> | code'.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC, its not ready
>> to be considered yet.

Yes, I know you are doing that. Your patch series will take all of the above into account, right? If so, this patch can be based on your patchset. Thanks.
[PATCH] rtc: ds1302: remove redundant initializations of pointer bp
From: Colin Ian King

Pointer bp is being initialized and this value is never read; it is updated to the same value later, just before it is used. Remove the initialization as it is never read and keep the setting of bp closer to the use of bp.

Cleans up clang warnings:
drivers/rtc/rtc-ds1302.c:115:7: warning: Value stored to 'bp' during its initialization is never read
drivers/rtc/rtc-ds1302.c:46:7: warning: Value stored to 'bp' during its initialization is never read

Signed-off-by: Colin Ian King
---
 drivers/rtc/rtc-ds1302.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/rtc-ds1302.c b/drivers/rtc/rtc-ds1302.c
index 0ec4be62322b..43bcb17c922e 100644
--- a/drivers/rtc/rtc-ds1302.c
+++ b/drivers/rtc/rtc-ds1302.c
@@ -43,7 +43,7 @@ static int ds1302_rtc_set_time(struct device *dev, struct rtc_time *time)
 {
 	struct spi_device *spi = dev_get_drvdata(dev);
 	u8 buf[1 + RTC_CLCK_LEN];
-	u8 *bp = buf;
+	u8 *bp;
 	int status;

 	/* Enable writing */
@@ -112,7 +112,7 @@ static int ds1302_probe(struct spi_device *spi)
 	struct rtc_device *rtc;
 	u8 addr;
 	u8 buf[4];
-	u8 *bp = buf;
+	u8 *bp;
 	int status;

 	/* Sanity check board setup data. This may be hooked up
--
2.15.1
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhouse wrote:

> > On SkyLake this would add an overhead of maybe 2-3 cycles per function call and
> > obviously all this code and data would be very cache hot. Given that the average
> > number of function calls per system call is around a dozen, this would be _much_
> > faster than any microcode/MSR based approach.
>
> That's kind of neat, except you don't want it at the top of the
> function; you want it at the bottom.
>
> If you could hijack the *return* site, then you could check for
> underflow and stuff the RSB right there. But in __fentry__ there's not
> a lot you can do other than complain that something bad is going to
> happen in the future. You know that a string of 16+ rets is going to
> happen, but you've got no gadget in *there* to deal with it when it
> does.

No, it can be done with the existing CALL instrumentation callback that CONFIG_DYNAMIC_FTRACE=y provides, by pushing a RET trampoline on the stack from the CALL trampoline - see my previous email.

> HJ did have patches to turn 'ret' into a form of retpoline, which I
> don't think ever even got performance-tested.

Return instrumentation is possible as well, but there are two major drawbacks:

 - GCC support for it is not as widely available and return instrumentation is less tested in Linux kernel contexts

 - a major point of my suggestion is that CONFIG_DYNAMIC_FTRACE=y is already enabled in distros here and today, so the runtime overhead to non-SkyLake CPUs would be literally zero, while still allowing to fix the RSB vulnerability on SkyLake.

Thanks,

	Ingo
Re: [PULL] alpha.git
On Sat, 20 Jan 2018, Matt Turner wrote: > Hi Linus, > > Please pull my alpha git tree. It contains a build fix and a regression fix. > > Hopefully still in time for 4.15 :) > > Thanks, > Matt Hi Will you also submit these patches? The first one fixes a crash when pthread_create races with signal delivery, it could cause random crashing in applications. https://marc.info/?l=linux-alpha=151491969711913=2 https://marc.info/?l=linux-alpha=151491960011839=2 https://marc.info/?l=linux-alpha=151491963911901=2 Mikulas > The following changes since commit 8cbab92dff778e516064c13113ca15d4869ec883: > > Merge tag 'for-linus' of > git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma (2018-01-16 16:47:40 > -0800) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha.git for-linus > > for you to fetch changes up to 86be89939d11a84800f66e2a283b915b704bf33d: > > alpha/PCI: Fix noname IRQ level detection (2018-01-20 16:22:36 -0800) > > > Lorenzo Pieralisi (1): > alpha/PCI: Fix noname IRQ level detection > > Michael Cree (1): > alpha: extend memset16 to EV6 optimised routines > > arch/alpha/kernel/sys_sio.c | 35 +-- > arch/alpha/lib/ev6-memset.S | 12 ++-- > 2 files changed, 35 insertions(+), 12 deletions(-) >
Re: [PATCH v6 15/15] MIPS: ingenic: Initial GCW Zero support
Paul: On Wed, Jan 10, 2018 at 11:59 PM, Paul Cercueilwrote: > Hi Philippe, > > Le dim. 7 janv. 2018 à 17:18, Philippe Ombredanne a > écrit : >> >> On Fri, Jan 5, 2018 at 7:25 PM, Paul Cercueil >> wrote: >>> >>> The GCW Zero (http://www.gcw-zero.com) is a retro-gaming focused >>> handheld game console, successfully kickstarted in ~2012, running Linux. >>> >>> Signed-off-by: Paul Cercueil >>> Acked-by: Mathieu Malaterre >>> --- >>> arch/mips/boot/dts/ingenic/Makefile | 1 + >>> arch/mips/boot/dts/ingenic/gcw0.dts | 62 >>> + >>> arch/mips/configs/gcw0_defconfig| 27 >>> arch/mips/jz4740/Kconfig| 4 +++ >>> 4 files changed, 94 insertions(+) >>> create mode 100644 arch/mips/boot/dts/ingenic/gcw0.dts >>> create mode 100644 arch/mips/configs/gcw0_defconfig >>> >>> v2: No change >>> v3: No change >>> v4: No change >>> v5: Use SPDX license identifier >>> Drop custom CROSS_COMPILE from defconfig >>> v6: Add "model" property in devicetree >> >> >> For the use of SPDX tags for the whole patch set: thank you! >> >> Acked-by: Philippe Ombredanne > > > Is your Acked-by for the whole patchset? Or just this one patch? Sorry for the late reply! This is for the whole patchset for your use of SPDX tags. -- Cordially Philippe Ombredanne
Re: Network interface "stops working"
Not sure it’s a bug yet, but anyone have any ideas on how I can find out? > On 22 Jan 2018, at 23:32, Cong Wangwrote: > > (Please always Cc netdev for networking related bugs.) > > On Mon, Jan 22, 2018 at 2:02 AM, Turbo Fredriksson wrote: >> I just got a new broadband delivered at home. It is "Hyperoptic 1Gbps fiber" >> which comes as a ethernet connector at home. I wasn’t around >> when they connected up everything, so I’m not sure *where* the fiber starts, >> but either way, I have an ethernet jack in one of my rooms. >> >> They also provided me with a ZTE router. I have need for my own services >> (firewalling, NATing, IPSEC and what not), so don’t want to use >> the provided router.. >> >> However, I’m having serious trouble keeping the interface up! Works for a >> few minutes and then just “stops working”. Don’t know why, there’s >> nothing in the logs or from dmesg.. >> >> Taking the interface down and then up again usually solves it. For a few >> minutes. >> >> Also, when running the interface (a Intel 82576 Gigabit dual port, using the >> igb driver - tried e1000 and e1000e but they don’t find any interfaces), >> in 1Gbps mode, the interface starts flapping up and down and I can’t get a >> connection at all. So my interface definition runs a script to use >> ethtool to set the speed to 100Mbps, full duplex, no auto negotiation. Which >> “kinda” works (for a while, hence my problems). >> >> Because the provided router works just fine, I’m sure it’s something on my >> Linux box (Debian GNU/Linux Jessie) that does it.. I’ve tried running >> it without any iptables, in both static and DHCP mode but same problem.. >> >> I’m not a complete beginner with Linux nor networks, but this is to “close >> to the hardware” for me. I’m at a loss to what else to try.. 
>> >> This isn’t a new machine, it have served me very well for five years give or >> take and I’ve never had any problems with it (not to say that it still >> can’t be hardware problems, but I find that somewhat unlikely at the moment). >> >> >> Could anyone please advice to what I can try to try to pinpoint the problem >> (and/or possibly fix it)?
Re: unixbench context switch perfomance & cpu topology
2018-01-22 20:53 GMT+08:00 Peter Zijlstra:
> On Mon, Jan 22, 2018 at 07:47:45PM +0800, Wanpeng Li wrote:
>> Hi all,
>>
>> We can observe unixbench context switch performance is heavily
>> influenced by cpu topology which is exposed to the guest. the score is
>> posted below, bigger is better, both the guest and the host kernel are
>> 3.15-rc3(we can also reproduce against centos 7.4 693 guest/host), LLC
>> is exposed to the guest, kvm adaptive halt-polling is default enabled,
>> then start a guest w/ 8 logical cpus.
>>
>>                                                      unixbench context switch
>> -smp 8, sockets=8, cores=1, threads=1                382036
>> -smp 8, sockets=4, cores=2, threads=1                132480
>> -smp 8, sockets=2, cores=4, threads=1                128032
>> -smp 8, sockets=2, cores=2, threads=2                131767
>> -smp 8, sockets=1, cores=4, threads=2                132742
>> -smp 8, sockets=1, cores=4, threads=2
>>   (guest w/ nohz=off idle=poll)                      331471
>>
>> I can observe there are a lot of reschedule IPIs sent from one vCPU to
>> another vCPU, the context switch workload switches between running and
>> idle frequently which results in HLT instruction in the idle path, I
>> use idle=poll to avoid vmexit due to HLT and to avoid reschedule IPIs
>> since idle task checks TIF_NEED_RESCHED flags in a loop, nohz=off can
>> stop to program lapic timer/other nohz stuffs. Any idea why sockets=8
>> can get best performance?
>
> I suspect because we load-balance less agressively across nodes than we
> do within a cache domain.

It is true. After taking a closer look with kernelshark, the context1 task in the guest is migrated to another logical cpu after several milliseconds for sockets=1, cores=4, threads=2; however, it stays on one logical cpu for several seconds for sockets=8, cores=1, threads=1 before migrating to another one.

>
> Fix you benchmark to pin itself to a single CPU, that's the only
> sensible way to obtain this number in any case.

Yeah, this setup can get good performance.
Actually, kernelshark shows that the two context1 tasks do not stack up on one logical cpu most of the time, contrary to Mike's reply. In addition, I can observe that the number of RESCHED IPIs in the guest for sockets=1, cores=4, threads=2 is 4.5 times that for sockets=8, cores=1, threads=1. Any idea how this can happen? I suspect the TTWU path selects another idle logical cpu, which makes a RESCHED IPI unavoidable. However, there is still no performance benefit after I clear SD_BALANCE_WAKE for the corresponding sched_domains.

Regards,
Wanpeng Li
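For reference, pinning a benchmark to a single CPU, as suggested in the quoted reply, can be done on Linux with sched_setaffinity(). The helper below is a minimal sketch, not part of unixbench; pin_to_first_allowed_cpu() is a hypothetical name, and it picks the first CPU in the current affinity mask so that it also works inside a restricted cpuset.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Pin the calling thread to the first CPU it is currently allowed to
 * run on. Returns that CPU number, or -1 on failure.
 */
static int pin_to_first_allowed_cpu(void)
{
	cpu_set_t allowed, one;
	int cpu;

	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;

		/* Build a mask that contains only this CPU. */
		CPU_ZERO(&one);
		CPU_SET(cpu, &one);
		assert(CPU_COUNT(&one) == 1);

		if (sched_setaffinity(0, sizeof(one), &one) != 0)
			return -1;
		return cpu;
	}
	return -1;
}
```

The same effect is available without code changes by starting the benchmark under taskset. Either way the measurement no longer depends on where the scheduler's wakeup path decides to place the two tasks.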
Re: [PATCH v3 11/20] arm64: mm: Map entry trampoline into trampoline and kernel page tables
Hi Will, On 2018/1/23 18:04, Will Deacon wrote: > On Tue, Jan 23, 2018 at 04:28:45PM +0800, Yisheng Xie wrote: >> On 2017/12/6 20:35, Will Deacon wrote: >>> +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 >>> +static int __init map_entry_trampoline(void) >>> +{ >>> + extern char __entry_tramp_text_start[]; >>> + >>> + pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; >>> + phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start); >>> + >>> + /* The trampoline is always mapped and can therefore be global */ >>> + pgprot_val(prot) &= ~PTE_NG; >>> + >>> + /* Map only the text into the trampoline page table */ >>> + memset(tramp_pg_dir, 0, PGD_SIZE); >>> + __create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE, >>> +prot, pgd_pgtable_alloc, 0); >> >> How the tramp_pg_dir is used, should it be set to ttbr1 when exit kernel? >> Sorry >> for I do not find where it is used. > > Yes, that's what happens when we return to userspace. The code is a little > convoluted, but the tramp_pg_dir is placed at a fixed offset from swapper > (see the linker script) so the sub instruction in tramp_unmap_kernel is what > gives us the ttbr1 value we need. oh, I missed that. Maybe a comment inline is better to understand. Thanks once more for your help and explain :) Thanks Yisheng > > Will > > . >
Re: [PATCH] kasan: add __asan_report_loadN/storeN_noabort callbacks
On 01/19/2018 08:44 PM, Andrey Konovalov wrote:
> Instead of __asan_report_load_n_noabort and __asan_report_store_n_noabort
> callbacks Clang emits differently named __asan_report_loadN_noabort and
> __asan_report_storeN_noabort (similar to __asan_loadN/storeN_noabort, whose
> names both GCC and Clang agree on).
>
> Add callback implementation for __asan_report_loadN/storeN_noabort.
>

This made me wonder why this wasn't observed before. So I noticed that inline instrumentation with -fsanitize=kernel-address is broken in clang, and clang never calls __asan_report*() functions. I see that you guys fixed this just yesterday https://reviews.llvm.org/D42384 . But it seems that you didn't fix the rest of "if (CompileKernel)" crap.

Clang generates "__asan_report_[load,store]N*" instead of "__asan_report_[load,store]_n*" only because of this idiocy:

	const std::string SuffixStr = CompileKernel ? "N" : "_n";

See https://github.com/llvm-mirror/llvm/blob/ca19eaabd75f55865efd321b7a6f1d4ba3db8bc8/lib/Transforms/Instrumentation/AddressSanitizer.cpp#L2250

Note that SuffixStr is used *only* for __asan_report_* callbacks, which makes no sense because we never ever had __asan_report* callbacks with "N" suffix. So I think that you should just fix the llvm here. And there is probably one more "if (CompileKernel)" crap in runOnModule() which breaks globals instrumentation.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* David Woodhouse wrote:

> On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote:
> >
> > BTW., the reason this is enabled on all distro kernels is because the overhead
> > is a single patched-in NOP instruction in the function epilogue, when tracing
> > is disabled. So it's not even a CALL+RET - it's a patched in NOP.
>
> Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to get
> to the point where we can patch *that* out into a NOP... or are you saying we
> already can?

Yes, we already can and do patch the 'call __fentry__/ mcount' call site into a NOP today - all 50,000+ call sites on a typical distro kernel. We did so for a long time - this is all a well established, working mechanism.

> But this is a digression. I was being pedantic about the "0 cycles" but sure,
> this would be perfectly tolerable.

It's not a digression in two ways:

 - I wanted to make it clear that for distro kernels it _is_ a zero cycles overhead mechanism for non-SkyLake CPUs, literally.

 - I noticed that Meltdown and the CR3 writes for PTI appears to have established a kind of ... insensitivity and numbness to kernel micro-costs, which peaked with the per-syscall MSR write nonsense patch of the SkyLake workaround. That attitude is totally unacceptable to me as x86 maintainer and yes, still every cycle counts.

Thanks,

	Ingo
Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
On Tue, 23 Jan 2018 11:26:58 +0100, Wolfgang Bumiller wrote: > Even if you know the netnsid, do the mentioned watches work for > nested/child namespaces if eg. a container creates new namespace before > and/or after the watch was established and moves interfaces to these > child namespaces, would you just see them disappear, or can you keep > track of them later on as well? What do you mean by "nested namespaces"? There's no such thing for net name spaces. As for missing API to get netnsid of the netns the interface is moved to, see my previous emails in this thread. This needs to be added. > Even if that works, from what the documentation tells me netlink is an > unreliable protocol, so if my watcher's socket buffer is full, wouldn't > I be losing important tracking information? Sure. But that's fundamentally unfixable independently on netlink, the kernel needs to take an action if a program is not reading its messages. Either some messages get dropped or the program is killed or infinite amount of memory is consumed. This has nothing to do with uAPI design. > I think one possible solution to tracking interfaces would be to have a > unique identifier that never changes (even if it's just a simple > uint64_t incremented whenever an interface is created). But since > they're not local to the current namespace that may require a lot of > extra permission checks (but I'm just speculating here...). You'll get a hard NACK from CRIU folks if you try to propose this. > In any case, IFLA_NET_NS_FD/PID are already there and I had been > wondering previously why they couldn't be used with RTM_GETLINK, it > would just make sense. Those predate netnsids and we can't get rid of them now, since they're part of uAPI. But we can (and should) make sure we don't add more of those. Jiri
[PATCH] staging: comedi: dt2811: remove redundant initialization of 'ns'
From: Colin Ian King

Variable ns is being initialized with a value that is never read,
ns is being re-assigned a new value later on. Remove the redundant
initialization.

Cleans up clang warning:
drivers/staging/comedi/drivers/dt2811.c:310:21: warning: Value stored to 'ns' during its initialization is never read

Signed-off-by: Colin Ian King
---
 drivers/staging/comedi/drivers/dt2811.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/dt2811.c b/drivers/staging/comedi/drivers/dt2811.c
index fea0a1baf10b..05207a519755 100644
--- a/drivers/staging/comedi/drivers/dt2811.c
+++ b/drivers/staging/comedi/drivers/dt2811.c
@@ -307,7 +307,7 @@ static int dt2811_ai_cmd(struct comedi_device *dev,
 static unsigned int dt2811_ns_to_timer(unsigned int *nanosec,
 				       unsigned int flags)
 {
-	unsigned long long ns = *nanosec;
+	unsigned long long ns;
 	unsigned int ns_lo = COMEDI_MIN_SPEED;
 	unsigned int ns_hi = 0;
 	unsigned int divisor_hi = 0;
-- 
2.15.1
Re: [PATCH v5] devres: combine function devm_ioremap*
On 2018/1/23 16:42, Greg KH wrote:
> On Tue, Jan 16, 2018 at 08:03:41PM +0800, Yisheng Xie wrote:
>> When I tried to use devm_ioremap function and review related
>> code, I found devm_ioremap_* almost have the similar realize
>> with each other, which can be combined.
>>
>> In the former version, I have tried to kill ioremap_cache to
>> reduce the size of devres, which can not work for ioremap is
>> not the same as ioremap_nocache in some ARCHs likes ia64.
>> Therefore, as the suggestion of Christophe, I introduce a help
>> function __devm_ioremap, let devm_ioremap* inline and call
>> __devm_ioremap with different devm_ioremap_type.
>>
>> After apply the patch, the size of devres.o can be reduce from
>> 8216 Bytes to 7352Bytes in my compile environment.
>>
>> Suggested-by: Christophe LEROY
>> Signed-off-by: Yisheng Xie
>> ---
>> v2:
>> - use MARCO for ioremap
>> v3:
>> - kill dev_ioremap_nocache
>> v4:
>> - combine function devm_ioremap*
>> v5:
>> - fix code style.
>>
>>  include/linux/io.h | 61 +++
>>  lib/devres.c       | 84 ++
>>  2 files changed, 70 insertions(+), 75 deletions(-)
>>
>> diff --git a/include/linux/io.h b/include/linux/io.h
>> index 32e30e8..4d0a640 100644
>> --- a/include/linux/io.h
>> +++ b/include/linux/io.h
>> @@ -73,12 +73,61 @@ static inline void devm_ioport_unmap(struct device *dev, void __iomem *addr)
>>
>>  #define IOMEM_ERR_PTR(err) (__force void __iomem *)ERR_PTR(err)
>>
>> -void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
>> -			   resource_size_t size);
>> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t offset,
>> -				   resource_size_t size);
>> -void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
>> -			      resource_size_t size);
>> +enum devm_ioremap_type {
>> +	DEVM_IOREMAP = 0,
>> +	DEVM_IOREMAP_NC,
>> +	DEVM_IOREMAP_WC,
>> +};
>
> Why do these types need to be in a public .h file?
>
> Why not just keep the .h file as-is and then just put the cleanup in the
> .c file like you did?
>

Right. I was just trying to inline these functions. Anyway, I will follow
your suggestion. Sorry for sending so many versions.

Thanks
Yisheng

> thanks,
>
> greg k-h
>
> .
>
Re: [PATCH 1/4] dmaengine: qcom: bam_dma: make bam clk optional
On Mon, Jan 22, 2018 at 09:55:01AM +, Srinivas Kandagatla wrote: > >>@@ -1180,13 +1180,14 @@ static int bam_dma_probe(struct platform_device > >>*pdev) > >>"qcom,controlled-remotely"); > >>bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk"); > > > >but you still do clk_get unconditionally? > > Only reason to do this way is to not break existing users in the mainline. > > remotely controlled BAM is already supported in upstream driver, there are > users of this who pass clk from device tree, If I make this conditional then > subsequent reads to the BAM registers for those instances might crash the > system. But these instances are remote controlled, so if we stop representing them in Linux, why would we read them? -- ~Vinod
Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator
Hello Lothar,

On Tue, 23 Jan 2018 09:04:14 +0100, Lothar Waßmann wrote:

> Hi,
>
> On Mon, 22 Jan 2018 09:42:08 -0800 Dmitry Torokhov wrote:
> > Hi Mylène,
> >
> > On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand wrote:
> > > Add the support of regulator to use it as VCC source.
> > >
> > > Signed-off-by: Mylène Josserand
> > > ---
> > >  .../bindings/input/touchscreen/edt-ft5x06.txt | 1 +
> > >  drivers/input/touchscreen/edt-ft5x06.c        | 33 ++
> > >  2 files changed, 34 insertions(+)
> > >
> > > diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > index 025cf8c9324a..48e975b9c1aa 100644
> > > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > > @@ -30,6 +30,7 @@ Required properties:
> > >  Optional properties:
> > >   - reset-gpios: GPIO specification for the RESET input
> > >   - wake-gpios: GPIO specification for the WAKE input
> > > + - vcc-supply: Regulator that supplies the touchscreen
> > >
> > >   - pinctrl-names: should be "default"
> > >   - pinctrl-0: a phandle pointing to the pin settings for the
> > > diff --git a/drivers/input/touchscreen/edt-ft5x06.c b/drivers/input/touchscreen/edt-ft5x06.c
> > > index c53a3d7239e7..5ee14a25a382 100644
> > > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > > @@ -39,6 +39,7 @@
> > >  #include
> > >  #include
> > >  #include
> > > +#include
> > >
> > >  #define WORK_REGISTER_THRESHOLD		0x00
> > >  #define WORK_REGISTER_REPORT_RATE	0x08
> > > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> > >  	struct touchscreen_properties prop;
> > >  	u16 num_x;
> > >  	u16 num_y;
> > > +	struct regulator *vcc;
> > >
> > >  	struct gpio_desc *reset_gpio;
> > >  	struct gpio_desc *wake_gpio;
> > > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client *client,
> > >
> > >  	tsdata->max_support_points = chip_data->max_support_points;
> > >
> > > +	tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > > +	if (IS_ERR(tsdata->vcc)) {
> > > +		error = PTR_ERR(tsdata->vcc);
> > > +		dev_err(&client->dev, "failed to request regulator: %d\n",
> > > +			error);
> > >
> I would check for -EPROBE_DEFER here and omit the error message in this
> case.
>
>
> Lothar Waßmann

Sure, I will add this case, thank you for the review.

Best regards,

-- 
Mylène Josserand, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
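The review comment above (check for -EPROBE_DEFER and stay quiet in that case) can be sketched in plain userspace C. This is only an illustration under stated assumptions: report_regulator_error() is a hypothetical helper that does not exist in the driver, and EPROBE_DEFER is defined locally with the value 517 on the assumption that it matches the kernel-internal errno, which userspace headers do not export.

```c
#include <stdio.h>

/* Assumption: mirrors the kernel-internal errno value for deferred probing. */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper, not part of edt-ft5x06.c.
 * Returns 0 without printing when the probe is only deferred,
 * and returns 1 after printing a message for any other error.
 */
static int report_regulator_error(int error)
{
	if (error == -EPROBE_DEFER)
		return 0;

	fprintf(stderr, "failed to request regulator: %d\n", error);
	return 1;
}
```

In a driver the same test would presumably sit in front of the dev_err() call, with the error code still returned to the caller in both cases so that the probe can be retried later.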
Re: [PATCH v1 1/4] seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro
On Mon, 22 Jan 2018, Andy Shevchenko wrote:

> The DEFINE_SHOW_ATTRIBUTE() helper macro would be useful for current
> users, which are many of them, and for new comers to decrease code
> duplication.
>
> Signed-off-by: Andy Shevchenko
> ---
>  drivers/mfd/ab8500-debugfs.c    | 14 --

Acked-by: Lee Jones

>  drivers/platform/x86/pmc_atom.c | 14 --
>  include/linux/seq_file.h        | 14 ++
>  net/bluetooth/hci_debugfs.c     | 13 -
>  4 files changed, 14 insertions(+), 41 deletions(-)

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
Re: [PATCH 1/4] dmaengine: qcom: bam_dma: make bam clk optional
On 23/01/18 09:19, Vinod Koul wrote:
> On Mon, Jan 22, 2018 at 09:55:01AM +, Srinivas Kandagatla wrote:
>>>> @@ -1180,13 +1180,14 @@ static int bam_dma_probe(struct platform_device *pdev)
>>>>  						"qcom,controlled-remotely");
>>>>  	bdev->bamclk = devm_clk_get(bdev->dev, "bam_clk");
>>>
>>> but you still do clk_get unconditionally?
>>
>> Only reason to do this way is to not break existing users in the mainline.
>>
>> remotely controlled BAM is already supported in upstream driver, there are
>> users of this who pass clk from device tree, If I make this conditional then
>> subsequent reads to the BAM registers for those instances might crash the
>> system.
>
> But these instances are remote controlled, so if we stop representing them in
> Linux, why would we read them?

Plan is that we would transition those users once we get these
bindings/changes in.

Currently I don't have access to any of those devices so I made the changes
safe, such that it does not break devices on mainline.

--srini
Re: [PATCH] PCI: qcom: add missing supplies required for msm8996
Hey Srini,

As there are no comments I'd propose to change the endpoint supplies to
more generic names.

On 12/08/2017 11:20 AM, srinivas.kandaga...@linaro.org wrote:
> From: Srinivas Kandagatla
>
> This patch adds supplies that are required for msm8996. Two of them vdda
> and vdda-1p8 are analog supplies that go in to controller, and the rest
> of the two vddpe's are supplies to PCIe endpoints.
>
> Without these supplies PCIe endpoints which require power supplies are
> not enumerated at all, as there is no one to power it up.
>
> Signed-off-by: Srinivas Kandagatla
> ---
>  .../devicetree/bindings/pci/qcom,pcie.txt | 16 +
>  drivers/pci/dwc/pcie-qcom.c               | 28 --
>  2 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/pci/qcom,pcie.txt b/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> index 3c9d321b3d3b..045102cb3e12 100644
> --- a/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> +++ b/Documentation/devicetree/bindings/pci/qcom,pcie.txt
> @@ -179,6 +179,11 @@
>  	Value type:
>  	Definition: A phandle to the core analog power supply
>
> +- vdda-1p8-supply:
> +	Usage: required for msm8996
> +	Value type:
> +	Definition: A phandle to the 1.8v analog power supply
> +

This should be dropped, because it is part of the phy.

>  - vdda_phy-supply:
>  	Usage: required for ipq/apq8064
>  	Value type:
> @@ -189,6 +194,15 @@
>  	Value type:
>  	Definition: A phandle to the analog power supply for IC which generates
>  		    reference clock
> +- vddpe-supply:
> +	Usage: optional
> +	Value type:
> +	Definition: A phandle to the PCIe endpoint power supply

vddpe_3v3-supply

> +
> +- vddpe1-supply:
> +	Usage: optional
> +	Value type:
> +	Definition: A phandle to the PCIe endpoint power supply 1

vddpe_1v5-supply

>
>  - phys:
>  	Usage: required for apq8084
> @@ -205,6 +219,8 @@
>  	Value type:
>  	Definition: List of phandle and GPIO specifier pairs. Should contain
>  			- "perst-gpios" PCIe endpoint reset signal line
> +			- "pe_en-gpios" PCIe endpoint enable signal line
> +			- "pe_en1-gpios" PCIe endpoint enable1 signal line

We don't need those gpios, the regulator driver will manipulate these
gpios when we call regulator_enable/disable.

-- 
regards,
Stan
[PATCH net 2/2] vhost: do not try to access device IOTLB when not initialized
The code will try to access dev->iotlb when processing
VHOST_IOTLB_INVALIDATE even if it was not initialized, which may lead
to a NULL pointer dereference. Fix this by checking dev->iotlb before
using it.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang
---
 drivers/vhost/vhost.c | 4
 1 file changed, 4 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 549771a..5727b18 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1015,6 +1015,10 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		vhost_iotlb_notify_vq(dev, msg);
 		break;
 	case VHOST_IOTLB_INVALIDATE:
+		if (!dev->iotlb) {
+			ret = -EFAULT;
+			break;
+		}
 		vhost_vq_meta_reset(dev);
 		vhost_del_umem_range(dev->iotlb, msg->iova,
 				     msg->iova + msg->size - 1);
-- 
2.7.4
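The guard in this patch can be illustrated with a small userspace model. This is a sketch only: struct mock_dev, struct mock_iotlb and mock_invalidate() are hypothetical stand-ins for the vhost structures and are not kernel API; the point is simply that the pointer is tested before anything dereferences it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vhost device and its IOTLB. */
struct mock_iotlb {
	int entries;
};

struct mock_dev {
	struct mock_iotlb *iotlb;	/* NULL until the IOTLB is initialized */
};

/*
 * Models the VHOST_IOTLB_INVALIDATE branch: refuse the request with
 * -EFAULT when no IOTLB exists, otherwise drop all cached entries.
 */
static int mock_invalidate(struct mock_dev *dev)
{
	if (!dev->iotlb)
		return -EFAULT;

	dev->iotlb->entries = 0;
	return 0;
}
```

Returning an error instead of silently succeeding mirrors the choice made in the patch, where ret is set to -EFAULT before breaking out of the switch.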
Re: [PATCH] ACPI / tables: Add IORT to injectable table list
Hi, All

Sorry, please ignore this patch. Please help to review v2.
https://patchwork.kernel.org/patch/10179761/

Thanks
Shunyong

On Tue, 2018-01-23 at 16:06 +0800, Yang Shunyong wrote:
> This patch adds ACPI_SIG_PPTT to the table, which enables IORT from
> initrd to override which from firmware.
>
> Signed-off-by: Yang Shunyong
> Cc: yutang2.ji...@hxt-semitech.com
> Cc: yu.zh...@hxt-semitech.com
> ---
>  drivers/acpi/tables.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
> index 80ce2a7d224b..7bcb66f3 100644
> --- a/drivers/acpi/tables.c
> +++ b/drivers/acpi/tables.c
> @@ -456,7 +456,8 @@ static u8 __init acpi_table_checksum(u8 *buffer, u32 length)
>  	ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
>  	ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
>  	ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
> -	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
> +	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
> +	NULL };
>
>  #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
>
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
* Ingo Molnar wrote:

> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could run?
> Is there a description of the exact speculative execution vulnerability that has
> to be addressed to begin with?

Ok, so for now I'm assuming that this is the 16 entries return-stack-buffer
underflow condition where SkyLake falls back to the branch predictor (while other
CPUs wrap the buffer).

> If this approach is workable I'd much prefer it to any MSR writes in the syscall
> entry path not just because it's fast enough in practice to not be turned off by
> everyone, but also because everyone would agree that per function call overhead
> needs to go away on new CPUs. Both deployment and backporting is also _much_ more
> flexible, simpler, faster and more complete than microcode/firmware or compiler
> based solutions.
>
> Assuming the vulnerability can be addressed via this route that is, which is a big
> assumption!

So I talked this over with PeterZ, and I think it's all doable:

 - the CALL __fentry__ callbacks maintain the depth tracking (on the kernel
   stack, fast to access), and issue an "RSB-stuffing sequence" when depth reaches
   16 entries.

 - "the RSB-stuffing sequence" is a return trampoline that pushes a CALL on the
   stack which is executed on the RET.

 - All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. (The
   tracking could probably made IRQ and maybe even NMI safe, but the worst-case
   nesting scenarios make my head ache.)

I.e. IBRS can be mostly replaced with a kernel based solution that is better than
IBRS and which does not negatively impact any other non-SkyLake CPUs or general
code quality.

I.e. a full upstream Spectre solution.

Thanks,

	Ingo
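The depth-tracking idea in the first bullet above can be modeled as a counter that triggers an action once it reaches 16. The following is a toy userspace sketch, not kernel code: struct rsb_tracker and on_function_entry() are hypothetical names, the "stuffing" is only counted rather than performed, and returns (which would lower the depth) are deliberately ignored to keep the model small.

```c
/* Depth at which the message says the stuffing sequence is issued. */
#define RSB_DEPTH 16

/* Hypothetical per-context state; the message keeps it on the kernel stack. */
struct rsb_tracker {
	unsigned int depth;	/* calls seen since the last stuffing */
	unsigned int stuffs;	/* how many times stuffing was triggered */
};

/*
 * Stand-in for the __fentry__ callback: count the call and, once the
 * tracked depth reaches RSB_DEPTH, record a stuffing event and restart.
 */
static void on_function_entry(struct rsb_tracker *t)
{
	if (++t->depth >= RSB_DEPTH) {
		t->stuffs++;	/* placeholder for the RSB-stuffing sequence */
		t->depth = 0;
	}
}
```

With this model, fifteen consecutive entries leave the tracker untouched apart from the counter, and the sixteenth triggers exactly one stuffing event. Whether resetting to zero is the right policy after stuffing is an assumption of the sketch, not something the message specifies.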
Re: [PATCH 2/3] power: supply: add cros-ec USB PD charger driver.
On Wed, 17 Jan 2018, Enric Balletbo i Serra wrote:

> From: Sameer Nanda
>
> This driver gets various bits of information about what is connected to
> USB PD ports from the EC and converts that into power_supply properties.
>
> Signed-off-by: Sameer Nanda
> Signed-off-by: Enric Balletbo i Serra
> ---
>  drivers/power/supply/Kconfig              | 11 +
>  drivers/power/supply/Makefile             | 1 +
>  drivers/power/supply/cros_usbpd-charger.c | 953 ++
>  include/linux/mfd/cros_ec.h               | 3 +

Acked-by: Lee Jones

>  4 files changed, 968 insertions(+)
>  create mode 100644 drivers/power/supply/cros_usbpd-charger.c

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
Re: [Nouveau] [PATCH] drm/nouveau/mmu: Fix trailing semicolon
Reviewed-by: Karol Herbst

On Wed, Jan 17, 2018 at 7:53 PM, Luis de Bethencourt wrote:
> The trailing semicolon is an empty statement that does no operation.
> Removing it since it doesn't do anything.
>
> Signed-off-by: Luis de Bethencourt
> ---
>
> Hi,
>
> After fixing the same thing in drivers/staging/rtl8723bs/, Joe Perches
> suggested I fix it treewide [0].
>
> Best regards
> Luis
>
>
> [0] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115410.html
> [1] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-January/115390.html
>
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> index e35d3e17cd7c..93946dcee319 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
> @@ -642,7 +642,7 @@ nvkm_vmm_ptes_sparse(struct nvkm_vmm *vmm, u64 addr, u64 size, bool ref)
>  			else
>  				block = (size >> page[i].shift) << page[i].shift;
>  		} else {
> -			block = (size >> page[i].shift) << page[i].shift;;
> +			block = (size >> page[i].shift) << page[i].shift;
>  		}
>
>  		/* Perform operation. */
> --
> 2.15.1
>
> ___
> Nouveau mailing list
> nouv...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [PATCH] drm/bridge/synopsys: dsi: Adopt SPDX identifiers
Hi Laurent,

A big *thank you* for your review.

On 01/23/2018 12:30 AM, Laurent Pinchart wrote:
> Hi Philippe,
>
> Thank you for the patch.
>
> On Monday, 22 January 2018 12:26:08 EET Philippe Cornu wrote:
>> Add SPDX identifiers to the Synopsys DesignWare MIPI DSI
>> host controller driver.
>>
>> Signed-off-by: Philippe Cornu
>> ---
>>  drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 6 +-
>>  1 file changed, 1 insertion(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> index 46b0e73404d1..e06836dec77c 100644
>> --- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> +++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
>> @@ -1,12 +1,8 @@
>> +// SPDX-License-Identifier: GPL-2.0
>
> According to Documentation/process/license-rules.txt this would change the
> existing license. The correct identifier is GPL-2.0+.
>

You are right, I did not put the correct identifier :(

After reading more spdx.org, I wonder if the correct value should be
GPL-2.0-or-later instead of GPL-2.0+

https://spdx.org/licenses/GPL-2.0-or-later.html
https://spdx.org/licenses/GPL-2.0+.html

What is your opinion?

Many thanks,
Philippe :-)

>>  /*
>>   * Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd
>>   * Copyright (C) STMicroelectronics SA 2017
>>   *
>> - * This program is free software; you can redistribute it and/or modify
>> - * it under the terms of the GNU General Public License as published by
>> - * the Free Software Foundation; either version 2 of the License, or
>> - * (at your option) any later version.
>> - *
>>   * Modified by Philippe Cornu
>>   * This generic Synopsys DesignWare MIPI DSI host driver is based on the
>>   * Rockchip version from rockchip/dw-mipi-dsi.c with phy & bridge APIs.
>
>
Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
On Tue, Jan 23, 2018 at 10:30:09AM +0100, Jiri Benc wrote: > On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote: > > This is not necessarily true in scenarios where I move a network device > > via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't > > created. Here is an example: > > > > nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; > > nlmsghdr->nlmsg_type = RTM_NEWLINK; > > /* move to network namespace of pid */ > > nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid) > > /* give interface new name */ > > nla_put_string(nlmsg, IFLA_IFNAME, ifname) > > > > The only thing I have is the pid that identifies the network namespace. > > How do you know the interface did not get renamed in the new netns? > > This is racy and won't work reliably. You really need to know the > netnsid before moving the interface to the netns to be able to do > meaningful queries. Even if you know the netnsid, do the mentioned watches work for nested/child namespaces if eg. a container creates new namespace before and/or after the watch was established and moves interfaces to these child namespaces, would you just see them disappear, or can you keep track of them later on as well? Even if that works, from what the documentation tells me netlink is an unreliable protocol, so if my watcher's socket buffer is full, wouldn't I be losing important tracking information? I think one possible solution to tracking interfaces would be to have a unique identifier that never changes (even if it's just a simple uint64_t incremented whenever an interface is created). But since they're not local to the current namespace that may require a lot of extra permission checks (but I'm just speculating here...). In any case, IFLA_NET_NS_FD/PID are already there and I had been wondering previously why they couldn't be used with RTM_GETLINK, it would just make sense.
Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation
On Tue, 2018-01-23 at 11:15 +0100, Ingo Molnar wrote: > > BTW., the reason this is enabled on all distro kernels is because the > overhead is > a single patched-in NOP instruction in the function epilogue, when tracing is > disabled. So it's not even a CALL+RET - it's a patched in NOP. Hm? We still have GCC emitting 'call __fentry__' don't we? Would be nice to get to the point where we can patch *that* out into a NOP... or are you saying we already can? But this is a digression. I was being pedantic about the "0 cycles" but sure, this would be perfectly tolerable.
Re: [PATCH] PCI: qcom: add missing supplies required for msm8996
On 23/01/18 10:14, Stanimir Varbanov wrote: Hi, On 01/23/2018 11:46 AM, Srinivas Kandagatla wrote: On 23/01/18 09:23, Stanimir Varbanov wrote: Hey Srini, As there are no comments I'd propose to change the endpoint supplies to more generic names. Sure, I will respin this with your suggestions, except the 3v3 and 1v5 suffix due to the reasons below: +- vdda-1p8-supply: +Usage: required for msm8996 +Value type: +Definition: A phandle to the 1.8v analog power supply + This should be dropped, because it is part of the phy. Yep. - vdda_phy-supply: Usage: required for ipq/apq8064 Value type: @@ -189,6 +194,15 @@ Value type: Definition: A phandle to the analog power supply for IC which generates reference clock +- vddpe-supply: +Usage: optional +Value type: +Definition: A phandle to the PCIe endpoint power supply vddpe_3v3-supply Why do we need suffix here? AFAIU, It does not add any value, instead it would confuse the users. vddpe and vddpe1 is already confusing as well. I partly agree with you. How would you represent if there are two power 3v3 supplies for the endpoint? Lets imagine that powering up the endpointX needs some specific sequence between 3v3 and 1v5 and endpointY (which could be connected on the same PCIe lane) has different power sequence, how we would handle that in the qcom pcie host driver? power sequencing is all together a different issue, that is not addressed in this patch. Am hoping that this will be fixed as part of making pwrseq interface more generic. Not sure where it endedup now!! --srini These are power supplies for endpoint which could be of any voltage. In I don't think that could be any values see PCIe mini card electromechanical specification. There on the connector are provided 3v3 and 1v5. this case both endpoint supplies are 3v3, these could be 1.8 or 5v or 12v in some other cases. If we see hw designs with 5v and 12v we could extend the binding and the driver with support for them. 
I want to be exact in the names and voltages in the driver and bindings.
Re: unixbench context switch perfomance & cpu topology
2018-01-22 21:37 GMT+08:00 Mike Galbraith:
> On Mon, 2018-01-22 at 20:27 +0800, Wanpeng Li wrote:
>> 2018-01-22 20:08 GMT+08:00 Mike Galbraith :
>> > On Mon, 2018-01-22 at 19:47 +0800, Wanpeng Li wrote:
>> >> Hi all,
>> >>
>> >> We can observe unixbench context switch performance is heavily
>> >> influenced by cpu topology which is exposed to the guest. the score is
>> >> posted below, bigger is better, both the guest and the host kernel are
>> >> 3.15-rc3(we can also reproduce against centos 7.4 693 guest/host), LLC
>> >> is exposed to the guest, kvm adaptive halt-polling is default enabled,
>> >> then start a guest w/ 8 logical cpus.
>> >>
>> >>                                          unixbench context switch
>> >> -smp 8, sockets=8, cores=1, threads=1    382036
>> >> -smp 8, sockets=4, cores=2, threads=1    132480
>> >> -smp 8, sockets=2, cores=4, threads=1    128032
>> >> -smp 8, sockets=2, cores=2, threads=2    131767
>> >> -smp 8, sockets=1, cores=4, threads=2    132742
>> >> -smp 8, sockets=1, cores=4, threads=2    331471
>> >>   (guest w/ nohz=off idle=poll)
>> >>
>> >> I can observe there are a lot of reschedule IPIs sent from one vCPU to
>> >> another vCPU, the context switch workload switches between running and
>> >> idle frequently which results in HLT instruction in the idle path, I
>> >> use idle=poll to avoid vmexit due to HLT and to avoid reschedule IPIs
>> >> since idle task checks TIF_NEED_RESCHED flags in a loop, nohz=off can
>> >> stop to program lapic timer/other nohz stuffs. Any idea why sockets=8
>> >> can get best performance?
>> >
>> > Probably because with that topology, there is no shared llc, thus no
>> > cross-core scheduling, micro-benchmark waker/wakee are stacked. If
>> > your benchmark does nothing but schedule, stacking makes beautiful (but
>> > utterly meaningless) numbers.
>>
>> The waker and wakee are just sporadic on the same logical cpu in the
>> guest(-smp 8, sockets=8, cores=1, threads=1) during the testing, in
>> addition, binding the waker/wakee to one logical cpu in the guest(-smp
>> 8, sockets=1, cores=4, threads=2) also can get the performance as
>> better as 8 sockets setup.
>
> Here, with tip.today and that topology, context1 does stack up on one core.
>
>   PID USER PR NI VIRT RES SHR S  %CPU  %MEM   TIME+ P COMMAND
>  4218 root 20  0 4048 808 732 R 52.16 0.022 0:12.77 4 context1
>  4219 root 20  0 4048  80   0 S 47.18 0.002 0:11.96 4 context1
>
> There's a bit of bouncing, but the two stack right back up. But
> whatever, what Peter said, the benchmark should pin itself to do this.

Thanks for having a try, Mike. :) Actually the two context1 tasks
don't stack up on one logical cpu at the most of time which is
observed by kernelshark. Do you have any idea why there is 4.5 times
RESCHED IPIs which is mentioned in another reply for this thread?

Regards,
Wanpeng Li
[PATCH 2/2] x86/microcode: Fix again accessing initrd after having been freed
From: Borislav Petkov

Commit

  24c2503255d3 ("x86/microcode: Do not access the initrd after it has been freed")

fixed attempts to access initrd from the microcode loader after it has
been freed. However, a similar KASAN warning was reported (stack trace
edited):

  smpboot: Booting Node 0 Processor 1 APIC 0x11
  ==
  BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
  Read of size 1 at addr 880035ffd000 by task swapper/1/0

  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7
  Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016
  Call Trace:
   dump_stack
   print_address_description
   kasan_report
   ? find_cpio_data
   __asan_report_load1_noabort
   find_cpio_data
   find_microcode_in_initrd
   __load_ucode_amd
   load_ucode_amd_ap
   load_ucode_ap

After some investigation, it turned out that a merge was done using the
wrong side to resolve, leading to picking up the previous state, before
the 24c2503255d3 fix. Therefore the Fixes tag below contains a merge
commit.

Revert the mismerge by catching the save_microcode_in_initrd_amd()
retval and thus letting the function exit with the last return statement
so that initrd_gone can be set to true.

Reported-by:
Cc: # 4.11
Fixes: f26483eaedec ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295
Signed-off-by: Borislav Petkov
---
 arch/x86/kernel/cpu/microcode/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a85d4cb..e4fc595cd6ea 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -239,7 +239,7 @@ static int __init save_microcode_in_initrd(void)
 		break;
 	case X86_VENDOR_AMD:
 		if (c->x86 >= 0x10)
-			return save_microcode_in_initrd_amd(cpuid_eax(1));
+			ret = save_microcode_in_initrd_amd(cpuid_eax(1));
 		break;
 	default:
 		break;
-- 
2.13.0
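The difference between the mismerged and the fixed control flow can be shown with a reduced userspace model. This is a sketch under assumptions: struct loader_state, save_early_return() and save_fall_through() are hypothetical names, and the default return value of -EINVAL is assumed rather than taken from the patch. Only the shape of the two variants follows the diff above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the loader's global initrd_gone flag. */
struct loader_state {
	bool initrd_gone;
};

/* Mismerged shape: the AMD branch returns early and never sets the flag. */
static int save_early_return(struct loader_state *st, bool is_amd, int amd_result)
{
	int ret = -EINVAL;	/* assumed default, for illustration only */

	if (is_amd)
		return amd_result;

	st->initrd_gone = true;
	return ret;
}

/* Fixed shape: the result is stored and the function falls through. */
static int save_fall_through(struct loader_state *st, bool is_amd, int amd_result)
{
	int ret = -EINVAL;	/* assumed default, for illustration only */

	if (is_amd)
		ret = amd_result;

	st->initrd_gone = true;
	return ret;
}
```

Both variants hand back the same value on the AMD path; only the second one leaves the flag set, which is what later code relies on to stop touching the freed initrd.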
[PATCH 1/2] x86/microcode/intel: Extend BDW late-loading further with LLC size check
From: Jia Zhang

The commit b94b73733171 ("x86/microcode/intel: Extend BDW late-loading
with a revision check") reduced the impact of erratum BDF90 for
Broadwell model 79.

The impact can be reduced further by checking the size of the last level
cache portion per core.

Tony: "The erratum says the problem only occurs on the large-cache SKUs.
So we only need to avoid the update if we are on a big cache SKU that is
also running old microcode."

For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.

Signed-off-by: Jia Zhang
Acked-by: Tony Luck
Cc: "h...@hmh.eng.br"
Cc: x86-ml
Cc: # v4.14
Link: http://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang@linux.alibaba.com
Signed-off-by: Borislav Petkov
---
 arch/x86/kernel/cpu/microcode/intel.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index d9e460fc7a3b..f7c55b0e753a 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -45,6 +45,9 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *intel_ucode_patch;
 
+/* last level cache size per core */
+static int llc_size_per_core;
+
 static inline bool cpu_signatures_match(unsigned int s1, unsigned int p1,
 					unsigned int s2, unsigned int p2)
 {
@@ -912,12 +915,14 @@ static bool is_blacklisted(unsigned int cpu)
 
 	/*
 	 * Late loading on model 79 with microcode revision less than 0x0b21
-	 * may result in a system hang. This behavior is documented in item
-	 * BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family).
+	 * and LLC size per core bigger than 2.5MB may result in a system hang.
+	 * This behavior is documented in item BDF90, #334165 (Intel Xeon
+	 * Processor E7-8800/4800 v4 Product Family).
 	 */
 	if (c->x86 == 6 &&
 	    c->x86_model == INTEL_FAM6_BROADWELL_X &&
 	    c->x86_mask == 0x01 &&
+	    llc_size_per_core > 2621440 &&
 	    c->microcode < 0x0b21) {
 		pr_err_once("Erratum BDF90: late loading with revision < 0x0b21 (0x%x) disabled.\n", c->microcode);
 		pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
@@ -975,6 +980,15 @@ static struct microcode_ops microcode_intel_ops = {
 	.apply_microcode = apply_microcode_intel,
 };
 
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
+{
+	u64 llc_size = c->x86_cache_size * 1024;
+
+	do_div(llc_size, c->x86_max_cores);
+
+	return (int)llc_size;
+}
+
 struct microcode_ops * __init init_intel_microcode(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
@@ -985,5 +999,7 @@ struct microcode_ops * __init init_intel_microcode(void)
 		return NULL;
 	}
 
+	llc_size_per_core = calc_llc_size_per_core(c);
+
 	return &microcode_intel_ops;
 }
-- 
2.13.0
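Editorial sketch, not part of the patch: the threshold 2621440 in the check above is 2.5 MB expressed in bytes (2.5 * 1024 * 1024). The user-space C below mirrors only the arithmetic of the per-core LLC check. The names llc_bytes_per_core() and bdf90_blocks_late_load() are hypothetical, the zero-core guard is an addition for the sketch, and the family/model/stepping tests from is_blacklisted() are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2.5 MB in bytes: 2.5 * 1024 * 1024 = 2621440, the value used in the patch. */
#define BDF90_LLC_THRESHOLD_BYTES 2621440

/*
 * Hypothetical stand-in for calc_llc_size_per_core(): cache_size_kb plays
 * the role of c->x86_cache_size (in KB), max_cores of c->x86_max_cores.
 * The multiplication is done in 64 bits so it cannot wrap.
 */
static int64_t llc_bytes_per_core(unsigned int cache_size_kb,
                                  unsigned int max_cores)
{
    if (max_cores == 0)
        return 0; /* guard added for this sketch only */

    return (int64_t)cache_size_kb * 1024 / max_cores;
}

/*
 * Only the two conditions this patch combines: a per-core LLC share
 * strictly above 2.5 MB and a microcode revision below 0x0b21.
 */
static bool bdf90_blocks_late_load(unsigned int cache_size_kb,
                                   unsigned int max_cores,
                                   unsigned int microcode_rev)
{
    return llc_bytes_per_core(cache_size_kb, max_cores) >
               BDF90_LLC_THRESHOLD_BYTES &&
           microcode_rev < 0x0b21;
}
```

With made-up inputs, a 25600 KB cache shared by 10 cores gives exactly 2621440 bytes per core, so the strict comparison does not trigger; a 35840 KB cache on 10 cores gives 3670016 bytes per core and does trigger for revisions below 0x0b21.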
Re: [RFC] Add ability to multiplex SPI bus
On Mon, Jan 22, 2018 at 10:51:11PM +, Ben Whitten wrote:

> A chip that I am working on acts as an SPI multiplexer for downstream radios,
> this patch adds basic support for adding an SPI mux with DT.

Please don't send cover letters for single patches, if there is anything
that needs saying put it in the changelog of the patch or after the ---
if it's administrative stuff. This reduces mail volume and ensures that
any important information is recorded in the changelog rather than being
lost.
Re: [PATCH 05/12] arm64: dts: mt7622: add PMIC MT6380 related nodes
Sean,

Sorry for the late reply, and thank you for this research.

On Fri, Jan 12, 2018 at 4:33 AM, Sean Wang wrote:
> Currently, I'm really confused about what usage STYLE of SPDX license
> identifier I should use for each type of file.
>
> could you point me where I can find the related document describing SPDX
> usage style for those files expected by the community in the future?

The doc is in this patchset [1]

[1] https://lkml.org/lkml/2017/12/28/326

> I found more than one way STYLE of SPDX present at current code, for
> example as below. If there's no absolute definition for them, and then
> which way that is better?

> 1)
> for *.dts, applied with "// " at head or within " /* */ " not at head
> such as
>
> arch/arm/boot/dts/bcm953012hr.dts:2: * SPDX-License-Identifier: BSD-3-Clause

This is a "style bug". The comment style for .dts should be //

> 2)
> for *.c, applied with "// " at head or within " /* */ " not at head
> such as
> drivers/soc/xilinx/zynqmp/pm.c:10: * SPDX-License-Identifier: GPL-2.0+

This is a "style bug". The comment style for .c should be //

> 3)
> for *.h, applied with "// " at head or within " /* */ " at head
> such as
> drivers/usb/dwc3/gadget.h:1:// SPDX-License-Identifier: GPL-2.0

This is a "style bug". The comment style for .h should be /**/

> 4)
> no issue, Makefile, or Kconfig, definitely applied with "# " at head

That's the correct way.

So the net-net is that these "style bugs" should be fixed.

-- 
Cordially
Philippe Ombredanne
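Editorial sketch, not part of the thread: the per-file-type rules stated in the reply above can be summarised as a small lookup. The function spdx_line_template() is hypothetical and covers only the file types named in this exchange (.c, .h, .dts, Makefile, Kconfig); anything else returns NULL instead of guessing. The file itself starts with the identifier in the style the reply gives for .c files.

```c
// SPDX-License-Identifier: GPL-2.0
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the SPDX line template that the reply
 * above describes for a given file name, or NULL for file types the
 * thread does not mention.
 */
static const char *spdx_line_template(const char *filename)
{
    const char *base = strrchr(filename, '/');
    const char *ext;

    base = base ? base + 1 : filename;

    /* Makefile and Kconfig: "# " comment at the head of the file. */
    if (strcmp(base, "Makefile") == 0 || strcmp(base, "Kconfig") == 0)
        return "# SPDX-License-Identifier: <expr>";

    ext = strrchr(base, '.');
    if (ext == NULL)
        return NULL;

    /* .c and .dts: "// " comment on the first line. */
    if (strcmp(ext, ".c") == 0 || strcmp(ext, ".dts") == 0)
        return "// SPDX-License-Identifier: <expr>";

    /* .h: block comment on the first line. */
    if (strcmp(ext, ".h") == 0)
        return "/* SPDX-License-Identifier: <expr> */";

    return NULL;
}
```

Applied to the three files flagged in the thread, the helper returns the "//" form for bcm953012hr.dts and pm.c and the "/* */" form for gadget.h, which is the style each "style bug" fix would move to.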
Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack
On 22.01.2018 23:51, Andy Lutomirski wrote:
> On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov wrote:
>> On 08.10.2017 12:16, Christoph Hellwig wrote:
>>> This looks fine in general, but a few comments:
>>>
>>> - can you split adding the new function from switching over the fork
>>>   code
>>
>> ok
>>
>>> - at least kasan and vmalloc_user/vmalloc_32_user use very similar
>>>   patterns, can you switch them over as well?
>>
>> I don't see why VM_USERMAP cannot be set right at allocation.
>>
>> I'll add vm_flags argument to __vmalloc_node() and pass here VM_USERMAP
>> from vmalloc_user/vmalloc_32_user in separate patch.
>>
>> KASAN is different: it allocates shadow area for area allocated for
>> module. Pointer to module area must be pushed from module_alloc().
>> This isn't worth optimization.
>>
>>> - the new __alloc_vm_area looks very different from alloc_vm_area,
>>>   maybe it needs a better name? vmalloc_range_area for example?
>>
>> __vmalloc_area() is vacant - this most low-level, so I'll keep "__".
>>
>>> - when you split an existing function please keep the more low-level
>>>   function on top of the higher level one that calls it.
>>
>> ok
>
> Did this ever get re-sent?

It seems not. Probably lost in race-condition with my vacation.

Will do.
Re: [PATCH v7 2/2] mfd: syscon: Add hardware spinlock support
On Tue, 23 Jan 2018, Baolin Wang wrote:

> Hi Lee,
>
> On 22 January 2018 at 21:43, Lee Jones wrote:
> > On Thu, 11 Jan 2018, Lee Jones wrote:
> >> On Mon, 25 Dec 2017, Baolin Wang wrote:
> >>
> >> > Some system control registers need hardware spinlock to synchronize
> >> > between the multiple subsystems, so we should add hardware spinlock
> >> > support for syscon.
> >> >
> >> > Signed-off-by: Baolin Wang
> >> > Acked-by: Rob Herring
> >> > ---
> >> > Changes since v6:
> >> >  - Treat hwlock id 0 as valid for regmap.
> >> >
> >> > Changes since v5:
> >> >  - Fix the case that hwspinlock is not enabled.
> >> >
> >> > Changes since v4:
> >> >  - Add one example to show how to add hwlock.
> >> >  - Fix the coding style issue.
> >> >
> >> > Changes since v3:
> >> >  - Add error handling for of_hwspin_lock_get_id()
> >> >
> >> > Changes since v2:
> >> >  - Add acked tag from Rob.
> >> >
> >> > Changes since v1:
> >> >  - Remove timeout configuration.
> >> >  - Modify the binding file to add hwlocks.
> >> > ---
> >> >  Documentation/devicetree/bindings/mfd/syscon.txt |  8
> >> >  drivers/mfd/syscon.c                             | 19 +++
> >> >  2 files changed, 27 insertions(+)
> >>
> >> Applied, thanks.
> >
> > In order to avoid confusion, I should like to tell you that this patch
> > is applied for v4.17, not v4.16.
>
> This patch has been applied into Mark's branch[1] with your ACK, so
> Mark should drop this patch from his branch and you will pick it and
> merge it into v4.17?
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git/commit/?h=topic/hwspinlock&id=3bafc09e779710abaa7b836fe3bbeeeab7754c2b

Ah, this is the one that failed to build when merged alone.

Very well. Ignore my last.

-- 
Lee Jones
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
Re: [PATCH v2 1/2] Input: edt-ft5x06 - Add support for regulator
Hello Dmitry,

Thank you for the review!

On Mon, 22 Jan 2018 09:42:08 -0800, Dmitry Torokhov wrote:

> Hi Mylène,
>
> On Thu, Dec 28, 2017 at 8:33 AM, Mylène Josserand wrote:
> > Add the support of regulator to use it as VCC source.
> >
> > Signed-off-by: Mylène Josserand
> > ---
> >  .../bindings/input/touchscreen/edt-ft5x06.txt |  1 +
> >  drivers/input/touchscreen/edt-ft5x06.c        | 33 ++
> >  2 files changed, 34 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > index 025cf8c9324a..48e975b9c1aa 100644
> > --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt
> > @@ -30,6 +30,7 @@ Required properties:
> >  Optional properties:
> >   - reset-gpios: GPIO specification for the RESET input
> >   - wake-gpios: GPIO specification for the WAKE input
> > + - vcc-supply: Regulator that supplies the touchscreen
> >
> >   - pinctrl-names: should be "default"
> >   - pinctrl-0: a phandle pointing to the pin settings for the
> > diff --git a/drivers/input/touchscreen/edt-ft5x06.c b/drivers/input/touchscreen/edt-ft5x06.c
> > index c53a3d7239e7..5ee14a25a382 100644
> > --- a/drivers/input/touchscreen/edt-ft5x06.c
> > +++ b/drivers/input/touchscreen/edt-ft5x06.c
> > @@ -39,6 +39,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #define WORK_REGISTER_THRESHOLD		0x00
> >  #define WORK_REGISTER_REPORT_RATE	0x08
> > @@ -91,6 +92,7 @@ struct edt_ft5x06_ts_data {
> >  	struct touchscreen_properties prop;
> >  	u16 num_x;
> >  	u16 num_y;
> > +	struct regulator *vcc;
> >
> >  	struct gpio_desc *reset_gpio;
> >  	struct gpio_desc *wake_gpio;
> > @@ -993,6 +995,23 @@ static int edt_ft5x06_ts_probe(struct i2c_client *client,
> >
> >  	tsdata->max_support_points = chip_data->max_support_points;
> >
> > +	tsdata->vcc = devm_regulator_get(&client->dev, "vcc");
> > +	if (IS_ERR(tsdata->vcc)) {
> > +		error = PTR_ERR(tsdata->vcc);
> > +		dev_err(&client->dev, "failed to request regulator: %d\n",
> > +			error);
> > +		return error;
> > +	};
>
> As 0-day pointed out, this semicolon is not needed.

Yes, thanks, I will fix that in next version.

> > +
> > +	if (tsdata->vcc) {
>
> You do not need to check for non-NULL here, devm_regulator_get() will
> never give you a NULL. If regulator is not defined in DT/board
> mappings, then dummy regulator will be provided. You can call
> regulator_enable() and regulator_disable() and other regulator APIs
> with dummy regulator.

Okay, thanks for the explanation, I will remove that.

> > +		error = regulator_enable(tsdata->vcc);
> > +		if (error < 0) {
> > +			dev_err(&client->dev, "failed to enable vcc: %d\n",
> > +				error);
> > +			return error;
> > +		}
> > +	}
> > +
> >  	tsdata->reset_gpio = devm_gpiod_get_optional(&client->dev,
> >  						     "reset", GPIOD_OUT_HIGH);
> >  	if (IS_ERR(tsdata->reset_gpio)) {
> > @@ -1122,20 +1141,34 @@ static int edt_ft5x06_ts_remove(struct i2c_client *client)
> >  static int __maybe_unused edt_ft5x06_ts_suspend(struct device *dev)
> >  {
> >  	struct i2c_client *client = to_i2c_client(dev);
> > +	struct edt_ft5x06_ts_data *tsdata = i2c_get_clientdata(client);
> >
> >  	if (device_may_wakeup(dev))
> >  		enable_irq_wake(client->irq);
> >
> > +	if (tsdata->vcc)
>
> Same here.

yep

> > +		regulator_disable(tsdata->vcc);
> > +
> >  	return 0;
> >  }
> >
> >  static int __maybe_unused edt_ft5x06_ts_resume(struct device *dev)
> >  {
> >  	struct i2c_client *client = to_i2c_client(dev);
> > +	struct edt_ft5x06_ts_data *tsdata = i2c_get_clientdata(client);
> > +	int ret;
> >
> >  	if (device_may_wakeup(dev))
> >  		disable_irq_wake(client->irq);
> >
> > +	if (tsdata->vcc) {
>
> And here.

yep

> > +		ret = regulator_enable(tsdata->vcc);
> > +		if (ret < 0) {
> > +			dev_err(dev, "failed to enable vcc: %d\n", ret);
> > +			return ret;
> > +		}
>
> Since power to the device may have been cut, I think you need to
> restore the register settings to whatever it was (factory vs work
> mode, threshold, gain and offset registers, etc, etc).

Okay. Could you tell me how I can do that?

> > +	}
> > +
> >  	return 0;
> >  }
> >
> > --
> > 2.11.0
> >
>
> Thanks.
>
[PATCH net 1/2] vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()
We used to call mutex_lock() in vhost_dev_lock_vqs(), which tries to
hold the mutexes of all virtqueues. This may confuse lockdep into
reporting a possible deadlock, because the locks being held all belong
to the same class. Switch to mutex_lock_nested() to avoid the false
positive.

Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+dbb7c1161485e61b0...@syzkaller.appspotmail.com
Signed-off-by: Jason Wang
---
 drivers/vhost/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 33ac2b1..549771a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -904,7 +904,7 @@ static void vhost_dev_lock_vqs(struct vhost_dev *d)
 {
 	int i = 0;
 	for (i = 0; i < d->nvqs; ++i)
-		mutex_lock(&d->vqs[i]->mutex);
+		mutex_lock_nested(&d->vqs[i]->mutex, i);
 }
 
 static void vhost_dev_unlock_vqs(struct vhost_dev *d)
-- 
2.7.4
Re: [PATCH net-next 1/1] rtnetlink: request RTM_GETLINK by pid or fd
On Mon, 22 Jan 2018 23:25:41 +0100, Christian Brauner wrote:
> This is not necessarily true in scenarios where I move a network device
> via RTM_NEWLINK + IFLA_NET_NS_PID into a network namespace I haven't
> created. Here is an example:
>
> nlmsghdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
> nlmsghdr->nlmsg_type = RTM_NEWLINK;
> /* move to network namespace of pid */
> nla_put_u32(nlmsg, IFLA_NET_NS_PID, pid)
> /* give interface new name */
> nla_put_string(nlmsg, IFLA_IFNAME, ifname)
>
> The only thing I have is the pid that identifies the network namespace.

How do you know the interface did not get renamed in the new netns?

This is racy and won't work reliably. You really need to know the
netnsid before moving the interface to the netns to be able to do
meaningful queries.

You may argue that for your case, you're fine with the race. But I know
about use cases where it matters a lot: those are tools that show
network topology including changes in real time, such as Skydive. It's
important to have the uAPI designed right. And we don't want two
different APIs for the same thing.

If you want to do any watching over the interfaces (as opposed to "move
to the netns and forget"), you really have to work with netnsids. Let's
focus on how to do that more easily. We don't return netnsid at all
places where we should return it and we need to fix that.

> There's no non-syscall way to learn the netnsid.

And that is the primary problem.

 Jiri
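Editorial sketch, not part of the thread: the quoted fragment uses nla_put_*() style helpers from the poster's own code. Under the assumption that only the uapi headers are available, the same RTM_NEWLINK request can be laid out with the rtattr macros from <linux/rtnetlink.h>. The names struct move_link_req, add_attr() and build_move_link_req() are hypothetical. The sketch only builds the message in memory; it does not send it, and it does nothing about the rename race described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: header, ifinfomsg, then room for attributes. */
struct move_link_req {
    struct nlmsghdr  nh;
    struct ifinfomsg ifi;
    char             attrs[64];
};

/* Append one attribute to the message; returns -1 if it would not fit. */
static int add_attr(struct nlmsghdr *nh, size_t maxlen, unsigned short type,
                    const void *data, size_t len)
{
    size_t attr_len = RTA_LENGTH(len);
    size_t new_len = NLMSG_ALIGN(nh->nlmsg_len) + RTA_ALIGN(attr_len);
    struct rtattr *rta;

    if (new_len > maxlen)
        return -1;

    rta = (struct rtattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
    rta->rta_type = type;
    rta->rta_len = (unsigned short)attr_len;
    memcpy(RTA_DATA(rta), data, len);
    nh->nlmsg_len = (uint32_t)new_len;
    return 0;
}

/*
 * Build "move interface ifindex into the netns of pid and rename it".
 * Mirrors the flags, message type and attributes of the quoted fragment.
 */
static int build_move_link_req(struct move_link_req *req, int ifindex,
                               uint32_t pid, const char *new_name)
{
    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifi));
    req->nh.nlmsg_type = RTM_NEWLINK;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req->ifi.ifi_family = AF_UNSPEC;
    req->ifi.ifi_index = ifindex;

    /* move to network namespace of pid */
    if (add_attr(&req->nh, sizeof(*req), IFLA_NET_NS_PID,
                 &pid, sizeof(pid)) < 0)
        return -1;

    /* give interface new name (including the terminating NUL) */
    return add_attr(&req->nh, sizeof(*req), IFLA_IFNAME,
                    new_name, strlen(new_name) + 1);
}
```

Once such a request has been sent, the sender holds only the pid and the name it asked for, which is the situation the reply calls racy: nothing in the request yields a netnsid that later queries could be keyed on.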
Re: [PATCH v2] kasan: don't emit builtin calls when sanitization is off
On 01/23/2018 05:20 AM, Andrew Morton wrote:
> On Fri, 19 Jan 2018 18:58:12 +0100 Andrey Konovalov wrote:
>
>> With KASAN enabled the kernel has two different memset() functions, one
>> with KASAN checks (memset) and one without (__memset). KASAN uses some
>> macro tricks to use the proper version where required. For example memset()
>> calls in mm/slub.c are without KASAN checks, since they operate on poisoned
>> slab object metadata.
>>
>> The issue is that clang emits memset() calls even when there is no memset()
>> in the source code. They get linked with improper memset() implementation
>> and the kernel fails to boot due to a huge amount of KASAN reports during
>> early boot stages.
>>
>> The solution is to add -fno-builtin flag for files with KASAN_SANITIZE := n
>> marker.
>
> This clashes somewhat with Arnd's asan-rework-kconfig-settings.patch.
> Could you two please put heads together and decide what we want for a
> final result?
>
> Meanwhile I'll "fix" the reject with
>
> +ifdef CONFIG_KASAN_EXTRA
>  CFLAGS_KASAN += $(call cc-option, -fsanitize-address-use-after-scope)
> +endif
>

Looks correct.
[PATCH] bnx2: remove redundant initializations of pointers txr and rxr
From: Colin Ian King

Pointers txr and rxr are being initialized and a few statements later
are being assigned new values without the original values ever being
read. The initialized values are therefore redundant and can be removed.

Cleans up clang warnings:
drivers/net/ethernet/broadcom/bnx2.c:5821:28: warning: Value stored to 'txr' during its initialization is never read
drivers/net/ethernet/broadcom/bnx2.c:5822:28: warning: Value stored to 'rxr' during its initialization is never read

Signed-off-by: Colin Ian King
---
 drivers/net/ethernet/broadcom/bnx2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 154866e8517a..5de4c33f682e 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -5818,8 +5818,8 @@ bnx2_run_loopback(struct bnx2 *bp, int loopback_mode)
 	struct l2_fhdr *rx_hdr;
 	int ret = -ENODEV;
 	struct bnx2_napi *bnapi = &bp->bnx2_napi[0], *tx_napi;
-	struct bnx2_tx_ring_info *txr = &bnapi->tx_ring;
-	struct bnx2_rx_ring_info *rxr = &bnapi->rx_ring;
+	struct bnx2_tx_ring_info *txr;
+	struct bnx2_rx_ring_info *rxr;
 
 	tx_napi = bnapi;
 
-- 
2.15.1