Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()
On (01/09/19 08:16), Helge Deller wrote: > I prefer that change. > In my role as admin of the parisc-linux.org domain I plan to retire > all @parisc-linux.org email addresses soon anyway. I see. Well, I can send a formal patch, but I think James would know better which one of his email addresses he'd prefer to use; so it seems that a patch from James would be more appropriate. -ss
Re: [PATCH v5 2/2] dt-bindings: spi: Document Renesas R-Car RPC-IF controller bindings
Hi Mason, On Tue, Jan 8, 2019 at 5:17 AM Mason Yang wrote: > Document the bindings used by the Renesas R-Car Gen3 RPC-IF controller. > > Signed-off-by: Mason Yang Thanks for your patch! > --- /dev/null > +++ b/Documentation/devicetree/bindings/spi/spi-renesas-rpc.txt > @@ -0,0 +1,37 @@ > +Renesas R-Car Gen3 RPC-IF controller Device Tree Bindings > +-- > + > +Required properties: > +- compatible: should be "renesas,r8a77995-rpc" Would it make sense to have a family-specific fallback "renesas,rcar-gen3-rpc", bseides the SoC-specific compatible value? > +- #address-cells: should be 1 > +- #size-cells: should be 0 > +- reg: should contain 2 entries, one for the registers and one for the direct > + mapping area > +- reg-names: should contain "regs" and "dirmap" > +- clock-names: should contain "rpc" > +- clocks: should contain 1 entries for the module's clock Doesn't the driver have a need to access the RPCD2 clock? At least on R-Car V3M, it needs to program the Divider Clock Register (DIVREG). > +- renesas,rpc-mode: should contain "spi" for rpc spi mode or > + "hyperflash" for rpc hyperflash mode. Can't this be derived from the flash subnode? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: About some mail lost and random unsubscribe issues on this dmaengine list
Hi, Vinod, Thanks for your information. I will check with our IT. Shunyong. Thanks. On 2019/1/9 15:00, Vinod Koul wrote: > On 09-01-19, 01:35, Yang, Shunyong wrote: >> Hi, Vinod and All, >> I found, somtimes, I cannot received some mails from this mailing >> list. For example, I sent my v2 series on 7th, Jan, >> >> https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713 >> >> It appeared on patchwork.kernel.org but I haven't received it in this >> mailing list. >> >> And I found randomly, a month or two months, I am unsubscribed from this >> list automatically. Then I have to re-subscribe the list. > > I guess your corporate server is to blame. vger mail list is "know" to > unsubscribe servers which behave badly, so you need to ask your mail > admins why and try to use other email which is known to be friendly with > vger :) > >> Have someone met similar problem as me? Or maybe I missed some necessary >> settings? >> >> Shunyong. >> Thanks. >
[PATCH v2] sched: fix a potential double-fetch bug in sched_copy_attr
"uattr->size" is copied in from user space and checked. However, it is copied in again after the security check. A malicious user may race to change it. The fix sets uattr->size to be the checked size. Signed-off-by: Kangjie Lu --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 6fedf3a98581..e868cc25ac2a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4499,6 +4499,9 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a if (ret) return -EFAULT; + /* In case attr->size was changed in the user space */ + attr->size = size; + /* * XXX: Do we want to be lenient like existing syscalls; or do we want * to be strict and return an error on out-of-bounds values? -- 2.17.1
Re: [RESEND 1/1] gic: its: Make sure a LPI is discarded before free.
On Wed, 9 Jan 2019 11:53:27 +0800 Zhao Yuanyuan wrote: Hi Zhao, > Its device will be removed after all events be freed. > Undisarded events can lead to unpredictable behaviar. > > Signed-off-by: Zhao Yuanyuan > --- > drivers/irqchip/irq-gic-v3-its.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > b/drivers/irqchip/irq-gic-v3-its.c > index db20e99..4fee008 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -2572,6 +2572,10 @@ static void its_irq_domain_free(struct irq_domain > *domain, unsigned int virq, > virq + i); > u32 event = its_get_event_id(data); > > + /* Discard irq before free */ > + if (irqd_is_activated(d)) > + its_send_discard(its_dev, event); > + > /* Mark interrupt index as unused */ > clear_bit(event, its_dev->event_map.lpi_map); > But we already do send a discard on deactivate, which logically happens before we free the domain. So what are you fixing here? Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote: > "A PCI completion timeout occurred for an outstanding PCI-E transaction" > it is. > > This is how I bind the device to vfio: > > echo vfio-pci > '/sys/bus/pci/devices/:01:00.0/driver_override' > echo vfio-pci > '/sys/bus/pci/devices/:01:00.1/driver_override' > echo ':01:00.0' > '/sys/bus/pci/devices/:01:00.0/driver/unbind' > echo ':01:00.1' > '/sys/bus/pci/devices/:01:00.1/driver/unbind' > echo ':01:00.0' > /sys/bus/pci/drivers/vfio-pci/bind > echo ':01:00.1' > /sys/bus/pci/drivers/vfio-pci/bind > > > and I noticed that EEH only happens with the last command. The order > (.0,.1 or .1,.0) does not matter, it seems that putting one function to > D3 is fine but putting another one when the first one is already in D3 - > produces EEH. And I do not recall ever seeing this on the firestone > machine. Weird. Putting all functions into D3 is what allows the device to actually go into D3. Does it work with other devices ? We do have that bug on early P9 revisions where the attempt of bringing the link to L1 as part of the D3 process fails in horrible ways, I thought P8 would be ok but maybe not ... Otherwise, it might be that our timeouts are too low (you may want to talk to our PCIe guys internally) Cheers, Ben.
Re: [PATCH v3 2/4] Input: goodix - Add AVDD28-supply regulator support
Hi Jagan, On Sat, Dec 15, 2018 at 08:48:00PM +0530, Jagan Teki wrote: > Goodix CTP controllers have AVDD28 pin connected to voltage > regulator which may not be turned on by default, like for GT5663. > > Add support for such ctp used boards by adding voltage regulator > handling code to goodix ctp driver. > > Signed-off-by: Jagan Teki > --- > drivers/input/touchscreen/goodix.c | 33 +- > 1 file changed, 28 insertions(+), 5 deletions(-) > > diff --git a/drivers/input/touchscreen/goodix.c > b/drivers/input/touchscreen/goodix.c > index f2d9c2c41885..7371f6946098 100644 > --- a/drivers/input/touchscreen/goodix.c > +++ b/drivers/input/touchscreen/goodix.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -47,6 +48,7 @@ struct goodix_ts_data { > struct touchscreen_properties prop; > unsigned int max_touch_num; > unsigned int int_trigger_type; > + struct regulator *avdd28; > struct gpio_desc *gpiod_int; > struct gpio_desc *gpiod_rst; > u16 id; > @@ -786,25 +788,41 @@ static int goodix_ts_probe(struct i2c_client *client, > if (error) > return error; > > + ts->avdd28 = devm_regulator_get(>dev, "AVDD28"); > + if (IS_ERR(ts->avdd28)) { > + error = PTR_ERR(ts->avdd28); > + if (error != -EPROBE_DEFER) > + dev_err(>dev, > + "Failed to get AVDD28 regulator: %d\n", error); > + return error; > + } > + > + /* power the controller */ > + error = regulator_enable(ts->avdd28); > + if (error) { > + dev_err(>dev, "Controller fail to enable AVDD28\n"); > + return error; > + } > + > if (ts->gpiod_int && ts->gpiod_rst) { > /* reset the controller */ > error = goodix_reset(ts); > if (error) { > dev_err(>dev, "Controller reset failed.\n"); > - return error; > + goto error; > } > } > > error = goodix_i2c_test(client); > if (error) { > dev_err(>dev, "I2C communication failure: %d\n", error); > - return error; > + goto error; > } > > error = goodix_read_version(ts); > if (error) { > dev_err(>dev, "Read version failed.\n"); > - return error; > + goto error; > } > > ts->chip = goodix_get_chip_data(ts->id); > @@ -823,23 +841,28 @@ static int goodix_ts_probe(struct i2c_client *client, > dev_err(>dev, > "Failed to invoke firmware loader: %d\n", > error); > - return error; > + goto error; > } > > return 0; > } else { > error = goodix_configure_dev(ts); > if (error) > - return error; > + goto error; > } > > return 0; > + > +error: > + regulator_disable(ts->avdd28); > + return error; > } > > static int goodix_ts_remove(struct i2c_client *client) > { > struct goodix_ts_data *ts = i2c_get_clientdata(client); > > + regulator_disable(ts->avdd28); This may be disabling the regulator too early. Please use devm_add_action_or_reset() to install a custom devm handler that would disable the regulator in line with the rest of devm unwinding flow. > if (ts->gpiod_int && ts->gpiod_rst) > wait_for_completion(>firmware_loading_complete); > > -- > 2.18.0.321.gffc6fa0e3 > Thanks. -- Dmitry
Re: [PATCH v3 1/4] dt-bindings: input: touchscreen: goodix: Document AVDD28-supply property
On Sat, Dec 15, 2018 at 08:47:59PM +0530, Jagan Teki wrote: > Most of the Goodix CTP controllers are supply with AVDD28 pin. > which need to supply for controllers like GT5663 on some boards > to trigger the power. > > So, document the supply property so-that the require boards > that used on GT5663 can enable it via device tree. > > Signed-off-by: Jagan Teki > --- > Documentation/devicetree/bindings/input/touchscreen/goodix.txt | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt > b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt > index f7e95c52f3c7..c4622c983e08 100644 > --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt > +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt > @@ -23,6 +23,7 @@ Optional properties: > - touchscreen-inverted-y : Y axis is inverted (boolean) > - touchscreen-swapped-x-y : X and Y axis are swapped (boolean) > (swapping is done after inverting the axis) > + - AVDD28-supply : Analog power supply regulator on AVDD28 pin I think we normally use lower case in DT bindings and rarely encode voltage in the supply name unless we are dealing with several supplies of the same kind, but I'll let Ron comment on this. Thanks. -- Dmitry
Re: [PATCH v5 4/8] Input: stmpe-ts: preparations for STMPE ADC driver
On Fri, Dec 21, 2018 at 02:46:33PM +0100, Philippe Schenker wrote: > From: Philippe Schenker > > This patch removes common ADC settings in favor to use > stmpe811_adc_common_init that is present in MFD. This is necessary in > preparation for the stmpe-adc driver, because those two drivers have > common settings for the ADC. > > Signed-off-by: Philippe Schenker Acked-by: Dmitry Torokhov > > --- > > Changes in v5: > - Changed author of commit to use correct email. > > Changes in v4: > - New patch: Split changes in stmpe-ts.c to a separate commit > - Remove common adc settings from init and call the >stmpe811_adc_common_init function > > Changes in v3: > - Undo ADC-settings related code-deletions in stmpe-ts.c that the code >is backwards-compatible to older devicetrees. > > Changes in v2: None > > drivers/input/touchscreen/stmpe-ts.c | 42 +--- > 1 file changed, 7 insertions(+), 35 deletions(-) > > diff --git a/drivers/input/touchscreen/stmpe-ts.c > b/drivers/input/touchscreen/stmpe-ts.c > index c5d9006588a2..cf9c9aa39f6e 100644 > --- a/drivers/input/touchscreen/stmpe-ts.c > +++ b/drivers/input/touchscreen/stmpe-ts.c > @@ -30,8 +30,6 @@ > * with touchscreen controller > */ > #define STMPE_REG_INT_STA0x0B > -#define STMPE_REG_ADC_CTRL1 0x20 > -#define STMPE_REG_ADC_CTRL2 0x21 > #define STMPE_REG_TSC_CTRL 0x40 > #define STMPE_REG_TSC_CFG0x41 > #define STMPE_REG_FIFO_TH0x4A > @@ -58,15 +56,6 @@ > * @idev: registered input device > * @work: a work item used to scan the device > * @dev: a pointer back to the MFD cell struct device* > - * @sample_time: ADC converstion time in number of clock. > - * (0 -> 36 clocks, 1 -> 44 clocks, 2 -> 56 clocks, 3 -> 64 clocks, > - * 4 -> 80 clocks, 5 -> 96 clocks, 6 -> 144 clocks), > - * recommended is 4. > - * @mod_12b: ADC Bit mode (0 -> 10bit ADC, 1 -> 12bit ADC) > - * @ref_sel: ADC reference source > - * (0 -> internal reference, 1 -> external reference) > - * @adc_freq: ADC Clock speed > - * (0 -> 1.625 MHz, 1 -> 3.25 MHz, 2 || 3 -> 6.5 MHz) > * @ave_ctrl: Sample average control > * (0 -> 1 sample, 1 -> 2 samples, 2 -> 4 samples, 3 -> 8 samples) > * @touch_det_delay: Touch detect interrupt delay > @@ -88,10 +77,6 @@ struct stmpe_touch { > struct input_dev *idev; > struct delayed_work work; > struct device *dev; > - u8 sample_time; > - u8 mod_12b; > - u8 ref_sel; > - u8 adc_freq; > u8 ave_ctrl; > u8 touch_det_delay; > u8 settling; > @@ -192,7 +177,7 @@ static irqreturn_t stmpe_ts_handler(int irq, void *data) > static int stmpe_init_hw(struct stmpe_touch *ts) > { > int ret; > - u8 adc_ctrl1, adc_ctrl1_mask, tsc_cfg, tsc_cfg_mask; > + u8 tsc_cfg, tsc_cfg_mask; > struct stmpe *stmpe = ts->stmpe; > struct device *dev = ts->dev; > > @@ -202,22 +187,9 @@ static int stmpe_init_hw(struct stmpe_touch *ts) > return ret; > } > > - adc_ctrl1 = STMPE_SAMPLE_TIME(ts->sample_time) | > - STMPE_MOD_12B(ts->mod_12b) | STMPE_REF_SEL(ts->ref_sel); > - adc_ctrl1_mask = STMPE_SAMPLE_TIME(0xff) | STMPE_MOD_12B(0xff) | > - STMPE_REF_SEL(0xff); > - > - ret = stmpe_set_bits(stmpe, STMPE_REG_ADC_CTRL1, > - adc_ctrl1_mask, adc_ctrl1); > - if (ret) { > - dev_err(dev, "Could not setup ADC\n"); > - return ret; > - } > - > - ret = stmpe_set_bits(stmpe, STMPE_REG_ADC_CTRL2, > - STMPE_ADC_FREQ(0xff), STMPE_ADC_FREQ(ts->adc_freq)); > + ret = stmpe811_adc_common_init(stmpe); > if (ret) { > - dev_err(dev, "Could not setup ADC\n"); > + stmpe_disable(stmpe, STMPE_BLOCK_TOUCHSCREEN | STMPE_BLOCK_ADC); > return ret; > } > > @@ -295,13 +267,13 @@ static void stmpe_ts_get_platform_info(struct > platform_device *pdev, > > if (np) { > if (!of_property_read_u32(np, "st,sample-time", )) > - ts->sample_time = val; > + ts->stmpe->sample_time = val; > if (!of_property_read_u32(np, "st,mod-12b", )) > - ts->mod_12b = val; > + ts->stmpe->mod_12b = val; > if (!of_property_read_u32(np, "st,ref-sel", )) > - ts->ref_sel = val; > + ts->stmpe->ref_sel = val; > if (!of_property_read_u32(np, "st,adc-freq", )) > - ts->adc_freq = val; > + ts->stmpe->adc_freq = val; > if (!of_property_read_u32(np, "st,ave-ctrl", )) > ts->ave_ctrl = val; > if (!of_property_read_u32(np, "st,touch-det-delay", )) > -- > 2.19.2 > -- Dmitry
[PATCH] kobject: make kset_get_ownership() 'static'
From: Eric Biggers kset_get_ownership() is only used in lib/kobject.c, so make it 'static'. Signed-off-by: Eric Biggers --- lib/kobject.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/kobject.c b/lib/kobject.c index b72e00fd7d09e..aa89edcd2b63a 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -887,7 +887,7 @@ static void kset_release(struct kobject *kobj) kfree(kset); } -void kset_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) +static void kset_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) { if (kobj->parent) kobject_get_ownership(kobj->parent, uid, gid); -- 2.20.1
Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order
On Wed 09-01-19 11:28:52, Arun KS wrote: > On 2019-01-08 23:43, Michal Hocko wrote: > > On Tue 08-01-19 09:56:09, Alexander Duyck wrote: > > > On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote: > > [...] > > > > static int online_pages_range(unsigned long start_pfn, unsigned long > > > > nr_pages, > > > > void *arg) > > > > { > > > > - unsigned long i; > > > > unsigned long onlined_pages = *(unsigned long *)arg; > > > > - struct page *page; > > > > > > > > if (PageReserved(pfn_to_page(start_pfn))) > > > > - for (i = 0; i < nr_pages; i++) { > > > > - page = pfn_to_page(start_pfn + i); > > > > - (*online_page_callback)(page); > > > > - onlined_pages++; > > > > - } > > > > + onlined_pages = online_pages_blocks(start_pfn, > > > > nr_pages); > > > > > > Shouldn't this be a "+=" instead of an "="? It seems like you are > > > going > > > to lose your count otherwise. > > > > You are right of course. I should have noticed during the review. > > Thanks! > > I think we don't need to. The caller function is setting onlined_pages = 0 > before calling online_pages_range(). > And there are no other reference to online_pages_range other than from > online_pages(). Are you missing that we accumulate onlined_pages via *(unsigned long *)arg = onlined_pages; in online_pages_range? -- Michal Hocko SUSE Labs
Re: [PATCH v2] mm/page_owner: fix for deferred struct page init
On Thu 20-12-18 13:50:31, Qian Cai wrote: > When booting a system with "page_owner=on", > > start_kernel > page_ext_init > invoke_init_callbacks > init_section_page_ext > init_page_owner > init_early_allocated_pages > init_zones_in_node > init_pages_in_zone > lookup_page_ext > page_to_nid > > The issue here is that page_to_nid() will not work since some page > flags have no node information until later in page_alloc_init_late() due > to DEFERRED_STRUCT_PAGE_INIT. Hence, it could trigger an out-of-bounds > access with an invalid nid. > > [8.666047] UBSAN: Undefined behaviour in ./include/linux/mm.h:1104:50 > [8.672603] index 7 is out of range for type 'zone [5]' > > Also, kernel will panic since flags were poisoned earlier with, > > CONFIG_DEBUG_VM_PGFLAGS=y > CONFIG_NODE_NOT_IN_PAGE_FLAGS=n > > start_kernel > setup_arch > pagetable_init > paging_init > sparse_init > sparse_init_nid > memblock_alloc_try_nid_raw > > Although later it tries to set page flags for pages in reserved bootmem > regions, > > mm_init > mem_init > memblock_free_all > free_low_memory_core_early > reserve_bootmem_region > > there could still have some freed pages from the page allocator but yet > to be initialized due to DEFERRED_STRUCT_PAGE_INIT. It have already been > dealt with a bit in page_ext_init(). > > * Take into account DEFERRED_STRUCT_PAGE_INIT. > */ > if (early_pfn_to_nid(pfn) != nid) > continue; > > However it did not handle it well in init_pages_in_zone() which end up > calling page_to_nid(). > > [ 11.917212] page:ea000420 is uninitialized and poisoned > [ 11.917220] raw: > > [ 11.921745] raw: > > [ 11.924523] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) > [ 11.926498] page_owner info is not active (free page?) > [ 12.329560] kernel BUG at include/linux/mm.h:990! > [ 12.337632] RIP: 0010:init_page_owner+0x486/0x520 > > Since there is no other routines depend on page_ext_init() in > start_kernel() and no real benefit to call it so early, just move it > after page_alloc_init_late() to ensure that there is no deferred pages > need to de dealt with. The last paragraph should be updated. I would propose something like this. " This means that assumptions behind fe53ca54270a ("mm: use early_pfn_to_nid in page_ext_init") are incomplete. Therefore revert the commit for now. A proper way to move the page_owner initialization to sooner is to hook into memmap initialization. " > Fixes: fe53ca54270a ("mm: use early_pfn_to_nid in page_ext_init") > Suggested-by: Michal Hocko > Signed-off-by: Qian Cai Acked-by: Michal Hocko > --- > > v2: postpone init_page_ext() to after page_alloc_init_late(). > > init/main.c | 2 +- > mm/page_ext.c | 3 +-- > 2 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/init/main.c b/init/main.c > index 2b7b7fe173c9..1aeb062b2cb7 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -696,7 +696,6 @@ asmlinkage __visible void __init start_kernel(void) > initrd_start = 0; > } > #endif > - page_ext_init(); > kmemleak_init(); > setup_per_cpu_pageset(); > numa_policy_init(); > @@ -1147,6 +1146,7 @@ static noinline void __init kernel_init_freeable(void) > sched_init_smp(); > > page_alloc_init_late(); > + page_ext_init(); > > do_basic_setup(); > > diff --git a/mm/page_ext.c b/mm/page_ext.c > index ae44f7adbe07..d76fd51e312a 100644 > --- a/mm/page_ext.c > +++ b/mm/page_ext.c > @@ -399,9 +399,8 @@ void __init page_ext_init(void) >* -pfn--> >* N0 | N1 | N2 | N0 | N1 | N2| >* > - * Take into account DEFERRED_STRUCT_PAGE_INIT. >*/ > - if (early_pfn_to_nid(pfn) != nid) > + if (pfn_to_nid(pfn) != nid) > continue; > if (init_section_page_ext(pfn, nid)) > goto oom; > -- > 2.17.2 (Apple Git-113) -- Michal Hocko SUSE Labs
[PATCH v3] i2c: dev: prevent adapter retries and timeout being set as minus value
If set adapter->retries to minus value from user space via ioctl, will make __i2c_transfer and __i2c_smbus_xfer jump the calling to adapter->algo->master_xfer and adapter->algo->smbus_xfer that registered by the underlying bus drivers, and return value 0 to all the callers. The bus driver will never be accessed anymore by all users, besides, the users may still get successful return value without any error or information log print out. If set adapter->timeout to minus value from user space via ioctl, will make the retrying loop in __i2c_transfer and __i2c_smbus_xfer always break after the the first try, due to the time_after always returns true. Signed-off-by: Yi Zeng --- drivers/i2c/i2c-dev.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 1aca742..ccd76c7 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -470,9 +470,15 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) data_arg.data); } case I2C_RETRIES: + if (arg > INT_MAX) + return -EINVAL; + client->adapter->retries = arg; break; case I2C_TIMEOUT: + if (arg > INT_MAX) + return -EINVAL; + /* For historical reasons, user-space sets the timeout * value in units of 10 ms. */ -- 1.9.1
[PATCH net V2] vhost: log dirty page correctly
Vhost dirty page logging API is designed to sync through GPA. But we try to log GIOVA when device IOTLB is enabled. This is wrong and may lead to missing data after migration. To solve this issue, when logging with device IOTLB enabled, we will: 1) reuse the device IOTLB translation result of GIOVA->HVA mapping to get HVA, for writable descriptor, get HVA through iovec. For used ring update, translate its GIOVA to HVA 2) traverse the GPA->HVA mapping to get the possible GPA and log through GPA. Pay attention this reverse mapping is not guaranteed to be unique, so we should log each possible GPA in this case. This fix the failure of scp to guest during migration. In -next, we will probably support passing GIOVA->GPA instead of GIOVA->HVA. Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API") Reported-by: Jintack Lim Cc: Jintack Lim Signed-off-by: Jason Wang --- The patch is needed for stable. Changes from V1: - return error instead of warn --- drivers/vhost/net.c | 3 +- drivers/vhost/vhost.c | 82 +++ drivers/vhost/vhost.h | 3 +- 3 files changed, 72 insertions(+), 16 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 36f3d0f49e60..bca86bf7189f 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1236,7 +1236,8 @@ static void handle_rx(struct vhost_net *net) if (nvq->done_idx > VHOST_NET_BATCH) vhost_net_signal_used(nvq); if (unlikely(vq_log)) - vhost_log_write(vq, vq_log, log, vhost_len); + vhost_log_write(vq, vq_log, log, vhost_len, + vq->iov, in); total_len += vhost_len; if (unlikely(vhost_exceeds_weight(++recv_pkts, total_len))) { vhost_poll_queue(>poll); diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 9f7942cbcbb2..ee095f08ffd4 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -1733,11 +1733,70 @@ static int log_write(void __user *log_base, return r; } +static int log_write_hva(struct vhost_virtqueue *vq, u64 hva, u64 len) +{ + struct vhost_umem *umem = vq->umem; + struct vhost_umem_node *u; + u64 gpa; + int r; + bool hit = false; + + list_for_each_entry(u, >umem_list, link) { + if (u->userspace_addr < hva && + u->userspace_addr + u->size >= + hva + len) { + gpa = u->start + hva - u->userspace_addr; + r = log_write(vq->log_base, gpa, len); + if (r < 0) + return r; + hit = true; + } + } + + if (!hit) + return -EFAULT; + + return 0; +} + +static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len) +{ + struct iovec iov[64]; + int i, ret; + + if (!vq->iotlb) + return log_write(vq->log_base, vq->log_addr + used_offset, len); + + ret = translate_desc(vq, (u64)(uintptr_t)vq->used + used_offset, +len, iov, 64, VHOST_ACCESS_WO); + if (ret) + return ret; + + for (i = 0; i < ret; i++) { + ret = log_write_hva(vq, (u64)(uintptr_t)iov[i].iov_base, + iov[i].iov_len); + if (ret) + return ret; + } + + return 0; +} + int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log, - unsigned int log_num, u64 len) + unsigned int log_num, u64 len, struct iovec *iov, int count) { int i, r; + if (vq->iotlb) { + for (i = 0; i < count; i++) { + r = log_write_hva(vq, (u64)(uintptr_t)iov[i].iov_base, + iov[i].iov_len); + if (r < 0) + return r; + } + return 0; + } + /* Make sure data written is seen before log. */ smp_wmb(); for (i = 0; i < log_num; ++i) { @@ -1769,9 +1828,8 @@ static int vhost_update_used_flags(struct vhost_virtqueue *vq) smp_wmb(); /* Log used flag write. */ used = >used->flags; - log_write(vq->log_base, vq->log_addr + - (used - (void __user *)vq->used), - sizeof vq->used->flags); + log_used(vq, (used - (void __user *)vq->used), +sizeof vq->used->flags); if (vq->log_ctx) eventfd_signal(vq->log_ctx, 1); } @@ -1789,9 +1847,8 @@ static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event) smp_wmb(); /* Log avail event write */ used =
[PATCH v2] scsi: avoid a double-fetch and a redundant copy
What we need is only "pack_id", so do not create a heap object or copy the whole object in. The fix efficiently copies "pack_id" only and also avoids double-fetch. Signed-off-by: Kangjie Lu --- drivers/scsi/sg.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index c6ad00703c5b..13662c41058a 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -446,16 +446,8 @@ sg_read(struct file *filp, char __user *buf, size_t count, loff_t * ppos) } if (old_hdr->reply_len < 0) { if (count >= SZ_SG_IO_HDR) { - sg_io_hdr_t *new_hdr; - new_hdr = kmalloc(SZ_SG_IO_HDR, GFP_KERNEL); - if (!new_hdr) { - retval = -ENOMEM; - goto free_old_hdr; - } - retval =__copy_from_user - (new_hdr, buf, SZ_SG_IO_HDR); - req_pack_id = new_hdr->pack_id; - kfree(new_hdr); + retval = get_user(req_pack_id, + &((sg_io_hdr_t *)buf)->pack_id); if (retval) { retval = -EFAULT; goto free_old_hdr; -- 2.17.1
Re: [PATCH v5 0/7] mfd / platform: cros_ec: move cros_ec sysfs attributes to its own drivers.
On Tue, 08 Jan 2019, Enric Balletbo Serra wrote: > Missatge de Lee Jones del dia dv., 21 de des. > 2018 a les 16:39: > > > > On Wed, 12 Dec 2018, Enric Balletbo i Serra wrote: > > > > > Hi, > > > > > > This is another patchset to try to cleanup a bit more the crossed > > > references for cros-ec driver between the MFD and the platform/chrome > > > subsystems. > > > > > > The purpose of these patches is get rid of the different cros-ec > > > attributes > > > from mfd/cros_ec_dev to its own sub-driver in platform/chrome. cros_ec_dev > > > continues instantiating the sub-devices but the sysfs attributes are owned > > > by the platform driver.E.g. The lightbar driver should own his sysfs > > > attributes and be instantiated only if the Embedded Controller has a > > > lightbar. > > > > > > The patchset also adds the documentation of the sysfs attributes. > > > > > > Most of the patches touches mfd subsystem and platform/chrome so I'd > > > suggest go all using and inmutable branch. > > > > That's fine. > > > > What else needs to happen with this set? > > I think the patchset is ready to be queued. Note that to apply cleanly > it depends on [1] which is already merged in your for-next branch. > What do you prefer, go through your repo or go through the > chrome-platform repo? Do you want Benson or I create an immutable > branch? I'm fine with whenever you decide. Probably best if this goes through the MFD tree. I have a PR pending upstream at the moment. Once I know what is happening with that, I'll start taking patches/sets again. > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git/commit/?h=for-mfd-next=18e294ddafaeb80a1e2e10c9bd750a6cb8388d5b > > > Any more Acks required? > > > > I think that you, Guenter, Benson and I are fine with it, so not more > acks needed. Thanks for the clarification. -- Lee Jones [李琼斯] Linaro Services Technical Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Wed, 2019-01-09 at 17:32 +1100, Alexey Kardashevskiy wrote: > I have just moved the "Mellanox Technologies MT27700 Family > [ConnectX-4]" from garrison to firestone machine and there it does not > produce an EEH, with the same kernel and skiboot (both upstream + my > debug). Hm. I cannot really blame the card but I cannot see what could > cause the difference in skiboot either. I even tried disabling NPU so > garrison would look like firestone, still EEH'ing. The systems have a different chip though, firestone is P8 and garrison is P8', which a slightly different PHB revision. Worth checking if we have anything significantly different in our inits and poke at the HW guys. BTW. Are the cards behind a switch in either case ? Cheers, Ben.
Re: [PATCH v4 01/15] ARM: export arm_pm_restart()
On Tue, 2019-01-08 at 23:31 +, Russell King - ARM Linux wrote: > On Tue, Jan 08, 2019 at 06:58:04PM +0100, Lubomir Rintel wrote: > > The OLPC XO 1.75 laptop is rebooted with a command to the Embedded > > Controller. The EC driver should be a module, since most people don't need > > it to be compiled in. > > Why do you need this - is there a reason why you can't hook onto the > restart_handler_list using register_restart_handler()? The registered > handlers are passed the command. Actually, yes. The restart handlers are invoked in atomic context, but the EC communication needs to block. (The 1.75 EC is actually a SPI master: a command it initiated by poking a GPIO and waiting until EC tells us it's ready to read a command.) Sorry for not mentioning that earlier, I forgot that it is the case. > I would rather this is not exported to modules, it is generally unsafe > to hook stuff by directly poking function pointers from module code. Lubo
[GIT PULL] Backlight for v5.1
Good morning Linus, Please accept my apologies for the lateness. Christmas holidays, yada yada yada! :) The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a: Linux 4.20-rc1 (2018-11-04 15:37:52 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight.git tags/backlight-next-4.21 for you to fetch changes up to 3cee7a7d05b11038c8b5fa093e45c6f839ffc867: backlight: 88pm860x_bl: Use of_node_name_eq for node name comparisons (2018-12-14 12:30:24 +) - Fix-ups - Use new of_node_name_eq() API call; - Bug Fixes - Internally track 'enabled' state; pwm_bl - Fix auto-generated brightness tables parsed by DT; pwm_bl Heiko Stuebner (2): backlight: pwm_bl: Re-add driver internal enabled tracking backlight: pwm_bl: Fix devicetree parsing with auto-generated brightness tables Rob Herring (1): backlight: 88pm860x_bl: Use of_node_name_eq for node name comparisons drivers/video/backlight/88pm860x_bl.c | 2 +- drivers/video/backlight/pwm_bl.c | 28 +--- 2 files changed, 18 insertions(+), 12 deletions(-) -- Lee Jones [李琼斯] Linaro Services Technical Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
[PATCH v2] net: ethernet: mediatek: fix warning in phy_start_aneg
From: Heiner Kallweit linux 5.0-rc1 shows following warning on bpi-r2/mt7623 bootup: [ 5.170597] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:548 phy_start_aneg+0x110/0x144 [ 5.178826] called from state READY [ 5.264111] [] (phy_start_aneg) from [] (mtk_init+0x414/0x47c) [ 5.271630] r7:df5f5eec r6:c0f08c48 r5: r4:dea67800 [ 5.277256] [] (mtk_init) from [] (register_netdevice+0x98/0x51c) [ 5.285035] r8: r7: r6:c0f97080 r5:c0f08c48 r4:dea67800 [ 5.291693] [] (register_netdevice) from [] (register_netdev+0x2c/0x44) [ 5.299989] r8: r7:dea2e608 r6:deacea00 r5:dea2e604 r4:dea67800 [ 5.306646] [] (register_netdev) from [] (mtk_probe+0x668/0x7ac) [ 5.314336] r5:dea2e604 r4:dea2e040 [ 5.317890] [] (mtk_probe) from [] (platform_drv_probe+0x58/0xa8) [ 5.325670] r10:c0f86bac r9: r8:c0fbe578 r7: r6:c0f86bac r5: [ 5.333445] r4:deacea10 [ 5.335963] [] (platform_drv_probe) from [] (really_probe+0x2d8/0x424) maybe other boards using this generic driver are affected v2: optimization: - phy_set_max_speed() is only needed if you want to reduce the max speed, typically if the PHY supports 1Gbps but the MAC supports 100Mbps only. - The pause parameters are autonegotiated. Except you have a specific need you normally don't need to manually fiddle with this. - phy_start_aneg() is called implicitly by the phylib state machine, you shouldn't call it manually except you have a good excuse. - netif_carrier_on/netif_carrier_off in mtk_phy_link_adjust() isn't needed. It's done by phy_link_change() in phylib. Signed-off-by: Frank Wunderlich --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 16 1 file changed, 16 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 399f565dd85a..2968d29a992f 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -258,11 +258,6 @@ static void mtk_phy_link_adjust(struct net_device *dev) mtk_w32(mac->hw, mcr, MTK_MAC_MCR(mac->id)); - if (dev->phydev->link) - netif_carrier_on(dev); - else - netif_carrier_off(dev); - if (!of_phy_is_fixed_link(mac->of_node)) phy_print_status(dev->phydev); } @@ -347,17 +342,6 @@ static int mtk_phy_connect(struct net_device *dev) if (mtk_phy_connect_node(eth, mac, np)) goto err_phy; - dev->phydev->autoneg = AUTONEG_ENABLE; - dev->phydev->speed = 0; - dev->phydev->duplex = 0; - - phy_set_max_speed(dev->phydev, SPEED_1000); - phy_support_asym_pause(dev->phydev); - linkmode_copy(dev->phydev->advertising, dev->phydev->supported); - linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, -dev->phydev->advertising); - phy_start_aneg(dev->phydev); - of_node_put(np); return 0; -- 2.17.1
[GIT PULL] MFD for v5.1
Good morning Linus, Please accept my apologies for the lateness. Christmas holidays, yada yada yada! :) The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a: Linux 4.20-rc1 (2018-11-04 15:37:52 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git tags/mfd-next-4.21 for you to fetch changes up to 3f2d347e851ef4464dea49504cde85e5eef67b2d: mfd: exynos-lpass: Enable UART module support (2019-01-03 08:32:42 +) - New Device Support - Add support for Power Supply to AXP813 - Add support for GPIO, ADC, AC and Battery Power Supply to AXP803 - Add support for UART to Exynos LPASS - Fix-ups - Use supplied MACROS; ti_am335x_tscadc - Trivial spelling/whitespace/alignment; tmio, axp20x, rave-sp - Regmap changes; bd9571mwv, wm5110-tables - Kconfig dependencies; MFD_AT91_USART - Supply shared data for child-devices; madera-core - Use new of_node_name_eq() API call; max77620, stmpe - Use managed resources (devm_*); tps65218 - Comment descriptions; ingenic-tcu - Coding style; madera-core - Bug Fixes - Fix section mismatches; twl-core, db8500-prcmu - Correct error path related issues; mt6397-core, ab8500-core, mc13xxx-core - IRQ related fixes; tps6586x - Ensure proper initialisation sequence; qcom_rpm - Repair potential memory leak; cros_ec_dev Beomho Seo (1): mfd: exynos-lpass: Enable UART module support Charles Keepax (1): mfd: wm5110: Add missing ASRC rate register Chen-Yu Tsai (1): mfd: axp20x: Re-align MFD cell entries Cheng-Yi Chiang (1): mfd: cros_ec: Add commands to control codec Dan Carpenter (1): mfd: ab8500-core: Return zero in get_register_interruptible() Dien Pham (1): mfd: bd9571mwv: Add volatile register to make DVFS work Enric Balletbo i Serra (1): mfd: cros_ec_dev: Add missing mfd_remove_devices() call in remove Geert Uytterhoeven (1): mfd: tmio: Typo s/use use/use/ Jean Delvare (1): mfd: at91-usart: Add platform dependency Jonathan Hunter (1): mfd: tps6586x: Handle interrupts on suspend Jonathan Marek (1): mfd: qcom_rpm: write fw_version to CTRL_REG Kangjie Lu (1): mfd: mc13xxx: Fix a missing check of a register-read failure Keerthy (1): mfd: tps65218: Use devm_regmap_add_irq_chip and clean up error path in probe() Nathan Chancellor (2): mfd: twl-core: Fix section annotations on {,un}protect_pm_master mfd: db8500-prcmu: Fix some section annotations Nicolas Boichat (1): mfd: mt6397: Do not call irq_domain_remove if PMIC unsupported Oskari Lemmela (2): mfd: axp20x: Add AC power supply cell for AXP813 mfd: axp20x: Add supported cells for AXP803 Paul Cercueil (1): mfd: ingenic-tcu: Fix bit field description in header Richard Fitzgerald (2): mfd: madera: Add shared data for accessory detection mfd: madera: Remove spurious semicolon in while loop Rob Herring (1): mfd: Use of_node_name_eq() for node name comparisons Vignesh R (2): mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells iio: adc: ti_am335x_tscadc: Improve accuracy of measurement Yangtao Li (1): mfd: rave-sp: Fix typo in rave_sp_checksum comment drivers/iio/adc/ti_am335x_adc.c | 5 +- drivers/mfd/Kconfig | 1 + drivers/mfd/ab8500-core.c| 2 +- drivers/mfd/axp20x.c | 126 +++ drivers/mfd/bd9571mwv.c | 1 + drivers/mfd/cros_ec_dev.c| 1 + drivers/mfd/db8500-prcmu.c | 4 +- drivers/mfd/exynos-lpass.c | 4 +- drivers/mfd/madera-core.c| 5 +- drivers/mfd/max77620.c | 2 +- drivers/mfd/mc13xxx-core.c | 4 +- drivers/mfd/mt6397-core.c| 3 +- drivers/mfd/qcom_rpm.c | 4 ++ drivers/mfd/rave-sp.c| 2 +- drivers/mfd/stmpe.c | 12 ++-- drivers/mfd/ti_am335x_tscadc.c | 5 +- drivers/mfd/tps65218.c | 24 +-- drivers/mfd/tps6586x.c | 24 +++ drivers/mfd/twl-core.c | 4 +- drivers/mfd/wm5110-tables.c | 2 + include/linux/mfd/cros_ec_commands.h | 94 ++ include/linux/mfd/ingenic-tcu.h | 2 +- include/linux/mfd/madera/core.h | 7 ++ include/linux/mfd/ti_am335x_tscadc.h | 4 ++ include/linux/mfd/tmio.h | 2 +- 25 files changed, 244 insertions(+), 100 deletions(-) -- Lee Jones [李琼斯] Linaro Services Technical Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
[tip:perf/urgent] tools include uapi: Sync linux/vhost.h with the kernel sources
Commit-ID: ee412f14693a3fe2645b3528603dfd37dd05118a Gitweb: https://git.kernel.org/tip/ee412f14693a3fe2645b3528603dfd37dd05118a Author: Arnaldo Carvalho de Melo AuthorDate: Tue, 8 Jan 2019 13:53:23 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 14:09:33 -0300 tools include uapi: Sync linux/vhost.h with the kernel sources To get the changes in: 4b86713236e4 ("vhost: split structs into a separate header file") Silencing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h' diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h Those didn't touch things used in tools, i.e. the following continues working: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh static const char *vhost_virtio_ioctl_cmds[] = { [0x00] = "SET_FEATURES", [0x01] = "SET_OWNER", [0x02] = "RESET_OWNER", [0x03] = "SET_MEM_TABLE", [0x04] = "SET_LOG_BASE", [0x07] = "SET_LOG_FD", [0x10] = "SET_VRING_NUM", [0x11] = "SET_VRING_ADDR", [0x12] = "SET_VRING_BASE", [0x13] = "SET_VRING_ENDIAN", [0x14] = "GET_VRING_ENDIAN", [0x20] = "SET_VRING_KICK", [0x21] = "SET_VRING_CALL", [0x22] = "SET_VRING_ERR", [0x23] = "SET_VRING_BUSYLOOP_TIMEOUT", [0x24] = "GET_VRING_BUSYLOOP_TIMEOUT", [0x25] = "SET_BACKEND_FEATURES", [0x30] = "NET_SET_BACKEND", [0x40] = "SCSI_SET_ENDPOINT", [0x41] = "SCSI_CLEAR_ENDPOINT", [0x42] = "SCSI_GET_ABI_VERSION", [0x43] = "SCSI_SET_EVENTS_MISSED", [0x44] = "SCSI_GET_EVENTS_MISSED", [0x60] = "VSOCK_SET_GUEST_CID", [0x61] = "VSOCK_SET_RUNNING", }; static const char *vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", [0x12] = "GET_VRING_BASE", [0x26] = "GET_BACKEND_FEATURES", }; $ At some point in the eBPFication of perf, using something like: # perf trace -e ioctl(cmd=VHOST_VRING*) Will setup a BPF filter right at the raw_syscalls:sys_enter tracepoint, i.e. filtering at the origin. Cc: Adrian Hunter Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Michael S. Tsirkin Cc: Namhyung Kim Cc: Paolo Bonzini Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-g28usrt7l59lwq3wuh8vz...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/include/uapi/linux/vhost.h | 113 +-- 1 file changed, 2 insertions(+), 111 deletions(-) diff --git a/tools/include/uapi/linux/vhost.h b/tools/include/uapi/linux/vhost.h index 84c3de89696a..40d028eed645 100644 --- a/tools/include/uapi/linux/vhost.h +++ b/tools/include/uapi/linux/vhost.h @@ -11,94 +11,9 @@ * device configuration. */ +#include #include -#include #include -#include -#include - -struct vhost_vring_state { - unsigned int index; - unsigned int num; -}; - -struct vhost_vring_file { - unsigned int index; - int fd; /* Pass -1 to unbind from file. */ - -}; - -struct vhost_vring_addr { - unsigned int index; - /* Option flags. */ - unsigned int flags; - /* Flag values: */ - /* Whether log address is valid. If set enables logging. */ -#define VHOST_VRING_F_LOG 0 - - /* Start of array of descriptors (virtually contiguous) */ - __u64 desc_user_addr; - /* Used structure address. Must be 32 bit aligned */ - __u64 used_user_addr; - /* Available structure address. Must be 16 bit aligned */ - __u64 avail_user_addr; - /* Logging support. */ - /* Log writes to used structure, at offset calculated from specified -* address. Address must be 32 bit aligned. */ - __u64 log_guest_addr; -}; - -/* no alignment requirement */ -struct vhost_iotlb_msg { - __u64 iova; - __u64 size; - __u64 uaddr; -#define VHOST_ACCESS_RO 0x1 -#define VHOST_ACCESS_WO 0x2 -#define VHOST_ACCESS_RW 0x3 - __u8 perm; -#define VHOST_IOTLB_MISS 1 -#define VHOST_IOTLB_UPDATE 2 -#define VHOST_IOTLB_INVALIDATE 3 -#define VHOST_IOTLB_ACCESS_FAIL4 - __u8 type; -}; - -#define VHOST_IOTLB_MSG 0x1 -#define VHOST_IOTLB_MSG_V2 0x2 - -struct vhost_msg { - int type; - union { - struct vhost_iotlb_msg iotlb; - __u8 padding[64]; - }; -}; - -struct vhost_msg_v2 { - __u32 type; - __u32 reserved; - union { - struct vhost_iotlb_msg iotlb; - __u8 padding[64]; - }; -}; - -struct vhost_memory_region { - __u64 guest_phys_addr; - __u64 memory_size; /* bytes */ - __u64 userspace_addr; - __u64 flags_padding; /* No flags are currently specified. */ -}; - -/* All region addresses and sizes must be 4K aligned. */ -#define VHOST_PAGE_SIZE 0x1000 - -struct
[tip:perf/urgent] tools include uapi: Sync linux/fs.h copy with the kernel sources
Commit-ID: fdc42ca190c7d8976f4f9240752f0bd008270b72 Gitweb: https://git.kernel.org/tip/fdc42ca190c7d8976f4f9240752f0bd008270b72 Author: Arnaldo Carvalho de Melo AuthorDate: Tue, 8 Jan 2019 13:48:14 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 14:09:33 -0300 tools include uapi: Sync linux/fs.h copy with the kernel sources To get the changes in: e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled") That made the mount flags string table generator to switch to using mount.h instead. This silences the following perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h' diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h Cc: Adrian Hunter Cc: David Howells Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-mosz81pa6iwxko4p2owbm...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/include/uapi/linux/fs.h | 60 ++- 1 file changed, 8 insertions(+), 52 deletions(-) diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h index a441ea1bfe6d..121e82ce296b 100644 --- a/tools/include/uapi/linux/fs.h +++ b/tools/include/uapi/linux/fs.h @@ -14,6 +14,11 @@ #include #include +/* Use of MS_* flags within the kernel is restricted to core mount(2) code. */ +#if !defined(__KERNEL__) +#include +#endif + /* * It's silly to have NR_OPEN bigger than NR_FILE, but you can change * the file limit at runtime and only root can increase the per-process @@ -101,57 +106,6 @@ struct inodes_stat_t { #define NR_FILE 8192 /* this can well be larger on a larger system */ - -/* - * These are the fs-independent mount-flags: up to 32 flags are supported - */ -#define MS_RDONLY 1 /* Mount read-only */ -#define MS_NOSUID 2 /* Ignore suid and sgid bits */ -#define MS_NODEV4 /* Disallow access to device special files */ -#define MS_NOEXEC 8 /* Disallow program execution */ -#define MS_SYNCHRONOUS 16 /* Writes are synced at once */ -#define MS_REMOUNT 32 /* Alter flags of a mounted FS */ -#define MS_MANDLOCK64 /* Allow mandatory locks on an FS */ -#define MS_DIRSYNC 128 /* Directory modifications are synchronous */ -#define MS_NOATIME 1024/* Do not update access times. */ -#define MS_NODIRATIME 2048/* Do not update directory access times */ -#define MS_BIND4096 -#define MS_MOVE8192 -#define MS_REC 16384 -#define MS_VERBOSE 32768 /* War is peace. Verbosity is silence. - MS_VERBOSE is deprecated. */ -#define MS_SILENT 32768 -#define MS_POSIXACL(1<<16) /* VFS does not apply the umask */ -#define MS_UNBINDABLE (1<<17) /* change to unbindable */ -#define MS_PRIVATE (1<<18) /* change to private */ -#define MS_SLAVE (1<<19) /* change to slave */ -#define MS_SHARED (1<<20) /* change to shared */ -#define MS_RELATIME(1<<21) /* Update atime relative to mtime/ctime. */ -#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */ -#define MS_I_VERSION (1<<23) /* Update inode I_version field */ -#define MS_STRICTATIME (1<<24) /* Always perform atime updates */ -#define MS_LAZYTIME(1<<25) /* Update the on-disk [acm]times lazily */ - -/* These sb flags are internal to the kernel */ -#define MS_SUBMOUNT (1<<26) -#define MS_NOREMOTELOCK(1<<27) -#define MS_NOSEC (1<<28) -#define MS_BORN(1<<29) -#define MS_ACTIVE (1<<30) -#define MS_NOUSER (1<<31) - -/* - * Superblock flags that can be altered by MS_REMOUNT - */ -#define MS_RMT_MASK(MS_RDONLY|MS_SYNCHRONOUS|MS_MANDLOCK|MS_I_VERSION|\ -MS_LAZYTIME) - -/* - * Old magic mount flag and mask - */ -#define MS_MGC_VAL 0xC0ED -#define MS_MGC_MSK 0x - /* * Structure for FS_IOC_FSGETXATTR[A] and FS_IOC_FSSETXATTR. */ @@ -269,7 +223,8 @@ struct fsxattr { #define FS_POLICY_FLAGS_PAD_16 0x02 #define FS_POLICY_FLAGS_PAD_32 0x03 #define FS_POLICY_FLAGS_PAD_MASK 0x03 -#define FS_POLICY_FLAGS_VALID 0x03 +#define FS_POLICY_FLAG_DIRECT_KEY 0x04/* use master key directly */ +#define FS_POLICY_FLAGS_VALID 0x07 /* Encryption algorithms */ #define FS_ENCRYPTION_MODE_INVALID 0 @@ -281,6 +236,7 @@ struct fsxattr { #define FS_ENCRYPTION_MODE_AES_128_CTS 6 #define FS_ENCRYPTION_MODE_SPECK128_256_XTS7 /* Removed, do not use. */ #define FS_ENCRYPTION_MODE_SPECK128_256_CTS8 /* Removed, do not use. */ +#define FS_ENCRYPTION_MODE_ADIANTUM9 struct fscrypt_policy { __u8 version;
[tip:perf/urgent] perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h
Commit-ID: 1c23397d2a6a077ab32f01c01406c2fe61b7b3a4 Gitweb: https://git.kernel.org/tip/1c23397d2a6a077ab32f01c01406c2fe61b7b3a4 Author: Arnaldo Carvalho de Melo AuthorDate: Tue, 8 Jan 2019 13:46:43 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 14:09:33 -0300 perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h As now we'll update our fs.h copy and what tools/perf/trace/beauty/mount_flags.sh needs just got moved to mount.h, use that instead. Cc: Adrian Hunter Cc: David Howells Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-ls19h376xukeouxrw9dsw...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/trace/beauty/mount_flags.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/trace/beauty/mount_flags.sh b/tools/perf/trace/beauty/mount_flags.sh index 45547573a1db..847850b2ef6c 100755 --- a/tools/perf/trace/beauty/mount_flags.sh +++ b/tools/perf/trace/beauty/mount_flags.sh @@ -5,11 +5,11 @@ printf "static const char *mount_flags[] = {\n" regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+([[:digit:]]+)[[:space:]]*.*' -egrep $regex ${header_dir}/fs.h | egrep -v '(MSK|VERBOSE|MGC_VAL)\>' | \ +egrep $regex ${header_dir}/mount.h | egrep -v '(MSK|VERBOSE|MGC_VAL)\>' | \ sed -r "s/$regex/\2 \2 \1/g" | sort -n | \ xargs printf "\t[%s ? (ilog2(%s) + 1) : 0] = \"%s\",\n" regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+\(1<<([[:digit:]]+)\)[[:space:]]*.*' -egrep $regex ${header_dir}/fs.h | \ +egrep $regex ${header_dir}/mount.h | \ sed -r "s/$regex/\2 \1/g" | \ xargs printf "\t[%s + 1] = \"%s\",\n" printf "};\n"
Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()
On 09.01.19 06:39, Sergey Senozhatsky wrote: > On (01/08/19 08:44), Helge Deller wrote: >> Date: Tue, 8 Jan 2019 08:44:19 +0100 >> From: Helge Deller >> To: Sergey Senozhatsky , "James E . J . >> Bottomley" >> Cc: linux-par...@vger.kernel.org, linux-kernel@vger.kernel.org, Sergey >> Senozhatsky >> Subject: Re: [PATCH] parisc: replace oops_in_progress manipulation with >> bust_spinlocks() >> Message-ID: >> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 >> Thunderbird/60.4.0 >> >> On 07.01.19 10:56, Sergey Senozhatsky wrote: >>> Use bust_spinlocks() function to set oops_in_progress. >>> >>> Signed-off-by: Sergey Senozhatsky >> >> I've added this patch to the parisc for-next tree. > > Thanks. > > BTW, > jejb at parisc-linux.org bounces. Shall it be replaced with > HansenPartnership.com email address? > > --- > > diff --git a/MAINTAINERS b/MAINTAINERS > index 7fc7813cd9c2..86b174dbe506 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -11511,7 +11511,7 @@ F:Documentation/blockdev/paride.txt > F: drivers/block/paride/ > > PARISC ARCHITECTURE > -M: "James E.J. Bottomley" > +M: "James E.J. Bottomley" > M: Helge Deller > L: linux-par...@vger.kernel.org > W: http://www.parisc-linux.org/ > I prefer that change. In my role as admin of the parisc-linux.org domain I plan to retire all @parisc-linux.org email addresses soon anyway. Helge
[tip:perf/urgent] perf top: Lift restriction on using callchains without "sym" in --sort
Commit-ID: f2e14cd2c93699aa0aeaa8240457ab359f1258ff Gitweb: https://git.kernel.org/tip/f2e14cd2c93699aa0aeaa8240457ab359f1258ff Author: Arnaldo Carvalho de Melo AuthorDate: Tue, 8 Jan 2019 10:56:59 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 perf top: Lift restriction on using callchains without "sym" in --sort This restriction is not present in 'perf report' and since 'perf top' uses the same hists browser, remove it from it as well. With this we create per event buckets with callchain trees, so that # perf top --sort dso -g --no-children Bucketizes samples by DSO and below it shows the callchains leading to functions in this DSO. Try also: # perf top -e sched:*switch -g --no-children To see the callchains leading to sched switches, pressing 'E' to expand all one can quickly see the most common scheduler switches and what leads to them, for instance, calls to IO, futexes, etc. Acked-by: Namhyung Kim Cc: Adrian Hunter Cc: Jiri Olsa Link: https://lkml.kernel.org/r/20190107140854.ga28...@kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-top.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index fe3ecfb2e64b..f64e312db787 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1028,12 +1028,7 @@ out_err: static int callchain_param__setup_sample_type(struct callchain_param *callchain) { - if (!perf_hpp_list.sym) { - if (callchain->enabled) { - ui__error("Selected -g but \"sym\" not present in --sort/-s."); - return -EINVAL; - } - } else if (callchain->mode != CHAIN_NONE) { + if (callchain->mode != CHAIN_NONE) { if (callchain_register_param(callchain) < 0) { ui__error("Can't register callchain params.\n"); return -EINVAL;
[tip:perf/urgent] tools include uapi: Grab a copy of linux/mount.h
Commit-ID: 250bfc87ddc427fa001bbc8bc1468ce5fc06645b Gitweb: https://git.kernel.org/tip/250bfc87ddc427fa001bbc8bc1468ce5fc06645b Author: Arnaldo Carvalho de Melo AuthorDate: Tue, 8 Jan 2019 13:42:37 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 14:09:28 -0300 tools include uapi: Grab a copy of linux/mount.h We were using a copy of uapi/linux/fs.h to create the mount syscall 'flags' string table to use in 'perf trace', to convert from the number obtained via the raw_syscalls:sys_enter into a string, using tools/perf/trace/beauty/mount_flags.sh, but in e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled") those defines got moved to linux/mount.h, so grab a copy of mount.h too. Keep the uapi/linux/fs.h as we'll use it for the SEEK_ constants. Cc: Adrian Hunter Cc: David Howells Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-i2ricmpwpdrpukfq3298j...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- {include => tools/include}/uapi/linux/mount.h | 0 tools/perf/check-headers.sh | 1 + 2 files changed, 1 insertion(+) diff --git a/include/uapi/linux/mount.h b/tools/include/uapi/linux/mount.h similarity index 100% copy from include/uapi/linux/mount.h copy to tools/include/uapi/linux/mount.h diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh index 6cb98f8570a2..b51e952ab35f 100755 --- a/tools/perf/check-headers.sh +++ b/tools/perf/check-headers.sh @@ -10,6 +10,7 @@ include/uapi/linux/fs.h include/uapi/linux/kcmp.h include/uapi/linux/kvm.h include/uapi/linux/in.h +include/uapi/linux/mount.h include/uapi/linux/perf_event.h include/uapi/linux/prctl.h include/uapi/linux/sched.h
[tip:perf/urgent] tools lib traceevent: Remove tep_data_event_from_type() API
Commit-ID: 9231967e2f515fce9e19687c0c40dfda416b3512 Gitweb: https://git.kernel.org/tip/9231967e2f515fce9e19687c0c40dfda416b3512 Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:13 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Remove tep_data_event_from_type() API In order to make libtraceevent into a proper library, its API should be straightforward. After discussion with Steven Rostedt, we decided to remove the tep_data_event_from_type() API and to replace it with tep_find_event(), as it does the same. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.913841...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse.c | 12 tools/lib/traceevent/event-parse.h | 1 - 2 files changed, 13 deletions(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 54d94054eef0..abd4fa5d3088 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -5265,18 +5265,6 @@ int tep_data_type(struct tep_handle *pevent, struct tep_record *rec) return trace_parse_common_type(pevent, rec->data); } -/** - * tep_data_event_from_type - find the event by a given type - * @pevent: a handle to the pevent - * @type: the type of the event. - * - * This returns the event form a given @type; - */ -struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int type) -{ - return tep_find_event(pevent, type); -} - /** * tep_data_pid - parse the PID from record * @pevent: a handle to the pevent diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index bd1bd9a27839..aec48f2aea8a 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -526,7 +526,6 @@ tep_find_event_by_record(struct tep_handle *pevent, struct tep_record *record); void tep_data_lat_fmt(struct tep_handle *pevent, struct trace_seq *s, struct tep_record *record); int tep_data_type(struct tep_handle *pevent, struct tep_record *rec); -struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int type); int tep_data_pid(struct tep_handle *pevent, struct tep_record *rec); int tep_data_preempt_count(struct tep_handle *pevent, struct tep_record *rec); int tep_data_flags(struct tep_handle *pevent, struct tep_record *rec);
[tip:perf/urgent] tools lib traceevent: Rename tep_is_file_bigendian() to tep_file_bigendian()
Commit-ID: 4104e604277016b3e6a7d120368054f9d2716953 Gitweb: https://git.kernel.org/tip/4104e604277016b3e6a7d120368054f9d2716953 Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:12 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Rename tep_is_file_bigendian() to tep_file_bigendian() In order to make libtraceevent into a proper library, its API should be straightforward. After a discussion with Steven Rostedt, we decided to rename a few APIs, to have more intuitive names. This patch renames tep_is_file_bigendian() to tep_file_bigendian(). Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.767549...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse-api.c | 4 ++-- tools/lib/traceevent/event-parse.h | 2 +- tools/lib/traceevent/plugin_kvm.c | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/lib/traceevent/event-parse-api.c b/tools/lib/traceevent/event-parse-api.c index 8b31c0e00ba3..d463761a58f4 100644 --- a/tools/lib/traceevent/event-parse-api.c +++ b/tools/lib/traceevent/event-parse-api.c @@ -194,13 +194,13 @@ void tep_set_page_size(struct tep_handle *pevent, int _page_size) } /** - * tep_is_file_bigendian - get if the file is in big endian order + * tep_file_bigendian - get if the file is in big endian order * @pevent: a handle to the tep_handle * * This returns if the file is in big endian order * If @pevent is NULL, 0 is returned. */ -int tep_is_file_bigendian(struct tep_handle *pevent) +int tep_file_bigendian(struct tep_handle *pevent) { if(pevent) return pevent->file_bigendian; diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index ac377ae99008..bd1bd9a27839 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -559,7 +559,7 @@ int tep_get_long_size(struct tep_handle *pevent); void tep_set_long_size(struct tep_handle *pevent, int long_size); int tep_get_page_size(struct tep_handle *pevent); void tep_set_page_size(struct tep_handle *pevent, int _page_size); -int tep_is_file_bigendian(struct tep_handle *pevent); +int tep_file_bigendian(struct tep_handle *pevent); void tep_set_file_bigendian(struct tep_handle *pevent, enum tep_endian endian); int tep_is_host_bigendian(struct tep_handle *pevent); void tep_set_host_bigendian(struct tep_handle *pevent, enum tep_endian endian); diff --git a/tools/lib/traceevent/plugin_kvm.c b/tools/lib/traceevent/plugin_kvm.c index 754050eea467..64b9c25a1fd3 100644 --- a/tools/lib/traceevent/plugin_kvm.c +++ b/tools/lib/traceevent/plugin_kvm.c @@ -389,7 +389,7 @@ static int kvm_mmu_print_role(struct trace_seq *s, struct tep_record *record, * We can only use the structure if file is of the same * endianness. */ - if (tep_is_file_bigendian(event->pevent) == + if (tep_file_bigendian(event->pevent) == tep_is_host_bigendian(event->pevent)) { trace_seq_printf(s, "%u q%u%s %s%s %spae %snxe %swp%s%s%s",
[tip:perf/urgent] tools lib traceevent: Changed return logic of tep_register_event_handler() API
Commit-ID: f87ce7c43f36d4abff91b19edadd23939f99ff98 Gitweb: https://git.kernel.org/tip/f87ce7c43f36d4abff91b19edadd23939f99ff98 Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:11 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Changed return logic of tep_register_event_handler() API In order to make libtraceevent into a proper library, its API should be straightforward. The tep_register_event_handler() functions returns -1 in case it successfully registers the new event handler. Such return code is used by the other library APIs in case of an error. To unify the return logic of tep_register_event_handler() with the other APIs, this patch introduces enum tep_reg_handler, which is used by this function as return value, to handle all possible successful return cases. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.628034...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse.c | 10 -- tools/lib/traceevent/event-parse.h | 5 + 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index a850342baf86..54d94054eef0 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -6632,6 +6632,12 @@ static struct tep_event *search_event(struct tep_handle *pevent, int id, * * If @id is >= 0, then it is used to find the event. * else @sys_name and @event_name are used. + * + * Returns: + * TEP_REGISTER_SUCCESS_OVERWRITE if an existing handler is overwritten + * TEP_REGISTER_SUCCESS if a new handler is registered successfully + * negative TEP_ERRNO_... in case of an error + * */ int tep_register_event_handler(struct tep_handle *pevent, int id, const char *sys_name, const char *event_name, @@ -6649,7 +6655,7 @@ int tep_register_event_handler(struct tep_handle *pevent, int id, event->handler = func; event->context = context; - return 0; + return TEP_REGISTER_SUCCESS_OVERWRITE; not_found: /* Save for later use. */ @@ -6679,7 +6685,7 @@ int tep_register_event_handler(struct tep_handle *pevent, int id, pevent->handlers = handle; handle->context = context; - return -1; + return TEP_REGISTER_SUCCESS; } static int handle_matches(struct event_handler *handler, int id, diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h index 77a4a1dd4b4d..ac377ae99008 100644 --- a/tools/lib/traceevent/event-parse.h +++ b/tools/lib/traceevent/event-parse.h @@ -485,6 +485,11 @@ int tep_print_func_field(struct trace_seq *s, const char *fmt, struct tep_event *event, const char *name, struct tep_record *record, int err); +enum tep_reg_handler { + TEP_REGISTER_SUCCESS = 0, + TEP_REGISTER_SUCCESS_OVERWRITE, +}; + int tep_register_event_handler(struct tep_handle *pevent, int id, const char *sys_name, const char *event_name, tep_event_handler_func func, void *context);
[tip:perf/urgent] tools lib traceevent: Changed return logic of trace_seq_printf() and trace_seq_vprintf() APIs
Commit-ID: 6d2d6fd7e3ee0daf0d8308741792b3ec41aafd0c Gitweb: https://git.kernel.org/tip/6d2d6fd7e3ee0daf0d8308741792b3ec41aafd0c Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:10 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Changed return logic of trace_seq_printf() and trace_seq_vprintf() APIs In order to make libtraceevent into a proper library, its API should be straightforward. The trace_seq_printf() and trace_seq_vprintf() APIs have inconsistent returned values with the other trace_seq_* APIs. This path changes the return logic of trace_seq_printf() and trace_seq_vprintf() to return the number of printed characters, as the other trace_seq_* related APIs. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.485792...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/trace-seq.c | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/tools/lib/traceevent/trace-seq.c b/tools/lib/traceevent/trace-seq.c index 8ff1d55954d1..8d5ecd2bf877 100644 --- a/tools/lib/traceevent/trace-seq.c +++ b/tools/lib/traceevent/trace-seq.c @@ -100,7 +100,8 @@ static void expand_buffer(struct trace_seq *s) * @fmt: printf format string * * It returns 0 if the trace oversizes the buffer's free - * space, 1 otherwise. + * space, the number of characters printed, or a negative + * value in case of an error. * * The tracer may use either sequence operations or its own * copy to user routines. To simplify formating of a trace @@ -129,9 +130,10 @@ trace_seq_printf(struct trace_seq *s, const char *fmt, ...) goto try_again; } - s->len += ret; + if (ret > 0) + s->len += ret; - return 1; + return ret; } /** @@ -139,6 +141,10 @@ trace_seq_printf(struct trace_seq *s, const char *fmt, ...) * @s: trace sequence descriptor * @fmt: printf format string * + * It returns 0 if the trace oversizes the buffer's free + * space, the number of characters printed, or a negative + * value in case of an error. + * * * The tracer may use either sequence operations or its own * copy to user routines. To simplify formating of a trace * trace_seq_printf is used to store strings into a special @@ -163,9 +169,10 @@ trace_seq_vprintf(struct trace_seq *s, const char *fmt, va_list args) goto try_again; } - s->len += ret; + if (ret > 0) + s->len += ret; - return len; + return ret; } /**
[tip:perf/urgent] tools lib traceevent: Initialize host_bigendian at tep_handle allocation
Commit-ID: eed14f4b075ec594ac09921b998bf3dd61f5886b Gitweb: https://git.kernel.org/tip/eed14f4b075ec594ac09921b998bf3dd61f5886b Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:08 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Initialize host_bigendian at tep_handle allocation This patch initializes the host_bigendian member of the tep_handle structure with the byte order of the current host, when this handler is created - in tep_alloc() API. We need this in order to remove the tep_set_host_bigendian() API. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.216292...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 156e513074b2..44b80471b024 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -6762,8 +6762,10 @@ struct tep_handle *tep_alloc(void) { struct tep_handle *pevent = calloc(1, sizeof(*pevent)); - if (pevent) + if (pevent) { pevent->ref_count = 1; + pevent->host_bigendian = tep_host_bigendian(); + } return pevent; }
[tip:perf/urgent] tools lib traceevent: Rename struct cmdline to struct tep_cmdline
Commit-ID: 2e4318a287bdf815140462257ab8697f5289a12f Gitweb: https://git.kernel.org/tip/2e4318a287bdf815140462257ab8697f5289a12f Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 23:08:09 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Rename struct cmdline to struct tep_cmdline In order to make libtraceevent a proper library, variables, data structures and functions should have a unique prefix to prevent name space conflicts. That prefix will be "tep_". This patch renames 'struct cmdline' to 'struct tep_cmdline'. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181201040852.358871...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse-local.h | 4 ++-- tools/lib/traceevent/event-parse.c | 36 tools/lib/traceevent/event-parse.h | 8 +++ 3 files changed, 24 insertions(+), 24 deletions(-) diff --git a/tools/lib/traceevent/event-parse-local.h b/tools/lib/traceevent/event-parse-local.h index 9a092dd4a86d..35833ee32d6c 100644 --- a/tools/lib/traceevent/event-parse-local.h +++ b/tools/lib/traceevent/event-parse-local.h @@ -7,7 +7,7 @@ #ifndef _PARSE_EVENTS_INT_H #define _PARSE_EVENTS_INT_H -struct cmdline; +struct tep_cmdline; struct cmdline_list; struct func_map; struct func_list; @@ -36,7 +36,7 @@ struct tep_handle { int long_size; int page_size; - struct cmdline *cmdlines; + struct tep_cmdline *cmdlines; struct cmdline_list *cmdlist; int cmdline_count; diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 44b80471b024..a850342baf86 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -124,15 +124,15 @@ struct tep_print_arg *alloc_arg(void) return calloc(1, sizeof(struct tep_print_arg)); } -struct cmdline { +struct tep_cmdline { char *comm; int pid; }; static int cmdline_cmp(const void *a, const void *b) { - const struct cmdline *ca = a; - const struct cmdline *cb = b; + const struct tep_cmdline *ca = a; + const struct tep_cmdline *cb = b; if (ca->pid < cb->pid) return -1; @@ -152,7 +152,7 @@ static int cmdline_init(struct tep_handle *pevent) { struct cmdline_list *cmdlist = pevent->cmdlist; struct cmdline_list *item; - struct cmdline *cmdlines; + struct tep_cmdline *cmdlines; int i; cmdlines = malloc(sizeof(*cmdlines) * pevent->cmdline_count); @@ -179,8 +179,8 @@ static int cmdline_init(struct tep_handle *pevent) static const char *find_cmdline(struct tep_handle *pevent, int pid) { - const struct cmdline *comm; - struct cmdline key; + const struct tep_cmdline *comm; + struct tep_cmdline key; if (!pid) return ""; @@ -208,8 +208,8 @@ static const char *find_cmdline(struct tep_handle *pevent, int pid) */ int tep_pid_is_registered(struct tep_handle *pevent, int pid) { - const struct cmdline *comm; - struct cmdline key; + const struct tep_cmdline *comm; + struct tep_cmdline key; if (!pid) return 1; @@ -235,9 +235,9 @@ int tep_pid_is_registered(struct tep_handle *pevent, int pid) static int add_new_comm(struct tep_handle *pevent, const char *comm, int pid, bool override) { - struct cmdline *cmdlines = pevent->cmdlines; - struct cmdline *cmdline; - struct cmdline key; + struct tep_cmdline *cmdlines = pevent->cmdlines; + struct tep_cmdline *cmdline; + struct tep_cmdline key; char *new_comm; if (!pid) @@ -5331,8 +5331,8 @@ const char *tep_data_comm_from_pid(struct tep_handle *pevent, int pid) return comm; } -static struct cmdline * -pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct cmdline *next) +static struct tep_cmdline * +pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct tep_cmdline *next) { struct cmdline_list *cmdlist = (struct cmdline_list *)next; @@ -5344,7 +5344,7 @@ pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct cmdline *ne while (cmdlist && strcmp(cmdlist->comm, comm) != 0) cmdlist = cmdlist->next; - return (struct cmdline *)cmdlist; + return (struct tep_cmdline *)cmdlist; } /** @@ -5360,10 +5360,10 @@ pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct cmdline *ne * next pid. * Also, it does a linear search, so it may be slow. */ -struct cmdline *tep_data_pid_from_comm(struct tep_handle *pevent, const char *comm, - struct cmdline *next) +struct tep_cmdline *tep_data_pid_from_comm(struct tep_handle
[tip:perf/urgent] perf tests: Add a test for the ARM 32-bit [vectors] page
Commit-ID: 21327c7843e9169d5e2e527713e60e6c9842a56c Gitweb: https://git.kernel.org/tip/21327c7843e9169d5e2e527713e60e6c9842a56c Author: Florian Fainelli AuthorDate: Thu, 20 Dec 2018 19:43:37 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 perf tests: Add a test for the ARM 32-bit [vectors] page perf on ARM requires CONFIG_KUSER_HELPERS to be turned on to allow some independance with respect to the ARM CPU being used. Add a test which tries to locate the [vectors] page, created when CONFIG_KUSER_HELPERS is turned on to help asses the system's health. Signed-off-by: Florian Fainelli Acked-by: Jiri Olsa Cc: Alexander Shishkin Cc: Chris Healy Cc: Greg Kroah-Hartman Cc: Kim Phillips Cc: Lucas Stach Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Russell King Cc: Thomas Gleixner Cc: Thomas Richter Link: http://lkml.kernel.org/r/20181221034337.26663-3-f.faine...@gmail.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm/tests/Build | 1 + tools/perf/arch/arm/tests/arch-tests.c | 4 tools/perf/arch/arm/tests/vectors-page.c | 24 tools/perf/tests/tests.h | 5 + 4 files changed, 34 insertions(+) diff --git a/tools/perf/arch/arm/tests/Build b/tools/perf/arch/arm/tests/Build index 883c57ff0c08..d9ae2733f9cc 100644 --- a/tools/perf/arch/arm/tests/Build +++ b/tools/perf/arch/arm/tests/Build @@ -1,4 +1,5 @@ libperf-y += regs_load.o libperf-y += dwarf-unwind.o +libperf-y += vectors-page.o libperf-y += arch-tests.o diff --git a/tools/perf/arch/arm/tests/arch-tests.c b/tools/perf/arch/arm/tests/arch-tests.c index 5b1543c98022..6848101a855f 100644 --- a/tools/perf/arch/arm/tests/arch-tests.c +++ b/tools/perf/arch/arm/tests/arch-tests.c @@ -10,6 +10,10 @@ struct test arch_tests[] = { .func = test__dwarf_unwind, }, #endif + { + .desc = "Vectors page", + .func = test__vectors_page, + }, { .func = NULL, }, diff --git a/tools/perf/arch/arm/tests/vectors-page.c b/tools/perf/arch/arm/tests/vectors-page.c new file mode 100644 index ..7ffdd79971c8 --- /dev/null +++ b/tools/perf/arch/arm/tests/vectors-page.c @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include + +#include "debug.h" +#include "tests/tests.h" +#include "util/find-map.c" + +#define VECTORS__MAP_NAME "[vectors]" + +int test__vectors_page(struct test *test __maybe_unused, + int subtest __maybe_unused) +{ + void *start, *end; + + if (find_map(, , VECTORS__MAP_NAME)) { + pr_err("%s not found, is CONFIG_KUSER_HELPERS enabled?\n", + VECTORS__MAP_NAME); + return TEST_FAIL; + } + + return TEST_OK; +} diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h index b82f55fcc294..399f18ca71a3 100644 --- a/tools/perf/tests/tests.h +++ b/tools/perf/tests/tests.h @@ -119,4 +119,9 @@ int test__arch_unwind_sample(struct perf_sample *sample, struct thread *thread); #endif #endif + +#if defined(__arm__) +int test__vectors_page(struct test *test, int subtest); +#endif + #endif /* TESTS_H */
[tip:perf/urgent] tools lib traceevent: Introduce new libtracevent API: tep_override_comm()
Commit-ID: ca3958b1c0968a6f3105e211355f128ce871e796 Gitweb: https://git.kernel.org/tip/ca3958b1c0968a6f3105e211355f128ce871e796 Author: Tzvetomir Stoyanov AuthorDate: Fri, 30 Nov 2018 10:44:11 -0500 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 tools lib traceevent: Introduce new libtracevent API: tep_override_comm() This patch adds a new API of tracevent library: tep_override_comm() It registers a pid / command mapping. If a mapping with the same pid already exists, the entry is updated with the new command. Signed-off-by: Tzvetomir Stoyanov Cc: Andrew Morton Cc: Jiri Olsa Cc: Namhyung Kim Link: http://lkml.kernel.org/r/20181130154648.038915...@goodmis.org Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/traceevent/event-parse.c | 69 +- tools/lib/traceevent/event-parse.h | 1 + 2 files changed, 55 insertions(+), 15 deletions(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 69a96e39f0ab..156e513074b2 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -232,11 +232,13 @@ int tep_pid_is_registered(struct tep_handle *pevent, int pid) * we must add this pid. This is much slower than when cmdlines * are added before the array is initialized. */ -static int add_new_comm(struct tep_handle *pevent, const char *comm, int pid) +static int add_new_comm(struct tep_handle *pevent, + const char *comm, int pid, bool override) { struct cmdline *cmdlines = pevent->cmdlines; - const struct cmdline *cmdline; + struct cmdline *cmdline; struct cmdline key; + char *new_comm; if (!pid) return 0; @@ -247,8 +249,19 @@ static int add_new_comm(struct tep_handle *pevent, const char *comm, int pid) cmdline = bsearch(, pevent->cmdlines, pevent->cmdline_count, sizeof(*pevent->cmdlines), cmdline_cmp); if (cmdline) { - errno = EEXIST; - return -1; + if (!override) { + errno = EEXIST; + return -1; + } + new_comm = strdup(comm); + if (!new_comm) { + errno = ENOMEM; + return -1; + } + free(cmdline->comm); + cmdline->comm = new_comm; + + return 0; } cmdlines = realloc(cmdlines, sizeof(*cmdlines) * (pevent->cmdline_count + 1)); @@ -275,21 +288,13 @@ static int add_new_comm(struct tep_handle *pevent, const char *comm, int pid) return 0; } -/** - * tep_register_comm - register a pid / comm mapping - * @pevent: handle for the pevent - * @comm: the command line to register - * @pid: the pid to map the command line to - * - * This adds a mapping to search for command line names with - * a given pid. The comm is duplicated. - */ -int tep_register_comm(struct tep_handle *pevent, const char *comm, int pid) +static int _tep_register_comm(struct tep_handle *pevent, + const char *comm, int pid, bool override) { struct cmdline_list *item; if (pevent->cmdlines) - return add_new_comm(pevent, comm, pid); + return add_new_comm(pevent, comm, pid, override); item = malloc(sizeof(*item)); if (!item) @@ -312,6 +317,40 @@ int tep_register_comm(struct tep_handle *pevent, const char *comm, int pid) return 0; } +/** + * tep_register_comm - register a pid / comm mapping + * @pevent: handle for the pevent + * @comm: the command line to register + * @pid: the pid to map the command line to + * + * This adds a mapping to search for command line names with + * a given pid. The comm is duplicated. If a command with the same pid + * already exist, -1 is returned and errno is set to EEXIST + */ +int tep_register_comm(struct tep_handle *pevent, const char *comm, int pid) +{ + return _tep_register_comm(pevent, comm, pid, false); +} + +/** + * tep_override_comm - register a pid / comm mapping + * @pevent: handle for the pevent + * @comm: the command line to register + * @pid: the pid to map the command line to + * + * This adds a mapping to search for command line names with + * a given pid. The comm is duplicated. If a command with the same pid + * already exist, the command string is udapted with the new one + */ +int tep_override_comm(struct tep_handle *pevent, const char *comm, int pid) +{ + if (!pevent->cmdlines && cmdline_init(pevent)) { + errno = ENOMEM; + return -1; + } + return _tep_register_comm(pevent, comm, pid, true); +} + int tep_register_trace_clock(struct tep_handle *pevent, const char *trace_clock) { pevent->trace_clock = strdup(trace_clock); diff --git a/tools/lib/traceevent/event-parse.h
[tip:perf/urgent] perf tools: Make find_vdso_map() more modular
Commit-ID: 011532379b7c2de6757e129037bdfc8d704bce23 Gitweb: https://git.kernel.org/tip/011532379b7c2de6757e129037bdfc8d704bce23 Author: Florian Fainelli AuthorDate: Thu, 20 Dec 2018 19:43:36 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:13 -0300 perf tools: Make find_vdso_map() more modular In preparation for checking that the vectors page on the ARM architecture, refactor the find_vdso_map() function to accept finding an arbitrary string and create a dedicated helper function for that under util/find-map.c and update the filename to find-map.c and all references to it: perf-read-vdso.c and util/vdso.c. Signed-off-by: Florian Fainelli Acked-by: Jiri Olsa Cc: Alexander Shishkin Cc: Chris Healy Cc: Greg Kroah-Hartman Cc: Kim Phillips Cc: Lucas Stach Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Ravi Bangoria Cc: Russell King Cc: Thomas Gleixner Cc: Thomas Richter Link: http://lkml.kernel.org/r/20181221034337.26663-2-f.faine...@gmail.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Makefile.perf| 4 ++-- tools/perf/perf-read-vdso.c | 6 +++--- tools/perf/util/{find-vdso-map.c => find-map.c} | 7 +++ tools/perf/util/vdso.c | 6 +++--- 4 files changed, 11 insertions(+), 12 deletions(-) diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 2921f829a0f4..0ee6795d82cc 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -662,12 +662,12 @@ $(OUTPUT)perf-%: %.o $(PERFLIBS) $(QUIET_LINK)$(CC) $(CFLAGS) -o $@ $(LDFLAGS) $(filter %.o,$^) $(LIBS) ifndef NO_PERF_READ_VDSO32 -$(OUTPUT)perf-read-vdso32: perf-read-vdso.c util/find-vdso-map.c +$(OUTPUT)perf-read-vdso32: perf-read-vdso.c util/find-map.c $(QUIET_CC)$(CC) -m32 $(filter -static,$(LDFLAGS)) -Wall -Werror -o $@ perf-read-vdso.c endif ifndef NO_PERF_READ_VDSOX32 -$(OUTPUT)perf-read-vdsox32: perf-read-vdso.c util/find-vdso-map.c +$(OUTPUT)perf-read-vdsox32: perf-read-vdso.c util/find-map.c $(QUIET_CC)$(CC) -mx32 $(filter -static,$(LDFLAGS)) -Wall -Werror -o $@ perf-read-vdso.c endif diff --git a/tools/perf/perf-read-vdso.c b/tools/perf/perf-read-vdso.c index 8c0ca0cc428f..aaa5210ea84a 100644 --- a/tools/perf/perf-read-vdso.c +++ b/tools/perf/perf-read-vdso.c @@ -5,17 +5,17 @@ #define VDSO__MAP_NAME "[vdso]" /* - * Include definition of find_vdso_map() also used in util/vdso.c for + * Include definition of find_map() also used in util/vdso.c for * building perf. */ -#include "util/find-vdso-map.c" +#include "util/find-map.c" int main(void) { void *start, *end; size_t size, written; - if (find_vdso_map(, )) + if (find_map(, , VDSO__MAP_NAME)) return 1; size = end - start; diff --git a/tools/perf/util/find-vdso-map.c b/tools/perf/util/find-map.c similarity index 71% rename from tools/perf/util/find-vdso-map.c rename to tools/perf/util/find-map.c index d7823e3508fc..7b2300588ece 100644 --- a/tools/perf/util/find-vdso-map.c +++ b/tools/perf/util/find-map.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -static int find_vdso_map(void **start, void **end) +static int find_map(void **start, void **end, const char *name) { FILE *maps; char line[128]; @@ -7,7 +7,7 @@ static int find_vdso_map(void **start, void **end) maps = fopen("/proc/self/maps", "r"); if (!maps) { - fprintf(stderr, "vdso: cannot open maps\n"); + fprintf(stderr, "cannot open maps\n"); return -1; } @@ -21,8 +21,7 @@ static int find_vdso_map(void **start, void **end) if (m < 0) continue; - if (!strncmp([m], VDSO__MAP_NAME, -sizeof(VDSO__MAP_NAME) - 1)) + if (!strncmp([m], name, strlen(name))) found = 1; } diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c index 741af209b19d..3702cba11d7d 100644 --- a/tools/perf/util/vdso.c +++ b/tools/perf/util/vdso.c @@ -18,10 +18,10 @@ #include "debug.h" /* - * Include definition of find_vdso_map() also used in perf-read-vdso.c for + * Include definition of find_map() also used in perf-read-vdso.c for * building perf-read-vdso32 and perf-read-vdsox32. */ -#include "find-vdso-map.c" +#include "find-map.c" #define VDSO__TEMP_FILE_NAME "/tmp/perf-vdso.so-XX" @@ -76,7 +76,7 @@ static char *get_file(struct vdso_file *vdso_file) if (vdso_file->found) return vdso_file->temp_file_name; - if (vdso_file->error || find_vdso_map(, )) + if (vdso_file->error || find_map(, , VDSO__MAP_NAME)) return NULL; size = end - start;
[PATCH v3 1/4] usb: gadget: uvc: synchronize streamon/off with uvc_function_set_alt
If the streamon ioctl is issued while the stream is already on, then the kernel BUGs. This happens at the BUG_ON in uvc_video_alloc_requests within the call stack from the ioctl handler for VIDIOC_STREAMON. This could happen when uvc_function_set_alt 0 races and wins against uvc_v4l2_streamon, or when userspace neglects to issue the VIDIOC_STREAMOFF ioctl. To fix this, add two more uvc states: starting and stopping. Use these to prevent the racing, and to detect when VIDIOC_STREAMON is issued without previously issuing VIDIOC_STREAMOFF. Signed-off-by: Paul Elder --- Changes in v3: - add state guard to uvc_function_set_alt 1 - add documentation for newly added uvc states - reorder uvc states to more or less follow the flow diagram - add more state guards to ioctl handlers for streamon and streamoff Changes in v2: Nothing drivers/usb/gadget/function/f_uvc.c| 17 drivers/usb/gadget/function/uvc.h | 37 ++ drivers/usb/gadget/function/uvc_v4l2.c | 26 -- 3 files changed, 73 insertions(+), 7 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 8c99392df593..2ec3b73b2b75 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -317,26 +317,31 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt) switch (alt) { case 0: - if (uvc->state != UVC_STATE_STREAMING) + if (uvc->state != UVC_STATE_STREAMING && + uvc->state != UVC_STATE_STARTING) return 0; if (uvc->video.ep) usb_ep_disable(uvc->video.ep); + uvc->state = UVC_STATE_STOPPING; + memset(_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_STREAMOFF; v4l2_event_queue(>vdev, _event); - uvc->state = UVC_STATE_CONNECTED; return 0; case 1: - if (uvc->state != UVC_STATE_CONNECTED) - return 0; - if (!uvc->video.ep) return -EINVAL; + if (uvc->state == UVC_STATE_STOPPING) + return -EINVAL; + + if (uvc->state != UVC_STATE_CONNECTED) + return 0; + uvcg_info(f, "reset UVC\n"); usb_ep_disable(uvc->video.ep); @@ -346,6 +351,8 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt) return ret; usb_ep_enable(uvc->video.ep); + uvc->state = UVC_STATE_STARTING; + memset(_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_STREAMON; v4l2_event_queue(>vdev, _event); diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h index 099d650082e5..f183e349499c 100644 --- a/drivers/usb/gadget/function/uvc.h +++ b/drivers/usb/gadget/function/uvc.h @@ -101,10 +101,47 @@ struct uvc_video { unsigned int fid; }; +/** + * enum uvc_state - the states of a struct uvc_device + * @UVC_STATE_DISCONNECTED: not connected state + * - transition to connected state on .set_alt + * @UVC_STATE_CONNECTED:connected state + * - transition to disconnected state on .disable + * and alloc + * - transition to starting state on .set_alt 1 + * @UVC_STATE_STARTING: starting state + * - transition to streaming state on streamon ioctl + * - transition to stopping state on set_alt 0 + * @UVC_STATE_STREAMING:streaming state + * - transition to stopping state on .set_alt 0 + * @UVC_STATE_STOPPING: stopping state + * - transition to connected on streamoff ioctl + * + * Diagram of state transitions: + * + * disable + * +---+ + * v | + * +--+ set_alt +---+ + * | DISCONNECTED | -> | CONNECTED | + * +--++---+ + *| ^ + * set_alt 1 | | streamoff + * +--+ + + * V| + * +--+ streamon +---+ set_alt 0 +--+ + * | STARTING | --> | STREAMING | ---> | STOPPING | + * +--+ +---+ +--+ + * |^ + * | set_alt 0| + * ++ + */ enum uvc_state { UVC_STATE_DISCONNECTED, UVC_STATE_CONNECTED, +
[tip:perf/urgent] perf trace: Fix alignment for [continued] lines
Commit-ID: ac6e022cbfdce215ad545e91d9827060855da3d7 Gitweb: https://git.kernel.org/tip/ac6e022cbfdce215ad545e91d9827060855da3d7 Author: Arnaldo Carvalho de Melo AuthorDate: Mon, 7 Jan 2019 16:54:38 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:12 -0300 perf trace: Fix alignment for [continued] lines We were not taking into account the "... [continued]" printed characters, fix it. Cc: Adrian Hunter Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-qt20y0acmf8k0bzisce8k...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index b8bf5d025563..ed4583128b9c 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -2032,9 +2032,10 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel, if (ttrace->entry_pending) { printed = fprintf(trace->output, "%s", ttrace->entry_str); } else { - fprintf(trace->output, " ... ["); + printed += fprintf(trace->output, " ... ["); color_fprintf(trace->output, PERF_COLOR_YELLOW, "continued"); - fprintf(trace->output, "]: %s()", sc->name); + printed += 9; + printed += fprintf(trace->output, "]: %s()", sc->name); } printed++; /* the closing ')' */
[PATCH v3 3/4] usb: gadget: uvc: disable stream when disconnected
If the streamon ioctl is issued while the stream is already on, then the kernel BUGs. This happens at the BUG_ON in uvc_video_alloc_requests within the call stack from the ioctl handler for VIDIOC_STREAMON. This can also be triggered by starting the stream and then physically disconnecting usb. To fix this, do the streamoff procedures on usb disconnect. Since uvcg_video_enable is not interrupt-safe, add an interrupt-safe version uvcg_video_cancel, and use that. Signed-off-by: Paul Elder v2 Reviewed-by: Kieran Bingham --- Changes in v3: - added interrupt-safe uvcg_video_cancel and used instead of the non-interrupt-save uvcg_video_enable 0 Changes in v2: Nothing drivers/usb/gadget/function/f_uvc.c | 3 +++ drivers/usb/gadget/function/uvc_video.c | 13 + drivers/usb/gadget/function/uvc_video.h | 2 ++ 3 files changed, 18 insertions(+) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index b407b10a95ed..4134117b5f81 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -369,9 +369,12 @@ uvc_function_disable(struct usb_function *f) { struct uvc_device *uvc = to_uvc(f); struct v4l2_event v4l2_event; + struct uvc_video *video = >video; uvcg_info(f, "%s()\n", __func__); + uvcg_video_cancel(video, 1); + memset(_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_DISCONNECT; v4l2_event_queue(>vdev, _event); diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c index 5c042f380708..5f3ed9e0b5ad 100644 --- a/drivers/usb/gadget/function/uvc_video.c +++ b/drivers/usb/gadget/function/uvc_video.c @@ -348,6 +348,19 @@ int uvcg_video_pump(struct uvc_video *video) return 0; } +int uvcg_video_cancel(struct uvc_video *video, int disconnect) +{ + unsigned int i; + + for (i = 0; i < UVC_NUM_REQUESTS; ++i) + if (video->req[i]) + usb_ep_dequeue(video->ep, video->req[i]); + + uvc_video_free_requests(video); + uvcg_queue_cancel(>queue, disconnect); + return 0; +} + /* * Enable or disable the video stream. */ diff --git a/drivers/usb/gadget/function/uvc_video.h b/drivers/usb/gadget/function/uvc_video.h index 278dc52c7604..1f310fa58ce5 100644 --- a/drivers/usb/gadget/function/uvc_video.h +++ b/drivers/usb/gadget/function/uvc_video.h @@ -16,6 +16,8 @@ struct uvc_video; int uvcg_video_pump(struct uvc_video *video); +int uvcg_video_cancel(struct uvc_video *video, int disconnect); + int uvcg_video_enable(struct uvc_video *video, int enable); int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc); -- 2.20.1
[PATCH v3 2/4] usb: gadget: uvc: don't delay the status phase of non-zero SET_INTERFACE requests
Reception of a SET_INTERFACE request with a non-zero alternate setting signals the start of the video stream. The gadget has to enable the video streaming endpoint and to signal stream start to userspace, in order to start receiving video frames to transmit over USB. As userspace can be slow to react, the UVC function driver delays the status phase of the SET_INTERFACE control transfer until userspace is ready. The status phase is delayed by returning USB_GADGET_DELAYED_STATUS from the function's .set_alt() handler. This creates a race condition as the userspace application could process the stream start event before the composite layer processes the USB_GADGET_DELAYED_STATUS return value. The race has been observed in practice, and can't be solved without a change to the USB_GADGET_DELAYED_STATUS API. Fortunately the UVC function driver doesn't strictly require delaying the status phase for non-zero SET_INTERFACE, as the only requirement from a USB point of view is that the streaming endpoint must be enabled before the status phase completes, and that is already guaranteed by the current code. We can thus complete the status phase synchronously, removing the race condition at the same time. Without delaying the status phase the host will likely start issuing isochronous transfers before we queue the first USB requests. The UDC will reply with NAKs which should be handled properly by the host. If this ends up causing issues another option will be to modify the status phase delay API to fix the race condition. Signed-off-by: Paul Elder --- Changes in v3: Nothing Changes in v2: 1. Remove delay usb status phase drivers/usb/gadget/function/f_uvc.c| 3 ++- drivers/usb/gadget/function/uvc_v4l2.c | 6 -- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 2ec3b73b2b75..b407b10a95ed 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -356,7 +356,8 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt) memset(_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_STREAMON; v4l2_event_queue(>vdev, _event); - return USB_GADGET_DELAYED_STATUS; + + return 0; default: return -EINVAL; diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c index 97e624214287..7816ea9886e1 100644 --- a/drivers/usb/gadget/function/uvc_v4l2.c +++ b/drivers/usb/gadget/function/uvc_v4l2.c @@ -210,12 +210,6 @@ uvc_v4l2_streamon(struct file *file, void *fh, enum v4l2_buf_type type) uvc->state = UVC_STATE_STREAMING; - /* -* Complete the alternate setting selection setup phase now that -* userspace is ready to provide video frames. -*/ - uvc_function_setup_continue(uvc); - return 0; } -- 2.20.1
Re: Generic RGB LED support was Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver
Hi Pavel, On 09/01/2019 0.59, Pavel Machek wrote: Hi! Grab yourself an RGB LED and play with it; you'll see what the problems are. It is hard to explain colors over email... Video [0] gives some overview of lp5024 capabilities. I don't see any problems in exposing separate red,green,blue files and brightness for the devices with hardware support for that. Well, that's what we do today, as three separate LEDs, right? No. It doesn't allow for setting color intensity by having the color fixed beforehand. Below is relevant excerpt from the lp5024 documentation. This is not something that can be mapped to RGB color space, but rather to HSV/HSL, with the reservation that the hardware implementation uses PWM for setting color intensity. So they have feature where they have independent controls for each channel, then one common control per three channels. Other chips have common control for all the LEDs, for example. We don't support that currently; lets focus on the RGB thing first. I don't have problem with that, either; other drivers already do that. He's free to use existing same interface. But that is insufficient, as it does not allow simple stuff, such as turning led "white". So... perhaps we should agree on requirements, first, and then we can discuss solutions? Requirements for RGB LED interface: 1) Userspace should be able to set the white color 2) Userspace should be able to arbitrary color from well known list and it should approximately match what would CRT, LCD or OLED monitor display The difference is that monitor display driver is pre-calibrated for given display by the manufacturer. With the LED controllers the manufacturer has no control over what LEDs will be connected to the iouts. Therefore it should be not surprising that colors produced by custom LEDs are not as user would expect when comparing to the RGB color displayed on the monitor display. It is true that _chip_ manufacturer can not know what LEDs will be connected. But _system_ manufacturer can and should know that, and should tell be able to tell us in the dts. This renders your requirement 2) infeasible with use of custom LEDs any fixed algorithm, since the final effect will always heavily depend on the LED circuit design. Depending on LED circuit design and actual LEDs connected is okay.. we just need to get information from _system_ designer (not chip designer), and pass it to a place where color is computed. a) RGB LEDs are usually not balanced. Setting 100% PWM on red/green/blue channels will result in nothing close to white light. In fact, to get white light on N900, blue and green channel's PWM needs to be set pretty low, as in 5%. b) LED class does not define any relation between "brightness" in sysfs and ammount of light in lumens. Some drivers use close to linear relation, some use exponential relation. Human eyes percieve logarithm of lumens. RGB color model uses even more complex function. One general question: do you have any solutions in store? I played with LEDs on N900 over the weekend, yes. And getting reasonable colors seems to be possible, when a) and b) are solved... a) seems to be more important than b). Now... this does not tell us how we should design kernel<->user interface, but it should tell us that main goals - 1) and 2) are possible. I was thinking about this calibration and color correctness thing and I am thinking a bit that it should be partly in kernel and partly in user space. For displays and printers there are defined icc-profiles that define how colors are mapped to particular device in cases when you want to have accurate color representation. In theory to get accurate LED output one could model LEDs with icc profile and then pick your color and convert it with icc profile to actual raw hardware values. Then this raw hardware value could be written from user space to kernel. In kernel we could provide raw hardware value support and in case of PWM IC controlled LEDs we could provide curve mapping to linearize the behavior of values entered to enable use in cases where close enough is good enough. Eg. if you want to have "white" then you have your color space picker like sRGB(255,255,255). Then you would define icc profile it might translate to hardware raw values 253%, 230%, 225%. Then you would write this to kernel with: # set red, green and blue color elements echo "253 230 225" > color In this case however behavior of brightness node is challening in accuracy domain. Britghtness value 255 would of course provide 1:1 mapping in this case. To go right to correct color and brightness at one go we could have optional brightness at the end of color value array: # set red, green and blue color element and brightness setting: echo "253 230 225 255" > color If you want to have fancier behavior for brightness in your system then we probably need to have configurable brightness model. - hardware, like in lp5024 case would map to
[PATCH v3 0/4] usb: gadget: uvc: fix racing between uvc_function_set_alt and streamon/off
Down the call stack from the ioctl handler for VIDIOC_STREAMON, uvc_video_alloc_requests contains a BUG_ON, which in the high level, triggers when VIDIOC_STREAMON ioctl is issued without VIDIOC_STREAMOFF being issued previously. This can happen in a few ways, such as if the userspace uvc gadget application simply doesn't issue VIDIOC_STREAMOFF. Another way is if uvc_function_set_alt with alt 0 is called after it is called with 1 but before VIDIOC_STREAMON is called; in this case, UVC_EVENT_STREAMOFF will not be queued to userspace, and therefore userspace will never call VIDIOC_STREAMOFF. To fix this, add two more uvc states: starting and stopping. The starting state is entered when uvc_function_set_alt 1 is called, and is exited in uvc_v4l2_streamon, when the state is changed to streaming. The stopping state is entered when uvc_function_set_alt 0 is called, and is exited in uvc_v4l2_streamoff, when the state is changed to connected. The status phase of the SET_INTERFACE request doesn't need to be delayed by the uvc gadget driver, so that is removed. Finally, there is another way to trigger the aforementioned BUG: start streaming and (physically) disconnect usb. To fix this, call uvcg_video_enable 0 in uvc_function_disable. Changes in v3: - add state guard to uvc_function_set_alt 1 - add documentation for newly added uvc states - reorder uvc states to more or less follow the flow diagram - add more state guards to ioctl handlers for streamon and streamoff - added interrupt-safe uvcg_video_cancel and used instead of the non-interrupt-save uvcg_video_enable 0 in uvc_function_disable Changes in v2: 1. Remove delay usb status phase Paul Elder (4): usb: gadget: uvc: synchronize streamon/off with uvc_function_set_alt usb: gadget: uvc: don't delay the status phase of non-zero SET_INTERFACE requests usb: gadget: uvc: disable stream when disconnected usb: gadget: uvc: remove unused/duplicate function prototypes from uvc.h drivers/usb/gadget/function/f_uvc.c | 23 drivers/usb/gadget/function/uvc.h | 47 +++-- drivers/usb/gadget/function/uvc_v4l2.c | 28 +++ drivers/usb/gadget/function/uvc_video.c | 13 +++ drivers/usb/gadget/function/uvc_video.h | 2 ++ 5 files changed, 91 insertions(+), 22 deletions(-) -- 2.20.1
[PATCH v3 4/4] usb: gadget: uvc: remove unused/duplicate function prototypes from uvc.h
Defined in uvc.h, uvc_endpoint_stream doesn't exist, and uvc_function_connect, uvc_function_disconnect, and uvc_function_setup_continue have duplicates in f_uvc.h. Remove these four function prototypes from uvc.h Signed-off-by: Paul Elder --- drivers/usb/gadget/function/uvc.h | 10 -- 1 file changed, 10 deletions(-) diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h index f183e349499c..671020c8a836 100644 --- a/drivers/usb/gadget/function/uvc.h +++ b/drivers/usb/gadget/function/uvc.h @@ -185,14 +185,4 @@ struct uvc_file_handle { #define to_uvc_file_handle(handle) \ container_of(handle, struct uvc_file_handle, vfh) -/* - * Functions - */ - -extern void uvc_function_setup_continue(struct uvc_device *uvc); -extern void uvc_endpoint_stream(struct uvc_device *dev); - -extern void uvc_function_connect(struct uvc_device *uvc); -extern void uvc_function_disconnect(struct uvc_device *uvc); - #endif /* _UVC_GADGET_H_ */ -- 2.20.1
[PATCH v2] usb: gadget: musb: fix short isoc packets with inventra dma
Handling short packets (length < max packet size) in the Inventra DMA engine in the MUSB driver causes the MUSB DMA controller to hang. An example of a problem that is caused by this problem is when streaming video out of a UVC gadget, only the first video frame is transferred. For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be set manually by the driver. This was previously done in musb_g_tx (musb_gadget.c), but incorrectly (all csr flags were cleared, and only MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem allows some requests to be transferred correctly, but multiple requests were often put together in one USB packet, and caused problems if the packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY in dma_controller_irq (musbhsdma.c), just like host mode transfers. This topic was originally tackled by Nicolas Boichat [0] [1] and is discussed further at [2] as part of his GSoC project [3]. [0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU [1] https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9 [2] http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html [3] http://elinux.org/BeagleBoard/GSoC/USBSniffer Signed-off-by: Paul Elder --- Changes in v2: - no more flushing FIFO - greatly simplified short packet if guard in musb_g_tx, and removed unnecessary variables - minor indentation and wording changes drivers/usb/musb/musb_gadget.c | 19 +-- drivers/usb/musb/musbhsdma.c | 21 +++-- 2 files changed, 16 insertions(+), 24 deletions(-) diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c index eae8b1b1b45b..496643f54faa 100644 --- a/drivers/usb/musb/musb_gadget.c +++ b/drivers/usb/musb/musb_gadget.c @@ -452,13 +452,10 @@ void musb_g_tx(struct musb *musb, u8 epnum) } if (request) { - u8 is_dma = 0; - boolshort_packet = false; trace_musb_req_tx(req); if (dma && (csr & MUSB_TXCSR_DMAENAB)) { - is_dma = 1; csr |= MUSB_TXCSR_P_WZC_BITS; csr &= ~(MUSB_TXCSR_DMAENAB | MUSB_TXCSR_P_UNDERRUN | MUSB_TXCSR_TXPKTRDY | MUSB_TXCSR_AUTOSET); @@ -476,16 +473,8 @@ void musb_g_tx(struct musb *musb, u8 epnum) */ if ((request->zero && request->length) && (request->length % musb_ep->packet_sz == 0) - && (request->actual == request->length)) - short_packet = true; + && (request->actual == request->length)) { - if ((musb_dma_inventra(musb) || musb_dma_ux500(musb)) && - (is_dma && (!dma->desired_mode || - (request->actual & - (musb_ep->packet_sz - 1) - short_packet = true; - - if (short_packet) { /* * On DMA completion, FIFO may not be * available yet... @@ -493,8 +482,10 @@ void musb_g_tx(struct musb *musb, u8 epnum) if (csr & MUSB_TXCSR_TXPKTRDY) return; - musb_writew(epio, MUSB_TXCSR, MUSB_TXCSR_MODE - | MUSB_TXCSR_TXPKTRDY); + musb_dbg(musb, "sending short pkt (zero=%d, length=%d, actual=%d, dma->desired_mode=%d)\n", +request->zero, request->length, request->actual, +dma->desired_mode); + musb_writew(epio, MUSB_TXCSR, csr | MUSB_TXCSR_TXPKTRDY); request->zero = 0; } diff --git a/drivers/usb/musb/musbhsdma.c b/drivers/usb/musb/musbhsdma.c index a688f7f87829..5fc6825745f2 100644 --- a/drivers/usb/musb/musbhsdma.c +++ b/drivers/usb/musb/musbhsdma.c @@ -346,12 +346,10 @@ static irqreturn_t dma_controller_irq(int irq, void *private_data) channel->status = MUSB_DMA_STATUS_FREE; /* completed */ - if ((devctl & MUSB_DEVCTL_HM) - && (musb_channel->transmit) - && ((channel->desired_mode == 0) - || (channel->actual_len & - (musb_channel->max_packet_sz - 1))) - ) { + if (musb_channel->transmit && +
[PATCH v5 5/6] usb: musb: gadget: implement optional explicit status stage
Implement the mechanism for optional explicit status stage for the MUSB driver. This allows a function driver to specify what to reply for the status stage. The functionality for an implicit status stage is retained. Signed-off-by: Paul Elder v1 Reviewed-by: Laurent Pinchart v1 Acked-by: Bin Liu --- Changes from v4: - call usb_gadget_control_complete before usb_gadget_giveback_request - set musb ep0 state to statusin in ep0_send_ack - make sure to not double-write musb register in ep0_rxstate, since musb_g_ep0_giveback will take care of writing them No change from v3 Changes from v2: - update call to usb_gadget_control_complete to include status - since sending STALL from the function driver is now done with usb_ep_set_halt, there is no need for the internal ep0_send_response to take a stall/ack parameter; remove the parameter and make the function only send ack, and remove checking for the status reply in the usb_request for the status stage Changes from v1: - obvious change to implement v2 mechanism laid out by 4/6 of this series (send_response, and musb_g_ep0_send_response function has been removed, call to usb_gadget_control_complete has been added) - ep0_send_response's ack argument has been changed from stall - last_packet flag in ep0_rxstate has been removed, since it is equal to req != NULL drivers/usb/musb/musb_gadget.c | 2 ++ drivers/usb/musb/musb_gadget_ep0.c | 29 +++-- 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c index 496643f54faa..ff83578c181f 100644 --- a/drivers/usb/musb/musb_gadget.c +++ b/drivers/usb/musb/musb_gadget.c @@ -144,6 +144,8 @@ __acquires(ep->musb->lock) unmap_dma_buffer(req, musb); trace_musb_req_gb(req); + if (req->ep->end_point.address == 0) + usb_gadget_control_complete(>g, >request); usb_gadget_giveback_request(>ep->end_point, >request); spin_lock(>lock); ep->busy = busy; diff --git a/drivers/usb/musb/musb_gadget_ep0.c b/drivers/usb/musb/musb_gadget_ep0.c index 91a5027b5c1f..dc5abdf9d49e 100644 --- a/drivers/usb/musb/musb_gadget_ep0.c +++ b/drivers/usb/musb/musb_gadget_ep0.c @@ -458,6 +458,25 @@ __acquires(musb->lock) return handled; } +static int ep0_send_ack(struct musb *musb) +{ + void __iomem *regs = musb->control_ep->regs; + u16 csr; + + if (musb->ep0_state != MUSB_EP0_STAGE_RX && + musb->ep0_state != MUSB_EP0_STAGE_STATUSIN) + return -EINVAL; + + csr = MUSB_CSR0_P_DATAEND | MUSB_CSR0_P_SVDRXPKTRDY; + + musb_ep_select(musb->mregs, 0); + musb_writew(regs, MUSB_CSR0, csr); + + musb->ep0_state = MUSB_EP0_STAGE_STATUSIN; + + return 0; +} + /* we have an ep0out data packet * Context: caller holds controller lock */ @@ -504,12 +523,15 @@ static void ep0_rxstate(struct musb *musb) if (req) { musb->ackpend = csr; musb_g_ep0_giveback(musb, req); + if (req->explicit_status) + return; if (!musb->ackpend) return; musb->ackpend = 0; + } else { + musb_ep_select(musb->mregs, 0); + musb_writew(regs, MUSB_CSR0, csr); } - musb_ep_select(musb->mregs, 0); - musb_writew(regs, MUSB_CSR0, csr); } /* @@ -939,6 +961,9 @@ musb_g_ep0_queue(struct usb_ep *e, struct usb_request *r, gfp_t gfp_flags) case MUSB_EP0_STAGE_ACKWAIT:/* zero-length data */ status = 0; break; + case MUSB_EP0_STAGE_STATUSIN: + status = ep0_send_ack(musb); + goto cleanup; default: musb_dbg(musb, "ep0 request queued in state %d", musb->ep0_state); -- 2.20.1
[PATCH v5 4/6] usb: gadget: add mechanism to specify an explicit status stage
A usb gadget function driver may or may not want to delay the status stage of a control OUT request. An instance where it might want to is to asynchronously validate the data of a class-specific request. A function driver that wants an explicit status stage should set the newly added explicit_status flag of the usb_request corresponding to the data stage. Later on, the function driver can explicitly complete the status stage by enqueueing a usb_request for ACK, or calling usb_ep_set_halt() for STALL. To support both explicit and implicit status stages, a UDC driver must call the newly added usb_gadget_control_complete function right before calling usb_gadget_giveback_request. To support the explicit status stage, it might then check what stage the usb_request was queued in, and for control IN ACK the host's zero-length data packet, or for control OUT send a zero-length DATA1 ACK packet. Signed-off-by: Paul Elder v4 Acked-by: Alan Stern v1 Reviewed-by: Laurent Pinchart --- Changes from v4: - Change parameter of usb_gadget_control_complete to simply take a usb_request - Make usb_gadget_control_complete do nothing if the request has no length (ie. no data stage) Changes from v3: - More specific in commit message about what to do for udc driver acking - Set explicit_status in usb_gadget_control_complete - Make explicit_status type bool Changes from v2: Add status parameter to usb_gadget_control_complete, so that a usb_request is not queued if the status of the just given back request is nonzero. Changes from v1: Complete change of API. Now we use a flag that should be set in the usb_request that is queued for the data stage to signal to the UDC that we want to delay the status stage (as opposed to setting a flag in the UDC itself, that persists across all requests). We now also provide a function for UDC drivers to very easily allow implicit status stages, to mitigate the need to convert all function drivers to this new API at once, and to make it easier for UDC drivers to convert. drivers/usb/gadget/udc/core.c | 34 ++ include/linux/usb/gadget.h| 10 ++ 2 files changed, 44 insertions(+) diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c index af88b48c1cea..57b2c2550537 100644 --- a/drivers/usb/gadget/udc/core.c +++ b/drivers/usb/gadget/udc/core.c @@ -894,6 +894,40 @@ EXPORT_SYMBOL_GPL(usb_gadget_giveback_request); /* - */ +/** + * usb_gadget_control_complete - complete the status stage of a control + * request, or delay it + * Context: in_interrupt() + * + * @gadget: gadget whose control request's status stage should be completed + * @request: usb request whose status stage should be completed + * + * This is called by device controller drivers before returning the completed + * request back to the gadget layer, to either complete or delay the status + * stage. It exits without doing anything if the request has a non-zero status, + * if it has zero length, or if its explicit_status flag is set. + */ +void usb_gadget_control_complete(struct usb_gadget *gadget, + struct usb_request *request) +{ + struct usb_request *req; + + if (request->explicit_status || request->status || !request->length) + return; + + /* Send an implicit status-stage request for ep0 */ + req = usb_ep_alloc_request(gadget->ep0, GFP_ATOMIC); + if (req) { + req->length = 0; + req->explicit_status = 1; + req->complete = usb_ep_free_request; + usb_ep_queue(gadget->ep0, req, GFP_ATOMIC); + } +} +EXPORT_SYMBOL_GPL(usb_gadget_control_complete); + +/* - */ + /** * gadget_find_ep_by_name - returns ep whose name is the same as sting passed * in second parameter or NULL if searched endpoint not found diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h index e5cd84a0f84a..bf4f021ce139 100644 --- a/include/linux/usb/gadget.h +++ b/include/linux/usb/gadget.h @@ -73,6 +73,7 @@ struct usb_ep; * Note that for writes (IN transfers) some data bytes may still * reside in a device-side FIFO when the request is reported as * complete. + * @explicit_status: If true, delays the status stage * * These are allocated/freed through the endpoint they're used with. The * hardware's driver can add extra per-request data to the memory it returns, @@ -114,6 +115,8 @@ struct usb_request { int status; unsignedactual; + + boolexplicit_status; }; /*-*/ @@ -850,6 +853,13 @@ extern void usb_gadget_giveback_request(struct usb_ep *ep, /*-*/ +/*
[PATCH v5 2/6] usb: gadget: uvc: enqueue usb request in setup handler for control OUT
Currently, for uvc class-specific control IN and OUT requests, in the setup handler a UVC_EVENT_SETUP with the setup control is enqueued to userspace. In response to this, the uvc function driver expects userspace to call ioctl UVCIOC_SEND_RESPONSE containing uvc request data. In the case of control IN this is fine, but for control OUT it causes a problem. Since the host sends data immediately after the setup stage completes, it is possible that the empty uvc request data is not enqueued in time for the UDC driver to put the data stage data into (this causes some UDC drivers, such as MUSB, to reply with a STALL). This problem is remedied by having the setup handler enqueue the empty uvc request data, instead of waiting for userspace to do it. Signed-off-by: Paul Elder Reviewed-by: Laurent Pinchart --- No change from v4 No change from v3 No change from v2 No change from v1 drivers/usb/gadget/function/f_uvc.c| 25 +++-- drivers/usb/gadget/function/uvc_v4l2.c | 7 +++ 2 files changed, 26 insertions(+), 6 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 723ca91e4f2a..36de51ac30ed 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -223,8 +223,6 @@ static int uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl) { struct uvc_device *uvc = to_uvc(f); - struct v4l2_event v4l2_event; - struct uvc_event *uvc_event = (void *)_event.u.data; if ((ctrl->bRequestType & USB_TYPE_MASK) != USB_TYPE_CLASS) { uvcg_info(f, "invalid request type\n"); @@ -241,10 +239,25 @@ uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl) uvc->event_setup_out = !(ctrl->bRequestType & USB_DIR_IN); uvc->event_length = le16_to_cpu(ctrl->wLength); - memset(_event, 0, sizeof(v4l2_event)); - v4l2_event.type = UVC_EVENT_SETUP; - memcpy(_event->req, ctrl, sizeof(uvc_event->req)); - v4l2_event_queue(>vdev, _event); + if (uvc->event_setup_out) { + struct usb_request *req = uvc->control_req; + + /* +* Enqueue the request immediately for control OUT as the +* host will start the data stage straight away. +*/ + req->length = uvc->event_length; + req->zero = 0; + usb_ep_queue(f->config->cdev->gadget->ep0, req, GFP_KERNEL); + } else { + struct v4l2_event v4l2_event; + struct uvc_event *uvc_event = (void *)_event.u.data; + + memset(_event, 0, sizeof(v4l2_event)); + v4l2_event.type = UVC_EVENT_SETUP; + memcpy(_event->req, ctrl, sizeof(uvc_event->req)); + v4l2_event_queue(>vdev, _event); + } return 0; } diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c index 7816ea9886e1..ac48f49d9f10 100644 --- a/drivers/usb/gadget/function/uvc_v4l2.c +++ b/drivers/usb/gadget/function/uvc_v4l2.c @@ -35,6 +35,13 @@ uvc_send_response(struct uvc_device *uvc, struct uvc_request_data *data) struct usb_composite_dev *cdev = uvc->func.config->cdev; struct usb_request *req = uvc->control_req; + /* +* For control OUT transfers the request has been enqueued synchronously +* by the setup handler, there's nothing to be done here. +*/ + if (uvc->event_setup_out) + return 0; + if (data->length < 0) return usb_ep_set_halt(cdev->gadget->ep0); -- 2.20.1
[PATCH v5 1/6] usb: uvc: include videodev2.h in g_uvc.h
V4L2_EVENT_PRIVATE_START is used in g_uvc.h but is defined in videodev2.h, which is not included and causes a compiler warning: linux/usb/g_uvc.h:15:28: error: ‘V4L2_EVENT_PRIVATE_START’ undeclared here (not in a function) #define UVC_EVENT_FIRST (V4L2_EVENT_PRIVATE_START + 0) Include videodev2.h in g_uvc.h. Signed-off-by: Paul Elder Reviewed-by: Laurent Pinchart --- No change from v4 No change from v3 No change from v2 No change from v1 include/uapi/linux/usb/g_uvc.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/usb/g_uvc.h b/include/uapi/linux/usb/g_uvc.h index 3c9ee3020cbb..6698c3263ae8 100644 --- a/include/uapi/linux/usb/g_uvc.h +++ b/include/uapi/linux/usb/g_uvc.h @@ -11,6 +11,7 @@ #include #include #include +#include #define UVC_EVENT_FIRST(V4L2_EVENT_PRIVATE_START + 0) #define UVC_EVENT_CONNECT (V4L2_EVENT_PRIVATE_START + 0) -- 2.20.1
[PATCH v5 6/6] usb: gadget: uvc: allow ioctl to send response in status stage
We now have a mechanism to signal the UDC driver to reply to a control OUT request with STALL or ACK, and we have packaged the setup stage data and the data stage data of a control OUT request into a single UVC_EVENT_DATA for userspace to consume. The ioctl UVCIOC_SEND_RESPONSE in the case of a control OUT request sends a response to the data stage, and so the ioctl now notifies the UDC driver to reply with STALL or ACK. In the case of a control IN request, the ioctl sends the UVC data as before. Also tell the UDC to delay the status stage for this to work. Signed-off-by: Paul Elder --- No change from v4 No change from v3 Changes from v2: - calling usb_ep_set_halt in uvc_send_response if data->length < 0 is now common for both IN and OUT transfers so make that check common - remove now unnecessary field setting for the usb_request to be queued for the status stage Changes from v1: - remove usb_ep_delay_status call from the old proposed API - changed portions of uvc_send_response to match v2 API - remove UDC warning that send_response is not implemented drivers/usb/gadget/function/f_uvc.c| 4 ++-- drivers/usb/gadget/function/uvc_v4l2.c | 23 +-- 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 58ed191f636e..053808b01d48 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -209,14 +209,13 @@ uvc_function_ep0_complete(struct usb_ep *ep, struct usb_request *req) struct uvc_event *uvc_event = (void *)_event.u.data; if (uvc->event_setup_out) { - uvc->event_setup_out = 0; - memset(_event, 0, sizeof(v4l2_event)); v4l2_event.type = UVC_EVENT_DATA; uvc_event->data.length = req->actual; memcpy(_event->data.data, req->buf, req->actual); memcpy(_event->data.setup, >control_setup, sizeof(uvc_event->data.setup)); + v4l2_event_queue(>vdev, _event); } } @@ -251,6 +250,7 @@ uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl) */ req->length = uvc->event_length; req->zero = 0; + req->explicit_status = 1; usb_ep_queue(f->config->cdev->gadget->ep0, req, GFP_KERNEL); } else { struct v4l2_event v4l2_event; diff --git a/drivers/usb/gadget/function/uvc_v4l2.c b/drivers/usb/gadget/function/uvc_v4l2.c index ac48f49d9f10..353071869e15 100644 --- a/drivers/usb/gadget/function/uvc_v4l2.c +++ b/drivers/usb/gadget/function/uvc_v4l2.c @@ -35,15 +35,26 @@ uvc_send_response(struct uvc_device *uvc, struct uvc_request_data *data) struct usb_composite_dev *cdev = uvc->func.config->cdev; struct usb_request *req = uvc->control_req; + if (data->length < 0) + return usb_ep_set_halt(cdev->gadget->ep0); + /* * For control OUT transfers the request has been enqueued synchronously -* by the setup handler, there's nothing to be done here. +* by the setup handler, we just need to tell the UDC whether to ACK or +* STALL the control transfer. */ - if (uvc->event_setup_out) - return 0; - - if (data->length < 0) - return usb_ep_set_halt(cdev->gadget->ep0); + if (uvc->event_setup_out) { + /* +* The length field carries the control request status. +* Negative values signal a STALL and zero values an ACK. +* Positive values are not valid as there is no data to send +* back in the status stage. +*/ + if (data->length > 0) + return -EINVAL; + + return usb_ep_queue(cdev->gadget->ep0, req, GFP_KERNEL); + } req->length = min_t(unsigned int, uvc->event_length, data->length); req->zero = data->length < uvc->event_length; -- 2.20.1
[PATCH v5 0/6] usb: gadget: add mechanism to asynchronously validate data stage of ctrl out request
This patch series adds a mechanism to allow asynchronously validating the data stage of a control OUT request, and for stalling or suceeding the request accordingly. This mechanism is implemented for MUSB, and is used by UVC. At the same time, UVC packages the setup stage and data stage data together to send to userspace to save on a pair of context switches per control out request. This patch series does change the userspace API. We however believe that it is justified because the current API is broken, and because it isn't being used (because it's broken). The current API is broken such that it is subject to race conditions that cause fatal errors with a high frequency. This is actually what motivated this patch series in the first place. In the current API, not only is there no way to asynchronously validate the data stage of a control OUT request, but an empty buffer is expected to be provided to hold the data stage data -- which is more likely than not to be late. There is even a warning in musb_g_ep0_queue: /* else for sequence #2 (OUT), caller provides a buffer * before the next packet arrives. deferred responses * (after SETUP is acked) are racey. */ This problem has never been reported in years, which is a sign that the API isn't used. Furthermore, the vendor kernels that we have seen using the UVC gadget driver (such as QC and Huawei) are heavily patched with local changes to the API. This corroborates the suspicion that the current mainline API is not being used. Additionally, this API isn't meant to be used by generic applications, but by a dedicated userspace helper. uvc-gadget is one such example, but it has bitrotten and isn't compatible with the current kernel API. The fact that nobody has submitted patches nor complained for a long time again shows that it isn't being used. The conclusion is that since the API hasn't been used for a long time, it is safe to fix it. Changes in v5: - Change parameter of usb_gadget_control_complete to simply take a usb_request - Make usb_gadget_control_complete do nothing if the request has no length (ie. no data stage) - musb: call usb_gadget_control_complete before usb_gadget_giveback_request - set musb ep0 state to statusin in ep0_send_ack - make sure to not double-write musb register in ep0_rxstate, since musb_g_ep0_giveback will take care of writing them Changes in v4: - Change wording and fix typo in patch 4/6 "usb: gadget: add mechanism to specify an explicit status stage" - Set explicit_status in usb_gadget_control_complete - Change explicit_status from unsigned int to bool Changes in v3: - Function driver send STALL status stage by simply calling usb_ep_set_halt, and ACK by enqueueing request - Fix signature of usb_gadget_control_complete to check the status of the request that was just given back. - Corresponding changes to musb and uvc gadget Changes in v2: Overhaul of status stage delay mechanism/API. Now if a function driver desires an explicit/delayed status stage, it specifies so in a flag in the usb_request that is queued for the data stage. The function driver later enqueues another usb_request for the status stage, also with the explicit_status flag set, and with the zero flag acting as the status. If a function driver does not desire an explicit status stage, then it can set (or ignore) the explicit_status flag in the usb_request that is queued for the data stage. To allow the optional explicit status stage, a UDC driver should call the newly added usb_gadget_control_complete right after usb_gadget_giveback_request, and in its queue function should check if the usb_request is for the status stage and if it has been requested to be explicit, and if so check the status that should be sent. (See 5/6 "usb: musb: gadget: implement optional explicit status stage" for an implementation for MUSB) Paul Elder (6): usb: uvc: include videodev2.h in g_uvc.h usb: gadget: uvc: enqueue usb request in setup handler for control OUT usb: gadget: uvc: package setup and data for control OUT requests usb: gadget: add mechanism to specify an explicit status stage usb: musb: gadget: implement optional explicit status stage usb: gadget: uvc: allow ioctl to send response in status stage drivers/usb/gadget/function/f_uvc.c| 32 ++-- drivers/usb/gadget/function/uvc.h | 1 + drivers/usb/gadget/function/uvc_v4l2.c | 18 ++ drivers/usb/gadget/udc/core.c | 34 ++ drivers/usb/musb/musb_gadget.c | 2 ++ drivers/usb/musb/musb_gadget_ep0.c | 29 -- include/linux/usb/gadget.h | 10 include/uapi/linux/usb/g_uvc.h | 4 ++- 8 files changed, 119 insertions(+), 11 deletions(-) -- 2.20.1
[tip:perf/urgent] perf trace: Fix ')' placement in "interrupted" syscall lines
Commit-ID: 172bf02d564bdb6df8410f64720fa2c68e755d1a Gitweb: https://git.kernel.org/tip/172bf02d564bdb6df8410f64720fa2c68e755d1a Author: Arnaldo Carvalho de Melo AuthorDate: Mon, 7 Jan 2019 16:24:27 -0300 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 8 Jan 2019 13:28:12 -0300 perf trace: Fix ')' placement in "interrupted" syscall lines When we get the sys_enter for a syscall we check if the last one is still waiting for its matching sys_exit, if so we print this: 468.753 ( ): firefox/32382 poll(ufds: 0x7f3988d3dd00, nfds: 7, timeout_msecs: 4294967295) ... 449.575 ( 0.004 ms): Softwar~cThrea/32434 futex(uaddr: 0x7f39a18a9b70, op: WAKE|PRIVATE_FLAG, val: 1) = 0 At some point we'll get that poll sys_exit event and will print a "[continued]" line. While making the sizing of the alignment after the syscall arg list and its result configurable, so that we can mimic strace, which uses a smaller alingment by default, a bug was introduced where the closing parens appeared before the syscall name and its arg list, fix it. Fixes: 4b8a240ed5e0 ("perf trace: Add alignment spaces after the closing parens") Cc: Adrian Hunter Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-oi45i54s59h1w1kmgpzrf...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index adbf28183560..b8bf5d025563 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1758,6 +1758,7 @@ static int trace__printf_interrupted_entry(struct trace *trace) { struct thread_trace *ttrace; size_t printed; + int len; if (trace->failure_only || trace->current == NULL) return 0; @@ -1768,9 +1769,14 @@ static int trace__printf_interrupted_entry(struct trace *trace) return 0; printed = trace__fprintf_entry_head(trace, trace->current, 0, false, ttrace->entry_time, trace->output); - printed += fprintf(trace->output, ")%-*s ...\n", trace->args_alignment, ttrace->entry_str); - ttrace->entry_pending = false; + printed += len = fprintf(trace->output, "%s)", ttrace->entry_str); + + if (len < trace->args_alignment - 4) + printed += fprintf(trace->output, "%-*s", trace->args_alignment - 4 - len, " "); + printed += fprintf(trace->output, " ...\n"); + + ttrace->entry_pending = false; ++trace->nr_events_printed; return printed;
[PATCH v5 3/6] usb: gadget: uvc: package setup and data for control OUT requests
Since "usb: gadget: uvc: enqueue uvc_request_data in setup handler for control OUT requests" it is no longer necessary for userspace to call ioctl UVCIOC_SEND_RESPONSE in response to receiving a UVC_EVENT_SETUP from the uvc function driver for a control OUT request. This change means that for control OUT userspace will receive a UVC_EVENT_SETUP and not do anything with it. This is a waste of a pair of context switches, so we put the setup and data stage data into a single UVC_EVENT_DATA to give to userspace. Previously struct uvc_request_data had 60 bytes allocated for data, and since uvc data at most is 34 bytes in UVC 1.1 and 48 bytes in UVC 1.5, we can afford to cut out 8 bytes to store the setup control. Since the setup control is discarded after the handling of the setup stage, it must be saved in struct uvc_device during the setup handler in order for the data stage handler to be able to read it and send it to userspace. Signed-off-by: Paul Elder Reviewed-by: Laurent Pinchart --- No change from v4 No change from v3 No change from v2 No change from v1 drivers/usb/gadget/function/f_uvc.c | 3 +++ drivers/usb/gadget/function/uvc.h | 1 + include/uapi/linux/usb/g_uvc.h | 3 ++- 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c index 36de51ac30ed..58ed191f636e 100644 --- a/drivers/usb/gadget/function/f_uvc.c +++ b/drivers/usb/gadget/function/f_uvc.c @@ -215,6 +215,8 @@ uvc_function_ep0_complete(struct usb_ep *ep, struct usb_request *req) v4l2_event.type = UVC_EVENT_DATA; uvc_event->data.length = req->actual; memcpy(_event->data.data, req->buf, req->actual); + memcpy(_event->data.setup, >control_setup, + sizeof(uvc_event->data.setup)); v4l2_event_queue(>vdev, _event); } } @@ -238,6 +240,7 @@ uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl) */ uvc->event_setup_out = !(ctrl->bRequestType & USB_DIR_IN); uvc->event_length = le16_to_cpu(ctrl->wLength); + memcpy(>control_setup, ctrl, sizeof(uvc->control_setup)); if (uvc->event_setup_out) { struct usb_request *req = uvc->control_req; diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h index 671020c8a836..1d89b1df4ba0 100644 --- a/drivers/usb/gadget/function/uvc.h +++ b/drivers/usb/gadget/function/uvc.h @@ -163,6 +163,7 @@ struct uvc_device { unsigned int control_intf; struct usb_ep *control_ep; struct usb_request *control_req; + struct usb_ctrlrequest control_setup; void *control_buf; unsigned int streaming_intf; diff --git a/include/uapi/linux/usb/g_uvc.h b/include/uapi/linux/usb/g_uvc.h index 6698c3263ae8..10fbb4382925 100644 --- a/include/uapi/linux/usb/g_uvc.h +++ b/include/uapi/linux/usb/g_uvc.h @@ -24,7 +24,8 @@ struct uvc_request_data { __s32 length; - __u8 data[60]; + struct usb_ctrlrequest setup; + __u8 data[52]; }; struct uvc_event { -- 2.20.1
Re: [PATCH v2 3/3] drivers: platform: goldfish: goldfish_sync: add a driver
On Tue, Jan 08, 2019 at 04:48:02PM -0800, Roman Kiryanov wrote: > > Also, why is this driver needed at all? Why can't you use the "normal" > > drm sync api interface? Why write a custom ioctl driver just for this > > one kernel interface? > > This driver allows us to talk to host's sync api. That describes what your code does, yes, but you did not answer my question at all. Why do you need this? What are you trying to do? And why is this so different from any other drm driver? thanks, greg k-h
Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler
Am Mittwoch, 9. Januar 2019, 07:58:28 CET schrieb James Bottomley: Hi James, > On Wed, 2019-01-09 at 07:45 +0100, Stephan Mueller wrote: > > Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley: > > > > Hi James, > > > > > Actually, it would be enormously helpful if we could reuse these > > > pieces for the TPM as well. > > > > Could you please help me understand whether the KDFs in TPM are > > directly usable as a standalone cipher primitive or does it go > > together with additional key generation operations? > > They're used as generators ... which means they deterministically > produce keys from what the TPM calls seeds so we can get crypto agility > of TPM 2.0 ... well KDFa does. KDFe is simply what NIST recommends you > do when using EC for a shared key agreement ... and really we shouldn't > be using ECDH in the kernel without it. > Thanks for clarifying. That would mean that indeed we would have hardware- provided KDF implementations that may be usable with the kernel crypto API. [...] > > > Would it be appropriate, to implement a type cast to a structure from > > the u8 pointer? > > > > E.g. for the aforementioned label/context data, we could define > > something like > > > > struct crypto_kdf_ctr { > > > > char *label; > > size_t label_len; > > u8 *contextU; > > size_t contextU_len; > > u8 *contextV; > > size_t contextV_len; > > > > }; > > > > And the implementation of the generate function for CTR KDF would > > > > then have a type cast along the following lines: > > if (slen != sizeof(struct crypto_kdf_ctr)) > > > > return -EINVAL; > > > > const struct crypto_kdf_ctr *kdf_ctr_input = (struct > > > > crypto_kdf_ctr *)src; > > > > > > For different KDFs, different structs would be needed. > > Actually, we probably just need the input key (or secret material), the > concatenation and the number of output bits. Thanks for confirming. Though, when it comes to HKDF (not that I see it being needed or required in the kernel), there is a need to split up the src pointer since the mentioned input is used in different ways. In order to try to get the implementation and thus the interface right, I would suggest to at least have a consensus on how to handle such situations. Thus, would the proposal be acceptable for such KDFs that may need to have different components communicated as input to the KDF? Ciao Stephan
Re: [GIT PULL 00/16] perf/core fixes and improvements
* Arnaldo Carvalho de Melo wrote: > Hi Ingo, > > Please consider pulling, > > - Arnaldo > > Test results at the end of this message, as usual. > > The following changes since commit 64598e8b6fdaf28e37c3530f8b95a9f8ef6af131: > > Merge tag 'perf-core-for-mingo-4.21-20190104' of > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent > (2019-01-08 16:31:19 +0100) > > are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git > tags/perf-core-for-mingo-5.0-20190108 > > for you to fetch changes up to ee412f14693a3fe2645b3528603dfd37dd05118a: > > tools include uapi: Sync linux/vhost.h with the kernel sources (2019-01-08 > 14:09:33 -0300) > > > perf/core fixes and improvements: > > perf top: > > Arnaldo Carvalho de Melo: > > . Lift restriction on using callchains without "sym" in --sort, allowing > > perf trace: > > Arnaldo Carvalho de Melo: > > . Fix ')' placement in "interrupted" syscall lines. > > . Fix alignment for [continued] lines. > > perf tests: > > Florian Fainelli: > > . Add a test for the ARM 32-bit [vectors] page. > > tools lib traceevent: > > Tzvetomir Stoyanov: > > . Introduce new libtracevent API: tep_override_comm(). > > . Initialize host_bigendian at tep_handle allocation. > > . More namespacing changes. > > . Remove superfluous APIs. > > tools headers uapi: > > Arnaldo Carvalho de Melo: > > . Update linux/{fs,vhost}.h, grab a copy o linux/mount.h, where the > MS_ mount flags were moved. > > Signed-off-by: Arnaldo Carvalho de Melo > > > Arnaldo Carvalho de Melo (7): > perf trace: Fix ')' placement in "interrupted" syscall lines > perf trace: Fix alignment for [continued] lines > perf top: Lift restriction on using callchains without "sym" in --sort > tools include uapi: Grab a copy of linux/mount.h > perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h > tools include uapi: Sync linux/fs.h copy with the kernel sources > tools include uapi: Sync linux/vhost.h with the kernel sources > > Florian Fainelli (2): > perf tools: Make find_vdso_map() more modular > perf tests: Add a test for the ARM 32-bit [vectors] page > > Tzvetomir Stoyanov (7): > tools lib traceevent: Introduce new libtracevent API: > tep_override_comm() > tools lib traceevent: Initialize host_bigendian at tep_handle allocation > tools lib traceevent: Rename struct cmdline to struct tep_cmdline > tools lib traceevent: Changed return logic of trace_seq_printf() and > trace_seq_vprintf() APIs > tools lib traceevent: Changed return logic of > tep_register_event_handler() API > tools lib traceevent: Rename tep_is_file_bigendian() to > tep_file_bigendian() > tools lib traceevent: Remove tep_data_event_from_type() API > > tools/include/uapi/linux/fs.h | 60 ++- > tools/include/uapi/linux/mount.h| 58 +++ > tools/include/uapi/linux/vhost.h| 113 + > tools/lib/traceevent/event-parse-api.c | 4 +- > tools/lib/traceevent/event-parse-local.h| 4 +- > tools/lib/traceevent/event-parse.c | 129 > +++- > tools/lib/traceevent/event-parse.h | 17 ++-- > tools/lib/traceevent/plugin_kvm.c | 2 +- > tools/lib/traceevent/trace-seq.c| 17 +++- > tools/perf/Makefile.perf| 4 +- > tools/perf/arch/arm/tests/Build | 1 + > tools/perf/arch/arm/tests/arch-tests.c | 4 + > tools/perf/arch/arm/tests/vectors-page.c| 24 + > tools/perf/builtin-top.c| 7 +- > tools/perf/builtin-trace.c | 15 ++- > tools/perf/check-headers.sh | 1 + > tools/perf/perf-read-vdso.c | 6 +- > tools/perf/tests/tests.h| 5 + > tools/perf/trace/beauty/mount_flags.sh | 4 +- > tools/perf/util/{find-vdso-map.c => find-map.c} | 7 +- > tools/perf/util/vdso.c | 6 +- > 21 files changed, 238 insertions(+), 250 deletions(-) > create mode 100644 tools/include/uapi/linux/mount.h > create mode 100644 tools/perf/arch/arm/tests/vectors-page.c > rename tools/perf/util/{find-vdso-map.c => find-ma
Re: About some mail lost and random unsubscribe issues on this dmaengine list
On 09-01-19, 01:35, Yang, Shunyong wrote: > Hi, Vinod and All, > I found, somtimes, I cannot received some mails from this mailing > list. For example, I sent my v2 series on 7th, Jan, > > https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713 > > It appeared on patchwork.kernel.org but I haven't received it in this > mailing list. > > And I found randomly, a month or two months, I am unsubscribed from this > list automatically. Then I have to re-subscribe the list. I guess your corporate server is to blame. vger mail list is "know" to unsubscribe servers which behave badly, so you need to ask your mail admins why and try to use other email which is known to be friendly with vger :) > Have someone met similar problem as me? Or maybe I missed some necessary > settings? > > Shunyong. > Thanks. -- ~Vinod
[PATCH resend] drm/panel: panel-innolux: set display off in innolux_panel_unprepare
Move mipi_dsi_dcs_set_display_off() from innolux_panel_disable() to innolux_panel_unprepare(), so they are consistent with innolux_panel_enable() and innolux_panel_prepare(). This also fixes some mode check and irq timeout issue in MTK dsi code. Since some dsi code (e.g. mtk_dsi) have following call trace: 1. drm_panel_disable(), which calls innolux_panel_disable() 2. switch to cmd mode 3. drm_panel_unprepare(), which calls innolux_panel_unprepare() However, mtk_dsi needs to be in cmd mode to be able to send commands (e.g. mipi_dsi_dcs_set_display_off() and mipi_dsi_dcs_enter_sleep_mode()), so we need these functions to be called after the switch to cmd mode happens, i.e. in innolux_panel_unprepare. Signed-off-by: Hsin-Yi, Wang --- Resend for review. --- drivers/gpu/drm/panel/panel-innolux-p079zca.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/panel/panel-innolux-p079zca.c b/drivers/gpu/drm/panel/panel-innolux-p079zca.c index ca4ae45dd307..8e5724b63f1f 100644 --- a/drivers/gpu/drm/panel/panel-innolux-p079zca.c +++ b/drivers/gpu/drm/panel/panel-innolux-p079zca.c @@ -70,18 +70,12 @@ static inline struct innolux_panel *to_innolux_panel(struct drm_panel *panel) static int innolux_panel_disable(struct drm_panel *panel) { struct innolux_panel *innolux = to_innolux_panel(panel); - int err; if (!innolux->enabled) return 0; backlight_disable(innolux->backlight); - err = mipi_dsi_dcs_set_display_off(innolux->link); - if (err < 0) - DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n", - err); - innolux->enabled = false; return 0; @@ -95,6 +89,11 @@ static int innolux_panel_unprepare(struct drm_panel *panel) if (!innolux->prepared) return 0; + err = mipi_dsi_dcs_set_display_off(innolux->link); + if (err < 0) + DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n", + err); + err = mipi_dsi_dcs_enter_sleep_mode(innolux->link); if (err < 0) { DRM_DEV_ERROR(panel->dev, "failed to enter sleep mode: %d\n", -- 2.18.1
Re: [GIT PULL] seccomp: build fix for v5.0-rc2
* Kees Cook wrote: > This was already picked up by x86-urgent... > > -Kees I'm fine with both routes - if Linus pulls this I'll zap the x86/urgent one. Thanks, Ingo
Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler
On Wed, 2019-01-09 at 07:45 +0100, Stephan Mueller wrote: > Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley: > > Hi James, > > > Actually, it would be enormously helpful if we could reuse these > > pieces for the TPM as well. > > Could you please help me understand whether the KDFs in TPM are > directly usable as a standalone cipher primitive or does it go > together with additional key generation operations? They're used as generators ... which means they deterministically produce keys from what the TPM calls seeds so we can get crypto agility of TPM 2.0 ... well KDFa does. KDFe is simply what NIST recommends you do when using EC for a shared key agreement ... and really we shouldn't be using ECDH in the kernel without it. > > It has two KDFs: KDFa, which is the CTR-KDF from > > SP800-108 and KDFe which is the SP800-56A KDF for elliptic curve > > one pass Diffie Hellman, so if we're going to do the former, I'd > > really like the latter as well. > > > > The way the TPM parametrises input to both KDFs is > > > > (hashAlg, key, label, contextU, contextV, bits) > > > > Where > > > > hashAlg = the hash algorithm used as the PRF > > key = the input parameter of variable bit size or > >the x co-ordinate of the shared point > > label= An ASCII string representing the use > > contextU = public input U > > contextV = public input V > > bits = number of output bits > > When implementing KDFs as an extension of the kernel crypto API's RNG > facility we currently have to handle the limitiation of the existing > API. The label/context data (and when considering RFC 5869 HKDF > requring IKM, salt and additional information as input) currently > cannot directly be communicated through the API. > > The issue is that the RNG facility currently has the following > prototype defined: > > int (*generate)(struct crypto_rng *tfm, > const u8 *src, unsigned int slen, > u8 *dst, unsigned int dlen); > > The src pointer would need to take the label/context data. That's probably good enough. For both KDFa and KDFe the label contextU and ContextV are concatenated, so in both cases a single source is probably good enough. However, we also need to feed in the key somehow since it's usually used separately in the derivation functions. > Would it be appropriate, to implement a type cast to a structure from > the u8 pointer? > > E.g. for the aforementioned label/context data, we could define > something like > > struct crypto_kdf_ctr { > char *label; > size_t label_len; > u8 *contextU; > size_t contextU_len; > u8 *contextV; > size_t contextV_len; > }; > > And the implementation of the generate function for CTR KDF would > then have a type cast along the following lines: > > if (slen != sizeof(struct crypto_kdf_ctr)) > return -EINVAL; > const struct crypto_kdf_ctr *kdf_ctr_input = (struct > crypto_kdf_ctr *)src; > > > For different KDFs, different structs would be needed. Actually, we probably just need the input key (or secret material), the concatenation and the number of output bits. James
Re: [PATCH 1/2] x86, kexec_file_load: Don't setup EFI info if EFI runtime is not enabled
CCing more people On Wed, Jan 9, 2019 at 2:45 PM Kairui Song wrote: > > Currenly with "efi=noruntime" in kernel command line, calling > kexec_file_load will raise below problem: > > [ 97.967067] BUG: unable to handle kernel NULL pointer dereference at > > [ 97.967894] #PF error: [normal kernel read fault] > ... > [ 97.980456] Call Trace: > [ 97.980724] efi_runtime_map_copy+0x28/0x30 > [ 97.981267] bzImage64_load+0x688/0x872 > [ 97.981794] arch_kexec_kernel_image_load+0x6d/0x70 > [ 97.982441] kimage_file_alloc_init+0x13e/0x220 > [ 97.983035] __x64_sys_kexec_file_load+0x144/0x290 > [ 97.983586] do_syscall_64+0x55/0x1a0 > [ 97.983962] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > When efi runtime is not enabled, efi memmap is not mapped, so just skip > EFI info setup. > > Suggested-by: Dave Young > Signed-off-by: Kairui Song > --- > arch/x86/kernel/kexec-bzimage64.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index 0d5efa34f359..53917a3ebf94 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -167,6 +167,9 @@ setup_efi_state(struct boot_params *params, unsigned long > params_load_addr, > struct efi_info *current_ei = _params.efi_info; > struct efi_info *ei = >efi_info; > > + if (!efi_enabled(EFI_RUNTIME_SERVICES)) > + return 0; > + > if (!current_ei->efi_memmap_size) > return 0; > > -- > 2.20.1 > -- Best Regards, Kairui Song
Re: [PATCH 2/2] x86, kexec_file_load: make it work with efi=noruntime or efi=oldmap
CCing more people On Wed, Jan 9, 2019 at 2:47 PM Kairui Song wrote: > > When efi=noruntime or efi=oldmap is used, EFI services won't be available > in the second kernel, therefore the second kernel will not be able to get > the ACPI RSDP address from firmware by calling EFI services and won't > boot. Previously we are expecting the user to set the acpi_rsdp= > on kernel command line for second kernel as there was no way to pass RSDP > address to second kernel. > > After commit e6e094e053af ('x86/acpi, x86/boot: Take RSDP address from > boot params if available'), now it's possible to set a acpi_rsdp_addr > parameter in the boot_params passed to second kernel, this commit make > use of it, detect and set the RSDP address when it's required for second > kernel to boot. > > Tested with an EFI enabled KVM VM with efi=noruntime. > > Suggested-by: Dave Young > Signed-off-by: Kairui Song > --- > arch/x86/kernel/kexec-bzimage64.c | 21 + > drivers/acpi/acpica/tbxfroot.c| 3 +-- > include/acpi/acpixf.h | 2 +- > 3 files changed, 23 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index 53917a3ebf94..0a90dcbd041f 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -20,6 +20,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -255,8 +256,28 @@ setup_boot_parameters(struct kimage *image, struct > boot_params *params, > /* Setup EFI state */ > setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz, > efi_setup_data_offset); > + > +#ifdef CONFIG_ACPI > + /* Setup ACPI RSDP pointer in case EFI is not available in second > kernel */ > + if (!acpi_disabled && (!efi_enabled(EFI_RUNTIME_SERVICES) || > efi_enabled(EFI_OLD_MEMMAP))) { > + /* Copied from acpi_os_get_root_pointer accordingly */ > + params->acpi_rsdp_addr = boot_params.acpi_rsdp_addr; > + if (!params->acpi_rsdp_addr) { > + if (efi_enabled(EFI_CONFIG_TABLES)) { > + if (efi.acpi20 != EFI_INVALID_TABLE_ADDR) > + params->acpi_rsdp_addr = efi.acpi20; > + else if (efi.acpi != EFI_INVALID_TABLE_ADDR) > + params->acpi_rsdp_addr = efi.acpi; > + } else if > (IS_ENABLED(CONFIG_ACPI_LEGACY_TABLES_LOOKUP)) { > + > acpi_find_root_pointer(>acpi_rsdp_addr); > + } > + } > + if (!params->acpi_rsdp_addr) > + pr_warn("RSDP is not available for second kernel\n"); > + } > #endif > > +#endif > /* Setup EDD info */ > memcpy(params->eddbuf, boot_params.eddbuf, > EDDMAXNR * sizeof(struct edd_info)); > diff --git a/drivers/acpi/acpica/tbxfroot.c b/drivers/acpi/acpica/tbxfroot.c > index 483d0ce5180a..dac1e34a931c 100644 > --- a/drivers/acpi/acpica/tbxfroot.c > +++ b/drivers/acpi/acpica/tbxfroot.c > @@ -108,8 +108,7 @@ acpi_status acpi_tb_validate_rsdp(struct acpi_table_rsdp > *rsdp) > * > > **/ > > -acpi_status ACPI_INIT_FUNCTION > -acpi_find_root_pointer(acpi_physical_address *table_address) > +acpi_status acpi_find_root_pointer(acpi_physical_address *table_address) > { > u8 *table_ptr; > u8 *mem_rover; > diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h > index 7aa38b648564..869d75ecaf7d 100644 > --- a/include/acpi/acpixf.h > +++ b/include/acpi/acpixf.h > @@ -474,7 +474,7 @@ ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION > ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION > acpi_reallocate_root_table(void)) > > -ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION > +ACPI_EXTERNAL_RETURN_STATUS(acpi_status > acpi_find_root_pointer(acpi_physical_address >*rsdp_address)) > ACPI_EXTERNAL_RETURN_STATUS(acpi_status > -- > 2.20.1 > -- Best Regards, Kairui Song
Re: [PATCH AUTOSEL 4.20 016/117] fanotify: return only user requested event types in event mask
On Tue, Jan 8, 2019 at 10:11 PM Sasha Levin wrote: > > From: Matthew Bobrowski > > [ Upstream commit 2d10b23082a7eb8be508b3789f2e7250a88a5ddb ] > > Modify fanotify_should_send_event() so that it now returns a mask for > an event that contains ONLY flags for the event types that have been > specifically requested by the user. Flags that may have been included > within the event mask, but have not been explicitly requested by the > user will not be present in the returned value. > > As an example, given the situation where a user requests events of type > FAN_OPEN. Traditionally, the event mask returned within an event that > occurred on a filesystem object that has been marked for monitoring and is > opened, will only ever have the FAN_OPEN bit set. With the introduction of > the new flags like FAN_OPEN_EXEC, and perhaps any other future event > flags, there is a possibility of the returned event mask containing more > than a single bit set, despite having only requested the single event type. > Prior to these modifications performed to fanotify_should_send_event(), a > user would have received a bundled event mask containing flags FAN_OPEN > and FAN_OPEN_EXEC in the instance that a file was opened for execution via > execve(), for example. This means that a user would receive event types > in the returned event mask that have not been requested. This runs the > possibility of breaking existing systems and causing other unforeseen > issues. > > To mitigate this possibility, fanotify_should_send_event() has been > modified to return the event mask containing ONLY event types explicitly > requested by the user. This means that we will NOT report events that the > user did no set a mask for, and we will NOT report events that the user > has set an ignore mask for. > > The function name fanotify_should_send_event() has also been updated so > that it's more relevant to what it has been designed to do. > > Signed-off-by: Matthew Bobrowski > Reviewed-by: Amir Goldstein > Signed-off-by: Jan Kara > Signed-off-by: Sasha Levin > --- Sasha, I have no objection to applying this patch to 4.20, but FYI, it does not fix anything. Before introducing FAN_OPEN_EXEC in 5.0-rc1, this patch has no visible effect. I don't mind if you apply it. It will make stable code closer to mainline, which is always a good thing IMO. And FWIW, I think that patch is quite trivial and low risk. Thanks, Amir.
Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler
On Tue, 2019-01-08 at 17:43 -0800, Andy Lutomirski wrote: > [Adding Jarkko because this stuff relates to the TPM.] > > On Tue, Jan 8, 2019 at 4:44 PM James Bottomley > wrote: > > > > On Tue, 2019-01-08 at 15:54 -0800, Andy Lutomirski wrote: > > > > On Jan 7, 2019, at 11:09 PM, Stephan Mueller > > > de> > > > > wrote: > > > > > > > > Am Dienstag, 8. Januar 2019, 06:03:58 CET schrieb Herbert Xu: > > > > > > > > Hi Herbert, > > > > > > > > > Are we going to have multiple implementations for the same > > > > > KDF? If not then the crypto API is not a good fit. To > > > > > consolidate multiple implementations of the same KDF, simply > > > > > provide helpers for them. > > > > > > > > It is unlikely to have multiple implementations of a KDF. > > > > However, KDFs relate to hashes like block chaining modes to raw > > > > block ciphers. Thus a KDF can be applied with different hashes. > > > > > > > > My idea was to add template support to RNGs (because KDFs are > > > > effectively a type of RNG since they produce an arbitrary > > > > output from a fixed input). The KDFs would be a template > > > > wrapping hashes. For example, the CTR-KDF from SP800-108 could > > > > be instantiated like kdf-ctr(sha256). > > > > > > > > > > > > > > I think that, if the crypto API is going to grow a KDF facility, > > > it should be done right. Have a key type or flag or whatever that > > > says “this key may *only* be used to derive keys using such-and- > > > such algorithm”, and have a helper to derive a key. That helper > > > should take some useful parameters and mix them in: > > > > > > - What type of key is being derived? ECDSA signing key? HMAC > > > key? AES key? > > > > > > - Can user code access the derived key? > > > > > > - What is the key’s purpose? “Encrypt and authenticate a > > > hibernation image” would be a purpose. > > > > > > - Number of bytes. > > > > > > All of these parameters should be mixed in to the key derivation. > > > > > > Also, an AE key, even for AES+HMAC, should be just one derived > > > key. If you need 512 bits, ask for a 512-bit key, not two 256- > > > bit keys. > > > > Actually, it would be enormously helpful if we could reuse these > > pieces for the TPM as well. It has two KDFs: KDFa, which is the > > CTR-KDF from SP800-108 and KDFe which is the SP800-56A KDF for > > elliptic curve one pass Diffie Hellman, so if we're going to do the > > former, I'd really like the latter as well. > > > > The way the TPM parametrises input to both KDFs is > > > > (hashAlg, key, label, contextU, contextV, bits) > > > > Where > > > > hashAlg = the hash algorithm used as the PRF > > key = the input parameter of variable bit size or > >the x co-ordinate of the shared point > > label= An ASCII string representing the use > > contextU = public input U > > contextV = public input V > > bits = number of output bits > > > > Is that a good enough parametrisation (not the only way you > > distinguish uses is with the label, which is not > > recoverable)? ContextU and ContextV are simply concatenated to > > form the full Context of SP800-108, but we tend to need two > > separate inputs (for KDFe they're the public x co-ordinates of the > > points of the two inputs to ECDH for instance; in KDFa they're > > usually the local and remote nonces). > > > > The labels for TPM usage are things like "INTEGRITY" for HMAC keys > > or "CFB" when generating an aes128-cfb session key. For KDFe, the > > tpm seems to like the label "SECRET". Although the TPM specifies > > fixed short strings for the label, nothing prevents them being > > longer like the purpose you state above (essentially we could mix > > purpose, use and key type into the label and the contexts). > > > > That really ought to cover anything the kernel needs. > > But can you explain what's up with with KDFe? The obvious searches > end up with just warnings that the US currently has no government :( You mean you can't find SP100-56A because NIST is a government entity and it's discontinued its website because of the government shutdown? No idea, I only live here, you'll have to ask a real American. ACM does have a copy: http://delivery.acm.org/10.1145/221/2206270/SP800-56A_Revision1_Mar08-2007.pdf?ip=50.35.68.20=2206270=OPEN=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1546993111_ed9c8bd24b2f838c829d428aac7f5d71 > Anyway, if we're talking about the TPM, it seems like the entire > "trusted key" mechanism in the kernel is missing the point. If I > want to encrypt something like a hibernation image on a machine with > a TPM, it makes essentially no sense to me that we would get a key > with a known raw value that is merely TPM-backed (i.e. the "trusted > key") and use that to decrypt the image. The right way to do it is > to the use the TPM as it was intended to be used: generate a single- > use key that protects the hibernation image and seal *that* directly >
[PATCH 2/2] x86, kexec_file_load: make it work with efi=noruntime or efi=oldmap
When efi=noruntime or efi=oldmap is used, EFI services won't be available in the second kernel, therefore the second kernel will not be able to get the ACPI RSDP address from firmware by calling EFI services and won't boot. Previously we are expecting the user to set the acpi_rsdp= on kernel command line for second kernel as there was no way to pass RSDP address to second kernel. After commit e6e094e053af ('x86/acpi, x86/boot: Take RSDP address from boot params if available'), now it's possible to set a acpi_rsdp_addr parameter in the boot_params passed to second kernel, this commit make use of it, detect and set the RSDP address when it's required for second kernel to boot. Tested with an EFI enabled KVM VM with efi=noruntime. Suggested-by: Dave Young Signed-off-by: Kairui Song --- arch/x86/kernel/kexec-bzimage64.c | 21 + drivers/acpi/acpica/tbxfroot.c| 3 +-- include/acpi/acpixf.h | 2 +- 3 files changed, 23 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 53917a3ebf94..0a90dcbd041f 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -255,8 +256,28 @@ setup_boot_parameters(struct kimage *image, struct boot_params *params, /* Setup EFI state */ setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz, efi_setup_data_offset); + +#ifdef CONFIG_ACPI + /* Setup ACPI RSDP pointer in case EFI is not available in second kernel */ + if (!acpi_disabled && (!efi_enabled(EFI_RUNTIME_SERVICES) || efi_enabled(EFI_OLD_MEMMAP))) { + /* Copied from acpi_os_get_root_pointer accordingly */ + params->acpi_rsdp_addr = boot_params.acpi_rsdp_addr; + if (!params->acpi_rsdp_addr) { + if (efi_enabled(EFI_CONFIG_TABLES)) { + if (efi.acpi20 != EFI_INVALID_TABLE_ADDR) + params->acpi_rsdp_addr = efi.acpi20; + else if (efi.acpi != EFI_INVALID_TABLE_ADDR) + params->acpi_rsdp_addr = efi.acpi; + } else if (IS_ENABLED(CONFIG_ACPI_LEGACY_TABLES_LOOKUP)) { + acpi_find_root_pointer(>acpi_rsdp_addr); + } + } + if (!params->acpi_rsdp_addr) + pr_warn("RSDP is not available for second kernel\n"); + } #endif +#endif /* Setup EDD info */ memcpy(params->eddbuf, boot_params.eddbuf, EDDMAXNR * sizeof(struct edd_info)); diff --git a/drivers/acpi/acpica/tbxfroot.c b/drivers/acpi/acpica/tbxfroot.c index 483d0ce5180a..dac1e34a931c 100644 --- a/drivers/acpi/acpica/tbxfroot.c +++ b/drivers/acpi/acpica/tbxfroot.c @@ -108,8 +108,7 @@ acpi_status acpi_tb_validate_rsdp(struct acpi_table_rsdp *rsdp) * **/ -acpi_status ACPI_INIT_FUNCTION -acpi_find_root_pointer(acpi_physical_address *table_address) +acpi_status acpi_find_root_pointer(acpi_physical_address *table_address) { u8 *table_ptr; u8 *mem_rover; diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h index 7aa38b648564..869d75ecaf7d 100644 --- a/include/acpi/acpixf.h +++ b/include/acpi/acpixf.h @@ -474,7 +474,7 @@ ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION acpi_reallocate_root_table(void)) -ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION +ACPI_EXTERNAL_RETURN_STATUS(acpi_status acpi_find_root_pointer(acpi_physical_address *rsdp_address)) ACPI_EXTERNAL_RETURN_STATUS(acpi_status -- 2.20.1
Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver
Hi Jacek, On 07/01/2019 23.13, Jacek Anaszewski wrote: Hi Vesa, On 1/5/19 1:39 AM, Vesa Jääskeläinen wrote: Hi Jacek, On 04/01/2019 23.37, Jacek Anaszewski wrote: But, aside from that hypothetic issue, we need a solution for LEDn_BRIGHTNESS feature of lp5024, i.e. setting color intensity via a single register write. How would you propose to address that? You could model it to something like this in device tree: led-module @ { compatible = "lp5024"; // There is in hardware setup to use either linear or // logarithmic scaling: //enable-logarithmic-brightness; led0 { // this will create led instance for LED0 in lp5024 label = "lp-led0"; // This specifies LED number within lp5024 led-index = <0>; // set output-base as 0*3 == 0 element-red { // refers to OUT0 output-offset = <0>; }; element-green { // refers to OUT1 output-offset = <1>; }; element-blue { // refers to OUT2 output-offset = <2>; }; }; led1 { // this will create led instance for LED1 in lp5024 label = "lp-led1"; // This specifies LED number within lp5024 led-index = <1>; // set output-base as 1*3 == 3 element-red { // refers to OUT3 output-offset = <0>; }; element-green { // refers to OUT4 output-offset = <1>; }; element-blue { // refers to OUT5 output-offset = <2>; }; }; bank-led { // this will create led instance for bank leds in lp5024 label = "lp-bank-led"; // configured bank led configuration led-index = <2 3 4 5 6 7>; // As here is list of led-indices this entry is // assumed to be bank configuration. Bank mode is enable // for the indices. // set output-base as BANK A element-red { // refers to BANK A output-offset = <0>; }; element-green { // refers to BANK B output-offset = <1>; }; element-blue { // refers to BANK C output-offset = <2>; }; }; }; This would then create three led instances and each led instance has brightness setting and that goes straight to hardware. If one would want to override hardware control for brightness then I suppose you would define in led node something like: brightness-model = "hsl" This would then pick red, green and blue elements for hsl calculations and others color elements for linear. LED specific hardware brightness would then be either 0 or 0xFF depending if all of LED color elements are zero or not. Would that kind of model work? I'd prefer to have single RGB LED device. And your DT design is unnecessarily complex and a bit confusing. As this chip series is kinda designed for N x RGB LED's my idea was that if from user space point of view we model it as N times of individual RGB LED instances that may not even have anything to do with together. Eg. could be used for different purposes and such. And in device tree one would define logical connections for the leds so they would be mapped logically correct to user space. If one would define it like: led1 { // this will create led instance for LED1 in lp5024 label = "lp-led1"; // This specifies LED number within lp5024 led-sources = <1>; }; (note changed led-index to led-sources as that is what Pavel had and preferred) We could assume that it is RGB led in this driver's case and create it automatically with elements "red", "green", and "blue". And this could then be mapped automatically to HSL color elements or what ever the model would be. If you would model it differently in your hardware design then you would need to define more device tree nodes. Eg. if your order of LEDs would not be red, green, blue. Or if you would have non-RGB led(s) in there. Also, you provided scarce information about sysfs interface. It would be nice to see the sequence of commands. In this case it could be: # Note: Updated color to value array model. $ ls /sys/class/leds lp-led0 lp-led1 lp-bank-led $ ls /sys/class/leds/lp-led0 brightness color $ ls /sys/class/leds/lp-led1 brightness color $ ls /sys/class/leds/lp-bank-led brightness color # Idea of above is that as brightness is for triplet: # OUT(LED*3 + 0), OUT(LED*3 + 1), OUT(LED*3 + 2), # Then if we model it like RGB LED then brightness would automatically # map to correct OUTputs and be grouped from user space point of view # logically in correct place. # set first led to red echo "255 0 0" > /sys/class/leds/lp-led0/color # set second led to green echo "0 255 0" > /sys/class/leds/lp-led1/color # set
Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler
Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley: Hi James, > Actually, it would be enormously helpful if we could reuse these pieces > for the TPM as well. Could you please help me understand whether the KDFs in TPM are directly usable as a standalone cipher primitive or does it go together with additional key generation operations? > It has two KDFs: KDFa, which is the CTR-KDF from > SP800-108 and KDFe which is the SP800-56A KDF for elliptic curve one > pass Diffie Hellman, so if we're going to do the former, I'd really > like the latter as well. > > The way the TPM parametrises input to both KDFs is > > (hashAlg, key, label, contextU, contextV, bits) > > Where > > hashAlg = the hash algorithm used as the PRF > key = the input parameter of variable bit size or >the x co-ordinate of the shared point > label= An ASCII string representing the use > contextU = public input U > contextV = public input V > bits = number of output bits When implementing KDFs as an extension of the kernel crypto API's RNG facility we currently have to handle the limitiation of the existing API. The label/ context data (and when considering RFC 5869 HKDF requring IKM, salt and additional information as input) currently cannot directly be communicated through the API. The issue is that the RNG facility currently has the following prototype defined: int (*generate)(struct crypto_rng *tfm, const u8 *src, unsigned int slen, u8 *dst, unsigned int dlen); The src pointer would need to take the label/context data. Would it be appropriate, to implement a type cast to a structure from the u8 pointer? E.g. for the aforementioned label/context data, we could define something like struct crypto_kdf_ctr { char *label; size_t label_len; u8 *contextU; size_t contextU_len; u8 *contextV; size_t contextV_len; }; And the implementation of the generate function for CTR KDF would then have a type cast along the following lines: if (slen != sizeof(struct crypto_kdf_ctr)) return -EINVAL; const struct crypto_kdf_ctr *kdf_ctr_input = (struct crypto_kdf_ctr *)src; For different KDFs, different structs would be needed. Ciao Stephan
[PATCH 1/2] x86, kexec_file_load: Don't setup EFI info if EFI runtime is not enabled
Currenly with "efi=noruntime" in kernel command line, calling kexec_file_load will raise below problem: [ 97.967067] BUG: unable to handle kernel NULL pointer dereference at [ 97.967894] #PF error: [normal kernel read fault] ... [ 97.980456] Call Trace: [ 97.980724] efi_runtime_map_copy+0x28/0x30 [ 97.981267] bzImage64_load+0x688/0x872 [ 97.981794] arch_kexec_kernel_image_load+0x6d/0x70 [ 97.982441] kimage_file_alloc_init+0x13e/0x220 [ 97.983035] __x64_sys_kexec_file_load+0x144/0x290 [ 97.983586] do_syscall_64+0x55/0x1a0 [ 97.983962] entry_SYSCALL_64_after_hwframe+0x44/0xa9 When efi runtime is not enabled, efi memmap is not mapped, so just skip EFI info setup. Suggested-by: Dave Young Signed-off-by: Kairui Song --- arch/x86/kernel/kexec-bzimage64.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 0d5efa34f359..53917a3ebf94 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -167,6 +167,9 @@ setup_efi_state(struct boot_params *params, unsigned long params_load_addr, struct efi_info *current_ei = _params.efi_info; struct efi_info *ei = >efi_info; + if (!efi_enabled(EFI_RUNTIME_SERVICES)) + return 0; + if (!current_ei->efi_memmap_size) return 0; -- 2.20.1
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On 09/01/2019 16:30, David Gibson wrote: > On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: >> On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: >>> In a very cryptic way that requires manual parsing using non-public docs sadly but yes. From the look of it, it's a completion timeout. Looks to me like we don't get a response to a config space access during the change of D state. I don't know if it's the write of the D3 state itself or the read back though (it's probably detected on the read back or a subsequent read, but that doesn't tell me which specific one failed). >>> >>> If it is just one card doing it (again, check you have latest >>> firmware) I wonder if it is a sketchy PCI-E electrical link that is >>> causing a long re-training cycle? Can you tell if the PCI-E link is >>> permanently gone or does it eventually return? >> >> No, it's 100% reproducable on systems with that specific card model, >> not card instance, and maybe different systems/cards as well, I'll let >> David & Alexey comment further on that. > > Well, it's 100% reproducable on a particular model of system > (garrison) with a particular model of card. I've had some suggestions > that it fails with some other systems card card models, but nothing > confirmed - the one other system model I've been able to try, which > also had a newer card model didn't reproduce the problem. I have just moved the "Mellanox Technologies MT27700 Family [ConnectX-4]" from garrison to firestone machine and there it does not produce an EEH, with the same kernel and skiboot (both upstream + my debug). Hm. I cannot really blame the card but I cannot see what could cause the difference in skiboot either. I even tried disabling NPU so garrison would look like firestone, still EEH'ing. >>> Does the card work in Gen 3 when it starts? Is there any indication of >>> PCI-E link errors? >> >> Nope. >> >>> Everytime or sometimes? >>> >>> POWER 8 firmware is good? If the link does eventually come back, is >>> the POWER8's D3 resumption timeout long enough? >>> >>> If this doesn't lead to an obvious conclusion you'll probably need to >>> connect to IBM's Mellanox support team to get more information from >>> the card side. >> >> We are IBM :-) So far, it seems to be that the card is doing something >> not quite right, but we don't know what. We might need to engage >> Mellanox themselves. > > Possibly. On the other hand, I've had it reported that this is a > software regression at least with downstream red hat kernels. I > haven't yet been able to eliminate factors that might be confusing > that, or try to find a working version upstream. Do you have tarballs handy? I'd diff... -- Alexey
Re: [PATCH] lkdtm: Add a tests for NULL pointer dereference
Le 09/01/2019 à 02:14, Kees Cook a écrit : On Fri, Dec 14, 2018 at 7:26 AM Christophe Leroy wrote: Introduce lkdtm tests for NULL pointer dereference: check access or exec at NULL address. Why is this not already covered by the existing tests? (Is there something special about NULL that is being missed?) I'd expect SMAP and SMEP to cover NULL as well. Most arches print a different message whether the faulty address is above or under PAGE_SIZE. Below is exemple from x86: pr_alert("BUG: unable to handle kernel %s at %px\n", address < PAGE_SIZE ? "NULL pointer dereference" : "paging request", (void *)address); Until recently, the powerpc arch didn't do it. When I implemented it (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49a502ea23bf9dec47f8f3c3960909ff409cd1bb), I needed a way to test it and couldn't find an existing one, hence this new LKDTM test. But maybe I missed something ? Christophe -Kees Signed-off-by: Christophe Leroy --- drivers/misc/lkdtm/core.c | 2 ++ drivers/misc/lkdtm/lkdtm.h | 2 ++ drivers/misc/lkdtm/perms.c | 18 ++ 3 files changed, 22 insertions(+) diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c index bc76756b7eda..36910e1d5c09 100644 --- a/drivers/misc/lkdtm/core.c +++ b/drivers/misc/lkdtm/core.c @@ -157,7 +157,9 @@ static const struct crashtype crashtypes[] = { CRASHTYPE(EXEC_VMALLOC), CRASHTYPE(EXEC_RODATA), CRASHTYPE(EXEC_USERSPACE), + CRASHTYPE(EXEC_NULL), CRASHTYPE(ACCESS_USERSPACE), + CRASHTYPE(ACCESS_NULL), CRASHTYPE(WRITE_RO), CRASHTYPE(WRITE_RO_AFTER_INIT), CRASHTYPE(WRITE_KERN), diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h index 3c6fd327e166..b69ee004a3f7 100644 --- a/drivers/misc/lkdtm/lkdtm.h +++ b/drivers/misc/lkdtm/lkdtm.h @@ -45,7 +45,9 @@ void lkdtm_EXEC_KMALLOC(void); void lkdtm_EXEC_VMALLOC(void); void lkdtm_EXEC_RODATA(void); void lkdtm_EXEC_USERSPACE(void); +void lkdtm_EXEC_NULL(void); void lkdtm_ACCESS_USERSPACE(void); +void lkdtm_ACCESS_NULL(void); /* lkdtm_refcount.c */ void lkdtm_REFCOUNT_INC_OVERFLOW(void); diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c index fa54add6375a..62f76d506f04 100644 --- a/drivers/misc/lkdtm/perms.c +++ b/drivers/misc/lkdtm/perms.c @@ -164,6 +164,11 @@ void lkdtm_EXEC_USERSPACE(void) vm_munmap(user_addr, PAGE_SIZE); } +void lkdtm_EXEC_NULL(void) +{ + execute_location(NULL, CODE_AS_IS); +} + void lkdtm_ACCESS_USERSPACE(void) { unsigned long user_addr, tmp = 0; @@ -195,6 +200,19 @@ void lkdtm_ACCESS_USERSPACE(void) vm_munmap(user_addr, PAGE_SIZE); } +void lkdtm_ACCESS_NULL(void) +{ + unsigned long tmp; + unsigned long *ptr = (unsigned long *)NULL; + + pr_info("attempting bad read at %px\n", ptr); + tmp = *ptr; + tmp += 0xc0dec0de; + + pr_info("attempting bad write at %px\n", ptr); + *ptr = tmp; +} + void __init lkdtm_perms_init(void) { /* Make sure we can write to __ro_after_init values during __init */ -- 2.13.3
Re: [PATCH v2] staging: erofs: Add identifier for function definition arguments
On 2019/1/8 21:24, Sidong Yang wrote: > Add identifier for function definition arguments in xattr_iter_handlers, > this change clears the checkpatch.pl issue and make code more explicit. > > Signed-off-by: Sidong Yang Reviewed-by: Chao Yu Thanks,
Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler
Am Mittwoch, 9. Januar 2019, 00:54:22 CET schrieb Andy Lutomirski: Hi Andy, > > I think that, if the crypto API is going to grow a KDF facility, it should > be done right. Have a key type or flag or whatever that says “this key may > *only* be used to derive keys using such-and-such algorithm”, and have a > helper to derive a key. That helper should take some useful parameters and > mix them in: > > - What type of key is being derived? ECDSA signing key? HMAC key? AES > key? > > - Can user code access the derived key? > > - What is the key’s purpose? “Encrypt and authenticate a hibernation image” > would be a purpose. > > - Number of bytes. > > All of these parameters should be mixed in to the key derivation. > > Also, an AE key, even for AES+HMAC, should be just one derived key. If you > need 512 bits, ask for a 512-bit key, not two 256-bit keys. I concur with your requirements. However, is the kernel crypto API the right place to enforce such policies? To me, the kernel crypto API is a tinker-toy set of ciphers. The real policy enforcer would or should be the keyring facility. Thus, may I propose to: - implement the cryptographic primitive of the KDF in the kernel crypto API - implement the policy system how to use the KDF in the keyring facility Ciao Stephan
Re: [PATCH] ACPI/nfit: delete the redundant header file
On Sun, Jan 6, 2019 at 6:33 PM Xiaochun Lee wrote: > > From: Xiaochun Lee > > The header file "intel.h" is repeated here, So delete one. > > Signed-off-by: Xiaochun Lee Thanks, applied. By the way, no need for a cover letter if it's only one patch.
Re: [PATCH] ACPI/nfit: delete the function to_acpi_nfit_desc
On Sat, Jan 5, 2019 at 12:09 AM Xiaochun Lee wrote: > > From: Xiaochun Lee > > The function to_acpi_nfit_desc and function to_acpi_desc > do the same things,delete the function to_acpi_nfit_desc, > and keep the inline function to_acpi_desc. > > Signed-off-by: Xiaochun Lee Good catch, thanks, applied.
Re: [PATCH v2] nfit: Mark some functions as __maybe_unused
On Tue, Jan 8, 2019 at 9:03 PM Nathan Chancellor wrote: > > On arm64 little endian allyesconfig: > > drivers/acpi/nfit/intel.c:149:12: warning: unused function > 'intel_security_unlock' [-Wunused-function] > static int intel_security_unlock(struct nvdimm *nvdimm, >^ > drivers/acpi/nfit/intel.c:230:12: warning: unused function > 'intel_security_erase' [-Wunused-function] > static int intel_security_erase(struct nvdimm *nvdimm, >^ > drivers/acpi/nfit/intel.c:279:12: warning: unused function > 'intel_security_query_overwrite' [-Wunused-function] > static int intel_security_query_overwrite(struct nvdimm *nvdimm) >^ > drivers/acpi/nfit/intel.c:316:12: warning: unused function > 'intel_security_overwrite' [-Wunused-function] > static int intel_security_overwrite(struct nvdimm *nvdimm, >^ > 4 warnings generated. > > Mark these functions as __maybe_unused because they are only used when > CONFIG_X86 is set. > > Fixes: 4c6926a23b76 ("acpi/nfit, libnvdimm: Add unlock of nvdimm support for > Intel DIMMs") > Suggested-by: Dan Williams > Signed-off-by: Nathan Chancellor Looks good, applied.
Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order
On 2019-01-09 03:47, Alexander Duyck wrote: On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote: When freeing pages are done with higher order, time spent on coalescing pages by buddy allocator can be reduced. With section size of 256MB, hot add latency of a single section shows improvement from 50-60 ms to less than 1 ms, hence improving the hot add latency by 60 times. Modify external providers of online callback to align with the change. Signed-off-by: Arun KS Acked-by: Michal Hocko Reviewed-by: Oscar Salvador Sorry, ended up encountering a couple more things that have me a bit confused. [...] diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 5301fef..211f3fe 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -771,7 +771,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, } } -static void hv_online_page(struct page *pg) +static int hv_online_page(struct page *pg, unsigned int order) { struct hv_hotadd_state *has; unsigned long flags; @@ -783,10 +783,12 @@ static void hv_online_page(struct page *pg) if ((pfn < has->start_pfn) || (pfn >= has->end_pfn)) continue; - hv_page_online_one(has, pg); + hv_bring_pgs_online(has, pfn, (1UL << order)); break; } spin_unlock_irqrestore(_device.ha_lock, flags); + + return 0; } static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) So the question I have is why was a return value added to these functions? They were previously void types and now they are int. What is the return value expected other than 0? Earlier with returning a void there was now way for an arch code to denying onlining of this particular page. By using an int as return type, we can implement this. In one of the boards I was using, there are some pages which should not be onlined because they are used for other purposes(like secure trust zone or hypervisor). diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index ceb5048..95f888f 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -345,8 +345,8 @@ static enum bp_state reserve_additional_memory(void) /* * add_memory_resource() will call online_pages() which in its turn -* will call xen_online_page() callback causing deadlock if we don't -* release balloon_mutex here. Unlocking here is safe because the +* will call xen_bring_pgs_online() callback causing deadlock if we + * don't release balloon_mutex here. Unlocking here is safe because the * callers drop the mutex before trying again. */ mutex_unlock(_mutex); @@ -369,15 +369,22 @@ static enum bp_state reserve_additional_memory(void) return BP_ECANCELED; } -static void xen_online_page(struct page *page) +static int xen_bring_pgs_online(struct page *pg, unsigned int order) Why did we rename this function? I see it was added as a new function in v3, however in v4 we ended up replacing it completely. So why not just keep the same name and make it easier for us to identify that the is the Xen version of the XXX_online_pages callback? Point taken. Will send a patch. [...] +static int online_pages_blocks(unsigned long start, unsigned long nr_pages) +{ + unsigned long end = start + nr_pages; + int order, ret, onlined_pages = 0; + + while (start < end) { + order = min(MAX_ORDER - 1, + get_order(PFN_PHYS(end) - PFN_PHYS(start))); + + ret = (*online_page_callback)(pfn_to_page(start), order); + if (!ret) + onlined_pages += (1UL << order); + else if (ret > 0) + onlined_pages += ret; + So if the ret > 0 it is supposed to represent how many pages were onlined within a given block? What if the ret was negative? Really I am not a fan of adding a return value to the online functions unless we specifically document what the expected return values are supposed to be. If we don't have any return values other than 0 there isn't much point in having one anyway. I ll document this. Regards, Arun
Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver
Hi Dan, On 07/01/2019 21.34, Dan Murphy wrote: Vesa On 1/4/19 6:39 PM, Vesa Jääskeläinen wrote: Hi Jacek, On 04/01/2019 23.37, Jacek Anaszewski wrote: But, aside from that hypothetic issue, we need a solution for LEDn_BRIGHTNESS feature of lp5024, i.e. setting color intensity via a single register write. How would you propose to address that? You could model it to something like this in device tree: led-module @ { compatible = "lp5024"; // There is in hardware setup to use either linear or // logarithmic scaling: //enable-logarithmic-brightness; led0 { // this will create led instance for LED0 in lp5024 label = "lp-led0"; // This specifies LED number within lp5024 led-index = <0>; // set output-base as 0*3 == 0 element-red { // refers to OUT0 output-offset = <0>; }; element-green { // refers to OUT1 output-offset = <1>; }; element-blue { // refers to OUT2 output-offset = <2>; }; }; led1 { // this will create led instance for LED1 in lp5024 label = "lp-led1"; // This specifies LED number within lp5024 led-index = <1>; // set output-base as 1*3 == 3 Can we not use led-sources like I have done already? It was just for illustration of the idea. Names can be agreed. I have nothing against led-sources name. I was just looking at datasheet to try to undestand what it did and then tried to figure out if it could be mapped the idea I have been playing with. I really like to keep the DT nodes simple and re-use nodes that exist if possible. I'll reply to Jacek's email about more clarifications of the idea. Thanks, Vesa Jääskeläinen My code already maps and groups the outputs into the associated banks
RE: [RFC PATCH v1] cfg80211/nl80211: add support for AID assignment by driver : https://patchwork.kernel.org/patch/10726899/
My Bad, Onwards I will submit a new version of the RFC patch to address the reviewer comments on previous RFC patch and for query or suggestions, I will definitely cc Linux-wireless in my mail without any external link or attachment. Thanks, Prasanna -Original Message- From: Kalle Valo [mailto:kv...@codeaurora.org] Sent: Tuesday, January 8, 2019 10:50 PM To: Garnayak, Sarada Prasanna Cc: Berg, Johannes ; Jouni Malinen ; Sergey Matyukevich ; linux-kernel@vger.kernel.org; linux-wirel...@vger.kernel.org Subject: Re: [RFC PATCH v1] cfg80211/nl80211: add support for AID assignment by driver : https://patchwork.kernel.org/patch/10726899/ + linux-wireless "Garnayak, Sarada Prasanna" writes: > Thank you so much for reviewing my RFC patch and here I would like to > address the big concerns/comments about this RFC patch like > > Ø What are the exact use case and potential user of this patch. > > Ø How the Hostapd will handle this get aid and free aid. > > Ø etc. > > In the current implementation in WLAN Stack, the hostapd is assigning > the AID for a station during association. > > But we have a few WLAN hardware which is maintaining the AID in the > WLAN hardware level itself and the firmware checking this AID in the > hardware level according to that the firmware is taking action on > station management > > like (Station DB management, BA agreement, SMPS and TM management > frame etc). For this reason, we need these get_aid/free_aid. > > Also here I am attaching the Hostapd RFC Patch for the same. PFA > (hostapd patch, call flow diagram and comments on this kernel patch) > > Please review the call flow below: You made few crucial mistakes: 1. You did not CC linux-wireless. 2. You submitted the message as HTML and our lists automatically filter those. 3. You attached a powerpoint file and I would hope that our lists automatically filter those as well. At least I'm not going to open any random powerpoint/word files I receive via email. -- Kalle Valo
Re: [PATCH v2] f2fs: fix sbi->extent_list corruption issue
On 2019/1/9 12:38, Jaegeuk Kim wrote: > On 01/07, Chao Yu wrote: >> On 2019/1/5 4:33, Jaegeuk Kim wrote: >>> On 01/04, Sahitya Tummala wrote: On Mon, Nov 26, 2018 at 10:17:20AM +0530, Sahitya Tummala wrote: > When there is a failure in f2fs_fill_super() after/during > the recovery of fsync'd nodes, it frees the current sbi and > retries again. This time the mount is successful, but the files > that got recovered before retry, still holds the extent tree, > whose extent nodes list is corrupted since sbi and sbi->extent_list > is freed up. The list_del corruption issue is observed when the > file system is getting unmounted and when those recoverd files extent > node is being freed up in the below context. > > list_del corruption. prev->next should be fff1e1ef5480, but was (null) > <...> > kernel BUG at kernel/msm-4.14/lib/list_debug.c:53! > task: fff1f46f2280 task.stack: ff8008068000 > lr : __list_del_entry_valid+0x94/0xb4 > pc : __list_del_entry_valid+0x94/0xb4 > <...> > Call trace: > __list_del_entry_valid+0x94/0xb4 > __release_extent_node+0xb0/0x114 > __free_extent_tree+0x58/0x7c > f2fs_shrink_extent_tree+0xdc/0x3b0 > f2fs_leave_shrinker+0x28/0x7c > f2fs_put_super+0xfc/0x1e0 > generic_shutdown_super+0x70/0xf4 > kill_block_super+0x2c/0x5c > kill_f2fs_super+0x44/0x50 > deactivate_locked_super+0x60/0x8c > deactivate_super+0x68/0x74 > cleanup_mnt+0x40/0x78 > __cleanup_mnt+0x1c/0x28 > task_work_run+0x48/0xd0 > do_notify_resume+0x678/0xe98 > work_pending+0x8/0x14 > > Fix this by cleaning up inodes, extent tree and nodes of those > recovered files before freeing up sbi and before next retry. > Hi Jaegeuk, Chao, I have observed another scenario where the similar list corruption issue can happen with sbi->inode_list as well. If recover_fsync_data() fails at some point in write_checkpoint() due to some error and if those recovered inodes are still dirty, then after the mount is successful, this issue is observed when that dirty inode is under writeback. >>> >>> recover_fsync_data() does iget/iput in pair, and destroy_fsync_dnodes() >>> drops >>> its dirty list and call iput(), when there is an error. So, after then, >>> there'd >>> be no dirty inodes. If there's no error, checkpoint() flushes quota/dentry >>> pages >>> in dirty inodes as well. Can we check where this dirty inode came from? >> >> I guess it comes from: >> >> f2fs_recover_fsync_data() >> >> /* Needed for iput() to work correctly and not trash data */ >> sbi->sb->s_flags |= SB_ACTIVE; >> >> iput_final() >> >> if (!drop && (sb->s_flags & SB_ACTIVE)) { >> inode_add_lru(inode); >> spin_unlock(>i_lock); >> return; >> } >> >> So dirty data in those inode can be remained after iput(), then meta/node >> can be persisted during next checkpoint, if checkpoint failed due to error, >> dirty inode remain in system. IIUC. > > > 749 err = recover_data(sbi, _list, _inode_list, _list); > 750 if (!err) > 751 f2fs_bug_on(sbi, !list_empty(_list)); > 752 else { > 753 /* restore s_flags to let iput() trash data */ > 754 sbi->sb->s_flags = s_flags; > 755 } > > We deactivate sb before iput? We only restore s_flags in error path of recover_data? I remember Sahitya said his case is encountering error in later checkpoint()? Thanks, > >> >>> >>> Oh, one sceanrio can be an error by f2fs_disable_checkpoint() which will do >>> GC. >>> [ 90.400500] list_del corruption. prev->next should be ffed1f566208, but was (null) [ 90.675349] Call trace: [ 90.677869] __list_del_entry_valid+0x94/0xb4 [ 90.682351] remove_dirty_inode+0xac/0x114 [ 90.686563] __f2fs_write_data_pages+0x6a8/0x6c8 [ 90.691302] f2fs_write_data_pages+0x40/0x4c [ 90.695695] do_writepages+0x80/0xf0 [ 90.699372] __writeback_single_inode+0xdc/0x4ac [ 90.704113] writeback_sb_inodes+0x280/0x440 [ 90.708501] wb_writeback+0x1b8/0x3d0 [ 90.712267] wb_workfn+0x1a8/0x4d4 [ 90.715765] process_one_work+0x1c0/0x3d4 [ 90.719883] worker_thread+0x224/0x344 [ 90.723739] kthread+0x120/0x130 [ 90.727055] ret_from_fork+0x10/0x18 I think it is better to cleanup those inodes completely before freeing sbi and before next retry as done in this patch. Would you like to re-consider this patch for this new issue? >>> >>> The patch was merged in mainline already. >>> Could you take a look at this patch? >>> >>> >From cb1d20e640402beed300c2bdce79311ee8a781ad Mon Sep 17 00:00:00 2001 >>> From: Jaegeuk Kim >>> Date: Fri, 4 Jan 2019 12:29:00 -0800 >>> Subject: [PATCH] f2fs: sync filesystem after roll-forward recovery >> >> You mean android kernel
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: > > > In a very cryptic way that requires manual parsing using non-public > > docs sadly but yes. From the look of it, it's a completion timeout. > > > > Looks to me like we don't get a response to a config space access > > during the change of D state. I don't know if it's the write of the D3 > > state itself or the read back though (it's probably detected on the > > read back or a subsequent read, but that doesn't tell me which specific > > one failed). > > If it is just one card doing it (again, check you have latest > firmware) I wonder if it is a sketchy PCI-E electrical link that is > causing a long re-training cycle? Can you tell if the PCI-E link is > permanently gone or does it eventually return? No, it's 100% reproducable on systems with that specific card model, not card instance, and maybe different systems/cards as well, I'll let David & Alexey comment further on that. > Does the card work in Gen 3 when it starts? Is there any indication of > PCI-E link errors? Nope. > Everytime or sometimes? > > POWER 8 firmware is good? If the link does eventually come back, is > the POWER8's D3 resumption timeout long enough? > > If this doesn't lead to an obvious conclusion you'll probably need to > connect to IBM's Mellanox support team to get more information from > the card side. We are IBM :-) So far, it seems to be that the card is doing something not quite right, but we don't know what. We might need to engage Mellanox themselves. Cheers, Ben.
Re: [PATCH] pseries/hotplug: Add more delay in pseries_cpu_die while waiting for rtas-stop
Hello Thiago, Wish you a happy 2019! On Sat, Dec 08, 2018 at 12:40:52AM -0200, Thiago Jung Bauermann wrote: > > Gautham R Shenoy writes: > > On Fri, Dec 07, 2018 at 04:13:11PM +0530, Gautham R Shenoy wrote: > >> Sure. I will test the patch and report back. > > > > I added the following debug patch on top of your patch, and after an > > hour's run, the system crashed. Appending the log at the end. > > Thank you very much for testing! Your debug patch was very helpful as > well. > > > I suppose we still need to increase the number of tries since we wait > > only for 2.5ms looping before giving up. > > Do you think it would have helped? In the debug output you posted I > would have expected the following message to show up if the loop > finished too early, and it didn't: > > "Querying DEAD? cpu %i (%i) shows %i\n" > > So I don't think increasing the loop length would have made a > difference. In fact, the call to smp_query_cpu_stopped() always > succeeded on the first iteration. I did some testing during the holidays. Here are the observations: 1) With just your patch (without any additional debug patch), if I run DLPAR on /off operations on a system that has SMT=off, I am able to see a crash involving RTAS stack corruption within an hour's time. 2) With the debug patch (appended below) which has additional debug to capture the callers of stop-self, start-cpu, set-power-levels, the system is able to perform DLPAR on/off operations on a system with SMT=off for three days. And then, it crashed with the dead CPU showing a "Bad kernel stack pointer". From this log, I can clearly see that there were no concurrent calls to stop-self, start-cpu, set-power-levels. The only concurrent RTAS calls were the dying CPU calling "stop-self", and the CPU running the DLPAR operation calling "query-cpu-stopped-state". The crash signature is appended below as well. 3) Modifying your patch to remove the udelay and increase the loop count from 25 to 1000 doesn't improve the situation. We are still able to see the crash. 4) With my patch, even without any additional debug, I was able to observe the system run the tests successfully for over a week (I started the tests before the Christmas weekend, and forgot to turn it off!) It appears that there is a narrow race window involving rtas-stop-self and rtas-query-cpu-stopped-state calls that can be observed with your patch. Adding any printk's seems to reduce the probability of hitting this race window. It might be worth the while to check with RTAS folks, if they suspect something here. The Crash log with this debug patch is as follows [DEBUG] CPU 32 waiting for CPU 130 to enter rtas cpu 130 (hwid 130) Ready to die... [DEBUG] CPU 32 noticed CPU 130 enter rtas: tries=0, time=539 [DEBUG] CPU 32 waiting for CPU 131 to enter rtas cpu 131 (hwid 131) Ready to die... [DEBUG] CPU 32 noticed CPU 131 enter rtas: tries=0, time=137 [DEBUG] CPU 32 waiting for CPU 132 to enter rtas cpu 132 (hwid 132) Ready to die... [DEBUG] CPU 32 noticed CPU 132 enter rtas: tries=0, time=1238 [DEBUG] CPU 32 waiting for CPU 133 to enter rtas cpu 133 (hwid 133) Ready to die... [DEBUG] CPU 32 noticed CPU 133 enter rtas: tries=1, time=1259 [DEBUG] CPU 32 waiting for CPU 134 to enter rtas cpu 134 (hwid 134) Ready to die... [DEBUG] CPU 32 noticed CPU 134 enter rtas: tries=0, time=1141 [DEBUG] CPU 32 waiting for CPU 135 to enter rtas cpu 135 (hwid 135) Ready to die... [DEBUG] CPU 32 noticed CPU 135 enter rtas: tries=0, time=636 cpu 120 (hwid 120) Ready to die... [DEBUG] CPU 32 waiting for CPU 120 to enter rtas [DEBUG] CPU 32 noticed CPU 120 enter rtas: tries=0, time=53 [DEBUG] CPU 32 waiting for CPU 121 to enter rtas cpu 121 (hwid 121) Ready to die... Bad kernel stack pointer 1fafb400 at 1fafb4b0 Bad kernel stack pointer 1fafb4b0 at 1ec15270 Oops: Bad kernel stack pointer, sig: 6 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 121 PID: 0 Comm: swapper/121 Not tainted 4.20.0-thiago-original-with-debug+ #57 NIP: 1ec15270 LR: 1ec1526c CTR: 1ecd26c4 REGS: c0001e577d30 TRAP: 0300 Not tainted (4.20.0-thiago-original-with-debug+) MSR: 80001000 CR: 4800 XER: 0020 CFAR: 1ecd27a8 DAR: 8108 DSISR: 4000 IRQMASK: 80009033 GPR00: 1ec1526c 1fafb4b0 GPR04: 01a40968 00e0 dfe8 0018 GPR08: 1f82ae00 1ed025d4 1f827850 GPR12: 01b6d998 c0001eb4e080 c003a1bdbf90 1eea3020 GPR16: c006fdc45100 c004c8b0 c13d5300 GPR20: c006fdc45300 0008 c19d2cf8 c13d6888 GPR24: 0079 c13d688c 0002 c13d688c GPR28: c19cecf0 03c8 1fafb4b0 NIP [1ec15270] 0x1ec15270 LR [1ec1526c] 0x1ec1526c Call Trace: Instruction dump:
Re: [PATCH 4.20 000/145] 4.20.1-stable review
On Tue, Jan 08, 2019 at 07:39:44PM +, Dmitrii Tcvetkov wrote: > Hello, > > Don't have anything against listed patches, just curious: doesn't upstream > commit > 574823bfab82d9d8fa47f422778043fbb4b4f50e > (Change mincore() to count "mapped" pages rather than "cached" pages) > need to be backported as a security fix for CVE-2019-5489[1] to stable > kernels? > > [1] https://www.openwall.com/lists/oss-security/2019/01/07/2 Please read the email thread for that patch on the linux kernel mailing list as to why it is not to be applied at this point in time. If you are worried about this issue, please weigh in on that thread. thanks, greg k-h
Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order
On 2019-01-08 23:43, Michal Hocko wrote: On Tue 08-01-19 09:56:09, Alexander Duyck wrote: On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote: [...] > static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, >void *arg) > { > - unsigned long i; >unsigned long onlined_pages = *(unsigned long *)arg; > - struct page *page; > >if (PageReserved(pfn_to_page(start_pfn))) > - for (i = 0; i < nr_pages; i++) { > - page = pfn_to_page(start_pfn + i); > - (*online_page_callback)(page); > - onlined_pages++; > - } > + onlined_pages = online_pages_blocks(start_pfn, nr_pages); Shouldn't this be a "+=" instead of an "="? It seems like you are going to lose your count otherwise. You are right of course. I should have noticed during the review. Thanks! I think we don't need to. The caller function is setting onlined_pages = 0 before calling online_pages_range(). And there are no other reference to online_pages_range other than from online_pages(). https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/memory_hotplug.c?h=v5.0-rc1#n845 int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type) { unsigned long flags; unsigned long onlined_pages = 0; Regards, Arun
[PATCH] thermal: mtk: Allocate enough space for mtk_thermal.
The mtk_thermal struct contains a 'struct mtk_thermal_bank banks[];', but the allocation only allocates sizeof(struct mtk_thermal) bytes, which cause out of bound access with the ->banks[] member. Change it to a fixed size array instead. Signed-off-by: Pi-Hsun Shih --- drivers/thermal/mtk_thermal.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c index 0691f260f6eabe..ea11edb3fcced6 100644 --- a/drivers/thermal/mtk_thermal.c +++ b/drivers/thermal/mtk_thermal.c @@ -159,6 +159,9 @@ #define MT7622_NUM_SENSORS_PER_ZONE1 #define MT7622_TS1 0 +/* The maximum number of banks */ +#define MAX_NUM_ZONES 8 + struct mtk_thermal; struct thermal_bank_cfg { @@ -178,7 +181,7 @@ struct mtk_thermal_data { const int *sensor_mux_values; const int *msr; const int *adcpnp; - struct thermal_bank_cfg bank_data[]; + struct thermal_bank_cfg bank_data[MAX_NUM_ZONES]; }; struct mtk_thermal { @@ -197,7 +200,7 @@ struct mtk_thermal { s32 vts[MT8173_NUM_SENSORS]; const struct mtk_thermal_data *conf; - struct mtk_thermal_bank banks[]; + struct mtk_thermal_bank banks[MAX_NUM_ZONES]; }; /* MT8173 thermal sensor data */ -- 2.20.1.97.g81188d93c3-goog
Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()
I didn't pay enough attention. Seems to be a temp problem on parisc-linux.org side: : There was a temporary problem delivering your message to jejb at parisc-linux.org. : : The response was: : : The recipient server did not accept our requests to connect. : [mail.parisc-linux.org. 140.ZZZ.X.YY: generic::failed_precondition: connect error (0): error] -ss
[PATCH] rcu: Remove rcu_*_state declaration
Because of just only one set of rcu_state, the declaration of rcu_sched_state/rcu_bh_state/rcu_preempt_state is unnecessary. Signed-off-by: Peng Hao --- kernel/rcu/tree.h | 7 --- 1 file changed, 7 deletions(-) diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index 703e19f..9ea704c 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -401,13 +401,6 @@ struct rcu_state { /* * RCU implementation internal declarations: */ -extern struct rcu_state rcu_sched_state; - -extern struct rcu_state rcu_bh_state; - -#ifdef CONFIG_PREEMPT_RCU -extern struct rcu_state rcu_preempt_state; -#endif /* #ifdef CONFIG_PREEMPT_RCU */ int rcu_dynticks_snap(struct rcu_data *rdp); -- 1.8.3.1
Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get
On Tue, Jan 8, 2019 at 9:36 PM Shakeel Butt wrote: > > On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote: > > > > There is an imbalance between when slab_pre_alloc_hook calls > > memcg_kmem_get_cache and when slab_post_alloc_hook calls > > memcg_kmem_put_cache. > > > > Can you explain how there is an imbalance? If the returned kmem cache > from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of > memcg is elevated and the memcg_kmem_put_cache() will correctly > decrement the refcnt of the memcg. > > > This can cause a memcg kmem cache to be destroyed right as > > an object from that cache is being allocated, Also please note that the memcg kmem caches are destroyed (if empty) on memcg offline. The css_tryget_online() within memcg_kmem_get_cache() will fail. See kernel/cgroup/cgroup.c * 2. When the percpu_ref is confirmed to be visible as killed on all CPUs *and thus css_tryget_online() is guaranteed to fail, the css can be *offlined by invoking offline_css(). After offlining, the base ref is *put. Implemented in css_killed_work_fn(). > > which is probably > > not good. It could lead to things like a memcg allocating new > > kmalloc slabs instead of using freed space in old ones, maybe > > memory leaks, and maybe oopses as a memcg kmalloc slab is getting > > destroyed on one CPU while another CPU is trying to do an allocation > > from that same memcg. > > > > The obvious fix would be to use the same condition for calling > > memcg_kmem_put_cache that we also use to decide whether to call > > memcg_kmem_get_cache. > > > > I am not sure how long this bug has been around, since the last > > changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup > > kmem charge functions") - merely moved the bug from one location to > > another. I am still tagging that changeset, because the fix should > > automatically apply that far back. > > > > Signed-off-by: Rik van Riel > > Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions") > > Cc: kernel-t...@fb.com > > Cc: linux...@kvack.org > > Cc: sta...@vger.kernel.org > > Cc: Alexey Dobriyan > > Cc: Christoph Lameter > > Cc: Pekka Enberg > > Cc: Andrew Morton > > Cc: David Rientjes > > Cc: Joonsoo Kim > > Cc: Johannes Weiner > > Cc: Tejun Heo > > --- > > mm/slab.h | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/mm/slab.h b/mm/slab.h > > index 4190c24ef0e9..ab3d95bef8a0 100644 > > --- a/mm/slab.h > > +++ b/mm/slab.h > > @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct > > kmem_cache *s, gfp_t flags, > > p[i] = kasan_slab_alloc(s, object, flags); > > } > > > > - if (memcg_kmem_enabled()) > > + if (memcg_kmem_enabled() && > > + ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT))) > > I don't think these extra checks are needed. They are safe but not needed. > > > memcg_kmem_put_cache(s); > > } > > > > -- > > 2.17.1 > > > > thanks, > Shakeel
Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]
On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: > On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: > > > > > In a very cryptic way that requires manual parsing using non-public > > > docs sadly but yes. From the look of it, it's a completion timeout. > > > > > > Looks to me like we don't get a response to a config space access > > > during the change of D state. I don't know if it's the write of the D3 > > > state itself or the read back though (it's probably detected on the > > > read back or a subsequent read, but that doesn't tell me which specific > > > one failed). > > > > If it is just one card doing it (again, check you have latest > > firmware) I wonder if it is a sketchy PCI-E electrical link that is > > causing a long re-training cycle? Can you tell if the PCI-E link is > > permanently gone or does it eventually return? > > No, it's 100% reproducable on systems with that specific card model, > not card instance, and maybe different systems/cards as well, I'll let > David & Alexey comment further on that. Well, it's 100% reproducable on a particular model of system (garrison) with a particular model of card. I've had some suggestions that it fails with some other systems card card models, but nothing confirmed - the one other system model I've been able to try, which also had a newer card model didn't reproduce the problem. > > Does the card work in Gen 3 when it starts? Is there any indication of > > PCI-E link errors? > > Nope. > > > Everytime or sometimes? > > > > POWER 8 firmware is good? If the link does eventually come back, is > > the POWER8's D3 resumption timeout long enough? > > > > If this doesn't lead to an obvious conclusion you'll probably need to > > connect to IBM's Mellanox support team to get more information from > > the card side. > > We are IBM :-) So far, it seems to be that the card is doing something > not quite right, but we don't know what. We might need to engage > Mellanox themselves. Possibly. On the other hand, I've had it reported that this is a software regression at least with downstream red hat kernels. I haven't yet been able to eliminate factors that might be confusing that, or try to find a working version upstream. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()
On (01/08/19 08:44), Helge Deller wrote: > Date: Tue, 8 Jan 2019 08:44:19 +0100 > From: Helge Deller > To: Sergey Senozhatsky , "James E . J . > Bottomley" > Cc: linux-par...@vger.kernel.org, linux-kernel@vger.kernel.org, Sergey > Senozhatsky > Subject: Re: [PATCH] parisc: replace oops_in_progress manipulation with > bust_spinlocks() > Message-ID: > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 > Thunderbird/60.4.0 > > On 07.01.19 10:56, Sergey Senozhatsky wrote: > > Use bust_spinlocks() function to set oops_in_progress. > > > > Signed-off-by: Sergey Senozhatsky > > I've added this patch to the parisc for-next tree. Thanks. BTW, jejb at parisc-linux.org bounces. Shall it be replaced with HansenPartnership.com email address? --- diff --git a/MAINTAINERS b/MAINTAINERS index 7fc7813cd9c2..86b174dbe506 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11511,7 +11511,7 @@ F: Documentation/blockdev/paride.txt F: drivers/block/paride/ PARISC ARCHITECTURE -M: "James E.J. Bottomley" +M: "James E.J. Bottomley" M: Helge Deller L: linux-par...@vger.kernel.org W: http://www.parisc-linux.org/
5.0-rc1 KVM inspired "BUG: Bad page state in process" spew
Greetings, KVM seems to be busted in master ATM. All I have to do to have host start screaming and maybe exploding (if the guest doesn't do so first) is to try to install a (obese in this case) kernel over nfs mount of the host in a guest. Kernel producing the spew below is 3bd6e94, config attached. homer: # grep BUG: /netconsole.log [ 1531.909703] BUG: Bad page state in process X pfn:100491 [ 1531.958141] BUG: Bad page state in process systemd-journal pfn:100412 [ 1532.662359] BUG: Bad page state in process X pfn:10043f [ 1532.664033] BUG: Bad page state in process X pfn:10044d [ 1532.686433] BUG: Bad page state in process systemd-journal pfn:1027b0 [ 1532.687704] BUG: Bad page state in process systemd-journal pfn:1027ba [ 1532.689039] BUG: Bad page state in process systemd-journal pfn:1027de [ 1532.694762] BUG: Bad page state in process systemd-journal pfn:102602 [ 1532.696337] BUG: Bad page state in process systemd-journal pfn:102726 [ 1533.086254] BUG: Bad page state in process CPU 0/KVM pfn:100a48 [ 1533.086869] BUG: Bad page state in process CPU 0/KVM pfn:100a49 [ 1533.087413] BUG: Bad page state in process CPU 0/KVM pfn:100a4a [ 1533.087947] BUG: Bad page state in process CPU 0/KVM pfn:100a4b [ 1533.088477] BUG: Bad page state in process CPU 0/KVM pfn:100a4c [ 1533.089044] BUG: Bad page state in process CPU 0/KVM pfn:100a4d [ 1533.089586] BUG: Bad page state in process CPU 0/KVM pfn:100a4e [ 1533.090121] BUG: Bad page state in process CPU 0/KVM pfn:100a61 [ 1533.090657] BUG: Bad page state in process CPU 0/KVM pfn:100a62 [ 1533.091191] BUG: Bad page state in process CPU 0/KVM pfn:100a63 [ 1533.091739] BUG: Bad page state in process CPU 0/KVM pfn:100a64 [ 1533.092276] BUG: Bad page state in process CPU 0/KVM pfn:100a65 [ 1533.092814] BUG: Bad page state in process CPU 0/KVM pfn:100a66 [ 1533.093348] BUG: Bad page state in process CPU 0/KVM pfn:100a67 [ 1533.093940] BUG: Bad page state in process CPU 0/KVM pfn:1017e8 [ 1533.094512] BUG: Bad page state in process CPU 0/KVM pfn:1017e9 [ 1533.095049] BUG: Bad page state in process CPU 0/KVM pfn:1017ea [ 1533.095585] BUG: Bad page state in process CPU 0/KVM pfn:1017eb [ 1533.096120] BUG: Bad page state in process CPU 0/KVM pfn:1017ee [ 1533.096679] BUG: Bad page state in process CPU 0/KVM pfn:101a20 [ 1533.097221] BUG: Bad page state in process CPU 0/KVM pfn:101a22 [ 1533.097757] BUG: Bad page state in process systemd-journal pfn:101ce4 [ 1533.098162] BUG: Bad page state in process systemd-journal pfn:101ce5 [ 1533.098535] BUG: Bad page state in process CPU 0/KVM pfn:101ce6 [ 1533.099075] BUG: Bad page state in process CPU 0/KVM pfn:101ce7 [ 1533.099667] BUG: Bad page state in process CPU 0/KVM pfn:1024cc [ 1533.101840] BUG: Bad page state in process CPU 0/KVM pfn:1024e9 [ 1533.102379] BUG: Bad page state in process CPU 0/KVM pfn:1024ea [ 1533.102935] BUG: Bad page state in process CPU 0/KVM pfn:1024eb [ 1533.103516] BUG: Bad page state in process CPU 0/KVM pfn:102bc8 [ 1533.105771] BUG: Bad page state in process CPU 0/KVM pfn:102be0 [ 1533.106308] BUG: Bad page state in process CPU 0/KVM pfn:102be1 [ 1533.106875] BUG: Bad page state in process CPU 0/KVM pfn:102be2 [ 1533.107412] BUG: Bad page state in process CPU 0/KVM pfn:102be3 [ 1533.66] BUG: Bad page state in process CPU 0/KVM pfn:102bee config.xz Description: application/xz
Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get
On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote: > > There is an imbalance between when slab_pre_alloc_hook calls > memcg_kmem_get_cache and when slab_post_alloc_hook calls > memcg_kmem_put_cache. > Can you explain how there is an imbalance? If the returned kmem cache from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of memcg is elevated and the memcg_kmem_put_cache() will correctly decrement the refcnt of the memcg. > This can cause a memcg kmem cache to be destroyed right as > an object from that cache is being allocated, which is probably > not good. It could lead to things like a memcg allocating new > kmalloc slabs instead of using freed space in old ones, maybe > memory leaks, and maybe oopses as a memcg kmalloc slab is getting > destroyed on one CPU while another CPU is trying to do an allocation > from that same memcg. > > The obvious fix would be to use the same condition for calling > memcg_kmem_put_cache that we also use to decide whether to call > memcg_kmem_get_cache. > > I am not sure how long this bug has been around, since the last > changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup > kmem charge functions") - merely moved the bug from one location to > another. I am still tagging that changeset, because the fix should > automatically apply that far back. > > Signed-off-by: Rik van Riel > Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions") > Cc: kernel-t...@fb.com > Cc: linux...@kvack.org > Cc: sta...@vger.kernel.org > Cc: Alexey Dobriyan > Cc: Christoph Lameter > Cc: Pekka Enberg > Cc: Andrew Morton > Cc: David Rientjes > Cc: Joonsoo Kim > Cc: Johannes Weiner > Cc: Tejun Heo > --- > mm/slab.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/slab.h b/mm/slab.h > index 4190c24ef0e9..ab3d95bef8a0 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache > *s, gfp_t flags, > p[i] = kasan_slab_alloc(s, object, flags); > } > > - if (memcg_kmem_enabled()) > + if (memcg_kmem_enabled() && > + ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT))) I don't think these extra checks are needed. They are safe but not needed. > memcg_kmem_put_cache(s); > } > > -- > 2.17.1 > thanks, Shakeel
Re: seqcount usage in xt_replace_table()
On Wed, Jan 9, 2019 at 1:36 AM Anatol Pomozov wrote: > > Hello > > On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri > wrote: > > > > Hi Anatol, > > > > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote: > > > Hello folks, > > > > > > A bit of context what I am doing. I am trying to port KTSAN (Kernel > > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage > > > and makes sure it is accessed in a thread-safe manner. > > > > Interesting! FYI, some LKMM's maintainers (Paul included) had and > > continued to have some "fun" discussing topics related to "thread- > > safe memory accesses": I'm sure that they'll be very interested in > > such work of yours and eager to discuss your results. > > Thread Sanitizer is a great tool to find thread-safety issues with > user-space code. The tool been developed by a team of smart people > from Google [1]. > > KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A > bunch of work been done there but the project is still at > proof-of-concept point. > > I am not a part of Google's dynamic tools team. But I've decided to > pick something to do during the New Year holidays so started porting > KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to > fix a few crashes [3]. > > [1] https://github.com/google/sanitizers > [2] https://github.com/google/ktsan/wiki > [3] https://github.com/anatol/linux/tree/ktsan_4.20 For completeness, at the time we also had to add read_seqcount_cancel() function to dynamically mark all seqcount read regions. It can be used here too, start read region and immediately end it. Less clear than raw_read_seqcount_nocritical(), but also less API surface. /** * read_seqcount_cancel - cancel a seq-read critical section * @s: pointer to seqcount_t * * This is a no-op except for ktsan, it needs to know scopes of seq-read * critical sections. The sections are denoted either by begin->retry or * by begin->cancel. */ static inline void read_seqcount_cancel(const seqcount_t *s) { ktsan_seqcount_end(s); } https://github.com/google/ktsan/blob/tsan/include/linux/seqlock.h#L235-L246
Re: [PATCH RFC 3/5] dt-bindings: Add PDC timer bindings for Qualcomm SoCs
On 1/7/2019 9:47 PM, Raju P L S S S N wrote: On 1/4/2019 2:49 AM, Stephen Boyd wrote: Quoting Raju P L S S S N (2019-01-03 04:22:58) On 12/29/2018 3:08 AM, Stephen Boyd wrote: Quoting Raju P L S S S N (2018-12-26 01:44:43) There are two RSC devices in SoC one for application processor subsystem & other display subsystem. Both RSC contain registers for PDC timers (one for each subsystem). When is the timer programmed on the display subsystem's RSC? It's hard to give advice without all the information. For display subsystem RSC, hardware sleep solver takes care of timer programming for wakeup when the subsystem goes to Power collapse. So the display subsystem doesn't need to program their PDC in their RSC from software? Yes. I would think that it would make sense for the application processor's RSC timer to be programmed from the broadcast timer mechanism in the kernel so that timers during idle work and suspend turns off the timer appropriately with a shutdown hook. I guess the PDC can't tell you the time though? It looks like a shadow (and limited) version of the ARM architected MMIO timer that we already program for the broadcast timer mechanism. Is that because even the MMIO timer can't wakeup the system in deep idle? Assuming that's true, it means the ARM MMIO timer can't always be used as the system wide broadcast mechanism because we need to augment it with the PDC timer to get the actual wakeup. Yes. this is correct. Maybe we should be adding hooks into the broadcast timer mechanism to program this wakeup event hardware in addition to the ARM MMIO timer. Or we should stop using the ARM MMIO timer on these systems and read the system register based physical time in the RSC timer driver and register this 64-bit PDC register as the broadcast timer. So the time reading would be through sysreg and the wakeup programming would be done by writing the PDC timer. The assumption would be that we have access to the physical time registers (which sounds like the assumption we have to make). There are no physical timer registers available in RSC for this purpose. Do we get an interrupt somewhere from the RSC hardware when the timer fires? Or does that just cause a system wakeup event without any pending irq and then another irq (like the ARM architected timer) just happens to be pending around the same time? If we get an interrupt somehow then I would prefer to drop the ARM MMIO timer and do this hybrid broadcast timer approach. There is no interrupt for PDC timeout. It just causes system wakeup without a pending irq. ARM MMIO is necessary for irq. Does that system wakeup cause the CPUs to be enabled? So the sysreg based timer in the CPU would start working? Or does it only make the system come out of a deep enough idle state to make an always on domain work that only contains the MMIO based ARM architected timer? It only makes the system come out of deep sleep state for MMIO timer to function. I'd hope that each RSC's PDC timer wakes up the owner of the RSC so that we can use the sysreg based timers and ignore the MMIO based timers here. This isn't a very important distinction to make though, so if you have to use the MMIO timers then it just means some extra DT things need to be done to relate the MMIO timers with the RSC that has the timer that needs to be programmed too. Either way we would need a way to either hook the ARM architected timer driver in the kernel, or reimplement the bit of code needed to implement the clockevent based on the ARM architected timer that programs the ARM timer and the PDC timer together. Could you please provide some more details on the implementation? Hi Stephen, Regardless of implementation options about application processor subsytem PDC timer, I think there is a need to differentiate the fact that for application processor subsystem PDC timer programming is done by SW but not for display processor subsystem as HW sleep solver takes care of PDC timer during sleep entry/exit. How about having a dt property like qcom,pdc-timer-mode = "solver"/"sw" ? The current patch introduced client based model using regmap to achieve this differentiation between these two subsystems. By using the dt property, once rpmh-src driver instance for application subsystem can have PDC timer programing implemented. Let me know if there is another way. For implementation of PDC timer, I see the following based on above discussion - 1. Take the existing cpu_pm_notify approach and move the current series approach of programing PDC timer registers to RSC driver. This wouldn't involve any changes in clock_event_framework/broadcast framework. 2. Add api hooks (like reading the next wake up programmed) to ARM architecture timer driver so that the value is copied to PDC timer using the api with in RSC driver based on cpu_pm_notifications. 3. Changes in clockevent to program both ARM mem timer & PDC timer together. Could
[RFC PATCH v2 4/4] Documentation/device-mapper: add optional parameter reinit
From: Huaisheng Ye Add intro and usage guide for reinit. Signed-off-by: Huaisheng Ye --- Documentation/device-mapper/writecache.txt | 4 1 file changed, 4 insertions(+) diff --git a/Documentation/device-mapper/writecache.txt b/Documentation/device-mapper/writecache.txt index 01532b3..255c68c 100644 --- a/Documentation/device-mapper/writecache.txt +++ b/Documentation/device-mapper/writecache.txt @@ -45,6 +45,10 @@ Constructor parameters: afterwards - some underlying devices perform better with fua, some with nofua. The user should test it + reinit (by default off) + applicable only to persistent memory - use the REINIT flag + when the surper block has messy data, that would cause fn ctr + failed to work with invalid magic or version in the superblock Status: 1. error indicator - 0 if there was no error, otherwise error number -- 1.8.3.1
[RFC PATCH v2 1/4] dm-writecache: remove unused size to writecache_flush_region
From: Huaisheng Ye writecache_flush_region doesn't use size to calculate flush region. That uses _set_bits to mark the region in dirty_bitmap directly. Signed-off-by: Huaisheng Ye --- drivers/md/dm-writecache.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 2d50eec..2d8e0c0 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -418,7 +418,7 @@ static void writecache_flush_all_metadata(struct dm_writecache *wc) memset(wc->dirty_bitmap, -1, wc->dirty_bitmap_size); } -static void writecache_flush_region(struct dm_writecache *wc, void *ptr, size_t size) +static void writecache_flush_region(struct dm_writecache *wc, void *ptr) { if (!WC_MODE_PMEM(wc)) __set_bit(((char *)ptr - (char *)wc->memory_map) / BITMAP_GRANULARITY, @@ -657,7 +657,7 @@ static void writecache_free_entry(struct dm_writecache *wc, struct wc_entry *e) writecache_unlink(wc, e); writecache_add_to_freelist(wc, e); clear_seq_count(wc, e); - writecache_flush_region(wc, memory_entry(wc, e), sizeof(struct wc_memory_entry)); + writecache_flush_region(wc, memory_entry(wc, e)); if (unlikely(waitqueue_active(>freelist_wait))) wake_up(>freelist_wait); } @@ -687,9 +687,9 @@ static void writecache_poison_lists(struct dm_writecache *wc) static void writecache_flush_entry(struct dm_writecache *wc, struct wc_entry *e) { - writecache_flush_region(wc, memory_entry(wc, e), sizeof(struct wc_memory_entry)); + writecache_flush_region(wc, memory_entry(wc, e)); if (WC_MODE_PMEM(wc)) - writecache_flush_region(wc, memory_data(wc, e), wc->block_size); + writecache_flush_region(wc, memory_data(wc, e)); } static bool writecache_entry_is_committed(struct dm_writecache *wc, struct wc_entry *e) @@ -733,7 +733,7 @@ static void writecache_flush(struct dm_writecache *wc) wc->seq_count++; pmem_assign(sb(wc)->seq_count, cpu_to_le64(wc->seq_count)); - writecache_flush_region(wc, (wc)->seq_count, sizeof sb(wc)->seq_count); + writecache_flush_region(wc, (wc)->seq_count); writecache_commit_flushed(wc); wc->overwrote_committed = false; @@ -1757,7 +1757,7 @@ static int init_memory(struct dm_writecache *wc) writecache_flush_all_metadata(wc); writecache_commit_flushed(wc); pmem_assign(sb(wc)->magic, cpu_to_le32(MEMORY_SUPERBLOCK_MAGIC)); - writecache_flush_region(wc, (wc)->magic, sizeof sb(wc)->magic); + writecache_flush_region(wc, (wc)->magic); writecache_commit_flushed(wc); return 0; -- 1.8.3.1
[RFC PATCH v2 2/4] dm-writecache: get rid of memory_data flush to writecache_flush_entry
From: Huaisheng Ye writecache_flush_region only works when SSD mode. If wc->pmem_mode sets, writecache_flush_region doesn't need to be called in writecache_flush_entry. Signed-off-by: Huaisheng Ye --- drivers/md/dm-writecache.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 2d8e0c0..c69317c 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -688,8 +688,6 @@ static void writecache_poison_lists(struct dm_writecache *wc) static void writecache_flush_entry(struct dm_writecache *wc, struct wc_entry *e) { writecache_flush_region(wc, memory_entry(wc, e)); - if (WC_MODE_PMEM(wc)) - writecache_flush_region(wc, memory_data(wc, e)); } static bool writecache_entry_is_committed(struct dm_writecache *wc, struct wc_entry *e) -- 1.8.3.1
[RFC PATCH v2 3/4] dm-writecache: expand pmem_reinit for struct dm_writecache
From: Huaisheng Ye When use persistent memory as cache data device, sometimes the super block of pmem has messy data stored in it. That would have risk to lead fn ctr failed to work due to invalid magic or version. Here we expand pmem_reinit to optional parameters in order to solve this issue. When user gets pmem device, which has unrelated data in it, as cache device, he should use paramenter 'reinit' to avoid s.magic and s.version don't equal to NULL or correct MEMORY_SUPERBLOCK_MAGIC/VERSION. Signed-off-by: Huaisheng Ye --- drivers/md/dm-writecache.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index c69317c..2c1e825 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -149,6 +149,7 @@ struct dm_writecache { bool pmem_mode:1; bool writeback_fua:1; + bool pmem_reinit:1; bool overwrote_committed:1; bool memory_vmapped:1; @@ -2026,6 +2027,10 @@ static int writecache_ctr(struct dm_target *ti, unsigned argc, char **argv) wc->writeback_fua = false; wc->writeback_fua_set = true; } else goto invalid_optional; + } else if (!strcasecmp(string, "reinit")) { + if (WC_MODE_PMEM(wc)) + wc->pmem_reinit = true; + else goto invalid_optional; } else { invalid_optional: r = -EINVAL; @@ -2127,7 +2132,7 @@ static int writecache_ctr(struct dm_target *ti, unsigned argc, char **argv) ti->error = "Hardware memory error when reading superblock"; goto bad; } - if (!le32_to_cpu(s.magic) && !le32_to_cpu(s.version)) { + if (wc->pmem_reinit || (!le32_to_cpu(s.magic) && !le32_to_cpu(s.version))) { r = init_memory(wc); if (r) { ti->error = "Unable to initialize device"; -- 1.8.3.1
[RFC PATCH v2 0/4] add parameter for pmem cache device init
From: Huaisheng Ye This patch set could be used for dm-writecache when use persistent memory as cache data device. Patch 1 and 2 go towards removing unused parameter and codes which actually doesn't really work. Patch 3 and 4 are targeted at solving problem fn ctr failed to work due to invalid magic or version, which is caused by the super block of pmem has messy data stored. Changes Since v1: - add optional parameter reinit to avoid invalid magic or version. [1]: https://lkml.org/lkml/2019/1/3/43 Huaisheng Ye (4): dm-writecache: remove unused size to writecache_flush_region dm-writecache: get rid of memory_data flush to writecache_flush_entry dm-writecache: expand pmem_reinit for struct dm_writecache Documentation/device-mapper: add optional parameter reinit Documentation/device-mapper/writecache.txt | 4 drivers/md/dm-writecache.c | 19 +++ 2 files changed, 15 insertions(+), 8 deletions(-) -- 1.8.3.1
RE: x86/sgx: uapi change proposal
> On Tue, Jan 08, 2019 at 11:27:11AM -0800, Huang, Kai wrote: > > > > > > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just > > > > opening a new instance of /dev/sgx for each encalve? > > > > > > Directly associating /dev/sgx with an enclave means /dev/sgx can't > > > be used to provide ioctl()'s for other SGX-related needs, e.g. to > > > mmap() raw EPC and expose it a VM. Proposed layout in the link > > > below. I'll also respond to Jarkko's question about exposing EPC > > > through /dev/sgx instead of having KVM allocate it on behalf of the VM. > > > > > > https://lkml.kernel.org/r/20181218185349.gc30...@linux.intel.com > > > > Hi Sean, > > > > Sorry for replying to old email. But IMHO it is not a must that Qemu > > needs to open some /dev/sgx and allocate/mmap EPC for guest's virtual > > EPC slot, instead, KVM could create private slot, which is not visible > > to Qemu, for virtual EPC, and KVM could call core-SGX EPC allocation > > API directly. > > That's possible, but it has several downsides. > > - Duplicates a lot of code in KVM for managing memory regions. I don't see why there will be duplicated code. you can simply call __x86_set_memory_region to create private slot. It is KVM x86 equivalent to KVM_SET_USER_MEMORY_REGION from userspace. The only difference is Qemu is not aware of the private slot. > - Artificially restricts userspace to a single EPC region, unless > even more code is duplicated to handle multiple private regions. You can have multiple private slots, by calling __x86_set_memory_region for each EPC section. KVM receives EPC section/sections info from Qemu, via CPUID, or dedicated IOCTL (is this you are going to add?), and simply creates private EPC slot/slots. > - Requires additional ioctls() or capabilities to probe EPC support No. EPC info is from Qemu at the beginning (size is given by parameter, base is calculated by Qemu), and actually it is Qemu notifies KVM EPC info, so I don't think we require additional ioctls or capabilities here. > - Does not fit with Qemu/KVM's memory model, e.g. all other types of > memory are exposed to a guest through > KVM_SET_USER_MEMORY_REGION. EPC is different. I am not sure whether EPC needs to fit such model. There are already examples in KVM which uses private slot w/o using KVM_SET_USER_MEMORY_REGION, for example, APIC access page. > - Prevents userspace from debugging a guest's enclave. I'm not saying > this is a likely scenario, but I also don't think we should preclude > it without good reason. I am not sure how important it is, so don't know whether this can be a justification. To me we don't need to consider this. Qemu normally doesn't access guest memory unless it has to (ie, for device model). > - KVM is now responsible for managing the lifecycle of EPC, e.g. what > happens if an EPC cgroup limit is lowered on a running VM and > KVM can't gracefully reclaim EPC? The userspace hypervisor should > ultimately decide how to handle such an event. Even using KVM_SET_USER_MEMORY_REGION, KVM is also responsible for managing the lifecycle of EPC, no? And "managing the lifecycle of EPC" you mean doesn't need to be "managing EPC itself" I suppose, since EPC should always be managed by core SGX code. I don't see the difference between private slot and KVM_SET_USER_MEMORY_REGION, in terms of how does KVM reclaim EPC, or how does KVM do when it fails to reclaim EPC. > - SGX logic is split between SGX and KVM, e.g. VA page management for > oversubscription will likely be common to SGX and KVM. From a long > term maintenance perspective, this means that changes to the EPC > management could potentially need to be Acked by KVM, and vice versa. I think most of the code should be in core SGX code under x86. KVM should only have the code that is specifically related to virtualization, ie, ENCLV. VA page allocation should be under code SGX code. KVM might need to call function such as alloc_va_page, etc, but this is not a problem. There are many other cases now. And this is not related to private slot vs KVM_SET_USER_MEMORY_REGION. > > > I am not sure what's the good of allowing userspace to alloc/mmap a > > raw EPC region? Userspace is not allowed to touch EPC anyway, expect > > enclave code. > > > > To me KVM creates private EPC slot is cleaner than exposing > > /dev/sgx/epc and allowing userspace to map some raw EPC region. > > Cleaner in the sense that it's faster to get basic support up and running > since > there are fewer touchpoints, but there are long term ramifications to > cramming EPC management in KVM. > > And at this point I'm not stating any absolutes, e.g. how EPC will be handled > by KVM. What I'm pushing for is to not eliminate the possibility of having > the SGX subsystem own all EPC management, e.g. don't tie /dev/sgx to a > single enclave. I suppose "SGX subsystem" you mean here is core SGX code under x86.
Re: [PATCH v3] loop: drop caches if offset or block_size are changed
Hi Jens, May I ask whether this patch is acceptable? Thanks, On 12/18, Jaegeuk Kim wrote: > If we don't drop caches used in old offset or block_size, we can get old data > from new offset/block_size, which gives unexpected data to user. > > For example, Martijn found a loopback bug in the below scenario. > 1) LOOP_SET_FD loads first two pages on loop file > 2) LOOP_SET_STATUS64 changes the offset on the loop file > 3) mount is failed due to the cached pages having wrong superblock > > Cc: Jens Axboe > Cc: linux-bl...@vger.kernel.org > Reported-by: Martijn Coenen > Signed-off-by: Jaegeuk Kim > --- > v2 -> v3: > - avoid to submit IOs on frozen mq > > Jens, how about this? > > Thanks, > > drivers/block/loop.c | 29 +++-- > 1 file changed, 27 insertions(+), 2 deletions(-) > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index cb0cc8685076..6b03121d62aa 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -1126,6 +1126,12 @@ loop_set_status(struct loop_device *lo, const struct > loop_info64 *info) > if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) > return -EINVAL; > > + if (lo->lo_offset != info->lo_offset || > + lo->lo_sizelimit != info->lo_sizelimit) { > + sync_blockdev(lo->lo_device); > + kill_bdev(lo->lo_device); > + } > + > /* I/O need to be drained during transfer transition */ > blk_mq_freeze_queue(lo->lo_queue); > > @@ -1154,6 +1160,11 @@ loop_set_status(struct loop_device *lo, const struct > loop_info64 *info) > > if (lo->lo_offset != info->lo_offset || > lo->lo_sizelimit != info->lo_sizelimit) { > + /* kill_bdev should have truncated all the pages */ > + if (lo->lo_device->bd_inode->i_mapping->nrpages) { > + err = -EAGAIN; > + goto exit; > + } > if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) { > err = -EFBIG; > goto exit; > @@ -1375,22 +1386,36 @@ static int loop_set_dio(struct loop_device *lo, > unsigned long arg) > > static int loop_set_block_size(struct loop_device *lo, unsigned long arg) > { > + int err = 0; > + > if (lo->lo_state != Lo_bound) > return -ENXIO; > > if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg)) > return -EINVAL; > > + if (lo->lo_queue->limits.logical_block_size != arg) { > + sync_blockdev(lo->lo_device); > + kill_bdev(lo->lo_device); > + } > + > blk_mq_freeze_queue(lo->lo_queue); > > + /* kill_bdev should have truncated all the pages */ > + if (lo->lo_queue->limits.logical_block_size != arg && > + lo->lo_device->bd_inode->i_mapping->nrpages) { > + err = -EAGAIN; > + goto out; > + } > + > blk_queue_logical_block_size(lo->lo_queue, arg); > blk_queue_physical_block_size(lo->lo_queue, arg); > blk_queue_io_min(lo->lo_queue, arg); > loop_update_dio(lo); > - > +out: > blk_mq_unfreeze_queue(lo->lo_queue); > > - return 0; > + return err; > } > > static int lo_ioctl(struct block_device *bdev, fmode_t mode, > -- > 2.19.0.605.g01d371f741-goog
Re: __get_user slower than get_user (was Re: [RFC PATCH V3 0/5] Hi:)
On Tue, Jan 8, 2019 at 8:31 PM Michael S. Tsirkin wrote: > > Linus, given that you just changed all users of access_ok anyway, do > you still think that the access_ok() conversion to return a speculation > sanitized pointer or NULL is too big a conversion? I didn't actually change a single access_ok(). I changed the (very few) users of "user_access_begin()" to do an access_ok() in them. There were 8 of them total. It turns out that two of those cases (the strn*_user() ones) found bugs in the implementation of access_ok() of two architectures, and then looking at the others found that six more architectures also had problems, but those weren't actually because of any access_ok() changes, they were pre-existing issues. So we definitely had unfortunate bugs in access_ok(), but they were mostly the benign kind (ir the "use arguments twice - a real potential bug, but not one that actually likely makes any difference to existing users) Changing all 600+ users of access_ok() would be painful. That said, one thing I *would* like to do is to just get rid of __get_user() and __put_user() entirely. Or rather, just make them do exactly the same thing that the normal "get_user()"/"put_user()" functions do. And then, _within_ the case of get_user()/put_user(), doing the access_ok() as a data dependency rather than a lfence should be easy enough. Linus