date:20190108

Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()

2019-01-08 Thread Sergey Senozhatsky

On (01/09/19 08:16), Helge Deller wrote:
> I prefer that change.
> In my role as admin of the parisc-linux.org domain I plan to retire
> all @parisc-linux.org email addresses soon anyway.

I see. Well, I can send a formal patch, but I think James
would know better which one of his email addresses he'd
prefer to use; so it seems that a patch from James would
be more appropriate.

-ss

Re: [PATCH v5 2/2] dt-bindings: spi: Document Renesas R-Car RPC-IF controller bindings

2019-01-08 Thread Geert Uytterhoeven

Hi Mason,

On Tue, Jan 8, 2019 at 5:17 AM Mason Yang  wrote:
> Document the bindings used by the Renesas R-Car Gen3 RPC-IF controller.
>
> Signed-off-by: Mason Yang 

Thanks for your patch!

> --- /dev/null
> +++ b/Documentation/devicetree/bindings/spi/spi-renesas-rpc.txt
> @@ -0,0 +1,37 @@
> +Renesas R-Car Gen3 RPC-IF controller Device Tree Bindings
> +--
> +
> +Required properties:
> +- compatible: should be "renesas,r8a77995-rpc"

Would it make sense to have a family-specific fallback "renesas,rcar-gen3-rpc",
bseides the SoC-specific compatible value?


> +- #address-cells: should be 1
> +- #size-cells: should be 0
> +- reg: should contain 2 entries, one for the registers and one for the direct
> +   mapping area
> +- reg-names: should contain "regs" and "dirmap"
> +- clock-names: should contain "rpc"
> +- clocks: should contain 1 entries for the module's clock

Doesn't the driver have a need to access the RPCD2 clock?
At least on R-Car V3M, it needs to program the Divider Clock Register
(DIVREG).

> +- renesas,rpc-mode: should contain "spi" for rpc spi mode or
> +  "hyperflash" for rpc hyperflash mode.

Can't this be derived from the flash subnode?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: About some mail lost and random unsubscribe issues on this dmaengine list

2019-01-08 Thread Yang, Shunyong

Hi, Vinod,
  Thanks for your information. I will check with our IT.
Shunyong.
Thanks.

On 2019/1/9 15:00, Vinod Koul wrote:
> On 09-01-19, 01:35, Yang, Shunyong wrote:
>> Hi, Vinod and All,
>>   I found, somtimes, I cannot received some mails from this mailing
>> list. For example, I sent my v2 series on 7th, Jan,
>>
>> https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713
>>
>> It appeared on patchwork.kernel.org but I haven't received it in this
>> mailing list.
>>
>> And I found randomly, a month or two months, I am unsubscribed from this
>> list automatically. Then I have to re-subscribe the list.
> 
> I guess your corporate server is to blame. vger mail list is "know" to 
> unsubscribe servers which behave badly, so you need to ask your mail
> admins why and try to use other email which is known to be friendly with
> vger :)
> 
>> Have someone met similar problem as me? Or maybe I missed some necessary
>> settings?
>>
>> Shunyong.
>> Thanks.
>

[PATCH v2] sched: fix a potential double-fetch bug in sched_copy_attr

2019-01-08 Thread Kangjie Lu

"uattr->size" is copied in from user space and checked. However, it is
copied in again after the security check. A malicious user may race to
change it. The fix sets uattr->size to be the checked size.

Signed-off-by: Kangjie Lu 
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6fedf3a98581..e868cc25ac2a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4499,6 +4499,9 @@ static int sched_copy_attr(struct sched_attr __user 
*uattr, struct sched_attr *a
if (ret)
return -EFAULT;
 
+   /* In case attr->size was changed in the user space */
+   attr->size = size;
+
/*
 * XXX: Do we want to be lenient like existing syscalls; or do we want
 * to be strict and return an error on out-of-bounds values?
-- 
2.17.1

Re: [RESEND 1/1] gic: its: Make sure a LPI is discarded before free.

2019-01-08 Thread Marc Zyngier

On Wed, 9 Jan 2019 11:53:27 +0800
Zhao Yuanyuan  wrote:

Hi Zhao,

> Its device will be removed after all events be freed.
> Undisarded events can lead to unpredictable behaviar.
> 
> Signed-off-by: Zhao Yuanyuan 
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index db20e99..4fee008 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2572,6 +2572,10 @@ static void its_irq_domain_free(struct irq_domain 
> *domain, unsigned int virq,
>   virq + i);
>   u32 event = its_get_event_id(data);
>  
> + /* Discard irq before free */
> + if (irqd_is_activated(d))
> + its_send_discard(its_dev, event);
> +
>   /* Mark interrupt index as unused */
>   clear_bit(event, its_dev->event_map.lpi_map);
>  

But we already do send a discard on deactivate, which logically happens
before we free the domain. So what are you fixing here?

Thanks,

M.
-- 
Without deviation from the norm, progress is not possible.

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt

On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote:
> "A PCI completion timeout occurred for an outstanding PCI-E transaction"
> it is.
> 
> This is how I bind the device to vfio:
> 
> echo vfio-pci > '/sys/bus/pci/devices/:01:00.0/driver_override'
> echo vfio-pci > '/sys/bus/pci/devices/:01:00.1/driver_override'
> echo ':01:00.0' > '/sys/bus/pci/devices/:01:00.0/driver/unbind'
> echo ':01:00.1' > '/sys/bus/pci/devices/:01:00.1/driver/unbind'
> echo ':01:00.0' > /sys/bus/pci/drivers/vfio-pci/bind
> echo ':01:00.1' > /sys/bus/pci/drivers/vfio-pci/bind
> 
> 
> and I noticed that EEH only happens with the last command. The order
> (.0,.1  or .1,.0) does not matter, it seems that putting one function to
> D3 is fine but putting another one when the first one is already in D3 -
> produces EEH. And I do not recall ever seeing this on the firestone
> machine. Weird.

Putting all functions into D3 is what allows the device to actually go
into D3.

Does it work with other devices ? We do have that bug on early P9
revisions where the attempt of bringing the link to L1 as part of the
D3 process fails in horrible ways, I thought P8 would be ok but maybe
not ...

Otherwise, it might be that our timeouts are too low (you may want to
talk to our PCIe guys internally)

Cheers,
Ben.

Re: [PATCH v3 2/4] Input: goodix - Add AVDD28-supply regulator support

2019-01-08 Thread Dmitry Torokhov

Hi Jagan,

On Sat, Dec 15, 2018 at 08:48:00PM +0530, Jagan Teki wrote:
> Goodix CTP controllers have AVDD28 pin connected to voltage
> regulator which may not be turned on by default, like for GT5663.
> 
> Add support for such ctp used boards by adding voltage regulator
> handling code to goodix ctp driver.
> 
> Signed-off-by: Jagan Teki 
> ---
>  drivers/input/touchscreen/goodix.c | 33 +-
>  1 file changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/input/touchscreen/goodix.c 
> b/drivers/input/touchscreen/goodix.c
> index f2d9c2c41885..7371f6946098 100644
> --- a/drivers/input/touchscreen/goodix.c
> +++ b/drivers/input/touchscreen/goodix.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -47,6 +48,7 @@ struct goodix_ts_data {
>   struct touchscreen_properties prop;
>   unsigned int max_touch_num;
>   unsigned int int_trigger_type;
> + struct regulator *avdd28;
>   struct gpio_desc *gpiod_int;
>   struct gpio_desc *gpiod_rst;
>   u16 id;
> @@ -786,25 +788,41 @@ static int goodix_ts_probe(struct i2c_client *client,
>   if (error)
>   return error;
>  
> + ts->avdd28 = devm_regulator_get(>dev, "AVDD28");
> + if (IS_ERR(ts->avdd28)) {
> + error = PTR_ERR(ts->avdd28);
> + if (error != -EPROBE_DEFER)
> + dev_err(>dev,
> + "Failed to get AVDD28 regulator: %d\n", error);
> + return error;
> + }
> +
> + /* power the controller */
> + error = regulator_enable(ts->avdd28);
> + if (error) {
> + dev_err(>dev, "Controller fail to enable AVDD28\n");
> + return error;
> + }
> +
>   if (ts->gpiod_int && ts->gpiod_rst) {
>   /* reset the controller */
>   error = goodix_reset(ts);
>   if (error) {
>   dev_err(>dev, "Controller reset failed.\n");
> - return error;
> + goto error;
>   }
>   }
>  
>   error = goodix_i2c_test(client);
>   if (error) {
>   dev_err(>dev, "I2C communication failure: %d\n", error);
> - return error;
> + goto error;
>   }
>  
>   error = goodix_read_version(ts);
>   if (error) {
>   dev_err(>dev, "Read version failed.\n");
> - return error;
> + goto error;
>   }
>  
>   ts->chip = goodix_get_chip_data(ts->id);
> @@ -823,23 +841,28 @@ static int goodix_ts_probe(struct i2c_client *client,
>   dev_err(>dev,
>   "Failed to invoke firmware loader: %d\n",
>   error);
> - return error;
> + goto error;
>   }
>  
>   return 0;
>   } else {
>   error = goodix_configure_dev(ts);
>   if (error)
> - return error;
> + goto error;
>   }
>  
>   return 0;
> +
> +error:
> + regulator_disable(ts->avdd28);
> + return error;
>  }
>  
>  static int goodix_ts_remove(struct i2c_client *client)
>  {
>   struct goodix_ts_data *ts = i2c_get_clientdata(client);
>  
> + regulator_disable(ts->avdd28);

This may be disabling the regulator too early. Please use
devm_add_action_or_reset() to install a custom devm handler that would
disable the regulator in line with the rest of devm unwinding flow.

>   if (ts->gpiod_int && ts->gpiod_rst)
>   wait_for_completion(>firmware_loading_complete);
>  
> -- 
> 2.18.0.321.gffc6fa0e3
> 

Thanks.

-- 
Dmitry

Re: [PATCH v3 1/4] dt-bindings: input: touchscreen: goodix: Document AVDD28-supply property

2019-01-08 Thread Dmitry Torokhov

On Sat, Dec 15, 2018 at 08:47:59PM +0530, Jagan Teki wrote:
> Most of the Goodix CTP controllers are supply with AVDD28 pin.
> which need to supply for controllers like GT5663 on some boards
> to trigger the power.
> 
> So, document the supply property so-that the require boards
> that used on GT5663 can enable it via device tree.
> 
> Signed-off-by: Jagan Teki 
> ---
>  Documentation/devicetree/bindings/input/touchscreen/goodix.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt 
> b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> index f7e95c52f3c7..c4622c983e08 100644
> --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.txt
> @@ -23,6 +23,7 @@ Optional properties:
>   - touchscreen-inverted-y  : Y axis is inverted (boolean)
>   - touchscreen-swapped-x-y : X and Y axis are swapped (boolean)
>   (swapping is done after inverting the axis)
> + - AVDD28-supply : Analog power supply regulator on AVDD28 pin

I think we normally use lower case in DT bindings and rarely encode
voltage in the supply name unless we are dealing with several supplies
of the same kind, but I'll let Ron comment on this.

Thanks.

-- 
Dmitry

Re: [PATCH v5 4/8] Input: stmpe-ts: preparations for STMPE ADC driver

2019-01-08 Thread Dmitry Torokhov

On Fri, Dec 21, 2018 at 02:46:33PM +0100, Philippe Schenker wrote:
> From: Philippe Schenker 
> 
> This patch removes common ADC settings in favor to use
> stmpe811_adc_common_init that is present in MFD. This is necessary in
> preparation for the stmpe-adc driver, because those two drivers have
> common settings for the ADC.
> 
> Signed-off-by: Philippe Schenker 

Acked-by: Dmitry Torokhov 

> 
> ---
> 
> Changes in v5:
>  - Changed author of commit to use correct email.
> 
> Changes in v4:
>  - New patch: Split changes in stmpe-ts.c to a separate commit
>  - Remove common adc settings from init and call the
>stmpe811_adc_common_init function
> 
> Changes in v3:
>  - Undo ADC-settings related code-deletions in stmpe-ts.c that the code
>is backwards-compatible to older devicetrees.
> 
> Changes in v2: None
> 
>  drivers/input/touchscreen/stmpe-ts.c | 42 +---
>  1 file changed, 7 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/input/touchscreen/stmpe-ts.c 
> b/drivers/input/touchscreen/stmpe-ts.c
> index c5d9006588a2..cf9c9aa39f6e 100644
> --- a/drivers/input/touchscreen/stmpe-ts.c
> +++ b/drivers/input/touchscreen/stmpe-ts.c
> @@ -30,8 +30,6 @@
>   * with touchscreen controller
>   */
>  #define STMPE_REG_INT_STA0x0B
> -#define STMPE_REG_ADC_CTRL1  0x20
> -#define STMPE_REG_ADC_CTRL2  0x21
>  #define STMPE_REG_TSC_CTRL   0x40
>  #define STMPE_REG_TSC_CFG0x41
>  #define STMPE_REG_FIFO_TH0x4A
> @@ -58,15 +56,6 @@
>   * @idev: registered input device
>   * @work: a work item used to scan the device
>   * @dev: a pointer back to the MFD cell struct device*
> - * @sample_time: ADC converstion time in number of clock.
> - * (0 -> 36 clocks, 1 -> 44 clocks, 2 -> 56 clocks, 3 -> 64 clocks,
> - * 4 -> 80 clocks, 5 -> 96 clocks, 6 -> 144 clocks),
> - * recommended is 4.
> - * @mod_12b: ADC Bit mode (0 -> 10bit ADC, 1 -> 12bit ADC)
> - * @ref_sel: ADC reference source
> - * (0 -> internal reference, 1 -> external reference)
> - * @adc_freq: ADC Clock speed
> - * (0 -> 1.625 MHz, 1 -> 3.25 MHz, 2 || 3 -> 6.5 MHz)
>   * @ave_ctrl: Sample average control
>   * (0 -> 1 sample, 1 -> 2 samples, 2 -> 4 samples, 3 -> 8 samples)
>   * @touch_det_delay: Touch detect interrupt delay
> @@ -88,10 +77,6 @@ struct stmpe_touch {
>   struct input_dev *idev;
>   struct delayed_work work;
>   struct device *dev;
> - u8 sample_time;
> - u8 mod_12b;
> - u8 ref_sel;
> - u8 adc_freq;
>   u8 ave_ctrl;
>   u8 touch_det_delay;
>   u8 settling;
> @@ -192,7 +177,7 @@ static irqreturn_t stmpe_ts_handler(int irq, void *data)
>  static int stmpe_init_hw(struct stmpe_touch *ts)
>  {
>   int ret;
> - u8 adc_ctrl1, adc_ctrl1_mask, tsc_cfg, tsc_cfg_mask;
> + u8 tsc_cfg, tsc_cfg_mask;
>   struct stmpe *stmpe = ts->stmpe;
>   struct device *dev = ts->dev;
>  
> @@ -202,22 +187,9 @@ static int stmpe_init_hw(struct stmpe_touch *ts)
>   return ret;
>   }
>  
> - adc_ctrl1 = STMPE_SAMPLE_TIME(ts->sample_time) |
> - STMPE_MOD_12B(ts->mod_12b) | STMPE_REF_SEL(ts->ref_sel);
> - adc_ctrl1_mask = STMPE_SAMPLE_TIME(0xff) | STMPE_MOD_12B(0xff) |
> -  STMPE_REF_SEL(0xff);
> -
> - ret = stmpe_set_bits(stmpe, STMPE_REG_ADC_CTRL1,
> - adc_ctrl1_mask, adc_ctrl1);
> - if (ret) {
> - dev_err(dev, "Could not setup ADC\n");
> - return ret;
> - }
> -
> - ret = stmpe_set_bits(stmpe, STMPE_REG_ADC_CTRL2,
> - STMPE_ADC_FREQ(0xff), STMPE_ADC_FREQ(ts->adc_freq));
> + ret = stmpe811_adc_common_init(stmpe);
>   if (ret) {
> - dev_err(dev, "Could not setup ADC\n");
> + stmpe_disable(stmpe, STMPE_BLOCK_TOUCHSCREEN | STMPE_BLOCK_ADC);
>   return ret;
>   }
>  
> @@ -295,13 +267,13 @@ static void stmpe_ts_get_platform_info(struct 
> platform_device *pdev,
>  
>   if (np) {
>   if (!of_property_read_u32(np, "st,sample-time", ))
> - ts->sample_time = val;
> + ts->stmpe->sample_time = val;
>   if (!of_property_read_u32(np, "st,mod-12b", ))
> - ts->mod_12b = val;
> + ts->stmpe->mod_12b = val;
>   if (!of_property_read_u32(np, "st,ref-sel", ))
> - ts->ref_sel = val;
> + ts->stmpe->ref_sel = val;
>   if (!of_property_read_u32(np, "st,adc-freq", ))
> - ts->adc_freq = val;
> + ts->stmpe->adc_freq = val;
>   if (!of_property_read_u32(np, "st,ave-ctrl", ))
>   ts->ave_ctrl = val;
>   if (!of_property_read_u32(np, "st,touch-det-delay", ))
> -- 
> 2.19.2
> 

-- 
Dmitry

[PATCH] kobject: make kset_get_ownership() 'static'

2019-01-08 Thread Eric Biggers

From: Eric Biggers 

kset_get_ownership() is only used in lib/kobject.c, so make it 'static'.

Signed-off-by: Eric Biggers 
---
 lib/kobject.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/kobject.c b/lib/kobject.c
index b72e00fd7d09e..aa89edcd2b63a 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -887,7 +887,7 @@ static void kset_release(struct kobject *kobj)
kfree(kset);
 }
 
-void kset_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid)
+static void kset_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid)
 {
if (kobj->parent)
kobject_get_ownership(kobj->parent, uid, gid);
-- 
2.20.1

Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order

2019-01-08 Thread Michal Hocko

On Wed 09-01-19 11:28:52, Arun KS wrote:
> On 2019-01-08 23:43, Michal Hocko wrote:
> > On Tue 08-01-19 09:56:09, Alexander Duyck wrote:
> > > On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote:
> > [...]
> > > >  static int online_pages_range(unsigned long start_pfn, unsigned long 
> > > > nr_pages,
> > > > void *arg)
> > > >  {
> > > > -   unsigned long i;
> > > > unsigned long onlined_pages = *(unsigned long *)arg;
> > > > -   struct page *page;
> > > >
> > > > if (PageReserved(pfn_to_page(start_pfn)))
> > > > -   for (i = 0; i < nr_pages; i++) {
> > > > -   page = pfn_to_page(start_pfn + i);
> > > > -   (*online_page_callback)(page);
> > > > -   onlined_pages++;
> > > > -   }
> > > > +   onlined_pages = online_pages_blocks(start_pfn, 
> > > > nr_pages);
> > > 
> > > Shouldn't this be a "+=" instead of an "="? It seems like you are
> > > going
> > > to lose your count otherwise.
> > 
> > You are right of course. I should have noticed during the review.
> > Thanks!
> 
> I think we don't need to. The caller function is setting onlined_pages = 0
> before calling online_pages_range().
> And there are no other reference to online_pages_range other than from
> online_pages().

Are you missing that we accumulate onlined_pages via
*(unsigned long *)arg = onlined_pages;
in online_pages_range?

-- 
Michal Hocko
SUSE Labs

Re: [PATCH v2] mm/page_owner: fix for deferred struct page init

2019-01-08 Thread Michal Hocko

On Thu 20-12-18 13:50:31, Qian Cai wrote:
> When booting a system with "page_owner=on",
> 
> start_kernel
>   page_ext_init
> invoke_init_callbacks
>   init_section_page_ext
> init_page_owner
>   init_early_allocated_pages
> init_zones_in_node
>   init_pages_in_zone
> lookup_page_ext
>   page_to_nid
> 
> The issue here is that page_to_nid() will not work since some page
> flags have no node information until later in page_alloc_init_late() due
> to DEFERRED_STRUCT_PAGE_INIT. Hence, it could trigger an out-of-bounds
> access with an invalid nid.
> 
> [8.666047] UBSAN: Undefined behaviour in ./include/linux/mm.h:1104:50
> [8.672603] index 7 is out of range for type 'zone [5]'
> 
> Also, kernel will panic since flags were poisoned earlier with,
> 
> CONFIG_DEBUG_VM_PGFLAGS=y
> CONFIG_NODE_NOT_IN_PAGE_FLAGS=n
> 
> start_kernel
>   setup_arch
> pagetable_init
>   paging_init
> sparse_init
>   sparse_init_nid
> memblock_alloc_try_nid_raw
> 
> Although later it tries to set page flags for pages in reserved bootmem
> regions,
> 
> mm_init
>   mem_init
> memblock_free_all
>   free_low_memory_core_early
> reserve_bootmem_region
> 
> there could still have some freed pages from the page allocator but yet
> to be initialized due to DEFERRED_STRUCT_PAGE_INIT. It have already been
> dealt with a bit in page_ext_init().
> 
> * Take into account DEFERRED_STRUCT_PAGE_INIT.
> */
> if (early_pfn_to_nid(pfn) != nid)
>   continue;
> 
> However it did not handle it well in init_pages_in_zone() which end up
> calling page_to_nid().
> 
> [   11.917212] page:ea000420 is uninitialized and poisoned
> [   11.917220] raw:   
> 
> [   11.921745] raw:   
> 
> [   11.924523] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> [   11.926498] page_owner info is not active (free page?)
> [   12.329560] kernel BUG at include/linux/mm.h:990!
> [   12.337632] RIP: 0010:init_page_owner+0x486/0x520
> 
> Since there is no other routines depend on page_ext_init() in
> start_kernel() and no real benefit to call it so early, just move it
> after page_alloc_init_late() to ensure that there is no deferred pages
> need to de dealt with.

The last paragraph should be updated. I would propose something like
this.

"
This means that assumptions behind fe53ca54270a ("mm: use
early_pfn_to_nid in page_ext_init") are incomplete. Therefore revert the
commit for now. A proper way to move the page_owner initialization to
sooner is to hook into memmap initialization.
"
> 

Fixes: fe53ca54270a ("mm: use early_pfn_to_nid in page_ext_init")

> Suggested-by: Michal Hocko 
> Signed-off-by: Qian Cai 

Acked-by: Michal Hocko 

> ---
> 
> v2: postpone init_page_ext() to after page_alloc_init_late().
> 
>  init/main.c   | 2 +-
>  mm/page_ext.c | 3 +--
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/init/main.c b/init/main.c
> index 2b7b7fe173c9..1aeb062b2cb7 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -696,7 +696,6 @@ asmlinkage __visible void __init start_kernel(void)
>   initrd_start = 0;
>   }
>  #endif
> - page_ext_init();
>   kmemleak_init();
>   setup_per_cpu_pageset();
>   numa_policy_init();
> @@ -1147,6 +1146,7 @@ static noinline void __init kernel_init_freeable(void)
>   sched_init_smp();
>  
>   page_alloc_init_late();
> + page_ext_init();
>  
>   do_basic_setup();
>  
> diff --git a/mm/page_ext.c b/mm/page_ext.c
> index ae44f7adbe07..d76fd51e312a 100644
> --- a/mm/page_ext.c
> +++ b/mm/page_ext.c
> @@ -399,9 +399,8 @@ void __init page_ext_init(void)
>* -pfn-->
>* N0 | N1 | N2 | N0 | N1 | N2|
>*
> -  * Take into account DEFERRED_STRUCT_PAGE_INIT.
>*/
> - if (early_pfn_to_nid(pfn) != nid)
> + if (pfn_to_nid(pfn) != nid)
>   continue;
>   if (init_section_page_ext(pfn, nid))
>   goto oom;
> -- 
> 2.17.2 (Apple Git-113)

-- 
Michal Hocko
SUSE Labs

[PATCH v3] i2c: dev: prevent adapter retries and timeout being set as minus value

2019-01-08 Thread Yi Zeng

If set adapter->retries to minus value from user space via ioctl,
will make __i2c_transfer and __i2c_smbus_xfer jump the calling to
adapter->algo->master_xfer and adapter->algo->smbus_xfer that
registered by the underlying bus drivers, and return value 0 to
all the callers. The bus driver will never be accessed anymore by
all users, besides, the users may still get successful return value
without any error or information log print out.

If set adapter->timeout to minus value from user space via ioctl,
will make the retrying loop in __i2c_transfer and __i2c_smbus_xfer
always break after the the first try, due to the time_after always
returns true.

Signed-off-by: Yi Zeng 
---
 drivers/i2c/i2c-dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c
index 1aca742..ccd76c7 100644
--- a/drivers/i2c/i2c-dev.c
+++ b/drivers/i2c/i2c-dev.c
@@ -470,9 +470,15 @@ static long i2cdev_ioctl(struct file *file, unsigned int 
cmd, unsigned long arg)
  data_arg.data);
}
case I2C_RETRIES:
+   if (arg > INT_MAX)
+   return -EINVAL;
+
client->adapter->retries = arg;
break;
case I2C_TIMEOUT:
+   if (arg > INT_MAX)
+   return -EINVAL;
+
/* For historical reasons, user-space sets the timeout
 * value in units of 10 ms.
 */
-- 
1.9.1

[PATCH net V2] vhost: log dirty page correctly

2019-01-08 Thread Jason Wang

Vhost dirty page logging API is designed to sync through GPA. But we
try to log GIOVA when device IOTLB is enabled. This is wrong and may
lead to missing data after migration.

To solve this issue, when logging with device IOTLB enabled, we will:

1) reuse the device IOTLB translation result of GIOVA->HVA mapping to
   get HVA, for writable descriptor, get HVA through iovec. For used
   ring update, translate its GIOVA to HVA
2) traverse the GPA->HVA mapping to get the possible GPA and log
   through GPA. Pay attention this reverse mapping is not guaranteed
   to be unique, so we should log each possible GPA in this case.

This fix the failure of scp to guest during migration. In -next, we
will probably support passing GIOVA->GPA instead of GIOVA->HVA.

Fixes: 6b1e6cc7855b ("vhost: new device IOTLB API")
Reported-by: Jintack Lim 
Cc: Jintack Lim 
Signed-off-by: Jason Wang 
---
The patch is needed for stable.
Changes from V1:
- return error instead of warn
---
 drivers/vhost/net.c   |  3 +-
 drivers/vhost/vhost.c | 82 +++
 drivers/vhost/vhost.h |  3 +-
 3 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 36f3d0f49e60..bca86bf7189f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1236,7 +1236,8 @@ static void handle_rx(struct vhost_net *net)
if (nvq->done_idx > VHOST_NET_BATCH)
vhost_net_signal_used(nvq);
if (unlikely(vq_log))
-   vhost_log_write(vq, vq_log, log, vhost_len);
+   vhost_log_write(vq, vq_log, log, vhost_len,
+   vq->iov, in);
total_len += vhost_len;
if (unlikely(vhost_exceeds_weight(++recv_pkts, total_len))) {
vhost_poll_queue(>poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 9f7942cbcbb2..ee095f08ffd4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1733,11 +1733,70 @@ static int log_write(void __user *log_base,
return r;
 }
 
+static int log_write_hva(struct vhost_virtqueue *vq, u64 hva, u64 len)
+{
+   struct vhost_umem *umem = vq->umem;
+   struct vhost_umem_node *u;
+   u64 gpa;
+   int r;
+   bool hit = false;
+
+   list_for_each_entry(u, >umem_list, link) {
+   if (u->userspace_addr < hva &&
+   u->userspace_addr + u->size >=
+   hva + len) {
+   gpa = u->start + hva - u->userspace_addr;
+   r = log_write(vq->log_base, gpa, len);
+   if (r < 0)
+   return r;
+   hit = true;
+   }
+   }
+
+   if (!hit)
+   return -EFAULT;
+
+   return 0;
+}
+
+static int log_used(struct vhost_virtqueue *vq, u64 used_offset, u64 len)
+{
+   struct iovec iov[64];
+   int i, ret;
+
+   if (!vq->iotlb)
+   return log_write(vq->log_base, vq->log_addr + used_offset, len);
+
+   ret = translate_desc(vq, (u64)(uintptr_t)vq->used + used_offset,
+len, iov, 64, VHOST_ACCESS_WO);
+   if (ret)
+   return ret;
+
+   for (i = 0; i < ret; i++) {
+   ret = log_write_hva(vq, (u64)(uintptr_t)iov[i].iov_base,
+   iov[i].iov_len);
+   if (ret)
+   return ret;
+   }
+
+   return 0;
+}
+
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
-   unsigned int log_num, u64 len)
+   unsigned int log_num, u64 len, struct iovec *iov, int count)
 {
int i, r;
 
+   if (vq->iotlb) {
+   for (i = 0; i < count; i++) {
+   r = log_write_hva(vq, (u64)(uintptr_t)iov[i].iov_base,
+ iov[i].iov_len);
+   if (r < 0)
+   return r;
+   }
+   return 0;
+   }
+
/* Make sure data written is seen before log. */
smp_wmb();
for (i = 0; i < log_num; ++i) {
@@ -1769,9 +1828,8 @@ static int vhost_update_used_flags(struct vhost_virtqueue 
*vq)
smp_wmb();
/* Log used flag write. */
used = >used->flags;
-   log_write(vq->log_base, vq->log_addr +
- (used - (void __user *)vq->used),
- sizeof vq->used->flags);
+   log_used(vq, (used - (void __user *)vq->used),
+sizeof vq->used->flags);
if (vq->log_ctx)
eventfd_signal(vq->log_ctx, 1);
}
@@ -1789,9 +1847,8 @@ static int vhost_update_avail_event(struct 
vhost_virtqueue *vq, u16 avail_event)
smp_wmb();
/* Log avail event write */
used =

[PATCH v2] scsi: avoid a double-fetch and a redundant copy

2019-01-08 Thread Kangjie Lu

What we need is only "pack_id", so do not create a heap object or copy
the whole object in. The fix efficiently copies "pack_id" only and
also avoids double-fetch.

Signed-off-by: Kangjie Lu 
---
 drivers/scsi/sg.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index c6ad00703c5b..13662c41058a 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -446,16 +446,8 @@ sg_read(struct file *filp, char __user *buf, size_t count, 
loff_t * ppos)
}
if (old_hdr->reply_len < 0) {
if (count >= SZ_SG_IO_HDR) {
-   sg_io_hdr_t *new_hdr;
-   new_hdr = kmalloc(SZ_SG_IO_HDR, GFP_KERNEL);
-   if (!new_hdr) {
-   retval = -ENOMEM;
-   goto free_old_hdr;
-   }
-   retval =__copy_from_user
-   (new_hdr, buf, SZ_SG_IO_HDR);
-   req_pack_id = new_hdr->pack_id;
-   kfree(new_hdr);
+   retval = get_user(req_pack_id,
+   &((sg_io_hdr_t *)buf)->pack_id);
if (retval) {
retval = -EFAULT;
goto free_old_hdr;
-- 
2.17.1

Re: [PATCH v5 0/7] mfd / platform: cros_ec: move cros_ec sysfs attributes to its own drivers.

2019-01-08 Thread Lee Jones

On Tue, 08 Jan 2019, Enric Balletbo Serra wrote:
> Missatge de Lee Jones  del dia dv., 21 de des.
> 2018 a les 16:39:
> >
> > On Wed, 12 Dec 2018, Enric Balletbo i Serra wrote:
> >
> > > Hi,
> > >
> > > This is another patchset to try to cleanup a bit more the crossed
> > > references for cros-ec driver between the MFD and the platform/chrome
> > > subsystems.
> > >
> > > The purpose of these patches is get rid of the different cros-ec 
> > > attributes
> > > from mfd/cros_ec_dev to its own sub-driver in platform/chrome. cros_ec_dev
> > > continues instantiating the sub-devices but the sysfs attributes are owned
> > > by the platform driver.E.g. The lightbar driver should own his sysfs
> > > attributes and be instantiated only if the Embedded Controller has a
> > > lightbar.
> > >
> > > The patchset also adds the documentation of the sysfs attributes.
> > >
> > > Most of the patches touches mfd subsystem and platform/chrome so I'd
> > > suggest go all using and inmutable branch.
> >
> > That's fine.
> >
> > What else needs to happen with this set?
> 
> I think the patchset is ready to be queued. Note that to apply cleanly
> it depends on [1] which is already merged in your for-next branch.
> What do you prefer, go through your repo or go through the
> chrome-platform repo? Do you want Benson or I create an immutable
> branch? I'm fine with whenever you decide.

Probably best if this goes through the MFD tree.

I have a PR pending upstream at the moment.  Once I know what is
happening with that, I'll start taking patches/sets again.

> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git/commit/?h=for-mfd-next=18e294ddafaeb80a1e2e10c9bd750a6cb8388d5b
> 
> > Any more Acks required?
> >
> 
> I think that you, Guenter, Benson and I are fine with it, so not more
> acks needed.

Thanks for the clarification.

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt

On Wed, 2019-01-09 at 17:32 +1100, Alexey Kardashevskiy wrote:
> I have just moved the "Mellanox Technologies MT27700 Family
> [ConnectX-4]" from garrison to firestone machine and there it does not
> produce an EEH, with the same kernel and skiboot (both upstream + my
> debug). Hm. I cannot really blame the card but I cannot see what could
> cause the difference in skiboot either. I even tried disabling NPU so
> garrison would look like firestone, still EEH'ing.

The systems have a different chip though, firestone is P8 and garrison
is P8', which a slightly different PHB revision. Worth checking if we
have anything significantly different in our inits and poke at the HW
guys.

BTW. Are the cards behind a switch in either case ?

Cheers,
Ben.

Re: [PATCH v4 01/15] ARM: export arm_pm_restart()

2019-01-08 Thread Lubomir Rintel

On Tue, 2019-01-08 at 23:31 +, Russell King - ARM Linux wrote:
> On Tue, Jan 08, 2019 at 06:58:04PM +0100, Lubomir Rintel wrote:
> > The OLPC XO 1.75 laptop is rebooted with a command to the Embedded
> > Controller. The EC driver should be a module, since most people don't need
> > it to be compiled in.
> 
> Why do you need this - is there a reason why you can't hook onto the
> restart_handler_list using register_restart_handler()?  The registered
> handlers are passed the command.

Actually, yes. The restart handlers are invoked in atomic context, but
the EC communication needs to block. (The 1.75 EC is actually a SPI
master: a command it initiated by poking a GPIO and waiting until EC
tells us it's ready to read a command.)

Sorry for not mentioning that earlier, I forgot that it is the case.

> I would rather this is not exported to modules, it is generally unsafe
> to hook stuff by directly poking function pointers from module code.

Lubo

[GIT PULL] Backlight for v5.1

2019-01-08 Thread Lee Jones

Good morning Linus,

Please accept my apologies for the lateness.

Christmas holidays, yada yada yada! :)

The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a:

  Linux 4.20-rc1 (2018-11-04 15:37:52 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight.git 
tags/backlight-next-4.21

for you to fetch changes up to 3cee7a7d05b11038c8b5fa093e45c6f839ffc867:

  backlight: 88pm860x_bl: Use of_node_name_eq for node name comparisons 
(2018-12-14 12:30:24 +)


 - Fix-ups
   - Use new of_node_name_eq() API call;

 - Bug Fixes
   - Internally track 'enabled' state; pwm_bl
   - Fix auto-generated brightness tables parsed by DT; pwm_bl


Heiko Stuebner (2):
  backlight: pwm_bl: Re-add driver internal enabled tracking
  backlight: pwm_bl: Fix devicetree parsing with auto-generated brightness 
tables

Rob Herring (1):
  backlight: 88pm860x_bl: Use of_node_name_eq for node name comparisons

 drivers/video/backlight/88pm860x_bl.c |  2 +-
 drivers/video/backlight/pwm_bl.c  | 28 +---
 2 files changed, 18 insertions(+), 12 deletions(-)

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

[PATCH v2] net: ethernet: mediatek: fix warning in phy_start_aneg

2019-01-08 Thread Frank Wunderlich

From: Heiner Kallweit 

linux 5.0-rc1 shows following warning on bpi-r2/mt7623 bootup:

[ 5.170597] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:548 
phy_start_aneg+0x110/0x144
[ 5.178826] called from state READY

[ 5.264111] [] (phy_start_aneg) from [] 
(mtk_init+0x414/0x47c)
[ 5.271630] r7:df5f5eec r6:c0f08c48 r5: r4:dea67800
[ 5.277256] [] (mtk_init) from [] 
(register_netdevice+0x98/0x51c)
[ 5.285035] r8: r7: r6:c0f97080 r5:c0f08c48 r4:dea67800
[ 5.291693] [] (register_netdevice) from [] 
(register_netdev+0x2c/0x44)
[ 5.299989] r8: r7:dea2e608 r6:deacea00 r5:dea2e604 r4:dea67800
[ 5.306646] [] (register_netdev) from [] 
(mtk_probe+0x668/0x7ac)
[ 5.314336] r5:dea2e604 r4:dea2e040
[ 5.317890] [] (mtk_probe) from [] 
(platform_drv_probe+0x58/0xa8)
[ 5.325670] r10:c0f86bac r9: r8:c0fbe578 r7: r6:c0f86bac 
r5:
[ 5.333445] r4:deacea10
[ 5.335963] [] (platform_drv_probe) from [] 
(really_probe+0x2d8/0x424)

maybe other boards using this generic driver are affected

v2:
optimization:

- phy_set_max_speed() is only needed if you want to reduce the
  max speed, typically if the PHY supports 1Gbps but the MAC
  supports 100Mbps only.

- The pause parameters are autonegotiated. Except you have a specific
  need you normally don't need to manually fiddle with this.

- phy_start_aneg() is called implicitly by the phylib state machine,
  you shouldn't call it manually except you have a good excuse.

- netif_carrier_on/netif_carrier_off in mtk_phy_link_adjust() isn't
  needed. It's done by phy_link_change() in phylib.

Signed-off-by: Frank Wunderlich 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 16 
 1 file changed, 16 deletions(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 399f565dd85a..2968d29a992f 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -258,11 +258,6 @@ static void mtk_phy_link_adjust(struct net_device *dev)
 
mtk_w32(mac->hw, mcr, MTK_MAC_MCR(mac->id));
 
-   if (dev->phydev->link)
-   netif_carrier_on(dev);
-   else
-   netif_carrier_off(dev);
-
if (!of_phy_is_fixed_link(mac->of_node))
phy_print_status(dev->phydev);
 }
@@ -347,17 +342,6 @@ static int mtk_phy_connect(struct net_device *dev)
if (mtk_phy_connect_node(eth, mac, np))
goto err_phy;
 
-   dev->phydev->autoneg = AUTONEG_ENABLE;
-   dev->phydev->speed = 0;
-   dev->phydev->duplex = 0;
-
-   phy_set_max_speed(dev->phydev, SPEED_1000);
-   phy_support_asym_pause(dev->phydev);
-   linkmode_copy(dev->phydev->advertising, dev->phydev->supported);
-   linkmode_set_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
-dev->phydev->advertising);
-   phy_start_aneg(dev->phydev);
-
of_node_put(np);
 
return 0;
-- 
2.17.1

[GIT PULL] MFD for v5.1

2019-01-08 Thread Lee Jones

Good morning Linus,

Please accept my apologies for the lateness.

Christmas holidays, yada yada yada! :)

The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a:

  Linux 4.20-rc1 (2018-11-04 15:37:52 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git tags/mfd-next-4.21

for you to fetch changes up to 3f2d347e851ef4464dea49504cde85e5eef67b2d:

  mfd: exynos-lpass: Enable UART module support (2019-01-03 08:32:42 +)


 - New Device Support
   - Add support for Power Supply to AXP813
   - Add support for GPIO, ADC, AC and Battery Power Supply to AXP803
   - Add support for UART to Exynos LPASS

 - Fix-ups
   - Use supplied MACROS; ti_am335x_tscadc
   - Trivial spelling/whitespace/alignment; tmio, axp20x, rave-sp
   - Regmap changes; bd9571mwv, wm5110-tables
   - Kconfig dependencies; MFD_AT91_USART
   - Supply shared data for child-devices; madera-core
   - Use new of_node_name_eq() API call; max77620, stmpe
   - Use managed resources (devm_*); tps65218
   - Comment descriptions; ingenic-tcu
   - Coding style; madera-core

 - Bug Fixes
   - Fix section mismatches; twl-core, db8500-prcmu
   - Correct error path related issues; mt6397-core, ab8500-core, mc13xxx-core
   - IRQ related fixes; tps6586x
   - Ensure proper initialisation sequence; qcom_rpm
   - Repair potential memory leak; cros_ec_dev


Beomho Seo (1):
  mfd: exynos-lpass: Enable UART module support

Charles Keepax (1):
  mfd: wm5110: Add missing ASRC rate register

Chen-Yu Tsai (1):
  mfd: axp20x: Re-align MFD cell entries

Cheng-Yi Chiang (1):
  mfd: cros_ec: Add commands to control codec

Dan Carpenter (1):
  mfd: ab8500-core: Return zero in get_register_interruptible()

Dien Pham (1):
  mfd: bd9571mwv: Add volatile register to make DVFS work

Enric Balletbo i Serra (1):
  mfd: cros_ec_dev: Add missing mfd_remove_devices() call in remove

Geert Uytterhoeven (1):
  mfd: tmio: Typo s/use use/use/

Jean Delvare (1):
  mfd: at91-usart: Add platform dependency

Jonathan Hunter (1):
  mfd: tps6586x: Handle interrupts on suspend

Jonathan Marek (1):
  mfd: qcom_rpm: write fw_version to CTRL_REG

Kangjie Lu (1):
  mfd: mc13xxx: Fix a missing check of a register-read failure

Keerthy (1):
  mfd: tps65218: Use devm_regmap_add_irq_chip and clean up error path in 
probe()

Nathan Chancellor (2):
  mfd: twl-core: Fix section annotations on {,un}protect_pm_master
  mfd: db8500-prcmu: Fix some section annotations

Nicolas Boichat (1):
  mfd: mt6397: Do not call irq_domain_remove if PMIC unsupported

Oskari Lemmela (2):
  mfd: axp20x: Add AC power supply cell for AXP813
  mfd: axp20x: Add supported cells for AXP803

Paul Cercueil (1):
  mfd: ingenic-tcu: Fix bit field description in header

Richard Fitzgerald (2):
  mfd: madera: Add shared data for accessory detection
  mfd: madera: Remove spurious semicolon in while loop

Rob Herring (1):
  mfd: Use of_node_name_eq() for node name comparisons

Vignesh R (2):
  mfd: ti_am335x_tscadc: Use PLATFORM_DEVID_AUTO while registering mfd cells
  iio: adc: ti_am335x_tscadc: Improve accuracy of measurement

Yangtao Li (1):
  mfd: rave-sp: Fix typo in rave_sp_checksum comment

 drivers/iio/adc/ti_am335x_adc.c  |   5 +-
 drivers/mfd/Kconfig  |   1 +
 drivers/mfd/ab8500-core.c|   2 +-
 drivers/mfd/axp20x.c | 126 +++
 drivers/mfd/bd9571mwv.c  |   1 +
 drivers/mfd/cros_ec_dev.c|   1 +
 drivers/mfd/db8500-prcmu.c   |   4 +-
 drivers/mfd/exynos-lpass.c   |   4 +-
 drivers/mfd/madera-core.c|   5 +-
 drivers/mfd/max77620.c   |   2 +-
 drivers/mfd/mc13xxx-core.c   |   4 +-
 drivers/mfd/mt6397-core.c|   3 +-
 drivers/mfd/qcom_rpm.c   |   4 ++
 drivers/mfd/rave-sp.c|   2 +-
 drivers/mfd/stmpe.c  |  12 ++--
 drivers/mfd/ti_am335x_tscadc.c   |   5 +-
 drivers/mfd/tps65218.c   |  24 +--
 drivers/mfd/tps6586x.c   |  24 +++
 drivers/mfd/twl-core.c   |   4 +-
 drivers/mfd/wm5110-tables.c  |   2 +
 include/linux/mfd/cros_ec_commands.h |  94 ++
 include/linux/mfd/ingenic-tcu.h  |   2 +-
 include/linux/mfd/madera/core.h  |   7 ++
 include/linux/mfd/ti_am335x_tscadc.h |   4 ++
 include/linux/mfd/tmio.h |   2 +-
 25 files changed, 244 insertions(+), 100 deletions(-)

-- 
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

[tip:perf/urgent] tools include uapi: Sync linux/vhost.h with the kernel sources

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  ee412f14693a3fe2645b3528603dfd37dd05118a
Gitweb: https://git.kernel.org/tip/ee412f14693a3fe2645b3528603dfd37dd05118a
Author: Arnaldo Carvalho de Melo 
AuthorDate: Tue, 8 Jan 2019 13:53:23 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 14:09:33 -0300

tools include uapi: Sync linux/vhost.h with the kernel sources

To get the changes in:

  4b86713236e4 ("vhost: split structs into a separate header file")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from 
latest version at 'include/uapi/linux/vhost.h'
  diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

Those didn't touch things used in tools, i.e. the following continues
working:

  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh
  static const char *vhost_virtio_ioctl_cmds[] = {
  [0x00] = "SET_FEATURES",
  [0x01] = "SET_OWNER",
  [0x02] = "RESET_OWNER",
  [0x03] = "SET_MEM_TABLE",
  [0x04] = "SET_LOG_BASE",
  [0x07] = "SET_LOG_FD",
  [0x10] = "SET_VRING_NUM",
  [0x11] = "SET_VRING_ADDR",
  [0x12] = "SET_VRING_BASE",
  [0x13] = "SET_VRING_ENDIAN",
  [0x14] = "GET_VRING_ENDIAN",
  [0x20] = "SET_VRING_KICK",
  [0x21] = "SET_VRING_CALL",
  [0x22] = "SET_VRING_ERR",
  [0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
  [0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
  [0x25] = "SET_BACKEND_FEATURES",
  [0x30] = "NET_SET_BACKEND",
  [0x40] = "SCSI_SET_ENDPOINT",
  [0x41] = "SCSI_CLEAR_ENDPOINT",
  [0x42] = "SCSI_GET_ABI_VERSION",
  [0x43] = "SCSI_SET_EVENTS_MISSED",
  [0x44] = "SCSI_GET_EVENTS_MISSED",
  [0x60] = "VSOCK_SET_GUEST_CID",
  [0x61] = "VSOCK_SET_RUNNING",
  };
  static const char *vhost_virtio_ioctl_read_cmds[] = {
  [0x00] = "GET_FEATURES",
  [0x12] = "GET_VRING_BASE",
  [0x26] = "GET_BACKEND_FEATURES",
  };
  $

At some point in the eBPFication of perf, using something like:

# perf trace -e ioctl(cmd=VHOST_VRING*)

Will setup a BPF filter right at the raw_syscalls:sys_enter tracepoint,
i.e. filtering at the origin.

Cc: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Michael S. Tsirkin 
Cc: Namhyung Kim 
Cc: Paolo Bonzini 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-g28usrt7l59lwq3wuh8vz...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/uapi/linux/vhost.h | 113 +--
 1 file changed, 2 insertions(+), 111 deletions(-)

diff --git a/tools/include/uapi/linux/vhost.h b/tools/include/uapi/linux/vhost.h
index 84c3de89696a..40d028eed645 100644
--- a/tools/include/uapi/linux/vhost.h
+++ b/tools/include/uapi/linux/vhost.h
@@ -11,94 +11,9 @@
  * device configuration.
  */
 
+#include 
 #include 
-#include 
 #include 
-#include 
-#include 
-
-struct vhost_vring_state {
-   unsigned int index;
-   unsigned int num;
-};
-
-struct vhost_vring_file {
-   unsigned int index;
-   int fd; /* Pass -1 to unbind from file. */
-
-};
-
-struct vhost_vring_addr {
-   unsigned int index;
-   /* Option flags. */
-   unsigned int flags;
-   /* Flag values: */
-   /* Whether log address is valid. If set enables logging. */
-#define VHOST_VRING_F_LOG 0
-
-   /* Start of array of descriptors (virtually contiguous) */
-   __u64 desc_user_addr;
-   /* Used structure address. Must be 32 bit aligned */
-   __u64 used_user_addr;
-   /* Available structure address. Must be 16 bit aligned */
-   __u64 avail_user_addr;
-   /* Logging support. */
-   /* Log writes to used structure, at offset calculated from specified
-* address. Address must be 32 bit aligned. */
-   __u64 log_guest_addr;
-};
-
-/* no alignment requirement */
-struct vhost_iotlb_msg {
-   __u64 iova;
-   __u64 size;
-   __u64 uaddr;
-#define VHOST_ACCESS_RO  0x1
-#define VHOST_ACCESS_WO  0x2
-#define VHOST_ACCESS_RW  0x3
-   __u8 perm;
-#define VHOST_IOTLB_MISS   1
-#define VHOST_IOTLB_UPDATE 2
-#define VHOST_IOTLB_INVALIDATE 3
-#define VHOST_IOTLB_ACCESS_FAIL4
-   __u8 type;
-};
-
-#define VHOST_IOTLB_MSG 0x1
-#define VHOST_IOTLB_MSG_V2 0x2
-
-struct vhost_msg {
-   int type;
-   union {
-   struct vhost_iotlb_msg iotlb;
-   __u8 padding[64];
-   };
-};
-
-struct vhost_msg_v2 {
-   __u32 type;
-   __u32 reserved;
-   union {
-   struct vhost_iotlb_msg iotlb;
-   __u8 padding[64];
-   };
-};
-
-struct vhost_memory_region {
-   __u64 guest_phys_addr;
-   __u64 memory_size; /* bytes */
-   __u64 userspace_addr;
-   __u64 flags_padding; /* No flags are currently specified. */
-};
-
-/* All region addresses and sizes must be 4K aligned. */
-#define VHOST_PAGE_SIZE 0x1000
-
-struct

[tip:perf/urgent] tools include uapi: Sync linux/fs.h copy with the kernel sources

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  fdc42ca190c7d8976f4f9240752f0bd008270b72
Gitweb: https://git.kernel.org/tip/fdc42ca190c7d8976f4f9240752f0bd008270b72
Author: Arnaldo Carvalho de Melo 
AuthorDate: Tue, 8 Jan 2019 13:48:14 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 14:09:33 -0300

tools include uapi: Sync linux/fs.h copy with the kernel sources

To get the changes in:

  e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless 
explicitly enabled")

That made the mount flags string table generator to switch to using
mount.h instead.

This silences the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from 
latest version at 'include/uapi/linux/fs.h'
  diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h

Cc: Adrian Hunter 
Cc: David Howells 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-mosz81pa6iwxko4p2owbm...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/uapi/linux/fs.h | 60 ++-
 1 file changed, 8 insertions(+), 52 deletions(-)

diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h
index a441ea1bfe6d..121e82ce296b 100644
--- a/tools/include/uapi/linux/fs.h
+++ b/tools/include/uapi/linux/fs.h
@@ -14,6 +14,11 @@
 #include 
 #include 
 
+/* Use of MS_* flags within the kernel is restricted to core mount(2) code. */
+#if !defined(__KERNEL__)
+#include 
+#endif
+
 /*
  * It's silly to have NR_OPEN bigger than NR_FILE, but you can change
  * the file limit at runtime and only root can increase the per-process
@@ -101,57 +106,6 @@ struct inodes_stat_t {
 
 #define NR_FILE  8192  /* this can well be larger on a larger system */
 
-
-/*
- * These are the fs-independent mount-flags: up to 32 flags are supported
- */
-#define MS_RDONLY   1  /* Mount read-only */
-#define MS_NOSUID   2  /* Ignore suid and sgid bits */
-#define MS_NODEV4  /* Disallow access to device special files */
-#define MS_NOEXEC   8  /* Disallow program execution */
-#define MS_SYNCHRONOUS 16  /* Writes are synced at once */
-#define MS_REMOUNT 32  /* Alter flags of a mounted FS */
-#define MS_MANDLOCK64  /* Allow mandatory locks on an FS */
-#define MS_DIRSYNC 128 /* Directory modifications are synchronous */
-#define MS_NOATIME 1024/* Do not update access times. */
-#define MS_NODIRATIME  2048/* Do not update directory access times */
-#define MS_BIND4096
-#define MS_MOVE8192
-#define MS_REC 16384
-#define MS_VERBOSE 32768   /* War is peace. Verbosity is silence.
-  MS_VERBOSE is deprecated. */
-#define MS_SILENT  32768
-#define MS_POSIXACL(1<<16) /* VFS does not apply the umask */
-#define MS_UNBINDABLE  (1<<17) /* change to unbindable */
-#define MS_PRIVATE (1<<18) /* change to private */
-#define MS_SLAVE   (1<<19) /* change to slave */
-#define MS_SHARED  (1<<20) /* change to shared */
-#define MS_RELATIME(1<<21) /* Update atime relative to mtime/ctime. */
-#define MS_KERNMOUNT   (1<<22) /* this is a kern_mount call */
-#define MS_I_VERSION   (1<<23) /* Update inode I_version field */
-#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
-#define MS_LAZYTIME(1<<25) /* Update the on-disk [acm]times lazily */
-
-/* These sb flags are internal to the kernel */
-#define MS_SUBMOUNT (1<<26)
-#define MS_NOREMOTELOCK(1<<27)
-#define MS_NOSEC   (1<<28)
-#define MS_BORN(1<<29)
-#define MS_ACTIVE  (1<<30)
-#define MS_NOUSER  (1<<31)
-
-/*
- * Superblock flags that can be altered by MS_REMOUNT
- */
-#define MS_RMT_MASK(MS_RDONLY|MS_SYNCHRONOUS|MS_MANDLOCK|MS_I_VERSION|\
-MS_LAZYTIME)
-
-/*
- * Old magic mount flag and mask
- */
-#define MS_MGC_VAL 0xC0ED
-#define MS_MGC_MSK 0x
-
 /*
  * Structure for FS_IOC_FSGETXATTR[A] and FS_IOC_FSSETXATTR.
  */
@@ -269,7 +223,8 @@ struct fsxattr {
 #define FS_POLICY_FLAGS_PAD_16 0x02
 #define FS_POLICY_FLAGS_PAD_32 0x03
 #define FS_POLICY_FLAGS_PAD_MASK   0x03
-#define FS_POLICY_FLAGS_VALID  0x03
+#define FS_POLICY_FLAG_DIRECT_KEY  0x04/* use master key directly */
+#define FS_POLICY_FLAGS_VALID  0x07
 
 /* Encryption algorithms */
 #define FS_ENCRYPTION_MODE_INVALID 0
@@ -281,6 +236,7 @@ struct fsxattr {
 #define FS_ENCRYPTION_MODE_AES_128_CTS 6
 #define FS_ENCRYPTION_MODE_SPECK128_256_XTS7 /* Removed, do not use. */
 #define FS_ENCRYPTION_MODE_SPECK128_256_CTS8 /* Removed, do not use. */
+#define FS_ENCRYPTION_MODE_ADIANTUM9
 
 struct fscrypt_policy {
__u8 version;

[tip:perf/urgent] perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  1c23397d2a6a077ab32f01c01406c2fe61b7b3a4
Gitweb: https://git.kernel.org/tip/1c23397d2a6a077ab32f01c01406c2fe61b7b3a4
Author: Arnaldo Carvalho de Melo 
AuthorDate: Tue, 8 Jan 2019 13:46:43 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 14:09:33 -0300

perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h

As now we'll update our fs.h copy and what 
tools/perf/trace/beauty/mount_flags.sh
needs just got moved to mount.h, use that instead.

Cc: Adrian Hunter 
Cc: David Howells 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-ls19h376xukeouxrw9dsw...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/trace/beauty/mount_flags.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/trace/beauty/mount_flags.sh 
b/tools/perf/trace/beauty/mount_flags.sh
index 45547573a1db..847850b2ef6c 100755
--- a/tools/perf/trace/beauty/mount_flags.sh
+++ b/tools/perf/trace/beauty/mount_flags.sh
@@ -5,11 +5,11 @@
 
 printf "static const char *mount_flags[] = {\n"
 
regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+([[:digit:]]+)[[:space:]]*.*'
-egrep $regex ${header_dir}/fs.h | egrep -v '(MSK|VERBOSE|MGC_VAL)\>' | \
+egrep $regex ${header_dir}/mount.h | egrep -v '(MSK|VERBOSE|MGC_VAL)\>' | \
sed -r "s/$regex/\2 \2 \1/g" | sort -n | \
xargs printf "\t[%s ? (ilog2(%s) + 1) : 0] = \"%s\",\n"
 
regex='^[[:space:]]*#[[:space:]]*define[[:space:]]+MS_([[:alnum:]_]+)[[:space:]]+\(1<<([[:digit:]]+)\)[[:space:]]*.*'
-egrep $regex ${header_dir}/fs.h | \
+egrep $regex ${header_dir}/mount.h | \
sed -r "s/$regex/\2 \1/g" | \
xargs printf "\t[%s + 1] = \"%s\",\n"
 printf "};\n"

Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()

2019-01-08 Thread Helge Deller

On 09.01.19 06:39, Sergey Senozhatsky wrote:
> On (01/08/19 08:44), Helge Deller wrote:
>> Date: Tue, 8 Jan 2019 08:44:19 +0100
>> From: Helge Deller 
>> To: Sergey Senozhatsky , "James E . J .
>>  Bottomley" 
>> Cc: linux-par...@vger.kernel.org, linux-kernel@vger.kernel.org, Sergey
>>  Senozhatsky 
>> Subject: Re: [PATCH] parisc: replace oops_in_progress manipulation with
>>  bust_spinlocks()
>> Message-ID: 
>> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
>>  Thunderbird/60.4.0
>>
>> On 07.01.19 10:56, Sergey Senozhatsky wrote:
>>> Use bust_spinlocks() function to set oops_in_progress.
>>>
>>> Signed-off-by: Sergey Senozhatsky 
>>
>> I've added this patch to the parisc for-next tree.
> 
> Thanks.
> 
> BTW,
> jejb at parisc-linux.org bounces. Shall it be replaced with
> HansenPartnership.com email address?
> 
> ---
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7fc7813cd9c2..86b174dbe506 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11511,7 +11511,7 @@ F:Documentation/blockdev/paride.txt
>  F:   drivers/block/paride/
>  
>  PARISC ARCHITECTURE
> -M:   "James E.J. Bottomley" 
> +M:   "James E.J. Bottomley" 
>  M:   Helge Deller 
>  L:   linux-par...@vger.kernel.org
>  W:   http://www.parisc-linux.org/
> 

I prefer that change.
In my role as admin of the parisc-linux.org domain I plan to retire
all @parisc-linux.org email addresses soon anyway.

Helge

[tip:perf/urgent] perf top: Lift restriction on using callchains without "sym" in --sort

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  f2e14cd2c93699aa0aeaa8240457ab359f1258ff
Gitweb: https://git.kernel.org/tip/f2e14cd2c93699aa0aeaa8240457ab359f1258ff
Author: Arnaldo Carvalho de Melo 
AuthorDate: Tue, 8 Jan 2019 10:56:59 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

perf top: Lift restriction on using callchains without "sym" in --sort

This restriction is not present in 'perf report' and since 'perf top'
uses the same hists browser, remove it from it as well.

With this we create per event buckets with callchain trees, so that

  # perf top --sort dso -g --no-children

Bucketizes samples by DSO and below it shows the callchains leading to
functions in this DSO.

Try also:

  # perf top -e sched:*switch -g --no-children

To see the callchains leading to sched switches, pressing 'E' to expand
all one can quickly see the most common scheduler switches and what
leads to them, for instance, calls to IO, futexes, etc.

Acked-by: Namhyung Kim 
Cc: Adrian Hunter 
Cc: Jiri Olsa 
Link: https://lkml.kernel.org/r/20190107140854.ga28...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-top.c | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index fe3ecfb2e64b..f64e312db787 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1028,12 +1028,7 @@ out_err:
 
 static int callchain_param__setup_sample_type(struct callchain_param 
*callchain)
 {
-   if (!perf_hpp_list.sym) {
-   if (callchain->enabled) {
-   ui__error("Selected -g but \"sym\" not present in 
--sort/-s.");
-   return -EINVAL;
-   }
-   } else if (callchain->mode != CHAIN_NONE) {
+   if (callchain->mode != CHAIN_NONE) {
if (callchain_register_param(callchain) < 0) {
ui__error("Can't register callchain params.\n");
return -EINVAL;

[tip:perf/urgent] tools include uapi: Grab a copy of linux/mount.h

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  250bfc87ddc427fa001bbc8bc1468ce5fc06645b
Gitweb: https://git.kernel.org/tip/250bfc87ddc427fa001bbc8bc1468ce5fc06645b
Author: Arnaldo Carvalho de Melo 
AuthorDate: Tue, 8 Jan 2019 13:42:37 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 14:09:28 -0300

tools include uapi: Grab a copy of linux/mount.h

We were using a copy of uapi/linux/fs.h to create the mount syscall
'flags' string table to use in 'perf trace', to convert from the number
obtained via the raw_syscalls:sys_enter into a string, using
tools/perf/trace/beauty/mount_flags.sh, but in e262e32d6bde ("vfs:
Suppress MS_* flag defs within the kernel unless explicitly enabled")
those defines got moved to linux/mount.h, so grab a copy of mount.h too.

Keep the uapi/linux/fs.h as we'll use it for the SEEK_ constants.

Cc: Adrian Hunter 
Cc: David Howells 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-i2ricmpwpdrpukfq3298j...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 {include => tools/include}/uapi/linux/mount.h | 0
 tools/perf/check-headers.sh   | 1 +
 2 files changed, 1 insertion(+)

diff --git a/include/uapi/linux/mount.h b/tools/include/uapi/linux/mount.h
similarity index 100%
copy from include/uapi/linux/mount.h
copy to tools/include/uapi/linux/mount.h
diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
index 6cb98f8570a2..b51e952ab35f 100755
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -10,6 +10,7 @@ include/uapi/linux/fs.h
 include/uapi/linux/kcmp.h
 include/uapi/linux/kvm.h
 include/uapi/linux/in.h
+include/uapi/linux/mount.h
 include/uapi/linux/perf_event.h
 include/uapi/linux/prctl.h
 include/uapi/linux/sched.h

[tip:perf/urgent] tools lib traceevent: Remove tep_data_event_from_type() API

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  9231967e2f515fce9e19687c0c40dfda416b3512
Gitweb: https://git.kernel.org/tip/9231967e2f515fce9e19687c0c40dfda416b3512
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:13 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Remove tep_data_event_from_type() API

In order to make libtraceevent into a proper library, its API
should be straightforward.

After discussion with Steven Rostedt, we decided to remove the
tep_data_event_from_type() API and to replace it with tep_find_event(),
as it does the same.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.913841...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse.c | 12 
 tools/lib/traceevent/event-parse.h |  1 -
 2 files changed, 13 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 54d94054eef0..abd4fa5d3088 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5265,18 +5265,6 @@ int tep_data_type(struct tep_handle *pevent, struct 
tep_record *rec)
return trace_parse_common_type(pevent, rec->data);
 }
 
-/**
- * tep_data_event_from_type - find the event by a given type
- * @pevent: a handle to the pevent
- * @type: the type of the event.
- *
- * This returns the event form a given @type;
- */
-struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int type)
-{
-   return tep_find_event(pevent, type);
-}
-
 /**
  * tep_data_pid - parse the PID from record
  * @pevent: a handle to the pevent
diff --git a/tools/lib/traceevent/event-parse.h 
b/tools/lib/traceevent/event-parse.h
index bd1bd9a27839..aec48f2aea8a 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -526,7 +526,6 @@ tep_find_event_by_record(struct tep_handle *pevent, struct 
tep_record *record);
 void tep_data_lat_fmt(struct tep_handle *pevent,
  struct trace_seq *s, struct tep_record *record);
 int tep_data_type(struct tep_handle *pevent, struct tep_record *rec);
-struct tep_event *tep_data_event_from_type(struct tep_handle *pevent, int 
type);
 int tep_data_pid(struct tep_handle *pevent, struct tep_record *rec);
 int tep_data_preempt_count(struct tep_handle *pevent, struct tep_record *rec);
 int tep_data_flags(struct tep_handle *pevent, struct tep_record *rec);

[tip:perf/urgent] tools lib traceevent: Rename tep_is_file_bigendian() to tep_file_bigendian()

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  4104e604277016b3e6a7d120368054f9d2716953
Gitweb: https://git.kernel.org/tip/4104e604277016b3e6a7d120368054f9d2716953
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:12 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Rename tep_is_file_bigendian() to tep_file_bigendian()

In order to make libtraceevent into a proper library, its API
should be straightforward.

After a discussion with Steven Rostedt, we decided to rename a few APIs,
to have more intuitive names.

This patch renames tep_is_file_bigendian() to tep_file_bigendian().

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.767549...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse-api.c | 4 ++--
 tools/lib/traceevent/event-parse.h | 2 +-
 tools/lib/traceevent/plugin_kvm.c  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/lib/traceevent/event-parse-api.c 
b/tools/lib/traceevent/event-parse-api.c
index 8b31c0e00ba3..d463761a58f4 100644
--- a/tools/lib/traceevent/event-parse-api.c
+++ b/tools/lib/traceevent/event-parse-api.c
@@ -194,13 +194,13 @@ void tep_set_page_size(struct tep_handle *pevent, int 
_page_size)
 }
 
 /**
- * tep_is_file_bigendian - get if the file is in big endian order
+ * tep_file_bigendian - get if the file is in big endian order
  * @pevent: a handle to the tep_handle
  *
  * This returns if the file is in big endian order
  * If @pevent is NULL, 0 is returned.
  */
-int tep_is_file_bigendian(struct tep_handle *pevent)
+int tep_file_bigendian(struct tep_handle *pevent)
 {
if(pevent)
return pevent->file_bigendian;
diff --git a/tools/lib/traceevent/event-parse.h 
b/tools/lib/traceevent/event-parse.h
index ac377ae99008..bd1bd9a27839 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -559,7 +559,7 @@ int tep_get_long_size(struct tep_handle *pevent);
 void tep_set_long_size(struct tep_handle *pevent, int long_size);
 int tep_get_page_size(struct tep_handle *pevent);
 void tep_set_page_size(struct tep_handle *pevent, int _page_size);
-int tep_is_file_bigendian(struct tep_handle *pevent);
+int tep_file_bigendian(struct tep_handle *pevent);
 void tep_set_file_bigendian(struct tep_handle *pevent, enum tep_endian endian);
 int tep_is_host_bigendian(struct tep_handle *pevent);
 void tep_set_host_bigendian(struct tep_handle *pevent, enum tep_endian endian);
diff --git a/tools/lib/traceevent/plugin_kvm.c 
b/tools/lib/traceevent/plugin_kvm.c
index 754050eea467..64b9c25a1fd3 100644
--- a/tools/lib/traceevent/plugin_kvm.c
+++ b/tools/lib/traceevent/plugin_kvm.c
@@ -389,7 +389,7 @@ static int kvm_mmu_print_role(struct trace_seq *s, struct 
tep_record *record,
 * We can only use the structure if file is of the same
 * endianness.
 */
-   if (tep_is_file_bigendian(event->pevent) ==
+   if (tep_file_bigendian(event->pevent) ==
tep_is_host_bigendian(event->pevent)) {
 
trace_seq_printf(s, "%u q%u%s %s%s %spae %snxe %swp%s%s%s",

[tip:perf/urgent] tools lib traceevent: Changed return logic of tep_register_event_handler() API

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  f87ce7c43f36d4abff91b19edadd23939f99ff98
Gitweb: https://git.kernel.org/tip/f87ce7c43f36d4abff91b19edadd23939f99ff98
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:11 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Changed return logic of tep_register_event_handler() API

In order to make libtraceevent into a proper library, its API
should be straightforward.

The tep_register_event_handler() functions returns -1 in case it
successfully registers the new event handler. Such return code is used
by the other library APIs in case of an error.

To unify the return logic of tep_register_event_handler() with the other
APIs, this patch introduces enum tep_reg_handler, which is used by this
function as return value, to handle all possible successful return
cases.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.628034...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse.c | 10 --
 tools/lib/traceevent/event-parse.h |  5 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index a850342baf86..54d94054eef0 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -6632,6 +6632,12 @@ static struct tep_event *search_event(struct tep_handle 
*pevent, int id,
  *
  * If @id is >= 0, then it is used to find the event.
  * else @sys_name and @event_name are used.
+ *
+ * Returns:
+ *  TEP_REGISTER_SUCCESS_OVERWRITE if an existing handler is overwritten
+ *  TEP_REGISTER_SUCCESS if a new handler is registered successfully
+ *  negative TEP_ERRNO_... in case of an error
+ *
  */
 int tep_register_event_handler(struct tep_handle *pevent, int id,
   const char *sys_name, const char *event_name,
@@ -6649,7 +6655,7 @@ int tep_register_event_handler(struct tep_handle *pevent, 
int id,
 
event->handler = func;
event->context = context;
-   return 0;
+   return TEP_REGISTER_SUCCESS_OVERWRITE;
 
  not_found:
/* Save for later use. */
@@ -6679,7 +6685,7 @@ int tep_register_event_handler(struct tep_handle *pevent, 
int id,
pevent->handlers = handle;
handle->context = context;
 
-   return -1;
+   return TEP_REGISTER_SUCCESS;
 }
 
 static int handle_matches(struct event_handler *handler, int id,
diff --git a/tools/lib/traceevent/event-parse.h 
b/tools/lib/traceevent/event-parse.h
index 77a4a1dd4b4d..ac377ae99008 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -485,6 +485,11 @@ int tep_print_func_field(struct trace_seq *s, const char 
*fmt,
 struct tep_event *event, const char *name,
 struct tep_record *record, int err);
 
+enum tep_reg_handler {
+   TEP_REGISTER_SUCCESS = 0,
+   TEP_REGISTER_SUCCESS_OVERWRITE,
+};
+
 int tep_register_event_handler(struct tep_handle *pevent, int id,
   const char *sys_name, const char *event_name,
   tep_event_handler_func func, void *context);

[tip:perf/urgent] tools lib traceevent: Changed return logic of trace_seq_printf() and trace_seq_vprintf() APIs

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  6d2d6fd7e3ee0daf0d8308741792b3ec41aafd0c
Gitweb: https://git.kernel.org/tip/6d2d6fd7e3ee0daf0d8308741792b3ec41aafd0c
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:10 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Changed return logic of trace_seq_printf() and 
trace_seq_vprintf() APIs

In order to make libtraceevent into a proper library, its API should be
straightforward.

The trace_seq_printf() and trace_seq_vprintf() APIs have inconsistent
returned values with the other trace_seq_* APIs.

This path changes the return logic of trace_seq_printf() and
trace_seq_vprintf() to return the number of printed characters, as the
other trace_seq_* related APIs.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.485792...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/trace-seq.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/lib/traceevent/trace-seq.c b/tools/lib/traceevent/trace-seq.c
index 8ff1d55954d1..8d5ecd2bf877 100644
--- a/tools/lib/traceevent/trace-seq.c
+++ b/tools/lib/traceevent/trace-seq.c
@@ -100,7 +100,8 @@ static void expand_buffer(struct trace_seq *s)
  * @fmt: printf format string
  *
  * It returns 0 if the trace oversizes the buffer's free
- * space, 1 otherwise.
+ * space, the number of characters printed, or a negative
+ * value in case of an error.
  *
  * The tracer may use either sequence operations or its own
  * copy to user routines. To simplify formating of a trace
@@ -129,9 +130,10 @@ trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
goto try_again;
}
 
-   s->len += ret;
+   if (ret > 0)
+   s->len += ret;
 
-   return 1;
+   return ret;
 }
 
 /**
@@ -139,6 +141,10 @@ trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
  * @s: trace sequence descriptor
  * @fmt: printf format string
  *
+ * It returns 0 if the trace oversizes the buffer's free
+ * space, the number of characters printed, or a negative
+ * value in case of an error.
+ * *
  * The tracer may use either sequence operations or its own
  * copy to user routines. To simplify formating of a trace
  * trace_seq_printf is used to store strings into a special
@@ -163,9 +169,10 @@ trace_seq_vprintf(struct trace_seq *s, const char *fmt, 
va_list args)
goto try_again;
}
 
-   s->len += ret;
+   if (ret > 0)
+   s->len += ret;
 
-   return len;
+   return ret;
 }
 
 /**

[tip:perf/urgent] tools lib traceevent: Initialize host_bigendian at tep_handle allocation

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  eed14f4b075ec594ac09921b998bf3dd61f5886b
Gitweb: https://git.kernel.org/tip/eed14f4b075ec594ac09921b998bf3dd61f5886b
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:08 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Initialize host_bigendian at tep_handle allocation

This patch initializes the host_bigendian member of the tep_handle
structure with the byte order of the current host, when this handler is
created - in tep_alloc() API. We need this in order to remove the
tep_set_host_bigendian() API.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.216292...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 156e513074b2..44b80471b024 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -6762,8 +6762,10 @@ struct tep_handle *tep_alloc(void)
 {
struct tep_handle *pevent = calloc(1, sizeof(*pevent));
 
-   if (pevent)
+   if (pevent) {
pevent->ref_count = 1;
+   pevent->host_bigendian = tep_host_bigendian();
+   }
 
return pevent;
 }

[tip:perf/urgent] tools lib traceevent: Rename struct cmdline to struct tep_cmdline

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  2e4318a287bdf815140462257ab8697f5289a12f
Gitweb: https://git.kernel.org/tip/2e4318a287bdf815140462257ab8697f5289a12f
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 23:08:09 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Rename struct cmdline to struct tep_cmdline

In order to make libtraceevent a proper library, variables, data
structures and functions should have a unique prefix to prevent name
space conflicts. That prefix will be "tep_".

This patch renames 'struct cmdline' to 'struct tep_cmdline'.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181201040852.358871...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse-local.h |  4 ++--
 tools/lib/traceevent/event-parse.c   | 36 
 tools/lib/traceevent/event-parse.h   |  8 +++
 3 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/tools/lib/traceevent/event-parse-local.h 
b/tools/lib/traceevent/event-parse-local.h
index 9a092dd4a86d..35833ee32d6c 100644
--- a/tools/lib/traceevent/event-parse-local.h
+++ b/tools/lib/traceevent/event-parse-local.h
@@ -7,7 +7,7 @@
 #ifndef _PARSE_EVENTS_INT_H
 #define _PARSE_EVENTS_INT_H
 
-struct cmdline;
+struct tep_cmdline;
 struct cmdline_list;
 struct func_map;
 struct func_list;
@@ -36,7 +36,7 @@ struct tep_handle {
int long_size;
int page_size;
 
-   struct cmdline *cmdlines;
+   struct tep_cmdline *cmdlines;
struct cmdline_list *cmdlist;
int cmdline_count;
 
diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 44b80471b024..a850342baf86 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -124,15 +124,15 @@ struct tep_print_arg *alloc_arg(void)
return calloc(1, sizeof(struct tep_print_arg));
 }
 
-struct cmdline {
+struct tep_cmdline {
char *comm;
int pid;
 };
 
 static int cmdline_cmp(const void *a, const void *b)
 {
-   const struct cmdline *ca = a;
-   const struct cmdline *cb = b;
+   const struct tep_cmdline *ca = a;
+   const struct tep_cmdline *cb = b;
 
if (ca->pid < cb->pid)
return -1;
@@ -152,7 +152,7 @@ static int cmdline_init(struct tep_handle *pevent)
 {
struct cmdline_list *cmdlist = pevent->cmdlist;
struct cmdline_list *item;
-   struct cmdline *cmdlines;
+   struct tep_cmdline *cmdlines;
int i;
 
cmdlines = malloc(sizeof(*cmdlines) * pevent->cmdline_count);
@@ -179,8 +179,8 @@ static int cmdline_init(struct tep_handle *pevent)
 
 static const char *find_cmdline(struct tep_handle *pevent, int pid)
 {
-   const struct cmdline *comm;
-   struct cmdline key;
+   const struct tep_cmdline *comm;
+   struct tep_cmdline key;
 
if (!pid)
return "";
@@ -208,8 +208,8 @@ static const char *find_cmdline(struct tep_handle *pevent, 
int pid)
  */
 int tep_pid_is_registered(struct tep_handle *pevent, int pid)
 {
-   const struct cmdline *comm;
-   struct cmdline key;
+   const struct tep_cmdline *comm;
+   struct tep_cmdline key;
 
if (!pid)
return 1;
@@ -235,9 +235,9 @@ int tep_pid_is_registered(struct tep_handle *pevent, int 
pid)
 static int add_new_comm(struct tep_handle *pevent,
const char *comm, int pid, bool override)
 {
-   struct cmdline *cmdlines = pevent->cmdlines;
-   struct cmdline *cmdline;
-   struct cmdline key;
+   struct tep_cmdline *cmdlines = pevent->cmdlines;
+   struct tep_cmdline *cmdline;
+   struct tep_cmdline key;
char *new_comm;
 
if (!pid)
@@ -5331,8 +5331,8 @@ const char *tep_data_comm_from_pid(struct tep_handle 
*pevent, int pid)
return comm;
 }
 
-static struct cmdline *
-pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct cmdline 
*next)
+static struct tep_cmdline *
+pid_from_cmdlist(struct tep_handle *pevent, const char *comm, struct 
tep_cmdline *next)
 {
struct cmdline_list *cmdlist = (struct cmdline_list *)next;
 
@@ -5344,7 +5344,7 @@ pid_from_cmdlist(struct tep_handle *pevent, const char 
*comm, struct cmdline *ne
while (cmdlist && strcmp(cmdlist->comm, comm) != 0)
cmdlist = cmdlist->next;
 
-   return (struct cmdline *)cmdlist;
+   return (struct tep_cmdline *)cmdlist;
 }
 
 /**
@@ -5360,10 +5360,10 @@ pid_from_cmdlist(struct tep_handle *pevent, const char 
*comm, struct cmdline *ne
  * next pid.
  * Also, it does a linear search, so it may be slow.
  */
-struct cmdline *tep_data_pid_from_comm(struct tep_handle *pevent, const char 
*comm,
-  struct cmdline *next)
+struct tep_cmdline *tep_data_pid_from_comm(struct tep_handle

[tip:perf/urgent] perf tests: Add a test for the ARM 32-bit [vectors] page

2019-01-08 Thread tip-bot for Florian Fainelli

Commit-ID:  21327c7843e9169d5e2e527713e60e6c9842a56c
Gitweb: https://git.kernel.org/tip/21327c7843e9169d5e2e527713e60e6c9842a56c
Author: Florian Fainelli 
AuthorDate: Thu, 20 Dec 2018 19:43:37 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

perf tests: Add a test for the ARM 32-bit [vectors] page

perf on ARM requires CONFIG_KUSER_HELPERS to be turned on to allow some
independance with respect to the ARM CPU being used. Add a test which
tries to locate the [vectors] page, created when CONFIG_KUSER_HELPERS is
turned on to help asses the system's health.

Signed-off-by: Florian Fainelli 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Chris Healy 
Cc: Greg Kroah-Hartman 
Cc: Kim Phillips 
Cc: Lucas Stach 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Thomas Richter 
Link: http://lkml.kernel.org/r/20181221034337.26663-3-f.faine...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/arch/arm/tests/Build  |  1 +
 tools/perf/arch/arm/tests/arch-tests.c   |  4 
 tools/perf/arch/arm/tests/vectors-page.c | 24 
 tools/perf/tests/tests.h |  5 +
 4 files changed, 34 insertions(+)

diff --git a/tools/perf/arch/arm/tests/Build b/tools/perf/arch/arm/tests/Build
index 883c57ff0c08..d9ae2733f9cc 100644
--- a/tools/perf/arch/arm/tests/Build
+++ b/tools/perf/arch/arm/tests/Build
@@ -1,4 +1,5 @@
 libperf-y += regs_load.o
 libperf-y += dwarf-unwind.o
+libperf-y += vectors-page.o
 
 libperf-y += arch-tests.o
diff --git a/tools/perf/arch/arm/tests/arch-tests.c 
b/tools/perf/arch/arm/tests/arch-tests.c
index 5b1543c98022..6848101a855f 100644
--- a/tools/perf/arch/arm/tests/arch-tests.c
+++ b/tools/perf/arch/arm/tests/arch-tests.c
@@ -10,6 +10,10 @@ struct test arch_tests[] = {
.func = test__dwarf_unwind,
},
 #endif
+   {
+   .desc = "Vectors page",
+   .func = test__vectors_page,
+   },
{
.func = NULL,
},
diff --git a/tools/perf/arch/arm/tests/vectors-page.c 
b/tools/perf/arch/arm/tests/vectors-page.c
new file mode 100644
index ..7ffdd79971c8
--- /dev/null
+++ b/tools/perf/arch/arm/tests/vectors-page.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+
+#include "debug.h"
+#include "tests/tests.h"
+#include "util/find-map.c"
+
+#define VECTORS__MAP_NAME "[vectors]"
+
+int test__vectors_page(struct test *test __maybe_unused,
+  int subtest __maybe_unused)
+{
+   void *start, *end;
+
+   if (find_map(, , VECTORS__MAP_NAME)) {
+   pr_err("%s not found, is CONFIG_KUSER_HELPERS enabled?\n",
+  VECTORS__MAP_NAME);
+   return TEST_FAIL;
+   }
+
+   return TEST_OK;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index b82f55fcc294..399f18ca71a3 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -119,4 +119,9 @@ int test__arch_unwind_sample(struct perf_sample *sample,
 struct thread *thread);
 #endif
 #endif
+
+#if defined(__arm__)
+int test__vectors_page(struct test *test, int subtest);
+#endif
+
 #endif /* TESTS_H */

[tip:perf/urgent] tools lib traceevent: Introduce new libtracevent API: tep_override_comm()

2019-01-08 Thread tip-bot for Tzvetomir Stoyanov

Commit-ID:  ca3958b1c0968a6f3105e211355f128ce871e796
Gitweb: https://git.kernel.org/tip/ca3958b1c0968a6f3105e211355f128ce871e796
Author: Tzvetomir Stoyanov 
AuthorDate: Fri, 30 Nov 2018 10:44:11 -0500
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

tools lib traceevent: Introduce new libtracevent API: tep_override_comm()

This patch adds a new API of tracevent library: tep_override_comm() It
registers a pid / command mapping. If a mapping with the same pid
already exists, the entry is updated with the new command.

Signed-off-by: Tzvetomir Stoyanov 
Cc: Andrew Morton 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Link: http://lkml.kernel.org/r/20181130154648.038915...@goodmis.org
Signed-off-by: Steven Rostedt (VMware) 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/traceevent/event-parse.c | 69 +-
 tools/lib/traceevent/event-parse.h |  1 +
 2 files changed, 55 insertions(+), 15 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c 
b/tools/lib/traceevent/event-parse.c
index 69a96e39f0ab..156e513074b2 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -232,11 +232,13 @@ int tep_pid_is_registered(struct tep_handle *pevent, int 
pid)
  * we must add this pid. This is much slower than when cmdlines
  * are added before the array is initialized.
  */
-static int add_new_comm(struct tep_handle *pevent, const char *comm, int pid)
+static int add_new_comm(struct tep_handle *pevent,
+   const char *comm, int pid, bool override)
 {
struct cmdline *cmdlines = pevent->cmdlines;
-   const struct cmdline *cmdline;
+   struct cmdline *cmdline;
struct cmdline key;
+   char *new_comm;
 
if (!pid)
return 0;
@@ -247,8 +249,19 @@ static int add_new_comm(struct tep_handle *pevent, const 
char *comm, int pid)
cmdline = bsearch(, pevent->cmdlines, pevent->cmdline_count,
   sizeof(*pevent->cmdlines), cmdline_cmp);
if (cmdline) {
-   errno = EEXIST;
-   return -1;
+   if (!override) {
+   errno = EEXIST;
+   return -1;
+   }
+   new_comm = strdup(comm);
+   if (!new_comm) {
+   errno = ENOMEM;
+   return -1;
+   }
+   free(cmdline->comm);
+   cmdline->comm = new_comm;
+
+   return 0;
}
 
cmdlines = realloc(cmdlines, sizeof(*cmdlines) * (pevent->cmdline_count 
+ 1));
@@ -275,21 +288,13 @@ static int add_new_comm(struct tep_handle *pevent, const 
char *comm, int pid)
return 0;
 }
 
-/**
- * tep_register_comm - register a pid / comm mapping
- * @pevent: handle for the pevent
- * @comm: the command line to register
- * @pid: the pid to map the command line to
- *
- * This adds a mapping to search for command line names with
- * a given pid. The comm is duplicated.
- */
-int tep_register_comm(struct tep_handle *pevent, const char *comm, int pid)
+static int _tep_register_comm(struct tep_handle *pevent,
+ const char *comm, int pid, bool override)
 {
struct cmdline_list *item;
 
if (pevent->cmdlines)
-   return add_new_comm(pevent, comm, pid);
+   return add_new_comm(pevent, comm, pid, override);
 
item = malloc(sizeof(*item));
if (!item)
@@ -312,6 +317,40 @@ int tep_register_comm(struct tep_handle *pevent, const 
char *comm, int pid)
return 0;
 }
 
+/**
+ * tep_register_comm - register a pid / comm mapping
+ * @pevent: handle for the pevent
+ * @comm: the command line to register
+ * @pid: the pid to map the command line to
+ *
+ * This adds a mapping to search for command line names with
+ * a given pid. The comm is duplicated. If a command with the same pid
+ * already exist, -1 is returned and errno is set to EEXIST
+ */
+int tep_register_comm(struct tep_handle *pevent, const char *comm, int pid)
+{
+   return _tep_register_comm(pevent, comm, pid, false);
+}
+
+/**
+ * tep_override_comm - register a pid / comm mapping
+ * @pevent: handle for the pevent
+ * @comm: the command line to register
+ * @pid: the pid to map the command line to
+ *
+ * This adds a mapping to search for command line names with
+ * a given pid. The comm is duplicated. If a command with the same pid
+ * already exist, the command string is udapted with the new one
+ */
+int tep_override_comm(struct tep_handle *pevent, const char *comm, int pid)
+{
+   if (!pevent->cmdlines && cmdline_init(pevent)) {
+   errno = ENOMEM;
+   return -1;
+   }
+   return _tep_register_comm(pevent, comm, pid, true);
+}
+
 int tep_register_trace_clock(struct tep_handle *pevent, const char 
*trace_clock)
 {
pevent->trace_clock = strdup(trace_clock);
diff --git a/tools/lib/traceevent/event-parse.h

[tip:perf/urgent] perf tools: Make find_vdso_map() more modular

2019-01-08 Thread tip-bot for Florian Fainelli

Commit-ID:  011532379b7c2de6757e129037bdfc8d704bce23
Gitweb: https://git.kernel.org/tip/011532379b7c2de6757e129037bdfc8d704bce23
Author: Florian Fainelli 
AuthorDate: Thu, 20 Dec 2018 19:43:36 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:13 -0300

perf tools: Make find_vdso_map() more modular

In preparation for checking that the vectors page on the ARM
architecture, refactor the find_vdso_map() function to accept finding an
arbitrary string and create a dedicated helper function for that under
util/find-map.c and update the filename to find-map.c and all references
to it: perf-read-vdso.c and util/vdso.c.

Signed-off-by: Florian Fainelli 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Chris Healy 
Cc: Greg Kroah-Hartman 
Cc: Kim Phillips 
Cc: Lucas Stach 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Thomas Richter 
Link: http://lkml.kernel.org/r/20181221034337.26663-2-f.faine...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Makefile.perf| 4 ++--
 tools/perf/perf-read-vdso.c | 6 +++---
 tools/perf/util/{find-vdso-map.c => find-map.c} | 7 +++
 tools/perf/util/vdso.c  | 6 +++---
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 2921f829a0f4..0ee6795d82cc 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -662,12 +662,12 @@ $(OUTPUT)perf-%: %.o $(PERFLIBS)
$(QUIET_LINK)$(CC) $(CFLAGS) -o $@ $(LDFLAGS) $(filter %.o,$^) $(LIBS)
 
 ifndef NO_PERF_READ_VDSO32
-$(OUTPUT)perf-read-vdso32: perf-read-vdso.c util/find-vdso-map.c
+$(OUTPUT)perf-read-vdso32: perf-read-vdso.c util/find-map.c
$(QUIET_CC)$(CC) -m32 $(filter -static,$(LDFLAGS)) -Wall -Werror -o $@ 
perf-read-vdso.c
 endif
 
 ifndef NO_PERF_READ_VDSOX32
-$(OUTPUT)perf-read-vdsox32: perf-read-vdso.c util/find-vdso-map.c
+$(OUTPUT)perf-read-vdsox32: perf-read-vdso.c util/find-map.c
$(QUIET_CC)$(CC) -mx32 $(filter -static,$(LDFLAGS)) -Wall -Werror -o $@ 
perf-read-vdso.c
 endif
 
diff --git a/tools/perf/perf-read-vdso.c b/tools/perf/perf-read-vdso.c
index 8c0ca0cc428f..aaa5210ea84a 100644
--- a/tools/perf/perf-read-vdso.c
+++ b/tools/perf/perf-read-vdso.c
@@ -5,17 +5,17 @@
 #define VDSO__MAP_NAME "[vdso]"
 
 /*
- * Include definition of find_vdso_map() also used in util/vdso.c for
+ * Include definition of find_map() also used in util/vdso.c for
  * building perf.
  */
-#include "util/find-vdso-map.c"
+#include "util/find-map.c"
 
 int main(void)
 {
void *start, *end;
size_t size, written;
 
-   if (find_vdso_map(, ))
+   if (find_map(, , VDSO__MAP_NAME))
return 1;
 
size = end - start;
diff --git a/tools/perf/util/find-vdso-map.c b/tools/perf/util/find-map.c
similarity index 71%
rename from tools/perf/util/find-vdso-map.c
rename to tools/perf/util/find-map.c
index d7823e3508fc..7b2300588ece 100644
--- a/tools/perf/util/find-vdso-map.c
+++ b/tools/perf/util/find-map.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-static int find_vdso_map(void **start, void **end)
+static int find_map(void **start, void **end, const char *name)
 {
FILE *maps;
char line[128];
@@ -7,7 +7,7 @@ static int find_vdso_map(void **start, void **end)
 
maps = fopen("/proc/self/maps", "r");
if (!maps) {
-   fprintf(stderr, "vdso: cannot open maps\n");
+   fprintf(stderr, "cannot open maps\n");
return -1;
}
 
@@ -21,8 +21,7 @@ static int find_vdso_map(void **start, void **end)
if (m < 0)
continue;
 
-   if (!strncmp([m], VDSO__MAP_NAME,
-sizeof(VDSO__MAP_NAME) - 1))
+   if (!strncmp([m], name, strlen(name)))
found = 1;
}
 
diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 741af209b19d..3702cba11d7d 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -18,10 +18,10 @@
 #include "debug.h"
 
 /*
- * Include definition of find_vdso_map() also used in perf-read-vdso.c for
+ * Include definition of find_map() also used in perf-read-vdso.c for
  * building perf-read-vdso32 and perf-read-vdsox32.
  */
-#include "find-vdso-map.c"
+#include "find-map.c"
 
 #define VDSO__TEMP_FILE_NAME "/tmp/perf-vdso.so-XX"
 
@@ -76,7 +76,7 @@ static char *get_file(struct vdso_file *vdso_file)
if (vdso_file->found)
return vdso_file->temp_file_name;
 
-   if (vdso_file->error || find_vdso_map(, ))
+   if (vdso_file->error || find_map(, , VDSO__MAP_NAME))
return NULL;
 
size = end - start;

[PATCH v3 1/4] usb: gadget: uvc: synchronize streamon/off with uvc_function_set_alt

2019-01-08 Thread Paul Elder

If the streamon ioctl is issued while the stream is already on, then the
kernel BUGs. This happens at the BUG_ON in uvc_video_alloc_requests
within the call stack from the ioctl handler for VIDIOC_STREAMON.

This could happen when uvc_function_set_alt 0 races and wins against
uvc_v4l2_streamon, or when userspace neglects to issue the
VIDIOC_STREAMOFF ioctl.

To fix this, add two more uvc states: starting and stopping. Use these
to prevent the racing, and to detect when VIDIOC_STREAMON is issued
without previously issuing VIDIOC_STREAMOFF.

Signed-off-by: Paul Elder 
---
Changes in v3:

- add state guard to uvc_function_set_alt 1
- add documentation for newly added uvc states
- reorder uvc states to more or less follow the flow diagram
- add more state guards to ioctl handlers for streamon and streamoff

Changes in v2: Nothing

 drivers/usb/gadget/function/f_uvc.c| 17 
 drivers/usb/gadget/function/uvc.h  | 37 ++
 drivers/usb/gadget/function/uvc_v4l2.c | 26 --
 3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index 8c99392df593..2ec3b73b2b75 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -317,26 +317,31 @@ uvc_function_set_alt(struct usb_function *f, unsigned 
interface, unsigned alt)
 
switch (alt) {
case 0:
-   if (uvc->state != UVC_STATE_STREAMING)
+   if (uvc->state != UVC_STATE_STREAMING &&
+   uvc->state != UVC_STATE_STARTING)
return 0;
 
if (uvc->video.ep)
usb_ep_disable(uvc->video.ep);
 
+   uvc->state = UVC_STATE_STOPPING;
+
memset(_event, 0, sizeof(v4l2_event));
v4l2_event.type = UVC_EVENT_STREAMOFF;
v4l2_event_queue(>vdev, _event);
 
-   uvc->state = UVC_STATE_CONNECTED;
return 0;
 
case 1:
-   if (uvc->state != UVC_STATE_CONNECTED)
-   return 0;
-
if (!uvc->video.ep)
return -EINVAL;
 
+   if (uvc->state == UVC_STATE_STOPPING)
+   return -EINVAL;
+
+   if (uvc->state != UVC_STATE_CONNECTED)
+   return 0;
+
uvcg_info(f, "reset UVC\n");
usb_ep_disable(uvc->video.ep);
 
@@ -346,6 +351,8 @@ uvc_function_set_alt(struct usb_function *f, unsigned 
interface, unsigned alt)
return ret;
usb_ep_enable(uvc->video.ep);
 
+   uvc->state = UVC_STATE_STARTING;
+
memset(_event, 0, sizeof(v4l2_event));
v4l2_event.type = UVC_EVENT_STREAMON;
v4l2_event_queue(>vdev, _event);
diff --git a/drivers/usb/gadget/function/uvc.h 
b/drivers/usb/gadget/function/uvc.h
index 099d650082e5..f183e349499c 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -101,10 +101,47 @@ struct uvc_video {
unsigned int fid;
 };
 
+/**
+ * enum uvc_state - the states of a struct uvc_device
+ * @UVC_STATE_DISCONNECTED: not connected state
+ * - transition to connected state on .set_alt
+ * @UVC_STATE_CONNECTED:connected state
+ * - transition to disconnected state on .disable
+ *   and alloc
+ * - transition to starting state on .set_alt 1
+ * @UVC_STATE_STARTING: starting state
+ * - transition to streaming state on streamon ioctl
+ * - transition to stopping state on set_alt 0
+ * @UVC_STATE_STREAMING:streaming state
+ * - transition to stopping state on .set_alt 0
+ * @UVC_STATE_STOPPING: stopping state
+ * - transition to connected on streamoff ioctl
+ *
+ * Diagram of state transitions:
+ *
+ *  disable
+ *   +---+
+ *   v   |
+ * +--+  set_alt   +---+
+ * | DISCONNECTED | -> | CONNECTED |
+ * +--++---+
+ *| ^
+ * set_alt 1  | | streamoff
+ * +--+ +
+ * V|
+ *  +--+  streamon   +---+  set_alt 0   +--+
+ *  | STARTING | --> | STREAMING | ---> | STOPPING |
+ *  +--+ +---+  +--+
+ * |^
+ * |   set_alt 0|
+ * ++
+ */
 enum uvc_state {
UVC_STATE_DISCONNECTED,
UVC_STATE_CONNECTED,
+

[tip:perf/urgent] perf trace: Fix alignment for [continued] lines

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  ac6e022cbfdce215ad545e91d9827060855da3d7
Gitweb: https://git.kernel.org/tip/ac6e022cbfdce215ad545e91d9827060855da3d7
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 7 Jan 2019 16:54:38 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:12 -0300

perf trace: Fix alignment for [continued] lines

We were not taking into account the "... [continued]" printed
characters, fix it.

Cc: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-qt20y0acmf8k0bzisce8k...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-trace.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index b8bf5d025563..ed4583128b9c 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2032,9 +2032,10 @@ static int trace__sys_exit(struct trace *trace, struct 
perf_evsel *evsel,
if (ttrace->entry_pending) {
printed = fprintf(trace->output, "%s", ttrace->entry_str);
} else {
-   fprintf(trace->output, " ... [");
+   printed += fprintf(trace->output, " ... [");
color_fprintf(trace->output, PERF_COLOR_YELLOW, "continued");
-   fprintf(trace->output, "]: %s()", sc->name);
+   printed += 9;
+   printed += fprintf(trace->output, "]: %s()", sc->name);
}
 
printed++; /* the closing ')' */

[PATCH v3 3/4] usb: gadget: uvc: disable stream when disconnected

2019-01-08 Thread Paul Elder

If the streamon ioctl is issued while the stream is already on, then the
kernel BUGs. This happens at the BUG_ON in uvc_video_alloc_requests
within the call stack from the ioctl handler for VIDIOC_STREAMON.

This can also be triggered by starting the stream and then physically
disconnecting usb. To fix this, do the streamoff procedures on usb
disconnect. Since uvcg_video_enable is not interrupt-safe, add an
interrupt-safe version uvcg_video_cancel, and use that.

Signed-off-by: Paul Elder 
v2 Reviewed-by: Kieran Bingham 
---
Changes in v3:

- added interrupt-safe uvcg_video_cancel and used instead of the
  non-interrupt-save uvcg_video_enable 0

Changes in v2: Nothing

 drivers/usb/gadget/function/f_uvc.c |  3 +++
 drivers/usb/gadget/function/uvc_video.c | 13 +
 drivers/usb/gadget/function/uvc_video.h |  2 ++
 3 files changed, 18 insertions(+)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index b407b10a95ed..4134117b5f81 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -369,9 +369,12 @@ uvc_function_disable(struct usb_function *f)
 {
struct uvc_device *uvc = to_uvc(f);
struct v4l2_event v4l2_event;
+   struct uvc_video *video = >video;
 
uvcg_info(f, "%s()\n", __func__);
 
+   uvcg_video_cancel(video, 1);
+
memset(_event, 0, sizeof(v4l2_event));
v4l2_event.type = UVC_EVENT_DISCONNECT;
v4l2_event_queue(>vdev, _event);
diff --git a/drivers/usb/gadget/function/uvc_video.c 
b/drivers/usb/gadget/function/uvc_video.c
index 5c042f380708..5f3ed9e0b5ad 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -348,6 +348,19 @@ int uvcg_video_pump(struct uvc_video *video)
return 0;
 }
 
+int uvcg_video_cancel(struct uvc_video *video, int disconnect)
+{
+   unsigned int i;
+
+   for (i = 0; i < UVC_NUM_REQUESTS; ++i)
+   if (video->req[i])
+   usb_ep_dequeue(video->ep, video->req[i]);
+
+   uvc_video_free_requests(video);
+   uvcg_queue_cancel(>queue, disconnect);
+   return 0;
+}
+
 /*
  * Enable or disable the video stream.
  */
diff --git a/drivers/usb/gadget/function/uvc_video.h 
b/drivers/usb/gadget/function/uvc_video.h
index 278dc52c7604..1f310fa58ce5 100644
--- a/drivers/usb/gadget/function/uvc_video.h
+++ b/drivers/usb/gadget/function/uvc_video.h
@@ -16,6 +16,8 @@ struct uvc_video;
 
 int uvcg_video_pump(struct uvc_video *video);
 
+int uvcg_video_cancel(struct uvc_video *video, int disconnect);
+
 int uvcg_video_enable(struct uvc_video *video, int enable);
 
 int uvcg_video_init(struct uvc_video *video, struct uvc_device *uvc);
-- 
2.20.1

[PATCH v3 2/4] usb: gadget: uvc: don't delay the status phase of non-zero SET_INTERFACE requests

2019-01-08 Thread Paul Elder

Reception of a SET_INTERFACE request with a non-zero alternate setting
signals the start of the video stream. The gadget has to enable the
video streaming endpoint and to signal stream start to userspace, in
order to start receiving video frames to transmit over USB. As userspace
can be slow to react, the UVC function driver delays the status phase of
the SET_INTERFACE control transfer until userspace is ready.

The status phase is delayed by returning USB_GADGET_DELAYED_STATUS from
the function's .set_alt() handler. This creates a race condition as the
userspace application could process the stream start event before the
composite layer processes the USB_GADGET_DELAYED_STATUS return value.
The race has been observed in practice, and can't be solved without a
change to the USB_GADGET_DELAYED_STATUS API.

Fortunately the UVC function driver doesn't strictly require delaying
the status phase for non-zero SET_INTERFACE, as the only requirement
from a USB point of view is that the streaming endpoint must be enabled
before the status phase completes, and that is already guaranteed by the
current code. We can thus complete the status phase synchronously,
removing the race condition at the same time.

Without delaying the status phase the host will likely start issuing
isochronous transfers before we queue the first USB requests. The UDC
will reply with NAKs which should be handled properly by the host. If
this ends up causing issues another option will be to modify the status
phase delay API to fix the race condition.

Signed-off-by: Paul Elder 
---
Changes in v3: Nothing

Changes in v2:
1. Remove delay usb status phase

 drivers/usb/gadget/function/f_uvc.c| 3 ++-
 drivers/usb/gadget/function/uvc_v4l2.c | 6 --
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index 2ec3b73b2b75..b407b10a95ed 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -356,7 +356,8 @@ uvc_function_set_alt(struct usb_function *f, unsigned 
interface, unsigned alt)
memset(_event, 0, sizeof(v4l2_event));
v4l2_event.type = UVC_EVENT_STREAMON;
v4l2_event_queue(>vdev, _event);
-   return USB_GADGET_DELAYED_STATUS;
+
+   return 0;
 
default:
return -EINVAL;
diff --git a/drivers/usb/gadget/function/uvc_v4l2.c 
b/drivers/usb/gadget/function/uvc_v4l2.c
index 97e624214287..7816ea9886e1 100644
--- a/drivers/usb/gadget/function/uvc_v4l2.c
+++ b/drivers/usb/gadget/function/uvc_v4l2.c
@@ -210,12 +210,6 @@ uvc_v4l2_streamon(struct file *file, void *fh, enum 
v4l2_buf_type type)
 
uvc->state = UVC_STATE_STREAMING;
 
-   /*
-* Complete the alternate setting selection setup phase now that
-* userspace is ready to provide video frames.
-*/
-   uvc_function_setup_continue(uvc);
-
return 0;
 }
 
-- 
2.20.1

Re: Generic RGB LED support was Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver

2019-01-08 Thread Vesa Jääskeläinen


Hi Pavel,

On 09/01/2019 0.59, Pavel Machek wrote:

Hi!


Grab yourself an RGB LED and play with it; you'll see what the
problems are. It is hard to explain colors over email...


Video [0] gives some overview of lp5024 capabilities.

I don't see any problems in exposing separate red,green,blue
files and brightness for the devices with hardware support for
that.


Well, that's what we do today, as three separate LEDs, right?


No. It doesn't allow for setting color intensity by having
the color fixed beforehand. Below is relevant excerpt from
the lp5024 documentation. This is not something that can be
mapped to RGB color space, but rather to HSV/HSL, with the
reservation that the hardware implementation uses PWM
for setting color intensity.


So they have feature where they have independent controls for each
channel, then one common control per three channels. Other chips have
common control for all the LEDs, for example. We don't support that
currently; lets focus on the RGB thing first.


I don't have problem with that, either; other drivers already do
that. He's free to use existing same interface.

But that is insufficient, as it does not allow simple stuff, such as
turning led "white".

So... perhaps we should agree on requirements, first, and then we can
discuss solutions?

Requirements for RGB LED interface:

1) Userspace should be able to set the white color

2) Userspace should be able to arbitrary color from well known list
and it should approximately match what would CRT, LCD or OLED monitor display


The difference is that monitor display driver is pre-calibrated
for given display by the manufacturer. With the LED controllers the
manufacturer has no control over what LEDs will be connected to the
iouts. Therefore it should be not surprising that colors produced
by custom LEDs are not as user would expect when comparing to
the RGB color displayed on the monitor display.


It is true that _chip_ manufacturer can not know what LEDs will be
connected. But _system_ manufacturer can and should know that, and
should tell be able to tell us in the dts.


This renders your requirement 2) infeasible with use of custom LEDs
any fixed algorithm, since the final effect will always heavily depend
on the LED circuit design.


Depending on LED circuit design and actual LEDs connected is okay.. we
just need to get information from _system_ designer (not chip
designer), and pass it to a place where color is computed.


a) RGB LEDs are usually not balanced. Setting 100% PWM on
red/green/blue channels will result in nothing close to white
light. In fact, to get white light on N900, blue and green channel's
PWM needs to be set pretty low, as in 5%.

b) LED class does not define any relation between "brightness" in
sysfs and ammount of light in lumens. Some drivers use close to linear
relation, some use exponential relation. Human eyes percieve logarithm
of lumens. RGB color model uses even more complex function.


One general question: do you have any solutions in store?


I played with LEDs on N900 over the weekend, yes.

And getting reasonable colors seems to be possible, when a) and b) are
solved... a) seems to be more important than b).

Now... this does not tell us how we should design kernel<->user
interface, but it should tell us that main goals - 1) and 2) are
possible.


I was thinking about this calibration and color correctness thing and I 
am thinking a bit that it should be partly in kernel and partly in user 
space.


For displays and printers there are defined icc-profiles that define how 
colors are mapped to particular device in cases when you want to have 
accurate color representation. In theory to get accurate LED output one 
could model LEDs with icc profile and then pick your color and convert 
it with icc profile to actual raw hardware values. Then this raw 
hardware value could be written from user space to kernel.


In kernel we could provide raw hardware value support and in case of PWM 
IC controlled LEDs we could provide curve mapping to linearize the 
behavior of values entered to enable use in cases where close enough is 
good enough.


Eg. if you want to have "white" then you have your color space picker 
like sRGB(255,255,255). Then you would define icc profile it might 
translate to hardware raw values 253%, 230%, 225%.


Then you would write this to kernel with:

# set red, green and blue color elements
echo "253 230 225" > color

In this case however behavior of brightness node is challening in 
accuracy domain. Britghtness value 255 would of course provide 1:1 
mapping in this case.


To go right to correct color and brightness at one go we could have 
optional brightness at the end of color value array:


# set red, green and blue color element and brightness setting:
echo "253 230 225 255" > color

If you want to have fancier behavior for brightness in your system then 
we probably need to have configurable brightness model.


- hardware, like in lp5024 case would map to

[PATCH v3 0/4] usb: gadget: uvc: fix racing between uvc_function_set_alt and streamon/off

2019-01-08 Thread Paul Elder

Down the call stack from the ioctl handler for VIDIOC_STREAMON,
uvc_video_alloc_requests contains a BUG_ON, which in the high level,
triggers when VIDIOC_STREAMON ioctl is issued without VIDIOC_STREAMOFF
being issued previously.

This can happen in a few ways, such as if the userspace uvc gadget
application simply doesn't issue VIDIOC_STREAMOFF. Another way is if
uvc_function_set_alt with alt 0 is called after it is called with 1 but
before VIDIOC_STREAMON is called; in this case, UVC_EVENT_STREAMOFF will
not be queued to userspace, and therefore userspace will never call
VIDIOC_STREAMOFF.

To fix this, add two more uvc states: starting and stopping. The
starting state is entered when uvc_function_set_alt 1 is called, and is
exited in uvc_v4l2_streamon, when the state is changed to streaming. The
stopping state is entered when uvc_function_set_alt 0 is called, and is
exited in uvc_v4l2_streamoff, when the state is changed to connected.

The status phase of the SET_INTERFACE request doesn't need to be delayed
by the uvc gadget driver, so that is removed.

Finally, there is another way to trigger the aforementioned BUG: start
streaming and (physically) disconnect usb. To fix this, call
uvcg_video_enable 0 in uvc_function_disable.


Changes in v3:

- add state guard to uvc_function_set_alt 1
- add documentation for newly added uvc states
- reorder uvc states to more or less follow the flow diagram
- add more state guards to ioctl handlers for streamon and streamoff
- added interrupt-safe uvcg_video_cancel and used instead of the
  non-interrupt-save uvcg_video_enable 0 in uvc_function_disable

Changes in v2:
1. Remove delay usb status phase

Paul Elder (4):
  usb: gadget: uvc: synchronize streamon/off with uvc_function_set_alt
  usb: gadget: uvc: don't delay the status phase of non-zero
SET_INTERFACE requests
  usb: gadget: uvc: disable stream when disconnected
  usb: gadget: uvc: remove unused/duplicate function prototypes from
uvc.h

 drivers/usb/gadget/function/f_uvc.c | 23 
 drivers/usb/gadget/function/uvc.h   | 47 +++--
 drivers/usb/gadget/function/uvc_v4l2.c  | 28 +++
 drivers/usb/gadget/function/uvc_video.c | 13 +++
 drivers/usb/gadget/function/uvc_video.h |  2 ++
 5 files changed, 91 insertions(+), 22 deletions(-)

-- 
2.20.1

[PATCH v3 4/4] usb: gadget: uvc: remove unused/duplicate function prototypes from uvc.h

2019-01-08 Thread Paul Elder

Defined in uvc.h, uvc_endpoint_stream doesn't exist, and
uvc_function_connect, uvc_function_disconnect, and
uvc_function_setup_continue have duplicates in f_uvc.h.
Remove these four function prototypes from uvc.h

Signed-off-by: Paul Elder 
---
 drivers/usb/gadget/function/uvc.h | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/usb/gadget/function/uvc.h 
b/drivers/usb/gadget/function/uvc.h
index f183e349499c..671020c8a836 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -185,14 +185,4 @@ struct uvc_file_handle {
 #define to_uvc_file_handle(handle) \
container_of(handle, struct uvc_file_handle, vfh)
 
-/* 
- * Functions
- */
-
-extern void uvc_function_setup_continue(struct uvc_device *uvc);
-extern void uvc_endpoint_stream(struct uvc_device *dev);
-
-extern void uvc_function_connect(struct uvc_device *uvc);
-extern void uvc_function_disconnect(struct uvc_device *uvc);
-
 #endif /* _UVC_GADGET_H_ */
-- 
2.20.1

[PATCH v2] usb: gadget: musb: fix short isoc packets with inventra dma

2019-01-08 Thread Paul Elder

Handling short packets (length < max packet size) in the Inventra DMA
engine in the MUSB driver causes the MUSB DMA controller to hang. An
example of a problem that is caused by this problem is when streaming
video out of a UVC gadget, only the first video frame is transferred.

For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be
set manually by the driver. This was previously done in musb_g_tx
(musb_gadget.c), but incorrectly (all csr flags were cleared, and only
MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem
allows some requests to be transferred correctly, but multiple requests
were often put together in one USB packet, and caused problems if the
packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY
in dma_controller_irq (musbhsdma.c), just like host mode transfers.

This topic was originally tackled by Nicolas Boichat [0] [1] and is
discussed further at [2] as part of his GSoC project [3].

[0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU
[1] 
https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9
[2] 
http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html
[3] http://elinux.org/BeagleBoard/GSoC/USBSniffer

Signed-off-by: Paul Elder 
---
Changes in v2:

- no more flushing FIFO
- greatly simplified short packet if guard in musb_g_tx, and removed
  unnecessary variables
- minor indentation and wording changes

 drivers/usb/musb/musb_gadget.c | 19 +--
 drivers/usb/musb/musbhsdma.c   | 21 +++--
 2 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c
index eae8b1b1b45b..496643f54faa 100644
--- a/drivers/usb/musb/musb_gadget.c
+++ b/drivers/usb/musb/musb_gadget.c
@@ -452,13 +452,10 @@ void musb_g_tx(struct musb *musb, u8 epnum)
}
 
if (request) {
-   u8  is_dma = 0;
-   boolshort_packet = false;
 
trace_musb_req_tx(req);
 
if (dma && (csr & MUSB_TXCSR_DMAENAB)) {
-   is_dma = 1;
csr |= MUSB_TXCSR_P_WZC_BITS;
csr &= ~(MUSB_TXCSR_DMAENAB | MUSB_TXCSR_P_UNDERRUN |
 MUSB_TXCSR_TXPKTRDY | MUSB_TXCSR_AUTOSET);
@@ -476,16 +473,8 @@ void musb_g_tx(struct musb *musb, u8 epnum)
 */
if ((request->zero && request->length)
&& (request->length % musb_ep->packet_sz == 0)
-   && (request->actual == request->length))
-   short_packet = true;
+   && (request->actual == request->length)) {
 
-   if ((musb_dma_inventra(musb) || musb_dma_ux500(musb)) &&
-   (is_dma && (!dma->desired_mode ||
-   (request->actual &
-   (musb_ep->packet_sz - 1)
-   short_packet = true;
-
-   if (short_packet) {
/*
 * On DMA completion, FIFO may not be
 * available yet...
@@ -493,8 +482,10 @@ void musb_g_tx(struct musb *musb, u8 epnum)
if (csr & MUSB_TXCSR_TXPKTRDY)
return;
 
-   musb_writew(epio, MUSB_TXCSR, MUSB_TXCSR_MODE
-   | MUSB_TXCSR_TXPKTRDY);
+   musb_dbg(musb, "sending short pkt (zero=%d, length=%d, 
actual=%d, dma->desired_mode=%d)\n",
+request->zero, request->length, 
request->actual,
+dma->desired_mode);
+   musb_writew(epio, MUSB_TXCSR, csr | 
MUSB_TXCSR_TXPKTRDY);
request->zero = 0;
}
 
diff --git a/drivers/usb/musb/musbhsdma.c b/drivers/usb/musb/musbhsdma.c
index a688f7f87829..5fc6825745f2 100644
--- a/drivers/usb/musb/musbhsdma.c
+++ b/drivers/usb/musb/musbhsdma.c
@@ -346,12 +346,10 @@ static irqreturn_t dma_controller_irq(int irq, void 
*private_data)
channel->status = MUSB_DMA_STATUS_FREE;
 
/* completed */
-   if ((devctl & MUSB_DEVCTL_HM)
-   && (musb_channel->transmit)
-   && ((channel->desired_mode == 0)
-   || (channel->actual_len &
-   (musb_channel->max_packet_sz - 1)))
-   ) {
+   if (musb_channel->transmit &&
+

[PATCH v5 5/6] usb: musb: gadget: implement optional explicit status stage

2019-01-08 Thread Paul Elder

Implement the mechanism for optional explicit status stage for the MUSB
driver. This allows a function driver to specify what to reply for the
status stage. The functionality for an implicit status stage is
retained.

Signed-off-by: Paul Elder 
v1 Reviewed-by: Laurent Pinchart 
v1 Acked-by: Bin Liu 
---
Changes from v4:

- call usb_gadget_control_complete before usb_gadget_giveback_request
- set musb ep0 state to statusin in ep0_send_ack
- make sure to not double-write musb register in ep0_rxstate, since
  musb_g_ep0_giveback will take care of writing them

No change from v3

Changes from v2:
- update call to usb_gadget_control_complete to include status
- since sending STALL from the function driver is now done with
  usb_ep_set_halt, there is no need for the internal ep0_send_response to
  take a stall/ack parameter; remove the parameter and make the function
  only send ack, and remove checking for the status reply in the
  usb_request for the status stage

Changes from v1:
- obvious change to implement v2 mechanism laid out by 4/6 of this
  series (send_response, and musb_g_ep0_send_response function has
  been removed, call to usb_gadget_control_complete has been added)
- ep0_send_response's ack argument has been changed from stall
- last_packet flag in ep0_rxstate has been removed, since it is equal to
  req != NULL

 drivers/usb/musb/musb_gadget.c |  2 ++
 drivers/usb/musb/musb_gadget_ep0.c | 29 +++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c
index 496643f54faa..ff83578c181f 100644
--- a/drivers/usb/musb/musb_gadget.c
+++ b/drivers/usb/musb/musb_gadget.c
@@ -144,6 +144,8 @@ __acquires(ep->musb->lock)
unmap_dma_buffer(req, musb);
 
trace_musb_req_gb(req);
+   if (req->ep->end_point.address == 0)
+   usb_gadget_control_complete(>g, >request);
usb_gadget_giveback_request(>ep->end_point, >request);
spin_lock(>lock);
ep->busy = busy;
diff --git a/drivers/usb/musb/musb_gadget_ep0.c 
b/drivers/usb/musb/musb_gadget_ep0.c
index 91a5027b5c1f..dc5abdf9d49e 100644
--- a/drivers/usb/musb/musb_gadget_ep0.c
+++ b/drivers/usb/musb/musb_gadget_ep0.c
@@ -458,6 +458,25 @@ __acquires(musb->lock)
return handled;
 }
 
+static int ep0_send_ack(struct musb *musb)
+{
+   void __iomem *regs = musb->control_ep->regs;
+   u16 csr;
+
+   if (musb->ep0_state != MUSB_EP0_STAGE_RX &&
+   musb->ep0_state != MUSB_EP0_STAGE_STATUSIN)
+   return -EINVAL;
+
+   csr = MUSB_CSR0_P_DATAEND | MUSB_CSR0_P_SVDRXPKTRDY;
+
+   musb_ep_select(musb->mregs, 0);
+   musb_writew(regs, MUSB_CSR0, csr);
+
+   musb->ep0_state = MUSB_EP0_STAGE_STATUSIN;
+
+   return 0;
+}
+
 /* we have an ep0out data packet
  * Context:  caller holds controller lock
  */
@@ -504,12 +523,15 @@ static void ep0_rxstate(struct musb *musb)
if (req) {
musb->ackpend = csr;
musb_g_ep0_giveback(musb, req);
+   if (req->explicit_status)
+   return;
if (!musb->ackpend)
return;
musb->ackpend = 0;
+   } else {
+   musb_ep_select(musb->mregs, 0);
+   musb_writew(regs, MUSB_CSR0, csr);
}
-   musb_ep_select(musb->mregs, 0);
-   musb_writew(regs, MUSB_CSR0, csr);
 }
 
 /*
@@ -939,6 +961,9 @@ musb_g_ep0_queue(struct usb_ep *e, struct usb_request *r, 
gfp_t gfp_flags)
case MUSB_EP0_STAGE_ACKWAIT:/* zero-length data */
status = 0;
break;
+   case MUSB_EP0_STAGE_STATUSIN:
+   status = ep0_send_ack(musb);
+   goto cleanup;
default:
musb_dbg(musb, "ep0 request queued in state %d",
musb->ep0_state);
-- 
2.20.1

[PATCH v5 4/6] usb: gadget: add mechanism to specify an explicit status stage

2019-01-08 Thread Paul Elder

A usb gadget function driver may or may not want to delay the status
stage of a control OUT request. An instance where it might want to is to
asynchronously validate the data of a class-specific request.

A function driver that wants an explicit status stage should set the
newly added explicit_status flag of the usb_request corresponding to the
data stage. Later on, the function driver can explicitly complete the
status stage by enqueueing a usb_request for ACK, or calling
usb_ep_set_halt() for STALL.

To support both explicit and implicit status stages, a UDC driver must
call the newly added usb_gadget_control_complete function right before
calling usb_gadget_giveback_request. To support the explicit status
stage, it might then check what stage the usb_request was queued in, and
for control IN ACK the host's zero-length data packet, or for control
OUT send a zero-length DATA1 ACK packet.

Signed-off-by: Paul Elder 
v4 Acked-by: Alan Stern 
v1 Reviewed-by: Laurent Pinchart 
---
Changes from v4:

- Change parameter of usb_gadget_control_complete to simply take a
  usb_request
- Make usb_gadget_control_complete do nothing if the request has no
  length (ie. no data stage)

Changes from v3:

- More specific in commit message about what to do for udc driver acking
- Set explicit_status in usb_gadget_control_complete
- Make explicit_status type bool

Changes from v2:

Add status parameter to usb_gadget_control_complete, so that a
usb_request is not queued if the status of the just given back request
is nonzero.

Changes from v1:

Complete change of API. Now we use a flag that should be set in the
usb_request that is queued for the data stage to signal to the UDC that
we want to delay the status stage (as opposed to setting a flag in the
UDC itself, that persists across all requests). We now also provide a
function for UDC drivers to very easily allow implicit status stages, to
mitigate the need to convert all function drivers to this new API at
once, and to make it easier for UDC drivers to convert.

 drivers/usb/gadget/udc/core.c | 34 ++
 include/linux/usb/gadget.h| 10 ++
 2 files changed, 44 insertions(+)

diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index af88b48c1cea..57b2c2550537 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -894,6 +894,40 @@ EXPORT_SYMBOL_GPL(usb_gadget_giveback_request);
 
 /* - */
 
+/**
+ * usb_gadget_control_complete - complete the status stage of a control
+ * request, or delay it
+ * Context: in_interrupt()
+ *
+ * @gadget: gadget whose control request's status stage should be completed
+ * @request: usb request whose status stage should be completed
+ *
+ * This is called by device controller drivers before returning the completed
+ * request back to the gadget layer, to either complete or delay the status
+ * stage. It exits without doing anything if the request has a non-zero status,
+ * if it has zero length, or if its explicit_status flag is set.
+ */
+void usb_gadget_control_complete(struct usb_gadget *gadget,
+   struct usb_request *request)
+{
+   struct usb_request *req;
+
+   if (request->explicit_status || request->status || !request->length)
+   return;
+
+   /* Send an implicit status-stage request for ep0 */
+   req = usb_ep_alloc_request(gadget->ep0, GFP_ATOMIC);
+   if (req) {
+   req->length = 0;
+   req->explicit_status = 1;
+   req->complete = usb_ep_free_request;
+   usb_ep_queue(gadget->ep0, req, GFP_ATOMIC);
+   }
+}
+EXPORT_SYMBOL_GPL(usb_gadget_control_complete);
+
+/* - */
+
 /**
  * gadget_find_ep_by_name - returns ep whose name is the same as sting passed
  * in second parameter or NULL if searched endpoint not found
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index e5cd84a0f84a..bf4f021ce139 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -73,6 +73,7 @@ struct usb_ep;
  * Note that for writes (IN transfers) some data bytes may still
  * reside in a device-side FIFO when the request is reported as
  * complete.
+ * @explicit_status: If true, delays the status stage
  *
  * These are allocated/freed through the endpoint they're used with.  The
  * hardware's driver can add extra per-request data to the memory it returns,
@@ -114,6 +115,8 @@ struct usb_request {
 
int status;
unsignedactual;
+
+   boolexplicit_status;
 };
 
 /*-*/
@@ -850,6 +853,13 @@ extern void usb_gadget_giveback_request(struct usb_ep *ep,
 
 /*-*/
 
+/*

[PATCH v5 2/6] usb: gadget: uvc: enqueue usb request in setup handler for control OUT

2019-01-08 Thread Paul Elder

Currently, for uvc class-specific control IN and OUT requests, in the
setup handler a UVC_EVENT_SETUP with the setup control is enqueued to
userspace. In response to this, the uvc function driver expects
userspace to call ioctl UVCIOC_SEND_RESPONSE containing uvc request
data.

In the case of control IN this is fine, but for control OUT it causes a
problem. Since the host sends data immediately after the setup stage
completes, it is possible that the empty uvc request data is not
enqueued in time for the UDC driver to put the data stage data into
(this causes some UDC drivers, such as MUSB, to reply with a STALL).
This problem is remedied by having the setup handler enqueue the empty
uvc request data, instead of waiting for userspace to do it.

Signed-off-by: Paul Elder 
Reviewed-by: Laurent Pinchart 
---
No change from v4
No change from v3
No change from v2
No change from v1

 drivers/usb/gadget/function/f_uvc.c| 25 +++--
 drivers/usb/gadget/function/uvc_v4l2.c |  7 +++
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index 723ca91e4f2a..36de51ac30ed 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -223,8 +223,6 @@ static int
 uvc_function_setup(struct usb_function *f, const struct usb_ctrlrequest *ctrl)
 {
struct uvc_device *uvc = to_uvc(f);
-   struct v4l2_event v4l2_event;
-   struct uvc_event *uvc_event = (void *)_event.u.data;
 
if ((ctrl->bRequestType & USB_TYPE_MASK) != USB_TYPE_CLASS) {
uvcg_info(f, "invalid request type\n");
@@ -241,10 +239,25 @@ uvc_function_setup(struct usb_function *f, const struct 
usb_ctrlrequest *ctrl)
uvc->event_setup_out = !(ctrl->bRequestType & USB_DIR_IN);
uvc->event_length = le16_to_cpu(ctrl->wLength);
 
-   memset(_event, 0, sizeof(v4l2_event));
-   v4l2_event.type = UVC_EVENT_SETUP;
-   memcpy(_event->req, ctrl, sizeof(uvc_event->req));
-   v4l2_event_queue(>vdev, _event);
+   if (uvc->event_setup_out) {
+   struct usb_request *req = uvc->control_req;
+
+   /*
+* Enqueue the request immediately for control OUT as the
+* host will start the data stage straight away.
+*/
+   req->length = uvc->event_length;
+   req->zero = 0;
+   usb_ep_queue(f->config->cdev->gadget->ep0, req, GFP_KERNEL);
+   } else {
+   struct v4l2_event v4l2_event;
+   struct uvc_event *uvc_event = (void *)_event.u.data;
+
+   memset(_event, 0, sizeof(v4l2_event));
+   v4l2_event.type = UVC_EVENT_SETUP;
+   memcpy(_event->req, ctrl, sizeof(uvc_event->req));
+   v4l2_event_queue(>vdev, _event);
+   }
 
return 0;
 }
diff --git a/drivers/usb/gadget/function/uvc_v4l2.c 
b/drivers/usb/gadget/function/uvc_v4l2.c
index 7816ea9886e1..ac48f49d9f10 100644
--- a/drivers/usb/gadget/function/uvc_v4l2.c
+++ b/drivers/usb/gadget/function/uvc_v4l2.c
@@ -35,6 +35,13 @@ uvc_send_response(struct uvc_device *uvc, struct 
uvc_request_data *data)
struct usb_composite_dev *cdev = uvc->func.config->cdev;
struct usb_request *req = uvc->control_req;
 
+   /*
+* For control OUT transfers the request has been enqueued synchronously
+* by the setup handler, there's nothing to be done here.
+*/
+   if (uvc->event_setup_out)
+   return 0;
+
if (data->length < 0)
return usb_ep_set_halt(cdev->gadget->ep0);
 
-- 
2.20.1

[PATCH v5 1/6] usb: uvc: include videodev2.h in g_uvc.h

2019-01-08 Thread Paul Elder

V4L2_EVENT_PRIVATE_START is used in g_uvc.h but is defined in
videodev2.h, which is not included and causes a compiler warning:

linux/usb/g_uvc.h:15:28: error: ‘V4L2_EVENT_PRIVATE_START’ undeclared here (not 
in a function)
 #define UVC_EVENT_FIRST   (V4L2_EVENT_PRIVATE_START + 0)

Include videodev2.h in g_uvc.h.

Signed-off-by: Paul Elder 
Reviewed-by: Laurent Pinchart 
---
No change from v4
No change from v3
No change from v2
No change from v1

 include/uapi/linux/usb/g_uvc.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/usb/g_uvc.h b/include/uapi/linux/usb/g_uvc.h
index 3c9ee3020cbb..6698c3263ae8 100644
--- a/include/uapi/linux/usb/g_uvc.h
+++ b/include/uapi/linux/usb/g_uvc.h
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define UVC_EVENT_FIRST(V4L2_EVENT_PRIVATE_START + 0)
 #define UVC_EVENT_CONNECT  (V4L2_EVENT_PRIVATE_START + 0)
-- 
2.20.1

[PATCH v5 6/6] usb: gadget: uvc: allow ioctl to send response in status stage

2019-01-08 Thread Paul Elder

We now have a mechanism to signal the UDC driver to reply to a control
OUT request with STALL or ACK, and we have packaged the setup stage data
and the data stage data of a control OUT request into a single
UVC_EVENT_DATA for userspace to consume. The ioctl UVCIOC_SEND_RESPONSE
in the case of a control OUT request sends a response to the data stage,
and so the ioctl now notifies the UDC driver to reply with STALL or ACK.
In the case of a control IN request, the ioctl sends the UVC data as
before.

Also tell the UDC to delay the status stage for this to work.

Signed-off-by: Paul Elder 
---
No change from v4
No change from v3

Changes from v2:
- calling usb_ep_set_halt in uvc_send_response if data->length < 0 is
  now common for both IN and OUT transfers so make that check common
- remove now unnecessary field setting for the usb_request to be queued
  for the status stage

Changes from v1:
- remove usb_ep_delay_status call from the old proposed API
- changed portions of uvc_send_response to match v2 API
- remove UDC warning that send_response is not implemented

 drivers/usb/gadget/function/f_uvc.c|  4 ++--
 drivers/usb/gadget/function/uvc_v4l2.c | 23 +--
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index 58ed191f636e..053808b01d48 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -209,14 +209,13 @@ uvc_function_ep0_complete(struct usb_ep *ep, struct 
usb_request *req)
struct uvc_event *uvc_event = (void *)_event.u.data;
 
if (uvc->event_setup_out) {
-   uvc->event_setup_out = 0;
-
memset(_event, 0, sizeof(v4l2_event));
v4l2_event.type = UVC_EVENT_DATA;
uvc_event->data.length = req->actual;
memcpy(_event->data.data, req->buf, req->actual);
memcpy(_event->data.setup, >control_setup,
   sizeof(uvc_event->data.setup));
+
v4l2_event_queue(>vdev, _event);
}
 }
@@ -251,6 +250,7 @@ uvc_function_setup(struct usb_function *f, const struct 
usb_ctrlrequest *ctrl)
 */
req->length = uvc->event_length;
req->zero = 0;
+   req->explicit_status = 1;
usb_ep_queue(f->config->cdev->gadget->ep0, req, GFP_KERNEL);
} else {
struct v4l2_event v4l2_event;
diff --git a/drivers/usb/gadget/function/uvc_v4l2.c 
b/drivers/usb/gadget/function/uvc_v4l2.c
index ac48f49d9f10..353071869e15 100644
--- a/drivers/usb/gadget/function/uvc_v4l2.c
+++ b/drivers/usb/gadget/function/uvc_v4l2.c
@@ -35,15 +35,26 @@ uvc_send_response(struct uvc_device *uvc, struct 
uvc_request_data *data)
struct usb_composite_dev *cdev = uvc->func.config->cdev;
struct usb_request *req = uvc->control_req;
 
+   if (data->length < 0)
+   return usb_ep_set_halt(cdev->gadget->ep0);
+
/*
 * For control OUT transfers the request has been enqueued synchronously
-* by the setup handler, there's nothing to be done here.
+* by the setup handler, we just need to tell the UDC whether to ACK or
+* STALL the control transfer.
 */
-   if (uvc->event_setup_out)
-   return 0;
-
-   if (data->length < 0)
-   return usb_ep_set_halt(cdev->gadget->ep0);
+   if (uvc->event_setup_out) {
+   /*
+* The length field carries the control request status.
+* Negative values signal a STALL and zero values an ACK.
+* Positive values are not valid as there is no data to send
+* back in the status stage.
+*/
+   if (data->length > 0)
+   return -EINVAL;
+
+   return usb_ep_queue(cdev->gadget->ep0, req, GFP_KERNEL);
+   }
 
req->length = min_t(unsigned int, uvc->event_length, data->length);
req->zero = data->length < uvc->event_length;
-- 
2.20.1

[PATCH v5 0/6] usb: gadget: add mechanism to asynchronously validate data stage of ctrl out request

2019-01-08 Thread Paul Elder

This patch series adds a mechanism to allow asynchronously validating
the data stage of a control OUT request, and for stalling or suceeding
the request accordingly. This mechanism is implemented for MUSB, and is
used by UVC. At the same time, UVC packages the setup stage and data
stage data together to send to userspace to save on a pair of context
switches per control out request.

This patch series does change the userspace API. We however believe that
it is justified because the current API is broken, and because it isn't
being used (because it's broken).

The current API is broken such that it is subject to race conditions
that cause fatal errors with a high frequency. This is actually what
motivated this patch series in the first place. In the current API, not
only is there no way to asynchronously validate the data stage of a
control OUT request, but an empty buffer is expected to be provided to
hold the data stage data -- which is more likely than not to be late.
There is even a warning in musb_g_ep0_queue:

/* else for sequence #2 (OUT), caller provides a buffer
 * before the next packet arrives.  deferred responses
 * (after SETUP is acked) are racey.
 */

This problem has never been reported in years, which is a sign that the
API isn't used. Furthermore, the vendor kernels that we have seen using
the UVC gadget driver (such as QC and Huawei) are heavily patched with
local changes to the API. This corroborates the suspicion that the
current mainline API is not being used.

Additionally, this API isn't meant to be used by generic applications,
but by a dedicated userspace helper. uvc-gadget is one such example, but
it has bitrotten and isn't compatible with the current kernel API. The
fact that nobody has submitted patches nor complained for a long time
again shows that it isn't being used.

The conclusion is that since the API hasn't been used for a long time,
it is safe to fix it.

Changes in v5:

- Change parameter of usb_gadget_control_complete to simply take a
  usb_request
- Make usb_gadget_control_complete do nothing if the request has no
  length (ie. no data stage)
- musb: call usb_gadget_control_complete before
  usb_gadget_giveback_request
- set musb ep0 state to statusin in ep0_send_ack
- make sure to not double-write musb register in ep0_rxstate, since
  musb_g_ep0_giveback will take care of writing them

Changes in v4:

- Change wording and fix typo in patch 4/6 "usb: gadget: add mechanism
  to specify an explicit status stage"
- Set explicit_status in usb_gadget_control_complete
- Change explicit_status from unsigned int to bool

Changes in v3:

- Function driver send STALL status stage by simply calling
  usb_ep_set_halt, and ACK by enqueueing request
- Fix signature of usb_gadget_control_complete to check the status of the
  request that was just given back.
- Corresponding changes to musb and uvc gadget

Changes in v2:

Overhaul of status stage delay mechanism/API. Now if a function driver
desires an explicit/delayed status stage, it specifies so in a flag in
the usb_request that is queued for the data stage. The function driver
later enqueues another usb_request for the status stage, also with the
explicit_status flag set, and with the zero flag acting as the status.
If a function driver does not desire an explicit status stage, then it
can set (or ignore) the explicit_status flag in the usb_request that
is queued for the data stage.

To allow the optional explicit status stage, a UDC driver should call
the newly added usb_gadget_control_complete right after
usb_gadget_giveback_request, and in its queue function should check if
the usb_request is for the status stage and if it has been requested to
be explicit, and if so check the status that should be sent. (See 5/6
"usb: musb: gadget: implement optional explicit status stage" for an
implementation for MUSB)

Paul Elder (6):
  usb: uvc: include videodev2.h in g_uvc.h
  usb: gadget: uvc: enqueue usb request in setup handler for control OUT
  usb: gadget: uvc: package setup and data for control OUT requests
  usb: gadget: add mechanism to specify an explicit status stage
  usb: musb: gadget: implement optional explicit status stage
  usb: gadget: uvc: allow ioctl to send response in status stage

 drivers/usb/gadget/function/f_uvc.c| 32 ++--
 drivers/usb/gadget/function/uvc.h  |  1 +
 drivers/usb/gadget/function/uvc_v4l2.c | 18 ++
 drivers/usb/gadget/udc/core.c  | 34 ++
 drivers/usb/musb/musb_gadget.c |  2 ++
 drivers/usb/musb/musb_gadget_ep0.c | 29 --
 include/linux/usb/gadget.h | 10 
 include/uapi/linux/usb/g_uvc.h |  4 ++-
 8 files changed, 119 insertions(+), 11 deletions(-)

-- 
2.20.1

[tip:perf/urgent] perf trace: Fix ')' placement in "interrupted" syscall lines

2019-01-08 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  172bf02d564bdb6df8410f64720fa2c68e755d1a
Gitweb: https://git.kernel.org/tip/172bf02d564bdb6df8410f64720fa2c68e755d1a
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 7 Jan 2019 16:24:27 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 8 Jan 2019 13:28:12 -0300

perf trace: Fix ')' placement in "interrupted" syscall lines

When we get the sys_enter for a syscall we check if the last one is
still waiting for its matching sys_exit, if so we print this:

   468.753 ( ): firefox/32382 poll(ufds: 0x7f3988d3dd00, nfds: 7, 
timeout_msecs: 4294967295) ...
   449.575 ( 0.004 ms): Softwar~cThrea/32434 futex(uaddr: 0x7f39a18a9b70, op: 
WAKE|PRIVATE_FLAG, val: 1)   = 0

At some point we'll get that poll sys_exit event and will print a "[continued]" 
line.

While making the sizing of the alignment after the syscall arg list and
its result configurable, so that we can mimic strace, which uses a
smaller alingment by default, a bug was introduced where the closing
parens appeared before the syscall name and its arg list, fix it.

Fixes: 4b8a240ed5e0 ("perf trace: Add alignment spaces after the closing 
parens")
Cc: Adrian Hunter 
Cc: Jiri Olsa 
Cc: Luis Cláudio Gonçalves 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-oi45i54s59h1w1kmgpzrf...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-trace.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index adbf28183560..b8bf5d025563 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1758,6 +1758,7 @@ static int trace__printf_interrupted_entry(struct trace 
*trace)
 {
struct thread_trace *ttrace;
size_t printed;
+   int len;
 
if (trace->failure_only || trace->current == NULL)
return 0;
@@ -1768,9 +1769,14 @@ static int trace__printf_interrupted_entry(struct trace 
*trace)
return 0;
 
printed  = trace__fprintf_entry_head(trace, trace->current, 0, false, 
ttrace->entry_time, trace->output);
-   printed += fprintf(trace->output, ")%-*s ...\n", trace->args_alignment, 
ttrace->entry_str);
-   ttrace->entry_pending = false;
+   printed += len = fprintf(trace->output, "%s)", ttrace->entry_str);
+
+   if (len < trace->args_alignment - 4)
+   printed += fprintf(trace->output, "%-*s", trace->args_alignment 
- 4 - len, " ");
 
+   printed += fprintf(trace->output, " ...\n");
+
+   ttrace->entry_pending = false;
++trace->nr_events_printed;
 
return printed;

[PATCH v5 3/6] usb: gadget: uvc: package setup and data for control OUT requests

2019-01-08 Thread Paul Elder

Since "usb: gadget: uvc: enqueue uvc_request_data in setup handler
for control OUT requests" it is no longer necessary for userspace to
call ioctl UVCIOC_SEND_RESPONSE in response to receiving a
UVC_EVENT_SETUP from the uvc function driver for a control OUT request.

This change means that for control OUT userspace will receive a
UVC_EVENT_SETUP and not do anything with it. This is a waste of a pair
of context switches, so we put the setup and data stage data into a
single UVC_EVENT_DATA to give to userspace. Previously struct
uvc_request_data had 60 bytes allocated for data, and since uvc data at
most is 34 bytes in UVC 1.1 and 48 bytes in UVC 1.5, we can afford to
cut out 8 bytes to store the setup control.

Since the setup control is discarded after the handling of the setup
stage, it must be saved in struct uvc_device during the setup handler in
order for the data stage handler to be able to read it and send it to
userspace.

Signed-off-by: Paul Elder 
Reviewed-by: Laurent Pinchart 
---
No change from v4
No change from v3
No change from v2
No change from v1

 drivers/usb/gadget/function/f_uvc.c | 3 +++
 drivers/usb/gadget/function/uvc.h   | 1 +
 include/uapi/linux/usb/g_uvc.h  | 3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/function/f_uvc.c 
b/drivers/usb/gadget/function/f_uvc.c
index 36de51ac30ed..58ed191f636e 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -215,6 +215,8 @@ uvc_function_ep0_complete(struct usb_ep *ep, struct 
usb_request *req)
v4l2_event.type = UVC_EVENT_DATA;
uvc_event->data.length = req->actual;
memcpy(_event->data.data, req->buf, req->actual);
+   memcpy(_event->data.setup, >control_setup,
+  sizeof(uvc_event->data.setup));
v4l2_event_queue(>vdev, _event);
}
 }
@@ -238,6 +240,7 @@ uvc_function_setup(struct usb_function *f, const struct 
usb_ctrlrequest *ctrl)
 */
uvc->event_setup_out = !(ctrl->bRequestType & USB_DIR_IN);
uvc->event_length = le16_to_cpu(ctrl->wLength);
+   memcpy(>control_setup, ctrl, sizeof(uvc->control_setup));
 
if (uvc->event_setup_out) {
struct usb_request *req = uvc->control_req;
diff --git a/drivers/usb/gadget/function/uvc.h 
b/drivers/usb/gadget/function/uvc.h
index 671020c8a836..1d89b1df4ba0 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -163,6 +163,7 @@ struct uvc_device {
unsigned int control_intf;
struct usb_ep *control_ep;
struct usb_request *control_req;
+   struct usb_ctrlrequest control_setup;
void *control_buf;
 
unsigned int streaming_intf;
diff --git a/include/uapi/linux/usb/g_uvc.h b/include/uapi/linux/usb/g_uvc.h
index 6698c3263ae8..10fbb4382925 100644
--- a/include/uapi/linux/usb/g_uvc.h
+++ b/include/uapi/linux/usb/g_uvc.h
@@ -24,7 +24,8 @@
 
 struct uvc_request_data {
__s32 length;
-   __u8 data[60];
+   struct usb_ctrlrequest setup;
+   __u8 data[52];
 };
 
 struct uvc_event {
-- 
2.20.1

Re: [PATCH v2 3/3] drivers: platform: goldfish: goldfish_sync: add a driver

2019-01-08 Thread Greg KH

On Tue, Jan 08, 2019 at 04:48:02PM -0800, Roman Kiryanov wrote:
> > Also, why is this driver needed at all?  Why can't you use the "normal"
> > drm sync api interface?  Why write a custom ioctl driver just for this
> > one kernel interface?
> 
> This driver allows us to talk to host's sync api.

That describes what your code does, yes, but you did not answer my
question at all.  Why do you need this?  What are you trying to do?  And
why is this so different from any other drm driver?

thanks,

greg k-h

Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler

2019-01-08 Thread Stephan Mueller

Am Mittwoch, 9. Januar 2019, 07:58:28 CET schrieb James Bottomley:

Hi James,

> On Wed, 2019-01-09 at 07:45 +0100, Stephan Mueller wrote:
> > Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley:
> > 
> > Hi James,
> > 
> > > Actually, it would be enormously helpful if we could reuse these
> > > pieces for the TPM as well.
> > 
> > Could you please help me understand whether the KDFs in TPM are
> > directly usable as a standalone cipher primitive or does it go
> > together with additional  key generation operations?
> 
> They're used as generators ... which means they deterministically
> produce keys from what the TPM calls seeds so we can get crypto agility
> of TPM 2.0 ... well KDFa does.  KDFe is simply what NIST recommends you
> do when using EC for a shared key agreement ... and really we shouldn't
> be using ECDH in the kernel without it.
> 

Thanks for clarifying. That would mean that indeed we would have hardware-
provided KDF implementations that may be usable with the kernel crypto API.

[...]
> 
> > Would it be appropriate, to implement a type cast to a structure from
> > the u8 pointer?
> > 
> > E.g. for the aforementioned label/context data, we could define
> > something like
> > 
> > struct crypto_kdf_ctr {
> > 
> > char *label;
> > size_t label_len;
> > u8 *contextU;
> > size_t contextU_len;
> > u8 *contextV;
> > size_t contextV_len;
> > 
> > };
> > 
> > And the implementation of the generate function for CTR KDF would
> > 
> > then have a  type cast along the following lines:
> > if (slen != sizeof(struct crypto_kdf_ctr))
> > 
> > return -EINVAL;
> > 
> > const struct crypto_kdf_ctr *kdf_ctr_input = (struct
> > 
> > crypto_kdf_ctr *)src;
> > 
> > 
> > For different KDFs, different structs would be needed.
> 
> Actually, we probably just need the input key (or secret material), the
> concatenation and the number of output bits.

Thanks for confirming. Though, when it comes to HKDF (not that I see it being 
needed or required in the kernel), there is a need to split up the src pointer 
since the mentioned input is used in different ways.

In order to try to get the implementation and thus the interface right, I 
would suggest to at least have a consensus on how to handle such situations.

Thus, would the proposal be acceptable for such KDFs that may need to have 
different components communicated as input to the KDF?

Ciao
Stephan

Re: [GIT PULL 00/16] perf/core fixes and improvements

2019-01-08 Thread Ingo Molnar



* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 64598e8b6fdaf28e37c3530f8b95a9f8ef6af131:
> 
>   Merge tag 'perf-core-for-mingo-4.21-20190104' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent 
> (2019-01-08 16:31:19 +0100)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-core-for-mingo-5.0-20190108
> 
> for you to fetch changes up to ee412f14693a3fe2645b3528603dfd37dd05118a:
> 
>   tools include uapi: Sync linux/vhost.h with the kernel sources (2019-01-08 
> 14:09:33 -0300)
> 
> 
> perf/core fixes and improvements:
> 
> perf top:
> 
>   Arnaldo Carvalho de Melo:
> 
>   . Lift restriction on using callchains without "sym" in --sort, allowing
> 
> perf trace:
> 
>   Arnaldo Carvalho de Melo:
> 
>   . Fix ')' placement in "interrupted" syscall lines.
> 
>   . Fix alignment for [continued] lines.
> 
> perf tests:
> 
>   Florian Fainelli:
> 
>   . Add a test for the ARM 32-bit [vectors] page.
> 
> tools lib traceevent:
> 
>   Tzvetomir Stoyanov:
> 
>   . Introduce new libtracevent API: tep_override_comm().
> 
>   . Initialize host_bigendian at tep_handle allocation.
> 
>   . More namespacing changes.
> 
>   . Remove superfluous APIs.
> 
> tools headers uapi:
> 
>   Arnaldo Carvalho de Melo:
> 
>   . Update linux/{fs,vhost}.h, grab a copy o linux/mount.h, where the
> MS_ mount flags were moved.
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (7):
>   perf trace: Fix ')' placement in "interrupted" syscall lines
>   perf trace: Fix alignment for [continued] lines
>   perf top: Lift restriction on using callchains without "sym" in --sort
>   tools include uapi: Grab a copy of linux/mount.h
>   perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.h
>   tools include uapi: Sync linux/fs.h copy with the kernel sources
>   tools include uapi: Sync linux/vhost.h with the kernel sources
> 
> Florian Fainelli (2):
>   perf tools: Make find_vdso_map() more modular
>   perf tests: Add a test for the ARM 32-bit [vectors] page
> 
> Tzvetomir Stoyanov (7):
>   tools lib traceevent: Introduce new libtracevent API: 
> tep_override_comm()
>   tools lib traceevent: Initialize host_bigendian at tep_handle allocation
>   tools lib traceevent: Rename struct cmdline to struct tep_cmdline
>   tools lib traceevent: Changed return logic of trace_seq_printf() and 
> trace_seq_vprintf() APIs
>   tools lib traceevent: Changed return logic of 
> tep_register_event_handler() API
>   tools lib traceevent: Rename tep_is_file_bigendian() to 
> tep_file_bigendian()
>   tools lib traceevent: Remove tep_data_event_from_type() API
> 
>  tools/include/uapi/linux/fs.h   |  60 ++-
>  tools/include/uapi/linux/mount.h|  58 +++
>  tools/include/uapi/linux/vhost.h| 113 +
>  tools/lib/traceevent/event-parse-api.c  |   4 +-
>  tools/lib/traceevent/event-parse-local.h|   4 +-
>  tools/lib/traceevent/event-parse.c  | 129 
> +++-
>  tools/lib/traceevent/event-parse.h  |  17 ++--
>  tools/lib/traceevent/plugin_kvm.c   |   2 +-
>  tools/lib/traceevent/trace-seq.c|  17 +++-
>  tools/perf/Makefile.perf|   4 +-
>  tools/perf/arch/arm/tests/Build |   1 +
>  tools/perf/arch/arm/tests/arch-tests.c  |   4 +
>  tools/perf/arch/arm/tests/vectors-page.c|  24 +
>  tools/perf/builtin-top.c|   7 +-
>  tools/perf/builtin-trace.c  |  15 ++-
>  tools/perf/check-headers.sh |   1 +
>  tools/perf/perf-read-vdso.c |   6 +-
>  tools/perf/tests/tests.h|   5 +
>  tools/perf/trace/beauty/mount_flags.sh  |   4 +-
>  tools/perf/util/{find-vdso-map.c => find-map.c} |   7 +-
>  tools/perf/util/vdso.c  |   6 +-
>  21 files changed, 238 insertions(+), 250 deletions(-)
>  create mode 100644 tools/include/uapi/linux/mount.h
>  create mode 100644 tools/perf/arch/arm/tests/vectors-page.c
>  rename tools/perf/util/{find-vdso-map.c => find-ma

Re: About some mail lost and random unsubscribe issues on this dmaengine list

2019-01-08 Thread Vinod Koul

On 09-01-19, 01:35, Yang, Shunyong wrote:
> Hi, Vinod and All,
>   I found, somtimes, I cannot received some mails from this mailing
> list. For example, I sent my v2 series on 7th, Jan,
> 
> https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713
> 
> It appeared on patchwork.kernel.org but I haven't received it in this
> mailing list.
> 
> And I found randomly, a month or two months, I am unsubscribed from this
> list automatically. Then I have to re-subscribe the list.

I guess your corporate server is to blame. vger mail list is "know" to 
unsubscribe servers which behave badly, so you need to ask your mail
admins why and try to use other email which is known to be friendly with
vger :)

> Have someone met similar problem as me? Or maybe I missed some necessary
> settings?
> 
> Shunyong.
> Thanks.

-- 
~Vinod

[PATCH resend] drm/panel: panel-innolux: set display off in innolux_panel_unprepare

2019-01-08 Thread Hsin-Yi, Wang

Move mipi_dsi_dcs_set_display_off() from innolux_panel_disable()
to innolux_panel_unprepare(), so they are consistent with
innolux_panel_enable() and innolux_panel_prepare().

This also fixes some mode check and irq timeout issue in MTK dsi code.

Since some dsi code (e.g. mtk_dsi) have following call trace:
1. drm_panel_disable(), which calls innolux_panel_disable()
2. switch to cmd mode
3. drm_panel_unprepare(), which calls innolux_panel_unprepare()

However, mtk_dsi needs to be in cmd mode to be able to send commands
(e.g. mipi_dsi_dcs_set_display_off() and mipi_dsi_dcs_enter_sleep_mode()),
so we need these functions to be called after the switch to cmd mode happens,
i.e. in innolux_panel_unprepare.


Signed-off-by: Hsin-Yi, Wang 
---
Resend for review.
---
 drivers/gpu/drm/panel/panel-innolux-p079zca.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-innolux-p079zca.c 
b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
index ca4ae45dd307..8e5724b63f1f 100644
--- a/drivers/gpu/drm/panel/panel-innolux-p079zca.c
+++ b/drivers/gpu/drm/panel/panel-innolux-p079zca.c
@@ -70,18 +70,12 @@ static inline struct innolux_panel *to_innolux_panel(struct 
drm_panel *panel)
 static int innolux_panel_disable(struct drm_panel *panel)
 {
struct innolux_panel *innolux = to_innolux_panel(panel);
-   int err;
 
if (!innolux->enabled)
return 0;
 
backlight_disable(innolux->backlight);
 
-   err = mipi_dsi_dcs_set_display_off(innolux->link);
-   if (err < 0)
-   DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n",
- err);
-
innolux->enabled = false;
 
return 0;
@@ -95,6 +89,11 @@ static int innolux_panel_unprepare(struct drm_panel *panel)
if (!innolux->prepared)
return 0;
 
+   err = mipi_dsi_dcs_set_display_off(innolux->link);
+   if (err < 0)
+   DRM_DEV_ERROR(panel->dev, "failed to set display off: %d\n",
+ err);
+
err = mipi_dsi_dcs_enter_sleep_mode(innolux->link);
if (err < 0) {
DRM_DEV_ERROR(panel->dev, "failed to enter sleep mode: %d\n",
-- 
2.18.1

Re: [GIT PULL] seccomp: build fix for v5.0-rc2

2019-01-08 Thread Ingo Molnar



* Kees Cook  wrote:

> This was already picked up by x86-urgent...
> 
> -Kees

I'm fine with both routes - if Linus pulls this I'll zap the x86/urgent 
one.

Thanks,

Ingo

Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler

2019-01-08 Thread James Bottomley

On Wed, 2019-01-09 at 07:45 +0100, Stephan Mueller wrote:
> Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley:
> 
> Hi James,
> 
> > Actually, it would be enormously helpful if we could reuse these
> > pieces for the TPM as well. 
> 
> Could you please help me understand whether the KDFs in TPM are
> directly usable as a standalone cipher primitive or does it go
> together with additional  key generation operations?

They're used as generators ... which means they deterministically
produce keys from what the TPM calls seeds so we can get crypto agility
of TPM 2.0 ... well KDFa does.  KDFe is simply what NIST recommends you
do when using EC for a shared key agreement ... and really we shouldn't
be using ECDH in the kernel without it.

> > It has two KDFs: KDFa, which is the CTR-KDF from
> > SP800-108 and KDFe which is the SP800-56A KDF for elliptic curve
> > one pass Diffie Hellman, so if we're going to do the former, I'd
> > really like the latter as well.
> > 
> > The way the TPM parametrises input to both KDFs is
> > 
> > (hashAlg, key, label, contextU, contextV, bits)
> > 
> > Where
> > 
> > hashAlg  = the hash algorithm used as the PRF
> > key  = the input parameter of variable bit size or
> >the x co-ordinate of the shared point
> > label= An ASCII string representing the use
> > contextU = public input U
> > contextV = public input V
> > bits = number of output bits
> 
> When implementing KDFs as an extension of the kernel crypto API's RNG
> facility we currently have to handle the limitiation of the existing
> API. The label/context data (and when considering RFC 5869 HKDF
> requring IKM, salt and  additional information as input) currently
> cannot directly be communicated  through the API.
> 
> The issue is that the RNG facility currently has the following
> prototype defined:
> 
> int (*generate)(struct crypto_rng *tfm,
> const u8 *src, unsigned int slen,
> u8 *dst, unsigned int dlen);
> 
> The src pointer would need to take the label/context data.

That's probably good enough.  For both KDFa and KDFe the label contextU
and ContextV are concatenated, so in both cases a single source is
probably good enough.  However, we also need to feed in the key somehow
since it's usually used separately in the derivation functions.

> Would it be appropriate, to implement a type cast to a structure from
> the u8 pointer?
> 
> E.g. for the aforementioned label/context data, we could define
> something like
> 
> struct crypto_kdf_ctr {
>   char *label;
>   size_t label_len;
>   u8 *contextU;
>   size_t contextU_len;
>   u8 *contextV;
>   size_t contextV_len;
> };
> 
> And the implementation of the generate function for CTR KDF would
> then have a  type cast along the following lines:
> 
>   if (slen != sizeof(struct crypto_kdf_ctr))
>   return -EINVAL;
>   const struct crypto_kdf_ctr *kdf_ctr_input = (struct
> crypto_kdf_ctr *)src;
> 
> 
> For different KDFs, different structs would be needed.

Actually, we probably just need the input key (or secret material), the
concatenation and the number of output bits.

James

Re: [PATCH 1/2] x86, kexec_file_load: Don't setup EFI info if EFI runtime is not enabled

2019-01-08 Thread Kairui Song

CCing more people

On Wed, Jan 9, 2019 at 2:45 PM Kairui Song  wrote:
>
> Currenly with "efi=noruntime" in kernel command line, calling
> kexec_file_load will raise below problem:
>
> [   97.967067] BUG: unable to handle kernel NULL pointer dereference at 
> 
> [   97.967894] #PF error: [normal kernel read fault]
> ...
> [   97.980456] Call Trace:
> [   97.980724]  efi_runtime_map_copy+0x28/0x30
> [   97.981267]  bzImage64_load+0x688/0x872
> [   97.981794]  arch_kexec_kernel_image_load+0x6d/0x70
> [   97.982441]  kimage_file_alloc_init+0x13e/0x220
> [   97.983035]  __x64_sys_kexec_file_load+0x144/0x290
> [   97.983586]  do_syscall_64+0x55/0x1a0
> [   97.983962]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> When efi runtime is not enabled, efi memmap is not mapped, so just skip
> EFI info setup.
>
> Suggested-by: Dave Young 
> Signed-off-by: Kairui Song 
> ---
>  arch/x86/kernel/kexec-bzimage64.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index 0d5efa34f359..53917a3ebf94 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -167,6 +167,9 @@ setup_efi_state(struct boot_params *params, unsigned long 
> params_load_addr,
> struct efi_info *current_ei = _params.efi_info;
> struct efi_info *ei = >efi_info;
>
> +   if (!efi_enabled(EFI_RUNTIME_SERVICES))
> +   return 0;
> +
> if (!current_ei->efi_memmap_size)
> return 0;
>
> --
> 2.20.1
>


-- 
Best Regards,
Kairui Song

Re: [PATCH 2/2] x86, kexec_file_load: make it work with efi=noruntime or efi=oldmap

2019-01-08 Thread Kairui Song

CCing more people

On Wed, Jan 9, 2019 at 2:47 PM Kairui Song  wrote:
>
> When efi=noruntime or efi=oldmap is used, EFI services won't be available
> in the second kernel, therefore the second kernel will not be able to get
> the ACPI RSDP address from firmware by calling EFI services and won't
> boot. Previously we are expecting the user to set the acpi_rsdp=
> on kernel command line for second kernel as there was no way to pass RSDP
> address to second kernel.
>
> After commit e6e094e053af ('x86/acpi, x86/boot: Take RSDP address from
> boot params if available'), now it's possible to set a acpi_rsdp_addr
> parameter in the boot_params passed to second kernel, this commit make
> use of it, detect and set the RSDP address when it's required for second
> kernel to boot.
>
> Tested with an EFI enabled KVM VM with efi=noruntime.
>
> Suggested-by: Dave Young 
> Signed-off-by: Kairui Song 
> ---
>  arch/x86/kernel/kexec-bzimage64.c | 21 +
>  drivers/acpi/acpica/tbxfroot.c|  3 +--
>  include/acpi/acpixf.h |  2 +-
>  3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index 53917a3ebf94..0a90dcbd041f 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -255,8 +256,28 @@ setup_boot_parameters(struct kimage *image, struct 
> boot_params *params,
> /* Setup EFI state */
> setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz,
> efi_setup_data_offset);
> +
> +#ifdef CONFIG_ACPI
> +   /* Setup ACPI RSDP pointer in case EFI is not available in second 
> kernel */
> +   if (!acpi_disabled && (!efi_enabled(EFI_RUNTIME_SERVICES) || 
> efi_enabled(EFI_OLD_MEMMAP))) {
> +   /* Copied from acpi_os_get_root_pointer accordingly */
> +   params->acpi_rsdp_addr = boot_params.acpi_rsdp_addr;
> +   if (!params->acpi_rsdp_addr) {
> +   if (efi_enabled(EFI_CONFIG_TABLES)) {
> +   if (efi.acpi20 != EFI_INVALID_TABLE_ADDR)
> +   params->acpi_rsdp_addr = efi.acpi20;
> +   else if (efi.acpi != EFI_INVALID_TABLE_ADDR)
> +   params->acpi_rsdp_addr = efi.acpi;
> +   } else if 
> (IS_ENABLED(CONFIG_ACPI_LEGACY_TABLES_LOOKUP)) {
> +   
> acpi_find_root_pointer(>acpi_rsdp_addr);
> +   }
> +   }
> +   if (!params->acpi_rsdp_addr)
> +   pr_warn("RSDP is not available for second kernel\n");
> +   }
>  #endif
>
> +#endif
> /* Setup EDD info */
> memcpy(params->eddbuf, boot_params.eddbuf,
> EDDMAXNR * sizeof(struct edd_info));
> diff --git a/drivers/acpi/acpica/tbxfroot.c b/drivers/acpi/acpica/tbxfroot.c
> index 483d0ce5180a..dac1e34a931c 100644
> --- a/drivers/acpi/acpica/tbxfroot.c
> +++ b/drivers/acpi/acpica/tbxfroot.c
> @@ -108,8 +108,7 @@ acpi_status acpi_tb_validate_rsdp(struct acpi_table_rsdp 
> *rsdp)
>   *
>   
> **/
>
> -acpi_status ACPI_INIT_FUNCTION
> -acpi_find_root_pointer(acpi_physical_address *table_address)
> +acpi_status acpi_find_root_pointer(acpi_physical_address *table_address)
>  {
> u8 *table_ptr;
> u8 *mem_rover;
> diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
> index 7aa38b648564..869d75ecaf7d 100644
> --- a/include/acpi/acpixf.h
> +++ b/include/acpi/acpixf.h
> @@ -474,7 +474,7 @@ ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
>  ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
> acpi_reallocate_root_table(void))
>
> -ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
> +ACPI_EXTERNAL_RETURN_STATUS(acpi_status
> acpi_find_root_pointer(acpi_physical_address
>*rsdp_address))
>  ACPI_EXTERNAL_RETURN_STATUS(acpi_status
> --
> 2.20.1
>


--
Best Regards,
Kairui Song

Re: [PATCH AUTOSEL 4.20 016/117] fanotify: return only user requested event types in event mask

2019-01-08 Thread Amir Goldstein

On Tue, Jan 8, 2019 at 10:11 PM Sasha Levin  wrote:
>
> From: Matthew Bobrowski 
>
> [ Upstream commit 2d10b23082a7eb8be508b3789f2e7250a88a5ddb ]
>
> Modify fanotify_should_send_event() so that it now returns a mask for
> an event that contains ONLY flags for the event types that have been
> specifically requested by the user. Flags that may have been included
> within the event mask, but have not been explicitly requested by the
> user will not be present in the returned value.
>
> As an example, given the situation where a user requests events of type
> FAN_OPEN. Traditionally, the event mask returned within an event that
> occurred on a filesystem object that has been marked for monitoring and is
> opened, will only ever have the FAN_OPEN bit set. With the introduction of
> the new flags like FAN_OPEN_EXEC, and perhaps any other future event
> flags, there is a possibility of the returned event mask containing more
> than a single bit set, despite having only requested the single event type.
> Prior to these modifications performed to fanotify_should_send_event(), a
> user would have received a bundled event mask containing flags FAN_OPEN
> and FAN_OPEN_EXEC in the instance that a file was opened for execution via
> execve(), for example. This means that a user would receive event types
> in the returned event mask that have not been requested. This runs the
> possibility of breaking existing systems and causing other unforeseen
> issues.
>
> To mitigate this possibility, fanotify_should_send_event() has been
> modified to return the event mask containing ONLY event types explicitly
> requested by the user. This means that we will NOT report events that the
> user did no set a mask for, and we will NOT report events that the user
> has set an ignore mask for.
>
> The function name fanotify_should_send_event() has also been updated so
> that it's more relevant to what it has been designed to do.
>
> Signed-off-by: Matthew Bobrowski 
> Reviewed-by: Amir Goldstein 
> Signed-off-by: Jan Kara 
> Signed-off-by: Sasha Levin 
> ---

Sasha,

I have no objection to applying this patch to 4.20, but FYI, it does
not fix anything.
Before introducing FAN_OPEN_EXEC in 5.0-rc1, this patch has no visible effect.

I don't mind if you apply it. It will make stable code closer to
mainline, which is
always a good thing IMO. And FWIW, I think that patch is quite trivial
and low risk.

Thanks,
Amir.

Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler

2019-01-08 Thread James Bottomley

On Tue, 2019-01-08 at 17:43 -0800, Andy Lutomirski wrote:
> [Adding Jarkko because this stuff relates to the TPM.]
> 
> On Tue, Jan 8, 2019 at 4:44 PM James Bottomley
>  wrote:
> > 
> > On Tue, 2019-01-08 at 15:54 -0800, Andy Lutomirski wrote:
> > > > On Jan 7, 2019, at 11:09 PM, Stephan Mueller  > > > de>
> > > > wrote:
> > > > 
> > > > Am Dienstag, 8. Januar 2019, 06:03:58 CET schrieb Herbert Xu:
> > > > 
> > > > Hi Herbert,
> > > > 
> > > > > Are we going to have multiple implementations for the same
> > > > > KDF? If not then the crypto API is not a good fit.  To
> > > > > consolidate multiple implementations of the same KDF, simply
> > > > > provide helpers for them.
> > > > 
> > > > It is unlikely to have multiple implementations of a KDF.
> > > > However, KDFs relate to hashes like block chaining modes to raw
> > > > block ciphers. Thus a KDF can be applied with different hashes.
> > > > 
> > > > My idea was to add template support to RNGs (because KDFs are
> > > > effectively a type of RNG since they produce an arbitrary
> > > > output from a fixed input). The KDFs would be a template
> > > > wrapping hashes.  For example, the CTR-KDF from SP800-108 could
> > > > be instantiated like kdf-ctr(sha256).
> > > > 
> > > > 
> > > 
> > > I think that, if the crypto API is going to grow a KDF facility,
> > > it should be done right. Have a key type or flag or whatever that
> > > says “this key may *only* be used to derive keys using such-and-
> > > such algorithm”, and have a helper to derive a key.  That helper
> > > should take some useful parameters and mix them in:
> > > 
> > > - What type of key is being derived?  ECDSA signing key?  HMAC
> > > key?  AES key?
> > > 
> > > - Can user code access the derived key?
> > > 
> > > - What is the key’s purpose?  “Encrypt and authenticate a
> > > hibernation image” would be a purpose.
> > > 
> > > - Number of bytes.
> > > 
> > > All of these parameters should be mixed in to the key derivation.
> > > 
> > > Also, an AE key, even for AES+HMAC, should be just one derived
> > > key.  If you need 512 bits, ask for a 512-bit key, not two 256-
> > > bit keys.
> > 
> > Actually, it would be enormously helpful if we could reuse these
> > pieces for the TPM as well.  It has two KDFs: KDFa, which is the
> > CTR-KDF from SP800-108 and KDFe which is the SP800-56A KDF for
> > elliptic curve one pass Diffie Hellman, so if we're going to do the
> > former, I'd really like the latter as well.
> > 
> > The way the TPM parametrises input to both KDFs is
> > 
> > (hashAlg, key, label, contextU, contextV, bits)
> > 
> > Where
> > 
> > hashAlg  = the hash algorithm used as the PRF
> > key  = the input parameter of variable bit size or
> >the x co-ordinate of the shared point
> > label= An ASCII string representing the use
> > contextU = public input U
> > contextV = public input V
> > bits = number of output bits
> > 
> > Is that a good enough parametrisation (not the only way you
> > distinguish uses is with the label, which is not
> > recoverable)?  ContextU and ContextV are simply concatenated to
> > form the full Context of SP800-108, but we tend to need two
> > separate inputs (for KDFe they're the public x co-ordinates of the
> > points of the two inputs to ECDH for instance; in KDFa they're
> > usually the local and remote nonces).
> > 
> > The labels for TPM usage are things like "INTEGRITY" for HMAC keys
> > or "CFB" when generating an aes128-cfb session key. For KDFe, the
> > tpm seems to like the label "SECRET".  Although the TPM specifies
> > fixed short strings for the label, nothing prevents them being
> > longer like the purpose you state above (essentially we could mix
> > purpose, use and key type into the label and the contexts).
> > 
> 
> That really ought to cover anything the kernel needs.
> 
> But can you explain what's up with with KDFe?  The obvious searches
> end up with just warnings that the US currently has no government :(

You mean you can't find SP100-56A because NIST is a government entity
and it's discontinued its website because of the government shutdown? 
No idea, I only live here, you'll have to ask a real American.

ACM does have a copy:

http://delivery.acm.org/10.1145/221/2206270/SP800-56A_Revision1_Mar08-2007.pdf?ip=50.35.68.20=2206270=OPEN=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1546993111_ed9c8bd24b2f838c829d428aac7f5d71

> Anyway, if we're talking about the TPM, it seems like the entire
> "trusted key" mechanism in the kernel is missing the point.  If I
> want to encrypt something like a hibernation image on a machine with
> a TPM, it makes essentially no sense to me that we would get a key
> with a known raw value that is merely TPM-backed (i.e. the "trusted
> key") and use that to decrypt the image.  The right way to do it is
> to the use the TPM as it was intended to be used: generate a single-
> use key that protects the hibernation image and seal *that* directly
>

[PATCH 2/2] x86, kexec_file_load: make it work with efi=noruntime or efi=oldmap

2019-01-08 Thread Kairui Song

When efi=noruntime or efi=oldmap is used, EFI services won't be available
in the second kernel, therefore the second kernel will not be able to get
the ACPI RSDP address from firmware by calling EFI services and won't
boot. Previously we are expecting the user to set the acpi_rsdp=
on kernel command line for second kernel as there was no way to pass RSDP
address to second kernel.

After commit e6e094e053af ('x86/acpi, x86/boot: Take RSDP address from
boot params if available'), now it's possible to set a acpi_rsdp_addr
parameter in the boot_params passed to second kernel, this commit make
use of it, detect and set the RSDP address when it's required for second
kernel to boot.

Tested with an EFI enabled KVM VM with efi=noruntime.

Suggested-by: Dave Young 
Signed-off-by: Kairui Song 
---
 arch/x86/kernel/kexec-bzimage64.c | 21 +
 drivers/acpi/acpica/tbxfroot.c|  3 +--
 include/acpi/acpixf.h |  2 +-
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kexec-bzimage64.c 
b/arch/x86/kernel/kexec-bzimage64.c
index 53917a3ebf94..0a90dcbd041f 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -255,8 +256,28 @@ setup_boot_parameters(struct kimage *image, struct 
boot_params *params,
/* Setup EFI state */
setup_efi_state(params, params_load_addr, efi_map_offset, efi_map_sz,
efi_setup_data_offset);
+
+#ifdef CONFIG_ACPI
+   /* Setup ACPI RSDP pointer in case EFI is not available in second 
kernel */
+   if (!acpi_disabled && (!efi_enabled(EFI_RUNTIME_SERVICES) || 
efi_enabled(EFI_OLD_MEMMAP))) {
+   /* Copied from acpi_os_get_root_pointer accordingly */
+   params->acpi_rsdp_addr = boot_params.acpi_rsdp_addr;
+   if (!params->acpi_rsdp_addr) {
+   if (efi_enabled(EFI_CONFIG_TABLES)) {
+   if (efi.acpi20 != EFI_INVALID_TABLE_ADDR)
+   params->acpi_rsdp_addr = efi.acpi20;
+   else if (efi.acpi != EFI_INVALID_TABLE_ADDR)
+   params->acpi_rsdp_addr = efi.acpi;
+   } else if 
(IS_ENABLED(CONFIG_ACPI_LEGACY_TABLES_LOOKUP)) {
+   acpi_find_root_pointer(>acpi_rsdp_addr);
+   }
+   }
+   if (!params->acpi_rsdp_addr)
+   pr_warn("RSDP is not available for second kernel\n");
+   }
 #endif
 
+#endif
/* Setup EDD info */
memcpy(params->eddbuf, boot_params.eddbuf,
EDDMAXNR * sizeof(struct edd_info));
diff --git a/drivers/acpi/acpica/tbxfroot.c b/drivers/acpi/acpica/tbxfroot.c
index 483d0ce5180a..dac1e34a931c 100644
--- a/drivers/acpi/acpica/tbxfroot.c
+++ b/drivers/acpi/acpica/tbxfroot.c
@@ -108,8 +108,7 @@ acpi_status acpi_tb_validate_rsdp(struct acpi_table_rsdp 
*rsdp)
  *
  
**/
 
-acpi_status ACPI_INIT_FUNCTION
-acpi_find_root_pointer(acpi_physical_address *table_address)
+acpi_status acpi_find_root_pointer(acpi_physical_address *table_address)
 {
u8 *table_ptr;
u8 *mem_rover;
diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index 7aa38b648564..869d75ecaf7d 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -474,7 +474,7 @@ ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
 ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
acpi_reallocate_root_table(void))
 
-ACPI_EXTERNAL_RETURN_STATUS(acpi_status ACPI_INIT_FUNCTION
+ACPI_EXTERNAL_RETURN_STATUS(acpi_status
acpi_find_root_pointer(acpi_physical_address
   *rsdp_address))
 ACPI_EXTERNAL_RETURN_STATUS(acpi_status
-- 
2.20.1

Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver

2019-01-08 Thread Vesa Jääskeläinen


Hi Jacek,

On 07/01/2019 23.13, Jacek Anaszewski wrote:

Hi Vesa,

On 1/5/19 1:39 AM, Vesa Jääskeläinen wrote:

Hi Jacek,

On 04/01/2019 23.37, Jacek Anaszewski wrote:

But, aside from that hypothetic issue, we need a solution for
LEDn_BRIGHTNESS feature of lp5024, i.e. setting color intensity
via a single register write. How would you propose to address that?


You could model it to something like this in device tree:

led-module @  {
 compatible = "lp5024";

 // There is in hardware setup to use either linear or
 // logarithmic scaling:
 //enable-logarithmic-brightness;

 led0 {
 // this will create led instance for LED0 in lp5024
 label = "lp-led0";

 // This specifies LED number within lp5024
 led-index = <0>;   // set output-base as 0*3 == 0

 element-red {
 // refers to OUT0
 output-offset = <0>;
 };

 element-green {
 // refers to OUT1
 output-offset = <1>;
 };

 element-blue {
 // refers to OUT2
 output-offset = <2>;
 };

 };

 led1 {
 // this will create led instance for LED1 in lp5024
 label = "lp-led1";

 // This specifies LED number within lp5024
 led-index = <1>;   // set output-base as 1*3 == 3

 element-red {
 // refers to OUT3
 output-offset = <0>;
 };

 element-green {
 // refers to OUT4
 output-offset = <1>;
 };

 element-blue {
 // refers to OUT5
 output-offset = <2>;
 };

 };

 bank-led {
 // this will create led instance for bank leds in lp5024
 label = "lp-bank-led";

 // configured bank led configuration
 led-index = <2 3 4 5 6 7>;
 // As here is list of led-indices this entry is
 // assumed to be bank configuration. Bank mode is enable
 // for the indices.

 // set output-base as BANK A

 element-red {
 // refers to BANK A
 output-offset = <0>;
 };

 element-green {
 // refers to BANK B
 output-offset = <1>;
 };

 element-blue {
 // refers to BANK C
 output-offset = <2>;
 };
 };
};

This would then create three led instances and each led instance has 
brightness setting and that goes straight to hardware.


If one would want to override hardware control for brightness then I 
suppose you would define in led node something like:


 brightness-model = "hsl"

This would then pick red, green and blue elements for hsl calculations 
and others color elements for linear. LED specific hardware brightness 
would then be either 0 or 0xFF depending if all of LED color elements 
are zero or not.


Would that kind of model work?


I'd prefer to have single RGB LED device. And your DT design
is unnecessarily complex and a bit confusing.


As this chip series is kinda designed for N x RGB LED's my idea was that 
if from user space point of view we model it as N times of individual 
RGB LED instances that may not even have anything to do with together. 
Eg. could be used for different purposes and such.


And in device tree one would define logical connections for the leds so 
they would be mapped logically correct to user space.


If one would define it like:

led1 {
// this will create led instance for LED1 in lp5024
label = "lp-led1";

// This specifies LED number within lp5024
led-sources = <1>;
};
(note changed led-index to led-sources as that is what Pavel had and 
preferred)


We could assume that it is RGB led in this driver's case and create it 
automatically with elements "red", "green", and "blue". And this could 
then be mapped automatically to HSL color elements or what ever the 
model would be.


If you would model it differently in your hardware design then you would 
need to define more device tree nodes. Eg. if your order of LEDs would 
not be red, green, blue. Or if you would have non-RGB led(s) in there.



Also, you provided scarce information about sysfs interface.
It would be nice to see the sequence of commands.


In this case it could be:

# Note: Updated color to value array model.

$ ls /sys/class/leds
lp-led0 lp-led1 lp-bank-led

$ ls /sys/class/leds/lp-led0
brightness  color

$ ls /sys/class/leds/lp-led1
brightness  color

$ ls /sys/class/leds/lp-bank-led
brightness  color

# Idea of above is that as brightness is for triplet:
#   OUT(LED*3 + 0), OUT(LED*3 + 1), OUT(LED*3 + 2),
# Then if we model it like RGB LED then brightness would automatically
# map to correct OUTputs and be grouped from user space point of view
# logically in correct place.

# set first led to red
echo "255 0 0" > /sys/class/leds/lp-led0/color

# set second led to green
echo "0 255 0" > /sys/class/leds/lp-led1/color

# set

Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler

2019-01-08 Thread Stephan Mueller

Am Mittwoch, 9. Januar 2019, 01:44:31 CET schrieb James Bottomley:

Hi James,

> Actually, it would be enormously helpful if we could reuse these pieces
> for the TPM as well. 

Could you please help me understand whether the KDFs in TPM are directly 
usable as a standalone cipher primitive or does it go together with additional 
key generation operations?

> It has two KDFs: KDFa, which is the CTR-KDF from
> SP800-108 and KDFe which is the SP800-56A KDF for elliptic curve one
> pass Diffie Hellman, so if we're going to do the former, I'd really
> like the latter as well.
> 
> The way the TPM parametrises input to both KDFs is
> 
> (hashAlg, key, label, contextU, contextV, bits)
> 
> Where
> 
> hashAlg  = the hash algorithm used as the PRF
> key  = the input parameter of variable bit size or
>the x co-ordinate of the shared point
> label= An ASCII string representing the use
> contextU = public input U
> contextV = public input V
> bits = number of output bits

When implementing KDFs as an extension of the kernel crypto API's RNG facility 
we currently have to handle the limitiation of the existing API. The label/
context data (and when considering RFC 5869 HKDF requring IKM, salt and 
additional information as input) currently cannot directly be communicated 
through the API.

The issue is that the RNG facility currently has the following prototype 
defined:

int (*generate)(struct crypto_rng *tfm,
const u8 *src, unsigned int slen,
u8 *dst, unsigned int dlen);

The src pointer would need to take the label/context data.

Would it be appropriate, to implement a type cast to a structure from the u8 
pointer?

E.g. for the aforementioned label/context data, we could define something like

struct crypto_kdf_ctr {
char *label;
size_t label_len;
u8 *contextU;
size_t contextU_len;
u8 *contextV;
size_t contextV_len;
};

And the implementation of the generate function for CTR KDF would then have a 
type cast along the following lines:

if (slen != sizeof(struct crypto_kdf_ctr))
return -EINVAL;
const struct crypto_kdf_ctr *kdf_ctr_input = (struct crypto_kdf_ctr 
*)src;


For different KDFs, different structs would be needed.

Ciao
Stephan

[PATCH 1/2] x86, kexec_file_load: Don't setup EFI info if EFI runtime is not enabled

2019-01-08 Thread Kairui Song

Currenly with "efi=noruntime" in kernel command line, calling
kexec_file_load will raise below problem:

[   97.967067] BUG: unable to handle kernel NULL pointer dereference at 

[   97.967894] #PF error: [normal kernel read fault]
...
[   97.980456] Call Trace:
[   97.980724]  efi_runtime_map_copy+0x28/0x30
[   97.981267]  bzImage64_load+0x688/0x872
[   97.981794]  arch_kexec_kernel_image_load+0x6d/0x70
[   97.982441]  kimage_file_alloc_init+0x13e/0x220
[   97.983035]  __x64_sys_kexec_file_load+0x144/0x290
[   97.983586]  do_syscall_64+0x55/0x1a0
[   97.983962]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

When efi runtime is not enabled, efi memmap is not mapped, so just skip
EFI info setup.

Suggested-by: Dave Young 
Signed-off-by: Kairui Song 
---
 arch/x86/kernel/kexec-bzimage64.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/kexec-bzimage64.c 
b/arch/x86/kernel/kexec-bzimage64.c
index 0d5efa34f359..53917a3ebf94 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -167,6 +167,9 @@ setup_efi_state(struct boot_params *params, unsigned long 
params_load_addr,
struct efi_info *current_ei = _params.efi_info;
struct efi_info *ei = >efi_info;
 
+   if (!efi_enabled(EFI_RUNTIME_SERVICES))
+   return 0;
+
if (!current_ei->efi_memmap_size)
return 0;
 
-- 
2.20.1

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Alexey Kardashevskiy




On 09/01/2019 16:30, David Gibson wrote:
> On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote:
>> On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote:
>>>
 In a very cryptic way that requires manual parsing using non-public
 docs sadly but yes. From the look of it, it's a completion timeout.

 Looks to me like we don't get a response to a config space access
 during the change of D state. I don't know if it's the write of the D3
 state itself or the read back though (it's probably detected on the
 read back or a subsequent read, but that doesn't tell me which specific
 one failed).
>>>
>>> If it is just one card doing it (again, check you have latest
>>> firmware) I wonder if it is a sketchy PCI-E electrical link that is
>>> causing a long re-training cycle? Can you tell if the PCI-E link is
>>> permanently gone or does it eventually return?
>>
>> No, it's 100% reproducable on systems with that specific card model,
>> not card instance, and maybe different systems/cards as well, I'll let
>> David & Alexey comment further on that.
> 
> Well, it's 100% reproducable on a particular model of system
> (garrison) with a particular model of card.  I've had some suggestions
> that it fails with some other systems card card models, but nothing
> confirmed - the one other system model I've been able to try, which
> also had a newer card model didn't reproduce the problem.

I have just moved the "Mellanox Technologies MT27700 Family
[ConnectX-4]" from garrison to firestone machine and there it does not
produce an EEH, with the same kernel and skiboot (both upstream + my
debug). Hm. I cannot really blame the card but I cannot see what could
cause the difference in skiboot either. I even tried disabling NPU so
garrison would look like firestone, still EEH'ing.



>>> Does the card work in Gen 3 when it starts? Is there any indication of
>>> PCI-E link errors?
>>
>> Nope.
>>
>>> Everytime or sometimes?
>>>
>>> POWER 8 firmware is good? If the link does eventually come back, is
>>> the POWER8's D3 resumption timeout long enough?
>>>
>>> If this doesn't lead to an obvious conclusion you'll probably need to
>>> connect to IBM's Mellanox support team to get more information from
>>> the card side.
>>
>> We are IBM :-) So far, it seems to be that the card is doing something
>> not quite right, but we don't know what. We might need to engage
>> Mellanox themselves.
> 
> Possibly.  On the other hand, I've had it reported that this is a
> software regression at least with downstream red hat kernels.  I
> haven't yet been able to eliminate factors that might be confusing
> that, or try to find a working version upstream.

Do you have tarballs handy? I'd diff...


-- 
Alexey

Re: [PATCH] lkdtm: Add a tests for NULL pointer dereference

2019-01-08 Thread Christophe Leroy





Le 09/01/2019 à 02:14, Kees Cook a écrit :

On Fri, Dec 14, 2018 at 7:26 AM Christophe Leroy
 wrote:


Introduce lkdtm tests for NULL pointer dereference: check
access or exec at NULL address.


Why is this not already covered by the existing tests? (Is there
something special about NULL that is being missed?) I'd expect SMAP
and SMEP to cover NULL as well.


Most arches print a different message whether the faulty address is 
above or under PAGE_SIZE. Below is exemple from x86:


pr_alert("BUG: unable to handle kernel %s at %px\n",
 address < PAGE_SIZE ? "NULL pointer dereference" : "paging 
request",
 (void *)address);


Until recently, the powerpc arch didn't do it. When I implemented it 
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=49a502ea23bf9dec47f8f3c3960909ff409cd1bb), 
I needed a way to test it and couldn't find an existing one, hence this 
new LKDTM test.


But maybe I missed something ?

Christophe



-Kees



Signed-off-by: Christophe Leroy 
---
  drivers/misc/lkdtm/core.c  |  2 ++
  drivers/misc/lkdtm/lkdtm.h |  2 ++
  drivers/misc/lkdtm/perms.c | 18 ++
  3 files changed, 22 insertions(+)

diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index bc76756b7eda..36910e1d5c09 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -157,7 +157,9 @@ static const struct crashtype crashtypes[] = {
 CRASHTYPE(EXEC_VMALLOC),
 CRASHTYPE(EXEC_RODATA),
 CRASHTYPE(EXEC_USERSPACE),
+   CRASHTYPE(EXEC_NULL),
 CRASHTYPE(ACCESS_USERSPACE),
+   CRASHTYPE(ACCESS_NULL),
 CRASHTYPE(WRITE_RO),
 CRASHTYPE(WRITE_RO_AFTER_INIT),
 CRASHTYPE(WRITE_KERN),
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 3c6fd327e166..b69ee004a3f7 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -45,7 +45,9 @@ void lkdtm_EXEC_KMALLOC(void);
  void lkdtm_EXEC_VMALLOC(void);
  void lkdtm_EXEC_RODATA(void);
  void lkdtm_EXEC_USERSPACE(void);
+void lkdtm_EXEC_NULL(void);
  void lkdtm_ACCESS_USERSPACE(void);
+void lkdtm_ACCESS_NULL(void);

  /* lkdtm_refcount.c */
  void lkdtm_REFCOUNT_INC_OVERFLOW(void);
diff --git a/drivers/misc/lkdtm/perms.c b/drivers/misc/lkdtm/perms.c
index fa54add6375a..62f76d506f04 100644
--- a/drivers/misc/lkdtm/perms.c
+++ b/drivers/misc/lkdtm/perms.c
@@ -164,6 +164,11 @@ void lkdtm_EXEC_USERSPACE(void)
 vm_munmap(user_addr, PAGE_SIZE);
  }

+void lkdtm_EXEC_NULL(void)
+{
+   execute_location(NULL, CODE_AS_IS);
+}
+
  void lkdtm_ACCESS_USERSPACE(void)
  {
 unsigned long user_addr, tmp = 0;
@@ -195,6 +200,19 @@ void lkdtm_ACCESS_USERSPACE(void)
 vm_munmap(user_addr, PAGE_SIZE);
  }

+void lkdtm_ACCESS_NULL(void)
+{
+   unsigned long tmp;
+   unsigned long *ptr = (unsigned long *)NULL;
+
+   pr_info("attempting bad read at %px\n", ptr);
+   tmp = *ptr;
+   tmp += 0xc0dec0de;
+
+   pr_info("attempting bad write at %px\n", ptr);
+   *ptr = tmp;
+}
+
  void __init lkdtm_perms_init(void)
  {
 /* Make sure we can write to __ro_after_init values during __init */
--
2.13.3

Re: [PATCH v2] staging: erofs: Add identifier for function definition arguments

2019-01-08 Thread Chao Yu

On 2019/1/8 21:24, Sidong Yang wrote:
> Add identifier for function definition arguments in xattr_iter_handlers,
> this change clears the checkpatch.pl issue and make code more explicit.
> 
> Signed-off-by: Sidong Yang 

Reviewed-by: Chao Yu 

Thanks,

Re: [PATCH 1/5 v2] PM / hibernate: Create snapshot keys handler

2019-01-08 Thread Stephan Mueller

Am Mittwoch, 9. Januar 2019, 00:54:22 CET schrieb Andy Lutomirski:

Hi Andy,
> 
> I think that, if the crypto API is going to grow a KDF facility, it should
> be done right. Have a key type or flag or whatever that says “this key may
> *only* be used to derive keys using such-and-such algorithm”, and have a
> helper to derive a key.  That helper should take some useful parameters and
> mix them in:
> 
> - What type of key is being derived?  ECDSA signing key?  HMAC key?  AES
> key?
> 
> - Can user code access the derived key?
> 
> - What is the key’s purpose?  “Encrypt and authenticate a hibernation image”
> would be a purpose.
> 
> - Number of bytes.
> 
> All of these parameters should be mixed in to the key derivation.
> 
> Also, an AE key, even for AES+HMAC, should be just one derived key.  If you
> need 512 bits, ask for a 512-bit key, not two 256-bit keys.

I concur with your requirements. However, is the kernel crypto API the right 
place to enforce such policies? To me, the kernel crypto API is a tinker-toy 
set of ciphers.

The real policy enforcer would or should be the keyring facility. Thus, may I 
propose to:

- implement the cryptographic primitive of the KDF in the kernel crypto API

- implement the policy system how to use the KDF in the keyring facility

Ciao
Stephan

Re: [PATCH] ACPI/nfit: delete the redundant header file

2019-01-08 Thread Dan Williams

On Sun, Jan 6, 2019 at 6:33 PM Xiaochun Lee  wrote:
>
> From: Xiaochun Lee 
>
> The header file "intel.h" is repeated here, So delete one.
>
> Signed-off-by: Xiaochun Lee 

Thanks, applied.

By the way, no need for a cover letter if it's only one patch.

Re: [PATCH] ACPI/nfit: delete the function to_acpi_nfit_desc

2019-01-08 Thread Dan Williams

On Sat, Jan 5, 2019 at 12:09 AM Xiaochun Lee  wrote:
>
> From: Xiaochun Lee 
>
> The function to_acpi_nfit_desc and function to_acpi_desc
> do the same things,delete the function to_acpi_nfit_desc,
> and keep the inline function to_acpi_desc.
>
> Signed-off-by: Xiaochun Lee 

Good catch, thanks, applied.

Re: [PATCH v2] nfit: Mark some functions as __maybe_unused

2019-01-08 Thread Dan Williams

On Tue, Jan 8, 2019 at 9:03 PM Nathan Chancellor
 wrote:
>
> On arm64 little endian allyesconfig:
>
> drivers/acpi/nfit/intel.c:149:12: warning: unused function 
> 'intel_security_unlock' [-Wunused-function]
> static int intel_security_unlock(struct nvdimm *nvdimm,
>^
> drivers/acpi/nfit/intel.c:230:12: warning: unused function 
> 'intel_security_erase' [-Wunused-function]
> static int intel_security_erase(struct nvdimm *nvdimm,
>^
> drivers/acpi/nfit/intel.c:279:12: warning: unused function 
> 'intel_security_query_overwrite' [-Wunused-function]
> static int intel_security_query_overwrite(struct nvdimm *nvdimm)
>^
> drivers/acpi/nfit/intel.c:316:12: warning: unused function 
> 'intel_security_overwrite' [-Wunused-function]
> static int intel_security_overwrite(struct nvdimm *nvdimm,
>^
> 4 warnings generated.
>
> Mark these functions as __maybe_unused because they are only used when
> CONFIG_X86 is set.
>
> Fixes: 4c6926a23b76 ("acpi/nfit, libnvdimm: Add unlock of nvdimm support for 
> Intel DIMMs")
> Suggested-by: Dan Williams 
> Signed-off-by: Nathan Chancellor 

Looks good, applied.

Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order

2019-01-08 Thread Arun KS


On 2019-01-09 03:47, Alexander Duyck wrote:

On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote:
When freeing pages are done with higher order, time spent on 
coalescing
pages by buddy allocator can be reduced.  With section size of 256MB, 
hot
add latency of a single section shows improvement from 50-60 ms to 
less

than 1 ms, hence improving the hot add latency by 60 times.  Modify
external providers of online callback to align with the change.

Signed-off-by: Arun KS 
Acked-by: Michal Hocko 
Reviewed-by: Oscar Salvador 


Sorry, ended up encountering a couple more things that have me a bit
confused.

[...]


diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 5301fef..211f3fe 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -771,7 +771,7 @@ static void hv_mem_hot_add(unsigned long start, 
unsigned long size,

}
 }

-static void hv_online_page(struct page *pg)
+static int hv_online_page(struct page *pg, unsigned int order)
 {
struct hv_hotadd_state *has;
unsigned long flags;
@@ -783,10 +783,12 @@ static void hv_online_page(struct page *pg)
if ((pfn < has->start_pfn) || (pfn >= has->end_pfn))
continue;

-   hv_page_online_one(has, pg);
+   hv_bring_pgs_online(has, pfn, (1UL << order));
break;
}
spin_unlock_irqrestore(_device.ha_lock, flags);
+
+   return 0;
 }

 static int pfn_covered(unsigned long start_pfn, unsigned long 
pfn_cnt)


So the question I have is why was a return value added to these
functions? They were previously void types and now they are int. What
is the return value expected other than 0?


Earlier with returning a void there was now way for an arch code to 
denying onlining of this particular page. By using an int as return 
type, we can implement this. In one of the boards I was using, there are 
some pages which should not be onlined because they are used for other 
purposes(like secure trust zone or hypervisor).





diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index ceb5048..95f888f 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -345,8 +345,8 @@ static enum bp_state 
reserve_additional_memory(void)


/*
 * add_memory_resource() will call online_pages() which in its turn
-* will call xen_online_page() callback causing deadlock if we don't
-* release balloon_mutex here. Unlocking here is safe because the
+* will call xen_bring_pgs_online() callback causing deadlock if we
+	 * don't release balloon_mutex here. Unlocking here is safe because 
the

 * callers drop the mutex before trying again.
 */
mutex_unlock(_mutex);
@@ -369,15 +369,22 @@ static enum bp_state 
reserve_additional_memory(void)

return BP_ECANCELED;
 }

-static void xen_online_page(struct page *page)
+static int xen_bring_pgs_online(struct page *pg, unsigned int order)


Why did we rename this function? I see it was added as a new function
in v3, however in v4 we ended up replacing it completely. So why not
just keep the same name and make it easier for us to identify that the
is the Xen version of the XXX_online_pages callback?


Point taken. Will send a patch.



[...]

+static int online_pages_blocks(unsigned long start, unsigned long 
nr_pages)

+{
+   unsigned long end = start + nr_pages;
+   int order, ret, onlined_pages = 0;
+
+   while (start < end) {
+   order = min(MAX_ORDER - 1,
+   get_order(PFN_PHYS(end) - PFN_PHYS(start)));
+
+   ret = (*online_page_callback)(pfn_to_page(start), order);
+   if (!ret)
+   onlined_pages += (1UL << order);
+   else if (ret > 0)
+   onlined_pages += ret;
+


So if the ret > 0 it is supposed to represent how many pages were
onlined within a given block? What if the ret was negative? Really I am
not a fan of adding a return value to the online functions unless we
specifically document what the expected return values are supposed to
be. If we don't have any return values other than 0 there isn't much
point in having one anyway.


I ll document this.

Regards,
Arun

Re: [PATCH 2/2] leds: lp5024: Add the LP5024/18 RGB LED driver

2019-01-08 Thread Vesa Jääskeläinen


Hi Dan,

On 07/01/2019 21.34, Dan Murphy wrote:

Vesa

On 1/4/19 6:39 PM, Vesa Jääskeläinen wrote:

Hi Jacek,

On 04/01/2019 23.37, Jacek Anaszewski wrote:

But, aside from that hypothetic issue, we need a solution for
LEDn_BRIGHTNESS feature of lp5024, i.e. setting color intensity
via a single register write. How would you propose to address that?


You could model it to something like this in device tree:

led-module @  {
 compatible = "lp5024";

 // There is in hardware setup to use either linear or
 // logarithmic scaling:
 //enable-logarithmic-brightness;

 led0 {
     // this will create led instance for LED0 in lp5024
     label = "lp-led0";

     // This specifies LED number within lp5024

     led-index = <0>;   // set output-base as 0*3 == 0

     element-red {

     // refers to OUT0
     output-offset = <0>;
     };

     element-green {

     // refers to OUT1
     output-offset = <1>;
     };

     element-blue {

     // refers to OUT2
     output-offset = <2>;
     };

 };


 led1 {
     // this will create led instance for LED1 in lp5024
     label = "lp-led1";

     // This specifies LED number within lp5024

     led-index = <1>;   // set output-base as 1*3 == 3



Can we not use led-sources like I have done already?


It was just for illustration of the idea. Names can be agreed. I have 
nothing against led-sources name. I was just looking at datasheet to try 
to undestand what it did and then tried to figure out if it could be 
mapped the idea I have been playing with.



I really like to keep the DT nodes simple and re-use nodes that exist if 
possible.


I'll reply to Jacek's email about more clarifications of the idea.

Thanks,
Vesa Jääskeläinen


My code already maps and groups the outputs into the associated banks

RE: [RFC PATCH v1] cfg80211/nl80211: add support for AID assignment by driver : https://patchwork.kernel.org/patch/10726899/

2019-01-08 Thread Garnayak, Sarada Prasanna

My Bad, Onwards I will submit a new version of the RFC patch to address the 
reviewer comments on previous RFC patch and for query or suggestions, I will 
definitely cc Linux-wireless in my mail without any external link or attachment.

Thanks,
Prasanna

-Original Message-
From: Kalle Valo [mailto:kv...@codeaurora.org] 
Sent: Tuesday, January 8, 2019 10:50 PM
To: Garnayak, Sarada Prasanna 
Cc: Berg, Johannes ; Jouni Malinen ; 
Sergey Matyukevich ; 
linux-kernel@vger.kernel.org; linux-wirel...@vger.kernel.org
Subject: Re: [RFC PATCH v1] cfg80211/nl80211: add support for AID assignment by 
driver : https://patchwork.kernel.org/patch/10726899/

+ linux-wireless

"Garnayak, Sarada Prasanna"  writes:

> Thank you so much for reviewing my RFC patch and here I would like to 
> address the big concerns/comments about this RFC patch like
>
> Ø What are the exact use case and potential user of this patch.
>
> Ø How the Hostapd will handle this get aid and free aid.
>
> Ø etc.
>
> In the current implementation in WLAN Stack, the hostapd is assigning 
> the AID for a station during association.
>
> But we have a few WLAN hardware which is maintaining the AID in the 
> WLAN hardware level itself and the firmware checking this AID in the 
> hardware level according to that the firmware is taking action on 
> station management
>
> like (Station DB management, BA agreement, SMPS and TM management 
> frame etc). For this reason, we need these get_aid/free_aid.
>
> Also here I am attaching the Hostapd RFC Patch for the same. PFA 
> (hostapd patch, call flow diagram and comments on this kernel patch)
>
> Please review the call flow below:

You made few crucial mistakes:

1. You did not CC linux-wireless.

2. You submitted the message as HTML and our lists automatically filter
   those.

3. You attached a powerpoint file and I would hope that our lists
   automatically filter those as well. At least I'm not going to open
   any random powerpoint/word files I receive via email.

--
Kalle Valo

Re: [PATCH v2] f2fs: fix sbi->extent_list corruption issue

2019-01-08 Thread Chao Yu

On 2019/1/9 12:38, Jaegeuk Kim wrote:
> On 01/07, Chao Yu wrote:
>> On 2019/1/5 4:33, Jaegeuk Kim wrote:
>>> On 01/04, Sahitya Tummala wrote:
 On Mon, Nov 26, 2018 at 10:17:20AM +0530, Sahitya Tummala wrote:
> When there is a failure in f2fs_fill_super() after/during
> the recovery of fsync'd nodes, it frees the current sbi and
> retries again. This time the mount is successful, but the files
> that got recovered before retry, still holds the extent tree,
> whose extent nodes list is corrupted since sbi and sbi->extent_list
> is freed up. The list_del corruption issue is observed when the
> file system is getting unmounted and when those recoverd files extent
> node is being freed up in the below context.
>
> list_del corruption. prev->next should be fff1e1ef5480, but was (null)
> <...>
> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
> task: fff1f46f2280 task.stack: ff8008068000
> lr : __list_del_entry_valid+0x94/0xb4
> pc : __list_del_entry_valid+0x94/0xb4
> <...>
> Call trace:
> __list_del_entry_valid+0x94/0xb4
> __release_extent_node+0xb0/0x114
> __free_extent_tree+0x58/0x7c
> f2fs_shrink_extent_tree+0xdc/0x3b0
> f2fs_leave_shrinker+0x28/0x7c
> f2fs_put_super+0xfc/0x1e0
> generic_shutdown_super+0x70/0xf4
> kill_block_super+0x2c/0x5c
> kill_f2fs_super+0x44/0x50
> deactivate_locked_super+0x60/0x8c
> deactivate_super+0x68/0x74
> cleanup_mnt+0x40/0x78
> __cleanup_mnt+0x1c/0x28
> task_work_run+0x48/0xd0
> do_notify_resume+0x678/0xe98
> work_pending+0x8/0x14
>
> Fix this by cleaning up inodes, extent tree and nodes of those
> recovered files before freeing up sbi and before next retry.
>
 Hi Jaegeuk, Chao,

 I have observed another scenario where the similar list corruption issue
 can happen with sbi->inode_list as well. If recover_fsync_data()
 fails at some point in write_checkpoint() due to some error and if
 those recovered inodes are still dirty, then after the mount is
 successful, this issue is observed when that dirty inode is under 
 writeback.
>>>
>>> recover_fsync_data() does iget/iput in pair, and destroy_fsync_dnodes() 
>>> drops
>>> its dirty list and call iput(), when there is an error. So, after then, 
>>> there'd
>>> be no dirty inodes. If there's no error, checkpoint() flushes quota/dentry 
>>> pages
>>> in dirty inodes as well. Can we check where this dirty inode came from?
>>
>> I guess it comes from:
>>
>> f2fs_recover_fsync_data()
>>
>>  /* Needed for iput() to work correctly and not trash data */
>>  sbi->sb->s_flags |= SB_ACTIVE;
>>
>> iput_final()
>>
>>  if (!drop && (sb->s_flags & SB_ACTIVE)) {
>>  inode_add_lru(inode);
>>  spin_unlock(>i_lock);
>>  return;
>>  }
>>
>> So dirty data in those inode can be remained after iput(), then meta/node
>> can be persisted during next checkpoint, if checkpoint failed due to error,
>> dirty inode remain in system. IIUC.
> 
> 
> 749 err = recover_data(sbi, _list, _inode_list, _list);
> 750 if (!err)
> 751 f2fs_bug_on(sbi, !list_empty(_list));
> 752 else {
> 753 /* restore s_flags to let iput() trash data */
> 754 sbi->sb->s_flags = s_flags;
> 755 }
> 
> We deactivate sb before iput?

We only restore s_flags in error path of recover_data? I remember Sahitya
said his case is encountering error in later checkpoint()?

Thanks,

> 
>>
>>>
>>> Oh, one sceanrio can be an error by f2fs_disable_checkpoint() which will do 
>>> GC.
>>>

 [   90.400500] list_del corruption. prev->next should be ffed1f566208, 
 but was (null)
 [   90.675349] Call trace:
 [   90.677869]  __list_del_entry_valid+0x94/0xb4
 [   90.682351]  remove_dirty_inode+0xac/0x114
 [   90.686563]  __f2fs_write_data_pages+0x6a8/0x6c8
 [   90.691302]  f2fs_write_data_pages+0x40/0x4c
 [   90.695695]  do_writepages+0x80/0xf0
 [   90.699372]  __writeback_single_inode+0xdc/0x4ac
 [   90.704113]  writeback_sb_inodes+0x280/0x440
 [   90.708501]  wb_writeback+0x1b8/0x3d0
 [   90.712267]  wb_workfn+0x1a8/0x4d4
 [   90.715765]  process_one_work+0x1c0/0x3d4
 [   90.719883]  worker_thread+0x224/0x344
 [   90.723739]  kthread+0x120/0x130
 [   90.727055]  ret_from_fork+0x10/0x18

 I think it is better to cleanup those inodes completely before freeing sbi
 and before next retry as done in this patch. Would you like to re-consider
 this patch for this new issue?
>>>
>>> The patch was merged in mainline already.
>>> Could you take a look at this patch?
>>>
>>> >From cb1d20e640402beed300c2bdce79311ee8a781ad Mon Sep 17 00:00:00 2001
>>> From: Jaegeuk Kim 
>>> Date: Fri, 4 Jan 2019 12:29:00 -0800
>>> Subject: [PATCH] f2fs: sync filesystem after roll-forward recovery
>>
>> You mean android kernel

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Benjamin Herrenschmidt

On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote:
> 
> > In a very cryptic way that requires manual parsing using non-public
> > docs sadly but yes. From the look of it, it's a completion timeout.
> > 
> > Looks to me like we don't get a response to a config space access
> > during the change of D state. I don't know if it's the write of the D3
> > state itself or the read back though (it's probably detected on the
> > read back or a subsequent read, but that doesn't tell me which specific
> > one failed).
> 
> If it is just one card doing it (again, check you have latest
> firmware) I wonder if it is a sketchy PCI-E electrical link that is
> causing a long re-training cycle? Can you tell if the PCI-E link is
> permanently gone or does it eventually return?

No, it's 100% reproducable on systems with that specific card model,
not card instance, and maybe different systems/cards as well, I'll let
David & Alexey comment further on that.

> Does the card work in Gen 3 when it starts? Is there any indication of
> PCI-E link errors?

Nope.

> Everytime or sometimes?
> 
> POWER 8 firmware is good? If the link does eventually come back, is
> the POWER8's D3 resumption timeout long enough?
> 
> If this doesn't lead to an obvious conclusion you'll probably need to
> connect to IBM's Mellanox support team to get more information from
> the card side.

We are IBM :-) So far, it seems to be that the card is doing something
not quite right, but we don't know what. We might need to engage
Mellanox themselves.

Cheers,
Ben.

Re: [PATCH] pseries/hotplug: Add more delay in pseries_cpu_die while waiting for rtas-stop

2019-01-08 Thread Gautham R Shenoy

Hello Thiago,

Wish you a happy 2019!

On Sat, Dec 08, 2018 at 12:40:52AM -0200, Thiago Jung Bauermann wrote:
> 
> Gautham R Shenoy  writes:
> > On Fri, Dec 07, 2018 at 04:13:11PM +0530, Gautham R Shenoy wrote:
> >> Sure. I will test the patch and report back.
> >
> > I added the following debug patch on top of your patch, and after an
> > hour's run, the system crashed. Appending the log at the end.
> 
> Thank you very much for testing! Your debug patch was very helpful as
> well.
> 
> > I suppose we still need to increase the number of tries since we wait
> > only for 2.5ms looping before giving up.
> 
> Do you think it would have helped? In the debug output you posted I
> would have expected the following message to show up if the loop
> finished too early, and it didn't:
> 
> "Querying DEAD? cpu %i (%i) shows %i\n"
> 
> So I don't think increasing the loop length would have made a
> difference. In fact, the call to smp_query_cpu_stopped() always
> succeeded on the first iteration.

I did some testing during the holidays. Here are the observations:

1) With just your patch (without any additional debug patch), if I run
DLPAR on /off operations on a system that has SMT=off, I am able to
see a crash involving RTAS stack corruption within an hour's time.

2) With the debug patch (appended below) which has additional debug to
capture the callers of stop-self, start-cpu, set-power-levels, the
system is able to perform DLPAR on/off operations on a system with
SMT=off for three days. And then, it crashed with the dead CPU showing
a "Bad kernel stack pointer". From this log, I can clearly
see that there were no concurrent calls to stop-self, start-cpu,
set-power-levels. The only concurrent RTAS calls were the dying CPU
calling "stop-self", and the CPU running the DLPAR operation calling
"query-cpu-stopped-state". The crash signature is appended below as
well.

3) Modifying your patch to remove the udelay and increase the loop
count from 25 to 1000 doesn't improve the situation. We are still able
to see the crash.

4) With my patch, even without any additional debug, I was able to
observe the system run the tests successfully for over a week (I
started the tests before the Christmas weekend, and forgot to turn it
off!)

It appears that there is a narrow race window involving rtas-stop-self
and rtas-query-cpu-stopped-state calls that can be observed with your
patch. Adding any printk's seems to reduce the probability of hitting
this race window. It might be worth the while to check with RTAS
folks, if they suspect something here.

The Crash log with this debug patch is as follows

[DEBUG] CPU 32 waiting for CPU 130 to enter rtas
cpu 130 (hwid 130) Ready to die...
[DEBUG] CPU 32 noticed CPU 130 enter rtas: tries=0, time=539
[DEBUG] CPU 32 waiting for CPU 131 to enter rtas
cpu 131 (hwid 131) Ready to die...
[DEBUG] CPU 32 noticed CPU 131 enter rtas: tries=0, time=137
[DEBUG] CPU 32 waiting for CPU 132 to enter rtas
cpu 132 (hwid 132) Ready to die...
[DEBUG] CPU 32 noticed CPU 132 enter rtas: tries=0, time=1238
[DEBUG] CPU 32 waiting for CPU 133 to enter rtas
cpu 133 (hwid 133) Ready to die...
[DEBUG] CPU 32 noticed CPU 133 enter rtas: tries=1, time=1259
[DEBUG] CPU 32 waiting for CPU 134 to enter rtas
cpu 134 (hwid 134) Ready to die...
[DEBUG] CPU 32 noticed CPU 134 enter rtas: tries=0, time=1141
[DEBUG] CPU 32 waiting for CPU 135 to enter rtas
cpu 135 (hwid 135) Ready to die...
[DEBUG] CPU 32 noticed CPU 135 enter rtas: tries=0, time=636
cpu 120 (hwid 120) Ready to die...
[DEBUG] CPU 32 waiting for CPU 120 to enter rtas
[DEBUG] CPU 32 noticed CPU 120 enter rtas: tries=0, time=53
[DEBUG] CPU 32 waiting for CPU 121 to enter rtas
cpu 121 (hwid 121) Ready to die...
Bad kernel stack pointer 1fafb400 at 1fafb4b0
Bad kernel stack pointer 1fafb4b0 at 1ec15270
Oops: Bad kernel stack pointer, sig: 6 [#1]
LE SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 121 PID: 0 Comm: swapper/121 Not tainted 
4.20.0-thiago-original-with-debug+ #57
NIP:  1ec15270 LR: 1ec1526c CTR: 1ecd26c4
REGS: c0001e577d30 TRAP: 0300   Not tainted  
(4.20.0-thiago-original-with-debug+)
MSR:  80001000   CR: 4800  XER: 0020
CFAR: 1ecd27a8 DAR: 8108 DSISR: 4000 IRQMASK: 
80009033
GPR00: 1ec1526c 1fafb4b0  
GPR04: 01a40968 00e0 dfe8 0018
GPR08: 1f82ae00 1ed025d4 1f827850 
GPR12: 01b6d998 c0001eb4e080 c003a1bdbf90 1eea3020
GPR16:  c006fdc45100 c004c8b0 c13d5300
GPR20: c006fdc45300 0008 c19d2cf8 c13d6888
GPR24: 0079 c13d688c 0002 c13d688c
GPR28: c19cecf0 03c8  1fafb4b0
NIP [1ec15270] 0x1ec15270
LR [1ec1526c] 0x1ec1526c
Call Trace:
Instruction dump:

Re: [PATCH 4.20 000/145] 4.20.1-stable review

2019-01-08 Thread Greg Kroah-Hartman

On Tue, Jan 08, 2019 at 07:39:44PM +, Dmitrii Tcvetkov wrote:
> Hello, 
> 
> Don't have anything against listed patches, just curious: doesn't upstream 
> commit 
> 574823bfab82d9d8fa47f422778043fbb4b4f50e 
> (Change mincore() to count "mapped" pages rather than "cached" pages) 
> need to be backported as a security fix for CVE-2019-5489[1] to stable 
> kernels?
> 
> [1] https://www.openwall.com/lists/oss-security/2019/01/07/2

Please read the email thread for that patch on the linux kernel mailing
list as to why it is not to be applied at this point in time.

If you are worried about this issue, please weigh in on that thread.

thanks,

greg k-h

Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order

2019-01-08 Thread Arun KS

On 2019-01-08 23:43, Michal Hocko wrote:

On Tue 08-01-19 09:56:09, Alexander Duyck wrote:

On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote:

[...]

>  static int online_pages_range(unsigned long start_pfn, unsigned long 
nr_pages,
>void *arg)
>  {
> -  unsigned long i;
>unsigned long onlined_pages = *(unsigned long *)arg;
> -  struct page *page;
>
>if (PageReserved(pfn_to_page(start_pfn)))
> -  for (i = 0; i < nr_pages; i++) {
> -  page = pfn_to_page(start_pfn + i);
> -  (*online_page_callback)(page);
> -  onlined_pages++;
> -  }
> +  onlined_pages = online_pages_blocks(start_pfn, nr_pages);

Shouldn't this be a "+=" instead of an "="? It seems like you are 
going

to lose your count otherwise.

You are right of course. I should have noticed during the review.
Thanks!

I think we don't need to. The caller function is setting onlined_pages = 
0 before calling online_pages_range().
And there are no other reference to online_pages_range other than from 
online_pages().

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/memory_hotplug.c?h=v5.0-rc1#n845

int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int 
online_type)

{
unsigned long flags;
unsigned long onlined_pages = 0;

Regards,
Arun

[PATCH] thermal: mtk: Allocate enough space for mtk_thermal.

2019-01-08 Thread Pi-Hsun Shih

The mtk_thermal struct contains a 'struct mtk_thermal_bank banks[];',
but the allocation only allocates sizeof(struct mtk_thermal) bytes,
which cause out of bound access with the ->banks[] member. Change it to
a fixed size array instead.

Signed-off-by: Pi-Hsun Shih 
---
 drivers/thermal/mtk_thermal.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/mtk_thermal.c b/drivers/thermal/mtk_thermal.c
index 0691f260f6eabe..ea11edb3fcced6 100644
--- a/drivers/thermal/mtk_thermal.c
+++ b/drivers/thermal/mtk_thermal.c
@@ -159,6 +159,9 @@
 #define MT7622_NUM_SENSORS_PER_ZONE1
 #define MT7622_TS1 0
 
+/* The maximum number of banks */
+#define MAX_NUM_ZONES  8
+
 struct mtk_thermal;
 
 struct thermal_bank_cfg {
@@ -178,7 +181,7 @@ struct mtk_thermal_data {
const int *sensor_mux_values;
const int *msr;
const int *adcpnp;
-   struct thermal_bank_cfg bank_data[];
+   struct thermal_bank_cfg bank_data[MAX_NUM_ZONES];
 };
 
 struct mtk_thermal {
@@ -197,7 +200,7 @@ struct mtk_thermal {
s32 vts[MT8173_NUM_SENSORS];
 
const struct mtk_thermal_data *conf;
-   struct mtk_thermal_bank banks[];
+   struct mtk_thermal_bank banks[MAX_NUM_ZONES];
 };
 
 /* MT8173 thermal sensor data */
-- 
2.20.1.97.g81188d93c3-goog

Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()

2019-01-08 Thread Sergey Senozhatsky

I didn't pay enough attention. Seems to be a temp problem
on parisc-linux.org side:

: There was a temporary problem delivering your message to jejb at 
parisc-linux.org.
:
: The response was:
:
: The recipient server did not accept our requests to connect.
: [mail.parisc-linux.org. 140.ZZZ.X.YY: generic::failed_precondition: connect 
error (0): error]

-ss

[PATCH] rcu: Remove rcu_*_state declaration

2019-01-08 Thread Peng Hao

Because of just only one set of rcu_state, the declaration of
rcu_sched_state/rcu_bh_state/rcu_preempt_state is unnecessary.

Signed-off-by: Peng Hao 
---
 kernel/rcu/tree.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 703e19f..9ea704c 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -401,13 +401,6 @@ struct rcu_state {
 /*
  * RCU implementation internal declarations:
  */
-extern struct rcu_state rcu_sched_state;
-
-extern struct rcu_state rcu_bh_state;
-
-#ifdef CONFIG_PREEMPT_RCU
-extern struct rcu_state rcu_preempt_state;
-#endif /* #ifdef CONFIG_PREEMPT_RCU */
 
 int rcu_dynticks_snap(struct rcu_data *rdp);
 
-- 
1.8.3.1

Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

2019-01-08 Thread Shakeel Butt

On Tue, Jan 8, 2019 at 9:36 PM Shakeel Butt  wrote:
>
> On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel  wrote:
> >
> > There is an imbalance between when slab_pre_alloc_hook calls
> > memcg_kmem_get_cache and when slab_post_alloc_hook calls
> > memcg_kmem_put_cache.
> >
>
> Can you explain how there is an imbalance? If the returned kmem cache
> from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of
> memcg is elevated and the memcg_kmem_put_cache() will correctly
> decrement the refcnt of the memcg.
>
> > This can cause a memcg kmem cache to be destroyed right as
> > an object from that cache is being allocated,

Also please note that the memcg kmem caches are destroyed (if empty)
on memcg offline. The css_tryget_online() within
memcg_kmem_get_cache() will fail.

See kernel/cgroup/cgroup.c
* 2. When the percpu_ref is confirmed to be visible as killed on all CPUs
 *and thus css_tryget_online() is guaranteed to fail, the css can be
 *offlined by invoking offline_css().  After offlining, the base ref is
 *put.  Implemented in css_killed_work_fn().

> > which is probably
> > not good. It could lead to things like a memcg allocating new
> > kmalloc slabs instead of using freed space in old ones, maybe
> > memory leaks, and maybe oopses as a memcg kmalloc slab is getting
> > destroyed on one CPU while another CPU is trying to do an allocation
> > from that same memcg.
> >
> > The obvious fix would be to use the same condition for calling
> > memcg_kmem_put_cache that we also use to decide whether to call
> > memcg_kmem_get_cache.
> >
> > I am not sure how long this bug has been around, since the last
> > changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup
> >  kmem charge functions") - merely moved the bug from one location to
> > another. I am still tagging that changeset, because the fix should
> > automatically apply that far back.
> >
> > Signed-off-by: Rik van Riel 
> > Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions")
> > Cc: kernel-t...@fb.com
> > Cc: linux...@kvack.org
> > Cc: sta...@vger.kernel.org
> > Cc: Alexey Dobriyan 
> > Cc: Christoph Lameter 
> > Cc: Pekka Enberg 
> > Cc: Andrew Morton 
> > Cc: David Rientjes 
> > Cc: Joonsoo Kim 
> > Cc: Johannes Weiner 
> > Cc: Tejun Heo 
> > ---
> >  mm/slab.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/slab.h b/mm/slab.h
> > index 4190c24ef0e9..ab3d95bef8a0 100644
> > --- a/mm/slab.h
> > +++ b/mm/slab.h
> > @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct 
> > kmem_cache *s, gfp_t flags,
> > p[i] = kasan_slab_alloc(s, object, flags);
> > }
> >
> > -   if (memcg_kmem_enabled())
> > +   if (memcg_kmem_enabled() &&
> > +   ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))
>
> I don't think these extra checks are needed. They are safe but not needed.
>
> > memcg_kmem_put_cache(s);
> >  }
> >
> > --
> > 2.17.1
> >
>
> thanks,
> Shakeel

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread David Gibson

On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote:
> > 
> > > In a very cryptic way that requires manual parsing using non-public
> > > docs sadly but yes. From the look of it, it's a completion timeout.
> > > 
> > > Looks to me like we don't get a response to a config space access
> > > during the change of D state. I don't know if it's the write of the D3
> > > state itself or the read back though (it's probably detected on the
> > > read back or a subsequent read, but that doesn't tell me which specific
> > > one failed).
> > 
> > If it is just one card doing it (again, check you have latest
> > firmware) I wonder if it is a sketchy PCI-E electrical link that is
> > causing a long re-training cycle? Can you tell if the PCI-E link is
> > permanently gone or does it eventually return?
> 
> No, it's 100% reproducable on systems with that specific card model,
> not card instance, and maybe different systems/cards as well, I'll let
> David & Alexey comment further on that.

Well, it's 100% reproducable on a particular model of system
(garrison) with a particular model of card.  I've had some suggestions
that it fails with some other systems card card models, but nothing
confirmed - the one other system model I've been able to try, which
also had a newer card model didn't reproduce the problem.

> > Does the card work in Gen 3 when it starts? Is there any indication of
> > PCI-E link errors?
> 
> Nope.
> 
> > Everytime or sometimes?
> > 
> > POWER 8 firmware is good? If the link does eventually come back, is
> > the POWER8's D3 resumption timeout long enough?
> > 
> > If this doesn't lead to an obvious conclusion you'll probably need to
> > connect to IBM's Mellanox support team to get more information from
> > the card side.
> 
> We are IBM :-) So far, it seems to be that the card is doing something
> not quite right, but we don't know what. We might need to engage
> Mellanox themselves.

Possibly.  On the other hand, I've had it reported that this is a
software regression at least with downstream red hat kernels.  I
haven't yet been able to eliminate factors that might be confusing
that, or try to find a working version upstream.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [PATCH] parisc: replace oops_in_progress manipulation with bust_spinlocks()

2019-01-08 Thread Sergey Senozhatsky

On (01/08/19 08:44), Helge Deller wrote:
> Date: Tue, 8 Jan 2019 08:44:19 +0100
> From: Helge Deller 
> To: Sergey Senozhatsky , "James E . J .
>  Bottomley" 
> Cc: linux-par...@vger.kernel.org, linux-kernel@vger.kernel.org, Sergey
>  Senozhatsky 
> Subject: Re: [PATCH] parisc: replace oops_in_progress manipulation with
>  bust_spinlocks()
> Message-ID: 
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
>  Thunderbird/60.4.0
> 
> On 07.01.19 10:56, Sergey Senozhatsky wrote:
> > Use bust_spinlocks() function to set oops_in_progress.
> > 
> > Signed-off-by: Sergey Senozhatsky 
> 
> I've added this patch to the parisc for-next tree.

Thanks.

BTW,
jejb at parisc-linux.org bounces. Shall it be replaced with
HansenPartnership.com email address?

---

diff --git a/MAINTAINERS b/MAINTAINERS
index 7fc7813cd9c2..86b174dbe506 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11511,7 +11511,7 @@ F:  Documentation/blockdev/paride.txt
 F: drivers/block/paride/
 
 PARISC ARCHITECTURE
-M: "James E.J. Bottomley" 
+M: "James E.J. Bottomley" 
 M: Helge Deller 
 L: linux-par...@vger.kernel.org
 W: http://www.parisc-linux.org/

5.0-rc1 KVM inspired "BUG: Bad page state in process" spew

2019-01-08 Thread Mike Galbraith

Greetings,

KVM seems to be busted in master ATM.  All I have to do to have host
start screaming and maybe exploding (if the guest doesn't do so first)
is to try to install a (obese in this case) kernel over nfs mount of
the host in a guest.

Kernel producing the spew below is 3bd6e94, config attached.

homer: # grep BUG: /netconsole.log
[ 1531.909703] BUG: Bad page state in process X  pfn:100491
[ 1531.958141] BUG: Bad page state in process systemd-journal  pfn:100412
[ 1532.662359] BUG: Bad page state in process X  pfn:10043f
[ 1532.664033] BUG: Bad page state in process X  pfn:10044d
[ 1532.686433] BUG: Bad page state in process systemd-journal  pfn:1027b0
[ 1532.687704] BUG: Bad page state in process systemd-journal  pfn:1027ba
[ 1532.689039] BUG: Bad page state in process systemd-journal  pfn:1027de
[ 1532.694762] BUG: Bad page state in process systemd-journal  pfn:102602
[ 1532.696337] BUG: Bad page state in process systemd-journal  pfn:102726
[ 1533.086254] BUG: Bad page state in process CPU 0/KVM  pfn:100a48
[ 1533.086869] BUG: Bad page state in process CPU 0/KVM  pfn:100a49
[ 1533.087413] BUG: Bad page state in process CPU 0/KVM  pfn:100a4a
[ 1533.087947] BUG: Bad page state in process CPU 0/KVM  pfn:100a4b
[ 1533.088477] BUG: Bad page state in process CPU 0/KVM  pfn:100a4c
[ 1533.089044] BUG: Bad page state in process CPU 0/KVM  pfn:100a4d
[ 1533.089586] BUG: Bad page state in process CPU 0/KVM  pfn:100a4e
[ 1533.090121] BUG: Bad page state in process CPU 0/KVM  pfn:100a61
[ 1533.090657] BUG: Bad page state in process CPU 0/KVM  pfn:100a62
[ 1533.091191] BUG: Bad page state in process CPU 0/KVM  pfn:100a63
[ 1533.091739] BUG: Bad page state in process CPU 0/KVM  pfn:100a64
[ 1533.092276] BUG: Bad page state in process CPU 0/KVM  pfn:100a65
[ 1533.092814] BUG: Bad page state in process CPU 0/KVM  pfn:100a66
[ 1533.093348] BUG: Bad page state in process CPU 0/KVM  pfn:100a67
[ 1533.093940] BUG: Bad page state in process CPU 0/KVM  pfn:1017e8
[ 1533.094512] BUG: Bad page state in process CPU 0/KVM  pfn:1017e9
[ 1533.095049] BUG: Bad page state in process CPU 0/KVM  pfn:1017ea
[ 1533.095585] BUG: Bad page state in process CPU 0/KVM  pfn:1017eb
[ 1533.096120] BUG: Bad page state in process CPU 0/KVM  pfn:1017ee
[ 1533.096679] BUG: Bad page state in process CPU 0/KVM  pfn:101a20
[ 1533.097221] BUG: Bad page state in process CPU 0/KVM  pfn:101a22
[ 1533.097757] BUG: Bad page state in process systemd-journal  pfn:101ce4
[ 1533.098162] BUG: Bad page state in process systemd-journal  pfn:101ce5
[ 1533.098535] BUG: Bad page state in process CPU 0/KVM  pfn:101ce6
[ 1533.099075] BUG: Bad page state in process CPU 0/KVM  pfn:101ce7
[ 1533.099667] BUG: Bad page state in process CPU 0/KVM  pfn:1024cc
[ 1533.101840] BUG: Bad page state in process CPU 0/KVM  pfn:1024e9
[ 1533.102379] BUG: Bad page state in process CPU 0/KVM  pfn:1024ea
[ 1533.102935] BUG: Bad page state in process CPU 0/KVM  pfn:1024eb
[ 1533.103516] BUG: Bad page state in process CPU 0/KVM  pfn:102bc8
[ 1533.105771] BUG: Bad page state in process CPU 0/KVM  pfn:102be0
[ 1533.106308] BUG: Bad page state in process CPU 0/KVM  pfn:102be1
[ 1533.106875] BUG: Bad page state in process CPU 0/KVM  pfn:102be2
[ 1533.107412] BUG: Bad page state in process CPU 0/KVM  pfn:102be3
[ 1533.66] BUG: Bad page state in process CPU 0/KVM  pfn:102bee

config.xz
Description: application/xz

Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

2019-01-08 Thread Shakeel Butt

On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel  wrote:
>
> There is an imbalance between when slab_pre_alloc_hook calls
> memcg_kmem_get_cache and when slab_post_alloc_hook calls
> memcg_kmem_put_cache.
>

Can you explain how there is an imbalance? If the returned kmem cache
from memcg_kmem_get_cache() is the memcg kmem cache then the refcnt of
memcg is elevated and the memcg_kmem_put_cache() will correctly
decrement the refcnt of the memcg.

> This can cause a memcg kmem cache to be destroyed right as
> an object from that cache is being allocated, which is probably
> not good. It could lead to things like a memcg allocating new
> kmalloc slabs instead of using freed space in old ones, maybe
> memory leaks, and maybe oopses as a memcg kmalloc slab is getting
> destroyed on one CPU while another CPU is trying to do an allocation
> from that same memcg.
>
> The obvious fix would be to use the same condition for calling
> memcg_kmem_put_cache that we also use to decide whether to call
> memcg_kmem_get_cache.
>
> I am not sure how long this bug has been around, since the last
> changeset to touch that code - 452647784b2f ("mm: memcontrol: cleanup
>  kmem charge functions") - merely moved the bug from one location to
> another. I am still tagging that changeset, because the fix should
> automatically apply that far back.
>
> Signed-off-by: Rik van Riel 
> Fixes: 452647784b2f ("mm: memcontrol: cleanup kmem charge functions")
> Cc: kernel-t...@fb.com
> Cc: linux...@kvack.org
> Cc: sta...@vger.kernel.org
> Cc: Alexey Dobriyan 
> Cc: Christoph Lameter 
> Cc: Pekka Enberg 
> Cc: Andrew Morton 
> Cc: David Rientjes 
> Cc: Joonsoo Kim 
> Cc: Johannes Weiner 
> Cc: Tejun Heo 
> ---
>  mm/slab.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 4190c24ef0e9..ab3d95bef8a0 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -444,7 +444,8 @@ static inline void slab_post_alloc_hook(struct kmem_cache 
> *s, gfp_t flags,
> p[i] = kasan_slab_alloc(s, object, flags);
> }
>
> -   if (memcg_kmem_enabled())
> +   if (memcg_kmem_enabled() &&
> +   ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))

I don't think these extra checks are needed. They are safe but not needed.

> memcg_kmem_put_cache(s);
>  }
>
> --
> 2.17.1
>

thanks,
Shakeel

Re: seqcount usage in xt_replace_table()

2019-01-08 Thread Dmitry Vyukov

On Wed, Jan 9, 2019 at 1:36 AM Anatol Pomozov  wrote:
>
> Hello
>
> On Tue, Jan 8, 2019 at 4:02 PM Andrea Parri
>  wrote:
> >
> > Hi Anatol,
> >
> > On Tue, Jan 08, 2019 at 11:33:39AM -0800, Anatol Pomozov wrote:
> > > Hello folks,
> > >
> > > A bit of context what I am doing. I am trying to port KTSAN (Kernel
> > > Thread Sanitizer) tool to v4.20. That tool tracks shared data usage
> > > and makes sure it is accessed in a thread-safe manner.
> >
> > Interesting!  FYI, some LKMM's maintainers (Paul included) had and
> > continued to have some "fun" discussing topics related to "thread-
> > safe memory accesses": I'm sure that they'll be very interested in
> > such work of yours and eager to discuss your results.
>
> Thread Sanitizer is a great tool to find thread-safety issues with
> user-space code. The tool been developed by a team of smart people
> from Google [1].
>
> KTSAN is an attempt to bring the same ideas to Linux kernel [2]. A
> bunch of work been done there but the project is still at
> proof-of-concept point.
>
> I am not a part of Google's dynamic tools team. But I've decided to
> pick something to do during the New Year holidays so started porting
> KTSAN from v4.2 to v4.20. The work is "almost completed" but I need to
> fix a few crashes [3].
>
> [1] https://github.com/google/sanitizers
> [2] https://github.com/google/ktsan/wiki
> [3] https://github.com/anatol/linux/tree/ktsan_4.20


For completeness, at the time we also had to add
read_seqcount_cancel() function to dynamically mark all seqcount read
regions. It can be used here too, start read region and immediately
end it. Less clear than raw_read_seqcount_nocritical(), but also less
API surface.

/**
 * read_seqcount_cancel - cancel a seq-read critical section
 * @s: pointer to seqcount_t
 *
 * This is a no-op except for ktsan, it needs to know scopes of seq-read
 * critical sections. The sections are denoted either by begin->retry or
 * by begin->cancel.
 */
static inline void read_seqcount_cancel(const seqcount_t *s)
{
ktsan_seqcount_end(s);
}

https://github.com/google/ktsan/blob/tsan/include/linux/seqlock.h#L235-L246

Re: [PATCH RFC 3/5] dt-bindings: Add PDC timer bindings for Qualcomm SoCs

2019-01-08 Thread Raju P L S S S N





On 1/7/2019 9:47 PM, Raju P L S S S N wrote:



On 1/4/2019 2:49 AM, Stephen Boyd wrote:

Quoting Raju P L S S S N (2019-01-03 04:22:58)


On 12/29/2018 3:08 AM, Stephen Boyd wrote:

Quoting Raju P L S S S N (2018-12-26 01:44:43)


There are two RSC devices in SoC one for application processor 
subsystem

& other display subsystem. Both RSC contain registers for PDC timers
(one for each subsystem).


When is the timer programmed on the display subsystem's RSC? It's hard
to give advice without all the information.


For display subsystem RSC, hardware sleep solver takes care of timer
programming for wakeup when the subsystem goes to Power collapse.


So the display subsystem doesn't need to program their PDC in their RSC
from software?

Yes.





I would think that it would make sense for the application processor's
RSC timer to be programmed from the broadcast timer mechanism in the
kernel so that timers during idle work and suspend turns off the timer
appropriately with a shutdown hook. I guess the PDC can't tell you the
time though? It looks like a shadow (and limited) version of the ARM
architected MMIO timer that we already program for the broadcast timer
mechanism. Is that because even the MMIO timer can't wakeup the system
in deep idle?  Assuming that's true, it means the ARM MMIO timer can't
always be used as the system wide broadcast mechanism because we 
need to

augment it with the PDC timer to get the actual wakeup.



Yes. this is correct.


Maybe we should be adding hooks into the broadcast timer mechanism to
program this wakeup event hardware in addition to the ARM MMIO 
timer. Or

we should stop using the ARM MMIO timer on these systems and read the
system register based physical time in the RSC timer driver and 
register

this 64-bit PDC register as the broadcast timer. So the time reading
would be through sysreg and the wakeup programming would be done by
writing the PDC timer. The assumption would be that we have access to
the physical time registers (which sounds like the assumption we 
have to

make).


There are no physical timer registers available in RSC for this purpose.



Do we get an interrupt somewhere from the RSC hardware when the timer
fires? Or does that just cause a system wakeup event without any 
pending

irq and then another irq (like the ARM architected timer) just happens
to be pending around the same time? If we get an interrupt somehow then
I would prefer to drop the ARM MMIO timer and do this hybrid broadcast
timer approach.


There is no interrupt for PDC timeout. It just causes system wakeup
without a pending irq. ARM MMIO is necessary for irq.


Does that system wakeup cause the CPUs to be enabled? So the sysreg
based timer in the CPU would start working? Or does it only make the
system come out of a deep enough idle state to make an always on domain
work that only contains the MMIO based ARM architected timer?


It only makes the system come out of deep sleep state for MMIO timer to 
function.




I'd hope that each RSC's PDC timer wakes up the owner of the RSC so that
we can use the sysreg based timers and ignore the MMIO based timers
here. This isn't a very important distinction to make though, so if you
have to use the MMIO timers then it just means some extra DT things need
to be done to relate the MMIO timers with the RSC that has the timer
that needs to be programmed too.




Either way we would need a way to either hook the ARM architected timer
driver in the kernel, or reimplement the bit of code needed to implement
the clockevent based on the ARM architected timer that programs the ARM
timer and the PDC timer together.



Could you please provide some more details on the implementation?


Hi Stephen,

Regardless of implementation options about application processor 
subsytem PDC timer, I think there is a need to differentiate the fact 
that for application processor subsystem PDC timer programming is done 
by SW but not for display processor subsystem as HW sleep solver takes 
care of PDC timer during sleep entry/exit. How about having a dt 
property like qcom,pdc-timer-mode = "solver"/"sw" ? The current patch 
introduced client based model using regmap to achieve this 
differentiation between these two subsystems. By using the dt property, 
once rpmh-src driver instance for application subsystem can have PDC 
timer programing implemented. Let me know if there is another way.


For implementation of PDC timer, I see the following based on above 
discussion -
1. Take the existing cpu_pm_notify approach and move the current series 
approach of programing PDC timer registers to RSC driver. This wouldn't 
involve any changes in clock_event_framework/broadcast framework.


2. Add api hooks (like reading the next wake up programmed) to ARM 
architecture timer driver so that the value is copied to PDC timer using 
the api with in RSC driver based on cpu_pm_notifications.


3. Changes in clockevent to program both ARM mem timer & PDC timer 
together. Could

[RFC PATCH v2 4/4] Documentation/device-mapper: add optional parameter reinit

2019-01-08 Thread Huaisheng Ye

From: Huaisheng Ye 

Add intro and usage guide for reinit.

Signed-off-by: Huaisheng Ye 
---
 Documentation/device-mapper/writecache.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/device-mapper/writecache.txt 
b/Documentation/device-mapper/writecache.txt
index 01532b3..255c68c 100644
--- a/Documentation/device-mapper/writecache.txt
+++ b/Documentation/device-mapper/writecache.txt
@@ -45,6 +45,10 @@ Constructor parameters:
afterwards
- some underlying devices perform better with fua, some
  with nofua. The user should test it
+   reinit  (by default off)
+   applicable only to persistent memory - use the REINIT flag
+   when the surper block has messy data, that would cause fn ctr
+   failed to work with invalid magic or version in the superblock
 
 Status:
 1. error indicator - 0 if there was no error, otherwise error number
-- 
1.8.3.1

[RFC PATCH v2 1/4] dm-writecache: remove unused size to writecache_flush_region

2019-01-08 Thread Huaisheng Ye

From: Huaisheng Ye 

writecache_flush_region doesn't use size to calculate flush region.
That uses _set_bits to mark the region in dirty_bitmap directly.

Signed-off-by: Huaisheng Ye 
---
 drivers/md/dm-writecache.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 2d50eec..2d8e0c0 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -418,7 +418,7 @@ static void writecache_flush_all_metadata(struct 
dm_writecache *wc)
memset(wc->dirty_bitmap, -1, wc->dirty_bitmap_size);
 }
 
-static void writecache_flush_region(struct dm_writecache *wc, void *ptr, 
size_t size)
+static void writecache_flush_region(struct dm_writecache *wc, void *ptr)
 {
if (!WC_MODE_PMEM(wc))
__set_bit(((char *)ptr - (char *)wc->memory_map) / 
BITMAP_GRANULARITY,
@@ -657,7 +657,7 @@ static void writecache_free_entry(struct dm_writecache *wc, 
struct wc_entry *e)
writecache_unlink(wc, e);
writecache_add_to_freelist(wc, e);
clear_seq_count(wc, e);
-   writecache_flush_region(wc, memory_entry(wc, e), sizeof(struct 
wc_memory_entry));
+   writecache_flush_region(wc, memory_entry(wc, e));
if (unlikely(waitqueue_active(>freelist_wait)))
wake_up(>freelist_wait);
 }
@@ -687,9 +687,9 @@ static void writecache_poison_lists(struct dm_writecache 
*wc)
 
 static void writecache_flush_entry(struct dm_writecache *wc, struct wc_entry 
*e)
 {
-   writecache_flush_region(wc, memory_entry(wc, e), sizeof(struct 
wc_memory_entry));
+   writecache_flush_region(wc, memory_entry(wc, e));
if (WC_MODE_PMEM(wc))
-   writecache_flush_region(wc, memory_data(wc, e), wc->block_size);
+   writecache_flush_region(wc, memory_data(wc, e));
 }
 
 static bool writecache_entry_is_committed(struct dm_writecache *wc, struct 
wc_entry *e)
@@ -733,7 +733,7 @@ static void writecache_flush(struct dm_writecache *wc)
 
wc->seq_count++;
pmem_assign(sb(wc)->seq_count, cpu_to_le64(wc->seq_count));
-   writecache_flush_region(wc, (wc)->seq_count, sizeof 
sb(wc)->seq_count);
+   writecache_flush_region(wc, (wc)->seq_count);
writecache_commit_flushed(wc);
 
wc->overwrote_committed = false;
@@ -1757,7 +1757,7 @@ static int init_memory(struct dm_writecache *wc)
writecache_flush_all_metadata(wc);
writecache_commit_flushed(wc);
pmem_assign(sb(wc)->magic, cpu_to_le32(MEMORY_SUPERBLOCK_MAGIC));
-   writecache_flush_region(wc, (wc)->magic, sizeof sb(wc)->magic);
+   writecache_flush_region(wc, (wc)->magic);
writecache_commit_flushed(wc);
 
return 0;
-- 
1.8.3.1

[RFC PATCH v2 2/4] dm-writecache: get rid of memory_data flush to writecache_flush_entry

2019-01-08 Thread Huaisheng Ye

From: Huaisheng Ye 

writecache_flush_region only works when SSD mode.
If wc->pmem_mode sets, writecache_flush_region doesn't need to be called in
writecache_flush_entry.

Signed-off-by: Huaisheng Ye 
---
 drivers/md/dm-writecache.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 2d8e0c0..c69317c 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -688,8 +688,6 @@ static void writecache_poison_lists(struct dm_writecache 
*wc)
 static void writecache_flush_entry(struct dm_writecache *wc, struct wc_entry 
*e)
 {
writecache_flush_region(wc, memory_entry(wc, e));
-   if (WC_MODE_PMEM(wc))
-   writecache_flush_region(wc, memory_data(wc, e));
 }
 
 static bool writecache_entry_is_committed(struct dm_writecache *wc, struct 
wc_entry *e)
-- 
1.8.3.1

[RFC PATCH v2 3/4] dm-writecache: expand pmem_reinit for struct dm_writecache

2019-01-08 Thread Huaisheng Ye

From: Huaisheng Ye 

When use persistent memory as cache data device, sometimes
the super block of pmem has messy data stored in it. That would
have risk to lead fn ctr failed to work due to invalid magic or
version.

Here we expand pmem_reinit to optional parameters in order to solve
this issue. When user gets pmem device, which has unrelated data in it,
as cache device, he should use paramenter 'reinit' to avoid s.magic
and s.version don't equal to NULL or correct
MEMORY_SUPERBLOCK_MAGIC/VERSION.

Signed-off-by: Huaisheng Ye 
---
 drivers/md/dm-writecache.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index c69317c..2c1e825 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -149,6 +149,7 @@ struct dm_writecache {
 
bool pmem_mode:1;
bool writeback_fua:1;
+   bool pmem_reinit:1;
 
bool overwrote_committed:1;
bool memory_vmapped:1;
@@ -2026,6 +2027,10 @@ static int writecache_ctr(struct dm_target *ti, unsigned 
argc, char **argv)
wc->writeback_fua = false;
wc->writeback_fua_set = true;
} else goto invalid_optional;
+   } else if (!strcasecmp(string, "reinit")) {
+   if (WC_MODE_PMEM(wc))
+   wc->pmem_reinit = true;
+   else goto invalid_optional;
} else {
 invalid_optional:
r = -EINVAL;
@@ -2127,7 +2132,7 @@ static int writecache_ctr(struct dm_target *ti, unsigned 
argc, char **argv)
ti->error = "Hardware memory error when reading superblock";
goto bad;
}
-   if (!le32_to_cpu(s.magic) && !le32_to_cpu(s.version)) {
+   if (wc->pmem_reinit || (!le32_to_cpu(s.magic) && 
!le32_to_cpu(s.version))) {
r = init_memory(wc);
if (r) {
ti->error = "Unable to initialize device";
-- 
1.8.3.1

[RFC PATCH v2 0/4] add parameter for pmem cache device init

2019-01-08 Thread Huaisheng Ye

From: Huaisheng Ye 

This patch set could be used for dm-writecache when use persistent
memory as cache data device.

Patch 1 and 2 go towards removing unused parameter and codes which
actually doesn't really work.

Patch 3 and 4 are targeted at solving problem fn ctr failed to work
due to invalid magic or version, which is caused by the super block
of pmem has messy data stored.

Changes Since v1:
- add optional parameter reinit to avoid invalid magic or
  version.
[1]: https://lkml.org/lkml/2019/1/3/43

Huaisheng Ye (4):
  dm-writecache: remove unused size to writecache_flush_region
  dm-writecache: get rid of memory_data flush to writecache_flush_entry
  dm-writecache: expand pmem_reinit for struct dm_writecache
  Documentation/device-mapper: add optional parameter reinit

 Documentation/device-mapper/writecache.txt |  4 
 drivers/md/dm-writecache.c | 19 +++
 2 files changed, 15 insertions(+), 8 deletions(-)

-- 
1.8.3.1

RE: x86/sgx: uapi change proposal

2019-01-08 Thread Huang, Kai

> On Tue, Jan 08, 2019 at 11:27:11AM -0800, Huang, Kai wrote:
> > > >
> > > > Can one of you explain why SGX_ENCLAVE_CREATE is better than just
> > > > opening a new instance of /dev/sgx for each encalve?
> > >
> > > Directly associating /dev/sgx with an enclave means /dev/sgx can't
> > > be used to provide ioctl()'s for other SGX-related needs, e.g. to
> > > mmap() raw EPC and expose it a VM.  Proposed layout in the link
> > > below.  I'll also respond to Jarkko's question about exposing EPC
> > > through /dev/sgx instead of having KVM allocate it on behalf of the VM.
> > >
> > > https://lkml.kernel.org/r/20181218185349.gc30...@linux.intel.com
> >
> > Hi Sean,
> >
> > Sorry for replying to old email. But IMHO it is not a must that Qemu
> > needs to open some /dev/sgx and allocate/mmap EPC for guest's virtual
> > EPC slot, instead, KVM could create private slot, which is not visible
> > to Qemu, for virtual EPC, and KVM could call core-SGX EPC allocation
> > API directly.
> 
> That's possible, but it has several downsides.
> 
>   - Duplicates a lot of code in KVM for managing memory regions.

I don't see why there will be duplicated code. you can simply call 
__x86_set_memory_region to create private slot. It is KVM x86 equivalent to 
KVM_SET_USER_MEMORY_REGION from userspace. The only difference is Qemu is not 
aware of the private slot.

>   - Artificially restricts userspace to a single EPC region, unless
> even more code is duplicated to handle multiple private regions.

You can have multiple private slots, by calling __x86_set_memory_region for 
each EPC section. KVM receives EPC section/sections info from Qemu, via CPUID, 
or dedicated IOCTL (is this you are going to add?), and simply creates private 
EPC slot/slots.

>   - Requires additional ioctls() or capabilities to probe EPC support

No. EPC info is from Qemu at the beginning (size is given by parameter, base is 
calculated by Qemu), and actually it is Qemu notifies KVM EPC info, so I don't 
think we require additional ioctls or capabilities here.

>   - Does not fit with Qemu/KVM's memory model, e.g. all other types of
> memory are exposed to a guest through
> KVM_SET_USER_MEMORY_REGION.

EPC is different. I am not sure whether EPC needs to fit such model. There are 
already examples in KVM which uses private slot w/o using 
KVM_SET_USER_MEMORY_REGION, for example, APIC access page.

>   - Prevents userspace from debugging a guest's enclave.  I'm not saying
> this is a likely scenario, but I also don't think we should preclude
> it without good reason.

I am not sure how important it is, so don't know whether this can be a 
justification. To me we don't need to consider this. Qemu normally doesn't 
access guest memory unless it has to (ie, for device model). 

>   - KVM is now responsible for managing the lifecycle of EPC, e.g. what
> happens if an EPC cgroup limit is lowered on a running VM and
> KVM can't gracefully reclaim EPC?  The userspace hypervisor should
> ultimately decide how to handle such an event.

Even using KVM_SET_USER_MEMORY_REGION, KVM is also responsible for managing the 
lifecycle of EPC, no? And "managing the lifecycle of EPC" you mean doesn't need 
to be "managing EPC itself" I suppose, since EPC should always be managed by 
core SGX code.

I don't see the difference between private slot and KVM_SET_USER_MEMORY_REGION, 
in terms of how does KVM reclaim EPC, or how does KVM do when it fails to 
reclaim EPC. 

>   - SGX logic is split between SGX and KVM, e.g. VA page management for
> oversubscription will likely be common to SGX and KVM.  From a long
> term maintenance perspective, this means that changes to the EPC
> management could potentially need to be Acked by KVM, and vice versa.

I think most of the code should be in core SGX code under x86. KVM should only 
have the code that is specifically related to virtualization, ie, ENCLV.

VA page allocation should be under code SGX code. KVM might need to call 
function such as alloc_va_page, etc, but this is not a problem. There are many 
other cases now.

And this is not related to private slot vs KVM_SET_USER_MEMORY_REGION.

> 
> > I am not sure what's the good of allowing userspace to alloc/mmap a
> > raw EPC region? Userspace is not allowed to touch EPC anyway, expect
> > enclave code.
> >
> > To me KVM creates private EPC slot is cleaner than exposing
> > /dev/sgx/epc and allowing userspace to map some raw EPC region.
> 
> Cleaner in the sense that it's faster to get basic support up and running 
> since
> there are fewer touchpoints, but there are long term ramifications to
> cramming EPC management in KVM.
> 
> And at this point I'm not stating any absolutes, e.g. how EPC will be handled
> by KVM.  What I'm pushing for is to not eliminate the possibility of having
> the SGX subsystem own all EPC management, e.g. don't tie /dev/sgx to a
> single enclave.

I suppose "SGX subsystem" you mean here is core SGX code under x86.

Re: [PATCH v3] loop: drop caches if offset or block_size are changed

2019-01-08 Thread Jaegeuk Kim

Hi Jens,

May I ask whether this patch is acceptable?

Thanks,

On 12/18, Jaegeuk Kim wrote:
> If we don't drop caches used in old offset or block_size, we can get old data
> from new offset/block_size, which gives unexpected data to user.
> 
> For example, Martijn found a loopback bug in the below scenario.
> 1) LOOP_SET_FD loads first two pages on loop file
> 2) LOOP_SET_STATUS64 changes the offset on the loop file
> 3) mount is failed due to the cached pages having wrong superblock
> 
> Cc: Jens Axboe 
> Cc: linux-bl...@vger.kernel.org
> Reported-by: Martijn Coenen 
> Signed-off-by: Jaegeuk Kim 
> ---
> v2 -> v3:
>  - avoid to submit IOs on frozen mq
> 
> Jens, how about this?
> 
> Thanks,
> 
>  drivers/block/loop.c | 29 +++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index cb0cc8685076..6b03121d62aa 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1126,6 +1126,12 @@ loop_set_status(struct loop_device *lo, const struct 
> loop_info64 *info)
>   if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
>   return -EINVAL;
>  
> + if (lo->lo_offset != info->lo_offset ||
> + lo->lo_sizelimit != info->lo_sizelimit) {
> + sync_blockdev(lo->lo_device);
> + kill_bdev(lo->lo_device);
> + }
> +
>   /* I/O need to be drained during transfer transition */
>   blk_mq_freeze_queue(lo->lo_queue);
>  
> @@ -1154,6 +1160,11 @@ loop_set_status(struct loop_device *lo, const struct 
> loop_info64 *info)
>  
>   if (lo->lo_offset != info->lo_offset ||
>   lo->lo_sizelimit != info->lo_sizelimit) {
> + /* kill_bdev should have truncated all the pages */
> + if (lo->lo_device->bd_inode->i_mapping->nrpages) {
> + err = -EAGAIN;
> + goto exit;
> + }
>   if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
>   err = -EFBIG;
>   goto exit;
> @@ -1375,22 +1386,36 @@ static int loop_set_dio(struct loop_device *lo, 
> unsigned long arg)
>  
>  static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
>  {
> + int err = 0;
> +
>   if (lo->lo_state != Lo_bound)
>   return -ENXIO;
>  
>   if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg))
>   return -EINVAL;
>  
> + if (lo->lo_queue->limits.logical_block_size != arg) {
> + sync_blockdev(lo->lo_device);
> + kill_bdev(lo->lo_device);
> + }
> +
>   blk_mq_freeze_queue(lo->lo_queue);
>  
> + /* kill_bdev should have truncated all the pages */
> + if (lo->lo_queue->limits.logical_block_size != arg &&
> + lo->lo_device->bd_inode->i_mapping->nrpages) {
> + err = -EAGAIN;
> + goto out;
> + }
> +
>   blk_queue_logical_block_size(lo->lo_queue, arg);
>   blk_queue_physical_block_size(lo->lo_queue, arg);
>   blk_queue_io_min(lo->lo_queue, arg);
>   loop_update_dio(lo);
> -
> +out:
>   blk_mq_unfreeze_queue(lo->lo_queue);
>  
> - return 0;
> + return err;
>  }
>  
>  static int lo_ioctl(struct block_device *bdev, fmode_t mode,
> -- 
> 2.19.0.605.g01d371f741-goog

Re: __get_user slower than get_user (was Re: [RFC PATCH V3 0/5] Hi:)

2019-01-08 Thread Linus Torvalds

On Tue, Jan 8, 2019 at 8:31 PM Michael S. Tsirkin  wrote:
>
> Linus, given that you just changed all users of access_ok anyway, do
> you still think that the access_ok() conversion to return a speculation
> sanitized pointer or NULL is too big a conversion?

I didn't actually change a single access_ok().

I changed the (very few) users of "user_access_begin()" to do an
access_ok() in them. There were 8 of them total.

It turns out that two of those cases (the strn*_user() ones) found
bugs in the implementation of access_ok() of two architectures, and
then looking at the others found that six more architectures also had
problems, but those weren't actually because of any access_ok()
changes, they were pre-existing issues. So we definitely had
unfortunate bugs in access_ok(), but they were mostly the benign kind
(ir the "use arguments twice - a real potential bug, but not one that
actually likely makes any difference to existing users)

Changing all 600+ users of access_ok() would be painful.

That said, one thing I *would* like to do is to just get rid of
__get_user() and __put_user() entirely. Or rather, just make them do
exactly the same thing that the normal "get_user()"/"put_user()"
functions do.

And then, _within_ the case of get_user()/put_user(), doing the
access_ok() as a data dependency rather than a lfence should be easy
enough.

 Linus

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1482 matches

Mail list logo