date:20171219

Re: [PATCH v2 4/4] pinctrl: mediatek: update MAINTAINERS entry with MediaTek pinctrl driver

2017-12-19 Thread Linus Walleij

On Tue, Dec 12, 2017 at 7:24 AM,   wrote:

> From: Sean Wang 
>
> I work for MediaTek maintaining the existing MediaTek SoCs that target
> home gateways, such as MT7622 and MT7623 (which reuses MT2701-related
> files), and will keep adding support for similar SoCs in the future.
>
> Signed-off-by: Sean Wang 
> Reviewed-by: Biao Huang 

Patch applied.

Yours,
Linus Walleij

Re: [PATCH v2 3/4] pinctrl: mediatek: add pinctrl driver for MT7622 SoC

2017-12-19 Thread Linus Walleij

On Tue, Dec 12, 2017 at 7:24 AM,   wrote:

> From: Sean Wang 
>
> Add support for pinctrl on MT7622 SoC. The IO core found on the SoC has
> the registers for pinctrl, pinconf and gpio mixed up in the same register
> range. However, the IO core for the MT7622 SoC is completely distinct from
> that of any previous MediaTek SoC which already had support, in aspects
> such as the hardware internals, the register address map and the detailed
> register definition for each pin.
>
> Therefore, the driver is newly implemented by reusing the generic
> methods provided by the core layer with GENERIC_PINCONF,
> GENERIC_PINCTRL_GROUPS, and GENERIC_PINMUX_FUNCTIONS for the sake of code
> simplicity and to get rid of superfluous code. The driver determines the
> function of pins by groups, which helps developers be less confused
> about which combinations of pins are effective on the SoC and even
> reduces mistakes during the integration of the relevant boards.
>
> As the gpio_chip handling is also only a few lines, the driver also
> implements the gpio functionality directly through GPIOLIB.
>
> Signed-off-by: Sean Wang 
> Reviewed-by: Biao Huang 

Patch applied. Very nice work!

As I've seen visiting Asia how popular MTK chips are for all kinds
of devices it's really nice to have proper upstream support directly from
Mediatek on these chips. You guys are awesome.

Some suggestions for improvements:

> +static void mtk_w32(struct mtk_pinctrl *pctl, u32 reg, u32 val)
> +{
> +   writel_relaxed(val, pctl->base + reg);
> +}
> +
> +static u32 mtk_r32(struct mtk_pinctrl *pctl, u32 reg)
> +{
> +   return readl_relaxed(pctl->base + reg);
> +}
> +
> +static void mtk_rmw(struct mtk_pinctrl *pctl, u32 reg, u32 mask, u32 set)
> +{
> +   u32 val;
> +
> +   val = mtk_r32(pctl, reg);
> +   val &= ~mask;
> +   val |= set;
> +   mtk_w32(pctl, reg, val);
> +}

Have you considered replacing this with regmap-mmio? It does pretty much
the same thing. It could be an improvement, reducing code a bit and making
it more generic. The error codes from e.g. regmap_update_bits() can be
safely ignored on MMIO maps.
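
A minimal userspace sketch of the read-modify-write pattern under discussion, not kernel code. The `fake_regs` array and both helper names are hypothetical stand-ins for an ioremap()ed window. The second helper models the update-bits behaviour as I understand it, where only bits inside the mask can change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an MMIO window; a real driver would use an
 * ioremap()ed base with readl()/writel(), or a regmap. */
#define FAKE_REG_WORDS 16

struct fake_regs {
	uint32_t word[FAKE_REG_WORDS];
};

/* Same shape as the quoted mtk_rmw(): clear the mask bits, then OR in 'set'.
 * Bits of 'set' outside 'mask' are written as well. */
static void fake_rmw(struct fake_regs *regs, uint32_t reg, uint32_t mask,
		     uint32_t set)
{
	uint32_t val;

	assert(reg % 4 == 0 && reg / 4 < FAKE_REG_WORDS);
	val = regs->word[reg / 4];
	val &= ~mask;
	val |= set;
	regs->word[reg / 4] = val;
}

/* Model of update-bits semantics (assumption): bits of 'val' outside
 * 'mask' are dropped, so only masked bits can change. */
static void fake_update_bits(struct fake_regs *regs, uint32_t reg,
			     uint32_t mask, uint32_t val)
{
	uint32_t tmp;

	assert(reg % 4 == 0 && reg / 4 < FAKE_REG_WORDS);
	tmp = regs->word[reg / 4];
	tmp = (tmp & ~mask) | (val & mask);
	regs->word[reg / 4] = tmp;
}
```

The two helpers agree whenever the caller passes a value that already lies inside the mask, which is the usual case for pin-field updates.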

> +static int mtk_build_gpiochip(struct mtk_pinctrl *hw, struct device_node *np)
> +{
> +   struct gpio_chip *chip = &hw->chip;
> +   int ret;
> +
> +   chip->label = PINCTRL_PINCTRL_DEV;
> +   chip->parent = hw->dev;
> +   chip->request = gpiochip_generic_request;
> +   chip->free = gpiochip_generic_free;
> +   chip->direction_input = mtk_gpio_direction_input;
> +   chip->direction_output = mtk_gpio_direction_output;

Please submit a patch implementing chip->get_direction(), it
is really helpful, especially for debugging.
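
A minimal userspace sketch of what a get_direction() callback computes, assuming a hypothetical layout with one direction bit per pin where a set bit means output. The real MT7622 register map is not shown in this thread, so `fake_gpio` and the bit polarity are illustrative only. The return values follow the gpio_chip convention of 0 for output and 1 for input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 32 pins per 32-bit word, bit set = output. */
#define FAKE_PINS 64

struct fake_gpio {
	uint32_t dir[FAKE_PINS / 32];
};

/* Mirrors the gpio_chip get_direction() convention: 0 = out, 1 = in. */
static int fake_gpio_get_direction(const struct fake_gpio *hw,
				   unsigned int offset)
{
	uint32_t word;

	assert(offset < FAKE_PINS);
	word = hw->dir[offset / 32];
	return ((word >> (offset % 32)) & 1u) ? 0 : 1;
}
```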

If your pin controller later adds support for things that can be
used from the GPIO side, like open drain or debounce, then
please consider also implementing chip->set_config() in the
gpio_chip at that point. That way your GPIO consumers
can use e.g. open drain through pin control as back-end.
See drivers/pinctrl/intel/pinctrl-intel.c for an example.

Yours,
Linus Walleij

Re: Maintainer docs for patch merging

2017-12-19 Thread Greg Kroah-Hartman

On Wed, Dec 20, 2017 at 11:25:41AM +1100, Tobin C. Harding wrote:
> Hi,
> 
> Recently we started a maintainer book (merged into Jonathan's docs-next
> branch).
> 
> Would any current maintainers please be willing to explain how they go
> about generating the automated emails one often receives when a patch
> [set] is applied.
> 
> This may also be related to tree/branch management for maintainers;
> kernel.org shows some people like to use multiple trees and some use
> branches?
> 
> If deemed relevant we could add a section to the new book (and I'd also
> like to know how to do it for my own tree please so I can copy ;)
> 
> I have CC'd Greg and Andrew because they seem to have a system in place
> for this.

I "stole" Andrew's scripts for this a long time ago.  I guess I can
write up something "real" for the documentation so that others can see
the shell mess that drives those emails :)

> No rush on this, I know Christmas is soon.

Yeah, this will have to wait until mid January at the earliest...

thanks,

greg k-h

Re: general protection fault in native_write_cr4

2017-12-19 Thread Wanpeng Li

2017-12-20 15:49 GMT+08:00 syzbot:
> Hello,
>
> syzkaller hit the following crash on
> f6f3732162b5ae3c771b9285a5a32d72b8586920
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>

I will have another look. You are still running it in a KVM guest, right?

Regards,
Wanpeng Li

> kvm: KVM_SET_TSS_ADDR need to be called before entering vcpu
> kasan: CONFIG_KASAN_INLINE enabled
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault:  [#1] SMP KASAN
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 3142 Comm: syzkaller429302 Not tainted 4.15.0-rc3+ #224
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:native_write_cr4+0x4/0x10 arch/x86/include/asm/special_insns.h:76
> RSP: 0018:8801ca6f75a0 EFLAGS: 00010093
> RAX: 8801ca1c8700 RBX: 001606e0 RCX: 811a2a92
> RDX:  RSI:  RDI: 001606e0
> RBP: 8801ca6f75a0 R08: 1100394dee0f R09: 0004
> R10: 8801ca6f7510 R11: 0004 R12: 0093
> R13: 8801ca1c8700 R14: 8801db514850 R15: 8801db514850
> FS:  01031880() GS:8801db50() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 05e22006 CR4: 001626e0
> Call Trace:
>  __write_cr4 arch/x86/include/asm/paravirt.h:76 [inline]
>  __cr4_set arch/x86/include/asm/tlbflush.h:180 [inline]
>  cr4_clear_bits arch/x86/include/asm/tlbflush.h:203 [inline]
>  kvm_cpu_vmxoff arch/x86/kvm/vmx.c:3582 [inline]
>  hardware_disable+0x34a/0x4b0 arch/x86/kvm/vmx.c:3588
>  kvm_arch_hardware_disable+0x35/0xd0 arch/x86/kvm/x86.c:7982
>  hardware_disable_nolock+0x30/0x40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3310
>  on_each_cpu+0xca/0x1b0 kernel/smp.c:604
>  hardware_disable_all_nolock+0x3e/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3328
>  hardware_disable_all arch/x86/kvm/../../../virt/kvm/kvm_main.c:3334 [inline]
>  kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:742 [inline]
>  kvm_put_kvm+0x956/0xdf0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:755
>  kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:766
>  __fput+0x327/0x7e0 fs/file_table.c:210
>  fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x199/0x270 kernel/task_work.c:113
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0x9bb/0x1ad0 kernel/exit.c:865
>  do_group_exit+0x149/0x400 kernel/exit.c:968
>  SYSC_exit_group kernel/exit.c:979 [inline]
>  SyS_exit_group+0x1d/0x20 kernel/exit.c:977
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x441c78
> RSP: 002b:7ffe68e20f68 EFLAGS: 0246 ORIG_RAX: 00e7
> RAX: ffda RBX: 004002c8 RCX: 00441c78
> RDX:  RSI: 003c RDI: 
> RBP: 006cd018 R08: 00e7 R09: ffd0
> R10: 0012 R11: 0246 R12: 00404080
> R13: 00404110 R14:  R15: 
> Code: 0f 1f 80 00 00 00 00 55 48 89 e5 0f 20 d8 5d c3 0f 1f 80 00 00 00 00
> 55 48 89 e5 0f 22 df 5d c3 0f 1f 80 00 00 00 00 55 48 89 e5 <0f> 22 e7 5d c3
> 0f 1f 80 00 00 00 00 55 48 89 e5 44 0f 20 c0 5d
> RIP: native_write_cr4+0x4/0x10 arch/x86/include/asm/special_insns.h:76 RSP:
> 8801ca6f75a0
> ---[ end trace ca14f0c15b26c251 ]---
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
> Please credit me with: Reported-by: syzbot 
>
> syzbot will keep track of this bug report.
> Once a fix for this bug is merged into any tree, reply to this email with:
> #syz fix: exact-commit-title
> If you want to test a patch for this bug, please reply with:
> #syz test: git://repo/address.git branch
> and provide the patch inline or as an attachment.
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.

Re: BUG: unable to handle kernel NULL pointer dereference in rb_insert_color

2017-12-19 Thread Dmitry Vyukov

On Tue, Dec 19, 2017 at 10:59 PM, Eric Biggers  wrote:
> On Tue, Dec 19, 2017 at 12:41:01AM -0800, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
>>
>> sctp: [Deprecated]: syz-executor6 (pid 4202) Use of int in max_burst
>> socket option.
>> Use struct sctp_assoc_value instead
>> BUG: unable to handle kernel NULL pointer dereference at 0008
>> sctp: [Deprecated]: syz-executor4 (pid 4240) Use of int in max_burst
>> socket option.
>> Use struct sctp_assoc_value instead
>> sctp: [Deprecated]: syz-executor4 (pid 4240) Use of int in max_burst
>> socket option.
>> Use struct sctp_assoc_value instead
>> IP: __rb_insert lib/rbtree.c:126 [inline]
>> IP: rb_insert_color+0x17/0x190 lib/rbtree.c:452
>> PGD 0 P4D 0
>> Oops:  [#1] SMP
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 4244 Comm: modprobe Not tainted 4.15.0-rc3-next-20171214+ #67
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> RIP: 0010:__rb_insert lib/rbtree.c:126 [inline]
>> RIP: 0010:rb_insert_color+0x17/0x190 lib/rbtree.c:452
>> RSP: 0018:c900010a7c08 EFLAGS: 00010246
>> RAX:  RBX:  RCX: 814ddcb9
>> RDX: 8801ebedf988 RSI: 8801ebfd6400 RDI: 88021413a408
>> RBP: c900010a7c08 R08: 0002bcf8 R09: 88021413a400
>> R10:  R11:  R12: 88021413a400
>> R13: 8801ebedf990 R14: a34fc52a R15: 8801ebedf988
>> FS:  7f85a5155700() GS:88021fc0() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: 0008 CR3: 0001eaccd006 CR4: 001606f0
>> DR0: 2000 DR1: 2000 DR2: 
>> DR3:  DR6: 0ff0 DR7: 0600
>> Call Trace:
>>  ext4_htree_store_dirent+0x122/0x160 fs/ext4/dir.c:488
>>  htree_dirblock_to_tree+0x112/0x300 fs/ext4/namei.c:1019
>>  ext4_htree_fill_tree+0xdf/0x410 fs/ext4/namei.c:1096
>>  ext4_dx_readdir fs/ext4/dir.c:575 [inline]
>>  ext4_readdir+0x8cf/0xd70 fs/ext4/dir.c:122
>>  iterate_dir+0xb8/0x200 fs/readdir.c:51
>>  SYSC_getdents fs/readdir.c:231 [inline]
>>  SyS_getdents+0xcc/0x1b0 fs/readdir.c:212
>>  entry_SYSCALL_64_fastpath+0x1f/0x96
>> RIP: 0033:0x7f85a4a45575
>> RSP: 002b:7ffc9b5be120 EFLAGS: 0246 ORIG_RAX: 004e
>> RAX: ffda RBX: 7f85a4d23e98 RCX: 7f85a4a45575
>> RDX: 8000 RSI: 5633094701e0 RDI: 
>> RBP: 7f85a4d23e40 R08: 5633094701e0 R09: 7f85a4d23e90
>> R10:  R11: 0246 R12: 5633094701b0
>> R13: 00018e21 R14:  R15: 0004
>> Code: 48 85 d2 75 eb 5d c3 31 c0 5d c3 66 0f 1f 84 00 00 00 00 00 55
>> 48 8b 17 48 89 e5 48 85 d2 0f 84 4c 01 00 00 48 8b 02 a8 01 75 5e
>> <48> 8b 48 08 49 89 c0 48 39 d1 74 54 48 85 c9 74 09 f6 01 01 0f
>> RIP: __rb_insert lib/rbtree.c:126 [inline] RSP: c900010a7c08
>> RIP: rb_insert_color+0x17/0x190 lib/rbtree.c:452 RSP: c900010a7c08
>> CR2: 0008
>> BUG: unable to handle kernel paging request at 00010001
>> ---[ end trace c403bd3ebad2ccb0 ]---
>
> The line number in lib/rbtree.c seems to be slightly off.  Looking at the
> disassembly:
>
> 825b5ea0 :
> 825b5ea0:   55                      push   %rbp
> 825b5ea1:   48 8b 17                mov    (%rdi),%rdx
> 825b5ea4:   48 89 e5                mov    %rsp,%rbp
> 825b5ea7:   48 85 d2                test   %rdx,%rdx
> 825b5eaa:   0f 84 4c 01 00 00       je     825b5ffc
> 825b5eb0:   48 8b 02                mov    (%rdx),%rax
> 825b5eb3:   a8 01                   test   $0x1,%al
> 825b5eb5:   75 5e                   jne    825b5f15
> 825b5eb7:   48 8b 48 08             mov    0x8(%rax),%rcx
> It crashed on 'mov 0x8(%rax),%rcx' which corresponds to
> 'tmp = gparent->rb_right;' at lib/rbtree.c:131.  So 'parent' was the root
> node, but its color was red, while it is supposed to be black.
>
> No idea how that happened, but it's almost certainly not an ext4 bug.  In fact
> there is another report of this same crash that has a different call trace:
>
> Call Trace:
>  key_alloc_serial security/keys/key.c:170 [inline]
>  key_alloc+0x54c/0x5b0 security/keys/key.c:319
>  keyring_alloc+0x4d/0xb0 security/keys/keyring.c:503
>  install_process_keyring_to_cred.part.3+0x38/0x80 
> security/keys/

Re: [PATCH v2 2/4] pinctrl: mediatek: cleanup for placing all drivers under the menu

2017-12-19 Thread Linus Walleij

On Tue, Dec 12, 2017 at 7:24 AM,   wrote:

> From: Sean Wang 
>
> Since lots of MediaTek drivers have been added, it seems slightly better
> to clean up by placing the MediaTek pinctrl drivers under an independent
> menu, as is usually done for other kinds of drivers.
>
> Signed-off-by: Sean Wang 
> Reviewed-by: Biao Huang 

Patch applied. Also very nice!

Yours,
Linus Walleij

Re: [PATCH v2 1/4] dt-bindings: pinctrl: add bindings for MediaTek MT7622 SoC

2017-12-19 Thread Linus Walleij

On Tue, Dec 12, 2017 at 7:24 AM,   wrote:

> From: Sean Wang 
>
> Add devicetree bindings for MediaTek MT7622 pinctrl driver.
>
> Signed-off-by: Sean Wang 
> Reviewed-by: Biao Huang 

Patch applied with Rob's ACK.

Yours,
Linus Walleij

Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

2017-12-19 Thread Sergey Senozhatsky

On (12/19/17 09:40), Steven Rostedt wrote:
> On Tue, 19 Dec 2017 13:58:46 +0900
> Sergey Senozhatsky  wrote:
> 
> > so you are not convinced that my scenarios real/matter; I'm not
> 
> Well, not with the test module. I'm looking for actual code in the
> upstream kernel.
> 
> > convinced that I have stable and functional boards with this patch ;)
> > seems that we are coming to a dead end.
> 
> Can I ask, is it any worse than what we have today?

that's a really hard question. both the existing printk() and the
tweaked printk() have the same thing in common - a) preemption from
console_unlock() and b) printk() being way too fast compared to anything
else (call_console_drivers() and the preemption latency of the console_sem
owner). your patch puts some requirements that my workload simply cannot
fulfill. so maybe if I start pushing it towards OOM and so on, then
I'll see some difference (but both (a) and (b) still gonna stay true).

the thing that really changes everything is offloading to printk_kthread.
given that I can't have a tiny logbuf, and that I can't have fast console,
and that I can't have tons of printks to choose from and to take advantage
of the hand off algorithm in any reliable way; I need something more to
guarantee that the current console_sem will not be forced to evict all
logbuf messages.

> > for the record,
> > I'm not going to block the patch if you want it to be merged.
> 
> Thanks,

I mean it :)

-ss

Re: [PATCH V2 9/9] ARM: dts: stm32: add initial support of stm32mp157c eval board

2017-12-19 Thread Linus Walleij

On Mon, Dec 18, 2017 at 4:17 PM, Ludovic Barre  wrote:

> From: Ludovic Barre 
>
> Add support of stm32mp157c evaluation board (part number: STM32MP157C-EV1)
> split in 2 elements:
> -Daughter board (part number: STM32MP157C-ED1)
>  which includes CPU, memory and power supply
> -Mother board (part number: STM32MP157C-EM1)
>  which includes external peripherals (like display, camera,...)
>  and extension connectors.
>
> The daughter board can run alone, this is why the device tree files
> are split in two layers, for the complete evaluation board (ev1)
> and for the daughter board alone (ed1).
>
> Signed-off-by: Ludovic Barre 
> Signed-off-by: Alexandre Torgue 
(...)
> diff --git a/arch/arm/boot/dts/stm32mp157c-ev1.dts b/arch/arm/boot/dts/stm32mp157c-ev1.dts

Evaluation boards are important because they set a pattern that customers
will use.

Please consider including nodes for all GPIO blocks used in this
evaluation board, and add:

gpio-line-names = "foo", "bar" ...;

See for example
arch/arm/boot/dts/bcm2835-rpi-a.dts
arch/arm/boot/dts/ste-snowball.dts

It's good to have because probably you guys have proper schematics and
know rail names of the stuff connected to those GPIO lines and so on,
so you can give the lines proper names.

It will be helpful for people using the reference design, especially with the
new character device, and also sets a pattern for people doing devices
based on the reference design and we really want to do that.

Yours,
Linus Walleij

Re: [PATCH] staging: ccree: use size_t consistently

2017-12-19 Thread Greg Kroah-Hartman

On Wed, Dec 20, 2017 at 07:23:31AM +0000, Gilad Ben-Yossef wrote:
> Fix declaration, implementation and wrapper function to use
> the same size_t type we actually define the parameter to be.
> 
> Fixes: 3f268f5d6669 ("staging: ccree: turn compile time debug log to params")
> Signed-off-by: Gilad Ben-Yossef 

You forgot the reported-by: tag :(

I'll go add it...

Re: [PATCH V2 6/9] pinctrl: stm32: Add STM32MP157 MPU support

2017-12-19 Thread Linus Walleij

On Mon, Dec 18, 2017 at 4:17 PM, Ludovic Barre  wrote:

> From: Ludovic Barre 
>
> This driver consists of 2 controllers due to a hole in mapping:
> -1 controller for GPIO bankA to K.
> -1 controller for GPIO bankZ.
>
> Signed-off-by: Alexandre Torgue 
> Signed-off-by: Ludovic Barre 
> Reviewed-by: Rob Herring 

Patch applied.

Yours,
Linus Walleij

Re: [PATCH V2 1/9] devicetree: bindings: Document supported STM32 SoC family

2017-12-19 Thread Linus Walleij

On Mon, Dec 18, 2017 at 4:17 PM, Ludovic Barre  wrote:

> From: Ludovic Barre 
>
> This adds a list of supported STM32 SoC bindings.
>
> Signed-off-by: Gwenael Treuveur 
> Signed-off-by: Ludovic Barre 
> Reviewed-by: Rob Herring 

Patch applied.

Yours,
Linus Walleij

Re: [PATCH] kfree_rcu() should use the new kfree_bulk() interface for freeing rcu structures

2017-12-19 Thread Jesper Dangaard Brouer


On Tue, 19 Dec 2017 13:20:43 -0800 Rao Shoaib  wrote:

> On 12/19/2017 12:41 PM, Jesper Dangaard Brouer wrote:
> > On Tue, 19 Dec 2017 09:52:27 -0800 rao.sho...@oracle.com wrote:
> >  
> >> +/* Main RCU function that is called to free RCU structures */
> >> +static void
> >> +__rcu_bulk_free(struct rcu_head *head, rcu_callback_t func, int cpu, bool lazy)
> >> +{
> >> +  unsigned long offset;
> >> +  void *ptr;
> >> +  struct rcu_bulk_free *rbf;
> >> +  struct rcu_bulk_free_container *rbfc = NULL;
> >> +
> >> +  rbf = this_cpu_ptr(&cpu_rbf);
> >> +
> >> +  if (unlikely(!rbf->rbf_init)) {
> >> +          spin_lock_init(&rbf->rbf_lock);
> >> +          rbf->rbf_cpu = smp_processor_id();
> >> +          rbf->rbf_init = true;
> >> +  }
> >> +
> >> +  /* hold lock to protect against other cpu's */
> >> +  spin_lock_bh(&rbf->rbf_lock);
> >
> > I'm not sure this will be faster.  Having to take a cross CPU lock here
> > (+ BH-disable) could cause scaling issues.   Hopefully this lock will
> > not be used intensively by other CPUs, right?
> >
[...]
> 
> As Paul has pointed out the lock is a per-cpu lock, the only reason for 
> another CPU to access this lock is if the rcu callbacks run on a 
> different CPU and there is nothing the code can do to avoid that but 
> that should be rare anyways.

(loop in Paul's comment)
On Tue, 19 Dec 2017 12:56:29 -0800
"Paul E. McKenney"  wrote:

> Isn't this lock in a per-CPU object?  It -might- go cross-CPU in response
> to CPU-hotplug operations, but that should be rare.

Point taken.  If this lock is very unlikely to be taken on another CPU
then I withdraw my performance concerns (the cacheline can hopefully
stay in Modified(M) state on this CPU, and all other CPUs will have it
in Invalid(I) state based on MESI cache coherence protocol view[1]).

The lock's atomic operation does have some overhead, and _later_ we
could get fancy and use seqlock (include/linux/seqlock.h) to remove
that.

[1] https://en.wikipedia.org/wiki/MESI_protocol
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[PATCH v6 0/2] KVM: MMU: fix kvm_is_mmio_pfn()

2017-12-19 Thread Haozhong Zhang

Some reserved pages, such as those from NVDIMM DAX devices, are not
for MMIO, and can be mapped with cached memory type for better
performance. However, the existing check in kvm_is_mmio_pfn()
misidentifies those pages as MMIO.  Because KVM maps MMIO pages with
UC memory type, the performance of guest accesses to those pages would
be harmed. Therefore, we additionally check the host memory type and
only treat UC/UC-/WC pages as MMIO.
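
A minimal userspace model of the decision described above, assuming a flattened per-pfn record in place of the kernel lookups. The `pfn_info` struct, the `cache_mode` enum and both function names are hypothetical. Only the boolean logic follows the change in patch 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the host cache modes named in this series. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_WP };

/* Hypothetical flattened view of what the kernel looks up per pfn. */
struct pfn_info {
	bool valid;		/* models pfn_valid() */
	bool zero_page;		/* models is_zero_pfn() */
	bool reserved;		/* models PageReserved() */
	enum cache_mode cm;	/* models the host memory type from PAT */
};

/* True for the memory types a UC MTRR cannot override: UC, UC-, WC. */
static bool cm_is_uc_uc_minus_or_wc(enum cache_mode cm)
{
	return cm == CM_UC || cm == CM_UC_MINUS || cm == CM_WC;
}

/* A reserved page only counts as MMIO if PAT is off or the host maps it
 * UC/UC-/WC; a pfn with no valid page is always treated as MMIO. */
static bool model_is_mmio_pfn(const struct pfn_info *p, bool pat_enabled)
{
	assert(p != NULL);
	if (p->valid)
		return !p->zero_page && p->reserved &&
		       (!pat_enabled || cm_is_uc_uc_minus_or_wc(p->cm));
	return true;
}
```

With PAT enabled, a reserved write-back page (the NVDIMM DAX case) is no longer classified as MMIO, while a reserved UC page still is.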

Changes in v6:
 * Rename the function in patch 1 to pat_immune_to_uc_mtrr().
 * Consider WC memory type in patch 1.

Changes in v5:
 * Rename pat_pfn_is_uc() into pat_pfn_is_uc_or_uc_minus() to avoid
   confusion.
 * Drop converters between kvm_pfn_t and pfn_t, because they are not
   necessary. pat_pfn_is_uc_or_uc_minus() does not need flags in
   pfn_t, so we can only pass a raw unsigned long to it.

Changes in v4:
 * Mask pfn_t and kvm_pfn_t specific flags in conversion.

Changes in v3:
 * Move cache mode check to pat.c as pat_pfn_is_uc()
 * Reintroduce converters between kvm_pfn_t and pfn_t.

Changes in v2:
 * Switch to lookup_memtype() to get host memory type.
 * Rewrite the comment in KVM MMU patch.
 * Remove v1 patch 2, which is not necessary in v2.

Haozhong Zhang (2):
  x86/mm: add a function to check if a pfn is UC/UC-/WC
  KVM: MMU: consider host cache mode in MMIO page check

 arch/x86/include/asm/pat.h |  2 ++
 arch/x86/kvm/mmu.c         | 13 ++++++++++++-
 arch/x86/mm/pat.c          | 19 +++++++++++++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

-- 
2.14.1

[PATCH v6 2/2] KVM: MMU: consider host cache mode in MMIO page check

2017-12-19 Thread Haozhong Zhang

Some reserved pages, such as those from NVDIMM DAX devices, are not
for MMIO, and can be mapped with cached memory type for better
performance. However, the existing check in kvm_is_mmio_pfn()
misidentifies those pages as MMIO.  Because KVM maps MMIO pages with
UC memory type, the performance of guest accesses to those pages would
be harmed. Therefore, we additionally check the host memory type and
only treat UC/UC-/WC pages as MMIO.

Signed-off-by: Haozhong Zhang 
Reported-by: Cuevas Escareno, Ivan D 
Reported-by: Kumar, Karthik 
Reviewed-by: Xiao Guangrong 
---
 arch/x86/kvm/mmu.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 89da688784fa..e3b9998b3355 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2708,7 +2708,18 @@ static bool mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 {
 	if (pfn_valid(pfn))
-		return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
+		return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn)) &&
+			/*
+			 * Some reserved pages, such as those from NVDIMM
+			 * DAX devices, are not for MMIO, and can be mapped
+			 * with cached memory type for better performance.
+			 * However, the above check misconceives those pages
+			 * as MMIO, and results in KVM mapping them with UC
+			 * memory type, which would hurt the performance.
+			 * Therefore, we check the host memory type in addition
+			 * and only treat UC/UC-/WC pages as MMIO.
+			 */
+			(!pat_enabled() || pat_immune_to_uc_mtrr(pfn));
 
 	return true;
 }
-- 
2.14.1

Re: [ANNOUNCE] autofs 5.1.2 release

2017-12-19 Thread Ian Kent

On 20/12/17 13:52, Ian Kent wrote:
> On 20/12/17 11:29, NeilBrown wrote:
>>
>> Hi Ian,
>>  I've been looking at:
>>
>>> - add configuration option to use fqdn in mounts.
>>
>> (commit 9aeef772604) because using this new option causes a regression.
>> If you are using the "replicated server" functionality, then
>>   use_hostname_for_mounts = yes
>> completely disables it.
> 
> Yes, that's not quite right.
> 
> It disables the probe and proximity check for each distinct host
> name used.
> 
> Each of the entries in the list of hosts should still be
> attempted and given that NFS ping is also now used in the NFS
> mount module what's lost is the preferred ordering of the hosts
> list.

Mmm  that's also not right.

An NFS ping is only done on failed local bind mount to check
the NFS server is running on the local machine.

So that availability check needs to be done at mount time if
the proximity check is not done 

Ian

[PATCH v6 1/2] x86/mm: add a function to check if a pfn is UC/UC-/WC

2017-12-19 Thread Haozhong Zhang

Check whether the PAT memory type of a pfn cannot be overridden by
MTRR UC memory type, i.e. the PAT memory type is UC, UC- or WC. This
function will be used by KVM to determine whether it needs to map a
host pfn to guest with UC memory type.

Signed-off-by: Haozhong Zhang 
Reviewed-by: Xiao Guangrong 
---
 arch/x86/include/asm/pat.h |  2 ++
 arch/x86/mm/pat.c          | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 8a3ee355b422..9a217a18523b 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -22,4 +22,6 @@ int io_reserve_memtype(resource_size_t start, resource_size_t end,
 
 void io_free_memtype(resource_size_t start, resource_size_t end);
 
+bool pat_immune_to_uc_mtrr(unsigned long pfn);
+
 #endif /* _ASM_X86_PAT_H */
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index fe7d57a8fb60..2231a84c3d34 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -677,6 +677,25 @@ static enum page_cache_mode lookup_memtype(u64 paddr)
 	return rettype;
 }
 
+/**
+ * Check whether the PAT memory type of @pfn cannot be overridden by
+ * UC MTRR memory type.
+ *
+ * Only to be called when PAT is enabled.
+ *
+ * Returns true, if the PAT memory type of @pfn is UC, UC-, or WC.
+ * Returns false in other cases.
+ */
+bool pat_immune_to_uc_mtrr(unsigned long pfn)
+{
+	enum page_cache_mode cm = lookup_memtype(PFN_PHYS(pfn));
+
+	return cm == _PAGE_CACHE_MODE_UC ||
+	       cm == _PAGE_CACHE_MODE_UC_MINUS ||
+	       cm == _PAGE_CACHE_MODE_WC;
+}
+EXPORT_SYMBOL_GPL(pat_immune_to_uc_mtrr);
+
 /**
  * io_reserve_memtype - Request a memory type mapping for a region of memory
  * @start: start (physical address) of the region
-- 
2.14.1

[PATCH] staging: ccree: use size_t consistently

2017-12-19 Thread Gilad Ben-Yossef

Fix declaration, implementation and wrapper function to use
the same size_t type we actually define the parameter to be.

Fixes: 3f268f5d6669 ("staging: ccree: turn compile time debug log to params")
Signed-off-by: Gilad Ben-Yossef 
---
 drivers/staging/ccree/ssi_driver.c | 2 +-
 drivers/staging/ccree/ssi_driver.h | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/ccree/ssi_driver.c b/drivers/staging/ccree/ssi_driver.c
index 56b5d45..1254c69 100644
--- a/drivers/staging/ccree/ssi_driver.c
+++ b/drivers/staging/ccree/ssi_driver.c
@@ -86,7 +86,7 @@ void __dump_byte_array(const char *name, const u8 *buf, size_t len)
 	if (!buf)
 		return;
 
-	snprintf(prefix, sizeof(prefix), "%s[%lu]: ", name, len);
+	snprintf(prefix, sizeof(prefix), "%s[%zu]: ", name, len);
 
 	print_hex_dump(KERN_DEBUG, prefix, DUMP_PREFIX_ADDRESS, 16, 1, buf,
 		       len, false);
diff --git a/drivers/staging/ccree/ssi_driver.h b/drivers/staging/ccree/ssi_driver.h
index 5a56f7a..bf83f3e 100644
--- a/drivers/staging/ccree/ssi_driver.h
+++ b/drivers/staging/ccree/ssi_driver.h
@@ -174,10 +174,9 @@ static inline struct device *drvdata_to_dev(struct cc_drvdata *drvdata)
 	return &drvdata->plat_dev->dev;
 }
 
-void __dump_byte_array(const char *name, const u8 *the_array,
-		       unsigned long size);
+void __dump_byte_array(const char *name, const u8 *buf, size_t len);
 static inline void dump_byte_array(const char *name, const u8 *the_array,
-				   unsigned long size)
+				   size_t size)
 {
 	if (cc_dump_bytes)
 		__dump_byte_array(name, the_array, size);
-- 
2.7.4
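
The format-specifier change in the patch above can be illustrated with a small userspace sketch. The helper name `format_dump_prefix` is hypothetical; it builds the same kind of prefix string as __dump_byte_array() does, using standard C snprintf().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper mirroring the prefix built in __dump_byte_array(). */
static int format_dump_prefix(char *prefix, size_t prefix_size,
			      const char *name, size_t len)
{
	/* %zu is the conversion for size_t; %lu is only correct on targets
	 * where size_t happens to be unsigned long. */
	return snprintf(prefix, prefix_size, "%s[%zu]: ", name, len);
}
```

Keeping the parameter type and the conversion specifier in step avoids format warnings on 32-bit builds, where size_t is typically unsigned int.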

Re: [PATCH v2 2/2] pinctrl: Allow indicating loss of pin states during low-power

2017-12-19 Thread Linus Walleij

On Mon, Dec 11, 2017 at 12:38 AM, Florian Fainelli  wrote:
> On 12/02/2017 04:48 AM, Linus Walleij wrote:

>> This should solve your problem without having to alter the semantics
>> of pinctrl_select_state() for everyone.
>
> This was exactly what I proposed initially here:
>
> http://patchwork.ozlabs.org/patch/734326/
>
> I really want to get this fixed, but I can't do that if we keep losing
> the context of the discussion (pun intended) :).

Oh sorry man. I am clearly too stupid for this job...

In accordance with things needing to be intuitive, something named
*force_* should of course force the setting into the hardware.

The original patch didn't mention the fact that it was hogs
and hogs only that were causing the trouble and that is why I
got lost. (I guess.) I have been going about this as if it was
something generic that affects all states in all devices, and to
me hogs are just an obscure corner of pin controlling...

I applied the patchwork patch from above, and elaborated
a bit on that it pertains to hogs, let's see what
happens.

For the case where a driver (not hog) needs to handle
suspend/resume transitions, proper states can hopefully
be used.

Yours,
Linus Walleij

Re: [PATCH v3 14/16] phy: Add notify_speed callback

2017-12-19 Thread Kishon Vijay Abraham I

Hi,

On Wednesday 20 December 2017 11:59 AM, Manu Gautam wrote:
> Hi,
> 
> 
> On 12/20/2017 11:19 AM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On Tuesday 12 December 2017 08:54 PM, Manu Gautam wrote:
>>> Hi,
>>>
>>>
>>> On 12/12/2017 5:13 PM, Kishon Vijay Abraham I wrote:
>>>> Hi,
>>>>
>>>> On Tuesday 21 November 2017 02:53 PM, Manu Gautam wrote:
>>>>> QCOM USB PHYs can monitor resume/remote-wakeup event in
>>>>> suspended state. However PHY driver must know current
>>>>> operational speed of PHY in order to set correct polarity of
>>>>> wakeup events for detection. E.g. QUSB2 PHY monitors DP/DM
>>>>> signals depending on speed is LS or FS/HS to detect resume.
>>>>> Similarly QMP USB3 PHY in SS mode should monitor RX
>>>>> terminations attach/detach and LFPS events depending on
>>>>> SSPHY is active or not.
>> Why not use a notification mechanism instead of adding new APIs in phy-core.
>> This will only bloat phy-core with APIs for a particular platform.
> 
> Do you mean notifier_chains ?
> When we have multiple instances of USB PHYs then notifier chains are not
> of much help. For any platform glue or PHY driver it will be very difficult to
> figure out if notification received for speed was for same phy/bus or a
> different one.
> Using PHY callbacks looked more elegant to me. Additionally PHY drivers
> can also use this info to decide power management policy e.g. if speed is
> INVALID then it means PHY is not in a session and it can enter deepest
> low power state.
> Additionally if you prefer set_speed name over notify_speed then I am
> ok with that as well so that it sounds more generic.

I'd prefer adding modes in enum phy_mode according to speed and using 
phy_set_mode.

Thanks
Kishon
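
A minimal user-space sketch of the idea above (per-speed modes set through a
set_mode-style call), assuming hypothetical names throughout. The enum values,
the struct and both functions are invented for illustration and are not the
real enum phy_mode or phy-core API. The wakeup kinds are deliberately abstract,
since the thread does not say which line is watched at which speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-speed modes; not the real enum phy_mode. */
enum demo_phy_mode {
	DEMO_PHY_MODE_INVALID,	/* not in a session */
	DEMO_PHY_MODE_USB_LS,
	DEMO_PHY_MODE_USB_FS,
	DEMO_PHY_MODE_USB_HS,
	DEMO_PHY_MODE_USB_SS,
};

/* Hypothetical wakeup configurations a PHY driver could arm on suspend. */
enum demo_wakeup {
	DEMO_WAKEUP_NONE,		/* nothing to monitor */
	DEMO_WAKEUP_LS_POLARITY,	/* DP/DM polarity for low speed */
	DEMO_WAKEUP_FS_HS_POLARITY,	/* DP/DM polarity for full/high speed */
	DEMO_WAKEUP_SS_LFPS,		/* RX termination and LFPS events */
};

struct demo_phy {
	enum demo_phy_mode mode;
};

/* Models a set_mode-style callback: the driver only records the mode. */
static void demo_phy_set_mode(struct demo_phy *phy, enum demo_phy_mode mode)
{
	assert(phy != NULL);
	phy->mode = mode;
}

/* On suspend, the recorded mode selects which wakeup events to arm. */
static enum demo_wakeup demo_phy_suspend_wakeup(const struct demo_phy *phy)
{
	assert(phy != NULL);
	switch (phy->mode) {
	case DEMO_PHY_MODE_USB_LS:
		return DEMO_WAKEUP_LS_POLARITY;
	case DEMO_PHY_MODE_USB_FS:
	case DEMO_PHY_MODE_USB_HS:
		return DEMO_WAKEUP_FS_HS_POLARITY;
	case DEMO_PHY_MODE_USB_SS:
		return DEMO_WAKEUP_SS_LFPS;
	default:
		return DEMO_WAKEUP_NONE;
	}
}
```

In this sketch the "speed is INVALID" case from the quoted mail maps to
DEMO_WAKEUP_NONE, which is where a driver could choose its deepest low power
state.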

Re: [PATCH 03/15] staging: lustre: replace simple cases of LIBCFS_ALLOC with kzalloc.

2017-12-19 Thread kbuild test robot

Hi NeilBrown,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on staging/staging-testing]
[also build test ERROR on next-20171220]
[cannot apply to v4.15-rc4]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/NeilBrown/staging-lustre-convert-most-LIBCFS-ALLOC-to-k-malloc/20171220-113029
config: x86_64-randconfig-r0-12200451 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c: In function 'kiblnd_dev_failover':
>> drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c:2395:2: error: 'kdev' undeclared (first use in this function)
     kdev = kzalloc(sizeof(*hdev), GFP_NOFS);
     ^~~~
   drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c:2395:2: note: each undeclared identifier is reported only once for each function it appears in

vim +/kdev +2395 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c

  2329  
  2330  int kiblnd_dev_failover(struct kib_dev *dev)
  2331  {
  2332  LIST_HEAD(zombie_tpo);
  2333  LIST_HEAD(zombie_ppo);
  2334  LIST_HEAD(zombie_fpo);
  2335  struct rdma_cm_id *cmid  = NULL;
  2336  struct kib_hca_dev *hdev  = NULL;
  2337  struct ib_pd *pd;
  2338  struct kib_net *net;
  2339  struct sockaddr_in addr;
  2340  unsigned long flags;
  2341  int rc = 0;
  2342  int i;
  2343  
  2344  LASSERT(*kiblnd_tunables.kib_dev_failover > 1 ||
  2345  dev->ibd_can_failover || !dev->ibd_hdev);
  2346  
  2347  rc = kiblnd_dev_need_failover(dev);
  2348  if (rc <= 0)
  2349  goto out;
  2350  
  2351  if (dev->ibd_hdev &&
  2352  dev->ibd_hdev->ibh_cmid) {
  2353  /*
  2354   * XXX it's not good to close old listener at here,
  2355   * because we can fail to create new listener.
  2356   * But we have to close it now, otherwise rdma_bind_addr
  2357   * will return EADDRINUSE... How crap!
  2358   */
  2359  write_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
  2360  
  2361  cmid = dev->ibd_hdev->ibh_cmid;
  2362  /*
  2363   * make next schedule of kiblnd_dev_need_failover()
  2364   * return 1 for me
  2365   */
  2366  dev->ibd_hdev->ibh_cmid  = NULL;
  2367  write_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
  2368  
  2369  rdma_destroy_id(cmid);
  2370  }
  2371  
  2372  cmid = kiblnd_rdma_create_id(kiblnd_cm_callback, dev, RDMA_PS_TCP,
  2373   IB_QPT_RC);
  2374  if (IS_ERR(cmid)) {
  2375  rc = PTR_ERR(cmid);
  2376  CERROR("Failed to create cmid for failover: %d\n", rc);
  2377  goto out;
  2378  }
  2379  
  2380  memset(&addr, 0, sizeof(addr));
  2381  addr.sin_family  = AF_INET;
  2382  addr.sin_addr.s_addr = htonl(dev->ibd_ifip);
  2383  addr.sin_port   = htons(*kiblnd_tunables.kib_service);
  2384  
  2385  /* Bind to failover device or port */
  2386  rc = rdma_bind_addr(cmid, (struct sockaddr *)&addr);
  2387  if (rc || !cmid->device) {
  2388  CERROR("Failed to bind %s:%pI4h to device(%p): %d\n",
  2389 dev->ibd_ifname, &dev->ibd_ifip,
  2390 cmid->device, rc);
  2391  rdma_destroy_id(cmid);
  2392  goto out;
  2393  }
  2394  
> 2395  kdev = kzalloc(sizeof(*hdev), GFP_NOFS);
  2396  if (!hdev) {
  2397  CERROR("Failed to allocate kib_hca_dev\n");
  2398  rdma_destroy_id(cmid);
  2399  rc = -ENOMEM;
  2400  goto out;
  2401  }
  2402  
  2403  atomic_set(&hdev->ibh_ref, 1);
  2404  hdev->ibh_dev   = dev;
  2405  hdev->ibh_cmid  = cmid;
  2406  hdev->ibh_ibdev = cmid->device;
  2407  
  2408  pd = ib_alloc_pd(cmid->device, 0);
  2409  if (IS_ERR(pd)) {
  2410  rc = PTR_ERR(pd);
  2411  CERROR("Can't allocate PD: %d\n", rc);
  2412  goto out;
  2413  }
  2414  
  2415  hdev->ibh_pd = pd;
  2416  
  2417  rc = rdma_listen(cmid, 0);
  2418  if (rc) {
  2419  CERROR("Can't start new listener: %d\n", rc);
  2420  goto out;
  2421  }
  2422  
  2423  rc = kiblnd_hdev_get_attr(hdev);
  2424  if (

Re: [PATCH v10 1/5] add infrastructure for tagging functions as error injectable

2017-12-19 Thread Masami Hiramatsu

On Tue, 19 Dec 2017 18:14:17 -0800
Alexei Starovoitov  wrote:

> On 12/18/17 10:29 PM, Masami Hiramatsu wrote:
> >>
> >> +#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
> >> +#ifdef CONFIG_BPF_KPROBE_OVERRIDE
> >
> > BTW, CONFIG_BPF_KPROBE_OVERRIDE is also a confusing name.
> > Since this feature overrides a function to just return with
> > some return value (as far as I understand, or would you
> > also plan to modify execution path inside a function?),
> > I think CONFIG_BPF_FUNCTION_OVERRIDE or
> > CONFIG_BPF_EXECUTION_OVERRIDE would be better.
> 
> I don't think such renaming makes sense.
> The feature is overriding kprobe by changing how kprobe returns.
> It doesn't override BPF_FUNCTION or BPF_EXECUTION.

No, I meant this is BPF's feature which overrides a FUNCTION, so
BPF is a kind of namespace. (Is that only for a function entry
because it cannot tweak the stackframe at this moment?)

> The kernel enters and exits bpf program as normal.

Yeah, but that bpf program modifies instruction pointer, am I correct?

> 
> > Indeed, BPF is based on kprobes, but it seems you are limiting it
> > with ftrace (function-call trace) (I'm not sure the reason why),
> > so using "kprobes" for this feature seems strange for me.
> 
> do you have an idea how kprobe override can happen when kprobe
> placed in the middle of the function?

For example, if you know a basic block in the function, maybe
you can skip a block or something like that. But nowadays
it is somewhat hard because the optimizer mixes the blocks up.

> 
> Please make your suggestion as patches based on top of bpf-next.

bpf-next seems to have already picked up this series. Do you mean I
should revert it and write a new patch?

Thank you,

> 
> Thanks
> 


-- 
Masami Hiramatsu

Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread

2017-12-19 Thread Sergey Senozhatsky

Hello,

not sure if you've been following the whole thread, so I'll try
to summarize it here. apologies if it'll massively repeat the things
that have already been said or will be too long.

On (12/19/17 15:31), Michal Hocko wrote:
> On Tue 19-12-17 10:24:55, Sergey Senozhatsky wrote:
> > On (12/18/17 20:08), Steven Rostedt wrote:
> > > > ... do you guys read my emails? which part of the traces I have provided
> > > > suggests that there is any improvement?
> > > 
> > > The traces I've seen from you were from non-realistic scenarios.
> > > But I have hit issues with printk()s happening that cause one CPU to do 
> > > all
> > > the work, where my patch would fix that. Those are the scenarios I'm
> > > talking about.
> > 
> > any hints about what makes your scenario more realistic than mine?
> 
> Well, Tetsuo had some semi-realistic scenario where alloc stall messages
> managed to stall other printk callers (namely oom handler). I am saying
> semi-realistic because he is testing OOM throughput with an unrealistic
> workload which itself is not very real on production systems. The
> essential thing here is that many processes might be in the allocation
> path and any printk there could just swamp some unrelated printk caller
> and cause hard to debug problems. Steven's patch should address that
> with a relatively simple lock handover. I was pessimistic this would
> work sufficiently well but it survived Tetsuo's testing IIRC so it
> sounds good enough to me.

sure, no denial. Tetsuo indeed said that Steven's patch passed his
test. and for the note, the most recent printk_kthread patch set passed
Tetsuo's test as well; I asked him privately to run the tests before I
published it. but this is not the point. and to make it clear - this is
not a "Steven's patch" vs. "my patch set + Steven's patch atop of it"
type of thing. not at all.

IMPORTANT DISCLAIMER
   I SPEAK FOR MYSELF ONLY. ABOUT MY OBSERVATION ONLY. THIS IS AN
   ATTEMPT TO ANALYSE WHY THE PATCH DIDN'T WORK ON MY SETUP AND WHY
   MY SETUP NEEDS ANOTHER APPROACH. THIS IS NOT TO WARN ANYONE THAT
   THE PATCH WON'T WORK ON THEIR SETUPS. I MEAN NO OFFENSE AND AM
   NOT TRYING TO LOOK/SOUND SMART. AND, LIKE I SAID, IF STEVEN OR
   PETR WANT TO PUSH THE PATCH, I'M NOT GOING TO BLOCK IT.

so why has Steven's patch not succeeded on my boards?... partly because
of the fact that printk is "multidimensional" in its complexity and
Steven's patch just doesn't touch some of those problematic parts; partly
because the patch has the requirements which can't be 100% true on my
boards.

to begin with,
so the patch works only when the console_sem is contended. IOW, when
multiple sites perform printk-s concurrently and frequently enough for the
hand off logic to detect it and to pass the control to another CPU.
but this turned out to be a bit hard to count on. why? several problems.

(1) the really big one:
   console_sem owner can be preempted under console_sem, removing any
   console_sem competition, it's already locked - its owner is preempted.
   this removes any possibility of hand off. and this unleashes CPUs that
   do printk-s, because when console_sem is locked, printk-s from other
   CPUs become, basically, as fast as sprintf+memcpy.

(1.1) the really big one, part two. more on this later. see @PART2

(2) another big one:
   printk() fast path - sprintf+memcpy - can be significantly faster than
   call to console drivers. on my board I see that printk CPU can add 1140
   new messages to the logbuf, while active console_sem owner prints a
   single message to the serial console. not to mention console_sem owner
   preemption.

(1) and (2) combined can do magical things. on my extremely trivial
test -- not insanely huge number of printks (much less than 2000 lines)
from a preemptible context, not so much printk-s from other CPUs - I
can see, and I confirmed that with the traces, that when console_sem is
getting handed over to another CPU and that new console_sem owner is
getting preempted or when it begins to print messages to serial console,
the CPU that actually does most of printk-s finishes its job in almost no
time, because all it has to do is sprintf+memcpy. which basically means
that console_sem is not contended anymore, and thus it's either current
console_sem owner or _maybe_ some other task that has to print all of
the pending messages.
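
to put the 1140:1 observation from (2) into numbers, here is a toy model.
it assumes a constant store/print ratio and no preemption at all, which is
a simplification and not how a real system behaves; the function name and
its parameters are made up for this illustration.

```c
#include <assert.h>

/*
 * Toy model: while the console owner prints one message, a producer
 * stores `ratio` messages in the log buffer. Returns how many messages
 * are still pending at the moment the producer finishes its burst.
 */
static unsigned long pending_after_burst(unsigned long burst,
					 unsigned long ratio)
{
	unsigned long printed;

	assert(ratio > 0);
	printed = burst / ratio;	/* messages the console kept up with */
	return burst - printed;		/* left for whoever owns console_sem */
}
```

with the ratio above, a burst of a couple of thousand lines is stored
before the console has printed more than one or two of them, so nearly the
whole burst is left to a single console_sem owner.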

and to make it worse, the hand off logic does not distinguish between
contexts it's passing the console_sem ownership to. it will be equally
happy to hand off console_sem to atomic context from a non atomic one,
or from atomic to another atomic (and vice versa), regardless of the number
of pending messages in the logbuf. more on this later [see @LATER].

now, Steven and Petr said that my test was non realistic and thus the
observations were invalid. OK, let's take a closer look at OOM. and
let's start with the question - what is so special about OOM that
makes (1) or (1.1) or (2) invalid?
can we get preempted when we call out_

[PATCH] gitignore: add *.gcda files

2017-12-19 Thread Jaejoong Kim

Ignore the *.gcda files generated by gcov

Signed-off-by: Jaejoong Kim 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 0c39aa2..580ef7c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -39,6 +39,7 @@ Module.symvers
 *.dwo
 *.su
 *.c.[012]*.*
+*.gcda
 
 #
 # Top-level generic files
-- 
2.7.4

Re: [PATCH] kfree_rcu() should use the new kfree_bulk() interface for freeing rcu structures

2017-12-19 Thread Jesper Dangaard Brouer

On Tue, 19 Dec 2017 16:20:51 -0800
"Paul E. McKenney"  wrote:

> On Tue, Dec 19, 2017 at 02:12:06PM -0800, Matthew Wilcox wrote:
> > On Tue, Dec 19, 2017 at 09:41:58PM +0100, Jesper Dangaard Brouer wrote:  
> > > If I had to implement this: I would choose to do the optimization in
> > > __rcu_process_callbacks() create small on-call-stack ptr-array for
> > > kfree_bulk().  I would only optimize the case that call kfree()
> > > directly.  In the while(list) loop I would defer calling
> > > __rcu_reclaim() for __is_kfree_rcu_offset(head->func), and instead add
> > > them to the ptr-array (and flush if the array is full in loop, and
> > > kfree_bulk flush after loop).
> > > 
> > > The real advantage of kfree_bulk() comes from amortizing the per kfree
> > > (behind-the-scenes) sync cost.  There is an additional benefit, because
> > > objects come from RCU and will hit a slower path in SLUB.   The SLUB
> > > allocator is very fast for objects that get recycled quickly (short
> > > lifetime), non-locked (cpu-local) double-cmpxchg.  But slower for
> > > longer-lived/more-outstanding objects, as this hits a slower code-path,
> > > fully locked (cross-cpu) double-cmpxchg.
> > 
> > Something like this ...  (compile tested only)

Yes, exactly.

> > Considerably less code; Rao, what do you think?  
> 
> I am sorry, but I am not at all a fan of this approach.
> 
> If we are going to make this sort of change, we should do so in a way
> that allows the slab code to actually do the optimizations that might
> make this sort of thing worthwhile.  After all, if the main goal was small
> code size, the best approach is to drop kfree_bulk() and get on with life
> in the usual fashion.
> 
> I would prefer to believe that something like kfree_bulk() can help,
> and if that is the case, we should give it a chance to do things like
> group kfree_rcu() requests by destination slab and so forth, allowing
> batching optimizations that might provide more significant increases
> in performance.  Furthermore, having this in slab opens the door to
> slab taking emergency action when memory is low.

I agree with your argument. In the (slub) code I do handle different
destination slabs, but only do a limited look-ahead to find objects from
the same dest-slab, which is what gives the speedup (see
build_detached_freelist).

We do have a larger and more consistent speedup potential if we add
infrastructure that allows us to pre-sort by destination slab before
invoking kfree_bulk().  In that respect, Rao's patch is a better
approach.
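
For reference, the plain batching idea discussed in this thread (collect
pointers in a small on-stack array, flush when it is full, flush once more
after the loop) can be modelled in user space roughly as below. This is only
a sketch under assumptions: free() stands in for the kernel's kfree_bulk(),
all demo_* names are invented here, and it does no sorting by destination
slab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_BATCH 16	/* same size as the to_free[] array in the quoted patch */

struct demo_batch {
	void *ptrs[DEMO_BATCH];
	size_t nr;		/* number of valid entries in ptrs[] */
	size_t flushes;		/* bulk calls made so far (for illustration) */
};

/* Stand-in for kfree_bulk(): releases nr pointers in one call. */
static void demo_free_bulk(size_t nr, void **ptrs)
{
	for (size_t i = 0; i < nr; i++)
		free(ptrs[i]);
}

static void demo_batch_flush(struct demo_batch *b)
{
	if (b->nr == 0)
		return;
	demo_free_bulk(b->nr, b->ptrs);
	b->flushes++;
	b->nr = 0;	/* reset the index, or the next add would overflow */
}

/* Defer one object; flush first if the array is already full. */
static void demo_batch_add(struct demo_batch *b, void *ptr)
{
	assert(b != NULL);
	if (b->nr == DEMO_BATCH)
		demo_batch_flush(b);
	b->ptrs[b->nr++] = ptr;
}

/* Frees n objects through the batch; returns the number of bulk calls. */
static size_t demo_free_all(void **objs, size_t n)
{
	struct demo_batch b = { .nr = 0, .flushes = 0 };

	for (size_t i = 0; i < n; i++)
		demo_batch_add(&b, objs[i]);
	demo_batch_flush(&b);	/* flush the partial batch after the loop */
	return b.flushes;
}
```

Freeing 40 objects this way takes three bulk calls (16 + 16 + 8) instead of
40 individual ones, which is the amortization being discussed.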

> But for the patch below, NAK.
> 
>   Thanx, Paul
> 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 59c471de342a..5ac4ed077233 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -174,20 +174,19 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
> >  }
> >  #endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
> > 
> > -void kfree(const void *);
> > -
> >  /*
> >   * Reclaim the specified callback, either by invoking it (non-lazy case)
> >   * or freeing it directly (lazy case).  Return true if lazy, false otherwise.
> >   */
> > -static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
> > -static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
> > +static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head, void **kfree,
> > +   unsigned int *idx)
> >  {
> > unsigned long offset = (unsigned long)head->func;
> > 
> > rcu_lock_acquire(&rcu_callback_map);
> > if (__is_kfree_rcu_offset(offset)) {
> > RCU_TRACE(trace_rcu_invoke_kfree_callback(rn, head, offset);)
> > -   kfree((void *)head - offset);
> > +   kfree[*idx++] = (void *)head - offset;
> > rcu_lock_release(&rcu_callback_map);
> > return true;
> > } else {
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index f9c0ca2ccf0c..7e13979b4697 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2725,6 +2725,8 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
> > struct rcu_head *rhp;
> > struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
> > long bl, count;
> > +   void *to_free[16];
> > +   unsigned int to_free_idx = 0;
> > 
> > /* If no callbacks are ready, just return. */
> > if (!rcu_segcblist_ready_cbs(&rdp->cblist)) {
> > @@ -2755,8 +2757,10 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
> > rhp = rcu_cblist_dequeue(&rcl);
> > for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> > debug_rcu_head_unqueue(rhp);
> > -   if (__rcu_reclaim(rsp->name, rhp))
> > +   if (__rcu_reclaim(rsp->name, rhp, to_free, &to_free_idx))
> > rcu_cblist_dequeued_lazy(&rcl);
> > +   if (to_free_idx == 16)
> > +   kfree_bulk(16, to_free);
> > /*
> >  * Stop only if limit reached and CPU has something to do.
> >

Re: [PATCH v15 3/5] mfd: Add driver for RAVE Supervisory Processor

2017-12-19 Thread Philippe Ombredanne

Andrey,

On Wed, Dec 20, 2017 at 5:00 AM, Andrey Smirnov
 wrote:
> Add a driver for RAVE Supervisory Processor, an MCU implementing
> various bits of housekeeping functionality (watchdogging, backlight
> control, LED control, etc) on RAVE family of products by Zodiac
> Inflight Innovations.



> --- /dev/null
> +++ b/drivers/mfd/rave-sp.c
> @@ -0,0 +1,720 @@
> +/*
> + * Multifunction core driver for Zodiac Inflight Innovations
> + * SP MCU that is connected via dedicated UART port
> + *
> + * Copyright (C) 2017 Zodiac Inflight Innovations
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of
> + * the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see .
> + */

Would you mind using the new SPDX tags documented in Thomas patch set
[1] rather than this fine but longer legalese?
Thank you!

[1] https://lkml.org/lkml/2017/12/4/934

-- 
Cordially
Philippe Ombredanne

Re: [ANNOUNCE] autofs 5.1.2 release

2017-12-19 Thread Ian Kent

On 20/12/17 14:10, Ian Kent wrote:
> On 20/12/17 13:52, Ian Kent wrote:
>> On 20/12/17 11:29, NeilBrown wrote:
>>>
>>> Hi Ian,
>>>  I've been looking at:
>>>
>>>> - add configuration option to use fqdn in mounts.
>>>
>>> (commit 9aeef772604) because using this new option causes a regression.
>>> If you are using the "replicated server" functionality, then
>>>   use_hostname_for_mounts = yes
>>> completely disables it.
>>
>> Yes, that's not quite right.
>>
>> It disables the probe and proximity check for each distinct host
>> name used.
>>
>> Each of the entries in the list of hosts should still be
>> attempted and given that NFS ping is also now used in the NFS
>> mount module what's lost is the preferred ordering of the hosts
>> list.
>>
>>>
>>> This is caused by:
>>>
>>> diff --git a/modules/replicated.c b/modules/replicated.c
>>> index 32860d5fe245..8437f5f3d5b2 100644
>>> --- a/modules/replicated.c
>>> +++ b/modules/replicated.c
>>> @@ -667,6 +667,12 @@ int prune_host_list(unsigned logopt, struct host **list,
>>> if (!*list)
>>> return 0;
>>>  
>>> +   /* If we're using the host name then there's no point probing
>>> +* avialability and respose time.
>>> +*/
>>> +   if (defaults_use_hostname_for_mounts())
>>> +   return 1;
>>> +
>>> /* Use closest hosts to choose NFS version */
>>>
>>> My question is: why was this particular change made?
>>
>> It was a while ago but there were complaints about using the IP
>> address for mounts. It was requested to provide a way to prevent
>> that and force the use of the host name in mounts.
>>
>>> Why can't prune_host_list() be allowed to do its thing
>>> when use_hostname_for_mounts is set.
>>
>> We could if each host name resolved to a single IP address.
>>
>> I'd need to check that use_hostname_for_mounts doesn't get
>> in the road but the host struct should have ->rr set to true
>> if it has multiple addresses so changing it to work the way
>> you're recommending shouldn't be hard. I think there's a couple
>> of places that would need to be checked.
>>
>> If the host does resolve to multiple addresses the situation
>> is different. There's no way to stop the actual mount from
>> trying an IP address that's not responding and proximity
>> doesn't make sense either again because every time a lookup
>> is done on the host name (eg. at mount time) the next address
>> in its list will be returned which can and usually is different
>> from what would have been checked.
>>
>>> I understand that it would be pointless choosing between
>>> the different interfaces of a multi-homed host, but there is still value
>>> in choosing between multiple distinct hosts.
>>>
>>> What, if anything, might go wrong if I simply reverse this chunk of the
>>> patch?
>>
>> You'll get IP addresses in the logs in certain cases but that
>> should be all.
>>
>> It would probably be better to ensure that the checks are done
>> if the host name resolves to a single IP address.
> 
> I think that should be "if the host names in the list each resolve
> to a single IP address", otherwise the round robin behavior would
> probably still get in the road.

I think maybe this is sufficient 

autofs-5.1.4 - use proximity check if all host names are simple

From: Ian Kent 

Currently if the configuration option use_hostname_for_mounts is
set then the proximity calculation is not done for the list of
hosts.

But if each host name in the host list resolves to a single IP
address then performing the proximity check still makes sense.

Signed-off-by: Ian Kent 
---
 modules/replicated.c |   32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/modules/replicated.c b/modules/replicated.c
index 3ac4c70f..e5c2276d 100644
--- a/modules/replicated.c
+++ b/modules/replicated.c
@@ -711,6 +711,24 @@ done:
return 0;
 }
 
+static unsigned int is_hosts_list_simple(struct host *list)
+{
+   struct host *this = list;
+   unsigned int ret = 1;
+
+   while (this) {
+   struct host *next = this->next;
+
+   if (this->rr) {
+   ret = 0;
+   break;
+   }
+   this = next;
+   }
+
+   return ret;
+}
+
 int prune_host_list(unsigned logopt, struct host **list,
unsigned int vers, int port)
 {
@@ -726,12 +744,6 @@ int prune_host_list(unsigned logopt, struct host **list,
if (!*list)
return 0;
 
-   /* If we're using the host name then there's no point probing
-* avialability and respose time.
-*/
-   if (defaults_use_hostname_for_mounts())
-   return 1;
-
/* Use closest hosts to choose NFS version */
 
first = *list;
@@ -767,6 +779,14 @@ int prune_host_list(unsigned logopt, struct host **list,
return 1;
}
 
+   /* If we're using the host name then there's no point probing
+

Re: [PATCH v2] cpufreq: powernv: Add support of frequency domain

2017-12-19 Thread Viresh Kumar

On 20-12-17, 12:12, Abhishek Goel wrote:
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index b6d7c4c..fd642bc 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -37,6 +37,7 @@
>  #include  /* Required for cpu_sibling_mask() in UP configs */
>  #include 
>  #include 
> +#include 
>  
>  #define POWERNV_MAX_PSTATES  256
>  #define PMSR_PSAFE_ENABLE  (1UL << 30)
> @@ -130,6 +131,9 @@ static struct chip {
>  static int nr_chips;
>  static DEFINE_PER_CPU(struct chip *, chip_info);
>  
> +static u32 freq_domain_indicator;
> +static u32 flag;

I wouldn't name it flag, it's unreadable. Maybe it's better to name
it after the quirk you are trying to work around?

> +
>  /*
>   * Note:
>   * The set of pstates consists of contiguous integers.
> @@ -194,6 +198,38 @@ static inline void reset_gpstates(struct cpufreq_policy *policy)
>   gpstates->last_gpstate_idx = 0;
>  }
>  
> +#define SIZE NR_CPUS
> +#define ORDER_FREQ_MAP ilog2(SIZE)
> +
> +static DEFINE_HASHTABLE(freq_domain_map, ORDER_FREQ_MAP);
> +
> +struct hashmap {
> + cpumask_t mask;
> + int chip_id;
> + u32 pir_key;
> + struct hlist_node hash_node;
> +};
> +
> +static void insert(u32 key, int cpu)
> +{
> + struct hashmap *data;
> +
> + hash_for_each_possible(freq_domain_map, data, hash_node, key%SIZE) {
> + if (data->chip_id == cpu_to_chip_id(cpu) &&
> + data->pir_key == key) {
> + cpumask_set_cpu(cpu, &data->mask);
> + return;
> + }
> + }
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + hash_add(freq_domain_map, &data->hash_node, key%SIZE);
> + cpumask_set_cpu(cpu, &data->mask);
> + data->chip_id = cpu_to_chip_id(cpu);
> + data->pir_key = key;
> +
> +}
> +
>  /*
>   * Initialize the freq table based on data obtained
>   * from the firmware passed via device-tree
> @@ -206,7 +242,9 @@ static int init_powernv_pstates(void)
>   u32 len_ids, len_freqs;
>   u32 pstate_min, pstate_max, pstate_nominal;
>   u32 pstate_turbo, pstate_ultra_turbo;
> + u32 key;
>  
> + flag = 0;

Isn't flag already 0 (global-uninitialized) ?

>   power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
>   if (!power_mgt) {
>   pr_warn("power-mgt node not found\n");
> @@ -229,6 +267,17 @@ static int init_powernv_pstates(void)
>   return -ENODEV;
>   }
>  
> + if (of_device_is_compatible(power_mgt, "freq-domain-v1") &&
> + of_property_read_u32(power_mgt, "ibm,freq-domain-indicator",
> +  &freq_domain_indicator)) {
> + pr_warn("ibm,freq-domain-indicator not found\n");
> + freq_domain_indicator = 0;

You shouldn't be required to set it to 0 here.

> + }
> +
> + if (of_device_is_compatible(power_mgt, "P9-occ-quirk")) {
> + flag = 1;
> + }

Remove {} and a better name like p9_occ_quirk would be good for flag.
Also making it a bool may be better ?

> +
>   if (of_property_read_u32(power_mgt, "ibm,pstate-ultra-turbo",
>&pstate_ultra_turbo)) {
>   powernv_pstate_info.wof_enabled = false;
> @@ -249,6 +298,7 @@ static int init_powernv_pstates(void)
>  next:
>   pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
>   pstate_nominal, pstate_max);
> + pr_info("frequency domain indicator %d", freq_domain_indicator);
>   pr_info("Workload Optimized Frequency is %s in the platform\n",
>   (powernv_pstate_info.wof_enabled) ? "enabled" : "disabled");
>  
> @@ -276,6 +326,15 @@ static int init_powernv_pstates(void)
>   return -ENODEV;
>   }
>  
> + if (freq_domain_indicator) {
> + hash_init(freq_domain_map);
> + for_each_possible_cpu(i) {
> + key = ((u32) get_hard_smp_processor_id(i) &
> + freq_domain_indicator);

Maybe break it like:

key = (u32) get_hard_smp_processor_id(i);
key &= freq_domain_indicator;

to make it easily readable ?

> + insert(key, i);
> + }
> + }
> +
>   powernv_pstate_info.nr_pstates = nr_pstates;
>   pr_debug("NR PStates %d\n", nr_pstates);
>   for (i = 0; i < nr_pstates; i++) {
> @@ -693,6 +752,7 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>  {
>   struct powernv_smp_call_data freq_data;
>   unsigned int cur_msec, gpstate_idx;
> +

:(

>   struct global_pstate_info *gpstates = policy->driver_data;
>  
>   if (unlikely(rebooting) && new_index != get_nominal_index())
> @@ -760,25 +820,55 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy,
>  
>   spin_unlock(&gpstates->gpstate_lock);
>  
> - /*
> -  * Use smp_call_function to send IP

Re: [PATCH] arm64: dts: Remove leading 0x and 0s from bindings notation

2017-12-19 Thread Andy Gross

On Thu, Dec 14, 2017 at 05:53:52PM +0100, Mathieu Malaterre wrote:
> Improve the DTS files by removing all the leading "0x" and zeros to fix the
> following dtc warnings:
> 
> Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
> 
> and
> 
> Warning (unit_address_format): Node /XXX unit name should not have leading 0s
> 
> Converted using the following command:
> 
> find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +
> 
> For simplicity, two sed expressions were used to solve each warning separately.
> 
> To make the regex expression more robust a few other issues were resolved,
> namely setting unit-address to lower case, and adding a whitespace before the
> opening curly brace:
> 
> https://elinux.org/Device_Tree_Linux#Linux_conventions
> 
> This is a follow up to commit 4c9847b7375a ("dt-bindings: Remove leading 0x from bindings notation")
> 
> Reported-by: David Daney 
> Suggested-by: Rob Herring 
> Signed-off-by: Mathieu Malaterre 

Acked-by: Andy Gross

Re: [PATCH v2 2/5] mm: Extends local cpu counter vm_diff_nodestat from s8 to s16

2017-12-19 Thread kemi

On 2017年12月20日 01:21, Christopher Lameter wrote:
> On Tue, 19 Dec 2017, Michal Hocko wrote:
> 
>>> Well the reason for s8 was to keep the data structures small so that they
>>> fit in the higher level cpu caches. The larger these structures become the
>>> more cachelines are used by the counters and the larger the performance
>>> influence on the code that should not be impacted by the overhead.
>>
>> I am not sure I understand. We usually do not access more counters in
>> the single code path (well, PGALLOC and NUMA counters is more of an
>> exception). So it is rarely an advantage that the whole array is in the
>> same cache line. Besides that, this is allocated by the percpu allocator,
>> which aligns to the type size rather than cache lines AFAICS.
> 
> I thought we are talking about NUMA counters here?
> 
> Regardless: A typical fault, system call or OS action will access multiple
> zone and node counters when allocating or freeing memory. Enlarging the
> fields will increase the number of cachelines touched.
> 

Yes, we add one more cache line footprint access theoretically.
But I don't think it would be a problem.
1) Not all the counters need to be accessed in fast path of page allocation,
the counters covered in a single cache line usually is enough for that, we
probably don't need to access one more cache line. I tend to agree with Michal's
argument.
Besides, in some slow paths in which code is protected by zone lock or lru lock,
accessing one more cache line would not be a big problem since many other cache
lines are also accessed.

2) Enlarging vm_node_stat_diff from s8 to s16 gives an opportunity to keep
larger numbers in the local cpus, which provides the possibility of reducing
the global counter update frequency. Thus, we can gain the benefit of reducing
expensive cache bouncing.
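
To illustrate point 2), here is a toy user-space model of a local
differential that is folded into a shared counter once it exceeds a
threshold. It is only a sketch: the struct, the function names and the
thresholds used in the note below are assumptions made for this example,
not the actual vmstat code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a per-CPU differential folded into a shared counter. */
struct demo_counter {
	long global;		/* shared counter; updating it is the costly part */
	int16_t diff;		/* local delta, wide enough for thresholds > 127 */
	int16_t threshold;	/* fold into global once |diff| exceeds this */
	unsigned long folds;	/* number of global updates performed */
};

static void demo_counter_add(struct demo_counter *c, int delta)
{
	int x = c->diff + delta;

	assert(c->threshold > 0);
	if (x > c->threshold || x < -c->threshold) {
		c->global += x;	/* the cache-bouncing update */
		c->folds++;
		x = 0;
	}
	c->diff = (int16_t)x;
}

/* Counts global updates caused by `events` single increments. */
static unsigned long demo_folds_for(unsigned long events, int16_t threshold)
{
	struct demo_counter c = { 0, 0, threshold, 0 };

	for (unsigned long i = 0; i < events; i++)
		demo_counter_add(&c, 1);
	return c.folds;
}
```

In this model, 100000 increments cause 793 global updates with a threshold
of 125 (which still fits in s8) and 3 with a threshold of 32000 (which
needs s16).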

Well, if you still have some concerns, I can post some data for 
will-it-scale.page_fault1.
What the benchmark does is: it forks nr_cpu processes and then each
process does the following:
1 mmap() 128M anonymous space;
2 writes to each page there to trigger actual page allocation;
3 munmap() it.
in a loop.
https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c

Or you can provide some other benchmarks on which you want to see performance 
impact.

>> Maybe it used to be all different back then when the code has been added
>> but arguing about cache lines seems to be a bit problematic here. Maybe
>> you have some specific workloads which can prove me wrong?
> 
> Run a workload that does some page faults? Heavy allocation and freeing of
> memory?
> 
> Maybe that is no longer relevant since the number of the counters is
> large that the accesses are so sparse that each action pulls in a whole
> cacheline. That would be something we tried to avoid when implementing
> the differentials.
> 
>

Re: [PATCH v2] IPI performance benchmark

2017-12-19 Thread Wanpeng Li

Hi Yury,
2017-12-19 16:50 GMT+08:00 Yury Norov :
> This benchmark sends many IPIs in different modes and measures
> time for IPI delivery (first column), and total time, ie including
> time to acknowledge the receive by sender (second column).
>
> The scenarios are:
> Dry-run:        do everything except actually sending IPI. Useful
>                 to estimate system overhead.
> Self-IPI:       Send IPI to self CPU.
> Normal IPI:     Send IPI to some other CPU.
> Broadcast IPI:  Send broadcast IPI to all online CPUs.
> Broadcast lock: Send broadcast IPI to all online CPUs and force them
>                 acquire/release spinlock.
>
> The raw output looks like this:
> [  155.363374] Dry-run:                  0,    2999696 ns
> [  155.429162] Self-IPI:          30385328,   65589392 ns
> [  156.060821] Normal IPI:       566914128,  631453008 ns
> [  158.384427] Broadcast IPI:            0, 2323368720 ns
> [  160.831850] Broadcast lock:           0, 2447000544 ns
>
> For virtualized guests, sending and receiving IPIs causes guest exit.
> I used this test to measure performance impact on KVM subsystem of
> Christoffer Dall's series "Optimize KVM/ARM for VHE systems" [1].
>
> Test machine is ThunderX2, 112 online CPUs. Below the results normalized
> to host dry-run time, broadcast lock results omitted. Smaller - better.

Could you test on an x86 box? I see a lot of call traces on my Haswell
client host. There is no call trace in the guest; however, I can still
observe an "Invalid parameters" warning when I insmod this module. In
addition, the x86 box fails to boot when ipi_benchmark is built in.

Regards,
Wanpeng Li

[PATCH v2] cpufreq: powernv: Add support of frequency domain

2017-12-19 Thread Abhishek Goel

A frequency domain is a group of CPUs that share the same frequency.
It is detected using the device-tree node "frequency-domain-indicator".
frequency-domain-indicator is a bitmask whose value depends on the
generation of the processor.

CPUs of the same chip for which the result of a bitwise AND between
their PIR and the frequency-domain-indicator is the same share the same
frequency.

In this patch, we define a hash table indexed by the aforementioned
bitwise ANDed value to store the cpumask of the CPUs sharing the same
frequency domain. Further, the cpufreq policy will be created per
frequency domain.

So for POWER9, a cpufreq policy is created per quad, while for POWER8 it
is created per core. The governor decides the frequency for each policy, but
multiple cores may come under the same policy. In such a case the frequency
needs to be set on each core sharing that policy.
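
The grouping rule described above can be sketched in plain userspace C.
This is only an illustration under assumptions, not the driver code: a
small linear table stands in for the hash table, and struct freq_domain,
domain_add_cpu(), MAX_DOMAINS and any PIR or indicator values passed in
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DOMAINS 16	/* arbitrary bound for this sketch */

struct freq_domain {
	int chip_id;
	uint32_t key;		/* PIR & frequency-domain-indicator */
	uint64_t cpu_mask;	/* bit n set => logical CPU n is in the domain */
};

/*
 * Add a CPU to the domain matching (chip_id, pir & indicator), creating the
 * domain if needed. Returns the domain index, or -1 if the table is full.
 */
static int domain_add_cpu(struct freq_domain *tbl, int *nr, int cpu,
			  int chip_id, uint32_t pir, uint32_t indicator)
{
	uint32_t key = pir & indicator;

	assert(cpu >= 0 && cpu < 64);
	for (int i = 0; i < *nr; i++) {
		if (tbl[i].chip_id == chip_id && tbl[i].key == key) {
			tbl[i].cpu_mask |= UINT64_C(1) << cpu;
			return i;
		}
	}
	if (*nr == MAX_DOMAINS)
		return -1;
	tbl[*nr] = (struct freq_domain){ chip_id, key, UINT64_C(1) << cpu };
	return (*nr)++;
}
```

Two CPUs on the same chip land in the same domain only if their PIRs
agree on every bit set in the indicator; the same key on a different
chip creates a separate domain.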

Signed-off-by: Abhishek Goel 
---
 drivers/cpufreq/powernv-cpufreq.c | 110 ++
 1 file changed, 100 insertions(+), 10 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c 
b/drivers/cpufreq/powernv-cpufreq.c
index b6d7c4c..fd642bc 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -37,6 +37,7 @@
 #include  /* Required for cpu_sibling_mask() in UP configs */
 #include 
 #include 
+#include 
 
 #define POWERNV_MAX_PSTATES256
 #define PMSR_PSAFE_ENABLE  (1UL << 30)
@@ -130,6 +131,9 @@ static struct chip {
 static int nr_chips;
 static DEFINE_PER_CPU(struct chip *, chip_info);
 
+static u32 freq_domain_indicator;
+static u32 flag;
+
 /*
  * Note:
  * The set of pstates consists of contiguous integers.
@@ -194,6 +198,38 @@ static inline void reset_gpstates(struct cpufreq_policy 
*policy)
gpstates->last_gpstate_idx = 0;
 }
 
+#define SIZE NR_CPUS
+#define ORDER_FREQ_MAP ilog2(SIZE)
+
+static DEFINE_HASHTABLE(freq_domain_map, ORDER_FREQ_MAP);
+
+struct hashmap {
+   cpumask_t mask;
+   int chip_id;
+   u32 pir_key;
+   struct hlist_node hash_node;
+};
+
+static void insert(u32 key, int cpu)
+{
+   struct hashmap *data;
+
+   hash_for_each_possible(freq_domain_map, data, hash_node, key%SIZE) {
+   if (data->chip_id == cpu_to_chip_id(cpu) &&
+   data->pir_key == key) {
+   cpumask_set_cpu(cpu, &data->mask);
+   return;
+   }
+   }
+
+   data = kzalloc(sizeof(*data), GFP_KERNEL);
+   hash_add(freq_domain_map, &data->hash_node, key%SIZE);
+   cpumask_set_cpu(cpu, &data->mask);
+   data->chip_id = cpu_to_chip_id(cpu);
+   data->pir_key = key;
+
+}
+
 /*
  * Initialize the freq table based on data obtained
  * from the firmware passed via device-tree
@@ -206,7 +242,9 @@ static int init_powernv_pstates(void)
u32 len_ids, len_freqs;
u32 pstate_min, pstate_max, pstate_nominal;
u32 pstate_turbo, pstate_ultra_turbo;
+   u32 key;
 
+   flag = 0;
power_mgt = of_find_node_by_path("/ibm,opal/power-mgt");
if (!power_mgt) {
pr_warn("power-mgt node not found\n");
@@ -229,6 +267,17 @@ static int init_powernv_pstates(void)
return -ENODEV;
}
 
+   if (of_device_is_compatible(power_mgt, "freq-domain-v1") &&
+   of_property_read_u32(power_mgt, "ibm,freq-domain-indicator",
+&freq_domain_indicator)) {
+   pr_warn("ibm,freq-domain-indicator not found\n");
+   freq_domain_indicator = 0;
+   }
+
+   if (of_device_is_compatible(power_mgt, "P9-occ-quirk")) {
+   flag = 1;
+   }
+
if (of_property_read_u32(power_mgt, "ibm,pstate-ultra-turbo",
 &pstate_ultra_turbo)) {
powernv_pstate_info.wof_enabled = false;
@@ -249,6 +298,7 @@ static int init_powernv_pstates(void)
 next:
pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
pstate_nominal, pstate_max);
+   pr_info("frequency domain indicator %d", freq_domain_indicator);
pr_info("Workload Optimized Frequency is %s in the platform\n",
(powernv_pstate_info.wof_enabled) ? "enabled" : "disabled");
 
@@ -276,6 +326,15 @@ static int init_powernv_pstates(void)
return -ENODEV;
}
 
+   if (freq_domain_indicator) {
+   hash_init(freq_domain_map);
+   for_each_possible_cpu(i) {
+   key = ((u32) get_hard_smp_processor_id(i) &
+   freq_domain_indicator);
+   insert(key, i);
+   }
+   }
+
powernv_pstate_info.nr_pstates = nr_pstates;
pr_debug("NR PStates %d\n", nr_pstates);
for (i = 0; i < nr_pstates; i++) {
@@ -693,6 +752,7 @@ static int powernv_cpufreq_target_index(struct 
cpufreq_policy *policy,
 {
struct powernv_smp_call_data freq_data;

[PATCH 1/2] clk: mediatek: group drivers under independent menu

2017-12-19 Thread sean.wang

From: Sean Wang 

Many MediaTek clock drivers have been added to CCF, so it is better to
group them under an independent menu to simplify configuration
selection. In addition, really trivial fixups for typos are included in
the same patch.

Signed-off-by: Sean Wang 
---
 drivers/clk/mediatek/Kconfig | 96 +++-
 1 file changed, 50 insertions(+), 46 deletions(-)

diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig
index 59dc0aa..7338f81 100644
--- a/drivers/clk/mediatek/Kconfig
+++ b/drivers/clk/mediatek/Kconfig
@@ -1,136 +1,139 @@
 #
-# MediaTek SoC drivers
+# MediaTek Clock Drivers
 #
+menu "Clock driver for MediaTek SoC"
+   depends on ARCH_MEDIATEK || COMPILE_TEST
+
 config COMMON_CLK_MEDIATEK
bool
---help---
- Mediatek SoCs' clock support.
+ MediaTek SoCs' clock support.
 
 config COMMON_CLK_MT2701
-   bool "Clock driver for Mediatek MT2701"
+   bool "Clock driver for MediaTek MT2701"
depends on (ARCH_MEDIATEK && ARM) || COMPILE_TEST
select COMMON_CLK_MEDIATEK
default ARCH_MEDIATEK && ARM
---help---
- This driver supports Mediatek MT2701 basic clocks.
+ This driver supports MediaTek MT2701 basic clocks.
 
 config COMMON_CLK_MT2701_MMSYS
-   bool "Clock driver for Mediatek MT2701 mmsys"
+   bool "Clock driver for MediaTek MT2701 mmsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 mmsys clocks.
+ This driver supports MediaTek MT2701 mmsys clocks.
 
 config COMMON_CLK_MT2701_IMGSYS
-   bool "Clock driver for Mediatek MT2701 imgsys"
+   bool "Clock driver for MediaTek MT2701 imgsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 imgsys clocks.
+ This driver supports MediaTek MT2701 imgsys clocks.
 
 config COMMON_CLK_MT2701_VDECSYS
-   bool "Clock driver for Mediatek MT2701 vdecsys"
+   bool "Clock driver for MediaTek MT2701 vdecsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 vdecsys clocks.
+ This driver supports MediaTek MT2701 vdecsys clocks.
 
 config COMMON_CLK_MT2701_HIFSYS
-   bool "Clock driver for Mediatek MT2701 hifsys"
+   bool "Clock driver for MediaTek MT2701 hifsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 hifsys clocks.
+ This driver supports MediaTek MT2701 hifsys clocks.
 
 config COMMON_CLK_MT2701_ETHSYS
-   bool "Clock driver for Mediatek MT2701 ethsys"
+   bool "Clock driver for MediaTek MT2701 ethsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 ethsys clocks.
+ This driver supports MediaTek MT2701 ethsys clocks.
 
 config COMMON_CLK_MT2701_BDPSYS
-   bool "Clock driver for Mediatek MT2701 bdpsys"
+   bool "Clock driver for MediaTek MT2701 bdpsys"
depends on COMMON_CLK_MT2701
---help---
- This driver supports Mediatek MT2701 bdpsys clocks.
+ This driver supports MediaTek MT2701 bdpsys clocks.
 
 config COMMON_CLK_MT2712
-   bool "Clock driver for Mediatek MT2712"
+   bool "Clock driver for MediaTek MT2712"
depends on (ARCH_MEDIATEK && ARM64) || COMPILE_TEST
select COMMON_CLK_MEDIATEK
default ARCH_MEDIATEK && ARM64
---help---
- This driver supports Mediatek MT2712 basic clocks.
+ This driver supports MediaTek MT2712 basic clocks.
 
 config COMMON_CLK_MT2712_BDPSYS
-   bool "Clock driver for Mediatek MT2712 bdpsys"
+   bool "Clock driver for MediaTek MT2712 bdpsys"
depends on COMMON_CLK_MT2712
---help---
- This driver supports Mediatek MT2712 bdpsys clocks.
+ This driver supports MediaTek MT2712 bdpsys clocks.
 
 config COMMON_CLK_MT2712_IMGSYS
-   bool "Clock driver for Mediatek MT2712 imgsys"
+   bool "Clock driver for MediaTek MT2712 imgsys"
depends on COMMON_CLK_MT2712
---help---
- This driver supports Mediatek MT2712 imgsys clocks.
+ This driver supports MediaTek MT2712 imgsys clocks.
 
 config COMMON_CLK_MT2712_JPGDECSYS
-   bool "Clock driver for Mediatek MT2712 jpgdecsys"
+   bool "Clock driver for MediaTek MT2712 jpgdecsys"
depends on COMMON_CLK_MT2712
---help---
- This driver supports Mediatek MT2712 jpgdecsys clocks.
+ This driver supports MediaTek MT2712 jpgdecsys clocks.
 
 config COMMON_CLK_MT2712_MFGCFG
-   bool "Clock driver for Mediatek MT2712 mfgcfg"
+   bool "Clock driver for MediaTek MT2712 mfgcfg"
depends on COMMON_CLK_MT2712
---help---
- This driver supports Mediatek MT2712 mfgcfg clocks.
+ This driver supports MediaTek MT2712 mfgcfg clocks.
 
 config COMMON_CLK_MT2712_M

[PATCH 2/2] clk: mediatek: fixup test-building of MediaTek clock drivers

2017-12-19 Thread sean.wang

From: Sean Wang 

Let the build system always descend into the directory where the clock
drivers reside, so that the COMPILE_TEST alternative dependency allows
test-building the drivers.

Signed-off-by: Sean Wang 
---
 drivers/clk/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
index f7f761b..2e3fb08 100644
--- a/drivers/clk/Makefile
+++ b/drivers/clk/Makefile
@@ -67,7 +67,7 @@ obj-$(CONFIG_ARCH_MXC)+= imx/
 obj-$(CONFIG_MACH_INGENIC) += ingenic/
 obj-$(CONFIG_ARCH_KEYSTONE)+= keystone/
 obj-$(CONFIG_MACH_LOONGSON32)  += loongson1/
-obj-$(CONFIG_ARCH_MEDIATEK)+= mediatek/
+obj-y  += mediatek/
 obj-$(CONFIG_COMMON_CLK_AMLOGIC)   += meson/
 obj-$(CONFIG_MACH_PIC32)   += microchip/
 ifeq ($(CONFIG_COMMON_CLK), y)
-- 
2.7.4

Re: [PATCH 1/3] phy: core: Move runtime PM reference counting to the parent device

2017-12-19 Thread Kishon Vijay Abraham I

Hi Ulf,

On Wednesday 20 December 2017 02:52 AM, Ulf Hansson wrote:
> The runtime PM deployment in the phy core is a bit unnecessarily complicated,
> and the main reason is that it operates on the phy device, which is
> created by the phy core and assigned as a child device of the phy provider
> device.
> 
> Let's simplify the code by replacing the existing calls to
> phy_pm_runtime_get_sync() and phy_pm_runtime_put() with regular calls to
> pm_runtime_get_sync() and pm_runtime_put(). While doing that, let's also
> change to give the phy provider device as the parameter to the runtime PM
> calls. This, together with adding error paths that allow the phy
> provider device to be runtime PM disabled, enables further cleanup of the
> code. More precisely, we can simply avoid enabling runtime PM for the phy
> device altogether, so let's do that as well.
> 
> More importantly, this change also fixes an issue for system suspend.
> Especially in those cases when the phy provider device gets put into a low
> power state via calling the pm_runtime_force_suspend() helper, as is the
> case for a Renesas SoC, which has the phy provider device attached to the
> generic PM domain.
> 
> The problem in this case, is that pm_runtime_force_suspend() expects the
> child device of the provider device to be runtime suspended, else this will
> trigger a WARN splat (correctly) when runtime PM gets re-enabled at system
> resume.
> 
> In the current case, even if phy_power_off() triggers a pm_runtime_put()
> during system suspend, the phy device (child) doesn't get runtime suspended,
> because that is prevented in the system suspend phases. However, by
> not enabling runtime PM, this problem goes away.
> 
> Signed-off-by: Ulf Hansson 
> ---
>  drivers/phy/phy-core.c | 33 +
>  1 file changed, 13 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
> index b4964b0..9fa3f13 100644
> --- a/drivers/phy/phy-core.c
> +++ b/drivers/phy/phy-core.c
> @@ -222,10 +222,10 @@ int phy_init(struct phy *phy)
>   if (!phy)
>   return 0;
>  
> - ret = phy_pm_runtime_get_sync(phy);
> - if (ret < 0 && ret != -ENOTSUPP)
> + ret = pm_runtime_get_sync(phy->dev.parent);

Won't this make phy-core manage pm_runtime of phy_provider even though the
phy_provider might not intend it?

Thanks
Kishon

Re: [PATCH v5 15/15] devicetree: bindings: Document qcom,pvs

2017-12-19 Thread Sricharan R

Hi Viresh,

On 12/20/2017 11:57 AM, Viresh Kumar wrote:
> On 20-12-17, 11:55, Sricharan R wrote:
>>>> +  opp-14 {
>>>> +  opp-hz = /bits/ 64 <14>;
>>>> +  opp-microvolt-speed0-pvs0-v0 = <125>;
>>>
>>> Why speed0 and v0 in all the names ?
>>>
>>
>>  Ya, all the three (speed, pvs and version) are read from efuse. So all the
>>  three can vary.
> 
> Okay, so maybe in the example you should have a mix of all the
> combinations to show how these things work ? You only showed values
> for a single efuse configuration currently.
> 

 Ha ok. Will add other examples as well.

Regards,
 Sricharan

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCH 1/3] x86/entry: Fix idtentry unwind hint

2017-12-19 Thread Josh Poimboeuf

On Wed, Dec 20, 2017 at 05:41:44AM +, Andrey Vagin wrote:
> Hi Josh,
> 
> 
> Now I see these two warnings on Linus' tree:
> 
> [1.902454] WARNING: stack recursion on stack type 1
> [1.902466] WARNING: can't dereference iret registers at cd089a12 
> for ip entry_SYSCALL_64_fastpath+0x5/0x86

This still looks like the same issue where ORC is getting confused by
paravirt patching.  Unfortunately the patches which fix this got
preempted by other work again.  I haven't forgotten about it.

2017 is out the window, but hopefully in January I'll get a chance to
revive the patches.

-- 
Josh

Re: [PATCH] CIFS: SMBD: fix configurations with INFINIBAND=m

2017-12-19 Thread Stefan Metzmacher

Am 19.12.2017 um 22:21 schrieb Long Li via samba-technical:
>>  depends on CIFS && INFINIBAND
>> +depends on CIFS=m || INFINIBAND=y
> 
> How about we change them to
> 
> depends on CIFS=m && INFINIBAND || CIFS=y && INFINIBAND=y
> 
> This makes it easy to read.

I like it :-)

metze
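
To see why the two forms differ, the expressions can be evaluated with a
small model of Kconfig's tristate logic. This is a sketch, assuming the
usual Kconfig rules (n < m < y, "&&" is the minimum, "||" is the maximum,
"SYM=val" is y on a match and n otherwise); the helper names are
hypothetical.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;	/* && is the minimum */
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;	/* || is the maximum */
}

static enum tristate tri_eq(enum tristate sym, enum tristate val)
{
	return sym == val ? TRI_Y : TRI_N;	/* SYM=val is y or n */
}

/* depends on CIFS && INFINIBAND */
static enum tristate dep_old(enum tristate cifs, enum tristate ib)
{
	return tri_and(cifs, ib);
}

/* depends on CIFS=m && INFINIBAND || CIFS=y && INFINIBAND=y */
static enum tristate dep_new(enum tristate cifs, enum tristate ib)
{
	return tri_or(tri_and(tri_eq(cifs, TRI_M), ib),
		      tri_and(tri_eq(cifs, TRI_Y), tri_eq(ib, TRI_Y)));
}
```

For CIFS=y and INFINIBAND=m, dep_old() yields m while dep_new() yields
n, so the problematic combination is excluded by the proposed form.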





Re: [PATCH v3 14/16] phy: Add notify_speed callback

2017-12-19 Thread Manu Gautam

Hi,

On 12/20/2017 11:19 AM, Kishon Vijay Abraham I wrote:
> Hi,
>
> On Tuesday 12 December 2017 08:54 PM, Manu Gautam wrote:
>> Hi,
>>
>>
>> On 12/12/2017 5:13 PM, Kishon Vijay Abraham I wrote:
>>> Hi,
>>>
>>> On Tuesday 21 November 2017 02:53 PM, Manu Gautam wrote:
>>>> QCOM USB PHYs can monitor resume/remote-wakeup event in
>>>> suspended state. However PHY driver must know current
>>>> operational speed of PHY in order to set correct polarity of
>>>> wakeup events for detection. E.g. QUSB2 PHY monitors DP/DM
>>>> signals depending on speed is LS or FS/HS to detect resume.
>>>> Similarly QMP USB3 PHY in SS mode should monitor RX
>>>> terminations attach/detach and LFPS events depending on
>>>> SSPHY is active or not.
> Why not use a notification mechanism instead of adding new APIs in phy-core.
> This will only bloat phy-core with APIs for a particular platform.

Do you mean notifier chains?
When we have multiple instances of USB PHYs, notifier chains are not
of much help. For any platform glue or PHY driver it will be very difficult to
figure out whether a speed notification was for the same phy/bus or a
different one.
Using PHY callbacks looked more elegant to me. Additionally, PHY drivers
can also use this info to decide power management policy, e.g. if the speed is
INVALID then it means the PHY is not in a session and it can enter the deepest
low power state.
Also, if you prefer the name set_speed over notify_speed, then I am
ok with that as well, so that it sounds more generic.

>
> Thanks
> Kishon
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v5 15/15] devicetree: bindings: Document qcom,pvs

2017-12-19 Thread Viresh Kumar

On 20-12-17, 11:55, Sricharan R wrote:
> >> +  opp-14 {
> >> +  opp-hz = /bits/ 64 <14>;
> >> +  opp-microvolt-speed0-pvs0-v0 = <125>;
> > 
> > Why speed0 and v0 in all the names ?
> > 
> 
>  Ya, all the three (speed, pvs and version) are read from efuse. So all the
>  three can vary.

Okay, so maybe in the example you should have a mix of all the
combinations to show how these things work ? You only showed values
for a single efuse configuration currently.

-- 
viresh

Re: [PATCH v5 15/15] devicetree: bindings: Document qcom,pvs

2017-12-19 Thread Sricharan R

Hi Viresh,

On 12/20/2017 8:56 AM, Viresh Kumar wrote:
> On 19-12-17, 21:25, Sricharan R wrote:
>> +cpu@0 {
>> +compatible = "qcom,krait";
>> +enable-method = "qcom,kpss-acc-v1";
>> +device_type = "cpu";
>> +reg = <0>;
>> +qcom,acc = <&acc0>;
>> +qcom,saw = <&saw0>;
>> +clocks = <&kraitcc 0>;
>> +clock-names = "cpu";
>> +cpu-supply = <&smb208_s2a>;
>> +operating-points-v2 = <&cpu_opp_table>;
>> +};
>> +
>> +qcom,pvs {
>> +qcom,pvs-format-a;
>> +};
> 
> Not sure what Rob is going to say on that :)
> 

 Yes. Would be good to know the best way.

>> +
>> +
>> +cpu_opp_table: opp_table {
>> +compatible = "operating-points-v2";
>> +
>> +/*
>> + * Missing opp-shared property means CPUs switch DVFS states
>> + * independently.
>> + */
>> +
>> +opp-14 {
>> +opp-hz = /bits/ 64 <14>;
>> +opp-microvolt-speed0-pvs0-v0 = <125>;
> 
> Why speed0 and v0 in all the names ?
> 

 Ya, all the three (speed, pvs and version) are read from efuse. So all the
 three can vary.
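
 As a rough illustration of how the three values select one property, the
 name could be composed as below. This is a userspace sketch only:
 pvs_prop_name() is a hypothetical helper, and the inputs stand for whatever
 is actually read from efuse.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the per-configuration property name used in the example binding,
 * e.g. "opp-microvolt-speed0-pvs0-v0". Returns the name length on success,
 * or -1 if the buffer is too small. The three inputs stand for values read
 * from efuse.
 */
static int pvs_prop_name(char *buf, size_t len, int speed, int pvs, int version)
{
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, "opp-microvolt-speed%d-pvs%d-v%d",
		     speed, pvs, version);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

 A part fused as speed 1, pvs 3, version 2 would then look up
 "opp-microvolt-speed1-pvs3-v2" instead of the single name in the example.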

Regards,
 Sricharan

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation

[PATCH v6 3/3] eeprom: at24: switch to device-managed version of i2c_new_dummy

2017-12-19 Thread Heiner Kallweit

Make use of recently introduced device-managed version of
i2c_new_dummy to simplify the code.

Signed-off-by: Heiner Kallweit 
---
v2:
- small improvements regarding code readability
v3:
- no changes
v4:
- no changes
v5:
- no changes
v6:
- rebased
---
 drivers/misc/eeprom/at24.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
index b44a3d2b2..6232765bb 100644
--- a/drivers/misc/eeprom/at24.c
+++ b/drivers/misc/eeprom/at24.c
@@ -626,21 +626,19 @@ static int at24_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
 
/* use dummy devices for multiple-address chips */
for (i = 1; i < num_addresses; i++) {
-   at24->client[i].client = i2c_new_dummy(client->adapter,
-  client->addr + i);
-   if (!at24->client[i].client) {
+   struct at24_client *cl;
+
+   cl = &at24->client[i];
+   cl->client = devm_i2c_new_dummy(&client->dev, client->adapter,
+   client->addr + i);
+   if (IS_ERR(cl->client)) {
dev_err(&client->dev, "address 0x%02x unavailable\n",
-   client->addr + i);
-   err = -EADDRINUSE;
-   goto err_clients;
-   }
-   at24->client[i].regmap = devm_regmap_init_i2c(
-   at24->client[i].client,
-   &regmap_config);
-   if (IS_ERR(at24->client[i].regmap)) {
-   err = PTR_ERR(at24->client[i].regmap);
-   goto err_clients;
+   client->addr + i);
+   return PTR_ERR(cl->client);
}
+   cl->regmap = devm_regmap_init_i2c(cl->client, &regmap_config);
+   if (IS_ERR(cl->regmap))
+   return PTR_ERR(cl->regmap);
}
 
i2c_set_clientdata(client, at24);
@@ -692,10 +690,6 @@ static int at24_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
return 0;
 
 err_clients:
-   for (i = 1; i < num_addresses; i++)
-   if (at24->client[i].client)
-   i2c_unregister_device(at24->client[i].client);
-
pm_runtime_disable(&client->dev);
 
return err;
@@ -704,15 +698,11 @@ static int at24_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
 static int at24_remove(struct i2c_client *client)
 {
struct at24_data *at24;
-   int i;
 
at24 = i2c_get_clientdata(client);
 
nvmem_unregister(at24->nvmem);
 
-   for (i = 1; i < at24->num_addresses; i++)
-   i2c_unregister_device(at24->client[i].client);
-
pm_runtime_disable(&client->dev);
pm_runtime_set_suspended(&client->dev);
 
-- 
2.15.1

[PATCH v6 1/3] i2c: core: improve return value handling of i2c_new_device and i2c_new_dummy

2017-12-19 Thread Heiner Kallweit

Currently i2c_new_device and i2c_new_dummy return just NULL in the error
case, although they have more error details internally. Therefore move
the functionality into new functions returning detailed errors and
add wrappers for compatibility with the current API.

This allows using these functions with detailed error codes within
the i2c core or for API extensions.
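
The pattern can be modelled in userspace as follows. This is a sketch, not
the i2c core code: err_ptr(), is_err() and ptr_err() are stand-ins for the
kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and struct client with its two
constructors is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct client { int addr; };	/* hypothetical stand-in for struct i2c_client */

/* New-style constructor: reports why it failed. */
static struct client *client_new_detailed(int addr)
{
	struct client *c;

	if (addr < 0 || addr > 0x7f)
		return err_ptr(-EINVAL);
	c = calloc(1, sizeof(*c));
	if (!c)
		return err_ptr(-ENOMEM);
	c->addr = addr;
	return c;
}

/* Compatibility wrapper: keeps the old "NULL on error" contract. */
static struct client *client_new(int addr)
{
	struct client *c = client_new_detailed(addr);

	return is_err(c) ? NULL : c;
}
```

Existing callers keep checking for NULL, while new callers can use the
detailed variant and propagate the error code.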

Signed-off-by: Heiner Kallweit 
Reviewed-by: Bartosz Golaszewski 
---
v3:
- prefix i2c_new_device and i2c_new_dummy with two underscores
  instead one
v4:
- add missing kernel doc
- add reviewed-by
v5:
- fix a copy & paste error in a kernel doc comment
v6:
- no changes
---
 drivers/i2c/i2c-core-base.c | 70 -
 1 file changed, 57 insertions(+), 13 deletions(-)

diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index bb34a5d41..cb3f29fb8 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -656,7 +656,7 @@ static int i2c_dev_irq_from_resources(const struct resource 
*resources,
 }
 
 /**
- * i2c_new_device - instantiate an i2c device
+ * __i2c_new_device - instantiate an i2c device
  * @adap: the adapter managing the device
  * @info: describes one I2C device; bus_num is ignored
  * Context: can sleep
@@ -669,17 +669,17 @@ static int i2c_dev_irq_from_resources(const struct 
resource *resources,
  * before any i2c_adapter could exist.
  *
  * This returns the new i2c client, which may be saved for later use with
- * i2c_unregister_device(); or NULL to indicate an error.
+ * i2c_unregister_device(); or an ERR_PTR to indicate an error.
  */
-struct i2c_client *
-i2c_new_device(struct i2c_adapter *adap, struct i2c_board_info const *info)
+static struct i2c_client *
+__i2c_new_device(struct i2c_adapter *adap, struct i2c_board_info const *info)
 {
struct i2c_client   *client;
int status;
 
client = kzalloc(sizeof *client, GFP_KERNEL);
if (!client)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
client->adapter = adap;
 
@@ -746,7 +746,29 @@ i2c_new_device(struct i2c_adapter *adap, struct 
i2c_board_info const *info)
client->name, client->addr, status);
 out_err_silent:
kfree(client);
-   return NULL;
+   return ERR_PTR(status);
+}
+
+/**
+ * i2c_new_device - instantiate an i2c device
+ * @adap: the adapter managing the device
+ * @info: describes one I2C device; bus_num is ignored
+ * Context: can sleep
+ *
+ * This function has the same functionality as __i2c_new_device, it just
+ * returns NULL instead of an ERR_PTR in case of an error for compatibility
+ * with current I2C API.
+ *
+ * This returns the new i2c client, which may be saved for later use with
+ * i2c_unregister_device(); or NULL to indicate an error.
+ */
+struct i2c_client *
+i2c_new_device(struct i2c_adapter *adap, struct i2c_board_info const *info)
+{
+   struct i2c_client *ret;
+
+   ret = __i2c_new_device(adap, info);
+   return IS_ERR(ret) ? NULL : ret;
 }
 EXPORT_SYMBOL_GPL(i2c_new_device);
 
@@ -793,7 +815,7 @@ static struct i2c_driver dummy_driver = {
 };
 
 /**
- * i2c_new_dummy - return a new i2c device bound to a dummy driver
+ * __i2c_new_dummy - return a new i2c device bound to a dummy driver
  * @adapter: the adapter managing the device
  * @address: seven bit address to be used
  * Context: can sleep
@@ -808,15 +830,37 @@ static struct i2c_driver dummy_driver = {
  * different driver.
  *
  * This returns the new i2c client, which should be saved for later use with
- * i2c_unregister_device(); or NULL to indicate an error.
+ * i2c_unregister_device(); or an ERR_PTR to indicate an error.
  */
-struct i2c_client *i2c_new_dummy(struct i2c_adapter *adapter, u16 address)
+static struct i2c_client *
+__i2c_new_dummy(struct i2c_adapter *adapter, u16 address)
 {
struct i2c_board_info info = {
I2C_BOARD_INFO("dummy", address),
};
 
-   return i2c_new_device(adapter, &info);
+   return __i2c_new_device(adapter, &info);
+}
+
+/**
+ * i2c_new_dummy - return a new i2c device bound to a dummy driver
+ * @adapter: the adapter managing the device
+ * @address: seven bit address to be used
+ * Context: can sleep
+ *
+ * This function has the same functionality as __i2c_new_dummy, it just
+ * returns NULL instead of an ERR_PTR in case of an error for compatibility
+ * with current I2C API.
+ *
+ * This returns the new i2c client, which should be saved for later use with
+ * i2c_unregister_device(); or NULL to indicate an error.
+ */
+struct i2c_client *i2c_new_dummy(struct i2c_adapter *adapter, u16 address)
+{
+   struct i2c_client *ret;
+
+   ret = __i2c_new_dummy(adapter, address);
+   return IS_ERR(ret) ? NULL : ret;
 }
 EXPORT_SYMBOL_GPL(i2c_new_dummy);
 
@@ -939,9 +983,9 @@ i2c_sysfs_new_device(struct device *dev, struct 
device_attribute *attr,
info.flags |= I2C_CLIENT_SLAVE;
}
 
-   client

[PATCH v6 2/3] i2c: core: add device-managed version of i2c_new_dummy

2017-12-19 Thread Heiner Kallweit

i2c_new_dummy is typically called from the probe function of the
driver for the primary i2c client. It requires calls to
i2c_unregister_device in the error path of the probe function and
in the remove function.
This can be simplified by introducing a device-managed version.

Note the changed error case return value type:
i2c_new_dummy returns NULL whilst devm_i2c_new_dummy returns an ERR_PTR.
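
The idea behind a device-managed helper can be modelled in userspace as
below. This is a sketch under assumptions, not the devres implementation:
struct device_model, devm_add(), devm_release_all() and mark_released()
are hypothetical names. Each managed resource registers a release callback
with its owning device, and all callbacks run when the device goes away,
so neither the probe error path nor remove has to unregister by hand.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of a device with managed resources. */
struct devres_node {
	void (*release)(void *data);
	void *data;
	struct devres_node *next;
};

struct device_model {
	struct devres_node *resources;	/* most recently added first */
};

/* Register a release callback to run when the device goes away. */
static int devm_add(struct device_model *dev, void (*release)(void *),
		    void *data)
{
	struct devres_node *n;

	assert(dev && release);
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->release = release;
	n->data = data;
	n->next = dev->resources;
	dev->resources = n;
	return 0;
}

/* Run all release callbacks, newest first; returns how many ran. */
static int devm_release_all(struct device_model *dev)
{
	int count = 0;

	while (dev->resources) {
		struct devres_node *n = dev->resources;

		dev->resources = n->next;
		n->release(n->data);
		free(n);
		count++;
	}
	return count;
}

/* Example release callback: flags the resource as released. */
static void mark_released(void *data)
{
	*(int *)data = 1;
}
```

In this model a driver would call devm_add() once per dummy client and
drop its own cleanup loops.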

Signed-off-by: Heiner Kallweit 
---
v2:
- use new function _i2c_new_dummy with detailed error codes
v3:
- no changes
v4:
- reflect renaming to __i2c_new_dummy
v5:
- improve readability by adding struct i2c_dummy_devres
v6:
- add braces to function name in documentation
---
 Documentation/driver-model/devres.txt |  3 +++
 drivers/i2c/i2c-core-base.c   | 45 +++
 include/linux/i2c.h   |  3 +++
 3 files changed, 51 insertions(+)

diff --git a/Documentation/driver-model/devres.txt 
b/Documentation/driver-model/devres.txt
index c180045eb..22a40deed 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -259,6 +259,9 @@ GPIO
   devm_gpio_request_one()
   devm_gpio_free()
 
+I2C
+  devm_i2c_new_dummy()
+
 IIO
   devm_iio_device_alloc()
   devm_iio_device_free()
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index cb3f29fb8..4b05abbfa 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -864,6 +864,51 @@ struct i2c_client *i2c_new_dummy(struct i2c_adapter 
*adapter, u16 address)
 }
 EXPORT_SYMBOL_GPL(i2c_new_dummy);
 
+struct i2c_dummy_devres {
+   struct i2c_client *client;
+};
+
+static void devm_i2c_release_dummy(struct device *dev, void *res)
+{
+   struct i2c_dummy_devres *this = res;
+
+   i2c_unregister_device(this->client);
+}
+
+/**
+ * devm_i2c_new_dummy - return a new i2c device bound to a dummy driver
+ * @dev: device the managed resource is bound to
+ * @adapter: the adapter managing the device
+ * @address: seven bit address to be used
+ * Context: can sleep
+ *
+ * This is the device-managed version of i2c_new_dummy.
+ * Note the changed return value type: It returns the new i2c client
+ * or an ERR_PTR in case of an error.
+ */
+struct i2c_client *devm_i2c_new_dummy(struct device *dev,
+ struct i2c_adapter *adapter,
+ u16 address)
+{
+   struct i2c_dummy_devres *dr;
+   struct i2c_client *client;
+
+   dr = devres_alloc(devm_i2c_release_dummy, sizeof(*dr), GFP_KERNEL);
+   if (!dr)
+   return ERR_PTR(-ENOMEM);
+
+   client = __i2c_new_dummy(adapter, address);
+   if (IS_ERR(client)) {
+   devres_free(dr);
+   } else {
+   dr->client = client;
+   devres_add(dev, dr);
+   }
+
+   return client;
+}
+EXPORT_SYMBOL_GPL(devm_i2c_new_dummy);
+
 /**
  * i2c_new_secondary_device - Helper to get the instantiated secondary address
  * and create the associated device
diff --git a/include/linux/i2c.h b/include/linux/i2c.h
index 5d7f3c185..aca6ebbb8 100644
--- a/include/linux/i2c.h
+++ b/include/linux/i2c.h
@@ -441,6 +441,9 @@ extern int i2c_probe_func_quick_read(struct i2c_adapter *, 
unsigned short addr);
 extern struct i2c_client *
 i2c_new_dummy(struct i2c_adapter *adap, u16 address);
 
+extern struct i2c_client *
+devm_i2c_new_dummy(struct device *dev, struct i2c_adapter *adap, u16 address);
+
 extern struct i2c_client *
 i2c_new_secondary_device(struct i2c_client *client,
const char *name,
-- 
2.15.1

Re: [Intel-wired-lan] v4.15-rc2 on thinkpad x60: ethernet stopped working

2017-12-19 Thread Neftin, Sasha


On 12/18/2017 17:50, Neftin, Sasha wrote:

On 12/18/2017 13:58, Pavel Machek wrote:

On Mon 2017-12-18 13:24:40, Neftin, Sasha wrote:

On 12/18/2017 12:26, Pavel Machek wrote:

Hi!


In v4.15-rc2+, network manager can not see my ethernet card, and
manual attempts to ifconfig it up did not really help, either.

Card is:

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller



Any ideas ?

Yes , 19110cfbb34d4af0cdfe14cd243f3b09dc95b013 broke it.

See:
https://bugzilla.kernel.org/show_bug.cgi?id=198047

Fix there :
https://marc.info/?l=linux-kernel&m=151272209903675&w=2

I don't see the patch in latest mainline. Not having ethernet
is... somehow annoying. What is going on there?
Generally speaking, e1000 maintenance has been handled very poorly over
the past few years, I have to say.

Fixes take forever to propagate even when someone other than the
maintainer provides a working and tested fix, just like this case.

Jeff, please take e1000 maintenance seriously and get these critical
bug fixes propagated.

No response AFAICT. I guess I should test reverting
19110cfbb34d4af0cdfe14cd243f3b09dc95b013, then ask you for revert?

Hello Pavel,

Before asking to revert 19110cfbb..., please check whether the following patch
from Benjamin works for you: http://patchwork.ozlabs.org/patch/846825/

Jacob said, in another email:

# Digging into this, the problem is complicated. The original bug
# assumed behavior of the .check_for_link call, which is universally not
# implemented.
#
# I think the correct fix is to revert 19110cfbb34d ("e1000e: Separate
# signaling for link check/link up", 2017-10-10) and find a more proper
# solution.


...which makes me think that a revert is preferred?

    Pavel

Pavel, before asking for a revert, let's check Benjamin's patch that follows
up on his previous one. The previous patch was not complete, and the latest
one completes the changes.


___
Intel-wired-lan mailing list
intel-wired-...@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan


Pavel, any update? Did Benjamin's latest patch solve your network problem?

Re: [PATCH v1] drm/tegra: gem: Correct iommu_map_sg() error checking

2017-12-19 Thread kbuild test robot

Hi Dmitry,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tegra/for-next]
[also build test WARNING on v4.15-rc4 next-20171220]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Dmitry-Osipenko/drm-tegra-gem-Correct-iommu_map_sg-error-checking/20171220-123700
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git for-next
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm64 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/tegra/gem.c: In function 'tegra_bo_iommu_map':
>> drivers/gpu/drm/tegra/gem.c:131:58: warning: format '%zd' expects argument of type 'signed size_t', but argument 3 has type 'int' [-Wformat=]
  dev_err(tegra->drm->dev, "out of I/O virtual memory: %zd\n",
   ~~^
   %d
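
The warning above comes down to a mismatch between the conversion specifier and the argument type. A minimal userspace sketch of the matching pairs follows; it is not the driver code, and `format_map_error` and `format_mapped_size` are hypothetical helpers used only for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: formats an int error code
 * with %d, which is the specifier the compiler suggests for the
 * dev_err() call flagged above.
 */
static int format_map_error(char *buf, size_t len, int err)
{
	return snprintf(buf, len, "out of I/O virtual memory: %d\n", err);
}

/* %zu is the matching specifier when the value really is a size_t. */
static int format_mapped_size(char *buf, size_t len, size_t size)
{
	return snprintf(buf, len, "mapped %zu bytes\n", size);
}
```

Since `err` in `tegra_bo_iommu_map()` is declared as `int`, `%d` is the specifier that matches it.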

vim +131 drivers/gpu/drm/tegra/gem.c

de2ba664c drivers/gpu/host1x/drm/gem.c Arto Merilainen 2013-03-22  113  
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  114  static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  115  {
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  116      int prot = IOMMU_READ | IOMMU_WRITE;
e1ad592fa drivers/gpu/drm/tegra/gem.c  Dmitry Osipenko 2017-12-17  117      int err;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  118  
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  119      if (bo->mm)
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  120          return -EBUSY;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  121  
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  122      bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  123      if (!bo->mm)
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  124          return -ENOMEM;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  125  
347ad49d3 drivers/gpu/drm/tegra/gem.c  Thierry Reding  2017-03-09  126      mutex_lock(&tegra->mm_lock);
347ad49d3 drivers/gpu/drm/tegra/gem.c  Thierry Reding  2017-03-09  127  
4e64e5539 drivers/gpu/drm/tegra/gem.c  Chris Wilson    2017-02-02  128      err = drm_mm_insert_node_generic(&tegra->mm,
4e64e5539 drivers/gpu/drm/tegra/gem.c  Chris Wilson    2017-02-02  129                       bo->mm, bo->gem.size, PAGE_SIZE, 0, 0);
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  130      if (err < 0) {
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26 @131          dev_err(tegra->drm->dev, "out of I/O virtual memory: %zd\n",
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  132              err);
347ad49d3 drivers/gpu/drm/tegra/gem.c  Thierry Reding  2017-03-09  133          goto unlock;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  134      }
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  135  
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  136      bo->paddr = bo->mm->start;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  137  
e1ad592fa drivers/gpu/drm/tegra/gem.c  Dmitry Osipenko 2017-12-17  138      bo->size = iommu_map_sg(tegra->domain, bo->paddr, bo->sgt->sgl,
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  139                  bo->sgt->nents, prot);
e1ad592fa drivers/gpu/drm/tegra/gem.c  Dmitry Osipenko 2017-12-17  140      if (!bo->size) {
e1ad592fa drivers/gpu/drm/tegra/gem.c  Dmitry Osipenko 2017-12-17  141          dev_err(tegra->drm->dev, "failed to map buffer\n");
e1ad592fa drivers/gpu/drm/tegra/gem.c  Dmitry Osipenko 2017-12-17  142          err = -ENOMEM;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  143          goto remove;
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  144      }
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  145  
347ad49d3 drivers/gpu/drm/tegra/gem.c  Thierry Reding  2017-03-09  146      mutex_unlock(&tegra->mm_lock);
347ad49d3 drivers/gpu/drm/tegra/gem.c  Thierry Reding  2017-03-09  147  
df06b759f drivers/gpu/drm/tegra/gem.c  Thierry Reding  2014-06-26  148      return 0;
df06b759f drivers/gpu/drm/tegra/gem.c

Re: [PATCH v5 14/15] cpufreq: Add module to register cpufreq on Krait CPUs

2017-12-19 Thread Sricharan R

Hi Viresh,

On 12/20/2017 9:06 AM, Viresh Kumar wrote:
> On 19-12-17, 21:24, Sricharan R wrote:
>> From: Stephen Boyd 
>>
>> Register a cpufreq-generic device whenever we detect that a
>> "qcom,krait" compatible CPU is present in DT.
>>
>> Cc: 
>> [Sricharan: updated to use dev_pm_opp_set_prop_name]
>> Signed-off-by: Sricharan R 
>> Signed-off-by: Stephen Boyd 
>> ---
>>  drivers/cpufreq/Kconfig.arm  |   9 ++
>>  drivers/cpufreq/Makefile |   1 +
>>  drivers/cpufreq/cpufreq-dt-platdev.c |   3 +-
>>  drivers/cpufreq/qcom-cpufreq.c   | 171 +++
>>  4 files changed, 183 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/cpufreq/qcom-cpufreq.c
>>
>> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
>> index bdce448..60f28e7 100644
>> --- a/drivers/cpufreq/Kconfig.arm
>> +++ b/drivers/cpufreq/Kconfig.arm
>> @@ -100,6 +100,15 @@ config ARM_OMAP2PLUS_CPUFREQ
>>  depends on ARCH_OMAP2PLUS
>>  default ARCH_OMAP2PLUS
>>  
>> +config ARM_QCOM_CPUFREQ
>> +tristate "Qualcomm based"
> 
> Qualcomm based ... ? You want to add something after this ?
> 

 Hmm, got truncated. Will add a proper one.

> And why tristate ? Do you really want to build a module for this ?
> 

 Given that cpufreq-dt, which registers the driver, can already be built as a
 module, I don't think this needs to be one. So I will make it a bool.

>> +depends on ARCH_QCOM
>> +select PM_OPP
>> +help
>> +  This adds the CPUFreq driver for Qualcomm SoC based boards.
>> +
>> +  If in doubt, say N.
>> +
>>  config ARM_S3C_CPUFREQ
>>  bool
>>  help
>> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
>> index 812f9e0..1496464 100644
>> --- a/drivers/cpufreq/Makefile
>> +++ b/drivers/cpufreq/Makefile
>> @@ -62,6 +62,7 @@ obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ) += mediatek-cpufreq.o
>>  obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ) += omap-cpufreq.o
>>  obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)+= pxa2xx-cpufreq.o
>>  obj-$(CONFIG_PXA3xx)+= pxa3xx-cpufreq.o
>> +obj-$(CONFIG_ARM_QCOM_CPUFREQ)  += qcom-cpufreq.o
>>  obj-$(CONFIG_ARM_S3C24XX_CPUFREQ)   += s3c24xx-cpufreq.o
>>  obj-$(CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS) += s3c24xx-cpufreq-debugfs.o
>>  obj-$(CONFIG_ARM_S3C2410_CPUFREQ)   += s3c2410-cpufreq.o
>> diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
>> index ecc56e2..032ac4f 100644
>> --- a/drivers/cpufreq/cpufreq-dt-platdev.c
>> +++ b/drivers/cpufreq/cpufreq-dt-platdev.c
>> @@ -118,7 +118,7 @@
>>  { .compatible = "ti,am33xx", },
>>  { .compatible = "ti,am43", },
>>  { .compatible = "ti,dra7", },
>> -
> 
> Keep this blank line as is..
> 

 ok

>> +{ .compatible = "qcom,ipq8064", },
> 
> And add another one here.
> 

 ok

>>  { }
>>  };
>>  
>> @@ -157,6 +157,7 @@ static int __init cpufreq_dt_platdev_init(void)
>>  
>>  create_pdev:
>>  of_node_put(np);
>> +
> 
> Remove this.
> 

 ok

>>  return PTR_ERR_OR_ZERO(platform_device_register_data(NULL, "cpufreq-dt",
>> -1, data,
>> sizeof(struct cpufreq_dt_platform_data)));
>> diff --git a/drivers/cpufreq/qcom-cpufreq.c b/drivers/cpufreq/qcom-cpufreq.c
>> new file mode 100644
>> index 000..3e5583d
>> --- /dev/null
>> +++ b/drivers/cpufreq/qcom-cpufreq.c
>> @@ -0,0 +1,171 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2013-2015, The Linux Foundation. All rights reserved.
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include "cpufreq-dt.h"
>> +
>> +static void __init get_krait_bin_format_a(int *speed, int *pvs, int *pvs_ver)
>> +{
>> +void __iomem *base;
>> +u32 pte_efuse;
>> +
>> +*speed = *pvs = *pvs_ver = 0;
>> +
>> +base = ioremap(0x007000c0, 4);
>> +if (!base) {
>> +pr_warn("Unable to read efuse data. Defaulting to 0!\n");
>> +return;
>> +}
>> +
>> +pte_efuse = readl_relaxed(base);
>> +iounmap(base);
>> +
>> +*speed = pte_efuse & 0xf;
>> +if (*speed == 0xf)
>> +*speed = (pte_efuse >> 4) & 0xf;
>> +
>> +if (*speed == 0xf) {
>> +*speed = 0;
>> +pr_warn("Speed bin: Defaulting to %d\n", *speed);
>> +} else {
>> +pr_info("Speed bin: %d\n", *speed);
>> +}
>> +
>> +*pvs = (pte_efuse >> 10) & 0x7;
>> +if (*pvs == 0x7)
>> +*pvs = (pte_efuse >> 13) & 0x7;
>> +
>> +if (*pvs == 0x7) {
>> +*pvs = 0;
>> +pr_warn("PVS bin: Defaulting to %d\n", *pvs);
>> +} else {
>> +pr_info("PVS bin: %d\n", *pvs);
>> +}
>> +}
>> +
>> +static void __init get_krait_bin_format_b(int *speed, int *pvs, int *pvs_ver)
>> +{
>> +u32 pte_efuse, redundant_sel;
>> +void __iomem *base;
>> +
>> +*speed = 0;
>> +*pvs = 0;
>> +*pvs_ver = 0;
>> +
>>
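
The format A decoding quoted above can be exercised in userspace. This sketch assumes the raw efuse word is passed in directly (the `ioremap`/`readl_relaxed` access and the log messages are left out), and `decode_krait_bin_a` is a hypothetical name, not part of the patch:

```c
#include <stdint.h>

/*
 * Userspace sketch of the bit-field logic in get_krait_bin_format_a().
 * Speed bin: bits [3:0], falling back to bits [7:4] when all-ones.
 * PVS bin:   bits [12:10], falling back to bits [15:13] when all-ones.
 * A field that is all-ones in both places defaults to 0.
 */
static void decode_krait_bin_a(uint32_t pte_efuse, int *speed, int *pvs)
{
	*speed = pte_efuse & 0xf;
	if (*speed == 0xf)
		*speed = (pte_efuse >> 4) & 0xf;
	if (*speed == 0xf)
		*speed = 0;

	*pvs = (pte_efuse >> 10) & 0x7;
	if (*pvs == 0x7)
		*pvs = (pte_efuse >> 13) & 0x7;
	if (*pvs == 0x7)
		*pvs = 0;
}
```

For instance, an efuse word of 0x0c23 decodes to speed bin 3 and PVS bin 3, while 0xffff falls through both checks and yields 0 for each.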

[PATCH v6 0/3] i2c: introduce devm_i2c_new_dummy and use it in at24 driver

2017-12-19 Thread Heiner Kallweit

i2c_new_dummy is typically called from the probe function of the
driver for the primary i2c client. It requires calls to
i2c_unregister_device in the error path of the probe function and
in the remove function.
This can be simplified by introducing a device-managed version.
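
The idea behind a device-managed variant can be sketched in plain userspace C. This is only an illustration of the pattern, assuming a hypothetical `struct owner` that plays the role of the device; none of the names below are kernel APIs:

```c
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
struct cleanup_action {
	void (*release)(void *res);
	void *res;
	struct cleanup_action *next;
};

struct owner {
	struct cleanup_action *actions; /* most recently added first */
};

/* Register a resource so that it is released when the owner goes away. */
static int owner_add_action(struct owner *o, void (*release)(void *), void *res)
{
	struct cleanup_action *a = malloc(sizeof(*a));

	if (!a)
		return -1;
	a->release = release;
	a->res = res;
	a->next = o->actions;
	o->actions = a;
	return 0;
}

/* Release everything, newest registration first. */
static void owner_release_all(struct owner *o)
{
	while (o->actions) {
		struct cleanup_action *a = o->actions;

		o->actions = a->next;
		a->release(a->res);
		free(a);
	}
}

/*
 * Example release callback: bumps a counter, standing in for the
 * i2c_unregister_device() call a driver would otherwise make by hand.
 */
static void count_release(void *res)
{
	int *released = res;

	(*released)++;
}
```

With this shape, the caller registers the release once at allocation time, and neither the probe error path nor the remove function needs an explicit unregister call.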

Make at24 driver the first user of the new function.

Changes in v2:
- add change to i2c core to make a version of i2c_new_device
  available which returns an ERR_PTR instead of NULL in error case
- few minor improvements

Changes in v3:
- rename _i2c_new_device to __i2c_new_device

Changes in v4:
- add missing kernel doc comments
- add Reviewed-by

Changes in v5:
- fix a copy & paste error in patch 1
- improve readability in patch 2

Changes in v6:
- cosmetic change in patch 1
- patch 3 rebased on top of latest at24/for-next

Heiner Kallweit (3):
  i2c: core: improve return value handling of i2c_new_device and i2c_new_dummy
  i2c: core: add device-managed version of i2c_new_dummy
  eeprom: at24: switch to device-managed version of i2c_new_dummy

 Documentation/driver-model/devres.txt |   3 +
 drivers/i2c/i2c-core-base.c   | 115 ++
 drivers/misc/eeprom/at24.c|  32 --
 include/linux/i2c.h   |   3 +
 4 files changed, 119 insertions(+), 34 deletions(-)

-- 
2.15.1

Re: [PATCH V2 net-next 02/17] net: hns3: add support to modify tqps number

2017-12-19 Thread lipeng (Y)




On 2017/12/20 3:18, David Miller wrote:

From: Lipeng 
Date: Tue, 19 Dec 2017 12:02:24 +0800


@@ -2651,6 +2651,19 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
return ret;
  }
  
+static void hns3_put_ring_config(struct hns3_nic_priv *priv)

+{
+   struct hnae3_handle *h = priv->ae_handle;
+   u16 i;
+
+   for (i = 0; i < h->kinfo.num_tqps; i++) {

Please use a plain "int" for index iteration loops like this since
that is the canonical type to use.

Will check and fix this, thanks.

+static void hclge_release_tqp(struct hclge_vport *vport)
+{
+   struct hnae3_knic_private_info *kinfo = &vport->nic.kinfo;
+   struct hclge_dev *hdev = vport->back;
+   u16 i;
+
+   for (i = 0; i < kinfo->num_tqps; i++) {

Likewise.

.

Re: [PATCH V2 net-next 01/17] net: hns3: add support to query tqps number

2017-12-19 Thread lipeng (Y)




On 2017/12/20 3:16, David Miller wrote:

From: Lipeng 
Date: Tue, 19 Dec 2017 12:02:23 +0800


@@ -5002,6 +5002,26 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
ae_dev->priv = NULL;
  }
  
+static u32 hclge_get_max_channels(struct hnae3_handle *handle)

+{
+   struct hclge_vport *vport = hclge_get_vport(handle);
+   struct hnae3_knic_private_info *kinfo = &handle->kinfo;
+   struct hclge_dev *hdev = vport->back;
+

Please order local variables from longest to shortest line.

Please audit your entire submission for this problem.

.

Will audit the whole patch set for this problem. Thanks.

Re: [ANNOUNCE] autofs 5.1.2 release

2017-12-19 Thread Ian Kent

On 20/12/17 13:52, Ian Kent wrote:
> On 20/12/17 11:29, NeilBrown wrote:
>>
>> Hi Ian,
>>  I've been looking at:
>>
>>> - add configuration option to use fqdn in mounts.
>>
>> (commit 9aeef772604) because using this new option causes a regression.
>> If you are using the "replicated server" functionality, then
>>   use_hostname_for_mounts = yes
>> completely disables it.
> 
> Yes, that's not quite right.
> 
> It disables the probe and proximity check for each distinct host
> name used.
> 
> Each of the entries in the list of hosts should still be
> attempted and given that NFS ping is also now used in the NFS
> mount module what's lost is the preferred ordering of the hosts
> list.
> 
>>
>> This is caused by:
>>
>> diff --git a/modules/replicated.c b/modules/replicated.c
>> index 32860d5fe245..8437f5f3d5b2 100644
>> --- a/modules/replicated.c
>> +++ b/modules/replicated.c
>> @@ -667,6 +667,12 @@ int prune_host_list(unsigned logopt, struct host **list,
>> if (!*list)
>> return 0;
>>  
>> +   /* If we're using the host name then there's no point probing
>> +* avialability and respose time.
>> +*/
>> +   if (defaults_use_hostname_for_mounts())
>> +   return 1;
>> +
>> /* Use closest hosts to choose NFS version */
>>
>> My question is: why was this particular change made?
> 
> It was a while ago but there were complaints about using the IP
> address for mounts. It was requested to provide a way to prevent
> that and force the use of the host name in mounts.
> 
>> Why can't prune_host_list() be allowed to do its thing
>> when use_hostname_for_mounts is set?
> 
> We could if each host name resolved to a single IP address.
> 
> I'd need to check that use_hostname_for_mounts doesn't get
> in the road but the host struct should have ->rr set to true
> if it has multiple addresses so changing it to work the way
> you're recommending shouldn't be hard. I think there are a couple
> of places that would need to be checked.
> 
> If the host does resolve to multiple addresses the situation
> is different. There's no way to stop the actual mount from
> trying an IP address that's not responding, and proximity
> doesn't make sense either, because every time a lookup
> is done on the host name (e.g. at mount time) the next address
> in its list will be returned, which can be, and usually is,
> different from what would have been checked.
> 
>> I understand that it would be pointless choosing between
>> the different interfaces of a multi-homed host, but there is still value
>> in choosing between multiple distinct hosts.
>>
>> What, if anything, might go wrong if I simply reverse this chunk of the
>> patch?
> 
> You'll get IP addresses in the logs in certain cases but that
> should be all.
> 
> It would probably be better to ensure that the checks are done
> if the host name resolves to a single IP address.

I think that should be "if the host names in the list each resolve
to a single IP address", otherwise the round robin behavior would
probably still get in the road.
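
The condition above, that every host name in the list resolves to a single address, could be checked before probing roughly as follows. This is a sketch only: `struct host_entry` is a hypothetical, pared-down stand-in for the autofs host structure, and `list_is_probe_safe` is an invented name:

```c
#include <stddef.h>

/* Hypothetical, pared-down stand-in for the autofs host list entry. */
struct host_entry {
	const char *name;
	unsigned int rr;          /* non-zero: name resolves to several addresses */
	struct host_entry *next;
};

/*
 * Returns 1 when probing and proximity ordering are still meaningful with
 * host names, i.e. when no name in the list is round-robin; 0 otherwise.
 */
static int list_is_probe_safe(const struct host_entry *list)
{
	const struct host_entry *h;

	for (h = list; h; h = h->next) {
		if (h->rr)
			return 0;
	}
	return 1;
}
```

Under that assumption, the early return in prune_host_list() would be taken only when use_hostname_for_mounts is set and this check fails.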

Ian

Re: [PATCH v2 4/5] mm: use node_page_state_snapshot to avoid deviation

2017-12-19 Thread kemi



On 2017年12月19日 20:43, Michal Hocko wrote:
> On Tue 19-12-17 14:39:25, Kemi Wang wrote:
>> To avoid deviation, this patch uses node_page_state_snapshot instead of
>> node_page_state for node page stats query.
>> e.g. cat /proc/zoneinfo
>>  cat /sys/devices/system/node/node*/vmstat
>>  cat /sys/devices/system/node/node*/numastat
>>
>> As it is a slow path and would not be read frequently, I would not worry
>> about it.
> 
> The changelog doesn't explain why these counters need any special
> treatment. _snapshot variants were used only for internal handling
> where the precision really mattered. We do not have any in-tree user and
> Jack has removed this by 
> http://lkml.kernel.org/r/20171122094416.26019-1-j...@suse.cz
> which is already sitting in the mmotm tree. We can re-add it but that
> would really require a _very good_ reason.
> 

Assume we have *nr* CPUs and the threshold size is *t*. The maximum
deviation is then nr*t.
Skylake platforms currently have hundreds of CPUs, and the number is still
increasing. Even if the threshold size is kept at its maximum of 125 (32765
for NUMA counters now), the deviation is a little too big, as I mentioned in
the log. I prefer to sum up the per-CPU numbers when the global stats are
queried.
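
To put numbers on the nr*t bound, here is a small userspace sketch. The function names are hypothetical, and the 224-CPU figure used below is an assumed example, not a number from this thread:

```c
/*
 * Worst-case gap between the global counter and the true value when each
 * of nr_cpus CPUs may hold up to `threshold` updates that have not yet
 * been folded into the global counter.
 */
static long long max_deviation(long long nr_cpus, long long threshold)
{
	return nr_cpus * threshold;
}

/*
 * Sketch of the snapshot idea: start from the global value and add every
 * CPU's pending local delta, so the result does not lag by up to nr*t.
 */
static long long snapshot_sum(long long global, const int *cpu_diff, int nr_cpus)
{
	long long x = global;
	int i;

	for (i = 0; i < nr_cpus; i++)
		x += cpu_diff[i];
	return x;
}
```

With an assumed 224 CPUs, a threshold of 125 gives a bound of 28000, while 32765 gives 7339360.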

Also, node_page_state_snapshot is only called in the slow path, so I don't
think that would be a big problem.

Anyway, it is a matter of taste. I just think it's better to have.

>> Signed-off-by: Kemi Wang 
>> ---
>>  drivers/base/node.c | 17 ++---
>>  mm/vmstat.c |  2 +-
>>  2 files changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index a045ea1..cf303f8 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -169,12 +169,15 @@ static ssize_t node_read_numastat(struct device *dev,
>> "interleave_hit %lu\n"
>> "local_node %lu\n"
>> "other_node %lu\n",
>> -   node_page_state(NODE_DATA(dev->id), NUMA_HIT),
>> -   node_page_state(NODE_DATA(dev->id), NUMA_MISS),
>> -   node_page_state(NODE_DATA(dev->id), NUMA_FOREIGN),
>> -   node_page_state(NODE_DATA(dev->id), NUMA_INTERLEAVE_HIT),
>> -   node_page_state(NODE_DATA(dev->id), NUMA_LOCAL),
>> -   node_page_state(NODE_DATA(dev->id), NUMA_OTHER));
>> +   node_page_state_snapshot(NODE_DATA(dev->id), NUMA_HIT),
>> +   node_page_state_snapshot(NODE_DATA(dev->id), NUMA_MISS),
>> +   node_page_state_snapshot(NODE_DATA(dev->id),
>> +   NUMA_FOREIGN),
>> +   node_page_state_snapshot(NODE_DATA(dev->id),
>> +   NUMA_INTERLEAVE_HIT),
>> +   node_page_state_snapshot(NODE_DATA(dev->id), NUMA_LOCAL),
>> +   node_page_state_snapshot(NODE_DATA(dev->id),
>> +   NUMA_OTHER));
>>  }
>>  
>>  static DEVICE_ATTR(numastat, S_IRUGO, node_read_numastat, NULL);
>> @@ -194,7 +197,7 @@ static ssize_t node_read_vmstat(struct device *dev,
>>  for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
>>  n += sprintf(buf+n, "%s %lu\n",
>>   vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
>> - node_page_state(pgdat, i));
>> + node_page_state_snapshot(pgdat, i));
>>  
>>  return n;
>>  }
>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> index 64e08ae..d65f28d 100644
>> --- a/mm/vmstat.c
>> +++ b/mm/vmstat.c
>> @@ -1466,7 +1466,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>>  for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
>>  seq_printf(m, "\n  %-12s %lu",
>>  vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
>> -node_page_state(pgdat, i));
>> +node_page_state_snapshot(pgdat, i));
>>  }
>>  }
>>  seq_printf(m,
>> -- 
>> 2.7.4
>>
>

Re: [PATCH v6 09/18] PCI: dwc: dra7xx: Help compiler to remove unused code

2017-12-19 Thread Kishon Vijay Abraham I



On Wednesday 20 December 2017 04:59 AM, Niklas Cassel wrote:
> The dra7xx driver supports both host and ep mode.
> When enabling support for only one of the modes, help the compiler
> to remove code for the mode that we have not enabled in the driver.
> 
> By adding if (!IS_ENABLED(CONFIG_PCI_DRA7XX_HOST)) return -ENODEV;
> anything after that statement will get silently dropped by the compiler,
> including static functions and structures that are referenced indirectly
> from there.
> 
> Suggested-by: Arnd Bergmann 
> Signed-off-by: Niklas Cassel 

Acked-by: Kishon Vijay Abraham I 
> ---
>  drivers/pci/dwc/pci-dra7xx.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
> index 07c74ae3614e..224ff8affdce 100644
> --- a/drivers/pci/dwc/pci-dra7xx.c
> +++ b/drivers/pci/dwc/pci-dra7xx.c
> @@ -694,6 +694,11 @@ static int __init dra7xx_pcie_probe(struct platform_device *pdev)
>  
>   switch (mode) {
>   case DW_PCIE_RC_TYPE:
> + if (!IS_ENABLED(CONFIG_PCI_DRA7XX_HOST)) {
> + ret = -ENODEV;
> + goto err_gpio;
> + }
> +
>   dra7xx_pcie_writel(dra7xx, PCIECTRL_TI_CONF_DEVICE_TYPE,
>  DEVICE_TYPE_RC);
>   ret = dra7xx_add_pcie_port(dra7xx, pdev);
> @@ -701,6 +706,11 @@ static int __init dra7xx_pcie_probe(struct platform_device *pdev)
>   goto err_gpio;
>   break;
>   case DW_PCIE_EP_TYPE:
> + if (!IS_ENABLED(CONFIG_PCI_DRA7XX_EP)) {
> + ret = -ENODEV;
> + goto err_gpio;
> + }
> +
>   dra7xx_pcie_writel(dra7xx, PCIECTRL_TI_CONF_DEVICE_TYPE,
>  DEVICE_TYPE_EP);
>  
>
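
The effect described in the commit message can be sketched in userspace. The `SKETCH_*` macros below are assumed stand-ins for what `IS_ENABLED(CONFIG_...)` evaluates to, and all function names are hypothetical:

```c
#include <errno.h>

/* Stand-ins for Kconfig symbols; in the kernel these come from IS_ENABLED(). */
#define SKETCH_HOST_ENABLED 0
#define SKETCH_EP_ENABLED   1

enum sketch_mode { SKETCH_RC_TYPE, SKETCH_EP_TYPE };

/* Only reachable through the host branch below. */
static int sketch_add_pcie_port(void)
{
	return 0;
}

/* Only reachable through the endpoint branch below. */
static int sketch_add_pcie_ep(void)
{
	return 0;
}

static int sketch_probe(enum sketch_mode mode)
{
	switch (mode) {
	case SKETCH_RC_TYPE:
		/* Compile-time constant: the rest of this case is dead code. */
		if (!SKETCH_HOST_ENABLED)
			return -ENODEV;
		return sketch_add_pcie_port();
	case SKETCH_EP_TYPE:
		if (!SKETCH_EP_ENABLED)
			return -ENODEV;
		return sketch_add_pcie_ep();
	}
	return -EINVAL;
}
```

Unlike an `#ifdef`, the disabled branch is still parsed and type-checked; an optimizing compiler can then discard it, along with static functions that are referenced only from it.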

Re: [PATCH v2 3/5] mm: enlarge NUMA counters threshold size

2017-12-19 Thread kemi



On 2017年12月19日 20:40, Michal Hocko wrote:
> On Tue 19-12-17 14:39:24, Kemi Wang wrote:
>> We have seen significant overhead in cache bouncing caused by NUMA counters
>> update in multi-threaded page allocation. See 'commit 1d90ca897cb0 ("mm:
>> update NUMA counter threshold size")' for more details.
>>
>> This patch updates NUMA counters to a fixed size of (MAX_S16 - 2) and deals
>> with global counter update using different threshold size for node page
>> stats.
> 
> Again, no numbers.

Compared to the vanilla kernel, I don't think it brings a performance
improvement, so I didn't post performance data here.
But if you would like to see the performance gain from enlarging the threshold
size for NUMA stats (compared to the first patch), I will do that later.

> To be honest I do not really like the special casing
> here. Why are numa counters any different from PGALLOC which is
> incremented for _every_ single page allocation?
> 

I guess you mean the PGALLOC event.
The count for this event is kept per CPU and summed up (for_each_online_cpu)
when needed. That is similar to what I used for NUMA stats in the V1 patch
series. Good enough.

>> ---
>>  mm/vmstat.c | 13 +++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> index 9c681cc..64e08ae 100644
>> --- a/mm/vmstat.c
>> +++ b/mm/vmstat.c
>> @@ -30,6 +30,8 @@
>>  
>>  #include "internal.h"
>>  
>> +#define VM_NUMA_STAT_THRESHOLD (S16_MAX - 2)
>> +
>>  #ifdef CONFIG_NUMA
>>  int sysctl_vm_numa_stat = ENABLE_NUMA_STAT;
>>  
>> @@ -394,7 +396,11 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item)
>>  s16 v, t;
>>  
>>  v = __this_cpu_inc_return(*p);
>> -t = __this_cpu_read(pcp->stat_threshold);
>> +if (item >= NR_VM_NUMA_STAT_ITEMS)
>> +t = __this_cpu_read(pcp->stat_threshold);
>> +else
>> +t = VM_NUMA_STAT_THRESHOLD;
>> +
>>  if (unlikely(v > t)) {
>>  s16 overstep = t >> 1;
>>  
>> @@ -549,7 +555,10 @@ static inline void mod_node_state(struct pglist_data *pgdat,
>>   * Most of the time the thresholds are the same anyways
>>   * for all cpus in a node.
>>   */
>> -t = this_cpu_read(pcp->stat_threshold);
>> +if (item >= NR_VM_NUMA_STAT_ITEMS)
>> +t = this_cpu_read(pcp->stat_threshold);
>> +else
>> +t = VM_NUMA_STAT_THRESHOLD;
>>  
>>  o = this_cpu_read(*p);
>>  n = delta + o;
>> -- 
>> 2.7.4
>>
>

Re: [ANNOUNCE] autofs 5.1.2 release

2017-12-19 Thread Ian Kent

On 20/12/17 11:29, NeilBrown wrote:
> 
> Hi Ian,
>  I've been looking at:
> 
>> - add configuration option to use fqdn in mounts.
> 
> (commit 9aeef772604) because using this new option causes a regression.
> If you are using the "replicated server" functionality, then
>   use_hostname_for_mounts = yes
> completely disables it.

Yes, that's not quite right.

It disables the probe and proximity check for each distinct host
name used.

Each of the entries in the list of hosts should still be
attempted and given that NFS ping is also now used in the NFS
mount module what's lost is the preferred ordering of the hosts
list.

> 
> This is caused by:
> 
> diff --git a/modules/replicated.c b/modules/replicated.c
> index 32860d5fe245..8437f5f3d5b2 100644
> --- a/modules/replicated.c
> +++ b/modules/replicated.c
> @@ -667,6 +667,12 @@ int prune_host_list(unsigned logopt, struct host **list,
> if (!*list)
> return 0;
>  
> +   /* If we're using the host name then there's no point probing
> +* avialability and respose time.
> +*/
> +   if (defaults_use_hostname_for_mounts())
> +   return 1;
> +
> /* Use closest hosts to choose NFS version */
> 
> My question is: why was this particular change made?

It was a while ago but there were complaints about using the IP
address for mounts. It was requested to provide a way to prevent
that and force the use of the host name in mounts.

> Why can't prune_host_list() be allowed to do its thing
> when use_hostname_for_mounts is set?

We could if each host name resolved to a single IP address.

I'd need to check that use_hostname_for_mounts doesn't get
in the road but the host struct should have ->rr set to true
if it has multiple addresses so changing it to work the way
you're recommending shouldn't be hard. I think there are a couple
of places that would need to be checked.

If the host does resolve to multiple addresses the situation
is different. There's no way to stop the actual mount from
trying an IP address that's not responding, and proximity
doesn't make sense either, because every time a lookup
is done on the host name (e.g. at mount time) the next address
in its list will be returned, which can be, and usually is,
different from what would have been checked.

> I understand that it would be pointless choosing between
> the different interfaces of a multi-homed host, but there is still value
> in choosing between multiple distinct hosts.
> 
> What, if anything, might go wrong if I simply reverse this chunk of the
> patch?

You'll get IP addresses in the logs in certain cases but that
should be all.

It would probably be better to ensure that the checks are done
if the host name resolves to a single IP address.

Ian

Re: [PATCH v6 15/18] PCI: dwc: Make cpu_addr_fixup take struct dw_pcie as argument

2017-12-19 Thread Kishon Vijay Abraham I



On Wednesday 20 December 2017 04:59 AM, Niklas Cassel wrote:
> The current cpu addr fixup mask for ARTPEC-6, GENMASK(27, 0), is wrong.
> The correct cpu addr fixup mask for ARTPEC-6 is GENMASK(28, 0).
> 
> However, having a hardcoded cpu addr fixup mask in each driver is
> arguably wrong.
> A device tree property called something like "cpu-addr-fixup-mask"
> would have been a better solution.
> Introducing such a property is not needed though, since we already have
> pp->cfg0_base and ep->phys_base, which is derived from already existing
> device tree properties.
> 
> It is also worth noting that for ARTPEC-7, hardcoding the cpu addr fixup
> mask is not possible, since it uses a High Address Bits Look Up Table,
> which means that it can, at runtime, map the PCIe window to an arbitrary
> address in the 32-bit address space.
> 
> By using pp->cfg0_base and ep->phys_base, we avoid hardcoding a mask
> in each driver. This should work for ARTPEC-6, DRA7xx, and ARTPEC-7.
> I have not changed the code in DRA7xx though, since their existing
> code works, but if they want, they could use the same logic as
> artpec6_pcie_cpu_addr_fixup, and thus remove their hardcoded mask.
> 
> The reason why the fixup mask is needed is explained in commit f4c55c5a3f7f
> ("PCI: designware: Program ATU with untranslated address").
> 
> Signed-off-by: Niklas Cassel 

Acked-by: Kishon Vijay Abraham I 
> ---
>  drivers/pci/dwc/pci-dra7xx.c  |  2 +-
>  drivers/pci/dwc/pcie-artpec6.c| 18 ++
>  drivers/pci/dwc/pcie-designware.c |  2 +-
>  drivers/pci/dwc/pcie-designware.h |  2 +-
>  4 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
> index 224ff8affdce..89d87844abb3 100644
> --- a/drivers/pci/dwc/pci-dra7xx.c
> +++ b/drivers/pci/dwc/pci-dra7xx.c
> @@ -110,7 +110,7 @@ static inline void dra7xx_pcie_writel(struct dra7xx_pcie *pcie, u32 offset,
>   writel(value, pcie->base + offset);
>  }
>  
> -static u64 dra7xx_pcie_cpu_addr_fixup(u64 pci_addr)
> +static u64 dra7xx_pcie_cpu_addr_fixup(struct dw_pcie *pci, u64 pci_addr)
>  {
>   return pci_addr & DRA7XX_CPU_TO_BUS_ADDR;
>  }
> diff --git a/drivers/pci/dwc/pcie-artpec6.c b/drivers/pci/dwc/pcie-artpec6.c
> index e7de4e4649eb..318a2bd0d97e 100644
> --- a/drivers/pci/dwc/pcie-artpec6.c
> +++ b/drivers/pci/dwc/pcie-artpec6.c
> @@ -67,8 +67,6 @@ static const struct of_device_id artpec6_pcie_of_match[];
>  #define PHY_STATUS   0x118
>  #define  PHY_COSPLLLOCK  BIT(0)
>  
> -#define ARTPEC6_CPU_TO_BUS_ADDR  GENMASK(27, 0)
> -
>  static u32 artpec6_pcie_readl(struct artpec6_pcie *artpec6_pcie, u32 offset)
>  {
>   u32 val;
> @@ -82,9 +80,21 @@ static void artpec6_pcie_writel(struct artpec6_pcie *artpec6_pcie, u32 offset, u
>   regmap_write(artpec6_pcie->regmap, offset, val);
>  }
>  
> -static u64 artpec6_pcie_cpu_addr_fixup(u64 pci_addr)
> +static u64 artpec6_pcie_cpu_addr_fixup(struct dw_pcie *pci, u64 pci_addr)
>  {
> - return pci_addr & ARTPEC6_CPU_TO_BUS_ADDR;
> + struct artpec6_pcie *artpec6_pcie = to_artpec6_pcie(pci);
> + struct pcie_port *pp = &pci->pp;
> + struct dw_pcie_ep *ep = &pci->ep;
> +
> + switch (artpec6_pcie->mode) {
> + case DW_PCIE_RC_TYPE:
> + return pci_addr - pp->cfg0_base;
> + case DW_PCIE_EP_TYPE:
> + return pci_addr - ep->phys_base;
> + default:
> + dev_err(pci->dev, "UNKNOWN device type\n");
> + }
> + return pci_addr;
>  }
>  
>  static int artpec6_pcie_establish_link(struct dw_pcie *pci)
> diff --git a/drivers/pci/dwc/pcie-designware.c b/drivers/pci/dwc/pcie-designware.c
> index 88abdddee2ad..800be7a4f087 100644
> --- a/drivers/pci/dwc/pcie-designware.c
> +++ b/drivers/pci/dwc/pcie-designware.c
> @@ -149,7 +149,7 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int index, int type,
>   u32 retries, val;
>  
>   if (pci->ops->cpu_addr_fixup)
> - cpu_addr = pci->ops->cpu_addr_fixup(cpu_addr);
> + cpu_addr = pci->ops->cpu_addr_fixup(pci, cpu_addr);
>  
>   if (pci->iatu_unroll_enabled) {
>   dw_pcie_prog_outbound_atu_unroll(pci, index, type, cpu_addr,
> diff --git a/drivers/pci/dwc/pcie-designware.h b/drivers/pci/dwc/pcie-designware.h
> index 24edac035160..cca5a81c1c74 100644
> --- a/drivers/pci/dwc/pcie-designware.h
> +++ b/drivers/pci/dwc/pcie-designware.h
> @@ -205,7 +205,7 @@ struct dw_pcie_ep {
>  };
>  
>  struct dw_pcie_ops {
> - u64 (*cpu_addr_fixup)(u64 cpu_addr);
> + u64 (*cpu_addr_fixup)(struct dw_pcie *pcie, u64 cpu_addr);
>   u32 (*read_dbi)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
>   size_t size);
>   void(*write_dbi)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
>
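
The difference between the two fixup strategies can be shown with a small userspace calculation. The window base and CPU address used in the note below are assumed example values, not taken from ARTPEC-6 documentation, and the names are hypothetical:

```c
#include <stdint.h>

/* Userspace equivalent of the kernel's GENMASK(h, l) for 64-bit values. */
#define SKETCH_GENMASK(h, l) \
	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Old approach: keep only the low bits selected by a hardcoded mask. */
static uint64_t fixup_by_mask(uint64_t cpu_addr, uint64_t mask)
{
	return cpu_addr & mask;
}

/* New approach: subtract the base of the window the address falls in. */
static uint64_t fixup_by_base(uint64_t cpu_addr, uint64_t window_base)
{
	return cpu_addr - window_base;
}
```

With an assumed window base of 0xc0000000 and a CPU address of 0xd8000000, subtracting the base gives 0x18000000. A 29-bit mask (bits 28 to 0) gives the same value, while a 28-bit mask (bits 27 to 0) drops bit 28 and gives 0x08000000. The subtraction needs no per-SoC constant, which is what makes it usable when the window can be remapped at runtime.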

Re: [PATCH v3 14/16] phy: Add notify_speed callback

2017-12-19 Thread Kishon Vijay Abraham I

Hi,

On Tuesday 12 December 2017 08:54 PM, Manu Gautam wrote:
> Hi,
> 
> 
> On 12/12/2017 5:13 PM, Kishon Vijay Abraham I wrote:
>> Hi,
>>
>> On Tuesday 21 November 2017 02:53 PM, Manu Gautam wrote:
>>> QCOM USB PHYs can monitor resume/remote-wakeup event in
>>> suspended state. However PHY driver must know current
>>> operational speed of PHY in order to set correct polarity of
>>> wakeup events for detection. E.g. QUSB2 PHY monitors DP/DM
>>> signals depending on speed is LS or FS/HS to detect resume.
>>> Similarly QMP USB3 PHY in SS mode should monitor RX
>>> terminations attach/detach and LFPS events depending on
>>> SSPHY is active or not.

Why not use a notification mechanism instead of adding new APIs in phy-core?
This will only bloat phy-core with APIs for a particular platform.

Thanks
Kishon
>>>
>>> Signed-off-by: Manu Gautam 
>>> ---
>>>  drivers/phy/phy-core.c  | 30 ++
>>>  include/linux/phy/phy.h | 26 ++
>>>  2 files changed, 56 insertions(+)
>>>
>>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
>>> index b4964b0..03df2be 100644
>>> --- a/drivers/phy/phy-core.c
>>> +++ b/drivers/phy/phy-core.c
>>> @@ -387,6 +387,36 @@ int phy_calibrate(struct phy *phy)
>>>  }
>>>  EXPORT_SYMBOL_GPL(phy_calibrate);
>>>  
>>> +int phy_notify_speed(struct phy *phy, enum phy_speed speed)
>>> +{
>>> +   int ret;
>>> +
>>> +   if (!phy || !phy->ops->notify_speed)
>>> +   return 0;
>>> +
>>> +   mutex_lock(&phy->mutex);
>>> +   ret = phy->ops->notify_speed(phy, speed);
>>> +   mutex_unlock(&phy->mutex);
>>> +
>>> +   return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_notify_speed);
>>> +
>>> +enum phy_speed phy_get_speed(struct phy *phy)
>>> +{
>>> +   enum phy_speed ret;
>>> +
>>> +   if (!phy || !phy->ops->get_speed)
>>> +   return PHY_SPEED_UNKNOWN;
>>> +
>>> +   mutex_lock(&phy->mutex);
>>> +   ret = phy->ops->get_speed(phy);
>>> +   mutex_unlock(&phy->mutex);
>>> +
>>> +   return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_get_speed);
>> So this is equivalent to set_speed (why notify?) and get_speed. set_speed will
>> most likely be invoked by USB driver? who will invoke get_speed?
> 
> I picked notify_speed as set_speed sounds like driver is going to set/program
> speed related configuration in PHY. Whereas the purpose of this function
> is to notify phy_driver of the connection speed of established link.
> USB glue drivers for Qualcomm platforms need to know USB PHYs' speed to
> set correct polarity of wakeup interrupt from hardware in low power state.
>  
> 
>> Thanks
>> Kishon
>

[PATCH -next] mtd: sharpslpart: make local function sharpsl_nand_cleanup_ftl() static

2017-12-19 Thread Wei Yongjun

Fixes the following sparse warnings:

drivers/mtd/parsers/sharpslpart.c:222:6: warning:
 symbol 'sharpsl_nand_cleanup_ftl' was not declared. Should it be static?

Signed-off-by: Wei Yongjun 
---
 drivers/mtd/parsers/sharpslpart.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/parsers/sharpslpart.c b/drivers/mtd/parsers/sharpslpart.c
index 0ddb79a..8893dc8 100644
--- a/drivers/mtd/parsers/sharpslpart.c
+++ b/drivers/mtd/parsers/sharpslpart.c
@@ -219,7 +219,7 @@ static int sharpsl_nand_init_ftl(struct mtd_info *mtd, struct sharpsl_ftl *ftl)
return ret;
 }
 
-void sharpsl_nand_cleanup_ftl(struct sharpsl_ftl *ftl)
+static void sharpsl_nand_cleanup_ftl(struct sharpsl_ftl *ftl)
 {
kfree(ftl->log2phy);
 }

Re: [RFC PATCH 0/5] mm, hugetlb: allocation API and migration improvements

2017-12-19 Thread Naoya Horiguchi


On 12/15/2017 06:33 PM, Michal Hocko wrote:
> Naoya,
> this has passed Mike's review (thanks for that!), you have mentioned
> that you can pass this through your testing machinery earlier. While
> I've done some testing already I would really appreciate if you could
> do that as well. Review would be highly appreciated as well.

Sorry for my slow response. I reviewed/tested this patchset and it looks
good to me overall.

I have one comment on the code path from mbind(2).
The callback passed to migrate_pages() in do_mbind() (i.e. new_page())
calls alloc_huge_page_noerr(), which currently doesn't call
SetPageHugeTemporary(), so hugetlb migration fails when
h->surplus_huge_pages >= h->nr_overcommit_huge_pages.

I don't think this is a bug, but it would be better if mbind(2) worked
more like other migration callers such as move_pages(2)/migrate_pages(2).
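
To make the failure condition concrete, here is a minimal userspace sketch of the gate described above. It assumes a heavily simplified model: `struct model_hstate`, `can_alloc_surplus()` and the `temporary` flag are illustrative stand-ins, not the kernel's actual code, and the real allocation path does considerably more.

```c
#include <stdbool.h>

/* Illustrative stand-in for the two struct hstate counters involved. */
struct model_hstate {
	unsigned long surplus_huge_pages;
	unsigned long nr_overcommit_huge_pages;
};

/*
 * Models the overcommit gate: a surplus allocation is refused once the
 * surplus count reaches the overcommit limit, unless the page is treated
 * as a temporary migration target (hypothetical flag for this sketch).
 */
static bool can_alloc_surplus(const struct model_hstate *h, bool temporary)
{
	if (temporary)
		return true;
	return h->surplus_huge_pages < h->nr_overcommit_huge_pages;
}
```

With both counters equal, a call with `temporary == false` (the mbind(2)-style path in this model) is refused, while `temporary == true` (the path other migration callers take in this model) succeeds.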

Thanks,
Naoya Horiguchi


> 
> Thanks!
> 
> On Mon 04-12-17 15:01:12, Michal Hocko wrote:
>> Hi,
>> this is a follow up for [1] for the allocation API and [2] for the
>> hugetlb migration. It wasn't really easy to split those into two
>> separate patch series as they share some code.
>>
>> My primary motivation to touch this code is to make the gigantic pages
>> migration working. The giga pages allocation code is just too fragile
>> and hacked into the hugetlb code now. This series tries to move giga
>> pages closer to the first class citizen. We are not there yet but having
>> 5 patches is quite a lot already and it will already make the code much
>> easier to follow. I will come with other changes on top after this sees
>> some review.
>>
>> The first two patches should be trivial to review. The third patch
>> changes the way how we migrate huge pages. Newly allocated pages are a
>> subject of the overcommit check and they participate surplus accounting
>> which is quite unfortunate as the changelog explains. This patch doesn't
>> change anything wrt. giga pages.
>> Patch #4 removes the surplus accounting hack from
>> __alloc_surplus_huge_page.  I hope I didn't miss anything there and a
>> deeper review is really due there.
>> Patch #5 finally unifies allocation paths and giga pages shouldn't be
>> any special anymore. There is also some renaming going on as well.
>>
>> Shortlog
>> Michal Hocko (5):
>>   mm, hugetlb: unify core page allocation accounting and initialization
>>   mm, hugetlb: integrate giga hugetlb more naturally to the allocation path
>>   mm, hugetlb: do not rely on overcommit limit during migration
>>   mm, hugetlb: get rid of surplus page accounting tricks
>>   mm, hugetlb: further simplify hugetlb allocation API
>>
>> Diffstat:
>>  include/linux/hugetlb.h |   3 +
>>  mm/hugetlb.c| 305 
>> +++-
>>  mm/migrate.c|   3 +-
>>  3 files changed, 175 insertions(+), 136 deletions(-)
>>
>>
>> [1] http://lkml.kernel.org/r/20170622193034.28972-1-mho...@kernel.org
>> [2] http://lkml.kernel.org/r/20171122152832.iayefrlxbugph...@dhcp22.suse.cz
>>
>

Re: [PATCH v2 1/5] mm: migrate NUMA stats from per-zone to per-node

2017-12-19 Thread kemi



On 19 December 2017 20:28, Michal Hocko wrote:
> On Tue 19-12-17 14:39:22, Kemi Wang wrote:
>> There is not really any use to get NUMA stats separated by zone, and
>> current per-zone NUMA stats is only consumed in /proc/zoneinfo. For code
>> cleanup purpose, we move NUMA stats from per-zone to per-node and reuse the
>> existing per-cpu infrastructure.
> 
> Let's hope that nobody really depends on the per-zone numbers. It would
> be really strange as those counters are inherently per-node and that is
> what users should care about but who knows...
> 
> Anyway, I hoped we could get rid of NR_VM_NUMA_STAT_ITEMS but your patch
> keeps it and follow up patches even use it further. I will comment on
> those separately but this still makes these few counters really special
> which I think is wrong.
> 

Well, that's the best balance I could find between performance and
simplification. If you have a better idea, please post it and I will
certainly follow it.
 
>> Suggested-by: Andi Kleen 
>> Suggested-by: Michal Hocko 
>> Signed-off-by: Kemi Wang 
> 
> I have to fully grasp the rest of the series before I'll give my Ack,
> but I _really_ like the simplification this adds to the code. I believe
> it can be even simpler.
> 
>> ---
>>  drivers/base/node.c|  23 +++
>>  include/linux/mmzone.h |  27 
>>  include/linux/vmstat.h |  31 -
>>  mm/mempolicy.c |   2 +-
>>  mm/page_alloc.c|  16 +++--
>>  mm/vmstat.c| 177 
>> +
>>  6 files changed, 46 insertions(+), 230 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index ee090ab..a045ea1 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -169,13 +169,14 @@ static ssize_t node_read_numastat(struct device *dev,
>> "interleave_hit %lu\n"
>> "local_node %lu\n"
>> "other_node %lu\n",
>> -   sum_zone_numa_state(dev->id, NUMA_HIT),
>> -   sum_zone_numa_state(dev->id, NUMA_MISS),
>> -   sum_zone_numa_state(dev->id, NUMA_FOREIGN),
>> -   sum_zone_numa_state(dev->id, NUMA_INTERLEAVE_HIT),
>> -   sum_zone_numa_state(dev->id, NUMA_LOCAL),
>> -   sum_zone_numa_state(dev->id, NUMA_OTHER));
>> +   node_page_state(NODE_DATA(dev->id), NUMA_HIT),
>> +   node_page_state(NODE_DATA(dev->id), NUMA_MISS),
>> +   node_page_state(NODE_DATA(dev->id), NUMA_FOREIGN),
>> +   node_page_state(NODE_DATA(dev->id), NUMA_INTERLEAVE_HIT),
>> +   node_page_state(NODE_DATA(dev->id), NUMA_LOCAL),
>> +   node_page_state(NODE_DATA(dev->id), NUMA_OTHER));
>>  }
>> +
>>  static DEVICE_ATTR(numastat, S_IRUGO, node_read_numastat, NULL);
>>  
>>  static ssize_t node_read_vmstat(struct device *dev,
>> @@ -190,17 +191,9 @@ static ssize_t node_read_vmstat(struct device *dev,
>>  n += sprintf(buf+n, "%s %lu\n", vmstat_text[i],
>>   sum_zone_node_page_state(nid, i));
>>  
>> -#ifdef CONFIG_NUMA
>> -for (i = 0; i < NR_VM_NUMA_STAT_ITEMS; i++)
>> -n += sprintf(buf+n, "%s %lu\n",
>> - vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
>> - sum_zone_numa_state(nid, i));
>> -#endif
>> -
>>  for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++)
>>  n += sprintf(buf+n, "%s %lu\n",
>> - vmstat_text[i + NR_VM_ZONE_STAT_ITEMS +
>> - NR_VM_NUMA_STAT_ITEMS],
>> + vmstat_text[i + NR_VM_ZONE_STAT_ITEMS],
>>   node_page_state(pgdat, i));
>>  
>>  return n;
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 67f2e3c..c06d880 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -115,20 +115,6 @@ struct zone_padding {
>>  #define ZONE_PADDING(name)
>>  #endif
>>  
>> -#ifdef CONFIG_NUMA
>> -enum numa_stat_item {
>> -NUMA_HIT,   /* allocated in intended node */
>> -NUMA_MISS,  /* allocated in non intended node */
>> -NUMA_FOREIGN,   /* was intended here, hit elsewhere */
>> -NUMA_INTERLEAVE_HIT,/* interleaver preferred this zone */
>> -NUMA_LOCAL, /* allocation from local node */
>> -NUMA_OTHER, /* allocation from other node */
>> -NR_VM_NUMA_STAT_ITEMS
>> -};
>> -#else
>> -#define NR_VM_NUMA_STAT_ITEMS 0
>> -#endif
>> -
>>  enum zone_stat_item {
>>  /* First 128 byte cacheline (assuming 64 bit words) */
>>  NR_FREE_PAGES,
>> @@ -151,7 +137,18 @@ enum zone_stat_item {
>>  NR_VM_ZONE_STAT_ITEMS };
>>  
>>  enum node_stat_item {
>> -NR_LRU_BASE,
>> +#ifdef CONFIG_NUMA
>> +NUMA_HIT,   /* allocated in intended node */
>> +NUMA_MISS,  /* allocated in non intended node */
>> +NUMA

Re: proc_flush_task oops

2017-12-19 Thread Dave Jones

On Tue, Dec 19, 2017 at 07:54:24PM -0600, Eric W. Biederman wrote:

 > > *Scratches my head*  I am not seeing anything obvious.
 > 
 > Can you try this patch as you reproduce this issue?
 > 
 > diff --git a/kernel/pid.c b/kernel/pid.c
 > index b13b624e2c49..df9e5d4d8f83 100644
 > --- a/kernel/pid.c
 > +++ b/kernel/pid.c
 > @@ -210,6 +210,7 @@ struct pid *alloc_pid(struct pid_namespace *ns)
 > goto out_unlock;
 > for ( ; upid >= pid->numbers; --upid) {
 > /* Make the PID visible to find_pid_ns. */
 > +   WARN_ON(!upid->ns->proc_mnt);
 > idr_replace(&upid->ns->idr, pid, upid->nr);
 > upid->ns->pid_allocated++;
 > }
 > 
 > 
 > If the warning triggers it means the bug is in alloc_pid and somehow
 > something has gotten past the is_child_reaper check.

You're onto something.

WARNING: CPU: 1 PID: 12020 at kernel/pid.c:213 alloc_pid+0x230/0x280
CPU: 1 PID: 12020 Comm: trinity-c29 Not tainted 4.15.0-rc4-think+ #3 
RIP: 0010:alloc_pid+0x230/0x280
RSP: 0018:c90009977d48 EFLAGS: 00010046
RAX: 0030 RBX: 8804fb431280 RCX: 8f5c28f5c28f5c29
RDX: 88050a00de40 RSI: 82005218 RDI: 8804fc6aa9a8
RBP: 8804fb431270 R08:  R09: 0001
R10: c90009977cc0 R11: eab94e31da7171b7 R12: 8804fb431260
R13: 8804fb431240 R14: 82005200 R15: 8804fb431268
FS:  7f49b9065700() GS:88050a00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f49b906a000 CR3: 0004f7446001 CR4: 001606e0
DR0: 7f0b4c405000 DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0600
Call Trace:
 copy_process.part.41+0x14fa/0x1e30
 _do_fork+0xe7/0x720
 ? rcu_read_lock_sched_held+0x6c/0x80
 ? syscall_trace_enter+0x2d7/0x340
 do_syscall_64+0x60/0x210
 entry_SYSCALL64_slow_path+0x25/0x25

followed immediately by...

Oops:  [#1] SMP
CPU: 1 PID: 12020 Comm: trinity-c29 Tainted: GW 4.15.0-rc4-think+ #3
RIP: 0010:proc_flush_task+0x8e/0x1b0
RSP: 0018:c90009977c40 EFLAGS: 00010286
RAX: 0001 RBX: 0001 RCX: fffb
RDX:  RSI: c90009977c50 RDI: 
RBP: c90009977c63 R08:  R09: 0002
R10: c90009977b70 R11: c90009977c64 R12: 0004
R13:  R14: 0004 R15: 8804fb431240
FS:  7f49b9065700() GS:88050a00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2:  CR3: 0004f7446001 CR4: 001606e0
DR0: 7f0b4c405000 DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0600
Call Trace:
 ? release_task+0xaf/0x680
 release_task+0xd2/0x680
 ? wait_consider_task+0xb82/0xce0
 wait_consider_task+0xbe9/0xce0
 ? do_wait+0xe1/0x330
 do_wait+0x151/0x330
 kernel_wait4+0x8d/0x150
 ? task_stopped_code+0x50/0x50
 SYSC_wait4+0x95/0xa0
 ? rcu_read_lock_sched_held+0x6c/0x80
 ? syscall_trace_enter+0x2d7/0x340
 ? do_syscall_64+0x60/0x210
 do_syscall_64+0x60/0x210
 entry_SYSCALL64_slow_path+0x25/0x25

Re: [PATCH net 0/3] Few mvneta fixes

2017-12-19 Thread Willy Tarreau

Hi Arnd,

On Tue, Dec 19, 2017 at 09:18:35PM +0100, Arnd Bergmann wrote:
> On Tue, Dec 19, 2017 at 5:59 PM, Gregory CLEMENT
>  wrote:
> > Hello,
> >
> > here it is a small series of fixes found on the mvneta driver. They
> > had been already used in the vendor kernel and are now ported to
> > mainline.
> 
> Does one of the patches look like it addresses the rare Oops we discussed on
> #kernelci this morning?
> 
> https://storage.kernelci.org/stable/linux-4.9.y/v4.9.70/arm/mvebu_v7_defconfig/lab-free-electrons/boot-armada-375-db.html

I could be wrong but for me the 375 uses mvpp2, not mvneta, so this
should have no effect there.

Willy

Re: r8169 regression: UDP packets dropped intermittantly

2017-12-19 Thread Jonathan Woithe

On Tue, Dec 19, 2017 at 01:25:23PM +0100, Michal Kubecek wrote:
> On Tue, Dec 19, 2017 at 04:15:32PM +1030, Jonathan Woithe wrote:
> > This clearly indicates that not every card using the r8169 driver is
> > vulnerable to the problem.  It also explains why Holger was unable to
> > reproduce the result on his system: the PCIe cards do not appear to suffer
> > from the problem.  Most likely the PCI RTL-8169 chip is affected, but newer
> > PCIe variations are not.  However, obviously more testing will be required
> > with a wider variety of cards if this inference is to hold up.
> 
> The r8169 driver supports many slightly different variants of the chip.
> To identify your variant more precisely, look for a line like
> 
>   r8169 :02:00.0 eth0: RTL8168evl/8111evl at 0xc90003135000, d4:3d:7e:2a:30:08, XID 0c900800 IRQ 38
> 
> in kernel log.

The PCIe card (the one which works correctly with the current driver) shows
this:

  r8169 :02:00.0 eth0: RTL8168e/8111e at 0xf862e000, 80:1f:02:45:25:a4, XID 0c20 IRQ 30
  r8169 :02:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]

The PCI card (Netgear GA311) which is affected by the problem shows this:

  r8169 :05:01.0 eth1: RTL8110s at 0xf8706800, e0:91:f5:1b:5f:c6, XID 0400 IRQ 22
  r8169 :05:01.0 eth1: jumbo features [frames: 7152 bytes, tx checksumming: ok]

The system which has shown the regressed behaviour is running a 32-bit
kernel; for various reasons we can't move to a 64-bit kernel at present. 
However, I was able to boot this system using Slackware 14.2 install discs,
and therefore test using both 32-bit and 64-bit 4.4.14 kernels.  In both
cases the fault was observed within 30 minutes of starting the tests when
the GA311 card was in use.  The fault is therefore not specific to 32-bit
environments.

Regards
  jonathan

Re: [PATCH] kfree_rcu() should use the new kfree_bulk() interface for freeing rcu structures

2017-12-19 Thread Paul E. McKenney

On Tue, Dec 19, 2017 at 05:53:36PM -0800, Matthew Wilcox wrote:
> On Tue, Dec 19, 2017 at 04:20:51PM -0800, Paul E. McKenney wrote:
> > If we are going to make this sort of change, we should do so in a way
> > that allows the slab code to actually do the optimizations that might
> > make this sort of thing worthwhile.  After all, if the main goal was small
> > code size, the best approach is to drop kfree_bulk() and get on with life
> > in the usual fashion.
> > 
> > I would prefer to believe that something like kfree_bulk() can help,
> > and if that is the case, we should give it a chance to do things like
> > group kfree_rcu() requests by destination slab and so forth, allowing
> > batching optimizations that might provide more significant increases
> > in performance.  Furthermore, having this in slab opens the door to
> > slab taking emergency action when memory is low.
> 
> kfree_bulk does sort by destination slab; look at build_detached_freelist.

Understood, but beside the point.  I suspect that giving it larger
scope makes it more efficient, similar to disk drives in the old days.
Grouping on the stack when processing RCU callbacks limits what can
reasonably be done.  Furthermore, using the vector approach going into the
grace period is much more cache-efficient than the linked-list approach,
given that the blocks have a reasonable chance of going cache-cold during
the grace period.

And the slab-related operations should really be in the slab code in any
case rather than within RCU.

Thanx, Paul
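
As a rough illustration of the vector approach mentioned above, the following userspace sketch queues pointers in a contiguous array and releases them in one pass. Everything here is an assumption made for illustration: `struct free_batch`, `batch_add()`, `batch_flush()` and `BATCH_MAX` are hypothetical names, plain free() stands in for the point where kernel code would hand the array to kfree_bulk(), and neither grace periods nor per-slab grouping are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_MAX 16	/* arbitrary capacity chosen for the sketch */

/* A fixed-size vector of pointers awaiting release. */
struct free_batch {
	size_t count;
	void *ptrs[BATCH_MAX];
};

/* Release everything queued so far in one pass; returns how many were freed. */
static size_t batch_flush(struct free_batch *b)
{
	size_t i, n = b->count;

	for (i = 0; i < n; i++)
		free(b->ptrs[i]);
	b->count = 0;
	return n;
}

/*
 * Queue one pointer. If the vector is already full it is flushed first,
 * so the return value is the number of objects freed as a side effect
 * (either 0 or BATCH_MAX).
 */
static size_t batch_add(struct free_batch *b, void *p)
{
	size_t freed = 0;

	assert(b->count <= BATCH_MAX);
	if (b->count == BATCH_MAX)
		freed = batch_flush(b);
	b->ptrs[b->count++] = p;
	return freed;
}
```

The array keeps the queued pointers adjacent in memory, so walking them at flush time touches a few cache lines of pointers rather than chasing a list threaded through objects that may have gone cache-cold.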

Re: [PATCH v3 02/16] phy: qcom-qmp: Adapt to clk_bulk_* APIs

2017-12-19 Thread Manu Gautam



On 12/20/2017 8:07 AM, Vivek Gautam wrote:
> Hi Manu,
>
> [snip]
>
>> @@ -998,29 +992,17 @@ static int qcom_qmp_phy_reset_init(struct device *dev)
>>  static int qcom_qmp_phy_clk_init(struct device *dev)
>>  {
>> struct qcom_qmp *qmp = dev_get_drvdata(dev);
>> -   int ret, i;
>> +   int num = qmp->cfg->num_clks;
>> +   int i;
>>
>> -   qmp->clks = devm_kcalloc(dev, qmp->cfg->num_clks,
>> -sizeof(*qmp->clks), GFP_KERNEL);
>> +   qmp->clks = devm_kcalloc(dev, num, sizeof(*qmp->clks), GFP_KERNEL);
>> if (!qmp->clks)
>> return -ENOMEM;
>>
>> -   for (i = 0; i < qmp->cfg->num_clks; i++) {
>> -   struct clk *_clk;
>> -   const char *name = qmp->cfg->clk_list[i];
>> -
>> -   _clk = devm_clk_get(dev, name);
>> -   if (IS_ERR(_clk)) {
>> -   ret = PTR_ERR(_clk);
>> -   if (ret != -EPROBE_DEFER)
>> -   dev_err(dev, "failed to get %s clk, %d\n",
>> -   name, ret);
>> -   return ret;
>> -   }
>> -   qmp->clks[i] = _clk;
>> -   }
>> +   for (i = 0; i < num; i++)
>> +   qmp->clks->id = qmp->cfg->clk_list[i];
> I think i missed this one while rebasing.
> We need to use index with this. Should be:
> qmp->clks[i]->id = qmp->cfg->clk_list[i];
>

Thanks, I will change this accordingly in next version.
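
For reference, a small userspace sketch of the indexing issue discussed above. It assumes qmp->clks points to an array of struct clk_bulk_data (the element type the clk_bulk_* helpers operate on); `struct clk_bulk_data_model` and both helper functions are stand-ins written for this illustration. Note that under that assumption the elements are structs rather than pointers, so the indexed member access is spelled `clks[i].id`.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct clk_bulk_data (name plus clock handle). */
struct clk_bulk_data_model {
	const char *id;
	void *clk;
};

/* Un-indexed form from the quoted hunk: only element 0 is ever written. */
static void fill_ids_unindexed(struct clk_bulk_data_model *clks,
			       const char *const *names, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++)
		clks->id = names[i];
}

/* Indexed form: each array element receives its own name. */
static void fill_ids_indexed(struct clk_bulk_data_model *clks,
			     const char *const *names, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++)
		clks[i].id = names[i];
}
```

After the un-indexed loop, element 0 holds the last name and the remaining elements are left untouched, which is why the lookup by id would later go wrong for every clock but one.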


> Regards
> Vivek
>
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

Re: [PATCH v5 7/8] pwm: pwm-omap-dmtimer: Adapt driver to utilize dmtimer pdata ops

2017-12-19 Thread Keerthy



On Tuesday 19 December 2017 08:51 PM, Ladislav Michl wrote:
> On Tue, Dec 19, 2017 at 01:55:48PM +0530, Keerthy wrote:
>> On Tuesday 19 December 2017 10:28 AM, Keerthy wrote:
>>> On Monday 18 December 2017 06:25 PM, Keerthy wrote:
 On Monday 18 December 2017 03:01 PM, Ladislav Michl wrote:
> Keerthy,
>
> On Tue, Dec 12, 2017 at 11:42:16AM +0530, Keerthy wrote:
>> Adapt driver to utilize dmtimer pdata ops instead of pdata-quirks.
>>
>> Signed-off-by: Keerthy 
>> ---
>>
>> Changes in v4:
>>
>>   * Switched to dev_get_platdata.
>
> Where do you expect dev.platform_data to be set? PWM driver is failing
> with:
> omap-dmtimer-pwm dmtimer-pwm: dmtimer pdata structure NULL
> omap-dmtimer-pwm: probe of dmtimer-pwm failed with error -22
>
> Which I fixed with patch below, to be able to test your patchset.

 Thanks! I will make the below patch part of my series.

>
> Also I'm running a bit out of time, so I'll send few clean up
> patches and event capture code to get some feedback early.
>
> Regards,
>   ladis
>
> diff --git a/drivers/clocksource/timer-dm.c 
> b/drivers/clocksource/timer-dm.c
> index 39be39e6a8dd..d3d8a49cae0d 100644
> --- a/drivers/clocksource/timer-dm.c
> +++ b/drivers/clocksource/timer-dm.c
> @@ -773,6 +773,7 @@ static int omap_dm_timer_probe(struct platform_device 
> *pdev)
>   dev_err(dev, "%s: no platform data.\n", __func__);
>   return -ENODEV;
>   }
> + dev->platform_data = pdata;
>>>
>>> drivers/clocksource/timer-dm.c: In function 'omap_dm_timer_probe':
>>> drivers/clocksource/timer-dm.c:744:21: warning: assignment discards
>>> 'const' qualifier from pointer target type
>>>
>>> This cannot be done as we are assigning a const pointer to a non-const
>>> pointer.
> 
> Oh, I didn't even assume it as proper fix, just to show what is missing :)
> 
> But technically 'struct dmtimer_platform_data *pdata' is a constant which
> should not be changed. Also look how all that of_populate chain works -
> at the end const pointer is assigned to void* platform_data by simple
> (void *) overcast.
> 
>>> I will figure out a different way for this fix.
>>
>> Ladis,
>>
>> I fixed that:
>>
>> diff --git a/drivers/clocksource/timer-dm.c b/drivers/clocksource/timer-dm.c
>> index 1cbd954..e58f555 100644
>> --- a/drivers/clocksource/timer-dm.c
>> +++ b/drivers/clocksource/timer-dm.c
>> @@ -807,17 +807,21 @@ static int omap_dm_timer_probe(struct
>> platform_device *pdev)
>> struct resource *mem, *irq;
>> struct device *dev = &pdev->dev;
>> const struct of_device_id *match;
>> -   const struct dmtimer_platform_data *pdata;
>> +   struct dmtimer_platform_data *pdata;
>> int ret;
>>
>> match = of_match_device(of_match_ptr(omap_timer_match), dev);
>> -   pdata = match ? match->data : dev->platform_data;
>> +   pdata = match ? (struct dmtimer_platform_data *)match->data :
>> +   dev->platform_data;
> 
> All that seems needlessly complicated, what about patch below?
> 
>> if (!pdata && !dev->of_node) {
>> dev_err(dev, "%s: no platform data.\n", __func__);
>> return -ENODEV;
>> }
>>
>> +   if (!dev->platform_data)
>> +   dev->platform_data = pdata;
> 
> Does the above condition bring us anything?

That was to avoid assigning the same thing.

> 
>> irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
>> if (unlikely(!irq)) {
>> dev_err(dev, "%s: no IRQ resource.\n", __func__);
>> @@ -946,7 +950,7 @@ static int omap_dm_timer_remove(struct
>> platform_device *pdev)
>> .write_status = omap_dm_timer_write_status,
>>  };
>>
>> -static const struct dmtimer_platform_data omap3plus_pdata = {
>> +static struct dmtimer_platform_data omap3plus_pdata = {
>> .timer_errata = OMAP_TIMER_ERRATA_I103_I767,
>> .timer_ops = &dmtimer_ops,
>>  };
>>
>> Can you check at your end if this works for you?
> 
> Note, it is untested as I ran out of time and will continue after New Year.
> 
> diff --git a/drivers/clocksource/timer-dm.c b/drivers/clocksource/timer-dm.c
> index 1cbd95420914..85024f11773a 100644
> --- a/drivers/clocksource/timer-dm.c
> +++ b/drivers/clocksource/timer-dm.c
> @@ -806,14 +806,16 @@ static int omap_dm_timer_probe(struct platform_device 
> *pdev)
>   struct omap_dm_timer *timer;
>   struct resource *mem, *irq;
>   struct device *dev = &pdev->dev;
> - const struct of_device_id *match;
>   const struct dmtimer_platform_data *pdata;
>   int ret;
>  
> - match = of_match_device(of_match_ptr(omap_timer_match), dev);
> - pdata = match ? match->data : dev->platform_data;
> + pdata = of_device_get_match_data(dev);
> + if (!pdata)
> + pdata = dev_get_platdata(dev);
> + else
> + dev->platform_data = (void *)

[PATCH] RISC-V: Support built-in dtb

2017-12-19 Thread Zong Li

Build the DTB into the kernel image.
If the bootloader passes a DTB, that external DTB takes precedence.

Signed-off-by: Zong Li 
---
 arch/riscv/Kconfig   |  4 
 arch/riscv/Makefile  |  9 +
 arch/riscv/boot/Makefile | 17 +
 arch/riscv/boot/dts/Makefile | 11 +++
 arch/riscv/kernel/setup.c|  2 +-
 5 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/boot/Makefile
 create mode 100644 arch/riscv/boot/dts/Makefile

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 2e15e85..831cbb9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -189,6 +189,10 @@ config RISCV_ISA_C
 config RISCV_ISA_A
def_bool y
 
+config RISCV_BUILTIN_DTB
+string "Builtin DTB"
+default ""
+
 endmenu
 
 menu "Kernel type"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6719dd3..baed60a 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -57,6 +57,12 @@ ifeq ($(CONFIG_CMODEL_MEDANY),y)
KBUILD_CFLAGS += -mcmodel=medany
 endif
 
+ifneq '$(CONFIG_RISCV_BUILTIN_DTB)' '""'
+BUILTIN_DTB := y
+else
+BUILTIN_DTB := n
+endif
+
 # GCC versions that support the "-mstrict-align" option default to allowing
 # unaligned accesses.  While unaligned accesses are explicitly allowed in the
 # RISC-V ISA, they're emulated by machine mode traps on all extant
@@ -69,4 +75,7 @@ core-y += arch/riscv/kernel/ arch/riscv/mm/
 
 libs-y += arch/riscv/lib/
 
+boot := arch/riscv/boot
+core-$(BUILTIN_DTB) += $(boot)/dts/
+
 all: vmlinux
diff --git a/arch/riscv/boot/Makefile b/arch/riscv/boot/Makefile
new file mode 100644
index 000..003d697
--- /dev/null
+++ b/arch/riscv/boot/Makefile
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+
+targets := Image Image.gz
+
+$(obj)/Image: vmlinux FORCE
+   $(call if_changed,objcopy)
+
+$(obj)/Image.gz: $(obj)/Image FORCE
+   $(call if_changed,gzip)
+
+install: $(obj)/Image
+   $(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+   $(obj)/Image System.map "$(INSTALL_PATH)"
+
+zinstall: $(obj)/Image.gz
+   $(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+   $(obj)/Image.gz System.map "$(INSTALL_PATH)"
diff --git a/arch/riscv/boot/dts/Makefile b/arch/riscv/boot/dts/Makefile
new file mode 100644
index 000..b65d070
--- /dev/null
+++ b/arch/riscv/boot/dts/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+
+ifneq '$(CONFIG_RISCV_BUILTIN_DTB)' '""'
+BUILTIN_DTB := $(patsubst "%",%,$(CONFIG_RISCV_BUILTIN_DTB)).dtb.o
+else
+BUILTIN_DTB :=
+endif
+
+obj-$(CONFIG_OF) += $(BUILTIN_DTB)
+
+clean-files := *.dtb *.dtb.S
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index e59a28c..3c89f3d 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -149,7 +149,7 @@ asmlinkage void __init setup_vm(void)
 
 void __init sbi_save(unsigned int hartid, void *dtb)
 {
-   early_init_dt_scan(__va(dtb));
+   early_init_dt_scan(dtb ? __va(dtb) : __dtb_start);
 }
 
 /*
-- 
2.7.4
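
The selection made in sbi_save() above can be sketched in isolation as follows. This is only a model under stated assumptions: `select_dtb()` is a hypothetical helper, and the __va() translation applied to the bootloader pointer in the real code is left out.

```c
#include <stddef.h>

/*
 * Models the choice made in sbi_save(): a non-NULL pointer handed over
 * by the bootloader wins; otherwise fall back to the blob linked into
 * the kernel image.
 */
static const void *select_dtb(const void *bootloader_dtb,
			      const void *builtin_dtb)
{
	return bootloader_dtb ? bootloader_dtb : builtin_dtb;
}
```

The built-in blob therefore only matters when the bootloader passes nothing, which matches the precedence stated in the changelog.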

linux-next: Tree for Dec 20

2017-12-19 Thread Stephen Rothwell

Hi all,

Changes since 20171219:

The usb tree gained a conflict against the usb.current tree.

The staging tree gained a conflict against the char-misc-next tree.

The akpm tree lost a patch that turned up elsewhere.

Non-merge commits (relative to Linus' tree): 5152
 5400 files changed, 193652 insertions(+), 153296 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 253 trees (counting Linus' and 43 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (ace52288edf0 Merge tag 'for-linus-20171218' of git://git.infradead.org/linux-mtd)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (cfe17c9bbe6a kbuild: move cc-option and cc-disable-warning after incl. arch Makefile)
Merging arc-current/for-curr (bb6892a6366a ARC: [plat-axs103] refactor the quad core DT quirk code)
Merging arm-current/fixes (36b0cb84ee85 ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch)
Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs for v4.14-rc7)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (110df8bd3e41 powerpc/perf: Fix kfree memory allocated for nest pmus)
Merging sparc/master (a0908a1b7d68 Merge branch 'akpm' (patches from Andrew))
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files)
Merging net/master (d03a45572efa ipv4: fib: Fix metrics match when deleting a route)
Merging bpf/master (71af812aa3f2 Fix tools and testing build.)
Merging ipsec/master (acf568ee859f xfrm: Reinject transport-mode packets through tasklet)
Merging netfilter/master (d6da83813fb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (a41886f56b7b Merge tag 'iwlwifi-for-kalle-2017-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes)
Merging mac80211/master (04a7279ff12f cfg80211: ship certificates as hex files)
Merging sound-current/for-linus (5a15f289ee87 ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU)
Merging pci-current/for-linus (1291a0d5049d Linux 4.15-rc4)
Merging driver-core.current/driver-core-linus (f57ab9a01a36 drivers: base: cacheinfo: fix cache type for non-architected system cache)
Merging tty.current/tty-linus (50c4c4e268a2 Linux 4.15-rc3)
Merging usb.current/usb-linus (76916b663e8d Merge tag 'usb-ci-v4.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus)
Merging usb-gadget-fixes/fixes (1291a0d5049d Linux 4.15-rc4)
Merging usb-serial-fixes/usb-linus (3920bb713038 USB: serial: option: adding support for YUGA CLM920-NC5)
Merging usb-chipidea-fixes/ci-for-usb-stable (964728f9f407 USB: chipidea: msm: fix ulpi-node lookup)
Merging phy/fixes (2b88212c4cc6 phy: rcar-gen3-usb2: select USB_COMMON)
Merging staging.current/staging-linus (d6b246bb7a29 staging: android: ion: Fix dma direction for dma_sync_sg_for_cpu/device)
Merging char-misc.current/char-misc-linus (7f3dc0088b98 binder: fix proc->files 
use-aft

RE: [PATCH v2] arm64: dts: ls1088a: Add USB support

2017-12-19 Thread Yinbo Zhu



-----Original Message-----
From: Shawn Guo [mailto:shawn...@kernel.org]
Sent: Wednesday, December 20, 2017 10:53 AM
To: Yinbo Zhu
Cc: Rob Herring; Mark Rutland; Catalin Marinas; Will Deacon; Harninder Rai;
Raghav Dogra; Ashish Kumar; Andy Tang; open list:OPEN FIRMWARE AND FLATTENED
DEVICE TREE BINDINGS; linux-arm-ker...@lists.infradead.org; open list
Subject: Re: [PATCH v2] arm64: dts: ls1088a: Add USB support

On Thu, Dec 07, 2017 at 07:33:28AM +, Yinbo Zhu wrote:
> Hi shawn guo,
> 
> If my patch has no other issue,
> Can you help me push it to upstream.

>Are you talking about v4 patch?  First of all, I cannot find v4 in my mailbox.
>That said, it seems you did not send the patch to me.
>Secondly, by checking the patch on patchwork, the usb nodes in
>fsl-ls1088a-rdb.dts are not sorted alphabetically by label name.

>Shawn

Hi Shawn,
I will change the code as follows, right?

 &esdhc {
    status = "okay";
 };
...
+&usb0 {
+   status = "okay";
+};
+
+&usb1 {
+   status = "okay";
+};
+
https://patchwork.kernel.org/patch/10059097/

Thanks
Yinbo.

Re: BUG: bad usercopy in memdup_user

2017-12-19 Thread Linus Torvalds

On Tue, Dec 19, 2017 at 8:05 PM, Linus Torvalds
 wrote:
>
> And yes, we had a few cases where the hashing actually did hide the
> values, and I've been applying patches to turn those from %p to %px.

So far at least:

  10a7e9d84915 Do not hash userspace addresses in fault handlers
  85c3e4a5a185 mm/slab.c: do not hash pointers when debugging slab
  d81041820873 powerpc/xmon: Don't print hashed pointers in xmon
  328b4ed93b69 x86: don't hash faulting address in oops printout
  b7ad7ef742a9 remove task and stack pointer printout from oops dump
  6424f6bb4327 kasan: use %px to print addresses instead of %p

although that next-to-last case is a "remove %p" case rather than
"convert to %px".

And we'll probably hit a few more, I'm not at all claiming that we're
somehow "done". There's bound to be other cases people haven't noticed
yet (or haven't patched yet, like the usercopy case that Kees is
signed up to fix up).

But considering that we had something like 12k of those %p users, I
think a handful now (and maybe a few tens eventually) is worth the
pain and confusion.

I just want to make sure that the ones we _do_ convert we actually
spend the mental effort really looking at, and really asking "does it
make sense to convert this?"

Not just knee-jerking "oh, it's hashed, let's just unhash it".
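
As a rough user-space illustration of the distinction discussed above: a
hashed print gives a stable identifier without revealing the address, while a
raw print exposes it. The sketch below is not the kernel's implementation;
toy_hash_ptr(), format_hashed() and format_raw() are hypothetical names, and
the mixing function is only a stand-in for the kernel's keyed hash.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Toy stand-in for pointer hashing (hypothetical). The kernel uses a keyed
 * hash with a secret random key; this mixer only mimics the idea.
 */
static uint32_t toy_hash_ptr(const void *ptr, uint64_t key)
{
	uint64_t v = (uint64_t)(uintptr_t)ptr ^ key;

	v ^= v >> 33;
	v *= 0xff51afd7ed558ccdULL;
	v ^= v >> 33;
	return (uint32_t)v;
}

/* "%p"-like output: only the hashed identifier is written to buf. */
static int format_hashed(char *buf, size_t len, const void *ptr, uint64_t key)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%08x", (unsigned int)toy_hash_ptr(ptr, key));
}

/* "%px"-like output: the real address is written to buf. */
static int format_raw(char *buf, size_t len, const void *ptr)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "%016llx",
			(unsigned long long)(uintptr_t)ptr);
}
```

The hashed form is still useful for correlating two log lines that refer to
the same object, which is why converting to the raw form only pays off where
the actual address is needed for debugging.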

   Linus

Re: [PATCH 3/4] sched: Comment on why sync wakeups try to run on the current CPU

2017-12-19 Thread Mike Galbraith

On Wed, 2017-12-20 at 05:09 +0100, Mike Galbraith wrote:
>  Nope, stacking based upon that
> hint is most definitely not a good idea :)

Except when heavily loaded.  The only thing worse for communicating
hogs being stacked is communicating hogs talking with another hog
between them.

-Mike

[PATCH net-next v4 6/6] net: dccp: Remove dccpprobe module

2017-12-19 Thread Masami Hiramatsu

Remove DCCP probe module since jprobe has been deprecated.
That function is now replaced by dccp/dccp_probe trace-event.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu 
---
 net/dccp/Kconfig  |   17 
 net/dccp/Makefile |2 -
 net/dccp/probe.c  |  203 -
 3 files changed, 222 deletions(-)
 delete mode 100644 net/dccp/probe.c

diff --git a/net/dccp/Kconfig b/net/dccp/Kconfig
index 8c0ef71bed2f..b270e84d9c13 100644
--- a/net/dccp/Kconfig
+++ b/net/dccp/Kconfig
@@ -39,23 +39,6 @@ config IP_DCCP_DEBUG
 
  Just say N.
 
-config NET_DCCPPROBE
-   tristate "DCCP connection probing"
-   depends on PROC_FS && KPROBES
-   ---help---
-   This module allows for capturing the changes to DCCP connection
-   state in response to incoming packets. It is used for debugging
-   DCCP congestion avoidance modules. If you don't understand
-   what was just said, you don't need it: say N.
-
-   Documentation on how to use DCCP connection probing can be found
-   at:
-   http://www.linuxfoundation.org/collaborate/workgroups/networking/dccpprobe
-
-   To compile this code as a module, choose M here: the
-   module will be called dccp_probe.
-
 
 endmenu
 
diff --git a/net/dccp/Makefile b/net/dccp/Makefile
index 2e7b56097bc4..9d0383d2f277 100644
--- a/net/dccp/Makefile
+++ b/net/dccp/Makefile
@@ -21,9 +21,7 @@ obj-$(subst y,$(CONFIG_IP_DCCP),$(CONFIG_IPV6)) += dccp_ipv6.o
 dccp_ipv6-y := ipv6.o
 
 obj-$(CONFIG_INET_DCCP_DIAG) += dccp_diag.o
-obj-$(CONFIG_NET_DCCPPROBE) += dccp_probe.o
 
 dccp-$(CONFIG_SYSCTL) += sysctl.o
 
 dccp_diag-y := diag.o
-dccp_probe-y := probe.o
diff --git a/net/dccp/probe.c b/net/dccp/probe.c
deleted file mode 100644
index 3d3fda05b32d..
--- a/net/dccp/probe.c
+++ /dev/null
@@ -1,203 +0,0 @@
-/*
- * dccp_probe - Observe the DCCP flow with kprobes.
- *
- * The idea for this came from Werner Almesberger's umlsim
- * Copyright (C) 2004, Stephen Hemminger 
- *
- * Modified for DCCP from Stephen Hemminger's code
- * Copyright (C) 2006, Ian McDonald 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "dccp.h"
-#include "ccid.h"
-#include "ccids/ccid3.h"
-
-static int port;
-
-static int bufsize = 64 * 1024;
-
-static const char procname[] = "dccpprobe";
-
-static struct {
-   struct kfifo  fifo;
-   spinlock_t  lock;
-   wait_queue_head_t wait;
-   struct timespec64 tstart;
-} dccpw;
-
-static void printl(const char *fmt, ...)
-{
-   va_list args;
-   int len;
-   struct timespec64 now;
-   char tbuf[256];
-
-   va_start(args, fmt);
-   getnstimeofday64(&now);
-
-   now = timespec64_sub(now, dccpw.tstart);
-
-   len = sprintf(tbuf, "%lu.%06lu ",
- (unsigned long) now.tv_sec,
- (unsigned long) now.tv_nsec / NSEC_PER_USEC);
-   len += vscnprintf(tbuf+len, sizeof(tbuf)-len, fmt, args);
-   va_end(args);
-
-   kfifo_in_locked(&dccpw.fifo, tbuf, len, &dccpw.lock);
-   wake_up(&dccpw.wait);
-}
-
-static int jdccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
-{
-   const struct inet_sock *inet = inet_sk(sk);
-   struct ccid3_hc_tx_sock *hc = NULL;
-
-   if (ccid_get_current_tx_ccid(dccp_sk(sk)) == DCCPC_CCID3)
-   hc = ccid3_hc_tx_sk(sk);
-
-   if (port == 0 || ntohs(inet->inet_dport) == port ||
-   ntohs(inet->inet_sport) == port) {
-   if (hc)
-   printl("%pI4:%u %pI4:%u %d %d %d %d %u %llu %llu %d\n",
-  &inet->inet_saddr, ntohs(inet->inet_sport),
-  &inet->inet_daddr, ntohs(inet->inet_dport), size,
-  hc->tx_s, hc->tx_rtt, hc->tx_p,
-  hc->tx_x_calc, hc->tx_x_recv >> 6,
-  hc->tx_x >> 6, hc->tx_t_ipi);
-   else
-   printl("%pI4:%u %pI4:%u %d\n",
-  &inet->inet_saddr, ntohs(inet->inet_sport),
-  &inet->inet_daddr, ntohs(inet->inet_dport),
-

[PATCH net-next v4 5/6] net: dccp: Add DCCP sendmsg trace event

2017-12-19 Thread Masami Hiramatsu

Add a DCCP sendmsg trace event (dccp/dccp_probe) to replace
dccpprobe. Users can trace this event via ftrace or perftools.

Signed-off-by: Masami Hiramatsu 
---
 net/dccp/proto.c |5 +++
 net/dccp/trace.h |  105 ++
 2 files changed, 110 insertions(+)
 create mode 100644 net/dccp/trace.h

diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 9d43c1f40274..e57b5db495cd 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -38,6 +38,9 @@
 #include "dccp.h"
 #include "feat.h"
 
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
 DEFINE_SNMP_STAT(struct dccp_mib, dccp_statistics) __read_mostly;
 
 EXPORT_SYMBOL_GPL(dccp_statistics);
@@ -761,6 +764,8 @@ int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
int rc, size;
long timeo;
 
+   trace_dccp_probe(sk, len);
+
if (len > dp->dccps_mss_cache)
return -EMSGSIZE;
 
diff --git a/net/dccp/trace.h b/net/dccp/trace.h
new file mode 100644
index ..aa01321a6c37
--- /dev/null
+++ b/net/dccp/trace.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM dccp
+
+#if !defined(_TRACE_DCCP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_DCCP_H
+
+#include 
+#include "dccp.h"
+#include "ccids/ccid3.h"
+#include 
+
+TRACE_EVENT(dccp_probe,
+
+   TP_PROTO(struct sock *sk, size_t size),
+
+   TP_ARGS(sk, size),
+
+   TP_STRUCT__entry(
+   /* sockaddr_in6 is always bigger than sockaddr_in */
+   __array(__u8, saddr, sizeof(struct sockaddr_in6))
+   __array(__u8, daddr, sizeof(struct sockaddr_in6))
+   __field(__u16, sport)
+   __field(__u16, dport)
+   __field(__u16, size)
+   __field(__u16, tx_s)
+   __field(__u32, tx_rtt)
+   __field(__u32, tx_p)
+   __field(__u32, tx_x_calc)
+   __field(__u64, tx_x_recv)
+   __field(__u64, tx_x)
+   __field(__u32, tx_t_ipi)
+   ),
+
+   TP_fast_assign(
+   const struct inet_sock *inet = inet_sk(sk);
+   struct ccid3_hc_tx_sock *hc = NULL;
+
+   if (ccid_get_current_tx_ccid(dccp_sk(sk)) == DCCPC_CCID3)
+   hc = ccid3_hc_tx_sk(sk);
+
+   memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+   memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+
+   if (sk->sk_family == AF_INET) {
+   struct sockaddr_in *v4 = (void *)__entry->saddr;
+
+   v4->sin_family = AF_INET;
+   v4->sin_port = inet->inet_sport;
+   v4->sin_addr.s_addr = inet->inet_saddr;
+   v4 = (void *)__entry->daddr;
+   v4->sin_family = AF_INET;
+   v4->sin_port = inet->inet_dport;
+   v4->sin_addr.s_addr = inet->inet_daddr;
+#if IS_ENABLED(CONFIG_IPV6)
+   } else if (sk->sk_family == AF_INET6) {
+   struct sockaddr_in6 *v6 = (void *)__entry->saddr;
+
+   v6->sin6_family = AF_INET6;
+   v6->sin6_port = inet->inet_sport;
+   v6->sin6_addr = inet6_sk(sk)->saddr;
+   v6 = (void *)__entry->daddr;
+   v6->sin6_family = AF_INET6;
+   v6->sin6_port = inet->inet_dport;
+   v6->sin6_addr = sk->sk_v6_daddr;
+#endif
+   }
+
+   /* For filtering use */
+   __entry->sport = ntohs(inet->inet_sport);
+   __entry->dport = ntohs(inet->inet_dport);
+
+   __entry->size = size;
+   if (hc) {
+   __entry->tx_s = hc->tx_s;
+   __entry->tx_rtt = hc->tx_rtt;
+   __entry->tx_p = hc->tx_p;
+   __entry->tx_x_calc = hc->tx_x_calc;
+   __entry->tx_x_recv = hc->tx_x_recv >> 6;
+   __entry->tx_x = hc->tx_x >> 6;
+   __entry->tx_t_ipi = hc->tx_t_ipi;
+   } else {
+   __entry->tx_s = 0;
+   memset(&__entry->tx_rtt, 0, (void *)&__entry->tx_t_ipi -
+  (void *)&__entry->tx_rtt +
+  sizeof(__entry->tx_t_ipi));
+   }
+   ),
+
+   TP_printk("src=%pISpc dest=%pISpc size=%d tx_s=%d tx_rtt=%d "
+ "tx_p=%d tx_x_calc=%u tx_x_recv=%llu tx_x=%llu tx_t_ipi=%d",
+ __entry->saddr, __entry->daddr, __entry->size,
+ __entry->tx_s, __entry->tx_rtt, __entry->tx_p,
+ __entry->tx_x_calc, __entry->tx_x_recv, __entry->tx_x,
+ __entry->tx_t_ipi)
+);
+
+#endif /* _TRACE_DCCP_H */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define

[PATCH net-next v4 4/6] net: sctp: Remove debug SCTP probe module

2017-12-19 Thread Masami Hiramatsu

Remove SCTP probe module since jprobe has been deprecated.
That function is now replaced by sctp/sctp_probe and
sctp/sctp_probe_path trace-events.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu 
---
 net/sctp/Kconfig  |   12 ---
 net/sctp/Makefile |3 -
 net/sctp/probe.c  |  244 -
 3 files changed, 259 deletions(-)
 delete mode 100644 net/sctp/probe.c

diff --git a/net/sctp/Kconfig b/net/sctp/Kconfig
index d9c04dc1b3f3..c740b189d4ba 100644
--- a/net/sctp/Kconfig
+++ b/net/sctp/Kconfig
@@ -37,18 +37,6 @@ menuconfig IP_SCTP
 
 if IP_SCTP
 
-config NET_SCTPPROBE
-   tristate "SCTP: Association probing"
-depends on PROC_FS && KPROBES
----help---
-This module allows for capturing the changes to SCTP association
-state in response to incoming packets. It is used for debugging
-SCTP congestion control algorithms. If you don't understand
-what was just said, you don't need it: say N.
-
-To compile this code as a module, choose M here: the
-module will be called sctp_probe.
-
 config SCTP_DBG_OBJCNT
bool "SCTP: Debug object counts"
depends on PROC_FS
diff --git a/net/sctp/Makefile b/net/sctp/Makefile
index 54bd9c1a8aa1..6776582ec449 100644
--- a/net/sctp/Makefile
+++ b/net/sctp/Makefile
@@ -4,7 +4,6 @@
 #
 
 obj-$(CONFIG_IP_SCTP) += sctp.o
-obj-$(CONFIG_NET_SCTPPROBE) += sctp_probe.o
 obj-$(CONFIG_INET_SCTP_DIAG) += sctp_diag.o
 
 sctp-y := sm_statetable.o sm_statefuns.o sm_sideeffect.o \
@@ -16,8 +15,6 @@ sctp-y := sm_statetable.o sm_statefuns.o sm_sideeffect.o \
  offload.o stream_sched.o stream_sched_prio.o \
  stream_sched_rr.o stream_interleave.o
 
-sctp_probe-y := probe.o
-
 sctp-$(CONFIG_SCTP_DBG_OBJCNT) += objcnt.o
 sctp-$(CONFIG_PROC_FS) += proc.o
 sctp-$(CONFIG_SYSCTL) += sysctl.o
diff --git a/net/sctp/probe.c b/net/sctp/probe.c
deleted file mode 100644
index 1280f85a598d..
--- a/net/sctp/probe.c
+++ /dev/null
@@ -1,244 +0,0 @@
-/*
- * sctp_probe - Observe the SCTP flow with kprobes.
- *
- * The idea for this came from Werner Almesberger's umlsim
- * Copyright (C) 2004, Stephen Hemminger 
- *
- * Modified for SCTP from Stephen Hemminger's code
- * Copyright (C) 2010, Wei Yongjun 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-MODULE_SOFTDEP("pre: sctp");
-MODULE_AUTHOR("Wei Yongjun ");
-MODULE_DESCRIPTION("SCTP snooper");
-MODULE_LICENSE("GPL");
-
-static int port __read_mostly = 0;
-MODULE_PARM_DESC(port, "Port to match (0=all)");
-module_param(port, int, 0);
-
-static unsigned int fwmark __read_mostly = 0;
-MODULE_PARM_DESC(fwmark, "skb mark to match (0=no mark)");
-module_param(fwmark, uint, 0);
-
-static int bufsize __read_mostly = 64 * 1024;
-MODULE_PARM_DESC(bufsize, "Log buffer size (default 64k)");
-module_param(bufsize, int, 0);
-
-static int full __read_mostly = 1;
-MODULE_PARM_DESC(full, "Full log (1=every ack packet received,  0=only cwnd changes)");
-module_param(full, int, 0);
-
-static const char procname[] = "sctpprobe";
-
-static struct {
-   struct kfifo  fifo;
-   spinlock_t  lock;
-   wait_queue_head_t wait;
-   struct timespec64 tstart;
-} sctpw;
-
-static __printf(1, 2) void printl(const char *fmt, ...)
-{
-   va_list args;
-   int len;
-   char tbuf[256];
-
-   va_start(args, fmt);
-   len = vscnprintf(tbuf, sizeof(tbuf), fmt, args);
-   va_end(args);
-
-   kfifo_in_locked(&sctpw.fifo, tbuf, len, &sctpw.lock);
-   wake_up(&sctpw.wait);
-}
-
-static int sctpprobe_open(struct inode *inode, struct file *file)
-{
-   kfifo_reset(&sctpw.fifo);
-   ktime_get_ts64(&sctpw.tstart);
-
-   return 0;
-}
-
-static ssize_t sctpprobe_read(struct file *file, char __user *buf,
- size_t len, loff_t *ppos)
-{
-   int error = 0, cnt = 0;
-   unsigned char *tbuf;
-
-   if (!buf)
-   return -EINVAL;
-
-   if (len == 0)
-   return 0;
-
-   tbuf = vmalloc(len);
-   if (!tbuf)
-   return -ENOMEM;
-
-   error = wait_e

Re: [PATCH 1/4] pci: dwc: pci-dra7xx: Enable errata i870 for both EP and RC mode

2017-12-19 Thread Vignesh R



On Tuesday 19 December 2017 09:54 PM, Lorenzo Pieralisi wrote:
> On Fri, Dec 01, 2017 at 11:43:08AM +0530, Vignesh R wrote:
>> Errata i870 is applicable in both EP and RC mode. Therefore rename
>> function dra7xx_pcie_ep_unaligned_memaccess(), that implements errata
>> workaround, to dra7xx_pcie_unaligned_memaccess() and call it from a
>> common place. So, that errata workaround is applied for both modes of
>> operation.
>> 
>> Reported-by: Chris Welch 
>> Signed-off-by: Vignesh R 
>> ---
>>  drivers/pci/dwc/pci-dra7xx.c | 12 ++--
>>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> I need Kishon's ACK to apply it, thanks.

There are some enhancements to this patch. Will submit a v2 shortly.
Please ignore this version for now. Thanks!


-- 
Regards
Vignesh

[PATCH net-next v4 3/6] net: sctp: Add SCTP ACK tracking trace event

2017-12-19 Thread Masami Hiramatsu

Add SCTP ACK tracking trace event to trace the changes of SCTP
association state in response to incoming packets.
It is used for debugging SCTP congestion control algorithms,
and will replace sctp_probe module.

Note that this event is a bit tricky. Since it consists of 2
events (sctp_probe and sctp_probe_path), you have to enable
both events as below.

  # cd /sys/kernel/debug/tracing
  # echo 1 > events/sctp/sctp_probe/enable
  # echo 1 > events/sctp/sctp_probe_path/enable

Or, you can enable all the events under sctp.

  # echo 1 > events/sctp/enable

Since sctp_probe_path event is always invoked from sctp_probe
event, you can not see any output if you only enable
sctp_probe_path.
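
The same steps can be sketched from a C program. This is only an
illustration under stated assumptions: the tracing root is passed in by the
caller (for example /sys/kernel/debug/tracing as in the commands above), and
event_enable_path(), enable_event() and enable_sctp_probe_events() are
hypothetical helper names, not an existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<root>/events/<subsys>/<event>/enable"; 0 on success, -1 on error. */
static int event_enable_path(char *buf, size_t len, const char *root,
			     const char *subsys, const char *event)
{
	int n;

	assert(buf && root && subsys && event);
	/* Names are single path components, so reject embedded slashes. */
	if (strchr(subsys, '/') || strchr(event, '/'))
		return -1;
	n = snprintf(buf, len, "%s/events/%s/%s/enable", root, subsys, event);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" to one event's enable file; 0 on success, -1 on error. */
static int enable_event(const char *root, const char *subsys,
			const char *event)
{
	char path[512];
	FILE *f;
	int ok;

	if (event_enable_path(path, sizeof(path), root, subsys, event))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = (fputs("1", f) >= 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Enable both events: sctp_probe_path is only emitted from sctp_probe. */
static int enable_sctp_probe_events(const char *root)
{
	if (enable_event(root, "sctp", "sctp_probe"))
		return -1;
	return enable_event(root, "sctp", "sctp_probe_path");
}
```

Calling enable_sctp_probe_events("/sys/kernel/debug/tracing") as root would
mirror the two echo commands; enabling only the second event would leave the
trace empty for the reason given above.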

Signed-off-by: Masami Hiramatsu 
---
  Changes in v3:
   - Add checking whether sctp_probe_path event is enabled
 before iterating sctp paths to record. Thanks Steven.
  Changes in v4:
   - Move a temporary variable definition into the block.
   - Fix to cast pointer to unsigned long instead of __u64
 for 32bit environment.
---
 include/trace/events/sctp.h |   99 +++
 net/sctp/sm_statefuns.c |5 ++
 2 files changed, 104 insertions(+)
 create mode 100644 include/trace/events/sctp.h

diff --git a/include/trace/events/sctp.h b/include/trace/events/sctp.h
new file mode 100644
index ..7475c7be165a
--- /dev/null
+++ b/include/trace/events/sctp.h
@@ -0,0 +1,99 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM sctp
+
+#if !defined(_TRACE_SCTP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_SCTP_H
+
+#include 
+#include 
+
+TRACE_EVENT(sctp_probe_path,
+
+   TP_PROTO(struct sctp_transport *sp,
+const struct sctp_association *asoc),
+
+   TP_ARGS(sp, asoc),
+
+   TP_STRUCT__entry(
+   __field(__u64, asoc)
+   __field(__u32, primary)
+   __array(__u8, ipaddr, sizeof(union sctp_addr))
+   __field(__u32, state)
+   __field(__u32, cwnd)
+   __field(__u32, ssthresh)
+   __field(__u32, flight_size)
+   __field(__u32, partial_bytes_acked)
+   __field(__u32, pathmtu)
+   ),
+
+   TP_fast_assign(
+   __entry->asoc = (unsigned long)asoc;
+   __entry->primary = (sp == asoc->peer.primary_path);
+   memcpy(__entry->ipaddr, &sp->ipaddr, sizeof(union sctp_addr));
+   __entry->state = sp->state;
+   __entry->cwnd = sp->cwnd;
+   __entry->ssthresh = sp->ssthresh;
+   __entry->flight_size = sp->flight_size;
+   __entry->partial_bytes_acked = sp->partial_bytes_acked;
+   __entry->pathmtu = sp->pathmtu;
+   ),
+
+   TP_printk("asoc=%#llx%s ipaddr=%pISpc state=%u cwnd=%u ssthresh=%u "
+ "flight_size=%u partial_bytes_acked=%u pathmtu=%u",
+ __entry->asoc, __entry->primary ? "(*)" : "",
+ __entry->ipaddr, __entry->state, __entry->cwnd,
+ __entry->ssthresh, __entry->flight_size,
+ __entry->partial_bytes_acked, __entry->pathmtu)
+);
+
+TRACE_EVENT(sctp_probe,
+
+   TP_PROTO(const struct sctp_endpoint *ep,
+const struct sctp_association *asoc,
+struct sctp_chunk *chunk),
+
+   TP_ARGS(ep, asoc, chunk),
+
+   TP_STRUCT__entry(
+   __field(__u64, asoc)
+   __field(__u32, mark)
+   __field(__u16, bind_port)
+   __field(__u16, peer_port)
+   __field(__u32, pathmtu)
+   __field(__u32, rwnd)
+   __field(__u16, unack_data)
+   ),
+
+   TP_fast_assign(
+   struct sk_buff *skb = chunk->skb;
+
+   __entry->asoc = (unsigned long)asoc;
+   __entry->mark = skb->mark;
+   __entry->bind_port = ep->base.bind_addr.port;
+   __entry->peer_port = asoc->peer.port;
+   __entry->pathmtu = asoc->pathmtu;
+   __entry->rwnd = asoc->peer.rwnd;
+   __entry->unack_data = asoc->unack_data;
+
+   if (trace_sctp_probe_path_enabled()) {
+   struct sctp_transport *sp;
+
+   list_for_each_entry(sp, &asoc->peer.transport_addr_list,
+   transports) {
+   trace_sctp_probe_path(sp, asoc);
+   }
+   }
+   ),
+
+   TP_printk("asoc=%#llx mark=%#x bind_port=%d peer_port=%d pathmtu=%d "
+ "rwnd=%u unack_data=%d",
+ __entry->asoc, __entry->mark, __entry->bind_port,
+ __entry->peer_port, __entry->pathmtu, __entry->rwnd,
+ __entry->unack_data)
+);
+
+#endif /* _TRACE_SCTP_H */
+
+/* This part must be outside protection */
+#include 
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 541f34735346..eb790

[PATCH net-next v4 1/6] net: tcp: Add trace events for TCP congestion window tracing

2017-12-19 Thread Masami Hiramatsu

This adds an event to trace TCP state variables with a
slightly intrusive trace event. It uses the ftrace/perf
event log buffer to record that state, so there is no need
to prepare a dedicated ring buffer or custom user apps.

Users can use ftrace to trace this event as below:

  # cd /sys/kernel/debug/tracing
  # echo 1 > events/tcp/tcp_probe/enable
  (run workloads)
  # cat trace
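
Once the trace has been read, individual fields can be pulled out of a line
by matching the key=value pairs that the event's TP_printk format emits. The
following is a minimal user-space sketch, assuming whole-key matching on
space-separated tokens; trace_field_ul() is a hypothetical helper, not part
of any existing tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "key=<number>" in a trace line and store the number in *out.
 * The key must start the line or follow a space, so "snd_wnd" does not
 * match inside "rcv_wnd". Base 0 accepts both decimal and 0x-prefixed hex.
 * Returns 0 on success, -1 if the key is absent or has no numeric value.
 */
static int trace_field_ul(const char *line, const char *key,
			  unsigned long *out)
{
	const char *p = line;
	size_t klen;

	assert(line && key && out);
	klen = strlen(key);
	if (klen == 0)
		return -1;

	while ((p = strstr(p, key)) != NULL) {
		int at_token_start = (p == line || p[-1] == ' ');

		if (at_token_start && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *end;
			unsigned long v = strtoul(val, &end, 0);

			if (end == val)
				return -1;	/* no digits after '=' */
			*out = v;
			return 0;
		}
		p++;	/* keep searching after a partial match */
	}
	return -1;
}
```

For example, on a line containing "snd_cwnd=10 ssthresh=20" the call
trace_field_ul(line, "snd_cwnd", &v) stores 10 in v, while asking for "cwnd"
fails because it is not a whole key.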

Signed-off-by: Masami Hiramatsu 
---
 Changes in v3:
  - Fix build errors caused by including events/tcp.h twice.
  - Sort out the including headers.
---
 include/trace/events/tcp.h |   80 
 net/ipv4/tcp_input.c   |3 ++
 2 files changed, 83 insertions(+)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 07a6cbf1..14ad60b468fb 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM tcp
 
@@ -8,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define tcp_state_name(state)  { state, #state }
 #define show_tcp_state_name(val)   \
@@ -293,6 +295,84 @@ TRACE_EVENT(tcp_retransmit_synack,
  __entry->saddr_v6, __entry->daddr_v6)
 );
 
+TRACE_EVENT(tcp_probe,
+
+   TP_PROTO(struct sock *sk, struct sk_buff *skb),
+
+   TP_ARGS(sk, skb),
+
+   TP_STRUCT__entry(
+   /* sockaddr_in6 is always bigger than sockaddr_in */
+   __array(__u8, saddr, sizeof(struct sockaddr_in6))
+   __array(__u8, daddr, sizeof(struct sockaddr_in6))
+   __field(__u16, sport)
+   __field(__u16, dport)
+   __field(__u32, mark)
+   __field(__u16, length)
+   __field(__u32, snd_nxt)
+   __field(__u32, snd_una)
+   __field(__u32, snd_cwnd)
+   __field(__u32, ssthresh)
+   __field(__u32, snd_wnd)
+   __field(__u32, srtt)
+   __field(__u32, rcv_wnd)
+   ),
+
+   TP_fast_assign(
+   const struct tcp_sock *tp = tcp_sk(sk);
+   const struct inet_sock *inet = inet_sk(sk);
+
+   memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+   memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+
+   if (sk->sk_family == AF_INET) {
+   struct sockaddr_in *v4 = (void *)__entry->saddr;
+
+   v4->sin_family = AF_INET;
+   v4->sin_port = inet->inet_sport;
+   v4->sin_addr.s_addr = inet->inet_saddr;
+   v4 = (void *)__entry->daddr;
+   v4->sin_family = AF_INET;
+   v4->sin_port = inet->inet_dport;
+   v4->sin_addr.s_addr = inet->inet_daddr;
+#if IS_ENABLED(CONFIG_IPV6)
+   } else if (sk->sk_family == AF_INET6) {
+   struct sockaddr_in6 *v6 = (void *)__entry->saddr;
+
+   v6->sin6_family = AF_INET6;
+   v6->sin6_port = inet->inet_sport;
+   v6->sin6_addr = inet6_sk(sk)->saddr;
+   v6 = (void *)__entry->daddr;
+   v6->sin6_family = AF_INET6;
+   v6->sin6_port = inet->inet_dport;
+   v6->sin6_addr = sk->sk_v6_daddr;
+#endif
+   }
+
+   /* For filtering use */
+   __entry->sport = ntohs(inet->inet_sport);
+   __entry->dport = ntohs(inet->inet_dport);
+   __entry->mark = skb->mark;
+
+   __entry->length = skb->len;
+   __entry->snd_nxt = tp->snd_nxt;
+   __entry->snd_una = tp->snd_una;
+   __entry->snd_cwnd = tp->snd_cwnd;
+   __entry->snd_wnd = tp->snd_wnd;
+   __entry->rcv_wnd = tp->rcv_wnd;
+   __entry->ssthresh = tcp_current_ssthresh(sk);
+   __entry->srtt = tp->srtt_us >> 3;
+   ),
+
+   TP_printk("src=%pISpc dest=%pISpc mark=%#x length=%d snd_nxt=%#x "
+ "snd_una=%#x snd_cwnd=%u ssthresh=%u snd_wnd=%u srtt=%u "
+ "rcv_wnd=%u",
+ __entry->saddr, __entry->daddr, __entry->mark,
+ __entry->length, __entry->snd_nxt, __entry->snd_una,
+ __entry->snd_cwnd, __entry->ssthresh, __entry->snd_wnd,
+ __entry->srtt, __entry->rcv_wnd)
+);
+
 #endif /* _TRACE_TCP_H */
 
 /* This part must be outside protection */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4d55c4b338ee..ff71b18d9682 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5299,6 +5299,9 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
unsigned int len = skb->len;
struct tcp_sock *tp = tcp_sk(sk);
 
+   /* TCP congestion window tracking */
+   trace_tcp_probe(sk, skb);
+
tcp_mstamp_refresh(tp);
if (unlikely(!sk-

[PATCH net-next v4 2/6] net: tcp: Remove TCP probe module

2017-12-19 Thread Masami Hiramatsu

Remove TCP probe module since jprobe has been deprecated.
That function is now replaced by tcp/tcp_probe trace-event.
You can use it via ftrace or perftools.

Signed-off-by: Masami Hiramatsu 
---
 net/Kconfig  |   17 ---
 net/ipv4/Makefile|1 
 net/ipv4/tcp_probe.c |  301 --
 3 files changed, 319 deletions(-)
 delete mode 100644 net/ipv4/tcp_probe.c

diff --git a/net/Kconfig b/net/Kconfig
index 9dba2715919d..efe930db3c08 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -336,23 +336,6 @@ config NET_PKTGEN
  To compile this code as a module, choose M here: the
  module will be called pktgen.
 
-config NET_TCPPROBE
-   tristate "TCP connection probing"
-   depends on INET && PROC_FS && KPROBES
-   ---help---
-   This module allows for capturing the changes to TCP connection
-   state in response to incoming packets. It is used for debugging
-   TCP congestion avoidance modules. If you don't understand
-   what was just said, you don't need it: say N.
-
-   Documentation on how to use TCP connection probing can be found
-   at:
-   http://www.linuxfoundation.org/collaborate/workgroups/networking/tcpprobe
-
-   To compile this code as a module, choose M here: the
-   module will be called tcp_probe.
-
 config NET_DROP_MONITOR
tristate "Network packet drop alerting service"
depends on INET && TRACEPOINTS
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index c6c8ad1d4b6d..47a0a6649a9d 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -43,7 +43,6 @@ obj-$(CONFIG_INET_DIAG) += inet_diag.o
 obj-$(CONFIG_INET_TCP_DIAG) += tcp_diag.o
 obj-$(CONFIG_INET_UDP_DIAG) += udp_diag.o
 obj-$(CONFIG_INET_RAW_DIAG) += raw_diag.o
-obj-$(CONFIG_NET_TCPPROBE) += tcp_probe.o
 obj-$(CONFIG_TCP_CONG_BBR) += tcp_bbr.o
 obj-$(CONFIG_TCP_CONG_BIC) += tcp_bic.o
 obj-$(CONFIG_TCP_CONG_CDG) += tcp_cdg.o
diff --git a/net/ipv4/tcp_probe.c b/net/ipv4/tcp_probe.c
deleted file mode 100644
index 697f4c67b2e3..
--- a/net/ipv4/tcp_probe.c
+++ /dev/null
@@ -1,301 +0,0 @@
-/*
- * tcpprobe - Observe the TCP flow with kprobes.
- *
- * The idea for this came from Werner Almesberger's umlsim
- * Copyright (C) 2004, Stephen Hemminger 
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-
-MODULE_AUTHOR("Stephen Hemminger ");
-MODULE_DESCRIPTION("TCP cwnd snooper");
-MODULE_LICENSE("GPL");
-MODULE_VERSION("1.1");
-
-static int port __read_mostly;
-MODULE_PARM_DESC(port, "Port to match (0=all)");
-module_param(port, int, 0);
-
-static unsigned int bufsize __read_mostly = 4096;
-MODULE_PARM_DESC(bufsize, "Log buffer size in packets (4096)");
-module_param(bufsize, uint, 0);
-
-static unsigned int fwmark __read_mostly;
-MODULE_PARM_DESC(fwmark, "skb mark to match (0=no mark)");
-module_param(fwmark, uint, 0);
-
-static int full __read_mostly;
-MODULE_PARM_DESC(full, "Full log (1=every ack packet received,  0=only cwnd changes)");
-module_param(full, int, 0);
-
-static const char procname[] = "tcpprobe";
-
-struct tcp_log {
-   ktime_t tstamp;
-   union {
-   struct sockaddr raw;
-   struct sockaddr_in  v4;
-   struct sockaddr_in6 v6;
-   }   src, dst;
-   u16 length;
-   u32 snd_nxt;
-   u32 snd_una;
-   u32 snd_wnd;
-   u32 rcv_wnd;
-   u32 snd_cwnd;
-   u32 ssthresh;
-   u32 srtt;
-};
-
-static struct {
-   spinlock_t  lock;
-   wait_queue_head_t wait;
-   ktime_t start;
-   u32 lastcwnd;
-
-   unsigned long   head, tail;
-   struct tcp_log  *log;
-} tcp_probe;
-
-static inline int tcp_probe_used(void)
-{
-   return (tcp_probe.head - tcp_probe.tail) & (bufsize - 1);
-}
-
-static inline int tcp_probe_avail(void)
-{
-   return bufsize - tcp_probe_used() - 1;
-}
-
-#define tcp_probe_copy_fl_to_si4(inet, si4, mem)   \
-   do {\
-   si4.sin_family = AF_INET;   \
-   si4.sin_port = inet->inet_##mem##port;  \

[PATCH net-next v4 0/6] net: tcp: sctp: dccp: Replace jprobe usage with trace events

2017-12-19 Thread Masami Hiramatsu

Hi,

This series is v4 of the replacement of jprobe usage with trace
events. This version is rebased on net-next, fixes a build warning
and moves a temporary variable definition into a block.

The previous version is here:
https://lkml.org/lkml/2017/12/19/153

Changes from v3:
  All: Rebased on net-next
  [3/6]: Fixes a build warning for i386 by casting the pointer to
         unsigned long instead of __u64, and moves a temporary variable
         definition into a block.

Thank you,

---

Masami Hiramatsu (6):
  net: tcp: Add trace events for TCP congestion window tracing
  net: tcp: Remove TCP probe module
  net: sctp: Add SCTP ACK tracking trace event
  net: sctp: Remove debug SCTP probe module
  net: dccp: Add DCCP sendmsg trace event
  net: dccp: Remove dccpprobe module


 include/trace/events/sctp.h |   99 ++
 include/trace/events/tcp.h  |   80 +++
 net/Kconfig |   17 --
 net/dccp/Kconfig|   17 --
 net/dccp/Makefile   |2 
 net/dccp/probe.c|  203 -
 net/dccp/proto.c|5 +
 net/dccp/trace.h|  105 +++
 net/ipv4/Makefile   |1 
 net/ipv4/tcp_input.c|3 
 net/ipv4/tcp_probe.c|  301 ---
 net/sctp/Kconfig|   12 --
 net/sctp/Makefile   |3 
 net/sctp/probe.c|  244 ---
 net/sctp/sm_statefuns.c |5 +
 15 files changed, 297 insertions(+), 800 deletions(-)
 create mode 100644 include/trace/events/sctp.h
 delete mode 100644 net/dccp/probe.c
 create mode 100644 net/dccp/trace.h
 delete mode 100644 net/ipv4/tcp_probe.c
 delete mode 100644 net/sctp/probe.c

--
Masami Hiramatsu (Linaro)

Re: [PATCH] lib: add module unload support to sort tests

2017-12-19 Thread Paul Gortmaker

[Re: [PATCH] lib: add module unload support to sort tests] On 19/12/2017 (Tue 23:10) Pravin Shedge wrote:

> On Tue, Dec 19, 2017 at 3:51 AM, Andrew Morton
>  wrote:
> > On Sun, 17 Dec 2017 15:19:27 +0530 Pravin Shedge 
> >  wrote:
> >
> >> test_sort.c perform array-based and linked list sort test. Code allows to
> >> compile either as a loadable modules or builtin into the kernel.
> >>
> >> Current code is not allow to unload the test_sort.ko module after
> >> successful completion.
> >>
> >> This patch add support to unload the "test_sort.ko" module.
> >>
> >> ...
> >>
> >> --- a/lib/test_sort.c
> >> +++ b/lib/test_sort.c
> >> @@ -13,11 +13,12 @@ static int __init cmpint(const void *a, const void *b)
> >>
> >>  static int __init test_sort_init(void)
> >>  {
> >> - int *a, i, r = 1, err = -ENOMEM;
> >> + int *a, i, r = 1;
> >> + int err = -EAGAIN; /* Fail will directly unload the module */
> >>
> >>   a = kmalloc_array(TEST_LEN, sizeof(*a), GFP_KERNEL);
> >>   if (!a)
> >> - return err;
> >> + return -ENOMEM;
> >>
> >>   for (i = 0; i < TEST_LEN; i++) {
> >>   r = (r * 725861) % 6599;
> >> @@ -26,13 +27,12 @@ static int __init test_sort_init(void)
> >>
> >>   sort(a, TEST_LEN, sizeof(*a), cmpint, NULL);
> >>
> >> - err = -EINVAL;
> >>   for (i = 0; i < TEST_LEN-1; i++)
> >>   if (a[i] > a[i+1]) {
> >>   pr_err("test has failed\n");
> >> + err = -EINVAL;
> >>   goto exit;
> >>   }
> >> - err = 0;
> >>   pr_info("test passed\n");
> >>  exit:
> >>   kfree(a);
> >
> > I'm struggling to understand this.
> >
> > It seems that the current pattern for lib/test_*.c is:
> >
> > - if test fails, module_init() handler returns -EFOO
> > - if test succeeds, module_init() handler returns 0
> >
> 
> Not true for all lib/*.c.
> I am referring to the following modules: lib/percpu_test.c and lib/rbtree_test.c.
> 
> > So the module will be auto-unloaded if it failed and will require an
> > rmmod if it succeeded.
> >
> > Correct?
> Right. There are two approaches that I saw among the modules in lib/*.c.
> A few modules execute a set of test cases from module_init() and, on
> successful completion, unload the module by returning -EAGAIN
> from module_init().

So, I'd make the argument that having two approaches is not ideal.  Start
by considering the two common use cases:

#1 - Fred builds everything in non-module;  boots and checks "dmesg" to
see what passed and failed.  He does not care about unload because the
machine will reboot for the next test in less than five minutes.

#2 - Bob wrote a test suite.  It does "dmesg -c" and loads a single test
and checks dmesg.  It then does an rmmod and restarts with the next module.

If we have two approaches, then Bob has a problem.  He has to track
which module he has to unload and which auto-unload.  Or he
unconditionally does an unload and ignores the error if any.  Which is
bad if the error code is -EBUSY due to dependencies or similar.

The auto-unload might seem like a nice optimization, but it encourages
inconsistent behaviour.  And behaviour that is different from all other
normal modules.

And finally, if the test is one shot, how do you justify leaving it
loaded in the kernel when it passed, but removing it when it failed?
That makes sense for driver probes but not one-shot software tests.

I'd suggest load, run test and wait for external unload trigger for
consistency.  And not to abuse the module_init() return code as a
communication channel for pass/fail.
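
The suggestion above can be sketched in userspace C. This is only a model, not the kernel patch: test_sort_init_model() and last_result are hypothetical names, qsort() stands in for the kernel's sort(), the value of TEST_LEN is assumed, and the "load status" is simply the function's return value. The init path returns nonzero only when it could not run the test at all; the verdict is logged and stored separately.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define TEST_LEN 1000	/* assumed value, for illustration only */

/* Hypothetical: verdict of the last run, kept apart from the load status. */
enum test_result { TEST_NOT_RUN, TEST_PASSED, TEST_FAILED };
static enum test_result last_result = TEST_NOT_RUN;

static int cmpint(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

static bool is_sorted(const int *a, size_t n)
{
	for (size_t i = 0; i + 1 < n; i++)
		if (a[i] > a[i + 1])
			return false;
	return true;
}

/*
 * Model of an init handler: a nonzero return means "could not run",
 * never "the test failed".
 */
static int test_sort_init_model(void)
{
	unsigned long long r = 1;	/* wide type avoids signed overflow */
	int *a = malloc(TEST_LEN * sizeof(*a));

	if (!a)
		return -ENOMEM;

	for (size_t i = 0; i < TEST_LEN; i++) {
		r = (r * 725861) % 6599;
		a[i] = (int)r;
	}

	qsort(a, TEST_LEN, sizeof(*a), cmpint);

	last_result = is_sorted(a, TEST_LEN) ? TEST_PASSED : TEST_FAILED;
	printf("test_sort: %s\n",
	       last_result == TEST_PASSED ? "passed" : "FAILED");

	free(a);
	return 0;	/* stay "loaded" either way; unloading is external */
}
```

With this shape a test runner can always load, read the log, and unload, without tracking which modules remove themselves.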

> 
> That code can be compiled either into the kernel or as a module.
> That means those test cases execute at boot time.
> 
> Help from the make menuconfig for CONFIG_TEST_LIST_SORT shows:
> 
> "Enable this to turn on 'list_sort()' function test.
> This test is executed only once during system boot (so affects only
> boot time), or at module load time."
> 
> If the test case only affects boot time or module load time, it's a
> smart decision to unload the module automatically on successful
> completion.

See above - I don't think it is smart, and the choice of which one
stays and which one auto-unloads is arbitrary and inconsistent.

Imagine something as simple as this:

for i in $LIST ; do
modprobe $i
lsmod | grep -q $i
if [ $? != 0 ]; then echo Module $i is not present! ; fi
done

OK, not ideal code, ignoring the modprobe return -- but what it reports
is true -- your test module (if it passed) will NOT be present.

> 
> >
> > And it appears that lib/test_sort.c currently implements the above.
> > And that your patch changes it so that the module_init() handler always
> > returns -EFOO, so the module will be auto-unloaded whether or not the
> > testing passed.
> >
> > Correct?
> Right. In either case the test case is going to log its result to
> syslog, whether it fails or succeeds.

Look at "dmesg" after boo

Re: [PATCH 03/15] staging: lustre: replace simple cases of LIBCFS_ALLOC with kzalloc.

2017-12-19 Thread kbuild test robot

Hi NeilBrown,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on staging/staging-testing]
[also build test ERROR on next-20171219]
[cannot apply to v4.15-rc4]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/NeilBrown/staging-lustre-convert-most-LIBCFS-ALLOC-to-k-malloc/20171220-113029
config: i386-randconfig-x009-201751 (attached as .config)
compiler: gcc-7 (Debian 7.2.0-12) 7.2.1 20171025
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c: In function 'kiblnd_dev_failover':
>> drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c:2395:2: error: 'kdev' undeclared (first use in this function); did you mean 'hdev'?
     kdev = kzalloc(sizeof(*hdev), GFP_NOFS);
     ^~~~
     hdev
   drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c:2395:2: note: each undeclared identifier is reported only once for each function it appears in

vim +2395 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c

  2329  
  2330  int kiblnd_dev_failover(struct kib_dev *dev)
  2331  {
  2332  LIST_HEAD(zombie_tpo);
  2333  LIST_HEAD(zombie_ppo);
  2334  LIST_HEAD(zombie_fpo);
  2335  struct rdma_cm_id *cmid  = NULL;
  2336  struct kib_hca_dev *hdev  = NULL;
  2337  struct ib_pd *pd;
  2338  struct kib_net *net;
  2339  struct sockaddr_in addr;
  2340  unsigned long flags;
  2341  int rc = 0;
  2342  int i;
  2343  
  2344  LASSERT(*kiblnd_tunables.kib_dev_failover > 1 ||
  2345  dev->ibd_can_failover || !dev->ibd_hdev);
  2346  
  2347  rc = kiblnd_dev_need_failover(dev);
  2348  if (rc <= 0)
  2349  goto out;
  2350  
  2351  if (dev->ibd_hdev &&
  2352  dev->ibd_hdev->ibh_cmid) {
  2353  /*
  2354   * XXX it's not good to close old listener at here,
  2355   * because we can fail to create new listener.
  2356   * But we have to close it now, otherwise rdma_bind_addr
  2357   * will return EADDRINUSE... How crap!
  2358   */
  2359  write_lock_irqsave(&kiblnd_data.kib_global_lock, flags);
  2360  
  2361  cmid = dev->ibd_hdev->ibh_cmid;
  2362  /*
  2363   * make next schedule of kiblnd_dev_need_failover()
  2364   * return 1 for me
  2365   */
  2366  dev->ibd_hdev->ibh_cmid  = NULL;
  2367  write_unlock_irqrestore(&kiblnd_data.kib_global_lock, flags);
  2368  
  2369  rdma_destroy_id(cmid);
  2370  }
  2371  
  2372  cmid = kiblnd_rdma_create_id(kiblnd_cm_callback, dev, RDMA_PS_TCP,
  2373   IB_QPT_RC);
  2374  if (IS_ERR(cmid)) {
  2375  rc = PTR_ERR(cmid);
  2376  CERROR("Failed to create cmid for failover: %d\n", rc);
  2377  goto out;
  2378  }
  2379  
  2380  memset(&addr, 0, sizeof(addr));
  2381  addr.sin_family  = AF_INET;
  2382  addr.sin_addr.s_addr = htonl(dev->ibd_ifip);
  2383  addr.sin_port   = htons(*kiblnd_tunables.kib_service);
  2384  
  2385  /* Bind to failover device or port */
  2386  rc = rdma_bind_addr(cmid, (struct sockaddr *)&addr);
  2387  if (rc || !cmid->device) {
  2388  CERROR("Failed to bind %s:%pI4h to device(%p): %d\n",
  2389 dev->ibd_ifname, &dev->ibd_ifip,
  2390 cmid->device, rc);
  2391  rdma_destroy_id(cmid);
  2392  goto out;
  2393  }
  2394  
> 2395  kdev = kzalloc(sizeof(*hdev), GFP_NOFS);
  2396  if (!hdev) {
  2397  CERROR("Failed to allocate kib_hca_dev\n");
  2398  rdma_destroy_id(cmid);
  2399  rc = -ENOMEM;
  2400  goto out;
  2401  }
  2402  
  2403  atomic_set(&hdev->ibh_ref, 1);
  2404  hdev->ibh_dev   = dev;
  2405  hdev->ibh_cmid  = cmid;
  2406  hdev->ibh_ibdev = cmid->device;
  2407  
  2408  pd = ib_alloc_pd(cmid->device, 0);
  2409  if (IS_ERR(pd)) {
  2410  rc = PTR_ERR(pd);
  2411  CERROR("Can't allocate PD: %d\n", rc);
  2412  goto out;
  2413  }
  2414  
  2415  hdev->ibh_pd = pd;
  2416  
  2417  rc = rdma_listen(cmid, 0);

Re: [PATCH 3/4] sched: Comment on why sync wakeups try to run on the current CPU

2017-12-19 Thread Mike Galbraith

On Tue, 2017-12-19 at 20:06 +0100, Peter Zijlstra wrote:
> 
> Our SYNC hint does promise the caller will go away 'soon', although I'm
> not sure how many of the current users actually honor that.

The sync hint is not a lie, or even a damn lie, it's a statistic :) 
It's very useful for...

TCP_SENDFILE
homer:..debug/tracing # cat trace|grep netperf|grep wakes|wc -l
2417
homer:..debug/tracing # cat trace|grep netperf|grep schedules|wc -l
4
TCP_STREAM
homer:..debug/tracing # cat trace|grep netperf|grep wakes|wc -l
2506
homer:..debug/tracing # cat trace|grep netperf|grep schedules|wc -l
3
TCP_MAERTS
homer:..debug/tracing # cat trace|grep netperf|grep wakes|wc -l
2465
homer:..debug/tracing # cat trace|grep netperf|grep schedules|wc -l
2

...knowing that tasks are talking, but not the least bit useful for
scheduler decisions other than "pull to same planet".  Those are single
instances, all of which exceed 180% combined util, one at 100%.  Then
there are multi-wakers, tasks that would have scheduled if they hadn't
been given more work to do etc etc.  Nope, stacking based upon that
hint is most definitely not a good idea :)

-Mike

Re: [PATCH 2/4] sched: cpufreq: Keep track of cpufreq utilization update flags

2017-12-19 Thread Viresh Kumar

On 19-12-17, 20:25, Peter Zijlstra wrote:
> Yeah, not happy about this either; we had code that did the right thing
> without this extra tracking I think.

Sure, but how do you suggest we fix the problems we are facing with
the current design? Patrick had a completely different proposal for
solving those problems, which I didn't like very much. This patchset
replaced these patches from Patrick:

- [PATCH v3 1/6] cpufreq: schedutil: reset sg_cpus's flags at IDLE enter
  https://marc.info/?l=linux-kernel&m=151204247801633&w=2

- [PATCH v3 2/6] cpufreq: schedutil: ensure max frequency while
  running RT/DL tasks
  https://marc.info/?l=linux-kernel&m=151204253801657&w=2

- [PATCH v3 6/6] cpufreq: schedutil: ignore sugov kthreads
  https://marc.info/?l=linux-kernel&m=151204251501647&w=2

> Also, we can look at the rq state if we want to, we don't need to
> duplicate that state.

Well that also looks fine to me, and that would mean this:

- We remove SCHED_CPUFREQ_RT and SCHED_CPUFREQ_DL flags, but still
  call the utilization callbacks from RT and DL classes.

- From the utilization handler, we check runqueues of all three sched
  classes to see if they have some work pending (this can be done
  smartly by checking only RT first and skipping other checks if RT
  has some work).

Will that be acceptable ?

-- 
viresh
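
The check order proposed above can be sketched as plain C. This is a minimal userspace model under stated assumptions: struct rq_model, its fields, and first_pending_class() are hypothetical stand-ins and not the scheduler's real data structures, and the DL-before-CFS order after the RT check is an assumption. What the governor does with the answer is deliberately left out.

```c
/* Hypothetical, simplified stand-in for one CPU's runqueue state. */
struct rq_model {
	unsigned int rt_nr_running;	/* runnable RT tasks */
	unsigned int dl_nr_running;	/* runnable deadline tasks */
	unsigned int cfs_nr_running;	/* runnable CFS tasks */
};

enum pending_class { PENDING_NONE, PENDING_RT, PENDING_DL, PENDING_CFS };

/*
 * Look at the runqueue state instead of cached per-class flags.
 * RT is checked first; if it has work, the other checks are skipped.
 */
static enum pending_class first_pending_class(const struct rq_model *rq)
{
	if (rq->rt_nr_running)
		return PENDING_RT;
	if (rq->dl_nr_running)
		return PENDING_DL;
	if (rq->cfs_nr_running)
		return PENDING_CFS;
	return PENDING_NONE;
}
```

Because the answer is derived from the runqueue each time, there is no separate flag state that could go stale.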

Re: BUG: bad usercopy in memdup_user

2017-12-19 Thread Linus Torvalds

On Tue, Dec 19, 2017 at 7:50 PM, Matthew Wilcox  wrote:
> On Tue, Dec 19, 2017 at 09:48:49PM +, Al Viro wrote:
>> Well, for example seeing a 0xfff4 where a pointer to object
>> must have been is a pretty strong hint to start looking for a way for
>> that ERR_PTR(-ENOMEM) having ended up there...  Something like
>> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
>> pathname overwriting whatever it ends up in, etc.  And yes, I have run
>> into both of those in real life.
>>
>> Debugging the situation when crap value has ended up in place of a
>> pointer is certainly a case where you do want to see what exactly has
>> ended up in there...
>
> Linus, how would you feel about printing ERR_PTRs without molestation?

Stop this stupidity already.

Guys, seriously. You're making idiotic arguments that have nothing to
do with reality.

If you actually have an invalid pointer that causes an oops (whether
it's an ERR_PTR or something like 0x6e69622f7273752f or something like
the list poison values etc),

  WE ALREADY PRINT OUT THE WHOLE UNHASHED POINTER VALUE

This "but but but some pointers are magic and shouldn't be hashed"
stuff is *garbage*. You're making shit up. You don't have a single
actual real-life example that you can point to that is relevant.

Again, I've seen those bad pointer oopses myself. Yes, the magic
values are relevant, and should be printed out.

BUT THEY ALREADY ARE PRINTED OUT.

Christ.

So let me repeat:

 - don't change %p behavior.

 - don't use "I was confused" as an argument. Yes, things changed, and
yes, it clearly caused confusion, but that is temporary and is not an
argument for making magic changes.

 - don't make up some garbage theoretical example: give a hard example
of output that actually didn't have enough information.

And yes, we'll just replace the %p with %px when that last situation
holds. Really. Really really.

But it needs to be a real example, not a "what if" that doesn't make sense.

Not some pet theory that doesn't hold water.

This whole "what if it was a poison pointer" argument is a _prime_
example of pure and utter garbage.

If we have an oops, and it was due to a poison value or an err-pointer
that we dereferenced, we will *see* the poison value.

It will be right there in the register state.

It will be right there in the bad address.

It will be quite visible.

And yes, we had a few cases where the hashing actually did hide the
values, and I've been applying patches to turn those from %p to %px.

But it had better be actual real cases, and real thought out
situations. Not "let's just randomly pick values that we don't hash".

Linus
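
The two kinds of "magic" values discussed in this thread can be decoded by hand; the sketch below does it in userspace C. It is only an illustration: u64_to_le_bytes() and u64_as_errno() are hypothetical helper names, MAX_ERRNO_MODEL mirrors the kernel's MAX_ERRNO of 4095, and a 64-bit machine that stores pointers little-endian is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095	/* mirrors the kernel's MAX_ERRNO */

/*
 * Write the 8 bytes of 'value' to 'out', lowest byte first (the order
 * a little-endian machine stores them in memory), then NUL-terminate.
 */
static void u64_to_le_bytes(uint64_t value, char out[9])
{
	for (size_t i = 0; i < 8; i++)
		out[i] = (char)((value >> (8 * i)) & 0xff);
	out[8] = '\0';
}

/*
 * If 'value' lies in the top MAX_ERRNO_MODEL values of the address
 * space (the ERR_PTR range), return the positive errno it encodes;
 * otherwise return 0.
 */
static int u64_as_errno(uint64_t value)
{
	const uint64_t lowest_err = UINT64_MAX - MAX_ERRNO_MODEL + 1;

	if (value >= lowest_err)
		return (int)(0 - value);
	return 0;
}
```

Feeding 0x6e69622f7273752f to the first helper yields the bytes of "/usr/bin", and 0xfffffffffffffff4 decodes to 12, which is ENOMEM on Linux.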

[RESEND][PATCH] tools/power: Don't make man pages executable

2017-12-19 Thread Laura Abbott

rpm-lint flagged these as being executable:

kernel-tools.x86_64: W: spurious-executable-perm 
/usr/share/man/man8/turbostat.8.gz
kernel-tools.x86_64: W: spurious-executable-perm 
/usr/share/man/man8/x86_energy_perf_policy.8.gz

Fix this

Signed-off-by: Laura Abbott 
---
Resent for linux-pm cc
---
 tools/power/x86/turbostat/Makefile  | 2 +-
 tools/power/x86/x86_energy_perf_policy/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile
index a9bc914a8fe8..2ab25aa38263 100644
--- a/tools/power/x86/turbostat/Makefile
+++ b/tools/power/x86/turbostat/Makefile
@@ -25,4 +25,4 @@ install : turbostat
install -d  $(DESTDIR)$(PREFIX)/bin
install $(BUILD_OUTPUT)/turbostat $(DESTDIR)$(PREFIX)/bin/turbostat
install -d  $(DESTDIR)$(PREFIX)/share/man/man8
-   install turbostat.8 $(DESTDIR)$(PREFIX)/share/man/man8
+   install -m 644 turbostat.8 $(DESTDIR)$(PREFIX)/share/man/man8
diff --git a/tools/power/x86/x86_energy_perf_policy/Makefile b/tools/power/x86/x86_energy_perf_policy/Makefile
index 2447b1bbaacf..f4534fb8b951 100644
--- a/tools/power/x86/x86_energy_perf_policy/Makefile
+++ b/tools/power/x86/x86_energy_perf_policy/Makefile
@@ -24,5 +24,5 @@ install : x86_energy_perf_policy
install -d  $(DESTDIR)$(PREFIX)/bin
install $(BUILD_OUTPUT)/x86_energy_perf_policy $(DESTDIR)$(PREFIX)/bin/x86_energy_perf_policy
install -d  $(DESTDIR)$(PREFIX)/share/man/man8
-   install x86_energy_perf_policy.8 $(DESTDIR)$(PREFIX)/share/man/man8
+   install -m 644 x86_energy_perf_policy.8 $(DESTDIR)$(PREFIX)/share/man/man8
 
-- 
2.14.3

[PATCH v15 4/5] watchdog: Add RAVE SP watchdog driver

2017-12-19 Thread Andrey Smirnov

This driver provides access to RAVE SP watchdog functionality.

Cc: linux-kernel@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Cc: Greg Kroah-Hartman 
Cc: Pavel Machek 
Cc: Andy Shevchenko 
Cc: Guenter Roeck 
Cc: Rob Herring 
Cc: Johan Hovold 
Cc: Sebastian Reichel 
Acked-by: Pavel Machek 
Reviewed-by: Guenter Roeck 
Signed-off-by: Nikita Yushchenko 
Signed-off-by: Andrey Smirnov 
---
 drivers/watchdog/Kconfig   |   7 +
 drivers/watchdog/Makefile  |   1 +
 drivers/watchdog/rave-sp-wdt.c | 348 +
 3 files changed, 356 insertions(+)
 create mode 100644 drivers/watchdog/rave-sp-wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index ca200d1f310a..5bf613d3b7d6 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -223,6 +223,13 @@ config ZIIRAVE_WATCHDOG
  To compile this driver as a module, choose M here: the
  module will be called ziirave_wdt.
 
+config RAVE_SP_WATCHDOG
+   tristate "RAVE SP Watchdog timer"
+   depends on RAVE_SP_CORE
+   select WATCHDOG_CORE
+   help
+ Support for the watchdog on RAVE SP device.
+
 # ALPHA Architecture
 
 # ARM Architecture
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 715a21078e0c..135c5e81f25e 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -224,3 +224,4 @@ obj-$(CONFIG_MAX77620_WATCHDOG) += max77620_wdt.o
 obj-$(CONFIG_ZIIRAVE_WATCHDOG) += ziirave_wdt.o
 obj-$(CONFIG_SOFT_WATCHDOG) += softdog.o
 obj-$(CONFIG_MENF21BMC_WATCHDOG) += menf21bmc_wdt.o
+obj-$(CONFIG_RAVE_SP_WATCHDOG) += rave-sp-wdt.o
diff --git a/drivers/watchdog/rave-sp-wdt.c b/drivers/watchdog/rave-sp-wdt.c
new file mode 100644
index ..f70961eade29
--- /dev/null
+++ b/drivers/watchdog/rave-sp-wdt.c
@@ -0,0 +1,348 @@
+/*
+ *  rave-sp-wdt.c - Watchdog driver present in RAVE SP
+ *
+ * Copyright (C) 2017 Zodiac Inflight Innovation
+ *
+ * Driver for parent device can be found in drivers/mfd/rave-sp.c
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+enum {
+   RAVE_SP_RESET_BYTE = 1,
+   RAVE_SP_RESET_REASON_NORMAL = 0,
+   RAVE_SP_RESET_DELAY_MS = 500,
+};
+
+/**
+ * struct rave_sp_wdt_variant - RAVE SP watchdog variant
+ *
+ * @max_timeout:   Largest possible watchdog timeout setting
+ * @min_timeout:   Smallest possible watchdog timeout setting
+ *
+ * @configure: Function to send configuration command
+ * @restart:   Function to send "restart" command
+ */
+struct rave_sp_wdt_variant {
+   unsigned int max_timeout;
+   unsigned int min_timeout;
+
+   int (*configure)(struct watchdog_device *, bool);
+   int (*restart)(struct watchdog_device *);
+};
+
+/**
+ * struct rave_sp_wdt - RAVE SP watchdog
+ *
+ * @wdd:   Underlying watchdog device
+ * @sp:    Pointer to parent RAVE SP device
+ * @variant:   Device specific variant information
+ * @reboot_notifier:   Reboot notifier implementing machine reset
+ */
+struct rave_sp_wdt {
+   struct watchdog_device wdd;
+   struct rave_sp *sp;
+   const struct rave_sp_wdt_variant *variant;
+   struct notifier_block reboot_notifier;
+};
+
+static struct rave_sp_wdt *to_rave_sp_wdt(struct watchdog_device *wdd)
+{
+   return container_of(wdd, struct rave_sp_wdt, wdd);
+}
+
+static int rave_sp_wdt_exec(struct watchdog_device *wdd, void *data,
+   size_t data_size)
+{
+   return rave_sp_exec(to_rave_sp_wdt(wdd)->sp,
+   data, data_size, NULL, 0);
+}
+
+static int rave_sp_wdt_legacy_configure(struct watchdog_device *wdd, bool on)
+{
+   u8 cmd[] = {
+   [0] = RAVE_SP_CMD_SW_WDT,
+   [1] = 0,
+   [2] = 0,
+   [3] = on,
+   [4] = on ? wdd->timeout : 0,
+   };
+
+   return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+static int rave_sp_wdt_rdu_configure(struct watchdog_device *wdd, bool on)
+{
+   u8 cmd[] = {
+   [0] = RAVE_SP_CMD_SW_WDT,
+   [1] = 0,
+   [2] = on,
+   [3] = (u8)wdd->timeout,
+

[PATCH v15 2/5] serdev: Introduce devm_serdev_device_open()

2017-12-19 Thread Andrey Smirnov

Add code implementing managed version of serdev_device_open() for
serdev device drivers that "open" the device during driver's lifecycle
only once (e.g. opened in .probe() and closed in .remove()).

Cc: linux-kernel@vger.kernel.org
Cc: linux-ser...@vger.kernel.org
Cc: Rob Herring 
Cc: cphe...@gmail.com
Cc: Guenter Roeck 
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Cc: Greg Kroah-Hartman 
Cc: Pavel Machek 
Cc: Andy Shevchenko 
Cc: Johan Hovold 
Cc: Sebastian Reichel 
Acked-by: Pavel Machek 
Acked-by: Rob Herring 
Reviewed-by: Sebastian Reichel 
Reviewed-by: Guenter Roeck 
Signed-off-by: Andrey Smirnov 
---
 Documentation/driver-model/devres.txt |  3 +++
 drivers/tty/serdev/core.c | 27 +++
 include/linux/serdev.h|  1 +
 3 files changed, 31 insertions(+)

diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt
index c180045eb43b..7c1bb3d0c222 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -384,6 +384,9 @@ RESET
   devm_reset_control_get()
   devm_reset_controller_register()
 
+SERDEV
+  devm_serdev_device_open()
+
 SLAVE DMA ENGINE
   devm_acpi_dma_controller_register()
 
diff --git a/drivers/tty/serdev/core.c b/drivers/tty/serdev/core.c
index 34050b439c1f..28133dbd2808 100644
--- a/drivers/tty/serdev/core.c
+++ b/drivers/tty/serdev/core.c
@@ -132,6 +132,33 @@ void serdev_device_close(struct serdev_device *serdev)
 }
 EXPORT_SYMBOL_GPL(serdev_device_close);
 
+static void devm_serdev_device_release(struct device *dev, void *dr)
+{
+   serdev_device_close(*(struct serdev_device **)dr);
+}
+
+int devm_serdev_device_open(struct device *dev, struct serdev_device *serdev)
+{
+   struct serdev_device **dr;
+   int ret;
+
+   dr = devres_alloc(devm_serdev_device_release, sizeof(*dr), GFP_KERNEL);
+   if (!dr)
+   return -ENOMEM;
+
+   ret = serdev_device_open(serdev);
+   if (ret) {
+   devres_free(dr);
+   return ret;
+   }
+
+   *dr = serdev;
+   devres_add(dev, dr);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(devm_serdev_device_open);
+
 void serdev_device_write_wakeup(struct serdev_device *serdev)
 {
complete(&serdev->write_comp);
diff --git a/include/linux/serdev.h b/include/linux/serdev.h
index e69402d4a8ae..9929063bd45d 100644
--- a/include/linux/serdev.h
+++ b/include/linux/serdev.h
@@ -193,6 +193,7 @@ static inline int serdev_controller_receive_buf(struct serdev_controller *ctrl,
 
 int serdev_device_open(struct serdev_device *);
 void serdev_device_close(struct serdev_device *);
+int devm_serdev_device_open(struct device *, struct serdev_device *);
 unsigned int serdev_device_set_baudrate(struct serdev_device *, unsigned int);
 void serdev_device_set_flow_control(struct serdev_device *, bool);
 int serdev_device_write_buf(struct serdev_device *, const unsigned char *, size_t);
-- 
2.14.3
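
The open-then-register pattern in this patch can be modelled in userspace C. The sketch below is a toy, not the kernel's devres implementation: struct dev_model, struct port_model, managed_port_open() and dev_release_all() are all hypothetical names, and malloc()/free() stand in for devres_alloc()/devres_free(). It shows the same ordering as the patch: allocate the bookkeeping node first, open, and only register the release action once the open has succeeded.

```c
#include <errno.h>
#include <stdlib.h>

/* One registered release action (toy counterpart of a devres entry). */
struct res_node {
	void (*release)(void *data);
	void *data;
	struct res_node *next;
};

/* Toy device: just a list of release actions, newest first. */
struct dev_model {
	struct res_node *resources;
};

/* Toy port whose open/close calls must stay balanced. */
struct port_model {
	int open_count;
};

static int port_open(struct port_model *port)
{
	port->open_count++;
	return 0;
}

static void port_close(void *data)
{
	((struct port_model *)data)->open_count--;
}

/* Open the port and tie its close to the lifetime of 'dev'. */
static int managed_port_open(struct dev_model *dev, struct port_model *port)
{
	struct res_node *node = malloc(sizeof(*node));
	int ret;

	if (!node)
		return -ENOMEM;

	ret = port_open(port);
	if (ret) {
		free(node);	/* nothing was opened, nothing to release */
		return ret;
	}

	node->release = port_close;
	node->data = port;
	node->next = dev->resources;
	dev->resources = node;
	return 0;
}

/* What unbinding would do: run release actions, newest first. */
static void dev_release_all(struct dev_model *dev)
{
	while (dev->resources) {
		struct res_node *node = dev->resources;

		dev->resources = node->next;
		node->release(node->data);
		free(node);
	}
}
```

A driver built this way needs no close call in its own teardown path, which is what lets a driver get by without a .remove callback.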

[PATCH v15 1/5] serdev: Make .remove in struct serdev_device_driver optional

2017-12-19 Thread Andrey Smirnov

Using devres infrastructure it is possible to write a serdev driver
that doesn't have any code that needs to be called as a part of
.remove. Add code to make .remove optional.

Cc: linux-kernel@vger.kernel.org
Cc: linux-ser...@vger.kernel.org
Cc: Rob Herring 
Cc: cphe...@gmail.com
Cc: Guenter Roeck 
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Cc: Greg Kroah-Hartman 
Cc: Pavel Machek 
Cc: Andy Shevchenko 
Cc: Johan Hovold 
Cc: Sebastian Reichel 
Acked-by: Pavel Machek 
Acked-by: Rob Herring 
Reviewed-by: Sebastian Reichel 
Reviewed-by: Guenter Roeck 
Signed-off-by: Andrey Smirnov 
---
 drivers/tty/serdev/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serdev/core.c b/drivers/tty/serdev/core.c
index 1bef39828ca7..34050b439c1f 100644
--- a/drivers/tty/serdev/core.c
+++ b/drivers/tty/serdev/core.c
@@ -268,8 +268,8 @@ static int serdev_drv_probe(struct device *dev)
 static int serdev_drv_remove(struct device *dev)
 {
const struct serdev_device_driver *sdrv = to_serdev_device_driver(dev->driver);
-
-   sdrv->remove(to_serdev_device(dev));
+   if (sdrv->remove)
+   sdrv->remove(to_serdev_device(dev));
return 0;
 }
 
-- 
2.14.3

[PATCH v15 5/5] dt-bindings: watchdog: Add bindings for RAVE SP watchdog driver

2017-12-19 Thread Andrey Smirnov

Add Device Tree bindings for RAVE SP watchdog driver - an MFD cell of
parent RAVE SP driver (documented in
Documentation/devicetree/bindings/mfd/zii,rave-sp.txt).

Cc: linux-kernel@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-watch...@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Cc: Greg Kroah-Hartman 
Cc: Pavel Machek 
Cc: Andy Shevchenko 
Cc: Guenter Roeck 
Cc: Rob Herring 
Cc: Johan Hovold 
Cc: Mark Rutland 
Cc: Sebastian Reichel 
Acked-by: Pavel Machek 
Acked-by: Rob Herring 
Reviewed-by: Guenter Roeck 
Signed-off-by: Nikita Yushchenko 
Signed-off-by: Andrey Smirnov 
---
 .../bindings/watchdog/zii,rave-sp-wdt.txt  | 39 ++
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt

diff --git a/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt
new file mode 100644
index ..3de96186e92e
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt
@@ -0,0 +1,39 @@
+Zodiac Inflight Innovations RAVE Supervisory Processor Watchdog Bindings
+
+RAVE SP watchdog device is a "MFD cell" device corresponding to
+watchdog functionality of RAVE Supervisory Processor. It is expected
+that its Device Tree node is specified as a child of the node
+corresponding to the parent RAVE SP device (as documented in
+Documentation/devicetree/bindings/mfd/zii,rave-sp.txt)
+
+Required properties:
+
+- compatible: Depending on wire protocol implemented by RAVE SP
+  firmware, should be one of:
+   - "zii,rave-sp-watchdog"
+   - "zii,rave-sp-watchdog-legacy"
+
+Optional properties:
+
+- wdt-timeout: Two byte nvmem cell specified as per
+   Documentation/devicetree/bindings/nvmem/nvmem.txt
+
+Example:
+
+   rave-sp {
+   compatible = "zii,rave-sp-rdu1";
+   current-speed = <38400>;
+
+   eeprom {
+   wdt_timeout: wdt-timeout@8E {
+   reg = <0x8E 2>;
+   };
+   };
+
+   watchdog {
+   compatible = "zii,rave-sp-watchdog";
+   nvmem-cells = <&wdt_timeout>;
+   nvmem-cell-names = "wdt-timeout";
+   };
+   }
+
-- 
2.14.3

[PATCH v15 3/5] mfd: Add driver for RAVE Supervisory Processor

2017-12-19 Thread Andrey Smirnov

Add a driver for RAVE Supervisory Processor, an MCU implementing
various bits of housekeeping functionality (watchdoging, backlight
control, LED control, etc) on RAVE family of products by Zodiac
Inflight Innovations.

This driver implements core MFD/serdev device as well as
communication subroutines necessary for commanding the device.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Cc: Greg Kroah-Hartman 
Cc: Pavel Machek 
Cc: Andy Shevchenko 
Cc: Guenter Roeck 
Cc: Rob Herring 
Cc: Johan Hovold 
Cc: Sebastian Reichel 
Tested-by: Chris Healy 
Acked-for-MFD-by: Lee Jones 
Acked-by: Pavel Machek 
Reviewed-by: Guenter Roeck 
Reviewed-by: Andy Shevchenko 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/Kconfig |   8 +
 drivers/mfd/Makefile|   2 +
 drivers/mfd/rave-sp.c   | 720 
 include/linux/mfd/rave-sp.h |  52 
 4 files changed, 782 insertions(+)
 create mode 100644 drivers/mfd/rave-sp.c
 create mode 100644 include/linux/mfd/rave-sp.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 1d20a800e967..ec90d408bfa9 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -1859,5 +1859,13 @@ config MFD_VEXPRESS_SYSREG
  System Registers are the platform configuration block
  on the ARM Ltd. Versatile Express board.
 
+config RAVE_SP_CORE
+   tristate "RAVE SP MCU core driver"
+   depends on SERIAL_DEV_BUS
+   select CRC_CCITT
+   help
+ Select this to get support for the Supervisory Processor
+ device found on several devices in RAVE line of hardware.
+
 endmenu
 endif
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index d9474ade32e6..61abc297b97c 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -230,3 +230,5 @@ obj-$(CONFIG_MFD_STM32_LPTIMER) += stm32-lptimer.o
 obj-$(CONFIG_MFD_STM32_TIMERS) += stm32-timers.o
 obj-$(CONFIG_MFD_MXS_LRADC) += mxs-lradc.o
 obj-$(CONFIG_MFD_SC27XX_PMIC)  += sprd-sc27xx-spi.o
+obj-$(CONFIG_RAVE_SP_CORE) += rave-sp.o
+
diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
new file mode 100644
index ..ccf5d110
--- /dev/null
+++ b/drivers/mfd/rave-sp.c
@@ -0,0 +1,720 @@
+/*
+ * Multifunction core driver for Zodiac Inflight Innovations
+ * SP MCU that is connected via dedicated UART port
+ *
+ * Copyright (C) 2017 Zodiac Inflight Innovations
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * UART protocol using following entities:
+ *  - message to MCU => ACK response
+ *  - event from MCU => event ACK
+ *
+ * Frame structure:
+ * <STX> <DATA> <CHECKSUM> <ETX>
+ * Where:
+ * - STX - is start of transmission character
+ * - ETX - end of transmission
+ * - DATA - payload
+ * - CHECKSUM - checksum calculated on <DATA>
+ *
+ * If <DATA> or <CHECKSUM> contain one of control characters, then it is
+ * escaped using <DLE> control code. Added <DLE> does not participate in
+ * checksum calculation.
+ */
+#define RAVE_SP_STX            0x02
+#define RAVE_SP_ETX            0x03
+#define RAVE_SP_DLE            0x10
+
+#define RAVE_SP_MAX_DATA_SIZE  64
+#define RAVE_SP_CHECKSUM_SIZE  2  /* Worst case scenario on RDU2 */
+/*
+ * We don't store STX, ETX and unescaped bytes, so Rx is only
+ * DATA + CSUM
+ */
+#define RAVE_SP_RX_BUFFER_SIZE \
+   (RAVE_SP_MAX_DATA_SIZE + RAVE_SP_CHECKSUM_SIZE)
+
+#define RAVE_SP_STX_ETX_SIZE   2
+/*
+ * For Tx we have to have space for everything, STX, ETX and
+ * potentially stuffed DATA + CSUM data + csum
+ */
+#define RAVE_SP_TX_BUFFER_SIZE \
+   (RAVE_SP_STX_ETX_SIZE + 2 * RAVE_SP_RX_BUFFER_SIZE)
+
+#define RAVE_SP_BOOT_SOURCE_GET        0
+#define RAVE_SP_BOOT_SOURCE_SET        1
+
+#define RAVE_SP_RDU2_BOARD_TYPE_RMB    0
+#define RAVE_SP_RDU2_BOARD_TYPE_DEB    1
+
+#define RAVE_SP_BOOT_SOURCE_SD         0
+#define RAVE_SP_BOOT_SOURCE_EMMC       1
+#define RAVE_SP_BOOT_SOURCE_NOR        2
+
+/**
+ * enum rave_sp_deframer_state - Possible state for de-framer
+ *
+ * @RAVE_SP_EXPECT_SOF: Scanning input for start-of-frame marker
+ * @RAVE_SP_EXPECT_DATA:
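
The framing described in the comment block of this patch can be sketched in userspace C. This is a hedged illustration, not the driver's code: build_frame() and put_escaped() are hypothetical names, the frame is assumed to be STX, payload, checksum, ETX in that order, escaping is assumed to mean "prefix the byte with DLE and send the byte unchanged", and the checksum is passed in as opaque, already-computed bytes so that no particular checksum algorithm is implied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STX 0x02	/* start of transmission */
#define ETX 0x03	/* end of transmission */
#define DLE 0x10	/* escape marker */

static bool is_control(uint8_t b)
{
	return b == STX || b == ETX || b == DLE;
}

/*
 * Append 'n' bytes from 'src' at out[pos], prefixing control bytes
 * with DLE. Returns the new position, or 0 if 'out' is too small
 * (callers always pass pos >= 1, so 0 is unambiguous).
 */
static size_t put_escaped(uint8_t *out, size_t pos, size_t cap,
			  const uint8_t *src, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (is_control(src[i])) {
			if (pos >= cap)
				return 0;
			out[pos++] = DLE;
		}
		if (pos >= cap)
			return 0;
		out[pos++] = src[i];
	}
	return pos;
}

/*
 * Build STX + escaped DATA + escaped CHECKSUM + ETX into 'out'.
 * 'csum' must have been computed over the unescaped payload.
 * Returns the frame length, or 0 if 'out' is too small.
 */
static size_t build_frame(uint8_t *out, size_t cap,
			  const uint8_t *data, size_t data_len,
			  const uint8_t *csum, size_t csum_len)
{
	size_t pos = 0;

	if (cap < 2)
		return 0;
	out[pos++] = STX;

	pos = put_escaped(out, pos, cap, data, data_len);
	if (!pos)
		return 0;
	pos = put_escaped(out, pos, cap, csum, csum_len);
	if (!pos)
		return 0;

	if (pos >= cap)
		return 0;
	out[pos++] = ETX;
	return pos;
}
```

In the worst case every payload and checksum byte is escaped, which is why the patch sizes its Tx buffer as RAVE_SP_STX_ETX_SIZE + 2 * RAVE_SP_RX_BUFFER_SIZE.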

[PATCH v15 0/5] ZII RAVE platform driver

2017-12-19 Thread Andrey Smirnov

Everyone:

This patch series is v15 of the driver for supervisory processor found
on RAVE series of devices from ZII. Supervisory processor is a PIC
microcontroller connected to various electrical subsystems on RAVE
devices whose firmware implements protocol to command/query them.

NOTE:

 * This driver depends on crc_ccitt_false(), added by
   2da9378d531f8cc6670c7497f20d936b706ab80b in 'linux-next', the patch
   was pulled in by Andrew Morton and is currently awaiting users, so
   this series might have to go in through Andrew's tree


Changes since [v14]:

- Fixed a bug in deframer code where byte processing loop was not
  being terminated early as it should've been. This would result,
  among other things, in packets of maximum valid length being
  incorrectly reported as too long.

- Increased command timeout value to support other valid commands
  that are outside the scope of this patch set.

- Converted watchdog driver to differentiate between variants
  based on its own compatibility string as opposed to relying on
  that of parent MFD device (as per request by Johan and Lee).

  NOTE: This change didn't seem to change DT bindings enough to
  warrant dropping any Acks for patches affected, so I kept
  them. If anyone wants to rescind their Ack, please let me know.

- Collected Acked-by from Pavel

- Collected Acked-by from Lee (for patch 3/5)

Changes since [v13]:

- Fixed incorrect MFD driver menuconfig entry placement

Changes since [v12]:

- Minor comment inconsistencies fixes in rave-sp.c

Changes since [v11]:

- Fix incorrect include in rave-sp-wdt.c as uncovered by kernel
  test robot

Changes since [v10]:

- Collected Acked-by from Rob and Reviewed-by from Guenter

- Incorporated watchdog driver feedback from Guenter and Johan

- Incorporated Johan's feedback for the rest of the code

Changes since [v9]:

- Converted watchdog driver to use watchdog_active() instead of
  watchdog_hw_running() and replaced WARN_ON with a regular error
  message as per feedback from Guenter

- Changed rave_sp_wdt_start() to set WDOG_HW_RUNNING only if
  communicating with hardware was successful

- Collected Reviewed-by from Sebastian (for serdev related patches)

- Collected Acked-by from Rob (for watchdog DT bindings)

Changes since [v8]:

- Driver moved from drivers/platform to drivers/mfd

- Collected Reviewed-by from Guenter (for patches 1, 2 and 3)

- Incorporated feedback from Guenter into watchdog driver

- Incorporated feedback from Rob into watchdog DT bindings

- Removed struct rave_sp_rsp_status, which was a leftover from v5
  -> v6 code removal.

- Fixed minor problems reported by checkpatch

Changes since [v7]:

- Added watchdog driver to the patchset, so it would be easier to
  understand how parent/children drivers are tied together

- Added serdev patches to implement devm_serdev_device_open() and make
  .remove optional

- "Added" missing serdev_device_close() by converting the driver
  to use devm_serdev_device_open()

- Converted the driver to use devm_of_platform_populate()

- Removed needless dependency on MFD_CORE

- Removed dependency on SERIAL_DEV_CTRL_TTYPORT

Changes since [v6]:

- Patch 2/2 has been applied by Lee so it is no longer a part of the series

- Removed all sysfs and debugfs attributes to reduce the scope of
  the driver proposed for inclusion. This is not a critical feature
  and can be added/discussed later.

Changes since [v5]:

- Fixed a build break, introduced by a last minute change in [v5]

- Moved majority of attributes that were exposed over sysfs to debugfs

- Documented remaining sysfs attributes in
  Documentation/ABI/testing/sysfs-platform-rave-sp

Changes since [v4]:

- Replaced usage of DEVICE_ATTR with DEVICE_ATTR_RW

- Fixed a number of warnings produced by the sparse tool

- Incorporated even more feedback from Andy Shevchenko

- Collected Reviewed-by from Andy

Changes since [v3]:

- Re-collected lost Acked-by from Rob

- Incorporated further feedback from Andy Shevchenko

- Dropped useless change (stray newline) to drivers/mfd/Makefile

Changes since [v2]:

- Fixed swapped command codes in rave_sp_common_get_boot_source()
  and rave_sp_common_set_boot_source() revealed by further testing
  of the code

- Incorporated feedback from Andy Shevchenko

Changes since [v1]:

- Updated wording in DT-bindings as per Rob's request.

- Collected Rob's Acked-by for patch 2/2

Feedback is greatly appreciated!

Thanks,
Andrey Smirnov

[v14] lkml.kernel.org/r/20171207162735.25873-1-andrew.smir...@gmail.com
[v13] lkml.kernel.org/r/20171204161118.19558-1-andrew.smir...@gmail.com
[v12] lkml.kernel.org/r/20171109160556.17018-1-andrew.smir...@gmail.com
[v11] lkml.kernel.org/r/20171106152935.16920-1-andrew.smir...@gmail.com
[v10

Re: PROBLEM: 4.15.0-rc3 APIC causes lockups on Core 2 Duo laptop

2017-12-19 Thread Dou Liyang


Hi Thomas,

At 12/20/2017 08:31 AM, Thomas Gleixner wrote:

On Tue, 19 Dec 2017, Alexandru Chirvasitu wrote:


I had never heard of 'bisect' before this casual mention (you might tell
I am a bit out of my depth). I've since applied it to Linus' tree between



bebc608 Linux 4.14 (good)

and

4fbd8d1 Linux 4.15-rc1 (bad)


Is Linus current head 4.15-rc4 bad as well?


[...]


Thanks for doing that bisect, but unfortunately this commit cannot be the
problematic one. It merely adds a config symbol, but it does not change any
code at all. It has no effect whatsoever. So something might have gone
wrong in your bisecting.



Agree.


I CC'ed Dou Liyang. He has changed the early APIC setup code and there has
been an issue reported already. Though I lost track of that. Dou, any

     Is it this one?
               https://marc.info/?l=linux-kernel&m=151188084018443

pointers?



Not sure, but seems the APIC failed to start in that 32-bit system.

I will look into it.

Alex,

Could you give me your .config file and the dmesg log of 4.15.0-rc3?

Thanks,
dou

Re: BUG: bad usercopy in memdup_user

2017-12-19 Thread Matthew Wilcox

On Tue, Dec 19, 2017 at 09:48:49PM +, Al Viro wrote:
> Well, for example seeing a 0xfff4 where a pointer to object
> must have been is a pretty strong hint to start looking for a way for
> that ERR_PTR(-ENOMEM) having ended up there...  Something like
> 0x6e69622f7273752f is almost certainly a misplaced "/usr/bin", i.e. a
> pathname overwriting whatever it ends up in, etc.  And yes, I have run
> into both of those in real life.
> 
> Debugging the situation when crap value has ended up in place of a
> pointer is certainly a case where you do want to see what exactly has
> ended up in there...

Linus, how would you feel about printing ERR_PTRs without molestation?
It's not going to leak any information about the kernel address space
layout.  I'm a little less certain about trying to detect ASCII strings,
but I think this is an improvement.

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01c3957b2de6..c80c60b4b3ef 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1859,6 +1859,9 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 		return string(buf, end, "(null)", spec);
 	}
 
+	if (IS_ERR(ptr))
+		return pointer_string(buf, end, ptr, spec);
+
 	switch (*fmt) {
 	case 'F':
 	case 'f':
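
For readers unfamiliar with the two "crap values" Al mentions, here is a minimal user-space sketch of why they are recognisable. The helpers err_ptr(), is_err() and ptr_err() are stand-ins that mirror the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() from include/linux/err.h; looks_like_ascii() is a hypothetical helper invented for this illustration and is not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Largest errno value that can be encoded in a pointer, as in the kernel. */
#define MAX_ERRNO 4095

/* User-space stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(). */
static inline void *err_ptr(intptr_t error)
{
	return (void *)error;
}

static inline int is_err(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline intptr_t ptr_err(const void *ptr)
{
	return (intptr_t)ptr;
}

/*
 * Hypothetical helper: unpack a 64-bit value byte by byte, lowest byte
 * first (the order it would have in memory on a little-endian machine),
 * and report whether all eight bytes are printable ASCII.
 */
static int looks_like_ascii(uint64_t value, char out[9])
{
	int printable = 1;

	assert(out != NULL);
	for (int i = 0; i < 8; i++) {
		unsigned char c = (value >> (8 * i)) & 0xff;

		out[i] = (char)c;
		if (c < 0x20 || c > 0x7e)
			printable = 0;
	}
	out[8] = '\0';
	return printable;
}
```

With these, is_err(err_ptr(-12)) holds (ENOMEM is 12 on Linux), and looks_like_ascii(0x6e69622f7273752fULL, buf) leaves "/usr/bin" in buf. Neither value says anything about where the kernel is mapped, which is the argument for printing them unhashed.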

Re: [PATCH v6 1/5] clk: Add clock driver for ASPEED BMC SoCs

2017-12-19 Thread Joel Stanley

On Tue, Nov 28, 2017 at 5:49 PM, Joel Stanley  wrote:
> This adds the stub of a driver for the ASPEED SoCs. The clocks are
> defined and the static registration is set up.
>
> Reviewed-by: Andrew Jeffery 
> Signed-off-by: Joel Stanley 
> ---
> v6:
>  - Add SPDX copyright notices
> v5:
>  - Add Andrew's reviewed-by
>  - Make aspeed_gates not initconst to avoid section mismatch warning
> v3:
>  - use named initialisers for aspeed_gates table
>  - fix clocks typo
>  - Move ASPEED_NUM_CLKS to the bottom of the list
>  - Put gates at the start of the list, so we can use them to initialise
>    the aspeed_gates table
>  - Add ASPEED_CLK_SELECTION_2
>  - Set parent of network MAC gates
> ---
>  drivers/clk/Kconfig  |  12 +++
>  drivers/clk/Makefile |   1 +
>  drivers/clk/clk-aspeed.c                 | 139 +++
>  include/dt-bindings/clock/aspeed-clock.h |  44 ++
>  4 files changed, 196 insertions(+)
>  create mode 100644 drivers/clk/clk-aspeed.c
>  create mode 100644 include/dt-bindings/clock/aspeed-clock.h

> diff --git a/include/dt-bindings/clock/aspeed-clock.h b/include/dt-bindings/clock/aspeed-clock.h
> new file mode 100644
> index ..9e170fb9a0da
> --- /dev/null
> +++ b/include/dt-bindings/clock/aspeed-clock.h
> @@ -0,0 +1,44 @@
> +/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */
> +
> +#ifndef DT_BINDINGS_ASPEED_CLOCK_H
> +#define DT_BINDINGS_ASPEED_CLOCK_H
> +
> +#define ASPEED_CLK_GATE_ECLK           0
> +#define ASPEED_CLK_GATE_GCLK           1
> +#define ASPEED_CLK_GATE_MCLK           2
> +#define ASPEED_CLK_GATE_VCLK           3
> +#define ASPEED_CLK_GATE_BCLK           4
> +#define ASPEED_CLK_GATE_DCLK           5
> +#define ASPEED_CLK_GATE_REFCLK         6
> +#define ASPEED_CLK_GATE_USBPORT2CLK    7
> +#define ASPEED_CLK_GATE_LCLK           8
> +#define ASPEED_CLK_GATE_USBUHCICLK     9
> +#define ASPEED_CLK_GATE_D1CLK          10
> +#define ASPEED_CLK_GATE_YCLK           11
> +#define ASPEED_CLK_GATE_USBPORT1CLK    12
> +#define ASPEED_CLK_GATE_UART1CLK       13
> +#define ASPEED_CLK_GATE_UART2CLK       14
> +#define ASPEED_CLK_GATE_UART5CLK       15
> +#define ASPEED_CLK_GATE_ESPICLK        16
> +#define ASPEED_CLK_GATE_MAC1CLK        17
> +#define ASPEED_CLK_GATE_MAC2CLK        18
> +#define ASPEED_CLK_GATE_RSACLK         19
> +#define ASPEED_CLK_GATE_UART3CLK       20
> +#define ASPEED_CLK_GATE_UART4CLK       21
> +#define ASPEED_CLK_GATE_SDCLKCLK       22
> +#define ASPEED_CLK_GATE_LHCCLK         23
> +#define ASPEED_CLK_HPLL                24
> +#define ASPEED_CLK_AHB                 25
> +#define ASPEED_CLK_APB                 26
> +#define ASPEED_CLK_UART                27
> +#define ASPEED_CLK_SDIO                28
> +#define ASPEED_CLK_ECLK                29
> +#define ASPEED_CLK_ECLK_MUX            30
> +#define ASPEED_CLK_LHCLK               31
> +#define ASPEED_CLK_MAC                 32
> +#define ASPEED_CLK_BCLK                33
> +#define ASPEED_CLK_MPLL                34
> +
> +#define ASPEED_NUM_CLKS                35

In reviewing this change as part of some ASPEED device tree changes,
we moved the ASPEED_NUM_CLKS define out of this header (as we do not
want it to be part of the ABI) and into the C file where it is used.
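
As a rough sketch of what that split might look like (not the actual v7 code): the clock indices stay in the shared header because device trees refer to them, while the count becomes private to the driver. Only the last three indices from the quoted header are repeated here; clk_names[] and clk_name() are hypothetical stand-ins for whatever table the driver sizes with the count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shared with device trees (dt-bindings header): these values are ABI. */
#define ASPEED_CLK_MAC		32
#define ASPEED_CLK_BCLK		33
#define ASPEED_CLK_MPLL		34

/* Private to the driver's C file: can change when clocks are added. */
#define ASPEED_NUM_CLKS		35

/* Catch the count falling behind the highest index at build time. */
_Static_assert(ASPEED_CLK_MPLL < ASPEED_NUM_CLKS,
	       "ASPEED_NUM_CLKS must cover every clock index");

/* Hypothetical lookup table sized by the private count. */
static const char *const clk_names[ASPEED_NUM_CLKS] = {
	[ASPEED_CLK_MAC]  = "mac",
	[ASPEED_CLK_BCLK] = "bclk",
	[ASPEED_CLK_MPLL] = "mpll",
};

/* Return the name for an index, or NULL if it is out of range or unnamed. */
static const char *clk_name(unsigned int idx)
{
	return idx < ASPEED_NUM_CLKS ? clk_names[idx] : NULL;
}
```

Keeping the count out of the header means a later clock can be appended without changing any value that an existing device tree blob depends on.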

I have a v7 ready to send out with this change. Are there any other
issues with this before I send that version out?

Cheers,

Joel

Re: [PATCH v3 00/19] ARM: dts: aspeed: updates and new machines

2017-12-19 Thread Joel Stanley

On Wed, Dec 20, 2017 at 1:53 PM, Joel Stanley  wrote:
> This series of device tree patches for the ASPEED BMC machines
> moves all systems to use the soon to be merged clk driver, and
> updates machines to use all of the drivers we have upstream.
>
>  v3: Address review from Rob and Cedric
>   - Move aspeed-gpio.h usage out into the patches where use of the GPIO
> is added
>   - Clarify that the aspeed-clock.h patch will be merged as part of
> the device tree tree. This is to ensure we don't depend on the clk
> tree for building.

Arnd, Michael, Stephen; how do we resolve this? We need the
dt-bindings header to be present for both the clk driver and the
device tree to build.

The clk driver is not (yet - soon I hope?) merged by Michael and
Stephen. I am about to commit the device tree changes that will go
through the ARM SoC tree.

Cheers,

Joel

Re: [PATCH v5 14/15] cpufreq: Add module to register cpufreq on Krait CPUs

2017-12-19 Thread Viresh Kumar

On 19-12-17, 21:24, Sricharan R wrote:
> From: Stephen Boyd 
> 
> Register a cpufreq-generic device whenever we detect that a
> "qcom,krait" compatible CPU is present in DT.
> 
> Cc: 
> [Sricharan: updated to use dev_pm_opp_set_prop_name]
> Signed-off-by: Sricharan R 
> Signed-off-by: Stephen Boyd 
> ---
>  drivers/cpufreq/Kconfig.arm  |   9 ++
>  drivers/cpufreq/Makefile |   1 +
>  drivers/cpufreq/cpufreq-dt-platdev.c |   3 +-
>  drivers/cpufreq/qcom-cpufreq.c       | 171 +++
>  4 files changed, 183 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/cpufreq/qcom-cpufreq.c
> 
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index bdce448..60f28e7 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -100,6 +100,15 @@ config ARM_OMAP2PLUS_CPUFREQ
>   depends on ARCH_OMAP2PLUS
>   default ARCH_OMAP2PLUS
>  
> +config ARM_QCOM_CPUFREQ
> + tristate "Qualcomm based"

Qualcomm based ... ? You want to add something after this ?

And why tristate ? Do you really want to build a module for this ?

> + depends on ARCH_QCOM
> + select PM_OPP
> + help
> +   This adds the CPUFreq driver for Qualcomm SoC based boards.
> +
> +   If in doubt, say N.
> +
>  config ARM_S3C_CPUFREQ
>   bool
>   help
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index 812f9e0..1496464 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -62,6 +62,7 @@ obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ)  += mediatek-cpufreq.o
>  obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)  += omap-cpufreq.o
>  obj-$(CONFIG_ARM_PXA2xx_CPUFREQ) += pxa2xx-cpufreq.o
>  obj-$(CONFIG_PXA3xx) += pxa3xx-cpufreq.o
> +obj-$(CONFIG_ARM_QCOM_CPUFREQ)   += qcom-cpufreq.o
>  obj-$(CONFIG_ARM_S3C24XX_CPUFREQ)+= s3c24xx-cpufreq.o
>  obj-$(CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS) += s3c24xx-cpufreq-debugfs.o
>  obj-$(CONFIG_ARM_S3C2410_CPUFREQ)+= s3c2410-cpufreq.o
> diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
> index ecc56e2..032ac4f 100644
> --- a/drivers/cpufreq/cpufreq-dt-platdev.c
> +++ b/drivers/cpufreq/cpufreq-dt-platdev.c
> @@ -118,7 +118,7 @@
>   { .compatible = "ti,am33xx", },
>   { .compatible = "ti,am43", },
>   { .compatible = "ti,dra7", },
> -

Keep this blank line as is..

> + { .compatible = "qcom,ipq8064", },

And add another one here.

>   { }
>  };
>  
> @@ -157,6 +157,7 @@ static int __init cpufreq_dt_platdev_init(void)
>  
>  create_pdev:
>   of_node_put(np);
> +

Remove this.

>   return PTR_ERR_OR_ZERO(platform_device_register_data(NULL, "cpufreq-dt",
>  -1, data,
>  sizeof(struct cpufreq_dt_platform_data)));
> diff --git a/drivers/cpufreq/qcom-cpufreq.c b/drivers/cpufreq/qcom-cpufreq.c
> new file mode 100644
> index 000..3e5583d
> --- /dev/null
> +++ b/drivers/cpufreq/qcom-cpufreq.c
> @@ -0,0 +1,171 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2013-2015, The Linux Foundation. All rights reserved.
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "cpufreq-dt.h"
> +
> +static void __init get_krait_bin_format_a(int *speed, int *pvs, int *pvs_ver)
> +{
> +	void __iomem *base;
> +	u32 pte_efuse;
> +
> +	*speed = *pvs = *pvs_ver = 0;
> +
> +	base = ioremap(0x007000c0, 4);
> +	if (!base) {
> +		pr_warn("Unable to read efuse data. Defaulting to 0!\n");
> +		return;
> +	}
> +
> +	pte_efuse = readl_relaxed(base);
> +	iounmap(base);
> +
> +	*speed = pte_efuse & 0xf;
> +	if (*speed == 0xf)
> +		*speed = (pte_efuse >> 4) & 0xf;
> +
> +	if (*speed == 0xf) {
> +		*speed = 0;
> +		pr_warn("Speed bin: Defaulting to %d\n", *speed);
> +	} else {
> +		pr_info("Speed bin: %d\n", *speed);
> +	}
> +
> +	*pvs = (pte_efuse >> 10) & 0x7;
> +	if (*pvs == 0x7)
> +		*pvs = (pte_efuse >> 13) & 0x7;
> +
> +	if (*pvs == 0x7) {
> +		*pvs = 0;
> +		pr_warn("PVS bin: Defaulting to %d\n", *pvs);
> +	} else {
> +		pr_info("PVS bin: %d\n", *pvs);
> +	}
> +}
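
(For anyone following along: the decode in get_krait_bin_format_a() above boils down to "read a field, fall back to its redundant copy if it reads as all-ones, and default to 0 if both are blank". A minimal user-space sketch of just that bit logic, with the ioremap/readl part left out, could look like the following. struct krait_bin, efuse_field() and decode_format_a() are names made up for this illustration and do not appear in the patch.)

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two values decoded from the efuse word. */
struct krait_bin {
	int speed;
	int pvs;
};

/*
 * Extract a field from the efuse word. An all-ones field is treated as
 * unprogrammed: try the redundant copy, and default to 0 if that is
 * all-ones as well.
 */
static int efuse_field(uint32_t efuse, unsigned int shift,
		       unsigned int redundant_shift, uint32_t mask)
{
	uint32_t val = (efuse >> shift) & mask;

	if (val == mask)
		val = (efuse >> redundant_shift) & mask;
	if (val == mask)
		val = 0;

	return (int)val;
}

/* Same shifts and masks as the quoted format A decode. */
static struct krait_bin decode_format_a(uint32_t pte_efuse)
{
	struct krait_bin bin = {
		.speed = efuse_field(pte_efuse, 0, 4, 0xf),
		.pvs   = efuse_field(pte_efuse, 10, 13, 0x7),
	};

	assert(bin.speed >= 0 && bin.pvs >= 0);
	return bin;
}
```

For example, an efuse word of 0x0000002f has a blank primary speed field, so the speed bin comes from bits 7-4 and decodes to 2, while an all-ones word decodes to speed 0 and PVS 0.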
> +
> +static void __init get_krait_bin_format_b(int *speed, int *pvs, int *pvs_ver)
> +{
> +	u32 pte_efuse, redundant_sel;
> +	void __iomem *base;
> +
> +	*speed = 0;
> +	*pvs = 0;
> +	*pvs_ver = 0;
> +
> +	base = ioremap(0xfc4b80b0, 8);
> +	if (!base) {
> +		pr_warn("Unable to read efuse data. Defaulting to 0!\n");
> +		return;
> +	}
> +
> +	pte_efuse = readl_relaxed(base);
> +	redundant_sel = (pte_efuse >> 24) & 0x7;
> +	*speed = pte_efuse & 0x7;
> +	/* 4 bits of PVS are in efuse register bits 31, 8-6. */
> +	*pvs = ((pte_efuse

Re: [v2] arm: dts: ls1021a: fix the value of TMR_FIPER1

2017-12-19 Thread Shawn Guo

On Mon, Dec 18, 2017 at 02:51:06AM +, Y.b. Lu wrote:
> Hi Shawn,
> 
> Sorry to bother you. I just couldn’t find this patch on your git tree.
> Could you help to check?

Sorry.  I forgot to push the update.  Just pushed now.

Shawn

Re: [ANNOUNCE] autofs 5.1.2 release

2017-12-19 Thread NeilBrown


Hi Ian,
 I've been looking at:

> - add configuration option to use fqdn in mounts.

(commit 9aeef772604) because using this new option causes a regression.
If you are using the "replicated server" functionality, then
  use_hostname_for_mounts = yes
completely disables it.

This is caused by:

diff --git a/modules/replicated.c b/modules/replicated.c
index 32860d5fe245..8437f5f3d5b2 100644
--- a/modules/replicated.c
+++ b/modules/replicated.c
@@ -667,6 +667,12 @@ int prune_host_list(unsigned logopt, struct host **list,
 	if (!*list)
 		return 0;
 
+	/* If we're using the host name then there's no point probing
+	 * avialability and respose time.
+	 */
+	if (defaults_use_hostname_for_mounts())
+		return 1;
+
 	/* Use closest hosts to choose NFS version */

My question is: why was this particular change made?
Why can't prune_host_list() be allowed to do its thing
when use_hostname_for_mounts is set?
I understand that it would be pointless choosing between
the different interfaces of a multi-homed host, but there is still value
in choosing between multiple distinct hosts.

What, if anything, might go wrong if I simply reverse this chunk of the
patch?

Thanks,
NeilBrown



[PATCH v3 20/20] ARM: dts: aspeed-evb: Add unit name to memory node

2017-12-19 Thread Joel Stanley

Fixes a warning when building with W=1.

All of the ASPEED device trees build without warnings now.

Signed-off-by: Joel Stanley 
---
 arch/arm/boot/dts/aspeed-ast2500-evb.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/aspeed-ast2500-evb.dts b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
index 3e6f38e5d5d0..91a36c1f029b 100644
--- a/arch/arm/boot/dts/aspeed-ast2500-evb.dts
+++ b/arch/arm/boot/dts/aspeed-ast2500-evb.dts
@@ -16,7 +16,7 @@
 		bootargs = "console=ttyS4,115200 earlyprintk";
 	};
 
-	memory {
+	memory@8000 {
 		reg = <0x8000 0x2000>;
 	};
 };
-- 
2.15.1
