rfc: Are any of the seq_pad() uses really necessary?
$ git grep -w seq_pad net
net/ipv4/fib_trie.c:seq_pad(seq, '\n');
net/ipv4/ping.c:seq_pad(seq, '\n');
net/ipv4/tcp_ipv4.c:seq_pad(seq, '\n');
net/ipv4/udp.c: seq_pad(seq, '\n');
net/phonet/socket.c:seq_pad(seq, '\n');
net/phonet/socket.c:seq_pad(seq, '\n');
net/sctp/objcnt.c: seq_pad(seq, '\n');

what these uses do is add trailing blanks to a particular
preset block width and then append a newline.

None of these trailing pad bytes seem useful to me.

Are there really tools that expect specific line widths
when reading from things like /proc//net/

For instance:

$ cat /proc//net/udp
 sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
 484: :14E9 : 07 : 00: 1110 16961 2 0
 486: :14EB : 07 : 00: 1020 2022599 2 0
 788: :A619 : 07 : 00: 10000 4390482 2 0
 3081: :8F0E : 07 : 00: 1110 16963 2 0
 3376: 357F:0035 : 07 : 00: 1020 2022601 2 0
 3391: :0044 : 07 : 00: 00 4546167 2 0

These seq_pad uses were modified by:

From 652586df95e5d76b37d07a11839126dcfede1621 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa
Date: Thu, 14 Nov 2013 14:31:57 -0800
Subject: [PATCH] seq_file: remove "%n" usage from seq_file users

All seq_printf() users are using "%n" for calculating padding size,
convert them to use seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa
Signed-off-by: Kees Cook
Cc: Joe Perches
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

If these are really necessary, then maybe the seq_pad function
could be optimized using a memset instead of
seq_printf(, "%*s", len, "");
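To make the memset idea concrete, here is a minimal userspace sketch of the setwidth/pad pair. It is not the kernel's seq_file code: struct line_buf and the line_*() helpers are hypothetical stand-ins that only model a buffer, a write offset and a pad target, so the padding step can be shown without any format-string parsing.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for the few fields the padding logic needs.
 * The struct and function names are hypothetical, not the kernel API.
 */
struct line_buf {
	char *buf;
	size_t size;       /* capacity of buf */
	size_t count;      /* bytes written so far */
	size_t pad_until;  /* offset the current field should be padded to */
};

/* Remember where the field that starts at the current offset must end. */
static void line_setwidth(struct line_buf *m, size_t width)
{
	m->pad_until = m->count + width;
}

/* Append a string, truncating silently if the buffer is full. */
static void line_puts(struct line_buf *m, const char *s)
{
	size_t len = strlen(s);

	if (len > m->size - m->count)
		len = m->size - m->count;
	memcpy(m->buf + m->count, s, len);
	m->count += len;
}

/*
 * Fill with blanks up to pad_until using one memset, then append c
 * (if non-zero). Returns 0 on success, or -1 if the buffer is too
 * small, in which case nothing is written.
 */
static int line_pad(struct line_buf *m, char c)
{
	size_t pad = m->pad_until > m->count ? m->pad_until - m->count : 0;
	size_t need = pad + (c ? 1 : 0);

	if (need > m->size - m->count)
		return -1;
	memset(m->buf + m->count, ' ', pad);
	m->count += pad;
	if (c)
		m->buf[m->count++] = c;
	return 0;
}
```

With a width of 8, writing "abc" and then calling line_pad(&m, '\n') leaves "abc", five blanks and a newline in the buffer. Whether those blanks are worth emitting at all is the question the RFC raises; the sketch only shows that the fill itself needs no printf machinery.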
Re: recommended use of request_any_context_irq()
On Thu, Sep 22, 2016 at 3:47 PM, Thomas Gleixner wrote:
> On Thu, 22 Sep 2016, Leo Li wrote:
>> On Thu, Sep 22, 2016 at 3:10 AM, Marc Zyngier wrote:
>> > There is strictly no reason to perform a blanket change of all the
>> > drivers. What would be the reason to change them other than to cater for
>> > a contrived use case that may never happen?
>>
>> Maybe we could do blanket change to drivers that meet certain
>> criteria? At least we should improve the messaging when a driver
>> cannot request interrupt due to nested threading.
>
> Nested threading is a result of requesting an any context interrupt not
> something which is there already.
>
>> Right now, it might take quite some time for a developer unfamiliar with
>> the threaded interrupt to figure out the problem.
>
> Did you have issues with a driver which was not able to request an
> interrupt? If yes, please explain in detail what the failure was and why
> you think that this should be changed. If not, please explain which problem
> you are trying to solve.

My problem was with sc16is7xx, an SPI-to-UART device
(drivers/tty/serial/sc16is7xx.c). Its interrupt is connected, together
with interrupts from other on-board devices, to a GPIO expander
(drivers/gpio/gpio-pca953x.c). The interrupt handler of the pca953x
interrupt controller is threaded, but the sc16is7xx driver currently
requests a plain interrupt, so the sc16is7xx just fails when requesting
its irq.

The problem can be fixed by changing the sc16is7xx driver to use
request_any_context_irq(). But without some debugging effort in the
core code, it is not easy to see that the failure comes from requesting
a plain interrupt on a threaded interrupt controller. My concern is
that other drivers can hit the same problem if they are connected to a
threaded interrupt controller. What can we do to prevent similar
problems in the future?

Regards,
Leo
Re: [PATCH RT 05/10] net: add back the missing serialization in ip_send_unicast_reply()
And again. I need to fix quilt mail to handle this.

-- Steve


On Thu, 22 Sep 2016 17:57:52 -0400
Steven Rostedt wrote:

> 4.1.33-rt38-rc1 stable review patch.
> If anyone has any objections, please let me know.
>
> --
>
> From: Sebastian Andrzej Siewior
>
> Some time ago Sami Pietikäinen reported a crash on -RT in
> ip_send_unicast_reply() which was later fixed by Nicholas Mc Guire
> (v3.12.8-rt11). Later (v3.18.8) the code was reworked and I dropped the
> patch. As it turns out it was mistake.
> I have reports that the same crash is possible with a similar backtrace.
> It seems that vanilla protects access to this_cpu_ptr() via
> local_bh_disable(). This does not work the on -RT since we can have
> NET_RX and NET_TX running in parallel on the same CPU.
> This is brings back the old locks.
>
> |Unable to handle kernel NULL pointer dereference at virtual address 0010
> |PC is at __ip_make_skb+0x198/0x3e8
> |[] (__ip_make_skb) from [] (ip_push_pending_frames+0x20/0x40)
> |[] (ip_push_pending_frames) from [] (ip_send_unicast_reply+0x210/0x22c)
> |[] (ip_send_unicast_reply) from [] (tcp_v4_send_reset+0x190/0x1c0)
> |[] (tcp_v4_send_reset) from [] (tcp_v4_do_rcv+0x22c/0x288)
> |[] (tcp_v4_do_rcv) from [] (release_sock+0xb4/0x150)
> |[] (release_sock) from [] (tcp_close+0x240/0x454)
> |[] (tcp_close) from [] (inet_release+0x74/0x7c)
> |[] (inet_release) from [] (sock_release+0x30/0xb0)
> |[] (sock_release) from [] (sock_close+0x1c/0x24)
> |[] (sock_close) from [] (__fput+0xe8/0x20c)
> |[] (__fput) from [] (fput+0x18/0x1c)
> |[] (fput) from [] (task_work_run+0xa4/0xb8)
> |[] (task_work_run) from [] (do_work_pending+0xd0/0xe4)
> |[] (do_work_pending) from [] (work_pending+0xc/0x20)
> |Code: e3530001 8a01 e3a00040 ea11 (e5973010)
>
> Cc: stable...@vger.kernel.org
> Signed-off-by: Sebastian Andrzej Siewior
> Signed-off-by: Steven Rostedt
> ---
>  net/ipv4/tcp_ipv4.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 13b92d595138..6bfa68fb5f21 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -62,6 +62,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -563,6 +564,7 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_v4_send_check);
>
> +static DEFINE_LOCAL_IRQ_LOCK(tcp_sk_lock);
>  /*
>   * This routine will send an RST to the other tcp.
>   *
> @@ -684,10 +686,13 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
>  	arg.bound_dev_if = sk->sk_bound_dev_if;
>
>  	arg.tos = ip_hdr(skb)->tos;
> +
> +	local_lock(tcp_sk_lock);
>  	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
>  			      skb, _SKB_CB(skb)->header.h4.opt,
>  			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
>  			      , arg.iov[0].iov_len);
> +	local_unlock(tcp_sk_lock);
>
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS);
> @@ -769,10 +774,12 @@ static void tcp_v4_send_ack(struct net *net,
>  	if (oif)
>  		arg.bound_dev_if = oif;
>  	arg.tos = tos;
> +	local_lock(tcp_sk_lock);
>  	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
>  			      skb, _SKB_CB(skb)->header.h4.opt,
>  			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
>  			      , arg.iov[0].iov_len);
> +	local_unlock(tcp_sk_lock);
>
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  }
Re: [PATCH v4 0/3] nvme power saving
On Thu, Sep 22, 2016 at 02:33:36PM -0700, J Freyensee wrote:
> ...and some SSDs don't even support this feature yet, so the number of
> different NVMe devices available to test initially will most likely be
> small (like the Fultondales I have, all I could check is to see if the
> code broke anything if the device did not have this power-save
> feature).
>
> I agree with Jens, makes a lot of sense to start with this feature
> 'off'.
>
> To 'advertise' the feature, maybe make the feature a new selection in
> Kconfig? Example, initially make it "EXPERIMENTAL", and later when
> more devices implement this feature it can be integrated more tightly
> into the NVMe solution and default to on.

Should we just leave the kernel out of this then? I bet we could script
this feature in user space.
Re: [PATCH] drivers: wlan-ng: fixed a coding style issue
On Thu, 2016-09-22 at 23:56 +0200, Jannik Becher wrote:
> removed a space after a cast to obtain the coding style.

Better would be to change the subject to something like:

[PATCH] staging: wlan-ng: Remove unnecessary spaces before casts
Re: [PATCH v4 0/3] nvme power saving
On Thu, Sep 22, 2016 at 2:33 PM, J Freyenseewrote: > On Thu, 2016-09-22 at 14:43 -0600, Jens Axboe wrote: >> On 09/22/2016 02:11 PM, Andy Lutomirski wrote: >> > >> > On Thu, Sep 22, 2016 at 7:23 AM, Jens Axboe wrote: >> > > >> > > >> > > On 09/16/2016 12:16 PM, Andy Lutomirski wrote: >> > > > >> > > > >> > > > Hi all- >> > > > >> > > > Here's v4 of the APST patch set. The biggest bikesheddable >> > > > thing (I >> > > > think) is the scaling factor. I currently have it hardcoded so >> > > > that >> > > > we wait 50x the total latency before entering a power saving >> > > > state. >> > > > On my Samsung 950, this means we enter state 3 (70mW, 0.5ms >> > > > entry >> > > > latency, 5ms exit latency) after 275ms and state 4 (5mW, 2ms >> > > > entry >> > > > latency, 22ms exit latency) after 1200ms. I have the default >> > > > max >> > > > latency set to 25ms. >> > > > >> > > > FWIW, in practice, the latency this introduces seems to be well >> > > > under 22ms, but my benchmark is a bit silly and I might have >> > > > measured it wrong. I certainly haven't observed a slowdown >> > > > just >> > > > using my laptop. >> > > > >> > > > This time around, I changed the names of parameters after Jay >> > > > Frayensee got confused by the first try. Now they are: >> > > > >> > > > - ps_max_latency_us in sysfs: actually controls it. >> > > > - nvme_core.default_ps_max_latency_us: sets the default. >> > > > >> > > > Yeah, they're mouthfuls, but they should be clearer now. >> > > >> > > >> > > The only thing I don't like about this is the fact that's it's a >> > > driver private thing. Similar to ALPM on SATA, it's yet another >> > > knob that needs to be set. It we put it somewhere generic, then >> > > at least we could potentially use it in a generic fashion. >> > >> > Agreed. I'm hoping to hear back from Rafael soon about the >> > dev_pm_qos >> > thing. >> > >> > > >> > > >> > > Additionally, it should not be on by default. >> > >> > I think I disagree with this. 
Since we don't have anything like >> > laptop-mode AFAIK, I think we do want it on by default. For the >> > server workloads that want to consume more idle power for faster >> > response when idle, I think the servers should be willing to make >> > this >> > change, just like they need to disable overly deep C states, etc. >> > (Admittedly, unifying the configuration would be nice.) >> >> I can see two reasons why we don't want it the default: >> >> 1) Changes like this has a tendency to cause issues on various types >> of >> hardware. How many NVMe devices have you tested this on? ALPM on SATA >> had a lot of initial problems, where slowed down some SSDs unberably. I'm reasonably optimistic that the NVMe situation will be a lot better for a couple of reasons: 1. There's only one player involved. With ALPM, the controller and the drive need to cooperate on entering and leaving various idle states. With NVMe, the controller *is* the drive, so there's no issue where a drive manufacturer might not have tested with the relevant controller or vice versa. 2. Windows appears to use it. I haven't tested directly, but the Internet seems to think that Windows uses APST and maybe even manual state transitions, and that NVMe power states are even mandatory for Connected Standby logo compliance. 3. The feature is new. NVMe 1.0 didn't support APST at all, so the driver is unlikely to cause problems with older drivers. > > ...and some SSDs don't even support this feature yet, so the number of > different NVMe devices available to test initially will most likely be > small (like the Fultondales I have, all I could check is to see if the > code broke anything if the device did not have this power-save > feature). > > I agree with Jens, makes a lot of sense to start with this feature > 'off'. > > To 'advertise' the feature, maybe make the feature a new selection in > Kconfig? 
Example, initially make it "EXPERIMENTAL", and later when > more devices implement this feature it can be integrated more tightly > into the NVMe solution and default to on. > How about having a config option that's "default n" that changes the default? I could also add a log message when APST is first enabled on a device to make it easier to notice a change. --Andy
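For readers following the numbers in the quoted v4 cover letter, the 50x scaling can be checked with a few lines of C. This is only arithmetic on the figures given in this thread; apst_idle_timeout_us() and APST_SCALE are hypothetical names for illustration, not identifiers taken from the patch set.

```c
#include <stdint.h>

/* Scaling factor from the cover letter ("wait 50x the total latency"). */
#define APST_SCALE 50

/*
 * Hypothetical helper: idle time to wait before entering a power state,
 * taken as APST_SCALE times the state's total (entry + exit) latency.
 * All values are in microseconds.
 */
static uint64_t apst_idle_timeout_us(uint64_t entry_lat_us,
				     uint64_t exit_lat_us)
{
	return (entry_lat_us + exit_lat_us) * APST_SCALE;
}
```

For the Samsung 950 figures quoted above, state 3 (0.5ms entry, 5ms exit) gives apst_idle_timeout_us(500, 5000) = 275000us, and state 4 (2ms entry, 22ms exit) gives apst_idle_timeout_us(2000, 22000) = 1200000us, matching the 275ms and 1200ms in the cover letter.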
[PATCH 1/2] power/reset: at91-reset: add samx7 support
From: Szemző András

Add samx7 support. It is lacking a few bits and needs a new reset function.

Signed-off-by: Szemző András
Signed-off-by: Alexandre Belloni
---
 drivers/power/reset/at91-reset.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/power/reset/at91-reset.c b/drivers/power/reset/at91-reset.c
index 1b5d450586d1..563722e64d7b 100644
--- a/drivers/power/reset/at91-reset.c
+++ b/drivers/power/reset/at91-reset.c
@@ -134,6 +134,15 @@ static int sama5d3_restart(struct notifier_block *this, unsigned long mode,
 	return NOTIFY_DONE;
 }
 
+static int samx7_restart(struct notifier_block *this, unsigned long mode,
+			 void *cmd)
+{
+	writel(cpu_to_le32(AT91_RSTC_KEY | AT91_RSTC_PROCRST),
+	       at91_rstc_base);
+
+	return NOTIFY_DONE;
+}
+
 static void __init at91_reset_status(struct platform_device *pdev)
 {
 	u32 reg = readl(at91_rstc_base + AT91_RSTC_SR);
@@ -173,6 +182,7 @@ static const struct of_device_id at91_reset_of_match[] = {
 	{ .compatible = "atmel,at91sam9260-rstc", .data = at91sam9260_restart },
 	{ .compatible = "atmel,at91sam9g45-rstc", .data = at91sam9g45_restart },
 	{ .compatible = "atmel,sama5d3-rstc", .data = sama5d3_restart },
+	{ .compatible = "atmel,samx7-rstc", .data = samx7_restart },
 	{ /* sentinel */ }
 };
 
-- 
2.9.3
[PATCH 2/2] power/reset: at91-reset: remove leftover platform_device_id
commit eacd8d09db7f ("power/reset: at91-reset: remove useless
at91_reset_platform_probe()") removed non DT probe support but forgot to
remove the now useless id_table. Do that now.

Signed-off-by: Alexandre Belloni
---
 drivers/power/reset/at91-reset.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/power/reset/at91-reset.c b/drivers/power/reset/at91-reset.c
index 563722e64d7b..b8005cc18a54 100644
--- a/drivers/power/reset/at91-reset.c
+++ b/drivers/power/reset/at91-reset.c
@@ -247,19 +247,12 @@ static int __exit at91_reset_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static const struct platform_device_id at91_reset_plat_match[] = {
-	{ "at91-sam9260-reset", (unsigned long)at91sam9260_restart },
-	{ "at91-sam9g45-reset", (unsigned long)at91sam9g45_restart },
-	{ /* sentinel */ }
-};
-
 static struct platform_driver at91_reset_driver = {
 	.remove = __exit_p(at91_reset_remove),
 	.driver = {
 		.name = "at91-reset",
 		.of_match_table = at91_reset_of_match,
 	},
-	.id_table = at91_reset_plat_match,
 };
 
 module_platform_driver_probe(at91_reset_driver, at91_reset_probe);
-- 
2.9.3
Re: [PATCH] KVM: nVMX: Fix reload apic access page warning
2016-09-06 17:20+0800, Wanpeng Li: > From: Wanpeng Li> WARNING: CPU: 1 PID: 4230 at kernel/sched/core.c:7564 __might_sleep+0x7e/0x80 > do not call blocking ops when !TASK_RUNNING; state=1 set at > [] prepare_to_swait+0x39/0xa0 > CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47 > Call Trace: > dump_stack+0x99/0xd0 > __warn+0xd1/0xf0 > warn_slowpath_fmt+0x4f/0x60 > ? prepare_to_swait+0x39/0xa0 > ? prepare_to_swait+0x39/0xa0 > __might_sleep+0x7e/0x80 > __gfn_to_pfn_memslot+0x156/0x480 [kvm] > gfn_to_pfn+0x2a/0x30 [kvm] > gfn_to_page+0xe/0x20 [kvm] > kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm] > nested_vmx_vmexit+0x765/0xca0 [kvm_intel] > ? _raw_spin_unlock_irqrestore+0x36/0x80 > vmx_check_nested_events+0x49/0x1f0 [kvm_intel] > kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm] > kvm_vcpu_check_block+0x12/0x60 [kvm] > kvm_vcpu_block+0x94/0x4c0 [kvm] > kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm] > ? kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm] > kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm] > > === > [ INFO: suspicious RCU usage. ] > 4.8.0-rc5+ #47 Not tainted > --- > ./include/linux/kvm_host.h:535 suspicious rcu_dereference_check() usage! > > other info that might help us debug this: > > > rcu_scheduler_active = 1, debug_locks = 0 > 1 lock held by qemu-system-x86/4230: > #0: (>mutex){+.+.+.}, at: [] vcpu_load+0x1c/0x60 > [kvm] > > stack backtrace: > CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47 > Call Trace: > dump_stack+0x99/0xd0 > lockdep_rcu_suspicious+0xe7/0x120 > gfn_to_memslot+0x12a/0x140 [kvm] > gfn_to_pfn+0x12/0x30 [kvm] > gfn_to_page+0xe/0x20 [kvm] > kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm] > nested_vmx_vmexit+0x765/0xca0 [kvm_intel] > ? _raw_spin_unlock_irqrestore+0x36/0x80 > vmx_check_nested_events+0x49/0x1f0 [kvm_intel] > kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm] > kvm_vcpu_check_block+0x12/0x60 [kvm] > kvm_vcpu_block+0x94/0x4c0 [kvm] > kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm] > ? 
kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm] > kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm] > ? __fget+0xfd/0x210 > ? __lock_is_held+0x54/0x70 > do_vfs_ioctl+0x96/0x6a0 > ? __fget+0x11c/0x210 > ? __fget+0x5/0x210 > SyS_ioctl+0x79/0x90 > do_syscall_64+0x81/0x220 > entry_SYSCALL64_slow_path+0x25/0x25 > > These can be triggered by running kvm-unit-test: ./x86-run x86/vmx.flat > > The nested preemption timer is based on hrtimer which is started on L2 > entry, stopped on L2 exit and evaluated via the new check_nested_events > hook. The current logic adds vCPU to a simple waitqueue (TASK_INTERRUPTIBLE) > if need to yield pCPU and w/o holding srcu read lock when accesses memslots, > both can be in nested preemption timer evaluation path which results in > the warning above. > > This patch fix it by leveraging request bit to async reload APIC access > page before vmentry in order to avoid to reload directly during the nested > preemption timer evaluation, it is safe since the vmcs01 is loaded and > current is nested vmexit. > > Cc: Paolo Bonzini > Cc: Radim Krčmář > Cc: Yunhong Jiang > Signed-off-by: Wanpeng Li > --- Applied to kvm/queue, thanks.
Re: [PATCH v18 6/6] ARM: socfpga: fpga bridge driver support
On Tue, 9 Aug 2016, Paul Gortmaker wrote: > [Re: [PATCH v18 6/6] ARM: socfpga: fpga bridge driver support] On 08/08/2016 > (Mon 13:44) Moritz Fischer wrote: > > > Hi Alan, > > > > On Mon, Aug 8, 2016 at 12:18 PM, atullwrote: > > > > >> Please don't use module.h in drivers controlled by a bool > > >> Kconfig setting. > > >> > > >> THanks, > > >> Paul. > > >> > > > > > > Thanks for the feedback. Can you provide an example of what you > > > would consider to be proper usage in the kernel? > > > > > > I think Paul is suggesting to use > > > > static int __init alt_fpga_bridge_init(void) > > { > > platform_driver_register(_fpga_bridge_driver); > > } > > > > device_initcall(alt_fpga_bridge_init); > > > > or better: > > > > builtin_platform_driver(_fpga_bridge_driver); > > > > Like for example in: drivers/cpuidle/cpuidle-mvebu-v7.c > > Yes, pretty much that -- if you have a bool Kconfig, you should be using > builtin registration functions, and have no need for module.h or > anything MODULE_ or any module_init/module_exit calls. > > An empty file containing nothing but #include will > cause cpp to emit about 750k of goop, so we really should only be using > it for drivers that are genuinely modular; i.e. tristate Kconfig. > > Thanks, > Paul. > -- > > > > > Cheers, > > > > Moritz > Thanks for the feedback and explanations! I've retested my stuff with it all built as modules (mgr, bridged, and fpga-region) and it all works that way as well as built in. So I'll fix up the Kconfig as tristates for everybody. Also I'll add some dependencies as FPGA_REGION should be dependent on FPGA_BRIDGE. Alan
Re: [PATCH 4.4 000/118] 4.4.22-stable review
On Thu, Sep 22, 2016 at 07:28:20PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.22 release.
> There are 118 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat Sep 24 17:29:17 UTC 2016.
> Anything received after that time might be too late.
>

Build results:
	total: 149 pass: 136 fail: 13
Failed builds:
	avr32:defconfig
	avr32:allnoconfig
	avr32:merisc_defconfig
	avr32:atngw100mkii_evklcd101_defconfig
	powerpc:defconfig
	powerpc:allmodconfig
	powerpc:allnoconfig
	powerpc:ppc6xx_defconfig
	powerpc:mpc83xx_defconfig
	powerpc:tqm8xx_defconfig
	powerpc:85xx/sbc8548_defconfig
	powerpc:83xx/mpc834x_mds_defconfig
	powerpc:86xx/sbc8641d_defconfig

Qemu test results:
	total: 101 pass: 90 fail: 11
Failed tests:
	openrisc:or1ksim_defconfig
	powerpc:mac99:nosmp:ppc_book3s_defconfig
	powerpc:g3beige:nosmp:ppc_book3s_defconfig
	powerpc:mac99:smp:ppc_book3s_defconfig
	powerpc:virtex-ml507:44x/virtex5_defconfig
	powerpc:mpc8548cds:85xx/mpc85xx_cds_defconfig
	powerpc:mpc8548cds:smpdev:85xx/mpc85xx_cds_defconfig
	powerpc:bamboo:44x/bamboo_defconfig
	powerpc:mac99:ppc64_book3s_defconfig:nosmp
	powerpc:mac99:ppc64_book3s_defconfig:smp4
	powerpc:pseries:pseries_defconfig

Build errors:

avr32:

arch/avr32/kernel/built-in.o: In function `arch_ptrace':
(.text+0x810): undefined reference to `___copy_from_user'
arch/avr32/kernel/built-in.o:(___ksymtab+___copy_from_user+0x0): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `devm_request_resource':
(.text+0x52c8): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `proc_do_large_bitmap':
(.text+0x588c): undefined reference to `___copy_from_user'
kernel/built-in.o: In function `proc_dostring':
(.text+0x5b20): undefined reference to `___copy_from_user'
kernel/built-in.o:sysctl.c:(.text+0x6088): more undefined references to `___copy_from_user' follow

Hmm .. I've seen that before. Looks like a missing commit from upstream.
I'll check later tonight.

---
powerpc:

drivers/misc/cxl/vphb.c:263:9: error: 'pcibios_free_controller_deferred' undeclared

arch/powerpc/include/asm/uaccess.h: In function 'copy_from_user':
arch/powerpc/include/asm/uaccess.h:328:1: error: wrong type argument to unary plus
+ memset(to, 0, n);

I'll have to look into those.

---
runtime:

qemu openrisc crashes with a NULL pointer dereference. There is no backtrace;
I'll have to bisect.

qemu ppc all fail to build with "drivers/misc/cxl/vphb.c:263:9: error:
'pcibios_free_controller_deferred' undeclared".

Details are available at http://kerneltests.org/builders.

Guenter
Re: [PATCH net-next 0/4] net: dsa: add port fast ageing
On Thu, Sep 22, 2016 at 04:49:20PM -0400, Vivien Didelot wrote:
> Today the DSA drivers are in charge of flushing the MAC addresses
> associated to a port when its STP state changes from Learning or
> Forwarding, to Disabled or Blocking or Listening.
>
> This makes the drivers more complex and hides this generic switch logic.
>
> This patchset introduces a new optional port_fast_age operation to
> dsa_switch_ops, to move this logic to the DSA layer and keep drivers
> simple. b53 and mv88e6xxx are updated accordingly.

Reviewed-by: Andrew Lunn

    Andrew
Re: [PATCH v2] bpf: Set register type according to is_valid_access()
On Thu, Sep 22, 2016 at 09:56:47PM +0200, Mickaël Salaün wrote:
> This fix a pointer leak when an unprivileged eBPF program read a pointer
> value from the context. Even if is_valid_access() returns a pointer
> type, the eBPF verifier replace it with UNKNOWN_VALUE. The register
> value containing an address is then allowed to leak. Moreover, this
> prevented unprivileged eBPF programs to use functions with (legitimate)
> pointer arguments.
>
> This bug is not an issue for now because the only unprivileged eBPF
> program allowed is of type BPF_PROG_TYPE_SOCKET_FILTER and all the types
> from its context are UNKNOWN_VALUE. However, this fix is important for
> future unprivileged eBPF program types which could use pointers in their
> context.
>
> Signed-off-by: Mickaël Salaün
> Fixes: 969bf05eb3ce ("bpf: direct packet access")

Please drop 'fixes' tag and rewrite commit log.
It's not a fix.
Right now only two reg types can be seen: PTR_TO_PACKET and PTR_TO_PACKET_END.
Both are only in clsact and xdp programs which are root only.
So nothing is leaking at present.
Best case this patch is a pre-patch for some future work.
[PATCH] raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
Specify the aligned attribute for the char recovi[PAGE_SIZE] and
char recovj[PAGE_SIZE] arrays, so that all malloc memory is page
boundary aligned.

Without these alignment attributes, the test causes a segfault in
userspace when NDISKS is changed from 16 to 4.

Cc: H. Peter Anvin
Cc: Yu-cheng Yu
Signed-off-by: Gayatri Kammela
Reviewed-by: H. Peter Anvin
---
 lib/raid6/test/test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/raid6/test/test.c b/lib/raid6/test/test.c
index 3bebbabdb510..32a00f11ac50 100644
--- a/lib/raid6/test/test.c
+++ b/lib/raid6/test/test.c
@@ -21,12 +21,13 @@
 
 #define NDISKS 16 /* Including P and Q */
 
-const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256)));
+const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
 struct raid6_calls raid6_call;
 
 char *dataptrs[NDISKS];
 char data[NDISKS][PAGE_SIZE];
-char recovi[PAGE_SIZE], recovj[PAGE_SIZE];
+char recovi[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
+char recovj[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
 
 static void makedata(int start, int stop)
 {
-- 
2.7.4
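As a small illustration of what the attribute in this patch guarantees, the sketch below declares two page-sized arrays with the GCC/Clang aligned attribute and checks their addresses. DEMO_PAGE_SIZE (assumed to be 4096 here), the buffer names and is_page_aligned() are hypothetical; the sketch does not reproduce the raid6 test or the segfault described above, it only shows the effect of the attribute on static arrays.

```c
#include <stdint.h>

/* Assumed page size for the demonstration; the real test uses PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096

/* Static arrays whose start addresses are forced onto a page boundary. */
char demo_recovi[DEMO_PAGE_SIZE] __attribute__((aligned(DEMO_PAGE_SIZE)));
char demo_recovj[DEMO_PAGE_SIZE] __attribute__((aligned(DEMO_PAGE_SIZE)));

/* Return non-zero if p sits exactly on a DEMO_PAGE_SIZE boundary. */
static int is_page_aligned(const void *p)
{
	return ((uintptr_t)p & (DEMO_PAGE_SIZE - 1)) == 0;
}
```

With the attribute in place, is_page_aligned(demo_recovi) and is_page_aligned(demo_recovj) both hold regardless of what the linker places before them. Without it, a plain char array only has byte alignment, so its start address depends on the neighbouring objects.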
RE: [PATCH v2] raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
Hi all, Sorry for the noise! I didn't mean to send the version2. -Original Message- From: Kammela, Gayatri Sent: Thursday, September 22, 2016 5:08 PM To: linux-r...@vger.kernel.org; linux-kernel@vger.kernel.org Cc: s...@kernel.org; Anvin, H Peter; Shankar, Ravi V ; Yu, Fenghua ; Kammela, Gayatri ; H . Peter Anvin ; Yu, Yu-cheng Subject: [PATCH v2] raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays Specifying the aligned attributes to the char recovi[PAGE_SIZE] and char recovi[PAGE_SIZE] arrays, so that all malloc memory is page boundary aligned. Without these alignment attributes, the test causes a segfault in userspace when the NDISKS are changed to 4 from 16. Cc: H. Peter Anvin Cc: Yu-cheng Yu Signed-off-by: Gayatri Kammela --- lib/raid6/test/test.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/lib/raid6/test/test.c b/lib/raid6/test/test.c index 3bebbabdb510..32a00f11ac50 100644 --- a/lib/raid6/test/test.c +++ b/lib/raid6/test/test.c @@ -21,12 +21,13 @@ #define NDISKS 16 /* Including P and Q */ -const char raid6_empty_zero_page[PAGE_SIZE] __attribute__((aligned(256))); +const char raid6_empty_zero_page[PAGE_SIZE] +__attribute__((aligned(PAGE_SIZE))); struct raid6_calls raid6_call; char *dataptrs[NDISKS]; char data[NDISKS][PAGE_SIZE]; -char recovi[PAGE_SIZE], recovj[PAGE_SIZE]; +char recovi[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); +char recovj[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); static void makedata(int start, int stop) { -- 2.7.4
Re: [PATCH 3/4] autofs - make mountpoint checks namespace aware
On Thu, 2016-09-22 at 10:43 -0500, Eric W. Biederman wrote:
> Ian Kent writes:
>
> > Eric, Mateusz, I appreciate your spending time on this and particularly
> > pointing out my embarrassingly stupid is_local_mountpoint() usage mistake.
> >
> > Please accept my apology for the inconvenience.
> >
> > If all goes well (in testing) I'll have follow up patches to correct this
> > fairly soon.
>
> Related question. Do you happen to know how many mounts per mount
> namespace tend to be used? It looks like it is going to be wise to put
> a configurable limit on that number. And I would like the default to be
> something high enough most people don't care. I believe autofs is
> likely where people tend to use the most mounts.

That's a good question.

I've been thinking that maybe I should have used a lookup_mnt() type check as I
originally started out to, for this reason, as the mnt_namespace list looks to
be a linear list.

But there can be a lot of mounts, and not only due to autofs, so maybe that
should be considered anyway.

The number of mounts for direct mount maps is usually not very large because of
the way they are implemented, large direct mount maps can have performance
problems. There can be anywhere from a few (likely case a few hundred) to less
than 1, plus mounts that have been triggered and not yet expired.

Indirect mounts have one autofs mount at the root plus the number of mounts that
have been triggered and not yet expired.

The number of autofs indirect map entries can range from a few to the common
case of several thousand and in rare cases up to between 3 and 5. I've
not heard of people with maps larger than 5 entries.

The larger the number of map entries the greater the possibility for a large
number of active mounts so it's not hard to expect cases of a 1000 or somewhat
more active mounts.

Ian
Re: Should drivers like nvme let userspace control their latency via dev_pm_qos?
On 9/16/2016 5:26 PM, Andy Lutomirski wrote:
> I'm adding power management to the nvme driver, and I'm exposing
> exactly one knob via sysfs: the maximum permissible latency. This
> isn't a power domain issue, and it has no dependencies -- it's
> literally just the maximum latency that the driver may impose on I/O
> for power saving purposes.
>
> ISTM userspace should be able to specify its own latency tolerance in
> a uniform way, and dev_pm_qos seems like the natural interface for
> this, except that I cannot find a single instance in the tree of *any*
> driver using it via the notifier mechanism.

That's because the notifier mechanism is only used for the "resume latency"
type of constraints.

> I can find two drivers that do it using
> dev_pm_qos_expose_latency_tolerance(), and both are LPSS drivers?

That's correct. Nobody else has used it so far. :-)

> So: should I be exposing .set_latency_tolerance() or should I just use
> a custom sysfs attribute? Or both?

dev_pm_qos_expose_latency_tolerance() adds a single latency tolerance
request object to the device and exposes a knob in user space by which
that request object can be controlled. There may be more latency tolerance
request objects for the same device if kernel code adds them. The
effective latency tolerance is the minimum of all those requests and the
callback is invoked every time that effective value changes.

This also is described in the last section of
Documentation/power/pm_qos_interface.txt (note that if the
.set_latency_tolerance callback is present at the device registration time
already, the latency tolerance sysfs attribute will be exposed
automatically by the driver core).

If that mechanism is suitable for the use case in question, I'd just use it.

Thanks,
Rafael
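The aggregation rule described above (the effective tolerance is the minimum of all requests, and the callback runs only when that minimum changes) can be modelled in a few lines of userspace C. This is a sketch of the rule as stated in this mail, not the kernel's dev_pm_qos implementation: struct tolerance_set, the tolerance_*() helpers and record_change() are all hypothetical names, and INT32_MAX is used here as an assumed "no constraint" value.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REQS 8

/* Called whenever the effective (minimum) tolerance changes. */
typedef void (*tolerance_cb)(int32_t effective_us, void *ctx);

struct tolerance_set {
	int32_t reqs[MAX_REQS];  /* individual requests, in microseconds */
	size_t nreqs;
	int32_t effective_us;    /* INT32_MAX means "no constraint" here */
	tolerance_cb cb;
	void *ctx;
};

static void tolerance_init(struct tolerance_set *s, tolerance_cb cb, void *ctx)
{
	s->nreqs = 0;
	s->effective_us = INT32_MAX;
	s->cb = cb;
	s->ctx = ctx;
}

/* Recompute the minimum and notify only if it differs from before. */
static void tolerance_recompute(struct tolerance_set *s)
{
	int32_t min = INT32_MAX;
	size_t i;

	for (i = 0; i < s->nreqs; i++)
		if (s->reqs[i] < min)
			min = s->reqs[i];
	if (min != s->effective_us) {
		s->effective_us = min;
		if (s->cb)
			s->cb(min, s->ctx);
	}
}

/* Add a request; returns its index, or -1 if the set is full. */
static int tolerance_add(struct tolerance_set *s, int32_t us)
{
	int idx;

	if (s->nreqs == MAX_REQS)
		return -1;
	idx = (int)s->nreqs++;
	s->reqs[idx] = us;
	tolerance_recompute(s);
	return idx;
}

/* Change an existing request; returns 0, or -1 for a bad index. */
static int tolerance_update(struct tolerance_set *s, int idx, int32_t us)
{
	if (idx < 0 || (size_t)idx >= s->nreqs)
		return -1;
	s->reqs[idx] = us;
	tolerance_recompute(s);
	return 0;
}

/* Example callback that records how often and to what the value changed. */
struct change_log {
	int calls;
	int32_t last_us;
};

static void record_change(int32_t effective_us, void *ctx)
{
	struct change_log *seen = ctx;

	seen->calls++;
	seen->last_us = effective_us;
}
```

In this model, adding a 25000us request and then a 100000us request triggers the callback once (the minimum stays at 25000us after the second add); lowering the second request to 5000us triggers it again with the new effective value. A driver's .set_latency_tolerance() would sit in the position of record_change().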
arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'
Hi Guenter,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   b1f2beb87bb034bb209773807994279f90cace78
commit: 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e MIPS: VDSO: Fix build error with binutils 2.24 and earlier
date:   9 months ago
config: mips-jmr3927_defconfig (attached as .config)
compiler: mips-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e
        # save the attached .config to linux build tree
        make.cross ARCH=mips

All errors (new ones prefixed by >>):

>> arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'
    /*

vim +1 arch/mips/vdso/gettimeofday.c

a7f4df4e Alex Smith 2015-10-21 @1 /*
a7f4df4e Alex Smith 2015-10-21  2  * Copyright (C) 2015 Imagination Technologies
a7f4df4e Alex Smith 2015-10-21  3  * Author: Alex Smith
a7f4df4e Alex Smith 2015-10-21  4  *
a7f4df4e Alex Smith 2015-10-21  5  * This program is free software; you can redistribute it and/or modify it
a7f4df4e Alex Smith 2015-10-21  6  * under the terms of the GNU General Public License as published by the
a7f4df4e Alex Smith 2015-10-21  7  * Free Software Foundation; either version 2 of the License, or (at your
a7f4df4e Alex Smith 2015-10-21  8  * option) any later version.
a7f4df4e Alex Smith 2015-10-21  9  */

:: The code at line 1 was first introduced by commit
:: a7f4df4e21dd8a8dab96e88acd2c9c5017b83fc6 MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()

:: TO: Alex Smith
:: CC: Ralf Baechle

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [RFC PATCH v2 3/5] futex: Throughput-optimized (TO) futexes
On Thu, 22 Sep 2016, Davidlohr Bueso wrote:
> On Thu, 22 Sep 2016, Waiman Long wrote:
>
> > BTW, my initial attempt for the new futex was to use the same workflow as
> > the PI futexes, but use mutex which has optimistic spinning instead of
> > rt_mutex.
>
> Btw, Thomas, do you still have any interest pursuing this for rtmutexes from
> -rt into mainline? If so I can resend the patches from a while ago.

Certainly yes. My faint memory tells me that there was some potential issue due to boosting the owner only if it gets scheduled out, but I might be wrong.

Thanks,

	tglx
[PATCH] arch/arm: enable task isolation functionality
This patch is a port of the task isolation functionality to the arm 32-bit architecture. The task isolation needs an additional thread flag that requires to change the entry assembly code to accept a bitfield larger than one byte. The constants _TIF_SYSCALL_WORK and _TIF_WORK_MASK are now defined in the literal pool. The rest of the patch is straightforward and reflects what is done on other architectures. Signed-off-by: Francis Giraldeau--- arch/arm/Kconfig | 1 + arch/arm/include/asm/thread_info.h | 8 ++-- arch/arm/kernel/entry-common.S | 15 ++- arch/arm/kernel/ptrace.c | 10 ++ arch/arm/kernel/signal.c | 12 +++- arch/arm/kernel/smp.c | 4 arch/arm/mm/fault.c| 9 - tools/testing/selftests/task_isolation/isolation.c | 14 ++ 8 files changed, 60 insertions(+), 13 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 018ee76..0b147e4 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -40,6 +40,7 @@ config ARM select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU select HAVE_ARCH_MMAP_RND_BITS if MMU select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT) + select HAVE_ARCH_TASK_ISOLATION select HAVE_ARCH_TRACEHOOK select HAVE_ARM_SMCCC if CPU_V7 select HAVE_CBPF_JIT diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h index 776757d..c83ce56 100644 --- a/arch/arm/include/asm/thread_info.h +++ b/arch/arm/include/asm/thread_info.h @@ -145,6 +145,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *, #define TIF_SECCOMP7 /* seccomp syscall filtering active */ #define TIF_NOHZ 12 /* in adaptive nohz mode */ +#define TIF_TASK_ISOLATION 13 /* task isolation active */ #define TIF_USING_IWMMXT 17 #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ #define TIF_RESTORE_SIGMASK20 @@ -158,16 +159,19 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *, #define _TIF_SYSCALL_TRACEPOINT(1 << TIF_SYSCALL_TRACEPOINT) #define _TIF_SECCOMP (1 << TIF_SECCOMP) #define _TIF_USING_IWMMXT (1 << TIF_USING_IWMMXT) 
+#define _TIF_TASK_ISOLATION(1 << TIF_TASK_ISOLATION) /* Checks for any syscall work in entry-common.S */ #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ - _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP) + _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \ + _TIF_TASK_ISOLATION) /* * Change these and you break ASM code in entry-common.S */ #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ -_TIF_NOTIFY_RESUME | _TIF_UPROBE) +_TIF_NOTIFY_RESUME | _TIF_UPROBE | \ +_TIF_TASK_ISOLATION) #endif /* __KERNEL__ */ #endif /* __ASM_ARM_THREAD_INFO_H */ diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index 10c3283..dd8c45b 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -36,7 +36,8 @@ ret_fast_syscall: UNWIND(.cantunwind) disable_irq_notrace @ disable interrupts ldr r1, [tsk, #TI_FLAGS]@ re-check for syscall tracing - tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK + ldr r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK + tst r1, r2 bne fast_work_pending /* perform architecture specific actions before user return */ @@ -62,7 +63,8 @@ ret_fast_syscall: str r0, [sp, #S_R0 + S_OFF]!@ save returned r0 disable_irq_notrace @ disable interrupts ldr r1, [tsk, #TI_FLAGS]@ re-check for syscall tracing - tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK + ldr r2, =_TIF_SYSCALL_WORK | _TIF_WORK_MASK + tst r1, r2 beq no_work_pending UNWIND(.fnend ) ENDPROC(ret_fast_syscall) @@ -70,7 +72,8 @@ ENDPROC(ret_fast_syscall) /* Slower path - fall through to work_pending */ #endif - tst r1, #_TIF_SYSCALL_WORK + ldr r2, =_TIF_SYSCALL_WORK + tst r1, r2 bne __sys_trace_return_nosave slow_work_pending: mov r0, sp @ 'regs' @@ -94,7 +97,8 @@ ret_slow_syscall: disable_irq_notrace @ disable interrupts ENTRY(ret_to_user_from_irq) ldr r1, [tsk, #TI_FLAGS] - tst r1, #_TIF_WORK_MASK + ldr r2, =_TIF_WORK_MASK + tst r1, r2 bne slow_work_pending no_work_pending: asm_trace_hardirqs_on
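A note on why the `tst r1, #...` instructions in the hunks above are replaced by `ldr r2, =...` plus `tst r1, r2`: an ARM (A32) data-processing immediate is an 8-bit value rotated right by an even number of bits, so a mask whose set bits span more than one byte cannot be encoded directly. The following standalone C sketch checks encodability under that rule. The function names are made up for illustration, and the example mask `0xff | (1 << 13)` assumes that the pre-existing work flags all sit in the low byte; only TIF_SECCOMP (7) and TIF_TASK_ISOLATION (13) are visible in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by r bits (0 <= r < 32). */
static uint32_t rol32(uint32_t v, unsigned int r)
{
	assert(r < 32);
	/* "& 31" keeps the right-shift count valid when r == 0. */
	return (v << r) | (v >> ((32 - r) & 31));
}

/*
 * Return true if v can be written as an A32 modified immediate:
 * an 8-bit constant rotated right by an even amount (0, 2, ..., 30).
 * Undoing the rotation means rotating left by the same even amount
 * and checking that the result fits in 8 bits.
 */
static bool arm_imm_encodable(uint32_t v)
{
	unsigned int r;

	for (r = 0; r < 32; r += 2)
		if (rol32(v, r) <= 0xffu)
			return true;
	return false;
}
```

Under the stated assumption, `arm_imm_encodable(0xff)` holds for a mask confined to the low byte, while `arm_imm_encodable(0xff | (1u << 13))` does not, which is consistent with the patch description saying the constants now have to come from the literal pool.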
[PATCH] drivers: wlan-ng: fixed a coding style issue
removed a space after a cast to obtain the coding style. Signed-off-by: Jannik Becher--- drivers/staging/wlan-ng/hfa384x.h | 4 ++-- drivers/staging/wlan-ng/p80211netdev.c | 12 ++-- drivers/staging/wlan-ng/p80211req.c| 16 drivers/staging/wlan-ng/prism2fw.c | 18 +- drivers/staging/wlan-ng/prism2mgmt.c | 8 drivers/staging/wlan-ng/prism2mib.c| 16 drivers/staging/wlan-ng/prism2sta.c| 32 7 files changed, 53 insertions(+), 53 deletions(-) diff --git a/drivers/staging/wlan-ng/hfa384x.h b/drivers/staging/wlan-ng/hfa384x.h index f8ee175..4cf4796 100644 --- a/drivers/staging/wlan-ng/hfa384x.h +++ b/drivers/staging/wlan-ng/hfa384x.h @@ -256,7 +256,7 @@ Information RID Lengths: MAC Information include the len or code fields) */ #defineHFA384x_RID_DBMCOMMSQUALITY_LEN \ - ((u16) sizeof(hfa384x_dbmcommsquality_t)) + ((u16)sizeof(hfa384x_dbmcommsquality_t)) #defineHFA384x_RID_JOINREQUEST_LEN \ ((u16)sizeof(hfa384x_JoinRequest_data_t)) @@ -1380,7 +1380,7 @@ static inline int hfa384x_drvr_getconfig16(hfa384x_t *hw, u16 rid, void *val) result = hfa384x_drvr_getconfig(hw, rid, val, sizeof(u16)); if (result == 0) - *((u16 *) val) = le16_to_cpu(*((u16 *) val)); + *((u16 *)val) = le16_to_cpu(*((u16 *)val)); return result; } diff --git a/drivers/staging/wlan-ng/p80211netdev.c b/drivers/staging/wlan-ng/p80211netdev.c index fb97779..38c936a 100644 --- a/drivers/staging/wlan-ng/p80211netdev.c +++ b/drivers/staging/wlan-ng/p80211netdev.c @@ -231,7 +231,7 @@ static int p80211_convert_to_ether(struct wlandevice *wlandev, struct sk_buff *s { struct p80211_hdr_a3 *hdr; - hdr = (struct p80211_hdr_a3 *) skb->data; + hdr = (struct p80211_hdr_a3 *)skb->data; if (p80211_rx_typedrop(wlandev, hdr->fc)) return CONV_TO_ETHER_SKIPPED; @@ -265,7 +265,7 @@ static int p80211_convert_to_ether(struct wlandevice *wlandev, struct sk_buff *s */ static void p80211netdev_rx_bh(unsigned long arg) { - struct wlandevice *wlandev = (struct wlandevice *) arg; + struct wlandevice *wlandev = (struct wlandevice *)arg; struct 
sk_buff *skb = NULL; netdevice_t *dev = wlandev->netdev; @@ -534,7 +534,7 @@ static int p80211netdev_ethtool(struct wlandevice *wlandev, void __user *useradd static int p80211knetdev_do_ioctl(netdevice_t *dev, struct ifreq *ifr, int cmd) { int result = 0; - struct p80211ioctl_req *req = (struct p80211ioctl_req *) ifr; + struct p80211ioctl_req *req = (struct p80211ioctl_req *)ifr; struct wlandevice *wlandev = dev->ml_priv; u8 *msgbuf; @@ -625,7 +625,7 @@ static int p80211knetdev_set_mac_address(netdevice_t *dev, void *addr) /* Set up some convenience pointers. */ mibattr = - macaddr = (p80211item_pstr6_t *) >data; + macaddr = (p80211item_pstr6_t *)>data; resultcode = /* Set up a dot11req_mibset */ @@ -633,7 +633,7 @@ static int p80211knetdev_set_mac_address(netdevice_t *dev, void *addr) dot11req.msgcode = DIDmsg_dot11req_mibset; dot11req.msglen = sizeof(struct p80211msg_dot11req_mibset); memcpy(dot11req.devname, - ((struct wlandevice *) dev->ml_priv)->name, WLAN_DEVNAMELEN_MAX - 1); + ((struct wlandevice *)dev->ml_priv)->name, WLAN_DEVNAMELEN_MAX - 1); /* Set up the mibattribute argument */ mibattr->did = DIDmsg_dot11req_mibset_mibattribute; @@ -653,7 +653,7 @@ static int p80211knetdev_set_mac_address(netdevice_t *dev, void *addr) resultcode->data = 0; /* now fire the request */ - result = p80211req_dorequest(dev->ml_priv, (u8 *) ); + result = p80211req_dorequest(dev->ml_priv, (u8 *)); /* If the request wasn't successful, report an error and don't * change the netdev address diff --git a/drivers/staging/wlan-ng/p80211req.c b/drivers/staging/wlan-ng/p80211req.c index 40627d5..010e5dc 100644 --- a/drivers/staging/wlan-ng/p80211req.c +++ b/drivers/staging/wlan-ng/p80211req.c @@ -110,7 +110,7 @@ static void p80211req_handle_action(struct wlandevice *wlandev, u32 *data, */ int p80211req_dorequest(struct wlandevice *wlandev, u8 *msgbuf) { - struct p80211msg *msg = (struct p80211msg *) msgbuf; + struct p80211msg *msg = (struct p80211msg *)msgbuf; /* Check to make sure the 
MSD is running */ if (!((wlandev->msdstate == WLAN_MSD_HWPRESENT && @@ -170,7 +170,7 @@ static void p80211req_handlemsg(struct wlandevice *wlandev, struct p80211msg *ms case DIDmsg_lnxreq_hostwep:{ struct
Re: [PATCH 1/1 linux-next] netfilter: conntrack: fix kmemleak false positive
Fabian Frederick wrote:
> Hello Florian,
>
> First problem is solved: table gets cleared 3 minutes earlier
> but I still have kmemleak before running the following:
>
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> Nothing
> echo scan > /sys/kernel/debug/kmemleak
> cat /sys/kernel/debug/kmemleak
> -> rsyslogd
>
> I talked about false positive because everything is cleared later.

Hmm, I fear this is a real bug and not false positive.

Should be possible to confirm this via slabinfo:

grep nf_conntrack /proc/slabinfo

The active objects should match the conntrack count.
(conntrack -C, or wc -l < /proc/).

> > > unreferenced object 0x88003b0e6600 (size 248):
> > > comm "rsyslogd", pid 1595, jiffies 4294741312 (age 7.343s)
> > > ...
> > > backtrace:
> > > [] kmemleak_alloc+0x23/0x40
> > > [] kmem_cache_alloc+0xd9/0x180
> > > [] __nf_conntrack_alloc.isra.50+0x48/0x170
> > > [] nf_conntrack_in+0x3a2/0x5f0
> > > [] ipv4_conntrack_local+0x40/0x50
> > > [] nf_iterate+0x5d/0x70
> > > [] nf_hook_slow+0x5f/0xb0
> > > [] __ip_local_out+0xad/0xe0
> > > [] ip_local_out+0x17/0x40
> > > [] ip_send_skb+0x14/0x40
> > > [] udp_send_skb+0x91/0x260
> > > [] udp_sendmsg+0x2f5/0x950
> > > [] inet_sendmsg+0x60/0x90
> > > [] sock_sendmsg+0x33/0x40
> > > [] SYSC_sendto+0xee/0x160
> > > [] SyS_sendto+0x9/0x10

Hmm, so we leak when allocating conntrack for outgoing packet.

Do you do any filtering (DROP) in output/postrouting?

> > > (248 bytes being an nf_conn structure)
> > >
> > > Those structures being cleared in gc_worker() later on we can't talk
> > > about unreferenced object so this patch uses kmemleak_not_leak() to
> > > prevent those warnings.
> >
> > If thats the case, why is kmemleak complaining? Are you sure this
> > is a false positive?

Looks like a real bug to me, but I don't see anything obvious so far.

I'll look at this again tomorrow.
[PATCH 3/4] ARM: dts: at91: add samx7 dtsi
From: Szemző AndrásAdd device tree support for Atmel samx7 SoCs family. Signed-off-by: Szemző András Signed-off-by: Alexandre Belloni --- arch/arm/boot/dts/samx7.dtsi | 1166 ++ 1 file changed, 1166 insertions(+) create mode 100644 arch/arm/boot/dts/samx7.dtsi diff --git a/arch/arm/boot/dts/samx7.dtsi b/arch/arm/boot/dts/samx7.dtsi new file mode 100644 index ..fcef47c22413 --- /dev/null +++ b/arch/arm/boot/dts/samx7.dtsi @@ -0,0 +1,1166 @@ +/* + * samx7.dtsi - Device Tree Include file for SAMx7 family SoCs + * + * This file is dual-licensed: you can use it either under the terms + * of the GPL or the X11 license, at your option. Note that this dual + * licensing only applies to this file, and not this project as a + * whole. + * + * a) This file is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation; either version 2 of the + * License, or (at your option) any later version. + * + * This file is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Or, alternatively, + * + * b) Permission is hereby granted, free of charge, to any person + * obtaining a copy of this software and associated documentation + * files (the "Software"), to deal in the Software without + * restriction, including without limitation the rights to use, + * copy, modify, merge, publish, distribute, sublicense, and/or + * sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following + * conditions: + * + * The above copyright notice and this permission notice shall be + * included in all copies or substantial portions of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES + * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT + * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, + * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/dts-v1/; +#include "armv7-m.dtsi" +#include +#include +#include +#include +#include +#include + +/ { + model = "Atmel SAMx7 family SoC"; + compatible = "atmel,samx7"; + + aliases { + serial0 = + serial1 = + serial2 = + serial3 = + serial4 = + serial5 = + serial6 = + serial7 = + gpio0 = + gpio1 = + gpio2 = + gpio3 = + gpio4 = + i2c0 = + i2c1 = + i2c2 = + tcb0 = + tcb1 = + tcb2 = + tcb3 = + pwm0 = + pwm1 = + }; + + clocks { + + clk_slck: clk-slck { + #clock-cells = <0>; + compatible = "fixed-clock"; + }; + + clk_mck: clk-mck { + #clock-cells = <0>; + compatible = "fixed-clock"; + }; + }; + + soc { + + pmc: pmc@0x400e0600 { + compatible = "atmel,at91sam9x5-pmc", "syscon"; + reg = <0x400e0600 0x200>; + interrupts = <5>; + interrupt-controller; + #address-cells = <1>; + #size-cells = <0>; + #interrupt-cells = <1>; + + periphck { + compatible = "atmel,at91sam9x5-clk-peripheral"; + #address-cells = <1>; + #size-cells = <0>; + clocks = <_mck>; + + uart0_clk: uart0_clk { + #clock-cells = <0>; + reg = <7>; + }; + + uart1_clk: uart1_clk { + #clock-cells = <0>; + reg = <8>; + }; + + smc_clk: smc_clk { + #clock-cells = <0>; + reg = <9>; +
[PATCH 4/4] ARM: at91: debug: add samx7 support
From: Szemző AndrásAdd support for low level debugging on Atmel samx7. Signed-off-by: Szemző András Signed-off-by: Alexandre Belloni --- arch/arm/Kconfig.debug | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug index a9693b6987a6..d209d0d78820 100644 --- a/arch/arm/Kconfig.debug +++ b/arch/arm/Kconfig.debug @@ -145,6 +145,15 @@ choice Say Y here if you want kernel low-level debugging support on the USART3 port of sama5d4. + config DEBUG_AT91_SAMX7_USART1 + bool "Kernel low-level debugging via SAMX7 USART1" + select DEBUG_AT91_UART + depends on SOC_SAMX7 + help + Say Y here if you want the debug print routines to direct + their output to the USART1 port on SAMX7 based + machines. + config DEBUG_BCM2835 bool "Kernel low-level debugging on BCM2835 PL011 UART" depends on ARCH_BCM2835 && ARCH_MULTI_V6 @@ -1479,6 +1488,7 @@ config DEBUG_UART_PHYS default 0x3f201000 if DEBUG_BCM2836 default 0x3e00 if DEBUG_BCM_KONA_UART default 0x4000e400 if DEBUG_LL_UART_EFM32 + default 0x40028000 if DEBUG_AT91_SAMX7_USART1 default 0x40081000 if DEBUG_LPC18XX_UART0 default 0x4009 if DEBUG_LPC32XX default 0x4010 if DEBUG_PXA_UART1 -- 2.9.3
[PATCH 1/4] ARM: at91: Add armv7m support
From: Szemző AndrásAdd Atmel SAME70/SAMS70/SAMV71 SoC support and detection. Signed-off-by: Szemző András Signed-off-by: Alexandre Belloni --- arch/arm/mach-at91/Kconfig | 9 +- arch/arm/mach-at91/Makefile | 1 + arch/arm/mach-at91/Makefile.boot | 3 ++ arch/arm/mach-at91/samx7.c | 62 arch/arm/mach-at91/soc.h | 21 ++ 5 files changed, 95 insertions(+), 1 deletion(-) create mode 100644 arch/arm/mach-at91/Makefile.boot create mode 100644 arch/arm/mach-at91/samx7.c diff --git a/arch/arm/mach-at91/Kconfig b/arch/arm/mach-at91/Kconfig index 5204395efda8..3ca2724a6ca6 100644 --- a/arch/arm/mach-at91/Kconfig +++ b/arch/arm/mach-at91/Kconfig @@ -1,12 +1,19 @@ menuconfig ARCH_AT91 bool "Atmel SoCs" - depends on ARCH_MULTI_V4T || ARCH_MULTI_V5 || ARCH_MULTI_V7 + depends on ARCH_MULTI_V4T || ARCH_MULTI_V5 || ARCH_MULTI_V7 || ARM_SINGLE_ARMV7M select COMMON_CLK_AT91 select GPIOLIB select PINCTRL select SOC_BUS if ARCH_AT91 +config SOC_SAMX7 + bool "SAM Cortex-M7 family" if ARM_SINGLE_ARMV7M + select COMMON_CLK_AT91 + select PINCTRL_AT91 + help + Select this if you are using one of Atmel's SAMx7 family SoC. + config SOC_SAMA5D2 bool "SAMA5D2 family" depends on ARCH_MULTI_V7 diff --git a/arch/arm/mach-at91/Makefile b/arch/arm/mach-at91/Makefile index c5bbf8bb8c0f..84956a18d604 100644 --- a/arch/arm/mach-at91/Makefile +++ b/arch/arm/mach-at91/Makefile @@ -7,6 +7,7 @@ obj-y := soc.o obj-$(CONFIG_SOC_AT91RM9200) += at91rm9200.o obj-$(CONFIG_SOC_AT91SAM9) += at91sam9.o obj-$(CONFIG_SOC_SAMA5)+= sama5.o +obj-$(CONFIG_SOC_SAMX7)+= samx7.o # Power Management obj-$(CONFIG_PM) += pm.o diff --git a/arch/arm/mach-at91/Makefile.boot b/arch/arm/mach-at91/Makefile.boot new file mode 100644 index ..eacfc3f5c33e --- /dev/null +++ b/arch/arm/mach-at91/Makefile.boot @@ -0,0 +1,3 @@ +# Empty file waiting for deletion once Makefile.boot isn't needed any more. +# Patch waits for application at +# http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=7889/1 . 
diff --git a/arch/arm/mach-at91/samx7.c b/arch/arm/mach-at91/samx7.c new file mode 100644 index ..bd33bc56278e --- /dev/null +++ b/arch/arm/mach-at91/samx7.c @@ -0,0 +1,62 @@ +/* + * Setup code for SAMx7 + * + * Copyright (C) 2013 Atmel, + *2016 Andras Szemzo + * + * Licensed under GPLv2 or later. + */ +#include +#include +#include +#include +#include +#include +#include +#include "generic.h" +#include "soc.h" + +static const struct at91_soc samx7_socs[] = { + AT91_SOC(SAME70Q21_CIDR_MATCH, SAME70Q21_EXID_MATCH, +"same70q21", "samx7"), + AT91_SOC(SAME70Q20_CIDR_MATCH, SAME70Q20_EXID_MATCH, +"same70q20", "samx7"), + AT91_SOC(SAME70Q19_CIDR_MATCH, SAME70Q19_EXID_MATCH, +"same70q19", "samx7"), + AT91_SOC(SAMS70Q21_CIDR_MATCH, SAMS70Q21_EXID_MATCH, +"sams70q21", "samx7"), + AT91_SOC(SAMS70Q20_CIDR_MATCH, SAMS70Q20_EXID_MATCH, +"sams70q20", "samx7"), + AT91_SOC(SAMS70Q19_CIDR_MATCH, SAMS70Q19_EXID_MATCH, +"sams70q19", "samx7"), + AT91_SOC(SAMV71Q21_CIDR_MATCH, SAMV71Q21_EXID_MATCH, +"samv71q21", "samx7"), + AT91_SOC(SAMV71Q20_CIDR_MATCH, SAMV71Q20_EXID_MATCH, +"samv71q20", "samx7"), + AT91_SOC(SAMV71Q19_CIDR_MATCH, SAMV71Q19_EXID_MATCH, +"samv71q19", "samx7"), + { /* sentinel */ }, +}; + +static void __init samx7_dt_device_init(void) +{ + struct soc_device *soc; + struct device *soc_dev = NULL; + + soc = at91_soc_init(samx7_socs); + if (soc) + soc_dev = soc_device_to_device(soc); + + of_platform_populate(NULL, of_default_bus_match_table, NULL, soc_dev); +} + +static const char *const samx7_dt_board_compat[] __initconst = { + "atmel,samx7", + NULL +}; + +DT_MACHINE_START(samx7_dt, "Atmel SAMx7") + .init_machine = samx7_dt_device_init, + .dt_compat = samx7_dt_board_compat, +MACHINE_END + diff --git a/arch/arm/mach-at91/soc.h b/arch/arm/mach-at91/soc.h index 228efded5085..0f97e9c5da7e 100644 --- a/arch/arm/mach-at91/soc.h +++ b/arch/arm/mach-at91/soc.h @@ -88,4 +88,25 @@ at91_soc_init(const struct at91_soc *socs); #define SAMA5D43_EXID_MATCH0x0003 #define 
SAMA5D44_EXID_MATCH0x0004 +#define SAME70Q21_CIDR_MATCH 0x21020e00 +#define SAME70Q21_EXID_MATCH 0x0002 +#define SAME70Q20_CIDR_MATCH 0x21020c00 +#define SAME70Q20_EXID_MATCH 0x0002 +#define SAME70Q19_CIDR_MATCH 0x210d0a00 +#define SAME70Q19_EXID_MATCH
Re: [PATCH 1/2] config: move x86 kvm_guest.config to a common locaton
2016-09-08 13:41-0500, Rob Herring:
> kvm_guest.config is useful for KVM guests on other arches, and nothing
> in it appears to be x86 specific, so just move the whole file. Kbuild
> will find it in either location.
>
> Signed-off-by: Rob Herring
> Cc: Christoffer Dall
> Cc: Marc Zyngier
> Cc: Paolo Bonzini
> Cc: "Radim Krčmář"
> Cc: kvm...@lists.cs.columbia.edu
> Cc: k...@vger.kernel.org
> ---

Applied them both to kvm/queue, thanks.
Re: [PATCH v2] KVM: nVMX: Fix the NMI IDT-vectoring handling
2016-09-22 17:55+0800, Wanpeng Li:
> From: Wanpeng Li
>
> Run kvm-unit-tests/eventinj.flat in L1:
>
> Sending NMI to self
> After NMI to self
> FAIL: NMI
>
> This test scenario is to test whether VMM can handle NMI IDT-vectoring
> info correctly.
>
> At the beginning, L2 writes LAPIC to send a self NMI, the EPT page tables
> on both L1 and L0 are empty so:
>
> - The L2 accesses memory can generate EPT violation which can be
>   intercepted by L0.
>
>   The EPT violation vmexit occurred during delivery of this NMI, and the
>   NMI info is recorded in vmcs02's IDT-vectoring info.
>
> - L0 walks L1's EPT12 and L0 sees the mapping is invalid, it injects the
>   EPT violation into L1.
>
>   The vmcs02's IDT-vectoring info is reflected to vmcs12's IDT-vectoring
>   info since it is a nested vmexit.
>
> - L1 receives the EPT violation, then fixes its EPT12.
> - L1 executes VMRESUME to resume L2 which generates vmexit and causes L1
>   exits to L0.
> - L0 emulates VMRESUME which is called from L1, then return to L2.
>
>   L0 merges the requirement of vmcs12's IDT-vectoring info and injects it
>   to L2 through vmcs02.
>
> - The L2 re-executes the fault instruction and cause EPT violation again.
> - Since the L1's EPT12 is valid, L0 can fix its EPT02
> - L0 resume L2
>
>   The EPT violation vmexit occurred during delivery of this NMI again,
>   and the NMI info is recorded in vmcs02's IDT-vectoring info. L0 should
>   inject the NMI through vmentry event injection since it is caused by
>   EPT02's EPT violation.
>
> However, vmx_inject_nmi() refuses to inject NMI from IDT-vectoring info
> if vCPU is in guest mode, this patch fix it by permitting to inject NMI
> from IDT-vectoring if it is the L0's responsibility to inject NMI from
> IDT-vectoring info to L2.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Jan Kiszka
> Cc: Bandan Das
> Signed-off-by: Wanpeng Li
> ---

Applied to kvm/queue, thanks.
Re: [PATCH 3/5] mmc: core: changes frequency to hs_max_dtr when selecting hs400es
On 2016/9/22 18:21, Ulf Hansson wrote:
> On 22 September 2016 at 12:06, Shawn Lin wrote:
>> Hi ulf,
>>
>> On 2016/9/22 17:38, Ulf Hansson wrote:
>>> On 21 September 2016 at 03:43, Shawn Lin wrote:
>>>> Per JESD84-B51 P69, Host need to change frequency to <=52MHz
>>>> after setting HS_TIMING to 0x1, and host may changes frequency
>>>> to <= 200MHz after setting HS_TIMING to 0x3. It seems there is
>>>> no difference if we don't change frequency to <= 52MHz as f_init
>>>> is already less than 52MHz. But actually it does make difference.
>>>> When doing compatibility test we see failures for some eMMC devices
>>>> without changing the frequency to hs_max_dtr. And let's read the
>>>> spec again, we could see that "Host may changes frequency to 200MHz"
>>>> implies that it's not mandatory. But the "Host need to change
>>>> frequency to <= 52MHz" implies that we should do this.
>>>
>>> I don't get this. Are you saying that f_init > 52 MHz? That should not
>>> be impossible, right!?
>>
>> nope, I was saying that the spec implies we to set clock after setting
>> HS_TIMING to 0x1 when doing hs400es selection.
>>
>> I thought there is no difference because the spec says "Host need to
>> change frequency to <= 52MHz", and the f_init(<=400k) is <= 52MHz,
>> right? So I didn't set clock to hs_max_dtr. But I think I misunderstood
>> the spec, so this patch will fix this.
>
> Okay, I see what you mean now!
>
> In other words: The card expects the clock rate to increase from the
> current used f_init (which is <= 400KHz), but still being <= 52MHz,
> when you have set HS_TIMING to 0x1.
>
> Okay, we can do that change! Could you try to improve the change log a
> little bit or you want me to help?

yep, I could change the commit msg a bit and fix another copy-paste error, then respin v2.

BTW, I noticed you have applied one of these 5 patches, so I will remove that one for V2.

Thanks, Ulf.

> Kind regards
> Uffe

--
Best Regards
Shawn Lin
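The rule the thread converges on (raise the clock from f_init once HS_TIMING is 0x1, but stay at or below 52MHz; up to 200MHz is allowed once HS_TIMING is 0x3) can be written down as a small standalone C sketch. It is not code from the MMC core: `clock_after_hs_timing()` and its parameters are hypothetical names chosen here, and `host_max_hz` stands in for whatever limit the host and card impose (the patch subject names `hs_max_dtr` for the first step).

```c
#include <assert.h>

#define HS_TIMING_HIGH_SPEED	0x1	/* value quoted in the thread */
#define HS_TIMING_HS400		0x3	/* value quoted in the thread */

#define HIGH_SPEED_MAX_HZ	52000000u	/* "<= 52MHz" */
#define HS400_MAX_HZ		200000000u	/* "<= 200MHz" */

/*
 * Pick the clock to program right after HS_TIMING has been written.
 * cur_hz is the clock in use so far (f_init during initialization),
 * host_max_hz is an assumed upper bound supported by host and card.
 */
static unsigned int clock_after_hs_timing(unsigned int hs_timing,
					  unsigned int cur_hz,
					  unsigned int host_max_hz)
{
	unsigned int cap;

	assert(host_max_hz > 0);

	switch (hs_timing) {
	case HS_TIMING_HIGH_SPEED:
		cap = HIGH_SPEED_MAX_HZ;
		break;
	case HS_TIMING_HS400:
		cap = HS400_MAX_HZ;
		break;
	default:
		/* Other timings are outside the scope of this sketch. */
		return cur_hz;
	}

	/* Go as fast as allowed, but never above the cap for this timing. */
	return host_max_hz < cap ? host_max_hz : cap;
}
```

The point of the discussion is visible in the first case: starting from an f_init of 400kHz, the function returns 52MHz (or the lower host limit) rather than leaving the clock at f_init, even though f_init already satisfies "<= 52MHz".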
Re: [PATCH v2] arm: dts: zynq: Add MicroZed board support
On Thu, 2016-09-22 at 18:51:29 +0530, Jagan Teki wrote: > From: Jagan Teki> > Added basic dts support for MicroZed board. > > - UART > - SDHCI > - Ethernet > > Cc: Soren Brinkmann > Cc: Michal Simek > Signed-off-by: Jagan Teki > --- > Changes for v2: > - Add SDHCI > - Add Ethernet > > arch/arm/boot/dts/Makefile | 1 + > arch/arm/boot/dts/zynq-microzed.dts | 95 > + > 2 files changed, 96 insertions(+) > create mode 100644 arch/arm/boot/dts/zynq-microzed.dts > > diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile > index faacd52..4d7b858 100644 > --- a/arch/arm/boot/dts/Makefile > +++ b/arch/arm/boot/dts/Makefile > @@ -862,6 +862,7 @@ dtb-$(CONFIG_ARCH_VT8500) += \ > wm8750-apc8750.dtb \ > wm8850-w70v2.dtb > dtb-$(CONFIG_ARCH_ZYNQ) += \ > + zynq-microzed.dtb \ > zynq-parallella.dtb \ > zynq-zc702.dtb \ > zynq-zc706.dtb \ > diff --git a/arch/arm/boot/dts/zynq-microzed.dts > b/arch/arm/boot/dts/zynq-microzed.dts > new file mode 100644 > index 000..9e64496 > --- /dev/null > +++ b/arch/arm/boot/dts/zynq-microzed.dts > @@ -0,0 +1,95 @@ > +/* > + * Copyright (C) 2015 Jagan Teki > + * > + * This software is licensed under the terms of the GNU General Public > + * License version 2, as published by the Free Software Foundation, and > + * may be copied, distributed, and modified under those terms. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. 
> + */ > +/dts-v1/; > +/include/ "zynq-7000.dtsi" > + > +/ { > + model = "Zynq MicroZED Development Board"; > + compatible = "xlnx,zynq-microzed", "xlnx,zynq-7000"; > + > + aliases { > + ethernet0 = > + serial0 = > + }; > + > + memory { > + device_type = "memory"; > + reg = <0x0 0x4000>; > + }; > + > + chosen { > + bootargs = "earlycon"; > + stdout-path = "serial0:115200n8"; > + }; > + > + usb_phy0: phy0 { > + compatible = "usb-nop-xceiv"; > + #phy-cells = <0>; > + }; > +}; > + > + { > + ps-clk-frequency = <>; > +}; > + > + { > + status = "okay"; > + phy-mode = "rgmii-id"; > + phy-handle = <_phy>; > + > + ethernet_phy: ethernet-phy@0 { > + reg = <0>; > + }; > +}; > + > + { > + status = "okay"; > +}; > + > + { > + status = "okay"; > +}; > + > + { > + status = "okay"; > + dr_mode = "host"; > + usb-phy = <_phy0>; > + pinctrl-names = "default"; > + pinctrl-0 = <_usb0_default>; > +}; > + > + { > + pinctrl_usb0_default: usb0-default { > + mux { > + groups = "usb0_0_grp"; > + function = "usb0"; > + }; > + > + conf { > + groups = "usb0_0_grp"; > + slew-rate = <0>; > + io-standard = <1>; > + }; > + > + conf-rx { > + pins = "MIO29", "MIO31", "MIO36"; > + bias-high-impedance; > + }; > + > + conf-tx { > + pins = "MIO28", "MIO30", "MIO32", "MIO33", "MIO34", > +"MIO35", "MIO37", "MIO38", "MIO39"; > + bias-disable; > + }; > + }; > +}; I guess it's not strictly required, but shouldn't there be pinctrl descriptions for all devices? Sören
Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out
"Chen, Tim C" writes:

>>
>>So this is impossible without THP swapin. While 2M swapout makes a lot of
>>sense, I doubt 2M swapin is really useful. What kind of application is
>>'optimized' to do sequential memory access?

Anything that touches regions larger than 4K, where we want the kernel to do minimal work to manage the swapping.

> We waste a lot of cpu cycles to re-compact 4K pages back to a large page
> under THP. Swapping it back in as a single large page can avoid
> fragmentation and this overhead.

Also splitting something just to merge it again is wasteful.

A lot of big improvements in the block and VM and network layers over the years came from avoiding that kind of wasteful work.

-Andi
[ANNOUNCE] linux-4.7-ck5
Announcing the latest release of the -ck patchset for improved responsiveness and interactivity.

http://ck.kolivas.org/patches/4.0/4.7/4.7-ck5/

This is normally just a branded version of BFS with some different default kernel options, however this version incorporates Jens Axboe's writeback throttling patch version 7 which in my testing made a dramatic improvement to behaviour under heavy write loads (no benchmarks.)

https://marc.info/?l=linux-block=147325975312628

A note about CPU scheduler cgroups - BFS does not support any of the cgroup features but does implement a basic stub for the primary "CPU controller" cgroup which creates the relevant cgroup filesystem but using it does nothing. The reason for implementing this was that some applications now refuse to work without it though they'll happily work with these stubs, even if they don't do anything.

Enjoy!
お楽しみ下さい

--
-ck
Re: [PATCH 04/11] staging: dgnc: kfree for board structure in
2016-09-22 16:21 GMT+09:00 Greg KH:
> On Thu, Sep 22, 2016 at 02:22:03PM +0900, Daeseok Youn wrote:
>> The board structure should be freed when any function was failed
>> in dgnc_found_board(). And the board structure will be stored
>> into dgnc_board array when the dgnc_found_board() function has no error.
>>
>> Signed-off-by: Daeseok Youn
>> ---
>> drivers/staging/dgnc/dgnc_driver.c | 17 +
>> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> Another shortened subject line.

I am not sure why the subject line was cut off. I will fix them up and resend.

Thanks.

Regards,
Daeseok.

>
> Please look at all of the subjects in this series, fix them up, and
> resend.
>
> thanks,
>
> greg k-h
Re: [PATCH v2] usb: gadget: Add uevent to notify userspace
Hi,

On 22 September 2016 at 20:53, Felipe Balbi wrote:
>
> Hi,
>
> Baolin Wang writes:
>>>>  static const struct usb_gadget_driver configfs_driver_template = {
>>>>  .bind = configfs_composite_bind,
>>>>  .unbind = configfs_composite_unbind,
>>>> +#ifdef CONFIG_USB_CONFIGFS_UEVENT
>>>> + .setup = configfs_setup,
>>>> + .reset = configfs_disconnect,
>>>> + .disconnect = configfs_disconnect,
>>>> +#else
>>>>  .setup = composite_setup,
>>>>  .reset = composite_disconnect,
>>>>  .disconnect = composite_disconnect,
>>>> +#endif
>
> nope, this is quite wrong.
>
>>>> @@ -1453,6 +1556,10 @@ static struct config_group *gadgets_make(
>>>>  gi->composite.gadget_driver.function = kstrdup(name, GFP_KERNEL);
>>>>  gi->composite.name = gi->composite.gadget_driver.function;
>>>>
>>>> +#ifdef CONFIG_USB_CONFIGFS_UEVENT
>>>> + INIT_WORK(>work, configfs_work);
>>>> +#endif
>>>
>>> This is just way too ugly, please make it so there are no #ifdefs in the
>>> .c files.
>>>
>>> Or, as others said, why is this a build option at all, why would you not
>>> always want this enabled if you are relying on it all of the time?
>>
>> Sometimes userspace does not need the notification, it is not all the
>> time. Anyway I will remove the macro if you still insist on that.
>
> what's wrong with the sysfs we already have for this?

If Android system userspace can support udc-core's uevents like Badhri said, I am fine with that.

--
Baolin.wang
Best Regards
Re: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out
Rik van Riel writes:

> On Thu, 2016-09-22 at 15:56 -0700, Shaohua Li wrote:
>> On Wed, Sep 07, 2016 at 09:45:59AM -0700, Huang, Ying wrote:
>> >
>> > - It will help the memory fragmentation, especially when the THP is
>> >   heavily used by the applications. The 2M continuous pages will be
>> >   free up after THP swapping out.
>>
>> So this is impossible without THP swapin. While 2M swapout makes a
>> lot of sense, I doubt 2M swapin is really useful. What kind of
>> application is 'optimized' to do sequential memory access?
>
> I suspect a lot of this will depend on the ratio of storage
> speed to CPU & RAM speed.
>
> When swapping to a spinning disk, it makes sense to avoid
> extra memory use on swapin, and work in 4kB blocks.

For spinning disks, the THP swap optimization is turned off in the current implementation, because huge swap cluster allocation is based on swap cluster management, which is available only for non-rotating block devices (blk_queue_nonrot()).

> When swapping to NVRAM, it makes sense to use 2MB blocks,
> because that storage can handle data faster than we can
> manage 4kB pages in the VM.

Best Regards,
Huang, Ying
Re: [PATCH v4 10/10] cpufreq: intel_pstate: Use CPPC to get max performance
On Thu, 2016-09-22 at 22:58 +0200, Rafael J. Wysocki wrote:
> > > so what if there are two CPU packages
> > > and there are highest_perf differences in both, and we first enumerate
> > > the first package entirely before getting to the second one?
> > >
> > > In that case we'll schedule the work item after enumerating the first
> > > package and it may rebuild the sched domains before all priorities are
> > > set for the second package, may it not?
> > That is not a problem. For the second package, all the cpu priorities
> > are initialized to the same value. So even if we start to do
> > asym_packing in the scheduler for the whole system,
> > on the second package, all the cpus are treated equally by the scheduler.
> > We will operate as if there is no favored core till we update the
> > priorities of the cpu on the second package.
> OK
>
> But updating those priorities after we have set the "ITMT capable"
> flag is not a problem? Nobody is going to be confused and so on?
>

Not a problem. The worst thing that could happen is we schedule a job to a cpu with a lesser max turbo freq first while the priorities update are in progress.

> > That said, we don't enable ITMT automatically for 2 package system.
> > So the explicit sysctl command to enable ITMT and cause the sched domain
> > rebuild for 2 package system is most likely to come after
> > we have discovered and set all the cpu priorities.
> Right, but if that behavior is relied on, there should be a comment
> about that in the code (and relying on it would be kind of fragile for
> that matter).

No, we don't rely on this behavior of not enabling ITMT automatically for 2 package system. We could enable ITMT for 2 package system by default if we want to. Then asym_packing will just consider the second package's cpus to be equal priorities if they haven't been set.

> > > This seems to require some more consideration.
> > >
> > > > + /*
> > > > +  * Since this function is in the hotcpu notifier callback
> > > > +  * path, submit a task to workqueue to call
> > > > +  * sched_set_itmt_support().
> > > > +  */
> > > > + schedule_work(_itmt_work);
> > > It doesn't make sense to do this more than once IMO and what if we
> > > attempt to schedule the work item again when it has been scheduled
> > > once already? Don't we need any protection here?
> > It is not a problem for sched_set_itmt_support to be called more than
> > once.
> While it is not incorrect, it also is not particularly useful to
> schedule a work item just to find out later that it had nothing to do
> to begin with.

Setting ITMT capability is done per socket during system boot. So there is no performance impact at all so it should not be an issue.

Tim
Re: [PATCH RT 05/10] net: add back the missing serialization in ip_send_unicast_reply()
This got rejected by vger.kernel.org because quilt mail can't handle that funny symbol in Sami's name.

-- Steve

On Thu, 22 Sep 2016 17:47:57 -0400 Steven Rostedt wrote:
> 4.4.21-rt31-rc1 stable review patch.
> If anyone has any objections, please let me know.
>
> --
>
> From: Sebastian Andrzej Siewior
>
> Some time ago Sami Pietikäinen reported a crash on -RT in
> ip_send_unicast_reply() which was later fixed by Nicholas Mc Guire
> (v3.12.8-rt11). Later (v3.18.8) the code was reworked and I dropped the
> patch. As it turns out, that was a mistake.
> I have reports that the same crash is possible with a similar backtrace.
> It seems that vanilla protects access to this_cpu_ptr() via
> local_bh_disable(). This does not work on -RT since we can have
> NET_RX and NET_TX running in parallel on the same CPU.
> This brings back the old locks.
>
> |Unable to handle kernel NULL pointer dereference at virtual address 0010
> |PC is at __ip_make_skb+0x198/0x3e8
> |[] (__ip_make_skb) from [] (ip_push_pending_frames+0x20/0x40)
> |[] (ip_push_pending_frames) from [] (ip_send_unicast_reply+0x210/0x22c)
> |[] (ip_send_unicast_reply) from [] (tcp_v4_send_reset+0x190/0x1c0)
> |[] (tcp_v4_send_reset) from [] (tcp_v4_do_rcv+0x22c/0x288)
> |[] (tcp_v4_do_rcv) from [] (release_sock+0xb4/0x150)
> |[] (release_sock) from [] (tcp_close+0x240/0x454)
> |[] (tcp_close) from [] (inet_release+0x74/0x7c)
> |[] (inet_release) from [] (sock_release+0x30/0xb0)
> |[] (sock_release) from [] (sock_close+0x1c/0x24)
> |[] (sock_close) from [] (__fput+0xe8/0x20c)
> |[] (__fput) from [] (fput+0x18/0x1c)
> |[] (fput) from [] (task_work_run+0xa4/0xb8)
> |[] (task_work_run) from [] (do_work_pending+0xd0/0xe4)
> |[] (do_work_pending) from [] (work_pending+0xc/0x20)
> |Code: e3530001 8a01 e3a00040 ea11 (e5973010)
>
> Cc: stable...@vger.kernel.org
> Signed-off-by: Sebastian Andrzej Siewior
> Signed-off-by: Steven Rostedt
> ---
>  net/ipv4/tcp_ipv4.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 048418b049d8..7fcd8b60d751 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -62,6 +62,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -566,6 +567,7 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_v4_send_check);
>
> +static DEFINE_LOCAL_IRQ_LOCK(tcp_sk_lock);
>  /*
>   * This routine will send an RST to the other tcp.
>   *
> @@ -687,10 +689,13 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
>  	arg.bound_dev_if = sk->sk_bound_dev_if;
>
>  	arg.tos = ip_hdr(skb)->tos;
> +
> +	local_lock(tcp_sk_lock);
>  	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
>  			      skb, _SKB_CB(skb)->header.h4.opt,
>  			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
>  			      , arg.iov[0].iov_len);
> +	local_unlock(tcp_sk_lock);
>
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS);
> @@ -772,10 +777,12 @@ static void tcp_v4_send_ack(struct net *net,
>  	if (oif)
>  		arg.bound_dev_if = oif;
>  	arg.tos = tos;
> +	local_lock(tcp_sk_lock);
>  	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
>  			      skb, _SKB_CB(skb)->header.h4.opt,
>  			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
>  			      , arg.iov[0].iov_len);
> +	local_unlock(tcp_sk_lock);
>
>  	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
>  }
[PATCH RT 02/10] timers: wakeup all timer waiters without holding the base lock
4.1.33-rt38-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorThere should be no need to hold the base lock during the wakeup. There should be no boosting involved, the wakeup list has its own lock so it should be safe to do this without the lock. Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- kernel/time/timer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 5a45162ae924..b1f9e6c5bec4 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1279,8 +1279,8 @@ static inline void __run_timers(struct tvec_base *base) } } } - wakeup_timer_waiters(base); spin_unlock_irq(>lock); + wakeup_timer_waiters(base); } #ifdef CONFIG_NO_HZ_COMMON -- 2.8.1
Re: [RFC PATCH v2 3/5] futex: Throughput-optimized (TO) futexes
On 09/22/2016 05:41 PM, Thomas Gleixner wrote:
> On Thu, 22 Sep 2016, Davidlohr Bueso wrote:
>> On Thu, 22 Sep 2016, Waiman Long wrote:
>>> BTW, my initial attempt for the new futex was to use the same workflow as
>>> the PI futexes, but use mutex which has optimistic spinning instead of
>>> rt_mutex.
>> Btw, Thomas, do you still have any interest pursuing this for rtmutexes from
>> -rt into mainline? If so I can resend the patches from a while ago.
> Certainly yes. My faint memory tells me that there was some potential issue
> due to boosting the owner only if it gets scheduled out, but I might be
> wrong.

It is tricky to add optimistic spinning to rtmutexes because of the need to observe process priorities. It is certainly possible to make the top waiter spin, but then I am not sure how much performance gain with just that.

Cheers,
Longman
[PATCH RT 07/10] fs/dcache: resched/chill only if we make no progress
4.1.33-rt38-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Upstream commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()") changed the condition _when_ cpu_relax() / cond_resched() was invoked. This change was adapted in -RT into mostly the same thing, except that if cond_resched() did nothing we had to do cpu_chill() to force the task off the CPU for a tiny little bit in case the task had RT priority and did not want to leave the CPU.
This change resulted in a performance regression (in my testcase the build time on /dev/shm increased from 19min to 24min). The reason is that with this change cpu_chill() was invoked even if dput() made progress (dentry_kill() returned a different dentry), instead of only if we were trying this operation on the same dentry over and over again.
This patch brings back the old behavior: cond_resched() & chill only if we make no progress. A little improvement is to invoke cpu_chill() only if we are an RT task (and avoid the sleep otherwise). Otherwise the scheduler should remove us from the CPU if we make no progress.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index d96330db7f80..9a6c0a5ec1a3 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -40,6 +40,8 @@
 #include
 #include
 #include
+#include
+#include

 #include "internal.h"
 #include "mount.h"
@@ -748,6 +750,8 @@ static inline bool fast_dput(struct dentry *dentry)
  */
 void dput(struct dentry *dentry)
 {
+	struct dentry *parent;
+
 	if (unlikely(!dentry))
 		return;

@@ -784,13 +788,17 @@ repeat:
 	return;

 kill_it:
-	dentry = dentry_kill(dentry);
-	if (dentry) {
+	parent = dentry_kill(dentry);
+	if (parent) {
 		int r;

-		r = cond_resched();
-		if (!r)
-			cpu_chill();
+		if (parent == dentry) {
+			/* the task with the highest priority won't schedule */
+			r = cond_resched();
+			if (!r && (rt_task(current) || dl_task(current)))
+				cpu_chill();
+		} else
+			dentry = parent;
 		goto repeat;
 	}
 }
--
2.8.1
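The retry decision that the patch above introduces in dput() can be sketched as a pure function in user space. This is only an illustration under stated assumptions: `struct dentry_sketch`, `enum retry_step` and `dput_retry_step()` are hypothetical names that do not exist in the kernel, and the two booleans stand in for the return value of cond_resched() and for `rt_task(current) || dl_task(current)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry; only its address matters here. */
struct dentry_sketch { int id; };

enum retry_step {
	STEP_DONE,         /* dentry_kill() returned NULL: nothing left to do */
	STEP_ADVANCE,      /* progress was made: continue with the parent */
	STEP_RESCHED_ONLY, /* no progress, but cond_resched() is sufficient */
	STEP_CHILL         /* no progress, RT/DL task that was not rescheduled */
};

/*
 * Models the branch after "parent = dentry_kill(dentry)" in the patch.
 * rescheduled:   assumed result of cond_resched() (true if it scheduled)
 * rt_or_dl_task: assumed result of rt_task(current) || dl_task(current)
 */
static enum retry_step
dput_retry_step(const struct dentry_sketch *dentry,
		const struct dentry_sketch *parent,
		bool rescheduled, bool rt_or_dl_task)
{
	if (!parent)
		return STEP_DONE;
	if (parent != dentry)
		return STEP_ADVANCE;
	if (!rescheduled && rt_or_dl_task)
		return STEP_CHILL;
	return STEP_RESCHED_ONLY;
}
```

In this model, STEP_CHILL is reached only when dentry_kill() hands back the same dentry, which is the "no progress" case the changelog describes.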
[PATCH RT 01/10] timers: wakeup all timer waiters
4.1.33-rt38-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

The base lock is dropped during the invocation of the timer. That means it is possible that we have one waiter while timer1 is running and once this one finished, we get another waiter while timer2 is running. Since we wake up only one waiter it is possible that we miss the other one.
This will probably heal itself over time because most of the time we complete timers without an active wake up.
To avoid the scenario where we don't wake up all waiters at once, wake_up_all() is used.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index c68ba873da3c..5a45162ae924 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1019,7 +1019,7 @@ static void wait_for_running_timer(struct timer_list *timer)
 		   base->running_timer != timer);
 }

-# define wakeup_timer_waiters(b)	wake_up(&(b)->wait_for_running_timer)
+# define wakeup_timer_waiters(b)	wake_up_all(&(b)->wait_for_running_timer)
 #else
 static inline void wait_for_running_timer(struct timer_list *timer)
 {
--
2.8.1
[PATCH RT 03/10] sched: lazy_preempt: avoid a warning in the !RT case
4.1.33-rt38-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorSigned-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 539693da..9f05a3dacd16 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3123,7 +3123,7 @@ static __always_inline int preemptible_lazy(void) #else -static int preemptible_lazy(void) +static inline int preemptible_lazy(void) { return 1; } -- 2.8.1
[PATCH RT 04/10] scsi/fcoe: Fix get_cpu()/put_cpu_light() imbalance in fcoe_recv_frame()
4.1.33-rt38-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Mike GalbraithDuring master->rt merge, I stumbled across the buglet below. Fix get_cpu()/put_cpu_light() imbalance. Cc: stable...@vger.kernel.org Signed-off-by: Mike Gabraith Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- drivers/scsi/fcoe/fcoe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c index d6b6dde64fb9..455bf9c67b16 100644 --- a/drivers/scsi/fcoe/fcoe.c +++ b/drivers/scsi/fcoe/fcoe.c @@ -1815,7 +1815,7 @@ static void fcoe_recv_frame(struct sk_buff *skb) */ hp = (struct fcoe_hdr *) skb_network_header(skb); - stats = per_cpu_ptr(lport->stats, get_cpu()); + stats = per_cpu_ptr(lport->stats, get_cpu_light()); if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) { if (stats->ErrorFrames < 5) printk(KERN_WARNING "fcoe: FCoE version " -- 2.8.1
[PATCH RT 08/10] x86/preempt-lazy: fixup should_resched()
4.1.33-rt38-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

should_resched() returns true if NEED_RESCHED is set and the preempt_count is 0 _or_ if NEED_RESCHED_LAZY is set, ignoring the preempt counter. Ignoring the preempt counter is wrong. This patch takes it into account.
While at it, __preempt_count_dec_and_test() ignores preempt_lazy_count while checking TIF_NEED_RESCHED_LAZY, so we add this check, too.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 arch/x86/include/asm/preempt.h | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index c08949b0314d..eff1b8609f77 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -92,6 +92,8 @@ static __always_inline bool __preempt_count_dec_and_test(void)
 	if (preempt_count_dec_and_test())
 		return true;
 #ifdef CONFIG_PREEMPT_LAZY
+	if (current_thread_info()->preempt_lazy_count)
+		return false;
 	return test_thread_flag(TIF_NEED_RESCHED_LAZY);
 #else
 	return false;
@@ -104,8 +106,19 @@ static __always_inline bool __preempt_count_dec_and_test(void)
 static __always_inline bool should_resched(int preempt_offset)
 {
 #ifdef CONFIG_PREEMPT_LAZY
-	return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset ||
-			test_thread_flag(TIF_NEED_RESCHED_LAZY));
+	u32 tmp;
+
+	tmp = raw_cpu_read_4(__preempt_count);
+	if (tmp == preempt_offset)
+		return true;
+
+	/* preempt count == 0 ? */
+	tmp &= ~PREEMPT_NEED_RESCHED;
+	if (tmp)
+		return false;
+	if (current_thread_info()->preempt_lazy_count)
+		return false;
+	return test_thread_flag(TIF_NEED_RESCHED_LAZY);
 #else
 	return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset);
 #endif
--
2.8.1
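The difference between the old and the new should_resched() can be checked with a small user-space model. This is a sketch, not kernel code: `struct cpu_state`, `should_resched_old()` and `should_resched_new()` are hypothetical names, and the per-CPU reads and thread-flag tests of the patch are replaced by plain struct fields. It assumes the x86 convention in which the top bit of the preempt count is an inverted "need resched" flag, so a clear bit means a reschedule is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same value and inverted meaning as on x86. */
#define PREEMPT_NEED_RESCHED 0x80000000u

/* Hypothetical snapshot of the state the kernel reads per CPU / per thread. */
struct cpu_state {
	uint32_t preempt_count;  /* includes the inverted NEED_RESCHED bit */
	int preempt_lazy_count;  /* lazy preemption disable depth */
	bool need_resched_lazy;  /* models TIF_NEED_RESCHED_LAZY */
};

/* Behaviour before the patch: the lazy flag wins regardless of the counters. */
static bool should_resched_old(const struct cpu_state *s, uint32_t preempt_offset)
{
	return s->preempt_count == preempt_offset || s->need_resched_lazy;
}

/* Behaviour after the patch: the lazy flag only counts if both counters are 0. */
static bool should_resched_new(const struct cpu_state *s, uint32_t preempt_offset)
{
	uint32_t tmp = s->preempt_count;

	if (tmp == preempt_offset)
		return true;

	/* preempt count == 0 once the inverted flag bit is masked out? */
	tmp &= ~PREEMPT_NEED_RESCHED;
	if (tmp)
		return false;
	if (s->preempt_lazy_count)
		return false;
	return s->need_resched_lazy;
}
```

With preemption disabled once and only the lazy flag set, the old variant reports that a reschedule should happen while the new one does not, which is the case the changelog calls wrong.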
[PATCH RT 09/10] fs/dcache: incremental fixup of the retry routine
4.1.33-rt38-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

It has been pointed out by tglx that on UP the non-RT task could spin its entire time slice because the lock owner is preempted. This won't happen on !RT. So we go back to "chill" if cond_resched() did not work.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 9a6c0a5ec1a3..c790b2b070ab 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -40,8 +40,6 @@
 #include
 #include
 #include
-#include
-#include

 #include "internal.h"
 #include "mount.h"
@@ -795,10 +793,11 @@ kill_it:
 		if (parent == dentry) {
 			/* the task with the highest priority won't schedule */
 			r = cond_resched();
-			if (!r && (rt_task(current) || dl_task(current)))
+			if (!r)
 				cpu_chill();
-		} else
+		} else {
 			dentry = parent;
+		}
 		goto repeat;
 	}
 }
--
2.8.1
Re: [RFC PATCH 2/8] thread_info: allow custom in-task thread_info
On Sep 21, 2016 12:28 AM, "Mark Rutland"wrote: > > Hi Andy, > > On Fri, Sep 16, 2016 at 08:11:14AM -0700, Andy Lutomirski wrote: > > > On Thu, Sep 15, 2016 at 11:37:47AM -0700, Andy Lutomirski wrote: > > > Just to check, what do you mean to happen with the flags field? Should > > > that always be in the generic thread_info? e.g. > > > > > > struct thread_info { > > > u32 flags; > > > #ifdef arch_thread_info > > > struct arch_thread_info arch_ti; > > > #endif > > > }; > > > > Exactly. Possibly with a comment that using thread_struct should be > > preferred and that arch_thread_info should be used only if some header > > file requires access via current_thread_info() or task_thread_info(). > > While fixing up these patches, I realised that I'm somewhat concerned by > flags becoming a u32 (where it was previously an unsigned long for > arm64). > > The generic {test,set,*}_ti_thread_flag() helpers use the usual bitops, > which perform accesses of sizeof(unsigned long) at a time, and for arm64 > these need to be naturally-aligned. > > We happen to get that alignment from subsequent fields in task_struct > and/or thread_info, and for arm64 we don't seem to have a problem with > tearing, but it feels somewhat fragile, and leaves me uneasy. > > Looking at the git log, it seems that x86 also use unsigned long until > commit affa219b60a11b32 ("x86: change thread_info's flag field back to > 32 bits"), where if I'm reading correctly, this was done to get rid of > unnecessary padding. With THREAD_INFO_IN_STACK, thread_info::flags is > immediately followed by a long on x86, so we save no padding. > > Given all that, can we make the generic thread_info::flags an unsigned > long, matching what the thread flag helpers implicitly assume? > Yes. Want to send the patch or should I? --Andy
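The concern about the flags width can be illustrated with a user-space sketch. Everything here is an assumption made for illustration: `struct thread_info_sketch`, `set_bit_sketch()`, `test_bit_sketch()` and the `*_ti_flag()` wrappers are hypothetical, and unlike the kernel bitops they are not atomic. The point is only that helpers which access an `unsigned long` at a time can take `&ti->flags` without a cast, and with natural alignment guaranteed by the type, when `flags` itself is an `unsigned long`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical generic thread_info with flags widened to unsigned long. */
struct thread_info_sketch {
	unsigned long flags;
};

/* The flags word is naturally aligned for unsigned long accesses by construction. */
_Static_assert(offsetof(struct thread_info_sketch, flags) % _Alignof(unsigned long) == 0,
	       "flags must be aligned for unsigned long accesses");

/* Non-atomic stand-ins for bitops that operate on unsigned long words. */
static void set_bit_sketch(unsigned int nr, unsigned long *addr)
{
	addr[nr / BITS_PER_LONG_SKETCH] |= 1UL << (nr % BITS_PER_LONG_SKETCH);
}

static bool test_bit_sketch(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG_SKETCH] >> (nr % BITS_PER_LONG_SKETCH)) & 1UL;
}

/* No cast is needed here because flags already has the helpers' word type. */
static void set_ti_flag(struct thread_info_sketch *ti, unsigned int flag)
{
	set_bit_sketch(flag, &ti->flags);
}

static bool test_ti_flag(const struct thread_info_sketch *ti, unsigned int flag)
{
	return test_bit_sketch(flag, &ti->flags);
}
```

With a 32-bit `flags` field the same helpers would need a pointer cast and would touch whatever bytes follow the field on a 64-bit build, which is the fragility described in the mail.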
Re: [PATCH 09/12] x86/process: Pin the target stack in get_wchan()
On Fri, Sep 16, 2016 at 7:00 PM, Jann Horn wrote:
> On Tue, Sep 13, 2016 at 02:29:29PM -0700, Andy Lutomirski wrote:
>> This will prevent a crash if get_wchan() runs after the task stack
>> is freed.
>
> I think I found some more stuff. Have a look at KSTK_EIP() and KSTK_ESP(), I think
> they read from the saved userspace registers area at the top of the kernel stack?
>
> Used on remote processes in:
> vma_is_stack_for_task() (via /proc/$pid/maps)

This isn't used in /proc/$pid/maps -- it's only used in /proc/$pid/task/$tid/maps. I wonder if anyone actually cares about it -- it certainly won't work reliably.

I could pin the stack in vma_is_stack_for_task, but it seems potentially better to me to change it to vma_is_stack_for_current() and remove the offending caller in /proc, replacing it with "return 0". Thoughts?

> do_task_stat() (/proc/$pid/stat)

Like this:

	mm = get_task_mm(task);
	if (mm) {
		vsize = task_vsize(mm);
		if (permitted) {
			eip = KSTK_EIP(task);
			esp = KSTK_ESP(task);
		}
	}

Can we just delete this outright? It seems somewhere between mostly and entirely useless, and it also seems dangerous. Until very recently, on x86_64, this would have been a potential info leak, as SYSCALL followed closely by a hardware interrupt would cause *kernel* values to land in task_pt_regs(). I don't even want to think about what this code does if the task is in vm86 mode. I wouldn't be at all surprised if non-x86 architectures have all kinds of interesting things happen if you do this to a task that isn't running normal non-atomic kernel code at the time.

I would advocate for unconditionally returning zeros in these two stat fields.
Re: [PATCH v4] KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
2016-09-22 07:43+0800, Wanpeng Li: > From: Wanpeng Li> > I observed that kvmvapic(to optimize flexpriority=N or AMD) is used > to boost TPR access when testing kvm-unit-test/eventinj.flat tpr case > on my haswell desktop (w/ flexpriority, w/o APICv). Commit (8d14695f9542 > x86, apicv: add virtual x2apic support) disable virtual x2apic mode > completely if w/o APICv, and the author also told me that windows guest > can't enter into x2apic mode when he developed the APICv feature several > years ago. However, it is not truth currently, Interrupt Remapping and > vIOMMU is added to qemu and the developers from Intel test windows 8 can > work in x2apic mode w/ Interrupt Remapping enabled recently. > > This patch enables TPR shadow for virtual x2apic mode to boost > windows guest in x2apic mode even if w/o APICv. > > Can pass the kvm-unit-test. > > Suggested-by: Radim Krčmář > Suggested-by: Wincy Van > Reviewed-by: Radim Krčmář > Cc: Paolo Bonzini > Cc: Radim Krčmář > Cc: Wincy Van > Cc: Yang Zhang > Signed-off-by: Wanpeng Li > --- Applied to kvm/queue, thanks.
Re: [RFC PATCH v2 1/5] futex: Add futex_set_timer() helper function
On 09/22/2016 05:31 PM, Thomas Gleixner wrote:
> On Tue, 20 Sep 2016, Waiman Long wrote:
>
> Please be more careful of your subject lines. First thing I thought was
> that you add a helper which is used in later patches to find out that you
> actually consolidate duplicated code. Something like:
>
>    futex: Consolidate duplicated timer setup code
>
> would have told me right away what this is about.
>
>> This patch adds a new futex_set_timer() function to consolidate all
>
> Please do not use: "This patch ...". We already know that this is a patch,
> otherwise it would not be tagged [PATCH n/m] in the subject line. See
> Documentation/SubmittingPatches
>
>> the sleeping hrtime setup code.
>
> Let me give you a hint:
>
> 1: The code has three identical code copies to set up the futex timeout.
>
> 2: Add a helper function and consolidate the call sites.
>
> #1 tells precisely what the problem is
> #2 tells precisely how it is solved
>
> Can you see the difference?
>
>> +/*
>> + * Helper function to set the sleeping hrtimer.
>> + */
>> +static inline void futex_set_timer(ktime_t *time, struct hrtimer_sleeper **pto,
>> +		struct hrtimer_sleeper *timeout, int flags, u64 range_ns)
>
> Please use futex_setup_timer() as the function name. I was confused when I
> read the other patch that you wanted to "set" the timer before entering
> into the place which would actually need it.
>
>> +{
>> +	if (!time)
>> +		return;
>> +	*pto = timeout;
>
> Please don't do that. That's a horrible coding style. What's wrong with
> returning NULL or the timeout pointer and assign it to "to" at the call
> site?
>
> Thanks,
>
>	tglx

Thanks for the suggestions. I will fix this patch in the next revision.

Cheers,
Longman
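The calling convention suggested in the review, returning NULL or the timeout pointer instead of writing through an output parameter, might look roughly like the following user-space sketch. The types and names (`ktime_sketch_t`, `struct sleeper_sketch`, `futex_setup_timer_sketch()`) are hypothetical stand-ins, not the kernel's ktime_t or hrtimer_sleeper, and the real helper would also initialise and arm an hrtimer, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ktime_t and struct hrtimer_sleeper. */
typedef long long ktime_sketch_t;

struct sleeper_sketch {
	ktime_sketch_t expires; /* absolute expiry time */
	int armed;              /* set once the sleeper has been prepared */
};

/*
 * Returns NULL when no timeout was requested, otherwise the prepared
 * sleeper. The caller assigns the result to its own "to" pointer.
 */
static struct sleeper_sketch *
futex_setup_timer_sketch(const ktime_sketch_t *time,
			 struct sleeper_sketch *timeout)
{
	if (!time)
		return NULL;

	timeout->expires = *time;
	timeout->armed = 1;
	return timeout;
}
```

A call site would then read `to = futex_setup_timer_sketch(time, &timeout);`, so the assignment to `to` stays visible where it is used.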
[PATCH 01/11 V2] staging: dgnc: remove redundant initialization for channel array
The channel array in board_t was initialized with NULL in dgnc_found_board(). But the channels are going to be initialized again in dgnc_tty_init(). So the channel array doesn't need to be set to NULL for initialization in dgnc_found_board().

Signed-off-by: Daeseok Youn
---
V2: The subject line was cut off, I put it completely and updated the change log.

 drivers/staging/dgnc/dgnc_driver.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/dgnc/dgnc_driver.c b/drivers/staging/dgnc/dgnc_driver.c
index 01e948c..b598034 100644
--- a/drivers/staging/dgnc/dgnc_driver.c
+++ b/drivers/staging/dgnc/dgnc_driver.c
@@ -400,9 +400,6 @@ static int dgnc_found_board(struct pci_dev *pdev, int id)

 	brd->state = BOARD_FOUND;

-	for (i = 0; i < MAXPORTS; i++)
-		brd->channels[i] = NULL;
-
 	/* store which card & revision we have */
 	pci_read_config_word(pdev, PCI_SUBSYSTEM_VENDOR_ID, >subvendor);
 	pci_read_config_word(pdev, PCI_SUBSYSTEM_ID, >subdevice);
--
1.9.1
[PATCH 09/11 V2] staging: dgnc: rename dgnc_tty_uninit() to dgnc_cleanup_tty()
The dgnc_tty_uninit() doesn't match with dgnc_tty_init() at all. And also the dgnc_cleanup_tty() is only called for exiting the module. Signed-off-by: Daeseok Youn--- V2: the subject line was cut off, I put it completely. drivers/staging/dgnc/dgnc_driver.c | 2 +- drivers/staging/dgnc/dgnc_tty.c| 4 ++-- drivers/staging/dgnc/dgnc_tty.h| 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/staging/dgnc/dgnc_driver.c b/drivers/staging/dgnc/dgnc_driver.c index 81ce5c4..fd372d3 100644 --- a/drivers/staging/dgnc/dgnc_driver.c +++ b/drivers/staging/dgnc/dgnc_driver.c @@ -147,7 +147,7 @@ static void cleanup(bool sysfiles) for (i = 0; i < dgnc_num_boards; ++i) { dgnc_remove_ports_sysfiles(dgnc_board[i]); - dgnc_tty_uninit(dgnc_board[i]); + dgnc_cleanup_tty(dgnc_board[i]); dgnc_cleanup_board(dgnc_board[i]); } diff --git a/drivers/staging/dgnc/dgnc_tty.c b/drivers/staging/dgnc/dgnc_tty.c index 893f473..5befd28 100644 --- a/drivers/staging/dgnc/dgnc_tty.c +++ b/drivers/staging/dgnc/dgnc_tty.c @@ -387,12 +387,12 @@ void dgnc_tty_post_uninit(void) } /* - * dgnc_tty_uninit() + * dgnc_cleanup_tty() * * Uninitialize the TTY portion of this driver. Free all memory and * resources. */ -void dgnc_tty_uninit(struct dgnc_board *brd) +void dgnc_cleanup_tty(struct dgnc_board *brd) { int i = 0; diff --git a/drivers/staging/dgnc/dgnc_tty.h b/drivers/staging/dgnc/dgnc_tty.h index f065c8f..24c9a41 100644 --- a/drivers/staging/dgnc/dgnc_tty.h +++ b/drivers/staging/dgnc/dgnc_tty.h @@ -25,7 +25,7 @@ int dgnc_tty_preinit(void); int dgnc_tty_init(struct dgnc_board *); void dgnc_tty_post_uninit(void); -void dgnc_tty_uninit(struct dgnc_board *); +void dgnc_cleanup_tty(struct dgnc_board *); void dgnc_input(struct channel_t *ch); void dgnc_carrier(struct channel_t *ch); -- 1.9.1
Re: [PATCH] PCI: rockchip: Support quirk to disable 5 GT/s (PCIe 2.x) link rate
Hi Brian,

On 2016/9/23 9:15, Brian Norris wrote:
> Hi Shawn,
>
> On Fri, Sep 23, 2016 at 08:27:35AM +0800, Shawn Lin wrote:
>> On 2016/9/23 1:31, Brian Norris wrote:
>>> rk3399 supports PCIe 2.x link speeds marginally at best, and on some
>>> boards, the link won't train at 5 GT/s at all. Rather than sacrifice 500
>>> ms waiting for training that will never happen, let's support a device
>>> tree quirk flag to disable generation 2 speeds entirely.
>>
>> I was thinking about could we get target link speed [TLS] from the
>> end-point when finishing Gen1 training, but it seems that the location
>> of ep's TLS is not fixed.
>
> Indeed it's not, but we could probably handle that if absolutely needed
> (get a reference to the root port pci_dev somehow, then use the existing
> helpers to walk children and get the computed ->pcie_cap offset). But

Right, we could probably walk through the ep's cap and get this, but sure, it's not the problem here, and that is maybe what I want to dig more later. Thanks for sharing this.

> that's not the problem here; we have 5 GT/s devices, but they are not
> running at 5 GT/s because link training can't pass. We have been told
> there are still SI issues, and so you wouldn't really be able to turn
> this out at runtime anyway.
>
> But sure, I suppose that'd be a way to (for chips/boards that don't have
> SI issues) determine whether or not to attempt gen2 training at all.
> That does sound better than just timing out after 500ms...
>
>> Anyway, your patch looks sane to me as we leave gen2 as default and
>> people could drop that feature by adding rockchip,disable-gen2 to their
>> dts if they are sure the board would never support Gen2 devices.
>>
>> Acked-by: Shawn Lin
>
> Thanks.
>
> Brian

--
Best Regards
Shawn Lin
linux-next: manual merge of the drm-misc tree with Linus' tree
Hi all, Today's linux-next merge of the drm-misc tree got a conflict in: drivers/gpu/drm/drm_crtc.c between commit: 6f00975c6190 ("drm: Reject page_flip for !DRIVER_MODESET") from Linus' tree and commit: 43968d7b806d ("drm: Extract drm_plane.[hc]") from the drm-misc tree. I fixed it up (the latter incorporated the former, so I just used the latter) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell
Re: [RFC PATCH 3/4] futex: Throughput-optimized (TO) futexes
On 09/22/2016 09:32 AM, Thomas Gleixner wrote: On Tue, 6 Sep 2016, Waiman Long wrote: +enum futex_type { + TYPE_PI = 0, + TYPE_TO, +}; Please introduce the futex_type magic and the related changes to the pi code in a seperate patch so it can be verified independently. It's sad that one has to explain that to you over and over I didn't break it out because the changes to the PI code was pretty small. I will break it out in the next version. @@ -836,10 +859,10 @@ static void put_futex_state(struct futex_state *state) return; /* -* If state->owner is NULL, the owner is most probably dying -* and has cleaned up the futex state already +* If state->owner is NULL and the type is TYPE_PI, the owner +* is most probably dying and has cleaned up the state already */ - if (state->owner) { + if (state->owner&& (state->type == TYPE_PI)) { raw_spin_lock_irq(>owner->pi_lock); list_del_init(>list); raw_spin_unlock_irq(>owner->pi_lock); @@ -847,6 +870,11 @@ static void put_futex_state(struct futex_state *state) rt_mutex_proxy_unlock(>pi_mutex, state->owner); } + /* +* Dequeue it from the HB futex state list. +*/ + list_del_init(>hb_list); The comment above this list_del() is really pointless. I can see that from the code itself. Aside of that: Why do you need seperate list heads? You explain the seperate list somewhere in that big comment below, but it should be explained at the point where you add it to the state and the hash bucket. Sure. Will fix the comment. if (current->pi_state_cache) kfree(state); else { @@ -919,13 +947,24 @@ void exit_pi_state_list(struct task_struct *curr) continue; } - WARN_ON(pi_state->owner != curr); WARN_ON(list_empty(_state->list)); + if (pi_state->type == TYPE_PI) { + WARN_ON(pi_state->owner != curr); + pi_state->owner = NULL; + } list_del_init(_state->list); - pi_state->owner = NULL; raw_spin_unlock_irq(>pi_lock); - rt_mutex_unlock(_state->pi_mutex); + if (pi_state->type == TYPE_PI) lacks curly braces Yes, you are right. 
+ rt_mutex_unlock(_state->pi_mutex); + else if (pi_state->type == TYPE_TO) { + /* +* Need to wakeup the mutex owner. +*/ Another completely useless comment. Because you tell what you do, but not WHY. Will elaborate on why the wakeup here. + WARN_ON(!pi_state->owner); + if (pi_state->owner) + wake_up_process(pi_state->owner); And what handles or sanity checks the state->hb_list ??? The exit_pi_state_list() function doesn't need to deal with state->hb_list. The hb_list is used to locate the futex state, but the futex owner doesn't have a reference to the futex state. So it won't need to decrement it and potentially free it. +/* + * Try to lock the userspace futex word (0 => vpid). + * + * Return: 1 if lock acquired or an error happens, 0 if not. + *The status code will be 0 if no error, or< 0 if an error happens. + **puval will contain the latest futex value when trylock fails. + * + * The waiter flag, if set, will make it ignore the FUTEX_WAITERS bit. + * The HB spinlock should NOT be held while calling this function. A + * successful lock acquisition will clear the waiter and died bits. + */ +static inline int futex_trylock_to(u32 __user *uaddr, u32 vpid, u32 *puval, + const bool waiter, int *status) +{ + u32 uval; + + *status = 0; + + if (unlikely(get_user(uval, uaddr))) + goto efault; + + *puval = uval; + + if (waiter ? (uval& FUTEX_TID_MASK) : uval) + return 0; /* Trylock fails */ Please do not use tail comments. They are hard to parse. OK, will move the comment up. + + if (unlikely(futex_atomic_cmpxchg_inatomic(puval, uaddr, uval, vpid))) + goto efault; + + return *puval == uval; + +efault: + *status = -EFAULT; + return 1; +} Do we really need another variant of cmpxchg and why do you need that extra status? What's wrong in doing the magic in the return value? This is not another variant of cmpxchg. It is the cmpxchg used by cmpxchg_futex_value_locked(). The only difference is that page fault was disabled with the locked version. 
I call futex_atomic_cmpxchg_inatomic() directly because it is called without the HB spinlock held, so I don't need to disable page faults. I will add a separate patch to introduce the helper function cmpxchg_futex_value_unlocked().
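The question about folding the status into the return value can be illustrated with a small userspace sketch. This is not the kernel code from the patch: trylock_word(), its C11-atomics implementation, and the tri-state return convention (negative errno on error, 0 if not acquired, 1 if acquired) are assumptions made for illustration, and a NULL pointer merely stands in for the -EFAULT path.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TID_MASK 0x3fffffffu	/* same value as FUTEX_TID_MASK */

/*
 * Hypothetical userspace analogue of futex_trylock_to().
 * Returns 1 if the lock was acquired, 0 if not, and a negative
 * errno on failure, so no separate "status" out-parameter is needed.
 * *puval receives the last value observed in the lock word.
 */
static int trylock_word(_Atomic uint32_t *word, uint32_t vpid,
			uint32_t *puval, bool waiter)
{
	uint32_t uval;

	if (!word)
		return -EFAULT;	/* stands in for a faulting user access */

	uval = atomic_load(word);
	*puval = uval;

	/* A waiter ignores the flag bits and only looks at the owner TID. */
	if (waiter ? (uval & TID_MASK) : uval)
		return 0;

	/* On failure, uval is updated to the value currently in *word. */
	if (atomic_compare_exchange_strong(word, &uval, vpid))
		return 1;

	*puval = uval;
	return 0;
}
```

With this convention a caller can test "ret < 0" for errors and "ret > 0" for success, at the cost of mixing two kinds of information in one integer.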
Re: [PATCH] clocksource/drivers/ti-32k: Prevent ftrace recursion
On Fri, 23 Sep 2016 10:04:31 +0800
Jisheng Zhang wrote:

> Hi Thomas,
>
> On Thu, 22 Sep 2016 15:58:03 +0200 Thomas Gleixner wrote:
>
> > On Thu, 22 Sep 2016, Jisheng Zhang wrote:
> >
> > > Currently ti-32k can be used as a scheduler clock. We properly marked
> > > omap_32k_read_sched_clock() as notrace but we then call another
> > > function ti_32k_read_cycles() that _wasn't_ notrace.
> > >
> > > Having a traceable function in the sched_clock() path leads to a
> > > recursion within ftrace and a kernel crash.
> >
> > Kernel crash? Doesn't ftrace core prevent recursion?
>
> a recent similar issue:
>
> http://www.spinics.net/lists/arm-kernel/msg533480.html

Right. But Thomas brought up recursion detection. And I said that would
be the fix, but now thinking about it, I've updated the recursion
protection so that timer issues should not cause a crash.

I'd like to know more, as this appears to be mostly arm related.

-- Steve
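The recursion being discussed can be pictured with a toy userspace model. This is only a sketch and not how ftrace is implemented: trace_hook(), read_clock_traced() and the single in_hook flag are hypothetical names, and a real tracer would keep such state per CPU or per context. The hook takes a timestamp, the clock read is itself instrumented, and the guard is what keeps the two from calling each other forever.

```c
#include <stdbool.h>

static int hook_calls;	/* how many times the hook body actually ran */
static bool in_hook;	/* recursion guard (hypothetical, global for simplicity) */

static unsigned long read_clock_traced(void);

/* Hypothetical tracer hook: timestamps every traced function entry. */
static void trace_hook(void)
{
	if (in_hook)
		return;		/* re-entered from the clock path: bail out */

	in_hook = true;
	hook_calls++;
	(void)read_clock_traced();	/* taking a timestamp re-enters the tracer */
	in_hook = false;
}

/*
 * A clock read that is itself traced, like a helper in the
 * sched_clock() path that lacks the notrace annotation.
 */
static unsigned long read_clock_traced(void)
{
	static unsigned long fake_cycles;

	trace_hook();		/* models the function-entry instrumentation */
	return ++fake_cycles;
}
```

Without the in_hook check the two functions would recurse until the stack overflows; marking the clock helper notrace removes the call to the hook altogether, which is what the patch under discussion does for ti_32k_read_cycles().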
[PATCH RT 02/10] timers: wakeup all timer waiters without holding the base lock
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

There should be no need to hold the base lock during the wakeup. There
should be no boosting involved, the wakeup list has its own lock so it
should be safe to do this without the lock.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index d5212147ae19..603699ff9411 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1313,8 +1313,8 @@ static inline void __run_timers(struct tvec_base *base)
 			}
 		}
 	}
-	wakeup_timer_waiters(base);
 	spin_unlock_irq(&base->lock);
+	wakeup_timer_waiters(base);
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
-- 
2.8.1
[PATCH RT 03/10] sched: lazy_preempt: avoid a warning in the !RT case
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 5ec35352b06b..8bad7e2d363c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3474,7 +3474,7 @@ static __always_inline int preemptible_lazy(void)
 
 #else
 
-static int preemptible_lazy(void)
+static inline int preemptible_lazy(void)
 {
 	return 1;
 }
-- 
2.8.1
[PATCH RT 01/10] timers: wakeup all timer waiters
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

The base lock is dropped during the invocation of the timer. That means
it is possible that we have one waiter while timer1 is running and once
this one finished, we get another waiter while timer2 is running. Since
we wake up only one waiter it is possible that we miss the other one.
This will probably heal itself over time because most of the time we
complete timers without an active wake up.
To avoid the scenario where we don't wake up all waiters at once,
wake_up_all() is used.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index fee8682c209e..d5212147ae19 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1051,7 +1051,7 @@ static void wait_for_running_timer(struct timer_list *timer)
 		   base->running_timer != timer);
 }
 
-# define wakeup_timer_waiters(b)	wake_up(&(b)->wait_for_running_timer)
+# define wakeup_timer_waiters(b)	wake_up_all(&(b)->wait_for_running_timer)
 #else
 static inline void wait_for_running_timer(struct timer_list *timer)
 {
-- 
2.8.1
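The scenario in the changelog (two waiters queue up while two timers run, then a single wakeup happens) can be reproduced in a toy model. The sketch below is plain userspace C with a hypothetical toy_waitqueue type; it only models the "wake a single waiter" versus "wake all waiters" behaviour the commit message describes and says nothing about how the kernel wait queues are implemented.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Toy wait queue: each slot records whether that waiter is still asleep. */
struct toy_waitqueue {
	bool sleeping[MAX_WAITERS];
	size_t nr;
};

static void toy_add_waiter(struct toy_waitqueue *wq)
{
	if (wq->nr < MAX_WAITERS)
		wq->sleeping[wq->nr++] = true;
}

/* Models the single-waiter wakeup: wakes at most one sleeper. */
static void toy_wake_one(struct toy_waitqueue *wq)
{
	for (size_t i = 0; i < wq->nr; i++) {
		if (wq->sleeping[i]) {
			wq->sleeping[i] = false;
			return;
		}
	}
}

/* Models the wake-all variant: wakes every sleeper. */
static void toy_wake_all(struct toy_waitqueue *wq)
{
	for (size_t i = 0; i < wq->nr; i++)
		wq->sleeping[i] = false;
}

/* Number of waiters that were not woken yet. */
static size_t toy_still_sleeping(const struct toy_waitqueue *wq)
{
	size_t n = 0;

	for (size_t i = 0; i < wq->nr; i++)
		n += wq->sleeping[i] ? 1 : 0;
	return n;
}
```

Adding two waiters and calling toy_wake_one() once leaves one of them asleep, which is the missed waiter from the changelog; toy_wake_all() leaves none.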
[PATCH RT 10/10] Linux 4.4.21-rt31-rc1
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)"

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index b72862e06be4..7f30ff78f82f 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt30
+-rt31-rc1
-- 
2.8.1
[PATCH RT 07/10] fs/dcache: resched/chill only if we make no progress
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Upstream commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()")
changed the condition _when_ cpu_relax() / cond_resched() was invoked.
This change was adapted in -RT into mostly the same thing except that if
cond_resched() did nothing we had to do cpu_chill() to force the task off
CPU for a tiny little bit in case the task had RT priority and did not
want to leave the CPU.
This change resulted in a performance regression (in my testcase the
build time on /dev/shm increased from 19min to 24min). The reason is that
with this change cpu_chill() was invoked even if dput() made progress
(dentry_kill() returned a different dentry) instead of only if we were
trying this operation on the same dentry over and over again.

This patch brings back the old behavior: cond_resched() & chill only if
we make no progress. A little improvement is to invoke cpu_chill() only
if we are a RT task (and avoid the sleep otherwise). Otherwise the
scheduler should remove us from the CPU if we make no progress.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 76f007eb28f8..3730c7f757ff 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -40,6 +40,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "internal.h"
 #include "mount.h"
 
@@ -748,6 +750,8 @@ static inline bool fast_dput(struct dentry *dentry)
  */
 void dput(struct dentry *dentry)
 {
+	struct dentry *parent;
+
 	if (unlikely(!dentry))
 		return;
 
@@ -784,14 +788,17 @@ repeat:
 	return;
 
 kill_it:
-	dentry = dentry_kill(dentry);
-	if (dentry) {
+	parent = dentry_kill(dentry);
+	if (parent) {
 		int r;
 
-		/* the task with the highest priority won't schedule */
-		r = cond_resched();
-		if (!r)
-			cpu_chill();
+		if (parent == dentry) {
+			/* the task with the highest priority won't schedule */
+			r = cond_resched();
+			if (!r && (rt_task(current) || dl_task(current)))
+				cpu_chill();
+		} else
+			dentry = parent;
 		goto repeat;
 	}
 }
-- 
2.8.1
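The retry decision this patch introduces can be written down as a small pure function. This is only a sketch of the logic visible in the hunk above; dput_retry_action() and its boolean parameters are hypothetical names, with "progress" standing for dentry_kill() returning a dentry other than the one passed in.

```c
#include <stdbool.h>

enum retry_action {
	RETRY_ONLY,		/* jump back to "repeat" right away */
	RETRY_AFTER_CHILL,	/* call cpu_chill() first, then repeat */
};

/*
 * progress:    dentry_kill() returned a different dentry (the parent)
 * rescheduled: cond_resched() actually scheduled
 * rt_or_dl:    the current task has RT or deadline priority
 */
static enum retry_action dput_retry_action(bool progress, bool rescheduled,
					   bool rt_or_dl)
{
	if (progress)
		return RETRY_ONLY;	/* moved on to the parent, no need to back off */

	if (!rescheduled && rt_or_dl)
		return RETRY_AFTER_CHILL;	/* nobody else got the CPU: sleep briefly */

	return RETRY_ONLY;
}
```

The only case that sleeps is "same dentry again, cond_resched() did nothing, and the task is RT or deadline", which is the narrow back-off the changelog argues for.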
[PATCH RT 00/10] Linux 4.4.21-rt31-rc1
Dear RT Folks,

This is the RT stable review cycle of patch 4.4.21-rt31-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 9/25/2016.

Enjoy,

-- Steve


To build 4.4.21-rt31-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v4.x/linux-4.4.tar.xz

  http://www.kernel.org/pub/linux/kernel/v4.x/patch-4.4.21.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/4.4/patch-4.4.21-rt31-rc1.patch.xz

You can also build from 4.4.21-rt30 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/4.4/incr/patch-4.4.21-rt30-rt31-rc1.patch.xz


Changes from 4.4.21-rt30:

---

Mike Galbraith (1):
      scsi/fcoe: Fix get_cpu()/put_cpu_light() imbalance in fcoe_recv_frame()

Sebastian Andrzej Siewior (8):
      timers: wakeup all timer waiters
      timers: wakeup all timer waiters without holding the base lock
      sched: lazy_preempt: avoid a warning in the !RT case
      net: add back the missing serialization in ip_send_unicast_reply()
      net: add a lock around icmp_sk()
      fs/dcache: resched/chill only if we make no progress
      x86/preempt-lazy: fixup should_resched()
      fs/dcache: incremental fixup of the retry routine

Steven Rostedt (Red Hat) (1):
      Linux 4.4.21-rt31-rc1

 arch/x86/include/asm/preempt.h | 17 +++--
 drivers/scsi/fcoe/fcoe.c       |  2 +-
 fs/dcache.c                    | 18 --
 kernel/sched/core.c            |  2 +-
 kernel/time/timer.c            |  4 ++--
 localversion-rt                |  2 +-
 net/ipv4/icmp.c                |  8
 net/ipv4/tcp_ipv4.c            |  7 +++
 8 files changed, 47 insertions(+), 13 deletions(-)
[PATCH RT 08/10] x86/preempt-lazy: fixup should_resched()
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

should_resched() returns true if NEED_RESCHED is set and the
preempt_count is 0 _or_ if NEED_RESCHED_LAZY is set ignoring the preempt
counter. Ignoring the preempt counter is wrong. This patch takes this into
account.
While at it, __preempt_count_dec_and_test() ignores preempt_lazy_count
while checking TIF_NEED_RESCHED_LAZY so we add this check, too.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 arch/x86/include/asm/preempt.h | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
index 5dbd2d0f91e0..6f432adc55cd 100644
--- a/arch/x86/include/asm/preempt.h
+++ b/arch/x86/include/asm/preempt.h
@@ -89,6 +89,8 @@ static __always_inline bool __preempt_count_dec_and_test(void)
 	if (preempt_count_dec_and_test())
 		return true;
 #ifdef CONFIG_PREEMPT_LAZY
+	if (current_thread_info()->preempt_lazy_count)
+		return false;
 	return test_thread_flag(TIF_NEED_RESCHED_LAZY);
 #else
 	return false;
@@ -101,8 +103,19 @@ static __always_inline bool __preempt_count_dec_and_test(void)
 static __always_inline bool should_resched(int preempt_offset)
 {
 #ifdef CONFIG_PREEMPT_LAZY
-	return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset ||
-			test_thread_flag(TIF_NEED_RESCHED_LAZY));
+	u32 tmp;
+
+	tmp = raw_cpu_read_4(__preempt_count);
+	if (tmp == preempt_offset)
+		return true;
+
+	/* preempt count == 0 ? */
+	tmp &= ~PREEMPT_NEED_RESCHED;
+	if (tmp)
+		return false;
+	if (current_thread_info()->preempt_lazy_count)
+		return false;
+	return test_thread_flag(TIF_NEED_RESCHED_LAZY);
 #else
 	return unlikely(raw_cpu_read_4(__preempt_count) == preempt_offset);
 #endif
-- 
2.8.1
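The new should_resched() logic can be exercised outside the kernel with a small model. The sketch below mirrors the control flow of the hunk above but takes all inputs as parameters; model_should_resched() is a hypothetical name, and the value of PREEMPT_NEED_RESCHED (0x80000000, with the bit being cleared when a reschedule is needed) is stated here from memory of the x86 header rather than from this patch, so treat it as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86 encoding: the bit is *cleared* when a resched is needed. */
#define PREEMPT_NEED_RESCHED 0x80000000u

/*
 * raw_count:         value of __preempt_count, including the inverted flag bit
 * preempt_offset:    the count the caller expects (0 in the common case)
 * lazy_count:        models current_thread_info()->preempt_lazy_count
 * need_resched_lazy: models TIF_NEED_RESCHED_LAZY
 */
static bool model_should_resched(uint32_t raw_count, uint32_t preempt_offset,
				 uint32_t lazy_count, bool need_resched_lazy)
{
	uint32_t tmp = raw_count;

	/* NEED_RESCHED set (bit cleared) and count matches the offset. */
	if (tmp == preempt_offset)
		return true;

	/* preempt count == 0 ? */
	tmp &= ~PREEMPT_NEED_RESCHED;
	if (tmp)
		return false;
	if (lazy_count)
		return false;
	return need_resched_lazy;
}
```

The case the patch fixes is a raw count of 0x80000001 with the lazy flag set: the old expression returned true there even though preemption was disabled, while the model above returns false.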
[PATCH RT 04/10] scsi/fcoe: Fix get_cpu()/put_cpu_light() imbalance in fcoe_recv_frame()
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Mike Galbraith

During master->rt merge, I stumbled across the buglet below.

Fix get_cpu()/put_cpu_light() imbalance.

Cc: stable...@vger.kernel.org
Signed-off-by: Mike Gabraith
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 drivers/scsi/fcoe/fcoe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index f1622a05854b..cbbbebd86c6e 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1814,7 +1814,7 @@ static void fcoe_recv_frame(struct sk_buff *skb)
 	 */
 	hp = (struct fcoe_hdr *) skb_network_header(skb);
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	stats = per_cpu_ptr(lport->stats, get_cpu_light());
 	if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) {
 		if (stats->ErrorFrames < 5)
 			printk(KERN_WARNING "fcoe: FCoE version "
-- 
2.8.1
Re: [PATCH V6 3/5] PCI: thunder-pem: Allow to probe PEM-specific register range for ACPI case
On Thu, Sep 22, 2016 at 01:31:01PM -0500, Bjorn Helgaas wrote: > On Thu, Sep 22, 2016 at 01:44:46PM +0100, Lorenzo Pieralisi wrote: > > On Thu, Sep 22, 2016 at 11:10:13AM +, Gabriele Paoloni wrote: > > > Hi Lorenzo, Bjorn > > > > > > > -Original Message- > > > > From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com] > > > > Sent: 22 September 2016 10:50 > > > > To: Bjorn Helgaas > > > > Cc: Ard Biesheuvel; Tomasz Nowicki; David Daney; Will Deacon; Catalin > > > > Marinas; Rafael Wysocki; Arnd Bergmann; Hanjun Guo; Sinan Kaya; > > > > Jayachandran C; Christopher Covington; Duc Dang; Robert Richter; Marcin > > > > Wojtas; Liviu Dudau; Wangyijing; Mark Salter; linux- > > > > p...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; Linaro ACPI > > > > Mailman List; Jon Masters; Andrea Gallo; Jeremy Linton; liudongdong > > > > (C); Gabriele Paoloni; Jeff Hugo; linux-a...@vger.kernel.org; linux- > > > > ker...@vger.kernel.org; Rafael J. Wysocki > > > > Subject: Re: [PATCH V6 3/5] PCI: thunder-pem: Allow to probe PEM- > > > > specific register range for ACPI case > > > > > > > > On Wed, Sep 21, 2016 at 01:04:57PM -0500, Bjorn Helgaas wrote: > > > > > On Wed, Sep 21, 2016 at 03:05:49PM +0100, Lorenzo Pieralisi wrote: > > > > > > On Tue, Sep 20, 2016 at 02:17:44PM -0500, Bjorn Helgaas wrote: > > > > > > > On Tue, Sep 20, 2016 at 04:09:25PM +0100, Ard Biesheuvel wrote: > > > > > > > > > > > > [...] > > > > > > > > > > > > > > None of these platforms can be fixed entirely in software, and > > > > given > > > > > > > > that we will not be adding quirks for new broken hardware, we > > > > should > > > > > > > > ask ourselves whether having two versions of a quirk, i.e., one > > > > for > > > > > > > > broken hardware + currently shipping firmware, and one for the > > > > same > > > > > > > > broken hardware with fixed firmware is really an improvement > > > > over what > > > > > > > > has been proposed here. 
> > > > > > > > > > > > > > We're talking about two completely different types of quirks: > > > > > > > > > > > > > > 1) MCFG quirks to use memory-mapped config space that doesn't > > > > quite > > > > > > > conform to the ECAM model in the PCIe spec, and > > > > > > > > > > > > > > 2) Some yet-to-be-determined method to describe address space > > > > > > > consumed by a bridge. > > > > > > > > > > > > > > The first two patches of this series are a nice implementation > > > > for 1). > > > > > > > The third patch (ThunderX-specific) is one possibility for 2), > > > > but I > > > > > > > don't like it because there's no way for generic software like > > > > the > > > > > > > ACPI core to discover these resources. > > > > > > > > > > > > Ok, so basically this means that to implement (2) we need to assign > > > > > > some sort of _HID to these quirky PCI bridges (so that we know what > > > > > > device they represent and we can retrieve their _CRS). I take from > > > > > > this discussion that the goal is to make sure that all non-config > > > > > > resources have to be declared through _CRS device objects, which is > > > > > > fine but that requires a FW update (unless we can fabricate ACPI > > > > > > devices and corresponding _CRS in the kernel whenever we match a > > > > > > given MCFG table signature). > > > > > > > > > > All resources consumed by ACPI devices should be declared through > > > > > _CRS. If you want to fabricate ACPI devices or _CRS via kernel > > > > > quirks, that's fine with me. This could be triggered via MCFG > > > > > signature, DMI info, host bridge _HID, etc. > > > > > > > > I think the PNP quirk approach + PNP0c02 resource put forward by Gab > > > > is enough. 
> > > > > > Great thanks as we take a final decision I will ask Dogndgong to submit > > > another RFC based on this approach > > > > > > > > > > > > > We discussed this already and I think we should make a decision: > > > > > > > > > > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2016- > > > > March/414722.html > > > > > > > > > > > > > > > I'd like to step back and come up with some understanding of > > > > how > > > > > > > > > non-broken firmware *should* deal with this issue. Then, if > > > > we *do* > > > > > > > > > work around this particular broken firmware in the kernel, it > > > > would be > > > > > > > > > nice to do it in a way that fits in with that understanding. > > > > > > > > > > > > > > > > > > For example, if a companion ACPI device is the preferred > > > > solution, an > > > > > > > > > ACPI quirk could fabricate a device with the required > > > > resources. That > > > > > > > > > would address the problem closer to the source and make it > > > > more likely > > > > > > > > > that the rest of the system will work correctly: /proc/iomem > > > > could > > > > > > > > > make sense, things that look at _CRS generically would work > > > > (e.g, > > > > > > > > > /sys/, an admittedly hypothetical "lsacpi", etc.) > > > > > > > > > > > > > > > > > > Hard-coding
[PATCH RT 09/10] fs/dcache: incremental fixup of the retry routine
3.18.42-rt45-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

It has been pointed out by tglx that on UP the non-RT task could spin
its entire time slice because the lock owner is preempted. This won't
happen on !RT. So we go back to "chill" if cond_resched() did not work.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 44a3419c7125..986acc945c06 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -39,8 +39,6 @@
 #include
 #include
 #include
-#include
-#include
 #include "internal.h"
 #include "mount.h"
 
@@ -769,10 +767,11 @@ kill_it:
 		if (parent == dentry) {
 			/* the task with the highest priority won't schedule */
 			r = cond_resched();
-			if (!r && (rt_task(current) || dl_task(current)))
+			if (!r)
 				cpu_chill();
-		} else
+		} else {
 			dentry = parent;
+		}
 		goto repeat;
 	}
 }
-- 
2.8.1
RE: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out
>
>So this is impossible without THP swapin. While 2M swapout makes a lot of
>sense, I doubt 2M swapin is really useful. What kind of application is
>'optimized' to do sequential memory access?

We waste a lot of cpu cycles to re-compact 4K pages back to a large page
under THP. Swapping it back in as a single large page can avoid
fragmentation and this overhead.

Thanks.

Tim
[PATCH v2 4/4] mmc: sdhci-of-arasan: add sdhci_arasan_voltage_switch for arasan,5.1
Per the vendor's requirement, we shouldn't do any setting for
1.8V Signaling Enable, otherwise the interaction/behaviour
between phy and controller will be undefined. Mostly it works
fine if we do that, but we still see failures. Anyway, let's
fix it to meet the vendor's requirement. The error log looks like:

[   93.405085] mmc1: unexpected status 0x800900 after switch
[   93.408474] mmc1: switch to bus width 1 failed
[   93.408482] mmc1: mmc_select_hs200 failed, error -110
[   93.408492] mmc1: error -110 during resume (card was removed?)
[   93.408705] PM: resume of devices complete after 213.453 msecs

Signed-off-by: Shawn Lin
---

Changes in v2: None

 drivers/mmc/host/sdhci-of-arasan.c | 24
 1 file changed, 24 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
index da8e40a..1573a8d 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -265,6 +265,28 @@ void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
 	}
 }
 
+static int sdhci_arasan_voltage_switch(struct mmc_host *mmc,
+				       struct mmc_ios *ios)
+{
+	switch (ios->signal_voltage) {
+	case MMC_SIGNAL_VOLTAGE_180:
+		/*
+		 * Please don't switch to 1V8 as arasan,5.1 doesn't
+		 * actually refer to this setting to indicate the
+		 * signal voltage and the state machine will be broken
+		 * actually if we force to enable 1V8. That's something
+		 * like broken quirk but we could work around here.
+		 */
+		return 0;
+	case MMC_SIGNAL_VOLTAGE_330:
+	case MMC_SIGNAL_VOLTAGE_120:
+		/* We don't support 3V3 and 1V2 */
+		break;
+	}
+
+	return -EINVAL;
+}
+
 static struct sdhci_ops sdhci_arasan_ops = {
 	.set_clock = sdhci_arasan_set_clock,
 	.get_max_clock = sdhci_pltfm_clk_get_max_clock,
@@ -661,6 +683,8 @@ static int sdhci_arasan_probe(struct platform_device *pdev)
 
 		host->mmc_host_ops.hs400_enhanced_strobe =
 					sdhci_arasan_hs400_enhanced_strobe;
+		host->mmc_host_ops.start_signal_voltage_switch =
+					sdhci_arasan_voltage_switch;
 	}
 
 	ret = sdhci_add_host(host);
-- 
2.3.7
[PATCH v2 3/4] mmc: sdhci: Don't try to switch to unsupported voltage
From: Ziyuan Xu

Sdhci shouldn't switch to the unsupported voltage if claiming
that it can not support the requested voltage. Let's fix it.

Signed-off-by: Ziyuan Xu
Signed-off-by: Shawn Lin
---

Changes in v2: None

 drivers/mmc/host/sdhci.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 4805566..b1f1edd 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1845,7 +1845,8 @@ static int sdhci_start_signal_voltage_switch(struct mmc_host *mmc,
 
 	switch (ios->signal_voltage) {
 	case MMC_SIGNAL_VOLTAGE_330:
-		if (!(host->flags & SDHCI_SIGNALING_330))
+		if (!(host->flags & SDHCI_SIGNALING_330) ||
+		    !(host->caps & SDHCI_CAN_VDD_330))
 			return -EINVAL;
 		/* Set 1.8V Signal Enable in the Host Control2 register to 0 */
 		ctrl &= ~SDHCI_CTRL_VDD_180;
@@ -1872,7 +1873,8 @@ static int sdhci_start_signal_voltage_switch(struct mmc_host *mmc,
 
 		return -EAGAIN;
 	case MMC_SIGNAL_VOLTAGE_180:
-		if (!(host->flags & SDHCI_SIGNALING_180))
+		if (!(host->flags & SDHCI_SIGNALING_180) ||
+		    !(host->caps & SDHCI_CAN_VDD_180))
 			return -EINVAL;
 		if (!IS_ERR(mmc->supply.vqmmc)) {
 			ret = mmc_regulator_set_vqmmc(mmc, ios);
-- 
2.3.7
[PATCH v2 2/4] mmc: core: changes frequency to hs_max_dtr when selecting hs400es
Per JESD84-B51 P69, the host needs to change the frequency to <= 52MHz
after setting HS_TIMING to 0x1, and the host may change the frequency
to <= 200MHz after setting HS_TIMING to 0x3. That means the card
expects the clock rate to increase from the currently used f_init
(which is less than 400KHz, and therefore also less than 52MHz) to
52MHz; otherwise we find that some eMMC devices report a significant
number of failures when sending status.

Reported-by: Xiao Yao
Signed-off-by: Shawn Lin
---

Changes in v2:
- improve the changelog

 drivers/mmc/core/mmc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index f4ed5ac..39fc5b2 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1282,6 +1282,8 @@ static int mmc_select_hs400es(struct mmc_card *card)
 	if (err)
 		goto out_err;
 
+	mmc_set_clock(host, card->ext_csd.hs_max_dtr);
+
 	err = mmc_switch_status(card);
 	if (err)
 		goto out_err;
-- 
2.3.7
Re: [PATCH] PCI: rockchip: fix uninitialized variable use
Hi Arnd,

On 2016/9/22 17:39, Arnd Bergmann wrote:
> The newly added pcie-rockchip driver fails to initialize the
> io_size variable if the DT doesn't provide ranges for the PCI I/O
> space, as found by building it with -Wmaybe-uninitialized:
>
> drivers/pci/host/pcie-rockchip.c: In function 'rockchip_pcie_probe':
> drivers/pci/host/pcie-rockchip.c:1007:6: warning: 'io_size' may be used uninitialized in this function [-Wmaybe-uninitialized]

Seems like we missed this when refactoring the code a bit.
Thanks for fixing it.

Acked-by: Shawn Lin

> This adds an appropriate initialization immediately in front of
> the loop, so the io_size is zero as expected afterwards for
> that case.
>
> Fixes: abe17181b16f ("PCI: rockchip: Add Rockchip PCIe controller support")
> Signed-off-by: Arnd Bergmann
> ---
>  drivers/pci/host/pcie-rockchip.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index c3593e633ccd..8bedc1e1ef80 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -1078,6 +1078,7 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
>  		goto err_vpcie;
>
>  	/* Get the I/O and memory ranges from DT */
> +	io_size = 0;
>  	resource_list_for_each_entry(win, &res) {
>  		switch (resource_type(win->res)) {
>  		case IORESOURCE_IO:

-- 
Best Regards
Shawn Lin
[PATCH 03/11 V2] staging: dgnc: missing NULL check for ioremap in dgnc_do_remap()
The ioremap() function can fail, so dgnc_do_remap() needs to handle
that error. The return type of dgnc_do_remap() is therefore changed
from "void" to "int".

Signed-off-by: Daeseok Youn
---
V2: the subject line was cut off, I put it completely.

 drivers/staging/dgnc/dgnc_driver.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/dgnc/dgnc_driver.c b/drivers/staging/dgnc/dgnc_driver.c
index c87b3de..58cebf4 100644
--- a/drivers/staging/dgnc/dgnc_driver.c
+++ b/drivers/staging/dgnc/dgnc_driver.c
@@ -43,7 +43,7 @@ static void dgnc_cleanup_board(struct dgnc_board *brd);
 static void		dgnc_poll_handler(ulong dummy);
 static int		dgnc_init_one(struct pci_dev *pdev,
 				      const struct pci_device_id *ent);
-static void		dgnc_do_remap(struct dgnc_board *brd);
+static int		dgnc_do_remap(struct dgnc_board *brd);
 
 /*
  * File operations permitted on Control/Management major.
@@ -431,7 +431,10 @@ static int dgnc_found_board(struct pci_dev *pdev, int id)
 		brd->bd_uart_offset = 0x8;
 		brd->bd_dividend = 921600;
 
-		dgnc_do_remap(brd);
+		rc = dgnc_do_remap(brd);
+
+		if (rc < 0)
+			goto failed;
 
 		/* Get and store the board VPD, if it exists */
 		brd->bd_ops->vpd(brd);
@@ -483,15 +486,17 @@ static int dgnc_found_board(struct pci_dev *pdev, int id)
 		brd->bd_uart_offset = 0x200;
 		brd->bd_dividend = 921600;
 
-		dgnc_do_remap(brd);
+		rc = dgnc_do_remap(brd);
 
-		if (brd->re_map_membase) {
-			/* Read and store the dvid after remapping */
-			brd->dvid = readb(brd->re_map_membase + 0x8D);
+		if (rc < 0)
+			goto failed;
+
+		/* Read and store the dvid after remapping */
+		brd->dvid = readb(brd->re_map_membase + 0x8D);
+
+		/* Get and store the board VPD, if it exists */
+		brd->bd_ops->vpd(brd);
 
-			/* Get and store the board VPD, if it exists */
-			brd->bd_ops->vpd(brd);
-		}
 		break;
 
 	default:
@@ -566,9 +571,15 @@ static int dgnc_finalize_board_init(struct dgnc_board *brd)
 /*
  * Remap PCI memory.
  */
-static void dgnc_do_remap(struct dgnc_board *brd)
+static int dgnc_do_remap(struct dgnc_board *brd)
 {
+	int rc = 0;
+
 	brd->re_map_membase = ioremap(brd->membase, 0x1000);
+	if (!brd->re_map_membase)
+		rc = -ENOMEM;
+
+	return rc;
 }
 
 /*
-- 
1.9.1
Re: Should drivers like nvme let userspace control their latency via dev_pm_qos?
On Thu, Sep 22, 2016 at 6:26 PM, Rafael J. Wysocki wrote:
> On 9/16/2016 5:26 PM, Andy Lutomirski wrote:
>>
>> I'm adding power management to the nvme driver, and I'm exposing
>> exactly one knob via sysfs: the maximum permissible latency. This
>> isn't a power domain issue, and it has no dependencies -- it's
>> literally just the maximum latency that the driver may impose on I/O
>> for power saving purposes.
>>
>> ISTM userspace should be able to specify its own latency tolerance in
>> a uniform way, and dev_pm_qos seems like the natural interface for
>> this, except that I cannot find a single instance in the tree of *any*
>> driver using it via the notifier mechanism.
>
>
> That's because the notifier mechanism is only used for the "resume latency"
> type of constraints.
>
>> I can find two drivers that do it using
>> dev_pm_qos_expose_latency_tolerance(), and both are LPSS drivers?
>
>
> That's correct. Nobody else has used it so far. :-)
>
>> So: should I be exposing .set_latency_tolerance() or should I just use
>> a custom sysfs attribute? Or both?
>
>
> dev_pm_qos_expose_latency_tolerance() adds a single latency tolerance
> request object to the device and exposes a knob in user space by which that
> request object can be controlled. There may be more latency tolerance
> request objects for the same device if kernel code adds them. The effective
> latency tolerance is the minimum of all those requests and the callback is
> invoked every time that effective value changes.
>
> This also is described in the last section of
> Documentation/power/pm_qos_interface.txt (note that if the
> .set_latency_tolerance callback is present at the device registration time
> already, the latency tolerance sysfs attribute will be exposed automatically
> by the driver core).
>
> If that mechanism is suitable for the use case in question, I'd just use it.

OK, I'll play with it.
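The aggregation rule described above (the effective tolerance is the minimum of all requests, and the callback fires only when that minimum changes) can be sketched in plain userspace C. Nothing below is the dev_pm_qos API: struct tolerance_agg, tolerance_update(), record_change() and the INT_MAX "no constraint" placeholder are all hypothetical names chosen for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Placeholder meaning "no request constrains the device" (assumption). */
#define TOLERANCE_NO_CONSTRAINT INT_MAX

struct tolerance_agg {
	const int *requests;	/* latency tolerance requests, in usec */
	size_t nr;		/* number of entries in requests[] */
	int effective;		/* last effective value reported */
	void (*changed)(int new_value);	/* models .set_latency_tolerance */
};

/* Recompute the minimum and notify only if the effective value changed. */
static void tolerance_update(struct tolerance_agg *agg)
{
	int min = TOLERANCE_NO_CONSTRAINT;

	for (size_t i = 0; i < agg->nr; i++)
		if (agg->requests[i] < min)
			min = agg->requests[i];

	if (min != agg->effective) {
		agg->effective = min;
		if (agg->changed)
			agg->changed(min);
	}
}

/* Example callback that just records what it was told. */
static int last_reported = TOLERANCE_NO_CONSTRAINT;
static int nr_callbacks;

static void record_change(int new_value)
{
	last_reported = new_value;
	nr_callbacks++;
}
```

Changing a request that is not the current minimum leaves the effective value alone, so the driver callback is not bothered; only a change to the tightest request reaches it.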
Re: [PATCH net-next] Documentation: devicetree: revise ethernet device-tree binding about TRGMII
Date: Thu, 22 Sep 2016 19:48:47 +0300, Sergei Shtylyov wrote:
>On 09/22/2016 07:16 PM, sean.w...@mediatek.com wrote:
>
>> From: Sean Wang
>>
>> fix typo in mediatek-net.txt and add phy-mode "trgmii" to ethernet.txt
>
>These changes are unrelated to each other, so there should be 2 separate
>patches. And have the patches I reviewed been merged already, why are you
>sending an incremental patch?
>

Okay, I will split them into distinct patches.

I saw they had been applied, so I created an incremental patch based on
the codebase after they were applied.

>> Cc: devicet...@vger.kernel.org
>> Reported-by: Sergei Shtylyov
>> Signed-off-by: Sean Wang
>[...]
>
>MBR, Sergei
>
>
Re: [PATCH] clocksource/drivers/ti-32k: Prevent ftrace recursion
On Thu, 22 Sep 2016 22:45:14 -0400 Steven Rostedt wrote:

> On Fri, 23 Sep 2016 10:04:31 +0800
> Jisheng Zhang wrote:
>
> > Hi Thomas,
> >
> > On Thu, 22 Sep 2016 15:58:03 +0200 Thomas Gleixner wrote:
> >
> > > On Thu, 22 Sep 2016, Jisheng Zhang wrote:
> > >
> > > > Currently ti-32k can be used as a scheduler clock. We properly marked
> > > > omap_32k_read_sched_clock() as notrace but we then call another
> > > > function ti_32k_read_cycles() that _wasn't_ notrace.
> > > >
> > > > Having a traceable function in the sched_clock() path leads to a
> > > > recursion within ftrace and a kernel crash.
> > >
> > > Kernel crash? Doesn't ftrace core prevent recursion?
> >
> > a recent similar issue:
> >
> > http://www.spinics.net/lists/arm-kernel/msg533480.html
>
> Right. But Thomas brought up recursion detection. And I said that would
> be the fix, but now thinking about it, I've updated the recursion
> protection so that timer issues should not cause a crash.
>

Got it. Thanks for the clarification
Re: [RFC v7 00/23] adapt clockevents frequencies to mono clock
On Wed, 21 Sep 2016, Nicolai Stange wrote:
> Thomas Gleixner writes:
>
> > On Wed, 21 Sep 2016, Nicolai Stange wrote:
> >> Thomas Gleixner writes:
> >> > Have you ever measured the overhead of the extra work which has to be
> >> > done
> >> > in clockevents_adjust_all_freqs() ?
> >>
> >> Not exactly, I had a look at its invocation frequency which seems to
> >> decay exponentially with uptime, presumably because the NTP error
> >> approaches zero.
> >>
> >> However, I've just gathered a function_graph ftrace on my Intel
> >> i7-4800MQ (Haswell, 8HTs):
> >>
> >> # TIME        CPU  DURATION                  FUNCTION CALLS
> >> #  |          |     |   |                     |   |   |   |
> >>    85.287027 |   0)   0.899 us    |  clockevents_adjust_all_freqs();
> >>    85.288026 |   0)   0.759 us    |  clockevents_adjust_all_freqs();
> >>    85.289026 |   0)   0.735 us    |  clockevents_adjust_all_freqs();
> >>    85.290026 |   0)   0.671 us    |  clockevents_adjust_all_freqs();
> >>   149.503656 |   2)   2.477 us    |  clockevents_adjust_all_freqs();
> >
> > That's not that bad. Though I'd like to see numbers for ARM (especially the
> > less powerful SoCs) as well.
>
> On a Raspberry Pi 2B (bcm2836, ARMv7) with CONFIG_SMP=y, the mean over
> ~5300 samples is 5.14+/-1.04us with a max of 11.15us.

So why is the variance that high? You have an outlier on that intel as well
which might be caused by NMI, but it might also be a systematic issue
depending on the input parameters. 11 us on that ARM worries me.

Thanks,

	tglx
[PATCH RT 09/10] fs/dcache: incremental fixup of the retry routine
4.4.21-rt31-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

It has been pointed out by tglx that on UP the non-RT task could spin
its entire time slice because the lock owner is preempted. This won't
happen on !RT. So we go back to "chill" if cond_resched() did not work.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 3730c7f757ff..e80471cbfc19 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -40,8 +40,6 @@
 #include
 #include
 #include
-#include
-#include
 #include "internal.h"
 #include "mount.h"
 
@@ -795,10 +793,11 @@ kill_it:
 		if (parent == dentry) {
 			/* the task with the highest priority won't schedule */
 			r = cond_resched();
-			if (!r && (rt_task(current) || dl_task(current)))
+			if (!r)
 				cpu_chill();
-		} else
+		} else {
 			dentry = parent;
+		}
 		goto repeat;
 	}
 }
-- 
2.8.1
Re: [PATCH v5 3/6] x86/arch_prctl Add a new do_arch_prctl
On Wed, Sep 21, 2016 at 11:58 AM, Kyle Huey wrote:
> Add a new do_arch_prctl to handle arch_prctls that are not specific to 64
> bits. Call it from the syscall entry point, but not any of the other
> callsites in the kernel, which all want one of the existing 64 bit only
> arch_prctls.
>
> Signed-off-by: Kyle Huey
> ---
>  arch/x86/include/asm/proto.h | 1 +
>  arch/x86/kernel/process.c    | 5 +
>  arch/x86/kernel/process_64.c | 8 +++-
>  3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
> index 95c3e51..94a57cc 100644
> --- a/arch/x86/include/asm/proto.h
> +++ b/arch/x86/include/asm/proto.h
> @@ -30,6 +30,7 @@ void x86_report_nx(void);
>
>  extern int reboot_force;
>
> +long do_arch_prctl(struct task_struct *task, int code, unsigned long arg2);
>  #ifdef CONFIG_X86_64
>  long do_arch_prctl_64(struct task_struct *task, int code, unsigned long arg2);
>  #endif
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 62c0b0e..97aa104 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -567,3 +567,8 @@ unsigned long get_wchan(struct task_struct *p)
>  	} while (count++ < 16 && p->state != TASK_RUNNING);
>  	return 0;
>  }
> +
> +long do_arch_prctl(struct task_struct *task, int code, unsigned long arg2)
> +{
> +	return -EINVAL;
> +}

This should be do_arch_prctl_common or similar to avoid confusion.

P.S. The subject should be "x86/arch_prctl: ...".
Re: modules still have .debug_* (was Re: [PATCH 0/3] ARC unwinder switch to .eh_frame)
On Thu, Sep 22, 2016 at 1:59 PM, Vineet Gupta wrote:
> Hi Daniel,
>
> On 09/19/2016 06:21 PM, Daniel Mentz wrote:
>> I confirmed that the .eh_frame section is present and that the
>> .debug_frame section is absent. I also verified that the file size of
>> the .ko files are small enough for our embedded platform and that
>> unnecessary sections like .debug_info, .debug_line, .debug_str etc.
>> are also absent.
>
> BTW it seems with my latest set of patches, modules still have .debug_*.
> Can you double check if your tree still has the interim patch which added a
> linker script for modules to strip out .debug_*
>
> http://lists.infradead.org/pipermail/linux-snps-arc/2016-September/001483.html

Hi Vineet,

Sorry, that was a misunderstanding. Buildroot routinely runs the strip
command on .ko files before installing them on the target. I was only
looking at the .ko files *after* running the strip command.

No, the interim patch was not in my tree. I confirmed that your commit
"ARC: dw2 unwind: don't force dwarf 2" is indeed necessary to suppress
the .debug_* sections when CONFIG_DEBUG_INFO is off. But again, we're
stripping .ko files anyways before installing.

> I'm not planning to carry it and would prefer addressing the root cause by
> removing the -gdwarf-2 toggle. I've added that and pushed rebased series.
> Care to take it for a respin please.

I downloaded your latest commit
e47305af57d7eedc10b4720e604d669b10c69e3b and verified that stack
traces are properly displayed for code inside kernel modules as well
as vmlinux. I also called memcpy() on some bad address and got a
proper stack trace that involved memcpy().

I conclude that unwinding works for us.

Thank You
 Daniel
[PATCH] KVM: VMX: refactor global page-sized bitmaps
We've had 10 page-sized bitmaps that were being allocated and freed one by one when we could just use a cycle and MSR bitmaps had a lot of useless code lying around. This patch * enumerates vmx bitmaps and uses an array to store them * replaces vmx_enable_intercept_msr_read_x2apic() with a condition * joins vmx_msr_disable_intercept_msr_{read,write}_x2apic() * renames x2apic_apicv_inactive msr_bitmaps to x2apic and original x2apic bitmaps to x2apic_apicv Signed-off-by: Radim Krčmář--- arch/x86/kvm/vmx.c | 297 + 1 file changed, 92 insertions(+), 205 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 1cca146f4341..dfbcd45fcb2b 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -921,16 +921,20 @@ static DEFINE_PER_CPU(struct desc_ptr, host_gdt); static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu); static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock); -static unsigned long *vmx_io_bitmap_a; -static unsigned long *vmx_io_bitmap_b; -static unsigned long *vmx_msr_bitmap_legacy; -static unsigned long *vmx_msr_bitmap_longmode; -static unsigned long *vmx_msr_bitmap_legacy_x2apic; -static unsigned long *vmx_msr_bitmap_longmode_x2apic; -static unsigned long *vmx_msr_bitmap_legacy_x2apic_apicv_inactive; -static unsigned long *vmx_msr_bitmap_longmode_x2apic_apicv_inactive; -static unsigned long *vmx_vmread_bitmap; -static unsigned long *vmx_vmwrite_bitmap; +enum vmx_bitmap { + vmx_io_bitmap_a, + vmx_io_bitmap_b, + vmx_msr_bitmap_legacy, + vmx_msr_bitmap_legacy_x2apic, + vmx_msr_bitmap_legacy_x2apic_apicv, + vmx_msr_bitmap_longmode, + vmx_msr_bitmap_longmode_x2apic, + vmx_msr_bitmap_longmode_x2apic_apicv, + vmx_vmread_bitmap, + vmx_vmwrite_bitmap, + VMX_BITMAP_NR +}; +static unsigned long *vmx_bitmap[VMX_BITMAP_NR]; static bool cpu_has_load_ia32_efer; static bool cpu_has_load_perf_global_ctrl; @@ -2519,23 +2523,26 @@ static void move_msr_up(struct vcpu_vmx *vmx, int from, int to) static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu) { 
- unsigned long *msr_bitmap; + enum vmx_bitmap msr_bitmap; - if (is_guest_mode(vcpu)) - msr_bitmap = to_vmx(vcpu)->nested.msr_bitmap; - else if (cpu_has_secondary_exec_ctrls() && -(vmcs_read32(SECONDARY_VM_EXEC_CONTROL) & - SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) { + if (is_guest_mode(vcpu)) { + vmcs_write64(MSR_BITMAP, __pa(to_vmx(vcpu)->nested.msr_bitmap)); + return; + } + + if (cpu_has_secondary_exec_ctrls() && + (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) & + SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) { if (enable_apicv && kvm_vcpu_apicv_active(vcpu)) { if (is_long_mode(vcpu)) + msr_bitmap = vmx_msr_bitmap_longmode_x2apic_apicv; + else + msr_bitmap = vmx_msr_bitmap_legacy_x2apic_apicv; + } else { + if (is_long_mode(vcpu)) msr_bitmap = vmx_msr_bitmap_longmode_x2apic; else msr_bitmap = vmx_msr_bitmap_legacy_x2apic; - } else { - if (is_long_mode(vcpu)) - msr_bitmap = vmx_msr_bitmap_longmode_x2apic_apicv_inactive; - else - msr_bitmap = vmx_msr_bitmap_legacy_x2apic_apicv_inactive; } } else { if (is_long_mode(vcpu)) @@ -2544,7 +2551,7 @@ static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu) msr_bitmap = vmx_msr_bitmap_legacy; } - vmcs_write64(MSR_BITMAP, __pa(msr_bitmap)); + vmcs_write64(MSR_BITMAP, __pa(vmx_bitmap[msr_bitmap])); } /* @@ -3600,13 +3607,13 @@ static void init_vmcs_shadow_fields(void) /* shadowed fields guest access without vmexit */ for (i = 0; i < max_shadow_read_write_fields; i++) { clear_bit(shadow_read_write_fields[i], - vmx_vmwrite_bitmap); + vmx_bitmap[vmx_vmwrite_bitmap]); clear_bit(shadow_read_write_fields[i], - vmx_vmread_bitmap); + vmx_bitmap[vmx_vmread_bitmap]); } for (i = 0; i < max_shadow_read_only_fields; i++) clear_bit(shadow_read_only_fields[i], - vmx_vmread_bitmap); + vmx_bitmap[vmx_vmread_bitmap]); } static __init int alloc_kvm_area(void) @@ -4601,41 +4608,6 @@ static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap, } } -static void __vmx_enable_intercept_for_msr(unsigned long *msr_bitmap, -
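For readers skimming the diff, the core idea (an enum that indexes one array of page-sized bitmaps, so allocation and freeing become loops) can be sketched in user space as below. This is only an illustration under stated assumptions: the enum entries, SKETCH_PAGE_SIZE, alloc_bitmaps() and free_bitmaps() are hypothetical names, and calloc()/free() stand in for the kernel page allocator used by the real code.

```c
#include <stdlib.h>

/* Assumed page size for this user-space sketch. */
#define SKETCH_PAGE_SIZE 4096

/* Illustrative names only, not the identifiers from the patch. */
enum bitmap_id {
	BITMAP_IO_A,
	BITMAP_IO_B,
	BITMAP_MSR_LEGACY,
	BITMAP_MSR_LONGMODE,
	BITMAP_NR		/* number of bitmaps, must stay last */
};

static unsigned long *bitmaps[BITMAP_NR];

/* Free every bitmap and reset the slots; safe to call more than once. */
static void free_bitmaps(void)
{
	int i;

	for (i = 0; i < BITMAP_NR; i++) {
		free(bitmaps[i]);
		bitmaps[i] = NULL;
	}
}

/* Allocate all bitmaps in one loop; returns 0 on success, -1 on failure. */
static int alloc_bitmaps(void)
{
	int i;

	for (i = 0; i < BITMAP_NR; i++) {
		bitmaps[i] = calloc(1, SKETCH_PAGE_SIZE);
		if (!bitmaps[i]) {
			/* Unwind whatever was allocated so far. */
			free_bitmaps();
			return -1;
		}
	}
	return 0;
}
```

A caller would then write bitmaps[BITMAP_MSR_LEGACY] wherever a dedicated global pointer was used before, which is the same substitution the patch makes with vmx_bitmap[...].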
[PATCH RT 08/10] x86/preempt-lazy: fixup should_resched()
3.18.42-rt45-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej Siewiorshould_resched() returns true if NEED_RESCHED is set and the preempt_count is 0 _or_ if NEED_RESCHED_LAZY is set ignoring the preempt counter. Ignoring the preemp counter is wrong. This patch adds this into account. While at it, __preempt_count_dec_and_test() ignores preempt_lazy_count while checking TIF_NEED_RESCHED_LAZY so we this check, too. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- arch/x86/include/asm/preempt.h | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h index 060d1b4e475d..6806369bddf5 100644 --- a/arch/x86/include/asm/preempt.h +++ b/arch/x86/include/asm/preempt.h @@ -95,6 +95,8 @@ static __always_inline bool __preempt_count_dec_and_test(void) if (preempt_count_dec_and_test()) return true; #ifdef CONFIG_PREEMPT_LAZY + if (current_thread_info()->preempt_lazy_count) + return false; return test_thread_flag(TIF_NEED_RESCHED_LAZY); #else return false; @@ -107,8 +109,18 @@ static __always_inline bool __preempt_count_dec_and_test(void) static __always_inline bool should_resched(void) { #ifdef CONFIG_PREEMPT_LAZY - return unlikely(!raw_cpu_read_4(__preempt_count) || \ - test_thread_flag(TIF_NEED_RESCHED_LAZY)); + u32 tmp; + + if (!raw_cpu_read_4(__preempt_count)) + return true; + + /* preempt count == 0 ? */ + tmp &= ~PREEMPT_NEED_RESCHED; + if (tmp) + return false; + if (current_thread_info()->preempt_lazy_count) + return false; + return test_thread_flag(TIF_NEED_RESCHED_LAZY); #else return unlikely(!raw_cpu_read_4(__preempt_count)); #endif -- 2.8.1
[PATCH RT 04/10] scsi/fcoe: Fix get_cpu()/put_cpu_light() imbalance in fcoe_recv_frame()
3.18.42-rt45-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Mike Galbraith

During master->rt merge, I stumbled across the buglet below.

Fix get_cpu()/put_cpu_light() imbalance.

Cc: stable...@vger.kernel.org
Signed-off-by: Mike Gabraith
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 drivers/scsi/fcoe/fcoe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index c1f2bd0bcdb7..8933c02b6729 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1816,7 +1816,7 @@ static void fcoe_recv_frame(struct sk_buff *skb)
 	 */
 	hp = (struct fcoe_hdr *) skb_network_header(skb);
 
-	stats = per_cpu_ptr(lport->stats, get_cpu());
+	stats = per_cpu_ptr(lport->stats, get_cpu_light());
 	if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) {
 		if (stats->ErrorFrames < 5)
 			printk(KERN_WARNING "fcoe: FCoE version "
-- 
2.8.1
[PATCH RT 00/10] Linux 3.18.42-rt45-rc1
Dear RT Folks,

This is the RT stable review cycle of patch 3.18.42-rt45-rc1.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 9/25/2016.

Enjoy,

-- Steve


To build 3.18.42-rt45-rc1 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.18.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.18.42.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.18/patch-3.18.42-rt45-rc1.patch.xz

You can also build from 3.18.42-rt44 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.18/incr/patch-3.18.42-rt44-rt45-rc1.patch.xz


Changes from 3.18.42-rt44:

---

Mike Galbraith (1):
      scsi/fcoe: Fix get_cpu()/put_cpu_light() imbalance in fcoe_recv_frame()

Sebastian Andrzej Siewior (8):
      timers: wakeup all timer waiters
      timers: wakeup all timer waiters without holding the base lock
      sched: lazy_preempt: avoid a warning in the !RT case
      net: add back the missing serialization in ip_send_unicast_reply()
      net: add a lock around icmp_sk()
      fs/dcache: resched/chill only if we make no progress
      x86/preempt-lazy: fixup should_resched()
      fs/dcache: incremental fixup of the retry routine

Steven Rostedt (Red Hat) (1):
      Linux 3.18.42-rt45-rc1

 arch/x86/include/asm/preempt.h | 16 ++--
 drivers/scsi/fcoe/fcoe.c       |  2 +-
 fs/dcache.c                    | 18 --
 kernel/sched/core.c            |  2 +-
 kernel/time/timer.c            |  4 ++--
 localversion-rt                |  2 +-
 net/ipv4/icmp.c                |  8
 net/ipv4/tcp_ipv4.c            |  7 +++
 8 files changed, 46 insertions(+), 13 deletions(-)
[PATCH RT 06/10] net: add a lock around icmp_sk()
3.18.42-rt45-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorIt looks like the this_cpu_ptr() access in icmp_sk() is protected with local_bh_disable(). To avoid missing serialization in -RT I am adding here a local lock. No crash has been observed, this is just precaution. Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- net/ipv4/icmp.c | 8 1 file changed, 8 insertions(+) diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 3c71d5fc54ea..46571b86e171 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -78,6 +78,7 @@ #include #include #include +#include #include #include #include @@ -204,6 +205,8 @@ static const struct icmp_control icmp_pointers[NR_ICMP_TYPES+1]; * * On SMP we have one ICMP socket per-cpu. */ +static DEFINE_LOCAL_IRQ_LOCK(icmp_sk_lock); + static struct sock *icmp_sk(struct net *net) { return net->ipv4.icmp_sk[smp_processor_id()]; @@ -215,12 +218,14 @@ static inline struct sock *icmp_xmit_lock(struct net *net) local_bh_disable(); + local_lock(icmp_sk_lock); sk = icmp_sk(net); if (unlikely(!spin_trylock(>sk_lock.slock))) { /* This can happen if the output path signals a * dst_link_failure() for an outgoing ICMP packet. 
*/ + local_unlock(icmp_sk_lock); local_bh_enable(); return NULL; } @@ -230,6 +235,7 @@ static inline struct sock *icmp_xmit_lock(struct net *net) static inline void icmp_xmit_unlock(struct sock *sk) { spin_unlock_bh(>sk_lock.slock); + local_unlock(icmp_sk_lock); } int sysctl_icmp_msgs_per_sec __read_mostly = 1000; @@ -357,6 +363,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param, struct sock *sk; struct sk_buff *skb; + local_lock(icmp_sk_lock); sk = icmp_sk(dev_net((*rt)->dst.dev)); if (ip_append_data(sk, fl4, icmp_glue_bits, icmp_param, icmp_param->data_len+icmp_param->head_len, @@ -379,6 +386,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param, skb->ip_summed = CHECKSUM_NONE; ip_push_pending_frames(sk, fl4); } + local_unlock(icmp_sk_lock); } /* -- 2.8.1
[PATCH RT 03/10] sched: lazy_preempt: avoid a warning in the !RT case
3.18.42-rt45-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 371fa38784e0..ce6d5c6ba8f7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3171,7 +3171,7 @@ static __always_inline int preemptible_lazy(void)
 
 #else
 
-static int preemptible_lazy(void)
+static inline int preemptible_lazy(void)
 {
 	return 1;
 }
-- 
2.8.1
[PATCH RT 02/10] timers: wakeup all timer waiters without holding the base lock
3.18.42-rt45-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

There should be no need to hold the base lock during the wakeup. There
should be no boosting involved, the wakeup list has its own lock so it
should be safe to do this without the lock.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/time/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 495de6337357..0df5cbfbcb5b 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1281,8 +1281,8 @@ static inline void __run_timers(struct tvec_base *base)
 			}
 		}
 	}
-	wakeup_timer_waiters(base);
 	spin_unlock_irq(>lock);
+	wakeup_timer_waiters(base);
 }
 
 #ifdef CONFIG_NO_HZ_COMMON
-- 
2.8.1
[PATCH RT 07/10] fs/dcache: resched/chill only if we make no progress
3.18.42-rt45-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorUpstream commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()") changed the condition _when_ cpu_relax() / cond_resched() was invoked. This change was adapted in -RT into mostly the same thing except that if cond_resched() did nothing we had to do cpu_chill() to force the task off CPU for a tiny little bit in case the task had RT priority and did not want to leave the CPU. This change resulted in a performance regression (in my testcase the build time on /dev/shm increased from 19min to 24min). The reason is that with this change cpu_chill() was invoked even dput() made progress (dentry_kill() returned a different dentry) instead only if we were trying this operation on the same dentry over and over again. This patch brings back to the old behavior back to cond_resched() & chill if we make no progress. A little improvement is to invoke cpu_chill() only if we are a RT task (and avoid the sleep otherwise). Otherwise the scheduler should remove us from the CPU if we make no progress. 
Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- fs/dcache.c | 19 +-- 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index 1cb13a024cf8..44a3419c7125 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -39,6 +39,8 @@ #include #include #include +#include +#include #include "internal.h" #include "mount.h" @@ -722,6 +724,8 @@ static inline bool fast_dput(struct dentry *dentry) */ void dput(struct dentry *dentry) { + struct dentry *parent; + if (unlikely(!dentry)) return; @@ -758,14 +762,17 @@ repeat: return; kill_it: - dentry = dentry_kill(dentry); - if (dentry) { + parent = dentry_kill(dentry); + if (parent) { int r; - /* the task with the highest priority won't schedule */ - r = cond_resched(); - if (!r) - cpu_chill(); + if (parent == dentry) { + /* the task with the highest priority won't schedule */ + r = cond_resched(); + if (!r && (rt_task(current) || dl_task(current))) + cpu_chill(); + } else + dentry = parent; goto repeat; } } -- 2.8.1
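As a reading aid, the retry policy described in this commit message can be modelled as a small pure function. This is a sketch, not kernel code: enum retry_action and dput_retry_action() are hypothetical names, and the three booleans stand in for "dentry_kill() returned the same dentry", "cond_resched() actually scheduled" and "rt_task(current) || dl_task(current)".

```c
#include <stdbool.h>

/* Hypothetical labels for what the retry path does next. */
enum retry_action {
	RETRY_ADVANCE,	/* a different dentry came back: progress was made */
	RETRY_RESCHED,	/* no progress, but cond_resched() did schedule */
	RETRY_CHILL,	/* no progress, nothing scheduled, RT/DL task: sleep briefly */
	RETRY_SPIN	/* no progress, non-RT task: leave it to the scheduler */
};

/* Model of the decision taken before "goto repeat" in this patch. */
static enum retry_action dput_retry_action(bool same_dentry, bool rescheduled,
					   bool rt_or_dl_task)
{
	if (!same_dentry)
		return RETRY_ADVANCE;
	if (rescheduled)
		return RETRY_RESCHED;
	if (rt_or_dl_task)
		return RETRY_CHILL;
	return RETRY_SPIN;
}
```

The point of the model is the first branch: when progress was made, neither cond_resched() nor cpu_chill() is involved, which is what the message credits for removing the build-time regression.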
Re: + softirq-fix-tasklet_kill-and-its-users.patch added to -mm tree
Hi Thomas,

On Thu, 22 Sep 2016 09:05:23 +0200 (CEST) Thomas Gleixner wrote:
>
> On Wed, 21 Sep 2016, Santosh Shilimkar wrote:
> > I requested you to include this patch but now am not sure anymore.
> > Looks like there are almost 30 more users which are directly
> > tweaking 'tasklet_struct' fields and calling other APIs. Hunting them
> > and fixing them probably would be an exercise and also those changes
> > needs those changed drivers to be tested.
> >
> > What do you suggest ? At least this patch needs to be dropped as of now
> > till we can have complete coverage for those bad users.
>
> Yes, it needs to be dropped. Stephen, can you please revert it from next?

I will do the revert today. It reverts cleanly, so hopefully there are
no side effects.

-- 
Cheers,
Stephen Rothwell
Re: [PATCH] ASoC: simple-card: add support for aux devices
Hi Nikita > Example usage: > > codec: tlv320dac3100@18 { > compatible = "ti,tlv320dac3100"; > ... > } > > amp: tpa6130a2@60 { > compatible = "ti,tpa6130a2"; > ... > } > > sound { > compatible = "simple-audio-card"; > ... > simple-audio-card,widgets = > "Headphone", "Headphone Jack"; > simple-audio-card,routing = > "Headphone Jack", "HPLEFT", > "Headphone Jack", "HPRIGHT", > "LEFTIN", "HPL", > "RIGHTIN", "HPR"; > simple-audio-card,aux-devs = <>; > simple-audio-card,cpu { > sound-dai = <>; > }; > simple-audio-card,codec { > sound-dai = <>; > clocks = ... > }; This case, I think you want ... simple-audio-card,codec { - sound-dai = <>; + sound-dai = <>; > diff --git a/Documentation/devicetree/bindings/sound/simple-card.txt > b/Documentation/devicetree/bindings/sound/simple-card.txt > index 59d8628..5579f40 100644 > --- a/Documentation/devicetree/bindings/sound/simple-card.txt > +++ b/Documentation/devicetree/bindings/sound/simple-card.txt > @@ -22,6 +22,8 @@ Optional properties: > headphones are attached. > - simple-audio-card,mic-det-gpio : Reference to GPIO that signals when > a microphone is attached. > +- simple-audio-card,aux-devs : List of phandles pointing to > auxiliary devices, such > + as amplifiers, to be added to the > sound card. > > Optional subnodes: I think it is very helpful if this document has above sample Best regards --- Kuninori Morimoto
Re: [PATCH] PCI: rockchip: Support quirk to disable 5 GT/s (PCIe 2.x) link rate
Hi Brain, 在 2016/9/23 1:31, Brian Norris 写道: rk3399 supports PCIe 2.x link speeds marginally at best, and on some boards, the link won't train at 5 GT/s at all. Rather than sacrifice 500 ms waiting for training that will never happen, let's support a device tree quirk flag to disable generation 2 speeds entirely. I was thinking about could we get target link speed [TLS] from the end-point when finishing Gen1 training, but it seems that the location of ep's TLS is not fixed. Anyway, your patch looks sane to me as we leave gen2 as default and people could drop that feature by adding rockchip,disable-gen2 to their dts if they are sure the board would never supoort Gen2 devices. Acked-by: Shawn LinSigned-off-by: Brian Norris --- .../devicetree/bindings/pci/rockchip-pcie.txt | 2 + drivers/pci/host/pcie-rockchip.c | 57 +- 2 files changed, 37 insertions(+), 22 deletions(-) diff --git a/Documentation/devicetree/bindings/pci/rockchip-pcie.txt b/Documentation/devicetree/bindings/pci/rockchip-pcie.txt index ba67b39939c1..e769726fd093 100644 --- a/Documentation/devicetree/bindings/pci/rockchip-pcie.txt +++ b/Documentation/devicetree/bindings/pci/rockchip-pcie.txt @@ -42,6 +42,8 @@ Required properties: Optional Property: - ep-gpios: contain the entry for pre-reset gpio - num-lanes: number of lanes to use +- rockchip,disable-gen2: present if PCIe generation 2.x (i.e., 5 GT/s link + speeds) is not supported. - vpcie3v3-supply: The phandle to the 3.3v regulator to use for PCIe. - vpcie1v8-supply: The phandle to the 1.8v regulator to use for PCIe. - vpcie0v9-supply: The phandle to the 0.9v regulator to use for PCIe. 
diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c index c3593e633ccd..f047c4a73f69 100644 --- a/drivers/pci/host/pcie-rockchip.c +++ b/drivers/pci/host/pcie-rockchip.c @@ -53,6 +53,7 @@ #define PCIE_CLIENT_ARI_ENABLE HIWORD_UPDATE_BIT(0x0008) #define PCIE_CLIENT_CONF_LANE_NUM(x) HIWORD_UPDATE(0x0030, ENCODE_LANES(x)) #define PCIE_CLIENT_MODE_RCHIWORD_UPDATE_BIT(0x0040) +#define PCIE_CLIENT_GEN_SEL_1 HIWORD_UPDATE(0x0080, 0) #define PCIE_CLIENT_GEN_SEL_2 HIWORD_UPDATE_BIT(0x0080) #define PCIE_CLIENT_BASIC_STATUS1 (PCIE_CLIENT_BASE + 0x48) #define PCIE_CLIENT_LINK_STATUS_UP 0x0030 @@ -191,6 +192,7 @@ struct rockchip_pcie { struct gpio_desc *ep_gpio; u32 lanes; u8 root_bus_nr; + boolenable_gen2; struct device *dev; struct irq_domain *irq_domain; }; @@ -418,13 +420,19 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip) return err; } + if (rockchip->enable_gen2) + rockchip_pcie_write(rockchip, PCIE_CLIENT_GEN_SEL_2, + PCIE_CLIENT_CONFIG); + else + rockchip_pcie_write(rockchip, PCIE_CLIENT_GEN_SEL_1, + PCIE_CLIENT_CONFIG); + rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE | PCIE_CLIENT_LINK_TRAIN_ENABLE | PCIE_CLIENT_ARI_ENABLE | PCIE_CLIENT_CONF_LANE_NUM(rockchip->lanes) | - PCIE_CLIENT_MODE_RC | - PCIE_CLIENT_GEN_SEL_2, + PCIE_CLIENT_MODE_RC, PCIE_CLIENT_CONFIG); err = phy_power_on(rockchip->phy); @@ -492,29 +500,31 @@ static int rockchip_pcie_init_port(struct rockchip_pcie *rockchip) msleep(20); } - /* -* Enable retrain for gen2. This should be configured only after -* gen1 finished. -*/ - status = rockchip_pcie_read(rockchip, PCIE_RC_CONFIG_LCS); - status |= PCIE_RC_CONFIG_LCS_RETRAIN_LINK; - rockchip_pcie_write(rockchip, status, PCIE_RC_CONFIG_LCS); + if (rockchip->enable_gen2) { + /* +* Enable retrain for gen2. This should be configured only after +* gen1 finished. 
+*/ + status = rockchip_pcie_read(rockchip, PCIE_RC_CONFIG_LCS); + status |= PCIE_RC_CONFIG_LCS_RETRAIN_LINK; + rockchip_pcie_write(rockchip, status, PCIE_RC_CONFIG_LCS); + + timeout = jiffies + msecs_to_jiffies(500); + for (;;) { + status = rockchip_pcie_read(rockchip, PCIE_CORE_CTRL); + if ((status & PCIE_CORE_PL_CONF_SPEED_MASK) == + PCIE_CORE_PL_CONF_SPEED_5G) { + dev_dbg(dev, "PCIe link training gen2 pass!\n"); + break; + } - timeout = jiffies + msecs_to_jiffies(500); -
[PATCH] drivers: drm: nouveau: Fix brace coding style error
Fixed a coding style issue. Else should follow brace.

Signed-off-by: David Archuleta Jr.
---
 drivers/gpu/drm/nouveau/nouveau_display.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index afbf557..0d39e17 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -851,8 +851,7 @@ nouveau_finish_page_flip(struct nouveau_channel *chan,
 			/* Give up ownership of vblank for page-flipped crtc */
 			drm_crtc_vblank_put(s->crtc);
 		}
-	}
-	else {
+	} else {
 		/* Give up ownership of vblank for page-flipped crtc */
 		drm_crtc_vblank_put(s->crtc);
 	}
-- 
2.10.0
Re: [PATCH] usb: dwc2: add USBTrdTim to initial value
On 9/21/2016 6:43 PM, Pengcheng Li wrote:
> After dwc2_core_reset, the register is at its initial value, and the USBTrdTim
> value is 0x5. If hsotg->phyif = GUSBCFG_PHYIF8, after the dwc2_writel, the
> value is 0xd. So we need to set the USBTrdTim to 0.

[++ Felipe]

Looks good. But please clean up the subject and message

Subject: usb: dwc2: Clear GUSBCFG.UsbTrdTim before setting

Description:
The USBTRDTIM field needs to be cleared before setting a new
value. Otherwise it will result in an incorrect value if
phyif == GUSBCFG_PHYIF8.

With that

Acked-by: John Youn

Thanks,
John

>
> Signed-off-by: Pengcheng Li
> ---
>  drivers/usb/dwc2/gadget.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
> index af46adf..9e52e4f 100644
> --- a/drivers/usb/dwc2/gadget.c
> +++ b/drivers/usb/dwc2/gadget.c
> @@ -2531,7 +2531,7 @@ void dwc2_hsotg_core_init_disconnected(struct dwc2_hsotg *hsotg,
>  	/* keep other bits untouched (so e.g. forced modes are not lost) */
>  	usbcfg = dwc2_readl(hsotg->regs + GUSBCFG);
>  	usbcfg &= ~(GUSBCFG_TOUTCAL_MASK | GUSBCFG_PHYIF16 | GUSBCFG_SRPCAP |
> -		GUSBCFG_HNPCAP);
> +		GUSBCFG_HNPCAP | GUSBCFG_USBTRDTIM_MASK);
>
>  	/* set the PLL on, remove the HNP/SRP and set the PHY */
>  	val = (hsotg->phyif == GUSBCFG_PHYIF8) ? 9 : 5;
> @@ -3413,7 +3413,7 @@ static void dwc2_hsotg_init(struct dwc2_hsotg *hsotg)
>  	/* keep other bits untouched (so e.g. forced modes are not lost) */
>  	usbcfg = dwc2_readl(hsotg->regs + GUSBCFG);
>  	usbcfg &= ~(GUSBCFG_TOUTCAL_MASK | GUSBCFG_PHYIF16 | GUSBCFG_SRPCAP |
> -		GUSBCFG_HNPCAP);
> +		GUSBCFG_HNPCAP | GUSBCFG_USBTRDTIM_MASK);
>
>  	/* set the PLL on, remove the HNP/SRP and set the PHY */
>  	trdtim = (hsotg->phyif == GUSBCFG_PHYIF8) ? 9 : 5;
>
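The arithmetic behind the 0xd can be checked with a small stand-alone sketch. The field position below (4 bits starting at bit 10) and the helper names are assumptions made for illustration; the driver's own GUSBCFG_USBTRDTIM_MASK definition is the authority for the real layout.

```c
#include <stdint.h>

/* Assumed field layout for illustration: 4 bits starting at bit 10. */
#define TRDTIM_SHIFT	10
#define TRDTIM_MASK	(0xfu << TRDTIM_SHIFT)

/* OR the new value in without clearing the field first (the buggy pattern). */
static uint32_t set_trdtim_no_clear(uint32_t reg, uint32_t val)
{
	return reg | (val << TRDTIM_SHIFT);
}

/* Clear the field, then set it (the pattern the patch moves to). */
static uint32_t set_trdtim(uint32_t reg, uint32_t val)
{
	reg &= ~TRDTIM_MASK;
	return reg | ((val << TRDTIM_SHIFT) & TRDTIM_MASK);
}

/* Extract the current field value. */
static uint32_t get_trdtim(uint32_t reg)
{
	return (reg & TRDTIM_MASK) >> TRDTIM_SHIFT;
}
```

With a reset value of 5 in the field, ORing in 9 gives 0101b | 1001b = 1101b = 0xd, matching the value reported above, whereas clearing first leaves the intended 9.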
Re: [PATCH 0/4 v3] Add an interface to discover relationships between namespaces
Andrei Vaginwrites: > From: Andrey Vagin > > Each namespace has an owning user namespace and now there is not way > to discover these relationships. > > Pid and user namepaces are hierarchical. There is no way to discover > parent-child relationships too. > > Why we may want to know relationships between namespaces? > > One use would be visualization, in order to understand the running > system. Another would be to answer the question: what capability does > process X have to perform operations on a resource governed by namespace > Y? > > One more use-case (which usually called abnormal) is checkpoint/restart. > In CRIU we are going to dump and restore nested namespaces. > > There [1] was a discussion about which interface to choose to determing > relationships between namespaces. > > Eric suggested to add two ioctl-s [2]: >> Grumble, Grumble. I think this may actually a case for creating ioctls >> for these two cases. Now that random nsfs file descriptors are bind >> mountable the original reason for using proc files is not as pressing. >> >> One ioctl for the user namespace that owns a file descriptor. >> One ioctl for the parent namespace of a namespace file descriptor. > > Here is an implementaions of these ioctl-s. > > $ man man7/namespaces.7 > ... > Since Linux 4.X, the following ioctl(2) calls are supported for > namespace file descriptors. The correct syntax is: > > fd = ioctl(ns_fd, ioctl_type); > > where ioctl_type is one of the following: > > NS_GET_USERNS > Returns a file descriptor that refers to an owning user names‐ > pace. > > NS_GET_PARENT > Returns a file descriptor that refers to a parent namespace. > This ioctl(2) can be used for pid and user namespaces. For > user namespaces, NS_GET_PARENT and NS_GET_USERNS have the same > meaning. > > In addition to generic ioctl(2) errors, the following specific ones > can occur: > > EINVAL NS_GET_PARENT was called for a nonhierarchical namespace. 
> > EPERM The requested namespace is outside of the current namespace > scope. > > [1] https://lkml.org/lkml/2016/7/6/158 > [2] https://lkml.org/lkml/2016/7/9/101 > > Changes for v2: > * don't return ENOENT for init_user_ns and init_pid_ns. There is nothing > outside of the init namespace, so we can return EPERM in this case too. >> The fewer special cases the easier the code is to get >> correct, and the easier it is to read. // Eric > > Changes for v3: > * rename ns->get_owner() to ns->owner(). get_* usually means that it > grabs a reference. > > Cc: "Eric W. Biederman" > Cc: James Bottomley > Cc: "Michael Kerrisk (man-pages)" > Cc: "W. Trevor King" > Cc: Alexander Viro > Cc: Serge Hallyn > Applied thanks. I didn't see any issues except your patch __ns_get_path was missing a mntput in the retry case. So I just fixed that. Eric
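A minimal user-space sketch of how the two ioctls quoted above might be called is shown below. It assumes a kernel and headers new enough to ship <linux/nsfs.h> with NS_GET_USERNS and NS_GET_PARENT; ns_query() is a hypothetical helper name, not part of the patch set.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nsfs.h>	/* NS_GET_USERNS, NS_GET_PARENT */

/*
 * Open a namespace file such as /proc/self/ns/pid and issue one of the
 * namespace ioctls on it. Returns a new file descriptor for the related
 * namespace, or -1 with errno set (EPERM and EINVAL as described in the
 * man page excerpt quoted above).
 */
static int ns_query(const char *ns_path, unsigned long request)
{
	int ns_fd, fd, saved_errno;

	ns_fd = open(ns_path, O_RDONLY);
	if (ns_fd < 0)
		return -1;

	fd = ioctl(ns_fd, request);

	/* Preserve the ioctl's errno across close(). */
	saved_errno = errno;
	close(ns_fd);
	errno = saved_errno;

	return fd;
}
```

For example, ns_query("/proc/self/ns/pid", NS_GET_USERNS) would ask for the user namespace owning the caller's pid namespace, and the caller is responsible for closing the returned descriptor.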
[PATCH RT 03/10] sched: lazy_preempt: avoid a warning in the !RT case
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d07a89f3681f..6d2591018bcb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3027,7 +3027,7 @@ static __always_inline int preemptible_lazy(void)
 
 #else
 
-static int preemptible_lazy(void)
+static inline int preemptible_lazy(void)
 {
 	return 1;
 }
-- 
2.8.1
[PATCH RT 08/10] x86/preempt-lazy: fixup should_resched()
3.14.79-rt85-rc1 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej Siewiorshould_resched() returns true if NEED_RESCHED is set and the preempt_count is 0 _or_ if NEED_RESCHED_LAZY is set ignoring the preempt counter. Ignoring the preemp counter is wrong. This patch adds this into account. While at it, __preempt_count_dec_and_test() ignores preempt_lazy_count while checking TIF_NEED_RESCHED_LAZY so we this check, too. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt --- arch/x86/include/asm/preempt.h | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h index 9afdfb9ce021..aba26ff51f39 100644 --- a/arch/x86/include/asm/preempt.h +++ b/arch/x86/include/asm/preempt.h @@ -107,6 +107,8 @@ static __always_inline bool __preempt_count_dec_and_test(void) if (preempt_count_dec_and_test()) return true; #ifdef CONFIG_PREEMPT_LAZY + if (current_thread_info()->preempt_lazy_count) + return false; return test_thread_flag(TIF_NEED_RESCHED_LAZY); #else return false; @@ -119,8 +121,19 @@ static __always_inline bool __preempt_count_dec_and_test(void) static __always_inline bool should_resched(int preempt_offset) { #ifdef CONFIG_PREEMPT_LAZY - return unlikely(__this_cpu_read_4(__preempt_count) == preempt_offset || \ - test_thread_flag(TIF_NEED_RESCHED_LAZY)); + u32 tmp; + + tmp = __this_cpu_read_4(__preempt_count); + if (tmp == preempt_offset) + return true; + + /* preempt count == 0 ? */ + tmp &= ~PREEMPT_NEED_RESCHED; + if (tmp) + return false; + if (current_thread_info()->preempt_lazy_count) + return false; + return test_thread_flag(TIF_NEED_RESCHED_LAZY); #else return unlikely(__this_cpu_read_4(__preempt_count) == preempt_offset); #endif -- 2.8.1
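The decision sequence in the new should_resched() can be modelled outside the kernel as a pure function, which makes the effect of the extra checks easier to see. This is a sketch under assumptions: NEED_RESCHED_BIT is taken to be the top bit of the count, and should_resched_model() with its explicit parameters is a hypothetical stand-in for the per-CPU and thread_info reads in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the need-resched bit folded into the preempt count. */
#define NEED_RESCHED_BIT 0x80000000u

/*
 * Model of the patched check: lazy_count and lazy_flag stand in for
 * preempt_lazy_count and TIF_NEED_RESCHED_LAZY.
 */
static bool should_resched_model(uint32_t preempt_count, uint32_t preempt_offset,
				 uint32_t lazy_count, bool lazy_flag)
{
	uint32_t tmp = preempt_count;

	if (tmp == preempt_offset)
		return true;

	/* Ignore the need-resched bit and look at the count proper. */
	tmp &= ~NEED_RESCHED_BIT;
	if (tmp)
		return false;
	if (lazy_count)
		return false;
	return lazy_flag;
}
```

The case the commit message calls wrong is the one with a non-zero count and the lazy flag set: the previous expression returned true for it, while this sequence returns false.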
[PATCH RT 05/10] net: add back the missing serialization in ip_send_unicast_reply()
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Some time ago Sami Pietikainen reported a crash on -RT in ip_send_unicast_reply() which was later fixed by Nicholas Mc Guire (v3.12.8-rt11). Later (v3.18.8) the code was reworked and I dropped the patch. As it turns out it was a mistake.
I have reports that the same crash is possible with a similar backtrace. It seems that vanilla protects access to this_cpu_ptr() via local_bh_disable(). This does not work on -RT since we can have NET_RX and NET_TX running in parallel on the same CPU.
This brings back the old locks.

|Unable to handle kernel NULL pointer dereference at virtual address 0010
|PC is at __ip_make_skb+0x198/0x3e8
|[] (__ip_make_skb) from [] (ip_push_pending_frames+0x20/0x40)
|[] (ip_push_pending_frames) from [] (ip_send_unicast_reply+0x210/0x22c)
|[] (ip_send_unicast_reply) from [] (tcp_v4_send_reset+0x190/0x1c0)
|[] (tcp_v4_send_reset) from [] (tcp_v4_do_rcv+0x22c/0x288)
|[] (tcp_v4_do_rcv) from [] (release_sock+0xb4/0x150)
|[] (release_sock) from [] (tcp_close+0x240/0x454)
|[] (tcp_close) from [] (inet_release+0x74/0x7c)
|[] (inet_release) from [] (sock_release+0x30/0xb0)
|[] (sock_release) from [] (sock_close+0x1c/0x24)
|[] (sock_close) from [] (__fput+0xe8/0x20c)
|[] (__fput) from [] (fput+0x18/0x1c)
|[] (fput) from [] (task_work_run+0xa4/0xb8)
|[] (task_work_run) from [] (do_work_pending+0xd0/0xe4)
|[] (do_work_pending) from [] (work_pending+0xc/0x20)
|Code: e3530001 8a01 e3a00040 ea11 (e5973010)

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 net/ipv4/tcp_ipv4.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1b2a53e625cc..ded5b34eba27 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -62,6 +62,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -573,6 +574,7 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_v4_send_check);
+static DEFINE_LOCAL_IRQ_LOCK(tcp_sk_lock);
 /*
  * This routine will send an RST to the other tcp.
  *
@@ -691,9 +693,12 @@ static void tcp_v4_send_reset(struct sock *sk, struct sk_buff *skb)
 	net = dev_net(skb_dst(skb)->dev);
 	arg.tos = ip_hdr(skb)->tos;
+
+	local_lock(tcp_sk_lock);
 	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk), skb,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, , arg.iov[0].iov_len);
+	local_unlock(tcp_sk_lock);
 	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
 	TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS);
@@ -775,9 +780,11 @@ static void tcp_v4_send_ack(struct sk_buff *skb, u32 seq, u32 ack,
 	if (oif)
 		arg.bound_dev_if = oif;
 	arg.tos = tos;
+	local_lock(tcp_sk_lock);
 	ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk), skb,
 			      ip_hdr(skb)->saddr, ip_hdr(skb)->daddr, , arg.iov[0].iov_len);
+	local_unlock(tcp_sk_lock);
 	TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
 }
--
2.8.1
[PATCH RT 02/10] timers: wakeup all timer waiters without holding the base lock
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

There should be no need to hold the base lock during the wakeup. There should be no boosting involved, the wakeup list has its own lock so it should be safe to do this without the lock.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index d0552cd25b40..1552f3a3adf9 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1267,8 +1267,8 @@ static inline void __run_timers(struct tvec_base *base)
 			}
 		}
 	}
-	wakeup_timer_waiters(base);
 	spin_unlock_irq(>lock);
+	wakeup_timer_waiters(base);
 }
 #ifdef CONFIG_NO_HZ_COMMON
--
2.8.1
[PATCH RT 01/10] timers: wakeup all timer waiters
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

The base lock is dropped during the invocation of the timer. That means it is possible that we have one waiter while timer1 is running and once this one finished, we get another waiter while timer2 is running. Since we wake up only one waiter it is possible that we miss the other one.
This will probably heal itself over time because most of the time we complete timers without an active wake up.
To avoid the scenario where we don't wake up all waiters at once, wake_up_all() is used.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 kernel/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 3796af212c95..d0552cd25b40 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1009,7 +1009,7 @@ static void wait_for_running_timer(struct timer_list *timer)
 		   base->running_timer != timer);
 }
-# define wakeup_timer_waiters(b) wake_up(&(b)->wait_for_running_timer)
+# define wakeup_timer_waiters(b) wake_up_all(&(b)->wait_for_running_timer)
 #else
 static inline void wait_for_running_timer(struct timer_list *timer)
 {
--
2.8.1
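The missed-waiter scenario from the commit message can be illustrated with a toy model. This is only a sketch: struct model_waitqueue and the model_* helpers are hypothetical, they merely count sleeping waiters and do not reproduce the kernel's wait queue implementation. The "wake one" helper follows the behaviour the commit message describes for the old wake_up() call.

```c
#include <stddef.h>

/* Hypothetical stand-in for a wait queue: it only counts sleepers. */
struct model_waitqueue {
	size_t sleeping;
};

/* A task starts waiting for a running timer. */
static void model_add_waiter(struct model_waitqueue *wq)
{
	wq->sleeping++;
}

/*
 * Old behaviour as described in the commit message: at most one
 * waiter is woken. Returns the number of waiters woken.
 */
static size_t model_wake_one(struct model_waitqueue *wq)
{
	if (wq->sleeping == 0)
		return 0;
	wq->sleeping--;
	return 1;
}

/* New behaviour: every waiter is woken. Returns the number woken. */
static size_t model_wake_all(struct model_waitqueue *wq)
{
	size_t woken = wq->sleeping;

	wq->sleeping = 0;
	return woken;
}
```

If two waiters queue up during one pass over the timers and model_wake_one() is called once at the end, one waiter stays asleep until some later wakeup; model_wake_all() leaves none behind.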
[PATCH RT 10/10] Linux 3.14.79-rt85-rc1
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)"

---
 localversion-rt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/localversion-rt b/localversion-rt
index fc6ea32352bc..f4f446dc00a4 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt84
+-rt85-rc1
--
2.8.1
[PATCH RT 07/10] fs/dcache: resched/chill only if we make no progress
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

Upstream commit 47be61845c77 ("fs/dcache.c: avoid soft-lockup in dput()") changed the condition _when_ cpu_relax() / cond_resched() was invoked. This change was adapted in -RT into mostly the same thing except that if cond_resched() did nothing we had to do cpu_chill() to force the task off CPU for a tiny little bit in case the task had RT priority and did not want to leave the CPU.
This change resulted in a performance regression (in my testcase the build time on /dev/shm increased from 19min to 24min). The reason is that with this change cpu_chill() was invoked even if dput() made progress (dentry_kill() returned a different dentry) instead of only if we were trying this operation on the same dentry over and over again.
This patch brings back the old behavior: cond_resched() & chill only if we make no progress. A little improvement is to invoke cpu_chill() only if we are an RT task (and avoid the sleep otherwise). Otherwise the scheduler should remove us from the CPU if we make no progress.

Cc: stable...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 2c37cbad09d8..4935dbe31d65 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -39,6 +39,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "internal.h"
 #include "mount.h"
@@ -589,6 +591,8 @@ again:
  */
 void dput(struct dentry *dentry)
 {
+	struct dentry *parent;
+
 	if (unlikely(!dentry))
 		return;
@@ -617,9 +621,19 @@ repeat:
 	return;
 kill_it:
-	dentry = dentry_kill(dentry);
-	if (dentry)
+	parent = dentry_kill(dentry);
+	if (parent) {
+		int r;
+
+		if (parent == dentry) {
+			/* the task with the highest priority won't schedule */
+			r = cond_resched();
+			if (!r && (rt_task(current) || dl_task(current)))
+				cpu_chill();
+		} else
+			dentry = parent;
 		goto repeat;
+	}
 }
 EXPORT_SYMBOL(dput);
--
2.8.1
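The retry policy this patch describes can be written down as a small decision function. The sketch below is a userspace model, not the dcache code: enum retry_action and model_dput_retry() are hypothetical names, the dentries are opaque pointers, and the results of cond_resched() and of the rt_task()/dl_task() checks are passed in as booleans. It assumes dentry_kill() returned a non-NULL pointer, which is the only case in which dput() retries.

```c
#include <stdbool.h>

/* Hypothetical labels for what dput() does before "goto repeat". */
enum retry_action {
	RETRY_NEXT_DENTRY,	/* progress: continue with the returned dentry */
	RETRY_AFTER_RESCHED,	/* no progress, cond_resched() did reschedule */
	RETRY_AFTER_CHILL,	/* no progress, no reschedule, RT/DL task sleeps */
	RETRY_IMMEDIATELY	/* no progress, no reschedule, non-RT task */
};

/*
 * Model of the retry decision in this patch.
 * dentry:           the dentry that was passed to dentry_kill()
 * returned:         the non-NULL pointer dentry_kill() handed back
 * resched_happened: whether cond_resched() actually rescheduled
 * is_rt_or_dl:      whether the current task is an RT or deadline task
 */
static enum retry_action model_dput_retry(const void *dentry,
					  const void *returned,
					  bool resched_happened,
					  bool is_rt_or_dl)
{
	/* A different dentry means progress: no resched, no chill. */
	if (returned != dentry)
		return RETRY_NEXT_DENTRY;

	if (resched_happened)
		return RETRY_AFTER_RESCHED;

	/* Only tasks that the scheduler will not push off the CPU sleep. */
	if (is_rt_or_dl)
		return RETRY_AFTER_CHILL;

	return RETRY_IMMEDIATELY;
}
```

The first branch is the one that addresses the reported build-time regression: when dentry_kill() returns a different dentry, neither cond_resched() nor cpu_chill() is reached.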
[PATCH RT 09/10] fs/dcache: incremental fixup of the retry routine
3.14.79-rt85-rc1 stable review patch.
If anyone has any objections, please let me know.

--

From: Sebastian Andrzej Siewior

It has been pointed out by tglx that on UP the non-RT task could spin its entire time slice because the lock owner is preempted. This won't happen on !RT. So we go back to "chill" if cond_resched() did not work.

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Steven Rostedt
---
 fs/dcache.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 4935dbe31d65..8db65a5d2a6e 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -39,8 +39,6 @@
 #include
 #include
 #include
-#include
-#include
 #include "internal.h"
 #include "mount.h"
@@ -628,10 +626,11 @@ kill_it:
 		if (parent == dentry) {
 			/* the task with the highest priority won't schedule */
 			r = cond_resched();
-			if (!r && (rt_task(current) || dl_task(current)))
+			if (!r)
 				cpu_chill();
-		} else
+		} else {
 			dentry = parent;
+		}
 		goto repeat;
 	}
 }
--
2.8.1
[PATCH 0/4] ARM: at91: initial samx7 support
Hi,

This series adds initial support for Atmel armv7m SoCs. Most of the work has been done and tested by András, thanks!

Alexandre Belloni (1):
  ARM: at91: handle CONFIG_PM for armv7m configurations

Szemző András (3):
  ARM: at91: Add armv7m support
  ARM: dts: at91: add samx7 dtsi
  ARM: at91: debug: add samx7 support

 arch/arm/Kconfig.debug           |   10 +
 arch/arm/boot/dts/samx7.dtsi     | 1166 ++
 arch/arm/mach-at91/Kconfig       |   15 +-
 arch/arm/mach-at91/Makefile      |    4 +-
 arch/arm/mach-at91/Makefile.boot |    3 +
 arch/arm/mach-at91/samx7.c       |   72 +++
 arch/arm/mach-at91/soc.h         |   21 +
 7 files changed, 1288 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/boot/dts/samx7.dtsi
 create mode 100644 arch/arm/mach-at91/Makefile.boot
 create mode 100644 arch/arm/mach-at91/samx7.c

--
2.9.3