date:20140117

BUG: spinlock lockup

2014-01-17 Thread naveen yadav

Dear All,

We are using a 3.8.x kernel on ARM and are facing a soft lockup issue.
The logs follow.

BUG: spinlock lockup suspected on CPU#0, process1/525
lock: 0xd8ac9a64, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1


1. It looks like the lock is available, since the owner is -1. Why is
arch_spin_trylock failing?

2. There is a patch: "ARM: spinlock: retry trylock operation if strex
fails on free lock"
http://permalink.gmane.org/gmane.linux.ports.arm.kernel/240913
In this patch, a loop has been added around "strexeq %2, %0, [%3]".
(Comment: "retry the trylock operation if the lock appears
to be free but the strex reported failure")

But arch_spin_trylock is called by __spin_lock_debug, which already
calls it in a loop. So what problem does the patch solve?

static void __spin_lock_debug(raw_spinlock_t *lock)
{
        u64 i;
        u64 loops = loops_per_jiffy * HZ;

        for (i = 0; i < loops; i++) {
                if (arch_spin_trylock(&lock->raw_lock))
                        return;
                __delay(1);
        }
        /* lockup suspected: */
        spin_dump(lock, "lockup suspected");
}

3. Is this patch useful to us? How can we reproduce this scenario?
Scenario: the lock is available but arch_spin_trylock returns failure.
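
For illustration of question 2, here is a minimal user-space sketch. It is
not the kernel's ARM implementation. It assumes C11 atomics, where a weak
compare-exchange is allowed to fail spuriously, much as a strex can report
failure on a free lock. demo_lock_t, demo_trylock_once, demo_trylock_retry
and demo_unlock are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = free, 1 = held. */
typedef struct {
    atomic_uint locked;
} demo_lock_t;

/*
 * One attempt. The weak compare-exchange may fail spuriously, so this
 * can return false even though the lock was free. That is the portable
 * analogue of a strex that reports failure.
 */
static bool demo_trylock_once(demo_lock_t *l)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed);
}

/*
 * Retrying variant. On failure, `expected` holds the value that was
 * observed. If it is still 0, the lock looked free and the failure was
 * spurious, so try again; only a non-zero value means "really held".
 */
static bool demo_trylock_retry(demo_lock_t *l)
{
    unsigned int expected;

    do {
        expected = 0;
        if (atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    } while (expected == 0);

    return false;
}

static void demo_unlock(demo_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A caller that loops around the single-attempt version, as
__spin_lock_debug does, would simply try again. A caller that makes only
one attempt would see the spurious failure directly, which may be the case
the retry is aimed at; that reading is an assumption, not something the
patch description quoted above states.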

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 0/9] Phase out pci_enable_msi_block()

2014-01-17 Thread Alexander Gordeev

On Fri, Jan 17, 2014 at 02:00:32PM -0700, Bjorn Helgaas wrote:
> > As the release is supposedly this weekend, do you prefer
> > the patches to go to your tree or to individual trees after
> > the release?
> 
> I'd be happy to merge them, except for the fact that they probably
> wouldn't have any time in -next before I ask Linus to pull them.  So
> how about if we wait until after the release, ask the area maintainers
> to take them, and if they don't take them, I'll put them in my tree
> for v3.15?

Patch 11 depends on patches 1-10, so I am not sure how to better handle it.
Whatever works for you ;)

I am only concerned with a regression fix "ahci: Fix broken fallback to
single MSI mode" which would be nice to have in 3.14. But it seems pretty
much too late.

> Bjorn

Thanks!

-- 
Regards,
Alexander Gordeev
agord...@redhat.com

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando

> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote:
> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
> > > Hi,
> > >
> > > Upgraded a few kernels to the latest 3.10 stable tree while tracking down
> > > a rare kernel panic, seems to have introduced a much more frequent kernel
> > > panic. Takes anywhere from 4 hours to 2 days to trigger:
> > >
> > > <4>[196727.311203] general protection fault:  [#1] SMP
> > > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan 
> > > bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode 
> > > ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm 
> > > tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp 
> > > pps_core mdio
> > > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
> > > <4>[196727.311344] Hardware name: Supermicro 
> > > X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
> > > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 task.ti: 
> > > 885e6f072000
> > > <4>[196727.311377] RIP: 0010:[]  [] 
> > > ipv4_dst_destroy+0x4f/0x80
> > > <4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
> > > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 RCX: 
> > > 0040
> > > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 RDI: 
> > > dead00200200
> > > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 R09: 
> > > 885d5a590800
> > > <4>[196727.311451] R10:  R11:  R12: 
> > > 
> > > <4>[196727.311464] R13: 81c8c280 R14:  R15: 
> > > 880e85ee16ce
> > > <4>[196727.311510] FS:  () GS:885effd2() 
> > > knlGS:
> > > <4>[196727.311554] CS:  0010 DS:  ES:  CR0: 80050033
> > > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 CR4: 
> > > 000407e0
> > > <4>[196727.311625] DR0:  DR1:  DR2: 
> > > 
> > > <4>[196727.311669] DR3:  DR6: 0ff0 DR7: 
> > > 0400
> > > <4>[196727.311713] Stack:
> > > <4>[196727.311733]  8854c398ecc0 8854c398ecc0 885effd23ab0 
> > > 815b7f42
> > > <4>[196727.311784]  88be6595bc00 8854c398ecc0  
> > > 8854c398ecc0
> > > <4>[196727.311834]  885effd23ad0 815b86c6 885d5a590800 
> > > 8816827821c0
> > > <4>[196727.311885] Call Trace:
> > > <4>[196727.311907]  
> > > <4>[196727.311912]  [] dst_destroy+0x32/0xe0
> > > <4>[196727.311959]  [] dst_release+0x56/0x80
> > > <4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
> > > <4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
> > > <4>[196727.312041]  [] ? ip_rcv_finish+0x360/0x360
> > > <4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
> > > <4>[196727.312097]  [] ? ip_rcv_finish+0x360/0x360
> > > <4>[196727.312125]  [] 
> > > ip_local_deliver_finish+0xb2/0x230
> > > <4>[196727.312154]  [] ip_local_deliver+0x4a/0x90
> > > <4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
> > > <4>[196727.312212]  [] ip_rcv+0x22b/0x340
> > > <4>[196727.312242]  [] ? macvlan_broadcast+0x160/0x160 
> > > [macvlan]
> > > <4>[196727.312275]  [] 
> > > __netif_receive_skb_core+0x512/0x640
> > > <4>[196727.312308]  [] ? kmem_cache_alloc+0x13b/0x150
> > > <4>[196727.312338]  [] __netif_receive_skb+0x21/0x70
> > > <4>[196727.312368]  [] netif_receive_skb+0x31/0xa0
> > > <4>[196727.312397]  [] napi_gro_receive+0xe8/0x140
> > > <4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 [ixgbe]
> > > <4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
> > > <4>[196727.312491]  [] net_rx_action+0x111/0x210
> > > <4>[196727.312521]  [] ? __netif_receive_skb+0x21/0x70
> > > <4>[196727.312552]  [] __do_softirq+0xd0/0x270
> > > <4>[196727.312583]  [] call_softirq+0x1c/0x30
> > > <4>[196727.312613]  [] do_softirq+0x55/0x90
> > > <4>[196727.312640]  [] irq_exit+0x55/0x60
> > > <4>[196727.312668]  [] do_IRQ+0x63/0xe0
> > > <4>[196727.312696]  [] common_interrupt+0x6a/0x6a
> > > <4>[196727.312722]  
> > > <4>[196727.312727]  [] ? default_idle+0x20/0xe0
> > > <4>[196727.312775]  [] arch_cpu_idle+0xf/0x20
> > > <4>[196727.312803]  [] cpu_startup_entry+0xc0/0x270
> > > <4>[196727.312833]  [] start_secondary+0x1f9/0x200
> > > <4>[196727.312860] Code: 4a 9f e9 81 e8 13 cb 0c 00 48 8b 93 b0 00 00 00 
> > > 48 bf 00 02 20 00 00 00 ad de 48 8b 83 b8 00 00 00 48 be 00 01 10 00 00 
> > > 00 ad de <48> 89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81
> > > <1>[196727.313071] RIP  [] ipv4_dst_destroy+0x4f/0x80
> > > <4>[196727.313100]  RSP 
> > > <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
> > > <0>[196727.380908] Kernel panic - not syncing: Fatal exception in 
> > > interrupt
> > >
> > >
> > > ... bisecting it's going to be a pain... I tried eyeballing the diffs and
> > > am trying a revert or two.
> > >
> > > We've hit it in .25, .26 so far. I

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread Eric Dumazet

On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote:
> On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
> > Hi,
> > 
> > Upgraded a few kernels to the latest 3.10 stable tree while tracking down
> > a rare kernel panic, seems to have introduced a much more frequent kernel
> > panic. Takes anywhere from 4 hours to 2 days to trigger:
> > 
> > <4>[196727.311203] general protection fault:  [#1] SMP
> > <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge 
> > coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog 
> > ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios 
> > ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
> > <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
> > <4>[196727.311344] Hardware name: Supermicro 
> > X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
> > <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 task.ti: 
> > 885e6f072000
> > <4>[196727.311377] RIP: 0010:[]  [] 
> > ipv4_dst_destroy+0x4f/0x80
> > <4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
> > <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 RCX: 
> > 0040
> > <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 RDI: 
> > dead00200200
> > <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 R09: 
> > 885d5a590800
> > <4>[196727.311451] R10:  R11:  R12: 
> > 
> > <4>[196727.311464] R13: 81c8c280 R14:  R15: 
> > 880e85ee16ce
> > <4>[196727.311510] FS:  () GS:885effd2() 
> > knlGS:
> > <4>[196727.311554] CS:  0010 DS:  ES:  CR0: 80050033
> > <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 CR4: 
> > 000407e0
> > <4>[196727.311625] DR0:  DR1:  DR2: 
> > 
> > <4>[196727.311669] DR3:  DR6: 0ff0 DR7: 
> > 0400
> > <4>[196727.311713] Stack:
> > <4>[196727.311733]  8854c398ecc0 8854c398ecc0 885effd23ab0 
> > 815b7f42
> > <4>[196727.311784]  88be6595bc00 8854c398ecc0  
> > 8854c398ecc0
> > <4>[196727.311834]  885effd23ad0 815b86c6 885d5a590800 
> > 8816827821c0
> > <4>[196727.311885] Call Trace:
> > <4>[196727.311907]  
> > <4>[196727.311912]  [] dst_destroy+0x32/0xe0
> > <4>[196727.311959]  [] dst_release+0x56/0x80
> > <4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
> > <4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
> > <4>[196727.312041]  [] ? ip_rcv_finish+0x360/0x360
> > <4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
> > <4>[196727.312097]  [] ? ip_rcv_finish+0x360/0x360
> > <4>[196727.312125]  [] ip_local_deliver_finish+0xb2/0x230
> > <4>[196727.312154]  [] ip_local_deliver+0x4a/0x90
> > <4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
> > <4>[196727.312212]  [] ip_rcv+0x22b/0x340
> > <4>[196727.312242]  [] ? macvlan_broadcast+0x160/0x160 
> > [macvlan]
> > <4>[196727.312275]  [] 
> > __netif_receive_skb_core+0x512/0x640
> > <4>[196727.312308]  [] ? kmem_cache_alloc+0x13b/0x150
> > <4>[196727.312338]  [] __netif_receive_skb+0x21/0x70
> > <4>[196727.312368]  [] netif_receive_skb+0x31/0xa0
> > <4>[196727.312397]  [] napi_gro_receive+0xe8/0x140
> > <4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 [ixgbe]
> > <4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
> > <4>[196727.312491]  [] net_rx_action+0x111/0x210
> > <4>[196727.312521]  [] ? __netif_receive_skb+0x21/0x70
> > <4>[196727.312552]  [] __do_softirq+0xd0/0x270
> > <4>[196727.312583]  [] call_softirq+0x1c/0x30
> > <4>[196727.312613]  [] do_softirq+0x55/0x90
> > <4>[196727.312640]  [] irq_exit+0x55/0x60
> > <4>[196727.312668]  [] do_IRQ+0x63/0xe0
> > <4>[196727.312696]  [] common_interrupt+0x6a/0x6a
> > <4>[196727.312722]  
> > <4>[196727.312727]  [] ? default_idle+0x20/0xe0
> > <4>[196727.312775]  [] arch_cpu_idle+0xf/0x20
> > <4>[196727.312803]  [] cpu_startup_entry+0xc0/0x270
> > <4>[196727.312833]  [] start_secondary+0x1f9/0x200
> > <4>[196727.312860] Code: 4a 9f e9 81 e8 13 cb 0c 00 48 8b 93 b0 00 00 00 48 
> > bf 00 02 20 00 00 00 ad de 48 8b 83 b8 00 00 00 48 be 00 01 10 00 00 00 ad 
> > de <48> 89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81
> > <1>[196727.313071] RIP  [] ipv4_dst_destroy+0x4f/0x80
> > <4>[196727.313100]  RSP 
> > <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
> > <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt
> > 
> > 
> > ... bisecting it's going to be a pain... I tried eyeballing the diffs and
> > am trying a revert or two.
> > 
> > We've hit it in .25, .26 so far. I have .27 running but not sure if it
> > crashed, so the change exists between .15 and .25.
> 
> Please try following fix, thanks for the report !
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index

Re: blackfin + dmaengine: conflicting define/enum "DMA_COMPLETE"

2014-01-17 Thread Mike Frysinger

On Saturday 11 January 2014 13:55:15 Marc Kleine-Budde wrote:
> On 01/11/2014 07:31 PM, Randy Dunlap wrote:
> > On 01/11/2014 10:09 AM, Marc Kleine-Budde wrote:
> >> Hello,
> >> 
> >> in current linux-next (and net-next) the compilation of the CAN
> >> drivers[1] with ARCH=blackfin fails with:
> >>>   CC [M]  drivers/net/can/c_can/c_can.o
> >>> In file included from linux/include/linux/netdevice.h:38:0,
> >>>                  from linux/drivers/net/can/c_can/c_can.c:32:
> >>> linux/include/linux/dmaengine.h:55:2: error: expected identifier before numeric constant
> >>> linux/include/linux/dmaengine.h: In function 'dma_async_is_complete':
> >>> linux/include/linux/dmaengine.h:1023:9: error: 'DMA_IN_PROGRESS' undeclared (first use in this function)
> >>> linux/include/linux/dmaengine.h:1023:9: note: each undeclared identifier is reported only once for each function it appears in
> >> 
> >> There are two locations where DMA_COMPLETE is defined:
> >>> arch/blackfin/mach-bf548/include/mach/defBF547.h:602:#define DMA_COMPLETE  0x8  /* DMA Complete */
> >>> arch/blackfin/mach-bf548/include/mach/defBF544.h:622:#define DMA_COMPLETE  0x8  /* DMA Complete */
> >> 
> >> and
> >> 
> >>> include/linux/dmaengine.h-enum dma_status {
> >>> include/linux/dmaengine.h:  DMA_COMPLETE,
> >>> include/linux/dmaengine.h-  DMA_IN_PROGRESS,
> >>> include/linux/dmaengine.h-  DMA_PAUSED,
> >>> include/linux/dmaengine.h-  DMA_ERROR,
> >>> include/linux/dmaengine.h-};
> >> 
> >> What's the appropriate fix for the problem?
> > 
> > arch/blackfin/mach-bf548/ needs a less generic name for its macro.
> 
> Mike, is there an in-tree user of Blackfin's DMA_COMPLETE? I cannot find
> any.

looks like those are defines for the host port peripheral on the BF54x.  
typically for peripherals we didn't have proper drivers for (like CAN and UART 
and SPI and such), we left the defines in the headers.  those in turn matched 
the manual so people coming from other Blackfin environments (and reading the 
manuals) didn't have to figure out what name the Linux headers used.

unfortunately, it leads to cases like this where the names are pretty bad.
considering the host peripheral most likely never saw any serious use, it
should be fine to delete all the bit defines in those headers related to those
registers (i see HOST_{STATUS,CONTROL,TIMEOUT}).
-mike
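
A small sketch of why the generic macro name breaks the build, and how
removing or renaming it avoids the clash. This is a stand-alone
illustration, not the Blackfin or dmaengine headers;
DEMO_HOSTDP_DMA_COMPLETE and demo_is_complete are hypothetical names.

```c
#include <assert.h>

/*
 * Stand-in for the arch header. With the original generic name,
 *     #define DMA_COMPLETE 0x8
 * the preprocessor would rewrite the first enumerator below to "0x8,",
 * which is what yields "expected identifier before numeric constant".
 * A less generic (here: hypothetical) name leaves the enum untouched.
 */
#define DEMO_HOSTDP_DMA_COMPLETE 0x8 /* DMA Complete */

/* Same enumerator list as quoted from include/linux/dmaengine.h. */
enum dma_status {
    DMA_COMPLETE,
    DMA_IN_PROGRESS,
    DMA_PAUSED,
    DMA_ERROR,
};

/* Uses the enumerator, which is only possible once the clash is gone. */
static int demo_is_complete(enum dma_status s)
{
    return s == DMA_COMPLETE;
}
```

Deleting the unused bit defines, as suggested above, has the same effect
as the rename as far as the enum is concerned.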



Re: [PATCH 11/11] ext4: add cross rename support

2014-01-17 Thread Miklos Szeredi

On Fri, Jan 17, 2014 at 11:08 PM, J. Bruce Fields  wrote:
> On Fri, Jan 17, 2014 at 11:53:07PM +1300, Michael Kerrisk (man-pages) wrote:
>> >The following additional errors are defined for renameat2():
>> >
>> >EOPNOTSUPP
>> >   The filesystem does not support a flag in flags
>>
>> This is not the usual error for an invalid bit flag. Please make it EINVAL.
>
> I agree that EINVAL makes sense for an invalid bit flag.
>
> But renameat2() can also fail when the caller passes a perfectly valid
> flags field but the paths resolve to a filesystem that doesn't support
> the RENAME_EXCHANGE operation.  EOPNOTSUPP looks more appropriate in
> that case.

OTOH, from the app's perspective, it makes little difference whether a
particular kernel doesn't support the renameat2 syscall, or it
doesn't support RENAME_FOO flag or if it does support RENAME_FOO but
not in all filesystems.  In all those cases it has to just fall back
to something supported and it doesn't matter *why* it wasn't
supported.
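
The fallback logic described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not a proposed interface:
the operations are passed in as function pointers so the sketch stays
self-contained, and demo_rename_unsupported, demo_exchange and the
demo_* stand-ins are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical classifier: all three errors are treated as "cannot do
 * it this way here", whether the syscall, the flag, or the filesystem
 * support is what is missing.
 */
static bool demo_rename_unsupported(int err)
{
    return err == ENOSYS || err == EINVAL || err == EOPNOTSUPP;
}

/* Both operations take two paths and return 0, or -1 with errno set. */
typedef int (*demo_rename_fn)(const char *a, const char *b);

static int demo_exchange(demo_rename_fn preferred, demo_rename_fn fallback,
                         const char *a, const char *b)
{
    if (preferred(a, b) == 0)
        return 0;
    if (!demo_rename_unsupported(errno))
        return -1;             /* a real failure such as ENOENT */
    return fallback(a, b);     /* unsupported for any reason */
}

/* Stand-ins used only to exercise the sketch. */
static int demo_fail_eopnotsupp(const char *a, const char *b)
{
    (void)a; (void)b;
    errno = EOPNOTSUPP;
    return -1;
}

static int demo_fail_enoent(const char *a, const char *b)
{
    (void)a; (void)b;
    errno = ENOENT;
    return -1;
}

static int demo_succeed(const char *a, const char *b)
{
    (void)a; (void)b;
    return 0;
}
```

One trade-off of folding EINVAL into the "unsupported" set is that a
genuinely invalid argument also triggers the fallback, so the fallback
path has to report its own errors.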

Thanks,
Miklos

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread Eric Dumazet

On Fri, 2014-01-17 at 17:25 -0800, dormando wrote:
> Hi,
> 
> Upgraded a few kernels to the latest 3.10 stable tree while tracking down
> a rare kernel panic, seems to have introduced a much more frequent kernel
> panic. Takes anywhere from 4 hours to 2 days to trigger:
> 
> <4>[196727.311203] general protection fault:  [#1] SMP
> <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge 
> coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog 
> ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si 
> ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
> <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
> <4>[196727.311344] Hardware name: Supermicro 
> X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
> <4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 task.ti: 
> 885e6f072000
> <4>[196727.311377] RIP: 0010:[]  [] 
> ipv4_dst_destroy+0x4f/0x80
> <4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
> <4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 RCX: 
> 0040
> <4>[196727.311423] RDX: dead00100100 RSI: dead00100100 RDI: 
> dead00200200
> <4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 R09: 
> 885d5a590800
> <4>[196727.311451] R10:  R11:  R12: 
> 
> <4>[196727.311464] R13: 81c8c280 R14:  R15: 
> 880e85ee16ce
> <4>[196727.311510] FS:  () GS:885effd2() 
> knlGS:
> <4>[196727.311554] CS:  0010 DS:  ES:  CR0: 80050033
> <4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 CR4: 
> 000407e0
> <4>[196727.311625] DR0:  DR1:  DR2: 
> 
> <4>[196727.311669] DR3:  DR6: 0ff0 DR7: 
> 0400
> <4>[196727.311713] Stack:
> <4>[196727.311733]  8854c398ecc0 8854c398ecc0 885effd23ab0 
> 815b7f42
> <4>[196727.311784]  88be6595bc00 8854c398ecc0  
> 8854c398ecc0
> <4>[196727.311834]  885effd23ad0 815b86c6 885d5a590800 
> 8816827821c0
> <4>[196727.311885] Call Trace:
> <4>[196727.311907]  
> <4>[196727.311912]  [] dst_destroy+0x32/0xe0
> <4>[196727.311959]  [] dst_release+0x56/0x80
> <4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
> <4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
> <4>[196727.312041]  [] ? ip_rcv_finish+0x360/0x360
> <4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
> <4>[196727.312097]  [] ? ip_rcv_finish+0x360/0x360
> <4>[196727.312125]  [] ip_local_deliver_finish+0xb2/0x230
> <4>[196727.312154]  [] ip_local_deliver+0x4a/0x90
> <4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
> <4>[196727.312212]  [] ip_rcv+0x22b/0x340
> <4>[196727.312242]  [] ? macvlan_broadcast+0x160/0x160 
> [macvlan]
> <4>[196727.312275]  [] __netif_receive_skb_core+0x512/0x640
> <4>[196727.312308]  [] ? kmem_cache_alloc+0x13b/0x150
> <4>[196727.312338]  [] __netif_receive_skb+0x21/0x70
> <4>[196727.312368]  [] netif_receive_skb+0x31/0xa0
> <4>[196727.312397]  [] napi_gro_receive+0xe8/0x140
> <4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 [ixgbe]
> <4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
> <4>[196727.312491]  [] net_rx_action+0x111/0x210
> <4>[196727.312521]  [] ? __netif_receive_skb+0x21/0x70
> <4>[196727.312552]  [] __do_softirq+0xd0/0x270
> <4>[196727.312583]  [] call_softirq+0x1c/0x30
> <4>[196727.312613]  [] do_softirq+0x55/0x90
> <4>[196727.312640]  [] irq_exit+0x55/0x60
> <4>[196727.312668]  [] do_IRQ+0x63/0xe0
> <4>[196727.312696]  [] common_interrupt+0x6a/0x6a
> <4>[196727.312722]  
> <4>[196727.312727]  [] ? default_idle+0x20/0xe0
> <4>[196727.312775]  [] arch_cpu_idle+0xf/0x20
> <4>[196727.312803]  [] cpu_startup_entry+0xc0/0x270
> <4>[196727.312833]  [] start_secondary+0x1f9/0x200
> <4>[196727.312860] Code: 4a 9f e9 81 e8 13 cb 0c 00 48 8b 93 b0 00 00 00 48 
> bf 00 02 20 00 00 00 ad de 48 8b 83 b8 00 00 00 48 be 00 01 10 00 00 00 ad de 
> <48> 89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81
> <1>[196727.313071] RIP  [] ipv4_dst_destroy+0x4f/0x80
> <4>[196727.313100]  RSP 
> <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
> <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt
> 
> 
> ... bisecting it's going to be a pain... I tried eyeballing the diffs and
> am trying a revert or two.
> 
> We've hit it in .25, .26 so far. I have .27 running but not sure if it
> crashed, so the change exists between .15 and .25.

Please try following fix, thanks for the report !

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 25071b48921c..78a50a22298a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1333,7 +1333,7 @@ static void ipv4_dst_destroy(struct dst_entry *dst)
 
        if (!list_empty(&rt->rt_uncached)) {
                spin_lock_bh(&rt_uncached_lock);
-
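
As a side note on the trace quoted above: the dead...100100 and
dead...200200 values in RDX and RAX resemble the poison pointers that
list_del() writes into a removed entry, which would be consistent with a
second list_del() on an entry that had already been removed. The
user-space sketch below only illustrates that mechanism; it is not the
kernel's list implementation, and the demo_* names and poison values are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_list {
    struct demo_list *next, *prev;
};

/* Hypothetical poison markers, modelled on the idea of list poisoning. */
#define DEMO_POISON1 ((struct demo_list *)(uintptr_t)0x100100)
#define DEMO_POISON2 ((struct demo_list *)(uintptr_t)0x200200)

static void demo_list_init(struct demo_list *h)
{
    h->next = h;
    h->prev = h;
}

static void demo_list_add(struct demo_list *n, struct demo_list *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* "Empty" means the entry points at itself. */
static bool demo_list_empty(const struct demo_list *h)
{
    return h->next == h;
}

/* Unlink and poison; a second call would dereference the poison. */
static void demo_list_del(struct demo_list *e)
{
    e->next->prev = e->prev;
    e->prev->next = e->next;
    e->next = DEMO_POISON1;
    e->prev = DEMO_POISON2;
}

static bool demo_is_poisoned(const struct demo_list *e)
{
    return e->next == DEMO_POISON1 || e->prev == DEMO_POISON2;
}
```

Note that a poisoned entry does not look empty, so a guard of the form
"if not empty, delete" does not protect against deleting it twice.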

[PATCH 1/2][v5] driver/memory:Move Freescale IFC driver to a common driver

2014-01-17 Thread Prabhakar Kushwaha

 The Freescale IFC controller has been used for mpc8xxx. It will be used
 for ARM-based SoCs as well. This patch moves the driver to drivers/memory
 and fixes the header file includes.

 Also remove module_platform_driver() and instead call
 platform_driver_register() from subsys_initcall() to make sure this module
 has been loaded before MTD partition parsing starts.

Signed-off-by: Prabhakar Kushwaha 
Acked-by: Arnd Bergmann 
---
Changes for v2:
- Move fsl_ifc in driver/memory

Changes for v3:
- move device tree bindings to memory

Changes for v4: Rebased to 
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git

Changes for v5: 
- Moved powerpc/Kconfig option to driver/memory


 .../{powerpc => memory-controllers}/fsl/ifc.txt|0
 arch/powerpc/Kconfig   |4 
 arch/powerpc/sysdev/Makefile   |1 -
 drivers/memory/Kconfig |4 
 drivers/memory/Makefile|1 +
 {arch/powerpc/sysdev => drivers/memory}/fsl_ifc.c  |8 ++--
 drivers/mtd/nand/fsl_ifc_nand.c|2 +-
 .../include/asm => include/linux}/fsl_ifc.h|0
 8 files changed, 12 insertions(+), 8 deletions(-)
 rename Documentation/devicetree/bindings/{powerpc => memory-controllers}/fsl/ifc.txt (100%)
 rename {arch/powerpc/sysdev => drivers/memory}/fsl_ifc.c (98%)
 rename {arch/powerpc/include/asm => include/linux}/fsl_ifc.h (100%)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/ifc.txt b/Documentation/devicetree/bindings/memory-controllers/fsl/ifc.txt
similarity index 100%
rename from Documentation/devicetree/bindings/powerpc/fsl/ifc.txt
rename to Documentation/devicetree/bindings/memory-controllers/fsl/ifc.txt
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b44b52c..83fb8b3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -725,10 +725,6 @@ config FSL_LBC
  controller.  Also contains some common code used by
  drivers for specific local bus peripherals.
 
-config FSL_IFC
-   bool
-depends on FSL_SOC
-
 config FSL_GTM
bool
depends on PPC_83xx || QUICC_ENGINE || CPM2
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index f67ac90..afbcc37 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -21,7 +21,6 @@ obj-$(CONFIG_FSL_SOC) += fsl_soc.o fsl_mpic_err.o
 obj-$(CONFIG_FSL_PCI)  += fsl_pci.o $(fsl-msi-obj-y)
 obj-$(CONFIG_FSL_PMC)  += fsl_pmc.o
 obj-$(CONFIG_FSL_LBC)  += fsl_lbc.o
-obj-$(CONFIG_FSL_IFC)  += fsl_ifc.o
 obj-$(CONFIG_FSL_GTM)  += fsl_gtm.o
 obj-$(CONFIG_FSL_85XX_CACHE_SRAM)  += fsl_85xx_l2ctlr.o fsl_85xx_cache_sram.o
 obj-$(CONFIG_SIMPLE_GPIO)  += simple_gpio.o
diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index 29a11db..b33bb0e 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -50,4 +50,8 @@ config TEGRA30_MC
  analysis, especially for IOMMU/SMMU(System Memory Management
  Unit) module.
 
+config FSL_IFC
+   bool
+depends on FSL_SOC
+
 endif
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 969d923..f2bf25c 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -6,6 +6,7 @@ ifeq ($(CONFIG_DDR),y)
 obj-$(CONFIG_OF)   += of_memory.o
 endif
 obj-$(CONFIG_TI_EMIF)  += emif.o
+obj-$(CONFIG_FSL_IFC)  += fsl_ifc.o
 obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
 obj-$(CONFIG_TEGRA20_MC)   += tegra20-mc.o
 obj-$(CONFIG_TEGRA30_MC)   += tegra30-mc.o
diff --git a/arch/powerpc/sysdev/fsl_ifc.c b/drivers/memory/fsl_ifc.c
similarity index 98%
rename from arch/powerpc/sysdev/fsl_ifc.c
rename to drivers/memory/fsl_ifc.c
index d7fc722..135a950 100644
--- a/arch/powerpc/sysdev/fsl_ifc.c
+++ b/drivers/memory/fsl_ifc.c
@@ -30,8 +30,8 @@
 #include 
 #include 
 #include 
+#include 
 #include 
-#include 
 
 struct fsl_ifc_ctrl *fsl_ifc_ctrl_dev;
 EXPORT_SYMBOL(fsl_ifc_ctrl_dev);
@@ -299,7 +299,11 @@ static struct platform_driver fsl_ifc_ctrl_driver = {
.remove  = fsl_ifc_ctrl_remove,
 };
 
-module_platform_driver(fsl_ifc_ctrl_driver);
+static int __init fsl_ifc_init(void)
+{
+   return platform_driver_register(&fsl_ifc_ctrl_driver);
+}
+subsys_initcall(fsl_ifc_init);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Freescale Semiconductor");
diff --git a/drivers/mtd/nand/fsl_ifc_nand.c b/drivers/mtd/nand/fsl_ifc_nand.c
index 4335577..865b323 100644
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -30,7 +30,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #define FSL_IFC_V1_1_0 0x0101
 #define ERR_BYTE   0xFF /* Value returned for read
diff --git a/arch/powerpc/include/asm/fsl_ifc.h b/include/linux/fsl_ifc.h
similarity index 100%
rename from arch/powerpc/include/asm/fsl_ifc.h
rename

[PATCH 2/2][v5] powerpc/config: Enable memory driver

2014-01-17 Thread Prabhakar Kushwaha

As the Freescale IFC controller driver has been moved to drivers/memory,
enable the memory driver in the powerpc configs.

Signed-off-by: Prabhakar Kushwaha 
---
 Changes for v2: Sending as it is
 Changes for v3: Sending as it is
 Changes for v4: Rebased to 
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git
 changes for v5:
- Rebased to branch next of 
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git

 arch/powerpc/configs/corenet32_smp_defconfig |1 +
 arch/powerpc/configs/corenet64_smp_defconfig |1 +
 arch/powerpc/configs/mpc85xx_defconfig   |1 +
 arch/powerpc/configs/mpc85xx_smp_defconfig   |1 +
 4 files changed, 4 insertions(+)

diff --git a/arch/powerpc/configs/corenet32_smp_defconfig b/arch/powerpc/configs/corenet32_smp_defconfig
index bbd794d..087d437 100644
--- a/arch/powerpc/configs/corenet32_smp_defconfig
+++ b/arch/powerpc/configs/corenet32_smp_defconfig
@@ -142,6 +142,7 @@ CONFIG_RTC_DRV_DS3232=y
 CONFIG_RTC_DRV_CMOS=y
 CONFIG_UIO=y
 CONFIG_STAGING=y
+CONFIG_MEMORY=y
 CONFIG_VIRT_DRIVERS=y
 CONFIG_FSL_HV_MANAGER=y
 CONFIG_EXT2_FS=y
diff --git a/arch/powerpc/configs/corenet64_smp_defconfig b/arch/powerpc/configs/corenet64_smp_defconfig
index 63508dd..25b03f8 100644
--- a/arch/powerpc/configs/corenet64_smp_defconfig
+++ b/arch/powerpc/configs/corenet64_smp_defconfig
@@ -129,6 +129,7 @@ CONFIG_EDAC=y
 CONFIG_EDAC_MM_EDAC=y
 CONFIG_DMADEVICES=y
 CONFIG_FSL_DMA=y
+CONFIG_MEMORY=y
 CONFIG_EXT2_FS=y
 CONFIG_EXT3_FS=y
 CONFIG_ISO9660_FS=m
diff --git a/arch/powerpc/configs/mpc85xx_defconfig b/arch/powerpc/configs/mpc85xx_defconfig
index 83d3550..cba638c 100644
--- a/arch/powerpc/configs/mpc85xx_defconfig
+++ b/arch/powerpc/configs/mpc85xx_defconfig
@@ -216,6 +216,7 @@ CONFIG_RTC_DRV_CMOS=y
 CONFIG_RTC_DRV_DS1307=y
 CONFIG_DMADEVICES=y
 CONFIG_FSL_DMA=y
+CONFIG_MEMORY=y
 # CONFIG_NET_DMA is not set
 CONFIG_EXT2_FS=y
 CONFIG_EXT3_FS=y
diff --git a/arch/powerpc/configs/mpc85xx_smp_defconfig b/arch/powerpc/configs/mpc85xx_smp_defconfig
index 4b68629..e315b8a 100644
--- a/arch/powerpc/configs/mpc85xx_smp_defconfig
+++ b/arch/powerpc/configs/mpc85xx_smp_defconfig
@@ -217,6 +217,7 @@ CONFIG_RTC_DRV_CMOS=y
 CONFIG_RTC_DRV_DS1307=y
 CONFIG_DMADEVICES=y
 CONFIG_FSL_DMA=y
+CONFIG_MEMORY=y
 # CONFIG_NET_DMA is not set
 CONFIG_EXT2_FS=y
 CONFIG_EXT3_FS=y
-- 
1.7.9.5




math_state_restore and kernel_fpu_end disable interrupts?

2014-01-17 Thread Nate Eldredge

In trying to track down a bug (see below), I noticed that 
math_state_restore() in arch/x86/kernel/traps.c appears to unconditionally 
disable interrupts when called.  Is this intended behavior or a bug?


The bug in question is triggered by dumping core on an ecryptfs file 
system when aesni-intel is loaded.  (See 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1265841 for the 
original report.)  The symptom is that __find_get_block() gets called with 
interrupts disabled, causing a BUG().  I tried to find where interrupts 
were getting disabled and wound up in aes_set_key_common() in 
arch/x86/crypto/aesni-intel_glue.c.  It calls aesni_set_key(), and since 
that uses the FPU, it wraps it in kernel_fpu_begin()/kernel_fpu_end(). 
But kernel_fpu_end() calls math_state_restore() which disables interrupts. 
I've verified that interrupts are still enabled just before the call to 
kernel_fpu_end().


math_state_restore() does:

local_irq_enable();
init_fpu(tsk);
local_irq_disable();

with the result that interrupts are disabled when it finishes, even if 
they were enabled to begin with.  That looks strange to me; are we sure it 
shouldn't just save and restore the interrupt flag?  Or are we not 
supposed to call it with interrupts enabled?
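
The difference between the two patterns can be shown with a small
simulation. This is a user-space sketch only: demo_irqs_enabled stands in
for the CPU interrupt flag, and none of the demo_* names are kernel
functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the CPU interrupt flag; purely a simulation. */
static bool demo_irqs_enabled = true;

static void demo_irq_enable(void)
{
    demo_irqs_enabled = true;
}

static void demo_irq_disable(void)
{
    demo_irqs_enabled = false;
}

/* Pattern described above: enable, do the work, always disable. */
static void demo_restore_unconditional(void)
{
    demo_irq_enable();
    /* ... work that may need interrupts enabled ... */
    demo_irq_disable();
}

/* Alternative being asked about: leave the flag as the caller had it. */
static void demo_restore_preserving(void)
{
    bool was_enabled = demo_irqs_enabled;

    demo_irq_enable();
    /* ... same work ... */
    demo_irqs_enabled = was_enabled;
}
```

With the first pattern the caller always comes back with the flag cleared,
whatever it was on entry; with the second the entry state survives the
call. Whether the first is intentional in math_state_restore() is exactly
the open question here.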


Given the intimidating comment preceding math_state_restore() ("Don't 
touch unless you *really* know how it works"), it's entirely possible I am 
missing something...


Any suggestions appreciated.  Thanks!

--
Nate Eldredge
n...@thatsmathematics.com


Re: [PATCH net] net: core: orphan frags before queuing to slow qdisc

2014-01-17 Thread Jason Wang


On 01/17/2014 10:28 PM, Eric Dumazet wrote:

On Fri, 2014-01-17 at 17:42 +0800, Jason Wang wrote:

Many qdiscs can queue a packet for a long time, this will lead an issue
with zerocopy skb. It means the frags will not be orphaned in an expected
short time, this breaks the assumption that virtio-net will transmit the
packet in time.

So if guest packets were queued through such kind of qdisc and hit the
limitation of the max pending packets for virtio/vhost. All packets that
go to another destination from guest will also be blocked.

A case for reproducing the issue:

- Boot two VMs and connect them to the same bridge kvmbr.
- Setup tbf with a very low rate/burst on eth0 which is a port of kvmbr.
- Let VM1 send lots of packets thorugh eth0
- After a while, VM1 is unable to send any packets out since the number of
   pending packets (queued to tbf) were exceeds the limitation of vhost/virito

So whats the problem ? If the limit is low, you cannot sent packets.


It was just an extreme case. The problem is if zercopy packets of vm1 
were throttled by qdisc in eth0, probably all packets from vm1 were 
throttled even if it was not go through eth0.

Solution : increase the limit, or tell the vm to lower its rate.

Oh wait, are you bitten because you did some prior skb_orphan() to allow
the vm to send unlimited number of skbs ???



The problem is sndbuf were defaulted to INT_MAX to prevent a similar 
issue for non-zerocopy packets. For zerocopy, only after the frags were 
orphaned can vhost notify the completion of tx for virtio-net. So 
INT_MAX sndbuf is not enough.

Solve this issue by orphaning the frags before queuing it to a slow qdisc (the
one without TCQ_F_CAN_BYPASS).

Why orphaning the frags only solves the problem ? A skb without zerocopy
frags should also be blocked for a while.


It's ok for non-zerocopy packet to be blocked since VM1 thought the 
packets has been sent instead of pending in the virtqueue. So VM1 can 
still send packet to other destination.

Seriously, lets admit this zero copy stuff is utterly broken.


TCQ_F_CAN_BYPASS is not enough. Some NICs have separate queues with
strict priorities.



Yes, but looks less serious than traffic shaping.

It seems to me that you are pushing to use FIFO (the only qdisc setting
TCQ_F_CAN_BYPASS), by adding yet another test in fast path (I do not
know how we can still call it a fast path), while we already have smart
qdisc to avoid the inherent HOL and unfairness problems of FIFO.



It was just a workaround like the case of sndbuf before we had a better 
solution. So looks like using sfq or fq in guest can mitigate the issue?

Cc: Michael S. Tsirkin
Signed-off-by: Jason Wang
---
  net/core/dev.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index 0ce469e..1209774 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2700,6 +2700,12 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
contended = qdisc_is_running(q);
if (unlikely(contended))
spin_lock(&q->busylock);
+   if (!(q->flags & TCQ_F_CAN_BYPASS) &&
+   unlikely(skb_orphan_frags(skb, GFP_ATOMIC))) {
+   kfree_skb(skb);
+   rc = NET_XMIT_DROP;
+   goto out;
+   }

Are you aware that copying stuff takes time ?

If yes, why is it done after taking the busylock spinlock ?



Yes and it should be done outside the spinlock.


spin_lock(root_lock);
if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
@@ -2739,6 +2745,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
}
}
spin_unlock(root_lock);
+out:
if (unlikely(contended))
spin_unlock(&q->busylock);
return rc;






Re: [PATCH 1/2] USB: at91: fix the number of endpoint parameter

2014-01-17 Thread Jean-Christophe PLAGNIOL-VILLARD

On 10:59 Fri 17 Jan , Bo Shen wrote:
> In sama5d3 SoC, there are 16 endpoints. As the USBA_NR_ENDPOINTS
> is only 7. So, fix it for sama5d3 SoC using the udc->num_ep.
> 
> Signed-off-by: Bo Shen 
> ---
> 
>  drivers/usb/gadget/atmel_usba_udc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/gadget/atmel_usba_udc.c 
> b/drivers/usb/gadget/atmel_usba_udc.c
> index 2cb52e0..7e67a81 100644
> --- a/drivers/usb/gadget/atmel_usba_udc.c
> +++ b/drivers/usb/gadget/atmel_usba_udc.c
> @@ -1670,7 +1670,7 @@ static irqreturn_t usba_udc_irq(int irq, void *devid)
>   if (ep_status) {
>   int i;
>  
> - for (i = 0; i < USBA_NR_ENDPOINTS; i++)
> + for (i = 0; i < udc->num_ep; i++)

No, the limit needs to be specified in the driver as a checkpoint, selected by
the compatible string or platform driver ID.

Best Regards,
J.
>   if (ep_status & (1 << i)) {
>   if (ep_is_control(&udc->usba_ep[i]))
>   usba_control_irq(udc, &udc->usba_ep[i]);
> -- 
> 1.8.5.2
> 

[PATCH] gpio: mcp23s08: fix casting caused build warning

2014-01-17 Thread SeongJae Park

Signed-off-by: SeongJae Park 
---
 drivers/gpio/gpio-mcp23s08.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-mcp23s08.c b/drivers/gpio/gpio-mcp23s08.c
index b16401e..97198ef 100644
--- a/drivers/gpio/gpio-mcp23s08.c
+++ b/drivers/gpio/gpio-mcp23s08.c
@@ -640,7 +640,7 @@ static int mcp23s08_probe(struct spi_device *spi)
 
match = of_match_device(of_match_ptr(mcp23s08_spi_of_match), &spi->dev);
if (match) {
-   type = (int)match->data;
+   type = (int)(uintptr_t)match->data;
status = of_property_read_u32(spi->dev.of_node,
"microchip,spi-present-mask", &spi_present_mask);
if (status) {
-- 
1.8.3.2
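
For readers unfamiliar with this kind of warning: `match->data` is a pointer, and on 64-bit builds casting a pointer straight to a 32-bit `int` presumably triggers a "cast from pointer to integer of different size" diagnostic. The sketch below is a standalone userspace illustration, using a hypothetical `struct match_entry` rather than the real `struct of_device_id`, of the round trip through `uintptr_t` that the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a match-table entry that carries a small
 * integer ID in a pointer-sized field. */
struct match_entry {
    const void *data;
};

/* Store a small integer ID in the pointer-sized field. */
struct match_entry make_entry(int type)
{
    struct match_entry e = { .data = (const void *)(uintptr_t)type };
    return e;
}

/* Recover it: pointer -> uintptr_t (same width) -> int (explicit narrowing). */
int entry_type(const struct match_entry *e)
{
    return (int)(uintptr_t)e->data;
}
```

Going through `uintptr_t` first makes the width change an explicit integer-to-integer conversion, which is why the compiler no longer complains.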


Re: [PATCH RFC 4/6] net: rfkill: gpio: add device tree support

2014-01-17 Thread Chen-Yu Tsai

On Sat, Jan 18, 2014 at 7:11 AM, Linus Walleij  wrote:
> On Fri, Jan 17, 2014 at 6:43 PM, Chen-Yu Tsai  wrote:
>> On Sat, Jan 18, 2014 at 12:47 AM, Arnd Bergmann  wrote:
>
>>>> +- NAME_shutdown-gpios  : GPIO phandle to shutdown control
>>>> + (phandle must be the second)
>>>> +- NAME_reset-gpios : GPIO phandle to reset control
>>>> +
>>>> +NAME must match the rfkill-name property. NAME_shutdown-gpios or
>>>> +NAME_reset-gpios, or both, must be defined.
>>>> +
>>>
>>> I don't understand this part. Why do you include the name in the
>>> gpios property, rather than just hardcoding the property strings
>>> to "shutdown-gpios" and "reset-gpios"?
>>
>> This quirk is a result of how gpiod_get_index implements device tree
>> lookup.
>
> Why can't it just have a single property "gpios", where the first
> element is the reset GPIO and the second is the shutdown GPIO?
>
> rfkill-gpio does this:
>
> gpio = devm_gpiod_get_index(&pdev->dev, rfkill->reset_name, 0);
> gpio = devm_gpiod_get_index(&pdev->dev, rfkill->shutdown_name, 1);
>
> The passed con ID name parameter is only there for the device
> tree case it seems. (ACPI ignores it.) So what about you just
> don't pass it at all and patch it to do like this instead:
>
> gpio = devm_gpiod_get_index(&pdev->dev, NULL, 0);
> gpio = devm_gpiod_get_index(&pdev->dev, NULL, 1);

I'd like that. It's much cleaner.

> Heikki, are you OK with this change?
>
> I think this is actually necessary if the ACPI and DT unification
> pipe dream shall limp forward, we cannot have arguments passed
> that have a semantic effect on DT but not on ACPI... Drivers
> that are supposed to use both ACPI and DT will always
> have to pass NULL as con ID.
>
>> If con_id is given, it is prepended to "gpios" as the property string.
>> con_id is also used as the label passed to gpiod_request, which is
>> then shown in /sys/kernel/debug/gpio.
>
> If your problem  is really what turns up in debugfs, then we need
> to figure out a way to label gpios outside of the *gpiod_get* calls.

Let's add a gpiod_set_label call. Currently there's a desc_set_label
in gpiolib, which is static inlined. We can either rename and promote
it to non-static, or add a new wrapping function.

> The string passed in *gpiod_get* is a "connection ID" not a proper
> name for the GPIO.

I see. Perhaps we should not pass this to gpiod_request as the label,
or add a comment stating consumers can use the new gpiod_set_label call
to change it.
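
As a rough illustration of the naming rule discussed in this thread (con_id prepended to "gpios"), the following userspace sketch builds the property string the way the binding text implies, e.g. "NAME_shutdown-gpios". The helper name `gpio_prop_name` and the "-" separator handling are assumptions for illustration; this is not the gpiolib source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the device tree property name for a GPIO lookup.
 * With a con_id the result is "<con_id>-gpios"; with NULL it is plain "gpios".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int gpio_prop_name(char *buf, size_t len, const char *con_id)
{
    int n;

    if (con_id)
        n = snprintf(buf, len, "%s-gpios", con_id);
    else
        n = snprintf(buf, len, "gpios");

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Passing NULL gives every consumer the same "gpios" property and leaves only the index to tell the lines apart, which matches the two-element layout Linus proposes.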


Cheers,
ChenYu

[GIT PULL][PATCH] tracing: Fix buggered tee(2) on tracing_pipe

2014-01-17 Thread Steven Rostedt


Linus,

Al Viro has concerns with the trace_pipe release method calling
__free_page() instead of using the generic_pipe_buf_release() which
does a page_cache_release(). Looking at the differences between
__free_page() and page_cache_release() I do not think there's a real
issue here. But to be on the safe side, and at least to be symmetric
with generic_pipe_buf_get(), this patch is fine to add.

Please pull the latest trace-fixes-v3.13-rc8 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v3.13-rc8

Tag SHA1: 5a8329936de8662773042fa76bc3c3d0c48fe5c3
Head SHA1: c50b3d58415b1f46bdb044fbd4e807cda49f0aa2


Al Viro (1):
  tracing: Fix buggered tee(2) on tracing_pipe


 kernel/trace/trace.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)
---
commit c50b3d58415b1f46bdb044fbd4e807cda49f0aa2
Author: Al Viro 
Date:   Fri Jan 17 07:53:39 2014 -0500

tracing: Fix buggered tee(2) on tracing_pipe

In kernel/trace/trace.c we have this:
static void tracing_pipe_buf_release(struct pipe_inode_info *pipe,
 struct pipe_buffer *buf)
{
__free_page(buf->page);
}
static const struct pipe_buf_operations tracing_pipe_buf_ops = {
.can_merge  = 0,
.map= generic_pipe_buf_map,
.unmap  = generic_pipe_buf_unmap,
.confirm= generic_pipe_buf_confirm,
.release= tracing_pipe_buf_release,
.steal  = generic_pipe_buf_steal,
.get= generic_pipe_buf_get,
};
with
void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
{
page_cache_get(buf->page);
}

and I don't see anything that would've prevented tee(2) called on the pipe
that got stuff spliced into it from that sucker.  ->ops->get() will be
called, then buf gets copied into target pipe's ->bufs[] and eventually
readers get to both copies of the buffer.  With
get_page(page)
look at that page
__free_page(page)
look at that page
__free_page(page)
which is not a good thing, to put it mildly.  AFAICS, that ought to use
the normal generic_pipe_buf_release() (aka page_cache_release(buf->page)),
shouldn't it?

[
 SDR - As trace_pipe just allocates the page with alloc_page(GFP_KERNEL)
  and doesn't do anything special with it (no LRU logic), the __free_page()
  should be fine, as it won't actually free a page with a reference count.
  Maybe there's a chance to leak memory? Anyway, this change is at a minimum
  good for being symmetric with generic_pipe_buf_get, so it is fine to add.
]

Signed-off-by: Al Viro 
[ SDR - Removed no longer used tracing_pipe_buf_release ]
Signed-off-by: Steven Rostedt 

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9d20cd9..8f86143 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4212,12 +4212,6 @@ out:
return sret;
 }
 
-static void tracing_pipe_buf_release(struct pipe_inode_info *pipe,
-struct pipe_buffer *buf)
-{
-   __free_page(buf->page);
-}
-
 static void tracing_spd_release_pipe(struct splice_pipe_desc *spd,
 unsigned int idx)
 {
@@ -4229,7 +4223,7 @@ static const struct pipe_buf_operations tracing_pipe_buf_ops = {
.map= generic_pipe_buf_map,
.unmap  = generic_pipe_buf_unmap,
.confirm= generic_pipe_buf_confirm,
-   .release= tracing_pipe_buf_release,
+   .release= generic_pipe_buf_release,
.steal  = generic_pipe_buf_steal,
.get= generic_pipe_buf_get,
 };
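
The pairing argument in the commit message can be illustrated with a toy reference count in userspace C. Everything here (`struct toy_page`, `toy_get`, `toy_put`) is hypothetical and only models the idea that each ->get() should be balanced by exactly one reference-dropping ->release(); it makes no claim about what __free_page() or page_cache_release() do internally.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page {
    int refcount;   /* number of current holders */
    bool freed;     /* set once the last reference is dropped */
};

/* Take an extra reference, as a ->get() callback would when tee(2)
 * duplicates a pipe buffer. */
void toy_get(struct toy_page *p)
{
    p->refcount++;
}

/* Drop one reference; the page is "freed" only when the count hits zero. */
void toy_put(struct toy_page *p)
{
    assert(p->refcount > 0 && !p->freed);
    if (--p->refcount == 0)
        p->freed = true;
}
```

With one `toy_get()` for the tee'd copy, the page survives the first reader's `toy_put()` and is released by the second, which is the symmetry the patch restores by using the generic release callback.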

Re: [PATCH] usbcore: fix BABBLE failed enumeration of legacy USB2 devices on USB3 bus

2014-01-17 Thread Jérôme Carretero

Hi,


I encountered the same problem with another device.
If possible, it would be nice to pick Marius's patch for stable
kernels (tested here on v3.12.6).

There are chances that MacOSX is affected by a similar issue,
so if anybody has friends there...


Thanks,

-- 
Jérôme


On Wed, 8 Jan 2014 04:00:22 +
"Marius  Silaghi"  wrote:

> 
> Great observation,
> Sarah located a patch that is queued for the 3.14 kernel and that has
> a similar effect. So future kernels could work with that one as well.
> 
> The patch I provided (being very small and safe) can still be
> suggested for maintainers of older kernels in various long-term
> maintained distributions (if you know who is doing that).
> 
> Here are some versions of the patch I made for current kernels:
> 
> The next one was tested on Ubuntu, applied to the source for
> 3.5.0-17-generic (Ubuntu)
> 
> --- linux-3.5.0/drivers/usb/core/hub.c.orig   2014-01-07 18:16:01.997031650 -0500
> +++ linux-3.5.0/drivers/usb/core/hub.c        2014-01-07 18:19:41.617022465 -0500
> @@ -4043,7 +4043,11 @@
>   break;
>   }
>  
> - retval = usb_get_device_descriptor(udev, 8);
> + if (!USE_NEW_SCHEME(retry_counter))
> +   retval = usb_get_device_descriptor(udev, 8);
> + else
> +   retval = usb_get_device_descriptor(udev,
> + sizeof(struct usb_device_descriptor));
>   if (retval < 8) {
>   dev_err(&udev->dev,
>   "device descriptor read/8, error %d\n",
> 
> 
> 
> For kernel 3.9.0-0.4
> 
> --- linux-3.5.0/drivers/usb/core/hub.c.orig   2014-01-07 18:16:01.997031650 -0500
> +++ linux-3.5.0/drivers/usb/core/hub.c        2014-01-07 18:19:41.617022465 -0500
> @@ -4043,7 +4043,11 @@
>   break;
>   }
>  
> - retval = usb_get_device_descriptor(udev, 8);
> + if (!USE_NEW_SCHEME(retry_counter))
> +   retval = usb_get_device_descriptor(udev, 8);
> + else
> +   retval = usb_get_device_descriptor(udev,
> + sizeof(struct usb_device_descriptor));
>   if (retval < 8) {
>   dev_err(&udev->dev,
>   "device descriptor read/8, error %d\n",
> 
> 
> 
> For kernel 3.10.0-5.15
> 
> --- ubuntu-saucy/drivers/usb/core/hub.c.orig  2014-01-07 16:52:41.300835262 -0500
> +++ ubuntu-saucy/drivers/usb/core/hub.c       2014-01-07 16:54:53.612829730 -0500
> @@ -4126,8 +4126,11 @@
>   if (USE_NEW_SCHEME(retry_counter) && !(hcd->driver->flags & HCD_USB3))
>   break;
>   }
> -
> - retval = usb_get_device_descriptor(udev, 8);
> + if (!USE_NEW_SCHEME(retry_counter))
> +   retval = usb_get_device_descriptor(udev, 8);
> + else
> +   retval = usb_get_device_descriptor(udev,
> + sizeof(struct usb_device_descriptor));
>   if (retval < 8) {
>   if (retval != -ENODEV)
>   dev_err(&udev->dev,
> 
> 
> For kernel 3.11
> --- linux-3.11/drivers/usb/core/hub.c.orig    2014-01-07 16:57:16.352823760 -0500
> +++ linux-3.11/drivers/usb/core/hub.c         2014-01-07 16:58:10.168821508 -0500
> @@ -4161,7 +4161,11 @@
>   break;
>   }
>  
> - retval = usb_get_device_descriptor(udev, 8);
> + if (!USE_NEW_SCHEME(retry_counter))
> +   retval = usb_get_device_descriptor(udev, 8);
> + else
> +   retval = usb_get_device_descriptor(udev,
> + sizeof(struct usb_device_descriptor));
>   if (retval < 8) {
>   if (retval != -ENODEV)
>   dev_err(&udev->dev,
> 
> 
> 
> From: linux-usb-ow...@vger.kernel.org [linux-usb-ow...@vger.kernel.org]
> on behalf of Greg Kroah-Hartman [gre...@linuxfoundation.org]
> Sent: Tuesday, January 07, 2014 19:32
> To: Marius  Silaghi
> Cc: Sarah Sharp; linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
> Alan Stern; Lan Tianyu; Xenia Ragiadakou; Jiri Kosina
> Subject: Re: [PATCH] usbcore: fix BABBLE failed enumeration of legacy
> USB2 devices on USB3 bus
> 
> On Tue, Dec 24, 2013 at 04:19:18AM +, Marius  Silaghi wrote:
> > From: Marius C Silaghi 
> >
> > This patch is generated against the last kernel version in the
> > github kernel repository.
> 
> We work off of the git.kernel.org trees, not github :)
> 
> 
> > Some older families of USB2 cameras (STC-XUSB) do not support
> > querying only the first 8 bytes of their device descriptor and
> > therefore fail at enumeration on USB3 HCDs, with babble error -75
> > as they send more than the expected 8 bytes. The proposed patch
> > extends the mechanism used for non USB3 HCDs in the first part of
> > the

Re: [PATCH 04/20] ARM64 / ACPI: Introduce arm_core.c and its related head file

2014-01-17 Thread Hanjun Guo

On 2014-1-17 22:12, Will Deacon wrote:
> On Fri, Jan 17, 2014 at 12:24:58PM +, Hanjun Guo wrote:
>> Introduce arm_core.c and its related head file, after this patch,
>> we can get ACPI tables from firmware on ARM64 now.
>>
>> Signed-off-by: Al Stone 
>> Signed-off-by: Graeme Gregory 
>> Signed-off-by: Hanjun Guo 
> 
> [...]
> 
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index bd9bbd0..2210353 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -41,6 +41,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #include 
>>  #include 
>> @@ -225,6 +226,11 @@ void __init setup_arch(char **cmdline_p)
>>  
>>  arm64_memblock_init();
>>  
>> +/* Parse the ACPI tables for possible boot-time configuration */
>> +acpi_boot_table_init();
>> +early_acpi_boot_init();
>> +acpi_boot_init();
> 
> Do we really need *three* back-to-back calls for ACPI to initialise?

Sorry, my colleague Graeme had already integrated them into one function,
but I forgot to merge that into this patch. My bad; I will update it in
the next version.

Thanks
Hanjun


Re: [PATCH 1/3] ACPI / idle: Move idle_boot_override out of the arch directory

2014-01-17 Thread Hanjun Guo

On 2014-1-18 11:45, Hanjun Guo wrote:
> On 2014-1-17 20:06, Sudeep Holla wrote:
>> On 17/01/14 02:03, Hanjun Guo wrote:
>>> Move idle_boot_override out of the arch directory to be a single enum
>>> including both platforms values, this will make it rather easier to
>>> avoid ifdefs around which definitions are for which processor in
>>> generally used ACPI code.
>>>
>>> IDLE_FORCE_MWAIT for IA64 is not used anywhere, so remove it.
>>>
>>> No functional change in this patch.
>>>
>>> Suggested-by: Alan 
>>> Signed-off-by: Hanjun Guo 
>>> ---
[...]
>>> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
>>> index 03e235ad..e324561 100644
>>> --- a/include/linux/cpu.h
>>> +++ b/include/linux/cpu.h
>>> @@ -220,6 +220,14 @@ void cpu_idle(void);
>>>  
>>>  void cpu_idle_poll_ctrl(bool enable);
>>>  
>>> +enum idle_boot_override {
>>> +   IDLE_NO_OVERRIDE = 0,
>>> +   IDLE_HALT,
>>> +   IDLE_NOMWAIT,
>>> +   IDLE_POLL,
>>> +   IDLE_POWERSAVE_OFF
>>> +};
>>> +
>>
>> I do understand the idea behind this change, but IMO HALT and MWAIT are x86
>> specific and may not make sense for other architectures.
> 
> yes, this is the strange part, the value is arch-dependent.
> 
>>
>> It will also require every architecture using ACPI to export
>> boot_option_idle_override which may not be really required.
> 
> So how about dropping this patch and moving the boot_option_idle_override
> related code into the arch directory, such as arch/x86/acpi/boot.c for
> x86?

The general idea is that we can move all the arch-dependent code
in the ACPI driver to the arch directory, and so make the code in
drivers/acpi/ arch independent.

Thanks
Hanjun

Re: [PATCH 1/2][v3] driver/memory:Move Freescale IFC driver to a common driver

2014-01-17 Thread Prabhakar Kushwaha



On 1/17/2014 10:38 PM, Kumar Gala wrote:

On Jan 15, 2014, at 11:42 PM, Prabhakar Kushwaha  
wrote:


Freescale IFC controller has been used for mpc8xxx. It will be used
for ARM-based SoC as well. This patch moves the driver to driver/memory
and fix the header file includes.

Also remove module_platform_driver() and  instead call
platform_driver_register() from subsys_initcall() to make sure this module
has been loaded before MTD partition parsing starts.

Signed-off-by: Prabhakar Kushwaha 
Acked-by: Arnd Bergmann 
---
Changes for v2:
- Move fsl_ifc in driver/memory

Changes for v3:
- move device tree bindings to memory

.../{powerpc => memory-controllers}/fsl/ifc.txt|0
arch/powerpc/sysdev/Makefile   |1 -
drivers/memory/Makefile|1 +
{arch/powerpc/sysdev => drivers/memory}/fsl_ifc.c  |8 ++--
drivers/mtd/nand/fsl_ifc_nand.c|2 +-
.../include/asm => include/linux}/fsl_ifc.h|0
6 files changed, 8 insertions(+), 4 deletions(-)
rename Documentation/devicetree/bindings/{powerpc => 
memory-controllers}/fsl/ifc.txt (100%)
rename {arch/powerpc/sysdev => drivers/memory}/fsl_ifc.c (98%)
rename {arch/powerpc/include/asm => include/linux}/fsl_ifc.h (100%)

The Kconfig option for FSL_IFC should move into drivers/memory/Kconfig


Thanks, Kumar, for taking the time to review this patch.

You are correct. I was checking sysdev/Kconfig, but it is defined in 
powerpc/Kconfig.

I missed it :)

Regards,
Prabhakar




Re: [Patch v1 2/3] ACPI, PCI: reuse ACPI hotplug framework to support PCI host bridge hotplug

2014-01-17 Thread Jiang Liu

Hi yinghai,
Sorry for the noise. I hadn't noticed Rafael's work,
so I generated this patchset when I encountered this issue
while testing PCI host bridge hotplug. It should achieve
the same goal.
Thanks!
Gerry

On 2014/1/18 11:23, Yinghai Lu wrote:
> ACPI / hotplug: Make ACPI PCI root hotplug use common hotplug code

[PATCH] clk: Export more clk-provider functions

2014-01-17 Thread Stephen Boyd

Allow drivers to be compiled as modules by exporting more clock
provider functions.

Reported-by: kbuild test robot 
Signed-off-by: Stephen Boyd 
---
 drivers/clk/clk.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 0b27b543dacf..2e83cf643db0 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -547,16 +547,19 @@ struct clk_hw *__clk_get_hw(struct clk *clk)
 {
return !clk ? NULL : clk->hw;
 }
+EXPORT_SYMBOL_GPL(__clk_get_hw);
 
 u8 __clk_get_num_parents(struct clk *clk)
 {
return !clk ? 0 : clk->num_parents;
 }
+EXPORT_SYMBOL_GPL(__clk_get_num_parents);
 
 struct clk *__clk_get_parent(struct clk *clk)
 {
return !clk ? NULL : clk->parent;
 }
+EXPORT_SYMBOL_GPL(__clk_get_parent);
 
 struct clk *clk_get_parent_by_index(struct clk *clk, u8 index)
 {
@@ -570,6 +573,7 @@ struct clk *clk_get_parent_by_index(struct clk *clk, u8 index)
else
return clk->parents[index];
 }
+EXPORT_SYMBOL_GPL(clk_get_parent_by_index);
 
 unsigned int __clk_get_enable_count(struct clk *clk)
 {
@@ -601,6 +605,7 @@ unsigned long __clk_get_rate(struct clk *clk)
 out:
return ret;
 }
+EXPORT_SYMBOL_GPL(__clk_get_rate);
 
 unsigned long __clk_get_flags(struct clk *clk)
 {
@@ -649,6 +654,7 @@ bool __clk_is_enabled(struct clk *clk)
 out:
return !!ret;
 }
+EXPORT_SYMBOL_GPL(__clk_is_enabled);
 
 static struct clk *__clk_lookup_subtree(const char *name, struct clk *clk)
 {
@@ -740,6 +746,7 @@ out:
 
return best;
 }
+EXPORT_SYMBOL_GPL(__clk_mux_determine_rate);
 
 /***clk api***/
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH 1/3] ACPI / idle: Move idle_boot_override out of the arch directory

2014-01-17 Thread Hanjun Guo

On 2014-1-17 20:06, Sudeep Holla wrote:
> On 17/01/14 02:03, Hanjun Guo wrote:
>> Move idle_boot_override out of the arch directory to be a single enum
>> including both platforms values, this will make it rather easier to
>> avoid ifdefs around which definitions are for which processor in
>> generally used ACPI code.
>>
>> IDLE_FORCE_MWAIT for IA64 is not used anywhere, so remove it.
>>
>> No functional change in this patch.
>>
>> Suggested-by: Alan 
>> Signed-off-by: Hanjun Guo 
>> ---
>>  arch/ia64/include/asm/processor.h| 3 ---
>>  arch/powerpc/include/asm/processor.h | 1 -
>>  arch/x86/include/asm/processor.h | 3 ---
>>  arch/x86/kernel/process.c| 1 +
>>  include/linux/cpu.h  | 8 
>>  5 files changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/ia64/include/asm/processor.h 
>> b/arch/ia64/include/asm/processor.h
>> index 5a84b3a..ccd63a0 100644
>> --- a/arch/ia64/include/asm/processor.h
>> +++ b/arch/ia64/include/asm/processor.h
>> @@ -698,9 +698,6 @@ prefetchw (const void *x)
>>  
>>  extern unsigned long boot_option_idle_override;
>>  
>> -enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_FORCE_MWAIT,
>> - IDLE_NOMWAIT, IDLE_POLL};
>> -
>>  void default_idle(void);
>>  
>>  #define ia64_platform_is(x) (strcmp(x, ia64_platform_name) == 0)
>> diff --git a/arch/powerpc/include/asm/processor.h 
>> b/arch/powerpc/include/asm/processor.h
>> index fc14a38..06689c0 100644
>> --- a/arch/powerpc/include/asm/processor.h
>> +++ b/arch/powerpc/include/asm/processor.h
>> @@ -440,7 +440,6 @@ static inline unsigned long get_clean_sp(unsigned long 
>> sp, int is_32)
>>  #endif
>>  
>>  extern unsigned long cpuidle_disable;
>> -enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
>>  
> 
> I don't think it is used in the context of ACPI. Though it's the same variable
> name, it looks like it is just used at boot to override the cpuidle option.
> Does it still make any sense to combine this?

Yes, it is not related to ACPI on powerpc. I will investigate whether it
causes a compile warning if I don't combine this.

> 
>>  extern int powersave_nap;   /* set if nap mode can be used in idle loop */
>>  extern void power7_nap(void);
>> diff --git a/arch/x86/include/asm/processor.h 
>> b/arch/x86/include/asm/processor.h
>> index 7b034a4..4bee51a 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -729,9 +729,6 @@ extern void init_amd_e400_c1e_mask(void);
>>  extern unsigned long boot_option_idle_override;
>>  extern bool amd_e400_c1e_detected;
>>  
>> -enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT,
>> - IDLE_POLL};
>> -
>>  extern void enable_sep_cpu(void);
>>  extern int sysenter_setup(void);
>>  
>> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>> index 3fb8d95..62764ff 100644
>> --- a/arch/x86/kernel/process.c
>> +++ b/arch/x86/kernel/process.c
>> @@ -17,6 +17,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
>> index 03e235ad..e324561 100644
>> --- a/include/linux/cpu.h
>> +++ b/include/linux/cpu.h
>> @@ -220,6 +220,14 @@ void cpu_idle(void);
>>  
>>  void cpu_idle_poll_ctrl(bool enable);
>>  
>> +enum idle_boot_override {
>> +IDLE_NO_OVERRIDE = 0,
>> +IDLE_HALT,
>> +IDLE_NOMWAIT,
>> +IDLE_POLL,
>> +IDLE_POWERSAVE_OFF
>> +};
>> +
> 
> I do understand the idea behind this change, but IMO HALT and MWAIT are x86
> specific and may not make sense for other architectures.

Yes, this is the strange part: the values are arch-dependent.

> 
> It will also require every architecture using ACPI to export
> boot_option_idle_override which may not be really required.

So how about dropping this patch and moving the boot_option_idle_override
related code into the arch directory, such as arch/x86/acpi/boot.c for
x86?

> 
> Further the only users of boot_option_idle_override(outside x86) are:
> 
> 1. drivers/acpi/processor_core.c
>Your second patch is moving this to x86 specific code anyway
> 
> 2. drivers/acpi/processor_idle.c
>Currently idle driver is bit x86 specific and needs modifications to get it
>working on ARM

Yes, that's why I did not enable the ACPI idle driver on ARM64 for now.

Thanks
Hanjun


Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Chris Ball

Hi,

On Sat, Jan 18 2014, Andrew Bresticker wrote:
>>>> There's an existing patch for that...
>>>> http://www.spinics.net/lists/arm-kernel/msg296596.html
>>>
>>> Ah, I see.  Looks like it has yet to be picked up...
>>
>> The patches aren't quite identical -- Andrew's leaves the
>> disable_irq() call in and Aisheng's removes it.  Which should I take?
>
> Since the disable_irq() is now redundant, I suppose Aisheng's is more correct

Thanks, pushed Aisheng's version to mmc-next for 3.14.

- Chris.
-- 
Chris Ball  

Re: [PATCH 7/7] numa,sched: do statistics calculation using local variables only

2014-01-17 Thread Rik van Riel

On 01/17/2014 04:12 PM, r...@redhat.com wrote:
> From: Rik van Riel 
> 
> The current code in task_numa_placement calculates the difference
> between the old and the new value, but also temporarily stores half
> of the old value in the per-process variables.
> 
> The NUMA balancing code looks at those per-process variables, and
> having other tasks temporarily see halved statistics could lead to
> unwanted numa migrations. This can be avoided by doing all the math
> in local variables.
> 
> This change also simplifies the code a little.

I am seeing what looks like a performance improvement
with this patch, so it is not just a theoretical bug.
The improvement is small, as is to be expected with
such a small race, but with two 32-warehouse specjbb
instances on a 4-node, 10core/20thread per node system,
I see the following change in performance, and reduced
numa page migrations.

Without the patch:
run 1: throughput 367660 367660, migrated 3112982
run 2: throughput 353821 355612, migrated 2881317
run 3: throughput 355027 355027, migrated 3358105
run 4: throughput 354366 354366, migrated 3466687
run 5: throughput 356186 356186, migrated 3152194
run 6: throughput 361431 361431, migrated 3336219
run 7: throughput 354704 354704, migrated 3345418
run 8: throughput 363770 363770, migrated 3642925
run 9: throughput 363380 363380, migrated 3192836
run 10: throughput 358440 358440, migrated 3354028
  avg: throughput 358968, migrated 3284271

With the patch:
run 1: throughput 360580 360580, migrated 3169872
run 2: throughput 361303 361303, migrated 3220280
run 3: throughput 367692 367692, migrated 3096093
run 4: throughput 362320 362320, migrated 2981762
run 5: throughput 364201 364201, migrated 3089107
run 6: throughput 364561 364561, migrated 2892364
run 7: throughput 360771 360771, migrated 3086638
run 8: throughput 361530 361530, migrated 2933256
run 9: throughput 365841 365841, migrated 3356944
run 10: throughput 359188 359188, migrated 3394545
  avg: throughput 362798, migrated 3122086
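
For readers following along, the race being closed can be sketched in userspace C. This is a simplified illustration and not the actual task_numa_placement() code: the function names and the exact decay formula (halve the old value, add the new sample) are assumptions. The point is only that the first variant lets other readers see a halved intermediate value in the shared variable, while the second publishes the final value with a single store.

```c
#include <assert.h>

/* Old style: the shared counter is visibly halved before the new
 * sample is added, so a concurrent reader may see the halved value. */
void update_in_place(unsigned long *shared, unsigned long sample)
{
    *shared /= 2;       /* intermediate value is visible to others */
    *shared += sample;
}

/* New style: do all the math in a local variable and store once. */
void update_local(unsigned long *shared, unsigned long sample)
{
    unsigned long next = *shared / 2 + sample;

    *shared = next;     /* readers see either the old or the new value */
}
```

Both variants end at the same value; they differ only in what a concurrent reader can observe in between.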


-- 
All rights reversed

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Mike Galbraith

On Fri, 2014-01-17 at 18:23 +0100, Sebastian Andrzej Siewior wrote:

> So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
> which took the waiter lock with irqs off. This should be the same thing
> you try do here.

(yeah, these are just whacked mole body bags;)


Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Mike Galbraith

On Fri, 2014-01-17 at 18:14 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-25 18:37:37 [+0100]:
> 
> >On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote: 
> >> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> >
> >Having sufficiently recovered from turkey overdose to be able to slither
> >upstairs (bump bump bump) to check on the box, commenting..
> >
> ># timers-do-not-raise-softirq-unconditionally.patch
> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >
> >..those two out does seem to have stabilized the thing.
> 
> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> 
> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> Didn't you report once that your box deadlocks without this patch? Now
> your 64way box on the other hand does not work with it?

If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
not raise' is not applied, _and_ you wisely do not try to turn on very
expensive nohz_full, things work fine without 'use a trylock'.

-Mike


[GIT] Networking

2014-01-17 Thread David Miller


1) The value chosen for the new SO_MAX_PACING_RATE socket option on
   parisc was very poorly chosen, let's fix it while we still can.
   From Eric Dumazet.

2) Our generic reciprocal divide was found to handle some edge cases
   incorrectly, part of this is encoded into the BPF as deep as the
   JIT engines themselves.  Just use a real divide throughout for now.
   From Eric Dumazet.

3) Because the initial lookup is lockless, the TCP metrics engine
   can end up creating two entries for the same lookup key.  Fix this
   by doing a second lookup under the lock before we actually create
   the new entry.  From Christoph Paasch.

4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.

5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
   From Dan Carpenter.

6) Netlink socket dumping uses the wrong socket state for timewait
   sockets.  Fix from Neal Cardwell.

7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
   Engelmayer.

8) Multicast forwarding in ipv4 can overflow the per-rule reference
   counts, causing all multicast traffic to cease.  Fix from
   Hannes Frederic Sowa.

9) via-rhine needs to stop all TX queues when it resets the device,
   from Richard Weinberger.

10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions.
    From Gerald Schaefer.
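
For item 2, a minimal userspace sketch of the kind of edge case involved. It assumes the pre-fix helper computed R = ceil(2^32 / k) truncated to 32 bits and then took the high 32 bits of a * R. The names old_reciprocal_value(), old_reciprocal_divide() and reciprocal_matches() are illustrative only and are not taken from the kernel tree:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed pre-fix formula: R = ceil(2^32 / k), truncated to 32 bits. */
static uint32_t old_reciprocal_value(uint32_t k)
{
    uint64_t val;

    assert(k != 0);
    val = (1ULL << 32) + (k - 1);
    return (uint32_t)(val / k);
}

/* Approximates a / k as the high 32 bits of a * R. */
static uint32_t old_reciprocal_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}

/* Returns 1 when the reciprocal shortcut agrees with a real divide. */
static int reciprocal_matches(uint32_t a, uint32_t k)
{
    return old_reciprocal_divide(a, old_reciprocal_value(k)) == a / k;
}
```

Under that assumption, k = 1 makes R wrap to 0, and a = 0x80000000 with k = 0xffffffff yields 1 where a real divide yields 0, while an ordinary case such as 100 / 7 still agrees.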

Please pull, thanks a lot!
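
The second-lookup-under-the-lock pattern described in item 3 can be sketched in userspace as follows. This is only a model under stated assumptions: a single hypothetical list stands in for the metrics hash, a pthread mutex stands in for the kernel lock, C11 atomics stand in for the lockless read side, and entries are never freed. The names table_head, lookup(), get_or_create() and count_key() are made up for the example:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

struct entry {
    uint32_t key;
    struct entry *next;   /* written before publication, never changed after */
};

static _Atomic(struct entry *) table_head;   /* hypothetical one-bucket table */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lockless lookup: walks whatever list is currently published. */
static struct entry *lookup(uint32_t key)
{
    struct entry *e = atomic_load_explicit(&table_head, memory_order_acquire);

    for (; e; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}

static struct entry *get_or_create(uint32_t key)
{
    struct entry *e = lookup(key);   /* fast path, no lock held */

    if (e)
        return e;

    pthread_mutex_lock(&table_lock);
    e = lookup(key);                 /* second lookup, now under the lock */
    if (!e) {
        e = malloc(sizeof(*e));
        if (e) {
            e->key = key;
            e->next = atomic_load_explicit(&table_head, memory_order_relaxed);
            atomic_store_explicit(&table_head, e, memory_order_release);
        }
    }
    pthread_mutex_unlock(&table_lock);
    return e;
}

/* Counts entries with the given key; more than 1 means a duplicate. */
static int count_key(uint32_t key)
{
    int n = 0;
    struct entry *e = atomic_load_explicit(&table_head, memory_order_acquire);

    for (; e; e = e->next)
        if (e->key == key)
            n++;
    return n;
}
```

Without the second lookup(), two callers that both miss on the fast path would each insert an entry for the same key once they get the lock in turn.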

The following changes since commit 228fdc083b017eaf90e578fa86fb1ecfd5ffae87:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2014-01-11 06:37:11 +0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 3af57f78c38131b7a66e2b01e06fdacae01992a3:

  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions (2014-01-17 18:54:49 -0800)


Bjørn Mork (1):
  net: usbnet: fix SG initialisation

Christian Engelmayer (1):
  ieee802154: Fix memory leak in ieee802154_add_iface()

Christoph Paasch (1):
  tcp: metrics: Avoid duplicate entries with the same destination-IP

Dan Carpenter (1):
  cxgb4: silence shift wrapping static checker warning

David S. Miller (1):
  Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Eric Dumazet (2):
  bpf: do not use reciprocal divide
  parisc: fix SO_MAX_PACING_RATE typo

Gerald Schaefer (1):
  net: rds: fix per-cpu helper usage

Hannes Frederic Sowa (2):
  net: avoid reference counter overflows on fib_rules in multicast forwarding
  ipv6: simplify detection of first operational link-local address on interface

Heiko Carstens (1):
  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions

Ivan Vecera (1):
  be2net: add dma_mapping_error() check for dma_map_page()

Jitendra Kalsaria (1):
  qlge: Fix vlan netdev features.

Marek Lindner (1):
  batman-adv: fix batman-adv header overhead calculation

Michael S. Tsirkin (1):
  MAINTAINERS: add virtio-dev ML for virtio

Mika Westerberg (1):
  e1000e: Fix compilation warning when !CONFIG_PM_SLEEP

Neal Cardwell (1):
  inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets

Peter Korsgaard (1):
  dm9601: add USB IDs for new dm96xx variants

Richard Weinberger (1):
  net,via-rhine: Fix tx_timeout handling

Yuval Mintz (1):
  bnx2x: Don't release PCI bars on shutdown

 MAINTAINERS                                      |  3 +++
 arch/arm/net/bpf_jit_32.c                        |  6 +++---
 arch/parisc/include/uapi/asm/socket.h            |  2 +-
 arch/powerpc/net/bpf_jit_comp.c                  |  7 ---
 arch/s390/net/bpf_jit_comp.c                     | 29 +---
 arch/sparc/net/bpf_jit_comp.c                    | 17 ++---
 arch/x86/net/bpf_jit_comp.c                      | 14 ++
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 29 ++--
 drivers/net/ethernet/chelsio/cxgb4/l2t.c         |  2 +-
 drivers/net/ethernet/emulex/benet/be_main.c      | 11 +--
 drivers/net/ethernet/intel/e1000e/netdev.c       |  8 ++--
 drivers/net/ethernet/qlogic/qlge/qlge_main.c     |  2 ++
 drivers/net/ethernet/via/via-rhine.c             |  1 +
 drivers/net/usb/dm9601.c                         | 12
 drivers/net/usb/usbnet.c                         |  2 +-
 include/net/if_inet6.h                           |  1 -
 net/batman-adv/main.c                            |  2 +-
 net/core/filter.c                                | 30 ++---
 net/ieee802154/nl-phy.c                          |  6 --
 net/ipv4/inet_diag.c                             |  5 -
 net/ipv4/ipmr.c                                  |  7 +--
 net/ipv4/tcp_metrics.c                           | 51 +++---
 net/ipv6/addrconf.c                              | 38

Re: [Patch v1 2/3] ACPI, PCI: reuse ACPI hotplug framework to support PCI host bridge hotplug

2014-01-17 Thread Yinghai Lu

On Fri, Jan 17, 2014 at 6:48 PM, Jiang Liu  wrote:
> Reuse ACPI hotplug framework to support PCI host bridge hotplug, this
> makes PCI host bridge hotplug implementation simpler and more clear.
>
> It also fixes a bug in support of PCI host bridge hot-addition.
> Currently pci_root driver fails to install notification handler for
> PCI host bridge absent at boot time because acpi_is_root_bridge()
> returns false if no ACPI device created for handle. So PCI host
> bridge hot-addition event will just be ignored by system.
>
> Signed-off-by: Jiang Liu 

is the same as

commit 3338db0057ed9f554050bd06863731c515d79672
Author: Rafael J. Wysocki 
Date:   Fri Nov 22 21:55:20 2013 +0100

ACPI / hotplug: Make ACPI PCI root hotplug use common hotplug code

in rafael tree pm/linux-next for 3.14?

> ---
>  drivers/acpi/internal.h |1 -
>  drivers/acpi/pci_root.c |  117 +++
>  drivers/acpi/scan.c |2 -
>  3 files changed, 17 insertions(+), 103 deletions(-)
>
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index a29739c..03efa56 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -28,7 +28,6 @@ int init_acpi_device_notify(void);
>  int acpi_scan_init(void);
>  void acpi_pci_root_init(void);
>  void acpi_pci_link_init(void);
> -void acpi_pci_root_hp_init(void);
>  void acpi_processor_init(void);
>  void acpi_platform_init(void);
>  int acpi_sysfs_init(void);
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 20360e4..6f6e6c1 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -50,6 +50,7 @@ ACPI_MODULE_NAME("pci_root");
>  static int acpi_pci_root_add(struct acpi_device *device,
>  const struct acpi_device_id *not_used);
>  static void acpi_pci_root_remove(struct acpi_device *device);
> +static int handle_hotplug_event_root(acpi_handle handle, u32 type, void *ctx);
>
>  #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
> | OSC_PCI_ASPM_SUPPORT \
> @@ -61,12 +62,14 @@ static const struct acpi_device_id root_device_ids[] = {
> {"", 0},
>  };
>
> +
>  static struct acpi_scan_handler pci_root_handler = {
> .ids = root_device_ids,
> +   .prepare = handle_hotplug_event_root,
> .attach = acpi_pci_root_add,
> .detach = acpi_pci_root_remove,
> .hotplug = {
> -   .ignore = true,
> +   .enabled = true,
> },
>  };
>
> @@ -627,113 +630,27 @@ void __init acpi_pci_root_init(void)
>
> if (!acpi_pci_disabled) {
> pci_acpi_crs_quirks();
> -   acpi_scan_add_handler(&pci_root_handler);
> -   }
> -}
> -/* Support root bridge hotplug */
> -
> -static void handle_root_bridge_insertion(acpi_handle handle)
> -{
> -   struct acpi_device *device;
> -
> -   if (!acpi_bus_get_device(handle, &device)) {
> -   dev_printk(KERN_DEBUG, &device->dev,
> -  "acpi device already exists; ignoring notify\n");
> -   return;
> +   acpi_scan_add_handler_with_hotplug(&pci_root_handler,
> +  "pci_hostbridge");
> }
> -
> -   if (acpi_bus_scan(handle))
> -   acpi_handle_err(handle, "cannot add bridge to acpi list\n");
>  }
>
> -static void hotplug_event_root(void *data, u32 type)
> +static int handle_hotplug_event_root(acpi_handle handle, u32 type, void *ctx)
>  {
> -   acpi_handle handle = data;
> +   int ret = NOTIFY_OK;
> struct acpi_pci_root *root;
>
> -   acpi_scan_lock_acquire();
> -
> -   root = acpi_pci_find_root(handle);
> -
> -   switch (type) {
> -   case ACPI_NOTIFY_BUS_CHECK:
> -   /* bus enumerate */
> -   acpi_handle_printk(KERN_DEBUG, handle,
> -  "Bus check notify on %s\n", __func__);
> -   if (root)
> +   if (type == ACPI_NOTIFY_BUS_CHECK) {
> +   acpi_scan_lock_acquire();
> +   root = acpi_pci_find_root(handle);
> +   if (root) {
> +   acpi_handle_printk(KERN_DEBUG, handle,
> +   "Bus check notify on %s\n", __func__);
> acpiphp_check_host_bridge(handle);
> -   else
> -   handle_root_bridge_insertion(handle);
> -
> -   break;
> -
> -   case ACPI_NOTIFY_DEVICE_CHECK:
> -   /* device check */
> -   acpi_handle_printk(KERN_DEBUG, handle,
> -  "Device check notify on %s\n", __func__);
> -   if (!root)
> -   handle_root_bridge_insertion(handle);
> -   break;
> -
> -   case ACPI_NOTIFY_EJECT_REQUEST:
> -   /* request device eject */
> -   acpi_handle_printk(KERN_DEBUG, handle,
> -  "Device eject

Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Andrew Bresticker

On Fri, Jan 17, 2014 at 3:40 PM, Chris Ball  wrote:
> Hi, adding Aisheng,
>
> On Fri, Jan 17 2014, Andrew Bresticker wrote:
>> On Fri, Jan 17, 2014 at 3:11 PM, John Tobias wrote:
>>> There's an existing patch for that...
>>> http://www.spinics.net/lists/arm-kernel/msg296596.html
>>
>> Ah, I see.  Looks like it has yet to be picked up...
>
> The patches aren't quite identical -- Andrew's leaves the
> disable_irq() call in and Aisheng's removes it.  Which should I take?

Since the disable_irq() is now redundant, I suppose Aisheng's is more correct.

Thanks,
Andrew

>
> Thanks,
>
> - Chris.
> --
> Chris Ball  

Re: [ANNOUNCE] 3.12.6-rt9

2014-01-17 Thread Mike Galbraith

On Fri, 2014-01-17 at 18:00 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-24 16:47:47 [+0100]:
> 
> >I built this kernel with Paul's patch and NO_HZ_FULL enabled again on 64
> >core box.  I haven't seen RCU grip yet, but I just checked on it after
> >3.5 hours into this boot/beat (after fixing crash+kdump setup), and
> >found it in the process of dumping. 
> 
> So you also have the timers-do-not-raise-softirq-unconditionally.patch?

Oh dear, there's holidays, vacation, and massive turkey overdose between
then and now, but I'm almost positive that the tree was virgin $subject,
with only Paul's patch enabled, that being what I wanted to beat on.

> I have a small problem with understanding this…
> 
> |#24 [880273a03cd0] run_timer_softirq at 81069002
> 
> Here we obtain wait_lock from tvec_base of _this_ CPU. And we get to
> init_lists() before the apic timer kicks in. So we have the wait_lock.

gdb fibs a little, we're acquiring.

>---  ---
> >#21 [880273a03b28] apic_timer_interrupt at 815cbf9d
> >[exception RIP: _raw_spin_lock+50]

> In the hard interrupt triggered by the apic timer we get to
> get_next_timer_interrupt() and go again for the same wait_lock. Here we
> have the try_lock so we avoid this deadlock.
> The odd part: we get the lock. It should be the same lock because both use
> | struct tvec_base *base = __this_cpu_read(tvec_bases);
> to get it. And we shouldn't get it because the lock is already held.
> We get into trouble in the unlock path where we spin forever:
> 
> |#14 [880276803e50] rt_spin_unlock_after_trylock_in_irq at 815c3425
> |#12 [880276803e28] _raw_spin_trylock at 815c3790
> 
> which releases the lock with a trylock in order to keep lockdep happy.
> My understanding was that we should be able to obtain the wait_lock here
> since we were able to obtain it in the lock path and in irq off context
> there is nothing that could take the lock in the meantime.

IIRC, we were endlessly trying, but with an un-punched ticket under us,
and no Xen like evilness to save the day.

I've since cleaned out my crashdump directory and moved on to frolicking
with hotplug gremlins, so don't have that one to revisit, but the don't
unconditionally raise timer softirq patch is the bad guy.

-Mike
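
For readers unfamiliar with the "un-punched ticket" remark, here is a generic ticket-lock sketch in C11. It is an assumption-laden model, not the -rt code path discussed above: the struct layout and the names ticket_lock_acquire(), ticket_lock_try(), ticket_lock_release() and demo_lock are made up for illustration. It shows why a trylock can keep failing for as long as one ticket that was handed out has not been served:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ticket_lock {
    atomic_uint next;    /* next ticket to hand out */
    atomic_uint owner;   /* ticket currently being served */
};

static struct ticket_lock demo_lock;   /* zero-initialized: unlocked */

/* Take a ticket and spin until it is served. */
static void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned int me = atomic_fetch_add(&l->next, 1);

    while (atomic_load(&l->owner) != me)
        ;   /* busy-wait */
}

/* Succeeds only when nobody holds or waits, i.e. next == owner. */
static bool ticket_lock_try(struct ticket_lock *l)
{
    unsigned int cur = atomic_load(&l->owner);

    return atomic_compare_exchange_strong(&l->next, &cur, cur + 1);
}

/* Serve the next ticket. */
static void ticket_lock_release(struct ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}
```

In this model a loop around ticket_lock_try() makes no progress until the holder of the outstanding ticket calls ticket_lock_release(), which is never if that holder is the context being interrupted.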


Re: [PATCH] Bluetooth: remove direct compilation of 6lowpan_iphc.c

2014-01-17 Thread David Miller

From: Stephen Warren 
Date: Fri, 17 Jan 2014 12:29:24 -0700

> From: Stephen Warren 
> 
> It's now built as a separate utility module, and enabling BT selects
> that module in Kconfig. This fixes:
 ...
> (this change probably simply wasn't "git add"d to a53d34c3465b)
> 
> Fixes: a53d34c3465b ("net: move 6lowpan compression code to separate module")
> Fixes: 18722c247023 ("Bluetooth: Enable 6LoWPAN support for BT LE devices")
> Signed-off-by: Stephen Warren 

Applied to net-next, thanks a lot.

Re: [PATCH 1/6] cgroup: make CONFIG_NET_CLS_CGROUP and CONFIG_NETPRIO_CGROUP bool instead of tristate

2014-01-17 Thread David Miller

From: Tejun Heo 
Date: Fri, 17 Jan 2014 13:11:52 -0500

> net_cls and net_prio are the only cgroups which are allowed to be
> built as modules.  The savings from allowing the two controllers to be
> built as modules are tiny especially given that cgroup module support
> itself adds quite a bit of complexity.
> 
> The following are the sizes of vmlinux with both built as module and
> both built as part of the kernel image with cgroup module support
> removed.
> 
>       text     data      bss      dec
>   20292207  2411496 10784768 33488471
>   20293421  2412568 10784768 33490757
> 
> The total difference is 2286 bytes.  Given that none of other
> controllers has much chance of being made a module and that we're
> unlikely to add new modular controllers, the added complexity is
> simply not justifiable.
> 
> As a first step to drop cgroup module support, this patch changes the
> two config options to bool from tristate and drops module related code
> from the two controllers.
> 
> Signed-off-by: Tejun Heo 

Acked-by: David S. Miller 

Re: [PATCH 3/6] cgroup: clean up cgroup_subsys names and initialization

2014-01-17 Thread David Miller

From: Tejun Heo 
Date: Fri, 17 Jan 2014 13:11:54 -0500

> Signed-off-by: Tejun Heo 

Acked-by: David S. Miller 

Re: [PATCH 24/41] net: Replace get_cpu_var through this_cpu_ptr

2014-01-17 Thread David Miller

From: Christoph Lameter 
Date: Fri, 17 Jan 2014 09:18:36 -0600

> [Patch depends on another patch in this series that introduces raw_cpu_ops]
> 
> Replace uses of get_cpu_var for address calculation through this_cpu_ptr.
> 
> Cc: "David S. Miller" 
> Cc: net...@vger.kernel.org
> Cc: Eric Dumazet 
> Signed-off-by: Christoph Lameter 

Acked-by: David S. Miller 

Re: [PATCH 17/41] net: Replace __this_cpu_inc in route.c with raw_cpu_inc

2014-01-17 Thread David Miller

From: Christoph Lameter 
Date: Fri, 17 Jan 2014 09:18:29 -0600

> Acked-by: Ingo Molnar 
> Signed-off-by: Christoph Lameter 

Acked-by: David S. Miller 

Re: [PATCH v4 3/4] ARM: pinctrl: Add Broadcom Capri pinctrl driver

2014-01-17 Thread Matt Porter

On Fri, Jan 17, 2014 at 11:59:21AM -0800, Sherman Yin wrote:
> On 14-01-16 05:19 AM, Linus Walleij wrote:
> >On Sat, Dec 21, 2013 at 3:13 AM, Sherman Yin  wrote:
> >
> >> Adds pinctrl driver for Broadcom Capri (BCM281xx) SoCs.
> >>
> >>Signed-off-by: Sherman Yin 
> >>Reviewed-by: Christian Daudt 
> >>Reviewed-by: Matt Porter 
> >>---
> >>v4: - PINCTRL selected in Kconfig, PINCTRL_CAPRI selected in bcm_defconfig
> >> - make use of regmap
> >> - change CAPRI_PIN_UPDATE from macro to inline function.
> >> - Handle pull-up strength arg in Ohm instead of enum
> >
> >Patch applied. It is really good now! It's late before the merge
> >window, but you've done a tremendous work on this driver and
> >I don't want to delay its deployment further.
> 
> Great, thanks for the support and reviews!

Very nice! Now after having completely missed something fundamental in
my reviews, I feel compelled to bring it up at the 11^H^H12th hour.

That is, this is the *only* BCM281xx driver to be named Capri, both in
the filename and driver code, but also in the binding compatible. We
didn't do that on anything else that's gone upstream to date. This
really introduces an unfortunate inconsistency as it obscures which
SoC family this binding and driver belong with.

I wonder if Linus would accept a rename at this point (too late for 3.14
presumably, but for 3.15) of s/capri/bcm281xx throughout, bcm11351 for
the compatible string, as we have for the machine compatible, and also
BCM281XX for the Kconfig option. If not, I'll survive, but it pains me
to see one thing completely different out of this entire family. If
nothing else, it would be great to address the compatible string before
this hits the 3.14 release.

Linus?

-Matt

[Patch v1 3/3] ACPI: kill field 'ignore' in acpi_hotplug_profile

2014-01-17 Thread Jiang Liu

Field 'ignore' in acpi_hotplug_profile is introduced by "ca499fc87ed945
ACPI / hotplug: Fix conflicted PCI bridge notify handlers" to support
PCI host bridge hotplug. Now it's useless, so kill it.

Signed-off-by: Jiang Liu 
---
 drivers/acpi/scan.c |2 +-
 include/acpi/acpi_bus.h |1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index d83e0ff..b932ae6 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -1776,7 +1776,7 @@ static void acpi_scan_init_hotplug(acpi_handle handle, int type)
 */
list_for_each_entry(hwid, , list) {
handler = acpi_scan_match_handler(hwid->id, NULL);
-   if (handler && !handler->hotplug.ignore) {
+   if (handler) {
acpi_install_notify_handler(handle, ACPI_SYSTEM_NOTIFY,
acpi_hotplug_notify_cb, handler);
break;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 09a73bd..d1f6ebd 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -100,7 +100,6 @@ enum acpi_hotplug_mode {
 struct acpi_hotplug_profile {
struct kobject kobj;
bool enabled:1;
-   bool ignore:1;
enum acpi_hotplug_mode mode;
 };
 
-- 
1.7.10.4


[Patch v1 1/3] ACPI: add callback prepare() into acpi_hotplug_handler

2014-01-17 Thread Jiang Liu

Add callback prepare() into acpi_hotplug_handler, which will get called
at the very beginning of ACPI hotplug event handler. The ACPI core will
ignore the event if prepare() returns NOTIFY_STOP.

Signed-off-by: Jiang Liu 
---
 drivers/acpi/scan.c |4 
 include/acpi/acpi_bus.h |1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index fd39459..6b0f419 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -392,6 +392,10 @@ static void acpi_hotplug_notify_cb(acpi_handle handle, u32 type, void *data)
struct acpi_device *adev;
acpi_status status;
 
+   if (handler->prepare &&
+   handler->prepare(handle, type, data) == NOTIFY_STOP)
+   return;
+
if (!handler->hotplug.enabled)
return acpi_hotplug_unsupported(handle, type);
 
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index ddabed1..09a73bd 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -113,6 +113,7 @@ static inline struct acpi_hotplug_profile *to_acpi_hotplug_profile(
 struct acpi_scan_handler {
const struct acpi_device_id *ids;
struct list_head list_node;
+   int (*prepare)(acpi_handle handle, u32 type, void *context);
int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
void (*detach)(struct acpi_device *dev);
struct acpi_hotplug_profile hotplug;
-- 
1.7.10.4
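
The control flow this patch adds can be modelled in userspace roughly as below. It is a simplified sketch, not the ACPI code: the MODEL_* constants, struct model_handler, model_notify() and stop_on_bus_check() are hypothetical stand-ins, and the numeric values are arbitrary rather than the kernel's NOTIFY_* or ACPI_NOTIFY_* values:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in return codes for prepare(); values are arbitrary. */
enum { MODEL_NOTIFY_OK, MODEL_NOTIFY_STOP };

/* Stand-in event types; values are arbitrary. */
enum { MODEL_BUS_CHECK, MODEL_EJECT_REQUEST };

/* What the modelled core did with an event. */
enum model_result { EVENT_CONSUMED, EVENT_UNSUPPORTED, EVENT_HANDLED };

struct model_handler {
    int (*prepare)(unsigned int type, void *context);   /* optional */
    int enabled;
};

/* prepare() runs first; MODEL_NOTIFY_STOP makes the core drop the event. */
static enum model_result model_notify(const struct model_handler *h,
                                      unsigned int type, void *context)
{
    if (h->prepare && h->prepare(type, context) == MODEL_NOTIFY_STOP)
        return EVENT_CONSUMED;
    if (!h->enabled)
        return EVENT_UNSUPPORTED;
    return EVENT_HANDLED;
}

/* Example prepare(): takes over bus checks, lets everything else through. */
static int stop_on_bus_check(unsigned int type, void *context)
{
    (void)context;
    return type == MODEL_BUS_CHECK ? MODEL_NOTIFY_STOP : MODEL_NOTIFY_OK;
}

static const struct model_handler with_prepare = { stop_on_bus_check, 1 };
static const struct model_handler without_prepare = { NULL, 1 };
static const struct model_handler disabled_handler = { NULL, 0 };
```

A handler that leaves prepare NULL behaves exactly as before the patch, which is why the NULL check comes before the call.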


[Patch v1 2/3] ACPI, PCI: reuse ACPI hotplug framework to support PCI host bridge hotplug

2014-01-17 Thread Jiang Liu

Reuse ACPI hotplug framework to support PCI host bridge hotplug, this
makes PCI host bridge hotplug implementation simpler and more clear.

It also fixes a bug in support of PCI host bridge hot-addition.
Currently pci_root driver fails to install notification handler for
PCI host bridge absent at boot time because acpi_is_root_bridge()
returns false if no ACPI device created for handle. So PCI host
bridge hot-addition event will just be ignored by system.

Signed-off-by: Jiang Liu 
---
 drivers/acpi/internal.h |1 -
 drivers/acpi/pci_root.c |  117 +++
 drivers/acpi/scan.c |2 -
 3 files changed, 17 insertions(+), 103 deletions(-)

diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index a29739c..03efa56 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -28,7 +28,6 @@ int init_acpi_device_notify(void);
 int acpi_scan_init(void);
 void acpi_pci_root_init(void);
 void acpi_pci_link_init(void);
-void acpi_pci_root_hp_init(void);
 void acpi_processor_init(void);
 void acpi_platform_init(void);
 int acpi_sysfs_init(void);
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 20360e4..6f6e6c1 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -50,6 +50,7 @@ ACPI_MODULE_NAME("pci_root");
 static int acpi_pci_root_add(struct acpi_device *device,
 const struct acpi_device_id *not_used);
 static void acpi_pci_root_remove(struct acpi_device *device);
+static int handle_hotplug_event_root(acpi_handle handle, u32 type, void *ctx);
 
 #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \
| OSC_PCI_ASPM_SUPPORT \
@@ -61,12 +62,14 @@ static const struct acpi_device_id root_device_ids[] = {
{"", 0},
 };
 
+
 static struct acpi_scan_handler pci_root_handler = {
.ids = root_device_ids,
+   .prepare = handle_hotplug_event_root,
.attach = acpi_pci_root_add,
.detach = acpi_pci_root_remove,
.hotplug = {
-   .ignore = true,
+   .enabled = true,
},
 };
 
@@ -627,113 +630,27 @@ void __init acpi_pci_root_init(void)
 
if (!acpi_pci_disabled) {
pci_acpi_crs_quirks();
-   acpi_scan_add_handler(&pci_root_handler);
-   }
-}
-/* Support root bridge hotplug */
-
-static void handle_root_bridge_insertion(acpi_handle handle)
-{
-   struct acpi_device *device;
-
-   if (!acpi_bus_get_device(handle, &device)) {
-   dev_printk(KERN_DEBUG, &device->dev,
-  "acpi device already exists; ignoring notify\n");
-   return;
+   acpi_scan_add_handler_with_hotplug(&pci_root_handler,
+  "pci_hostbridge");
}
-
-   if (acpi_bus_scan(handle))
-   acpi_handle_err(handle, "cannot add bridge to acpi list\n");
 }
 
-static void hotplug_event_root(void *data, u32 type)
+static int handle_hotplug_event_root(acpi_handle handle, u32 type, void *ctx)
 {
-   acpi_handle handle = data;
+   int ret = NOTIFY_OK;
struct acpi_pci_root *root;
 
-   acpi_scan_lock_acquire();
-
-   root = acpi_pci_find_root(handle);
-
-   switch (type) {
-   case ACPI_NOTIFY_BUS_CHECK:
-   /* bus enumerate */
-   acpi_handle_printk(KERN_DEBUG, handle,
-  "Bus check notify on %s\n", __func__);
-   if (root)
+   if (type == ACPI_NOTIFY_BUS_CHECK) {
+   acpi_scan_lock_acquire();
+   root = acpi_pci_find_root(handle);
+   if (root) {
+   acpi_handle_printk(KERN_DEBUG, handle,
+   "Bus check notify on %s\n", __func__);
acpiphp_check_host_bridge(handle);
-   else
-   handle_root_bridge_insertion(handle);
-
-   break;
-
-   case ACPI_NOTIFY_DEVICE_CHECK:
-   /* device check */
-   acpi_handle_printk(KERN_DEBUG, handle,
-  "Device check notify on %s\n", __func__);
-   if (!root)
-   handle_root_bridge_insertion(handle);
-   break;
-
-   case ACPI_NOTIFY_EJECT_REQUEST:
-   /* request device eject */
-   acpi_handle_printk(KERN_DEBUG, handle,
-  "Device eject notify on %s\n", __func__);
-   if (!root)
-   break;
-
-   get_device(&root->device->dev);
-
+   ret = NOTIFY_STOP;
+   }
acpi_scan_lock_release();
-
-   acpi_bus_device_eject(root->device, ACPI_NOTIFY_EJECT_REQUEST);
-   return;
-   default:
-   acpi_handle_warn(handle,
-"notify_handler: unknown event type 0x%x\n",
-type);
-

[V0 PATCH] xen/pvh: set some cr flags upon vcpu start

2014-01-17 Thread Mukesh Rathor

Konrad,

The following patch sets the bits in CR0 and CR4. Please note, I'm working
on a patch for the xen side. The CR4 features are not currently exported
to a PVH guest.

Roger, I added your SOB line, please lmk if I need to add anything else.

This patch was built on top of a71accb67e7645c68061cec2bee6067205e439fc in
konrad devel/pvh.v13 branch.

thanks
Mukesh



[V0 PATCH] xen/pvh: set some cr flags upon vcpu start

2014-01-17 Thread Mukesh Rathor

pvh was designed to start with pv flags, but a commit in xen tree
51e2cac257ec8b4080d89f0855c498cbbd76a5e5 removed some of the flags as
they are not necessary. As a result, these CR flags must be set in the
guest.

Signed-off-by: Roger Pau Monne 
Signed-off-by: Mukesh Rathor 
---
 arch/x86/xen/enlighten.c |   43 +--
 arch/x86/xen/smp.c   |2 +-
 arch/x86/xen/xen-ops.h   |2 +-
 3 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 628099a..4a2aaa6 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1410,12 +1410,8 @@ static void __init xen_boot_params_init_edd(void)
  * Set up the GDT and segment registers for -fstack-protector.  Until
  * we do this, we have to be careful not to call any stack-protected
  * function, which is most of the kernel.
- *
- * Note, that it is refok - because the only caller of this after init
- * is PVH which is not going to use xen_load_gdt_boot or other
- * __init functions.
  */
-void __ref xen_setup_gdt(int cpu)
+static void xen_setup_gdt(int cpu)
 {
if (xen_feature(XENFEAT_auto_translated_physmap)) {
 #ifdef CONFIG_X86_64
@@ -1463,13 +1459,48 @@ void __ref xen_setup_gdt(int cpu)
pv_cpu_ops.load_gdt = xen_load_gdt;
 }
 
+/*
+ * A pv guest starts with default flags that are not set for pvh, set them
+ * here asap.
+ */
+static void xen_pvh_set_cr_flags(int cpu)
+{
+   write_cr0(read_cr0() | X86_CR0_MP | X86_CR0_WP | X86_CR0_AM);
+
+   if (!cpu)
+   return;
+   /*
+* Unlike PV, for pvh xen does not set: PSE PGE OSFXSR OSXMMEXCPT
+* For BSP, PSE PGE will be set in probe_page_size_mask(), for AP
+* set them here. For all, OSFXSR OSXMMEXCPT will be set in fpu_init
+*/
+   if (cpu_has_pse)
+   set_in_cr4(X86_CR4_PSE);
+
+   if (cpu_has_pge)
+   set_in_cr4(X86_CR4_PGE);
+}
+
+/*
+ * Note, that it is refok - because the only caller of this after init
+ * is PVH which is not going to use xen_load_gdt_boot or other
+ * __init functions.
+ */
+void __ref xen_pvh_secondary_vcpu_init(int cpu)
+{
+   xen_setup_gdt(cpu);
+   xen_pvh_set_cr_flags(cpu);
+}
+
 static void __init xen_pvh_early_guest_init(void)
 {
if (!xen_feature(XENFEAT_auto_translated_physmap))
return;
 
-   if (xen_feature(XENFEAT_hvm_callback_vector))
+   if (xen_feature(XENFEAT_hvm_callback_vector)) {
xen_have_vector_callback = 1;
+   xen_pvh_set_cr_flags(0);
+   }
 
 #ifdef CONFIG_X86_32
BUG(); /* PVH: Implement proper support. */
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 5e46190..a18eadd 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -105,7 +105,7 @@ static void cpu_bringup_and_idle(int cpu)
 #ifdef CONFIG_X86_64
if (xen_feature(XENFEAT_auto_translated_physmap) &&
xen_feature(XENFEAT_supervisor_mode_kernel))
-   xen_setup_gdt(cpu);
+   xen_pvh_secondary_vcpu_init(cpu);
 #endif
cpu_bringup();
cpu_startup_entry(CPUHP_ONLINE);
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 9059c24..1cb6f4c 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -123,5 +123,5 @@ __visible void xen_adjust_exception_frame(void);
 
 extern int xen_panic_handler_init(void);
 
-void xen_setup_gdt(int cpu);
+void xen_pvh_secondary_vcpu_init(int cpu);
 #endif /* XEN_OPS_H */
-- 
1.7.2.3
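
As a quick check of which bits the patch above ends up setting, here is a small userspace model of xen_pvh_set_cr_flags(). The bit positions are the architectural ones for CR0.MP/WP/AM and CR4.PSE/PGE. The helper names pvh_cr0() and pvh_cr4() and the has_pse/has_pge parameters are hypothetical and only mirror the logic of the patch; nothing here touches real control registers:

```c
#include <assert.h>
#include <stdint.h>

/* Architectural bit positions in CR0 and CR4. */
#define MODEL_CR0_MP  (UINT64_C(1) << 1)    /* monitor coprocessor */
#define MODEL_CR0_WP  (UINT64_C(1) << 16)   /* write protect */
#define MODEL_CR0_AM  (UINT64_C(1) << 18)   /* alignment mask */
#define MODEL_CR4_PSE (UINT64_C(1) << 4)    /* page size extensions */
#define MODEL_CR4_PGE (UINT64_C(1) << 7)    /* page global enable */

/* CR0 value after the patch's unconditional write_cr0() step. */
static uint64_t pvh_cr0(uint64_t cr0)
{
    return cr0 | MODEL_CR0_MP | MODEL_CR0_WP | MODEL_CR0_AM;
}

/*
 * CR4 value after the patch's per-CPU step. CPU 0 returns early because,
 * per the patch comment, PSE/PGE are set elsewhere for the boot CPU.
 */
static uint64_t pvh_cr4(uint64_t cr4, int cpu, int has_pse, int has_pge)
{
    if (!cpu)
        return cr4;
    if (has_pse)
        cr4 |= MODEL_CR4_PSE;
    if (has_pge)
        cr4 |= MODEL_CR4_PGE;
    return cr4;
}
```

Starting from zero, the CR0 step yields 0x50002, and a secondary vcpu with both features ends up with 0x90 in CR4, while cpu 0 is left untouched by this helper.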


[PATCH] kvm: make KVM_MMU_AUDIT help text more readable

2014-01-17 Thread Randy Dunlap

From: Randy Dunlap 

Make KVM_MMU_AUDIT kconfig help text readable and collapse
two spaces between words down to one space.

Signed-off-by: Randy Dunlap 
---
 arch/x86/kvm/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- lnx-313-rc8.orig/arch/x86/kvm/Kconfig
+++ lnx-313-rc8/arch/x86/kvm/Kconfig
@@ -65,7 +65,7 @@ config KVM_MMU_AUDIT
depends on KVM && TRACEPOINTS
---help---
 This option adds a R/W kVM module parameter 'mmu_audit', which allows
-audit  KVM MMU at runtime.
+auditing of KVM MMU events at runtime.
 
 config KVM_DEVICE_ASSIGNMENT
bool "KVM legacy PCI device assignment support"

Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread Pavel Machek

On Fri 2014-01-17 17:21:00, H. Peter Anvin wrote:
> On 01/17/2014 04:29 PM, Pavel Machek wrote:
> > 
> > Have you checked your dmesg recently? Normal people don't read
> > it... it is just too much of it.
> > 
> >> Printing a warning is appropriate if we can't actually fix the problem
> >> in the OS.  If we actually make the problem go away then we have just
> >> done our job and we can be done with it.
> > 
> > I disagree. Older kernel versions still may have problem, etc.
> > 
> > We normally do print warnings for problems we work around. We want
> > vendors to fix their hardware, too...
> > 
> > ACPI BIOS Warning (bug): 32/64X FACS address mismatch in FADT -
> > 0xBDB5FF40/0xBDB64F40, using 32 (20131115/tbfadt-522)
> > [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
> > 
> 
> You say people don't read their dmesg...
> ... because there is too much ...
> ... so let's add more?

I'd say that proposed "your bios has a bug" is way more useful than
stuff such as:

pci_bus :00: root bus resource [bus 00-ff]
pci_bus :00: root bus resource [io  0x-0x0cf7]
pci_bus :00: root bus resource [io  0x0d00-0x]
pci_bus :00: root bus resource [mem 0x000a-0x000b]
pci_bus :00: root bus resource [mem 0xc000-0x]
pci :00:00.0: [8086:2e30] type 00 class 0x06
pci :00:01.0: [8086:2e31] type 01 class 0x060400
pci :00:01.0: PME# supported from D0 D3hot D3cold
pci :00:01.0: System wakeup disabled by ACPI
pci :00:02.0: [8086:2e32] type 00 class 0x03
pci :00:02.0: reg 0x10: [mem 0xd000-0xd03f 64bit]
pci :00:02.0: reg 0x18: [mem 0xc000-0xcfff 64bit pref]
pci :00:02.0: reg 0x20: [io  0xf140-0xf147]
pci :00:02.1: [8086:2e33] type 00 class 0x038000
pci :00:02.1: reg 0x10: [mem 0xd040-0xd04f 64bit]
pci :00:1b.0: [8086:27d8] type 00 class 0x040300
pci :00:1b.0: reg 0x10: [mem 0xd060-0xd0603fff 64bit]
pci :00:1b.0: PME# supported from D0 D3hot D3cold
pci :00:1c.0: [8086:27d0] type 01 class 0x060400
pci :00:1c.0: PME# supported from D0 D3hot D3cold
pci :00:1c.0: System wakeup disabled by ACPI
pci :00:1c.1: [8086:27d2] type 01 class 0x060400
pci :00:1c.1: PME# supported from D0 D3hot D3cold
pci :00:1c.1: System wakeup disabled by ACPI
pci :00:1d.0: [8086:27c8] type 00 class 0x0c0300
pci :00:1d.0: reg 0x20: [io  0xf080-0xf09f]
pci :00:1d.0: System wakeup disabled by ACPI

So yes. Let's add useful stuff.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH] net: rds: fix per-cpu helper usage

2014-01-17 Thread David Miller

From: Gerald Schaefer 
Date: Thu, 16 Jan 2014 16:54:48 +0100

> commit ae4b46e9d "net: rds: use this_cpu_* per-cpu helper" broke per-cpu
> handling for rds. chpfirst is the result of __this_cpu_read(), so it is
> an absolute pointer and not __percpu. Therefore, __this_cpu_write()
> should not operate on chpfirst, but rather on cache->percpu->first, just
> like __this_cpu_read() did before.
> 
> Cc:  # 3.8+
> Signed-off-by: Gerald Schaefer 

Applied and queued up for -stable, thanks.

Re: [v3.11][v3.12][v3.13][Regression] EISA: Initialize device before its resources

2014-01-17 Thread Joseph Salisbury

On 01/17/2014 05:19 PM, Bjorn Helgaas wrote:
> On Fri, Jan 17, 2014 at 02:26:23PM -0500, Joseph Salisbury wrote:
>> On 01/17/2014 12:02 PM, Bjorn Helgaas wrote:
>>> On Thu, Jan 16, 2014 at 01:14:01PM -0500, Joseph Salisbury wrote:
>>>> On 01/16/2014 01:12 PM, Bjorn Helgaas wrote:
>>>>> On Thu, Jan 16, 2014 at 10:53 AM, Joseph Salisbury
>>>>>  wrote:
>>>>>> Hi Bjorn,
>>>>>>
>>>>>> A kernel bug was opened against Ubuntu [0].  After a kernel bisect, it
>>>>>> was found the following commit introduced this bug:
>>>>> Sorry about that, and thanks for the report.  Did you mean to include
>>>>> URL for the bug?
>>>> Yes, sorry about that:
>>>> http://pad.lv/1251816
>>> Hi Joseph,
>>>
>>> Can you attach the 3.8.0-32-generic config (the one matching the successful
>>> boot at https://launchpadlibrarian.net/156685076/BootDmesg.txt) to the bug?
>> I attached the config file to the bug:
>> https://launchpadlibrarian.net/162754666/config.common.ubuntu
>>
>> I also attached a tar file with the complete config directory for that
>> kernel version.
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1251816/+attachment/3951156/+files/raring-config.tar
> Thanks again.  I attached the following reverts to launchpad.  I screwed up
> when doing those EISA changes.  I'd like to squeeze these into my v3.14
> merge request (probably early next week), so please test and let me know
> if this fixes the problem.  I'm really sorry for the inconvenience.
>
> Bjorn
>
>
> Revert "EISA: Log device resources in dmesg"
>
> From: Bjorn Helgaas 
>
> This reverts commit a2080d0c561c546d73cb8b296d4b7ca414e6860b.
>
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/eisa/eisa-bus.c |1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
> index 8842cde69177..1b86fe0c2e80 100644
> --- a/drivers/eisa/eisa-bus.c
> +++ b/drivers/eisa/eisa-bus.c
> @@ -288,7 +288,6 @@ static int __init eisa_request_resources(struct 
> eisa_root_device *root,
>   edev->res[i].flags = IORESOURCE_IO | IORESOURCE_BUSY;
>   }
>  
> - dev_printk(KERN_DEBUG, &edev->dev, "%pR\n", &edev->res[i]);
>   if (request_resource(root->res, &edev->res[i]))
>   goto failed;
>   }
> Revert "EISA: Initialize device before its resources"
>
> From: Bjorn Helgaas 
>
> This reverts commit 26abfeed4341872364386c6a52b9acef8c81a81a.
>
> In the eisa_probe() force_probe path, if we were unable to request slot
> resources (e.g., [io 0x800-0x8ff]), we skipped the slot with "Cannot
> allocate resource for EISA slot %d" before reading the EISA signature in
> eisa_init_device().
>
> Commit 26abfeed4341 moved eisa_init_device() earlier, so we tried to read
> the EISA signature before requesting the slot resources, and this caused
> hangs during boot.
>
> Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1251816
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/eisa/eisa-bus.c |   26 +++---
>  1 file changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
> index 1b86fe0c2e80..612afeaec3cb 100644
> --- a/drivers/eisa/eisa-bus.c
> +++ b/drivers/eisa/eisa-bus.c
> @@ -277,11 +277,13 @@ static int __init eisa_request_resources(struct 
> eisa_root_device *root,
>   }
>   
>   if (slot) {
> + edev->res[i].name  = NULL;
>   edev->res[i].start = SLOT_ADDRESS(root, slot)
>+ (i * 0x400);
>   edev->res[i].end   = edev->res[i].start + 0xff;
>   edev->res[i].flags = IORESOURCE_IO;
>   } else {
> + edev->res[i].name  = NULL;
>   edev->res[i].start = SLOT_ADDRESS(root, slot)
>+ EISA_VENDOR_ID_OFFSET;
>   edev->res[i].end   = edev->res[i].start + 3;
> @@ -327,19 +329,20 @@ static int __init eisa_probe(struct eisa_root_device 
> *root)
>   return -ENOMEM;
>   }
>   
> - if (eisa_init_device(root, edev, 0)) {
> + if (eisa_request_resources(root, edev, 0)) {
> + dev_warn(root->dev,
> +  "EISA: Cannot allocate resource for mainboard\n");
>   kfree(edev);
>   if (!root->force_probe)
> - return -ENODEV;
> + return -EBUSY;
>   goto force_probe;
>   }
>  
> - if (eisa_request_resources(root, edev, 0)) {
> - dev_warn(root->dev,
> -  "EISA: Cannot allocate resource for mainboard\n");
> + if (eisa_init_device(root, edev, 0)) {
> + eisa_release_resources(edev);
>   kfree(edev);
>   if (!root->force_probe)
> - return -EBUSY;
> + return -ENODEV;
>   goto force_probe;
>   }
>  
> @@ -362,11

ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando

Hi,

Upgraded a few kernels to the latest 3.10 stable tree while tracking down
a rare kernel panic, seems to have introduced a much more frequent kernel
panic. Takes anywhere from 4 hours to 2 days to trigger:

<4>[196727.311203] general protection fault:  [#1] SMP
<4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge 
coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog 
ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si 
ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
<4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
<4>[196727.311344] Hardware name: Supermicro 
X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
<4>[196727.311364] task: 885e6f069700 ti: 885e6f072000 task.ti: 
885e6f072000
<4>[196727.311377] RIP: 0010:[]  [] 
ipv4_dst_destroy+0x4f/0x80
<4>[196727.311399] RSP: 0018:885effd23a70  EFLAGS: 00010282
<4>[196727.311409] RAX: dead00200200 RBX: 8854c398ecc0 RCX: 
0040
<4>[196727.311423] RDX: dead00100100 RSI: dead00100100 RDI: 
dead00200200
<4>[196727.311437] RBP: 885effd23a80 R08: 815fd9e0 R09: 
885d5a590800
<4>[196727.311451] R10:  R11:  R12: 

<4>[196727.311464] R13: 81c8c280 R14:  R15: 
880e85ee16ce
<4>[196727.311510] FS:  () GS:885effd2() 
knlGS:
<4>[196727.311554] CS:  0010 DS:  ES:  CR0: 80050033
<4>[196727.311581] CR2: 7a46751eb000 CR3: 005e65688000 CR4: 
000407e0
<4>[196727.311625] DR0:  DR1:  DR2: 

<4>[196727.311669] DR3:  DR6: 0ff0 DR7: 
0400
<4>[196727.311713] Stack:
<4>[196727.311733]  8854c398ecc0 8854c398ecc0 885effd23ab0 
815b7f42
<4>[196727.311784]  88be6595bc00 8854c398ecc0  
8854c398ecc0
<4>[196727.311834]  885effd23ad0 815b86c6 885d5a590800 
8816827821c0
<4>[196727.311885] Call Trace:
<4>[196727.311907]  
<4>[196727.311912]  [] dst_destroy+0x32/0xe0
<4>[196727.311959]  [] dst_release+0x56/0x80
<4>[196727.311986]  [] tcp_v4_do_rcv+0x2a5/0x4a0
<4>[196727.312013]  [] tcp_v4_rcv+0x7da/0x820
<4>[196727.312041]  [] ? ip_rcv_finish+0x360/0x360
<4>[196727.312070]  [] ? nf_hook_slow+0x7d/0x150
<4>[196727.312097]  [] ? ip_rcv_finish+0x360/0x360
<4>[196727.312125]  [] ip_local_deliver_finish+0xb2/0x230
<4>[196727.312154]  [] ip_local_deliver+0x4a/0x90
<4>[196727.312183]  [] ip_rcv_finish+0x119/0x360
<4>[196727.312212]  [] ip_rcv+0x22b/0x340
<4>[196727.312242]  [] ? macvlan_broadcast+0x160/0x160 
[macvlan]
<4>[196727.312275]  [] __netif_receive_skb_core+0x512/0x640
<4>[196727.312308]  [] ? kmem_cache_alloc+0x13b/0x150
<4>[196727.312338]  [] __netif_receive_skb+0x21/0x70
<4>[196727.312368]  [] netif_receive_skb+0x31/0xa0
<4>[196727.312397]  [] napi_gro_receive+0xe8/0x140
<4>[196727.312433]  [] ixgbe_poll+0x551/0x11f0 [ixgbe]
<4>[196727.312463]  [] ? ip_rcv+0x22b/0x340
<4>[196727.312491]  [] net_rx_action+0x111/0x210
<4>[196727.312521]  [] ? __netif_receive_skb+0x21/0x70
<4>[196727.312552]  [] __do_softirq+0xd0/0x270
<4>[196727.312583]  [] call_softirq+0x1c/0x30
<4>[196727.312613]  [] do_softirq+0x55/0x90
<4>[196727.312640]  [] irq_exit+0x55/0x60
<4>[196727.312668]  [] do_IRQ+0x63/0xe0
<4>[196727.312696]  [] common_interrupt+0x6a/0x6a
<4>[196727.312722]  
<4>[196727.312727]  [] ? default_idle+0x20/0xe0
<4>[196727.312775]  [] arch_cpu_idle+0xf/0x20
<4>[196727.312803]  [] cpu_startup_entry+0xc0/0x270
<4>[196727.312833]  [] start_secondary+0x1f9/0x200
<4>[196727.312860] Code: 4a 9f e9 81 e8 13 cb 0c 00 48 8b 93 b0 00 00 00 48 bf 
00 02 20 00 00 00 ad de 48 8b 83 b8 00 00 00 48 be 00 01 10 00 00 00 ad de <48> 
89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81
<1>[196727.313071] RIP  [] ipv4_dst_destroy+0x4f/0x80
<4>[196727.313100]  RSP 
<4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
<0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt


... bisecting it's going to be a pain... I tried eyeballing the diffs and
am trying a revert or two.

We've hit it in .25, .26 so far. I have .27 running but not sure if it
crashed, so the change exists between .15 and .25.

Thanks,
-Dormando

Re: [GIT PULL] target fixes for v3.13

2014-01-17 Thread Linus Torvalds

On Thu, Jan 16, 2014 at 4:09 PM, Nicholas A. Bellinger
 wrote:
>
> This change allows the percpu_ida tag allocator to optionally use
> interruptible sleep that iscsi-target expects, while still leaving the
> functionality + interface for existing percpu_ida consumers unchanged.

I'm not pulling this. Passing in TASK_RUNNING to prepare_to_wait() is
insane (because if it ever were to actually wait, that would be a
bug), and afaik this would be the first time anybody ever does that.

Yes, yes, it may "work", but I'm not pulling that kind of hack just
before a release.

Quite frankly, it looks like you want to have a tristate argument ("no
wait", "wait interruptibly", "wait uninterruptibly") to that
percpu_ida_alloc() function. Fine. But dammit, using this kind of
hackery, and then having two *different* calling conventions (one
mis-using the gfp_t for legacy reasons, and one now using the task
state flags in odd ways) is just not acceptable.

Now, neither of those two is perfect, but I can see why you want to
use the task state ones to say which kind of interruptible you want..
But I really don't like suddenly having a
prepare_to_wait(TASK_RUNNING) caller without any discussion, and I
*really* don't like having two completely different models for this
hack.

So quite frankly, I'd much prefer:

 - talk to the scheduler people, and make them aware of the fact that
you are going to pass in TASK_RUNNING to prepare_to_wait(). It works
with the code as-is, as long as you don't actually then wait.

 - Don't do that wrapper function with a totally different calling
convention logic. Instead, just change all the callers explicitly.
From a quick look, you really only have a couple of cases:

  (a) target/iscsi, which wants the new ternary argument

  (b) vhost/scsi.c, which uses GFP_ATOMIC and can be changed to TASK_RUNNING

  (c) block/blk-mq-tag.c, which already hates the current insane
thing, and uses __GFP_WAIT and (gfp & ~__GFP_WAIT) and other hacks,
and is obviously *very* aware of the internal hackery in the current
percpu_ida_alloc() argument. So I'm getting the feeling that that
whole thing might actually be *happier* with the TASK_xyz flags.
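
The tristate argument described above could look roughly like the following. This is only a minimal userspace sketch, not kernel code: the enum `ida_wait_mode`, the helpers `wait_mode_to_task_state()` and `wait_mode_may_sleep()`, and the `SK_TASK_*` constants are hypothetical names made up for illustration and do not come from the percpu_ida code.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's task state flags (assumed values). */
#define SK_TASK_RUNNING          0
#define SK_TASK_INTERRUPTIBLE    1
#define SK_TASK_UNINTERRUPTIBLE  2

/* Hypothetical tristate that would replace the overloaded gfp_t argument. */
enum ida_wait_mode {
	IDA_NO_WAIT,
	IDA_WAIT_INTERRUPTIBLE,
	IDA_WAIT_UNINTERRUPTIBLE,
};

/* Map the caller's intent to the task state a sleeping allocator would use. */
static int wait_mode_to_task_state(enum ida_wait_mode mode)
{
	switch (mode) {
	case IDA_WAIT_INTERRUPTIBLE:
		return SK_TASK_INTERRUPTIBLE;
	case IDA_WAIT_UNINTERRUPTIBLE:
		return SK_TASK_UNINTERRUPTIBLE;
	case IDA_NO_WAIT:
	default:
		return SK_TASK_RUNNING;
	}
}

/* Only the two waiting modes may ever go on to actually sleep. */
static bool wait_mode_may_sleep(enum ida_wait_mode mode)
{
	return wait_mode_to_task_state(mode) != SK_TASK_RUNNING;
}
```

With one calling convention like this, each caller states its intent directly instead of encoding it in gfp flags in one place and task state flags in another.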

So I really think this needs cleanup, and that hacky "passing in
TASK_RUNNING to prepare_to_wait()" needs to be made official. And yes,
that implies that it's too late to try to push this through for 3.13,
this goes into the next merge window and can be backported.

Added the appropriate people to the Cc..

  Linus

[PATCH V2] fs: don't write pages when receiving a pending SIGKILL in __get_user_pages()

2014-01-17 Thread Xishi Qiu

On the direct IO path, dio_refill_pages() calls get_user_pages_fast() 
to map the pages from user space. If ret is less than 0 and the IO is a 
write, the function uses the zero page to fill in the data. This may work 
for some filesystems, but for some device operations we prefer that a write 
either completes in full or fails, not half data and half zeroes, e.g. for 
fs metadata such as inodes and identity data.
This happens often when a process that is doing direct IO is killed. 
Consider the following case: process A is doing direct IO and may be inside 
__get_user_pages(). If another process sends SIGKILL to process A, A will 
take the following branch:
/*
 * If we have a pending SIGKILL, don't keep faulting
 * pages and potentially allocating memory.
 */
if (unlikely(fatal_signal_pending(current)))
return i ? i : -ERESTARTSYS;
This returns the pages obtained so far. Direct IO will write those pages, 
and the subsequent pages that could not be obtained are replaced by the 
zero page.
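
The behaviour this patch aims for can be sketched as a small decision function. This is a standalone illustration under stated assumptions, not the fs/direct-io.c code: `refill_decide()`, `enum refill_action` and `SKETCH_ERESTARTSYS` are hypothetical names, and the value 512 mirrors the kernel-internal ERESTARTSYS, which userspace headers do not export.

```c
#include <stdbool.h>

/* Kernel-internal errno value, defined locally for this sketch. */
#define SKETCH_ERESTARTSYS 512

enum refill_action {
	REFILL_USE_PAGES,  /* pages were mapped, carry on with them */
	REFILL_ZERO_FILL,  /* fault during a write: pad with the zero page */
	REFILL_ABORT,      /* stop and hand the error back to the caller */
};

/* Decide what to do with the return value of the page-mapping call. */
static enum refill_action refill_decide(int ret, bool blocks_available,
					bool is_write)
{
	if (ret >= 0)
		return REFILL_USE_PAGES;
	if (ret == -SKETCH_ERESTARTSYS)
		return REFILL_ABORT;	/* task was killed: write nothing more */
	if (blocks_available && is_write)
		return REFILL_ZERO_FILL;
	return REFILL_ABORT;
}
```

The only change relative to the existing logic is the ERESTARTSYS check, which is tested before the zero-fill case so that a killed task never pads its write with zeroes.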

Signed-off-by: Xishi Qiu 
Signed-off-by: Bin Yang 
---
 fs/direct-io.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 0e04142..b74d565 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -174,6 +174,9 @@ static inline int dio_refill_pages(struct dio *dio, struct 
dio_submit *sdio)
                &dio->pages[0]);        /* Put results here */
 
if (ret < 0 && sdio->blocks_available && (dio->rw & WRITE)) {
+   /* If task is killed, do not write anymore */
+   if (ret == -ERESTARTSYS)
+   goto out;
struct page *page = ZERO_PAGE(0);
/*
 * A memory fault, but the filesystem has some outstanding
-- 
1.7.1




Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread H. Peter Anvin

On 01/17/2014 04:29 PM, Pavel Machek wrote:
> 
> Have you checked your dmesg recently? Normal people don't read
> it... it is just too much of it.
> 
>> Printing a warning is appropriate if we can't actually fix the problem
>> in the OS.  If we actually make the problem go away then we have just
>> done our job and we can be done with it.
> 
> I disagree. Older kernel versions still may have problem, etc.
> 
> We normally do print warnings for problems we work around. We want
> vendors to fix their hardware, too...
> 
> ACPI BIOS Warning (bug): 32/64X FACS address mismatch in FADT -
> 0xBDB5FF40/0xBDB64F40, using 32 (20131115/tbfadt-522)
> [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
> 

You say people don't read their dmesg...
... because there is too much ...
... so let's add more?

What am I missing here?

-hpa



Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-17 Thread Russell King - ARM Linux

On Fri, Jan 17, 2014 at 07:57:16PM -0500, Alan Ott wrote:
> On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:
>> My suspicion therefore is that some other thread must have died while
>> holding the mmap_sem, so there's probably a kernel oops earlier...
>> that's my best guess at the moment without seeing the full backtrace.
>
> There's no oops that I'm able to see.
>
> Each of the tasks which lockdep reports as "holding" mmap_sem are  
> blocking for it. If some other task had taken it and then crashed, I  
> assume lockdep would list the crashed task as also holding the resource  
> in the printout.

My point is this:

- the five (or six) threads which are trying to take the mmap_sem in
  read-mode in the fault handler are all blocked on it - they haven't
  taken the lock, which will only happen because there's a pending writer.
- of these in your original post, there are two which faulted from
  __copy_to_user_std().  __copy_to_user_std() doesn't take the mmap_sem -
  this is the non-uaccess-with-memcpy path.
- the pending writers are the two threads in sys_mmap_pgoff(), both of
  which are blocked waiting to gain the write lock.
- there are no *other* threads holding the mmap_sem lock.
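
To illustrate why readers can pile up behind a pending writer, here is a toy model in plain C. It is an assumption-laden sketch, not the kernel's rw_semaphore: `struct toy_rwsem`, `toy_down_read()` and `toy_down_write_or_queue()` are hypothetical names, and the model ignores sleeping and wakeups entirely.

```c
#include <stdbool.h>

/* Toy model of a fair reader/writer semaphore; not the kernel's rwsem. */
struct toy_rwsem {
	int  active_readers;   /* readers currently holding the lock */
	bool writer_active;    /* a writer currently holds the lock */
	int  waiting_writers;  /* writers queued behind the current holders */
};

/* A new reader gets in only if no writer holds or is queued for the lock. */
static bool toy_down_read(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->waiting_writers > 0)
		return false;	/* would block behind the writer */
	sem->active_readers++;
	return true;
}

/* A writer needs exclusive access; if it cannot get it, it queues. */
static bool toy_down_write_or_queue(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->active_readers > 0) {
		sem->waiting_writers++;
		return false;	/* would block until the holders release */
	}
	sem->writer_active = true;
	return true;
}
```

In this model, one reader that never releases the lock is enough to stall a writer, and every reader arriving after that writer stalls too, which matches the pattern in the dump: blocked readers, blocked writers, and no visible holder.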

So... there's a question here how we got into this state - and frankly
I don't know.  What I do see from your latest dump is that there's two
unknown modules there - something called rcu2m and another called
buttoms, and there are two threads inside ioctls there.  Both have
faulted from the function at 0xc0d2a394 (which won't appear in the
backtrace, but is most likely __copy_to_user_std.)

So, in the absence of you saying anything about there being any preceding
oopses, my conclusion now is that one of those modules is taking the
mmap_sem itself, and is the culprit inducing this deadlock.

Note that your dump ([2]) in your reply was just the hung task detector
printing out the stacktrace for a few tasks, not the full all-threads
stack dump which I was expecting.

So I'm pulling out these conclusions from the very little information
you're supplying.

-- 
FTTC broadband for 0.8mile line: 5.8Mbps down 500kbps up.  Estimation
in database were 13.1 to 19Mbit for a good line, about 7.5+ for a bad.
Estimate before purchase was "up to 13.2Mbit".

Re: [PATCH 1/6] cgroup: make CONFIG_NET_CLS_CGROUP and CONFIG_NETPRIO_CGROUP bool instead of tristate

2014-01-17 Thread Li Zefan

Cc: Daniel Borkmann 

On 2014/1/18 2:11, Tejun Heo wrote:
> net_cls and net_prio are the only cgroups which are allowed to be
> built as modules.  The savings from allowing the two controllers to be
> built as modules are tiny especially given that cgroup module support
> itself adds quite a bit of complexity.
> 
> The following are the sizes of vmlinux with both built as module and
> both built as part of the kernel image with cgroup module support
> removed.
> 
>       text     data      bss      dec
>   20292207  2411496 10784768 33488471
>   20293421  2412568 10784768 33490757
> 
> The total difference is 2286 bytes.  Given that none of other
> controllers has much chance of being made a module and that we're
> unlikely to add new modular controllers, the added complexity is
> simply not justifiable.
> 
> As a first step to drop cgroup module support, this patch changes the
> two config options to bool from tristate and drops module related code
> from the two controllers.
> 

I suggested that Daniel do this for net_cls, and the change is already in
net-next.

https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=fe1217c4f3f7d7cbf8efdd8dd5fdc7204a1d65a8

I was planning to remove module support after that change goes into
upstream. :)

> Signed-off-by: Tejun Heo 
> Cc: Neil Horman 
> Cc: Thomas Graf 
> Cc: "David S. Miller" 
> ---
>  net/Kconfig   |  2 +-
>  net/core/netprio_cgroup.c | 32 ++--
>  net/sched/Kconfig |  2 +-
>  net/sched/cls_cgroup.c| 23 ++-
>  4 files changed, 6 insertions(+), 53 deletions(-)
> 

The modular version of task_netprioidx() in include/net/netprio_cgroup.h
can be removed.


[PATCH] dt-bindings: qcom: Fix warning with duplicate dt define

2014-01-17 Thread Stephen Boyd

arch/arm/boot/dts/include/dt-bindings/clock/qcom,mmcc-msm8974.h:60:0:
warning: "RBCPR_CLK_SRC" redefined

Rename this to MMSS_RBCPR_CLK_SRC to avoid conflicts with the
RBCPR clock in the gcc header.

Reported-by: Bjorn Andersson 
Signed-off-by: Stephen Boyd 
---
 include/dt-bindings/clock/qcom,mmcc-msm8974.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/dt-bindings/clock/qcom,mmcc-msm8974.h 
b/include/dt-bindings/clock/qcom,mmcc-msm8974.h
index 04d318d1187a..032ed87ef0f3 100644
--- a/include/dt-bindings/clock/qcom,mmcc-msm8974.h
+++ b/include/dt-bindings/clock/qcom,mmcc-msm8974.h
@@ -57,7 +57,7 @@
 #define EXTPCLK_CLK_SRC40
 #define HDMI_CLK_SRC   41
 #define VSYNC_CLK_SRC  42
-#define RBCPR_CLK_SRC  43
+#define MMSS_RBCPR_CLK_SRC 43
 #define CAMSS_CCI_CCI_AHB_CLK  44
 #define CAMSS_CCI_CCI_CLK  45
 #define CAMSS_CSI0_AHB_CLK 46
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


Re: Deadlock in do_page_fault() on ARM (old kernel)

2014-01-17 Thread Alan Ott


On 01/17/2014 08:46 AM, Russell King - ARM Linux wrote:

> On Wed, Jan 15, 2014 at 08:13:04PM -0500, Alan Ott wrote:
>
>> So my questions are:
>> 1. Why don't I see a full backtrace beyond the exception stack? It's the
>> same when dump_stack() is called manually.
>
> No idea - it looks like you're not using frame pointers, but are using
> the unwinder.  Full backtraces can always be created with frame pointers,
> it's just that unwinding seems unreliable.


Hi Russell,

I managed to get frame pointers turned on, which in this kernel version, 
seems like it requires the unwinder to be turned off[1].


When I built with frame pointers, the backtraces show differently. It 
only shows the frames which were _not_ shown with the unwinder. 
Backtrace at [2]. As you can see, it shows the non-exception stack dumps 
ending at places which can't page fault (and if it did page fault in 
those locations, it wouldn't try to take the mmap_sem lock). It's as 
though it shows what's in the modules but not what's in the image, where 
with the unwinder, it showed what's in the image and not what's in the 
modules[3].



> I think we do need to see the full backtrace here - from looking at the
> full state dump, I don't see any sign of the mmap_sem being held except
> by an attempt to process a fault, and two threads trying to do a
> sys_mmap_pgoff().


The lockdep printout at the end of [4] (which is the original log I sent 
the other day) shows do_page_fault() holding mmap_sem in six tasks and 
sys_mmap_pgoff() holding it in two. (I don't like the term "held" in the 
lockdep printout and believe it to mean "held or waiting for," based on 
my analysis of the rw_semaphore code.)


Each of those tasks seems to be blocking for it.


> My suspicion therefore is that some other thread must have died while
> holding the mmap_sem, so there's probably a kernel oops earlier...
> that's my best guess at the moment without seeing the full backtrace.


There's no oops that I'm able to see.

Each of the tasks which lockdep reports as "holding" mmap_sem are 
blocking for it. If some other task had taken it and then crashed, I 
assume lockdep would list the crashed task as also holding the resource 
in the printout.


Thanks for having a look at this,

Alan.

[1] It seems a little strange to me even in the newest kernel:
1: lib/Kconfig.debug specifies FRAME_POINTER . The text of this instance 
shows up when one searches for FRAME_POINTER in menuconfig. It can 
become selected even though the dependencies listed here are not met.
2: arch/arm/Kconfig.debug also specifies FRAME_POINTER, with a 
dependency only on !THUMB2_KERNEL, defaulting to yes on !ARM_UNWIND

3: ARM doesn't set ARCH_WANT_FRAME_POINTERS

[2] http://www.signal11.us/~alan/stack_dump_with_frame_pointers.txt
Note: This is a sysrq dump of the locks held at lockup time and then the 
automatic hung task detection.


[3] The modules are being built in the standard way for out-of-tree:
$(MAKE) -C $(PRJDIR)/linux M=$(CWD) modules

[4] http://www.signal11.us/~alan/show-all-tasks-deadlock.txt



[PATCH RFC 2/2] crypto: Simple load balancer test module

2014-01-17 Thread Tadeusz Struk

Test module for the simple algorithm load balancer

Signed-off-by: Tadeusz Struk 
---
 crypto/Kconfig|6 ++
 crypto/Makefile   |1 +
 crypto/test_alg_loadbalance.c |  231 +
 3 files changed, 247 insertions(+)
 create mode 100644 crypto/test_alg_loadbalance.c

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 7bcb70d..85e1bc2 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1404,6 +1404,12 @@ config CRYPTO_USER_API_SKCIPHER
 config CRYPTO_HASH_INFO
bool
 
+config CRYPTO_ALG_LOAD_BALANCE_TEST
+   tristate "Crypto Algorithm load balancer test module"
+   default n
+   help
+ This option enables the crypto algorithm load balancer test module.
+
 source "drivers/crypto/Kconfig"
 source crypto/asymmetric_keys/Kconfig
 
diff --git a/crypto/Makefile b/crypto/Makefile
index b29402a..5a49e2a 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -97,6 +97,7 @@ obj-$(CONFIG_CRYPTO_GHASH) += ghash-generic.o
 obj-$(CONFIG_CRYPTO_USER_API) += af_alg.o
 obj-$(CONFIG_CRYPTO_USER_API_HASH) += algif_hash.o
 obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) += algif_skcipher.o
+obj-$(CONFIG_CRYPTO_ALG_LOAD_BALANCE_TEST) += test_alg_loadbalance.o
 
 #
 # generic algorithms and the async_tx api
diff --git a/crypto/test_alg_loadbalance.c b/crypto/test_alg_loadbalance.c
new file mode 100644
index 000..2637247
--- /dev/null
+++ b/crypto/test_alg_loadbalance.c
@@ -0,0 +1,231 @@
+/*
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
+   struct scatterlist *src, unsigned int nbytes)
+{
+   return 0;
+}
+static int setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen)
+{
+   return 0;
+}
+
+
+#define loops 300
+static struct crypto_ablkcipher *tfms[loops];
+
+static struct crypto_alg tmp = {
+   .cra_name   = "test",
+   .cra_driver_name= "test_driver",
+   .cra_priority   = 1,
+   .cra_flags  = CRYPTO_ALG_TYPE_ABLKCIPHER,
+   .cra_module = THIS_MODULE,
+   .cra_ctxsize= 0,
+   .cra_type   = &crypto_blkcipher_type,
+   .cra_u = {
+   .blkcipher = {
+   .min_keysize= 64,
+   .max_keysize= 64,
+   .ivsize = 64,
+   .setkey = setkey,
+   .encrypt= encrypt,
+   .decrypt= encrypt,
+   },
+   },
+   };
+static struct crypto_alg algs[10] = {
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+   { { 0 } },
+};
+
+static int __init cra_lbtest_init(void)
+{
+   int i;
+
+   for (i = 0; i < 10; i++) {
+   algs[i] = tmp;
+   algs[i].cra_priority += i;
+   }
+
+   if (crypto_register_algs(algs, 10))
+   return -1;
+
+   for (i = 0; i < loops; i++) {
+   tfms[i] = crypto_alloc_ablkcipher("test", 0, 0);
+   if (IS_ERR(tfms[i]))
+   return PTR_ERR(tfms[i]);
+   }
+   pr_info("Algorithm load balancig test results for %d allocatoins\n",
+   loops);
+   pr_info("All different alg priorities:\n");
+   for (i = 0; i < 10; i++) {
+   unsigned long times = atomic_read(&algs[i].cra_refcnt);
+   unsigned long percent = (times * 100) / loops;
+
+   pr_info("Alg with cra_pri %d allocated %lu times. That's 
~%lu%%\n",
+   algs[i].cra_priority, times, percent);
+
+   }
+   for (i = 0; i < loops; i++)
+   crypto_free_ablkcipher(tfms[i]);
+
+   crypto_unregister_algs(algs, 10);
+
+   for (i = 0; i < 10; i++) {
+   algs[i] = tmp;
+   algs[i].cra_priority = 10;
+   }
+   if (crypto_register_algs(algs, 10))
+   return -1;
+
+   for (i = 0; i < loops; i++) {
+   tfms[i] = crypto_alloc_ablkcipher("test", 0, 0);
+   if (IS_ERR(tfms[i]))
+   return PTR_ERR(tfms[i]);
+   }
+   pr_info("All same alg priorities:\n");
+   for (i = 0; i < 10; i++) {
+   unsigned long times = atomic_read(&algs[i].cra_refcnt);
+   unsigned long percent = (times * 100) / loops;
+
+

[PATCH RFC 1/2] crypto: Simple crypto algorithm load balancer

2014-01-17 Thread Tadeusz Struk

Hi,
When multiple crypto devices implement the same algorithm, the choice
of which one to use is based on cra_priority.
Currently the algorithm with the highest priority is always used.
If two algorithms have the same priority, the one added last is used.
There is no load balancing, so on a busy system with several crypto
accelerators it can happen that only one is used for all crypto jobs
while the others sit idle.
This patch implements simple load balancing based on the priority
of an algorithm and its usage refctr. The distribution for 10 algorithms
and 300 tfm_alloc allocations is as follows:

 === Algorithm load balancig test results for 300 allocatoins ===
 All different alg priorities:
 Algorithm 1 with cra_pri 1 allocated 3 times. That's ~0%
 Algorithm 2 with cra_pri 2 allocated 10 times. That's ~0%
 Algorithm 3 with cra_pri 3 allocated 38 times. That's ~0%
 Algorithm 4 with cra_pri 4 allocated 168 times. That's ~0%
 Algorithm 5 with cra_pri 5 allocated 781 times. That's ~0%
 Algorithm 6 with cra_pri 6 allocated 3757 times. That's ~0%
 Algorithm 7 with cra_pri 7 allocated 18483 times. That's ~0%
 Algorithm 8 with cra_pri 8 allocated 92540 times. That's ~3%
 Algorithm 9 with cra_pri 9 allocated 469918 times. That's ~15%
 Algorithm 10 with cra_pri 10 allocated 2414312 times. That's ~80%

 All the same alg priorities:
 Algorithm 1 with cra_pri 10 allocated 297521 times. That's ~9%
 Algorithm 2 with cra_pri 10 allocated 297521 times. That's ~9%
 Algorithm 3 with cra_pri 10 allocated 297521 times. That's ~9%
 Algorithm 4 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 5 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 6 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 7 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 8 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 9 with cra_pri 10 allocated 297522 times. That's ~9%
 Algorithm 10 with cra_pri 10 allocated 297522 times. That's ~9%

 A mix of alg priorities:
 Algorithm 1 with cra_pri 9 allocated 1938 times. That's ~0%
 Algorithm 2 with cra_pri 4 allocated 8 times. That's ~0%
 Algorithm 3 with cra_pri 1 allocated 2 times. That's ~0%
 Algorithm 4 with cra_pri 4 allocated 9 times. That's ~0%
 Algorithm 5 with cra_pri 5 allocated 62 times. That's ~0%
 Algorithm 6 with cra_pri 20 allocated 157052 times. That's ~5%
 Algorithm 7 with cra_pri 21 allocated 1410070 times. That's ~47%
 Algorithm 8 with cra_pri 10 allocated 10215 times. That's ~0%
 Algorithm 9 with cra_pri 21 allocated 1420284 times. That's ~47%
 Algorithm 10 with cra_pri 7 allocated 370 times. That's ~0%

 A mix of alg priorities with one very big:
 Algorithm 1 with cra_pri 9 allocated 6 times. That's ~0%
 Algorithm 2 with cra_pri 4 allocated 1 times. That's ~0%
 Algorithm 3 with cra_pri 1 allocated 1 times. That's ~0%
 Algorithm 4 with cra_pri 4 allocated 1 times. That's ~0%
 Algorithm 5 with cra_pri 500 allocated 2993808 times. That's ~99%
 Algorithm 6 with cra_pri 20 allocated 346 times. That's ~0%
 Algorithm 7 with cra_pri 21 allocated 2899 times. That's ~0%
 Algorithm 8 with cra_pri 10 allocated 23 times. That's ~0%
 Algorithm 9 with cra_pri 21 allocated 2922 times. That's ~0%
 Algorithm 10 with cra_pri 7 allocated 3 times. That's ~0%
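
The selection rule described above (compare priority minus a usage counter
instead of the bare priority) can be sketched in userspace C. This is only an
illustration under assumptions: struct toy_alg and pick_alg() are hypothetical
names, the rule that updates the counter is not visible in this excerpt, and
the sketch makes no attempt to reproduce the distribution figures listed above.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the lookup compares. */
struct toy_alg {
	int priority;	/* plays the role of cra_priority */
	int lbalance;	/* plays the role of cra_lbalance (update rule assumed) */
};

/*
 * Return the index of the entry with the highest relative priority
 * (priority - lbalance), or -1 if the array is empty.
 * The first entry in the array wins ties.
 */
static int pick_alg(const struct toy_alg *algs, size_t n)
{
	int best = INT_MIN;
	int idx = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		int rel = algs[i].priority - algs[i].lbalance;

		if (idx < 0 || rel > best) {
			best = rel;
			idx = (int)i;
		}
	}
	return idx;
}
```

Raising the counter of a busy entry lowers its relative priority, so a
lower-priority but idle entry can be picked next.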

Signed-off-by: Tadeusz Struk 
---
 crypto/api.c   |   25 ++---
 include/linux/crypto.h |3 ++-
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/crypto/api.c b/crypto/api.c
index a2b39c5..0c0f1c3 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -63,7 +63,7 @@ static struct crypto_alg *__crypto_alg_lookup(const char *name, u32 type,
int best = -2;
 
list_for_each_entry(q, &crypto_alg_list, cra_list) {
-   int exact, fuzzy;
+   int exact, fuzzy, relative_priority;
 
if (crypto_is_moribund(q))
continue;
@@ -78,13 +78,15 @@ static struct crypto_alg *__crypto_alg_lookup(const char *name, u32 type,
 
exact = !strcmp(q->cra_driver_name, name);
fuzzy = !strcmp(q->cra_name, name);
-   if (!exact && !(fuzzy && q->cra_priority > best))
+   relative_priority = q->cra_priority -
+   atomic_read(&q->cra_lbalance);
+   if (!exact && !(fuzzy && relative_priority > best))
continue;
 
if (unlikely(!crypto_mod_get(q)))
continue;
 
-   best = q->cra_priority;
+   best = relative_priority;
if (alg)
crypto_mod_put(alg);
alg = q;
@@ -92,6 +94,23 @@ static struct crypto_alg *__crypto_alg_lookup(const char *name, u32 type,
if (exact)
break;
}
+   if (!alg) {
+   list_for_each_entry(q, &crypto_alg_list, cra_list) {
+   int best = -2;
+

Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread Pavel Machek

On Fri 2014-01-17 14:51:52, H. Peter Anvin wrote:
> On 01/17/2014 02:50 PM, Borislav Petkov wrote:
> > On Fri, Jan 17, 2014 at 11:28:06PM +0100, Pavel Machek wrote:
> >> Would it make sense to printk() a warning?
> > 
> > No because people come and start bitching about their dmesg containing
> > a warning and whether their hardware is b0rked without even reading the
> > actual words.

Have you checked your dmesg recently? Normal people don't read
it... there is just too much of it.

> Printing a warning is appropriate if we can't actually fix the problem
> in the OS.  If we actually make the problem go away then we have just
> done our job and we can be done with it.

I disagree. Older kernel versions may still have the problem, etc.

We normally do print warnings for problems we work around. We want
vendors to fix their hardware, too...

ACPI BIOS Warning (bug): 32/64X FACS address mismatch in FADT -
0xBDB5FF40/0xBDB64F40, using 32 (20131115/tbfadt-522)
[Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/1] mtd: block2mtd: char mtd major and erasesize parameter check + mutex_destroy

2014-01-17 Thread Fabian Frederick

-Refuse to map a char mtd device as a block device.
-Call mutex_init once the mtd structure is available.
-Apply the FIXME: check that the device size is a multiple of erasesize.
-Call mutex_destroy for each device in block2mtd_exit and on add_device failure.

Signed-off-by: Fabian Frederick 
---
 drivers/mtd/devices/block2mtd.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index d9fd87a..be67731 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -209,7 +209,6 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
 }
 
 
-/* FIXME: ensure that mtd->size % erase_size == 0 */
 static struct block2mtd_dev *add_device(char *devname, int erase_size)
 {
const fmode_t mode = FMODE_READ | FMODE_WRITE | FMODE_EXCL;
@@ -244,21 +243,27 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size)
}
dev->blkdev = bdev;
 
-   if (MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) {
+   if ((MAJOR(bdev->bd_dev) == MTD_BLOCK_MAJOR) ||
+   (MAJOR(bdev->bd_dev) == MTD_CHAR_MAJOR)) {
pr_err("attempting to use an MTD device as a block device\n");
goto devinit_err;
}
 
-   mutex_init(&dev->write_mutex);
 
/* Setup the MTD structure */
/* make the name contain the block device in */
name = kasprintf(GFP_KERNEL, "block2mtd: %s", devname);
+
if (!name)
goto devinit_err;
+
+   if ((long)dev->blkdev->bd_inode->i_size % erase_size) {
+   pr_err("erasesize must be a divisor of device size\n");
+   goto devinit_err;
+   }
 
+   mutex_init(&dev->write_mutex);
dev->mtd.name = name;
-
dev->mtd.size = dev->blkdev->bd_inode->i_size & PAGE_MASK;
dev->mtd.erasesize = erase_size;
dev->mtd.writesize = 1;
@@ -274,7 +279,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size)
 
if (mtd_device_register(&dev->mtd, NULL, 0)) {
/* Device didn't get added, so free the entry */
-   goto devinit_err;
+   goto register_err;
}
list_add(&dev->list, &blkmtd_device_list);
pr_info("mtd%d: [%s] erase_size = %dKiB [%d]\n",
@@ -283,6 +288,8 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size)
dev->mtd.erasesize >> 10, dev->mtd.erasesize);
return dev;
 
+register_err:
+   mutex_destroy(&dev->write_mutex);
 devinit_err:
block2mtd_free_device(dev);
return NULL;
@@ -448,6 +455,7 @@ static void block2mtd_exit(void)
struct block2mtd_dev *dev = list_entry(pos, typeof(*dev), list);
block2mtd_sync(&dev->mtd);
mtd_device_unregister(&dev->mtd);
+   mutex_destroy(&dev->write_mutex);
pr_info("mtd%d: [%s] removed\n",
dev->mtd.index,
dev->mtd.name + strlen("block2mtd: "));
-- 
1.8.1.4
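
The size check this patch adds (the device size must be a whole multiple of
erasesize) can be sketched in userspace C. The helper name erasesize_fits() is
hypothetical, and two details are assumptions rather than part of the patch:
the size is kept in a 64-bit type (the patch casts i_size to long), and a zero
erase size is rejected up front.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when erase_size is non-zero and divides dev_size exactly.
 * Both arguments are in bytes.  A 64-bit type is used for the size so
 * that large devices are not truncated before the remainder is taken.
 */
static bool erasesize_fits(uint64_t dev_size, uint32_t erase_size)
{
	if (erase_size == 0)
		return false;	/* avoid dividing by zero */
	return dev_size % erase_size == 0;
}
```

With this rule a 1 MiB device accepts a 4 KiB erase size, while a device that
is 512 bytes larger does not.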

Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Chris Ball

Hi, adding Aisheng,

On Fri, Jan 17 2014, Andrew Bresticker wrote:
> On Fri, Jan 17, 2014 at 3:11 PM, John Tobias  wrote:
>> There's an existing patch for that...
>> http://www.spinics.net/lists/arm-kernel/msg296596.html
>
> Ah, I see.  Looks like it has yet to be picked up...

The patches aren't quite identical -- Andrew's leaves the
disable_irq() call in and Aisheng's removes it.  Which should I take?

Thanks,

- Chris.
-- 
Chris Ball  

Re: [PATCH] pinctrl: capri: add dependency on OF

2014-01-17 Thread Stephen Warren

On 01/17/2014 04:13 PM, Linus Walleij wrote:
> On Sat, Jan 18, 2014 at 12:12 AM, Linus Walleij
>  wrote:
>> On Fri, Jan 17, 2014 at 8:51 PM, Sherman Yin  wrote:
>>
>>> Thanks for the fix, Linus.  While we're visiting this config, should we add
>>> "depends on MACH_BCM_MOBILE" as well?
>>
>> No, it's nice to get the compile coverage.
> 
> But maybe you can experiment with that special option that only
> turns on the driver on other platforms to do compile test.

a/k/a
depends on XXX || COMPILE_TEST

Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Andrew Bresticker

On Fri, Jan 17, 2014 at 3:11 PM, John Tobias  wrote:
> There's an existing patch for that...
> http://www.spinics.net/lists/arm-kernel/msg296596.html

Ah, I see.  Looks like it has yet to be picked up...


Re: [PATCH] pinctrl: capri: add dependency on OF

2014-01-17 Thread Linus Walleij

On Sat, Jan 18, 2014 at 12:12 AM, Linus Walleij
 wrote:
> On Fri, Jan 17, 2014 at 8:51 PM, Sherman Yin  wrote:
>
>> Thanks for the fix, Linus.  While we're visiting this config, should we add
>> "depends on MACH_BCM_MOBILE" as well?
>
> No, it's nice to get the compile coverage.

But maybe you can experiment with that special option that only
turns on the driver on other platforms to do compile test.

Yours,
Linus Walleij

Re: [PATCH] pinctrl: capri: add dependency on OF

2014-01-17 Thread Linus Walleij

On Fri, Jan 17, 2014 at 8:51 PM, Sherman Yin  wrote:

> Thanks for the fix, Linus.  While we're visiting this config, should we add
> "depends on MACH_BCM_MOBILE" as well?

No, it's nice to get the compile coverage.

Yours,
Linus Walleij

Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread John Tobias

There's an existing patch for that...
http://www.spinics.net/lists/arm-kernel/msg296596.html


Re: [PATCH RFC 4/6] net: rfkill: gpio: add device tree support

2014-01-17 Thread Linus Walleij

On Fri, Jan 17, 2014 at 6:43 PM, Chen-Yu Tsai  wrote:
> On Sat, Jan 18, 2014 at 12:47 AM, Arnd Bergmann  wrote:

>>> +- NAME_shutdown-gpios  : GPIO phandle to shutdown control
>>> + (phandle must be the second)
>>> +- NAME_reset-gpios : GPIO phandle to reset control
>>> +
>>> +NAME must match the rfkill-name property. NAME_shutdown-gpios or
>>> +NAME_reset-gpios, or both, must be defined.
>>> +
>>
>> I don't understand this part. Why do you include the name in the
>> gpios property, rather than just hardcoding the property strings
>> to "shutdown-gpios" and "reset-gpios"?
>
> This quirk is a result of how gpiod_get_index implements device tree
> lookup.

Why can't it just have a single property "gpios", where the first
element is the reset GPIO and the second is the shutdown GPIO?

rfkill-gpio does this:

gpio = devm_gpiod_get_index(&pdev->dev, rfkill->reset_name, 0);
gpio = devm_gpiod_get_index(&pdev->dev, rfkill->shutdown_name, 1);

The passed con ID name parameter is only there for the device
tree case it seems. (ACPI ignores it.) So what about you just
don't pass it at all and patch it to do like this instead:

gpio = devm_gpiod_get_index(&pdev->dev, NULL, 0);
gpio = devm_gpiod_get_index(&pdev->dev, NULL, 1);

Heikki, are you OK with this change?

I think this is actually necessary if the ACPI and DT unification
pipe dream shall limp forward, we cannot have arguments passed
that have a semantic effect on DT but not on ACPI... Drivers
that are supposed to use both ACPI and DT will always
have to pass NULL as con ID.

> If con_id is given, it is prepended to "gpios" as the property string.
> con_id is also used as the label passed to gpiod_request, which is
> then shown in /sys/kernel/debug/gpio.

If your problem is really what turns up in debugfs, then we need
to figure out a way to label gpios outside of the *gpiod_get* calls.

The string passed in *gpiod_get* is a "connection ID" not a proper
name for the GPIO.
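
The naming rule quoted above (con_id is prepended to "gpios" to form the
property string) can be sketched in plain C. This is a userspace illustration
only: toy_gpio_prop_name() is a hypothetical helper, not a gpiolib function,
and the hyphen separator is inferred from the NAME_shutdown-gpios property
names in the binding under review.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the device tree property name for a connection ID into buf.
 * With a con_id the result is "<con_id>-gpios"; without one it is "gpios".
 * Returns the value of snprintf(), i.e. the length the full name needs.
 */
static int toy_gpio_prop_name(char *buf, size_t len, const char *con_id)
{
	if (con_id)
		return snprintf(buf, len, "%s-gpios", con_id);
	return snprintf(buf, len, "gpios");
}
```

Passing NULL as the con ID, as suggested above, makes the lookup fall back to
the single "gpios" property indexed by position.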

Yours,
Linus Walleij

Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Andrew Bresticker

On Fri, Jan 17, 2014 at 2:58 PM, Philip Rakity  wrote:
>
> On Jan 17, 2014, at 7:57 PM, Andrew Bresticker  wrote:
>
>> sdhci_execute_tuning() takes host->lock without disabling interrupts.
>> Use spin_lock_irq{save,restore} instead so that we avoid taking an
>> interrupt and scheduling while holding host->lock.
>>
>> Signed-off-by: Andrew Bresticker 
>> ---
>> drivers/mmc/host/sdhci.c | 13 +++--
>> 1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
>> index ec3eb30..84c80e7 100644
>> --- a/drivers/mmc/host/sdhci.c
>> +++ b/drivers/mmc/host/sdhci.c
>> @@ -1857,12 +1857,13 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>   unsigned long timeout;
>>   int err = 0;
>>   bool requires_tuning_nonuhs = false;
>> + unsigned long flags;
>>
>>   host = mmc_priv(mmc);
>>
>>   sdhci_runtime_pm_get(host);
>>   disable_irq(host->irq);
>> - spin_lock(&host->lock);
>> + spin_lock_irqsave(&host->lock, flags);
>
>
> The disable_irq() call stops the controller from doing interrupts.

Right, but it does not disable other IRQ sources that could cause us
to schedule.

> Please explain what problem you are seeing

The issue we were seeing was that a card-detect interrupt was
triggered (*not* the controller interrupt), causing the card-detect
irq thread to recurse on host->lock:

[   60.962218] BUG: spinlock cpu recursion on CPU#0, irq/362-700b040/89
[   60.975253]  lock: 0xee210c80, .magic: dead4ead, .owner:
kworker/u8:1/33, .owner_cpu: 0
[   60.991638] CPU: 0 PID: 89 Comm: irq/362-700b040 Not tainted 3.10.18 #2
[   61.005199] [<800153cc>] (unwind_backtrace+0x0/0x118) from
[<800124e4>] (show_stack+0x20/0x24)
[   61.022824] [<800124e4>] (show_stack+0x20/0x24) from [<8053d584>]
(dump_stack+0x20/0x28)
[   61.039389] [<8053d584>] (dump_stack+0x20/0x28) from [<8021d508>]
(spin_dump+0x80/0x94)
[   61.055773] [<8021d508>] (spin_dump+0x80/0x94) from [<8021d548>]
(spin_bug+0x2c/0x30)
[   61.071803] [<8021d548>] (spin_bug+0x2c/0x30) from [<8021d61c>]
(do_raw_spin_lock+0x70/0x15c)
[   61.089250] [<8021d61c>] (do_raw_spin_lock+0x70/0x15c) from
[<8054098c>] (_raw_spin_lock_irqsave+0x20/0x28)
[   61.109175] [<8054098c>] (_raw_spin_lock_irqsave+0x20/0x28) from
[<803e2af0>] (sdhci_card_event+0x28/0xfc)
[   61.128922] [<803e2af0>] (sdhci_card_event+0x28/0xfc) from
[<803dc880>] (mmc_gpio_cd_irqt+0x30/0x4c)
[   61.147609] [<803dc880>] (mmc_gpio_cd_irqt+0x30/0x4c) from
[<80091858>] (irq_thread+0xf0/0x224)
[   61.165412] [<80091858>] (irq_thread+0xf0/0x224) from [<80050db4>]
(kthread+0xc8/0xd8)
[   61.181623] [<80050db4>] (kthread+0xc8/0xd8) from [<8000e4d8>]
(ret_from_fork+0x14/0x20)
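
As a rough illustration of what the _irqsave/_irqrestore pair adds over plain
spin_lock()/spin_unlock(), here is a toy userspace model of the flag
bookkeeping only: the lock variants save the local interrupt state and disable
interrupts before taking the lock, then restore the saved state on unlock.
The toy_* names are hypothetical, nothing here touches real interrupts, and
the locking itself is left out.

```c
#include <stdbool.h>

/* Toy model of one CPU's interrupt-enable flag; not kernel code. */
static bool toy_irqs_enabled = true;

/* Remember the current state, then disable "interrupts". */
static unsigned long toy_irq_save(void)
{
	unsigned long flags = toy_irqs_enabled ? 1UL : 0UL;

	toy_irqs_enabled = false;
	return flags;
}

/* Put the flag back to whatever it was at the matching save. */
static void toy_irq_restore(unsigned long flags)
{
	toy_irqs_enabled = flags != 0;
}
```

Because each restore uses the value returned by its matching save, a nested
save/restore pair leaves interrupts disabled until the outermost restore runs.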

Thanks,
Andrew

>
>>
>>   ctrl = sdhci_readw(host, SDHCI_HOST_CONTROL2);
>>
>> @@ -1882,14 +1883,14 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>   requires_tuning_nonuhs)
>>   ctrl |= SDHCI_CTRL_EXEC_TUNING;
>>   else {
>> - spin_unlock(&host->lock);
>> + spin_unlock_irqrestore(&host->lock, flags);
>>   enable_irq(host->irq);
>>   sdhci_runtime_pm_put(host);
>>   return 0;
>>   }
>>
>>   if (host->ops->platform_execute_tuning) {
>> - spin_unlock(&host->lock);
>> + spin_unlock_irqrestore(&host->lock, flags);
>>   enable_irq(host->irq);
>>   err = host->ops->platform_execute_tuning(host, opcode);
>>   sdhci_runtime_pm_put(host);
>> @@ -1963,7 +1964,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>   host->cmd = NULL;
>>   host->mrq = NULL;
>>
>> - spin_unlock(&host->lock);
>> + spin_unlock_irqrestore(&host->lock, flags);
>>   enable_irq(host->irq);
>>
>>   /* Wait for Buffer Read Ready interrupt */
>> @@ -1971,7 +1972,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, u32 opcode)
>>   (host->tuning_done == 1),
>>   msecs_to_jiffies(50));
>>   disable_irq(host->irq);
>> - spin_lock(&host->lock);
>> + spin_lock_irqsave(&host->lock, flags);
>>
>>   if (!host->tuning_done) {
>>   pr_info(DRIVER_NAME ": Timeout waiting for "
>> @@ -2046,7 +2047,7 @@ out:
>>   err = 0;
>>
>>   sdhci_clear_set_irqs(host, SDHCI_INT_DATA_AVAIL, ier);
>> - spin_unlock(&host->lock);
>> + spin_unlock_irqrestore(&host->lock, flags);
>>   enable_irq(host->irq);
>>   sdhci_runtime_pm_put(host);
>>
>> --
>> 1.8.5.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[PATCH v3 0/2] Qualcomm Universal Peripheral (QUP) I2C controller

2014-01-17 Thread Bjorn Andersson

Continuing on Ivan's i2c-qup series.

Changes from v2:
 - Removed unused variables and includes
 - Corrected read logic in irq handler
 - Made the polling loop in qup_i2c_poll_state() less arbitrary
 - Only building suspend/resume if CONFIG_PM_SLEEP

Changes from v1:
 - Cleaned up device tree binding example.
 - Rephrased device tree bindings.
 - Following changes in the i2c framework.
 - Use the core clock to calculate divider for the bus clock, instead of
   explicitly setting it.
 - Remove explicit pinctrl setting.
 - Split/renamed qup_i2c_enable(bool) into enable/disable functions.
 - Return value was overwritten on error in write_one/read_one.
 - Initialize the i2c core every time, so that we actually can execute
   more than 1 transmission per xfer.

Ivan T. Ivanov (2):
  i2c: qup: Add device tree bindings information
  i2c: New bus driver for the QUP I2C controller

 .../devicetree/bindings/i2c/qcom,i2c-qup.txt   |  41 +
 drivers/i2c/busses/Kconfig |  10 +
 drivers/i2c/busses/Makefile|   1 +
 drivers/i2c/busses/i2c-qup.c   | 894 +
 4 files changed, 946 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/i2c/qcom,i2c-qup.txt
 create mode 100644 drivers/i2c/busses/i2c-qup.c

-- 
1.8.2.2


[PATCH v3 2/2] i2c: New bus driver for the QUP I2C controller

2014-01-17 Thread Bjorn Andersson

From: "Ivan T. Ivanov" 

This bus driver supports the QUP i2c hardware controller in the Qualcomm
MSM SOCs.  The Qualcomm Universal Peripheral Engine (QUP) is a general
purpose data path engine with input/output FIFOs and an embedded i2c
mini-core. The driver supports FIFO mode (for low bandwidth applications)
and block mode (interrupt generated for each block-size data transfer).
The driver currently does not support DMA transfers.

Shamelessly based on codeaurora version of the driver.

Signed-off-by: Ivan T. Ivanov 
[bjorn: updated to reflect i2c framework changes
split up qup_i2c_enable() into enable/disable
don't overwrite ret value on error in xfer functions
initialize core for each transfer
remove explicit pinctrl selection
use existing clock instead of setting new core clock]
Signed-off-by: Bjorn Andersson 
---
 drivers/i2c/busses/Kconfig   |  10 +
 drivers/i2c/busses/Makefile  |   1 +
 drivers/i2c/busses/i2c-qup.c | 894 +++
 3 files changed, 905 insertions(+)
 create mode 100644 drivers/i2c/busses/i2c-qup.c

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 3b26129..4eaded0 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -648,6 +648,16 @@ config I2C_PXA_SLAVE
  is necessary for systems where the PXA may be a target on the
  I2C bus.
 
+config I2C_QUP
+   tristate "Qualcomm QUP based I2C controller"
+   depends on ARCH_MSM
+   help
+ If you say yes to this option, support will be included for the
+ built-in I2C interface on the MSM family processors.
+
+ This driver can also be built as a module.  If so, the module
+ will be called i2c-qup.
+
 config HAVE_S3C2410_I2C
bool
help
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index c73eb0e..c93f593 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_I2C_PNX) += i2c-pnx.o
 obj-$(CONFIG_I2C_PUV3) += i2c-puv3.o
 obj-$(CONFIG_I2C_PXA)  += i2c-pxa.o
 obj-$(CONFIG_I2C_PXA_PCI)  += i2c-pxa-pci.o
+obj-$(CONFIG_I2C_QUP)  += i2c-qup.o
 obj-$(CONFIG_I2C_S3C2410)  += i2c-s3c2410.o
 obj-$(CONFIG_I2C_S6000)    += i2c-s6000.o
 obj-$(CONFIG_I2C_SH7760)   += i2c-sh7760.o
diff --git a/drivers/i2c/busses/i2c-qup.c b/drivers/i2c/busses/i2c-qup.c
new file mode 100644
index 000..2e0020e
--- /dev/null
+++ b/drivers/i2c/busses/i2c-qup.c
@@ -0,0 +1,894 @@
+/* Copyright (c) 2009-2013, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* QUP Registers */
+#define QUP_CONFIG             0x000
+#define QUP_STATE              0x004
+#define QUP_IO_MODE            0x008
+#define QUP_SW_RESET           0x00c
+#define QUP_OPERATIONAL        0x018
+#define QUP_ERROR_FLAGS        0x01c
+#define QUP_ERROR_FLAGS_EN     0x020
+#define QUP_HW_VERSION         0x030
+#define QUP_MX_OUTPUT_CNT      0x100
+#define QUP_OUT_FIFO_BASE      0x110
+#define QUP_MX_WRITE_CNT       0x150
+#define QUP_MX_INPUT_CNT       0x200
+#define QUP_MX_READ_CNT        0x208
+#define QUP_IN_FIFO_BASE       0x218
+#define QUP_I2C_CLK_CTL        0x400
+#define QUP_I2C_STATUS         0x404
+
+/* QUP States and reset values */
+#define QUP_RESET_STATE        0
+#define QUP_RUN_STATE          1
+#define QUP_PAUSE_STATE        3
+#define QUP_STATE_MASK         3
+
+#define QUP_STATE_VALID        BIT(2)
+#define QUP_I2C_MAST_GEN       BIT(4)
+
+#define QUP_OPERATIONAL_RESET  0x000ff0
+#define QUP_I2C_STATUS_RESET   0xfc
+
+/* QUP OPERATIONAL FLAGS */
+#define QUP_OUT_SVC_FLAG       BIT(8)
+#define QUP_IN_SVC_FLAG        BIT(9)
+#define QUP_MX_INPUT_DONE      BIT(11)
+
+/* I2C mini core related values */
+#define I2C_MINI_CORE          (2 << 8)
+#define I2C_N_VAL              15
+/* Most significant word offset in FIFO port */
+#define QUP_MSW_SHIFT          (I2C_N_VAL + 1)
+#define QUP_CLOCK_AUTO_GATE    BIT(13)
+
+/* Packing/Unpacking words in FIFOs, and IO modes */
+#define QUP_UNPACK_EN          BIT(14)
+#define QUP_PACK_EN            BIT(15)
+#define QUP_OUTPUT_BLK_MODE    BIT(10)
+#define QUP_INPUT_BLK_MODE     BIT(12)
+
+#define QUP_REPACK_EN          (QUP_UNPACK_EN | QUP_PACK_EN)
+
+#define QUP_OUTPUT_BLOCK_SIZE(x)(((x) & (0x03 <<
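
The state-related defines above suggest how a raw QUP_STATE value might be
decoded. The following userspace sketch is an assumption based on the macro
names alone, since the driver code that reads the register is cut off in this
excerpt: the toy_* helpers are hypothetical, and BIT() is redefined locally
because the kernel header is not available outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)			(1u << (n))

/* Values copied from the defines in the patch. */
#define QUP_RESET_STATE		0
#define QUP_RUN_STATE		1
#define QUP_PAUSE_STATE		3
#define QUP_STATE_MASK		3
#define QUP_STATE_VALID		BIT(2)

/* True when the VALID bit is set in a raw state register value. */
static bool toy_state_valid(uint32_t reg)
{
	return (reg & QUP_STATE_VALID) != 0;
}

/* Extract the two-bit state field from a raw state register value. */
static uint32_t toy_state(uint32_t reg)
{
	return reg & QUP_STATE_MASK;
}
```

Under these assumptions a raw value of 0x5 decodes as a valid RUN state, while
0x3 decodes as PAUSE with the valid bit clear.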

[PATCH v3 1/2] i2c: qup: Add device tree bindings information

2014-01-17 Thread Bjorn Andersson

From: "Ivan T. Ivanov" 

The Qualcomm Universal Peripheral (QUP) wraps the I2C mini-core and
provides input and output FIFOs for it. The I2C controller can operate
as master with supported bus speeds of 100 kbps and 400 kbps.

Signed-off-by: Ivan T. Ivanov 
[bjorn: reformulated part of binding description and cleaned up example]
Signed-off-by: Bjorn Andersson 
---
 .../devicetree/bindings/i2c/qcom,i2c-qup.txt   | 41 ++
 1 file changed, 41 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/i2c/qcom,i2c-qup.txt

diff --git a/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.txt b/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.txt
new file mode 100644
index 000..a99711b
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/qcom,i2c-qup.txt
@@ -0,0 +1,41 @@
+Qualcomm Universal Peripheral (QUP) I2C controller
+
+Required properties:
+ - compatible: Should be "qcom,i2c-qup".
+ - reg: Should contain QUP register address and length.
+ - interrupts: Should contain I2C interrupt.
+
+ - clocks: Should contain the core clock and the AHB clock.
+ - clock-names: Should be "core" for the core clock and "iface" for the
+AHB clock.
+
+ - #address-cells: Should be <1>, the number of cells in an i2c device address
+ - #size-cells: Should be <0> as i2c addresses have no size component
+
+Optional properties:
+ - clock-frequency: Should specify the desired i2c bus clock frequency in Hz,
+default is 100kHz if omitted.
+
+Child nodes should conform to i2c bus binding.
+
+Example:
+
+ i2c2: i2c@f9924000 {
+   compatible = "qcom,i2c-qup";
+   reg = <0xf9924000 0x1000>;
+   interrupts = <0 96 0>;
+
+   clocks = <_blsp1_qup2_i2c_apps_clk>, <_blsp1_ahb_clk>;
+   clock-names = "core", "iface";
+
+   clock-frequency = <355000>;
+
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   dummy@60 {
+   compatible = "dummy";
+   reg = <0x60>;
+   };
+ };
+
-- 
1.8.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

MAINTAINERS tree branches [xen tip as an example]

2014-01-17 Thread Luis R. Rodriguez

As per linux-next Next/Trees [0], and a recent January MAINTAINERS patch [1]
from David, one of the xen development kernel git trees to track is
xen/tip.git [2]. This tree, however, has undefined references when doing a
fresh clone [shown below], but as expected works well when cloning only
the linux-next branch [also below]. While I'm sure this is fine for
folks who can do the guesswork, do we really want to live with trees like
these in MAINTAINERS? The MAINTAINERS file doesn't let us specify required
branches, so perhaps it should, if we want to live with these? Curious, how
many other git trees are in a similar situation?

The xen project web site actually lists [3] Konrad's xen git tree [4] as
the primary development tree; that should probably be updated now,
likely with instructions to clone only the linux-next branch?

[0] 
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Next/Trees#n176
[1] http://lists.xen.org/archives/html/xen-devel/2014-01/msg01504.html
[2] git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git
[3] http://wiki.xenproject.org/wiki/Xen_Repositories#Primary_Xen_Repository
[4] git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git

mcgrof@bubbles ~ $ git clone
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git --reference linux/.git
Cloning into 'tip'...
remote: Counting objects: 2806, done.
remote: Compressing objects: 100% (334/334), done.
remote: Total 1797 (delta 1511), reused 1646 (delta 1462)
Receiving objects: 100% (1797/1797), 711.01 KiB | 640.00 KiB/s, done.
Resolving deltas: 100% (1511/1511), completed with 306 local objects.
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.

mcgrof@work ~ $ git clone
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git -b linux-next 
--reference linux/.git
Cloning into 'tip'...
remote: Counting objects: 2806, done.
remote: Compressing objects: 100% (377/377), done.
remote: Total 1797 (delta 1545), reused 1607 (delta 1419)
Receiving objects: 100% (1797/1797), 485.23 KiB | 0 bytes/s, done.
Resolving deltas: 100% (1545/1545), completed with 327 local objects.
Checking connectivity... done.
Checking out files: 100% (44979/44979), done.

  Luis



Re: [PATCH] mmc: sdhci: fix possible scheduling while atomic

2014-01-17 Thread Philip Rakity


On Jan 17, 2014, at 7:57 PM, Andrew Bresticker  wrote:

> sdhci_execute_tuning() takes host->lock without disabling interrupts.
> Use spin_lock_irq{save,restore} instead so that we avoid taking an
> interrupt and scheduling while holding host->lock.
> 
> Signed-off-by: Andrew Bresticker 
> ---
> drivers/mmc/host/sdhci.c | 13 +++--
> 1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index ec3eb30..84c80e7 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -1857,12 +1857,13 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, 
> u32 opcode)
>   unsigned long timeout;
>   int err = 0;
>   bool requires_tuning_nonuhs = false;
> + unsigned long flags;
> 
>   host = mmc_priv(mmc);
> 
>   sdhci_runtime_pm_get(host);
>   disable_irq(host->irq);
> - spin_lock(&host->lock);
> + spin_lock_irqsave(&host->lock, flags);


The disable_irq() call stops the controller from raising interrupts.
Please explain what problem you are seeing.

> 
>   ctrl = sdhci_readw(host, SDHCI_HOST_CONTROL2);
> 
> @@ -1882,14 +1883,14 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, 
> u32 opcode)
>   requires_tuning_nonuhs)
>   ctrl |= SDHCI_CTRL_EXEC_TUNING;
>   else {
> - spin_unlock(&host->lock);
> + spin_unlock_irqrestore(&host->lock, flags);
>   enable_irq(host->irq);
>   sdhci_runtime_pm_put(host);
>   return 0;
>   }
> 
>   if (host->ops->platform_execute_tuning) {
> - spin_unlock(&host->lock);
> + spin_unlock_irqrestore(&host->lock, flags);
>   enable_irq(host->irq);
>   err = host->ops->platform_execute_tuning(host, opcode);
>   sdhci_runtime_pm_put(host);
> @@ -1963,7 +1964,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, 
> u32 opcode)
>   host->cmd = NULL;
>   host->mrq = NULL;
> 
> - spin_unlock(&host->lock);
> + spin_unlock_irqrestore(&host->lock, flags);
>   enable_irq(host->irq);
> 
>   /* Wait for Buffer Read Ready interrupt */
> @@ -1971,7 +1972,7 @@ static int sdhci_execute_tuning(struct mmc_host *mmc, 
> u32 opcode)
>   (host->tuning_done == 1),
>   msecs_to_jiffies(50));
>   disable_irq(host->irq);
> - spin_lock(&host->lock);
> + spin_lock_irqsave(&host->lock, flags);
> 
>   if (!host->tuning_done) {
>   pr_info(DRIVER_NAME ": Timeout waiting for "
> @@ -2046,7 +2047,7 @@ out:
>   err = 0;
> 
>   sdhci_clear_set_irqs(host, SDHCI_INT_DATA_AVAIL, ier);
> - spin_unlock(&host->lock);
> + spin_unlock_irqrestore(&host->lock, flags);
>   enable_irq(host->irq);
>   sdhci_runtime_pm_put(host);
> 
> -- 
> 1.8.5.2
> 

[PATCH V3] net/dt: Add support for overriding phy configuration from device tree

2014-01-17 Thread Matthew Garrett

Some hardware may be broken in interesting and board-specific ways, such
that various bits of functionality don't work. This patch provides a
mechanism for overriding mii registers during init based on the contents of
the device tree data, allowing board-specific fixups without having to
pollute generic code.

Signed-off-by: Matthew Garrett 
---

V3: Break the main function out into some helper functions and store the
values in some structs.
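
For readers skimming the thread, the per-register override boils down to a
masked read-modify-write. The following is only a userspace sketch of that
step, not the kernel code: apply_override() is a hypothetical name, and the
register value is passed in directly instead of going through phy_read()
and phy_write().

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the update step in the patch:
 * clear the bits selected by mask, then OR in the requested value.
 * A zero mask means "no override for this register".
 */
uint16_t apply_override(uint16_t regval, uint16_t val, uint16_t mask)
{
	if (!mask)
		return regval;

	regval = (uint16_t)(regval & ~mask);	/* drop the overridden bits */
	regval = (uint16_t)(regval | val);	/* set the requested bits */
	return regval;
}
```

With an advertisement register reading 0x01e1, a mask of 0x0100 and a value
of 0 clears just that one bit and leaves the other hardware defaults alone.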

 Documentation/devicetree/bindings/net/phy.txt | 21 +++
 drivers/net/phy/phy_device.c  | 29 -
 drivers/of/of_net.c   | 87 +++
 include/linux/of_net.h| 12 
 4 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/phy.txt 
b/Documentation/devicetree/bindings/net/phy.txt
index 7cd18fb..fc50f02 100644
--- a/Documentation/devicetree/bindings/net/phy.txt
+++ b/Documentation/devicetree/bindings/net/phy.txt
@@ -23,6 +23,21 @@ Optional Properties:
   assume clause 22. The compatible list may also contain other
   elements.
 
+The following properties may be added to either the phy node or the parent
+ethernet device. If not present, the hardware defaults will be used.
+
+- phy-mii-advertise-10half: 1 to advertise half-duplex 10MBit, 0 to disable
+- phy-mii-advertise-10full: 1 to advertise full-duplex 10MBit, 0 to disable
+- phy-mii-advertise-100half: 1 to advertise half-duplex 100MBit, 0 to disable
+- phy-mii-advertise-100full: 1 to advertise full-duplex 100MBit, 0 to disable
+- phy-mii-advertise-100base4: 1 to advertise 100base4, 0 to disable
+- phy-mii-advertise-1000half: 1 to advertise half-duplex 1000MBit, 0 to disable
+- phy-mii-advertise-1000full: 1 to advertise full-duplex 1000MBit, 0 to disable
+- phy-mii-manual-master: 1 to enable manual master/slave configuration, 0
+  to disable manual master/slave configuration
+- phy-mii-as-master: 1 to configure phy to act as master, 0 to configure phy
+  to act as slave. Ignored if manual master/slave configuration is not enabled.
+
 Example:
 
 ethernet-phy@0 {
@@ -32,4 +47,10 @@ ethernet-phy@0 {
interrupts = <35 1>;
reg = <0>;
device_type = "ethernet-phy";
+   // Disable advertising of full duplex 1000MBit
+   phy-mii-advertise-1000full = <0>;
+   // Force manual phy master/slave configuration
+   phy-mii-manual-master = <1>;
+   // Forcibly configure phy as slave
+   phy-mii-as-master = <0>;
 };
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index d6447b3..91793bc 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -497,6 +498,28 @@ void phy_disconnect(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_disconnect);
 
+int phy_override_from_of(struct phy_device *phydev)
+{
+   int reg, regval;
+   u16 val, mask;
+
+   /* Check for phy register overrides from OF */
+   for (reg = 0; reg < 16; reg++) {
+   if (!of_get_mii_register(phydev, reg, &val, &mask)) {
+   if (!mask)
+   continue;
+   regval = phy_read(phydev, reg);
+   if (regval < 0)
+   continue;
+   regval &= ~mask;
+   regval |= val;
+   phy_write(phydev, reg, regval);
+   }
+   }
+
+   return 0;
+}
+
 int phy_init_hw(struct phy_device *phydev)
 {
int ret;
@@ -508,7 +531,11 @@ int phy_init_hw(struct phy_device *phydev)
if (ret < 0)
return ret;
 
-   return phydev->drv->config_init(phydev);
+   ret = phydev->drv->config_init(phydev);
+   if (ret < 0)
+   return ret;
+
+   return phy_override_from_of(phydev);
 }
 
 /**
diff --git a/drivers/of/of_net.c b/drivers/of/of_net.c
index 8f9be2e..75751b7 100644
--- a/drivers/of/of_net.c
+++ b/drivers/of/of_net.c
@@ -93,3 +93,90 @@ const void *of_get_mac_address(struct device_node *np)
return NULL;
 }
 EXPORT_SYMBOL(of_get_mac_address);
+
+struct mii_override {
+   char *prop;
+   u32 supported;
+   u16 value;
+};
+
+static struct mii_override mii_advertise_override[] = {
+   { "phy-mii-advertise-10half", SUPPORTED_10baseT_Half,
+ ADVERTISE_10HALF },
+   { "phy-mii-advertise-10full", SUPPORTED_10baseT_Full,
+ ADVERTISE_10FULL },
+   { "phy-mii-advertise-100half", SUPPORTED_100baseT_Half,
+ ADVERTISE_100HALF },
+   { "phy-mii-advertise-100full", SUPPORTED_100baseT_Full,
+ ADVERTISE_100FULL },
+   { "phy-mii-advertise-100base4", 0, ADVERTISE_100BASE4 },
+   { NULL },
+};
+
+static struct mii_override mii_gigabit_override[] = {
+   { "phy-mii-advertise-1000half", SUPPORTED_1000baseT_Half,
+ ADVERTISE_1000HALF },
+   {

Re: [RFC PATCHv2 2/2] Change khugepaged to respect MMF_THP_DISABLE flag

2014-01-17 Thread Alex Thorlton

On Fri, Jan 17, 2014 at 09:34:44PM +0100, Oleg Nesterov wrote:
> On 01/16, Alex Thorlton wrote:
> >
> >  static inline int khugepaged_test_exit(struct mm_struct *mm)
> >  {
> > -   return atomic_read(&mm->mm_users) == 0;
> > +   return atomic_read(&mm->mm_users) == 0 ||
> > +  (mm->flags & MMF_THP_DISABLE_MASK);
>^^^
> 
> test_bit(MMF_THP_DISABLE) ?

Probably should just use the bitop here, good call.

> And I am not sure this and another check in transparent_hugepage_enabled
> is actually right...
> 
> I think that MMF_THP_DISABLE_MASK should not disable thp if this
> vma has VM_HUGEPAGE set, iow perhaps madvise() should work even
> after PR_SET_THP_DISABLE?
> 
> IOW, MMF_THP_DISABLE should act as khugepaged_req_madv().

I hadn't thought of this, but maybe that's a good idea.  That way we can
turn off THP in general for an mm, but the places in code that
*specifically* request THP will still get it.  I don't see why that
would be a problem, as long as we go with the assumption that, if
somebody is explicitly requesting THPs, they probably have a good reason
for doing so.
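
To make the policy being discussed concrete, here is a small userspace
sketch of the decision. It is an illustration under assumptions, not kernel
code: the bit values and the name thp_allowed() are placeholders chosen for
this example.

```c
#include <stdbool.h>

/* Placeholder bit values, for illustration only. */
#define SKETCH_MMF_THP_DISABLE	(1UL << 0)	/* per-mm "no THP" flag */
#define SKETCH_VM_HUGEPAGE	(1UL << 1)	/* vma explicitly asked for THP */

/*
 * An explicit per-vma request (madvise) wins over the per-mm disable
 * flag; without such a request the per-mm flag decides.
 */
bool thp_allowed(unsigned long mm_flags, unsigned long vma_flags)
{
	if (vma_flags & SKETCH_VM_HUGEPAGE)
		return true;
	return !(mm_flags & SKETCH_MMF_THP_DISABLE);
}
```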

Thanks for the input!

- Alex

Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread Borislav Petkov

On Fri, Jan 17, 2014 at 02:51:52PM -0800, H. Peter Anvin wrote:
> Printing a warning is appropriate if we can't actually fix the problem
> in the OS. If we actually make the problem go away then we have just
> done our job and we can be done with it.

And we do make it go away so no warning and a gold star for us!

:-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [RFC PATCHv2 1/2] Add mm flag to control THP

2014-01-17 Thread Alex Thorlton

On Fri, Jan 17, 2014 at 11:54:09AM -0800, Andy Lutomirski wrote:
> Usual nit: please return -EINVAL if unused args are nonzero.  This
> makes it possible to extend these APIs in the future.

Got it.  Thanks, Andy!
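
For reference, the check Andy asks for is short. A minimal sketch, assuming
the option only consumes arg2; check_unused_args() is a made-up helper name
and not part of the patch:

```c
#include <errno.h>

/*
 * Reject calls that pass anything in the arguments the option does not
 * use yet, so those arguments can be given a meaning later without
 * breaking existing callers.
 */
long check_unused_args(unsigned long arg3, unsigned long arg4,
		       unsigned long arg5)
{
	if (arg3 || arg4 || arg5)
		return -EINVAL;
	return 0;
}
```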

- Alex

Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread H. Peter Anvin

On 01/17/2014 02:50 PM, Borislav Petkov wrote:
> On Fri, Jan 17, 2014 at 11:28:06PM +0100, Pavel Machek wrote:
>> Would it make sense to printk() a warning?
> 
> No because people come and start bitching about their dmesg containing
> a warning and whether their hardware is b0rked without even reading the
> actual words.
> 

Printing a warning is appropriate if we can't actually fix the problem
in the OS.  If we actually make the problem go away then we have just
done our job and we can be done with it.

-hpa


Re: [RFC PATCHv2 1/2] Add mm flag to control THP

2014-01-17 Thread Alex Thorlton

On Fri, Jan 17, 2014 at 09:09:23PM +0100, Oleg Nesterov wrote:
> On 01/16, Alex Thorlton wrote:
> >
> > +   case PR_GET_THP_DISABLE:
> > +   error = put_user(test_bit(MMF_THP_DISABLE,
> > +>mm->flags),
> > +(int __user *) arg2);
> > +   break;
> 
> Do we really want put_user?
> 
>   error = test_bit(MMF_THP_DISABLE);
> 
> can work too and looks simpler (and to me simpler to use in userspace).

That makes sense.  I'll change that; definitely easier to use from
userspace that way.
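
Roughly, the simpler variant returns the flag state as the result of the
call instead of writing it through a user pointer. A userspace sketch, with
a placeholder bit number and a hypothetical function name:

```c
/* Placeholder bit number, for illustration only. */
#define SKETCH_MMF_THP_DISABLE_BIT	0

/*
 * Return 1 if the flag is set and 0 if it is clear. Negative values
 * stay free for error codes.
 */
long sketch_get_thp_disable(unsigned long mm_flags)
{
	return (long)((mm_flags >> SKETCH_MMF_THP_DISABLE_BIT) & 1UL);
}
```

A caller would then test the return value of the PR_GET_THP_DISABLE call
directly rather than passing the address of an int.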

- Alex

Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread Borislav Petkov

On Fri, Jan 17, 2014 at 11:28:06PM +0100, Pavel Machek wrote:
> Would it make sense to printk() a warning?

No because people come and start bitching about their dmesg containing
a warning and whether their hardware is b0rked without even reading the
actual words.

> BIOS is clearly buggy in this case, and it may cause problems with
> another operating system, earlier kernel, or maybe early in boot
> before MSR is written...

Well, if the problem happens, the machine will lock. If it doesn't lock
=> no problem and we apply the workaround => no problem at all. :-)
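
The reason no warning is needed is that the fix is idempotent: setting the
bit again on a machine whose BIOS already set it changes nothing. A
userspace sketch of the logic from the patch under discussion, with the MSR
value passed in as a plain integer (the function names are made up for this
example):

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LS_CFG_BIT15	(UINT64_C(1) << 15)

/* The patch applies the workaround to family 16h, models 00h-0fh. */
bool needs_erratum_793(unsigned int family, unsigned int model)
{
	return family == 0x16 && model <= 0xf;
}

/* Value the MSR should hold afterwards; unchanged if the bit was set. */
uint64_t fix_ls_cfg(uint64_t val)
{
	return val | SKETCH_LS_CFG_BIT15;
}
```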

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH 0/11] use ether_addr_equal_64bits

2014-01-17 Thread Pavel Machek

On Fri 2014-01-17 23:02:06, Oleksij Rempel wrote:
> On 17.01.2014 22:24, Pavel Machek wrote:
> > On Mon 2013-12-30 19:14:56, Julia Lawall wrote:
> >> Ether_addr_equal_64bits is more efficient than ether_addr_equal, and can be
> >> used when each argument is an array within a structure that contains at
> >> least two bytes of data beyond the array.
> > 
> > I mean, yes, it is probably faster, and yes, most structures probably
> > contain two more bytes, but... is the ugliness worth the speedup? I'd
> > say this should not be done except in very time-critical places...
> 
> This code runs on every received beacon, in almost every wifi driver (if
> I understand what you mean).

That does not look like "sufficiently often" to me. Can you measure
the improvement at least in some microbenchmark? Is there even
theoretical chance to get one?

You are comparing a few bytes; the number of cacheline accesses stays the
same, so there is likely _0_ speedup. And even if you saved 1T, that will be
completely lost in the noise.

In some kind of routing code, cache-hot... maybe it would make
sense. But once per interrupt?
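
For anyone who wants to try the microbenchmark, the two approaches look
roughly like this in userspace. This is a sketch of the idea only, not the
kernel's implementation: the function names are made up, and the mask is
built with memcpy() so the code does not depend on byte order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Plain comparison of the 6 address bytes. */
bool sketch_addr_equal(const uint8_t a[6], const uint8_t b[6])
{
	return memcmp(a, b, 6) == 0;
}

/*
 * 64-bit variant: both buffers must have at least 8 readable bytes.
 * The two bytes past the address are loaded but masked out.
 */
bool sketch_addr_equal_64bits(const uint8_t a[8], const uint8_t b[8])
{
	static const uint8_t mask_bytes[8] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
	};
	uint64_t x, y, mask;

	memcpy(&x, a, 8);
	memcpy(&y, b, 8);
	memcpy(&mask, mask_bytes, 8);	/* same layout as the loads above */

	return ((x ^ y) & mask) == 0;
}
```

Both functions touch the same cacheline, which is the point made above: any
difference would have to come from a handful of instructions.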

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: More GPIO madness on iMX6 - and the crappy ARM port of Linux

2014-01-17 Thread Linus Walleij

On Fri, Jan 17, 2014 at 9:53 PM, Russell King - ARM Linux
 wrote:
> On Fri, Jan 17, 2014 at 01:42:44PM -0700, Stephen Warren wrote:

>> I believe you want gpio_get_value() to return either the driven or
>> actual pin value where it can on the current HW, but just e.g. hard-code
>> 0 on other HW. That would introduce a core feature that works some
>> places but not others, and hence make drivers that relied on the feature
>> less portable between HW with different actual features.
>
> I can buy that argument, but there's an issue which stands squarely in
> its way, and that is open-drain GPIOs.
>
> These are modelled just as any other GPIO, mainly so that both
> gpio_set_value(gpio, 1) and gpio_direction_input(gpio) both result in
> the signal being high.  The only combination which results in the
> signal being driven low is outputting zero - and the state of the signal
> can always be read back.
>
> The problem here is that such gpios are implemented in things like the
> I2C driver such that they're _always_ outputs, and gpio_set_value() is
> used to pull the signal down.  gpio_get_value() is used to read its
> current state.
>
> So, if we say that gpio_get_value() is undefined, we force such
> subsystems to always jump through the non-open-drain paths (using
> gpio_direction_input() to set the line high and
> gpio_direction_output(gpio, 0) to drive it low.)

Incidentally that is what gpiolib is doing internally in
gpiod_direction_output().

You're absolutely right that it makes no sense to have open
drain (or open source) unless the signal can be read back from
the hardware.

I'm thinking of something like this: if the driver manages to obtain a
GPIO with

gpio_request_one(gpio, GPIOF_OPEN_DRAIN |
 GPIOF_OUT_INIT_HIGH);

as the I2C core does, then once that call succeeds it can
expect that whatever comes back from gpio_get_value() is
always what is actually on the line. If the driver cannot determine
this it should not have allowed that flag to succeed in the first
place, so this might be something we want to enforce.
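
To illustrate why the read-back matters for open drain, here is a toy
userspace model of such a line. It is a sketch under assumptions (one peer
on the wire, a pull-up that makes the released line read high), and every
name in it is made up for this example:

```c
#include <stdbool.h>

/*
 * Toy model of an open-drain line with a pull-up: each side can only
 * pull the line low or release it.
 */
struct od_line {
	bool we_pull_low;	/* our output stage */
	bool peer_pulls_low;	/* e.g. an I2C slave holding the line */
};

/* Writing 0 pulls the line low; writing 1 merely releases it. */
struct od_line od_set_value(struct od_line line, int value)
{
	line.we_pull_low = (value == 0);
	return line;
}

/* The level on the wire: high only if nobody pulls it low. */
int od_get_value(struct od_line line)
{
	return !(line.we_pull_low || line.peer_pulls_low);
}
```

Writing 1 while the peer holds the line low still reads back 0, so a driver
that simply returned the last written value would report the wrong state.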

There are two white spots on the map here:

1. Today this OPEN_DRAIN flag is not even passed down to
the driver so how could it say anything about it :-( it's a pure gpiolib
internal flag. We don't know if the hardware can actually even
do open drain, we just assume it can.

What it should really do - in the best of worlds - is to check if
it can cross-reference the GPIO line to a pin in the pin control
subsystem, and if that is possible, then ask the pin if it
is supporting open drain and set it. It currently has no such
cross-calls, it is just assumed that the configuration is consistent,
and the actual pin is set up as open drain. But it would make
sense to add more cross-calls here, since GPIO is accepting
these flags (OPEN_DRAIN/OPEN_SOURCE).

Like:
int pinctrl_gpio_set_flags(unsigned gpio, unsigned long flags);

Where the pinctrl subsystem would attempt to cross reference
and set the flag, and the pin controller backend will then have
the option to return an error code.

We could at least support that for the select pin controllers
that use generic pin config. i.MX is another story, but I'm open
to compromises.

2. In the new descriptor API this open drain setting would
be set from the lookup table and be a property on the line,
meaning this flag is not requested explicitly by the consumer,
and the consumer needs to inspect the obtained descriptor
to figure out if it is set to open drain.

Alexandre: do you have plans for how to handle a dynamic
consumer passing flags to its gpio request in the gpiod API?
I noticed that API missing now, there is exactly one user in the
entire kernel, in drivers/i2c/i2c-core.c but a very important one.

I guess to switch the I2C core over to descriptors I could
think of an API like this:

int gpiod_get_flags(const struct gpio_desc *desc);

If the OPEN_DRAIN flag is set on that descriptor we should
always be able to read the input. But this is not really what the
I2C core wants to know (it would really prefer not to bother with
such GPIO flag details), so would it be better to add a special call to
figure out if the input can be read? Like:

bool gpiod_input_always_valid(const struct gpio_desc *desc);

And leave it up to the core to look at flags, driver characteristics
etc and determine whether the input can be trusted?

Yours,
Linus Walleij

[GIT PULL] Last minute ACPI fix for v3.13

2014-01-17 Thread Rafael J. Wysocki

Hi Linus,

Please pull from the git repository at

  git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git 
acpi-3.13-fixup

to receive a last-minute ACPI fix for 3.13 as commit 
b40f93bb008ed5d83aa608b21c8c65c5111512e1

  Revert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs"

on top of commit 7e22e91102c6b9df7c4ae2168910e19d2bb14cd6

  Linux 3.13-rc8

This reverts a commit that causes Alan Cox's ASUS T100TA to "crash and
burn" during boot if the Baytrail pinctrl driver is compiled in.

Thanks!


---

Rafael J. Wysocki (1):
  Revert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs"

---

 drivers/acpi/acpi_lpss.c   | 1 -
 drivers/pinctrl/pinctrl-baytrail.c | 1 -
 2 files changed, 2 deletions(-)


Re: [PATCH] x86, CPU, AMD: Add workaround for family 16h, erratum 793

2014-01-17 Thread Pavel Machek

On Tue 2014-01-14 17:27:23, Borislav Petkov wrote:
> From: Borislav Petkov 
> 
> This adds the workaround for erratum 793 as a precaution in case not
> every BIOS implements it. This addresses CVE-2013-6885.
> 
> Signed-off-by: Borislav Petkov 
> ---
>  arch/x86/kernel/cpu/amd.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 4a48e8bbd857..e4d6f8c91f51 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -507,6 +507,17 @@ static void early_init_amd(struct cpuinfo_x86 *c)
>   set_cpu_cap(c, X86_FEATURE_EXTD_APICID);
>   }
>  #endif
> +
> +#define MSR_AMD_LS_CFG   0xc0011020
> + /* F16h, models 00h-0fh, erratum 793 */
> + if (c->x86 == 0x16 && c->x86_model <= 0xf) {
> + u64 val;
> +
> + rdmsrl(MSR_AMD_LS_CFG, val);
> + if (!(val & BIT(15)))
> + wrmsrl(MSR_AMD_LS_CFG, val | BIT(15));

Would it make sense to printk() a warning?

BIOS is clearly buggy in this case, and it may cause problems with
another operating system, earlier kernel, or maybe early in boot
before MSR is written...
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH 1/1] iMX gpio: Allow reading back of pin status if configured as gpio output

2014-01-17 Thread Waibel Georg

> On Fri, Jan 17, 2014 at 14:11 +, Waibel Georg wrote:
> >
> > From cb384950a1153e856ec03109a5156e660a89bf6d Mon Sep 17 00:00:00 2001
> > From: Georg Waibel 
> > Date: Fri, 17 Jan 2014 14:51:38 +0100
> > Subject: [PATCH 1/1] iMX gpio: Allow reading back of pin status if 
> > configured
> >  as gpio output
> >
> > Register PSR was used to read the pin status in the mxc_gpio driver. This
> > register reflects the pin status if a pin is configured as gpio input.
> > If a pin is configured as a gpio output, register PSR is not updated and
> > returns 0 instead of the actual pin status. Thus attempting to read back the
> > status of a gpio output pin via PSR returns a wrong value.
> >
> > Reading register DR instead of PSR fixes this issue:
> > - If pin is gpio output: DR returns the value written to DR by software
> > - If pin is gpio input: DR returns the value of register PSR and thus the
> >   pin status
> 
> Got curious because of your specific wording.  In the output
> case, does the DR value reflect what you drive to the pin, or
> what the pin's status is?  Because this need not be identical in
> the open collector case (or other "weak" scenarios like bus
> keeper, and what else HW development came up with).
> 
> If DR always reflects the pin's status, then the patch would be
> OK but the commit message would need an update.  If DR does not
> appropriately reflect the pin's status, then the patch would be
> an improvement (would fix the totem-pole case), but still would
> be wrong or incomplete in the open-collector case.

Hi Gerhard,

actually I did not take the open-collector case into account. 
I checked the iMX6q reference manual and came to this conclusion:

In the output case, reading register DR returns the value I have written
to DR, not the actual pad state. Thus my patch only works for the
push-pull (totem-pole) output case, not for open-drain.

However, there is a solution to read back the pin status regardless of
its input or output configuration: the SION (Software Input On) bit
in the IOMUXC MUX control register. If SION is set, the pad status can be
read back through register PSR. That's exactly what I intended to do.
Bit SION can be set in the device tree (IOMUXC: CONFIG field in the
fsl,pins binding, bit 30).
I will verify this on my iMX6 hardware next week.
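
As I understand the register behaviour described in this thread, it can be
modelled like this. It is a userspace sketch based only on the description
above, not on the reference manual; the struct and function names are made
up, and the SION position is the bit 30 mentioned for the CONFIG field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 30 of the CONFIG field, as stated above. */
#define SKETCH_PAD_SION	(UINT32_C(1) << 30)

struct sketch_gpio {
	bool is_output;	/* pin direction */
	bool sion;	/* input path forced on */
	int dr;		/* last value written by software */
	int pad;	/* actual level on the pad */
};

/* DR: the written value for outputs, the pin status for inputs. */
int sketch_read_dr(struct sketch_gpio g)
{
	return g.is_output ? g.dr : g.pad;
}

/* PSR: follows the pad only for inputs, or for outputs with SION set. */
int sketch_read_psr(struct sketch_gpio g)
{
	if (!g.is_output || g.sion)
		return g.pad;
	return 0;	/* plain output: PSR is not updated */
}

/* Add SION to an existing pad configuration value. */
uint32_t sketch_config_with_sion(uint32_t config)
{
	return config | SKETCH_PAD_SION;
}
```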

Seems there is no need for my patch at all.

Thanks for your hint.
Regards
Georg


Re: [v3.11][v3.12][v3.13][Regression] EISA: Initialize device before its resources

2014-01-17 Thread Bjorn Helgaas

On Fri, Jan 17, 2014 at 02:26:23PM -0500, Joseph Salisbury wrote:
> On 01/17/2014 12:02 PM, Bjorn Helgaas wrote:
> > On Thu, Jan 16, 2014 at 01:14:01PM -0500, Joseph Salisbury wrote:
> >> On 01/16/2014 01:12 PM, Bjorn Helgaas wrote:
> >>> On Thu, Jan 16, 2014 at 10:53 AM, Joseph Salisbury
> >>>  wrote:
>  Hi Bjorn,
> 
>  A kernel bug was opened against Ubuntu [0].  After a kernel bisect, it
>  was found the following commit introduced this bug:
> >>> Sorry about that, and thanks for the report.  Did you mean to include
> >>> URL for the bug?
> >> Yes, sorry about that:
> >> http://pad.lv/1251816
> > Hi Joseph,
> >
> > Can you attach the 3.8.0-32-generic config (the one matching the successful
> > boot at https://launchpadlibrarian.net/156685076/BootDmesg.txt) to the bug?
> 
> I attached the config file to the bug:
> https://launchpadlibrarian.net/162754666/config.common.ubuntu
> 
> I also attached a tar file with the complete config directory for that
> kernel version.
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1251816/+attachment/3951156/+files/raring-config.tar

Thanks again.  I attached the following reverts to launchpad.  I screwed up
when doing those EISA changes.  I'd like to squeeze these into my v3.14
merge request (probably early next week), so please test and let me know
if this fixes the problem.  I'm really sorry for the inconvenience.

Bjorn


Revert "EISA: Log device resources in dmesg"

From: Bjorn Helgaas 

This reverts commit a2080d0c561c546d73cb8b296d4b7ca414e6860b.

Signed-off-by: Bjorn Helgaas 
---
 drivers/eisa/eisa-bus.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
index 8842cde69177..1b86fe0c2e80 100644
--- a/drivers/eisa/eisa-bus.c
+++ b/drivers/eisa/eisa-bus.c
@@ -288,7 +288,6 @@ static int __init eisa_request_resources(struct 
eisa_root_device *root,
edev->res[i].flags = IORESOURCE_IO | IORESOURCE_BUSY;
}
 
-   dev_printk(KERN_DEBUG, &edev->dev, "%pR\n", &edev->res[i]);
if (request_resource(root->res, &edev->res[i]))
goto failed;
}
Revert "EISA: Initialize device before its resources"

From: Bjorn Helgaas 

This reverts commit 26abfeed4341872364386c6a52b9acef8c81a81a.

In the eisa_probe() force_probe path, if we were unable to request slot
resources (e.g., [io 0x800-0x8ff]), we skipped the slot with "Cannot
allocate resource for EISA slot %d" before reading the EISA signature in
eisa_init_device().

Commit 26abfeed4341 moved eisa_init_device() earlier, so we tried to read
the EISA signature before requesting the slot resources, and this caused
hangs during boot.
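
The ordering this revert restores can be sketched as follows. This is a
simplified userspace model, not the driver code: the names are invented,
and the hardware access is reduced to a flag that records whether the
signature would have been read.

```c
#include <stdbool.h>

enum slot_result { SLOT_SKIPPED, SLOT_NO_DEVICE, SLOT_OK };

struct probe_outcome {
	enum slot_result result;
	bool touched_hw;	/* true if the signature was read */
};

/*
 * Claim the slot's resources first. Only when that succeeds is the
 * signature read; otherwise the slot is skipped without touching it.
 */
struct probe_outcome sketch_probe_slot(bool resources_ok, bool device_present)
{
	struct probe_outcome out = { SLOT_SKIPPED, false };

	if (!resources_ok)
		return out;

	out.touched_hw = true;
	out.result = device_present ? SLOT_OK : SLOT_NO_DEVICE;
	return out;
}
```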

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1251816
Signed-off-by: Bjorn Helgaas 
---
 drivers/eisa/eisa-bus.c |   26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/eisa/eisa-bus.c b/drivers/eisa/eisa-bus.c
index 1b86fe0c2e80..612afeaec3cb 100644
--- a/drivers/eisa/eisa-bus.c
+++ b/drivers/eisa/eisa-bus.c
@@ -277,11 +277,13 @@ static int __init eisa_request_resources(struct 
eisa_root_device *root,
}

if (slot) {
+   edev->res[i].name  = NULL;
edev->res[i].start = SLOT_ADDRESS(root, slot)
 + (i * 0x400);
edev->res[i].end   = edev->res[i].start + 0xff;
edev->res[i].flags = IORESOURCE_IO;
} else {
+   edev->res[i].name  = NULL;
edev->res[i].start = SLOT_ADDRESS(root, slot)
 + EISA_VENDOR_ID_OFFSET;
edev->res[i].end   = edev->res[i].start + 3;
@@ -327,19 +329,20 @@ static int __init eisa_probe(struct eisa_root_device 
*root)
return -ENOMEM;
}

-   if (eisa_init_device(root, edev, 0)) {
+   if (eisa_request_resources(root, edev, 0)) {
+   dev_warn(root->dev,
+"EISA: Cannot allocate resource for mainboard\n");
kfree(edev);
if (!root->force_probe)
-   return -ENODEV;
+   return -EBUSY;
goto force_probe;
}
 
-   if (eisa_request_resources(root, edev, 0)) {
-   dev_warn(root->dev,
-"EISA: Cannot allocate resource for mainboard\n");
+   if (eisa_init_device(root, edev, 0)) {
+   eisa_release_resources(edev);
kfree(edev);
if (!root->force_probe)
-   return -EBUSY;
+   return -ENODEV;
goto force_probe;
}
 
@@ -362,11 +365,6 @@ static int __init eisa_probe(struct eisa_root_device *root)
continue;
}
 
-   if

Re: [PATCHv3 1/4] DT: Add vendor prefix for Emerging Display Technologies

2014-01-17 Thread Rob Herring

On Fri, Jan 17, 2014 at 6:28 AM, Lothar Waßmann  
wrote:
>
> Signed-off-by: Lothar Waßmann 
> ---
>  .../devicetree/bindings/vendor-prefixes.txt|1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)

Applied.

Rob

>
> diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt 
> b/Documentation/devicetree/bindings/vendor-prefixes.txt
> index b458760..49774c3 100644
> --- a/Documentation/devicetree/bindings/vendor-prefixes.txt
> +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
> @@ -29,6 +29,7 @@ dallasMaxim Integrated Products (formerly Dallas 
> Semiconductor)
>  davicomDAVICOM Semiconductor, Inc.
>  denx   Denx Software Engineering
>  dmoData Modul AG
> +edtEmerging Display Technologies
>  emmicroEM Microelectronic
>  epson  Seiko Epson Corp.
>  estESTeem Wireless Modems
> --
> 1.7.2.5
>

Re: [PATCH 11/11] ext4: add cross rename support

2014-01-17 Thread J. Bruce Fields

On Fri, Jan 17, 2014 at 11:53:07PM +1300, Michael Kerrisk (man-pages) wrote:
> Hi Miklos,
> 
> A few comments below, including one piece in the code that really must be 
> fixed.
> 
> On 01/16/2014 11:54 PM, Miklos Szeredi wrote:
> >> On Wed, Jan 15, 2014 at 7:23 PM, J. Bruce Fields  
> >> wrote:
> >>> Do you have a man page update somewhere for the two new flags?
> > 
> > Here's the updated man page (and attached the patch)
> > 
> > Michael, could you please review the interface?
> > 
> > I forgot to CC you when posting the patch series.  I can resend it if you 
> > want,
> > or you can fetch the latest version of the cross-rename series from:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git 
> > cross-rename
> 
> [...]
> 
> >renameat2()  has an additional flags argument.  renameat2() call 
> > with a
> >zero flags argument is equivalent to renameat().
> > 
> >The flags argument is a bitfield consisting of zero or more of the 
> > fol-
> >lowing constants defined in :
> > 
> >RENAME_NOREPLACE
> >   Don't  overwrite  the  target of the rename.  Return an error 
> > if
> >   the target would be overwritten.
> > 
> >RENAME_EXCHANGE
> >   Atomically exchange the source and destination.  Both must 
> > exist
> >   but  may  be of a different type (e.g. one a non-empty 
> > directory
> >   and the other a symbolic link).
> 
> Somewhere here it would be good to explain the consequences if
> 
>(flags & (RENAME_NOREPLACE | RENAME_EXCHANGE)) == 
>(RENAME_NOREPLACE | RENAME_EXCHANGE)
> 
> Okay -- it's EINVAL, but here the man page text should say something like
> "these two flags can't be specified together", right?
> 
> > RETURN VALUE
> >On success, renameat() and renameat2()  return  0.   On  error,  -1  
> > is
> >returned and errno is set to indicate the error.
> > 
> > ERRORS
> >The  same errors that occur for rename(2) can also occur for 
> > renameat()
> >and  renameat2().   The  following  additional  errors  can  occur  
> > for
> >renameat() and renameat2():
> > 
> >EBADF  olddirfd or newdirfd is not a valid file descriptor.
> > 
> >ENOTDIR
> >   oldpath  is relative and olddirfd is a file descriptor 
> > referring
> >   to a file other than a directory; or  similar  for  newpath  
> > and
> >   newdirfd
> > 
> >The following additional errors are defined for renameat2():
> > 
> >EOPNOTSUPP
> >   The filesystem does not support a flag in flags
> 
> This is not the usual error for an invalid bit flag. Please make it EINVAL.

I agree that EINVAL makes sense for an invalid bit flag.
 
But renameat2() can also fail when the caller passes a perfectly valid
flags field but the paths resolve to a filesystem that doesn't support
the RENAME_EXCHANGE operation.  EOPNOTSUPP looks more appropriate in
that case.
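
The distinction argued above can be sketched in C as follows. This is only an
illustration under stated assumptions, not code from the cross-rename series:
the SKETCH_* flag values, the function name, and the idea of a per-filesystem
"supported flags" mask are all hypothetical.

```c
#include <errno.h>

/* Flag values are assumptions for this sketch, not the kernel's definitions. */
#define SKETCH_RENAME_NOREPLACE (1u << 0)
#define SKETCH_RENAME_EXCHANGE  (1u << 1)
#define SKETCH_RENAME_ALL \
	(SKETCH_RENAME_NOREPLACE | SKETCH_RENAME_EXCHANGE)

/*
 * Hypothetical helper: returns 0 or a negative errno.
 * fs_supported is an assumed mask of the flags the target filesystem
 * implements.
 */
static int sketch_check_rename_flags(unsigned int flags,
				     unsigned int fs_supported)
{
	/* Unknown bits: the flags word itself is invalid. */
	if (flags & ~SKETCH_RENAME_ALL)
		return -EINVAL;

	/* Both flags together: an invalid combination. */
	if ((flags & SKETCH_RENAME_ALL) == SKETCH_RENAME_ALL)
		return -EINVAL;

	/* Valid flags, but this filesystem cannot honour them. */
	if (flags & ~fs_supported)
		return -EOPNOTSUPP;

	return 0;
}
```

With these assumed values, a lone SKETCH_RENAME_EXCHANGE on a filesystem that
only supports SKETCH_RENAME_NOREPLACE yields -EOPNOTSUPP, while an undefined
bit yields -EINVAL regardless of the filesystem.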

> (See the man pages for the *at() calls that have a 'flags" argument.)

Aren't those flags mostly handled in the vfs?  In which case they work
everywhere, so there isn't the same distinction between "flag is
defined" and "behavior requested by flag is unsupported for the given
objects".

--b.

> >EINVAL Invalid combination of flags
> 
> (This is okay.)
> 
> Looks otherwise okay to me (and I agree with Bruce's comments).
> 
> Cheers,
> 
> Michael
> 
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/4] clk: sunxi: Add support for PLL6 on the A31

2014-01-17 Thread Mike Turquette

Quoting Maxime Ripard (2014-01-16 09:11:22)
> The A31 has a slightly different PLL6 clock. Add support for this new clock in
> our driver.
> 
> Signed-off-by: Maxime Ripard 

This looks good to me. I guess it will be going in for 3.15 based on the
comments in the coverletter.

Regards,
Mike

> ---
>  Documentation/devicetree/bindings/clock/sunxi.txt |  1 +
>  drivers/clk/sunxi/clk-sunxi.c | 45 
> +++
>  2 files changed, 46 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/clock/sunxi.txt 
> b/Documentation/devicetree/bindings/clock/sunxi.txt
> index c2cb762..954845c 100644
> --- a/Documentation/devicetree/bindings/clock/sunxi.txt
> +++ b/Documentation/devicetree/bindings/clock/sunxi.txt
> @@ -11,6 +11,7 @@ Required properties:
> "allwinner,sun6i-a31-pll1-clk" - for the main PLL clock on A31
> "allwinner,sun4i-pll5-clk" - for the PLL5 clock
> "allwinner,sun4i-pll6-clk" - for the PLL6 clock
> +   "allwinner,sun6i-a31-pll6-clk" - for the PLL6 clock on A31
> "allwinner,sun4i-cpu-clk" - for the CPU multiplexer clock
> "allwinner,sun4i-axi-clk" - for the AXI clock
> "allwinner,sun4i-axi-gates-clk" - for the AXI gates
> diff --git a/drivers/clk/sunxi/clk-sunxi.c b/drivers/clk/sunxi/clk-sunxi.c
> index 659e4ea..990ad5d 100644
> --- a/drivers/clk/sunxi/clk-sunxi.c
> +++ b/drivers/clk/sunxi/clk-sunxi.c
> @@ -249,7 +249,38 @@ static void sun4i_get_pll5_factors(u32 *freq, u32 
> parent_rate,
> *n = DIV_ROUND_UP(div, (*k+1));
>  }
>  
> +/**
> + * sun6i_a31_get_pll6_factors() - calculates n, k factors for A31 PLL6
> + * PLL6 rate is calculated as follows
> + * rate = parent_rate * n * (k + 1) / 2
> + * parent_rate is always 24Mhz
> + */
> +
> +static void sun6i_a31_get_pll6_factors(u32 *freq, u32 parent_rate,
> +  u8 *n, u8 *k, u8 *m, u8 *p)
> +{
> +   u8 div;
> +
> +   /*
> +* We always have 24MHz / 2, so we can just say that our
> +* parent clock is 12MHz.
> +*/
> +   parent_rate = parent_rate / 2;
> +
> +   /* Normalize value to a parent_rate multiple (24M / 2) */
> +   div = *freq / parent_rate;
> +   *freq = parent_rate * div;
> +
> +   /* we were called to round the frequency, we can now return */
> +   if (n == NULL)
> +   return;
> +
> +   *k = div / 32;
> +   if (*k > 3)
> +   *k = 3;
>  
> +   *n = DIV_ROUND_UP(div, (*k+1));
> +}
>  
>  /**
>   * sun4i_get_apb1_factors() - calculates m, p factors for APB1
> @@ -416,6 +447,13 @@ static struct clk_factors_config sun4i_pll5_config = {
> .kwidth = 2,
>  };
>  
> +static struct clk_factors_config sun6i_a31_pll6_config = {
> +   .nshift = 8,
> +   .nwidth = 5,
> +   .kshift = 4,
> +   .kwidth = 2,
> +};
> +
>  static struct clk_factors_config sun4i_apb1_config = {
> .mshift = 0,
> .mwidth = 5,
> @@ -457,6 +495,12 @@ static const struct factors_data sun4i_pll5_data 
> __initconst = {
> .getter = sun4i_get_pll5_factors,
>  };
>  
> +static const struct factors_data sun6i_a31_pll6_data __initconst = {
> +   .enable = 31,
> +   .table = &sun6i_a31_pll6_config,
> +   .getter = sun6i_a31_get_pll6_factors,
> +};
> +
>  static const struct factors_data sun4i_apb1_data __initconst = {
> .table = &sun4i_apb1_config,
> .getter = sun4i_get_apb1_factors,
> @@ -972,6 +1016,7 @@ free_clkdata:
>  static const struct of_device_id clk_factors_match[] __initconst = {
> {.compatible = "allwinner,sun4i-pll1-clk", .data = &sun4i_pll1_data,},
> {.compatible = "allwinner,sun6i-a31-pll1-clk", .data = 
> &sun6i_a31_pll1_data,},
> +   {.compatible = "allwinner,sun6i-a31-pll6-clk", .data = 
> &sun6i_a31_pll6_data,},
> {.compatible = "allwinner,sun4i-apb1-clk", .data = &sun4i_apb1_data,},
> {.compatible = "allwinner,sun4i-mod0-clk", .data = &sun4i_mod0_data,},
> {.compatible = "allwinner,sun7i-a20-out-clk", .data = 
> &sun7i_a20_out_data,},
> -- 
> 1.8.4.2
> 
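
For readers following the arithmetic, the factor calculation in the quoted
patch can be reproduced in userspace roughly as below. This is a sketch that
mirrors the quoted getter; the *_sketch names and the separate rate helper are
made up for illustration, and the m and p arguments are dropped because the
quoted code does not use them.

```c
#include <stddef.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Mirrors the quoted getter: rounds *freq and, if n is non-NULL, fills n, k. */
static void pll6_factors_sketch(uint32_t *freq, uint32_t parent_rate,
				uint8_t *n, uint8_t *k)
{
	uint8_t div;

	/* The parent is always divided by two (24 MHz / 2 = 12 MHz). */
	parent_rate = parent_rate / 2;

	/* Normalize the request to a multiple of the halved parent rate. */
	div = *freq / parent_rate;
	*freq = parent_rate * div;

	/* Called only to round the frequency. */
	if (n == NULL)
		return;

	*k = div / 32;
	if (*k > 3)
		*k = 3;

	*n = DIV_ROUND_UP(div, (*k + 1));
}

/* rate = parent_rate * n * (k + 1) / 2, as stated in the quoted comment. */
static uint32_t pll6_rate_sketch(uint32_t parent_rate, uint8_t n, uint8_t k)
{
	return (uint32_t)((uint64_t)parent_rate * n * (k + 1) / 2);
}
```

For a 605 MHz request with a 24 MHz parent, div is 50, so the request is
rounded to 600 MHz with k = 1 and n = 25, and the rate formula gives back
600 MHz.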

Re: [PATCH 0/11] use ether_addr_equal_64bits

2014-01-17 Thread Oleksij Rempel

Am 17.01.2014 22:24, schrieb Pavel Machek:
> On Mon 2013-12-30 19:14:56, Julia Lawall wrote:
>> Ether_addr_equal_64bits is more efficient than ether_addr_equal, and can be
>> used when each argument is an array within a structure that contains at
>> least two bytes of data beyond the array.
> 
> I mean, yes, it is probably faster, and yes, most structures probably
> contain two more bytes, but... is the ugliness worth the speedup? I'd
> say this should not be done except in very time-critical places...
>   Pavel

This code runs on every received beacon, in almost every wifi driver (if
I understand what you mean).
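
To make the "two bytes of data beyond the array" requirement concrete, here is
a userspace sketch of the idea: load eight bytes from each side, XOR them, and
discard the two trailing bytes. It is not the kernel's implementation; the
struct layout and the *_sketch names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 6-byte address followed by at least 2 more bytes. */
struct hdr_sketch {
	uint8_t addr[6];
	uint8_t more[2];
};

/*
 * Compare two 6-byte addresses with one 64-bit XOR. Both pointers must
 * reference at least 8 readable bytes, which is why the address has to sit
 * inside a larger structure.
 */
static bool addr_equal_64bits_sketch(const void *a, const void *b)
{
	const uint16_t probe = 1;
	uint64_t x, y, fold;

	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	fold = x ^ y;

	/* Little endian: the two trailing bytes are the top 16 bits. */
	if (*(const uint8_t *)&probe)
		return (fold << 16) == 0;

	/* Big endian: the two trailing bytes are the bottom 16 bits. */
	return (fold >> 16) == 0;
}
```

Two headers with the same address but different trailing bytes compare equal,
which is the point of masking off the extra 16 bits.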

-- 
Regards,
Oleksij




Re: [PATCH v2 5/9] nvme: Fix invalid call to irq_set_affinity_hint()

2014-01-17 Thread Keith Busch


On Fri, 17 Jan 2014, Bjorn Helgaas wrote:


On Fri, Jan 17, 2014 at 9:02 AM, Alexander Gordeev  wrote:

In case MSI-X and MSI initialization failed the function
irq_set_affinity_hint() is called with uninitialized value
in dev->entry[0].vector. This update fixes the issue.


dev->entry[0].vector is initialized in nvme_dev_map(), and it's used
for free_irq() above the area of your patch, so I don't think this is
actually a bug, though it might be somewhat confusing.


It is confusing, but there's a reason. :)

We send a single command using legacy irq to discover how many msix
vectors we want. The legacy entry needs to be set some time before calling
request_irq in nvme_configure_admin_queue, but also within nvme_dev_start
(for power-management). I don't think there's a place to set it that
won't look odd when looking at nvme_setup_io_queues. I settled on
'nvme_dev_map' because 'nvme_dev_unmap' invalidates the entries,
so this seemed to provide some amount of symmetry.


Signed-off-by: Alexander Gordeev 
---
 drivers/block/nvme-core.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 26d03fa..e292450 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -1790,15 +1790,15 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
vecs = 32;
for (;;) {
result = pci_enable_msi_block(pdev, vecs);
-   if (result == 0) {
-   for (i = 0; i < vecs; i++)
-   dev->entry[i].vector = i + pdev->irq;
-   break;
+   if (result > 0) {
+   vecs = result;
+   continue;
} else if (result < 0) {
vecs = 1;
-   break;
}
-   vecs = result;
+   for (i = 0; i < vecs; i++)
+   dev->entry[i].vector = i + pdev->irq;
+   break;
}
}


Re: bio_integrity_verify() bug causing READ verify to be silently skipped

2014-01-17 Thread Martin K. Petersen

> "nab" == Nicholas A Bellinger  writes:

>> That breaks partial completion, though. I'll take a look at Kent's
>> changes...

nab> Ping..?  Any updates on a proper bugfix for this..?

I did put your patch in my queue and have been working on a fix for the
partial completion case. The latter requires a bit of massaging that
interferes with other pending changes.

Given that your patch does address a valid issue I'm OK with Jens
putting it in as is. I'll build upon it for my changes.

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: Why is (2 < 2) true? Is it a gcc bug?

2014-01-17 Thread Alexei Starovoitov

On Fri, Jan 17, 2014 at 1:02 PM, Markus Trippelsdorf
 wrote:
> On 2014.01.17 at 11:58 -0800, Alexei Starovoitov wrote:
>> On Fri, Jan 17, 2014 at 9:58 AM, Alexei Starovoitov
>>  wrote:
>> > On Fri, Jan 17, 2014 at 5:37 AM, Dorau, Lukasz  
>> > wrote:
>> >> Hi
>> >>
>> >> My story is very simple...
>> >> I applied the following patch:
>> >>
>> >> diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
>> >> --- a/drivers/scsi/isci/init.c
>> >> +++ b/drivers/scsi/isci/init.c
>> >> @@ -698,8 +698,11 @@ static int isci_pci_probe(struct pci_dev *pdev, 
>> >> const struct pci_device_id *id)
>> >> if (err)
>> >> goto err_host_alloc;
>> >>
>> >> -   for_each_isci_host(i, isci_host, pdev)
>> >> +   for_each_isci_host(i, isci_host, pdev) {
>> >> +   pr_err("(%d < %d) == %d\n",\
>> >> +  i, SCI_MAX_CONTROLLERS, (i < SCI_MAX_CONTROLLERS));
>> >> scsi_scan_host(to_shost(isci_host));
>> >> +   }
>> >>
>> >> return 0;
>> >>
>> >> --
>> >> 1.8.3.1
>> >>
>> >> Then I issued the command 'modprobe isci' on platform with two SCU 
>> >> controllers (Patsburg D or T chipset)
>> >> and received the following, very strange, output:
>> >>
>> >> (0 < 2) == 1
>> >> (1 < 2) == 1
>> >> (2 < 2) == 1
>> >>
>> >> Can anyone explain why (2 < 2) is true? Is it a gcc bug?
>> >
>> > gcc sees that i < array_size is the same as i < 2 as part of loop 
>> > condition, so
>> > it optimizes (i < sci_max_controllers) into constant 1.
>> > and emits printk like:
>> >   printk ("\13(%d < %d) == %d\n", i_382, 2, 1);
>> >
>> >> (The kernel was compiled using gcc version 4.8.2.)
>> >
>> > it actually looks to be gcc 4.8 bug.
>> > Can you try gcc 4.7 ?
>> >
>>
>> It is an interesting GCC 4.8 bug,
>> since it seems to expose issues in two compiler passes.
>>
>> here is test case:
>>
>> struct isci_host;
>> struct isci_orom;
>>
>> struct isci_pci_info {
>>   struct isci_host *hosts[2];
>>   struct isci_orom *orom;
>> } v = {{(struct isci_host *)1,(struct isci_host *)1}, 0};
>>
>> int printf(const char *fmt, ...);
>>
>> int isci_pci_probe()
>> {
>>   int i;
>>   struct isci_host *isci_host;
>>
>>   for (i = 0, isci_host = v.hosts[i];
>>i < 2 && isci_host;
>>isci_host = v.hosts[++i]) {
>> printf("(%d < %d) == %d\n", i, 2, (i < 2));
>>   }
>>
>>   return 0;
>> }
>>
>> int main()
>> {
>>   isci_pci_probe();
>> }
>>
>> $ gcc bug.c
>> $./a.out
>> (0 < 2) == 1
>> (1 < 2) == 1
>> $ gcc bug.c -O2
>> $ ./a.out
>> (0 < 2) == 1
>> (1 < 2) == 1
>> Segmentation fault (core dumped)
>
> Your testcase is invalid:
>
> markus@x4 tmp % clang -fsanitize=undefined -Wall -Wextra -O2 bug.c
> markus@x4 tmp % ./a.out
> (0 < 2) == 1
> (1 < 2) == 1
> bug.c:16:20: runtime error: index 2 out of bounds for type 'struct isci_host 
> *[2]'
>
> As Jakub Jelinek said on IRC, changing the loop to e.g.:
>
>   for (i = 0;
>i < 2 && (isci_host = v.hosts[i]);
>i++) {
>
> fixes the issue.

The testcase was reduced from drivers/scsi/isci/host.h, written back in
2011 (commit b329aff107):
#define for_each_isci_host(id, ihost, pdev) \
for (id = 0, ihost = to_pci_info(pdev)->hosts[id]; \
 id < ARRAY_SIZE(to_pci_info(pdev)->hosts) && ihost; \
 ihost = to_pci_info(pdev)->hosts[++id])

Yes, it does access the 3rd element of a 2-element array and will read junk.

The C standard says the behavior is undefined, which comes in handy in the
compiler's defense, but in this case the compiler obviously has all the
information it needs to make the right decision instead of misoptimizing
the code.
So yes, the loop is erroneous, non-portable, etc., but gcc can be smarter.
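
For reference, the loop shape suggested on IRC (load the element only after
the bounds check has passed) can be written as a standalone function. The
type and function names here are made up for the sketch; only the loop
structure comes from the thread.

```c
#include <stddef.h>

#define MAX_HOSTS_SKETCH 2

struct host_sketch {
	int id;
};

/*
 * Count the leading non-NULL entries. hosts[i] is read only when
 * i < MAX_HOSTS_SKETCH already holds, so the array is never indexed
 * out of bounds.
 */
static int count_hosts_sketch(struct host_sketch *hosts[MAX_HOSTS_SKETCH])
{
	struct host_sketch *h;
	int i, count = 0;

	for (i = 0; i < MAX_HOSTS_SKETCH && (h = hosts[i]); i++)
		count++;

	return count;
}
```

Because of the short-circuit &&, a full two-entry array stops after the
second iteration without ever evaluating hosts[2].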

Re: [PATCH] mm/nobootmem: Fix unused variable

2014-01-17 Thread Andrew Morton

On Thu, 16 Jan 2014 14:33:06 +0100 Philipp Hachtmann 
 wrote:

> This fixes an unused variable warning in nobootmem.c
> 
> ...
>
> --- a/mm/nobootmem.c
> +++ b/mm/nobootmem.c
> @@ -116,9 +116,13 @@ static unsigned long __init 
> __free_memory_core(phys_addr_t start,
>  static unsigned long __init free_low_memory_core_early(void)
>  {
>   unsigned long count = 0;
> - phys_addr_t start, end, size;
> + phys_addr_t start, end;
>   u64 i;
>  
> +#ifdef CONFIG_ARCH_DISCARD_MEMBLOCK
> + phys_addr_t size;
> +#endif
> +
>   for_each_free_mem_range(i, NUMA_NO_NODE, &start, &end, NULL)
>   count += __free_memory_core(start, end);

Yes, that is a bit of an eyesore.  We often approach the problem this
way, which is nicer:

static unsigned long __init free_low_memory_core_early(void)
{
	unsigned long count = 0;
	phys_addr_t start, end;
	u64 i;

	for_each_free_mem_range(i, NUMA_NO_NODE, &start, &end, NULL)
		count += __free_memory_core(start, end);

#ifdef CONFIG_ARCH_DISCARD_MEMBLOCK
	{
		phys_addr_t size;

		/* Free memblock.reserved array if it was allocated */
		size = get_allocated_memblock_reserved_regions_info(&start);
		if (size)
			count += __free_memory_core(start, start + size);

		/* Free memblock.memory array if it was allocated */
		size = get_allocated_memblock_memory_regions_info(&start);
		if (size)
			count += __free_memory_core(start, start + size);
	}
#endif

	return count;
}


Re: [PATCH] dcache: fix d_splice_alias handling of aliases

2014-01-17 Thread J. Bruce Fields

On Fri, Jan 17, 2014 at 04:03:43PM -0500, J. Bruce Fields wrote:
>   - d_splice_alias handles inode == NULL in the same way,

Actually, not exactly; simplifying a bit, in the NULL case they do:

d_splice_alias:

__d_instantiate(dentry, NULL);
security_d_instantiate(dentry, NULL);
if (d_unhashed(dentry))
d_rehash(dentry);

d_materialise_unique:

BUG_ON(!d_unhashed(dentry));

 __d_instantiate(dentry, NULL);
 d_rehash(dentry);
 security_d_instantiate(dentry, NULL);

and a comment on d_splice_alias says

Cluster filesystems may call this function with a negative,
hashed dentry.  In that case, we know that the inode will be a
regular file, and also this will only occur during atomic_open.

I don't understand those callers.  But I guess it would be easy enough
to handle in d_materialise_unique.

--b.

Re: [PATCH 0/11] use ether_addr_equal_64bits

2014-01-17 Thread Pavel Machek

On Mon 2013-12-30 19:14:56, Julia Lawall wrote:
> Ether_addr_equal_64bits is more efficient than ether_addr_equal, and can be
> used when each argument is an array within a structure that contains at
> least two bytes of data beyond the array.

I mean, yes, it is probably faster, and yes, most structures probably
contain two more bytes, but... is the ugliness worth the speedup? I'd
say this should not be done except in very time-critical places...
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [ipc] 5769cf63: LTP semget02 TFAILs

2014-01-17 Thread Andrew Morton

On Fri, 17 Jan 2014 18:24:28 +0800 Fengguang Wu  wrote:

> Hi Davidlohr,
> 
> We noticed LTP test failures
> 
> ltp.msgget02.1.TFAIL
> ltp.semget02.2.TFAIL
> ltp.semget02.3.TFAIL
> 
> and the first bad commit is
> 
> commit 5769cf6355d87f63906b3e51887eff7017c39217
> Author: Davidlohr Bueso 
> AuthorDate: Wed Jan 15 16:56:01 2014 +1100
> Commit: Stephen Rothwell 
> CommitDate: Wed Jan 15 16:56:01 2014 +1100
> 
> ipc: share ids rwsem when possible in ipcget_public
> 

Thanks (a lot). I'll disable that patch for now.

[PATCH 7/7] numa,sched: do statistics calculation using local variables only

2014-01-17 Thread riel

From: Rik van Riel 

The current code in task_numa_placement calculates the difference
between the old and the new value, but also temporarily stores half
of the old value in the per-process variables.

The NUMA balancing code looks at those per-process variables, and
having other tasks temporarily see halved statistics could lead to
unwanted numa migrations. This can be avoided by doing all the math
in local variables.

This change also simplifies the code a little.
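
The update described above can be sketched outside the kernel as follows.
This is an illustration only: the struct and function names are invented,
and plain long values stand in for the per-task counters.

```c
/* Result of one decay-and-add step; names are hypothetical. */
struct stat_update_sketch {
	long new_value;	/* value to store back, written once */
	long diff;	/* change to apply to any aggregate totals */
};

/*
 * Decay the old value by half and add the faults buffered since the last
 * scan, computing everything in locals so that no intermediate (halved)
 * value is ever visible in shared state.
 */
static struct stat_update_sketch decay_and_add_sketch(long old_value,
						      long buffered)
{
	struct stat_update_sketch u;

	u.diff = buffered - old_value / 2;
	u.new_value = old_value + u.diff;

	return u;
}
```

With old_value = 100 and buffered = 30 this gives diff = -20 and a new value
of 80. Note that with integer division the stored result is
old - old / 2 + buffered, so for an odd old value it comes out one higher
than (old >> 1) + buffered would.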

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Ingo Molnar 
Cc: Chegu Vinod 
Signed-off-by: Rik van Riel 
---
 kernel/sched/fair.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0d395a0..0f48382 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1484,12 +1484,9 @@ static void task_numa_placement(struct task_struct *p)
long diff, f_diff, f_weight;
 
i = task_faults_idx(nid, priv);
-   diff = -p->numa_faults[i];
-   f_diff = -p->numa_faults_from[i];
 
/* Decay existing window, copy faults since last scan */
-   p->numa_faults[i] >>= 1;
-   p->numa_faults[i] += p->numa_faults_buffer[i];
+   diff = p->numa_faults_buffer[i] - p->numa_faults[i] / 2;
fault_types[priv] += p->numa_faults_buffer[i];
p->numa_faults_buffer[i] = 0;
 
@@ -1503,13 +1500,12 @@ static void task_numa_placement(struct task_struct *p)
f_weight = (1024 * runtime *
   p->numa_faults_from_buffer[i]) /
   (total_faults * period + 1);
-   p->numa_faults_from[i] >>= 1;
-   p->numa_faults_from[i] += f_weight;
+   f_diff = f_weight - p->numa_faults_from[i] / 2;
p->numa_faults_from_buffer[i] = 0;
 
+   p->numa_faults[i] += diff;
+   p->numa_faults_from[i] += f_diff;
faults += p->numa_faults[i];
-   diff += p->numa_faults[i];
-   f_diff += p->numa_faults_from[i];
p->total_numa_faults += diff;
if (p->numa_group) {
/* safe because we can only change our own 
group */
-- 
1.8.4.2


Re: [RFC][PATCH] Remove bus dependency for iommu_domain_alloc.

2014-01-17 Thread Alex Williamson

On Fri, 2014-01-17 at 20:21 +, Varun Sethi wrote:
> 
> > -Original Message-
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, January 18, 2014 1:39 AM
> > To: Sethi Varun-B16395
> > Cc: j...@8bytes.org; io...@lists.linux-foundation.org; linux-
> > ker...@vger.kernel.org
> > Subject: Re: [RFC][PATCH] Remove bus dependency for iommu_domain_alloc.
> > 
> > On Sat, 2014-01-18 at 01:00 +0530, Varun Sethi wrote:
> > > This patch attempts to remove iommu_domain_alloc function's dependency
> > on the bus type.
> > > This dependency is quite restrictive in case of vfio, where it's
> > > possible to bind multiple iommu groups (from different bus types) to
> > the same iommu domain.
> > >
> > > This patch is based on the assumption, that there is a single iommu
> > > for all bus types on the system.
> > >
> > > We maintain a list of bus types (for which iommu ops are registered).
> > > In the iommu_domain_alloc function we ensure that all bus types
> > correspond to the same set of iommu operations.
> > 
> > Seems like this just kicks the problem down the road a little ways as I
> > expect the assumption isn't going to last long.  I think there's another
> > way to do this and we can do it entirely from within vfio_iommu_type1.
> > We have a problem on x86 that the IOMMU driver can be backed by multiple
> > IOMMU hardware devices.  These separate devices are architecturally
> > allowed to have different properties.  The property causing us trouble is
> > cache coherency.  Some hardware devices allow us to use IOMMU_CACHE as a
> > mapping attribute, others do not.  Therefore we cannot use a single IOMMU
> > domain to optimally handle all devices in a heterogeneous environment.
> > 
> > I think the solution to this is to have vfio_iommu_type1 transparently
> > support multiple IOMMU domains.  In the implementation of that, it seems
> > to make sense to move the iommu_domain_alloc() to the point where we
> > attach a group to the domain.  That means we can scan the devices in the
> [Sethi Varun-B16395] Multiple iommu groups can also share the same domain (as 
> a part
> Of the same VFIO container). I am not sure how can we handle the case of 
> iommu groups from
> Different bus types in vfio.

Correct, and I believe I handle this.  The difference is that rather than
attaching a new group to an old domain and hoping for the best, we now
allocate a domain for each group, attach the group to the new domain,
then compare the capabilities of the new domain to the old domain.  If
we determine they are compatible, we throw away the new domain and use
the old one.  If they are not compatible, for instance if they are for
different bus_types or if the cache coherence support is different, they
remain separate and we duplicate mappings to both domains.  Hopefully
this is more clear in the code I just sent.  Thanks,

Alex 

> > domain to determine the bus.  I suppose there is still an assumption that
> > all the devices in a group are on the same bus, but since the group is
> > determined by the IOMMU and we already assume only a single IOMMU per
> > bus, I think we're ok.  I spent some time working on a patch to do this,
> > but it isn't quite finished.  I'll try to bandage the rough edges and
> > send it out as an RFC so you can see what I'm talking about.  Thanks,
> > 
> > Alex
> > 
> > > Signed-off-by: Varun Sethi 
> > > ---
> > >  arch/arm/mm/dma-mapping.c |2 +-
> > >  drivers/gpu/drm/msm/msm_gpu.c |2 +-
> > >  drivers/iommu/amd_iommu_v2.c  |2 +-
> > >  drivers/iommu/iommu.c |   32
> > +---
> > >  drivers/media/platform/omap3isp/isp.c |2 +-
> > >  drivers/remoteproc/remoteproc_core.c  |2 +-
> > >  drivers/vfio/vfio_iommu_type1.c   |2 +-
> > >  include/linux/device.h|2 ++
> > >  include/linux/iommu.h |4 ++--
> > >  virt/kvm/iommu.c  |2 +-
> > >  10 files changed, 40 insertions(+), 12 deletions(-)
> 




[PATCH 6/7] numa,sched: normalize faults_from stats and weigh by CPU use

2014-01-17 Thread riel

From: Rik van Riel 

The tracepoint has made it abundantly clear that the naive
implementation of the faults_from code has issues.

Specifically, the garbage collector in some workloads will
access orders of magnitude more memory than the threads
that do all the active work. This resulted in the node with
the garbage collector being marked the only active node in
the group.

This issue is avoided if we weigh the statistics by the CPU use
of each task in the numa group, instead of by how many faults
each thread has incurred.

To achieve this, we normalize the number of faults to the
fraction of faults that occurred on each node, and then
multiply that fraction by the fraction of CPU time the
task has used since the last time task_numa_placement was
invoked.

This way the nodes in the active node mask will be the ones
where the tasks from the numa group are most actively running,
and the influence of e.g. the garbage collector and other
do-little threads is properly minimized.
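
As a worked illustration of the weighting described above, the calculation
can be sketched in userspace like this. The function name and parameter
names are invented for the sketch; the formula follows the f_weight
expression used in this series, with a 1024 scale factor and a +1 in the
denominator to avoid dividing by zero.

```c
#include <stdint.h>

/*
 * Weight of one node's faults: the fraction of the task's faults that hit
 * this node, multiplied by the fraction of the period the task spent
 * running, scaled by 1024.
 */
static uint64_t faults_from_weight_sketch(uint64_t runtime, uint64_t period,
					   uint64_t node_faults,
					   uint64_t total_faults)
{
	return (1024 * runtime * node_faults) / (total_faults * period + 1);
}
```

A thread that ran for half of a 1000-unit period with 30 of its 100 faults on
a node gets a weight of 153, while a thread with all 100 faults on that node
but only 10 units of runtime gets a weight of 10.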

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Ingo Molnar 
Cc: Chegu Vinod 
Signed-off-by: Rik van Riel 
---
 include/linux/sched.h |  2 ++
 kernel/sched/core.c   |  2 ++
 kernel/sched/fair.c   | 48 ++--
 3 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0af6c1a..52de567 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1471,6 +1471,8 @@ struct task_struct {
int numa_preferred_nid;
unsigned long numa_migrate_retry;
u64 node_stamp; /* migration stamp  */
+   u64 last_task_numa_placement;
+   u64 last_sum_exec_runtime;
struct callback_head numa_work;
 
struct list_head numa_entry;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f45fd5..9a0908a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1758,6 +1758,8 @@ static void __sched_fork(unsigned long clone_flags, 
struct task_struct *p)
p->numa_work.next = &p->numa_work;
p->numa_faults = NULL;
p->numa_faults_buffer = NULL;
+   p->last_task_numa_placement = 0;
+   p->last_sum_exec_runtime = 0;
 
INIT_LIST_HEAD(&p->numa_entry);
p->numa_group = NULL;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8e0a53a..0d395a0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1422,11 +1422,41 @@ static void update_task_scan_period(struct task_struct 
*p,
memset(p->numa_faults_locality, 0, sizeof(p->numa_faults_locality));
 }
 
+/*
+ * Get the fraction of time the task has been running since the last
+ * NUMA placement cycle. The scheduler keeps similar statistics, but
+ * decays those on a 32ms period, which is orders of magnitude off
+ * from the dozens-of-seconds NUMA balancing period. Use the scheduler
+ * stats only if the task is so new there are no NUMA statistics yet.
+ */
+static u64 numa_get_avg_runtime(struct task_struct *p, u64 *period)
+{
+   u64 runtime, delta, now;
+   /* Use the start of this time slice to avoid calculations. */
+   now = p->se.exec_start;
+   runtime = p->se.sum_exec_runtime;
+
+   if (p->last_task_numa_placement) {
+   delta = runtime - p->last_sum_exec_runtime;
+   *period = now - p->last_task_numa_placement;
+   } else {
+   delta = p->se.avg.runnable_avg_sum;
+   *period = p->se.avg.runnable_avg_period;
+   }
+
+   p->last_sum_exec_runtime = runtime;
+   p->last_task_numa_placement = now;
+
+   return delta;
+}
+
 static void task_numa_placement(struct task_struct *p)
 {
int seq, nid, max_nid = -1, max_group_nid = -1;
unsigned long max_faults = 0, max_group_faults = 0;
unsigned long fault_types[2] = { 0, 0 };
+   unsigned long total_faults;
+   u64 runtime, period;
spinlock_t *group_lock = NULL;
 
seq = ACCESS_ONCE(p->mm->numa_scan_seq);
@@ -1435,6 +1465,10 @@ static void task_numa_placement(struct task_struct *p)
p->numa_scan_seq = seq;
p->numa_scan_period_max = task_scan_max(p);
 
+   total_faults = p->numa_faults_locality[0] +
+  p->numa_faults_locality[1] + 1;
+   runtime = numa_get_avg_runtime(p, &period);
+
/* If the task is part of a group prevent parallel updates to group stats */
if (p->numa_group) {
group_lock = &p->numa_group->lock;
@@ -1447,7 +1481,7 @@ static void task_numa_placement(struct task_struct *p)
int priv, i;
 
for (priv = 0; priv < 2; priv++) {
-   long diff, f_diff;
+   long diff, f_diff, f_weight;
 
i = task_faults_idx(nid, priv);
diff = -p->numa_faults[i];
@@ -1459,8 +1493,18 @@ static void task_numa_placement(struct task_struct *p)
fault_types[priv] += p->numa_faults_buffer[i];

[PATCH 5/7] numa,sched,mm: use active_nodes nodemask to limit numa migrations

2014-01-17 Thread riel

From: Rik van Riel 

Use the active_nodes nodemask to make smarter decisions on NUMA migrations.

In order to maximize performance of workloads that do not fit in one NUMA
node, we want to satisfy the following criteria:
1) keep private memory local to each thread
2) avoid excessive NUMA migration of pages
3) distribute shared memory across the active nodes, to
   maximize memory bandwidth available to the workload

This patch accomplishes that by implementing the following policy for
NUMA migrations:
1) always migrate on a private fault
2) never migrate to a node that is not in the set of active nodes
   for the numa_group
3) always migrate from a node outside of the set of active nodes,
   to a node that is in that set
4) within the set of active nodes in the numa_group, only migrate
   from a node with more NUMA page faults, to a node with fewer
   NUMA page faults, with a 25% margin to avoid ping-ponging

This results in most pages of a workload ending up on the actively
used nodes, with reduced ping-ponging of pages between those nodes.
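
The four rules above can be modelled outside the kernel as a small
predicate. The sketch below is illustrative only: struct model_group,
model_should_migrate() and MODEL_MAX_NODES are hypothetical names, and
plain arrays stand in for the active_nodes nodemask and the per-group
fault counters. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES 8	/* hypothetical node count for this model */

/* Hypothetical stand-in for the per-group state the policy consults. */
struct model_group {
	bool active[MODEL_MAX_NODES];		/* node is in the active set */
	unsigned long faults[MODEL_MAX_NODES];	/* NUMA faults seen per node */
};

/*
 * Decide whether a page may move from node src to node dst.
 * private_fault: the faulting task was also the last one to touch the page.
 * grp may be NULL when the task has no group yet.
 */
static bool model_should_migrate(const struct model_group *grp,
				 bool private_fault, int src, int dst)
{
	assert(src >= 0 && src < MODEL_MAX_NODES);
	assert(dst >= 0 && dst < MODEL_MAX_NODES);

	if (private_fault)		/* rule 1: always migrate */
		return true;
	if (grp == NULL)		/* shared fault, no group set up yet */
		return true;
	if (!grp->active[dst])		/* rule 2: never leave the active set */
		return false;
	if (!grp->active[src])		/* rule 3: always move into the set */
		return true;

	/* rule 4: only towards a node with under 3/4 of the source's faults */
	return grp->faults[dst] < grp->faults[src] * 3 / 4;
}
```

With 1000 faults on the source node, a destination with 700 faults
passes the check (700 < 750) while one with 800 does not, so pages do
not bounce between two nodes that are roughly equally busy.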

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Ingo Molnar 
Cc: Chegu Vinod 
Signed-off-by: Rik van Riel 
---
 include/linux/sched.h |  7 +++++++
 kernel/sched/fair.c   | 37 +++++++++++++++++++++++++++++++++++++
 mm/mempolicy.c        |  3 +++
 3 files changed, 47 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a9f7f05..0af6c1a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1602,6 +1602,8 @@ extern void task_numa_fault(int last_node, int node, int pages, int flags);
 extern pid_t task_numa_group_id(struct task_struct *p);
 extern void set_numabalancing_state(bool enabled);
 extern void task_numa_free(struct task_struct *p);
+extern bool should_numa_migrate(struct task_struct *p, int last_cpupid,
+   int src_nid, int dst_nid);
 #else
 static inline void task_numa_fault(int last_node, int node, int pages,
   int flags)
@@ -1617,6 +1619,11 @@ static inline void set_numabalancing_state(bool enabled)
 static inline void task_numa_free(struct task_struct *p)
 {
 }
+static inline bool should_numa_migrate(struct task_struct *p, int last_cpupid,
+  int src_nid, int dst_nid)
+{
+   return true;
+}
 #endif
 
 static inline struct pid *task_pid(struct task_struct *task)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3551009..8e0a53a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -948,6 +948,43 @@ static inline unsigned long group_weight(struct task_struct *p, int nid)
return 1000 * group_faults(p, nid) / p->numa_group->total_faults;
 }
 
+bool should_numa_migrate(struct task_struct *p, int last_cpupid,
+int src_nid, int dst_nid)
+{
+   struct numa_group *ng = p->numa_group;
+
+   /* Always allow migrate on private faults */
+   if (cpupid_match_pid(p, last_cpupid))
+   return true;
+
+   /* A shared fault, but p->numa_group has not been set up yet. */
+   if (!ng)
+   return true;
+
+   /*
+* Do not migrate if the destination is not a node that
+* is actively used by this numa group.
+*/
+   if (!node_isset(dst_nid, ng->active_nodes))
+   return false;
+
+   /*
+* Source is a node that is not actively used by this
+* numa group, while the destination is. Migrate.
+*/
+   if (!node_isset(src_nid, ng->active_nodes))
+   return true;
+
+   /*
+* Both source and destination are nodes in active
+* use by this numa group. Maximize memory bandwidth
+* by migrating from more heavily used groups, to less
+* heavily used ones, spreading the load around.
+* Use a 1/4 hysteresis to avoid spurious page movement.
+*/
+   return group_faults(p, dst_nid) < (group_faults(p, src_nid) * 3 / 4);
+}
+
 static unsigned long weighted_cpuload(const int cpu);
 static unsigned long source_load(int cpu, int type);
 static unsigned long target_load(int cpu, int type);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 052abac..050962b 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2405,6 +2405,9 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
if (!cpupid_pid_unset(last_cpupid) && cpupid_to_nid(last_cpupid) != thisnid) {
goto out;
}
+
+   if (!should_numa_migrate(current, last_cpupid, curnid, polnid))
+   goto out;
}
 
if (curnid != polnid)
-- 
1.8.4.2

[PATCH 4/7] numa,sched: tracepoints for NUMA balancing active nodemask changes

2014-01-17 Thread riel

From: Rik van Riel 

Being able to see how the active nodemask changes over time, and why,
can be quite useful.

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Ingo Molnar 
Cc: Chegu Vinod 
Signed-off-by: Rik van Riel 
---
 include/trace/events/sched.h | 34 ++++++++++++++++++++++++++++++++++
 kernel/sched/fair.c          |  8 ++++++--
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 67e1bbf..91726b6 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -530,6 +530,40 @@ TRACE_EVENT(sched_swap_numa,
__entry->dst_pid, __entry->dst_tgid, __entry->dst_ngid,
__entry->dst_cpu, __entry->dst_nid)
 );
+
+TRACE_EVENT(update_numa_active_nodes_mask,
+
+   TP_PROTO(int pid, int gid, int nid, int set, long faults, long max_faults),
+
+   TP_ARGS(pid, gid, nid, set, faults, max_faults),
+
+   TP_STRUCT__entry(
+   __field(pid_t,  pid)
+   __field(pid_t,  gid)
+   __field(int,nid)
+   __field(int,set)
+   __field(long,   faults)
+   __field(long,   max_faults);
+   ),
+
+   TP_fast_assign(
+   __entry->pid = pid;
+   __entry->gid = gid;
+   __entry->nid = nid;
+   __entry->set = set;
+   __entry->faults = faults;
+   __entry->max_faults = max_faults;
+   ),
+
+   TP_printk("pid=%d gid=%d nid=%d set=%d faults=%ld max_faults=%ld",
+   __entry->pid,
+   __entry->gid,
+   __entry->nid,
+   __entry->set,
+   __entry->faults,
+   __entry->max_faults)
+
+);
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aa680e2..3551009 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1300,10 +1300,14 @@ static void update_numa_active_node_mask(struct task_struct *p)
faults = numa_group->faults_from[task_faults_idx(nid, 0)] +
 numa_group->faults_from[task_faults_idx(nid, 1)];
if (!node_isset(nid, numa_group->active_nodes)) {
-   if (faults > max_faults * 4 / 10)
+   if (faults > max_faults * 4 / 10) {
+   trace_update_numa_active_nodes_mask(current->pid, numa_group->gid, nid, true, faults, max_faults);
node_set(nid, numa_group->active_nodes);
-   } else if (faults < max_faults * 2 / 10)
+   }
+   } else if (faults < max_faults * 2 / 10) {
+   trace_update_numa_active_nodes_mask(current->pid, numa_group->gid, nid, false, faults, max_faults);
node_clear(nid, numa_group->active_nodes);
+   }
}
 }
 
-- 
1.8.4.2

[PATCH 3/7] numa,sched: build per numa_group active node mask from faults_from statistics

2014-01-17 Thread riel

From: Rik van Riel 

The faults_from statistics are used to maintain an active_nodes nodemask
per numa_group. This allows us to be smarter about when to do numa migrations.
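
The mask is maintained with hysteresis: a node joins when its fault
count exceeds 40% of the busiest node's count and leaves only once it
falls below 20%. A minimal userspace sketch of that update follows,
assuming plain arrays in place of nodemask_t and faults_from; the names
hyst_update_active() and HYST_MAX_NODES are hypothetical and exist only
for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define HYST_MAX_NODES 8	/* hypothetical upper bound for this model */

/*
 * Recompute the active set in place.
 * active[]: current membership, updated by this call.
 * faults[]: faults triggered from CPUs on each node.
 */
static void hyst_update_active(bool active[], const unsigned long faults[],
			       int nr_nodes)
{
	unsigned long max_faults = 0;
	int nid;

	assert(nr_nodes > 0 && nr_nodes <= HYST_MAX_NODES);

	/* First pass: find the busiest node. */
	for (nid = 0; nid < nr_nodes; nid++)
		if (faults[nid] > max_faults)
			max_faults = faults[nid];

	/* Second pass: add above 40% of the maximum, drop below 20%. */
	for (nid = 0; nid < nr_nodes; nid++) {
		if (!active[nid]) {
			if (faults[nid] > max_faults * 4 / 10)
				active[nid] = true;
		} else if (faults[nid] < max_faults * 2 / 10) {
			active[nid] = false;
		}
	}
}
```

A node sitting at 30% of the maximum keeps whatever state it already
had: it is not added if inactive and not removed if active. That gap
between the two thresholds is what prevents flip-flopping.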

Cc: Peter Zijlstra 
Cc: Mel Gorman 
Cc: Ingo Molnar 
Cc: Chegu Vinod 
Signed-off-by: Rik van Riel 
---
 kernel/sched/fair.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1945ddc..aa680e2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -885,6 +885,7 @@ struct numa_group {
struct list_head task_list;
 
struct rcu_head rcu;
+   nodemask_t active_nodes;
unsigned long total_faults;
unsigned long *faults_from;
unsigned long faults[0];
@@ -1275,6 +1276,38 @@ static void numa_migrate_preferred(struct task_struct *p)
 }
 
 /*
+ * Iterate over the nodes from which NUMA hinting faults were triggered, in
+ * other words where the CPUs that incurred NUMA hinting faults are. The
+ * bitmask is used to limit NUMA page migrations, and spread out memory
+ * between the actively used nodes. To prevent flip-flopping, and excessive
+ * page migrations, nodes are added when they cause over 40% of the maximum
+ * number of faults, but only removed when they drop below 20%.
+ */
+static void update_numa_active_node_mask(struct task_struct *p)
+{
+   unsigned long faults, max_faults = 0;
+   struct numa_group *numa_group = p->numa_group;
+   int nid;
+
+   for_each_online_node(nid) {
+   faults = numa_group->faults_from[task_faults_idx(nid, 0)] +
+numa_group->faults_from[task_faults_idx(nid, 1)];
+   if (faults > max_faults)
+   max_faults = faults;
+   }
+
+   for_each_online_node(nid) {
+   faults = numa_group->faults_from[task_faults_idx(nid, 0)] +
+numa_group->faults_from[task_faults_idx(nid, 1)];
+   if (!node_isset(nid, numa_group->active_nodes)) {
+   if (faults > max_faults * 4 / 10)
+   node_set(nid, numa_group->active_nodes);
+   } else if (faults < max_faults * 2 / 10)
+   node_clear(nid, numa_group->active_nodes);
+   }
+}
+
+/*
  * When adapting the scan rate, the period is divided into NUMA_PERIOD_SLOTS
  * increments. The more local the fault statistics are, the higher the scan
  * period will be for the next scan window. If local/remote ratio is below
@@ -1416,6 +1449,7 @@ static void task_numa_placement(struct task_struct *p)
update_task_scan_period(p, fault_types[0], fault_types[1]);
 
if (p->numa_group) {
+   update_numa_active_node_mask(p);
/*
 * If the preferred task and group nids are different,
 * iterate over the nodes again to find the best place.
@@ -1478,6 +1512,8 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
/* Second half of the array tracks where faults come from */
grp->faults_from = grp->faults + 2 * nr_node_ids;
 
+   node_set(task_node(current), grp->active_nodes);
+
for (i = 0; i < 4*nr_node_ids; i++)
grp->faults[i] = p->numa_faults[i];
 
@@ -1547,6 +1583,8 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
my_grp->nr_tasks--;
grp->nr_tasks++;
 
+   update_numa_active_node_mask(p);
+
spin_unlock(&my_grp->lock);
spin_unlock(&grp->lock);
 
-- 
1.8.4.2
