date:20141101

[git pull] vfs.git

2014-11-01 Thread Al Viro

A bunch of assorted fixes, most of them followups to the overlayfs
merge.  Please pull from the usual place:
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (3):
  overlayfs: barriers for opening upper-layer directory
  isofs_cmp(): we'll never see a dentry for . or ..
  isofs: don't bother with ->d_op for normal case

Daniel Thompson (1):
  staging: android: logger: Fix log corruption regression

David Jeffery (1):
  Return short read or 0 at end of a raw device, not EIO

Miklos Szeredi (3):
  ovl: fix check for cursor
  overlayfs: fix lockdep misannotation
  ovl: initialize ->is_cursor

Paul E. McKenney (1):
  rcu: Provide counterpart to rcu_dereference() for non-RCU situations

Diffstat:
 drivers/char/raw.c   |2 +-
 drivers/staging/android/logger.c |   13 -
 fs/block_dev.c   |3 ++-
 fs/isofs/inode.c |   24 ++--
 fs/isofs/namei.c |   22 --
 fs/namei.c   |2 +-
 fs/overlayfs/readdir.c   |   17 ++---
 include/linux/fs.h   |   10 +++---
 include/linux/rcupdate.h |   15 +++
 9 files changed, 50 insertions(+), 58 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] irqchip: dw-apb-ictl: select GENERIC_IRQ_CHIP

2014-11-01 Thread Jisheng Zhang

On Sat, 1 Nov 2014 19:23:11 -0700
Jason Cooper  wrote:

> Jisheng,
> 
> On Wed, Oct 22, 2014 at 08:59:10PM +0800, Jisheng Zhang wrote:
> > The dw-apb-ictl driver uses the generic-chip functions.
> > Thus it needs to select GENERIC_IRQ_CHIP in Kconfig.
> > 
> > Change-Id: If748beffc4e0d0b47062bb067a59c10994a9148b
> 
> Please don't include this in future patches.

Oops, sorry, my fault. I'll take care from now on.

Thanks for the kind reminder,
Jisheng

Re: [Patch v7 14/18] x86, irq, ACPI: Introduce a rwsem to protect IOAPIC operations from hotplug

2014-11-01 Thread Jiang Liu

On 2014/11/2 2:59, Thomas Gleixner wrote:
> On Mon, 27 Oct 2014, Jiang Liu wrote:
>> We are going to support ACPI based IOAPIC hotplug, so introduce a rwsem
>> to protect IOAPIC data structures from IOAPIC hotplug. We choose to
>> serialize in ACPI instead of in the IOAPIC core because:
>> 1) currently we only plan to support ACPI based IOAPIC hotplug
>> 2) it's much cleaner and easier
>> 3) It doesn't affect IOAPIC discovered by devicetree, SFI and mpparse.
> 
> I had a last intensive look at this series as I was about to merge
> it. So I looked at the locking rules here again
>  
>> +/*
>> + * Locks related to IOAPIC hotplug
>> + * Hotplug side:
>> + *  ->lock_device_hotplug() //device_hotplug_lock
>> + *  ->acpi_ioapic_rwsem
>> + *  ->ioapic_lock
>> + * Interrupt mapping side:
>> + *  ->acpi_ioapic_rwsem
>> + *  ->ioapic_mutex
>> + *  ->ioapic_lock
>> + */
> 
> This looks sane, but I cannot figure out at all why this needs to be a
> rwsem.
> 
>> +static DECLARE_RWSEM(acpi_ioapic_rwsem);
> 
> I think it should be a simple mutex because the rwsem does not protect
> against concurrent execution when taken for read.
> 
> And the site which takes it for write is in the early boot process
> where nothing runs in parallel AFAICT.
Hi Thomas,
You are right. It's not on a hot path, so a mutex is better than
a rwsem here. I will send out an updated version soon.
Regards!
Gerry
> 
> Thanks,
> 
>   tglx
> 

Re: [PATCH] powerpc: use device_online/offline() instead of cpu_up/down()

2014-11-01 Thread Bharata B Rao

On Fri, Oct 31, 2014 at 03:41:34PM -0400, Dan Streetman wrote:
> In powerpc pseries platform dlpar operations, use device_online() and
> device_offline() instead of cpu_up() and cpu_down().
> 
> Calling cpu_up/down directly does not update the cpu device offline
> field, which is used to online/offline a cpu from sysfs.  Calling
> device_online/offline instead keeps the sysfs cpu online value correct.
> The hotplug lock, which is required to be held when calling
> device_online/offline, is already held when dlpar_online/offline_cpu
> are called, since they are called only from cpu_probe|release_store.
> 
> This patch fixes errors on PowerVM systems that have cpu(s) added/removed
> using dlpar operations; without this patch, the
> /sys/devices/system/cpu/cpuN/online nodes do not correctly show the
> online state of added/removed cpus.

Verified the patch to be working as expected when I online and offline
CPUs of a PowerKVM guest using QEMU (plus my RFC hotplug patchset for
QEMU)

Regards,
Bharata.


Re: [PATCH] x86, cpu: trivial printk formatting fixes

2014-11-01 Thread Steven Honeyman

On 1 November 2014 17:51, Borislav Petkov  wrote:
> On Sat, Nov 01, 2014 at 05:38:18PM +, Steven Honeyman wrote:
>> On 1 November 2014 17:19, Borislav Petkov  wrote:
>> > On Sat, Nov 01, 2014 at 03:44:56PM +, Steven Honeyman wrote:
>> >> A 2 line printk makes dmesg output messy, because the second line does 
>> >> not get a timestamp.
>> >> For example:
>> >>
>> >> [0.012863] CPU0: Thermal monitoring enabled (TM1)
>> >> [0.012869] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
>> >> Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
>> >> [0.012958] Freeing SMP alternatives memory: 28K (81d86000 - 
>> >> 81d8d000)
>> >> [0.014961] dmar: Host address width 39
>> >
>> > It looks just fine here, albeit with repeated timestamp:
>> >
>> > $ dmesg | grep -E "[id]TLB"
>> > [0.269607] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
>> > [0.269607] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512, 1GB 0
>>
>> That's strange! Is it the same for the other one? I just double
>
> dmesg | grep ENERGY
> [0.061976] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
> [0.061976] ENERGY_PERF_BIAS: View and update with 
> x86_energy_perf_policy(8)
>
>> checked on the slight chance I had an alias causing problems etc, but
>> that wasn't the case:
>>
>> $ 'dmesg'|'grep' ENERGY
>> [0.010557] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
>> ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
>> $ dmesg --version && grep --version
>> dmesg from util-linux 2.25.2
>> grep (GNU grep) 2.20
>
> $ dmesg --version && grep --version
> dmesg from util-linux 2.20.1
> grep (GNU grep) 2.20
>
> I've upgraded util-linux (for dmesg) on the other box:
>
> $ dmesg --version && grep --version
> dmesg from util-linux 2.25.1
> grep (GNU grep) 2.18
>
> and now I get:
>
> dmesg | grep -E "[id]TLB"
> [0.269607] Last level iTLB entries: 4KB 512, 2MB 1024, 4MB 512
> Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 512, 1GB 0
>
> So I'd say it looks like a regression in dmesg itself.

Hmm - the difference is to do with the source of the kernel ring
buffer. The old output format can be obtained using the latest dmesg by
adding "-S", which uses syslog(2) rather than /dev/kmsg.
(added in commit 7af230601ab)

The klogctl version interprets the \n and adds the timestamp
afterwards, but /dev/kmsg changes the '\n' to "\x0a" resulting in:
4,355,10557,-;ENERGY_PERF_BIAS: Set to 'normal', was
'performance'\x0aENERGY_P...

It looks as though it's just a matter of opinion/preference whether
control characters are printed (rfc5424)
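
A minimal sketch of the decoding step, assuming a reader wants the
escaped record split back into lines: it turns each "\xNN" sequence in
a /dev/kmsg message body back into the byte it stands for. The helper
name kmsg_unescape() is hypothetical and is not taken from util-linux
or the kernel.

```c
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical helper: copy src to dst, replacing every complete
 * "\xNN" escape with the byte 0xNN. Incomplete escapes are copied
 * unchanged. dst must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL.
 */
size_t kmsg_unescape(const char *src, char *dst)
{
	size_t n = 0;

	while (*src) {
		int hi, lo;

		if (src[0] == '\\' && src[1] == 'x' &&
		    (hi = hex_value(src[2])) >= 0 &&
		    (lo = hex_value(src[3])) >= 0) {
			dst[n++] = (char)(hi * 16 + lo);
			src += 4;
		} else {
			dst[n++] = *src++;
		}
	}
	dst[n] = '\0';
	return n;
}
```

Feeding it the text after the ';' of the record quoted above would put
a real newline between the two ENERGY_PERF_BIAS messages; whether a
tool then repeats the timestamp on the second line is the presentation
choice discussed in this thread.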

[PATCH] fsnotify: don't call mutex_lock from TASK_INTERRUPTIBLE context

2014-11-01 Thread Sasha Levin

Sleeping functions should only be called from TASK_RUNNING. The following
code in fanotify_read():

prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);

mutex_lock(&group->notification_mutex);

would call it under TASK_INTERRUPTIBLE, and trigger a warning:

[12326.092094] WARNING: CPU: 27 PID: 30207 at kernel/sched/core.c:7305 
__might_sleep+0xd2/0x110()
[12326.092878] do not call blocking ops when !TASK_RUNNING; state=1 set at 
prepare_to_wait (./arch/x86/include/asm/current.h:14 kernel/sched/wait.c:179)
[12326.093938] Modules linked in:
[12326.094261] CPU: 27 PID: 30207 Comm: fanotify01 Not tainted 
3.18.0-rc2-next-20141031-sasha-00057-g9a0b11b-dirty #1435
[12326.095255]  0009  88003b563000 
88005bfbbc38
[12326.096019]  90dabf13  88005bfbbc98 
88005bfbbc88
[12326.096791]  8c1b12fa 88005bfbbc88 8c1f6112 
001d76c0
[12326.097610] Call Trace:
[12326.097881] dump_stack (lib/dump_stack.c:52)
[12326.098383] warn_slowpath_common (kernel/panic.c:432)
[12326.098973] ? __might_sleep (kernel/sched/core.c:7311)
[12326.099512] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.100100] warn_slowpath_fmt (kernel/panic.c:446)
[12326.100704] ? check_chain_key (kernel/locking/lockdep.c:2190)
[12326.101319] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.101870] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.102421] __might_sleep (kernel/sched/core.c:7311)
[12326.102949] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.103502] ? prepare_to_wait (kernel/sched/wait.c:181)
[12326.104060] mutex_lock_nested (kernel/locking/mutex.c:623)
[12326.104620] ? preempt_count_sub (kernel/sched/core.c:2641)
[12326.105324] ? _raw_spin_unlock_irqrestore 
(./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:161 
kernel/locking/spinlock.c:191)
[12326.105986] ? prepare_to_wait (kernel/sched/wait.c:181)
[12326.106542] fanotify_read (./arch/x86/include/asm/atomic.h:27 
include/linux/mutex.h:131 fs/notify/fanotify/fanotify_user.c:57 
fs/notify/fanotify/fanotify_user.c:273)
[12326.107070] ? abort_exclusive_wait (kernel/sched/wait.c:291)
[12326.107676] vfs_read (fs/read_write.c:430)
[12326.108169] SyS_read (fs/read_write.c:569 fs/read_write.c:562)
[12326.108652] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)

Instead of trying to fix fanotify_read() I've converted notification_mutex
into a spinlock. I didn't see a reason why it should be a mutex, and nothing
complained when I ran the same tests again.

Signed-off-by: Sasha Levin 
---
 fs/notify/fanotify/fanotify_user.c |   18 +-
 fs/notify/group.c  |2 +-
 fs/notify/inotify/inotify_user.c   |   16 
 fs/notify/notification.c   |   22 +++---
 include/linux/fsnotify_backend.h   |2 +-
 5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c 
b/fs/notify/fanotify/fanotify_user.c
index c991616..f03bffc 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -49,12 +49,12 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
  * enough to fit in "count". Return an error pointer if the count
  * is not large enough.
  *
- * Called with the group->notification_mutex held.
+ * Called with the group->notification_lock held.
  */
 static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
size_t count)
 {
-   BUG_ON(!mutex_is_locked(&group->notification_mutex));
+   BUG_ON(!spin_is_locked(&group->notification_lock));
 
pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
 
@@ -64,7 +64,7 @@ static struct fsnotify_event *get_one_event(struct 
fsnotify_group *group,
if (FAN_EVENT_METADATA_LEN > count)
return ERR_PTR(-EINVAL);
 
-   /* held the notification_mutex the whole time, so this is the
+   /* held the notification_lock the whole time, so this is the
 * same event we peeked above */
return fsnotify_remove_first_event(group);
 }
@@ -244,10 +244,10 @@ static unsigned int fanotify_poll(struct file *file, 
poll_table *wait)
int ret = 0;
 
poll_wait(file, &group->notification_waitq, wait);
-   mutex_lock(&group->notification_mutex);
+   spin_lock(&group->notification_lock);
if (!fsnotify_notify_queue_is_empty(group))
ret = POLLIN | POLLRDNORM;
-   mutex_unlock(&group->notification_mutex);
+   spin_unlock(&group->notification_lock);
 
return ret;
 }
@@ -269,9 +269,9 @@ static ssize_t fanotify_read(struct file *file, char __user 
*buf,
while (1) {
prepare_to_wait(&group->notification_waitq, &wait, 
TASK_INTERRUPTIBLE);
 
-   mutex_lock(>notification_mutex);
+

Re: [PATCH] usb: option: Add ID for Peiker LTE NAD

2014-11-01 Thread Lars Melin


On 2014-11-01 23:01, Matthias Klein wrote:

Add ID of the Peiker LTE NAD for legacy serial interface

Signed-off-by: Matthias Klein 
---
  drivers/usb/serial/option.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index d1a3f60..d7f1042 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1091,6 +1091,7 @@ static const struct usb_device_id option_ids[] = {
{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x6613)}, /* Onda H600/ZTE MF330 */
{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x0023)}, /* ONYX 3G device */
{ USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9000)}, /* SIMCom SIM5218 */
+   { USB_DEVICE(QUALCOMM_VENDOR_ID, 0x9025)}, /* Peiker LTE NAD */
{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) },
{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) },
{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6003),

05c6:9025 already has its net interface (#4) supported by the qmi_wwan 
driver, so your patch is wrong.
There is also an ADB interface, which I don't think should be driven by 
option.




RE: [PATCH 2/4] pmem: Add support for getgeo()

2014-11-01 Thread Elliott, Robert (Server Storage)



> -----Original Message-----
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Ross Zwisler
> Sent: Wednesday, 27 August, 2014 4:12 PM
> To: Jens Axboe; Matthew Wilcox; Boaz Harrosh; Nick Piggin; linux-
> fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> nvd...@lists.01.org
> Cc: Ross Zwisler
> Subject: [PATCH 2/4] pmem: Add support for getgeo()
> 
> Some programs require HDIO_GETGEO work, which requires we implement
> getgeo.  Based off of the work done to the NVMe driver in this
> commit:
> 
> commit 4cc09e2dc4cb ("NVMe: Add getgeo to block ops")
> 
> Signed-off-by: Ross Zwisler 
> ---
>  drivers/block/pmem.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
> index d366b9b..60bbe0d 100644
> --- a/drivers/block/pmem.c
> +++ b/drivers/block/pmem.c
> @@ -50,6 +50,15 @@ struct pmem_device {
>   size_t  size;
>  };
> 
> +static int pmem_getgeo(struct block_device *bd, struct hd_geometry
> *geo)
> +{
> + /* some standard values */
> + geo->heads = 1 << 6;
> + geo->sectors = 1 << 5;
> + geo->cylinders = get_capacity(bd->bd_disk) >> 11;

Just stuffing the result of get_capacity into the 16-bit 
cylinders field will overflow/wrap on large capacities.
0xFFFF << 11 = 0x7FF_F800 = 64 GiB (68.7 GB)

How many programs still need these meaningless fields?
Could the bogus information be created elsewhere so
each block driver doesn't need to do this?
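
The wrap can be illustrated with a small userspace sketch, assuming
512-byte sectors and the 16-bit cylinders field described above. Both
function names are hypothetical, and the clamped variant is only one
possible mitigation, not something proposed in the patch.

```c
#include <stdint.h>

/*
 * What the quoted assignment effectively stores: the low 16 bits of
 * capacity >> 11 (11 = 6 + 5, i.e. 64 heads * 32 sectors per track).
 */
uint16_t cylinders_truncated(uint64_t capacity_sectors)
{
	return (uint16_t)(capacity_sectors >> 11);
}

/* Hypothetical alternative: saturate at 0xFFFF instead of wrapping. */
uint16_t cylinders_clamped(uint64_t capacity_sectors)
{
	uint64_t cyl = capacity_sectors >> 11;

	return cyl > UINT16_MAX ? UINT16_MAX : (uint16_t)cyl;
}
```

With a capacity of 0x7FFF800 sectors both functions return 0xFFFF. One
cylinder more (1 << 27 sectors, exactly 64 GiB) makes the truncating
version report 0 cylinders, while the clamped version stays at 0xFFFF.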


> + return 0;
> +}
> +
>  /*
>   * direct translation from (pmem,sector) => void*
>   * We do not require that sector be page aligned.
> @@ -176,6 +185,7 @@ out:
> 
>  static const struct block_device_operations pmem_fops = {
>   .owner =THIS_MODULE,
> + .getgeo =   pmem_getgeo,
>  };
> 
>  /* Kernel module stuff */
> --


---
Rob Elliott    HP Server Storage




RE: [PATCH 1/4] pmem: Initial version of persistent memory driver

2014-11-01 Thread Elliott, Robert (Server Storage)

> -----Original Message-----
> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-
> ow...@vger.kernel.org] On Behalf Of Ross Zwisler
> Sent: Wednesday, 27 August, 2014 4:12 PM
> To: Jens Axboe; Matthew Wilcox; Boaz Harrosh; Nick Piggin; linux-
> fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> nvd...@lists.01.org
> Cc: Ross Zwisler
> Subject: [PATCH 1/4] pmem: Initial version of persistent memory
> driver
> 
> PMEM is a new driver that presents a reserved range of memory as a
> block device.  This is useful for developing with NV-DIMMs, and
> can be used with volatile memory as a development platform.
> 
> Signed-off-by: Ross Zwisler 
> ---
>  MAINTAINERS|   6 +
>  drivers/block/Kconfig  |  41 ++
>  drivers/block/Makefile |   1 +
>  drivers/block/pmem.c   | 330
> +
>  4 files changed, 378 insertions(+)
>  create mode 100644 drivers/block/pmem.c
> 
...

> diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
> index 1b8094d..ac52f5a 100644
> --- a/drivers/block/Kconfig
> +++ b/drivers/block/Kconfig
> @@ -404,6 +404,47 @@ config BLK_DEV_RAM_DAX
> and will prevent RAM block device backing store memory from
> being
> allocated from highmem (only a problem for highmem systems).
> 
> +config BLK_DEV_PMEM
> + tristate "Persistent memory block device support"
> + help
> +   Saying Y here will allow you to use a contiguous range of
> reserved
> +   memory as one or more block devices.  Memory for PMEM should
> be
> +   reserved using the "memmap" kernel parameter.
> +
> +   To compile this driver as a module, choose M here: the module
> will be
> +   called pmem.
> +
> +   Most normal users won't need this functionality, and can thus
> say N
> +   here.
> +
> +config BLK_DEV_PMEM_START
> + int "Offset in GiB of where to start claiming space"
> + default "0"
> + depends on BLK_DEV_PMEM
> + help
> +   Starting offset in GiB that PMEM should use when claiming
> memory.  This
> +   memory needs to be reserved from the OS at boot time using
> the
> +   "memmap" kernel parameter.
> +
> +   If you provide PMEM with volatile memory it will act as a
> volatile
> +   RAM disk and your data will not be persistent.
> +
> +config BLK_DEV_PMEM_COUNT
> + int "Default number of PMEM disks"
> + default "4"

For real use I think a default of 1 would be better.

> + depends on BLK_DEV_PMEM
> + help
> +   Number of equal sized block devices that PMEM should create.
> +
> +config BLK_DEV_PMEM_SIZE
> + int "Size in GiB of space to claim"
> + depends on BLK_DEV_PMEM
> + default "0"
> + help
> +   Amount of memory in GiB that PMEM should use when creating
> block
> +   devices.  This memory needs to be reserved from the OS at
> +   boot time using the "memmap" kernel parameter.
> +
>  config CDROM_PKTCDVD
>   tristate "Packet writing on CD/DVD media"
>   depends on !UML


...

> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
> new file mode 100644
> index 000..d366b9b
> --- /dev/null
> +++ b/drivers/block/pmem.c
> @@ -0,0 +1,330 @@
> +/*
> + * Persistent Memory Driver
> + * Copyright (c) 2014, Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or
> modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> License for
> + * more details.
> + *
> + * This driver is heavily based on drivers/block/brd.c.
> + * Copyright (C) 2007 Nick Piggin
> + * Copyright (C) 2007 Novell Inc.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SECTOR_SHIFT 9
> +#define PAGE_SECTORS_SHIFT   (PAGE_SHIFT - SECTOR_SHIFT)
> +#define PAGE_SECTORS (1 << PAGE_SECTORS_SHIFT)
> +
> +/*
> + * driver-wide physical address and total_size - one single,
> contiguous memory
> + * region that we divide up in to same-sized devices
> + */
> +phys_addr_t  phys_addr;
> +void *virt_addr;
> +size_t   total_size;
> +
> +struct pmem_device {
> + struct request_queue*pmem_queue;
> + struct gendisk  *pmem_disk;
> + struct list_headpmem_list;
> +
> + phys_addr_t phys_addr;
> + void*virt_addr;
> + size_t  size;
> +};
> +
> +/*
> + * direct translation from (pmem,sector) => void*
> + * We do not require that sector be page aligned.
> + * The return value will point to the beginning of the page
> containing the
> + * given sector, not to the

[patch 1/3] mm: embed the memcg pointer directly into struct page

2014-11-01 Thread Johannes Weiner

Memory cgroups used to have 5 per-page pointers.  To allow users to
disable that amount of overhead during runtime, those pointers were
allocated in a separate array, with a translation layer between them
and struct page.

There is now only one page pointer remaining: the memcg pointer, that
indicates which cgroup the page is associated with when charged.  The
complexity of runtime allocation and the runtime translation overhead
is no longer justified to save that *potential* 0.19% of memory.  With
CONFIG_SLUB, page->mem_cgroup actually sits in the doubleword padding
after the page->private member and doesn't even increase struct page,
and then this patch actually saves space.  Remaining users that care
can still compile their kernels without CONFIG_MEMCG.
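
The 0.19% figure can be reproduced with a short calculation, assuming
8-byte pointers and 4 KiB pages (the usual x86-64 values; the patch
description does not state them). The function names below are
hypothetical and exist only for this illustration.

```c
/* Percentage of memory consumed by pointer_bytes of metadata per page. */
double per_page_overhead_pct(unsigned long pointer_bytes,
			     unsigned long page_bytes)
{
	return 100.0 * (double)pointer_bytes / (double)page_bytes;
}

/* Total metadata bytes for ram_bytes of memory, one pointer per page. */
unsigned long long per_page_overhead_total(unsigned long long ram_bytes,
					   unsigned long page_bytes,
					   unsigned long pointer_bytes)
{
	return ram_bytes / page_bytes * pointer_bytes;
}
```

Under those assumptions, per_page_overhead_pct(8, 4096) gives about
0.195, and per_page_overhead_total(1ULL << 30, 4096, 8) gives 2 MiB of
pointers per GiB of RAM.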

   text    data     bss      dec    hex filename
8828345 1725264  983040 11536649 b00909 vmlinux.old
8827425 1725264  966656 11519345 afc571 vmlinux.new

Signed-off-by: Johannes Weiner 
---
 include/linux/memcontrol.h  |   6 +-
 include/linux/mm_types.h|   5 +
 include/linux/mmzone.h  |  12 --
 include/linux/page_cgroup.h |  53 
 init/main.c |   7 -
 mm/memcontrol.c | 124 +
 mm/page_alloc.c |   2 -
 mm/page_cgroup.c| 319 
 8 files changed, 41 insertions(+), 487 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d4575a1d6e99..dafba59b31b4 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -25,7 +25,6 @@
 #include 
 
 struct mem_cgroup;
-struct page_cgroup;
 struct page;
 struct mm_struct;
 struct kmem_cache;
@@ -466,8 +465,6 @@ memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup 
**memcg, int order)
  * memcg_kmem_uncharge_pages: uncharge pages from memcg
  * @page: pointer to struct page being freed
  * @order: allocation order.
- *
- * there is no need to specify memcg here, since it is embedded in page_cgroup
  */
 static inline void
 memcg_kmem_uncharge_pages(struct page *page, int order)
@@ -484,8 +481,7 @@ memcg_kmem_uncharge_pages(struct page *page, int order)
  *
  * Needs to be called after memcg_kmem_newpage_charge, regardless of success or
  * failure of the allocation. if @page is NULL, this function will revert the
- * charges. Otherwise, it will commit the memcg given by @memcg to the
- * corresponding page_cgroup.
+ * charges. Otherwise, it will commit @page to @memcg.
  */
 static inline void
 memcg_kmem_commit_charge(struct page *page, struct mem_cgroup *memcg, int 
order)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index de183328abb0..57e47dffbdda 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -23,6 +23,7 @@
 #define AT_VECTOR_SIZE (2*(AT_VECTOR_SIZE_ARCH + AT_VECTOR_SIZE_BASE + 1))
 
 struct address_space;
+struct mem_cgroup;
 
 #define USE_SPLIT_PTE_PTLOCKS  (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)
 #define USE_SPLIT_PMD_PTLOCKS  (USE_SPLIT_PTE_PTLOCKS && \
@@ -168,6 +169,10 @@ struct page {
struct page *first_page;/* Compound tail pages */
};
 
+#ifdef CONFIG_MEMCG
+   struct mem_cgroup *mem_cgroup;
+#endif
+
/*
 * On machines where all RAM is mapped into kernel address space,
 * we can simply calculate the virtual address. On machines with
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 48bf12ef6620..de32d936 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -713,9 +713,6 @@ typedef struct pglist_data {
int nr_zones;
 #ifdef CONFIG_FLAT_NODE_MEM_MAP/* means !SPARSEMEM */
struct page *node_mem_map;
-#ifdef CONFIG_MEMCG
-   struct page_cgroup *node_page_cgroup;
-#endif
 #endif
 #ifndef CONFIG_NO_BOOTMEM
struct bootmem_data *bdata;
@@ -1069,7 +1066,6 @@ static inline unsigned long early_pfn_to_nid(unsigned 
long pfn)
 #define SECTION_ALIGN_DOWN(pfn)((pfn) & PAGE_SECTION_MASK)
 
 struct page;
-struct page_cgroup;
 struct mem_section {
/*
 * This is, logically, a pointer to an array of struct
@@ -1087,14 +1083,6 @@ struct mem_section {
 
/* See declaration of similar field in struct zone */
unsigned long *pageblock_flags;
-#ifdef CONFIG_MEMCG
-   /*
-* If !SPARSEMEM, pgdat doesn't have page_cgroup pointer. We use
-* section. (see memcontrol.h/page_cgroup.h about this.)
-*/
-   struct page_cgroup *page_cgroup;
-   unsigned long pad;
-#endif
/*
 * WARNING: mem_section must be a power-of-2 in size for the
 * calculation and use of SECTION_ROOT_MASK to make sense.
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 1289be6b436c..65be35785c86 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -1,59 +1,6 @@
 #ifndef __LINUX_PAGE_CGROUP_H
 #define __LINUX_PAGE_CGROUP_H
 
-struct pglist_data;
-
-#ifdef CONFIG_MEMCG
-struct

[patch 2/3] mm: page_cgroup: rename file to mm/swap_cgroup.c

2014-11-01 Thread Johannes Weiner

Now that the external page_cgroup data structure and its lookup is
gone, the only code remaining in there is swap slot accounting.

Rename it and move the conditional compilation into mm/Makefile.

Signed-off-by: Johannes Weiner 
---
 MAINTAINERS |   2 +-
 include/linux/page_cgroup.h |  40 -
 include/linux/swap_cgroup.h |  42 +
 mm/Makefile |   3 +-
 mm/memcontrol.c |   2 +-
 mm/page_cgroup.c| 211 
 mm/swap_cgroup.c| 208 +++
 mm/swap_state.c |   1 -
 mm/swapfile.c   |   2 +-
 9 files changed, 255 insertions(+), 256 deletions(-)
 delete mode 100644 include/linux/page_cgroup.h
 create mode 100644 include/linux/swap_cgroup.h
 delete mode 100644 mm/page_cgroup.c
 create mode 100644 mm/swap_cgroup.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7e31be07197e..3a60389d3a13 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2583,7 +2583,7 @@ L:cgro...@vger.kernel.org
 L: linux...@kvack.org
 S: Maintained
 F: mm/memcontrol.c
-F: mm/page_cgroup.c
+F: mm/swap_cgroup.c
 
 CORETEMP HARDWARE MONITORING DRIVER
 M: Fenghua Yu 
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
deleted file mode 100644
index 65be35785c86..
--- a/include/linux/page_cgroup.h
+++ /dev/null
@@ -1,40 +0,0 @@
-#ifndef __LINUX_PAGE_CGROUP_H
-#define __LINUX_PAGE_CGROUP_H
-
-#include 
-
-#ifdef CONFIG_MEMCG_SWAP
-extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
-   unsigned short old, unsigned short new);
-extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id);
-extern unsigned short lookup_swap_cgroup_id(swp_entry_t ent);
-extern int swap_cgroup_swapon(int type, unsigned long max_pages);
-extern void swap_cgroup_swapoff(int type);
-#else
-
-static inline
-unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id)
-{
-   return 0;
-}
-
-static inline
-unsigned short lookup_swap_cgroup_id(swp_entry_t ent)
-{
-   return 0;
-}
-
-static inline int
-swap_cgroup_swapon(int type, unsigned long max_pages)
-{
-   return 0;
-}
-
-static inline void swap_cgroup_swapoff(int type)
-{
-   return;
-}
-
-#endif /* CONFIG_MEMCG_SWAP */
-
-#endif /* __LINUX_PAGE_CGROUP_H */
diff --git a/include/linux/swap_cgroup.h b/include/linux/swap_cgroup.h
new file mode 100644
index ..145306bdc92f
--- /dev/null
+++ b/include/linux/swap_cgroup.h
@@ -0,0 +1,42 @@
+#ifndef __LINUX_SWAP_CGROUP_H
+#define __LINUX_SWAP_CGROUP_H
+
+#include 
+
+#ifdef CONFIG_MEMCG_SWAP
+
+extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
+   unsigned short old, unsigned short new);
+extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id);
+extern unsigned short lookup_swap_cgroup_id(swp_entry_t ent);
+extern int swap_cgroup_swapon(int type, unsigned long max_pages);
+extern void swap_cgroup_swapoff(int type);
+
+#else
+
+static inline
+unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id)
+{
+   return 0;
+}
+
+static inline
+unsigned short lookup_swap_cgroup_id(swp_entry_t ent)
+{
+   return 0;
+}
+
+static inline int
+swap_cgroup_swapon(int type, unsigned long max_pages)
+{
+   return 0;
+}
+
+static inline void swap_cgroup_swapoff(int type)
+{
+   return;
+}
+
+#endif /* CONFIG_MEMCG_SWAP */
+
+#endif /* __LINUX_SWAP_CGROUP_H */
diff --git a/mm/Makefile b/mm/Makefile
index 27ddb80403a9..d9d579484f15 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -56,7 +56,8 @@ obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o
 obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
-obj-$(CONFIG_MEMCG) += memcontrol.o page_cgroup.o vmpressure.o
+obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
+obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o
 obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
 obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dc5e0abb18cb..fbb41a170eae 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -51,7 +51,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
deleted file mode 100644
index f0f31c1d4d0c..
--- a/mm/page_cgroup.c
+++ /dev/null
@@ -1,211 +0,0 @@
-#include 
-#include 
-#include 
-#include 
-
-#ifdef CONFIG_MEMCG_SWAP
-
-static DEFINE_MUTEX(swap_cgroup_mutex);
-struct swap_cgroup_ctrl {
-   struct page **map;
-   unsigned long length;
-   spinlock_t  lock;
-};
-
-static struct swap_cgroup_ctrl swap_cgroup_ctrl[MAX_SWAPFILES];
-
-struct swap_cgroup {
-   unsigned short  id;
-};
-#define SC_PER_PAGE(PAGE_SIZE/sizeof(struct swap_cgroup))
-

[patch 3/3] mm: move page->mem_cgroup bad page handling into generic code

2014-11-01 Thread Johannes Weiner

Now that the external page_cgroup data structure and its lookup is
gone, let the generic bad_page() check for page->mem_cgroup sanity.

Signed-off-by: Johannes Weiner 
---
 include/linux/memcontrol.h |  4 
 mm/debug.c |  5 -
 mm/memcontrol.c| 15 ---
 mm/page_alloc.c| 12 
 4 files changed, 12 insertions(+), 24 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index dafba59b31b4..e789551d4db0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -173,10 +173,6 @@ static inline void mem_cgroup_count_vm_event(struct 
mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
-#ifdef CONFIG_DEBUG_VM
-bool mem_cgroup_bad_page_check(struct page *page);
-void mem_cgroup_print_bad_page(struct page *page);
-#endif
 #else /* CONFIG_MEMCG */
 struct mem_cgroup;
 
diff --git a/mm/debug.c b/mm/debug.c
index 5ce45c9a29b5..0e58f3211f89 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -95,7 +95,10 @@ void dump_page_badflags(struct page *page, const char 
*reason,
dump_flags(page->flags & badflags,
pageflag_names, ARRAY_SIZE(pageflag_names));
}
-   mem_cgroup_print_bad_page(page);
+#ifdef CONFIG_MEMCG
+   if (page->mem_cgroup)
+   pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup);
+#endif
 }
 
 void dump_page(struct page *page, const char *reason)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fbb41a170eae..3645641513a1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3157,21 +3157,6 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
 }
 #endif
 
-#ifdef CONFIG_DEBUG_VM
-bool mem_cgroup_bad_page_check(struct page *page)
-{
-   if (mem_cgroup_disabled())
-   return false;
-
-   return page->mem_cgroup != NULL;
-}
-
-void mem_cgroup_print_bad_page(struct page *page)
-{
-   pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup);
-}
-#endif
-
 static DEFINE_MUTEX(memcg_limit_mutex);
 
 static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6a952237a677..161da09fcda2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -653,8 +653,10 @@ static inline int free_pages_check(struct page *page)
bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
bad_flags = PAGE_FLAGS_CHECK_AT_FREE;
}
-   if (unlikely(mem_cgroup_bad_page_check(page)))
-   bad_reason = "cgroup check failed";
+#ifdef CONFIG_MEMCG
+   if (unlikely(page->mem_cgroup))
+   bad_reason = "page still charged to cgroup";
+#endif
if (unlikely(bad_reason)) {
bad_page(page, bad_reason, bad_flags);
return 1;
@@ -920,8 +922,10 @@ static inline int check_new_page(struct page *page)
bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
}
-   if (unlikely(mem_cgroup_bad_page_check(page)))
-   bad_reason = "cgroup check failed";
+#ifdef CONFIG_MEMCG
+   if (unlikely(page->mem_cgroup))
+   bad_reason = "page still charged to cgroup";
+#endif
if (unlikely(bad_reason)) {
bad_page(page, bad_reason, bad_flags);
return 1;
-- 
2.1.3
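
For readers who want to see the shape of the new check outside the kernel, here is a minimal userspace sketch. It assumes a cut-down `struct demo_page` and a made-up flag mask `DEMO_FLAGS_CHECK_AT_FREE`; neither exists in the kernel, and the real free_pages_check() tests several more conditions than this.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the two fields used here. */
struct demo_page {
    unsigned long flags;
    void *mem_cgroup;   /* non-NULL while the page is charged to a cgroup */
};

/* Hypothetical flag mask, standing in for PAGE_FLAGS_CHECK_AT_FREE. */
#define DEMO_FLAGS_CHECK_AT_FREE 0x1UL

/*
 * Returns NULL when the page looks sane, otherwise a string describing
 * the last failed check, mirroring how bad_reason is built up in the patch.
 */
static const char *demo_free_pages_check(const struct demo_page *page)
{
    const char *bad_reason = NULL;

    if (page->flags & DEMO_FLAGS_CHECK_AT_FREE)
        bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
    /* The generic code looks at the pointer directly, no memcg helper. */
    if (page->mem_cgroup)
        bad_reason = "page still charged to cgroup";

    return bad_reason;
}
```

The point of the patch is visible in the second test: once the pointer lives in the page itself, the generic code can look at it without calling into memcontrol.c.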


[patch] mm: memcontrol: remove stale page_cgroup_lock comment

2014-11-01 Thread Johannes Weiner

There is no cgroup-specific page lock anymore.

Signed-off-by: Johannes Weiner 
---
 mm/memcontrol.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 38f0647a2f12..d20928597a07 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2467,10 +2467,6 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
int isolated;
 
VM_BUG_ON_PAGE(pc->mem_cgroup, page);
-   /*
-* we don't need page_cgroup_lock about tail pages, becase they are not
-* accessed by any other context at this point.
-*/
 
/*
 * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page
-- 
2.1.3


Re: [PATCH 0/5] irqchip: atmel-aic: add RTT irq fixup

2014-11-01 Thread Jason Cooper

On Mon, Sep 15, 2014 at 09:53:56PM +0200, Boris BREZILLON wrote:
> On Mon, 15 Sep 2014 21:49:07 +0200
> Boris BREZILLON  wrote:
> 
> > Hi,
> > 
> > On Sun, 14 Sep 2014 01:46:48 -0400
> > Jason Cooper  wrote:
> > 
> > > On Wed, Sep 03, 2014 at 11:07:46AM +0200, Boris BREZILLON wrote:
> > > > The series depends on the acceptance of the RTT DT bindings proposed
> > > > here [1].
> > > > 
> > > > Best Regards,
> > > > 
> > > > Boris
> > > > 
> > > > [1]https://lkml.org/lkml/2014/9/3/145
> > > 
> > > Hmm, the bindings, unfortunately, still seem to be in flux.  Any update?
> > 
> > Actually, all at91 developers seemed to agree on this binding [2], and
> > these changes do not impact the current series.
> > 
> > Anyway, I think it's safer to wait for an official approval before
> > taking this series into irqchip/next.

Care to resend against v3.18-rc1?

thx,

Jason.

Re: [PATCH v2] irqchip: dw-apb-ictl: select GENERIC_IRQ_CHIP

2014-11-01 Thread Jason Cooper

Jisheng,

On Wed, Oct 22, 2014 at 08:59:10PM +0800, Jisheng Zhang wrote:
> The dw-apb-ictl driver uses the generic-chip functions.
> Thus it needs to select GENERIC_IRQ_CHIP in Kconfig.
> 
> Change-Id: If748beffc4e0d0b47062bb067a59c10994a9148b

Please don't include this in future patches.

> Signed-off-by: Jisheng Zhang 
> ---
>  drivers/irqchip/Kconfig | 1 +
>  1 file changed, 1 insertion(+)

Applied to irqchip/core

thx,

Jason.

[PATCH v2] arch: tile: kernel: signal.c: Use __copy_from/to_user() instead of __get/put_user()

2014-11-01 Thread Chen Gang

setup_sigcontext() and restore_sigcontext() copy all the related registers
between user and kernel space, so use one block copy instead of copying
each register in a loop. This makes the code simpler and clearer, and it
also avoids a compiler warning.

The related warning (with allmodconfig under tile):

CC  arch/tile/kernel/signal.o
  In file included from include/linux/poll.h:11:0,
   from include/linux/ring_buffer.h:7,
   from include/linux/ftrace_event.h:5,
   from include/trace/syscall.h:6,
   from include/linux/syscalls.h:81,
   from arch/tile/kernel/signal.c:30:
  arch/tile/kernel/signal.c: In function 'setup_sigcontext':
  arch/tile/kernel/signal.c:116:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
 err |= __put_user(regs->regs[i], &sc->gregs[i]);
 ^
  ./arch/tile/include/asm/uaccess.h:236:26: note: in definition of macro '__put_user_asm'
  : "r" (ptr), "r" (x), "i" (-EFAULT))
^
  ./arch/tile/include/asm/uaccess.h:297:10: note: in expansion of macro '__put_user_8'
case 8: __put_user_8(x, ptr, __ret); break;   \
^
  arch/tile/kernel/signal.c:116:10: note: in expansion of macro '__put_user'
 err |= __put_user(regs->regs[i], &sc->gregs[i]);
^
  arch/tile/kernel/signal.c:115:2: note: containing loop
for (i = 0; i < sizeof(struct pt_regs)/sizeof(long); ++i)
^

Signed-off-by: Chen Gang 
---
 arch/tile/kernel/signal.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/tile/kernel/signal.c b/arch/tile/kernel/signal.c
index 7c2fecc..f867783 100644
--- a/arch/tile/kernel/signal.c
+++ b/arch/tile/kernel/signal.c
@@ -46,7 +46,6 @@ int restore_sigcontext(struct pt_regs *regs,
   struct sigcontext __user *sc)
 {
int err = 0;
-   int i;
 
/* Always make any pending restarted system calls return -EINTR */
current_thread_info()->restart_block.fn = do_no_restart_syscall;
@@ -57,9 +56,7 @@ int restore_sigcontext(struct pt_regs *regs,
 */
BUILD_BUG_ON(sizeof(struct sigcontext) != sizeof(struct pt_regs));
BUILD_BUG_ON(sizeof(struct sigcontext) % 8 != 0);
-
-   for (i = 0; i < sizeof(struct pt_regs)/sizeof(long); ++i)
-   err |= __get_user(regs->regs[i], &sc->gregs[i]);
+   err = __copy_from_user(regs, sc, sizeof(*regs));
 
/* Ensure that the PL is always set to USER_PL. */
regs->ex1 = PL_ICS_EX1(USER_PL, EX1_ICS(regs->ex1));
@@ -110,12 +107,7 @@ badframe:
 
 int setup_sigcontext(struct sigcontext __user *sc, struct pt_regs *regs)
 {
-   int i, err = 0;
-
-   for (i = 0; i < sizeof(struct pt_regs)/sizeof(long); ++i)
-   err |= __put_user(regs->regs[i], &sc->gregs[i]);
-
-   return err;
+   return  __copy_to_user(sc, regs, sizeof(*regs));
 }
 
 /*
-- 
1.9.3
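
To illustrate why the block copy is preferable, here is a small userspace sketch. It assumes two made-up structures, `demo_pt_regs` and `demo_sigcontext`, with far fewer members than the real tile ones, and it uses plain memcpy() where the kernel patch uses __copy_to_user() and __copy_from_user(). The old loop reached the trailing members by indexing past the end of the register array, which is what the compiler complained about; copying sizeof(*regs) bytes covers the same memory without any out-of-bounds index.

```c
#include <string.h>

/* Hypothetical, shrunken stand-ins for struct pt_regs and struct sigcontext. */
struct demo_pt_regs {
    unsigned long regs[4];  /* general-purpose registers */
    unsigned long pc;       /* trailing members the old loop reached by */
    unsigned long ex1;      /* running the index past the end of regs[] */
};

struct demo_sigcontext {
    unsigned long gregs[4];
    unsigned long pc;
    unsigned long ex1;
};

/* Same idea as the BUILD_BUG_ON() in the patch: layouts must match in size. */
_Static_assert(sizeof(struct demo_sigcontext) == sizeof(struct demo_pt_regs),
               "sigcontext and pt_regs must have the same size");

/* One block copy; memcpy() stands in for __copy_to_user(). */
static int demo_setup_sigcontext(struct demo_sigcontext *sc,
                                 const struct demo_pt_regs *regs)
{
    memcpy(sc, regs, sizeof(*regs));
    return 0;
}

/* One block copy; memcpy() stands in for __copy_from_user(). */
static int demo_restore_sigcontext(struct demo_pt_regs *regs,
                                   const struct demo_sigcontext *sc)
{
    memcpy(regs, sc, sizeof(*regs));
    return 0;
}
```

Unlike memcpy(), the kernel helpers return the number of bytes that could not be copied, so a non-zero return still signals a fault to the caller.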

Re: [PATCH] arch: tile: kernel: signal.c: Use explicitly type case "unsigned long *" for register copy

2014-11-01 Thread Chen Gang

On 11/2/14 4:23, Al Viro wrote:
> On Sat, Nov 01, 2014 at 08:49:45PM +0800, Chen Gang wrote:
>> setup_sigcontext() wants to copy all kernel-related registers to user
>> space. Copy them explicitly instead of indexing past the end of the
>> member array, which makes the code clearer and avoids the warning.
> 
> Er...  Perhaps it would be better to avoid that shite completely and just
> use __copy_to_user() instead of bothering with loops?
> 

OK, thanks, I shall send patch v2 for it.

I will also use __copy_from_user() in place of the loop in restore_sigcontext().

Thanks
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

[GIT PULL] irqchip: Fixes for v3.18

2014-11-01 Thread Jason Cooper

Thomas,

Here's a couple of fixes for v3.18 that have been in -next longer than
needed :-/

Please pull.

thx,

Jason.


The following changes since commit f114040e3ea6e07372334ade75d1ee0775c355e1:

  Linux 3.18-rc1 (2014-10-19 18:08:38 -0700)

are available in the git repository at:

  git://git.infradead.org/users/jcooper/linux.git tags/irqchip-urgent-3.18

for you to fetch changes up to 758e8366754d3fa57da978fef9d2c652f7b55c02:

  irqchip: armada-370-xp: Fix MPIC interrupt handling (2014-11-02 01:31:10 
+)


irqchip urgent changes for v3.18

 - armada-370-xp
- Fix MSI and MPIC interrupt handling.


Grzegorz Jaszczyk (2):
  irqchip: armada-370-xp: Fix MSI interrupt handling
  irqchip: armada-370-xp: Fix MPIC interrupt handling

 drivers/irqchip/irq-armada-370-xp.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

Re: kdbus: add documentation

2014-11-01 Thread Greg Kroah-Hartman

On Thu, Oct 30, 2014 at 01:20:23PM +0100, Peter Meerwald wrote:
> 
> > kdbus is a system for low-latency, low-overhead, easy to use
> > interprocess communication (IPC).
> > 
> > The interface to all functions in this driver is implemented through ioctls
> > on /dev nodes.  This patch adds detailed documentation about the kernel
> > level API design.
> 
> just some typos below



Many thanks for the fixes, I've made them all to the file now, it will
show up in the next version we send out.

greg k-h

Re: [PATCH 00/12] Add kdbus implementation

2014-11-01 Thread Greg Kroah-Hartman

On Thu, Oct 30, 2014 at 12:00:16AM +0100, Jiri Kosina wrote:
> On Wed, 29 Oct 2014, Greg Kroah-Hartman wrote:
> 
> > kdbus is a kernel-level IPC implementation that aims for resemblance to
> > the protocol layer with the existing userspace D-Bus daemon while
> > enabling some features that couldn't be implemented before in userspace.
> 
> I'd be interested in the features that can't be implemented in userspace 
> (and therefore would justify existence of kdbus in the kernel). Could you 
> please point me to such list / documentation?
> 
> It seems to me that most of the highlight features from the cover letter 
> can be "easily" (for certain definition of that word, of course) 
> implemented in userspace (vmsplice(), sending fd through unix socket, user 
> namespaces, UUID management, etc).

Sorry for the long delay in getting back to this, I'm battling a bad
case of jet-lag at the moment...

Here are some reasons why I feel it is better to have kdbus in the kernel
rather than trying to implement the same thing in a userspace daemon:

- performance: fewer process context switches, fewer copies, fewer
  syscalls, larger memory chunks via memfd.  This is really important
  for a whole class of userspace programs that are ported from other
  operating systems that are run on tiny ARM systems that rely on
  hundreds of thousands of messages passed at boot time, and at
  "critical" times in their user interaction loops.
- security: the peers which communicate do not have to trust each other,
  as the only trustworthy component in the game is the kernel which
  adds metadata and ensures that all data passed as payload is either
  copied or sealed, so that the receiver can parse the data without
  having to protect against changing memory while parsing buffers.  Also,
  all the data transfer is controlled by the kernel, so that LSMs can
  track and control what is going on, without involving userspace.
  Because of the LSM issue, security people are much happier with this
  model than the current scheme of having to hook into dbus to mediate
  things.
- more metadata can be attached to messages than in userspace
- semantics for apps with heavy data payloads (media apps, for instance)
  with optional priority message dequeuing, and global message ordering.
  Some "crazy" people are playing with using kdbus for audio data in the
  system.  I'm not saying that this is the best model for this, but
  until now, there wasn't any other way to do this without having to
  create custom "busses", one for each application library.
- being in the kernel closes a lot of races which can't be fixed with
  the current userspace solutions.  For example, with kdbus, there is a
  way a client can disconnect from a bus, but do so only if no further
  messages are present in its queue, which is crucial for implementing
  race-free "exit-on-idle" services.
- eavesdropping on the kernel level, so privileged users can hook into
  the message stream without hacking support for that into their
  userspace processes
- a number of smaller benefits: for example kdbus learned a way to peek
  full messages without dequeuing them, which is really useful for
  logging metadata when handling bus-activation requests.

Of course, some of the bits above could be implemented in userspace
alone, for example with more sophisticated memory management APIs, but
this is usually done by losing out on the other details.  For example,
for many of the memory management APIs, it's hard to not require the
communicating peers to fully trust each other.  And we _really_ don't
want peers to have to trust each other.

Another benefit of having this in the kernel, rather than as a userspace
daemon, is that you can now easily use the bus from the initrd, or up to
the very end when the system shuts down.  On current userspace D-Bus,
this is not really possible, as this requires passing the bus instance
around between initrd and the "real" system.  Such a transition of all
fds also requires keeping full state of what has already been read from
the connection fds.  kdbus makes this much simpler, as we can change the
ownership of the bus, just by passing one fd over from one part to the
other.

Regarding binder: binder and kdbus follow very different design
concepts.  Binder implies the use of thread-pools to dispatch incoming
method calls.  This is a very efficient scheme, and completely natural
in programming languages like Java.  On most Linux programs, however,
there's a much stronger focus on central poll() loops that dispatch all
sources a program cares about.  kdbus is much more usable in such
environments, as it doesn't enforce a threading model, and it is happy
with serialized dispatching.  In fact, this major difference had an
effect on much of the design decisions: binder does not guarantee global
message ordering due to the parallel dispatching in the thread-pools,
but kdbus does.  Moreover, there's also a difference in the way message
handling.  In kdbus,

[PATCH V3 00/14] genirq endian fixes; bcm7120/brcmstb IRQ updates

2014-11-01 Thread Kevin Cernekee

V2->V3:

 - Move updated irq_reg_{readl,writel} functions back into <linux/irq.h>
   so they can be called by irqchip drivers

 - Add gc->reg_{readl,writel} function pointers so that irqchip
   drivers like arch/sh/boards/mach-se/{7343,7722}/irq.c can override them

 - CC: linux-sh list in lieu of Paul's defunct linux-sh.org email address

 - Fix handling of zero L2 status in bcm7120-l2.c

 - Rebase on Linus' head of tree

 - Drop GENERIC_CHIP / GENERIC_CHIP_BE compile-time optimizations

For the latter item, I ran a quick benchmark to see if the extra
indirection in irq_reg_{readl,writel} had any perceptible effect on
register access times.  The MIPS BE case did show a small performance
hit from using the read wrapper, but on ARM LE the only differences
were attributed to the presence/absence of a barrier:


BCM3384 (UBUS architecture, MIPS BE, IRQ_GC_BE_IO):

irq_reg_readl   : 207 ns
readl   : 186 ns
__raw_readl : 186 ns
ioread32be  : 195 ns

irq_reg_writel  : 177 ns
writel  : 177 ns
__raw_writel: 177 ns
iowrite32be : 177 ns


BCM7445 (GISB architecture, ARM LE, standard LE readl):

irq_reg_readl   : 519 ns
readl   : 519 ns
__raw_readl : 482 ns
ioread32be  : 519 ns

irq_reg_writel  : 500 ns
writel  : 500 ns
__raw_writel: 482 ns
iowrite32be : 500 ns


Test code (do not merge):

-- 8< --

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index e7c6155..fcbe8e8 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -14,6 +14,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -21,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -120,6 +123,8 @@ static int bcm7120_l2_intc_init_one(struct device_node *dn,
return 0;
 }
 
+static struct irq_chip_generic *some_gc;
+
 int __init bcm7120_l2_intc_of_init(struct device_node *dn,
struct device_node *parent)
 {
@@ -213,6 +218,7 @@ int __init bcm7120_l2_intc_of_init(struct device_node *dn,
for (idx = 0; idx < data->n_words; idx++) {
irq = idx * IRQS_PER_WORD;
gc = irq_get_domain_generic_chip(data->domain, irq);
+   some_gc = gc;
 
gc->unused = 0x & ~data->irq_map_mask[idx];
gc->reg_base = data->base[idx];
@@ -253,3 +259,58 @@ out_unmap:
 }
 IRQCHIP_DECLARE(bcm7120_l2_intc, "brcm,bcm7120-l2-intc",
bcm7120_l2_intc_of_init);
+
+static const int iterations = 1000;
+
+static void print_elapsed(const char *tag, ktime_t start)
+{
+   printk("%-20s: %lld ns\n", tag,
+   div64_u64(ktime_to_ns(ktime_sub(ktime_get(), start)),
+ iterations));
+}
+
+static int __init reg_timetest(void)
+{
+   int i;
+   ktime_t start;
+   struct irq_chip_generic *gc = some_gc;
+
+   local_irq_disable();
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   irq_reg_readl(gc, IRQSTAT);
+   print_elapsed("irq_reg_readl", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   readl(gc->reg_base + IRQSTAT);
+   print_elapsed("readl", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   __raw_readl(gc->reg_base + IRQSTAT);
+   print_elapsed("__raw_readl", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   ioread32be(gc->reg_base + IRQSTAT);
+   print_elapsed("ioread32be", start);
+
+   printk("\n");
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   irq_reg_writel(gc, 0, IRQSTAT);
+   print_elapsed("irq_reg_writel", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   writel(0, gc->reg_base + IRQSTAT);
+   print_elapsed("writel", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   __raw_writel(0, gc->reg_base + IRQSTAT);
+   print_elapsed("__raw_writel", start);
+
+   for (start = ktime_get(), i = 0; i < iterations; i++)
+   iowrite32be(0, gc->reg_base + IRQSTAT);
+   print_elapsed("iowrite32be", start);
+   local_irq_enable();
+
+   return 0;
+}
+device_initcall(reg_timetest);

-- 8< --

Kevin Cernekee (14):
  sh: Eliminate unused irq_reg_{readl,writel} accessors
  genirq: Generic chip: Change irq_reg_{readl,writel} arguments
  genirq: Generic chip: Allow irqchip drivers to override
irq_reg_{readl,writel}
  genirq: Generic chip: Add big endian I/O accessors
  irqchip: brcmstb-l2: Eliminate dependency on ARM code
  irqchip: bcm7120-l2: Eliminate bad IRQ check
  irqchip: bcm7120-l2, brcmstb-l2: Remove ARM Kconfig dependency
  irqchip: bcm7120-l2: Make sure all register accesses use base+offset
  irqchip: bcm7120-l2: Fix missing nibble in

[PATCH V3 05/14] irqchip: brcmstb-l2: Eliminate dependency on ARM code

2014-11-01 Thread Kevin Cernekee

The irq-brcmstb-l2 driver has a single dependency on the ARM code, the
do_bad_IRQ macro.  Expand this macro in-place so that the driver can be
built on non-ARM platforms.

Signed-off-by: Kevin Cernekee 
Acked-by: Arnd Bergmann 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/irq-brcmstb-l2.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
index c15c840..c9bdf20 100644
--- a/drivers/irqchip/irq-brcmstb-l2.c
+++ b/drivers/irqchip/irq-brcmstb-l2.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -30,8 +31,6 @@
 #include 
 #include 
 
-#include 
-
 #include "irqchip.h"
 
 /* Register offsets in the L2 interrupt controller */
@@ -63,7 +62,9 @@ static void brcmstb_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
~(__raw_readl(b->base + CPU_MASK_STATUS));
 
if (status == 0) {
-   do_bad_IRQ(irq, desc);
+   raw_spin_lock(&desc->lock);
+   handle_bad_irq(irq, desc);
+   raw_spin_unlock(&desc->lock);
goto out;
}
 
-- 
2.1.1


[PATCH V3 02/14] genirq: Generic chip: Change irq_reg_{readl,writel} arguments

2014-11-01 Thread Kevin Cernekee

Pass in the irq_chip_generic struct so we can use different readl/writel
settings for each irqchip driver, when appropriate.  Compute
(gc->reg_base + reg_offset) in the helper function because this is pretty
much what all callers want to do anyway.
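
As a rough illustration of the calling-convention change, here is a userspace sketch. It assumes a made-up `struct demo_gc` that carries only a base pointer, and it treats a byte buffer as the register window; the real helpers operate on `struct irq_chip_generic` and `__iomem` addresses.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct irq_chip_generic: just the base pointer. */
struct demo_gc {
    uint8_t *reg_base;
};

/* Old style: every caller computes base + offset before calling. */
static uint32_t demo_readl_addr(const uint8_t *addr)
{
    uint32_t val;

    memcpy(&val, addr, sizeof(val));    /* native-endian 32-bit load */
    return val;
}

/* New style: the helper takes the chip and a register offset. */
static uint32_t demo_reg_readl(const struct demo_gc *gc, unsigned int reg_offset)
{
    return demo_readl_addr(gc->reg_base + reg_offset);
}

/* New style write: chip first, then value, then register offset. */
static void demo_reg_writel(struct demo_gc *gc, uint32_t val,
                            unsigned int reg_offset)
{
    memcpy(gc->reg_base + reg_offset, &val, sizeof(val));
}
```

Because the helper now sees the chip structure, a later patch can choose a different accessor per chip without touching any call site.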

Compile-tested using the following configurations:

at91_dt_defconfig (CONFIG_ATMEL_AIC_IRQ=y)
sama5_defconfig (CONFIG_ATMEL_AIC5_IRQ=y)
sunxi_defconfig (CONFIG_ARCH_SUNXI=y)

tb10x (ARC) is untested.

Signed-off-by: Kevin Cernekee 
---
 drivers/irqchip/irq-atmel-aic.c  | 40 -
 drivers/irqchip/irq-atmel-aic5.c | 65 +++-
 drivers/irqchip/irq-sunxi-nmi.c  |  4 +--
 drivers/irqchip/irq-tb10x.c  |  4 +--
 include/linux/irq.h  | 19 +++-
 kernel/irq/generic-chip.c| 20 ++---
 6 files changed, 77 insertions(+), 75 deletions(-)

diff --git a/drivers/irqchip/irq-atmel-aic.c b/drivers/irqchip/irq-atmel-aic.c
index 9a2cf3c..27fdd8c 100644
--- a/drivers/irqchip/irq-atmel-aic.c
+++ b/drivers/irqchip/irq-atmel-aic.c
@@ -65,11 +65,11 @@ aic_handle(struct pt_regs *regs)
u32 irqnr;
u32 irqstat;
 
-   irqnr = irq_reg_readl(gc->reg_base + AT91_AIC_IVR);
-   irqstat = irq_reg_readl(gc->reg_base + AT91_AIC_ISR);
+   irqnr = irq_reg_readl(gc, AT91_AIC_IVR);
+   irqstat = irq_reg_readl(gc, AT91_AIC_ISR);
 
if (!irqstat)
-   irq_reg_writel(0, gc->reg_base + AT91_AIC_EOICR);
+   irq_reg_writel(gc, 0, AT91_AIC_EOICR);
else
handle_domain_irq(aic_domain, irqnr, regs);
 }
@@ -80,7 +80,7 @@ static int aic_retrigger(struct irq_data *d)
 
/* Enable interrupt on AIC5 */
irq_gc_lock(gc);
-   irq_reg_writel(d->mask, gc->reg_base + AT91_AIC_ISCR);
+   irq_reg_writel(gc, d->mask, AT91_AIC_ISCR);
irq_gc_unlock(gc);
 
return 0;
@@ -92,12 +92,12 @@ static int aic_set_type(struct irq_data *d, unsigned type)
unsigned int smr;
int ret;
 
-   smr = irq_reg_readl(gc->reg_base + AT91_AIC_SMR(d->hwirq));
+   smr = irq_reg_readl(gc, AT91_AIC_SMR(d->hwirq));
ret = aic_common_set_type(d, type, &smr);
if (ret)
return ret;
 
-   irq_reg_writel(smr, gc->reg_base + AT91_AIC_SMR(d->hwirq));
+   irq_reg_writel(gc, smr, AT91_AIC_SMR(d->hwirq));
 
return 0;
 }
@@ -108,8 +108,8 @@ static void aic_suspend(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
 
irq_gc_lock(gc);
-   irq_reg_writel(gc->mask_cache, gc->reg_base + AT91_AIC_IDCR);
-   irq_reg_writel(gc->wake_active, gc->reg_base + AT91_AIC_IECR);
+   irq_reg_writel(gc, gc->mask_cache, AT91_AIC_IDCR);
+   irq_reg_writel(gc, gc->wake_active, AT91_AIC_IECR);
irq_gc_unlock(gc);
 }
 
@@ -118,8 +118,8 @@ static void aic_resume(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
 
irq_gc_lock(gc);
-   irq_reg_writel(gc->wake_active, gc->reg_base + AT91_AIC_IDCR);
-   irq_reg_writel(gc->mask_cache, gc->reg_base + AT91_AIC_IECR);
+   irq_reg_writel(gc, gc->wake_active, AT91_AIC_IDCR);
+   irq_reg_writel(gc, gc->mask_cache, AT91_AIC_IECR);
irq_gc_unlock(gc);
 }
 
@@ -128,8 +128,8 @@ static void aic_pm_shutdown(struct irq_data *d)
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
 
irq_gc_lock(gc);
-   irq_reg_writel(0x, gc->reg_base + AT91_AIC_IDCR);
-   irq_reg_writel(0x, gc->reg_base + AT91_AIC_ICCR);
+   irq_reg_writel(gc, 0x, AT91_AIC_IDCR);
+   irq_reg_writel(gc, 0x, AT91_AIC_ICCR);
irq_gc_unlock(gc);
 }
 #else
@@ -148,24 +148,24 @@ static void __init aic_hw_init(struct irq_domain *domain)
 * will not Lock out nIRQ
 */
for (i = 0; i < 8; i++)
-   irq_reg_writel(0, gc->reg_base + AT91_AIC_EOICR);
+   irq_reg_writel(gc, 0, AT91_AIC_EOICR);
 
/*
 * Spurious Interrupt ID in Spurious Vector Register.
 * When there is no current interrupt, the IRQ Vector Register
 * reads the value stored in AIC_SPU
 */
-   irq_reg_writel(0x, gc->reg_base + AT91_AIC_SPU);
+   irq_reg_writel(gc, 0x, AT91_AIC_SPU);
 
/* No debugging in AIC: Debug (Protect) Control Register */
-   irq_reg_writel(0, gc->reg_base + AT91_AIC_DCR);
+   irq_reg_writel(gc, 0, AT91_AIC_DCR);
 
/* Disable and clear all interrupts initially */
-   irq_reg_writel(0x, gc->reg_base + AT91_AIC_IDCR);
-   irq_reg_writel(0x, gc->reg_base + AT91_AIC_ICCR);
+   irq_reg_writel(gc, 0x, AT91_AIC_IDCR);
+   irq_reg_writel(gc, 0x, AT91_AIC_ICCR);
 
for (i = 0; i < 32; i++)
-   irq_reg_writel(i, gc->reg_base + AT91_AIC_SVR(i));
+   irq_reg_writel(gc, i, AT91_AIC_SVR(i));
 }
 
 static int

[PATCH V3 07/14] irqchip: bcm7120-l2, brcmstb-l2: Remove ARM Kconfig dependency

2014-11-01 Thread Kevin Cernekee

This can compile for MIPS (or anything else) now.

Signed-off-by: Kevin Cernekee 
Acked-by: Arnd Bergmann 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index b21f12f..09c79d1 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -50,7 +50,6 @@ config ATMEL_AIC5_IRQ
 
 config BRCMSTB_L2_IRQ
bool
-   depends on ARM
select GENERIC_IRQ_CHIP
select IRQ_DOMAIN
 
-- 
2.1.1


[PATCH V3 04/14] genirq: Generic chip: Add big endian I/O accessors

2014-11-01 Thread Kevin Cernekee

Use io{read,write}32be if the caller specified IRQ_GC_BE_IO when creating
the irqchip.
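
The selection logic can be sketched in userspace as follows. This is only a model under stated assumptions: `struct demo_gc`, `DEMO_GC_BE_IO` and all `demo_*` functions are invented for the example, a byte buffer stands in for the MMIO window, and the fallback to a little-endian accessor when no override is set is my reading of the series, not something this patch shows.

```c
#include <stdint.h>
#include <stddef.h>

#define DEMO_GC_BE_IO (1u << 4)     /* mirrors the bit chosen for IRQ_GC_BE_IO */

/* Hypothetical stand-in for struct irq_chip_generic. */
struct demo_gc {
    uint8_t *reg_base;
    uint32_t (*reg_readl)(const uint8_t *addr);     /* optional override */
    void (*reg_writel)(uint32_t val, uint8_t *addr); /* optional override */
};

static uint32_t demo_readl_le(const uint8_t *a)
{
    return (uint32_t)a[0] | ((uint32_t)a[1] << 8) |
           ((uint32_t)a[2] << 16) | ((uint32_t)a[3] << 24);
}

static void demo_writel_le(uint32_t v, uint8_t *a)
{
    a[0] = (uint8_t)v;
    a[1] = (uint8_t)(v >> 8);
    a[2] = (uint8_t)(v >> 16);
    a[3] = (uint8_t)(v >> 24);
}

static uint32_t demo_readl_be(const uint8_t *a)
{
    return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16) |
           ((uint32_t)a[2] << 8) | (uint32_t)a[3];
}

static void demo_writel_be(uint32_t v, uint8_t *a)
{
    a[0] = (uint8_t)(v >> 24);
    a[1] = (uint8_t)(v >> 16);
    a[2] = (uint8_t)(v >> 8);
    a[3] = (uint8_t)v;
}

/* Install the big-endian accessors only when the flag is passed. */
static void demo_gc_init(struct demo_gc *gc, uint8_t *base, unsigned int flags)
{
    gc->reg_base = base;
    gc->reg_readl = NULL;
    gc->reg_writel = NULL;
    if (flags & DEMO_GC_BE_IO) {
        gc->reg_readl = demo_readl_be;
        gc->reg_writel = demo_writel_be;
    }
}

/* Use the override when present, otherwise the default LE accessor. */
static uint32_t demo_reg_readl(const struct demo_gc *gc, unsigned int off)
{
    if (gc->reg_readl)
        return gc->reg_readl(gc->reg_base + off);
    return demo_readl_le(gc->reg_base + off);
}

static void demo_reg_writel(struct demo_gc *gc, uint32_t val, unsigned int off)
{
    if (gc->reg_writel)
        gc->reg_writel(val, gc->reg_base + off);
    else
        demo_writel_le(val, gc->reg_base + off);
}
```

Keeping the choice in a per-chip function pointer means one kernel image can drive both LE and BE interrupt controllers, at the cost of the extra indirection measured in the cover letter.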

Signed-off-by: Kevin Cernekee 
---
 include/linux/irq.h   |  1 +
 kernel/irq/generic-chip.c | 16 
 2 files changed, 17 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index a514ef7..48b364e 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -742,6 +742,7 @@ enum irq_gc_flags {
IRQ_GC_INIT_NESTED_LOCK = 1 << 1,
IRQ_GC_MASK_CACHE_PER_TYPE  = 1 << 2,
IRQ_GC_NO_MASK  = 1 << 3,
+   IRQ_GC_BE_IO= 1 << 4,
 };
 
 /*
diff --git a/kernel/irq/generic-chip.c b/kernel/irq/generic-chip.c
index db458c6..61024e8 100644
--- a/kernel/irq/generic-chip.c
+++ b/kernel/irq/generic-chip.c
@@ -191,6 +191,16 @@ int irq_gc_set_wake(struct irq_data *d, unsigned int on)
return 0;
 }
 
+static u32 irq_readl_be(void __iomem *addr)
+{
+   return ioread32be(addr);
+}
+
+static void irq_writel_be(u32 val, void __iomem *addr)
+{
+   iowrite32be(val, addr);
+}
+
 static void
 irq_init_generic_chip(struct irq_chip_generic *gc, const char *name,
  int num_ct, unsigned int irq_base,
@@ -300,7 +310,13 @@ int irq_alloc_domain_generic_chips(struct irq_domain *d, int irqs_per_chip,
dgc->gc[i] = gc = tmp;
irq_init_generic_chip(gc, name, num_ct, i * irqs_per_chip,
  NULL, handler);
+
gc->domain = d;
+   if (gcflags & IRQ_GC_BE_IO) {
+   gc->reg_readl = &irq_readl_be;
+   gc->reg_writel = &irq_writel_be;
+   }
+
raw_spin_lock_irqsave(&gc_lock, flags);
list_add_tail(&gc->list, &gc_list);
raw_spin_unlock_irqrestore(_lock, flags);
-- 
2.1.1


[PATCH V3 10/14] irqchip: bcm7120-l2: Use gc->mask_cache to simplify suspend/resume functions

2014-11-01 Thread Kevin Cernekee

The cached value already incorporates irq_fwd_mask, and was saved the
last time an IRQ was enabled/disabled.

Signed-off-by: Kevin Cernekee 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/irq-bcm7120-l2.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index b70679f8..9841121 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -37,7 +37,6 @@ struct bcm7120_l2_intc_data {
bool can_wake;
u32 irq_fwd_mask;
u32 irq_map_mask;
-   u32 saved_mask;
 };
 
 static void bcm7120_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
@@ -62,14 +61,11 @@ static void bcm7120_l2_intc_suspend(struct irq_data *d)
 {
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
struct bcm7120_l2_intc_data *b = gc->private;
-   u32 reg;
 
irq_gc_lock(gc);
-   /* Save the current mask and the interrupt forward mask */
-   b->saved_mask = __raw_readl(b->base + IRQEN) | b->irq_fwd_mask;
if (b->can_wake) {
-   reg = b->saved_mask | gc->wake_active;
-   __raw_writel(reg, b->base + IRQEN);
+   __raw_writel(gc->mask_cache | gc->wake_active,
+b->base + IRQEN);
}
irq_gc_unlock(gc);
 }
@@ -77,11 +73,10 @@ static void bcm7120_l2_intc_suspend(struct irq_data *d)
 static void bcm7120_l2_intc_resume(struct irq_data *d)
 {
struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
-   struct bcm7120_l2_intc_data *b = gc->private;
 
/* Restore the saved mask */
irq_gc_lock(gc);
-   __raw_writel(b->saved_mask, b->base + IRQEN);
+   __raw_writel(gc->mask_cache, b->base + IRQEN);
irq_gc_unlock(gc);
 }
 
-- 
2.1.1


[PATCH V3 13/14] irqchip: bcm7120-l2: Convert driver to use irq_reg_{readl,writel}

2014-11-01 Thread Kevin Cernekee

On BE MIPS systems this needs to use the new IRQ_GC_BE_IO gc_flag.  In
all other cases it will use the standard readl/writel accessors.

The initial irq_fwd_mask setup runs before "gc" is initialized, so it
is unchanged for now.  This could potentially be a problem on an ARM
system that boots in LE mode but runs a BE kernel, but currently none
of the supported ARM platforms are ever expected to run BE.
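
The flag selection described above amounts to the following, shown as a small standalone C sketch. The SKETCH_* bit values are placeholders chosen for this example only (the real constants are defined by the kernel and may differ), and sketch_pick_gc_flags() is a hypothetical helper that takes the two configuration facts as runtime booleans instead of IS_ENABLED() checks.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Placeholder bit values for this sketch only; the kernel defines the
 * real IRQ_GC_* constants and their values may differ.
 */
#define SKETCH_GC_INIT_MASK_CACHE (1u << 0)
#define SKETCH_GC_BE_IO           (1u << 1)

static_assert((SKETCH_GC_INIT_MASK_CACHE & SKETCH_GC_BE_IO) == 0,
	      "sketch flags must be distinct bits");

/*
 * Always ask for the mask cache to be initialized; request big-endian
 * register I/O only for a MIPS kernel built big-endian.
 */
static unsigned int sketch_pick_gc_flags(bool is_mips, bool cpu_big_endian)
{
	unsigned int flags = SKETCH_GC_INIT_MASK_CACHE;

	if (is_mips && cpu_big_endian)
		flags |= SKETCH_GC_BE_IO;
	return flags;
}
```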

Signed-off-by: Kevin Cernekee 
---
 drivers/irqchip/irq-bcm7120-l2.c | 24 ++--
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index e53a3a6..e7c6155 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -60,8 +61,7 @@ static void bcm7120_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
int hwirq;
 
irq_gc_lock(gc);
-   pending = __raw_readl(b->base[idx] + IRQSTAT) &
- gc->mask_cache;
+   pending = irq_reg_readl(gc, IRQSTAT) & gc->mask_cache;
irq_gc_unlock(gc);
 
for_each_set_bit(hwirq, &pending, IRQS_PER_WORD) {
@@ -79,10 +79,8 @@ static void bcm7120_l2_intc_suspend(struct irq_data *d)
struct bcm7120_l2_intc_data *b = gc->private;
 
irq_gc_lock(gc);
-   if (b->can_wake) {
-   __raw_writel(gc->mask_cache | gc->wake_active,
-gc->reg_base + IRQEN);
-   }
+   if (b->can_wake)
+   irq_reg_writel(gc, gc->mask_cache | gc->wake_active, IRQEN);
irq_gc_unlock(gc);
 }
 
@@ -92,7 +90,7 @@ static void bcm7120_l2_intc_resume(struct irq_data *d)
 
/* Restore the saved mask */
irq_gc_lock(gc);
-   __raw_writel(gc->mask_cache, gc->reg_base + IRQEN);
+   irq_reg_writel(gc, gc->mask_cache, IRQEN);
irq_gc_unlock(gc);
 }
 
@@ -132,7 +130,7 @@ int __init bcm7120_l2_intc_of_init(struct device_node *dn,
const __be32 *map_mask;
int num_parent_irqs;
int ret = 0, len;
-   unsigned int idx, irq;
+   unsigned int idx, irq, flags;
 
data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
@@ -195,9 +193,15 @@ int __init bcm7120_l2_intc_of_init(struct device_node *dn,
goto out_unmap;
}
 
+   /* MIPS chips strapped for BE will automagically configure the
+* peripheral registers for CPU-native byte order.
+*/
+   flags = IRQ_GC_INIT_MASK_CACHE;
+   if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+   flags |= IRQ_GC_BE_IO;
+
ret = irq_alloc_domain_generic_chips(data->domain, IRQS_PER_WORD, 1,
-   dn->full_name, handle_level_irq, clr, 0,
-   IRQ_GC_INIT_MASK_CACHE);
+   dn->full_name, handle_level_irq, clr, 0, flags);
if (ret) {
pr_err("failed to allocate generic irq chip\n");
goto out_free_domain;
-- 
2.1.1


[PATCH V3 08/14] irqchip: bcm7120-l2: Make sure all register accesses use base+offset

2014-11-01 Thread Kevin Cernekee

A couple of accesses to IRQEN (base+0x00) just used "base" directly, so
they would break if IRQEN ever became nonzero.  Make sure that all
reads/writes specify the register offset constant.

Signed-off-by: Kevin Cernekee 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/irq-bcm7120-l2.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index 7086fe0..22d3fa1 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -66,10 +66,10 @@ static void bcm7120_l2_intc_suspend(struct irq_data *d)
 
irq_gc_lock(gc);
/* Save the current mask and the interrupt forward mask */
-   b->saved_mask = __raw_readl(b->base) | b->irq_fwd_mask;
+   b->saved_mask = __raw_readl(b->base + IRQEN) | b->irq_fwd_mask;
if (b->can_wake) {
reg = b->saved_mask | gc->wake_active;
-   __raw_writel(reg, b->base);
+   __raw_writel(reg, b->base + IRQEN);
}
irq_gc_unlock(gc);
 }
@@ -81,7 +81,7 @@ static void bcm7120_l2_intc_resume(struct irq_data *d)
 
/* Restore the saved mask */
irq_gc_lock(gc);
-   __raw_writel(b->saved_mask, b->base);
+   __raw_writel(b->saved_mask, b->base + IRQEN);
irq_gc_unlock(gc);
 }
 
-- 
2.1.1


[PATCH V3 06/14] irqchip: bcm7120-l2: Eliminate bad IRQ check

2014-11-01 Thread Kevin Cernekee

This check may be prone to race conditions, e.g.

1) Some external event (e.g. GPIO level) causes an IRQ to become pending
2) Peripheral asserts the L2 IRQ
3) CPU takes an interrupt
4) The event from #1 goes away
5) bcm7120_l2_intc_irq_handle() reads back a 0 status

Unlike the hardware supported by brcmstb-l2, the bcm7120-l2 controller
does not latch the IRQ status.  Bits can change if the inputs to the
controller change.  Also, do_bad_IRQ() is an ARM-specific macro.

So let's just nuke it.
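
As a rough userspace illustration of the loop that remains once the check is gone: dispatch_pending(), irq_handler_fn and record_irq() below are hypothetical names, and ffs() is the POSIX function from <strings.h>, not the kernel helper. A status word that has already dropped back to 0 simply produces no handler calls.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h> /* POSIX ffs() */

/* Hypothetical callback type standing in for generic_handle_irq(). */
typedef void (*irq_handler_fn)(unsigned int hwirq, void *ctx);

/*
 * Walk every set bit of a status word, lowest bit first, calling the
 * handler once per bit. Returns the number of bits handled; a status of
 * 0 is not treated as an error and yields 0.
 */
static unsigned int dispatch_pending(uint32_t status, irq_handler_fn handler,
				     void *ctx)
{
	unsigned int handled = 0;

	assert(handler);
	while (status) {
		unsigned int bit = (unsigned int)ffs((int)status) - 1;

		status &= ~(UINT32_C(1) << bit);
		handler(bit, ctx);
		handled++;
	}
	return handled;
}

/* Example handler: records each handled bit in the mask ctx points to. */
static void record_irq(unsigned int hwirq, void *ctx)
{
	*(uint32_t *)ctx |= UINT32_C(1) << hwirq;
}
```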

Signed-off-by: Kevin Cernekee 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/irq-bcm7120-l2.c | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index b9f4fb8..7086fe0 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -27,8 +27,6 @@
 
 #include "irqchip.h"
 
-#include 
-
 /* Register offset in the L2 interrupt controller */
 #define IRQEN  0x00
 #define IRQSTAT 0x04
@@ -51,19 +49,12 @@ static void bcm7120_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
chained_irq_enter(chip, desc);
 
status = __raw_readl(b->base + IRQSTAT);
-
-   if (status == 0) {
-   do_bad_IRQ(irq, desc);
-   goto out;
-   }
-
-   do {
+   while (status) {
irq = ffs(status) - 1;
status &= ~(1 << irq);
generic_handle_irq(irq_find_mapping(b->domain, irq));
-   } while (status);
+   }
 
-out:
chained_irq_exit(chip, desc);
 }
 
-- 
2.1.1


[PATCH V3 14/14] irqchip: brcmstb-l2: Convert driver to use irq_reg_{readl,writel}

2014-11-01 Thread Kevin Cernekee

This effectively converts the __raw_ accessors to the non-__raw_
equivalents.  To handle BE, we pass IRQ_GC_BE_IO, similar to what was
done in irq-bcm7120-l2.c.

Since irq_reg_writel now takes an irq_chip_generic argument, writel must
be used for the initial hardware reset in the probe function.  But that
operation never needs endian swapping, so it's probably not a big deal.

Signed-off-by: Kevin Cernekee 
---
 drivers/irqchip/irq-brcmstb-l2.c | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/irqchip/irq-brcmstb-l2.c b/drivers/irqchip/irq-brcmstb-l2.c
index c9bdf20..4aa653a 100644
--- a/drivers/irqchip/irq-brcmstb-l2.c
+++ b/drivers/irqchip/irq-brcmstb-l2.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -53,13 +54,14 @@ struct brcmstb_l2_intc_data {
 static void brcmstb_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
 {
struct brcmstb_l2_intc_data *b = irq_desc_get_handler_data(desc);
+   struct irq_chip_generic *gc = irq_get_domain_generic_chip(b->domain, 0);
struct irq_chip *chip = irq_desc_get_chip(desc);
u32 status;
 
chained_irq_enter(chip, desc);
 
-   status = __raw_readl(b->base + CPU_STATUS) &
-   ~(__raw_readl(b->base + CPU_MASK_STATUS));
+   status = irq_reg_readl(gc, CPU_STATUS) &
+   ~(irq_reg_readl(gc, CPU_MASK_STATUS));
 
if (status == 0) {
raw_spin_lock(&desc->lock);
@@ -71,7 +73,7 @@ static void brcmstb_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
do {
irq = ffs(status) - 1;
/* ack at our level */
-   __raw_writel(1 << irq, b->base + CPU_CLEAR);
+   irq_reg_writel(gc, 1 << irq, CPU_CLEAR);
status &= ~(1 << irq);
generic_handle_irq(irq_find_mapping(b->domain, irq));
} while (status);
@@ -86,12 +88,12 @@ static void brcmstb_l2_intc_suspend(struct irq_data *d)
 
irq_gc_lock(gc);
/* Save the current mask */
-   b->saved_mask = __raw_readl(b->base + CPU_MASK_STATUS);
+   b->saved_mask = irq_reg_readl(gc, CPU_MASK_STATUS);
 
if (b->can_wake) {
/* Program the wakeup mask */
-   __raw_writel(~gc->wake_active, b->base + CPU_MASK_SET);
-   __raw_writel(gc->wake_active, b->base + CPU_MASK_CLEAR);
+   irq_reg_writel(gc, ~gc->wake_active, CPU_MASK_SET);
+   irq_reg_writel(gc, gc->wake_active, CPU_MASK_CLEAR);
}
irq_gc_unlock(gc);
 }
@@ -103,11 +105,11 @@ static void brcmstb_l2_intc_resume(struct irq_data *d)
 
irq_gc_lock(gc);
/* Clear unmasked non-wakeup interrupts */
-   __raw_writel(~b->saved_mask & ~gc->wake_active, b->base + CPU_CLEAR);
+   irq_reg_writel(gc, ~b->saved_mask & ~gc->wake_active, CPU_CLEAR);
 
/* Restore the saved mask */
-   __raw_writel(b->saved_mask, b->base + CPU_MASK_SET);
-   __raw_writel(~b->saved_mask, b->base + CPU_MASK_CLEAR);
+   irq_reg_writel(gc, b->saved_mask, CPU_MASK_SET);
+   irq_reg_writel(gc, ~b->saved_mask, CPU_MASK_CLEAR);
irq_gc_unlock(gc);
 }
 
@@ -119,6 +121,7 @@ int __init brcmstb_l2_intc_of_init(struct device_node *np,
struct irq_chip_generic *gc;
struct irq_chip_type *ct;
int ret;
+   unsigned int flags;
 
data = kzalloc(sizeof(*data), GFP_KERNEL);
if (!data)
@@ -132,8 +135,8 @@ int __init brcmstb_l2_intc_of_init(struct device_node *np,
}
 
/* Disable all interrupts by default */
-   __raw_writel(0xffffffff, data->base + CPU_MASK_SET);
-   __raw_writel(0xffffffff, data->base + CPU_CLEAR);
+   writel(0xffffffff, data->base + CPU_MASK_SET);
+   writel(0xffffffff, data->base + CPU_CLEAR);
 
data->parent_irq = irq_of_parse_and_map(np, 0);
if (data->parent_irq < 0) {
@@ -149,9 +152,16 @@ int __init brcmstb_l2_intc_of_init(struct device_node *np,
goto out_unmap;
}
 
+   /* MIPS chips strapped for BE will automagically configure the
+* peripheral registers for CPU-native byte order.
+*/
+   flags = 0;
+   if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
+   flags |= IRQ_GC_BE_IO;
+
/* Allocate a single Generic IRQ chip for this node */
ret = irq_alloc_domain_generic_chips(data->domain, 32, 1,
-   np->full_name, handle_edge_irq, clr, 0, 0);
+   np->full_name, handle_edge_irq, clr, 0, flags);
if (ret) {
pr_err("failed to allocate generic irq chip\n");
goto out_free_domain;
-- 
2.1.1


[PATCH V3 09/14] irqchip: bcm7120-l2: Fix missing nibble in gc->unused mask

2014-11-01 Thread Kevin Cernekee

This mask should have been 0xffff_ffff, not 0x0fff_ffff.

The change should not have an effect on current users (STB) because bits
31:27 are unused.

Signed-off-by: Kevin Cernekee 
Acked-by: Arnd Bergmann 
Acked-by: Florian Fainelli 
---
 drivers/irqchip/irq-bcm7120-l2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index 22d3fa1..b70679f8 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -171,7 +171,7 @@ int __init bcm7120_l2_intc_of_init(struct device_node *dn,
}
 
gc = irq_get_domain_generic_chip(data->domain, 0);
-   gc->unused = 0xfffffff & ~data->irq_map_mask;
+   gc->unused = 0xffffffff & ~data->irq_map_mask;
gc->reg_base = data->base;
gc->private = data;
ct = gc->chip_types;
-- 
2.1.1


[PATCH V3 11/14] irqchip: bcm7120-l2: Extend driver to support 64+ bit controllers

2014-11-01 Thread Kevin Cernekee

Most implementations of the bcm7120-l2 controller only have a single
32-bit enable word + 32-bit status word.  But some instances have added
more enable/status pairs in order to support 64+ IRQs (which are all
ORed into one parent IRQ input).  Make the following changes to allow
the driver to support this:

 - Extend DT bindings so that multiple words can be specified for the
   reg property, various masks, etc.

 - Add loops to the probe/handle functions to deal with each word
   separately

 - Allocate 1 generic-chip for every 32 IRQs, so we can still use the
   clr/set helper functions

 - Update the documentation

This uses one domain per bcm7120-l2 DT node.  If the DT node defines
multiple enable/status pairs (i.e. >=64 IRQs) then the driver will
create a single IRQ domain with 2+ generic chips.  Multiple generic chips
are required because the generic-chip code can only handle one
enable/status register pair per instance.
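
The per-word handling can be pictured with a small standalone C sketch. MAX_WORDS and IRQS_PER_WORD are the constants the patch introduces; collect_pending() and its array arguments are hypothetical and stand in for the per-word status and enable registers, so this is not the driver's actual handler.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_WORDS      4  /* as in the patch */
#define IRQS_PER_WORD  32

/*
 * Scan n_words status/enable pairs and store the controller-wide hwirq
 * number of every pending, enabled source in out[]. Word 0 covers hwirqs
 * 0..31, word 1 covers 32..63, and so on. Returns how many were stored.
 */
static unsigned int collect_pending(const uint32_t *status,
				    const uint32_t *enable,
				    unsigned int n_words,
				    unsigned int *out, unsigned int out_len)
{
	unsigned int n = 0;

	assert(n_words <= MAX_WORDS);
	for (unsigned int idx = 0; idx < n_words; idx++) {
		uint32_t pending = status[idx] & enable[idx];

		for (unsigned int bit = 0; bit < IRQS_PER_WORD; bit++) {
			if ((pending & (UINT32_C(1) << bit)) && n < out_len)
				out[n++] = idx * IRQS_PER_WORD + bit;
		}
	}
	return n;
}
```

Each word is masked and scanned on its own, which matches the one-generic-chip-per-32-IRQs layout described above.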

Signed-off-by: Kevin Cernekee 
---
 .../interrupt-controller/brcm,bcm7120-l2-intc.txt  |  26 ++--
 drivers/irqchip/irq-bcm7120-l2.c   | 144 ++---
 2 files changed, 113 insertions(+), 57 deletions(-)

diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt
index ff812a8..bae1f21 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt
@@ -13,7 +13,12 @@ Such an interrupt controller has the following hardware design:
   or if they will output an interrupt signal at this 2nd level interrupt
   controller, in particular for UARTs
 
-- not all 32-bits within the interrupt controller actually map to an interrupt
+- typically has one 32-bit enable word and one 32-bit status word, but on
+  some hardware may have more than one enable/status pair
+
+- no atomic set/clear operations
+
+- not all bits within the interrupt controller actually map to an interrupt
 
 The typical hardware layout for this controller is represented below:
 
@@ -48,7 +53,9 @@ The typical hardware layout for this controller is represented below:
 Required properties:
 
 - compatible: should be "brcm,bcm7120-l2-intc"
-- reg: specifies the base physical address and size of the registers
+- reg: specifies the base physical address and size of the registers;
+  multiple pairs may be specified, with the first pair handling IRQ offsets
+  0..31 and the second pair handling 32..63
 - interrupt-controller: identifies the node as an interrupt controller
 - #interrupt-cells: specifies the number of cells needed to encode an interrupt
   source, should be 1.
@@ -59,18 +66,21 @@ Required properties:
 - brcm,int-map-mask: 32-bits bit mask describing how many and which interrupts
   are wired to this 2nd level interrupt controller, and how they match their
   respective interrupt parents. Should match exactly the number of interrupts
-  specified in the 'interrupts' property.
+  specified in the 'interrupts' property, multiplied by the number of
+  enable/status register pairs implemented by this controller.  For
+  multiple parent IRQs with multiple enable/status words, this looks like:
+  
 
 Optional properties:
 
 - brcm,irq-can-wake: if present, this means the L2 controller can be used as a
   wakeup source for system suspend/resume.
 
-- brcm,int-fwd-mask: if present, a 32-bits bit mask to configure for the
-  interrupts which have a mux gate, typically UARTs. Setting these bits will
-  make their respective interrupts outputs bypass this 2nd level interrupt
-  controller completely, it completely transparent for the interrupt controller
-  parent
+- brcm,int-fwd-mask: if present, a bit mask to configure the interrupts which
+  have a mux gate, typically UARTs. Setting these bits will make their
+  respective interrupt outputs bypass this 2nd level interrupt controller
+  completely; it is completely transparent for the interrupt controller
+  parent. This should have one 32-bit word per enable/status pair.
 
 Example:
 
diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index 9841121..ef4d32c 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "irqchip.h"
@@ -31,27 +32,42 @@
 #define IRQEN  0x00
 #define IRQSTAT 0x04
 
+#define MAX_WORDS  4
+#define IRQS_PER_WORD  32
+
 struct bcm7120_l2_intc_data {
-   void __iomem *base;
+   unsigned int n_words;
+   void __iomem *base[MAX_WORDS];
struct irq_domain *domain;
bool can_wake;
-   u32 irq_fwd_mask;
-   u32 irq_map_mask;
+   u32 irq_fwd_mask[MAX_WORDS];
+   u32 irq_map_mask[MAX_WORDS];
 };
 
 static void bcm7120_l2_intc_irq_handle(unsigned int irq, struct irq_desc *desc)
 {
struct

[PATCH V3 12/14] irqchip: bcm7120-l2: Decouple driver from brcmstb-l2

2014-11-01 Thread Kevin Cernekee

Some chips, such as BCM6328, only require bcm7120-l2.  Some BCM7xxx STB
configurations only require brcmstb-l2.  Treat them as two separate
entities, and update the mach-bcm dependencies to reflect the change.

Signed-off-by: Kevin Cernekee 
Acked-by: Arnd Bergmann 
Acked-by: Florian Fainelli 
---
 arch/arm/mach-bcm/Kconfig| 1 +
 drivers/irqchip/Kconfig  | 5 +
 drivers/irqchip/Makefile | 4 ++--
 drivers/irqchip/irq-bcm7120-l2.c | 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index 2abad74..bf47eb0 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -125,6 +125,7 @@ config ARCH_BRCMSTB
select HAVE_ARM_ARCH_TIMER
select BRCMSTB_GISB_ARB
select BRCMSTB_L2_IRQ
+   select BCM7120_L2_IRQ
help
  Say Y if you intend to run the kernel on a Broadcom ARM-based STB
  chipset.
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 09c79d1..afdc1f3 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -48,6 +48,11 @@ config ATMEL_AIC5_IRQ
select MULTI_IRQ_HANDLER
select SPARSE_IRQ
 
+config BCM7120_L2_IRQ
+   bool
+   select GENERIC_IRQ_CHIP
+   select IRQ_DOMAIN
+
 config BRCMSTB_L2_IRQ
bool
select GENERIC_IRQ_CHIP
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 173bb5f..f0909d0 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -35,6 +35,6 @@ obj-$(CONFIG_TB10X_IRQC)  += irq-tb10x.o
 obj-$(CONFIG_XTENSA)   += irq-xtensa-pic.o
 obj-$(CONFIG_XTENSA_MX)    += irq-xtensa-mx.o
 obj-$(CONFIG_IRQ_CROSSBAR) += irq-crossbar.o
-obj-$(CONFIG_BRCMSTB_L2_IRQ)   += irq-brcmstb-l2.o \
-  irq-bcm7120-l2.o
+obj-$(CONFIG_BCM7120_L2_IRQ)   += irq-bcm7120-l2.o
+obj-$(CONFIG_BRCMSTB_L2_IRQ)   += irq-brcmstb-l2.o
 obj-$(CONFIG_KEYSTONE_IRQ) += irq-keystone.o
diff --git a/drivers/irqchip/irq-bcm7120-l2.c b/drivers/irqchip/irq-bcm7120-l2.c
index ef4d32c..e53a3a6 100644
--- a/drivers/irqchip/irq-bcm7120-l2.c
+++ b/drivers/irqchip/irq-bcm7120-l2.c
@@ -247,5 +247,5 @@ out_unmap:
kfree(data);
return ret;
 }
-IRQCHIP_DECLARE(brcmstb_l2_intc, "brcm,bcm7120-l2-intc",
+IRQCHIP_DECLARE(bcm7120_l2_intc, "brcm,bcm7120-l2-intc",
bcm7120_l2_intc_of_init);
-- 
2.1.1


[PATCH V3 03/14] genirq: Generic chip: Allow irqchip drivers to override irq_reg_{readl,writel}

2014-11-01 Thread Kevin Cernekee

Currently, these I/O accessors always assume little endian 32-bit
registers (readl/writel).  On some systems the IRQ registers need to be
accessed in BE mode or using 16-bit loads/stores, so we will provide a
way to override the default behavior.
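
A minimal userspace sketch of this override pattern follows. It assumes a byte array in place of MMIO, and struct fake_gc, fake_reg_readl() and fake_reg_writel() are hypothetical stand-ins for the kernel structure and helpers: an accessor pointer left NULL means "use the little-endian default".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct irq_chip_generic, reduced to three fields. */
struct fake_gc {
	uint8_t *reg_base;                               /* simulated register window */
	uint32_t (*reg_readl)(const uint8_t *addr);      /* optional override */
	void (*reg_writel)(uint32_t val, uint8_t *addr); /* optional override */
};

/* Default path: 32-bit little-endian, the byte order readl()/writel() assume. */
static uint32_t le_readl(const uint8_t *a)
{
	return (uint32_t)a[0] | ((uint32_t)a[1] << 8) |
	       ((uint32_t)a[2] << 16) | ((uint32_t)a[3] << 24);
}

static void le_writel(uint32_t v, uint8_t *a)
{
	for (int i = 0; i < 4; i++)
		a[i] = (uint8_t)(v >> (8 * i));
}

/* Example override: the same register accessed big-endian. */
static uint32_t be_readl(const uint8_t *a)
{
	return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16) |
	       ((uint32_t)a[2] << 8) | (uint32_t)a[3];
}

static void be_writel(uint32_t v, uint8_t *a)
{
	for (int i = 0; i < 4; i++)
		a[i] = (uint8_t)(v >> (8 * (3 - i)));
}

/* Use the per-chip accessor when one is set, else fall back to the default. */
static uint32_t fake_reg_readl(struct fake_gc *gc, int reg_offset)
{
	assert(gc && gc->reg_base);
	if (gc->reg_readl)
		return gc->reg_readl(gc->reg_base + reg_offset);
	return le_readl(gc->reg_base + reg_offset);
}

static void fake_reg_writel(struct fake_gc *gc, uint32_t val, int reg_offset)
{
	assert(gc && gc->reg_base);
	if (gc->reg_writel)
		gc->reg_writel(val, gc->reg_base + reg_offset);
	else
		le_writel(val, gc->reg_base + reg_offset);
}
```

Callers keep passing only the chip and a register offset; whether the access ends up little-endian or overridden is decided once, when the chip is set up.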

Signed-off-by: Kevin Cernekee 
---
 include/linux/irq.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 0743743..a514ef7 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -709,6 +710,8 @@ struct irq_chip_type {
 struct irq_chip_generic {
raw_spinlock_t  lock;
void __iomem    *reg_base;
+   u32 (*reg_readl)(void __iomem *addr);
+   void (*reg_writel)(u32 val, void __iomem *addr);
unsigned int    irq_base;
unsigned int    irq_cnt;
u32 mask_cache;
@@ -817,13 +820,19 @@ static inline void irq_gc_unlock(struct irq_chip_generic *gc) { }
 static inline void irq_reg_writel(struct irq_chip_generic *gc,
  u32 val, int reg_offset)
 {
-   writel(val, gc->reg_base + reg_offset);
+   if (gc->reg_writel)
+   gc->reg_writel(val, gc->reg_base + reg_offset);
+   else
+   writel(val, gc->reg_base + reg_offset);
 }
 
 static inline u32 irq_reg_readl(struct irq_chip_generic *gc,
int reg_offset)
 {
-   return readl(gc->reg_base + reg_offset);
+   if (gc->reg_readl)
+   return gc->reg_readl(gc->reg_base + reg_offset);
+   else
+   return readl(gc->reg_base + reg_offset);
 }
 
 #endif /* _LINUX_IRQ_H */
-- 
2.1.1


[PATCH V3 01/14] sh: Eliminate unused irq_reg_{readl,writel} accessors

2014-11-01 Thread Kevin Cernekee

Defining these macros way down in arch/sh/.../irq.c doesn't cause
kernel/irq/generic-chip.c to use them.  As far as I can tell this code
has no effect.

Signed-off-by: Kevin Cernekee 
---
 arch/sh/boards/mach-se/7343/irq.c | 3 ---
 arch/sh/boards/mach-se/7722/irq.c | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/arch/sh/boards/mach-se/7343/irq.c b/arch/sh/boards/mach-se/7343/irq.c
index 7646bf0..1087dba 100644
--- a/arch/sh/boards/mach-se/7343/irq.c
+++ b/arch/sh/boards/mach-se/7343/irq.c
@@ -14,9 +14,6 @@
 #define DRV_NAME "SE7343-FPGA"
 #define pr_fmt(fmt) DRV_NAME ": " fmt
 
-#define irq_reg_readl  ioread16
-#define irq_reg_writel iowrite16
-
 #include 
 #include 
 #include 
diff --git a/arch/sh/boards/mach-se/7722/irq.c b/arch/sh/boards/mach-se/7722/irq.c
index f5e2af1..00e6992 100644
--- a/arch/sh/boards/mach-se/7722/irq.c
+++ b/arch/sh/boards/mach-se/7722/irq.c
@@ -11,9 +11,6 @@
 #define DRV_NAME "SE7722-FPGA"
 #define pr_fmt(fmt) DRV_NAME ": " fmt
 
-#define irq_reg_readl  ioread16
-#define irq_reg_writel iowrite16
-
 #include 
 #include 
 #include 
-- 
2.1.1


[PATCH 2/2] staging: ft1000: Logging message neatening

2014-11-01 Thread Joe Perches

Use a more common logging style.

o Convert DEBUG macros to pr_debug
o Add pr_fmt
o Remove embedded function names from pr_debug
o Convert printks to pr_
o Coalesce formats and align arguments
o Add missing terminating newlines
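
The pr_fmt conversion can be approximated in userspace as shown here. This is a sketch under assumptions: the module name "ft1000_demo" is made up (in a kernel build, Kbuild supplies KBUILD_MODNAME), and demo_pr_err() formats into a caller-provided buffer with snprintf() where the kernel's pr_err() would write to the kernel log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical module name; in a kernel build, Kbuild defines KBUILD_MODNAME. */
#define KBUILD_MODNAME "ft1000_demo"

/* Same shape as the line the patch adds: every format gains a module prefix. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Userspace stand-in for pr_err(): formats into buf, not the kernel log. */
#define demo_pr_err(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), __VA_ARGS__)

/* Mirrors the converted call site in card_download(). */
static int demo_report_version(char *buf, size_t len, long file_version)
{
	assert(buf && len > 0);
	return demo_pr_err(buf, len, "unsupported firmware version %ld\n",
			   file_version);
}

/* True if msg carries the prefix that pr_fmt() prepends. */
static int demo_has_prefix(const char *msg)
{
	return strncmp(msg, KBUILD_MODNAME ": ", strlen(KBUILD_MODNAME ": ")) == 0;
}
```

Because the prefix is pasted onto the format string at compile time, individual messages no longer need to spell out "ft1000:" or the function name themselves.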

Signed-off-by: Joe Perches 
---
 drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c |   6 +-
 drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c   | 340 -
 drivers/staging/ft1000/ft1000-usb/ft1000_debug.c   | 136 -
 .../staging/ft1000/ft1000-usb/ft1000_download.c| 138 -
 drivers/staging/ft1000/ft1000-usb/ft1000_hw.c  | 194 +---
 drivers/staging/ft1000/ft1000-usb/ft1000_usb.c |  85 +++---
 drivers/staging/ft1000/ft1000-usb/ft1000_usb.h |   2 -
 7 files changed, 383 insertions(+), 518 deletions(-)

diff --git a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c
index deb1256..1150050 100644
--- a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c
+++ b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c
@@ -20,6 +20,8 @@
 
   ---*/
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #define __KERNEL_SYSCALLS__
 
 #include 
@@ -316,7 +318,7 @@ int card_download(struct net_device *dev, const u8 *pFileStart,
 
file_version = *(long *)pFileStart;
if (file_version != 6) {
-   printk(KERN_ERR "ft1000: unsupported firmware version %ld\n", file_version);
+   pr_err("unsupported firmware version %ld\n", file_version);
Status = FAILURE;
}
 
@@ -688,7 +690,7 @@ int card_download(struct net_device *dev, const u8 *pFileStart,
uiState = STATE_SECTION_PROV;
} else {
netdev_dbg(dev,
-  "FT1000:download:Download error: Bad Port IDs in Pseudo Record\n");
+  "Download error: Bad Port IDs in Pseudo Record\n");
netdev_dbg(dev, "\t Port Source = 0x%2.2x\n",
   pHdr->portsrc);
netdev_dbg(dev, "\t Port Destination = 0x%2.2x\n",
diff --git a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c
index ab6bf4ef..8f0d093 100644
--- a/drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c
+++ b/drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c
@@ -17,6 +17,8 @@
   Suite 330, Boston, MA 02111-1307, USA.
   -*/
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include 
 #include 
 #include 
@@ -44,12 +46,6 @@
 #include 
 #include 
 
-#ifdef FT_DEBUG
-#define DEBUG(n, args...) printk(KERN_DEBUG args);
-#else
-#define DEBUG(n, args...)
-#endif
-
 #include 
 #include "ft1000.h"
 
@@ -282,12 +278,9 @@ static void ft1000_enable_interrupts(struct net_device *dev)
 {
u16 tempword;
 
-   DEBUG(1, "ft1000_hw:ft1000_enable_interrupts()\n");
ft1000_write_reg(dev, FT1000_REG_SUP_IMASK, ISR_DEFAULT_MASK);
tempword = ft1000_read_reg(dev, FT1000_REG_SUP_IMASK);
-   DEBUG(1,
- "ft1000_hw:ft1000_enable_interrupts:current interrupt enable mask = 0x%x\n",
- tempword);
+   pr_debug("current interrupt enable mask = 0x%x\n", tempword);
 }
 
 /*---
@@ -304,12 +297,9 @@ static void ft1000_disable_interrupts(struct net_device *dev)
 {
u16 tempword;
 
-   DEBUG(1, "ft1000_hw: ft1000_disable_interrupts()\n");
ft1000_write_reg(dev, FT1000_REG_SUP_IMASK, ISR_MASK_ALL);
tempword = ft1000_read_reg(dev, FT1000_REG_SUP_IMASK);
-   DEBUG(1,
- "ft1000_hw:ft1000_disable_interrupts:current interrupt enable mask = 0x%x\n",
- tempword);
+   pr_debug("current interrupt enable mask = 0x%x\n", tempword);
 }
 
 /*---
@@ -329,8 +319,6 @@ static void ft1000_reset_asic(struct net_device *dev)
struct ft1000_pcmcia *pcmcia = info->priv;
u16 tempword;
 
-   DEBUG(1, "ft1000_hw:ft1000_reset_asic called\n");
-
(*info->ft1000_reset) (pcmcia->link);
 
/*
@@ -351,10 +339,10 @@ static void ft1000_reset_asic(struct net_device *dev)
}
/* clear interrupts */
tempword = ft1000_read_reg(dev, FT1000_REG_SUP_ISR);
-   DEBUG(1, "ft1000_hw: interrupt status register = 0x%x\n", tempword);
+   pr_debug("interrupt status register = 0x%x\n", tempword);
ft1000_write_reg(dev, FT1000_REG_SUP_ISR, tempword);
tempword = ft1000_read_reg(dev, FT1000_REG_SUP_ISR);
-   DEBUG(1, "ft1000_hw: interrupt status register = 0x%x\n", tempword);
+   pr_debug("interrupt status register = 0x%x\n", tempword);
 
 }
 
@@

[PATCH 0/2] staging: ft1000: generic neatening

2014-11-01 Thread Joe Perches

Joe Perches (2):
  staging: ft1000: Whitespace neatening
  staging: ft1000: Logging message neatening

 drivers/staging/ft1000/ft1000-pcmcia/boot.h|   34 +-
 drivers/staging/ft1000/ft1000-pcmcia/ft1000.h  |   30 +-
 drivers/staging/ft1000/ft1000-pcmcia/ft1000_cs.c   |   50 +-
 drivers/staging/ft1000/ft1000-pcmcia/ft1000_dnld.c |  208 ++--
 drivers/staging/ft1000/ft1000-pcmcia/ft1000_hw.c   |  896 +++
 drivers/staging/ft1000/ft1000-usb/ft1000_debug.c   | 1198 ++--
 .../staging/ft1000/ft1000-usb/ft1000_download.c|  394 +++
 drivers/staging/ft1000/ft1000-usb/ft1000_hw.c  |  436 ---
 drivers/staging/ft1000/ft1000-usb/ft1000_ioctl.h   |   60 +-
 drivers/staging/ft1000/ft1000-usb/ft1000_usb.c |   97 +-
 drivers/staging/ft1000/ft1000-usb/ft1000_usb.h |4 +-
 11 files changed, 1636 insertions(+), 1771 deletions(-)

-- 
2.1.2


Re: [PATCH 3.2 023/102] block: Fix dev_t minor allocation lifetime

2014-11-01 Thread Ben Hutchings

On Sat, 2014-11-01 at 17:18 -0600, Jens Axboe wrote:
> On 2014-11-01 16:28, Ben Hutchings wrote:
> > 3.2.64-rc1 review patch.  If anyone has any objections, please let me know.
> >
> > --
> >
> > From: Keith Busch 
> >
> > commit 2da78092dda13f1efd26edbbf99a567776913750 upstream.
> >
> > Releases the dev_t minor when all references are closed to prevent
> > another device from acquiring the same major/minor.
> >
> > Since the partition's release may be invoked from call_rcu's soft-irq
> > context, the ext_dev_idr's mutex had to be replaced with a spinlock so
> > as not so sleep.
> >
> > Signed-off-by: Keith Busch 
> > Signed-off-by: Jens Axboe 
> > [bwh: Backported to 3.2:
> >   - Adjust filename
> >   - idr insertion API is different, and blk_alloc_devt() is preallocating
> > a node in a different way]
> 
> As I've noted for pretty much every stable branch so far, you have to 
> backport commit 46f341ffcfb5 as well, if you backport this one.

I'm not caught up on reading the stable list, so I missed that.  Thanks
for pointing it out again; I'll add it.

Ben.

-- 
Ben Hutchings
Kids!  Bringing about Armageddon can be dangerous.  Do not attempt it in
your own home. - Terry Pratchett and Neil Gaiman, `Good Omens'



Re: Linus GIT 3.18.0-rc2+: INFO: suspicious RCU usage - kernel/sched/core.c:7449 suspicious rcu_dereference_check() usage!

2014-11-01 Thread Wanpeng Li


How about trying this patch: https://lkml.org/lkml/2014/10/28/41
On 14/11/2 4:45 AM, Miles Lane wrote:

[0.763902] [ INFO: suspicious RCU usage. ]
[0.763904] 3.18.0-rc2+ #25 Not tainted
[0.763906] ---
[0.763907] kernel/sched/core.c:7449 suspicious rcu_dereference_check() usage!
[0.763908]
other info that might help us debug this:

[0.763911]
rcu_scheduler_active = 1, debug_locks = 1
[0.763913] 2 locks held by swapper/0/0:
[0.763914]  #0:  (>pi_lock){..}, at: []
task_rq_lock+0x30/0xaa
[0.763923]  #1:  (>lock){-.}, at: []
task_rq_lock+0x52/0xaa
[0.763928]
stack backtrace:
[0.763930] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2+ #25
[0.763932] Hardware name: Dell Inc. Inspiron 5437/0DM6M9, BIOS A07
11/14/2013
[0.763934]  0001 81a03cf8 814fdde5
810723dc
[0.763937]  81a16580 81a03d28 8106312e
88021387
[0.763940]  88021ea12780 8802138a 
81a03d88
[0.763943] Call Trace:
[0.763948]  [] dump_stack+0x4f/0x7c
[0.763952]  [] ? console_unlock+0x35a/0x389
[0.763957]  [] lockdep_rcu_suspicious+0xfa/0x103
[0.763960]  [] sched_move_task+0xda/0x17b
[0.763963]  [] ? _raw_write_unlock_irq+0x28/0x48
[0.763966]  [] cpu_cgroup_fork+0x9/0xb
[0.763970]  [] cgroup_post_fork+0x8a/0x99
[0.763974]  [] copy_process+0x17ca/0x1835
[0.763977]  [] ? rest_init+0x130/0x130
[0.763981]  [] do_fork+0x87/0x23d
[0.763983]  [] ? free_reserved_area+0xf4/0x106
[0.763987]  [] kernel_thread+0x21/0x23
[0.763990]  [] rest_init+0x1e/0x130
[0.763994]  [] start_kernel+0x434/0x441
[0.763997]  [] x86_64_start_reservations+0x2a/0x2c
[0.763999]  [] x86_64_start_kernel+0xc8/0xcc

Gnu C  4.9.1
Gnu make   4.0
binutils   2.24.90.20141023
util-linux 2.25.2
mount  debug
module-init-tools  18
e2fsprogs  1.42.12
pcmciautils018
PPP2.4.5
Linux C Library2.19
Dynamic linker (ldd)   2.19
Procps 3.3.9
Net-tools  1.60
Kbd1.15.5
Sh-utils   8.23
Modules Loaded rfcomm bnep ipv6 ecb uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_core v4l2_common hid_multitouch videodev
usbhid media ath3k btusb bluetooth snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi intel_rapl
x86_pkg_temp_thermal intel_powerclamp dell_wmi coretemp ath9k
dell_laptop ath9k_common ath9k_hw microcode snd_soc_rt5640
snd_soc_rl6231 ath snd_hda_intel snd_soc_core snd_hda_controller
mac80211 snd_hda_codec snd_compress snd_hwdep snd_pcm_dmaengine
cfg80211 snd_pcm_oss rfkill snd_mixer_oss snd_pcm snd_seq_dummy
snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq
snd_seq_device snd_soc_sst_acpi snd_timer i2c_hid snd soundcore
xhci_pci mei_me ehci_pci xhci_hcd shpchp ehci_hcd ac mei lpc_ich wmi
fuse sr_mod cdrom r8169 mii sdhci_acpi sdhci mmc_core

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 3.18.0-rc2 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi
-fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9
-fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
CONFIG_KERNEL_BZIP2=y
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
#

Re: [PATCH 3.2 000/102] 3.2.64-rc1 review

2014-11-01 Thread Ben Hutchings

On Sat, 2014-11-01 at 16:29 -0700, Guenter Roeck wrote:
> On 11/01/2014 03:28 PM, Ben Hutchings wrote:
> > This is the start of the stable review cycle for the 3.2.64 release.
> > There are 102 patches in this series, which will be posted as responses
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Tue Nov 04 00:00:00 UTC 2014.
> > Anything received after that time might be too late.
> >
> 
> Build results:
>   total: 111 pass: 108 fail: 3
> Failed builds:
>   mips:allmodconfig
>   xtensa:defconfig
>   xtensa:allmodconfig
> 
> Qemu test results:
>   total: 20 pass: 20 fail: 0
> 
> Results are as expected.
> Details are available at http://server.roeck-us.net:8010/builders.

Thanks for checking.

Ben.

-- 
Ben Hutchings
Kids!  Bringing about Armageddon can be dangerous.  Do not attempt it in
your own home. - Terry Pratchett and Neil Gaiman, `Good Omens'



[PATCH] of/irq: Drop obsolete 'interrupts' vs 'interrupts-extended' text

2014-11-01 Thread Bjorn Helgaas

a9ecdc0fdc54 ("of/irq: Fix lookup to use 'interrupts-extended' property
first") updated the description to say that:

  - Both 'interrupts' and 'interrupts-extended' may be present
  - Software should prefer 'interrupts-extended'
  - Software that doesn't comprehend 'interrupts-extended' may use
'interrupts'

But there is still a paragraph at the end that prohibits having both and
says 'interrupts' should be preferred.

Remove the contradictory text.
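
For illustration only, the lookup order listed above can be sketched in C as
follows. This is a minimal model and not the kernel's of/irq code: struct
fake_node, enum irq_prop and pick_interrupt_property() are hypothetical names
invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device node: only records which properties exist. */
struct fake_node {
	bool has_interrupts;
	bool has_interrupts_extended;
};

enum irq_prop {
	IRQ_PROP_NONE,
	IRQ_PROP_INTERRUPTS,
	IRQ_PROP_INTERRUPTS_EXTENDED,
};

/*
 * Pick the property to parse. Software that understands
 * 'interrupts-extended' prefers it; other software uses 'interrupts'.
 * Having both properties present is not treated as an error.
 */
enum irq_prop pick_interrupt_property(const struct fake_node *np,
				      bool understands_extended)
{
	assert(np != NULL);
	if (understands_extended && np->has_interrupts_extended)
		return IRQ_PROP_INTERRUPTS_EXTENDED;
	if (np->has_interrupts)
		return IRQ_PROP_INTERRUPTS;
	return IRQ_PROP_NONE;
}
```

With both properties present, a caller that understands 'interrupts-extended'
gets IRQ_PROP_INTERRUPTS_EXTENDED, and a caller that does not gets
IRQ_PROP_INTERRUPTS.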

Fixes: a9ecdc0fdc54 ("of/irq: Fix lookup to use 'interrupts-extended' property first")
Signed-off-by: Bjorn Helgaas 
CC: sta...@vger.kernel.org  # v3.13+
---
 .../bindings/interrupt-controller/interrupts.txt   |4 
 1 file changed, 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/interrupt-controller/interrupts.txt b/Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
index ce6a1a072028..8a3c40829899 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
@@ -30,10 +30,6 @@ should only be used when a device has multiple interrupt parents.
   Example:
interrupts-extended = <&intc1 5 1>, <&intc2 1 0>;
 
-A device node may contain either "interrupts" or "interrupts-extended", but not
-both. If both properties are present, then the operating system should log an
-error and use only the data in "interrupts".
-
 2) Interrupt controller nodes
 -
 


[PATCH v3 1/4] input: alps: Do not try to parse data as 3 bytes packet when driver is out of sync

2014-11-01 Thread Pali Rohár

The 5th and 6th bytes of an ALPS trackstick V3 protocol packet match the
condition for the first byte of a 3-byte PS/2 packet. When the driver enters the
out-of-sync state while the ALPS trackstick is sending data, the driver matches
the 5th, 6th and following 1st bytes as a PS/2 packet.

This basically means that if the user is using the trackstick while the driver
is out of sync, the driver will never resync. Processing these bytes as 3-byte
PS/2 data causes a total mess (random cursor movements, random clicks) and makes
the trackstick unusable until the psmouse driver decides to do a full device
reset.

Many users have reported problems with ALPS devices on Dell Latitude E6440,
E6540 and E7440 laptops. For an unknown reason the ALPS device or the Dell EC
sends some invalid ALPS PS/2 bytes, which cause the driver to go out of sync. It
looks like the i8042 and psmouse/alps drivers always receive groups of 6-byte
packets, so no bytes are missing and no bytes were inserted between valid ones.

This patch does not fix the root of the problem with ALPS devices found in Dell
Latitude laptops, but it prevents an (invalid) subsequence of a 6-byte ALPS
packet from being processed as a 3-byte PS/2 packet when the driver is out of
sync.

So with this patch the trackstick input device does not report bogus data when
the driver is out of sync, and the trackstick should be usable on those machines.

It is also unknown which ALPS devices send 3-byte packets and why the ALPS
driver needs to handle bare PS/2 packets as well. According to git (plus the
historic tree from BitKeeper), the code for processing 3-byte bare PS/2 packets
has been there since the first version of alps.c (since 2.6.9-rc2).

We do not want to break older ALPS devices, and disabling the processing of bare
PS/2 packets only while the driver is out of sync should not break them.

Signed-off-by: Pali Rohár 
Tested-by: Pali Rohár 
Cc: sta...@vger.kernel.org
---
 drivers/input/mouse/alps.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index 2b0ae8c..a772745 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -1156,7 +1156,9 @@ static psmouse_ret_t alps_process_byte(struct psmouse *psmouse)
 {
struct alps_data *priv = psmouse->private;
 
-   if ((psmouse->packet[0] & 0xc8) == 0x08) { /* PS/2 packet */
+   /* FIXME: Could we receive bare PS/2 packets from DualPoint devices?? */
+   if (!psmouse->out_of_sync_cnt &&
+   (psmouse->packet[0] & 0xc8) == 0x08) { /* PS/2 packet */
if (psmouse->pktcnt == 3) {
alps_report_bare_ps2_packet(psmouse, psmouse->packet,
true);
-- 
1.7.9.5
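
As a reading aid, the condition added by the patch above can be modelled outside
the kernel as below. This is only a sketch: struct fake_psmouse and
treat_as_bare_ps2() are hypothetical stand-ins, and only the mask 0xc8, the
value 0x08 and the out_of_sync_cnt test are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct psmouse fields the check reads. */
struct fake_psmouse {
	unsigned char packet[8];
	unsigned int out_of_sync_cnt;
};

/*
 * True when the first byte should be handled as the start of a bare
 * 3-byte PS/2 packet: bit 3 set, bits 6 and 7 clear (mask 0xc8, value
 * 0x08), and the driver is not currently out of sync.
 */
bool treat_as_bare_ps2(const struct fake_psmouse *m)
{
	assert(m != NULL);
	return m->out_of_sync_cnt == 0 && (m->packet[0] & 0xc8) == 0x08;
}
```

A first byte of 0x08 is accepted while the driver is in sync, and the same byte
is rejected as soon as out_of_sync_cnt is non-zero.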


Re: [PATCH 1/3] input: alps: Reset mouse before identifying it

2014-11-01 Thread Pali Rohár

On Thursday 23 October 2014 17:44:04 Dmitry Torokhov wrote:
> On Sun, Oct 19, 2014 at 01:07:41PM +0200, Pali Rohár wrote:
> > On Wednesday 15 October 2014 20:22:56 Dmitry Torokhov wrote:
> > > On Wed, Oct 15, 2014 at 08:10:39PM +0200, Pali Rohár wrote:
> > > > On Wednesday 15 October 2014 20:00:11 Dmitry Torokhov 
wrote:
> > > > > On Wed, Oct 15, 2014 at 07:57:37PM +0200, Pali Rohár 
wrote:
> > > > > > On Wednesday 15 October 2014 19:43:15 Dmitry
> > > > > > Torokhov
> > 
> > wrote:
> > > > > > > On Wed, Oct 15, 2014 at 02:53:11PM +0200, Pali
> > > > > > > Rohár
> > 
> > wrote:
> > > > > > > > On Tuesday 14 October 2014 08:08:34 Dmitry
> > > > > > > > Torokhov
> > > > 
> > > > wrote:
> > > > > > > > > On Fri, Oct 03, 2014 at 11:47:59AM +0200, Hans
> > > > > > > > > de Goede
> > > > > > 
> > > > > > wrote:
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > Thanks for working on this!
> > > > > > > > > > 
> > > > > > > > > > On 10/03/2014 11:43 AM, Pali Rohár wrote:
> > > > > > > > > > > On some systems after starting computer
> > > > > > > > > > > function alps_identify() does not detect
> > > > > > > > > > > dual ALPS touchpad+trackstick device
> > > > > > > > > > > correctly and detect only touchpad.
> > > > > > > > > > > 
> > > > > > > > > > > Resetting ALPS device before identifiying
> > > > > > > > > > > it fixing this problem and both parts
> > > > > > > > > > > touchpad and trackstick are detected.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Pali Rohár
> > > > > > > > > > >  Tested-by: Pali
> > > > > > > > > > > Rohár 
> > > > > > > > > > 
> > > > > > > > > > Looks good and seems sensible:
> > > > > > > > > > 
> > > > > > > > > > Acked-by: Hans de Goede
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > *sigh* I am not really happy about this, as we
> > > > > > > > > making boot longer and longer for people
> > > > > > > > > without ALPS touchpads. It would be better if
> > > > > > > > > we only reset the mouse when we knew we are
> > > > > > > > > dealing with ALPS, and even better if we only
> > > > > > > > > reset it when we suspected that we missed
> > > > > > > > > trackstick. Any chance of doing this?
> > > > > > > > > 
> > > > > > > > > Thanks.
> > > > > > > > 
> > > > > > > > Dmitry, problem is that function check which
> > > > > > > > detecting trackstick does not working when I
> > > > > > > > start my laptop from power-off state and do not
> > > > > > > > reset PS/2 device. But detecting ALPS touchpad
> > > > > > > > looks like working. So if do not like this
> > > > > > > > idea, what about doing something like this in
> > > > > > > > alps_dectect function?
> > > > > > > > 
> > > > > > > > int alps_detect(...)
> > > > > > > > {
> > > > > > > > ...
> > > > > > > > /* detect if device is ALPS */
> > > > > > > > if (alps_identify(...) < 0)
> > > > > > > > return -1;
> > > > > > > > /* now we know that device is ALPS */
> > > > > > > > if (!(flags & ALPS_DUALPOINT)) {
> > > > > > > > /* reset it and identify again, maybe there is
> > > > > > > > trackstick */ psmouse_reset(...);
> > > > > > > > alps_identify(...);
> > > > > > > > }
> > > > > > > > ...
> > > > > > > > }
> > > > > > > > 
> > > > > > > > It will does not affect non ALPS devices
> > > > > > > > (because first identify call will fail), but
> > > > > > > > will affect ALPS devices without trackstick
> > > > > > > > (because identify will be called twice and
> > > > > > > > reset too).
> > > > > > > 
> > > > > > > I think this is a step in right direction. Do you
> > > > > > > know what exactly fails in alps_identify() on
> > > > > > > your box if you do not call psmouse_reset?
> > > > > > > 
> > > > > > > Thanks.
> > > > > > 
> > > > > > Yes, I know. It is failing in
> > > > > > alps_probe_trackstick_v3(). It calls
> > > > > > alps_command_mode_read_reg(...) and it returns 0
> > > > > > which means trackstick is not there.
> > > > > 
> > > > > OK, so can we try sticking psmouse_reset() there? This
> > > > > will limit the exposure of the new delay.
> > > > > 
> > > > > Thanks.
> > > > 
> > > > Sorry, but I think this is not safe. Function
> > > > psmouse_reset will reset device (set it to relative
> > > > mode, etc...) and before and after
> > > > alps_probe_trackstick_v3() are called other functions.
> > > > So it could break something else.
> > > 
> > > We might need to repeat bits of alps_identify() after
> > > resetting the mouse, you are right. It should still be
> > > doable though.
> > 
> > What about checking "E6 report" and if that pass reset
> > device and do full alps_identify? With check for "E6
> > report" we can filter probably all PS/2 devices which are
> > not ALPS.
> 
> Why don't you pull alps_probe_trackstick_v3() from
> alps_identify(), rename it into __alps_identify() and then
> have real alps_identify be:
> 
> static int alps_identify(struct psmouse *psmouse, struct
> alps_data *priv) {
>   int error;
> 
>   error = __alps_identify(psmouse, priv);
>   if (error)

Re: [PATCH 3.2 000/102] 3.2.64-rc1 review

2014-11-01 Thread Guenter Roeck


On 11/01/2014 03:28 PM, Ben Hutchings wrote:

This is the start of the stable review cycle for the 3.2.64 release.
There are 102 patches in this series, which will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Nov 04 00:00:00 UTC 2014.
Anything received after that time might be too late.



Build results:
total: 111 pass: 108 fail: 3
Failed builds:
mips:allmodconfig
xtensa:defconfig
xtensa:allmodconfig

Qemu test results:
total: 20 pass: 20 fail: 0

Results are as expected.
Details are available at http://server.roeck-us.net:8010/builders.

Guenter


[PATCH v3 3/4] input: alps: For protocol V3, do not process data when last packet's bit7 is set

2014-11-01 Thread Pali Rohár

Sometimes on Dell Latitude laptops the psmouse/alps driver receives invalid ALPS
protocol V3 packets with bit7 set in the last byte. This is most easily
reproduced on a Dell Latitude E6440 or E7440 with the lid closed, by pushing on
the cover above the touchpad.

If bit7 in the last packet byte is set, then it is not a valid ALPS packet. I was
told that ALPS devices never send these packets. It is not yet known what sends
those packets; it could be the Dell EC, a bug in the BIOS or a bug in the
touchpad firmware...

With this patch the alps driver does not process those invalid packets and drops
them with PSMOUSE_FULL_PACKET, so the psmouse driver does not enter the
out-of-sync state.

This patch fixes the problem where the psmouse driver keeps resetting the ALPS
device while the laptop lid is closed, because of invalid packets received in the
out-of-sync state.

Signed-off-by: Pali Rohár 
Tested-by: Pali Rohár 
Cc: sta...@vger.kernel.org
---
 drivers/input/mouse/alps.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index 7c47e97..e802d28 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -1181,6 +1181,16 @@ static psmouse_ret_t alps_process_byte(struct psmouse *psmouse)
return PSMOUSE_BAD_DATA;
}
 
+   if (priv->proto_version == ALPS_PROTO_V3 && psmouse->pktcnt == psmouse->pktsize) {
+   // For protocol V3, do not process data when last packet's bit7 is set
+   if (psmouse->packet[psmouse->pktcnt - 1] & 0x80) {
+   psmouse_dbg(psmouse, "v3 discard packet[%i] = %x\n",
+   psmouse->pktcnt - 1,
+   psmouse->packet[psmouse->pktcnt - 1]);
+   return PSMOUSE_FULL_PACKET;
+   }
+   }
+
/* Bytes 2 - pktsize should have 0 in the highest bit */
if ((priv->proto_version < ALPS_PROTO_V5) &&
psmouse->pktcnt >= 2 && psmouse->pktcnt <= psmouse->pktsize &&
-- 
1.7.9.5
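
The discard rule from the patch above can be tried in isolation with a small
sketch like the following. v3_discard_packet() is a hypothetical helper written
for this note; it only mirrors the "packet complete and bit7 of the last byte
set" test from the diff and does not model the rest of alps_process_byte().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True when a complete packet should be dropped because bit 7 (0x80)
 * of its last byte is set. Incomplete packets are never dropped here.
 */
bool v3_discard_packet(const unsigned char *packet, size_t pktcnt,
		       size_t pktsize)
{
	assert(packet != NULL);
	if (pktcnt == 0 || pktcnt != pktsize)
		return false;	/* packet not complete yet */
	return (packet[pktcnt - 1] & 0x80) != 0;
}
```

For a 6-byte packet, only the value of the 6th byte decides the outcome; the
helper returns false while fewer than 6 bytes have been collected.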


[PATCH v3 4/4] input: alps: Fix trackstick detection

2014-11-01 Thread Pali Rohár

On some laptops, after starting them from the off state (not after a reboot), the
function alps_probe_trackstick_v3() (called from alps_identify()) does not
detect the trackstick. To fix this problem we need to reset the device. But
alps_identify() is also called from alps_detect(), and we do not want to reset
the device in the detect function because that would slow down initialization of
all non-ALPS devices.

Current alps device init sequence is:
alps_detect() --> alps_identify() (trackstick not detected)
alps_init() --> psmouse_reset() --> alps_identify() (trackstick detected)

This patch moves initialization code between driver functions so that we can
remove the alps_identify() call from alps_detect(). This means that the
trackstick function alps_probe_trackstick_v3() will be called only from
alps_init(), and only after a device reset, so it will always return correct
information about trackstick presence. The code for identifying the protocol
version is moved to alps_init(), and because psmouse-base.c calls alps_detect()
and alps_init() consecutively, detection of both ALPS and non-ALPS devices will
not be broken.

First this patch moves code between functions:

 * Move calling function alps_probe_trackstick_v3() (for rushmore devices) from
   alps_identify() to alps_hw_init_rushmore_v3()

 * Move code for checking "E6 report" from alps_identify() to alps_detect()

 * Move code for setting correct device name string and model/protocol version
   from alps_detect() to alps_init(). To not break psmouse-base.c in function
   alps_detect() set only generic name "DualPoint TouchPad".

Next it removes alps_identify() from alps_detect() because it is not needed
anymore (code which use it was moved to alps_init()).

Finally, this patch fixes the remaining trackstick detection code for protocol
V3 devices. In alps_hw_init_v3() the ALPS_DUALPOINT flag is now removed from the
device if alps_probe_trackstick_v3() or alps_setup_trackstick_v3() returns
-ENODEV (which means the trackstick is not present).

Now trackstick detection should work, and alps_init() sets the correct name and
other properties for both input devices.

Side effect of this patch is also faster alps devices initialization because
function alps_identify() is called only once (from alps_init()).
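
The way the probe and setup results are combined in the patched init functions
can be sketched as below. This is a simplified model for illustration:
resolve_trackstick() and DUALPOINT_FLAG are hypothetical (the flag value is made
up and merely stands in for ALPS_DUALPOINT), and the probe and setup results are
passed in as plain integers instead of being read from hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DUALPOINT_FLAG 0x02u	/* illustrative value, stands in for ALPS_DUALPOINT */

/*
 * probe_ret and setup_ret are each 0, -ENODEV or -EIO.
 * Returns -EIO on an I/O failure and 0 otherwise. The dual-point flag
 * is cleared when either step reports -ENODEV (no trackstick).
 */
int resolve_trackstick(int probe_ret, int setup_ret, unsigned int *flags)
{
	int reg_val = probe_ret;

	assert(flags != NULL);
	if (reg_val == -EIO)
		return -EIO;

	if (reg_val == 0) {
		/* setup only runs when the probe found a trackstick */
		reg_val = setup_ret;
		if (reg_val == -EIO)
			return -EIO;
	}

	if (reg_val == -ENODEV)
		*flags &= ~DUALPOINT_FLAG;

	return 0;
}
```

The flag survives only when both steps return 0; a -ENODEV from either step
clears it without failing initialization.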

Signed-off-by: Pali Rohár 
Tested-by: Pali Rohár 
Cc: sta...@vger.kernel.org
---
 drivers/input/mouse/alps.c |   96 +---
 1 file changed, 64 insertions(+), 32 deletions(-)

diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index e802d28..04161b6 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -1732,6 +1732,7 @@ error:
 
 static int alps_hw_init_v3(struct psmouse *psmouse)
 {
+   struct alps_data *priv = psmouse->private;
struct ps2dev *ps2dev = &psmouse->ps2dev;
int reg_val;
unsigned char param[4];
@@ -1740,9 +1741,15 @@ static int alps_hw_init_v3(struct psmouse *psmouse)
if (reg_val == -EIO)
goto error;
 
-   if (reg_val == 0 &&
-   alps_setup_trackstick_v3(psmouse, ALPS_REG_BASE_PINNACLE) == -EIO)
-   goto error;
+   if (reg_val == 0) {
+   reg_val = alps_setup_trackstick_v3(psmouse,
+  ALPS_REG_BASE_PINNACLE);
+   if (reg_val == -EIO)
+   goto error;
+   }
+
+   if (reg_val == -ENODEV)
+   priv->flags &= ~ALPS_DUALPOINT;
 
if (alps_enter_command_mode(psmouse) ||
alps_absolute_mode_v3(psmouse)) {
@@ -1849,15 +1856,20 @@ static int alps_hw_init_rushmore_v3(struct psmouse *psmouse)
struct ps2dev *ps2dev = &psmouse->ps2dev;
int reg_val, ret = -1;
 
-   if (priv->flags & ALPS_DUALPOINT) {
+   reg_val = alps_probe_trackstick_v3(psmouse, ALPS_REG_BASE_RUSHMORE);
+   if (reg_val == -EIO)
+   goto error;
+
+   if (reg_val == 0) {
reg_val = alps_setup_trackstick_v3(psmouse,
   ALPS_REG_BASE_RUSHMORE);
if (reg_val == -EIO)
goto error;
-   if (reg_val == -ENODEV)
-   priv->flags &= ~ALPS_DUALPOINT;
}
 
+   if (reg_val == -ENODEV)
+   priv->flags &= ~ALPS_DUALPOINT;
+
if (alps_enter_command_mode(psmouse) ||
alps_command_mode_read_reg(psmouse, 0xc2d9) == -1 ||
alps_command_mode_write_reg(psmouse, 0xc2cb, 0x00))
@@ -2176,20 +2188,15 @@ static int alps_match_table(struct psmouse *psmouse, struct alps_data *priv,
 
 static int alps_identify(struct psmouse *psmouse, struct alps_data *priv)
 {
-   unsigned char e6[4], e7[4], ec[4];
+   unsigned char e7[4], ec[4];
+   int ret;
 
/*
 * First try "E6 report".
-* ALPS should return 0,0,10 or 0,0,100 if no buttons are pressed.
-* The bits 0-2 of the first byte will be 1s if some buttons

[PATCH v3 0/4] Fixes for ALPS driver

2014-11-01 Thread Pali Rohár

This patch series tries to fix problems with ALPS dualpoint devices on Dell
Latitude laptops which are probably caused by bugs in Dell BIOS, Dell EC or
in ALPS touchpad firmware itself.

Root of problems is yet unknown but at least this patch series could eliminate
reporting bogus data to userspace.

Reported bugs:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1258837
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1320022
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1272624
https://bugzilla.redhat.com/show_bug.cgi?id=1145954

Pali Rohár (4):
  input: alps: Do not try to parse data as 3 bytes packet when driver
is out of sync
  input: alps: Allow 2 invalid packets without resetting device
  input: alps: For protocol V3, do not process data when last packet's
bit7 is set
  input: alps: Fix trackstick detection

 drivers/input/mouse/alps.c |  113 +++-
 1 file changed, 80 insertions(+), 33 deletions(-)

-- 
1.7.9.5


[PATCH v3 2/4] input: alps: Allow 2 invalid packets without resetting device

2014-11-01 Thread Pali Rohár

On some Dell Latitude laptops the ALPS device or the Dell EC sends one invalid
byte in a 6-byte ALPS packet. In this case the psmouse driver enters the
out-of-sync state. It looks like all other bytes in the packets are valid and
the device is working properly, so there is no need to do a full device reset;
we just need to wait for a byte which matches the condition for the first byte
(start of packet). Because ALPS packets are bigger (6 or 8 bytes), the default
limit is too small.

This patch increases the number of invalid bytes which the psmouse driver can
drop before doing a full reset to the size of 2 ALPS packets.

Resetting ALPS devices takes some time, and while a reset is in progress on some
Dell laptops the touchpad, trackstick and also keyboard do not respond. So it is
better to do it only if really necessary.

Signed-off-by: Pali Rohár 
Tested-by: Pali Rohár 
Cc: sta...@vger.kernel.org
---
 drivers/input/mouse/alps.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index a772745..7c47e97 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -2391,6 +2391,9 @@ int alps_init(struct psmouse *psmouse)
/* We are having trouble resyncing ALPS touchpads so disable it for now */
psmouse->resync_time = 0;
 
+   /* Allow 2 invalid packets without resetting device */
+   psmouse->resetafter = psmouse->pktsize * 2;
+
return 0;
 
 init_fail:
-- 
1.7.9.5
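
To make the arithmetic of the patch above concrete, here is a small sketch.
alps_resetafter() reproduces the "pktsize * 2" assignment from the diff;
needs_full_reset() is a hypothetical, simplified model of how such a threshold
might be consumed. The real comparison is done in psmouse-base.c and is not
shown in this thread, so treat that part as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Threshold set by the patch: two full packets' worth of bytes. */
unsigned int alps_resetafter(unsigned int pktsize)
{
	assert(pktsize > 0);
	return pktsize * 2;
}

/*
 * Assumed model: a full reset is due once the number of dropped bytes
 * reaches the threshold; a threshold of 0 is taken to mean "never".
 */
bool needs_full_reset(unsigned int out_of_sync_cnt, unsigned int resetafter)
{
	return resetafter != 0 && out_of_sync_cnt >= resetafter;
}
```

With 6-byte packets the threshold is 12 bytes, and with 8-byte packets it is 16,
so a single bad byte inside one packet no longer comes close to the limit.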


Re: [PATCH 3.2 023/102] block: Fix dev_t minor allocation lifetime

2014-11-01 Thread Jens Axboe


On 2014-11-01 16:28, Ben Hutchings wrote:

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Keith Busch 

commit 2da78092dda13f1efd26edbbf99a567776913750 upstream.

Releases the dev_t minor when all references are closed to prevent
another device from acquiring the same major/minor.

Since the partition's release may be invoked from call_rcu's soft-irq
context, the ext_dev_idr's mutex had to be replaced with a spinlock so
as not to sleep.

Signed-off-by: Keith Busch 
Signed-off-by: Jens Axboe 
[bwh: Backported to 3.2:
  - Adjust filename
  - idr insertion API is different, and blk_alloc_devt() is preallocating
a node in a different way]


As I've noted for pretty much every stable branch so far, you have to 
backport commit 46f341ffcfb5 as well, if you backport this one.


--
Jens Axboe


Re: EFI-related general protection faults

2014-11-01 Thread Steven Noonan


On Sat, Nov 1, 2014 at 2:43 PM, Steven Noonan  wrote:
> On Sat, Nov 1, 2014 at 6:17 AM, Steven Noonan  wrote:
>> On Sat, Nov 1, 2014 at 6:00 AM, Steven Noonan  wrote:
>>> I've been getting general protection faults in EFI modules at boot time
>>> across several machines. I originally thought it was just an EFI quirk
>>> on one machine so I blacklisted the rtc-efi module (which was the
>>> offender at the time), but I've seen it elsewhere since. Once this
>>> happens, the system is only half-usable and needs to reboot. It's also
>>> sadly not 100% reproducible at every boot.
>>>
>>> From what I've observed, it only occurs at boot time when the various
>>> EFI modules are initializing. I haven't yet tested whether I can
>>> trigger it just by unloading/reloading EFI modules repeatedly, but seems
>>> like it'd be worth a shot.
>>>
>>> In two of the three traces below, it seems to happen while two EFI
>>> modules are loading at the same time (rtc_efi and efivars), so perhaps
>>> there's some common data initialization that's racy?
>>
>> Neat. If I do these in two separate shells simultaneously,
>>
>> # while true; do rmmod rtc_efi; modprobe rtc_efi; done
>> # while true; do rmmod efivars; modprobe efivars; done
>>
>> then it faults:
>>
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: rtc-efi rtc-efi: rtc core: registered rtc-efi 
>> as rtc1
>> Nov 01 06:10:04 osprey kernel: EFI Variables Facility v0.08 2004-May-17
>> Nov 01 06:10:04 osprey kernel: general protection fault:  [#1] SMP
>> Nov 01 06:10:04 osprey kernel: Modules linked in: rtc_efi(+) efivars(+) 
>> sch_sfq bridge stp llc it87 hwmon_vid joydev hid_generic ecb btusb 
>> sch_fq_codel bluetooth usbhid hid nls_cp437 vfat fat iTCO_wdt 
>> iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp 
>> crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul 
>> glue_helper ablk_helper i2c_i801 r8169 cryptd lpc_ich mfd_core mii fan 
>> thermal battery tpm_tis tpm evdev snd_hda_codec_realtek 
>> snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller 
>> snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore acpi_cpufreq 
>> processor usbip_host usbip_core msr vhost_scsi target_core_mod 
>> crct10dif_generic crct10dif_pclmul configfs vhost_net tun vhost macvtap 
>> macvlan kvm_intel kvm efivarfs ext4 crc16 jbd2 mbcache sd_mod crc_t10dif 
>> crct10dif_common
>> Nov 01 06:10:04 osprey kernel:  ahci libahci libata crc32c_intel ehci_pci 
>> xhci_hcd ehci_hcd scsi_mod usbcore usb_common i915 intel_gtt i2c_algo_bit 
>> video drm_kms_helper drm i2c_core e1000e ptp pps_core ipmi_poweroff 
>> ipmi_msghandler button [last unloaded: rtc_efi]
>> Nov 01 06:10:04 osprey kernel: CPU: 4 PID: 13264 Comm: modprobe Not tainted 
>> 3.17.2-1-ec2 #1
>> Nov 01 06:10:04 osprey kernel: Hardware name: GIGABYTE 
>> M4HM87P-00/M4HM87P-00, BIOS F5 06/23/2014
>> Nov 01 06:10:04 osprey kernel: task: 880401729d60 ti: 8803f869c000 
>> task.ti: 8803f869c000
>> Nov 01 06:10:04 osprey kernel: RIP: 0010:[]  
>> [] efi_call+0x8e/0x100
>> Nov 01 06:10:04 osprey kernel: RSP: 0018:8803f869f9b0  EFLAGS: 00010002
>> Nov 01 06:10:04 osprey kernel: RAX:  RBX: 8803f869fb60 
>> RCX: 
>> Nov 01 06:10:04 osprey kernel: RDX: 80020020 RSI: 8803f869fb60 
>> RDI: fffef0fe3990
>> Nov 01 06:10:04 osprey kernel: RBP: 8803f869fa80 R08: 8803f869fa90 
>> R09: 001e
>> Nov 01 06:10:04 osprey kernel: R10: fffef0ff7f58 R11: 8803f869f8c0 
>> R12: 0286
>> Nov 01 06:10:04 osprey kernel: R13: 8803f869fb61 R14: 8803f869fa90 
>> R15: a40cafd8
>> Nov 01 06:10:04 osprey kernel: FS:  7fdd75904700() 
>> GS:88041eb0() knlGS:
>> Nov 01 06:10:04 osprey kernel: CS:  0010 DS:  ES:  CR0: 
>> 80050033
>> Nov 01 06:10:04 osprey kernel: CR2: 7fdd7593a4e1 CR3: 0009a000 
>> CR4: 001407e0
>> Nov 01 06:10:04 osprey kernel: Stack:
>> Nov 01 06:10:04 osprey kernel:  8803f869fb60 8803f869fa80 
>> 8803f869fb60 fffef0fe3990
>> Nov 01 06:10:04 osprey kernel:  0286 8803f869fb60 
>> 8803f869fa58 80050033
>> Nov 01 06:10:04 osprey kernel:    
>>  00ff
>> Nov 01 06:10:04 osprey kernel: Call Trace:
>> Nov 01 06:10:04 osprey kernel:  [] ? 
>> virt_efi_get_wakeup_time+0x51/0x80
>> Nov 01 06:10:04 osprey kernel:  [] 0xa40cf302
>> Nov 01 06:10:04 osprey kernel:  [] ? 
>>

[PATCH 3.2 013/102] ata_piix: Add Device IDs for Intel 9 Series PCH

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: James Ralston 

commit 6cad1376954e591c3c41500c4e586e183e7ffe6d upstream.

This patch adds the IDE mode SATA Device IDs for the Intel 9 Series PCH.

Signed-off-by: James Ralston 
Signed-off-by: Tejun Heo 
Signed-off-by: Ben Hutchings 
---
 drivers/ata/ata_piix.c | 8 
 1 file changed, 8 insertions(+)

--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -362,6 +362,14 @@ static const struct pci_device_id piix_p
{ 0x8086, 0x0F21, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata_byt },
/* SATA Controller IDE (Coleto Creek) */
{ 0x8086, 0x23a6, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+   /* SATA Controller IDE (9 Series) */
+   { 0x8086, 0x8c88, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata_snb },
+   /* SATA Controller IDE (9 Series) */
+   { 0x8086, 0x8c89, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata_snb },
+   /* SATA Controller IDE (9 Series) */
+   { 0x8086, 0x8c80, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_snb },
+   /* SATA Controller IDE (9 Series) */
+   { 0x8086, 0x8c81, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_snb },
 
{ } /* terminate list */
 };


[PATCH 3.2 098/102] dm crypt: fix access beyond the end of allocated space

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Mikulas Patocka 

commit d49ec52ff6ddcda178fc2476a109cf1bd1fa19ed upstream.

The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.

This bug is very old (it dates back to 2.6.25 commit 3a7f6c990ad04 "dm
crypt: use async crypto").  However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two.  This bug
wasn't exposed until 3.17-rc1 commit 298a9fa08a ("dm crypt: use per-bio
data").  By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.

To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc.  The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.

The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block.  dm-crypt allocates the block with this size:
cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.

When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).

dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
structure.  However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector.  If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.

Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.

Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).
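
To make the arithmetic concrete, here is a minimal user-space sketch of the padding calculation described above. It is not the dm-crypt code: iv_offset(), iv_padding() and the block_align parameter (standing in for CRYPTO_MINALIGN) are hypothetical names, and the numbers in the note below are illustrative only.

```c
#include <stddef.h>

/*
 * Offset at which the IV actually starts when the end of the request
 * structure (end_offset) is rounded up to an (alignmask + 1) boundary.
 * alignmask must be a power of two minus one.
 */
size_t iv_offset(size_t end_offset, size_t alignmask)
{
	return (end_offset + alignmask) & ~alignmask;
}

/*
 * Extra bytes to allocate between the end of the request structure and
 * the IV.  block_align is an assumed stand-in for the alignment that the
 * allocator guarantees for the start of the block.
 */
size_t iv_padding(size_t end_offset, size_t alignmask, size_t block_align)
{
	if (alignmask < block_align) {
		/* Block start is aligned well enough: padding is exact. */
		return ((size_t)0 - end_offset) & alignmask;
	}
	/* Position of the IV is unknown in advance: assume the worst case. */
	return alignmask;
}
```

With end_offset = 100, alignmask = 15 and a 16-byte IV, the IV starts at offset 112 and ends at 128, while the old size of 100 + 16 = 116 bytes stops short of that. Adding the 12 bytes returned by iv_padding() gives 128 bytes, which covers it.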

Signed-off-by: Mikulas Patocka 
Signed-off-by: Ben Hutchings 
---
 drivers/md/dm-crypt.c |   20 
 1 file changed, 16 insertions(+), 4 deletions(-)

--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1565,6 +1565,7 @@ static int crypt_ctr(struct dm_target *t
unsigned int key_size, opt_params;
unsigned long long tmpll;
int ret;
+   size_t iv_size_padding;
struct dm_arg_set as;
const char *opt_string;
 
@@ -1600,12 +1601,23 @@ static int crypt_ctr(struct dm_target *t
 
cc->dmreq_start = sizeof(struct ablkcipher_request);
cc->dmreq_start += crypto_ablkcipher_reqsize(any_tfm(cc));
-   cc->dmreq_start = ALIGN(cc->dmreq_start, crypto_tfm_ctx_alignment());
-   cc->dmreq_start += crypto_ablkcipher_alignmask(any_tfm(cc)) &
-  ~(crypto_tfm_ctx_alignment() - 1);
+   cc->dmreq_start = ALIGN(cc->dmreq_start, __alignof__(struct dm_crypt_request));
+
+   if (crypto_ablkcipher_alignmask(any_tfm(cc)) < CRYPTO_MINALIGN) {
+   /* Allocate the padding exactly */
+   iv_size_padding = -(cc->dmreq_start + sizeof(struct dm_crypt_request))
+   & crypto_ablkcipher_alignmask(any_tfm(cc));
+   } else {
+   /*
+* If the cipher requires greater alignment than kmalloc
+* alignment, we don't know the exact position of the
+* initialization vector. We must assume worst case.
+*/
+   iv_size_padding = crypto_ablkcipher_alignmask(any_tfm(cc));
+   }
 
cc->req_pool = mempool_create_kmalloc_pool(MIN_IOS, cc->dmreq_start +
-   sizeof(struct dm_crypt_request) + cc->iv_size);
+   sizeof(struct dm_crypt_request) + iv_size_padding + cc->iv_size);
if (!cc->req_pool) {
ti->error = "Cannot allocate crypt request mempool";
goto bad;


[PATCH 3.2 010/102] regmap: Fix handling of volatile registers for format_write() chips

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Mark Brown 

commit 5844a8b9d98ec11ce1d77610daacf3f0a0e14715 upstream.

A previous over-zealous factorisation of code means that we only treat
registers as volatile if they are readable. For most devices this is fine
since normally most registers can be read and volatility implies
readability but for format_write() devices where there is no readback from
the hardware and we use volatility to mean simply uncacheability this means
that we end up treating all registers as cacheable.

A bigger refactoring of the code to clarify this is in order but as a fix
make a minimal change and only check readability when checking volatility
if there is no format_write() operation defined for the device.
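
The interaction between the two checks can be modelled in a few lines. This is only a sketch under assumptions: struct toy_map and the toy_* functions are hypothetical simplifications, not the regmap API, and the default of "volatile unless a callback says otherwise" is an assumption of the model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a register map. */
struct toy_map {
	bool has_format_write;               /* write-only device, no readback */
	bool (*readable_reg)(unsigned int);  /* optional callback */
	bool (*volatile_reg)(unsigned int);  /* optional callback */
};

bool toy_readable(const struct toy_map *map, unsigned int reg)
{
	if (map->has_format_write)
		return false;                /* nothing can be read back */
	if (map->readable_reg)
		return map->readable_reg(reg);
	return true;
}

bool toy_volatile(const struct toy_map *map, unsigned int reg)
{
	/* Only let readability veto volatility when readback exists at all. */
	if (!map->has_format_write && !toy_readable(map, reg))
		return false;
	if (map->volatile_reg)
		return map->volatile_reg(reg);
	return true;                         /* assumed default for this model */
}

/* Example callback: only register 5 is volatile. */
bool toy_reg5_is_volatile(unsigned int reg)
{
	return reg == 5;
}
```

For a write-only map whose volatile callback is toy_reg5_is_volatile, register 5 is unreadable yet still reported as volatile (uncacheable), which is the behaviour the fix restores.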

Signed-off-by: Mark Brown 
Tested-by: Lars-Peter Clausen 
Signed-off-by: Ben Hutchings 
---
 drivers/base/regmap/regmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -47,7 +47,7 @@ bool regmap_readable(struct regmap *map,
 
 bool regmap_volatile(struct regmap *map, unsigned int reg)
 {
-   if (!regmap_readable(map, reg))
+   if (!map->format.format_write && !regmap_readable(map, reg))
return false;
 
if (map->volatile_reg)


[PATCH 3.2 088/102] kvm: vmx: handle invvpid vm exit gracefully

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Petr Matousek 

commit a642fc305053cc1c6e47e4f4df327895747ab485 upstream.

On systems with invvpid instruction support (corresponding bit in
IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid
causes vm exit, which is currently not handled and results in
propagation of unknown exit to userspace.

Fix this by installing an invvpid vm exit handler.

This is CVE-2014-3646.
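
For readers unfamiliar with the dispatch scheme, the following is a rough user-space model of an exit-handler table. All toy_* names and struct toy_vcpu are hypothetical; only the exit reason number 53 and the idea of answering with an undefined-opcode exception come from the patch.

```c
#include <stddef.h>

enum { TOY_EXIT_INVEPT = 50, TOY_EXIT_INVVPID = 53, TOY_EXIT_MAX = 56 };
enum { TOY_UD_VECTOR = 6 };          /* #UD, undefined opcode */

struct toy_vcpu {
	int pending_exception;       /* -1 means none queued */
};

typedef int (*toy_handler)(struct toy_vcpu *);

/* Reflect the instruction back to the guest as an undefined opcode. */
static int toy_handle_invvpid(struct toy_vcpu *vcpu)
{
	vcpu->pending_exception = TOY_UD_VECTOR;
	return 1;                    /* handled, guest may resume */
}

static const toy_handler toy_handlers[TOY_EXIT_MAX] = {
	[TOY_EXIT_INVVPID] = toy_handle_invvpid,
};

/*
 * Returns 1 if the exit was handled and the guest can resume, 0 if no
 * handler is installed and the exit would be reported as unknown.
 */
int toy_handle_exit(struct toy_vcpu *vcpu, unsigned int reason)
{
	if (reason < TOY_EXIT_MAX && toy_handlers[reason])
		return toy_handlers[reason](vcpu);
	return 0;
}
```

In this model an exit with no table entry falls through to the "unknown" path, which corresponds to the unhandled case described above.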

Signed-off-by: Petr Matousek 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - Adjust filename
 - Drop inapplicable change to exit reason string array]
Signed-off-by: Ben Hutchings 
---
 arch/x86/include/asm/vmx.h  | 2 ++
 arch/x86/kvm/vmx.c  | 9 -
 2 files changed, 10 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -280,6 +280,7 @@ enum vmcs_field {
 #define EXIT_REASON_EPT_VIOLATION   48
 #define EXIT_REASON_EPT_MISCONFIG   49
 #define EXIT_REASON_INVEPT  50
+#define EXIT_REASON_INVVPID 53
 #define EXIT_REASON_WBINVD 54
 #define EXIT_REASON_XSETBV 55
 
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5620,6 +5620,12 @@ static int handle_invept(struct kvm_vcpu
return 1;
 }
 
+static int handle_invvpid(struct kvm_vcpu *vcpu)
+{
+   kvm_queue_exception(vcpu, UD_VECTOR);
+   return 1;
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -5662,6 +5668,7 @@ static int (*kvm_vmx_exit_handlers[])(st
[EXIT_REASON_MWAIT_INSTRUCTION]   = handle_invalid_op,
[EXIT_REASON_MONITOR_INSTRUCTION] = handle_invalid_op,
[EXIT_REASON_INVEPT]  = handle_invept,
+   [EXIT_REASON_INVVPID] = handle_invvpid,
 };
 
 static const int kvm_vmx_max_exit_handlers =
@@ -5846,7 +5853,7 @@ static bool nested_vmx_exit_handled(stru
case EXIT_REASON_VMPTRST: case EXIT_REASON_VMREAD:
case EXIT_REASON_VMRESUME: case EXIT_REASON_VMWRITE:
case EXIT_REASON_VMOFF: case EXIT_REASON_VMON:
-   case EXIT_REASON_INVEPT:
+   case EXIT_REASON_INVEPT: case EXIT_REASON_INVVPID:
/*
 * VMX instructions trap unconditionally. This allows L1 to
 * emulate them for its L2 guest, i.e., allows 3-level nesting!


[PATCH 3.2 080/102] ipv4: avoid parallel route cache gc executions

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marcelo Ricardo Leitner 

When rt_intern_hash() has to deal with neighbour cache overflowing,
it triggers the route cache garbage collector in an attempt to free
some references on neighbour entries.

Such call cannot be done async but should also not run in parallel with
an already-running one, so that they don't collapse fighting over the
hash lock entries.

This patch thus blocks parallel executions with spinlocks:
- A call from worker and from rt_intern_hash() are not the same, and
cannot be merged, thus they will wait for each other on rt_gc_lock.
- Calls to gc from rt_intern_hash() may happen in parallel but we must
wait for it to finish in order to try again. This dedup and
synchronization is then performed by the locking just before calling
__do_rt_garbage_collect().
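
The "run it yourself or wait for whoever is running it" pattern from the second bullet can be sketched in user space with C11 atomics. This is an analogy only: gc_busy, gc_runs, toy_collect() and toy_gc_once() are hypothetical names, and a plain atomic flag stands in for the kernel spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool gc_busy;          /* false: no collector running */
static int gc_runs;                  /* only touched while gc_busy is held */

static void toy_collect(void)
{
	gc_runs++;                   /* placeholder for the real work */
}

/*
 * Returns 1 if this caller ran the collector, 0 if it only waited for a
 * collector started by someone else.  Either way the caller can retry
 * its own operation afterwards.
 */
int toy_gc_once(void)
{
	bool expected = false;

	/* Analogue of spin_trylock(): succeed only if nobody holds it. */
	if (atomic_compare_exchange_strong(&gc_busy, &expected, true)) {
		toy_collect();
		atomic_store(&gc_busy, false);
		return 1;
	}

	/* Analogue of spin_unlock_wait(): wait without taking the lock. */
	while (atomic_load(&gc_busy))
		;
	return 0;
}
```

The point of the second branch is that a waiter does not repeat the collection; it simply lets the in-flight run finish and then retries.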

Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Ben Hutchings 
---
 net/ipv4/route.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -988,6 +988,7 @@ static void __do_rt_garbage_collect(int
static unsigned long last_gc;
static int rover;
static int equilibrium;
+   static DEFINE_SPINLOCK(rt_gc_lock);
struct rtable *rth;
struct rtable __rcu **rthp;
unsigned long now = jiffies;
@@ -999,6 +1000,8 @@ static void __do_rt_garbage_collect(int
 * do not make it too frequently.
 */
 
+   spin_lock(&rt_gc_lock);
+
RT_CACHE_STAT_INC(gc_total);
 
if (now - last_gc < min_interval &&
@@ -1091,7 +1094,7 @@ static void __do_rt_garbage_collect(int
if (net_ratelimit())
printk(KERN_WARNING "dst cache overflow\n");
RT_CACHE_STAT_INC(gc_dst_overflow);
-   return;
+   goto out;
 
 work_done:
expire += min_interval;
@@ -1099,7 +1102,8 @@ work_done:
dst_entries_get_fast(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh ||
dst_entries_get_slow(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh)
expire = ip_rt_gc_timeout;
-out:   return;
+out:
+   spin_unlock(&rt_gc_lock);
 }
 
 static void __rt_garbage_collect(struct work_struct *w)
@@ -1174,7 +1178,7 @@ static struct rtable *rt_intern_hash(uns
unsigned long   now;
u32 min_score;
int chain_length;
-   int attempts = !in_softirq();
+   int attempts = 1;
 
 restart:
chain_length = 0;
@@ -1311,8 +1315,15 @@ restart:
   can be released. Try to shrink route cache,
   it is most likely it holds some neighbour records.
 */
-   if (attempts-- > 0) {
-   __do_rt_garbage_collect(1, 0);
+   if (!in_softirq() && attempts-- > 0) {
+   static DEFINE_SPINLOCK(lock);
+
+   if (spin_trylock(&lock)) {
+   __do_rt_garbage_collect(1, 0);
+   spin_unlock(&lock);
+   } else {
+   spin_unlock_wait(&lock);
+   }
goto restart;
}
 


[PATCH 3.2 018/102] drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Thomas Hellstrom 

commit f01ea0c3d9db536c64d47922716d8b3b8f21d850 upstream.

The code waiting for fifo idle was incorrect and could possibly spin
forever under certain circumstances.

Signed-off-by: Thomas Hellstrom 
Reported-by: Mark Sheldon 
Reviewed-by: Jakob Bornecrantz 
Reviewed-by: Mark Sheldon 
Signed-off-by: Ben Hutchings 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
@@ -163,8 +163,9 @@ void vmw_fifo_release(struct vmw_private
 
mutex_lock(&dev_priv->hw_mutex);
 
+   vmw_write(dev_priv, SVGA_REG_SYNC, SVGA_SYNC_GENERIC);
while (vmw_read(dev_priv, SVGA_REG_BUSY) != 0)
-   vmw_write(dev_priv, SVGA_REG_SYNC, SVGA_SYNC_GENERIC);
+   ;
 
dev_priv->last_read_seqno = ioread32(fifo_mem + SVGA_FIFO_FENCE);
 


[PATCH 3.2 084/102] ipv6: reuse ip6_frag_id from ip6_ufo_append_data

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Hannes Frederic Sowa 

commit 916e4cf46d0204806c062c8c6c4d1f633852c5b6 upstream.

Currently we generate a new fragmentation id on UFO segmentation. It
is pretty hairy to identify the correct net namespace and dst there.
Especially tunnels use IFF_XMIT_DST_RELEASE and thus have no skb_dst
available at all.

This causes unreliable or very predictable ipv6 fragmentation id
generation during segmentation.

Luckily we already have pregenerated the ip6_frag_id in
ip6_ufo_append_data and can use it here.

Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: adjust filename, indentation]
Signed-off-by: Ben Hutchings 
---
 net/ipv6/udp_offload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1362,7 +1362,7 @@ static struct sk_buff *udp6_ufo_fragment
fptr = (struct frag_hdr *)(skb_network_header(skb) + unfrag_ip6hlen);
fptr->nexthdr = nexthdr;
fptr->reserved = 0;
-   ipv6_select_ident(fptr, (struct rt6_info *)skb_dst(skb));
+   fptr->identification = skb_shinfo(skb)->ip6_frag_id;
 
/* Fragment the skb. ipv6 header and the remaining fields of the
 * fragment header are updated in ipv6_gso_segment()


[PATCH 3.2 006/102] KVM: s390: Fix user triggerable bug in dead code

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Christian Borntraeger 

commit 614a80e474b227cace52fd6e3c790554db8a396e upstream.

In the early days, we had some special handling for the
KVM_EXIT_S390_SIEIC exit, but this was gone in 2009 with commit
d7b0b5eb3000 (KVM: s390: Make psw available on all exits, not
just a subset).

Now this switch statement is just a sanity check for userspace
not messing with the kvm_run structure. Unfortunately, this
allows userspace to trigger a kernel BUG. Let's just remove
this switch statement.

Signed-off-by: Christian Borntraeger 
Reviewed-by: Cornelia Huck 
Reviewed-by: David Hildenbrand 
Signed-off-by: Ben Hutchings 
---
 arch/s390/kvm/kvm-s390.c | 13 -
 1 file changed, 13 deletions(-)

--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -516,16 +516,6 @@ rerun_vcpu:
 
BUG_ON(vcpu->kvm->arch.float_int.local_int[vcpu->vcpu_id] == NULL);
 
-   switch (kvm_run->exit_reason) {
-   case KVM_EXIT_S390_SIEIC:
-   case KVM_EXIT_UNKNOWN:
-   case KVM_EXIT_INTR:
-   case KVM_EXIT_S390_RESET:
-   break;
-   default:
-   BUG();
-   }
-
vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask;
vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr;
 


[PATCH 3.2 019/102] xfs: don't dirty buffers beyond EOF

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Dave Chinner 

commit 22e757a49cf010703fcb9c9b4ef793248c39b0c2 upstream.

generic/263 is failing fsx at this point with a page spanning
EOF that cannot be invalidated. The operations are:

1190 mapwrite   0x52c00 thru    0x5e569 (0xb96a bytes)
1191 mapread    0x5c000 thru    0x5d636 (0x1637 bytes)
1192 write      0x5b600 thru    0x771ff (0x1bc00 bytes)

where 1190 extends EOF from 0x54000 to 0x5e569. When the direct IO
write attempts to invalidate the cached page over this range, it
fails with -EBUSY and so any attempt to do page invalidation fails.

The real question is this: Why can't that page be invalidated after
it has been written to disk and cleaned?

Well, there's data on the first two buffers in the page (1k block
size, 4k page), but the third buffer on the page (i.e. beyond EOF)
is failing drop_buffers because it's bh->b_state == 0x3, which is
BH_Uptodate | BH_Dirty.  IOWs, there's dirty buffers beyond EOF. Say
what?

OK, set_buffer_dirty() is called on all buffers from
__set_page_buffers_dirty(), regardless of whether the buffer is
beyond EOF or not, which means that when we get to ->writepage,
we have buffers marked dirty beyond EOF that we need to clean.
So, we need to implement our own .set_page_dirty method that
doesn't dirty buffers beyond EOF.

This is messy because the buffer code is not meant to be shared
and it has interesting locking issues on the buffer dirty bits.
So just copy and paste it and then modify it to suit what we need.

Note: the solutions the other filesystems and generic block code use
of marking the buffers clean in ->writepage does not work for XFS.
It still leaves dirty buffers beyond EOF and invalidations still
fail. Hence rather than play whack-a-mole, this patch simply
prevents those buffers from being dirtied in the first place.
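
The core of the rule, "only dirty the blocks of a page that start before EOF", can be illustrated with a small user-space model. It is a sketch, not the XFS code: toy_dirty_below_eof() and its parameters are hypothetical, and the page is reduced to an array of per-block dirty flags.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Walk the blocks of one page and set dirty[i] only for blocks whose
 * starting file offset lies below file_size (EOF).  blkbits is log2 of
 * the block size.  Returns the number of blocks marked dirty.
 */
size_t toy_dirty_below_eof(long long page_offset, long long file_size,
			   unsigned int blkbits, bool *dirty, size_t nblocks)
{
	long long offset = page_offset;
	size_t marked = 0;
	size_t i;

	for (i = 0; i < nblocks; i++) {
		if (offset < file_size) {
			dirty[i] = true;
			marked++;
		}
		offset += 1LL << blkbits;
	}
	return marked;
}
```

Taking the case from the description (1k blocks, 4k page, EOF at 0x5e569, page starting at 0x5e000), the first two blocks start below EOF and get dirtied, while the third and fourth, at 0x5e800 and 0x5ec00, are left clean.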

Signed-off-by: Dave Chinner 
Reviewed-by: Brian Foster 
Signed-off-by: Dave Chinner 
Signed-off-by: Ben Hutchings 
---
 fs/xfs/xfs_aops.c | 61 +++
 1 file changed, 61 insertions(+)

--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -1448,11 +1448,72 @@ xfs_vm_readpages(
return mpage_readpages(mapping, pages, nr_pages, xfs_get_blocks);
 }
 
+/*
+ * This is basically a copy of __set_page_dirty_buffers() with one
+ * small tweak: buffers beyond EOF do not get marked dirty. If we mark them
+ * dirty, we'll never be able to clean them because we don't write buffers
+ * beyond EOF, and that means we can't invalidate pages that span EOF
+ * that have been marked dirty. Further, the dirty state can leak into
+ * the file interior if the file is extended, resulting in all sorts of
+ * bad things happening as the state does not match the underlying data.
+ *
+ * XXX: this really indicates that bufferheads in XFS need to die. Warts like
+ * this only exist because of bufferheads and how the generic code manages them.
+ */
+STATIC int
+xfs_vm_set_page_dirty(
+   struct page *page)
+{
+   struct address_space*mapping = page->mapping;
+   struct inode*inode = mapping->host;
+   loff_t  end_offset;
+   loff_t  offset;
+   int newly_dirty;
+
+   if (unlikely(!mapping))
+   return !TestSetPageDirty(page);
+
+   end_offset = i_size_read(inode);
+   offset = page_offset(page);
+
+   spin_lock(&mapping->private_lock);
+   if (page_has_buffers(page)) {
+   struct buffer_head *head = page_buffers(page);
+   struct buffer_head *bh = head;
+
+   do {
+   if (offset < end_offset)
+   set_buffer_dirty(bh);
+   bh = bh->b_this_page;
+   offset += 1 << inode->i_blkbits;
+   } while (bh != head);
+   }
+   newly_dirty = !TestSetPageDirty(page);
+   spin_unlock(&mapping->private_lock);
+
+   if (newly_dirty) {
+   /* sigh - __set_page_dirty() is static, so copy it here, too */
+   unsigned long flags;
+
+   spin_lock_irqsave(&mapping->tree_lock, flags);
+   if (page->mapping) {/* Race with truncate? */
+   WARN_ON_ONCE(!PageUptodate(page));
+   account_page_dirtied(page, mapping);
+   radix_tree_tag_set(&mapping->page_tree,
+   page_index(page), PAGECACHE_TAG_DIRTY);
+   }
+   spin_unlock_irqrestore(&mapping->tree_lock, flags);
+   __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
+   }
+   return newly_dirty;
+}
+
 const struct address_space_operations xfs_address_space_operations = {
.readpage   = xfs_vm_readpage,
.readpages  = xfs_vm_readpages,
.writepage  = xfs_vm_writepage,
.writepages =

[PATCH 3.2 087/102] nEPT: Nested INVEPT

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Nadav Har'El 

commit bfd0a56b90005f8c8a004baf407ad90045c2b11e upstream.

If we let L1 use EPT, we should probably also support the INVEPT instruction.

In our current nested EPT implementation, when L1 changes its EPT table
for L2 (i.e., EPT12), L0 modifies the shadow EPT table (EPT02), and in
the course of this modification already calls INVEPT. But if last level
of shadow page is unsync not all L1's changes to EPT12 are intercepted,
which means roots need to be synced when L1 calls INVEPT. Global INVEPT
should not be different since roots are synced by kvm_mmu_load() each
time EPTP02 changes.

Reviewed-by: Xiao Guangrong 
Signed-off-by: Nadav Har'El 
Signed-off-by: Jun Nakajima 
Signed-off-by: Xinhao Xu 
Signed-off-by: Yang Zhang 
Signed-off-by: Gleb Natapov 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - Adjust context, filename
 - Add definition of nested_ept_get_cr3(), added upstream by commit
   155a97a3d7c7 ("nEPT: MMU context for nested EPT")]
Signed-off-by: Ben Hutchings 
---
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -279,6 +279,7 @@ enum vmcs_field {
 #define EXIT_REASON_APIC_ACCESS 44
 #define EXIT_REASON_EPT_VIOLATION   48
 #define EXIT_REASON_EPT_MISCONFIG   49
+#define EXIT_REASON_INVEPT  50
 #define EXIT_REASON_WBINVD 54
 #define EXIT_REASON_XSETBV 55
 
@@ -397,6 +398,7 @@ enum vmcs_field {
 #define VMX_EPT_EXTENT_INDIVIDUAL_ADDR 0
 #define VMX_EPT_EXTENT_CONTEXT 1
 #define VMX_EPT_EXTENT_GLOBAL  2
+#define VMX_EPT_EXTENT_SHIFT   24
 
 #define VMX_EPT_EXECUTE_ONLY_BIT   (1ull)
 #define VMX_EPT_PAGE_WALK_4_BIT(1ull << 6)
@@ -404,6 +406,7 @@ enum vmcs_field {
 #define VMX_EPTP_WB_BIT(1ull << 14)
 #define VMX_EPT_2MB_PAGE_BIT   (1ull << 16)
 #define VMX_EPT_1GB_PAGE_BIT   (1ull << 17)
+#define VMX_EPT_INVEPT_BIT (1ull << 20)
 #define VMX_EPT_EXTENT_INDIVIDUAL_BIT  (1ull << 24)
 #define VMX_EPT_EXTENT_CONTEXT_BIT (1ull << 25)
 #define VMX_EPT_EXTENT_GLOBAL_BIT  (1ull << 26)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2869,6 +2869,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu
mmu_sync_roots(vcpu);
spin_unlock(&vcpu->kvm->mmu_lock);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_sync_roots);
 
 static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t vaddr,
  u32 access, struct x86_exception *exception)
@@ -3131,6 +3132,7 @@ void kvm_mmu_flush_tlb(struct kvm_vcpu *
++vcpu->stat.tlb_flush;
kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_flush_tlb);
 
 static void paging_new_cr3(struct kvm_vcpu *vcpu)
 {
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -602,6 +602,7 @@ static void nested_release_page_clean(st
kvm_release_page_clean(page);
 }
 
+static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu);
 static u64 construct_eptp(unsigned long root_hpa);
 static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
@@ -1899,6 +1900,7 @@ static u32 nested_vmx_secondary_ctls_low
 static u32 nested_vmx_pinbased_ctls_low, nested_vmx_pinbased_ctls_high;
 static u32 nested_vmx_exit_ctls_low, nested_vmx_exit_ctls_high;
 static u32 nested_vmx_entry_ctls_low, nested_vmx_entry_ctls_high;
+static u32 nested_vmx_ept_caps;
 static __init void nested_vmx_setup_ctls_msrs(void)
 {
/*
@@ -5550,6 +5552,74 @@ static int handle_vmptrst(struct kvm_vcp
return 1;
 }
 
+/* Emulate the INVEPT instruction */
+static int handle_invept(struct kvm_vcpu *vcpu)
+{
+   u32 vmx_instruction_info, types;
+   unsigned long type;
+   gva_t gva;
+   struct x86_exception e;
+   struct {
+   u64 eptp, gpa;
+   } operand;
+   u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
+
+   if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
+   !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
+   kvm_queue_exception(vcpu, UD_VECTOR);
+   return 1;
+   }
+
+   if (!nested_vmx_check_permission(vcpu))
+   return 1;
+
+   if (!kvm_read_cr0_bits(vcpu, X86_CR0_PE)) {
+   kvm_queue_exception(vcpu, UD_VECTOR);
+   return 1;
+   }
+
+   vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
+   type = kvm_register_read(vcpu, (vmx_instruction_info >> 28) & 0xf);
+
+   types = (nested_vmx_ept_caps >> VMX_EPT_EXTENT_SHIFT) & 6;
+
+   if (!(types & (1UL << type))) {
+   nested_vmx_failValid(vcpu,
+   VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+   return 1;
+   }
+
+   /* According to the Intel VMX instruction reference, the

[PATCH 3.2 009/102] regmap: if format_write is used, declare all registers as "unreadable"

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Wolfram Sang 

commit 4191f19792bf91267835eb090d970e9cd6277a65 upstream.

Using .format_write means we have a custom function to write to the
chip, but not to read back. Also, mark registers as "not precious" and
"not volatile" which is implicit because we cannot read them. Make those
functions use 'regmap_readable' to reuse the checks done there.

Signed-off-by: Wolfram Sang 
Signed-off-by: Mark Brown 
Signed-off-by: Ben Hutchings 
---
 drivers/base/regmap/regmap.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -36,6 +36,9 @@ bool regmap_readable(struct regmap *map,
if (map->max_register && reg > map->max_register)
return false;
 
+   if (map->format.format_write)
+   return false;
+
if (map->readable_reg)
return map->readable_reg(map->dev, reg);
 
@@ -44,7 +47,7 @@ bool regmap_readable(struct regmap *map,
 
 bool regmap_volatile(struct regmap *map, unsigned int reg)
 {
-   if (map->max_register && reg > map->max_register)
+   if (!regmap_readable(map, reg))
return false;
 
if (map->volatile_reg)
@@ -55,7 +58,7 @@ bool regmap_volatile(struct regmap *map,
 
 bool regmap_precious(struct regmap *map, unsigned int reg)
 {
-   if (map->max_register && reg > map->max_register)
+   if (!regmap_readable(map, reg))
return false;
 
if (map->precious_reg)


[PATCH 3.2 100/102] nfsd: Fix ACL null pointer deref

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Sergio Gelato 

BugLink: http://bugs.launchpad.net/bugs/1348670

Fix regression introduced in pre-3.14 kernels by cherry-picking
aa07c713ecfc0522916f3cd57ac628ea6127c0ec
(NFSD: Call ->set_acl with a NULL ACL structure if no entries).

The affected code was removed in 3.14 by commit
4ac7249ea5a0ceef9f8269f63f33cc873c3fac61
(nfsd: use get_acl and ->set_acl).
The ->set_acl methods are already able to cope with a NULL argument.
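
The shape of the fix is an early return before the first dereference. A minimal stand-alone sketch follows. The toy_* names and struct layouts are hypothetical, and the size passed for a non-NULL ACL is a placeholder rather than the real xattr encoding.

```c
#include <stddef.h>

struct toy_acl {
	unsigned int a_count;        /* number of entries */
};

/* Records what the last "setxattr" call was given. */
struct toy_store {
	const void *value;
	size_t size;
	int calls;
};

static int toy_setxattr(struct toy_store *s, const void *value, size_t size)
{
	s->value = value;
	s->size = size;
	s->calls++;
	return 0;
}

int toy_set_acl(struct toy_store *s, const struct toy_acl *pacl)
{
	/* No ACL: store an empty value instead of touching pacl->a_count. */
	if (!pacl)
		return toy_setxattr(s, NULL, 0);

	/* Placeholder size; the real code encodes the entries into a buffer. */
	return toy_setxattr(s, pacl, sizeof(*pacl) + pacl->a_count);
}
```

Without the guard, the first thing the function does with a NULL pacl is read pacl->a_count, which is the dereference this patch avoids.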

Signed-off-by: Sergio Gelato 
[bwh: Rewrite the subject]
Signed-off-by: Ben Hutchings 
---
 fs/nfsd/vfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 446dc01..fc208e4 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -508,6 +508,9 @@
char *buf = NULL;
int error = 0;
 
+   if (!pacl)
+   return vfs_setxattr(dentry, key, NULL, 0, 0);
+
buflen = posix_acl_xattr_size(pacl->a_count);
buf = kmalloc(buflen, GFP_KERNEL);
error = -ENOMEM;


[PATCH 3.2 008/102] MIPS: ZBOOT: add missing include

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Aurelien Jarno 

commit 29593fd5a8149462ed6fad0d522234facdaee6c8 upstream.

Commit dc4d7b37 (MIPS: ZBOOT: gather string functions into string.c)
moved the string related functions into a separate file, which might
cause the following build error, depending on the configuration:

| CC  arch/mips/boot/compressed/decompress.o
| In file included from linux/arch/mips/boot/compressed/../../../../lib/decompress_unxz.c:234:0,
|  from linux/arch/mips/boot/compressed/decompress.c:67:
| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c: In function 'fill_temp':
| linux/arch/mips/boot/compressed/../../../../lib/xz/xz_dec_stream.c:162:2: error: implicit declaration of function 'memcpy' [-Werror=implicit-function-declaration]
| cc1: some warnings being treated as errors
| linux/scripts/Makefile.build:308: recipe for target 'arch/mips/boot/compressed/decompress.o' failed
| make[6]: *** [arch/mips/boot/compressed/decompress.o] Error 1
| linux/arch/mips/Makefile:308: recipe for target 'vmlinuz' failed

It does not fail with the standard configuration, as when
CONFIG_DYNAMIC_DEBUG is not enabled <linux/string.h> gets included in
include/linux/dynamic_debug.h. There might be other ways for it to
get indirectly included.

We can't add the include directly in xz_dec_stream.c as some
architectures might want to use a different version for the boot/
directory (see for example arch/x86/boot/string.h).

Signed-off-by: Aurelien Jarno 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7420/
Signed-off-by: Ralf Baechle 
Signed-off-by: Ben Hutchings 
---
 arch/mips/boot/compressed/decompress.c | 1 +
 1 file changed, 1 insertion(+)

--- a/arch/mips/boot/compressed/decompress.c
+++ b/arch/mips/boot/compressed/decompress.c
@@ -13,6 +13,7 @@
 
 #include <linux/types.h>
 #include <linux/kernel.h>
+#include <linux/string.h>
 
 #include <asm/addrspace.h>
 


[PATCH 3.2 016/102] USB: sierra: avoid CDC class functions on "68A3" devices

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Bjørn Mork 

commit 049255f51644c1105775af228396d187402a5934 upstream.

Sierra Wireless Direct IP devices using the 68A3 product ID
can be configured for modes including a CDC ECM class function.
The known example uses interface numbers 12 and 13 for the ECM
control and data interfaces respectively, consistent with CDC
MBIM function interface numbering on other Sierra devices.

It seems cleaner to restrict this driver to the ff/ff/ff
vendor specific interfaces rather than increasing the already
long interface number blacklist.  This should be more future
proof if Sierra adds more class functions using interface
numbers not yet in the blacklist.
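
The difference between matching on the device alone and matching on device plus interface class can be shown with a small model. This is a sketch with hypothetical types (struct toy_id, struct toy_intf, toy_match()); it does not use the USB core's matching code, and the ECM class triple 02/06/00 is given from memory of the CDC specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* What an interface on the bus reports. */
struct toy_intf {
	uint16_t vid, pid;
	uint8_t cls, subcls, proto;
};

/* One entry of a driver's ID table. */
struct toy_id {
	uint16_t vid, pid;
	bool match_intf;             /* also compare class/subclass/protocol */
	uint8_t cls, subcls, proto;
};

bool toy_match(const struct toy_id *id, const struct toy_intf *intf)
{
	if (id->vid != intf->vid || id->pid != intf->pid)
		return false;
	if (!id->match_intf)
		return true;         /* device-only entry claims every interface */
	return id->cls == intf->cls &&
	       id->subcls == intf->subcls &&
	       id->proto == intf->proto;
}
```

A device-only entry for 1199:68A3 claims a CDC ECM control interface (02/06/00) as readily as a vendor-specific one, whereas an entry restricted to ff/ff/ff leaves the ECM interface to its class driver without needing a blacklist entry for its interface number.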

Signed-off-by: Bjørn Mork 
Signed-off-by: Johan Hovold 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/usb/serial/sierra.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -296,14 +296,16 @@ static const struct usb_device_id id_tab
{ USB_DEVICE(0x1199, 0x68A2),   /* Sierra Wireless MC77xx in QMI mode */
  .driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
},
-   { USB_DEVICE(0x1199, 0x68A3),   /* Sierra Wireless Direct IP modems */
+   /* Sierra Wireless Direct IP modems */
+   { USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x68A3, 0xFF, 0xFF, 0xFF),
  .driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
},
/* AT Direct IP LTE modems */
{ USB_DEVICE_AND_INTERFACE_INFO(0x0F3D, 0x68AA, 0xFF, 0xFF, 0xFF),
  .driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
},
-   { USB_DEVICE(0x0f3d, 0x68A3),   /* Airprime/Sierra Wireless Direct IP modems */
+   /* Airprime/Sierra Wireless Direct IP modems */
+   { USB_DEVICE_AND_INTERFACE_INFO(0x0F3D, 0x68A3, 0xFF, 0xFF, 0xFF),
  .driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
},
 


[PATCH 3.2 081/102] ipv4: disable bh while doing route gc

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marcelo Ricardo Leitner 

Further tests revealed that moving the garbage collector to a work
queue and protecting it with a spinlock may leave the system prone to
soft lockups if bottom half gets very busy.

It was reproduced with a set of firewall rules that REJECTed packets. If
the NIC bottom half handler ends up running on the same CPU that is
running the garbage collector on a very large cache, the garbage
collector will not be able to do its job due to the amount of work
needed for handling the REJECTs and also won't reschedule.

The fix is to disable bottom half during the garbage collecting, as it
already was in the first place (most calls to it came from softirqs).

Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Hannes Frederic Sowa 
Acked-by: David S. Miller 
Signed-off-by: Ben Hutchings 
---
 net/ipv4/route.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1000,7 +1000,7 @@ static void __do_rt_garbage_collect(int
 * do not make it too frequently.
 */
 
-   spin_lock(_gc_lock);
+   spin_lock_bh(_gc_lock);
 
RT_CACHE_STAT_INC(gc_total);
 
@@ -1103,7 +1103,7 @@ work_done:
dst_entries_get_slow(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh)
expire = ip_rt_gc_timeout;
 out:
-   spin_unlock(_gc_lock);
+   spin_unlock_bh(_gc_lock);
 }
 
 static void __rt_garbage_collect(struct work_struct *w)


[PATCH 3.2 075/102] mm: migrate: Close race between migration completion and mprotect

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Mel Gorman 

commit d3cb8bf6081b8b7a2dabb1264fe968fd870fa595 upstream.

A migration entry is marked as write if pte_write was true at the time the
entry was created. The VMA protections are not double checked when migration
entries are being removed as mprotect marks write-migration-entries as
read. It means that potentially we take a spurious fault to mark PTEs write
again but it's straight-forward. However, there is a race between write
migrations being marked read and migrations finishing. This potentially
allows a PTE to be writable when it should be read-only. Close this race by
double checking the VMA permissions using maybe_mkwrite when migration
completes.
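
The difference between the two helpers can be illustrated with a small
userspace sketch. This is not kernel code: the toy_* types, flag values and
function names are hypothetical stand-ins, and only the idea (grant the
write bit when the VMA still permits writes, otherwise leave the PTE
read-only) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types and flag bits. */
#define TOY_VM_WRITE  0x2u   /* VMA currently permits writes */
#define TOY_PTE_WRITE 0x1u   /* PTE write bit */

typedef unsigned int toy_pte_t;
struct toy_vma { unsigned int vm_flags; };

/* Unconditional: what the old code did for write migration entries. */
static toy_pte_t toy_pte_mkwrite(toy_pte_t pte)
{
    return pte | TOY_PTE_WRITE;
}

/* Conditional: only grant write access if the VMA still allows it. */
static toy_pte_t toy_maybe_mkwrite(toy_pte_t pte, const struct toy_vma *vma)
{
    assert(vma != NULL);
    if (vma->vm_flags & TOY_VM_WRITE)
        pte = toy_pte_mkwrite(pte);
    return pte;
}

/* Rebuild a PTE when migration completes. */
static toy_pte_t toy_restore_pte(bool write_entry, const struct toy_vma *vma)
{
    toy_pte_t pte = 0;

    if (write_entry)
        pte = toy_maybe_mkwrite(pte, vma);
    return pte;
}
```

With a write migration entry and a VMA that lost its write permission in the
meantime, toy_restore_pte() returns a read-only PTE, which is the behaviour
the patch aims for.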

[torva...@linux-foundation.org: use maybe_mkwrite]
Signed-off-by: Mel Gorman 
Acked-by: Rik van Riel 
Signed-off-by: Linus Torvalds 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 mm/migrate.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -141,8 +141,11 @@ static int remove_migration_pte(struct p
 
get_page(new);
pte = pte_mkold(mk_pte(new, vma->vm_page_prot));
+
+   /* Recheck VMA as permissions can change since migration started  */
if (is_write_migration_entry(entry))
-   pte = pte_mkwrite(pte);
+   pte = maybe_mkwrite(pte, vma);
+
 #ifdef CONFIG_HUGETLB_PAGE
if (PageHuge(new))
pte = pte_mkhuge(pte);


[PATCH 3.2 086/102] KVM: x86: Improve thread safety in pit

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Andy Honig 

commit 2febc839133280d5a5e8e1179c94ea674489dae2 upstream.

There's a race condition in the PIT emulation code in KVM.  In
__kvm_migrate_pit_timer the pit_timer object is accessed without
synchronization.  If the race condition occurs at the wrong time this
can crash the host kernel.

This fixes CVE-2014-3611.

Signed-off-by: Andrew Honig 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 arch/x86/kvm/i8254.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -264,8 +264,10 @@ void __kvm_migrate_pit_timer(struct kvm_
return;
 
timer = >pit_state.pit_timer.timer;
+   mutex_lock(>pit_state.lock);
if (hrtimer_cancel(timer))
hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+   mutex_unlock(>pit_state.lock);
 }
 
 static void destroy_pit_timer(struct kvm_pit *pit)


[PATCH 3.2 092/102] KVM: x86: use new CS.RPL as CPL during task switch

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Paolo Bonzini 

commit 2356aaeb2f58f491679dc0c38bc3f6dbe54e7ded upstream.

During task switch, all of CS.DPL, CS.RPL, SS.DPL must match (in addition
to all the other requirements) and will be the new CPL.  So far this
worked by carefully setting the CS selector and flag before doing the
task switch; setting CS.selector will already change the CPL.

However, this will not work once we get the CPL from SS.DPL, because
then you will have to set the full segment descriptor cache to change
the CPL.  ctxt->ops->cpl(ctxt) will then return the old CPL during the
task switch, and the check that SS.DPL == CPL will fail.

Temporarily assume that the CPL comes from CS.RPL during task switch
to a protected-mode task.  This is the same approach used in QEMU's
emulation code, which (until version 2.0) manually tracks the CPL.
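
As a rough illustration of "CPL comes from CS.RPL": the RPL is the low two
bits of a segment selector, so the privilege level assumed for the incoming
task can be read straight off the new CS selector. The helper names below
are hypothetical and the snippet is a userspace sketch, not the emulator
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of a segment selector hold the RPL. */
static unsigned selector_rpl(uint16_t selector)
{
    return selector & 3u;
}

/* Selectors 0000-0003 (GDT index 0) are treated as null selectors. */
static bool selector_is_null(uint16_t selector)
{
    return (selector & ~0x3u) == 0;
}

/* CPL assumed for the incoming task: taken from the new CS selector. */
static unsigned task_switch_cpl(uint16_t new_cs)
{
    unsigned cpl = selector_rpl(new_cs);

    assert(cpl <= 3);
    return cpl;
}
```

For example, a CS selector of 0x23 yields CPL 3 and 0x10 yields CPL 0.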

Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - Adjust context
 - load_state_from_tss32() does not support VM86 mode]
Signed-off-by: Ben Hutchings 
---
 arch/x86/kvm/emulate.c | 60 +++---
 1 file changed, 33 insertions(+), 27 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1233,11 +1233,11 @@ static int write_segment_descriptor(stru
 }
 
 /* Does not support long mode */
-static int load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
-  u16 selector, int seg)
+static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+u16 selector, int seg, u8 cpl)
 {
struct desc_struct seg_desc;
-   u8 dpl, rpl, cpl;
+   u8 dpl, rpl;
unsigned err_vec = GP_VECTOR;
u32 err_code = 0;
bool null_selector = !(selector & ~0x3); /* 0000-0003 are null */
@@ -1286,7 +1286,6 @@ static int load_segment_descriptor(struc
 
rpl = selector & 3;
dpl = seg_desc.dpl;
-   cpl = ctxt->ops->cpl(ctxt);
 
switch (seg) {
case VCPU_SREG_SS:
@@ -1349,6 +1348,13 @@ exception:
return X86EMUL_PROPAGATE_FAULT;
 }
 
+static int load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
+  u16 selector, int seg)
+{
+   u8 cpl = ctxt->ops->cpl(ctxt);
+   return __load_segment_descriptor(ctxt, selector, seg, cpl);
+}
+
 static void write_register_operand(struct operand *op)
 {
/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
@@ -2213,6 +2219,7 @@ static int load_state_from_tss16(struct
 struct tss_segment_16 *tss)
 {
int ret;
+   u8 cpl;
 
ctxt->_eip = tss->ip;
ctxt->eflags = tss->flag | 2;
@@ -2235,23 +2242,25 @@ static int load_state_from_tss16(struct
set_segment_selector(ctxt, tss->ss, VCPU_SREG_SS);
set_segment_selector(ctxt, tss->ds, VCPU_SREG_DS);
 
+   cpl = tss->cs & 3;
+
/*
 * Now load segment descriptors. If fault happenes at this stage
 * it is handled in a context of new task
 */
-   ret = load_segment_descriptor(ctxt, tss->ldt, VCPU_SREG_LDTR);
+   ret = __load_segment_descriptor(ctxt, tss->ldt, VCPU_SREG_LDTR, cpl);
if (ret != X86EMUL_CONTINUE)
return ret;
-   ret = load_segment_descriptor(ctxt, tss->es, VCPU_SREG_ES);
+   ret = __load_segment_descriptor(ctxt, tss->es, VCPU_SREG_ES, cpl);
if (ret != X86EMUL_CONTINUE)
return ret;
-   ret = load_segment_descriptor(ctxt, tss->cs, VCPU_SREG_CS);
+   ret = __load_segment_descriptor(ctxt, tss->cs, VCPU_SREG_CS, cpl);
if (ret != X86EMUL_CONTINUE)
return ret;
-   ret = load_segment_descriptor(ctxt, tss->ss, VCPU_SREG_SS);
+   ret = __load_segment_descriptor(ctxt, tss->ss, VCPU_SREG_SS, cpl);
if (ret != X86EMUL_CONTINUE)
return ret;
-   ret = load_segment_descriptor(ctxt, tss->ds, VCPU_SREG_DS);
+   ret = __load_segment_descriptor(ctxt, tss->ds, VCPU_SREG_DS, cpl);
if (ret != X86EMUL_CONTINUE)
return ret;
 
@@ -2330,6 +2339,7 @@ static int load_state_from_tss32(struct
 struct tss_segment_32 *tss)
 {
int ret;
+   u8 cpl;
 
if (ctxt->ops->set_cr(ctxt, 3, tss->cr3))
return emulate_gp(ctxt, 0);
@@ -2346,7 +2356,8 @@ static int load_state_from_tss32(struct
 
/*
 * SDM says that segment selectors are loaded before segment
-* descriptors
+* descriptors.  This is important because CPL checks will
+* use CS.RPL.
 */
set_segment_selector(ctxt, tss->ldt_selector, VCPU_SREG_LDTR);
set_segment_selector(ctxt, tss->es, VCPU_SREG_ES);
@@ -2356,29 +2367,31 @@ static int load_state_from_tss32(struct
set_segment_selector(ctxt, tss->fs, VCPU_SREG_FS);
set_segment_selector(ctxt, tss->gs,

[PATCH 3.2 001/102] regulatory: add NUL to alpha2

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Eliad Peller 

commit a5fe8e7695dc3f547e955ad2b662e3e72969e506 upstream.

alpha2 is defined as a 2-char array, but is used in multiple
places as a string (e.g. with nla_put_string calls), which
might leak kernel data.

Solve it by simply adding an extra char for the NUL
terminator, making such operations safe.
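
A small userspace sketch of why the extra byte matters. The struct names
are hypothetical and only model the alpha2 field; the point is that a
two-byte array holding "US" has no terminator inside its own bounds, so any
string function would keep reading into whatever follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if buf[0..size) contains a NUL, i.e. string functions stay in bounds. */
static bool has_terminator(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}

struct regdom_old { char alpha2[2]; };  /* hypothetical: pre-patch layout */
struct regdom_new { char alpha2[3]; };  /* hypothetical: post-patch layout */

static bool old_layout_is_string_safe(void)
{
    struct regdom_old rd = { { 'U', 'S' } };

    return has_terminator(rd.alpha2, sizeof(rd.alpha2));
}

static bool new_layout_is_string_safe(void)
{
    struct regdom_new rd = { { 'U', 'S' } };  /* third byte is zero-initialised */

    return has_terminator(rd.alpha2, sizeof(rd.alpha2));
}
```

The old layout reports no terminator within the field, while the new one
always has room for it as long as the third byte stays zero.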

Signed-off-by: Eliad Peller 
Signed-off-by: Johannes Berg 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 include/net/regulatory.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/net/regulatory.h
+++ b/include/net/regulatory.h
@@ -92,7 +92,7 @@ struct ieee80211_reg_rule {
 
 struct ieee80211_regdomain {
u32 n_reg_rules;
-   char alpha2[2];
+   char alpha2[3];
struct ieee80211_reg_rule reg_rules[];
 };
 


[PATCH 3.2 102/102] ring-buffer: Fix infinite spin in reading buffer

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: "Steven Rostedt (Red Hat)" 

commit 24607f114fd14f2f37e3e0cb3d47bce96e81e848 upstream.

Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"
fixed one bug but in the process caused another one. The reset is to
update the header page, but that fix also changed the way the cached
reads were updated. The cache reads are used to test if an iterator
needs to be updated or not.

A ring buffer iterator, when created, disables writes to the ring buffer
but does not stop other readers or consuming reads from happening.
Although all readers are synchronized via a lock, they are only
synchronized when in the ring buffer functions. Those functions may
be called by any number of readers. The iterator continues down when
it's not interrupted by a consuming reader. If a consuming read
occurs, the iterator starts from the beginning of the buffer.

The way the iterator sees that a consuming read has happened since
its last read is by checking the reader "cache". The cache holds the
last counts of the read and the reader page itself.

Commit 651e22f2701b changed what was saved by the cache_read when
the rb_iter_reset() occurred, making the iterator never match the cache.
Then if the iterator calls rb_iter_reset(), it will go into an
infinite loop by checking if the cache doesn't match, doing the reset
and retrying, just to see that the cache still doesn't match! Which
should never happen as the reset is supposed to set the cache to the
current value and there are locks that keep a consuming reader from
having access to the data.

Fixes: 651e22f2701b "ring-buffer: Always reset iterator to reader page"
Signed-off-by: Steven Rostedt 
Signed-off-by: Ben Hutchings 
---
 kernel/trace/ring_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2847,7 +2847,7 @@ static void rb_iter_reset(struct ring_bu
iter->head = cpu_buffer->reader_page->read;
 
iter->cache_reader_page = iter->head_page;
-   iter->cache_read = iter->head;
+   iter->cache_read = cpu_buffer->read;
 
if (iter->head)
iter->read_stamp = cpu_buffer->read_stamp;


[PATCH 3.2 079/102] ipv4: move route garbage collector to work queue

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marcelo Ricardo Leitner 

Currently the route garbage collector gets called by dst_alloc() if it
has more entries than the threshold. But it's an expensive call that
doesn't really need to be done at that point.

Another issue with the current approach is that it allows running the garbage
collector with the same start parameters on multiple CPUs at once, which
is not optimal. A system may even soft lockup if the cache is big enough
as the garbage collectors will be fighting over the hash lock entries.

This patch thus moves the garbage collector to run asynchronously on a
work queue, much like how rt_expire_check runs.

There is one condition left that allows multiple executions, which is
handled by the next patch.

Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Ben Hutchings 
---
 net/ipv4/route.c | 43 +--
 1 file changed, 29 insertions(+), 14 deletions(-)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -151,6 +151,9 @@ static void  ipv4_link_failure(struct s
 static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu);
 static int rt_garbage_collect(struct dst_ops *ops);
 
+static void __rt_garbage_collect(struct work_struct *w);
+static DECLARE_WORK(rt_gc_worker, __rt_garbage_collect);
+
 static void ipv4_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
int how)
 {
@@ -979,7 +982,7 @@ static void rt_emergency_hash_rebuild(st
and when load increases it reduces to limit cache size.
  */
 
-static int rt_garbage_collect(struct dst_ops *ops)
+static void __do_rt_garbage_collect(int elasticity, int min_interval)
 {
static unsigned long expire = RT_GC_TIMEOUT;
static unsigned long last_gc;
@@ -998,7 +1001,7 @@ static int rt_garbage_collect(struct dst
 
RT_CACHE_STAT_INC(gc_total);
 
-   if (now - last_gc < ip_rt_gc_min_interval &&
+   if (now - last_gc < min_interval &&
entries < ip_rt_max_size) {
RT_CACHE_STAT_INC(gc_ignored);
goto out;
@@ -1006,7 +1009,7 @@ static int rt_garbage_collect(struct dst
 
entries = dst_entries_get_slow(&ipv4_dst_ops);
/* Calculate number of entries, which we want to expire now. */
-   goal = entries - (ip_rt_gc_elasticity << rt_hash_log);
+   goal = entries - (elasticity << rt_hash_log);
if (goal <= 0) {
if (equilibrium < ipv4_dst_ops.gc_thresh)
equilibrium = ipv4_dst_ops.gc_thresh;
@@ -1023,7 +1026,7 @@ static int rt_garbage_collect(struct dst
equilibrium = entries - goal;
}
 
-   if (now - last_gc >= ip_rt_gc_min_interval)
+   if (now - last_gc >= min_interval)
last_gc = now;
 
if (goal <= 0) {
@@ -1088,15 +1091,33 @@ static int rt_garbage_collect(struct dst
if (net_ratelimit())
printk(KERN_WARNING "dst cache overflow\n");
RT_CACHE_STAT_INC(gc_dst_overflow);
-   return 1;
+   return;
 
 work_done:
-   expire += ip_rt_gc_min_interval;
+   expire += min_interval;
if (expire > ip_rt_gc_timeout ||
dst_entries_get_fast(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh ||
dst_entries_get_slow(&ipv4_dst_ops) < ipv4_dst_ops.gc_thresh)
expire = ip_rt_gc_timeout;
-out:   return 0;
+out:   return;
+}
+
+static void __rt_garbage_collect(struct work_struct *w)
+{
+   __do_rt_garbage_collect(ip_rt_gc_elasticity, ip_rt_gc_min_interval);
+}
+
+static int rt_garbage_collect(struct dst_ops *ops)
+{
+   if (!work_pending(&rt_gc_worker))
+   schedule_work(&rt_gc_worker);
+
+   if (dst_entries_get_fast(&ipv4_dst_ops) >= ip_rt_max_size ||
+   dst_entries_get_slow(&ipv4_dst_ops) >= ip_rt_max_size) {
+   RT_CACHE_STAT_INC(gc_dst_overflow);
+   return 1;
+   }
+   return 0;
 }
 
 /*
@@ -1291,13 +1312,7 @@ restart:
   it is most likely it holds some neighbour records.
 */
if (attempts-- > 0) {
-   int saved_elasticity = ip_rt_gc_elasticity;
-   int saved_int = ip_rt_gc_min_interval;
-   ip_rt_gc_elasticity = 1;
-   ip_rt_gc_min_interval   = 0;
-   rt_garbage_collect(&ipv4_dst_ops);
-   ip_rt_gc_min_interval   = saved_int;
-   ip_rt_gc_elasticity = saved_elasticity;
+   __do_rt_garbage_collect(1, 0);
goto restart;
}
 


[PATCH 3.2 101/102] ipvs: avoid netns exit crash on ip_vs_conn_drop_conntrack

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Julian Anastasov 

commit 2627b7e15c5064ddd5e578e4efd948d48d531a3f upstream.

commit 8f4e0a18682d91 ("IPVS netns exit causes crash in conntrack")
added a second ip_vs_conn_drop_conntrack call instead of just adding
the needed check. As a result, the first call can still cause a
crash on netns exit. Remove it.

Signed-off-by: Julian Anastasov 
Signed-off-by: Hans Schillstrom 
Signed-off-by: Simon Horman 
Signed-off-by: Ben Hutchings 
---
 net/netfilter/ipvs/ip_vs_conn.c |1 -
 1 file changed, 1 deletion(-)

--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -777,7 +777,6 @@ static void ip_vs_conn_expire(unsigned l
ip_vs_control_del(cp);
 
if (cp->flags & IP_VS_CONN_F_NFCT) {
-   ip_vs_conn_drop_conntrack(cp);
/* Do not access conntracks during subsys cleanup
 * because nf_conntrack_find_get can not be used after
 * conntrack cleanup for the net.


[PATCH 3.2 099/102] ext2: Fix fs corruption in ext2_get_xip_mem()

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Jan Kara 

commit 7ba3ec5749ddb61f79f7be17b5fd7720eebc52de upstream.

Commit 8e3dffc651cb "Ext2: mark inode dirty after the function
dquot_free_block_nodirty is called" unveiled a bug in __ext2_get_block()
called from ext2_get_xip_mem(). That function called ext2_get_block()
mistakenly asking it to map 0 blocks while 1 was intended. Before the
above mentioned commit things worked out fine by luck but after that commit
we started returning that we allocated 0 blocks while we in fact
allocated 1 block and thus allocation was looping until all blocks in
the filesystem were exhausted.

Fix the problem by properly asking for one block and also add an assertion
in ext2_get_blocks() to catch similar problems.
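
A minimal sketch of the arithmetic involved, under the assumption that the
number of blocks to map is derived from the buffer head's b_size shifted
right by the inode's block-size bits. The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks a request of b_size bytes covers, given log2(block size). */
static unsigned long blocks_requested(size_t b_size, unsigned blkbits)
{
    assert(blkbits < 32);
    return (unsigned long)(b_size >> blkbits);
}
```

A zeroed buffer head (b_size == 0) therefore asks for 0 blocks, whereas
setting b_size to 1 << blkbits asks for exactly one, which is what the
one-line change in xip.c does.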

Reported-and-tested-by: Andiry Xu 
Signed-off-by: Jan Kara 
Signed-off-by: Ben Hutchings 
---
 fs/ext2/inode.c | 2 ++
 fs/ext2/xip.c   | 1 +
 2 files changed, 3 insertions(+)

--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -619,6 +619,8 @@ static int ext2_get_blocks(struct inode
int count = 0;
ext2_fsblk_t first_block = 0;
 
+   BUG_ON(maxblocks == 0);
+
depth = ext2_block_to_path(inode,iblock,offsets,_to_boundary);
 
if (depth == 0)
--- a/fs/ext2/xip.c
+++ b/fs/ext2/xip.c
@@ -37,6 +37,7 @@ __ext2_get_block(struct inode *inode, pg
int rc;
 
memset(&tmp, 0, sizeof(struct buffer_head));
+   tmp.b_size = 1 << inode->i_blkbits;
rc = ext2_get_block(inode, pgoff, &tmp, create);
*result = tmp.b_blocknr;
 


[PATCH 3.2 014/102] Revert "iwlwifi: dvm: don't enable CTS to self"

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Emmanuel Grumbach 

commit f47f46d7b09cf1d09e4b44b6cc4dd7d68a08028c upstream.

This reverts commit 43d826ca5979927131685cc2092c7ce862cb91cd.

This commit caused packet loss.

Signed-off-by: Emmanuel Grumbach 
[bwh: Backported to 3.2:
 - Adjust filename
 - Condition for RXON_FLG_SELF_CTS_EN in iwlagn_commit_rxon() was different]
Signed-off-by: Ben Hutchings 
---
 drivers/net/wireless/iwlwifi/iwl-agn-rxon.c | 13 -
 1 file changed, 13 deletions(-)

--- a/drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
@@ -440,6 +440,14 @@ int iwlagn_commit_rxon(struct iwl_priv *
/* always get timestamp with Rx frame */
ctx->staging.flags |= RXON_FLG_TSF2HOST_MSK;
 
+   /*
+* force CTS-to-self frames protection if RTS-CTS is not preferred
+* one aggregation protection method
+*/
+   if (!(priv->cfg->ht_params &&
+ priv->cfg->ht_params->use_rts_for_aggregation))
+   ctx->staging.flags |= RXON_FLG_SELF_CTS_EN;
+
if ((ctx->vif && ctx->vif->bss_conf.use_short_slot) ||
!(ctx->staging.flags & RXON_FLG_BAND_24G_MSK))
ctx->staging.flags |= RXON_FLG_SHORT_SLOT_MSK;
@@ -872,6 +880,11 @@ void iwlagn_bss_info_changed(struct ieee
else
ctx->staging.flags &= ~RXON_FLG_TGG_PROTECT_MSK;
 
+   if (bss_conf->use_cts_prot)
+   ctx->staging.flags |= RXON_FLG_SELF_CTS_EN;
+   else
+   ctx->staging.flags &= ~RXON_FLG_SELF_CTS_EN;
+
memcpy(ctx->staging.bssid_addr, bss_conf->bssid, ETH_ALEN);
 
if (vif->type == NL80211_IFTYPE_AP ||


[PATCH 3.2 094/102] net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Daniel Borkmann 

commit 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 upstream.

Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
specially crafted ASCONF chunk, and this affects even pre-2.6.12 kernels:

skb_over_panic: text:a01ea1c3 len:31056 put:30768
 head:88011bd81800 data:88011bd81800 tail:0x7950
 end:0x440 dev:
 [ cut here ]
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
 
 [] skb_put+0x5c/0x70
 [] sctp_addto_chunk+0x63/0xd0 [sctp]
 [] sctp_process_asconf+0x1af/0x540 [sctp]
 [] ? _read_unlock_bh+0x15/0x20
 [] sctp_sf_do_asconf+0x168/0x240 [sctp]
 [] sctp_do_sm+0x71/0x1210 [sctp]
 [] ? fib_rules_lookup+0xad/0xf0
 [] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
 [] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
 [] sctp_inq_push+0x56/0x80 [sctp]
 [] sctp_rcv+0x982/0xa10 [sctp]
 [] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
 [] ? nf_iterate+0x69/0xb0
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? nf_hook_slow+0x76/0x120
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ip_local_deliver_finish+0xdd/0x2d0
 [] ip_local_deliver+0x98/0xa0
 [] ip_rcv_finish+0x12d/0x440
 [] ip_rcv+0x275/0x350
 [] __netif_receive_skb+0x4ab/0x750
 [] netif_receive_skb+0x58/0x60

This can be triggered e.g., through a simple scripted nmap
connection scan injecting the chunk after the handshake, for
example, ...

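  ---------- INIT[ASCONF; ASCONF_ACK] ---------->
  <------- INIT-ACK[ASCONF; ASCONF_ACK] ---------
  ---------------- COOKIE-ECHO ----------------->
  <---------------- COOKIE-ACK ------------------
  -------------- ASCONF; UNKNOWN --------------->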

... where ASCONF chunk of length 280 contains 2 parameters ...

  1) Add IP address parameter (param length: 16)
  2) Add/del IP address parameter (param length: 255)

... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is even missing, too.
This is just an example and similarly-crafted ASCONF chunks
could be used just as well.

The ASCONF chunk passes through sctp_verify_asconf() as all
parameters passed sanity checks, and after walking, we ended
up successfully at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.

In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param() e.g., due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk, and thus we need to add all TLVs after the
failure to our ASCONF response to remote via helper function
sctp_add_asconf_response(), which basically invokes a
sctp_addto_chunk() adding the error parameters to the given
skb.

When walking to the next parameter this time, we proceed
with ...

  length = ntohs(asconf_param->param_hdr.length);
  asconf_param = (void *)asconf_param + length;

... instead of the WORD_ROUND()'ed length, thus resulting here
in an off-by-one that leads to reading the follow-up garbage
parameter length of 12336, and thus throwing an skb_over_panic
for the reply when trying to sctp_addto_chunk() next time,
which implicitly calls the skb_put() with that length.
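
The effect of skipping the rounding can be reproduced in a few lines of
userspace C. This is only a sketch: pad4() stands in for WORD_ROUND(), the
sample buffer and type values are made up, and the helper names are
hypothetical. Each parameter is a 2-byte type, a 2-byte big-endian length
(header included), the value, and padding up to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a length up to a multiple of 4; stand-in for WORD_ROUND(). */
static size_t pad4(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/* Read a 16-bit big-endian value. */
static unsigned rd16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/*
 * Given a buffer holding two TLV parameters, return the length field
 * that the walker sees for the second one.
 */
static unsigned second_param_len(const uint8_t *buf, size_t buf_len, int round_up)
{
    size_t len, step;

    assert(buf_len >= 4);
    len = rd16(buf + 2);
    step = round_up ? pad4(len) : len;
    assert(step + 4 <= buf_len);
    return rd16(buf + step + 2);
}

/* First parameter: length 7 (4-byte header + 3 value bytes) + 1 pad byte. */
static const uint8_t sample[] = {
    0xC0, 0x01, 0x00, 0x07, 'a', 'b', 'c', 0x00,
    0xC0, 0x02, 0x00, 0x08, 1, 2, 3, 4,
};
```

Walking with the padded length finds the real second header (length 8).
Walking with the raw length lands one byte early and reads a length
assembled from unrelated bytes (512 in this sample), the same kind of
garbage value that ends up being handed to skb_put() in the report above.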

Fix it by using sctp_walk_params() [ which is also used in
INIT parameter processing ] macro in the verification *and*
in ASCONF processing: it will make sure we don't spill over,
that we walk parameters WORD_ROUND()'ed. Moreover, we're being
more defensive and guard against unknown parameter types and
missized addresses.

Joint work with Vlad Yasevich.

Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann 
Signed-off-by: Vlad Yasevich 
Acked-by: Neil Horman 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2:
 - Adjust context
 - sctp_sf_violation_paramlen() doesn't take a struct net * parameter]
Signed-off-by: Ben Hutchings 
---
 include/net/sctp/sm.h|  6 +--
 net/sctp/sm_make_chunk.c | 99 +++-
 net/sctp/sm_statefuns.c  | 18 +
 3 files changed, 60 insertions(+), 63 deletions(-)

--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -251,9 +251,9 @@ struct sctp_chunk *sctp_make_asconf_upda
  int, __be16);
 struct sctp_chunk *sctp_make_asconf_set_prim(struct sctp_association *asoc,
 union sctp_addr *addr);
-int sctp_verify_asconf(const struct sctp_association *asoc,
-  struct sctp_paramhdr *param_hdr, void *chunk_end,
-  struct sctp_paramhdr **errp);
+bool sctp_verify_asconf(const struct sctp_association *asoc,
+   struct sctp_chunk *chunk, bool addr_param_needed,
+   struct

Re: [PATCH 0/8] Armada XP pinctrl consolidation and ix4-300d fixes

2014-11-01 Thread Jason Cooper

On Wed, Oct 15, 2014 at 02:53:02AM +0200, Benoit Masson wrote:
> 
> On 6 Oct 2014, at 18:13, Sebastian Hesselbarth wrote:
> 
> > On 10/06/2014 01:11 AM, Benoit Masson wrote:
> >> On 3 Oct 2014, at 17:41, Sebastian Hesselbarth wrote:
> >>> On 10/03/2014 05:29 PM, Benoit Masson wrote:
>  On 3 Oct 2014, at 17:06, Sebastian Hesselbarth wrote:
> > On 10/03/2014 04:11 PM, Jason Cooper wrote:
> >> On Sun, Sep 21, 2014 at 04:11:23PM +0200, Benoit Masson wrote:
>  On 19 Sept 2014, at 22:14, Sebastian Hesselbarth wrote:
> >>> [...]
>  Patches are based on v3.17-rc1 and intended for v3.18 but I am not in
>  a hurry. I only compile tested this, so a formal Tested-by from 
>  Benoit
>  for the ix4 and any other Armada XP board would be great.
> >>> 
> >>> I'm not sure what to test since I only receive some patch from the
> >>> series of 8. Should I get all 8 or only those you sent me. I'll be
> >>> able to test during next week.
> >> 
> >> Did you ever get a chance to test this series?
> > 
> > Uhm, I never prepared a branch for Benoit to test. I have pushed the
> > patches with Thomas Acked-by's and renamed eeprom node based on
> > v3.17-rc1 to
> > 
> > https://github.com/shesselba/linux-dove.git devel/mvebu-ix4
> > 
> >>> [...]
> >> Maybe I missed something ? is this branch you sent me a bare fork
> >> from mainline 3.17 ? does it includes the armada XP step A0 patch ?
> > 
> > Benoit,
> > 
> Hi,
> > I prepared more branches with the series
> > - on top of v3.17 release:
> > https://github.com/shesselba/linux-dove.git devel/mvebu-ix4_v3.17
> > 
> > - on top of next-20141003 (i.e. what will become v3.18-rc1):
> > https://github.com/shesselba/linux-dove.git devel/mvebu-ix4_next-20141003
> > 
> > It would be great, if you can test in this order:
> > - vanilla v3.17
> > - mvebu-ix4_v3.17
> > - mvebu-ix4_next-20141003
> > 
> All the 3 branch works WHEN APPLYING A0 patch (below), with both my
> custom kernel config and the arch/arm/configs/mvebu_v7_defconfig

Can I count this as a Tested-by?

> The reason why it didn't worked last time was that apparently the A0
> patch (copied) below was not merged into 3.17 :(

Yep, sorry.  Life got the best of me.  The pull request is now sent.

> This means that ix4-300d support is broken on 3.17 because of the A0
> stepping patch not merged.

Not a problem.  It's flagged for stable v3.12 and up.

thx,

Jason.

[PATCH 3.2 091/102] KVM: x86: Emulator fixes for eip canonical checks on near branches

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Nadav Amit 

commit 234f3ce485d54017f15cf5e0699cff4100121601 upstream.

Before changing rip (during jmp, call, ret, etc.) the target should be checked
to be canonical, as real CPUs do.  During sysret, both target rsp and rip
should be canonical. If any of these values is noncanonical, a #GP exception
should occur.  The exceptions to this rule are the syscall and sysenter
instructions, for which the assigned rip is checked when it is written to the
relevant MSRs.
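
For readers unfamiliar with the term, here is a userspace sketch of a
canonical-address test. It assumes 48-bit virtual addresses (an assumption,
not something stated in this patch), in which case bits 63..47 must all be
equal. The toy_ names are hypothetical and this is not the kernel's
is_noncanonical_address() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_VADDR_BITS 48  /* assumption: 48-bit virtual addresses */

static bool toy_is_noncanonical(uint64_t addr)
{
    /* Bits 63..47 (17 bits) must be all zeros or all ones. */
    uint64_t top = addr >> (TOY_VADDR_BITS - 1);
    uint64_t all_ones = (UINT64_C(1) << (64 - TOY_VADDR_BITS + 1)) - 1;

    assert(TOY_VADDR_BITS > 1 && TOY_VADDR_BITS < 64);
    return top != 0 && top != all_ones;
}
```

Under that assumption 0x00007FFFFFFFFFFF and 0xFFFF800000000000 are
canonical, while 0x0000800000000000 is not and would be expected to raise
#GP when used as a branch target.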

This patch fixes the emulator to behave as real CPUs do for near branches.
Far branches are handled by the next patch.

This fixes CVE-2014-3647.

Signed-off-by: Nadav Amit 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - Adjust context
 - Use ctxt->regs[] instead of reg_read(), reg_write(), reg_rmw()]
Signed-off-by: Ben Hutchings 
---
 arch/x86/kvm/emulate.c | 78 ++
 1 file changed, 54 insertions(+), 24 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -529,7 +529,8 @@ static int emulate_nm(struct x86_emulate
return emulate_exception(ctxt, NM_VECTOR, 0, false);
 }
 
-static inline void assign_eip_near(struct x86_emulate_ctxt *ctxt, ulong dst)
+static inline int assign_eip_far(struct x86_emulate_ctxt *ctxt, ulong dst,
+  int cs_l)
 {
switch (ctxt->op_bytes) {
case 2:
@@ -539,16 +540,25 @@ static inline void assign_eip_near(struc
ctxt->_eip = (u32)dst;
break;
case 8:
+   if ((cs_l && is_noncanonical_address(dst)) ||
+   (!cs_l && (dst & ~(u32)-1)))
+   return emulate_gp(ctxt, 0);
ctxt->_eip = dst;
break;
default:
WARN(1, "unsupported eip assignment size\n");
}
+   return X86EMUL_CONTINUE;
+}
+
+static inline int assign_eip_near(struct x86_emulate_ctxt *ctxt, ulong dst)
+{
+   return assign_eip_far(ctxt, dst, ctxt->mode == X86EMUL_MODE_PROT64);
 }
 
-static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
+static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
 {
-   assign_eip_near(ctxt, ctxt->_eip + rel);
+   return assign_eip_near(ctxt, ctxt->_eip + rel);
 }
 
 static u16 get_segment_selector(struct x86_emulate_ctxt *ctxt, unsigned seg)
@@ -1787,13 +1797,15 @@ static int em_grp45(struct x86_emulate_c
case 2: /* call near abs */ {
long int old_eip;
old_eip = ctxt->_eip;
-   ctxt->_eip = ctxt->src.val;
+   rc = assign_eip_near(ctxt, ctxt->src.val);
+   if (rc != X86EMUL_CONTINUE)
+   break;
ctxt->src.val = old_eip;
rc = em_push(ctxt);
break;
}
case 4: /* jmp abs */
-   ctxt->_eip = ctxt->src.val;
+   rc = assign_eip_near(ctxt, ctxt->src.val);
break;
case 5: /* jmp far */
rc = em_jmp_far(ctxt);
@@ -1825,10 +1837,14 @@ static int em_grp9(struct x86_emulate_ct
 
 static int em_ret(struct x86_emulate_ctxt *ctxt)
 {
-   ctxt->dst.type = OP_REG;
-   ctxt->dst.addr.reg = &ctxt->_eip;
-   ctxt->dst.bytes = ctxt->op_bytes;
-   return em_pop(ctxt);
+   int rc;
+   unsigned long eip;
+
+   rc = emulate_pop(ctxt, &eip, ctxt->op_bytes);
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+
+   return assign_eip_near(ctxt, eip);
 }
 
 static int em_ret_far(struct x86_emulate_ctxt *ctxt)
@@ -2060,7 +2076,7 @@ static int em_sysexit(struct x86_emulate
 {
struct x86_emulate_ops *ops = ctxt->ops;
struct desc_struct cs, ss;
-   u64 msr_data;
+   u64 msr_data, rcx, rdx;
int usermode;
u16 cs_sel = 0, ss_sel = 0;
 
@@ -2076,6 +2092,9 @@ static int em_sysexit(struct x86_emulate
else
usermode = X86EMUL_MODE_PROT32;
 
+   rcx = ctxt->regs[VCPU_REGS_RCX];
+   rdx = ctxt->regs[VCPU_REGS_RDX];
+
cs.dpl = 3;
ss.dpl = 3;
ops->get_msr(ctxt, MSR_IA32_SYSENTER_CS, &msr_data);
@@ -2093,6 +2112,9 @@ static int em_sysexit(struct x86_emulate
ss_sel = cs_sel + 8;
cs.d = 0;
cs.l = 1;
+   if (is_noncanonical_address(rcx) ||
+   is_noncanonical_address(rdx))
+   return emulate_gp(ctxt, 0);
break;
}
cs_sel |= SELECTOR_RPL_MASK;
@@ -2101,8 +2123,8 @@ static int em_sysexit(struct x86_emulate
ops->set_segment(ctxt, cs_sel, &cs, 0, VCPU_SREG_CS);
ops->set_segment(ctxt, ss_sel, &ss, 0, VCPU_SREG_SS);
 
-   ctxt->_eip = ctxt->regs[VCPU_REGS_RDX];
-   ctxt->regs[VCPU_REGS_RSP] = ctxt->regs[VCPU_REGS_RCX];
+   ctxt->_eip = rdx;
+   ctxt->regs[VCPU_REGS_RSP] = rcx;
 
return

[PATCH 3.2 090/102] KVM: x86: Fix wrong masking on relative jump/call

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Nadav Amit 

commit 05c83ec9b73c8124555b706f6af777b10adf0862 upstream.

Relative jumps and calls do the masking according to the operand size, and not
according to the address size as the KVM emulator does today.

This patch fixes KVM behavior.

Signed-off-by: Nadav Amit 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Ben Hutchings 
---
 arch/x86/kvm/emulate.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -456,11 +456,6 @@ register_address_increment(struct x86_em
*reg = (*reg & ~ad_mask(ctxt)) | ((*reg + inc) & ad_mask(ctxt));
 }
 
-static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
-{
-   register_address_increment(ctxt, &ctxt->_eip, rel);
-}
-
 static u32 desc_limit_scaled(struct desc_struct *desc)
 {
u32 limit = get_desc_limit(desc);
@@ -534,6 +529,28 @@ static int emulate_nm(struct x86_emulate
return emulate_exception(ctxt, NM_VECTOR, 0, false);
 }
 
+static inline void assign_eip_near(struct x86_emulate_ctxt *ctxt, ulong dst)
+{
+   switch (ctxt->op_bytes) {
+   case 2:
+   ctxt->_eip = (u16)dst;
+   break;
+   case 4:
+   ctxt->_eip = (u32)dst;
+   break;
+   case 8:
+   ctxt->_eip = dst;
+   break;
+   default:
+   WARN(1, "unsupported eip assignment size\n");
+   }
+}
+
+static inline void jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
+{
+   assign_eip_near(ctxt, ctxt->_eip + rel);
+}
+
 static u16 get_segment_selector(struct x86_emulate_ctxt *ctxt, unsigned seg)
 {
u16 selector;


[PATCH 3.2 071/102] MIPS: mcount: Adjust stack pointer for static trace in MIPS32

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Markos Chandras 

commit 8a574cfa2652545eb95595d38ac2a0bb501af0ae upstream.

Every mcount() call in the MIPS 32-bit kernel is done as follows:

[...]
move at, ra
jal _mcount
addiu sp, sp, -8
[...]

but upon returning from the mcount() function, the stack pointer
is not adjusted properly. This is explained in detail in 58b69401c797
(MIPS: Function tracer: Fix broken function tracing).

Commit ad8c396936e3 ("MIPS: Unbreak function tracer for 64-bit kernel.")
fixed the stack manipulation for 64-bit but it didn't fix it completely
for MIPS32.

Signed-off-by: Markos Chandras 
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7792/
Signed-off-by: Ralf Baechle 
Signed-off-by: Ben Hutchings 
---
 arch/mips/kernel/mcount.S | 12 
 1 file changed, 12 insertions(+)

--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -119,7 +119,11 @@ NESTED(_mcount, PT_SIZE, ra)
 nop
 #endif
b   ftrace_stub
+#ifdef CONFIG_32BIT
+addiu sp, sp, 8
+#else
 nop
+#endif
 
 static_trace:
MCOUNT_SAVE_REGS
@@ -129,6 +133,9 @@ static_trace:
 move   a1, AT  /* arg2: parent's return address */
 
MCOUNT_RESTORE_REGS
+#ifdef CONFIG_32BIT
+   addiu sp, sp, 8
+#endif
.globl ftrace_stub
 ftrace_stub:
RETURN_BACK
@@ -177,6 +184,11 @@ NESTED(ftrace_graph_caller, PT_SIZE, ra)
jal prepare_ftrace_return
 nop
MCOUNT_RESTORE_REGS
+#ifndef CONFIG_DYNAMIC_FTRACE
+#ifdef CONFIG_32BIT
+   addiu sp, sp, 8
+#endif
+#endif
RETURN_BACK
END(ftrace_graph_caller)
 


[PATCH 3.2 066/102] ALSA: pcm: fix fifo_size frame calculation

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Clemens Ladisch 

commit a9960e6a293e6fc3ed414643bb4e4106272e4d0a upstream.

The calculated frame size was wrong because snd_pcm_format_physical_width()
actually returns the number of bits, not bytes.

Use snd_pcm_format_size() instead, which not only returns bytes, but also
simplifies the calculation.
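
To make the bits-versus-bytes mix-up concrete, here is a small userspace
sketch. The helper names and the plain integer parameters are hypothetical;
the real code works on snd_pcm_format_t and the ALSA helpers named above.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one frame in bytes, given the sample width in bits. */
static size_t frame_bytes(unsigned int width_bits, unsigned int channels)
{
	return (size_t)(width_bits / 8) * channels;
}

/* Old behaviour: divides a byte count by a size expressed in bits. */
static size_t fifo_frames_buggy(size_t fifo_bytes, unsigned int width_bits,
				unsigned int channels)
{
	assert(width_bits > 0 && channels > 0);
	return fifo_bytes / ((size_t)width_bits * channels);
}

/*
 * Fixed behaviour: divide by the frame size in bytes, and leave the
 * value alone if the frame size is not positive (as the patch does).
 */
static size_t fifo_frames_fixed(size_t fifo_bytes, unsigned int width_bits,
				unsigned int channels)
{
	size_t frame = frame_bytes(width_bits, channels);

	return frame > 0 ? fifo_bytes / frame : fifo_bytes;
}
```

With a 4096-byte FIFO and 16-bit stereo samples, the old calculation yields
128 frames while the byte-based one yields 1024, a factor of eight apart.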

Fixes: 8bea869c5e56 ("ALSA: PCM midlevel: improve fifo_size handling")
Signed-off-by: Clemens Ladisch 
Signed-off-by: Takashi Iwai 
Signed-off-by: Ben Hutchings 
---
 sound/core/pcm_lib.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -1692,14 +1692,16 @@ static int snd_pcm_lib_ioctl_fifo_size(s
 {
struct snd_pcm_hw_params *params = arg;
snd_pcm_format_t format;
-   int channels, width;
+   int channels;
+   ssize_t frame_size;
 
params->fifo_size = substream->runtime->hw.fifo_size;
if (!(substream->runtime->hw.info & SNDRV_PCM_INFO_FIFO_IN_FRAMES)) {
format = params_format(params);
channels = params_channels(params);
-   width = snd_pcm_format_physical_width(format);
-   params->fifo_size /= width * channels;
+   frame_size = snd_pcm_format_size(format, channels);
+   if (frame_size > 0)
+   params->fifo_size /= (unsigned)frame_size;
}
return 0;
 }


[PATCH 3.2 089/102] KVM: x86 emulator: Use opcode::execute for CALL

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Takuya Yoshikawa 

commit d4ddafcdf2201326ec9717172767cfad0ede1472 upstream.

CALL: E8

Signed-off-by: Takuya Yoshikawa 
Signed-off-by: Marcelo Tosatti 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 arch/x86/kvm/emulate.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2536,6 +2536,15 @@ static int em_das(struct x86_emulate_ctx
return X86EMUL_CONTINUE;
 }
 
+static int em_call(struct x86_emulate_ctxt *ctxt)
+{
+   long rel = ctxt->src.val;
+
+   ctxt->src.val = (unsigned long)ctxt->_eip;
+   jmp_rel(ctxt, rel);
+   return em_push(ctxt);
+}
+
 static int em_call_far(struct x86_emulate_ctxt *ctxt)
 {
u16 sel, old_cs;
@@ -3271,7 +3280,7 @@ static struct opcode opcode_table[256] =
D2bvIP(SrcImmUByte | DstAcc, in,  check_perm_in),
D2bvIP(SrcAcc | DstImmUByte, out, check_perm_out),
/* 0xE8 - 0xEF */
-   D(SrcImm | Stack), D(SrcImm | ImplicitOps),
+   I(SrcImm | Stack, em_call), D(SrcImm | ImplicitOps),
I(SrcImmFAddr | No64, em_jmp_far), D(SrcImmByte | ImplicitOps),
D2bvIP(SrcDX | DstAcc, in,  check_perm_in),
D2bvIP(SrcAcc | DstDX, out, check_perm_out),
@@ -3966,13 +3975,6 @@ special_insn:
case 0xe6: /* outb */
case 0xe7: /* out */
goto do_io_out;
-   case 0xe8: /* call (near) */ {
-   long int rel = ctxt->src.val;
-   ctxt->src.val = (unsigned long) ctxt->_eip;
-   jmp_rel(ctxt, rel);
-   rc = em_push(ctxt);
-   break;
-   }
case 0xe9: /* jmp rel */
case 0xeb: /* jmp rel short */
jmp_rel(ctxt, ctxt->src.val);


[PATCH 3.2 096/102] net: sctp: fix remote memory pressure from excessive queueing

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Daniel Borkmann 

commit 26b87c7881006311828bb0ab271a551a62dcceb4 upstream.

This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...

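  ------------------ INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
  [...]
  ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>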

... where ASCONF_a, ASCONF_b, ..., ASCONF_z are well-formed
ASCONFs and have increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP only
does verification on a chunk-by-chunk basis, as an SCTP
packet is nothing more than a container for a stream of
chunks which it eats up one by one.

We could run into the case that we receive a packet with a
malformed tail, marked above as trailing JUNK. All previous
chunks are well-formed, so the stack will eat up all previous
chunks up to this point. The association's output queue then
gets queued up excessively in any of three cases: JUNK does
not fit into a chunk header and there are no more chunks in
the input queue; JUNK contains a garbage chunk header whose
encoded chunk length would exceed the skb tail; or we came
here from an entirely different scenario and the chunk has
the pdiscard=1 mark (without having had a flush point). A
correct final chunk may then turn this into a response flood
when flushing the queue ;). I ran a simple script with
incremental ASCONF serial numbers and could see the server
side consuming an excessive amount of RAM [before/after: up
to 2GB and more].

The issue at heart is that the chunk train basically ends
with !end_of_packet and !singleton markers. Since commit
2e3216cd54b1 ("sctp: Follow security requirement of responding
with 1 packet"), this prevents an output queue flush point in
sctp_do_sm() -> sctp_cmd_interpreter() on the input chunk
(chunk = event_arg) even though local_cork is set, because
its precedence has changed since then. In the normal case,
the last chunk with end_of_packet=1 would trigger the queue
flush to accommodate possible outgoing bundling.

In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So, above JUNK will
not enter the state machine and instead be released and exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point being missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.

One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.
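
The trailing-JUNK conditions discussed above (a tail too short for a chunk
header, or a header whose encoded length runs past the buffer) can be
sketched as a plain chunk walk. This is a simplified userspace illustration,
not the kernel's sctp_inq_pop(); walk_chunks() and CHUNK_HDR_LEN are names
made up for this sketch, and the only protocol facts it relies on are the
4-byte chunk header (type, flags, 16-bit big-endian length) and the padding
of chunks to a multiple of 4 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_HDR_LEN 4u	/* type (1), flags (1), length (2, big endian) */

/*
 * Walk the chunks in buf[0..len). Returns the number of well-formed
 * chunks seen and sets *junk to 1 if the walk stopped on a malformed
 * tail, 0 if the buffer was consumed cleanly.
 */
static size_t walk_chunks(const uint8_t *buf, size_t len, int *junk)
{
	size_t off = 0, n = 0;

	assert(buf != NULL && junk != NULL);
	*junk = 0;
	while (off < len) {
		size_t left = len - off;
		size_t clen, padded;

		/* Tail too short to even hold a chunk header. */
		if (left < CHUNK_HDR_LEN) {
			*junk = 1;
			break;
		}
		clen = ((size_t)buf[off + 2] << 8) | buf[off + 3];
		/* Garbage header: length too small or past the buffer end. */
		if (clen < CHUNK_HDR_LEN || clen > left) {
			*junk = 1;
			break;
		}
		n++;
		padded = (clen + 3) & ~(size_t)3;
		off += padded < left ? padded : left;
	}
	return n;
}
```

The sketch only shows how the malformed tail is detected. The actual problem
fixed by the patch is what happens afterwards: all the well-formed chunks
before the JUNK have already produced replies, so the caller still needs to
reach a flush point for the output queue when the walk ends this way.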

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann 
Signed-off-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings 
---
 net/sctp/inqueue.c  | 33 +++--
 net/sctp/sm_statefuns.c |  3 +++
 2 files changed, 10 insertions(+), 26 deletions(-)

--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -152,18 +152,9 @@ struct sctp_chunk *sctp_inq_pop(struct s
} else {
/* Nothing to do. Next chunk in the packet, please. */
ch = (sctp_chunkhdr_t *) chunk->chunk_end;
-
/* Force chunk->skb->data to chunk->chunk_end.  */
-   skb_pull(chunk->skb,
-chunk->chunk_end - chunk->skb->data);
-
-   /* Verify that we have at least chunk headers
-* worth of buffer left.
-*/
-   if (skb_headlen(chunk->skb) < sizeof(sctp_chunkhdr_t)) {
-   sctp_chunk_free(chunk);
-   chunk = queue->in_progress = NULL;
-   }
+   skb_pull(chunk->skb, chunk->chunk_end - chunk->skb->data);
+   /* We are guaranteed to pull a SCTP header. */
}
}
 
@@ -199,24 +190,14 @@ struct sctp_chunk *sctp_inq_pop(struct s
skb_pull(chunk->skb, sizeof(sctp_chunkhdr_t));
chunk->subh.v = NULL; /* Subheader is no longer valid.  */
 
-   if

[PATCH 3.2 095/102] net: sctp: fix panic on duplicate ASCONF chunks

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Daniel Borkmann 

commit b69040d8e39f20d5215a03502a8e8b4c6ab78395 upstream.

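When receiving, e.g., a semi-well-formed connection scan in the
form of ...

  -------------- INIT[ASCONF; ASCONF_ACK] ------------->
  <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------
  ----------------- ASCONF_a; ASCONF_b ---------------->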

... where ASCONF_a equals ASCONF_b chunk (at least both serials
need to be equal), we panic an SCTP server!

The problem is that well-formed ASCONF chunks that we reply to with
ASCONF_ACK chunks are cached per serial. Thus, when we receive the
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process it again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk. So far, so good.

Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():

While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54b1
changed the precedence, so that as long as we get bundled incoming
chunks, we try possible bundling on the outgoing queue as well. Before
this commit, we would just flush the output queue.

Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the identical ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet and we enforce an uncork and thus a flush,
which crashes the kernel.

Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.
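
The "is it already pending?" test can be illustrated with a minimal circular
doubly-linked list in userspace. This is a sketch only: the list helpers are
simplified stand-ins for the kernel's list API, and queue_reply() is a
hypothetical function. Note that the actual patch applies the test in the
cached-ACK lookup rather than at queueing time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct chunk {
	struct list_node list;	/* linkage into an output queue */
	unsigned int serial;
};

/* A chunk whose list node is linked somewhere is still pending. */
static bool chunk_pending(const struct chunk *c)
{
	return !list_is_empty(&c->list);
}

/* Hypothetical: queue a reply unless it already sits on a queue. */
static bool queue_reply(struct list_node *outq, struct chunk *c)
{
	assert(outq != NULL && c != NULL);
	if (chunk_pending(c))
		return false;
	list_add_tail(&c->list, outq);
	return true;
}
```

Linking the same node onto a queue a second time without unlinking it first
is what corrupts the list pointers; checking the node's own linkage is a
cheap way to detect that situation.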

Joint work with Vlad Yasevich.

Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann 
Signed-off-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings 
---
 include/net/sctp/sctp.h | 5 +
 net/sctp/associola.c| 2 ++
 2 files changed, 7 insertions(+)

--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -523,6 +523,11 @@ static inline void sctp_assoc_pending_pm
asoc->pmtu_pending = 0;
 }
 
+static inline bool sctp_chunk_pending(const struct sctp_chunk *chunk)
+{
+   return !list_empty(&chunk->list);
+}
+
 /* Walk through a list of TLV parameters.  Don't trust the
  * individual parameter lengths and instead depend on
  * the chunk length to indicate when to stop.  Make sure
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -1638,6 +1638,8 @@ struct sctp_chunk *sctp_assoc_lookup_asc
 * ack chunk whose serial number matches that of the request.
 */
list_for_each_entry(ack, &asoc->asconf_ack_list, transmitted_list) {
+   if (sctp_chunk_pending(ack))
+   continue;
if (ack->subh.addip_hdr->serial == serial) {
sctp_chunk_hold(ack);
return ack;


[PATCH 3.2 082/102] ipv6: reallocate addrconf router for ipv6 address when lo device up

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: chenweilong 

This fixes bug 67951 on bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=67951

The upstream patch can't be applied directly, as it uses ip6_rt_put(),
a function introduced by commit 94e187c0, and that commit can't be
applied directly either.

From: Gao feng 

commit 33d99113b1102c2d2f8603b9ba72d89d915c13f5 upstream.

This commit doesn't have a stable tag, but it fixes the bug of
no reply after loopback down-up. It is well worth applying
to stable 3.4 kernels.

The bug is 67951 on bugzilla
https://bugzilla.kernel.org/show_bug.cgi?id=67951


CC: Sabrina Dubroca 
CC: Hannes Frederic Sowa 
Reported-by: Weilong Chen 
Signed-off-by: Weilong Chen 
Signed-off-by: Gao feng 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
[weilong: s/ip6_rt_put/dst_release]
Signed-off-by: Chen Weilong 
Signed-off-by: Ben Hutchings 
---
 net/ipv6/addrconf.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2443,8 +2443,18 @@ static void init_loopback(struct net_dev
if (sp_ifa->flags & (IFA_F_DADFAILED | IFA_F_TENTATIVE))
continue;
 
-   if (sp_ifa->rt)
-   continue;
+   if (sp_ifa->rt) {
+   /* This dst has been added to garbage list when
+* lo device down, release this obsolete dst and
+* reallocate a new router for ifa.
+*/
+   if (sp_ifa->rt->dst.obsolete > 0) {
+   dst_release(&sp_ifa->rt->dst);
+   sp_ifa->rt = NULL;
+   } else {
+   continue;
+   }
+   }
 
sp_rt = addrconf_dst_alloc(idev, &sp_ifa->addr, 0);
 


[PATCH 3.2 074/102] shmem: fix nlink for rename overwrite directory

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Miklos Szeredi 

commit b928095b0a7cff7fb9fcf4c706348ceb8ab2c295 upstream.

When rename overwrites an empty directory, the extra nlink of the
overwritten directory needs to be dropped.

Test prog:

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
	const char *test_dir1 = "test-dir1";
	const char *test_dir2 = "test-dir2";
	int res;
	int fd;
	struct stat statbuf;

	res = mkdir(test_dir1, 0777);
	if (res == -1)
		err(1, "mkdir(\"%s\")", test_dir1);

	res = mkdir(test_dir2, 0777);
	if (res == -1)
		err(1, "mkdir(\"%s\")", test_dir2);

	/* Keep the soon-to-be-overwritten directory open for fstat(). */
	fd = open(test_dir2, O_RDONLY);
	if (fd == -1)
		err(1, "open(\"%s\")", test_dir2);

	res = rename(test_dir1, test_dir2);
	if (res == -1)
		err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2);

	res = fstat(fd, &statbuf);
	if (res == -1)
		err(1, "fstat(%i)", fd);

	if (statbuf.st_nlink != 0) {
		fprintf(stderr, "nlink is %lu, should be 0\n",
			(unsigned long)statbuf.st_nlink);
		return 1;
	}

	return 0;
}

Signed-off-by: Miklos Szeredi 
Signed-off-by: Al Viro 
Signed-off-by: Ben Hutchings 
---
 mm/shmem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1719,8 +1719,10 @@ static int shmem_rename(struct inode *ol
 
if (new_dentry->d_inode) {
(void) shmem_unlink(new_dir, new_dentry);
-   if (they_are_dirs)
+   if (they_are_dirs) {
+   drop_nlink(new_dentry->d_inode);
drop_nlink(old_dir);
+   }
} else if (they_are_dirs) {
drop_nlink(old_dir);
inc_nlink(new_dir);


[PATCH 3.2 065/102] can: at91_can: add missing prepare and unprepare of the clock

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Dueck 

commit e77980e50bc2850599d4d9c0192b67a9ffd6daac upstream.

In order to make the driver work with the common clock framework, this patch
converts the clk_enable()/clk_disable() to
clk_prepare_enable()/clk_disable_unprepare(). While there, add the missing
error handling.
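
The pairing and error handling described above can be sketched with a mock
clock in userspace. Everything here is hypothetical (struct mock_clk, the
mock_* functions and the error values); it only models the rule that a
failed prepare/enable must abort the open path, and that every successful
prepare/enable is undone exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock clock that counts prepare and enable references. */
struct mock_clk {
	int prepared;
	int enabled;
	int fail_prepare;	/* non-zero: simulate a prepare failure */
};

static int mock_clk_prepare_enable(struct mock_clk *clk)
{
	assert(clk != NULL);
	if (clk->fail_prepare)
		return -1;
	clk->prepared++;
	clk->enabled++;
	return 0;
}

static void mock_clk_disable_unprepare(struct mock_clk *clk)
{
	assert(clk != NULL);
	clk->enabled--;
	clk->prepared--;
}

/*
 * Open path: return early if the clock cannot be brought up, and undo
 * the prepare/enable if a later step fails.
 */
static int mock_open(struct mock_clk *clk, int later_step_fails)
{
	int err = mock_clk_prepare_enable(clk);

	if (err)
		return err;
	if (later_step_fails) {
		mock_clk_disable_unprepare(clk);
		return -2;
	}
	return 0;
}
```

In this model the counters return to zero on every failure path, which is
the invariant the converted driver code is meant to keep.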

Signed-off-by: David Dueck 
Signed-off-by: Anthony Harivel 
Acked-by: Boris Brezillon 
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Ben Hutchings 
---
 drivers/net/can/at91_can.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/drivers/net/can/at91_can.c
+++ b/drivers/net/can/at91_can.c
@@ -1115,7 +1115,9 @@ static int at91_open(struct net_device *
struct at91_priv *priv = netdev_priv(dev);
int err;
 
-   clk_enable(priv->clk);
+   err = clk_prepare_enable(priv->clk);
+   if (err)
+   return err;
 
/* check or determine and set bittime */
err = open_candev(dev);
@@ -1139,7 +1141,7 @@ static int at91_open(struct net_device *
  out_close:
close_candev(dev);
  out:
-   clk_disable(priv->clk);
+   clk_disable_unprepare(priv->clk);
 
return err;
 }
@@ -1156,7 +1158,7 @@ static int at91_close(struct net_device
at91_chip_stop(dev, CAN_STATE_STOPPED);
 
free_irq(dev->irq, dev);
-   clk_disable(priv->clk);
+   clk_disable_unprepare(priv->clk);
 
close_candev(dev);
 


[PATCH 3.2 078/102] MIPS: Fix forgotten preempt_enable() when CPU has inclusive pcaches

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Yoichi Yuasa 

commit 5596b0b245fb9d2cefb5023b11061050351c1398 upstream.

[1.904000] BUG: scheduling while atomic: swapper/1/0x0002
[1.908000] Modules linked in:
[1.916000] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc2-lemote-los.git-5318619-dirty #1
[1.92] Stack : 31aac000 810d 0052 802730a4
   0001 810cdf90 810d
  8068b968 806f5537 810cdf90 98009f0782e8
  0001 8072 806b 98009f078000
  98009f29 805f312c 98009f05b5d8 80233518
  98009f05b5e8 80274b7c 98009f078000 8068b968
     
   98009f05b520  805f2f6c
   8070 8070 806fc758
  8070 8020be98 806fceb0 805f2f6c
  ...
[2.028000] Call Trace:
[2.032000] [] show_stack+0x80/0x98
[2.036000] [] __schedule_bug+0x44/0x6c
[2.04] [] __schedule+0x518/0x5b0
[2.044000] [] schedule_timeout+0x128/0x1f0
[2.048000] [] msleep+0x3c/0x60
[2.052000] [] do_probe+0x238/0x3a8
[2.056000] [] ide_probe_port+0x340/0x7e8
[2.06] [] ide_host_register+0x2d0/0x7a8
[2.064000] [] ide_pci_init_two+0x4e4/0x790
[2.068000] [] amd74xx_probe+0x148/0x2c8
[2.072000] [] pci_device_probe+0xc4/0x130
[2.076000] [] driver_probe_device+0x98/0x270
[2.08] [] __driver_attach+0xe0/0xe8
[2.084000] [] bus_for_each_dev+0x78/0xe0
[2.088000] [] bus_add_driver+0x230/0x310
[2.092000] [] driver_register+0x84/0x158
[2.096000] [] do_one_initcall+0x104/0x160

Signed-off-by: Yoichi Yuasa 
Reported-by: Aaro Koskinen 
Tested-by: Aaro Koskinen 
Cc: linux-m...@linux-mips.org
Cc: Linux Kernel Mailing List 
Patchwork: https://patchwork.linux-mips.org/patch/5941/
Signed-off-by: Ralf Baechle 
Signed-off-by: Ben Hutchings 
---
 arch/mips/mm/c-r4k.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -606,6 +606,7 @@ static void r4k_dma_cache_wback_inv(unsi
r4k_blast_scache();
else
blast_scache_range(addr, addr + size);
+   preempt_enable();
__sync();
return;
}
@@ -647,6 +648,7 @@ static void r4k_dma_cache_inv(unsigned l
 */
blast_inv_scache_range(addr, addr + size);
}
+   preempt_enable();
__sync();
return;
}


[PATCH 3.2 083/102] ext4: fix BUG_ON in mb_free_blocks()

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Theodore Ts'o 

commit c99d1e6e83b06744c75d9f5e491ed495a7086b7b upstream.

If we suffer a block allocation failure (for example due to a memory
allocation failure), it's possible that we will call
ext4_discard_allocated_blocks() before we've actually allocated any
blocks.  In that case, fe_len and fe_start in ac->ac_f_ex will still
be zero, and this will result in mb_free_blocks(inode, e4b, 0, 0)
triggering the BUG_ON in mb_free_blocks():

BUG_ON(last >= (sb->s_blocksize << 3));

Fix this by bailing out of ext4_discard_allocated_blocks() if fe_len
is zero.

Also fix a missing ext4_mb_unload_buddy() call in
ext4_discard_allocated_blocks().
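
The arithmetic behind the zero-length case can be sketched as follows. This
assumes, for illustration only, that the last block index is computed as
first + count - 1 in a signed int and then compared against an unsigned
quantity, as the quoted BUG_ON suggests; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed shape of the check: with first == 0 and count == 0 the last
 * index becomes -1, which turns into a huge value once it is compared
 * as an unsigned long against the number of bits in a block.
 */
static bool last_block_check_trips(int first, int count,
				   unsigned long blocksize)
{
	int last = first + count - 1;

	return (unsigned long)last >= (blocksize << 3);
}

/*
 * Guarded variant: a zero-length free is skipped before the range check
 * is ever evaluated. Returns true if blocks would actually be freed.
 */
static bool free_blocks_guarded(int first, int count,
				unsigned long blocksize)
{
	assert(first >= 0 && count >= 0);
	if (count == 0)
		return false;	/* nothing was allocated, nothing to free */
	return !last_block_check_trips(first, count, blocksize);
}
```

With a 4096-byte block size the limit is 32768 bits, so a valid range never
trips the check, while the (0, 0) call does unless it is filtered out first.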

Google-Bug-Id: 16844242

Fixes: 86f0afd463215fc3e58020493482faa4ac3a4d69
Signed-off-by: Theodore Ts'o 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 fs/ext4/mballoc.c | 5 +
 1 file changed, 5 insertions(+)

--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1312,6 +1312,8 @@ static void mb_free_blocks(struct inode
void *buddy2;
struct super_block *sb = e4b->bd_sb;
 
+   if (WARN_ON(count == 0))
+   return;
BUG_ON(first + count > (sb->s_blocksize << 3));
assert_spin_locked(ext4_group_lock_ptr(sb, e4b->bd_group));
mb_check_buddy(e4b);
@@ -3132,6 +3134,8 @@ static void ext4_discard_allocated_block
int err;
 
if (pa == NULL) {
+   if (ac->ac_f_ex.fe_len == 0)
+   return;
err = ext4_mb_load_buddy(ac->ac_sb, ac->ac_f_ex.fe_group, &e4b);
if (err) {
/*
@@ -3146,6 +3150,7 @@ static void ext4_discard_allocated_block
mb_free_blocks(ac->ac_inode, &e4b, ac->ac_f_ex.fe_start,
   ac->ac_f_ex.fe_len);
ext4_unlock_group(ac->ac_sb, ac->ac_f_ex.fe_group);
+   ext4_mb_unload_buddy(&e4b);
return;
}
if (pa->pa_type == MB_INODE_PA)


[PATCH 3.2 000/102] 3.2.64-rc1 review

2014-11-01 Thread Ben Hutchings

This is the start of the stable review cycle for the 3.2.64 release.
There are 102 patches in this series, which will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Nov 04 00:00:00 UTC 2014.
Anything received after that time might be too late.

A combined patch relative to 3.2.63 will be posted as an additional
response to this.  A shortlog and diffstat can be found below.

Ben.

-

Al Viro (2):
  be careful with nd->inode in path_init() and follow_dotdot_rcu()
 [4023bfc9f351a7994fb6a7d515476c320f94a574]
  don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
 [7bd88377d482e1eae3c5329b12e33cfd664fa6a9]

Alban Crequy (1):
  cgroup: reject cgroup names with '\n'
 [71b1fb5c4473a5b1e601d41b109bdfe001ec82e0]

Alex Deucher (1):
  drm/radeon: add connector quirk for fujitsu board
 [1952f24d0fa6292d65f886887af87ba8ac79b3ba]

Andreas Rohner (1):
  nilfs2: fix data loss with mmap()
 [56d7acc792c0d98f38f22058671ee715ff197023]

Andrew Hunter (1):
  jiffies: Fix timeval conversion to jiffies
 [d78c9300c51d6ceed9f6d078d4e9366f259de28c]

Andy Honig (1):
  KVM: x86: Improve thread safety in pit
 [2febc839133280d5a5e8e1179c94ea674489dae2]

Andy Lutomirski (1):
  x86,kvm,vmx: Preserve CR4 across VM entry
 [d974baa398f34393db76be45f7d4d04fbdbb4a0a]

Anton Altaparmakov (1):
  Fix nasty 32-bit overflow bug in buffer i/o code.
 [f2d5a94436cc7cc0221b9a81bba2276a25187dd3]

Aurelien Jarno (1):
  MIPS: ZBOOT: add missing <linux/string.h> include
 [29593fd5a8149462ed6fad0d522234facdaee6c8]

Ben Hutchings (1):
  vfs: Fold follow_mount_rcu() into follow_dotdot_rcu()
 [b37199e626b31e1175fb06764c5d1d687723aac2]

Bjørn Mork (2):
  USB: sierra: add 1199:68AA device ID
 [5b3da69285c143b7ea76b3b9f73099ff1093ab73]
  USB: sierra: avoid CDC class functions on "68A3" devices
 [049255f51644c1105775af228396d187402a5934]

Chenweilong (1):
  ipv6: reallocate addrconf router for ipv6  address when lo device up
 [33d99113b1102c2d2f8603b9ba72d89d915c13f5]

Christian Borntraeger (1):
  KVM: s390: Fix user triggerable bug in dead code
 [614a80e474b227cace52fd6e3c790554db8a396e]

Clemens Ladisch (1):
  ALSA: pcm: fix fifo_size frame calculation
 [a9960e6a293e6fc3ed414643bb4e4106272e4d0a]

Cong Wang (1):
  perf: Fix a race condition in perf_remove_from_context()
 [3577af70a2ce4853d58e57d832e687d739281479]

Daniel Borkmann (3):
  net: sctp: fix panic on duplicate ASCONF chunks
 [b69040d8e39f20d5215a03502a8e8b4c6ab78395]
  net: sctp: fix remote memory pressure from excessive queueing
 [26b87c7881006311828bb0ab271a551a62dcceb4]
  net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
 [9de7922bc709eee2f609cd01d98aaedc4cf5ea74]

Dave Chinner (1):
  xfs: don't dirty buffers beyond EOF
 [22e757a49cf010703fcb9c9b4ef793248c39b0c2]

David Dueck (1):
  can: at91_can: add missing prepare and unprepare of the clock
 [e77980e50bc2850599d4d9c0192b67a9ffd6daac]

David Jander (2):
  can: flexcan: correctly initialize mailboxes
 [fc05b884a31dbf259cc73cc856e634ec3acbebb6]
  can: flexcan: implement workaround for errata ERR005829
 [25e924450fcb23c11c07c95ea8964dd9f174652e]

Dmitry Torokhov (1):
  Input: synaptics - add support for ForcePads
 [5715fc764f7753d464dbe094b5ef9cffa6e479a4]

Eliad Peller (1):
  regulatory: add NUL to alpha2
 [a5fe8e7695dc3f547e955ad2b662e3e72969e506]

Emmanuel Grumbach (1):
  Revert "iwlwifi: dvm: don't enable CTS to self"
 [f47f46d7b09cf1d09e4b44b6cc4dd7d68a08028c]

Felipe Balbi (3):
  usb: dwc3: core: fix order of PM runtime calls
 [fed33afce0eda44a46ae24d93aec1b5198c0bac4]
  usb: dwc3: core: use pm_runtime_put_sync() on remove
 [16b972a592ea2c9a3c2a3c12238de650fd4043a9]
  usb: host: xhci: fix compliance mode workaround
 [96908589a8b2584b1185f834d365f5cc360e8226]

Hannes Frederic Sowa (1):
  ipv6: reuse ip6_frag_id from ip6_ufo_append_data
 [916e4cf46d0204806c062c8c6c4d1f633852c5b6]

Hans de Goede (3):
  Input: elantech - fix detection of touchpad on ASUS s301l
 [271329b3c798b2102120f5df829071c211ef00ed]
  Input: i8042 - add Fujitsu U574 to no_timeout dmi table
 [cc18a69c92d0972bc2fc5a047ee3be1e8398171b]
  Input: i8042 - add nomux quirk for Avatar AVIU-145A6
 [d2682118f4bb3ceb835f91c1a694407a31bb7378]

Honggang Li (1):
  percpu: free percpu allocation info for uniprocessor system
 [3189eddbcafcc4d827f7f19facbeddec4424eba8]

Ilya Dryomov (3):
  libceph: add process_one_ticket() helper
 [597cda357716a3cf8d994cb11927af917c8d71fa]
  libceph: do not hard code max auth ticket len

[PATCH 3.2 064/102] can: flexcan: put TX mailbox into TX_INACTIVE mode after tx-complete

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marc Kleine-Budde 

commit de5944883ebbedbf5adc8497659772f5da7b7d72 upstream.

After sending a RTR frame the TX mailbox becomes a RX_EMPTY mailbox. To avoid
side effects when the RX-FIFO is full, this patch puts the TX mailbox into
TX_INACTIVE mode in the transmission complete interrupt handler. This, of
course, leaves a race window between the actual completion of the transmission
and the handling of tx-complete interrupt. However this is the best we can do
without busy polling the tx complete interrupt.

Signed-off-by: Marc Kleine-Budde 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 drivers/net/can/flexcan.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -632,6 +632,9 @@ static irqreturn_t flexcan_irq(int irq,
 	if (reg_iflag1 & (1 << FLEXCAN_TX_BUF_ID)) {
 		/* tx_bytes is incremented in flexcan_start_xmit */
 		stats->tx_packets++;
+		/* after sending a RTR frame mailbox is in RX mode */
+		flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
+			      &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);
 		flexcan_write((1 << FLEXCAN_TX_BUF_ID), &regs->iflag1);
 		netif_wake_queue(dev);
 	}


[PATCH 3.2 051/102] vfs: Fold follow_mount_rcu() into follow_dotdot_rcu()

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Ben Hutchings 

This is needed before commit 4023bfc9f351 ('be careful with nd->inode
in path_init() and follow_dotdot_rcu()').  A similar change was made
upstream as part of commit b37199e626b3 ('rcuwalk: recheck mount_lock
after mountpoint crossing attempts').

Signed-off-by: Ben Hutchings 
---
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -911,19 +911,6 @@ static bool __follow_mount_rcu(struct na
 	return true;
 }
 
-static void follow_mount_rcu(struct nameidata *nd)
-{
-	while (d_mountpoint(nd->path.dentry)) {
-		struct vfsmount *mounted;
-		mounted = __lookup_mnt(nd->path.mnt, nd->path.dentry, 1);
-		if (!mounted)
-			break;
-		nd->path.mnt = mounted;
-		nd->path.dentry = mounted->mnt_root;
-		nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
-	}
-}
-
 static int follow_dotdot_rcu(struct nameidata *nd)
 {
 	if (!nd->root.mnt)
@@ -950,7 +937,15 @@ static int follow_dotdot_rcu(struct name
 			break;
 		nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
 	}
-	follow_mount_rcu(nd);
+	while (d_mountpoint(nd->path.dentry)) {
+		struct vfsmount *mounted;
+		mounted = __lookup_mnt(nd->path.mnt, nd->path.dentry, 1);
+		if (!mounted)
+			break;
+		nd->path.mnt = mounted;
+		nd->path.dentry = mounted->mnt_root;
+		nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
+	}
 	nd->inode = nd->path.dentry->d_inode;
 	return 0;
 


[PATCH 3.2 049/102] alarmtimer: Lock k_itimer during timer callback

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Richard Larocque 

commit 474e941bed9262f5fa2394f9a4a67e24499e5926 upstream.

Locks the k_itimer's it_lock member when handling the alarm timer's
expiry callback.

The regular posix timers defined in posix-timers.c have this lock held
during timeout processing because their callbacks are routed through
posix_timer_fn().  The alarm timers follow a different path, so they
ought to grab the lock somewhere else.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Richard Cochran 
Cc: Prarit Bhargava 
Cc: Sharvil Nanavati 
Signed-off-by: Richard Larocque 
Signed-off-by: John Stultz 
Signed-off-by: Ben Hutchings 
---
 kernel/time/alarmtimer.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)
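
For readers following the backport, here is a rough userspace sketch of the
control-flow change, not the kernel code itself.  The handler takes the lock
on entry, records its outcome in a local, and leaves through a single exit so
the lock is always dropped.  `struct fake_timer`, `fake_lock()` and
`fake_unlock()` are hypothetical stand-ins for `struct k_itimer`,
`spin_lock_irqsave()` and `spin_unlock_irqrestore()`.

```c
enum restart { NORESTART, RESTART };

/* Hypothetical stand-in for struct k_itimer and its it_lock. */
struct fake_timer {
	int lock_depth;		/* 0 = unlocked, 1 = held */
	long interval;		/* non-zero means the timer is periodic */
	int overrun;
};

static void fake_lock(struct fake_timer *t)   { t->lock_depth++; }
static void fake_unlock(struct fake_timer *t) { t->lock_depth--; }

/* Single-exit handler: every path releases the lock before returning. */
static enum restart handle_timer(struct fake_timer *t)
{
	enum restart result = NORESTART;

	fake_lock(t);
	if (t->interval) {
		t->overrun++;	/* placeholder for the alarm_forward() accounting */
		result = RESTART;
	}
	fake_unlock(t);

	return result;
}
```

Replacing the early returns with a `result` variable is what makes it possible
to add the unlock in exactly one place.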

--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -448,8 +448,12 @@ static enum alarmtimer_type clock2alarm(
 static enum alarmtimer_restart alarm_handle_timer(struct alarm *alarm,
 							ktime_t now)
 {
+	unsigned long flags;
 	struct k_itimer *ptr = container_of(alarm, struct k_itimer,
 						it.alarm.alarmtimer);
+	enum alarmtimer_restart result = ALARMTIMER_NORESTART;
+
+	spin_lock_irqsave(&ptr->it_lock, flags);
 	if ((ptr->it_sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_NONE) {
 		if (posix_timer_event(ptr, 0) != 0)
 			ptr->it_overrun++;
@@ -459,9 +463,11 @@ static enum alarmtimer_restart alarm_han
 	if (ptr->it.alarm.interval.tv64) {
 		ptr->it_overrun += alarm_forward(alarm, now,
 						ptr->it.alarm.interval);
-		return ALARMTIMER_RESTART;
+		result = ALARMTIMER_RESTART;
 	}
-	return ALARMTIMER_NORESTART;
+	spin_unlock_irqrestore(&ptr->it_lock, flags);
+
+	return result;
 }
 
 /**


[PATCH 3.2 093/102] KVM: x86: Handle errors when RIP is set during far jumps

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Nadav Amit 

commit d1442d85cc30ea75f7d399474ca738e0bc96f715 upstream.

Far jmp/call/ret may fault while loading a new RIP.  Currently KVM does not
handle this case, which may result in a failed vm-entry once the assignment
is done.  The tricky part of handling it is that loading the new CS affects
the VMCS/VMCB state, so if we fail while loading the new RIP, we are left in
an inconsistent state.  Therefore, on 64-bit this patch saves the old CS
descriptor and restores it if loading RIP fails.

This fixes CVE-2014-3647.

Signed-off-by: Nadav Amit 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - Adjust context
 - __load_segment_descriptor() does not take an in_task_switch parameter]
Signed-off-by: Ben Hutchings 
---
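
The save-and-restore idea can be sketched in plain C as follows.  This is a
heavily simplified illustration, not the emulator code: `struct cpu_state`,
`assign_rip()` and `far_jump()` are hypothetical names, and the sketch assumes
that the RIP assignment is rejected when the target is non-canonical while a
64-bit code segment is loaded.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified emulator state. */
struct cpu_state {
	uint16_t cs_sel;
	int cs_long;		/* 1 if the loaded CS is a 64-bit code segment */
	uint64_t rip;
};

/* 48-bit canonical form: bits 63:47 are all zeros or all ones. */
static int is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Returns 0 on success, -1 if the new RIP is rejected. */
static int assign_rip(struct cpu_state *s, uint64_t rip)
{
	if (s->cs_long && !is_canonical(rip))
		return -1;
	s->rip = rip;
	return 0;
}

static int far_jump(struct cpu_state *s, uint16_t sel, int cs_long,
		    uint64_t rip)
{
	struct cpu_state old = *s;	/* save CS before touching it */

	s->cs_sel = sel;
	s->cs_long = cs_long;
	if (assign_rip(s, rip) != 0) {
		*s = old;		/* roll back so state stays consistent */
		return -1;
	}
	return 0;
}
```

A failed jump leaves both the selector and RIP as they were before the call,
which is the property the patch is after.
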
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1234,7 +1234,8 @@ static int write_segment_descriptor(stru
 
 /* Does not support long mode */
 static int __load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
-				     u16 selector, int seg, u8 cpl)
+				     u16 selector, int seg, u8 cpl,
+				     struct desc_struct *desc)
 {
 	struct desc_struct seg_desc;
 	u8 dpl, rpl;
@@ -1342,6 +1343,8 @@ static int __load_segment_descriptor(str
 	}
 load:
 	ctxt->ops->set_segment(ctxt, selector, &seg_desc, 0, seg);
+	if (desc)
+		*desc = seg_desc;
 	return X86EMUL_CONTINUE;
 exception:
 	emulate_exception(ctxt, err_vec, err_code, true);
@@ -1352,7 +1355,7 @@ static int load_segment_descriptor(struc
 				   u16 selector, int seg)
 {
 	u8 cpl = ctxt->ops->cpl(ctxt);
-	return __load_segment_descriptor(ctxt, selector, seg, cpl);
+	return __load_segment_descriptor(ctxt, selector, seg, cpl, NULL);
 }
 
 static void write_register_operand(struct operand *op)
@@ -1694,17 +1697,31 @@ static int em_iret(struct x86_emulate_ct
 static int em_jmp_far(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
-	unsigned short sel;
+	unsigned short sel, old_sel;
+	struct desc_struct old_desc, new_desc;
+	const struct x86_emulate_ops *ops = ctxt->ops;
+	u8 cpl = ctxt->ops->cpl(ctxt);
+
+	/* Assignment of RIP may only fail in 64-bit mode */
+	if (ctxt->mode == X86EMUL_MODE_PROT64)
+		ops->get_segment(ctxt, &old_sel, &old_desc, NULL,
+				 VCPU_SREG_CS);
 
 	memcpy(&sel, ctxt->src.valptr + ctxt->op_bytes, 2);
 
-	rc = load_segment_descriptor(ctxt, sel, VCPU_SREG_CS);
+	rc = __load_segment_descriptor(ctxt, sel, VCPU_SREG_CS, cpl,
+				       &new_desc);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 
-	ctxt->_eip = 0;
-	memcpy(&ctxt->_eip, ctxt->src.valptr, ctxt->op_bytes);
-	return X86EMUL_CONTINUE;
+	rc = assign_eip_far(ctxt, ctxt->src.val, new_desc.l);
+	if (rc != X86EMUL_CONTINUE) {
+		WARN_ON(!ctxt->mode != X86EMUL_MODE_PROT64);
+		/* assigning eip failed; restore the old cs */
+		ops->set_segment(ctxt, old_sel, &old_desc, 0, VCPU_SREG_CS);
+		return rc;
+	}
+	return rc;
 }
 
 static int em_grp1a(struct x86_emulate_ctxt *ctxt)
@@ -1856,21 +1873,34 @@ static int em_ret(struct x86_emulate_ctx
 static int em_ret_far(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
-	unsigned long cs;
+	unsigned long eip, cs;
+	u16 old_cs;
 	int cpl = ctxt->ops->cpl(ctxt);
+	struct desc_struct old_desc, new_desc;
+	const struct x86_emulate_ops *ops = ctxt->ops;
+
+	if (ctxt->mode == X86EMUL_MODE_PROT64)
+		ops->get_segment(ctxt, &old_cs, &old_desc, NULL,
+				 VCPU_SREG_CS);
 
-	rc = emulate_pop(ctxt, &ctxt->_eip, ctxt->op_bytes);
+	rc = emulate_pop(ctxt, &eip, ctxt->op_bytes);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
-	if (ctxt->op_bytes == 4)
-		ctxt->_eip = (u32)ctxt->_eip;
 	rc = emulate_pop(ctxt, &cs, ctxt->op_bytes);
 	if (rc != X86EMUL_CONTINUE)
 		return rc;
 	/* Outer-privilege level return is not implemented */
 	if (ctxt->mode >= X86EMUL_MODE_PROT16 && (cs & 3) > cpl)
 		return X86EMUL_UNHANDLEABLE;
-	rc = load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS);
+	rc = __load_segment_descriptor(ctxt, (u16)cs, VCPU_SREG_CS, 0,
+				       &new_desc);
+	if (rc != X86EMUL_CONTINUE)
+		return rc;
+	rc = assign_eip_far(ctxt, eip, new_desc.l);
+	if (rc != X86EMUL_CONTINUE) {
+		WARN_ON(!ctxt->mode != X86EMUL_MODE_PROT64);
+		ops->set_segment(ctxt, old_cs, &old_desc, 0, VCPU_SREG_CS);
+	}
 	return rc;
 }
 
@@ -2248,19 +2278,24 @@ static int load_state_from_tss16(struct
 * Now

[PATCH 3.2 085/102] KVM: x86: Check non-canonical addresses upon WRMSR

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Nadav Amit 

commit 854e8bb1aa06c578c2c9145fa6bfe3680ef63b23 upstream.

Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is
written to certain MSRs. The behavior is "almost" identical for AMD and Intel
(ignoring MSRs that are not implemented in either architecture since they would
anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if
non-canonical address is written on Intel but not on AMD (which ignores the top
32-bits).

Accordingly, this patch injects a #GP on the MSRs which behave identically on
Intel and AMD.  To eliminate the differences between the architectures, the
value written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is converted to a
canonical value before writing instead of injecting a #GP.

Some references from Intel and AMD manuals:

According to the Intel SDM description of the WRMSR instruction, #GP is expected on
WRMSR "If the source register contains a non-canonical address and ECX
specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE,
IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP."

According to the AMD instruction manual:
LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the
LSTAR and CSTAR registers.  If an RIP written by WRMSR is not in canonical
form, a general-protection exception (#GP) occurs."
IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the
base field must be in canonical form or a #GP fault will occur."
IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must
be in canonical form."

This patch fixes CVE-2014-3610.

Signed-off-by: Nadav Amit 
Signed-off-by: Paolo Bonzini 
[bwh: Backported to 3.2:
 - The various set_msr() functions all separate msr_index and data parameters]
Signed-off-by: Ben Hutchings 
---
 arch/x86/include/asm/kvm_host.h | 14 ++
 arch/x86/kvm/svm.c  |  2 +-
 arch/x86/kvm/vmx.c  |  2 +-
 arch/x86/kvm/x86.c  | 27 ++-
 4 files changed, 42 insertions(+), 3 deletions(-)
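
As a standalone illustration of the canonical-address test, the sketch below
sign-extends bit 47 into bits 63:48 using only unsigned arithmetic, so it does
not rely on shifting signed values.  It assumes 48-bit virtual addresses
(`VA_BITS` is a name introduced here); the function names mirror the helpers
added by the patch, but this is a userspace approximation rather than the
kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define VA_BITS 48	/* assumption: 48-bit virtual addresses */

/* Sign-extend bit 47 into bits 63:48. */
static uint64_t get_canonical(uint64_t la)
{
	const uint64_t high = ~0ULL << VA_BITS;

	return (la & (1ULL << (VA_BITS - 1))) ? (la | high) : (la & ~high);
}

/* An address is non-canonical if sign extension would change it. */
static bool is_noncanonical_address(uint64_t la)
{
	return get_canonical(la) != la;
}
```

For example, 0x0000800000000000 is non-canonical because bit 47 is set while
bits 63:48 are clear; its canonical form is 0xffff800000000000.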

--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -821,6 +821,20 @@ static inline void kvm_inject_gp(struct
kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
 }
 
+static inline u64 get_canonical(u64 la)
+{
+   return ((int64_t)la << 16) >> 16;
+}
+
+static inline bool is_noncanonical_address(u64 la)
+{
+#ifdef CONFIG_X86_64
+   return get_canonical(la) != la;
+#else
+   return false;
+#endif
+}
+
 #define TSS_IOPB_BASE_OFFSET 0x66
 #define TSS_BASE_SIZE 0x68
 #define TSS_IOPB_SIZE (65536 / 8)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3109,7 +3109,7 @@ static int wrmsr_interception(struct vcp
 
 
 	svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
-	if (svm_set_msr(&svm->vcpu, ecx, data)) {
+	if (kvm_set_msr(&svm->vcpu, ecx, data)) {
 		trace_kvm_msr_write_ex(ecx, data);
 		kvm_inject_gp(&svm->vcpu, 0);
 	} else {
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4544,7 +4544,7 @@ static int handle_wrmsr(struct kvm_vcpu
u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);
 
-   if (vmx_set_msr(vcpu, ecx, data) != 0) {
+   if (kvm_set_msr(vcpu, ecx, data) != 0) {
trace_kvm_msr_write_ex(ecx, data);
kvm_inject_gp(vcpu, 0);
return 1;
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -893,7 +893,6 @@ void kvm_enable_efer_bits(u64 mask)
 }
 EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
 
-
 /*
  * Writes msr value into into the appropriate "register".
  * Returns 0 on success, non-0 otherwise.
@@ -901,8 +900,34 @@ EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
  */
 int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 {
+   switch (msr_index) {
+   case MSR_FS_BASE:
+   case MSR_GS_BASE:
+   case MSR_KERNEL_GS_BASE:
+   case MSR_CSTAR:
+   case MSR_LSTAR:
+   if (is_noncanonical_address(data))
+   return 1;
+   break;
+   case MSR_IA32_SYSENTER_EIP:
+   case MSR_IA32_SYSENTER_ESP:
+   /*
+* IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if
+* non-canonical address is written on Intel but not on
+* AMD (which ignores the top 32-bits, because it does
+* not implement 64-bit SYSENTER).
+*
+* 64-bit code should hence be able to write a non-canonical
+* value on AMD.  Making the address canonical ensures that
+* vmentry does not fail on Intel after writing a non-canonical
+* value, and that something deterministic happens if the guest
+* invokes 64-bit SYSENTER.
+*/
+   data

[PATCH 3.2 070/102] ARM: 8165/1: alignment: don't break misaligned NEON load/store

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Robin Murphy 

commit 5ca918e5e3f9df4634077c06585c42bc6a8d699a upstream.

The alignment fixup incorrectly decodes faulting ARM VLDn/VSTn
instructions (where the optional alignment hint is given but incorrect)
as LDR/STR, leading to register corruption. Detect these and correctly
treat them as unhandled, so that userspace gets the fault it expects.

Reported-by: Simon Hosie 
Signed-off-by: Robin Murphy 
Signed-off-by: Russell King 
Signed-off-by: Ben Hutchings 
---
 arch/arm/mm/alignment.c | 3 +++
 1 file changed, 3 insertions(+)
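
To make the decode problem concrete, the sketch below applies the two masks,
written here as full 32-bit values, to a pair of instruction words.  The helper
name `is_neon_ldst_lookalike()` is hypothetical, and the two sample encodings
in the comments are my own reading of the ARM encoding tables, so treat them as
illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

#define CODING_BITS(i)	((i) & 0x0e000000u)	/* bits 27:25 */
#define COND_BITS(i)	((i) & 0xf0000000u)	/* bits 31:28 */

/*
 * True when a word would reach the "ldr or str immediate" case but sits in
 * the unconditional (cond == 0b1111) space, where NEON VLDn/VSTn live.
 */
static bool is_neon_ldst_lookalike(uint32_t instr)
{
	return CODING_BITS(instr) == 0x04000000u &&
	       COND_BITS(instr) == 0xf0000000u;
}
```

With these masks, 0xe5910000 (believed to be `ldr r0, [r1]`) is an ordinary
load, while 0xf420071f (believed to be a `vld1.8` with an alignment hint) has
the same coding bits and is told apart only by its condition field.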

--- a/arch/arm/mm/alignment.c
+++ b/arch/arm/mm/alignment.c
@@ -38,6 +38,7 @@
  * This code is not portable to processors with late data abort handling.
  */
 #define CODING_BITS(i)	(i & 0x0e000000)
+#define COND_BITS(i)	(i & 0xf0000000)
 
 #define LDST_I_BIT(i)	(i & (1 << 26))		/* Immediate constant	*/
 #define LDST_P_BIT(i)	(i & (1 << 24))		/* Preindex		*/
@@ -812,6 +813,8 @@ do_alignment(unsigned long addr, unsigne
 		break;
 
 	case 0x04000000:	/* ldr or str immediate */
+		if (COND_BITS(instr) == 0xf0000000) /* NEON VLDn, VSTn */
+			goto bad;
 		offset.un = OFFSET_BITS(instr);
 		handler = do_alignment_ldrstr;
 		break;


[PATCH 3.2 052/102] be careful with nd->inode in path_init() and follow_dotdot_rcu()

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Al Viro 

commit 4023bfc9f351a7994fb6a7d515476c320f94a574 upstream.

in the former we simply check if dentry is still valid after picking
its ->d_inode; in the latter we fetch ->d_inode in the same places
where we fetch dentry and its ->d_seq, under the same checks.

Signed-off-by: Al Viro 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 fs/namei.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)
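
The underlying pattern is "sample a field, then validate the sample against
the sequence count that guards it".  The userspace sketch below shows that
pattern with C11 atomics.  It is a simplified analogue, not the kernel's
seqcount implementation, and `struct node`, `read_begin()`, `read_retry()`,
`write_update()` and `sample_inode()` are all hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry guarded by a sequence counter. */
struct node {
	atomic_uint seq;	/* even: stable, odd: writer in progress */
	atomic_int inode;	/* the field sampled by readers */
};

static unsigned read_begin(struct node *n)
{
	unsigned s;

	/* Wait out an in-progress writer (odd count). */
	while ((s = atomic_load_explicit(&n->seq, memory_order_acquire)) & 1)
		;
	return s;
}

static bool read_retry(struct node *n, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&n->seq, memory_order_relaxed) != start;
}

static void write_update(struct node *n, int inode)
{
	atomic_fetch_add_explicit(&n->seq, 1, memory_order_acq_rel);
	atomic_store_explicit(&n->inode, inode, memory_order_relaxed);
	atomic_fetch_add_explicit(&n->seq, 1, memory_order_release);
}

/* Sample the inode and accept it only if no writer ran in between. */
static bool sample_inode(struct node *n, int *out)
{
	unsigned s = read_begin(n);
	int inode = atomic_load_explicit(&n->inode, memory_order_relaxed);

	if (read_retry(n, s))
		return false;	/* caller falls back, much like -ECHILD */
	*out = inode;
	return true;
}
```

The point of the patch is the ordering: a value read outside the window that
the sequence check covers cannot be trusted, so the read has to happen next to
the corresponding `->d_seq` fetch.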

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -913,6 +913,7 @@ static bool __follow_mount_rcu(struct na
 
 static int follow_dotdot_rcu(struct nameidata *nd)
 {
+   struct inode *inode = nd->inode;
 	if (!nd->root.mnt)
 		set_root_rcu(nd);
 
@@ -926,6 +927,7 @@ static int follow_dotdot_rcu(struct name
 			struct dentry *parent = old->d_parent;
 			unsigned seq;
 
+			inode = parent->d_inode;
 			seq = read_seqcount_begin(&parent->d_seq);
 			if (read_seqcount_retry(&old->d_seq, nd->seq))
 				goto failed;
@@ -935,6 +937,7 @@ static int follow_dotdot_rcu(struct name
 		}
 		if (!follow_up_rcu(&nd->path))
 			break;
+		inode = nd->path.dentry->d_inode;
 		nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
 	}
 	while (d_mountpoint(nd->path.dentry)) {
@@ -944,9 +947,10 @@ static int follow_dotdot_rcu(struct name
 			break;
 		nd->path.mnt = mounted;
 		nd->path.dentry = mounted->mnt_root;
+		inode = nd->path.dentry->d_inode;
 		nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
 	}
-	nd->inode = nd->path.dentry->d_inode;
+	nd->inode = inode;
 	return 0;
 
 failed:
@@ -1556,7 +1560,14 @@ static int path_init(int dfd, const char
 	}
 
 	nd->inode = nd->path.dentry->d_inode;
-	return 0;
+	if (!(flags & LOOKUP_RCU))
+		return 0;
+	if (likely(!read_seqcount_retry(&nd->path.dentry->d_seq, nd->seq)))
+		return 0;
+	if (!(nd->flags & LOOKUP_ROOT))
+		nd->root.mnt = NULL;
+	rcu_read_unlock();
+	return -ECHILD;
 
 fput_fail:
 	fput_light(file, fput_needed);


[PATCH 3.2 062/102] can: flexcan: correctly initialize mailboxes

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: David Jander 

commit fc05b884a31dbf259cc73cc856e634ec3acbebb6 upstream.

Apparently mailboxes may contain random data at startup, causing some of them
to be prepared for message reception. This causes overruns to be missed or
even confuses the IRQ check for transmitted messages, increasing the transmit
counter instead of the error counter.

This patch initializes all mailboxes after the FIFO as RX_INACTIVE.

Signed-off-by: David Jander 
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Ben Hutchings 
---
 drivers/net/can/flexcan.c | 7 +++
 1 file changed, 7 insertions(+)
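
The initialization order can be modelled in userspace roughly as below.  The
register block is replaced by a plain array, and `TX_BUF_ID`, `MB_COUNT`,
`struct fake_regs` and `init_mailboxes()` are assumptions made for the sketch
(the mailbox count and TX index in particular are illustrative values).

```c
#include <stddef.h>
#include <stdint.h>

#define MB_CODE_RX_INACTIVE	(0x0u << 24)
#define MB_CODE_TX_INACTIVE	(0x8u << 24)

#define TX_BUF_ID	8	/* assumption: index of the TX mailbox */
#define MB_COUNT	64	/* assumption: number of mailboxes */

/* Hypothetical in-memory stand-in for the mailbox control words. */
struct fake_regs {
	uint32_t mb_ctrl[MB_COUNT];
};

static void init_mailboxes(struct fake_regs *regs)
{
	size_t i;

	/* Invalidate every mailbox from the TX slot onward ... */
	for (i = TX_BUF_ID; i < MB_COUNT; i++)
		regs->mb_ctrl[i] = MB_CODE_RX_INACTIVE;

	/* ... then mark the TX mailbox itself as TX_INACTIVE. */
	regs->mb_ctrl[TX_BUF_ID] = MB_CODE_TX_INACTIVE;
}
```

Mailboxes below `TX_BUF_ID` are left alone in this model, on the assumption
that they belong to the RX FIFO.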

--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -679,6 +679,7 @@ static int flexcan_chip_start(struct net
 	struct flexcan_regs __iomem *regs = priv->base;
 	int err;
 	u32 reg_mcr, reg_ctrl;
+	int i;
 
 	/* enable module */
 	flexcan_chip_enable(priv);
@@ -744,6 +745,12 @@ static int flexcan_chip_start(struct net
 	dev_dbg(dev->dev.parent, "%s: writing ctrl=0x%08x", __func__, reg_ctrl);
 	flexcan_write(reg_ctrl, &regs->ctrl);
 
+	/* clear and invalidate all mailboxes first */
+	for (i = FLEXCAN_TX_BUF_ID; i < ARRAY_SIZE(regs->cantxfg); i++) {
+		flexcan_write(FLEXCAN_MB_CODE_RX_INACTIVE,
+			      &regs->cantxfg[i].can_ctrl);
+	}
+
 	/* mark TX mailbox as INACTIVE */
 	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
 		      &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);


[PATCH 3.2 061/102] can: flexcan: mark TX mailbox as TX_INACTIVE

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Marc Kleine-Budde 

commit c32fe4ad3e4861b2bfa1f44114c564935a123dda upstream.

This patch fixes the initialization of the TX mailbox. It is now correctly
initialized as TX_INACTIVE not RX_EMPTY.

Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Ben Hutchings 
---
 drivers/net/can/flexcan.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)
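
A small standalone check shows why the old value was wrong: the raw code 0x4
that used to be written is the RX_EMPTY encoding, not TX_INACTIVE.  The macro
names are shortened copies of the ones in the patch, and `mb_code()` is a
hypothetical helper added for the sketch.

```c
#include <stdint.h>

#define MB_CNT_CODE(x)		(((uint32_t)(x) & 0xf) << 24)
#define MB_CODE_RX_EMPTY	(0x4u << 24)
#define MB_CODE_TX_INACTIVE	(0x8u << 24)

/* Extract the 4-bit CODE field from a mailbox control word. */
static unsigned mb_code(uint32_t ctrl)
{
	return (ctrl >> 24) & 0xf;
}
```

Here `MB_CNT_CODE(0x4)` equals `MB_CODE_RX_EMPTY`, so the old write armed the
mailbox for reception, which is the misbehaviour the patch description names.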

--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -131,6 +131,17 @@
 
 /* FLEXCAN message buffers */
 #define FLEXCAN_MB_CNT_CODE(x)		(((x) & 0xf) << 24)
+#define FLEXCAN_MB_CODE_RX_INACTIVE	(0x0 << 24)
+#define FLEXCAN_MB_CODE_RX_EMPTY	(0x4 << 24)
+#define FLEXCAN_MB_CODE_RX_FULL		(0x2 << 24)
+#define FLEXCAN_MB_CODE_RX_OVERRRUN	(0x6 << 24)
+#define FLEXCAN_MB_CODE_RX_RANSWER	(0xa << 24)
+
+#define FLEXCAN_MB_CODE_TX_INACTIVE	(0x8 << 24)
+#define FLEXCAN_MB_CODE_TX_ABORT	(0x9 << 24)
+#define FLEXCAN_MB_CODE_TX_DATA		(0xc << 24)
+#define FLEXCAN_MB_CODE_TX_TANSWER	(0xe << 24)
+
 #define FLEXCAN_MB_CNT_SRR		BIT(22)
 #define FLEXCAN_MB_CNT_IDE		BIT(21)
 #define FLEXCAN_MB_CNT_RTR		BIT(20)
@@ -733,8 +744,8 @@ static int flexcan_chip_start(struct net
 	dev_dbg(dev->dev.parent, "%s: writing ctrl=0x%08x", __func__, reg_ctrl);
 	flexcan_write(reg_ctrl, &regs->ctrl);
 
-	/* Abort any pending TX, mark Mailbox as INACTIVE */
-	flexcan_write(FLEXCAN_MB_CNT_CODE(0x4),
+	/* mark TX mailbox as INACTIVE */
+	flexcan_write(FLEXCAN_MB_CODE_TX_INACTIVE,
 		      &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);
 
 	/* acceptance mask/acceptance code (accept everything) */


[PATCH 3.2 060/102] nl80211: clear skb cb before passing to netlink

2014-11-01 Thread Ben Hutchings

3.2.64-rc1 review patch.  If anyone has any objections, please let me know.

--

From: Johannes Berg 

commit bd8c78e78d5011d8111bc2533ee73b13a3bd6c42 upstream.

In testmode and vendor command reply/event SKBs we use the
skb cb data to store nl80211 parameters between allocation
and sending. This causes the code for CONFIG_NETLINK_MMAP
to get confused, because it takes ownership of the skb cb
data when the SKB is handed off to netlink, and it doesn't
explicitly clear it.

Clear the skb cb explicitly when we're done and before it
gets passed to netlink to avoid this issue.

Reported-by: Assaf Azulay 
Reported-by: David Spinadel 
Signed-off-by: Johannes Berg 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings 
---
 net/wireless/nl80211.c | 6 ++
 1 file changed, 6 insertions(+)
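
The ordering matters: the stashed pointers have to be read out of the control
buffer before it is wiped.  A minimal userspace model, with `struct fake_skb`
and `take_cb()` as hypothetical stand-ins and the 48-byte buffer size taken as
an assumption:

```c
#include <string.h>

/* Hypothetical stand-in for the control buffer of struct sk_buff. */
struct fake_skb {
	char cb[48];
};

/*
 * Read the stashed pointers out of cb first, then clear cb so the next
 * owner of the buffer starts from a clean slate.
 */
static void take_cb(struct fake_skb *skb, void **hdr, void **data)
{
	void *slots[3];

	memcpy(slots, skb->cb, sizeof(slots));
	*hdr = slots[1];
	*data = slots[2];

	memset(skb->cb, 0, sizeof(skb->cb));
}
```

Clearing before the reads would discard the very values the function needs,
which is why the patch places the `memset()` after the two local assignments.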

--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -4804,6 +4804,9 @@ int cfg80211_testmode_reply(struct sk_bu
 	void *hdr = ((void **)skb->cb)[1];
 	struct nlattr *data = ((void **)skb->cb)[2];
 
+	/* clear CB data for netlink core to own from now on */
+	memset(skb->cb, 0, sizeof(skb->cb));
+
 	if (WARN_ON(!rdev->testmode_info)) {
 		kfree_skb(skb);
 		return -EINVAL;
@@ -4830,6 +4833,9 @@ void cfg80211_testmode_event(struct sk_b
 	void *hdr = ((void **)skb->cb)[1];
 	struct nlattr *data = ((void **)skb->cb)[2];
 
+	/* clear CB data for netlink core to own from now on */
+	memset(skb->cb, 0, sizeof(skb->cb));
+
 	nla_nest_end(skb, data);
 	genlmsg_end(skb, hdr);
 	genlmsg_multicast_netns(wiphy_net(&rdev->wiphy), skb, 0,
