Re: Regarding "scsi_lib: Decode T10 vendor IDs" d230823a1c4c3e97afd4c934b86b3975d5e20249

2016-11-25 Thread Hannes Reinecke
On 11/24/2016 04:48 PM, Sylvain Munaut wrote:
> Hi,
> 
> 
> Regarding this commit :
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d230823a1c4c3e97afd4c934b86b3975d5e20249
> 
> Looking at both the code and the comment, it seems to me you wanted
> 'T10' id to be a last resort if nothing else was available.
> 
> If that was the intent, it's not working ...
> 
> I have a Samsung 850 Pro SSD that has the NAA identifier after the T10
> one in the vpd_pg83 and so it's returning the T10 now because it's
> longer.
> 
Hmm.

> Did I misunderstand the intent ? Or is it a bug ? (for the latter, I
> can write a patch myself, I just didn't want to lose my time if I
> misunderstood).
> 
The T10 identifier should _always_ be the last resort; anything is
better than that :-)

But for NAA we have to check the length, as I've come across an array
which will (erroneously) report _two_ NAA identifiers, and only the
longer one of them will be providing the correct identification.
The other one is in fact the NAA for the target port, and thus identical
for all LUNs from that port :-(

So yeah, the intention was not to prefer a longer T10 to a NAA identifier.
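
A minimal sketch of that intent, assuming a simplified view of the designators in VPD page 0x83. This is not the code in scsi_lib; `struct desig`, `desig_rank()` and `pick_designator()` are hypothetical names used only for illustration. T10 ranks below every other type, and length only breaks ties between designators of the same type:

```c
#include <assert.h>
#include <stddef.h>

/* Designator type codes from SPC: 1 = T10 vendor ID, 2 = EUI-64, 3 = NAA. */
enum { DESIG_T10 = 1, DESIG_EUI64 = 2, DESIG_NAA = 3 };

/* Hypothetical, simplified view of one designator from the page. */
struct desig {
	int type;
	size_t len;
};

/* T10 is the last resort; every other type ranks equally in this sketch. */
static int desig_rank(int type)
{
	return type == DESIG_T10 ? 0 : 1;
}

/* Return the index of the preferred designator, or -1 for an empty list. */
static int pick_designator(const struct desig *d, size_t n)
{
	int best = -1;
	size_t i;

	assert(d != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int r, rb;

		if (best < 0) {
			best = (int)i;
			continue;
		}
		r = desig_rank(d[i].type);
		rb = desig_rank(d[best].type);
		/* A higher rank always wins; length only compares like with like. */
		if (r > rb ||
		    (r == rb && d[i].type == d[best].type && d[i].len > d[best].len))
			best = (int)i;
	}
	return best;
}
```

With this ordering a long T10 designator followed by a shorter NAA one still yields the NAA one, while of two NAA designators the longer one is picked.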

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                zSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH] reset: hisilicon: add a polarity cell for reset line specifier

2016-11-25 Thread Jiancheng Xue


On 2016/11/25 11:45, Jiancheng Xue wrote:
> 
> On 2016/11/21 10:58, Jiancheng Xue wrote:
>> Hi Philipp,
>>
>>> On 2016/11/15 18:43, Philipp Zabel wrote:
 Hi Jiancheng,

 On Tuesday, 15.11.2016 at 15:09 +0800, Jiancheng Xue wrote:
> Add a polarity cell for reset line specifier. If the reset line
> is asserted when the register bit is 1, the polarity is
> normal. Otherwise, it is inverted.
>
> Signed-off-by: Jiancheng Xue 
> ---
>>> Thank you very much for replying so soon.
>>>
>>> Please allow me to describe the reason why this patch exists first.
>>> All bits in the reset controller were designed to be active-high.
>>> But in a recent chip only one bit was implemented to be active-low :(
>>>
>  .../devicetree/bindings/clock/hisi-crg.txt | 11 ---
>  arch/arm/boot/dts/hi3519.dtsi  |  2 +-
>  drivers/clk/hisilicon/reset.c  | 36 
> --
>  3 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/clock/hisi-crg.txt 
> b/Documentation/devicetree/bindings/clock/hisi-crg.txt
> index e3919b6..fcbb4f3 100644
> --- a/Documentation/devicetree/bindings/clock/hisi-crg.txt
> +++ b/Documentation/devicetree/bindings/clock/hisi-crg.txt
> @@ -25,19 +25,20 @@ to specify the clock which they consume.
>  
>  All these identifier could be found in 
> .
>  
> -- #reset-cells: should be 2.
> +- #reset-cells: should be 3.
>  
>  A reset signal can be controlled by writing a bit register in the CRG 
> module.
> -The reset specifier consists of two cells. The first cell represents the
> +The reset specifier consists of three cells. The first cell represents 
> the
>  register offset relative to the base address. The second cell represents 
> the
> -bit index in the register.
> +bit index in the register. The third cell represents the polarity of the 
> reset
> +line (0 for normal, 1 for inverted).

>> #reset-cells: Should be 2 if compatible string is "hisilicon,hi3519-crg". 
>> Should be 3 otherwise.
>>A reset signal can be controlled by writing a bit register in the 
>> CRG module.
>>The reset specifier consists of two or three cells. The first 
>> cell represents the
>>register offset relative to the base address. The second cell 
>> represents the
>>bit index in the register. The third cell represents the polarity 
>> of the reset
>>line (0 for active-high, 1 for active-low).
>>
>> If I change the binding like this, can it be accepted?
>>
> Hi Philipp,
> 
> Could you give me more suggestions about this?  If you really don't like 
> changing the
> reset-cells like this, I can modify the patch according to your suggestions.
> Thank you.
> 

I'll drop this patch and use "ti,syscon-reset" instead to resolve the polarity 
issue. Thanks.
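
For reference, the polarity handling discussed in this thread can be sketched as follows. This is a standalone illustration, not the code from drivers/clk/hisilicon/reset.c; `reset_line_update()` and its parameters are hypothetical. It assumes the meaning proposed above: an active-high line is asserted by setting the register bit, an active-low line by clearing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the new register value for asserting or deasserting one reset
 * line. 'bit' is the bit index within the register, 'active_low' is the
 * polarity cell (false = active-high, true = active-low).
 */
static uint32_t reset_line_update(uint32_t reg, unsigned int bit,
				  bool active_low, bool assert_reset)
{
	uint32_t mask;
	bool set_bit;

	assert(bit < 32);
	mask = UINT32_C(1) << bit;

	/* Active-high: assert sets the bit. Active-low: assert clears it. */
	set_bit = (assert_reset != active_low);

	return set_bit ? (reg | mask) : (reg & ~mask);
}
```

A real driver would do this read-modify-write on the CRG register under a lock; the sketch only covers the bit arithmetic.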

Regards,
Jiancheng






Re: [PATCH v6 3/9] tpm: replace dynamically allocated bios_dir with a static array

2016-11-25 Thread Jarkko Sakkinen
On Thu, Nov 24, 2016 at 09:53:13AM -0700, Jason Gunthorpe wrote:
> On Thu, Nov 24, 2016 at 03:57:23PM +0200, Jarkko Sakkinen wrote:
> > I manually added the changes to:
> > 
> >   tpm: replace dynamically allocated bios_dir with a static array
> 
> For this patch..
> 
> Could drop 'int rc' from tpm1_chip_register, but it will come back in
> a later patch
> 
> Could dump TPM_NUM_EVENT_LOG_FILES and just use
> ARRAY_SIZE(chip->bios_dir)

Not a bug fix. Happy to take a patch after the pull request.
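
For context, the ARRAY_SIZE() suggestion amounts to deriving the element count from the array itself instead of keeping a separate constant in sync. A minimal userspace sketch, assuming a hypothetical `struct chip_sketch` with three slots (the real size of `bios_dir` is not shown in this thread) and a simplified macro without the kernel's type check:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version; the kernel macro also rejects pointers. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the chip structure; 3 slots is an assumption. */
struct chip_sketch {
	const void *bios_dir[3];
};

/* Count populated slots without a separate "number of files" constant. */
static size_t count_used_slots(const struct chip_sketch *chip)
{
	size_t i, used = 0;

	assert(chip != NULL);

	for (i = 0; i < ARRAY_SIZE(chip->bios_dir); i++)
		if (chip->bios_dir[i])
			used++;
	return used;
}
```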

> Now the stub for tpm_bios_log_setup can properly return -ENODEV
> 
> This is no good at this point in the series - we need the ENODEV
> detection in tpm_chip_register() from the 'Fix handle of missing event
> log' moved into this patch, because it now returns ENODEV due to
> securityfs

Sure it would be cleaner but not really necessary. Do you really think
this is mandatory? No matter how I reorder patches this will require
time and effort to fix various merge conflicts because of the replacement
of event log. After that I have to test everything.

Not doing this for light reasons...

> The commit 'tpm: vtpm_proxy: Do not access host's event log' still
> needs a proper commit message - it doesn't match what the patch is
> doing at all.

The commit message is otherwise great except for the short summary, which
is now fixed.

> If you are going to be editing the patches I'd suggest squashing all
> the bug fix ones with proper credit so it at least makes sense to
> read..
> 
> Jason

I do not have interest to edit patches more than I have to.

/Jarkko


Re: [tpmdd-devel] [PATCH RFC 2/2] tpm: refactor tpm2_get_tpm_pt to tpm2_getcap_cmd

2016-11-25 Thread Jarkko Sakkinen
On Thu, Nov 24, 2016 at 07:12:57PM +0530, Nayna wrote:
> 
> 
> On 11/15/2016 05:18 AM, Jarkko Sakkinen wrote:
> > On Fri, Nov 11, 2016 at 04:02:43PM -0800, Jarkko Sakkinen wrote:
> > > On Fri, Nov 11, 2016 at 09:51:45AM +0530, Nayna wrote:
> > > > 
> > > > 
> > > > On 10/09/2016 03:44 PM, Jarkko Sakkinen wrote:
> > > > > Refactored tpm2_get_tpm_pt to tpm2_getcap_cmd, which means that it 
> > > > > also
> > > > > takes capability ID as input. This is required to access
> > > > > TPM_CAP_HANDLES, which contains metadata needed for swapping transient
> > > > > data.
> > > > > 
> > > > > Signed-off-by: Jarkko Sakkinen 
> > > > > ---
> > > > >drivers/char/tpm/tpm.h  |  6 +++-
> > > > >drivers/char/tpm/tpm2-cmd.c | 64 
> > > > > -
> > > > >drivers/char/tpm/tpm_tis_core.c |  3 +-
> > > > >3 files changed, 38 insertions(+), 35 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> > > > > index 0fab6d5..8176f42 100644
> > > > > --- a/drivers/char/tpm/tpm.h
> > > > > +++ b/drivers/char/tpm/tpm.h
> > > > > @@ -85,6 +85,10 @@ enum tpm2_capabilities {
> > > > >   TPM2_CAP_TPM_PROPERTIES = 6,
> > > > >};
> > > > > 
> > > > > +enum tpm2_properties {
> > > > > + TPM2_PT_FAMILY_INDICATOR= 0x100,
> > > > > +};
> > > > > +
> > > > >enum tpm2_startup_types {
> > > > >   TPM2_SU_CLEAR   = 0x,
> > > > >   TPM2_SU_STATE   = 0x0001,
> > > > > @@ -485,7 +489,7 @@ int tpm2_seal_trusted(struct tpm_chip *chip,
> > > > >int tpm2_unseal_trusted(struct tpm_chip *chip,
> > > > >   struct trusted_key_payload *payload,
> > > > >   struct trusted_key_options *options);
> > > > > -ssize_t tpm2_get_tpm_pt(struct tpm_chip *chip, u32 property_id,
> > > > > +ssize_t tpm2_getcap_cmd(struct tpm_chip *chip, u32 cap_id, u32 
> > > > > property_id,
> > > > >   u32 *value, const char *desc);
> > > > > 
> > > > >int tpm2_auto_startup(struct tpm_chip *chip);
> > > > > diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> > > > > index 2900e18..fcf3d86 100644
> > > > > --- a/drivers/char/tpm/tpm2-cmd.c
> > > > > +++ b/drivers/char/tpm/tpm2-cmd.c
> > > > > @@ -111,13 +111,13 @@ struct tpm2_pcr_extend_in {
> > > > >   u8  digest[TPM_DIGEST_SIZE];
> > > > >} __packed;
> > > > > 
> > > > > -struct tpm2_get_tpm_pt_in {
> > > > > +struct tpm2_getcap_in {
> > > > >   __be32  cap_id;
> > > > >   __be32  property_id;
> > > > >   __be32  property_cnt;
> > > > >} __packed;
> > > > > 
> > > > > -struct tpm2_get_tpm_pt_out {
> > > > > +struct tpm2_getcap_out {
> > > > >   u8  more_data;
> > > > >   __be32  subcap_id;
> > > > >   __be32  property_cnt;
> > > > > @@ -140,8 +140,8 @@ union tpm2_cmd_params {
> > > > >   struct  tpm2_pcr_read_inpcrread_in;
> > > > >   struct  tpm2_pcr_read_out   pcrread_out;
> > > > >   struct  tpm2_pcr_extend_in  pcrextend_in;
> > > > > - struct  tpm2_get_tpm_pt_in  get_tpm_pt_in;
> > > > > - struct  tpm2_get_tpm_pt_out get_tpm_pt_out;
> > > > > + struct  tpm2_getcap_in  getcap_in;
> > > > > + struct  tpm2_getcap_out getcap_out;
> > > > >   struct  tpm2_get_random_in  getrandom_in;
> > > > >   struct  tpm2_get_random_out getrandom_out;
> > > > >};
> > > > > @@ -435,16 +435,6 @@ int tpm2_get_random(struct tpm_chip *chip, u8 
> > > > > *out, size_t max)
> > > > >   return total ? total : -EIO;
> > > > >}
> > > > > 
> > > > > -#define TPM2_GET_TPM_PT_IN_SIZE \
> > > > > - (sizeof(struct tpm_input_header) + \
> > > > > -  sizeof(struct tpm2_get_tpm_pt_in))
> > > > > -
> > > > > -static const struct tpm_input_header tpm2_get_tpm_pt_header = {
> > > > > - .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS),
> > > > > - .length = cpu_to_be32(TPM2_GET_TPM_PT_IN_SIZE),
> > > > > - .ordinal = cpu_to_be32(TPM2_CC_GET_CAPABILITY)
> > > > > -};
> > > > > -
> > > > >/**
> > > > > * Append TPMS_AUTH_COMMAND to the buffer. The buffer must be 
> > > > > allocated with
> > > > > * tpm_buf_alloc().
> > > > > @@ -750,35 +740,43 @@ out:
> > > > >   return rc;
> > > > >}
> > > > > 
> > > > > +#define TPM2_GETCAP_IN_SIZE \
> > > > > + (sizeof(struct tpm_input_header) + sizeof(struct 
> > > > > tpm2_getcap_in))
> > > > > +
> > > > > +static const struct tpm_input_header tpm2_getcap_header = {
> > > > > + .tag = cpu_to_be16(TPM2_ST_NO_SESSIONS),
> > > > > + .length = cpu_to_be32(TPM2_GETCAP_IN_SIZE),
> > > > > + .ordinal = cpu_to_be32(TPM2_CC_GET_CAPABILITY)
> > > > > +};
> > > > > +
> > > > >/**
> > > > > - * tpm2_get_tpm_pt() - get value of a TPM_CAP_TPM_PROPERTIES type 
> > > > > property
> > > > > - * @chip:TPM chip to use.
> > > > > - * @property_id: property ID.
> > > > > - * @value:   output var
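
As an illustration of the command layout the quoted patch describes (a header followed by the three big-endian fields of `struct tpm2_getcap_in`), here is a minimal userspace sketch. It is not the driver code: `build_getcap()` and the `put_be*()` helpers are hypothetical, and the tag and ordinal values (0x8001 for TPM2_ST_NO_SESSIONS, 0x017A for TPM2_CC_GET_CAPABILITY) are taken from the TPM 2.0 specification as I understand it. The header is assumed to be the usual 2-byte tag, 4-byte length and 4-byte ordinal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ST_NO_SESSIONS    0x8001u  /* assumed value, TPM 2.0 spec */
#define SKETCH_CC_GET_CAPABILITY 0x017Au  /* assumed value, TPM 2.0 spec */
#define SKETCH_GETCAP_CMD_LEN    22u      /* 10-byte header + 3 * 4 bytes */

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Serialize a GetCapability command; returns bytes written, 0 if too small. */
static size_t build_getcap(uint8_t *buf, size_t buflen, uint32_t cap_id,
			   uint32_t property_id, uint32_t property_cnt)
{
	assert(buf != NULL);

	if (buflen < SKETCH_GETCAP_CMD_LEN)
		return 0;

	put_be16(buf, SKETCH_ST_NO_SESSIONS);          /* tag */
	put_be32(buf + 2, SKETCH_GETCAP_CMD_LEN);      /* total length */
	put_be32(buf + 6, SKETCH_CC_GET_CAPABILITY);   /* ordinal */
	put_be32(buf + 10, cap_id);
	put_be32(buf + 14, property_id);
	put_be32(buf + 18, property_cnt);
	return SKETCH_GETCAP_CMD_LEN;
}
```

Calling it with `cap_id = 6` (TPM2_CAP_TPM_PROPERTIES) and `property_id = 0x100` (TPM2_PT_FAMILY_INDICATOR), both values from the quoted diff, gives the request the old tpm2_get_tpm_pt() always sent; passing a different capability ID is what the refactoring makes possible.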

Re: [PATCH v3 2/6] iio: adc: Add support for STM32 ADC core

2016-11-25 Thread Fabrice Gasnier

On 11/24/2016 09:40 PM, Jonathan Cameron wrote:

On 21/11/16 08:54, Fabrice Gasnier wrote:

On 11/19/2016 01:17 PM, Jonathan Cameron wrote:

On 15/11/16 15:30, Fabrice Gasnier wrote:

Add core driver for STMicroelectronics STM32 ADC (Analog to Digital
Converter). STM32 ADC can be composed of up to 3 ADCs with shared
resources like clock prescaler, common interrupt line and analog
reference voltage.
This core driver basically manages shared resources.

Signed-off-by: Fabrice Gasnier 

There is nothing in here that demands selecting a fixed regulator.
I've also switched the select regulator over to depends on, in line with
other drivers in IIO that have a hard dependency on regulators.
Other than that which showed up during build tests, looks good to me.
Shout if I've broken anything with this change.

Hi Jonathan, All,

First many thanks.
This is not a big deal. The only thing is: I think patch 4 of this series (on
stm32_defconfig) needs to be updated to accommodate this change. E.g.:
+CONFIG_REGULATOR=y
+CONFIG_REGULATOR_FIXED_VOLTAGE=y

Shall I send a new version of this series (all patches), including your
changes, with the updated defconfig as well?
Or is an updated defconfig patch alone enough?

Just update those that haven't already been applied.

Hi,

I'll update these only.
Thanks,
Fabrice



Thanks,

Jonathan

Please advise,
Fabrice

Applied to the togreg branch of iio.git and pushed out as testing for
the autobuilders to play with it.

Thanks,

Jonathan

---
   drivers/iio/adc/Kconfig  |  13 ++
   drivers/iio/adc/Makefile |   1 +
   drivers/iio/adc/stm32-adc-core.c | 303 
+++
   drivers/iio/adc/stm32-adc-core.h |  52 +++
   4 files changed, 369 insertions(+)
   create mode 100644 drivers/iio/adc/stm32-adc-core.c
   create mode 100644 drivers/iio/adc/stm32-adc-core.h

diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
index 7edcf32..ff30239 100644
--- a/drivers/iio/adc/Kconfig
+++ b/drivers/iio/adc/Kconfig
@@ -419,6 +419,19 @@ config ROCKCHIP_SARADC
 To compile this driver as a module, choose M here: the
 module will be called rockchip_saradc.
   +config STM32_ADC_CORE
+tristate "STMicroelectronics STM32 adc core"
+depends on ARCH_STM32 || COMPILE_TEST
+depends on OF
+select REGULATOR
+select REGULATOR_FIXED_VOLTAGE
+help
+  Select this option to enable the core driver for STMicroelectronics
+  STM32 analog-to-digital converter (ADC).
+
+  This driver can also be built as a module.  If so, the module
+  will be called stm32-adc-core.
+
   config STX104
   tristate "Apex Embedded Systems STX104 driver"
   depends on X86 && ISA_BUS_API
diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
index 7a40c04..a1e8f44 100644
--- a/drivers/iio/adc/Makefile
+++ b/drivers/iio/adc/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_QCOM_SPMI_IADC) += qcom-spmi-iadc.o
   obj-$(CONFIG_QCOM_SPMI_VADC) += qcom-spmi-vadc.o
   obj-$(CONFIG_ROCKCHIP_SARADC) += rockchip_saradc.o
   obj-$(CONFIG_STX104) += stx104.o
+obj-$(CONFIG_STM32_ADC_CORE) += stm32-adc-core.o
   obj-$(CONFIG_TI_ADC081C) += ti-adc081c.o
   obj-$(CONFIG_TI_ADC0832) += ti-adc0832.o
   obj-$(CONFIG_TI_ADC12138) += ti-adc12138.o
diff --git a/drivers/iio/adc/stm32-adc-core.c b/drivers/iio/adc/stm32-adc-core.c
new file mode 100644
index 000..4214b0c
--- /dev/null
+++ b/drivers/iio/adc/stm32-adc-core.c
@@ -0,0 +1,303 @@
+/*
+ * This file is part of STM32 ADC driver
+ *
+ * Copyright (C) 2016, STMicroelectronics - All Rights Reserved
+ * Author: Fabrice Gasnier .
+ *
+ * Inspired from: fsl-imx25-tsadc
+ *
+ * License type: GPLv2
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ * or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "stm32-adc-core.h"
+
+/* STM32F4 - common registers for all ADC instances: 1, 2 & 3 */
+#define STM32F4_ADC_CSR                 (STM32_ADCX_COMN_OFFSET + 0x00)
+#define STM32F4_ADC_CCR                 (STM32_ADCX_COMN_OFFSET + 0x04)
+
+/* STM32F4_ADC_CSR - bit fields */
+#define STM32F4_EOC3                    BIT(17)
+#define STM32F4_EOC2                    BIT(9)
+#define STM32F4_EOC1                    BIT(1)
+
+/* STM32F4_ADC_CCR - bit fields */
+#define STM32F4_ADC_ADCPRE_SHIFT        16
+#define STM32F4_ADC_ADCPRE_MASK         GENMASK(17, 16)
+
+/* STM32 F4 maximum analog clock rate (from datasheet) */
+#define STM32F4_ADC_MAX_CLK_RATE3

Re: [tpmdd-devel] [PATCH v5 3/3] tpm: add securityfs support for TPM 2.0 firmware event log

2016-11-25 Thread Jarkko Sakkinen
On Thu, Nov 24, 2016 at 09:51:03PM -0500, Stefan Berger wrote:
> On 11/24/2016 04:10 PM, Jarkko Sakkinen wrote:
> > On Wed, Nov 23, 2016 at 12:27:37PM -0500, Nayna Jain wrote:
> > > Unlike the device driver support for TPM 1.2, the TPM 2.0 does
> > > not support the securityfs pseudo files for displaying the
> > > firmware event log.
> > > 
> > > This patch enables support for providing the TPM 2.0 event log in
> > > binary form. TPM 2.0 event log supports a crypto agile format that
> > > records multiple digests, which is different from TPM 1.2. This
> > > patch enables the tpm_bios_log_setup for TPM 2.0  and adds the
> > > event log parser which understand the TPM 2.0 crypto agile format.
> > > 
> > > Signed-off-by: Nayna Jain 
> > I don't want to say much about this before I've tested it. I wonder
> > what cheap hardware I could use to test this. Any advice on this
> > from anyone is much appreciated.
> 
> Virtual hardware would be cheap :-)
> 
> I tested this series with QEMU + vTPM + SeaBIOS with TPM 1.2 + TPM 2 support
> (basing the log on ACPI). I had to fix an endianness issue on the SeaBIOS
> side, which made it work. So for this version of the patches I can give it
> my tested-by:
> 
> Tested-by: Stefan Berger 

Thanks.

/Jarkko


Re: [PATCH resend v3] ASoC: sun4i-codec: Add "Right Mixer" to "Line Out Mono Diff." route

2016-11-25 Thread Maxime Ripard
On Thu, Nov 24, 2016 at 07:46:49PM +0800, Chen-Yu Tsai wrote:
> The mono differential output for "Line Out" downmixes the stereo audio
> from the mixer, instead of just taking the left channel.
> 
> Add a route from the "Right Mixer" to "Line Out Source Playback Route"
> through the "Mono Differential" path, so DAPM doesn't shut down
> everything if the left channel is muted.
> 
> Fixes: 0f909f98d7cb ("ASoC: sun4i-codec: Add support for A31 Line Out
> playback")
> Signed-off-by: Chen-Yu Tsai 

Acked-by: Maxime Ripard 

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com




Re: [v3,2/3] powerpc: get hugetlbpage handling more generic

2016-11-25 Thread Christophe LEROY



On 24/11/2016 at 06:23, Scott Wood wrote:

On Wed, Sep 21, 2016 at 10:11:54AM +0200, Christophe Leroy wrote:

Today there are two implementations of hugetlbpages which are managed
by exclusive #ifdefs:
* FSL_BOOKE: several directory entries point to the same single hugepage
* BOOK3S: one upper level directory entry points to a table of hugepages

In preparation of implementation of hugepage support on the 8xx, we
need a mix of the two above solutions, because the 8xx needs both cases
depending on the size of pages:
* In 4k page size mode, each PGD entry covers a 4M bytes area. It means
that 2 PGD entries will be necessary to cover an 8M hugepage while a
single PGD entry will cover 8x 512k hugepages.
* In 16k page size mode, each PGD entry covers a 64M bytes area. It means
that 8x 8M hugepages will be covered by one PGD entry and 128x 512k
hugepages will be covered by one PGD entry.
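
The arithmetic behind these two cases can be checked with a small standalone sketch. The helper names are hypothetical and the code is not taken from the patch; it only divides the area covered by one PGD entry by the hugepage size, or the reverse when the hugepage is larger than that area. Note that 64M / 512k works out to 128.

```c
#include <assert.h>

#define SKETCH_SZ_512K (512UL * 1024)
#define SKETCH_SZ_1M   (1024UL * 1024)

/* How many PGD entries one hugepage spans (at least 1). */
static unsigned long pgd_entries_per_hugepage(unsigned long pgd_span,
					      unsigned long hpage)
{
	assert(pgd_span > 0 && hpage > 0);
	return (hpage + pgd_span - 1) / pgd_span;
}

/* How many hugepages fit under one PGD entry (0 if the hugepage is larger). */
static unsigned long hugepages_per_pgd_entry(unsigned long pgd_span,
					     unsigned long hpage)
{
	assert(pgd_span > 0 && hpage > 0);
	return pgd_span / hpage;
}
```

The first helper corresponds to the "several directory entries point to one hugepage" case, the second to the "one directory entry points to a table of hugepages" case, which is why an if/else on the range sizes can replace the #ifdefs.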

This patch:
* removes #ifdefs in favor of if/else based on the range sizes
* merges the two huge_pte_alloc() functions as they are pretty similar
* merges the two hugetlbpage_init() functions as they are pretty similar

Signed-off-by: Christophe Leroy 
Reviewed-by: Aneesh Kumar K.V 


With this patch on e6500, running the hugetlb testsuite results in the
system hanging in a storm of OOM killer invocations (I'll try to debug
more deeply later).  This patch also changes the default hugepage size on
FSL book3e from 4M to 16M.



Regarding the default hugepage size, it is a result of the merge of the 
two hugetlbpage_init().

Should I add an ifdef to get 4M on FSL book3e by default ?
What's the reason for selecting different hugepage sizes depending on 
the CPU ? I thought default size was selected based on what was existing.


What testsuite do you run exactly ?

Christophe


Re: [PATCH 00/29] UBIFS File Encryption v1

2016-11-25 Thread Richard Weinberger
Ted,

On 14.11.2016 04:05, Theodore Ts'o wrote:
> Richard,
> 
> Your fscrypt patches look good.  I've created an fscrypt branch on the
> ext4.git tree which contains your changes as well as Eric Biggers'
> recent fspatch cleanups changes.  If you want to base your ubifs
> changes on that branch, that would be great.  The ext4 dev branch will
> be including that fscrypt branch, so it will be feeding into
> linux-next that way.  If you also base your patches on that, it will
> avoid duplicate patches in linux-next and in Linus's tree when he
> pulls them.

Do you want us to address Eric's review comments on top of the fscrypt
branch or shall we rebase?
I'd suggest the former.

Thanks,
//richard


Re: [tip:x86/core] x86: Enable Intel Turbo Boost Max Technology 3.0

2016-11-25 Thread Ingo Molnar

* tip-bot for Tim Chen  wrote:

> Commit-ID:  5e76b2ab36b40ca33023e78725bdc69eafd63134
> Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> Author: Tim Chen 
> AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> Committer:  Thomas Gleixner 
> CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
> 
> x86: Enable Intel Turbo Boost Max Technology 3.0

This patch doesn't build:

Note that this patch has to be redone anyway, as it won't even build:

> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

arch/x86/kernel/itmt.c:26:23: fatal error: asm/mutex.h: No such file or directory

> +config SCHED_ITMT
> + bool "Intel Turbo Boost Max Technology (ITMT) scheduler support"
> + depends on SCHED_MC && CPU_SUP_INTEL && X86_INTEL_PSTATE
> + ---help---
> +   ITMT enabled scheduler support improves the CPU scheduler's decision
> +   to move tasks to cpu core that can be boosted to a higher frequency
> +   than others. It will have better performance at a cost of slightly
> +   increased overhead in task migrations. If unsure say N here.

Argh, so the 'itmt' name really sucks as well - could we please make it something
more obvious - like SCHED_INTEL_TURBO or so - and similarly rename the file as well?

The sched_intel_turbo.c file could thus host all things related to scheduler
support of turbo frequencies - it shouldn't be named after the Intel acronym of
the day...

Thanks,

Ingo


Re: Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Christoph Hellwig
On Thu, Nov 24, 2016 at 11:11:34AM -0700, Logan Gunthorpe wrote:
> * Regular DAX in the FS doesn't work at this time because the FS can
> move the file you think you're transferring to out from under you. Though I
> understand there's been some work with XFS to solve that issue.

The file system will never move anything under locked down pages,
locking down pages is used exactly to protect against that.  So as long
as we have page structures available, RDMA to/from device memory _from kernel
space_ is trivial, although for file systems to work properly you
really want a notification to the consumer if the file system wants
to remove the mapping.  We have implemented that using FL_LAYOUTS locks
for NFSD, but only XFS supports it so far.  Without that a long term
locked down region of memory (e.g. a kernel MR) would prevent various
file operations that would simply hang.


Re: [PATCH] v4l: async: make v4l2 coexists with devicetree nodes in a dt overlay

2016-11-25 Thread Sakari Ailus
Hi Javi,

On Wed, Nov 23, 2016 at 04:15:11PM +, Javi Merino wrote:
> On Wed, Nov 23, 2016 at 05:10:42PM +0200, Sakari Ailus wrote:
> > Hi Javi,
> 
> Hi Sakari,
> 
> > On Wed, Nov 23, 2016 at 10:09:57AM +, Javi Merino wrote:
> > > In asd's configured with V4L2_ASYNC_MATCH_OF, if the v4l2 subdev is in
> > > a devicetree overlay, its of_node pointer will be different each time
> > > the overlay is applied.  We are not interested in matching the
> > > pointer, what we want to match is that the path is the one we are
> > > expecting.  Change to use of_node_cmp() so that we continue matching
> > > after the overlay has been removed and reapplied.
> > > 
> > > Cc: Mauro Carvalho Chehab 
> > > Cc: Javier Martinez Canillas 
> > > Cc: Sakari Ailus 
> > > Signed-off-by: Javi Merino 
> > > ---
> > > Hi,
> > > 
> > > I feel it is a bit of a hack, but I couldn't think of anything better.
> > > I'm ccing devicetree@ and Pantelis because there may be a simpler
> > > solution.
> > 
> > First I have to admit that I'm not an expert when it comes to DT overlays.
> > 
> > That said, my understanding is that the sub-device and the async sub-device
> > are supposed to point to the exactly same DT node. I wonder if there's
> > actually anything wrong in the current code.
> > 
> > If the overlay has changed between probing the driver for the async notifier
> > and the async sub-device, there should be no match here, should there? The
> > two nodes actually point to a node in a different overlay in that case.
> 
> Overlays are parts of the devicetree that can be added and removed.
> When the overlay is applied, the camera driver is probed and does
> v4l2_async_register_subdev().  However, v4l2_async_belongs() fails.
> The problem is with comparing pointers.  I haven't looked at the
> implementation of overlays in detail, but what I see is that the
> of_node pointer changes when you remove and reapply an overlay (I
> guess it's dynamically allocated and when you remove the overlay, it's
> freed).
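
The difference between the two matching strategies can be shown with a small userspace sketch. `struct node_sketch` and both helpers are hypothetical stand-ins, not the V4L2 or OF API; the assumption is only that a node carries its full path as a string and that re-applying an overlay allocates a new node object with the same path.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node. */
struct node_sketch {
	const char *full_name;  /* full path of the node */
};

/* Identity match: fails once the node has been freed and reallocated. */
static bool match_by_pointer(const struct node_sketch *a,
			     const struct node_sketch *b)
{
	return a == b;
}

/* Path match: survives removal and re-application of an overlay. */
static bool match_by_path(const struct node_sketch *a,
			  const struct node_sketch *b)
{
	assert(a && b && a->full_name && b->full_name);
	return strcmp(a->full_name, b->full_name) == 0;
}
```

The trade-off raised in the reply below still applies: a path match says nothing about whether the re-applied node describes the same hardware configuration.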

The concern here which we were discussing was whether the overlay should be
relied on having compliant configuration compared to the part which was not
part of the overlay.

As external components are involved, quite possibly also the ISP DT node
will require changes, not just the image source (TV tuner, camera sensor
etc.). This could be because of number of CSI-2 lanes or parallel bus width,
for instance.

I'd also be interested in having an actual driver implement support for
removing and adding a DT overlay first so we'd see how this would actually
work. We need both in order to be able to actually remove and add DT
overlays _without_ unbinding the ISP driver. Otherwise it should already
work in the current codebase.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ai...@iki.fi XMPP: sai...@retiisi.org.uk


RE: [PATCH] fsl/usb: Workarourd for USB erratum-A005697

2016-11-25 Thread Jerry Huang
Thanks, Sriram,
It is better to move this delay out of spin-lock.
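
A rough sketch of that pattern, using a simulated lock rather than the real EHCI code: the need for the delay is recorded while the lock is held, and the wait itself happens only after the lock is dropped. All names here (`struct hcd_sketch`, `sketch_lock()`, `sketch_mdelay()`, `bus_suspend_sketch()`) are hypothetical, and where exactly the real driver releases its lock is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller state; 'locked' simulates the spinlock. */
struct hcd_sketch {
	bool locked;
	bool has_susp_errata;
	unsigned int delayed_ms;
};

static void sketch_lock(struct hcd_sketch *h)
{
	assert(!h->locked);
	h->locked = true;
}

static void sketch_unlock(struct hcd_sketch *h)
{
	assert(h->locked);
	h->locked = false;
}

/* Stand-in for mdelay(); it must never run with the lock held. */
static void sketch_mdelay(struct hcd_sketch *h, unsigned int ms)
{
	assert(!h->locked);
	h->delayed_ms += ms;
}

static void bus_suspend_sketch(struct hcd_sketch *h, unsigned int ports)
{
	bool need_delay = false;
	unsigned int i;

	sketch_lock(h);
	for (i = 0; i < ports; i++) {
		/* The real driver would set the port's suspend bit here. */
		if (h->has_susp_errata)
			need_delay = true;
	}
	sketch_unlock(h);

	/* One wait after all ports have been told to suspend. */
	if (need_delay)
		sketch_mdelay(h, 10);
}
```

A side effect of this shape is that the 10ms is paid once per bus suspend rather than once per port, which is an assumption of the sketch and would need checking against the erratum text.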

Best Regards
Jerry Huang


-Original Message-
From: Sriram Dash 
Sent: Thursday, November 24, 2016 7:17 PM
To: Jerry Huang ; st...@rowland.harvard.edu; 
gre...@linuxfoundation.org
Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; 
julia.law...@lip6.fr; Jerry Huang ; Ramneek Mehresh 
; Suresh Gupta 
Subject: RE: [PATCH] fsl/usb: Workarourd for USB erratum-A005697

>From: Changming Huang [mailto:jerry.hu...@nxp.com]
>
>As per USB specification, in the Suspend state, the status bit does not
>change until the port is suspended. However, there may be a delay in
>suspending a port if there is a transaction currently in progress on the bus.
>
>In the USBDR controller, the PORTSCx[SUSP] bit changes immediately when
>the application sets it and not when the port is actually suspended.
>
>Workaround for this issue involves waiting for a minimum of 10ms to
>allow the controller to go into SUSPEND state before proceeding ahead.
>
>Signed-off-by: Changming Huang 
>Signed-off-by: Ramneek Mehresh 
>---
> drivers/usb/host/ehci-fsl.c  |3 +++
> drivers/usb/host/ehci-hub.c  |2 ++
> drivers/usb/host/ehci.h  |6 ++
> drivers/usb/host/fsl-mph-dr-of.c |2 ++
> include/linux/fsl_devices.h  |1 +
> 5 files changed, 14 insertions(+)
>
>diff --git a/drivers/usb/host/ehci-fsl.c b/drivers/usb/host/ehci-fsl.c
>index 9f5ffb6..91701cc 100644
>--- a/drivers/usb/host/ehci-fsl.c
>+++ b/drivers/usb/host/ehci-fsl.c
>@@ -286,6 +286,9 @@ static int ehci_fsl_usb_setup(struct ehci_hcd *ehci)
>   if (pdata->has_fsl_erratum_a005275 == 1)
>   ehci->has_fsl_hs_errata = 1;
>
>+  if (pdata->has_fsl_erratum_a005697 == 1)
>+  ehci->has_fsl_susp_errata = 1;
>+
>   if ((pdata->operating_mode == FSL_USB2_DR_HOST) ||
>   (pdata->operating_mode == FSL_USB2_DR_OTG))
>   if (ehci_fsl_setup_phy(hcd, pdata->phy_mode, 0))
>diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
>index 74f62d6..86d154e 100644
>--- a/drivers/usb/host/ehci-hub.c
>+++ b/drivers/usb/host/ehci-hub.c
>@@ -305,6 +305,8 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)
>   USB_PORT_STAT_HIGH_SPEED)
>   fs_idle_delay = true;
>   ehci_writel(ehci, t2, reg);
>+  if (ehci_has_fsl_susp_errata(ehci))
>+  mdelay(10);

Hi Jerry,

Move the delay out of the spin lock. Other than that, it looks fine to me.

>   changed = 1;
>   }
>   }
>diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h
>index 3f3b74a..7706e4a 100644
>--- a/drivers/usb/host/ehci.h
>+++ b/drivers/usb/host/ehci.h
>@@ -219,6 +219,7 @@ struct ehci_hcd {  /* one per controller */
>   unsignedno_selective_suspend:1;
>   unsignedhas_fsl_port_bug:1; /* FreeScale */
>   unsignedhas_fsl_hs_errata:1;/* Freescale HS quirk */
>+  unsignedhas_fsl_susp_errata:1;  /*Freescale SUSP quirk*/
>   unsignedbig_endian_mmio:1;
>   unsignedbig_endian_desc:1;
>   unsignedbig_endian_capbase:1;
>@@ -703,10 +704,15 @@ struct ehci_tt {
> #if defined(CONFIG_PPC_85xx)
> /* Some Freescale processors have an erratum (USB A-005275) in which
>  * incoming packets get corrupted in HS mode
>+ * Some Freescale processors have an erratum (USB A-005697) in which
>+ * we need to wait for 10ms for bus to go into suspend mode after
>+ * setting SUSP bit
>  */
> #define ehci_has_fsl_hs_errata(e) ((e)->has_fsl_hs_errata)
>+#define ehci_has_fsl_susp_errata(e)   ((e)->has_fsl_susp_errata)
> #else
> #define ehci_has_fsl_hs_errata(e) (0)
>+#define ehci_has_fsl_susp_errata(e)   (0)
> #endif
>
> /*
>diff --git a/drivers/usb/host/fsl-mph-dr-of.c 
>b/drivers/usb/host/fsl-mph-dr-of.c
>index f07ccb2..e90ddb5 100644
>--- a/drivers/usb/host/fsl-mph-dr-of.c
>+++ b/drivers/usb/host/fsl-mph-dr-of.c
>@@ -226,6 +226,8 @@ static int fsl_usb2_mph_dr_of_probe(struct 
>platform_device *ofdev)
>   of_property_read_bool(np, "fsl,usb-erratum-a007792");
>   pdata->has_fsl_erratum_a005275 =
>   of_property_read_bool(np, "fsl,usb-erratum-a005275");
>+  pdata->has_fsl_erratum_a005697 =
>+  of_property_read_bool(np, "fsl,usb_erratum-a005697");
>
>   /*
>   * Determine whether phy_clk_valid needs to be checked
>diff --git a/include/linux/fsl_devices.h b/include/linux/fsl_devices.h
>index f291291..60cef82 100644
>--- a/include/linux/fsl_devices.h
>+++ b/include/linux/fsl_devices.h
>@@ -100,6 +100,7 @@ struct fsl_usb2_platform_data {
>   unsignedalready_suspended:1;
>   unsignedhas_fsl_erratum_a007792:1;
>   unsignedhas_fsl_erratum_a005275:1;
>+  unsign

[PATCH 3.12 002/127] KVM: MIPS: Drop other CPU ASIDs on guest MMU changes

2016-11-25 Thread Jiri Slaby
From: James Hogan 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 91e4f1b6073dd680d86cdb7e42d7a9db39d8 upstream.

When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate
TLB entries on the local CPU. This doesn't work correctly on an SMP host
when the guest is migrated to a different physical CPU, as it could pick
up stale TLB mappings from the last time the vCPU ran on that physical
CPU.

Therefore invalidate both user and kernel host ASIDs on other CPUs,
which will cause new ASIDs to be generated when it next runs on those
CPUs.

We're careful only to do this if the TLB entry was already valid, and
only for the kernel ASID where the virtual address it mapped is outside
of the guest user address range.
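
The per-CPU bookkeeping described above can be sketched in userspace as follows. This mirrors the loop in the patch below but is not kernel code: `struct vcpu_sketch`, `SKETCH_NR_CPUS` and `drop_other_cpu_asids()` are hypothetical, and an ASID of 0 is assumed to mean "generate a fresh ASID the next time the vCPU runs on that CPU", as the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4  /* arbitrary size for the illustration */

/* Hypothetical stand-in for the per-vCPU ASID tables. */
struct vcpu_sketch {
	unsigned int guest_user_asid[SKETCH_NR_CPUS];
	unsigned int guest_kernel_asid[SKETCH_NR_CPUS];
};

/*
 * Drop the ASIDs cached for every CPU except the current one. The user
 * ASID is only dropped when the changed mapping lies in the guest user
 * address range; the kernel ASID is always dropped.
 */
static void drop_other_cpu_asids(struct vcpu_sketch *v, int this_cpu,
				 bool user)
{
	int i;

	assert(v != NULL);
	assert(this_cpu >= 0 && this_cpu < SKETCH_NR_CPUS);

	for (i = 0; i < SKETCH_NR_CPUS; i++) {
		if (i == this_cpu)
			continue;
		if (user)
			v->guest_user_asid[i] = 0;
		v->guest_kernel_asid[i] = 0;
	}
}
```

The current CPU is skipped because its stale entry is removed directly by probing the host TLB, so its ASID can stay valid.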

Signed-off-by: James Hogan 
Cc: Paolo Bonzini 
Cc: "Radim Krčmář" 
Cc: Ralf Baechle 
Cc: linux-m...@linux-mips.org
Cc: k...@vger.kernel.org
[james.ho...@imgtec.com: Backport to 3.10..3.16]
Signed-off-by: James Hogan 
Signed-off-by: Jiri Slaby 
---
 arch/mips/kvm/kvm_mips_emul.c | 61 +--
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index 9f7643874fba..7f3fb19d2156 100644
--- a/arch/mips/kvm/kvm_mips_emul.c
+++ b/arch/mips/kvm/kvm_mips_emul.c
@@ -310,6 +310,47 @@ enum emulation_result kvm_mips_emul_tlbr(struct kvm_vcpu *vcpu)
return er;
 }
 
+/**
+ * kvm_mips_invalidate_guest_tlb() - Indicates a change in guest MMU map.
+ * @vcpu:  VCPU with changed mappings.
+ * @tlb:   TLB entry being removed.
+ *
+ * This is called to indicate a single change in guest MMU mappings, so that we
+ * can arrange TLB flushes on this and other CPUs.
+ */
+static void kvm_mips_invalidate_guest_tlb(struct kvm_vcpu *vcpu,
+ struct kvm_mips_tlb *tlb)
+{
+   int cpu, i;
+   bool user;
+
+   /* No need to flush for entries which are already invalid */
+   if (!((tlb->tlb_lo0 | tlb->tlb_lo1) & MIPS3_PG_V))
+   return;
+   /* User address space doesn't need flushing for KSeg2/3 changes */
+   user = tlb->tlb_hi < KVM_GUEST_KSEG0;
+
+   preempt_disable();
+
+   /*
+* Probe the shadow host TLB for the entry being overwritten, if one
+* matches, invalidate it
+*/
+   kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
+
+   /* Invalidate the whole ASID on other CPUs */
+   cpu = smp_processor_id();
+   for_each_possible_cpu(i) {
+   if (i == cpu)
+   continue;
+   if (user)
+   vcpu->arch.guest_user_asid[i] = 0;
+   vcpu->arch.guest_kernel_asid[i] = 0;
+   }
+
+   preempt_enable();
+}
+
 /* Write Guest TLB Entry @ Index */
 enum emulation_result kvm_mips_emul_tlbwi(struct kvm_vcpu *vcpu)
 {
@@ -331,10 +372,8 @@ enum emulation_result kvm_mips_emul_tlbwi(struct kvm_vcpu *vcpu)
}
 
tlb = &vcpu->arch.guest_tlb[index];
-#if 1
-   /* Probe the shadow host TLB for the entry being overwritten, if one matches, invalidate it */
-   kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
-#endif
+
+   kvm_mips_invalidate_guest_tlb(vcpu, tlb);
 
tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0);
tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0);
@@ -373,10 +412,7 @@ enum emulation_result kvm_mips_emul_tlbwr(struct kvm_vcpu *vcpu)
 
tlb = &vcpu->arch.guest_tlb[index];
 
-#if 1
-   /* Probe the shadow host TLB for the entry being overwritten, if one matches, invalidate it */
-   kvm_mips_host_tlb_inv(vcpu, tlb->tlb_hi);
-#endif
+   kvm_mips_invalidate_guest_tlb(vcpu, tlb);
 
tlb->tlb_mask = kvm_read_c0_guest_pagemask(cop0);
tlb->tlb_hi = kvm_read_c0_guest_entryhi(cop0);
@@ -419,6 +455,7 @@ kvm_mips_emulate_CP0(uint32_t inst, uint32_t *opc, uint32_t cause,
int32_t rt, rd, copz, sel, co_bit, op;
uint32_t pc = vcpu->arch.pc;
unsigned long curr_pc;
+   int cpu, i;
 
/*
 * Update PC and hold onto current PC in case there is
@@ -538,8 +575,16 @@ kvm_mips_emulate_CP0(uint32_t inst, uint32_t *opc, uint32_t cause,
 ASID_MASK,
 vcpu->arch.gprs[rt] & ASID_MASK);
 
+   preempt_disable();
/* Blow away the shadow host TLBs */
kvm_mips_flush_host_tlb(1);
+   cpu = smp_processor_id();
+   for_each_possible_cpu(i)
+   if (i != cpu) {
+   vcpu->arch.guest_user_asid[i] = 0;
+   vcpu->arch.guest_kernel_asid[i] = 0;
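
The per-CPU ASID invalidation described in the commit message can be
modelled in user space. The following is only a rough sketch: the
struct, the CPU count and the helper names are made up for
illustration, and a zero ASID is assumed to mean "regenerate on next
run", as the commit message implies.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_FAKE_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for the per-vCPU ASID arrays */
struct fake_vcpu {
	unsigned int user_asid[NR_FAKE_CPUS];
	unsigned int kernel_asid[NR_FAKE_CPUS];
};

/* Zero the ASIDs on every CPU except the current one.  The user ASID
 * is only dropped when the changed mapping was a guest user address. */
void invalidate_other_cpus(struct fake_vcpu *v, int this_cpu, bool user)
{
	int i;

	assert(this_cpu >= 0 && this_cpu < NR_FAKE_CPUS);
	for (i = 0; i < NR_FAKE_CPUS; i++) {
		if (i == this_cpu)
			continue;
		if (user)
			v->user_asid[i] = 0;
		v->kernel_asid[i] = 0;
	}
}

/* Count the ASIDs that are still valid after one invalidation */
int valid_asids_left(int this_cpu, bool user)
{
	struct fake_vcpu v;
	int i, left = 0;

	for (i = 0; i < NR_FAKE_CPUS; i++) {
		v.user_asid[i] = 1;
		v.kernel_asid[i] = 1;
	}
	invalidate_other_cpus(&v, this_cpu, user);
	for (i = 0; i < NR_FAKE_CPUS; i++)
		left += (v.user_asid[i] != 0) + (v.kernel_asid[i] != 0);
	return left;
}
```

With four CPUs, a user-range change leaves only the two ASIDs of the
local CPU valid, while a kernel-range change keeps all four user ASIDs
plus the local kernel ASID.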
+

Re: [PATCH] auxdisplay: ht16k33: select CONFIG_FB_SYS_FOPS

2016-11-25 Thread Arnd Bergmann
On Friday, November 25, 2016 8:50:04 AM CET Robin van der Gracht wrote:
> 
> Thanks for reporting this. You are right about the missing helper.
> However, the fb_ops struct uses several helpers which are all missing.
> 
> static struct fb_ops ht16k33_fb_ops = {
> .owner = THIS_MODULE,
> .fb_read = fb_sys_read,
> .fb_write = fb_sys_write,
> .fb_fillrect = sys_fillrect,
> .fb_copyarea = sys_copyarea,
> .fb_imageblit = sys_imageblit,
> .fb_mmap = ht16k33_mmap,
> };
> 
> HT16K33 should also select:
> FB_CFB_FILLRECT
> FB_CFB_COPYAREA
> FB_CFB_IMAGEBLIT
> 

Ah right. I had not run into those so far during randconfig
testing, probably because there is usually at least one other
framebuffer enabled that selects them.

Can you submit a patch to add all those?

Arnd



[PATCH 3.12 026/127] USB: serial: cp210x: fix tiocmget error handling

2016-11-25 Thread Jiri Slaby
From: Johan Hovold 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit de24e0a108bc48062e1c7acaa97014bce32a919f upstream.

The current tiocmget implementation would fail to report errors up the
stack and instead leaked a few bits from the stack as a mask of
modem-status flags.

Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control")
Signed-off-by: Johan Hovold 
Signed-off-by: Jiri Slaby 
---
 drivers/usb/serial/cp210x.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index f5e4fda7f902..188e50446514 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -919,7 +919,9 @@ static int cp210x_tiocmget(struct tty_struct *tty)
unsigned int control;
int result;
 
-   cp210x_get_config(port, CP210X_GET_MDMSTS, &control, 1);
+   result = cp210x_get_config(port, CP210X_GET_MDMSTS, &control, 1);
+   if (result)
+   return result;
 
result = ((control & CONTROL_DTR) ? TIOCM_DTR : 0)
|((control & CONTROL_RTS) ? TIOCM_RTS : 0)
-- 
2.10.2
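
For readers skimming the archive, the bug class fixed here can be
reproduced outside the kernel. This is a minimal sketch, not the driver
code: fake_get_config() and the bit values are hypothetical stand-ins.
The point is that the status word is only looked at after the read is
known to have succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FAKE_CONTROL_DTR 0x01	/* hypothetical device status bits */
#define FAKE_CONTROL_RTS 0x02
#define FAKE_TIOCM_DTR   0x002	/* hypothetical modem-status flags */
#define FAKE_TIOCM_RTS   0x004

/* Stand-in for a register read: on failure *out is left untouched */
static int fake_get_config(int fail, unsigned int *out)
{
	assert(out != NULL);
	if (fail)
		return -EIO;
	*out = FAKE_CONTROL_DTR;
	return 0;
}

/* Returns a flag mask on success or a negative errno on failure */
int fake_tiocmget(int fail)
{
	unsigned int control;
	int result;

	result = fake_get_config(fail, &control);
	if (result)
		return result;	/* never decode an uninitialised word */

	return ((control & FAKE_CONTROL_DTR) ? FAKE_TIOCM_DTR : 0) |
	       ((control & FAKE_CONTROL_RTS) ? FAKE_TIOCM_RTS : 0);
}
```

Without the early return, the failing path would build the mask from
whatever happened to be in "control" on the stack.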



[PATCH 3.12 003/127] MIPS: KVM: Fix unused variable build warning

2016-11-25 Thread Jiri Slaby
From: Nicholas Mc Guire 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 5f508c43a7648baa892528922402f1e13f258bd4 upstream.

As James Hogan explained, kvm_mips_complete_mmio_load() has not yet
modified PC at this point, so the curr_pc variable and the comment
along with it can be dropped.

Signed-off-by: Nicholas Mc Guire 
Link: http://lkml.org/lkml/2015/5/8/422
Cc: Gleb Natapov 
Cc: Paolo Bonzini 
Cc: James Hogan 
Cc: k...@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9993/
Signed-off-by: Ralf Baechle 
[james.ho...@imgtec.com: Backport to 3.10..3.16]
Signed-off-by: James Hogan 
Signed-off-by: Jiri Slaby 
---
 arch/mips/kvm/kvm_mips_emul.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index 7f3fb19d2156..779a376c4cce 100644
--- a/arch/mips/kvm/kvm_mips_emul.c
+++ b/arch/mips/kvm/kvm_mips_emul.c
@@ -1655,7 +1655,6 @@ kvm_mips_complete_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
unsigned long *gpr = &vcpu->arch.gprs[vcpu->arch.io_gpr];
enum emulation_result er = EMULATE_DONE;
-   unsigned long curr_pc;
 
if (run->mmio.len > sizeof(*gpr)) {
printk("Bad MMIO length: %d", run->mmio.len);
@@ -1663,11 +1662,6 @@ kvm_mips_complete_mmio_load(struct kvm_vcpu *vcpu, struct kvm_run *run)
goto done;
}
 
-   /*
-* Update PC and hold onto current PC in case there is
-* an error and we want to rollback the PC
-*/
-   curr_pc = vcpu->arch.pc;
er = update_pc(vcpu, vcpu->arch.pending_load_cause);
if (er == EMULATE_FAIL)
return er;
-- 
2.10.2



[PATCH 3.12 047/127] tty: vt, fix bogus division in csi_J

2016-11-25 Thread Jiri Slaby
3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 42acfc6615f47e465731c263bee0c799edb098f2 upstream.

In csi_J(3), the third parameter of scr_memsetw (vc_screenbuf_size) is
divided by 2 inappropriately. But scr_memsetw expects size, not
count, because it divides the size by 2 on its own before doing actual
memset-by-words.

So remove the bogus division.

Signed-off-by: Jiri Slaby 
Cc: Petr Písař 
Fixes: f8df13e0a9 (tty: Clean console safely)
Signed-off-by: Jiri Slaby 
---
 drivers/tty/vt/vt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 60ce423c6364..75c059c56a23 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1169,7 +1169,7 @@ static void csi_J(struct vc_data *vc, int vpar)
break;
case 3: /* erase scroll-back buffer (and whole display) */
scr_memsetw(vc->vc_screenbuf, vc->vc_video_erase_char,
-   vc->vc_screenbuf_size >> 1);
+   vc->vc_screenbuf_size);
set_origin(vc);
if (CON_IS_VISIBLE(vc))
update_screen(vc);
-- 
2.10.2
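
The size-versus-count mix-up is easy to demonstrate in isolation. A
small sketch follows, assuming only what the commit message states
(the fill helper takes a size in bytes and halves it itself);
memsetw_bytes() and cells_cleared() are illustrative names, not kernel
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill 16-bit cells; "size" is in BYTES, mirroring what the commit
 * message says about scr_memsetw(). */
void memsetw_bytes(uint16_t *buf, uint16_t val, size_t size)
{
	size_t count = size / 2;	/* the only division by two */

	while (count--)
		*buf++ = val;
}

/* Clear a buffer of n_cells cells using size_arg as the byte size and
 * report how many cells actually received the erase character. */
size_t cells_cleared(size_t n_cells, size_t size_arg)
{
	uint16_t buf[64] = { 0 };
	size_t i, hits = 0;

	assert(n_cells <= 64 && size_arg <= sizeof(buf));
	memsetw_bytes(buf, 0x0720, size_arg);
	for (i = 0; i < n_cells; i++)
		hits += (buf[i] == 0x0720);
	return hits;
}
```

For a 16-cell (32-byte) buffer, passing the full size clears all 16
cells, whereas passing "size >> 1" clears only the first 8, which is
the half-erased buffer the patch fixes.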



[PATCH 3.12 048/127] HID: usbhid: add ATEN CS962 to list of quirky devices

2016-11-25 Thread Jiri Slaby
From: Oliver Neukum 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit cf0ea4da4c7df11f7a508b2f37518e0f117f3791 upstream.

Like many similar devices it needs a quirk to work.
Issuing the request gets the device into an irrecoverable state.

Signed-off-by: Oliver Neukum 
Signed-off-by: Jiri Kosina 
Signed-off-by: Jiri Slaby 
---
 drivers/hid/hid-ids.h   | 1 +
 drivers/hid/usbhid/hid-quirks.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index b241b6d5fc9a..16583e6621d4 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -166,6 +166,7 @@
 #define USB_DEVICE_ID_ATEN_4PORTKVM    0x2205
 #define USB_DEVICE_ID_ATEN_4PORTKVMC   0x2208
 #define USB_DEVICE_ID_ATEN_CS682   0x2213
+#define USB_DEVICE_ID_ATEN_CS692   0x8021
 
 #define USB_VENDOR_ID_ATMEL            0x03eb
 #define USB_DEVICE_ID_ATMEL_MULTITOUCH 0x211c
diff --git a/drivers/hid/usbhid/hid-quirks.c b/drivers/hid/usbhid/hid-quirks.c
index 98c2cf97b17f..3fd5fa9385ae 100644
--- a/drivers/hid/usbhid/hid-quirks.c
+++ b/drivers/hid/usbhid/hid-quirks.c
@@ -61,6 +61,7 @@ static const struct hid_blacklist {
{ USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_4PORTKVM, HID_QUIRK_NOGET },
{ USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_4PORTKVMC, HID_QUIRK_NOGET },
{ USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_CS682, HID_QUIRK_NOGET },
+   { USB_VENDOR_ID_ATEN, USB_DEVICE_ID_ATEN_CS692, HID_QUIRK_NOGET },
{ USB_VENDOR_ID_CH, USB_DEVICE_ID_CH_FIGHTERSTICK, HID_QUIRK_NOGET },
{ USB_VENDOR_ID_CH, USB_DEVICE_ID_CH_COMBATSTICK, HID_QUIRK_NOGET },
{ USB_VENDOR_ID_CH, USB_DEVICE_ID_CH_FLIGHT_SIM_ECLIPSE_YOKE, HID_QUIRK_NOGET },
-- 
2.10.2



[PATCH 3.12 080/127] IB/mlx5: Fix fatal error dispatching

2016-11-25 Thread Jiri Slaby
From: Eli Cohen 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit dbaaff2a2caa03d472b5cc53a3fbfd415c97dc26 upstream.

When an internal error condition is detected, make sure to set the
device inactive after dispatching the event so ULPs can get a
notification of this event.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen 
Signed-off-by: Maor Gottlieb 
Reviewed-by: Mohamad Haj Yahia 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Jiri Slaby 
---
 drivers/infiniband/hw/mlx5/main.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index b1a6cb3a2809..1300a377aca8 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -959,12 +959,13 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
 {
struct mlx5_ib_dev *ibdev = container_of(dev, struct mlx5_ib_dev, mdev);
struct ib_event ibev;
+   bool fatal = false;
u8 port = 0;
 
switch (event) {
case MLX5_DEV_EVENT_SYS_ERROR:
-   ibdev->ib_active = false;
ibev.event = IB_EVENT_DEVICE_FATAL;
+   fatal = true;
break;
 
case MLX5_DEV_EVENT_PORT_UP:
@@ -1012,6 +1013,9 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, enum mlx5_dev_event event,
 
if (ibdev->ib_active)
ib_dispatch_event(&ibev);
+
+   if (fatal)
+   ibdev->ib_active = false;
 }
 
 static void get_ext_port_caps(struct mlx5_ib_dev *dev)
-- 
2.10.2
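
The ordering problem can be shown with a toy model. This is only a
sketch under the assumption, taken from the diff, that dispatch is
gated on the "active" flag; the struct and function names below are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: events are only delivered while active */
struct fake_ibdev {
	bool active;
	int delivered;
};

static void fake_dispatch(struct fake_ibdev *d)
{
	if (d->active)
		d->delivered++;
}

/* Old order: the flag is cleared before the gated dispatch runs */
static void fatal_old(struct fake_ibdev *d)
{
	d->active = false;
	fake_dispatch(d);
}

/* New order: dispatch first, then mark the device inactive */
static void fatal_new(struct fake_ibdev *d)
{
	bool fatal = true;

	fake_dispatch(d);
	if (fatal)
		d->active = false;
}

/* Number of fatal events consumers get to see for each ordering */
int fatal_events_seen(bool new_order)
{
	struct fake_ibdev d = { true, 0 };

	if (new_order)
		fatal_new(&d);
	else
		fatal_old(&d);
	assert(!d.active);	/* both orders end up inactive */
	return d.delivered;
}
```

Both variants leave the device inactive; only the new one lets the
fatal event reach the consumers first.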



[PATCH 3.12 075/127] mfd: core: Fix device reference leak in mfd_clone_cell

2016-11-25 Thread Jiri Slaby
From: Johan Hovold 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 722f191080de641f023feaa7d5648caf377844f5 upstream.

Make sure to drop the reference taken by bus_find_device_by_name()
before returning from mfd_clone_cell().

Fixes: a9bbba996302 ("mfd: add platform_device sharing support for mfd")
Signed-off-by: Johan Hovold 
Signed-off-by: Lee Jones 
Signed-off-by: Jiri Slaby 
---
 drivers/mfd/mfd-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
index f421586f29fb..a1f0f73245c5 100644
--- a/drivers/mfd/mfd-core.c
+++ b/drivers/mfd/mfd-core.c
@@ -265,6 +265,8 @@ int mfd_clone_cell(const char *cell, const char **clones, size_t n_clones)
clones[i]);
}
 
+   put_device(dev);
+
return 0;
 }
 EXPORT_SYMBOL(mfd_clone_cell);
-- 
2.10.2



[PATCH 3.12 078/127] IB/mlx4: Fix create CQ error flow

2016-11-25 Thread Jiri Slaby
From: Matan Barak 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 593ff73bcfdc79f79a8a0df55504f75ad3e5d1a9 upstream.

Currently, if ib_copy_to_udata fails, the CQ
won't be deleted from the radix tree and the HW (HW2SW).

Fixes: 225c7b1feef1 ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
Signed-off-by: Matan Barak 
Signed-off-by: Daniel Jurgens 
Reviewed-by: Mark Bloch 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Jiri Slaby 
---
 drivers/infiniband/hw/mlx4/cq.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index d5e60f44ba5a..5b8a62c6bc8d 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -239,11 +239,14 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev, int entries, int vector
if (context)
if (ib_copy_to_udata(udata, &cq->mcq.cqn, sizeof (__u32))) {
err = -EFAULT;
-   goto err_dbmap;
+   goto err_cq_free;
}
 
return &cq->ibcq;
 
+err_cq_free:
+   mlx4_cq_free(dev->dev, &cq->mcq);
+
 err_dbmap:
if (context)
mlx4_ib_db_unmap_user(to_mucontext(context), &cq->db);
-- 
2.10.2
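
The fix follows the usual goto-unwind ladder: each failure jumps to the
label that undoes everything acquired so far. A minimal sketch of the
idea, with a made-up two-step resource model (struct fake_res and the
"fixed" switch are illustrative only, not mlx4 code):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bookkeeping for two resources acquired in order */
struct fake_res {
	int db_mapped;
	int cq_allocated;
};

/* fail_step: 0 = success, 1 = CQ allocation fails, 2 = copy to user
 * space fails.  "fixed" selects the corrected unwind target. */
int fake_create_cq(struct fake_res *r, int fail_step, int fixed)
{
	int err;

	r->db_mapped = 1;
	if (fail_step == 1) {
		err = -ENOMEM;
		goto err_dbmap;
	}
	r->cq_allocated = 1;
	if (fail_step == 2) {
		err = -EFAULT;
		if (fixed)
			goto err_cq_free;
		goto err_dbmap;		/* old code: skips the CQ release */
	}
	return 0;

err_cq_free:
	r->cq_allocated = 0;
err_dbmap:
	r->db_mapped = 0;
	return err;
}

/* Resources still held after the copy-to-user step has failed */
int leaked_after_copy_failure(int fixed)
{
	struct fake_res r = { 0, 0 };
	int err = fake_create_cq(&r, 2, fixed);

	assert(err == -EFAULT);
	return r.db_mapped + r.cq_allocated;
}
```

Jumping to the later label leaves the CQ allocated; jumping to the new
label falls through both cleanup steps.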



[PATCH 3.12 027/127] KVM: x86: fix wbinvd_dirty_mask use-after-free

2016-11-25 Thread Jiri Slaby
From: Ido Yariv 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit bd768e146624cbec7122ed15dead8daa137d909d upstream.

vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
corrupting memory. For example, the following call trace may set a bit
in an already freed cpu mask:
kvm_arch_vcpu_load
vcpu_load
vmx_free_vcpu_nested
vmx_free_vcpu
kvm_arch_vcpu_free

Fix this by deferring freeing of wbinvd_dirty_mask.

Signed-off-by: Ido Yariv 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Radim Krčmář 
Signed-off-by: Jiri Slaby 
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 06b37a671b12..c9e086117ae2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6662,11 +6662,13 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 
 void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu)
 {
+   void *wbinvd_dirty_mask = vcpu->arch.wbinvd_dirty_mask;
+
kvmclock_reset(vcpu);
 
-   free_cpumask_var(vcpu->arch.wbinvd_dirty_mask);
fx_free(vcpu);
kvm_x86_ops->vcpu_free(vcpu);
+   free_cpumask_var(wbinvd_dirty_mask);
 }
 
 struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
-- 
2.10.2
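
The deferral can be illustrated without triggering a real
use-after-free by tracking the release point with a flag. This is a
sketch only; struct fake_vcpu and the teardown hook are invented for
the example and merely assume, as the call trace above suggests, that
the vendor teardown path may still write to the mask.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vCPU model; mask_released marks where the real code
 * would already have called free_cpumask_var(). */
struct fake_vcpu {
	unsigned long *dirty_mask;
	int mask_released;
	int late_writes;
};

/* Stand-in for the vendor teardown path, which may set a mask bit */
static void fake_vendor_free(struct fake_vcpu *v)
{
	if (v->mask_released)
		v->late_writes++;	/* would be a use-after-free */
	else
		*v->dirty_mask |= 1UL;
}

/* Count writes that land after the release point.  defer == 0 models
 * the old order, defer != 0 the patched order. */
int late_writes_during_free(int defer)
{
	struct fake_vcpu v = { 0 };
	unsigned long *mask = calloc(1, sizeof(*mask));

	assert(mask != NULL);
	v.dirty_mask = mask;
	if (!defer)
		v.mask_released = 1;	/* old: release before teardown */
	fake_vendor_free(&v);
	v.mask_released = 1;		/* new: release after teardown */
	free(mask);			/* the model frees last either way */
	return v.late_writes;
}
```

Saving the pointer in a local first, as the patch does, matters because
the vcpu structure itself may be gone once the vendor hook returns.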



[PATCH 3.12 079/127] IB/mlx5: Use cache line size to select CQE stride

2016-11-25 Thread Jiri Slaby
From: Daniel Jurgens 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 16b0e0695a73b68d8ca40288c8f9614ef208917b upstream.

When creating kernel CQs use 128B CQE stride if the
cache line size is 128B, 64B otherwise.  This prevents
multiple CQEs from residing in a 128B cache line,
which can cause retries when there are concurrent
read and writes in one cache line.

Tested with IPoIB on PPC64, saw ~5% throughput
improvement.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Daniel Jurgens 
Signed-off-by: Maor Gottlieb 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Jiri Slaby 
---
 drivers/infiniband/hw/mlx5/cq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 706833ab7e7e..e5a6d839f1d1 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -684,8 +684,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, int entries,
if (err)
goto err_create;
} else {
-   /* for now choose 64 bytes till we have a proper interface */
-   cqe_size = 64;
+   cqe_size = cache_line_size() == 128 ? 128 : 64;
err = create_cq_kernel(dev, cq, entries, cqe_size, &cqb,
   &index, &inlen);
if (err)
-- 
2.10.2
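
The selection rule and its effect on sharing can be written down as
plain arithmetic. A small sketch, where the helper names are
illustrative and the cache line size is passed in rather than queried:

```c
#include <assert.h>

/* Same rule as the patch: 128B stride on 128B cache lines, else 64B */
int pick_cqe_size(int cache_line)
{
	assert(cache_line > 0);
	return cache_line == 128 ? 128 : 64;
}

/* How many CQEs end up sharing one cache line for a given stride */
int cqes_per_cache_line(int cache_line, int cqe_size)
{
	assert(cache_line > 0 && cqe_size > 0);
	if (cqe_size >= cache_line)
		return 1;
	return cache_line / cqe_size;
}
```

With the old fixed 64B stride, a 128B cache line holds two CQEs; with
the stride chosen by pick_cqe_size() it holds exactly one.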



[PATCH 3.12 025/127] tty: limit terminal size to 4M chars

2016-11-25 Thread Jiri Slaby
From: Dmitry Vyukov 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 32b2921e6a7461fe63b71217067a6cf4bddb132f upstream.

Size of kmalloc() in vc_do_resize() is controlled by user.
Too large kmalloc() size triggers WARNING message on console.
Put a reasonable upper bound on terminal size to prevent WARNINGs.

Signed-off-by: Dmitry Vyukov 
CC: David Rientjes 
Cc: One Thousand Gnomes 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: Peter Hurley 
Cc: linux-kernel@vger.kernel.org
Cc: syzkal...@googlegroups.com
Signed-off-by: Jiri Slaby 
---
 drivers/tty/vt/vt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index d52e653076f4..60ce423c6364 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -863,6 +863,8 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
return 0;
 
+   if (new_screen_size > (4 << 20))
+   return -EINVAL;
newscreen = kmalloc(new_screen_size, GFP_USER);
if (!newscreen)
return -ENOMEM;
-- 
2.10.2
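
To see where the bound bites, the check can be replayed in user space.
This is a rough sketch which assumes two bytes per character cell (the
way vc_do_resize is believed to compute the row size); the function
name and the 64-bit arithmetic are choices made for this example only.

```c
#include <assert.h>
#include <errno.h>

#define FAKE_MAX_SCREEN_BYTES (4ULL << 20)	/* same bound as the patch */

/* Returns 0 if the requested geometry is accepted, -EINVAL otherwise.
 * Assumes two bytes per cell; 64-bit math avoids overflow here. */
int fake_check_resize(unsigned int cols, unsigned int rows)
{
	unsigned long long row_size = (unsigned long long)cols << 1;
	unsigned long long screen_size = row_size * rows;

	if (screen_size > FAKE_MAX_SCREEN_BYTES)
		return -EINVAL;
	return 0;
}
```

Under that assumption a 2048x1024 terminal is exactly at the limit and
still accepted, while one more row is rejected.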



[PATCH 3.12 038/127] drm/radeon/si_dpm: Limit clocks on HD86xx part

2016-11-25 Thread Jiri Slaby
From: Tom St Denis 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit fb9a5b0c1c9893db2e0d18544fd49e19d784a87d upstream.

Limit clocks on a specific HD86xx part to avoid
crashes (while awaiting an appropriate PP fix).

Signed-off-by: Tom St Denis 
Reviewed-by: Alex Deucher 
Signed-off-by: Jiri Slaby 
---
 drivers/gpu/drm/radeon/si_dpm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index c1281fc39040..c75e14fdcf54 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -3018,6 +3018,12 @@ static void si_apply_state_adjust_rules(struct radeon_device *rdev,
max_sclk = 75000;
max_mclk = 8;
}
+   /* limit clocks on HD8600 series */
+   if (rdev->pdev->device == 0x6660 &&
+   rdev->pdev->revision == 0x83) {
+   max_sclk = 75000;
+   max_mclk = 8;
+   }
 
/* XXX validate the min clocks required for display */
 
-- 
2.10.2



[PATCH 3.12 076/127] uwb: fix device reference leaks

2016-11-25 Thread Jiri Slaby
From: Johan Hovold 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit d6124b409ca33c100170ffde51cd8dff761454a1 upstream.

This subsystem consistently fails to drop the device reference taken by
class_find_device().

Note that some of these lookup functions already take a reference to the
returned data, while others claim no reference is needed (or do not
seem to need one).

Fixes: 183b9b592a62 ("uwb: add the UWB stack (core files)")
Signed-off-by: Johan Hovold 
Signed-off-by: Jiri Slaby 
---
 drivers/uwb/lc-rc.c | 16 +---
 drivers/uwb/pal.c   |  2 ++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/uwb/lc-rc.c b/drivers/uwb/lc-rc.c
index 3eca6ceb9844..4be2a5d1a9d2 100644
--- a/drivers/uwb/lc-rc.c
+++ b/drivers/uwb/lc-rc.c
@@ -56,8 +56,11 @@ static struct uwb_rc *uwb_rc_find_by_index(int index)
struct uwb_rc *rc = NULL;
 
dev = class_find_device(&uwb_rc_class, NULL, &index, uwb_rc_index_match);
-   if (dev)
+   if (dev) {
rc = dev_get_drvdata(dev);
+   put_device(dev);
+   }
+
return rc;
 }
 
@@ -368,7 +371,9 @@ struct uwb_rc *__uwb_rc_try_get(struct uwb_rc *target_rc)
if (dev) {
rc = dev_get_drvdata(dev);
__uwb_rc_get(rc);
+   put_device(dev);
}
+
return rc;
 }
 EXPORT_SYMBOL_GPL(__uwb_rc_try_get);
@@ -421,8 +426,11 @@ struct uwb_rc *uwb_rc_get_by_grandpa(const struct device *grandpa_dev)
 
dev = class_find_device(&uwb_rc_class, NULL, grandpa_dev,
find_rc_grandpa);
-   if (dev)
+   if (dev) {
rc = dev_get_drvdata(dev);
+   put_device(dev);
+   }
+
return rc;
 }
 EXPORT_SYMBOL_GPL(uwb_rc_get_by_grandpa);
@@ -454,8 +462,10 @@ struct uwb_rc *uwb_rc_get_by_dev(const struct uwb_dev_addr *addr)
struct uwb_rc *rc = NULL;
 
dev = class_find_device(&uwb_rc_class, NULL, addr, find_rc_dev);
-   if (dev)
+   if (dev) {
rc = dev_get_drvdata(dev);
+   put_device(dev);
+   }
 
return rc;
 }
diff --git a/drivers/uwb/pal.c b/drivers/uwb/pal.c
index c1304b8d4985..678e93741ae1 100644
--- a/drivers/uwb/pal.c
+++ b/drivers/uwb/pal.c
@@ -97,6 +97,8 @@ static bool uwb_rc_class_device_exists(struct uwb_rc *target_rc)
 
dev = class_find_device(&uwb_rc_class, NULL, target_rc, find_rc);
 
+   put_device(dev);
+
return (dev != NULL);
 }
 
-- 
2.10.2
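
The leak pattern behind this and the neighbouring reference-leak fixes
is the same each time: a find helper returns the device with an extra
reference, and the caller must drop it again. A toy refcount model is
sketched below; fake_find(), fake_put() and the struct are stand-ins,
not the driver-core API, though the NULL-tolerant put mirrors how
put_device() behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device */
struct fake_device {
	int refcount;
	void *drvdata;
};

/* Like a class lookup: a hit comes back with one extra reference */
static struct fake_device *fake_find(struct fake_device *d)
{
	if (d)
		d->refcount++;
	return d;
}

/* Drops one reference; tolerates NULL so callers need no extra check */
static void fake_put(struct fake_device *d)
{
	if (d)
		d->refcount--;
}

/* Look up the driver data, dropping the lookup reference if "fixed" */
static void *lookup_drvdata(struct fake_device *d, int fixed)
{
	struct fake_device *dev = fake_find(d);
	void *data = NULL;

	if (dev) {
		data = dev->drvdata;
		if (fixed)
			fake_put(dev);
	}
	return data;
}

/* Refcount of a device that started at 1, after a single lookup */
int refs_after_lookup(int fixed)
{
	int payload = 42;
	struct fake_device d = { 1, &payload };
	void *data = lookup_drvdata(&d, fixed);

	assert(data == &payload);
	return d.refcount;
}

/* Existence test in the style of the pal.c hunk: put, then compare */
int device_exists(struct fake_device *d)
{
	struct fake_device *dev = fake_find(d);

	fake_put(dev);
	return dev != NULL;
}
```

Each leaky lookup leaves the count one higher than it started, so the
device can never be released; the fixed variant restores it.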



[PATCH 3.12 077/127] PM / sleep: fix device reference leak in test_suspend

2016-11-25 Thread Jiri Slaby
From: Johan Hovold 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit ceb75787bc75d0a7b88519ab8a68067ac690f55a upstream.

Make sure to drop the reference taken by class_find_device() after
opening the RTC device.

Fixes: 77437fd4e61f (pm: boot time suspend selftest)
Signed-off-by: Johan Hovold 
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Jiri Slaby 
---
 kernel/power/suspend_test.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/power/suspend_test.c b/kernel/power/suspend_test.c
index 269b097e78ea..743615bfdcec 100644
--- a/kernel/power/suspend_test.c
+++ b/kernel/power/suspend_test.c
@@ -169,8 +169,10 @@ static int __init test_suspend(void)
 
/* RTCs have initialized by now too ... can we use one? */
dev = class_find_device(rtc_class, NULL, NULL, has_wakealarm);
-   if (dev)
+   if (dev) {
rtc = rtc_class_open(dev_name(dev));
+   put_device(dev);
+   }
if (!rtc) {
printk(warn_no_rtc);
goto done;
-- 
2.10.2



[PATCH 3.12 046/127] pwm: Unexport children before chip removal

2016-11-25 Thread Jiri Slaby
From: David Hsu 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 0733424c9ba9f42242409d1ece780777272f7ea1 upstream.

Exported pwm channels aren't removed before the pwmchip and are
leaked. This results in invalid sysfs files. This fix removes
all exported pwm channels before chip removal.

Signed-off-by: David Hsu 
Fixes: 76abbdde2d95 ("pwm: Add sysfs interface")
Signed-off-by: Thierry Reding 
Signed-off-by: Jiri Slaby 
---
 drivers/pwm/core.c  |  2 ++
 drivers/pwm/sysfs.c | 18 ++
 include/linux/pwm.h |  5 +
 3 files changed, 25 insertions(+)

diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
index 2ca95042a0b9..c244e7dc6d66 100644
--- a/drivers/pwm/core.c
+++ b/drivers/pwm/core.c
@@ -293,6 +293,8 @@ int pwmchip_remove(struct pwm_chip *chip)
unsigned int i;
int ret = 0;
 
+   pwmchip_sysfs_unexport_children(chip);
+
mutex_lock(&pwm_lock);
 
for (i = 0; i < chip->npwm; i++) {
diff --git a/drivers/pwm/sysfs.c b/drivers/pwm/sysfs.c
index 8c20332d4825..809b5ab9074c 100644
--- a/drivers/pwm/sysfs.c
+++ b/drivers/pwm/sysfs.c
@@ -348,6 +348,24 @@ void pwmchip_sysfs_unexport(struct pwm_chip *chip)
}
 }
 
+void pwmchip_sysfs_unexport_children(struct pwm_chip *chip)
+{
+   struct device *parent;
+   unsigned int i;
+
+   parent = class_find_device(&pwm_class, NULL, chip,
+  pwmchip_sysfs_match);
+   if (!parent)
+   return;
+
+   for (i = 0; i < chip->npwm; i++) {
+   struct pwm_device *pwm = &chip->pwms[i];
+
+   if (test_bit(PWMF_EXPORTED, &pwm->flags))
+   pwm_unexport_child(parent, pwm);
+   }
+}
+
 static int __init pwm_sysfs_init(void)
 {
return class_register(&pwm_class);
diff --git a/include/linux/pwm.h b/include/linux/pwm.h
index f0feafd184a0..08b0215128dc 100644
--- a/include/linux/pwm.h
+++ b/include/linux/pwm.h
@@ -295,6 +295,7 @@ static inline void pwm_add_table(struct pwm_lookup *table, size_t num)
 #ifdef CONFIG_PWM_SYSFS
 void pwmchip_sysfs_export(struct pwm_chip *chip);
 void pwmchip_sysfs_unexport(struct pwm_chip *chip);
+void pwmchip_sysfs_unexport_children(struct pwm_chip *chip);
 #else
 static inline void pwmchip_sysfs_export(struct pwm_chip *chip)
 {
@@ -303,6 +304,10 @@ static inline void pwmchip_sysfs_export(struct pwm_chip *chip)
 static inline void pwmchip_sysfs_unexport(struct pwm_chip *chip)
 {
 }
+
+static inline void pwmchip_sysfs_unexport_children(struct pwm_chip *chip)
+{
+}
 #endif /* CONFIG_PWM_SYSFS */
 
 #endif /* __LINUX_PWM_H */
-- 
2.10.2



[PATCH 3.12 116/127] sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit a236441bb69723032db94128761a469030c3fe6d ]

Just like the non-cross-call TLB flush handlers, the cross-call ones need
to avoid doing PC-relative branches outside of their code blocks.

Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/mm/ultra.S | 42 ++
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/arch/sparc/mm/ultra.S b/arch/sparc/mm/ultra.S
index 5128d38b1d1a..0fa2e6202c1f 100644
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -484,7 +484,7 @@ cheetah_patch_cachetlbops:
 */
.align  32
.globl  xcall_flush_tlb_mm
-xcall_flush_tlb_mm:/* 21 insns */
+xcall_flush_tlb_mm:/* 24 insns */
mov PRIMARY_CONTEXT, %g2
ldxa    [%g2] ASI_DMMU, %g3
srlx    %g3, CTX_PGSZ1_NUC_SHIFT, %g4
@@ -506,9 +506,12 @@ xcall_flush_tlb_mm:/* 21 insns */
nop
nop
nop
+   nop
+   nop
+   nop
 
.globl  xcall_flush_tlb_page
-xcall_flush_tlb_page:  /* 17 insns */
+xcall_flush_tlb_page:  /* 20 insns */
/* %g5=context, %g1=vaddr */
mov PRIMARY_CONTEXT, %g4
ldxa    [%g4] ASI_DMMU, %g2
@@ -527,9 +530,12 @@ xcall_flush_tlb_page:  /* 17 insns */
retry
nop
nop
+   nop
+   nop
+   nop
 
.globl  xcall_flush_tlb_kernel_range
-xcall_flush_tlb_kernel_range:  /* 25 insns */
+xcall_flush_tlb_kernel_range:  /* 28 insns */
sethi   %hi(PAGE_SIZE - 1), %g2
or  %g2, %lo(PAGE_SIZE - 1), %g2
andn    %g1, %g2, %g1
@@ -555,6 +561,9 @@ xcall_flush_tlb_kernel_range:   /* 25 insns */
nop
nop
nop
+   nop
+   nop
+   nop
 
/* This runs in a very controlled environment, so we do
 * not need to worry about BH races etc.
@@ -737,7 +746,7 @@ __hypervisor_tlb_xcall_error:
ba,a,pt %xcc, rtrap
 
.globl  __hypervisor_xcall_flush_tlb_mm
-__hypervisor_xcall_flush_tlb_mm: /* 21 insns */
+__hypervisor_xcall_flush_tlb_mm: /* 24 insns */
/* %g5=ctx, g1,g2,g3,g4,g7=scratch, %g6=unusable */
mov %o0, %g2
mov %o1, %g3
@@ -751,7 +760,7 @@ __hypervisor_xcall_flush_tlb_mm: /* 21 insns */
mov HV_FAST_MMU_DEMAP_CTX, %o5
ta  HV_FAST_TRAP
mov HV_FAST_MMU_DEMAP_CTX, %g6
-   brnz,pn %o0, __hypervisor_tlb_xcall_error
+   brnz,pn %o0, 1f
 mov    %o0, %g5
mov %g2, %o0
mov %g3, %o1
@@ -760,9 +769,12 @@ __hypervisor_xcall_flush_tlb_mm: /* 21 insns */
mov %g7, %o5
membar  #Sync
retry
+1: sethi   %hi(__hypervisor_tlb_xcall_error), %g4
+   jmpl    %g4 + %lo(__hypervisor_tlb_xcall_error), %g0
+nop
 
.globl  __hypervisor_xcall_flush_tlb_page
-__hypervisor_xcall_flush_tlb_page: /* 17 insns */
+__hypervisor_xcall_flush_tlb_page: /* 20 insns */
/* %g5=ctx, %g1=vaddr */
mov %o0, %g2
mov %o1, %g3
@@ -774,16 +786,19 @@ __hypervisor_xcall_flush_tlb_page: /* 17 insns */
sllx    %o0, PAGE_SHIFT, %o0
ta  HV_MMU_UNMAP_ADDR_TRAP
mov HV_MMU_UNMAP_ADDR_TRAP, %g6
-   brnz,a,pn   %o0, __hypervisor_tlb_xcall_error
+   brnz,a,pn   %o0, 1f
 mov    %o0, %g5
mov %g2, %o0
mov %g3, %o1
mov %g4, %o2
membar  #Sync
retry
+1: sethi   %hi(__hypervisor_tlb_xcall_error), %g4
+   jmpl    %g4 + %lo(__hypervisor_tlb_xcall_error), %g0
+nop
 
.globl  __hypervisor_xcall_flush_tlb_kernel_range
-__hypervisor_xcall_flush_tlb_kernel_range: /* 25 insns */
+__hypervisor_xcall_flush_tlb_kernel_range: /* 28 insns */
/* %g1=start, %g7=end, g2,g3,g4,g5,g6=scratch */
sethi   %hi(PAGE_SIZE - 1), %g2
or  %g2, %lo(PAGE_SIZE - 1), %g2
@@ -800,7 +815,7 @@ __hypervisor_xcall_flush_tlb_kernel_range: /* 25 insns */
mov HV_MMU_ALL, %o2 /* ARG2: flags */
ta  HV_MMU_UNMAP_ADDR_TRAP
mov HV_MMU_UNMAP_ADDR_TRAP, %g6
-   brnz,pn %o0, __hypervisor_tlb_xcall_error
+   brnz,pn %o0, 1f
 mov    %o0, %g5
sethi   %hi(PAGE_SIZE), %o2
brnz,pt %g3, 1b
@@ -810,6 +825,9 @@ __hypervisor_xcall_flush_tlb_kernel_range: /* 25 insns */
mov %g7, %o2
membar  #Sync
retry
+1: sethi   %hi(__hypervisor_tlb_xc

[PATCH 3.12 122/127] drivers/net: Disable UFO through virtio

2016-11-25 Thread Jiri Slaby
From: Ben Hutchings 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 3d0ad09412ffe00c9afa201d01effdb6023d09b4 upstream.

IPv6 does not allow fragmentation by routers, so there is no
fragmentation ID in the fixed header.  UFO for IPv6 requires the ID to
be passed separately, but there is no provision for this in the virtio
net protocol.

Until recently our software implementation of UFO/IPv6 generated a new
ID, but this was a bug.  Now we will use ID=0 for any UFO/IPv6 packet
passed through a tap, which is even worse.

Unfortunately there is no distinction between UFO/IPv4 and v6
features, so disable UFO on taps and virtio_net completely until we
have a proper solution.

We cannot depend on VM managers respecting the tap feature flags, so
keep accepting UFO packets but log a warning the first time we do
this.

Signed-off-by: Ben Hutchings 
Fixes: 916e4cf46d02 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data")
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 drivers/net/macvtap.c| 13 +
 drivers/net/tun.c| 19 +++
 drivers/net/virtio_net.c | 24 ++--
 3 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 98ce4feb9a79..576c3236fa40 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -67,7 +67,7 @@ static struct cdev macvtap_cdev;
 static const struct proto_ops macvtap_socket_ops;
 
 #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
- NETIF_F_TSO6 | NETIF_F_UFO)
+ NETIF_F_TSO6)
 #define RX_OFFLOADS (NETIF_F_GRO | NETIF_F_LRO)
 #define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG | NETIF_F_FRAGLIST)
 
@@ -566,6 +566,8 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
gso_type = SKB_GSO_TCPV6;
break;
case VIRTIO_NET_HDR_GSO_UDP:
+   pr_warn_once("macvtap: %s: using disabled UFO feature; please fix this program\n",
+current->comm);
gso_type = SKB_GSO_UDP;
if (skb->protocol == htons(ETH_P_IPV6))
ipv6_proxy_select_ident(skb);
@@ -613,8 +615,6 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
-   else if (sinfo->gso_type & SKB_GSO_UDP)
-   vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_UDP;
else
BUG();
if (sinfo->gso_type & SKB_GSO_TCP_ECN)
@@ -962,9 +962,6 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
if (arg & TUN_F_TSO6)
feature_mask |= NETIF_F_TSO6;
}
-
-   if (arg & TUN_F_UFO)
-   feature_mask |= NETIF_F_UFO;
}
 
/* tun/tap driver inverts the usage for TSO offloads, where
@@ -975,7 +972,7 @@ static int set_offload(struct macvtap_queue *q, unsigned long arg)
 * When user space turns off TSO, we turn off GSO/LRO so that
 * user-space will not receive TSO frames.
 */
-   if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_UFO))
+   if (feature_mask & (NETIF_F_TSO | NETIF_F_TSO6))
features |= RX_OFFLOADS;
else
features &= ~RX_OFFLOADS;
@@ -1076,7 +1073,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
case TUNSETOFFLOAD:
/* let the user check for future flags */
if (arg & ~(TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 |
-   TUN_F_TSO_ECN | TUN_F_UFO))
+   TUN_F_TSO_ECN))
return -EINVAL;
 
rtnl_lock();
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 813750d09680..46f9cb21ec56 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -173,7 +173,7 @@ struct tun_struct {
struct net_device   *dev;
netdev_features_t   set_features;
 #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
- NETIF_F_TSO6|NETIF_F_UFO)
+ NETIF_F_TSO6)
 
int vnet_hdr_sz;
int sndbuf;
@@ -1113,10 +1113,20 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
break;
case VIRTIO_NET_HDR_GSO_UDP:
+   {
+   static bool warned;
+
+   if (!warned) {
+   warned = true;
+  

[PATCH 3.12 126/127] xen-pciback: Add name prefix to global 'permissive' variable

2016-11-25 Thread Jiri Slaby
From: Ben Hutchings 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 8014bcc86ef112eab9ee1db312dba4e6b608cf89 upstream.

The variable for the 'permissive' module parameter used to be static
but was recently changed to be extern.  This puts it in the kernel
global namespace if the driver is built-in, so its name should begin
with a prefix identifying the driver.

Signed-off-by: Ben Hutchings 
Fixes: af6fc858a35b ("xen-pciback: limit guest control of command register")
Signed-off-by: David Vrabel 
Signed-off-by: Jiri Slaby 
---
 drivers/xen/xen-pciback/conf_space.c| 6 +++---
 drivers/xen/xen-pciback/conf_space.h| 2 +-
 drivers/xen/xen-pciback/conf_space_header.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/xen-pciback/conf_space.c 
b/drivers/xen/xen-pciback/conf_space.c
index ba3fac8318bb..47a4177b16d2 100644
--- a/drivers/xen/xen-pciback/conf_space.c
+++ b/drivers/xen/xen-pciback/conf_space.c
@@ -16,8 +16,8 @@
 #include "conf_space.h"
 #include "conf_space_quirks.h"
 
-bool permissive;
-module_param(permissive, bool, 0644);
+bool xen_pcibk_permissive;
+module_param_named(permissive, xen_pcibk_permissive, bool, 0644);
 
 /* This is where xen_pcibk_read_config_byte, xen_pcibk_read_config_word,
  * xen_pcibk_write_config_word, and xen_pcibk_write_config_byte are created. */
@@ -260,7 +260,7 @@ int xen_pcibk_config_write(struct pci_dev *dev, int offset, 
int size, u32 value)
 * This means that some fields may still be read-only because
 * they have entries in the config_field list that intercept
 * the write and do nothing. */
-   if (dev_data->permissive || permissive) {
+   if (dev_data->permissive || xen_pcibk_permissive) {
switch (size) {
case 1:
err = pci_write_config_byte(dev, offset,
diff --git a/drivers/xen/xen-pciback/conf_space.h 
b/drivers/xen/xen-pciback/conf_space.h
index 2e1d73d1d5d0..62461a8ba1d6 100644
--- a/drivers/xen/xen-pciback/conf_space.h
+++ b/drivers/xen/xen-pciback/conf_space.h
@@ -64,7 +64,7 @@ struct config_field_entry {
void *data;
 };
 
-extern bool permissive;
+extern bool xen_pcibk_permissive;
 
 #define OFFSET(cfg_entry) ((cfg_entry)->base_offset+(cfg_entry)->field->offset)
 
diff --git a/drivers/xen/xen-pciback/conf_space_header.c 
b/drivers/xen/xen-pciback/conf_space_header.c
index 2d7369391472..f8baf463dd35 100644
--- a/drivers/xen/xen-pciback/conf_space_header.c
+++ b/drivers/xen/xen-pciback/conf_space_header.c
@@ -105,7 +105,7 @@ static int command_write(struct pci_dev *dev, int offset, 
u16 value, void *data)
 
cmd->val = value;
 
-   if (!permissive && (!dev_data || !dev_data->permissive))
+   if (!xen_pcibk_permissive && (!dev_data || !dev_data->permissive))
return 0;
 
/* Only allow the guest to control certain bits. */
-- 
2.10.2



[PATCH 3.12 115/127] sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 830cda3f9855ff092b0e9610346d110846fc497c ]

Noticed by James Clarke.

Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/mm/ultra.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sparc/mm/ultra.S b/arch/sparc/mm/ultra.S
index 85de139bfad6..5128d38b1d1a 100644
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -358,7 +358,7 @@ __hypervisor_flush_tlb_page: /* 22 insns */
nop
nop
 
-__hypervisor_flush_tlb_pending: /* 16 insns */
+__hypervisor_flush_tlb_pending: /* 27 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
sllx %o1, 3, %g1
mov %o2, %g2
-- 
2.10.2



[PATCH 3.12 124/127] perf: Tighten (and fix) the grouping condition

2016-11-25 Thread Jiri Slaby
From: Peter Zijlstra 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit c3c87e770458aa004bd7ed3f29945ff436fd6511 upstream.

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused about where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Link: http://lkml.kernel.org/r/20150123125834.090683...@infradead.org
Signed-off-by: Ingo Molnar 
Signed-off-by: Jiri Slaby 
---
 include/linux/perf_event.h |  6 --
 kernel/events/core.c   | 15 +--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8ba627c1d60..45aa1c62dbfa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -439,11 +439,6 @@ struct perf_event {
 #endif /* CONFIG_PERF_EVENTS */
 };
 
-enum perf_event_context_type {
-   task_context,
-   cpu_context,
-};
-
 /**
  * struct perf_event_context - event context structure
  *
@@ -451,7 +446,6 @@ enum perf_event_context_type {
  */
 struct perf_event_context {
struct pmu  *pmu;
-   enum perf_event_context_type type;
/*
 * Protect the states of the events in the list,
 * nr_active, and the list:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0b3c09a3f7b6..a4a1516f3efc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6503,7 +6503,6 @@ skip_type:
__perf_event_init_context(&cpuctx->ctx);
lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
-   cpuctx->ctx.type = cpu_context;
cpuctx->ctx.pmu = pmu;
 
__perf_cpu_hrtimer_init(cpuctx, cpu);
@@ -7136,7 +7135,19 @@ SYSCALL_DEFINE5(perf_event_open,
 * task or CPU context:
 */
if (move_group) {
-   if (group_leader->ctx->type != ctx->type)
+   /*
+* Make sure we're both on the same task, or both
+* per-cpu events.
+*/
+   if (group_leader->ctx->task != ctx->task)
+   goto err_context;
+
+   /*
+* Make sure we're both events for the same CPU;
+* grouping events for different CPUs is broken; since
+* you can never concurrently schedule them anyhow.
+*/
+   if (group_leader->cpu != event->cpu)
goto err_context;
} else {
if (group_leader->ctx != ctx)
-- 
2.10.2
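
The tightened grouping rule can be pictured with a small userspace
model.  This is only a sketch under stated assumptions: struct toy_event
and can_move_into_group() are hypothetical names, and the real check in
perf_event_open() compares group_leader->ctx->task with ctx->task and
group_leader->cpu with event->cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields the patch compares. */
struct toy_event {
	const void *task;	/* NULL models a per-cpu (no task) context */
	int cpu;		/* -1 models "any cpu" */
};

/*
 * Model of the move_group check: both events must belong to the same
 * task (or both be per-cpu), and both must target the same CPU.
 */
static bool can_move_into_group(const struct toy_event *leader,
				const struct toy_event *event)
{
	assert(leader != NULL && event != NULL);

	if (leader->task != event->task)
		return false;
	if (leader->cpu != event->cpu)
		return false;
	return true;
}
```

In this model two per-cpu events on different CPUs are rejected, which
is the case the commit message describes as impossible to co-schedule.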



[PATCH 3.12 121/127] KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()

2016-11-25 Thread Jiri Slaby
From: Ard Biesheuvel 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 85c8555ff07ef09261bd50d603cd4290cff5a8cc upstream.

Read-only memory ranges may be backed by the zero page, so avoid
misidentifying it as an MMIO pfn.

This fixes another issue I identified when testing QEMU+KVM_UEFI, where
a read to an uninitialized emulated NOR flash brought in the zero page,
but mapped as a read-write device region, because kvm_is_mmio_pfn()
misidentifies it as a MMIO pfn due to its PG_reserved bit being set.

Signed-off-by: Ard Biesheuvel 
Fixes: b88657674d39 ("ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping")
Signed-off-by: Paolo Bonzini 
Signed-off-by: Jiri Slaby 
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3351605d2608..e7a1166c3eb4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -104,7 +104,7 @@ static bool largepages_enabled = true;
 bool kvm_is_mmio_pfn(pfn_t pfn)
 {
if (pfn_valid(pfn))
-   return PageReserved(pfn_to_page(pfn));
+   return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
 
return true;
 }
-- 
2.10.2
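
The effect of the one-line change can be shown with a toy model.  The
sketch below is not kernel code: struct toy_page and both functions are
hypothetical, with boolean fields standing in for pfn_valid(),
PageReserved() and is_zero_pfn().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor; the kernel uses pfn_t and struct page. */
struct toy_page {
	bool valid;	/* stands in for pfn_valid() */
	bool reserved;	/* stands in for PageReserved() */
	bool is_zero;	/* stands in for is_zero_pfn() */
};

/* Behaviour before the patch: any valid reserved page counts as MMIO. */
static bool is_mmio_old(const struct toy_page *p)
{
	assert(p != NULL);
	if (p->valid)
		return p->reserved;
	return true;
}

/* Behaviour after the patch: the zero page is never treated as MMIO. */
static bool is_mmio_new(const struct toy_page *p)
{
	assert(p != NULL);
	if (p->valid)
		return !p->is_zero && p->reserved;
	return true;
}
```

With a valid, reserved zero page the old check reports MMIO and the new
one does not; other reserved pages and invalid pfns are unchanged.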



[PATCH 3.12 120/127] mm: export symbol dependencies of is_zero_pfn()

2016-11-25 Thread Jiri Slaby
From: Ard Biesheuvel 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 0b70068e47e8f0c813a902dc3d6def601fd15acb upstream.

In order to make the static inline function is_zero_pfn() callable by
modules, export its symbol dependencies 'zero_pfn' and (for s390 and
mips) 'zero_page_mask'.

We need this for KVM, as CONFIG_KVM is a tristate for all supported
architectures except ARM and arm64, and testing whether a pfn refers
to the zero page is required to correctly distinguish the zero page
from other special RAM ranges that may also have the PG_reserved bit
set, but need to be treated as MMIO memory.

Signed-off-by: Ard Biesheuvel 
Acked-by: Andrew Morton 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Jiri Slaby 
---
 arch/mips/mm/init.c | 1 +
 arch/s390/mm/init.c | 1 +
 mm/memory.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index e205ef598e97..c247cf5a31cb 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -74,6 +74,7 @@
  */
 unsigned long empty_zero_page, zero_page_mask;
 EXPORT_SYMBOL_GPL(empty_zero_page);
+EXPORT_SYMBOL(zero_page_mask);
 
 /*
  * Not static inline because used by IP27 special magic initialization code
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index ad446b0c55b6..1b30d5488f82 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -43,6 +43,7 @@ pgd_t swapper_pg_dir[PTRS_PER_PGD] 
__attribute__((__aligned__(PAGE_SIZE)));
 
 unsigned long empty_zero_page, zero_page_mask;
 EXPORT_SYMBOL(empty_zero_page);
+EXPORT_SYMBOL(zero_page_mask);
 
 static void __init setup_zero_pages(void)
 {
diff --git a/mm/memory.c b/mm/memory.c
index a0c9c6cb59d1..f5744269a454 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -116,6 +116,8 @@ __setup("norandmaps", disable_randmaps);
 unsigned long zero_pfn __read_mostly;
 unsigned long highest_memmap_pfn __read_mostly;
 
+EXPORT_SYMBOL(zero_pfn);
+
 /*
  * CONFIG_MMU architectures set up ZERO_PAGE in their paging_init()
  */
-- 
2.10.2



[PATCH 3.12 118/127] cgroup: use an ordered workqueue for cgroup destruction

2016-11-25 Thread Jiri Slaby
From: Hugh Dickins 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit ab3f5faa6255a0eb4f832675507d9e295ca7e9ba upstream.

Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.

There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding cgroup_mutex
which prevents the child from reaching its mem_cgroup_reparent_charges().

Just use an ordered workqueue for cgroup_destroy_wq.

tj: Committing as the temporary fix until the reverse dependency can
be removed from memcg.  Comment updated accordingly.

Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Suggested-by: Filipe Brandenburger 
Signed-off-by: Hugh Dickins 
Signed-off-by: Tejun Heo 
Signed-off-by: Jiri Slaby 
---
 kernel/cgroup.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 5d9d542c0bb5..e89f6cec01c9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5168,12 +5168,16 @@ static int __init cgroup_wq_init(void)
/*
 * There isn't much point in executing destruction path in
 * parallel.  Good chunk is serialized with cgroup_mutex anyway.
-* Use 1 for @max_active.
+*
+* XXX: Must be ordered to make sure parent is offlined after
+* children.  The ordering requirement is for memcg where a
+* parent's offline may wait for a child's leading to deadlock.  In
+* the long term, this should be fixed from memcg side.
 *
 * We would prefer to do this in cgroup_init() above, but that
 * is called before init_workqueues(): so leave this until after.
 */
-   cgroup_destroy_wq = alloc_workqueue("cgroup_destroy", 0, 1);
+   cgroup_destroy_wq = alloc_ordered_workqueue("cgroup_destroy", 0);
BUG_ON(!cgroup_destroy_wq);
return 0;
 }
-- 
2.10.2



[PATCH 3.12 113/127] sparc64: Handle extremely large kernel TSB range flushes sanely.

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 849c498766060a16aad5b0e0d03206726e7d2fa4 ]

If the number of pages we are flushing is more than twice the number
of entries in the TSB, just scan the TSB table for matches rather
than probing each and every page in the range.

Based upon a patch and report by James Clarke.

Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/mm/tsb.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/sparc/mm/tsb.c b/arch/sparc/mm/tsb.c
index 12f172117043..48a09e48d444 100644
--- a/arch/sparc/mm/tsb.c
+++ b/arch/sparc/mm/tsb.c
@@ -26,6 +26,20 @@ static inline int tag_compare(unsigned long tag, unsigned 
long vaddr)
return (tag == (vaddr >> 22));
 }
 
+static void flush_tsb_kernel_range_scan(unsigned long start, unsigned long end)
+{
+   unsigned long idx;
+
+   for (idx = 0; idx < KERNEL_TSB_NENTRIES; idx++) {
+   struct tsb *ent = &swapper_tsb[idx];
+   unsigned long match = idx << 13;
+
+   match |= (ent->tag << 22);
+   if (match >= start && match < end)
+   ent->tag = (1UL << TSB_TAG_INVALID_BIT);
+   }
+}
+
 /* TSB flushes need only occur on the processor initiating the address
  * space modification, not on each cpu the address space has run on.
  * Only the TLB flush needs that treatment.
@@ -35,6 +49,9 @@ void flush_tsb_kernel_range(unsigned long start, unsigned 
long end)
 {
unsigned long v;
 
+   if ((end - start) >> PAGE_SHIFT >= 2 * KERNEL_TSB_NENTRIES)
+   return flush_tsb_kernel_range_scan(start, end);
+
for (v = start; v < end; v += PAGE_SIZE) {
unsigned long hash = tsb_hash(v, PAGE_SHIFT,
  KERNEL_TSB_NENTRIES);
-- 
2.10.2
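
The threshold test that selects the table scan can be sketched in
isolation.  The constants below are assumptions chosen for illustration
(8K pages and a four-entry table), and prefer_table_scan() is a
hypothetical name; the patch itself uses PAGE_SHIFT and
KERNEL_TSB_NENTRIES and compares with >=.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT   13UL	/* assumed: 8K pages */
#define TOY_TSB_NENTRIES 4UL	/* tiny table, for illustration only */

/*
 * Return true when the range covers at least twice as many pages as
 * the table has entries, i.e. when scanning the table once is cheaper
 * than probing every page in the range.
 */
static bool prefer_table_scan(unsigned long start, unsigned long end)
{
	assert(start <= end);
	return ((end - start) >> TOY_PAGE_SHIFT) >= 2 * TOY_TSB_NENTRIES;
}
```

With these toy values an 8-page range switches to the scan, while a
7-page range still probes page by page.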



[PATCH 3.12 125/127] PCI: Handle read-only BARs on AMD CS553x devices

2016-11-25 Thread Jiri Slaby
From: Myron Stowe 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 06cf35f903aa6da0cc8d9f81e9bcd1f7e1b534bb upstream.

Some AMD CS553x devices have read-only BARs because of a firmware or
hardware defect.  There's a workaround in quirk_cs5536_vsa(), but it no
longer works after 36e8164882ca ("PCI: Restore detection of read-only
BARs").  Prior to 36e8164882ca, we filled in res->start; afterwards we
leave it zeroed out.  The quirk only updated the size, so the driver tried
to use a region starting at zero, which didn't work.

Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
hard-code the sizes.

On Nix's system BAR 2's read-only value is 0x6200.  Prior to 36e8164882ca,
we interpret that as a 512-byte BAR based on the lowest-order bit set.  Per
datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
avoid clearing any address bits if a platform uses only 64-byte alignment.

[js] pcibios_bus_to_resource takes pdev, not bus in 3.12

[bhelgaas: changelog, reduce BAR 2 size to 64]
Fixes: 36e8164882ca ("PCI: Restore detection of read-only BARs")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85991#c4
Link: http://support.amd.com/TechDocs/31506_cs5535_databook.pdf
Link: http://support.amd.com/TechDocs/33238G_cs5536_db.pdf
Reported-and-tested-by: Nix 
Signed-off-by: Myron Stowe 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Jiri Slaby 
---
 drivers/pci/quirks.c | 41 +
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 019dbc1fae11..cb245bd510a2 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -339,19 +339,52 @@ static void quirk_s3_64M(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_868,   
quirk_s3_64M);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_S3, PCI_DEVICE_ID_S3_968,   
quirk_s3_64M);
 
+static void quirk_io(struct pci_dev *dev, int pos, unsigned size,
+const char *name)
+{
+   u32 region;
+   struct pci_bus_region bus_region;
+   struct resource *res = dev->resource + pos;
+
+   pci_read_config_dword(dev, PCI_BASE_ADDRESS_0 + (pos << 2), &region);
+
+   if (!region)
+   return;
+
+   res->name = pci_name(dev);
+   res->flags = region & ~PCI_BASE_ADDRESS_IO_MASK;
+   res->flags |=
+   (IORESOURCE_IO | IORESOURCE_PCI_FIXED | IORESOURCE_SIZEALIGN);
+   region &= ~(size - 1);
+
+   /* Convert from PCI bus to resource space */
+   bus_region.start = region;
+   bus_region.end = region + size - 1;
+   pcibios_bus_to_resource(dev, res, &bus_region);
+
+   dev_info(&dev->dev, FW_BUG "%s quirk: reg 0x%x: %pR\n",
+name, PCI_BASE_ADDRESS_0 + (pos << 2), res);
+}
+
 /*
  * Some CS5536 BIOSes (for example, the Soekris NET5501 board w/ comBIOS
  * ver. 1.33  20070103) don't set the correct ISA PCI region header info.
  * BAR0 should be 8 bytes; instead, it may be set to something like 8k
  * (which conflicts w/ BAR1's memory range).
+ *
+ * CS553x's ISA PCI BARs may also be read-only (ref:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=85991 - Comment #4 forward).
  */
 static void quirk_cs5536_vsa(struct pci_dev *dev)
 {
+   static char *name = "CS5536 ISA bridge";
+
if (pci_resource_len(dev, 0) != 8) {
-   struct resource *res = &dev->resource[0];
-   res->end = res->start + 8 - 1;
-   dev_info(&dev->dev, "CS5536 ISA bridge bug detected "
-   "(incorrect header); workaround applied.\n");
+   quirk_io(dev, 0,   8, name);    /* SMB */
+   quirk_io(dev, 1, 256, name);    /* GPIO */
+   quirk_io(dev, 2,  64, name);    /* MFGPT */
+   dev_info(&dev->dev, "%s bug detected (incorrect header); workaround applied\n",
+name);
}
 }
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CS5536_ISA, 
quirk_cs5536_vsa);
-- 
2.10.2
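
The BAR arithmetic described in the changelog can be checked with a
short userspace sketch.  It assumes nothing beyond the numbers quoted
above (a read-only value of 0x6200 and a hard-coded size of 64); struct
io_window and both helpers are hypothetical and are not the kernel's
struct resource or quirk_io().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an I/O window with an inclusive end address. */
struct io_window {
	uint32_t start;
	uint32_t end;
};

/* Size implied by the lowest-order set bit of a BAR value (0 if none). */
static uint32_t size_from_lowest_bit(uint32_t bar)
{
	return bar & (~bar + 1u);
}

/* Mask the base down to a power-of-two size and build the window. */
static struct io_window fixed_window(uint32_t bar, uint32_t size)
{
	struct io_window w;

	assert(size != 0 && (size & (size - 1)) == 0);	/* power of two */
	w.start = bar & ~(size - 1);
	w.end = w.start + size - 1;
	return w;
}
```

For 0x6200 the lowest set bit gives 512 bytes, which matches the old
interpretation.  A base that is only 64-byte aligned, such as 0x6240,
keeps all its address bits with size 64 but would be rounded down to
0x6200 with size 512.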



[PATCH 3.12 123/127] usb: musb: musb_cppi41: recognize HS devices in hostmode

2016-11-25 Thread Jiri Slaby
From: Sebastian Andrzej Siewior 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 1eec34e9f25664cf71e05321329d128e0565beae upstream.

There is a poll loop for max 25us for HS devices. Now guess what, I
tested it in gadget mode and forgot about the little detail. Nobody seems
to have noticed it…
This patch adds the missing logic for hostmode so it is recognized in
host and device mode properly.

Fixes: 50aea6fca771 ("usb: musb: cppi41: fire hrtimer according to
programmed channel length")
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Felipe Balbi 
Signed-off-by: Jiri Slaby 
---
 drivers/usb/musb/musb_cppi41.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/musb/musb_cppi41.c b/drivers/usb/musb/musb_cppi41.c
index cce32e91fd9e..83bee312df8d 100644
--- a/drivers/usb/musb/musb_cppi41.c
+++ b/drivers/usb/musb/musb_cppi41.c
@@ -234,6 +234,7 @@ static void cppi41_dma_callback(void *private_data)
cppi41_trans_done(cppi41_channel);
} else {
struct cppi41_dma_controller *controller;
+   int is_hs = 0;
/*
 * On AM335x it has been observed that the TX interrupt fires
 * too early that means the TXFIFO is not yet empty but the DMA
@@ -246,7 +247,14 @@ static void cppi41_dma_callback(void *private_data)
 */
controller = cppi41_channel->controller;
 
-   if (musb->g.speed == USB_SPEED_HIGH) {
+   if (is_host_active(musb)) {
+   if (musb->port1_status & USB_PORT_STAT_HIGH_SPEED)
+   is_hs = 1;
+   } else {
+   if (musb->g.speed == USB_SPEED_HIGH)
+   is_hs = 1;
+   }
+   if (is_hs) {
unsigned wait = 25;
 
do {
-- 
2.10.2
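
The speed check added by the patch picks its source depending on the
controller role.  A minimal sketch of that selection follows; struct
toy_musb and its boolean fields are hypothetical stand-ins for
is_host_active(), the USB_PORT_STAT_HIGH_SPEED bit in port1_status, and
g.speed == USB_SPEED_HIGH.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state the callback looks at. */
struct toy_musb {
	bool host_active;	/* models is_host_active(musb) */
	bool port_high_speed;	/* models the port1_status HS bit */
	bool gadget_high_speed;	/* models g.speed == USB_SPEED_HIGH */
};

/* Old behaviour: only the gadget speed was consulted. */
static bool is_hs_gadget_only(const struct toy_musb *m)
{
	assert(m != NULL);
	return m->gadget_high_speed;
}

/* New behaviour: use the port status in host mode, gadget speed otherwise. */
static bool is_hs_by_mode(const struct toy_musb *m)
{
	assert(m != NULL);
	return m->host_active ? m->port_high_speed : m->gadget_high_speed;
}
```

In host mode with a high-speed device attached, the gadget-only check
misses the device while the mode-aware check recognizes it.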



[PATCH 3.12 119/127] mm: filemap: update find_get_pages_tag() to deal with shadow entries

2016-11-25 Thread Jiri Slaby
From: Johannes Weiner 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 139b6a6fb1539e04b01663d61baff3088c63dbb5 upstream.

Dave Jones reports the following crash when find_get_pages_tag() runs
into an exceptional entry:

  kernel BUG at mm/filemap.c:1347!
  RIP: find_get_pages_tag+0x1cb/0x220
  Call Trace:
find_get_pages_tag+0x36/0x220
pagevec_lookup_tag+0x21/0x30
filemap_fdatawait_range+0xbe/0x1e0
filemap_fdatawait+0x27/0x30
sync_inodes_sb+0x204/0x2a0
sync_inodes_one_sb+0x19/0x20
iterate_supers+0xb2/0x110
sys_sync+0x44/0xb0
ia32_do_call+0x13/0x13

  1343 /*
  1344  * This function is never used on a shmem/tmpfs
  1345  * mapping, so a swap entry won't be found here.
  1346  */
  1347 BUG();

After commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in
page cache radix trees") this comment and BUG() are out of date because
exceptional entries can now appear in all mappings - as shadows of
recently evicted pages.

However, as Hugh Dickins notes,

  "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
   any other PAGECACHE_TAG_*) to appear on an exceptional entry.

   I expect it comes down to an occasional race in RCU lookup of the
   radix_tree: lacking absolute synchronization, we might sometimes
   catch an exceptional entry, with the tag which really belongs with
   the unexceptional entry which was there an instant before."

And indeed, not only is the tree walk lockless, the tags are also read
in chunks, one radix tree node at a time.  There is plenty of time for
page reclaim to swoop in and replace a page that was already looked up
as tagged with a shadow entry.

Remove the BUG() and update the comment.  While reviewing all other
lookup sites for whether they properly deal with shadow entries of
evicted pages, update all the comments and fix memcg file charge moving
to not miss shmem/tmpfs swapcache pages.

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner 
Reported-by: Dave Jones 
Acked-by: Hugh Dickins 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Jiri Slaby 
---
 mm/filemap.c| 49 -
 mm/memcontrol.c | 20 
 mm/truncate.c   |  8 
 3 files changed, 40 insertions(+), 37 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index af9e11ea4ecf..9fa5c3f40cd6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -808,8 +808,8 @@ EXPORT_SYMBOL(page_cache_prev_hole);
  * Looks up the page cache slot at @mapping & @offset.  If there is a
  * page cache page, it is returned with an increased refcount.
  *
- * If the slot holds a shadow entry of a previously evicted page, it
- * is returned.
+ * If the slot holds a shadow entry of a previously evicted page, or a
+ * swap entry from shmem/tmpfs, it is returned.
  *
  * Otherwise, %NULL is returned.
  */
@@ -830,9 +830,9 @@ repeat:
if (radix_tree_deref_retry(page))
goto repeat;
/*
-* Otherwise, shmem/tmpfs must be storing a swap entry
-* here as an exceptional entry: so return it without
-* attempting to raise page count.
+* A shadow entry of a recently evicted page,
+* or a swap entry from shmem/tmpfs.  Return
+* it without attempting to raise page count.
 */
goto out;
}
@@ -865,8 +865,8 @@ EXPORT_SYMBOL(find_get_entry);
  * page cache page, it is returned locked and with an increased
  * refcount.
  *
- * If the slot holds a shadow entry of a previously evicted page, it
- * is returned.
+ * If the slot holds a shadow entry of a previously evicted page, or a
+ * swap entry from shmem/tmpfs, it is returned.
  *
  * Otherwise, %NULL is returned.
  *
@@ -999,8 +999,8 @@ EXPORT_SYMBOL(pagecache_get_page);
  * with ascending indexes.  There may be holes in the indices due to
  * not-present pages.
  *
- * Any shadow entries of evicted pages are included in the returned
- * array.
+ * Any shadow entries of evicted pages, or swap entries from
+ * shmem/tmpfs, are included in the returned array.
  *
  * find_get_entries() returns the number of pages and shadow entries
  * which were found.
@@ -1028,9 +1028,9 @@ repeat:
if (radix_tree_deref_retry(page))
goto restart;
/*
-* Otherwise, we must be storing a swap entry
-* here as an exceptional entry: so return it
-* without attempting to raise page count.
+* A shado

[PATCH 3.12 114/127] sparc64: Fix illegal relative branches in hypervisor patched TLB code.

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit b429ae4d5b565a71dfffd759dfcd4f6c093ced94 ]

When we copy code over to patch another piece of code, we can only use
PC-relative branches that target code within that piece of code.

Such PC-relative branches cannot be made to external symbols because
the patch moves the location of the code and thus modifies the
relative address of external symbols.

Use an absolute jmpl to fix this problem.

Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/mm/ultra.S | 65 ---
 1 file changed, 51 insertions(+), 14 deletions(-)

diff --git a/arch/sparc/mm/ultra.S b/arch/sparc/mm/ultra.S
index b4f4733abc6e..85de139bfad6 100644
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -30,7 +30,7 @@
.text
.align  32
.globl  __flush_tlb_mm
-__flush_tlb_mm:/* 18 insns */
+__flush_tlb_mm:/* 19 insns */
/* %o0=(ctx & TAG_CONTEXT_BITS), %o1=SECONDARY_CONTEXT */
ldxa [%o1] ASI_DMMU, %g2
cmp %g2, %o0
@@ -81,7 +81,7 @@ __flush_tlb_page: /* 22 insns */
 
.align  32
.globl  __flush_tlb_pending
-__flush_tlb_pending:   /* 26 insns */
+__flush_tlb_pending:   /* 27 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
rdpr %pstate, %g7
sllx %o1, 3, %o1
@@ -113,7 +113,7 @@ __flush_tlb_pending:/* 26 insns */
 
.align  32
.globl  __flush_tlb_kernel_range
-__flush_tlb_kernel_range:  /* 16 insns */
+__flush_tlb_kernel_range:  /* 19 insns */
/* %o0=start, %o1=end */
cmp %o0, %o1
be,pn   %xcc, 2f
@@ -131,6 +131,9 @@ __flush_tlb_kernel_range:   /* 16 insns */
retl
 nop
nop
+   nop
+   nop
+   nop
 
 __spitfire_flush_tlb_mm_slow:
rdpr%pstate, %g1
@@ -309,19 +312,28 @@ __hypervisor_tlb_tl0_error:
ret
 restore
 
-__hypervisor_flush_tlb_mm: /* 10 insns */
+__hypervisor_flush_tlb_mm: /* 19 insns */
mov %o0, %o2/* ARG2: mmu context */
mov 0, %o0  /* ARG0: CPU lists unimplemented */
mov 0, %o1  /* ARG1: CPU lists unimplemented */
mov HV_MMU_ALL, %o3 /* ARG3: flags */
mov HV_FAST_MMU_DEMAP_CTX, %o5
ta  HV_FAST_TRAP
-   brnz,pn %o0, __hypervisor_tlb_tl0_error
+   brnz,pn %o0, 1f
 mov HV_FAST_MMU_DEMAP_CTX, %o1
retl
 nop
+1: sethi   %hi(__hypervisor_tlb_tl0_error), %o5
+   jmpl %o5 + %lo(__hypervisor_tlb_tl0_error), %g0
+nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
 
-__hypervisor_flush_tlb_page: /* 11 insns */
+__hypervisor_flush_tlb_page: /* 22 insns */
/* %o0 = context, %o1 = vaddr */
mov %o0, %g2
mov %o1, %o0  /* ARG0: vaddr + IMMU-bit */
@@ -330,10 +342,21 @@ __hypervisor_flush_tlb_page: /* 11 insns */
srlx %o0, PAGE_SHIFT, %o0
sllx %o0, PAGE_SHIFT, %o0
ta  HV_MMU_UNMAP_ADDR_TRAP
-   brnz,pn %o0, __hypervisor_tlb_tl0_error
+   brnz,pn %o0, 1f
 mov HV_MMU_UNMAP_ADDR_TRAP, %o1
retl
 nop
+1: sethi   %hi(__hypervisor_tlb_tl0_error), %o2
+   jmpl %o2 + %lo(__hypervisor_tlb_tl0_error), %g0
+nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
 
 __hypervisor_flush_tlb_pending: /* 16 insns */
/* %o0 = context, %o1 = nr, %o2 = vaddrs[] */
@@ -347,14 +370,25 @@ __hypervisor_flush_tlb_pending: /* 16 insns */
srlx %o0, PAGE_SHIFT, %o0
sllx %o0, PAGE_SHIFT, %o0
ta  HV_MMU_UNMAP_ADDR_TRAP
-   brnz,pn %o0, __hypervisor_tlb_tl0_error
+   brnz,pn %o0, 1f
 mov HV_MMU_UNMAP_ADDR_TRAP, %o1
brnz,pt %g1, 1b
 nop
retl
 nop
+1: sethi   %hi(__hypervisor_tlb_tl0_error), %o2
+   jmpl %o2 + %lo(__hypervisor_tlb_tl0_error), %g0
+nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
 
-__hypervisor_flush_tlb_kernel_range: /* 16 insns */
+__hypervisor_flush_tlb_kernel_range: /* 19 insns */
/* %o0=start, %o1=end */
cmp %o0, %o1
be,pn   %xcc, 2f
@@ -366,12 +400,15 @@ __hypervisor_flush_tlb_kernel_range: /* 16 insns */
mov 0, %o1  /* ARG1: mmu context */
mov HV_MMU_ALL, %o2 /* ARG2: 

[PATCH 3.12 127/127] ALSA: usb-audio: Fix runtime PM unbalance

2016-11-25 Thread Jiri Slaby
From: Takashi Iwai 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 9003ebb13f61e8c78a641e0dda7775183ada0625 upstream.

The fix for deadlock in PM in commit [1ee23fe07ee8: ALSA: usb-audio:
Fix deadlocks at resuming] introduced a new check of in_pm flag.
However, the brainless patch author evaluated it in a wrong way
(logical AND instead of logical OR), thus usb_autopm_get_interface()
is wrongly called at probing, leading to unbalance of runtime PM
refcount.

This patch fixes it by correcting the logic.

Reported-by: Hans Yang 
Fixes: 1ee23fe07ee8 ('ALSA: usb-audio: Fix deadlocks at resuming')
Signed-off-by: Takashi Iwai 
Signed-off-by: Jiri Slaby 
---
 sound/usb/card.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/usb/card.c b/sound/usb/card.c
index bc5795f342a7..96a09226be7d 100644
--- a/sound/usb/card.c
+++ b/sound/usb/card.c
@@ -661,7 +661,7 @@ int snd_usb_autoresume(struct snd_usb_audio *chip)
int err = -ENODEV;
 
down_read(&chip->shutdown_rwsem);
-   if (chip->probing && chip->in_pm)
+   if (chip->probing || chip->in_pm)
err = 0;
else if (!chip->shutdown)
err = usb_autopm_get_interface(chip->pm_intf);
-- 
2.10.2
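
The difference between the two conditions is easy to see in a truth
table.  The sketch below models only the flag test; struct chip_flags
and the two helpers are hypothetical, and "skip" here means taking the
err = 0 branch without calling usb_autopm_get_interface().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two flags tested in snd_usb_autoresume(). */
struct chip_flags {
	bool probing;
	bool in_pm;
};

/* Condition before the fix: both flags had to be set to skip autopm. */
static bool skip_autopm_buggy(const struct chip_flags *c)
{
	assert(c != NULL);
	return c->probing && c->in_pm;
}

/* Condition after the fix: either flag is enough to skip autopm. */
static bool skip_autopm_fixed(const struct chip_flags *c)
{
	assert(c != NULL);
	return c->probing || c->in_pm;
}
```

During probing with in_pm clear, the buggy condition does not skip, so
the autopm get is called and the runtime PM refcount goes out of
balance, as described in the changelog.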



[PATCH 3.12 110/127] sparc: Don't leak context bits into thread->fault_address

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 4f6deb8cbab532a8d7250bc09234c1795ecb5e2c ]

On pre-Niagara systems, we fetch the fault address on data TLB
exceptions from the TLB_TAG_ACCESS register.  But this register also
contains the context ID associated with the fault in the low 13 bits
of the register value.

This propagates into current_thread_info()->fault_address and can
cause trouble later on.

So clear the low 13-bits out of the TLB_TAG_ACCESS value in the cases
where it matters.

Reported-by: Mikulas Patocka 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/kernel/dtlb_prot.S |  4 ++--
 arch/sparc/kernel/ktlb.S  | 12 
 arch/sparc/kernel/tsb.S   | 12 ++--
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/kernel/dtlb_prot.S b/arch/sparc/kernel/dtlb_prot.S
index d668ca149e64..4087a62f96b0 100644
--- a/arch/sparc/kernel/dtlb_prot.S
+++ b/arch/sparc/kernel/dtlb_prot.S
@@ -25,13 +25,13 @@
 
 /* PROT ** ICACHE line 2: More real fault processing */
ldxa [%g4] ASI_DMMU, %g5 ! Put tagaccess in %g5
+   srlx %g5, PAGE_SHIFT, %g5
+   sllx %g5, PAGE_SHIFT, %g5    ! Clear context ID bits
bgu,pn  %xcc, winfix_trampoline ! Yes, perform winfixup
 mov FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
ba,pt   %xcc, sparc64_realfault_common  ! Nope, normal fault
 nop
nop
-   nop
-   nop
 
 /* PROT ** ICACHE line 3: Unused...*/
nop
diff --git a/arch/sparc/kernel/ktlb.S b/arch/sparc/kernel/ktlb.S
index ef0d8e9e1210..f22bec0db645 100644
--- a/arch/sparc/kernel/ktlb.S
+++ b/arch/sparc/kernel/ktlb.S
@@ -20,6 +20,10 @@ kvmap_itlb:
mov TLB_TAG_ACCESS, %g4
ldxa [%g4] ASI_IMMU, %g4
 
+   /* The kernel executes in context zero, therefore we do not
+* need to clear the context ID bits out of %g4 here.
+*/
+
/* sun4v_itlb_miss branches here with the missing virtual
 * address already loaded into %g4
 */
@@ -128,6 +132,10 @@ kvmap_dtlb:
mov TLB_TAG_ACCESS, %g4
ldxa [%g4] ASI_DMMU, %g4
 
+   /* The kernel executes in context zero, therefore we do not
+* need to clear the context ID bits out of %g4 here.
+*/
+
/* sun4v_dtlb_miss branches here with the missing virtual
 * address already loaded into %g4
 */
@@ -251,6 +259,10 @@ kvmap_dtlb_longpath:
nop
.previous
 
+   /* The kernel executes in context zero, therefore we do not
+* need to clear the context ID bits out of %g5 here.
+*/
+
be,pt   %xcc, sparc64_realfault_common
 mov    FAULT_CODE_DTLB, %g4
ba,pt   %xcc, winfix_trampoline
diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S
index be98685c14c6..d568c8207af7 100644
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -29,13 +29,17 @@
 */
 tsb_miss_dtlb:
mov TLB_TAG_ACCESS, %g4
+   ldxa    [%g4] ASI_DMMU, %g4
+   srlx    %g4, PAGE_SHIFT, %g4
ba,pt   %xcc, tsb_miss_page_table_walk
-    ldxa   [%g4] ASI_DMMU, %g4
+    sllx   %g4, PAGE_SHIFT, %g4
 
 tsb_miss_itlb:
mov TLB_TAG_ACCESS, %g4
+   ldxa    [%g4] ASI_IMMU, %g4
+   srlx    %g4, PAGE_SHIFT, %g4
ba,pt   %xcc, tsb_miss_page_table_walk
-    ldxa   [%g4] ASI_IMMU, %g4
+    sllx   %g4, PAGE_SHIFT, %g4
 
/* At this point we have:
 * %g1 --   PAGE_SIZE TSB entry address
@@ -284,6 +288,10 @@ tsb_do_dtlb_fault:
nop
.previous
 
+   /* Clear context ID bits.  */
+   srlx    %g5, PAGE_SHIFT, %g5
+   sllx    %g5, PAGE_SHIFT, %g5
+
be,pt   %xcc, sparc64_realfault_common
 mov    FAULT_CODE_DTLB, %g4
ba,pt   %xcc, winfix_trampoline
-- 
2.10.2
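
For readers less used to SPARC assembly, the srlx/sllx pair added by the patch above can be sketched in C. This is only an illustration under stated assumptions: PAGE_SHIFT is taken to be 13 (8K base pages, matching the "low 13-bits" in the commit message), and both function names are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8K base pages, so the low 13 bits of TLB_TAG_ACCESS
 * carry the context ID rather than address bits.
 */
#define PAGE_SHIFT 13

/* Hypothetical helper: the bits that the shift pair discards. */
static uint64_t context_bits(uint64_t tag_access)
{
	return tag_access & ((UINT64_C(1) << PAGE_SHIFT) - 1);
}

/* Mirrors "srlx reg, PAGE_SHIFT, reg; sllx reg, PAGE_SHIFT, reg":
 * shift the low bits out, then shift zeros back in.
 */
static uint64_t clear_context_bits(uint64_t tag_access)
{
	uint64_t addr = (tag_access >> PAGE_SHIFT) << PAGE_SHIFT;

	assert(context_bits(addr) == 0);
	return addr;
}
```

The result is a page-aligned fault address, which is what ends up in fault_address once the context ID has been stripped.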



[PATCH 3.12 117/127] sparc64: Handle extremely large kernel TLB range flushes more gracefully.

2016-11-25 Thread Jiri Slaby
From: "David S. Miller" 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit a74ad5e660a9ee1d071665e7e8ad822784a2dc7f ]

When the vmalloc area gets fragmented, and because the firmware
mapping area sits between where modules live and the vmalloc area, we
can sometimes receive requests for enormous kernel TLB range flushes.

When this happens the cpu just spins flushing billions of pages and
this triggers the NMI watchdog and other problems.

We took care of this on the TSB side by doing a linear scan of the
table once we pass a certain threshold.

Do something similar for the TLB flush, however we are limited by
the TLB flush facilities provided by the different chip variants.

First of all we use a (mostly arbitrary) cut-off of 256K which is
about 32 pages.  This can be tuned in the future.

The huge range code path for each chip works as follows:

1) On spitfire we flush all non-locked TLB entries using diagnostic
   accesses.

2) On cheetah we use the "flush all" TLB flush.

3) On sun4v/hypervisor we do a TLB context flush on context 0, which
   unlike previous chips does not remove "permanent" or locked
   entries.

We could probably do something better on spitfire, such as limiting
the flush to kernel TLB entries or even doing range comparisons.
However that probably isn't worth it since those chips are old and
the TLB only had 64 entries.
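
As a rough illustration of the cut-off described above, the range check performed by the "srlx %o3, 18, %o4" / "brnz" pair in the diff can be written in C as follows. The names are hypothetical, and the 8K page size is an assumption used only to relate 256K to a page count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HUGE_FLUSH_SHIFT   18      /* 1 << 18 == 256K, the shift used in the patch */
#define ASSUMED_PAGE_SIZE  8192UL  /* assumption: 8K base pages */

/* True when the range is large enough that the per-page demap loop
 * is skipped in favour of the chip-specific "flush everything" path.
 */
static bool range_needs_full_flush(uint64_t start, uint64_t end)
{
	assert(start <= end);
	return ((end - start) >> HUGE_FLUSH_SHIFT) != 0;
}

/* Number of base pages covered by the cut-off itself. */
static unsigned long cutoff_pages(void)
{
	return (1UL << HUGE_FLUSH_SHIFT) / ASSUMED_PAGE_SIZE;
}
```

Any range of 256K or more takes the slow path; anything shorter still goes through the page-by-page loop.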

Reported-by: James Clarke 
Tested-by: James Clarke 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/mm/ultra.S | 283 --
 1 file changed, 228 insertions(+), 55 deletions(-)

diff --git a/arch/sparc/mm/ultra.S b/arch/sparc/mm/ultra.S
index 0fa2e6202c1f..5d2fd6cd3189 100644
--- a/arch/sparc/mm/ultra.S
+++ b/arch/sparc/mm/ultra.S
@@ -113,12 +113,14 @@ __flush_tlb_pending:  /* 27 insns */
 
.align  32
.globl  __flush_tlb_kernel_range
-__flush_tlb_kernel_range:  /* 19 insns */
+__flush_tlb_kernel_range:  /* 31 insns */
/* %o0=start, %o1=end */
cmp %o0, %o1
be,pn   %xcc, 2f
+    sub    %o1, %o0, %o3
+   srlx    %o3, 18, %o4
+   brnz,pn %o4, __spitfire_flush_tlb_kernel_range_slow
 sethi  %hi(PAGE_SIZE), %o4
-   sub %o1, %o0, %o3
sub %o3, %o4, %o3
or  %o0, 0x20, %o0  ! Nucleus
 1: stxa    %g0, [%o0 + %o3] ASI_DMMU_DEMAP
@@ -134,6 +136,38 @@ __flush_tlb_kernel_range:  /* 19 insns */
nop
nop
nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+
+__spitfire_flush_tlb_kernel_range_slow:
+   mov     63 * 8, %o4
+1: ldxa    [%o4] ASI_ITLB_DATA_ACCESS, %o3
+   andcc   %o3, 0x40, %g0  /* _PAGE_L_4U */
+   bne,pn  %xcc, 2f
+    mov    TLB_TAG_ACCESS, %o3
+   stxa    %g0, [%o3] ASI_IMMU
+   stxa    %g0, [%o4] ASI_ITLB_DATA_ACCESS
+   membar  #Sync
+2: ldxa    [%o4] ASI_DTLB_DATA_ACCESS, %o3
+   andcc   %o3, 0x40, %g0
+   bne,pn  %xcc, 2f
+    mov    TLB_TAG_ACCESS, %o3
+   stxa    %g0, [%o3] ASI_DMMU
+   stxa    %g0, [%o4] ASI_DTLB_DATA_ACCESS
+   membar  #Sync
+2: sub     %o4, 8, %o4
+   brgez,pt %o4, 1b
+    nop
+   retl
+    nop
 
 __spitfire_flush_tlb_mm_slow:
rdpr    %pstate, %g1
@@ -288,6 +322,40 @@ __cheetah_flush_tlb_pending:   /* 27 insns */
retl
 wrpr   %g7, 0x0, %pstate
 
+__cheetah_flush_tlb_kernel_range:  /* 31 insns */
+   /* %o0=start, %o1=end */
+   cmp     %o0, %o1
+   be,pn   %xcc, 2f
+    sub    %o1, %o0, %o3
+   srlx    %o3, 18, %o4
+   brnz,pn %o4, 3f
+    sethi  %hi(PAGE_SIZE), %o4
+   sub     %o3, %o4, %o3
+   or      %o0, 0x20, %o0  ! Nucleus
+1: stxa    %g0, [%o0 + %o3] ASI_DMMU_DEMAP
+   stxa    %g0, [%o0 + %o3] ASI_IMMU_DEMAP
+   membar  #Sync
+   brnz,pt %o3, 1b
+    sub    %o3, %o4, %o3
+2: sethi   %hi(KERNBASE), %o3
+   flush   %o3
+   retl
+    nop
+3: mov     0x80, %o4
+   stxa    %g0, [%o4] ASI_DMMU_DEMAP
+   membar  #Sync
+   stxa    %g0, [%o4] ASI_IMMU_DEMAP
+   membar  #Sync
+   retl
+    nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+   nop
+
 #ifdef DCACHE_ALIASING_POSSIBLE
 __cheetah_flush_dcache_page: /* 11 insns */
sethi   %hi(PAGE_OFFSET), %g1
@@ -388,13 +456,15 @@ __hypervisor_flush_tlb_pending: /* 27 insns */
 

[PATCH 3.12 106/127] sctp: assign assoc_id earlier in __sctp_connect

2016-11-25 Thread Jiri Slaby
From: Marcelo Ricardo Leitner 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 7233bc84a3aeda835d334499dc00448373caf5c0 ]

sctp_wait_for_connect() currently already holds the asoc to keep it
alive during the sleep, in case another thread releases it. But Andrey
Konovalov and Dmitry Vyukov reported a use-after-free in such a
situation.

The problem is that __sctp_connect() doesn't get a ref on the asoc and will
do a read on the asoc after calling sctp_wait_for_connect(), but by then
another thread may have closed it and the _put on sctp_wait_for_connect
will actually release it, causing the use-after-free.

The fix is, instead of doing the read after waiting for the connect, to do
it beforehand, which avoids this issue as the socket is still locked by then.
There should be no issue in returning the asoc id in case of failure, as
the application shouldn't rely on that number in such situations
anyway.

This issue doesn't exist in sctp_sendmsg() path.

Reported-by: Dmitry Vyukov 
Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Signed-off-by: Marcelo Ricardo Leitner 
Reviewed-by: Xin Long 
Acked-by: Neil Horman 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/sctp/socket.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 98cd6606f4a4..2c5cb6d2787d 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1217,9 +1217,12 @@ static int __sctp_connect(struct sock* sk,
 
timeo = sock_sndtimeo(sk, f_flags & O_NONBLOCK);
 
-   err = sctp_wait_for_connect(asoc, &timeo);
-   if ((err == 0 || err == -EINPROGRESS) && assoc_id)
+   if (assoc_id)
*assoc_id = asoc->assoc_id;
+   err = sctp_wait_for_connect(asoc, &timeo);
+   /* Note: the asoc may be freed after the return of
+* sctp_wait_for_connect.
+*/
 
/* Don't free association on exit. */
asoc = NULL;
-- 
2.10.2



[PATCH 3.12 107/127] neigh: check error pointer instead of NULL for ipv4_neigh_lookup()

2016-11-25 Thread Jiri Slaby
From: WANG Cong 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 2c1a4311b61072afe2309d4152a7993e92caa41c upstream.

Fixes: commit f187bc6efb7250afee0e2009b6106 ("ipv4: No need to set generic 
neighbour pointer")
Cc: David S. Miller 
Signed-off-by: Cong Wang 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 2d709773dc6c..f1631aec4206 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -765,7 +765,7 @@ static void __ip_do_redirect(struct rtable *rt, struct 
sk_buff *skb, struct flow
}
 
n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
-   if (n) {
+   if (!IS_ERR(n)) {
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
} else {
-- 
2.10.2
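
The one-line change above makes more sense with the kernel's error-pointer convention in mind: a failed lookup is reported as a small negative errno encoded in the pointer value, which is non-NULL and so passes a plain "if (n)" test. A minimal userspace sketch of that convention follows; the lower-case helper names are stand-ins for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), not the real definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (stand-in for ERR_PTR). */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the top MAX_ERRNO bytes of the address
 * space, i.e. it encodes an errno (stand-in for IS_ERR).
 */
static bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer (stand-in for PTR_ERR). */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

With this encoding, err_ptr(-ENOMEM) is non-NULL, so testing it with "if (n)" would go on to dereference an invalid address, while "if (!is_err(n))" rejects it.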



[PATCH 3.12 104/127] ipv6: dccp: fix out of bound access in dccp_v6_err()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 1aa9d1a0e7eefcc61696e147d123453fc0016005 ]

dccp_v6_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmpv6_notify() are more than enough.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/dccp/ipv6.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 86eedbaf037f..e9ce581b9502 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -83,7 +83,7 @@ static void dccp_v6_err(struct sk_buff *skb, struct 
inet6_skb_parm *opt,
u8 type, u8 code, int offset, __be32 info)
 {
const struct ipv6hdr *hdr = (const struct ipv6hdr *)skb->data;
-   const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+   const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct ipv6_pinfo *np;
struct sock *sk;
@@ -91,12 +91,13 @@ static void dccp_v6_err(struct sk_buff *skb, struct 
inet6_skb_parm *opt,
__u64 seq;
struct net *net = dev_net(skb->dev);
 
-   if (skb->len < offset + sizeof(*dh) ||
-   skb->len < offset + __dccp_basic_hdr_len(dh)) {
-   ICMP6_INC_STATS_BH(net, __in6_dev_get(skb->dev),
-  ICMP6_MIB_INERRORS);
-   return;
-   }
+   /* Only need dccph_dport & dccph_sport which are the first
+* 4 bytes in dccp header.
+* Our caller (icmpv6_notify()) already pulled 8 bytes for us.
+*/
+   BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+   BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+   dh = (struct dccp_hdr *)(skb->data + offset);
 
sk = inet6_lookup(net, &dccp_hashinfo,
&hdr->daddr, dh->dccph_dport,
-- 
2.10.2
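
The BUILD_BUG_ON() lines in the patch above amount to a compile-time proof that the two port fields sit inside the 8 bytes the caller has already pulled. The sketch below shows the same idea in standard C11. struct demo_hdr is a hypothetical stand-in for the start of a DCCP header, not the real struct dccp_hdr, and PULLED_BYTES is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical header layout: ports first, then other fields. */
struct demo_hdr {
	uint16_t sport;
	uint16_t dport;
	uint8_t  doff;
	uint8_t  flags;
	uint16_t checksum;
};

#define PULLED_BYTES 8	/* bytes the caller guarantees are present */

/* Fail the build if either field would extend past the pulled data. */
static_assert(offsetofend(struct demo_hdr, sport) <= PULLED_BYTES,
	      "sport must lie within the pulled bytes");
static_assert(offsetofend(struct demo_hdr, dport) <= PULLED_BYTES,
	      "dport must lie within the pulled bytes");
```

Because the check runs at compile time, it costs nothing at run time and breaks the build if the header layout ever changes.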



[PATCH 3.12 108/127] ipv4: use new_gw for redirect neigh lookup

2016-11-25 Thread Jiri Slaby
From: Stephen Suryaputra Lin 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 969447f226b451c453ddc83cac6144eaeac6f2e3 ]

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never get resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
 - use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable 
again.")
Signed-off-by: Stephen Suryaputra Lin 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv4/route.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f1631aec4206..fd2811086257 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -764,7 +764,9 @@ static void __ip_do_redirect(struct rtable *rt, struct 
sk_buff *skb, struct flow
goto reject_redirect;
}
 
-   n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
+   n = __ipv4_neigh_lookup(rt->dst.dev, new_gw);
+   if (!n)
+   n = neigh_create(&arp_tbl, &new_gw, rt->dst.dev);
if (!IS_ERR(n)) {
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
-- 
2.10.2



[PATCH 3.12 111/127] sparc64 mm: Fix base TSB sizing when hugetlb pages are used

2016-11-25 Thread Jiri Slaby
From: Mike Kravetz 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit af1b1a9b36b8f9d583d4b4f90dd8946ed0cd4bd0 ]

do_sparc64_fault() calculates both the base and huge page RSS sizes and
uses this information in calls to tsb_grow().  The calculation for base
page TSB size is not correct if the task uses hugetlb pages.  hugetlb
pages are not accounted for in RSS, therefore the call to get_mm_rss(mm)
does not include hugetlb pages.  However, the number of pages based on
huge_pte_count (which does include hugetlb pages) is subtracted from
this value.  This will result in an artificially small and often negative
RSS calculation.  The base TSB size is then often set to max_tsb_size
as the passed RSS is unsigned, so a negative value looks really big.

THP pages are also accounted for in huge_pte_count, and THP pages are
accounted for in RSS so the calculation in do_sparc64_fault() is correct
if a task only uses THP pages.

A single huge_pte_count is not sufficient for TSB sizing if both hugetlb
and THP pages can be used.  Instead of a single counter, use two:  one
for hugetlb and one for THP.
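
The wrap-around described above is easy to reproduce in userspace. The sketch below is illustrative only: HPAGE_PAGES (base pages per huge page) is an assumed value, and the two functions are hypothetical models of the old and new calculation, not the code in do_sparc64_fault().

```c
#include <assert.h>
#include <limits.h>

#define HPAGE_PAGES 512UL	/* assumption: base pages per huge page */

/* Old behaviour: every huge PTE is subtracted, including hugetlb
 * pages that were never added to RSS in the first place.
 */
static unsigned long base_rss_old(unsigned long rss,
				  unsigned long hugetlb_ptes,
				  unsigned long thp_ptes)
{
	return rss - (hugetlb_ptes + thp_ptes) * HPAGE_PAGES;
}

/* New behaviour: only THP pages, which RSS does include, come off. */
static unsigned long base_rss_new(unsigned long rss, unsigned long thp_ptes)
{
	assert(rss >= thp_ptes * HPAGE_PAGES);
	return rss - thp_ptes * HPAGE_PAGES;
}
```

For a task with 100 base pages, one THP page and one hugetlb page, the old formula wraps to a value near ULONG_MAX while the new one returns 100.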

Signed-off-by: Mike Kravetz 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/include/asm/mmu_64.h |  3 ++-
 arch/sparc/mm/fault_64.c|  6 +++---
 arch/sparc/mm/hugetlbpage.c |  4 ++--
 arch/sparc/mm/init_64.c |  3 ++-
 arch/sparc/mm/tlb.c |  4 ++--
 arch/sparc/mm/tsb.c | 14 --
 6 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/arch/sparc/include/asm/mmu_64.h b/arch/sparc/include/asm/mmu_64.h
index f668797ae234..4994815fccc7 100644
--- a/arch/sparc/include/asm/mmu_64.h
+++ b/arch/sparc/include/asm/mmu_64.h
@@ -92,7 +92,8 @@ struct tsb_config {
 typedef struct {
spinlock_t  lock;
unsigned long   sparc64_ctx_val;
-   unsigned long   huge_pte_count;
+   unsigned long   hugetlb_pte_count;
+   unsigned long   thp_pte_count;
struct tsb_config   tsb_block[MM_NUM_TSBS];
struct hv_tsb_descr tsb_descr[MM_NUM_TSBS];
 } mm_context_t;
diff --git a/arch/sparc/mm/fault_64.c b/arch/sparc/mm/fault_64.c
index c7009d7762b1..a21917c8f44f 100644
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -478,14 +478,14 @@ good_area:
up_read(&mm->mmap_sem);
 
mm_rss = get_mm_rss(mm);
-#if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
-   mm_rss -= (mm->context.huge_pte_count * (HPAGE_SIZE / PAGE_SIZE));
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
+   mm_rss -= (mm->context.thp_pte_count * (HPAGE_SIZE / PAGE_SIZE));
 #endif
if (unlikely(mm_rss >
 mm->context.tsb_block[MM_TSB_BASE].tsb_rss_limit))
tsb_grow(mm, MM_TSB_BASE, mm_rss);
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
-   mm_rss = mm->context.huge_pte_count;
+   mm_rss = mm->context.hugetlb_pte_count + mm->context.thp_pte_count;
if (unlikely(mm_rss >
 mm->context.tsb_block[MM_TSB_HUGE].tsb_rss_limit)) {
if (mm->context.tsb_block[MM_TSB_HUGE].tsb)
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index d941cd024f22..387ae1e9b462 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -184,7 +184,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long 
addr,
int i;
 
if (!pte_present(*ptep) && pte_present(entry))
-   mm->context.huge_pte_count++;
+   mm->context.hugetlb_pte_count++;
 
addr &= HPAGE_MASK;
for (i = 0; i < (1 << HUGETLB_PAGE_ORDER); i++) {
@@ -203,7 +203,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, 
unsigned long addr,
 
entry = *ptep;
if (pte_present(entry))
-   mm->context.huge_pte_count--;
+   mm->context.hugetlb_pte_count--;
 
addr &= HPAGE_MASK;
 
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 9633e0706d6e..4650a3840305 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -353,7 +353,8 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned 
long address, pte_t *
spin_lock_irqsave(&mm->context.lock, flags);
 
 #if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_TRANSPARENT_HUGEPAGE)
-   if (mm->context.huge_pte_count && is_hugetlb_pte(pte))
+   if ((mm->context.hugetlb_pte_count || mm->context.thp_pte_count) &&
+   is_hugetlb_pte(pte))
__update_mmu_tsb_insert(mm, MM_TSB_HUGE, REAL_HPAGE_SHIFT,
address, pte_val(pte));
else
diff --git a/arch/sparc/mm/tlb.c b/arch/sparc/mm/tlb.c
index c24d0aa2b615..56b820924b07 100644
--- a/arch/sparc/mm/tlb.c
+++ b/arch/sparc/mm/tlb.c
@@ -166,9 +166,9 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 
if ((

[PATCH 3.12 109/127] tcp: take care of truncations done by sk_filter()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit ac6e780070e30e4c35bd395acfe9191e6268bdd3 ]

With syzkaller's help, Marco Grassi found a bug in the TCP stack,
crashing in tcp_collapse().

The root cause is that sk_filter() can truncate the incoming skb,
but the TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of the TCP header can be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq.

Many thanks to syzkaller team and Marco for giving us a reproducer.
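
The two steps above (never trim into the TCP header, then shrink end_seq by the number of bytes removed) can be modelled in a few lines of C. Everything here is a simplified stand-in: struct demo_seg and demo_tcp_filter() are hypothetical, and the filter is reduced to the length it asks to keep.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the fields that matter for this fix. */
struct demo_seg {
	unsigned int len;	/* bytes in the segment, header included */
	uint32_t end_seq;	/* sequence number just past the payload */
};

/* filter_len: length the filter wants to keep (0 means drop).
 * hdr_len:    TCP header length, used as the trim cap.
 */
static int demo_tcp_filter(struct demo_seg *seg, unsigned int filter_len,
			   unsigned int hdr_len)
{
	unsigned int before = seg->len;
	unsigned int keep;

	assert(hdr_len <= seg->len);
	if (filter_len == 0)
		return -EPERM;			/* filter rejected the packet */

	/* Never keep less than the header, whatever the filter returned. */
	keep = filter_len > hdr_len ? filter_len : hdr_len;
	if (keep < seg->len)
		seg->len = keep;		/* trimming only ever shrinks */

	/* Account for the payload bytes that were cut off. */
	seg->end_seq -= before - seg->len;
	return 0;
}
```

If the filter asks for fewer bytes than the header occupies, the segment is trimmed to the header length instead, and end_seq always ends up matching the data that is left.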

Signed-off-by: Eric Dumazet 
Reported-by: Marco Grassi 
Reported-by: Vladis Dronov 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 include/linux/filter.h |  6 +-
 include/net/tcp.h  |  1 +
 net/core/filter.c  | 10 +-
 net/ipv4/tcp_ipv4.c| 19 ++-
 net/ipv6/tcp_ipv6.c|  6 --
 5 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index ff4e40cd45b1..264c1a440240 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -41,7 +41,11 @@ static inline unsigned int sk_filter_size(unsigned int 
proglen)
   offsetof(struct sk_filter, insns[proglen]));
 }
 
-extern int sk_filter(struct sock *sk, struct sk_buff *skb);
+int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap);
+static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
+{
+   return sk_filter_trim_cap(sk, skb, 1);
+}
 extern unsigned int sk_run_filter(const struct sk_buff *skb,
  const struct sock_filter *filter);
 extern int sk_unattached_filter_create(struct sk_filter **pfp,
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 035135b43820..83d03f86e914 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1049,6 +1049,7 @@ static inline void tcp_prequeue_init(struct tcp_sock *tp)
 }
 
 extern bool tcp_prequeue(struct sock *sk, struct sk_buff *skb);
+int tcp_filter(struct sock *sk, struct sk_buff *skb);
 
 #undef STATE_TRACE
 
diff --git a/net/core/filter.c b/net/core/filter.c
index ebce437678fc..5903efc408da 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -67,9 +67,10 @@ static inline void *load_pointer(const struct sk_buff *skb, 
int k,
 }
 
 /**
- * sk_filter - run a packet through a socket filter
+ * sk_filter_trim_cap - run a packet through a socket filter
  * @sk: sock associated with &sk_buff
  * @skb: buffer to filter
+ * @cap: limit on how short the eBPF program may trim the packet
  *
  * Run the filter code and then cut skb->data to correct size returned by
  * sk_run_filter. If pkt_len is 0 we toss packet. If skb->len is smaller
@@ -78,7 +79,7 @@ static inline void *load_pointer(const struct sk_buff *skb, 
int k,
  * be accepted or -EPERM if the packet should be tossed.
  *
  */
-int sk_filter(struct sock *sk, struct sk_buff *skb)
+int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap)
 {
int err;
struct sk_filter *filter;
@@ -99,14 +100,13 @@ int sk_filter(struct sock *sk, struct sk_buff *skb)
filter = rcu_dereference(sk->sk_filter);
if (filter) {
unsigned int pkt_len = SK_RUN_FILTER(filter, skb);
-
-   err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM;
+   err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
}
rcu_read_unlock();
 
return err;
 }
-EXPORT_SYMBOL(sk_filter);
+EXPORT_SYMBOL(sk_filter_trim_cap);
 
 /**
  * sk_run_filter - run a filter on a socket
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 4b2040762733..57f5bad5650c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1941,6 +1941,21 @@ bool tcp_prequeue(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_prequeue);
 
+int tcp_filter(struct sock *sk, struct sk_buff *skb)
+{
+   struct tcphdr *th = (struct tcphdr *)skb->data;
+   unsigned int eaten = skb->len;
+   int err;
+
+   err = sk_filter_trim_cap(sk, skb, th->doff * 4);
+   if (!err) {
+   eaten -= skb->len;
+   TCP_SKB_CB(skb)->end_seq -= eaten;
+   }
+   return err;
+}
+EXPORT_SYMBOL(tcp_filter);
+
 /*
  * From tcp_input.c
  */
@@ -2003,8 +2018,10 @@ process:
goto discard_and_relse;
nf_reset(skb);
 
-   if (sk_filter(sk, skb))
+   if (tcp_filter(sk, skb))
goto discard_and_relse;
+   th = (const struct tcphdr *)skb->data;
+   iph = ip_hdr(skb);
 
sk_mark_napi_id(sk, skb);
skb->dev = NULL;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0812b615885d..e5bafd576a13 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1339,7 +1339,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff 
*skb)
goto discard;
 #endif
 
-   

[PATCH 3.12 112/127] sparc: Handle negative offsets in arch_jump_label_transform

2016-11-25 Thread Jiri Slaby
From: James Clarke 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 9d9fa230206a3aea6ef451646c97122f04777983 ]

Additionally, if the offset will overflow the immediate for a ba,pt
instruction, fall back on a standard ba to get an extra 3 bits.

Signed-off-by: James Clarke 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 arch/sparc/kernel/jump_label.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/kernel/jump_label.c b/arch/sparc/kernel/jump_label.c
index 48565c11e82a..6d0dacb5812d 100644
--- a/arch/sparc/kernel/jump_label.c
+++ b/arch/sparc/kernel/jump_label.c
@@ -13,19 +13,30 @@
 void arch_jump_label_transform(struct jump_entry *entry,
   enum jump_label_type type)
 {
-   u32 val;
u32 *insn = (u32 *) (unsigned long) entry->code;
+   u32 val;
 
if (type == JUMP_LABEL_ENABLE) {
s32 off = (s32)entry->target - (s32)entry->code;
+   bool use_v9_branch = false;
+
+   BUG_ON(off & 3);
 
 #ifdef CONFIG_SPARC64
-   /* ba,pt %xcc, . + (off << 2) */
-   val = 0x10680000 | ((u32) off >> 2);
-#else
-   /* ba . + (off << 2) */
-   val = 0x10800000 | ((u32) off >> 2);
+   if (off <= 0xfffff && off >= -0x100000)
+   use_v9_branch = true;
 #endif
+   if (use_v9_branch) {
+   /* WDISP19 - target is . + immed << 2 */
+   /* ba,pt %xcc, . + off */
+   val = 0x10680000 | (((u32) off >> 2) & 0x7ffff);
+   } else {
+   /* WDISP22 - target is . + immed << 2 */
+   BUG_ON(off > 0x7fffff);
+   BUG_ON(off < -0x800000);
+   /* ba . + off */
+   val = 0x10800000 | (((u32) off >> 2) & 0x3fffff);
+   }
} else {
val = 0x01000000;
}
-- 
2.10.2
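
To see why the displacement has to be masked, it may help to encode a few branches by hand. The sketch below is a userspace model, not the kernel function. The opcode constants and field widths are assumptions taken from the SPARC V9 branch formats as I understand them (a 19-bit word displacement for "ba,pt %xcc", a 22-bit one for plain "ba"), and encode_branch() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encode an unconditional branch to ". + off" (off in bytes).
 * Without the masks, the sign bits of a negative offset would be
 * OR-ed over the opcode field and corrupt the instruction.
 */
static uint32_t encode_branch(int32_t off, bool sparc64)
{
	assert((off & 3) == 0);		/* instructions are word aligned */

	if (sparc64 && off <= 0xfffff && off >= -0x100000) {
		/* ba,pt %xcc: 19-bit signed word displacement */
		return 0x10680000u | (((uint32_t)off >> 2) & 0x7ffffu);
	}

	/* plain ba: 22-bit signed word displacement, 3 more bits of reach */
	assert(off <= 0x7fffff && off >= -0x800000);
	return 0x10800000u | (((uint32_t)off >> 2) & 0x3fffffu);
}
```

A backward branch of 8 bytes encodes as 0x106ffffe in the 19-bit form, and an offset of 0x100000 bytes, which no longer fits in 19 bits, falls back to the 22-bit form.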



[PATCH 3.12 105/127] ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 990ff4d84408fc55942ca6644f67e361737b3d8e ]

While fuzzing kernel with syzkaller, Andrey reported a nasty crash
in inet6_bind() caused by DCCP lacking a required method.

Fixes: ab1e0a13d7029 ("[SOCK] proto: Add hashinfo member to struct proto")
Signed-off-by: Eric Dumazet 
Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Cc: Arnaldo Carvalho de Melo 
Acked-by: Arnaldo Carvalho de Melo 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/dccp/ipv6.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index e9ce581b9502..736fdedf9c85 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1023,6 +1023,7 @@ static const struct inet_connection_sock_af_ops 
dccp_ipv6_mapped = {
.getsockopt= ipv6_getsockopt,
.addr2sockaddr = inet6_csk_addr2sockaddr,
.sockaddr_len  = sizeof(struct sockaddr_in6),
+   .bind_conflict = inet6_csk_bind_conflict,
 #ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ipv6_setsockopt,
.compat_getsockopt = compat_ipv6_getsockopt,
-- 
2.10.2



Re: [PATCH 2/2] rcu: Force resched_cpu when jiffies >= rcu_state.jiffies_resched

2016-11-25 Thread Paul E. McKenney
On Tue, Nov 22, 2016 at 05:12:20PM +0900, Byungchul Park wrote:
> On Wed, Nov 09, 2016 at 03:32:15PM +0900, Byungchul Park wrote:
> > Currently rcu code forces CPU into scheduler when jiffies >=
> > rcu_state.gp_start + jiffies_till_sched_qs, via resched_cpu().
> > 
> > It would be better to force CPU into scheduler when jiffies >=
> > rcu_state.jiffies_resched, too.
> 
> Hello,
> 
> I think these two patches are necessary to call resched_cpu() even in
> case of jiffies >= rcu_state.jiffies_resched, too. Am I wrong?
> 
> It would be appriciated if you let me know if I was wrong.

My current thought is that both the "if" statement and the call to
resched_cpu() should be removed, but I am still testing and working
through the timing.

Either way, I do very much appreciate your having called my attention
to this code!

Thanx, Paul

> Thank you,
> Byungchul
> 
> > 
> > Signed-off-by: Byungchul Park 
> > ---
> >  kernel/rcu/tree.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index d8e8859..287f468 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1217,11 +1217,10 @@ static int rcu_implicit_dynticks_qs(struct rcu_data 
> > *rdp,
> >READ_ONCE(*rcrmp) + rdp->rsp->flavor_mask);
> > }
> > rdp->rsp->jiffies_resched += 5; /* Re-enable beating. */
> > -   }
> >  
> > -   /* And if it has been a really long time, kick the CPU as well. */
> > -   if (ULONG_CMP_GE(jiffies, rdp->rsp->gp_start + jiffies_till_sched_qs))
> > +   /* And if it has been a really long time, kick the CPU as well. 
> > */
> > resched_cpu(rdp->cpu);  /* Force CPU into scheduler. */
> > +   }
> >  
> > return 0;
> >  }
> > -- 
> > 1.9.1
> 



[PATCH 3.12 097/127] stddef.h: move offsetofend inside #ifndef/#endif guard, neaten

2016-11-25 Thread Jiri Slaby
From: Joe Perches 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 8c7fbe5795a016259445a61e072eb0118aaf6a61 upstream.

Commit 3876488444e7 ("include/stddef.h: Move offsetofend() from vfio.h
to a generic kernel header") added offsetofend outside the normal
include #ifndef/#endif guard.  Move it inside.

Miscellanea:

o remove unnecessary blank line
o standardize offsetof macros whitespace style

Signed-off-by: Joe Perches 
Cc: Denys Vlasenko 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Jiri Slaby 
---
 include/linux/stddef.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 076af437284d..9c61c7cda936 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -3,7 +3,6 @@
 
 #include 
 
-
 #undef NULL
 #define NULL ((void *)0)
 
@@ -14,10 +13,9 @@ enum {
 
 #undef offsetof
 #ifdef __compiler_offsetof
-#define offsetof(TYPE,MEMBER) __compiler_offsetof(TYPE,MEMBER)
+#define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
 #else
-#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
-#endif
+#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
 #endif
 
 /**
@@ -28,3 +26,5 @@ enum {
  */
 #define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))
+
+#endif
-- 
2.10.2



[PATCH 3.12 103/127] dccp: fix out of bound access in dccp_v4_err()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 6706a97fec963d6cb3f7fc2978ec1427b4651214 ]

dccp_v4_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmp_socket_deliver() are more than enough.

This patch might allow processing more ICMP messages, as some routers
are still limiting the size of reflected bytes to 28 (RFC 792), instead
of extended lengths (RFC 1812 4.3.2.3)

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/dccp/ipv4.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index ebc54fef85a5..294c642fbebb 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -212,7 +212,7 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
 {
const struct iphdr *iph = (struct iphdr *)skb->data;
const u8 offset = iph->ihl << 2;
-   const struct dccp_hdr *dh = (struct dccp_hdr *)(skb->data + offset);
+   const struct dccp_hdr *dh;
struct dccp_sock *dp;
struct inet_sock *inet;
const int type = icmp_hdr(skb)->type;
@@ -222,11 +222,13 @@ static void dccp_v4_err(struct sk_buff *skb, u32 info)
int err;
struct net *net = dev_net(skb->dev);
 
-   if (skb->len < offset + sizeof(*dh) ||
-   skb->len < offset + __dccp_basic_hdr_len(dh)) {
-   ICMP_INC_STATS_BH(net, ICMP_MIB_INERRORS);
-   return;
-   }
+   /* Only need dccph_dport & dccph_sport which are the first
+* 4 bytes in dccp header.
+* Our caller (icmp_socket_deliver()) already pulled 8 bytes for us.
+*/
+   BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_sport) > 8);
+   BUILD_BUG_ON(offsetofend(struct dccp_hdr, dccph_dport) > 8);
+   dh = (struct dccp_hdr *)(skb->data + offset);
 
sk = inet_lookup(net, &dccp_hashinfo,
iph->daddr, dh->dccph_dport,
-- 
2.10.2



[PATCH 3.12 087/127] ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()

2016-11-25 Thread Jiri Slaby
From: Lance Richardson 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 ]

Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
This affected output route lookup for packets sent on an ip6gretap device
in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via
commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc 
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani 
Signed-off-by: Lance Richardson 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv6/ip6_gre.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 737af492ed75..6b5acd50103f 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -895,7 +895,6 @@ static int ip6gre_xmit_other(struct sk_buff *skb, struct 
net_device *dev)
encap_limit = t->parms.encap_limit;
 
memcpy(&fl6, &t->fl.u.ip6, sizeof(fl6));
-   fl6.flowi6_proto = skb->protocol;
 
err = ip6gre_xmit2(skb, dev, 0, &fl6, encap_limit, &mtu);
 
-- 
2.10.2



[PATCH 3.12 099/127] net: mangle zero checksum in skb_checksum_help()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 4f2e4ad56a65f3b7d64c258e373cb71e8d2499f4 ]

Sending zero checksum is ok for TCP, but not for UDP.

UDPv6 receiver should by default drop a frame with a 0 checksum,
and UDPv4 would not verify the checksum and might accept a corrupted
packet.

Simply replace such checksum by 0xffff, regardless of transport.

This error was caught on SIT tunnels, but seems generic.

Signed-off-by: Eric Dumazet 
Cc: Maciej Żenczykowski 
Cc: Willem de Bruijn 
Acked-by: Maciej Żenczykowski 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b3788eb33ce4..80468a34ef12 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2263,7 +2263,7 @@ int skb_checksum_help(struct sk_buff *skb)
 			goto out;
 	}
 
-	*(__sum16 *)(skb->data + offset) = csum_fold(csum);
+	*(__sum16 *)(skb->data + offset) = csum_fold(csum) ?: CSUM_MANGLED_0;
 out_set_summed:
 	skb->ip_summed = CHECKSUM_NONE;
 out:
-- 
2.10.2



[PATCH 3.12 102/127] dccp: do not send reset to already closed sockets

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 346da62cc186c4b4b1ac59f87f4482b47a047388 ]

Andrey reported the following warning while fuzzing with syzkaller:

WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 88003d4c7738 81b474f4 0003 dc00
 844f8b00 88003d4c7804 88003d4c7800 8140c06a
 41b58ab3 8479ab7d 8140beae 8140cd00
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0xb3/0x10f lib/dump_stack.c:51
 [] panic+0x1bc/0x39d kernel/panic.c:179
 [] __warn+0x1cc/0x1f0 kernel/panic.c:542
 [] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
 [] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
 [] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
 [] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
 [] sock_release+0x8e/0x1d0 net/socket.c:570
 [] sock_close+0x16/0x20 net/socket.c:1017
 [] __fput+0x29d/0x720 fs/file_table.c:208
 [] fput+0x15/0x20 fs/file_table.c:244
 [] task_work_run+0xf8/0x170 kernel/task_work.c:116
 [< inline >] exit_task_work include/linux/task_work.h:21
 [] do_exit+0x883/0x2ac0 kernel/exit.c:828
 [] do_group_exit+0x10e/0x340 kernel/exit.c:931
 [] get_signal+0x634/0x15a0 kernel/signal.c:2307
 [] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
 [] exit_to_usermode_loop+0xe5/0x130 arch/x86/entry/common.c:156
 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
 [] syscall_return_slowpath+0x1a8/0x1e0 arch/x86/entry/common.c:259
 [] entry_SYSCALL_64_fastpath+0xc0/0xc2
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled

Fix this the same way we did for TCP in commit 565b7b2d2e63
("tcp: do not send reset to already closed sockets")

Signed-off-by: Eric Dumazet 
Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/dccp/proto.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index ba64750f0387..f6f6fa1ddeb0 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -1012,6 +1012,10 @@ void dccp_close(struct sock *sk, long timeout)
 		__kfree_skb(skb);
 	}
 
+	/* If socket has been already reset kill it. */
+	if (sk->sk_state == DCCP_CLOSED)
+		goto adjudge_to_death;
+
 	if (data_was_unread) {
 		/* Unread data was tossed, send an appropriate Reset Code */
 		DCCP_WARN("ABORT with %u bytes unread\n", data_was_unread);
-- 
2.10.2



[PATCH 3.12 101/127] tcp: fix potential memory corruption

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit ac9e70b17ecd7c6e933ff2eaf7ab37429e71bf4d ]

Imagine initial value of max_skb_frags is 17, and last
skb in write queue has 15 frags.

Then max_skb_frags is lowered to 14 or smaller value.

tcp_sendmsg() will then be allowed to add additional page frags
and eventually go past MAX_SKB_FRAGS, overflowing struct
skb_shared_info.

Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
Signed-off-by: Eric Dumazet 
Cc: Hans Westgaard Ry 
Cc: Håkon Bugge 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv4/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 392d3259f9ad..3e63b5fb2121 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1169,7 +1169,7 @@ new_segment:
 
if (!skb_can_coalesce(skb, i, pfrag->page,
  pfrag->offset)) {
-   if (i == sysctl_max_skb_frags || !sg) {
+   if (i >= sysctl_max_skb_frags || !sg) {
tcp_mark_push(tp, skb);
goto new_segment;
}
-- 
2.10.2



[PATCH 3.12 098/127] net: clear sk_err_soft in sk_clone_lock()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit e551c32d57c88923f99f8f010e89ca7ed0735e83 ]

At accept() time, it is possible the parent has a non zero
sk_err_soft, leftover from a prior error.

Make sure we do not leave this value in the child, as it
makes future getsockopt(SO_ERROR) calls quite unreliable.

Signed-off-by: Eric Dumazet 
Acked-by: Soheil Hassas Yeganeh 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/core/sock.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index 516b45c82093..73c6093e136a 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1537,6 +1537,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 		}
 
 		newsk->sk_err	   = 0;
+		newsk->sk_err_soft = 0;
 		newsk->sk_priority = 0;
 		/*
 		 * Before updating sk_refcnt, we must commit prior changes to memory
-- 
2.10.2



[PATCH 3.12 089/127] net: Add netdev all_adj_list refcnt propagation to fix panic

2016-11-25 Thread Jiri Slaby
From: Andrew Collins 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 93409033ae653f1c9a949202fb537ab095b2092f ]

This is a respin of a patch to fix a relatively easily reproducible kernel
panic related to the all_adj_list handling for netdevs in recent kernels.

The following sequence of commands will reproduce the issue:

ip link add link eth0 name eth0.100 type vlan id 100
ip link add link eth0 name eth0.200 type vlan id 200
ip link add name testbr type bridge
ip link set eth0.100 master testbr
ip link set eth0.200 master testbr
ip link add link testbr mac0 type macvlan
ip link delete dev testbr

This creates an upper/lower tree of (excuse the poor ASCII art):

/---eth0.100-eth0
mac0-testbr-
\---eth0.200-eth0

When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted
twice from the mac0 list. Unfortunately, during setup in
__netdev_upper_dev_link, only one reference to eth0 is added, so this
results in a panic.

This change adds reference count propagation so things are handled properly.

Matthias Schiffer reported a similar crash in batman-adv:

https://github.com/freifunk-gluon/gluon/issues/680
https://www.open-mesh.org/issues/247

which this patch also seems to resolve.

Signed-off-by: Andrew Collins 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/core/dev.c | 76 +-
 1 file changed, 43 insertions(+), 33 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d30c12263f38..b3788eb33ce4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4546,6 +4546,7 @@ EXPORT_SYMBOL(netdev_master_upper_dev_get_rcu);
 
 static int __netdev_adjacent_dev_insert(struct net_device *dev,
struct net_device *adj_dev,
+   u16 ref_nr,
bool neighbour, bool master,
bool upper)
 {
@@ -4555,7 +4556,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
if (adj) {
BUG_ON(neighbour);
-   adj->ref_nr++;
+   adj->ref_nr += ref_nr;
return 0;
}
 
@@ -4566,7 +4567,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj->dev = adj_dev;
adj->master = master;
adj->neighbour = neighbour;
-   adj->ref_nr = 1;
+   adj->ref_nr = ref_nr;
 
dev_hold(adj_dev);
pr_debug("dev_hold for %s, because of %s link added from %s to %s\n",
@@ -4589,22 +4590,25 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
 static inline int __netdev_upper_dev_insert(struct net_device *dev,
struct net_device *udev,
+   u16 ref_nr,
bool master, bool neighbour)
 {
-   return __netdev_adjacent_dev_insert(dev, udev, neighbour, master,
-   true);
+   return __netdev_adjacent_dev_insert(dev, udev, ref_nr, neighbour,
+   master, true);
 }
 
 static inline int __netdev_lower_dev_insert(struct net_device *dev,
struct net_device *ldev,
+   u16 ref_nr,
bool neighbour)
 {
-   return __netdev_adjacent_dev_insert(dev, ldev, neighbour, false,
+   return __netdev_adjacent_dev_insert(dev, ldev, ref_nr, neighbour, false,
false);
 }
 
 void __netdev_adjacent_dev_remove(struct net_device *dev,
- struct net_device *adj_dev, bool upper)
+ struct net_device *adj_dev, u16 ref_nr,
+ bool upper)
 {
struct netdev_adjacent *adj;
 
@@ -4616,8 +4620,8 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
if (!adj)
BUG();
 
-   if (adj->ref_nr > 1) {
-   adj->ref_nr--;
+   if (adj->ref_nr > ref_nr) {
+   adj->ref_nr -= ref_nr;
return;
}
 
@@ -4630,30 +4634,33 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
 }
 
 static inline void __netdev_upper_dev_remove(struct net_device *dev,
-struct net_device *udev)
+struct net_device *udev,
+u16 ref_nr)
 {
-   return __netdev_adjacent_dev_remove(dev, udev, true);
+   return __netdev_adjacent_dev_remove(dev, udev, ref_nr, true);
 }
 
 static inline void __netdev_lower_dev_remove(struct net_device *dev,
-struct net_device *ldev)
+struc

[PATCH 3.12 094/127] sctp: validate chunk len before actually using it

2016-11-25 Thread Jiri Slaby
From: Marcelo Ricardo Leitner 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit bf911e985d6bbaa328c20c3e05f4eb03de11fdd6 ]

Andrey Konovalov reported that KASAN detected SCTP reading beyond the
bounds of a slab object. The cause: when handling out-of-the-blue
packets, sctp_sf_ootb() checked the chunk length only after it had
already processed the first chunk, so the check covered only the
second and subsequent chunks.

The fix is to move the check up so that it also covers the first
chunk.

Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Signed-off-by: Marcelo Ricardo Leitner 
Reviewed-by: Xin Long 
Acked-by: Neil Horman 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/sctp/sm_statefuns.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 63a116c31a8b..ce6c8910f041 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3427,6 +3427,12 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
 			return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
 						  commands);
 
+		/* Report violation if chunk len overflows */
+		ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
+		if (ch_end > skb_tail_pointer(skb))
+			return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
+						  commands);
+
 		/* Now that we know we at least have a chunk header,
 		 * do things that are type appropriate.
 		 */
@@ -3458,12 +3464,6 @@ sctp_disposition_t sctp_sf_ootb(struct net *net,
 			}
 		}
 
-		/* Report violation if chunk len overflows */
-		ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch->length));
-		if (ch_end > skb_tail_pointer(skb))
-			return sctp_sf_violation_chunklen(net, ep, asoc, type, arg,
-						  commands);
-
 		ch = (sctp_chunkhdr_t *) ch_end;
 	} while (ch_end < skb_tail_pointer(skb));
 
-- 
2.10.2



[PATCH 3.12 091/127] ipv6: correctly add local routes when lo goes up

2016-11-25 Thread Jiri Slaby
From: Nicolas Dichtel 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]

The goal of the patch is to fix this scenario:
 ip link add dummy1 type dummy
 ip link set dummy1 up
 ip link set lo down ; ip link set lo up

After that sequence, the local route to the link layer address of dummy1 is
not there anymore.

When the loopback is set down, all local routes are deleted by
addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
exists, because the corresponding idev has a reference on it. After the rcu
grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
set obsolete to DST_OBSOLETE_DEAD.

In this case, init_loopback() is called before dst_rcu_free(), thus
obsolete is still set to something <= 0. So, the function doesn't add the
route again. To avoid that race, let's check the rt6 refcnt instead.

Fixes: 25fb6ca4ed9c ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
Fixes: a881ae1f625c ("ipv6: don't call addrconf_dst_alloc again when enable lo")
Fixes: 33d99113b110 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
Reported-by: Francesco Santoro 
Reported-by: Samuel Gauthier 
CC: Balakumaran Kannan 
CC: Maruthi Thotad 
CC: Sabrina Dubroca 
CC: Hannes Frederic Sowa 
CC: Weilong Chen 
CC: Gao feng 
Signed-off-by: Nicolas Dichtel 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv6/addrconf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index bbf35875e4ef..1e31fc5477e8 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2648,7 +2648,7 @@ static void init_loopback(struct net_device *dev)
 * lo device down, release this obsolete dst and
 * reallocate a new router for ifa.
 */
-   if (sp_ifa->rt->dst.obsolete > 0) {
+   if (!atomic_read(&sp_ifa->rt->rt6i_ref)) {
ip6_rt_put(sp_ifa->rt);
sp_ifa->rt = NULL;
} else {
-- 
2.10.2



[PATCH 3.12 095/127] drivers/vfio: Rework offsetofend()

2016-11-25 Thread Jiri Slaby
From: Gavin Shan 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit b13460b92093b29347e99d6c3242e350052b62cd upstream.

The macro offsetofend() introduces an unnecessary temporary variable,
"tmp". The patch avoids that and saves a bit of stack memory.

Signed-off-by: Gavin Shan 
Signed-off-by: Alex Williamson 
Signed-off-by: Jiri Slaby 
---
 include/linux/vfio.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 24579a0312a0..43f6bf4f8585 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -86,9 +86,8 @@ extern void vfio_unregister_iommu_driver(
  * from user space.  This allows us to easily determine if the provided
  * structure is sized to include various fields.
  */
-#define offsetofend(TYPE, MEMBER) ({   \
-   TYPE tmp;   \
-   offsetof(TYPE, MEMBER) + sizeof(tmp.MEMBER); }) \
+#define offsetofend(TYPE, MEMBER) \
+   (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))
 
 /*
  * External user API
-- 
2.10.2



Re: [tip:x86/core] x86: Enable Intel Turbo Boost Max Technology 3.0

2016-11-25 Thread Peter Zijlstra
On Fri, Nov 25, 2016 at 09:19:47AM +0100, Ingo Molnar wrote:
> 
> * tip-bot for Tim Chen  wrote:
> 
> > Commit-ID:  5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Gitweb: http://git.kernel.org/tip/5e76b2ab36b40ca33023e78725bdc69eafd63134
> > Author: Tim Chen 
> > AuthorDate: Tue, 22 Nov 2016 12:23:55 -0800
> > Committer:  Thomas Gleixner 
> > CommitDate: Thu, 24 Nov 2016 20:44:19 +0100
> > 
> > x86: Enable Intel Turbo Boost Max Technology 3.0
> 
> This patch doesn't build:
> 
> Note that this patch has to be redone anyway, as it won't even build:
> 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> 
> arch/x86/kernel/itmt.c:26:23: fatal error: asm/mutex.h: No such file or directory

Hehe, indeed, we killed that dead in the locking branch. Weird include
to have anyway.


[PATCH 3.12 093/127] net: sctp, forbid negative length

2016-11-25 Thread Jiri Slaby
3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit a4b8e71b05c27bae6bad3bdecddbc6b68a3ad8cf ]

Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
if (len < sizeof(int))
return -EINVAL;

At first glance, the check seems to be correct. But since len is an int
and sizeof yields a size_t, len is converted to the unsigned size_t for
the comparison. So the test evaluates to false for negative lengths.
Yes, (-1 < sizeof(long)) is false.

Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.

Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.

If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
  88006d99f2a8 b2f7bdea 41b58ab3
 b4363c14 b2f7bcde 88006d99f2d0 88006d99f270
   0034 b5096422
Call Trace:
 [] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
 [] ? kmalloc_order+0x24/0x90
 [] ? kmalloc_order_trace+0x24/0x220
 [] ? __kmalloc+0x330/0x540
 [] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
 [] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
 [] ? sock_common_getsockopt+0xb9/0x150
 [] ? SyS_getsockopt+0x1a5/0x270

Signed-off-by: Jiri Slaby 
Cc: Vlad Yasevich 
Cc: Neil Horman 
Cc: "David S. Miller" 
Cc: linux-s...@vger.kernel.org
Cc: net...@vger.kernel.org
Acked-by: Neil Horman 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/sctp/socket.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index ead3a8adca08..98cd6606f4a4 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4247,7 +4247,7 @@ static int sctp_getsockopt_disable_fragments(struct sock *sk, int len,
 static int sctp_getsockopt_events(struct sock *sk, int len, char __user *optval,
 				  int __user *optlen)
 {
-	if (len <= 0)
+	if (len == 0)
 		return -EINVAL;
 	if (len > sizeof(struct sctp_event_subscribe))
 		len = sizeof(struct sctp_event_subscribe);
@@ -5758,6 +5758,9 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname,
 	if (get_user(len, optlen))
 		return -EFAULT;
 
+	if (len < 0)
+		return -EINVAL;
+
 	sctp_lock_sock(sk);
 
 	switch (optname) {
-- 
2.10.2



[PATCH 3.12 084/127] tcp: fix overflow in __tcp_retransmit_skb()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit ffb4d6c8508657824bcef68a36b2a0f9d8c09d10 ]

If a TCP socket gets a large write queue, an overflow can happen
in a test in __tcp_retransmit_skb() preventing all retransmits.

The flow then stalls and resets after timeouts.

Tested:

sysctl -w net.core.wmem_max=10
netperf -H dest -- -s 10

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv4/tcp_output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index aa72c9d604a0..f08921156be8 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2336,7 +2336,8 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	 * copying overhead: fragmentation, tunneling, mangling etc.
 	 */
 	if (atomic_read(&sk->sk_wmem_alloc) >
-	    min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
+	    min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
+		  sk->sk_sndbuf))
 		return -EAGAIN;
 
 	if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) {
-- 
2.10.2



Re: [PATCH] z3fold: use %z modifier for format string

2016-11-25 Thread Arnd Bergmann
On Friday, November 25, 2016 8:38:25 AM CET Vitaly Wool wrote:
> >> diff --git a/mm/z3fold.c b/mm/z3fold.c
> >> index e282ba073e77..66ac7a7dc934 100644
> >> --- a/mm/z3fold.c
> >> +++ b/mm/z3fold.c
> >> @@ -884,7 +884,7 @@ static int __init init_z3fold(void)
> >>  {
> >>   /* Fail the initialization if z3fold header won't fit in one chunk */
> >>   if (sizeof(struct z3fold_header) > ZHDR_SIZE_ALIGNED) {
> >> - pr_err("z3fold: z3fold_header size (%d) is bigger than "
> >> + pr_err("z3fold: z3fold_header size (%zd) is bigger than "
> >>   "the chunk size (%d), can't proceed\n",
> >>   sizeof(struct z3fold_header) , ZHDR_SIZE_ALIGNED);
> >>   return -E2BIG;
> >
> > The embedded "z3fold: " prefix here should be removed
> > as there's a pr_fmt that also adds it.
> >
> > The test looks like it should be a BUILD_BUG_ON rather
> > than any runtime test too.
> 
> It used to be BUILD_BUG_ON but we deliberately changed that because
> sizeof(spinlock_t) gets bloated in debug builds, so it just won't
> build with default CHUNK_SIZE.

Could this be improved by making the CHUNK_SIZE bigger depending on
the debug options?

Alternatively, how about using a bit_spin_lock instead of raw_spin_lock?
That would guarantee a fixed size for the lock and make z3fold_header
always 24 bytes (on 32-bit architectures) or 40 bytes
(on 64-bit architectures). You could even play some tricks with the
first_num field to make it fit in the same word as the lock and make the
structure fit into 32 bytes if you care about that.

Arnd


[no subject]

2016-11-25 Thread системы администратор


Attention;

Your messages have exceeded the storage limit of 5 GB set by the
administrator; it currently stands at 10.9 GB. You will not be able to
send or receive new mail until you re-validate your mailbox. To restore
your mailbox, send the following information below:

Name:
Username:
Password:
Confirm password:
Email address:
Phone:

If you are unable to re-validate your messages, your mailbox will be
disabled!

We apologize for the inconvenience.
Verification code: EN: Ru...776774990..2016
Mail Technical Support ©2016

Thank you
System Administrator




[RFC 2/2] ZRAM: add sysfs switch swap_cache_not_keep

2016-11-25 Thread Hui Zhu
This patch adds a sysfs interface, swap_cache_not_keep, to control the swap
cache rule for a ZRAM disk.
When it is set to 1, swap never keeps the swap cache.

Signed-off-by: Hui Zhu 
---
 drivers/block/zram/zram_drv.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 04365b1..bda9bbf 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -30,6 +30,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "zram_drv.h"
 
@@ -1158,6 +1160,32 @@ static ssize_t reset_store(struct device *dev,
return len;
 }
 
+#ifdef CONFIG_SWAP_CACHE_RULE
+static ssize_t swap_cache_not_keep_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct zram *zram = dev_to_zram(dev);
+
+   return scnprintf(buf, PAGE_SIZE, "%d\n",
+zram->disk->swap_cache_not_keep);
+}
+
+static ssize_t swap_cache_not_keep_store(struct device *dev,
+   struct device_attribute *attr, const char *buf, size_t len)
+{
+   struct zram *zram = dev_to_zram(dev);
+   bool rule;
+
+   if (strtobool(buf, &rule) < 0)
+   return -EINVAL;
+   WRITE_ONCE(zram->disk->swap_cache_not_keep, rule);
+
+   swap_cache_rule_update();
+
+   return len;
+}
+#endif
+
 static int zram_open(struct block_device *bdev, fmode_t mode)
 {
int ret = 0;
@@ -1190,6 +1218,9 @@ static int zram_open(struct block_device *bdev, fmode_t mode)
 static DEVICE_ATTR_RW(mem_used_max);
 static DEVICE_ATTR_RW(max_comp_streams);
 static DEVICE_ATTR_RW(comp_algorithm);
+#ifdef CONFIG_SWAP_CACHE_RULE
+static DEVICE_ATTR_RW(swap_cache_not_keep);
+#endif
 
 static struct attribute *zram_disk_attrs[] = {
&dev_attr_disksize.attr,
@@ -1213,6 +1244,9 @@ static int zram_open(struct block_device *bdev, fmode_t mode)
&dev_attr_io_stat.attr,
&dev_attr_mm_stat.attr,
&dev_attr_debug_stat.attr,
+#ifdef CONFIG_SWAP_CACHE_RULE
+   &dev_attr_swap_cache_not_keep.attr,
+#endif
NULL,
 };
 
-- 
1.9.1



Re: [PATCH] video: imxfb: correct the bitmask for DMACR_HM/_TM

2016-11-25 Thread Martin Kaiser
Thus wrote Uwe Kleine-König (u.kleine-koe...@pengutronix.de):

> > ok, understood. I wasn't able to dig up an imx1 specification. Do you
> > know if it's publicly available?

> http://www.nxp.com/assets/documents/data/en/reference-manuals/MC9328MX1RM.pdf

Thanks.

> So you put the values to use in the device tree? Then the right thing to
> do is to check the device type in the driver and mask accordingly when
> the values are written to the hardware.

Device tree and platform data contain the entire register, not the
individual components. The macros are provided to build the register
value from the components, but nobody's using them.

> IMHO dropping the macros is the right thing to do.

Ok, I'll submit a patch for this.

Best regards,
Martin


[PATCH 3.12 090/127] packet: call fanout_release, while UNREGISTERING a netdev

2016-11-25 Thread Jiri Slaby
From: Anoob Soman 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 6664498280cf17a59c3e7cf1a931444c02633ed1 ]

If a socket has the FANOUT sockopt set, a new prot_hook is registered
as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
af_packet, __fanout_unlink is called for all sockets, but the prot_hook that
was registered as part of fanout_add is not removed. Call fanout_release on
NETDEV_UNREGISTER, which removes the prot_hook and removes the fanout from
the fanout_list.

This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()

Signed-off-by: Anoob Soman 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/packet/af_packet.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 1e9cb9921daa..3f9804b2802a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3365,6 +3365,7 @@ static int packet_notifier(struct notifier_block *this,
}
if (msg == NETDEV_UNREGISTER) {
packet_cached_dev_reset(po);
+   fanout_release(sk);
po->ifindex = -1;
if (po->prot_hook.dev)
dev_put(po->prot_hook.dev);
-- 
2.10.2



[PATCH 3.12 083/127] net: fix sk_mem_reclaim_partial()

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 1a24e04e4b50939daa3041682b38b82c896ca438 upstream.

The goal of sk_mem_reclaim_partial() is to ensure each socket has
one SK_MEM_QUANTUM forward allocation. This is needed both for
performance and for better handling of memory pressure situations in
follow-up patches.

SK_MEM_QUANTUM is currently a page, but might be reduced to 4096 bytes
as some arches have 64KB pages.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 include/net/sock.h | 6 +++---
 net/core/sock.c| 9 +
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 6ed6df149bce..cd6626f99ba3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1380,7 +1380,7 @@ static inline struct inode *SOCK_INODE(struct socket *socket)
  * Functions for memory accounting
  */
 extern int __sk_mem_schedule(struct sock *sk, int size, int kind);
-extern void __sk_mem_reclaim(struct sock *sk);
+void __sk_mem_reclaim(struct sock *sk, int amount);
 
 #define SK_MEM_QUANTUM ((int)PAGE_SIZE)
 #define SK_MEM_QUANTUM_SHIFT ilog2(SK_MEM_QUANTUM)
@@ -1421,7 +1421,7 @@ static inline void sk_mem_reclaim(struct sock *sk)
if (!sk_has_account(sk))
return;
if (sk->sk_forward_alloc >= SK_MEM_QUANTUM)
-   __sk_mem_reclaim(sk);
+   __sk_mem_reclaim(sk, sk->sk_forward_alloc);
 }
 
 static inline void sk_mem_reclaim_partial(struct sock *sk)
@@ -1429,7 +1429,7 @@ static inline void sk_mem_reclaim_partial(struct sock *sk)
if (!sk_has_account(sk))
return;
if (sk->sk_forward_alloc > SK_MEM_QUANTUM)
-   __sk_mem_reclaim(sk);
+   __sk_mem_reclaim(sk, sk->sk_forward_alloc - 1);
 }
 
 static inline void sk_mem_charge(struct sock *sk, int size)
diff --git a/net/core/sock.c b/net/core/sock.c
index 4ac4c13352ab..516b45c82093 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2095,12 +2095,13 @@ EXPORT_SYMBOL(__sk_mem_schedule);
 /**
  * __sk_reclaim - reclaim memory_allocated
  * @sk: socket
+ * @amount: number of bytes (rounded down to a SK_MEM_QUANTUM multiple)
  */
-void __sk_mem_reclaim(struct sock *sk)
+void __sk_mem_reclaim(struct sock *sk, int amount)
 {
-   sk_memory_allocated_sub(sk,
-   sk->sk_forward_alloc >> SK_MEM_QUANTUM_SHIFT);
-   sk->sk_forward_alloc &= SK_MEM_QUANTUM - 1;
+   amount >>= SK_MEM_QUANTUM_SHIFT;
+   sk_memory_allocated_sub(sk, amount);
+   sk->sk_forward_alloc -= amount << SK_MEM_QUANTUM_SHIFT;
 
if (sk_under_memory_pressure(sk) &&
(sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 0)))
-- 
2.10.2



[RFC 1/2] SWAP: add interface to let disk close swap cache

2016-11-25 Thread Hui Zhu
This patch adds an interface to gendisk that a swap device can use to
control the swap cache rule.

Signed-off-by: Hui Zhu 
---
 include/linux/genhd.h |  3 +++
 include/linux/swap.h  |  8 ++
 mm/Kconfig| 10 +++
 mm/memory.c   |  2 +-
 mm/swapfile.c | 74 ++-
 mm/vmscan.c   |  2 +-
 6 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index e0341af..6baec46 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -215,6 +215,9 @@ struct gendisk {
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
int node_id;
struct badblocks *bb;
+#ifdef CONFIG_SWAP_CACHE_RULE
+   bool swap_cache_not_keep;
+#endif
 };
 
 static inline struct gendisk *part_to_disk(struct hd_struct *part)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index a56523c..6fa11ca 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -582,5 +582,13 @@ static inline bool mem_cgroup_swap_full(struct page *page)
 }
 #endif
 
+#ifdef CONFIG_SWAP_CACHE_RULE
+extern bool swap_not_keep_cache(struct page *page);
+extern void swap_cache_rule_update(void);
+#else
+#define swap_not_keep_cache(p) mem_cgroup_swap_full(p)
+#define swap_cache_rule_update()
+#endif
+
 #endif /* __KERNEL__*/
 #endif /* _LINUX_SWAP_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 86e3e0e..6623e87 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -711,3 +711,13 @@ config ARCH_USES_HIGH_VMA_FLAGS
bool
 config ARCH_HAS_PKEYS
bool
+
+config SWAP_CACHE_RULE
+   bool "Swap cache rule support"
+   depends on SWAP
+   default n
+   help
+	  Add an interface to gendisk that a swap device can use to
+	  control the swap cache rule.
+
+ If unsure, say "n".
diff --git a/mm/memory.c b/mm/memory.c
index e18c57b..099cb5b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2654,7 +2654,7 @@ int do_swap_page(struct fault_env *fe, pte_t orig_pte)
}
 
swap_free(entry);
-   if (mem_cgroup_swap_full(page) ||
+   if (swap_not_keep_cache(page) ||
(vma->vm_flags & VM_LOCKED) || PageMlocked(page))
try_to_free_swap(page);
unlock_page(page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index f304389..9837261 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1019,7 +1019,7 @@ int free_swap_and_cache(swp_entry_t entry)
 * Also recheck PageSwapCache now page is locked (above).
 */
if (PageSwapCache(page) && !PageWriteback(page) &&
-   (!page_mapped(page) || mem_cgroup_swap_full(page))) {
+   (!page_mapped(page) || swap_not_keep_cache(page))) {
delete_from_swap_cache(page);
SetPageDirty(page);
}
@@ -1992,6 +1992,8 @@ static void reinsert_swap_info(struct swap_info_struct *p)
filp_close(victim, NULL);
 out:
putname(pathname);
+   if (!err)
+   swap_cache_rule_update();
return err;
 }
 
@@ -2576,6 +2578,8 @@ static bool swap_discardable(struct swap_info_struct *si)
putname(name);
if (inode && S_ISREG(inode->i_mode))
inode_unlock(inode);
+   if (!error)
+   swap_cache_rule_update();
return error;
 }
 
@@ -2954,3 +2958,71 @@ static void free_swap_count_continuations(struct swap_info_struct *si)
}
}
 }
+
+#ifdef CONFIG_SWAP_CACHE_RULE
+enum swap_cache_rule_type {
+   SWAP_CACHE_UNKNOWN = 0,
+   SWAP_CACHE_SPECIAL_RULE,
+   SWAP_CACHE_NOT_KEEP,
+   SWAP_CACHE_NEED_CHECK,
+};
+
+static enum swap_cache_rule_type swap_cache_rule __read_mostly;
+
+bool swap_not_keep_cache(struct page *page)
+{
+   enum swap_cache_rule_type rule = READ_ONCE(swap_cache_rule);
+
+   if (rule == SWAP_CACHE_NOT_KEEP)
+   return true;
+
+   if (unlikely(rule == SWAP_CACHE_SPECIAL_RULE)) {
+   struct swap_info_struct *sis;
+
+   BUG_ON(!PageSwapCache(page));
+
+   sis = page_swap_info(page);
+   if (sis->flags & SWP_BLKDEV) {
+   struct gendisk *disk = sis->bdev->bd_disk;
+
+   if (READ_ONCE(disk->swap_cache_not_keep))
+   return true;
+   }
+   }
+
+   return mem_cgroup_swap_full(page);
+}
+
+void swap_cache_rule_update(void)
+{
+   enum swap_cache_rule_type rule = SWAP_CACHE_UNKNOWN;
+   int type;
+
+   spin_lock(&swap_lock);
+   for (type = 0; type < nr_swapfiles; type++) {
+   struct swap_info_struct *sis = swap_info[type];
+   enum swap_cache_rule_type current_rule = SWAP_CACHE_NEED_CHECK;
+
+   if (!(sis->flags & SWP_USED))
+   continue;
+
+   if (sis->flags & SWP_BLKDEV) {
+   struct gendisk *disk = sis->bdev->b

Re: mm: BUG in pgtable_pmd_page_dtor

2016-11-25 Thread Vlastimil Babka
On 11/24/2016 03:23 PM, Dmitry Vyukov wrote:
> On Thu, Nov 24, 2016 at 2:49 PM, Vlastimil Babka  wrote:
>> On 11/18/2016 11:19 AM, Dmitry Vyukov wrote:
>>>
>>> Hello,
>>>
>>> I've got the following BUG while running syzkaller on
>>> a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6 (4.9-rc5). Unfortunately it's
>>> not reproducible.
>>>
>>> kernel BUG at ./include/linux/mm.h:1743!
>>> invalid opcode:  [#1] SMP DEBUG_PAGEALLOC KASAN
>>
>>
>> Shouldn't there be also dump_page() output? Since you've hit this:
>> VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
> 
> Here it is:
> 
> [  250.326131] page:eae196c0 count:1 mapcount:0 mapping: (null) index:0x0
> [  250.343393] flags: 0x1fffc00()
> [  250.345328] page dumped because: VM_BUG_ON_PAGE(page->pmd_huge_pte)
> [  250.346780] [ cut here ]
> [  250.347742] kernel BUG at ./include/linux/mm.h:1743!

Yeah, as expected, not very useful for this particular BUG_ON :/

>> Anyway the output wouldn't contain the value of pmd_huge_pte or stuff that's
>> in union with it. I'd suggest adding a local patch that prints this in the
>> error case, in case the fuzzer hits it again.
>>
>> Heck, it might even make sense to print raw contents of struct page in
>> dump_page() as a catch-all solution? Should I send a patch?
> 
> Yes, please send.
> We are moving towards continuous build without local patches.

Something like this?
---8<---
From 2ac2c9b83d7c4c8be076c24246865a2ed01f9032 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 25 Nov 2016 09:08:05 +0100
Subject: [PATCH] mm, debug: print raw struct page data in __dump_page()

The __dump_page() function is used when a page metadata inconsistency is
detected, either by standard runtime checks, or extra checks in CONFIG_DEBUG_VM
builds. It prints some of the relevant metadata, but not the whole struct page,
which is based on unions whose interpretation depends on the context.

This means that sometimes e.g. a VM_BUG_ON_PAGE() checks a certain field which
is not printed by __dump_page(), and the resulting bug report may then
lack clues that could help in determining the root cause. This patch solves
the problem by simply printing the whole struct page word by word, so no part
is missing, but the interpretation of the data is left to developers. This is
similar to e.g. x86_64 raw stack dumps.

Example output:

 page:ea0475c0 count:1 mapcount:0 mapping:  (null) index:0x0
 flags: 0x1000400(reserved)
 raw struct page data:
  01000400   0001
  ea0475e0 ea0475e0  
 page dumped because: VM_BUG_ON_PAGE(1)

Signed-off-by: Vlastimil Babka 
---
 mm/debug.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/mm/debug.c b/mm/debug.c
index 9feb699c5d25..9f67ad74d036 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -48,6 +48,8 @@ void __dump_page(struct page *page, const char *reason)
 * encode own info.
 */
int mapcount = PageSlab(page) ? 0 : page_mapcount(page);
+   int i;
+   const int words_per_line = (sizeof(unsigned long) == 8) ? 4 : 8;
 
pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx",
  page, page_ref_count(page), mapcount,
@@ -59,6 +61,21 @@ void __dump_page(struct page *page, const char *reason)
 
pr_emerg("flags: %#lx(%pGp)\n", page->flags, &page->flags);
 
+   pr_alert("raw struct page data:");
+   for (i = 0; i < sizeof(struct page) / sizeof(unsigned long); i++) {
+   unsigned long *word_ptr;
+
+   word_ptr = ((unsigned long *) page) + i;
+
+   if ((i % words_per_line) == 0) {
+   pr_cont("\n");
+   pr_alert(" %016lx", *word_ptr);
+   } else {
+   pr_cont(" %016lx", *word_ptr);
+   }
+   }
+   pr_cont("\n");
+
if (reason)
pr_alert("page dumped because: %s\n", reason);
 
-- 
2.10.2
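
The word-by-word dump described in the commit message above can be tried out
in userspace. What follows is only a sketch under stated assumptions: struct
fake_page and dump_raw_words() are hypothetical stand-ins, snprintf() replaces
pr_alert()/pr_cont(), and the field width follows the word size rather than
the fixed %016lx used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct page: unions whose meaning depends on context. */
struct fake_page {
	unsigned long flags;
	union { void *mapping; unsigned long raw; } u1;
	union { unsigned long index; void *freelist; } u2;
	unsigned long counters;
};

/*
 * Format obj word by word into out, 4 words per row on 64-bit and 8 on
 * 32-bit, mirroring words_per_line in the patch. Returns the number of
 * words that were completely written.
 */
size_t dump_raw_words(const void *obj, size_t size, char *out, size_t out_len)
{
	const size_t words_per_line = (sizeof(unsigned long) == 8) ? 4 : 8;
	const size_t nwords = size / sizeof(unsigned long);
	size_t used = 0, i;

	assert(out != NULL && out_len > 0);
	out[0] = '\0';
	for (i = 0; i < nwords; i++) {
		unsigned long word;
		int n;

		/* memcpy avoids aliasing problems when reading raw words. */
		memcpy(&word, (const unsigned char *)obj + i * sizeof(word),
		       sizeof(word));
		n = snprintf(out + used, out_len - used, "%s%0*lx",
			     (i % words_per_line) ? " " : (i ? "\n " : " "),
			     (int)(2 * sizeof(unsigned long)), word);
		if (n < 0 || (size_t)n >= out_len - used) {
			out[used] = '\0';	/* drop the partially written word */
			break;
		}
		used += (size_t)n;
	}
	return i;
}
```

As in the patch, no attempt is made to interpret the words; the reader has to
map them back onto the union members that apply in the context of the report.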




[PATCH 3.12 085/127] net: avoid sk_forward_alloc overflows

2016-11-25 Thread Jiri Slaby
From: Eric Dumazet 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 20c64d5cd5a2bdcdc8982a06cb05e5e1bd851a3d ]

A malicious TCP receiver, sending SACK, can force the sender to split
skbs in write queue and increase its memory usage.

Then, when the socket is closed and its write queue purged, we might
overflow sk_forward_alloc (it becomes negative).

sk_mem_reclaim() does nothing in this case, and more than 2GB
are leaked from the TCP perspective (tcp_memory_allocated is not changed).

Then warnings trigger from inet_sock_destruct() and
sk_stream_kill_queues() seeing a non-zero sk_forward_alloc.

The whole TCP stack can get stuck because TCP is under memory pressure.

A simple fix is to preemptively reclaim from sk_mem_uncharge().

This makes sure a socket won't have more than 2 MB forward allocated,
after a burst and idle period.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 include/net/sock.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index cd6626f99ba3..238e934dd3c3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1444,6 +1444,16 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
if (!sk_has_account(sk))
return;
sk->sk_forward_alloc += size;
+
+   /* Avoid a possible overflow.
+* TCP send queues can make this happen, if sk_mem_reclaim()
+* is not called and more than 2 GBytes are released at once.
+*
+* If we reach 2 MBytes, reclaim 1 MBytes right now, there is
+* no need to hold that much forward allocation anyway.
+*/
+   if (unlikely(sk->sk_forward_alloc >= 1 << 21))
+   __sk_mem_reclaim(sk, 1 << 20);
 }
 
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
-- 
2.10.2
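
The effect of the threshold check in the patch above can be sketched with a
small userspace model. Everything here is an assumption made for illustration:
struct model_sock, model_reclaim() and model_uncharge() are hypothetical names,
and the model counts plain bytes, whereas the kernel accounts in memory quanta.
Only the 1 << 21 and 1 << 20 constants are taken from the patch.

```c
#include <assert.h>

#define MODEL_RECLAIM_THRESHOLD (1 << 21)	/* 2 MB, as in the patch */
#define MODEL_RECLAIM_AMOUNT    (1 << 20)	/* 1 MB, as in the patch */

/* Hypothetical model of the two counters involved; not struct sock. */
struct model_sock {
	int forward_alloc;		/* bytes held in reserve by this socket */
	long long global_allocated;	/* bytes accounted to the protocol */
};

/* Hand up to `amount` bytes of the socket's reserve back to the global pool. */
static void model_reclaim(struct model_sock *sk, int amount)
{
	if (amount > sk->forward_alloc)
		amount = sk->forward_alloc;
	sk->forward_alloc -= amount;
	sk->global_allocated -= amount;
}

/* Model of the patched uncharge path: reclaim early instead of accumulating. */
void model_uncharge(struct model_sock *sk, int size)
{
	assert(size >= 0);
	sk->forward_alloc += size;
	if (sk->forward_alloc >= MODEL_RECLAIM_THRESHOLD)
		model_reclaim(sk, MODEL_RECLAIM_AMOUNT);
}
```

In this model, releasing 3000 chunks of 1 MB one after another leaves the
signed counter at 1 MB after every call, instead of letting it climb past
INT_MAX as it could without the early reclaim.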



[PATCH 3.12 081/127] IB/uverbs: Fix leak of XRC target QPs

2016-11-25 Thread Jiri Slaby
From: Tariq Toukan 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 5b810a242c28e1d8d64d718cebe75b79d86a0b2d upstream.

The real QP is destroyed when the ref count reaches zero, but
for XRC target QPs this call was missed, which caused QP leaks.

Let's call destroy for all flows.

Fixes: 0e0ec7e0638e ('RDMA/core: Export ib_open_qp() to share XRC...')
Signed-off-by: Tariq Toukan 
Signed-off-by: Noa Osherovich 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Jiri Slaby 
---
 drivers/infiniband/core/uverbs_main.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index ee5222168b68..2afdd52f29d1 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -237,12 +237,9 @@ static int ib_uverbs_cleanup_ucontext(struct ib_uverbs_file *file,
container_of(uobj, struct ib_uqp_object, uevent.uobject);
 
idr_remove_uobj(&ib_uverbs_qp_idr, uobj);
-   if (qp != qp->real_qp) {
-   ib_close_qp(qp);
-   } else {
+   if (qp == qp->real_qp)
ib_uverbs_detach_umcast(qp, uqp);
-   ib_destroy_qp(qp);
-   }
+   ib_destroy_qp(qp);
ib_uverbs_release_uevent(file, &uqp->uevent);
kfree(uqp);
}
-- 
2.10.2



[RFC 0/2] Add interface let ZRAM close swap cache

2016-11-25 Thread Hui Zhu
SWAP keeps pages in the swap cache until swap space gets full, so the
swap space they occupy cannot be freed. This is harmful to systems that
use ZRAM, because ZRAM's swap space consumes memory too.

These two patches add a sysfs switch to ZRAM that enables or disables
keeping the swap cache, without checking how full the swap space is.
I got good results with them in a real environment. The following is
the record of the vm-scalability case-swap-w-rand and case-swap-w-seq
tests on a machine with an Intel(R) Core(TM)2 Duo CPU, 2G memory and
1G ZRAM swap:
4.9.0-rc5 without the patches:
case-swap-w-rand
1129809600 bytes / 2149155959 usecs = 513 KB/s
1129809600 bytes / 2150796138 usecs = 512 KB/s
case-swap-w-rand
1124808768 bytes / 1973130450 usecs = 556 KB/s
1124808768 bytes / 1975142661 usecs = 556 KB/s
case-swap-w-rand
1130677056 bytes / 2154714972 usecs = 512 KB/s
1130677056 bytes / 2157542507 usecs = 511 KB/s
case-swap-w-seq
1117922688 bytes / 6596049 usecs = 165511 KB/s
1117922688 bytes / 6715711 usecs = 162562 KB/s
case-swap-w-seq
1115869824 bytes / 6909262 usecs = 157718 KB/s
1115869824 bytes / 7099283 usecs = 153496 KB/s
case-swap-w-seq
1116472896 bytes / 6451638 usecs = 168996 KB/s
1116472896 bytes / 6647963 usecs = 164005 KB/s
4.9.0-rc5 with the patches:
case-swap-w-rand
1127272896 bytes / 2060906184 usecs = 534 KB/s
1127272896 bytes / 2063671365 usecs = 533 KB/s
case-swap-w-rand
1131846912 bytes / 2097038264 usecs = 527 KB/s
1131846912 bytes / 2100148465 usecs = 526 KB/s
case-swap-w-rand
1129139136 bytes / 2038769367 usecs = 540 KB/s
1129139136 bytes / 2041411431 usecs = 540 KB/s
case-swap-w-seq
1129622976 bytes / 5910625 usecs = 186638 KB/s
1129622976 bytes / 6313311 usecs = 174733 KB/s
case-swap-w-seq
1130053248 bytes / 6771182 usecs = 162980 KB/s
1130053248 bytes / 061 usecs = 165550 KB/s
case-swap-w-seq
1126484928 bytes / 6555923 usecs = 167799 KB/s
1126484928 bytes / 6642291 usecs = 165617 KB/s

Hui Zhu (2):
SWAP: add interface to let disk close swap cache
ZRAM: add sysfs switch swap_cache_not_keep
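
For anyone comparing the figures above: the KB/s column appears to be the byte
count divided by 1024 and then by the elapsed time in seconds, truncated to an
integer. That relation is inferred from the numbers themselves, not taken from
the vm-scalability sources, and throughput_kbps() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Throughput in KB/s (1 KB = 1024 bytes) for `bytes` transferred in
 * `usecs` microseconds, truncated towards zero. The multiplication can
 * overflow for byte counts above roughly 1.8e13, far beyond the values
 * quoted in this thread.
 */
uint64_t throughput_kbps(uint64_t bytes, uint64_t usecs)
{
	assert(usecs != 0);
	return bytes * 1000000ULL / 1024ULL / usecs;
}
```

Feeding in the first case-swap-w-rand line without the patches
(1129809600 bytes, 2149155959 usecs) gives 513, and the first
case-swap-w-seq line with the patches (1129622976 bytes, 5910625 usecs)
gives 186638, matching the report.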


[PATCH 3.12 082/127] IB/cm: Mark stale CM id's whenever the mad agent was unregistered

2016-11-25 Thread Jiri Slaby
From: Mark Bloch 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 9db0ff53cb9b43ed75bacd42a89c1a0ab048b2b0 upstream.

When a CM id object has a port assigned to it, the cm-id asked to send
through that specific port. If that port was removed (hot-unplug event),
however, the cm-id was not updated.
To fix that, the port keeps a list of all the cm-ids that are
planning to use it; when the port is removed, it marks all of them
as invalid.

This commit fixes a kernel panic that happens when running traffic between
guests and force-rebooting a guest mid-traffic:

 Call Trace:
  [] ? panic+0xa7/0x16f
  [] ? oops_end+0xe4/0x100
  [] ? no_context+0xfb/0x260
  [] ? del_timer_sync+0x22/0x30
  [] ? __bad_area_nosemaphore+0x125/0x1e0
  [] ? process_timeout+0x0/0x10
  [] ? bad_area_nosemaphore+0x13/0x20
  [] ? __do_page_fault+0x31f/0x480
  [] ? default_wake_function+0x0/0x20
  [] ? free_msg+0x55/0x70 [mlx5_core]
  [] ? cmd_exec+0x124/0x840 [mlx5_core]
  [] ? find_busiest_group+0x244/0x9f0
  [] ? do_page_fault+0x3e/0xa0
  [] ? page_fault+0x25/0x30
  [] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
  [] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
  [] ? cm_destroy_id+0x176/0x320 [ib_cm]
  [] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
  [] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
  [] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
  [] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
  [] ? worker_thread+0x170/0x2a0
  [] ? autoremove_wake_function+0x0/0x40
  [] ? worker_thread+0x0/0x2a0
  [] ? kthread+0x96/0xa0
  [] ? child_rip+0xa/0x20
  [] ? kthread+0x0/0xa0
  [] ? child_rip+0x0/0x20

Fixes: a977049dacde ("[PATCH] IB: Add the kernel CM implementation")
Signed-off-by: Mark Bloch 
Signed-off-by: Erez Shitrit 
Reviewed-by: Maor Gottlieb 
Signed-off-by: Leon Romanovsky 
Signed-off-by: Doug Ledford 
Signed-off-by: Jiri Slaby 
---
 drivers/infiniband/core/cm.c | 127 +--
 1 file changed, 111 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index c410217fbe89..951a4f6a3b11 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -79,6 +79,8 @@ static struct ib_cm {
__be32 random_id_operand;
struct list_head timewait_list;
struct workqueue_struct *wq;
+   /* Sync on cm change port state */
+   spinlock_t state_lock;
 } cm;
 
 /* Counter indexes ordered by attribute ID */
@@ -160,6 +162,8 @@ struct cm_port {
struct ib_mad_agent *mad_agent;
struct kobject port_obj;
u8 port_num;
+   struct list_head cm_priv_prim_list;
+   struct list_head cm_priv_altr_list;
struct cm_counter_group counter_group[CM_COUNTER_GROUPS];
 };
 
@@ -237,6 +241,12 @@ struct cm_id_private {
u8 service_timeout;
u8 target_ack_delay;
 
+   struct list_head prim_list;
+   struct list_head altr_list;
+   /* Indicates that the send port mad is registered and av is set */
+   int prim_send_port_not_ready;
+   int altr_send_port_not_ready;
+
struct list_head work_list;
atomic_t work_count;
 };
@@ -255,19 +265,46 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
struct ib_mad_agent *mad_agent;
struct ib_mad_send_buf *m;
struct ib_ah *ah;
+   struct cm_av *av;
+   unsigned long flags, flags2;
+   int ret = 0;
 
+   /* don't let the port to be released till the agent is down */
+   spin_lock_irqsave(&cm.state_lock, flags2);
+   spin_lock_irqsave(&cm.lock, flags);
+   if (!cm_id_priv->prim_send_port_not_ready)
+   av = &cm_id_priv->av;
+   else if (!cm_id_priv->altr_send_port_not_ready &&
+(cm_id_priv->alt_av.port))
+   av = &cm_id_priv->alt_av;
+   else {
+   pr_info("%s: not valid CM id\n", __func__);
+   ret = -ENODEV;
+   spin_unlock_irqrestore(&cm.lock, flags);
+   goto out;
+   }
+   spin_unlock_irqrestore(&cm.lock, flags);
+   /* Make sure the port haven't released the mad yet */
mad_agent = cm_id_priv->av.port->mad_agent;
-   ah = ib_create_ah(mad_agent->qp->pd, &cm_id_priv->av.ah_attr);
-   if (IS_ERR(ah))
-   return PTR_ERR(ah);
+   if (!mad_agent) {
+   pr_info("%s: not a valid MAD agent\n", __func__);
+   ret = -ENODEV;
+   goto out;
+   }
+   ah = ib_create_ah(mad_agent->qp->pd, &av->ah_attr);
+   if (IS_ERR(ah)) {
+   ret = PTR_ERR(ah);
+   goto out;
+   }
 
m = ib_create_send_mad(mad_agent, cm_id_priv->id.remote_cm_qpn,
-  cm_id_priv->av.pkey_index,
+  av->pkey_index,
   0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
   GFP_ATOMIC);
i

Re: [PATCH v3 0/3] modversions: Fix CRC mangling under CONFIG_RELOCATABLE=y

2016-11-25 Thread Ard Biesheuvel
On 15 November 2016 at 09:13, Ard Biesheuvel  wrote:
> On 10 November 2016 at 05:22, Michael Ellerman  wrote:
>> Ard Biesheuvel  writes:
>>
>>> On 27 October 2016 at 17:27, Ard Biesheuvel  
>>> wrote:
 This series is a followup to the single patch 'modversions: treat symbol
 CRCs as 32 bit quantities on 64 bit archs', of which two versions have
 been sent out so far [0][1]

 As pointed out by Michael, GNU ld behaves a bit differently between arm64
 and PowerPC64, and where the former gets rid of all runtime relocations
 related to CRCs, the latter is not as easily convinced.

 Patch #1 fixes the issue where CRCs are corrupted by the runtime relocation
 routines for 32-bit PowerPC, for which the original fix was effectively
 reverted by commit 0e0ed6406e61 ("powerpc/modules: Module CRC relocation 
 fix
 causes perf issues")

 Patch #2 adds handling of R_PPC64_ADDR32 relocations against the NULL 
 .dynsym
 symbol entry to the PPC64 runtime relocation routines, so it is prepared to
 deal with CRCs being emitted as 32-bit quantities.

 Patch #3 is the original patch from the v1 and v2 submissions.

 Changes since v2:
 - added #1 and #2
 - updated #3 to deal with CRC entries being emitted from assembler
 - added Rusty's ack (#3)

 Branch can be found here:
 https://git.kernel.org/cgit/linux/kernel/git/ardb/linux.git/log/?h=kcrctab-reloc

 [0] http://marc.info/?l=linux-kernel&m=147652300207369&w=2
 [1] http://marc.info/?l=linux-kernel&m=147695629614409&w=2
>>>
>>> Ping?
>>
>> Sorry, you didn't cc linuxppc-dev, so it's not in my patchwork list
>> which tends to mean I miss it.
>>
>
> Ah, my mistake. Apologies.
>
>> Will try and test and get back to you.
>>
>

Ping?


Re: [PATCH] scripts/kallsyms: remove last remnants of --page-offset option

2016-11-25 Thread Ard Biesheuvel
On 28 October 2016 at 18:09, Ard Biesheuvel  wrote:
> The implementation of the --page-offset kallsyms command line option has
> been removed, so remove it from the usage string as well.
>
> Signed-off-by: Ard Biesheuvel 
> ---
>  scripts/kallsyms.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
> index 1f22a186c18c..299b92ca1ae0 100644
> --- a/scripts/kallsyms.c
> +++ b/scripts/kallsyms.c
> @@ -76,7 +76,6 @@ static void usage(void)
>  {
> fprintf(stderr, "Usage: kallsyms [--all-symbols] "
> "[--symbol-prefix=] "
> -   "[--page-offset=] "
> "[--base-relative] < in.map > out.S\n");
> exit(1);
>  }
> --
> 2.7.4
>

Ping?


[PATCH 3.12 086/127] tcp: fix wrong checksum calculation on MTU probing

2016-11-25 Thread Jiri Slaby
From: Douglas Caetano dos Santos 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 2fe664f1fcf7c4da6891f95708a7a56d3c024354 ]

With TCP MTU probing enabled and offload TX checksumming disabled,
tcp_mtu_probe() calculated the wrong checksum when a fragment being copied
into the probe's SKB had an odd length. This was caused by the direct use
of skb_copy_and_csum_bits() to calculate the checksum, as it pads the
fragment being copied, if needed. When this fragment was not the last, a
subsequent call used the previous checksum without considering this
padding.

The effect was a stale connection in one way, as even retransmissions
wouldn't solve the problem, because the checksum was never recalculated for
the full SKB length.

Signed-off-by: Douglas Caetano dos Santos 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/ipv4/tcp_output.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f08921156be8..c807d5790ca1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1762,12 +1762,14 @@ static int tcp_mtu_probe(struct sock *sk)
len = 0;
tcp_for_write_queue_from_safe(skb, next, sk) {
copy = min_t(int, skb->len, probe_size - len);
-   if (nskb->ip_summed)
+   if (nskb->ip_summed) {
skb_copy_bits(skb, 0, skb_put(nskb, copy), copy);
-   else
-   nskb->csum = skb_copy_and_csum_bits(skb, 0,
-   skb_put(nskb, copy),
-   copy, nskb->csum);
+   } else {
+   __wsum csum = skb_copy_and_csum_bits(skb, 0,
+skb_put(nskb, copy),
+copy, 0);
+   nskb->csum = csum_block_add(nskb->csum, csum, len);
+   }
 
if (skb->len <= copy) {
/* We've eaten all the data from this skb.
-- 
2.10.2
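
Why an odd-length fragment breaks a running checksum can be shown in
userspace. The sketch below is not the kernel implementation: csum_partial16(),
csum_add_naive() and csum_add_at() are hypothetical helpers working on folded
16-bit big-endian sums, whereas the kernel works on 32-bit __wsum values. The
offset-aware variant is only analogous in spirit to the csum_block_add() call
the patch introduces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones' complement sum of buf as big-endian 16-bit words, assuming buf
 * starts at an even offset. An odd trailing byte is padded with zero.
 */
uint16_t csum_partial16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum = fold16(sum + (((uint32_t)buf[i] << 8) | buf[i + 1]));
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	return fold16(sum);
}

/* Naive combination: ignores where the block sits in the full buffer. */
uint16_t csum_add_naive(uint16_t total, uint16_t block)
{
	return fold16((uint32_t)total + block);
}

/* Offset-aware combination: swap the block's bytes if it starts at an odd offset. */
uint16_t csum_add_at(uint16_t total, uint16_t block, size_t offset)
{
	if (offset & 1)
		block = (uint16_t)((block << 8) | (block >> 8));
	return fold16((uint32_t)total + block);
}
```

For the 7-byte buffer 01 02 03 04 05 06 07 split after the third byte, the
naive combination does not match the sum over the whole buffer, while the
offset-aware one does. This is the same class of error the commit message
describes for a fragment that is padded but is not the last one.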



[PATCH 3.12 092/127] bridge: multicast: restore perm router ports on multicast enable

2016-11-25 Thread Jiri Slaby
From: Nikolay Aleksandrov 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 7cb3f9214dfa443c1ccc2be637dcc6344cc203f0 ]

Satish reported a problem with the perm multicast router ports not getting
re-enabled after a certain series of events. In particular, if multicast
snooping has been disabled and the port goes to the disabled state, it is
deleted from the router port list. If it then moves into a non-disabled
state, it is not re-added because mcast snooping is still disabled, and
enabling snooping later does nothing.

Here are the steps to reproduce. Set up br0 with snooping enabled and eth1
added as a perm router (multicast_router = 2):
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show


At this point we have mcast enabled and eth1 as a perm router (value = 2)
but it is not in the router list which is incorrect.

After this change:
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
router ports on br0: eth1

Note: we can directly do br_multicast_enable_port for all because the
querier timer already has checks for the port state and will simply
expire if it's in blocking/disabled. See the comment added by
commit 9aa66382163e7 ("bridge: multicast: add a comment to
br_port_state_selection about blocking state")

Fixes: 561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle")
Reported-by: Satish Ashok 
Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 net/bridge/br_multicast.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 91fed8147c39..edb0eee5caf7 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -911,20 +911,25 @@ static void br_multicast_enable(struct bridge_mcast_query *query)
mod_timer(&query->timer, jiffies);
 }
 
-void br_multicast_enable_port(struct net_bridge_port *port)
+static void __br_multicast_enable_port(struct net_bridge_port *port)
 {
struct net_bridge *br = port->br;
 
-   spin_lock(&br->multicast_lock);
if (br->multicast_disabled || !netif_running(br->dev))
-   goto out;
+   return;
 
br_multicast_enable(&port->ip4_query);
 #if IS_ENABLED(CONFIG_IPV6)
br_multicast_enable(&port->ip6_query);
 #endif
+}
 
-out:
+void br_multicast_enable_port(struct net_bridge_port *port)
+{
+   struct net_bridge *br = port->br;
+
+   spin_lock(&br->multicast_lock);
+   __br_multicast_enable_port(port);
spin_unlock(&br->multicast_lock);
 }
 
@@ -1954,8 +1959,9 @@ static void br_multicast_start_querier(struct net_bridge *br,
 
 int br_multicast_toggle(struct net_bridge *br, unsigned long val)
 {
-   int err = 0;
struct net_bridge_mdb_htable *mdb;
+   struct net_bridge_port *port;
+   int err = 0;
 
spin_lock_bh(&br->multicast_lock);
if (br->multicast_disabled == !val)
@@ -1983,10 +1989,9 @@ rollback:
goto rollback;
}
 
-   br_multicast_start_querier(br, &br->ip4_query);
-#if IS_ENABLED(CONFIG_IPV6)
-   br_multicast_start_querier(br, &br->ip6_query);
-#endif
+   br_multicast_open(br);
+   list_for_each_entry(port, &br->port_list, list)
+   __br_multicast_enable_port(port);
 
 unlock:
spin_unlock_bh(&br->multicast_lock);
-- 
2.10.2
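
The sequence of events in the commit message above can be replayed with a toy
state model. This is a sketch under heavy simplification, not the bridge code:
struct model_port, struct model_bridge and all model_* functions are
hypothetical, and the walk_ports argument merely selects between the behaviour
before the change (false) and after it (true).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model; not the kernel's data structures. */
struct model_port {
	bool perm_router;	/* corresponds to multicast_router = 2 */
	bool in_router_list;
};

struct model_bridge {
	bool snooping;
	struct model_port *ports;
	size_t nports;
};

/* Re-adds a perm router port, but only while snooping is enabled. */
static void model_enable_port(const struct model_bridge *br,
			      struct model_port *p)
{
	assert(br != NULL && p != NULL);
	if (!br->snooping)
		return;
	if (p->perm_router)
		p->in_router_list = true;
}

/* Taking the port down removes it from the router list. */
void model_port_down(struct model_port *p)
{
	p->in_router_list = false;
}

/* Bringing the port up tries to enable it again. */
void model_port_up(const struct model_bridge *br, struct model_port *p)
{
	model_enable_port(br, p);
}

/* Toggle snooping; with walk_ports set, every port is enabled again. */
void model_set_snooping(struct model_bridge *br, bool on, bool walk_ports)
{
	size_t i;

	br->snooping = on;
	if (on && walk_ports)
		for (i = 0; i < br->nports; i++)
			model_enable_port(br, &br->ports[i]);
}
```

Running steps 1 to 4 with walk_ports false leaves eth1 out of the router list;
with walk_ports true it is back in the list, which corresponds to the
"router ports on br0: eth1" output shown after the change.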



[PATCH 3.12 088/127] ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route

2016-11-25 Thread Jiri Slaby
From: Nikolay Aleksandrov 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below, the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid, which was copied from in_skb's portid.
Since the skb is new, the portid is 0 at that point, so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also, since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.}, at: [] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]   88005b403c50 813a7804 

[ 7858.214412]  81a1338e 88005b403c78 810a4a72 
81a1338e
[ 7858.214716]  026c  88005b403ca8 
810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412][] dump_stack+0x85/0xc1
[ 7858.215662]  [] ___might_sleep+0x192/0x250
[ 7858.215868]  [] __might_sleep+0x6f/0x100
[ 7858.216072]  [] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [] netlink_unicast+0x19c/0x260
[ 7858.216900]  [] rtnl_unicast+0x20/0x30
[ 7858.217128]  [] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [] call_timer_fn+0xa5/0x350
[ 7858.218192]  [] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [] run_timer_softirq+0x260/0x640
[ 7858.218865]  [] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [] __do_softirq+0xe8/0x54f
[ 7858.219269]  [] irq_exit+0xb8/0xc0
[ 7858.219463]  [] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897][] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [] default_idle+0x23/0x190
[ 7858.220574]  [] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [] default_idle_call+0x4c/0x60
[ 7858.221016]  [] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [] rest_init+0x135/0x140
[ 7858.221469]  [] start_kernel+0x50e/0x51b
[ 7858.221670]  [] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 include/linux/mroute.h  | 2 +-
 include/linux/mroute6.h | 2 +-
 net/ipv4/ipmr.c | 3 ++-
 net/ipv4/route.c| 3 ++-
 net/ipv6/ip6mr.c| 5 +++--
 net/ipv6/route.c| 4 +++-
 6 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/mroute.h b/include/linux/mroute.h
index 79aaa9fc1a15..d5277fc3ce2e 100644
--- a/include/linux/mroute.h
+++ b/include/linux/mroute.h
@@ -103,5 +103,5 @@ struct mfc_cache {
 struct rtmsg;
 extern int ipmr_get_route(struct net *net, struct sk_buff *skb,
  __be32 saddr, __be32 daddr,
- struct rtmsg *rtm, int nowait);
+ struct rtmsg *rtm, int nowait, u32 portid);
 #endif
diff --git a/include/linux/mroute6.h b/include/linux/mroute6.h
index 66982e764051..f831155dc7d1 100644
--- a/include/linux/mroute6.h
+++ b/include/linux/mroute6.h
@@ -115,7 +115,7 @@ struct mfc6_cache {
 
 struct rtmsg;
 extern int ip6mr_get_route(struct net *net, struct sk_buff *skb,
-  struct rtmsg *rtm, int nowait);
+  struct rtmsg *rtm, int nowait, u32 portid);
 
 #ifdef CONFIG_IPV6_MROUTE
 extern struct sock *mroute6_socket(struct net *net, struct sk_buff *skb);
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index dccda72bac62..5643a10da91d 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -2188,7 +2188,7 @@ static int __ipmr_fill_mroute(struct mr_table *mrt, struct sk_buff *skb,
 
 int ipmr_get_route(struct net *net, struct sk_buff *skb,
   __be32 saddr, __be32 daddr,
-  struct rtmsg *rtm, int nowait)
+  struct rtmsg *rtm, int nowait, u32 portid)
 {
struct mfc_cache *cache;
struct mr_table *mrt;
@@ -2233,6 +2233,7 @@ int ipmr_get_route(struct net *net, str

Re: [PATCH 6/10] mmc: sdhci-xenon: Add Marvell Xenon SDHC core functionality

2016-11-25 Thread Ziji Hu
Hi Ulf,

On 2016/11/24 23:00, Ziji Hu wrote:
> Hi Ulf,
> 
> On 2016/11/24 21:34, Ulf Hansson wrote:

> +
> +   /*
> +* Xenon Specific property:
> +* emmc: explicitly indicate whether this slot is for eMMC
> +* slotno: the index of slot. Refer to SDHC_SYS_CFG_INFO register
> +* tun-count: the interval between re-tuning
> +* PHY type: "sdhc phy", "emmc phy 5.0" or "emmc phy 5.1"
> +*/
> +   if (of_property_read_bool(np, "marvell,xenon-emmc"))
> +   priv->emmc_slot = true;

 So, you need this because of the eMMC voltage switch behaviour, right?

 Then I would rather describe this as a generic DT binding for
 the eMMC voltage level support. There have actually been some earlier
 discussions about this, but we haven't made any changes yet.

 I think what is missing is a mmc-ddr-3_3v DT binding, which when set,
 allows the host driver to accept I/O voltage switches to 3.3V. If not
 supported the  ->start_signal_voltage_switch() ops may return -EINVAL.
 This would inform the mmc core to move on to the next supported
 voltage level. There might be some minor additional changes to the mmc
 card initialization sequence, but those should be simple.

 I can help out to look into this, unless you want to do it yourself of course!?

>>>Yes. One of the reasons is to provide eMMC specific voltage setting.
>>>But in my very own opinion, it should be irrelevant to voltage level.
>>>The eMMC voltage setting on our SDHC is different from SD/SDIO voltage switch.
>>>It will become more complex with different SOC implementation details.
>>
>> Got it. Although I think we can cope with that fine just by using the
>> different SD/eMMC speed modes settings defined in DT (or from the
>> SDHCI caps register)
>>
> In my very opinion, I'm not sure if there is any corner case that driver cannot
> determine the eMMC card type from DT and SDHC caps.
> 
>>>Unfortunately, MMC driver cannot determine the card type yet when eMMC voltage
>>>setting should be executed.
>>>Thus a flag is required here to tell driver to execute eMMC voltage setting.
>>>
>>>Besides, additional eMMC specific settings might be implemented in future, besides
>>>voltage setting. Most of them should be completed before MMC driver recognizes the
>>>card type. Thus I have to keep this flag to indicate current SDHC is for eMMC.
>>
>> I doubt you will need a generic "eMMC" flag, but let's see when we go forward.
>>
>> Currently it's clear you don't need such a flag, so I will submit a
>> change adding a DT binding for "mmc-ddr-3_3v" then we can take it from
>> there, to see if it suits your needs.
>>

Another reason for a special "xenon-emmc" property is that our host IP can
usually support both eMMC and SD. Whether a host is used as eMMC or SD depends
on the final implementation of the actual product.
Thus our host driver needs to know whether the current SDHC is fixed as eMMC
or SD. So far, it can only get that information from DT.

After our host driver gets the card type information from DT, it can prepare
the eMMC specific voltage, set eMMC specific mmc->caps/caps2 flags and do other
vendor specific init, before the card init procedure.
Otherwise, our host driver has to wait until the card type is determined in
mmc_rescan().

A generic "eMMC" flag is unnecessary. I just require a private property,
which is only used in our host driver and DT.

Thank you.

Best regards,
Hu Ziji

> 
> Actually, our eMMC is usually fixed as 1.8V.
> 
> The pair "no-sd" + "no-sdio" can provide the similar information.
> But I'm not sure if it is proper to use those two property in such a way.
> 
> Thank you.
> 
> Best regards
> Hu Ziji
> 
>> [...]
>>
>> Kind regards
>> Uffe
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


[PATCH v17 02/15] clocksource/drivers/arm_arch_timer: Add a new enum for spi type

2016-11-25 Thread fu . wei
From: Fu Wei 

This patch adds a new enum "arch_timer_spi_nr" and uses it in the driver.
It only improves code readability; no functional change.

Signed-off-by: Fu Wei 
Acked-by: Mark Rutland 
---
 drivers/clocksource/arm_arch_timer.c | 4 ++--
 include/clocksource/arm_arch_timer.h | 6 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 21068be..63bb532 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -960,9 +960,9 @@ static int __init arch_timer_mem_init(struct device_node *np)
}
 
if (arch_timer_mem_use_virtual)
-   irq = irq_of_parse_and_map(best_frame, 1);
+   irq = irq_of_parse_and_map(best_frame, ARCH_TIMER_VIRT_SPI);
else
-   irq = irq_of_parse_and_map(best_frame, 0);
+   irq = irq_of_parse_and_map(best_frame, ARCH_TIMER_PHYS_SPI);
 
ret = -EINVAL;
if (!irq) {
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index 557f869..d23c381 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -46,6 +46,12 @@ enum arch_timer_ppi_nr {
MAX_TIMER_PPI
 };
 
+enum arch_timer_spi_nr {
+   ARCH_TIMER_PHYS_SPI,
+   ARCH_TIMER_VIRT_SPI,
+   ARCH_TIMER_MAX_TIMER_SPI
+};
+
 #define ARCH_TIMER_PHYS_ACCESS 0
 #define ARCH_TIMER_VIRT_ACCESS 1
 #define ARCH_TIMER_MEM_PHYS_ACCESS 2
-- 
2.9.3



[PATCH v17 03/15] clocksource/drivers/arm_arch_timer: Improve printk relevant code

2016-11-25 Thread fu . wei
From: Fu Wei 

This patch defines pr_fmt(fmt) for all pr_* functions,
so the pr_* calls no longer need to add "arch_timer:" every time.

According to the suggestions from checkpatch.pl:
(1) delete some blank spaces in arch_timer_banner;
(2) delete a redundant tab in a blank line of arch_timer_init(void).

No functional change.

Signed-off-by: Fu Wei 
Acked-by: Mark Rutland 
Tested-by: Xiongfeng Wang 
---
 drivers/clocksource/arm_arch_timer.c | 49 ++--
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 63bb532..15341cf 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -32,6 +32,9 @@
 
 #include 
 
+#undef pr_fmt
+#define pr_fmt(fmt) "arch_timer: " fmt
+
 #define CNTTIDR0x08
 #define CNTTIDR_VIRT(n)(BIT(1) << ((n) * 4))
 
@@ -504,22 +507,22 @@ arch_timer_detect_rate(void __iomem *cntbase, struct device_node *np)
 
/* Check the timer frequency. */
if (arch_timer_rate == 0)
-   pr_warn("Architected timer frequency not available\n");
+   pr_warn("frequency not available\n");
 }
 
 static void arch_timer_banner(unsigned type)
 {
-   pr_info("Architected %s%s%s timer(s) running at %lu.%02luMHz (%s%s%s).\n",
-type & ARCH_CP15_TIMER ? "cp15" : "",
-type == (ARCH_CP15_TIMER | ARCH_MEM_TIMER) ?  " and " : "",
-type & ARCH_MEM_TIMER ? "mmio" : "",
-(unsigned long)arch_timer_rate / 100,
-(unsigned long)(arch_timer_rate / 1) % 100,
-type & ARCH_CP15_TIMER ?
-(arch_timer_uses_ppi == VIRT_PPI) ? "virt" : "phys" :
+   pr_info("%s%s%s timer(s) running at %lu.%02luMHz (%s%s%s).\n",
+   type & ARCH_CP15_TIMER ? "cp15" : "",
+   type == (ARCH_CP15_TIMER | ARCH_MEM_TIMER) ?  " and " : "",
+   type & ARCH_MEM_TIMER ? "mmio" : "",
+   (unsigned long)arch_timer_rate / 100,
+   (unsigned long)(arch_timer_rate / 1) % 100,
+   type & ARCH_CP15_TIMER ?
+   (arch_timer_uses_ppi == VIRT_PPI) ? "virt" : "phys" :
"",
-type == (ARCH_CP15_TIMER | ARCH_MEM_TIMER) ?  "/" : "",
-type & ARCH_MEM_TIMER ?
+   type == (ARCH_CP15_TIMER | ARCH_MEM_TIMER) ?  "/" : "",
+   type & ARCH_MEM_TIMER ?
arch_timer_mem_use_virtual ? "virt" : "phys" :
"");
 }
@@ -618,8 +621,7 @@ static void __init arch_counter_register(unsigned type)
 
 static void arch_timer_stop(struct clock_event_device *clk)
 {
-   pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n",
-clk->irq, smp_processor_id());
+   pr_debug("disable IRQ%d cpu #%d\n", clk->irq, smp_processor_id());
 
disable_percpu_irq(arch_timer_ppi[arch_timer_uses_ppi]);
if (arch_timer_has_nonsecure_ppi())
@@ -712,8 +714,7 @@ static int __init arch_timer_register(void)
}
 
if (err) {
-   pr_err("arch_timer: can't register interrupt %d (%d)\n",
-  ppi, err);
+   pr_err("can't register interrupt %d (%d)\n", ppi, err);
goto out_free;
}
 
@@ -766,7 +767,7 @@ static int __init arch_timer_mem_register(void __iomem *base, unsigned int irq)
 
ret = request_irq(irq, func, IRQF_TIMER, "arch_mem_timer", &t->evt);
if (ret) {
-   pr_err("arch_timer: Failed to request mem timer irq\n");
+   pr_err("Failed to request mem timer irq\n");
kfree(t);
}
 
@@ -844,7 +845,7 @@ static int __init arch_timer_init(void)
}
 
if (!has_ppi) {
-   pr_warn("arch_timer: No interrupt available, giving up\n");
+   pr_warn("No interrupt available, giving up\n");
return -EINVAL;
}
}
@@ -858,7 +859,7 @@ static int __init arch_timer_init(void)
return ret;
 
arch_timer_kvm_info.virtual_irq = arch_timer_ppi[VIRT_PPI];
-   
+
return 0;
 }
 
@@ -867,7 +868,7 @@ static int __init arch_timer_of_init(struct device_node *np)
int i;
 
if (arch_timers_present & ARCH_CP15_TIMER) {
-   pr_warn("arch_timer: multiple nodes in dt, skipping\n");
+   pr_warn("multiple nodes in dt, skipping\n");
return 0;
}
 
@@ -911,7 +912,7 @@ static int __init arch_timer_mem_init(struct device_node *np)
arch_timers_present |= ARCH_MEM_TIMER;
cntctlbase = of_iomap(np, 0);
if (!cntctlbase) {
-   pr_err("arch_timer: Can't find CNTCTLBase\n");
+   pr_err("Can't find CNTCTLBase\n");
return -ENXIO;
}
 
@@
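
As an aside on the pr_fmt() change above, here is a minimal userspace sketch
of why the per-call "arch_timer: " prefixes become redundant. It is not kernel
code: demo_pr() and demo_pr_check() are hypothetical stand-ins for
pr_warn()/pr_err() that format into a buffer, and only the pr_fmt() definition
is taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the definition added by the patch. */
#define pr_fmt(fmt) "arch_timer: " fmt

/*
 * demo_pr() is a hypothetical userspace stand-in for pr_warn()/pr_err():
 * it formats into a buffer instead of writing to the kernel log.
 * ##__VA_ARGS__ is a GNU C extension, which the kernel also relies on.
 */
#define demo_pr(buf, size, fmt, ...) \
	snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* The call site carries no prefix; pr_fmt() pastes it in at compile time. */
static void demo_pr_check(void)
{
	char buf[64];

	demo_pr(buf, sizeof(buf), "can't register interrupt %d (%d)\n", 30, -22);
	assert(strcmp(buf, "arch_timer: can't register interrupt 30 (-22)\n") == 0);
}
```

Because the prefix is joined by string-literal concatenation, the format
argument has to be a literal, which is the case for every call touched by the
patch.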

[PATCH v17 00/15] acpi, clocksource: add GTDT driver and GTDT support in arm_arch_timer

2016-11-25 Thread fu . wei
From: Fu Wei 

This patchset:
(1)Preparation for adding GTDT support in arm_arch_timer:
1. Move some enums and macros to header file;
2. Add a new enum for spi type;
3. Improve printk relevant code;
4. Rename some enums and defines;
5. Rework PPI determination;
6. Rework counter frequency detection;
7. Refactor arch_timer_needs_probing, move it into DT init call
8. Introduce some new structs and refactor the MMIO timer init code
for reusing some common code.

(2)Introduce ACPI GTDT parser: drivers/acpi/arm64/acpi_gtdt.c
Parse all kinds of timer in GTDT table of ACPI:arch timer,
memory-mapped timer and SBSA Generic Watchdog timer.
This driver can help to simplify all the relevant timer drivers,
and separate all the ACPI GTDT knowledge from them.

(3)Simplify ACPI code for arm_arch_timer

(4)Add GTDT support for ARM memory-mapped timer.

This patchset has been tested on the following platforms with ACPI enabled:
(1)ARM Foundation v8 model

Changelog:
v17: https://lkml.org/lkml/2016/11/25/
 Take out some cleanups from 4/15.
 Merge 5/15 and 6/15, improve PPI determination code,
 improve commit message.
 Rework counter frequency detection.
 Move arch_timer_needs_of_probing into DT init call.
 Move Platform Timer scan loop back to timer init call to avoid allocating
 and freeing memory.
 Improve the comments of all exported functions.

v16: https://lkml.org/lkml/2016/11/16/268
 Fix patchset problem about static enum ppi_nr of 01/13 in v15.
 Refactor arch_timer_detect_rate.
 Refactor arch_timer_needs_probing.

v15: https://lkml.org/lkml/2016/11/15/366
 Re-order patches
 Add arm_arch_timer refactoring patches to prepare for GTDT:
 1. rename some  enums and defines, and some cleanups
 2. separate out arch_timer_uses_ppi init code and fix a potential bug
 3. Improve some new structs, refactor the timer init code.
 Since some structs have been changed, the GTDT parsers for memory-mapped
 timer and SBSA Generic Watchdog timer have been updated.

v14: https://lkml.org/lkml/2016/9/28/573
 Separate memory-mapped timer GTDT support into two patches
 1. Refactor the timer init code to prepare for GTDT
 2. Add GTDT support for memory-mapped timer

v13: http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1231717.html
 Improve arm_arch_timer code for memory-mapped
 timer GTDT support, refactor original memory-mapped timer
 dt support for reusing some common code.

v12: https://lkml.org/lkml/2016/9/13/250
 Rebase to latest Linux 4.8-rc6
 Delete the confusing "skipping" in the error message.

V11: https://lkml.org/lkml/2016/9/6/354
 Rebase to latest Linux 4.8-rc5
 Delete typedef (suggested by checkpatch.pl)

V10: https://lkml.org/lkml/2016/7/26/215
 Drop the "readq" patch.
 Rebase to latest Linux 4.7.

V9: https://lkml.org/lkml/2016/7/25/345
Improve pr_err message in acpi gtdt driver.
Update Commit message for 7/9
shorten the irq mapping function name
Improve GTDT driver for memory-mapped timer

v8: https://lkml.org/lkml/2016/7/19/660
Improve "pr_fmt(fmt)" definition: add "ACPI" in front of "GTDT",
and also improve printk message.
Simplify is_timer_block and is_watchdog.
Merge acpi_gtdt_desc_init and gtdt_arch_timer_init into acpi_gtdt_init();
Delete __init in include/linux/acpi.h for GTDT API
Make ARM64 select GTDT.
Delete "#include " from acpi_gtdt.c
Simplify GT block parse code.

v7: https://lkml.org/lkml/2016/7/13/769
Move the GTDT driver to drivers/acpi/arm64
Add the ARM64-specific ACPI Support maintainers in MAINTAINERS
Merge 3 patches of GTDT parser driver.
Fix the for_each_platform_timer bug.

v6: https://lkml.org/lkml/2016/6/29/580
split the GTDT driver to 4 parts: basic, arch_timer, memory-mapped timer,
and SBSA Generic Watchdog timer
Improve driver by suggestions and example code from Daniel Lezcano

v5: https://lkml.org/lkml/2016/5/24/356
Sorting out all patches, simplify the API of GTDT driver:
GTDT driver just fills the data struct for arm_arch_timer driver.

v4: https://lists.linaro.org/pipermail/linaro-acpi/2016-March/006667.html
Delete the kvm relevant patches
Separate two patches for sorting out the code for arm_arch_timer.
Improve irq info export code to allow missing irq info in GTDT table.

v3: https://lkml.org/lkml/2016/2/1/658
Improve GTDT driver code:
  (1)improve pr_* by defining pr_fmt(fmt)
  (2)simplify gtdt_sbsa_gwdt_init
  (3)improve gtdt_arch_timer_data_init, if table is NULL, it will try
  to get GTDT table.
Move enum ppi_nr to arm_arch_timer.h, and add enum spi_nr.
Add arm_arch_timer get ppi from DT and GTDT support for kvm.

v2: https://lkml.org/lkml/2015/12/2/10
Rebase to latest kernel version(4.4-rc3).
Fix the bug ab

RE: [PATCH V5 3/3] ARM64 LPC: LPC driver implementation on Hip06

2016-11-25 Thread Gabriele Paoloni
Hi Arnd

Many thanks for your contribution, much appreciated

I have some comments...see inline below 

> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: 23 November 2016 23:23
> To: linux-arm-ker...@lists.infradead.org
> Cc: Gabriele Paoloni; mark.rutl...@arm.com; catalin.mari...@arm.com;
> linux-...@vger.kernel.org; liviu.du...@arm.com; Linuxarm;
> lorenzo.pieral...@arm.com; xuwei (O); Jason Gunthorpe; Thomas
> Petazzoni; linux-ser...@vger.kernel.org; b...@kernel.crashing.org;
> devicet...@vger.kernel.org; miny...@acm.org; will.dea...@arm.com; John
> Garry; o...@lixom.net; robh...@kernel.org; bhelgaas@google.com;
> kant...@163.com; zhichang.yua...@gmail.com; linux-
> ker...@vger.kernel.org; Yuanzhichang; zourongr...@gmail.com
> Subject: Re: [PATCH V5 3/3] ARM64 LPC: LPC driver implementation on
> Hip06
> 
> On Wednesday, November 23, 2016 6:07:11 PM CET Arnd Bergmann wrote:
> > On Wednesday, November 23, 2016 3:22:33 PM CET Gabriele Paoloni
> wrote:
> > > From: Arnd Bergmann [mailto:a...@arndb.de]
> > > > On Friday, November 18, 2016 5:03:11 PM CET Gabriele Paoloni
> wrote:
> >
> > Please don't proliferate the use of
> > pci_pio_to_address/pci_address_to_pio here, computing the physical
> > address from the logical address is trivial, you just need to
> > subtract the start of the range that you already use when matching
> > the port number range.
> >
> > The only thing we need here is to make of_address_to_resource()
> > return the correct logical port number that was registered for
> > a given host device when asked to translate an address that
> > does not have a CPU address associated with it.
> 
> Ok, I admit this was a little harder than I expected, but see below
> for a rough outline of how I think it can be done.
> 
> This makes it possible to translate bus specific I/O port numbers
> from device nodes into Linux port numbers, and gives a way to register
> them. We could take this further and completely remove
> pci_pio_to_address and pci_address_to_pio if we make the I/O port
> translation always go through the io_range list, looking up the
> hostbridge by fwnode, but we don't have to do that now.
> 
> The patch is completely untested and probably buggy, it just seemed
> easier to put out a prototype than to keep going in circles with the
> discussion.
> 
> Signed-off-by: Arnd Bergmann 
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index bf601d4df8cf..6cadf0501bb0 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -730,7 +730,8 @@ static void acpi_pci_root_validate_resources(struct
> device *dev,
>   }
>  }
> 
> -static void acpi_pci_root_remap_iospace(struct resource_entry *entry)
> +static void acpi_pci_root_remap_iospace(struct fwnode_handle *node,
> + struct resource_entry *entry)
>  {
>  #ifdef PCI_IOBASE
>   struct resource *res = entry->res;
> @@ -739,11 +740,7 @@ static void acpi_pci_root_remap_iospace(struct
> resource_entry *entry)
>   resource_size_t length = resource_size(res);
>   unsigned long port;
> 
> - if (pci_register_io_range(cpu_addr, length))
> - goto err;
> -
> - port = pci_address_to_pio(cpu_addr);
> - if (port == (unsigned long)-1)
> + if (pci_register_io_range(node, cpu_addr, length, &port))
>   goto err;
> 
>   res->start = port;
> @@ -781,7 +778,8 @@ int acpi_pci_probe_root_resources(struct
> acpi_pci_root_info *info)
>   else {
>   resource_list_for_each_entry_safe(entry, tmp, list) {
>   if (entry->res->flags & IORESOURCE_IO)
> - acpi_pci_root_remap_iospace(entry);
> + acpi_pci_root_remap_iospace(&device->fwnode,
> + entry);
> 
>   if (entry->res->flags & IORESOURCE_DISABLED)
>   resource_list_destroy_entry(entry);
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index a50025a3777f..df96955a43f8 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -760,8 +760,10 @@ static int __nbd_ioctl(struct block_device *bdev,
> struct nbd_device *nbd,
>   set_bit(NBD_RUNNING, &nbd->runtime_flags);
>   blk_mq_update_nr_hw_queues(&nbd->tag_set, nbd-
> >num_connections);
>   args = kcalloc(num_connections, sizeof(*args), GFP_KERNEL);
> - if (!args)
> + if (!args) {
> + error = -ENOMEM;
>   goto out_err;
> + }
>   nbd->task_recv = current;
>   mutex_unlock(&nbd->config_lock);
> 
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 02b2903fe9d2..5decaba96eed 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -2,6 +2,7 @@
>  #define pr_fmt(fmt)  "OF: " fmt
> 
>  #include 
> +#include 
>  #include 
>  #include 
>  #inclu

[PATCH 3.12 096/127] include/stddef.h: Move offsetofend() from vfio.h to a generic kernel header

2016-11-25 Thread Jiri Slaby
From: Denys Vlasenko 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

commit 3876488444e71238e287459c39d7692b6f718c3e upstream.

Suggested by Andy.

Suggested-by: Andy Lutomirski 
Signed-off-by: Denys Vlasenko 
Acked-by: Linus Torvalds 
Cc: Alexei Starovoitov 
Cc: Borislav Petkov 
Cc: Frederic Weisbecker 
Cc: H. Peter Anvin 
Cc: Kees Cook 
Cc: Oleg Nesterov 
Cc: Steven Rostedt 
Cc: Will Drewry 
Link: http://lkml.kernel.org/r/1425912738-559-1-git-send-email-dvlas...@redhat.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Jiri Slaby 
---
 include/linux/stddef.h |  9 +
 include/linux/vfio.h   | 13 -
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index f4aec0e75c3a..076af437284d 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -19,3 +19,12 @@ enum {
 #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
 #endif
 #endif
+
+/**
+ * offsetofend(TYPE, MEMBER)
+ *
+ * @TYPE: The type of the structure
+ * @MEMBER: The member within the structure to get the end offset of
+ */
+#define offsetofend(TYPE, MEMBER) \
+   (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 43f6bf4f8585..9131a4bf5c3e 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -76,19 +76,6 @@ extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
 extern void vfio_unregister_iommu_driver(
const struct vfio_iommu_driver_ops *ops);
 
-/**
- * offsetofend(TYPE, MEMBER)
- *
- * @TYPE: The type of the structure
- * @MEMBER: The member within the structure to get the end offset of
- *
- * Simple helper macro for dealing with variable sized structures passed
- * from user space.  This allows us to easily determine if the provided
- * structure is sized to include various fields.
- */
-#define offsetofend(TYPE, MEMBER) \
-   (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))
-
 /*
  * External user API
  */
-- 
2.10.2
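
For readers unfamiliar with offsetofend(), the comment removed from vfio.h
describes its use: checking whether a variable-sized structure passed from
user space is large enough to include a given field. A minimal userspace
sketch follows. The macro is copied from the patch; struct demo_info and the
demo_* helpers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the patch: offset of the first byte past MEMBER. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical ABI structure that gained a field in a later version. */
struct demo_info {
	uint32_t argsz;	/* size the caller says it provided */
	uint32_t flags;
	uint64_t extra;	/* added later; older callers do not pass it */
};

/* Nonzero if the caller-supplied size covers everything up to 'flags'. */
static int demo_has_flags(uint32_t argsz)
{
	return argsz >= offsetofend(struct demo_info, flags);
}

/* Nonzero if the caller-supplied size also covers 'extra'. */
static int demo_has_extra(uint32_t argsz)
{
	return argsz >= offsetofend(struct demo_info, extra);
}
```

An old caller passing only argsz and flags satisfies demo_has_flags() but not
demo_has_extra(), so the callee can tell which fields are safe to read.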



[PATCH 3.12 100/127] ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()

2016-11-25 Thread Jiri Slaby
From: Eli Cooper 

3.12-stable review patch.  If anyone has any objections, please let me know.

===

[ Upstream commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 ]

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.

Cc: sta...@vger.kernel.org
Signed-off-by: Eli Cooper 
Signed-off-by: David S. Miller 
Signed-off-by: Jiri Slaby 
---
 include/net/ip6_tunnel.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 6d1549c4893c..e6f0917d1ab5 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -75,6 +75,7 @@ static inline void ip6tunnel_xmit(struct sk_buff *skb, struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
int pkt_len, err;
 
+   memset(skb->cb, 0, sizeof(struct inet6_skb_parm));
pkt_len = skb->len;
err = ip6_local_out(skb);
 
-- 
2.10.2
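
The failure mode described in this commit message can be sketched in plain C.
This is only an illustration under simplified assumptions: struct demo_skb and
struct demo_ip6_parm are hypothetical stand-ins, not the kernel's sk_buff or
inet6_skb_parm layouts, and the 0xab fill merely represents whatever an
earlier layer left behind in the control buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the kernel's real layouts. */
struct demo_skb {
	unsigned char cb[48];	/* scratch area reused by each layer */
};

struct demo_ip6_parm {
	int iif;
	unsigned short flags;
	unsigned short frag_max_size;
};

/* Read frag_max_size the way a later layer would interpret cb. */
static unsigned short demo_frag_max_size(const struct demo_skb *skb)
{
	struct demo_ip6_parm parm;

	memcpy(&parm, skb->cb, sizeof(parm));
	return parm.frag_max_size;
}

/* An earlier layer leaves its own private data behind in cb. */
static void demo_previous_layer(struct demo_skb *skb)
{
	memset(skb->cb, 0xab, sizeof(skb->cb));
}

/* Mirrors the fix: clear the part of cb that the IPv6 layer will read. */
static void demo_tunnel_xmit_prepare(struct demo_skb *skb)
{
	memset(skb->cb, 0, sizeof(struct demo_ip6_parm));
}
```

Without the clearing step, demo_frag_max_size() returns leftover bytes, which
is the kind of value that could be mistaken for a fragmentation limit.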



[PATCH v17 05/15] clocksource/drivers/arm_arch_timer: rework PPI determination

2016-11-25 Thread fu . wei
From: Fu Wei 

Currently, the arch timer driver uses ARCH_TIMER_PHYS_SECURE_PPI to
mean the driver will use the secure PPI *and* potentially also use the
non-secure PPI. This is somewhat confusing.

For arm64, where it never makes sense to use the secure PPI, this
means we must always request the useless secure PPI, adding to the
confusion. For ACPI, where we may not even have a valid secure PPI
number, this is additionally problematic. We need the driver to be
able to use *only* the non-secure PPI.

The logic to choose which PPI to use is intertwined with other logic
in arch_timer_init(). This patch factors the PPI determination out
into a new function named arch_timer_select_ppi, and then reworks it
so that we can handle having only a non-secure PPI.

This patch also moves the arch_timer_ppi verification out to the caller,
because that way we can verify the configuration from the device tree for ARM.

Meanwhile, because we will select ARCH_TIMER_PHYS_NONSECURE_PPI for ARM64,
the logic in arch_timer_register also needs to be updated.

Signed-off-by: Fu Wei 
---
 drivers/clocksource/arm_arch_timer.c | 77 +---
 1 file changed, 46 insertions(+), 31 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 231175b..e43be0a 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -699,7 +699,7 @@ static int __init arch_timer_register(void)
case ARCH_TIMER_PHYS_NONSECURE_PPI:
err = request_percpu_irq(ppi, arch_timer_handler_phys,
 "arch_timer", arch_timer_evt);
-   if (!err && arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI]) {
+   if (!err && arch_timer_has_nonsecure_ppi()) {
ppi = arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI];
err = request_percpu_irq(ppi, arch_timer_handler_phys,
 "arch_timer", arch_timer_evt);
@@ -821,39 +821,41 @@ static int __init arch_timer_common_init(void)
return arch_timer_arch_init();
 }
 
-static int __init arch_timer_init(void)
+/**
+ * arch_timer_select_ppi() - Select suitable PPI for the current system.
+ *
+ * If HYP mode is available, we know that the physical timer
+ * has been configured to be accessible from PL1. Use it, so
+ * that a guest can use the virtual timer instead.
+ *
+ * On ARMv8.1 with VH extensions, the kernel runs in HYP. VHE
+ * accesses to CNTP_*_EL1 registers are silently redirected to
+ * their CNTHP_*_EL2 counterparts, and use a different PPI
+ * number.
+ *
+ * If no interrupt provided for virtual timer, we'll have to
+ * stick to the physical timer. It'd better be accessible...
+ * For arm64 we never use the secure interrupt.
+ *
+ * Return: a suitable PPI type for the current system.
+ */
+static enum arch_timer_ppi_nr __init arch_timer_select_ppi(void)
 {
-   int ret;
-   /*
-* If HYP mode is available, we know that the physical timer
-* has been configured to be accessible from PL1. Use it, so
-* that a guest can use the virtual timer instead.
-*
-* If no interrupt provided for virtual timer, we'll have to
-* stick to the physical timer. It'd better be accessible...
-*
-* On ARMv8.1 with VH extensions, the kernel runs in HYP. VHE
-* accesses to CNTP_*_EL1 registers are silently redirected to
-* their CNTHP_*_EL2 counterparts, and use a different PPI
-* number.
-*/
-   if (is_hyp_mode_available() || !arch_timer_ppi[ARCH_TIMER_VIRT_PPI]) {
-   bool has_ppi;
+   if (is_hyp_mode_available() && is_kernel_in_hyp_mode())
+   return ARCH_TIMER_HYP_PPI;
 
-   if (is_kernel_in_hyp_mode()) {
-   arch_timer_uses_ppi = ARCH_TIMER_HYP_PPI;
-   has_ppi = !!arch_timer_ppi[ARCH_TIMER_HYP_PPI];
-   } else {
-   arch_timer_uses_ppi = ARCH_TIMER_PHYS_SECURE_PPI;
-   has_ppi = (!!arch_timer_ppi[ARCH_TIMER_PHYS_SECURE_PPI] ||
-  !!arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI]);
-   }
+   if (arch_timer_ppi[ARCH_TIMER_VIRT_PPI])
+   return ARCH_TIMER_VIRT_PPI;
 
-   if (!has_ppi) {
-   pr_warn("No interrupt available, giving up\n");
-   return -EINVAL;
-   }
-   }
+   if (IS_ENABLED(CONFIG_ARM64))
+   return ARCH_TIMER_PHYS_NONSECURE_PPI;
+
+   return ARCH_TIMER_PHYS_SECURE_PPI;
+}
+
+static int __init arch_timer_init(void)
+{
+   int ret;
 
ret = arch_timer_register();
if (ret)
@@ -901,6 +903,13 @@ static int __init arch_timer_of_init(struct device_node *np)
if (IS_ENABLED(CONFIG_ARM) &&
of_property_read_bool(np, "arm,cpu-registers-not-fw-configured"))
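
The order of checks in the new arch_timer_select_ppi() can be followed more
easily as a small truth-table exercise. The sketch below is a userspace model
only: enum demo_ppi, struct demo_system and demo_select_ppi() are hypothetical
names, and the booleans stand in for is_hyp_mode_available(),
is_kernel_in_hyp_mode(), a non-zero arch_timer_ppi[ARCH_TIMER_VIRT_PPI] and
IS_ENABLED(CONFIG_ARM64). It mirrors the diff as posted.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_ppi {
	DEMO_PHYS_SECURE_PPI,
	DEMO_PHYS_NONSECURE_PPI,
	DEMO_VIRT_PPI,
	DEMO_HYP_PPI,
};

/* Hypothetical snapshot of the inputs the real function queries. */
struct demo_system {
	bool hyp_mode_available;
	bool kernel_in_hyp_mode;
	bool has_virt_irq;
	bool is_arm64;
};

/* Same order of checks as the posted arch_timer_select_ppi(). */
static enum demo_ppi demo_select_ppi(const struct demo_system *s)
{
	if (s->hyp_mode_available && s->kernel_in_hyp_mode)
		return DEMO_HYP_PPI;

	if (s->has_virt_irq)
		return DEMO_VIRT_PPI;

	if (s->is_arm64)
		return DEMO_PHYS_NONSECURE_PPI;

	return DEMO_PHYS_SECURE_PPI;
}
```

One case may deserve a second look: with HYP mode available, the kernel not
running in HYP and a virtual interrupt present, this order selects the virtual
PPI, whereas in the removed code is_hyp_mode_available() alone was enough to
enter the physical-timer branch.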
 

[PATCH v3 2/3] zram: revalidate disk under init_lock

2016-11-25 Thread Minchan Kim
[1] moved the revalidate_disk call out of init_lock to avoid a lockdep
false-positive splat. However, [2] removed init_lock from the IO path,
so there is no longer any worry about a lockdep splat. So, let's restore it.
This patch is needed to set BDI_CAP_STABLE_WRITES atomically in the
next patch.

[1] b4c5c60920e3: zram: avoid lockdep splat by revalidate_disk
[2] 08eee69fcf6b: zram: remove init_lock in zram_make_request

Fixes: da9556a2367c ("zram: user per-cpu compression streams")
Cc: sta...@vger.kernel.org
Signed-off-by: Minchan Kim 
---
 drivers/block/zram/zram_drv.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 5163c8f918cb..d93a4b2135c2 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1094,14 +1094,8 @@ static ssize_t disksize_store(struct device *dev,
zram->comp = comp;
zram->disksize = disksize;
set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
-   up_write(&zram->init_lock);
-
-   /*
-* Revalidate disk out of the init_lock to avoid lockdep splat.
-* It's okay because disk's capacity is protected by init_lock
-* so that revalidate_disk always sees up-to-date capacity.
-*/
revalidate_disk(zram->disk);
+   up_write(&zram->init_lock);
 
return len;
 
-- 
2.7.4



[PATCH v3 1/3] mm: support anonymous stable page

2016-11-25 Thread Minchan Kim
During development of zram-swap asynchronous writeback, I found a strange
corruption of a compressed page.

Modules linked in: zram(E)
CPU: 3 PID: 1520 Comm: zramd-1 Tainted: GE   4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: 88007620b840 task.stack: 88007809
RIP: 0010:[]  [] set_freeobj.part.43+0x1c/0x1f
RSP: 0018:880078093ca8  EFLAGS: 00010246
RAX: 0018 RBX: 880076798d88 RCX: 81c408c8
RDX: 0018 RSI:  RDI: 0246
RBP: 880078093cb0 R08:  R09: 
R10: 88005bc43030 R11: 1df3 R12: 880076798d88
R13: 0005bc43 R14: 88007819d1b8 R15: 0001
FS:  () GS:88007e38() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fc934048f20 CR3: 77b01000 CR4: 000406e0
Stack:
 ea00016f10c0 880078093d08 811d43cb 88005bc43030
 88005bc43030 0001 88007aa17f68 88007819d1b8
 0200020a 02000200 880076798d88 88007aa17f68
Call Trace:
 [] obj_malloc+0x22b/0x260
 [] zs_malloc+0x1e4/0x580
 [] ? lz4_compress_crypto+0x30/0x50
 [] zram_bvec_rw+0x4cd/0x830 [zram]
 [] page_requests_rw+0x9c/0x130 [zram]
 [] ? page_requests_rw+0x130/0x130 [zram]
 [] zram_thread+0xe6/0x173 [zram]
 [] ? wake_atomic_t_function+0x60/0x60
 [] kthread+0xca/0xe0
 [] ? kthread_park+0x60/0x60
 [] ret_from_fork+0x25/0x30

Investigation revealed that stable pages currently do not support
anonymous pages.  IOW, reuse_swap_page can reuse the page without waiting
for writeback completion, so it can overwrite a page that zram is compressing.

Unfortunately, zram has used the per-cpu stream feature since v4.7.
It aims to increase the cache hit ratio of the scratch buffer used for
compression. The downside of that approach is that zram must ask for
memory space for the compressed page in per-cpu context, which requires
a stricter gfp flag, so the allocation can fail. If it does, zram retries
the allocation outside of per-cpu context, where it can get the memory,
then compresses the data again and copies it to that memory space.

In this scenario, zram assumes the data never changes, but that is not
true without stable page support. So, if the data is changed under us,
zram can cause a buffer overrun, because the second compression can
produce a larger size than the one we got in the previous attempt, and
zram blindly copies the bigger object into the smaller buffer. The
overrun breaks zsmalloc's free object chaining, so the system crashes
as shown above.

I think the report below is the same problem.
https://bugzilla.suse.com/show_bug.cgi?id=997574

Unfortunately, reuse_swap_page must be atomic, so we cannot wait on
writeback there. The approach in this patch is therefore simply to return
false if we find that the page needs to be stable.  Although this increases
the memory footprint temporarily, it happens rarely and the memory should be
reclaimed easily even when it does.  Also, it is better than waiting for IO
completion, which is on the critical path for application latency.

Fixes: da9556a2367c ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
Signed-off-by: Minchan Kim 
Acked-by: Hugh Dickins 
Cc: linux...@kvack.org
Cc: Darrick J. Wong 
Cc: sta...@vger.kernel.org
---
 include/linux/swap.h |  3 ++-
 mm/swapfile.c| 20 +++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 09f4be179ff3..7f47b7098b1b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -150,8 +150,9 @@ enum {
SWP_FILE= (1 << 7), /* set after swap_activate success */
SWP_AREA_DISCARD = (1 << 8),/* single-time swap area discards */
SWP_PAGE_DISCARD = (1 << 9),/* freed swap page-cluster discards */
+   SWP_STABLE_WRITES = (1 << 10),  /* no overwrite PG_writeback pages */
/* add others here before... */
-   SWP_SCANNING= (1 << 10),/* refcount in scan_swap_map */
+   SWP_SCANNING= (1 << 11),/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/swapfile.c b/mm/swapfile.c
index f30438970cd1..d76b2a18f044 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -943,11 +943,25 @@ bool reuse_swap_page(struct page *page, int *total_mapcount)
count = page_trans_huge_mapcount(page, total_mapcount);
if (count <= 1 && PageSwapCache(page)) {
count += page_swapcount(page);
-   if (count == 1 && !PageWriteback(page)) {
+   if (count != 1)
+   goto out;
+   if (!PageWriteback(page)) {
delete_from_swap_cache(page);
SetPageDirty(page);
+   } else {
+   swp_entry_t entry;
+   

[PATCH 1/1] net: macb: fix the RX queue reset in macb_rx()

2016-11-25 Thread Cyrille Pitchen
On macb only (not gem), when a RX queue corruption was detected from
macb_rx(), the RX queue was reset: during this process the RX ring
buffer descriptor was initialized by macb_init_rx_ring() but we forgot
to also set bp->rx_tail to 0.

Indeed, when processing the received frames, bp->rx_tail provides the
macb driver with the index in the RX ring buffer of the next buffer to
process. So when the whole ring buffer is reset we must also reset
bp->rx_tail so the driver is synchronized again with the hardware.

Since macb_init_rx_ring() is called from many locations, currently from
macb_rx() and macb_init_rings(), we'd rather add the "bp->rx_tail = 0;"
line inside macb_init_rx_ring() than add the very same line after each
call of this function.

Without this fix, the rx queue is not reset properly to recover from
queue corruption and connection drop may occur.

Signed-off-by: Cyrille Pitchen 
Fixes: 9ba723b081a2 ("net: macb: remove BUG_ON() and reset the queue to handle RX errors")
---
 drivers/net/ethernet/cadence/macb.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 0e489bb82456..8ee303b8da08 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -991,6 +991,7 @@ static inline void macb_init_rx_ring(struct macb *bp)
addr += bp->rx_buffer_size;
}
bp->rx_ring[bp->rx_ring_size - 1].addr |= MACB_BIT(RX_WRAP);
+   bp->rx_tail = 0;
 }
 
 static int macb_rx(struct macb *bp, int budget)
@@ -1736,8 +1737,6 @@ static void macb_init_rings(struct macb *bp)
bp->queues[0].tx_head = 0;
bp->queues[0].tx_tail = 0;
bp->queues[0].tx_ring[bp->tx_ring_size - 1].ctrl |= MACB_BIT(TX_WRAP);
-
-   bp->rx_tail = 0;
 }
 
 static void macb_reset_hw(struct macb *bp)
-- 
2.7.4
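
To illustrate why the tail index has to be reset together with the ring, here
is a small userspace model. It is a sketch under simplified assumptions, not
the macb driver: struct demo_dev, the DEMO_* bits and the demo_* functions are
hypothetical, and the "hardware" is simulated by setting a used bit by hand.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE	8
#define DEMO_RX_USED	0x1u	/* set by "hardware" when a buffer is filled */
#define DEMO_RX_WRAP	0x2u	/* marks the last descriptor of the ring */

struct demo_desc {
	uint32_t addr;
};

struct demo_dev {
	struct demo_desc rx_ring[DEMO_RING_SIZE];
	unsigned int rx_tail;	/* next descriptor the driver will process */
};

/* Rebuild every descriptor and, as in the fix, restart the tail index. */
static void demo_init_rx_ring(struct demo_dev *bp)
{
	uint32_t addr = 0x1000;
	int i;

	for (i = 0; i < DEMO_RING_SIZE; i++) {
		bp->rx_ring[i].addr = addr;
		addr += 0x80;
	}
	bp->rx_ring[DEMO_RING_SIZE - 1].addr |= DEMO_RX_WRAP;
	bp->rx_tail = 0;
}

/* Process filled buffers starting at rx_tail; return how many were handled. */
static int demo_rx(struct demo_dev *bp)
{
	int received = 0;

	while (bp->rx_ring[bp->rx_tail % DEMO_RING_SIZE].addr & DEMO_RX_USED) {
		bp->rx_ring[bp->rx_tail % DEMO_RING_SIZE].addr &= ~DEMO_RX_USED;
		bp->rx_tail++;
		received++;
	}
	return received;
}
```

After a reset, the simulated hardware starts filling from descriptor 0 again.
If rx_tail still pointed at an old position, demo_rx() would look at an empty
descriptor and report nothing, which corresponds to the loss of
synchronization described above.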



[PATCH v17 13/15] acpi/arm64: Add memory-mapped timer support in GTDT driver

2016-11-25 Thread fu . wei
From: Fu Wei 

On platforms booting with ACPI, architected memory-mapped timers'
configuration data is provided by firmware through the ACPI GTDT
static table.

The clocksource architected timer kernel driver requires a firmware
interface to collect timer configuration and configure its driver.
This infrastructure is present for device tree systems, but it is
missing on systems booting with ACPI.

Implement the kernel infrastructure required to parse the static
ACPI GTDT table so that the architected timer clocksource driver can
make use of it on systems booting with ACPI, therefore enabling
the corresponding timers configuration.

Signed-off-by: Fu Wei 
Signed-off-by: Hanjun Guo 
---
 drivers/acpi/arm64/gtdt.c | 124 ++
 include/linux/acpi.h  |   1 +
 2 files changed, 125 insertions(+)

diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
index d93a790..91ea6cb 100644
--- a/drivers/acpi/arm64/gtdt.c
+++ b/drivers/acpi/arm64/gtdt.c
@@ -37,6 +37,28 @@ struct acpi_gtdt_descriptor {
 
 static struct acpi_gtdt_descriptor acpi_gtdt_desc __initdata;
 
+static inline void *next_platform_timer(void *platform_timer)
+{
+   struct acpi_gtdt_header *gh = platform_timer;
+
+   platform_timer += gh->length;
+   if (platform_timer < acpi_gtdt_desc.gtdt_end)
+   return platform_timer;
+
+   return NULL;
+}
+
+#define for_each_platform_timer(_g)\
+   for (_g = acpi_gtdt_desc.platform_timer; _g;\
+_g = next_platform_timer(_g))
+
+static inline bool is_timer_block(void *platform_timer)
+{
+   struct acpi_gtdt_header *gh = platform_timer;
+
+   return gh->type == ACPI_GTDT_TYPE_TIMER_BLOCK;
+}
+
 static int __init map_gt_gsi(u32 interrupt, u32 flags)
 {
int trigger, polarity;
@@ -155,3 +177,105 @@ int __init acpi_gtdt_init(struct acpi_table_header *table,
 
return ret;
 }
+
+static int __init gtdt_parse_timer_block(struct acpi_gtdt_timer_block *block,
+struct arch_timer_mem *data)
+{
+   int i;
+   struct acpi_gtdt_timer_entry *frame;
+
+   if (!block->timer_count) {
+   pr_err(FW_BUG "GT block present, but frame count is zero.\n");
+   return -ENODEV;
+   }
+
+   if (block->timer_count > ARCH_TIMER_MEM_MAX_FRAMES) {
+   pr_err(FW_BUG "GT block lists %d frames, ACPI spec only allows 8\n",
+  block->timer_count);
+   return -EINVAL;
+   }
+
+   data->cntctlbase = (phys_addr_t)block->block_address;
+   /*
+* According to "Table * CNTCTLBase memory map" of
+*  for ARMv8,
+* The size of the CNTCTLBase frame is 4KB(Offset 0x000 – 0xFFC).
+*/
+   data->size = SZ_4K;
+   data->num_frames = block->timer_count;
+
+   frame = (void *)block + block->timer_offset;
+   if (frame + block->timer_count != (void *)block + block->header.length)
+   return -EINVAL;
+
+   /*
+* Get the GT timer Frame data for every GT Block Timer
+*/
+   for (i = 0; i < block->timer_count; i++, frame++) {
+   if (!frame->base_address || !frame->timer_interrupt)
+   return -EINVAL;
+
+   data->frame[i].phys_irq = map_gt_gsi(frame->timer_interrupt,
+frame->timer_flags);
+   if (data->frame[i].phys_irq <= 0) {
+   pr_warn("failed to map physical timer irq in frame %d.\n",
+   i);
+   return -EINVAL;
+   }
+
+   data->frame[i].virt_irq =
+   map_gt_gsi(frame->virtual_timer_interrupt,
+  frame->virtual_timer_flags);
+   if (data->frame[i].virt_irq <= 0) {
+   pr_warn("failed to map virtual timer irq in frame %d.\n",
+   i);
+   acpi_unregister_gsi(frame->timer_interrupt);
+   return -EINVAL;
+   }
+
+   data->frame[i].frame_nr = frame->frame_number;
+   data->frame[i].cntbase = frame->base_address;
+   /*
+* According to "Table * CNTBaseN memory map" of
+*  for ARMv8,
+* The size of the CNTBaseN frame is 4KB(Offset 0x000 – 0xFFC).
+*/
+   data->frame[i].size = SZ_4K;
+   }
+
+   return 0;
+}
+
+/**
+ * acpi_arch_timer_mem_init() - Get the info of all GT blocks in GTDT table.
+ * @data:  the pointer to the array of struct arch_timer_mem for returning
+ * the result of parsing. The element number of this array should
+ * be platform_timer_count(the total number of platform timers).
+ * @count: The pointer of int variate for returning the number of GT
+ * blocks we have pa

[PATCH v17 11/15] acpi/arm64: Add GTDT table parse driver

2016-11-25 Thread fu . wei
From: Fu Wei 

This patch adds support for parsing arch timer info in the GTDT.
It provides kernel APIs that parse all the PPIs and the
always-on info in the GTDT and export them.

With this driver, the arm_arch_timer driver can be simplified and
the ACPI GTDT knowledge separated from it.
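
As a rough illustration of the flag decoding this driver performs before
registering an interrupt, here is a small user-space sketch. It is not the
kernel code: the bit positions GT_FLAG_EDGE and GT_FLAG_ACTIVE_LOW, the
enums and gt_decode_flags() are names assumed for the example (the kernel
takes the real masks from the ACPICA headers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define GT_FLAG_EDGE		(1u << 0)	/* set: edge, clear: level */
#define GT_FLAG_ACTIVE_LOW	(1u << 1)	/* set: low, clear: high */

enum gt_trigger { GT_LEVEL, GT_EDGE };
enum gt_polarity { GT_ACTIVE_HIGH, GT_ACTIVE_LOW };

struct gt_irq_cfg {
	enum gt_trigger trigger;
	enum gt_polarity polarity;
};

/* Translate a raw flags word into trigger mode and polarity. */
static void gt_decode_flags(uint32_t flags, struct gt_irq_cfg *cfg)
{
	assert(cfg != NULL);

	cfg->trigger = (flags & GT_FLAG_EDGE) ? GT_EDGE : GT_LEVEL;
	cfg->polarity = (flags & GT_FLAG_ACTIVE_LOW) ? GT_ACTIVE_LOW
						     : GT_ACTIVE_HIGH;
}
```

Bits other than these two are ignored by the decode, so an always-on bit in
the same word does not change the trigger or polarity.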

Signed-off-by: Fu Wei 
Signed-off-by: Hanjun Guo 
Acked-by: Rafael J. Wysocki 
---
 arch/arm64/Kconfig  |   1 +
 drivers/acpi/arm64/Kconfig  |   3 +
 drivers/acpi/arm64/Makefile |   1 +
 drivers/acpi/arm64/gtdt.c   | 157 
 include/linux/acpi.h|   6 ++
 5 files changed, 168 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 969ef88..4277a21 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2,6 +2,7 @@ config ARM64
def_bool y
select ACPI_CCA_REQUIRED if ACPI
select ACPI_GENERIC_GSI if ACPI
+   select ACPI_GTDT if ACPI
select ACPI_REDUCED_HARDWARE_ONLY if ACPI
select ACPI_MCFG if ACPI
select ACPI_SPCR_TABLE if ACPI
diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
index 4616da4..5a6f80f 100644
--- a/drivers/acpi/arm64/Kconfig
+++ b/drivers/acpi/arm64/Kconfig
@@ -4,3 +4,6 @@
 
 config ACPI_IORT
bool
+
+config ACPI_GTDT
+   bool
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
index 72331f2..1017def 100644
--- a/drivers/acpi/arm64/Makefile
+++ b/drivers/acpi/arm64/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_ACPI_IORT)+= iort.o
+obj-$(CONFIG_ACPI_GTDT)+= gtdt.o
diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
new file mode 100644
index 000..d93a790
--- /dev/null
+++ b/drivers/acpi/arm64/gtdt.c
@@ -0,0 +1,157 @@
+/*
+ * ARM Specific GTDT table Support
+ *
+ * Copyright (C) 2016, Linaro Ltd.
+ * Author: Daniel Lezcano 
+ * Fu Wei 
+ * Hanjun Guo 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#undef pr_fmt
+#define pr_fmt(fmt) "ACPI GTDT: " fmt
+
+/**
+ * struct acpi_gtdt_descriptor - Store the key info of GTDT for all functions
+ * @gtdt:  The pointer to the struct acpi_table_gtdt of GTDT table.
+ * @gtdt_end:  The pointer to the end of GTDT table.
+ * @platform_timer:The pointer to the start of Platform Timer Structure
+ *
+ * This struct stores the key info of the GTDT table. It should be initialized
+ * by acpi_gtdt_init().
+ */
+struct acpi_gtdt_descriptor {
+   struct acpi_table_gtdt *gtdt;
+   void *gtdt_end;
+   void *platform_timer;
+};
+
+static struct acpi_gtdt_descriptor acpi_gtdt_desc __initdata;
+
+static int __init map_gt_gsi(u32 interrupt, u32 flags)
+{
+   int trigger, polarity;
+
+   trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
+   : ACPI_LEVEL_SENSITIVE;
+
+   polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
+   : ACPI_ACTIVE_HIGH;
+
+   return acpi_register_gsi(NULL, interrupt, trigger, polarity);
+}
+
+/**
+ * acpi_gtdt_map_ppi() - Map the PPIs of per-cpu arch_timer.
+ * @type:  the type of PPI.
+ *
+ * Note: Linux on arm64 isn't supported on the secure side.
+ * So we only handle the non-secure timer PPIs,
+ * ARCH_TIMER_PHYS_SECURE_PPI is treated as invalid type.
+ *
+ * Return: the mapped PPI value, 0 if error.
+ */
+int __init acpi_gtdt_map_ppi(int type)
+{
+   struct acpi_table_gtdt *gtdt = acpi_gtdt_desc.gtdt;
+
+   switch (type) {
+   case ARCH_TIMER_PHYS_NONSECURE_PPI:
+   return map_gt_gsi(gtdt->non_secure_el1_interrupt,
+ gtdt->non_secure_el1_flags);
+   case ARCH_TIMER_VIRT_PPI:
+   return map_gt_gsi(gtdt->virtual_timer_interrupt,
+ gtdt->virtual_timer_flags);
+
+   case ARCH_TIMER_HYP_PPI:
+   return map_gt_gsi(gtdt->non_secure_el2_interrupt,
+ gtdt->non_secure_el2_flags);
+   default:
+   pr_err("Failed to map timer interrupt: invalid type.\n");
+   }
+
+   return 0;
+}
+
+/**
+ * acpi_gtdt_c3stop() - Get c3stop info from GTDT according to the type of PPI.
+ * @type:  the type of PPI.
+ *
+ * Return: true if the timer stops in deep idle states, false otherwise.
+ */
+bool __init acpi_gtdt_c3stop(int type)
+{
+   struct acpi_table_gtdt *gtdt = acpi_gtdt_desc.gtdt;
+
+   switch (type) {
+   case ARCH_TIMER_PHYS_NONSECURE_PPI:
+   return !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
+
+   case ARCH_TIMER_VIRT_PPI:
+   return !(gtdt->virtual_timer_flags & ACPI_GTDT_ALWAYS_ON);
+
+   case ARCH_TIMER_HYP_PPI:
+   return !(gtdt->non_secure_el2_flags & ACPI_GTDT_ALWAYS_ON);
+
+   default:
+   pr_err("Failed to get c3stop in

[PATCH v17 15/15] acpi/arm64: Add SBSA Generic Watchdog support in GTDT driver

2016-11-25 Thread fu . wei
From: Fu Wei 

This patch adds support for parsing the SBSA Generic Watchdog timer
in the GTDT: it parses all the info in the SBSA Generic Watchdog
Structure and creates a platform device from that information.

This allows the operating system to obtain device data from the
resources of the platform device. The platform device named "sbsa-gwdt"
can be used by the ARM SBSA Generic Watchdog driver.
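
To illustrate how the resource list for such a device can cope with an
optional interrupt, here is a user-space sketch. It is not the kernel code:
struct res, FRAME_SIZE and wdt_build_resources() are hypothetical names, and
the only points carried over from the patch are that both frames are 4K and
that the interrupt entry sits last so that it can be left out of the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 0x1000u	/* 4K per frame, as the patch's comment states */

enum res_kind { RES_MEM, RES_IRQ };

struct res {
	enum res_kind kind;
	uint64_t start;
	uint64_t size;		/* 1 for RES_IRQ entries */
};

/*
 * Fill out[0..2] and return how many entries are valid: 3 with a usable
 * interrupt, 2 without one, -1 if either frame address is missing.
 */
static int wdt_build_resources(uint64_t control, uint64_t refresh, int irq,
			       struct res out[3])
{
	assert(out != NULL);

	if (!control || !refresh)
		return -1;

	out[0] = (struct res){ RES_MEM, control, FRAME_SIZE };
	out[1] = (struct res){ RES_MEM, refresh, FRAME_SIZE };
	out[2] = (struct res){ RES_IRQ, (uint64_t)(irq > 0 ? irq : 0), 1 };

	/* The IRQ entry is last, so dropping it is just a shorter count. */
	return irq > 0 ? 3 : 2;
}
```

Keeping the optional entry at the end means the caller never has to
compact the array; it only passes a smaller count when no interrupt is
available.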

Signed-off-by: Fu Wei 
Signed-off-by: Hanjun Guo 
Tested-by: Xiongfeng Wang 
---
 drivers/acpi/arm64/gtdt.c | 93 +++
 drivers/watchdog/Kconfig  |  1 +
 2 files changed, 94 insertions(+)

diff --git a/drivers/acpi/arm64/gtdt.c b/drivers/acpi/arm64/gtdt.c
index 91ea6cb..22d3659 100644
--- a/drivers/acpi/arm64/gtdt.c
+++ b/drivers/acpi/arm64/gtdt.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -59,6 +60,13 @@ static inline bool is_timer_block(void *platform_timer)
return gh->type == ACPI_GTDT_TYPE_TIMER_BLOCK;
 }
 
+static inline bool is_watchdog(void *platform_timer)
+{
+   struct acpi_gtdt_header *gh = platform_timer;
+
+   return gh->type == ACPI_GTDT_TYPE_WATCHDOG;
+}
+
 static int __init map_gt_gsi(u32 interrupt, u32 flags)
 {
int trigger, polarity;
@@ -279,3 +287,88 @@ int __init acpi_arch_timer_mem_init(struct arch_timer_mem *data,
 
return 0;
 }
+
+/*
+ * Initialize a SBSA generic Watchdog platform device info from GTDT
+ */
+static int __init gtdt_import_sbsa_gwdt(struct acpi_gtdt_watchdog *wd,
+   int index)
+{
+   struct platform_device *pdev;
+   int irq = map_gt_gsi(wd->timer_interrupt, wd->timer_flags);
+   int no_irq = 1;
+
+   /*
+* According to SBSA specification the size of refresh and control
+* frames of SBSA Generic Watchdog is SZ_4K(Offset 0x000 – 0xFFF).
+*/
+   struct resource res[] = {
+   DEFINE_RES_MEM(wd->control_frame_address, SZ_4K),
+   DEFINE_RES_MEM(wd->refresh_frame_address, SZ_4K),
+   DEFINE_RES_IRQ(irq),
+   };
+
+   pr_debug("found a Watchdog (0x%llx/0x%llx gsi:%u flags:0x%x).\n",
+wd->refresh_frame_address, wd->control_frame_address,
+wd->timer_interrupt, wd->timer_flags);
+
+   if (!(wd->refresh_frame_address && wd->control_frame_address)) {
+   pr_err(FW_BUG "failed to get the Watchdog base address.\n");
+   return -EINVAL;
+   }
+
+   if (!wd->timer_interrupt)
+   pr_warn(FW_BUG "failed to get the Watchdog interrupt.\n");
+   else if (irq <= 0)
+   pr_warn("failed to map the Watchdog interrupt.\n");
+   else
+   no_irq = 0;
+
+   /*
+* Add a platform device named "sbsa-gwdt" to match the platform driver.
+* "sbsa-gwdt": SBSA(Server Base System Architecture) Generic Watchdog
+* The platform driver (like drivers/watchdog/sbsa_gwdt.c)can get device
+* info below by matching this name.
+*/
+   pdev = platform_device_register_simple("sbsa-gwdt", index, res,
+  ARRAY_SIZE(res) - no_irq);
+   if (IS_ERR(pdev)) {
+   acpi_unregister_gsi(wd->timer_interrupt);
+   return PTR_ERR(pdev);
+   }
+
+   return 0;
+}
+
+static int __init gtdt_sbsa_gwdt_init(void)
+{
+   int ret, i = 0;
+   void *platform_timer;
+   struct acpi_table_header *table;
+
+   if (acpi_disabled)
+   return 0;
+
+   if (ACPI_FAILURE(acpi_get_table(ACPI_SIG_GTDT, 0, &table)))
+   return -EINVAL;
+
+   ret = acpi_gtdt_init(table, NULL);
+   if (ret)
+   return ret;
+
+   for_each_platform_timer(platform_timer) {
+   if (is_watchdog(platform_timer)) {
+   ret = gtdt_import_sbsa_gwdt(platform_timer, i);
+   if (ret)
+   break;
+   i++;
+   }
+   }
+
+   if (i)
+   pr_info("found %d SBSA generic Watchdog(s).\n", i);
+
+   return ret;
+}
+
+device_initcall(gtdt_sbsa_gwdt_init);
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index fdd3228..e5ba1f0 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -218,6 +218,7 @@ config ARM_SBSA_WATCHDOG
tristate "ARM SBSA Generic Watchdog"
depends on ARM64
depends on ARM_ARCH_TIMER
+   depends on ACPI_GTDT || !ACPI
select WATCHDOG_CORE
help
  ARM SBSA Generic Watchdog has two stage timeouts:
-- 
2.9.3



[PATCH v17 12/15] clocksource/drivers/arm_arch_timer: Simplify ACPI support code.

2016-11-25 Thread fu . wei
From: Fu Wei 

This patch updates the arm_arch_timer driver to use the functions
provided by the new ACPI GTDT driver.
This simplifies arm_arch_timer.c and separates
all the ACPI GTDT knowledge from the timer driver.
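
One behavioural point in this change is that the c3stop decision now
follows the PPI that was actually selected instead of always reading the
non-secure EL1 flags. A user-space sketch of that lookup is below. It is not
the kernel code: TIMER_ALWAYS_ON (including its bit position), enum
timer_ppi and timer_c3stop() are names assumed for the example, and
returning false for an invalid type is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TIMER_ALWAYS_ON (1u << 2)	/* assumed bit position */

enum timer_ppi { PPI_PHYS_NONSECURE, PPI_VIRT, PPI_HYP, PPI_COUNT };

/*
 * Return true when the timer behind 'type' is not flagged always-on,
 * i.e. it may stop in deep idle states. 'flags' holds one flags word
 * per PPI type.
 */
static bool timer_c3stop(const uint32_t flags[PPI_COUNT], enum timer_ppi type)
{
	assert(flags != NULL);

	if ((unsigned int)type >= PPI_COUNT)
		return false;	/* assumption: invalid types report "no stop" */

	return !(flags[type] & TIMER_ALWAYS_ON);
}
```

With per-type flags, a platform whose virtual timer is always-on but whose
physical timer is not gets the right answer for whichever one is in use.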

Signed-off-by: Fu Wei 
Signed-off-by: Hanjun Guo 
Tested-by: Xiongfeng Wang 
---
 drivers/clocksource/arm_arch_timer.c | 47 +++-
 1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index bcdceca..7f059f9 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -1053,58 +1053,36 @@ static int __init arch_timer_mem_of_init(struct device_node *np)
 CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
   arch_timer_mem_of_init);
 
-#ifdef CONFIG_ACPI
-static int __init map_generic_timer_interrupt(u32 interrupt, u32 flags)
-{
-   int trigger, polarity;
-
-   if (!interrupt)
-   return 0;
-
-   trigger = (flags & ACPI_GTDT_INTERRUPT_MODE) ? ACPI_EDGE_SENSITIVE
-   : ACPI_LEVEL_SENSITIVE;
-
-   polarity = (flags & ACPI_GTDT_INTERRUPT_POLARITY) ? ACPI_ACTIVE_LOW
-   : ACPI_ACTIVE_HIGH;
-
-   return acpi_register_gsi(NULL, interrupt, trigger, polarity);
-}
-
+#ifdef CONFIG_ACPI_GTDT
 /* Initialize per-processor generic timer */
 static int __init arch_timer_acpi_init(struct acpi_table_header *table)
 {
-   struct acpi_table_gtdt *gtdt;
+   int ret;
 
if (arch_timers_present & ARCH_TIMER_TYPE_CP15) {
pr_warn("already initialized, skipping\n");
return -EINVAL;
}
 
-   gtdt = container_of(table, struct acpi_table_gtdt, header);
-
arch_timers_present |= ARCH_TIMER_TYPE_CP15;
 
-   arch_timer_ppi[ARCH_TIMER_PHYS_SECURE_PPI] =
-   map_generic_timer_interrupt(gtdt->secure_el1_interrupt,
-   gtdt->secure_el1_flags);
+   ret = acpi_gtdt_init(table, NULL);
+   if (ret) {
+   pr_err("Failed to init GTDT table.\n");
+   return ret;
+   }
 
arch_timer_ppi[ARCH_TIMER_PHYS_NONSECURE_PPI] =
-   map_generic_timer_interrupt(gtdt->non_secure_el1_interrupt,
-   gtdt->non_secure_el1_flags);
+   acpi_gtdt_map_ppi(ARCH_TIMER_PHYS_NONSECURE_PPI);
 
arch_timer_ppi[ARCH_TIMER_VIRT_PPI] =
-   map_generic_timer_interrupt(gtdt->virtual_timer_interrupt,
-   gtdt->virtual_timer_flags);
+   acpi_gtdt_map_ppi(ARCH_TIMER_VIRT_PPI);
 
arch_timer_ppi[ARCH_TIMER_HYP_PPI] =
-   map_generic_timer_interrupt(gtdt->non_secure_el2_interrupt,
-   gtdt->non_secure_el2_flags);
+   acpi_gtdt_map_ppi(ARCH_TIMER_HYP_PPI);
 
arch_timer_kvm_info.virtual_irq = arch_timer_ppi[ARCH_TIMER_VIRT_PPI];
 
-   /* Get the frequency from CNTFRQ */
-   arch_timer_detect_rate();
-
arch_timer_uses_ppi = arch_timer_select_ppi();
if (!arch_timer_ppi[arch_timer_uses_ppi]) {
pr_err("No interrupt available, giving up\n");
@@ -1112,7 +1090,10 @@ static int __init arch_timer_acpi_init(struct acpi_table_header *table)
}
 
/* Always-on capability */
-   arch_timer_c3stop = !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
+   arch_timer_c3stop = acpi_gtdt_c3stop(arch_timer_uses_ppi);
+
+   /* Get the frequency from CNTFRQ */
+   arch_timer_detect_rate();
 
ret = arch_timer_register();
if (ret)
-- 
2.9.3


