Re: [PATCH 4.4 00/43] 4.4.66-stable review
On 05/01/2017 09:59 PM, kernelci.org bot wrote: > stable-rc/linux-4.4.y boot: 100 boots: 0 failed, 100 passed > (v4.4.65-44-gbb57d8400b2e) > > Full Boot Summary: > https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.4.y/kernel/v4.4.65-44-gbb57d8400b2e/ > Full Build Summary: > https://kernelci.org/build/stable-rc/branch/linux-4.4.y/kernel/v4.4.65-44-gbb57d8400b2e/ > > Tree: stable-rc > Branch: linux-4.4.y > Git Describe: v4.4.65-44-gbb57d8400b2e > Git Commit: bb57d8400b2eec5a7cb4526879431f4b6cac6009 > Git URL: > http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > Tested: 14 unique boards, 8 SoC families, 18 builds out of 199 > > --- > For more info write to > Compiled and booted on my test system. No dmesg regressions. thanks, -- Shuah
Re: [PATCH v2 29/30] dt-bindings: add vendor prefix for bananapi
+arm-soc On Mon, May 1, 2017 at 1:58 AM, Sean Wang wrote: > On Fri, 2017-04-28 at 15:37 -0500, Rob Herring wrote: >> On Wed, Apr 26, 2017 at 05:26:13PM +0800, sean.w...@mediatek.com wrote: >> > From: Sean Wang >> > >> > Banana Pi team in Sinovoip Co., Limited which are dedicated to >> > design and manufacture open hardware product. >> > >> > Website: http://www.banana-pi.org/ >> > >> > Signed-off-by: Sean Wang >> > --- >> > Documentation/devicetree/bindings/vendor-prefixes.txt | 1 + >> > 1 file changed, 1 insertion(+) >> > >> > diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt >> > b/Documentation/devicetree/bindings/vendor-prefixes.txt >> > index ec0bfb9..8ca0f3c 100644 >> > --- a/Documentation/devicetree/bindings/vendor-prefixes.txt >> > +++ b/Documentation/devicetree/bindings/vendor-prefixes.txt >> > @@ -44,6 +44,7 @@ avia avia semiconductor >> > avic Shanghai AVIC Optoelectronics Co., Ltd. >> > axentiaAxentia Technologies AB >> > axis Axis Communications AB >> > +bananapi Banana Pi SINOVOP CO., LIMITED >> >> s/SINOVOP/SINOVOIP/ > > Hi Rob, > > thanks for your careful reviewing , i will correct it in the next > version. > > BTW, Could those related dts changes go through your tree if everything > looks good? Because I find Matthias have been inactive for a while and > the latest branch in his tree seems still staying on 4.10. Happy to take binding doc changes, but I don't take dts changes because those go thru arm-soc. If the platform maintainer is not responsive, then we should replace or add maintainers. Rob
Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN
Hi Gerd, I did not have the chance to follow the discussion, so pardon if my suggestions have already been discussed. On 2 May 2017 at 14:34, Gerd Hoffmann wrote: > It's unused. > > Suggested-by: Daniel Vetter > Signed-off-by: Gerd Hoffmann > --- > include/uapi/drm/drm_fourcc.h | 2 -- > drivers/gpu/drm/drm_fourcc.c | 3 +-- > drivers/gpu/drm/drm_framebuffer.c | 2 +- > 3 files changed, 2 insertions(+), 5 deletions(-) > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h > index 995c8f9c69..305bc34be0 100644 > --- a/include/uapi/drm/drm_fourcc.h > +++ b/include/uapi/drm/drm_fourcc.h > @@ -33,8 +33,6 @@ extern "C" { > #define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \ > ((__u32)(c) << 16) | ((__u32)(d) << 24)) > > -#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of > little endian */ > - Weston references DRM_FORMAT_BIG_ENDIAN, thus this patch will lead to build breakage. That is never fun, so please coordinate carefully with the Weston devs to minimise the fireworks. Personally I would leave the symbol, since it's UAPI, and document that it should not be used. Not my call, so feel free to ignore. Thanks Emil
Re: [PATCH 4.10 00/62] 4.10.14-stable review
On 05/01/2017 10:59 PM, kernelci.org bot wrote: > stable-rc/linux-4.10.y boot: 87 boots: 0 failed, 87 passed > (v4.10.13-63-gcabfe9402479) > > Full Boot Summary: > https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.10.y/kernel/v4.10.13-63-gcabfe9402479/ > Full Build Summary: > https://kernelci.org/build/stable-rc/branch/linux-4.10.y/kernel/v4.10.13-63-gcabfe9402479/ > > Tree: stable-rc > Branch: linux-4.10.y > Git Describe: v4.10.13-63-gcabfe9402479 > Git Commit: cabfe940247902943161bc50dec830a5f19352e5 > Git URL: > http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > Tested: 15 unique boards, 9 SoC families, 18 builds out of 203 > > --- > For more info write to > Compiled and booted on my test system. No dmesg regressions. thanks, -- Shuah
Re: [PATCH 4.9 00/54] 4.9.26-stable review
On 05/01/2017 10:19 PM, kernelci.org bot wrote: > stable-rc/linux-4.9.y boot: 96 boots: 0 failed, 96 passed > (v4.9.25-55-g49bccd4690a2) > > Full Boot Summary: > https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.25-55-g49bccd4690a2/ > Full Build Summary: > https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.25-55-g49bccd4690a2/ > > Tree: stable-rc > Branch: linux-4.9.y > Git Describe: v4.9.25-55-g49bccd4690a2 > Git Commit: 49bccd4690a2adf6e3236fef3005d905b91cf520 > Git URL: > http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > Tested: 16 unique boards, 9 SoC families, 20 builds out of 203 > > --- > For more info write to > Compiled and booted on my test system. No dmesg regressions. thanks, -- Shuah
[PATCH 0/6] mailbox: arm_mhu: add support for subchannels
Hi Jassi, This series adds subchannel support to the ARM MHU mailbox controller driver. Since SCPI never used the second slot, we were able to use the existing driver as is. However, that's changing soon, and the new SCMI protocol under development needs subchannel support. If you recall, when you initially added this driver I was pushing for some of these changes, like the threaded irq. This patch series adds support for the subchannels on ARM MHU controllers. Regards, Sudeep Sudeep Holla (6): mailbox: arm_mhu: reorder header inclusion and drop unneeded ones Documentation: devicetree: add bindings to support ARM MHU subchannels mailbox: arm_mhu: migrate to threaded irq handler mailbox: arm_mhu: re-factor data structure to add subchannel support mailbox: arm_mhu: add full support for sub-channels mailbox: arm_mhu: add name support to record mbox-name .../devicetree/bindings/mailbox/arm-mhu.txt| 44 ++- drivers/mailbox/arm_mhu.c | 381 ++--- 2 files changed, 376 insertions(+), 49 deletions(-) -- 2.7.4
[PATCH 1/6] mailbox: arm_mhu: reorder header inclusion and drop unneeded ones
This patch just re-orders some of the header includes and also drops the ones that are unnecessary. Cc: Alexey Klimov Cc: Jassi Brar Signed-off-by: Sudeep Holla --- drivers/mailbox/arm_mhu.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c index 99befa76e37c..be0f293a9457 100644 --- a/drivers/mailbox/arm_mhu.c +++ b/drivers/mailbox/arm_mhu.c @@ -13,16 +13,13 @@ * GNU General Public License for more details. */ -#include -#include -#include -#include -#include +#include +#include #include +#include #include -#include -#include #include +#include #define INTR_STAT_OFS 0x0 #define INTR_SET_OFS 0x8 -- 2.7.4
[PATCH 4/6] mailbox: arm_mhu: re-factor data structure to add subchannel support
In order to support subchannels, we need a bit of reword around data structures that are per-channel. Since the number of subchannels are not fixed though restricted to maximum of 20, the channel assignment and initialization is move to xlate function. This patch also adds the platform data for the existing support of one channel per physical channel. Cc: Alexey Klimov Cc: Jassi Brar Signed-off-by: Sudeep Holla --- drivers/mailbox/arm_mhu.c | 208 +- 1 file changed, 186 insertions(+), 22 deletions(-) diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c index ebe17c097f43..4e0f690b97fd 100644 --- a/drivers/mailbox/arm_mhu.c +++ b/drivers/mailbox/arm_mhu.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #define INTR_STAT_OFS 0x0 #define INTR_SET_OFS 0x8 @@ -30,7 +32,8 @@ #define MHU_SEC_OFFSET 0x200 #define TX_REG_OFFSET 0x100 -#define MHU_CHANS 3 +#define MHU_NUM_PCHANS 3 /* Secure, Non-Secure High and Low Priority */ +#define MHU_CHAN_MAX 20 /* Max channels to save on unused RAM */ struct mhu_link { unsigned irq; @@ -40,53 +43,175 @@ struct mhu_link { struct arm_mhu { void __iomem *base; - struct mhu_link mlink[MHU_CHANS]; - struct mbox_chan chan[MHU_CHANS]; + struct mhu_link mlink[MHU_NUM_PCHANS]; struct mbox_controller mbox; + struct device *dev; }; +/** + * ARM MHU Mailbox platform specific configuration + * + * @num_pchans: Maximum number of physical channels + * @num_subchans: Maximum number of sub-channels per physical channel + */ +struct mhu_mbox_pdata { + unsigned int num_pchans; + unsigned int num_subchans; + bool support_subchans; +}; + +/** + * ARM MHU Mailbox allocated channel information + * + * @mhu: Pointer to parent mailbox device + * @pchan: Physical channel within which this sub-channel resides in + * @channel: Sub-channel number pertaining to this container + */ +struct mhu_channel { + struct arm_mhu *mhu; + unsigned int pchan; + unsigned int subchan; +}; + +static inline struct mbox_chan * 
+mhu_mbox_to_channel(struct mbox_controller *mbox, + unsigned int pchan, unsigned int subchan) +{ + int i; + struct mhu_channel *chan_info; + + for (i = 0; i < mbox->num_chans; i++) { + chan_info = mbox->chans[i].con_priv; + if (chan_info && chan_info->pchan == pchan && + chan_info->subchan == subchan) + return >chans[i]; + } + + dev_err(mbox->dev, + "Channel not registered: physical channel: %d subchan: %d\n", + pchan, subchan); + + return NULL; +} + +static unsigned int mhu_mbox_irq_to_pchan_num(struct arm_mhu *mhu, int irq) +{ + unsigned int pchan; + struct mhu_mbox_pdata *pdata = dev_get_platdata(mhu->dev); + + for (pchan = 0; pchan < pdata->num_pchans; pchan++) + if (mhu->mlink[pchan].irq == irq) + break; + return pchan; +} + +static struct mbox_chan *mhu_mbox_xlate(struct mbox_controller *mbox, + const struct of_phandle_args *spec) +{ + struct arm_mhu *mhu = dev_get_drvdata(mbox->dev); + struct mhu_mbox_pdata *pdata = dev_get_platdata(mhu->dev); + struct mhu_channel *chan_info; + struct mbox_chan *chan = NULL; + unsigned int pchan = spec->args[0]; + unsigned int subchan = pdata->support_subchans ? spec->args[1] : 0; + int i; + + /* Bounds checking */ + if (pchan >= pdata->num_pchans || subchan >= pdata->num_subchans) { + dev_err(mbox->dev, + "Invalid channel requested pchan: %d subchan: %d\n", + pchan, subchan); + return ERR_PTR(-EINVAL); + } + + for (i = 0; i < mbox->num_chans; i++) { + chan_info = mbox->chans[i].con_priv; + + /* Is requested channel free? 
*/ + if (chan_info && + mbox->dev == chan_info->mhu->dev && + pchan == chan_info->pchan && + subchan == chan_info->subchan) { + dev_err(mbox->dev, "Channel in use\n"); + return ERR_PTR(-EBUSY); + } + + /* +* Find the first free slot, then continue checking +* to see if requested channel is in use +*/ + if (!chan && !chan_info) + chan = >chans[i]; + } + + if (!chan) { + dev_err(mbox->dev, "No free channels left\n"); + return ERR_PTR(-EBUSY); + } + + chan_info = devm_kzalloc(mbox->dev, sizeof(*chan_info), GFP_KERNEL); + if (!chan_info) + return ERR_PTR(-ENOMEM); + + chan_info->mhu = mhu; + chan_info->pchan = pchan; +
[PATCH 6/6] mailbox: arm_mhu: add name support to record mbox-name
It's sometimes useful to identify the mailbox controller with the name as specified in the devicetree via mbox-name property especially in a system with multiple controllers. This patch adds support to read and record the mailbox controller name. Cc: Alexey Klimov Cc: Jassi Brar Signed-off-by: Sudeep Holla --- drivers/mailbox/arm_mhu.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c index 0f5ab2204649..9aa623a3aa9a 100644 --- a/drivers/mailbox/arm_mhu.c +++ b/drivers/mailbox/arm_mhu.c @@ -47,6 +47,7 @@ struct arm_mhu { struct mhu_link mlink[MHU_NUM_PCHANS]; struct mbox_controller mbox; struct device *dev; + const char *name; }; /** @@ -257,8 +258,8 @@ static struct mbox_chan *mhu_mbox_xlate(struct mbox_controller *mbox, chan->con_priv = chan_info; - dev_dbg(mbox->dev, "mbox: created channel - physical: %d sub: %d\n", - pchan, subchan); + dev_dbg(mbox->dev, "mbox: %s, created channel - physical: %d sub: %d\n", + mhu->name, pchan, subchan); return chan; } @@ -361,6 +362,7 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id) struct mhu_mbox_pdata *pdata; const struct of_device_id *match; struct device *dev = >dev; + struct device_node *np = dev->of_node; int mhu_reg[MHU_NUM_PCHANS] = { MHU_LP_OFFSET, MHU_HP_OFFSET, MHU_SEC_OFFSET, }; @@ -390,6 +392,10 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id) return PTR_ERR(mhu->base); } + err = of_property_read_string(np, "mbox-name", >name); + if (err) + mhu->name = np->full_name; + chans = devm_kcalloc(dev, max_chans, sizeof(*chans), GFP_KERNEL); if (!chans) return -ENOMEM; @@ -441,7 +447,7 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id) } } - dev_info(dev, "ARM MHU Mailbox registered\n"); + dev_info(dev, "%s mailbox registered\n", mhu->name); return 0; } -- 2.7.4
[PATCH 3/6] mailbox: arm_mhu: migrate to threaded irq handler
In preparation to introduce support for subchannels which require the interrupt handlers to be threaded, this patch moves the existing interrupt handler to threaded handler. Also it moves out the registering and freeing of the handlers from the mailbox startup and shutdown methods. This also is required to support sub-channels. Cc: Alexey Klimov Cc: Jassi Brar Signed-off-by: Sudeep Holla --- drivers/mailbox/arm_mhu.c | 41 ++--- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c index be0f293a9457..ebe17c097f43 100644 --- a/drivers/mailbox/arm_mhu.c +++ b/drivers/mailbox/arm_mhu.c @@ -84,27 +84,15 @@ static int mhu_startup(struct mbox_chan *chan) { struct mhu_link *mlink = chan->con_priv; u32 val; - int ret; val = readl_relaxed(mlink->tx_reg + INTR_STAT_OFS); writel_relaxed(val, mlink->tx_reg + INTR_CLR_OFS); - ret = request_irq(mlink->irq, mhu_rx_interrupt, - IRQF_SHARED, "mhu_link", chan); - if (ret) { - dev_err(chan->mbox->dev, - "Unable to acquire IRQ %d\n", mlink->irq); - return ret; - } - return 0; } static void mhu_shutdown(struct mbox_chan *chan) { - struct mhu_link *mlink = chan->con_priv; - - free_irq(mlink->irq, chan); } static const struct mbox_chan_ops mhu_ops = { @@ -132,13 +120,6 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id) return PTR_ERR(mhu->base); } - for (i = 0; i < MHU_CHANS; i++) { - mhu->chan[i].con_priv = >mlink[i]; - mhu->mlink[i].irq = adev->irq[i]; - mhu->mlink[i].rx_reg = mhu->base + mhu_reg[i]; - mhu->mlink[i].tx_reg = mhu->mlink[i].rx_reg + TX_REG_OFFSET; - } - mhu->mbox.dev = dev; mhu->mbox.chans = >chan[0]; mhu->mbox.num_chans = MHU_CHANS; @@ -155,6 +136,28 @@ static int mhu_probe(struct amba_device *adev, const struct amba_id *id) return err; } + for (i = 0; i < MHU_CHANS; i++) { + int irq = mhu->mlink[i].irq = adev->irq[i]; + + if (irq <= 0) { + dev_dbg(dev, "No IRQ found for Channel %d\n", i); + continue; + } + + mhu->chan[i].con_priv = 
>mlink[i]; + mhu->mlink[i].rx_reg = mhu->base + mhu_reg[i]; + mhu->mlink[i].tx_reg = mhu->mlink[i].rx_reg + TX_REG_OFFSET; + + err = devm_request_threaded_irq(dev, irq, NULL, + mhu_rx_interrupt, IRQF_ONESHOT, + "mhu_link", >chan[i]); + if (err) { + dev_err(dev, "Can't claim IRQ %d\n", irq); + mbox_controller_unregister(>mbox); + return err; + } + } + dev_info(dev, "ARM MHU Mailbox registered\n"); return 0; } -- 2.7.4
[PATCH 2/6] Documentation: devicetree: add bindings to support ARM MHU subchannels
The ARM MHU has a mechanism to assert interrupt signals to facilitate inter-processor message-based communication. It drives the signal using a 32-bit register, with all 32 bits logically ORed together. It also enables software to set, clear and check the status of each of the bits of this register independently. Each bit of the register can be associated with a type of event that can contribute to raising the interrupt, thereby allowing it to be used as independent subchannels. Since the first version of this binding can't support sub-channels, this patch extends the existing binding to support them. Cc: Alexey Klimov Cc: Jassi Brar Cc: Rob Herring Cc: devicet...@vger.kernel.org Signed-off-by: Sudeep Holla --- .../devicetree/bindings/mailbox/arm-mhu.txt| 44 -- 1 file changed, 41 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/mailbox/arm-mhu.txt b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt index 4971f03f0b33..86a66f7918e2 100644 --- a/Documentation/devicetree/bindings/mailbox/arm-mhu.txt +++ b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt @@ -10,21 +10,40 @@ STAT register and the remote clears it after having read the data. The last channel is specified to be a 'Secure' resource, hence can't be used by Linux running NS. +The MHU drives the interrupt signal using a 32-bit register, with all +32-bits logically ORed together. It provides a set of registers to +enable software to set, clear and check the status of each of the bits +of this register independently. The use of 32 bits per interrupt line +enables software to provide more information about the source of the +interrupt. For example, each bit of the register can be associated with +a type of event that can contribute to raising the interrupt. 
+ Mailbox Device Node: Required properties: -- compatible: Shall be "arm,mhu" & "arm,primecell" +- compatible: Shall be "arm,primecell" and one of the below: + "arm,mhu" - if the controller doesn't support + subchannels + "arm,mhu-v2" - if the controller supports subchannels - reg: Contains the mailbox register address range (base address and length) -- #mbox-cells Shall be 1 - the index of the channel needed. +- #mbox-cells Shall be 1 - the index of the channel needed when + compatible is "arm,mhu" + Shall be 2 - the index of the channel needed, and + the index of the subchannel with the channel when + compatible is "arm,mhu-v2" - interrupts: Contains the interrupt information corresponding to - each of the 3 links of MHU. + each of the 3 physical channels of MHU namely low + priority non-secure, high priority non-secure and + secure channels. Example: +1. Controller which doesn't support subchannels + mhu: mailbox@2b1f { #mbox-cells = <1>; compatible = "arm,mhu", "arm,primecell"; @@ -41,3 +60,22 @@ used by Linux running NS. reg = <0 0x2e00 0x4000>; mboxes = < 1>; /* HP-NonSecure */ }; + +2. Controller which supports subchannels + + mhu: mailbox@2b1f { + #mbox-cells = <2>; + compatible = "arm,mhu-v2", "arm,primecell"; + reg = <0 0x2b1f 0x1000>; + interrupts = <0 36 4>, /* LP-NonSecure */ +<0 35 4>, /* HP-NonSecure */ +<0 37 4>; /* Secure */ + clocks = < 0 2 1>; + clock-names = "apb_pclk"; + }; + + mhu_client: scb@2e00 { + compatible = "arm,scpi"; + reg = <0 0x2e00 0x200>; + mboxes = < 1 4>; /* HP-NonSecure 5th sub-channel */ + }; -- 2.7.4
[PATCH 5/6] mailbox: arm_mhu: add full support for sub-channels
We now have the basic infrastructure and binding to support subchannels on ARM MHU controller. This patch adds all the necessary mailbox operations to add support for the sub-channels. Maximum of 32 subchannels are supported on each physical channel, however the total number of subchannels is restricted to 20 in order to save memory. It can increased if required in future. Cc: Alexey Klimov Cc: Jassi Brar Signed-off-by: Sudeep Holla --- drivers/mailbox/arm_mhu.c | 127 -- 1 file changed, 123 insertions(+), 4 deletions(-) diff --git a/drivers/mailbox/arm_mhu.c b/drivers/mailbox/arm_mhu.c index 4e0f690b97fd..0f5ab2204649 100644 --- a/drivers/mailbox/arm_mhu.c +++ b/drivers/mailbox/arm_mhu.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -94,6 +95,14 @@ mhu_mbox_to_channel(struct mbox_controller *mbox, return NULL; } +static void mhu_mbox_clear_irq(struct mbox_chan *chan) +{ + struct mhu_channel *chan_info = chan->con_priv; + void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].rx_reg; + + writel_relaxed(BIT(chan_info->subchan), base + INTR_CLR_OFS); +} + static unsigned int mhu_mbox_irq_to_pchan_num(struct arm_mhu *mhu, int irq) { unsigned int pchan; @@ -105,6 +114,95 @@ static unsigned int mhu_mbox_irq_to_pchan_num(struct arm_mhu *mhu, int irq) return pchan; } +static struct mbox_chan *mhu_mbox_irq_to_channel(struct arm_mhu *mhu, +unsigned int pchan) +{ + unsigned long bits; + unsigned int subchan; + struct mbox_chan *chan = NULL; + struct mbox_controller *mbox = >mbox; + void __iomem *base = mhu->mlink[pchan].rx_reg; + + bits = readl_relaxed(base + INTR_STAT_OFS); + if (!bits) + /* No IRQs fired in specified physical channel */ + return NULL; + + /* An IRQ has fired, find the associated channel */ + for (subchan = 0; bits; subchan++) { + if (!test_and_clear_bit(subchan, )) + continue; + + chan = mhu_mbox_to_channel(mbox, pchan, subchan); + if (chan) + break; + } + + return chan; +} + +static irqreturn_t 
mhu_mbox_thread_handler(int irq, void *data) +{ + struct mbox_chan *chan; + struct arm_mhu *mhu = data; + unsigned int pchan = mhu_mbox_irq_to_pchan_num(mhu, irq); + + while (NULL != (chan = mhu_mbox_irq_to_channel(mhu, pchan))) { + mbox_chan_received_data(chan, NULL); + mhu_mbox_clear_irq(chan); + } + + return IRQ_HANDLED; +} + +static bool mhu_subchan_last_tx_done(struct mbox_chan *chan) +{ + struct mhu_channel *chan_info = chan->con_priv; + void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].tx_reg; + + if (readl_relaxed(base + INTR_STAT_OFS) & BIT(chan_info->subchan)) + return false; + + return true; +} + +static int mhu_subchan_send_data(struct mbox_chan *chan, void *data) +{ + struct mhu_channel *chan_info = chan->con_priv; + void __iomem *base = chan_info->mhu->mlink[chan_info->pchan].tx_reg; + + /* Send event to co-processor */ + writel_relaxed(BIT(chan_info->subchan), base + INTR_SET_OFS); + + return 0; +} + +static int mhu_subchan_startup(struct mbox_chan *chan) +{ + mhu_mbox_clear_irq(chan); + return 0; +} + +static void mhu_subchan_shutdown(struct mbox_chan *chan) +{ + struct mhu_channel *chan_info = chan->con_priv; + struct mbox_controller *mbox = _info->mhu->mbox; + int i; + + for (i = 0; i < mbox->num_chans; i++) + if (chan == >chans[i]) + break; + + if (mbox->num_chans == i) { + dev_warn(mbox->dev, "Request to free non-existent channel\n"); + return; + } + + /* Reset channel */ + mhu_mbox_clear_irq(chan); + chan->con_priv = NULL; +} + static struct mbox_chan *mhu_mbox_xlate(struct mbox_controller *mbox, const struct of_phandle_args *spec) { @@ -227,14 +325,28 @@ static const struct mbox_chan_ops mhu_ops = { .last_tx_done = mhu_last_tx_done, }; +static const struct mbox_chan_ops mhu_subchan_ops = { + .send_data = mhu_subchan_send_data, + .startup = mhu_subchan_startup, + .shutdown = mhu_subchan_shutdown, + .last_tx_done = mhu_subchan_last_tx_done, +}; + static const struct mhu_mbox_pdata arm_mhu_pdata = { .num_pchans = 3, .num_subchans = 1, 
.support_subchans = false, }; +static const struct mhu_mbox_pdata arm_mhu_v2_pdata = { + .num_pchans = 2,/* Secure can't be used */ + .num_subchans = 32, + .support_subchans = true, +}; + static const struct of_device_id mhu_mbox_match[] = { { .compatible = "arm,mhu",
RE: [PATCH 3/4] net: macb: Add hardware PTP support
> From: Richard Cochran [mailto:richardcoch...@gmail.com] > Sent: 14 April 2017 20:29 > To: Rafal Ozieblo > Cc: David Miller ; nicolas.fe...@atmel.com; > net...@vger.kernel.org; linux-kernel@vger.kernel.org; > devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > harinikatakamli...@gmail.com; harini.kata...@xilinx.com; > andrei.pistir...@microchip.com > Subject: Re: [PATCH 3/4] net: macb: Add hardware PTP support > > On Thu, Apr 13, 2017 at 02:39:23PM +0100, Rafal Ozieblo wrote: (...) > > +static int gem_tsu_get_time(struct macb *bp, struct timespec64 *ts) > > +{ > > + long first, second; > > + u32 secl, sech; > > + unsigned long flags; > > + > > + if (!bp || !ts) > > + return -EINVAL; > > Useless test. Sorry for being overly careful; I'll remove all such tests. > (...) > > +static int gem_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 > *ts) > > +{ > > + struct macb *bp = container_of(ptp, struct macb, ptp_clock_info); > > + > > + if (!ptp || !ts) > > + return -EINVAL; > > Useles test. > > What is the point of this wrapper function anyhow? Please remove it. gem_ptp_gettime() is assigned in ptp_clock_info and it has to take a ptp_clock_info pointer as its first parameter. gem_tsu_get_time() is used in the source code but with a macb pointer. Do you want me to do something like: gem_ptp_gettime(macb->ptp, ts); and first get the macb pointer from ptp? struct macb *bp = container_of(ptp, struct macb, ptp_clock_info);
ditto > > > + gem_tsu_set_time(bp, ts); > > + return 0; > > +} > > + > > +static int gem_ptp_enable(struct ptp_clock_info *ptp, > > + struct ptp_clock_request *rq, int on) > > +{ > > + struct macb *bp = container_of(ptp, struct macb, ptp_clock_info); > > + > > + if (!ptp || !rq) > > + return -EINVAL; > > Sigh. > > > + > > + switch (rq->type) { > > + case PTP_CLK_REQ_EXTTS: /* Toggle TSU match interrupt */ > > + if (on) > > + macb_writel(bp, IER, MACB_BIT(TCI)); > > No locking to protect IER and IDE? There is no need. > > > + else > > + macb_writel(bp, IDR, MACB_BIT(TCI)); > > + break; > > + case PTP_CLK_REQ_PEROUT: /* Toggle Periodic output */ > > + return -EOPNOTSUPP; > > + /* break; */ > > + case PTP_CLK_REQ_PPS: /* Toggle TSU periodic (second) > interrupt */ > > + if (on) > > + macb_writel(bp, IER, MACB_BIT(SRI)); > > + else > > + macb_writel(bp, IDR, MACB_BIT(SRI)); > > + break; > > + default: > > + break; > > + } > > + return 0; > > +} > > + (...) > > -- > > 2.4.5 > > > > Thanks, > Richard
Re: [PATCH RFC v2 5/6] proc: instantiate only pids that we can ptrace on 'limit_pids=1' mount option
Hello Andy, (Sorry for my late response) On Thu, Apr 27, 2017 at 12:09 AM, Andy Lutomirski wrote: > On Tue, Apr 25, 2017 at 5:23 AM, Djalal Harouni wrote: >> If "limit_pids=1" mount option is set then do not instantiate pids that >> we can not ptrace. "limit_pids=1" means that procfs should only contain >> pids that the caller can ptrace. >> >> Cc: Kees Cook >> Cc: Andy Lutomirski >> Signed-off-by: Djalal Harouni >> --- >> fs/proc/base.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/fs/proc/base.c b/fs/proc/base.c >> index 2e0f661..a663284 100644 >> --- a/fs/proc/base.c >> +++ b/fs/proc/base.c >> @@ -3149,6 +3149,7 @@ struct dentry *proc_pid_lookup(struct inode *dir, >> struct dentry * dentry, unsign >> unsigned tgid; >> struct proc_fs_info *fs_info = proc_sb(dir->i_sb); >> struct pid_namespace *ns = fs_info->pid_ns; >> + int limit_pids = proc_fs_limit_pids(fs_info); > > Shouldn't the addition of proc_fs_limit_pids() be in this patch? Actually I think this patch should be part of proc_fs_limit_pids() since it will cover paths that are not handled by inode ->permission() checks. But I will review this patch and the discussion "devpts: Make each mount of devpts an independent filesystem" [1] and try to get the rationale for it: whether doing permission-like checks in lookups is related, or if it should be done in the first place... The other thing that I didn't have time to check is standardizing on the returned error codes: I prefer to always return -ENOENT, instead of -EPERM in some places and -ENOENT in others. > Also, can we name it something self-documented? > "ptraceable_pids_only=1", perhaps? Or even pids=ptraceable (as > opposed to pids=all or maybe other choices in the future)? Yes, I will update. [1] https://patchwork.kernel.org/patch/9150781/ -- tixxdz
Re: [RESEND PATCH] spi: bcm63xx-hsspi: Export OF device ID table as module aliases
On 1 May 2017 at 22:13, Andres Galacho wrote: > The device table is required to load modules based on > modaliases. After adding MODULE_DEVICE_TABLE, below entries > for example will be added to module.alias: > alias: of:N*T*Cbrcm,bcm6328-hsspiC* > alias: of:N*T*Cbrcm,bcm6328-hsspi > > Signed-off-by: Andres Galacho Not that it matters much with a trivial patch like this, but Acked-by: Jonas Gorski also thanks for doing it. Regards Jonas
error value for "internal error"
I've been wondering what to return for soft asserts like this: if (WARN_ON(something unexpected)) return -E; EINVAL doesn't fit because it means the input from userspace was wrong. EIO means something went bad with the hardware. There's no "software error" or "internal error" type return code. Would it make sense to introduce one? Rule would be to always add more detail to dmesg (such as done by WARN_ON) when such a code is returned. Thanks, Miklos
Re: [PATCH] tools/kvm: fix top level makefile
On 11/04/2017 17:34, Justin M. Forbes wrote: > The top level tools/Makefile includes kvm_stat as a target in help, but > the actual target is missing. > > Signed-off-by: Justin M. Forbes > --- > tools/Makefile | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/tools/Makefile b/tools/Makefile > index 00caacd..c8a90d0 100644 > --- a/tools/Makefile > +++ b/tools/Makefile > @@ -86,10 +86,13 @@ tmon: FORCE > freefall: FORCE > $(call descend,laptop/$@) > > +kvm_stat: FORCE > + $(call descend,kvm/$@) > + > all: acpi cgroup cpupower gpio hv firewire lguest \ > perf selftests turbostat usb \ > virtio vm net x86_energy_perf_policy \ > - tmon freefall objtool > + tmon freefall objtool kvm_stat > > acpi_install: > $(call descend,power/$(@:_install=),install) > Applied, thanks. Paolo
[PATCH V2] pwm: tegra: Set maximum pwm clock source per SoC tapeout
The PWM hardware IP is taped out with a different maximum frequency on different SoCs. From the HW team: Before Tegra186, it is 38.4MHz. In Tegra186, it is 102MHz. Add support to limit the clock source frequency to the maximum frequency supported by the IP. Provide these values via SoC chipdata. Signed-off-by: Laxman Dewangan --- Changes from V1: - Set the 48MHz maximum frequency for Tegra210 and earlier. - Set the maximum frequency unconditionally as per V1 review comment. --- drivers/pwm/pwm-tegra.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/pwm/pwm-tegra.c b/drivers/pwm/pwm-tegra.c index 8c6ed55..e9b33f0 100644 --- a/drivers/pwm/pwm-tegra.c +++ b/drivers/pwm/pwm-tegra.c @@ -41,6 +41,9 @@ struct tegra_pwm_soc { unsigned int num_channels; + + /* Maximum IP frequency for given SoCs */ + unsigned long max_frequency; }; struct tegra_pwm_chip { @@ -201,7 +204,18 @@ static int tegra_pwm_probe(struct platform_device *pdev) if (IS_ERR(pwm->clk)) return PTR_ERR(pwm->clk); - /* Read PWM clock rate from source */ + /* Set maximum frequency of the IP */ + ret = clk_set_rate(pwm->clk, pwm->soc->max_frequency); + if (ret < 0) { + dev_err(&pdev->dev, "Failed to set max frequency: %d\n", ret); + return ret; + } + + /* +* The requested and configured frequency may differ due to +* clock register resolutions. Get the configured frequency +* so that PWM period can be calculated more accurately. +*/ pwm->clk_rate = clk_get_rate(pwm->clk); pwm->rst = devm_reset_control_get(&pdev->dev, "pwm"); @@ -273,10 +287,12 @@ static int tegra_pwm_resume(struct device *dev) static const struct tegra_pwm_soc tegra20_pwm_soc = { .num_channels = 4, + .max_frequency = 48000000UL, }; static const struct tegra_pwm_soc tegra186_pwm_soc = { .num_channels = 1, + .max_frequency = 102000000UL, }; static const struct of_device_id tegra_pwm_of_match[] = { -- 2.1.4
Re: net/ipv6: warning in inet6_ifa_finish_destroy
On 28/04/17 21:39, Cong Wang wrote: > On Fri, Apr 28, 2017 at 6:08 AM, Andrey Konovalov > wrote: >> Hi, >> >> I've got the following error report while fuzzing the kernel with syzkaller. >> >> On commit 5a7ad1146caa895ad718a534399e38bd2ba721b7 (4.11-rc8). >> >> C reproducer and .config are attached. >> It takes 1-2 minutes of running the reproducer to trigger the issue. >> >> [ cut here ] >> WARNING: CPU: 0 PID: 21 at net/ipv6/addrconf.c:894 >> inet6_ifa_finish_destroy+0x12e/0x190 >> Modules linked in: >> CPU: 0 PID: 21 Comm: kworker/0:1 Not tainted 4.11.0-rc8+ #296 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 >> Workqueue: ipv6_addrconf addrconf_dad_work >> Call Trace: >> __dump_stack lib/dump_stack.c:16 >> dump_stack+0x292/0x398 lib/dump_stack.c:52 >> __warn+0x19f/0x1e0 kernel/panic.c:549 >> warn_slowpath_null+0x2c/0x40 kernel/panic.c:584 >> inet6_ifa_finish_destroy+0x12e/0x190 c:894 >> in6_ifa_put ./include/net/addrconf.h:330 >> addrconf_dad_work+0x4e9/0x1040 net/ipv6/addrconf.c:3963 > > > I don't look too much, but a quick glance shows in the following > path: > > } else if (action == DAD_ABORT) { > in6_ifa_hold(ifp); > addrconf_dad_stop(ifp, 1); > if (disable_ipv6) > addrconf_ifdown(idev->dev, 0); > goto out; > } > > the inet6_addr could be removed from hash table in > addrconf_ifdown() before calling in6_ifa_put(). which causes > this warning. > git describe 85b51b12115c79cce7ea1ced6c0bd0339a165d3f --contains v4.8-rc5~34^2~32 This fix was introduced in 4.8, so it is interesting that this problem is only showing up now for 4.11. Also, it is not reproducible manually, i.e. DAD failure with disable_ipv6 works just fine without triggering this warning, with or without keeping IPv6 addresses on admin down. I will go ahead with putting out a fix so that in6_ifa_put() precedes the call to addrconf_ifdown() in this case. Thanks for the heads up on this, Mike
Re: [v6,10/23] drivers/fsi: Add device read/write/peek API
On Apr 10, 2017, at 3:46 PM, Christopher Bostic wrote: From: Jeremy Kerr This change introduces the fsi device API: simple read, write and peek accessors for the devices' address spaces. Includes contributions from Chris Bostic and Edward A. James . Signed-off-by: Edward A. James Signed-off-by: Jeremy Kerr Signed-off-by: Chris Bostic Signed-off-by: Joel Stanley --- drivers/fsi/fsi-core.c | 33 + include/linux/fsi.h| 7 ++- 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/fsi/fsi-core.c b/drivers/fsi/fsi-core.c index a8faa89..4da0b030 100644 --- a/drivers/fsi/fsi-core.c +++ b/drivers/fsi/fsi-core.c @@ -32,6 +32,8 @@ #define FSI_SLAVE_CONF_CRC_MASK0x000f #define FSI_SLAVE_CONF_DATA_BITS28 +#define FSI_PEEK_BASE0x410 + static const int engine_page_size = 0x400; #define FSI_SLAVE_BASE0x800 @@ -73,9 +75,40 @@ static int fsi_master_read(struct fsi_master *master, int link, uint8_t slave_id, uint32_t addr, void *val, size_t size); static int fsi_master_write(struct fsi_master *master, int link, uint8_t slave_id, uint32_t addr, const void *val, size_t size); +static int fsi_slave_read(struct fsi_slave *slave, uint32_t addr, +void *val, size_t size); +static int fsi_slave_write(struct fsi_slave *slave, uint32_t addr, +const void *val, size_t size); /* FSI endpoint-device support */ +int fsi_device_read(struct fsi_device *dev, uint32_t addr, void *val, +size_t size) +{ +if (addr > dev->size || size > dev->size || addr > dev->size - size) +return -EINVAL; + +return fsi_slave_read(dev->slave, dev->addr + addr, val, size); +} +EXPORT_SYMBOL_GPL(fsi_device_read); + +int fsi_device_write(struct fsi_device *dev, uint32_t addr, const void *val, +size_t size) +{ +if (addr > dev->size || size > dev->size || addr > dev->size - size) +return -EINVAL; + +return fsi_slave_write(dev->slave, dev->addr + addr, val, size); +} +EXPORT_SYMBOL_GPL(fsi_device_write); + +int fsi_device_peek(struct fsi_device *dev, void *val) +{ +uint32_t addr = FSI_PEEK_BASE + ((dev->unit - 2) 
* sizeof(uint32_t)); + +return fsi_slave_read(dev->slave, addr, val, sizeof(uint32_t)); +} + static void fsi_device_release(struct device *_device) { struct fsi_device *device = to_fsi_dev(_device); diff --git a/include/linux/fsi.h b/include/linux/fsi.h index efa55ba..66bce48 100644 --- a/include/linux/fsi.h +++ b/include/linux/fsi.h @@ -27,6 +27,12 @@ struct fsi_device { uint32_tsize; }; +extern int fsi_device_read(struct fsi_device *dev, uint32_t addr, +void *val, size_t size); +extern int fsi_device_write(struct fsi_device *dev, uint32_t addr, +const void *val, size_t size); +extern int fsi_device_peek(struct fsi_device *dev, void *val); + struct fsi_device_id { u8engine_type; u8version; @@ -40,7 +46,6 @@ struct fsi_device_id { #define FSI_DEVICE_VERSIONED(t, v) \ .engine_type = (t), .version = (v), - struct fsi_driver { struct device_driverdrv; const struct fsi_device_id*id_table; I also wrote a driver against this API (an i2c algorithm). Everything looks good and working well. Edward (Eddie) James
Re: [PATCH v4 1/5] dt-bindings: gpu: add bindings for the ARM Mali Midgard GPU
On Tue, May 2, 2017 at 6:23 AM, Guillaume Tucker wrote: > Hi Rob, > > On 28/04/17 20:27, Rob Herring wrote: >> >> On Tue, Apr 25, 2017 at 02:16:16PM +0100, Guillaume Tucker wrote: > > >>> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt >>> b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt >>> new file mode 100644 >>> index ..547ddeceb498 >>> --- /dev/null >>> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt >>> @@ -0,0 +1,82 @@ >>> +ARM Mali Midgard GPU >>> + >>> + >>> +Required properties: >>> + >>> +- compatible : >>> + * Must be one of the following: >>> ++ "arm,mali-t60x" >>> ++ "arm,mali-t62x" >> >> >> Don't use wildcards. > > > Sure, old habits die hard... I'll fix it in patch v5. > >>> ++ "arm,mali-t720" >>> ++ "arm,mali-t760" >>> ++ "arm,mali-t820" >>> ++ "arm,mali-t830" >>> ++ "arm,mali-t860" >>> ++ "arm,mali-t880" >>> + * And, optionally, one of the vendor specific compatible: >> >> >> IMO, these should not be optional. > > > Well, vendor compatible strings are clearly optional for the > Utgard GPU series for which the bindings docs were recently > merged. It seems that whether these should be optional or not, > the documentation should be consistent between at least all > similar types of devices like Midgard and Utgard GPUs. They have > different architectures but from a device tree point of view, > they both have the same kind of SoC-specific integration (clocks, > irqs, regulators...). Clocks should not vary by SoC. There is often variation because clocks get driven by same source or are not s/w controlled, but really there should not be that variation. I noticed Utgard has 2 clocks. So is Midgard really just 1 clock? The DT should have all the clocks listed in the TRMs. > So was this was overlooked in the Utgard case and should it > ideally be fixed there as well as non-optional? Or, is it OK to > keep these optional on a second thought? Probably should be required in the Utgard case as well. Rob
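To make the compatible-string discussion concrete, here is a hedged sketch of a GPU node pairing a vendor-specific compatible with the generic Midgard one. The vendor compatible, addresses, interrupt numbers, and clock phandle are all invented for illustration; only the most-specific-first ordering and the generic "arm,mali-t760" string reflect the binding being discussed:

```dts
gpu@ffa30000 {
	/* most-specific compatible first, generic Midgard one last */
	compatible = "vendor,soc-mali", "arm,mali-t760";
	reg = <0xffa30000 0x10000>;
	interrupts = <0 18 4>, <0 19 4>, <0 20 4>;
	interrupt-names = "job", "mmu", "gpu";
	clocks = <&clk_gpu>;
	clock-names = "core";
};
```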
Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN
> > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h > > index 995c8f9c69..305bc34be0 100644 > > --- a/include/uapi/drm/drm_fourcc.h > > +++ b/include/uapi/drm/drm_fourcc.h > > @@ -33,8 +33,6 @@ extern "C" { > > #define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \ > > ((__u32)(c) << 16) | ((__u32)(d) << 24)) > > > > -#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of > > little endian */ > > - > > Weston references DRM_FORMAT_BIG_ENDIAN thus this patch will lead to > build breakage. > Personally I would leave the symbol, since it's UAPI and document that > should not be used. Fair enough. I can surely add a deprecated comment instead of just dropping it. I'm wondering how it is used though, given that none of the drivers in the kernel actually support this flag ... cheers, Gerd
Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free
On Tue, May 02, 2017 at 09:44:33AM +0200, Michal Hocko wrote: > On Mon 01-05-17 21:12:35, Marc MERLIN wrote: > > Howdy, > > > > Well, sadly, the problem is more or less back is 4.11.0. The system doesn't > > really > > crash but it goes into an infinite loop with > > [34776.826800] BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 > > stuck for 33s! > > More logs: https://pastebin.com/YqE4riw0 > > I am seeing a lot of traces where tasks is waiting for an IO. I do not > see any OOM report there. Why do you believe this is an OOM killer > issue? Good question. This is a followup of the problem I had in 4.8.8 until I got a patch to fix the issue. Then, it used to OOM and later, to pile up I/O tasks like this. Now it doesn't OOM anymore, but tasks still pile up. I temporarily fixed the issue by doing this: gargamel:~# echo 0 > /proc/sys/vm/dirty_ratio gargamel:~# echo 0 > /proc/sys/vm/dirty_background_ratio of course my performance is abysmal now, but I can at least run btrfs scrub without piling up enough IO to deadlock the system. On Tue, May 02, 2017 at 07:44:47PM +0900, Tetsuo Handa wrote: > > Any idea what I should do next? > > Maybe you can try collecting list of all in-flight allocations with backtraces > using kmallocwd patches at > http://lkml.kernel.org/r/1489578541-81526-1-git-send-email-penguin-ker...@i-love.sakura.ne.jp > and > http://lkml.kernel.org/r/201704272019.jeh26057.shfotmljoov...@i-love.sakura.ne.jp > which also tracks mempool allocations. > (Well, the > > - cond_resched(); > + //cond_resched(); > > change in the latter patch would not be preferable.) Thanks. I can give that a shot as soon as my current scrub is done, it may take another 12 to 24H at this rate. 
In the meantime, as explained above, not allowing any dirty VM has worked around the problem (Linus pointed out to me in the original thread that on a lightly loaded 24GB system, even 1 or 2% could still be a lot of memory for requests to pile up in and cause issues in degenerative cases like mine). Now I'm still curious what changed between 4.8.8 + custom patches and 4.11 to cause this. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
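Setting both ratios to 0 is the bluntest form of this workaround. The byte-granular knobs allow capping dirty memory at a small absolute value instead of a percentage of a 24GB machine, which keeps some write-back batching. A hedged alternative (the values are illustrative, not recommendations):

```sh
# Cap total dirty memory at 64 MiB and start background writeback at 16 MiB.
# Writing the *_bytes knobs automatically zeroes the corresponding *_ratio,
# and vice versa - only one of each pair is in effect at a time.
echo $((64 * 1024 * 1024)) > /proc/sys/vm/dirty_bytes
echo $((16 * 1024 * 1024)) > /proc/sys/vm/dirty_background_bytes
```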
Re: [GIT PULL] PCI fixes for v4.10
On 5/2/2017 6:49 AM, Lukas Wunner wrote: > On Mon, May 01, 2017 at 10:41:20PM -0400, Sinan Kaya wrote: >> On 5/1/2017 9:54 PM, Lukas Wunner wrote: >>> (b) ASPM L1 enabled on boot, but disabled after powering off and back on >>> => I believe Sinan is working on this (+cc). >> >> The decision was made not to touch ASPM registers following hotplug insertion >> unless pcie_aspm.policy=powersave is specified. >> >> The discussion is here: https://lkml.org/lkml/2017/4/17/255 >> >> This was done to maintain existing behavior and not break things. > > Thanks for the reference, I hadn't followed the discussion in April > very closely, but I think the outcome of the discussion is unfortunate. > > As can be seen in Ashok's tests, merely turning slot power off and back > on is sufficient to end up with a setting that draws more power. That > may be equally surprising for users as the issues would be that we seek > to avoid with a "safety-first" ASPM policy. In any case it seems > undesirable. > > I hope this is not the end if it and would like to encourage you to > keep working on this. Perhaps it is too simple to just define a > default policy, and what is really needed is a policy that adjusts > itself dynamically to specific devices or workloads, or that can be > influenced by device drivers. > I think our conclusion was to push PM decision to a userspace utility. ASPM already has a knob in sysfs that can be adjusted at runtime. The name of the file is policy. Same argument as the kernel command line but it is writable. Different policies can be programmed into this field after OS boot. It is really the system power management entity that needs to decide how system should behave. We just need to educate the userspace utility about the presence of ASPM policy variable. 1. OS could still boot with the default parameters. 2. Userspace utility sets the policy variable to powersave 3. User performs hotplug remove + insert 4. ASPM is enabled due to policy change. 
I don't know anything about what this userspace utility is or whether it understands PCIe ASPM at all. As an analogy, we can choose the level of power management in Windows through its power policy: Control Panel -> Power Options -> Change advanced power settings -> PCI Express -> Link State Power Management. We want to have the same level of configuration. > Thanks for your efforts, > > Lukas > -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
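For reference, the runtime knob being discussed lives at the standard sysfs location for the pcie_aspm module parameter (availability and the list of policies depend on the kernel configuration):

```sh
# Show available policies; the active one is shown in brackets
cat /sys/module/pcie_aspm/parameters/policy
# e.g.: default performance [powersave]

# Select powersave before the hotplug remove + insert
echo powersave > /sys/module/pcie_aspm/parameters/policy
```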
Re: [usb-host] question about pointer dereference before null check
On Mon, 1 May 2017, Gustavo A. R. Silva wrote: > Hello everybody, > > While taking a look into Coverity ID 100828 I ran into the following > piece of code at drivers/usb/host/ehci-sched.c:1096: > > u32 addr; > int think_time; > int hs_transfers; > > addr = dev->ttport << 24; > if (!ehci_is_TDI(ehci) > || (dev->tt->hub != > ehci_to_hcd(ehci)->self.root_hub)) > addr |= dev->tt->hub->devnum << 16; > addr |= epnum << 8; > addr |= dev->devnum; > stream->ps.usecs = HS_USECS_ISO(maxp); > think_time = dev->tt ? dev->tt->think_time : 0; > > The issue here is that dev->tt is being dereferenced before null check. > > I was thinking on placing think_time = dev->tt ? dev->tt->think_time : > 0; just before the _if_ statement. But this doesn't address the > problem of dev->tt actually being NULL. > > While looking into the callers of the function containing this piece > of code (iso_stream_init()) my impression is that dev->tt is not NULL > at the time this function is called and, a very simple patch like the > following can be applied in order to avoid the Coverity issue: > > -think_time = dev->tt ? dev->tt->think_time : 0; > +think_time = dev->tt->think_time; > > But I can't tell for sure, so in case dev->tt is NULL, a good strategy > to properly update the _addr_ variable would be needed. > > What do you think? > > I would really appreciate any comment on this, > Thank you! You are right; udev->tt cannot ever be NULL when this section of code runs. The test should be removed. Alan Stern
[PATCH 0/6] md: Fine-tuning for some function implementations
From: Markus Elfring
Date: Tue, 2 May 2017 16:12:34 +0200

Some update suggestions were taken into account from static source code analysis.

Markus Elfring (6):
  Replace seven seq_printf() calls by seq_putc()
  Replace 17 seq_printf() calls by seq_puts()
  Adjust four function calls together with a variable assignment
  Use seq_puts() in faulty_status()
  Adjust six function calls together with a variable assignment in faulty_status()
  Add some spaces for better code readability

 drivers/md/faulty.c | 48 +++++++++++++++++-----------------
 drivers/md/md.c     | 66 ++++++++++++++++++++++++---------------------
 2 files changed, 64 insertions(+), 50 deletions(-)

--
2.12.2
[PATCH] Revert "KVM: Support vCPU-based gfn->hva cache"
This reverts commit bbd6411513aa8ef3ea02abab61318daf87c1af1e. I've been sitting on this revert for too long and it unfortunately missed 4.11. It's also the reason why I haven't merged ring-based dirty tracking for 4.12. Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and kvm_vcpu_write_guest_offset_cached means that the MSR value can now be used to access SMRAM, simply by making it point to an SMRAM physical address. This is problematic because it lets the guest OS overwrite memory that it shouldn't be able to touch. Cc: sta...@vger.kernel.org Fixes: bbd6411513aa8ef3ea02abab61318daf87c1af1e Signed-off-by: Paolo Bonzini --- arch/x86/kvm/lapic.c | 22 -- arch/x86/kvm/x86.c | 41 + include/linux/kvm_host.h | 16 virt/kvm/kvm_main.c | 34 +- 4 files changed, 58 insertions(+), 55 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index bad6a25067bc..9fa5b8164961 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -529,14 +529,16 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq, static int pv_eoi_put_user(struct kvm_vcpu *vcpu, u8 val) { - return kvm_vcpu_write_guest_cached(vcpu, >arch.pv_eoi.data, , - sizeof(val)); + + return kvm_write_guest_cached(vcpu->kvm, >arch.pv_eoi.data, , + sizeof(val)); } static int pv_eoi_get_user(struct kvm_vcpu *vcpu, u8 *val) { - return kvm_vcpu_read_guest_cached(vcpu, >arch.pv_eoi.data, val, - sizeof(*val)); + + return kvm_read_guest_cached(vcpu->kvm, >arch.pv_eoi.data, val, + sizeof(*val)); } static inline bool pv_eoi_enabled(struct kvm_vcpu *vcpu) @@ -2285,8 +2287,8 @@ void kvm_lapic_sync_from_vapic(struct kvm_vcpu *vcpu) if (!test_bit(KVM_APIC_CHECK_VAPIC, >arch.apic_attention)) return; - if (kvm_vcpu_read_guest_cached(vcpu, >arch.apic->vapic_cache, , - sizeof(u32))) + if (kvm_read_guest_cached(vcpu->kvm, >arch.apic->vapic_cache, , + sizeof(u32))) return; apic_set_tpr(vcpu->arch.apic, data & 0xff); @@ -2338,14 +2340,14 @@ void kvm_lapic_sync_to_vapic(struct kvm_vcpu *vcpu) max_isr 
= 0; data = (tpr & 0xff) | ((max_isr & 0xf0) << 8) | (max_irr << 24); - kvm_vcpu_write_guest_cached(vcpu, >arch.apic->vapic_cache, , - sizeof(u32)); + kvm_write_guest_cached(vcpu->kvm, >arch.apic->vapic_cache, , + sizeof(u32)); } int kvm_lapic_set_vapic_addr(struct kvm_vcpu *vcpu, gpa_t vapic_addr) { if (vapic_addr) { - if (kvm_vcpu_gfn_to_hva_cache_init(vcpu, + if (kvm_gfn_to_hva_cache_init(vcpu->kvm, >arch.apic->vapic_cache, vapic_addr, sizeof(u32))) return -EINVAL; @@ -2439,7 +2441,7 @@ int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data) vcpu->arch.pv_eoi.msr_val = data; if (!pv_eoi_enabled(vcpu)) return 0; - return kvm_vcpu_gfn_to_hva_cache_init(vcpu, >arch.pv_eoi.data, + return kvm_gfn_to_hva_cache_init(vcpu->kvm, >arch.pv_eoi.data, addr, sizeof(u8)); } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2fe9aa116288..b38a302858a0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1785,7 +1785,7 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) struct kvm_vcpu_arch *vcpu = >arch; struct pvclock_vcpu_time_info guest_hv_clock; - if (unlikely(kvm_vcpu_read_guest_cached(v, >pv_time, + if (unlikely(kvm_read_guest_cached(v->kvm, >pv_time, _hv_clock, sizeof(guest_hv_clock return; @@ -1806,9 +1806,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != 0); vcpu->hv_clock.version = guest_hv_clock.version + 1; - kvm_vcpu_write_guest_cached(v, >pv_time, - >hv_clock, - sizeof(vcpu->hv_clock.version)); + kvm_write_guest_cached(v->kvm, >pv_time, + >hv_clock, + sizeof(vcpu->hv_clock.version)); smp_wmb(); @@ -1822,16 +1822,16 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) trace_kvm_pvclock_update(v->vcpu_id, >hv_clock); - kvm_vcpu_write_guest_cached(v, >pv_time, - >hv_clock, - sizeof(vcpu->hv_clock)); + kvm_write_guest_cached(v->kvm,
[PATCH 1/6] md: Replace seven seq_printf() calls by seq_putc()
From: Markus Elfring
Date: Tue, 2 May 2017 14:01:17 +0200

A few single characters should be put into a sequence. Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring
---
 drivers/md/md.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 82f798be964f..7f0a35ee192a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7498,7 +7498,7 @@ static void status_unused(struct seq_file *seq)
 	if (!i)
 		seq_printf(seq, "<none>");
 
-	seq_printf(seq, "\n");
+	seq_putc(seq, '\n');
 }
 
 static int status_resync(struct seq_file *seq, struct mddev *mddev)
@@ -7552,12 +7552,13 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 	per_milli = res;
 	{
 		int i, x = per_milli/50, y = 20-x;
-		seq_printf(seq, "[");
+
+		seq_putc(seq, '[');
 		for (i = 0; i < x; i++)
-			seq_printf(seq, "=");
-		seq_printf(seq, ">");
+			seq_putc(seq, '=');
+		seq_putc(seq, '>');
 		for (i = 0; i < y; i++)
-			seq_printf(seq, ".");
+			seq_putc(seq, '.');
 		seq_printf(seq, "] ");
 	}
 	seq_printf(seq, " %s =%3u.%u%% (%llu/%llu)",
@@ -7678,7 +7679,7 @@ static int md_seq_show(struct seq_file *seq, void *v)
 			seq_printf(seq, "[%s] ", pers->name);
 		spin_unlock(&pers_lock);
 
-		seq_printf(seq, "\n");
+		seq_putc(seq, '\n');
 		seq->poll_event = atomic_read(&md_event_count);
 		return 0;
 	}
@@ -7754,8 +7755,7 @@ static int md_seq_show(struct seq_file *seq, void *v)
 		seq_printf(seq, "\n       ");
 		bitmap_status(seq, mddev->bitmap);
-
-		seq_printf(seq, "\n");
+		seq_putc(seq, '\n');
 	}
 	spin_unlock(&mddev->lock);
 
 	return 0;
--
2.12.2
[PATCH 2/6] md: Replace 17 seq_printf() calls by seq_puts()
From: Markus Elfring Date: Tue, 2 May 2017 14:22:45 +0200 Strings which did not contain data format specifications should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/md/md.c | 38 -- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 7f0a35ee192a..65233a91e5e2 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -7487,7 +7487,7 @@ static void status_unused(struct seq_file *seq) int i = 0; struct md_rdev *rdev; - seq_printf(seq, "unused devices: "); + seq_puts(seq, "unused devices: "); list_for_each_entry(rdev, _raid_disks, same_set) { char b[BDEVNAME_SIZE]; @@ -7496,7 +7496,7 @@ static void status_unused(struct seq_file *seq) bdevname(rdev->bdev,b)); } if (!i) - seq_printf(seq, ""); + seq_puts(seq, ""); seq_putc(seq, '\n'); } @@ -7525,13 +7525,13 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev) if (resync == 0) { if (mddev->recovery_cp < MaxSector) { - seq_printf(seq, "\tresync=PENDING"); + seq_puts(seq, "\tresync=PENDING"); return 1; } return 0; } if (resync < 3) { - seq_printf(seq, "\tresync=DELAYED"); + seq_puts(seq, "\tresync=DELAYED"); return 1; } @@ -7559,7 +7559,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev) seq_putc(seq, '>'); for (i = 0; i < y; i++) seq_putc(seq, '.'); - seq_printf(seq, "] "); + seq_puts(seq, "] "); } seq_printf(seq, " %s =%3u.%u%% (%llu/%llu)", (test_bit(MD_RECOVERY_RESHAPE, >recovery)? @@ -7673,7 +7673,8 @@ static int md_seq_show(struct seq_file *seq, void *v) if (v == (void*)1) { struct md_personality *pers; - seq_printf(seq, "Personalities : "); + + seq_puts(seq, "Personalities : "); spin_lock(_lock); list_for_each_entry(pers, _list, list) seq_printf(seq, "[%s] ", pers->name); @@ -7694,9 +7695,9 @@ static int md_seq_show(struct seq_file *seq, void *v) mddev->pers ? 
"" : "in"); if (mddev->pers) { if (mddev->ro==1) - seq_printf(seq, " (read-only)"); + seq_puts(seq, " (read-only)"); if (mddev->ro==2) - seq_printf(seq, " (auto-read-only)"); + seq_puts(seq, " (auto-read-only)"); seq_printf(seq, " %s", mddev->pers->name); } @@ -7707,17 +7708,17 @@ static int md_seq_show(struct seq_file *seq, void *v) seq_printf(seq, " %s[%d]", bdevname(rdev->bdev,b), rdev->desc_nr); if (test_bit(WriteMostly, >flags)) - seq_printf(seq, "(W)"); + seq_puts(seq, "(W)"); if (test_bit(Journal, >flags)) - seq_printf(seq, "(J)"); + seq_puts(seq, "(J)"); if (test_bit(Faulty, >flags)) { - seq_printf(seq, "(F)"); + seq_puts(seq, "(F)"); continue; } if (rdev->raid_disk < 0) - seq_printf(seq, "(S)"); /* spare */ + seq_puts(seq, "(S)"); /* spare */ if (test_bit(Replacement, >flags)) - seq_printf(seq, "(R)"); + seq_puts(seq, "(R)"); sectors += rdev->sectors; } rcu_read_unlock(); @@ -7742,17 +7743,18 @@ static int md_seq_show(struct seq_file *seq, void *v) seq_printf(seq, " super external:%s", mddev->metadata_type); else - seq_printf(seq, " super non-persistent"); + seq_puts(seq, " super non-persistent"); if (mddev->pers) { mddev->pers->status(seq, mddev); - seq_printf(seq, "\n "); + seq_puts(seq, "\n "); if (mddev->pers->sync_request) { if (status_resync(seq, mddev)) -
[PATCH 3/6] md: Adjust four function calls together with a variable assignment
From: Markus Elfring
Date: Tue, 2 May 2017 15:07:21 +0200

The script "checkpatch.pl" pointed information out like the following.

ERROR: do not use assignment in if condition

Thus fix affected source code places.

Signed-off-by: Markus Elfring
---
 drivers/md/md.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 65233a91e5e2..60580095a5de 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2153,7 +2153,8 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	rdev->mddev = mddev;
 	pr_debug("md: bind<%s>\n", b);
 
-	if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
+	err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b);
+	if (err)
 		goto fail;
 
 	ko = &part_to_dev(rdev->bdev->bd_part)->kobj;
@@ -7306,7 +7307,8 @@ static int md_open(struct block_device *bdev, fmode_t mode)
 	}
 	BUG_ON(mddev != bdev->bd_disk->private_data);
 
-	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
+	err = mutex_lock_interruptible(&mddev->open_mutex);
+	if (err)
 		goto out;
 
 	if (test_bit(MD_CLOSING, &mddev->flags)) {
@@ -8927,10 +8929,12 @@ static int __init md_init(void)
 	if (!md_misc_wq)
 		goto err_misc_wq;
 
-	if ((ret = register_blkdev(MD_MAJOR, "md")) < 0)
+	ret = register_blkdev(MD_MAJOR, "md");
+	if (ret < 0)
 		goto err_md;
 
-	if ((ret = register_blkdev(0, "mdp")) < 0)
+	ret = register_blkdev(0, "mdp");
+	if (ret < 0)
 		goto err_mdp;
 
 	mdp_major = ret;
--
2.12.2
Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN
On Tue, 2 May 2017 14:53:43 +0100 Emil Velikov wrote: > Hi Gerd, > > I did not have the change to follow through the discussion. > Pardon if my suggestion have already been discussed. > > On 2 May 2017 at 14:34, Gerd Hoffmann wrote: > > It's unused. > > > > Suggested-by: Daniel Vetter > > Signed-off-by: Gerd Hoffmann > > --- > > include/uapi/drm/drm_fourcc.h | 2 -- > > drivers/gpu/drm/drm_fourcc.c | 3 +-- > > drivers/gpu/drm/drm_framebuffer.c | 2 +- > > 3 files changed, 2 insertions(+), 5 deletions(-) > > > > diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h > > index 995c8f9c69..305bc34be0 100644 > > --- a/include/uapi/drm/drm_fourcc.h > > +++ b/include/uapi/drm/drm_fourcc.h > > @@ -33,8 +33,6 @@ extern "C" { > > #define fourcc_code(a, b, c, d) ((__u32)(a) | ((__u32)(b) << 8) | \ > > ((__u32)(c) << 16) | ((__u32)(d) << 24)) > > > > -#define DRM_FORMAT_BIG_ENDIAN (1<<31) /* format is big endian instead of > > little endian */ > > - > > Weston references DRM_FORMAT_BIG_ENDIAN thus this patch will lead to > build breakage. > That is never fun, so please carefully coordinate with the Weston devs > to minimise the fireworks. > > Personally I would leave the symbol, since it's UAPI and document that > should not be used. Not my call, so feel free to ignore. Hi, indeed, weston does have one occurrence of it. I don't think it has actually been needed in practice ever, but removing it will cause a build failure: https://cgit.freedesktop.org/wayland/weston/tree/libweston/gl-renderer.c?id=2.0.0#n1820 Funnily enough, it's only ever used to get rid of the bit, "just in case". I also think that this patch requires more comments than the commit message has at the moment. Removing the definition also removes the possibility to describe a lot of pixel formats, so that should definitely be mentioned. I think it would also be good to have some kind of justified claim that no hardware actually needs the pixel formats this is removing (e.g. RGB565 BE). 
Maybe this was already in the long discussions, but I feel it should be summarized in the commit message. Thanks, pq pgp6YKhPrkfy_.pgp Description: OpenPGP digital signature
Re: [patch 2/2] MM: allow per-cpu vmstat_threshold and vmstat_worker configuration
On Tue, 25 Apr 2017 10:57:19 -0300 Marcelo Tosatti wrote: > The per-CPU vmstat worker is a problem on -RT workloads (because > ideally the CPU is entirely reserved for the -RT app, without > interference). The worker transfers accumulated per-CPU > vmstat counters to global counters. This is a problem for non-RT too. Any task pinned to an isolated CPU that doesn't want to be ever interrupted will be interrupted by the vmstat kworker. > To resolve the problem, create two tunables: > > * Userspace configurable per-CPU vmstat threshold: by default the > VM code calculates the size of the per-CPU vmstat arrays. This > tunable allows userspace to configure the values. > > * Userspace configurable per-CPU vmstat worker: allow disabling > the per-CPU vmstat worker. I have several questions about the tunables: - What does the vmstat_threshold value mean? What are the implications of changing this value? What's the difference in choosing 1, 2, 3 or 500? - If the purpose of having vmstat_threshold is to allow disabling the vmstat kworker, why can't the kernel pick a value automatically? - What are the implications of disabling the vmstat kworker? Will vm stats still be collected someway or will it be completely off for the CPU? Also, shouldn't this patch be split into two? > The patch below contains documentation which describes the tunables > in more detail. 
> > Signed-off-by: Marcelo Tosatti > > --- > Documentation/vm/vmstat_thresholds.txt | 38 + > mm/vmstat.c| 248 > +++-- > 2 files changed, 272 insertions(+), 14 deletions(-) > > Index: linux-2.6-git-disable-vmstat-worker/mm/vmstat.c > === > --- linux-2.6-git-disable-vmstat-worker.orig/mm/vmstat.c 2017-04-25 > 07:39:13.941019853 -0300 > +++ linux-2.6-git-disable-vmstat-worker/mm/vmstat.c 2017-04-25 > 10:44:51.581977296 -0300 > @@ -91,8 +91,17 @@ > EXPORT_SYMBOL(vm_zone_stat); > EXPORT_SYMBOL(vm_node_stat); > > +struct vmstat_uparam { > + atomic_t vmstat_work_enabled; > + atomic_t user_stat_thresh; > +}; > + > +static DEFINE_PER_CPU(struct vmstat_uparam, vmstat_uparam); > + > #ifdef CONFIG_SMP > > +#define MAX_THRESHOLD 125 > + > int calculate_pressure_threshold(struct zone *zone) > { > int threshold; > @@ -110,9 +119,9 @@ > threshold = max(1, (int)(watermark_distance / num_online_cpus())); > > /* > - * Maximum threshold is 125 > + * Maximum threshold is MAX_THRESHOLD == 125 >*/ > - threshold = min(125, threshold); > + threshold = min(MAX_THRESHOLD, threshold); > > return threshold; > } > @@ -188,15 +197,31 @@ > threshold = calculate_normal_threshold(zone); > > for_each_online_cpu(cpu) { > - int pgdat_threshold; > + int pgdat_threshold, ustat_thresh; > + struct vmstat_uparam *vup; > > - per_cpu_ptr(zone->pageset, cpu)->stat_threshold > - = threshold; > + struct per_cpu_nodestat __percpu *pcp; > + struct per_cpu_pageset *p; > + > + p = per_cpu_ptr(zone->pageset, cpu); > + > + vup = _cpu(vmstat_uparam, cpu); > + ustat_thresh = atomic_read(>user_stat_thresh); > + > + if (ustat_thresh) > + p->stat_threshold = ustat_thresh; > + else > + p->stat_threshold = threshold; > + > + pcp = per_cpu_ptr(pgdat->per_cpu_nodestats, cpu); > > /* Base nodestat threshold on the largest populated > zone. 
*/ > - pgdat_threshold = per_cpu_ptr(pgdat->per_cpu_nodestats, cpu)->stat_threshold; > - per_cpu_ptr(pgdat->per_cpu_nodestats, cpu)->stat_threshold > - = max(threshold, pgdat_threshold); > + pgdat_threshold = pcp->stat_threshold; > + if (ustat_thresh) > + pcp->stat_threshold = ustat_thresh; > + else > + pcp->stat_threshold = max(threshold, > + pgdat_threshold); > } > > /* > @@ -226,9 +251,24 @@ > continue; > > threshold = (*calculate_pressure)(zone); > - for_each_online_cpu(cpu) > + for_each_online_cpu(cpu) { > + int t, ustat_thresh; > + struct vmstat_uparam *vup; > + > + vup = &per_cpu(vmstat_uparam, cpu); > + ustat_thresh = atomic_read(&vup->user_stat_thresh); > + t = threshold; > + > + /* > +
[PATCH 4/6] md: Use seq_puts() in faulty_status()
From: Markus Elfring Date: Tue, 2 May 2017 15:20:30 +0200 A string which did not contain a data format specification should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/md/faulty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/faulty.c b/drivers/md/faulty.c index b0536cfd8e17..f5536c91be5c 100644 --- a/drivers/md/faulty.c +++ b/drivers/md/faulty.c @@ -253,7 +253,7 @@ static void faulty_status(struct seq_file *seq, struct mddev *mddev) n, conf->period[ReadFixable]); if ((n=atomic_read(&conf->counters[WriteAll])) != 0) - seq_printf(seq, " WriteAll"); + seq_puts(seq, " WriteAll"); seq_printf(seq, " nfaults=%d", conf->nfaults); } -- 2.12.2
[PATCH 5/6] md: Adjust six function calls together with a variable assignment in faulty_status()
From: Markus Elfring Date: Tue, 2 May 2017 15:35:35 +0200 The script "checkpatch.pl" pointed information out like the following. ERROR: do not use assignment in if condition Thus fix the affected source code places. Signed-off-by: Markus Elfring --- drivers/md/faulty.c | 19 --- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/drivers/md/faulty.c b/drivers/md/faulty.c index f5536c91be5c..2573009b1265 100644 --- a/drivers/md/faulty.c +++ b/drivers/md/faulty.c @@ -231,28 +231,33 @@ static void faulty_status(struct seq_file *seq, struct mddev *mddev) struct faulty_conf *conf = mddev->private; int n; - if ((n=atomic_read(&conf->counters[WriteTransient])) != 0) + n = atomic_read(&conf->counters[WriteTransient]); + if (n != 0) seq_printf(seq, " WriteTransient=%d(%d)", n, conf->period[WriteTransient]); - if ((n=atomic_read(&conf->counters[ReadTransient])) != 0) + n = atomic_read(&conf->counters[ReadTransient]); + if (n != 0) seq_printf(seq, " ReadTransient=%d(%d)", n, conf->period[ReadTransient]); - if ((n=atomic_read(&conf->counters[WritePersistent])) != 0) + n = atomic_read(&conf->counters[WritePersistent]); + if (n != 0) seq_printf(seq, " WritePersistent=%d(%d)", n, conf->period[WritePersistent]); - if ((n=atomic_read(&conf->counters[ReadPersistent])) != 0) + n = atomic_read(&conf->counters[ReadPersistent]); + if (n != 0) seq_printf(seq, " ReadPersistent=%d(%d)", n, conf->period[ReadPersistent]); - - if ((n=atomic_read(&conf->counters[ReadFixable])) != 0) + n = atomic_read(&conf->counters[ReadFixable]); + if (n != 0) seq_printf(seq, " ReadFixable=%d(%d)", n, conf->period[ReadFixable]); - if ((n=atomic_read(&conf->counters[WriteAll])) != 0) + n = atomic_read(&conf->counters[WriteAll]); + if (n != 0) seq_puts(seq, " WriteAll"); seq_printf(seq, " nfaults=%d", conf->nfaults); -- 2.12.2
Re: [PATCH RFC v2 4/6] proc: support mounting private procfs instances inside same pid namespace
On Thu, Apr 27, 2017 at 12:13 AM, Andy Lutomirski wrote: > On Tue, Apr 25, 2017 at 5:23 AM, Djalal Harouni wrote: [...] >> We have to align procfs and modernize it to have a per mount context >> where at least the mount options do not propagate to all other mounts, >> then maybe we can continue to implement new features. One example is to >> require CAP_SYS_ADMIN in the init user namespace on some /proc/* which are >> not pids and which are not virtualized by design, or CAP_NET_ADMIN >> inside userns on the net bits that are virtualized, etc. >> These mount options won't propagate to previous mounts, and the system >> will continue to be usable. >> >> This patch introduces the new 'limit_pids' mount option as it was also >> suggested by Andy Lutomirski [1]. When this option is passed we >> automatically create a private procfs instance. This is not the default >> behaviour since we do not want to break userspace and we do not want to >> provide different device IDs by default, please see [1] for why. > > I think that calling the option to make a separate instance > "limit_pids" is extremely counterintuitive. Ok. > My strong preference would be to make proc *always* make a separate > instance (unless it's a bind mount) and to make it work. If that > means fudging stat() output, so be it. I also agree, but as said, if we change stat(), userspace won't be able to notice whether these two proc instances are really separated; the device ID is the only indication here. So, > Failing that, let's come up with some coherent way to make this work. > "new_instance" or similar would do. Then make limit_pid cause an > error unless new_instance is also set. This is reasonable and it follows the devpts filesystem: when 'new_instance' was introduced there to gradually switch to private instances, it was later made the default behaviour. We can do the same and improve the situation bit by bit without breaking anything.
I will prepare a new clean version with "newinstance" and "pids=ptraceable" requiring it, this way we don't change anything that is related to current 'hidepid' behaviour. Let me know please if you don't agree. Thank you for the feedback! -- tixxdz
[PATCH 6/6] md: Add some spaces for better code readability
From: Markus Elfring Date: Tue, 2 May 2017 16:00:45 +0200 Use space characters and blank lines at some source code places according to the Linux coding style convention. Signed-off-by: Markus Elfring --- drivers/md/faulty.c | 27 +++ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/drivers/md/faulty.c b/drivers/md/faulty.c index 2573009b1265..434c3643c9c3 100644 --- a/drivers/md/faulty.c +++ b/drivers/md/faulty.c @@ -110,19 +110,20 @@ static int check_sector(struct faulty_conf *conf, sector_t start, sector_t end, { /* If we find a ReadFixable sector, we fix it ... */ int i; - for (i=0; i<conf->nfaults; i++) + + for (i = 0; i < conf->nfaults; i++) if (conf->faults[i] >= start && conf->faults[i] < end) { /* found it ... */ switch (conf->modes[i] * 2 + dir) { - case WritePersistent*2+WRITE: return 1; - case ReadPersistent*2+READ: return 1; - case ReadFixable*2+READ: return 1; - case ReadFixable*2+WRITE: + case WritePersistent * 2 + WRITE: return 1; + case ReadPersistent * 2 + READ: return 1; + case ReadFixable * 2 + READ: return 1; + case ReadFixable * 2 + WRITE: conf->modes[i] = NoPersist; return 0; - case AllPersist*2+READ: - case AllPersist*2+WRITE: return 1; + case AllPersist * 2 + READ: + case AllPersist * 2 + WRITE: return 1; default: return 0; } @@ -134,9 +135,10 @@ static void add_sector(struct faulty_conf *conf, sector_t start, int mode) { int i; int n = conf->nfaults; - for (i=0; i<conf->nfaults; i++) + + for (i = 0; i < conf->nfaults; i++) if (conf->faults[i] == start) { - switch(mode) { + switch (mode) { case NoPersist: conf->modes[i] = mode; return; case WritePersistent: if (conf->modes[i] == ReadPersistent || @@ -167,7 +169,7 @@ static void add_sector(struct faulty_conf *conf, sector_t start, int mode) conf->faults[n] = start; conf->modes[n] = mode; if (conf->nfaults == n) - conf->nfaults = n+1; + conf->nfaults = n + 1; } static void faulty_make_request(struct mddev *mddev, struct bio *bio) @@ -278,7 +280,8 @@ static int faulty_reshape(struct mddev
*mddev) conf->nfaults = 0; else if (mode == ClearErrors) { int i; - for (i=0 ; i < Modes ; i++) { + + for (i = 0; i < Modes; i++) { conf->period[i] = 0; atomic_set(&conf->counters[i], 0); } @@ -317,7 +320,7 @@ static int faulty_run(struct mddev *mddev) if (!conf) return -ENOMEM; - for (i=0; i<Modes; i++) { + for (i = 0; i < Modes; i++) { atomic_set(&conf->counters[i], 0); conf->period[i] = 0; } -- 2.12.2
Re: [PATCH] usb: gadget: udc: add null check before pointer dereference
On Mon, 1 May 2017, Gustavo A. R. Silva wrote: > Add null check before dereferencing dev->regs pointer inside > net2280_led_shutdown() function. > > Addresses-Coverity-ID: 101783 > Signed-off-by: Gustavo A. R. Silva > --- > drivers/usb/gadget/udc/net2280.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/usb/gadget/udc/net2280.c > b/drivers/usb/gadget/udc/net2280.c > index 3828c2e..1898a4b 100644 > --- a/drivers/usb/gadget/udc/net2280.c > +++ b/drivers/usb/gadget/udc/net2280.c > @@ -3573,7 +3573,11 @@ static void net2280_remove(struct pci_dev *pdev) > BUG_ON(dev->driver); > > /* then clean up the resources we allocated during probe() */ > - net2280_led_shutdown(dev); > + > + if (dev->regs) { > + net2280_led_shutdown(dev); > + iounmap(dev->regs); > + } > if (dev->requests) { > int i; > for (i = 1; i < 5; i++) { > @@ -3588,8 +3592,6 @@ static void net2280_remove(struct pci_dev *pdev) > free_irq(pdev->irq, dev); > if (dev->quirks & PLX_PCIE) > pci_disable_msi(pdev); > - if (dev->regs) > - iounmap(dev->regs); > if (dev->region) > release_mem_region(pci_resource_start(pdev, 0), > pci_resource_len(pdev, 0)); No, you must not move the iounmap() call, because an interrupt could theoretically occur at any time. Either just live with an extra test of dev->regs, or else move the net2280_led_shutdown() call later. Alan Stern
Re: [PATCH] md/bitmap: use i_blocksize()
On Sun, Jan 22, 2017 at 09:50:30AM +0800, Geliang Tang wrote: > On Sat, Jan 21, 2017 at 10:13:07AM -0800, Shaohua Li wrote: > > On Fri, Jan 20, 2017 at 10:29:52PM +0800, Geliang Tang wrote: > > > Since i_blocksize() helper has been defined in fs.h, use it instead > > > of open-coding. > > > > which tree is this patch applied to? I can't find it in Linus's tree > > > > This patch is against the last linux-next tree, next-20170120. It > depends on the patch '2651ed7 fs: add i_blocksize()' which isn't merged > into Linus's tree yet. > > -Geliang Hi Shaohua, The i_blocksize() helper has been merged into Linus's tree already, and this patch is still valid. Could you please reconsider it if this patch can be applied? Thanks. -Geliang
Re: [PATCH 00/14] sched/topology fixes
On Fri, Apr 28, 2017 at 03:53:39PM +0200, Peter Zijlstra wrote: > Also, the following occurred to me: > > sg_span & sg_mask == sg_mask > > Therefore, we don't need to do the whole "sg_span &" business. > > Hmm? > @@ -856,7 +857,7 @@ build_sched_groups(struct sched_domain * > continue; > > group = get_group(i, sdd, &sg); > - cpumask_setall(sched_group_mask(sg)); > + cpumask_copy(sched_group_mask(sg), sched_group_cpus(sg)); > > for_each_cpu(j, span) { > if (get_group(j, sdd, NULL) != group) OK, so this explodes mightily. That code also hurt my brain bad, so I had to fix that a little. The below seems to boot. --- --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7996,7 +7996,7 @@ static int active_load_balance_cpu_stop( static int should_we_balance(struct lb_env *env) { struct sched_group *sg = env->sd->groups; - struct cpumask *sg_cpus, *sg_mask; + struct cpumask *sg_mask; int cpu, balance_cpu = -1; /* @@ -8006,11 +8006,10 @@ static int should_we_balance(struct lb_e if (env->idle == CPU_NEWLY_IDLE) return 1; - sg_cpus = sched_group_cpus(sg); sg_mask = sched_group_mask(sg); /* Try to find first idle cpu */ - for_each_cpu_and(cpu, sg_cpus, env->cpus) { - if (!cpumask_test_cpu(cpu, sg_mask) || !idle_cpu(cpu)) + for_each_cpu_and(cpu, sg_mask, env->cpus) { + if (!idle_cpu(cpu)) continue; balance_cpu = cpu; --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -85,7 +85,8 @@ static int sched_domain_debug_one(struct group->sgc->id, cpumask_pr_args(sched_group_cpus(group))); - if ((sd->flags & SD_OVERLAP) && !cpumask_full(sched_group_mask(group))) { + if ((sd->flags & SD_OVERLAP) && + !cpumask_equal(sched_group_mask(group), sched_group_cpus(group))) { printk(KERN_CONT " mask=%*pbl", cpumask_pr_args(sched_group_mask(group))); } @@ -505,7 +506,7 @@ enum s_alloc { */ int group_balance_cpu(struct sched_group *sg) { - return cpumask_first_and(sched_group_cpus(sg), sched_group_mask(sg)); + return cpumask_first(sched_group_mask(sg)); } @@ -833,23 +834,34 @@
build_overlap_sched_groups(struct sched_ * [*] in other words, the first group of each domain is its child domain. */ -static int get_group(int cpu, struct sd_data *sdd, struct sched_group **sg) +static struct sched_group *get_group(int cpu, struct sd_data *sdd) { struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu); struct sched_domain *child = sd->child; + struct sched_group *sg; if (child) cpu = cpumask_first(sched_domain_span(child)); - if (sg) { - *sg = *per_cpu_ptr(sdd->sg, cpu); - (*sg)->sgc = *per_cpu_ptr(sdd->sgc, cpu); + sg = *per_cpu_ptr(sdd->sg, cpu); + sg->sgc = *per_cpu_ptr(sdd->sgc, cpu); - /* For claim_allocations: */ - atomic_set(&(*sg)->sgc->ref, 1); + /* For claim_allocations: */ + atomic_inc(&sg->ref); + atomic_inc(&sg->sgc->ref); + + if (child) { + cpumask_copy(sched_group_cpus(sg), sched_domain_span(child)); + cpumask_copy(sched_group_mask(sg), sched_group_cpus(sg)); + } else { + cpumask_set_cpu(cpu, sched_group_cpus(sg)); + cpumask_set_cpu(cpu, sched_group_cpus(sg)); } - return cpu; + sg->sgc->capacity = SCHED_CAPACITY_SCALE * cpumask_weight(sched_group_cpus(sg)); + sg->sgc->min_capacity = SCHED_CAPACITY_SCALE; + + return sg; } /* @@ -868,34 +880,20 @@ build_sched_groups(struct sched_domain * struct cpumask *covered; int i; - get_group(cpu, sdd, &sd->groups); - atomic_inc(&sd->groups->ref); - - if (cpu != cpumask_first(span)) - return 0; - lockdep_assert_held(&sched_domains_mutex); covered = sched_domains_tmpmask; cpumask_clear(covered); - for_each_cpu(i, span) { + for_each_cpu_wrap(i, span, cpu) { struct sched_group *sg; - int group, j; if (cpumask_test_cpu(i, covered)) continue; - group = get_group(i, sdd, &sg); - cpumask_setall(sched_group_mask(sg)); + sg = get_group(i, sdd); - for_each_cpu(j, span) { - if (get_group(j, sdd, NULL) != group) - continue; - - cpumask_set_cpu(j, covered); - cpumask_set_cpu(j, sched_group_cpus(sg)); - } + cpumask_or(covered, covered, sched_group_cpus(sg));
[PATCH V2 0/3] try to save some memory for kmem_cache in some cases
kmem_cache is a frequently used data structure in the kernel. While reading the code, I found that we could save some space in a few cases. 1. On a 64bit arch, an int will occupy a whole word if it doesn't sit next to another int. 2. cpu_slab->partial is only used when CONFIG_SLUB_CPU_PARTIAL is set. 3. cpu_partial is only used when CONFIG_SLUB_CPU_PARTIAL is set; wrapping it only saves space on 32bit arches. v2: define some macros to make the code more elegant Wei Yang (3): mm/slub: pack red_left_pad with another int to save a word mm/slub: wrap cpu_slab->partial in CONFIG_SLUB_CPU_PARTIAL mm/slub: wrap kmem_cache->cpu_partial in config CONFIG_SLUB_CPU_PARTIAL include/linux/slub_def.h | 34 ++- mm/slub.c | 85 ++-- 2 files changed, 80 insertions(+), 39 deletions(-) -- 2.11.0
[PATCH V2 1/3] mm/slub: pack red_left_pad with another int to save a word
On a 64bit arch, the struct is 8-byte aligned, so an int will occupy a whole word if it doesn't sit next to another int. This patch packs red_left_pad with reserved to save 8 bytes in struct kmem_cache on a 64bit arch. Signed-off-by: Wei Yang --- include/linux/slub_def.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index 07ef550c6627..ec13aab32647 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -79,9 +79,9 @@ struct kmem_cache { int inuse; /* Offset to metadata */ int align; /* Alignment */ int reserved; /* Reserved bytes at the end of slabs */ + int red_left_pad; /* Left redzone padding size */ const char *name; /* Name (only for display!) */ struct list_head list; /* List of slab caches */ - int red_left_pad; /* Left redzone padding size */ #ifdef CONFIG_SYSFS struct kobject kobj; /* For sysfs */ #endif -- 2.11.0
[PATCH V2 3/3] mm/slub: wrap kmem_cache->cpu_partial in config CONFIG_SLUB_CPU_PARTIAL
kmem_cache->cpu_partial is just used when CONFIG_SLUB_CPU_PARTIAL is set, so wrap it with config CONFIG_SLUB_CPU_PARTIAL will save some space on 32bit arch. This patch wrap kmem_cache->cpu_partial in config CONFIG_SLUB_CPU_PARTIAL and wrap its sysfs too. Signed-off-by: Wei Yang --- v2: define slub_cpu_partial() to make code more elegant --- include/linux/slub_def.h | 13 + mm/slub.c| 69 ++-- 2 files changed, 51 insertions(+), 31 deletions(-) diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index f882a34bb9aa..d808e8e6293b 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -86,7 +86,9 @@ struct kmem_cache { int size; /* The size of an object including meta data */ int object_size;/* The size of an object without meta data */ int offset; /* Free pointer offset. */ +#ifdef CONFIG_SLUB_CPU_PARTIAL int cpu_partial;/* Number of per cpu partial objects to keep around */ +#endif struct kmem_cache_order_objects oo; /* Allocation and freeing of slabs */ @@ -130,6 +132,17 @@ struct kmem_cache { struct kmem_cache_node *node[MAX_NUMNODES]; }; +#ifdef CONFIG_SLUB_CPU_PARTIAL +#define slub_cpu_partial(s)((s)->cpu_partial) +#define slub_set_cpu_partial(s, n) \ +({ \ + slub_cpu_partial(s) = (n); \ +}) +#else +#define slub_cpu_partial(s)(0) +#define slub_set_cpu_partial(s, n) +#endif // CONFIG_SLUB_CPU_PARTIAL + #ifdef CONFIG_SYSFS #define SLAB_SUPPORTS_SYSFS void sysfs_slab_release(struct kmem_cache *); diff --git a/mm/slub.c b/mm/slub.c index ae6166533261..795112b65c61 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1829,7 +1829,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n, stat(s, CPU_PARTIAL_NODE); } if (!kmem_cache_has_cpu_partial(s) - || available > s->cpu_partial / 2) + || available > slub_cpu_partial(s) / 2) break; } @@ -3410,6 +3410,39 @@ static void set_min_partial(struct kmem_cache *s, unsigned long min) s->min_partial = min; } +static void set_cpu_partial(struct kmem_cache *s) +{ +#ifdef 
CONFIG_SLUB_CPU_PARTIAL + /* +* cpu_partial determined the maximum number of objects kept in the +* per cpu partial lists of a processor. +* +* Per cpu partial lists mainly contain slabs that just have one +* object freed. If they are used for allocation then they can be +* filled up again with minimal effort. The slab will never hit the +* per node partial lists and therefore no locking will be required. +* +* This setting also determines +* +* A) The number of objects from per cpu partial slabs dumped to the +*per node list when we reach the limit. +* B) The number of objects in cpu partial slabs to extract from the +*per node list when we run out of per cpu objects. We only fetch +*50% to keep some capacity around for frees. +*/ + if (!kmem_cache_has_cpu_partial(s)) + s->cpu_partial = 0; + else if (s->size >= PAGE_SIZE) + s->cpu_partial = 2; + else if (s->size >= 1024) + s->cpu_partial = 6; + else if (s->size >= 256) + s->cpu_partial = 13; + else + s->cpu_partial = 30; +#endif +} + /* * calculate_sizes() determines the order and the distribution of data within * a slab object. @@ -3568,33 +3601,7 @@ static int kmem_cache_open(struct kmem_cache *s, unsigned long flags) */ set_min_partial(s, ilog2(s->size) / 2); - /* -* cpu_partial determined the maximum number of objects kept in the -* per cpu partial lists of a processor. -* -* Per cpu partial lists mainly contain slabs that just have one -* object freed. If they are used for allocation then they can be -* filled up again with minimal effort. The slab will never hit the -* per node partial lists and therefore no locking will be required. -* -* This setting also determines -* -* A) The number of objects from per cpu partial slabs dumped to the -*per node list when we reach the limit. -* B) The number of objects in cpu partial slabs to extract from the -*per node list when we run out of per cpu objects. We only fetch -*50% to keep some capacity around for frees. 
-*/ - if (!kmem_cache_has_cpu_partial(s)) - s->cpu_partial = 0; - else if (s->size >= PAGE_SIZE) - s->cpu_partial = 2; - else if (s->size >= 1024) -
[PATCH V2 2/3] mm/slub: wrap cpu_slab->partial in CONFIG_SLUB_CPU_PARTIAL
cpu_slab's field partial is used when CONFIG_SLUB_CPU_PARTIAL is set, which means we can save a pointer's space on each cpu for every slub item. This patch wrap cpu_slab->partial in CONFIG_SLUB_CPU_PARTIAL and wrap its sysfs too. Signed-off-by: Wei Yang --- v2: define slub_percpu_partial() to make code more elegant --- include/linux/slub_def.h | 19 +++ mm/slub.c| 16 +--- 2 files changed, 28 insertions(+), 7 deletions(-) diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index ec13aab32647..f882a34bb9aa 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -41,12 +41,31 @@ struct kmem_cache_cpu { void **freelist;/* Pointer to next available object */ unsigned long tid; /* Globally unique transaction id */ struct page *page; /* The slab from which we are allocating */ +#ifdef CONFIG_SLUB_CPU_PARTIAL struct page *partial; /* Partially allocated frozen slabs */ +#endif #ifdef CONFIG_SLUB_STATS unsigned stat[NR_SLUB_STAT_ITEMS]; #endif }; +#ifdef CONFIG_SLUB_CPU_PARTIAL +#define slub_percpu_partial(c) ((c)->partial) + +#define slub_set_percpu_partial(c, p) \ +({ \ + slub_percpu_partial(c) = (p)->next; \ +}) + +#define slub_percpu_partial_read_once(c) READ_ONCE(slub_percpu_partial(c)) +#else +#define slub_percpu_partial(c) NULL + +#define slub_set_percpu_partial(c, p) + +#define slub_percpu_partial_read_once(c) NULL +#endif // CONFIG_SLUB_CPU_PARTIAL + /* * Word size structure that can be atomically updated or read and that * contains both the order and the number of objects that a slab of the diff --git a/mm/slub.c b/mm/slub.c index 7f4bc7027ed5..ae6166533261 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -2302,7 +2302,7 @@ static bool has_cpu_slab(int cpu, void *info) struct kmem_cache *s = info; struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu); - return c->page || c->partial; + return c->page || slub_percpu_partial(c); } static void flush_all(struct kmem_cache *s) @@ -2568,9 +2568,9 @@ static void *___slab_alloc(struct kmem_cache *s, 
gfp_t gfpflags, int node, new_slab: - if (c->partial) { - page = c->page = c->partial; - c->partial = page->next; + if (slub_percpu_partial(c)) { + page = c->page = slub_percpu_partial(c); + slub_set_percpu_partial(c, page); stat(s, CPU_PARTIAL_ALLOC); c->freelist = NULL; goto redo; @@ -4760,7 +4760,7 @@ static ssize_t show_slab_objects(struct kmem_cache *s, total += x; nodes[node] += x; - page = READ_ONCE(c->partial); + page = slub_percpu_partial_read_once(c); if (page) { node = page_to_nid(page); if (flags & SO_TOTAL) @@ -4988,7 +4988,8 @@ static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf) int len; for_each_online_cpu(cpu) { - struct page *page = per_cpu_ptr(s->cpu_slab, cpu)->partial; + struct page *page = + slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu)); if (page) { pages += page->pages; @@ -5000,7 +5001,8 @@ static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf) #ifdef CONFIG_SMP for_each_online_cpu(cpu) { - struct page *page = per_cpu_ptr(s->cpu_slab, cpu) ->partial; + struct page *page = + slub_percpu_partial(per_cpu_ptr(s->cpu_slab, cpu)); if (page && len < PAGE_SIZE - 20) len += sprintf(buf + len, " C%d=%d(%d)", cpu, -- 2.11.0
[PATCH] ARM: imx_v6_v7_defconfig: Enable cpufreq governors
Enable more common cpufreq governors in imx defconfig because this is very useful for testing. In particular you can't use cpufreq-set -f $FREQ without explicitly defining CONFIG_CPU_FREQ_GOV_USERSPACE=y. Signed-off-by: Leonard Crestez --- It might make sense for all governors to be enabled by default from drivers/cpufreq/Kconfig and allow defconfigs to be shorter. Right now the descriptions for some of them include a line that says "If in doubt, say Y" but the config options don't actually have a default value defined and they effectively default to N. Cycling via savedefconfig on shawnguo/for-next also generates some reordering for some newly added options CONFIG_TOUCHSCREEN_MAX11801=y and CONFIG_HID_MULTITOUCH=y. Those were not included but it's strange that this happens. Maybe those options were inserted manually, or otherwise there is an annoying bug in kconfig? arch/arm/configs/imx_v6_v7_defconfig | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm/configs/imx_v6_v7_defconfig b/arch/arm/configs/imx_v6_v7_defconfig index bb6fa56..bf1e7e3 100644 --- a/arch/arm/configs/imx_v6_v7_defconfig +++ b/arch/arm/configs/imx_v6_v7_defconfig @@ -55,6 +55,9 @@ CONFIG_CMDLINE="noinitrd console=ttymxc0,115200" CONFIG_KEXEC=y CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y +CONFIG_CPU_FREQ_GOV_POWERSAVE=y +CONFIG_CPU_FREQ_GOV_USERSPACE=y +CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y CONFIG_ARM_IMX6Q_CPUFREQ=y CONFIG_CPU_IDLE=y CONFIG_VFP=y -- 2.7.4
Re: [PATCH 2/2] efi/efi_test: drop useless kfree
On Tue, May 02, 2017 at 03:11:45PM +0800, ivanhu wrote: > > > On 04/29/2017 09:42 AM, Geliang Tang wrote: > > Drop useless kfree when memdup_user() failed, since we have already > > called kfree in memdup_user(). > > > > Signed-off-by: Geliang Tang > > --- > > drivers/firmware/efi/test/efi_test.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/drivers/firmware/efi/test/efi_test.c > > b/drivers/firmware/efi/test/efi_test.c > > index 08129b7..00332ae 100644 > > --- a/drivers/firmware/efi/test/efi_test.c > > +++ b/drivers/firmware/efi/test/efi_test.c > > @@ -261,10 +261,8 @@ static long efi_runtime_set_variable(unsigned long arg) > > } > > data = memdup_user(setvariable.data, setvariable.data_size); > > - if (IS_ERR(data)) { > > - kfree(name); > You mean "name" or "data"? > Sorry, it's my mistake. The original code is correct. Please ignore this patch. -Geliang > Cheers, > Ivan > > > + if (IS_ERR(data)) > > return PTR_ERR(data); > > - } > > status = efi.set_variable(name, &vendor_guid, > > setvariable.attributes,
Re: [PATCH v4 1/5] dt-bindings: gpu: add bindings for the ARM Mali Midgard GPU
On 02/05/17 15:13, Rob Herring wrote: On Tue, May 2, 2017 at 6:23 AM, Guillaume Tucker wrote: Hi Rob, On 28/04/17 20:27, Rob Herring wrote: On Tue, Apr 25, 2017 at 02:16:16PM +0100, Guillaume Tucker wrote: diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt new file mode 100644 index ..547ddeceb498 --- /dev/null +++ b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.txt @@ -0,0 +1,82 @@ +ARM Mali Midgard GPU + + +Required properties: + +- compatible : + * Must be one of the following: ++ "arm,mali-t60x" ++ "arm,mali-t62x" Don't use wildcards. Sure, old habits die hard... I'll fix it in patch v5. ++ "arm,mali-t720" ++ "arm,mali-t760" ++ "arm,mali-t820" ++ "arm,mali-t830" ++ "arm,mali-t860" ++ "arm,mali-t880" + * And, optionally, one of the vendor specific compatible: IMO, these should not be optional. Well, vendor compatible strings are clearly optional for the Utgard GPU series for which the bindings docs were recently merged. It seems that whether these should be optional or not, the documentation should be consistent between at least all similar types of devices like Midgard and Utgard GPUs. They have different architectures but from a device tree point of view, they both have the same kind of SoC-specific integration (clocks, irqs, regulators...). Clocks should not vary by SoC. There is often variation because clocks get driven by same source or are not s/w controlled, but really there should not be that variation. I noticed Utgard has 2 clocks. So is Midgard really just 1 clock? The DT should have all the clocks listed in the TRMs. I meant to say that the clock sources are different in each SoC, but yes the same clock input is always needed by the GPU. The TRM is confidential but to the best of my knowledge and based on existing device trees and the out-of-tree kernel driver, the Midgard GPU has only one clock input. 
So was this was overlooked in the Utgard case and should it ideally be fixed there as well as non-optional? Or, is it OK to keep these optional on a second thought? Probably should be required in the Utgard case as well. OK, so I'll make the vendor compatible strings required (for Midgard) in patch v5. Guillaume
Re: [PATCH v2] IB/i40iw: use setup_timer
On Sat, Apr 29, 2017 at 09:45:27AM -0400, Doug Ledford wrote: > On 4/28/2017 9:37 PM, Geliang Tang wrote: > > Use setup_timer() instead of init_timer() to simplify the code. > > > > Signed-off-by: Geliang Tang > > --- > > Changes in v2: > > - use setup_timer() in i40iw_terminate_start_timer(). > > I already applied your previous patch and did the fixup to add > i40iw_terminate_start_timer() into the patch myself. Thanks. > > > -- > Doug Ledford > GPG Key ID: B826A3330E572FDD > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD > Thank you, Doug. -Geliang
Re: [PATCH] vmscan: scan pages until it founds eligible pages
Hi Michal, On Tue, May 02, 2017 at 09:54:32AM +0200, Michal Hocko wrote: > On Tue 02-05-17 14:14:52, Minchan Kim wrote: > > Oops, forgot to add lkml and linux-mm. > > Sorry for that. > > Send it again. > > > > >From 8ddf1c8aa15baf085bc6e8c62ce705459d57ea4c Mon Sep 17 00:00:00 2001 > > From: Minchan Kim > > Date: Tue, 2 May 2017 12:34:05 +0900 > > Subject: [PATCH] vmscan: scan pages until it founds eligible pages > > > > On Tue, May 02, 2017 at 01:40:38PM +0900, Minchan Kim wrote: > > There are premature OOM happening. Although there are a ton of free > > swap and anonymous LRU list of elgible zones, OOM happened. > > > > With investigation, skipping page of isolate_lru_pages makes reclaim > > void because it returns zero nr_taken easily so LRU shrinking is > > effectively nothing and just increases priority aggressively. > > Finally, OOM happens. > > I am not really sure I understand the problem you are facing. Could you > be more specific please? What is your configuration etc... Sure, KVM guest on x86_64, It has 2G memory and 1G swap and configured movablecore=1G to simulate highmem zone. Workload is a process consumes 2.2G memory and then random touch the address space so it makes lots of swap in/out. > > > balloon invoked oom-killer: > > gfp_mask=0x17080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), > > nodemask=(null), order=0, oom_score_adj=0 > [...] > > Node 0 active_anon:1698864kB inactive_anon:261256kB active_file:208kB > > inactive_file:184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB > > mapped:532kB dirty:108kB writeback:0kB shmem:172kB writeback_tmp:0kB > > unstable:0kB all_unreclaimable? 
no > > DMA free:7316kB min:32kB low:44kB high:56kB active_anon:8064kB > > inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB > > writepending:0kB present:15992kB managed:15908kB mlocked:0kB > > slab_reclaimable:464kB slab_unreclaimable:40kB kernel_stack:0kB > > pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB > > lowmem_reserve[]: 0 992 992 1952 > > DMA32 free:9088kB min:2048kB low:3064kB high:4080kB active_anon:952176kB > > inactive_anon:0kB active_file:36kB inactive_file:0kB unevictable:0kB > > writepending:88kB present:1032192kB managed:1019388kB mlocked:0kB > > slab_reclaimable:13532kB slab_unreclaimable:16460kB kernel_stack:3552kB > > pagetables:6672kB bounce:0kB free_pcp:56kB local_pcp:24kB free_cma:0kB > > lowmem_reserve[]: 0 0 0 959 > > Hmm DMA32 has sufficient free memory to allow this order-0 request. > Inactive anon lru is basically empty. Why do not we rotate a really > large active anon list? Isn't this the primary problem? It's a side effect of the page-skipping logic in isolate_lru_pages that I mentioned above in the changelog. The problem is that there is a lot of anonymous memory in the movable zone (i.e., highmem) and a non-trivial amount of memory in the DMA32 zone. Under heavy memory pressure, requesting a page with GFP_KERNEL triggers reclaim. The VM knows the inactive list is low, so it tries to deactivate pages. To do that, it first tries to isolate pages from the active list, but there are lots of anonymous pages from the movable zone, so the skipping logic in isolate_lru_pages kicks in. As a result, isolate_lru_pages cannot isolate any eligible pages, so the reclaim attempt is effectively void, and the system keeps hitting OOM. I'm on a long vacation from today, so please understand if my response is slow.
Re: [PATCH 06/14] sched/topology,debug: Verify the first group matches the child domain
On Mon, May 01, 2017 at 05:13:26PM -0400, Rik van Riel wrote: > On Fri, 2017-04-28 at 15:20 +0200, Peter Zijlstra wrote: > > Signed-off-by: Peter Zijlstra (Intel) > > This could use a changelog. Yes indeed... I put off writing one because $hard, and clearly I forgot entirely :-/ How's this? --- Subject: sched/topology,debug: Verify the first group matches the child domain From: Peter Zijlstra Date: Fri Apr 14 18:20:48 CEST 2017 We want sched_groups to be sibling child domains (or individual CPUs when there are no child domains). Furthermore, since the first group of a domain should include the CPU of that domain, the first group of each domain should match the child domain. Verify this is indeed so. Signed-off-by: Peter Zijlstra (Intel) --- kernel/sched/topology.c | 6 ++++++ 1 file changed, 6 insertions(+) --- a/kernel/sched/topology.c +++ b/kernel/sched/topology.c @@ -93,6 +93,12 @@ static int sched_domain_debug_one(struct group->sgc->capacity); } + if (group == sd->groups && sd->child && + !cpumask_equal(sched_domain_span(sd->child), + sched_group_cpus(group))) { + printk(KERN_ERR "ERROR: domain->groups does not match domain->child\n"); + } + group = group->next; if (group != sd->groups)
Re: [PATCH v2 1/9] dt-bindings: display: sun4i: Add component endpoint ID numbering scheme
Hi, On Fri, Apr 28, 2017 at 08:48:41AM -0500, Rob Herring wrote: > On Fri, Apr 21, 2017 at 04:38:49PM +0800, Chen-Yu Tsai wrote: > > The Allwinner display pipeline contains many hardware components, some > > of which can consume data from one of multiple upstream components. > > The numbering scheme of these components must be encoded into the device > > tree so the driver can figure out which component out of two or more of > > the same type it is supposed to use or program. > > > > This patch adds the constraint that local endpoint IDs must be the index > > or number of the remote endpoint's hardware block, for all components > > in the display pipeline up to the TCONs. > > > > Signed-off-by: Chen-Yu Tsai > > --- > > Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt | 10 > > ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt > > b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt > > index 57a8d0610062..7acdbf14ae1c 100644 > > --- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt > > +++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt > > @@ -4,6 +4,16 @@ Allwinner A10 Display Pipeline > > The Allwinner A10 Display pipeline is composed of several components > > that are going to be documented below: > > > > +For the input port of all components up to the TCON in the display > > +pipeline, if there are multiple components, the local endpoint IDs > > +must correspond to the index of the upstream block. For example, if > > +the remote endpoint is Frontend 1, then the local endpoint ID must > > +be 1. > > + > > +Conversely, for the output ports of the same group, the remote endpoint > > +ID must be the index of the local hardware block. If the local backend > > +is backend 1, then the remote endpoint ID must be 1. > > It would be clearer if you just explicitly listed IDs and their > connections. 
From how this is worded, it would not work if you had > connections like this: > > DevA 0 > DevA 1 > DevB 0 > DevB 1 > > These would need to be endpoints 0-3 in TCON, and that doesn't reflect > the remote devices' index. Chen-Yu, can you send a patch to rephrase the doc that way? Thanks, Maxime -- Maxime Ripard, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
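As an illustration of the numbering rule under discussion, a hypothetical device-tree fragment (all node and label names below are made up, not taken from an actual Allwinner dtsi):

```
/* Hypothetical fragment: TCON0's input port, fed by backend 1.  The
 * TCON's local endpoint uses ID 1 because the remote hardware block is
 * backend *1*; on the backend's output port, the remote endpoint ID
 * would likewise be the index of the local backend. */
tcon0: lcd-controller {
	ports {
		tcon0_in: port@0 {
			reg = <0>;
			tcon0_in_be1: endpoint@1 {
				reg = <1>;	/* remote block is backend 1 */
				remote-endpoint = <&be1_out_tcon0>;
			};
		};
	};
};
```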
Re: [PATCH 00/14] sched/topology fixes
On 05/02/2017 11:43 AM, Peter Zijlstra wrote: > On Fri, Apr 28, 2017 at 03:53:39PM +0200, Peter Zijlstra wrote: >> Also, the following occurred to me: >> >> sg_span & sg_mask == sg_mask >> >> Therefore, we don't need to do the whole "sg_span &" business. >> >> Hmm? >> @@ -856,7 +857,7 @@ build_sched_groups(struct sched_domain * >> continue; >> >> group = get_group(i, sdd, &sg); >> -cpumask_setall(sched_group_mask(sg)); >> +cpumask_copy(sched_group_mask(sg), sched_group_cpus(sg)); >> >> for_each_cpu(j, span) { >> if (get_group(j, sdd, NULL) != group) > OK, so this explodes mightily. > > That code also hurt my brain bad, so I had to fix that a little. > > The below seems to boot. > > --- > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -7996,7 +7996,7 @@ static int active_load_balance_cpu_stop( > static int should_we_balance(struct lb_env *env) > { > struct sched_group *sg = env->sd->groups; > - struct cpumask *sg_cpus, *sg_mask; > + struct cpumask *sg_mask; > int cpu, balance_cpu = -1; > > /* > @@ -8006,11 +8006,10 @@ static int should_we_balance(struct lb_e > if (env->idle == CPU_NEWLY_IDLE) > return 1; > > - sg_cpus = sched_group_cpus(sg); > sg_mask = sched_group_mask(sg); > /* Try to find first idle cpu */ > - for_each_cpu_and(cpu, sg_cpus, env->cpus) { > - if (!cpumask_test_cpu(cpu, sg_mask) || !idle_cpu(cpu)) > + for_each_cpu_and(cpu, sg_mask, env->cpus) { > + if (!idle_cpu(cpu)) > continue; > > balance_cpu = cpu; > --- a/kernel/sched/topology.c > +++ b/kernel/sched/topology.c > @@ -85,7 +85,8 @@ static int sched_domain_debug_one(struct > group->sgc->id, > cpumask_pr_args(sched_group_cpus(group))); > > - if ((sd->flags & SD_OVERLAP) && > !cpumask_full(sched_group_mask(group))) { > + if ((sd->flags & SD_OVERLAP) && > + !cpumask_equal(sched_group_mask(group), > sched_group_cpus(group))) { > printk(KERN_CONT " mask=%*pbl", > cpumask_pr_args(sched_group_mask(group))); > } > @@ -505,7 +506,7 @@ enum s_alloc { > */ > int group_balance_cpu(struct sched_group
*sg) > { > - return cpumask_first_and(sched_group_cpus(sg), sched_group_mask(sg)); > + return cpumask_first(sched_group_mask(sg)); > } > > > @@ -833,23 +834,34 @@ build_overlap_sched_groups(struct sched_ > * [*] in other words, the first group of each domain is its child domain. > */ > > -static int get_group(int cpu, struct sd_data *sdd, struct sched_group **sg) > +static struct sched_group *get_group(int cpu, struct sd_data *sdd) > { > struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu); > struct sched_domain *child = sd->child; > + struct sched_group *sg; > > if (child) > cpu = cpumask_first(sched_domain_span(child)); > > - if (sg) { > - *sg = *per_cpu_ptr(sdd->sg, cpu); > - (*sg)->sgc = *per_cpu_ptr(sdd->sgc, cpu); > + sg = *per_cpu_ptr(sdd->sg, cpu); > + sg->sgc = *per_cpu_ptr(sdd->sgc, cpu); > > - /* For claim_allocations: */ > - atomic_set(&(*sg)->sgc->ref, 1); > + /* For claim_allocations: */ > + atomic_inc(&sg->ref); > + atomic_inc(&sg->sgc->ref); > + > + if (child) { > + cpumask_copy(sched_group_cpus(sg), sched_domain_span(child)); > + cpumask_copy(sched_group_mask(sg), sched_group_cpus(sg)); > + } else { > + cpumask_set_cpu(cpu, sched_group_cpus(sg)); > + cpumask_set_cpu(cpu, sched_group_cpus(sg)); Typo here. The mask is not being set in the else clause.
> } > > - return cpu; > + sg->sgc->capacity = SCHED_CAPACITY_SCALE * > cpumask_weight(sched_group_cpus(sg)); > + sg->sgc->min_capacity = SCHED_CAPACITY_SCALE; > + > + return sg; > } > > /* > @@ -868,34 +880,20 @@ build_sched_groups(struct sched_domain * > struct cpumask *covered; > int i; > > - get_group(cpu, sdd, &sd->groups); > - atomic_inc(&sd->groups->ref); > - > - if (cpu != cpumask_first(span)) > - return 0; > - > lockdep_assert_held(&sched_domains_mutex); > covered = sched_domains_tmpmask; > > cpumask_clear(covered); > > - for_each_cpu(i, span) { > + for_each_cpu_wrap(i, span, cpu) { > struct sched_group *sg; > - int group, j; > > if (cpumask_test_cpu(i, covered)) > continue; > > - group = get_group(i, sdd, &sg); > - cpumask_setall(sched_group_mask(sg)); > + sg = get_group(i, sdd); > > - for_each_cpu(j, span) { > - if (get_group(j, sdd, NULL) != group) > -
Re: [PATCH 1/4] PCI: add resizeable BAR infrastructure v4
On 26.04.2017 at 18:45, Andy Shevchenko wrote: [SNIP] -#define PCI_REBAR_CTRL_NBAR_MASK (7 << 5) /* mask for # bars */ -#define PCI_REBAR_CTRL_NBAR_SHIFT 5 /* shift for # bars */ +#define PCI_REBAR_CTRL_NBAR_MASK (7 << 5) /* mask for # BARs */ +#define PCI_REBAR_CTRL_NBAR_SHIFT 5 /* shift for # BARs */ I understand why, but I dunno it worth to do. Bjorn suggested that. Either way is fine with me, but he needs to stick his Signed-off-by on it when pushing it upstream. So his opinion usually wins. All other comments will be integrated in the next version of the patch. Thanks for the review, Christian.
Re: [PATCH v2 1/2] mm: Uncharge poisoned pages
On 28/04/2017 15:48, Michal Hocko wrote: > On Fri 28-04-17 11:17:34, Laurent Dufour wrote: >> On 28/04/2017 09:31, Michal Hocko wrote: >>> [CC Johannes and Vladimir - the patch is >>> http://lkml.kernel.org/r/1493130472-22843-2-git-send-email-lduf...@linux.vnet.ibm.com] >>> >>> On Fri 28-04-17 08:07:55, Michal Hocko wrote: On Thu 27-04-17 13:51:23, Andi Kleen wrote: > Michal Hocko writes: > >> On Tue 25-04-17 16:27:51, Laurent Dufour wrote: >>> When page are poisoned, they should be uncharged from the root memory >>> cgroup. >>> >>> This is required to avoid a BUG raised when the page is onlined back: >>> BUG: Bad page state in process mem-on-off-test pfn:7ae3b >>> page:f1eb8ec0 count:0 mapcount:0 mapping: (null) >>> index:0x1 >>> flags: 0x380020(hwpoison) >> >> My knowledge of memory poisoning is very rudimentary but aren't those >> pages supposed to leak and never come back? In other words isn't the >> hoplug code broken because it should leave them alone? > > Yes that would be the right interpretation. If it was really offlined > due to a hardware error the memory will be poisoned and any access > could cause a machine check. OK, thanks for the clarification. Then I am not sure the patch is correct. Why do we need to uncharge that page at all? >>> >>> Now, I have realized that we actually want to uncharge that page because >>> it will pin the memcg and we do not want to have that memcg and its >>> whole hierarchy pinned as well. This used to work before the charge >>> rework 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API") I guess >>> because we used to uncharge on page cache removal. >>> >>> I do not think the patch is correct, though. memcg_kmem_enabled() will >>> check whether kmem accounting is enabled and we are talking about page >>> cache pages here. You should be using mem_cgroup_uncharge instead. >> >> Thanks for the review Michal. >> >> I was not comfortable either with this patch. 
>> >> I did some tests calling mem_cgroup_uncharge() when isolate_lru_page() >> succeeds only, so not calling it if isolate_lru_page() failed. > > Wait a moment. This cannot possibly work. isolate_lru_page assumes page > count > 0 and increments the counter so the resulting page count is > 1 > I have only now realized that we have VM_BUG_ON_PAGE(page_count(page), page) > in uncharge_list(). My mistake, my kernel was not built with CONFIG_DEBUG_VM set. You're right, this cannot work this way. > This is getting quite hairy. What is the expected page count of the > hwpoison page? I guess we would need to update the VM_BUG_ON in the > memcg uncharge code to ignore the page count of hwpoison pages if it can > be arbitrary. Based on the experiment I did, the page count == 2 when isolate_lru_page() succeeds, even in the case of a poisoned page. In my case I think this is because the page is still used by the process which is calling madvise(). I'm wondering if I'm looking at the right place. Maybe the poisoned page should remain attached to the memory cgroup until no one is using it. In that case this means that something should be done when the page is off-lined... I have to dig further here. > > Before we go any further, is there any documentation about the expected > behavior and the state of the hwpoison pages? I have a very bad feeling > that the current behavior is quite arbitrary and "testing driven" > holes plugging will make it only more messy. So let's start with the > clear description of what should happen with the hwpoison pages. I didn't find any documentation about that. The root cause is that a bug message is displayed when a poisoned page is off-lined; maybe it is in that path that something is missing. Cheers, Laurent.
[PATCH] input: edt-ft5x06: increase allowed data range for threshold parameter
The datasheet and application note do not mention an allowed range for the M09_REGISTER_THRESHOLD parameter. One of our customers needs to set lower values than 20 and they seem to work just fine on EDT EP0xx0M09 with T5x06 touch. So, lacking a known lower limit, we increase the range for thresholds, and set the lower limit to 0. The documentation is updated accordingly. Signed-off-by: Schoefegger Stefan Signed-off-by: Manfred Schlaegl Signed-off-by: Martin Kepplinger --- Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt | 2 +- Documentation/input/devices/edt-ft5x06.rst | 2 +- drivers/input/touchscreen/edt-ft5x06.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt index 6db2210..025cf8c 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt +++ b/Documentation/devicetree/bindings/input/touchscreen/edt-ft5x06.txt @@ -36,7 +36,7 @@ Optional properties: control gpios - threshold: allows setting the "click"-threshold in the range -from 20 to 80. +from 0 to 80. - gain:allows setting the sensitivity in the range from 0 to 31. Note that lower values indicate higher
Note that diff --git a/drivers/input/touchscreen/edt-ft5x06.c b/drivers/input/touchscreen/edt-ft5x06.c index 8cf8d8d..f872817 100644 --- a/drivers/input/touchscreen/edt-ft5x06.c +++ b/drivers/input/touchscreen/edt-ft5x06.c @@ -471,7 +471,7 @@ static EDT_ATTR(gain, S_IWUSR | S_IRUGO, WORK_REGISTER_GAIN, static EDT_ATTR(offset, S_IWUSR | S_IRUGO, WORK_REGISTER_OFFSET, M09_REGISTER_OFFSET, 0, 31); static EDT_ATTR(threshold, S_IWUSR | S_IRUGO, WORK_REGISTER_THRESHOLD, - M09_REGISTER_THRESHOLD, 20, 80); + M09_REGISTER_THRESHOLD, 0, 80); static EDT_ATTR(report_rate, S_IWUSR | S_IRUGO, WORK_REGISTER_REPORT_RATE, NO_REGISTER, 3, 14); -- 2.1.4
Re: [PATCH 1/3] drm: fourcc byteorder: drop DRM_FORMAT_BIG_ENDIAN
Hi, > I also think that this patch requires more comments than the > commit message has at the moment. > > Removing the definition also removes the possibility to describe a lot > of pixel formats, so that should definitely be mentioned. I think it > would also be good to have some kind of justified claim that no > hardware actually needs the pixel formats this is removing (e.g. RGB565 > BE). That and RGB2101010 BE are the only candidates I can think of. Dealing with those in non-native byte order is a PITA though. Display hardware has been little endian (PCI byte order) for quite a while already. > Maybe this was already in the long discussions, but I feel it > should be summarized in the commit message. Radeon and nvidia (nv40) cards were mentioned. I'll try to summarize (feel free to correct me if I'm wrong). nvidia has support for 8 bit-per-color formats only on bigendian hosts. Not sure whether this is a driver or hardware limitation. radeon behaves differently on pre-R600 and R600+ hardware. pre-R600 can byteswap on cpu access, so the cpu view is bigendian whereas things are actually stored in little-endian byte order. Hardware supports both 16bit and 32bit swaps. Used for fbdev emulation (probably 32bit swaps, but not fully sure). The xorg radeon driver doesn't use the byteswapping feature, because it is a PITA when bo's are moved between vram and system memory. R600+ supports bigendian framebuffer formats, so no byteswapping on access is needed. Not sure whether that includes 16bpp formats or whether this is limited to the 8 bit-per-color formats (similar to nvidia). Discussion focused on the pre-R600 hardware and how this on-cpu-access byteswapping is more a problem than it helps ... cheers, Gerd
Re: [PATCH 2/7] perf config: Check list empty before showing configs
Em Wed, Apr 26, 2017 at 09:21:03PM +0900, Taeung Song escreveu: > If existent config files contains nothing, > the sections list in config_set can be empty. > > So check not only NULL pointer of config_set but > also the list in config_set. > +++ b/tools/perf/builtin-config.c > @@ -75,7 +75,7 @@ static int show_spec_config(struct perf_config_set *set, > const char *var) > struct perf_config_section *section; > struct perf_config_item *item; > > - if (set == NULL) > + if (set == NULL || list_empty(&set->sections)) > return -1; But should we consider an error to have an empty config file? I don't think so :-\ - Arnaldo > > perf_config_items__for_each_entry(&set->sections, section) { > @@ -105,7 +105,7 @@ static int show_config(struct perf_config_set *set) > struct perf_config_section *section; > struct perf_config_item *item; > > - if (set == NULL) > + if (set == NULL || list_empty(&set->sections)) > return -1; > > perf_config_set__for_each_entry(set, section, item) { > diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c > index 8d724f0..492c862 100644 > --- a/tools/perf/util/config.c > +++ b/tools/perf/util/config.c > @@ -707,7 +707,7 @@ int perf_config(config_fn_t fn, void *data) > struct perf_config_section *section; > struct perf_config_item *item; > > - if (config_set == NULL) > + if (config_set == NULL || list_empty(&config_set->sections)) > return -1; > > perf_config_set__for_each_entry(config_set, section, item) { > -- > 2.7.4
Re: [PATCH] vmscan: scan pages until it founds eligible pages
On Tue 02-05-17 23:51:50, Minchan Kim wrote: > Hi Michal, > > On Tue, May 02, 2017 at 09:54:32AM +0200, Michal Hocko wrote: > > On Tue 02-05-17 14:14:52, Minchan Kim wrote: > > > Oops, forgot to add lkml and linux-mm. > > > Sorry for that. > > > Send it again. > > > > > > >From 8ddf1c8aa15baf085bc6e8c62ce705459d57ea4c Mon Sep 17 00:00:00 2001 > > > From: Minchan Kim > > > Date: Tue, 2 May 2017 12:34:05 +0900 > > > Subject: [PATCH] vmscan: scan pages until it founds eligible pages > > > > > > On Tue, May 02, 2017 at 01:40:38PM +0900, Minchan Kim wrote: > > > There are premature OOM happening. Although there are a ton of free > > > swap and anonymous LRU list of elgible zones, OOM happened. > > > > > > With investigation, skipping page of isolate_lru_pages makes reclaim > > > void because it returns zero nr_taken easily so LRU shrinking is > > > effectively nothing and just increases priority aggressively. > > > Finally, OOM happens. > > > > I am not really sure I understand the problem you are facing. Could you > > be more specific please? What is your configuration etc... > > Sure, KVM guest on x86_64, It has 2G memory and 1G swap and configured > movablecore=1G to simulate highmem zone. > Workload is a process consumes 2.2G memory and then random touch the > address space so it makes lots of swap in/out. > > > > > > balloon invoked oom-killer: > > > gfp_mask=0x17080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), > > > nodemask=(null), order=0, oom_score_adj=0 > > [...] > > > Node 0 active_anon:1698864kB inactive_anon:261256kB active_file:208kB > > > inactive_file:184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB > > > mapped:532kB dirty:108kB writeback:0kB shmem:172kB writeback_tmp:0kB > > > unstable:0kB all_unreclaimable? 
no > > > DMA free:7316kB min:32kB low:44kB high:56kB active_anon:8064kB > > > inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB > > > writepending:0kB present:15992kB managed:15908kB mlocked:0kB > > > slab_reclaimable:464kB slab_unreclaimable:40kB kernel_stack:0kB > > > pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB > > > lowmem_reserve[]: 0 992 992 1952 > > > DMA32 free:9088kB min:2048kB low:3064kB high:4080kB active_anon:952176kB > > > inactive_anon:0kB active_file:36kB inactive_file:0kB unevictable:0kB > > > writepending:88kB present:1032192kB managed:1019388kB mlocked:0kB > > > slab_reclaimable:13532kB slab_unreclaimable:16460kB kernel_stack:3552kB > > > pagetables:6672kB bounce:0kB free_pcp:56kB local_pcp:24kB free_cma:0kB > > > lowmem_reserve[]: 0 0 0 959 > > > > Hmm DMA32 has sufficient free memory to allow this order-0 request. > > Inactive anon lru is basically empty. Why do not we rotate a really > > large active anon list? Isn't this the primary problem? > > It's a side effect by skipping page logic in isolate_lru_pages > I mentioned above in changelog. > > The problem is a lot of anonymous memory in movable zone(ie, highmem) > and non-small memory in DMA32 zone. Such a configuration is questionable on its own. But let's keep this part alone. > In heavy memory pressure, > requesting a page in GFP_KERNEL triggers reclaim. VM knows inactive list > is low so it tries to deactivate pages. For it, first of all, it tries > to isolate pages from active list but there are lots of anonymous pages > from movable zone so skipping logic in isolate_lru_pages works. With > the result, isolate_lru_pages cannot isolate any eligible pages so > reclaim trial is effectively void. It continues to meet OOM. But skipped pages should be rotated and we should eventually hit pages from the right zone(s). Moreover we should scan the full LRU at priority 0, so why exactly do we hit the OOM killer? Anyway [1] has changed this behavior.
Are you seeing the issue with this patch dropped? [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/revert-mm-vmscan-account-for-skipped-pages-as-a-partial-scan.patch -- Michal Hocko SUSE Labs
Re: [PATCH] Allow to use DMA_CTRL_REUSE flag for all channel types
Hi Vinod, On Mon, 2017-05-01 at 11:21 +0530, Vinod Koul wrote: > On Fri, Apr 28, 2017 at 04:37:46PM +0300, Eugeniy Paltsev wrote: > > In the current implementation dma_get_slave_caps is used to check > > state of descriptor_reuse option. But dma_get_slave_caps includes > > check if the channel supports slave transactions. > > So DMA_CTRL_REUSE flag can be set (even for MEM-TO-MEM tranfers) > > only if channel supports slave transactions. > > > > Now we can use DMA_CTRL_REUSE flag for all channel types. > > Also it allows to test reusing mechanism with simply mem-to-mem dma > > test. > > We do not want to allow that actually. Slave is always treated as a > special > case, so resue was allowed. > > With memcpy the assumptions are different and clients can do reuse. Could you please clarify why we don't want to allow DMA_CTRL_REUSE for mem-to-mem transfers? Reusing mem-to-mem (MEMCPY and DMA_SG) descriptors will work fine on virt-dma based drivers. Anyway, the current implementation's behaviour is quite strange: if a channel supports *slave* transfers, DMA_CTRL_REUSE can be set for both slave and *mem-to-mem* transfers. And, of course, we can pass the DMA_CTRL_REUSE flag to device_prep_dma_sg or device_prep_dma_memcpy directly, without any checks.
> > > > Signed-off-by: Eugeniy Paltsev > > --- > > include/linux/dmaengine.h | 6 +- > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h > > index 5336808..92cf8b0 100644 > > --- a/include/linux/dmaengine.h > > +++ b/include/linux/dmaengine.h > > @@ -1376,11 +1376,7 @@ static inline int dma_get_slave_caps(struct > > dma_chan *chan, > > > > static inline int dmaengine_desc_set_reuse(struct > > dma_async_tx_descriptor *tx) > > { > > - struct dma_slave_caps caps; > > - > > - dma_get_slave_caps(tx->chan, &caps); > > - > > - if (caps.descriptor_reuse) { > > + if (tx->chan->device->descriptor_reuse) { > > tx->flags |= DMA_CTRL_REUSE; > > return 0; > > } else { > > -- > > 2.9.3 > > > > -- Eugeniy Paltsev
Re: [PATCH] Remove ARM errata Workarounds 458693 and 460075
On Tue, May 02, 2017 at 01:27:54PM +0100, Robin Murphy wrote: > On 18/04/17 16:57, Catalin Marinas wrote: > > On Sun, Apr 16, 2017 at 09:04:46AM +0100, Russell King - ARM Linux wrote: > >> On Sat, Apr 15, 2017 at 07:06:06PM -0500, Nisal Menuka wrote: > >>> According to ARM, these errata exist only in a version of Cortex-A8 > >>> (r2p0) which was never built. Therefore, I believe there are no platforms > >>> where this workaround should be enabled. > >>> link :http://infocenter.arm.com/help/index.jsp?topic= > >>> /com.arm.doc.faqs/ka15634.html > >> > >> These were submitted by ARM Ltd back in 2009 - if the silicon was never > >> built, there would've been no reason to submit them. Maybe Catalin can > >> shed some light on this, being the commit author who introduced these? > > > > We normally try not to submit errata workarounds for revisions that are > > not going to be built/deployed. It's possible that at the time there > > were plans for r2p0 to be licensed and built (not just FPGA) but I don't > > really remember the details. The A8 errata document indeed states that > > r1p0 and r2p0 are obsolete but this can mean many things (like not > > available to license). > > > > I'll try to see if any of the A8 past product managers know anything > > about this. In the meantime, I would leave them in (no run-time > > overhead). > > FWIW, I just fired up a RealView PB-A8 board to check, and that reports > r1p1. True, it's not strictly a real silicon implementation (I think > it's one of the structured ASIC test chips), but since it was, as far as > I'm aware, a commercially-available development system, it's not > impossible that someone may still own and use one of these beasts. The above errata were specific to r1p0 (not r1p1) and r2p0. Since ARM Ltd claims there are no products built around these revisions, I'm fine with removing the workarounds from the kernel (internal testchips/FPGA don't count as products but I doubt they are still relevant 8 years later). 
-- Catalin
Re: [PATCH V2] pwm: tegra: Set maximum pwm clock source per SoC tapeout
On 02/05/17 15:05, Laxman Dewangan wrote: > The PWM hardware IP is taped-out with different maximum frequency > on different SoCs. > > From HW team: > Before Tegra186, it is 38.4MHz. > In Tegra186, it is 102MHz. > > Add support to limit the clock source frequency to the maximum IP > supported frequency. Provide these values via SoC chipdata. > > Signed-off-by: Laxman Dewangan > > --- > Changes from V1: > - Set the 48MHz maximum frequency for Tegra210 and earlier. I think that your changelog needs to be updated, because it still says 38.4MHz and not 48MHz. Cheers Jon -- nvpublic
[GIT PULL] Hardware module parameter annotation for secure boot
Hi Linus,

Can you pull this branch, please? It consists of a set of patches that annotate module parameters that configure hardware resources, including ioports, iomem addresses, irq lines and dma channels. This allows a future patch to prohibit the use of such module parameters to prevent that hardware from being abused to gain access to the running kernel image as part of locking the kernel down under UEFI secure boot conditions.

Annotations are made by changing:

  module_param(n, t, p)
  module_param_named(n, v, t, p)
  module_param_array(n, t, m, p)

to:

  module_param_hw(n, t, hwtype, p)
  module_param_hw_named(n, v, t, hwtype, p)
  module_param_hw_array(n, t, hwtype, m, p)

where the module parameter refers to a hardware setting. hwtype specifies the type of the resource being configured. This can be one of:

  ioport           Module parameter configures an I/O port
  iomem            Module parameter configures an I/O mem address
  ioport_or_iomem  Module parameter could be either (runtime set)
  irq              Module parameter configures an interrupt line
  dma              Module parameter configures a DMA channel
  dma_addr         Module parameter configures a DMA buffer address
  other            Module parameter configures some other value

Note that the hwtype is compile checked, but not currently stored (the lockdown code probably won't require it). It is, however, there for future use. A bonus is that the hwtype can also be used for grepping.

The intention is for the kernel to ignore or reject attempts to set annotated module parameters if lockdown is enabled. This applies to options passed on the boot command line, passed to insmod/modprobe or direct twiddling in /sys/module/ parameter files.

The module initialisation then needs to handle the parameter not being set, by (1) giving an error, (2) probing for a value or (3) using a reasonable default.

What I can't do is just reject a module out of hand because it may take a hardware setting in the module parameters.
Some important modules, some ipmi stuff for instance, both probe for hardware and allow hardware to be manually specified; if the driver aborts with any error, you don't get any ipmi hardware. Further, trying to do this entirely in the module initialisation code doesn't protect against sysfs twiddling.

[!] Note that in and of itself, this series of patches should have no effect on the size of the kernel or code execution - that is left to a patch in the next series to effect. It does mark annotated kernel parameters with a KERNEL_PARAM_FL_HWPARAM flag in an already existing field.

[!] Further note that this series needs to be pulled after James Morris's security/next. Pulling it at the end of the merge window should be fine and it shouldn't be a problem if annotations are discarded due to merge collisions.

David

---
The following changes since commit ddb99e118e37f324a4be65a411bb60ae62795cf9:

  security, keys: convert key_user.usage from atomic_t to refcount_t (2017-04-03 10:49:06 +1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/hwparam-20170420

for you to fetch changes up to 6192c41fc608b0a58d5540b015aa1672c266f3c5:

  Annotate hardware config module parameters in sound/pci/ (2017-04-20 12:02:32 +0100)

Annotation of module parameters that specify device settings

David Howells (38):
      Annotate module params that specify hardware parameters (eg.
ioport)
      Annotate hardware config module parameters in arch/x86/mm/
      Annotate hardware config module parameters in drivers/char/ipmi/
      Annotate hardware config module parameters in drivers/char/mwave/
      Annotate hardware config module parameters in drivers/char/
      Annotate hardware config module parameters in drivers/clocksource/
      Annotate hardware config module parameters in drivers/cpufreq/
      Annotate hardware config module parameters in drivers/gpio/
      Annotate hardware config module parameters in drivers/i2c/
      Annotate hardware config module parameters in drivers/iio/
      Annotate hardware config module parameters in drivers/input/
      Annotate hardware config module parameters in drivers/isdn/
      Annotate hardware config module parameters in drivers/media/
      Annotate hardware config module parameters in drivers/misc/
      Annotate hardware config module parameters in drivers/mmc/host/
      Annotate hardware config module parameters in drivers/net/appletalk/
      Annotate hardware config module parameters in drivers/net/arcnet/
      Annotate hardware config module parameters in drivers/net/can/
      Annotate hardware config module parameters in drivers/net/ethernet/
      Annotate hardware config module parameters in
Re: linux-next: Tree for May 2 (xen: events_base.c)
On 05/01/17 23:47, Stephen Rothwell wrote: > Hi all, > > Please do not add any v4.13 destined material in your linux-next > included branches until after v4.12-rc1 has been released. > > Changes since 20170501: > on x86_64: drivers/built-in.o: In function `set_affinity_irq': events_base.c:(.text+0x1632c4): undefined reference to `xen_have_vector_callback' arch/x86/pci/built-in.o: In function `pci_xen_hvm_init': (.init.text+0x59f): undefined reference to `xen_have_vector_callback' -- ~Randy
[PATCH tip/core/rcu 0/2] Debloat rcu_segcblist header file
Hello! This series debloats the include/linux/rcu_segcblist.h include file: 1. Leave data structures in include/linux/rcu_segcblist.h, but move functions to kernel/rcu/rcu_segcblist.h, courtesy of Ingo Molnar. 2. Move non-trivial functions from kernel/rcu/rcu_segcblist.h to kernel/rcu/rcu_segcblist.c to avoid excessive inlining of large functions. Additional patches open-coding some of the more trivial functions to follow. Thanx, Paul

 include/linux/rcu_segcblist.h |  628 --
 init/Kconfig                  |    3
 kernel/rcu/Makefile           |    1
 kernel/rcu/rcu_segcblist.c    |  505 ++
 kernel/rcu/rcu_segcblist.h    | 1178 --
 kernel/rcu/srcutiny.c         |    1
 kernel/rcu/srcutree.c         |    1
 kernel/rcu/tree.h             |    3
 8 files changed, 1196 insertions(+), 1124 deletions(-)
[PATCH tip/core/rcu 2/2] rcu: Separately compile large rcu_segcblist functions
This commit creates a new kernel/rcu/rcu_segcblist.c file that contains non-trivial segcblist functions. Trivial functions remain as static inline functions in kernel/rcu/rcu_segcblist.h Reported-by: Linus Torvalds Signed-off-by: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner --- init/Kconfig | 3 + kernel/rcu/Makefile| 1 + kernel/rcu/rcu_segcblist.c | 505 ++ kernel/rcu/rcu_segcblist.h | 533 +++-- 4 files changed, 544 insertions(+), 498 deletions(-) create mode 100644 kernel/rcu/rcu_segcblist.c diff --git a/init/Kconfig b/init/Kconfig index 42a346b0df43..1d3475fc9496 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -573,6 +573,9 @@ config RCU_STALL_COMMON the tiny variants to disable RCU CPU stall warnings, while making these warnings mandatory for the tree variants. +config RCU_NEED_SEGCBLIST + def_bool ( TREE_RCU || PREEMPT_RCU || TINY_SRCU || TREE_SRCU ) + config CONTEXT_TRACKING bool diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile index 158e6593d58c..23803c7d5180 100644 --- a/kernel/rcu/Makefile +++ b/kernel/rcu/Makefile @@ -12,3 +12,4 @@ obj-$(CONFIG_TREE_RCU) += tree.o obj-$(CONFIG_PREEMPT_RCU) += tree.o obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o obj-$(CONFIG_TINY_RCU) += tiny.o +obj-$(CONFIG_RCU_NEED_SEGCBLIST) += rcu_segcblist.o diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c new file mode 100644 index ..2b62a38b080f --- /dev/null +++ b/kernel/rcu/rcu_segcblist.c @@ -0,0 +1,505 @@ +/* + * RCU segmented callback lists, function definitions + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, you can access it online at + * http://www.gnu.org/licenses/gpl-2.0.html. + * + * Copyright IBM Corporation, 2017 + * + * Authors: Paul E. McKenney + */ + +#include <linux/types.h> +#include <linux/kernel.h> +#include <linux/interrupt.h> + +#include "rcu_segcblist.h" + +/* Initialize simple callback list. */ +void rcu_cblist_init(struct rcu_cblist *rclp) +{ + rclp->head = NULL; + rclp->tail = &rclp->head; + rclp->len = 0; + rclp->len_lazy = 0; +} + +/* + * Debug function to actually count the number of callbacks. + * If the number exceeds the limit specified, return -1. + */ +long rcu_cblist_count_cbs(struct rcu_cblist *rclp, long lim) +{ + int cnt = 0; + struct rcu_head **rhpp = &rclp->head; + + for (;;) { + if (!*rhpp) + return cnt; + if (++cnt > lim) + return -1; + rhpp = &(*rhpp)->next; + } +} + +/* + * Dequeue the oldest rcu_head structure from the specified callback + * list. This function assumes that the callback is non-lazy, but + * the caller can later invoke rcu_cblist_dequeued_lazy() if it + * finds otherwise (and if it cares about laziness). This allows + * different users to have different ways of determining laziness. + */ +struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp) +{ + struct rcu_head *rhp; + + rhp = rclp->head; + if (!rhp) + return NULL; + rclp->len--; + rclp->head = rhp->next; + if (!rclp->head) + rclp->tail = &rclp->head; + return rhp; +} + +/* + * Initialize an rcu_segcblist structure. + */ +void rcu_segcblist_init(struct rcu_segcblist *rsclp) +{ + int i; + + BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq)); + BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq)); + rsclp->head = NULL; + for (i = 0; i < RCU_CBLIST_NSEGS; i++) + rsclp->tails[i] = &rsclp->head; + rsclp->len = 0; + rsclp->len_lazy = 0; +} + +/* + * Disable the specified rcu_segcblist structure, so that callbacks can + * no longer be posted to it. 
This structure must be empty. + */ +void rcu_segcblist_disable(struct rcu_segcblist *rsclp) +{ + WARN_ON_ONCE(!rcu_segcblist_empty(rsclp)); + WARN_ON_ONCE(rcu_segcblist_n_cbs(rsclp)); + WARN_ON_ONCE(rcu_segcblist_n_lazy_cbs(rsclp)); + rsclp->tails[RCU_NEXT_TAIL] = NULL; +} + +/* + * Is the specified segment of the specified rcu_segcblist structure + * empty of callbacks? + */ +bool rcu_segcblist_segempty(struct rcu_segcblist *rsclp, int seg) +{ + if (seg == RCU_DONE_TAIL) + return &rsclp->head == rsclp->tails[RCU_DONE_TAIL]; + return
[PATCH tip/core/rcu 1/2] srcu: Debloat the header
From: Ingo Molnar Linus noticed that the <linux/rcu_segcblist.h> header has huge inline functions which should not be inline at all. As a first step in cleaning this up, move them all to kernel/rcu/ and only keep an absolute minimum of data type defines in the header: before: -rw-r--r-- 1 mingo mingo 22284 May 2 10:25 include/linux/rcu_segcblist.h after: -rw-r--r-- 1 mingo mingo 3180 May 2 10:22 include/linux/rcu_segcblist.h More can be done, such as uninlining the large functions, whose inlining is unjustified even if it's an RCU-internal matter. Reported-by: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Ingo Molnar Signed-off-by: Paul E. McKenney --- include/linux/rcu_segcblist.h | 628 +--- kernel/rcu/rcu_segcblist.h | 645 ++ kernel/rcu/srcutiny.c | 1 + kernel/rcu/srcutree.c | 1 + kernel/rcu/tree.h | 3 +- 5 files changed, 652 insertions(+), 626 deletions(-) create mode 100644 kernel/rcu/rcu_segcblist.h diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h index ced8f313fd05..ba4d2621d9ca 100644 --- a/include/linux/rcu_segcblist.h +++ b/include/linux/rcu_segcblist.h @@ -20,8 +20,8 @@ * Authors: Paul E. McKenney */ -#ifndef __KERNEL_RCU_SEGCBLIST_H -#define __KERNEL_RCU_SEGCBLIST_H +#ifndef __INCLUDE_LINUX_RCU_SEGCBLIST_H +#define __INCLUDE_LINUX_RCU_SEGCBLIST_H /* Simple unsegmented callback lists. */ struct rcu_cblist { @@ -33,102 +33,6 @@ struct rcu_cblist { #define RCU_CBLIST_INITIALIZER(n) { .head = NULL, .tail = &n.head } -/* Initialize simple callback list. */ -static inline void rcu_cblist_init(struct rcu_cblist *rclp) -{ - rclp->head = NULL; - rclp->tail = &rclp->head; - rclp->len = 0; - rclp->len_lazy = 0; -} - -/* Is simple callback list empty? */ -static inline bool rcu_cblist_empty(struct rcu_cblist *rclp) -{ - return !rclp->head; -} - -/* Return number of callbacks in simple callback list. 
*/ -static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp) -{ - return rclp->len; -} - -/* Return number of lazy callbacks in simple callback list. */ -static inline long rcu_cblist_n_lazy_cbs(struct rcu_cblist *rclp) -{ - return rclp->len_lazy; -} - -/* - * Debug function to actually count the number of callbacks. - * If the number exceeds the limit specified, return -1. - */ -static inline long rcu_cblist_count_cbs(struct rcu_cblist *rclp, long lim) -{ - int cnt = 0; - struct rcu_head **rhpp = &rclp->head; - - for (;;) { - if (!*rhpp) - return cnt; - if (++cnt > lim) - return -1; - rhpp = &(*rhpp)->next; - } -} - -/* - * Dequeue the oldest rcu_head structure from the specified callback - * list. This function assumes that the callback is non-lazy, but - * the caller can later invoke rcu_cblist_dequeued_lazy() if it - * finds otherwise (and if it cares about laziness). This allows - * different users to have different ways of determining laziness. - */ -static inline struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp) -{ - struct rcu_head *rhp; - - rhp = rclp->head; - if (!rhp) - return NULL; - rclp->len--; - rclp->head = rhp->next; - if (!rclp->head) - rclp->tail = &rclp->head; - return rhp; -} - -/* - * Account for the fact that a previously dequeued callback turned out - * to be marked as lazy. - */ -static inline void rcu_cblist_dequeued_lazy(struct rcu_cblist *rclp) -{ - rclp->len_lazy--; -} - -/* - * Interim function to return rcu_cblist head pointer. Longer term, the - * rcu_cblist will be used more pervasively, removing the need for this - * function. - */ -static inline struct rcu_head *rcu_cblist_head(struct rcu_cblist *rclp) -{ - return rclp->head; -} - -/* - * Interim function to return rcu_cblist head pointer. Longer term, the - * rcu_cblist will be used more pervasively, removing the need for this - * function. 
- */ -static inline struct rcu_head **rcu_cblist_tail(struct rcu_cblist *rclp) -{ - WARN_ON_ONCE(rcu_cblist_empty(rclp)); - return rclp->tail; -} - /* Complicated segmented callback lists. ;-) */ /* @@ -183,530 +87,4 @@ struct rcu_segcblist { .tails[RCU_NEXT_TAIL] = &n.head, \ } -/* - * Initialize an rcu_segcblist structure. - */ -static inline void rcu_segcblist_init(struct rcu_segcblist *rsclp) -{ - int i; - - BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq)); - BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq)); - rsclp->head = NULL; - for (i = 0; i < RCU_CBLIST_NSEGS; i++) - rsclp->tails[i] = &rsclp->head; - rsclp->len = 0; - rsclp->len_lazy = 0; -} - -/* - * Is the specified rcu_segcblist structure empty? - * - * But careful! The fact that the ->head field is NULL
RE: Asmedia USB 1343 crashes
From: Thomas Fjellstrom > Sent: 01 May 2017 14:40 > I've got a 970 Pro gaming aura motherboard with an Asmedia 1343 Usb 3.1 > controller. It's been consistently throwing errors and eventually crashing and > becoming unresponsive. ... I've an earlier Asmedia 1042 controller. It has a bug (which I don't remember seeing a fix for) that it will only process one entry from the command ring each time the doorbell is rung. If more than one command gets queued then it all gets very confusing because responses to timed-out commands keep being seen. (I no longer have the USB3 ethernet device needed to reproduce this.) The later part might have the same bug. David
Re: [PATCH 1/1 linux-next] btrfs: kmap() can't fail
On Tue, Apr 25, 2017 at 08:11:02PM +0200, Fabian Frederick wrote: > Remove NULL test on kmap() > > Signed-off-by: Fabian Frederick Reviewed-by: David Sterba
Re: [PATCH 2/4] PCI: add functionality for resizing resources v3
On 26.04.2017 at 19:00, Andy Shevchenko wrote: On Tue, Apr 25, 2017 at 4:19 PM, Christian König wrote: From: Christian König This allows device drivers to request resizing their BARs. The function only tries to reprogram the windows of the bridge directly above the requesting device and only the BAR of the same type (usually mem, 64bit, prefetchable). This is done to make sure not to disturb other drivers by changing the BARs of their devices. If reprogramming the bridge BAR fails the old status is restored and -ENOSPC returned to the calling device driver. +int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type) +{ + const unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM | + IORESOURCE_PREFETCH | IORESOURCE_MEM_64; + Redundant. Redundant, but also a reminder to myself that I wanted to ask something about that. This type_mask is used already three times in this file, shouldn't we add a define for that? [SNIP] + list_for_each_entry(dev_res, &saved, list) { + /* Skip the bridge we just assigned resources for. */ + if (bridge == dev_res->dev) + continue; + + bridge = dev_res->dev; + pci_setup_bridge(bridge->subordinate); + } + + free_list(&saved); + free_list(&failed); + return ret; You might re-use two lines with below, but perhaps better to show which case returns 0 explicitly and drop assignment ret = 0 above. Good point, but actually the free_list(&failed) is superfluous here since when the failed list isn't empty we end up in the cleanup path. Going to fix all other comments in the next version. Regards, Christian.
Re: [PATCH] usb: gadget: udc: add null check before pointer dereference
Hi Alan, Quoting Alan Stern : On Mon, 1 May 2017, Gustavo A. R. Silva wrote: Add null check before dereferencing dev->regs pointer inside net2280_led_shutdown() function. Addresses-Coverity-ID: 101783 Signed-off-by: Gustavo A. R. Silva --- drivers/usb/gadget/udc/net2280.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c index 3828c2e..1898a4b 100644 --- a/drivers/usb/gadget/udc/net2280.c +++ b/drivers/usb/gadget/udc/net2280.c @@ -3573,7 +3573,11 @@ static void net2280_remove(struct pci_dev *pdev) BUG_ON(dev->driver); /* then clean up the resources we allocated during probe() */ - net2280_led_shutdown(dev); + + if (dev->regs) { + net2280_led_shutdown(dev); + iounmap(dev->regs); + } if (dev->requests) { int i; for (i = 1; i < 5; i++) { @@ -3588,8 +3592,6 @@ static void net2280_remove(struct pci_dev *pdev) free_irq(pdev->irq, dev); if (dev->quirks & PLX_PCIE) pci_disable_msi(pdev); - if (dev->regs) - iounmap(dev->regs); if (dev->region) release_mem_region(pci_resource_start(pdev, 0), pci_resource_len(pdev, 0)); No, you must not move the iounmap() call, because an interrupt could theoretically occur at any time. Yeah, I was suspicious about it. Either just live with an extra test of dev->regs, or else move the net2280_led_shutdown() call later. In this case I think it is safe to move the net2280_led_shutdown() call, as the function is only turning off the LEDs. I'll send a patch shortly. Thank you -- Gustavo A. R. Silva
[PATCH v4 RFT 1/2] powerpc/fadump: reduce memory consumption for capture kernel
With fadump (dump capture) kernel booting like a regular kernel, it almost needs the same amount of memory to boot as the production kernel, which is unwarranted for a dump capture kernel. But with no option to disable some of the unnecessary subsystems in fadump kernel, that much memory is wasted on fadump, depriving the production kernel of that memory. Introduce kernel parameter 'fadump_append=' that would take regular kernel parameters to be appended when fadump is active. This 'fadump_append=' parameter can be leveraged to pass parameters like nr_cpus=1, cgroup_disable=memory and numa=off, to disable unwarranted resources/subsystems. Also, ensure the log "Firmware-assisted dump is active" is printed early in the boot process to put the subsequent fadump messages in context. Suggested-by: Michael Ellerman Signed-off-by: Hari Bathini Signed-off-by: Michal Suchanek --- v4: - use space-separated arguments instead of comma-separated - do not append parameters when fadump is disabled --- arch/powerpc/kernel/fadump.c | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index ebf2e9c..e0c728a 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -79,8 +79,10 @@ int __init early_init_dt_scan_fw_dump(unsigned long node, * dump data waiting for us. */ fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump", NULL); - if (fdm_active) + if (fdm_active) { + pr_info("Firmware-assisted dump is active.\n"); fw_dump.dump_active = 1; + } /* Get the sizes required to store dump data for the firmware provided * dump sections. 
@@ -263,8 +265,12 @@ int __init fadump_reserve_mem(void) { unsigned long base, size, memory_boundary; - if (!fw_dump.fadump_enabled) + if (!fw_dump.fadump_enabled) { + if (fw_dump.dump_active) + pr_warn("Firmware-assisted dump was active but kernel" + " booted with fadump disabled!\n"); return 0; + } if (!fw_dump.fadump_supported) { printk(KERN_INFO "Firmware-assisted dump is not supported on" @@ -304,7 +310,6 @@ int __init fadump_reserve_mem(void) memory_boundary = memblock_end_of_DRAM(); if (fw_dump.dump_active) { - printk(KERN_INFO "Firmware-assisted dump is active.\n"); /* * If last boot has crashed then reserve all the memory * above boot_memory_size so that we don't touch it until @@ -363,6 +368,22 @@ unsigned long __init arch_reserved_kernel_pages(void) return memblock_reserved_size() / PAGE_SIZE; } +/* Look for fadump_append= cmdline option. */ +static int __init early_fadump_append_param(char *p) +{ + if (!p) + return 1; + + if (fw_dump.fadump_enabled && fw_dump.dump_active) { + pr_info("enforcing additional parameters (%s) passed through " + "'fadump_append=' parameter\n", p); + parse_early_options(p); + } + + return 0; +} +early_param("fadump_append", early_fadump_append_param); + /* Look for fadump= cmdline option. */ static int __init early_fadump_param(char *p) { -- 2.10.2
[PATCH v4 RFT 2/2] powerpc/fadump: update documentation about 'fadump_append=' parameter
With the introduction of 'fadump_append=' parameter to pass additional parameters to fadump (capture) kernel, update documentation about it. Signed-off-by: Hari Bathini Signed-off-by: Michal Suchanek --- Documentation/powerpc/firmware-assisted-dump.txt | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Documentation/powerpc/firmware-assisted-dump.txt b/Documentation/powerpc/firmware-assisted-dump.txt index 9cabaf8..99ab8f4 100644 --- a/Documentation/powerpc/firmware-assisted-dump.txt +++ b/Documentation/powerpc/firmware-assisted-dump.txt @@ -162,7 +162,11 @@ How to enable firmware-assisted dump (fadump): 1. Set config option CONFIG_FA_DUMP=y and build kernel. 2. Boot into linux kernel with 'fadump=on' kernel cmdline option. -3. Optionally, user can also set 'crashkernel=' kernel cmdline +3. A user can pass additional kernel parameters as a space-separated list + through 'fadump_append=' parameter, to be appended when fadump is active. + This can be used to pass parameters like nr_cpus=1, numa=off to reduce + memory consumption during dump capture. +4. Optionally, user can also set 'fadump_reserve_mem=' kernel cmdline to specify size of the memory to reserve for boot memory dump preservation. @@ -172,6 +176,8 @@ NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead 2. If firmware-assisted dump fails to reserve memory then it will fall back to existing kdump mechanism if 'crashkernel=' option is set at kernel cmdline. + 3. 'fadump_append=' parameter can be quoted to append multiple arguments + as in 'fadump_append="nr_cpus=8 numa=off"' Sysfs/debugfs files: -- 2.10.2
Re: [Xen-devel] linux-next: Tree for May 2 (xen: events_base.c)
On 05/02/2017 11:34 AM, Randy Dunlap wrote: > On 05/01/17 23:47, Stephen Rothwell wrote: >> Hi all, >> >> Please do not add any v4.13 destined material in your linux-next >> included branches until after v4.12-rc1 has been released. >> >> Changes since 20170501: >> > on x86_64: > > drivers/built-in.o: In function `set_affinity_irq': > events_base.c:(.text+0x1632c4): undefined reference to > `xen_have_vector_callback' > arch/x86/pci/built-in.o: In function `pci_xen_hvm_init': > (.init.text+0x59f): undefined reference to `xen_have_vector_callback' > > Can you send me (or to the list) your config file? Thanks. -boris
Re: [PATCH v2 3/4] iio: accel: adxl345: Setup DATA_READY trigger
On 02/05/17 04:01, Rob Herring wrote: > On Sun, Apr 30, 2017 at 7:32 PM, Jonathan Cameron wrote: >> On 29/04/17 08:49, Eva Rachel Retuya wrote: >>> The ADXL345 provides a DATA_READY interrupt function to signal >>> availability of new data. This interrupt function is latched and can be >>> cleared by reading the data registers. The polarity is set to active >>> high by default. >>> >>> Support this functionality by setting it up as an IIO trigger. >>> >>> In addition, two output pins INT1 and INT2 are available for driving >>> interrupts. Allow mapping to either pins by specifying the >>> interrupt-names property in device tree. >>> >>> Signed-off-by: Eva Rachel Retuya >> Coming together nicely, but a few more bits and pieces inline... >> >> One slight worry is that the irq names stuff is to restrictive >> as we want to direct different interrupts to different pins if >> both are supported! > > [...] > >>> @@ -199,6 +253,22 @@ int adxl345_core_probe(struct device *dev, struct >>> regmap *regmap, >>> dev_err(dev, "Failed to set data range: %d\n", ret); >>> return ret; >>> } >>> + /* >>> + * Any bits set to 0 send their respective interrupts to the INT1 pin, >>> + * whereas bits set to 1 send their respective interrupts to the INT2 >>> + * pin. Map all interrupts to the specified pin. >> This is an interesting comment. The usual reason for dual interrupt >> pins is precisely to not map all functions to the same one. That allows >> for a saving in querying which interrupt it is by having just the data ready >> on one pin and just the events on the other... >> >> Perhaps the current approach won't support that mode of operation? >> Clearly we can't merge a binding that enforces them all being the same >> and then change it later as it'll be incompatible. >> >> I'm not quite sure how one should do this sort of stuff in DT though. >> >> Rob? > > DT should just describe what is connected which I gather here could be > either one or both IRQs. 
We generally distinguish the IRQs with the > interrupt-names property and then retrieve it as below. Picking this branch to continue on so I'll grab Eva's reply as well. Eva said: > I've thought about this before since to me that's the better approach > than one or the other. I was in a time crunch, hence I went with > this way. The input driver does this as well and what I just did is to > match what it does. If you could point me to some drivers for reference, > I'll gladly analyze those and present something better on the next > revision. So taking both of these and having thought about it a bit more in my current jet-lagged state (I hate travelling - particularly with the added amusement of a flat tyre on the plane). To my mind we need to describe what interrupts are there, as Rob says. It's all obvious if there is only one interrupt connected (often the case I suspect, as pins are in short supply on many SoCs). If we allow the binding to specify both pins (using names to do the matching to which pin they are on the chip), then we could allow the driver itself to optimize the usage according to what is enabled. Note though that this can come later - for now we just need to allow the specification of both interrupts if they are present. So let's talk about the ideal ;) Probably makes sense to separate dataready and the events if possible. Ideal would be to even allow individual events to have their own pins as long as there are only two available. So we need a heuristic to work out what interrupts to put where. It doesn't work well as a lookup table (I tried it) #define ADXL345_OVERRUN BIT(0) #define ADXL345_WATERMARK BIT(1) #define ADXL345_FREEFALL BIT(2) #define ADXL345_INACTIVITY BIT(3) #define ADXL345_ACTIVITY BIT(4) #define ADXL345_DOUBLE_TAP BIT(5) #define ADXL345_SINGLE_TAP BIT(6) #define ADXL345_DATA_READY BIT(7) So some function that takes the bitmap of what is enabled and tries to divide it sensibly. 
int adxl345_int_heuristic(u8 input, u8 *output) { switch (hweight8(input)) { case 0 ... 1: *output = input; break; case 2: *output = BIT(ffs(input) - 1); // this will put one on each interrupt. break; case 3 ... 7: // now it gets tricky. Perhaps always have dataready and watermark on own interrupt if set? if (input & (ADXL345_DATA_READY | ADXL345_WATERMARK)) *output = input & (ADXL345_DATA_READY | ADXL345_WATERMARK); else // taps always on same one etc... } } Then your interrupt handler will need to look at the incoming interrupt and work out if it needs to read the status register to know what it has. If it doesn't need to then it doesn't do so. Be careful to only clear the right interrupts though in that case as it is always possible both are set. Anyhow, right now all that needs to be there is the
Re: [usb-host] question about pointer dereference before null check
Hi Alan, Quoting Alan Stern : On Mon, 1 May 2017, Gustavo A. R. Silva wrote: Hello everybody, While taking a look into Coverity ID 100828 I ran into the following piece of code at drivers/usb/host/ehci-sched.c:1096: u32 addr; int think_time; int hs_transfers; addr = dev->ttport << 24; if (!ehci_is_TDI(ehci) || (dev->tt->hub != ehci_to_hcd(ehci)->self.root_hub)) addr |= dev->tt->hub->devnum << 16; addr |= epnum << 8; addr |= dev->devnum; stream->ps.usecs = HS_USECS_ISO(maxp); think_time = dev->tt ? dev->tt->think_time : 0; The issue here is that dev->tt is being dereferenced before null check. I was thinking on placing think_time = dev->tt ? dev->tt->think_time : 0; just before the _if_ statement. But this doesn't address the problem of dev->tt actually being NULL. While looking into the callers of the function containing this piece of code (iso_stream_init()) my impression is that dev->tt is not NULL at the time this function is called and, a very simple patch like the following can be applied in order to avoid the Coverity issue: -think_time = dev->tt ? dev->tt->think_time : 0; +think_time = dev->tt->think_time; But I can't tell for sure, so in case dev->tt is NULL, a good strategy to properly update the _addr_ variable would be needed. What do you think? I would really appreciate any comment on this, Thank you! You are right; udev->tt cannot ever be NULL when this section of code runs. The test should be removed. Thanks for confirming, I'll send a patch shortly. -- Gustavo A. R. Silva
Re: [PATCH RFC] hugetlbfs 'noautofill' mount option
On 5/2/17 3:53 AM, Anshuman Khandual wrote: On 05/01/2017 11:30 PM, Prakash Sangappa wrote: Some applications like a database use hugetlbfs for performance reasons. Files on hugetlbfs filesystem are created and huge pages allocated using fallocate() API. Pages are deallocated/freed using fallocate() hole punching support that has been added to hugetlbfs. These files are mmapped and accessed by many processes as shared memory. Such applications keep track of which offsets in the hugetlbfs file have pages allocated. Any access to mapped address over holes in the file, which can occur due s/mapped/unmapped/ ^ ? It is 'mapped' address. to bugs in the application, is considered invalid and expects the process to simply receive a SIGBUS. However, currently when a hole in the file is accessed via the mapped address, kernel/mm attempts to automatically allocate a page at page fault time, resulting in implicitly filling the hole But this is expected when you try to control the file allocation from a mapped address. Any changes while walking past or writing the range in the memory mapped should reflect exactly in the file on the disk. Why is it not a valid behavior? Sure, that is a valid behavior. However, hugetlbfs is a pseudo filesystem and the purpose is for applications to use hugepage memory. The contents of this filesystem are not backed by disk nor are they swapped out. The proposed new behavior is only applicable for hugetlbfs filesystem mounted with the new 'noautofill' mount option. The file's page allocation/free are managed using the 'fallocate()' API. For hugetlbfs filesystems mounted without this option, there is no change in behavior. in the file. This may not be the desired behavior for applications like the database that want to explicitly manage page allocations of hugetlbfs files. This patch adds a new hugetlbfs mount option 'noautofill', to indicate that pages should not be allocated at page fault time when accessed through mmapped address. 
When the page should be allocated for mapping ? The application would allocate/free file pages using the fallocate() API.
[PATCH] usb: host: remove unnecessary null check
Remove unnecessary null check. udev->tt cannot ever be NULL when this section of code runs. Addresses-Coverity-ID: 100828 Cc: Alan Stern Signed-off-by: Gustavo A. R. Silva --- drivers/usb/host/ehci-sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/host/ehci-sched.c b/drivers/usb/host/ehci-sched.c index 980a6b3..6bc6304 100644 --- a/drivers/usb/host/ehci-sched.c +++ b/drivers/usb/host/ehci-sched.c @@ -1105,7 +1105,7 @@ iso_stream_init( addr |= epnum << 8; addr |= dev->devnum; stream->ps.usecs = HS_USECS_ISO(maxp); - think_time = dev->tt ? dev->tt->think_time : 0; + think_time = dev->tt->think_time; stream->ps.tt_usecs = NS_TO_US(think_time + usb_calc_bus_time( dev->speed, is_input, 1, maxp)); hs_transfers = max(1u, (maxp + 187) / 188); -- 2.5.0
Re: [PATCH v2 4/4] iio: accel: adxl345: Add support for triggered buffer
On 02/05/17 13:23, Eva Rachel Retuya wrote: > On Mon, May 01, 2017 at 01:42:29AM +0100, Jonathan Cameron wrote: > [...] >> Few minor bits inline... I'm a little bit in two minds about the >> holding up waiting for new data when using another trigger... >> >> Jonathan > [...] >>> static int adxl345_read_raw(struct iio_dev *indio_dev, >>> @@ -127,6 +151,10 @@ static int adxl345_read_raw(struct iio_dev *indio_dev, >>> >>> switch (mask) { >>> case IIO_CHAN_INFO_RAW: >>> + ret = iio_device_claim_direct_mode(indio_dev); >>> + if (ret) >>> + return ret; >>> + >>> mutex_lock(&data->lock); >>> ret = adxl345_set_mode(data, ADXL345_POWER_CTL_MEASURE); >>> if (ret < 0) { >>> @@ -148,12 +176,14 @@ static int adxl345_read_raw(struct iio_dev *indio_dev, >>> ret = regmap_bulk_read(data->regmap, chan->address, &regval, >>> sizeof(regval)); >>> mutex_unlock(&data->lock); >>> + iio_device_release_direct_mode(indio_dev); >>> if (ret < 0) { >>> adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >>> return ret; >>> } >>> >>> - *val = sign_extend32(le16_to_cpu(regval), 12); >>> + *val = sign_extend32(le16_to_cpu(regval), >>> + chan->scan_type.realbits - 1) >> This change isn't really needed, but I suppose it does little harm... >> >>> adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >>> >>> return IIO_VAL_INT; >>> @@ -186,6 +216,64 @@ static irqreturn_t adxl345_irq(int irq, void *p) >>> return IRQ_NONE; >>> } >>> >>> +static irqreturn_t adxl345_trigger_handler(int irq, void *p) >>> +{ >>> + struct iio_poll_func *pf = p; >>> + struct iio_dev *indio_dev = pf->indio_dev; >>> + struct adxl345_data *data = iio_priv(indio_dev); >>> + int ret; >>> + >>> + mutex_lock(&data->lock); >>> + /* Make sure data is ready when using external trigger */ >> I 'think' this is only really relevant for the very first one. >> After that general rule of thumb is that if an external trigger >> is too quick - bad luck you'll get repeated data. 
>> >> One of the reasons we would want to use another trigger is to >> support capture in parallel from several sensors - if we 'hold' >> like this we'll get out of sync. >> >> As such I wonder if a better strategy would be to 'hold' for the >> first reading in the buffer enable - thus guaranteeing valid >> data before we start. After that we wouldn't need to check this >> here. >> > > Thanks for the explanation. If we are to go with this one, where to put > it, preenable or postenable? I'm assuming the latter but would like to > confirm. postenable. It could in theory be affected by a future use of the update_scan_mode callback so should be after that. J > >> What do others think? >> > > Any other inputs are greatly appreciated. > > Eva
Re: [PATCH] device-dax: fix sysfs attribute deadlock
On Tue, May 2, 2017 at 3:43 AM, Yi Zhang wrote: > Verified this patch on 4.11. > > Tested-by: Yi Zhang Thanks!
Re: [PATCH v3 0/4] Improved seccomp logging
On Mon, May 1, 2017 at 7:41 PM, Tyler Hicks wrote: > On 04/27/2017 07:42 PM, Kees Cook wrote: >> On Thu, Apr 27, 2017 at 3:17 PM, Tyler Hicks wrote: >>> Quick update... I finished the move from the high-water mark >>> log_max_action sysctl to the bitmask based actions_logged sysctl. >> >> Awesome! >> >>> Unfortunately, I've just realized that SECCOMP_SET_LOGGING, or any >>> process-wide logging configuration mechanism, will not work. It is fine >>> for the situation where two unrelated processes set up seccomp filters >>> that should be logged differently. However, it fails when two closely >>> related processes, such as parent and child, need to set up seccomp >>> filters that should be logged differently. Imagine a launcher that sets >>> up an application sandbox (including a seccomp filter) and then launches >>> an electron app which will have its own seccomp filter for sandboxing >>> untrusted code that it runs. Unless the launcher and app completely >>> agree on actions that should be logged, the logging won't work as >>> intended for both processes. >> >> Oh, you mean the forked process sets up the logging it wants for the >> filters it just installed, then after exec a process sets up new >> logging requirements? > > Yes - see below. > >> >>> I think this needs to be configured at the filter level. >> >> I'm not sure that's even the right way to compose the logging desires. >> >> So, my initial thought was "whatever ran SECCOMP_SET_LOGGING knows >> what it's doing" and it should be the actual value. >> >> If the launcher wants logs of everything the application does with its >> filters, then a purely-tied-to-filter approach won't work either. >> >> Perhaps log bits can only be enabled? I.e. SECCOMP_SET_LOGGING >> performs an OR instead of an assignment? > > The problem that I'm envisioning with this design is this: > > 1. Launcher is told to launch Chrome and forks off a process. > > 2. 
Launcher sets up a filter using RET_ERRNO for all unacceptable > syscalls and enables auditing of RET_ERRNO. > > 3. Launcher execs Chrome. > > 4. Chrome then sets up its own, more restrictive filter that uses > RET_ERRNO, among other actions, but does not want auditing of RET_ERRNO. > > If we use process-wide auditing controls, the logs will be filled with > RET_ERRNO messages that were unintended and unrelated to the RET_ERRNO > actions set up in the launcher's filter. > > Unfortunately, the OR'ing idea doesn't solve the problem. Things like my more complicated solution solve this completely, I think. The launcher would, by whatever means, say "RET_ERRNO and log this". The more restrictive sandbox would say "RET_ERRNO and don't log this" and we'd just make sure that the composition rules mean the inner rule wins. --Andy
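The "inner rule wins" composition Andy describes can be sketched in plain C. This is a hypothetical model for illustration only, not the kernel's actual seccomp implementation: the action values, struct layout, and function names are all made up; the real seccomp composes stacked filters by taking the most restrictive return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative action values: lower = more restrictive, mirroring how
 * seccomp picks the most restrictive result across stacked filters. */
enum action { ACT_KILL = 0, ACT_ERRNO = 1, ACT_ALLOW = 2 };

struct filter {
	enum action verdict;	/* what this filter returns for the syscall */
	bool log_errno;		/* does this filter want RET_ERRNO logged? */
};

/* Most restrictive verdict across the stack (chain[0] is the outermost
 * filter, chain[n-1] the innermost, most recently installed one). */
static enum action evaluate_action(const struct filter *chain, size_t n)
{
	enum action result = ACT_ALLOW;
	size_t i;

	for (i = 0; i < n; i++)
		if (chain[i].verdict < result)
			result = chain[i].verdict;
	return result;
}

/* "Inner rule wins": the filter that produced the winning verdict decides
 * whether the event is logged; on equal verdicts the later (inner) filter
 * overrides the outer one's preference. */
static bool should_log(const struct filter *chain, size_t n)
{
	enum action result = ACT_ALLOW;
	bool log = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (chain[i].verdict <= result) {
			result = chain[i].verdict;
			log = (result == ACT_ERRNO) && chain[i].log_errno;
		}
	}
	return log;
}
```

With this composition, the launcher/Chrome scenario from the thread comes out as intended: Chrome's inner RET_ERRNO rule fires, and Chrome's "don't log" preference wins over the launcher's "log everything" setting.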
Re: [PATCH] mtd: nand: gpio: make nCE GPIO optional
Hi Christophe, Boris, On Tue, May 02, 2017 at 11:03:34AM +0200, Boris Brezillon wrote: > On Tue, 2 May 2017 07:47:40 +0200 > Christophe LEROY wrote: > > > Le 01/05/2017 à 23:46, Brian Norris a écrit : > > > On Fri, Feb 10, 2017 at 03:01:10PM +0100, Christophe Leroy wrote: > > >> On some hardware, the nCE signal is wired to the ChipSelect associated > > >> to bus address of the NAND, so it is automatically driven during the > > >> memory access and it is not managed by a GPIO. > > >> > > >> Signed-off-by: Christophe Leroy > > > > > > Not really a problem with this patch exactly, but FYI you're only making > > > this optional for the non-DT case. For device tree, this is kinda hard > > > to do, since the current binding suggests we retrieve the GPIOs based on > > > index position, not by name. So if you leave one off...I guess we well > > > just be off-by-1 on the indeces until we hit a non-optional one...which > > > I guess is "CLE". > > > > > > If we wanted this to work for DT, we'd need to extend this driver (and > > > binding doc) to support requesting GPIOs by name. > > > > > > > It works for me with devicetree. > > > > I have the following definition in my DT: > > > > nand@1,0 { > > compatible = "gpio-control-nand"; > > reg = <1 0x0 0x01>; > > #address-cells = <1>; > > #size-cells = <1>; > > gpios = <_pio_c 24 1 // RDY > > 0 // nCE > > _pio_c 26 1 // ALE > > _pio_c 25 1 // CLE > > 0>;// nwp > > }; > > > > Yep, it's perfectly fine to have 'empty' gpio entries (entries with > phandle set to 0/NULL), we're using this trick in the atmel_nand > driver as well. I wasn't aware. In that case, you need to change the binding doc to note that nCE is optional now. Brian
Re: LKML: "Fixes as per checkpatch.pl" patches
On Tue, 2017-05-02 at 18:08 +0300, Ali Kaasinen wrote: > Hellooo, presumably contacted the right people. > > I often browse LKML, and patches saying "Fix style issues as reported by > checkpatch.pl", and Greg responding "That's really vague, you need to be > specific, and only fix one type of thing per patch" seem fairly common > these days, e.g.: > > - http://lkml.iu.edu/hypermail/linux/kernel/1704.3/01867.html > > - http://lkml.iu.edu/hypermail/linux/kernel/1705.0/00105.html > > Just wondering, shouldn't checkpatch mention that? This question should be asked while cc'ing lkml. What Greg does with patches is up to Greg. A checkpatch user guide in Documentation could be useful.
[PATCH tip/core/rcu 0/3] Open-code trivial rcu_cblist accessors
Hello! This series open-codes trivial rcu_cblist structure accessors:

1. Open-code rcu_cblist_empty(p) as !p->head.
2. Open-code rcu_cblist_n_cbs(p) as p->len.
3. Open-code rcu_cblist_n_lazy_cbs(p) as p->len_lazy.

Thanx, Paul

 rcu_segcblist.h |   20 +---
 tree.c          |   20 +---
 tree_plugin.h   |    8
 tree_trace.c    |    4 ++--

 4 files changed, 16 insertions(+), 36 deletions(-)
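The pattern in all three patches is the same: a one-line inline accessor is replaced by direct field access at each call site. A minimal sketch of the idea, using a simplified stand-in for struct rcu_cblist (the real definition in kernel/rcu/rcu_segcblist.h has more context):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rcu_cblist. */
struct rcu_cblist {
	void *head;	/* first callback, NULL if the list is empty */
	long len;	/* number of callbacks */
	long len_lazy;	/* number of lazy callbacks */
};

/* A trivial accessor of the kind the series removes: it adds a level of
 * indirection over a single field read and nothing else. */
static inline int rcu_cblist_empty(struct rcu_cblist *rclp)
{
	return !rclp->head;
}

/* After the series, callers simply write !p->head, p->len, or
 * p->len_lazy directly instead of calling the accessor. */
```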
Re: [PATCH v4] irqchip/gicv3-its: Avoid memory over allocation for ITEs
On Sun, Apr 30 2017 at 3:36:15 pm BST, Shanker Donthineni wrote: > We are always allocating extra 255Bytes of memory to handle ITE > physical address alignment requirement. The kmalloc() satisfies > the ITE alignment since the ITS driver is requesting a minimum > size of ITS_ITT_ALIGN bytes. > > Let's try to allocate the exact amount of memory that is required > for ITEs to avoid wastage. > > Signed-off-by: Shanker Donthineni > --- > Changes: > v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from > commit. > v3: changed from IITE to ITE. > v3: removed fallback since kmalloc() guarantees the right alignment. > > drivers/irqchip/irq-gic-v3-its.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > b/drivers/irqchip/irq-gic-v3-its.c > index 45ea1933..72e56f03 100644 > --- a/drivers/irqchip/irq-gic-v3-its.c > +++ b/drivers/irqchip/irq-gic-v3-its.c > @@ -261,7 +261,6 @@ static struct its_collection *its_build_mapd_cmd(struct > its_cmd_block *cmd, > u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites); > > itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt); > - itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN); > > its_encode_cmd(cmd, GITS_CMD_MAPD); > its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id); > @@ -1329,13 +1328,14 @@ static struct its_device *its_create_device(struct > its_node *its, u32 dev_id, >*/ > nr_ites = max(2UL, roundup_pow_of_two(nvecs)); > sz = nr_ites * its->ite_size; > - sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1; > + sz = max(sz, ITS_ITT_ALIGN); > itt = kzalloc(sz, GFP_KERNEL); > lpi_map = its_lpi_alloc_chunks(nvecs, _base, _lpis); > if (lpi_map) > col_map = kzalloc(sizeof(*col_map) * nr_lpis, GFP_KERNEL); > > - if (!dev || !itt || !lpi_map || !col_map) { > + if (!dev || !itt || !lpi_map || !col_map || > + !IS_ALIGNED(virt_to_phys(itt), ITS_ITT_ALIGN)) { > kfree(dev); > kfree(itt); > kfree(lpi_map); I'm confused. 
Either the alignment is guaranteed (and you should document why it is so), or it is not, and we need to handle the non-alignment (instead of failing). Thanks, M. -- Jazz is not dead, it just smell funny.
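The alignment trade-off being discussed can be made concrete with kernel-style ALIGN()/IS_ALIGNED() helpers. In this sketch the macros mirror the kernel's power-of-two alignment helpers, and ITS_ITT_ALIGN is assumed to be 256 for illustration (so the 255-byte over-allocation in the old code is exactly align - 1):

```c
#include <assert.h>
#include <stdint.h>

/* Kernel-style power-of-two alignment helpers. */
#define ALIGN(x, a)      (((x) + ((a) - 1)) & ~((uintptr_t)(a) - 1))
#define IS_ALIGNED(x, a) (((x) & ((uintptr_t)(a) - 1)) == 0)

#define ITS_ITT_ALIGN 256	/* assumed value, for illustration only */

/* Old scheme: over-allocate by align - 1 bytes and round the address up,
 * so the ITEs always start on an aligned boundary regardless of what the
 * allocator returned. */
static uintptr_t old_scheme(uintptr_t addr)
{
	return ALIGN(addr, ITS_ITT_ALIGN);
}

/* New scheme (v4 of the patch): rely on the allocator returning aligned
 * memory and merely verify it, failing the allocation otherwise -- which
 * is the behavior Marc is questioning: either the guarantee holds and the
 * check is dead code, or it doesn't and failing is the wrong response. */
static int new_scheme_ok(uintptr_t addr)
{
	return IS_ALIGNED(addr, ITS_ITT_ALIGN);
}
```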
[PATCH tip/core/rcu 1/3] rcu: Open-code the rcu_cblist_empty() function
Because the rcu_cblist_empty() just samples the ->head pointer, and because the rcu_cblist structure is quite straightforward, it makes sense to open-code rcu_cblist_empty(p) as !p->head, cutting out a level of indirection. This commit makes this change. Reported-by: Ingo Molnar Signed-off-by: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Linus Torvalds --- kernel/rcu/rcu_segcblist.h | 8 +--- kernel/rcu/tree.c | 9 - kernel/rcu/tree_plugin.h | 4 ++-- 3 files changed, 7 insertions(+), 14 deletions(-) diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h index 86bc1101b806..7d18d41f0116 100644 --- a/kernel/rcu/rcu_segcblist.h +++ b/kernel/rcu/rcu_segcblist.h @@ -22,12 +22,6 @@ #include -/* Is simple callback list empty? */ -static inline bool rcu_cblist_empty(struct rcu_cblist *rclp) -{ - return !rclp->head; -} - /* Return number of callbacks in simple callback list. */ static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp) { @@ -66,7 +60,7 @@ static inline struct rcu_head *rcu_cblist_head(struct rcu_cblist *rclp) */ static inline struct rcu_head **rcu_cblist_tail(struct rcu_cblist *rclp) { - WARN_ON_ONCE(rcu_cblist_empty(rclp)); + WARN_ON_ONCE(!rclp->head); return rclp->tail; } diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 91fff49d5869..35152414760d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2647,9 +2647,9 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags) /* First adopt the ready-to-invoke callbacks, then the done ones. 
*/ rcu_segcblist_insert_done_cbs(>cblist, >orphan_done); - WARN_ON_ONCE(!rcu_cblist_empty(>orphan_done)); + WARN_ON_ONCE(rsp->orphan_done.head); rcu_segcblist_insert_pend_cbs(>cblist, >orphan_pend); - WARN_ON_ONCE(!rcu_cblist_empty(>orphan_pend)); + WARN_ON_ONCE(rsp->orphan_pend.head); WARN_ON_ONCE(rcu_segcblist_empty(>cblist) != !rcu_segcblist_n_cbs(>cblist)); } @@ -2800,9 +2800,8 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) local_irq_save(flags); count = -rcu_cblist_n_cbs(); - trace_rcu_batch_end(rsp->name, count, !rcu_cblist_empty(), - need_resched(), is_idle_task(current), - rcu_is_callbacks_kthread()); + trace_rcu_batch_end(rsp->name, count, !!rcl.head, need_resched(), + is_idle_task(current), rcu_is_callbacks_kthread()); /* Update counts and requeue any remaining callbacks. */ rcu_segcblist_insert_done_cbs(>cblist, ); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7f1d677a2a25..6b4b1f8a272d 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1942,12 +1942,12 @@ static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp, return false; /* First, enqueue the donelist, if any. This preserves CB ordering. */ - if (!rcu_cblist_empty(>orphan_done)) { + if (rsp->orphan_done.head) { __call_rcu_nocb_enqueue(rdp, rcu_cblist_head(>orphan_done), rcu_cblist_tail(>orphan_done), ql, qll, flags); } - if (!rcu_cblist_empty(>orphan_pend)) { + if (rsp->orphan_pend.head) { __call_rcu_nocb_enqueue(rdp, rcu_cblist_head(>orphan_pend), rcu_cblist_tail(>orphan_pend), ql, qll, flags); -- 2.5.2
[PATCH tip/core/rcu 3/3] rcu: Open-code the rcu_cblist_n_lazy_cbs() function
Because the rcu_cblist_n_lazy_cbs() just samples the ->len_lazy counter, and because the rcu_cblist structure is quite straightforward, it makes sense to open-code rcu_cblist_n_lazy_cbs(p) as p->len_lazy, cutting out a level of indirection. This commit makes this change. Reported-by: Ingo Molnar Signed-off-by: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Linus Torvalds --- kernel/rcu/rcu_segcblist.h | 6 -- kernel/rcu/tree.c | 2 +- kernel/rcu/tree_plugin.h | 2 +- kernel/rcu/tree_trace.c| 2 +- 4 files changed, 3 insertions(+), 9 deletions(-) diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h index 424a6b230921..6e36e36478cd 100644 --- a/kernel/rcu/rcu_segcblist.h +++ b/kernel/rcu/rcu_segcblist.h @@ -22,12 +22,6 @@ #include -/* Return number of lazy callbacks in simple callback list. */ -static inline long rcu_cblist_n_lazy_cbs(struct rcu_cblist *rclp) -{ - return rclp->len_lazy; -} - /* * Account for the fact that a previously dequeued callback turned out * to be marked as lazy. diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 942d529fccbc..1205c8ad138a 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2634,7 +2634,7 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags) /* Do the accounting first. 
*/ rdp->n_cbs_adopted += rsp->orphan_done.len; - if (rcu_cblist_n_lazy_cbs(>orphan_done) != rsp->orphan_done.len) + if (rsp->orphan_done.len_lazy != rsp->orphan_done.len) rcu_idle_count_callbacks_posted(); rcu_segcblist_insert_count(>cblist, >orphan_done); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 7ebe357df155..c9a48657512a 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1935,7 +1935,7 @@ static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags) { long ql = rsp->orphan_done.len; - long qll = rcu_cblist_n_lazy_cbs(>orphan_done); + long qll = rsp->orphan_done.len_lazy; /* If this is not a no-CBs CPU, tell the caller to do it the old way. */ if (!rcu_is_nocb_cpu(smp_processor_id())) diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c index b7743aa2965f..6cea17a1ea30 100644 --- a/kernel/rcu/tree_trace.c +++ b/kernel/rcu/tree_trace.c @@ -277,7 +277,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp) rsp->n_force_qs, rsp->n_force_qs_ngp, rsp->n_force_qs - rsp->n_force_qs_ngp, READ_ONCE(rsp->n_force_qs_lh), - rcu_cblist_n_lazy_cbs(>orphan_done), + rsp->orphan_done.len_lazy, rsp->orphan_done.len); for (rnp = >node[0]; rnp - >node[0] < rcu_num_nodes; rnp++) { if (rnp->level != level) { -- 2.5.2
Re: [PATCH] x86: kvm: Avoid guest page table walk when gpa_available is set
On 28/04/2017 21:15, Brijesh Singh wrote: > Hi Radim, > >> >> This will probably return false negatives when then vcpu->arch.gpa_val >> couldn't be used anyway (possibly after a VM exits), so it is hard to >> draw a conclusion. >> >> I would really like if we had a solution that passed the gpa inside >> function parameters. >> >> (Btw. cr2 can also be a virtual address, so we can call it gpa directly) >> > > I've tried the below patch and it seems to work fine. This does not > consider > PIO case and as you rightly pointed PIO should trigger #NPF relatively > rarely. > At least so far in my runs I have not seen PIO causing #NPF. If this sounds > acceptable approach then I can submit v2 with these changes and remove > gpa_val > additional. PIO can certainly cause #NPF, albeit rarely, so this is not a viable one. Paolo > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 13c132b..c040e38 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4662,17 +4662,17 @@ static int emulator_read_write_onepage(unsigned > long addr, void *val, > */ > if (vcpu->arch.gpa_available && > emulator_can_use_gpa(ctxt) && > - vcpu_is_mmio_gpa(vcpu, addr, exception->address, write) && > (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) { > gpa = exception->address; > - goto mmio; > + ret = vcpu_is_mmio_gpa(vcpu, addr, gpa, write); > + } else { > + dump_stack(); > + ret = vcpu_mmio_gva_to_gpa(vcpu, addr, , exception, > write); > + > + if (ret < 0) > + return X86EMUL_PROPAGATE_FAULT; > } > > - ret = vcpu_mmio_gva_to_gpa(vcpu, addr, , exception, write); > - > - if (ret < 0) > - return X86EMUL_PROPAGATE_FAULT; > - > /* For APIC access vmexit */ > if (ret) > goto mmio; > @@ -7056,11 +7056,11 @@ static int vcpu_run(struct kvm_vcpu *vcpu) > return r; > } > > -static inline int complete_emulated_io(struct kvm_vcpu *vcpu) > +static inline int complete_emulated_io(struct kvm_vcpu *vcpu, unsigned > long cr2) > { > int r; > vcpu->srcu_idx = srcu_read_lock(>kvm->srcu); > 
- r = emulate_instruction(vcpu, EMULTYPE_NO_DECODE); > + r = x86_emulate_instruction(vcpu, cr2, EMULTYPE_NO_DECODE, NULL, > 0); > srcu_read_unlock(>kvm->srcu, vcpu->srcu_idx); > if (r != EMULATE_DONE) > return 0; > @@ -7071,7 +7071,7 @@ static int complete_emulated_pio(struct kvm_vcpu > *vcpu) > { > BUG_ON(!vcpu->arch.pio.count); > > - return complete_emulated_io(vcpu); > + return complete_emulated_io(vcpu, 0); > } > /* > @@ -7097,6 +7097,7 @@ static int complete_emulated_mmio(struct kvm_vcpu > *vcpu) > struct kvm_run *run = vcpu->run; > struct kvm_mmio_fragment *frag; > unsigned len; > + gpa_t gpa; > > BUG_ON(!vcpu->mmio_needed); > > @@ -7106,6 +7107,12 @@ static int complete_emulated_mmio(struct kvm_vcpu > *vcpu) > if (!vcpu->mmio_is_write) > memcpy(frag->data, run->mmio.data, len); > > + /* > + * lets use the GPA from previous guest page table walk to avoid yet > + * another guest page table walk when completing the MMIO > page-fault. > +*/ > + gpa = frag->gpa; > + > if (frag->len <= 8) { > /* Switch to the next fragment. */ > frag++; > @@ -7124,7 +7131,7 @@ static int complete_emulated_mmio(struct kvm_vcpu > *vcpu) > if (vcpu->mmio_is_write) > return 1; > vcpu->mmio_read_completed = 1; > - return complete_emulated_io(vcpu); > + return complete_emulated_io(vcpu, gpa); > } > > run->exit_reason = KVM_EXIT_MMIO; >
[PATCH tip/core/rcu 2/3] rcu: Open-code the rcu_cblist_n_cbs() function
Because the rcu_cblist_n_cbs() just samples the ->len counter, and because the rcu_cblist structure is quite straightforward, it makes sense to open-code rcu_cblist_n_cbs(p) as p->len, cutting out a level of indirection. This commit makes this change. Reported-by: Ingo Molnar Signed-off-by: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Linus Torvalds --- kernel/rcu/rcu_segcblist.h | 6 -- kernel/rcu/tree.c | 9 - kernel/rcu/tree_plugin.h | 2 +- kernel/rcu/tree_trace.c| 2 +- 4 files changed, 6 insertions(+), 13 deletions(-) diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h index 7d18d41f0116..424a6b230921 100644 --- a/kernel/rcu/rcu_segcblist.h +++ b/kernel/rcu/rcu_segcblist.h @@ -22,12 +22,6 @@ #include -/* Return number of callbacks in simple callback list. */ -static inline long rcu_cblist_n_cbs(struct rcu_cblist *rclp) -{ - return rclp->len; -} - /* Return number of lazy callbacks in simple callback list. */ static inline long rcu_cblist_n_lazy_cbs(struct rcu_cblist *rclp) { diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 35152414760d..942d529fccbc 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -2633,9 +2633,8 @@ static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags) return; /* Do the accounting first. */ - rdp->n_cbs_adopted += rcu_cblist_n_cbs(>orphan_done); - if (rcu_cblist_n_lazy_cbs(>orphan_done) != - rcu_cblist_n_cbs(>orphan_done)) + rdp->n_cbs_adopted += rsp->orphan_done.len; + if (rcu_cblist_n_lazy_cbs(>orphan_done) != rsp->orphan_done.len) rcu_idle_count_callbacks_posted(); rcu_segcblist_insert_count(>cblist, >orphan_done); @@ -2792,14 +2791,14 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) * Stop only if limit reached and CPU has something to do. * Note: The rcl structure counts down from zero. 
*/ - if (-rcu_cblist_n_cbs() >= bl && + if (-rcl.len >= bl && (need_resched() || (!is_idle_task(current) && !rcu_is_callbacks_kthread( break; } local_irq_save(flags); - count = -rcu_cblist_n_cbs(); + count = -rcl.len; trace_rcu_batch_end(rsp->name, count, !!rcl.head, need_resched(), is_idle_task(current), rcu_is_callbacks_kthread()); diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 6b4b1f8a272d..7ebe357df155 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1934,7 +1934,7 @@ static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp, struct rcu_data *rdp, unsigned long flags) { - long ql = rcu_cblist_n_cbs(>orphan_done); + long ql = rsp->orphan_done.len; long qll = rcu_cblist_n_lazy_cbs(>orphan_done); /* If this is not a no-CBs CPU, tell the caller to do it the old way. */ diff --git a/kernel/rcu/tree_trace.c b/kernel/rcu/tree_trace.c index 30c5bf89ee58..b7743aa2965f 100644 --- a/kernel/rcu/tree_trace.c +++ b/kernel/rcu/tree_trace.c @@ -278,7 +278,7 @@ static void print_one_rcu_state(struct seq_file *m, struct rcu_state *rsp) rsp->n_force_qs - rsp->n_force_qs_ngp, READ_ONCE(rsp->n_force_qs_lh), rcu_cblist_n_lazy_cbs(>orphan_done), - rcu_cblist_n_cbs(>orphan_done)); + rsp->orphan_done.len); for (rnp = >node[0]; rnp - >node[0] < rcu_num_nodes; rnp++) { if (rnp->level != level) { seq_puts(m, "\n"); -- 2.5.2
Re: [BUG] cannot mount nfs, bisected to 0db10944a76ba ("nfs: Convert to separately allocated bdi")
Hello, On Fri 28-04-17 11:56:24, Corentin Labbe wrote: > Since linux next-20170421, mounting nfs give me: > [ 774.994934] [ cut here ] > [ 774.994975] WARNING: CPU: 1 PID: 10284 at /linux-next/fs/sysfs/dir.c:31 > sysfs_warn_dup+0x64/0x74 > [ 774.994985] sysfs: cannot create duplicate filename > '/devices/virtual/bdi/0:32' > [ 774.994992] Modules linked in: axp20x_usb_power gpio_axp209 > nvmem_sunxi_sid sun4i_dma sun4i_ss virt_dma > [ 774.995047] CPU: 1 PID: 10284 Comm: mount.nfs Not tainted 4.11.0-rc4+ #14 > [ 774.995054] Hardware name: Allwinner sun7i (A20) Family > [ 774.995085] [] (unwind_backtrace) from [] > (show_stack+0x10/0x14) > [ 774.995104] [] (show_stack) from [] > (dump_stack+0x78/0x8c) > [ 774.995121] [] (dump_stack) from [] (__warn+0xe8/0x100) > [ 774.995135] [] (__warn) from [] > (warn_slowpath_fmt+0x38/0x48) > [ 774.995150] [] (warn_slowpath_fmt) from [] > (sysfs_warn_dup+0x64/0x74) > [ 774.995167] [] (sysfs_warn_dup) from [] > (sysfs_create_dir_ns+0x84/0x94) > [ 774.995184] [] (sysfs_create_dir_ns) from [] > (kobject_add_internal+0x9c/0x2ec) > [ 774.995199] [] (kobject_add_internal) from [] > (kobject_add+0x48/0x98) > [ 774.995217] [] (kobject_add) from [] > (device_add+0xe4/0x5a0) > [ 774.995232] [] (device_add) from [] > (device_create_groups_vargs+0xac/0xbc) > [ 774.995247] [] (device_create_groups_vargs) from [] > (device_create_vargs+0x20/0x28) > [ 774.995263] [] (device_create_vargs) from [] > (bdi_register_va+0x44/0xfc) > [ 774.995280] [] (bdi_register_va) from [] > (super_setup_bdi_name+0x48/0xa4) > [ 774.995299] [] (super_setup_bdi_name) from [] > (nfs_fill_super+0x1a4/0x204) > [ 774.995315] [] (nfs_fill_super) from [] > (nfs_fs_mount_common+0x140/0x1e8) > [ 774.995333] [] (nfs_fs_mount_common) from [] > (nfs4_remote_mount+0x50/0x58) > [ 774.995349] [] (nfs4_remote_mount) from [] > (mount_fs+0x14/0xa4) > [ 774.995368] [] (mount_fs) from [] > (vfs_kern_mount+0x54/0x128) > [ 774.995385] [] (vfs_kern_mount) from [] > 
(nfs_do_root_mount+0x80/0xa0) > [ 774.995400] [] (nfs_do_root_mount) from [] > (nfs4_try_mount+0x28/0x3c) > [ 774.995415] [] (nfs4_try_mount) from [] > (nfs_fs_mount+0x2cc/0x8c4) > [ 774.995430] [] (nfs_fs_mount) from [] > (mount_fs+0x14/0xa4) > [ 774.995445] [] (mount_fs) from [] > (vfs_kern_mount+0x54/0x128) > [ 774.995461] [] (vfs_kern_mount) from [] > (do_mount+0x158/0xc7c) > [ 774.995475] [] (do_mount) from [] (SyS_mount+0x8c/0xb4) > [ 774.995491] [] (SyS_mount) from [] > (ret_fast_syscall+0x0/0x3c) > [ 774.995501] ---[ end trace 0665e451f8864ff0 ]--- ... > The mount command is > mount -t nfs -o tcp,hard,intr,async,rsize=4096,wsize=4096 > 192.168.1.100:/mnt/local_kernel /usr/src/ > > the mount command failling with: "mount.nfs: Cannot allocate memory" I've tried reproducing this (both with NFSv3 and NFSv4) and failed. Also I have looked through the code and I fail to see how this could happen. Is this the only NFS mount that you have on your system? Didn't also the WARN_ON in super_setup_bdi_name() trigger? Can you run with the attached debug patch and post full dmesg after the failure? Honza -- Jan Kara SUSE Labs, CR >From 90f66349ce91d8dd4811eca5efffb605cf084193 Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Tue, 2 May 2017 18:24:57 +0200 Subject: [PATCH] nfs: Debug bdi registration failure Debug failure and fixup error code propagation. 
Signed-off-by: Jan Kara --- fs/nfs/super.c | 5 - mm/backing-dev.c | 2 ++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/nfs/super.c b/fs/nfs/super.c index dc69314d455e..f60a387105ad 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -2344,6 +2344,8 @@ int nfs_fill_super(struct super_block *sb, struct nfs_mount_info *mount_info) nfs_initialise_sb(sb); + printk("Registering bdi %u:%u for server %p\n", MAJOR(server->s_dev), MINOR(server->s_dev), server); + dump_stack(); ret = super_setup_bdi_name(sb, "%u:%u", MAJOR(server->s_dev), MINOR(server->s_dev)); if (ret) @@ -2607,7 +2609,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server *server, /* initial superblock/root creation */ error = mount_info->fill_super(s, mount_info); if (error) - goto error_splat_super; + goto error_splat_super_err; nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned); } @@ -2630,6 +2632,7 @@ struct dentry *nfs_fs_mount_common(struct nfs_server *server, error_splat_root: dput(mntroot); +error_splat_super_err: mntroot = ERR_PTR(error); error_splat_super: deactivate_locked_super(s); diff --git a/mm/backing-dev.c b/mm/backing-dev.c index f028a9a472fd..ac173d06834a 100644 --- a/mm/backing-dev.c +++ b/mm/backing-dev.c @@ -923,6 +923,8 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi) void bdi_unregister(struct backing_dev_info *bdi) { + printk("Unregistering bdi %s\n", dev_name(bdi->dev)); +
[PATCH 2/2] ARM: dts: omap4-droid4: Add vibrator
Add vibrator to Droid4's device tree. Signed-off-by: Sebastian Reichel --- arch/arm/boot/dts/omap4-droid4-xt894.dts | 38 1 file changed, 38 insertions(+) diff --git a/arch/arm/boot/dts/omap4-droid4-xt894.dts b/arch/arm/boot/dts/omap4-droid4-xt894.dts index 89eb607f4a9e..67d286a53368 100644 --- a/arch/arm/boot/dts/omap4-droid4-xt894.dts +++ b/arch/arm/boot/dts/omap4-droid4-xt894.dts @@ -116,6 +116,32 @@ }; }; + + pwm8: dmtimer-pwm-8 { + pinctrl-names = "default"; + pinctrl-0 = <_direction_pin>; + + compatible = "ti,omap-dmtimer-pwm"; + #pwm-cells = <3>; + ti,timers = <>; + ti,clock-source = <0x01>; + }; + + pwm9: dmtimer-pwm-9 { + pinctrl-names = "default"; + pinctrl-0 = <_enable_pin>; + + compatible = "ti,omap-dmtimer-pwm"; + #pwm-cells = <3>; + ti,timers = <>; + ti,clock-source = <0x01>; + }; + + vibrator { + compatible = "motorola,mapphone-pwm-vibrator"; + pwms = < 0 10 0>, < 0 10 0>; + pwm-names = "enable", "direction"; + }; }; { @@ -453,6 +479,18 @@ OMAP4_IOPAD(0x1c8, PIN_INPUT_PULLUP | MUX_MODE7) >; }; + + vibrator_direction_pin: pinmux_vibrator_direction_pin { + pinctrl-single,pins = < + OMAP4_IOPAD(0x1ce, PIN_OUTPUT | MUX_MODE1) /* dmtimer8_pwm_evt (gpio_27) */ + >; + }; + + vibrator_enable_pin: pinmux_vibrator_enable_pin { + pinctrl-single,pins = < + OMAP4_IOPAD(0X1d0, PIN_OUTPUT | MUX_MODE1) /* dmtimer9_pwm_evt (gpio_28) */ + >; + }; }; _pmx_wkup { -- 2.11.0
[PATCH 0/2] PWM Vibrator driver
Hi, The Motorola Droid 4 has a vibrator that is connected to two GPIOs. Motorola's stock kernel names them vib_en and vib_dir, which probably stand for vibrator_enable and vibrator_direction. In their stock kernel both GPIOs are toggled using a hrtimer, and a custom vibrator "misc" device is provided to userspace. Thankfully, the GPIOs the hardware designers used can also be driven from OMAP's dmtimers, so they can be driven as PWM outputs instead, saving some CPU cycles (and code). The driver is loosely based on an old patch from Dmitry that I found on the internet(tm) [0]. Note that I did not check the generic vibrator stuff; I just kept it in the driver, since it's probably what other people expect from a pwm-vibra driver :) [0] https://lkml.org/lkml/2012/4/10/41 -- Sebastian Sebastian Reichel (2): Input: pwm-vibra: new driver ARM: dts: omap4-droid4: Add vibrator .../devicetree/bindings/input/pwm-vibrator.txt | 60 arch/arm/boot/dts/omap4-droid4-xt894.dts | 38 +++ drivers/input/misc/Kconfig | 11 + drivers/input/misc/Makefile| 1 + drivers/input/misc/pwm-vibra.c | 348 + 5 files changed, 458 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/pwm-vibrator.txt create mode 100644 drivers/input/misc/pwm-vibra.c -- 2.11.0
[PATCH 1/2] Input: pwm-vibra: new driver
Provide a simple driver for PWM controllable vibrators. It will be used by Motorola Droid 4. Signed-off-by: Sebastian Reichel --- .../devicetree/bindings/input/pwm-vibrator.txt | 60 drivers/input/misc/Kconfig | 11 + drivers/input/misc/Makefile| 1 + drivers/input/misc/pwm-vibra.c | 348 + 4 files changed, 420 insertions(+) create mode 100644 Documentation/devicetree/bindings/input/pwm-vibrator.txt create mode 100644 drivers/input/misc/pwm-vibra.c diff --git a/Documentation/devicetree/bindings/input/pwm-vibrator.txt b/Documentation/devicetree/bindings/input/pwm-vibrator.txt new file mode 100644 index ..c35be4691366 --- /dev/null +++ b/Documentation/devicetree/bindings/input/pwm-vibrator.txt @@ -0,0 +1,60 @@ +* PWM vibrator device tree bindings + +Registers a PWM device as vibrator. + +Required properties: +- compatible: should be + * "pwm-vibrator" +For vibrators controlled using the PWM channel's duty cycle (higher duty means +the vibrator becomes stronger). + * "motorola,mapphone-pwm-vibrator" + For vibrator found in Motorola Droid 4. This vibrator generates a pulse for + every rising edge, so its controlled using a duty cycle of 50% and changing + the period time. 
+- pwm-names: Should contain "enable" and optionally "direction" +- pwms: Should contain a PWM handle for each entry in pwm-names + +Example from Motorola Droid 4: + +_pmx_core { + vibrator_direction_pin: pinmux_vibrator_direction_pin { + pinctrl-single,pins = < + OMAP4_IOPAD(0x1ce, PIN_OUTPUT | MUX_MODE1) /* dmtimer8_pwm_evt (gpio_27) */ + >; + }; + + vibrator_enable_pin: pinmux_vibrator_enable_pin { + pinctrl-single,pins = < + OMAP4_IOPAD(0X1d0, PIN_OUTPUT | MUX_MODE1) /* dmtimer9_pwm_evt (gpio_28) */ + >; + }; +}; + +/ { + pwm8: dmtimer-pwm { + pinctrl-names = "default"; + pinctrl-0 = <_direction_pin>; + + compatible = "ti,omap-dmtimer-pwm"; + #pwm-cells = <3>; + ti,timers = <>; + ti,clock-source = <0x01>; + }; + + pwm9: dmtimer-pwm { + pinctrl-names = "default"; + pinctrl-0 = <_enable_pin>; + + compatible = "ti,omap-dmtimer-pwm"; + #pwm-cells = <3>; + ti,timers = <>; + ti,clock-source = <0x01>; + }; + + vibrator { + compatible = "pwm-vibrator"; + pwms = < 0 10 0>, + < 0 10 0>; + pwm-names = "enable", "direction"; + }; +}; diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig index 5b6c52210d20..d8e0b25eb217 100644 --- a/drivers/input/misc/Kconfig +++ b/drivers/input/misc/Kconfig @@ -571,6 +571,17 @@ config INPUT_PWM_BEEPER To compile this driver as a module, choose M here: the module will be called pwm-beeper. +config INPUT_PWM_VIBRA + tristate "PWM vibrator support" + depends on PWM + help + Say Y here to get support for PWM based vibrator devices. + + If unsure, say N. + + To compile this driver as a module, choose M here: the module will be + called pwm-vibra. 
+ config INPUT_GPIO_ROTARY_ENCODER tristate "Rotary encoders connected to GPIO pins" depends on GPIOLIB || COMPILE_TEST diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile index b10523f2878e..9a6517f5458c 100644 --- a/drivers/input/misc/Makefile +++ b/drivers/input/misc/Makefile @@ -58,6 +58,7 @@ obj-$(CONFIG_INPUT_PM8XXX_VIBRATOR) += pm8xxx-vibrator.o obj-$(CONFIG_INPUT_PMIC8XXX_PWRKEY)+= pmic8xxx-pwrkey.o obj-$(CONFIG_INPUT_POWERMATE) += powermate.o obj-$(CONFIG_INPUT_PWM_BEEPER) += pwm-beeper.o +obj-$(CONFIG_INPUT_PWM_VIBRA) += pwm-vibra.o obj-$(CONFIG_INPUT_RB532_BUTTON) += rb532_button.o obj-$(CONFIG_INPUT_REGULATOR_HAPTIC) += regulator-haptic.o obj-$(CONFIG_INPUT_RETU_PWRBUTTON) += retu-pwrbutton.o diff --git a/drivers/input/misc/pwm-vibra.c b/drivers/input/misc/pwm-vibra.c new file mode 100644 index ..a7d36d88679a --- /dev/null +++ b/drivers/input/misc/pwm-vibra.c @@ -0,0 +1,348 @@ +/* + * PWM vibrator driver + * + * Copyright (C) 2017 Collabora Ltd. + * + * Based on previous work from: + * Copyright (C) 2012 Dmitry Torokhov + * + * Based on PWM beeper driver: + * Copyright (C) 2010, Lars-Peter Clausen + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + */ + +#define DEBUG + +#include +#include +#include +#include +#include +#include
[PATCH] btrfs: always write superblocks synchronously
Commit b685d3d65ac7 ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous") removed the REQ_SYNC flag from the WRITE_FUA implementation. Because the REQ_FUA and REQ_FLUSH flags are stripped from submitted IO when the disk doesn't have a volatile write cache, this effectively makes the write async. It was seen to cause performance regressions of up to 90% in disk-IO-related benchmarks such as reaim and dbench [1]. Fix the problem by making sure the first superblock write is also treated as synchronous, since superblock writes can block progress of the journalling (commit, log syncs) machinery and thus the whole filesystem. [1] https://www.spinics.net/lists/linux-ext4/msg56238.html Fixes: b685d3d65ac (block: treat REQ_FUA and REQ_PREFLUSH as synchronous) Cc: stable Cc: Jan Kara Signed-off-by: Davidlohr Bueso --- fs/btrfs/disk-io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 061c1d1f774f..51b2fd8ceccb 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3468,7 +3468,7 @@ static int write_dev_supers(struct btrfs_device *device, * to go down lazy. */ if (i == 0) - ret = btrfsic_submit_bh(REQ_OP_WRITE, REQ_FUA, bh); + ret = btrfsic_submit_bh(REQ_OP_WRITE, REQ_FUA | REQ_SYNC, bh); else ret = btrfsic_submit_bh(REQ_OP_WRITE, REQ_SYNC, bh); if (ret) -- 2.12.0
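The interaction between the stripped cache flags and the "synchronous" classification can be sketched as follows. The flag values and helper names here are illustrative stand-ins; the kernel's real REQ_* bits live in include/linux/blk_types.h, and the stripping happens in the block layer for devices without a volatile write cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits (arbitrary values, not the kernel's encoding). */
enum {
	REQ_SYNC     = 1 << 0,
	REQ_FUA      = 1 << 1,
	REQ_PREFLUSH = 1 << 2,
};

/* After commit b685d3d65ac7, FUA and PREFLUSH writes are treated as
 * synchronous, so callers no longer need to pass REQ_SYNC explicitly --
 * as long as those flags survive to the scheduler. */
static bool op_is_sync(unsigned flags)
{
	return flags & (REQ_SYNC | REQ_FUA | REQ_PREFLUSH);
}

/* Devices without a volatile write cache strip FUA and PREFLUSH, since
 * the data is durable without them. If REQ_SYNC wasn't set too, the
 * write silently becomes asynchronous -- the bug this patch fixes. */
static unsigned strip_cache_flags(unsigned flags)
{
	return flags & ~(REQ_FUA | REQ_PREFLUSH);
}
```

This is why the fix passes REQ_FUA | REQ_SYNC: the write stays synchronous whether or not the device strips the cache flags.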
Re: [PATCH 1/1] arm64: Always provide "model name" in /proc/cpuinfo
On 05/02/2017 01:08 PM, Catalin Marinas wrote: > On Tue, May 02, 2017 at 12:39:13AM +0200, Heinrich Schuchardt wrote: >> There is no need to hide the model name in processes >> that are not PER_LINUX32. >> >> So let us always provide a model name that is easily readable. >> >> Fixes: e47b020a323d ("arm64: Provide "model name" in /proc/cpuinfo for >> PER_LINUX32 tasks") >> Signed-off-by: Heinrich Schuchardt >> --- >> arch/arm64/kernel/cpuinfo.c | 5 ++--- >> 1 file changed, 2 insertions(+), 3 deletions(-) >> >> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c >> index b3d5b3e8fbcb..9ad9ddcd2f19 100644 >> --- a/arch/arm64/kernel/cpuinfo.c >> +++ b/arch/arm64/kernel/cpuinfo.c >> @@ -118,9 +118,8 @@ static int c_show(struct seq_file *m, void *v) >> * "processor". Give glibc what it expects. >> */ >> seq_printf(m, "processor\t: %d\n", i); >> -if (compat) >> -seq_printf(m, "model name\t: ARMv8 Processor rev %d >> (%s)\n", >> - MIDR_REVISION(midr), COMPAT_ELF_PLATFORM); >> +seq_printf(m, "model name\t: ARMv8 Processor rev %d (%s)\n", >> + MIDR_REVISION(midr), COMPAT_ELF_PLATFORM); >> >> seq_printf(m, "BogoMIPS\t: %lu.%02lu\n", >> loops_per_jiffy / (50UL/HZ), > > Such patch seems to come up regularly: > > https://patchwork.kernel.org/patch/9303311/ > > (and it usually gets rejected) > Dear Catalin, thank you for pointing me to the previous discussion. I understand that adding model name in the current form would not provide sufficient valuable information. The real interesting thing in an ARM SOC is to see which CPU is A72, A57, A53 or whatever. This information is available from the device tree in the compatible property of the individual CPUs (/sys/firmware/devicetree/base/cpus/cpu@*/compatible), e.g. compatible = "arm,cortex-a53", "arm,armv8"; I guess this information is the closest to the model name property on other architectures that we can get. Would exposing this information in /proc/cpuinfo as 'model name' make sense to you?
Best regards Heinrich
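Heinrich's suggestion relies on the devicetree "compatible" property, which is a list of NUL-terminated strings with the most specific match first. A minimal userspace sketch of walking such a property follows; the function name and buffer handling are illustrative, not kernel API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * A devicetree "compatible" property is a sequence of NUL-terminated
 * strings, most specific first, e.g. "arm,cortex-a53\0arm,armv8".
 * Return the n-th entry, or NULL if the property is malformed
 * (an entry without a terminator) or has fewer than n+1 entries.
 */
static const char *compatible_entry(const char *buf, size_t len,
                                    unsigned int n)
{
    const char *p = buf, *end = buf + len;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));

        if (!nul)
            return NULL;        /* malformed: unterminated entry */
        if (n-- == 0)
            return p;           /* found the requested entry */
        p = nul + 1;            /* step past the terminator */
    }
    return NULL;                /* fewer than n+1 entries */
}
```

The first entry ("arm,cortex-a53" above) is the part that would land in a per-CPU 'model name' field.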
[PATCH v3] arm64: perf: Ignore exclude_hv when kernel is running in HYP
commit d98ecdaca296 ("arm64: perf: Count EL2 events if the kernel is running in HYP") returns -EINVAL when perf system call perf_event_open is called with exclude_hv != exclude_kernel. This change breaks applications on VHE enabled ARMv8.1 platforms. The issue was observed with HHVM application, which calls perf_event_open with exclude_hv = 1 and exclude_kernel = 0. There is no separate hypervisor privilege level when VHE is enabled, the host kernel runs at EL2. So when VHE is enabled, we should ignore exclude_hv from the application. This behaviour is consistent with PowerPC where the exclude_hv is ignored when the hypervisor is not present and with x86 where this flag is ignored. Signed-off-by: Ganapatrao Kulkarni --- Changelog: V2/V3: - Changes as per Will Deacon's suggestions. V1: Initial patch arch/arm64/kernel/perf_event.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 57ae9d9..f6748c0 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -871,15 +871,17 @@ static int armv8pmu_set_event_filter(struct hw_perf_event *event, if (attr->exclude_idle) return -EPERM; - if (is_kernel_in_hyp_mode() && - attr->exclude_kernel != attr->exclude_hv) - return -EINVAL; + if (is_kernel_in_hyp_mode()) { + if (!attr->exclude_kernel) + config_base |= ARMV8_PMU_INCLUDE_EL2; + } else { + if (attr->exclude_kernel) + config_base |= ARMV8_PMU_EXCLUDE_EL1; + if (!attr->exclude_hv) + config_base |= ARMV8_PMU_INCLUDE_EL2; + } if (attr->exclude_user) config_base |= ARMV8_PMU_EXCLUDE_EL0; - if (!is_kernel_in_hyp_mode() && attr->exclude_kernel) - config_base |= ARMV8_PMU_EXCLUDE_EL1; - if (!attr->exclude_hv) - config_base |= ARMV8_PMU_INCLUDE_EL2; /* * Install the filter into config_base as this is used to -- 1.8.1.4
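The resulting filter logic is easy to check in isolation. Below is a hedged userspace mirror of the patched armv8pmu_set_event_filter(); the bit positions follow the ARMV8_PMU_* definitions in arch/arm64/kernel/perf_event.c at the time of this patch, and all other names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Event-type register bits, as defined in arch/arm64/kernel/perf_event.c. */
#define ARMV8_PMU_EXCLUDE_EL1 (1U << 31)
#define ARMV8_PMU_EXCLUDE_EL0 (1U << 30)
#define ARMV8_PMU_INCLUDE_EL2 (1U << 27)

/*
 * Mirror of the patched filter logic: with VHE the host kernel runs at
 * EL2, so exclude_kernel alone decides EL2 counting and exclude_hv is
 * ignored; without VHE the two flags act independently.
 */
static uint32_t event_filter(bool vhe, bool excl_kernel, bool excl_hv,
                             bool excl_user)
{
    uint32_t config_base = 0;

    if (vhe) {
        if (!excl_kernel)
            config_base |= ARMV8_PMU_INCLUDE_EL2;
    } else {
        if (excl_kernel)
            config_base |= ARMV8_PMU_EXCLUDE_EL1;
        if (!excl_hv)
            config_base |= ARMV8_PMU_INCLUDE_EL2;
    }
    if (excl_user)
        config_base |= ARMV8_PMU_EXCLUDE_EL0;

    return config_base;
}
```

In particular, the HHVM case from the changelog (exclude_hv = 1, exclude_kernel = 0 on a VHE host) now simply counts EL2 instead of getting -EINVAL.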
[PATCH] ata-sff: always map page before data transfer
The XPFO [1] patchset may unmap pages from physmap if they happened to be destined for userspace. If such a page is unmapped, it needs to be remapped. Rather than test if a page is in the highmem/xpfo unmapped state, Christoph suggested [2] that we simply always map the page. Suggested-by: Christoph Hellwig Signed-off-by: Tycho Andersen CC: Juerg Haefliger CC: Tejun Heo [1]: https://lkml.org/lkml/2016/11/4/245 [2]: https://lkml.org/lkml/2016/11/4/253 --- I don't understand all the factors at play here, so thoughts are definitely welcome. --- drivers/ata/libata-sff.c | 50 +--- 1 file changed, 18 insertions(+), 32 deletions(-) diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c index 2bd92dc..8da2572 100644 --- a/drivers/ata/libata-sff.c +++ b/drivers/ata/libata-sff.c @@ -703,6 +703,7 @@ static void ata_pio_sector(struct ata_queued_cmd *qc) struct page *page; unsigned int offset; unsigned char *buf; + unsigned long flags; if (qc->curbytes == qc->nbytes - qc->sect_size) ap->hsm_task_state = HSM_ST_LAST; @@ -716,24 +717,16 @@ static void ata_pio_sector(struct ata_queued_cmd *qc) DPRINTK("data %s\n", qc->tf.flags & ATA_TFLAG_WRITE ? 
"write" : "read"); - if (PageHighMem(page)) { - unsigned long flags; - - /* FIXME: use a bounce buffer */ - local_irq_save(flags); - buf = kmap_atomic(page); + /* FIXME: use a bounce buffer */ + local_irq_save(flags); + buf = kmap_atomic(page); - /* do the actual data transfer */ - ap->ops->sff_data_xfer(qc, buf + offset, qc->sect_size, - do_write); + /* do the actual data transfer */ + ap->ops->sff_data_xfer(qc, buf + offset, qc->sect_size, + do_write); - kunmap_atomic(buf); - local_irq_restore(flags); - } else { - buf = page_address(page); - ap->ops->sff_data_xfer(qc, buf + offset, qc->sect_size, - do_write); - } + kunmap_atomic(buf); + local_irq_restore(flags); if (!do_write && !PageSlab(page)) flush_dcache_page(page); @@ -836,6 +829,7 @@ static int __atapi_pio_bytes(struct ata_queued_cmd *qc, unsigned int bytes) struct page *page; unsigned char *buf; unsigned int offset, count, consumed; + unsigned long flags; next_sg: sg = qc->cursg; @@ -861,24 +855,16 @@ static int __atapi_pio_bytes(struct ata_queued_cmd *qc, unsigned int bytes) DPRINTK("data %s\n", qc->tf.flags & ATA_TFLAG_WRITE ? "write" : "read"); - if (PageHighMem(page)) { - unsigned long flags; - - /* FIXME: use bounce buffer */ - local_irq_save(flags); - buf = kmap_atomic(page); + /* FIXME: use bounce buffer */ + local_irq_save(flags); + buf = kmap_atomic(page); - /* do the actual data transfer */ - consumed = ap->ops->sff_data_xfer(qc, buf + offset, - count, rw); + /* do the actual data transfer */ + consumed = ap->ops->sff_data_xfer(qc, buf + offset, + count, rw); - kunmap_atomic(buf); - local_irq_restore(flags); - } else { - buf = page_address(page); - consumed = ap->ops->sff_data_xfer(qc, buf + offset, - count, rw); - } + kunmap_atomic(buf); + local_irq_restore(flags); bytes -= min(bytes, consumed); qc->curbytes += count; -- 2.9.3
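The shape of the change - collapsing the PageHighMem() branch into one unconditional map/transfer/unmap path - can be sketched in userspace with a stub in place of kmap_atomic(). fake_page, map_page() and pio_transfer() below are illustrative stand-ins, not libata code:

```c
#include <assert.h>
#include <string.h>

/*
 * kmap_atomic() on a lowmem page degenerates to page_address() plus
 * preemption bookkeeping, so the old two-branch code
 * ("if (PageHighMem(page)) map ... else page_address() ...") can
 * collapse into a single always-map sequence, which also covers pages
 * XPFO has unmapped from the physmap.  Real pages are replaced by
 * plain buffers here.
 */
struct fake_page { char data[64]; };

static char *map_page(struct fake_page *p)  { return p->data; }
static void  unmap_page(char *buf)          { (void)buf; }

static void pio_transfer(struct fake_page *page, unsigned int offset,
                         const char *src, unsigned int n)
{
    char *buf = map_page(page);     /* always map, as in the patch */

    memcpy(buf + offset, src, n);   /* the actual data transfer */
    unmap_page(buf);
}
```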
Re: [PATCH v2 2/4] iio: accel: adxl345_core: Introduce set_mode and data_ready functions
On 02/05/17 12:39, Eva Rachel Retuya wrote: > On Mon, May 01, 2017 at 01:22:52AM +0100, Jonathan Cameron wrote: > Hello Jonathan, > [...] >>> +static int adxl345_set_mode(struct adxl345_data *data, u8 mode) >>> +{ >>> + struct device *dev = regmap_get_device(data->regmap); >>> + int ret; >>> + >>> + ret = regmap_write(data->regmap, ADXL345_REG_POWER_CTL, mode); >>> + if (ret < 0) { >>> + dev_err(dev, "Failed to set power mode, %d\n", ret); >>> + return ret; >> drop the return ret here and just return ret at the end of the function. >> One of the static checkers will probably moan about this otherwise. > > OK. > >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +static int adxl345_data_ready(struct adxl345_data *data) >>> +{ >> So this is polling the dataready bit. Will ensure we always >> get fresh data when a read occurs. Please add a comment to >> that effect as that's not always how devices work. > > OK. > >>> + struct device *dev = regmap_get_device(data->regmap); >>> + int tries = 5; >>> + u32 val; >>> + int ret; >>> + >>> + do { >>> + /* >>> +* 1/ODR + 1.1ms; 11.1ms at ODR of 100 Hz >>> +* Sensor currently operates at default ODR of 100 Hz >>> +*/ >>> + usleep_range(1100, 11100); >> That's a huge range to allow... I'm not following the argument for why. >> Or do we have a stray 0? >> > > Not a stray 0. Range is from 1.1ms to 11.1ms, this represents the > wake-up time when going to standby/other power saving modes -> > measurement mode. I'm going to clarify the comment on why it is needed > on the next revision. The point about a range sleep is to allow the kernel flexibility in scheduling so as to avoid waking the processor from lower power states when high precision is not needed. If the thing you are talking about needs the maximum time sometimes then you should set the minimum value to that and add a bit to avoid unnecessary processor wake-ups.
> >>> + >>> + ret = regmap_read(data->regmap, ADXL345_REG_INT_SOURCE, &val); >>> + if (ret < 0) >>> + return ret; >>> + if ((val & ADXL345_INT_DATA_READY) == ADXL345_INT_DATA_READY) >>> + return 0; >>> + } while (--tries); >>> + dev_err(dev, "Data is not yet ready, try again.\n"); >>> + >> This is almost certainly a hardware fault. I'd be more brutal with >> the error and return -EIO. If you get here your hardware is very unlikely >> to be working correctly if you try again. > > OK, will change it to -EIO then. > >>> + return -EAGAIN; >>> +} >>> + >>> #define ADXL345_CHANNEL(reg, axis) { >>> \ >>> .type = IIO_ACCEL, \ >>> .modified = 1, \ >>> @@ -72,6 +118,19 @@ static int adxl345_read_raw(struct iio_dev *indio_dev, >>> >>> switch (mask) { >>> case IIO_CHAN_INFO_RAW: >>> + mutex_lock(&data->lock); >>> + ret = adxl345_set_mode(data, ADXL345_POWER_CTL_MEASURE); >>> + if (ret < 0) { >>> + mutex_unlock(&data->lock); >>> + return ret; >>> + } >>> + >>> + ret = adxl345_data_ready(data); >>> + if (ret < 0) { >>> + adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >>> + mutex_unlock(&data->lock); >> What is the logic that puts the mutex_unlock here in the error case >> and before the set_mode in the normal path? Even if it doesn't >> matter make them the same as it is less likely to raise questions >> in the future! > > OK, will make it consistent. > >>> + return ret; >>> + } >>> /* >>> * Data is stored in adjacent registers: >>> * ADXL345_REG_DATA(X0/Y0/Z0) contain the least significant byte >>> @@ -79,10 +138,15 @@ static int adxl345_read_raw(struct iio_dev *indio_dev, >>> */ >>> ret = regmap_bulk_read(data->regmap, chan->address, &regval, >>>sizeof(regval)); >>> - if (ret < 0) >>> + mutex_unlock(&data->lock); >>> + if (ret < 0) { >>> + adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >>> return ret; >>> + } >>> >>> *val = sign_extend32(le16_to_cpu(regval), 12); >>> + adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >>> + >>> return IIO_VAL_INT; >>> case IIO_CHAN_INFO_SCALE: >>> *val = 0; > [...] 
>>> @@ -169,8 +224,7 @@ int adxl345_core_remove(struct device *dev) >>> >>> iio_device_unregister(indio_dev); >>> >>> - return regmap_write(data->regmap, ADXL345_REG_POWER_CTL, >>> - ADXL345_POWER_CTL_STANDBY); >>> + return adxl345_set_mode(data, ADXL345_POWER_CTL_STANDBY); >> Under what circumstances would we not already be in the correct state? >> A
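The bounded poll the review converges on - propagate read errors, and return -EIO rather than -EAGAIN when the bit never sets - looks roughly like this as a userspace sketch. wait_data_ready(), the fake register and the callback stand in for the regmap calls, and the inter-try sleep is elided:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* DATA_READY is bit 7 of the ADXL345 INT_SOURCE register. */
#define INT_DATA_READY (1u << 7)

/*
 * Bounded data-ready poll: a fixed number of tries, read errors
 * propagated, and -EIO (not -EAGAIN) when the bit never sets -- at
 * that point the hardware is almost certainly broken and retrying
 * will not help.  read_int_source() stands in for regmap_read() of
 * ADXL345_REG_INT_SOURCE.
 */
static int wait_data_ready(int (*read_int_source)(uint32_t *val))
{
    int tries = 5;

    do {
        uint32_t val;
        int ret = read_int_source(&val);

        if (ret < 0)
            return ret;             /* bus error: propagate */
        if (val & INT_DATA_READY)
            return 0;               /* fresh sample available */
    } while (--tries);

    return -EIO;                    /* hardware never became ready */
}

/* Illustrative stand-in for the hardware register. */
static uint32_t fake_int_source;
static int fake_read(uint32_t *val)
{
    *val = fake_int_source;
    return 0;
}
```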
Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of ns_last_pid's on ns hierarhy
sorry for delay, vacation... On 04/28, Kirill Tkhai wrote: > > On 27.04.2017 19:22, Oleg Nesterov wrote: > > > > Ah, OK, I didn't notice the ns->child_reaper check in > > pidns_for_children_get(). > > > > But note that it doesn't need tasklist_lock either. > > Hm, are there possible strange situations with memory ordering, when we see > ns->child_reaper of already died ns, which was placed in the same memory? > Do we have to use some memory barriers here? Could you spell please? I don't understand your concerns... I don't see how, say, static struct ns_common *pidns_for_children_get(struct task_struct *task) { struct ns_common *ns = NULL; struct pid_namespace *pid_ns; task_lock(task); if (task->nsproxy) { pid_ns = task->nsproxy->pid_ns_for_children; if (pid_ns->child_reaper) { ns = &pid_ns->ns; get_pid_ns(pid_ns); } } task_unlock(task); return ns; } can be wrong. It also looks cleaner to me. ->child_reaper is not stable without tasklist, it can be dead/etc, but we do not care? Oleg.
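The locking pattern Oleg sketches - take only the task's own lock, re-check the "fully alive" marker (child_reaper) under it, and grab a reference before unlocking - can be modelled in userspace; every name below is a stand-in, not the kernel structure:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Illustrative stand-ins for pid_namespace and task_struct. */
struct pidns {
    int refcount;           /* get_pid_ns() bumps this */
    void *child_reaper;     /* NULL until the ns is fully set up */
};

struct task {
    pthread_mutex_t lock;   /* plays the role of task_lock() */
    struct pidns *ns_for_children;
};

/*
 * Take only the object's own lock, re-check the liveness marker under
 * it, and take the reference before dropping the lock, so the caller
 * can never see a half-constructed (or torn-down) namespace.
 */
static struct pidns *pidns_for_children_get(struct task *t)
{
    struct pidns *ns = NULL;

    pthread_mutex_lock(&t->lock);                   /* task_lock(task) */
    if (t->ns_for_children && t->ns_for_children->child_reaper) {
        ns = t->ns_for_children;
        ns->refcount++;                             /* get_pid_ns() */
    }
    pthread_mutex_unlock(&t->lock);                 /* task_unlock(task) */
    return ns;
}
```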
Re: [PATCH RFC v2 4/6] proc: support mounting private procfs instances inside same pid namespace
On Tue, May 2, 2017 at 7:29 AM, Djalal Harouni wrote: > On Thu, Apr 27, 2017 at 12:13 AM, Andy Lutomirski wrote: >> On Tue, Apr 25, 2017 at 5:23 AM, Djalal Harouni wrote: > [...] >>> We have to align procfs and modernize it to have a per mount context >>> where at least the mount options do not propagate to all other mounts, >>> then maybe we can continue to implement new features. One example is to >>> require CAP_SYS_ADMIN in the init user namespace on some /proc/* which are >>> not pids and which are not virtualized by design, or CAP_NET_ADMIN >>> inside userns on the net bits that are virtualized, etc. >>> These mount options won't propagate to previous mounts, and the system >>> will continue to be usable. >>> >>> This patch introduces the new 'limit_pids' mount option as was also >>> suggested by Andy Lutomirski [1]. When this option is passed we >>> automatically create a private procfs instance. This is not the default >>> behaviour since we do not want to break userspace and we do not want to >>> provide different device IDs by default, please see [1] for why. >> >> I think that calling the option to make a separate instance >> "limit_pids" is extremely counterintuitive. > > Ok. > >> My strong preference would be to make proc *always* make a separate >> instance (unless it's a bind mount) and to make it work. If that >> means fudging stat() output, so be it. > > I also agree, but as said if we change stat(), userspace won't be able > to notice if these two proc instances are really separated, the device > ID is the only indication here. I re-read all the threads and I'm still not convinced I see why we need new_instance to be non-default. It's true that the device numbers of /proc/ns/* matter, but if you look (with stat -L, for example), they're *already* not tied to the procfs instance. 
I'm okay with adding new_instance to be on the safe side, but I'd like it to be done in a way that we could make it become the default some day without breaking anything. This means that we need to be rather careful about how new_instance and hidepid interact.
Re: [GIT PULL] TTY/Serial driver fixes for 4.11-rc4
On Fri, Apr 14, 2017 at 2:30 PM, Greg KH wrote: > On Fri, Apr 14, 2017 at 11:41:26AM +0200, Vegard Nossum wrote: >> On 13 April 2017 at 20:34, Greg KH wrote: >> > On Thu, Apr 13, 2017 at 09:07:40AM -0700, Linus Torvalds wrote: >> >> On Thu, Apr 13, 2017 at 3:50 AM, Vegard Nossum >> >> wrote: >> >> > >> >> > I've bisected a syzkaller crash down to this commit >> >> > (5362544bebe85071188dd9e479b5a5040841c895). The crash is: >> >> > >> >> > [ 25.137552] BUG: unable to handle kernel paging request at >> >> > 2280 >> >> > [ 25.137579] IP: mutex_lock_interruptible+0xb/0x30 >> >> >> >> It would seem to be the >> >> >> >> if (mutex_lock_interruptible(&ldata->atomic_read_lock)) >> >> >> >> call in n_tty_read(), the offset is about right for a NULL 'ldata' >> >> pointer (it's a big structure, it has a couple of character buffers of >> >> size N_TTY_BUF_SIZE). >> >> >> >> I don't see the obvious fix, so I suspect at this point we should just >> >> revert, as that commit seems to introduce worse problems than it is >> >> supposed to fix. Greg? >> > >> > Unless Dmitry has a better idea, I will just revert it and send you the >> > pull request in a day or so. >> >> I don't think we need to rush a revert, I'd hope there's a way to fix >> it properly. > > For this late in the release cycle, for something as complex as tty > ldisc handling, for an issue that has been present for over a decade, > the safest thing right now is to go back to the old well-known code by > applying a revert :) > >> So the original problem is that the vmalloc() in n_tty_open() can >> fail, and that will panic in tty_set_ldisc()/tty_ldisc_restore() >> because of its unwillingness to proceed if the tty doesn't have an >> ldisc. >> >> Dmitry fixed this by allowing tty->ldisc == NULL in the case of memory >> allocation failure as we can see from the comment in tty_set_ldisc(). 
>> >> Unfortunately, it would appear that some other bits of code do not >> like tty->ldisc == NULL (other than the crash in this thread, I saw >> 2-3 similar crashes in other functions, e.g. poll()). I see two >> possibilities: >> >> 1) make other code handle tty->ldisc == NULL. >> >> 2) don't close/free the old ldisc until the new one has been >> successfully created/initialised/opened/attached to the tty, and >> return an error to userspace if changing it failed. >> >> I'm leaning towards #2 as the more obviously correct fix, it makes >> tty_set_ldisc() transactional, the fix seems limited in scope to >> tty_set_ldisc() itself, and we don't need to make every other bit of >> code that uses tty->ldisc handle the NULL case. > > That sounds reasonable to me, care to work on a patch for this? Vegard, do you know how to do this? That was the first thing that I tried, but I did not manage to make it work. The ldisc is tied to the tty, so it's not that one can create a fully initialized ldisc on the side and then simply swap pointers. Looking at the code now, there is at least the TTY_LDISC_OPEN bit in the tty. But as far as I remember there were more fundamental problems. Or maybe I just did not try too hard.
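For reference, the transactional shape Vegard proposes in option #2 - fully construct the replacement before touching the old object, and keep the old one on any failure - is easy to express as a pointer-swap skeleton. This deliberately ignores the hard part Dmitry raises (the ldisc being opened against the tty while in use); every name here is illustrative:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-ins for the tty and line-discipline objects. */
struct ldisc { int num; };
struct tty   { struct ldisc *ldisc; };

/*
 * Transactional replace: build the new ldisc completely first; on any
 * failure the tty keeps its old, working ldisc and the caller gets an
 * error, so tty->ldisc is never left NULL.
 */
static int set_ldisc(struct tty *tty, struct ldisc *(*alloc)(int), int num)
{
    struct ldisc *nld = alloc(num);     /* fully construct first */

    if (!nld)
        return -1;                      /* old ldisc left untouched */

    struct ldisc *old = tty->ldisc;
    tty->ldisc = nld;                   /* commit: swap pointers */
    free(old);                          /* only now retire the old one */
    return 0;
}

/* Stand-in allocators for demonstration. */
static struct ldisc *ok_alloc(int num)
{
    struct ldisc *l = calloc(1, sizeof(*l));

    if (l)
        l->num = num;
    return l;
}

static struct ldisc *fail_alloc(int num)
{
    (void)num;                          /* simulated vmalloc() failure */
    return NULL;
}
```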