date:20140616

Re: [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load

2014-06-16 Thread Vivek Goyal

On Mon, Jun 16, 2014 at 02:25:07PM -0700, H. Peter Anvin wrote:
> On 06/16/2014 02:09 PM, Borislav Petkov wrote:
> > 
> > Nah, I don't feel strongly about it - I just don't trust userspace and
> > think that every value we get from it should be "sanitized".
> > 
> 
> Borislav and I talked about this briefly over IRC.  A key part of that
> is that if userspace could manipulate this system call to consume an
> unreasonable amount of memory, we would have a problem, for example if
> this code used vzalloc() instead of kzalloc().  However, since
> kmalloc/kzalloc implies a relatively restrictive limit on the memory
> allocation size anyway, well short of anything that could cause OOM
> problems, that pretty much solves the problem.

Actually currently I am using vzalloc() for command line buffer
allocation.

image->cmdline_buf = vzalloc(cmdline_len);
if (!image->cmdline_buf)
goto out;

Should I switch to using kzalloc() instead?

Thanks
Vivek
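
For reference, the concern in this thread (a user-supplied length driving an unbounded allocation) can be illustrated with a small userspace sketch. It is not taken from the patch: the cap SKETCH_CMDLINE_MAX and the helper sketch_alloc_cmdline() are hypothetical names, and calloc() merely stands in for kzalloc()/vzalloc().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cap, chosen only for this sketch; not taken from the patch. */
#define SKETCH_CMDLINE_MAX 2048UL

/*
 * Userspace stand-in for the allocation quoted above: reject a zero or
 * oversized length before allocating, so the caller cannot make us
 * reserve an arbitrary amount of memory. calloc() plays the role of
 * kzalloc()/vzalloc() here.
 */
static int sketch_alloc_cmdline(unsigned long cmdline_len, char **out)
{
	char *buf;

	if (cmdline_len == 0 || cmdline_len > SKETCH_CMDLINE_MAX)
		return -EINVAL;

	buf = calloc(1, cmdline_len);
	if (!buf)
		return -ENOMEM;

	*out = buf;
	return 0;
}
```

With an explicit upper bound checked first, the choice of allocator no longer decides how much memory userspace can request.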
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 10/10] mfd: cros_ec: ec_dev->cmd_xfer() returns number of bytes received from EC

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

When communicating with the EC, the cmd_xfer() function should return the
number of bytes it received from the EC, or negative on error.
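
A minimal userspace sketch of this return convention, assuming a cut-down message struct and a canned transport. The names sketch_ec_command, sketch_cmd_xfer() and sketch_get_state() are hypothetical, and the short-reply policy (-EPROTO) is an assumption of the sketch, not something this patch does.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Cut-down stand-in for struct cros_ec_command; only the fields used here. */
struct sketch_ec_command {
	uint8_t *indata;	/* where to put the incoming data */
	int insize;		/* max number of bytes to accept */
};

/*
 * Hypothetical transport: copies a canned reply into msg->indata and
 * returns the number of bytes received, or a negative errno on failure.
 */
static int sketch_cmd_xfer(struct sketch_ec_command *msg,
			   const uint8_t *reply, int reply_len)
{
	if (reply_len < 0)
		return -EINVAL;
	if (reply_len > msg->insize)
		return -EMSGSIZE;
	memcpy(msg->indata, reply, (size_t)reply_len);
	return reply_len;
}

/* Caller side: only a negative value is an error; >= 0 is a byte count. */
static int sketch_get_state(uint8_t *state, int cols,
			    const uint8_t *reply, int reply_len)
{
	struct sketch_ec_command msg = {
		.indata = state,
		.insize = cols,
	};
	int ret;

	ret = sketch_cmd_xfer(&msg, reply, reply_len);
	if (ret < 0)
		return ret;		/* communication failure */
	if (ret != cols)
		return -EPROTO;		/* short reply; policy assumed here */
	return 0;
}
```

The point for callers is that a plain "if (result)" test would now treat every successful transfer with a non-empty reply as a failure, hence the switch to "if (result < 0)".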

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/i2c/busses/i2c-cros-ec-tunnel.c | 2 +-
 drivers/mfd/cros_ec_i2c.c   | 2 +-
 drivers/mfd/cros_ec_spi.c   | 2 +-
 include/linux/mfd/cros_ec.h | 8 
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/i2c/busses/i2c-cros-ec-tunnel.c b/drivers/i2c/busses/i2c-cros-ec-tunnel.c
index dd07818..05e033c 100644
--- a/drivers/i2c/busses/i2c-cros-ec-tunnel.c
+++ b/drivers/i2c/busses/i2c-cros-ec-tunnel.c
@@ -228,7 +228,7 @@ static int ec_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg i2c_msgs[],
msg.insize = response_len;
 
result = bus->ec->cmd_xfer(bus->ec, &msg);
-   if (result)
+   if (result < 0)
goto exit;
 
result = ec_i2c_parse_response(response, i2c_msgs, );
diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 2276096..dc0ba29 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -120,7 +120,7 @@ static int cros_ec_cmd_xfer_i2c(struct cros_ec_device *ec_dev,
goto done;
}
 
-   ret = 0;
+   ret = i2c_msg[1].buf[1];
  done:
kfree(in_buf);
kfree(out_buf);
diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 4d34f1c..beba1bc 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -333,7 +333,7 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
goto exit;
}
 
-   ret = 0;
+   ret = len;
 exit:
mutex_unlock(&ec_spi->lock);
return ret;
diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 60c0880..7b65a75 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -41,7 +41,7 @@ enum {
  * @outdata: Outgoing data to EC
  * @outsize: Outgoing length in bytes
  * @indata: Where to put the incoming data from EC
- * @insize: Incoming length in bytes (filled in by EC)
+ * @insize: Max number of bytes to accept from EC
  * @result: EC's response to the command (separate from communication failure)
  */
 struct cros_ec_command {
@@ -64,9 +64,9 @@ struct cros_ec_command {
  * sleep at the last suspend
  * @event_notifier: interrupt event notifier for transport devices
  * @cmd_xfer: send command to EC and get response
- * Returns 0 if the communication succeeded, but that doesn't mean the EC
- * was happy with the command it got. Caller should check msg.result for
- * the EC's result code.
+ * Returns the number of bytes received if the communication succeeded, but
+ * that doesn't mean the EC was happy with the command. The caller
+ * should check msg.result for the EC's result code.
  *
  * @priv: Private data
  * @irq: Interrupt to use
-- 
2.0.0.526.g5318336


[PATCH 08/10] mfd: cros_ec: cleanup: Remove EC wrapper functions

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

Remove the three wrapper functions that talk to the EC without passing all
the desired arguments and just use the underlying communication function
that passes everything in a struct instead.

This is internal code refactoring only. Nothing should change.

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/i2c/busses/i2c-cros-ec-tunnel.c | 15 +++
 drivers/input/keyboard/cros_ec_keyb.c   | 14 --
 drivers/mfd/cros_ec.c   | 32 
 include/linux/mfd/cros_ec.h | 19 ++-
 4 files changed, 29 insertions(+), 51 deletions(-)

diff --git a/drivers/i2c/busses/i2c-cros-ec-tunnel.c b/drivers/i2c/busses/i2c-cros-ec-tunnel.c
index 8e7a714..dd07818 100644
--- a/drivers/i2c/busses/i2c-cros-ec-tunnel.c
+++ b/drivers/i2c/busses/i2c-cros-ec-tunnel.c
@@ -183,6 +183,7 @@ static int ec_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg i2c_msgs[],
u8 *request = NULL;
u8 *response = NULL;
int result;
+   struct cros_ec_command msg;
 
request_len = ec_i2c_count_message(i2c_msgs, num);
if (request_len < 0) {
@@ -218,9 +219,15 @@ static int ec_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg i2c_msgs[],
}
 
ec_i2c_construct_message(request, i2c_msgs, num, bus_num);
-   result = bus->ec->command_sendrecv(bus->ec, EC_CMD_I2C_PASSTHRU,
-  request, request_len,
-  response, response_len);
+
+   msg.version = 0;
+   msg.command = EC_CMD_I2C_PASSTHRU;
+   msg.outdata = request;
+   msg.outsize = request_len;
+   msg.indata = response;
+   msg.insize = response_len;
+
+   result = bus->ec->cmd_xfer(bus->ec, &msg);
if (result)
goto exit;
 
@@ -258,7 +265,7 @@ static int ec_i2c_probe(struct platform_device *pdev)
u32 remote_bus;
int err;
 
-   if (!ec->command_sendrecv) {
+   if (!ec->cmd_xfer) {
dev_err(dev, "Missing sendrecv\n");
return -EINVAL;
}
diff --git a/drivers/input/keyboard/cros_ec_keyb.c b/drivers/input/keyboard/cros_ec_keyb.c
index 4083796..dc37b6b 100644
--- a/drivers/input/keyboard/cros_ec_keyb.c
+++ b/drivers/input/keyboard/cros_ec_keyb.c
@@ -191,8 +191,18 @@ static void cros_ec_keyb_close(struct input_dev *dev)
 
 static int cros_ec_keyb_get_state(struct cros_ec_keyb *ckdev, uint8_t *kb_state)
 {
-   return ckdev->ec->command_recv(ckdev->ec, EC_CMD_MKBP_STATE,
- kb_state, ckdev->cols);
+   int ret;
+   struct cros_ec_command msg = {
+   .version = 0,
+   .command = EC_CMD_MKBP_STATE,
+   .outdata = NULL,
+   .outsize = 0,
+   .indata = kb_state,
+   .insize = ckdev->cols,
+   };
+
+   ret = ckdev->ec->cmd_xfer(ckdev->ec, &msg);
+   return ret;
 }
 
 static int cros_ec_keyb_work(struct notifier_block *nb,
diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index d242714..6dd91e9 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -44,34 +44,6 @@ int cros_ec_prepare_tx(struct cros_ec_device *ec_dev,
 }
 EXPORT_SYMBOL(cros_ec_prepare_tx);
 
-static int cros_ec_command_sendrecv(struct cros_ec_device *ec_dev,
-   uint16_t cmd, void *out_buf, int out_len,
-   void *in_buf, int in_len)
-{
-   struct cros_ec_command msg;
-
-   msg.version = cmd >> 8;
-   msg.command = cmd & 0xff;
-   msg.outdata = out_buf;
-   msg.outsize = out_len;
-   msg.indata = in_buf;
-   msg.insize = in_len;
-
-   return ec_dev->cmd_xfer(ec_dev, &msg);
-}
-
-static int cros_ec_command_recv(struct cros_ec_device *ec_dev,
-   uint16_t cmd, void *buf, int buf_len)
-{
-   return cros_ec_command_sendrecv(ec_dev, cmd, NULL, 0, buf, buf_len);
-}
-
-static int cros_ec_command_send(struct cros_ec_device *ec_dev,
-   uint16_t cmd, void *buf, int buf_len)
-{
-   return cros_ec_command_sendrecv(ec_dev, cmd, buf, buf_len, NULL, 0);
-}
-
 static irqreturn_t ec_irq_thread(int irq, void *data)
 {
struct cros_ec_device *ec_dev = data;
@@ -104,10 +76,6 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
 
BLOCKING_INIT_NOTIFIER_HEAD(&ec_dev->event_notifier);
 
-   ec_dev->command_send = cros_ec_command_send;
-   ec_dev->command_recv = cros_ec_command_recv;
-   ec_dev->command_sendrecv = cros_ec_command_sendrecv;
-
if (ec_dev->din_size) {
ec_dev->din = devm_kzalloc(dev, ec_dev->din_size, GFP_KERNEL);
if (!ec_dev->din)
diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 2b0c598..60c0880 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -63,9 +63,10 @@ struct cros_ec_command {
  * @was_wake_device: true if this device was set to

Re: perf/workqueue: lockdep warning on process exit

2014-06-16 Thread Sasha Levin

On 06/16/2014 10:24 AM, Sasha Levin wrote:
> Hi all,
> 
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following spew:

I think that this is due to a recent change, since it seems to reproduce very
often. I've tried bisecting it but it seems that I'm hitting unrelated issues
and can't correctly run the bisection.

Thanks,
Sasha

[PATCH 09/10] mfd: cros_ec: Check result code from EC messages

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

Just because the host was able to talk to the EC doesn't mean that the EC
was happy with what it was told. Errors in communication are not the same
as error messages from the EC itself.

This change lets the EC report its errors separately.
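
A rough userspace sketch of the distinction, under stated assumptions: the enum values and the names sketch_ec_msg and sketch_check_result() are hypothetical stand-ins, and the real status codes come from the EC command header rather than from this example.

```c
#include <errno.h>
#include <stdint.h>

/* Values assumed for this sketch; the real ones live in the EC command header. */
enum sketch_ec_status {
	SKETCH_EC_RES_SUCCESS = 0,
	SKETCH_EC_RES_IN_PROGRESS = 8,
};

/* Cut-down stand-in for the message struct; only the result field is modeled. */
struct sketch_ec_msg {
	uint32_t result;	/* EC's own verdict on the command */
};

/*
 * Record the EC's status byte in msg->result and return a transport-level
 * code: 0 when the exchange itself worked, -EAGAIN when the EC is still
 * busy with the command. Any other EC status is left for the caller to
 * inspect in msg->result and is not turned into a transport error.
 */
static int sketch_check_result(struct sketch_ec_msg *msg, uint8_t status_byte)
{
	msg->result = status_byte;

	switch (msg->result) {
	case SKETCH_EC_RES_SUCCESS:
		return 0;
	case SKETCH_EC_RES_IN_PROGRESS:
		return -EAGAIN;
	default:
		return 0;
	}
}
```

A caller therefore checks two things: the return value for communication failure, then msg->result for what the EC thought of the command.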

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/mfd/cros_ec_i2c.c | 15 +++
 drivers/mfd/cros_ec_spi.c | 26 ++
 2 files changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 5bb32f5..2276096 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -92,11 +92,18 @@ static int cros_ec_cmd_xfer_i2c(struct cros_ec_device *ec_dev,
}
 
/* check response error code */
-   if (i2c_msg[1].buf[0]) {
-   dev_warn(ec_dev->dev, "command 0x%02x returned an error %d\n",
-msg->command, i2c_msg[1].buf[0]);
-   ret = -EINVAL;
+   msg->result = i2c_msg[1].buf[0];
+   switch (msg->result) {
+   case EC_RES_SUCCESS:
+   break;
+   case EC_RES_IN_PROGRESS:
+   ret = -EAGAIN;
+   dev_dbg(ec_dev->dev, "command 0x%02x in progress\n",
+   msg->command);
goto done;
+   default:
+   dev_warn(ec_dev->dev, "command 0x%02x returned %d\n",
+msg->command, msg->result);
}
 
/* copy response packet payload and compute checksum */
diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 09ca789..4d34f1c 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -289,21 +289,23 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
goto exit;
}
 
-   /* check response error code */
ptr = ec_dev->din;
-   if (ptr[0]) {
-   if (ptr[0] == EC_RES_IN_PROGRESS) {
-   dev_dbg(ec_dev->dev, "command 0x%02x in progress\n",
-   ec_msg->command);
-   ret = -EAGAIN;
-   goto exit;
-   }
-   dev_warn(ec_dev->dev, "command 0x%02x returned an error %d\n",
-ec_msg->command, ptr[0]);
-   debug_packet(ec_dev->dev, "in_err", ptr, len);
-   ret = -EINVAL;
+
+   /* check response error code */
+   ec_msg->result = ptr[0];
+   switch (ec_msg->result) {
+   case EC_RES_SUCCESS:
+   break;
+   case EC_RES_IN_PROGRESS:
+   ret = -EAGAIN;
+   dev_dbg(ec_dev->dev, "command 0x%02x in progress\n",
+   ec_msg->command);
goto exit;
+   default:
+   dev_warn(ec_dev->dev, "command 0x%02x returned %d\n",
+ec_msg->command, ec_msg->result);
}
+
len = ptr[1];
sum = ptr[0] + ptr[1];
if (len > ec_msg->insize) {
-- 
2.0.0.526.g5318336


[PATCH 04/10] mfd: cros_ec: Tweak struct cros_ec_device for clarity

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

The members of struct cros_ec_device were improperly commented, and
intermixed the private and public sections. This is just cleanup to make it
more obvious what goes with what.

[dianders: left lock in the structure but gave it the name that will
eventually be used.]

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/mfd/cros_ec.c   |  2 +-
 drivers/mfd/cros_ec_i2c.c   |  4 +--
 drivers/mfd/cros_ec_spi.c   | 10 +++
 include/linux/mfd/cros_ec.h | 65 -
 4 files changed, 43 insertions(+), 38 deletions(-)

diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index bd6f936..a9eede5 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -57,7 +57,7 @@ static int cros_ec_command_sendrecv(struct cros_ec_device *ec_dev,
msg.in_buf = in_buf;
msg.in_len = in_len;
 
-   return ec_dev->command_xfer(ec_dev, &msg);
+   return ec_dev->cmd_xfer(ec_dev, &msg);
 }
 
 static int cros_ec_command_recv(struct cros_ec_device *ec_dev,
diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 4f71be9..777e529 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -29,7 +29,7 @@ static inline struct cros_ec_device *to_ec_dev(struct device *dev)
return i2c_get_clientdata(client);
 }
 
-static int cros_ec_command_xfer(struct cros_ec_device *ec_dev,
+static int cros_ec_cmd_xfer_i2c(struct cros_ec_device *ec_dev,
struct cros_ec_msg *msg)
 {
struct i2c_client *client = ec_dev->priv;
@@ -136,7 +136,7 @@ static int cros_ec_i2c_probe(struct i2c_client *client,
ec_dev->dev = dev;
ec_dev->priv = client;
ec_dev->irq = client->irq;
-   ec_dev->command_xfer = cros_ec_command_xfer;
+   ec_dev->cmd_xfer = cros_ec_cmd_xfer_i2c;
ec_dev->ec_name = client->name;
ec_dev->phys_name = client->adapter->name;
ec_dev->parent = &client->dev;
diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 0b8d328..52d4d7b 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -73,7 +73,7 @@
  * if no record
  * @end_of_msg_delay: used to set the delay_usecs on the spi_transfer that
  *  is sent when we want to turn off CS at the end of a transaction.
- * @lock: mutex to ensure only one user of cros_ec_command_spi_xfer at a time
+ * @lock: mutex to ensure only one user of cros_ec_cmd_xfer_spi at a time
  */
 struct cros_ec_spi {
struct spi_device *spi;
@@ -210,13 +210,13 @@ static int cros_ec_spi_receive_response(struct cros_ec_device *ec_dev,
 }
 
 /**
- * cros_ec_command_spi_xfer - Transfer a message over SPI and receive the reply
+ * cros_ec_cmd_xfer_spi - Transfer a message over SPI and receive the reply
  *
  * @ec_dev: ChromeOS EC device
  * @ec_msg: Message to transfer
  */
-static int cros_ec_command_spi_xfer(struct cros_ec_device *ec_dev,
-   struct cros_ec_msg *ec_msg)
+static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
+   struct cros_ec_msg *ec_msg)
 {
struct cros_ec_spi *ec_spi = ec_dev->priv;
struct spi_transfer trans;
@@ -372,7 +372,7 @@ static int cros_ec_spi_probe(struct spi_device *spi)
ec_dev->dev = dev;
ec_dev->priv = ec_spi;
ec_dev->irq = spi->irq;
-   ec_dev->command_xfer = cros_ec_command_spi_xfer;
+   ec_dev->cmd_xfer = cros_ec_cmd_xfer_spi;
ec_dev->ec_name = ec_spi->spi->modalias;
ec_dev->phys_name = dev_name(&ec_spi->spi->dev);
ec_dev->parent = &ec_spi->spi->dev;
diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 2ee3190..79a3585 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -16,7 +16,9 @@
 #ifndef __LINUX_MFD_CROS_EC_H
 #define __LINUX_MFD_CROS_EC_H
 
+#include 
 #include 
+#include 
 
 /*
  * Command interface between EC and AP, for LPC, I2C and SPI interfaces.
@@ -55,34 +57,53 @@ struct cros_ec_msg {
 /**
  * struct cros_ec_device - Information about a ChromeOS EC device
  *
+ * @ec_name: name of EC device (e.g. 'chromeos-ec')
+ * @phys_name: name of physical comms layer (e.g. 'i2c-4')
+ * @dev: Device pointer
+ * @was_wake_device: true if this device was set to wake the system from
+ * sleep at the last suspend
+ * @event_notifier: interrupt event notifier for transport devices
+ * @command_send: send a command
+ * @command_recv: receive a response
+ * @command_sendrecv: send a command and receive a response
+ *
  * @name: Name of this EC interface
  * @priv: Private data
  * @irq: Interrupt to use
- * @din: input buffer (from EC)
- * @dout: output buffer (to EC)
+ * @din: input buffer (for data from EC)
+ * @dout: output buffer (for data to EC)
  * \note
  * These two buffers will always be dword-aligned and include enough
  * space for up to 7 word-alignment bytes also, so we can ensure that
  * the body of the message is always

[PATCH 07/10] mfd: cros_ec: cleanup: remove unused fields from struct cros_ec_device

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

struct cros_ec_device has a superfluous "name" field. We can get all the
debugging info we need from the existing ec_name and phys_name fields, so
let's take out the extra field.

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/mfd/cros_ec.c   | 2 +-
 drivers/mfd/cros_ec_i2c.c   | 1 -
 drivers/mfd/cros_ec_spi.c   | 1 -
 include/linux/mfd/cros_ec.h | 2 --
 4 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index 9304056..d242714 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -138,7 +138,7 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
goto fail_mfd;
}
 
-   dev_info(dev, "Chrome EC (%s)\n", ec_dev->name);
+   dev_info(dev, "Chrome EC device registered\n");
 
return 0;
 
diff --git a/drivers/mfd/cros_ec_i2c.c b/drivers/mfd/cros_ec_i2c.c
index 37ed12f..5bb32f5 100644
--- a/drivers/mfd/cros_ec_i2c.c
+++ b/drivers/mfd/cros_ec_i2c.c
@@ -132,7 +132,6 @@ static int cros_ec_i2c_probe(struct i2c_client *client,
return -ENOMEM;
 
i2c_set_clientdata(client, ec_dev);
-   ec_dev->name = "I2C";
ec_dev->dev = dev;
ec_dev->priv = client;
ec_dev->irq = client->irq;
diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 2d713fe..09ca789 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -374,7 +374,6 @@ static int cros_ec_spi_probe(struct spi_device *spi)
cros_ec_spi_dt_probe(ec_spi, dev);
 
spi_set_drvdata(spi, ec_dev);
-   ec_dev->name = "SPI";
ec_dev->dev = dev;
ec_dev->priv = ec_spi;
ec_dev->irq = spi->irq;
diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index f27c037..2b0c598 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -67,7 +67,6 @@ struct cros_ec_command {
  * @command_recv: receive a response
  * @command_sendrecv: send a command and receive a response
  *
- * @name: Name of this EC interface
  * @priv: Private data
  * @irq: Interrupt to use
  * @din: input buffer (for data from EC)
@@ -104,7 +103,6 @@ struct cros_ec_device {
void *in_buf, int in_len);
 
/* These are used to implement the platform-specific interface */
-   const char *name;
void *priv;
int irq;
uint8_t *din;
-- 
2.0.0.526.g5318336


[PATCH 01/10] mfd: cros_ec: Fix the comment on cros_ec_remove()

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

This comment was incorrect, so update it.

Signed-off-by: Bill Richardson 
Signed-off-by: Simon Glass 
Signed-off-by: Doug Anderson 
---
 include/linux/mfd/cros_ec.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 887ef4f..7e9fe6e 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -148,8 +148,7 @@ int cros_ec_prepare_tx(struct cros_ec_device *ec_dev,
 /**
  * cros_ec_remove - Remove a ChromeOS EC
  *
- * Call this to deregister a ChromeOS EC. After this you should call
- * cros_ec_free().
+ * Call this to deregister a ChromeOS EC, then clean up any private data.
  *
  * @ec_dev: Device to register
  * @return 0 if ok, -ve on error
-- 
2.0.0.526.g5318336


[PATCH 03/10] mfd: cros_ec: Allow static din/dout buffers with cros_ec_register()

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

The lower-level driver may want to provide its own buffers. If so,
there's no need to allocate new ones.  This already happens to work
just fine (since we check for size of 0 and use devm allocation), but
it's good to document it.
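
As a userspace illustration of the "size of 0 means use the caller's buffer" convention, here is a minimal sketch. The struct sketch_ec_device and the function sketch_register_buffers() are hypothetical, and calloc() stands in for the devm allocation mentioned above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Cut-down stand-in for struct cros_ec_device; only the buffer fields. */
struct sketch_ec_device {
	uint8_t *din;
	int din_size;	/* zero: din was supplied by the lower-level driver */
	uint8_t *dout;
	int dout_size;	/* zero: dout was supplied by the lower-level driver */
};

/*
 * Allocate only the buffers whose size is non-zero. A buffer with size
 * zero is assumed to be static and is neither replaced nor freed.
 */
static int sketch_register_buffers(struct sketch_ec_device *ec)
{
	if (ec->din_size) {
		ec->din = calloc(1, (size_t)ec->din_size);
		if (!ec->din)
			return -ENOMEM;
	}

	if (ec->dout_size) {
		ec->dout = calloc(1, (size_t)ec->dout_size);
		if (!ec->dout) {
			/* Undo only what this function allocated. */
			if (ec->din_size) {
				free(ec->din);
				ec->din = NULL;
			}
			return -ENOMEM;
		}
	}

	return 0;
}
```

In the kernel the devm allocator takes care of cleanup, so the explicit unwind above exists only because this sketch uses plain calloc().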

[dianders: Resolved conflicts; documented that no code changes needed
on mainline]

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 include/linux/mfd/cros_ec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 7e9fe6e..2ee3190 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -68,8 +68,8 @@ struct cros_ec_msg {
  * We use this alignment to keep ARM and x86 happy. Probably word
  * alignment would be OK, there might be a small performance advantage
  * to using dword.
- * @din_size: size of din buffer
- * @dout_size: size of dout buffer
+ * @din_size: size of din buffer to allocate (zero to use static din)
+ * @dout_size: size of dout buffer to allocate (zero to use static dout)
  * @command_send: send a command
  * @command_recv: receive a command
  * @ec_name: name of EC device (e.g. 'chromeos-ec')
-- 
2.0.0.526.g5318336


[PATCH 02/10] mfd: cros_ec: IRQs for cros_ec should be optional

2014-06-16 Thread Doug Anderson

From: Bill Richardson 

Preparing the way for the LPC device, which is just a platform_device without
interrupts.
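
The pattern can be sketched in userspace as follows, assuming that irq == 0 means "no interrupt". The struct and the sketch_* helpers are hypothetical stand-ins for the device struct, request_threaded_irq() and free_irq().

```c
#include <stdbool.h>

/* Cut-down stand-in for struct cros_ec_device; irq == 0 means "no IRQ". */
struct sketch_ec_irq_dev {
	int irq;
	bool irq_requested;	/* bookkeeping for this sketch only */
};

/* Hypothetical stand-ins for request_threaded_irq() and free_irq(). */
static int sketch_request_irq(struct sketch_ec_irq_dev *ec)
{
	ec->irq_requested = true;
	return 0;
}

static void sketch_free_irq(struct sketch_ec_irq_dev *ec)
{
	ec->irq_requested = false;
}

/* Registration succeeds without an IRQ; one is requested only if present. */
static int sketch_irq_register(struct sketch_ec_irq_dev *ec)
{
	int err;

	if (ec->irq) {
		err = sketch_request_irq(ec);
		if (err)
			return err;
	}
	return 0;
}

/* Removal mirrors registration: free the IRQ only if one was configured. */
static void sketch_irq_remove(struct sketch_ec_irq_dev *ec)
{
	if (ec->irq)
		sketch_free_irq(ec);
}
```

Keeping the same "if (ec->irq)" guard on every path that frees the interrupt is what makes the register and remove sides symmetric.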

Signed-off-by: Bill Richardson 
Signed-off-by: Doug Anderson 
---
 drivers/mfd/cros_ec.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index 38fe9bf..bd6f936 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -119,17 +119,15 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
return -ENOMEM;
}
 
-   if (!ec_dev->irq) {
-   dev_dbg(dev, "no valid IRQ: %d\n", ec_dev->irq);
-   return err;
-   }
-
-   err = request_threaded_irq(ec_dev->irq, NULL, ec_irq_thread,
-  IRQF_TRIGGER_LOW | IRQF_ONESHOT,
-  "chromeos-ec", ec_dev);
-   if (err) {
-   dev_err(dev, "request irq %d: error %d\n", ec_dev->irq, err);
-   return err;
+   if (ec_dev->irq) {
+   err = request_threaded_irq(ec_dev->irq, NULL, ec_irq_thread,
+   IRQF_TRIGGER_LOW | IRQF_ONESHOT,
+   "chromeos-ec", ec_dev);
+   if (err) {
+   dev_err(dev, "request irq %d: error %d\n",
+   ec_dev->irq, err);
+   return err;
+   }
}
 
err = mfd_add_devices(dev, 0, cros_devs,
@@ -145,7 +143,8 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
return 0;
 
 fail_mfd:
-   free_irq(ec_dev->irq, ec_dev);
+   if (ec_dev->irq)
+   free_irq(ec_dev->irq, ec_dev);
 
return err;
 }
@@ -154,7 +153,8 @@ EXPORT_SYMBOL(cros_ec_register);
 int cros_ec_remove(struct cros_ec_device *ec_dev)
 {
mfd_remove_devices(ec_dev->dev);
-   free_irq(ec_dev->irq, ec_dev);
+   if (ec_dev->irq)
+   free_irq(ec_dev->irq, ec_dev);
 
return 0;
 }
-- 
2.0.0.526.g5318336


[PATCH 0/10] Batch of cleanup patches for cros_ec

2014-06-16 Thread Doug Anderson

This is a batch of cleanup patches picked from the ChromeOS 3.8 kernel
tree and applied to ToT.  Most of these patches were authored by Bill
Richardson (CCed).  Where appropriate I've squashed patches together,
though I have erred on the side of keeping patches logically distinct
rather than squashing into one big cleanup patch.

There is very little functionality added by this series, but this gets
us closer to how things look in the ChromeOS tree so we can add more
patches atop it.  In general I took the oldest patches from our tree
and stopped picking when I got to a reasonable patch size (10
patches).  There are about 5 more cleanup patches still in the
ChromeOS tree, then some more major functionality patches.

Note that I didn't take the "cros_ec_dev" userspace interface, the
"LPC" implementation, the "vboot context" implementation, and patches
relating to exynos5250-spring when picking patches.  These bits are
very separate (and big!) and can be added and debated separately after
we've got cleanup in.  Whenever patches touched those pieces of the
code I ignored that part of the patch.  In general I did take cleanup
code that was intended to make it easier to later add these bits.

I have tested basic functionality of these patches on exynos5250-snow
and exynos5420-peach-pit.


Bill Richardson (9):
  mfd: cros_ec: Fix the comment on cros_ec_remove()
  mfd: cros_ec: IRQs for cros_ec should be optional
  mfd: cros_ec: Allow static din/dout buffers with cros_ec_register()
  mfd: cros_ec: Tweak struct cros_ec_device for clarity
  mfd: cros_ec: Use struct cros_ec_command to communicate with the EC
  mfd: cros_ec: cleanup: remove unused fields from struct cros_ec_device
  mfd: cros_ec: cleanup: Remove EC wrapper functions
  mfd: cros_ec: Check result code from EC messages
  mfd: cros_ec: ec_dev->cmd_xfer() returns number of bytes received from
EC

Simon Glass (1):
  mdf: cros_ec: Detect in-progress commands

 drivers/i2c/busses/i2c-cros-ec-tunnel.c |  17 --
 drivers/input/keyboard/cros_ec_keyb.c   |  14 -
 drivers/mfd/cros_ec.c   |  76 +++-
 drivers/mfd/cros_ec_i2c.c   |  44 --
 drivers/mfd/cros_ec_spi.c   |  43 --
 include/linux/mfd/cros_ec.h | 100 +++-
 6 files changed, 144 insertions(+), 150 deletions(-)

-- 
2.0.0.526.g5318336


Re: 3.15: kernel BUG at kernel/auditsc.c:1525!

2014-06-16 Thread Andy Lutomirski

[cc: hpa, x86 list]

On Mon, Jun 16, 2014 at 1:43 PM, Richard Weinberger  wrote:
> Am 16.06.2014 22:41, schrieb Toralf Förster:
>> Well, might be the mail:subject should be adapted, b/c the issue can be 
>> triggered in a 3.13.11 kernel too.
>> Unfortunately it does not appear within an UML guest, therefore an automated 
>> bisecting isn't possible I fear.
>
> You could try KVM. :)

Before you do that, just to clarify:

What bitness is your kernel?  That is, are you on 32-bit or 64-bit kernel?

What bitness is your test case?  'file a.out' will say.

What does /proc/cpuinfo say in flags?

Can you try the attached patch?  It's only compile-tested.

To hpa, etc:  It appears that entry_32.S is missing any call to the
audit exit hook on the badsys path.  If I'm diagnosing this bug report
correctly, this causes OOPSes.

To the world at large: it's increasingly apparent that no one (except
maybe the blackhats) has ever scrutinized the syscall auditing code.
These are two old severe bugs in the code that have probably been there
for a long time.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
From 8b43bd2118d876cb3163e8f7d9cd8253da649335 Mon Sep 17 00:00:00 2001
Message-Id: <8b43bd2118d876cb3163e8f7d9cd8253da649335.1402954406.git.l...@amacapital.net>
From: Andy Lutomirski 
Date: Mon, 16 Jun 2014 14:28:19 -0700
Subject: [PATCH] x86_32,entry: Fix badsys paths
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The bad syscall nr paths are their own incomprehensible route
through the entry control flow.  Rearrange them to work just like
syscalls that return -ENOSYS.

This should fix an OOPS in the audit code when auditing is enabled
and bad syscall nrs are used.

Reported-by: Toralf Förster 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/entry_32.S | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 98313ff..eb6e07e 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -431,9 +431,10 @@ sysenter_past_esp:
 	jnz sysenter_audit
 sysenter_do_call:
 	cmpl $(NR_syscalls), %eax
-	jae syscall_badsys
+	jae sysenter_badsys
 	call *sys_call_table(,%eax,4)
 	movl %eax,PT_EAX(%esp)
+sysenter_after_call:
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
@@ -687,7 +688,12 @@ END(syscall_fault)

 syscall_badsys:
 	movl $-ENOSYS,PT_EAX(%esp)
-	jmp resume_userspace
+	jmp syscall_exit
+END(syscall_badsys)
+
+sysenter_badsys:
+	movl $-ENOSYS,PT_EAX(%esp)
+	jmp sysenter_after_call
 END(syscall_badsys)
 	CFI_ENDPROC
 /*
-- 
1.9.3

Re: netconsole breaks netpoll on bridge

2014-06-16 Thread Francois Romieu

Stefan Priebe - Profihost AG  :
[...]
> That sounds great! Is there anything I can do or some code I can port to veth?

You may add an empty handler for .ndo_poll_controller in drivers/net/veth.c
and give it a try on current kernel.

It should not be too bad.

-- 
Ueimor

Re: [PATCH] serial: samsung: Fix compile error with SERIAL_SAMSUNG_DEBUG

2014-06-16 Thread Greg Kroah-Hartman

On Sun, Jun 15, 2014 at 10:46:56AM -0700, Joe Perches wrote:
> On Sun, 2014-06-15 at 23:06 +0530, Sachin Kamat wrote:
> > On Sun, Jun 15, 2014 at 2:03 AM, Joe Perches  wrote:
> > > Greg?  Can you please apply this soon?
> > 
> > A similar patch has already been posted [1]. Probably Greg will apply it
> > once 3.16-rc1 is out.
> > 
> > [1] http://www.spinics.net/lists/linux-serial/msg12843.html
> 
> I know.  It was just another prompt to Greg.
> 
> It's just unfortunate to have a compile failure path
> in Linus' tree.  Especially one I introduced.
> 
> He'd said 10 days ago or so it'd be a day or two.
> 
> http://www.spinics.net/lists/linux-serial/msg12859.html

Life gets in the way, I'll get it merged in time for 3.16-rc2 or -3.

thanks,

greg k-h

Re: [PATCH] net/cadence/macb: clear interrupts simply and correctly

2014-06-16 Thread Sören Brinkmann

On Thu, 2014-06-12 at 05:50PM +0900, Jongsung Kim wrote:
> The "Rx used bit read" interrupt is enabled but not cleared for some
> systems with the ISR (Interrupt Status Register) configured as clear-
> on-write. This interrupt may be asserted when the CPU does not handle
> Rx-complete interrupts for a long time. (e.g., if the CPU is stopped
> by debugger) Once asserted, it'll not be cleared, and the CPU will
> loop infinitely in the interrupt handler.
> 
> This patch forces the use of a dedicated function for reading the ISR,
> and that function clears it if clear-on-write is configured. So the ISR is
> always cleared after a read, regardless of the clear-on-write configuration.
> 
> Reported-by: Hayun Hwang 
> Signed-off-by: Youngkyu Choi 
> Signed-off-by: Jongsung Kim 
> Tested-by: Hayun Hwang 
> ---
>  drivers/net/ethernet/cadence/macb.c |   37 ++
>  1 files changed, 15 insertions(+), 22 deletions(-)
> 
[...]
> @@ -552,9 +562,6 @@ static void macb_tx_interrupt(struct macb *bp)
>   status = macb_readl(bp, TSR);
>   macb_writel(bp, TSR, status);
>  
> - if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> - macb_writel(bp, ISR, MACB_BIT(TCOMP));
> -
>   netdev_vdbg(bp->dev, "macb_tx_interrupt status = 0x%03lx\n",
>   (unsigned long)status);
>  
> @@ -883,13 +890,10 @@ static int macb_poll(struct napi_struct *napi, int 
> budget)
>  
>   /* Packets received while interrupts were disabled */
>   status = macb_readl(bp, RSR);
> - if (status) {
> - if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> - macb_writel(bp, ISR, MACB_BIT(RCOMP));
> + if (status)
>   napi_reschedule(napi);
> - } else {
> + else
>   macb_writel(bp, IER, MACB_RX_INT_FLAGS);
> - }
>   }
>  
>   /* TODO: Handle errors */
> @@ -903,7 +907,7 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
>   struct macb *bp = netdev_priv(dev);
>   u32 status;
>  
> - status = macb_readl(bp, ISR);
> + status = macb_read_isr(bp);
>  
>   if (unlikely(!status))
>   return IRQ_NONE;
> @@ -928,8 +932,6 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
>* now.
>*/
>   macb_writel(bp, IDR, MACB_RX_INT_FLAGS);
> - if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> - macb_writel(bp, ISR, MACB_BIT(RCOMP));
Shouldn't it be sufficient to replace 'MACB_BIT(RCOMP)' with 'MACB_RX_INT_FLAGS'
to clear all the RX IRQ flags?

Sören
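
The idea in the quoted patch description (read the ISR through one helper that also clears it on clear-on-write hardware) can be sketched in userspace as follows. This is not the patch's macb_read_isr(), whose body is elided above; struct isr_model and the model_* helpers are hypothetical and simulate the register rather than touch hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device whose interrupt status register (ISR)
 * is either clear-on-read or clear-on-write. */
struct isr_model {
	uint32_t isr;         /* pending interrupt bits */
	bool clear_on_write;  /* models MACB_CAPS_ISR_CLEAR_ON_WRITE */
};

/* Register read: clear-on-read hardware drops the bits as a side effect. */
static uint32_t model_read_isr(struct isr_model *hw)
{
	uint32_t status = hw->isr;

	if (!hw->clear_on_write)
		hw->isr = 0;
	return status;
}

/* Register write: on clear-on-write hardware, writing 1 clears a bit. */
static void model_write_isr(struct isr_model *hw, uint32_t bits)
{
	if (hw->clear_on_write)
		hw->isr &= ~bits;
}

/* Read the ISR and acknowledge every bit that was read. */
static uint32_t read_and_clear_isr(struct isr_model *hw)
{
	uint32_t status = model_read_isr(hw);

	if (hw->clear_on_write && status)
		model_write_isr(hw, status);
	return status;
}
```

Because the helper writes back exactly what it read, no pending bit (including one the handler does not otherwise act on) stays asserted after the call in this model.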

Re: [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry

2014-06-16 Thread Borislav Petkov

On Mon, Jun 16, 2014 at 05:15:00PM -0400, Vivek Goyal wrote:
> Do we want to show all the rejection messages from the bzImage64 and
> bzImage32 loaders? It might be too verbose to show users that, before the
> vmlinux loader accepted the image, other loaders on this arch rejected
> the image.

I get all that. But, if people want to get feedback from the system
about *why* their image didn't load, they absolutely have to enable
dynamic debug. And this is not optimal IMO because they will have to
look at the code first to see what they need to do.

Or is kexec-tools going to be taught to interpret return values from the
syscall?

In any case, we want information about why an image fails loading to
reach the user in the easiest way possible. And why should the user need
to enable dynamic debug if he can get the info without doing so?

Oh, and not everyone knows about dynamic debug so...

And I don't think it'll be too much info - only the line which fails
the check will be printed before the image loader fails so that's
practically one error reason per failed image.

Ok?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
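
A rough userspace sketch of the trade-off discussed in this message, assuming each loader stops at its first failed check and reports that single reason. The loader_sketch type, the two toy probes and pick_loader() are hypothetical and are not the kexec loader interface.

```c
#include <stddef.h>

/* Hypothetical loader descriptor: probe() returns NULL when it accepts the
 * image, or a short static string naming the first check that failed. */
struct loader_sketch {
	const char *name;
	const char *(*probe)(const unsigned char *buf, size_t len);
};

static const char *probe_magic_ab(const unsigned char *buf, size_t len)
{
	if (len < 2)
		return "image too short";
	if (buf[0] != 'A' || buf[1] != 'B')
		return "bad magic";
	return NULL;
}

static const char *probe_magic_cd(const unsigned char *buf, size_t len)
{
	if (len < 2)
		return "image too short";
	if (buf[0] != 'C' || buf[1] != 'D')
		return "bad magic";
	return NULL;
}

/* Try each loader in order. reasons[i] receives the single rejection reason
 * from loader i, or NULL if that loader accepted the image or was not tried.
 * Returns the index of the accepting loader, or -1 if all rejected. */
static int pick_loader(const struct loader_sketch *loaders, size_t n,
		       const unsigned char *buf, size_t len,
		       const char **reasons)
{
	size_t i;

	for (i = 0; i < n; i++)
		reasons[i] = NULL;

	for (i = 0; i < n; i++) {
		reasons[i] = loaders[i].probe(buf, len);
		if (!reasons[i])
			return (int)i;
	}
	return -1;
}
```

In this model a caller sees at most one reason per loader that rejected the image, and it can choose whether to print the reasons from loaders that ran before the one that accepted.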

Re: [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load

2014-06-16 Thread H. Peter Anvin

On 06/16/2014 02:09 PM, Borislav Petkov wrote:
> 
> Nah, I don't feel strongly about it - I just don't trust userspace and
> think that every value we get from it should be "sanitized".
> 

Borislav and I talked about this briefly over IRC.  A key part of that
is that if userspace could manipulate this system call to consume an
unreasonable amount of memory, we would have a problem, for example if
this code used vzalloc() instead of kzalloc().  However, since
kmalloc/kzalloc implies a relatively restrictive limit on the memory
allocation size anyway, well short of anything that could cause OOM
problems, that pretty much solves the problem.

-hpa
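
A minimal userspace sketch of the concern above: when the allocator itself does not impose a tight size limit, a length supplied by userspace would need an explicit bound before allocation. CMDLINE_MAX_SKETCH and alloc_cmdline_sketch() are hypothetical; the thread names no specific limit, and calloc() merely stands in for a zeroing kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical cap, chosen only for illustration. */
#define CMDLINE_MAX_SKETCH 2048UL

/* Validate a caller-supplied length, then allocate a zeroed buffer.
 * Returns 0 on success or a negative errno value; *out is NULL on error. */
static int alloc_cmdline_sketch(unsigned long len, char **out)
{
	char *buf;

	*out = NULL;
	if (len == 0 || len > CMDLINE_MAX_SKETCH)
		return -EINVAL;

	buf = calloc(1, len);	/* zeroed, standing in for kzalloc()/vzalloc() */
	if (!buf)
		return -ENOMEM;

	*out = buf;
	return 0;
}
```

The check runs before any memory is requested, so an oversized request costs nothing beyond the comparison.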


Re: Linux 3.15.1

2014-06-16 Thread Greg KH

diff --git a/Makefile b/Makefile
index 6d1e304943a3..e2846acd2841 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 15
-SUBLEVEL = 0
+SUBLEVEL = 1
 EXTRAVERSION =
 NAME = Shuffling Zombie Juror
 
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 60707814a84b..dae5607e1115 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -445,10 +445,14 @@ static const struct pci_device_id ahci_pci_tbl[] = {
  .driver_data = board_ahci_yes_fbs },  /* 88se9172 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9192),
  .driver_data = board_ahci_yes_fbs },  /* 88se9172 on 
some Gigabyte */
+   { PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x91a0),
+ .driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x91a3),
  .driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9230),
  .driver_data = board_ahci_yes_fbs },
+   { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642),
+ .driver_data = board_ahci_yes_fbs },
 
/* Promise */
{ PCI_VDEVICE(PROMISE, 0x3f20), board_ahci },   /* PDC42819 */
diff --git a/drivers/media/dvb-core/dvb-usb-ids.h 
b/drivers/media/dvb-core/dvb-usb-ids.h
index 1bdc0e7e8b79..80643ef9183f 100644
--- a/drivers/media/dvb-core/dvb-usb-ids.h
+++ b/drivers/media/dvb-core/dvb-usb-ids.h
@@ -361,6 +361,7 @@
 #define USB_PID_FRIIO_WHITE0x0001
 #define USB_PID_TVWAY_PLUS 0x0002
 #define USB_PID_SVEON_STV200xe39d
+#define USB_PID_SVEON_STV20_RTL2832U   0xd39d
 #define USB_PID_SVEON_STV220xe401
 #define USB_PID_SVEON_STV22_IT9137 0xe411
 #define USB_PID_AZUREWAVE_AZ6027   0x3275
@@ -375,4 +376,5 @@
 #define USB_PID_CTVDIGDUAL_V2  0xe410
 #define USB_PID_PCTV_2002E  0x025c
 #define USB_PID_PCTV_2002E_SE   0x025d
+#define USB_PID_SVEON_STV27 0xd3af
 #endif
diff --git a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c 
b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
index dcbd392e6efc..a676e4452847 100644
--- a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
+++ b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
@@ -1537,6 +1537,12 @@ static const struct usb_device_id rtl28xxu_id_table[] = {
&rtl2832u_props, "Crypto ReDi PC 50 A", NULL) },
{ DVB_USB_DEVICE(USB_VID_KYE, 0x707f,
&rtl2832u_props, "Genius TVGo DVB-T03", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, 0xd395,
+   &rtl2832u_props, "Peak DVB-T USB", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, USB_PID_SVEON_STV20_RTL2832U,
+   &rtl2832u_props, "Sveon STV20", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, USB_PID_SVEON_STV27,
+   &rtl2832u_props, "Sveon STV27", NULL) },
 
/* RTL2832P devices: */
{ DVB_USB_DEVICE(USB_VID_HANFTEK, 0x0131,
diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
index 8dbdaaef1af5..bdcebfa30fc8 100644
--- a/drivers/misc/mei/hw-me.c
+++ b/drivers/misc/mei/hw-me.c
@@ -164,6 +164,9 @@ static void mei_me_hw_reset_release(struct mei_device *dev)
hcsr |= H_IG;
hcsr &= ~H_RST;
mei_hcsr_set(hw, hcsr);
+
+   /* complete this write before we set host ready on another CPU */
+   mmiowb();
 }
 /**
  * mei_me_hw_reset - resets fw via mei csr register.
@@ -183,8 +186,21 @@ static int mei_me_hw_reset(struct mei_device *dev, bool 
intr_enable)
else
hcsr &= ~H_IE;
 
+   dev->recvd_hw_ready = false;
mei_me_reg_write(hw, H_CSR, hcsr);
 
+   /*
+* Host reads the H_CSR once to ensure that the
+* posted write to H_CSR completes.
+*/
+   hcsr = mei_hcsr_read(hw);
+
+   if ((hcsr & H_RST) == 0)
+   dev_warn(&dev->pdev->dev, "H_RST is not set = 0x%08X", hcsr);
+
+   if ((hcsr & H_RDY) == H_RDY)
+   dev_warn(&dev->pdev->dev, "H_RDY is not cleared 0x%08X", hcsr);
+
if (intr_enable == false)
mei_me_hw_reset_release(dev);
 
@@ -201,6 +217,7 @@ static int mei_me_hw_reset(struct mei_device *dev, bool 
intr_enable)
 static void mei_me_host_set_ready(struct mei_device *dev)
 {
struct mei_me_hw *hw = to_me_hw(dev);
+   hw->host_hw_state = mei_hcsr_read(hw);
hw->host_hw_state |= H_IE | H_IG | H_RDY;
mei_hcsr_set(hw, hw->host_hw_state);
 }
@@ -233,10 +250,7 @@ static bool mei_me_hw_is_ready(struct mei_device *dev)
 static int mei_me_hw_ready_wait(struct mei_device *dev)
 {
int err;
-   if (mei_me_hw_is_ready(dev))
-   return 0;
 
-   dev->recvd_hw_ready = false;
mutex_unlock(&dev->device_lock);
err = wait_event_interruptible_timeout(dev->wait_hw_ready,
dev->recvd_hw_ready,
@@ -491,14 +505,13 @@ irqreturn_t mei_me_irq_thread_handler(int

Linux 3.15.1

2014-06-16 Thread Greg KH

I'm announcing the release of the 3.15.1 kernel.

All users of the 3.15 kernel series must upgrade.

The updated 3.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.15.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile                                |    2 +-
 drivers/ata/ahci.c                      |    4 
 drivers/media/dvb-core/dvb-usb-ids.h    |    2 ++
 drivers/media/usb/dvb-usb-v2/rtl28xxu.c |    6 ++
 drivers/misc/mei/hw-me.c                |   25 +++--
 drivers/pci/msi.c                       |    2 +-
 fs/attr.c                               |    8 
 fs/dcache.c                             |    4 +++-
 fs/inode.c                              |   10 +++---
 fs/namei.c                              |   11 ++-
 fs/xfs/xfs_ioctl.c                      |    2 +-
 include/linux/capability.h              |    2 +-
 kernel/auditsc.c                        |   27 ++-
 kernel/capability.c                     |   20 
 14 files changed, 81 insertions(+), 44 deletions(-)

Al Viro (1):
  lock_parent: don't step on stale ->d_parent of all-but-freed one

Alessandro Miceli (2):
  rtl28xxu: add [1b80:d39d] Sveon STV20
  rtl28xxu: add [1b80:d3af] Sveon STV27

Alexei Starovoitov (1):
  PCI/MSI: Fix memory leak in free_msi_irqs()

Andreas Schrägle (1):
  ahci: add PCI ID for Marvell 88SE91A0 SATA Controller

Andy Lutomirski (2):
  fs,userns: Change inode_capable to capable_wrt_inode_uidgid
  auditsc: audit_krule mask accesses need bounds checking

Brian Healy (1):
  rtl28xxu: add 1b80:d395 Peak DVB-T USB

Greg Kroah-Hartman (1):
  Linux 3.15.1

Jérôme Carretero (1):
  ahci: Add Device ID for HighPoint RocketRaid 642L

Tomas Winkler (3):
  mei: me: fix hw ready reset flow
  mei: me: drop harmful wait optimization
  mei: me: read H_CSR after asserting reset




Re: Linux 3.14.8

2014-06-16 Thread Greg KH

diff --git a/Makefile b/Makefile
index f2d1225828c2..ef1d59b750ea 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 14
-SUBLEVEL = 7
+SUBLEVEL = 8
 EXTRAVERSION =
 NAME = Remembering Coco
 
diff --git a/arch/mips/include/asm/thread_info.h 
b/arch/mips/include/asm/thread_info.h
index 24846f9053fe..e80ae50cae80 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -136,7 +136,8 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT)
[...]
+   spin_lock_bh(&np->np_thread_lock);
+   if (!np->enabled) {
+   spin_unlock_bh(&np->np_thread_lock);
+   pr_debug("iscsi_np is not enabled, reject connect request\n");
+   return rdma_reject(cma_id, NULL, 0);
+   }
+   spin_unlock_bh(&np->np_thread_lock);
+
pr_debug("Entering isert_connect_request cma_id: %p, context: %p\n",
 cma_id, cma_id->context);
 
diff --git a/drivers/media/dvb-core/dvb-usb-ids.h 
b/drivers/media/dvb-core/dvb-usb-ids.h
index f19a2ccd1e4b..80643ef9183f 100644
--- a/drivers/media/dvb-core/dvb-usb-ids.h
+++ b/drivers/media/dvb-core/dvb-usb-ids.h
@@ -257,6 +257,7 @@
 #define USB_PID_TERRATEC_T50x10a1
 #define USB_PID_NOXON_DAB_STICK0x00b3
 #define USB_PID_NOXON_DAB_STICK_REV2   0x00e0
+#define USB_PID_NOXON_DAB_STICK_REV3   0x00b4
 #define USB_PID_PINNACLE_EXPRESSCARD_320CX 0x022e
 #define USB_PID_PINNACLE_PCTV2000E 0x022c
 #define USB_PID_PINNACLE_PCTV_DVB_T_FLASH  0x0228
@@ -360,6 +361,7 @@
 #define USB_PID_FRIIO_WHITE0x0001
 #define USB_PID_TVWAY_PLUS 0x0002
 #define USB_PID_SVEON_STV200xe39d
+#define USB_PID_SVEON_STV20_RTL2832U   0xd39d
 #define USB_PID_SVEON_STV220xe401
 #define USB_PID_SVEON_STV22_IT9137 0xe411
 #define USB_PID_AZUREWAVE_AZ6027   0x3275
@@ -374,4 +376,5 @@
 #define USB_PID_CTVDIGDUAL_V2  0xe410
 #define USB_PID_PCTV_2002E  0x025c
 #define USB_PID_PCTV_2002E_SE   0x025d
+#define USB_PID_SVEON_STV27 0xd3af
 #endif
diff --git a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c 
b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
index fda5c64ba0e8..fd1312d0b078 100644
--- a/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
+++ b/drivers/media/usb/dvb-usb-v2/rtl28xxu.c
@@ -1382,6 +1382,7 @@ static const struct dvb_usb_device_properties 
rtl2832u_props = {
 };
 
 static const struct usb_device_id rtl28xxu_id_table[] = {
+   /* RTL2831U devices: */
{ DVB_USB_DEVICE(USB_VID_REALTEK, USB_PID_REALTEK_RTL2831U,
&rtl2831u_props, "Realtek RTL2831U reference design", NULL) },
{ DVB_USB_DEVICE(USB_VID_WIDEVIEW, USB_PID_FREECOM_DVBT,
@@ -1389,6 +1390,7 @@ static const struct usb_device_id rtl28xxu_id_table[] = {
{ DVB_USB_DEVICE(USB_VID_WIDEVIEW, USB_PID_FREECOM_DVBT_2,
&rtl2831u_props, "Freecom USB2.0 DVB-T", NULL) },
 
+   /* RTL2832U devices: */
{ DVB_USB_DEVICE(USB_VID_REALTEK, 0x2832,
&rtl2832u_props, "Realtek RTL2832U reference design", NULL) },
{ DVB_USB_DEVICE(USB_VID_REALTEK, 0x2838,
@@ -1401,6 +1403,8 @@ static const struct usb_device_id rtl28xxu_id_table[] = {
&rtl2832u_props, "TerraTec NOXON DAB Stick", NULL) },
{ DVB_USB_DEVICE(USB_VID_TERRATEC, USB_PID_NOXON_DAB_STICK_REV2,
&rtl2832u_props, "TerraTec NOXON DAB Stick (rev 2)", NULL) },
+   { DVB_USB_DEVICE(USB_VID_TERRATEC, USB_PID_NOXON_DAB_STICK_REV3,
+   &rtl2832u_props, "TerraTec NOXON DAB Stick (rev 3)", NULL) },
{ DVB_USB_DEVICE(USB_VID_GTEK, USB_PID_TREKSTOR_TERRES_2_0,
&rtl2832u_props, "Trekstor DVB-T Stick Terres 2.0", NULL) },
{ DVB_USB_DEVICE(USB_VID_DEXATEK, 0x1101,
@@ -1429,7 +1433,16 @@ static const struct usb_device_id rtl28xxu_id_table[] = {
&rtl2832u_props, "Leadtek WinFast DTV Dongle mini", NULL) },
{ DVB_USB_DEVICE(USB_VID_GTEK, USB_PID_CPYTO_REDI_PC50A,
&rtl2832u_props, "Crypto ReDi PC 50 A", NULL) },
-
+   { DVB_USB_DEVICE(USB_VID_KYE, 0x707f,
+   &rtl2832u_props, "Genius TVGo DVB-T03", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, 0xd395,
+   &rtl2832u_props, "Peak DVB-T USB", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, USB_PID_SVEON_STV20_RTL2832U,
+   &rtl2832u_props, "Sveon STV20", NULL) },
+   { DVB_USB_DEVICE(USB_VID_KWORLD_2, USB_PID_SVEON_STV27,
+   &rtl2832u_props, "Sveon STV27", NULL) },
+
+   /* RTL2832P devices: */
{ DVB_USB_DEVICE(USB_VID_HANFTEK, 0x0131,
&rtl2832u_props, "Astrometa DVB-T2", NULL) },
{ }
diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c

Linux 3.14.8

2014-06-16 Thread Greg KH

I'm announcing the release of the 3.14.8 kernel.

All users of the 3.14 kernel series must upgrade.

The updated 3.14.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.14.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile  |2 +-
 arch/mips/include/asm/thread_info.h   |3 ++-
 drivers/ata/ahci.c|4 
 drivers/infiniband/ulp/isert/ib_isert.c   |8 
 drivers/media/dvb-core/dvb-usb-ids.h  |3 +++
 drivers/media/usb/dvb-usb-v2/rtl28xxu.c   |   15 ++-
 drivers/misc/mei/hw-me.c  |   25 +++--
 drivers/pci/msi.c |2 +-
 drivers/target/iscsi/iscsi_target.c   |1 +
 drivers/target/iscsi/iscsi_target_core.h  |1 +
 drivers/target/iscsi/iscsi_target_login.c |1 +
 drivers/target/iscsi/iscsi_target_tpg.c   |2 ++
 drivers/target/target_core_alua.c |9 +
 fs/attr.c |8 
 fs/inode.c|   10 +++---
 fs/namei.c|   11 ++-
 fs/xfs/xfs_ioctl.c|2 +-
 include/linux/capability.h|2 +-
 kernel/auditsc.c  |   27 ++-
 kernel/capability.c   |   20 
 net/ipv4/netfilter/nf_defrag_ipv4.c   |5 +++--
 21 files changed, 114 insertions(+), 47 deletions(-)

Alessandro Miceli (2):
  rtl28xxu: add [1b80:d39d] Sveon STV20
  rtl28xxu: add [1b80:d3af] Sveon STV27

Alexei Starovoitov (1):
  PCI/MSI: Fix memory leak in free_msi_irqs()

Andreas Schrägle (1):
  ahci: add PCI ID for Marvell 88SE91A0 SATA Controller

Andy Lutomirski (2):
  fs,userns: Change inode_capable to capable_wrt_inode_uidgid
  auditsc: audit_krule mask accesses need bounds checking

Brian Healy (1):
  rtl28xxu: add 1b80:d395 Peak DVB-T USB

Florian Westphal (1):
  netfilter: ipv4: defrag: set local_df flag on defragmented skb

Greg Kroah-Hartman (1):
  Linux 3.14.8

Jan Vcelak (2):
  rtl28xxu: add USB ID for Genius TVGo DVB-T03
  rtl28xxu: add chipset version comments into device list

Jérôme Carretero (1):
  ahci: Add Device ID for HighPoint RocketRaid 642L

Markos Chandras (1):
  MIPS: asm: thread_info: Add _TIF_SECCOMP flag

Nicholas Bellinger (2):
  iser-target: Fix multi network portal shutdown regression
  target: Allow READ_CAPACITY opcode in ALUA Standby access state

Sagi Grimberg (1):
  Target/iscsi,iser: Avoid accepting transport connections during stop stage

Till Dörges (1):
  rtl28xxu: add ID [0ccd:00b4] TerraTec NOXON DAB Stick (rev 3)

Tomas Winkler (3):
  mei: me: fix hw ready reset flow
  mei: me: drop harmful wait optimization
  mei: me: read H_CSR after asserting reset




Linux 3.10.44

2014-06-16 Thread Greg KH

I'm announcing the release of the 3.10.44 kernel.

All users of the 3.10 kernel series must upgrade.

The updated 3.10.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.10.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile |2 
 arch/arm/boot/dts/armada-xp-gp.dts   |2 
 arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts |2 
 drivers/ata/ahci.c   |4 +
 drivers/infiniband/ulp/isert/ib_isert.c  |8 +++
 drivers/misc/mei/hw-me.c |4 -
 drivers/net/ethernet/mellanox/mlx4/en_cq.c   |1 
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |6 --
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |1 
 drivers/scsi/megaraid/megaraid_sas.h |1 
 drivers/scsi/megaraid/megaraid_sas_base.c|5 +-
 drivers/target/iscsi/iscsi_target.c  |1 
 drivers/target/iscsi/iscsi_target_core.h |1 
 drivers/target/iscsi/iscsi_target_login.c|   51 +++-
 drivers/target/iscsi/iscsi_target_tpg.c  |9 ++-
 drivers/target/target_core_alua.c|9 +++
 drivers/target/target_core_configfs.c|5 ++
 fs/attr.c|8 +--
 fs/inode.c   |   10 ++--
 fs/namei.c   |   11 ++--
 include/linux/capability.h   |2 
 kernel/auditsc.c |   27 +++---
 kernel/capability.c  |   18 ++-
 mm/compaction.c  |   57 ++-
 net/ipv4/netfilter/nf_defrag_ipv4.c  |5 +-
 25 files changed, 151 insertions(+), 99 deletions(-)

Andreas Schrägle (1):
  ahci: add PCI ID for Marvell 88SE91A0 SATA Controller

Andy Lutomirski (2):
  fs,userns: Change inode_capable to capable_wrt_inode_uidgid
  auditsc: audit_krule mask accesses need bounds checking

Ben Collins (1):
  SCSI: megaraid: Use resource_size_t for PCI resources, not long

Chris Mason (1):
  mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll

Florian Westphal (1):
  netfilter: ipv4: defrag: set local_df flag on defragmented skb

Greg Kroah-Hartman (1):
  Linux 3.10.44

Jérôme Carretero (1):
  ahci: Add Device ID for HighPoint RocketRaid 642L

Nicholas Bellinger (3):
  iser-target: Fix multi network portal shutdown regression
  target: Allow READ_CAPACITY opcode in ALUA Standby access state
  target: Fix alua_access_state attribute OOPs for un-configured devices

Roland Dreier (1):
  iscsi-target: Fix wrong buffer / buffer overrun in 
iscsi_change_param_value()

Sagi Grimberg (1):
  Target/iscsi,iser: Avoid accepting transport connections during stop stage

Thomas Petazzoni (2):
  ARM: mvebu: fix NOR bus-width in Armada XP GP Device Tree
  ARM: mvebu: fix NOR bus-width in Armada XP OpenBlocks AX3 Device Tree

Tomas Winkler (1):
  mei: me: drop harmful wait optimization

Vlastimil Babka (3):
  mm: compaction: reset cached scanner pfn's before reading them
  mm: compaction: detect when scanners meet in isolate_freepages
  mm/compaction: make isolate_freepages start at pageblock boundary




Re: Linux 3.10.44

2014-06-16 Thread Greg KH


diff --git a/Makefile b/Makefile
index 9cf513828341..e55476c4aef0 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 10
-SUBLEVEL = 43
+SUBLEVEL = 44
 EXTRAVERSION =
 NAME = TOSSUG Baby Fish
 
diff --git a/arch/arm/boot/dts/armada-xp-gp.dts 
b/arch/arm/boot/dts/armada-xp-gp.dts
index 76db557adbe7..f97550420fcc 100644
--- a/arch/arm/boot/dts/armada-xp-gp.dts
+++ b/arch/arm/boot/dts/armada-xp-gp.dts
@@ -124,7 +124,7 @@
/* Device Bus parameters are required */
 
/* Read parameters */
-   devbus,bus-width= <8>;
+   devbus,bus-width= <16>;
devbus,turn-off-ps  = <6>;
devbus,badr-skew-ps = <0>;
devbus,acc-first-ps = <124000>;
diff --git a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts 
b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
index fdea75c73411..9746d0e7fcb4 100644
--- a/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
+++ b/arch/arm/boot/dts/armada-xp-openblocks-ax3-4.dts
@@ -152,7 +152,7 @@
/* Device Bus parameters are required */
 
/* Read parameters */
-   devbus,bus-width= <8>;
+   devbus,bus-width= <16>;
devbus,turn-off-ps  = <6>;
devbus,badr-skew-ps = <0>;
devbus,acc-first-ps = <124000>;
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4942058402a4..b0d33d9533aa 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -444,10 +444,14 @@ static const struct pci_device_id ahci_pci_tbl[] = {
  .driver_data = board_ahci_yes_fbs },  /* 88se9172 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9192),
  .driver_data = board_ahci_yes_fbs },  /* 88se9172 on 
some Gigabyte */
+   { PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x91a0),
+ .driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x91a3),
  .driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9230),
  .driver_data = board_ahci_yes_fbs },
+   { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642),
+ .driver_data = board_ahci_yes_fbs },
 
/* Promise */
{ PCI_VDEVICE(PROMISE, 0x3f20), board_ahci },   /* PDC42819 */
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index bae20f8bb034..14418022 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -382,6 +382,14 @@ isert_connect_request(struct rdma_cm_id *cma_id, struct 
rdma_cm_event *event)
struct ib_device *ib_dev = cma_id->device;
int ret = 0;
 
+   spin_lock_bh(&np->np_thread_lock);
+   if (!np->enabled) {
+   spin_unlock_bh(&np->np_thread_lock);
+   pr_debug("iscsi_np is not enabled, reject connect request\n");
+   return rdma_reject(cma_id, NULL, 0);
+   }
+   spin_unlock_bh(&np->np_thread_lock);
+
pr_debug("Entering isert_connect_request cma_id: %p, context: %p\n",
 cma_id, cma_id->context);
 
diff --git a/drivers/misc/mei/hw-me.c b/drivers/misc/mei/hw-me.c
index 1bf3f8b5ce3a..06311c5ada36 100644
--- a/drivers/misc/mei/hw-me.c
+++ b/drivers/misc/mei/hw-me.c
@@ -183,6 +183,7 @@ static void mei_me_hw_reset(struct mei_device *dev, bool 
intr_enable)
else
hcsr &= ~H_IE;
 
+   dev->recvd_hw_ready = false;
mei_me_reg_write(hw, H_CSR, hcsr);
 
if (dev->dev_state == MEI_DEV_POWER_DOWN)
@@ -233,10 +234,7 @@ static bool mei_me_hw_is_ready(struct mei_device *dev)
 static int mei_me_hw_ready_wait(struct mei_device *dev)
 {
int err;
-   if (mei_me_hw_is_ready(dev))
-   return 0;
 
-   dev->recvd_hw_ready = false;
mutex_unlock(&dev->device_lock);
err = wait_event_interruptible_timeout(dev->wait_hw_ready,
dev->recvd_hw_ready,
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c 
b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 1e6c594d6d04..58c18d3a4880 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -55,7 +55,6 @@ int mlx4_en_create_cq(struct mlx4_en_priv *priv,
 
cq->ring = ring;
cq->is_tx = mode;
-   spin_lock_init(&cq->lock);
 
err = mlx4_alloc_hwq_res(mdev->dev, &cq->wqres,
cq->buf_size, 2 * PAGE_SIZE);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 89c47ea84b50..063f3f4d4867 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1190,15

Re: Linux 3.4.94

2014-06-16 Thread Greg KH


diff --git a/Makefile b/Makefile
index 20f420096dfa..0864af4a683b 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 VERSION = 3
 PATCHLEVEL = 4
-SUBLEVEL = 93
+SUBLEVEL = 94
 EXTRAVERSION =
 NAME = Saber-toothed Squirrel
 
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 6524c6e21896..694aeedcbf88 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -67,9 +67,11 @@ LDFLAGS_vmlinux-y := -Bstatic
 LDFLAGS_vmlinux-$(CONFIG_RELOCATABLE) := -pie
 LDFLAGS_vmlinux:= $(LDFLAGS_vmlinux-y)
 
+asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
+
 CFLAGS-$(CONFIG_PPC64) := -mminimal-toc -mtraceback=no -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC32) := -ffixed-r2 -mmultiple
-KBUILD_CPPFLAGS+= -Iarch/$(ARCH)
+KBUILD_CPPFLAGS+= -Iarch/$(ARCH) $(asinstr)
 KBUILD_AFLAGS  += -Iarch/$(ARCH)
 KBUILD_CFLAGS  += -msoft-float -pipe -Iarch/$(ARCH) $(CFLAGS-y)
 CPP= $(CC) -E $(KBUILD_CFLAGS)
diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index 50f73aa2ba21..6f5a837431e9 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -294,11 +294,16 @@ n:
  *  ld rY,ADDROFF(name)(rX)
  */
 #ifdef __powerpc64__
+#ifdef HAVE_AS_ATHIGH
+#define __AS_ATHIGH high
+#else
+#define __AS_ATHIGH h
+#endif
 #define LOAD_REG_IMMEDIATE(reg,expr)   \
lis (reg),(expr)@highest;   \
ori (reg),(reg),(expr)@higher;  \
rldicr  (reg),(reg),32,31;  \
-   oris(reg),(reg),(expr)@h;   \
+   oris(reg),(reg),(expr)@__AS_ATHIGH; \
ori (reg),(reg),(expr)@l;
 
 #define LOAD_REG_ADDR(reg,name)\
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index ccd7b6711196..0e87baf8fcc2 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -441,6 +441,8 @@ static const struct pci_device_id ahci_pci_tbl[] = {
  .driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(0x1b4b, 0x9230),
  .driver_data = board_ahci_yes_fbs },
+   { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642),
+ .driver_data = board_ahci_yes_fbs },
 
/* Promise */
{ PCI_VDEVICE(PROMISE, 0x3f20), board_ahci },   /* PDC42819 */
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c 
b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 00b81272e314..174b622dcaef 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -55,7 +55,6 @@ int mlx4_en_create_cq(struct mlx4_en_priv *priv,
 
cq->ring = ring;
cq->is_tx = mode;
-   spin_lock_init(&cq->lock);
 
err = mlx4_alloc_hwq_res(mdev->dev, &cq->wqres,
cq->buf_size, 2 * PAGE_SIZE);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 31b455a49273..467a51171d47 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -370,15 +370,11 @@ static void mlx4_en_netpoll(struct net_device *dev)
 {
struct mlx4_en_priv *priv = netdev_priv(dev);
struct mlx4_en_cq *cq;
-   unsigned long flags;
int i;
 
for (i = 0; i < priv->rx_ring_num; i++) {
cq = &priv->rx_cq[i];
-   spin_lock_irqsave(&cq->lock, flags);
-   napi_synchronize(&cq->napi);
-   mlx4_en_process_rx_cq(dev, cq, 0);
-   spin_unlock_irqrestore(&cq->lock, flags);
+   napi_schedule(&cq->napi);
}
 }
 #endif
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d69fee41f24a..8df3c4be3ff1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -301,7 +301,6 @@ struct mlx4_en_cq {
struct mlx4_cq  mcq;
struct mlx4_hwq_resources wqres;
int ring;
-   spinlock_t  lock;
struct net_device  *dev;
struct napi_struct  napi;
/* Per-core Tx cq processing support */
diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
b/drivers/scsi/megaraid/megaraid_sas.h
index e5f416f8042d..1a7955a39070 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -1294,7 +1294,6 @@ struct megasas_instance {
u32 *reply_queue;
dma_addr_t reply_queue_h;
 
-   unsigned long base_addr;
struct megasas_register_set __iomem *reg_set;
 
struct megasas_pd_list  pd_list[MEGASAS_MAX_PD];
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
b/drivers/scsi/megaraid/megaraid_sas_base.c
index 79261628d067..618870033dd0 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3445,6 +3445,7 @@ static int megasas_init_fw(struct megasas_instance 
*instance)
u32 max_sectors_1;
u32

Linux 3.4.94

2014-06-16 Thread Greg KH

I'm announcing the release of the 3.4.94 kernel.

All users of the 3.4 kernel series must upgrade.

The updated 3.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.4.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

thanks,

greg k-h



 Makefile   |2 -
 arch/powerpc/Makefile  |4 ++-
 arch/powerpc/include/asm/ppc_asm.h |7 +-
 drivers/ata/ahci.c |2 +
 drivers/net/ethernet/mellanox/mlx4/en_cq.c |1 
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |6 -
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |1 
 drivers/scsi/megaraid/megaraid_sas.h   |1 
 drivers/scsi/megaraid/megaraid_sas_base.c  |5 ++--
 drivers/staging/zram/zram_sysfs.c  |   21 ++-
 kernel/auditsc.c   |   27 -
 net/ipv4/netfilter/nf_defrag_ipv4.c|5 ++--
 12 files changed, 53 insertions(+), 29 deletions(-)

Andy Lutomirski (1):
  auditsc: audit_krule mask accesses need bounds checking

Ben Collins (1):
  SCSI: megaraid: Use resource_size_t for PCI resources, not long

Chris Mason (1):
  mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll

Florian Westphal (1):
  netfilter: ipv4: defrag: set local_df flag on defragmented skb

Greg Kroah-Hartman (1):
  Linux 3.4.94

Guenter Roeck (1):
  powerpc: Fix 64 bit builds with binutils 2.24

Jiang Liu (1):
  zram: protect sysfs handler from invalid memory access

Jérôme Carretero (1):
  ahci: Add Device ID for HighPoint RocketRaid 642L

Rashika Kheria (1):
  Staging: zram: Fix memory leak by refcount mismatch




Re: [PATCH -next 12/26] intel: Use dma_zalloc_coherent

2014-06-16 Thread Jeff Kirsher

On Sun, 2014-06-15 at 13:37 -0700, Joe Perches wrote:
> Use the zeroing function instead of dma_alloc_coherent & memset(,0,)
> 
> Signed-off-by: Joe Perches 

Acked-by: Jeff Kirsher 

> ---
>  drivers/net/ethernet/intel/ixgb/ixgb_main.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)





Re: [PATCH 7/7] mincore: apply page table walker on do_mincore()

2014-06-16 Thread Sasha Levin

On 06/16/2014 12:44 PM, Naoya Horiguchi wrote:
> Hi Sasha,
> 
> Thanks for the bug report.
> 
> On Mon, Jun 16, 2014 at 11:24:16AM -0400, Sasha Levin wrote:
>> On 06/06/2014 06:58 PM, Naoya Horiguchi wrote:
>>> This patch makes do_mincore() use walk_page_vma(), which reduces many lines
>>> of code by using common page table walk code.
>>>
>>> Signed-off-by: Naoya Horiguchi 
>>
>> Hi Naoya,
>>
>> This patch is causing a few issues on -next:
>>
>> [  367.679282] BUG: sleeping function called from invalid context at 
>> mm/mincore.c:37
> 
> cond_resched() in mincore_hugetlb() triggered this. This is done in common
> pagewalk code, so I should have removed it.
> 
> ...
>> And:
>>
>> [  391.118663] BUG: unable to handle kernel paging request at 880142aca000
>> [  391.118663] IP: mincore_hole (mm/mincore.c:99 (discriminator 2))
> 
> walk->pte_hole cannot assume walk->vma != NULL, so I should've checked it
> in mincore_hole() before using walk->vma.
> 
> Could you try the following fixes?

That solved those two, but I'm seeing new ones:

[  650.352956] BUG: unable to handle kernel paging request at 8802fdf03000
[  650.352956] IP: mincore_hole (mm/mincore.c:101 (discriminator 2))
[  650.352956] PGD 23bcd067 PUD 704b48067 PMD 704958067 PTE 8002fdf03060
[  650.352956] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[  650.352956] Dumping ftrace buffer:
[  650.352956](ftrace buffer empty)
[  650.352956] Modules linked in:
[  650.352956] CPU: 12 PID: 15403 Comm: trinity-c363 Tainted: GW 
3.15.0-next-20140616-sasha-00025-g0fd1f7d-dirty #657
[  650.352956] task: 88027caf3000 ti: 880279d5c000 task.ti: 
880279d5c000
[  650.352956] RIP: mincore_hole (mm/mincore.c:101 (discriminator 2))
[  650.352956] RSP: 0018:880279d5fd48  EFLAGS: 00010202
[  650.352956] RAX: 0001 RBX: 7f244540 RCX: 
[  650.352956] RDX:  RSI: 7f244540 RDI: 7f244520
[  650.352956] RBP: 880279d5fd88 R08: 0001 R09: 88000100
[  650.352956] R10: 0001 R11: 7f2444126000 R12: 7f248000
[  650.352956] R13: 8802fdf03000 R14: 0200 R15: 8804e32f2000
[  650.352956] FS:  7f24899ec700() GS:8802ff00() 
knlGS:
[  650.352956] CS:  0010 DS:  ES:  CR0: 80050033
[  650.352956] CR2: 8802fdf03000 CR3: 00027c39b000 CR4: 06a0
[  650.352956] DR0: 006df000 DR1:  DR2: 
[  650.352956] DR3:  DR6: 0ff0 DR7: 0600
[  650.352956] Stack:
[  650.352956]  880279d5fef0   
7f244540
[  650.352956]  7f248000 880279d5fef0 7f244520 
8804e30fd148
[  650.352956]  880279d5fe38 9c2d1c4d 8802 
9c1a0038
[  650.352956] Call Trace:
[  650.352956] walk_pgd_range (mm/pagewalk.c:73 mm/pagewalk.c:141 
mm/pagewalk.c:170)
[  650.352956] ? preempt_count_sub (kernel/sched/core.c:2602)
[  650.352956] __walk_page_range (mm/pagewalk.c:264)
[  650.352956] ? SyS_mincore (mm/mincore.c:160 mm/mincore.c:244 
mm/mincore.c:212)
[  650.352956] walk_page_vma (mm/pagewalk.c:376)
[  650.352956] SyS_mincore (mm/mincore.c:177 mm/mincore.c:244 mm/mincore.c:212)
[  650.352956] ? mincore_hugetlb (mm/mincore.c:143)
[  650.352956] ? mincore_hole (mm/mincore.c:109)
[  650.352956] ? mincore_page (mm/mincore.c:87)
[  650.352956] ? copy_page_range (mm/mincore.c:24)
[  650.352956] tracesys (arch/x86/kernel/entry_64.S:542)
[ 650.352956] Code: 87 a0 00 00 00 48 83 c3 01 48 8b b8 f8 01 00 00 e8 ab fe ff 
ff 48 8b 55 c8 88 02 49 63 c4 49 39 c6 77 cd eb 14 0f 1f 00 83 c0 01 <41> c6 44 
15 00 00 48 63 d0 49 39 d6 77 ef 48 8b 55 c0 4c 8b 6a
All code

   0:   87 a0 00 00 00 48       xchg   %esp,0x4800(%rax)
   6:   83 c3 01                add    $0x1,%ebx
   9:   48 8b b8 f8 01 00 00    mov    0x1f8(%rax),%rdi
  10:   e8 ab fe ff ff          callq  0xfec0
  15:   48 8b 55 c8             mov    -0x38(%rbp),%rdx
  19:   88 02                   mov    %al,(%rdx)
  1b:   49 63 c4                movslq %r12d,%rax
  1e:   49 39 c6                cmp    %rax,%r14
  21:   77 cd                   ja     0xfff0
  23:   eb 14                   jmp    0x39
  25:   0f 1f 00                nopl   (%rax)
  28:   83 c0 01                add    $0x1,%eax
  2b:*  41 c6 44 15 00 00       movb   $0x0,0x0(%r13,%rdx,1)    <-- trapping instruction
  31:   48 63 d0                movslq %eax,%rdx
  34:   49 39 d6                cmp    %rdx,%r14
  37:   77 ef                   ja     0x28
  39:   48 8b 55 c0             mov    -0x40(%rbp),%rdx
  3d:   4c 8b 6a 00             mov    0x0(%rdx),%r13

Code starting with the faulting

Re: [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry

2014-06-16 Thread Vivek Goyal

On Mon, Jun 16, 2014 at 10:57:43PM +0200, Borislav Petkov wrote:
> On Mon, Jun 16, 2014 at 04:06:08PM -0400, Vivek Goyal wrote:
> > There can be more than one loader and the one which claims first
> > to recognize the image will get to load the image. So once 32 bit
> > loader support comes in, it might happen that we ask 64bit loader
> > first and it rejects the image and then we ask 32bit loader.
> 
> What does that have to do with anything??

Say down the line you support 3 types of kernel images: 64bit bzImage,
32bit bzImage and ELF vmlinux. And there are 3 different loader
implementations in the kernel. Now assume the user is trying to load an ELF
vmlinux image.

Generic code will call:

arch_kexec_kernel_image_probe {
	for (i = 0; i < nr_file_types; i++) {
		if (!kexec_file_type[i].probe)
			continue;

		ret = kexec_file_type[i].probe(buf, buf_len);
		if (!ret) {
			image->file_handler_idx = i;
			return ret;
		}
	}
	return ret;
}

This code calls into every registered loader and if nobody is ready to
load the image it returns an error. Say the bzImage64 and bzImage32
loaders are called first. They will both reject the image and the vmlinux
loader will accept it.

Do we want to show all the rejection messages from the bzImage64 and bzImage32
loaders? It might be too verbose to show users that, before the vmlinux loader
accepted the image, other loaders on this arch rejected it.

This is very similar to binary file loading. Look at load_elf_binary(). If
it does not find an ELF header it knows it is not an ELF binary and
returns -ENOEXEC without outputting any messages.
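
To make the probing order concrete, here is a rough userspace sketch of the
same idea: each loader gets a silent probe callback and the first one that
returns 0 wins. The struct, the loader names and the "BZ64" magic are made
up for illustration; only the "\177ELF" magic and -ENOEXEC are real.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader descriptor; the names are illustrative only. */
struct file_loader {
	const char *name;
	/* Returns 0 if the loader recognizes the image, -ENOEXEC otherwise. */
	int (*probe)(const char *buf, size_t len);
};

static int probe_bzimage64(const char *buf, size_t len)
{
	/* Made-up magic for the sketch, not the real bzImage header check. */
	return (len >= 4 && memcmp(buf, "BZ64", 4) == 0) ? 0 : -ENOEXEC;
}

static int probe_elf(const char *buf, size_t len)
{
	/* The ELF magic is "\177ELF". */
	return (len >= 4 && memcmp(buf, "\177ELF", 4) == 0) ? 0 : -ENOEXEC;
}

static const struct file_loader loaders[] = {
	{ "bzImage64", probe_bzimage64 },
	{ "elf-vmlinux", probe_elf },
};

/*
 * Ask each loader in turn; rejections stay silent. Returns the index of
 * the first loader that accepts the image, or -ENOEXEC if none does.
 */
static int image_probe(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(loaders) / sizeof(loaders[0]); i++) {
		if (!loaders[i].probe)
			continue;
		if (loaders[i].probe(buf, len) == 0)
			return (int)i;
	}
	return -ENOEXEC;
}
```

With this shape, an ELF image is rejected quietly by the first loader and
accepted by the second, and only the final "nobody accepted it" result needs
to be reported to the caller.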

> 
> > So these message are really debug message which tells why loader
> > is not accepting an image. It might not be image destined for that
> > loader at all.
> > 
> > pr_debug() allows being verbose if user wants to for debugging purposes.
> > You just have to make sure that CONFIG_DYNAMIC_DEBUG=y and enable verbosity
> > in individual file.
> > 
> > echo 'file kexec-bzimage.c +p' > /sys/kernel/debug/dynamic_debug/control
> 
> So people are supposed to enable dynamic_debug just so that they see
> *why* their image doesn't load.
> 
> Doesn't sound optimal to me.

This is one way of doing it. I can change it if you think that displaying
messages from all the loaders is fine.

> 
> > Same here. We will potentially be trying multiple loaders and if every
> > loader prints messages for rejection by default, it is too much of
> > info, IMO.
> 
> For max two loaders on one architecture? I don't think so. Now you're
> just arguing for the sake of it.

Well, we have 3 loaders in user space currently for x86_64. bzImage64,
bzImage32 and ELF vmlinux. So one would think that somebody might
go about implementing these in kernel space too.

> 
> > I like doing memory allocations early in the functions (as far as
> > possible) and error out if need be. If memory is available to begin
> > with for all the data structures needed by this function, it is kind
> > of pointless to do rest of the processing.
> 
> We're talking about memory for a single void * which is ridiculous. And
> I think simplifying the error paths is a much higher win than doing some
> minor allocation.

Ok, I will change it.

> 
> > Hmm..., If you feel strongly about it, I can make this change. I
> > thought I just made it easier to share the code between 32bit and
> > 64bit by this.
> 
> Someone later can do that - right now this code is 64-bit only as far as
> we're concerned and if it can be made to work on 32-bit, then people are
> free to do so.

Ok.

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] Turn off -Wmaybe-uninitialized completely and move it to W=1

2014-06-16 Thread Sam Ravnborg

On Mon, Jun 16, 2014 at 03:20:45PM +0200, Borislav Petkov wrote:
> Hi,
> 
> so 3.16-rc1 adds this false positive from gcc, see below (triggers on
> 4.9 and 4.8.2). Now, it is firing wrong and gcc people tell me there's
> no way for the compiler to know that the "from" and "to" values will NOT
> be used in the error case, i.e. thus the "maybe" aspect.
> 
> So, we've disabled it for -Os already:
> 
>   e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os")
> 
> maybe we want to disable it by default on all and move it to W=1. This
> way people can still have it fire but not by default. And from what I've
> seen so far, it is mostly firing wrong and it is becoming annoying.
> 
> So what do people think, any reasons for keeping it enabled by default?
Agreed.
The noise ratio is too high - so move it to W=1.

Sam

Re: [RFC PATCH 00/13][V3] kexec: A new system call to allow in kernel loading

2014-06-16 Thread Borislav Petkov

On Tue, Jun 03, 2014 at 09:06:49AM -0400, Vivek Goyal wrote:
> This patch series does not do kernel signature verification yet. I
> plan to post another patch series for that. Now bzImage is already
> signed with PKCS7 signature I plan to parse and verify those
> signatures.

Btw, do you have a brief outline on how you are going to do the
extension to signature verification? Nothing formal, just enough of an
outline that I can see where in the flow it will be plugged in.

I was wondering how the whole signature signing and verification will
be done, i.e., where do I get the signature, how and who will verify it
(I'm guessing the purgatory code), etc, etc.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

[PATCH] mfd: cros_ec_spi: set wakeup capability

2014-06-16 Thread Doug Anderson

From: Prathyush K 

Set the device as wakeup capable and register the wakeup source.

Note: Though it makes more sense to have the SPI framework do this,
(either via device tree or by board_info)
this change is as per an existing mail chain:
https://lkml.org/lkml/2009/8/27/291

Signed-off-by: Prathyush K 
Signed-off-by: Doug Anderson 
---
Note that I don't have suspend/resume actually working upstream, but I
see that /sys/bus/spi/drivers/cros-ec-spi/spi2.0/power/wakeup exists
with this patch and doesn't exist without it.

 drivers/mfd/cros_ec_spi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 0b8d328..ef22dd5 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -385,6 +385,8 @@ static int cros_ec_spi_probe(struct spi_device *spi)
return err;
}
 
+   device_init_wakeup(&spi->dev, true);
+
return 0;
 }
 
-- 
2.0.0.526.g5318336


Re: [PATCH 1/1] PM / Runtime: let rpm_resume fail if rpm disabled and device suspended.

2014-06-16 Thread Rafael J. Wysocki

On Monday, June 16, 2014 01:40:05 PM Alan Stern wrote:
> On Sat, 14 Jun 2014, Allen Yu wrote:
> 
> > --- a/drivers/base/power/runtime.c
> > +++ b/drivers/base/power/runtime.c
> > @@ -608,7 +608,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
> >   repeat:
> > if (dev->power.runtime_error)
> > retval = -EINVAL;
> > -   else if (dev->power.disable_depth == 1 && dev->power.is_suspended
> > +   else if (dev->power.disable_depth == 1 && !dev->power.is_suspended
> > && dev->power.runtime_status == RPM_ACTIVE)
> > retval = 1;
> 
> For reasons having nothing to do with Allen's suggested change, I
> wonder if we shouldn't replace this line with something like:
> 
> - else if (dev->power.disable_depth == 1 && dev->power.is_suspended
> + else if (dev->power.disable_depth > 0 && !dev->power.is_suspended
>   && dev->power.runtime_status == RPM_ACTIVE)
>   retval = 1;
> 
> It seems that I've been bitten by this several times in the past.  
> When a device is disabled for runtime PM, and more or less permanently
> stuck in the RPM_ACTIVE state, calls to pm_runtime_resume() or
> pm_runtime_get_sync() shouldn't fail.
> 
> For example, suppose some devices of a certain type support runtime 
> power management but others don't.  We naturally want to call 
> pm_runtime_disable() for the ones that don't.  But we also want the 
> same driver to work for all the devices, which means that 
> pm_runtime_get_sync() should return success -- otherwise the driver 
> will think that something has gone wrong.
> 
> Rafael, what do you think?

That condition is there specifically to take care of the system suspend
code path.  It means that if runtime PM is disabled, but it only has been
disabled by the system suspend code path, we should treat the device as
"active" (ie. return 1).  That won't work after the proposed change.

I guess drivers that want to work with devices where runtime PM may be
disabled can just check the return value of rpm_resume() for -EACCES?
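
As a rough illustration of that suggestion, the sketch below models the
early checks in userspace. The struct and function names (pm_model,
rpm_resume_check, driver_get) are hypothetical stand-ins, not kernel API,
and the -EACCES branch for a non-zero disable_depth is an assumption that
follows the remark above.

```c
#include <errno.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

/* Userspace stand-in for the few dev->power fields the check reads. */
struct pm_model {
	int runtime_error;
	int disable_depth;
	bool is_suspended;
	enum rpm_status runtime_status;
};

/* Models only the early checks discussed above, not the resume itself. */
static int rpm_resume_check(const struct pm_model *p)
{
	if (p->runtime_error)
		return -EINVAL;
	if (p->disable_depth == 1 && p->is_suspended &&
	    p->runtime_status == RPM_ACTIVE)
		return 1;	/* disabled only by system suspend: active */
	if (p->disable_depth > 0)
		return -EACCES;	/* runtime PM disabled for this device */
	return 0;
}

/* Hypothetical driver-side policy: tolerate devices without runtime PM. */
static int driver_get(const struct pm_model *p)
{
	int ret = rpm_resume_check(p);

	if (ret == -EACCES && p->runtime_status == RPM_ACTIVE)
		return 0;	/* runtime PM is off, but the device is powered */
	return ret < 0 ? ret : 0;
}
```

A driver shared between devices with and without runtime PM would then
treat -EACCES on an active device as success and still propagate real
errors such as -EINVAL.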

Rafael


Re: [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load

2014-06-16 Thread Borislav Petkov

On Mon, Jun 16, 2014 at 04:53:31PM -0400, Vivek Goyal wrote:
> Kdump kernel uses a different command line. It adds extra command line
> options to currently running kernels.
> 
> Till recent past we used to pass new kernel's memory map using command
> line "memmap=" and when command line size was 256, we could easily exhaust
> command line on large machines.
> 
> Now we support 2048 and we have not seen that issue and now we have
> moved to passing memory ranges in bootparams so that issue does not
> exist. But kernel still does allow passing memmap= on command line.
> 
> One can do same thing using kexec too.
> 
> Agreed that it is a very corner case use case. Now we can say that we
> will not support it. I am fine with that but I at least wanted a discussion
> and common understanding of what the new syscall will support and what it
> will not.
> 
> Some arches still seem to have COMMAND_LINE_SIZE 256. They are more
> likely to hit this scenario at some point in time.
> 
> Given the fact you feel so strongly about putting in this upper limit, I will
> introduce it. And put a comment that if the kernel we are kexecing into
> supports a longer command line, then we will not support that size and one
> needs to upgrade the first kernel.

Nah, I don't feel strongly about it - I just don't trust userspace and
think that every value we get from it should be "sanitized".

But if you say that you want to be able to pass bigger command line to
2nd kernel because this is how kexec passes info, then I'm fine with it.
This is actually a very valid use case which I was asking for, thanks!

I guess if a malicious user goes to lengths to manipulate
header->cmdline_size just so that kmalloc still succeeds and we're fine
with it, then I certainly don't have anything against it. I.e., if the user
really wants to shoot himself in the foot, he can.

So it is a good thing we talked about it then. :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

[PATCH] mm: page_alloc: simplify drain_zone_pages by using min()

2014-06-16 Thread Michal Nazarewicz

Instead of open-coding the minimum of two values, just use the min macro.
That is what it is there for.  While changing the function, also change
the type of the batch local variable to match the type of
per_cpu_pages::batch (which is int).

Signed-off-by: Michal Nazarewicz 
---
 mm/page_alloc.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5dba293..26aa003 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1224,15 +1224,11 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 {
unsigned long flags;
-   int to_drain;
-   unsigned long batch;
+   int to_drain, batch;
 
local_irq_save(flags);
batch = ACCESS_ONCE(pcp->batch);
-   if (pcp->count >= batch)
-   to_drain = batch;
-   else
-   to_drain = pcp->count;
+   to_drain = min(pcp->count, batch);
if (to_drain > 0) {
free_pcppages_bulk(zone, to_drain, pcp);
pcp->count -= to_drain;
-- 
2.0.0.526.g5318336


[PATCH] include: kernel.h: rewrite min3, max3 and clamp using min and max

2014-06-16 Thread Michal Nazarewicz

It appears that gcc is better at optimising a double call to min
and max rather than open coded min3 and max3.  This can be observed
here:

$ cat min-max.c
#define min(x, y) ({ \
	typeof(x) _min1 = (x); \
	typeof(y) _min2 = (y); \
	(void) (&_min1 == &_min2); \
	_min1 < _min2 ? _min1 : _min2; })
#define min3(x, y, z) ({ \
	typeof(x) _min1 = (x); \
	typeof(y) _min2 = (y); \
	typeof(z) _min3 = (z); \
	(void) (&_min1 == &_min2); \
	(void) (&_min1 == &_min3); \
	_min1 < _min2 ? (_min1 < _min3 ? _min1 : _min3) : \
		(_min2 < _min3 ? _min2 : _min3); })

int fmin3(int x, int y, int z) { return min3(x, y, z); }
int fmin2(int x, int y, int z) { return min(min(x, y), z); }

$ gcc -O2 -o min-max.s -S min-max.c; cat min-max.s
	.file	"min-max.c"
	.text
	.p2align 4,,15
	.globl	fmin3
	.type	fmin3, @function
fmin3:
.LFB0:
	.cfi_startproc
	cmpl	%esi, %edi
	jl	.L5
	cmpl	%esi, %edx
	movl	%esi, %eax
	cmovle	%edx, %eax
	ret
	.p2align 4,,10
	.p2align 3
.L5:
	cmpl	%edi, %edx
	movl	%edi, %eax
	cmovle	%edx, %eax
	ret
	.cfi_endproc
.LFE0:
	.size	fmin3, .-fmin3
	.p2align 4,,15
	.globl	fmin2
	.type	fmin2, @function
fmin2:
.LFB1:
	.cfi_startproc
	cmpl	%edi, %esi
	movl	%edx, %eax
	cmovle	%esi, %edi
	cmpl	%edx, %edi
	cmovle	%edi, %eax
	ret
	.cfi_endproc
.LFE1:
	.size	fmin2, .-fmin2
	.ident	"GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
	.section	.note.GNU-stack,"",@progbits

The fmin3 function, which uses the open-coded min3 macro, is compiled into
a total of ten instructions including a conditional branch, whereas the
fmin2 function, which uses two calls to the min macro, is compiled into six
instructions with no branches.
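
For illustration, one way to express min3 in terms of min is sketched
below. This is only a sketch under stated assumptions: the cast on the
inner result is there to keep the type check in min happy, typeof is
spelled __typeof__ so the snippet also builds with -std=c11, and
fmin3_nested is a made-up name.

```c
/* Type-checked min(), mirroring the min-max.c listing (GNU C extensions). */
#define min(x, y) ({ \
	__typeof__(x) _min1 = (x); \
	__typeof__(y) _min2 = (y); \
	(void) (&_min1 == &_min2); \
	_min1 < _min2 ? _min1 : _min2; })

/* Sketch: three-way minimum as two nested two-way minimums. */
#define min3(x, y, z) min((__typeof__(x))min(x, y), z)

/* Hypothetical test function, analogous to fmin2 in the listing. */
int fmin3_nested(int x, int y, int z)
{
	return min3(x, y, z);
}
```

The inner expansion declares its own _min1 and _min2 inside a nested
statement expression, so the shadowing is harmless, although -Wshadow
would warn about it.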

Similarly, open-coded clamp produces code equivalent to clamp built from
the min and max macros, but the latter definition is much shorter:

$ cat clamp.c
#define clamp(val, min, max) ({ \
	typeof(val) __val = (val); \
	typeof(min) __min = (min); \
	typeof(max) __max = (max); \
	(void) (&__val == &__min); \
	(void) (&__val == &__max); \
	__val = __val < __min ? __min: __val; \
	__val > __max ? __max: __val; })
#define min(x, y) ({ \
	typeof(x) _min1 = (x); \
	typeof(y) _min2 = (y); \
	(void) (&_min1 == &_min2); \
	_min1 < _min2 ? _min1 : _min2; })
#define max(x, y) ({ \
	typeof(x) _max1 = (x); \
	typeof(y) _max2 = (y); \
	(void) (&_max1 == &_max2); \
	_max1 > _max2 ? _max1 : _max2; })

int fclamp(int v, int min, int max) { return clamp(v, min, max); }
int fclampmm(int v, int min, int max) { return min(max(v, min), max); }

$ gcc -O2 -o clamp.s -S clamp.c; cat clamp.s
	.file	"clamp.c"
	.text
	.p2align 4,,15
	.globl	fclamp
	.type	fclamp, @function
fclamp:
.LFB0:
	.cfi_startproc
	cmpl	%edi, %esi
	movl	%edx, %eax
	cmovge	%esi, %edi
	cmpl	%edx, %edi
	cmovle	%edi, %eax
	ret
	.cfi_endproc
.LFE0:
	.size	fclamp, .-fclamp
	.p2align 4,,15
	.globl	fclampmm
	.type	fclampmm, @function
fclampmm:
.LFB1:
	.cfi_startproc
	cmpl	%edi, %esi
	cmovge	%esi, %edi
	cmpl	%edi, %edx
	movl	%edi, %eax
	cmovle	%edx, %eax
	ret
	.cfi_endproc
.LFE1:
	.size	fclampmm, .-fclampmm
	.ident	"GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
	.section	.note.GNU-stack,"",@progbits
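
As a sketch of what the shorter clamp could look like when written as a
macro rather than inline in fclampmm, assuming the same type-checked min
and max as in clamp.c (parameter names lo and hi and the function name
fclamp_nested are made up here, and typeof is spelled __typeof__ so it
also builds with -std=c11):

```c
/* Type-checked min()/max(), mirroring the clamp.c listing (GNU C). */
#define min(x, y) ({ \
	__typeof__(x) _min1 = (x); \
	__typeof__(y) _min2 = (y); \
	(void) (&_min1 == &_min2); \
	_min1 < _min2 ? _min1 : _min2; })
#define max(x, y) ({ \
	__typeof__(x) _max1 = (x); \
	__typeof__(y) _max2 = (y); \
	(void) (&_max1 == &_max2); \
	_max1 > _max2 ? _max1 : _max2; })

/* Sketch: raise val to at least lo, then cap the result at hi. */
#define clamp(val, lo, hi) min((__typeof__(val))max(val, lo), hi)

/* Hypothetical test function, analogous to fclampmm in the listing. */
int fclamp_nested(int v, int lo, int hi)
{
	return clamp(v, lo, hi);
}
```

Like the open-coded version, this assumes lo <= hi; with swapped bounds the
result is simply hi.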

Furthermore, after “make allmodconfig && make bzImage modules” this is the
comparison of image and modules sizes:

# Without this patch applied
$ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
350715800

# With this patch applied
$ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
349856528

The above builds were done on:

$ uname -a; gcc --version
Linux mpn-glaptop 3.13.0-29-generic #53~precise1-Ubuntu SMP Wed Jun 4 
22:06:25 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.

Re: [PATCH v3 2/2] Add support for Compact (Bluetooth|USB) keyboard with Trackpoint

2014-06-16 Thread Antonio Ospite

On Sun, 15 Jun 2014 21:39:50 +0100
Jamie Lentin  wrote:

This one does not compile on 3.15, see below.

Maybe you can take the chance to split the series in 4 patches:
1. rename only
2. cleanup of already existing code
3. preparatory changes to support multiple devices (the original
   1/2)
4. support for the Compact keyboard (the original 2/2).

but two patches are fine too as long as the important issues are sorted
out.

> Signed-off-by: Jamie Lentin 
> ---
>  drivers/hid/Kconfig  |   2 +
>  drivers/hid/hid-core.c   |   2 +
>  drivers/hid/hid-ids.h|   2 +
>  drivers/hid/hid-lenovo.c | 202 
> +++
>  include/linux/hid.h  |   1 +
>  5 files changed, 209 insertions(+)
> 
> diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
> index dd07d59..48b4777 100644
> --- a/drivers/hid/Kconfig
> +++ b/drivers/hid/Kconfig
> @@ -334,6 +334,8 @@ config HID_LENOVO
>   Thinkpad standalone keyboards, e.g:
>   - ThinkPad USB Keyboard with TrackPoint (supports extra LEDs and 
> trackpoint
> configuration)
> + - ThinkPad Compact Bluetooth Keyboard with TrackPoint (supports Fn keys)
> + - ThinkPad Compact USB Keyboard with TrackPoint (supports Fn keys)
>  
>  config HID_LOGITECH
>   tristate "Logitech devices" if EXPERT
> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index e8ce932..bce37c3 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -1738,6 +1738,8 @@ static const struct hid_device_id 
> hid_have_special_driver[] = {
>   { HID_USB_DEVICE(USB_VENDOR_ID_LCPOWER, USB_DEVICE_ID_LCPOWER_LC1000 ) 
> },
>  #if IS_ENABLED(CONFIG_HID_LENOVO)
>   { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_TPKBD) },
> + { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_CUSBKBD) },
> + { HID_BLUETOOTH_DEVICE(USB_VENDOR_ID_LENOVO, 
> USB_DEVICE_ID_LENOVO_CBTKBD) },
>  #endif
>   { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_MX3000_RECEIVER) 
> },
>   { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_S510_RECEIVER) },
> diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
> index 6e12cd0..1763a07 100644
> --- a/drivers/hid/hid-ids.h
> +++ b/drivers/hid/hid-ids.h
> @@ -551,6 +551,8 @@
>  
>  #define USB_VENDOR_ID_LENOVO 0x17ef
>  #define USB_DEVICE_ID_LENOVO_TPKBD   0x6009
> +#define USB_DEVICE_ID_LENOVO_CUSBKBD 0x6047
> +#define USB_DEVICE_ID_LENOVO_CBTKBD  0x6048

One TAB is enough here to align the second entry.

>  
>  #define USB_VENDOR_ID_LG 0x1fd2
>  #define USB_DEVICE_ID_LG_MULTITOUCH  0x0064
> diff --git a/drivers/hid/hid-lenovo.c b/drivers/hid/hid-lenovo.c
> index aabf084..0a19f84 100644
> --- a/drivers/hid/hid-lenovo.c
> +++ b/drivers/hid/hid-lenovo.c
> @@ -1,8 +1,11 @@
>  /*
>   *  HID driver for Lenovo:-
>   *  - ThinkPad USB Keyboard with TrackPoint (tpkbd)
> + *  - ThinkPad Compact Bluetooth Keyboard with TrackPoint (cptkbd)
> + *  - ThinkPad Compact USB Keyboard with TrackPoint (cptkbd)
>   *
>   *  Copyright (c) 2012 Bernhard Seibold
> + *  Copyright (c) 2014 Jamie Lentin 
>   */
>  
>  /*
> @@ -33,6 +36,10 @@ struct lenovo_drvdata_tpkbd {
>   int press_speed;
>  };
>  
> +struct lenovo_drvdata_cptkbd {
> + unsigned int fn_lock;

bool? 

> +};
> +
>  #define map_key_clear(c) hid_map_usage_clear(hi, usage, bit, max, EV_KEY, 
> (c))
>  
>  static int lenovo_input_mapping_tpkbd(struct hid_device *hdev,
> @@ -48,6 +55,49 @@ static int lenovo_input_mapping_tpkbd(struct hid_device 
> *hdev,
>   return 0;
>  }
>  
> +static int lenovo_input_mapping_cptkbd(struct hid_device *hdev,
> + struct hid_input *hi, struct hid_field *field,
> + struct hid_usage *usage, unsigned long **bit, int *max)
> +{
> + /* HID_UP_LNVENDOR = USB, HID_UP_MSVENDOR = BT */
> + if ((usage->hid & HID_USAGE_PAGE) == HID_UP_MSVENDOR ||
> + (usage->hid & HID_USAGE_PAGE) == HID_UP_LNVENDOR) {
> + set_bit(EV_REP, hi->input->evbit);
> + switch (usage->hid & HID_USAGE) {
> + case 0x00f1: /* Fn-F4: Mic mute */
> + map_key_clear(KEY_MICMUTE);
> + return 1;
> + case 0x00f2: /* Fn-F5: Brightness down */
> + map_key_clear(KEY_BRIGHTNESSDOWN);
> + return 1;
> + case 0x00f3: /* Fn-F6: Brightness up */
> + map_key_clear(KEY_BRIGHTNESSUP);
> + return 1;
> + case 0x00f4: /* Fn-F7: External display (projector) */
> + map_key_clear(KEY_SWITCHVIDEOMODE);
> + return 1;
> + case 0x00f5: /* Fn-F8: Wireless */
> + map_key_clear(KEY_WLAN);
> + return 1;
> + case 0x00f6: /* Fn-F9: Control panel */
> + map_key_clear(KEY_CONFIG);
> + return 1;
> +

Re: mm: NULL ptr deref in remove_migration_pte

2014-06-16 Thread Sasha Levin

88046548b000 0007
>>>>>> > >>>> [ 2552.320286] Call Trace:
>>>>>> > >>>> [ 2552.320286] ? _raw_spin_unlock_irq 
>>>>>> > >>>> (arch/x86/include/asm/preempt.h:98 
>>>>>> > >>>> include/linux/spinlock_api_smp.h:169 
>>>>>> > >>>> kernel/locking/spinlock.c:199)
>>>>>> > >>>> [ 2552.320286] ? finish_task_switch (include/linux/tick.h:206 
>>>>>> > >>>> kernel/sched/core.c:2163)
>>>>>> > >>>> [ 2552.320286] ? finish_task_switch 
>>>>>> > >>>> (arch/x86/include/asm/current.h:14 kernel/sched/sched.h:993 
>>>>>> > >>>> kernel/sched/core.c:2145)
>>>>>> > >>>> [ 2552.320286] ? retint_restore_args 
>>>>>> > >>>> (arch/x86/kernel/entry_64.S:1040)
>>>>>> > >>>> [ 2552.320286] ? __this_cpu_preempt_check 
>>>>>> > >>>> (lib/smp_processor_id.c:63)
>>>>>> > >>>> [ 2552.320286] ? trace_hardirqs_on_caller 
>>>>>> > >>>> (kernel/locking/lockdep.c:2557 kernel/locking/lockdep.c:2599)
>>>>>> > >>>> [ 2552.320286] lock_acquire (arch/x86/include/asm/current.h:14 
>>>>>> > >>>> kernel/locking/lockdep.c:3602)
>>>>>> > >>>> [ 2552.320286] ? remove_migration_pte (mm/migrate.c:137)
>>>>>> > >>>> [ 2552.320286] ? retint_restore_args 
>>>>>> > >>>> (arch/x86/kernel/entry_64.S:1040)
>>>>>> > >>>> [ 2552.320286] _raw_spin_lock 
>>>>>> > >>>> (include/linux/spinlock_api_smp.h:143 
>>>>>> > >>>> kernel/locking/spinlock.c:151)
>>>>>> > >>>> [ 2552.320286] ? remove_migration_pte (mm/migrate.c:137)
>>>>>> > >>>> [ 2552.320286] remove_migration_pte (mm/migrate.c:137)
>>>>>> > >>>> [ 2552.320286] rmap_walk (mm/rmap.c:1628 mm/rmap.c:1699)
>>>>>> > >>>> [ 2552.320286] remove_migration_ptes (mm/migrate.c:224)
>>>>>> > >>>> [ 2552.320286] ? new_page_node (mm/migrate.c:107)
>>>>>> > >>>> [ 2552.320286] ? remove_migration_pte (mm/migrate.c:195)
>>>>>> > >>>> [ 2552.320286] migrate_pages (mm/migrate.c:922 mm/migrate.c:960 
>>>>>> > >>>> mm/migrate.c:1126)
>>>>>> > >>>> [ 2552.320286] ? perf_trace_mm_numa_migrate_ratelimit 
>>>>>> > >>>> (mm/migrate.c:1574)
>>>>>> > >>>> [ 2552.320286] migrate_misplaced_page (mm/migrate.c:1733)
>>>>>> > >>>> [ 2552.320286] __handle_mm_fault (mm/memory.c:3762 
>>>>>> > >>>> mm/memory.c:3812 mm/memory.c:3925)
>>>>>> > >>>> [ 2552.320286] ? __const_udelay (arch/x86/lib/delay.c:126)
>>>>>> > >>>> [ 2552.320286] ? __rcu_read_unlock (kernel/rcu/update.c:97)
>>>>>> > >>>> [ 2552.320286] handle_mm_fault (mm/memory.c:3948)
>>>>>> > >>>> [ 2552.320286] __get_user_pages (mm/memory.c:1851)
>>>>>> > >>>> [ 2552.320286] ? preempt_count_sub (kernel/sched/core.c:2527)
>>>>>> > >>>> [ 2552.320286] __mlock_vma_pages_range (mm/mlock.c:255)
>>>>>> > >>>> [ 2552.320286] __mm_populate (mm/mlock.c:711)
>>>>>> > >>>> [ 2552.320286] SyS_mlockall (include/linux/mm.h:1799 
>>>>>> > >>>> mm/mlock.c:817 mm/mlock.c:791)
>>>>>> > >>>> [ 2552.320286] tracesys (arch/x86/kernel/entry_64.S:749)
>>>>>> > >>>> [ 2552.320286] Code: 85 2d 1e 00 00 48 c7 c1 d7 68 6c a0 48 c7 c2 
>>>>>> > >>>> 47 11 6c a0 31 c0 be fa 0b 00 00 48 c7 c7 91 68 6c a0 e8 1c 6d f9 
>>>>>> > >>>> ff e9 07 1e 00 00 <49> 81 7d 00 80 31 76 a2 b8 00 00 00 00 44 0f 
>>>>>> > >>>> 44 c0 eb 07 0f 1f
>>>>>> > >>>> [ 2552.320286] RIP __lock_acquire (kernel/locking/lockdep.c:3070 
>>>>>> > >>>> (discriminator 1))
>>>>>> > >>>> [ 2552.320286]  RSP

Re: [PATCH tty-next 15/22] isdn: tty: Use private flag for ASYNC_CLOSING

2014-06-16 Thread Peter Hurley


Hi David,

On 06/16/2014 11:37 AM, David Laight wrote:

> From: Peter Hurley
> > ASYNC_CLOSING is no longer used in the tty core; use private flag
> > info->closing as substitute.
> ...
> > @@ -311,6 +311,7 @@ typedef struct atemu {
> >   typedef struct modem_info {
> > 	int magic;
> > 	struct tty_port port;
> > +	int  closing:1;   /* port count has dropped to 0 */
> > 	int x_char;  /* xon/xoff character */
> > 	int mcr; /* Modem control register */
> > 	int   msr; /* Modem status register  */
>
> That should probably be a bool and set to true/false.
> You are probably adding a load of padding.


struct modem_info is over 1K, with several existing int-as-bool fields.
An array of 64 struct modem_info are statically allocated with every isdn 
device.

It doesn't look like memory consumption has been a consideration with the isdn 
driver.

Regards,
Peter Hurley

Re: [PATCH] mm, thp: move invariant bug check out of loop in __split_huge_page_map

2014-06-16 Thread Kirill A. Shutemov

On Mon, Jun 16, 2014 at 11:49:34PM +0300, Kirill A. Shutemov wrote:
> On Mon, Jun 16, 2014 at 03:35:48PM -0400, Waiman Long wrote:
> > In the __split_huge_page_map() function, the check for
> > page_mapcount(page) is invariant within the for loop. Because of the
> > fact that the macro is implemented using atomic_read(), the redundant
> > check cannot be optimized away by the compiler leading to unnecessary
> > read to the page structure.

And atomic_read() is *not* an atomic operation. It's implemented as a
dereference through a cast to volatile, which suppresses compiler
optimization, but doesn't affect what the CPU can do with the variable.

So I doubt the difference will be measurable anywhere.
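
A simplified userspace model of that point, with made-up names
(atomic_model_t, atomic_model_read, count_matches) standing in for the
kernel types; it is a sketch of the volatile-cast read, not the kernel's
actual definition on every architecture:

```c
/* Simplified stand-in for the kernel's atomic_t; not the real definition. */
typedef struct { int counter; } atomic_model_t;

/*
 * Plain load through a volatile-qualified pointer: the compiler must emit
 * the read each time, but no barrier or locked instruction is involved.
 */
static inline int atomic_model_read(const atomic_model_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* The loop-invariant check cannot be hoisted: one load per iteration. */
int count_matches(const atomic_model_t *v, int expected, int iterations)
{
	int i, hits = 0;

	for (i = 0; i < iterations; i++)
		if (atomic_model_read(v) == expected)
			hits++;
	return hits;
}
```

Moving the check out of the loop saves those repeated loads, which on a
cache-hot line are ordinary reads as far as the CPU is concerned.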

-- 
 Kirill A. Shutemov

Re: Writing watchdog_thresh triggers BUG: sleeping function called from invalid context

2014-06-16 Thread Peter Wu

On Monday 16 June 2014 16:52:45 Don Zickus wrote:
> On Mon, Jun 16, 2014 at 04:12:44PM +0200, Peter Wu wrote:
> > Hi,
> > 
> > Writing to /proc/sys/kernel/watchdog_thresh causes the following BUG in
> > at least v3.13-rc2-625-g06151db, v3.15 and v3.16-rc1. Kernel config is
> > attached.
> > 
> > It was originally found on bare metal, since then reproduced in QEMU in
> > init, and when directly executing it.
> > 
> > Regards,
> > Peter
> 
> Hi Peter,
> 
> I assume the following patch will work?

Yes, the patch works (no more BUG). No idea whether it is
(in)correct though.

Regards,
Peter

> Michal, do you remember why we needed preempt here?  I wouldn't think it
> mattered as we are not doing anything per-cpu specific.
> 
> Cheers,
> Don
> 
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 516203e..30e4822 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -527,10 +527,8 @@ static void update_timers_all_cpus(void)
>   int cpu;
>  
>   get_online_cpus();
> - preempt_disable();
>   for_each_online_cpu(cpu)
>   update_timers(cpu);
> - preempt_enable();
>   put_online_cpus();
>  }
>  


Re: [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry

2014-06-16 Thread Borislav Petkov

On Mon, Jun 16, 2014 at 04:06:08PM -0400, Vivek Goyal wrote:
> There can be more than one loader and the one which claims first
> to recognize the image will get to load the image. So once 32 bit
> loader support comes in, it might happen that we ask 64bit loader
> first and it rejects the image and then we ask 32bit loader.

What does that have to do with anything??

> So these messages are really debug messages which tell why a loader
> is not accepting an image. It might not be an image destined for that
> loader at all.
> 
> pr_debug() allows being verbose if user wants to for debugging purposes.
> You just have to make sure that CONFIG_DYNAMIC_DEBUG=y and enable verbosity
> in individual file.
> 
> echo 'file kexec-bzimage.c +p' > /sys/kernel/debug/dynamic_debug/control

So people are supposed to enable dynamic_debug just so that they see
*why* their image doesn't load.

Doesn't sound optimal to me.

> Same here. We will potentially be trying multiple loaders and if every
> loader prints messages for rejection by default, it is too much of
> info, IMO.

For max two loaders on one architecture? I don't think so. Now you're
just arguing for the sake of it.

> I like doing memory allocations early in the functions (as far as
> possible) and error out if need be. If memory is available to begin
> with for all the data structures needed by this function, it is kind
> of pointless to do rest of the processing.

We're talking about memory for a single void * which is ridiculous. And
I think simplifying the error paths is a much higher win than doing some
minor allocation.

> Hmm..., If you feel strongly about it, I can make this change. I
> thought I just made it easier to share the code between 32bit and
> 64bit by this.

Someone later can do that - right now this code is 64-bit only as far as
we're concerned and if it can be made to work on 32-bit, then people are
free to do so.

> I think it just makes it safer that we don't try to copy more than
> size of destination, in case ->eddbuf_entries is not right or corrupted.
> 
> I see copy_edd() does similar thing.
> 
> memcpy(edd.edd_info, boot_params.eddbuf, sizeof(edd.edd_info));
> edd.edd_info_nr = boot_params.eddbuf_entries;
> 
> So may be it is not a bad idea to copy based on max size of data
> structures.

Ok, makes sense.
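
A minimal sketch of the pattern agreed on here, assuming hypothetical
types and a made-up capacity (SKETCH_MAX_ENTRIES is not the real EDD
limit): the copy length comes from the destination array rather than
from the possibly wrong entry count, so a corrupted count cannot cause
an overflow.

```c
#include <string.h>

#define SKETCH_MAX_ENTRIES 6	/* hypothetical capacity, chosen for illustration */

struct entry_sketch {
	unsigned char device;
	unsigned char version;
};

struct src_sketch {
	struct entry_sketch buf[SKETCH_MAX_ENTRIES];
	unsigned char nr_entries;	/* may be wrong or corrupted */
};

struct dst_sketch {
	struct entry_sketch buf[SKETCH_MAX_ENTRIES];
	unsigned char nr_entries;
};

static void copy_entries(struct dst_sketch *dst, const struct src_sketch *src)
{
	/* The length is the size of the destination, never nr_entries. */
	memcpy(dst->buf, src->buf, sizeof(dst->buf));
	/* The count is carried over as-is; it does not drive the copy. */
	dst->nr_entries = src->nr_entries;
}
```

The trade-off is that unused slots get copied too, which is cheap for a
small fixed-size table.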

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load

2014-06-16 Thread Vivek Goyal

On Mon, Jun 16, 2014 at 10:05:26PM +0200, Borislav Petkov wrote:
> On Mon, Jun 16, 2014 at 01:38:23PM -0400, Vivek Goyal wrote:
> > And what's the sane default in this case?
> 
> COMMAND_LINE_SIZE
> 
> > Using current kernel's command line size will not work if future
> > kernel decide to support even longer command line size.
> 
> When do you ever get to kexec a kernel with command line size differing
> from the first kernel? This use case is pretty much non-existent to
> say the least (mind you, I'm open to examples but am still waiting for
> them). And even then you go and simply upgrade the first kernel.

The kdump kernel uses a different command line. It adds extra command
line options to the currently running kernel's command line.

Till recent past we used to pass new kernel's memory map using command
line "memmap=" and when command line size was 256, we could easily exhaust
command line on large machines.

Now we support 2048 and we have not seen that issue and now we have
moved to passing memory ranges in bootparams so that issue does not
exist. But kernel still does allow passing memmap= on command line.

One can do same thing using kexec too.

Agreed that it is a very corner case use case. Now we can say that we
will not support it. I am fine with that but I at least wanted a discussion
and common understanding of what the new syscall will support and what it
will not.

Some arches still seem to have COMMAND_LINE_SIZE 256. They are more
likely to hit this scenario at some point in time.

Given the fact you feel so strongly about putting in this upper limit, I
will introduce it. And put in a comment that if the kernel we are kexecing
into supports a longer command line, then we will not support that size and
one needs to upgrade the first kernel.
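
Roughly, the check would look like the sketch below. This is only an
illustration: SKETCH_COMMAND_LINE_SIZE stands in for the running
kernel's COMMAND_LINE_SIZE (2048 is just the value mentioned above), and
both the helper name and the choice of -EINVAL are assumptions, not what
the patch will necessarily use.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the running kernel's COMMAND_LINE_SIZE (assumed value). */
#define SKETCH_COMMAND_LINE_SIZE 2048

/*
 * Reject a user-supplied command line length before allocating for it.
 * If the kernel being kexec'ed supports a longer command line, that
 * length is not supported and the first kernel needs to be upgraded.
 */
static int check_cmdline_len(size_t cmdline_len)
{
	if (cmdline_len > SKETCH_COMMAND_LINE_SIZE)
		return -EINVAL;	/* assumed error code */
	return 0;
}
```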

Thanks
Vivek

Re: [PATCH 00/11] qspinlock with paravirt support

2014-06-16 Thread Konrad Rzeszutek Wilk

On Sun, Jun 15, 2014 at 02:46:57PM +0200, Peter Zijlstra wrote:
> Since Waiman seems incapable of doing simple things; here's my take on the
> paravirt crap.
> 
> The first few patches are taken from Waiman's latest series, but the virt
> support is completely new. Its primary aim is to not mess up the native code.

OK. I finally cleared some time to look over this and am reading the code
in detail to make sure I have it clear in mind. I will most likely ask
some questions that are naive - hopefully they will lead to the code being
self-explanatory for anybody else taking a stab at understanding it when
bugs appear.
> 
> I've not stress tested it, but the virt and paravirt (kvm) cases boot on simple
> smp guests. I've not done Xen, but the patch should be simple and similar.

Looking forward to seeing it. Glancing over the KVM one and comparing it
to the original version that Waiman posted it should be fairly simple. Perhaps
even some of the code could be shared?

> 
> I ripped out all the unfair nonsense as its not at all required for paravirt
> and optimizations that make paravirt better at the cost of code clarity and/or
> native performance are just not worth it.
> 
> Also; if we were to ever add some of that unfair nonsense you do so _after_ you
> got the simple things working.
> 
> The thing I'm least sure about is the head tracking, I chose to do something
> different from what Waiman did, because his is O(nr_cpus) and had the
> assumption that guests have small nr_cpus. AFAIK this is not at all true. The
> biggest problem I have with what I did is that it contains wait loops itself.
> 
> 
> 

Re: [PATCH] mm, thp: move invariant bug check out of loop in __split_huge_page_map

2014-06-16 Thread Kirill A. Shutemov

On Mon, Jun 16, 2014 at 03:35:48PM -0400, Waiman Long wrote:
> In the __split_huge_page_map() function, the check for
> page_mapcount(page) is invariant within the for loop. Because of the
> fact that the macro is implemented using atomic_read(), the redundant
> check cannot be optimized away by the compiler leading to unnecessary
> read to the page structure.
> 
> This patch move the invariant bug check out of the loop so that it
> will be done only once.

Looks okay, but why? Were you able to measure a difference?

-- 
 Kirill A. Shutemov

Re: Writing watchdog_thresh triggers BUG: sleeping function called from invalid context

2014-06-16 Thread Don Zickus

On Mon, Jun 16, 2014 at 04:12:44PM +0200, Peter Wu wrote:
> Hi,
> 
> Writing to /proc/sys/kernel/watchdog_thresh causes the following BUG in
> at least v3.13-rc2-625-g06151db, v3.15 and v3.16-rc1. Kernel config is
> attached.
> 
> It was originally found on bare metal, since then reproduced in QEMU in
> init, and when directly executing it.
> 
> Regards,
> Peter

Hi Peter,

I assume the following patch will work?

Michal, do you remember why we needed preempt here?  I wouldn't think it
mattered as we are not doing anything per-cpu specific.

Cheers,
Don

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 516203e..30e4822 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -527,10 +527,8 @@ static void update_timers_all_cpus(void)
 	int cpu;
 
 	get_online_cpus();
-	preempt_disable();
 	for_each_online_cpu(cpu)
 		update_timers(cpu);
-	preempt_enable();
 	put_online_cpus();
 }
 

Re: [PATCH 01/11] qspinlock: A simple generic 4-byte queue spinlock

2014-06-16 Thread Konrad Rzeszutek Wilk

On Sun, Jun 15, 2014 at 02:46:58PM +0200, Peter Zijlstra wrote:
> From: Waiman Long 
> 
> This patch introduces a new generic queue spinlock implementation that
> can serve as an alternative to the default ticket spinlock. Compared
> with the ticket spinlock, this queue spinlock should be almost as fair
> as the ticket spinlock. It has about the same speed in single-thread
> and it can be much faster in high contention situations especially when
> the spinlock is embedded within the data structure to be protected.
> 
> Only in light to moderate contention where the average queue depth
> is around 1-3 will this queue spinlock be potentially a bit slower
> due to the higher slowpath overhead.
> 
> This queue spinlock is especially suited to NUMA machines with a large
> number of cores as the chance of spinlock contention is much higher
> in those machines. The cost of contention is also higher because of
> slower inter-node memory traffic.
> 
> Due to the fact that spinlocks are acquired with preemption disabled,
> the process will not be migrated to another CPU while it is trying
> to get a spinlock. Ignoring interrupt handling, a CPU can only be
> contending in one spinlock at any one time. Counting soft IRQ, hard
> IRQ and NMI, a CPU can only have a maximum of 4 concurrent lock waiting
> activities.  By allocating a set of per-cpu queue nodes and used them
> to form a waiting queue, we can encode the queue node address into a
> much smaller 24-bit size (including CPU number and queue node index)
> leaving one byte for the lock.
> 
> Please note that the queue node is only needed when waiting for the
> lock. Once the lock is acquired, the queue node can be released to
> be used later.
> 
> Signed-off-by: Waiman Long 
> Signed-off-by: Peter Zijlstra 

Thank you for the repost. I have some questions about the implementation
that hopefully will be easy to answer and said answers I hope can
be added in the code to enlighten other folks.

See below.
.. snip..

> Index: linux-2.6/kernel/locking/mcs_spinlock.h
> ===
> --- linux-2.6.orig/kernel/locking/mcs_spinlock.h
> +++ linux-2.6/kernel/locking/mcs_spinlock.h
> @@ -17,6 +17,7 @@
>  struct mcs_spinlock {
>   struct mcs_spinlock *next;
>   int locked; /* 1 if lock acquired */
> + int count;

This could use a comment.

>  };
>  
>  #ifndef arch_mcs_spin_lock_contended
> Index: linux-2.6/kernel/locking/qspinlock.c
> ===
> --- /dev/null
> +++ linux-2.6/kernel/locking/qspinlock.c
> @@ -0,0 +1,197 @@
> +/*
> + * Queue spinlock
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * (C) Copyright 2013-2014 Hewlett-Packard Development Company, L.P.
> + *
> + * Authors: Waiman Long 
> + *  Peter Zijlstra 
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/*
> + * The basic principle of a queue-based spinlock can best be understood
> + * by studying a classic queue-based spinlock implementation called the
> + * MCS lock. The paper below provides a good description for this kind
> + * of lock.
> + *
> + * http://www.cise.ufl.edu/tr/DOC/REP-1992-71.pdf
> + *
> + * This queue spinlock implementation is based on the MCS lock, however to make
> + * it fit the 4 bytes we assume spinlock_t to be, and preserve its existing
> + * API, we must modify it some.
> + *
> + * In particular; where the traditional MCS lock consists of a tail pointer
> + * (8 bytes) and needs the next pointer (another 8 bytes) of its own node to
> + * unlock the next pending (next->locked), we compress both these: {tail,
> + * next->locked} into a single u32 value.
> + *
> + * Since a spinlock disables recursion of its own context and there is a limit
> + * to the contexts that can nest; namely: task, softirq, hardirq, nmi, we can
> + * encode the tail as and index indicating this context and a cpu number.
> + *
> + * We can further change the first spinner to spin on a bit in the lock word
> + * instead of its node; whereby avoiding the need to carry a node from lock to
> + * unlock, and preserving API.
> + */
> +
> +#include "mcs_spinlock.h"
> +
> +/*
> + * Per-CPU queue node structures; we can never have more than 4 nested
> + * contexts: task, softirq, hardirq, nmi.
> + *
> + * Exactly fits one cacheline.
> + */
> +static DEFINE_PER_CPU_ALIGNED(struct mcs_spinlock, mcs_nodes[4]);
> +
> +/*
> + * We must be able to
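
The quoted comment describes encoding the tail as a CPU number plus a
per-CPU context index, leaving one byte of the 4-byte word for the lock.
A minimal sketch of such a packing follows; the bit positions, the +1
offset and every name here are assumptions for illustration, since the
patch's actual layout is not shown in this excerpt.

```c
#include <stdint.h>

#define SKETCH_IDX_BITS   2u	/* 4 contexts: task, softirq, hardirq, nmi */
#define SKETCH_IDX_MASK   ((1u << SKETCH_IDX_BITS) - 1u)
#define SKETCH_TAIL_SHIFT 8u	/* low byte is left for the lock itself */

/*
 * Pack (cpu, idx) into the upper 24 bits of a 32-bit word. The CPU number
 * is stored as cpu + 1 so that an all-zero tail can mean "no waiter".
 * With 2 index bits this leaves 22 bits for cpu + 1.
 */
static uint32_t encode_tail(uint32_t cpu, uint32_t idx)
{
	uint32_t tail = ((cpu + 1u) << SKETCH_IDX_BITS) | (idx & SKETCH_IDX_MASK);

	return tail << SKETCH_TAIL_SHIFT;
}

/* Recover the CPU number and context index from a non-empty tail. */
static void decode_tail(uint32_t word, uint32_t *cpu, uint32_t *idx)
{
	uint32_t tail = word >> SKETCH_TAIL_SHIFT;

	*idx = tail & SKETCH_IDX_MASK;
	*cpu = (tail >> SKETCH_IDX_BITS) - 1u;
}
```

Because the pair identifies a slot in a per-CPU array of nodes, no
pointer needs to be stored in the lock word.
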

Re: 3.15: kernel BUG at kernel/auditsc.c:1525!

2014-06-16 Thread Richard Weinberger

Am 16.06.2014 22:41, schrieb Toralf Förster:
> Well, might be the mail:subject should be adapted, b/c the issue can be 
> triggered in a 3.13.11 kernel too.
> Unfortunately it does not appear within an UML guest, therefore an automated 
> bisecting isn't possible I fear.

You could try KVM. :)

Thanks,
//richard

Re: [PATCH] firmware: Add device tree binding for coreboot

2014-06-16 Thread Stephen Warren

On 06/16/2014 07:30 AM, Rob Herring wrote:
> On Fri, Jun 13, 2014 at 4:58 PM, Julius Werner  wrote:
...
>> Rob Herring wrote:
>>> Don't you need to keep the kernel from allocating this memory by
>>> using one of the reserved memory mechanisms? The recently added one
>>> should be able to specify what the memory is reserved for IIRC.
>>
>> Our bootloader is carving the location out of the /memory node and
>> adding it to the device tree reserve map. As far as I know, that only
>> contains a list of raw start and size entries. At any rate, I think
>> it's useful (and in line with other bindings) to add a more explicit
>> node like this (if only to make it easier accessible through
>> /proc/device-tree).
> 
> Understand there are 3 different memory reservation bindings. The
> original "/memreserve/" method is indeed limited. What I think you
> should use is the binding documented in
> Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt.
> So you could do something like this:
> 
> reserved-memory {
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
> 
> /* global autoconfigured region for contiguous allocations */
> linux,cma {
> compatible = "shared-dma-pool";
> reusable;
> size = <0x400>;
> alignment = <0x2000>;
> linux,cma-default;
> };
> 
> coreboot_reserved: coreboot@fdfea000 {
>   compatible = "coreboot";
>   reg = <0xfdfea000 0x264>,
>   <0xfdfea000 0x16000>;
> };
> 

I thought that the /reserved-memory node was more so that the
(preferred?) location of firmware images or data buffers used by HW
accelerators could be communicated to the kernel. This feels like pure data.

The coreboot binding seems to be more about defining an interface to a
particular firmware (this feels like semantics), which as a side-effect
needs to communicate the location of certain data.

If /reserved-memory is used to communicate the address of the memory
regions, I think we still need a /firmware/coreboot node to indicate the
semantics of the reserved memory region, and point at the phandle of the
region. As such, it seems simpler just to put the addresses in the
coreboot node's reg property. The only exception I see to that argument
is if putting the region in /reserved-memory automatically carves that
region out of the memory the kernel will allocate from. That would
simplify the bootloader, since it wouldn't have to fiddle with the
/memory node and do the carveout itself.

Re: 3.15: kernel BUG at kernel/auditsc.c:1525!

2014-06-16 Thread Toralf Förster

Well, might be the mail:subject should be adapted, b/c the issue can be 
triggered in a 3.13.11 kernel too.
Unfortunately it does not appear within an UML guest, therefore an automated 
bisecting isn't possible I fear.

-- 
Toralf


Re: [PATCH v3 1/2] Rename hid-lenovo-tpkbd to hid-lenovo, so we can add other keyboards

2014-06-16 Thread Antonio Ospite

On Sun, 15 Jun 2014 21:39:49 +0100
Jamie Lentin  wrote:

Almost there :)

> Signed-off-by: Jamie Lentin 
> ---
>  drivers/hid/Kconfig  |  14 +-
>  drivers/hid/Makefile |   2 +-
>  drivers/hid/hid-core.c   |   2 +-
>  drivers/hid/{hid-lenovo-tpkbd.c => hid-lenovo.c} | 233 
> +--
>  4 files changed, 142 insertions(+), 109 deletions(-)
>  rename drivers/hid/{hid-lenovo-tpkbd.c => hid-lenovo.c} (59%)
> 
> diff --git a/drivers/hid/Kconfig b/drivers/hid/Kconfig
> index f722001..dd07d59 100644
> --- a/drivers/hid/Kconfig
> +++ b/drivers/hid/Kconfig
> @@ -322,18 +322,18 @@ config HID_LCPOWER
>   ---help---
>   Support for LC-Power RC1000MCE RF remote control.
>  
> -config HID_LENOVO_TPKBD
> - tristate "Lenovo ThinkPad USB Keyboard with TrackPoint"
> +config HID_LENOVO
> + tristate "Lenovo / Thinkpad devices"
>   depends on HID
>   select NEW_LEDS
>   select LEDS_CLASS
>   ---help---
> - Support for the Lenovo ThinkPad USB Keyboard with TrackPoint.
> + Support for Lenovo devices that are not fully compliant with HID 
> standard.
>

Try to wrap text in Kconfig at 80 chars.

> - Say Y here if you have a Lenovo ThinkPad USB Keyboard with TrackPoint
> - and would like to use device-specific features like changing the
> - sensitivity of the trackpoint, using the microphone mute button or
> - controlling the mute and microphone mute LEDs.
> + Say Y if you want support for the non-compliant features of the Lenovo
> + Thinkpad standalone keyboards, e.g:

Maybe don't mention keyboards just yet on the line above since the
driver is now supposed to support other devices too.

> + - ThinkPad USB Keyboard with TrackPoint (supports extra LEDs and 
> trackpoint
> +   configuration)
>

Wrap at 80 chars here too.

>  config HID_LOGITECH
>   tristate "Logitech devices" if EXPERT
> diff --git a/drivers/hid/Makefile b/drivers/hid/Makefile
> index 30e4431..384f981 100644
> --- a/drivers/hid/Makefile
> +++ b/drivers/hid/Makefile
> @@ -58,7 +58,7 @@ obj-$(CONFIG_HID_KENSINGTON)+= hid-kensington.o
>  obj-$(CONFIG_HID_KEYTOUCH)   += hid-keytouch.o
>  obj-$(CONFIG_HID_KYE)+= hid-kye.o
>  obj-$(CONFIG_HID_LCPOWER)   += hid-lcpower.o
> -obj-$(CONFIG_HID_LENOVO_TPKBD)   += hid-lenovo-tpkbd.o
> +obj-$(CONFIG_HID_LENOVO) += hid-lenovo.o
>  obj-$(CONFIG_HID_LOGITECH)   += hid-logitech.o
>  obj-$(CONFIG_HID_LOGITECH_DJ)+= hid-logitech-dj.o
>  obj-$(CONFIG_HID_MAGICMOUSE)+= hid-magicmouse.o
> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index 8a5384c..e8ce932 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -1736,7 +1736,7 @@ static const struct hid_device_id 
> hid_have_special_driver[] = {
>   { HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_EASYPEN_M610X) },
>   { HID_USB_DEVICE(USB_VENDOR_ID_LABTEC, 
> USB_DEVICE_ID_LABTEC_WIRELESS_KEYBOARD) },
>   { HID_USB_DEVICE(USB_VENDOR_ID_LCPOWER, USB_DEVICE_ID_LCPOWER_LC1000 ) 
> },
> -#if IS_ENABLED(CONFIG_HID_LENOVO_TPKBD)
> +#if IS_ENABLED(CONFIG_HID_LENOVO)
>   { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_TPKBD) },
>  #endif
>   { HID_USB_DEVICE(USB_VENDOR_ID_LOGITECH, USB_DEVICE_ID_MX3000_RECEIVER) 
> },
> diff --git a/drivers/hid/hid-lenovo-tpkbd.c b/drivers/hid/hid-lenovo.c
> similarity index 59%
> rename from drivers/hid/hid-lenovo-tpkbd.c
> rename to drivers/hid/hid-lenovo.c
> index 2d25b6c..aabf084 100644
> --- a/drivers/hid/hid-lenovo-tpkbd.c
> +++ b/drivers/hid/hid-lenovo.c
> @@ -1,5 +1,6 @@
>  /*
> - *  HID driver for Lenovo ThinkPad USB Keyboard with TrackPoint
> + *  HID driver for Lenovo:-

The dash before the colon is not needed (really not a big deal I
mentioned that again just because there are other fixes too).

> + *  - ThinkPad USB Keyboard with TrackPoint (tpkbd)
>   *

This hunk could go into a preparatory patch separated from the rename
one, see below.

>   *  Copyright (c) 2012 Bernhard Seibold
>   */
> @@ -20,8 +21,7 @@
>  
>  #include "hid-ids.h"
>  
> -/* This is only used for the trackpoint part of the driver, hence _tp */
> -struct tpkbd_data_pointer {
> +struct lenovo_drvdata_tpkbd {
>   int led_state;
>   struct led_classdev led_mute;
>   struct led_classdev led_micmute;
> @@ -35,7 +35,7 @@ struct tpkbd_data_pointer {
>  
>  #define map_key_clear(c) hid_map_usage_clear(hi, usage, bit, max, EV_KEY, 
> (c))
>  
> -static int tpkbd_input_mapping(struct hid_device *hdev,
> +static int lenovo_input_mapping_tpkbd(struct hid_device *hdev,

I'd name this just lenovo_input_mapping() for now.

>   struct hid_input *hi, struct hid_field *field,
>   struct hid_usage *usage, unsigned long **bit, int *max)
>  {
> @@ -48,12 +48,26 @@ static int tpkbd_input_mapping(struct hid_device *hdev,
>   return 0;
>  }
>  
> +static int

Re: [PATCH] drm/msm: update and activate iommu support

2014-06-16 Thread Rob Clark

On Mon, Jun 16, 2014 at 2:20 PM, Stephane Viau  wrote:
> This change activates the iommu support for MDP5, through the
> platform config structure.
>
> Iommu support is also slightly modified in order to make sure
> that MDP iommu is properly cleaned up if a probe deferral is
> requested. Before this change, IOMMU faults would occur if the
> probe failed (-EPROBE_DEFER).

So, this looks like it is really two patches.. one making
-EPROBE_DEFER work properly, and the other adding the missing bits for
mdp5 which so far are only on downstream kernel.

Could you split this patch into two, so I can queue the -EPROBE_DEFER
fix for a 3.16 -fixes pull req?

BR,
-R


> Signed-off-by: Stephane Viau 
> ---
>  drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c | 28 +++-
>  drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h |  1 +
>  drivers/gpu/drm/msm/msm_gem.c   |  6 ++
>  drivers/gpu/drm/msm/msm_iommu.c | 21 +++--
>  drivers/gpu/drm/msm/msm_mmu.h   |  1 +
>  5 files changed, 50 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c 
> b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
> index 2967b19..62ee5cd 100644
> --- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
> +++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
> @@ -20,6 +20,10 @@
>  #include "msm_mmu.h"
>  #include "mdp5_kms.h"
>
> +static const char *iommu_ports[] = {
> +   "mdp_0",
> +};
> +
>  static struct mdp5_platform_config *mdp5_get_config(struct platform_device 
> *dev);
>
>  uint32_t __read_mostly _mdp5_pipe_vig_base;
> @@ -201,6 +205,12 @@ static void mdp5_preclose(struct msm_kms *kms, struct 
> drm_file *file)
>  static void mdp5_destroy(struct msm_kms *kms)
>  {
> struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> +   struct msm_mmu *mmu = mdp5_kms->mmu;
> +
> +   if (mmu) {
> +   mmu->funcs->detach(mmu, iommu_ports, ARRAY_SIZE(iommu_ports));
> +   mmu->funcs->destroy(mmu);
> +   }
> kfree(mdp5_kms);
>  }
>
> @@ -313,10 +323,6 @@ fail:
> return ret;
>  }
>
> -static const char *iommu_ports[] = {
> -   "mdp_0",
> -};
> -
>  static int get_clk(struct platform_device *pdev, struct clk **clkp,
> const char *name)
>  {
> @@ -406,17 +412,23 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
> mmu = msm_iommu_new(dev, config->iommu);
> if (IS_ERR(mmu)) {
> ret = PTR_ERR(mmu);
> +   dev_err(dev->dev, "failed to init iommu: %d\n", ret);
> goto fail;
> }
> +
> ret = mmu->funcs->attach(mmu, iommu_ports,
> ARRAY_SIZE(iommu_ports));
> -   if (ret)
> +   if (ret) {
> +   dev_err(dev->dev, "failed to attach iommu: %d\n", 
> ret);
> +   mmu->funcs->destroy(mmu);
> goto fail;
> +   }
> } else {
> dev_info(dev->dev, "no iommu, fallback to phys "
> "contig buffers for scanout\n");
> mmu = NULL;
> }
> +   mdp5_kms->mmu = mmu;
>
> mdp5_kms->id = msm_register_mmu(dev, mmu);
> if (mdp5_kms->id < 0) {
> @@ -445,5 +457,11 @@ static struct mdp5_platform_config 
> *mdp5_get_config(struct platform_device *dev)
>  #ifdef CONFIG_OF
> /* TODO */
>  #endif
> +   config.iommu = iommu_domain_alloc(&platform_bus_type, 0);
> +   /* TODO hard-coded in downstream mdss, but should it be? */
> +   config.max_clk = 2;
> +   /* TODO get from DT: */
> +   config.smp_blk_cnt = 22;
> +
> return 
>  }
> diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h 
> b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
> index 6a89b04..20ea748 100644
> --- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
> +++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
> @@ -41,6 +41,7 @@ struct mdp5_kms {
>
> /* mapper-id used to request GEM buffer mapped for scanout: */
> int id;
> +   struct msm_mmu *mmu;
>
> /* for tracking smp allocation amongst pipes: */
> mdp5_smp_state_t smp_state;
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index bb8026d..690d7e7 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -278,6 +278,7 @@ int msm_gem_get_iova_locked(struct drm_gem_object *obj, 
> int id,
> uint32_t *iova)
>  {
> struct msm_gem_object *msm_obj = to_msm_bo(obj);
> +   struct drm_device *dev = obj->dev;
> int ret = 0;
>
> if (!msm_obj->domain[id].iova) {
> @@ -285,6 +286,11 @@ int msm_gem_get_iova_locked(struct drm_gem_object *obj, 
> int id,
> struct msm_mmu *mmu = priv->mmus[id];
> struct page **pages = get_pages(obj);
>
> +   if (!mmu) {
> +

Re: [patch V2 5/5] futex: Simplify futex_lock_pi_atomic() and make it more robust

2014-06-16 Thread Darren Hart

On Fri, 2014-06-13 at 11:44 +0200, Thomas Gleixner wrote:
> Subject: futex: Simplify futex_lock_pi_atomic() and make it more robust
> From: Thomas Gleixner 
> Date: Wed, 11 Jun 2014 20:45:41 -
> 
> futex_lock_pi_atomic() is a maze of retry hoops and loops.
> 
> Reduce it to simple and understandable states:
> 
> First step is to lookup existing waiters (state) in the kernel.
> 
> If there is an existing waiter, validate it and attach to it.
> 
> If there is no existing waiter, check the user space value
> 
> If the TID encoded in the user space value is 0, take over the futex
> preserving the owner died bit.
> 
> If the TID encoded in the user space value is != 0, lookup the owner
> task, validate it and attach to it.
> 
> Reduces text size by 128 bytes on x86_64.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Peter Zijlstra 
> Cc: Darren Hart 
> Cc: Davidlohr Bueso 
> Cc: Kees Cook 
> Cc: w...@chromium.org
> Link: http://lkml.kernel.org/r/20140611204237.361836...@linutronix.de
> Signed-off-by: Thomas Gleixner 
> ---
> 
> V2: Fixed the brown paperbag bug of V1
> 
>  kernel/futex.c |  141 
> ++---
>  1 file changed, 55 insertions(+), 86 deletions(-)
> 
> Index: linux/kernel/futex.c
> ===
> --- linux.orig/kernel/futex.c
> +++ linux/kernel/futex.c
> @@ -956,6 +956,17 @@ static int lookup_pi_state(u32 uval, str
>   return attach_to_pi_owner(uval, key, ps);
>  }
>  
> +static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
> +{
> + u32 uninitialized_var(curval);
> +
> + if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
> + return -EFAULT;
> +
> + /*If user space value changed, let the caller retry */
> + return curval != uval ? -EAGAIN : 0;
> +}

Given the complexity of this update and how fragile this path can be, I
think this refactoring would be best done in an independent patch, as
you did with the previous two.

Two general concerns, we appear to be eliminating both the force_take
and the retry.

The force_take only occurs if TID==0, and that is covered here in a
cleaner way, so I believe we are good here.

As for the retry, the remaining use case (outside of TID==0 ->
force_take=1 -> retry) appears to be that userspace changed the value
while we were running. Reading the value early doesn't protect us from
this scenario. How does this change account for that?

It looks to me that before we would retry, while here we just give up
and return -EAGAIN..., which is addressed in futex_lock_pi(), but not in
the futex_requeue() callsite for futex_proxy_trylock_atomic. It does
handle it, but I guess also needs a comment update to "The owner was
exiting" to include "or userspace changed the value" as you did for
futex_lock_pi().
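
To spell out the behaviour I am describing, here is a userspace sketch
of the "update only if unchanged, otherwise report -EAGAIN" pattern
using C11 atomics. It is an analogy, not the kernel helper: the function
name is hypothetical, and the fault handling that
cmpxchg_futex_value_locked() provides (-EFAULT) has no counterpart here.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Try to replace *addr with newval, but only if it still holds the value
 * we read earlier. If something changed it in the meantime, leave it
 * alone and ask the caller to retry.
 */
static int update_if_unchanged(_Atomic uint32_t *addr, uint32_t expected,
			       uint32_t newval)
{
	uint32_t cur = expected;

	if (atomic_compare_exchange_strong(addr, &cur, newval))
		return 0;
	return -EAGAIN;
}
```

The retry itself is left to the caller, so every call site has to
handle -EAGAIN.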

From my analysis, this is a good cleanup and makes the code more
explicit. I'm nervous about missing corner cases, and would like to
understand what level of testing this has received. We need to add PI
locking tests to futextest. There are some in glibc. Which tests were
run to validate PI locking?

Thanks,

Darren Hart
> +
>  /**
>   * futex_lock_pi_atomic() - Atomic work required to acquire a pi aware futex
>   * @uaddr:   the pi futex user address
> @@ -979,113 +990,69 @@ static int futex_lock_pi_atomic(u32 __us
>   struct futex_pi_state **ps,
>   struct task_struct *task, int set_waiters)
>  {
> - int lock_taken, ret, force_take = 0;
> - u32 uval, newval, curval, vpid = task_pid_vnr(task);
> -
> -retry:
> - ret = lock_taken = 0;
> + u32 uval, newval, vpid = task_pid_vnr(task);
> + struct futex_q *match;
> + int ret;
>  
>   /*
> -  * To avoid races, we attempt to take the lock here again
> -  * (by doing a 0 -> TID atomic cmpxchg), while holding all
> -  * the locks. It will most likely not succeed.
> +  * Read the user space value first so we can validate a few
> +  * things before proceeding further.
>*/
> - newval = vpid;
> - if (set_waiters)
> - newval |= FUTEX_WAITERS;
> -
> - if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, 0, newval)))
> + if (get_futex_value_locked(&uval, uaddr))
>   return -EFAULT;
>  
>   /*
>* Detect deadlocks.
>*/
> - if ((unlikely((curval & FUTEX_TID_MASK) == vpid)))
> + if ((unlikely((uval & FUTEX_TID_MASK) == vpid)))
>   return -EDEADLK;
>  
>   /*
> -  * Surprise - we got the lock, but we do not trust user space at all.
> +  * Lookup existing state first. If it exists, try to attach to
> +  * its pi_state.
>*/
> - if (unlikely(!curval)) {
> - /*
> -  * We verify whether there is kernel state for this
> -  * futex. If not, we can safely assume, that the 0 ->
> -  * TID transition is correct. If state exists,

Re: [PATCH] tile: use ARRAY_SIZE

2014-06-16 Thread Chris Metcalf


On 6/16/2014 4:12 PM, Himangi Saraogi wrote:

ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The semantic patch that makes this change is as follows:

// 
@i@
@@

@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)
// 

Signed-off-by: Himangi Saraogi 
Acked-by: Julia Lawall 
---
Not compile tested.
  arch/tile/kernel/traps.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Thanks, taken into the tile tree.

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com


[PATCH v2] firmware: Add device tree binding for coreboot

2014-06-16 Thread Julius Werner

This patch adds documentation describing a device tree binding for the
coreboot firmware. It is meant to be dynamically added during boot and
contains address definitions for the coreboot table (a list of
variable-sized descriptors providing information about various compile-
and run-time generated firmware parameters) and the CBMEM area (the
structure containing most run-time resident memory regions set up by
coreboot).

These definitions allow kernel drivers to easily access data contained
in and pointed to by these regions (such as coreboot's in-memory log).
(An example implementation can be seen at http://crosreview.com/203371,
which will be submitted at a later point.)

Change-Id: I97609d461d306f85851e5efc26c675ca1e2d7e9d
Signed-off-by: Julius Werner 
---
 .../devicetree/bindings/firmware/coreboot.txt  | 32 ++
 1 file changed, 32 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/coreboot.txt

diff --git a/Documentation/devicetree/bindings/firmware/coreboot.txt b/Documentation/devicetree/bindings/firmware/coreboot.txt
new file mode 100644
index 000..5055f0e
--- /dev/null
+++ b/Documentation/devicetree/bindings/firmware/coreboot.txt
@@ -0,0 +1,32 @@
+COREBOOT firmware information
+
+The device tree node to communicate the location of coreboot's memory-resident
+bookkeeping structures to the kernel. Since coreboot itself cannot boot a
+device-tree-based kernel (yet), this node needs to be inserted by a
+second-stage bootloader (a coreboot "payload").
+
+Required properties:
+ - compatible: Should be "coreboot"
+ - reg: Address and length of the following two memory regions, in order:
+   1.) The coreboot table. This is a list of variable-sized descriptors
+   that contain various compile- and run-time generated firmware
+   parameters. It is identified by the magic string "LBIO" in its first
+   four bytes. See coreboot's src/include/boot/coreboot_tables.h for
+   details.
+   2.) The CBMEM area. This is a downward-growing memory region used by
+   coreboot to dynamically allocate data structures that remain resident.
+   It may or may not include the coreboot table as one of its members. It
+   is identified by a root node descriptor with the magic number
+   0xc0389479 that resides in the topmost 8 bytes of the area. See
+   coreboot's src/lib/dynamic_cbmem.c for details.
+
+Example:
+   firmware {
+   ranges;
+
+   coreboot {
+   compatible = "coreboot";
+   reg = <0xfdfea000 0x264>,
+ <0xfdfea000 0x16000>;
+   }
+   };
-- 
1.8.3.2
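
As an illustration of how a consumer might validate the first region, here
is a small user-space sketch. It only checks the four-byte "LBIO" magic
named in the binding text; cb_table_has_magic() is a hypothetical helper,
and nothing beyond those four bytes of the table header is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Magic string the binding text gives for the coreboot table. */
#define CB_TABLE_MAGIC     "LBIO"
#define CB_TABLE_MAGIC_LEN 4

/*
 * Hypothetical helper: check that a mapped region of `len` bytes starts
 * with the coreboot table magic. Only the first four bytes are examined.
 */
static bool cb_table_has_magic(const unsigned char *base, size_t len)
{
	assert(base != NULL);
	if (len < CB_TABLE_MAGIC_LEN)
		return false;
	return memcmp(base, CB_TABLE_MAGIC, CB_TABLE_MAGIC_LEN) == 0;
}
```

A driver would presumably run an equivalent check on the mapping of the
first "reg" entry before trusting anything else in the region.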


Re: [RFC] printk: allow increasing the ring buffer depending on the number of CPUs

2014-06-16 Thread Chris Metcalf


On 6/12/2014 2:45 PM, Joe Perches wrote:

(adding Chris Metcalf for arch/tile,
  I think this change might impact that arch)


Thanks for the Cc.  I've been following the discussion and I think it's on the 
right track for arch/tile.

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com


[PATCH] mfd: wm8994: Add a bunch of missing defaults/readables

2014-06-16 Thread Charles Keepax

Ever since this commit:

commit d4807ad2c4c0e17b6f00e3be9492c81de0804f40
regmap: Check readable regs in _regmap_read

regmap refuses to read a register that is not marked as readable. This
has highlighted a number of controls in this driver whose registers are
not marked as readable or are missing defaults.

This patch corrects the situation by adding the missing
readables/defaults.

Signed-off-by: Charles Keepax 
---
 drivers/mfd/wm8994-regmap.c |   64 +++
 1 file changed, 64 insertions(+)

diff --git a/drivers/mfd/wm8994-regmap.c b/drivers/mfd/wm8994-regmap.c
index 2fbce9c..770a256 100644
--- a/drivers/mfd/wm8994-regmap.c
+++ b/drivers/mfd/wm8994-regmap.c
@@ -123,14 +123,23 @@ static struct reg_default wm1811_defaults[] = {
{ 0x0402, 0x00C0 },/* R1026 - AIF1 DAC1 Left Volume */
{ 0x0403, 0x00C0 },/* R1027 - AIF1 DAC1 Right Volume */
{ 0x0410, 0x },/* R1040 - AIF1 ADC1 Filters */
+   { 0x0411, 0x },/* R1041 - AIF1 ADC2 Filters */
{ 0x0420, 0x0200 },/* R1056 - AIF1 DAC1 Filters (1) */
{ 0x0421, 0x0010 },/* R1057 - AIF1 DAC1 Filters (2) */
+   { 0x0422, 0x0200 },/* R1058 - AIF1 DAC2 Filters (1) */
+   { 0x0423, 0x0010 },/* R1059 - AIF1 DAC2 Filters (2) */
{ 0x0430, 0x0068 },/* R1072 - AIF1 DAC1 Noise Gate */
+   { 0x0431, 0x0068 },/* R1073 - AIF1 DAC2 Noise Gate */
{ 0x0440, 0x0098 },/* R1088 - AIF1 DRC1 (1) */
{ 0x0441, 0x0845 },/* R1089 - AIF1 DRC1 (2) */
{ 0x0442, 0x },/* R1090 - AIF1 DRC1 (3) */
{ 0x0443, 0x },/* R1091 - AIF1 DRC1 (4) */
{ 0x0444, 0x },/* R1092 - AIF1 DRC1 (5) */
+   { 0x0450, 0x0098 },/* R1104 - AIF1 DRC2 (1) */
+   { 0x0451, 0x0845 },/* R1105 - AIF1 DRC2 (2) */
+   { 0x0452, 0x },/* R1106 - AIF1 DRC2 (3) */
+   { 0x0453, 0x },/* R1107 - AIF1 DRC2 (4) */
+   { 0x0454, 0x },/* R1108 - AIF1 DRC2 (5) */
{ 0x0480, 0x6318 },/* R1152 - AIF1 DAC1 EQ Gains (1) */
{ 0x0481, 0x6300 },/* R1153 - AIF1 DAC1 EQ Gains (2) */
{ 0x0482, 0x0FCA },/* R1154 - AIF1 DAC1 EQ Band 1 A */
@@ -152,6 +161,27 @@ static struct reg_default wm1811_defaults[] = {
{ 0x0492, 0x0559 },/* R1170 - AIF1 DAC1 EQ Band 5 B */
{ 0x0493, 0x4000 },/* R1171 - AIF1 DAC1 EQ Band 5 PG */
{ 0x0494, 0x },/* R1172 - AIF1 DAC1 EQ Band 1 C */
+   { 0x04A0, 0x6318 },/* R1184 - AIF1 DAC2 EQ Gains (1) */
+   { 0x04A1, 0x6300 },/* R1185 - AIF1 DAC2 EQ Gains (2) */
+   { 0x04A2, 0x0FCA },/* R1186 - AIF1 DAC2 EQ Band 1 A */
+   { 0x04A3, 0x0400 },/* R1187 - AIF1 DAC2 EQ Band 1 B */
+   { 0x04A4, 0x00D8 },/* R1188 - AIF1 DAC2 EQ Band 1 PG */
+   { 0x04A5, 0x1EB5 },/* R1189 - AIF1 DAC2 EQ Band 2 A */
+   { 0x04A6, 0xF145 },/* R1190 - AIF1 DAC2 EQ Band 2 B */
+   { 0x04A7, 0x0B75 },/* R1191 - AIF1 DAC2 EQ Band 2 C */
+   { 0x04A8, 0x01C5 },/* R1192 - AIF1 DAC2 EQ Band 2 PG */
+   { 0x04A9, 0x1C58 },/* R1193 - AIF1 DAC2 EQ Band 3 A */
+   { 0x04AA, 0xF373 },/* R1194 - AIF1 DAC2 EQ Band 3 B */
+   { 0x04AB, 0x0A54 },/* R1195 - AIF1 DAC2 EQ Band 3 C */
+   { 0x04AC, 0x0558 },/* R1196 - AIF1 DAC2 EQ Band 3 PG */
+   { 0x04AD, 0x168E },/* R1197 - AIF1 DAC2 EQ Band 4 A */
+   { 0x04AE, 0xF829 },/* R1198 - AIF1 DAC2 EQ Band 4 B */
+   { 0x04AF, 0x07AD },/* R1199 - AIF1 DAC2 EQ Band 4 C */
+   { 0x04B0, 0x1103 },/* R1200 - AIF1 DAC2 EQ Band 4 PG */
+   { 0x04B1, 0x0564 },/* R1201 - AIF1 DAC2 EQ Band 5 A */
+   { 0x04B2, 0x0559 },/* R1202 - AIF1 DAC2 EQ Band 5 B */
+   { 0x04B3, 0x4000 },/* R1203 - AIF1 DAC2 EQ Band 5 PG */
+   { 0x04B4, 0x },/* R1204 - AIF1 DAC2 EQ Band 1 C */
{ 0x0500, 0x00C0 },/* R1280 - AIF2 ADC Left Volume */
{ 0x0501, 0x00C0 },/* R1281 - AIF2 ADC Right Volume */
{ 0x0502, 0x00C0 },/* R1282 - AIF2 DAC Left Volume */
@@ -194,6 +224,8 @@ static struct reg_default wm1811_defaults[] = {
{ 0x0605, 0x },/* R1541 - AIF2ADC Right Mixer Routing */
{ 0x0606, 0x },/* R1542 - AIF1 ADC1 Left Mixer Routing */
{ 0x0607, 0x },/* R1543 - AIF1 ADC1 Right Mixer Routing */
+   { 0x0608, 0x },/* R1544 - AIF1 ADC2 Left Mixer Routing */
+   { 0x0609, 0x },/* R1545 - AIF1 ADC2 Right Mixer Routing */
{ 0x0610, 0x02C0 },/* R1552 - DAC1 Left Volume */
{ 0x0611, 0x02C0 },/* R1553 - DAC1 Right Volume */
{ 0x0612, 0x02C0 },/* R1554 - AIF2TX Left Volume */
@@ -846,14 +878,23 @@ static bool wm1811_readable_register(struct device *dev, unsigned int reg)
case WM8994_AIF1_DAC1_LEFT_VOLUME:
case WM8994_AIF1_DAC1_RIGHT_VOLUME:
case WM8994_AIF1_ADC1_FILTERS:
+   case WM8994_AIF1_ADC2_FILTERS:
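
The effect described in the commit message can be modelled in a few lines
of user-space C. This is a simplified sketch under stated assumptions, not
regmap itself: struct reg_default_model, model_readable() and model_read()
are hypothetical names, and the -EIO return for an unreadable register is
an assumption of the model. The two table entries are copied from the hunk
above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct reg_default. */
struct reg_default_model {
	unsigned int reg;
	unsigned int def;
};

/* Two entries copied from the hunk above. */
static const struct reg_default_model defaults[] = {
	{ 0x0422, 0x0200 },	/* R1058 - AIF1 DAC2 Filters (1) */
	{ 0x0431, 0x0068 },	/* R1073 - AIF1 DAC2 Noise Gate */
};

/* Hypothetical readable callback: only listed registers are readable. */
static bool model_readable(unsigned int reg)
{
	switch (reg) {
	case 0x0422:
	case 0x0431:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical model of a cached read: refuse unreadable registers,
 * otherwise return the default value from the table.
 */
static int model_read(unsigned int reg, unsigned int *val)
{
	size_t i;

	assert(val != NULL);
	if (!model_readable(reg))
		return -EIO;

	for (i = 0; i < sizeof(defaults) / sizeof(defaults[0]); i++) {
		if (defaults[i].reg == reg) {
			*val = defaults[i].def;
			return 0;
		}
	}
	return -ENOENT;	/* readable, but this model has no default for it */
}
```

In this model a control backed by register 0x0423 fails to read until the
register is added to both the switch and the defaults table, which mirrors
the pairing of hunks in the patch.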

Re: [PATCH 1/2] phy: qcom: Add driver for QCOM IPQ806x SATA PHY

2014-06-16 Thread Kumar Gala


On Jun 16, 2014, at 5:04 AM, Kishon Vijay Abraham I  wrote:

> Hi,
> 
> On Friday 13 June 2014 12:48 AM, Kumar Gala wrote:
>> Add a PHY driver for use with the AHCI-based SATA controller driver on the
>> IPQ806x family of SoCs.
>> 
>> Signed-off-by: Kumar Gala 
>> ---
>> drivers/phy/Kconfig |   6 ++
>> drivers/phy/Makefile|   1 +
>> drivers/phy/phy-qcom-ipq806x-sata.c | 204 
>> 
>> 3 files changed, 211 insertions(+)
>> create mode 100644 drivers/phy/phy-qcom-ipq806x-sata.c
>> 
>> diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
>> index 16a2f06..52bfb93 100644
>> --- a/drivers/phy/Kconfig
>> +++ b/drivers/phy/Kconfig
>> @@ -178,4 +178,10 @@ config PHY_XGENE
>>  help
>>This option enables support for APM X-Gene SoC multi-purpose PHY.
>> 
>> +config PHY_QCOM_IPQ806X_SATA
>> +tristate "Qualcomm IPQ806x SATA SerDes/PHY driver"
>> +depends on ARCH_QCOM
>> +depends on OF
> depends on HAS_IOMEM?

will add

>> +select GENERIC_PHY
>> +
>> endmenu
>> diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
>> index b4f1d57..d950317 100644
>> --- a/drivers/phy/Makefile
>> +++ b/drivers/phy/Makefile
>> @@ -20,3 +20,4 @@ phy-exynos-usb2-$(CONFIG_PHY_EXYNOS4X12_USB2)  += phy-exynos4x12-usb2.o
>> phy-exynos-usb2-$(CONFIG_PHY_EXYNOS5250_USB2)+= phy-exynos5250-usb2.o
>> obj-$(CONFIG_PHY_EXYNOS5_USBDRD) += phy-exynos5-usbdrd.o
>> obj-$(CONFIG_PHY_XGENE)  += phy-xgene.o
>> +obj-$(CONFIG_PHY_QCOM_IPQ806X_SATA) += phy-qcom-ipq806x-sata.o
>> diff --git a/drivers/phy/phy-qcom-ipq806x-sata.c b/drivers/phy/phy-qcom-ipq806x-sata.c
>> new file mode 100644
>> index 000..fc57340
>> --- /dev/null
>> +++ b/drivers/phy/phy-qcom-ipq806x-sata.c
>> @@ -0,0 +1,204 @@
>> +/*
>> + * Copyright (c) 2014, The Linux Foundation. All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 and
>> + * only version 2 as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +struct qcom_ipq806x_sata_phy {
>> +struct device *dev;
> 
> dev is not used anywhere. remove it.

already done in v2

>> +void __iomem *mmio;
>> +struct clk *cfg_clk;
>> +};
>> +
>> +#define __set(v, a, b)  (((v) << (b)) & GENMASK(a, b))
>> +
>> +#define SATA_PHY_P0_PARAM0  0x200
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3(x)__set(x, 17, 12)
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3_MASK  GENMASK(17, 12)
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN2(x)__set(x, 11, 6)
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN2_MASK  GENMASK(11, 6)
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN1(x)__set(x, 5, 0)
>> +#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN1_MASK  GENMASK(5, 0)
>> +
>> +#define SATA_PHY_P0_PARAM1  0x204
>> +#define SATA_PHY_P0_PARAM1_RESERVED_BITS31_21(x)__set(x, 31, 21)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN3(x)  __set(x, 20, 14)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN3_MASKGENMASK(20, 14)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN2(x)  __set(x, 13, 7)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN2_MASKGENMASK(13, 7)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN1(x)  __set(x, 6, 0)
>> +#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN1_MASKGENMASK(6, 0)
>> +
>> +#define SATA_PHY_P0_PARAM2  0x208
>> +#define SATA_PHY_P0_PARAM2_RX_EQ(x) __set(x, 20, 18)
>> +#define SATA_PHY_P0_PARAM2_RX_EQ_MASK   GENMASK(20, 18)
>> +
>> +#define SATA_PHY_P0_PARAM3  0x20C
>> +#define SATA_PHY_SSC_EN 0x8
>> +#define SATA_PHY_P0_PARAM4  0x210
>> +#define SATA_PHY_REF_SSP_EN 0x2
>> +#define SATA_PHY_RESET  0x1
>> +
>> +static inline void qcom_ipq806x_sata_delay_us(unsigned int delay)
>> +{
>> +/* sleep for max. 50us more to combine processor wakeups */
>> +usleep_range(delay, delay + 50);
>> +}
>> +
>> +static int qcom_ipq806x_sata_phy_init(struct phy *generic_phy)
>> +{
>> +struct qcom_ipq806x_sata_phy *phy = phy_get_drvdata(generic_phy);
>> +u32 reg = 0;
>> +
>> +/* Setting SSC_EN to 1 */
>> +reg = readl_relaxed(phy->mmio + SATA_PHY_P0_PARAM3);
> 
> Why readl_relaxed?

because there is no need for readl’s memory barriers here.
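
The barrier question itself cannot be shown in user space, but the
read-modify-write pattern and the mask macros from the quoted patch can.
The following is a sketch only: GENMASK32() and FIELD_SET() are user-space
stand-ins for the kernel's GENMASK() and the patch's __set(), and
field_update() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* User-space equivalent of a 32-bit GENMASK(h, l): bits h..l set. */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h)))))

/* Same idea as the patch's __set(): place v into bits a..b. */
#define FIELD_SET(v, a, b) (((uint32_t)(v) << (b)) & GENMASK32(a, b))

/* Bit value taken from the quoted defines. */
#define SATA_PHY_SSC_EN 0x8

/* Hypothetical helper: read-modify-write one field of a register value. */
static uint32_t field_update(uint32_t reg, uint32_t mask, uint32_t field)
{
	assert((field & ~mask) == 0);	/* field must already be shifted and masked */
	return (reg & ~mask) | field;
}
```

For a single-bit flag such as SATA_PHY_SSC_EN the mask and the field are
the same value, which reduces to the plain OR used in the patch.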

>> +reg = reg | SATA_PHY_SSC_EN;
>> +writel_relaxed(reg, phy->mmio + SATA_PHY_P0_PARAM3);
>> +
>> +reg = readl_relaxed(phy->mmio +

Re: [PATCH] firmware: Add device tree binding for coreboot

2014-06-16 Thread Julius Werner

On Mon, Jun 16, 2014 at 6:30 AM, Rob Herring  wrote:
> On Fri, Jun 13, 2014 at 4:58 PM, Julius Werner  wrote:
>>> This is just to export a fixed log to userspace (like a DMI table) or
>>> the kernel will actually use the data in some way? Based on the link,
>>> it looks like the former to me.
>>
>> I could imagine both. The link is an in-kernel driver that exposes a
>> log through a sysfs node (in a way that has already been established
>> on x86 systems, which find the location through EBDA or ACPI entries
>> instead). We are also using a user-space tool that reads the address
>> from /proc/device-tree and accesses it through /dev/mem. The areas can
>> contain many interesting entries (like the location of an early
>> framebuffer set up by the firmware), so I could also imagine use cases
>> where the kernel makes use of it directly.
>
> It can be argued that the boot interface is DT and any configuration
> data should be put there in a common way. We don't really need yet
> another boot mechanism as we already have:
>
> UEFI + FDT
> UEFI + ACPI
> "standard" bootloaders (e.g. u-boot, grub, barebox, etc.) + FDT
>
> Allowing every bootloader to define its own boot interfaces would only
> result in a mess for both code and testing. I don't want to get into a
> debate about this now as it is not too relevant to this patch, but
> just want to highlight the resistance you will face going down this
> path.
>
>>> Don't you need to keep the kernel from allocating this memory by
>>> using one of the reserved memory mechanisms? The recently added one
>>> should be able to specify what the memory is reserved for IIRC.
>>
>> Our bootloader is carving the location out of the /memory node and
>> adding it to the device tree reserve map. As far as I know, that only
>> contains a list of raw start and size entries. At any rate, I think
>> it's useful (and in line with other bindings) to add a more explicit
>> node like this (if only to make it easier accessible through
>> /proc/device-tree).
>
> Understand there are 3 different memory reservation bindings. The
> original "/memreserve/" method is indeed limited. What I think you
> should use is the binding documented in
> Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt.
> So you could do something like this:
>
>reserved-memory {
>#address-cells = <1>;
>#size-cells = <1>;
>ranges;
>
>/* global autoconfigured region for contiguous allocations */
>linux,cma {
>compatible = "shared-dma-pool";
>reusable;
>size = <0x400>;
>alignment = <0x2000>;
>linux,cma-default;
>};
>
>coreboot_reserved: coreboot@fdfea000 {
>  compatible = "coreboot";
>  reg = <0xfdfea000 0x264>,
>  <0xfdfea000 0x16000>;
>};
>
>

Okay, I see. But do you really think this is the best way to specify
that interface? Bindings for other firmware also seem to prefer some
form of /firmware, so I think putting it there or under there is more
consistent. Especially if we later find a need to add more properties
to the coreboot node (maybe a version number, feature availability, or
things like that), a reserved-memory node would feel like the wrong
place for it to me.

>>> /firmware is already used IIRC. What if you have other firmware such
>>> as Trustzone?
>>
>> I'm not quite sure how Trusted Foundations works and whether it would
>> even make sense to use it in parallel to coreboot, but it seems to be
>> using the /firmware/trusted-foundations subnode so that should be
>> fine. "firmware" seems to be used by other firmware implementations
>> (like "samsung,secure-firmware") which are similar in nature to and
>> mutually exclusive with coreboot, so I thought the node makes sense.
>> (The kernel should use the compatible string to find it anyway, so a
>> future name clash would not be world-ending.)
>
> They are not mutually exclusive. What runs in secure world or not is
> entirely independent of non-secure boot. You may not care about it,
> but other platforms could.

On Mon, Jun 16, 2014 at 9:39 AM, Olof Johansson  wrote:
> 2014-06-13 14:58 GMT-07:00 Julius Werner :
>>> This is just to export a fixed log to userspace (like a DMI table) or
>>> the kernel will actually use the data in some way? Based on the link,
>>> it looks like the former to me.
>>
>> I could imagine both. The link is an in-kernel driver that exposes a
>> log through a sysfs node (in a way that has already been established
>> on x86 systems, which find the location through EBDA or ACPI entries
>> instead). We are also using a user-space tool that reads the address
>> from /proc/device-tree and accesses it through /dev/mem. The areas can
>> contain many interesting entries (like the location of an early

Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)

2014-06-16 Thread John Stultz

On Tue, Jun 3, 2014 at 7:57 AM, Johannes Weiner  wrote:
> On Thu, May 08, 2014 at 10:12:40AM -0700, John Stultz wrote:
>> On 04/29/2014 02:21 PM, John Stultz wrote:
>> > Another few weeks and another volatile ranges patchset...
>> >
>> > After getting the sense that a major objection to the earlier
>> > patches was the introduction of a new syscall (and its somewhat
>> > strange dual length/purged-bit return values), I spent some time
>> > trying to rework the vma manipulations so we can be sure we won't fail
>> > mid-way through changing volatility (basically making it atomic).
>> > I think I have it working, and thus, there is no longer the
>> > need for a new syscall, and we can go back to using madvise()
>> > to set and unset pages as volatile.
>>
>> Johannes: To get some feedback, maybe I'll needle you directly here a
>> bit. :)
>>
>> Does moving this interface to madvise help reduce your objections?  I
>> feel like your cleaning-the-dirty-bit idea didn't work out, but I was
>> hoping that by reworking the vma manipulations to be atomic, we could
>> move to madvise and still avoid the new syscall that you seemed bothered
>> by. But I've not really heard much from you recently so I worry your
>> concerns on this were actually elsewhere, and I'm just churning the
>> patch needlessly.
>
> My objection was not the syscall.
>
> From a reclaim perspective, using the dirty state to denote whether a
> swap-backed page needs writeback before reclaim is quite natural and I
> much prefer Minchan's changes to the reclaim code over yours.
>
> From an interface point of view, I would prefer the simplicity of
> cleaning dirty bits to invalidate pages, and a default of zero-filling
> invalidated pages instead of sending SIGBUS.  This also is quite
> natural when you think of anon/shmem mappings as cache pages on top of
> /dev/zero (see mmap_zero() and shmem_zero_setup()).  And it translates
> well to tmpfs.
>
> At the same time, I acknowledge that there are usecases that want
> SIGBUS delivery for more than just convenience in order to implement
> userspace fault handling, and this is the only place where I see a
> real divergence in actual functionality from Minchan's code.

Thanks for the clarification and feedback. Sorry for my slow response,
as I was on vacation for a week and am just now catching up on this.

So again, SIGBUS for userspace fault handling is really more of a
side-effect of having more userspace-friendly semantics, and isn't
really the primary goal/usage model.

Zerofill semantics are mostly problematic because they make userspace
mistakes harder to find and diagnose. Android's ashmem actually uses
zerofill semantics, so while I see it as less ideal, technically
zerofill would work here.

However, combining zerofill with your preferred overloading of the
dirty state is particularly problematic because it makes any dirtying
of volatile data clear both the volatile state as well as the purged
state for the entire page. The volatile state is surprising, but less
problematic, but the clearing of the purged state means applications
would possibly get a partial zero page (for whatever wasn't written)
and no warning that their data was lost.  This is a very surprising
and unfriendly side-effect from a userspace perspective.

For context,  Android's ashmem preserves both the volatile and purged
state on volatile page dirtying (since the volatility and purged state
are kept in their own range structure independently from the VM).
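
To make the concern concrete, here is a toy user-space model of the two
behaviours. It is not kernel code and not ashmem: struct model_page and
the model_*() functions are hypothetical, and a 16-byte array stands in
for a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Toy page: state flags are kept apart from the contents. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool volatile_range;
	bool purged;
};

/* Reclaim under zero-fill semantics: contents are replaced by zeros. */
static void model_purge(struct model_page *p)
{
	assert(p->volatile_range);
	memset(p->data, 0, sizeof(p->data));
	p->purged = true;
}

/*
 * Write under dirty-bit semantics: the write clears both the volatile
 * and the purged state for the whole page.
 */
static void model_write_dirty_bit(struct model_page *p, size_t off,
				  unsigned char v)
{
	assert(off < MODEL_PAGE_SIZE);
	p->data[off] = v;
	p->volatile_range = false;
	p->purged = false;
}

/* Write with separately tracked state: both flags survive the write. */
static void model_write_tracked(struct model_page *p, size_t off,
				unsigned char v)
{
	assert(off < MODEL_PAGE_SIZE);
	p->data[off] = v;
}
```

After a purge followed by a one-byte write, the dirty-bit variant leaves a
page that is mostly zeros with purged == false, so the application has no
way to notice the loss; the tracked variant still reports purged == true.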

> That, however, truly is a separate virtual memory feature.  Would it
> be possible for you to take MADV_FREE and MADV_REVIVE as a base and
> implement an madvise op that switches the no-page behavior of a VMA
> from zero-filling to SIGBUS delivery?

I'll see if I can look into it if I get some time. However, I suspect
it's more likely I'll just have to admit defeat on this one and let
someone else champion the effort. Interest and reviews have seemingly
dropped again here and with other work ramping up, I'm not sure if
I'll be able to justify further work on this. :(

thanks
-john

[PATCH] tile: use ARRAY_SIZE

2014-06-16 Thread Himangi Saraogi

ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The semantic patch that makes this change is as follows:

// 
@i@
@@

@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)
// 

Signed-off-by: Himangi Saraogi 
Acked-by: Julia Lawall 
---
Not compile tested.
 arch/tile/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/tile/kernel/traps.c b/arch/tile/kernel/traps.c
index f3ceb63..86900cc 100644
--- a/arch/tile/kernel/traps.c
+++ b/arch/tile/kernel/traps.c
@@ -277,7 +277,7 @@ void __kprobes do_trap(struct pt_regs *regs, int fault_num,
if (fixup_exception(regs))  /* ILL_TRANS or UNALIGN_DATA */
return;
if (fault_num >= 0 &&
-   fault_num < sizeof(int_name)/sizeof(int_name[0]) &&
+   fault_num < ARRAY_SIZE(int_name) &&
int_name[fault_num] != NULL)
name = int_name[fault_num];
else
-- 
1.9.1
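
For reference, the bounds check touched by this hunk can be tried in user
space. The sketch below assumes a stand-in ARRAY_SIZE() without the kernel
macro's extra type checking; fault_names[] and fault_name() are
hypothetical and only imitate the shape of int_name[] in traps.c.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel macro (without its type checking). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical sparse name table, in the style of int_name[] in traps.c. */
static const char *const fault_names[] = {
	[0] = "fault zero",
	[2] = "fault two",
};

/* Both spellings of the element count must agree at compile time. */
static_assert(ARRAY_SIZE(fault_names) ==
	      sizeof(fault_names) / sizeof(const char *),
	      "ARRAY_SIZE must match the open-coded division");

/* Return the name for fault_num, or NULL when out of range or unnamed. */
static const char *fault_name(int fault_num)
{
	if (fault_num >= 0 &&
	    (size_t)fault_num < ARRAY_SIZE(fault_names) &&
	    fault_names[fault_num] != NULL)
		return fault_names[fault_num];
	return NULL;
}
```

The table has three slots because the highest designated index is 2; slot
1 is an implicit NULL, which is why the lookup needs the NULL test as well
as the range test.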


[PATCH] SELinux: use ARRAY_SIZE

2014-06-16 Thread Himangi Saraogi

ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The Coccinelle semantic patch that makes this change is as follows:

// 
@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)
// 

Signed-off-by: Himangi Saraogi 
---
 security/selinux/ss/policydb.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index 9c5cdc2..56eb65f 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -2608,7 +2608,7 @@ static int mls_write_range_helper(struct mls_range *r, void *fp)
if (!eq)
buf[2] = cpu_to_le32(r->level[1].sens);
 
-   BUG_ON(items > (sizeof(buf)/sizeof(buf[0])));
+   BUG_ON(items > ARRAY_SIZE(buf));
 
rc = put_entry(buf, sizeof(u32), items, fp);
if (rc)
@@ -2990,7 +2990,7 @@ static int role_write(void *vkey, void *datum, void *ptr)
if (p->policyvers >= POLICYDB_VERSION_BOUNDARY)
buf[items++] = cpu_to_le32(role->bounds);
 
-   BUG_ON(items > (sizeof(buf)/sizeof(buf[0])));
+   BUG_ON(items > ARRAY_SIZE(buf));
 
rc = put_entry(buf, sizeof(u32), items, fp);
if (rc)
@@ -3040,7 +3040,7 @@ static int type_write(void *vkey, void *datum, void *ptr)
} else {
buf[items++] = cpu_to_le32(typdatum->primary);
}
-   BUG_ON(items > (sizeof(buf) / sizeof(buf[0])));
+   BUG_ON(items > ARRAY_SIZE(buf));
rc = put_entry(buf, sizeof(u32), items, fp);
if (rc)
return rc;
@@ -3069,7 +3069,7 @@ static int user_write(void *vkey, void *datum, void *ptr)
buf[items++] = cpu_to_le32(usrdatum->value);
if (p->policyvers >= POLICYDB_VERSION_BOUNDARY)
buf[items++] = cpu_to_le32(usrdatum->bounds);
-   BUG_ON(items > (sizeof(buf) / sizeof(buf[0])));
+   BUG_ON(items > ARRAY_SIZE(buf));
rc = put_entry(buf, sizeof(u32), items, fp);
if (rc)
return rc;
-- 
1.9.1


Re: [PATCH 1/8] mfd: Add support for DA9150 combined charger & fuel-gauge device

2014-06-16 Thread Jonathan Cameron


On 16/06/14 14:12, Opensource [Adam Thomson] wrote:

On Sun, Jun 15, 2014 at 20:49, Jonathan Cameron wrote:


Hi Adam,

Some general comments inline.

It's been a while since I've looked at any particularly similar parts,
but it seems to me that a lot of indirection gets added here that
if anything makes the code slightly harder to follow...

Feel free to disagree with me though!


Will do :)


To my mind all these wrappers add nothing significant so you might as well
just call da9150->read_dev etc directly.

Also, what are the read_qif and write_qif for?  They don't seem to be used
anywhere.


read_qif and write_qif are for the Fuel-Gauge functionality of the chip. The
associated driver will be submitted after acceptance of initial driver code,
and will make use of these functions.

I'll be interested to see how this interacts with the rest of the driver.
Or does it?  Often these dual i2c address devices are best handled as two
entirely separate drivers sitting on the same bus.  No idea if that
is true here as I don't have a datasheet!


The wrappers automatically choose the correct client to use (QIF uses a
different slave address to the main chip one). Means the child drivers only need
to pass through the da9150 struct and the rest is dealt with underneath.


The only real reason I can see for these wrappers is because you want
to hide the struct da9150 contents from the children of the mfd. As you
aren't doing that, you might as well drop these in favour of direct
calls to regmap_read and friends.


As I have a need to pass through the main da9150 struct pointer for the
aforementioned wrappers, it seemed cleaner and more consistent to have wrappers
for these as well, which did the job of regmap access. Means all HW access
uses the same kind of approach, and all sub-devices just need a pointer to the
main da9150 struct to be able to use the functions.


I'll continue my tirade against obvious comments. Wrong format and
adds nothing to what is here as init and exit functions are clearly
doing what their name suggests (it's one of my pet hates ;)


I agree the comment doesn't add much in terms of description but for me it
breaks up the code to make it easier to follow. However if I get an overwhelming
hatred for this I can change it. Also, I know the rule regarding single/multiple
line comments but here again I feel it helps separate the code and makes it
easier to read.

Kernel code generally has to keep to the kernel coding style.  If you leave the
single line comments as are, then chances are the mfd maintainers will get a
patch soon after 'fixing' them which is always irritating.

Having said that, mfd is their area not mine, so up to Lee / Samuel.
The same is true of the code structure comments. Now for the IIO driver
I get to be picky ;)



As a general good practice point, I'd rather that the driver supported
more than one instance of the chip.. Hence you'd take a copy of da9150_devs
to use here.  I guess it is relatively unlikely with one of these, but
you never know ;)


Have followed the general methods for MFD here, and a number of drivers take the
same approach. Also, I think it would be undesirable to have multiple charger
chips of the same type in one platform. I agree generally it's best to support
multiple instances, but here I don't think we should.

I'm happy to let this go if it is something that would not be done, but we
certainly should not be preventing it if someone wants to build hardware
with more than one of these.  If they submit a patch later allowing it because
they have hardware that does this, then there is no way anyone is going to
block it.



Why does this need its own file?  Does the DA9150 support any other
interfaces?


Yes, the DA9150 also has a SPI interface. At present the plan is to just add I2C
support for now, but in the future we may add SPI support, so have written the
code with this in mind.


Why the indirection?  The da9150 only supports i2c as far as I can see.


As per my last comment.

Fair enough.  I couldn't find a datasheet and the product brief only seemed to
mention i2c.



I'd roll this into one line and not bother with the local variable...


Fair enough but I think this keeps the code cleaner, and to me it makes sense
for the actual logic to be in core file as that's interface agnostic.


Drop comments on things that are self-evident.  Also these are one
line comments so should be using the single line comment syntax.


As per my previous comment I think it just helps to break up the code and makes
it more readable. Will change it though if the general consensus is to remove
it.




[PATCH] zorro: use ARRAY_SIZE

2014-06-16 Thread Himangi Saraogi

ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The Coccinelle semantic patch that makes this change is as follows:

// 
@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(T))
+ ARRAY_SIZE(E)
// 


Signed-off-by: Himangi Saraogi 
---
 drivers/zorro/names.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/zorro/names.c b/drivers/zorro/names.c
index 6f3fd99..4ccbcc9 100644
--- a/drivers/zorro/names.c
+++ b/drivers/zorro/names.c
@@ -52,7 +52,7 @@ static struct zorro_manuf_info __initdata zorro_manuf_list[] = {
 #include "devlist.h"
 };
 
-#define MANUFS (sizeof(zorro_manuf_list)/sizeof(struct zorro_manuf_info))
+#define MANUFS ARRAY_SIZE(zorro_manuf_list)
 
 void __init zorro_name_device(struct zorro_dev *dev)
 {
-- 
1.9.1


Re: [PATCH 09/13] purgatory: Core purgatory functionality

2014-06-16 Thread Borislav Petkov

On Mon, Jun 16, 2014 at 01:25:38PM -0400, Vivek Goyal wrote:
> I tried following with CONFIG_KEXEC=n
> 
> ifeq ($(CONFIG_KEXEC),y)
> $(Q)$(MAKE) $(clean)=arch/x86/purgatory
> endif
> 
> And still "make V=1 clean" shows me that it is going in purgatory dir
> to clean things up.

So add the ifdef for the CONFIG_KEXEC=n .configs.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [RFC PATCH 3/4] drm/tegra: Request memory bandwidth for the display controller

2014-06-16 Thread Stephen Warren

On 06/16/2014 07:35 AM, Tomeu Vizoso wrote:
> Request it based solely on the current mode's refresh rate. More
> accurate requirements can be requested in future patches.

> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c

> + bandwidth = mode->clock * window.bits_per_pixel / 8;
> + err = tegra124_emc_reserve_bandwidth(TEGRA_EMC_CONSUMER_DISP1, bandwidth);

DISP1 shouldn't be hard-coded here; the code should use DISP1 or DISP2
based on head or DC identity. We certainly have some boards capable of
dual-head operation.
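
A minimal sketch of that suggestion, assuming the consumer ID can be derived from the head index. The enum, its values and both helpers are hypothetical and only illustrate the shape; they are not the proposed Tegra EMC API.

```c
/* Hypothetical consumer IDs; the real ones would come from the EMC header. */
enum emc_consumer {
	EMC_CONSUMER_DISP1,
	EMC_CONSUMER_DISP2,
	EMC_CONSUMER_INVALID,
};

/* Map a display controller (head) index to its EMC consumer ID. */
static enum emc_consumer dc_to_emc_consumer(unsigned int head)
{
	switch (head) {
	case 0:
		return EMC_CONSUMER_DISP1;
	case 1:
		return EMC_CONSUMER_DISP2;
	default:
		return EMC_CONSUMER_INVALID;
	}
}

/* Same arithmetic as the patch: pixel clock times bytes per pixel. */
static unsigned long dc_bandwidth(unsigned long pixel_clock,
				  unsigned int bits_per_pixel)
{
	return pixel_clock * bits_per_pixel / 8;
}
```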

Re: [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry

2014-06-16 Thread Vivek Goyal

On Sun, Jun 15, 2014 at 06:35:15PM +0200, Borislav Petkov wrote:

[..]
> > +int bzImage64_probe(const char *buf, unsigned long len)
> > +{
> > +   int ret = -ENOEXEC;
> > +   struct setup_header *header;
> > +
> > +   /* kernel should be atleast two sector long */
> 
>   two sectors
> 
> > +   if (len < 2 * 512) {
> > +   pr_debug("File is too short to be a bzImage\n");
> 
> Those error messages are all pr_debug. Now, wouldn't we want to tell
> userspace what the problem is, *when* there is one?
> 
> I.e., pr_err or pr_info is much more helpful than pr_debug IMO.

There can be more than one loader, and the first one that claims to
recognize the image gets to load it. So once 32-bit loader support
comes in, it might happen that we ask the 64-bit loader first, it
rejects the image, and then we ask the 32-bit loader.

So these messages are really debug messages which tell why a loader
is not accepting an image. It might not be an image destined for that
loader at all.

pr_debug() allows being verbose if the user wants that for debugging
purposes. You just have to make sure that CONFIG_DYNAMIC_DEBUG=y and
enable verbosity for the individual file.

echo 'file kexec-bzimage.c +p' > /sys/kernel/debug/dynamic_debug/control
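
The probing order described above can be sketched roughly as follows. This is a self-contained userspace illustration, not the kexec code: the loader descriptor, the toy probe functions and find_loader() are all hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical loader descriptor: probe() returns 0 to claim the image. */
struct image_loader {
	const char *name;
	int (*probe)(const char *buf, unsigned long len);
};

/* Toy probes keyed on a leading tag byte, purely for illustration. */
static int probe_64(const char *buf, unsigned long len)
{
	return (len >= 1 && buf[0] == '6') ? 0 : -ENOEXEC;
}

static int probe_32(const char *buf, unsigned long len)
{
	return (len >= 1 && buf[0] == '3') ? 0 : -ENOEXEC;
}

static const struct image_loader loaders[] = {
	{ "bzImage64", probe_64 },
	{ "bzImage32", probe_32 },
};

/*
 * Ask each loader in turn. A single rejection is not an error by
 * itself; only "no loader accepted the image" is.
 */
static const struct image_loader *find_loader(const char *buf,
					      unsigned long len)
{
	size_t i;

	for (i = 0; i < sizeof(loaders) / sizeof(loaders[0]); i++) {
		if (loaders[i].probe(buf, len) == 0)
			return &loaders[i];
	}
	return NULL;
}
```

With this shape, an image meant for the second loader is always rejected by the first one on the way, which is why a per-loader rejection message at error level would be noisy.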

> 
> > +   return ret;
> > +   }
> > +
> > +   header = (struct setup_header *)(buf + offsetof(struct boot_params,
> > +   hdr));
> 
> Just let that stick out. The 80 cols limit is not a hard one anyway,
> especially if it impairs readability.

Will do.

> 
> > +   if (memcmp((char *)&header->header, "HdrS", 4) != 0) {
> 
> Not strncmp? "HdrS" is a string...

As Peter said, this is not a string. So I will retain it.
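
To illustrate the point, a small sketch assuming a cut-down header whose magic is a fixed 4-byte field with no NUL terminator. The struct and helper are hypothetical, not the real struct setup_header.

```c
#include <string.h>

/* Hypothetical cut-down header: the magic is 4 raw bytes, not a C string. */
struct mini_setup_header {
	unsigned char magic[4];	/* "HdrS", no NUL terminator */
	unsigned short version;
};

/* Returns 1 if exactly the 4 magic bytes match, 0 otherwise. */
static int has_hdrs_magic(const struct mini_setup_header *h)
{
	return memcmp(h->magic, "HdrS", 4) == 0;
}
```

memcmp() compares a fixed number of bytes and never looks for a terminator, which matches a fixed-width binary field.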

> 
> > +   pr_debug("Not a bzImage\n");
> > +   return ret;
> > +   }
> > +
> > +   if (header->boot_flag != 0xAA55) {
> > +   pr_debug("No x86 boot sector present\n");
> > +   return ret;
> > +   }
> > +
> > +   if (header->version < 0x020C) {
> > +   pr_debug("Must be at least protocol version 2.12\n");
> > +   return ret;
> > +   }
> > +
> > +   if ((header->loadflags & LOADED_HIGH) == 0) {
> 
>   if (!(header->loadflags.. ))

Will do.

> 
> > +   pr_debug("zImage not a bzImage\n");
> > +   return ret;
> > +   }
> > +
> > +   if (!(header->xloadflags & XLF_KERNEL_64)) {
> > +   pr_debug("Not a bzImage64. XLF_KERNEL_64 is not set.\n");
> > +   return ret;
> > +   }
> > +
> > +   if (!(header->xloadflags & XLF_CAN_BE_LOADED_ABOVE_4G)) {
> > +   pr_debug("XLF_CAN_BE_LOADED_ABOVE_4G is not set.\n");
> > +   return ret;
> > +   }
> 
> Just merge the two checks:
> 
>   if ((header->xloadflags & (XLF_KERNEL_64 | XLF_CAN_BE_LOADED_ABOVE_4G)) !=
>       (XLF_KERNEL_64 | XLF_CAN_BE_LOADED_ABOVE_4G)) {
>           pr_err("Not a bzImage, xloadflags: 0x%x\n", header->xloadflags);
>           return ret;
>   }

I think I like separate checks better. That way I can output much better
debug messages. Just saying xloadflags=0x%x does not tell anything about
what flags the loader is looking for (without looking at the code).

> 
> > +
> > +   /* I've got a bzImage */
> > +   pr_debug("It's a relocatable bzImage64\n");
> > +   ret = 0;
> > +
> > +   return ret;
> > +}
> > +
> > +void *bzImage64_load(struct kimage *image, char *kernel,
> > +   unsigned long kernel_len,
> > +   char *initrd, unsigned long initrd_len,
> > +   char *cmdline, unsigned long cmdline_len)
> 
> Arg alignment.

Will do.

[..]
> > +   header = (struct setup_header *)(kernel + setup_hdr_offset);
> > +   setup_sects = header->setup_sects;
> > +   if (setup_sects == 0)
> > +   setup_sects = 4;
> > +
> > +   kern16_size = (setup_sects + 1) * 512;
> > +   if (kernel_len < kern16_size) {
> > +   pr_debug("bzImage truncated\n");
> 
> Ditto for all those pr_debug's in here - I think we want to know why the
> bzImage load fails and pr_debug is not suitable for that.

Same here. We will potentially be trying multiple loaders and if every
loader prints messages for rejection by default, it is too much of info,
IMO.

> 
> > +   return ERR_PTR(-ENOEXEC);
> > +   }
> > +
> > +   if (cmdline_len > header->cmdline_size) {
> 
> As we talked, I think COMMAND_LINE_SIZE is perfectly fine and safe for
> all intents and purposes.

I still have concerns about using COMMAND_LINE_SIZE. If header information
is useful for a bootloader, then kernel is just a bootloader in this case
and if we really want to limit the size, it should be based on information
present in the header and not based on currently running kernel's limit.

> 
> > +   pr_debug("Kernel command line too long\n");
> > +   return ERR_PTR(-EINVAL);
> > +   }
> > +
> > +   /* Allocate loader specific data */
> > +   ldata = kzalloc(sizeof(struct bzimage64_data), GFP_KERNEL);
> > +   if

Re: [PATCH] Check for Null return from logfs_readpage_nolock in btree_write_block

2014-06-16 Thread Mateusz Guzik

On Mon, Jun 16, 2014 at 03:47:01PM -0400, Nicholas Krause wrote:
> Signed-off-by: Nicholas Krause 
> ---
>  fs/logfs/readwrite.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/logfs/readwrite.c b/fs/logfs/readwrite.c
> index 4814031..adb9233 100644
> --- a/fs/logfs/readwrite.c
> +++ b/fs/logfs/readwrite.c
> @@ -2210,6 +2210,8 @@ void btree_write_block(struct logfs_block *block)
>   page = logfs_get_write_page(inode, block->bix, block->level);
>  
>   err = logfs_readpage_nolock(page);
> + if (!err)
> + return -ENOMEM;
>   BUG_ON(err);
>   BUG_ON(!PagePrivate(page));
>   BUG_ON(logfs_block(page) != block);

This function returns 0 on success, which you turn into error condition
and return -ENOMEM.
But the function returns void, thus it cannot return the error.
It does not allocate anything, thus -ENOMEM would not be appropriate in
the first place.

Since the function returns void, nobody would check the condition in
the first place.

Even if it was not void, it would either have to return the error or
oops on BUG_ON(err).
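
A minimal userspace sketch of the convention in question, with hypothetical names throughout: a helper that returns 0 on success and a negative errno on failure, called from a function that returns void.

```c
#include <errno.h>

/* Hypothetical helper: returns 0 on success, -errno on failure. */
static int read_page(int simulate_failure)
{
	return simulate_failure ? -EIO : 0;
}

static int blocks_written;

/*
 * A void function cannot hand an error back to its caller; it can
 * only bail out itself. Note the test is "if (err)", not "if (!err)".
 */
static void write_block(int simulate_failure)
{
	int err = read_page(simulate_failure);

	if (err)	/* non-zero means failure */
		return;
	blocks_written++;
}
```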

Read the code more carefully and at least compile-test your changes.
Instructions how to compile the kernel can be found here:
http://kernelnewbies.org/FAQ/KernelCompilation

I would also suggest letting the patch wait a few hours and having another
look before sending.

Cheers,
-- 
Mateusz Guzik

Re: [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load

2014-06-16 Thread Borislav Petkov

On Mon, Jun 16, 2014 at 01:38:23PM -0400, Vivek Goyal wrote:
> And what's the sane default in this case?

COMMAND_LINE_SIZE

> Using current kernel's command line size will not work if future
> kernel decide to support even longer command line size.

When do you ever get to kexec a kernel with command line size differing
from the first kernel? This use case is pretty much non-existent to
say the least (mind you, I'm open to examples but am still waiting for
them). And even then you go and simply upgrade the first kernel.

Why are we even talking about this?

> I agree that some kind of upper value is good. But I am disagreeing
> that using current kernel's COMMAND_LINE_SIZE is better thing to do.

Again, stop arguing about some nonsensical cases and give me a real use
case where this is a problem.

> Also what's the upper limit on initramfs size? There is none. The issues
> you are trying to prevent can be easily created simply by passing in
> a large initrd file.
> 
> If we are not putting any sane defaults on size of kernel and initramfs, I
> am not really sure what do we gain here by putting an incorrect limit
> on kernel command line size.

You need to have a *sane* default length for a command line size - not
what's possible or what's not - something sane.

> Who knows that in future we might have to extend it beyond 2048. You
> can't say that 2048 will never be changed. Nobody knows.

Dude, stop arguing this dumb case - if the command line size is changed,
you simply update the first kernel. What is the use case of having to
kexec a newer kernel on an older kernel? Spit it out already.

> > And even if this is a problem - which I seriously doubt - it would be
> > problem with the 1st kernel too, not only with the kexec-ed one.
> 
> Why it will be a problem with first kernel?

Because if a kernel overflows COMMAND_LINE_SIZE, then something's wrong
with that use case and needs to get information passed in a different
manner - not 2K of cmdline string. Again, where is the sane use case?

> So assuming that you will agree that we might have to extend kernel
> command line some day, my question is how would you support kexec from
> old kernel to newer kernel with larger command line size.

Why do I need to support that case?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH v7] iio: add support of the max1027

2014-06-16 Thread Jonathan Cameron


On 14/06/14 23:27, Philippe Reynes wrote:

This driver adds partial support for the
Maxim 1027/1029/1031. Differential mode is not
supported.

It was tested on armadeus apf27 board.

Signed-off-by: Philippe Reynes 

I'm happy with this now.

Hartmut, anything else you want to raise, or do you want to add
a Reviewed-by? No particular rush at this point in the cycle, so it
would be nice to acknowledge your hard work - I am very pleased
to see how much reviewing you (and others!) are now doing on this
list.  Makes my side of things much more manageable!

J

---
  .../devicetree/bindings/iio/adc/max1027-adc.txt|   22 +
  drivers/iio/adc/Kconfig|9 +
  drivers/iio/adc/Makefile   |1 +
  drivers/iio/adc/max1027.c  |  521 
  4 files changed, 553 insertions(+), 0 deletions(-)
  create mode 100644 Documentation/devicetree/bindings/iio/adc/max1027-adc.txt
  create mode 100644 drivers/iio/adc/max1027.c

Changelog:
v7: (thanks Hartmut Knaack and Jonathan Cameron for the feedback)
- use const with info in max1027_state
- use bitops.h (and BIT)
- separate operators with whitespace
- define a reg member in max1027_state

v6: (thanks Peter Meerwald for the feedback)
- dont use index with temperature
- use const on scan mask
- use __be16 * for buffer
- return -EBUSY if trigger is already running
- replace it by interrupt in comment
- dont finish case with ";"
- use iio_push_to_buffers
- remove useless initialization of st->buffer
- remove useless obvious comment

v5: (thanks Jonathan Cameron for the feedback)
- add validate_trigger
- add validate_device
- remove useless (void *) cast
- use allocated buffer for spi_write

v4: (thanks Jonathan Cameron for the feedback)
- use iio_trigger_generic_data_rdy_poll

v3: (thanks Hartmut Knaack, Lars-Peter Clausen and Jonathan Cameron for the feedback)
- move to drivers/iio/adc (was in staging)
- clean binding doc
- drop empty update_scan_mode callback
- add a lock around single channel read code
- remove useless wrappers around spi_write and spi_read
- fix available scan mask (a bit was missing)
- remove useless header
- some other little cleanups

v2: (thanks Hartmut Knaack and Jonathan Cameron for the feedback)
- really use devm_*
- use demux magic
- use spi_read and spi_write (instead of spi_sync)
- use define for register (instead of hardcoded value)

diff --git a/Documentation/devicetree/bindings/iio/adc/max1027-adc.txt b/Documentation/devicetree/bindings/iio/adc/max1027-adc.txt
new file mode 100644
index 0000000..a8770cc
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/adc/max1027-adc.txt
@@ -0,0 +1,22 @@
+* Maxim 1027/1029/1031 Analog to Digital Converter (ADC)
+
+Required properties:
+  - compatible: Should be "maxim,max1027" or "maxim,max1029" or "maxim,max1031"
+  - reg: SPI chip select number for the device
+  - interrupt-parent: phandle to the parent interrupt controller
+  see: Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
+  - interrupts: IRQ line for the ADC
+  see: Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
+
+Recommended properties:
+- spi-max-frequency: Definition as per
+ Documentation/devicetree/bindings/spi/spi-bus.txt
+
+Example:
+adc@0 {
+   compatible = "maxim,max1027";
+   reg = <0>;
+   interrupt-parent = <>;
+   interrupts = <15 IRQ_TYPE_EDGE_RISING>;
+   spi-max-frequency = <100>;
+};
diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
index a80d236..20a7073 100644
--- a/drivers/iio/adc/Kconfig
+++ b/drivers/iio/adc/Kconfig
@@ -131,6 +131,15 @@ config LP8788_ADC
help
  Say yes here to build support for TI LP8788 ADC.

+config MAX1027
+   tristate "Maxim max1027 ADC driver"
+   depends on SPI
+   select IIO_BUFFER
+   select IIO_TRIGGERED_BUFFER
+   help
+ Say yes here to build support for Maxim SPI ADC models
+ max1027, max1029 and max1031.
+
  config MAX1363
tristate "Maxim max1363 ADC driver"
depends on I2C
diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
index 9d60f2d..38cf5c3 100644
--- a/drivers/iio/adc/Makefile
+++ b/drivers/iio/adc/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_AD799X) += ad799x.o
  obj-$(CONFIG_AT91_ADC) += at91_adc.o
  obj-$(CONFIG_EXYNOS_ADC) += exynos_adc.o
  obj-$(CONFIG_LP8788_ADC) += lp8788_adc.o
+obj-$(CONFIG_MAX1027) += max1027.o
  obj-$(CONFIG_MAX1363) += max1363.o
  obj-$(CONFIG_MCP320X) += mcp320x.o
  obj-$(CONFIG_MCP3422) += mcp3422.o
diff --git a/drivers/iio/adc/max1027.c b/drivers/iio/adc/max1027.c
new file mode 100644
index 0000000..87ee1c7
--- /dev/null
+++ b/drivers/iio/adc/max1027.c
@@ -0,0 +1,521 @@
+ /*
+  * iio/adc/max1027.c
+  * Copyright (C) 2014 Philippe Reynes
+  *
+  * based on linux/drivers/iio/ad7923.c
+  * Copyright 2011 Analog Devices Inc (from AD7923 Driver)
+  * Copyright 2012 CS Systemes d'Information
+  *
+  *

[PATCH v3 1/2] phy: qcom: Add driver for QCOM IPQ806x SATA PHY

2014-06-16 Thread Kumar Gala


Add a PHY driver for use with the AHCI-based SATA controller driver on the
IPQ806x family of SoCs.

Signed-off-by: Kumar Gala 
---
v3:
* Added Kconfig HAS_IOMEM dep
* re-ordered probe function so phy_provider_register is last
 
v2:
* dropped unused dev pointer in struct qcom_ipq806x_sata_phy
* remove unnecessary reg initialization
* Removed unneeded error message
* Added remove function to disable the clock

 drivers/phy/Kconfig |   7 ++
 drivers/phy/Makefile|   1 +
 drivers/phy/phy-qcom-ipq806x-sata.c | 211 
 3 files changed, 219 insertions(+)
 create mode 100644 drivers/phy/phy-qcom-ipq806x-sata.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index 16a2f06..b7b6bce 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -178,4 +178,11 @@ config PHY_XGENE
help
  This option enables support for APM X-Gene SoC multi-purpose PHY.
 
+config PHY_QCOM_IPQ806X_SATA
+   tristate "Qualcomm IPQ806x SATA SerDes/PHY driver"
+   depends on ARCH_QCOM
+   depends on HAS_IOMEM
+   depends on OF
+   select GENERIC_PHY
+
 endmenu
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index b4f1d57..d950317 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -20,3 +20,4 @@ phy-exynos-usb2-$(CONFIG_PHY_EXYNOS4X12_USB2) += phy-exynos4x12-usb2.o
 phy-exynos-usb2-$(CONFIG_PHY_EXYNOS5250_USB2)  += phy-exynos5250-usb2.o
 obj-$(CONFIG_PHY_EXYNOS5_USBDRD)   += phy-exynos5-usbdrd.o
 obj-$(CONFIG_PHY_XGENE)+= phy-xgene.o
+obj-$(CONFIG_PHY_QCOM_IPQ806X_SATA)+= phy-qcom-ipq806x-sata.o
diff --git a/drivers/phy/phy-qcom-ipq806x-sata.c b/drivers/phy/phy-qcom-ipq806x-sata.c
new file mode 100644
index 0000000..e931aee
--- /dev/null
+++ b/drivers/phy/phy-qcom-ipq806x-sata.c
@@ -0,0 +1,211 @@
+/*
+ * Copyright (c) 2014, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct qcom_ipq806x_sata_phy {
+   void __iomem *mmio;
+   struct clk *cfg_clk;
+};
+
+#define __set(v, a, b) (((v) << (b)) & GENMASK(a, b))
+
+#define SATA_PHY_P0_PARAM0 0x200
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3(x)   __set(x, 17, 12)
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3_MASK GENMASK(17, 12)
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN2(x)   __set(x, 11, 6)
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN2_MASK GENMASK(11, 6)
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN1(x)   __set(x, 5, 0)
+#define SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN1_MASK GENMASK(5, 0)
+
+#define SATA_PHY_P0_PARAM1 0x204
+#define SATA_PHY_P0_PARAM1_RESERVED_BITS31_21(x)   __set(x, 31, 21)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN3(x) __set(x, 20, 14)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN3_MASK   GENMASK(20, 14)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN2(x) __set(x, 13, 7)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN2_MASK   GENMASK(13, 7)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN1(x) __set(x, 6, 0)
+#define SATA_PHY_P0_PARAM1_P0_TX_AMPLITUDE_GEN1_MASK   GENMASK(6, 0)
+
+#define SATA_PHY_P0_PARAM2 0x208
+#define SATA_PHY_P0_PARAM2_RX_EQ(x)__set(x, 20, 18)
+#define SATA_PHY_P0_PARAM2_RX_EQ_MASK  GENMASK(20, 18)
+
+#define SATA_PHY_P0_PARAM3 0x20C
+#define SATA_PHY_SSC_EN0x8
+#define SATA_PHY_P0_PARAM4 0x210
+#define SATA_PHY_REF_SSP_EN0x2
+#define SATA_PHY_RESET 0x1
+
+static inline void qcom_ipq806x_sata_delay_us(unsigned int delay)
+{
+   /* sleep for max. 50us more to combine processor wakeups */
+   usleep_range(delay, delay + 50);
+}
+
+static int qcom_ipq806x_sata_phy_init(struct phy *generic_phy)
+{
+   struct qcom_ipq806x_sata_phy *phy = phy_get_drvdata(generic_phy);
+   u32 reg;
+
+   /* Setting SSC_EN to 1 */
+   reg = readl_relaxed(phy->mmio + SATA_PHY_P0_PARAM3);
+   reg = reg | SATA_PHY_SSC_EN;
+   writel_relaxed(reg, phy->mmio + SATA_PHY_P0_PARAM3);
+
+   reg = readl_relaxed(phy->mmio + SATA_PHY_P0_PARAM0) &
+   ~(SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3_MASK |
+ SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN2_MASK |
+ SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN1_MASK);
+   reg |= SATA_PHY_P0_PARAM0_P0_TX_PREEMPH_GEN3(0xf);
+   writel_relaxed(reg,

Re: [RFC PATCH 1/4] memory: tegra124-emc: Add EMC driver

2014-06-16 Thread Stephen Warren

On 06/16/2014 07:35 AM, Tomeu Vizoso wrote:
> Adds functionality for registering memory bandwidth needs and setting
> the EMC clock rate based on that.
> 
> Also adds API for setting floor and ceiling frequency rates.

> diff --git a/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra124-emc.txt b/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra124-emc.txt
> new file mode 100644
> index 0000000..88e6a55
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra124-emc.txt
> @@ -0,0 +1,26 @@
> +Tegra124 External Memory Controller
> +
> +Properties:
> +- compatible : Should contain "nvidia,tegra124-emc".
> +- reg : Should contain the register range of the device
> +- #address-cells : Should be 1
> +- #size-cells : Should be 0
> +- nvidia,mc : phandle to the mc bus connected to EMC.
> +- clocks : phandle to EMC, EMC shared bus override, and all parent clocks.
> +- clock-names : name of each clock.
> +- nvidia,pmc : phandle to the PMC syscon node.
> +- max-clock-frequency : optional, specifies the maximum EMC rate in kHz.
> +
> +Child device nodes describe the memory settings for different configurations and
> +clock rates.

How do the child nodes do that? The binding needs to specify the format
of the child node. This binding looks quite anaemic vs.
Documentation/devicetree/bindings/arm/tegra/nvidia,tegra20-emc.txt; I
would expect that this binding needs all the EMC register data from the
tegra20-emc binding too. Can the two bindings be identical?

Can you explain what the nvidia,mc and nvidia,pmc references are needed
for? Hopefully, this driver isn't going to reach into those devices and
touch their registers directly.

> diff --git a/include/linux/platform_data/tegra_emc.h b/include/linux/platform_data/tegra_emc.h

A header file that defines platform data format isn't the correct place
to put the definitions of public APIs. I'd expect something more like
.

> +#ifdef CONFIG_TEGRA124_EMC
> +int tegra124_emc_reserve_bandwidth(unsigned int consumer, unsigned long rate);
> +void tegra124_emc_set_floor(unsigned long freq);
> +void tegra124_emc_set_ceiling(unsigned long freq);
> +#else
> +int tegra124_emc_reserve_bandwidth(unsigned int consumer, unsigned long rate)
> +{ return -ENODEV; }
> +void tegra124_emc_set_floor(unsigned long freq)
> +{ return; }
> +void tegra124_emc_set_ceiling(unsigned long freq)
> +{ return; }
> +#endif

I'll repeat what I said off-list so that we can have the whole
conversation on the list:

That looks like a custom Tegra-specific API. I think it'd be much better
to integrate this into the common clock framework as a standard clock
constraints API. There are other use-cases for clock constraints besides
EMC scaling (e.g. some in audio on Tegra, and I'm sure many on other
SoCs too).

See https://lkml.org/lkml/2014/5/16/569 for some previous discussion on
this topic.

[PATCH] audit: use atomic_t to simplify audit_serial()

2014-06-16 Thread Richard Guy Briggs

Since there is already a primitive to do this operation in the atomic_t, use it
to simplify audit_serial().

Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c |   14 ++
 1 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 218899b..d41266c 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1257,19 +1257,9 @@ err:
  */
 unsigned int audit_serial(void)
 {
-   static DEFINE_SPINLOCK(serial_lock);
-   static unsigned int serial = 0;
+   static atomic_t serial = ATOMIC_INIT(0);
 
-   unsigned long flags;
-   unsigned int ret;
-
-   spin_lock_irqsave(&serial_lock, flags);
-   do {
-   ret = ++serial;
-   } while (unlikely(!ret));
-   spin_unlock_irqrestore(&serial_lock, flags);
-
-   return ret;
+   return atomic_add_return(1, &serial);
 }
 
 static inline void audit_get_stamp(struct audit_context *ctx,
-- 
1.7.1
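
For readers outside the kernel, a rough userspace analogue of the replacement, using C11 <stdatomic.h> instead of the kernel's atomic_t. The function name is hypothetical and this is only a sketch of the add-and-return pattern, not the audit code itself.

```c
#include <stdatomic.h>

/* Userspace analogue of a function-local atomic counter. */
static unsigned int next_serial(void)
{
	static atomic_uint serial;	/* static storage, starts at zero */

	/* atomic_fetch_add() returns the old value; +1 gives the new one. */
	return atomic_fetch_add(&serial, 1) + 1;
}
```

Unlike the removed do/while loop, this sketch does not skip 0 when the counter wraps around.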


Re: [PATCH] hugetlb: fix copy_hugetlb_page_range() to handle migration/hwpoisoned entry

2014-06-16 Thread Naoya Horiguchi

On Sun, Jun 15, 2014 at 05:19:29PM -0700, Hugh Dickins wrote:
> On Fri, 6 Jun 2014, Naoya Horiguchi wrote:
> 
> > There's a race between fork() and hugepage migration, as a result we try to
> > "dereference" a swap entry as a normal pte, causing kernel panic.
> > The cause of the problem is that copy_hugetlb_page_range() can't handle the
> > "swap entry" family (migration entry and hwpoisoned entry), so let's fix it.
> > 
> > Signed-off-by: Naoya Horiguchi 
> > Cc: sta...@vger.kernel.org # v2.6.36+
> 
> Seems a good catch.  But a few reservations...
> 
> > ---
> >  include/linux/mm.h |  6 +
> >  mm/hugetlb.c   | 72 
> > --
> >  mm/memory.c|  5 
> >  3 files changed, 49 insertions(+), 34 deletions(-)
> > 
> > diff --git v3.15-rc8.orig/include/linux/mm.h v3.15-rc8/include/linux/mm.h
> > index d6777060449f..6b4fe9ec79ba 100644
> > --- v3.15-rc8.orig/include/linux/mm.h
> > +++ v3.15-rc8/include/linux/mm.h
> > @@ -1924,6 +1924,12 @@ static inline struct vm_area_struct *find_exact_vma(struct mm_struct *mm,
> > return vma;
> >  }
> >  
> > +static inline bool is_cow_mapping(vm_flags_t flags)
> > +{
> > +   return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
> > +}
> > +
> > +
> 
> This is an unrelated cleanup, which makes the patch unnecessarily larger,
> needlessly touching include/linux/mm.h and mm/memory.c, making it more
> likely not to apply to all the old releases you're asking for in the
> stable line.

OK, I'll drop this inessential change.
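
For context, the predicate being discussed can be exercised in isolation as below. The flag values are illustrative stand-ins defined locally for the sketch; the authoritative definitions are in the kernel headers.

```c
#include <stdbool.h>

/* Illustrative flag values, defined here only for the sketch. */
#define VM_SHARED	0x08UL
#define VM_MAYWRITE	0x20UL

/* Private (not shared) and potentially writable means copy-on-write. */
static bool is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}
```

A shared writable mapping has both bits set and so is not treated as COW; a private read-only mapping has neither and is not COW either.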

> And 3.16-rc moves is_cow_mapping() to mm/internal.h not include/linux/mm.h.
> 
> >  #ifdef CONFIG_MMU
> >  pgprot_t vm_get_page_prot(unsigned long vm_flags);
> >  #else
> > diff --git v3.15-rc8.orig/mm/hugetlb.c v3.15-rc8/mm/hugetlb.c
> > index c82290b9c1fc..47ae7db288f7 100644
> > --- v3.15-rc8.orig/mm/hugetlb.c
> > +++ v3.15-rc8/mm/hugetlb.c
> > @@ -2377,6 +2377,31 @@ static void set_huge_ptep_writable(struct vm_area_struct *vma,
> > update_mmu_cache(vma, address, ptep);
> >  }
> >  
> > +static int is_hugetlb_entry_migration(pte_t pte)
> > +{
> > +   swp_entry_t swp;
> > +
> > +   if (huge_pte_none(pte) || pte_present(pte))
> > +   return 0;
> > +   swp = pte_to_swp_entry(pte);
> > +   if (non_swap_entry(swp) && is_migration_entry(swp))
> > +   return 1;
> > +   else
> > +   return 0;
> > +}
> > +
> > +static int is_hugetlb_entry_hwpoisoned(pte_t pte)
> > +{
> > +   swp_entry_t swp;
> > +
> > +   if (huge_pte_none(pte) || pte_present(pte))
> > +   return 0;
> > +   swp = pte_to_swp_entry(pte);
> > +   if (non_swap_entry(swp) && is_hwpoison_entry(swp))
> > +   return 1;
> > +   else
> > +   return 0;
> > +}
> >  
> >  int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > struct vm_area_struct *vma)
> > @@ -2391,7 +2416,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > unsigned long mmun_end; /* For mmu_notifiers */
> > int ret = 0;
> >  
> > -   cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
> > +   cow = is_cow_mapping(vma->vm_flags);
> 
> So, just leave this out and it all becomes easier, no?

Yes, let's leave this.

> >  
> > mmun_start = vma->vm_start;
> > mmun_end = vma->vm_end;
> > @@ -2416,10 +2441,25 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > dst_ptl = huge_pte_lock(h, dst, dst_pte);
> > src_ptl = huge_pte_lockptr(h, src, src_pte);
> > spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
> > -   if (!huge_pte_none(huge_ptep_get(src_pte))) {
> > +   entry = huge_ptep_get(src_pte);
> > +   if (huge_pte_none(entry)) { /* skip none entry */
> > +   ;
> 
> Not very pretty, but I would probably have made the same choice.
> 
> > +   } else if (unlikely(is_hugetlb_entry_migration(entry) ||
> > +   is_hugetlb_entry_hwpoisoned(entry))) {
> > +   swp_entry_t swp_entry = pte_to_swp_entry(entry);
> > +   if (is_write_migration_entry(swp_entry) && cow) {
> > +   /*
> > +* COW mappings require pages in both
> > +* parent and child to be set to read.
> > +*/
> > +   make_migration_entry_read(&swp_entry);
> > +   entry = swp_entry_to_pte(swp_entry);
> > +   set_pte_at(src, addr, src_pte, entry);
> > +   }
> > +   set_huge_pte_at(dst, addr, dst_pte, entry);
> 
> It's odd to see set_pte_at(src, addr, src_pte, entry)
> followed by set_huge_pte_at(dst, addr, dst_pte, entry).
> 
> Probably they should both say set_huge_pte_at().  But have you
> consulted the relevant architectures to check whether set_huge_pte_at()

[patch 01/12] mm: memcontrol: fold mem_cgroup_do_charge()

2014-06-16 Thread Johannes Weiner

This function was split out because mem_cgroup_try_charge() got too
big.  But having essentially one sequence of operations arbitrarily
split in half is not good for reworking the code.  Fold it back in.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 166 ++--
 1 file changed, 64 insertions(+), 102 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2c7bcb0e6eb..94531df14d37 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2551,80 +2551,6 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
return NOTIFY_OK;
 }
 
-
-/* See mem_cgroup_try_charge() for details */
-enum {
-   CHARGE_OK,  /* success */
-   CHARGE_RETRY,   /* need to retry but retry is not bad */
-   CHARGE_NOMEM,   /* we can't do more. return -ENOMEM */
-   CHARGE_WOULDBLOCK,  /* GFP_WAIT wasn't set and no enough res. */
-};
-
-static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
-   unsigned int nr_pages, unsigned int min_pages,
-   bool invoke_oom)
-{
-   unsigned long csize = nr_pages * PAGE_SIZE;
-   struct mem_cgroup *mem_over_limit;
-   struct res_counter *fail_res;
-   unsigned long flags = 0;
-   int ret;
-
-   ret = res_counter_charge(&memcg->res, csize, &fail_res);
-
-   if (likely(!ret)) {
-   if (!do_swap_account)
-   return CHARGE_OK;
-   ret = res_counter_charge(&memcg->memsw, csize, &fail_res);
-   if (likely(!ret))
-   return CHARGE_OK;
-
-   res_counter_uncharge(&memcg->res, csize);
-   mem_over_limit = mem_cgroup_from_res_counter(fail_res, memsw);
-   flags |= MEM_CGROUP_RECLAIM_NOSWAP;
-   } else
-   mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
-   /*
-* Never reclaim on behalf of optional batching, retry with a
-* single page instead.
-*/
-   if (nr_pages > min_pages)
-   return CHARGE_RETRY;
-
-   if (!(gfp_mask & __GFP_WAIT))
-   return CHARGE_WOULDBLOCK;
-
-   if (gfp_mask & __GFP_NORETRY)
-   return CHARGE_NOMEM;
-
-   ret = mem_cgroup_reclaim(mem_over_limit, gfp_mask, flags);
-   if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
-   return CHARGE_RETRY;
-   /*
-* Even though the limit is exceeded at this point, reclaim
-* may have been able to free some pages.  Retry the charge
-* before killing the task.
-*
-* Only for regular pages, though: huge pages are rather
-* unlikely to succeed so close to the limit, and we fall back
-* to regular pages anyway in case of failure.
-*/
-   if (nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER) && ret)
-   return CHARGE_RETRY;
-
-   /*
-* At task move, charge accounts can be doubly counted. So, it's
-* better to wait until the end of task_move if something is going on.
-*/
-   if (mem_cgroup_wait_acct_move(mem_over_limit))
-   return CHARGE_RETRY;
-
-   if (invoke_oom)
-   mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(csize));
-
-   return CHARGE_NOMEM;
-}
-
 /**
  * mem_cgroup_try_charge - try charging a memcg
  * @memcg: memcg to charge
@@ -2641,7 +2567,11 @@ static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
 {
unsigned int batch = max(CHARGE_BATCH, nr_pages);
int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
-   int ret;
+   struct mem_cgroup *mem_over_limit;
+   struct res_counter *fail_res;
+   unsigned long nr_reclaimed;
+   unsigned long flags = 0;
+   unsigned long long size;
 
if (mem_cgroup_is_root(memcg))
goto done;
@@ -2661,44 +2591,76 @@ static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
 
if (gfp_mask & __GFP_NOFAIL)
oom = false;
-again:
+retry:
if (consume_stock(memcg, nr_pages))
goto done;
 
-   do {
-   bool invoke_oom = oom && !nr_oom_retries;
+   size = batch * PAGE_SIZE;
+   if (!res_counter_charge(&memcg->res, size, &fail_res)) {
+   if (!do_swap_account)
+   goto done_restock;
+   if (!res_counter_charge(&memcg->memsw, size, &fail_res))
+   goto done_restock;
+   res_counter_uncharge(&memcg->res, size);
+   mem_over_limit = mem_cgroup_from_res_counter(fail_res, memsw);
+   flags |= MEM_CGROUP_RECLAIM_NOSWAP;
+   } else
+   mem_over_limit = mem_cgroup_from_res_counter(fail_res, res);
 
-   /* If killed, bypass charge */
-   if (fatal_signal_pending(current))
-   goto bypass;
+   if (batch > nr_pages) {
+   batch = nr_pages;
+

[patch 02/12] mm: memcontrol: rearrange charging fast path

2014-06-16 Thread Johannes Weiner

The charging path currently starts out with OOM condition checks when
OOM is the rarest possible case.

Rearrange this code to run OOM/task dying checks only after trying the
percpu charge and the res_counter charge and bail out before entering
reclaim.  Attempting a charge does not hurt an (oom-)killed task as
much as every charge attempt having to check OOM conditions.  Also,
only check __GFP_NOFAIL when the charge would actually fail.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94531df14d37..e946f7439b16 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2575,22 +2575,6 @@ static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
 
if (mem_cgroup_is_root(memcg))
goto done;
-   /*
-* Unlike in global OOM situations, memcg is not in a physical
-* memory shortage.  Allow dying and OOM-killed tasks to
-* bypass the last charges so that they can exit quickly and
-* free their memory.
-*/
-   if (unlikely(test_thread_flag(TIF_MEMDIE) ||
-fatal_signal_pending(current) ||
-current->flags & PF_EXITING))
-   goto bypass;
-
-   if (unlikely(task_in_memcg_oom(current)))
-   goto nomem;
-
-   if (gfp_mask & __GFP_NOFAIL)
-   oom = false;
 retry:
if (consume_stock(memcg, nr_pages))
goto done;
@@ -2612,6 +2596,20 @@ retry:
goto retry;
}
 
+   /*
+* Unlike in global OOM situations, memcg is not in a physical
+* memory shortage.  Allow dying and OOM-killed tasks to
+* bypass the last charges so that they can exit quickly and
+* free their memory.
+*/
+   if (unlikely(test_thread_flag(TIF_MEMDIE) ||
+fatal_signal_pending(current) ||
+current->flags & PF_EXITING))
+   goto bypass;
+
+   if (unlikely(task_in_memcg_oom(current)))
+   goto nomem;
+
if (!(gfp_mask & __GFP_WAIT))
goto nomem;
 
@@ -2640,6 +2638,9 @@ retry:
if (mem_cgroup_wait_acct_move(mem_over_limit))
goto retry;
 
+   if (gfp_mask & __GFP_NOFAIL)
+   goto bypass;
+
if (fatal_signal_pending(current))
goto bypass;
 
-- 
2.0.0


[patch 04/12] mm: memcontrol: retry reclaim for oom-disabled and __GFP_NOFAIL charges

2014-06-16 Thread Johannes Weiner

There is no reason why oom-disabled and __GFP_NOFAIL charges should
try to reclaim only once when every other charge tries several times
before giving up.  Make them all retry the same number of times.
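
The retry accounting can be illustrated with a small userspace model.  This
is only a sketch, not the kernel code: toy_charge_with_retries() and
TOY_RECLAIM_RETRIES are made-up names, and the value 5 is assumed purely for
illustration.  It shows that a post-decrement check of the form
"if (nr_retries--) goto retry;" yields N retries after the first attempt,
i.e. N + 1 attempts in total.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value for illustration; stands in for MEM_CGROUP_RECLAIM_RETRIES. */
#define TOY_RECLAIM_RETRIES 5

/*
 * Hypothetical model of the retry loop: the charge succeeds on attempt
 * number @succeed_on (1-based), or never if @succeed_on is 0.
 * Returns 0 or -ENOMEM and reports the attempts made via @attempts.
 */
static int toy_charge_with_retries(unsigned int succeed_on,
				   unsigned int *attempts)
{
	int nr_retries = TOY_RECLAIM_RETRIES;

	*attempts = 0;
retry:
	(*attempts)++;
	if (succeed_on && *attempts == succeed_on)
		return 0;

	/* Post-decrement: N retries after the first attempt, N + 1 in total. */
	if (nr_retries--)
		goto retry;

	/* Giving up always happens after exactly N + 1 attempts. */
	assert(*attempts == TOY_RECLAIM_RETRIES + 1);
	return -ENOMEM;
}
```

In this model every charge, whatever its flags, gets the same budget of
attempts before the give-up path is reached.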

Signed-off-by: Johannes Weiner 
---
 mm/memcontrol.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e946f7439b16..52550bbff1ef 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2566,7 +2566,7 @@ static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
 bool oom)
 {
unsigned int batch = max(CHARGE_BATCH, nr_pages);
-   int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
+   int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
struct mem_cgroup *mem_over_limit;
struct res_counter *fail_res;
unsigned long nr_reclaimed;
@@ -2638,6 +2638,9 @@ retry:
if (mem_cgroup_wait_acct_move(mem_over_limit))
goto retry;
 
+   if (nr_retries--)
+   goto retry;
+
if (gfp_mask & __GFP_NOFAIL)
goto bypass;
 
@@ -2647,9 +2650,6 @@ retry:
if (!oom)
goto nomem;
 
-   if (nr_oom_retries--)
-   goto retry;
-
mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(batch));
 nomem:
if (!(gfp_mask & __GFP_NOFAIL))
-- 
2.0.0


[patch 00/12] mm: memcontrol: naturalize charge lifetime v3

2014-06-16 Thread Johannes Weiner

Hi,

this is v3 of the memcg charge naturalization series.  Changes since
v2 include:

o make THP charges use __GFP_NORETRY to prevent excessive reclaim (Michal)
o simplify move precharging while in the area
o add acks & rebase to v3.16-rc1

These patches rework memcg charge lifetime to integrate more naturally
with the lifetime of user pages.  This drastically simplifies the code
and reduces charging and uncharging overhead.  The most expensive part
of charging and uncharging is the page_cgroup bit spinlock, which is
removed entirely after this series.

Here are the top-10 profile entries of a stress test that reads a 128G
sparse file on a freshly booted box, without even a dedicated cgroup
(i.e. executing in the root memcg).  Before:

15.36%  cat      [kernel.kallsyms]  [k] copy_user_generic_string
13.31%  cat      [kernel.kallsyms]  [k] memset
11.48%  cat      [kernel.kallsyms]  [k] do_mpage_readpage
 4.23%  cat      [kernel.kallsyms]  [k] get_page_from_freelist
 2.38%  cat      [kernel.kallsyms]  [k] put_page
 2.32%  cat      [kernel.kallsyms]  [k] __mem_cgroup_commit_charge
 2.18%  kswapd0  [kernel.kallsyms]  [k] __mem_cgroup_uncharge_common
 1.92%  kswapd0  [kernel.kallsyms]  [k] shrink_page_list
 1.86%  cat      [kernel.kallsyms]  [k] __radix_tree_lookup
 1.62%  cat      [kernel.kallsyms]  [k] __pagevec_lru_add_fn

After:

15.67%  cat      [kernel.kallsyms]  [k] copy_user_generic_string
13.48%  cat      [kernel.kallsyms]  [k] memset
11.42%  cat      [kernel.kallsyms]  [k] do_mpage_readpage
 3.98%  cat      [kernel.kallsyms]  [k] get_page_from_freelist
 2.46%  cat      [kernel.kallsyms]  [k] put_page
 2.13%  kswapd0  [kernel.kallsyms]  [k] shrink_page_list
 1.88%  cat      [kernel.kallsyms]  [k] __radix_tree_lookup
 1.67%  cat      [kernel.kallsyms]  [k] __pagevec_lru_add_fn
 1.39%  kswapd0  [kernel.kallsyms]  [k] free_pcppages_bulk
 1.30%  cat      [kernel.kallsyms]  [k] kfree

As you can see, the memcg footprint has shrunk quite a bit.

   text    data     bss     dec     hex filename
  37970    9892     400   48262    bc86 mm/memcontrol.o.old
  35303    9892     400   45595    b21b mm/memcontrol.o

 Documentation/cgroups/memcg_test.txt |  160 +---
 include/linux/memcontrol.h   |   94 +--
 include/linux/page_cgroup.h  |   43 +-
 include/linux/swap.h |   15 +-
 kernel/events/uprobes.c  |1 +
 mm/filemap.c |   13 +-
 mm/huge_memory.c |   57 +-
 mm/memcontrol.c  | 1516 --
 mm/memory.c  |   43 +-
 mm/migrate.c |   44 +-
 mm/rmap.c|   20 -
 mm/shmem.c   |   32 +-
 mm/swap.c|   40 +
 mm/swap_state.c  |8 +-
 mm/swapfile.c|   21 +-
 mm/truncate.c|9 -
 mm/vmscan.c  |   12 +-
 mm/zswap.c   |2 +-
 18 files changed, 754 insertions(+), 1376 deletions(-)


[patch 06/12] mm: memcontrol: simplify move precharge function

2014-06-16 Thread Johannes Weiner

The move precharge function does some baroque things: it tries raw
res_counter charging of the entire amount first, and then falls back
to a loop of one-by-one charges, with checks for pending signals and
cond_resched() batching.

Just use mem_cgroup_try_charge() without __GFP_WAIT for the first bulk
charge attempt.  In the one-by-one loop, remove the signal check (this
is already checked in try_charge), and simply call cond_resched()
after every charge - it's not that expensive.
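
The resulting control flow (one bulk attempt without reclaim, then single
charges with reclaim) can be sketched in userspace as follows.  This is a toy
model under stated assumptions, not the kernel implementation: struct
toy_counter, toy_charge() and toy_do_precharge() are hypothetical names, and
"reclaim" is reduced to dropping pages from a reclaimable pool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg's page counter. */
struct toy_counter {
	unsigned long usage;		/* pages currently charged */
	unsigned long limit;		/* maximum pages allowed */
	unsigned long reclaimable;	/* charged pages reclaim could free */
};

/* Charge @nr pages; reclaim only if @may_reclaim (models __GFP_WAIT). */
static int toy_charge(struct toy_counter *c, unsigned long nr,
		      bool may_reclaim)
{
	assert(c->reclaimable <= c->usage);

	if (c->usage + nr > c->limit && may_reclaim) {
		unsigned long over = c->usage + nr - c->limit;
		unsigned long freed = over < c->reclaimable ?
				      over : c->reclaimable;

		c->usage -= freed;
		c->reclaimable -= freed;
	}
	if (c->usage + nr > c->limit)
		return -ENOMEM;
	c->usage += nr;
	return 0;
}

/* Bulk attempt without reclaim first, then one page at a time. */
static int toy_do_precharge(struct toy_counter *c, unsigned long count,
			    unsigned long *precharge)
{
	int ret;

	ret = toy_charge(c, count, false);
	if (!ret) {
		*precharge += count;
		return 0;
	}

	while (count--) {
		ret = toy_charge(c, 1, true);
		if (ret)
			return ret;	/* partial precharge is left behind */
		(*precharge)++;
	}
	return 0;
}
```

As in the patch, a failure in the one-by-one loop leaves the charges made so
far in place; the caller is expected to drop them later.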

Signed-off-by: Johannes Weiner 
---
 mm/memcontrol.c | 51 +--
 1 file changed, 17 insertions(+), 34 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9c646b9b56f4..3d9df94896a7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6372,55 +6372,38 @@ static void mem_cgroup_css_free(struct cgroup_subsys_state *css)
 
 #ifdef CONFIG_MMU
 /* Handlers for move charge at task migration. */
-#define PRECHARGE_COUNT_AT_ONCE 256
 static int mem_cgroup_do_precharge(unsigned long count)
 {
-   int ret = 0;
-   int batch_count = PRECHARGE_COUNT_AT_ONCE;
-   struct mem_cgroup *memcg = mc.to;
+   int ret;
 
-   if (mem_cgroup_is_root(memcg)) {
+   if (mem_cgroup_is_root(mc.to)) {
mc.precharge += count;
/* we don't need css_get for root */
return ret;
}
-   /* try to charge at once */
-   if (count > 1) {
-   struct res_counter *dummy;
-   /*
-* "memcg" cannot be under rmdir() because we've already checked
-* by cgroup_lock_live_cgroup() that it is not removed and we
-* are still under the same cgroup_mutex. So we can postpone
-* css_get().
-*/
-   if (res_counter_charge(&memcg->res, PAGE_SIZE * count, &dummy))
-   goto one_by_one;
-   if (do_swap_account && res_counter_charge(&memcg->memsw,
-   PAGE_SIZE * count, &dummy)) {
-   res_counter_uncharge(&memcg->res, PAGE_SIZE * count);
-   goto one_by_one;
-   }
+
+   /* Try a single bulk charge without reclaim first */
+   ret = mem_cgroup_try_charge(mc.to, GFP_KERNEL & ~__GFP_WAIT,
+   count, false);
+   if (!ret) {
mc.precharge += count;
return ret;
}
-one_by_one:
-   /* fall back to one by one charge */
+
+   /* Try charges one by one with reclaim */
while (count--) {
-   if (signal_pending(current)) {
-   ret = -EINTR;
-   break;
-   }
-   if (!batch_count--) {
-   batch_count = PRECHARGE_COUNT_AT_ONCE;
-   cond_resched();
-   }
-   ret = mem_cgroup_try_charge(memcg, GFP_KERNEL, 1, false);
+   ret = mem_cgroup_try_charge(mc.to, GFP_KERNEL, 1, false);
+   /*
+* In case of failure, any residual charges against
+* mc.to will be dropped by mem_cgroup_clear_mc()
+* later on.
+*/
if (ret)
-   /* mem_cgroup_clear_mc() will do uncharge later */
return ret;
mc.precharge++;
+   cond_resched();
}
-   return ret;
+   return 0;
 }
 
 /**
-- 
2.0.0


[patch 03/12] mm: huge_memory: use GFP_TRANSHUGE when charging huge pages

2014-06-16 Thread Johannes Weiner

Transparent huge page charges prefer falling back to regular pages
rather than spending a lot of time in direct reclaim.

Desired reclaim behavior is usually declared in the gfp mask, but THP
charges use GFP_KERNEL and then rely on the fact that OOM is disabled
for THP charges, and that OOM-disabled charges currently skip reclaim.
Needless to say, this is anything but obvious and quite error prone.

Convert THP charges to use GFP_TRANSHUGE instead, which implies
__GFP_NORETRY, to indicate the low-latency requirement.

Signed-off-by: Johannes Weiner 
---
 mm/huge_memory.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e60837dc785c..10cd7f2bf776 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -827,7 +827,7 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
count_vm_event(THP_FAULT_FALLBACK);
return VM_FAULT_FALLBACK;
}
-   if (unlikely(mem_cgroup_charge_anon(page, mm, GFP_KERNEL))) {
+   if (unlikely(mem_cgroup_charge_anon(page, mm, GFP_TRANSHUGE))) {
put_page(page);
count_vm_event(THP_FAULT_FALLBACK);
return VM_FAULT_FALLBACK;
@@ -1101,7 +1101,7 @@ alloc:
goto out;
}
 
-   if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))) {
+   if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_TRANSHUGE))) {
put_page(new_page);
if (page) {
split_huge_page(page);
@@ -2368,7 +2368,7 @@ static void collapse_huge_page(struct mm_struct *mm,
if (!new_page)
return;
 
-   if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL)))
+   if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_TRANSHUGE)))
return;
 
/*
-- 
2.0.0


[PATCH] audit: use union for audit_field values since they are mutually exclusive

2014-06-16 Thread Richard Guy Briggs

Since only one of val, uid and gid is used at any given time, combine them in
a union to reduce the size of struct audit_field.
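
The size effect can be checked with a userspace model.  This is a sketch
under assumptions: kuid_model_t and kgid_model_t are hypothetical stand-ins
that wrap a 32-bit value, and only the first fields of the struct are
modeled, so the byte counts apply to this model rather than to the real
struct audit_field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for kuid_t/kgid_t; assumed to wrap a single 32-bit value. */
typedef struct { uint32_t val; } kuid_model_t;
typedef struct { uint32_t val; } kgid_model_t;

/* Layout before the change: three separate value fields. */
struct field_separate {
	uint32_t	type;
	uint32_t	val;
	kuid_model_t	uid;
	kgid_model_t	gid;
	uint32_t	op;
};

/* Layout after the change: the three values share storage. */
struct field_union {
	uint32_t	type;
	union {
		uint32_t	val;
		kuid_model_t	uid;
		kgid_model_t	gid;
	};
	uint32_t	op;
};

/* Bytes saved per field by overlaying val, uid and gid. */
static size_t bytes_saved(void)
{
	assert(sizeof(struct field_union) < sizeof(struct field_separate));
	return sizeof(struct field_separate) - sizeof(struct field_union);
}
```

Because the members now alias each other, storing to uid or gid after val
overwrites val.  That is presumably why the INVALID_UID/INVALID_GID
initializations following the f->val assignment have to go.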

Signed-off-by: Richard Guy Briggs 
---
 include/linux/audit.h |8 +---
 kernel/auditfilter.c  |2 --
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 1ae0089..06141b3 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -66,9 +66,11 @@ struct audit_krule {
 
 struct audit_field {
u32 type;
-   u32 val;
-   kuid_t  uid;
-   kgid_t  gid;
+   union {
+   u32 val;
+   kuid_t  uid;
+   kgid_t  gid;
+   };
u32 op;
char *lsm_str;
void *lsm_rule;
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index b65a138..ea8d389 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -422,8 +422,6 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
 
f->type = data->fields[i];
f->val = data->values[i];
-   f->uid = INVALID_UID;
-   f->gid = INVALID_GID;
f->lsm_str = NULL;
f->lsm_rule = NULL;
 
-- 
1.7.1


[patch 10/12] mm: memcontrol: do not acquire page_cgroup lock for kmem pages

2014-06-16 Thread Johannes Weiner

Kmem page charging and uncharging is serialized by means of exclusive
access to the page.  Do not take the page_cgroup lock and don't set
pc->flags atomically.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
Acked-by: Vladimir Davydov 
---
 mm/memcontrol.c | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1cde6e2b33d9..764e182ccde3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3414,12 +3414,13 @@ void __memcg_kmem_commit_charge(struct page *page, struct mem_cgroup *memcg,
memcg_uncharge_kmem(memcg, PAGE_SIZE << order);
return;
}
-
+   /*
+* The page is freshly allocated and not visible to any
+* outside callers yet.  Set up pc non-atomically.
+*/
pc = lookup_page_cgroup(page);
-   lock_page_cgroup(pc);
pc->mem_cgroup = memcg;
-   SetPageCgroupUsed(pc);
-   unlock_page_cgroup(pc);
+   pc->flags = PCG_USED;
 }
 
 void __memcg_kmem_uncharge_pages(struct page *page, int order)
@@ -3429,19 +3430,11 @@ void __memcg_kmem_uncharge_pages(struct page *page, int order)
 
 
pc = lookup_page_cgroup(page);
-   /*
-* Fast unlocked return. Theoretically might have changed, have to
-* check again after locking.
-*/
if (!PageCgroupUsed(pc))
return;
 
-   lock_page_cgroup(pc);
-   if (PageCgroupUsed(pc)) {
-   memcg = pc->mem_cgroup;
-   ClearPageCgroupUsed(pc);
-   }
-   unlock_page_cgroup(pc);
+   memcg = pc->mem_cgroup;
+   pc->flags = 0;
 
/*
 * We trust that only if there is a memcg associated with the page, it
-- 
2.0.0


[patch 09/12] mm: memcontrol: remove ordering between pc->mem_cgroup and PageCgroupUsed

2014-06-16 Thread Johannes Weiner

There is a write barrier between setting pc->mem_cgroup and
PageCgroupUsed, which was added to allow LRU operations to lookup the
memcg LRU list of a page without acquiring the page_cgroup lock.

But ever since 38c5d72f3ebe ("memcg: simplify LRU handling by new
rule"), pages are ensured to be off-LRU while charging, so nobody else
is changing LRU state while pc->mem_cgroup is being written, and there
are no read barriers anymore.

Remove the unnecessary write barrier.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3726f6774860..1cde6e2b33d9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2801,14 +2801,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
}
 
pc->mem_cgroup = memcg;
-   /*
-* We access a page_cgroup asynchronously without lock_page_cgroup().
-* Especially when a page_cgroup is taken from a page, pc->mem_cgroup
-* is accessed after testing USED bit. To make pc->mem_cgroup visible
-* before USED bit, we need memory barrier here.
-* See mem_cgroup_add_lru_list(), etc.
-*/
-   smp_wmb();
SetPageCgroupUsed(pc);
 
if (lrucare) {
@@ -3490,7 +3482,6 @@ void mem_cgroup_split_huge_fixup(struct page *head)
for (i = 1; i < HPAGE_PMD_NR; i++) {
pc = head_pc + i;
pc->mem_cgroup = memcg;
-   smp_wmb();/* see __commit_charge() */
pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
}
__this_cpu_sub(memcg->stat->count[MEM_CGROUP_STAT_RSS_HUGE],
-- 
2.0.0


[patch 07/12] mm: memcontrol: catch root bypass in move precharge

2014-06-16 Thread Johannes Weiner

When mem_cgroup_try_charge() returns -EINTR, it bypassed the charge to
the root memcg.  But move precharging does not catch this and treats
this case as if no charge had happened, thus leaking a charge against
root.  Because of an old optimization, the root memcg's res_counter is
not actually charged right now, but it's still an imbalance and
subsequent patches will charge the root memcg again.

Catch those bypasses to the root memcg and properly cancel them before
giving up the move.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3d9df94896a7..a3b69d4a6f17 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6389,6 +6389,10 @@ static int mem_cgroup_do_precharge(unsigned long count)
mc.precharge += count;
return ret;
}
+   if (ret == -EINTR) {
+   __mem_cgroup_cancel_charge(root_mem_cgroup, count);
+   return ret;
+   }
 
/* Try charges one by one with reclaim */
while (count--) {
@@ -6396,8 +6400,11 @@ static int mem_cgroup_do_precharge(unsigned long count)
/*
 * In case of failure, any residual charges against
 * mc.to will be dropped by mem_cgroup_clear_mc()
-* later on.
+* later on.  However, cancel any charges that are
+* bypassed to root right away or they'll be lost.
 */
+   if (ret == -EINTR)
+   __mem_cgroup_cancel_charge(root_mem_cgroup, 1);
if (ret)
return ret;
mc.precharge++;
-- 
2.0.0


[patch 11/12] mm: memcontrol: rewrite charge API

2014-06-16 Thread Johannes Weiner

The memcg charge API charges pages before they are rmapped - i.e. have
an actual "type" - and so every callsite needs its own set of charge
and uncharge functions to know what type is being operated on.  Worse,
uncharge has to happen from a context that is still type-specific,
rather than at the end of the page's lifetime with exclusive access,
and so requires a lot of synchronization.

Rewrite the charge API to provide a generic set of try_charge(),
commit_charge() and cancel_charge() transaction operations, much like
what's currently done for swap-in:

  mem_cgroup_try_charge() attempts to reserve a charge, reclaiming
  pages from the memcg if necessary.

  mem_cgroup_commit_charge() commits the page to the charge once it
  has a valid page->mapping and PageAnon() reliably tells the type.

  mem_cgroup_cancel_charge() aborts the transaction.

This reduces the charge API and enables subsequent patches to
drastically simplify uncharging.
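
The three-step protocol described above can be sketched as a userspace toy.
All names here (struct toy_memcg, struct toy_page, toy_try_charge() and
friends) are hypothetical, and the "mapping_ok" flag merely stands in for
whatever may fail between reserving the charge and installing the page; the
real functions take different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a memcg and a page. */
struct toy_memcg {
	unsigned long usage;	/* pages reserved or committed */
	unsigned long limit;
};

struct toy_page {
	struct toy_memcg *memcg;	/* set only once the charge is committed */
};

/* Step 1: reserve the charge; nothing is tied to the page yet. */
static int toy_try_charge(struct toy_memcg *cg, unsigned long nr_pages)
{
	if (cg->usage + nr_pages > cg->limit)
		return -ENOMEM;
	cg->usage += nr_pages;
	return 0;
}

/* Step 2a: the page is set up, so bind it to the reserved charge. */
static void toy_commit_charge(struct toy_page *page, struct toy_memcg *cg)
{
	assert(page->memcg == NULL);
	page->memcg = cg;
}

/* Step 2b: setup failed, so give the reservation back. */
static void toy_cancel_charge(struct toy_memcg *cg, unsigned long nr_pages)
{
	assert(cg->usage >= nr_pages);
	cg->usage -= nr_pages;
}

/* A caller using the transaction; @mapping_ok models the setup step. */
static int toy_handle_fault(struct toy_memcg *cg, struct toy_page *page,
			    bool mapping_ok)
{
	int ret = toy_try_charge(cg, 1);

	if (ret)
		return ret;
	if (!mapping_ok) {
		toy_cancel_charge(cg, 1);
		return -EAGAIN;
	}
	toy_commit_charge(page, cg);
	return 0;
}
```

The point of the split is that the caller no longer has to declare the page
type up front: the reservation is type-agnostic, and the page is only bound
to the memcg once its state is known.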

As pages need to be committed after rmap is established but before
they are added to the LRU, page_add_new_anon_rmap() must stop doing
LRU additions again.  Revive lru_cache_add_active_or_unevictable().

Signed-off-by: Johannes Weiner 
---
 Documentation/cgroups/memcg_test.txt |  32 +--
 include/linux/memcontrol.h   |  53 ++---
 include/linux/swap.h |   3 +
 kernel/events/uprobes.c  |   1 +
 mm/filemap.c |   9 +-
 mm/huge_memory.c |  57 +++--
 mm/memcontrol.c  | 423 ++-
 mm/memory.c  |  41 ++--
 mm/rmap.c|  19 --
 mm/shmem.c   |  24 +-
 mm/swap.c|  34 +++
 mm/swapfile.c|  14 +-
 12 files changed, 320 insertions(+), 390 deletions(-)

diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index 80ac454704b8..bcf750d3cecd 100644
--- a/Documentation/cgroups/memcg_test.txt
+++ b/Documentation/cgroups/memcg_test.txt
@@ -24,24 +24,7 @@ Please note that implementation details can be changed.
 
a page/swp_entry may be charged (usage += PAGE_SIZE) at
 
-   mem_cgroup_charge_anon()
- Called at new page fault and Copy-On-Write.
-
-   mem_cgroup_try_charge_swapin()
- Called at do_swap_page() (page fault on swap entry) and swapoff.
- Followed by charge-commit-cancel protocol. (With swap accounting)
- At commit, a charge recorded in swap_cgroup is removed.
-
-   mem_cgroup_charge_file()
- Called at add_to_page_cache()
-
-   mem_cgroup_cache_charge_swapin()
- Called at shmem's swapin.
-
-   mem_cgroup_prepare_migration()
- Called before migration. "extra" charge is done and followed by
- charge-commit-cancel protocol.
- At commit, charge against oldpage or newpage will be committed.
+   mem_cgroup_try_charge()
 
 2. Uncharge
   a page/swp_entry may be uncharged (usage -= PAGE_SIZE) by
@@ -69,19 +52,14 @@ Please note that implementation details can be changed.
to new page is committed. At failure, charge to old page is committed.
 
 3. charge-commit-cancel
-   In some case, we can't know this "charge" is valid or not at charging
-   (because of races).
-   To handle such case, there are charge-commit-cancel functions.
-   mem_cgroup_try_charge_XXX
-   mem_cgroup_commit_charge_XXX
-   mem_cgroup_cancel_charge_XXX
-   these are used in swap-in and migration.
+   Memcg pages are charged in two steps:
+   mem_cgroup_try_charge()
+   mem_cgroup_commit_charge() or mem_cgroup_cancel_charge()
 
At try_charge(), there are no flags to say "this page is charged".
at this point, usage += PAGE_SIZE.
 
-   At commit(), the function checks the page should be charged or not
-   and set flags or avoid charging.(usage -= PAGE_SIZE)
+   At commit(), the page is associated with the memcg.
 
At cancel(), simply usage -= PAGE_SIZE.
 
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index eb65d29516ca..1a9a096858e0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -54,28 +54,11 @@ struct mem_cgroup_reclaim_cookie {
 };
 
 #ifdef CONFIG_MEMCG
-/*
- * All "charge" functions with gfp_mask should use GFP_KERNEL or
- * (gfp_mask & GFP_RECLAIM_MASK). In current implementatin, memcg doesn't
- * alloc memory but reclaims memory from all available zones. So, "where I want
- * memory from" bits of gfp_mask has no meaning. So any bits of that field is
- * available but adding a rule is better. charge functions' gfp_mask should
- * be set to GFP_KERNEL or gfp_mask & GFP_RECLAIM_MASK for avoiding ambiguous
- * codes.
- * (Of course, if memcg does memory allocation in future, GFP_KERNEL is sane.)
- */
-
-extern int mem_cgroup_charge_anon(struct page *page, struct

[patch 08/12] mm: memcontrol: use root_mem_cgroup res_counter

2014-06-16 Thread Johannes Weiner

Due to an old optimization to keep expensive res_counter changes at a
minimum, the root_mem_cgroup res_counter is never charged; there is no
limit at that level anyway, and any statistics can be generated on
demand by summing up the counters of all other cgroups.

However, with per-cpu charge caches, res_counter operations do not
even show up in profiles anymore, so this optimization is no longer
necessary.

Remove it to simplify the code.

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 150 
 1 file changed, 43 insertions(+), 107 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a3b69d4a6f17..3726f6774860 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2572,9 +2572,8 @@ static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
unsigned long nr_reclaimed;
unsigned long flags = 0;
unsigned long long size;
+   int ret = 0;
 
-   if (mem_cgroup_is_root(memcg))
-   goto done;
 retry:
if (consume_stock(memcg, nr_pages))
goto done;
@@ -2655,13 +2654,15 @@ nomem:
if (!(gfp_mask & __GFP_NOFAIL))
return -ENOMEM;
 bypass:
-   return -EINTR;
+   memcg = root_mem_cgroup;
+   ret = -EINTR;
+   goto retry;
 
 done_restock:
if (batch > nr_pages)
refill_stock(memcg, batch - nr_pages);
 done:
-   return 0;
+   return ret;
 }
 
 /**
@@ -2701,13 +2702,11 @@ static struct mem_cgroup *mem_cgroup_try_charge_mm(struct mm_struct *mm,
 static void __mem_cgroup_cancel_charge(struct mem_cgroup *memcg,
   unsigned int nr_pages)
 {
-   if (!mem_cgroup_is_root(memcg)) {
-   unsigned long bytes = nr_pages * PAGE_SIZE;
+   unsigned long bytes = nr_pages * PAGE_SIZE;
 
-   res_counter_uncharge(&memcg->res, bytes);
-   if (do_swap_account)
-   res_counter_uncharge(&memcg->memsw, bytes);
-   }
+   res_counter_uncharge(&memcg->res, bytes);
+   if (do_swap_account)
+   res_counter_uncharge(&memcg->memsw, bytes);
 }
 
 /*
@@ -2719,9 +2718,6 @@ static void __mem_cgroup_cancel_local_charge(struct mem_cgroup *memcg,
 {
unsigned long bytes = nr_pages * PAGE_SIZE;
 
-   if (mem_cgroup_is_root(memcg))
-   return;
-
res_counter_uncharge_until(&memcg->res, memcg->res.parent, bytes);
if (do_swap_account)
res_counter_uncharge_until(&memcg->memsw,
@@ -3956,7 +3952,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
 * replacement page, so leave it alone when phasing out the
 * page that is unused after the migration.
 */
-   if (!end_migration && !mem_cgroup_is_root(memcg))
+   if (!end_migration)
mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
 
return memcg;
@@ -4089,8 +4085,7 @@ void mem_cgroup_uncharge_swap(swp_entry_t ent)
 * We uncharge this because swap is freed.  This memcg can
 * be obsolete one. We avoid calling css_tryget_online().
 */
-   if (!mem_cgroup_is_root(memcg))
-   res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
+   res_counter_uncharge(&memcg->memsw, PAGE_SIZE);
mem_cgroup_swap_statistics(memcg, false);
css_put(&memcg->css);
}
@@ -4780,78 +4775,24 @@ out:
return retval;
 }
 
-
-static unsigned long mem_cgroup_recursive_stat(struct mem_cgroup *memcg,
-  enum mem_cgroup_stat_index idx)
-{
-   struct mem_cgroup *iter;
-   long val = 0;
-
-   /* Per-cpu values can be negative, use a signed accumulator */
-   for_each_mem_cgroup_tree(iter, memcg)
-   val += mem_cgroup_read_stat(iter, idx);
-
-   if (val < 0) /* race ? */
-   val = 0;
-   return val;
-}
-
-static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
-{
-   u64 val;
-
-   if (!mem_cgroup_is_root(memcg)) {
-   if (!swap)
-   return res_counter_read_u64(&memcg->res, RES_USAGE);
-   else
-   return res_counter_read_u64(&memcg->memsw, RES_USAGE);
-   }
-
-   /*
-* Transparent hugepages are still accounted for in MEM_CGROUP_STAT_RSS
-* as well as in MEM_CGROUP_STAT_RSS_HUGE.
-*/
-   val = mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_CACHE);
-   val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_RSS);
-
-   if (swap)
-   val += mem_cgroup_recursive_stat(memcg, MEM_CGROUP_STAT_SWAP);
-
-   return val << PAGE_SHIFT;
-}
-
 static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
-  struct cftype *cft)
+  struct cftype *cft)
 {
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
-   u64 val;
-

[PATCH] audit: reduce scope of audit_log_fcaps

2014-06-16 Thread Richard Guy Briggs

audit_log_fcaps() isn't used outside kernel/audit.c.  Reduce its scope.

Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c |2 +-
 kernel/audit.h |1 -
 2 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index bdd0172..3225a5d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1637,7 +1637,7 @@ void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t *cap)
}
 }
 
-void audit_log_fcaps(struct audit_buffer *ab, struct audit_names *name)
+static void audit_log_fcaps(struct audit_buffer *ab, struct audit_names *name)
 {
kernel_cap_t *perm = &name->fcap.permitted;
kernel_cap_t *inh = &name->fcap.inheritable;
diff --git a/kernel/audit.h b/kernel/audit.h
index 7bb6573..3cdffad 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -222,7 +222,6 @@ extern void audit_copy_inode(struct audit_names *name,
 const struct inode *inode);
 extern void audit_log_cap(struct audit_buffer *ab, char *prefix,
  kernel_cap_t *cap);
-extern void audit_log_fcaps(struct audit_buffer *ab, struct audit_names *name);
 extern void audit_log_name(struct audit_context *context,
   struct audit_names *n, struct path *path,
   int record_num, int *call_panic);
-- 
1.7.1


[patch 05/12] mm: memcontrol: reclaim at least once for __GFP_NORETRY

2014-06-16 Thread Johannes Weiner

Currently, __GFP_NORETRY tries charging once and gives up before even
trying to reclaim.  Bring the behavior on par with the page allocator
and reclaim at least once before giving up.
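
The effect of moving the __GFP_NORETRY check can be modeled by counting
reclaim passes for a charge that never succeeds.  This is a simplified
sketch, not the kernel logic: the TOY_GFP_* flags and toy_reclaim_passes()
are hypothetical, and the "check_noretry_first" parameter selects between
the old ordering (check before reclaim) and the new one (check after).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for __GFP_WAIT and __GFP_NORETRY. */
#define TOY_GFP_WAIT	0x1u
#define TOY_GFP_NORETRY	0x2u

/*
 * Count the reclaim passes made on behalf of a charge that keeps failing.
 * @check_noretry_first: true models the old ordering, false the new one.
 * @max_retries: retry budget for charges without TOY_GFP_NORETRY.
 */
static unsigned int toy_reclaim_passes(unsigned int gfp,
				       bool check_noretry_first,
				       unsigned int max_retries)
{
	unsigned int retries = max_retries;
	unsigned int passes = 0;

	/* Callers that cannot wait never enter reclaim. */
	if (!(gfp & TOY_GFP_WAIT))
		return 0;

	for (;;) {
		if (check_noretry_first && (gfp & TOY_GFP_NORETRY))
			break;

		passes++;	/* one reclaim pass that does not free enough */

		if (gfp & TOY_GFP_NORETRY)
			break;
		if (!retries--)
			break;
	}

	assert(passes <= max_retries + 1);
	return passes;
}
```

With the old ordering a NORETRY charge gives up after zero reclaim passes;
with the new ordering it gets exactly one, while other waiting charges keep
their full retry budget.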

Signed-off-by: Johannes Weiner 
Acked-by: Michal Hocko 
---
 mm/memcontrol.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 52550bbff1ef..9c646b9b56f4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2613,13 +2613,13 @@ retry:
if (!(gfp_mask & __GFP_WAIT))
goto nomem;
 
-   if (gfp_mask & __GFP_NORETRY)
-   goto nomem;
-
nr_reclaimed = mem_cgroup_reclaim(mem_over_limit, gfp_mask, flags);
 
if (mem_cgroup_margin(mem_over_limit) >= batch)
goto retry;
+
+   if (gfp_mask & __GFP_NORETRY)
+   goto nomem;
/*
 * Even though the limit is exceeded at this point, reclaim
 * may have been able to free some pages.  Retry the charge
-- 
2.0.0


[PATCH] audit: reduce scope of audit_net_id

2014-06-16 Thread Richard Guy Briggs

audit_net_id isn't used outside kernel/audit.c.  Reduce its scope.

Signed-off-by: Richard Guy Briggs 
---
 kernel/audit.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 59c0bbe..bdd0172 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -126,7 +126,7 @@ static atomic_t    audit_lost = ATOMIC_INIT(0);
 
 /* The netlink socket. */
 static struct sock *audit_sock;
-int audit_net_id;
+static int audit_net_id;
 
 /* Hash for inode-based rules */
 struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
-- 
1.7.1


[PATCH] audit: fix dangling keywords in integrity ima message output

2014-06-16 Thread Richard Guy Briggs

Replace spaces in op keyword labels in log output since userspace audit tools
can't parse orphaned keywords.
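
For illustration, assume a consumer that splits a record on spaces and
expects every token to look like key=value.  That record layout and the
helper name count_orphan_tokens() are assumptions made for this sketch,
not the real audit record format or any audit userspace API.

```c
#include <assert.h>
#include <string.h>

/*
 * Count space-separated tokens that carry no '=' sign.  In the assumed
 * key=value layout such tokens are "orphaned keywords".
 */
static int count_orphan_tokens(const char *record)
{
	int orphans = 0;

	assert(record != NULL);
	while (*record) {
		size_t len;

		record += strspn(record, " ");	/* skip separators */
		if (!*record)
			break;
		len = strcspn(record, " ");	/* length of this token */
		if (memchr(record, '=', len) == NULL)
			orphans++;
		record += len;
	}
	return orphans;
}
```

Under that assumption "cause=already exists" leaves "exists" as an
orphan, while "cause=already-exists" stays a single token.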

Reported-by: Steve Grubb 
Signed-off-by: Richard Guy Briggs 
---
 security/integrity/ima/ima_appraise.c |2 +-
 security/integrity/ima/ima_policy.c   |6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index 734e946..61c95af 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -214,7 +214,7 @@ int ima_appraise_measurement(int func, struct integrity_iint_cache *iint,
hash_start = 1;
case IMA_XATTR_DIGEST:
if (iint->flags & IMA_DIGSIG_REQUIRED) {
-   cause = "IMA signature required";
+   cause = "IMA-signature-required";
status = INTEGRITY_FAIL;
break;
}
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index a9c3d3c..dbdc528 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -330,7 +330,7 @@ void __init ima_init_policy(void)
 void ima_update_policy(void)
 {
const char *op = "policy_update";
-   const char *cause = "already exists";
+   const char *cause = "already-exists";
int result = 1;
int audit_info = 0;
 
@@ -654,7 +654,7 @@ ssize_t ima_parse_add_rule(char *rule)
/* Prevent installed policy from changing */
if (ima_rules != &ima_default_rules) {
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL,
-   NULL, op, "already exists",
+   NULL, op, "already-exists",
-EACCES, audit_info);
return -EACCES;
}
@@ -680,7 +680,7 @@ ssize_t ima_parse_add_rule(char *rule)
if (result) {
kfree(entry);
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL,
-   NULL, op, "invalid policy", result,
+   NULL, op, "invalid-policy", result,
audit_info);
return result;
}
-- 
1.7.1


[PATCH v2] sched: Fast idling of CPU when system is partially loaded

2014-06-16 Thread Tim Chen

Thanks to the review from Jason and Peter.  I've moved the check
of whether load balance is required into fair.c's idle_balance.

When a system is lightly loaded (i.e. no more than 1 job per cpu),
attempting to pull a job to a cpu before putting it to idle is unnecessary
and can be skipped.  This patch adds an indicator so the scheduler can know
when no CPU in the system has more than 1 active job, and can then skip
the needless job pulls.

On a 4 socket machine with a request/response kind of workload from
clients, we saw about 0.13 msec delay when we go through a full load
balance to try to pull a job from all the other cpus.  While 0.1 msec was
spent on processing the request and generating a response, the 0.13 msec
load balance overhead was actually more than the actual work being done.
This overhead can be skipped much of the time for lightly loaded systems.

With this patch, we tested with a netperf request/response workload that
has the server busy with half the cpus in a 4 socket system.  We found
the patch eliminated 75% of the load balance attempts before idling a cpu.

The overhead of setting/clearing the indicator is low as we already gather
the necessary info while we call add_nr_running and update_sd_lb_stats.
We switch to full load balance immediately if any cpu gets more than
one job on its run queue in add_nr_running.  We clear the indicator,
to avoid load balance, when we detect that no cpu has more than one job
while scanning the run queues in update_sg_lb_stats.  We are aggressive
in maintaining the load balance and opportunistic in skipping the load
balance.
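
The set/clear asymmetry described above can be modeled in a few lines of
user-space C.  This is a sketch only: the model_* types and functions
are invented for illustration and merely mirror the roles of
add_nr_running, the root-domain scan and idle_balance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_rq {
	unsigned int nr_running;
};

struct model_root_domain {
	bool overload;		/* some cpu has more than one job */
};

/* Enqueue side: raise the indicator as soon as any rq exceeds one job. */
static void model_add_nr_running(struct model_root_domain *rd,
				 struct model_rq *rq, unsigned int count)
{
	assert(rd != NULL && rq != NULL);
	rq->nr_running += count;
	if (rq->nr_running > 1)
		rd->overload = true;
}

/* Root-domain scan: the only place where the indicator is lowered. */
static void model_scan_root_domain(struct model_root_domain *rd,
				   const struct model_rq *rqs, size_t nr)
{
	bool overload = false;
	size_t i;

	for (i = 0; i < nr; i++)
		if (rqs[i].nr_running > 1)
			overload = true;

	/* Write only on change, as the patch does. */
	if (rd->overload != overload)
		rd->overload = overload;
}

/* Idle side: pulling jobs is only worth trying when overloaded. */
static bool model_should_idle_balance(const struct model_root_domain *rd)
{
	return rd->overload;
}
```

Setting is immediate on the enqueue path, while clearing waits for the
next scan, so a stale "overloaded" state only costs an extra balance
attempt.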

Signed-off-by: Tim Chen 
---
 kernel/sched/fair.c  | 24 +---
 kernel/sched/sched.h | 10 --
 2 files changed, 29 insertions(+), 5 deletions(-)

Change log:
v2. 
1. Move the skip load balance code to idle_balance.
2. Use env->dst_rq->rd to get the root domain directly.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9855e87..95bb541 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5863,7 +5863,8 @@ static inline int sg_capacity(struct lb_env *env, struct sched_group *group)
  */
 static inline void update_sg_lb_stats(struct lb_env *env,
struct sched_group *group, int load_idx,
-   int local_group, struct sg_lb_stats *sgs)
+   int local_group, struct sg_lb_stats *sgs,
+   bool *overload)
 {
unsigned long load;
int i;
@@ -5881,6 +5882,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
sgs->group_load += load;
sgs->sum_nr_running += rq->nr_running;
+   if (overload && rq->nr_running > 1)
+   *overload = true;
 #ifdef CONFIG_NUMA_BALANCING
sgs->nr_numa_running += rq->nr_numa_running;
sgs->nr_preferred_running += rq->nr_preferred_running;
@@ -5991,6 +5994,7 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
struct sched_group *sg = env->sd->groups;
struct sg_lb_stats tmp_sgs;
int load_idx, prefer_sibling = 0;
+   bool overload = false;
 
if (child && child->flags & SD_PREFER_SIBLING)
prefer_sibling = 1;
@@ -6011,7 +6015,13 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
update_group_power(env->sd, env->dst_cpu);
}
 
-   update_sg_lb_stats(env, sg, load_idx, local_group, sgs);
+   if (env->sd->parent)
+   update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
+   NULL);
+   else
+   /* gather overload info if we are at root domain */
+   update_sg_lb_stats(env, sg, load_idx, local_group, sgs,
+   &overload);
 
if (local_group)
goto next_group;
@@ -6045,6 +6055,13 @@ next_group:
 
if (env->sd->flags & SD_NUMA)
env->fbq_type = fbq_classify_group(&sds->busiest_stat);
+
+   if (!env->sd->parent) {
+   /* update overload indicator if we are at root domain */
+   if (env->dst_rq->rd->overload != overload)
+   env->dst_rq->rd->overload = overload;
+   }
+
 }
 
 /**
@@ -6762,7 +6779,8 @@ static int idle_balance(struct rq *this_rq)
 */
this_rq->idle_stamp = rq_clock(this_rq);
 
-   if (this_rq->avg_idle < sysctl_sched_migration_cost) {
+   if (this_rq->avg_idle < sysctl_sched_migration_cost ||
+   !this_rq->rd->overload) {
rcu_read_lock();
sd = rcu_dereference_check_sched_domain(this_rq->sd);
if (sd)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e47679b..396bce0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -477,6 +477,9 @@ struct

[PATCH] Documentation: corrected sub-chapter number

2014-06-16 Thread Richard Guy Briggs

The index is correct, but there are two sections tagged 3.7.  Fix.

Signed-off-by: Richard Guy Briggs 
---
 Documentation/filesystems/proc.txt |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index f00bee1..68e3a76 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1648,7 +1648,7 @@ pids, so one need to either stop or freeze processes being inspected
 if precise results are needed.
 
 
-3.7	/proc/<pid>/fdinfo/<fd> - Information about opened file
+3.8	/proc/<pid>/fdinfo/<fd> - Information about opened file
 ---
 This file provides information associated with an opened file. The regular
 files have at least two fields -- 'pos' and 'flags'. The 'pos' represents
-- 
1.7.1


[PATCH] Check for Null return from logfs_readpage_nolock in btree_write_block

2014-06-16 Thread Nicholas Krause

Signed-off-by: Nicholas Krause 
---
 fs/logfs/readwrite.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/logfs/readwrite.c b/fs/logfs/readwrite.c
index 4814031..adb9233 100644
--- a/fs/logfs/readwrite.c
+++ b/fs/logfs/readwrite.c
@@ -2210,6 +2210,8 @@ void btree_write_block(struct logfs_block *block)
page = logfs_get_write_page(inode, block->bix, block->level);
 
err = logfs_readpage_nolock(page);
+   if (!err)
+   return -ENOMEM;
BUG_ON(err);
BUG_ON(!PagePrivate(page));
BUG_ON(logfs_block(page) != block);
-- 
1.9.1


[PATCH v5 1/5] devicetree: bindings: document Broadcom CPU enable method

2014-06-16 Thread Alex Elder

Broadcom mobile SoCs use a ROM-implemented holding pen for
controlled boot of secondary cores.  A special register is
used to communicate to the ROM that a secondary core should
start executing kernel code.  This enable method is currently
used for members of the bcm281xx and bcm21664 SoC families.

The use of an enable method also allows the SMP operation vector to
be assigned as a result of device tree content for these SoCs.

Signed-off-by: Alex Elder 
---
 Documentation/devicetree/bindings/arm/cpus.txt | 12 
 1 file changed, 12 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 1fe72a0..cdca080 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -184,6 +184,7 @@ nodes to be present and contain the properties described below.
  can be one of:
"allwinner,sun6i-a31"
"arm,psci"
+   "brcm,bcm11351-cpu-method"
"marvell,armada-375-smp"
"marvell,armada-380-smp"
"marvell,armada-xp-smp"
@@ -215,6 +216,17 @@ nodes to be present and contain the properties described below.
Value type: 
Definition: Specifies the ACC[2] node associated with this CPU.
 
+   - secondary-boot-reg
+   Usage:
+   Required for systems that have an "enable-method"
+   property value of "brcm,bcm11351-cpu-method".
+   Value type: 
+   Definition:
+   Specifies the physical address of the register used to
+   request the ROM holding pen code release a secondary
+   CPU.  The value written to the register is formed by
+   encoding the target CPU id into the low bits of the
+   physical start address it should jump to.
 
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
-- 
1.9.1


[PATCH v5 4/5] ARM: dts: enable SMP support for bcm28155

2014-06-16 Thread Alex Elder

Define nodes representing the two Cortex A9 CPUs in a bcm28155 SoC.

Signed-off-by: Ray Jui 
Signed-off-by: Alex Elder 
---
 arch/arm/boot/dts/bcm11351.dtsi | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/bcm11351.dtsi b/arch/arm/boot/dts/bcm11351.dtsi
index 6b05ae6..2ddaa51 100644
--- a/arch/arm/boot/dts/bcm11351.dtsi
+++ b/arch/arm/boot/dts/bcm11351.dtsi
@@ -27,6 +27,25 @@
bootargs = "console=ttyS0,115200n8";
};
 
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   enable-method = "brcm,bcm11351-cpu-method";
+   secondary-boot-reg = <0x3500417c>;
+
+   cpu0: cpu@0 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a9";
+   reg = <0>;
+   };
+
+   cpu1: cpu@1 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a9";
+   reg = <1>;
+   };
+   };
+
gic: interrupt-controller@3ff00100 {
compatible = "arm,cortex-a9-gic";
#interrupt-cells = <3>;
-- 
1.9.1


[PATCH v5 5/5] ARM: dts: enable SMP support for bcm21664

2014-06-16 Thread Alex Elder

Define nodes representing the two Cortex A9 CPUs in a bcm21664 SoC.

Signed-off-by: Alex Elder 
---
 arch/arm/boot/dts/bcm21664.dtsi | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm/boot/dts/bcm21664.dtsi b/arch/arm/boot/dts/bcm21664.dtsi
index 8b36682..2016b72 100644
--- a/arch/arm/boot/dts/bcm21664.dtsi
+++ b/arch/arm/boot/dts/bcm21664.dtsi
@@ -27,6 +27,25 @@
bootargs = "console=ttyS0,115200n8";
};
 
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   enable-method = "brcm,bcm11351-cpu-method";
+   secondary-boot-reg = <0x35004178>;
+
+   cpu0: cpu@0 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a9";
+   reg = <0>;
+   };
+
+   cpu1: cpu@1 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a9";
+   reg = <1>;
+   };
+   };
+
gic: interrupt-controller@3ff00100 {
compatible = "arm,cortex-a9-gic";
#interrupt-cells = <3>;
-- 
1.9.1


[PATCH v5 3/5] ARM: configs: enable SMP in bcm_defconfig

2014-06-16 Thread Alex Elder

Also explicitly set CONFIG_NR_CPUS to 2, limiting it to the most we
currently need.

Signed-off-by: Ray Jui 
Signed-off-by: Alex Elder 
---
 arch/arm/configs/bcm_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/bcm_defconfig b/arch/arm/configs/bcm_defconfig
index dcfc559..0339c5e 100644
--- a/arch/arm/configs/bcm_defconfig
+++ b/arch/arm/configs/bcm_defconfig
@@ -27,6 +27,7 @@ CONFIG_PARTITION_ADVANCED=y
 CONFIG_ARCH_BCM=y
 CONFIG_ARCH_BCM_MOBILE=y
 CONFIG_ARM_THUMBEE=y
+CONFIG_SMP=y
 CONFIG_PREEMPT=y
 CONFIG_AEABI=y
 # CONFIG_COMPACTION is not set
-- 
1.9.1


[PATCH v5 2/5] ARM: add SMP support for Broadcom mobile SoCs

2014-06-16 Thread Alex Elder

This patch adds SMP support for BCM281XX and BCM21664 family SoCs.

This feature is controlled with a distinct config option such that
an SMP-enabled multi-v7 binary can be configured to run these SoCs
in uniprocessor mode.  Since this SMP functionality is used for
multiple Broadcom mobile chip families the config option is called
ARCH_BCM_MOBILE_SMP (for lack of a better name).

On SoCs of this type, the secondary core is not held in reset on
power-on.  Instead it loops in a ROM-based holding pen.  To release
it, one must write into a special register a jump address whose
low-order bits have been replaced with a secondary core's id, then
trigger an event with SEV.  On receipt of an event, the ROM code
will examine the register's contents, and if the low-order bits
match its cpu id, it will clear them and write the value back to the
register just prior to jumping to the address specified.

The location of the special register is defined in the device tree
using a "secondary-boot-reg" property in a node whose "enable-method"
matches.
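
The handshake can be sketched in plain C as follows.  This is a model
under assumptions, not the code in kona_smp.c: the register is an
ordinary variable instead of an ioremapped address, the SEV step is left
out, and the model_* helpers are hypothetical names.  Only
BOOT_ADDR_CPUID_MASK is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BOOT_ADDR_CPUID_MASK	0x3u	/* same value as in the patch */

/* Kernel side: fold the target cpu id into the low bits of the address. */
static uint32_t model_encode_boot_value(uint32_t entry, uint32_t cpu_id)
{
	assert((cpu_id & ~BOOT_ADDR_CPUID_MASK) == 0);
	return (entry & ~BOOT_ADDR_CPUID_MASK) | cpu_id;
}

/*
 * ROM side, as described above: on an event, compare the low bits with
 * our own id.  On a match, clear them, write the value back and report
 * the address to jump to.
 */
static bool model_rom_check(uint32_t *boot_reg, uint32_t my_cpu_id,
			    uint32_t *jump_addr)
{
	uint32_t val = *boot_reg;

	if ((val & BOOT_ADDR_CPUID_MASK) != my_cpu_id)
		return false;	/* not addressed to this core; keep waiting */

	val &= ~BOOT_ADDR_CPUID_MASK;
	*boot_reg = val;
	*jump_addr = val;
	return true;
}
```

With a 0x3 mask the entry address has to be at least 4-byte aligned,
since its two low bits are replaced by the core id.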

Derived from code originally provided by Ray Jui 

Signed-off-by: Alex Elder 
---
 arch/arm/mach-bcm/Kconfig|  18 +++-
 arch/arm/mach-bcm/Makefile   |   3 +
 arch/arm/mach-bcm/kona_smp.c | 202 +++
 3 files changed, 220 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/mach-bcm/kona_smp.c

diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
index 9bc6db1..ac6d139 100644
--- a/arch/arm/mach-bcm/Kconfig
+++ b/arch/arm/mach-bcm/Kconfig
@@ -10,7 +10,6 @@ config ARCH_BCM_MOBILE
bool "Broadcom Mobile SoC Support" if ARCH_MULTI_V7
select ARCH_REQUIRE_GPIOLIB
select ARM_ERRATA_754322
-   select ARM_ERRATA_764369 if SMP
select ARM_ERRATA_775420
select ARM_GIC
select GPIO_BCM_KONA
@@ -27,16 +26,18 @@ menu "Broadcom Mobile SoC Selection"
 config ARCH_BCM_281XX
bool "Broadcom BCM281XX SoC family"
default y
+   select HAVE_SMP
help
- Enable support for the the BCM281XX family, which includes
+ Enable support for the BCM281XX family, which includes
  BCM11130, BCM11140, BCM11351, BCM28145 and BCM28155
  variants.
 
 config ARCH_BCM_21664
bool "Broadcom BCM21664 SoC family"
default y
+   select HAVE_SMP
help
- Enable support for the the BCM21664 family, which includes
+ Enable support for the BCM21664 family, which includes
  BCM21663 and BCM21664 variants.
 
 config ARCH_BCM_MOBILE_L2_CACHE
@@ -50,6 +51,17 @@ config ARCH_BCM_MOBILE_SMC
bool
depends on ARCH_BCM_281XX || ARCH_BCM_21664
 
+config ARCH_BCM_MOBILE_SMP
+   bool "Broadcom mobile SoC SMP support"
+   depends on (ARCH_BCM_281XX || ARCH_BCM_21664) && SMP
+   default y
+   select HAVE_ARM_SCU
+   select ARM_ERRATA_764369
+   help
+ SMP support for the BCM281XX and BCM21664 SoC families.
+ Provided as an option so SMP support for SoCs of this type
+ can be disabled for an SMP-enabled kernel.
+
 endmenu
 
 endif
diff --git a/arch/arm/mach-bcm/Makefile b/arch/arm/mach-bcm/Makefile
index 7312921..2eaafc8 100644
--- a/arch/arm/mach-bcm/Makefile
+++ b/arch/arm/mach-bcm/Makefile
@@ -16,6 +16,9 @@ obj-$(CONFIG_ARCH_BCM_281XX)  += board_bcm281xx.o
 # BCM21664
 obj-$(CONFIG_ARCH_BCM_21664)   += board_bcm21664.o
 
+# BCM281XX and BCM21664 SMP support
+obj-$(CONFIG_ARCH_BCM_MOBILE_SMP) += kona_smp.o
+
 # BCM281XX and BCM21664 L2 cache control
 obj-$(CONFIG_ARCH_BCM_MOBILE_L2_CACHE) += kona_l2_cache.o
 
diff --git a/arch/arm/mach-bcm/kona_smp.c b/arch/arm/mach-bcm/kona_smp.c
new file mode 100644
index 000..66a0465
--- /dev/null
+++ b/arch/arm/mach-bcm/kona_smp.c
@@ -0,0 +1,202 @@
+/*
+ * Copyright (C) 2014 Broadcom Corporation
+ * Copyright 2014 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+/* Size of mapped Cortex A9 SCU address space */
+#define CORTEX_A9_SCU_SIZE 0x58
+
+#define SECONDARY_TIMEOUT_NS   NSEC_PER_MSEC   /* 1 msec (in nanoseconds) */
+#define BOOT_ADDR_CPUID_MASK   0x3
+
+/* Name of device node property defining secondary boot register location */
+#define OF_SECONDARY_BOOT  "secondary-boot-reg"
+
+/* I/O address of register used to coordinate secondary core startup */
+static u32 secondary_boot;
+
+/*
+ * Enable the Cortex A9 Snoop Control Unit
+ *
+ * By the time this is called we already know there are

[PATCH v5 0/5] ARM: SMP: support Broadcom mobile SoCs

2014-06-16 Thread Alex Elder

This series adds SMP support for two Broadcom mobile SoC families.
It uses CPU_METHOD_OF_DECLARE() so that SMP operations are assigned
using device tree rather than adding it to a machine definition in a
board file.

The enable method starts a secondary core by writing to a register
monitored by CPUs spinning in a ROM-based holding pen loop.  The
address of this register is recorded as a property in the "cpus"
node of the device tree.

-Alex

Notes:
- I would prefer to document the binding in a separate file in
  the way suggested here:
https://lkml.org/lkml/2014/5/20/559
  But for now I'm keeping this independent of that.
- This series is based on v3.16-rc1, plus two recently-posted patches:

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/264012.html

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-June/264024.html
- This series is available here:
http://git.linaro.org/landing-teams/working/broadcom/kernel.git
Branch review/bcm-smp-v5

History
v5: - No real change; rebased onto v3.16-rc1.
v4: - Renamed "platsmp.c" to be "kona_smp.c".
- Rebased onto v3.15-rc5
v3: - Dropped definition and use of CPU_METHOD_OF_DECLARE_SETUP()
- Added documentation for "enable-method"
- Rebased onto v3.15-rc4
v2: - Fixed a Makefile error (:= should have been +=)
- No longer set CONFIG_NR_CPUS in bcm_defconfig
- Rebased onto v3.15-rc1

Alex Elder (5):
  devicetree: bindings: document Broadcom CPU enable method
  ARM: add SMP support for Broadcom mobile SoCs
  ARM: configs: enable SMP in bcm_defconfig
  ARM: dts: enable SMP support for bcm28155
  ARM: dts: enable SMP support for bcm21664

 Documentation/devicetree/bindings/arm/cpus.txt |  12 ++
 arch/arm/boot/dts/bcm11351.dtsi|  19 +++
 arch/arm/boot/dts/bcm21664.dtsi|  19 +++
 arch/arm/configs/bcm_defconfig |   1 +
 arch/arm/mach-bcm/Kconfig  |  18 ++-
 arch/arm/mach-bcm/Makefile |   3 +
 arch/arm/mach-bcm/kona_smp.c   | 202 +
 7 files changed, 271 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/mach-bcm/kona_smp.c

-- 
1.9.1


[PATCH] mm, thp: move invariant bug check out of loop in __split_huge_page_map

2014-06-16 Thread Waiman Long

In the __split_huge_page_map() function, the check for
page_mapcount(page) is invariant within the for loop. Because the
macro is implemented using atomic_read(), the redundant check cannot
be optimized away by the compiler, leading to unnecessary reads of
the page structure.

This patch moves the invariant bug check out of the loop so that it
will be done only once.
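
The effect can be shown with a user-space model that uses C11 atomics in
place of atomic_read().  Everything here is an assumption made for the
sketch: the model_* names are invented and 512 subpages corresponds to
HPAGE_PMD_NR only for 4K pages with 2M huge pages.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_SUBPAGES 512	/* assumed subpage count */

struct model_page {
	atomic_int mapcount;
};

/* Before: the invariant check is repeated on every iteration. */
static int model_split_before(struct model_page *page, bool writable)
{
	int checks = 0;

	for (int i = 0; i < MODEL_NR_SUBPAGES; i++) {
		if (writable) {
			/* The atomic load is redone on every pass. */
			assert(atomic_load(&page->mapcount) == 1);
			checks++;
		}
		/* per-subpage pte setup would go here */
	}
	return checks;
}

/* After: the same check is done once, ahead of the loop. */
static int model_split_after(struct model_page *page, bool writable)
{
	int checks = 0;

	if (writable) {
		assert(atomic_load(&page->mapcount) == 1);
		checks++;
	}
	for (int i = 0; i < MODEL_NR_SUBPAGES; i++) {
		/* per-subpage pte setup would go here */
	}
	return checks;
}
```

The return value counts how many times the check ran, which makes the
difference between the two variants easy to observe.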

Signed-off-by: Waiman Long 
---
 mm/huge_memory.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b4b1feb..b8bb16c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1744,6 +1744,8 @@ static int __split_huge_page_map(struct page *page,
if (pmd) {
pgtable = pgtable_trans_huge_withdraw(mm, pmd);
pmd_populate(mm, &_pmd, pgtable);
+   if (pmd_write(*pmd))
+   BUG_ON(page_mapcount(page) != 1);
 
haddr = address;
for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
@@ -1753,8 +1755,6 @@ static int __split_huge_page_map(struct page *page,
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
if (!pmd_write(*pmd))
entry = pte_wrprotect(entry);
-   else
-   BUG_ON(page_mapcount(page) != 1);
if (!pmd_young(*pmd))
entry = pte_mkold(entry);
if (pmd_numa(*pmd))
-- 
1.7.1


Re: [PATCH] mm: Move __vma_address() to internal.h to be inlined in huge_memory.c

2014-06-16 Thread Waiman Long


On 06/12/2014 05:45 PM, David Rientjes wrote:

On Thu, 12 Jun 2014, Waiman Long wrote:


The vma_address() function which is used to compute the virtual address
within a VMA is used only by 2 files in the mm subsystem - rmap.c and
huge_memory.c. This function is defined in rmap.c and is inlined by
its callers there, but it is also declared as an external function.

However, the __split_huge_page() function which calls vma_address()
in huge_memory.c is calling it as a real function call. This is not
as efficient as an inlined function. This patch moves the underlying
inlined __vma_address() function to internal.h to be shared by both
the rmap.c and huge_memory.c file.

This increases huge_memory.o's text+data_bss by 311 bytes, which makes
me suspect that it is a bad change due to its increase of kernel cache
footprint.

Perhaps we should be noinlining __vma_address()?

On my test machine, I saw an increase of 144 bytes in the text segment
of huge_memory.o. The increase in size is caused by an increase in the size
of the __split_huge_page function. When I remove the

 if (unlikely(is_vm_hugetlb_page(vma)))
         pgoff = page->index << huge_page_order(page_hstate(page));

check, the increase in size drops down to 24 bytes. As a THP cannot be
a hugetlb page, there is no point in doing this check for a THP. I will
update the patch to pass in an additional argument to disable this
check for __split_huge_page.


I think we're seeking a reason or performance numbers that suggest
__vma_address() being inline is appropriate and so far we lack any such
evidence.  Adding additional parameters to determine checks isn't going to
change the fact that it increases text size needlessly.


This patch was motivated by my investigation of a freeze problem of an 
application running on SLES11 sp3 which was caused by the long time it 
took to munmap part of a THP. Inlining vma_address helped a bit in that 
situation. However, the problem will be essentially gone after including 
patches that change the anon_vma_chain to use an rbtree instead of a 
simple list.


I do agree that the performance impact of inlining vma_address is minimal in 
the latest kernel. So I am not going to pursue this any further.


Thanks for the review.

-Longman
