Re: [PATCH 0/3] scheduler include file reorganization

2013-02-13 Thread Namhyung Kim
On Wed, 13 Feb 2013 09:19:37 -0600, Clark Williams wrote:
> On Wed, 13 Feb 2013 10:15:12 +0100
> Ingo Molnar  wrote:
>> * Namhyung Kim  wrote:
>> > On Mon, 11 Feb 2013 10:54:58 +0100, Ingo Molnar wrote:
>> > > * Clark Williams  wrote:
>> > >
>> > >> I figured that was coming. :)
>> > >
>> > > ;-)
>> > >
>> > >> I'll look at it again and see about pulling the 
>> > >> autogroup/cgroup stuff into its own header. After that it's 
>> > >> probably going to require some serious changes.
>> > >> 
>> > >> Any suggestions?
>> > >
>> > > I'd suggest doing it as finegrained as possible - potentially 
>> > > one concept at a time. I wouldn't mind a dozen small files in 
>> > > include/linux/sched/ - possibly more.
>> > 
>> > What about the .c files?  AFAICS the sched/core.c and 
>> > sched/fair.c are rather huge and contain various concepts 
>> > which might be separated to their own files.  It'd be better 
>> > reorganizing them too IMHO.
>> 
>> I'd be more careful about those, because there's various 
>> scheduler patch-sets floating modifying them.
>> 
>> sched.h is much more static and it is the one that actually gets 
>> included in like 60% of all *other* .c files, adding a few 
>> thousand lines to every .o compilation and causing measurable 
>> compile time overhead ...
>> 
>> So sched.h splitting is something we should really do, if 
>> there's people interested in and capable of pulling it off.
>
> And since I'm one of the people that care about the RT patch (which
> modifies the scheduler files) I'll just start with baby steps and reorg
> the headers. 

Understood.  Thanks for the explanation!

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] gpio: Add device driver for GRGPIO cores and support custom accessors with gpio-generic

2013-02-13 Thread Andreas Larsson
This driver supports GRGPIO gpio cores available in the GRLIB VHDL IP
core library from Aeroflex Gaisler.

This also adds support to gpio-generic for using custom accessor
functions. The grgpio driver uses this to use ioread32be and iowrite32be
for big endian register accesses.

Reviewed-by: Anton Vorontsov 
Signed-off-by: Andreas Larsson 
---

Changes since v3:
- Add Reviewed-by
- Fix pointed out style issues
- Use np->full_name directly instead of kstrdup'ing it as it is a const char*
- Call gpiochip_remove in grgpio_remove

 .../devicetree/bindings/gpio/gpio-grgpio.txt   |   29 +++
 drivers/gpio/Kconfig   |8 +
 drivers/gpio/Makefile  |1 +
 drivers/gpio/gpio-generic.c|   26 ++-
 drivers/gpio/gpio-grgpio.c |  253 
 5 files changed, 308 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/gpio/gpio-grgpio.txt
 create mode 100644 drivers/gpio/gpio-grgpio.c

diff --git a/Documentation/devicetree/bindings/gpio/gpio-grgpio.txt b/Documentation/devicetree/bindings/gpio/gpio-grgpio.txt
new file mode 100644
index 000..36f456f
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpio/gpio-grgpio.txt
@@ -0,0 +1,29 @@
+Aeroflex Gaisler GRGPIO General Purpose I/O cores.
+
+The GRGPIO GPIO core is available in the GRLIB VHDL IP core library.
+
+Note: In the ordinary environment for the GRGPIO core, a Leon SPARC
+system, these properties are built from information in the AMBA plug
+
+Required properties:
+
+- name : Should be "GAISLER_GPIO" or "01_01a"
+
+- reg : Address and length of the register set for the device
+
+- interrupts : Interrupt numbers for this device
+
+Optional properties:
+
+- base : The base gpio number for the core. A dynamic base is used if not
+   present
+
+- nbits : The number of gpio lines. If not present driver assumes 32 lines.
+
+- irqmap : An array with an index for each gpio line. An index is either a valid
+   index into the interrupts property array, or 0x that indicates
+   no irq for that line. Driver provides no interrupt support if not
+   present.
+
+For further information look in the documentation for the GRLIB IP core library:
+http://www.gaisler.com/products/grlib/grip.pdf
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index ab97eb8..5472778 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -309,6 +309,14 @@ config GPIO_LYNXPOINT
  driver for GPIO functionality on Intel Lynxpoint PCH chipset
  Requires ACPI device enumeration code to set up a platform device.
 
+config GPIO_GRGPIO
+   tristate "Aeroflex Gaisler GRGPIO support"
+   depends on OF
+   select GPIO_GENERIC
+   help
+ Select this to support Aeroflex Gaisler GRGPIO cores from the GRLIB
+ VHDL IP core library.
+
 comment "I2C GPIO expanders:"
 
 config GPIO_ARIZONA
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 4398034..f3b49a2 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -26,6 +26,7 @@ obj-$(CONFIG_ARCH_DAVINCI)+= gpio-davinci.o
 obj-$(CONFIG_GPIO_EM)  += gpio-em.o
 obj-$(CONFIG_GPIO_EP93XX)  += gpio-ep93xx.o
 obj-$(CONFIG_GPIO_GE_FPGA) += gpio-ge.o
+obj-$(CONFIG_GPIO_GRGPIO)  += gpio-grgpio.o
 obj-$(CONFIG_GPIO_ICH) += gpio-ich.o
 obj-$(CONFIG_GPIO_IT8761E) += gpio-it8761e.o
 obj-$(CONFIG_GPIO_JANZ_TTL)+= gpio-janz-ttl.o
diff --git a/drivers/gpio/gpio-generic.c b/drivers/gpio/gpio-generic.c
index 05fcc0f..f854799 100644
--- a/drivers/gpio/gpio-generic.c
+++ b/drivers/gpio/gpio-generic.c
@@ -251,24 +251,25 @@ static int bgpio_setup_accessors(struct device *dev,
 struct bgpio_chip *bgc,
 bool be)
 {
+   struct bgpio_chip def;
 
switch (bgc->bits) {
case 8:
-   bgc->read_reg   = bgpio_read8;
-   bgc->write_reg  = bgpio_write8;
+   def.read_reg= bgpio_read8;
+   def.write_reg   = bgpio_write8;
break;
case 16:
-   bgc->read_reg   = bgpio_read16;
-   bgc->write_reg  = bgpio_write16;
+   def.read_reg= bgpio_read16;
+   def.write_reg   = bgpio_write16;
break;
case 32:
-   bgc->read_reg   = bgpio_read32;
-   bgc->write_reg  = bgpio_write32;
+   def.read_reg= bgpio_read32;
+   def.write_reg   = bgpio_write32;
break;
 #if BITS_PER_LONG >= 64
case 64:
-   bgc->read_reg   = bgpio_read64;
-   bgc->write_reg  = bgpio_write64;
+   def.read_reg= bgpio_read64;
+   def.write_reg   = bgpio_write64;
break;
 #endif /* BITS_PER_LONG >= 64 */
default:
@@ -276,7 +277,14 @@ static int 
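
The custom-accessor idea described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: the archived diff is cut off before the point where gpio-generic decides which accessors to keep, so the "keep a caller-supplied accessor, otherwise fall back to the default" rule used here is a guess, and every demo_* name is hypothetical rather than taken from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a register-backed GPIO chip (not struct bgpio_chip). */
struct demo_chip {
	unsigned int bits;                          /* register width in bits */
	uint32_t (*read_reg)(const void *reg);      /* NULL means "use the default" */
	void (*write_reg)(void *reg, uint32_t val); /* NULL means "use the default" */
};

/* Default accessors: native-endian 32-bit access. */
static uint32_t demo_read32(const void *reg)
{
	return *(const volatile uint32_t *)reg;
}

static void demo_write32(void *reg, uint32_t val)
{
	*(volatile uint32_t *)reg = val;
}

/* Custom accessors: big-endian access, done byte by byte. */
static uint32_t demo_read32be(const void *reg)
{
	const volatile uint8_t *p = reg;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void demo_write32be(void *reg, uint32_t val)
{
	volatile uint8_t *p = reg;

	p[0] = (uint8_t)(val >> 24);
	p[1] = (uint8_t)(val >> 16);
	p[2] = (uint8_t)(val >> 8);
	p[3] = (uint8_t)val;
}

/*
 * Assumed rule: only fill in the defaults for accessors the caller left
 * unset. Returns 0 on success, -1 for an unsupported register width.
 */
static int demo_setup_accessors(struct demo_chip *chip)
{
	if (chip->bits != 32)
		return -1;
	if (!chip->read_reg)
		chip->read_reg = demo_read32;
	if (!chip->write_reg)
		chip->write_reg = demo_write32;
	return 0;
}
```

A driver in the style of grgpio would set read_reg and write_reg to the big-endian pair before calling the setup function, while existing users that leave them NULL would keep the native-endian defaults.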

Re: [RFC 1/1] xattr: provide integrity. namespace to read real values

2013-02-13 Thread Kasatkin, Dmitry
Hello,

Any comments about this patch and functionality?

Thanks,
Dmitry

On Wed, Feb 13, 2013 at 11:07 AM, Dmitry Kasatkin
 wrote:
> User space tools use the getxattr() system call to read values of extended
> attributes. getxattr() uses vfs_getxattr(), which for the "security."
> namespace might get a value of the xattr indirectly from LSM via calling
> xattr_getsecurity(). For that reason the value set by setxattr and read by
> getxattr might differ.
>
> Here is an example of SMACK label, which shows that set and read values are
> different:
>
>   setfattr -n security.SMACK64 -v "hello world" foo
>   getfattr -n security.SMACK64 foo
>   # file: foo
>   security.SMACK64="hello"
>
> EVM uses vfs_getxattr_alloc(), which directly reads xattr values from the file
> system. When performing the file system labeling with digital signatures, it is
> necessary to read real xattr values in order to generate the correct signatures.
>
> This patch adds the virtual "integrity." name space, which allows bypassing
> the LSM call and reading the real extended attribute values.
>
>   getfattr -e text -n integrity.SMACK64 foo
>   # file: foo
>   integrity.SMACK64="hello world"
>
> Suggested-by: Casey Schaufler 
> Signed-off-by: Dmitry Kasatkin 
> ---
>  fs/xattr.c |   22 +++---
>  include/uapi/linux/xattr.h |4 
>  2 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xattr.c b/fs/xattr.c
> index 3377dff..76c2620 100644
> --- a/fs/xattr.c
> +++ b/fs/xattr.c
> @@ -232,12 +232,28 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
>  {
> struct inode *inode = dentry->d_inode;
> int error;
> +   char *usename = (char *)name, name_buf[XATTR_NAME_MAX];
> +
> +   /* because this function calls LSM for "security." namespace,
> +* it may be impossible to get real value stored in xattr.
> +* An LSM may mangle the attribute value to its own ends.
> +* Smack is known to do this.
> +* virtual namespace "integrity." is used to fetch real
> +* security attributes without talking to LSM
> +*/
> +   if (!strncmp(name, XATTR_INTEGRITY_PREFIX,
> +   XATTR_INTEGRITY_PREFIX_LEN)) {
> +   /* replace "integrity. with security. */
> +   snprintf(name_buf, sizeof(name_buf), "security.%s",
> +name + XATTR_INTEGRITY_PREFIX_LEN);
> +   usename = name_buf;
> +   }
>
> -   error = xattr_permission(inode, name, MAY_READ);
> +   error = xattr_permission(inode, usename, MAY_READ);
> if (error)
> return error;
>
> -   error = security_inode_getxattr(dentry, name);
> +   error = security_inode_getxattr(dentry, usename);
> if (error)
> return error;
>
> @@ -255,7 +271,7 @@ vfs_getxattr(struct dentry *dentry, const char *name, void *value, size_t size)
> }
>  nolsm:
> if (inode->i_op->getxattr)
> -   error = inode->i_op->getxattr(dentry, name, value, size);
> +   error = inode->i_op->getxattr(dentry, usename, value, size);
> else
> error = -EOPNOTSUPP;
>
> diff --git a/include/uapi/linux/xattr.h b/include/uapi/linux/xattr.h
> index 26607bd..133998b 100644
> --- a/include/uapi/linux/xattr.h
> +++ b/include/uapi/linux/xattr.h
> @@ -20,6 +20,10 @@
>  #define XATTR_SECURITY_PREFIX  "security."
>  #define XATTR_SECURITY_PREFIX_LEN (sizeof (XATTR_SECURITY_PREFIX) - 1)
>
> +/* integrity - security mirror namespace for integrity purpose */
> +#define XATTR_INTEGRITY_PREFIX "integrity."
> +#define XATTR_INTEGRITY_PREFIX_LEN (sizeof (XATTR_INTEGRITY_PREFIX) - 1)
> +
>  #define XATTR_SYSTEM_PREFIX "system."
>  #define XATTR_SYSTEM_PREFIX_LEN (sizeof (XATTR_SYSTEM_PREFIX) - 1)
>
> --
> 1.7.10.4
>
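
The name rewriting at the heart of the quoted patch can be tried out in userspace. The following is a minimal sketch, not the kernel code: the DEMO_* macros and demo_map_xattr_name() are hypothetical names, and unlike the quoted hunk it reports an error when the rewritten name does not fit in the buffer, which is an addition made here for illustration.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_SECURITY_PREFIX      "security."
#define DEMO_INTEGRITY_PREFIX     "integrity."
#define DEMO_INTEGRITY_PREFIX_LEN (sizeof(DEMO_INTEGRITY_PREFIX) - 1)

/*
 * Map "integrity.X" to "security.X"; copy any other name unchanged.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int demo_map_xattr_name(const char *name, char *buf, size_t buflen)
{
	int n;

	if (strncmp(name, DEMO_INTEGRITY_PREFIX, DEMO_INTEGRITY_PREFIX_LEN) == 0)
		n = snprintf(buf, buflen, DEMO_SECURITY_PREFIX "%s",
			     name + DEMO_INTEGRITY_PREFIX_LEN);
	else
		n = snprintf(buf, buflen, "%s", name);

	/* snprintf reports the length it wanted to write; detect truncation. */
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With this mapping a lookup of "integrity.SMACK64" is served from the stored "security.SMACK64" attribute, which is the behaviour the getfattr example in the patch description relies on.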


[PATCH v3 5/6] Input: matrix-keymap: Add function to read the new DT binding

2013-02-13 Thread Simon Glass
We now have a binding which adds two parameters to the matrix keypad DT
node. This is separate from the GPIO-driven matrix keypad binding, and
unfortunately incompatible, since that uses row-gpios/col-gpios for the
row and column counts.

So the easiest option here is to provide a function for non-GPIO drivers
to use to decode the binding.

Note: We could in fact create an entirely separate structure to hold
these two fields, but it does not seem worth it, yet. If we have more
parameters then we can add this, and then refactor each driver to hold
such a structure.

Signed-off-by: Simon Glass 
Tested-by: Sourav Poddar  (v2)
---
Changes in v3:
- Add stub for matrix_keypad_parse_of_params() when no CONFIG_OF
- Put back full DT range checking in tca8418 driver

Changes in v2:
- Add new patch to decode matrix-keypad DT binding

 drivers/input/keyboard/lpc32xx-keys.c   | 11 ++-
 drivers/input/keyboard/omap4-keypad.c   | 16 +---
 drivers/input/keyboard/tca8418_keypad.c |  7 +--
 drivers/input/matrix-keymap.c   | 19 +++
 include/linux/input/matrix_keypad.h | 19 +++
 5 files changed, 54 insertions(+), 18 deletions(-)

diff --git a/drivers/input/keyboard/lpc32xx-keys.c b/drivers/input/keyboard/lpc32xx-keys.c
index 1b8add6..4218143 100644
--- a/drivers/input/keyboard/lpc32xx-keys.c
+++ b/drivers/input/keyboard/lpc32xx-keys.c
@@ -144,12 +144,13 @@ static int lpc32xx_parse_dt(struct device *dev,
 {
struct device_node *np = dev->of_node;
u32 rows = 0, columns = 0;
+   int err;
 
-   of_property_read_u32(np, "keypad,num-rows", &rows);
-   of_property_read_u32(np, "keypad,num-columns", &columns);
-   if (!rows || rows != columns) {
-   dev_err(dev,
-   "rows and columns must be specified and be equal!\n");
+   err = matrix_keypad_parse_of_params(dev, &rows, &columns);
+   if (err)
+   return err;
+   if (rows != columns) {
+   dev_err(dev, "rows and columns must be equal!\n");
return -EINVAL;
}
 
diff --git a/drivers/input/keyboard/omap4-keypad.c b/drivers/input/keyboard/omap4-keypad.c
index e25b022..1b28909 100644
--- a/drivers/input/keyboard/omap4-keypad.c
+++ b/drivers/input/keyboard/omap4-keypad.c
@@ -215,18 +215,12 @@ static int omap4_keypad_parse_dt(struct device *dev,
 struct omap4_keypad *keypad_data)
 {
struct device_node *np = dev->of_node;
+   int err;
 
-   if (!np) {
-   dev_err(dev, "missing DT data");
-   return -EINVAL;
-   }
-
-   of_property_read_u32(np, "keypad,num-rows", &keypad_data->rows);
-   of_property_read_u32(np, "keypad,num-columns", &keypad_data->cols);
-   if (!keypad_data->rows || !keypad_data->cols) {
-   dev_err(dev, "number of keypad rows/columns not specified\n");
-   return -EINVAL;
-   }
+   err = matrix_keypad_parse_of_params(dev, &keypad_data->rows,
+   &keypad_data->cols);
+   if (err)
+   return err;
 
if (of_get_property(np, "linux,input-no-autorepeat", NULL))
keypad_data->no_autorepeat = true;
diff --git a/drivers/input/keyboard/tca8418_keypad.c b/drivers/input/keyboard/tca8418_keypad.c
index a34cc67..55c1530 100644
--- a/drivers/input/keyboard/tca8418_keypad.c
+++ b/drivers/input/keyboard/tca8418_keypad.c
@@ -288,8 +288,11 @@ static int tca8418_keypad_probe(struct i2c_client *client,
irq_is_gpio = pdata->irq_is_gpio;
} else {
struct device_node *np = dev->of_node;
-   of_property_read_u32(np, "keypad,num-rows", );
-   of_property_read_u32(np, "keypad,num-columns", );
+   int err;
+
+   err = matrix_keypad_parse_of_params(dev, , );
+   if (err)
+   return err;
rep = of_property_read_bool(np, "keypad,autorepeat");
}
 
diff --git a/drivers/input/matrix-keymap.c b/drivers/input/matrix-keymap.c
index 3ae496e..619b382 100644
--- a/drivers/input/matrix-keymap.c
+++ b/drivers/input/matrix-keymap.c
@@ -50,6 +50,25 @@ static bool matrix_keypad_map_key(struct input_dev *input_dev,
 }
 
 #ifdef CONFIG_OF
+int matrix_keypad_parse_of_params(struct device *dev,
+ unsigned int *rows, unsigned int *cols)
+{
+   struct device_node *np = dev->of_node;
+
+   if (!np) {
+   dev_err(dev, "missing DT data");
+   return -EINVAL;
+   }
+   of_property_read_u32(np, "keypad,num-rows", rows);
+   of_property_read_u32(np, "keypad,num-columns", cols);
+   if (!*rows || !*cols) {
+   dev_err(dev, "number of keypad rows/columns not specified\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static int matrix_keypad_parse_of_keymap(const char *propname,
 unsigned int rows, unsigned 
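
The shape of the shared helper can be illustrated outside the kernel with a mock property store. This is a sketch under assumptions: struct demo_node, demo_read_u32() and demo_parse_matrix_params() are hypothetical stand-ins, not kernel APIs, and the mock reader is written to leave its output untouched when a property is absent. The sketch also zeroes the outputs itself, whereas the helper in the patch relies on the caller having done so.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device-tree node: a flat list of u32 properties. */
struct demo_prop {
	const char *name;
	unsigned int value;
};

struct demo_node {
	const struct demo_prop *props;
	size_t nprops;
};

/* Mock property read: leaves *out untouched when the property is absent. */
static int demo_read_u32(const struct demo_node *np, const char *name,
			 unsigned int *out)
{
	size_t i;

	for (i = 0; i < np->nprops; i++) {
		if (strcmp(np->props[i].name, name) == 0) {
			*out = np->props[i].value;
			return 0;
		}
	}
	return -1;
}

/* Read both dimensions and reject a missing node or a missing/zero value. */
static int demo_parse_matrix_params(const struct demo_node *np,
				    unsigned int *rows, unsigned int *cols)
{
	if (!np)
		return -1;

	/* Start from zero so an absent property is detected below. */
	*rows = 0;
	*cols = 0;
	demo_read_u32(np, "keypad,num-rows", rows);
	demo_read_u32(np, "keypad,num-columns", cols);
	if (!*rows || !*cols)
		return -1;

	return 0;
}
```

Each driver then only adds its own constraint on top, such as the rows == columns check that lpc32xx keeps.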

Re: [PATCH] efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter

2013-02-13 Thread Matt Fleming
On Wed, 2013-02-13 at 17:20 -0800, H. Peter Anvin wrote:
> On 02/13/2013 04:12 PM, Satoru Takeuchi wrote:
> > From: Satoru Takeuchi 
> > 
> > There was a serious problem in samsung-laptop that its platform driver is
> > designed to run under BIOS and running under EFI can cause the machine to
> > become bricked or can cause Machine Check Exceptions.
> > 
> 
> Matt, unless you object I'll pick this one up as urgent, please take the
> cleanup patch in normal order.

No objection from me, this looks correct.



Re: [PATCH RFC 5/5] UBIFS: add ubifs_err() to print error reason

2013-02-13 Thread Artem Bityutskiy
On Wed, 2013-02-13 at 11:23 +0100, Marc Kleine-Budde wrote:
> err = ubifs_init_security(dir, inode, >d_name);
> -   if (err)
> +   if (err) {
> +   ubifs_err("cannot initialize extended attribute, error %d",
> + err);
> goto out_cancel;
> +   }

Would you please instead make 'ubifs_init_security()' print the error
message.

-- 
Best Regards,
Artem Bityutskiy



Re: [PATCH RFC 4/5] UBIFS: Add security.* XATTR support for the UBIFS

2013-02-13 Thread Artem Bityutskiy
On Wed, 2013-02-13 at 11:23 +0100, Marc Kleine-Budde wrote:
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -553,7 +553,8 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
>  
> dbg_jnl("ino %lu, dent '%.*s', data len %d in dir ino %lu",
> inode->i_ino, nm->len, nm->name, ui->data_len, dir->i_ino);
> -   ubifs_assert(dir_ui->data_len == 0);
> +   if (!xent)
> +   ubifs_assert(dir_ui->data_len == 0);

Shouldn't this snippet be in 2/5 instead?

-- 
Best Regards,
Artem Bityutskiy



[patch] generic-adc-battery: forever loop in gab_remove()

2013-02-13 Thread Dan Carpenter
There is a forever loop calling iio_channel_release() because the
"chan < " part of the "chan < ARRAY_SIZE()" is missing.  This is in both
the error handling on probe and also in the remove function.

The other thing is that it's possible for some of the elements of the
adc_bat->channel[chan] array to be an ERR_PTR().  I've changed them to
be NULL instead.  We're still not allowed to pass NULLs to
iio_channel_release() so I've added a check.

Finally, I removed an unused "chan = ARRAY_SIZE(gab_chan_name);"
statement as a small cleanup.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/power/generic-adc-battery.c b/drivers/power/generic-adc-battery.c
index 42733c4..8cb5d7f 100644
--- a/drivers/power/generic-adc-battery.c
+++ b/drivers/power/generic-adc-battery.c
@@ -263,9 +263,6 @@ static int gab_probe(struct platform_device *pdev)
psy->external_power_changed = gab_ext_power_changed;
adc_bat->pdata = pdata;
 
-   /* calculate the total number of channels */
-   chan = ARRAY_SIZE(gab_chan_name);
-
/*
 * copying the static properties and allocating extra memory for holding
 * the extra configurable properties received from platform data.
@@ -291,6 +288,7 @@ static int gab_probe(struct platform_device *pdev)
 gab_chan_name[chan]);
if (IS_ERR(adc_bat->channel[chan])) {
ret = PTR_ERR(adc_bat->channel[chan]);
+   adc_bat->channel[chan] = NULL;
} else {
/* copying properties for supported channels only */
memcpy(properties + sizeof(*(psy->properties)) * index,
@@ -344,8 +342,10 @@ err_gpio:
 gpio_req_fail:
power_supply_unregister(psy);
 err_reg_fail:
-   for (chan = 0; ARRAY_SIZE(gab_chan_name); chan++)
-   iio_channel_release(adc_bat->channel[chan]);
+   for (chan = 0; chan < ARRAY_SIZE(gab_chan_name); chan++) {
+   if (adc_bat->channel[chan])
+   iio_channel_release(adc_bat->channel[chan]);
+   }
 second_mem_fail:
kfree(psy->properties);
 first_mem_fail:
@@ -365,8 +365,10 @@ static int gab_remove(struct platform_device *pdev)
gpio_free(pdata->gpio_charge_finished);
}
 
-   for (chan = 0; ARRAY_SIZE(gab_chan_name); chan++)
-   iio_channel_release(adc_bat->channel[chan]);
+   for (chan = 0; chan < ARRAY_SIZE(gab_chan_name); chan++) {
+   if (adc_bat->channel[chan])
+   iio_channel_release(adc_bat->channel[chan]);
+   }
 
kfree(adc_bat->psy.properties);
cancel_delayed_work(&adc_bat->bat_work);
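
The loop bug is easy to reproduce in isolation: a condition of just ARRAY_SIZE(gab_chan_name) is a non-zero constant, so it is always true and the index runs off the end of the array. Below is a minimal userspace sketch of the corrected pattern, including the NULL check. All demo_* names are hypothetical and the release hook only records that it was called.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct demo_channel {
	int released;
};

static const char *const demo_chan_name[] = { "voltage", "current", "power" };

/* Hypothetical release hook; like the real one it must not be handed NULL. */
static void demo_channel_release(struct demo_channel *ch)
{
	ch->released = 1;
}

/* Release every channel that was actually acquired; returns how many. */
static size_t demo_release_all(struct demo_channel *channel[])
{
	size_t chan, count = 0;

	/* "chan <" is the part that was missing from the original condition. */
	for (chan = 0; chan < DEMO_ARRAY_SIZE(demo_chan_name); chan++) {
		if (channel[chan]) {
			demo_channel_release(channel[chan]);
			count++;
		}
	}
	return count;
}
```

Storing NULL instead of an ERR_PTR() value for a failed channel is what makes the simple pointer test in the loop sufficient.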


Re: [PATCH RFC 1/5] UBIFS: xattr: protect ui_size and data_len by ui_mutex

2013-02-13 Thread Artem Bityutskiy
On Wed, 2013-02-13 at 11:23 +0100, Marc Kleine-Budde wrote:
> This patch moves the modification of ui->ui_size and ui->data_len in the
> create_xattr() and change_xattr() functions, so that they are protected by the
> ui_mutex as stated in the documentation of the struct ubifs_inode.
> 
> Signed-off-by: Marc Kleine-Budde 

I guess this one and 2/5 should have 'Cc: sta...@vger.kernel.org',
right?

-- 
Best Regards,
Artem Bityutskiy



Re: [PATCH v1 4/4] i2c-mux: i2c_add_mux_adapter() should use -1 for auto bus num

2013-02-13 Thread Jean Delvare
On Wed, 13 Feb 2013 14:09:08 -0700, Stephen Warren wrote:
> On 02/13/2013 11:02 AM, Doug Anderson wrote:
> > The force_nr parameter to i2c_add_mux_adapter() uses 0 to signify that
> > we don't want to force the bus number of the adapter.  This is
> > non-ideal because:
> > * 0 is actually a valid bus number to request
> > * i2c_add_numbered_adapter() (which i2c_add_mux_adapter() calls) uses
> >   -1 to mean the same thing.  That means extra logic in
> >   i2c_add_mux_adapter().
> > 
> > Fix i2c_add_mux_adapter() to use -1 and update all mux drivers
> > accordingly.
> > 
> > Signed-off-by: Doug Anderson 
> > ---
> > Notes:
> > - If there's a good reason that force_nr uses 0 for auto then feel
> >   free to drop this patch.  I've placed it at the end of the series to
> >   make it easy to just drop it.
> 
> IIRC (and I only vaguely do...) it's because:
> 
> > diff --git a/drivers/i2c/muxes/i2c-mux-gpio.c b/drivers/i2c/muxes/i2c-mux-gpio.c
> > index 9f50ef0..301ed0b 100644
> > --- a/drivers/i2c/muxes/i2c-mux-gpio.c
> > +++ b/drivers/i2c/muxes/i2c-mux-gpio.c
> > @@ -208,7 +208,7 @@ static int i2c_mux_gpio_probe(struct platform_device *pdev)
> > }
> >  
> > for (i = 0; i < mux->data.n_values; i++) {
> > -   u32 nr = mux->data.base_nr ? (mux->data.base_nr + i) : 0;
> > +   int nr = mux->data.base_nr ? (mux->data.base_nr + i) : -1;
> 
> Here, mux->data.base_nr is platform data (or copied directly from it),
> and any field in a platform data struct stored in a global variable not
> explicitly initialized would be 0, hence 0 would typically mean "no
> explicit bus number desired". Since a mux can't exist without a parent
> I2C bus, it's unlikely anyone would want a mux to be I2C bus 0, but
> rather the parent to have that number.

Yes, as I recall this is exactly the reason why the current code is the
way it is.

-- 
Jean Delvare
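
The point about zero-initialized platform data can be shown with a few lines of standalone C. This is a sketch only: struct demo_mux_pdata and demo_child_nr() are hypothetical names that mirror the quoted hunk, with the patched behaviour of returning -1 for "let the core pick a number".

```c
/* Hypothetical platform data; base_nr == 0 means "not set". */
struct demo_mux_pdata {
	unsigned int base_nr;
	unsigned int n_values;
};

/* Static storage without an initializer: every field starts out as 0. */
static struct demo_mux_pdata demo_default_pdata;

/* Bus number to request for child i, or -1 to let the core pick one. */
static int demo_child_nr(const struct demo_mux_pdata *pd, unsigned int i)
{
	return pd->base_nr ? (int)(pd->base_nr + i) : -1;
}
```

Because a board file that never mentions base_nr ends up with 0, the platform-data field keeps 0 as its "unset" marker even if the internal force_nr parameter switches to -1, and the translation happens at this one spot.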


Re: SELinux + ubifs: possible circular locking dependency

2013-02-13 Thread Artem Bityutskiy
Mark, how about this one? I compiled it and ran on my fedora 16 with
SElinux enabled, no obvious issues.

From a19350097200570571aa522afebb96b34db534f4 Mon Sep 17 00:00:00 2001
From: Artem Bityutskiy 
Date: Thu, 14 Feb 2013 09:07:36 +0200
Subject: [PATCH] selinux: do not confuse lockdep

Selinux has per-inode mutexes called 'isec->lock', and they are initialized in
the same place, which makes lockdep treat all of them as if they were
identical. However, locking rules may be a little bit different depending on
the file-system, so we should put these locks into separate classes, just like we
do for 'i_mutex'. Namely, we should put them into per-FS type classes, which is
exactly what this patch does.

The problem this patch intends to fix is a strange lockdep warning, which I,
frankly speaking, do not really understand, but I believe the root-cause should
be fixed by this patch.

Look at the stacktrace #4: here we have 'debugfs_create_dir()'

[5.390312] ==
[5.396500] [ INFO: possible circular locking dependency detected ]
[5.402781] 3.8.0-rc6-5-g4f7e39d #49 Not tainted
[5.407750] ---
[5.414031] systemd/1 is trying to acquire lock:
[5.418656]  (>tnc_mutex){+.+...}, at: [] ubifs_tnc_locate+0x30/0x198
[5.426343]
[5.426343] but task is already holding lock:
[5.432218]  (>lock){+.+.+.}, at: [] inode_doinit_with_dentry+0x8c/0x55c
[5.440375]
[5.440375] which lock already depends on the new lock.
[5.440375]
[5.448593]
[5.448593] the existing dependency chain (in reverse order) is:
[5.456093]
-> #4 (>lock){+.+.+.}:
[5.460250][] lock_acquire+0x64/0x78
[5.465437][] mutex_lock_nested+0x5c/0x2ec
[5.471125][] inode_doinit_with_dentry+0x8c/0x55c
[5.477437][] security_d_instantiate+0x1c/0x34
[5.483500][] debugfs_mknod.part.15.constprop.18+0x94/0x128
[5.490656][] __create_file+0x1b0/0x25c
[5.496093][] debugfs_create_dir+0x1c/0x28
[5.501781][] pinctrl_init+0x1c/0xd0
[5.506968][] do_one_initcall+0x108/0x17c
[5.512593][] kernel_init_freeable+0xec/0x1b4
[5.518562][] kernel_init+0x8/0xe4
[5.523562][] ret_from_fork+0x14/0x2c
[5.528812]
-> #3 (>s_type->i_mutex_key#2){+.+.+.}:
[5.534312][] lock_acquire+0x64/0x78
[5.539468][] mutex_lock_nested+0x5c/0x2ec
[5.545187][] __create_file+0x50/0x25c
[5.550531][] debugfs_create_dir+0x1c/0x28
[5.556218][] clk_debug_create_subtree+0x1c/0x108
[5.562500][] clk_debug_init+0x68/0xc0
[5.567875][] do_one_initcall+0x108/0x17c
[5.573468][] kernel_init_freeable+0xec/0x1b4
[5.579437][] kernel_init+0x8/0xe4
[5.584437][] ret_from_fork+0x14/0x2c
[5.589687]
-> #2 (prepare_lock){+.+.+.}:
[5.593937][] lock_acquire+0x64/0x78
[5.599125][] mutex_lock_nested+0x5c/0x2ec
[5.604812][] clk_prepare+0x18/0x38
[5.609906][] __gpmi_enable_clk+0x30/0xb0
[5.615531][] gpmi_begin+0x18/0x530
[5.620625][] gpmi_select_chip+0x3c/0x54
[5.626156][] nand_do_read_ops+0x7c/0x3e4
[5.631750][] nand_read+0x50/0x74
[5.636656][] part_read+0x5c/0xa4
[5.641593][] mtd_read+0x84/0xb8
[5.646406][] ubi_io_read+0xa0/0x2c0
[5.651593][] ubi_eba_read_leb+0x190/0x424
[5.657281][] ubi_leb_read+0xac/0x120
[5.662562][] ubifs_leb_read+0x28/0x8c
[5.667906][] ubifs_read_node+0x98/0x2a0
[5.673437][] ubifs_read_sb_node+0x54/0x78
[5.679125][] ubifs_read_superblock+0xc60/0x163c
[5.685343][] ubifs_mount+0x800/0x171c
[5.690687][] mount_fs+0x44/0x184
[5.695593][] vfs_kern_mount+0x4c/0xc0
[5.700968][] do_mount+0x18c/0x8d0
[5.705968][] sys_mount+0x84/0xb8
[5.710875][] mount_block_root+0x118/0x258
[5.716562][] prepare_namespace+0x8c/0x17c
[5.722281][] kernel_init+0x8/0xe4
[5.727281][] ret_from_fork+0x14/0x2c
[5.732531]
-> #1 (>mutex){..}:
[5.736625][] lock_acquire+0x64/0x78
[5.741812][] down_read+0x40/0x54
[5.746718][] ubi_eba_read_leb+0x34/0x424
[5.752312][] ubi_leb_read+0xac/0x120
[5.757562][] ubifs_leb_read+0x28/0x8c
[5.762937][] ubifs_read_node+0x98/0x2a0
[5.768437][] ubifs_load_znode+0x88/0x560
[5.774062][] ubifs_lookup_level0+0x190/0x1dc
[5.780031][] ubifs_tnc_locate+0x44/0x198
[5.785656][] ubifs_iget+0x6c/0x8a4
[5.790718][] ubifs_mount+0xc18/0x171c
[5.796093][] mount_fs+0x44/0x184
[5.801000][] vfs_kern_mount+0x4c/0xc0
[5.806343][] 

Re: [PATCH review 52/85] sunrpc: Properly encode kuids and kgids in auth.unix.gid rpc pipe upcalls.

2013-02-13 Thread Stanislav Kinsbursky

14.02.2013 03:22, Eric W. Biederman wrote:

"J. Bruce Fields"  writes:


On Wed, Feb 13, 2013 at 02:32:29PM -0800, Eric W. Biederman wrote:

"J. Bruce Fields"  writes:


On Wed, Feb 13, 2013 at 01:29:35PM -0800, Eric W. Biederman wrote:

"J. Bruce Fields"  writes:


On Wed, Feb 13, 2013 at 09:51:41AM -0800, Eric W. Biederman wrote:

From: "Eric W. Biederman" 

When a new rpc connection is established with an in-kernel server, the
traffic passes through svc_process_common, and svc_set_client and down
into svcauth_unix_set_client if it is of type RPC_AUTH_NULL or
RPC_AUTH_UNIX.

svcauth_unix_set_client then looks at the uid of the credential we
have assigned to the incoming client and if we don't have the groups
already cached makes an upcall to get a list of groups that the client
can use.

The upcall sends an rpc message to user space encoding the uid
of the user whose groups we want to know.  Encode the kuid of the user
in the initial user namespace as nfs mounts can only happen today in
the initial user namespace.


OK, I didn't know that.

(Though I'm unclear how it should matter to the server what user
namespace the client is in?)


Perhaps I have the description a little scrambled.  The short version
is that to start I only support the initial network namespace.

If I haven't succeeded it is my intent to initially limit the servers
to the initial user namespace as well.  I should see if I can figure
that out.


When a reply to an upcall comes in interpret the uid and gid values
from the rpc pipe as uids and gids in the initial user namespace and convert
them into kuids and kgids before processing them further.

When reading proc files listing the uid to gid list cache convert the
kuids and kgids into uids and gids in the initial user namespace.  As we are
displaying server internal details it makes sense to display these values
from the server's perspective.


All of these caches are already per-network-namespace.  Ideally wouldn't
we also like to associate a user namespace with each cache somehow?


Ideally yes.  I read through the caches enough to figure out where their
user space interfaces were, and to make certain we had conversions
to/from kuids and kgids.

I haven't looked at what user namespace makes sense for these
caches.  For this cache my first guess is that net->user_ns
is what we want as it will be shared by all users in network namespace I
presume.


Oh, I didn't know about net->user_ns--so each network namespace is
associated with a single user namespace, great, that simplifies life.
Yes, that sounds exactly right.


Yes. net->user_ns is the user namespace the network namespace was
created in.  And it is the user namespace that is used in test
like ns_capable(net->user_ns, CAP_NET_ADMIN) to see if you are allowed
to manipulate the network namespace.  So looks like exactly what we
want for that cache.

Could you double check my understanding of the code?

I want to be certain that I can't _yet_ start an sunrpc server process
outside of the initial user namespace.  While writing an earlier reply I
realized that I hadn't thought about where sunrpc server processes come
from.

Reading through the code it looks like we can have nfs mounts outside of
the initial network namespace.


We're talking about the server side here, not the client, so I'm not
sure what you mean by "nfs mounts".  The nfs server does use various
pseudofilesystems ("proc", "nfsd"), and those can be mounted outside the
initial network namespace.


Actually I was seeing that nfs clients were starting lockd.  So I was
just reasoning here that anything that came from a nfs client was
ultimately in the user namespace of that client, which is ultimately
limited by the client out.


The server can receive rpc requests over network interfaces outside the
initial network namespace, sure.  The server doesn't perform mounts on
behalf of clients, though, it just accesses previously mounted
filesystems on clients' behalf.


But nfsd_init_socks only creates sockets in a single network namespace,
and today we pass only _net.


But because they are mounts they are
still limited to the initial user namespace.


OK, so that's just a limitation on any mount whatsoever for now.  I'm
catching on, slowly, thanks!


If you set in struct file_system_type .fs_flags = FS_USERNS_MOUNT your
filesystem can be mounted outside of the initial user namespace.  But
since that takes extra work and because unprivileged users are allowed
to create user namespaces and perform the mounts by default it is off.


Now looking at the nfs server, seems to be hard coded to only start
in the initial network namespace despite almost having support for
starting in more.


Right, Stanislav's got 4 more patches that should finish the job; see
http://mid.gmane.org/<20130201125210.3257.46454.stgit@localhost.localdomain>
and followups.  That should make it for 3.9; I just need to review
them.


Ok that is interesting.

There is an interesting corner case 

Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Andrew Bartlett
On Thu, 2013-02-14 at 15:44 +0900, Namjae Jeon wrote:
> 2013/2/14, Andrew Bartlett :
> > (apologies for the duplicate mail, I typo-ed the maintainers address)
> >
> > G'day,
> >
> > I've been looking into the patch "[v2] fat: editions to support
> > fat_fallocate()" and I wonder if there is a way we can split this issue
> > in two, so that we get at least some of the patch into the kernel.
> >
> > https://lkml.org/lkml/2012/10/13/75
> > https://patchwork.kernel.org/patch/1589161/
> >
> > What I'm wanting to discuss (and perhaps implement, with you if
> > possible) is splitting this patch into writing to existing pre-allocated
> > files, and creating a new pre-allocation.
> >
> > If Windows does, as you claim, simply read preallocations as zero, and
> > writes to them normally and without error, then Linux should do the
> > same.  Here of course I'm assuming that Windows is not preallocating,
> > but instead simply trying to recover gracefully and safely from a simple
> > 'file system corruption', where the sectors are allocated but not used.
> >
> > The bulk of this patch is implementing this transparent recovery, and it
> > seems relatively harmless to include in the kernel.
> >
> > Then vendors doing TV streaming, or in my case copies of large files
> > onto Samba-mounted USB FAT devices, can add only the smaller patch to
> > implement fallocate, at their own risk and fully knowing that it will be
> > regarded as corrupt on Linux.
> >
> > If accepted read support will, over a period of years, trickle down to
> > other Linux users, broadening the base that can still read these
> > 'corrupt' drives, no matter the cause.
> >
> > I hope you agree that this is a practical way forward, and I look
> > forward to working with you on this.
> >
> > Thanks,
> Hi Andrew.
> 
> First, Thanks for your interest !
> A mismatch between inode size and reserved blocks can be either due to
> pre-allocation (after our changes) or due to corruption (sudden unplug
> of media etc).
> We don’t think it is right to include only read only support (i.e.
> without fallocate support) for such files because if such files are
> encountered it only means that the file is corrupted, as there is no
> current method to check if the issue is due to pre-allocation.
> If it is to be included in the kernel, then the whole patch has to go
> in. 

I don't see why that is the case. 

> But then again, since the FAT specifications do not accommodate
> for pre-allocation, then it is up to OGAWA to decide if this is
> acceptable.
> In any case, the patch will definitely break backward compatibility
> (on an older fat driver without fallocate support) and also in case
> for the two variants for the same kernel versions and only one has
> FALLOCATE enabled, in such cases also, the behavior will assume
> corruption in one case.

I agree that the sudden unplug is a concern, but why not make the
filesystem more robust against that inevitable occurrence?  If the
blocks appear to be allocated to the file, why not use them?

That is, while it is hard to predict the many different ways a
filesystem can be corrupted, what would go wrong if we did use these
clusters?  Do you fear that they might also be allocated to someone
else? 

That would, if I understand correctly, just mean that more broken,
not quite valid USB thumb drives and other FAT filesystems work equally
well on Windows and Linux, without administrative privileges.  (Given
that running fsck requires root, and isn't trivially available to normal
users in Linux, and I presume is similarly privileged in Windows.)

What I'm doing is suggesting re-purposing your patch, from preallocation
to robustness.  In this light, do you think this is worth pushing forward?

We can later address if there is any safe way to preallocate files on
FAT as a different question, hoping that this means it will 'just work'
on a broader range of other Linux hosts, just as it is claimed to 'just
work' on Windows.

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team   http://samba.org




Re: linux-next: Tree for Feb 13 (virtio_console)

2013-02-13 Thread Stephen Rothwell
Hi Rusty,

On Thu, 14 Feb 2013 13:30:37 +1030 Rusty Russell  wrote:
>
> This looks like an impossible config.  CONFIG_VIRTIO_CONSOLE=y, but
> CONFIG_HVC_DRIVER isn't set.
> 
> From drivers/char/Kconfig:
> 
> config VIRTIO_CONSOLE
>   tristate "Virtio console"
>   depends on VIRTIO

This also has "&& TTY" in -next (not actually relevant)

>   select HVC_DRIVER

It's weird, but since CONFIG_TTY is not set (see the config), the
HVC_DRIVER symbol is not even visible, so I suspect that the above select
does nothing :-(  But also, I can't see how VIRTIO_CONSOLE could be set
in the first place since TTY is not set.

(cc'ing some more people) (this is a randconfig that has TTY=n,
HVC_DRIVER=n, but VIRTIO_CONSOLE=y)
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 1/4] of/pci: Provide support for parsing PCI DT ranges property

2013-02-13 Thread Thierry Reding
On Wed, Feb 13, 2013 at 10:09:56PM +, Grant Likely wrote:
> On Wed, 13 Feb 2013 22:29:51 +0100, Thierry Reding 
>  wrote:
> > On Wed, Feb 13, 2013 at 01:54:53PM -0600, Rob Herring wrote:
> > > On 02/13/2013 08:25 AM, Thierry Reding wrote:
> > > > On Wed, Feb 13, 2013 at 08:23:28AM -0600, Rob Herring wrote:
> > > >> On 02/12/2013 12:45 AM, Thierry Reding wrote:
> > > >>> On Mon, Feb 11, 2013 at 01:43:03PM -0600, Rob Herring wrote:
> > >  On 02/11/2013 02:22 AM, Thierry Reding wrote:
> > > > From: Andrew Murray 
> > > >>
> > > > @@ -13,6 +13,7 @@
> > > >  #define OF_CHECK_COUNTS(na, ns)(OF_CHECK_ADDR_COUNT(na) && 
> > > > (ns) > 0)
> > > >  
> > > >  static struct of_bus *of_match_bus(struct device_node *np);
> > > > +static struct of_bus *of_find_bus(const char *name);
> > > 
> > >  Can you move this function up to avoid the forward declaration.
> > > >>>
> > > >>> It needs to be defined after the of_busses structure, which is defined
> > > >>> below the CONFIG_PCI block where of_pci_process_ranges() is defined. 
> > > >>> I'd
> > > >>> have to move that one as well and add another #ifdef CONFIG_PCI 
> > > >>> section.
> > > >>> If you prefer that I can do that.
> > > >>
> > > >> Okay, it's fine as is.
> > > >>
> > > > +static struct of_bus *of_find_bus(const char *name)
> > > > +{
> > > > +   unsigned int i;
> > > > +
> > > > +   for (i = 0; i < ARRAY_SIZE(of_busses); i++)
> > > > +   if (strcmp(name, of_busses[i].name) == 0)
> > >    ^
> > >  space needed.
> > > >>>
> > > >>> I don't understand. Do you want the space to go between '.' and 
> > > >>> "name"?
> > > >>
> > > >> Must have been some dirt on my screen... Never mind.
> > > >>
> > > >> I'll apply these for 3.9.
> > > > 
> > > > Great, thanks!
> > > 
> > > Grant vetoed merging. We need to see the other architectures using these
> > > functions rather than add yet another copy.
> > 
> > I think I've said this before, but converting the other architectures
> > isn't very trivial, mostly because each has a specific way of storing
> > the values read from these properties.
> 
> Sorry to be harsh, but this isn't new information. I've had to deal with
> the pain more than once before of copied infrastructure that at some
> time in the future needs to be merged again. Just looking at your patch
> I can tell that it is directly derived from the powerpc
> > pci_process_bridge_OF_ranges(), of which microblaze already has a
> > verbatim copy.
> 
> So, no, I'm not okay with it for v3.9. I don't want more copies of the
> same code. This doesn't block your v3.10 drivers. When a better patch is
> ready we can set up a separate branch with just the new functions in it
> and the various subsystems can merge that in if needed to resolve
> dependencies.
> 
> Instead, here is what you do; you've got the bones of a good approach,
> but you need to show how it is derived from the powerpc approach. I'll
> reply in specifics to the patches themselves, but I can definitely see
> large blocks of code that can be moved out of powerpc & microblaze and
> into drivers/of/address.c without getting into the platform-specific
> PCI representations that you're concerned about.
> 
> Now, to be clear here, I'm asking you to change powerpc/microblaze code,
> > but I am *not asking you to test it*. This is a code move exercise, and
> I will help you with it if you need.

Alright. I have no idea about how this is going to affect the timeframe,
though. Granted, this doesn't sound as painful as I had assumed, but it
is quite a bit of work and I have to see how I can squeeze it in with
everything else.

Thierry




Re: [PATCH RFC] davinci: poll for sleep completion in resume routine.

2013-02-13 Thread Sekhar Nori
On 2/14/2013 10:46 AM, Vishwanathrao Badarkhe, Manish wrote:
> Hi Sekhar,
> 
> On Thu, Feb 14, 2013 at 09:48:59, Nori, Sekhar wrote:
>> Manish,
>>
>> On 1/31/2013 2:56 PM, Vishwanathrao Badarkhe, Manish wrote:
>>> As per OMAP-L138 TRM, Software must poll for SLEEPCOMPLETE bit until 
>>> it is set to 1 before clearing SLEEPENABLE bit in DEEPSLEEP register 
>>> in resume routine.
>>> Modifications are as per datasheet:
>>> http://www.ti.com/lit/ug/spruh77a/spruh77a.pdf
>>> See sections 10.10.2.2 and 11.5.21 for more detailed explanation.
>>
>> Polling for SLEEPCOMPLETE is not required in RTC controlled wake-up which is 
>> the mode currently supported (see section 10.10.2.1 of the TRM). Polling for 
>> SLEEPCOMPLETE is required for external controlled wake-up which to my 
>> knowledge has never been tested. If you have tested this with external 
>> controlled wake-up, then I can consider this patch.
>> Else, I would like to take it only after externally controlled wake-up is 
>> fully tested/supported instead of taking bits and pieces.
> 
> Yes, for RTC controlled wake-up, this polling is not required as per
> section 10.10.2.1.
> But section 10.10.2.2 (Exiting Deep Sleep Mode), step 2, says that when
> the sleep count completes, the SLEEPCOMPLETE bit gets set in the
> DEEPSLEEP register; until then it is not safe to release the clock to
> devices. So if we don't poll for SLEEPCOMPLETE, the delay that we
> actually set while entering deep sleep in the case of RTC controlled
> wake-up (section 10.10.2.1, step 9) will not come into the picture.
> Please let me know whether this understanding is correct.

The delay comes from the hardware. Until SLEEPCOUNT completes, the clock
to the device is not provided. There is no need to poll for SLEEPCOMPLETE
and indeed 10.10.2.2 does not ask for this bit to be polled.
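
For the externally controlled wake-up case, where polling was said to be required, a bounded poll might look roughly like the sketch below. This is plain userspace C, not the davinci resume code: the bit position of SLEEPCOMPLETE is an assumption to be checked against the TRM, and poll_bit_set() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed bit position for this sketch; check the TRM for the real layout. */
#define DEEPSLEEP_SLEEPCOMPLETE (1u << 30)

/*
 * Read *reg up to max_polls times, returning 0 as soon as any bit in
 * mask is set and -1 if the bit never shows up (timeout).
 */
int poll_bit_set(const volatile uint32_t *reg, uint32_t mask,
		 unsigned int max_polls)
{
	while (max_polls--) {
		if (*reg & mask)
			return 0;
	}
	return -1;
}
```

Bounding the loop avoids hanging the resume path forever if the bit is never set; the right bound would depend on the SLEEPCOUNT value programmed on entry.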

Thanks,
Sekhar


Re: [PATCH 3/4] of/pci: Add of_pci_get_bus() function

2013-02-13 Thread Thierry Reding
On Wed, Feb 13, 2013 at 10:56:07PM +, Grant Likely wrote:
> On Mon, 11 Feb 2013 09:22:19 +0100, Thierry Reding 
>  wrote:
> > This function can be used to parse the number of a device's parent PCI
> > bus from a standard 5-cell PCI resource.
> > 
> > Signed-off-by: Thierry Reding 
> 
> This patch should be deferred until there is another patch in the same
> series that actually uses it.

And this one too.

Thierry




Re: [PATCH v2 2/4] of/pci: Add of_pci_get_devfn() function

2013-02-13 Thread Thierry Reding
On Wed, Feb 13, 2013 at 10:59:50PM +, Grant Likely wrote:
> On Mon, 11 Feb 2013 09:22:18 +0100, Thierry Reding 
>  wrote:
> > This function can be used to parse the device and function number from a
> > standard 5-cell PCI resource. PCI_SLOT() and PCI_FUNC() can be used on
> > the returned value to obtain the device and function numbers respectively.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> > Changes in v2:
> > - rename devfn and err variables for clarity
> > 
> >  drivers/of/of_pci.c| 34 +-
> >  include/linux/of_pci.h |  1 +
> 
> There isn't a whole lot of value in this patch without another user.
> I'll need to see the other patches that make use of this.

Alright, I'll take that back into the Tegra series as well.

Thierry




Re: [PATCH 4/4] of/pci: Add of_pci_parse_bus_range() function

2013-02-13 Thread Thierry Reding
On Wed, Feb 13, 2013 at 10:58:44PM +, Grant Likely wrote:
> On Mon, 11 Feb 2013 09:22:20 +0100, Thierry Reding 
>  wrote:
> > This function can be used to parse a bus-range property as specified by
> > device nodes representing PCI bridges.
> > 
> > Signed-off-by: Thierry Reding 
> 
> Ditto for this one. We can wait on it until there is a user.

This is used by the Tegra driver and as I've explained the reason for
sending these patches separately was to get them merged beforehand to
reduce the number of dependencies that need to be tracked once the
driver is merged (possibly in 3.10, maybe later given the amount of
extra work you want done).

The patch used to be part of the Tegra series, but since Thomas started
using them for his Marvell work I thought it might be a good idea to
make them available separately.

But I can take it back into the Tegra series since that's where we seem
to be headed.

> > +/**
> > + * of_pci_parse_bus_range() - parse the bus-range property of a PCI device
> > + * @node: device node
> > + * @res: address to a struct resource to return the bus-range
> > + *
> > + * Returns 0 on success or a negative error-code on failure.
> > + */
> > +int of_pci_parse_bus_range(struct device_node *node, struct resource *res)
> > +{
> > +   const __be32 *values;
> > +   int len;
> > +
> > +   values = of_get_property(node, "bus-range", &len);
> > +   if (!values || len < sizeof(*values) * 2)
> > +   return -EINVAL;
> > +
> > +   res->name = node->name;
> > +   res->start = be32_to_cpup(values++);
> > +   res->end = be32_to_cpup(values);
> > +   res->flags = IORESOURCE_BUS;
> 
> Is there precedent for using struct resource for passing around the PCI
> bus range values? Who will be the user of this function?

The PCI core code actually keeps track of bus-ranges this way. See
drivers/pci/probe.c, for instance, which uses the global busn_resource
variable and keeps a list of struct resources for each domain as well.
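
To make the parsing step concrete, here is a minimal userspace sketch of reading a two-cell big-endian bus-range property into a start/end pair. It is not the kernel function: struct bus_range and parse_bus_range() are hypothetical names, the raw property is passed in as a byte buffer, and -1 stands in for -EINVAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end fields of struct resource. */
struct bus_range {
	uint32_t start;
	uint32_t end;
};

/* Decode one big-endian 32-bit cell. */
static uint32_t be32_decode(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Parse a raw "bus-range" property of len bytes.
 * Returns 0 on success, -1 if the property is missing or too short.
 */
int parse_bus_range(const uint8_t *prop, size_t len, struct bus_range *out)
{
	if (!prop || len < 2 * sizeof(uint32_t))
		return -1;

	out->start = be32_decode(prop);
	out->end = be32_decode(prop + sizeof(uint32_t));
	return 0;
}
```

The length check mirrors the one in the quoted patch: both cells must be present before either is read.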

Thierry




[PATCH v3 6/6] Input: Add ChromeOS EC keyboard driver

2013-02-13 Thread Simon Glass
Use the key-matrix layer to interpret key scan information from the EC
and inject input based on the FDT-supplied key map. This driver registers
itself with the ChromeOS EC driver to perform communications.

The matrix-keypad FDT binding is used with a small addition to control
ghosting.

Signed-off-by: Simon Glass 
Signed-off-by: Luigi Semenzato 
Signed-off-by: Vincent Palatin 
---
Changes in v3:
- Remove 'select MFD_CROS_EC' from Kconfig as it isn't necessary
- Remove old_state by using input layer's idev->key
- Move inner loop of cros_ec_keyb_has_ghosting() into its own function and
  simplify
- Add check for not finding the device tree node
- Remove comment about leaking matrix_keypad_build_keymap()
- Use platform_get_drvdata() where possible
- Remove call to input_free_device() after input_unregister_device()

Changes in v2:
- Remove use of __devinit/__devexit
- Use function to read matrix-keypad parameters from DT
- Remove key autorepeat parameters from DT binding and driver
- Use unsigned int for rows/cols

 .../devicetree/bindings/input/cros-ec-keyb.txt |  72 
 drivers/input/keyboard/Kconfig |  11 +
 drivers/input/keyboard/Makefile|   1 +
 drivers/input/keyboard/cros_ec_keyb.c  | 364 +
 4 files changed, 448 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/cros-ec-keyb.txt
 create mode 100644 drivers/input/keyboard/cros_ec_keyb.c

diff --git a/Documentation/devicetree/bindings/input/cros-ec-keyb.txt b/Documentation/devicetree/bindings/input/cros-ec-keyb.txt
new file mode 100644
index 0000000..0f6355c
--- /dev/null
+++ b/Documentation/devicetree/bindings/input/cros-ec-keyb.txt
@@ -0,0 +1,72 @@
+ChromeOS EC Keyboard
+
+Google's ChromeOS EC Keyboard is a simple matrix keyboard implemented on
+a separate EC (Embedded Controller) device. It provides a message for reading
+key scans from the EC. These are then converted into keycodes for processing
+by the kernel.
+
+This binding is based on matrix-keymap.txt and extends/modifies it as follows:
+
+Required properties:
+- compatible: "google,cros-ec-keyb"
+
+Optional properties:
+- google,needs-ghost-filter: True to enable a ghost filter for the matrix
+keyboard. This is recommended if the EC does not have its own logic or
+hardware for this.
+
+
+Example:
+
+cros-ec-keyb {
+   compatible = "google,cros-ec-keyb";
+   keypad,num-rows = <8>;
+   keypad,num-columns = <13>;
+   google,needs-ghost-filter;
+   /*
+* Keymap entries take the form of 0xRRCCKKKK where
+* RR=Row CC=Column KKKK=Key Code
+* The values below are for a US keyboard layout and
+* are taken from the Linux driver. Note that the
+* 102ND key is not used for US keyboards.
+*/
+   linux,keymap = <
+   /* CAPSLCK F1 B  F10 */
+   0x0001003a 0x0002003b 0x00030030 0x00040044
+   /* N   =  R_ALT  ESC */
+   0x00060031 0x0008000d 0x000a0064 0x01010001
+   /* F4  G  F7 H   */
+   0x0102003e 0x01030022 0x01040041 0x01060023
+   /* '   F9 BKSPACE L_CTRL  */
+   0x01080028 0x01090043 0x010b000e 0x0200001d
+   /* TAB F3 T  F6  */
+   0x0201000f 0x0202003d 0x02030014 0x02040040
+   /* ]   Y  102ND  [   */
+   0x0205001b 0x02060015 0x02070056 0x0208001a
+   /* F8  GRAVE  F2 5   */
+   0x02090042 0x03010029 0x0302003c 0x03030006
+   /* F5  6  -  \   */
+   0x0304003f 0x03060007 0x0308000c 0x030b002b
+   /* R_CTRL  A  D  F   */
+   0x04000061 0x0401001e 0x04020020 0x04030021
+   /* S   K  J  ;   */
+   0x0404001f 0x04050025 0x04060024 0x04080027
+   /* L   ENTER  Z  C   */
+   0x04090026 0x040b001c 0x0501002c 0x0502002e
+   /* V   X  ,  M   */
+   0x0503002f 0x0504002d 0x05050033 0x05060032
+   /* L_SHIFT /  .  SPACE   */
+   0x0507002a 0x05080035 0x05090034 0x050B0039
+   /* 1   3  4  2   */
+   0x06010002 0x06020004 0x06030005 0x06040003
+   /* 8   7  0  9   */
+   0x06050009 0x06060008 0x0608000b 0x0609000a
+   /* L_ALT   DOWN   RIGHT  Q   */
+   0x060a0038 0x060b006c 0x060c006a 0x07010010
+   /* E   R  W  I   */
+   0x07020012 0x07030013 0x07040011 0x07050017
+   /* U   R_SHIFT P  O   */
+   0x07060016 0x07070036 0x07080019 0x07090018
+
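
As a small illustration of how the keymap cells in the example above are laid out, the sketch below splits one 32-bit cell into row, column and key code. It assumes 8 bits of row, 8 bits of column and 16 bits of key code, as the example values suggest; the helper names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Row is the top byte of the cell. */
unsigned int keymap_row(uint32_t entry)
{
	return (entry >> 24) & 0xff;
}

/* Column is the second byte. */
unsigned int keymap_col(uint32_t entry)
{
	return (entry >> 16) & 0xff;
}

/* Key code is the low 16 bits. */
unsigned int keymap_code(uint32_t entry)
{
	return entry & 0xffff;
}
```

With this layout, the first cell in the example, 0x0001003a, is row 0, column 1, key code 0x3a, which lines up with the CAPSLCK label in the comment above it.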

Re: [PATCH v2 6/6] Input: Add ChromeOS EC keyboard driver

2013-02-13 Thread Simon Glass
Hi Dmitry,

On Wed, Feb 13, 2013 at 12:02 PM, Dmitry Torokhov
 wrote:
> Hi Simon,
>
> On Tue, Feb 12, 2013 at 06:42:26PM -0800, Simon Glass wrote:
>> Use the key-matrix layer to interpret key scan information from the EC
>> and inject input based on the FDT-supplied key map. This driver registers
>> itself with the ChromeOS EC driver to perform communications.
>>
>> Additional FDT bindings are provided to specify rows/columns and the
>> auto-repeat information.
>>
>> Signed-off-by: Simon Glass 
>> Signed-off-by: Luigi Semenzato 
>> Signed-off-by: Vincent Palatin 
>> ---
>> Changes in v2:
>> - Remove use of __devinit/__devexit
>> - Use function to read matrix-keypad parameters from DT
>> - Remove key autorepeat parameters from DT binding and driver
>> - Use unsigned int for rows/cols

Thanks for all the review comments.

>>
>>  .../devicetree/bindings/input/cros-ec-keyb.txt |  72 
>>  drivers/input/keyboard/Kconfig |  12 +
>>  drivers/input/keyboard/Makefile|   1 +
>>  drivers/input/keyboard/cros_ec_keyb.c  | 394 
>> +
>>  4 files changed, 479 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/input/cros-ec-keyb.txt
>>  create mode 100644 drivers/input/keyboard/cros_ec_keyb.c
>>
>> diff --git a/Documentation/devicetree/bindings/input/cros-ec-keyb.txt 
>> b/Documentation/devicetree/bindings/input/cros-ec-keyb.txt
>> new file mode 100644
>> index 000..0f6355c
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/input/cros-ec-keyb.txt
>> @@ -0,0 +1,72 @@
>> +ChromeOS EC Keyboard
>> +
>> +Google's ChromeOS EC Keyboard is a simple matrix keyboard implemented on
>> +a separate EC (Embedded Controller) device. It provides a message for 
>> reading
>> +key scans from the EC. These are then converted into keycodes for processing
>> +by the kernel.
>> +
>> +This binding is based on matrix-keymap.txt and extends/modifies it as 
>> follows:
>> +
>> +Required properties:
>> +- compatible: "google,cros-ec-keyb"
>> +
>> +Optional properties:
>> +- google,needs-ghost-filter: True to enable a ghost filter for the matrix
>> +keyboard. This is recommended if the EC does not have its own logic or
>> +hardware for this.
>> +
>> +
>> +Example:
>> +
>> +cros-ec-keyb {
>> + compatible = "google,cros-ec-keyb";
>> + keypad,num-rows = <8>;
>> + keypad,num-columns = <13>;
>> + google,needs-ghost-filter;
>> + /*
>> +  * Keymap entries take the form of 0xRRCCKKKK where
>> +  * RR=Row CC=Column KKKK=Key Code
>> +  * The values below are for a US keyboard layout and
>> +  * are taken from the Linux driver. Note that the
>> +  * 102ND key is not used for US keyboards.
>> +  */
>> + linux,keymap = <
>> + /* CAPSLCK F1 B  F10 */
>> + 0x0001003a 0x0002003b 0x00030030 0x00040044
>> + /* N   =  R_ALT  ESC */
>> + 0x00060031 0x0008000d 0x000a0064 0x01010001
>> + /* F4  G  F7 H   */
>> + 0x0102003e 0x01030022 0x01040041 0x01060023
>> + /* '   F9 BKSPACE L_CTRL  */
>> + 0x01080028 0x01090043 0x010b000e 0x0200001d
>> + /* TAB F3 T  F6  */
>> + 0x0201000f 0x0202003d 0x02030014 0x02040040
>> + /* ]   Y  102ND  [   */
>> + 0x0205001b 0x02060015 0x02070056 0x0208001a
>> + /* F8  GRAVE  F2 5   */
>> + 0x02090042 0x03010029 0x0302003c 0x03030006
>> + /* F5  6  -  \   */
>> + 0x0304003f 0x03060007 0x0308000c 0x030b002b
>> + /* R_CTRL  A  D  F   */
>> + 0x04000061 0x0401001e 0x04020020 0x04030021
>> + /* S   K  J  ;   */
>> + 0x0404001f 0x04050025 0x04060024 0x04080027
>> + /* L   ENTER  Z  C   */
>> + 0x04090026 0x040b001c 0x0501002c 0x0502002e
>> + /* V   X  ,  M   */
>> + 0x0503002f 0x0504002d 0x05050033 0x05060032
>> + /* L_SHIFT /  .  SPACE   */
>> + 0x0507002a 0x05080035 0x05090034 0x050B0039
>> + /* 1   3  4  2   */
>> + 0x06010002 0x06020004 0x06030005 0x06040003
>> + /* 8   7  0  9   */
>> + 0x06050009 0x06060008 0x0608000b 0x0609000a
>> + /* L_ALT   DOWN   RIGHT  Q   */
>> + 0x060a0038 0x060b006c 0x060c006a 0x07010010
>> + /* E   R  W  I   */
>> + 0x07020012 0x07030013 0x07040011 0x07050017
>> + /* U   R_SHIFTP  O   */
>> + 0x07060016 0x07070036 0x07080019 0x07090018
>> + /* UP  

Re: [PATCH 1/4] clocksource: pass DT node pointer to init functions

2013-02-13 Thread Michal Simek
2013/2/14 Rob Herring :
> On 02/13/2013 11:33 AM, Michal Simek wrote:
>> 2013/2/13 Rob Herring :
>>> On 02/13/2013 10:21 AM, Michal Simek wrote:
>>>> 2013/2/7 Rob Herring :
>>>>> From: Rob Herring 
>>>>>
>>>>> In cases where we have multiple nodes of the same type, we may need the
>>>>> node pointer to know which node was matched. Passing the node pointer
>>>>> also keeps the init function from having to match the node a 2nd time.
>>>>>
>>>>> Signed-off-by: Rob Herring 
>>>>> Cc: John Stultz 
>>>>> Cc: Thomas Gleixner 
>>>>> ---
>>>>>  drivers/clocksource/clksrc-of.c |4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> Tested-by: Michal Simek 
>>>>
>>>> The rest is just the same as I have done. Any option to add these
>>>> patches to v3.9?
>>>
>>> I would like to before we have more users to fix, but it will have to be
>>> post rc1. If not, Arnd/Olof should be be able to provide a stable branch
>>> for 3.10.
>>
>> ok
>
> I now see you were trying to get zynq changes in for 3.9. You could add
> this patch to your pull request. As is, it is not dependent on some DT
> code changes, but the subsequent patches are. I can send the rest after
> rc1. It's a bit of a hack with the function call prototype, but nothing
> actually breaks. I was going to combine as Arnd suggested, but either
> way is probably fine.

It's not a big deal. I just want to use this patch, and it's no problem
to keep it in my repo. The main point for me is that the patch exists
and I can use it.


>>>> Because I need these patches for zynq timer because we have two in the soc.
>>>> Is it OK to register several clock source and clockevent devices?
>>>
>>> If it is 1 DT node, then that should be fine.
>>
>> zynq is using two triple timer counter IPs. They are also described by
>> two different DT nodes because they are separate and use different
>> base addresses.
>>
>> Does it mean that if there are 2 DT nodes that it won't work?
>>
>>
>> One more thing. Is there any rule which should describe which timer should be
>> used for clockevent and for clocksource?
>
> No. This is a common problem. A simple solution is a "linux,clockevent"
> property, but I want to avoid that.

Let me describe it a little bit more. I am going to change the current
mainline implementation of the zynq timer, which uses a special
compatible string to define the clocksource and clockevent device.
I don't think this is the right way to go because a compatible string
shouldn't point to device usage, which is exactly the case you wanted
to avoid.

> Ultimately it is some feature of the
> h/w that makes you choose. This could be it has an interrupt or not,
> higher frequency, has timer compare pins, gets power gated, etc. So you
> should describe enough of the h/w properties to make this decision.

For different timer types, the kernel should decide which timer to use.
It should be easy to test because I can add some timers to the
programmable logic (as is done for Microblaze) and check how the kernel
decides which clocksource/clockevent device to use. I believe there is
some logic for this.
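
To illustrate the kind of property-based choice discussed above, here is a purely hypothetical userspace sketch, not the kernel's actual selection logic: struct timer_caps, its fields and both helpers are invented for illustration, with "highest rate" and "has an interrupt" standing in for whatever h/w properties a real driver would weigh.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one timer's hardware properties. */
struct timer_caps {
	bool has_irq;           /* timer can raise an interrupt */
	unsigned long rate_hz;  /* input clock rate */
};

/* Pick the fastest timer as clocksource; returns its index, or -1 if n is 0. */
int pick_clocksource(const struct timer_caps *t, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 || t[i].rate_hz > t[best].rate_hz)
			best = (int)i;
	}
	return best;
}

/* Pick the first interrupt-capable timer other than skip as clockevent. */
int pick_clockevent(const struct timer_caps *t, size_t n, int skip)
{
	for (size_t i = 0; i < n; i++) {
		if ((int)i != skip && t[i].has_irq)
			return (int)i;
	}
	return -1;
}
```

When every timer has identical properties, as with the two triple timer counter instances, both helpers simply fall back to array order, which is why some extra hint or a "first registered wins" rule is being asked about.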


With two instances of the same triple timer counter it is impossible
to specify an additional h/w property because all 6 timers are the same.
Adding a special parameter to the IP also goes against the rule
that the device tree should describe hw, not Linux usage.
Maybe it is enough for the driver to record that a clocksource and
clockevent device is already registered and not try to register another
timer (including a timer from another instance).

Or maybe it can be done via a chosen property or via an aliases property
where the timer0 alias is the one that should be used for the clocksource
and clockevent device (in my case, an alias to one instance of the triple
timer counter).

I saw some dtses which have timer aliases.

How good/bad is this option?


> OMAP
> is an example doing this with lots of timers with varying integration
> level differences.

Can you point me to the omap drivers you are talking about?

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform


Re: Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Namjae Jeon
2013/2/14, Andrew Bartlett :
> (apologies for the duplicate mail, I typo-ed the maintainers address)
>
> G'day,
>
> I've been looking into the patch "[v2] fat: editions to support
> fat_fallocate()" and I wonder if there is a way we can split this issue
> in two, so that we get at least some of the patch into the kernel.
>
> https://lkml.org/lkml/2012/10/13/75
> https://patchwork.kernel.org/patch/1589161/
>
> What I'm wanting to discuss (and perhaps implement, with you if
> possible) is splitting this patch into writing to existing pre-allocated
> files, and creating a new pre-allocation.
>
> If Windows does, as you claim, simply read preallocations as zero, and
> writes to them normally and without error, then Linux should do the
> same.  Here of course I'm assuming that Windows is not preallocating,
> but instead simply trying to recover gracefully and safely from a simple
> 'file system corruption', where the sectors are allocated but not used.
>
> The bulk of this patch is implementing this transparent recovery, and it
> seems relatively harmless to include in the kernel.
>
> Then vendors doing TV streaming, or in my case copies of large files
> onto Samba-mounted USB FAT devices, can add only the smaller patch to
> implement fallocate, at their own risk and fully knowing that it will be
> regarded as corrupt on Linux.
>
> If accepted read support will, over a period of years, trickle down to
> other Linux users, broadening the base that can still read these
> 'corrupt' drives, no matter the cause.
>
> I hope you agree that this is a practical way forward, and I look
> forward to working with you on this.
>
> Thanks,
Hi Andrew.

First, thanks for your interest!
A mismatch between inode size and reserved blocks can be either due to
pre-allocation (after our changes) or due to corruption (sudden unplug
of media etc).
We don’t think it is right to include only read only support (i.e.
without fallocate support) for such files because if such files are
encountered it only means that the file is corrupted, as there is no
current method to check if the issue is due to pre-allocation.
If it is to be included in the kernel, then the whole patch has to go
in. But then again, since the FAT specifications do not accommodate
for pre-allocation, then it is up to OGAWA to decide if this is
acceptable.
In any case, the patch will definitely break backward compatibility
(on an older fat driver without fallocate support). The same applies
when there are two variants of the same kernel version and only one has
FALLOCATE enabled: the variant without it will assume corruption.
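
The size/allocation mismatch described above can be sketched as a simple comparison. This is only an illustration in userspace C, not code from the patch: both helper names are hypothetical, and the chain length is assumed to have been counted from the FAT already.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clusters needed to hold size bytes, rounded up. cluster_size must be nonzero. */
uint32_t clusters_for_size(uint64_t size, uint32_t cluster_size)
{
	return (uint32_t)((size + cluster_size - 1) / cluster_size);
}

/* True if the cluster chain is longer than the recorded file size needs. */
bool has_extra_clusters(uint32_t chain_len, uint64_t size,
			uint32_t cluster_size)
{
	return chain_len > clusters_for_size(size, cluster_size);
}
```

As the mail points out, a true result here says nothing about the cause: preallocation and an interrupted write look the same from this check alone.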

Thanks.

>
> Andrew Bartlett
> --
> Andrew Bartletthttp://samba.org/~abartlet/
> Authentication Developer, Samba Team   http://samba.org
>
>
>
>


[PATCH 1/1] ARM: dt: add header to define tegra20 clocks

2013-02-13 Thread Hiroshi Doyu
To replace the magic number in "clocks = <&tegra_car 28>;"

Signed-off-by: Hiroshi Doyu 
---
This patch depends on:

  [PATCH 0/9] ARM: tegra: use new dtc+cpp feature
  
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-February/149613.html

This patch is the experiment for Tegra20. The same replacement can be
done for Tegra{30,114}.

Usage:
Modified arch/arm/boot/dts/tegra20.dtsip
diff --git a/arch/arm/boot/dts/tegra20.dtsip b/arch/arm/boot/dts/tegra20.dtsip
index 7b05f53..6edd397 100644
--- a/arch/arm/boot/dts/tegra20.dtsip
+++ b/arch/arm/boot/dts/tegra20.dtsip
@@ -1,6 +1,7 @@
 #include "skeleton.dtsi"
 #include "tegra-gpio.h"
 #include "arm-gic.h"
+#include "tegra20-car.h"

 / {
compatible = "nvidia,tegra20";
@@ -19,7 +20,7 @@
reg = <0x50000000 0x00024000>;
interrupts = , /* syncpt */
 ; /* general */
-   clocks = <&tegra_car 28>;
+   clocks = <&tegra_car CLK_HOST1X>;

Signed-off-by: Hiroshi Doyu 
---
 arch/arm/boot/dts/tegra20-car.h |  126 +++
 1 file changed, 126 insertions(+)
 create mode 100644 arch/arm/boot/dts/tegra20-car.h

diff --git a/arch/arm/boot/dts/tegra20-car.h b/arch/arm/boot/dts/tegra20-car.h
new file mode 100644
index 000..6426320
--- /dev/null
+++ b/arch/arm/boot/dts/tegra20-car.h
@@ -0,0 +1,126 @@
+#define CLK_CPU	0
+/* UNUSED 1 */
+/* UNUSED 2 */
+#define CLK_AC97	3
+#define CLK_RTC	4
+#define CLK_TIMER	5
+#define CLK_UARTA	6
+/* UNUSED 7 */
+#define CLK_GPIO	8
+#define CLK_SDMMC2	9
+/* UNUSED 10 */
+#define CLK_I2S1	11
+#define CLK_I2C1	12
+#define CLK_NDFLASH	13
+#define CLK_SDMMC1	14
+#define CLK_SDMMC4	15
+#define CLK_TWC	16
+#define CLK_PWM	17
+#define CLK_I2S2	18
+#define CLK_EPP	19
+/* UNUSED 20 */
+#define CLK_GR2D	21
+#define CLK_USBD	22
+#define CLK_ISP	23
+#define CLK_GR3D	24
+#define CLK_IDE	25
+#define CLK_DISP2	26
+#define CLK_DISP1	27
+#define CLK_HOST1X	28
+#define CLK_VCP	29
+/* UNUSED 30 */
+#define CLK_CACHE2	31
+#define CLK_MEM	32
+#define CLK_AHBDMA	33
+#define CLK_APBDMA	34
+/* UNUSED 35 */
+#define CLK_KBC	36
+#define CLK_STAT_MON	37
+#define CLK_PMC	38
+#define CLK_FUSE	39
+#define CLK_KFUSE	40
+#define CLK_SBC1	41
+#define CLK_NOR	42
+#define CLK_SPI	43
+#define CLK_SBC2	44
+#define CLK_XIO	45
+#define CLK_SBC3	46
+#define CLK_DVC	47
+#define CLK_DSI	48
+/* UNUSED 49 */
+#define CLK_MIPI	50
+#define CLK_HDMI	51
+#define CLK_CSI	52
+#define CLK_TVDAC	53
+#define CLK_I2C2	54
+#define CLK_UARTC	55
+/* UNUSED 56 */
+#define CLK_EMC	57
+#define CLK_USB2	58
+#define CLK_USB3	59
+#define CLK_MPE	60
+#define CLK_VDE	61
+#define CLK_BSEA	62
+#define CLK_BSEV	63
+#define CLK_SPEEDO	64
+#define CLK_UARTD	65
+#define CLK_UARTE	66
+#define CLK_I2C3	67
+#define CLK_SBC4	68
+#define CLK_SDMMC3	69
+#define CLK_PEX	70
+#define CLK_OWR	71
+#define CLK_AFI	72
+#define CLK_CSITE	73
+#define CLK_PCIE_XCLK	74
+#define CLK_AVPUCQ	75
+#define CLK_LA	76
+/* UNUSED 77-83 */
+#define CLK_IRAMA	84
+#define CLK_IRAMB	85
+#define CLK_IRAMC	86
+#define CLK_IRAMD	87
+#define CLK_CRAM2	88
+#define CLK_AUDIO_2X	89
+#define CLK_CLK_D	90
+/* UNUSED 91 */
+#define CLK_CSUS	92
+#define CLK_CDEV1	93
+#define CLK_CDEV2	94
+/* UNUSED 95 */
+#define CLK_UARTB	96
+#define CLK_VFIR	97
+#define CLK_SPDIF_IN	98
+#define CLK_SPDIF_OUT	99
+#define CLK_VI	100
+#define CLK_VI_SENSOR	101
+#define CLK_TVO	102
+#define CLK_CVE	103
+#define CLK_OSC	104
+#define CLK_CLK_32K	105
+#define CLK_CLK_M	106
+#define CLK_SCLK	107
+#define CLK_CCLK	108
+#define CLK_HCLK	109
+#define CLK_PCLK	110
+#define CLK_BLINK	111
+#define CLK_PLL_A	112
+#define CLK_PLL_A_OUT0	113
+#define CLK_PLL_C	114
+#define CLK_PLL_C_OUT1	115
+#define CLK_PLL_D	116
+#define CLK_PLL_D_OUT0	117
+#define CLK_PLL_E	118
+#define CLK_PLL_M	119
+#define CLK_PLL_M_OUT1	120
+#define CLK_PLL_P	121
+#define CLK_PLL_P_OUT1	122
+#define CLK_PLL_P_OUT2	123
+#define CLK_PLL_P_OUT3	124
+#define CLK_PLL_P_OUT4	125
+#define CLK_PLL_U	126
+#define CLK_PLL_X	127
+#define CLK_AUDIO	128
+#define CLK_PLL_REF	129
+#define CLK_TWD	130
+#define CLK_MAX	131
-- 
1.7.9.5
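
For illustration only, the benefit of named clock IDs can be shown in
plain C. This is a rough sketch, not part of the patch: the two IDs are
copied from the header above, while `struct clk_entry`, `clk_table` and
`clk_id()` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* IDs copied from the proposed tegra20-car.h */
#define CLK_UARTA   6
#define CLK_HOST1X  28

/* Hypothetical lookup table, for illustration only. */
struct clk_entry {
	const char *name;
	int id;
};

static const struct clk_entry clk_table[] = {
	{ "uarta",  CLK_UARTA  },
	{ "host1x", CLK_HOST1X },
};

/* Return the clock ID for a name, or -1 if the name is unknown. */
static int clk_id(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(clk_table) / sizeof(clk_table[0]); i++)
		if (strcmp(clk_table[i].name, name) == 0)
			return clk_table[i].id;
	return -1;
}
```

A caller that writes CLK_HOST1X instead of 28 keeps working if the table
is ever renumbered, which is the same argument the patch makes for the
device tree source.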


Re: SELinux + ubifs: possible circular locking dependency

2013-02-13 Thread Artem Bityutskiy
On Wed, 2013-02-13 at 15:37 +0100, Marc Kleine-Budde wrote:
> > +   lockdep_set_class(>lock, inode->i_sb->s_type->i_mutex_key);
> 
> So I added an "&", so that the line looks like that:

Yeah, I did not compile it, and to fix the deadlock I of course have to
add a separate class for isec->lock in the fstype structure. I'll try to
come up with another patch. Thanks!

-- 
Best Regards,
Artem Bityutskiy



RE: [PATCH v2 2/3] mmc: davinci_mmc: add DT support

2013-02-13 Thread Manjunathappa, Prakash
Hi Sekhar,

This mail reached my inbox after I sent out v3.

On Tue, Feb 12, 2013 at 11:51:34, Nori, Sekhar wrote:
> On 2/7/2013 1:27 PM, Manjunathappa, Prakash wrote:
> > Adds device tree support for davinci_mmc. Also add binding documentation.
> > Tested in non-dma PIO mode and without GPIO card_detect/write_protect
> > option because of dependencies on EDMA and GPIO module DT support.
> > 
> > Signed-off-by: Manjunathappa, Prakash 
> > Cc: linux-...@vger.kernel.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: davinci-linux-open-sou...@linux.davincidsp.com
> > Cc: devicetree-disc...@lists.ozlabs.org
> > Cc: c...@laptop.org
> > Cc: Sekhar Nori 
> > Cc: mpor...@ti.com
> > ---
> > Since v1:
> > Modified DT parse function to take default values and accommodate controller
> > version in compatible field.
> > 
> >  .../devicetree/bindings/mmc/davinci_mmc.txt|   30 
> >  drivers/mmc/host/davinci_mmc.c |   70 +++-
> >  2 files changed, 99 insertions(+), 1 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/mmc/davinci_mmc.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/mmc/davinci_mmc.txt b/Documentation/devicetree/bindings/mmc/davinci_mmc.txt
> > new file mode 100644
> > index 000..6717ab1
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/mmc/davinci_mmc.txt
> > @@ -0,0 +1,30 @@
> > +* TI Highspeed MMC host controller for DaVinci
> > +
> > +The Highspeed MMC Host Controller on TI DaVinci family
> > +provides an interface for MMC, SD and SDIO types of memory cards.
> > +
> > +This file documents the properties used by the davinci_mmc driver.
> > +
> > +Required properties:
> > +- compatible:
> > + Should be "ti,davinci-mmc-da830": for da830, da850, dm365
> > + Should be "ti,davinci-mmc-dm355": for dm355, dm644x
> > +
> > +Optional properties:
> > +- bus-width: Number of data lines, can be <4>, or <8>, default <1>
> > +- max-frequency: Maximum operating clock frequency, default 25MHz.
> > +- mmc-cap-mmc-highspeed: Indicates support for MMC in high speed mode
> > +- mmc-cap-sd-highspeed: Indicates support for SD in high speed mode
> > +
> > +Example:
> > +   mmc0: mmc@1c4 {
> > +   compatible = "ti,davinci-mmc-da830",
> > +   reg = <0x4 0x1000>;
> > +   interrupts = <16>;
> > +   status = "okay";
> > +   bus-width = <4>;
> > +   max-frequency = <5000>;
> > +   mmc-cap-sd-highspeed;
> > +   mmc-cap-mmc-highspeed;
> > +   };
> > +
> > diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
> > index 27123f8..3f90316 100644
> > --- a/drivers/mmc/host/davinci_mmc.c
> > +++ b/drivers/mmc/host/davinci_mmc.c
> > @@ -34,6 +34,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  
> >  #include 
> >  
> > @@ -1157,15 +1159,80 @@ static void __init init_mmcsd_host(struct mmc_davinci_host *host)
> > mmc_davinci_reset_ctrl(host, 0);
> >  }
> >  
> > -static int __init davinci_mmcsd_probe(struct platform_device *pdev)
> > +static const struct of_device_id davinci_mmc_dt_ids[] = {
> > +   {
> > +   .compatible = "ti,davinci-mmc-dm355",
> > +   .data = (void *)MMC_CTLR_VERSION_1,
> > +   },
> > +   {
> > +   .compatible = "ti,davinci-mmc-da830",
> > +   .data = (void *)MMC_CTLR_VERSION_2,
> > +   },
> > +   {},
> > +};
> > +MODULE_DEVICE_TABLE(of, davinci_mmc_dt_ids);
> 
> If you are doing this why not also kill passing IP version through
> platform data using a platform_device_id table? Look at what Afzal did
> for drivers/rtc/rtc-omap.c
> 

Agreed, I will send out v4 having these changes also in this series.

Thanks,
Prakash

> Thanks,
> Sekhar
> 



Re: [PATCH 0/9] virtio: new API for addition of buffers, scatterlist changes

2013-02-13 Thread Rusty Russell
Paolo Bonzini  writes:
> This series adds a different set of APIs for adding a buffer to a
> virtqueue.  The new API lets you pass the buffers piecewise, wrapping
> multiple calls to virtqueue_add_sg between virtqueue_start_buf and
> virtqueue_end_buf.  Letting drivers call virtqueue_add_sg multiple times
> if they already have a scatterlist provided by someone else simplifies the
> code and, for virtio-scsi, it saves the copying and related locking.

They are ugly though.  It's convoluted because we do actually know all
the buffers at once, we don't need a piecemeal API.

As a result, you now have arbitrary changes to the indirect heuristic,
because the API is now piecemeal.

How about this as a first step?

virtio_ring: virtqueue_add_sgs, to add multiple sgs.

virtio_scsi and virtio_blk can really use these, to avoid their current
hack of copying the whole sg array.

Signed-off-by: Rusty Russell

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ffd7e7d..c5afc5d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -121,14 +121,16 @@ struct vring_virtqueue
 
 /* Set up an indirect table of descriptors and add it to the queue. */
 static int vring_add_indirect(struct vring_virtqueue *vq,
- struct scatterlist sg[],
- unsigned int out,
- unsigned int in,
+ struct scatterlist *sgs[],
+ unsigned int total_sg,
+ unsigned int out_sgs,
+ unsigned int in_sgs,
  gfp_t gfp)
 {
struct vring_desc *desc;
unsigned head;
-   int i;
+   struct scatterlist *sg;
+   int i, n;
 
/*
 * We require lowmem mappings for the descriptors because
@@ -137,26 +139,31 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
 */
gfp &= ~(__GFP_HIGHMEM | __GFP_HIGH);
 
-   desc = kmalloc((out + in) * sizeof(struct vring_desc), gfp);
+   desc = kmalloc(total_sg * sizeof(struct vring_desc), gfp);
if (!desc)
return -ENOMEM;
 
-   /* Transfer entries from the sg list into the indirect page */
-   for (i = 0; i < out; i++) {
-   desc[i].flags = VRING_DESC_F_NEXT;
-   desc[i].addr = sg_phys(sg);
-   desc[i].len = sg->length;
-   desc[i].next = i+1;
-   sg++;
+   /* Transfer entries from the sg lists into the indirect page */
+   i = 0;
+   for (n = 0; n < out_sgs; n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   desc[i].flags = VRING_DESC_F_NEXT;
+   desc[i].addr = sg_phys(sg);
+   desc[i].len = sg->length;
+   desc[i].next = i+1;
+   }
}
-   for (; i < (out + in); i++) {
-   desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
-   desc[i].addr = sg_phys(sg);
-   desc[i].len = sg->length;
-   desc[i].next = i+1;
-   sg++;
+   for (; n < (out_sgs + in_sgs); n++) {
+   for (sg = sgs[n]; sg; sg = sg_next(sg)) {
+   desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
+   desc[i].addr = sg_phys(sg);
+   desc[i].len = sg->length;
+   desc[i].next = i+1;
+   }
}
 
+   BUG_ON(i != total_sg);
+
/* Last one doesn't continue. */
desc[i-1].flags &= ~VRING_DESC_F_NEXT;
desc[i-1].next = 0;
@@ -176,6 +183,15 @@ static int vring_add_indirect(struct vring_virtqueue *vq,
return head;
 }
 
+/* FIXME */
+static inline void sg_unmark_end(struct scatterlist *sg)
+{
+#ifdef CONFIG_DEBUG_SG
+   BUG_ON(sg->sg_magic != SG_MAGIC);
+#endif
+   sg->page_link &= ~0x02;
+}
+
 /**
  * virtqueue_add_buf - expose buffer to other end
  * @vq: the struct virtqueue we're talking about.
@@ -197,8 +213,47 @@ int virtqueue_add_buf(struct virtqueue *_vq,
  void *data,
  gfp_t gfp)
 {
+   struct scatterlist *sgs[2];
+   unsigned int i;
+
+   sgs[0] = sg;
+   sgs[1] = sg + out;
+
+   /* Workaround until callers pass well-formed sgs. */
+   for (i = 0; i < out + in; i++)
+   sg_unmark_end(sg + i);
+
+   sg_unmark_end(sg + out + in);
+   if (out && in)
+   sg_unmark_end(sg + out);
+
+   return virtqueue_add_sgs(_vq, sgs, out ? 1 : 0, in ? 1 : 0, data, gfp);
+}
+
+/**
+ * virtqueue_add_sgs - expose buffers to other end
+ * @vq: the struct virtqueue we're talking about.
+ * @sgs: array of terminated scatterlists.
+ * @out_num: the number of scatterlists readable by other side
+ * @in_num: the number of scatterlists which are writable (after readable ones)
+ * @data: the token identifying the buffer.
+ 
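
The archived patch is cut off at this point. As a rough user-space model
of the nested walk it performs over an array of terminated lists, here is
a sketch under stated assumptions: `struct seg`, `struct desc`, the
DESC_F_* flags and `fill_desc()` are hypothetical stand-ins for struct
scatterlist, struct vring_desc, the VRING_DESC_F_* flags and
vring_add_indirect(). The model advances the descriptor index after each
entry, which a final "i == total" check depends on.

```c
#include <assert.h>
#include <stddef.h>

#define DESC_F_NEXT  1u	/* another descriptor follows */
#define DESC_F_WRITE 2u	/* writable by the other side */

/* Hypothetical NULL-terminated segment list. */
struct seg {
	unsigned int len;
	struct seg *next;
};

struct desc {
	unsigned int len, flags, next;
};

/*
 * Copy out_sgs readable lists followed by in_sgs writable lists into d[].
 * Returns the number of descriptors written.
 */
static unsigned int fill_desc(struct desc *d, struct seg *sgs[],
			      unsigned int out_sgs, unsigned int in_sgs)
{
	unsigned int i = 0, n;
	struct seg *sg;

	assert(d != NULL && sgs != NULL);
	for (n = 0; n < out_sgs + in_sgs; n++) {
		unsigned int flags = DESC_F_NEXT;

		if (n >= out_sgs)
			flags |= DESC_F_WRITE;
		for (sg = sgs[n]; sg; sg = sg->next) {
			d[i].len = sg->len;
			d[i].flags = flags;
			d[i].next = i + 1;
			i++;	/* advance to the next descriptor slot */
		}
	}
	if (i) {
		/* The last descriptor does not continue. */
		d[i - 1].flags &= ~DESC_F_NEXT;
		d[i - 1].next = 0;
	}
	return i;
}
```

Because each list carries its own terminator, the caller passes only the
number of lists, not the number of entries in each.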

Re: linux-next: Tree for Feb 13 (virtio_console)

2013-02-13 Thread Rusty Russell
Randy Dunlap  writes:

> On 02/13/13 00:35, Stephen Rothwell wrote:
>> Hi all,
>> 
>> Changes since 20130212:
>
> on i386:
>
> drivers/built-in.o: In function `in_intr':
> virtio_console.c:(.text+0x2dd31): undefined reference to `hvc_poll'
> virtio_console.c:(.text+0x2dd41): undefined reference to `hvc_kick'
> drivers/built-in.o: In function `resize_console':
> virtio_console.c:(.text+0x2e26f): undefined reference to `__hvc_resize'
> drivers/built-in.o: In function `unplug_port':
> virtio_console.c:(.text+0x2e572): undefined reference to `hvc_remove'
> drivers/built-in.o: In function `init_port_console':
> (.text+0x2fe59): undefined reference to `hvc_alloc'
> drivers/built-in.o: In function `virtio_cons_early_init':
> (.init.text+0x16d1): undefined reference to `hvc_instantiate'
>
>
> Full randconfig file is attached.

This looks like an impossible config.  CONFIG_VIRTIO_CONSOLE=y, but
CONFIG_HVC_DRIVER isn't set.

From drivers/char/Kconfig:

config VIRTIO_CONSOLE
	tristate "Virtio console"
	depends on VIRTIO
	select HVC_DRIVER

???

Cheers,
Rusty.


Re: [RFC PATCH] virt_mmio: fix signature checking for BE guests

2013-02-13 Thread Rusty Russell
"Michael S. Tsirkin"  writes:
> On Wed, Feb 13, 2013 at 03:28:52PM +, Marc Zyngier wrote:
>> On 13/02/13 15:08, Pawel Moll wrote:
>> > On Wed, 2013-02-13 at 14:25 +, Marc Zyngier wrote:
>> >> Using readl() to read the magic value and then memcmp() to check it
>> >> fails on BE, as the bytes will be the other way around (because
>> >> the registers follow the endianness of the guest).
>> > 
>> > Hm. Interesting. I missed the fact that readl() as a "PCI operation"
>> > will always assume LE values...
>> > 
>> >> Fix it by encoding the magic as an integer instead of a string.
>> >> So I'm not completely sure this is the right fix, 
>> > 
>> > It seems right, however...
>> > 
>> >> - Using __raw_readl() instead. Is that a generic enough API?
>> >>
>> > ... this implies that either the spec is wrong (as it should say: the
>> > device registers are always LE, in the PCI spirit) or all readl()s & co.
>> > should be replaced with __raw equivalents.
>> 
>> Well, the spec clearly says that the registers reflect the endianness of
>> the guest, and it makes sense: when performing the MMIO access, KVM
>> needs to convert between host and guest endianness.
>> 
>> > Having said that, does the change make everything else work with a BE
>> > guest? (I assume we're talking about the guest being BE, right? ;-) If
>> > so it means that the host is not following the current spec and it
>> > treats all the registers as LE.
>> 
>> Yes, I only care about a BE guest. And no, not much is actually working
>> (kvmtool is not happy about the guest addresses it finds in the
>> virtio-ring). Need to dive into it and understand what needs to be fixed...
>
> Does it work for qemu? I know people were using virtio on BE there.

It's had no end of problems: the powerpc folk are quite upset with me.

If I were doing virtio again, I'd use LE everywhere: device authors
*expect* to worry about endian, and it's confusing for them when they
don't.

It's tempting to use LE for the PCI config space for the new layout, for
example.

If you want to specify virtio-mmio as LE, feel free.

Cheers,
Rusty.
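
The memcmp() problem from earlier in the thread can be reproduced in
user space. A minimal sketch, assuming a readl() that always decodes the
register bytes as little-endian (as Pawel describes): `fake_readl()`,
`magic_ok_memcmp()`, `magic_ok_int()` and `VIRT_MAGIC` are hypothetical
names for this example, not virtio-mmio code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 'v' | 'i' << 8 | 'r' << 16 | 't' << 24 */
#define VIRT_MAGIC 0x74726976u

/* "virt" as it sits in the device register, byte by byte. */
static const unsigned char reg_bytes[4] = { 'v', 'i', 'r', 't' };

/*
 * Stand-in for readl(): decodes the bytes as little-endian, so the
 * returned number is the same on LE and BE CPUs.
 */
static uint32_t fake_readl(const unsigned char *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Old style: memcmp() on the in-memory bytes of the value.
 * Matches only when the CPU stores integers little-endian.
 */
static int magic_ok_memcmp(uint32_t v)
{
	return memcmp(&v, "virt", 4) == 0;
}

/* Proposed style: compare against an integer constant. */
static int magic_ok_int(uint32_t v)
{
	return v == VIRT_MAGIC;
}
```

On a little-endian build both checks accept fake_readl(reg_bytes); on a
big-endian build only magic_ok_int() does, which is the failure Marc
reported.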


[PATCH 2/8] sched: explicitly cpu_idle_type checking in rebalance_domains()

2013-02-13 Thread Joonsoo Kim
After commit 88b8dac0, dst-cpu can change inside load_balance(), so we
cannot know the cpu_idle_type of dst-cpu when load_balance() returns a
positive value. So, add an explicit cpu_idle_type check.

Cc: Srivatsa Vaddagiri 
Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6f72851..0c6aaf6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5515,10 +5515,10 @@ static void rebalance_domains(int cpu, enum cpu_idle_type idle)
 		if (time_after_eq(jiffies, sd->last_balance + interval)) {
 			if (load_balance(cpu, rq, sd, idle, &balance)) {
 				/*
-				 * We've pulled tasks over so either we're no
-				 * longer idle.
+				 * We've pulled tasks over so either we may
+				 * be no longer idle.
 				 */
-				idle = CPU_NOT_IDLE;
+				idle = idle_cpu(cpu) ? CPU_IDLE : CPU_NOT_IDLE;
 			}
 			sd->last_balance = jiffies;
 		}
-- 
1.7.9.5



[PATCH 3/8] sched: don't consider other cpus in our group in case of NEWLY_IDLE

2013-02-13 Thread Joonsoo Kim
Commit 88b8dac0 makes load_balance() consider other cpus in its group,
regardless of idle type. When we do NEWLY_IDLE balancing, we should not
consider them, because the motivation of NEWLY_IDLE balancing is to turn
this cpu into a non-idle state if needed. That does not apply to other
cpus. So, change the code not to consider other cpus for NEWLY_IDLE
balancing.

With this patch, the assignment 'if (pulled_task) this_rq->idle_stamp = 0'
in idle_balance() becomes correct, because NEWLY_IDLE balancing no longer
considers other cpus. Assigning to 'this_rq->idle_stamp' is now valid.

Cc: Srivatsa Vaddagiri 
Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0c6aaf6..97498f4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5016,8 +5016,15 @@ static int load_balance(int this_cpu, struct rq *this_rq,
.cpus   = cpus,
};
 
+   /* For NEWLY_IDLE load_balancing, we don't need to consider
+* other cpus in our group */
+   if (idle == CPU_NEWLY_IDLE) {
+   env.dst_grpmask = NULL;
+   max_lb_iterations = 0;
+   } else {
+   max_lb_iterations = cpumask_weight(env.dst_grpmask);
+   }
cpumask_copy(cpus, cpu_active_mask);
-   max_lb_iterations = cpumask_weight(env.dst_grpmask);
 
schedstat_inc(sd, lb_count[idle]);
 
-- 
1.7.9.5



[PATCH 0/8] correct load_balance()

2013-02-13 Thread Joonsoo Kim
Commit 88b8dac0 makes load_balance() consider other cpus in its group.
But some parts needed for this feature to work properly are missing.
This patchset corrects them and makes load_balance() more robust.

The other patches are related to LBF_ALL_PINNED. This flag is a fallback
for the case where no task can be moved because of cpu affinity. But,
currently, if the imbalance is too small relative to a task's load, we
leave the LBF_ALL_PINNED flag set and 'redo' is triggered. This is not
our intention, so correct it.

These are based on v3.8-rc7.

Joonsoo Kim (8):
  sched: change position of resched_cpu() in load_balance()
  sched: explicitly cpu_idle_type checking in rebalance_domains()
  sched: don't consider other cpus in our group in case of NEWLY_IDLE
  sched: clean up move_task() and move_one_task()
  sched: move up affinity check to mitigate useless redoing overhead
  sched: rename load_balance_tmpmask to load_balance_cpu_active
  sched: prevent to re-select dst-cpu in load_balance()
  sched: reset lb_env when redo in load_balance()

 kernel/sched/core.c |9 +++--
 kernel/sched/fair.c |  107 +--
 2 files changed, 67 insertions(+), 49 deletions(-)

-- 
1.7.9.5



[PATCH 4/8] sched: clean up move_task() and move_one_task()

2013-02-13 Thread Joonsoo Kim
Some validation for task moving is performed in move_tasks() and
move_one_task(). We can move this code to can_migrate_task(),
which already exists for this purpose.

Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 97498f4..849bc8e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3874,19 +3874,40 @@ task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
return delta < (s64)sysctl_sched_migration_cost;
 }
 
+static unsigned long task_h_load(struct task_struct *p);
+
 /*
  * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
+ * @load is only meaningful when !@lb_active and return value is true
  */
 static
-int can_migrate_task(struct task_struct *p, struct lb_env *env)
+int can_migrate_task(struct task_struct *p, struct lb_env *env,
+   bool lb_active, unsigned long *load)
 {
int tsk_cache_hot = 0;
/*
 * We do not migrate tasks that are:
-* 1) running (obviously), or
-* 2) cannot be migrated to this CPU due to cpus_allowed, or
-* 3) are cache-hot on their current CPU.
+* 1) throttled_lb_pair, or
+* 2) task's load is too low, or
+* 3) task's too large to imbalance, or
+* 4) cannot be migrated to this CPU due to cpus_allowed, or
+* 5) running (obviously), or
+* 6) are cache-hot on their current CPU.
 */
+
+   if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
+   return 0;
+
+   if (!lb_active) {
+   *load = task_h_load(p);
+   if (sched_feat(LB_MIN) &&
+   *load < 16 && !env->sd->nr_balance_failed)
+   return 0;
+
+   if ((*load / 2) > env->imbalance)
+   return 0;
+   }
+
if (!cpumask_test_cpu(env->dst_cpu, tsk_cpus_allowed(p))) {
int new_dst_cpu;
 
@@ -3957,10 +3978,7 @@ static int move_one_task(struct lb_env *env)
struct task_struct *p, *n;
 
 	list_for_each_entry_safe(p, n, &env->src_rq->cfs_tasks, se.group_node) {
-		if (throttled_lb_pair(task_group(p), env->src_rq->cpu, env->dst_cpu))
-   continue;
-
-   if (!can_migrate_task(p, env))
+   if (!can_migrate_task(p, env, true, NULL))
continue;
 
move_task(p, env);
@@ -3975,8 +3993,6 @@ static int move_one_task(struct lb_env *env)
return 0;
 }
 
-static unsigned long task_h_load(struct task_struct *p);
-
 static const unsigned int sched_nr_migrate_break = 32;
 
 /*
@@ -4011,18 +4027,7 @@ static int move_tasks(struct lb_env *env)
break;
}
 
-		if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
-			goto next;
-
-		load = task_h_load(p);
-
-		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
-			goto next;
-
-		if ((load / 2) > env->imbalance)
-			goto next;
-
-		if (!can_migrate_task(p, env))
+		if (!can_migrate_task(p, env, false, &load))
goto next;
 
move_task(p, env);
-- 
1.7.9.5



[PATCH 5/8] sched: move up affinity check to mitigate useless redoing overhead

2013-02-13 Thread Joonsoo Kim
Currently, LBF_ALL_PINNED is cleared only after the affinity check has
passed. So, if can_migrate_task() fails earlier because of a small load
value or a small imbalance value, we don't clear LBF_ALL_PINNED, and we
end up triggering 'redo' in load_balance().

The imbalance value is often so small that no task can be moved to
another cpu and, of course, this situation may persist after we change
the target cpu. So this patch moves the affinity check up, so that
LBF_ALL_PINNED is cleared even if can_migrate_task() then fails for
one of the above reasons. This mitigates useless redo overhead.

Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 849bc8e..bb373f4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3888,9 +3888,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env,
/*
 * We do not migrate tasks that are:
 * 1) throttled_lb_pair, or
-* 2) task's load is too low, or
-* 3) task's too large to imbalance, or
-* 4) cannot be migrated to this CPU due to cpus_allowed, or
+* 2) cannot be migrated to this CPU due to cpus_allowed, or
+* 3) task's load is too low, or
+* 4) task's too large to imbalance, or
 * 5) running (obviously), or
 * 6) are cache-hot on their current CPU.
 */
@@ -3898,16 +3898,6 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env,
if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
return 0;
 
-   if (!lb_active) {
-   *load = task_h_load(p);
-   if (sched_feat(LB_MIN) &&
-   *load < 16 && !env->sd->nr_balance_failed)
-   return 0;
-
-   if ((*load / 2) > env->imbalance)
-   return 0;
-   }
-
if (!cpumask_test_cpu(env->dst_cpu, tsk_cpus_allowed(p))) {
int new_dst_cpu;
 
@@ -3936,6 +3926,16 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env,
/* Record that we found atleast one task that could run on dst_cpu */
env->flags &= ~LBF_ALL_PINNED;
 
+   if (!lb_active) {
+   *load = task_h_load(p);
+   if (sched_feat(LB_MIN) &&
+   *load < 16 && !env->sd->nr_balance_failed)
+   return 0;
+
+   if ((*load / 2) > env->imbalance)
+   return 0;
+   }
+
if (task_running(env->src_rq, p)) {
schedstat_inc(p, se.statistics.nr_failed_migrations_running);
return 0;
-- 
1.7.9.5
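
The effect of reordering the checks can be modelled in a few lines of
user-space C. This is only a sketch: `struct task`, `struct env`,
`ALL_PINNED`, `load_fits()` and the two `can_migrate_*()` functions are
simplified, hypothetical stand-ins for the scheduler's structures and
for can_migrate_task() before and after this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ALL_PINNED 0x1u	/* models LBF_ALL_PINNED */

struct task {
	bool allowed_on_dst;	/* affinity permits the destination cpu */
	unsigned long load;
};

struct env {
	unsigned long imbalance;
	unsigned int flags;
};

/* Models the "(load / 2) > imbalance" rejection. */
static bool load_fits(const struct task *p, const struct env *e)
{
	return (p->load / 2) <= e->imbalance;
}

/* Order before the patch: load check first, affinity second. */
static bool can_migrate_old(const struct task *p, struct env *e)
{
	assert(p != NULL && e != NULL);
	if (!load_fits(p, e))
		return false;	/* ALL_PINNED stays set */
	if (!p->allowed_on_dst)
		return false;
	e->flags &= ~ALL_PINNED;
	return true;
}

/* Order after the patch: affinity first, load check second. */
static bool can_migrate_new(const struct task *p, struct env *e)
{
	assert(p != NULL && e != NULL);
	if (!p->allowed_on_dst)
		return false;
	e->flags &= ~ALL_PINNED;	/* at least one task is not pinned */
	return load_fits(p, e);
}
```

For a task that is allowed on the destination cpu but too heavy for the
imbalance, both variants refuse the migration, but only the new order
clears the flag, so the caller no longer mistakes the situation for
"all tasks pinned" and does not redo.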



[PATCH 7/8] sched: prevent to re-select dst-cpu in load_balance()

2013-02-13 Thread Joonsoo Kim
Commit 88b8dac0 makes load_balance() consider other cpus in its group.
But it has no code to prevent dst-cpu from being re-selected, so the
same dst-cpu can be selected over and over.

This patch adds functionality to load_balance() to exclude a cpu once
it has been selected.

Cc: Srivatsa Vaddagiri 
Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e6f8783..d4c6ed0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6814,6 +6814,7 @@ struct task_group root_task_group;
 LIST_HEAD(task_groups);
 #endif
 
+DECLARE_PER_CPU(cpumask_var_t, load_balance_dst_grp);
 DECLARE_PER_CPU(cpumask_var_t, load_balance_cpu_active);
 
 void __init sched_init(void)
@@ -6828,7 +6829,7 @@ void __init sched_init(void)
alloc_size += 2 * nr_cpu_ids * sizeof(void **);
 #endif
 #ifdef CONFIG_CPUMASK_OFFSTACK
-   alloc_size += num_possible_cpus() * cpumask_size();
+   alloc_size += num_possible_cpus() * cpumask_size() * 2;
 #endif
if (alloc_size) {
ptr = (unsigned long)kzalloc(alloc_size, GFP_NOWAIT);
@@ -6851,6 +6852,8 @@ void __init sched_init(void)
 #endif /* CONFIG_RT_GROUP_SCHED */
 #ifdef CONFIG_CPUMASK_OFFSTACK
for_each_possible_cpu(i) {
+   per_cpu(load_balance_dst_grp, i) = (void *)ptr;
+   ptr += cpumask_size();
per_cpu(load_balance_cpu_active, i) = (void *)ptr;
ptr += cpumask_size();
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7382fa5..70631e8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4974,6 +4974,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 #define MAX_PINNED_INTERVAL	512
 
 /* Working cpumask for load_balance and load_balance_newidle. */
+DEFINE_PER_CPU(cpumask_var_t, load_balance_dst_grp);
 DEFINE_PER_CPU(cpumask_var_t, load_balance_cpu_active);
 
 static int need_active_balance(struct lb_env *env)
@@ -5005,17 +5006,17 @@ static int load_balance(int this_cpu, struct rq *this_rq,
int *balance)
 {
int ld_moved, cur_ld_moved, active_balance = 0;
-   int lb_iterations, max_lb_iterations;
struct sched_group *group;
struct rq *busiest;
unsigned long flags;
+   struct cpumask *dst_grp = __get_cpu_var(load_balance_dst_grp);
struct cpumask *cpus = __get_cpu_var(load_balance_cpu_active);
 
struct lb_env env = {
.sd = sd,
.dst_cpu= this_cpu,
.dst_rq = this_rq,
-   .dst_grpmask= sched_group_cpus(sd->groups),
+   .dst_grpmask= dst_grp,
.idle   = idle,
.loop_break = sched_nr_migrate_break,
.cpus   = cpus,
@@ -5025,9 +5026,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 * other cpus in our group */
if (idle == CPU_NEWLY_IDLE) {
env.dst_grpmask = NULL;
-   max_lb_iterations = 0;
} else {
-   max_lb_iterations = cpumask_weight(env.dst_grpmask);
+   cpumask_copy(dst_grp, sched_group_cpus(sd->groups));
+   cpumask_clear_cpu(env.dst_cpu, env.dst_grpmask);
}
cpumask_copy(cpus, cpu_active_mask);
 
@@ -5055,7 +5056,6 @@ redo:
schedstat_add(sd, lb_imbalance[idle], env.imbalance);
 
ld_moved = 0;
-   lb_iterations = 1;
if (busiest->nr_running > 1) {
/*
 * Attempt to move tasks. If find_busiest_group has found
@@ -5112,14 +5112,17 @@ more_balance:
 * moreover subsequent load balance cycles should correct the
 * excess load moved.
 */
-   if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0 &&
-   lb_iterations++ < max_lb_iterations) {
+   if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0) {
 
env.dst_rq   = cpu_rq(env.new_dst_cpu);
env.dst_cpu  = env.new_dst_cpu;
env.flags   &= ~LBF_SOME_PINNED;
env.loop = 0;
env.loop_break   = sched_nr_migrate_break;
+
+   /* Prevent to re-select dst_cpu */
+   cpumask_clear_cpu(env.dst_cpu, env.dst_grpmask);
+
/*
 * Go back to "more_balance" rather than "redo" since we
 * need to continue with same src_cpu.
-- 
1.7.9.5
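
The exclusion scheme can be illustrated with a plain 64-bit mask in
place of a cpumask. A minimal sketch, assuming at most 64 cpus:
`exclude_cpu()` and `pick_dst()` are hypothetical helpers that stand in
for cpumask_clear_cpu() and for whatever picks new_dst_cpu from
dst_grpmask.

```c
#include <assert.h>
#include <stdint.h>

/* Remove one cpu from the candidate mask. */
static uint64_t exclude_cpu(uint64_t mask, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return mask & ~(UINT64_C(1) << cpu);
}

/* Return the lowest-numbered candidate cpu, or -1 if none is left. */
static int pick_dst(uint64_t mask)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}
```

Starting from a group mask with this_cpu already cleared, each retry
clears the cpu it just used, so the loop over alternate destinations
ends after at most one pass over the group and no iteration counter is
needed.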



[PATCH 1/8] sched: change position of resched_cpu() in load_balance()

2013-02-13 Thread Joonsoo Kim
cur_ld_moved is reset if env.flags hits LBF_NEED_BREAK, so there is a
possibility that we miss calling resched_cpu(). Correct this by moving
the resched_cpu() call before the LBF_NEED_BREAK check.

Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81fa536..6f72851 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5070,17 +5070,17 @@ more_balance:
double_rq_unlock(env.dst_rq, busiest);
local_irq_restore(flags);
 
-   if (env.flags & LBF_NEED_BREAK) {
-   env.flags &= ~LBF_NEED_BREAK;
-   goto more_balance;
-   }
-
/*
 * some other cpu did the load balance for us.
 */
if (cur_ld_moved && env.dst_cpu != smp_processor_id())
resched_cpu(env.dst_cpu);
 
+   if (env.flags & LBF_NEED_BREAK) {
+   env.flags &= ~LBF_NEED_BREAK;
+   goto more_balance;
+   }
+
/*
 * Revisit (affine) tasks on src_cpu that couldn't be moved to
 * us and move them to an alternate dst_cpu in our sched_group
-- 
1.7.9.5



[PATCH 8/8] sched: reset lb_env when redo in load_balance()

2013-02-13 Thread Joonsoo Kim
Commit 88b8dac0 makes load_balance() consider other cpus in its group.
So now, when we redo in load_balance(), we should reset some fields of
lb_env to ensure that load_balance() works for the initial cpu, not for
other cpus in its group. So correct it.

Cc: Srivatsa Vaddagiri 
Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 70631e8..25c798c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5014,14 +5014,20 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 
struct lb_env env = {
.sd = sd,
-   .dst_cpu= this_cpu,
-   .dst_rq = this_rq,
.dst_grpmask= dst_grp,
.idle   = idle,
-   .loop_break = sched_nr_migrate_break,
.cpus   = cpus,
};
 
+   schedstat_inc(sd, lb_count[idle]);
+   cpumask_copy(cpus, cpu_active_mask);
+
+redo:
+   env.dst_cpu = this_cpu;
+   env.dst_rq = this_rq;
+   env.loop = 0;
+   env.loop_break = sched_nr_migrate_break;
+
/* For NEWLY_IDLE load_balancing, we don't need to consider
 * other cpus in our group */
if (idle == CPU_NEWLY_IDLE) {
@@ -5030,11 +5036,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
cpumask_copy(dst_grp, sched_group_cpus(sd->groups));
cpumask_clear_cpu(env.dst_cpu, env.dst_grpmask);
}
-   cpumask_copy(cpus, cpu_active_mask);
 
-   schedstat_inc(sd, lb_count[idle]);
-
-redo:
group = find_busiest_group(&env, balance);
 
if (*balance == 0)
@@ -5133,11 +5135,9 @@ more_balance:
/* All tasks on this runqueue were pinned by CPU affinity */
if (unlikely(env.flags & LBF_ALL_PINNED)) {
cpumask_clear_cpu(cpu_of(busiest), cpus);
-   if (!cpumask_empty(cpus)) {
-   env.loop = 0;
-   env.loop_break = sched_nr_migrate_break;
+   if (!cpumask_empty(cpus))
goto redo;
-   }
+
goto out_balanced;
}
}
-- 
1.7.9.5
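
For readers who want the idea without the surrounding scheduler code, here is
a minimal sketch (not the kernel implementation) of why the reset belongs at
the redo label. The struct, the constant and the function name are
hypothetical stand-ins, and a for loop replaces the goto. Each pass retargets
dst_cpu and advances loop, as a pass over the group may do; only a reset at
the top of every pass guarantees that each retry starts from the initial cpu.

```c
/* Hypothetical, cut-down stand-in for struct lb_env. */
struct lb_env_sketch {
	int dst_cpu;
	unsigned int loop;
	unsigned int loop_break;
};

#define SKETCH_NR_MIGRATE_BREAK 32	/* assumed value, illustration only */

/*
 * Run `passes` balancing passes and count how many of them start with
 * dst_cpu == this_cpu and loop == 0.  With reset_at_label != 0 the state
 * is reset at the top of every pass (the patched behaviour); otherwise
 * only the loop counters are reset at the retry site (the old behaviour).
 */
static int clean_passes(int this_cpu, int passes, int reset_at_label)
{
	struct lb_env_sketch env = {
		.dst_cpu = this_cpu,
		.loop = 0,
		.loop_break = SKETCH_NR_MIGRATE_BREAK,
	};
	int clean = 0;

	for (int pass = 0; pass < passes; pass++) {
		if (reset_at_label) {
			env.dst_cpu = this_cpu;
			env.loop = 0;
			env.loop_break = SKETCH_NR_MIGRATE_BREAK;
		}

		if (env.dst_cpu == this_cpu && env.loop == 0)
			clean++;

		/* The pass retargets another cpu in the group and iterates. */
		env.dst_cpu = this_cpu + 1;
		env.loop += 5;

		/* Old retry site: only the loop counters were reset. */
		if (!reset_at_label) {
			env.loop = 0;
			env.loop_break = SKETCH_NR_MIGRATE_BREAK;
		}
	}
	return clean;
}
```

With three passes, clean_passes(0, 3, 1) counts every pass as clean, while
clean_passes(0, 3, 0) counts only the first one, because dst_cpu stays
pointed at the other cpu.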



[PATCH 6/8] sched: rename load_balance_tmpmask to load_balance_cpu_active

2013-02-13 Thread Joonsoo Kim
The name load_balance_tmpmask doesn't convey any specific meaning.
So rename it to reflect its purpose.

Signed-off-by: Joonsoo Kim 

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..e6f8783 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6814,7 +6814,7 @@ struct task_group root_task_group;
 LIST_HEAD(task_groups);
 #endif
 
-DECLARE_PER_CPU(cpumask_var_t, load_balance_tmpmask);
+DECLARE_PER_CPU(cpumask_var_t, load_balance_cpu_active);
 
 void __init sched_init(void)
 {
@@ -6851,7 +6851,7 @@ void __init sched_init(void)
 #endif /* CONFIG_RT_GROUP_SCHED */
 #ifdef CONFIG_CPUMASK_OFFSTACK
for_each_possible_cpu(i) {
-   per_cpu(load_balance_tmpmask, i) = (void *)ptr;
+   per_cpu(load_balance_cpu_active, i) = (void *)ptr;
ptr += cpumask_size();
}
 #endif /* CONFIG_CPUMASK_OFFSTACK */
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bb373f4..7382fa5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4974,7 +4974,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 #define MAX_PINNED_INTERVAL 512
 
 /* Working cpumask for load_balance and load_balance_newidle. */
-DEFINE_PER_CPU(cpumask_var_t, load_balance_tmpmask);
+DEFINE_PER_CPU(cpumask_var_t, load_balance_cpu_active);
 
 static int need_active_balance(struct lb_env *env)
 {
@@ -5009,7 +5009,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
struct sched_group *group;
struct rq *busiest;
unsigned long flags;
-   struct cpumask *cpus = __get_cpu_var(load_balance_tmpmask);
+   struct cpumask *cpus = __get_cpu_var(load_balance_cpu_active);
 
struct lb_env env = {
.sd = sd,
-- 
1.7.9.5



[RFC] SIGKILL vs. SIGSEGV on late execve() failures

2013-02-13 Thread Al Viro
If execve() fails past flush_old_exec(), we are obviously going to
kill the process.  Right now it's implemented in $BIGNUM places in
->load_binary() and that's obviously brittle (and in at least one case
buggy - binfmt_flat lacks send_sig_info() on late failures).  Now, there's
an obvious way to check that we had done successful flush_old_exec() -
bprm->mm becomes NULL just past the last failure exit there.  So it would
be tempting to have these send_sig_info() moved into search_binary_handler(),
especially since we already have
if (retval != -ENOEXEC || bprm->mm == NULL)
break;
in there and turning that into
if (bprm->mm == NULL) {
/* past the point of no return */
suicide
break;
}
if (retval != -ENOEXEC)
break;
would be trivial.
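
To make the proposed ordering concrete, here is a small sketch of just the
failure path. It is not the real search_binary_handler() loop: the enum, the
one-field struct and the function name are hypothetical, and "suicide" is
reduced to a return value so that the decision can be looked at in isolation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcome of one failed ->load_binary() attempt. */
enum handler_action {
	TRY_NEXT_FORMAT,	/* -ENOEXEC and the old mm is still intact */
	RETURN_TO_CALLER,	/* other error, old mm is still intact */
	KILL_SELF,		/* failed past flush_old_exec() */
};

/* Stand-in for the one field of struct linux_binprm that matters here. */
struct bprm_sketch {
	void *mm;
};

/* Only meaningful for retval < 0, i.e. when the handler failed. */
static enum handler_action classify_failure(const struct bprm_sketch *bprm,
					    int retval)
{
	if (bprm->mm == NULL)
		return KILL_SELF;		/* past the point of no return */
	if (retval != -ENOEXEC)
		return RETURN_TO_CALLER;
	return TRY_NEXT_FORMAT;
}
```

The NULL test has to come first: once bprm->mm is gone, even an -ENOEXEC
from the handler must end in the process being killed instead of in another
format being tried.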

The only problem is that some suicides do SIGKILL, some SIGSEGV.
AFAICS, it started as SIGSEGV and had been switched to SIGKILL for a.out
(without any comments) in 1.1.62.  By that time ELF had been there, with
SIGSEGV in the same places.  Not replaced with SIGKILL; as a matter of
fact, they are still there.  Additional failure exits in case of ELF had
been added with SIGKILL; ELF-FDPIC has copied ELF and FLAT hadn't bothered
with send_sig_info() at all.

Since by that point we have an empty sighandler table, the only
real difference is whether we attempt to produce a coredump on such late
failures.  Is there any real reason not to try that?  After all, with that
kind of late failure in execve(2) a coredump is obviously something the
caller might want to take a look at...

What was the reason for switch in 1.1.62?  It's before my time
and the only exec-related comments I see in 1.1.61->1.1.62 summary are

Don't execute files that are being written to.
If we can't get write access to a core dump file, don't core dump.
Remove redundant test for non-null executables. 
No need to release the shared memory by hand, when loading different executable.

none of which covers that one...


RE: [PATCH RFC] davinci: poll for sleep completion in resume routine.

2013-02-13 Thread Vishwanathrao Badarkhe, Manish
Hi Sekhar,

On Thu, Feb 14, 2013 at 09:48:59, Nori, Sekhar wrote:
> Manish,
> 
> On 1/31/2013 2:56 PM, Vishwanathrao Badarkhe, Manish wrote:
> > As per OMAP-L138 TRM, Software must poll for SLEEPCOMPLETE bit until 
> > it is set to 1 before clearing SLEEPENABLE bit in DEEPSLEEP register 
> > in resume routine.
> > Modifications are as per datasheet:
> > http://www.ti.com/lit/ug/spruh77a/spruh77a.pdf
> > See sections 10.10.2.2 and 11.5.21 for more detailed explanation.
> 
> Polling for SLEEPCOMPLETE is not required in RTC controlled wake-up which is 
> the mode currently supported (see section 10.10.2.1 of the TRM). Polling for 
> SLEEPCOMPLETE is required for external controlled wake-up which to my 
> knowledge has never been tested. If you have tested this with external 
> controlled wake-up, then I can consider this patch.
> Else, I would like to take it only after externally controlled wake-up is 
> fully tested/supported instead of taking bits and pieces.

Yes, for RTC controlled wakeup this polling is not required as per section
10.10.2.1.
But section 10.10.2.2 (Exiting Deep Sleep Mode), step 2, says that the
SLEEPCOMPLETE bit gets set in the DEEPSLEEP register when the sleep count
completes, and until then it is not safe to release the clock to devices.
So if we don't poll for SLEEPCOMPLETE, the delay that we program while
entering deep sleep in the RTC controlled wakeup case (section 10.10.2.1,
step 9) never comes into the picture.
Please let me know whether this understanding is correct.

For external controlled wakeup we need to make hardware modifications, so
that functionality is yet to be tested.

Thanks, 
Manish Badarkhe

> > 
> > Tested on da850-evm.
> > 
> > Signed-off-by: Vishwanathrao Badarkhe, Manish 
> > ---
> > :100644 100644 d4e9316... 976f096... M  arch/arm/mach-davinci/sleep.S
> >  arch/arm/mach-davinci/sleep.S |8 
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/arm/mach-davinci/sleep.S 
> > b/arch/arm/mach-davinci/sleep.S index d4e9316..976f096 100644
> > --- a/arch/arm/mach-davinci/sleep.S
> > +++ b/arch/arm/mach-davinci/sleep.S
> > @@ -35,6 +35,7 @@
> >  #define PLL_LOCK_CYCLES            (PLL_LOCK_TIME * 25)
> >  
> >  #define DEEPSLEEP_SLEEPENABLE_BIT  BIT(31)
> > +#define DEEPSLEEP_SLEEPCOMPLETE_BIT        BIT(30)
> >  
> > .text
> >  /*
> > @@ -110,6 +111,13 @@ ENTRY(davinci_cpu_suspend)
> >  
> > /* Wake up from sleep */
> >  
> > +   /* wait for sleep complete */
> > +sleep_complete:
> > +   ldr ip, [r4]
> > +   and ip, ip, #DEEPSLEEP_SLEEPCOMPLETE_BIT
> > +   cmp ip, #DEEPSLEEP_SLEEPCOMPLETE_BIT
> > +   bne sleep_complete
> > +
> > /* Clear sleep enable */
> > ldr ip, [r4]
> > bic ip, ip, #DEEPSLEEP_SLEEPENABLE_BIT
> > 
> 
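
For discussion, the resume sequence from the patch can be written out in C
against a simulated register. This is only a sketch: the function name is
hypothetical, the register is an ordinary variable rather than the real
DEEPSLEEP register, and the poll limit is an addition that the assembly
loop in the patch does not have.

```c
#include <stdint.h>

#define DEEPSLEEP_SLEEPENABLE_BIT	(UINT32_C(1) << 31)
#define DEEPSLEEP_SLEEPCOMPLETE_BIT	(UINT32_C(1) << 30)

/*
 * Wait for SLEEPCOMPLETE, then clear SLEEPENABLE.
 * Returns 0 on success, or -1 if SLEEPCOMPLETE was still clear after
 * max_polls retries (the bound is an assumption of this sketch).
 */
static int deepsleep_exit(volatile uint32_t *deepsleep, unsigned int max_polls)
{
	/* wait for sleep complete */
	while (!(*deepsleep & DEEPSLEEP_SLEEPCOMPLETE_BIT)) {
		if (max_polls-- == 0)
			return -1;
	}

	/* clear sleep enable */
	*deepsleep &= ~DEEPSLEEP_SLEEPENABLE_BIT;
	return 0;
}
```

If SLEEPCOMPLETE never gets set, SLEEPENABLE is left untouched and the
caller gets -1 back, whereas the loop in the patch would spin forever.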




Re: 3.2.38 most of the time has 100% cpu use reported

2013-02-13 Thread Ben Hutchings
On Tue, 2013-02-12 at 00:01 -0500, tmhik...@gmail.com wrote:
>   Okay, I finally have located the patch causing this bizarre problem
> for me. Before I discuss it, I'm going to drag out the kernel bug reporting
> guidelines and try to make a proper bug report out of this.
> 
> [1.] One line summary of the problem:
> 3.2.38 most of the time has 100% cpu use reported
> 
> [2.] Full description of the problem/report:
> Reverse applying the patch for
> 
> [9a1f08a1a192f9177d7063d903773aed800b840f] drivers/firmware/dmi_scan.c: fetch 
> dmi version from SMBIOS if it exists
> 
> on top of a clean 3.2.38 tree makes the problem go away.
[...]
> I have to admit I have no idea what the patch I'm reversing actually does,

It changes how we look for the version of a BIOS interface (DMI or
SMBIOS).  All the code that it touches, and the version number variable,
are discarded after boot and therefore can have very limited effect on
what happens later!

> what the cpu is doing when it's stuck at 100%, nor how I would diagnose this
> further or whom I should contact.  I'd like to help anyone who wants to fix
> whatever went wrong, so please contact me.

'perf top' will tell you where the CPU is spending its time.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
  - Albert Camus




[tip:x86/urgent] efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter

2013-02-13 Thread tip-bot for Satoru Takeuchi
Commit-ID:  1de63d60cd5b0d33a812efa455d5933bf1564a51
Gitweb: http://git.kernel.org/tip/1de63d60cd5b0d33a812efa455d5933bf1564a51
Author: Satoru Takeuchi 
AuthorDate: Thu, 14 Feb 2013 09:12:52 +0900
Committer:  H. Peter Anvin 
CommitDate: Wed, 13 Feb 2013 17:24:11 -0800

efi: Clear EFI_RUNTIME_SERVICES rather than EFI_BOOT by "noefi" boot parameter

The samsung-laptop platform driver has a serious problem: it is designed to
run under BIOS, and running it under EFI can brick the machine or cause
Machine Check Exceptions.

Discussion about this problem:
https://bugs.launchpad.net/ubuntu-cdimage/+bug/1040557
https://bugzilla.kernel.org/show_bug.cgi?id=47121

The patches to fix this problem:
efi: Make 'efi_enabled' a function to query EFI facilities
83e68189745ad931c2afd45d8ee3303929233e7f

samsung-laptop: Disable on EFI hardware
e0094244e41c4d0c7ad69920681972fc45d8ce34

Unfortunately this problem comes back if users specify the "noefi" option.
This parameter clears EFI_BOOT, so that driver keeps running even under
EFI. According to the documentation, this parameter should clear
EFI_RUNTIME_SERVICES instead.

Documentation/kernel-parameters.txt:
===
...
noefi   [X86] Disable EFI runtime services support.
...
===

Documentation/x86/x86_64/uefi.txt:
===
...
- If some or all EFI runtime services don't work, you can try following
  kernel command line parameters to turn off some or all EFI runtime
  services.
noefi   turn off all EFI runtime services
...
===

Signed-off-by: Satoru Takeuchi 
Link: http://lkml.kernel.org/r/511c2c04.2070...@jp.fujitsu.com
Cc: Matt Fleming 
Cc: 
Signed-off-by: H. Peter Anvin 
---
 arch/x86/platform/efi/efi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 77cf009..928bf83 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -87,7 +87,7 @@ EXPORT_SYMBOL(efi_enabled);
 
 static int __init setup_noefi(char *arg)
 {
-   clear_bit(EFI_BOOT, &x86_efi_facility);
+   clear_bit(EFI_RUNTIME_SERVICES, &x86_efi_facility);
return 0;
 }
 early_param("noefi", setup_noefi);
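
The effect of the one-line change can be modelled with a plain bitmask. This
is a sketch only: the bit numbers, the helper names and the driver guard are
illustrative assumptions, not the kernel's definitions. It assumes a
BIOS-only driver that refuses to load whenever the EFI_BOOT facility is set.

```c
#include <stdbool.h>

/* Illustrative bit numbers; the real ones are defined by the kernel. */
enum efi_facility_sketch {
	SKETCH_EFI_BOOT = 0,
	SKETCH_EFI_RUNTIME_SERVICES = 3,
};

static bool facility_enabled(unsigned long facility, int bit)
{
	return (facility >> bit) & 1UL;
}

/* What "noefi" did before the patch: forget that we booted via EFI. */
static unsigned long noefi_before_patch(unsigned long facility)
{
	return facility & ~(1UL << SKETCH_EFI_BOOT);
}

/* What "noefi" does after the patch: only turn off runtime services. */
static unsigned long noefi_after_patch(unsigned long facility)
{
	return facility & ~(1UL << SKETCH_EFI_RUNTIME_SERVICES);
}

/* Hypothetical guard of a driver that must not run on EFI machines. */
static bool bios_only_driver_loads(unsigned long facility)
{
	return !facility_enabled(facility, SKETCH_EFI_BOOT);
}
```

On a machine booted via EFI with both bits set, noefi_before_patch() makes
bios_only_driver_loads() return true, which is the regression described
above. noefi_after_patch() keeps the driver out and only disables the
runtime services.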


Re: [PATCH] of: dma.c: fix memory leakage

2013-02-13 Thread Vinod Koul
On Wed, Feb 13, 2013 at 12:26:23PM +0100, Cong Ding wrote:
> > > > You need to send this to whomever is working on DMA bindings.
> > > Thank you bob, I added Vinod to the receiver list.
> > I have moved the of/dma.c to dma/of-dma.c, can you regenerate this patch and
> > resend to me
> Sorry Vinod, I didn't manage to get this commit from either linux-next tree or
> slave-dma tree, and the last commit by you for of/dma.c file is on Jan 7. Did
> you have any hints for me to get the latest version dma/of-dma.c?
Ah, my bad, it seems I have not pushed it out :(

Pushed now...



Re: [PATCH v1 2/4] ARM: dts: Add i2c-arbitrator bus for exynos5250-snow

2013-02-13 Thread Stephen Warren
On 02/13/2013 05:38 PM, Doug Anderson wrote:
> Stephen,
> 
> 
> On Wed, Feb 13, 2013 at 1:04 PM, Stephen Warren  wrote:
>> On 02/13/2013 11:02 AM, Doug Anderson wrote:
>>> We need to use the i2c-arbitrator to talk to any of the devices on i2c
>>> bus 4 on exynos5250-snow so that we don't confuse the embedded
>>> controller (EC).  Add the i2c-arbitrator to the device tree.  As we
>>> add future devices (keyboard, sbs, tps65090) we'll add them on top of
>>> this.
>>>
>>> The arbitrated bus is numbered 104 simply as a convenience to make it
>>> easier for people poking around to guess that it might have something
>>> to do with the physical bus 4.
>>>
>>> The addition is split between the cros5250-common and the snow device
>>> tree file since not all cros5250-class devices use arbitration.

>>> diff --git a/arch/arm/boot/dts/exynos5250-snow.dts 
>>> b/arch/arm/boot/dts/exynos5250-snow.dts
>>
>>> + i2c-arbitrator {
>>> + compatible = "i2c-arbitrator";
>>> + #address-cells = <1>;
>>> + #size-cells = <0>;
>>
>>> + /* Use ID 104 as a hint that we're on physical bus 4 */
>>> + i2c_104: i2c@0 {
>>
>> Does something use that hint? It sounds a little odd.
> 
> The i2c bus numbering patches will end up creating "/dev/i2c-104".

Oh sorry, I see this is just the alias doing its job. I'd misread that
as the reg value being 104 and driving the bus ID.


Re: linux-next: manual merge of the akpm tree with the tip tree

2013-02-13 Thread H. Peter Anvin

On 02/13/2013 08:25 PM, Stephen Rothwell wrote:

Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in
kernel/timeconst.pl between commit 63a3f603413f ("timeconst.pl: Eliminate
Perl warning") from the tip tree and commit "timeconst.pl: remove
deprecated defined(@array)" from the akpm tree.

These both fix the same problem, I arbitrarily chose the akpm tree version.



I should try to resurrect the bc version (which doesn't need the canning 
junk, bc being the POSIX tool for arbitrary-precision arithmetic.) 
There was an error on one of akpm's machines long ago which confused the 
bcrap out of us, because bc hadn't changed, but recently someone pointed 
to a bug in *make* (relating to pipes) from around that era which would 
have explained (a) the failure, and (b) why it only hit one box even 
though bc was the exact same version.


-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



linux-next: manual merge of the akpm tree with the tip tree

2013-02-13 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in
include/linux/sched.h between commit cf4aebc292fa ("sched: Move sched.h
sysctl bits into separate header") from the tip tree and commit "aio:
don't include aio.h in sched.h" from the akpm tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc include/linux/sched.h
index fe38049,f0e3a11..000
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@@ -326,8 -339,23 +326,6 @@@ extern int mutex_spin_on_owner(struct m
  struct nsproxy;
  struct user_namespace;
  
- #include 
- 
 -/*
 - * Default maximum number of active map areas, this limits the number of vmas
 - * per mm struct. Users can overwrite this number by sysctl but there is a
 - * problem.
 - *
 - * When a program's coredump is generated as ELF format, a section is created
 - * per a vma. In ELF, the number of sections is represented in unsigned short.
 - * This means the number of sections should be smaller than 65535 at coredump.
 - * Because the kernel adds some informative sections to a image of program at
 - * generating coredump, we need some margin. The number of extra sections is
 - * 1-3 now and depends on arch. We use "5" as safe margin, here.
 - */
 -#define MAPCOUNT_ELF_CORE_MARGIN  (5)
 -#define DEFAULT_MAX_MAP_COUNT (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
 -
 -extern int sysctl_max_map_count;
 -
  #ifdef CONFIG_MMU
  extern void arch_pick_mmap_layout(struct mm_struct *mm);
  extern unsigned long




Re: [PATCH] mfd: support stmpe1801 18 bits enhanced port expander

2013-02-13 Thread Dmitry Torokhov
On Tue, Feb 12, 2013 at 11:05:12AM +0100, Samuel Ortiz wrote:
> Adding Dmitry to the thread, for the input parts.

Looks reasonable, however the patch is against older version of the
driver and most likely will not apply anymore...

Thanks.

> 
> On Thu, Dec 20, 2012 at 09:57:19AM +0100, Jean-Nicolas Graux wrote:
> > Provides support for 1801 variant of stmpe gpio port expanders.
> > This chip has 18 gpios configurable as GPI, GPO, keypad matrix,
> > special key or dedicated key function.
> > 
> > Note that special/dedicated key function is not supported yet.
> > 
> > Signed-off-by: Jean-Nicolas Graux 
> > ---
> >  drivers/gpio/gpio-stmpe.c |   52 +--
> >  drivers/input/keyboard/stmpe-keypad.c |  251 
> > +++--
> >  drivers/mfd/Kconfig   |1 +
> >  drivers/mfd/stmpe-i2c.c   |1 +
> >  drivers/mfd/stmpe.c   |   97 -
> >  drivers/mfd/stmpe.h   |   49 +++
> >  include/linux/mfd/stmpe.h |3 +
> >  7 files changed, 392 insertions(+), 62 deletions(-)
> > 
> > diff --git a/drivers/gpio/gpio-stmpe.c b/drivers/gpio/gpio-stmpe.c
> > index dce3472..662c415 100644
> > --- a/drivers/gpio/gpio-stmpe.c
> > +++ b/drivers/gpio/gpio-stmpe.c
> > @@ -42,14 +42,25 @@ static inline struct stmpe_gpio *to_stmpe_gpio(struct 
> > gpio_chip *chip)
> > return container_of(chip, struct stmpe_gpio, chip);
> >  }
> >  
> > +static inline u8 stmpe_gpio_reg(struct stmpe *stmpe, u8 index, int offset)
> > +{
> > +   u8 reg;
> > +   if (stmpe->partnum == STMPE1801)
> > +   reg = stmpe->regs[index] + offset;
> > +   else
> > +   reg = stmpe->regs[index] - offset;
> > +   return reg;
> > +}
> > +
> >  static int stmpe_gpio_get(struct gpio_chip *chip, unsigned offset)
> >  {
> > struct stmpe_gpio *stmpe_gpio = to_stmpe_gpio(chip);
> > struct stmpe *stmpe = stmpe_gpio->stmpe;
> > -   u8 reg = stmpe->regs[STMPE_IDX_GPMR_LSB] - (offset / 8);
> > -   u8 mask = 1 << (offset % 8);
> > int ret;
> > +   u8 reg, mask;
> >  
> > +   reg = stmpe_gpio_reg(stmpe, STMPE_IDX_GPMR_LSB, offset / 8);
> > +   mask = 1 << (offset % 8);
> > ret = stmpe_reg_read(stmpe, reg);
> > if (ret < 0)
> > return ret;
> > @@ -62,8 +73,10 @@ static void stmpe_gpio_set(struct gpio_chip *chip, 
> > unsigned offset, int val)
> > struct stmpe_gpio *stmpe_gpio = to_stmpe_gpio(chip);
> > struct stmpe *stmpe = stmpe_gpio->stmpe;
> > int which = val ? STMPE_IDX_GPSR_LSB : STMPE_IDX_GPCR_LSB;
> > -   u8 reg = stmpe->regs[which] - (offset / 8);
> > -   u8 mask = 1 << (offset % 8);
> > +   u8 reg, mask;
> > +
> > +   reg = stmpe_gpio_reg(stmpe, which, offset / 8);
> > +   mask = 1 << (offset % 8);
> >  
> > /*
> >  * Some variants have single register for gpio set/clear functionality.
> > @@ -80,8 +93,10 @@ static int stmpe_gpio_direction_output(struct gpio_chip 
> > *chip,
> >  {
> > struct stmpe_gpio *stmpe_gpio = to_stmpe_gpio(chip);
> > struct stmpe *stmpe = stmpe_gpio->stmpe;
> > -   u8 reg = stmpe->regs[STMPE_IDX_GPDR_LSB] - (offset / 8);
> > -   u8 mask = 1 << (offset % 8);
> > +   u8 reg, mask;
> > +
> > +   reg = stmpe_gpio_reg(stmpe, STMPE_IDX_GPDR_LSB, offset / 8);
> > +   mask = 1 << (offset % 8);
> >  
> > stmpe_gpio_set(chip, offset, val);
> >  
> > @@ -93,8 +108,10 @@ static int stmpe_gpio_direction_input(struct gpio_chip 
> > *chip,
> >  {
> > struct stmpe_gpio *stmpe_gpio = to_stmpe_gpio(chip);
> > struct stmpe *stmpe = stmpe_gpio->stmpe;
> > -   u8 reg = stmpe->regs[STMPE_IDX_GPDR_LSB] - (offset / 8);
> > -   u8 mask = 1 << (offset % 8);
> > +   u8 reg, mask;
> > +
> > +   reg = stmpe_gpio_reg(stmpe, STMPE_IDX_GPDR_LSB, offset / 8);
> > +   mask = 1 << (offset % 8);
> >  
> > return stmpe_set_bits(stmpe, reg, mask, 0);
> >  }
> > @@ -174,6 +191,7 @@ static void stmpe_gpio_irq_sync_unlock(struct irq_data 
> > *d)
> > [REG_IE]= STMPE_IDX_IEGPIOR_LSB,
> > };
> > int i, j;
> > +   u8 reg;
> >  
> > for (i = 0; i < CACHE_NR_REGS; i++) {
> > /* STMPE801 doesn't have RE and FE registers */
> > @@ -189,7 +207,8 @@ static void stmpe_gpio_irq_sync_unlock(struct irq_data 
> > *d)
> > continue;
> >  
> > stmpe_gpio->oldregs[i][j] = new;
> > -   stmpe_reg_write(stmpe, stmpe->regs[regmap[i]] - j, new);
> > +   reg = stmpe_gpio_reg(stmpe, regmap[i], j);
> > +   stmpe_reg_write(stmpe, reg, new);
> > }
> > }
> >  
> > @@ -229,18 +248,20 @@ static irqreturn_t stmpe_gpio_irq(int irq, void *dev)
> >  {
> > struct stmpe_gpio *stmpe_gpio = dev;
> > struct stmpe *stmpe = stmpe_gpio->stmpe;
> > -   u8 statmsbreg = stmpe->regs[STMPE_IDX_ISGPIOR_MSB];
> > int num_banks = DIV_ROUND_UP(stmpe->num_gpios, 8);
> > u8 status[num_banks];
> > int ret;
> > int i;
> > +   bool lsb = 

Re: [PATCH] Input: synaptics - fix 1->3 contact transition reporting

2013-02-13 Thread Dmitry Torokhov
On Fri, Feb 01, 2013 at 04:29:00PM +0800, Daniel Kurtz wrote:
> Investigating the following gesture highlighted two slight implementation
> errors with choosing which contacts to report in which slot when multiple
> contacts are present:
> 
> Action SGM  AGM (MTB slot:Contact)
> 1. Touch contact 0(0:0)
> 2. Touch contact 1(0:0, 1:1)
> 3. Lift  contact 0(1:1)
> 4. Touch contacts 2,3 (0:2, 1:3)
> 
> In step 4, slot 1 was not being cleared first, which means the same
> tracking ID was being used for reporting both the old contact 1 and the
> new contact 3.  This could result in "drumroll", where the old contact 1
> would appear to suddenly jump to new finger 3 position.
> 
> Similarly, if contacts 2 & 3 are not detected at the same sample, step 4
> is split into two:
> 
> ActionSGM  AGM  (MTB slot:contact)
> 1. Touch contact 0   (0:0)
> 2. Touch contact 1   (0:0, 1:1)
> 3. Lift  contact 0   (1:1)
> 4. Touch contact 2   (0:2, 1:1)
> 5. Touch contact 3   (0:2, 1:3)
> 
> In this case, there was also a bug.  In step 4, when contact 1 moves from
> SGM to AGM and contact 2 is first reported in SGM, slot 0 was actually
> empty.  So slot 0 can be used to report the new SGM (contact 2),
> immediately.  Since it was empty, contact 2 in slot 0 will get a new
> tracking ID.
> 
> Signed-off-by: Daniel Kurtz 

Applied, thank you Daniel.

-- 
Dmitry


Re: [PATCH v2 1/1] Input: mouse: cyapa - Add support for cyapa smbus protocol

2013-02-13 Thread Dmitry Torokhov
On Sun, Feb 10, 2013 at 12:15:40PM -0800, Benson Leung wrote:
> This patch adds support for the Cypress APA Smbus Trackpad type,
> which uses a modified register map that fits within the
> limitations of the smbus protocol.
> 
> Devices that use this protocol include:
> CYTRA-116001-00 - Samsung Series 5 550 Chromebook trackpad
> CYTRA-103002-00 - Acer C7 Chromebook trackpad
> CYTRA-101003-00 - HP Pavilion 14 Chromebook trackpad
> 
> Signed-off-by: Dudley Du 
> Signed-off-by: Benson Leung 
> Reviewed-by: Daniel Kurtz 
> ---
> v2 : Minor style cleanup. Removed unused struct device *dev declarations.
> ---

Applied, thank you Benson.

Henrik, I put you down as "reviewed-by".

Thanks!

-- 
Dmitry


linux-next: manual merge of the akpm tree with the tip tree

2013-02-13 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm tree got a conflict in
kernel/timeconst.pl between commit 63a3f603413f ("timeconst.pl: Eliminate
Perl warning") from the tip tree and commit "timeconst.pl: remove
deprecated defined(@array)" from the akpm tree.

These both fix the same problem, I arbitrarily chose the akpm tree version.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH 01/15] drivers/input: add couple of missing GENERIC_HARDIRQS dependencies

2013-02-13 Thread Dmitry Torokhov
On Wed, Feb 06, 2013 at 05:23:49PM +0100, Heiko Carstens wrote:
> When removing the !S390 dependency from drivers/input/Kconfig a couple
> of drivers don't compile because they have a dependency on GENERIC_HARDIRQS.
> So add the missing dependencies.
> Fixes e.g. this one:
> 
> drivers/input/keyboard/lm8323.c: In function ‘lm8323_suspend’:
> drivers/input/keyboard/lm8323.c:801:2: error: implicit declaration of 
> function ‘irq_set_irq_wake’
>   [-Werror=implicit-function-declaration]
> 
> Cc: Dmitry Torokhov 
> Signed-off-by: Heiko Carstens 

Queued for 3.9, thanks Heiko.

-- 
Dmitry


Re: [PATCH RFC] davinci: poll for sleep completion in resume routine.

2013-02-13 Thread Sekhar Nori
Manish,

On 1/31/2013 2:56 PM, Vishwanathrao Badarkhe, Manish wrote:
> As per OMAP-L138 TRM, Software must poll for
> SLEEPCOMPLETE bit until it is set to 1 before clearing
> SLEEPENABLE bit in DEEPSLEEP register in resume routine.
> Modifications are as per datasheet:
> http://www.ti.com/lit/ug/spruh77a/spruh77a.pdf
> See sections 10.10.2.2 and 11.5.21 for more detailed
> explanation.

Polling for SLEEPCOMPLETE is not required in RTC controlled wake-up
which is the mode currently supported (see section 10.10.2.1 of the
TRM). Polling for SLEEPCOMPLETE is required for external controlled
wake-up which to my knowledge has never been tested. If you have tested
this with external controlled wake-up, then I can consider this patch.
Else, I would like to take it only after externally controlled wake-up
is fully tested/supported instead of taking bits and pieces.

Thanks,
Sekhar

> 
> Tested on da850-evm.
> 
> Signed-off-by: Vishwanathrao Badarkhe, Manish 
> ---
> :100644 100644 d4e9316... 976f096... M  arch/arm/mach-davinci/sleep.S
>  arch/arm/mach-davinci/sleep.S |8 
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/mach-davinci/sleep.S b/arch/arm/mach-davinci/sleep.S
> index d4e9316..976f096 100644
> --- a/arch/arm/mach-davinci/sleep.S
> +++ b/arch/arm/mach-davinci/sleep.S
> @@ -35,6 +35,7 @@
>  #define PLL_LOCK_CYCLES  (PLL_LOCK_TIME * 25)
>  
>  #define DEEPSLEEP_SLEEPENABLE_BIT    BIT(31)
> +#define DEEPSLEEP_SLEEPCOMPLETE_BIT  BIT(30)
>  
>   .text
>  /*
> @@ -110,6 +111,13 @@ ENTRY(davinci_cpu_suspend)
>  
>   /* Wake up from sleep */
>  
> + /* wait for sleep complete */
> +sleep_complete:
> + ldr ip, [r4]
> + and ip, ip, #DEEPSLEEP_SLEEPCOMPLETE_BIT
> + cmp ip, #DEEPSLEEP_SLEEPCOMPLETE_BIT
> + bne sleep_complete
> +
>   /* Clear sleep enable */
>   ldr ip, [r4]
>   bic ip, ip, #DEEPSLEEP_SLEEPENABLE_BIT
> 


Re: [PATCH 0/5] Add P state driver for Intel Core Processors

2013-02-13 Thread Viresh Kumar
On Wed, Feb 13, 2013 at 10:08 PM, Dirk Brandewie
 wrote:
> For the case where both are built-in the load order works my driver uses
> device_initcall() and acpi_cpufreq uses late_initcall().
>
> For the case where both are a module (which I was sure I tested) you are
> right
> I will have to do something.
>
> For now I propose to make my driver built-in only while I sort out the right
> solution for the module build.  Does this seem reasonable to everyone?

Of course, I may be missing something here: why would anybody want to insert
the acpi-cpufreq module when the system supports the pstate driver?

In case they are mutually exclusive, then we can have something like
depends on !ACPI-DRIVER in the kconfig option of pstate driver.
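
To spell out the ordering argument, here is a toy model of it. It is a
sketch under assumptions: only one driver can be registered at a time and a
second registration fails with -EBUSY, device_initcall() is level 6 and
late_initcall() is level 7, and all struct and function names are made up
for the illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of a frequency driver. */
struct driver_sketch {
	const char *name;
	int initcall_level;	/* 6 = device_initcall, 7 = late_initcall */
};

/* Only one driver may own the slot; later attempts fail. */
static int register_driver(const struct driver_sketch **slot,
			   const struct driver_sketch *drv)
{
	if (*slot)
		return -EBUSY;
	*slot = drv;
	return 0;
}

/* Built-in case: initcalls run in level order, whatever the link order. */
static const char *boot_builtin(const struct driver_sketch *drvs, size_t n)
{
	const struct driver_sketch *slot = NULL;

	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (drvs[i].initcall_level == level)
				register_driver(&slot, &drvs[i]);
	return slot ? slot->name : NULL;
}

/* Module case: whoever gets loaded first wins, levels do not matter. */
static const char *load_modules(const struct driver_sketch *drvs, size_t n)
{
	const struct driver_sketch *slot = NULL;

	for (size_t i = 0; i < n; i++)
		register_driver(&slot, &drvs[i]);
	return slot ? slot->name : NULL;
}
```

With acpi-cpufreq listed first at level 7 and the pstate driver second at
level 6, boot_builtin() ends up with the pstate driver and load_modules()
ends up with acpi-cpufreq, which is the module case that still needs a
proper solution.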


[PATCH v2 wq/for-3.9] workqueue: rename cpu_workqueue to pool_workqueue

2013-02-13 Thread Tejun Heo
workqueue has moved away from global_cwqs to worker_pools and with the
scheduled custom worker pools, workqueues will be associated with
pools which don't have anything to do with CPUs.  The workqueue code
went through a significant amount of changes recently and mass renaming
isn't likely to hurt much additionally.  Let's replace 'cpu' with
'pool' so that it reflects the current design.

* s/struct cpu_workqueue_struct/struct pool_workqueue/
* s/cpu_wq/pool_wq/
* s/cwq/pwq/

This patch is purely cosmetic.

Signed-off-by: Tejun Heo 
---

Rebased on top of wq/for-3.9 + two is_chained_work() patches in the
following thread.

 http://thread.gmane.org/gmane.linux.kernel/1441302

Thanks.
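
As background for the alignment comments touched by the rename, here is a
rough sketch of packing flag and color bits into the low bits of an aligned
pwq pointer. The 4 + 4 bit split is an assumption of the sketch, chosen
because it matches the 256-byte alignment and the 15 flush colors mentioned
in the comment; the names are hypothetical and this is not the kernel code.

```c
#include <stdint.h>

enum {
	SKETCH_FLAG_BITS  = 4,	/* assumed: pending, delayed, pwq, linked */
	SKETCH_COLOR_BITS = 4,	/* assumed width of the flush color field */
	SKETCH_LOW_BITS   = SKETCH_FLAG_BITS + SKETCH_COLOR_BITS,
	SKETCH_PWQ_ALIGN  = 1 << SKETCH_LOW_BITS,		/* 256 */
	SKETCH_NR_COLORS  = (1 << SKETCH_COLOR_BITS) - 1,	/* 15 */
};

/* pwq must be aligned to SKETCH_PWQ_ALIGN so its low bits are zero. */
static uintptr_t pack_data(const void *pwq, unsigned int flags,
			   unsigned int color)
{
	return (uintptr_t)pwq |
	       ((uintptr_t)color << SKETCH_FLAG_BITS) |
	       (uintptr_t)flags;
}

static void *data_to_pwq(uintptr_t data)
{
	return (void *)(data & ~(uintptr_t)(SKETCH_PWQ_ALIGN - 1));
}

static unsigned int data_to_color(uintptr_t data)
{
	return (data >> SKETCH_FLAG_BITS) & ((1u << SKETCH_COLOR_BITS) - 1);
}

static unsigned int data_to_flags(uintptr_t data)
{
	return data & ((1u << SKETCH_FLAG_BITS) - 1);
}
```

Eight free low bits need 2^8 = 256-byte alignment, and a 4-bit color field
has 16 values, which leaves 15 usable colors if one value is set aside
(presumably as a "no color" marker, which is again an assumption here).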

 include/linux/workqueue.h|   12 -
 include/trace/events/workqueue.h |   10 
 kernel/workqueue.c   |  433 +++
 kernel/workqueue_internal.h  |2 
 4 files changed, 228 insertions(+), 229 deletions(-)

--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -27,7 +27,7 @@ void delayed_work_timer_fn(unsigned long
 enum {
WORK_STRUCT_PENDING_BIT = 0,/* work item is pending execution */
WORK_STRUCT_DELAYED_BIT = 1,/* work item is delayed */
-   WORK_STRUCT_CWQ_BIT = 2,/* data points to cwq */
+   WORK_STRUCT_PWQ_BIT = 2,/* data points to pwq */
WORK_STRUCT_LINKED_BIT  = 3,/* next work is linked to this one */
 #ifdef CONFIG_DEBUG_OBJECTS_WORK
WORK_STRUCT_STATIC_BIT  = 4,/* static initializer (debugobjects) */
@@ -40,7 +40,7 @@ enum {
 
WORK_STRUCT_PENDING = 1 << WORK_STRUCT_PENDING_BIT,
WORK_STRUCT_DELAYED = 1 << WORK_STRUCT_DELAYED_BIT,
-   WORK_STRUCT_CWQ = 1 << WORK_STRUCT_CWQ_BIT,
+   WORK_STRUCT_PWQ = 1 << WORK_STRUCT_PWQ_BIT,
WORK_STRUCT_LINKED  = 1 << WORK_STRUCT_LINKED_BIT,
 #ifdef CONFIG_DEBUG_OBJECTS_WORK
WORK_STRUCT_STATIC  = 1 << WORK_STRUCT_STATIC_BIT,
@@ -60,14 +60,14 @@ enum {
WORK_CPU_END= NR_CPUS + 1,
 
/*
-* Reserve 7 bits off of cwq pointer w/ debugobjects turned
-* off.  This makes cwqs aligned to 256 bytes and allows 15
-* workqueue flush colors.
+* Reserve 7 bits off of pwq pointer w/ debugobjects turned off.
+* This makes pwqs aligned to 256 bytes and allows 15 workqueue
+* flush colors.
 */
WORK_STRUCT_FLAG_BITS   = WORK_STRUCT_COLOR_SHIFT +
  WORK_STRUCT_COLOR_BITS,
 
-   /* data contains off-queue information when !WORK_STRUCT_CWQ */
+   /* data contains off-queue information when !WORK_STRUCT_PWQ */
WORK_OFFQ_FLAG_BASE = WORK_STRUCT_FLAG_BITS,
 
WORK_OFFQ_CANCELING = (1 << WORK_OFFQ_FLAG_BASE),
--- a/include/trace/events/workqueue.h
+++ b/include/trace/events/workqueue.h
@@ -27,7 +27,7 @@ DECLARE_EVENT_CLASS(workqueue_work,
 /**
  * workqueue_queue_work - called when a work gets queued
  * @req_cpu:   the requested cpu
- * @cwq:   pointer to struct cpu_workqueue_struct
+ * @pwq:   pointer to struct pool_workqueue
  * @work:  pointer to struct work_struct
  *
  * This event occurs when a work is queued immediately or once a
@@ -36,10 +36,10 @@ DECLARE_EVENT_CLASS(workqueue_work,
  */
 TRACE_EVENT(workqueue_queue_work,
 
-   TP_PROTO(unsigned int req_cpu, struct cpu_workqueue_struct *cwq,
+   TP_PROTO(unsigned int req_cpu, struct pool_workqueue *pwq,
 struct work_struct *work),
 
-   TP_ARGS(req_cpu, cwq, work),
+   TP_ARGS(req_cpu, pwq, work),
 
TP_STRUCT__entry(
__field( void *,work)
@@ -52,9 +52,9 @@ TRACE_EVENT(workqueue_queue_work,
TP_fast_assign(
__entry->work   = work;
__entry->function   = work->func;
-   __entry->workqueue  = cwq->wq;
+   __entry->workqueue  = pwq->wq;
__entry->req_cpu= req_cpu;
-   __entry->cpu= cwq->pool->cpu;
+   __entry->cpu= pwq->pool->cpu;
),
 
TP_printk("work struct=%p function=%pf workqueue=%p req_cpu=%u cpu=%u",
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -154,11 +154,12 @@ struct worker_pool {
 } cacheline_aligned_in_smp;
 
 /*
- * The per-CPU workqueue.  The lower WORK_STRUCT_FLAG_BITS of
- * work_struct->data are used for flags and thus cwqs need to be
- * aligned at two's power of the number of flag bits.
+ * The per-pool workqueue.  While queued, the lower WORK_STRUCT_FLAG_BITS
+ * of work_struct->data are used for flags and the remaining high bits
+ * point to the pwq; thus, pwqs need to be aligned at two's power of the
+ * number of flag bits.
  */
-struct cpu_workqueue_struct {
+struct pool_workqueue {
struct worker_pool  *pool;  /* I: the associated pool */
struct workqueue_struct *wq;/* I: the 
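
The comment above describes keeping flag bits in the low bits of an aligned pointer. The following userspace sketch shows the idea only; it is not the kernel code. FLAG_BITS_SKETCH is assumed to be 8 to match the 256-byte alignment mentioned in the header comment, and struct pwq_sketch, pwq_alloc_sketch(), pack_data(), data_to_pwq() and data_to_flags() are names invented for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed for illustration: 8 flag bits, hence 256-byte alignment. */
#define FLAG_BITS_SKETCH 8u
#define FLAG_MASK_SKETCH (((uintptr_t)1 << FLAG_BITS_SKETCH) - 1)

/* Hypothetical stand-in for the per-pool workqueue structure. */
struct pwq_sketch {
	int id;
};

/* Allocate an object whose address has the low FLAG_BITS_SKETCH bits clear. */
static inline struct pwq_sketch *pwq_alloc_sketch(void)
{
	size_t align = (size_t)1 << FLAG_BITS_SKETCH;

	/* C11 aligned_alloc(): size must be a multiple of the alignment. */
	return aligned_alloc(align, align);
}

/* Combine the pointer (high bits) with the flags (low bits). */
static inline uintptr_t pack_data(struct pwq_sketch *pwq, uintptr_t flags)
{
	return (uintptr_t)pwq | (flags & FLAG_MASK_SKETCH);
}

/* Recover the pointer by masking off the flag bits. */
static inline struct pwq_sketch *data_to_pwq(uintptr_t data)
{
	return (struct pwq_sketch *)(data & ~FLAG_MASK_SKETCH);
}

/* Recover the flags by masking off the pointer bits. */
static inline uintptr_t data_to_flags(uintptr_t data)
{
	return data & FLAG_MASK_SKETCH;
}
```

The packing only works because the allocation is aligned; with an unaligned pointer the flag bits would overwrite part of the address.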

Re: [PATCH] driver core: add wait event for deferred probe

2013-02-13 Thread anish singh
On Thu, Feb 14, 2013 at 3:06 AM, Grant Likely  wrote:
> On Tue, 12 Feb 2013 10:52:10 +0800, Haojian Zhuang wrote:
>> On 12 February 2013 07:10, Andrew Morton  wrote:
>> > On Sun, 10 Feb 2013 00:57:57 +0800
>> > Haojian Zhuang  wrote:
>> >
>> >> do_initcalls() could call all driver initialization code in kernel_init
>> >> thread. It means that probe() function will be also called from that
>> >> time. After this, kernel could access console & release __init section
>> >> in the same thread.
>> >>
>> >> But if device probe fails and moves into deferred probe list, a new
>> >> thread is created to reinvoke probe. If the device is serial console,
>> >> kernel has to open console failure & release __init section before
>> >> reinvoking failure. Because there's no synchronization mechanism.
>> >> Now add wait event to synchronize after do_initcalls().
>> >
>> > It sounds like this:
>> >
>> > static int __ref kernel_init(void *unused)
>> > {
>> > kernel_init_freeable();
>> > /* need to finish all async __init code before freeing the memory */
>> > async_synchronize_full();
>> >
>> > is designed to prevent the problem you describe?
>> >
>> It can't prevent the problem that I described. Because deferred_probe()
>> is introduced recently.
>>
>> All synchronization should be finished just after do_initcalls(). Since
>> load_default_modules() is also called in the end of kernel_init_freeable(),
>> I'm not sure that whether I could remove async_synchronize_full()
>> here. So I didn't touch it.
>>
>> >> --- a/init/main.c
>> >> +++ b/init/main.c
>> >> @@ -786,6 +786,7 @@ static void __init do_basic_setup(void)
>> >>   do_ctors();
>> >>   usermodehelper_enable();
>> >>   do_initcalls();
>> >> + wait_for_device_probe();
>> >>  }
>> >
>> > Needs a nice comment here explaining what's going on.
>>
>> No problem. I'll add comment here.
>
> Actually, this approach will create new problems. There is no guarantee
> that a given device will be able to initialize before exiting
> do_basic_setup(). If, for instance, a device depends on a resource
> provided by a module, then it will just keep deferring. In that case
> you've got a hung kernel.
>
> I think what you really want is the following:
>
>  static int deferred_probe_initcall(void)
>  {
> deferred_wq = create_singlethread_workqueue("deferwq");
> if (WARN_ON(!deferred_wq))
> return -ENOMEM;
>
> driver_deferred_probe_enable = true;
> +   deferred_probe_work_func(NULL);
> -   driver_deferred_probe_trigger();
> return 0;
>  }
>  late_initcall(deferred_probe_initcall);
>
> Or something similar. That would guarantee that as many passes as are needed
> (which in practical terms only means a couple) for device probing to
> settle down before exiting the initcall processing. That should achieve
> the effect you desire.
>
> It still masks the __init section issue by making it a lot less likely,
Grant, can you please explain this problem to me? My understanding is below:
If all the detection of devices with their respective drivers is done before
the __init section is freed, then we will not have the problem mentioned.
However, if the driver requests the probing to be deferred, then the __init
section of the deferred driver will not be freed, right?

I am afraid the patch description is a bit cryptic for me, especially
this line: "kernel has to open console failure & release __init section before
reinvoking failure".

> but it does ensure that all of the built-in driver dependency order
> issues are processed before continuing on to userspace.
>
> g.
>
> --
> Grant Likely, B.Sc, P.Eng.
> Secret Lab Technologies, Ltd.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
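
To make the two behaviours discussed above concrete (several probe passes until things settle, and a device that keeps deferring because its resource comes from a module), here is a small userspace simulation. It is only a sketch under stated assumptions: struct dev_sketch, DEP_NONE, DEP_MISSING, probe_one_sketch() and probe_until_settled() are invented for the example and are not driver core API. EPROBE_DEFER_SKETCH mirrors the kernel's EPROBE_DEFER value (517).

```c
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER_SKETCH 517  /* same value as the kernel's EPROBE_DEFER */
#define DEP_NONE    (-1)         /* device has no dependency */
#define DEP_MISSING (-2)         /* resource comes from a module that is not loaded */

/* Hypothetical device model: one optional dependency per device. */
struct dev_sketch {
	int  dep;    /* index of the device this one depends on, or DEP_* */
	bool bound;  /* set once probe has succeeded */
};

/* Probe one device; defer if its dependency is not available yet. */
static inline int probe_one_sketch(struct dev_sketch *devs, size_t i)
{
	int dep = devs[i].dep;

	if (dep == DEP_MISSING)
		return -EPROBE_DEFER_SKETCH;
	if (dep >= 0 && !devs[dep].bound)
		return -EPROBE_DEFER_SKETCH;

	devs[i].bound = true;
	return 0;
}

/*
 * Re-run probing until a whole pass makes no progress.  Stopping on
 * "no progress" rather than "nothing deferred" is what avoids hanging
 * on a device whose dependency never shows up.  Returns the number of
 * devices that are still deferred.
 */
static inline size_t probe_until_settled(struct dev_sketch *devs, size_t n)
{
	bool progress = true;
	size_t pending = n;

	while (progress) {
		progress = false;
		pending = 0;
		for (size_t i = 0; i < n; i++) {
			if (devs[i].bound)
				continue;
			if (probe_one_sketch(devs, i) == 0)
				progress = true;
			else
				pending++;
		}
	}
	return pending;
}
```

With a chain such as 0 -> 1 -> 2 plus one device marked DEP_MISSING, the chain binds after a few passes while the last device stays deferred, and the loop still terminates.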


[PATCH 2/2 wq/for-3.9] workqueue: reimplement is_chained_work() using current_wq_worker()

2013-02-13 Thread Tejun Heo
is_chained_work() was added before current_wq_worker() and implemented
its own ham-fisted way of finding out whether %current is a workqueue
worker - it iterates through all possible workers.

Drop the custom implementation and reimplement using
current_wq_worker().

Signed-off-by: Tejun Heo 
---
 kernel/workqueue.c |   33 -
 1 file changed, 8 insertions(+), 25 deletions(-)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1159,35 +1159,18 @@ static void insert_work(struct cpu_workq
 
 /*
  * Test whether @work is being queued from another work executing on the
- * same workqueue.  This is rather expensive and should only be used from
- * cold paths.
+ * same workqueue.
  */
 static bool is_chained_work(struct workqueue_struct *wq)
 {
-   unsigned long flags;
-   unsigned int cpu;
+   struct worker *worker;
 
-   for_each_cwq_cpu(cpu, wq) {
-   struct cpu_workqueue_struct *cwq = get_cwq(cpu, wq);
-   struct worker_pool *pool = cwq->pool;
-   struct worker *worker;
-   struct hlist_node *pos;
-   int i;
-
-   spin_lock_irqsave(&pool->lock, flags);
-   for_each_busy_worker(worker, i, pos, pool) {
-   if (worker->task != current)
-   continue;
-   spin_unlock_irqrestore(&pool->lock, flags);
-   /*
-* I'm @worker, no locking necessary.  See if @work
-* is headed to the same workqueue.
-*/
-   return worker->current_cwq->wq == wq;
-   }
-   spin_unlock_irqrestore(&pool->lock, flags);
-   }
-   return false;
+   worker = current_wq_worker();
+   /*
+* Return %true iff I'm a worker executing a work item on @wq.  If
+* I'm @worker, it's safe to dereference it without locking.
+*/
+   return worker && worker->current_cwq->wq == wq;
 }
 
 static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
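
The simplification in this patch can be pictured in userspace as follows. This is a sketch, not the kernel implementation: the thread-local pointer this_worker_sketch stands in for whatever current_wq_worker() consults, and struct wq_sketch, struct worker_sketch and is_chained_work_sketch() are hypothetical names. The point is that once "which worker am I" is a direct lookup, no scan over pools and no locking is needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the workqueue and worker structures. */
struct wq_sketch {
	const char *name;
};

struct worker_sketch {
	struct wq_sketch *current_wq;  /* workqueue of the item being executed */
};

/* NULL unless the calling thread is a worker (assumed per-thread state). */
static _Thread_local struct worker_sketch *this_worker_sketch;

/* Stand-in for current_wq_worker(): a direct lookup instead of a search. */
static inline struct worker_sketch *current_worker_sketch(void)
{
	return this_worker_sketch;
}

/* True iff the caller is a worker currently executing an item on @wq. */
static inline bool is_chained_work_sketch(struct wq_sketch *wq)
{
	struct worker_sketch *worker = current_worker_sketch();

	/* The worker is the caller itself, so reading its fields is safe. */
	return worker && worker->current_wq == wq;
}
```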


[PATCH 1/2 wq/for-3.9] workqueue: fix is_chained_work() regression

2013-02-13 Thread Tejun Heo
c9e7cf273f ("workqueue: move busy_hash from global_cwq to
worker_pool") incorrectly converted is_chained_work() to use
get_gcwq() inside for_each_gcwq_cpu() while removing get_gcwq().

As cwq might not exist for all possible workqueue CPUs, @cwq can be
NULL and the following cwq dereferences can lead to oops.

Fix it by using for_each_cwq_cpu() instead, which is the better one to
use anyway as we only need to check pools that the wq is associated
with.

Signed-off-by: Tejun Heo 
---
 kernel/workqueue.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1167,7 +1167,7 @@ static bool is_chained_work(struct workq
unsigned long flags;
unsigned int cpu;
 
-   for_each_wq_cpu(cpu) {
+   for_each_cwq_cpu(cpu, wq) {
struct cpu_workqueue_struct *cwq = get_cwq(cpu, wq);
struct worker_pool *pool = cwq->pool;
struct worker *worker;


Re: [PATCH 1/1] VSOCK: Introduce VM Sockets

2013-02-13 Thread Andy King
> I've seen you have a notify_ops in the vmci bits.  Do you have different
> notify ops depending on socket type or something?  Does it make sense to
> move the notify ops ptr into "struct vsock_sock" maybe?

The notify stuff only applies to STREAMs.  However, we have two different
notify impls, one for legacy ESX and one for newer, and we figure out at
runtime which protocol we're using with the hypervisor and set the
callbacks appropriately.  The difference between the two is that the
newer one is much smarter and knows not to signal (the peer) quite so much,
i.e., it has some basic but sensible flow-control, which improves
performance quite a bit.  Again, that might not make any sense at all
for virtio.  Do you need to signal when you enqueue to a ring?  And is
there coalescing?  Dunno...

> And can we make it optional please (i.e. allow the function pointers to
> be NULL)?

They were originally allowed to be NULL, but I changed it in the last
round of patches while moving them into the transport, since I disliked
the NULL checks so much.  I can put them back, but that's a bigger
change, and I'm not sure we want to push large patches to Dave right
now :)

> Which problem are you trying to tackle with the notifications?

It's to do with signaling the peer, or more appropriately, trying to
avoid signaling the peer when possible.  The naive impl. is to signal
every time we enqueue or dequeue data (into our VMCI queuepairs).
But signaling is slow, since it involves a world exit, so we prefer
not to.  Which means we need to keep track of rate of flow and figure
out when we should and should not, and that's what all the notification
stuff does.  It's...ugly...

> > For the VMCI transport, it indicates if the underlying queuepair is
> > still around (i.e., make sure we haven't torn it down while sleeping
> > in a blocking send or receive).  Perhaps it's not the best name?
> 
> How you'd hit that?  Peer closing the socket while sleeping?  Other
> thread closing the socket while sleeping?  Both?
> 
> I think a state field in struct vsock_sock would be a better solution here.

Hrm, lemme think about this one.

> 
> >> What is *_allow?
> > 
> > It's very basic filtering.  We have specific addresses that we don't
> > allow, and we look for them in the allow() functions.  You can just
> > return true if you like.
> 
> Can we make those calls optional too please?

Sure.

> The notify_*_init could return a opaque pointer instead.  But then
> you'll need a notify_*_free too.  And you can't place it on the stack
> and thus have a allocation in the hot path, which I guess you are trying
> to avoid in the first place.

Avoiding allocation is good, but at the same time, I wonder if it's not
a big deal compared with the time spent copying out of the buffers, in
which case it won't matter.  So maybe we can do this.

Thanks!
- Andy
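
For reference, making the callbacks optional could look roughly like the sketch below, with the NULL checks kept in one place so callers stay clean. None of this is the actual VSOCK transport interface: struct transport_ops_sketch, its two members, the wrapper helpers and deny_cid_zero_sketch() are invented for illustration, and the defaults (0 for a missing notify hook, "allow" for a missing filter) are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transport callbacks; either pointer may be NULL. */
struct transport_ops_sketch {
	int  (*notify_post_enqueue)(void *ctx, size_t written);
	bool (*stream_allow)(unsigned int cid, unsigned int port);
};

/* A missing notify hook means "nothing to do", reported as success. */
static inline int call_notify_post_enqueue(const struct transport_ops_sketch *ops,
					   void *ctx, size_t written)
{
	return ops->notify_post_enqueue ?
		ops->notify_post_enqueue(ctx, written) : 0;
}

/* A missing filter means every address is allowed. */
static inline bool call_stream_allow(const struct transport_ops_sketch *ops,
				     unsigned int cid, unsigned int port)
{
	return ops->stream_allow ? ops->stream_allow(cid, port) : true;
}

/* Example filter used only to exercise the wrapper: reject cid 0. */
static inline bool deny_cid_zero_sketch(unsigned int cid, unsigned int port)
{
	(void)port;
	return cid != 0;
}
```

A transport that needs no filtering or notification would then leave both members NULL instead of supplying stub functions.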


Re: [patch v4 05/18] sched: quicker balancing on fork/exec/wake

2013-02-13 Thread Alex Shi
On 02/12/2013 06:22 PM, Peter Zijlstra wrote:
> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>> I guess the bottom-up cpu search in the domain tree comes from
>> commit 3dbd5342074a1e (sched: multilevel sbe sbf); its purpose is
>> balancing tasks over all domain levels.
>>
>> This balancing costs too much if there are many domains/groups in a
>> large system.
>>
>> If we remove this code, we will get quick fork/exec/wake with a
>> similar balancing result among the whole system.
>>
>> This patch improves hackbench performance by 10+% on my 4-socket
>> SNB machines and by about 3% on 2-socket servers.
>>
>>
> Numbers be groovy.. still I'd like a little more on the behavioural
> change. Expand on what exactly is lost by this change so that if we
> later find a regression we have a better idea of what and how.
> 
> For instance, note how find_idlest_group() isn't symmetric wrt
> local_group. So by not doing the domain iteration we change things.
> 
> Now, it might well be that all this is somewhat overkill as it is, but
> should we then not replace all of it with a simple min search over all
> eligible cpus; that would be a real clean up.
>  

Um, will think about this again..
> 


-- 
Thanks
Alex


Re: [PATCH 4/4] gpiolib: rename local offset variables to "hwgpio"

2013-02-13 Thread Alexandre Courbot
On Thu, Feb 14, 2013 at 7:54 AM, Ryan Mallon  wrote:
> Nitpicky - Is it accurate to call these hardware numbers? Don't some of
> the platforms remap the gpio numbers? These numbers may not match
> against the platform's datasheet for example.

I'm following a suggestion by Grant and Linus W. here. This is closer
to existing convention in e.g. irqdomain.

Also at that level the GPIO numbers are local to the GPIO controller,
i.e. they are not supposed to be the GPIO numbers used by consumers
(or the datasheet for that instance).

Alex.


[PATCH 11/16] zcache/debug: Coalesce all debug under CONFIG_ZCACHE_DEBUG

2013-02-13 Thread Konrad Rzeszutek Wilk
and also define this extra attribute in the Kconfig entry.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/Kconfig   | 8 
 drivers/staging/zcache/debug.c   | 2 +-
 drivers/staging/zcache/zcache-main.c | 6 +++---
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/zcache/Kconfig b/drivers/staging/zcache/Kconfig
index 7358270..2da6cc4 100644
--- a/drivers/staging/zcache/Kconfig
+++ b/drivers/staging/zcache/Kconfig
@@ -10,6 +10,14 @@ config ZCACHE
  memory to store clean page cache pages and swap in RAM,
  providing a noticeable reduction in disk I/O.
 
+config ZCACHE_DEBUG
+   bool "Enable debug statistics"
+   depends on DEBUG_FS && ZCACHE
+   default n
+   help
+ This is used to provide a debugfs directory with counters of
+ how zcache is doing. You probably want to set this to 'N'.
+
 config RAMSTER
bool "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
depends on CONFIGFS_FS=y && SYSFS=y && !HIGHMEM && ZCACHE=y
diff --git a/drivers/staging/zcache/debug.c b/drivers/staging/zcache/debug.c
index cf19adc..e951c64 100644
--- a/drivers/staging/zcache/debug.c
+++ b/drivers/staging/zcache/debug.c
@@ -1,7 +1,7 @@
 #include 
 #include "debug.h"
 
-#ifdef CONFIG_DEBUG_FS
+#ifdef CONFIG_ZCACHE_DEBUG
 #include 
 
 #define ATTR(x)  { .name = #x, .val = _##x, }
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 4b9ee7f..7c0fda4 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -306,7 +306,7 @@ static void zcache_free_page(struct page *page)
max_pageframes = curr_pageframes;
if (curr_pageframes < min_pageframes)
min_pageframes = curr_pageframes;
-#ifdef ZCACHE_DEBUG
+#ifdef CONFIG_ZCACHE_DEBUG
if (curr_pageframes > 2L || curr_pageframes < -2L) {
/* pr_info here */
}
@@ -1774,7 +1774,7 @@ static int __init zcache_init(void)
old_ops = zcache_cleancache_register_ops();
pr_info("%s: cleancache enabled using kernel transcendent "
"memory and compression buddies\n", namestr);
-#ifdef ZCACHE_DEBUG
+#ifdef CONFIG_ZCACHE_DEBUG
pr_info("%s: cleancache: ignorenonactive = %d\n",
namestr, !disable_cleancache_ignore_nonactive);
 #endif
@@ -1789,7 +1789,7 @@ static int __init zcache_init(void)
frontswap_tmem_exclusive_gets(true);
pr_info("%s: frontswap enabled using kernel transcendent "
"memory and compression buddies\n", namestr);
-#ifdef ZCACHE_DEBUG
+#ifdef CONFIG_ZCACHE_DEBUG
pr_info("%s: frontswap: excl gets = %d active only = %d\n",
namestr, frontswap_has_exclusive_gets,
!disable_frontswap_ignore_nonactive);
-- 
1.8.0.2
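
As a generic illustration of gating debug statistics behind a single compile-time switch, so that the accessors either update a counter or compile to nothing, consider the sketch below. ZCACHE_DEBUG_SKETCH is a made-up macro standing in for a Kconfig symbol, and the *_sketch counter and accessors are hypothetical; this is not the zcache code.

```c
/*
 * Comment this define out to compile the counters away.  It is an
 * assumed toggle that plays the role of a Kconfig symbol here.
 */
#define ZCACHE_DEBUG_SKETCH 1

#ifdef ZCACHE_DEBUG_SKETCH
/* Debug build: the accessors really touch the counter. */
static long obj_count_sketch;

static inline void inc_obj_count_sketch(void) { obj_count_sketch++; }
static inline void dec_obj_count_sketch(void) { obj_count_sketch--; }
static inline long read_obj_count_sketch(void) { return obj_count_sketch; }
#else
/* Non-debug build: same names, empty bodies, no storage. */
static inline void inc_obj_count_sketch(void) { }
static inline void dec_obj_count_sketch(void) { }
static inline long read_obj_count_sketch(void) { return 0; }
#endif
```

Callers use the same accessor names in both configurations, which keeps #ifdef blocks out of the main code paths.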



Re: [patch v4 11/18] sched: log the cpu utilization at rq

2013-02-13 Thread Alex Shi
On 02/12/2013 06:39 PM, Peter Zijlstra wrote:
> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>>
>> The cpu's utilization measures how busy the cpu is.
>> util = cpu_rq(cpu)->avg.runnable_avg_sum
>> / cpu_rq(cpu)->avg.runnable_avg_period;
>>
>> Since the util is no more than 1, we use its percentage value in later
>> calculations, and set FULL_UTIL to 100%.
>>
>> In the later power aware scheduling, we care about how busy the cpu
>> is, not how much its load weighs. Power consumption is more
>> related to cpu busy time than to the load weight.
> 
> I think we can make that argument in general; that is irrespective of
> the actual policy. We simply never had anything better to go with.
> 
> So please clarify why you think this only applies to power aware
> scheduling.

Um, the rq->util is general. It can be used in any other place if
needed; it is not specific to power aware scheduling.
> 


-- 
Thanks
Alex
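
The utilization formula quoted above, expressed as a percentage in integer arithmetic, might look like the following sketch. rq_util_percent_sketch() is a hypothetical helper, not a function from the patch set; the zero-period guard and the clamp to FULL_UTIL_SKETCH are assumptions added so the example is well defined for any input.

```c
#include <stdint.h>

#define FULL_UTIL_SKETCH 100u  /* 100%, as in the quoted description */

/* util% = runnable_avg_sum * 100 / runnable_avg_period */
static inline unsigned int rq_util_percent_sketch(uint32_t runnable_avg_sum,
						  uint32_t runnable_avg_period)
{
	uint64_t pct;

	/* Assumed guard: no tracked period yet means no utilization. */
	if (runnable_avg_period == 0)
		return 0;

	/* Multiply first, in 64 bits, to keep precision without overflow. */
	pct = (uint64_t)runnable_avg_sum * FULL_UTIL_SKETCH / runnable_avg_period;

	/* Assumed clamp, since the ratio is described as no more than 1. */
	return pct > FULL_UTIL_SKETCH ? FULL_UTIL_SKETCH : (unsigned int)pct;
}
```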


[PATCH 08/16] zcache/debug: Use an array to initialize/use debugfs attributes.

2013-02-13 Thread Konrad Rzeszutek Wilk
It makes it neater and also allows us to piggyback on that
in the zcache_dump function.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/debug.c | 163 +
 1 file changed, 51 insertions(+), 112 deletions(-)

diff --git a/drivers/staging/zcache/debug.c b/drivers/staging/zcache/debug.c
index 622d5f3..cf19adc 100644
--- a/drivers/staging/zcache/debug.c
+++ b/drivers/staging/zcache/debug.c
@@ -3,130 +3,69 @@
 
 #ifdef CONFIG_DEBUG_FS
 #include 
-#define zdfs debugfs_create_size_t
-#define zdfs64 debugfs_create_u64
+
+#define ATTR(x)  { .name = #x, .val = _##x, }
+static struct debug_entry {
+   const char *name;
+   ssize_t *val;
+} attrs[] = {
+   ATTR(obj_count), ATTR(obj_count_max),
+   ATTR(objnode_count), ATTR(objnode_count_max),
+   ATTR(flush_total), ATTR(flush_found),
+   ATTR(flobj_total), ATTR(flobj_found),
+   ATTR(failed_eph_puts), ATTR(failed_pers_puts),
+   ATTR(failed_getfreepages), ATTR(failed_alloc),
+   ATTR(put_to_flush),
+   ATTR(compress_poor), ATTR(mean_compress_poor),
+   ATTR(eph_ate_tail), ATTR(eph_ate_tail_failed),
+   ATTR(pers_ate_eph), ATTR(pers_ate_eph_failed),
+   ATTR(evicted_eph_zpages), ATTR(evicted_eph_pageframes),
+   ATTR(eph_pageframes), ATTR(eph_pageframes_max),
+   ATTR(eph_zpages), ATTR(eph_zpages_max),
+   ATTR(pers_zpages), ATTR(pers_zpages_max),
+   ATTR(last_active_file_pageframes),
+   ATTR(last_inactive_file_pageframes),
+   ATTR(last_active_anon_pageframes),
+   ATTR(last_inactive_anon_pageframes),
+   ATTR(eph_nonactive_puts_ignored),
+   ATTR(pers_nonactive_puts_ignored),
+#ifdef CONFIG_ZCACHE_WRITEBACK
+   ATTR(zcache_outstanding_writeback_pages),
+   ATTR(zcache_writtenback_pages),
+#endif
+};
+#undef ATTR
 int zcache_debugfs_init(void)
 {
+   unsigned int i;
struct dentry *root = debugfs_create_dir("zcache", NULL);
if (root == NULL)
return -ENXIO;
 
-   zdfs("obj_count", S_IRUGO, root, _obj_count);
-   zdfs("obj_count_max", S_IRUGO, root, _obj_count_max);
-   zdfs("objnode_count", S_IRUGO, root, _objnode_count);
-   zdfs("objnode_count_max", S_IRUGO, root, _objnode_count_max);
-   zdfs("flush_total", S_IRUGO, root, _flush_total);
-   zdfs("flush_found", S_IRUGO, root, _flush_found);
-   zdfs("flobj_total", S_IRUGO, root, _flobj_total);
-   zdfs("flobj_found", S_IRUGO, root, _flobj_found);
-   zdfs("failed_eph_puts", S_IRUGO, root, _failed_eph_puts);
-   zdfs("failed_pers_puts", S_IRUGO, root, _failed_pers_puts);
-   zdfs("failed_get_free_pages", S_IRUGO, root,
-   _failed_getfreepages);
-   zdfs("failed_alloc", S_IRUGO, root, _failed_alloc);
-   zdfs("put_to_flush", S_IRUGO, root, _put_to_flush);
-   zdfs("compress_poor", S_IRUGO, root, _compress_poor);
-   zdfs("mean_compress_poor", S_IRUGO, root, _mean_compress_poor);
-   zdfs("eph_ate_tail", S_IRUGO, root, _eph_ate_tail);
-   zdfs("eph_ate_tail_failed", S_IRUGO, root, _eph_ate_tail_failed);
-   zdfs("pers_ate_eph", S_IRUGO, root, _pers_ate_eph);
-   zdfs("pers_ate_eph_failed", S_IRUGO, root, _pers_ate_eph_failed);
-   zdfs("evicted_eph_zpages", S_IRUGO, root, _evicted_eph_zpages);
-   zdfs("evicted_eph_pageframes", S_IRUGO, root,
-   _evicted_eph_pageframes);
-   zdfs("eph_pageframes", S_IRUGO, root, _eph_pageframes);
-   zdfs("eph_pageframes_max", S_IRUGO, root, _eph_pageframes_max);
-   zdfs("pers_pageframes", S_IRUGO, root, _pers_pageframes);
-   zdfs("pers_pageframes_max", S_IRUGO, root, _pers_pageframes_max);
-   zdfs("eph_zpages", S_IRUGO, root, _eph_zpages);
-   zdfs("eph_zpages_max", S_IRUGO, root, _eph_zpages_max);
-   zdfs("pers_zpages", S_IRUGO, root, _pers_zpages);
-   zdfs("pers_zpages_max", S_IRUGO, root, _pers_zpages_max);
-   zdfs("last_active_file_pageframes", S_IRUGO, root,
-   _last_active_file_pageframes);
-   zdfs("last_inactive_file_pageframes", S_IRUGO, root,
-   _last_inactive_file_pageframes);
-   zdfs("last_active_anon_pageframes", S_IRUGO, root,
-   _last_active_anon_pageframes);
-   zdfs("last_inactive_anon_pageframes", S_IRUGO, root,
-   _last_inactive_anon_pageframes);
-   zdfs("eph_nonactive_puts_ignored", S_IRUGO, root,
-   _eph_nonactive_puts_ignored);
-   zdfs("pers_nonactive_puts_ignored", S_IRUGO, root,
-   _pers_nonactive_puts_ignored);
-   zdfs64("eph_zbytes", S_IRUGO, root, _eph_zbytes);
-   zdfs64("eph_zbytes_max", S_IRUGO, root, _eph_zbytes_max);
-   zdfs64("pers_zbytes", S_IRUGO, root, _pers_zbytes);
-   zdfs64("pers_zbytes_max", S_IRUGO, root, _pers_zbytes_max);
-   
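
The table-driven approach this patch switches to can be shown in isolation. In the sketch below a macro builds { name, pointer } pairs via stringification and token pasting, and a plain loop walks the table. The demo_ prefix, the two counters, struct debug_entry_sketch and attr_lookup_sketch() are invented for the example; a real user would call its registration or dump routine inside the loop instead of doing a lookup.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* Two hypothetical counters to populate the table. */
static ssize_t demo_obj_count = 3;
static ssize_t demo_flush_total = 7;

/* #x yields the name string, demo_##x names the backing variable. */
#define ATTR_SKETCH(x) { .name = #x, .val = &demo_##x, }
static const struct debug_entry_sketch {
	const char *name;
	ssize_t *val;
} attrs_sketch[] = {
	ATTR_SKETCH(obj_count),
	ATTR_SKETCH(flush_total),
};
#undef ATTR_SKETCH

#define ATTRS_SKETCH_COUNT (sizeof(attrs_sketch) / sizeof(attrs_sketch[0]))

/* One loop serves every entry; adding a counter is a one-line change. */
static inline const ssize_t *attr_lookup_sketch(const char *name)
{
	for (size_t i = 0; i < ATTRS_SKETCH_COUNT; i++)
		if (strcmp(attrs_sketch[i].name, name) == 0)
			return attrs_sketch[i].val;
	return NULL;
}
```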

[PATCH 10/16] zcache: Module license is defined twice.

2013-02-13 Thread Konrad Rzeszutek Wilk
The other (same license) is at the end of the file.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 059c0f2..4b9ee7f 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -73,8 +73,6 @@ static char *namestr __read_mostly = "zcache";
 #define ZCACHE_GFP_MASK \
(__GFP_FS | __GFP_NORETRY | __GFP_NOWARN | __GFP_NOMEMALLOC)
 
-MODULE_LICENSE("GPL");
-
 /* crypto API for zcache  */
 #define ZCACHE_COMP_NAME_SZ CRYPTO_MAX_ALG_NAME
 static char zcache_comp_name[ZCACHE_COMP_NAME_SZ] __read_mostly;
-- 
1.8.0.2



Re: [PATCH 1/1] VSOCK: Introduce VM Sockets

2013-02-13 Thread Andy King
> > +   if (protocol)
> > +   return -EPROTONOSUPPORT;
> > +
> 
> IMO protocol == PF_VSOCK should not get rejected here.

Agreed, let me fix that too.

Thanks!
- Andy
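
The agreed change amounts to accepting both 0 and the family's own number as the protocol argument. A userspace sketch of such a check follows; vsock_check_protocol_sketch() is a hypothetical helper, and PF_VSOCK_SKETCH is a placeholder constant (40 is assumed here) rather than a value taken from the patch.

```c
#include <errno.h>

/* Placeholder for the protocol family number; the value is an assumption. */
#define PF_VSOCK_SKETCH 40

/* Accept protocol 0 (default) or the family itself; reject anything else. */
static inline int vsock_check_protocol_sketch(int protocol)
{
	if (protocol && protocol != PF_VSOCK_SKETCH)
		return -EPROTONOSUPPORT;
	return 0;
}
```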


Re: [patch v4 08/18] Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"

2013-02-13 Thread Preeti U Murthy
Hi everyone,

On 02/13/2013 09:15 PM, Paul Turner wrote:
> On Wed, Feb 13, 2013 at 7:23 AM, Alex Shi  wrote:
>> On 02/12/2013 06:27 PM, Peter Zijlstra wrote:
>>> On Thu, 2013-01-24 at 11:06 +0800, Alex Shi wrote:
>>>> Remove CONFIG_FAIR_GROUP_SCHED that covers the runnable info, then
>>>> we can use runnable load variables.
>>>>
>>> It would be nice if we could quantify the performance hit of doing so.
>>> Haven't yet looked at later patches to see if we remove anything to
>>> off-set this.
>>>
>>
>> In our rough testing, there were no clear performance changes.
>>
> 
> I'd personally like this to go with a series that actually does
> something with it.
> 
> There's been a few proposals floating around on _how_ to do this; but
> the challenge is in getting it stable enough that all of the wake-up
> balancing does not totally perforate your stability gains into the
> noise.  select_idle_sibling really is your nemesis here.
> 
> It's a small enough patch that it can go at the head of any such
> series (and indeed; it was originally structured to make such a patch
> rather explicit.)
> 
>> --
>> Thanks
>> Alex
> 

Paul, what exactly do you mean by select_idle_sibling() being our nemesis
here? What we observed through our experiments was that:
1. With the per entity load tracking (runnable_load_avg) in load
balancing, the load is distributed appropriately across the cpus.
2. However, when a task sleeps and wakes up, select_idle_sibling() searches
for the idlest group top to bottom. If a suitable candidate is not
found, it wakes up the task on the prev_cpu/waker_cpu. This would increase
the runqueue size and load of prev_cpu/waker_cpu respectively.
3. The load balancer would then come to the rescue and redistribute the load.

As a consequence,

*The primary observation was that there is no performance degradation
with the integration of per entity load tracking into the load balancer
but there was a good increase in the number of migrations*. This, as I
see it, is due to points 2 and 3 above. Is this what you call
the nemesis? OR

select_idle_sibling() does a top to bottom search of the chosen domain
for an idlest group and is very likely to spread the waking task to a
far off group, in case of underutilized systems. This would prove costly
for the software buddies in finding each other due to the time taken for
the search and the possible spreading of the software buddy tasks. Is
this what you call the nemesis?

Another approach to remove the above two nemeses, if they are so, would be
to use blocked_load+runnable_load for balancing. But when waking up a
task, use select_idle_sibling() only to search the L2 cache domains for
an idlest group. If unsuccessful, return the prev_cpu which has already
accounted for the task in the blocked_load; hence this move would not
increase its load. Would you recommend going in this direction?

Thank you

Regards
Preeti U Murthy



[PATCH 14/16] zcache/zbud: Provide the accessory functions for counter decrease.

2013-02-13 Thread Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zbud.c | 38 --
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/zcache/zbud.c b/drivers/staging/zcache/zbud.c
index cff596c..e139cd6 100644
--- a/drivers/staging/zcache/zbud.c
+++ b/drivers/staging/zcache/zbud.c
@@ -321,6 +321,16 @@ static inline void inc_zbud_eph_unbuddied_count(void) { zbud_eph_unbuddied_count
 static inline void inc_zbud_pers_unbuddied_count(void) { zbud_pers_unbuddied_count++; };
 static inline void inc_zbud_eph_zombie_count(void) { zbud_eph_zombie_count++; };
 static inline void inc_zbud_pers_zombie_count(void) { zbud_pers_zombie_count++; };
+static inline void dec_zbud_eph_pageframes(void) { zbud_eph_pageframes--; };
+static inline void dec_zbud_pers_pageframes(void) { zbud_pers_pageframes--; };
+static inline void dec_zbud_eph_zpages(void) { zbud_eph_zpages--; };
+static inline void dec_zbud_pers_zpages(void) { zbud_pers_zpages--; };
+static inline void dec_zbud_eph_zbytes(ssize_t bytes) { zbud_eph_zbytes -= bytes; };
+static inline void dec_zbud_pers_zbytes(ssize_t bytes) { zbud_pers_zbytes -= bytes; };
+static inline void dec_zbud_eph_buddied_count(void) { zbud_eph_buddied_count--; };
+static inline void dec_zbud_pers_buddied_count(void) { zbud_pers_buddied_count--; };
+static inline void dec_zbud_eph_unbuddied_count(void) { zbud_eph_unbuddied_count--; };
+static inline void dec_zbud_pers_unbuddied_count(void) { zbud_pers_unbuddied_count--; };
 static atomic_t zbud_eph_zombie_atomic;
 static atomic_t zbud_pers_zombie_atomic;
 
@@ -420,9 +430,9 @@ static inline struct page *zbud_unuse_zbudpage(struct zbudpage *zbudpage,
BUG_ON(zbudpage_is_dying(zbudpage));
BUG_ON(zbudpage_is_zombie(zbudpage));
if (eph)
-   zbud_eph_pageframes--;
+   dec_zbud_eph_pageframes();
else
-   zbud_pers_pageframes--;
+   dec_zbud_pers_pageframes();
zbudpage_spin_unlock(zbudpage);
reset_page_mapcount(page);
init_page_count(page);
@@ -445,11 +455,11 @@ static inline void zbud_unuse_zbud(struct zbudpage *zbudpage,
zbudpage->zbud1_size = 0;
}
if (eph) {
-   zbud_eph_zbytes -= size;
-   zbud_eph_zpages--;
+   dec_zbud_eph_zbytes(size);
+   dec_zbud_eph_zpages();
} else {
-   zbud_pers_zbytes -= size;
-   zbud_pers_zpages--;
+   dec_zbud_pers_zbytes(size);
+   dec_zbud_pers_zpages();
}
 }
 
@@ -610,9 +620,9 @@ struct page *zbud_free_and_delist(struct zbudref *zref, bool eph,
list_del_init(>lru);
spin_unlock(lists_lock);
if (eph)
-   zbud_eph_unbuddied_count--;
+   dec_zbud_eph_unbuddied_count();
else
-   zbud_pers_unbuddied_count--;
+   dec_zbud_pers_unbuddied_count();
page = zbud_unuse_zbudpage(zbudpage, eph);
} else { /* was buddied: move remaining buddy to unbuddied list */
chunks = zbud_size_to_chunks(other_bud_size) ;
@@ -622,11 +632,11 @@ struct page *zbud_free_and_delist(struct zbudref *zref, bool eph,
unbud[chunks].count++;
}
if (eph) {
-   zbud_eph_buddied_count--;
+   dec_zbud_eph_buddied_count();
inc_zbud_eph_unbuddied_count();
} else {
inc_zbud_pers_unbuddied_count();
-   zbud_pers_buddied_count--;
+   dec_zbud_pers_buddied_count();
}
/* don't mess with lru, no need to move it */
zbudpage_spin_unlock(zbudpage);
@@ -683,7 +693,7 @@ found_unbuddied:
if (eph) {
list_add_tail(>budlist, _eph_buddied_list);
unbud[found_good_buddy].count--;
-   zbud_eph_unbuddied_count--;
+   dec_zbud_eph_unbuddied_count();
inc_zbud_eph_buddied_count();
/* "promote" raw zbudpage to most-recently-used */
list_del_init(>lru);
@@ -691,7 +701,7 @@ found_unbuddied:
} else {
list_add_tail(>budlist, _pers_buddied_list);
unbud[found_good_buddy].count--;
-   zbud_pers_unbuddied_count--;
+   dec_zbud_pers_unbuddied_count();
inc_zbud_pers_buddied_count();
/* "promote" raw zbudpage to most-recently-used */
list_del_init(>lru);
@@ -973,9 +983,9 @@ evict_page:
spin_unlock(_eph_lists_lock);
inc_zbud_eph_evicted_pageframes();
if (*zpages == 1)
-   zbud_eph_unbuddied_count--;
+   dec_zbud_eph_unbuddied_count();
else
-   

[PATCH 04/16] zcache: The last of the atomic reads has now an accessory function.

2013-02-13 Thread Konrad Rzeszutek Wilk
And now we can move the code ([inc|dec]_zcache_[*]) to their own file
with a header to make them nops or feed in debugfs.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 50a408a..b71e61b 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -253,6 +253,14 @@ static inline void dec_zcache_pers_zpages(unsigned zpages)
 {
zcache_pers_zpages = atomic_sub_return(zpages, &zcache_pers_zpages_atomic);
 }
+
+static inline unsigned long curr_pageframes_count(void)
+{
+   return zcache_pageframes_alloced -
+   atomic_read(&zcache_pageframes_freed_atomic) -
+   atomic_read(&zcache_eph_pageframes_atomic) -
+   atomic_read(&zcache_pers_pageframes_atomic);
+};
 /* but for the rest of these, counting races are ok */
 static unsigned long zcache_flush_total;
 static unsigned long zcache_flush_found;
@@ -565,10 +573,7 @@ static void zcache_free_page(struct page *page)
BUG();
__free_page(page);
inc_zcache_pageframes_freed();
-   curr_pageframes = zcache_pageframes_alloced -
-   atomic_read(&zcache_pageframes_freed_atomic) -
-   atomic_read(&zcache_eph_pageframes_atomic) -
-   atomic_read(&zcache_pers_pageframes_atomic);
+   curr_pageframes = curr_pageframes_count();
if (curr_pageframes > max_pageframes)
max_pageframes = curr_pageframes;
if (curr_pageframes < min_pageframes)
-- 
1.8.0.2
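
For readers following along outside the kernel tree, the helper introduced by this patch can be sketched in userspace C. This is only an illustration under assumptions: C11 <stdatomic.h> stands in for the kernel's atomic_t API, and the shortened variable names plus note_page_freed() and pageframes_demo() are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins for the zcache counters. */
static long pageframes_alloced;
static atomic_long pageframes_freed;
static atomic_long eph_pageframes;
static atomic_long pers_pageframes;
static long max_pageframes, min_pageframes;

/* Allocated frames minus freed, ephemeral and persistent frames. */
static inline long curr_pageframes_count(void)
{
	return pageframes_alloced -
		atomic_load(&pageframes_freed) -
		atomic_load(&eph_pageframes) -
		atomic_load(&pers_pageframes);
}

/* The caller goes through the helper instead of open-coding the reads. */
void note_page_freed(void)
{
	long curr;

	atomic_fetch_add(&pageframes_freed, 1);
	curr = curr_pageframes_count();
	if (curr > max_pageframes)
		max_pageframes = curr;
	if (curr < min_pageframes)
		min_pageframes = curr;
}

/* Usage: 10 allocated, 3 ephemeral, 2 persistent, then one page freed. */
void pageframes_demo(void)
{
	pageframes_alloced = 10;
	atomic_store(&eph_pageframes, 3);
	atomic_store(&pers_pageframes, 2);
	note_page_freed();
	assert(curr_pageframes_count() == 4);
	assert(max_pageframes == 4);
}
```

Keeping the three atomic reads behind one function is what later lets the whole computation move into a separate file or be stubbed out.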



[PATCH] Various cleanups/fixes to zcache (v3).

2013-02-13 Thread Konrad Rzeszutek Wilk

Hey Greg,

These patches do various cleanups of the zcache driver. The majority of the
work is just to move all the different counters out to a debug file. The next
step would be to figure out which ones are actually pertinent and which can
go under the knife. Oh, and also fix some of the compiler warnings.

This is based on top of commit 76426daf50d5df38893cc189e9ccd026093debc8
("staging/zcache: Fix/improve zcache writeback code, tie to a config option")
so should apply cleanly to your tree.

Please apply.

These patches are also available on this git tree:

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm.git devel/zcache.cleanup.v2.5

if you would prefer to pull them.


 drivers/staging/zcache/Kconfig           |   8 +
 drivers/staging/zcache/Makefile          |   1 +
 drivers/staging/zcache/debug.c           |  71 ++
 drivers/staging/zcache/debug.h           | 229 ++
 drivers/staging/zcache/ramster/ramster.c |  34 +--
 drivers/staging/zcache/zbud.c            | 130 ++
 drivers/staging/zcache/zcache-main.c     | 401 ---
 7 files changed, 501 insertions(+), 373 deletions(-)


Konrad Rzeszutek Wilk (16):
  zcache: s/int/bool/ on the various options.
  zcache: Provide accessory functions for counter increase
  zcache: Provide accessory functions for counter decrease.
  zcache: The last of the atomic reads has now an accessory function.
  zcache: Fix compile warnings due to usage of debugfs_create_size_t
  zcache: Make the debug code use pr_debug
  zcache: Move debugfs code out of zcache-main.c file.
  zcache/debug: Use an array to initialize/use debugfs attributes.
  zcache: Move the last of the debugfs counters out
  zcache: Module license is defined twice.
  zcache/debug: Coalesce all debug under CONFIG_ZCACHE_DEBUG
  zcache/zbud: Fix compiler warnings.
  zcache/zbud: Add incremental accessory counters
  zcache/zbud: Provide the accessory functions for counter decrease.
  ramster: Fix compile warnings due to usage of debugfs_create_size_t
  zcache/zbud: Fix __init mismatch



[PATCH 06/16] zcache: Make the debug code use pr_debug

2013-02-13 Thread Konrad Rzeszutek Wilk
Use pr_debug, since if you are debugging this driver you would be using
'debug' on the command line anyhow, and this dumps the debug data at the
proper loglevel.

While at it also remove the unconditional #define ZCACHE_DEBUG.
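
As a rough illustration of the loglevel reasoning above, here is a userspace sketch. It only models console filtering by level; in the kernel, pr_debug() is additionally compiled out unless DEBUG or dynamic debug is enabled. log_at(), loglevel_demo() and the console_loglevel variable here are hypothetical stand-ins, with the numeric values chosen to mirror KERN_INFO (6), KERN_DEBUG (7) and the threshold of 10 that booting with 'debug' selects.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Numeric levels mirror the kernel's KERN_INFO (6) and KERN_DEBUG (7). */
enum { LOGLEVEL_INFO = 6, LOGLEVEL_DEBUG = 7 };

/* Hypothetical console threshold: only levels below it are printed. */
static int console_loglevel = 7;

/* Returns true if the message was emitted. */
static bool log_at(int level, const char *fmt, ...)
{
	va_list ap;

	if (level >= console_loglevel)
		return false;
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
	return true;
}

void loglevel_demo(void)
{
	/* Default threshold: info goes out, debug is suppressed. */
	assert(log_at(LOGLEVEL_INFO, "zcache: obj_count=%d\n", 1));
	assert(!log_at(LOGLEVEL_DEBUG, "zcache: obj_count=%d\n", 1));

	/* Roughly what booting with 'debug' does: raise the threshold. */
	console_loglevel = 10;
	assert(log_at(LOGLEVEL_DEBUG, "zcache: obj_count=%d\n", 1));
}
```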

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 85 +---
 1 file changed, 41 insertions(+), 44 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 4bd4107..d4bf4a2 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -355,71 +355,68 @@ static int zcache_debugfs_init(void)
 #undef zdfs64
 #endif
 
-#define ZCACHE_DEBUG
-#ifdef ZCACHE_DEBUG
 /* developers can call this in case of ooms, e.g. to find memory leaks */
 void zcache_dump(void)
 {
-   pr_info("zcache: obj_count=%zd\n", zcache_obj_count);
-   pr_info("zcache: obj_count_max=%zd\n", zcache_obj_count_max);
-   pr_info("zcache: objnode_count=%zd\n", zcache_objnode_count);
-   pr_info("zcache: objnode_count_max=%zd\n", zcache_objnode_count_max);
-   pr_info("zcache: flush_total=%zd\n", zcache_flush_total);
-   pr_info("zcache: flush_found=%zd\n", zcache_flush_found);
-   pr_info("zcache: flobj_total=%zd\n", zcache_flobj_total);
-   pr_info("zcache: flobj_found=%zd\n", zcache_flobj_found);
-   pr_info("zcache: failed_eph_puts=%zd\n", zcache_failed_eph_puts);
-   pr_info("zcache: failed_pers_puts=%zd\n", zcache_failed_pers_puts);
-   pr_info("zcache: failed_get_free_pages=%zd\n",
+   pr_debug("zcache: obj_count=%zd\n", zcache_obj_count);
+   pr_debug("zcache: obj_count_max=%zd\n", zcache_obj_count_max);
+   pr_debug("zcache: objnode_count=%zd\n", zcache_objnode_count);
+   pr_debug("zcache: objnode_count_max=%zd\n", zcache_objnode_count_max);
+   pr_debug("zcache: flush_total=%zd\n", zcache_flush_total);
+   pr_debug("zcache: flush_found=%zd\n", zcache_flush_found);
+   pr_debug("zcache: flobj_total=%zd\n", zcache_flobj_total);
+   pr_debug("zcache: flobj_found=%zd\n", zcache_flobj_found);
+   pr_debug("zcache: failed_eph_puts=%zd\n", zcache_failed_eph_puts);
+   pr_debug("zcache: failed_pers_puts=%zd\n", zcache_failed_pers_puts);
+   pr_debug("zcache: failed_get_free_pages=%zd\n",
zcache_failed_getfreepages);
-   pr_info("zcache: failed_alloc=%zd\n", zcache_failed_alloc);
-   pr_info("zcache: put_to_flush=%zd\n", zcache_put_to_flush);
-   pr_info("zcache: compress_poor=%zd\n", zcache_compress_poor);
-   pr_info("zcache: mean_compress_poor=%zd\n",
+   pr_debug("zcache: failed_alloc=%zd\n", zcache_failed_alloc);
+   pr_debug("zcache: put_to_flush=%zd\n", zcache_put_to_flush);
+   pr_debug("zcache: compress_poor=%zd\n", zcache_compress_poor);
+   pr_debug("zcache: mean_compress_poor=%zd\n",
zcache_mean_compress_poor);
-   pr_info("zcache: eph_ate_tail=%zd\n", zcache_eph_ate_tail);
-   pr_info("zcache: eph_ate_tail_failed=%zd\n",
+   pr_debug("zcache: eph_ate_tail=%zd\n", zcache_eph_ate_tail);
+   pr_debug("zcache: eph_ate_tail_failed=%zd\n",
zcache_eph_ate_tail_failed);
-   pr_info("zcache: pers_ate_eph=%zd\n", zcache_pers_ate_eph);
-   pr_info("zcache: pers_ate_eph_failed=%zd\n",
+   pr_debug("zcache: pers_ate_eph=%zd\n", zcache_pers_ate_eph);
+   pr_debug("zcache: pers_ate_eph_failed=%zd\n",
zcache_pers_ate_eph_failed);
-   pr_info("zcache: evicted_eph_zpages=%zd\n", zcache_evicted_eph_zpages);
-   pr_info("zcache: evicted_eph_pageframes=%zd\n",
+   pr_debug("zcache: evicted_eph_zpages=%zd\n", zcache_evicted_eph_zpages);
+   pr_debug("zcache: evicted_eph_pageframes=%zd\n",
zcache_evicted_eph_pageframes);
-   pr_info("zcache: eph_pageframes=%zd\n", zcache_eph_pageframes);
-   pr_info("zcache: eph_pageframes_max=%zd\n", zcache_eph_pageframes_max);
-   pr_info("zcache: pers_pageframes=%zd\n", zcache_pers_pageframes);
-   pr_info("zcache: pers_pageframes_max=%zd\n",
+   pr_debug("zcache: eph_pageframes=%zd\n", zcache_eph_pageframes);
+   pr_debug("zcache: eph_pageframes_max=%zd\n", zcache_eph_pageframes_max);
+   pr_debug("zcache: pers_pageframes=%zd\n", zcache_pers_pageframes);
+   pr_debug("zcache: pers_pageframes_max=%zd\n",
zcache_pers_pageframes_max);
-   pr_info("zcache: eph_zpages=%zd\n", zcache_eph_zpages);
-   pr_info("zcache: eph_zpages_max=%zd\n", zcache_eph_zpages_max);
-   pr_info("zcache: pers_zpages=%zd\n", zcache_pers_zpages);
-   pr_info("zcache: pers_zpages_max=%zd\n", zcache_pers_zpages_max);
-   pr_info("zcache: last_active_file_pageframes=%zd\n",
+   pr_debug("zcache: eph_zpages=%zd\n", zcache_eph_zpages);
+   pr_debug("zcache: eph_zpages_max=%zd\n", 

[PATCH 01/16] zcache: s/int/bool/ on the various options.

2013-02-13 Thread Konrad Rzeszutek Wilk
There are so many of them, but this at least lets us declare them
properly as bool.

[v1: Rebase on ramster->zcache move]
[v2: Rebase on staging/zcache: Fix/improve zcache writeback code, tie to a 
config option]
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 4456ab4..5daa0be 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -34,9 +34,9 @@
 #include "zbud.h"
 #include "ramster.h"
 #ifdef CONFIG_RAMSTER
-static int ramster_enabled;
+static bool ramster_enabled __read_mostly;
 #else
-#define ramster_enabled 0
+#define ramster_enabled false
 #endif
 
 #ifndef __PG_WAS_ACTIVE
@@ -62,11 +62,11 @@ static inline void frontswap_tmem_exclusive_gets(bool b)
 /* enable (or fix code) when Seth's patches are accepted upstream */
 #define zcache_writeback_enabled 0
 
-static int zcache_enabled __read_mostly;
-static int disable_cleancache __read_mostly;
-static int disable_frontswap __read_mostly;
-static int disable_frontswap_ignore_nonactive __read_mostly;
-static int disable_cleancache_ignore_nonactive __read_mostly;
+static bool zcache_enabled __read_mostly;
+static bool disable_cleancache __read_mostly;
+static bool disable_frontswap __read_mostly;
+static bool disable_frontswap_ignore_nonactive __read_mostly;
+static bool disable_cleancache_ignore_nonactive __read_mostly;
 static char *namestr __read_mostly = "zcache";
 
 #define ZCACHE_GFP_MASK \
@@ -1840,16 +1840,16 @@ struct frontswap_ops zcache_frontswap_register_ops(void)
 
 static int __init enable_zcache(char *s)
 {
-   zcache_enabled = 1;
+   zcache_enabled = true;
return 1;
 }
 __setup("zcache", enable_zcache);
 
 static int __init enable_ramster(char *s)
 {
-   zcache_enabled = 1;
+   zcache_enabled = true;
 #ifdef CONFIG_RAMSTER
-   ramster_enabled = 1;
+   ramster_enabled = true;
 #endif
return 1;
 }
@@ -1859,7 +1859,7 @@ __setup("ramster", enable_ramster);
 
 static int __init no_cleancache(char *s)
 {
-   disable_cleancache = 1;
+   disable_cleancache = true;
return 1;
 }
 
@@ -1867,7 +1867,7 @@ __setup("nocleancache", no_cleancache);
 
 static int __init no_frontswap(char *s)
 {
-   disable_frontswap = 1;
+   disable_frontswap = true;
return 1;
 }
 
@@ -1883,7 +1883,7 @@ __setup("nofrontswapexclusivegets", no_frontswap_exclusive_gets);
 
 static int __init no_frontswap_ignore_nonactive(char *s)
 {
-   disable_frontswap_ignore_nonactive = 1;
+   disable_frontswap_ignore_nonactive = true;
return 1;
 }
 
@@ -1891,7 +1891,7 @@ __setup("nofrontswapignorenonactive", no_frontswap_ignore_nonactive);
 
 static int __init no_cleancache_ignore_nonactive(char *s)
 {
-   disable_cleancache_ignore_nonactive = 1;
+   disable_cleancache_ignore_nonactive = true;
return 1;
 }
 
@@ -1900,7 +1900,7 @@ __setup("nocleancacheignorenonactive", no_cleancache_ignore_nonactive);
 static int __init enable_zcache_compressor(char *s)
 {
strncpy(zcache_comp_name, s, ZCACHE_COMP_NAME_SZ);
-   zcache_enabled = 1;
+   zcache_enabled = true;
return 1;
 }
 __setup("zcache=", enable_zcache_compressor);
-- 
1.8.0.2
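
The pattern this patch touches, bool flags flipped by boot-parameter handlers, can be sketched in userspace as follows. The setup_table array, parse_param() and bool_flags_demo() are hypothetical stand-ins for the kernel's __setup() machinery, and the exact-match lookup is a simplification (the kernel matches a prefix and hands the handler the rest of the string).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool zcache_enabled;
static bool disable_cleancache;

/* Handlers keep the kernel convention: return 1 when the option is consumed. */
static int enable_zcache(char *s)
{
	(void)s;
	zcache_enabled = true;
	return 1;
}

static int no_cleancache(char *s)
{
	(void)s;
	disable_cleancache = true;
	return 1;
}

/* Hypothetical stand-in for the table that __setup() builds. */
static const struct {
	const char *name;
	int (*fn)(char *);
} setup_table[] = {
	{ "zcache", enable_zcache },
	{ "nocleancache", no_cleancache },
};

/* Dispatch one parameter; returns 1 if a handler consumed it, else 0. */
int parse_param(char *param)
{
	size_t i;

	for (i = 0; i < sizeof(setup_table) / sizeof(setup_table[0]); i++)
		if (strcmp(param, setup_table[i].name) == 0)
			return setup_table[i].fn(param);
	return 0;
}

void bool_flags_demo(void)
{
	char arg[] = "zcache";

	assert(!zcache_enabled);
	assert(parse_param(arg) == 1);
	assert(zcache_enabled && !disable_cleancache);
}
```

The handlers still return int because that is the handler contract; only the flags themselves become bool.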



[PATCH 02/16] zcache: Provide accessory functions for counter increase

2013-02-13 Thread Konrad Rzeszutek Wilk
This is the first step in moving the debugfs code out of the
main file into another file. It also allows the code to run
without CONFIG_DEBUG_FS defined.
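
A minimal userspace sketch of the increment accessor pattern, assuming C11 <stdatomic.h> in place of the kernel's atomic_t. The names obj_atomic, obj_count, obj_count_max and inc_counter_demo() are shortened or hypothetical; atomic_fetch_add() returns the old value, so 1 is added to imitate atomic_inc_return().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace analogue of one zcache counter triple. */
static atomic_long obj_atomic;	/* authoritative, SMP-safe value */
static long obj_count;		/* plain snapshot for reporting */
static long obj_count_max;	/* high-water mark of the snapshot */

/* Bump the atomic, refresh the snapshot, track the maximum seen. */
static inline void inc_obj_count(void)
{
	obj_count = atomic_fetch_add(&obj_atomic, 1) + 1;
	if (obj_count > obj_count_max)
		obj_count_max = obj_count;
}

void inc_counter_demo(void)
{
	inc_obj_count();
	inc_obj_count();
	assert(obj_count == 2);
	assert(obj_count_max == 2);
}
```

With the three-line update hidden behind inc_obj_count(), each call site in the allocator shrinks to a single call.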

[v2: Rebase on top of staging/zcache: Fix/improve zcache writeback code,
tie to a config option]
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 108 +++
 1 file changed, 73 insertions(+), 35 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 5daa0be..5ad915a 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -138,32 +138,88 @@ static DEFINE_PER_CPU(struct zcache_preload, zcache_preloads) = { 0, };
 static long zcache_obj_count;
 static atomic_t zcache_obj_atomic = ATOMIC_INIT(0);
 static long zcache_obj_count_max;
+static inline void inc_zcache_obj_count(void)
+{
+   zcache_obj_count = atomic_inc_return(&zcache_obj_atomic);
+   if (zcache_obj_count > zcache_obj_count_max)
+   zcache_obj_count_max = zcache_obj_count;
+}
+
 static long zcache_objnode_count;
 static atomic_t zcache_objnode_atomic = ATOMIC_INIT(0);
 static long zcache_objnode_count_max;
+static inline void inc_zcache_objnode_count(void)
+{
+   zcache_objnode_count = atomic_inc_return(&zcache_objnode_atomic);
+   if (zcache_objnode_count > zcache_objnode_count_max)
+   zcache_objnode_count_max = zcache_objnode_count;
+};
 static u64 zcache_eph_zbytes;
 static atomic_long_t zcache_eph_zbytes_atomic = ATOMIC_INIT(0);
 static u64 zcache_eph_zbytes_max;
+static inline void inc_zcache_eph_zbytes(unsigned clen)
+{
+   zcache_eph_zbytes = atomic_long_add_return(clen, &zcache_eph_zbytes_atomic);
+   if (zcache_eph_zbytes > zcache_eph_zbytes_max)
+   zcache_eph_zbytes_max = zcache_eph_zbytes;
+};
 static u64 zcache_pers_zbytes;
 static atomic_long_t zcache_pers_zbytes_atomic = ATOMIC_INIT(0);
 static u64 zcache_pers_zbytes_max;
+static inline void inc_zcache_pers_zbytes(unsigned clen)
+{
+   zcache_pers_zbytes = atomic_long_add_return(clen, &zcache_pers_zbytes_atomic);
+   if (zcache_pers_zbytes > zcache_pers_zbytes_max)
+   zcache_pers_zbytes_max = zcache_pers_zbytes;
+}
 static long zcache_eph_pageframes;
 static atomic_t zcache_eph_pageframes_atomic = ATOMIC_INIT(0);
 static long zcache_eph_pageframes_max;
+static inline void inc_zcache_eph_pageframes(void)
+{
+   zcache_eph_pageframes = atomic_inc_return(&zcache_eph_pageframes_atomic);
+   if (zcache_eph_pageframes > zcache_eph_pageframes_max)
+   zcache_eph_pageframes_max = zcache_eph_pageframes;
+};
 static long zcache_pers_pageframes;
 static atomic_t zcache_pers_pageframes_atomic = ATOMIC_INIT(0);
 static long zcache_pers_pageframes_max;
+static inline void inc_zcache_pers_pageframes(void)
+{
+   zcache_pers_pageframes = atomic_inc_return(&zcache_pers_pageframes_atomic);
+   if (zcache_pers_pageframes > zcache_pers_pageframes_max)
+   zcache_pers_pageframes_max = zcache_pers_pageframes;
+}
 static long zcache_pageframes_alloced;
 static atomic_t zcache_pageframes_alloced_atomic = ATOMIC_INIT(0);
+static inline void inc_zcache_pageframes_alloced(void)
+{
+   zcache_pageframes_alloced = atomic_inc_return(&zcache_pageframes_alloced_atomic);
+};
 static long zcache_pageframes_freed;
 static atomic_t zcache_pageframes_freed_atomic = ATOMIC_INIT(0);
+static inline void inc_zcache_pageframes_freed(void)
+{
+   zcache_pageframes_freed = atomic_inc_return(&zcache_pageframes_freed_atomic);
+}
 static long zcache_eph_zpages;
 static atomic_t zcache_eph_zpages_atomic = ATOMIC_INIT(0);
 static long zcache_eph_zpages_max;
+static inline void inc_zcache_eph_zpages(void)
+{
+   zcache_eph_zpages = atomic_inc_return(&zcache_eph_zpages_atomic);
+   if (zcache_eph_zpages > zcache_eph_zpages_max)
+   zcache_eph_zpages_max = zcache_eph_zpages;
+}
 static long zcache_pers_zpages;
 static atomic_t zcache_pers_zpages_atomic = ATOMIC_INIT(0);
 static long zcache_pers_zpages_max;
-
+static inline void inc_zcache_pers_zpages(void)
+{
+   zcache_pers_zpages = atomic_inc_return(&zcache_pers_zpages_atomic);
+   if (zcache_pers_zpages > zcache_pers_zpages_max)
+   zcache_pers_zpages_max = zcache_pers_zpages;
+}
 /* but for the rest of these, counting races are ok */
 static unsigned long zcache_flush_total;
 static unsigned long zcache_flush_found;
@@ -421,9 +477,7 @@ static struct tmem_objnode *zcache_objnode_alloc(struct tmem_pool *pool)
}
}
BUG_ON(objnode == NULL);
-   zcache_objnode_count = atomic_inc_return(&zcache_objnode_atomic);
-   if (zcache_objnode_count > zcache_objnode_count_max)
-   zcache_objnode_count_max = zcache_objnode_count;
+   inc_zcache_objnode_count();
return objnode;
 }
 
@@ -445,9 +499,7 @@ static struct tmem_obj *zcache_obj_alloc(struct tmem_pool *pool)
obj = kp->obj;
BUG_ON(obj == NULL);
kp->obj 

[PATCH 05/16] zcache: Fix compile warnings due to usage of debugfs_create_size_t

2013-02-13 Thread Konrad Rzeszutek Wilk
When we compile we get tons of:
include/linux/debugfs.h:80:16: note: expected ‘size_t *’ but argument is
of type ‘long int *’
drivers/staging/zcache/zcache-main.c:279:2: warning: passing argument 4
of ‘debugfs_create_size_t’ from incompatible pointer type [enabled by
default]

which is because we end up using 'unsigned' or 'unsigned long' instead
of 'ssize_t'. So let's fix this up and use the proper type.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 163 ++-
 1 file changed, 82 insertions(+), 81 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index b71e61b..4bd4107 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -135,23 +135,23 @@ static struct kmem_cache *zcache_obj_cache;
 static DEFINE_PER_CPU(struct zcache_preload, zcache_preloads) = { 0, };
 
 /* we try to keep these statistics SMP-consistent */
-static long zcache_obj_count;
+static ssize_t zcache_obj_count;
 static atomic_t zcache_obj_atomic = ATOMIC_INIT(0);
-static long zcache_obj_count_max;
+static ssize_t zcache_obj_count_max;
 static inline void inc_zcache_obj_count(void)
 {
zcache_obj_count = atomic_inc_return(&zcache_obj_atomic);
if (zcache_obj_count > zcache_obj_count_max)
zcache_obj_count_max = zcache_obj_count;
 }
-static long zcache_objnode_count;
+static ssize_t zcache_objnode_count;
 static inline void dec_zcache_obj_count(void)
 {
zcache_obj_count = atomic_dec_return(&zcache_obj_atomic);
BUG_ON(zcache_obj_count < 0);
 };
 static atomic_t zcache_objnode_atomic = ATOMIC_INIT(0);
-static long zcache_objnode_count_max;
+static ssize_t zcache_objnode_count_max;
 static inline void inc_zcache_objnode_count(void)
 {
zcache_objnode_count = atomic_inc_return(&zcache_objnode_atomic);
@@ -185,64 +185,65 @@ static inline void inc_zcache_pers_zbytes(unsigned clen)
if (zcache_pers_zbytes > zcache_pers_zbytes_max)
zcache_pers_zbytes_max = zcache_pers_zbytes;
 }
-static long zcache_eph_pageframes;
+static ssize_t zcache_eph_pageframes;
 static inline void dec_zcache_pers_zbytes(unsigned zsize)
 {
zcache_pers_zbytes = atomic_long_sub_return(zsize, &zcache_pers_zbytes_atomic);
 }
 static atomic_t zcache_eph_pageframes_atomic = ATOMIC_INIT(0);
-static long zcache_eph_pageframes_max;
+static ssize_t zcache_eph_pageframes_max;
 static inline void inc_zcache_eph_pageframes(void)
 {
zcache_eph_pageframes = atomic_inc_return(&zcache_eph_pageframes_atomic);
if (zcache_eph_pageframes > zcache_eph_pageframes_max)
zcache_eph_pageframes_max = zcache_eph_pageframes;
 };
-static long zcache_pers_pageframes;
+static ssize_t zcache_pers_pageframes;
 static inline void dec_zcache_eph_pageframes(void)
 {
zcache_eph_pageframes = atomic_dec_return(&zcache_eph_pageframes_atomic);
 };
 static atomic_t zcache_pers_pageframes_atomic = ATOMIC_INIT(0);
-static long zcache_pers_pageframes_max;
+static ssize_t zcache_pers_pageframes_max;
 static inline void inc_zcache_pers_pageframes(void)
 {
zcache_pers_pageframes = atomic_inc_return(&zcache_pers_pageframes_atomic);
if (zcache_pers_pageframes > zcache_pers_pageframes_max)
zcache_pers_pageframes_max = zcache_pers_pageframes;
 }
-static long zcache_pageframes_alloced;
+static ssize_t zcache_pageframes_alloced;
 static inline void dec_zcache_pers_pageframes(void)
 {
zcache_pers_pageframes = atomic_dec_return(&zcache_pers_pageframes_atomic);
 }
 static atomic_t zcache_pageframes_alloced_atomic = ATOMIC_INIT(0);
+static ssize_t zcache_pageframes_freed;
+static atomic_t zcache_pageframes_freed_atomic = ATOMIC_INIT(0);
+static ssize_t zcache_eph_zpages;
 static inline void inc_zcache_pageframes_alloced(void)
 {
zcache_pageframes_alloced = atomic_inc_return(&zcache_pageframes_alloced_atomic);
 };
-static long zcache_pageframes_freed;
-static atomic_t zcache_pageframes_freed_atomic = ATOMIC_INIT(0);
 static inline void inc_zcache_pageframes_freed(void)
 {
zcache_pageframes_freed = atomic_inc_return(&zcache_pageframes_freed_atomic);
 }
-static long zcache_eph_zpages;
+static ssize_t zcache_eph_zpages;
 static atomic_t zcache_eph_zpages_atomic = ATOMIC_INIT(0);
-static long zcache_eph_zpages_max;
+static ssize_t zcache_eph_zpages_max;
 static inline void inc_zcache_eph_zpages(void)
 {
zcache_eph_zpages = atomic_inc_return(&zcache_eph_zpages_atomic);
if (zcache_eph_zpages > zcache_eph_zpages_max)
zcache_eph_zpages_max = zcache_eph_zpages;
 }
-static long zcache_pers_zpages;
+static ssize_t zcache_pers_zpages;
 static inline void dec_zcache_eph_zpages(unsigned zpages)
 {
zcache_eph_zpages = atomic_sub_return(zpages, &zcache_eph_zpages_atomic);
 }
 static atomic_t zcache_pers_zpages_atomic = ATOMIC_INIT(0);
-static long zcache_pers_zpages_max;
+static ssize_t zcache_pers_zpages_max;
 static inline void inc_zcache_pers_zpages(void)

[PATCH 03/16] zcache: Provide accessory functions for counter decrease.

2013-02-13 Thread Konrad Rzeszutek Wilk
This way all counter updates are wrapped in these functions and
we can disable/enable them with CONFIG_DEBUG_FS.
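
The decrement side can be sketched the same way in userspace. This is an assumption-laden illustration: C11 atomics replace atomic_dec_return(), assert() stands in for BUG_ON(), and the shortened names plus objnode_free_demo() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_long objnode_atomic;	/* authoritative value */
static long objnode_count;		/* plain snapshot for reporting */

static inline void inc_objnode_count(void)
{
	objnode_count = atomic_fetch_add(&objnode_atomic, 1) + 1;
}

/* atomic_fetch_sub returns the old value, hence the "- 1". */
static inline void dec_objnode_count(void)
{
	objnode_count = atomic_fetch_sub(&objnode_atomic, 1) - 1;
	assert(objnode_count >= 0);	/* stands in for BUG_ON() */
}

/* A free path shrinks to one accessor call, as in zcache_objnode_free(). */
void objnode_free_demo(void)
{
	inc_objnode_count();
	dec_objnode_count();
	assert(objnode_count == 0);
}
```

Keeping the underflow check inside the accessor means every caller gets it without repeating the BUG_ON().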

[v2: Rebase on top of staging/zcache: Fix/improve zcache writeback code, tie to 
a config option]
Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zcache-main.c | 96 +---
 1 file changed, 57 insertions(+), 39 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 5ad915a..50a408a 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -144,8 +144,12 @@ static inline void inc_zcache_obj_count(void)
if (zcache_obj_count > zcache_obj_count_max)
zcache_obj_count_max = zcache_obj_count;
 }
-
 static long zcache_objnode_count;
+static inline void dec_zcache_obj_count(void)
+{
+   zcache_obj_count = atomic_dec_return(&zcache_obj_atomic);
+   BUG_ON(zcache_obj_count < 0);
+};
 static atomic_t zcache_objnode_atomic = ATOMIC_INIT(0);
 static long zcache_objnode_count_max;
 static inline void inc_zcache_objnode_count(void)
@@ -154,6 +158,11 @@ static inline void inc_zcache_objnode_count(void)
if (zcache_objnode_count > zcache_objnode_count_max)
zcache_objnode_count_max = zcache_objnode_count;
 };
+static inline void dec_zcache_objnode_count(void)
+{
+   zcache_objnode_count = atomic_dec_return(&zcache_objnode_atomic);
+   BUG_ON(zcache_objnode_count < 0);
+};
 static u64 zcache_eph_zbytes;
 static atomic_long_t zcache_eph_zbytes_atomic = ATOMIC_INIT(0);
 static u64 zcache_eph_zbytes_max;
@@ -163,6 +172,10 @@ static inline void inc_zcache_eph_zbytes(unsigned clen)
if (zcache_eph_zbytes > zcache_eph_zbytes_max)
zcache_eph_zbytes_max = zcache_eph_zbytes;
 };
+static inline void dec_zcache_eph_zbytes(unsigned zsize)
+{
+   zcache_eph_zbytes = atomic_long_sub_return(zsize, &zcache_eph_zbytes_atomic);
+};
 static u64 zcache_pers_zbytes;
 static atomic_long_t zcache_pers_zbytes_atomic = ATOMIC_INIT(0);
 static u64 zcache_pers_zbytes_max;
@@ -173,6 +186,10 @@ static inline void inc_zcache_pers_zbytes(unsigned clen)
zcache_pers_zbytes_max = zcache_pers_zbytes;
 }
 static long zcache_eph_pageframes;
+static inline void dec_zcache_pers_zbytes(unsigned zsize)
+{
+   zcache_pers_zbytes = atomic_long_sub_return(zsize, &zcache_pers_zbytes_atomic);
+}
 static atomic_t zcache_eph_pageframes_atomic = ATOMIC_INIT(0);
 static long zcache_eph_pageframes_max;
 static inline void inc_zcache_eph_pageframes(void)
@@ -182,6 +199,10 @@ static inline void inc_zcache_eph_pageframes(void)
zcache_eph_pageframes_max = zcache_eph_pageframes;
 };
 static long zcache_pers_pageframes;
+static inline void dec_zcache_eph_pageframes(void)
+{
+   zcache_eph_pageframes = atomic_dec_return(&zcache_eph_pageframes_atomic);
+};
 static atomic_t zcache_pers_pageframes_atomic = ATOMIC_INIT(0);
 static long zcache_pers_pageframes_max;
 static inline void inc_zcache_pers_pageframes(void)
@@ -191,6 +212,10 @@ static inline void inc_zcache_pers_pageframes(void)
zcache_pers_pageframes_max = zcache_pers_pageframes;
 }
 static long zcache_pageframes_alloced;
+static inline void dec_zcache_pers_pageframes(void)
+{
+   zcache_pers_pageframes = atomic_dec_return(&zcache_pers_pageframes_atomic);
+}
 static atomic_t zcache_pageframes_alloced_atomic = ATOMIC_INIT(0);
 static inline void inc_zcache_pageframes_alloced(void)
 {
@@ -212,6 +237,10 @@ static inline void inc_zcache_eph_zpages(void)
zcache_eph_zpages_max = zcache_eph_zpages;
 }
 static long zcache_pers_zpages;
+static inline void dec_zcache_eph_zpages(unsigned zpages)
+{
+   zcache_eph_zpages = atomic_sub_return(zpages, &zcache_eph_zpages_atomic);
+}
 static atomic_t zcache_pers_zpages_atomic = ATOMIC_INIT(0);
 static long zcache_pers_zpages_max;
 static inline void inc_zcache_pers_zpages(void)
@@ -220,6 +249,10 @@ static inline void inc_zcache_pers_zpages(void)
if (zcache_pers_zpages > zcache_pers_zpages_max)
zcache_pers_zpages_max = zcache_pers_zpages;
 }
+static inline void dec_zcache_pers_zpages(unsigned zpages)
+{
+   zcache_pers_zpages = atomic_sub_return(zpages, &zcache_pers_zpages_atomic);
+}
 /* but for the rest of these, counting races are ok */
 static unsigned long zcache_flush_total;
 static unsigned long zcache_flush_found;
@@ -484,9 +517,7 @@ static struct tmem_objnode *zcache_objnode_alloc(struct tmem_pool *pool)
 static void zcache_objnode_free(struct tmem_objnode *objnode,
struct tmem_pool *pool)
 {
-   zcache_objnode_count =
-   atomic_dec_return(&zcache_objnode_atomic);
-   BUG_ON(zcache_objnode_count < 0);
+   dec_zcache_objnode_count();
kmem_cache_free(zcache_objnode_cache, objnode);
 }
 
@@ -505,9 +536,7 @@ static struct tmem_obj *zcache_obj_alloc(struct tmem_pool *pool)
 
 static void zcache_obj_free(struct tmem_obj *obj, struct 

[PATCH 07/16] zcache: Move debugfs code out of zcache-main.c file.

2013-02-13 Thread Konrad Rzeszutek Wilk
Note that at this point there is no CONFIG_ZCACHE_DEBUG
option in the Kconfig. So in effect all of the counters
are nop until that option gets re-introduced in:
zcache/debug: Coalesce all debug under CONFIG_ZCACHE_DEBUG
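
To make the "counters are nop" point concrete, here is a small sketch of a header-style split between real accessors and empty stubs. ZCACHE_DEBUG_SKETCH is a hypothetical macro playing the role of CONFIG_ZCACHE_DEBUG, and read_flush_total(), flush_twice() and nop_counter_demo() are invented for the illustration; the actual debug.h in this series may be laid out differently.

```c
#include <assert.h>

/* Hypothetical switch; the series ties the real one to CONFIG_ZCACHE_DEBUG. */
#ifdef ZCACHE_DEBUG_SKETCH
static long flush_total;
static inline void inc_flush_total(void) { flush_total++; }
static inline long read_flush_total(void) { return flush_total; }
#else
/* Stubs: call sites stay unchanged and the calls compile to nothing. */
static inline void inc_flush_total(void) { }
static inline long read_flush_total(void) { return 0; }
#endif

/* A call site that never needs an #ifdef of its own. */
long flush_twice(void)
{
	inc_flush_total();
	inc_flush_total();
	return read_flush_total();
}

void nop_counter_demo(void)
{
#ifdef ZCACHE_DEBUG_SKETCH
	assert(flush_twice() == 2);
#else
	assert(flush_twice() == 0);
#endif
}
```

Built without the macro, flush_twice() reports 0 because both increments are stubs; building with -DZCACHE_DEBUG_SKETCH switches the real counter in without touching the call site.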

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/Makefile  |   1 +
 drivers/staging/zcache/debug.c   | 132 +
 drivers/staging/zcache/debug.h   | 187 
 drivers/staging/zcache/zcache-main.c | 265 ++-
 4 files changed, 328 insertions(+), 257 deletions(-)
 create mode 100644 drivers/staging/zcache/debug.c
 create mode 100644 drivers/staging/zcache/debug.h

diff --git a/drivers/staging/zcache/Makefile b/drivers/staging/zcache/Makefile
index 4711049..24fd6aa 100644
--- a/drivers/staging/zcache/Makefile
+++ b/drivers/staging/zcache/Makefile
@@ -1,4 +1,5 @@
 zcache-y   :=  zcache-main.o tmem.o zbud.o
+zcache-$(CONFIG_ZCACHE_DEBUG) += debug.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/ramster.o ramster/r2net.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/nodemanager.o ramster/tcp.o
 zcache-$(CONFIG_RAMSTER)   +=  ramster/heartbeat.o ramster/masklog.o
diff --git a/drivers/staging/zcache/debug.c b/drivers/staging/zcache/debug.c
new file mode 100644
index 000..622d5f3
--- /dev/null
+++ b/drivers/staging/zcache/debug.c
@@ -0,0 +1,132 @@
+#include 
+#include "debug.h"
+
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#define zdfs debugfs_create_size_t
+#define zdfs64 debugfs_create_u64
+int zcache_debugfs_init(void)
+{
+   struct dentry *root = debugfs_create_dir("zcache", NULL);
+   if (root == NULL)
+   return -ENXIO;
+
+   zdfs("obj_count", S_IRUGO, root, &zcache_obj_count);
+   zdfs("obj_count_max", S_IRUGO, root, &zcache_obj_count_max);
+   zdfs("objnode_count", S_IRUGO, root, &zcache_objnode_count);
+   zdfs("objnode_count_max", S_IRUGO, root, &zcache_objnode_count_max);
+   zdfs("flush_total", S_IRUGO, root, &zcache_flush_total);
+   zdfs("flush_found", S_IRUGO, root, &zcache_flush_found);
+   zdfs("flobj_total", S_IRUGO, root, &zcache_flobj_total);
+   zdfs("flobj_found", S_IRUGO, root, &zcache_flobj_found);
+   zdfs("failed_eph_puts", S_IRUGO, root, &zcache_failed_eph_puts);
+   zdfs("failed_pers_puts", S_IRUGO, root, &zcache_failed_pers_puts);
+   zdfs("failed_get_free_pages", S_IRUGO, root,
+   &zcache_failed_getfreepages);
+   zdfs("failed_alloc", S_IRUGO, root, &zcache_failed_alloc);
+   zdfs("put_to_flush", S_IRUGO, root, &zcache_put_to_flush);
+   zdfs("compress_poor", S_IRUGO, root, &zcache_compress_poor);
+   zdfs("mean_compress_poor", S_IRUGO, root, &zcache_mean_compress_poor);
+   zdfs("eph_ate_tail", S_IRUGO, root, &zcache_eph_ate_tail);
+   zdfs("eph_ate_tail_failed", S_IRUGO, root, &zcache_eph_ate_tail_failed);
+   zdfs("pers_ate_eph", S_IRUGO, root, &zcache_pers_ate_eph);
+   zdfs("pers_ate_eph_failed", S_IRUGO, root, &zcache_pers_ate_eph_failed);
+   zdfs("evicted_eph_zpages", S_IRUGO, root, &zcache_evicted_eph_zpages);
+   zdfs("evicted_eph_pageframes", S_IRUGO, root,
+   &zcache_evicted_eph_pageframes);
+   zdfs("eph_pageframes", S_IRUGO, root, &zcache_eph_pageframes);
+   zdfs("eph_pageframes_max", S_IRUGO, root, &zcache_eph_pageframes_max);
+   zdfs("pers_pageframes", S_IRUGO, root, &zcache_pers_pageframes);
+   zdfs("pers_pageframes_max", S_IRUGO, root, &zcache_pers_pageframes_max);
+   zdfs("eph_zpages", S_IRUGO, root, &zcache_eph_zpages);
+   zdfs("eph_zpages_max", S_IRUGO, root, &zcache_eph_zpages_max);
+   zdfs("pers_zpages", S_IRUGO, root, &zcache_pers_zpages);
+   zdfs("pers_zpages_max", S_IRUGO, root, &zcache_pers_zpages_max);
+   zdfs("last_active_file_pageframes", S_IRUGO, root,
+   &zcache_last_active_file_pageframes);
+   zdfs("last_inactive_file_pageframes", S_IRUGO, root,
+   &zcache_last_inactive_file_pageframes);
+   zdfs("last_active_anon_pageframes", S_IRUGO, root,
+   &zcache_last_active_anon_pageframes);
+   zdfs("last_inactive_anon_pageframes", S_IRUGO, root,
+   &zcache_last_inactive_anon_pageframes);
+   zdfs("eph_nonactive_puts_ignored", S_IRUGO, root,
+   &zcache_eph_nonactive_puts_ignored);
+   zdfs("pers_nonactive_puts_ignored", S_IRUGO, root,
+   &zcache_pers_nonactive_puts_ignored);
+   zdfs64("eph_zbytes", S_IRUGO, root, &zcache_eph_zbytes);
+   zdfs64("eph_zbytes_max", S_IRUGO, root, &zcache_eph_zbytes_max);
+   zdfs64("pers_zbytes", S_IRUGO, root, &zcache_pers_zbytes);
+   zdfs64("pers_zbytes_max", S_IRUGO, root, &zcache_pers_zbytes_max);
+   zdfs("outstanding_writeback_pages", S_IRUGO, root,
+   &zcache_outstanding_writeback_pages);
+   zdfs("writtenback_pages", S_IRUGO, root, &zcache_writtenback_pages);
+
+   return 0;
+}
+#undef zdebugfs
+#undef zdfs64
+
+/* developers can call this in case of ooms, e.g. to find memory leaks */
+void 

[PATCH 12/16] zcache/zbud: Fix compiler warnings.

2013-02-13 Thread Konrad Rzeszutek Wilk
We get tons of:
drivers/staging/zcache/zbud.c: In function ‘zbud_debugfs_init’:
drivers/staging/zcache/zbud.c:323:2: warning: passing argument 4 of
‘debugfs_create_size_t’ from incompatible pointer type [enabled by
default]
In file included from drivers/staging/zcache/zbud.c:305:0:

Fix this by declaring the affected zbud counters as ssize_t.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zbud.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/zcache/zbud.c b/drivers/staging/zcache/zbud.c
index 6835fab..4654978 100644
--- a/drivers/staging/zcache/zbud.c
+++ b/drivers/staging/zcache/zbud.c
@@ -281,26 +281,26 @@ static inline char *zbud_data(void *zbpg,
  * debugfs viewers, some of these should also be atomic_long_t, but
  * I don't know how to expose atomics via debugfs either...
  */
-static unsigned long zbud_eph_pageframes;
-static unsigned long zbud_pers_pageframes;
-static unsigned long zbud_eph_zpages;
-static unsigned long zbud_pers_zpages;
+static ssize_t zbud_eph_pageframes;
+static ssize_t zbud_pers_pageframes;
+static ssize_t zbud_eph_zpages;
+static ssize_t zbud_pers_zpages;
 static u64 zbud_eph_zbytes;
 static u64 zbud_pers_zbytes;
-static unsigned long zbud_eph_evicted_pageframes;
-static unsigned long zbud_pers_evicted_pageframes;
-static unsigned long zbud_eph_cumul_zpages;
-static unsigned long zbud_pers_cumul_zpages;
+static ssize_t zbud_eph_evicted_pageframes;
+static ssize_t zbud_pers_evicted_pageframes;
+static ssize_t zbud_eph_cumul_zpages;
+static ssize_t zbud_pers_cumul_zpages;
 static u64 zbud_eph_cumul_zbytes;
 static u64 zbud_pers_cumul_zbytes;
-static unsigned long zbud_eph_cumul_chunk_counts[NCHUNKS];
-static unsigned long zbud_pers_cumul_chunk_counts[NCHUNKS];
-static unsigned long zbud_eph_buddied_count;
-static unsigned long zbud_pers_buddied_count;
-static unsigned long zbud_eph_unbuddied_count;
-static unsigned long zbud_pers_unbuddied_count;
-static unsigned long zbud_eph_zombie_count;
-static unsigned long zbud_pers_zombie_count;
+static ssize_t zbud_eph_cumul_chunk_counts[NCHUNKS];
+static ssize_t zbud_pers_cumul_chunk_counts[NCHUNKS];
+static ssize_t zbud_eph_buddied_count;
+static ssize_t zbud_pers_buddied_count;
+static ssize_t zbud_eph_unbuddied_count;
+static ssize_t zbud_pers_unbuddied_count;
+static ssize_t zbud_eph_zombie_count;
+static ssize_t zbud_pers_zombie_count;
 static atomic_t zbud_eph_zombie_atomic;
 static atomic_t zbud_pers_zombie_atomic;
 
-- 
1.8.0.2



[PATCH 15/16] ramster: Fix compile warnings due to usage of debugfs_create_size_t

2013-02-13 Thread Konrad Rzeszutek Wilk
We get tons of "note: expected ‘size_t *’ but argument is of type ‘long
int *’" warnings. This fixes it.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/ramster/ramster.c | 34 
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/zcache/ramster/ramster.c b/drivers/staging/zcache/ramster/ramster.c
index c06709f..bf96a1c 100644
--- a/drivers/staging/zcache/ramster/ramster.c
+++ b/drivers/staging/zcache/ramster/ramster.c
@@ -67,25 +67,25 @@ static int ramster_remote_target_nodenum __read_mostly = -1;
 static long ramster_flnodes;
 static atomic_t ramster_flnodes_atomic = ATOMIC_INIT(0);
 static unsigned long ramster_flnodes_max;
-static long ramster_foreign_eph_pages;
+static ssize_t ramster_foreign_eph_pages;
 static atomic_t ramster_foreign_eph_pages_atomic = ATOMIC_INIT(0);
-static unsigned long ramster_foreign_eph_pages_max;
-static long ramster_foreign_pers_pages;
+static ssize_t ramster_foreign_eph_pages_max;
+static ssize_t ramster_foreign_pers_pages;
 static atomic_t ramster_foreign_pers_pages_atomic = ATOMIC_INIT(0);
-static unsigned long ramster_foreign_pers_pages_max;
-static unsigned long ramster_eph_pages_remoted;
-static unsigned long ramster_pers_pages_remoted;
-static unsigned long ramster_eph_pages_remote_failed;
-static unsigned long ramster_pers_pages_remote_failed;
-static unsigned long ramster_remote_eph_pages_succ_get;
-static unsigned long ramster_remote_pers_pages_succ_get;
-static unsigned long ramster_remote_eph_pages_unsucc_get;
-static unsigned long ramster_remote_pers_pages_unsucc_get;
-static unsigned long ramster_pers_pages_remote_nomem;
-static unsigned long ramster_remote_objects_flushed;
-static unsigned long ramster_remote_object_flushes_failed;
-static unsigned long ramster_remote_pages_flushed;
-static unsigned long ramster_remote_page_flushes_failed;
+static ssize_t ramster_foreign_pers_pages_max;
+static ssize_t ramster_eph_pages_remoted;
+static ssize_t ramster_pers_pages_remoted;
+static ssize_t ramster_eph_pages_remote_failed;
+static ssize_t ramster_pers_pages_remote_failed;
+static ssize_t ramster_remote_eph_pages_succ_get;
+static ssize_t ramster_remote_pers_pages_succ_get;
+static ssize_t ramster_remote_eph_pages_unsucc_get;
+static ssize_t ramster_remote_pers_pages_unsucc_get;
+static ssize_t ramster_pers_pages_remote_nomem;
+static ssize_t ramster_remote_objects_flushed;
+static ssize_t ramster_remote_object_flushes_failed;
+static ssize_t ramster_remote_pages_flushed;
+static ssize_t ramster_remote_page_flushes_failed;
 /* FIXME frontswap selfshrinking knobs in debugfs? */
 
 #ifdef CONFIG_DEBUG_FS
-- 
1.8.0.2



[ANNOUNCE] 3.0.62-rt88

2013-02-13 Thread Steven Rostedt

Dear RT Folks,

I'm pleased to announce the 3.0.62-rt88 stable release.


You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  Head SHA1: 3568cae0aa56660cadd607e19096ad853822b28c


Or to build 3.0.62-rt88 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.0.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.0/patch-3.0.62.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.0/patch-3.0.62-rt88.patch.xz


You can also build from 3.0.62-rt87 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.0/incr/patch-3.0.62-rt87-rt88.patch.xz



Enjoy,

-- Steve


Changes from 3.0.62-rt87:

---

Steven Rostedt (1):
  Linux 3.0.62-rt88

Thomas Gleixner (4):
  drivers-tty-pl011-irq-disable-madness.patch
  mmci: Remove bogus local_irq_save()
  sched: Init idle->on_rq in init_idle()
  mm: swap: Initialize local locks early


 drivers/mmc/host/mmci.c |5 -
 drivers/tty/serial/amba-pl011.c |   15 ++-
 kernel/sched.c  |1 +
 localversion-rt |2 +-
 mm/swap.c   |   12 +---
 5 files changed, 21 insertions(+), 14 deletions(-)
---
diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index 9394d0b..8d0bf36 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -741,15 +741,12 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
struct sg_mapping_iter *sg_miter = &host->sg_miter;
struct variant_data *variant = host->variant;
void __iomem *base = host->base;
-   unsigned long flags;
u32 status;
 
status = readl(base + MMCISTATUS);
 
dev_dbg(mmc_dev(host->mmc), "irq1 (pio) %08x\n", status);
 
-   local_irq_save(flags);
-
do {
unsigned int remain, len;
char *buffer;
@@ -789,8 +786,6 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
 
sg_miter_stop(sg_miter);
 
-   local_irq_restore(flags);
-
/*
 * If we have less than the fifo 'half-full' threshold to transfer,
 * trigger a PIO interrupt as soon as any data is available.
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index 7cbb367..40356ae 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1754,13 +1754,19 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
 
clk_enable(uap->clk);
 
-   local_irq_save(flags);
+   /*
+* local_irq_save(flags);
+*
+* This local_irq_save() is nonsense. If we come in via sysrq
+* handling then interrupts are already disabled. Aside of
+* that the port.sysrq check is racy on SMP regardless.
+   */
if (uap->port.sysrq)
locked = 0;
else if (oops_in_progress)
-   locked = spin_trylock(&uap->port.lock);
+   locked = spin_trylock_irqsave(&uap->port.lock, flags);
else
-   spin_lock(&uap->port.lock);
+   spin_lock_irqsave(&uap->port.lock, flags);
 
/*
 *  First save the CR then disable the interrupts
@@ -1782,8 +1788,7 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
writew(old_cr, uap->port.membase + UART011_CR);
 
if (locked)
-   spin_unlock(&uap->port.lock);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&uap->port.lock, flags);
 
clk_disable(uap->clk);
 }
diff --git a/kernel/sched.c b/kernel/sched.c
index 858f5df..53b78f0 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6139,6 +6139,7 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
rcu_read_unlock();
 
rq->curr = rq->idle = idle;
+   idle->on_rq = 1;
 #if defined(CONFIG_SMP)
idle->on_cpu = 1;
 #endif
diff --git a/localversion-rt b/localversion-rt
index bd5aee3..666227d 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt87
+-rt88
diff --git a/mm/swap.c b/mm/swap.c
index 069390f..662972f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -772,6 +772,15 @@ unsigned pagevec_lookup(struct pagevec *pvec, struct address_space *mapping,
 
 EXPORT_SYMBOL(pagevec_lookup);
 
+/* Early setup for the local locks */
+static int __init swap_init_locks(void)
+{
+   local_irq_lock_init(rotate_lock);
+   local_irq_lock_init(swap_lock);
+   return 1;
+}
+early_initcall(swap_init_locks);
+
 unsigned pagevec_lookup_tag(struct pagevec *pvec, struct address_space *mapping,
pgoff_t *index, int tag, unsigned nr_pages)
 {
@@ -789,9 +798,6 @@ void __init swap_setup(void)
 {
unsigned long megs = totalram_pages >> (20 - PAGE_SHIFT);
 
-   local_irq_lock_init(rotate_lock);
-   local_irq_lock_init(swap_lock);
-
 #ifdef CONFIG_SWAP

[PATCH 16/16] zcache/zbud: Fix __init mismatch

2013-02-13 Thread Konrad Rzeszutek Wilk
We get:
WARNING: drivers/staging/zcache/zcache.o(.text+0x13a1): Section mismatch
in reference from the function zcache_init() to the function
.init.text:zbud_init()
The function zcache_init() references
the function __init zbud_init().
This is often because zcache_init lacks a __init
annotation or the annotation of zbud_init is wrong.

Fix this by dropping the __init annotation from zbud_init().

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zbud.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/zcache/zbud.c b/drivers/staging/zcache/zbud.c
index e139cd6..9aa5bd8 100644
--- a/drivers/staging/zcache/zbud.c
+++ b/drivers/staging/zcache/zbud.c
@@ -1077,7 +1077,7 @@ out:
return ret;
 }
 
-void __init zbud_init(void)
+void zbud_init(void)
 {
int i;
 
-- 
1.8.0.2



[PATCH 13/16] zcache/zbud: Add incremental accessory counters

2013-02-13 Thread Konrad Rzeszutek Wilk
that are going to be used for debugfs entries.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/zbud.c | 58 +--
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/zcache/zbud.c b/drivers/staging/zcache/zbud.c
index 4654978..cff596c 100644
--- a/drivers/staging/zcache/zbud.c
+++ b/drivers/staging/zcache/zbud.c
@@ -301,6 +301,26 @@ static ssize_t zbud_eph_unbuddied_count;
 static ssize_t zbud_pers_unbuddied_count;
 static ssize_t zbud_eph_zombie_count;
 static ssize_t zbud_pers_zombie_count;
+static inline void inc_zbud_eph_pageframes(void) { zbud_eph_pageframes++; };
+static inline void inc_zbud_pers_pageframes(void) { zbud_pers_pageframes++; };
+static inline void inc_zbud_eph_zpages(void) { zbud_eph_zpages++; };
+static inline void inc_zbud_pers_zpages(void) { zbud_pers_zpages++; };
+static inline void inc_zbud_eph_zbytes(ssize_t bytes) { zbud_eph_zbytes += bytes; };
+static inline void inc_zbud_pers_zbytes(ssize_t bytes) { zbud_pers_zbytes += bytes; };
+static inline void inc_zbud_eph_evicted_pageframes(void) { zbud_eph_evicted_pageframes++; };
+static inline void inc_zbud_pers_evicted_pageframes(void) { zbud_pers_evicted_pageframes++; };
+static inline void inc_zbud_eph_cumul_zpages(void) { zbud_eph_cumul_zpages++; };
+static inline void inc_zbud_pers_cumul_zpages(void) { zbud_pers_cumul_zpages++; };
+static inline void inc_zbud_eph_cumul_zbytes(ssize_t bytes) { zbud_eph_cumul_zbytes += bytes; };
+static inline void inc_zbud_pers_cumul_zbytes(ssize_t bytes) { zbud_pers_cumul_zbytes += bytes; };
+static inline void inc_zbud_eph_cumul_chunk_counts(unsigned n) { zbud_eph_cumul_chunk_counts[n]++; };
+static inline void inc_zbud_pers_cumul_chunk_counts(unsigned n) { zbud_pers_cumul_chunk_counts[n]++; };
+static inline void inc_zbud_eph_buddied_count(void) { zbud_eph_buddied_count++; };
+static inline void inc_zbud_pers_buddied_count(void) { zbud_pers_buddied_count++; };
+static inline void inc_zbud_eph_unbuddied_count(void) { zbud_eph_unbuddied_count++; };
+static inline void inc_zbud_pers_unbuddied_count(void) { zbud_pers_unbuddied_count++; };
+static inline void inc_zbud_eph_zombie_count(void) { zbud_eph_zombie_count++; };
+static inline void inc_zbud_pers_zombie_count(void) { zbud_pers_zombie_count++; };
 static atomic_t zbud_eph_zombie_atomic;
 static atomic_t zbud_pers_zombie_atomic;
 
@@ -379,9 +399,9 @@ static inline struct zbudpage *zbud_init_zbudpage(struct page *page, bool eph)
zbudpage->zbud1_size = 0;
zbudpage->unevictable = 0;
if (eph)
-   zbud_eph_pageframes++;
+   inc_zbud_eph_pageframes();
else
-   zbud_pers_pageframes++;
+   inc_zbud_pers_pageframes();
return zbudpage;
 }
 
@@ -465,17 +485,17 @@ static void zbud_init_zbud(struct zbudpage *zbudpage, struct tmem_handle *th,
else
zbudpage->zbud1_size = size;
if (eph) {
-   zbud_eph_cumul_chunk_counts[nchunks]++;
-   zbud_eph_zpages++;
-   zbud_eph_cumul_zpages++;
-   zbud_eph_zbytes += size;
-   zbud_eph_cumul_zbytes += size;
+   inc_zbud_eph_cumul_chunk_counts(nchunks);
+   inc_zbud_eph_zpages();
+   inc_zbud_eph_cumul_zpages();
+   inc_zbud_eph_zbytes(size);
+   inc_zbud_eph_cumul_zbytes(size);
} else {
-   zbud_pers_cumul_chunk_counts[nchunks]++;
-   zbud_pers_zpages++;
-   zbud_pers_cumul_zpages++;
-   zbud_pers_zbytes += size;
-   zbud_pers_cumul_zbytes += size;
+   inc_zbud_pers_cumul_chunk_counts(nchunks);
+   inc_zbud_pers_zpages();
+   inc_zbud_pers_cumul_zpages();
+   inc_zbud_pers_zbytes(size);
+   inc_zbud_pers_cumul_zbytes(size);
}
 }
 
@@ -603,9 +623,9 @@ struct page *zbud_free_and_delist(struct zbudref *zref, bool eph,
}
if (eph) {
zbud_eph_buddied_count--;
-   zbud_eph_unbuddied_count++;
+   inc_zbud_eph_unbuddied_count();
} else {
-   zbud_pers_unbuddied_count++;
+   inc_zbud_pers_unbuddied_count();
zbud_pers_buddied_count--;
}
/* don't mess with lru, no need to move it */
@@ -664,7 +684,7 @@ found_unbuddied:
list_add_tail(&zbudpage->budlist, &zbud_eph_buddied_list);
unbud[found_good_buddy].count--;
zbud_eph_unbuddied_count--;
-   zbud_eph_buddied_count++;
+   inc_zbud_eph_buddied_count();
/* "promote" raw zbudpage to most-recently-used */
list_del_init(&zbudpage->lru);
list_add_tail(&zbudpage->lru, &zbud_eph_lru_list);
@@ -672,7 +692,7 @@ found_unbuddied:
  

[ANNOUNCE] 3.2.38-rt57

2013-02-13 Thread Steven Rostedt

Dear RT Folks,

I'm pleased to announce the 3.2.38-rt57 stable release.


You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  Head SHA1: 97eeac44604babd344fbcea25620de62ab520d88


Or to build 3.2.38-rt57 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.38.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/patch-3.2.38-rt57.patch.xz


You can also build from 3.2.38-rt56 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/incr/patch-3.2.38-rt56-rt57.patch.xz



Enjoy,

-- Steve


Changes from 3.2.38-rt56:

---

Steven Rostedt (1):
  Linux 3.2.38-rt57

Thomas Gleixner (5):
  drivers-tty-pl011-irq-disable-madness.patch
  mmci: Remove bogus local_irq_save()
  sched: Init idle->on_rq in init_idle()
  sched: Check for idle task in might_sleep()
  mm: swap: Initialize local locks early


 drivers/mmc/host/mmci.c |5 -
 drivers/tty/serial/amba-pl011.c |   15 ++-
 kernel/sched.c  |4 +++-
 localversion-rt |2 +-
 mm/swap.c   |   12 +---
 5 files changed, 23 insertions(+), 15 deletions(-)
---
diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index 0726e59..5d7bf83 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -859,15 +859,12 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
struct sg_mapping_iter *sg_miter = &host->sg_miter;
struct variant_data *variant = host->variant;
void __iomem *base = host->base;
-   unsigned long flags;
u32 status;
 
status = readl(base + MMCISTATUS);
 
dev_dbg(mmc_dev(host->mmc), "irq1 (pio) %08x\n", status);
 
-   local_irq_save(flags);
-
do {
unsigned int remain, len;
char *buffer;
@@ -907,8 +904,6 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
 
sg_miter_stop(sg_miter);
 
-   local_irq_restore(flags);
-
/*
 * If we have less than the fifo 'half-full' threshold to transfer,
 * trigger a PIO interrupt as soon as any data is available.
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index fe9f111..1fbaf66 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1761,13 +1761,19 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
 
clk_enable(uap->clk);
 
-   local_irq_save(flags);
+   /*
+* local_irq_save(flags);
+*
+* This local_irq_save() is nonsense. If we come in via sysrq
+* handling then interrupts are already disabled. Aside of
+* that the port.sysrq check is racy on SMP regardless.
+   */
if (uap->port.sysrq)
locked = 0;
else if (oops_in_progress)
-   locked = spin_trylock(&uap->port.lock);
+   locked = spin_trylock_irqsave(&uap->port.lock, flags);
else
-   spin_lock(&uap->port.lock);
+   spin_lock_irqsave(&uap->port.lock, flags);
 
/*
 *  First save the CR then disable the interrupts
@@ -1789,8 +1795,7 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
writew(old_cr, uap->port.membase + UART011_CR);
 
if (locked)
-   spin_unlock(&uap->port.lock);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&uap->port.lock, flags);
 
clk_disable(uap->clk);
 }
diff --git a/kernel/sched.c b/kernel/sched.c
index b318b4a..14219ed 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6519,6 +6519,7 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
rcu_read_unlock();
 
rq->curr = rq->idle = idle;
+   idle->on_rq = 1;
 #if defined(CONFIG_SMP)
idle->on_cpu = 1;
 #endif
@@ -8936,7 +8937,8 @@ void __might_sleep(const char *file, int line, int preempt_offset)
static unsigned long prev_jiffy;/* ratelimiting */
 
rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
-   if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
+   if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
+!is_idle_task(current)) ||
system_state != SYSTEM_RUNNING || oops_in_progress)
return;
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
diff --git a/localversion-rt b/localversion-rt
index fdb0f88..c06cc435 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt56
+-rt57
diff --git a/mm/swap.c b/mm/swap.c
index e3f7d6f..c428897 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -772,6 +772,15 @@ unsigned pagevec_lookup(struct pagevec *pvec, struct address_space *mapping,
 
 

[PATCH 09/16] zcache: Move the last of the debugfs counters out

2013-02-13 Thread Konrad Rzeszutek Wilk
We now have in zcache-main only the counters that
are not debugfs related.
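
For anyone new to the pattern, here is a minimal userspace sketch of what
the inc_*() helpers are for: callers bump a counter through an inline
function, and a build without debug support gets empty stubs instead, so
the call sites never need their own #ifdef. DEMO_DEBUG_COUNTERS and every
demo_* name below are invented for this illustration; the real guard and
the real counters are the ones in debug.h.

```c
#include <sys/types.h>	/* ssize_t */

/* Hypothetical switch; stands in for whatever option guards the counters. */
#ifndef DEMO_DEBUG_COUNTERS
#define DEMO_DEBUG_COUNTERS 1
#endif

#if DEMO_DEBUG_COUNTERS
/* Counting races are tolerated, so plain (non-atomic) counters are used. */
static ssize_t demo_flush_total;
static ssize_t demo_evicted_zpages;

static inline void inc_demo_flush_total(void) { demo_flush_total++; }
static inline void inc_demo_evicted_zpages(unsigned zpages) { demo_evicted_zpages += zpages; }
static inline ssize_t read_demo_flush_total(void) { return demo_flush_total; }
static inline ssize_t read_demo_evicted_zpages(void) { return demo_evicted_zpages; }
#else
/* Debug support compiled out: same call sites, no storage, no code. */
static inline void inc_demo_flush_total(void) { }
static inline void inc_demo_evicted_zpages(unsigned zpages) { (void)zpages; }
static inline ssize_t read_demo_flush_total(void) { return 0; }
static inline ssize_t read_demo_evicted_zpages(void) { return 0; }
#endif

/* A caller looks the same whether or not the counters exist. */
static void demo_flush(unsigned zpages)
{
	inc_demo_flush_total();
	inc_demo_evicted_zpages(zpages);
}
```

With the switch left at 1, calling demo_flush(3) and then demo_flush(2)
leaves the flush counter at 2 and the evicted-zpages counter at 5; with it
set to 0 both readers return 0 and the increments compile away.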

Signed-off-by: Konrad Rzeszutek Wilk 
---
 drivers/staging/zcache/debug.h   | 80 +++-
 drivers/staging/zcache/zcache-main.c | 75 +
 2 files changed, 89 insertions(+), 66 deletions(-)

diff --git a/drivers/staging/zcache/debug.h b/drivers/staging/zcache/debug.h
index 98dc491..eef67db 100644
--- a/drivers/staging/zcache/debug.h
+++ b/drivers/staging/zcache/debug.h
@@ -128,34 +128,56 @@ static inline unsigned long curr_pageframes_count(void)
atomic_read(&zcache_pers_pageframes_atomic);
 };
 /* but for the rest of these, counting races are ok */
-extern ssize_t zcache_flush_total;
-extern ssize_t zcache_flush_found;
-extern ssize_t zcache_flobj_total;
-extern ssize_t zcache_flobj_found;
-extern ssize_t zcache_failed_eph_puts;
-extern ssize_t zcache_failed_pers_puts;
-extern ssize_t zcache_failed_getfreepages;
-extern ssize_t zcache_failed_alloc;
-extern ssize_t zcache_put_to_flush;
-extern ssize_t zcache_compress_poor;
-extern ssize_t zcache_mean_compress_poor;
-extern ssize_t zcache_eph_ate_tail;
-extern ssize_t zcache_eph_ate_tail_failed;
-extern ssize_t zcache_pers_ate_eph;
-extern ssize_t zcache_pers_ate_eph_failed;
-extern ssize_t zcache_evicted_eph_zpages;
-extern ssize_t zcache_evicted_eph_pageframes;
+static ssize_t zcache_flush_total;
+static ssize_t zcache_flush_found;
+static ssize_t zcache_flobj_total;
+static ssize_t zcache_flobj_found;
+static ssize_t zcache_failed_eph_puts;
+static ssize_t zcache_failed_pers_puts;
+static ssize_t zcache_failed_getfreepages;
+static ssize_t zcache_failed_alloc;
+static ssize_t zcache_put_to_flush;
+static ssize_t zcache_compress_poor;
+static ssize_t zcache_mean_compress_poor;
+static ssize_t zcache_eph_ate_tail;
+static ssize_t zcache_eph_ate_tail_failed;
+static ssize_t zcache_pers_ate_eph;
+static ssize_t zcache_pers_ate_eph_failed;
+static ssize_t zcache_evicted_eph_zpages;
+static ssize_t zcache_evicted_eph_pageframes;
+
 extern ssize_t zcache_last_active_file_pageframes;
 extern ssize_t zcache_last_inactive_file_pageframes;
 extern ssize_t zcache_last_active_anon_pageframes;
 extern ssize_t zcache_last_inactive_anon_pageframes;
-extern ssize_t zcache_eph_nonactive_puts_ignored;
-extern ssize_t zcache_pers_nonactive_puts_ignored;
+static ssize_t zcache_eph_nonactive_puts_ignored;
+static ssize_t zcache_pers_nonactive_puts_ignored;
 #ifdef CONFIG_ZCACHE_WRITEBACK
 extern ssize_t zcache_writtenback_pages;
 extern ssize_t zcache_outstanding_writeback_pages;
 #endif
 
+static inline void inc_zcache_flush_total(void) { zcache_flush_total ++; };
+static inline void inc_zcache_flush_found(void) { zcache_flush_found ++; };
+static inline void inc_zcache_flobj_total(void) { zcache_flobj_total ++; };
+static inline void inc_zcache_flobj_found(void) { zcache_flobj_found ++; };
+static inline void inc_zcache_failed_eph_puts(void) { zcache_failed_eph_puts ++; };
+static inline void inc_zcache_failed_pers_puts(void) { zcache_failed_pers_puts ++; };
+static inline void inc_zcache_failed_getfreepages(void) { zcache_failed_getfreepages ++; };
+static inline void inc_zcache_failed_alloc(void) { zcache_failed_alloc ++; };
+static inline void inc_zcache_put_to_flush(void) { zcache_put_to_flush ++; };
+static inline void inc_zcache_compress_poor(void) { zcache_compress_poor ++; };
+static inline void inc_zcache_mean_compress_poor(void) { zcache_mean_compress_poor ++; };
+static inline void inc_zcache_eph_ate_tail(void) { zcache_eph_ate_tail ++; };
+static inline void inc_zcache_eph_ate_tail_failed(void) { zcache_eph_ate_tail_failed ++; };
+static inline void inc_zcache_pers_ate_eph(void) { zcache_pers_ate_eph ++; };
+static inline void inc_zcache_pers_ate_eph_failed(void) { zcache_pers_ate_eph_failed ++; };
+static inline void inc_zcache_evicted_eph_zpages(unsigned zpages) { zcache_evicted_eph_zpages += zpages; };
+static inline void inc_zcache_evicted_eph_pageframes(void) { zcache_evicted_eph_pageframes ++; };
+
+static inline void inc_zcache_eph_nonactive_puts_ignored(void) { zcache_eph_nonactive_puts_ignored ++; };
+static inline void inc_zcache_pers_nonactive_puts_ignored(void) { zcache_pers_nonactive_puts_ignored ++; };
+
 int zcache_debugfs_init(void);
 #else
 static inline void inc_zcache_obj_count(void) { };
@@ -184,4 +206,24 @@ static inline int zcache_debugfs_init(void)
 {
return 0;
 };
+static inline void inc_zcache_flush_total(void) { };
+static inline void inc_zcache_flush_found(void) { };
+static inline void inc_zcache_flobj_total(void) { };
+static inline void inc_zcache_flobj_found(void) { };
+static inline void inc_zcache_failed_eph_puts(void) { };
+static inline void inc_zcache_failed_pers_puts(void) { };
+static inline void inc_zcache_failed_getfreepages(void) { };
+static inline void inc_zcache_failed_alloc(void) { };
+static inline void inc_zcache_put_to_flush(void) { };
+static inline 

[ANNOUNCE] 3.4.29-rt42

2013-02-13 Thread Steven Rostedt

Dear RT Folks,

I'm pleased to announce the 3.4.29-rt42 stable release.


You can get this release via the git tree at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git

  Head SHA1: eab759b5f284bd3c29aaa12f41623ef2cb2ac60d


Or to build 3.4.29-rt42 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.4.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.4.29.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/patch-3.4.29-rt42.patch.xz


You can also build from 3.4.29-rt41 by applying the incremental patch:

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.4/incr/patch-3.4.29-rt41-rt42.patch.xz



Enjoy,

-- Steve


Changes from 3.4.29-rt41:

---

Steven Rostedt (1):
  Linux 3.4.29-rt42

Thomas Gleixner (5):
  drivers-tty-pl011-irq-disable-madness.patch
  mmci: Remove bogus local_irq_save()
  sched: Init idle->on_rq in init_idle()
  sched: Check for idle task in might_sleep()
  mm: swap: Initialize local locks early


 drivers/mmc/host/mmci.c |5 -
 drivers/tty/serial/amba-pl011.c |   15 ++-
 kernel/sched/core.c |4 +++-
 localversion-rt |2 +-
 mm/swap.c   |   12 +---
 5 files changed, 23 insertions(+), 15 deletions(-)
---
diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index 032b847..6f8a1a7 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -910,15 +910,12 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
struct sg_mapping_iter *sg_miter = &host->sg_miter;
struct variant_data *variant = host->variant;
void __iomem *base = host->base;
-   unsigned long flags;
u32 status;
 
status = readl(base + MMCISTATUS);
 
dev_dbg(mmc_dev(host->mmc), "irq1 (pio) %08x\n", status);
 
-   local_irq_save(flags);
-
do {
unsigned int remain, len;
char *buffer;
@@ -958,8 +955,6 @@ static irqreturn_t mmci_pio_irq(int irq, void *dev_id)
 
sg_miter_stop(sg_miter);
 
-   local_irq_restore(flags);
-
/*
 * If we have less than the fifo 'half-full' threshold to transfer,
 * trigger a PIO interrupt as soon as any data is available.
diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
index b69356c..7b29c92 100644
--- a/drivers/tty/serial/amba-pl011.c
+++ b/drivers/tty/serial/amba-pl011.c
@@ -1788,13 +1788,19 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
 
clk_enable(uap->clk);
 
-   local_irq_save(flags);
+   /*
+* local_irq_save(flags);
+*
+* This local_irq_save() is nonsense. If we come in via sysrq
+* handling then interrupts are already disabled. Aside of
+* that the port.sysrq check is racy on SMP regardless.
+   */
if (uap->port.sysrq)
locked = 0;
else if (oops_in_progress)
-   locked = spin_trylock(&uap->port.lock);
+   locked = spin_trylock_irqsave(&uap->port.lock, flags);
else
-   spin_lock(&uap->port.lock);
+   spin_lock_irqsave(&uap->port.lock, flags);
 
/*
 *  First save the CR then disable the interrupts
@@ -1816,8 +1822,7 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
writew(old_cr, uap->port.membase + UART011_CR);
 
if (locked)
-   spin_unlock(&uap->port.lock);
-   local_irq_restore(flags);
+   spin_unlock_irqrestore(&uap->port.lock, flags);
 
clk_disable(uap->clk);
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fec5603..751ec60 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5264,6 +5264,7 @@ void __cpuinit init_idle(struct task_struct *idle, int cpu)
rcu_read_unlock();
 
rq->curr = rq->idle = idle;
+   idle->on_rq = 1;
 #if defined(CONFIG_SMP)
idle->on_cpu = 1;
 #endif
@@ -7535,7 +7536,8 @@ void __might_sleep(const char *file, int line, int preempt_offset)
static unsigned long prev_jiffy;/* ratelimiting */
 
rcu_sleep_check(); /* WARN_ON_ONCE() by default, no rate limit reqd. */
-   if ((preempt_count_equals(preempt_offset) && !irqs_disabled()) ||
+   if ((preempt_count_equals(preempt_offset) && !irqs_disabled() &&
+!is_idle_task(current)) ||
system_state != SYSTEM_RUNNING || oops_in_progress)
return;
if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy)
diff --git a/localversion-rt b/localversion-rt
index 629e0b4..8bdfb9a 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt41
+-rt42
diff --git a/mm/swap.c b/mm/swap.c
index 2051da9..62dc70c 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -767,6 +767,15 @@ unsigned pagevec_lookup_tag(struct pagevec *pvec, struct 
address_space 

[PATCH] staging/rtl8192u/ieee80211: Fix buffer overflow in ieee80211_softmac_wx.c

2013-02-13 Thread Peter Huewe
Clang/scan-build complains about a possible buffer overflow in
ieee80211_wx_get_name:

.../staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c:499:3:
warning: String copy function overflows destination buffer
strcat(wrqu->name," link..");

.../staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c:497:3:
warning: String copy function overflows destination buffer
strcat(wrqu->name," linked");

The buffer wrqu->name is only IFNAMSIZ bytes big (currently 16),
so if we have a "802.11b/g/n linked" device we overrun the buffer by 3
bytes.

-> Use strlcpy / strlcat to populate the name.
This is done in a similar fashion in
staging/rtl8187se/ieee80211/ieee80211_softmac_wx.c
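
As a rough userspace illustration of the arithmetic above (this is not the
driver code), the sketch below rebuilds the longest name with a bounded
append. demo_strlcat(), demo_build_name() and DEMO_IFNAMSIZ are made-up
names for this example; demo_strlcat() only mimics the truncating
behaviour of the kernel's strlcat() and is assumed here because not every
libc ships one.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16	/* value quoted above for IFNAMSIZ */

/*
 * Hypothetical stand-in for strlcat(): append src to dst, never writing
 * more than size bytes in total and always NUL-terminating. Returns the
 * length the string would have had without truncation.
 */
static size_t demo_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = 0;
	size_t slen = strlen(src);
	size_t room, n;

	while (dlen < size && dst[dlen] != '\0')
		dlen++;
	if (dlen == size)		/* dst not terminated within size */
		return size + slen;

	room = size - dlen - 1;		/* bytes left, keeping one for the NUL */
	n = slen < room ? slen : room;
	memcpy(dst + dlen, src, n);
	dst[dlen + n] = '\0';
	return dlen + slen;
}

/* Build the worst-case name: CCK + OFDM modulation, N mode, linked. */
static size_t demo_build_name(char name[DEMO_IFNAMSIZ])
{
	name[0] = '\0';
	demo_strlcat(name, "802.11", DEMO_IFNAMSIZ);
	demo_strlcat(name, "b", DEMO_IFNAMSIZ);
	demo_strlcat(name, "/g", DEMO_IFNAMSIZ);
	demo_strlcat(name, "/n", DEMO_IFNAMSIZ);
	return demo_strlcat(name, " linked", DEMO_IFNAMSIZ);
}
```

Called on a 16-byte buffer, demo_build_name() leaves "802.11b/g/n lin" in
it and returns 18, the length the unbounded strcat() chain would have
produced. Together with the terminating NUL that is 19 bytes, 3 more than
the buffer holds.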

While at it, clean up some whitespace issues.

Signed-off-by: Peter Huewe 
---
 .../rtl8192u/ieee80211/ieee80211_softmac_wx.c  |   29 ++-
 1 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c
index 45422db..60746b8 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_softmac_wx.c
@@ -482,22 +482,23 @@ int ieee80211_wx_get_name(struct ieee80211_device *ieee,
 struct iw_request_info *info,
 union iwreq_data *wrqu, char *extra)
 {
-   strcpy(wrqu->name, "802.11");
-   if(ieee->modulation & IEEE80211_CCK_MODULATION){
-   strcat(wrqu->name, "b");
-   if(ieee->modulation & IEEE80211_OFDM_MODULATION)
-   strcat(wrqu->name, "/g");
-   }else if(ieee->modulation & IEEE80211_OFDM_MODULATION)
-   strcat(wrqu->name, "g");
-   if (ieee->mode & (IEEE_N_24G | IEEE_N_5G))
-   strcat(wrqu->name, "/n");
+   strlcpy(wrqu->name, "802.11", IFNAMSIZ);
+   if (ieee->modulation & IEEE80211_CCK_MODULATION) {
+   strlcat(wrqu->name, "b", IFNAMSIZ);
+   if (ieee->modulation & IEEE80211_OFDM_MODULATION)
+   strlcat(wrqu->name, "/g", IFNAMSIZ);
+   } else if (ieee->modulation & IEEE80211_OFDM_MODULATION) {
+   strlcat(wrqu->name, "g", IFNAMSIZ);
+   }
 
-   if((ieee->state == IEEE80211_LINKED) ||
-   (ieee->state == IEEE80211_LINKED_SCANNING))
-   strcat(wrqu->name," linked");
-   else if(ieee->state != IEEE80211_NOLINK)
-   strcat(wrqu->name," link..");
+   if (ieee->mode & (IEEE_N_24G | IEEE_N_5G))
+   strlcat(wrqu->name, "/n", IFNAMSIZ);
 
+   if ((ieee->state == IEEE80211_LINKED) ||
+   (ieee->state == IEEE80211_LINKED_SCANNING))
+   strlcat(wrqu->name, " linked", IFNAMSIZ);
+   else if (ieee->state != IEEE80211_NOLINK)
+   strlcat(wrqu->name, " link..", IFNAMSIZ);
 
return 0;
 }
-- 
1.7.8.6
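
For readers following the arithmetic in the description, here is a minimal userspace sketch of the truncation behaviour, assuming a 16-byte buffer as stated above. The demo_* helpers and DEMO_IFNAMSIZ are hypothetical stand-ins written for this illustration; they are not the kernel's strlcpy()/strlcat() and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_IFNAMSIZ 16	/* mirrors the IFNAMSIZ value quoted in the description */

/* Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes, always terminate. */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);
	size_t n;

	assert(size > 0);
	n = (len >= size) ? size - 1 : len;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return len;		/* length that was asked for, not what fit */
}

/* Hypothetical stand-in for strlcat(): append what fits, always terminate. */
static size_t demo_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = 0;
	size_t slen = strlen(src);
	size_t room;

	while (dlen < size && dst[dlen] != '\0')
		dlen++;
	if (dlen == size)	/* dst is not terminated within size: append nothing */
		return size + slen;

	room = size - dlen - 1;	/* bytes left before the terminator */
	if (slen < room)
		room = slen;
	memcpy(dst + dlen, src, room);
	dst[dlen + room] = '\0';
	return dlen + slen;	/* length the full string would have had */
}

/* Build the longest name from the description; return the length it wanted. */
size_t demo_build_name(char *name)
{
	demo_strlcpy(name, "802.11", DEMO_IFNAMSIZ);
	demo_strlcat(name, "b", DEMO_IFNAMSIZ);
	demo_strlcat(name, "/g", DEMO_IFNAMSIZ);
	demo_strlcat(name, "/n", DEMO_IFNAMSIZ);
	return demo_strlcat(name, " linked", DEMO_IFNAMSIZ);
}
```

With a 16-byte buffer, demo_build_name() reports a wanted length of 18 characters (19 bytes with the terminator, i.e. 3 more than the buffer holds) and leaves the truncated string "802.11b/g/n lin" in place instead of writing past the end.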



Re: [ANNOUNCE] 3.6.11-rt29

2013-02-13 Thread Paul Gortmaker
On Wed, Feb 13, 2013 at 9:13 AM, Thomas Gleixner  wrote:
> Dear RT Folks,
>
> I'm pleased to announce the 3.6.11-rt29 release.
>
> Changes since 3.6.11-rt26:
>
>1) Fix the RT highmem implementation on x86 this time really. The
>   issue I was seeing with kmap_atomic and friends was actually
>   when CONFIG_HIGHMEM was disabled. x86_32 uses the atomic maps for
>   io_mapping_map_atomic_wc() even when CONFIG_HIGHMEM is off.
>
>2) Modify the kmap_atomic per thread storage mechanism to reduce
>   code in switch_to
>3) Rewrite RT highmem support for ARM with the kmap_atomic switch
>   mechanism like x86_32 uses it.
>
> This is probably the last release for 3.6 from my side. Steven might
> keep it maintained until the 3.8-rt stabilizes, but that's not yet
> decided.

I happened to notice that there was a misplaced dependency, that had
existed long before 3.6 trees.  It was meant for RCU's Kconfig item
"PREEMPT_RCU" but got accidentally placed under the Kconfig
"RT_GROUP_SCHED" option.  The fixup can be had by fetching:

git://git.kernel.org/pub/scm/linux/kernel/git/paulg/3.6-rt-patches.git

on branch "v3.6.11-rt29-fixes".  Unless you have been making your
own custom kernels, this change probably does not impact you.

Thanks,
Paul.
--


>
> The delta patch against 3.6.11-rt28 is appended below and can be found
> here:
>
>   
> http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/incr/patch-3.6.11-rt28-rt29.patch.xz
>
> The RT patch against 3.6.11 can be found here:
>
>   
> http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.11-rt29.patch.xz
>
> The split quilt queue is available at:
>
>   
> http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.11-rt29.tar.xz
>
> Enjoy,
>
> tglx
>
> ->
> Index: linux-stable/arch/arm/include/asm/highmem.h
> ===
> --- linux-stable.orig/arch/arm/include/asm/highmem.h
> +++ linux-stable/arch/arm/include/asm/highmem.h
> @@ -57,25 +57,10 @@ static inline void *kmap_high_get(struct
>  #ifdef CONFIG_HIGHMEM
>  extern void *kmap(struct page *page);
>  extern void kunmap(struct page *page);
> -# ifndef CONFIG_PREEMPT_RT_FULL
>  extern void *kmap_atomic(struct page *page);
>  extern void __kunmap_atomic(void *kvaddr);
>  extern void *kmap_atomic_pfn(unsigned long pfn);
>  extern struct page *kmap_atomic_to_page(const void *ptr);
> -# else
> -#  define kmap_atomic(page)\
> -   ({ pagefault_disable(); kmap(page); })
> -
> -#  define kmap_atomic_pfn(pfn) \
> -   ({ pagefault_disable(); kmap(pfn_to_page(pfn)) })
> -
> -#  define __kunmap_atomic(kvaddr)  \
> -   do { kunmap(kmap_to_page(kvaddr)); pagefault_enable(); } while(0)
> -
> -#  define kmap_atomic_to_page(kvaddr)  \
> -   kmap_to_page(kvaddr)
> -
> -# endif
>  #endif
>
>  #endif
> Index: linux-stable/arch/arm/mm/highmem.c
> ===
> --- linux-stable.orig/arch/arm/mm/highmem.c
> +++ linux-stable/arch/arm/mm/highmem.c
> @@ -36,9 +36,9 @@ void kunmap(struct page *page)
>  }
>  EXPORT_SYMBOL(kunmap);
>
> -#ifndef CONFIG_PREEMPT_RT_FULL
>  void *kmap_atomic(struct page *page)
>  {
> +   pte_t pte = mk_pte(page, kmap_prot);
> unsigned int idx;
> unsigned long vaddr;
> void *kmap;
> @@ -77,7 +77,10 @@ void *kmap_atomic(struct page *page)
>  * in place, so the contained TLB flush ensures the TLB is updated
>  * with the new mapping.
>  */
> -   set_top_pte(vaddr, mk_pte(page, kmap_prot));
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +   current->kmap_pte[type] = pte;
> +#endif
> +   set_top_pte(vaddr, pte);
>
> return (void *)vaddr;
>  }
> @@ -111,6 +114,7 @@ EXPORT_SYMBOL(__kunmap_atomic);
>
>  void *kmap_atomic_pfn(unsigned long pfn)
>  {
> +   pte_t pte = pfn_pte(pfn, kmap_prot);
> unsigned long vaddr;
> int idx, type;
>
> @@ -122,7 +126,10 @@ void *kmap_atomic_pfn(unsigned long pfn)
>  #ifdef CONFIG_DEBUG_HIGHMEM
> BUG_ON(!pte_none(get_top_pte(vaddr)));
>  #endif
> -   set_top_pte(vaddr, pfn_pte(pfn, kmap_prot));
> +#ifdef CONFIG_PREEMPT_RT_FULL
> +   current->kmap_pte[type] = pte;
> +#endif
> +   set_top_pte(vaddr, pte);
>
> return (void *)vaddr;
>  }
> @@ -136,4 +143,28 @@ struct page *kmap_atomic_to_page(const v
>
> return pte_page(get_top_pte(vaddr));
>  }
> +
> +#if defined CONFIG_PREEMPT_RT_FULL
> +void switch_kmaps(struct task_struct *prev_p, struct task_struct *next_p)
> +{
> +   int i;
> +
> +   /*
> +* Clear @prev's kmap_atomic mappings
> +*/
> +   for (i = 0; i < prev_p->kmap_idx; i++) {
> +   int idx = i + KM_TYPE_NR * smp_processor_id();
> +
> +   set_top_pte(__fix_to_virt(FIX_KMAP_BEGIN + idx), __pte(0));
> +   }
> +   /*
> +* Restore @next_p's kmap_atomic mappings
> +*/
> +   for (i = 0; i < 

Re: [PATCH 1/4] gpiolib: check descriptors validity before use

2013-02-13 Thread Alexandre Courbot
On Thu, Feb 14, 2013 at 7:49 AM, Ryan Mallon  wrote:
> Is it really useful to use the same pr_debug for the error case? Why not do:
>
> desc = gpio_to_desc(gpio);
> if (!desc) {
> 	pr_debug("%s - Invalid gpio %d\n", __func__, gpio);
> 	return -EINVAL;
> }
>
> ...
>
> At this point desc is known valid, though you could just use the gpio
> number that was passed in (assuming that it is always the same as
> desc_to_gpio).
>
> pr_debug("%s: gpio%d status %d\n", __func__,
>  desc_to_gpio(desc), status);
> return status;
>
> That provides more information (the original gpio number and the reason
> for the -EINVAL) if the gpio is not valid, and removes the ugly ternary
> operator from the pr_debug. Same goes for the other functions.

That's mainly a style issue - I tried to preserve the code's behavior
as much as possible with these patches. But indeed the invalid GPIO
case is special, and on second thought identifying invalid GPIOs as
"gpio-1" is confusing. One could also say horrible.

Guess this deserves to be fixed in a v2. Thanks.

Alex.


Read support for fat_fallocate()? (was [v2] fat: editions to support fat_fallocate())

2013-02-13 Thread Andrew Bartlett
(apologies for the duplicate mail, I typo-ed the maintainer's address)

G'day,

I've been looking into the patch "[v2] fat: editions to support
fat_fallocate()" and I wonder if there is a way we can split this issue
in two, so that we get at least some of the patch into the kernel.

https://lkml.org/lkml/2012/10/13/75
https://patchwork.kernel.org/patch/1589161/

What I'm wanting to discuss (and perhaps implement, with you if
possible) is splitting this patch into writing to existing pre-allocated
files, and creating a new pre-allocation.

If Windows does, as you claim, simply read preallocations as zero, and
writes to them normally and without error, then Linux should do the
same.  Here of course I'm assuming that Windows is not preallocating,
but instead simply trying to recover gracefully and safely from a simple
'file system corruption', where the sectors are allocated but not used. 

The bulk of this patch is implementing this transparent recovery, and it
seems relatively harmless to include it in the kernel.

Then vendors doing TV streaming, or in my case copies of large files
onto Samba-mounted USB FAT devices, can add only the smaller patch to
implement fallocate, at their own risk and fully knowing that it will be
regarded as corrupt on Linux. 

If accepted, read support will, over a period of years, trickle down to
other Linux users, broadening the base that can still read these
'corrupt' drives, no matter the cause. 

I hope you agree that this is a practical way forward, and I look
forward to working with you on this.

Thanks,

Andrew Bartlett
-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org





Re: [PATCH] x86: Lock down MSR writing in secure boot

2013-02-13 Thread Matthew Garrett
On Wed, 2013-02-13 at 17:08 -0800, H. Peter Anvin wrote:

> Well, for at least things with device nodes (/dev/mem, /dev/msr and so
> on) it should be possible, no?  ioperm() and iopl() are another matter.

Sure, if we can guarantee that a signed userspace loads a signed SELinux
policy before any unsigned code runs. But, realistically, that's not
going to be possible.

-- 
Matthew Garrett | mj...@srcf.ucam.org


[PATCH] staging/wlan-ng: Fix 'Branch condition evaluates to a garbage value' in p80211netdev.c

2013-02-13 Thread Peter Huewe
clang/scan-build complains that:
p80211netdev.c:451:6: warning: Branch condition evaluates to a garbage
value
if ((p80211_wep.data) && (p80211_wep.data != skb->data))
^

This can happen in p80211knetdev_hard_start_xmit if the check
(wlandev->state != WLAN_DEVICE_OPEN) evaluates to true: the execution
flow then continues at the 'failed' label, where p80211_wep.data is
used without being initialized first.

-> Initialize the data field to NULL to fix this issue.

Signed-off-by: Peter Huewe 
---
 drivers/staging/wlan-ng/p80211netdev.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/wlan-ng/p80211netdev.c b/drivers/staging/wlan-ng/p80211netdev.c
index 750330f..0039e08 100644
--- a/drivers/staging/wlan-ng/p80211netdev.c
+++ b/drivers/staging/wlan-ng/p80211netdev.c
@@ -351,6 +351,8 @@ static int p80211knetdev_hard_start_xmit(struct sk_buff *skb,
union p80211_hdr p80211_hdr;
struct p80211_metawep p80211_wep;
 
+   p80211_wep.data = NULL;
+
if (skb == NULL)
return NETDEV_TX_OK;
 
-- 
1.7.8.6
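
The control flow that scan-build flags can be reduced to a few lines. This is only a sketch under stated assumptions: struct demo_metawep and demo_xmit() are hypothetical names made up for illustration, and the real function does considerably more between the early exit and the label.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct p80211_metawep; only the relevant field. */
struct demo_metawep {
	void *data;
};

/*
 * Returns 1 if the cleanup path would free a separately allocated WEP
 * buffer, 0 otherwise.
 */
int demo_xmit(int device_open, void *skb_data, void *wep_buf)
{
	struct demo_metawep wep;
	int would_free = 0;

	assert(skb_data != NULL);

	wep.data = NULL;	/* the fix: the test at 'failed' now reads a defined value */

	if (!device_open)
		goto failed;	/* early exit, taken before wep.data is set below */

	wep.data = wep_buf;	/* assumption: the encryption step stores its buffer here */

failed:
	if (wep.data && wep.data != skb_data)
		would_free = 1;
	return would_free;
}
```

Without the initialisation, the early-exit path would evaluate wep.data while it still holds whatever happened to be on the stack, which is the garbage value the warning refers to.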



RE: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks

2013-02-13 Thread Moore, Robert
> > > I thought about that, but actually there's no guarantee that the
> > > handle will be valid after _EJ0 as far as I can say.  So the race
> > > condition is going to be there anyway and using struct acpi_device
> > > just makes it easier to avoid it.
> >
> > In theory, yes, a stale handle could be a problem, if _EJ0 performs
> > unload table and if ACPICA frees up its internal data structure
> > pointed by the handle as a result.  But we should not see such issue
> > now since we do not support dynamic ACPI namespace yet.
> 
> I'm waiting for information from Bob about that.  If we can assume ACPI
> handles to be always valid, that will simplify things quite a bit.

If a table is unloaded, all the namespace nodes for that table are removed from 
the namespace, and thus any ACPI_HANDLE pointers go stale and invalid.

Bob



linux-next: build failure after merge of the tip tree

2013-02-13 Thread Stephen Rothwell
Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/thermal/intel_powerclamp.c: In function 'clamp_thread':
drivers/thermal/intel_powerclamp.c:360:21: error: 'MAX_USER_RT_PRIO' undeclared 
(first use in this function)

Caused by commit 8bd75c77b7c6 ("sched/rt: Move rt specific bits into new
header file") interacting with commit d6d71ee4a14a ("PM: Introduce Intel
PowerClamp Driver") from the thermal tree.

I applied this merge fix patch and can carry it as necessary:

From: Stephen Rothwell 
Date: Thu, 14 Feb 2013 13:26:22 +1100
Subject: [PATCH] sched/rt: fix PowerClamp Driver for define move

Signed-off-by: Stephen Rothwell 
---
 drivers/thermal/intel_powerclamp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index ab3ed90..b40b37c 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-- 
1.8.1

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au




Re: [PATCH RFC 10/12] userns: Convert xfs to use kuid/kgid/kprojid where appropriate

2013-02-13 Thread Dave Chinner
On Wed, Feb 13, 2013 at 10:13:16AM -0800, Eric W. Biederman wrote:
> Joel Becker  writes:
> 
> > On Wed, Nov 21, 2012 at 10:55:24AM +1100, Dave Chinner wrote:
> >> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> >> > index 2778258..3656b88 100644
> >> > --- a/fs/xfs/xfs_inode.c
> >> > +++ b/fs/xfs/xfs_inode.c
> >> > @@ -570,11 +570,12 @@ xfs_dinode_from_disk(
> >> >  to->di_version = from ->di_version;
> >> >  to->di_format = from->di_format;
> >> >  to->di_onlink = be16_to_cpu(from->di_onlink);
> >> > -to->di_uid = be32_to_cpu(from->di_uid);
> >> > -to->di_gid = be32_to_cpu(from->di_gid);
> >> > +to->di_uid = make_kuid(&init_user_ns, be32_to_cpu(from->di_uid));
> >> > +to->di_gid = make_kgid(&init_user_ns, be32_to_cpu(from->di_gid));
> >> 
> >> You can't do this, because the incore inode structure is written
> >> directly to the log. This is effectively an on-disk format change.
> >
> > Yeah, I don't get this either.  Over in ocfs2, you do the
> > correct thing, translating at the boundary from ocfs2_dinode to struct
> > inode.
> 
> This is the boundary.

It is *a* boundary. It is the in-core disk inode to on disk inode
boundary (i.e. struct xfs_icdinode to struct xfs_dinode).
Namespaces don't belong at this boundary - this is internal XFS
stuff that nothing from the VFS should be interacting with. The
structure of XFS is roughly:

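	userspace
	----------------------
	   VFS
	----------------------
	 VFS/XFS	<< here is where you need to modify
	interface
	----------------------
	core XFS
	----------------------
	XFS/disk	<< here is where you actually modified
	interface
	----------------------
	 storage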


IOWs, the boundary you are looking for is the VFS/XFS boundary (i.e.
struct inode to struct xfs_icdinode). i.e. namespace aware uid/gid
is in the struct inode, flattened 32 bit values are in the struct
xfs_icdinode. The struct inode and the struct xfs_icdinode are both
embedded in the struct xfs_inode, so we just have to translate
between the two internal structures at the right point in time.

Hence for namespaces to work correctly, anything that is currently
using current_fs*id() for uid/gid comparison needs to be converted
to use the VFS inode values (i.e. VFS_I(ip)->i_*id). For values
written to the xfs inode, the VFS uid/gid needs to be flattened to
a 32bit value.

These flattened values are needed during inode allocation (for
initial on-disk values) and creating dquots associated with the new
inodes. You should be able to derive them from current_fs*id(),
right? Then when changing uid/gid via .setattr, we can flatten the
namespace aware VFS uid/gid into the XFS incore icdinode (i.e.
ip->i_d.di_*id) via the same method. Conversion from XFS on-disk to
namespace aware VFS uid/gid then occurs when initialising the
VFS inode from the XFS inode (i.e. in xfs_setup_inode() like I
previously suggested).

This keeps namespace aware uid/gid up at the VFS layer and
conversion at the VFS/XFS boundaries in the XFS code, and everything
should work fine.
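
To make the placement concrete, here is a toy userspace model of the two conversion points described above. Everything in it is a hypothetical stand-in: the demo_* types only mimic the shape of kuid_t, struct inode, struct xfs_icdinode and struct xfs_inode, and the id mapping is assumed to be one-to-one, so this sketches where the translation sits, not how the kernel implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for kuid_t: a namespace-aware id. */
typedef struct {
	uint32_t val;
} demo_kuid_t;

/* Stand-in for struct xfs_icdinode: flattened 32 bit values, safe to log. */
struct demo_icdinode {
	uint32_t di_uid;
};

/* Stand-in for struct inode: carries the namespace-aware value. */
struct demo_vfs_inode {
	demo_kuid_t i_uid;
};

/* Stand-in for struct xfs_inode: both structures are embedded in it. */
struct demo_xfs_inode {
	struct demo_icdinode i_d;
	struct demo_vfs_inode i_vnode;
};

/* Assumption: ids map one-to-one, so both directions are a plain copy. */
static demo_kuid_t demo_make_kuid(uint32_t uid)
{
	demo_kuid_t kuid = { uid };

	return kuid;
}

static uint32_t demo_from_kuid(demo_kuid_t kuid)
{
	return kuid.val;
}

/* XFS -> VFS: performed once, when the VFS inode is initialised. */
void demo_setup_inode(struct demo_xfs_inode *ip)
{
	assert(ip != NULL);
	ip->i_vnode.i_uid = demo_make_kuid(ip->i_d.di_uid);
}

/* VFS -> XFS: performed when setattr changes the owner. */
void demo_setattr_uid(struct demo_xfs_inode *ip, demo_kuid_t new_uid)
{
	assert(ip != NULL);
	ip->i_vnode.i_uid = new_uid;
	ip->i_d.di_uid = demo_from_kuid(new_uid);	/* flatten before it can be logged */
}
```

In this model the incore disk inode only ever sees plain 32 bit values, so whatever gets written to the log keeps its existing format.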

> The crazy thing is that xfs appears to
> directly write their incore inode structure into their journal. 

Off topic, but it's actually a very sane thing to do. It's called
logical object logging, as opposed to physical logging like ext3/4
and ocfs2 use. XFS uses a combination of logical logging
(superblock, dquots, inodes) and physical logging (via buffers).

Logical logging decouples in-memory object modification from buffer
IO and ensures the buffer is not a single point of serialisation
when multiple objects share a single buffer. Hence we can read/write
an inode buffer and concurrently modify inodes in memory from that
buffer at the same time.  i.e. we only need buffers for IO, not for
ongoing modifications.

This decoupling allows XFS to use large buffers for inodes and so
minimise IO for reading and/or writing inodes.  Further, we can also
easily serialise logged, in-memory modifications for all objects in
a single backing buffer with only minor interruption to ongoing
modifications. It also allows us to use simple fire-and-forget
writeback semantics for metadata.

IOWs, the use of logical logging techniques vastly improves
concurrency and scalability over the physical logging methods other
filesystems use. Call it crazy if you want, but I find that generally most
people say this simply because they don't understand why XFS does
what it does.

> I had
> missed the journal reference the first time through and simply assumed
> since this is where the disk inode to the incore inode conversion
> happened that the weird scary comment in the xfs header file was wrong.

Comments in XFS, especially weird scary ones, are rarely wrong. Some
of them might have been there for close on 20 years, but they are
our documentation for all the weird, scary stuff that XFS does.  I
rely on them being correct, so it's something I always pay 

[PATCH][WIP] dio rewrite

2013-02-13 Thread Kent Overstreet
Last posting: http://marc.info/?l=linux-fsdevel&m=136063048002755&w=2

Got it working and ran some benchmarks. On a high end SSD, doing 4k
random reads with fio I got around a 30% increase in throughput.

(That was without the aio changes I recently did. With those, throughput
was approximately doubled.)

The decrease in compiled binary size is even more dramatic than the
reduction in LOC:

3.8-rc7
   text	   data	    bss	    dec	    hex	filename
  11609	     16	      8	  11633	   2d71	fs/direct-io.o

My version:
   text	   data	    bss	    dec	    hex	filename
   3545	     16	      0	   3561	    de9	fs/direct-io.o

It's only been lightly tested - I haven't run xfstests yet - but there
shouldn't be anything broken excluding btrfs.

There's a few more performance optimizations I may do, but aside from
the btrfs issues I think it's essentially done.

Due to the sheer number of hairy corner cases in the dio code, I'd
really like to get as much review as possible. The new code should be
vastly easier to review and understand, I think.

Git repo:
http://evilpiepirate.org/git/linux-bcache.git block_stuff

---
 fs/direct-io.c | 1323 +++-
 1 file changed, 357 insertions(+), 966 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 8e838b1..1fb9fb4 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -8,7 +8,7 @@
  * 04Jul2002   Andrew Morton
  * Initial version
  * 11Sep2002   janet...@us.ibm.com
- * added readv/writev support.
+ * added readv/writev support.
  * 29Oct2002   Andrew Morton
  * rewrote bio_add_page() support.
  * 30Oct2002   pbad...@us.ibm.com
@@ -38,183 +38,37 @@
 #include 
 #include 
 
-/*
- * How many user pages to map in one call to get_user_pages().  This determines
- * the size of a structure in the slab cache
- */
-#define DIO_PAGES  64
-
-/*
- * This code generally works in units of "dio_blocks".  A dio_block is
- * somewhere between the hard sector size and the filesystem block size.  it
- * is determined on a per-invocation basis.   When talking to the filesystem
- * we need to convert dio_blocks to fs_blocks by scaling the dio_block quantity
- * down by dio->blkfactor.  Similarly, fs-blocksize quantities are converted
- * to bio_block quantities by shifting left by blkfactor.
- *
- * If blkfactor is zero then the user's request was aligned to the filesystem's
- * blocksize.
- */
-
 /* dio_state only used in the submission path */
-
 struct dio_submit {
-   struct bio *bio;/* bio under assembly */
-   unsigned blkbits;   /* doesn't change */
-   unsigned blkfactor; /* When we're using an alignment which
-  is finer than the filesystem's soft
-  blocksize, this specifies how much
-  finer.  blkfactor=2 means 1/4-block
-  alignment.  Does not change */
-   unsigned start_zero_done;   /* flag: sub-blocksize zeroing has
-  been performed at the start of a
-  write */
-   int pages_in_io;/* approximate total IO pages */
-   size_t  size;   /* total request size (doesn't change)*/
-   sector_t block_in_file; /* Current offset into the underlying
-  file in dio_block units. */
-   unsigned blocks_available;  /* At block_in_file.  changes */
-   int reap_counter;   /* rate limit reaping */
-   sector_t final_block_in_request;/* doesn't change */
-   unsigned first_block_in_page;   /* doesn't change, Used only once */
-   int boundary;   /* prev block is at a boundary */
-   get_block_t *get_block; /* block mapping function */
-   dio_submit_t *submit_io;/* IO submition function */
-
-   loff_t logical_offset_in_bio;   /* current first logical block in bio */
-   sector_t final_block_in_bio;/* current final block in bio + 1 */
-   sector_t next_block_for_io; /* next block to be put under IO,
-  in dio_blocks units */
-
-   /*
-* Deferred addition of a page to the dio.  These variables are
-* private to dio_send_cur_page(), submit_page_section() and
-* dio_bio_add_page().
-*/
-   struct page *cur_page;  /* The page */
-   unsigned cur_page_offset;   /* Offset into it, in bytes */
-   unsigned cur_page_len;  /* Nr of bytes at cur_page_offset */
-   sector_t cur_page_block;/* Where it starts */
-   loff_t cur_page_fs_offset;  /* Offset in file */
-
-   /*
-* Page fetching state. These variables belong to dio_refill_pages().
-*/
-   int curr_page;  /* 

Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread H. Peter Anvin
On 02/13/2013 05:31 PM, Linus Torvalds wrote:
> On Wed, Feb 13, 2013 at 4:54 PM, H. Peter Anvin  wrote:
>>
>> It does for the callee, but only on a whole-file basis.  It would be a
>> lot nicer if we could do it with function attributes.
> 
> A way to just set the callee-clobbered list on a per-function basis
> would be lovely. Gcc has limited support for this on some
> architectures, where you can specify "save every register for this
> function" in order to do things like interrupt handlers etc without
> even resorting to asm. But there is no generic (or even just x86)
> support for anything like it :-(
> 
> There are other calling-convention attributes that make me suspect gcc
> could easily do this (it already supports per-function ABI
> specification, so presumably it already has some concept of
> callee-saved registers being different for different attributes), but
> from my reading you currently have to generate asm wrappers by hand
> (and call them by hand with inline asm) if you want to do something
> like this.
> 

I just filed a gcc bugzilla on this:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56314

-hpa



Re: [tip:core/locking] x86/smp: Move waiting on contended ticket lock out of line

2013-02-13 Thread Linus Torvalds
On Wed, Feb 13, 2013 at 5:21 PM, Linus Torvalds
 wrote:
>
> Now, on other machines you get the call chain even with pebs because
> you can get the whole

Oops, that got cut short early, because I started looking up when PEBS
and the last-branch-buffer work together, and couldn't find it, and
then came back to the email and forgot to finish the sentence.

Anyway: sometimes you get call chains with precise events, sometimes
you don't. And it turns out that I can't find out which cores do both,
and which ones don't.

Linus


Re: [PATCH] aio: only suppress events from cancelled kiocbs if free_ioctx() is in progress

2013-02-13 Thread Kent Overstreet
On Wed, Feb 13, 2013 at 04:52:14PM -0500, Benjamin LaHaise wrote:
> The io_cancel() syscall allows for cancellation of iocbs in flight to
> generate a completion event.  The current behaviour of batch_complete_aio()
> is to suppress all completion events.  Some types of asynchronous operations
> cannot be cancelled synchronously, and must generate a completion event at
> some point after the io_cancel() syscall.  Instead, only suppress
> completion events during kioctx teardown by free_ioctx().
> 
> Signed-off-by: Benjamin LaHaise 
> ---
>  fs/aio.c |6 +-
>  1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 46f9dd0..1bcb818 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -81,6 +81,7 @@ struct kioctx {
>  
>   /* sys_io_setup currently limits this to an unsigned int */
>   unsigned        max_reqs;
> + unsigned        dead;

Can use percpu_ref_dead() for this - it returns true after
percpu_ref_kill() has been called (it's what I converted the old
ctx->dead uses to).

>  
>   unsigned long   mmap_base;
>   unsigned long   mmap_size;
> @@ -311,6 +312,7 @@ static void free_ioctx(struct kioctx *ctx)
>   struct kiocb *req;
>   unsigned cpu, head, avail;
>  
> + ctx->dead = 1;
>   spin_lock_irq(&ctx->ctx_lock);
>  
>   while (!list_empty(&ctx->active_reqs)) {
> @@ -749,7 +751,9 @@ void batch_complete_aio(struct batch_complete *batch)
>   n = rb_parent(n);
>   }
>  
> - if (unlikely(xchg(&req->ki_cancel,
> + /* Suppress cancelled events if free_ioctx() is in progress. */
> + if (unlikely(req->ki_ctx->dead &&
> +  xchg(&req->ki_cancel,
> KIOCB_CANCELLED) == KIOCB_CANCELLED)) {

I'm not seeing why we need to suppress events during teardown - if we
just let it be delivered to the ringbuffer like normal, the free_ioctx()
code will find it there.

I think the event suppressing code can be deleted entirely.


Re: [PATCH 2/2] aio: fix kioctx not being freed after cancellation at exit time

2013-02-13 Thread Kent Overstreet
On Wed, Feb 13, 2013 at 12:46:36PM -0500, Benjamin LaHaise wrote:
> The recent changes overhauling fs/aio.c introduced a bug that results in the
> kioctx not being freed when outstanding kiocbs are cancelled at exit_aio()
> time.  Specifically, a kiocb that is cancelled has its completion events
> discarded by batch_complete_aio(), which then fails to wake up the process
> stuck in free_ioctx().  Fix this by adding a wake_up() in batch_complete_aio()
> and modifying the wait_event() condition in free_ioctx() appropriately.
> 
> Signed-off-by: Benjamin LaHaise 
> ---
>  fs/aio.c |5 -
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index dc52b0c..46f9dd0 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -335,7 +335,9 @@ static void free_ioctx(struct kioctx *ctx)
>   kunmap_atomic(ring);
>  
>   while (atomic_read(&ctx->reqs_available) < ctx->nr) {
> - wait_event(ctx->wait, head != ctx->shadow_tail);
> + wait_event(ctx->wait,
> +(head != ctx->shadow_tail) ||
> +(atomic_read(&ctx->reqs_available) != ctx->nr));

That test looks backwards - I think we want to wait until reqs_available
== ctx->nr

>  
>   avail = (head <= ctx->shadow_tail ?
>ctx->shadow_tail : ctx->nr) - head;
> @@ -754,6 +756,7 @@ void batch_complete_aio(struct batch_complete *batch)
>* with free_ioctx()
>*/
>   atomic_inc(&req->ki_ctx->reqs_available);
> + wake_up(&req->ki_ctx->wait);
>   aio_put_req(req);
>   continue;
>   }
> -- 
> 1.7.4.1
> 
> 
> -- 
> "Thought is the essence of where you are now."


Re: [PATCH] x86: Lock down MSR writing in secure boot

2013-02-13 Thread Casey Schaufler
On 2/13/2013 5:04 PM, Matthew Garrett wrote:
> On Wed, 2013-02-13 at 16:44 -0800, Casey Schaufler wrote:
>
>> If you want that sort of granularity throw yourself on the SELinux
>> bandwagon. Fine grained capabilities are insane and unmanageable
>> and will only lead to tears. Security is despised because of the
>> notion that making systems impossible to use is a good thing.
> SELinux is completely unusable for this specific case.
>
Well, you'll get no argument from me there.



Re: [PATCH 1/2] aio: correct calculation of available events

2013-02-13 Thread Kent Overstreet
On Wed, Feb 13, 2013 at 12:45:52PM -0500, Benjamin LaHaise wrote:
> When the number of available events in the ring buffer is calculated,
> the avail calculation is incorrect when head == tail.  This is harmless
> in aio_read_events_ring(), but in free_ioctx() leads to the subsequent
> WARN_ON(atomic_read(&ctx->reqs_available) > ctx->nr).  Correct this.
> 
> Signed-off-by: Benjamin LaHaise 
> ---
>  fs/aio.c |5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index 24ba228..dc52b0c 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -337,7 +337,8 @@ static void free_ioctx(struct kioctx *ctx)
>   while (atomic_read(&ctx->reqs_available) < ctx->nr) {
>   wait_event(ctx->wait, head != ctx->shadow_tail);
>  
> - avail = (head < ctx->shadow_tail ? ctx->shadow_tail : ctx->nr) - head;
> + avail = (head <= ctx->shadow_tail ?
> +  ctx->shadow_tail : ctx->nr) - head;
>  
>   atomic_add(avail, &ctx->reqs_available);
>   head += avail;
> @@ -884,7 +885,7 @@ static long aio_read_events_ring(struct kioctx *ctx,
>   goto out;
>  
>   while (ret < nr) {
> - long avail = (head < ctx->shadow_tail
> + long avail = (head <= ctx->shadow_tail
> ? ctx->shadow_tail : ctx->nr) - head;
>   struct io_event *ev;
>   struct page *page;
> -- 
> 1.7.4.1
> 
> 
> -- 
> "Thought is the essence of where you are now."

Whoops!

Reviewed-by: Kent Overstreet 
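
For anyone checking the head == tail case by hand, the two expressions from the hunks above can be compared in isolation. This is a sketch only: demo_avail_old() and demo_avail_fixed() are hypothetical helper names, and both indices are assumed to be smaller than the ring size.

```c
#include <assert.h>

/* Pre-patch: head == tail is treated like the wrapped case (run to end of ring). */
unsigned demo_avail_old(unsigned head, unsigned tail, unsigned nr)
{
	assert(head < nr && tail < nr);
	return (head < tail ? tail : nr) - head;
}

/* Post-patch: head == tail yields tail - head, i.e. no events available. */
unsigned demo_avail_fixed(unsigned head, unsigned tail, unsigned nr)
{
	assert(head < nr && tail < nr);
	return (head <= tail ? tail : nr) - head;
}
```

With nr = 8 and head == tail == 3, the old expression reports 5 events on an empty ring, while the fixed one reports 0. The two agree whenever head != tail, for example 3 for head = 2, tail = 5 and 2 for the wrapped case head = 6, tail = 2.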

